Stressful Life Events, Personality, and Health:
An Inquiry Into Hardiness 


Suzanne C. Kobasa 
University of Chicago 


Personality was studied as a conditioner of the effects of stressful life events on 
illness onset. Two groups of middle and upper level executives had comparably 
high degrees of stressful life events in the previous 3 years, as measured by the 
Holmes and Rahe Schedule of Recent Life Events. One group (n = 86) suffered 
high stress without falling ill, whereas the other (n = 75) reported becoming
sick after their encounter with stressful life events. Illness was measured by the 
Wyler, Masuda, and Holmes Seriousness of Illness Survey. Discriminant function 
analysis, run on half of the subjects in each group and cross-validated on the 
remaining cases, supported the prediction that high stress/low illness executives 
show, by comparison with high stress/high illness executives, more hardiness, 
that is, have a stronger commitment to self, an attitude of vigorousness toward 
the environment, a sense of meaningfulness, and an internal locus of control. 




This article is based on the author’s doctoral dissertation
(Kobasa, 1977), submitted to the Department of Behavioral Sciences
at the University of Chicago. The preparation of this manuscript
was supported in part by Public Health Service Grant MH-28839-01
from the National Institute of Mental Health. The author wishes to
thank Robert R. J. Hilker, James Kennedy, and all of the executives
who participated in the study. Special appreciation is extended to
Salvatore R. Maddi, who supervised the project. Chase P. Kimball,
David E. Orlinsky, and Marvin Zonis contributed many useful
suggestions as dissertation committee members.
Requests for reprints should be sent to Suzanne C. Kobasa,
Department of Behavioral Sciences, University of Chicago, 5848 S.
University Avenue, Chicago, Illinois 60637.

An exceptional number of studies in the last
20 years (cf. Dohrenwend & Dohrenwend,
1974; Gunderson & Rahe, 1974) have suggested
that stressful life events precipitate
somatic and psychological disease. This article
considers the importance of personality as a
conditioner of the illness-provoking effects of
stress.

During the last decade, investigators have 
shown that the recent life histories of hos- 
pitalized persons contain significantly more 
frequent and serious stressful events than do 
histories of matched controls from the general 
population (e.g., Paykel, 1974) and that 
Navy personnel who begin a cruise with high 
stress scores suffer more illness episodes dur- 
ing the months at sea than do sailors who 
start out with low stress scores (Rahe, 1974).
But the possibility of a causal connection between
stress and illness is hardly a new idea.
Physicians, philosophers, and persons simply
Studying Stressed but Healthy Persons


In contrast, the present study considers 
how highly stressed subjects who remain 
healthy differ from those who show illness 
along with high stress. Studying the individual
who undergoes high degrees of stress without
falling ill amounts to inquiring about the
mediating factors that affect the way one reacts
to stress. Holmes and Masuda (1974)
and the majority of other stress investigators 
attempt to draw a direct causal link between 
the occurrence of stressful life events and the 
onset of illness by reference to the physio- 
logical model of a stress reaction formulated 
by Hans Selye (1956). Stressful life events 
are said to evoke “adaptive efforts by the hu- 
man organism that are faulty in kind or dura- 
tion, lower ‘bodily resistance’ and enhance the 
probability of disease occurrence” (Holmes 
& Masuda, 1974, p. 68). Holmes and others, 
however, fail to take into account what Selye 
goes on to say about individual differences 
and the stress reaction. In the study described 
in this article, the more subtle points in 
Selye’s work are referred to, in the attempt 
to consider factors in the stress reaction that 
serve to deflect the negative impact of stress- 
ful events. Mediators of the stress and illness 
connection, which probably include physio- 
logical predisposition, early childhood experi- 
ences, and social resources, as well as the 
mediator emphasized here, personality, are to- 
gether responsible for what Selye calls the 
distinctive way in which each individual
“takes to” stressful life occurrences. 

The proposition of this study is that persons
who experience high degrees of stress
without falling ill have a personality structure
differentiating them from persons who become
sick under stress. This personality difference
is best characterized by the term hardiness.
The conceptual source of the supposition, in
contrast to the passive and reactive view of
humankind found in most stress and illness
work, is a set of approaches to human behavior
that Maddi (1976), in his categorization
of the major personality theories, calls
fulfillment theories. The hardy personality
type formulated here builds upon the theorizing
of existential psychologists (Kobasa &
Maddi, 1977; Maddi, 1975) on the strenuousness
of authentic living, White (1959) on
competence, Allport (1955) on propriate 
striving, and Fromm (1947) on the produc- 
tive orientation. Hardy persons are considered 
to possess three general characteristics: (a) 
the belief that they can control or influence 
the events of their experience, (b) an ability 
to feel deeply involved in or committed to the 
activities of their lives, and (c) the anticipa- 
tion of change as an exciting challenge to 
further development. Much research has al- 
ready shown the advantages in behavior of 
control (e.g., Lefcourt, 1973; Rodin & Langer, 
1977; Rotter, Seeman, & Liverant, 1962; 
Seligman, 1975), commitment (e.g., Antonov- 
sky, 1974; Kobasa & Maddi, 1977; Lazarus, 
1966; Lazarus, Averill, & Opton, 1974; Moss, 
1973), and challenge (Fiske & Maddi, 1961; 
Maddi, 1967). In discussing the hypotheses 
presented below, the implications of theory 
and research concerning these three general 
characteristics are extended to considerations 
of health and illness. 

Hypothesis 1. Among persons under stress, 
those who have a greater sense of control 
over what occurs in their lives will remain 
healthier than those who feel powerless in the 
face of external forces. Following the model 
proposed by Averill (1973) to explain his 
laboratory observation that some organisms 
are not debilitated by stressful stimuli, the 
highly stressed but healthy person is hypothe- 
sized to have (a) decisional control, or the 
capability of autonomously choosing among 
various courses of action to handle the stress; 
(b) cognitive control, or the ability to inter- 
pret, appraise, and incorporate various sorts 
of stressful events into an ongoing life plan 
and, thereby, deactivate their jarring effects; 
and (c) coping skill, or a greater repertory 
of suitable responses to stress developed 
through a characteristic motivation to achieve 
across all situations. In contrast, the highly 
stressed persons who become ill are powerless, 
nihilistic, and low in motivation for achievement.
When stress occurs, they are without
recourse for its resolution, give up what little
control they do possess, and succumb to the
incapacity of illness.

Hypothesis 2. Among persons under stress, 
those who feel committed to the various areas 
along with questions about demographics and perception
of life stressfulness.

The data from approximately half of each group 
were used to test hypotheses concerning group dif- 
ferences in hardiness. Data from the remaining cases 
in each group were used to cross-validate the results. 
The division of the groups into test and cross- 
validation cases was necessitated by the statistical 
technique relied upon in the study, discriminant 
function analysis, which has been characterized in 
previous research by problems of generalizability 
due to instability of results (cf. Huberty, 1975). Although
powerful as a tool for looking at group differences,
discriminant function analysis, like regression
analysis, has not always provided results that
hold up in the making of inferences from sample re- 
sults to some population, and over repeated sam- 
plings. Until a replication of the study is possible, 
the use of a “holdout sample” must be relied upon 
for an accurate test of the adequacy of the derived 
discriminant function that defines group differences. 
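
The test/holdout division described above can be sketched as follows. This is a minimal illustration, not the author's procedure: the group sizes (86 and 75) come from the abstract and the test size of 40 per group from the note to Table 1, but random assignment, the seeds, and the case labels are assumptions made here.

```python
import random


def split_test_holdout(case_ids, n_test, seed=0):
    """Randomly split one group's cases into a test list and a holdout list."""
    rng = random.Random(seed)      # seeded so the split is reproducible
    shuffled = list(case_ids)      # copy; the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[:n_test], shuffled[n_test:]


# Hypothetical case labels; only the group sizes are taken from the article.
low_illness = [f"L{i}" for i in range(86)]
high_illness = [f"H{i}" for i in range(75)]

low_test, low_holdout = split_test_holdout(low_illness, n_test=40, seed=1)
high_test, high_holdout = split_test_holdout(high_illness, n_test=40, seed=2)
```

The discriminant function would be derived from the two test lists only and then applied unchanged to the two holdout lists.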


Subjects 


This study required a subject pool both large and 
stressed enough to obtain sufficiently large groups 
for study. All of the middle and upper level execu- 
tives of a large public utility served as the pool 
from which groups were selected. This utility, in the 
year prior to the onset of the study, had entertained 
serious discussion (in executive seminars, company 
publications, and consultations with the medical de- 
partment) of the increasing numbers of stressful 
events faced by executives. These events consisted 
of changes instituted by the utility itself, like a pro- 
gram of job evaluations that led to some promotions 
and many more demotions, as well as requirements 
for readjustment from external sources, such as the 
federal government’s affirmative action demands. 
These changes, coupled with the expected usual range 
of personal and family stresses, suggested that stress 
scores for the subject pool would be generally high. 

Demographically, the pool was quite homogeneous. 
The modal characteristics of the subjects were (a) 
male gender; (b) 40 to 49 years of age; (c) married, 
with two children; (d) on the third or middle man- 
agement level, and having been there for 6 years 
or more; (e) possessing at least a college degree; 
(f) wife not working outside the home; (g) usually 
Protestant, and attending religious services very or 
fairly often. 


Measurement of Stress and Illness 


The most frequently used scales in stress and ill- 
ness research, the Schedule of Recent Life Events 
and the Social Readjustment Rating Scale (Holmes 
& Rahe, 1967), were employed in this study. Additions
to these scales were made on the basis of
pilot testing. The majority of additions were
detailed specifications of the original items, modeled
after suggestions from other adapters of the test 
(Hough, Fairbank, & Garcia, 1976; Paykel, 1974). 


Each of the most ambiguous events was replaced 
by two events, one presenting the positive form of 
the Holmes and Rahe item and the other, the nega- 
tive version. “Change in financial state,” for example, 
was translated into “improvements in financial state” 
and “worsening of financial state.” These specifica- 
tions were given the seriousness weights of the items 
from which they were derived. Other additions to the 
Holmes and Rahe list were based on a pilot use of 
the test with 50 randomly selected executives. In 
response to the question “What other events have 
you experienced during the past 3 years?”, these 
subjects reported 15 events not found on the original 
list. Most of these referred to occurrences at work.
Seriousness weights were assigned to these additions 
by the investigator and 20 other judges using the 
ratio scale judgment procedure of Holmes and Rahe 
(1967). 

The illness items in the stress and illness question- 
naire were taken from the Wyler, Masuda, and 
Holmes (1968) Seriousness of Illness Survey. After
consultation with the medical director of the execu- 
tives’ company, 118 of the diseases listed in this 
survey were chosen as applicable to the pool being 
tested. Each illness item is characterized by a seri- 
ousness weight based on a consensual agreement of 
numerous and diverse judges (both medical and 
lay) and is obtained in a manner similar to the 
derivation of the Social Readjustment Rating Scale. 
The reliability and validity of this scale as a com- 
plete listing of disease syndromes in a form accessible 
to both laypersons and physicians, and as an accu- 
rate set of evaluations of the general seriousness of 
various distinct illnesses, has been established (cf. 
Wyler, Masuda, & Holmes, 1970).


Measurement of Personality, Demographic, 
and Perception Variables 


A composite questionnaire, made up of all or parts 
of four standardized and two newly constructed 
instruments, was designed to test the three personal- 
ity hypotheses. The standardized tests were chosen 
for their theoretical relevance and empirical reliability
and validity. All of the instruments are appropriate
for use with a sample of executives (i.e.,
a group of well-educated adult professionals who
are relatively free of gross psychopathology).

The control dimension was measured through four 
different instruments. What has been called deci- 
sional control or autonomy was measured through 
the Internal-External Locus of Control Scale (Lef- 
court, 1973; Rotter, Seeman, & Liverant, 1962), and 
the Powerlessness versus Personal Control scale of 
the Alienation Test (Maddi, Kobasa, & Hoover, 
Note 2). The latter instrument also provided a 
way of measuring cognitive control (i.e., the ability
to find meaning in stressful life events) in its
Nihilism versus Meaningfulness scale. Coping skill
(i.e., the availability of responses with which to deal
with stressful life events) was measured through the 
Achievement scale of the Personality Research Form 
(Jackson, 1974; Wiggins, 1973). 
Table 1
Differences Between High Stress/Low Illness^a and High Stress/High Illness^a Executives

                                  High stress/          High stress/                      Standardized
                                  low illness           high illness                      discriminant
                                                                             t            function
Variable                          M         SD          M         SD         value        coefficient
Control
  Nihilism                        196.05    133.61      281.02    169.86     2.49**       .73
  External locus of control       5.92      4.10        7.90      4.61       2.03*        .22
  Powerlessness                   301.15    188.93      388.47    188.44     [illegible]  —
  Achievement                     16.50     2.10        15.12     3.20       -1.20        —
  Dominance                       14.60     3.26        13.85     4.46       .86          —
  Leadership                      33.47     7.34        34.63     6.80       [illegible]  .43
Commitment
  Alienation from self            102.35    117.24      219.15    185.77     3.36**       1.04
  Alienation from work            181.67    122.04      223.73    175.09     1.22         .43
  Alienation from interpersonal   256.02    162.76      316.10    165.24     1.64         —
  Alienation from family          158.47    139.02      198.72    144.33     1.27         —
  Alienation from social          202.15    100.21      226.95    133.93     .94          —
  Role consistency                29.22     6.42        29.50     6.44       .19          .30
Challenge
  Vegetativeness                  155.50    140.24      216.27    160.94     1.98*        .99
  Security                        21.11     6.33        22.19     8.60       .34          .35
  Cognitive structure             13.35     2.81        14.10     2.85       1.10         .21
  Adventurousness                 269.00    164.58      337.54    174.95     1.78*        —
  Endurance                       15.97     2.35        14.37     3.19       -.96         —
  Interesting experiences         34.97     6.83        [illegible] [illegible] [illegible] -.92
Perception of personal stress     3.00      1.21        3.83      1.73       2.46**       .43

Note. For all variables, the higher the number, the greater the degree of the variable observed. Superior
hardiness is indicated by higher scores on achievement, role consistency, endurance, and interesting
experiences, and lower scores on nihilism, external locus, powerlessness, dominance, leadership, alienation
(from self, work, social institutions, interpersonal relationships, and family), vegetativeness, security,
cognitive structure, and adventurousness. A subject's scores on all areas of alienation, measured by the
Alienation Test, have a possible range of 0 to 1,200. Vegetativeness, nihilism, powerlessness, and
adventurousness scores, also from the Alienation Test, may range from 0 to 1,500. External locus has a low
of 0 and an upper limit of 23. The scales taken from the Jackson test (achievement, dominance, cognitive
structure, and endurance) have a minimum value of 0 and a maximum of 20. The California Life Goals scales
(leadership, security, and interesting experiences) may range from 0 to 60. Role consistency has a low of 0
and a high of 40; perception of personal stress can range from 0 to 7.

^a n = 40.

* p < .05.
** p < .01.

discriminant function analysis done on all the
personality variables plus the one perception
variable that yielded a significant t score.
After all data are transformed into standard
scores, discriminant function analysis computes
a discriminant equation, or a linear
combination of weighted variables that produces
the greatest statistically derivable distance
between the two groups. The larger the
weighting or discriminant coefficient of a
variable, the more powerful it is as a group
discriminator.

Table 1 presents all variables submitted to
discriminant function analysis and their summary
statistics, including significance of difference
between groups established by t test.
The results demonstrate that high stress/low
illness executives can be distinguished from
high stress/high illness subjects. The 11 variables
for which a standardized discriminant
function coefficient is provided in the table
combine to form a significant function, with
a Wilks’s Lambda of .64, significant at the
.001 level, and a canonical correlation of .60.
The Wilks’s Lambda (see Huberty, 1975) is
a measure of the original variable’s discrimi-
subject’s scores on the discriminating variables
by the associated coefficients allows one
to predict the likelihood of the subject’s mem- 
bership in each of the groups. As a test of 
the classificatory power of the derived dis- 
criminant equation, its coefficients are (a) 
reapplied to the test subjects used to derive 
the equation and (b) applied to the scores 
of the “holdout” cases. The discriminant 
function presented in Table 1 shows predic- 
tive capability both internally and externally. 
Applying the unstandardized versions of the 
discriminant function coefficients to the raw 
scores used to derive the function, 78% of 
the “test” cases are correctly classified (80% 
of the high stress/low illness executives and 
75% of the high stress/high illness subjects). 
This significantly correct (p < .025) classi- 
fication is matched in the external cross-vali- 
dation. Using the unstandardized coefficients 
on the raw data from the “holdout” subjects, 
35 hits (77%) and 21 hits (60%) in the high 
stress/low illness and high stress/high illness 
groups, respectively, are realized (p < .05). 
These cross-validation results offer support 
for the stability and generalizability of the
results obtained through discriminant func- 
tion analysis. An examination of the statis- 
tics for the entire sample (test cases plus 
holdouts), which are notably similar to those 
of the test cases in Table 1, illustrates the 
strength of the discriminant analysis. Table 
2 presents the mean values, standard deviations,
and t values for the full high stress/low
illness and high stress/high illness groups. 
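
The classification step described above (apply unstandardized coefficients to raw scores, assign each case to a group, count hits) can be sketched as follows. The coefficients, constant, cutoff, and cases are all invented for illustration; this excerpt does not report the unstandardized coefficients, so none of these numbers should be read as the study's values.

```python
# Hypothetical unstandardized coefficients and constant, chosen only for illustration.
COEFFICIENTS = {"nihilism": 0.004, "external_locus": 0.1}
CONSTANT = -1.6


def discriminant_score(raw_scores, coefficients=COEFFICIENTS, constant=CONSTANT):
    """Linear combination of raw scores weighted by the discriminant coefficients."""
    return constant + sum(coefficients[k] * raw_scores[k] for k in coefficients)


def classify(raw_scores, cutoff=0.0):
    """Assign a case to a group by comparing its score with an assumed cutoff."""
    return "high illness" if discriminant_score(raw_scores) > cutoff else "low illness"


def hit_rate(cases):
    """Proportion of cases whose predicted group equals their actual group."""
    hits = sum(1 for actual, scores in cases if classify(scores) == actual)
    return hits / len(cases)


# Four made-up holdout cases: (actual group, raw scores).
holdout = [
    ("low illness", {"nihilism": 196, "external_locus": 6}),
    ("high illness", {"nihilism": 281, "external_locus": 8}),
    ("low illness", {"nihilism": 400, "external_locus": 5}),   # misclassified on purpose
    ("high illness", {"nihilism": 300, "external_locus": 9}),
]
rate = hit_rate(holdout)
```

Running the same function on the test cases and on the holdout cases gives the internal and external hit rates that the text compares.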


Discussion 

This study of persons who do not fall ill 
despite considerable stress suggests that per- 
sonality may have something to do with stay- 
ing healthy. Using the five most significant 
discriminators of the high stress/low illness 
executives (i.e., the variables that contribute 
to the discriminant equation and produce significant
ts), one can speculate on what happens
when the hardy individuals meet a stressful
life event: how they appraise the threat
posed by the event and cope with it.

A male executive having to deal with a job 
transfer will serve as an example. Whether 
hardy or not, the executive will anticipate and 
experience the changes that the transfer will
bring about—learning to cope with new sub- 
ordinates and supervisors, finding a new home, 
helping children and wife with a new school 
and neighborhood, learning new job skills, and 
so on. The hardy executive will approach the 
necessary readjustments in his life with (a) 
a clear sense of his values, goals, and capa- 
bilities, and a belief in their importance (com- 
mitment to rather than alienation from self) 
and (b) a strong tendency toward active in- 
volvement with his environment (vigorousness 
rather than vegetativeness). Hence, the hardy 
executive does more than passively acquiesce 
to the job transfer. Rather, he throws himself 
actively into the new situation, utilizing his 
inner resources to make it his own. Another
important characteristic of the hardy execu- 
tive is an unshakable sense of meaningfulness 
and ability to evaluate the impact of a trans- 
fer in terms of a general life plan with its 
established priorities (meaningfulness rather 
than nihilism). For him, the job transfer 
means a change that can be transformed into 
a potential step in the right direction in his 
overarching career plan and also provide his 
family with a developmentally stimulating 
change. An internal (rather than external) 
locus of control allows the hardy executive 
to greet the transfer with the recognition that 
although it may have been initiated in an 
office above him, the actual course it takes is 
dependent upon how he handles it. For all 
these reasons, he is not just a victim of a 
threatening change but an active determinant 
of the consequences it brings about. In con- 
trast, the executive low in hardiness will react 
to the transfer with less sense of personal re- 
source, more acquiescence, more encroach- 
ments of meaninglessness, and a conviction 
that the change has been externally deter- 
mined with no possibility of control on his 
part. In this context, it is understandable that 
the hardy executive will also tend to perceive 
the transfer as less personally stressful than 
his less hardy counterpart. 

The mechanism whereby stressful life 
events produce illness is presumably physio- 
logical. Whatever this physiological response 
is, the personality characteristics of hardiness 
may cut into it, decreasing the likelihood of
breakdown into illness. Needless to say, description
of the actual nature of physiological

more sophisticated stress research. Until then,
however, two alternative explanations of the 
present results should be considered. 

It could be argued that there is a spurious 
factor at work in the subjects’ completion of 
questionnaires, one that determines what they 
say about personality, stress levels, and ill-

states before and after the occurrence of a
stressful life event. Until such studies are 
available, reliance must be placed on data al- 
longitudinal study in which stress and
groups (Kobasa, 1977). It is of specific
vance to the alternative explanation that
low stress/high illness subjects were lower
nihilism, alienation from self,
and external locus of control than were
stress/high illness subjects. This finding
cates that personality questionnaire
Communal relationships, in which the giving of a benefit in response to a need
for the benefit is appropriate, are distinguished from exchange relationships, in
which the giving of a benefit in response to the receipt of a benefit is appropriate.
Based on this distinction, it was hypothesized that the receipt of a benefit after
the person has been benefited leads to greater attraction when an exchange relationship
is preferred and decreases attraction when a communal relationship is
desired. These hypotheses were supported in Experiment 1, which used male
subjects. Experiment 2, which used a different manipulation of exchange versus
communal relationships and female subjects, supported the hypotheses that (a)
a request for a benefit after the person is aided by the other leads to greater
attraction when an exchange relationship is expected and decreases attraction
when a communal relationship is expected, and (b) a request for a benefit in the
absence of prior aid from the other decreases attraction when an exchange relationship
is expected.


This research is concerned with how the 
effects of receiving a benefit and a request 
for a benefit differ depending on the type of 
relationship one has with the other.
Two kinds of relationships in which persons
give benefits to one another are distinguished,
exchange relationships and communal rela- 
tionships. The stimulus for this distinction was 
Erving Goffman’s (1961, pp. 275-276) dif- 
ferentiation between social and economic 


benefits that people give and receive do not 
involve money or things for which a monetary 
value can be calculated. A benefit can be any- 
thing a person can choose to give to another 
person that is of use to the person receiving 
it.


In an exchange relationship, members assume
that benefits are given with the
tation of receiving a benefit in ret 
receipt of a benefit incurs a debt or 
to return a comparable benefit. Each 
is concerned with how much he or 
of relationship. Although it might appear to
an observer that there is an exchange of bene- 
fits in communal relationships, the rules con- 
cerning giving and receiving benefits are dif- 
ferent than in exchange relationships. 

Members of a communal relationship as- 
sume that each is concerned about the welfare 
of the other. They have a positive attitude
toward benefiting the other when a need for 
the benefit exists. They follow what Pruitt 
(1972) has labeled “the norm of mutual re- 
sponsiveness.” This rule may create what ap- 
pears to an observer to be an exchange of 
benefits, but it is distinct from the rule that 
governs exchange relationships whereby the 
receipt of a benefit must be reciprocated by 
the giving of a comparable benefit. The rules 
concerning the giving and receiving of bene- 
fits are what distinguish communal and ex- 
change relationships, rather than the specific 
benefits that are given and received. 

From the perspective of the participants in 
a communal relationship, the benefits given 
and received are not part of an exchange. 
The attribution of motivation for the giving 
of benefits is different from that in an ex- 
change relationship. In a communal relation- 
ship, the receipt of a benefit does not create 
a specific debt or obligation to return a com- 
parable benefit, nor does it alter the general 
obligation that the members have to aid the 
other when the other has a need. In a com- 
munal relationship, the idea that a benefit is 
given in response to a benefit that was re- 
ceived is compromising, because it calls into 
question the assumption that each member 
responds to the needs of the other. 


Experiment 1 


The first study reported here was based on 
the assumption, similar to that made by 
Kiesler (1966) in her study of the effect of 
perceived role requirements on reactions to 
favor-doing, that the giving of a benefit will 
decrease attraction if it is inappropriate for 
the type of relationship one has with the 
other. A benefit given in response to a benefit 
received in the past or expected in the future 
is appropriate in an exchange relationship but 
is inappropriate in a communal relationship. 


A benefit given specifically because it ful- 
fills a need is appropriate in a communal rela- 
tionship but not in an exchange relationship. 

If two people have an exchange relationship 
and one person benefits the other, it is appropriate
for the other to give the person a
comparable benefit. The receipt of a benefit 
under these circumstances should lead to 
greater attraction. On the other hand, if two 
people have a communal relationship and one 
person benefits the other, it is inappropriate 
for the other to give the person a comparable 
benefit, since it leaves the impression that the 
benefit was given in response to the benefit 
received previously. The other is treating the 
relationship in terms of exchange, which is 
inappropriate in a communal relationship. 

When a communal relationship does not yet 
exist but is desired, the receipt of a benefit 
should have the same effect as when a com- 
munal relationship is assumed to exist. A 
benefit from the other after the other has 
been benefited should reduce attraction if 
there is a desire for a communal relationship 
with the other. If an exchange relationship 
is preferred, the receipt of a benefit after the 
other is benefited should result in greater at- 
traction. Experiment 1 was conducted to test 
these hypotheses. 

The predictions concerning communal rela- 
tionships might seem contrary to what would 
be expected from equity theory (Adams, 
1963). On the basis of equity theory, one 
might expect that a benefit from another fol- 
lowing aid to that other would increase liking 
in any relationship, because it would reduce 
inequity. However, the predictions are not 
inconsistent with a recent discussion of equity 
theory (Walster, Walster, & Berscheid, 1978). 
According to Walster, Walster, and Berscheid: 


Another characteristic of intimate relationships, 
which may add complexity, is that intimates, through 
identification with and empathy for their partners, 
come to define themselves as a unit; as one couple.
They see themselves not merely as individuals in- 
teracting with others, but also as part of a partner- 
ship, interacting with other individuals, partnerships, 
and groups. This characteristic may have a dramatic 
impact on intimates’ perceptions of what is and is 
not equitable. (pp. 152-153) 


In Experiment 1 the desire for a communal 
relationship was manipulated by using unmarried
males as the subjects and having the
part of the other played by an attractive 
woman, who was described as either married 
or unmarried. It was assumed that people 
desire communal relationships with attractive 
others, but only with those available for such 
relationships. It was further assumed that the 
unmarried woman would be considered avail- 
able for a communal relationship, whereas 
the married woman would not. Thus, it was
assumed that the male subjects would desire 
a communal relationship with the attractive, 
unmarried woman but would prefer an ex- 
change relationship with the attractive, mar- 
ried woman. 


Method 


Overview. Under the guise of a 
performance, unmarried male college st 
on a task while a television monitor showed an attractive
woman working on a similar task in another
room. When the subject completed the task, he was 
awarded 1 point toward extra credit for 
on time and given the opportunity to 
of his excess materials to the other, who 
participated in order to earn extra course credit.
They were randomly assigned to one of the four 
experimental conditions: exchange-benefit, exchange-no benefit,
communal-benefit, communal-no benefit.
Procedure. [...] She suspected that people's approaches
to solving this task varied when certain conditions
were changed. In the condition to which he and Tricia
had been randomly assigned, they would be able to
see each other over closed-circuit television but not
be able to talk to each other directly. To add
credibility, there was a portable television camera
in the room pointing at the subject. Through the use
of videotape, what appeared on the monitor was the
same for every subject. When the subject asked [...]
he was watching the other person on the monitor,
which typically happened, he was told that in the
past it had been found that when people worked
separately on these tasks in the same room, their
performance was often affected by the presence of
the other person. This might have happened because
people could talk to one another or because they
could see one another, and the experiment was [...]


[...] would be quite different from the first, since it
would involve much more contact with the other person.
She would bring both participants into one room and
ask them to talk over things that they had in
common. She was interested in the way in which
having common interests helped people to get to know
one another. The experimenter mentioned that in
the past, people who had participated had
sometimes gotten to know each other quite well.
Vocabulary task. The experimenter next pointed
to a batch of letters printed on small cards in front
of the subject and said that his task in the first
study was to form 10 different four-letter words
from the letters. She went on to say that there [...]
awarding of points to maintain motivation would obviously
not be necessary in the second study.

The subject was told to start his stopwatch, begin 
working, and stop the watch when he had finished. 
The experimenter picked up another stopwatch and 
left, saying she would give Tricia the watch and
start her on her task. The 55 letters that the subject
had to work with allowed him to complete the
task within 10 minutes. When approximately 10 
minutes had elapsed, the experimenter returned and 
asked if the subject had any letters he wanted to 
send to Tricia. All subjects gave the experimenter 
some letters for Tricia. At this point, the experimenter
looked at the subject's stopwatch and told him that 
he had finished the task in time to get 1 point toward
extra credit. As she did so, she also filled out
a form indicating the time the subject had taken to 
complete the task and that he had earned 1 point. 
She explained that if Tricia finished her task in 
time she would earn 4 points, since her task was 
more difficult. The experimenter went on to say that 
since she allowed participants in the study to send 
and request letters, she also allowed them to share 
points they earned. Thus, Tricia could send the sub- 
ject some of her points if she wanted to do so. The 
experimenter left the room saying she would give 
Tricia the letters. Tricia continued to work on her 
task for about 5 minutes and then finished. The
experimenter handed Tricia a form similar to that
given the subject earlier. Tricia smiled, wrote a note
on a slip of paper, folded it, and gave it to the
experimenter.

Benefit manipulation. Within a few moments,
the experimenter returned to the subject's room
and turned off the monitor. She mentioned that
Tricia had completed her task within the necessary 
time and had received 4 points. She handed the 
subject a folded note that she said Tricia had 
asked be given to him. In the no benefit condition, 
the note said, “Thanks for sending the letters.” In 
the benefit conditions, the note said, “Thanks for
sending the letters. The experimenter said it would 
be OK to give you one of my points. She said she 
would add it onto the points you've already earned 
before the end of today’s session.” Which message 
the note contained was unknown to the experimenter
at the time she handed it to the subject.
This was accomplished by having the experimenter 
pick the note out of a container of folded notes of 
both types. 

Relationship manipulation. The experimenter told
the subject that there was one more thing to be 
done before getting on to the next study. She said 
she was going to give Tricia some questionnaires to 
fill out and would then get some more forms for 
the subject. In the exchange conditions, the experi- 
menter said: 


Tricia is anxious to get on to the next part of the 
study, since she thinks it will be interesting. Her 
husband is coming to pick her up in about half 
an hour and she wants to finish before then. 


In the communal conditions, she said: 


Tricia is anxious to get on to the next part of 
the study, since she thinks it will be interesting. 
She’s new at the university and doesn’t know 
many people. She has to be at the administration 
building in about half an hour and she wants to 
finish before then. 


Dependent measures. The experimenter then left 
the room for approximately 5 minutes. When she 
reappeared she brought two forms, mentioning that 
these were the forms she had told the subject about. 
She reminded the subject that the second study in- 
volved having the participants talk over things 
they had in common with each other. She said that
before starting it was necessary to get some idea of 
what their expectations were in order to control 
for them, since they would vary from person to 
person. The subject was asked to fill out a form 
indicating what he expected the interaction would 
be like and, in addition, another form indicating 
what his first impressions of the other person were.
The experimenter said that these forms would be 
kept completely confidential and left the room while 
they were filled out. 

The first-impressions form, which was given to 
the subject on top of the form concerning expecta- 
tions about the discussion, asked him to rate how 
well 11 traits applied to the other, on a scale from 0 
(extremely inappropriate) to 20 (extremely appropri- 
ate). The traits were considerate, friendly, insincere, 
intelligent, irritating, kind, open-minded, sympa- 
thetic, understanding, unpleasant, and warm. The 
subject was also asked to indicate his degree of liking
for the other, on a scale from 0 (dislike very
much) to 20 (like very much). The other form 
asked the subject to indicate how friendly, spon- 
taneous, relaxed, enjoyable, and smooth he expected 
the discussion to be, on scales from 0 to 20.

Suspicion check. After the subject had completed 
both forms, the experimenter casually mentioned 
that there was something more to the study and 
asked whether the subject had any idea what it 
might be. The responses of eight persons indicated 
suspicion of the instructions, and they were not in- 
cluded as subjects. Four persons thought the experi- 
ment was designed to test reactions to the note, 
one thought the points might have something to do 
with the ratings, two thought Tricia was a part of 
the experiment, and one questioned whether Tricia 
was actually married. Four of these persons were 
run under the exchange-benefit condition, two under 
the exchange-no-benefit condition, one under the 
communal-benefit condition, and one under the 
communal-no-benefit condition. In addition, 12
other persons were not included as subjects. Six
could not finish the task within the 10 minutes al- 
lowed, one did not read the note before filling out 
the forms, two failed to follow instructions when 
filling out the forms, one discovered the concealed 
videotape recorder beneath the monitor, and two 
were married. 


Results 


A measure of liking for the other was cal- 
culated by summing the scores for each of 
the 11 traits and the direct measure of liking 
on the impressions questionnaire. The scores 
for the favorable traits and for the direct 
rating of liking were the same as the subject's 
ratings. The scores for the unfavorable traits 
were obtained by subtracting the subject's 
ratings for those characteristics from 20. 
The means for the experimental conditions 
for the measure of liking are presented in 
Table 1. 
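
The scoring rule just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function and variable names are hypothetical, and treating insincere, irritating, and unpleasant as the unfavorable traits is an assumption, since the text does not list which traits were reverse-scored.

```python
# Illustrative sketch of the liking composite; all names are hypothetical.
# Assumption: insincere, irritating, and unpleasant are the unfavorable
# traits (the text does not say which traits were reverse-scored).
FAVORABLE = ("considerate", "friendly", "intelligent", "kind",
             "open-minded", "sympathetic", "understanding", "warm")
UNFAVORABLE = ("insincere", "irritating", "unpleasant")
SCALE_MAX = 20  # each item was rated on a scale from 0 to 20


def liking_score(trait_ratings, direct_liking):
    """Sum the 11 trait scores and the direct rating of liking."""
    total = direct_liking  # the direct rating is used as given
    for trait in FAVORABLE:
        total += trait_ratings[trait]  # favorable: score equals rating
    for trait in UNFAVORABLE:
        total += SCALE_MAX - trait_ratings[trait]  # unfavorable: 20 - rating
    return total


# Hypothetical subject who rates every item at the scale midpoint.
midpoint = {trait: 10 for trait in FAVORABLE + UNFAVORABLE}
print(liking_score(midpoint, direct_liking=10))  # 120
```

With 12 items each scored from 0 to 20, the composite can range from 0 to 240.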


From the hypotheses, it would be expected [...]
condition and would be lower in the communal-benefit
condition than in the communal-no-benefit
condition. From Table 1 it can be [...]
significant, F(1, 92) = 8.35, p < .01. Neither
of the main effects approached significance.
A planned comparison indicated that the
difference between the exchange-benefit condition
and the exchange-no-benefit condition
was significant, F(1, 92) = 4.17, p < .05. A
second planned comparison indicated that
the difference between the communal-benefit
condition and the communal-no-benefit
condition was also significant, F(1, 92) =
4.37, p < .05.

Table 1
Means for the Measure of Liking in
Experiment 1

Benefit from the other

Communal  [...]

Note. The higher the score, the greater the liking.
Scores could range from 0 to 240. n = 24 per cell.


[...] interaction between the type of relationship
and benefit was not significant. The main effects
were also not significant.


Discussion 


The results of Experiment 1 provide support
for the hypothesis that when a communal
relationship is desired, a benefit following
prior aid decreases attraction. When
the attractive woman they had aided was
unmarried (communal conditions), the [...]

If the future interaction involved an explicit
exchange of benefits, there is no reason the 
repayment would have suggested that the unmarried
woman did not wish to interact with
the subject. Since an interpretation in terms 
of the anticipation of future interaction re- 
quires a distinction between different types 
of interaction similar to the distinction be- 
tween communal and exchange relationships, 
it is not an alternative explanation but es- 
sentially the same interpretation in some- 
what different language. 

A different interpretation might be sug- 
gested that has to do with the role relation- 
ships of males and females. It might be ar- 
gued that the male subjects subscribed to a 
“traditional” rule that males should give 
gifts to females, who should gracefully ac- 
cept them and not attempt to repay their 
benefactor, and the unmarried woman who 
gave them a point violated this rule. Such an
interpretation would assume that the males 
did not apply the same rule to their rela- 
tionships with a woman who is married, 
which does not seem consistent with tradi- 
tional values concerning the role of men 
vis-à-vis women. If the interpretation is re- 
stricted to the relation of men and women 
in romantic relationships, then it is not an 
alternative explanation, since romantic re- 
lationships are communal relationships. 

That the desire for a communal relation- 
ship was induced by creating a situation in 
which there was a possibility of a romantic 
relationship with an attractive member of 
the opposite sex was not fortuitous. A ro- 
mantic relationship that might lead to the de- 
velopment of a family relationship through 
marriage is a particularly appropriate situa- 
tion for the study of communal relation- 
ships, since relationships between family 
members are the most typical kind of com- 
munal relationships. However, the distinc- 
tion between communal and exchange rela- 
tionships is not restricted to romantic rela- 
tionships with members of the opposite sex. 
The same effect should occur in situations 
in which a communal relationship, such as 
friendship, is desired or expected with a 
member of the same sex. 

It was assumed not only that the other
was perceived as available for a communal
relationship in the communal conditions but 
also that she was regarded as an attractive 
partner for such a relationship. If the other is 
unattractive, a communal relationship with 
her should not be desired even if she is avail- 
able for such a relationship. People do not 
desire communal relationships with people 
they dislike. An exchange relationship should 
be preferred with an unattractive other, and 
thus a benefit from such a person after he 
or she has been aided should lead to greater 
attraction.

Since the effect found in the first study 
involves the assumption that the benefit that 
the person received from the other is per- 
ceived as a response to the previous benefit 
that the other received, it should not occur 
if the other had not received something of 
value from the person. The receipt of a bene- 
fit when the other has not been aided previ- 
ously should lead to greater liking when a 
communal relationship is expected or de- 
sired. The rule in communal relationships is 
to respond to a need rather than to recipro- 
cate benefits. The giving of a benefit when 
no prior help has been received is appropri- 
ate for a communal relationship if there is or 
might be a need for the benefit. 

Some evidence that the receipt of a bene- 
fit when the other is not aided previously 
produces greater liking when there is a com- 
munal relationship than when there is not is 
provided by the study by Kiesler (1966). 
She found that a partner on a cooperative 
task was liked more when he shared his win- 
nings with the subject than when he did not 
share, whereas an opponent on a competitive 
task was liked about the same when he 
shared and did not share. Partnership on a 
cooperative task should create the expecta- 
tion of a communal relationship, and in 
Kiesler’s study the subject always lost, so 
the partner did not receive any aid from 
the subject prior to benefiting the subject. 
However, the possibility that the subject’s 
losing could have been construed as a kind 
of aid for his opponent in the competitive 
conditions complicates the comparison of 
the cooperative and competitive conditions 
of Kiesler’s study.

Experiment 2

The distinction between communal and 
exchange relationships also has implications 
for reactions to a request for a benefit. If it 
is true that in an exchange relationship any 
benefit given by one member to the other 
creates a debt or obligation to return a com- 
parable benefit, a request for a benefit from 
another after one has been given aid by that 
other creates an opportunity to repay the 
debt. Thus, such a request following aid 
should be appropriate in an exchange rela- 
tionship. Since it provides an opportunity to 
eliminate any tension caused by the presence 
of the debt, it should increase liking for the 
other. 

The idea that the recipient of a benefit 
will like his or her benefactor more if he or
she can return the benefit has been expressed 
before (Mauss, 1954). Several studies have 
shown that recipients of benefits like the 
donor more if they are [...]
benefit than if they are not [...]
whether the opportunity is provided by the
donor’s specifically requesting that the other
repay the benefit (Gergen, Ellsworth, Mas- 
lach, & Seipel, 1975) or whether the op- 
portunity to repay is provided but repayment 
is not specifically asked for (Castro, 1974; 
Gross & Latané, 1974). 


[...] is inappropriate. It may imply that the
original aid was not given with the intent of 
satisfying a need but rather with the expec- 
tation of receiving something in return, which 
may be taken as an indication that the other 
does not desire involvement in a communal 
relationship. Assuming that beginning or 
maintaining a communal relationship with 
another is desirable, such an implication 
should be frustrating and therefore result in 
decreased liking. 

If one has not been previously aided by 
another and there is no opportunity to aid 
the other in the future, a request from that 
other is inappropriate in an exchange rela- 
tionship. In an exchange relationship, a per- 
son who has not been aided by another should 
like the other more when he or she does not
ask for a benefit than when he or she does
ask for a benefit.

However, requesting a benefit in the ab- 
sence of prior aid is appropriate in a com- 
munal relationship. Such a request implies 
that the other desires a communal relation- 
ship and, assuming that beginning or main- 
taining such a relationship is desirable, it 
should result in increased liking. Jones and 
Wortman (1975) suggest that asking another 
for a benefit is a way of conveying that we 
think highly of them. They say, “This tactic 
is likely to convey that we feel good about 
our relationship with the target person, since 
it is not customary to ask people to do favors 
for us unless our relationship is a relatively 
good one” (Jones & Wortman, 1975, p. 13). 

The implications of the distinction between
communal and exchange relationships for re[...]

[...]thing about the other wanting to meet people,
implying that the other was very busy and
leading the subject to believe that she would [...]

Method


Overview. Under the guise of a study of task 
performance, female college students worked on a 
task while a television monitor showed another 
female working on a similar task in another room. 
Some of the subjects were told that the other was 
married, had a child, and lived far from the uni- 
versity, and that she and the subject would be dis- 
cussing differences in interests in a second study 
(exchange conditions). Other subjects were told
that the other was new at the university and did
not know many people, and that she and the subject
would be discussing common interests in a
second study (communal conditions). The other
female finished the task, received 1 point, and gave
the subject aid on her supposedly more difficult task
or did not give the subject aid. The other female
then requested a point from the subject or did not
request a point. Finally, the subject's liking for
the other and expectations concerning the future 
discussion with the other were assessed. 

Subjects. The subjects were 80 female introductory
psychology students who received extra
credit toward their course grade for their participation.
They were randomly assigned to one of the
eight experimental conditions: exchange-aid-request,
exchange-aid-no-request, exchange-no-aid-request,
exchange-no-aid-no-request, communal-aid-request,
communal-aid-no-request, communal-no-aid-request,
and communal-no-aid-no-request.
Procedure. Upon arriving for the study, the sub- 
ject was greeted by the experimenter and told that 
the other subject scheduled to participate at the 
same time had already arrived and was waiting in 
another room for the experiment to begin. The
experimenter explained that it would take a little time
for the equipment to warm up before the experi- 
ment could begin. 

Relationship manipulation. In the communal conditions,
the experimenter casually stated that the
other person was anxious to begin because:


She thinks it will be interesting. She's new at the
university, doesn't know many people, and she's
interested in getting to know people.


[...] by to pick her up, then they have to pick up her
child and go home to Columbia (a city some distance
from the university).


The experimenter said that the first study was 
actually one of two short, unrelated studies they 
would be asked to participate in that day. The rationale
for the first study was the same as in
Experiment 1. The experimenter went on to say that
the second study would be quite different from the
first. In the communal conditions she continued: 


What we're going to do is bring you both into 
one room. We want you to talk over common in- 
terests. We're interested in finding out how peo- 
ple get to know one another. We try to create a 
relaxed atmosphere, and actually, in the past we've 
found that some of the people have gotten to 
know one another quite well. 


In the exchange conditions she continued: 


What we're going to do is bring you both into 
one room. We want you to talk over differences in 
interests. We're doing this because most people 
avoid talking about differences in interests and 
we're interested in getting people’s reactions to
doing so.


Vocabulary task. After the subject had signed 
an experimental consent form, the procedure for 
the vocabulary task was explained in the same 
manner as in Experiment 1, except that the subject 
was told that she would be performing the more 
difficult task while the other person would be per- 
forming the easier task. The experimenter pointed 
out that since the subject’s task was the more dif- 
ficult one, she would have a chance to earn 4 points 
toward the extra credit, whereas the other person, 
Tricia, only had a chance to earn 1 point, since she 
had the easier task. As in Experiment 1, the ex- 
perimenter mentioned that the awarding of points 
to maintain motivation obviously would not be 
necessary in the second study. 

After the same instructions concerning the stop- 
watches as in Experiment 1, the subject was left 
alone to work on the task for a short time. With
the 45 letters the subject had it was impossible for 
her to finish the task in that time. Subjects typically 
finished between five and seven words. During the 
time the subject was working, the experimenter, 
who always wore a lab coat so that changes in 
clothing over days could not be detected, could be 
seen on the monitor starting Tricia on her task and 
then leaving the room. Tricia finished her task 
easily. After a short time the experimenter reentered 
the other room, and Tricia could be seen pushing 
some extra letters to the front of the table. At this
point the experimenter stepped in front of the 
camera, blocking the subject’s view of the other 
so that the subject could not see whether the other 
handed the letters to the experimenter. Finally, the 
experimenter could be seen leaving the room, and 
Tricia sat back in her chair. 

Aid manipulation. Shortly thereafter, the ex- 
perimenter reentered the subject’s room and said 
that Tricia had finished her task and received 1 
point. The experimenter turned off the monitor, 
commenting that it wouldn’t be needed any more. 
In the aid conditions she said, “Tricia asked me to 
give you these letters,” and handed the subject some 
letters. In the no-aid conditions there was no 
mention of the letters. The experimenter then left 
the room, telling the subject she would be back
shortly. In approximately 3 minutes, she returned,
and regardless of whether the subject had finished
(none in the no-aid conditions did, [...]
in the aid conditions did), she told [...]
had done well enough to receive the 4 points toward
extra credit. She filled out a form indicating [...]
and handed it to the subject. The [...]
that was all there was to the first study, except [...]

[...] the form was checked, having drawn it from a
container of folded forms checked in both ways. If the
subject wished to fill out a form to request points
from the other, the experimenter took it.

Table 2
Means for the Measure of Liking in
Experiment 2

              Aid from and request for benefit from the other

Relationship  Aid-      Aid-         No aid-   No aid-
              request   no request   request   no request

Exchange      173       149          149       173
Communal      156       191          179       177

Note. The higher the score, the greater the liking.
Scores could range from 0 to 240. n = 10 per cell.

Dependent measures. Next, the experimenter re- 
minded the subject that the second study would in- 
volve having both subjects talk over common in- 
terests (communal conditions) or differences in in- 
terests (exchange conditions). Before starting the 
study it was necessary to get some idea of what 
their expectations were about the forthcoming in- 
teraction in order to control for those expectations, 
since they might vary from person to person. There- 
fore, she was asking the subject to fill out two forms 
indicating what her first impressions of the other 
person were and also what she expected the discussion
to be like. The subject was told that these [...]

[...] impression form used in Experiment 1. The other form
asked the subject to indicate how friendly, spontaneous,
strained, enjoyable, and awkward she expected
the discussion to be [...]

Suspicion check. The experimenter left and [...]
a one-way mirror until she could [...]
that the subject had finished the forms. She then [...]
approximately [...] additional [...] and reentered [...]
she picked up the forms, she casually [...]
was more to the study [...]
mentioned before and asked the subject [...]
idea of what it might be. The [...]
suspected that the other person [...]
the other room, one thought [...]
other's [...] had been intentionally made [...]
thought that the request did not actually [...],
and one thought that [...]
nor the request came from the other [...]
suspected that the other [...]
in the other room had been [...]
experimenter, [...]

[...] difference was significant, F(1, 72) = 4.03, p < .05.

From the second hypothesis, it would be 
expected that liking would be less in the com- 
munal-aid-request condition than in the 
communal-aid-no-request condition. As can 
be seen in Table 2, the difference was as 
predicted. A planned comparison indicated 
that this difference was significant, F(1, 72) 
= 8.60, p < .01.

From the third hypothesis, it would be expected
that liking would be less in the
exchange-no-aid-request condition than in the
exchange-no-aid-no-request condition. As can
be seen in Table 2, the difference was as predicted.
A planned comparison indicated that
this difference was significant, F(1, 72) =
4.07, p < .05.

From the fourth hypothesis, it would be 
expected that liking would be greater in the 
communal-no-aid-request condition than in
the communal-no-aid-no-request condition.
As can be seen in Table 2, the means for these
two conditions were very similar. A planned
comparison indicated that the difference be- 
tween these two means was not significant. 

Another way of looking at the results is to 
compare the aid and the no-aid conditions. 
It would be expected that liking would be 
greater in the exchange-aid-request condition 
than in the exchange-no-aid-request condi- 
tion. As can be seen in Table 2, this expected 
difference was obtained. A planned comparison
indicated that the difference was significant [...].
It would be expected that liking would be less in the
communal-aid-request condition than in the
communal-no-aid-request condition. As can
be seen in Table 2, this expected difference
was obtained. A planned comparison indicated
that the difference was marginally significant,
F(1, 72) = 3.93, p < .06. It would be expected
that liking would be less in the
exchange-aid-no-request condition than in the
exchange-no-aid-no-request condition. As can
be seen in Table 2, this expected difference
was obtained. A planned comparison indicated
that it was significant, F(1, 72) = 4.10,
p < .05. Finally, it would be expected that
liking would be greater in the
communal-aid-no-request condition than in the
communal-no-aid-no-request condition. Although the
means were in the expected direction, the
planned comparison indicated that this difference
was not significant.

Table 3
Means for the Measure of Anticipated
Pleasantness of the Discussion in
Experiment 2

              Aid from and request for benefit from the other

Relationship  Aid-      Aid-         No aid-   No aid-
              request   no request   request   no request

Exchange      56        59           58        59
Communal      63        72           70        52

Note. The higher the score, the more positive the
subjects’ expectations for the discussion. Scores
could range from 5 to 100. n = 10 per cell.

A measure of anticipated pleasantness of 
the discussion was calculated by summing the 
scores on the questions concerning how 
friendly, spontaneous, strained, enjoyable, and 
awkward the subjects expected the discus- 
sion to be. The scores for the favorable char- 
acteristics were the same as the subjects’ 
ratings. The scores for the unfavorable charac- 
teristics were obtained by subtracting the 
subjects’ ratings for those characteristics from 
21. The means for the experimental condi- 
tions for the measure of anticipated pleasant- 
ness of the discussion are presented in Table 3. 
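
The computation of this composite can be sketched as follows. The sketch is illustrative only: the names are hypothetical, the treatment of strained and awkward as the unfavorable characteristics is an assumption, and the 1-to-20 rating scale is inferred from the subtraction from 21 and from the 5-to-100 range given in the note to Table 3.

```python
# Illustrative sketch of the anticipated-pleasantness composite;
# all names are hypothetical.
# Assumptions: strained and awkward are the unfavorable items, and
# ratings run from 1 to 20 (inferred from the subtraction from 21 and
# from the stated 5-to-100 range of the composite).
FAVORABLE_ITEMS = ("friendly", "spontaneous", "enjoyable")
UNFAVORABLE_ITEMS = ("strained", "awkward")
REVERSE_FROM = 21


def pleasantness_score(ratings):
    """Sum favorable ratings and reverse-scored unfavorable ratings."""
    favorable = sum(ratings[item] for item in FAVORABLE_ITEMS)
    unfavorable = sum(REVERSE_FROM - ratings[item]
                      for item in UNFAVORABLE_ITEMS)
    return favorable + unfavorable


# Hypothetical ratings for one subject.
example = {"friendly": 15, "spontaneous": 12, "enjoyable": 14,
           "strained": 6, "awkward": 8}
print(pleasantness_score(example))  # 15 + 12 + 14 + 15 + 13 = 69
```

Under these assumptions, five items each contributing between 1 and 20 give a composite between 5 and 100.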

An analysis of variance of the measure of 
anticipated pleasantness of the discussion re- 
vealed that the main effect of type of rela- 
tionship was significant, F(1, 72) = 5.01, p 
< .05. The main effects of aid and of request 
and the interactions between type of relation- 
ship and aid and between type of relationship 
and request were not significant. The inter- 
action between aid and request was significant, 
F(1, 72) =7.31, p < .01, and the three-way 
interaction between type of relationship, aid, 
and request was also significant, F(1, 72) = 
5.61, p < .05. 

As can be seen in Table 3, anticipated 
pleasantness was approximately the same in 
all four of the exchange conditions, but there 
were differences within the communal condi- 
tions. Anticipated pleasantness was lower in 
the communal-aid-request condition than in
the communal-aid-no-request condition. A
simple comparison indicated that this difference
approached significance, F(1, 72) =
2.99, p < .10. Anticipated pleasantness was
greater in the communal-no-aid-request condition
than in the communal-no-aid-no-request
condition. A simple comparison of this
difference was significant, F(1, 72) = 11.19,
p < .001.


Discussion 


[...] exchange relationship is expected, it was found
that liking for the other was higher in the
exchange-aid-request condition than in the
exchange-aid-no-request condition. As predicted
from the hypothesis that a request for
a benefit after the person is aided [...]

[...] attraction when an exchange relationship is expected,
liking for the other was lower in the [...]

The hypothesis that a request for a benefit 
in the absence of aid from the other increases 
attraction when a communal relationship is 
expected was not supported; there was no 
difference in liking between the
communal-no-aid-request condition and the
communal-no-aid-no-request condition. The subjects in
the communal-no-aid-request condition may
have been somewhat uncertain about the intentions
of the other. Although the request
may have indicated to the subject that the
other wanted a communal relationship with
her and consequently led the subject to ex- 
pect such a relationship, it also may have 
reminded the subject that the other had not 
given her aid earlier. This reminder may have 
raised doubt about whether the other would 


MARGARET S. CLARK AND JUDSON MILLS


behave in an appropriate way for a communal 
relationship. This could explain why the re[...]


As would be expected from the distinction
between communal and exchange relationships,
liking was greater in the exchange-aid-request
condition than in the exchange-no-aid-request
condition, marginally less in the
communal-aid-request condition than in the
communal-no-aid-request condition, and less
in the exchange-aid-no-request condition than
in the exchange-no-aid-no-request condition.
The greater liking in the exchange-aid-request
condition than in the exchange-no-aid-request
condition could be due to a general
tendency for aid to increase liking, as well as
to the appropriateness of the request. However,
the fact that liking was less in the
exchange-aid-no-request condition than in the
exchange-no-aid-no-request condition is opposite
to what would be expected from a
general tendency for aid to increase liking,
but follows from the idea that differences in
liking are due to the appropriateness of the
other's behavior for the type of relationship.
That liking was less in the communal-aid- 
request condition than in the communal-no- 
aid-request condition is also opposite to the 
tendency for aid to increase liking and con- 
sistent with the effect on liking of the appro- 
priateness of the other's behavior for the type 
of relationship. 

Since the focus was on the interactive ef[...]

[...] did subjects in the exchange conditions.
The results for the measure of anticipated
pleasantness of the interaction were also com[...]

[...] pleasantness was similar in the four exchange
conditions, marginally less in the communal-aid-request
condition than in the communal-aid-no-request
condition, and greater in the
communal-no-aid-request condition than in
the communal-no-aid-no-request condition.

The [...]
can be understood in terms of the assumption
that one will anticipate interaction with the
other to be more pleasant if one expects a
communal relationship than if one expects an
exchange relationship. In the exchange conditions,
subjects were led to expect an exchange
relationship by the experimental instructions
given in those conditions. Subjects in the
communal-aid-request condition should have expected
an exchange relationship, because the
rules of a communal relationship were violated
by the request following the prior aid.
Subjects in the communal-no-aid-no-request
condition should also have expected an exchange
relationship, because the rules of a
communal relationship were violated by the
other's failure to respond to their needs or
to request something from the subject for
which the other presumably had a need.

The interpretation of the results for the 
measure of anticipated pleasantness might
appear inconsistent with the fact that liking
was not significantly higher in the communal-
no-aid-request condition than in the com-
munal-no-aid-no-request condition. However,
it is possible that the request for a benefit in
the absence of prior aid was sufficient to 
create an expectation of a communal rela- 
tionship yet insufficient to increase liking. As 
mentioned earlier, the request for aid in the 
communal-no-aid-request condition may have 
been taken as an indication that the other 
felt positively toward the subject and thus 
led the subject to expect a communal rela- 
tionship with that other, but it may also 
have reminded the subject that the other had
not fulfilled her needs earlier, resulting in 
ambivalent feelings toward the other. 




General Discussion 

While it is assumed that the distinction be-
tween communal and exchange relationships
is made implicitly by most people in their
interactions with others, it is not assumed
that they are explicitly aware of the distinc-
tion or are able to describe how it affects
their reactions. Certainly they do not use the
terms communal and exchange relationships.
It is also not assumed that everyone makes
the distinction in the same way. Some people
restrict their communal relationships to only
a very few persons, whereas others have com-
munal relationships with a wide circle of
others. There are some people who do not
make the distinction at all. Some people treat
every relationship, even relationships with 
members of their own immediate family, in 
terms of exchange. 

It is possible for a person to have both a
communal relationship and an exchange re- 
lationship with the same other, for example, 
when a person sells something to a friend or 
hires a family member as an employee. In 
such instances, a distinction is typically made
between what is appropriate for the business 
(exchange) relationship and what is appro- 
priate for the family or friendship (com- 
munal) relationship. Exchange relationships 
sometimes can develop into communal rela- 
tionships, such as when a merchant and a 
customer become close friends or when an 
employer and an employee marry.

The lack of attention paid to communal 
relationships in previous research on inter- 
personal attraction may be accounted for by 
the fact that almost all of the past research 
has involved attraction toward persons who 
are not only previously unknown to the sub- 
ject but who are not expected by the subject 
ever to be known in the future. Communal 
relationships involve an expectation of a long-
term relationship, whereas exchange relation-
ships need not be long-term. However, the
variables of communal versus exchange rela-
tionship and expected length of the relation-
ship are conceptually independent. Exchange
relationships may be expected to continue 
over a long period. 

If it is true that treating a communal rela-
tionship in terms of exchange compromises
the relationship, then exchange theories of 
interpersonal attraction (e.g., Secord & Back- 
man, 1974, chapter 7) may create a mislead- 
ing impression about the development and 
breakup of intimate relationships. The idea 
that exchange is the basis of intimate relation-
ships may actually have the effect of impair-
ing such relationships. For example, the rec-
ommendation, which seems to be growing in
popularity, that prior to marriage a marriage 
contract be drawn up that specifies in detail 
what each partner expects from the other, 
should, if followed, tend to undermine the 
relationship. 

If the theoretical viewpoint of this research 
is correct, a communal relationship will be 
strained by dickering about what each of the 
partners will do for the other. Of course, if
one of the partners in a communal relation-
ship is convinced that he or she is being ex-
ploited by the other because that person is
concerned about the other's welfare while the
other is not concerned about his or her wel-
fare, the communal relationship has disin-
tegrated. If this happens in a marriage, there
may be attempts to preserve the marriage by
changing it into an exchange relationship 
through dickering.

Person Memory: Personality Traits as Organizing
Principles in Memory for Behaviors

subjects studied and recalled sentences describing behaviors while performing a
laboratory impression-formation task. Recall was high for behaviors that were 
incongruent with a personality-trait impression for a character, whereas recall 
was much lower for behaviors that were congruent or neutral with reference to 
the impression. Set size, the number of congruent and incongruent behaviors at- 
tributed to the character, was shown to be a major determinant of this result. 
The smaller the size of the incongruent set, the higher the probability of recalling 
an item from the set. There was no tendency for behaviors to cluster by trait 
category in recall output protocols. This result was interpreted as evidence that
a simple analogy to hierarchical noun categories, studied in many verbal learning 
experiments on organization of memory, did not apply to the present results. 
Three theoretical analyses—an associative network model, a depth-of-processing 


model, and a schema model—are reviewed in light of these results. 


Psychologists have been interested for a 
long time in the effects of abstract organizing 
principles on memory. Sir Frederic Bartlett's 
(1932) schema theory is one of the earliest
efforts to explicate the relations between an
abstract structure and the recall of specific
facts. Bartlett proposed that memory is not
primarily or literally reduplicative or repro-
ductive. Rather, he argued that perceived
events are assimilated to mental schemata
that have been formed by similar events ex-
perienced in the past. Bartlett's classic trea-
tise on remembering is replete with examples
of phenomena that he believed evidenced the
workings of these schemata.

Researchers have studied the manner in
which experimental subjects use well-defined
organizational structures in numerous memory
tasks. Miller (1956), Mandler (1967), Tul- 
ving (1968), Bower (1970), and Rosch 
(1973) conducted pioneering research on the 
operation of semantic structures in the recall 
of words; Bransford and his colleagues 
(Bransford & Franks, 1972; Bransford & 
Johnson, 1973) have studied the operation 
of linguistic and spatial schemata in recog- 
nition and recall; and Kintsch (1974), Thorn- 
dyke (1977), and Mandler and Johnson 
(1977) have studied the operations of sche-
mata in memory for prose paragraphs. One
consistent conclusion from these experiments 
is that recall of specific facts is affected by 
their relation to an abstract organizing prin- 
ciple, theme, or schema. 

Psychologists interested in social judgment 
have also studied organization—the organiza- 
tion of attitudes and beliefs (Abelson et al., 
1968; McGuire, 1968), the perception of so-
cial groups (Heider, 1958; Insko, Songer, &
McGarvey, 1974), and the organization of 
impressions (Anderson, 1974b; Hastorf, 
Schneider, & Polefka, 1970; Rosenberg & 
Sedlak, 1972a). Recently, researchers have
attempted to integrate the two traditions to
study the manner in which social information
is stored and retrieved from memory. Ander- 
son and Hastie (1974), Sulin and Dooling 
(1974), D'Andrade (1974), Lewis and An- 
derson (1976), Cantor and Mischel (1977), 
Markus (1977), Snyder and Uranowitz 
(1978), and Rogers, Kuiper, and Kirker 
(1977) have studied the operation of general 
organizing principles, themes, or prototypes 
in the perception and memory of information 
about people. Several of these researchers 
have suggested that when information is 
stored in memory about an individual person, 
personality-trait dimensions or personality 
prototypes provide the mental categories for 
the new information. Thus, inferences about
abstract personality traits will organize the
perception, storage, and retrieval of informa-
tion about people.

In the three experiments reported in this 
article, subjects studied and recalled lists of 
behavior descriptions. The behaviors in each 
list were attributed to a single fictional char- 
acter, along with a brief personality-trait 
sketch. The behavior descriptions were se- 
lected to be congruent or incongruent with 
respect to the personality sketch, and the 
focus of the experiments was on differential 
recall of these descriptions. 

Differential recall of specific behaviors vary- 
ing in congruence to the general personality 
impression would be taken as evidence that 
the impression controls the encoding, reten- 
tion, or retrieval of social information. Results 


gruent and incongruent behavior descriptions. 


Experiment 1 
Method 


with a personality trait, 4 describing behaviors that
were incongruent with the trait, and 4 describing
behaviors that were neutral with respect to the
trait. The three types of sentences were distributed
uniformly across input serial positions. Each of the
six lists of behaviors studied by a subject was asso-
ciated with a different personality trait. There were
12 different traits in the initial set of experimental
material, and so two sets of 6 traits each were
arbitrarily created to produce two replications of
the design. Finally, each of the 6 subjects studying
one of the 6-trait subsets received the 6 traits in a
different order. These orders were arranged in a
Latin square plan so that each trait occurred in a
different within-session position for each subject.
This design was replicated with a second set of 12
subjects.
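The Latin square counterbalancing described above can be illustrated with a short sketch. This is only one way to build such a plan (a cyclic square), not necessarily the arrangement used in the experiment; the function name and the placeholder trait labels are hypothetical.

```python
def latin_square_orders(traits):
    """Return one presentation order per subject.

    A cyclic Latin square: row s is the trait list rotated by s places,
    so every trait occupies every within-session position exactly once
    across the set of subjects.
    """
    n = len(traits)
    return [[traits[(subject + position) % n] for position in range(n)]
            for subject in range(n)]


if __name__ == "__main__":
    # Placeholder labels; the actual trait names are not assumed here.
    traits = [f"trait_{k}" for k in range(1, 7)]
    for subject, order in enumerate(latin_square_orders(traits), start=1):
        print(f"subject {subject}: {order}")
```

With 6 traits and 6 subjects per subset, each row gives one subject's order of lists, and each column shows that no trait repeats in the same session position.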

Materials. The memory experiment described above
required sets of sentences describing behaviors char-
acteristic of a representative sample of personality
traits. First, 12 traits were chosen from the #0 traits
studied by Romsderg sed Sedlak (1977D): mii- 
om , homers dar, response < oanira- 
tous, frieedly-henile, aggrettershy, and sairt- 
cynical). The sample was compennd of ate piir of 
“opposite meaning” traits that spanned the space
revealed by Rosenberg and Sedlak's multidimensional
scaling analysis. The pairs were presented to pre-
test subjects with an instruction of the form, “Con-
sider a person who is very intelligent, you would
expect to see him (her) ___.” The resulting lists of
pretest subject-generated behaviors were edited to
yield 12 three-to-five-word behavior descriptions for
each of the traits. Some examples of behaviors pres-
ented as congruent with the intelligent trait are
“won the chess tournament” and “attended the sy-


examples of neutral behavior 
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“dt u sad dnioe formed the adjec- 


the were typed on 
såra onde for diplay to the 
at wore Di unl. 


sh reported te this article, In no 
scatiatically reliable effect of the 


"pott 
The isdividssl experimental seatlons 
=sisiy 30 minutes. Subjects were told 


| 


ibe read aloud 70 
a S-sccperesentence vate, describing beha 
formed by the character. Third, 
( the sentenors as possible in 
free recall imetrections Finally, be or 
fictions! character's personality on aine 
This rating task was iechaded to emphasize 
pres formation task, The 
taks were consistent with averaging 
models for impresion formation—the 
on prevented that were congruent with a 
trait, the higher the rating of the stimulus character 
n that dimension. Beyond this 

atts of the rating task were not 


recall, and so they will not be discussed in this article.

Results


The primary data from the experiment are
the proportions of behavior descriptions re-
called from the congruent, incongruent, and
neutral subsets of each list. A note


forth. First, a lenient scoring criterion was 
employed: An item was scored as correctly
recalled if the general sense of the descrip-
tion was preserved (i.e., any changes from
the original were synonymous words). The data
reported in this article were tabulated
following this criterion. Second, a strict
scoring criterion was adopted: An item was
scored as correctly recalled if it differed
from the original in at most one important
word (and if this deviation did not affect
the meaning of the phrase). All analyses were
repeated with this criterion, and no
discrepancies from the present summary were
apparent.

Table 1
Mean Proportions of Behavior Descriptions
Recalled in Experiment 1

                 Item type
Input serial     Congruent    Incongruent    Neutral
position         behaviors    behaviors      behaviors
1 to 5           .41          .57            .38
6 to 10          .33          .48            .36
11 to 15         .35          .47            .27
16 to 20         .60          .61            [illegible]
Total            .43          .54            .38
n                1728         576            576

The proportions of recalled data are dis-
played in Table 1, broken down by item type
and input serial positions. The single striking
effect in the data is the superior recall of in-
congruent behavior descriptions in comparison
to recall of congruent and neutral descrip-
tions, F(2, 40) = 18.51, p < .01. A multiple
comparison test following the Tukey (a) pro-
cedure indicated that the incongruent mean
recall was reliably higher than the mean pro-
portions recalled from congruent and neutral
subsets (p < .01), although recall from these
latter two subsets did not differ reliably.
Twenty-one of the 24 subjects recalled a
higher proportion of incongruent behaviors
than either of the other two item types. The
serial position data show that the superior 


list. A statistical test of the Item Type ×
Serial Position interaction did not yield a
significant result, F(6, 120) = .98, ns, and
so the relationship between item type and serial
position has not been clearly established.

The pattern of recall did not vary system- 
atically from the first half of the experimental 
session to the second. However, the low in- 
trusion rate (recall of nonpresented behav- 
iors), less than one item per trial, did increase 
slightly over the course of the session. A
separate analysis on the order of recall of the 
three item types to determine whether con- 
gruent items were more likely to precede or 


(congruent, neutral, 
culated for each recall Protocol 
stimulus category repetition — 
ulus category Tepetition; 
field, 1966). An analysis of 
formed on these data revealed no 
order recall of behaviors according to the item 
type (the grand mean of 
was not significantly 
F(1, 20) = .82, ns, 


Discussion 


The results of this experiment are evidence
for the importance of personality traits as
organizing principles in memory for information
about people. Recall of behavior
descriptions is clearly affected
tion (congruent, i 
salient trait. It is 
gruent behaviors are so well recalled. 
Versions of (Bartlett, 1932; 
1972; Mandler & John- 


to observe clustering t 

categories disconfirms this ae. ba 
The present resuli are related to two phe- 

nomena observed in learning 

with more traditional list 

may be that the differential recall of con’ 


Sucut and incongruent behaviors is best 
viewed as a case of the 


effect in free recall: As list length 
the proportion of i recalled 
there are 


(Murdock, 1962). Since 
gruent behaviors than incongruent 
Portional recall of Congruent items 
a panler with this account i 

OF neutral items that 
frequency as the well- enbere ince am 
items. However, it may be that subjects no-
tice incongruent items and class them apart
from congruent and neutral behaviors without
distinguishing between the latter two types
of items. Second, the pattern of recall for the
incongruent items resembled the recall of 


striking “von Restorff 
spects—for instance, the apparent primacy 
effect in the serial Position curve for the is. 
congruent items from early serial postions 
(Bellezza & Cheney, 1973) 

It is important to note that these verbal
learning interpretations of the recall results
assume that congruent and incongruent items
are perceived and mentally categorized by the
abstract trait impression and its relation to
the individual behaviors. Thus, set size and
von Restorff interpretations imply that per-
sonality-trait impressions are organizing prin-
ciples controlling memory for the behavioral
information.


items in certain te 


Experiment 2
second experiment was designed to or- 
the 
recall 
haviors in 
3 included 


a 
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The materials from Experiment 1, with 
of two trait pairs ( and 
) were wpd in this experiment. Once 


claas: 

O fet was peel ee 
+ to establish am initial personality im- 

premion 

Subjects. The subjects were 24 undergraduate uni-
versity students paid $1.50 each to participate in the
30-minute experimental session.

Procedure. Instructions and procedures were the
same as in Experiment 1.


Results 


The primary data from Experiment 2 are 
the proportions of behavior descriptions re-
called. These data are displayed in Table 2
classified according to list type (12/4/0, 


type (congruent, 


haviors). An analysis 
from the three list types in 


of variance of the data 
which all three 


recall of incongruent items was com-
pletely determined by subset size. The fewer
incongruent items in a list, the higher the 
probability of recalling any particular item. 


size factor, F(1, 18) 
probability of recalling congruent 


did not depend on set size: F(3, 


behaviors 
54) = 2.24, ns, for congruent recall; F(3,
54) = 1.09, ns, for neutral item recall. Serial


Table 2
Mean Proportions of Behavior Descriptions Recalled in Experiment 2


position curves were extremely unstable and 
will not be described for this experiment. 

Again, as in Experiment 1, the Bousfield 
and Bousfield (1966) clustering index was not 
reliably greater than zero, F(1, 18) = .79, ns, 
indicating that subjects were not grouping 
items by trait categories in recall. 


Experiment 3 


A third experiment using a new sample of 
behaviors and four new traits was designed 
as a systematic replication of Experiment 2. 
Furthermore, the experiment was designed to 
obtain stable serial position curves. 


Method 


Design. Each subject studied and recalled sen- 
tences from four 14-sentence lists. Each list was 
associated with a personality trait and included a 
mixture of congruent and incongruent behaviors. 
There were four types of lists containing (a) 13 
congruent items and 1 incongruent item; (b) 11 con- 
gruent and 3 incongruent items; (c) 9 congruent 
and 5 incongruent items; and (d) 7 congruent and 7 in- 
congruent items. Each subject studied one list of 
each type. Precautions were taken to counterbalance 
the materials against serial position and sequence 
effects. 

Materials. Sentences describing behaviors char- 
acteristic of four personality traits were generated 
by pretest subjects, as in Experiment 1. The four 
traits, intelligent, stupid, friendly, and hostile, were 
selected to span Rosenberg and Sedlak’s (1972a) 
trait space representing poles of their social and 
intellectual dimensions. A separate sample of 48 
subjects rated each of the behavior descriptions on 
intelligence, friendliness, and imageability scales. 

During the memory experiment, each list was pre- 
ceded by an ensemble of five trait adjectives selected 
to be close to the key trait in meaning (for example, 


Table 2 (continued)

                                          Item type
                                          Congruent           Incongruent                 Neutral
                                          behaviors           behaviors                   behaviors
List type                                 M            n      M            n              M            n
12 congruent/0 incongruent/4 neutral      .56          576    [illegible]  [illegible]    [illegible]  [illegible]
11 congruent/1 incongruent/4 neutral      [illegible]  528    [illegible]  [illegible]    [illegible]  [illegible]
9 congruent/3 incongruent/4 neutral       .49          432    .61          [illegible]    [illegible]  192
6 congruent/6 incongruent/4 neutral       .50          288    .59          288            .42          192

Table 3
Mean Proportions of Behavior Descriptions
Recalled in Experiment 3

                                Item type
                                Congruent           Incongruent
                                behaviors           behaviors
List type                       M            n      M            n
13 congruent/1 incongruent      .60          624    .79          [illegible]
11 congruent/3 incongruent      .62          528    [illegible]  [illegible]
9 congruent/5 incongruent       [illegible]  432    [illegible]  240
7 congruent/7 incongruent       .61          336    .67          336


intelligent, knowledgeable, reliable, skilful, persistent,
and scientific).

Subjects. The subjects were 48 undergraduate uni-
versity students. They were paid a wage of $2 each
to participate in the 1-hour experimental session.

Procedure. The experimental sessions followed the
procedure used in the earlier experiments. Subjects
studied trait and behavioral information about a
character, recalled the behaviors, and then rated their
impression of the character. These tasks were re-
peated for each of four characters.


Results 


The pattern of recall data replicated the 
results of Experiment 2 in almost every de-
tail. These data are displayed in Table 3
classified according to list type (13/1, 11/3,
9/5, 7/7 congruent/incongruent behavior sub-
set sizes) and item type (congruent or incon-
gruent behavior).

Again, a large set-size effect appeared for 
the recall of incongruent items, such that the 


was recalled, linear trend F(1, 47) = 3.80,
p < .05. No such effect was apparent in the
congruent-item recall data, main effect F(3, 


A further breakdown of these data yielded 
the serial position curves displayed in Figure


incongruent items appear, the earliest pre-
sented incongruent items have the greatest
advantage. 


Twelve pseudosubject units were created by
aggregating data from subsets of 4 subjects 
(in the original sample of 48) to permit an 


the serial position results, 
Again, an analysis of the Bousfield and 


Category in recall, F(1, 47) = 
A test of the “pure” trait 

Ga recall was performed to answer the ques- 

tion: When congruent and incongruent 
[Figure 1 plot: proportion recalled as a function of input serial position.]

Figure 1. Serial position curves for Experiment 3. (Congruent behavior recall is indicated by
filled circles, incongruent behavior recall by unfilled circles. The panels present data from the four
set-size conditions in the experiment: [a] 13 congruent items and 1 incongruent item; [b] 11
congruent items and 3 incongruent items; [c] 9 congruent items and 5 incongruent items; [d] 7
congruent items and 7 incongruent items.)


jects recalled equal numbers of congruent and 
incongruent items, 5 recalled more congruent 
than incongruent items, and 14 recalled more 
incongruent items than congruent ones. In 
Experiment 3, 8 subjects recalled equal num- 
bers, 11 recalled more congruent items, and 
29 recalled more incongruent items. 


General Discussion 


The research reported in this article has 
established four results about the recall of 
information about people. First, specific acts 
that are unexpected or incongruent with ref- 
erence to a person’s general impression are 
well remembered compared to acts that are 
unsurprising or congruent. Neutral acts that 
are uninformative for the impression-judg- 
ment task are least well recalled. Second, the 


primary determinant of the differential recall 
of congruent and incongruent acts is the rela- 
tive number of each type of act performed 
by the person. When the person performs 
only a few incongruent acts, the relative re- 
call of incongruent items is quite high. When 
the person performs equal numbers of con- 
gruent and incongruent acts, the relative re- 
call advantage of incongruent acts is much 
smaller. Third, differential recall of congru- 
ent and incongruent acts is most pronounced 
in the central portions of the serial position 
sequence. Both types of acts are well recalled 
from early (primacy) and final (recency) 
serial positions. Fourth, there was no group- 
ing of items in recall by trait categories. 
Clustering measures revealed no tendency 
to recall congruent and incongruent acts 


grouped in any orderly pattern.

Conflicting Findings of Other Studies


We should mention several conditions that 
may limit the generality of the present re- 
sults. First, the personality dimension that 
defines congruence-incongruence in the ex- 
perimenter’s eyes must also be prominent in 
the subject’s perception of the characters. In
the present research, we guaranteed this con- 
dition by using personality dimensions that 
have been shown to dominate subjects’ social 
judgments in previous research, and we re- 
quired subjects to perform a personality-im- 
pression task to make the dimensions even 
more salient. Second, the pace at which in- 
formation was presented was leisurely in 
the experiments, allowing the subject to re- 
flect for several seconds on each act and to 
distribute rehearsal and attention optionally. 
Third, the retention intervals between pre- 
sentation of to-be-recalled behavior descrip- 
tions and the free-recall test were brief— 
under 5 minutes for all items. Fourth, the
instructions allowed subjects to anticipate 
the recall tests so that recall was an inten- 
tional memory test rather than an unex- 
pected, incidental memory test. Fifth, all 
memory tests followed free-recall procedures.
It may be that the present results would not 
generalize to cued recall or recognition
test conditions (cf. Cantor & Mischel, 1977). 

Several experiments (Bear & Hodun, 1975; 
Cantor & Mischel, 1977; Greenwald & Sa- 
kumura, 1967; Picek, Sherman, & Shiffrin, 
1975; Snyder & Uranowitz, 1978; Zadny & 
Gerard, 1974) have found that information 
that is consistent with an impression or ex-
pectation is better remembered than incon-
sistent information. Each of these experi-
ments differs from the present research on
one or more of the conditions we have listed. 
Furthermore, several experiments have found 
that novel (Greenwald & Sakumura, 1967), 
distinctive (Hamilton & Gifford, 1976), or 
schema-incongruent (Smith, 1973) informa- 
tion is well remembered. The obvious path 
for future researchers is to identify the vari-
ables that interact with degree of consistency
or congruence to produce one pattern of re-
sults or another.

A Simple Network Model


tween semantic and episodic memory stores 
proposed by Tulving (1972). Conceptual
memory controls the “pattern recognition
processes” of identification, comprehension,
and categorization of on-going behavior, and
it has usually been called implicit personal-
ity theory by social psychologists (Rosen-
berg & Sedlak, 1972a; Schneider, 1973). As


major reasons for this assumption. First, 
there is a tradition of describing conceptual
person memory, implicit personality the-
ory, as a multidimensional

studying the list of facts about one of our
characters produces a mental structure in 
episodic memory best described as a network 
of associative links and idea nodes. We hy-
pothesize that a typical structure encoded
in our experimental task has a hierarchical 
structure of the form depicted in Figure 2. 
At the highest (entry point) level in the 
structure are features or ideas such as the 
individual's proper name, definite descrip- 
tions (“the best lawyer in town”), or per- 
haps a representation of the individual’s 
physical appearance. At the second, inter- 
mediate, level of the structure are organiz- 
ing principles—trait labels in the present ex- 
periments. Finally, at the lowest level in the 
structure are specific facts or attributes of 
the individual—behavior descriptions in this 
research. (The HAM analysis actually pro- 
vides for a much finer-grained analysis of 
this structure.) Thus, we hypothesize that 
behaviors are classified into trait categories 
and stored accordingly during the study 
phase of our experiments. These assumptions 
are plausible if we keep in mind that the ex- 
perimental materials were generated such that 
each behavior was distinctly and uniquely 
associated with a single trait category, that 
incongruent acts were strikingly opposed to 
congruent acts, and that list presentation 




was at a leisurely pace so that subjects had 
time to review each act carefully. 

Second, we suppose that during the reten- 
tion interval, a matter of seconds or minutes 
in these experiments, links slowly disintegrate 
or fade from memory. 

Third, we postulate a variety of processing 
rules that operate on the stored memory 
structure when the signal to recall facts about 
an individual is given. (a) A search process 
starts at the highest (proper name) node in 
the structure and traverses the associative 
links downwards until it reaches a terminal, 
behavior description, node. (b) Upon reach- 
ing a behavior description, the subject re- 
views it, to see if that item has been recalled 
previously, and writes the description on 
the response form. In our example (Figure 
2), the search process would start at the node 
labeled “James Bartlett,” might take the
left-hand, “honest trait” path, and then se- 
lect the path terminating in the behavior 
description “admitted he caused the acci- 
dent.” (c) If the act has already been re- 
called, the subject does not write it down. 
(d) After a behavior has been reviewed, the 
search process starts over, from the highest 
(entry point) node. (e) The probability that 
the search process chooses a particular link 
as it passes through a node is determined by 


Fu 
IN THE POKER HIS BOSS 
GAME 


trait information and specific be- 


network structure to represent abstract i 
babilities of choosing a particular 


. A hierarchical 
haviors in long-term memory. (Numbers in parentheses are pro 
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the number of other links emanating from 
the node. The probability of choosing a par- 
ticular path is simply the reciprocal of the 
number of “departing” links. These tra- 
versal probabilities are indicated by the num- 
bers in parentheses in the example in Figure 
2. (f) The subject cycles through this re- 
trieval routine for a fixed number of trials. 
(Twelve to 14 cycles provide the best fits to 
data from the present research.) 
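The retrieval rules (a) through (f) can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' fitted model: it assumes exactly two trait nodes under the person node (one holding the congruent behaviors, one holding the incongruent behaviors), it ignores the fading of links during the retention interval, and it fixes the number of search cycles at 13, a value inside the 12 to 14 range mentioned above. The function names are hypothetical.

```python
import random


def closed_form(set_size, n_traits=2, cycles=13):
    """P(a given behavior is reached at least once in `cycles` searches).

    Each search reaches a particular behavior with probability
    (1 / n_traits) * (1 / set_size), the product of the reciprocal
    link counts along its path (rule e).
    """
    p_hit = 1.0 / (n_traits * set_size)
    return 1.0 - (1.0 - p_hit) ** cycles


def simulate(n_congruent, n_incongruent, cycles=13, trials=20000, seed=0):
    """Monte Carlo estimate of the proportion recalled for each item type.

    Both set sizes must be at least 1 in this sketch.
    """
    rng = random.Random(seed)
    sizes = {"congruent": n_congruent, "incongruent": n_incongruent}
    traits = list(sizes)
    hits = {t: 0 for t in traits}
    for _ in range(trials):
        recalled = set()
        for _ in range(cycles):                  # rule (f): fixed number of cycles
            trait = rng.choice(traits)           # rules (a), (e): equal chance per trait link
            item = rng.randrange(sizes[trait])   # rule (e): equal chance per behavior link
            recalled.add((trait, item))          # rules (b), (c): repeats are not written again
        for trait, _item in recalled:
            hits[trait] += 1
    return {t: hits[t] / (trials * sizes[t]) for t in traits}


if __name__ == "__main__":
    # List types from Experiment 3: congruent/incongruent set sizes.
    for n_con, n_inc in [(13, 1), (11, 3), (9, 5), (7, 7)]:
        est = simulate(n_con, n_inc)
        print(n_con, n_inc,
              round(est["congruent"], 2), round(est["incongruent"], 2),
              round(closed_form(n_con), 2), round(closed_form(n_inc), 2))
```

Under these assumptions the chance of recalling any one behavior depends only on the size of its own set, so the sketch produces a set-size effect for both item types and equal recall when the two sets are the same size.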

There are several features of this model 
that may seem implausible to some readers. 
First, the model will seem too simple, since 
it relies on a single, orderly mnemonic struc- 
ture and a small set of processing rules. Of 
course, the model is not nearly as simple as 
it appears in this rendering. We urge the in- 
terested reader to review the elaborate de- 
velopment in Anderson’s (1976; Anderson 
& Bower, 1973) lengthy theoretical treatises. 
For the reader who is still dissatisfied, we 
would argue that the model is complex enough
to account for the major features of the data 
we have presented thus far. Second, the blind, 
probabilistic search process embodied in the 
retrieval rules may not appear “intelligent” 
enough to some readers. It is important to 
note the pattern of overt behavior that char- 
acterizes the recall phase of these experi- 
ments. The subject sits quietly for a few 
seconds, presumably scanning his or her 
memory for relevant information, and then 
spends several seconds writing out the sen- 
tence summarizing the information recalled. 
Then the sequence is repeated with longer
and longer pauses between writing. The task 
of writing interrupts the subject’s search 
processes and forces him or her to start again 
“from the top.” This process accords with 
the weak evidence from subjects’ reports of 
the experience of performing the recall task. 
The “clogging” effect, in which a subject is 
unable to avoid repeatedly recalling previ- 
ously recalled behaviors, is also consistent 
with the model and subjects’ interior views 
of the task. 

However, the model is not perfectly suc- 
cessful even given the current paltry data 
base. There are three subtle empirical fail- 
ures. First, the model predicts a set-size ef- 
fect for congruent-item recall as well as for 
within any experiment. Second, the model
predicts equal recall of congruent and incongruent
items when set sizes are equal. Experiments
1 and 2 showed a small, but reliable,
recall advantage for incongruent items
in equal set-size lists. Third, the simple network
model outlined here predicts that incongruent
items will tend to be recalled before
congruent items in any single output
sequence. An analysis of recall output sequences
revealed no consistent output priority
for incongruent items in any of the three
experiments. Thus, we are not overwhelmingly
in favor of the simple network model.
Its major theoretical function is to show


Lockhart's (Craik & Lockhart, 1972; Lockhart,
Craik, & Jacoby, 1976) levels-of-processing
framework.8 First, certain events in
the series of acts are perceptually striking or
distinctive. We assume that perceptual distinctiveness
is determined by the degree to
which an act is congruent with the general
impression of a person at the time when the
act occurs (Cantor & Mischel, 1977, make a
similar proposal in their discussion of recognition
memory for traits). Another expression
for this characteristic of an event is its
informativeness. Informative events are
novel, unexpected, or nonredundant with
previous information about a person. Crudely
speaking, events early in the sequence of
acts are more informative than later events;
events that disconfirm the current impression
of a person are more informative than events
that confirm that impression; and events
that are extremely inconsistent or incongruent
with a prevailing impression are more informative
than events that are consistent or
only slightly incongruent with that impression.
Of course, this list is merely a summary
of variables that have been demonstrated to
have large effects on the meanings of items
in an initial impression (Asch, 1946; Luchins,
1957) or to determine the weights assigned
items in information integration tasks
(Anderson, 1974b).

Second, distinctiveness or informativeness 
determines the depth to which an item is 
processed. The more distinctive or informa- 
tive an item, the deeper its processing. In 
the present experimental task, two types of
processing are particularly important: impression
change and causal attribution.

We hypothesize that the adjustment of an
individual's impression involves special processing
of the specific information that leads
to that adjustment. We also hypothesize that
the subject is very likely to generate causal
explanations for distinctive acts. This attribution
processing produces deep, durable
memory traces in Craik and Lockhart’s terms.

Third, deeply encoded acts are less likely
than shallowly encoded acts to undergo interference
or decay during the retention interval.
Finally, deeply encoded acts are retrieved
more easily than other acts during
the recall phase of the task. Figure 3 summarizes
this analysis for the present experimental
task. It is clear that this approach
can account for the congruence, set size, and 
serial position results. Again, the primacy 
results require a further comment. We as- 
sume that any initial items are informative 
and processed deeply. This means that initial 
congruent items will be well recalled rela- 
tive to congruent items from the central 
portions of the sequence, as was observed. 

The Craik and Lockhart account, with its 
emphasis on encoding or perceptual effects, 
is in the spirit of current models for impres- 
sion formation and attribution (Ajzen & 
Fishbein, 1975; Anderson, 1974a; Jones et
al., 1972; Kelley, 1973). Furthermore, the
Craik and Lockhart approach suggests some 
additional data analysis. For example, degree 
of incongruence should be positively related 
to probability of recall. If we compute cor- 
relations between probability of recall and 
rated extremity of the behavior on the trait 
dimension relevant to each list across incon- 
gruent behavior items, we consistently ob- 
tain positive coefficients, although they are 
not impressively large (Experiment 1, mean
r = .15; Experiment 2, mean r = .21; Experiment
3, mean r = .07; none of these
values is significantly different from zero).
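The correlation described above, between rated extremity and probability of recall across the incongruent items of a list, can be illustrated with a plain Pearson coefficient. The sketch below is an assumption-laden illustration: the function name `pearson_r` and the item values are hypothetical and are not data from the experiments.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values for the incongruent items of one list:
# rated extremity on the trait dimension and proportion of subjects recalling.
extremity = [2.0, 3.5, 4.0, 5.5, 6.0]
p_recall = [0.40, 0.55, 0.35, 0.60, 0.50]
r = pearson_r(extremity, p_recall)
print(r)
```

A mean r for an experiment would then be the average of such coefficients over its lists.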

We should point out that a hybrid model 
including components of both network and 


8 Recently the conceptual analysis underlying the 
levels-of-processing framework has been challenged 
(Baddeley, 1978; Nelson, 1977). We retain the levels- 
of-processing analysis for two reasons. First, acknowledging
that the critics have pointed out aspects
of the approach that need further conceptual de- 
velopment, that require additional empirical inves- 
tigation, or that may be incorrect, we still believe 
that the framework is a useful theoretical analysis. 
Additionally, we believe that future work on the 
framework will address all of the critics’ objections 
and yield a more precise and more powerful theo- 
retical structure. Second, everyone agrees that vari- 
ations in subjects’ encoding behaviors are critical 
determinants of later recall test performance. The 
Craik and Lockhart framework is the richest cur- 
rent account for variations in processing quality as 
well as quantity. Our speculation is that there are 
differences in the quality of processing afforded in- 
congruent items as compared to congruent items. 
However, we admit that the present experiments do 
not provide convincing support for this inference. 
levels-of-processing formulations is easy to imagine. Our preference
would be to follow the levels-of-processing analysis of percep- ...
interbehavior links than shallow encoding ... Thus, an informative,
incongruent act will be ... network of interitem links between ...
be-explained act and other information ... the actor. Retrieval would
follow the network model rules of tracing paths ... the memory structure
and reading ... formation when new nodes are reached.

... powerful than the network models we ... outlined. But, at present,
no such schema memory model is in the literature, ...

The present research establishes the significance of abstract trait
information in determining the recall of specific behaviors ... of
congruence or consistency ... acts attributed to a person depend on ...
stract trait impression. For the present,

model of learned helplessness, an investigation was conducted to examine the
effects of amount of helplessness training and internal-external locus of control
on subsequent task performance and on self-ratings of mood. Subjects were
divided into “internal” and “external” groups and were then given either high,
low, or no helplessness training on a series of concept-formation problems.
After completing a mood checklist, all subjects worked on an anagram task
presented as a second experiment by a second experimenter. The results revealed
that internals exhibited greater performance decrements and reported greater
depression under high helplessness than did externals. In the low helplessness
conditions, internals tended to perform better than control subjects, while
externals tended to perform worse than control subjects; low helplessness subjects
also reported the highest levels of hostility. The results are discussed within the
context of Wortman and Brehm’s integration of reactance and learned helplessness
theories.


Seligman (1972, 1975) has proposed that 
learned helplessness, a phenomenon originally 
demonstrated in animal research, may under- 
lie human reactive depression. According to 
learned helplessness theory, when individuals 
are exposed to situations in which reinforce- 
ments are independent of responses, they 
gradually develop the expectation that rein- 
forcements cannot be controlled. This expectation
of uncontrollability reduces the
person’s motivation to respond in new situa- 
tions, interfering with performance on new 
tasks, and leads to anxiety and a state of 
depression, because of the perceived lack of 
control over outcomes. 

Wortman and Brehm (1975) have proposed
an integration of the learned helpless-
ness model with reactance theory. According 
to reactance theory, individuals who expect 
to have control (i.e., freedom to obtain an 
outcome) are motivationally aroused to re- 
store control when it is threatened or re- 
duced. This motivational arousal may be 
accompanied by hostility and aggression. In 
integrating the two theories, Wortman and 
Brehm suggested that mild experiences with 
helplessness training will result in reactance 
and improved performance on subsequent 
tasks in an effort to reassert control, while 
more extensive experience with uncontrolla- 
bility will lead to passivity and debilitated 
performance on new tasks. Several investi- 
gators have reported evidence consistent 
with this hypothesized relationship between 
amount of helplessness training and subse- 
quent task performance (e.g., Roth & Bootzin,
1974; Roth & Kubal, 1975).


Learned Helplessness and Mood 


There is a substantial amount of evidence consistent with the hypothesis
that experience with uncontrollable contingencies leads to depression.
Miller and Seligman (1973) found that depressives were more likely than
nondepressives to view reinforcements as independent from responses.
Klein, Fencil-Morse, and Seligman (1976) found that nondepressed subjects
given helplessness training on a series of unsolvable problems ...
Gatchel, Paulus, and Maples (1975) and Miller and Seligman (1975) found
that subjects ... exhibited increases in levels of reported depression,
anxiety, and hostility on the Multiple Affect Adjective Checklist ... to
assess the relative balance of depression and anxiety, gave helplessness
subjects 10 trials of noncontingent feedback on each of ... Half of ...
told that they had succeeded on each problem. Results on the mood
checklist indicated that noncontingent feedback-failure subjects changed
in the direction of ... depression after treatment, while subjects in the
noncontingent feedback-success group ... Both groups exhibited performance
decrements on a posthelplessness solvable anagram ... obtained
self-ratings in one investigation that revealed that subjects who
perceived no control over an aversive stimulus rated themselves as
significantly more helpless, incompetent, and weak than subjects who
perceived control over the same stimulus.

While there is considerable support for a relationship between
helplessness training and depression, the question of whether different
moods are experienced by subjects after different amounts of helplessness
training has received less attention. One major purpose of the present
study was to ... as Wortman and Brehm (1975) imply, subjects initially
respond to uncontrollability by experiencing frustration and/or hostility
but become resigned and depressed as uncontrollability continues.


Learned Helplessness and Locus of Control


The second major purpose of the present ... Internal-External (I-E) Locus
of Control (Rotter, 1966) refers to the extent to ... reinforcements as
response dependent or ... Internal ... to perceive reinforcements to be
response dependent, ... that initial expectations of control affect ...
pected that internals and externals ... uncontrollability than ...
internals ... pronounced ... It might also be suggested that ... expect
control will be ... more uncomfortable ... bility and the situation ...
predictions ... were examined ... study by exposing groups of internals
and externals to one of several levels of helplessness training.


Method 


Design 


The design was a 2 × 3 between-groups factorial,
with three levels of helplessness training on a con- 
cept formation task (high helplessness, low helpless- 
ness, and no-helplessness control) crossed with two 
groups of subjects selected on the basis of their 
extreme internal or external scores on the I-E scale 
(Rotter, 1966). The proportion of noncontingent 
incorrect feedback on the concept formation problems
was held constant at .5 for all low and high
helplessness subjects. A yoked contingent and non- 
contingent feedback design was not employed, since 
practice effects could be expected to lead to differ- 
ential overall proportions of correct and incorrect 
feedback as a function of the number of problems 
in the contingent feedback conditions.2 A review 
of previous studies that employed similar concept- 
formation tasks and both contingent feedback (non- 
yoked) and no-feedback control groups (e.g., Ben- 
son & Kennelly, 1976; Griffith, 1977; Hanusa & 
Schulz, 1977; Hiroto & Seligman, 1975; Klein et al., 
1976; Roth & Kubal, 1975; Tennen & Eller, 1977)
indicated that contingent feedback either enhanced
performance or did not affect performance compared
to no-feedback or baseline subjects. A no-
feedback procedure was therefore employed in the 
no-helplessness control conditions of the present 


study. 


Subjects 


Ninety students were selected on the basis of 
their scores on the Rotter I-E scale. Three hundred 
eighty-five students in introductory psychology were 
given the I-E scale by their discussion leaders. Sub- 
jects with scores above 15 and below 8 on the I-E 
scale were assigned to the external and internal 
groups, respectively, and were called and asked to 
participate in two different learning experiments. 
The 90 subjects were randomly assigned to one of 
the three conditions within levels of externality. 
Each subject was run individually and received 
two credits for participation in the experiment. 
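The selection and assignment procedure can be sketched as follows. The cutoffs (above 15, below 8) come from the text; everything else is an assumption for illustration: the function name `select_and_assign`, the subject identifiers, and in particular the rotation through conditions after shuffling, which is one simple way to obtain random assignment with balanced cells and is not described in the source.

```python
import random

CONDITIONS = ["high helplessness", "low helplessness", "control"]

def select_and_assign(scores, rng=None):
    """Select extreme I-E scorers and assign them to conditions.

    scores maps a subject identifier to an I-E scale score.
    Returns a dict: subject identifier -> (group, condition).
    """
    rng = rng or random.Random()
    groups = {
        "external": [s for s, x in scores.items() if x > 15],
        "internal": [s for s, x in scores.items() if x < 8],
    }
    assignment = {}
    for group, members in groups.items():
        members = members[:]           # copy before shuffling
        rng.shuffle(members)           # random order within the I-E level
        for i, subject in enumerate(members):
            # rotating through the conditions keeps cell sizes balanced
            assignment[subject] = (group, CONDITIONS[i % len(CONDITIONS)])
    return assignment

# Hypothetical scores, one subject per possible score from 0 to 23.
scores = {f"s{i}": i for i in range(24)}
print(select_and_assign(scores, rng=random.Random(0)))
```

Subjects with scores from 8 to 15 are simply left out of the returned mapping.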


Procedure 


A series of five-dimensional stimulus patterns 
previously used in discrimination learning studies 
(Levine, 1966) and in other helplessness studies (cf. 
Hiroto & Seligman, 1975; Klein et al., 1976) were 
used for the treatment task. Each of the five di- 
mensions had two values: (a) letter (A or T); (b) 
letter color (black or red); (c) letter size (upper- 
or lowercase); (d) border surrounding letter (circle
or square); and (e) underline (dotted or solid). 
Two stimulus patterns were shown on each 3 × 5
card. One pattern consisted of a set of values from
each of the five dimensions; the other pattern consisted
of the complementary values of the dimensions.
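The construction of a stimulus card can be sketched as below: one pattern takes a value on each of the five dimensions and the other pattern takes the opposite value on every dimension. The dimension values follow the text; the names `DIMENSIONS` and `make_card`, and the choice to draw the first pattern at random, are assumptions made for illustration.

```python
import random

# The five two-valued dimensions listed in the Procedure.
DIMENSIONS = {
    "letter": ("A", "T"),
    "letter color": ("black", "red"),
    "letter size": ("uppercase", "lowercase"),
    "border": ("circle", "square"),
    "underline": ("dotted", "solid"),
}

def make_card(rng=None):
    """Return two patterns (dicts); the second is the complement of the first."""
    rng = rng or random.Random()
    first = {dim: rng.choice(values) for dim, values in DIMENSIONS.items()}
    second = {
        dim: values[1] if first[dim] == values[0] else values[0]
        for dim, values in DIMENSIONS.items()
    }
    return first, second

first, second = make_card(random.Random(3))
print(first)
print(second)
```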

All subjects were given six concept identification 
problems by the first experimenter, who was seated 
next to the subject at a table (the first experimenter 
was unaware of the subject’s I-E scale score). Sub- 
jects in the helplessness groups were told that for 
each card, the task was to choose which side of 
the card contained the “correct” value, arbitrarily 
set by the experimenter, and that from the experimenter’s
feedback (“correct” or “incorrect”) they
should learn the correct answer and choose cor- 
rectly as often as possible. Subjects in the low help- 
lessness groups were told that for several problems 
they would be asked to work without feedback, to 
provide a “baseline.” Subjects in the control groups 
received the same task description as those in the 
helplessness groups, but were told that the experi- 
menter needed a baseline of choices made without 
feedback for all of the problems.

All subjects initially received one 10-trial sample 
problem to clarify the tasks. After the sample prob- 
lem, all subjects were told that it was very im- 
portant for the experiment that they work hard, 
and if they did their best on the task in the ex- 
perimenter’s judgment, they would receive an addi- 
tional credit. These instructions were given to mo- 
tivate subjects to attend to the tasks and to try 
harder. Subjects then received six problems, 10 trials
per problem. 

High helplessness. Following the sample prob- 
lem and offer of credit incentive, predetermined 
noncontingent feedback (example: CIICICCICI) was
given for each problem; incorrect feedback was al- 
ways given on the last trial. After each set of 10 
trials, subjects were asked to state what they be- 
lieved was the “correct” value for that set but were 
given no feedback concerning their answer.3 They
were then told to start the next problem, whose 


2 Since the training conditions differed in the absolute
amount of feedback, it might be argued that
the sheer amount of negative feedback, rather than 
the noncontingency of feedback, could account for 
the obtained results. While this interpretation is 
possible, it is unlikely, since (a) other investigations 
in which yoked controls were employed have dem- 
onstrated helplessness effects (e.g., Glass & Singer, 
1972; Hiroto & Seligman, 1975), and (b) noncon- 
tingent success has been shown to produce helpless- 
ness (Benson & Kennelly, 1976). 

3 Pretesting indicated that subjects became sus- 
picious when they were given incorrect feedback at 
the end of all six problems, so subjects were simply 
given no feedback on their final guesses to eliminate 
suspicion. 
value might or might not be the same as that of
the previous problem. 
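A predetermined noncontingent feedback sequence of the kind described for the high helplessness condition can be sketched as follows. The sketch assumes that the proportion of incorrect feedback is one half and that "C" and "I" stand for "correct" and "incorrect"; the function name `feedback_schedule` and the use of shuffling to fix the order are hypothetical, since the source only states that the sequences were predetermined and ended with incorrect feedback.

```python
import random

def feedback_schedule(n_trials=10, p_incorrect=0.5, rng=None):
    """Build a response-independent feedback string for one problem.

    The final trial is always "I" (incorrect); the remaining feedback is
    ordered at random while keeping the overall proportion of "I" fixed.
    """
    rng = rng or random.Random()
    n_incorrect = round(n_trials * p_incorrect)
    # Reserve one "I" for the last trial and shuffle everything else.
    body = ["I"] * (n_incorrect - 1) + ["C"] * (n_trials - n_incorrect)
    rng.shuffle(body)
    return "".join(body) + "I"

# Six problems of 10 trials each, as in the high helplessness condition.
rng = random.Random(0)
schedules = [feedback_schedule(rng=rng) for _ in range(6)]
print(schedules)
```

Because the string is fixed before the subject responds, the feedback is by construction independent of the choices made.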

Low helplessness. For the first four problems, 
subjects were asked to choose the correct side and 
value without any feedback, to provide a baseline 
for their performance. For the last two problems, 
noncontingent feedback was given using the same pro-
cedure as in the high helplessness conditions. 

Control. Subjects in this condition were asked 
to guess the correct side for all problems without 
any feedback, to provide a baseline for guessing for 
use with other subjects who would receive feedback. 

In all conditions, the experimenter recorded all of 
the subject’s responses. No trial was allowed to exceed
15 sec; after 10 sec subjects were told that they
had 5 sec left to make their choice.

Mood assessment. The Multiple Affect Adjective 
Checklist (MAACL) Today Form (Zuckerman & 
Lubin, 1965) was given to all subjects immediately 
following the treatments. Subjects were told that
their responses would be anonymous and that they 
were being asked to fill out the checklist because 
the experimenter wanted to include their current feel- 
ings with the results. The MAACL provides mea- 
sures of three negative affective states: anxiety, de- 
pression, and hostility. After reading the standard 
instructions and completing the MAACL, all sub- 
jects received two credits and were thanked for 
their participation. 

Solvable test task. The test task was conducted 
at a different location by the second experimenter,
who was always blind to conditions, ...
was “developing norms for anagram solution times
from a cross section of students.” The subject's
task was to solve a series of anagrams. The experimenter
explained that there could be a pattern by
which to solve all of the anagrams but that it was
up to the subject to figure that out. Subjects were
told to give their solution ...
since their performance was being timed. If the 
subject stated an incorrect solution, the experimenter 
said, “No, that’s not it.” If the subject stated the 
correct word, the experimenter said, “Yes, that’s it,” 


recorded the time to solution, and handed the subject 
the next anagram. 


Twenty solvable ... 3 × 5 ... Tresselt and ... obtained: (a) trials to
criterion, defined as the solution of three consecutive
anagrams in less than 15 sec; (b) number of failures
to solve within 100 sec, at which point the trial
was ended; ...

... trials, the first experimenter entered the second experimenter’s ...
“forgotten” to give the subject a ... about the first learning experiment.
The questionnaire assessed perceptions of task solvability before
and after treatment, level of motivation, degree of
success on the task, and ... would be able to solve the
concept-formation problems. All subjects were then debriefed by the
experimenter.


Results


I-E Scale Scores


A 2 × 3 analysis of variance of the I-E
scores indicated the expected main effect for
the I-E variable, F(1, 84) = 1,151.44, p <
.001, such that subjects in the external conditions
had higher scores than subjects in the
internal conditions (M = 18.20 and 4.20, respectively).
Neither the main effect for helplessness
nor the two-way interaction was
significant, F(2, 84) = .39 and 1.16, respectively.


Anagram Performance


A 2 × 3 analysis of variance of the trials-to-criterion
measure indicated a significant
main effect for the helplessness variable,
F(2, 84) = 81.97, p < .001, such that subjects
in the high helplessness conditions performed
worse than subjects in either the low
helplessness or the control conditions. The
main effect for I-E was not significant. However,
the analysis indicated a significant interaction
between the helplessness and I-E
variables, F(2, 84) = 14.99, p < .001 (see
Table 1). Although internal and external
subjects did not differ from each other in
the control condition (F < 1), they responded
in opposite ways to the low helplessness ...
external subjects, however, performed ...
control-external subjects,
F(1, 84) = 5.92, p < .05. In the high helplessness
conditions, both internals and
externals performed significantly worse than ...
controls, F(1, 84) ...
p < .001, and F(1, 84) = ...,
respectively; ...


HELPLESSNESS AND CONTROL: 


Table 1
Mean Anagram Performance as a Function of Helplessness Training and
Internal-External Classification

                 High           Low
Group            helplessness   helplessness   Control

Trials to criterion
  Internal       16.80          5.66           7.66
  External       14.40          10.60          8.26
No. of failures to solve
  Internal       4.80           .13            1.40
  External       3.06           2.13           1.53
Mean response latency (in sec)
  Internal       51.02          24.84          29.51
  External       42.85          37.43          30.93

Note. The number of trials to criterion could range
from a minimum of 3 (if the first 3 anagrams were
each solved in less than 15 sec) to a maximum of 20
(if criterion was never reached); the number of
failures to solve could range from 0 (if every anagram
was solved) to 20 (if none of the anagrams
were solved); the maximum possible response
latency was 100 sec (the point at which a trial was
terminated).


subjects, F(1, 84) = 6.23, p < .05. Statistically
weaker Helplessness × I-E interactions of
similar form were also obtained for the number
of failures to solve, F(2, 84) = 3.75, p
< .05, and for mean response latencies,
F(2, 84) = 4.42, p < .05 (see Table 1).
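The three performance measures reported in Table 1 can be computed from a subject's sequence of solution times along the following lines. This is a sketch under stated assumptions: the function name `score_anagrams` is hypothetical, an unsolved anagram is assumed to be recorded with the 100-sec limit as its latency (consistent with the table note), and all 20 anagrams are assumed to be scored even after criterion is reached.

```python
def score_anagrams(times, criterion_sec=15, limit_sec=100, run_length=3):
    """Score one subject's anagram session.

    times lists the solution time in seconds for each anagram in order;
    a time at or above limit_sec marks a failure to solve.
    Returns (trials_to_criterion, failures, mean_latency).
    """
    failures = sum(1 for t in times if t >= limit_sec)
    mean_latency = sum(min(t, limit_sec) for t in times) / len(times)

    trials_to_criterion = len(times)   # value used if criterion is never reached
    run = 0                            # current run of fast consecutive solutions
    for trial, t in enumerate(times, start=1):
        run = run + 1 if t < criterion_sec else 0
        if run == run_length:
            trials_to_criterion = trial
            break
    return trials_to_criterion, failures, mean_latency

# Hypothetical session: three fast solutions, then 17 unsolved anagrams.
print(score_anagrams([5, 5, 5] + [100] * 17))
```

With 20 anagrams, the returned trials-to-criterion value ranges from 3 to 20 and the failure count from 0 to 20, matching the ranges given in the note to Table 1.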


Mood Indexes 


Depression. A 2 × 3 analysis of variance
of the depression scores from the MAACL
yielded a significant main effect for helplessness,
F(2, 84) = 91.86, p < .001, such that
subjects in the high helplessness conditions
were more depressed than those in the low
helplessness and control conditions. The main
effect of the I-E variable was also significant,
F(1, 84) = 5.12, p < .05, such that internal
subjects reported themselves to be
more depressed than external subjects. The
interaction between helplessness and I-E was
also significant, F(2, 84) = 6.34, p < .01.
While internal and external subjects did not
differ from each other in the control and low
helplessness conditions (Fs < 1), in the high
helplessness condition, internals were significantly
more depressed than externals,
F(1, 84) = 17.76, p < .001 (see Table 2).

Hostility. A 2 × 3 analysis of the hostility
scores (see Table 2) yielded a significant
main effect for helplessness, F(2, 84)
= 43.29, p < .001. There were no other significant
effects on this measure. Comparisons
of the marginal means indicated that high
helplessness subjects were significantly more
hostile than controls, F(1, 84) = 39.07, p <
.001 (overall M = 12.37 and 6.94, respectively).
Low helplessness subjects were significantly
more hostile than high helplessness
subjects, F(1, 84) = 8.05, p < .01 (overall
M = 14.83 and 12.37, respectively).

Anxiety. A 2 × 3 analysis of the anxiety
scores (see Table 2) yielded a significant
main effect for the I-E variable, F(1, 84) =
5.32, p < .05, such that internal subjects
overall reported more anxiety than external
subjects (overall M = 10.27 and 9.22, respectively).
The main effect for helplessness
was also significant, F(2, 84) = 49.64, p <
.001. Since no predictions had been made
for anxiety scores, a Newman-Keuls analysis
was performed on the marginal means. This
analysis indicated that although subjects in
the low and high helplessness conditions did
not differ, both groups reported significantly
greater anxiety than controls (p < .01). The
interaction on this measure was not significant.

Table 2
Multiple Affect Adjective Checklist Means

                 High           Low
Group            helplessness   helplessness   Control

Depression
  Internal       26.60          12.80          11.93
  External       20.86          13.13          12.00
Hostility
  Internal       12.80          15.73          7.00
  External       11.93          13.93          6.87
Anxiety
  Internal       12.40          11.20          7.20
  External       10.80          10.87          6.00

Note. For each mood measure, the higher the score,
the greater the mood. Mood scores could range from
a minimum of 0 to a maximum of 40 for depression,
28 for hostility, and 21 for anxiety.
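The overall (marginal) means quoted in the text can be checked against the cell means in Table 2. The sketch below assumes equal cell sizes, so that marginal means are unweighted averages of the cell means; the helper names `column_means` and `row_means` are hypothetical.

```python
# Cell means from Table 2; each list is ordered
# [high helplessness, low helplessness, control].
hostility = {"internal": [12.80, 15.73, 7.00], "external": [11.93, 13.93, 6.87]}
anxiety = {"internal": [12.40, 11.20, 7.20], "external": [10.80, 10.87, 6.00]}

def column_means(cells):
    """Marginal mean for each helplessness condition (averaged over I-E group)."""
    rows = list(cells.values())
    return [sum(col) / len(col) for col in zip(*rows)]

def row_means(cells):
    """Marginal mean for each I-E group (averaged over helplessness condition)."""
    return {group: sum(vals) / len(vals) for group, vals in cells.items()}

# Compare with the reported overall hostility means of 12.37, 14.83, and 6.94
# and the reported overall anxiety means of 10.27 and 9.22.
print(column_means(hostility))
print(row_means(anxiety))
```

Under the equal-cell-size assumption the computed values agree with the reported overall means to within rounding.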


Questionnaire Responses 


A 2 × 3 analysis of the question that asked
subjects how hard they had worked on the
concept-formation task indicated that there
were no significant main or interaction effects
(all Fs < 1). All groups reported that
they had worked hard on the task (on the
7-point scale, where 1 was labeled “extremely
hard,” the means ranged from 1.93 to 2.47).

The second question dealt with subjects’
perceptions of their performance on the concept-formation
task. A 2 × 3 analysis of variance
of these ratings indicated that there
was a significant main effect for helplessness,
F(2, 84) = 51.14, p < .001. There were no
significant effects for the I-E variable or for
the two-way interaction. Analysis of the
marginal means indicated that high helplessness
subjects rated themselves as less successful
than low helplessness subjects and
that the low helplessness subjects rated themselves
as less successful than control subjects
(p < .01, Newman-Keuls).

There were no significant main effects or
interactions on the ... whether subjects ...
perceptions of their ... task after they had
worked on it. A 2 × 3 analysis of variance ...

While the fourth ... subjects’ perception of their
own ability to solve the problems, the ...
subjects to what extent they believed that the
problems could be solved by someone else, ...
In order to examine the relative decline in
belief in one's own and another's ability to
solve, a 2 × 2 × 3 analysis of variance with
type of question as the repeated factor was
performed. In addition to main effects for
the type of question and the amount of helplessness
training, an interaction between
these two variables was obtained, F(2, 84)
= 13.56, p < .001. Control subjects believed
that both they and others could solve the
problems (M = 6.50 for self and 6.25 for
others on the 7-point scale), but both low
and high helplessness subjects reported that
they believed they were less able to solve than
others (in low helplessness, M = 4.13 for self
and 5.67 for others; in high helplessness, M
= 2.70 for self and 4.43 for others). Although
these data should be interpreted with caution,
since they were collected at the very
end of the experimental sessions, they seem
to indicate that subjects in the helplessness
conditions were attributing their inability to
solve to themselves. High helplessness subjects
in particular believed that they could
not solve the problems, but tended to believe
that others could.


Discussion


Overall Mood Patterns


The mood results were generally consistent
with Wortman and Brehm's (1975) prediction
that individuals will initially respond to
uncontrollability by experiencing reactance
but will eventually exhibit helplessness after
continued experience with lack of control.
In the present study, low helplessness subjects
were significantly more hostile than
either high helplessness ...
series of 7-point Likert-type scales to mea-
sure affect at the end of the second (test) 
task, in the present investigation moods were 
measured with the MAACL immediately fol- 
lowing the training task. In addition to the 
possibility that the measuring instruments 
used in the two studies may have tapped dif- 
ferent mood states, subjects’ ratings of their 
moods immediately after helplessness train- 
ing may differ from their subsequent retro- 
spective reports. Finally, since moderate in- 
ternals and externals were excluded from the 
present study but were presumably included 
in Roth and Kubal’s investigation, it is pos- 
sible that moderates on the I-E scale react 
differently than their more extreme counter- 
parts. 

The mood results also indicated that both 
high and low helplessness pretreatments pro- 
duced anxiety, consistent with the findings 
of Gatchel et al. (1975), Miller and Seligman
(1975), and Roth and Kubal (1975). The 
anxiety data do not, however, provide sup- 
port for the alternative hypothesis that im- 
paired performance following helplessness 
training is due to anxiety (Hebb, 1966, as 
cited in Wortman & Brehm, 1975). High and 
low helplessness subjects did not differ in 
reported anxiety but exhibited quite different 
levels of performance on the anagram test 
task. Furthermore, anxiety and ...
were not significantly correlated ...


Locus of Control, Mood, and Performance 


For internals, the pattern of performance 
results was also compatible with Wortman 
and Brehm’s prediction that individuals who 
initially expect control will attempt to Te- 
assert control after mild experiences of uni 
controllability but will eventually show help- 
lessness effects when the uncontrollability is 
extensive. Low helplessness internals exhib- 
ited improved performance on the anagram 
task relative to controls, while high helpless- 
ness internals displayed the greatest pedom: 
ance decrements in the experiment. Under 
low helplessness, internals’ hostility was 
negatively correlated with failures, such that 
the more hostile they were, the more ee: 


45 


Table 3

Within-Cell Correlations Between Mood Indexes
and Number of Failures to Solve on the
Anagram Task

                  High            Low
Group          helplessness    helplessness    Control

Depression vs. no. of failures to solve
  Internal     [illegible]     [illegible]     [illegible]
  External     [illegible]     [illegible]     [illegible]
Hostility vs. no. of failures to solve
  Internal         .14            -.52*           .42
  External        -.41            -.43           -.43
Anxiety vs. no. of failures to solve
  Internal         .39             .34           -.36
  External        -.50             .00           -.13

* p < .05
** p < .01
grams they solved (see Table 3). Under high
helplessness, internals’ depression, rather than
hostility, was most highly correlated with 
number of failures. For these subjects, the 
higher their reported depression, the fewer 
anagrams they solved. That hostility was re- 
lated to enhanced performance under low 
helplessness for internals was a finding par- 
ticularly supportive of Wortman and Brehm’s 
(1975) model. 

Externals, however, showed a different pat- 
tern of performance. As the amount of un- 
controllability increased for these subjects, 
their performance became increasingly worse. 
This finding is consistent with Wortman and 
Brehm’s notion that individuals who do not 
initially expect control will not experience 
reactance and will show helplessness effects 
after mild experience with uncontrollability. 
However, externals were unexpectedly simi- 
lar to internals in their pattern of moods. As 
would be expected under high helplessness, 
externals showed substantial depression (al- 
though less than internals), and the correla- 




4 Within-cell correlations between the mood indexes
and the number of failures to solve are displayed
in Table 3. The correlations between moods
and the other measures of performance followed a 
similar pattern but were generally weaker. 
tion between externals’ depression and number
of failures to solve was similar to that
of internals (see Table 3). However, under
low helplessness, externals were also similar
to internals in that they were more hostile
[...] task than controls. Also in contrast to the
relationship for internals, the correlation between
hostility and failures for externals was
[...] appear to alter the relationship between
hostility and performance.

That both internal and external subjects
were hostile after mild experience with
uncontrollability could mean that both groups
experienced reactance but behaved differently
in response to reactance. This in turn
implies that some aspect of the I-E variable
other than initial expectations of control may
have led to the different behavior patterns [...]

[...] investigation only total I-E scores from a mass
testing were available for analysis; it would
be interesting in future studies to focus on
specific factors of the I-E scale. Investigators
have found the I-E scale to contain from two
[...] studies). Since, as Phares has observed, the
ability to derive reliable separate factors from
the original scale is limited because of the
[...] reactions to uncontrollability. The effects of
expectations of control on reactions to
helplessness training should also be directly
studied by employing unambiguous manipulations
of expectations instead of individual difference
variables.
In the article, “Effects of Desegregation [...]” by [...]
Stephan and David Rosenfield (Journal of Personality and Social Psychology,
[...] Vol. 36, No. 8, pp. 795-804), the following corrections should be made in Table
2 on page 800: (a) The correlation between parental authoritarian child-rearing
practices and changes in racial attitudes should be -.04, not .04. (b) The
correlation between parental opposition to in[...]
should be .16, not -.16.
Source Credibility in Social Judgment: Bias, Expertise,
and the Judge’s Point of View


Michael H. Birnbaum and Steven E. Stegner 
University of Illinois at Urbana-Champaign 


Mathematical models of source credibility were tested in five experiments in
which judges estimated the value of hypothetical used cars based on blue book
value and/or estimates provided by sources who examined the cars. The sources
varied in mechanical expertise and in bias; they were described as friends of
the buyer or seller of the car or as neutral. Individuals judged the highest price
the buyer should pay, the lowest price the seller should accept, and the “true”
value (“fair” price) of the car. Data indicated that expertise amplifies the effect
of the source’s bias. This effect is predicted by a scale-adjustment model, in
which the source’s bias shifts the scale value of the source’s estimate. The weight
of an estimate depends chiefly on the [...]
estimate also depends configurally on the other estimates: Judges instructed to
take the buyer’s point of view give greater weight to the lower estimate, whereas
judges who identify with the seller place greater weight on the high estimate.
Simple premises about human judgment give a good account of the data.


Social judgments often require the combina- 
tion of pieces of information provided by 
sources who vary in credibility (Hovland, 
Janis, & Kelley, 1953; McGuire, 1968). 
Rosenbaum and Levin (1968, 1969), Anderson 
(1971), Birnbaum, Wong, and Wong (1976), 
and Birnbaum (1976) have proposed and/or 
tested formal theories of source credibility. 
The present research extends these develop- 
ments and argues that the concept of credi- 
bility, used loosely in early persuasion research 
to mean “believability,” can be profitably 
decomposed into at least three constructs: 
expertise, bias, and the judge’s point of view. 

The judge is the person (the subject in the
present experiments) who combines information
provided by one or more sources to make
an overall evaluation or judgment. The juror, 
who decides guilt based on contradictory 
evidence, the voter, who chooses among
candidates who disagree, and the consumer,
who evaluates the worth of a product, are
examples of people acting as judges.

The expertise of the source refers to the
correlation between the source’s
report and the outcomes of empirical verification.
[...] would an untrained student.

The bias of the source refers to factors that
are perceived to influence the expected algebraic
difference between the source’s report
and the true state of nature. For example, a
car salesman
might be an expert source of information about
the value of his cars; however, the salesman’s
estimates may be biased upward, since the
seller stands to profit by convincing potential
buyers of the high worth of the car. Similarly,
an insurance claims adjuster is an expert 
source who might underestimate the value of 
damaged goods on which his or her company is 
required to pay claims. 

The judge’s bias is termed the judge’s 
“point of view.” Thus, a judge who is a 
Republican or a Democrat may treat informa- 
tion provided by a Republican source and a 
Democratic source differently. It is important 
to maintain the distinction between the bias 
of the sources and the point of view (bias) of 
the judge who combines the information from 
the sources. 


Previous Research on Source Expertise 


Rosenbaum and Levin (1968, 1969) ex- 
tended an averaging model of impression 
formation to account for source credibility 
effects. Anderson (1971) discussed additive and
averaging models of source credibility and
theorized, on the basis of previous research
(on the set-size effect in impression formation),
that averaging models would prove superior.
Wyer (1974) revived an additive model for
source credibility as a “case against averaging.”
Lichtenstein, Earle, and Slovic (1975) offered
a constant-weight averaging model of numerical
cue prediction (each cue can be thought
of as a source). Additive and averaging models
both predict that the effect of a source’s
report should increase with the source’s
credibility, but the models make different
predictions for the effect of the credibility of one
source on the effect of information provided
by another source.

Birnbaum et al. (1976) and Birnbaum (1976) 
conducted three experiments testing three 
models for the effect of source expertise in
information integration. In one experiment
(Birnbaum et al., 1976, Experiment 1) undergraduates
judged the value of used cars based
on two cues: blue book value and an estimate
provided by one of three sources who examined
[...] mechanical expertise. For example,
how much is a car worth if its blue book value is $500 but
an expert mechanic who has examined the
car evaluates its worth at $700? In another
experiment, judges rated the likableness of
hypothetical persons described by personality-trait
adjectives attributed to sources who
varied in their length of acquaintance with 
the person they described. For example, how 
much would you like a person who was de- 
scribed by an acquaintance of 3 years as 
sincere and by an acquaintance of 3 weeks as 
phony? In a third experiment (Birnbaum, 
1976), students were trained with feedback 
to predict a numerical criterion from single 
independent cues separately, then asked to 
predict (without feedback) the criterion from 
pairs of cues. 

The results of the used car, impression 
formation, and numerical prediction studies 
form a coherent picture. The effect of a cue
varies inversely with the number of cues [...]
in violation of the constant-weight averaging
model (including the models of Rosenbaum &
Levin, 1968, 1969, and Lichtenstein et al.,
1975, as special cases). The results of Birnbaum
et al. (1976) and Birnbaum (1976) were qualitatively
consistent with a relative-weight
averaging model, in which the effect of information
provided by one source is inversely
related to the number and credibility of the
other sources.

Since the results of the three quite different 
experiments were consistent with the same 
general model, it is inductively appealing to 
theorize that the model will hold across 
different judgmental domains. Values of the 
estimates, adjective likableness values, and 
numerical values of the cues are represented 
by scale values. Levels of mechanical expertise, 
length of acquaintance, and cue-criterion 
validity are represented by changes in weight. 

The relative-weight averaging model for 
used car judgment can be written as follows: 


\[
R = \frac{w_0 s_0 + w_V s_V + w s_E}{w_0 + w_V + w} \tag{1}
\]

where $R$ is the judged worth; $w_0$, $w_V$, and $w$
are the weights of the initial impression, the
blue book, and the source, respectively; and
$s_0$, $s_V$, and $s_E$ are the scale values of the initial
impression (the presumed response in the
absence of information), the blue book value,
and the source’s estimate, respectively.
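
As a rough illustration of Equation 1, the sketch below computes the judged worth as a weighted average of the initial impression, the blue book value, and the source's estimate. The function name and all numerical weights and scale values are hypothetical, chosen only to show how a larger source weight pulls the response toward the source's estimate; they are not estimates from the article.

```python
def relative_weight_average(w0, s0, wV, sV, w, sE):
    """Relative-weight averaging (Equation 1).

    w0, s0: weight and scale value of the initial impression
    wV, sV: weight and scale value of the blue book value
    w,  sE: weight of the source and scale value of its estimate
    """
    return (w0 * s0 + wV * sV + w * sE) / (w0 + wV + w)


# Hypothetical values: blue book $500, source estimate $700,
# initial impression $500, all expressed directly in dollars.
low_weight_source = relative_weight_average(
    w0=1.0, s0=500.0, wV=1.0, sV=500.0, w=1.0, sE=700.0
)
high_weight_source = relative_weight_average(
    w0=1.0, s0=500.0, wV=1.0, sV=500.0, w=3.0, sE=700.0
)

print(round(low_weight_source, 2), round(high_weight_source, 2))
```

Under these assumed values, tripling the source's weight moves the predicted judgment closer to the source's $700 estimate, which is the sense in which expertise is represented by weight.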
Purposes of the Present Research


The present research investigates the effects 
of the source’s bias and the judge’s point of view 
on the information integration process. Judges 
made evaluations of hypothetical used cars 
based in part on estimates provided by sources 
who varied in bias and expertise. Bias was 
manipulated by stating that the source was 
either a friend of the buyer or the seller or an 
independent. The judge’s point of view was 
manipulated by asking the judge to identify 
with either the buyer or the seller of the car. 
For example, subjects were asked to judge the 
most they would advise the buyer to pay for a car, 
given that the blue book value is $500 and an 
expert mechanic, who is a friend of the seller, 
estimates its value at $700. The bias of a
source may affect either the weight or the 
scale value of the information provided by the 
source. Furthermore, the effects of the source’s 
bias may depend on the judge’s point of view. 
The next section shows that certain experi- 
mental designs make it possible to distinguish 
different theories of the effect of source charac- 
teristics on weight and scale value. 

The first four experiments compare three 
models of the effects of the source's bias. 
These relative-weight averaging models are 
compatible with previous research; however,
they make strikingly different predictions for 
the interaction of the source’s expertise and 
bias. The models can be distinguished by 
qualitative comparisons that do not require 
metric assumptions about the dependent 
variable or global tests of goodness of fit. 

The fifth experiment tests a theory of 
configural effects. Deviations from the relative- 
weight averaging model obtained in previous 
research (Birnbaum, 1974; Birnbaum et al., 
1976) and in the first four experiments of this 
article are presumed to depend on the stimulus 
configuration. Implications of a configural- 
weight theory (Birnbaum, 1974) are explored 
in the fifth experiment.


Weight and Scale Value 


Figure 1. Hypothetical results assuming that sources
affect weight only (Panels A1 & B1), scale value of the
source’s estimate only (A2 & B2), or both weight and
scale value (A3 & B3). (Panels A1, A2, and A3 show
mean [...]) [Abscissas: Source’s Estimate; Blue Book Value.]

Figure 1 shows how effects of the source on
scale value and weight can be separated and
analyzed in a relative-weight model. The three
examples are hypothetical predictions assuming
that the source affects weight only (A1
& B1), scale value only (A2 & B2), or both
(A3 & B3).¹ Panels A1, A2, and A3 plot
judged value as a function of the source’s
estimate with a separate curve for each source. 
The effect of the source’s estimate, $\Delta R_E$, that
is, the change in response due to the source’s
estimate (averaged over blue book value),
from Equation 1, will be given by the expression:

\[
\Delta R_E = \frac{w}{w_0 + w_V + w}\,\Delta s_E \tag{1a}
\]

where $\Delta s_E$ is the range of scale values of the
source’s estimates. Note that the slopes in the
A panels of Figure 1 will depend on both
weight ($w$) and the range of scale values, $\Delta s_E$.
The slopes are proportional to $\Delta R_E$ because
the abscissa values are constant for all sources;
$\Delta s_E$ and $w$ may or may not vary for different
sources.

Panels B1, B2, and B3 of Figure 1 show the 
response plotted against the blue book value, 
averaged over the source’s estimate, with a 
separate curve for each source. For the B 
panels, the effect of blue book value, $\Delta R_V$, is
given by the following equation:

\[
\Delta R_V = \frac{w_V}{w_0 + w_V + w}\,\Delta s_V \tag{1b}
\]

where $\Delta s_V$ is the range of scale values of the
blue book values. Note that if $w_0$, $w_V$, and
$\Delta s_V$ are presumed to be independent of the
source, the effect of the blue book value (slope)
should vary inversely with the weight of the
source, $w$.
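
The two effect expressions can be checked numerically. The following sketch, under assumed values, evaluates the effect of the source's estimate and the effect of blue book value for three hypothetical sources that differ only in weight. The weights, and the simplification that scale values equal the dollar amounts (so that the ranges are $400 and $300), are assumptions made for illustration.

```python
def effect_of_source_estimate(w0, wV, w, delta_sE):
    """Change in response due to the source's estimate (Equation 1a)."""
    return w / (w0 + wV + w) * delta_sE


def effect_of_blue_book(w0, wV, w, delta_sV):
    """Change in response due to blue book value (Equation 1b)."""
    return wV / (w0 + wV + w) * delta_sV


# Assumed constants: weights of the initial impression and blue book,
# and ranges of scale values taken to equal the dollar ranges.
W0, WV = 1.0, 1.0
DELTA_SE = 700.0 - 300.0   # range of source estimates
DELTA_SV = 650.0 - 350.0   # range of blue book values

# Hypothetical source weights standing in for low, medium, high expertise.
source_weights = {"low": 0.5, "medium": 1.0, "high": 2.0}

effects = {
    name: (
        effect_of_source_estimate(W0, WV, w, DELTA_SE),
        effect_of_blue_book(W0, WV, w, DELTA_SV),
    )
    for name, w in source_weights.items()
}

for name, (source_effect, blue_book_effect) in effects.items():
    print(f"{name:>6}: source effect {source_effect:6.1f}, "
          f"blue book effect {blue_book_effect:6.1f}")
```

With the range of scale values held fixed, a heavier source has a larger effect of its own estimate and a smaller effect of blue book value. This inverse relation is what lets the B panels constrain the source's weight.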

Change of weight only. The pattern of 
results in Figure 1, Panels A1 and B1, is
consistent with the results of Birnbaum et al.
(1976), who varied expertise (but not bias) 
of the source. For that experiment, Sources 1, 
2, and 3 would represent low, medium, and 
high expertise, respectively. It was concluded 
that expertise affects weight, not the range of 
scale values, since the effects of blue book 
value (slopes in Panel B1) are lower for sources 
of higher expertise, as predicted by Equation 
1b if expertise affects weight. 
Change of scale value only. It seems unlikely
that manipulation of the bias of a
source, holding expertise constant, would affect
only weight and produce the pattern of Panels
A1 and B1. If the source’s bias affects the scale
values of the source’s estimates, one would
expect main effects of bias and possibly interactions
between the bias of a source and the
source’s estimate. Figure 1, Panel A2, illustrates
such a possibility. In this case, Sources
1, 2, and 3 might represent friend of the seller,
independent, and friend of the buyer, respectively.
At first, the increased slope for Source
2 in Panel A2 might be thought to indicate that
Source 2 has greater weight. However, Panel
B2 shows the weights are equal, since the
curves are parallel. Parallelism implies that the
effect of blue book value is independent of
the source, indicating (by Equation 1b) that
the sources have equal weights.

¹ [...] was 2; the scale values for Source 1 were $450 and $850,
for Source 2 they were $300 and $700, and for Source 3
they were $150 and $550.

Change of weight and scale value. Panels A3
and B3 of Figure 1 indicate a pattern in 
which the source affects both weight and scale 
value. The curve with the steeper slope in 
Panel A3 also has the flatter slope in Panel B3. 

These examples illustrate that in order to 
separate the influences of the source on weight 
and scale value, one must examine not only 
the effect of a source’s estimate (A panels) 
but also the effect of another cue such as the 
blue book value (B panels). The effect of the
source’s estimate depends on both the source’s
weight and the range of scale values (Equation
1a and A panels). The effect of blue book
value, however, provides an unambiguous
constraint on the source’s weight (Equation
1b and B panels). By comparing Panel A
with Panel B one can tease apart the effects of
source variables such as bias on weight and
scale value.


Experiments 1-4: Three Models of 
Source Bias 


Figure 2 shows three models of source bias 
(on the left) and an important prediction of 
each (on the right). All of the models are 
relative-weight averaging formulations and 
can be represented by lever and fulcrum 
models. The lever is an analog computer that 
can be used to make predictions for the three 
models. In each case, the scale value of the 
source’s estimate is represented by the location 
along the lever where the weight is placed,
physical weight corresponds to weight, and the response
is represented by the center of gravity (location
of the fulcrum at equilibrium).

Figure 2. Three models of source bias. (Scale value, s,
corresponds to a point along the lever; [...]
physical weight [...] bias, the [...]
[R] would be the center of gravity, fulcrum [...]
Model 1 predicts no interaction between bias and
expertise. Model 2 assumes that bias affects the scale
value of the source’s estimate and predicts that the
effect of bias increases with the increasing expertise
[...] averaged to form an overall assessment. Since the
relative weight of bias decreases as expertise increases,
Model 3 predicts that the effect of bias diminishes with
[...])

In each model, the scale value and weight
of the other information (e.g., blue book
value) are $s$ and $w$, the scale value of the
source’s estimate is represented by $s_E$, and
the weight of the source’s expertise is $w_X$.
To represent the initial impression, the plank
alone is presumed to have a weight of $w_0$ with
a center of gravity at $s_0$. The change in response
due to bias is shown in each figure by a
comparison of solid and dashed symbols
([...] or arrows). The direction of bias
illustrated in the figure is negative, consistent
with the source being a friend of the seller.


Model 1: Response Revision


According to the first model, bias produces a
shift in the response, rather than affecting
weight or scale value. As shown in Figure 2,
this can be represented by the lever if the
response is assumed to be the weighted
average of a bias-free response, $R^*$, and a
response effect of bias, $b_B$. A special case of this
model can be written as follows:

\[
R = \left(\frac{w_{R^*}}{w_{R^*} + w_B}\right) R^* + \left(\frac{w_B}{w_{R^*} + w_B}\right) b_B \tag{2}
\]

where $w_{R^*}$ and $w_B$ are the weights of the bias-free
response and the bias, respectively,
which are assumed to be constant.
$R^*$ (fulcrum, or balance point not considering
bias) is given by Equation 1:

\[
R^* = \frac{w_0 s_0 + w s + w_X s_E}{w_0 + w + w_X}
\]

This model predicts no interaction between
bias and any of the other factors. In particular,
the effect of bias is predicted to be independent
of expertise, as shown in the upper right
section of Figure 2.

A more complicated version of Model 1
would allow the source’s weight ($w_X$) to
depend on both expertise and bias. The
changes in weight due to bias and expertise
can be estimated from the effects of blue book
value (as in Figure 1). This more general
model still predicts that once the weights have
been estimated in this fashion, the residual
effect of bias will be independent of expertise.
The response revision model represents the
judgment as a two-stage process in which the
average value is first computed without
paying attention to bias, then bias is averaged
together with this implicit response.


Model 2: Scale Adjustment


The second model assumes that the bias
of the source causes a shift in the value of the
[...] source is adjusted “prior” to the integration
process. Figure 2 depicts this process by
[...] source’s weight is not placed [...]
value of the estimate, $s_E$. [...]
case of this model, which assumes that
$s_{EB} = s_E + b_B$, can be written:

\[
R = \frac{w_0 s_0 + w s + w_X (s_E + b_B)}{w_0 + w + w_X} \tag{3}
\]

This scale-adjustment model predicts that
the effect of bias will be greater for sources of
high expertise than for sources of low expertise,
since expertise ($w_X$) multiplies the bias
correction ($b_B$).

A more complicated version of this model 
allows the weight of the source to depend on 
both expertise and bias and allows the corrected 
scale value to depend on both estimate and 
bias. This general version of Model 2 can be 
written as follows: 


\[
R = \frac{w_0 s_0 + w s + w_{XB} s_{EB}}{w_0 + w + w_{XB}} \tag{4}
\]

where the subscripts of $w_{XB}$ and $s_{EB}$ suggest
that the weight of a source depends on both
expertise (X) and bias (B), and scale value
depends on both estimate (E) and bias (B).
The more general model does not require 
estimate and bias to combine additively to 
produce scale value. 


Model 3: Weighted Bias 


The third general theory assumes that the
source’s estimate and the source’s bias are
both pieces of information that must be
integrated to form an overall evaluation. The
scale value of the source’s estimate, $s_E$,
receives a weight that depends on the source’s
expertise, $w_X$. However, the scale value of the
source’s bias, $b_B$, has a weight that depends
only on bias, $w_B$. Thus, as the expertise increases,
the source’s estimate receives greater
relative weight, and the source’s bias receives
reduced relative weight. Model 3 can be
written:

\[
R = \frac{w_0 s_0 + w s + w_X s_E + w_B b_B}{w_0 + w + w_X + w_B} \tag{5}
\]

Model 3 predicts that the effect of bias will
be inversely related to the expertise of the
source: As $w_X$ increases, the relative weight
of bias, $w_B / (w_0 + w + w_X + w_B)$, decreases.
Model 3 can also be extended to allow the
weight of the source’s estimate to depend on
bias, $w_{XB}$, or to allow the bias correction to
depend on estimate and bias. Model 3 implies
that as a source grows in expertise the judge
places more weight on what the source says
and therefore less relative weight on the
correction for bias.


In sum, the three models predict that 
increasing expertise either increases the effect 
of bias (Model 2), decreases the effect of bias 
(Model 3), or does not interact with bias 
(Model 1). These differential predictions, 
shown on the right in Figure 2, are tested in 
the first four experiments. 
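
The contrast among the three predictions can be made concrete with a small numerical sketch. The functions below follow the verbal descriptions of the models given above (a weighted average of a bias-free response and a bias effect; a bias shift added to the scale value of the estimate; bias as a separately weighted piece of information). All weights, scale values, and the size of the bias shift are hypothetical, and the function names are introduced here only for illustration.

```python
def bias_free_response(w0, s0, w, s, wX, sE):
    """Weighted average of initial impression, other cue, and source estimate."""
    return (w0 * s0 + w * s + wX * sE) / (w0 + w + wX)


def model1_response_revision(w0, s0, w, s, wX, sE, bB, wR=1.0, wB=0.25):
    """Bias is averaged with the bias-free response after integration."""
    r_star = bias_free_response(w0, s0, w, s, wX, sE)
    return (wR * r_star + wB * bB) / (wR + wB)


def model2_scale_adjustment(w0, s0, w, s, wX, sE, bB):
    """Bias shifts the scale value of the estimate before integration."""
    return (w0 * s0 + w * s + wX * (sE + bB)) / (w0 + w + wX)


def model3_weighted_bias(w0, s0, w, s, wX, sE, bB, wB=0.25):
    """Bias is a separate piece of information with its own weight."""
    return (w0 * s0 + w * s + wX * sE + wB * bB) / (w0 + w + wX + wB)


def bias_effect(model, wX, shift=100.0):
    """Difference in response between a +shift and a -shift bias value.

    Only this difference matters here; the absolute bias values are arbitrary.
    """
    common = dict(w0=1.0, s0=500.0, w=1.0, s=500.0, wX=wX, sE=500.0)
    return model(bB=+shift, **common) - model(bB=-shift, **common)


# Hypothetical expertise weights for low, medium, and high expertise.
EXPERTISE_WEIGHTS = (0.5, 1.0, 2.0)
MODELS = (model1_response_revision, model2_scale_adjustment, model3_weighted_bias)

effects = {
    model.__name__: [bias_effect(model, wX) for wX in EXPERTISE_WEIGHTS]
    for model in MODELS
}

for name, values in effects.items():
    print(name, [round(v, 2) for v in values])
```

Under these assumptions the effect of bias is constant across expertise for the response revision sketch, grows with expertise for the scale-adjustment sketch, and shrinks with expertise for the weighted-bias sketch, matching the three qualitative predictions summarized above.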


Method 


Instructions 


The task was to judge the values of hypothetical 
used cars based on blue book value and/or estimates 
of value provided by sources who varied in bias and 
expertise. The sources of the estimates were described 
as people who attempted to judge the “true” value of 
the cars, based on a 30-minute inspection and test 
drive. Their relationship with the buyer or seller and 
their expertise in judging the value of automobiles were 
specified.
Separate paragraphs discussed
the training and mechanical skill of the sources, who
were described as low, medium, or high in expertise.
The low-expertise source was described as a competent
person who drives a car regularly and has purchased
[...] hobby is the repair and modification of sports cars.
Each source was described as a friend
of the buyer, a friend of the seller, or an independent.
It was explained that the buyer’s friend would be
expected to be sensitive to his friend’s desire to get
[...] the seller’s desire to get as much money as possible for
the car. He might be optimistic about the car and
[...]
The blue book value was described
as a standard “fair” price that is determined by such
factors as year, [...]
remarked that blue book value is widely relied upon by
businesses that deal in large numbers of cars, but that
it would not describe individual cars.


Procedure and Designs 


Each test booklet contained three pages of instruc- 
tions, 20 warm-up trials, and 293 randomly ordered 
test trials. The test trials were constructed from the 
following designs: 

Source estimate. A 3 X 3 X 5 (Bias X Expertise
X Estimate) factorial design generated 45 trials on 
which the only information presented was the estimate 
of a source whose expertise and bias were specified. 
The three levels of the source’s bias were friend of the 
buyer, friend of the seller, and independent. The three
levels of source expertise were high, medium, and low.
The five levels of estimate were $300, $400, $500, $600,
and $700.

Figure 3. Mean judgment of value as a function of the
source’s expertise with a separate curve for each level
of the source’s bias. (Points are empirical means, and
lines are predictions based on the scale-adjustment
model, Equation 4. Solid points and lines are for friends
of the buyer [B]; open circles and dashed lines are for
friends of the seller [S]. Upper and lower rows of
panels are for the first and second sources, respectively.
Panels A, B, and C represent different points of view
for the judge, Experiments 1-4.) [Panels by judge’s point
of view: A. Buyer’s, B. Fair, C. Seller’s. Ordinate: Mean Judgment.]

Source estimate and blue book value. The entire 
3 X 3 X 5 source-estimate design was factorially
combined with the four levels of blue book value: $350, 
$450, $550, and $650. The resulting (3 X 3 X 5) X 4, 
(Bias X Expertise X Estimate) X Blue Book Value, 
design yielded 180 trials on which the information 
consisted of a source’s estimate and a blue book value. 
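
The trial counts for these two designs follow directly from crossing the factor levels. The short sketch below enumerates them; the variable names are illustrative, while the factor levels are those listed in the text.

```python
from itertools import product

# Factor levels as described in the design.
BIAS = ("friend of the buyer", "friend of the seller", "independent")
EXPERTISE = ("high", "medium", "low")
ESTIMATES = (300, 400, 500, 600, 700)      # source's estimate, in dollars
BLUE_BOOK = (350, 450, 550, 650)           # blue book value, in dollars

# Source estimate alone: Bias X Expertise X Estimate.
source_only_trials = list(product(BIAS, EXPERTISE, ESTIMATES))

# Source estimate combined with blue book value.
source_and_blue_book_trials = list(product(BIAS, EXPERTISE, ESTIMATES, BLUE_BOOK))

print(len(source_only_trials))            # 45
print(len(source_and_blue_book_trials))   # 180
```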

Two source estimates. Two separate 2 X 2 X 2 source- 
estimate designs were factorially combined to generate 
64 trials ([2 X 2 X 2] X [2 X 2 X 2]) with estimates
from two sources. For both sources, the two levels of
bias were friend of the buyer and friend of the seller;
the two levels of expertise were high and low. The
levels of estimate for the first source were $400 and
$600; for the second source, they were $300 and $700.
[...] and $650.


Judge’s Point of View and Research Participants 


All four experiments used the same test trials and 
warm-ups, but three different points of view were 
given to different groups of judges, who were instructed 
to identify with either the buyer, the seller, or an 
independent. The judges were 121 undergraduates at 
the University of Illinois who received extra credit in 
an introductory psychology course. A small number 
of additional students failed to follow instructions or 
complete the task and were excluded. 
Experiments 1 and 2: Fair price. [...]
instructed to imagine that they had [...]
“independent, unbiased [...]
value of many cars.” They were told to [...]
nor underestimate “true worth.” Experiment 1, with
51 participants, was conducted a year before the [...].
Experiment 2, with 19 judges, was a replication of
Experiment 1 conducted contemporaneously with
Experiments 3 and 4. The data for Experiments 1 and 2
were analyzed separately and were virtually [...];
consequently, the data were pooled.

Experiment 3: Buyer’s price. The buyer’s [...]
to estimate the highest price for which he or she would
recommend buying a car. The 26 judges in Experiment
3 were instructed to imagine that they were acting as
the agent of someone who would be buying used cars
and to ask themselves, “What is the maximum amount
I would advise paying for each car?”

Experiment 4: Seller’s price. The seller’s task was
to estimate the lowest price for which he or she would
recommend selling a car. These 25 judges were instructed
to imagine that they were acting as the agent
of someone who would be selling used cars and to ask
themselves, “What is the minimum amount I would
advise accepting for each car?”


Results


Figure 3 shows the interaction between
expertise and bias for the two-source design,
plotted for comparison with the predictions
on the right in Figure 2. Mean judgments of
[...] expertise, with solid points for the friend of
the buyer (B) and open circles for the friend
of the seller (S). As expected, judged values
are greater when the source is a friend of the
buyer than when he is a friend of the seller.
The upper panels are for the first source,
averaged over levels of both estimates [...]
over expertise and bias of the second source.
Lower panels are for the expertise and bias
of the second source.

Panels A, B, and C show the results for the
buyer’s point of view (Experiment 3), the
independent, “fair” point of view (Experiments
1 & 2), and the seller’s point of view (Experiment
4), respectively. Comparing the three
panels shows that the mean judgments are
greater for estimates of the “lowest selling
price” (Panel C) than they are for the “highest
buying price” (Panel A).

[...] as expertise increases, the effect of
the source’s bias increases. Divergence is
characteristic of plots made for individual
judges. Analysis of variance tests of the interaction
between bias and expertise of the
second source yielded F(1, 25) = 12.7, F(1, 69)
= 36.6, and F(1, 24) = 10.2 for buyer’s, fair,
and seller’s points of view, respectively. This
divergence is predicted by the scale-adjustment
model (Model 2 of Figure 2), which
assumes that the source’s bias causes a shift
in the scale value of the information he
provides. The solid and dashed lines plotted
on Figures 3-5 represent predictions for Model
2 of Figure 2 (Equation 4). Model analyses
will be discussed in a later section.

Figure 4 shows the results for all 64 two-source
combinations, with a separate set of
four panels for each point of view of the judge.
Each panel within Figure 4 contains letters
representing the expertise of the first and
second sources, respectively. Thus the upper
left panel of each figure represents the results
for two low-expertise sources (LL); the upper
right panel shows the results for a high-expertise
first source and a low-expertise
second source (HL). The abscissa is spaced
according to the scale value of the first source’s
estimate, derived from Equation 4. The letters
S and B on the abscissa show the scale values
for the first source’s estimates (either $400 or
$600), provided by a friend of the seller (S)
or buyer (B), respectively. The four separate
curves within each panel are for levels of bias
[...]; solid points
and lines are used for second sources who are
friends of the buyer; open circles and dashed
lines are used for friends of the seller.

Figure 4. Mean judgment of value based [...]
value of the first source’s estimate. (The four abscissa
values represent scale values [...] $400 or $600, by a
friend of either the buyer [B] or seller [S]. Upper panels
represent data for second [...] for second sources of high
expertise [H]. Open circles and [...] seller and friends of
the buyer, respectively. [...] predictions based on
scale-adjustment theory, with dashed lines for friends [...])
[Panels by judge’s point of view: A. Buyer’s Price,
B. Fair Price, C. Seller’s Price. Ordinate: Mean Judgment.]

There are four important results in Figure 
4 that are common to all points of view, 
characteristic of the majority of individual 
subjects, statistically reliable, and (impor- 
tantly) relevant to evaluation of the models. 
First, the effect of a source’s estimate is greater 
for sources of higher expertise. Within each 
set of panels, proceeding from the upper left 
to the upper right corresponds to an increase 
in the expertise of the first source. Since the 
first source’s estimate is on the abscissa, the 
increase in slope represents an increase in the 
effect of this estimate. The vertical separation 
between the two curves of the same type 
(either solid or dashed) represents the effect 
of the second source’s estimate, which was
either $300 or $700. Dropping from an upper
panel to a lower one corresponds to an
increase in the expertise of the second source.
Accordingly, the spread between the curves is
greater in the lower panels. The tests of the
Expertise X Estimate interactions for the
first source yielded F(1, 25) = 60.7, F(1, 69)
= 239.2, and F(1, 24) = 45.0 for the buyer’s,
fair, and seller's price judgments, respectively.

[Figure 5. Mean judgment as a function of the source’s estimate and blue book value, with panels for low, medium, and high source expertise. Judgments were of the highest price the buyer should pay (Experiment 3), the fair price, and the lowest acceptable selling price. Curves have been shifted vertically on the ordinate, $250 up for the friend of the buyer and $250 down for the friend of the seller.]

Second, the effect of either source’s estimate 
is inversely related to the expertise of the other 
source. Thus, as the slopes increase from left
to right, the spreads decrease. As the spreads
increase from the upper to the lower panels,
the slopes decrease. For the Expertise of the
First Source X Estimate of the Second Source
interaction, the Fs were F(1, 25) = 56.0,
F(1, 69) = 329.3, and F(1, 24) = 59.6 for
buyer’s, fair, and seller's price conditions, 
respectively. 

Third, the effect of bias varies directly with 
the expertise of the same source. Note that 
the spread between open and filled circles 
(effect of the bias of the second source) is 
greater in the lower than the upper panels 
(when the expertise of the second source is 
greater). Fourth, the effect of bias varies
inversely with the expertise of the other source.
The spread between open and filled circles
is smaller in the panels on the right (HL and
HH, where the first source is high in expertise)
than in the panels on the left. For the Expertise
(of the first source) X Bias (of the second
source) interaction, the F values were F(1, 25)
= 68, F(1, 69) = 26.2, and F(1, 24) = 21.3,
for buyer's, fair, and seller’s price, respectively.
Thus, bias interacts with expertise in the same 
fashion as estimate, consistent with the scale- 
adjustment model. 

Figure 5 shows the results for the source- 
estimate and blue book value designs for all 
three points of view. The abscissa of each 
figure is spaced according to least squares 
estimates of the scale values of the source’s 
estimate, based on Equation 4. The three 
leftmost notches on the abscissa, facing into 
each panel, show the positions of the scale
values for an estimate of $300 provided by a
seller’s friend, an independent, or a buyer’s 
friend, respectively. 

The solid points represent mean judgments 
based on a source’s estimate and the blue 
book value (indicated next to curves). Open 
circles and dashed lines denote judgments 
based only on a source’s estimate (no blue 
book value). Panels 1, 2, and 3 show the effects 
of increasing the source’s expertise. It can be 
seen that proceeding from left to right, 
slopes increase but the vertical spreads be- 
tween the curves decrease. 
The curves for the biased sources have been shifted
vertically ($250 up for the friend of the buyer
and $250 down for the friend of the seller).
The ordinate on the far right is labeled for 
these two; the left ordinate is labeled for the 
independent-source data. 

A portion of the present experiment (in- 
dependent sources, fair price condition) repli- 
cates Experiment 1 of Birnbaum et al. (1976). 
These 60 data points give virtually identical 
results to those of the earlier experiment. 

The effects in Figure 5 for the source- 
estimate and blue book value design are 
highly reliable relative to the error terms. 
More importantly, similar effects occur in 
each experiment, providing evidence of repli- 
cation. Statistical analyses confirm these 
graphic interpretations. For example, for 
the fair point of view, the F(2, 138) for the 
main effect of bias was 102.3. The divergent 
interaction between bias and expertise, pre- 
dicted by the scale-adjustment model, has an 
F(4, 276) = 15.9. The interaction between 
expertise and estimate yields F(8, 552)
= 122.5. The theory that the weight of a 
source depends on expertise predicts the 
Expertise X Blue Book Value interaction, 
in which the effect of the blue book value is 
inversely related to expertise, F(6, 414) 
= 114.7. The effect of the blue book value 
also depends on the bias of the source, F(6, 414)
= 44, and on the Expertise X Bias inter-
action, F(12, 828) = 7.3, consistent with the
interpretation (Equation 4) that the source’s
weight depends on both bias and expertise.


The data for the blue book value and source- 
estimate design are consistent with the 
following: Scale value depends not only on the
source’s estimate but also on the source's 
bias. This theory (scale adjustment) explains 
the divergent Expertise X Bias interaction. 
Bias also affects the weight of a source, as 
evidenced by the differences in slope in Figure 
5 for different bias-expertise combinations, in 
conjunction with the corresponding changes in 
the spread of the curves. Appendix A considers 
a complex alternative to all three models in 
Figure 2, which attempts to explain the effect 
of bias without scale adjustment. The complex 
model makes several incorrect predictions. 


Confidence Intervals 


The point size in Figures 4 and 5 is such 
that the solid and open circles contain a 
confidence interval of at least ±1 probable
error in most cases. For Experiments 1 and 2, 
42% of the means had standard errors less 
than $6.00, 64% were less than $8.00, 80% 
were less than $10.00, and 93% were less than 
$12.00. For Experiment 3, 35% of the means 
had standard errors less than $8.00, 51% were 
less than $10.00, and 63% were less than 
$12.00. For Experiment 4, 50% of the standard
errors were less than $8.00, 58% were less 
than $10.00, and 60% were less than $12.00. 
These data seem quite neat, considering that 
adding another point size vertically on either 
side would include a 95% confidence interval 
in most cases. 
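
The relation between probable errors and a 95% interval quoted above can be checked with a short calculation. The sketch below assumes that the sample means are approximately normally distributed and reads "point size" as a span of ±1 probable error around a mean; both are assumptions made for illustration. One probable error is the half-width that holds the central 50% of a normal distribution, about 0.6745 standard errors.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, in units of one standard error

# One probable error: half-width holding the central 50% of a normal curve.
PROBABLE_ERROR = STD_NORMAL.inv_cdf(0.75)  # about 0.6745 standard errors


def coverage(half_width):
    """Probability that a normal deviate lies within +/- half_width standard errors."""
    return STD_NORMAL.cdf(half_width) - STD_NORMAL.cdf(-half_width)


# A point spanning +/- 1 probable error covers half of the distribution.
one_point = coverage(PROBABLE_ERROR)

# One more point size above and below widens the span to +/- 3 probable errors.
three_points = coverage(3 * PROBABLE_ERROR)

print(f"probable error = {PROBABLE_ERROR:.4f} SE")
print(f"coverage of +/- 1 PE = {one_point:.3f}")
print(f"coverage of +/- 3 PE = {three_points:.3f}")
```

Under these assumptions, ±3 probable errors is roughly ±2.02 standard errors, slightly wider than the ±1.96 standard errors of a 95% interval.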


Analyses for Individual Judges 


Examination of data indicated that the 
group means are representative of the data 
for the vast majority of single judges. For 
example, four Expertise X Bias interactions 
were drawn separately for each of the 70 judges 
of Experiments 1 and 2. Model 2 predicts 
that the effect of bias should vary directly 
with the source’s expertise and inversely with 
the expertise of the other source. The two-
source design allows two examinations of
each of these two predictions. Of the 70
subjects, 66% had either three or four inter-
actions of the predicted form. Only [...] subjects
had a greater number of inconsistent patterns
than consistent patterns; 2 of these appear to
have reversed the directions of bias for buyer
and seller. Only 2 subjects had the pattern
predicted by Model 5. Given that each point
was the average of only 16 judgments and the
fact that the Expertise X Bias interactions
are predicted to be small, the small number of
“deviant” judges doesn't provide evidence for
the existence of large subgroups obeying
different models. Of the 280 figures, 206 were
of the predicted form, the same proportion 
for each prediction, suggesting that the group 
means are highly representative of individual 
data. 


Model Analyses 


Choice among models. Model 2 is vastly
superior to Models 1 and 3, since it correctly
anticipates the divergent Expertise X Bias
interaction, in which the effect of bias is
magnified by expertise. Accordingly, the 289
data points for each experiment were sepa-
rately fit to Equation 4 by means of a com-
puter program, which used Chandler's (1969)
STEPIT subroutine to minimize the sum of
squared model-data discrepancies.*

The data of Experiments 1 and 2 were
analyzed separately with nearly identical
results, showing excellent cross-validation of
both the model and parameter values upon
replication. The data of Experiments 1 and 2
are pooled in the analyses reported here.

Weights and scale values. The scale value
for the blue book value of $550 was set to its
monetary value, and the weight of the blue
book was arbitrarily set to 1.0. There were
29 parameters to estimate: 9 weights for the
sources (Expertise X Bias), 15 scale values
(Estimate X Bias), 3 scale values for the
blue book value, and a weight and a scale
value for the initial impression.

The weights of the initial impression were
small: .08, .08, and .10 for the buyer's price,
fair price, and seller's price, respectively. Scale
values for the initial impression were related
to the point of view: 243, 368, and 390 for
buyer’s, fair, and seller’s price, respectively.
Estimated scale values for the blue book were
339, 446, 550, and 648 for the fair point of
view (values were very similar for the other
points of view). The least squares estimates
of weights and scale values are shown in
Table 1.

* We thank Ron Hinkle for checking our parameter
estimates from STEPIT against those obtained [...]
subroutine, BLACKBOK, which uses an impo[...]
minimization algorithm.

Table 1
Estimated Weights and Scale Values for the Scale-Adjustment Model

                                  Judge's point of view
                  Buyer's price          Fair price           Seller's price
                  Source's bias         Source's bias         Source's bias
Variable        Buyer  Indep. Seller  Buyer  Indep. Seller  Buyer  Indep. Seller

Estimated weights of sources (a)

Source's expertise
  Low            .68   [...]  [...]    .76    .62    .84     .90    .16    .92
  Medium        1.35   1.32   1.32    1.43   1.62   1.62    1.42   1.49   1.46
  High          2.22   3.21   2.25    2.63   4.33   2.77    2.76   3.45   2.44

Estimated scale values (b)

Source's estimate
  $300           307    291    271     330    301    267     330    297    276
  $400           407    397    370     430    401    364     428    404    380
  $500           515    491    464     538    502    459     534    504    474
  $600           608    594    562     633    601    561     629    607    576
  $700           692    686    647     729    691    651     723    702    671

(a) Each entry in the upper portion of the table is the estimated weight for each source as a function of the
source's expertise and bias. A separate analysis was performed for each point of view. The weight of the
blue book value was set to 1.0. For example, the weight of the high-expertise friend of the buyer was 2.22
for Experiment 3 (buyer's price).

(b) Each entry in the lower portion of the table is the estimated scale value as a function of the source's
estimate and the source's bias. For example, the largest scale value for Experiment 3 (buyer's price), $692,
was for an estimate of $700 by a friend of the buyer. For Experiment 4 (seller's price), the same estimate
from the same source had an estimated scale value of $723.

Table 1 shows that the weights depend 
mostly on expertise, but tend to be larger for 
the independent source of high expertise and 
possibly smaller for the low-expertise, in- 
dependent source. This pattern of weights 
appeared in all experiments. The weights for 
the 9 sources estimated from the least squares 
analysis are consistent with weights estimated 
graphically, from the effect of blue book value, 
using the method of Figure 1. 

The scale values depend on three factors: 
estimate, source’s bias, and judge’s point of 
view. For example, a $500 estimate provided 
by an independent has a scale value of only
491 from the buyer’s point of view, 502 from
the fair price point of view, and 504 from the 
seller’s point of view. If the $500 estimate in 
the fair price condition is provided by a friend 
of the buyer, the scale value jumps to 538, 
compared with only 459 if the estimate is 
provided by a friend of the seller. 
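
The way these weights and scale values combine can be sketched numerically. The example below assumes that Equation 4 (not reproduced in this section) has the relative-weight averaging form, in which each scale value is weighted by w/Σw and the initial impression and the blue book value enter as additional weighted terms. The function name `averaging_response` is hypothetical; the numbers are the fair price estimates quoted in the text and in Table 1.

```python
def averaging_response(terms):
    """Relative-weight average of (weight, scale_value) pairs: sum(w*s) / sum(w)."""
    total_weight = sum(weight for weight, _ in terms)
    return sum(weight * value for weight, value in terms) / total_weight


# Fair price point of view (Experiments 1 and 2), values quoted in the text.
INITIAL_IMPRESSION = (0.08, 368.0)  # weight and scale value of the initial impression
BLUE_BOOK = (1.0, 550.0)            # blue book weight fixed at 1.0; $550 scaled as 550
SCALE_500_INDEPENDENT = 502.0       # scale value of a $500 estimate from an independent

# Same estimate, delivered by a low- or a high-expertise independent source.
low = averaging_response([INITIAL_IMPRESSION, BLUE_BOOK, (0.62, SCALE_500_INDEPENDENT)])
high = averaging_response([INITIAL_IMPRESSION, BLUE_BOOK, (4.33, SCALE_500_INDEPENDENT)])

# Ratio of high- to low-expertise weights for the independent source.
weight_ratio = 4.33 / 0.62

print(f"low-expertise source:  {low:.2f}")
print(f"high-expertise source: {high:.2f}")
print(f"weight ratio: {weight_ratio:.2f}")
```

Under this assumed form, raising the source's weight pulls the predicted judgment away from the blue book value and toward the scale value of the source's estimate, which is the pattern of slopes and spreads described for Figure 5.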

Fit of the scale-adjustment model. Predic- 
tions of the model are shown in Figures 3, 4, 
and 5 and come very close to the data. The 
square roots of the mean squared model-data 
discrepancies were 12.15, 10.16, and 11.16 for 
buyer’s, fair, and seller’s price, respectively. 
Hence, the average discrepancy between the 
model and sample means is not much larger 
than the expected discrepancy (standard error) 
between sample and population means. 
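
The fit index reported here is a root-mean-square discrepancy between predicted and observed means. A minimal sketch of that computation follows; the predicted and observed values are made up for illustration and are not data from the experiments, and the function name `rms_discrepancy` is hypothetical.

```python
import math


def rms_discrepancy(predicted, observed):
    """Square root of the mean squared difference between predictions and data."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical cell means (dollars), for illustration only.
predicted_means = [412.0, 455.0, 503.0, 548.0]
observed_means = [400.0, 465.0, 515.0, 540.0]

fit = rms_discrepancy(predicted_means, observed_means)
print(f"root-mean-square discrepancy = {fit:.2f}")
```

An index of this kind is in the same units as the judgments, which is what allows it to be compared directly with the standard errors of the cell means.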

Figures 3, 4, and 5 show that Model 2 makes 
the following correct predictions: First, it 
correctly predicts the divergent Expertise
X Bias interactions. These interactions are
best seen in Figure 3, where they have been
plotted for comparison with Figure 2. The
Expertise X Bias interactions can also be seen
in Figure 4. Figure 5 does not permit one to 
see this prediction easily, but when data for 
the Blue Book Value X Source Estimate 
design were plotted as in Figure 3, a similar 
pattern was evident: The greater the expertise, 
the greater the effect of bias. 

Second, Model 2 gives a good description of 
the slopes and spreads of the curves in Figures 
4 and 5. The greater the expertise of a source, 
the greater the effect of that source's estimate 
and bias and the less the effect of the estimate 
and bias of another source (Figure 4) or of the 
blue book value (Figure 5). 

Third, Model 2 gives a good account of the 
single-source data (open circles and dashed 
lines in Figure 5), which cross the curves for 
different levels of blue book value. 

Deviations of fit. The deviations from the 
model should be taken seriously, since each 
circle in Figures 4 and 5 contains a fair-sized 
confidence interval. The model predicts that 
the curves in each set in Figure 5 should be 
parallel. Instead, the interaction between 
estimate and blue book value shows a di- 
vergence to the right for the buyer's and the 
independent’s points of view, F(12, 300) = 6.6 
and F(12, 828) = 11.2, respectively. Similar 
divergence was also obtained by Birnbaum 
et al. (1976) and can be described by a con- 
figural-weight model, which assigns greater 
weight to the lower estimate. A configural- 
weighting revision of the scale-adjustment 

model is tested in Experiment 5. 

There is also a hint of a higher order con-
figural effect. When the judge has the seller's
point of view and the buyer provides a low
estimate, the effect of blue book value is
greater than predicted (points are
wider than predictions), as though the buyer's
low estimate receives reduced weight. When
the buyer provides a higher estimate, [...]

[...] the scale-adjust-
ment model (Model 2) of Figure 2. Deviations
are small but regular. Consequently, Experi-
ment 5 explores these deviations, [...]
modifications of Model 2 that allow the weight
of an estimate to depend on the stimulus
configuration.


Experiment 5: Test of Configural 
Weighting Models 
Experiment 5 addressed three issues raised
by the first four experiments. First, although
the between-subjects manipulation of point
of view did affect scale values in a predictable
fashion, the pattern of weights did not support
a prior conjecture that the weight of a source
would be greater when the source's bias
matched the judge's point of view (Table 1).
Interpretation of between-subjects results
requires extreme caution, however, since the
effects of a variable such as point of view may
depend on the establishment of a context of
comparison for the individual judge. Conse-
quently, Experiment 5 allowed each judge to
experience all three points of view.

Second, the fact that the effect of a source's
estimate and bias varies inversely with the
number and expertise of other sources would
also be consistent with a linear-weighted model
in which the effective weight of a source varies
with the difference between the source's weight
and some function of the total absolute weight
instead of the ratio; that is, the effective weight
would be w − f(Σw) instead of w/Σw. In order
to test this linear-weight model against relative-
weight averaging models (such as Model 2),
it is necessary to have at least two sources [...]


Configural-Weight Theory


To account for stimulus interactions, simple 
configural-weight theories have been pi 
and tested by Birnbaum, Parducci, and 
Gifford (1971), Birnbaum (1972, 1973, 1974), 
and Birnbaum and Veit (1974). The scale- 
adjustment model (Model 2 of Figure 2) can 
be modified to allow for configural weighting. 
The simplest configural-weight theory was 
termed the range model. According to this 
model, the relative weight of a stimulus 
depends in part on the rank of that stimulus 
in the configuration of stimuli to be integrated 
on a given trial. 

For two stimuli, the range model may be
written:

\[ R = \frac{w_0 s_0 + w_1 s_1 + w_2 s_2}{w_0 + w_1 + w_2} + w_P \, |s_1 - s_2|, \tag{6} \]

where w_P is the weight of the configural, or
range (|s_1 − s_2|) effect, which, in the present
case, would presumably depend on the judge’s
point of view. Note that when s_1 > s_2, the
relative weight of s_1 can be written:

\[ \frac{w_1}{w_0 + w_1 + w_2} + w_P. \tag{7} \]

However, when s_1 < s_2, the relative weight
of s_1 can be written:

\[ \frac{w_1}{w_0 + w_1 + w_2} - w_P. \tag{8} \]

The range model assumes that the effective
relative weight of a stimulus depends on the
rank of its scale value in the set of stimuli to be
combined. As a limiting case, when w_P equals
the relative weight of a stimulus, the range
model can become a maximum or minimum
(conjunctive or disjunctive) model, depending
on the sign of w_P. The model implies that the
response varies linearly with the range of scale
values, holding mean scale value constant.

If the two stimuli are two estimates provided
by different sources, if the weight depends on
the expertise and bias of the source (e.g.,
w_1 = w_X1B1), and if the scale value depends
on the source’s estimate and bias (e.g.,
s_1 = s_E1B1), then Equation 6 becomes an
extension of the scale-adjustment model
(Model 2 of Figure 2). This configural-weight
model adds a single parameter, w_P, to account
for the effects of the judge’s point of view.
Note that w_P in Equation 7 is the amount
of relative weight taken from the lower valued
stimulus and given to the higher. If w_P is
negative, weight is added to the lower valued
stimulus, yielding a divergent stimulus inter-
action. Such an interaction might be expected
in the buyer’s point of view. A convergent
interaction (positive w_P) might be expected
in the seller’s point of view, since the larger
of two estimates should receive greater weight.
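
The divergent and convergent interactions implied by the sign of the configural weight can be illustrated with a small sketch of Equation 6. All weights and scale values below are hypothetical, chosen only to show the direction of the effect, and the helper names `range_model` and `spread` are not from the original analysis.

```python
def range_model(s1, s2, w1, w2, w0, s0, w_p):
    """Equation 6: relative-weight average plus a configural range term."""
    average = (w0 * s0 + w1 * s1 + w2 * s2) / (w0 + w1 + w2)
    return average + w_p * abs(s1 - s2)


def spread(s1, w_p, low=300.0, high=700.0):
    """Effect of the second estimate (high minus low) at a given first estimate."""
    # Hypothetical parameters: equal source weights and a weak initial impression.
    params = dict(w1=1.5, w2=1.5, w0=0.1, s0=400.0, w_p=w_p)
    return range_model(s1, high, **params) - range_model(s1, low, **params)


for label, w_p in (("negative w_P", -0.1), ("zero w_P", 0.0), ("positive w_P", 0.1)):
    left, right = spread(350.0, w_p), spread(650.0, w_p)
    print(f"{label}: spread at $350 = {left:.1f}, spread at $650 = {right:.1f}")
```

With a negative configural weight the curves for the second estimate fan out as the first estimate increases (divergence); with a positive weight they draw together (convergence); with a zero weight the model reduces to parallel curves.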


Method 


The instructions, stimuli, and procedure were similar 
to those of Experiments 1 through 4. The chief differ- 
ences were as follows: (a) Each judge was exposed to 
all three points of view via instructions and then judged 
all of the 146 stimulus combinations separately under 
each point of view. (b) Descriptions of the levels of
expertise were modified to provide five levels. (c)
The experimental designs in Experiment 5 permitted 
assessment of previously untested predictions of the 
models. 


Experimental Designs 


The trials selected were a subset of a (5 X 3 X 9)
X (5 X 3 X 9), First Source (Expertise X Bias X Esti-
mate) X Second Source (Expertise X Bias X Esti-
mate) factorial design. The complete design would have
required 18,225 trials; therefore, six smaller factorial 
designs involving these six variables were selected to 
test particular implications of the models. These 
designs are shown in Table 2 and described below. 
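
The trial counts quoted in this section can be checked by multiplying the numbers of levels. In the sketch below, the level counts for the six two-source designs are read from Table 2 and the design descriptions; the design labels are hypothetical shorthand, and the figures of 15 shared stimulus sets and 8 single-source trials are taken from the text.

```python
from math import prod

# Full factorial: (Expertise x Bias x Estimate) for each of two sources.
full_design = prod((5, 3, 9)) ** 2

# Levels per factor (expertise, bias, estimate) for the first and second source.
two_source_designs = {
    "estimate x estimate (BS)": ((1, 1, 4), (1, 1, 4)),
    "estimate x estimate (SB)": ((1, 1, 4), (1, 1, 4)),
    "bias x bias": ((1, 3, 2), (1, 3, 2)),
    "bias x expertise (other source)": ((1, 3, 2), (5, 1, 1)),
    "bias x expertise (same source)": ((5, 3, 2), (1, 1, 1)),
    "expertise x expertise": ((5, 1, 1), (5, 1, 1)),
}

cells = {name: prod(first) * prod(second)
         for name, (first, second) in two_source_designs.items()}

total_two_source = sum(cells.values())
SHARED_BY_TWO_DESIGNS = 15   # stimulus sets that appear in two designs
SINGLE_SOURCE_TRIALS = 8     # additional one-source trials
distinct_trials = total_two_source - SHARED_BY_TWO_DESIGNS + SINGLE_SOURCE_TRIALS

print(full_design, cells, total_two_source, distinct_trials)
```

The totals agree with the 18,225 trials of the complete design, the 153 two-source stimulus sets, and the 146 stimulus combinations judged under each point of view.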

Estimate X Estimate. Two related (1 X 1 X 4)
X (1 X 1 X 4) factorial designs investigated the
hypothesis that the configural weight of the lower scale 
value depends on the point of view of the judge. In both 
designs, both sources were medium in expertise; the 
levels of estimate for Source 1 were $350, $450, $550, 
and $650; and the levels of estimate for Source 2 were 
$300, $400, $600, and $700. In one design, the first 
source was a friend of the buyer and the second was a 
friend of the seller; in the other design, the biases were
reversed.

Bias X Bias. The design was a (1 X 3 X 2)
X (1 X 3 X 2), in which both sources were medium
in expertise; the three levels of bias for both sources 
were friend of the buyer, independent, and friend of the 
seller; the estimate of Source 1 was either $350 or $650; 
the estimate of Source 2 was either $300 or $700. The
purpose of the Bias X Bias design was to investigate 
the hypothesis that when a source provides an estimate 
contrary to what one would expect from that source’s 
bias, the weight of that estimate is increased. 

Bias X Expertise. Two related designs reexamined
the predictions of Model 2 that expertise amplifies the
effect of bias (and estimate) in the same source and
diminishes the effect of bias (and estimate) in another
source. A (1 X 3 X 2) X (5 X 1 X 1) factorial
used five levels of expertise (very low, low, [...]
independent who gave an estimate of $500, [...]
a medium-expertise first source, with [...]
bias of Source 1 (S, I, B) and two estimates, [...]
of weights and to replicate the Bias
X Estimate interactions of Experiments 1 [...]

Expertise X Expertise. To test the [...]
against the linear-weighted model [...]
two levels of expertise for two [...]
(5 X 1 X 1) X (5 X 1 X 1) [...]
second source gave an estimate of $650.

The designs employing two sources generate a total
of 153 stimulus sets, 15 of which are shared by two
[...] many of the three- and four-way interactions among
the six variables. In addition, 8 trials were produced
[...] which the two levels of expertise were low and high,
the two levels of bias were friend of buyer and friend
of seller, and the estimate was either $300 or $700.

[...] and 10 warm-up trials reminding the judge of the
point of view for each set. Judges were permitted to work [...]

[...] lower panels plot judgments as a function of
the buyer's estimate, with a separate curve
for each level of seller's estimate. [...]
in Figures 6-9 are [...]
model (Equation 6), an extension of [...]
later section.

Table 2
Two-Source Designs in Experiment 5

                           First source                    Second source
Design               Expertise  Bias    Estimate     Expertise  Bias    Estimate
Est1 X Est2 (BS)     1 (Med)    1 (B)   4            1 (Med)    1 (S)   4
Est1 X Est2 (SB)     1 (Med)    1 (S)   4            1 (Med)    1 (B)   4
Bias X Bias          1 (Med)    3       2            1 (Med)    3       2
Bias1 X Exp2         1 (Med)    3       2            5          1 (I)   1 ($500)
Bias1 X Exp1         5          3       2            1 (Med)    1       1 ($500)
Exp X Exp            5          1 (I)   1 ($300)     5          1 (I)   1 ($650)

Note. Each entry represents the number of levels of the factor listed above the entries. The product of the
entries in each row gives the number of cells in [...]
Estimate (Est) X Estimate design consisted of 16 trials in which the first source was a medium (Med)
expertise friend of the buyer (B) who provided four estimates, and the second source was a medium-
expertise friend of the seller (S) who provided four estimates. The last row shows that the Expertise
(Exp) X Expertise design contains 25 cells in which the expertise of each source could attain one of five
levels, the first source was an independent (I) who gave an estimate of $300, and the second source was
also an independent with a $650 estimate.


Figure 6. Upper panels: Judgment of value as a function of the estimate of the medium-expertise friend
of the seller with a separate curve for each estimate of the medium-expertise friend of the buyer. Lower
panels: Judgment of value as a function of the estimate of the buyer’s friend (on the abscissa) with
separate curves for the estimates of the seller’s friend. (Panels A, B, and C are for the judge’s point of
view. Lines represent predictions of the range model [Experiment 5, Estimate X Estimate designs].)


The curves diverge for the buyer’s point of view and for
the fair price point of view; however, they
converge for the seller’s point of view. It is as
if the judge places greater weight on the lower
estimate when identifying with the buyer and
places greater weight on the higher estimate
when identifying with the seller. The three-way
interaction of Point of View X First Estimate
X Second Estimate yielded Fs(18, 1062) = 11.8
and 11.3 for the two panels of Figure 6.
Figure 7 shows Estimate X Estimate inter-
actions with a separate panel for each combina-
tion of biases for the sources. Each panel plots
mean judgments as a function of the estimate 
of the second source (plotted on the abscissa) 
with a separate curve for each level of estimate 
of the first source. Letters inside panels 
represent biases of first and second sources, 
respectively. 

The interactions in Figure 7 are divergent 
for all nine buyer’s price panels (7A) and for 
all nine fair price panels (7B); however, the 
interactions are convergent in all nine seller's 
price panels (7C). Combined with the data 
of Figure 6, all 11 tests of the Estimate 
X Estimate interaction show the same form 
within each point of view. Assuming a relative- 
weight averaging model (all of the models in 
Figure 2 predict parallelism), it would be 
extremely unlikely to obtain 11 divergent or [...]

The group means are highly representative
of data for individual judges. For example, 58
of 60 judges show the divergent Estimate
X Estimate interaction of Figure 7 for the
buyer’s point of view; 48 judges show di-
vergence for the fair price point of view, and 40
judges show convergence for the seller’s point
of view. Hence, these data provide strong 
evidence that the Estimate X Estimate inter- 
actions of Experiments 1-4 and of Birnbaum 
et al. (1976) are “real” and, most importantly, 
that the Estimate X Estimate interaction can 
be reversed by manipulating the judge’s point 
of view. For the data of Figure 7, this three-way 
interaction yielded F(2, 118) = 39.7. Thus, 
some modification of the relative-weight 
models, such as configural weighting, is
necessary. 

Figure 8 shows the results for the Bias 
X Expertise X Estimate designs, averaged 
over the judge's point of view. These designs 
retested the implications of the models of 
Figure 2. The left panel of Figure 8 shows 
that as the expertise of the second source 
increases, the effects of the bias and estimate 
of Source 1 (the distances between the curves) 
decrease. The Expertise X Bias interaction 
yielded F(8, 472) = 6.1, and the Expertise 
X Estimate interaction had an F(4, 236) of
226.1. The right-hand panel shows that as the
expertise of the first source increases, the
effects of bias and the estimate of the first
source increase, F(8, 472) = 4.8 and F(4, 236)
= 158.2, respectively. The general pattern
was similar for all three points of view, drawn
separately. These data reconfirm the results
of Experiments 1-4; this pattern of results is
predicted by the scale-adjustment model
(Model 2 of Figure 2).

Figure 9. Mean judgment of value as a function of the expertise of an independent source who gave an
estimate of $650, with a separate curve for each level of expertise of the independent source who
said $300. (Panels A, B, and C are for buyer's, fair, and seller's points of view. Dashed
curves show predictions of the range model. Right ordinate represents an approximate scale of relative
weight [Experiment 5, Expertise X Expertise design].)

Figure 9 shows the results for the Expertise
X Expertise design, with a separate panel for
each point of view. The abscissa represents the 
expertise of an independent source who gave 
an estimate of $650; the separate curves are 
for levels of expertise of the independent 
source who said $300. The positive slopes 
indicate that judged value increases with the 
expertise of the source giving the higher 
estimate. Judged value decreases as a function 
of the expertise of the source giving the lower 
estimate. 

Averaging models predict a nonparallel set [...]

Table 3
Estimated Bias Adjustments for the Range Model

                          Source's bias
Point of view       Buyer   Independent   Seller
Buyer's price        8.79       6.39      -33.50
Fair price          20.22       0         -34.76
Seller's price      24.26       1.26      -14.79

Note. [...] $662, and $703 for the nine levels of estimate ranging
from $300 to $700. For example, the largest value
in the table (24.26) means that the scale value for
an estimate provided by a friend of [...]
the scale value $24.26 when the judge takes the
seller's point of view. Thus, an estimate of $500 has
a scale value of $511 + $24 = $535 in this condition.
The negative numbers show that when a friend of
the seller provides the estimate, the scale value is
reduced. The value for the independent source in
the fair price condition was set to 0.

The averaging model predicts that as the
expertise of a source increases, the response
should asymptotically approach the scale value 
of that source’s estimate. Note that for the 
buyer’s point of view, when a very low- 
expertise source rates the car at $300, even a 
very high-expertise source reporting $650 
cannot bring the mean judged value above 
$550. Although the range model approximates 
the shape of the data (dashed lines), it requires 
a revision to account for the data of Figure 9. 


Confidence Intervals 


The standard errors of the means for Experi- 
ment 5 were comparable with those of Experi- 
ments 1 and 2. Of the 483 standard errors, 
37% were less than $6.00, 64% were less than 
$8.00, and 76% were less than $10.00. As in 
the other experiments, the size of the standard 
error correlates with the range of estimates. 
The point size in Figure 7 contains a fair-sized 
confidence interval; however, note that the
ordinate scales in Figures 6, 8, and 9 are
expanded, so that a point size does not neces-
sarily contain a large confidence interval.


Model Analyses


Range model. The range model (Equation
6) was fit to the data by means of a computer
program utilizing the STEPIT subroutine, de-
signed to minimize the sum of squared data-
model discrepancies over all 483 (161 X 3)
cells in the experiment. Except for the con-
figural-weight parameter, w_P, the model is a
form of the scale-adjustment model (Model 2 of
Figure 2).

The number of parameters required to
estimate scale values was reduced by the
simplifying assumption that the scale value
of a biased source's estimate is an additive
function of a scale value depending on the
estimate alone and a bias parameter depending
on the source's bias and the judge's point of
view:

\[ s_{EBP} = s_E + b_{BP}, \tag{9} \]

where s_EBP is the effective scale value for an
estimate, E, provided by a source of bias, B,
for a judge of point of view, P; s_E depends
only on the estimate, and b_BP depends on the
source's bias and the judge's point of view. The
value of b_BP for the independent source in the
fair price point of view was set at zero. The
scale values require the estimation of 17
parameters, 9 for the s_E and 8 (3 X 3 - 1)
for b_BP.
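
The additive assumption can be illustrated with the bias adjustments of Table 3 and the scale value of $511 that the table note gives for a $500 estimate. The sketch below is only a lookup and a sum; the dictionary layout and the function name `effective_scale_value` are hypothetical.

```python
# Bias adjustments b_BP from Table 3, keyed by (judge's point of view, source's bias).
BIAS_ADJUSTMENT = {
    ("buyer's price", "buyer"): 8.79,
    ("buyer's price", "independent"): 6.39,
    ("buyer's price", "seller"): -33.50,
    ("fair price", "buyer"): 20.22,
    ("fair price", "independent"): 0.0,   # fixed at zero, not estimated
    ("fair price", "seller"): -34.76,
    ("seller's price", "buyer"): 24.26,
    ("seller's price", "independent"): 1.26,
    ("seller's price", "seller"): -14.79,
}

SCALE_VALUE_500 = 511.0  # s_E for a $500 estimate, from the note to Table 3


def effective_scale_value(s_e, source_bias, point_of_view):
    """Effective scale value: estimate-only scale value plus bias adjustment."""
    return s_e + BIAS_ADJUSTMENT[(point_of_view, source_bias)]


buyer_friend = effective_scale_value(SCALE_VALUE_500, "buyer", "seller's price")
seller_friend = effective_scale_value(SCALE_VALUE_500, "seller", "fair price")

# Free scale-value parameters: 9 estimate values plus 8 estimated bias adjustments.
free_parameters = 9 + (len(BIAS_ADJUSTMENT) - 1)

print(round(buyer_friend), round(seller_friend, 2), free_parameters)
```

The first value reproduces the worked example in the note to Table 3, in which a $500 estimate from a friend of the buyer has a scale value of about $535 for a judge taking the seller's point of view.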

Table 3 shows the values of bias adjustments for the scale values. The
estimated scale values, \(s_E\), for the fair price condition were [...] 451,
511, 563, 620, 662, and 705, respectively, for estimates of $300 to $700 in
$50 increments. Table 3 shows that for the fair price condition, scale values
would be about $35 less if the estimate were provided by a friend of the seller
or about $20 more if provided by a friend of the buyer. As one might expect,
the absolute value of the bias adjustment is less when the source's bias
matches the judge's point of view.
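The additive rule of Equation 9 can be illustrated with the fair price figures quoted in the text. The following is a minimal sketch, not the authors' fitting program. The names `SCALE_VALUE`, `BIAS_ADJUSTMENT`, and `effective_scale_value` are hypothetical; the pairing of the six legible scale values with the estimates $450 to $700 is an assumption based on their order in the list; and the bias adjustments are the rounded amounts mentioned in the text, not the entries of Table 3.

```python
# Fair price scale values quoted in the text, keyed by the source's estimate in dollars.
# Assumption: the six legible values belong to the six highest estimates.
SCALE_VALUE = {450: 451.0, 500: 511.0, 550: 563.0, 600: 620.0, 650: 662.0, 700: 705.0}

# Rounded fair price bias adjustments described in the text (illustrative only).
BIAS_ADJUSTMENT = {"buyer": 20.0, "independent": 0.0, "seller": -35.0}


def effective_scale_value(estimate: int, source_bias: str) -> float:
    """Return s_E + b_BP for one estimate and one type of source."""
    return SCALE_VALUE[estimate] + BIAS_ADJUSTMENT[source_bias]


if __name__ == "__main__":
    for bias in ("buyer", "independent", "seller"):
        print(bias, effective_scale_value(500, bias))
```

Under these assumptions, a $500 estimate is shifted downward when it comes from the seller's friend and upward when it comes from the buyer's friend, before any averaging takes place.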

To permit examination of how weights of
different sources might depend on point of
view, a different weight was estimated for
each type of source in each point of view. Thus,
45 weights, \(w_{XBP}\), were estimated for the
5 X 3 X 3, Expertise X Bias X Point of View
combinations. The weight of the low-expertise
independent in the fair price condition was
set to 1.0, leaving 44 parameters to be
estimated.

The least squares estimates of weights are 
shown in Table 4. The general pattern of 
weights in Experiment 5 is similar to that of 
Experiments 1-4. The ratio of the weight of 
high- to low-expertise independents for the 
fair price condition is 6.78, not far from the 
corresponding value of 6.98 for Experiments 
1 and 2. The weights depend mostly on 
expertise, with the greatest weights being for 
independents of very high expertise. For the 
buyer’s point of view, the buyer’s weight 
exceeds that of the seller for all levels of 
expertise. For the seller’s point of view, the 
friend of the seller receives greater weight 
than the corresponding buyer for the very 
high-expertise source. There is thus a very 
slight hint of support for the notion that biased 
sources receive relatively higher weight when 
the judge shares the source’s bias, but the 
evidence does not seem decisive. 

In addition, separate weights and scale 
values for the initial impression and separate 
values of \(w_P\), the configural-weight parameter,
were permitted for each point of view. These 9
parameters, plus 44 source weights, plus 17
for scale values make a total of 70 parameters
estimated to describe the data in Figures 6-9.
The estimated scale values of the initial
impressions were 281, 308, and 413, with
weights of .30, .15, and .24, for buyer's, fair,
and seller's price, respectively. The values of
the configural-weight parameter, \(w_P\), were
-.194, -.066, and .055 for the buyer's price,
fair price, and seller's price, respectively.
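The parameter count for the range model can be checked with a few lines of arithmetic. The sketch below only restates the bookkeeping given in the text; the variable names are hypothetical, and the split of the "9 parameters" into a weight and a scale value for the initial impression plus one configural parameter per point of view is an interpretation of the sentence above.

```python
N_ESTIMATES = 9        # $300 to $700 in $50 steps
N_BIAS = 3             # buyer's friend, independent, seller's friend
N_POINTS_OF_VIEW = 3   # buyer's, fair, and seller's price
N_EXPERTISE = 5        # very low through very high

scale_values = N_ESTIMATES                                     # s_E
bias_adjustments = N_BIAS * N_POINTS_OF_VIEW - 1               # b_BP, one fixed at zero
source_weights = N_EXPERTISE * N_BIAS * N_POINTS_OF_VIEW - 1   # w_XBP, one fixed at 1.0
initial_impression = 2 * N_POINTS_OF_VIEW                      # weight and scale value per point of view
configural = N_POINTS_OF_VIEW                                  # w_P per point of view

total = (scale_values + bias_adjustments + source_weights
         + initial_impression + configural)
print(total)
```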

The predictions of this model are shown
in Figures 6-9. The model is an extension of 
Model 2 of Figure 2, with only one elabora- 
tion—the configural-weight effect to allow for 
Estimate X Estimate interactions. Consider- 
ing that only one additional parameter is used 
for each point of view, the range model gives 
a very good account of the data in Figure 6 


Table 4
Estimated Weights of Sources for the Range Model

                               Expertise of source
Source's bias      Very low     Low    Medium    High    Very high

Buyer's point of view
  Buyer               .91       .80     2.42     4.52      6.42
  Independent        1.00      1.53     3.05     5.70      9.98
  Seller              .76       .45     2.39     2.63      4.90

Fair price point of view
  Buyer               .66       .77     1.40     3.22      4.89
  Independent         .47      1.00     2.68     6.78     17.57
  Seller              .43       .38     1.50     2.34      4.13

Seller's point of view
  Buyer               .96      1.43     2.37     4.10      4.95
  Independent         .45      1.22     2.62     5.44     10.20
  Seller              .95      1.08     2.25     3.47      6.35


Note. Each entry is the least squares estimate of 
weight for each type of source under each con- 
dition. The weight of the low-expertise independent 
source in the fair price condition was set to 1.0. For 
example, the table shows that when the judge takes 
the seller's point of view, the very high-expertise 
friend of the seller has greater weight (6.35) than 
the very high-expertise friend of the buyer (4.95). 


and a good account of the data of Figure 7. 
The root mean squared errors were 6.83 and 
10.40 for Figures 6 and 7, respectively. The 
overall sum of squared deviations was 61,497 
over 483 points, yielding a root mean squared 
deviation of 11.28. This value was 13.08, 10.23, 
and 10.30 for the buyer’s, fair, and seller’s 
price conditions, respectively. 

The range model would have been a great 
success had the data of Figure 9 not been 
collected. When a very high-expertise, in- 
dependent source gives an estimate of $300 
and a very low-expertise independent source 
gives an estimate of $650, the mean judgment 
for the buyer’s point of view is $342, compared 
with a prediction of $292, a deviation of about 
$50. The range model makes this extremely 
low prediction because the relative weight of 
the $650 estimate is so low in this case (.09) 
that when the negative value of \(w_P\) (-.194) is
added to it, the effective relative weight of the
$650 estimate becomes negative. In other
words, the best-fit value of \(w_P\) for the entire
experiment (including the data of Figures 6
and 7) is a poor value for the data of Figure 9.
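The arithmetic behind this failure is short enough to spell out. The sketch below assumes, following the description in the text, that the range model adds the configural parameter directly to the relative weight of the higher estimate; the function name and the `sigma` argument (+1 for the higher estimate) are hypothetical conveniences, not the authors' notation for Equation 6.

```python
def range_model_effective_relative_weight(relative_weight: float,
                                          w_p: float,
                                          sigma: int = 1) -> float:
    """Shift a relative weight by the configural term w_p * sigma."""
    return relative_weight + w_p * sigma


W_P_BUYER = -0.194  # best-fit value for the buyer's point of view
effective = range_model_effective_relative_weight(0.09, W_P_BUYER)
print(round(effective, 3))
```

Because the shift is a constant, any relative weight smaller than .194 is driven below zero from the buyer's point of view, which is what produces the extremely low prediction.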
To examine the effective relative weights, 
the right ordinate of Figure 9 has been labeled 
from 0 to 1. Since the scale values are approxi- 
mately equal for the independent sources from 
different points of view, and since the weight 
of the initial impression is so small, the
right-ordinate values can be read off as the
approximate relative weight of the $650
estimate for each point in Figure 9.
Revised configural-weight theory. The assumption of the range model that the
configural effect operates on relative weight appears inconsistent with the
data of Figure 9. The model was revised to incorporate the principle that the
amount of decrease in absolute weight is directly proportional to the absolute
weight apart from the configural effect. Weight taken from one stimulus is
given to another. Hence, if a source's estimate loses weight, it does so in
proportion to its original weight; if a source's estimate gains weight, it does
so in proportion to the weight of the other source's estimate.


[Figure 10. Panels: A. Buyer's Price; B. Fair Price. Ordinate: mean judgment. Curve parameter: expertise of the independent source who said $300.]


Thus, there is a conservation of absolute
weight: The loss of one source is another
source's gain.

The revised configural-weight model is a scale-adjustment model (Equations 4
& 9); it is distinguished from the range model (Equation 6) in that the
configural transfer of weight operates on absolute rather than relative weight.
The weights are given by the following equation:

\[ w_{XBPE} = w_{XBP} + w_P \sigma_E \rho_{PE}, \tag{10} \]

where \(w_{XBPE}\) is the effective absolute weight of an estimate; \(w_{XBP}\) is
defined as in the range model; \(w_P\) is the configural-weight parameter;
\(\sigma_E = 1\) if the estimate is the largest in the set, \(\sigma_E = -1\) if it
is the smallest, and \(\sigma_E = 0\) otherwise. By definition,
\(\rho_{PE} = w_{XBP}\) if \(w_P \sigma_E < 0\); otherwise,
\(\rho_{PE} = w_{X'B'P}\), where \(w_{X'B'P}\) is the absolute weight of the other
source. Thus, a source loses weight in proportion to its original value but
increases by taking the weight lost by the other source.
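The transfer rule of Equation 10 can be sketched for the two-estimate case. This is an illustrative reading of the definitions above, not the authors' program: the function name and the example weights of 10.0 and 1.0 are hypothetical, and only the values of the configural parameter (-.385 and .130) are taken from the text.

```python
from typing import Tuple


def revised_configural_weights(w_low: float, w_high: float,
                               w_p: float) -> Tuple[float, float]:
    """Effective absolute weights of the lower and the higher of two estimates.

    sigma is -1 for the smallest estimate and +1 for the largest. rho is the
    source's own weight when w_p * sigma < 0 (the source loses weight) and the
    other source's weight otherwise (the source gains what the other lost).
    """
    def effective(own: float, other: float, sigma: int) -> float:
        rho = own if w_p * sigma < 0 else other
        return own + w_p * sigma * rho

    return effective(w_low, w_high, -1), effective(w_high, w_low, +1)


if __name__ == "__main__":
    # Hypothetical absolute weights: an expert gives the low estimate,
    # a low-expertise source gives the high estimate.
    eff_low, eff_high = revised_configural_weights(10.0, 1.0, w_p=-0.385)
    print(eff_low, eff_high, eff_low + eff_high)
```

In this sketch the total absolute weight is unchanged by the transfer, and the weight of the losing estimate shrinks by a proportion of itself, so it cannot become negative as long as the absolute value of the configural parameter is below 1.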
This modified configural-weight model (Equations 4, 9, & 10) provides a great
improvement over the simple range model (Equations 4, 6, & 9), though it uses
the same number of parameters. The total sum of squared deviations over all 483
cells was reduced from 61,497 to 48,333, yielding a root mean squared error of
10.00. The improvement in fit was greatest for the Expertise X Expertise design
(Figure 9), where the root mean squared error was reduced from 17.24 to 13.09.
The fit was also noticeably better for the Expertise X Bias design (Figure 8).
Fit was about the same or slightly better in the other designs.

Figure 10. Data of Figure 9 with predictions (dashed curves) of revised
configural-weight theory. Abscissa: expertise of the independent source who
said $650.
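The root mean squared errors reported for the two models follow directly from the sums of squared deviations and the number of cells. The short check below uses only figures stated in the text; the function name is hypothetical.

```python
import math


def root_mean_squared_error(sum_squared_deviations: float, n_cells: int) -> float:
    """Square root of the mean squared deviation over all cells."""
    return math.sqrt(sum_squared_deviations / n_cells)


N_CELLS = 483
range_model = root_mean_squared_error(61_497, N_CELLS)
revised_model = root_mean_squared_error(48_333, N_CELLS)
print(round(range_model, 2), round(revised_model, 2))
```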

Figure 10 shows the fit of the improved 
configural-weight model to the Expertise 
X Expertise design. Solid points are empirical 
means; dashed curves represent predictions. 
As can be seen by comparing Figures 9 and 10, 
the revised configural-weight theory gives a 
superior description of these data, especially 
for the fair price point of view. 

The estimated scale values and bias param- 
eters of this configural model were similar to 
those of the range model. The pattern of 
estimated weights was also similar, except
that the weights always increased with
expertise for this model (an improvement), and
the range of weights increased (especially for
the independent sources). The values of \(w_P\)
were -.385, -.155, and .130 for the buyer's,
fair, and seller’s points of view, respectively. 
An extension of the configural-weight model 
is discussed in Appendix B. 

Differential-weight theory. In differential-weight theory, the weight of a
stimulus is permitted to depend on the estimate. For the present data, the
weight would be required to depend on point of view as well. The most complex,
scale-adjustment, differential-weight model allows an increment in weight to
depend on three factors: bias, point of view, and estimate. There are 81 values
of the weight increment, 9 of which (for the independent, fair price
conditions) were set to 0, leaving 72 weight parameters to be estimated (69
more than the configural-weight model). Two versions were tried: In one version
the added increment in weight was independent of \(w_{XBP}\); in the other it was
proportional to \(w_{XBP}\). Scale adjustment was allowed, in accordance with
Equation 9. Neither of these differential-weight models described the data of
Figure 9 as well as the configural-weight model (Equation 10). The best-fitting
differential-weight model can be written as follows:

\[ w_{XBPE} = w_{XBP}(1 + \theta_{BPE}), \tag{11} \]

where \(\theta_{BPE}\) is the weight increment and \(w_{XBP}\) is defined as in
Equation 10. The root mean squared deviations for the Expertise X Expertise
design were 16.8, 16.9, and 14.3 for the buyer's, fair, and seller's price
judgments, respectively, compared with 15.3, 10.6, and 12.9, respectively, for
the modified configural-weight theory (Equation 10).

The differential-weight theory requires a relationship between the shape of the
curves in Figure 9 and their approach to the scale values. The curves should
"bow over" only as they asymptotically approach the scale values (Birnbaum,
1973; Riskey & Birnbaum, 1974). As can be seen from Figure 9, however, the
curves bow over long before reaching the estimated scale values. For example,
the highest point in Figure 9B was predicted to be $641 by the
differential-weight averaging model, compared with $610 for the data and $608
for the prediction of the configural-weight model (Equation 10). This deviation
from the averaging model is similar to that obtained by Birnbaum (1973) and by
Riskey and Birnbaum (1974). Since the differential-weight model of Equation 11
uses nearly twice as many parameters yet fits the data of Figure 9 worse,
configural-weight theory seems preferable.
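The claim that the differential-weight model uses nearly twice as many parameters can be checked by counting. The sketch below restates the counts given in the text; the variable names are hypothetical, and the total of 139 is an inference (the 70 parameters of the configural-weight model plus the 69 additional ones), not a figure reported by the authors.

```python
N_BIAS, N_POINTS_OF_VIEW, N_ESTIMATES = 3, 3, 9

increments = N_BIAS * N_POINTS_OF_VIEW * N_ESTIMATES   # one per Bias x Point of View x Estimate cell
fixed_at_zero = N_ESTIMATES                            # independent source, fair price
free_increments = increments - fixed_at_zero

CONFIGURAL_PARAMETERS = N_POINTS_OF_VIEW               # one w_P per point of view
extra_parameters = free_increments - CONFIGURAL_PARAMETERS

CONFIGURAL_MODEL_TOTAL = 70                            # total stated for the configural-weight model
differential_model_total = CONFIGURAL_MODEL_TOTAL + extra_parameters   # inferred, not reported
print(free_increments, extra_parameters, differential_model_total)
```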


Discussion 


The purpose of this research is to examine 
theories of how judges (who may be biased) 
combine information from sources who vary 
in their ability to report the truth (expertise) 
and their motivation to distort it (bias). By 
representing these processes with mathe- 
matical models, it becomes possible to test 
experimentally among explicit theories of 
social judgment. 

Experiments show that certain algebraic 
representations cannot account for the judg- 
ments. Additive and constant-weight averaging 
formulations have been tested and rejected 
in previous studies of information integration 
(T. Anderson & Birnbaum, 1976; Birnbaum,
1972, 1973, 1974, 1976; Birnbaum et al., 1976).
The present data are in agreement with these
previous studies in refuting additive and
constant-weight averaging models in favor of
some form of configural, relative-weight
averaging model.


[...] not alter the effect of his or her bias; in another, expertise magnifies
the effect of bias; [...] source's bias. The data give a clear indication that
the response-revision and weighted-bias models can be rejected. The retained
model, scale-adjustment, which predicts that expertise will magnify the effects
of bias, makes [...] predictions fulfilled by the data. By making these correct
predictions, the status of the model as a representation of source effects is
enhanced. The present research also extends the investigation to examine
configural-weight theories of how the judge's point of view affects the weights
and scale values of estimates from [...]

[...] that experts are necessarily judged biased, only that the effect of bias
will be greater for sources of greater expertise. It is as if the judge,
hearing that the seller's friend provided an estimate of $500, thinks, "That
means $470," before averaging the information, giving greater weight to the
estimate provided by the source of greater expertise.

The weight of an estimate depends mostly 
on the source’s expertise; however, weight 
also varies with bias. A consistent trend across 
all experiments is that the unbiased source 
(the independent) of high expertise tends to 
have greater weight than either biased source 
of the same expertise. 

It is important to note that the effect of bias on weight and its effect on
scale value may work in tandem or in opposite [...] instance. Perhaps this
would [...] contradictions in research on attitude change for the effects of
source [...], or bias (see McGuire, [...]). [...] both a signed variable (plus
or minus) in its effect on scale value and an [...] in its effect on weight.
For example, a biased source (e.g., a scientist employed by a company using
nuclear reactors) might [...] larger attitude change in favor of [...] than an
unbiased source if he said, "[...] reactors are unsafe." On the other hand, the
effect of this message on the impact of [...] messages (weight) might be less
than that [...] an unbiased source of the same expertise gave the same message.

The data are suggestive, but by no means conclusive, on two other weighting
[...]. First, a source may receive extra weight for making an estimate that
would not be expected on the basis of his or her bias. Second, the weight of a
biased source may be greater if the judge is of the same point of view as the
source.


Judge's Point of View 


The judge's point of view consistently affects the scale values and the value
of the initial impression: The values tend to be lower for the buyer's point of
view and higher for the seller's point of view. Perhaps the [...] 2 and by
Birnbaum et al. (1976), who obtained fair price judgments. Perhaps
undergraduates tend to identify with the consumer under the fair price
condition.

A modified configural-weight theory (a 
simple extension of the range model) gives 
a good account of this change of weight across 
points of view. The configural-weight theory 
uses just one additional parameter, besides the 
basic structure of Model 2 of Figure 2, to 
represent the configural effect of point of view. 


Comparison of Configural- and Differential- 
Weight Theories 


In configural theories, stimulus parameters 
depend on the stimulus pattern. Hence, the 
higher or lower estimate receives extra weight 
in the seller's or buyer’s point of view, respec- 
tively. On the other hand, differential-weight 
theories assume that weight depends on 
stimulus magnitude, a seemingly context-free 
theory. In actuality, differential-weight theory 
requires (implicitly) a contextual theory that 
assigns the weights as a function of the context 
of the entire distribution of weights and scale 
values presented. For the present study, $400 
would be a low estimate, deserving of a large 
weight from the buyer’s point of view. 

In contrast, configural-weight theory would 
not declare a $400 estimate to be low, except in 
comparison with the other estimate of the same 
car—that is, except with respect to the 
within-set context (Birnbaum et al., 1971). 
Thus, $400 would be a low estimate if the other 
stimuli are greater, but it would be high if the 
other estimates were lower. It is the relation- 
ships among estimates of the same car that 
define the configural weights. Thus, configural- 
weight theory takes the implicit context theory 
of differential weighting and extends it ex- 
plicitly to the immediate (within-set) context 
of stimuli presented on each trial. 

Configural-weight theory can be represented 
by means of a fulcrum and balance analogue, 
as in Figure 2. The configural parameter, \(w_P\),
represents the proportion of weight that, 
depending on the point of view, is taken from 
either the higher or lower stimulus and given 
to the other. Differential-weight theory can 
also be represented by the lever and fulcrum; 
however, each location on the lever has a 


different weight associated with it. In addition, 
differential weighting requires a complete 
remapping of weights to locations for each 
point of view, using many more parameters 
than the configural-weight models. 
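The fulcrum and balance analogue corresponds to a weighted average: the judgment is the point at which the lever balances. The sketch below is a minimal illustration with a hypothetical function name; it leaves out the configural transfer of weight and any scale adjustment, and the weights and locations used in the example are illustrative values rather than fitted parameters.

```python
from typing import Sequence


def center_of_gravity(weights: Sequence[float], locations: Sequence[float]) -> float:
    """Balance point of weights placed at the given locations along a lever."""
    if len(weights) != len(locations):
        raise ValueError("weights and locations must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return sum(w * s for w, s in zip(weights, locations)) / total


if __name__ == "__main__":
    # Illustrative case: a light initial impression plus two sources' estimates.
    weights = [0.15, 6.0, 1.0]
    locations = [308.0, 450.0, 650.0]
    print(center_of_gravity(weights, locations))
```

Moving a proportion of weight from one estimate to the other, as the configural parameter does, shifts the balance point toward the estimate that gains weight without changing the locations themselves.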

Differential weighting may be required for 
certain situations in which multidimensional 
social stimuli vary in both weight and scale 
value simultaneously (e.g., T. S. Anderson & 
Birnbaum, 1976). Differential weighting is an 
extremely powerful curve-fitting device, one 
that requires special experimental designs to 
test. When appropriate designs have been 
employed, differential-weight models have 
failed to account for the data (Birnbaum, 
1973; Riskey & Birnbaum, 1974). Judgments 
of morality fail to show the compensatory 
effects predicted by nonconfigural averaging 
theories: Given that a person has committed 
a very bad deed, there appears to be no number 
of good deeds that will make the person’s 
judged morality approach the same high 
asymptote as if the person had done only good 
deeds. Similarly, the data of Figure 9 suggest 
that from the buyer’s point of view, if a low- 
expertise source says $300, no level of expertise 
of the $650 estimate will compensate to bring 
the judged value above $550. For the present 
interactions, the simple configural-weight 
models provide a more elegant, accurate, and 
theoretically appealing description of the 
interactions. 


Concluding Comments 


This research investigates information inte- 
gration under conditions in which the relevant 
variables can be manipulated, and it attempts 
to discover principles that explain the results. 
The hope is that principles that apply in 
controlled experiments are characteristic of 
basic psychological processes that have applica- 
bility to a wide range of judgmental phe- 
nomena. Results of experiments in a large 
number of domains encourage the hope that 
a reasonably small set of premises may 
account for a large array of data. Previous 
research has shown that the same principles 
of source expertise can be applied to intuitive 
numerical predictions, ratings of likableness, 
and judgments of the value of used cars. We 
are currently investigating the generality of
the principles of bias discovered here to see
if they account for judgments of probable 
guilt in simulated court cases. 

The present research suggests that a simple 
algebra can account for the complex effects 
of bias in information integration. Judgments 
can be represented by the center of gravity 
of a lever. Information can be represented 
by weights placed at various locations along 
the lever. The location, or scale value, of the 
information depends on the source's communi- 
cation. This location is adjusted to account for 
the source’s bias and the judge's point of view. 
The weight of a source's communication
depends mostly on the source's expertise
but diminishes if the source is biased. In
[...] quantitative account of a variety of predictions, its
plausibility is increased. It may then be used
as a measuring device to study the effects of
other variables on its parameters. With this
Archimedean lever, a fulcrum, and a place to
stand, we hope to raise our understanding of
human judgment.


Appendix A


An alternative to all three of the models of source bias presented in Figure 2
is a differential-weight model:

\[ R = \frac{w_0 s_0 + w s + w_{XBE} s_E}{w_0 + w + w_{XBE}}, \]

in which the weight of an estimate, \(w_{XBE}\), depends on expertise, bias, and
estimate, but the scale value is independent of bias. This model assumes that a
high estimate provided by the buyer's friend receives greater weight
than if the seller's friend had made the same
estimate. Consequently, the judged worth 
would be greater for the buyer’s high estimate 
than for the seller's. Similarly, a low estimate 
is assumed to receive greater weight if it is 
provided by the seller's friend than if provided 
by the buyer's friend; consequently, judged 
worth would still be greater for the buyer's 
than the seller's estimate. 

This differential-weight model cannot de- 
scribe the present data. It makes three in- 
correct predictions for effects that are correctly 
predicted by the scale-adjustment model. 
First, it predicts that if \(w_0/w_{XBE}\) is near zero, the effects of both bias
and expertise should be very small in the source-estimate designs, where blue
book value is not presented, since \(w_{XBE}/(w_0 + w_{XBE}) \approx 1\). In
contrast, Equation 3 predicts that the Expertise X Estimate interaction should
be small but the effect of bias should be maximal, since if \(w_0/w_X = 0\) for
all \(w_X\), then

\[ \frac{w_X (s_E + b_B)}{w_0 + w_X} = s_E + b_B. \]

The data for the source-estimate designs (open circles in Figure 5) show a
small Expertise X Estimate interaction and a large effect of
bias. The average effect of bias is $76.6 in the 
source-estimate design of the fair price condi- 
tion, compared with $46.6 when blue book 
value is also presented. The differential-weight 
model incorrectly predicts that the effect of 
bias should have been larger when blue book 
value is presented. 
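The contrast between the two designs can be made concrete under the scale-adjustment model. The sketch below assumes a relative-weight average in which the biased source's scale value is \(s_E + b_B\) and blue book value enters as one more weighted term; the function names and all numerical values are hypothetical and chosen only to show the direction of the effect.

```python
def scale_adjustment_judgment(s_e: float, b_b: float, w_x: float,
                              w0: float, s0: float,
                              w_blue: float = 0.0, s_blue: float = 0.0) -> float:
    """Relative-weight average with the source's scale value shifted by its bias."""
    numerator = w0 * s0 + w_blue * s_blue + w_x * (s_e + b_b)
    return numerator / (w0 + w_blue + w_x)


def bias_effect(w_x: float, w0: float, s0: float,
                w_blue: float = 0.0, s_blue: float = 0.0,
                s_e: float = 500.0,
                b_buyer: float = 20.0, b_seller: float = -35.0) -> float:
    """Difference in judgment between a buyer's and a seller's friend giving the same estimate."""
    buyer = scale_adjustment_judgment(s_e, b_buyer, w_x, w0, s0, w_blue, s_blue)
    seller = scale_adjustment_judgment(s_e, b_seller, w_x, w0, s0, w_blue, s_blue)
    return buyer - seller


if __name__ == "__main__":
    without_blue_book = bias_effect(w_x=3.0, w0=0.15, s0=300.0)
    with_blue_book = bias_effect(w_x=3.0, w0=0.15, s0=300.0, w_blue=2.0, s_blue=500.0)
    print(without_blue_book, with_blue_book)
```

In this sketch the effect of bias is diluted whenever another weighted term is added to the average, which matches the direction of the reported difference ($76.6 without blue book value versus $46.6 with it).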

Second, the differential-weight model pre- 
dicts that there exists some level of estimate 
for which bias has no effect, contrary to the 
data. Third, the differential-weight model 
predicts a complex four-way interaction be- 
tween expertise, bias, estimate, and blue book 
value that did not materialize in the predicted 
form. Consequently, differential weighting 
alone cannot explain the effect of bias—scale 
adjustment appears to be necessary. 


Appendix B 


It seems intuitively reasonable that if a 
source gives an estimate that would not be 
expected on the basis of his bias and the other 
estimate, he might receive greater weight. 
For example, if a friend of the seller gave an 
estimate below the blue book value, his 
estimate might receive greater weight. 

To investigate this hypothesis, the data of 
Experiment 5 were fit using the following 
equation: 


\[ w_{XBPE} = w_{XBP} + w_P \sigma_E \rho_{PE} + \alpha_{PB} \beta_B \sigma_E \delta_{PBE}, \]


where \(w_{XBPE}\) is the absolute weight of an estimate, E, by a source of
expertise, X, and bias, B, from a point of view, P; \(w_{XBP}\) is the
configural-free weight, as in Equation 10; \(\alpha_{PB}\) is the estimated
configural-weight parameter that expresses the magnitude of the
expectancy-contrast effect; \(w_P\), \(\sigma_E\), and \(\rho_{PE}\) are defined as
in Equation 10; \(\beta_B = -1\) if the source is a seller, \(\beta_B = 0\) if the
source is an independent, and \(\beta_B = 1\) if the source is a buyer;
\(\delta_{PBE} = w_{XBP}\) if \(\alpha_{PB} \beta_B \sigma_E < 0\), and otherwise
\(\delta_{PBE}\) equals the configural-free weight of the other source.


The product \(\beta_B \sigma_E\) will be positive when either (a) a buyer provides
the higher estimate or (b) a seller provides the lower estimate, instances in
which the source seems to deserve extra weight for doing the unexpected. It
will be negative when either a buyer provides the lower estimate or a seller
provides the higher (i.e., when the source's estimate is expected). Notice that
\(\delta_{PBE}\) and \(\rho_{PE}\) make a decrease in weight proportional to the
weight to be decreased.
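The sign convention for the expectancy term can be tabulated directly from these definitions. The sketch below uses hypothetical names (`BETA`, `SIGMA`, `expectancy_sign`) and covers only the sign of the product, not the size of the weight change.

```python
BETA = {"seller": -1, "independent": 0, "buyer": 1}    # coding of the source's bias
SIGMA = {"lowest": -1, "middle": 0, "highest": 1}      # rank of the estimate within the set


def expectancy_sign(source_bias: str, rank: str) -> int:
    """Sign of the product of the bias code and the rank code: +1 unexpected, -1 expected, 0 neutral."""
    return BETA[source_bias] * SIGMA[rank]


if __name__ == "__main__":
    for bias in ("buyer", "independent", "seller"):
        for rank in ("lowest", "highest"):
            print(bias, rank, expectancy_sign(bias, rank))
```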

Figure 7A, Panel IB shows that when the 
buyer gives an estimate of $300, the effect 
of the independent’s estimate is greater than 
predicted, as if the buyer lost weight. Figure 
7A, Panel IS, shows that when the seller says 
$300, the effect of the independent is decreased 
relative to the predictions of the range model. 

For the buyer's point of view, the \(\alpha_{PB}\) weights for the buyer and
seller are -.08 and .20, respectively; for the fair price point of view, the
values are close to zero, -.04 and -.01, respectively; and for the seller's
point of view, they are .05 and .11, respectively. Thus, it appears that for
either biased point of view, the seller receives extra weight for making the
lower estimate. The friend of the buyer receives additional weight for the
seller's point of view when he provides the higher (unexpected) estimate;
however, for the buyer's point of view, the buyer receives greater weight for
providing the lower estimate. The near-zero values for the fair price condition
[...] reflect the fact that the fit of Equation 10 was already quite good in
this condition (see Figure 10).

The expectancy-weight elaboration reduced the overall sum of squared
discrepancies from 48,333 to 44,267, an improvement of [...] over Equation 10.
The root mean squared deviation was 9.57. Improvements were greatest for the
Expertise X Expertise, [...] X Bias, and Expertise X Bias designs for the
buyer's point of view.

MICHAEL H. BIRNBAUM AND STEVEN E. STEGNER

Received November 7, [...]


Two experiments were conducted to test the effects of self-focused attention on
positive and negative social interactions. In the first study, the behavior of
dispositionally high and low publicly self-conscious women was examined in an
interpersonal situation involving rejection by a group. It was hypothesized
that persons high in self-consciousness, being more aware of how they are
perceived by others, would be more sensitive and react more negatively to the
rejection than those low in self-consciousness. The predictions were confirmed.
In Experiment 2, female subjects were presented with favorable or unfavorable
feedback in the context of an interview, and self-attention was experimentally
manipulated by exposing half the subjects to their images in a mirror.
Self-awareness increased the negative response to the negative evaluation and
tended to increase the positivity of the positive evaluation. The implications
of self-awareness theory for the social self and social interaction are
discussed.


Everyday observations as well as theoretical approaches to social behavior
suggest that in the presence of others, one is apt to become self-conscious,
that is, aware of the self as a social object that can be observed and
evaluated by others. Goffman (1959) has argued quite persuasively that when one
is attending to and involved in an ongoing interaction, that interaction can
proceed smoothly and naturally; but if one is engaged in self-focused thought
during that interaction, then concern is shifted away from what is being said
toward whether what one says will be received favorably or unfavorably.

Argyle (1969) also proposes that self-consciousness, conceived of as the
activation of the "self-system," produces a decreased concern with evaluating
the behavior of others and an increased concern with the personal and public
assessment of one's own behavior. It follows that when the self-system remains
dormant, relatively less thought is given to one's own behavior or its effects
on others. Similarly, Duval and Wicklund (1972) have experimentally
demonstrated that a state of self-focused attention causes one to engage in
self-examination and self-evaluation.


This article is partially based on a doctoral dissertation submitted to the
Graduate School of the University of Texas at Austin. The manuscript was
written while the author was on leave at the University of Virginia. The author
is grateful to Arnold [...]ful comments on an earlier draft. Portions of this
article were presented at the annual convention of the American Psychological
Association, San Francisco, August 1977.

Requests for reprints should be sent to Allan Fenigstein, Department of
Psychology, Kenyon College, Gambier, Ohio 43022.

Underlying all of these approaches to self- 
consciousness are several common, but crucial, 
assumptions: (a) During a social encounter, 
attention may be directed either toward or 
away from the self. (b) When attention is 
self-directed, the person becomes conscious of 
the self as an object of attention to others; 
conversely, when attention is directed away 
from the self toward external stimuli, there is 
little consciousness of the self as a social ob- 
ject. And (c) a major consequence of self- 
consciousness is an increased concern with the 
presentation of self and the reactions of others 
to that presentation. 


Determinants of Self-Attention 


Self-focused attention has several sources, 
both situational and dispositional. It can be 
brought about through the presence of others, 
whereby the person becomes cognizant of the
fact that others perceive him or her as an
object (Carver & Scheier, 1978); or it may be 
the result of a reflective stimulus, such as a 
mirror or a tape recording of one's own voice, 
which is a reminder of one’s status as an ob- 
ject (Argyle, 1969; Duval & Wicklund, 1972). 

In addition to situationally induced self- 
attention, there may also be individual differ- 
ences in the degree to which self-consciousness 
occurs. Argyle (1969) has theorized that per- 
sons differ in the extent to which the self- 
system becomes salient in the presence of 
others, and more recently, Fenigstein, Scheier,
and Buss (1975) constructed and extensively
administered a scale to measure self-consciousness,
defined as the enduring tendency of persons
to direct attention toward themselves.
Factor analyses of the scale consistently yield 
two stable self-consciousness dimensions: pub- 
lic and private. The private factor is defined 
by an awareness of one’s personal thoughts 
and feelings (e.g., “I’m always trying to fig- 
ure myself out”). The public subscale involves 
an awareness of the self as a social object, 
that is, an awareness that others are aware 
of the self (e.g., “I’m concerned about what 
other people think of me”). In addition, a 
social anxiety factor emerges. Social anxiety, 
which is defined as discomfort in the presence 
of others (e.g., “I get embarrassed very
easily”) may be seen as a reaction to the 
process of self-focused attention. Previous re- 
search involving several independent samples 
has established that the two self-consciousness 
factors are only weakly correlated (Fenig- 
stein et al., 1975; Scheier, 1976) and that the 
scale as a whole has considerable discriminant 
validity (Carver & Glass, 1976; Scheier & 
Carver, 1977). 


Public Self-Consciousness and Social Behavior 


The first study reported here had two purposes: One was to examine the effects
of public self-consciousness; a second aim was to extend the study of
self-focused attention into the area of social interaction. Although
self-awareness has been shown to have consequences for many realms of behavior,
virtually all of these behaviors have been nonsocial, involving little
interaction with others. For example, heightened self-awareness has been found
to facilitate task performance, increase the consistency between attitudes and
behavior, decrease self-esteem, increase self-attribution (see Wicklund, 1975),
and heighten the experience of emotion (Scheier & Carver, 1977); none of these
behaviors is interpersonal. Even in those studies which do involve social
behavior, there is seldom any ongoing, face-to-face interaction. In research
concerning the effects of self-awareness on conformity, the reference group's
opinion is simply represented on paper (e.g., [...], 1976); and in the
application of self-awareness theory to aggression, the aggressor remains
essentially isolated from the victim during the aggressive encounter (e.g.,
Scheier, Fenigstein, & Buss, 1974; Carver, 1974). It may be concluded that
little is known about the consequences of self-focused attention on
interpersonal behavior.

The present research focused on the role of
public self-consciousness in social interaction.
Publicly self-conscious persons are presumed
to be susceptible to feelings of being “ob-
served” when in the company of others
(Argyle & Williams, 1969; Fenigstein et al.,
1975). Believing that others are preoccupied
with their appearance and behavior, these per-
sons [...]


SELF-ATTENTION AND SOCIAL INTERACTION 


are easily recognized. At the high end is the 
recently stigmatized person who, almost by 
definition, is an object of attention and is
sensitive to the concern, disgust, or pity that
is elicited from others. At the opposite end is 
the totally unself-conscious social boor who 
not only lacks any conception of how he or 
she appears to others but could not care less. 
These speculations about the behavior of 
the publicly self-conscious person were tested 
using a relatively ubiquitous situation in so- 
cial interaction: the experience of being 
shunned by or excluded from a group. Rejec- 
tion by others may be extremely aversive. 
As an implicit negative evaluation, rejection 
has been shown to produce feelings of anxiety, 
embarrassment, and worthlessness (Geller, 
Goodstein, Silver, & Sternberg, 1974) and to 
decrease attraction toward the group (Dittes 
& Kelley, 1956; Mettee, Fisher, & Taylor, 
1971). However, other studies have found 
that rejection does not alter liking for the 
group (Snoek, 1962) and may even increase 
one’s evaluation of the others (Jackson & 
Saltzstein, 1959). Apparently, some persons 
are capable of shrugging off social rejection 
and seem oblivious to its effects, whereas 
others are very hurt by such experiences. 
One factor that may help to account for 
these differences in reactions to rejection is 
the degree of awareness of the public or so-
cial self (i.e., the degree to which persons
recognize and are concerned about the way 
they are perceived by others). For persons 
low in public self-consciousness, there is little 
awareness of how others regard them. Atten-
tion is not directed toward the self (presum-
ably, it is consumed by the group activity or
other external events), and the group’s behav-
ior in relation to the self is not relevant. Thus,
the experience of being excluded from the 
interaction is not very salient. But highly 
self-conscious persons are sensitive to the is- 
sue of how they are regarded by others. For 
them, the group’s behavior is perceived as
having personal relevance, and the experience
of rejection should have a clear impact on
subsequent behavior.


[...] author have found a correlation of .24 (N = 380)
between public self-consciousness and self-monitor-
ing; similar results were obtained by Scheier and
Carver (1977). Thus, it is unlikely that the effects
of public self-consciousness can be accounted for by
self-monitoring.

Public self-consciousness should also affect 
causal explanations of the rejection. A person 
may make either internal attributions and as- 
sume personal responsibility for the rejection 
or external attributions and assume that the 
cause of the rejection lies with the other per- 
sons or within the situation (Pepitone & Wil- 
pizeski, 1960). Much of the recent attribution 
literature has shown that causality for an 
event is likely to be attributed to those sa- 
lient entities that attract attention and can 
reasonably be seen as a cause (e.g., Duval & 
Wicklund, 1973; Pryor & Kriss, 1977; Storms, 
1973; Taylor & Fiske, 1975). Given that the 
social self is salient to publicly self-conscious 
persons and a primary target of their atten- 
tion, it follows that high publicly self-con- 
scious persons are more likely than low pub- 
licly self-conscious persons to perceive them- 
selves as the cause of the rejection. 

In a recent study, Buss and Scheier (1976) 
found that private self-consciousness increased 
the amount of causality attributed to the self, 
but public self-consciousness had no effect on 
self-attributions. Those authors reasoned that 
attention to one’s own private thoughts and 
feelings is an important determinant of self- 
attributions, whereas focus on the self as a 
social object is irrelevant to such attributions. 
However, their research involved the use of 
hypothetical situations into which subjects 
imaginally projected themselves, and under 
these circumstances, it is reasonable that only 
private self-consciousness affected causal at- 
tributions. The present experiment involves 
a real social interaction, with attributions 
being made about behaviors that have signifi- 
cant social consequences, and in this case, 
public self-consciousness should play an im- 
portant role. 

In summary, the first study hypothesized 
that persons high in public self-consciousness 
would be more sensitive and react more nega- 
tively to rejection by a group, and would hold 
themselves responsible for the rejection to a 
greater degree, than those low in public self- 
consciousness. The major dependent variables 
were attraction toward and desire to affiliate 
with the group, and causal attributions for
the group’s behavior. 


Experiment 1 
Method 


The study was a 2 X 2 factorial design with one
manipulated variable (rejection vs. acceptance) 
and one subject variable (high vs. low public self- 
consciousness). 


Subjects 


Several weeks prior to the experimental manipula-
tion, during a period designated for this purpose,
all students from courses in introductory psychol-
ogy filled out a number of questionnaires. The cru-
cial items consisted of the Self-Consciousness Scale
(Fenigstein et al., 1975) and three generalized self-
esteem items (e.g., “In general, I feel very good
about myself”).² Measures of self-esteem were in-
cluded because previous research (Dittes, 1959;
Jacobs, Berscheid, & Walster, 1971; Jones, Knurek,
& Regan, 1973) suggested that it may affect reactions
to rejection.


[...] were selected. A total of 92 women participated in
the experiment. Women were used because [...]
studies indicated that, as a group, they were more
sensitive to rejection than men. Twelve subjects were
dropped: eight because of suspicion expressed dur-
ing the postexperimental interview and four because
of failure to follow instructions. Dropouts were un-
related to experimental conditions. This left 80 sub-
jects, 20 in each condition.


Procedure 


All the experimental manipulations occurred while
the subject was waiting to participate in the supposed
“real” experiment. The subject arrived first at the
experimental room, which was bare except for a
table and chairs, and was told that several other
subjects were expected and that the experiment
would begin when they arrived. All subsequent par-
ticipants were female confederates of the experi-
menter.

Before entering the “waiting room,” subjects
were instructed to leave their belongings outside the
room so that the experiment could begin promptly
as soon as everyone had arrived. (This was actually
done to prevent subjects from directing their at-
tention and behavior toward easily available and
irrelevant objects such as books or purses.)

Upon entering the room, the subject was subtly
directed toward one end of a rectangular table that
had one side against the wall and a chair at each
end. About one minute later, the first confederate
arrived and was given the same introduction as
the subject. The experimenter brought another chair
into the experimental room, placed it halfway be-
tween the two ends of the table [...]. The
first confederate took this seat and remained quiet.
If the subject attempted to initiate conversation,
the confederate made as brief and neutral a response
as possible (e.g., “yes” [...]).

Rejection. Approximately [...] minutes later, the second
confederate arrived [...]. After she took the remaining [...]
opposite the subject and the experimenter had [...],
both confederates remained quiet for about [...]
(to establish the lack of any prior relationship be-
tween them). The second confederate then began
a conversation with the first confederate, [...].

The first confederate played the role of a
freshman who was living in one of the
dorms, and the second confederate [...]
a somewhat more [...]
who was living off campus. The conversation even-
tually turned to the [...] of [...]
living. A [...] incident concerning [...]
who was actually a
narcotics informer served as the focal point of
discussion. The younger confederate took the
conservative position, defending the school [...]


[...] remark such as “I see.” Eye contact with the subject
was minimal. The impression created by the con-
federates was not dislike or hostility but a
lack of interest.

Acceptance. Subjects in this group were encour-
aged to participate in the conversation. If the sub-
ject did not do so spontaneously, her opinions were
solicited by the second confederate. Any remark
uttered by the subject was attended to with [...]

2 All items are available from the author.


Self-esteem did not correlate with the pub-
lic self-consciousness variable, nor did it af-
fect any of the dependent variables; thus,
it was not included in the analyses. Modest 
correlations did exist between public self-con-
sciousness and measures of private self-con-
sciousness (r = .17) and social anxiety (r =
.24). To measure the independent effects of
public self-consciousness, 2 (high vs. low
public self-consciousness) X 2 (rejection vs.
acceptance) analyses of covariance were 
used, with private self-consciousness and so- 
cial anxiety as the covariates. 
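
The error degrees of freedom attached to the F ratios in this experiment can be checked against the design. The following is a minimal sketch, assuming the conventional rule that a between-subjects analysis of covariance has N minus the number of cells minus the number of covariates as its error degrees of freedom; the function name `ancova_error_df` is illustrative and is not part of the original analysis.

```python
def ancova_error_df(n_subjects: int, n_cells: int, n_covariates: int = 0) -> int:
    """Error degrees of freedom for a between-subjects factorial AN(C)OVA."""
    df = n_subjects - n_cells - n_covariates
    if df <= 0:
        raise ValueError("not enough subjects for this design")
    return df


# Experiment 1: 80 subjects, 2 x 2 design (4 cells), two covariates
# (private self-consciousness and social anxiety).
print(ancova_error_df(80, 4, 2))  # 74
```

With 80 subjects this gives 74, the denominator degrees of freedom in the F(1, 74) values reported for Experiment 1. Under the same rule, a 2 X 2 design with 48 subjects and no covariates would leave 44.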


Manipulation Checks 


Group’s behavior. Subjects were asked to
rate on a 7-point scale how favorably the
others, as a group, behaved toward them. Re-
jected subjects rated the group’s behavior as
less favorable (M = 3.8) than subjects who
were not rejected (M = 5.4). This difference
was significant, F(1, 74) = 66.4, p < .001,
thus establishing the adequacy of the rejec-
tion manipulation. No other effects were sig-
nificant. Additional support for the success
of the rejection manipulation is provided be-
low in other analyses.


conscious persons lies in the amount of atten-
tion they focus on themselves during a social
interaction. An attempt was made to assess
this difference directly by asking subjects to
indicate the relative percentage of time that
they spent thinking about themselves during
the experiment. Results showed that high pub-
licly self-conscious subjects reported spending
more time attending to themselves than low
publicly self-conscious subjects: 24% vs. 9%,
on the average, a highly significant differ-
ence, F(1, 74) = 24.9, p < .001. The data
also suggest that the experience of rejection
by others may focus attention on oneself:
There was a slight tendency for rejected sub-
jects to engage in more self-attention (M =
18.5%) than accepted subjects (M = 15%),
but this difference was only marginally significant,
F(1, 74) = 2.6, p < .1.


Table 1

Number of Subjects Choosing to Affiliate with
the Same Group as a Function of Treatment by
the Group and Subjects’ Level of Public
Self-Consciousness

                        Treatment
Self-Consciousness   Acceptance   Rejection
Low                      14           10
High                     15            3

Note. n = 20 subjects per cell.


Table 2

Mean Liking for the Group as a Function of
Group Treatment and Subjects’ Level of Public
Self-Consciousness

                        Treatment
Self-Consciousness   Acceptance   Rejection
Low                     [...]        [...]
High                    [...]        [...]

Note. Ratings were on a 7-point scale on which 1
corresponded to “did not like the group at all” and
7 to “liked the group very much.”


Dependent Variables 


Affiliation. Subjects were asked whether
they wanted to continue to work with the
same group or with a different group in sub-
sequent parts of the study. Table 1 provides
a frequency analysis of subjects’ choices and
shows that rejected subjects were less willing
to continue to affiliate with the same group
than nonrejected subjects, χ²(1) = 12.8, p <
.001. Within the acceptance condition, public
self-consciousness had no effect on affiliative
behavior. But following rejection, high pub-
licly self-conscious subjects chose to affiliate
with the same group far less frequently than
low self-conscious subjects, χ²(1) = 5.6, p <
.02.
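
Both chi-square values can be recomputed from the counts in Table 1. The sketch below assumes the reported statistics are uncorrected Pearson chi-squares for 2 x 2 tables (an inference from the fact that this formula reproduces the printed values, not something the text states); the function and variable names are illustrative.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Table 1: number of subjects (out of 20 per cell) choosing the same group.
stay = {("low", "accept"): 14, ("low", "reject"): 10,
        ("high", "accept"): 15, ("high", "reject"): 3}
CELL_N = 20

# Accepted vs. rejected subjects, collapsing over self-consciousness.
acc_stay = stay[("low", "accept")] + stay[("high", "accept")]  # 29 of 40
rej_stay = stay[("low", "reject")] + stay[("high", "reject")]  # 13 of 40
overall = chi_square_2x2(acc_stay, 2 * CELL_N - acc_stay,
                         rej_stay, 2 * CELL_N - rej_stay)

# Low vs. high public self-consciousness within the rejection condition.
low, high = stay[("low", "reject")], stay[("high", "reject")]
within_rejection = chi_square_2x2(low, CELL_N - low, high, CELL_N - high)

print(round(overall, 1), round(within_rejection, 1))  # 12.8 5.6
```

The first value compares 29 of 40 accepted subjects with 13 of 40 rejected subjects; the second compares 10 of 20 low with 3 of 20 high publicly self-conscious subjects after rejection.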

Attraction. Liking for the group, indicated
on a 7-point scale, was considerably less in
the rejection condition than in the acceptance
condition (see Table 2). Within the rejection
condition, public self-consciousness had a
clear effect on liking: High self-conscious
subjects were less attracted toward the group
than low self-conscious subjects. The signifi-
cance of these effects was revealed in a 2 X 2
covariance analysis that yielded a main effect
for rejection, F(1, 74) = 43.2, p < .001, and
a main effect for public self-consciousness,
F(1, 74) = 4.0, p < .05, due primarily to the
differences between the rejected subjects. This
result was confirmed by a significant Public
Self-Consciousness X Rejection interaction,
F(1, 74) = 5.1, p < .05; planned contrasts in-
dicated that public self-consciousness had no
effect in the acceptance condition, but in the
rejection condition, high self-conscious sub-
jects reacted more negatively toward the
group than low self-conscious subjects, t(38)
= 2.97, p < .004.


Table 3

Mean Percentage of Self-Attributions for the
Group’s Behavior as a Function of Treatment
by the Group and Subjects’ Level of Public
Self-Consciousness

                        Treatment
Self-Consciousness   Acceptance   Rejection
Low                     38.7          3.2
High                    55.7         49.7

Causal attributions. Subjects were asked to 
distribute responsibility for the way the group 
behaved toward them to either themselves,
the others as a group, or the situation; the 
amount of responsibility to be attributed 
totalled 100%. Table 3 shows that the high, 
compared to the low, publicly self-conscious 
subjects made more causal attributions to 
themselves for the group’s behavior across 
both rejection and control conditions. This 
finding was confirmed by a 2 X 2 covariance 
analysis that showed a main effect for self- 
consciousness, F(1, 74) = 9.6, p < .01. Causal 
attributions were also affected by social anx- 
iety, F(1, 74) = 8.3, p < .01; those who be- 
came anxious in the presence of the group 
were more likely to feel responsible for the 
group’s behavior than low anxious subjects. 
No other effects were significant. 


Discussion 
Experimental Findings 


Rejection. Rejection by others, not sur- 
prisingly, had a strong effect on 
behavior. Rejected subjects perceived the 
group as less cooperative, liked the group 
less, and were less likely to continue affilia-
tion with the group than accepted subjects.
In addition, rejected subjects were somewhat
more likely to focus attention on themselves 
during the interaction than nonrejected sub- 
jects. 

For ethical reasons, rejection was manipu-
lated in a deliberately passive manner: Other
group members failed to show an interest in
the subject. The obtained results suggest that
interpersonal rejection can be effectively ma-
nipulated and studied in the laboratory with-
out the use of any explicit derogation or
hostility toward the target.

Public self-consciousness. The study showed
that individual differences in public self-con-
sciousness had strong and consistent effects
on reactions to an interpersonal [...].
After being shunned by a peer group, high
publicly self-conscious women were less at-
tracted to the group and were less likely to
continue affiliating with that group than
women low in public self-consciousness. Pre-
sumably, persons high in public self-conscious-
ness were more aware of how they were perceived
by others. Thus, they were more sensitive to
the shunning and were more disposed to [...]
the people who ignored them.

It was assumed that private self-conscious-
ness would have no effect on reactions to re-
jection. The focus of the experimental manipula-
tion was on how subjects were perceived by
others, that is, on their social self, and pri-
vate, covert aspects of the self were presum-
ably less salient. Covariance analyses con-
firmed that the role of private self-conscious-
ness was negligible in this situation. Thus, the
present study provides further evidence of the
discriminant validity of the two self-conscious-
ness dimensions (see also Scheier & Carver,
1977).

It is important to note that public self-con-
sciousness had no effect on perceptions of the
group’s behavior. The “rejection group” was
seen as equally unfriendly by both low and
high self-conscious persons. Their [...]
reactions were presumably a result of [...]
of the group’s behavior. That self-conscious-
ness affects one’s interpretation of an [...]
is suggested by the finding that publicly self-
conscious persons tended to see themselves as
being largely responsible for the group’s be-
havior. Both Duval and Wicklund (1973) and
Taylor and Fiske (1975) postulate [...]
of causal attributions, that is, causes are of-
ten attributed to the most salient source of
information, and salience is highly dependent
on attention. Thus, the high publicly self-
conscious person, because she or he directs
attention to herself or himself (even while 
observing others), is likely to perceive a 
causal relationship between the self and the 
behavior of others. Events perceived as un- 
related to the person have little impact on that 
person. However, if those same events are be- 
lieved to be personally relevant, they should 
have a strong effect on behavior (Jones & 
Davis, 1965). The publicly self-conscious per- 
son presumably sees the rejection as being 
uniquely focused on or caused by her- or 
himself, and this self-blame for the aversive 
event may add to the negativity of the reac-
tion (e.g., Abramson & Sackeim, 1977;
Kuiper, 1978). 


Acceptance 


It may be argued that just as self-focused 
attention during an unpleasant interaction in- 
creases one’s negative response, self-focused 
attention should also heighten the positivity 
of a positive interaction. Scheier and Carver 
(1977), for example, found that self-aware- 
ness increased responsivity to both positive 
and negative affective states in a nonsocial 
context. Although the acceptance condition of
the present study may clearly be considered 
a pleasant encounter, reactions to this event 
were not altered by public self-consciousness. 
One possible explanation may be that persons 
high in public self-consciousness are especially 
sensitive to rejection but are not responsive 
to acceptance. 

A related explanation for the failure of 
public self-consciousness to affect reactions to 
acceptance lies in the nature of the acceptance 
manipulation. Acceptance is neither an unus-
ual nor an impactful occurrence, and it is
unlikely that acceptance generates as much
affect as rejection. As William James (1890)
noted in his discussion of the social self, we
all have the need to be noticed favorably, and
there is “no more fiendish punishment” than
to be ignored and have others act as if we 
were nonexistent. It may simply be that re- 
jection and acceptance are not comparably 
balanced instances of positive and negative 
social interactions. Pilot observations and
postexperimental interviews offer some sup- 
port for this position. Rejected subjects dis- 
played and reported a great deal of emotional 
discomfort, but very little emotion was shown 
by accepted subjects. In addition, accepted 
subjects were actively engaged in conversa- 
tion: listening to and processing information, 
formulating and expressing opinions. To the 
extent that these behaviors were occurring, 
self-attention would be reduced and its effects 
mitigated (Duval & Wicklund, 1972). Re- 
jected subjects, on the other hand, were much 
less involved in the ongoing interaction, and 
under these conditions, the effects of self-con- 
sciousness may have been considerably 
stronger.


Experiment 2


Experiment 2 was designed to provide a 
more unequivocal test of the effects of self- 
focused attention on both positive and nega- 
tive interpersonal feedback. It was hypothe- 
sized that under conditions of increased self- 
attention, when persons become more aware 
of and concerned with self-evaluation, they 
also become more closely attuned to the self- 
relevant evaluations of others. Under these 
conditions, interpersonal feedback should be 
more salient than when attention is directed 
away from the self and self-evaluation is of 
little concern.³ An attempt was made in this
study to control (a) the involvement of sub-
jects in the interaction and (b) the relative
valences of the positive and negative feed-
back. Also, in order to enhance the plausibility
and generalizability of a self-attention inter-
pretation, self-awareness was experimentally
manipulated in this study through the pres-
ence or absence of a mirror. Mirrors have
been used extensively to induce self-awareness
(e.g., Duval & Wicklund, 1973; Scheier et
al., 1974), and there is now strong evidence
that mirrors provide a valid means of increas-
ing the occurrence of self-relevant thoughts
(Carver & Scheier, 1978; Geller & Shaver,
1976).


3 Theories of deindividuation lead to similar pre-
dictions. For example, Zimbardo (1969) argues that
under conditions of deindividuation, a state in which
one’s sense of identity is lost, there is a decreased
concern with self-evaluation or evaluation by others.
Although it is tempting to argue that deindividua-
tion represents the extreme low end of the self-
awareness dimension, there is little research on the
issue, and the question remains unresolved.

Evaluative descriptions of the subject were 
provided in the context of an interview dur- 
ing which active participation by the subject 
was minimal. Subjects were presented with [...]


controlled so as to insure that positive and
negative evaluations were of equal magnitude.
Half the subjects were confronted with an un-
avoidable mirror image throughout the inter-
view [...]

Subjects 


A total of 52 female undergraduates, all having
siblings, were recruited from courses in introductory
psychology. Four subjects were [...]


subjects were randomly assigned [...] experi-
mental conditions: 10 each in the two positive feed-
back conditions and 14 in each of the two negative
feedback conditions. (As explained below, these ns
were unequal because [...]
could not be determined until after [...]
was over.)


Procedure 
Subjects were brought into the 


recent experimental 


favorable to firstborns, 
the subject would be in the positive feedback con-
dition if she were a [...]


eye contact with the subject four [...]


Results and Discussion
Affective Responses 


The affective responses toward the inter-
viewer and the interview were highly [...]
[...] positive feelings than negative feedback, F(1,
44) = 6.10, p < .02, and there was no main
effect for the mirror. The interaction was
highly significant, F(1, 44) = 7.73, p < .01,
indicating that the effects of the verbal evalu- 
ations were strongly influenced by the pres- 
ence or absence of a mirror. Planned contrasts 
revealed that when the interview was critical, 
subjects responded more negatively in the 
presence of the mirror than in its absence, 
t(26) = 2.98, p < .005; following positive 
feedback, there was a nonsignificant tendency 
toward greater positive affect in the mirror 
condition. 


Other Interview Measures 


Concerning the ratings for the interviewer’s 
competence, sincerity, and poise, and for how 
personal the situation was, there were no sig- 
nificant effects. From these results, it can be 
inferred that the interviewer’s presentation 
was consistent across conditions, and the ex- 
perimental manipulations had no effect on 
the subject’s objective appraisal of the inter- 
viewer’s capabilities. Finally, there were no 
significant differences between the ratings of 
firstborns and those of later-borns. 


Anxiety 


Subjects in the negative feedback group 
(M = 3.0) were more anxious than the posi- 
tive feedback subjects (M = 2.0), F(1, 44) 
= 6.40, p < .02. The mirror by itself did not 
evoke anxiety, and the interaction was not 
significant. 


Table 4 

Mean Overall Affective Response Toward the 
Interview and Interviewer for Each Treatment 
Group of Experiment 2 


              Positive      Negative
Stimulus      evaluation    evaluation
No mirror        5.1           5.2
Mirror           5.6           4.1


Note. Ratings were on a 7-point scale on which 1 
corresponded to a highly unfavorable response and 
7 to a highly favorable one. 
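
The interaction reported for the affective responses is a difference of differences among the four cell means of Table 4. The following is a minimal sketch, assuming the mirror/negative-evaluation cell reads 4.1 (the 7-point scale rules out a larger value) and working only from the printed means, so the unequal cell sizes play no role; the names `means` and `mirror_effect` are illustrative.

```python
# Cell means from Table 4 (1 = highly unfavorable, 7 = highly favorable).
means = {
    ("no_mirror", "positive"): 5.1,
    ("no_mirror", "negative"): 5.2,
    ("mirror", "positive"): 5.6,
    ("mirror", "negative"): 4.1,  # assumed reading of the printed value
}


def mirror_effect(feedback: str) -> float:
    """Mean response with the mirror minus mean response without it."""
    return means[("mirror", feedback)] - means[("no_mirror", feedback)]


# Interaction contrast: how much more the mirror shifts responses after
# positive feedback than after negative feedback.
interaction = mirror_effect("positive") - mirror_effect("negative")

print(f"{mirror_effect('positive'):+.1f} "
      f"{mirror_effect('negative'):+.1f} {interaction:+.1f}")  # +0.5 -1.1 +1.6
```

The mirror raises the mean response by about half a scale point after positive feedback and lowers it by about one point after negative feedback, which matches the direction of the planned contrasts described in the text.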


Eye Contact


The interviewer reported that subjects re- 
ciprocated over 92% of her gazes across all 
conditions. It could be argued that the mirror 
merely provided an easily available distrac- 
tion to avoid a gaze-averting interview. How- 
ever, these data suggest that the interviewer’s 
eye contact was consistently met and cast 
serious doubt over any attempt to explain the 
effects of the mirror as due to its “distraction” 
qualities. 


Discussion 


It was hypothesized that evaluations from 
others would have a greater effect on subjects 
in the presence of a mirror than in its ab- 
sence. These predictions were partially con- 
firmed: The mirror substantially enhanced 
negative reactions to negative interpersonal 
feedback; the effect for positive feedback was 
in the expected direction but was not signifi- 
cant. Overall, the results suggest that self- 
attention intensifies the process of self-evalua- 
tion, and when self-awareness is low, the 
importance of self-relevant information from 
others is reduced. 

It may be argued that the mere absence of 
a mirror is not sufficient reason to assume a 
low state of self-awareness. But, as in the 
present experiment, if the mirror’s absence is 
combined not only with an interviewer who 
establishes a minimal amount of eye contact 
but also with a highly impersonal interview 
about birth order (in general, not specifically 
the subject’s own), then there may be suffi- 
cient justification to argue for considerably 
diminished self-awareness in the no-mirror 
condition. 


General Discussion 
Empirical Findings 


It has been widely assumed by self-con-
sciousness theorists (e.g., Argyle, 1969; Goff-
man, 1959; James, 1890) that when attention
is directed toward the self, there is an in- 
creased concern with how one is perceived 
by others; and when self-attention is low, 
feedback from others is of little importance. 
The present research provides confirmation
of these hypotheses. The results of both stud- 
ies together demonstrate that when persons 
become more aware of the self during social 
interactions, because of either chronic dis-
positions toward self-consciousness or the re-
flection of their images in a mirror, there is
an increased responsiveness to the evaluations 
of others (especially in the case of unpleasant 
feedback). 

Although the present research has been in- 
terpreted in terms of self-attention processes, 
it may be argued that both public self-con- 
sciousness and the mirror produce an increase 
in arousal and a facilitation of dominant re-
sponses (e.g., Liebling, Seiler, & Shaver,
1974). In the present context, dislike and 
liking could be construed as the dominant 
responses to disparaging and complimentary 
feedback, respectively. This alternative ex- 
planation of the results is weakened by several 
considerations. The mirror in the second study 
had no consistent effect on subjects’ self-re- 
ported anxiety. The lack of an association 
between the mirror and arousal is corrobor- 
ated by physiological research showing that 
mirrors do not increase palmar sweating 
(Paulus, Annis, & Risner, Note 1). In the 
first study, covariance analyses established 
that the effects of public self-consciousness 
were independent of the effects of social anx- 
iety. Thus, although high publicly self-con- 
scious subjects may be more anxious in the 
presence of others than subjects low in public 
self-consciousness, this anxiety apparently 
does not account for their reactions to rejec- 
tion. In addition, Carver and Glass (1976) 
found public self-consciousness to be uncor- 
related with test anxiety and only weakly 
correlated with emotionality. There appears to 
be little evidence that arousal played a major 
role in either study. On the other hand, the 
fact that both a stable self-consciousness at- 
tribute (in Experiment 1) and a self-reflective 
stimulus (in Experiment 2) produced similar 


results lends support to an attentional ex- 
planation. 


Direction of Causality 


The convergence between the findings of the 
two studies also allows a question of causal 
[...] would then follow that the salience of nega-
tive events such as rejection or criticism is
heightened more by self-awareness than the 
salience of positive events such as acceptance 
or praise. 


Self-Awareness Theory


Self-awareness has been construed primarily 
as a process that regulates the direction and 
intensity of thoughts, feelings, and actions 
(Duval & Wicklund, 1972). Much of the re- 
search in this area has been interested either 
in the relationship between behavior and 
standards of behavior (e.g., Carver, 1974;
Scheier et al., 1974) or in the effects of self-
focused attention on internal states (e.g.,
Scheier & Carver, 1977) and cognitive phe-
nomena (e.g., Carver & Scheier, 1978). In
the context of this previous research, it has 
been assumed that self-focusing stimuli such 
as mirrors direct attention toward covert, in- 
ternal aspects of the self. Similarly, interest in 
personality dimensions of self-consciousness 
has concentrated primarily on the private 
component (e.g., Scheier, 1976). 

The study of self-awareness, then, seems 
to have been largely concerned with what 
James (1890) called the “spiritual” self, 
which refers to our inner being, our cognitive 
faculties, our emotional states, and our be-
havioral tendencies. But the self involves
many facets, and a particularly important
constituent is the “social” self (James, 1890),
which is concerned with the recognition or re-
gard we get from others. The present research
suggests that when we attend to ourselves in
social situations, it is the social self that is 
focused on; that is, we become more aware 
of ourselves as objects of attention to others. 
In this respect, it has been shown that public 
self-consciousness (awareness of ourselves as 
social objects), rather than private self-con- 
sciousness (awareness of our internal selves), 
is the crucial determinant of how much the 
evaluations of others affect us. In addition, 
the parallel between the effects of public self- 
consciousness and the mirror manipulation 
suggests that in social situations, mirrors di- 
rect attention toward the public, external 
aspect of the self. Thus, the consequences of 
self-attention are not limited to a greater 
awareness of our own standards, emotions,
and cognitions; self-attention also affects our 
awareness of the self-relevant actions, 
thoughts, and feelings of others. Self-aware- 
ness theory needs to acknowledge the exist- 
ence of both the private, internal self and 
the public, social self if it is to develop a 
fuller understanding of our unique ability to 
be conscious of ourselves. 


Conclusion 


The present findings indicate that self-at- 
tention is an important mediator of social 
behavior. By raising or lowering attention to 
oneself, one can heighten or diminish the ef- 
fect of others’ evaluations, perhaps through the 
facilitation of selective attention processes 
(Duval & Wicklund, 1972). Indeed, our own 
personal experience would suggest that the 
reduction of self-attention through, for ex- 
ample, avoiding eye contact (Fenigstein, Note 
2) or engaging in random physical activity 
(Duval & Wicklund, 1973) may be an effec- 
tive strategy for warding off the impact of 
criticism, derogation, or embarrassment. Fur- 
ther research confirming these inferences 
would suggest that the control and modifica- 
tion of self-attention may have implications 
for the way in which persons deal with the 
social feedback of everyday life. 


A field experiment focused on some implications of interpersonal touch not ex- 
plored in earlier research. Conceptually, the research included measuring the 
effects of touch over a relatively long time frame, for a broad range of response 
dimensions, and in a nonreactive setting characterized by dependency. On an ap- 
plied level, the research studied the value of touch as a concomitant of nurse- 
patient interactions. Specifically, a 2 (touch vs. no touch) X 2 (male vs. female) 
between-subjects design assessed the effects of nurses touching patients, during 
preoperative teaching, on patient affective, evaluative, behavioral, and physio- 
logical responses. Results indicated that female subjects in the touch condition 
experienced more favorable affective, behavioral, and physiological reactions 
than a no-touch control group. In contrast, males in the touch condition reacted 
more negatively than control subjects on these dimensions. 


Background 


The various modes of nonverbal communi- 
cation (e.g., proxemics, eye contact) have 
received increased attention from social psy- 
chologists in recent years. There have been 
several theoretical (e.g., Argyle & Dean, 1965; 
Patterson, 1976) and numerous empirical 
(e.g., Buck, Miller, & Caul, 1974) attempts 
to understand the way nonverbal messages 
are interpreted and to predict the quality of 
subsequent reactions. 

Surprisingly, although touch is considered 
by many to be the most powerful of the non- 
verbal modalities, it has traditionally re- 
ceived the least research attention (Duncan, 
1969). Further, what data are available on 
tactile stimulation have generally focused on 
the responses of animals and human infants 
to touch. Within the animal literature, studies 
have revealed that early tactile contact can 
influence later emotional and physiological 
behavior (Denenberg, 1963; Harlow, 1971; 
Levine, 1960). With human infants, the clas- 
sic work of Spitz (1946) and more recent 
research by Montagu (1971) highlight the 
importance of touch as a stimulus essential for 
normal intellectual, emotional, and social de- 
velopment. 

Only in the last few years have researchers 
begun to explore the parameters of touch 
with human adults. Further, the limited work 
with this population suggests that the effects 
of touch may be either positive or negative. 
For example, in research demonstrating posi- 
tive consequences, Kleinke (1977) reported 
that touch led to greater compliance than in a 
control group, and Fisher, Rytting, and Hes- 
lin (1976) observed that females who were 
touched experienced more positive affective 
and evaluative reactions than no-touch con- 
trols. In research suggesting negative effects, 
Walker (1971) reported that communication 
by means of touch made subjects feel anxious 
and uncomfortable, and Henley (1973) ob- 
served that touch may be perceived as ex- 
ploitative and/or as highlighting the lower 
status of the recipient. Finally, it has been 
shown that the same touch may be experi- 
enced positively by one sex and negatively 
by the other (Fisher et al., 1976; Nguyen, 
Heslin, & Nguyen, 1975). 

A review of past research with adult pop- 
ulations suggests that whether touch is ex- 
perienced positively or negatively depends 
on the meaning and evaluation inferred by 
the recipient (Fisher et al., 1976). Clearly, 
touching a person can mean many things, for 
example, a desire for intimacy, sexual attrac- 
tion, or dominance. Further, any of these 
messages could be expected to elicit negative 
reactions if it oversteps the boundaries a per- 
son has defined as appropriate. In line with 
these observations, and supported by con- 
ceptual work on nonverbal communication 
(Altman, 1975; Argyle & Dean, 1965; Pat- 
terson, 1976), Fisher et al. (1976) suggested 
that a touch will be experienced as positive 
to the extent that it (a) is appropriate to 
the situation, (b) does not impose a greater 
level of intimacy than the recipient desires, 
or (c) does not communicate a negative mes- 
sage (e.g., is not perceived as condescend- 
ing). These assertions have also been sup- 
ported in empirical work (e.g., Hall, 1966; 
Henley, 1973; Nguyen et al., 1975). 


Problems With the Extant Literature on 
Touch 


Although the present literature on touch 
in human adults affords some organizing no- 
tions for predicting responses and suggests 
that reactions may vary widely, it is limited 
in important ways. For example, the range of 
situations examined has been decidedly nar- 
row (e.g., touch has rarely been studied in 
situations characterized by dependency, an- 
ger, or sexual intimacy; Henley, 1977). Also, 
very few studies have been conducted in a 
nonreactive context (Fisher et al., 1976), 
which is a serious limitation, since touch is 
associated with strong socialization-related 
prohibitions (Heslin, Note 1). In addition, 
past research has measured only reactions 
that occur very soon after tactile stimula- 
tion; the longer-term consequences of touch 
have yet to be explored. Further, the re- 
sponses assessed have generally been [...] 
to the affective and evaluative domains; be- 
havioral and physiological reactions to touch 
have been measured infrequently, and no re- 
search has provided a simultaneous assess- 
ment of all four response [...]. Finally, 
most studies on the effects of touch [...] 
empirical demonstrations [...] 
been conducted either to test [...] con- 
ceptual perspective or in [...] 
findings could have direct [...] signifi- 
cance. 


The Present Experiment and Its Implications 

The present experiment was designed to 
yield initial data pertinent to some of the 
weaknesses of past research described above. 
Specifically, the study explored the effects of 
nurse-patient touch in a hospital setting. On 
a conceptual level, this research adds 
to the extant literature in the following ways. 
First, nurse-patient interactions are 
characterized by dependency (Barnett, 1972), 
a context variable ignored in past research on 
touch (Henley, 1977). In addition, the effects 
of touch were simultaneously assessed on af- 
fective, evaluative, behavioral, and physio- 
logical dimensions, and over a relatively long 
period of time. Finally, the [...] of the 
touch stimulus and of the [...] mea- 
sures rendered the study relatively unobtru- 
sive, so that it would be [...] 
problems associated with reactivity. 

In addition to its conceptual implications, 
the experimental study of touch in a hospital 
setting may have practical value. [...] 
has directly measured the [...] conse- 
quences of tactile stimulation, the [...] 
of touch is suggested by work in related [...]. 
Specifically, it has been found that hospital- 
ization is frequently an anxiety-[...], deper- 
sonalized situation (Barnett, 1972); that ap- 
propriate supportive care (of which touch 
may constitute an element) [...] 
anxiety (Johnson & Leventhal, [...]); and 
most importantly, that the level of [...] 
anxiety may be related to postoperative re- 
covery (Johnson, Leventhal, & [...]). 


Hence, by addressing the effects of touch in 
the present setting, this study may disclose 
some initial answers to an important applied 
problem. 


Predicting the Effects of Touch in a 
Hospital Context 


In predicting the effects of touch in a con- 
text characterized by dependency, some im- 
portant factors should be taken into account. 
Clearly, in such a setting touch may convey 
a mixture of positive (e.g., caring) and nega- 
tive (e.g., power, dominance) elements (Hen- 
ley, 1977; Heslin, Note 1). This interpretive 
ambiguity is exacerbated, because in a physi- 
cal sense, touches that express negative as- 
pects of dependency (e.g., inferiority) and 
those that express positive aspects of caring 
and concern are quite similar (Henley, 1977; 
Heslin, Note 1). Thus, it is not surprising 
that competing rationales exist for predicting 
which of these two messages will be most 
salient. 

One prediction stems from research on 
sex differences in socialization. Specifically, 
it is assumed that males are socialized to be 
more uncomfortable with dependency than 
females (Hoffman, 1972; Maccoby, 1966; 
Stein & Bailey, 1973). Further, since touches 
signaling dependency and those expressing 
caring may be physically similar, it is be- 
lieved that socialization may importantly 
affect their interpretation. Thus, females may 
interpret touch as a primarily [...]. Alterna- 
tively, males may interpret 
touch as conveying a message of relative in- 
feriority and dependency, which is neither 
situationally appropriate nor of a comfortable 
level of intimacy. In effect, it is suggested 
that touch in a dependency context is more 
sex role-appropriate for females than for 
males and should therefore lead to more posi- 
tive female than male responses across mea- 
surement indexes (e.g., affect, evaluation). 
While sex role-inappropriate touch has been 
shown to lead to negative effects in other con- 
texts (e.g., romantic encounters; Nguyen et 
al., 1975), it has not yet been explored in 
dependency situations. 

An opposing prediction centers on the spe- 
cial status that may be accorded to depen- 
dency that results from illness. Specifically, 
when one is a patient, being dependent may 
be socially acceptable for both sexes. In line 
with this interpretation, it would be expected 
that both males and females would interpret 
nurse-patient touch as an expression of car- 
ing that is situationally appropriate and of 
a reasonable level of intimacy (cf. Fisher et 
al., 1976). If this hypothesis were supported, 
the positive reaction of males and females to 
touch should generalize across the various 
response dimensions. 


Method 


Subjects 


Subjects were patients at a major university hos- 
pital complex. Forty-eight individuals (19 males, 29 
females) who entered the hospital for elective sur- 
gery participated in the study; anonymity was 
maintained through the use of code numbers. The 
prerequisites for inclusion were (a) an adequate 
reading and speaking knowledge of English, (b) 
signing a document upon admission to the hospital 
that permitted participation in research, (c) as- 
signment to the seventh floor of the hospital, and 
(d) assignment to a primary nurse who had agreed 
to participate in the study. All individuals who 
met these criteria were included in the experiment. 


Design 


A 2 (touch vs. no touch) X 2 (male vs. female) 
between-subjects design was employed, in which pa- 
tients were randomly assigned to conditions. For 
ethical reasons, and to permit a comparison with 
“routine procedures,” individuals in control groups 
were not deprived of any aspect of normal hospital 
care and were exposed to customary professional/ 
functional tactile contact with nurses. In experi- 


1 The number of patients who refused to sign the 
informed-consent document averaged 10%, with no 
differences between the experimental cells. 

2 All nursing personnel scheduled to work on the 
seventh floor volunteered to take part in the study 
and were blind to the experimental hypotheses. Fur- 
ther, it should be noted that primary nursing care, a 
system in which a patient is assigned to one nurse 
who coordinates care throughout the hospital stay, is 
used at the facility where the study was run. 


Experimental Procedures 


If a patient met the prerequisites for participa- 
tion, the floor clerk attached a 
“work sheet” to his or her medical chart when he or she 
arrived at the seventh floor. The work sheet indicated 
the patient’s treatment condition, which had been 
randomly sequenced by the experimenter at 
the beginning of the experiment. When the nurse 
examined the patient’s chart, the work sheet in- 
formed her if the patient was participating in the 
study and what condition he or she was in. 
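
The random sequencing of treatment conditions described above can be illustrated with a short sketch. The report does not say how the sequence was generated or whether it was balanced, so the balanced shuffle, the function name make_condition_sequence, and the seed below are assumptions made only for illustration.

```python
import random
from collections import Counter


def make_condition_sequence(n_patients, conditions=("touch", "no touch"), seed=None):
    """Return a randomly ordered, near-balanced list of condition labels.

    Hypothetical helper: the original procedure is not documented in detail.
    """
    rng = random.Random(seed)
    # Repeat the labels until there are enough entries, then trim to size.
    repeats = -(-n_patients // len(conditions))  # ceiling division
    sequence = (list(conditions) * repeats)[:n_patients]
    # Shuffle so that the order of conditions cannot be anticipated.
    rng.shuffle(sequence)
    return sequence


# Example: one work-sheet entry per admitted patient (48 in this study).
sequence = make_condition_sequence(48, seed=1979)
print(Counter(sequence))
```

Preparing the sequence before the study begins, as the text describes, keeps the assignment independent of anything the clerk or nurse knows about a given patient.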

The experimental manipulation took place [...] 
after admission. Initially, the nurse [...] 
that separated the patient from any other indi- 
viduals in the room. Next, she touched the patient’s 
hand for a few seconds while [...] 
and explaining that she was going to inform the pa- 
tient about his or her surgery. Toward the end of 
the teaching session, the nurse gave the patient a 
booklet that further detailed the sur- 
gery, to be read at his or her leisure. At [...] 

[...] arm and [...] 


Upon completing the teaching session, the primary 
nurse left the room. Several hours later, the patient 
was approached by a male experimenter, blind to the 
experimental hypotheses, who explained that the 
university was conducting a study of patient reac- 
tions [...] that the patient fill out a [...] 


[...] who react 
favorably and those who react unfavorably [...] 
hospital experience, [...] 


[...] intercorrelations above .40 [...] 
concerning hospitalization and surgery. Specifically, 
the items measured how anxious patients felt about 
having to undergo surgery, how unpleasant it 
[...] them to be in the hospital, and how 
[...] were [...] 
preoperative instruction and [...] 


Measure of satisfaction with preoperative instruc- 
tion. This index consisted of two items, which 
asked patients how satisfied they were with the in- 
formation they received concerning surgery and how 
satisfied they were with the nursing care provided 
up to that point. The items showed an intercorrela- 
tion of .36 and were arranged as 7-point semantic 
differential scales with endpoints of “not at all” 
and “extremely.” 

Measure of attraction. Three items comprised a 
measure of the degree of patient liking for the 
nurse. Specifically, they asked patients to indicate 
[...] how in- 
terested they felt the nurse was in their questions 
and feelings about surgery, and how friendly they 
felt toward the nurse. The items showed intercor- 
relations above [...], and each was presented as a 
semantic differential scale with endpoints of 
“not at all” and “extremely.” 


Behavioral Measures 


It was as- 
sumed that touch might lead to differential patient 
reading of the booklet left by the nurse during pre- 
operative teaching. Therefore, a 4-point Likert-type 
scale assessed how much of the booklet the patient 
had read. This measure ranged from “none at 
all” to “the entire booklet thoroughly.” 


[...] preoperative instruction. Further, the [...] 
trained the staff in the execution of the touch. [...] 
nurses role played the manipulation with each other before 
the experiment began to insure that it would 
be administered in a uniform fashion. In addition, 
they were provided with a typewritten set of in- 
structions, which they reviewed periodically. 

Measure of reciprocation of touch. At the end 
of the preoperative instruction, the nurse extended 
her hand in a gesture that allowed for reciprocal 
intimacy and then observed the patient’s reaction. 
The response was coded in terms of whether the 
patient (a) grasped or touched the nurse’s hand, 
(b) reached toward but did not touch the nurse’s 
hand, (c) looked at but did not reach toward the 
nurse’s hand, or (d) ignored the nurse’s gesture. 
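
The four response codes above lend themselves to a condition-by-code contingency table, the form analyzed with chi-square tests in the Results. The sketch below shows one way such a table and the Pearson statistic could be computed; the code labels, the function names, and the handful of observations are hypothetical and are not data from the study.

```python
from collections import Counter

# Short labels for response codes (a) through (d); the labels are assumptions.
CODES = ("touched", "reached", "looked", "ignored")


def contingency_table(observations, conditions=("touch", "no touch")):
    """Count (condition, code) pairs into a conditions x codes table."""
    counts = Counter(observations)
    return [[counts[(cond, code)] for code in CODES] for cond in conditions]


def pearson_chi_square(table):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # skip cells in empty rows or columns
                statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical observations, for illustration only.
observations = [
    ("touch", "touched"), ("touch", "touched"), ("touch", "reached"),
    ("no touch", "looked"), ("no touch", "ignored"), ("no touch", "touched"),
]
table = contingency_table(observations)
print(table, pearson_chi_square(table))
```

With two conditions and four codes, the table is 2 x 4 and the test has 3 degrees of freedom, which matches the chi-square tests reported for this measure.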


Physiological Measures 


The physiological data recorded for patients in- 
cluded measures taken directly after surgery while 
the patient was in the recovery room and periodic 
vital signs readings collected by the staff during the 
first several days of hospitalization. Specifically, the 
recovery room data consisted of five repeated mea- 
sures of patient pulse and systolic and diastolic 
blood pressure, recorded at 15-minute intervals. The 
periodic vital signs data consisted of seven repeated 
measures of pulse, temperature, and systolic and 
diastolic blood pressure, collected approximately 
every 4 hours after the preoperative teaching. Past 
research has shown that these indexes can be con- 
sidered to discriminate between anxious and non- 
anxious patients (e.g., Martin, 1961). All readings of 
blood pressure were recorded by a Baum portable 
mercury-meter sphygmomanometer with a brachial 
artery cuff. Measures of heart rate were assessed by 
a radial pulse, and temperature data were recorded 
with a standard mercury thermometer.* 


Results 


Equivalence of Groups 


Preliminary 2 (touch vs. no touch) X 2 
(male vs. female) analyses of variance 
(ANOVAs) and chi-square tests were run to 
insure that randomization had resulted in 
equivalence of patients in all cells. Since the 
ANOVAs revealed no differences between groups 
in terms of physiological indexes or age at 
hospital entry (Fs < 1), and the chi-square 
tests indicated no differences in type (e.g., 
orthopedic, eye, thoracic) or severity (major 
vs. minor) of surgery (χ² < 1), each of the 
groups was assumed to be equivalent. Thus, 
the dependent variables were subject to fur- 


ther analysis. 


Measure of Affect 


A 2 (touch vs. no touch) X 2 (male vs. 
female) multivariate ANOVA was performed 
on the three questionnaire items pertaining to 
affect concerning surgery.* This procedure 
revealed a significant main effect for touch, 
Λ(1, 3) = 3.09, p < .037, which was qualified 
by a significant Touch X Sex interaction, 
Λ(1, 3) = 4.84, p < .006. Parallel univariate 
analyses run on the three affect items indi- 
cated a similar pattern of findings. 

Specifically, the univariate analyses re- 
vealed a significant main effect for touch on 
the item reflecting worry about complica- 
tions, F(1, 43) = 9.27, p < .004, and a mar- 
ginally significant main effect for touch on 
the item that measured unpleasantness of 
hospitalization, F(1, 43) = 2.82, p< .10. 
These findings indicated that patients who 
were touched during preoperative instruction 
experienced more positive reactions than in- 
dividuals not touched. Further, a significant 
univariate Touch X Sex interaction was ob- 
served for ratings of anxiety concerning sur- 
gery, F(1, 43) = 15.01, p < .001, and trends 
toward significant Touch X Sex interactions 
were observed for the measures of unpleas- 
antness of hospitalization (p< .13) and 
worry about complications (p< .16); see 
Table 1. For the measure of anxiety con- 
cerning surgery, Newman-Keuls comparisons 
indicated that females who were touched re- 
ported less anxiety than control females (p 
< .05), whereas males in the touch condition 
reported more anxiety than control males 
(p < .05). Also, females were more anxious 
than males in control conditions (p < .05), 
whereas males were more anxious than fe- 
males in touch conditions (p< .05). The 
overall pattern of the nonsignificant interac- 
tions paralleled that observed for the mea- 
sure of anxiety concerning surgery. 


4 All experimental procedures were approved by 
the hospital ethics committee composed of doctors, 
lay people, and clergy. 

5 Fmax tests indicated that homogeneity of vari- 
ance was maintained for all of the dependent mea- 
sures. Since there were minor discrepancies in the n 
in various cells, a least squares ANOVA was used (cf. 
Overall & Spiegel, 1969; Winer, 1971). Finally, the 
Wilks’s Lambda criterion (Finn, 1974) was used as 
a significance criterion for all the multivariate tests. 


Table 1 
Cell Means for Touch X Sex Interactions 


i 4 5.9, 

Anxiety over surgery 6.2, IA Lo ` 

Unpleasantness of hospitalization 3.2, 34 1.0, s i, 

Worry concerning complications 3.0, 2.9, | 6, n 3 ls 

Recovery room systolic pressure 144.6, 121.8, 126 ? I iS 0, 

Recovery room diastolic pressure 96,2, 78.3, 24 83.1, 
Note. Means with common subscripts do not differ at the .05 level, as indicated by the Newman-Keuls 
procedure. For the measures of anxiety over surgery, unpleasantness of hospitalization, and worry about 
complications, scores can range from 1 (low) to 7 (high). 


Evaluative Measures 


Measure of Satisfaction With Preoperative 
Instruction 


A 2X2 multivariate ANOVA and parallel 
univariate ANOVAs performed on the two 
questionnaire items assessing patient satis- 
faction with preoperative teaching were not 
significant. However, responses on these mea- 
sures indicated a ceiling effect, that is, pa- 
tients in all conditions felt extremely satisfied 


with the information and nursing care they 
received. 


Measure of Attraction 


A 2X2 multivariate ANOVA and parallel 
univariate ANOVAs performed on the three 
questionnaire items that assessed patient lik- 
ing for the nurse revealed no significant ef- 
fects. However, when the data for males and 
females were analyzed separately, some sug- 
gestive findings emerged. For females, the 
multivariate ANOVA indicated a trend toward 
a main effect for touch, Λ(1, 3) = 1.64, p < 
.20. This finding was paralleled by a uni- 
variate effect for touch on the item measur- 
ing perceived nurse interest, F(1, 26) = 5.10, 
p < .03, which suggests that females per- 
ceived nurses who touched to be more in- 
terested in them than nurses who did not 
touch. For the measures of perceived nurse 
warmth and friendliness, the separate male- 
female univariate analyses revealed no sig- 
nificant effects; however, responses on these 
items indicated a ceiling effect. 


Behavioral Measures 
Measure of Preoperative Material Read 


Patient reading of the preoperative teach- 
ing booklet was measured on a 4-point scale. 
The overall 2 X 4 chi-square analysis was 
significant, χ²(3) = 9.10, p < .03, and sug- 
gested that patients in the touch condition 
read more of the booklet than controls. Sepa- 
rate 2 X 4 chi-square tests revealed that for 
females, the effect of touch paralleled the 
overall chi-square analysis, χ²(3) = [...], p 
< .01. However, for males the separate analy- 
sis was nonsignificant, χ²(3) = 1.78, ns, 
indicating that it was the female data that 
were primarily responsible for the overall 
effect. 
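
A reported chi-square statistic with 3 degrees of freedom can be converted to a probability with a closed-form expression, which makes it easy to check the probability levels quoted above. The sketch below uses only the Python standard library; the function name chi_square_p_df3 is an assumption, and the formula applies to 3 degrees of freedom only.

```python
import math


def chi_square_p_df3(statistic):
    """Upper-tail probability for a chi-square variate with 3 degrees of freedom.

    Uses the closed form Q(x) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2),
    which is valid only when df = 3.
    """
    if statistic <= 0:
        return 1.0
    half = statistic / 2.0
    tail = math.erfc(math.sqrt(half))
    tail += math.sqrt(2.0 * statistic / math.pi) * math.exp(-half)
    return tail


# Overall 2 X 4 analysis of booklet reading reported in the text.
p_overall = chi_square_p_df3(9.10)
print(f"chi-square(3) = 9.10 gives p = {p_overall:.3f}")
```

For other degrees of freedom a general routine would be needed, for example the chi-square survival function in a statistics library.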


Measure of Reciprocation of Touch 


Patient responses to the nurse’s out- 
stretched hand were coded on a 4-point scale 
reflecting the subject’s desire for reciprocal 
intimacy. The overall 2 X 4 chi-square anal- 
ysis was marginally significant, with patients 
in the touch condition more frequently reach- 
ing out and touching the nurse’s hand rela- 
tive to control patients, χ²(3) = 6.97, p < 
.07. Separate 2 X 4 chi-squares for males and 
females revealed that while the pattern ap- 
proached significance for females, χ²(3) = 
7.39, p < .06, it was not significant for males 
(χ² < 1). Here again, it was the female re- 
sponse that determined the overall effect. 


Physiological Measures 
Recovery Room Measures 


A 2 (touch vs. no touch) X 2 (male vs. fe- 
male subject) X 5 (time) ANOVA was per- 
formed for each of the physiological measures 
taken in the recovery room. 

Systolic blood pressure. The ANOVA per- 
formed on the recovery room index of systolic 
blood pressure revealed a significant Touch 
X Sex interaction, F(1, 31) = 4.15, p < .05 
(see Table 1). The pattern of the means in- 
dicates that at all five measurement intervals, 
touch females tended to have lower systolic 
pressure than control females, whereas touch 
males tended to have higher readings than 
male controls. However, Newman-Keuls anal- 
yses revealed that none of these differences 
were significant. The overall ANOVA further 
revealed a significant effect for time, F(1, 128) 
= 2.85, p < .03, which indicated that patients 
in both conditions demonstrated a significant 
decline in systolic pressure over the period of 
measurement. 

Diastolic blood pressure. The ANOVA per- 
formed on the five measures of diastolic 
blood pressure showed a significant Touch x 
Time interaction, F(1, 128) = 2.47, p < .05, 
as well as a marginally significant Touch x 
Sex interaction, F(1, 31) =3.26, p < .08 
(see Table 1). Taken together, the interac- 
tions suggest that touch males tended to in- 
crease in diastolic blood pressure relative to 
control males over the five intervals. In con- 
trast, touch females tended to decrease in 
diastolic pressure over time relative to con- 
trol females. However, Newman-Keuls com- 
parisons of the means did not reach a con- 
ventional level of significance for either in- 
teraction. 

Pulse. The repeated measures ANOVA on 
the five pulse readings indicated that only the 
effect for time was significant, F(1, 124) = 
4.51, p < .003, with patients demonstrating a 
decrease in pulse over the period of measure- 
ment. 


Periodic Vital Signs Measures 


In addition to the physiological data col- 
lected in the recovery room, periodic measures 
of pulse, temperature, and systolic and di- 
astolic blood pressure were recorded for pa- 
tients approximately every 4 hours after pre- 
operative teaching. A 2 (touch vs. no touch) 
X 2 (male vs. female subject) X 7 (time in- 
tervals following preoperative teaching) ANOVA 
was performed on these measures. 

Systolic blood pressure. The ANOVA on 
the periodic index of systolic pressure revealed 
a significant effect for sex of subject, F(1, 30) 
= 13.46, p < .002, with males demonstrating 
higher systolic pressure than females. Al- 
though the analysis failed to indicate a main 
effect for touch (p < .26), the direction of 
cell means suggested that patients in the touch 
condition (M = 122.73) tended to have lower 
systolic pressure over the seven measurement 
intervals than controls (M = 127.37). 

Diastolic blood pressure. The ANOVA on 
the periodic diastolic blood pressure readings 
revealed a significant main effect for sex 
of subject, F(1, 30) = 14.25, p < .001, with 
males demonstrating higher diastolic pressure 
at all measurement intervals than females. 
The data further indicated a trend toward 
significance for the effect of touch, F(1, 30) 
= 2.40, p < .13, with patients in the touch 
condition (M = 76.03) tending to have lower 
diastolic pressure over the seven measures 
than controls (M = 86.38). 

Pulse. The repeated measures ANOVA per- 
formed on the seven pulse readings revealed 
no significant effects (Fs < 1). 

Temperature. The repeated measures 
ANOVA performed on the periodic vital signs 
temperature data revealed a significant effect 
for time, F(1, 30) = 6.03, p < .001, with pa- 
tients showing an increase in temperature over 
the seven measurement intervals. This main 
effect was qualified by a significant Sex X 
Time interaction, F(1, 186) = 2.26, p < .04, 
which indicated that males’ temperatures 
tended to increase at a more rapid rate than 
females’. 


Discussion 


In an attempt to provide some conceptual 
and applied data not available from earlier 
work, the present study explored the effects 
of touch in a hospital setting. The findings 
on a variety of measures suggested that in 
this context, touch led to positive effects pri- 
marily for females. Specifically, on affective 
dimensions touch precipitated positive reac- 
tions for females, whereas the reverse was 
true for males. Further, the behavioral data 
indicated that touch led to reciprocal touch- 
ing and to more extensive reading of the pre- 
operative material for female but not for 
male subjects. Finally, several physiological 
indexes also tended to show this pattern of 
positive reactions only for females. 


Evaluation of the Support for the Sex 
Difference Hypothesis 


While there was a degree of inconsistency 
across the various dependent measures, the 
overall pattern of results supports the sex 
difference hypothesis, which predicts that 
when dependency cues are pervasive, males 
and females will react differently to touch. 
The rationale for this hypothesis is that 
males are socialized to be relatively uncom- 
fortable with dependency (Hoffman, 1972; 
Maccoby, 1966; Stein & Bailey, 1973) and 
thus experience nurse-patient touch as a 
threatening gesture communicating inferior- 
ity and asserting dominance, which leads to 
negative reactions. Conversely, females are 
socialized to be relatively affiliative and com- 
fortable with dependency (Hoffman, 1972; 
Maccoby, 1966; Stein & Bailey, 1973) and 
thus experience nurse-patient touch as a 
supportive gesture of caring and warmth, 
which leads to favorable reactions. Our sup- 
port for the sex difference hypothesis comple- 
ments earlier research on tactile stimulation. 
In contexts other than dependency, it has 
been found that when touch conveys a sex 
role-inappropriate message, negative effects 
occur (Nguyen et al., 1975; Heslin & Boss, 
Note 2). 

Although the present data were explained 
in terms of the sex difference hypothesis de- 
tailed above, several other possibilities are 
tenable. One centers on females’ richer and 
more varied tactile history. Specifically, fe- 
males are touched more often by a variety of 
others (mother, father, same/opposite-sex 
friends) than males (Jourard, 1966; Jourard 
& Rubin, 1968), and in other nonverbal 
modalities (e.g., personal space) they display 
more permeable boundaries. Further, females 
probably have more exposure to tactile stim- 
ulation in medical contexts than males (e.g., 
yearly gynecological exams, giving birth). 
Hence, touch during preoperative teaching 
may have been reassuring for females but 
disruptive for males in part because males 
have less experience with touch in general 
and, more particularly, in medical contexts. 

A second possibility stems from the fact 
that female patients interacted with a same- 
sex other, whereas males interacted with an 
opposite-sex individual. Since there was no 
way to separate out the possible confounding 
influence of sex of toucher from the effects of 
touch in the present setting, it is possible that 
being touched during preoperative teaching 
was a qualitatively different experience for 
males and females. For example, males could 
be more likely than females to interpret the 
present touch as sexual in nature. However, 
past research tends to argue against this pos- 
sibility. Specifically, a study by Nguyen et 
al. (1975) indicates that touches to the [...] 
and/or shoulder, whether applied by a same- 
or an opposite-sex other, are interpreted as 
nonsexual in nature. Further, an earlier field 
experiment that varied sex of toucher and 
employed a similar touch manipulation found 
no differential effects for sex of toucher 
(Fisher et al., 1976). Nevertheless, it is not 
possible to rule out the above explanation 
solely on the basis of this study. 


Consistency of the Data Across Dependent 
Measures 


While the present findings generally show 
a consistent pattern of effects for the affective 
and behavioral measures, the physiological 
measures were less consistent. Specifically, 
the measures taken in the recovery room in- 
dicated that touch females had lower systolic 
and diastolic blood pressure than controls, 
whereas similar measures taken in the pa- 
tient’s room showed only a trend toward an 
effect for touch. Also, indexes of pulse and 
temperature did not discriminate between 
experimental and control conditions. Inter- 
estingly, on some of these measures sizable 
mean differences were obtained, yet effects 
were not significant. If the high degree of 
variability associated with physiological mea- 
surement is taken into account (e.g., Martin, 
1961), these nonsignificant patterns become 
more impressive. Further, it should be noted 
that where sources of variability were mini-
mized (i.e., in the recovery room), stronger
effects were observed than under less con-
trolled conditions (i.e., in the patient’s room).

Although the affective, behavioral, and 
physiological measures were generally sensi- 
tive to the effects of touch, the evaluative 
items were not. We think that this result can 
be explained in terms of a ceiling effect, since 
the means show almost no differences and in- 
dicate that all patients felt very positively 
toward the nurse and were extremely satis- 
fied with their preoperative teaching. Consid- 
ering the very high level of health care at the 
hospital where the present study was run, a 
ceiling effect on these indexes should not 
have been unexpected. However, it would not 
be unreasonable to expect stronger effects on 
evaluative measures in settings where health 
care is much less personalized (e.g., emer-
gency rooms).


Conceptual Implications 


Moving to conceptual implications, the 
present data may be seen as adding to the 
extant literature. For example, past research 
tended to look at single rather than multiple 
responses to tactile stimulation. Based on 
this study, it appears that touch may have 
consequences for affective state, behavior, 
and physiological reactions and that these 
response modalities may be related (i.e., par-
allel effects may occur along all three dimen-
sions). Further, the data provide a first look
at some long-term consequences of touch,
suggesting that it may have powerful multi-
dimensional effects over time, in addition to
the short-term effects demonstrated in earlier
research (e.g., Fisher et al., 1976). It should
be noted, however, that touch in the present
study occurred in a context where it was par-
ticularly salient and meaningful. While touch
in other situations of this type might be ex-
pected to have long-term consequences, the
effects of incidental tactile stimulation (cf. 
Fisher et al., 1976) or touch that is common- 
place or role defined should be relatively 
short-term. 

It is of further conceptual interest that the 
present findings are in accord with Patter- 
son’s (1976) arousal model of interpersonal 
intimacy and can be interpreted in terms of 
that formulation. According to Patterson, 
changes in intimacy produce arousal change, 
which may be a signal to evaluate and in- 
terpret the environment. The change in arousal 
may be labeled as positive or negative, de- 
pending on the context in which it occurs 
(e.g., socialization-related beliefs concerning
touch and dependency). To the extent that a 
positive label is engendered, it is assumed 
that favorable reactions (e.g., positive affect, 
reciprocal intimacy) will occur. In the present 
situation for females, contextual factors may 
have permitted a positive label for decreased 
arousal, which resulted in favorable affective, 
physiological, and behavioral reactions. For 
males, the context may have produced a neg- 
ative labeling of increased arousal, which re- 
sulted in unfavorable responses on these di- 
mensions. It is suggested that the fit of Pat- 
terson’s model to the present data indicates 
that it may be useful as an interpretive 
framework for past research on touch and as 
a predictive framework for future work. 


Practical Implications 


Finally, the results of this study may have 
practical implications for patient care. Earlier 
research (e.g., Johnson et al., 1971) found 
that various conditions that facilitate low pre- 
operative anxiety improve patient recovery. 
In the present study, touch for females re- 
sulted in lower anxiety and more positive be- 
havior preoperatively, and these results were 
associated with more favorable postoperative 
physiological responses. This finding sug- 
gests that touch during preoperative teaching 
is a form of communication that may be 
beneficial to the well-being of certain patients.
Clearly, however, the negative reactions of
males in the present context prohibit any
unqualified statements concerning the practi-
cal implications of the data. Before applica-
tion will be possible, future research must
assess the consequences of touch across a
variety of medical settings and individual dif-
ference variables in order to delineate
precisely the domain in which touch has
beneficial effects.


Conclusions 


In conclusion, touch in a hospital setting 
tended to elicit positive reactions in females 
and more negative reactions in males. It was 
suggested, based on these findings, that male 
and female subjects attributed differential 
meaning to touch in a dependency context. 
Further, the findings demonstrated that touch 
may have long-term effects, that reactions to 
touch may be related across various response 
dimensions, and that tactile stimulation in a 
hospital setting may sometimes be of ther- 
apeutic value. 


Reference Notes 


1. Heslin, R. Attraction in the dyad. Unpublished 
manuscript, Purdue University, 1976. 

2. Heslin, R., & Boss, D. Nonverbal boundary be- 
havior at the airport. Unpublished manuscript, 
Purdue University, 1975.


Although the mere exposure effect has been researched widely, surprisingly little
is known about the attitudinal and cognitive effects of message repetition. It
was hypothesized that the sequence of topic-relevant thoughts generated in re-
sponse to a (repeated) persuasive message would parallel attitude change. To
test this prediction, two experiments were conducted. In Experiment 1, in-
dividuals heard a communication either zero (control), one, three, or five
times in succession, rated their agreement with the advocated position, and
listed the message arguments they could recall. In Experiment 2, individuals
heard a communication either one, three, or five times, rated their agreement,
listed their thoughts, and listed the message arguments they could recall. In both
experiments, agreement first increased, then decreased as exposure frequency
increased (regardless of the position advocated), but agreement was unrelated to
the recall of the message arguments. In Experiment 2, analyses of the listed
thoughts revealed that counterargumentation decreased, then increased, whereas
topic-irrelevant thinking increased as exposure frequency increased; as expected,
only topic-relevant thoughts were related to agreement. These results are in-
terpreted in terms of an attitude-modification model in which repetition and
content of a persuasive advocacy affect the type and number of thoughts gen-
erated; these thoughts, in turn, affect the attitudinal reaction to the advocacy.


In this article, we will consider the atti-
tudinal effects of repeated exposure to per-
suasive communications, an area that has
generated surprisingly little research by social
psychologists despite its frequent occurrence.
This area is not well understood (cf. Harrison,
1977), in part because most research has focused
on repetition in contexts that do not involve
communication. In Zajonc’s (1968) original
statement on the effects of repeated exposure,
evidence was provided for relationships be-
tween (a) the frequency of usage and evalua-
tion of words, (b) the frequency of inter-
personal contact and attraction, and (c) the
familiarity of aesthetic stimuli (e.g., musical
selections) and liking. Since then, immediate
effects of mere exposure have been observed


using children as well as adults (Heingartner 
& Hall, 1974), employing between-subjects 
(Moreland & Zajonc, 1976) as well as within- 
subjects (Crandall, 1972) designs, in field 
(Zajonc & Rajecki, 1969) as well as labora- 
tory (Matlin, 1970) settings, employing pic- 
torial magazine advertisements (McCullough 
& Ostrom, 1974) as well as nonsense syllables 
(Harrison & Hines, 1970) as stimuli, and 
using stimuli evaluated initially either posi- 
tively or negatively (Hamm, Baum, & Nikels, 
1975; Zajonc, Markus, & Wilson, 1974). 
However, conditions in which the exposure- 
liking relationship breaks down have also 
been identified. When the stimulus is simple, 
repetition leads to either a decrease in liking 
(Skaife, cited in Berlyne, 1971, pp. 191-194) 
or an initial increase, then decrease in liking 
(Saegert & Jellison, 1970; Smith & Dorfman, 
1975). The homogeneity of the stimulus se-
quence presented seems also to be an im-
portant factor. A stimulus that is presented
repeatedly within a homogeneous sequence
of stimuli results typically in first increasing, 
then decreasing liking of the stimulus (Har- 
rison & Crandall, 1972). Moreover, when 
subjects are able to expose themselves to 
stimuli, they first display exploratory behav- 
ior and then expose themselves repeatedly to 
a self-selected subset of best-liked stimuli 
(Brickman & D’Amato, 1975). And finally, 
when measurement is administered immedi- 
ately after the presentations of a stimulus, 
exposure effects may be attenuated or ob- 
literated (Johnson & Watkins, 1971; Stang, 
1974; Stang & O’Connell, 1974). 


Models of Exposure Frequency and Affect 


Several theoretical accounts have been of- 
fered for these results, including (a) response 
competition, (b) arousal theories, (c) classi- 
cal conditioning, (d) intuition as artifact, 
(e) expectancy arousal, (f) satiation/gen- 
eration, and (g) two-factor theories (see Har- 
rison, 1977, and Stang, 1973, for reviews). 
Of these interpretations, the two-factor the- 
ories provide the most flexible explanation 
because they “can account for any pattern of 
results (by drawing differentially on each 
factor)” (Harrison, 1977, p. 74). However, 
two-factor theories are not cogent unless the 
contribution of each factor can be described 
a priori. 

Berlyne (1970) proposed an inverted-U 
relationship between familiarity and liking. 
The notions are that (a) two separate and 
opposing psychological processes, positive ha- 
bituation (that is, a reduction in uncertainty 
or conflict) and tedium, operate simultane- 
ously; (b) the relative strengths of each vary 
as a function of exposure to the stimulus; and 
(c) initially, the process of habituation has 
greater impact on liking than does tedium. 
Thus, repeated exposure leads initially to lik- 
ing, but ultimately leads to disliking. Ac- 
cordingly, stimulus complexity and sequence 
heterogeneity slow the positive habituation 
process and extend the inflection point of the 
inverted-U curve to higher levels of exposure. 
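
The two-factor account just described can be made concrete with a small numerical sketch. The functional forms used here (an exponentially saturating habituation term and a linearly growing tedium term) and every parameter value are illustrative assumptions, not quantities taken from Berlyne (1970). The sketch shows only that, under these assumptions, slowing the habituation process moves the peak of the curve to a higher level of exposure.

```python
import math


def net_liking(exposures, habituation_scale, tedium_rate=0.03):
    """Hypothetical two-factor curve: positive habituation minus tedium.

    habituation_scale: assumed parameter; larger values mean slower
        habituation (e.g., a more complex or more heterogeneous stimulus).
    tedium_rate: assumed growth of tedium per exposure.
    """
    habituation = 1.0 - math.exp(-exposures / habituation_scale)
    tedium = tedium_rate * exposures
    return habituation - tedium


def peak_exposure(habituation_scale, max_exposures=50):
    """Exposure count in 0..max_exposures with the highest net liking."""
    return max(range(max_exposures + 1),
               key=lambda n: net_liking(n, habituation_scale))


simple_peak = peak_exposure(habituation_scale=2.0)    # fast habituation
complex_peak = peak_exposure(habituation_scale=6.0)   # slow habituation
print(simple_peak, complex_peak)
```

With these assumed values the curve for the slowly habituating stimulus peaks later than the curve for the quickly habituating one, which is the qualitative pattern the two-factor account predicts for complex or heterogeneous stimuli.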

Stang (1973, 1975) proposed an extension 
of Berlyne’s (1970) two-factor account: Re- 
peated exposure provides more opportunity to 


learn about the stimulus; this learning is 
presumably rewarding and leads to increased 
liking for the stimulus. With continued repeti-
tion of the stimulus, however, boredom or
satiation develops; hence, repeated exposure 
leads ultimately to negative affect toward the 
stimulus. 

Stang (1975) presented three experimental 
demonstrations that affect toward and learn-
ing of Turkish words and trigrams were af-
fected similarly by exposure frequency. Al-
though these results may indicate that learn-
ing mediated liking, there are three other
possibilities. First, liking may have mediated
learning rather than vice versa. Second, both
learning and affect may have been mediated
by a third variable. For instance, the cogni-
tive elaboration of stimuli results in both en-
hanced recall (Craik & Lockhart, 1972;
Craik & Tulving, 1975; Rogers, Kuiper, &
Kirker, 1977; Cacioppo & Petty, Note 1)
and polarized affect (Petty & Cacioppo, 1977;
Petty, Wells, & Brock, 1976; Tesser & Con-
lee, 1975; Tesser & Leone, 1977). Stimulus
learning, however, has not been related con-
sistently to affective reactions (cf. Greenwald,
1968; Petty, 1977). Nevertheless, repeated
exposure does provide the opportunity for
more elaborate processing of the stimuli
(Wyer, 1974, pp. 15-16). Thus, there is the
distinct possibility that cognitive elaboration
(or extent of semantic processing) mediates
both learning and affect.

This latter interpretation may appear at
odds with some of the existing research on
mere exposure. Specifically, preexposure lik-
ing for the stimulus does not necessarily affect
the direction or form of the exposure effect
(Zajonc et al., 1974). But in the domain of
persuasive communication, Petty et al. (1976)
demonstrated that it is not the preexposure
affect toward an advocacy but the nature of
the associates (i.e., cognitive responses) elic-
ited by the message that determines the atti-
tudinal effect of a communication. Indeed, a
slightly different form of their [...] has
been employed in past studies of mere expo-
sure as well (Brickman, Redfield, Harrison,
& Crandall, 1972; Mitchell & Olson, [...];
Pearlman & Oskamp, 1971). For example, re-
peated presentations of words that elicited
either positive or negative associations re-
sulted in a polarization of affect toward the
stimuli (Grush, 1976). Finally, the third pos-
sibility is that although learning and affect
show similar patterns, they are mediated in-
dependently.


Cognitive and Attitudinal Effects of Message 
Repetition: Two Studies 


The current experiments focus on the re- 
lationships among attitudinal, associative, and 
learning effects of message repetition. These 
experiments differ in several important re- 
spects from previous research on message 
repetition and attitudes. McCullough and 
Ostrom (1974) found an exposure effect when 
they presented messages repeatedly in a maga-
zine advertisement format to subjects. How-
ever, each presentation differed in the phras-
ing and ordering of the message arguments,
and a different photograph and headline ac- 
companied each presentation. Weiss (1971) 
presented the same argument repeatedly and 
found that exposure led to quicker ratings 
of agreement with the argument. However, to 
the extent that practiced responses are also 
quicker, Weiss’s results may be irrelevant to 
the study of message repetition and attitudes. 
Finally, Wilson and Miller (1968) and John-
son and Watkins (1971) found an attitudinal
effect for message repetition only on a delayed
posttest. The absence of the immediate atti-
tudinal effect is due presumably to satiation
(cf. Harrison, 1977; Sawyer, in press; Stang,
1974). Thus, if a wider variety of levels of
message repetition, more complex stimuli, or
a more heterogeneous stimulus sequence had 
been studied, immediate attitudinal effects 
might have been observed. In the present 
studies, Experiment 1 was primarily methodo- 
logical and exploratory in nature, serving as a 
test of the experimental materials and of the 
effects of the repetition of persuasive commu- 
nications on liking and learning. Experiment 
2 was designed to assess the viability of the 
various possible processes mediating the lik- 
ing and learning effects in repetition experi- 
ments using such stimuli. 

To explore more fully the predictive value 
of two-factor theories, experimental condi-
tions were designed to elicit first increasing,
then decreasing favorability toward the ad- 
vocacy. Mere exposure research indicates that 
the inflection of the inverted-U function be- 
tween exposure and affect occurs more quickly 
when stimuli are homogeneous (Harrison & 
Crandall, 1972) and simple (e.g., Smith &
Dorfman, 1975) and ratings are made imme- 
diately after their presentation (Stang, 1974). 
In the present experiments, one of two per- 
suasive communications was presented either 
one, three, or five times in succession, and 
the dependent measures were administered 
immediately afterwards. We expected to find 
that agreement with the advocacy would in- 
crease, then decrease as exposure frequency 
increased. 


Experiment 1 
Method 


Subjects and procedure. One hundred thirty-three 
introductory psychology students participated in an 
experiment employing a 2 X 4 factorial design in which 
the position advocated (proattitudinal versus counter- 
attitudinal) and the number of presentations (0, 1, 
3, or 5) served as between-subjects factors. Subjects 
were tested in groups of 12 to 29 in language-lab- 
oratory cubicles constructed so that no subject could 
have visual or verbal contact with any other sub- 
ject. During any one session, in which one level of 
repetition was presented, half of the subjects heard 
the proattitudinal message over the headphones, while 
half of the subjects heard the counterattitudinal 
message over the headphones. Following a procedure 
employed previously by Petty and Brock (1976), 
the same highly persuasive arguments, which were 
equally applicable to each advocacy, were used in 
each communication. Thus, the affective qualities of 
the elicited associates (Grush, 1976) or cognitive 
responses (Greenwald, 1968; Petty & Cacioppo, 1977) 
were presumably equated for the two communi- 
cations. 

Two groups (the zero-exposure pro- and counter- 
attitudinal conditions) rated their agreement with a 
recommendation that university expenditures be in- 
creased. One group was told that the expenditures 
were to be financed by instituting a 25% service 
tax on visitor luxuries (proattitudinal position), and 
one group was told that a $70-per-quarter increase in 
student tuition would be instituted to finance the 
expenditures (counterattitudinal position). Ratings 
were made on a 15-point Likert-type scale where 
15 indicated “agree completely” and 1 indicated 
“disagree completely.” 

Figure 1. The effects of preexposure position and
message repetition on agreement: Experiment 1.


Subjects in each of the remaining cells heard a
taped message through headphones either one, three,
or five consecutive times. The message contained a
statement that the university currently provided a
substandard level of education to the undergraduates 
and that this problem could be remedied by increas- 
ing university expenditures. Half the subjects heard 
a message that contained a paragraph about institut- 
ing a 25% service tax on visitor luxuries to finance 
the expenditures, and half heard a message that con- 
tained a paragraph about instituting a student tui- 
tion increase of $70 per quarter to finance the ex- 
penditures. All subjects then heard the following 
eight arguments in favor of increasing expenditures: 
(a) Teaching and educational materials for students 
would be improved; (b) classroom sizes would be 
reduced; (c) handouts containing the important 
points in lectures could be provided; (d-f) the 
library facilities, job placement, and placement in 
graduate and professional schools could be improved; 
(g) better teachers could be hired; and (h) improved 
counseling to students, on both academic and per- 
sonal matters, could be provided. 

A postexperimental questionnaire containing 15- 
point Likert-type scales was used to assess the ef- 
fects of the advocated position and message repeti- 
tion on agreement with increasing university ex- 
penditures and to assess the perceived amount of 
distraction, effort, and involvement in the task. A 
recall measure was obtained after completion of 
the questionnaire by asking subjects to list on the 
last page of their booklets all of the message argu- 
ments that were presented in the communication. A 
judge, blind to the experimental conditions, counted 
an item as recalled if it correctly summarized one 
of the eight message arguments listed above. After 
completing the dependent-variable booklets, subjects
were debriefed, thanked, and dismissed. 


Results and Discussion 


Attitude measure. The means for each cell
on the attitude measure are displayed in Fig-
ure 1. It was expected that (a) subjects would
agree more with the pro- than with the coun-
terattitudinal advocacy, and (b) repetition
would lead to increasing, then decreasing
agreement with the advocacy. The analyses
supported both hypotheses: Subjects agreed
more with the pro- (M = 9.05) than the
counterattitudinal message (M = 6.62), F(1,
125) = 14.87, p < .001; the number of pre-
sentations of the message affected agreement
(M0 = 5.58, M1 = 8.23, M3 = 9.83, M5 =
7.70), F(3, 125) = 7.74, p < .001; and the
interaction between position advocated and
number of presentations did not approach sig-
nificance (p > .25). Because the lack of an
interaction indicated that the position of the
advocacy (pro or counter) did not alter the
effect of message repetition on agreement,
trend tests were conducted on the data col-
lapsed across position. The trend analyses
were conducted according to Gaito’s (1965)
procedure for determining trend coefficients
for unequal spacing and unequal n. A signifi-
cant quadratic trend on the agreement mea-
sure, F(1, 125) = 25.45, p < .001, provided
evidence of a curvilinear effect of message
repetition on agreement (see Figure 1). Pair-
wise comparisons using the Newman-Keuls
procedure for unequal n (Winer, 1971, pp.
215-218) revealed that presentations of the
same persuasive message one, three, or five
times led to increasing, then decreasing agree-
ment (ps < .05).
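
The logic of a trend analysis with unequally spaced levels and unequal cell sizes can be illustrated with a short sketch. It derives linear and quadratic scores by weighted orthogonalization, which is one standard way to obtain such coefficients; it is not a reproduction of Gaito's (1965) own computations. The cell sizes are assumed for illustration, since they are not reported in the text, and the function names are hypothetical.

```python
def weighted_trend_scores(levels, cell_sizes, max_degree=2):
    """Orthogonal polynomial scores for unequally spaced levels and unequal n.

    Uses Gram-Schmidt with cell sizes as weights, so that the linear and
    quadratic scores are orthogonal to each other and to the constant.
    """
    def wdot(u, v):
        return sum(n * a * b for n, a, b in zip(cell_sizes, u, v))

    basis = [[1.0] * len(levels)]          # degree-0 (constant) term
    for degree in range(1, max_degree + 1):
        scores = [float(x) ** degree for x in levels]
        for b in basis:                    # remove lower-order components
            coef = wdot(scores, b) / wdot(b, b)
            scores = [s - coef * bi for s, bi in zip(scores, b)]
        basis.append(scores)
    return basis[1:]                       # drop the constant term


def trend_contrast(means, cell_sizes, scores):
    """Weighted contrast value for one trend component."""
    return sum(n * c * m for n, c, m in zip(cell_sizes, scores, means))


def trend_sum_of_squares(means, cell_sizes, scores):
    """Sum of squares (1 df) for one trend component."""
    contrast = trend_contrast(means, cell_sizes, scores)
    return contrast ** 2 / sum(n * c * c for n, c in zip(cell_sizes, scores))


levels = [0, 1, 3, 5]                      # number of presentations
means = [5.58, 8.23, 9.83, 7.70]           # agreement means from the text
cell_sizes = [33, 33, 34, 33]              # ASSUMED; not reported in the text

linear, quadratic = weighted_trend_scores(levels, cell_sizes)
ss_linear = trend_sum_of_squares(means, cell_sizes, linear)
ss_quadratic = trend_sum_of_squares(means, cell_sizes, quadratic)
print(round(ss_linear, 2), round(ss_quadratic, 2))
```

Dividing each sum of squares by the error mean square from the analysis of variance would give the F ratio for that component. The error term is not reported here, so the sketch stops at the sums of squares.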
Recall and ancillary measures. An analysis
of the number of message arguments recalled
revealed that message repetition, F(2, 79) =
12.04, p < .001, and message position, F(1,
79) = 14.27, p < .001, altered the incidental
learning of the message arguments. The first
main effect indicated that learning increased
with repetition. The mean number of argu-
ments recalled at one, three, and five presenta-
tions, respectively, was 4.08, 4.67, and [...].
The second main effect indicated that sub-
jects recalled more message arguments when
the position advocated was counterattitudinal
(M = 5.54) than when the position advocated
was proattitudinal (M = 4.29). Compared to
previous research on selective learning (cf.
Greenwald & Sakumura, 1967), the present
experiment differed in two respects:
(a) Subjects with initially similar attitudes
were assigned randomly to pro- and counter-
attitudinal conditions rather than selecting
subjects for participation on the basis of their
initially different attitudes; (b) the message
arguments to be learned by the different
groups of subjects were identical, though they
were used to support either a pro- or a coun-
terattitudinal advocacy. Thus, the differential
learning could not be attributed to the ease
with which the arguments could be learned.


In addition, a Message Position X Message 
Repetition interaction was found, F(2, 79) = 
5.10, p < .01. Specifically, the increase in re-
call was greatest between three and five repe-
titions of the proattitudinal advocacy (p <
.05) and between one and three repetitions
of the counterattitudinal advocacy (ns; all
comparisons made using the Newman-Keuls
procedure for unequal n).

The first experiment provided strong sup- 
port for the notion that a persuasive appeal 
presented repeatedly in close temporal proxim-
ity leads to increasing, then decreasing ac-
ceptance of the advocacy. However, the find-
ing that more message arguments were re-
called when the advocacy was disliked
initially (i.e., counterattitudinal) than when
it was liked initially (i.e., proattitudinal) sug-
gests that liking does not necessarily result
in greater learning. The results of Experiment
1 are also at odds with the learning-leads-to-
liking hypothesis. This is especially dramatic
for the proattitudinal advocacy: The greatest
increase in affect (between one and three
repetitions) was associated with no change in
learning, whereas the greatest learning (be-
tween three and five repetitions) was asso-
ciated with a decrease in affect. A within-cell
correlation (Insko, Lind, & La Tour, 1976,
p. 69) between agreement and recall indicated
that these variables were not related signifi-
cantly (r = -.02).
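
A within-cell correlation of the kind reported above can be sketched as follows. The sketch assumes the usual pooled definition, in which each score is expressed as a deviation from its own cell mean so that differences between experimental cells do not contribute to the coefficient. The function name and the example scores are made up for illustration and are not data from the experiment.

```python
import math


def within_cell_correlation(cells):
    """Pooled within-cell correlation.

    cells: list of cells, each a list of (x, y) pairs for the subjects in
    that cell. Deviations are taken from each cell's own means before the
    sums of squares and cross-products are pooled.
    """
    sxx = syy = sxy = 0.0
    for cell in cells:
        mean_x = sum(x for x, _ in cell) / len(cell)
        mean_y = sum(y for _, y in cell) / len(cell)
        for x, y in cell:
            dx, dy = x - mean_x, y - mean_y
            sxx += dx * dx
            syy += dy * dy
            sxy += dx * dy
    return sxy / math.sqrt(sxx * syy)


# Made-up (agreement, recall) scores for two cells, for illustration only.
example_cells = [
    [(9, 4), (11, 5), (10, 3)],
    [(6, 6), (8, 5), (7, 7)],
]
r_within = within_cell_correlation(example_cells)
print(r_within)
```

Because cell means are removed first, two measures can differ sharply between conditions and still show no relation within conditions, which is the pattern described for agreement and recall.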

Analyses of the ancillary measures (effort, 
involvement, and distraction) revealed that 
they were not affected by any of the manip- 
ulations. 


Experiment 2 


A second experiment was conducted to in-
vestigate other possible mediators of the ex-
posure effect observed in the preceding ex-
periment. Of particular interest in Experiment
2 was the possibility that a recipient’s cog- 
nitive responses would be influenced by mes- 
sage repetition and would predict more ac- 
curately the subsequent attitudinal reactions 
than would learning. Indeed, evidence already 
exists for just such a hypothesis. Grush (1976) 
investigated the effects of repeatedly present- 
ing infrequently used words that elicited either 
positive or negative associations. He found 
that words that elicited positive associations 
were evaluated more positively after repeated 
exposures, but that the opposite effect oc- 
curred for words that elicited negative asso- 
ciations. Grush proposed an attitude-forma- 
tion explanation of his results. The notion is 
that a stimulus initially elicits only a few 
cognitive responses (associations). With re- 
peated exposure, subjects generate increased 
(or increasingly consistent) cognitive re-
sponses to the stimulus (thus, it is similar in
some respects to the response competition in- 
terpretation). The final evaluation of an ex- 
posed stimulus is assumed to be a function of 
the summed evaluations of the cognitive re- 
sponses it elicits. If the responses are generally 
favorable, increased exposure should lead to 
a more positive evaluation of the stimulus, but 
if the responses are generally negative, in-
creased exposure should lead to a less favora-
ble evaluation of the stimulus.
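
The summation idea in this explanation can be written out as a minimal sketch. It assumes, purely for illustration, that each exposure adds a fixed number of cognitive responses and that every response carries the same evaluation; neither assumption comes from Grush (1976), and the function names are hypothetical.

```python
def summed_evaluation(response_values):
    """Evaluation of a stimulus as the sum of the evaluations of the
    cognitive responses it elicits (positive values are favorable)."""
    return sum(response_values)


def evaluation_by_exposure(value_per_response, responses_per_exposure, exposures):
    """Evaluation after 1..exposures presentations, ASSUMING each exposure
    adds a fixed number of responses that all share one evaluation."""
    return [
        summed_evaluation([value_per_response] * (responses_per_exposure * n))
        for n in range(1, exposures + 1)
    ]


positive_word = evaluation_by_exposure(+1, responses_per_exposure=2, exposures=5)
negative_word = evaluation_by_exposure(-1, responses_per_exposure=2, exposures=5)
print(positive_word)
print(negative_word)
```

Under these assumptions a stimulus with favorable associations is evaluated more positively with each exposure and a stimulus with unfavorable associations more negatively, which is the polarization pattern described above.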

Miller (1976) investigated the effects of 
mere exposure using communicative stimuli 
(posters containing a political message) and 
found that moderate but not high levels of 
exposure led to attitude change (increased 
agreement). Miller argued that the high ex- 
posure loads caused individuals to feel that 
their personal freedom was being restricted, 
and reactance (Brehm, 1966) led to resistance 
to persuasion. It is equally possible, on the 
other hand, that an unpleasant state of satia-
tion (e.g., boredom) developed at the high
exposure levels. These models suggest that
the cognitive responses (i.e., counterargu-
ments, favorable thoughts, neutral/irrelevant
thoughts) elicited in a repetition experiment 
may mediate the affective reactions to the 
stimulus. However, no study to date that has 
employed repetitions of the same persuasive
communications has investigated the nature
or the temporal sequence of the cognitive re- 
sponses elicited. Thus, in Experiment 2 sub- 
jects heard the previously tested pro- or 
counterattitudinal advocacy either one, three, 
or five times in succession. Afterwards, sub- 
jects (a) were instructed to list everything 
about which they had thought during the pre- 
ceding minutes, (b) rated their attitude to- 
ward the advocacy and completed ancillary 
measures, and (c) responded to a measure of 
the incidental learning of message arguments. 


Method 


One hundred ninety-three introductory psychology 
students participated in an experiment employing 
a 2 X 3 factorial design in which the position ad-
vocated (pro- versus counterattitudinal) and the
number of presentations (1, 3, or 5) served as be- 
tween-subjects factors.1 As in Experiment 1, subjects 
were tested in groups of 12 to 29 in cubicles in a 
language laboratory. During any one session, in which 
one level of repetition was presented, half the sub- 
jects heard the proattitudinal and half the subjects 
heard the counterattitudinal message through head- 
phones. Subjects rated their agreement with increas- 
ing university expenditures on a 15-point Likert-type 
scale and completed the postexperimental question- 
naire. In a procedure adapted from Brock (1967) 
and Greenwald (1968) and employed previously by 
Petty and Cacioppo (1977), subjects were given 3 
minutes to list the actual thoughts that occurred to 
them during the presentations of the communica- 
tion they had heard. Subjects were then asked to 
go back through the thoughts they had listed and 
to place a plus (+) next to each thought that sup-
ported the advocacy, a minus (-) next to each
thought that attacked the advocacy, and a zero (0) 
next to each thought that was neutral or irrelevant 
to the advocacy. Subjects were then asked to list 
all of the message arguments they could recall. (Re- 
call was scored in the same manner as in Experi- 
ment 1.) 

Two judges who were blind to the experimental 
conditions rated the cognitive responses. Counted 
as unfavorable thoughts (i.e., counterarguments)
were statements directed against the advocated posi- 
tion that mentioned specific unfavorable conse- 
quences, statements of alternative methods of raising 
money, challenges to the validity of arguments in 
the message, and statements of affect opposing the 
advocated position. Counted as favorable thoughts 
were statements in favor of the advocated position 
that mentioned specific favorable consequences, state- 
ments ruling out alternatives, statements that sup- 
ported the validity of the message arguments, and 
statements of affect supporting the advocated posi- 
tion. All other listed items were rated as neutral/ 
irrelevant thoughts. Similar items were counted as one
thought. Judges agreed on over 95% of the ratings;
disagreements were resolved through discussion.
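
The scoring of the thought listings can be illustrated with a brief sketch that tallies the three categories and computes simple percentage agreement between two judges. The codes shown are hypothetical, the function names are introduced only for this example, and percentage agreement is assumed to be the index behind the figure reported above.

```python
from collections import Counter


def tally_thoughts(codes):
    """Count favorable ('+'), unfavorable ('-'), and neutral/irrelevant ('0') thoughts."""
    counts = Counter(codes)
    return {
        "favorable": counts["+"],
        "counterarguments": counts["-"],
        "neutral": counts["0"],
    }


def percent_agreement(judge_a, judge_b):
    """Percentage of listed thoughts given the same code by both judges."""
    if len(judge_a) != len(judge_b):
        raise ValueError("both judges must rate the same thoughts")
    matches = sum(a == b for a, b in zip(judge_a, judge_b))
    return 100.0 * matches / len(judge_a)


# Hypothetical codes for one subject's five listed thoughts.
judge_a = ["+", "-", "-", "0", "+"]
judge_b = ["+", "-", "0", "0", "+"]
print(tally_thoughts(judge_a))
print(percent_agreement(judge_a, judge_b))
```

In this made-up case the judges disagree on one of five thoughts, so the item would go to discussion before the subject's counts were entered into the analysis.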


Results 


Attitude measure. The mean for each cell
on the attitude measure is graphed in Figure
2. As in Experiment 1, subjects agreed more
with the pro- (M = 9.61) than the counter-
attitudinal advocacy (M = 8.19), F(1, 187)
= 6.48, p < .02, and subjects agreed differ-
entially as a function of the number of mes-
sage repetitions (M1 = 8.09, M3 = 9.77, M5
= 8.86), F(2, 187) = 3.04, p < .05. The in-
teraction again did not approach significance
(p > .20). Trend tests were conducted using
Gaito’s (1965) procedure for unequal n and
revealed that message repetition resulted in a
significant quadratic effect, F(1, 187) = 9.46,
p < .001. Pairwise comparisons of cell means
using the Newman-Keuls procedure for unequal
n provided evidence that agreement with the
advocacy increased (p < .05), then decreased
only marginally (p < .10) as exposure fre-
quency increased.

Cognitive-response measures. ‘The means 
for each cell on the cognitive-response me 
sures are graphed in Figure 
yses of variance revealed that subjects ge 
erated more counterarguments, F(1, 181) © 
10.71, p < .001, and fewer neutral/irrelevat 
thoughts, F(1, 187) = 4.35, p < 05, 4 A 
sponse to the counterattitudinal than e 
sponse to the proattitudinal advocacy: s 
number of message repetitions margi 51) 
fected the production of favorable, F(2, ri 
= 2.36, p < .10, and neutral/irrelevant, a 
187) = 2.61, p < .08, thoughts. There 


ws 

1 An additional factor included in the desi6” py 
whether subjects were informed as 
times they would hear the message prior tO 5g the 
That is, half of the subjects were told 
one, 


taped communication. The other half of the su ihe 
were told nothing about the number of © 


aed 
vealed that this factor made no difference 08 | 
the measures obtained, and it will not be discus 
further. 
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Figure 2, The effects of preexposure posi 
response: Experiment 2. 


Table 1
Within-Cell Correlations: Experiment 2

Variable            Agreement  Counterarguments  Favorable thoughts  Neutral thoughts
Counterarguments    —.56
Favorable thoughts  a TA Le
Neutral thoughts    oi 47 — 07 02

no significant interactions; thus, trend contrasts
were computed on the data collapsed across
message position. The results of trend
contrasts on each cognitive-response measure
(employing Gaito's procedure for unequal n)
indicated that counterargumentation was
affected quadratically, F(1, 187) = 15.37,
p < .001, and neutral/irrelevant thinking was
affected linearly, F(1, 187) = 5.87, p < .02, as
exposure frequency increased (see Figure 2).

Finally, a 2 X 3 X 3 repeated measures
analysis of variance was performed in which
message position and exposure served as
between-subjects factors and type of thought
(i.e., counterargument, favorable thought,
neutral/irrelevant thought) served as the
within-subjects factor. The analysis yielded
two interactions: (a) A significant Type of
Thought X Message Position interaction,
F(2, 370) = 6.14, p < .003, revealed that the
counterattitudinal advocacy elicited a greater
number of topic-relevant thoughts (primarily
counterarguments) and fewer neutral/irrelevant
thoughts than the proattitudinal advocacy;
and (b) a marginally significant Type
of Thought X Message Repetition interaction,
F(4, 370) = 2.02, p < .09, showed that mes- 
sage repetition affected each type of thought 
differently (see Figure 2). Favorable thoughts 
increased, then decreased; counterarguments 
decreased, then increased; and irrelevant 
thoughts continually increased with repetition. 

Recall and ancillary measures. The anal- 
ysis of the number of message arguments re- 
called replicated the incidental learning find- 
ing in Experiment 1: More arguments were 
recalled when they were used to support the 
counterattitudinal (M = 5.51) than the pro- 
attitudinal (M = 4.48) position, F(1, 187) = 
14.63, p < .001. Message repetition also
affected message recall (Ms = 4.14, 5.50, and 5.36,
in order of increasing exposure), F(2, 187) = 11.49, p < .001;
the interaction, however, was not significant 
this time. Trend analyses (collapsed across 
message position) yielded significant Fs for 
both the linear, F(1, 187) = 14.14, p < .001, 
and quadratic, F(1, 187) = 15.01, p < .001, 
trends. 

The analyses of the ancillary measures re- 
vealed again that neither the position advo- 
cated nor the number of presentations affected 
any of these ratings. 

Within-cell correlations. A within-cell cor- 
relation between each pair of dependent mea- 
sures was calculated to provide measures of 
association between the variables with the ef- 
fects of the treatments held constant. The 
results of these analyses are presented in 
Table 1. Note that agreement and topic-relevant
thinking (counterarguments and favorable
thoughts) were interrelated highly,
whereas learning of the message (recall),
topic-irrelevant thinking (neutral thoughts),
and agreement were relatively independent of
each other.²


2 Although the hypothesized attitude modification
process holds that the cognitive responses mediated
agreement with the advocacy, other causal chains
are possible. Following Osterhouse and Brock (1970),
Insko, Turnbull, and Yandell (1974), Petty et al.
(1976), and Petty and Cacioppo (1977), four analyses
of covariance (ANCOVA) were conducted. The
rationale for the procedure employed is discussed by
Insko et al. (1974). The ANCOVA procedure compares
specific causal models by holding constant statistically
the postulated mediator between an initial variable
and a final criterion variable. The relationship
between these measures should be within error
variance of zero when the mediator is held constant
through the use of the covariance procedure; a
reduction of a significant F for the criterion measure
through the use of this procedure would suggest that
the covariate mediated the criterion variable. This
causal-model-testing procedure is adapted from
Blalock's (1964) technique using partial correlations,
Scheffé's (1959) technique of examining partial slopes,
and Cochran and Cox's (1957) technique employing
analysis of covariance. Heise (1969) discusses the
assumptions and problems of causal model-testing
analyses.

Separate 2 X 3 ANCOVAs were conducted using (a)
counterarguments, (b) favorable thoughts, (c)
neutral/irrelevant thoughts, and (d) recall as the
covariate (i.e., initial variable) and agreement as the
criterion variable. The originally significant F of
3.02 for the repetition main effect on agreement was
reduced to Fs < 1 when counterarguments or
favorable thoughts served as the covariate. When
neutral/irrelevant thoughts served as the covariate,
the F was 2.41; when recall served as the covariate, the

for the repetition main effect on favorable thoughts
was reduced slightly to 1.78; the marginally
significant F of 2.61 on neutral/irrelevant thoughts
became a significant F of 6.04; nonsignificant
remained
Discussion


Although various research, including Ex- 
periment 2, has demonstrated that repetition 
influences learning and affect similarly, addi- 


tional evidence provided in Experiments 1 and 
2 suggests that this relationship in mere ex- 
posure research using persuasive communica- 
tions is neither the result of learning leading 


to liking nor of liking leading to learning. 
Specifically, the within-cell correlations be- 
tween learning and agreement in the experi- 
ments were near zero and nonsignificant. Ad- 
ditionally, the analyses of covariance (Foot- 
note 2) suggested that although learning did 
not mediate agreement, the cognitive re- 
sponses elicited by the communications did. 
Of course, it is possible that the learning- 
leads-to-liking hypothesis in mere exposure 
research applies only to stimuli that possess 
few cognitive associations (e.g., nonsense 
stimuli). The present experiments provide no 
evidence from which to evaluate this hy- 
pothesis. 

The present experimental results indicated 
that regardless of the position advocated, mes- 
sage repetition led to (a) increasing, then de- 
creasing agreement with the advocacy; (b) 
decreasing, then increasing counterargumentation;
and (c) increasing topic-irrelevant
thinking. Of importance is that the analyses
of variance revealed that the sequence of the
topic-relevant thoughts generated in response
to the repeated messages paralleled the
observed attitude change. It was hypothesized
that the effects of message repetition on
agreement were mediated by a two-stage
attitude-modification process in which the
repetition of the message arguments provided
more opportunities to elaborate cognitively
upon them and to realize their cogency and
favorable implications. Hence,
counterargumentation declined at the moderate
exposure frequency. At high exposure levels,
however, tedium and/or reactance may have
motivated the individual to again attack the
now offensive communication. Thus,
counterargumentation was renewed, and
agreement decreased at high exposure levels.

Validity of thought listings. It might be
argued that the thought listings do not
provide evidence about the nature of the
subjects' thoughts, but instead reflect what
subjects thought the experimenter wanted them
to list (Orne, 1962). This interpretation is 
considered unlikely in the present case be- 
cause (a) the experimenter clearly requested 
a list of everything about which subjects had 
been thinking during the message; (b) al- 
though a complex pattern of counterargu- 
mentation was produced (first decreasing, 
then increasing), reports of topic-irrelevant 
thinking increased as exposure frequency in- 
creased; and (c) a between-subjects design 
was employed, making unlikely the successful 
intuiting of the experimenter’s hypotheses. 
Furthermore, the position of the advocacies 
did not alter the exposure effects on the listed 
thoughts and attitude; although this result 
replicates previous research on mere exposure 
(Zajonc et al., 1974), it is unlikely that such 
an effect is common knowledge to naive 
subjects. 

It might also be argued that subjects are 
unable to provide veridical information about 
the nature of their thoughts. Cacioppo, Har- 
kins, and Petty (in press) and Cacioppo and 
Petty (Note 2) have argued that subjects are 
able to report accurately their recent and 
current thoughts and ideas and that the fact 
that subjects may be unaware of the reasons 
for, or consequences of, their thoughts (cf. 
Nisbett & Bellows, 1977; Nisbett & Wilson, 
1977) does not invalidate the use of their re- 
sponses in a content analysis. A thought-list- 
ing analysis assumes only that the thoughts 
and ideas of subjects are as accessible as are 
attitudes, judgments, facts, and so on (cf. 
Wyer, 1974). The burden then lies with re- 
searchers to determine whether a subject’s 
cognitive responses are predictive of, or
influential in, affect and behavior. It should be
noted that although the present results are
consistent with the hypothesized
attitude-modification process and inconsistent with the
learning-affect model, no definitive evidence
concerning the causal role of cognitive re- 
sponses is provided. Future research might 
fruitfully employ distraction to attenuate the 
dominant type of thought generated and/or 
vary message-argument quality to differen- 
tiate the nature of the dominant thought 


elicited. 
Cognitive dynamics of message repetition.
We have proposed that message repetition 
guides a sequence of cognitive reactions to a 
persuasive communication. But are the re- 
peated presentations of the stimulus per se 
responsible for this sequence? We think not. 
It should be apparent that in the present ex- 
periments, the number of repetitions of a 
message was confounded with the available 
time to think about the message arguments, 
generate new topic-relevant thoughts, and so 
forth. This (additional) time to think has 
been shown to be necessary for individuals 
to process more deeply and elaborate more 
fully the content of an impending communication
(Petty & Cacioppo, 1977), the persuasive
arguments for a group decision (Burnstein
& Vinokur, 1975, 1977), and the qualities
of a stimulus (Tesser & Leone, 1977).
For example, Tesser and his colleagues (e.g., 
Sadler & Tesser, 1973; Tesser & Conlee, 
1975) have demonstrated that if persons are 
asked to think about some object or issue, 
the amount of time spent thinking is related 
to the amount of attitude polarization that 
results. Thus, objects that are liked initially 
are evaluated more favorably with increased 
thought, and initially disliked objects are 
evaluated more negatively. If we had simply 
presented subjects with the topics of the pro- 
attitudinal (increasing expenditures by insti- 
tuting a visitor’s tax) and counterattitudinal 
(increasing expenditures by increasing tui- 
tion) advocacies, without the accompanying 
persuasive arguments, we expect that we 
would have replicated the attitude polariza- 
tion effect: More time to think would have 
produced increased agreement with the pro- 
attitudinal and decreased agreement with the 
counterattitudinal advocacy. However, we 
found that moderate repetitions of (and time
to think about) the persuasive message argu- 
ments led to increased agreement with both 
the pro- and counterattitudinal advocacies. 
This finding supports the notion that the first 
stage of the attitude-modification process of 
message repetition involves enhanced oppor- 
tunities to process the content of the message 
(which was the same for both pro- and
counterattitudinal groups). Moreover, we found
that high exposure frequencies of (and
lengthier time to think about) the message
arguments resulted in decreased agreement with
the advocacy. This result is consistent with
the notion that the second stage of the
attitude-modification process involves the
development of tedium and/or reactance, which
modifies the subsequent information-processing
activity.

We assumed that individuals would be
motivated to process personally involving
stimuli (cf. Cialdini, Levy, Herman, Kozlowski,
& Petty, 1976); thus, we employed
communicative stimuli that were personally
relevant to the individuals in the experiment. We
also assumed that these individuals, if able,
would be motivated to elaborate more fully
the arguments contained in each advocacy.
Thus, the provision of additional opportunities
via message repetition to process the
persuasive arguments contained in the advocacies
would result in greater realizations of their
validity, and not until the repetitions became
excessive would this process yield to one
motivated by reactance or tedium.

However, there are at least two other ways
to account for the observed sequence of
cognitive responses. First, since each of the
arguments contained in the advocacies was
in favor of an increase in spending, this
general orientation may have served as a
retrieval cue for initially accessing and
generating topic-relevant cognitions that further
supported an increase in spending. As the
memorial pool of favorable thoughts was
depleted, however, counterarguments would
become increasingly accessible. Hence,
according to this account, the direction of the
message arguments is important rather than
their cogency or persuasiveness.

A second plausible mechanism has been
discussed by Burnstein and Vinokur (1975,
1977). The simple statement that others are
advocating a position discrepant from one's
own “may induce a person to reconstruct a
line of reasoning which he thinks could have
produced such (an advocacy)” (Burnstein,
Vinokur, & Trope, 1973, p. 244). This account
differs from Tesser's in that in this instance,
the individual is motivated to construct, at
least initially, all possible arguments that
support the advocacy, since it has been made
salient to the individual that others endorse
this view. Again, as the proargument pool in
memory is exhausted, counterarguments
become increasingly accessible. While this
position is similar in many respects to our own,
we place a greater emphasis on the cognitive
responses elicited by the content and context
of message repetition.

Cognitive dynamics and incidental recall. 
Finally, a word might be said about the in- 
cidental recall findings of Experiments 1 and 
2. In each experiment, it was found that per- 
sons recalled more message arguments used to 
support a counterattitudinal than a proattitudinal
position. Neither familiarity nor novelty
can account for this effect, because the same 
message arguments were used to support each 
position. Nor can a consistency explanation 
account for this effect, because best remem- 
bered were the arguments used to support a 
position inconsistent with the beliefs of the 
subjects. 

Perhaps a more cogent account of the in- 
cidental recall findings can be provided by an 
analysis of the extent of processing of 
pro- and counterattitudinal advocacies. Ac- 
cumulated research has supported Craik and 
Lockhart’s (1972) proposal that the depth or 
extent of processing influences the recall of 
a stimulus (e.g., Craik & Tulving, 1975;
Rogers et al., 1977; Cacioppo & Petty, Note 
1). It is interesting to note in light of this 
evidence that as discovered in Experiment 2, 
the counterattitudinal advocacies elicited a 
greater number of topic-relevant thoughts 
than did the proattitudinal advocacies. This 
effect was obtained even though the message 
arguments comprising each advocacy were 
identical. If one assumes that the number of 
listed topic-relevant thoughts indicates the 
extent to which the individuals elaborated 
cognitively upon the message arguments, then
superior recall of the message arguments
supporting the counterattitudinal rather than the
proattitudinal advocacy would be expected.
The results are in accord with this analysis. 
Effects of Controllable Versus Uncontrollable Factors
on Responsibility Attributions: 
A Single-Subject Approach 


Daniel Arkkelin, Thomas Oakley, and Clifford Mynatt 
Bowling Green State University 


A “lens-model” methodology was used to assess the influence of different
informational “cues” on responsibility attributions for an automobile accident.
Multiple regression analyses indicated that although there were large individual
differences, in Experiments 1 and 2 the major determinants of responsibility
judgments were car speed and the condition of the car's brakes. In Experiment 3,
speeding and brakes information was deleted. The major determinant of
attributions was the driver's past record, although the importance of severity of
consequences was somewhat greater than in the first two experiments. In
Experiment 4, driving record was deleted. The majority of subjects indicated that not
enough information was available. Overall, severity was a relatively unimportant
cue. Observers tended, instead, to rely on factors over which the driver had
control. The lens-model approach seems quite suitable for studying attributional
judgments, particularly since it provides a detailed description of individual
judgment strategies.


Attribution research has typically utilized 
experimental designs in which only a limited 
number of attributional determinants are ma- 
nipulated. Additionally, between-subjects sta- 
tistical analyses are used in which individual 
differences are treated as “error variance.” 
The present studies were conducted to explore 
an alternative approach that permits the as- 
sessment of the effects of several independent 
variables and the determination of the rela- 
tive importance of each variable for individual 
subjects. 

Fischhoff (1976) has pointed out that at- 
tribution theory is not the only area of psy- 
chological research concerned with inferential 
behavior. An attribution can be conceptual- 
ized as a special case of a general judgment 
process wherein individuals integrate informa- 
tion to arrive at judgments. 
Investigators of human judgment have
used multiple regression analysis extensively
to quantitatively describe the manner in
which informational cues are weighted and
combined by a judge to make a decision
(Slovic & Lichtenstein, 1971). Brunswik's
(1952) lens model has served as the
framework for many of these investigations (see
Dudycha & Naylor, 1966; Hammond,
Stewart, Brehmer, & Steinmann, 1975).
Keeley and Doherty (1971) describe a
typical paradigm in which a set of cues, X_i,
are presented to a decision maker, whose
response, Y_s, varies as a function of the cue
values. The individual makes a large number
of judgments in response to various
combinations of the cue values. Multiple
regression analysis is then applied to “capture the
policy” of the judge, resulting in a regression
equation that is a model of the judge's
integrating and weighting strategy, or policy.
Specifically, the regression weights are
indexes of the extent to which the judge
employs each cue.
Since the relative size of the regression
weights is partially determined by their
ordinal position in the regression equation when
cues are intercorrelated, Darlington's (1968)
“usefulness index” is commonly used to
assess the relative importance of each cue. This
index is the amount R² decreases if a given
cue is deleted and the regression equation
recalculated. The usefulness indexes can then
be tested for statistical significance.
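
To make the definition of the usefulness index concrete, the following sketch computes it for a single hypothetical judge. It is only an illustration under stated assumptions: the data are invented (eight profiles built from three binary cues, with noise-free judgments), ordinary least squares is solved directly rather than by the stepwise procedure used in the experiments, and the helper names `solve`, `r_squared`, and `usefulness_indexes` are introduced here and do not come from the original analyses.

```python
import itertools


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(rows, y):
    """R^2 of a least squares fit of y on the cue values in rows."""
    design = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def usefulness_indexes(rows, y):
    """Return full-model R^2 and the drop in R^2 when each cue is deleted."""
    full = r_squared(rows, y)
    drops = []
    for j in range(len(rows[0])):
        reduced = [r[:j] + r[j + 1:] for r in rows]
        drops.append(full - r_squared(reduced, y))
    return full, drops


# Hypothetical judge: cues are (speeding, brakes, severity), coded 0 or 1.
profiles = list(itertools.product((0, 1), repeat=3))
# Invented policy: speeding weighs most, brakes less, severity not at all.
judgments = [1 + 3 * speed + 2 * brakes for speed, brakes, _ in profiles]

full_r2, ui = usefulness_indexes(profiles, judgments)
print("R^2:", round(full_r2, 3))
print("UI (speeding, brakes, severity):", [round(u, 3) for u in ui])
```

In this invented case the cues are uncorrelated, so the usefulness indexes simply partition R²; with intercorrelated cues, as in a random subset of profiles, the indexes need not sum to R².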

While this methodology can be applied to 
any area of attribution research, attribution 
of responsibility for an accident was chosen 
for the present investigations, since there are 
several unresolved questions concerning this 
topic. Several investigators have reported that 
observers tend to assign increasing responsi- 
bility to a target person as the severity of an 
accident increases (e.g., Chaiken & Darley, 
1973; McKillip & Posavac, 1975; Shaver, 
1970; Walster, 1966). Other studies have 
not replicated this effect (e.g., Shaver, 1970; 
Walster, 1967). Additionally, researchers have 
not reported the magnitude of the severity 
effect when it has occurred or its relative 
magnitude in comparison to other potential 
attributional determinants. The following ex- 
periments used the lens-model methodology 
to assess the relative influence of seven cues 
on individual observers’ responsibility at- 
tributions. 


Experiment 1 
Method 


Subjects. Ten students recruited from introductory
psychology classes served as subjects.

Materials. Pilot subjects were asked to list what
information they would need to determine a driver's
responsibility for an automobile accident. The most
frequently requested items were road conditions,
traffic conditions, the driver's past record of
accidents, car speed, and the condition of the brakes of
the driver's car. In addition to these “cues,” two
others, severity of consequences to victims and
severity of consequences to the driver, were included
in the cue set.

Statements describing two levels of each cue were
constructed (e.g., high or low severity of
consequences, light or heavy traffic, etc.) resulting in 128
combinations of the cues. An example of one
particular combination follows:


While driving to work last week, Elaine W. was
involved in an automobile accident. Elaine was
unharmed, but the people in the other car were
critically injured. The stretch of road on which
the accident occurred was full of potholes, and
there was very little traffic at the time of the
accident. Elaine's record indicates that she has been
in several accidents since she got her driver's
license. It was determined that at the time of the
accident she was not speeding and that the brakes
on her car were in need of repair. On the basis of
the above information, how responsible do you
think Elaine should be held for the accident?


A 7-point scale was at the bottom of each profile, 
with the statement “not at all responsible” at the 
left extreme and “completely responsible” at the 
right extreme. 

Since multiple regression analysis does not require
that all possible cue combinations actually be
presented to the subjects, 40 of the 128 profiles were
randomly selected, and booklets containing the
chosen profiles were constructed. The booklets
contained the following frequencies of each level of each
cue: severity-driver (high = 20, low = 20); severity-victim
(high = 20, low = 20); road conditions (good
= 19, poor = 21); traffic (light = 23, heavy = 17);
driving record (good = 13, poor = 27); speed
(excessive = 14, moderate = 26); brakes (good = 12,
poor = 28).
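
The construction of the profile set described above can be sketched as follows. The sketch enumerates all combinations of seven two-level cues and draws a random subset of 40. The cue labels, the 0/1 coding of levels, and the fixed random seed are assumptions made for illustration; the sample drawn here will not reproduce the level frequencies reported for the actual booklets.

```python
import itertools
import random

# Hypothetical labels for the seven two-level cues.
CUES = ("severity_driver", "severity_victim", "road", "traffic",
        "record", "speed", "brakes")

# Every combination of the two levels (coded 0 and 1) of each cue.
all_profiles = list(itertools.product((0, 1), repeat=len(CUES)))

# Draw 40 distinct profiles at random; the seed is fixed only so that
# the illustration is repeatable.
rng = random.Random(0)
booklet = rng.sample(all_profiles, 40)

# Count how often the level coded 1 occurs for each cue in the booklet.
level_counts = {cue: sum(profile[i] for profile in booklet)
                for i, cue in enumerate(CUES)}

print(len(all_profiles), "possible profiles;", len(booklet), "selected")
for cue, count in level_counts.items():
    print(cue, "level 1:", count, "level 0:", len(booklet) - count)
```

Because the subset is random rather than balanced, the cues in the booklet are generally intercorrelated to some degree, which is one reason an index such as the usefulness index is preferred over raw regression weights.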

Procedure. Each subject was given a booklet of 
profiles and a cover sheet indicating that each page 
contained a short story, with the pattern of informa- 
tion differing from page to page. Subjects were asked 
to make a judgment about the driver on the scale 
at the bottom of each page and instructed not to 
look back at their previous judgments. 


Results 

The numerical values (0 vs. 1) of the two
levels of each cue were entered as predictor
variables in a stepwise multiple regression
analysis of each subject's 40 responsibility
judgments. Table 1 presents the R² values
(M = .66) and significant (p < .05)
usefulness indexes (UI) for each subject.¹

Inspection of this table reveals a marked
reliance on the cues of speeding (9 of 10
subjects employed this cue to a significant
extent, M_UI = .31) and brakes (9 subjects,
M_UI = .27). Information regarding
consequences to victims was relied upon to a
significant extent by only one subject (M_UI =
.01), and information about consequences to
the driver was not employed by any subject.

Two conclusions are clearly warranted 
from Table 1: (a) Subjects’ “policies” were 


1 Table 1 is included to provide an example of
how data obtained with this technique are typically 
summarized. Tables for the remaining experiments 
are available from the first author upon request. 
Table 1
Usefulness Indexes (UI) of Each Cue for Each Subject in Experiment 1

Cue

Subject  Severity  Severity  Road
no.      (driver)  (victim)  conditions  Traffic  Record  Speeding  Brakes  R²

1 w 35 “ ” 
2 23 as 
3 06 Os 2 ši w 
4 06 08 23 30 a” 
5 42 42 1 
6 07 46 08 45 
7 on Ji ji 4 
8 0 w 5% 
9 08 49 a 
10 v u 3 Ns 
Frequency 0 1 2 4 2 9 9 

Aea .00 01 1 03 02 3 


Note. No entry indicates that the usefulness index was not significant (p < .05).


similar, to the extent that 8 of 10 people 
utilized both speeding and brakes informa- 
tion to a significant degree, whereas severity 
of consequences to either the driver or vic- 
tims was of minimal importance; and (b) 
there were large individual differences re- 
garding both the magnitude of importance of 
particular cues and which cues were relied 
upon by any given subject. 


Experiment 2 
Method 


brakes may have been an artifact of serial position 
(i.e., a “recency effect”). Thus, Experiment 1 was
replicated using the same profiles, except that con- 
sequences to the driver and victims appeared last 
and next to last in each story, while speeding and 
brakes were second and third in order of appear- 
ance. Ten introductory psychology students were 
presented the new booklets, following the same
procedure as in Experiment 1.


Results 


The R² values were similar to those
obtained in Experiment 1 (M = .72). The
overwhelming importance of speeding and brakes
information observed in Experiment 1 was
again apparent. All subjects utilized
speeding information to a significant extent (M_UI
= .41). Seven subjects employed information
about brakes (M_UI = .10). Thus, the
possibility of an artifact due to serial position
does not seem likely. The frequency of
usage of consequences was somewhat greater
than in Experiment 1 (four subjects
employed consequences to victims, M_UI = [illegible];
two subjects employed consequences to the
driver, M_UI = .02), but the importance of
these cues was still much less than that of
speeding and brakes.

Experiment 3 

Method 


Since the first two experiments indicated that
speeding and brakes were the major determinants
of responsibility attributions, and given the
increase in the frequency of utilization of consequences
information in Experiment 2, it was of interest to
determine whether severity of consequences would
become a major determinant of attributions if speeding
and brakes information were not available. Thus,
new profile booklets were constructed in which the
speeding and brakes cues were deleted from the
40 profiles employed in Experiment 2 (i.e., consequences
information still appeared last in the stories), and
10 additional subjects were asked to make
responsibility judgments about these stories.


Results 

The R² values were again similar to those
of the first two experiments (M = [illegible]).
Examination of the significant usefulness
indexes revealed that the most important
determinant of attributions was driving record
(eight subjects employed this cue to a
significant extent, M_UI = .41). Severity of
consequences to the victims was the second most
important cue (five subjects, M_UI = .05).
Severity of consequences to the driver was
used by only one subject.


Experiment 4 


In order to determine how important
severity of consequences would be in the
absence of information regarding speeding,
brakes, and driving record, 10 more subjects
were presented the same 40 profiles, in which
driving record was now deleted as well. Four
of these subjects circled 1s (i.e., “not at all
responsible”) on all profiles, and two subjects
circled all 4s (i.e., the midpoint on the scale).
These subjects also indicated on an
open-ended postexperimental questionnaire that
there was not enough information available
on which to base their judgments. Of the
remaining four subjects, the R² for one was
nonsignificant, and there was no identifiable
pattern of cue utilization for the other three
subjects.


General Discussion 
Role of Severity in Responsibility Attribution 


The results of Experiments 1 and 2 clearly
indicate that the two most important
determinants of attributions were the cues
regarding whether the driver was speeding prior
to the accident (19 of 20 subjects employed
this cue to a significant degree) and whether
the brakes of the car were in good or poor
condition (16 of 20 subjects). The importance
of these two cues was far greater than that
of severity of consequences to the victims (5
subjects) or severity of consequences to the
driver (2 subjects). This underscores the
conclusion drawn by Shaver (Note 1) that
“overall severity-dependent responsibility
attribution is the exception rather than the rule”
(p. 9).


Role of Controllable Versus Uncontrollable 
Factors 


It is interesting that the two most
frequently utilized cues (speeding and brakes)
were factors directly under the driver's
control. That is, an individual can be expected
to take responsibility for driving at a safe 
speed and for maintaining the brakes of his 
or her car. On the other hand, factors such 
as road and traffic conditions or the severity 
of the consequences of an accident are not 
under a person’s direct control. It is also in- 
teresting that few subjects employed more 
than two or three of the seven possible fac- 
tors. It appears that most subjects simply 
chose to attend to that information which 
they deemed to be most relevant to the prob- 
lem of assigning responsibility for the acci- 
dent, while ignoring the other cues. When 
the two most relevant cues were eliminated, 
subjects relied almost unanimously on driv- 
ing record (8 of 10 subjects). In accord with 
the hypothesis that observers rely on in- 
formation regarding factors over which the 
driver has control, the overall importance of 
driving record in Experiment 3 is not sur- 
prising, since this was the only factor present 
over which the driver could be thought to 
have any control (in the sense of suggesting, 
for example, carelessness). The tendency for 
subjects to rely on factors that are under the
driver’s control is consistent with Lerner’s 
(1970) proposition that we have a need to 
believe that people deserve bad things that 
befall them, as well as with Wortman’s (1976) 
argument that we make attributions in order 
to enhance our feelings of control over the 


environment. 


Methodological Issues 


In interpreting the results of the present 
experiments, several methodological caveats 
should be borne in mind. First, the relative 
“importance” of the various factors manip- 
ulated may be due not to the factors them- 
selves but to the range of levels employed 
(i.e., the particular wording used to specify
the different levels). Different wording might 
have produced a very different ordering of 
the variables in terms of importance (see Ebbesen
& Konečni, 1975). Although the levels
employed here seem representative of high 
and low values, it might be useful to consider 
several different levels in future research. Ad- 
ditionally, a different pattern of cue utiliza- 
tion may be obtained with other dependent 
measures (e.g., sanction or guilt; cf. Fishbein
& Ajzen, 1973; Vidmar & Crinklaw,
1974). It is also possible that judges may 
integrate cues in an interactive fashion. How-
ever, the R² values obtained here indicate
that the majority of the variance in responsi- 
bility judgments was accounted for by a 
linear combination of the cues, and there is 
little evidence in the literature on human 
judgment and decision making of nonlinear 
cue utilization (Goldberg, 1968, 1970). Never- 
theless, interaction terms can be included in 
the multiple regression equation (Kerlinger 
& Pedhazur, 1973).
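
The contrast between linear and interactive cue combination can be sketched in a few lines of Python. This is only an illustration, not the analysis reported here: the two binary cues, the judgment values, and the helper name `r_squared` are hypothetical. The sketch shows how an interaction term enters a multiple regression equation and how much variance a purely linear model can capture even when an interaction is present.

```python
import numpy as np

def r_squared(design, y):
    """Proportion of variance in y explained by a least-squares fit on `design`."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

# Hypothetical 2 x 2 set of cue profiles (0 = low level, 1 = high level).
cue_a = np.array([0.0, 1.0, 0.0, 1.0])
cue_b = np.array([0.0, 0.0, 1.0, 1.0])
# Hypothetical responsibility judgments that contain a true interaction.
judgment = cue_a + cue_b + cue_a * cue_b

intercept = np.ones_like(cue_a)
linear = np.column_stack([intercept, cue_a, cue_b])
interactive = np.column_stack([intercept, cue_a, cue_b, cue_a * cue_b])

r2_linear = r_squared(linear, judgment)            # main effects only
r2_interactive = r_squared(interactive, judgment)  # main effects plus product term
print(round(r2_linear, 3), round(r2_interactive, 3))
```

In this made-up case the linear model already accounts for 18/19 (about 95%) of the variance, and adding the product term accounts for the remainder, which is why a high linear R² alone does not rule out interactive cue use.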

Finally, there is some evidence in our data 
for a serial position effect on severity utiliza- 
tion. That is, consequences appeared to be 
somewhat more readily utilized when they 
came last in serial order than when they
came early. Recent work dealing with chil-
dren’s moral judgments (Austin, Ruble, &
Trabasso, 1977; Feldman, Klosson, Parsons, 
Rholes, & Ruble, 1976) has demonstrated a 
recency effect on consequence utilization due 
to presentation order. While this result sug- 
gests a possible resolution for the inconsisten- 
cies in past responsibility attribution research 
(i.e., perhaps consequences information was 
presented last in those experiments reporting 
the severity effect, whereas it may have been 
presented first in those studies failing to find 
the effect), a review of the procedures of 
past research indicates that this is not the 
case (cf. Chaiken & Darley, 1973; McKillip 
& Posavac, 1975; Medway & Lowe, 1975; 
Shaver, 1970; Stokols & Schopler, 1973; 
Walster, 1966). Also, no serial position effect 
was observed for speeding and brakes utiliza- 
tion, and that observed for severity was mini- 
mal. However, the small effect does suggest 
that perhaps future research should counter-
balance the serial position of a particular cue.


Applicability of the Lens-Model Approach


[...]tions of these [...] attribution are [...]
lens-model approach. The usefulness of this
methodology seems apparent. It permits the
parsimonious assessment of proportions of
variance accounted for by a large number of
attributional determinants for individual
subjects, thus providing a way to systematically
investigate individual differences. For
example, individuals may differ in their
tendency to rely on severity of consequences in
assigning responsibility for an accident. In
fact, our experiments suggest that this may
indeed be the case. Even among subjects
who relied on severity, some used this cue to
a great extent, whereas others relied on it
very little. Such individual differences (as
well as differences with respect to how other
cues are employed) could be subjected to
experimental investigation, perhaps [...]
to the identification of attributional “styles”
and the construction of theories [...]
the characteristic manner in which [...]
integrate information in making attributions.
1. Shaver, K. G. Intentional ambiguity in the
Marital Roles, Sex Differences, and Interpersonal Attraction


Joseph E. Grush
Northern Illinois University

Janet G. Yehl
University of South Carolina at


This investigation tested the hypothesis that role and dispositional factors can
alter the usual link between similarity and attraction. In two replications, females
and males with traditional or nontraditional attitudes toward sex roles and marriage
rated similar or dissimilar opposite-sex strangers on three attraction measures.
Results showed that sex and traditionality interacted with similarity in
determining ratings of the strangers’ general likability and personal role attraction
(e.g., desirability as a dating partner), but not their functional role attraction
(e.g., desirability as a debater on sex roles). The discussion suggests that
previous attempts to find interactions have often failed because of a lack of
correspondence between dispositional factors and similarity manipulations. The
discussion also suggests that reinforcement and informational explanations could
account for the overall findings, but that communication factors of extremity and
discrepancy could not.


[...] conducting the laboratory experiment.
Requests for reprints should be sent to Joseph E.
Grush, Department of Psychology, Northern Illinois
University, DeKalb, Illinois 60115.


Investigators using Byrne’s (1971) attraction
paradigm have generally found a strong
positive relationship between attitude similarity
and interpersonal attraction. This relationship
has been found for different modes
of stimulus presentation (Byrne & Clore,
1966), different populations (Byrne, Griffitt,
Hudgins, & Reeves, 1969), and different
issues of varying levels of importance (Clore
& Baldridge, 1968). The generalizability of
the similarity-attraction relationship has
been further demonstrated by investigations
that have manipulated similarity on the basis
of abilities (Zander & Havelin, 1960), economic
status (Byrne, Clore, & Worchel,
1966), emotional states (Zimbardo & Formica,
1963), and personality traits (Byrne,
Griffitt, & Stefaniak, 1967).

According to Clore and Byrne’s (1974)
reinforcement-affect model, attraction toward
another person is determined by the proportion
of reinforcements and punishments associated
with that person. Although similarity
between judge and target has frequently
been found to be reinforcing, the focus of the
model is on interpersonal reinforcement
rather than similarity per se. Thus, a
corollary of the model postulates that role
and dispositional factors can alter the usual
similarity-attraction relationship by altering
the reinforcement contingencies associated
with another person (Clore & Byrne, 1974).
In the corollary, roles refer to [...]
positions in which the occupant is expected
to behave in certain prescribed ways. Thus,
certain roles may attenuate the reinforc[...]

[...] among individuals in their orientation to [...]
responding to the same stimulus. Thus, individuals
with different socialization (reinforcement)
histories may differ in their attraction
responses to stimulus persons who are similar
or dissimilar to themselves.

The present investigation was designed to
test whether role and dispositional factors
would moderate the impact of similarity on
attraction. Female and male students, with
traditional or nontraditional attitudes toward
marital roles, judged opposite-sex strangers
who had similar or dissimilar attitudes.
Traditionality was determined by scores on a
marital role decisions questionnaire that is 
described later (see Preliminary Study). Tra- 
ditional respondents, for example, believe 
that wives should be concerned with child 
rearing and domestic matters, whereas hus- 
bands should focus on career and financial 
matters. Nontraditional respondents believe 
that these roles should be jointly shared by 
both spouses. Judgments of the opposite-sex 
strangers were made on measures of personal 
role attraction (e.g., desirability as a marriage
partner), functional role attraction (e.g., de-
sirability as a debater on marital roles), and
general attraction (i.e., likability).

The personal and functional role attrac- 
tion measures were included to assess differ- 
ent types of (anticipated) interactions be- 
tween judge and target. Personal role items 
assessed attraction in role relationships in 
which judge and target would directly inter- 
act at a personal or intimate level. Functional 
role items assessed attraction in role rela- 
tionships in which the judge would primarily 
observe and evaluate the target person’s per- 
formance of certain public roles. Finally, the 
general attraction item was included to de- 
termine if global attraction tends to mirror 
personal or functional role attraction. 

The major hypothesis was that sex and 
traditionality of the judges would interact 
with similarity of the target persons in de- 
termining personal and general attraction 
but not functional attraction. This hypothe- 
sis was made up of three distinct predictions 
relating to the three attraction measures. 

Prediction 1 was that nontraditional fe-
males and traditional males would display a
distinct preference for similar partners on
personal role attraction, whereas traditional
females and nontraditional males would not. 
This prediction was based on the assumption 
that nontraditional women and traditional 
men have irreconcilable differences that would 
lead to constant conflicts and make personal 
relationships between the two quite punish- 
ing. Prediction 1 also assumed that tradi- 
tional women and nontraditional men could 
find similar or dissimilar partners equally 
reinforcing in personal roles but for different 
reasons. With similar partners, reinforcement 




would come from the mutuality of shared be- 
liefs. With dissimilar partners, traditional 
women would find it reinforcing to know, for 
example, that they could pursue careers and 
receive help at home if they so desired. Non- 
traditional men would find it reinforcing to 
know, for example, that they could fully 
pursue their own careers and avoid the drudg- 
ery of household chores if they chose to do so. 

Prediction 2 was that general attraction 
would parallel personal role attraction. This 
prediction was based on research that shows 
that general likability tends to mirror role at- 
traction when the latter involves direct in- 
teractions between judge and target (Grush, 
Clore, & Costin, 1975). 

Prediction 3 was that similarity manipula- 
tions, either alone or in conjunction with 
characteristics of the judges, would not affect 
ratings of functional role attraction. This pre- 
diction was based on the assumption that de- 
bates and discussions of controversial issues 
are by nature interesting and informative 
(i.e., reinforcing) to the extent that both
sides of the issues are presented. In other 
words, ratings of another person’s suitability 
to perform these roles should be based more 
on whether they have a point of view to ex- 
press than whether they agree with us. 


Preliminary Study 


Scale Construction 


The preliminary study was conducted to 
develop the Marital Role Decisions Ques- 
tionnaire (MRDQ) that would be used in the 
main experiment to select individuals who 
had traditional or nontraditional orientations 
toward sex roles and marriage. Possible items 
for this questionnaire were initially elicited 
from 20 married couples who responded to an 
open-ended questionnaire that asked them to 
describe (a) “important decisions that you 
and your spouse (or couples you know) had 
to make where the two of you had opposing 
viewpoints” and (b) “the choices that were 
available to you and your spouse (or couples 
you know).” 

This elicitation procedure produced nearly 
40 possible items. These 40 items were then 
reduced to 14 by (a) eliminating trivial or 
idiosyncratic decisions, (b) discarding decisions
thought not to be amenable to traditional
and nontraditional responses, and (c)
constructing composite descriptions of decisions
that were similar in content. The 14
resulting decisions covered the areas of domestic
(whether household chores should be
shared), financial (whether home should be
purchased or rented), social (whether functions
should be attended separately or together),
familial (whether children should
be placed in a nursery), and career (who
works to put whom through professional
school) considerations. Except for domestic
considerations, each content area was represented
by 3 items. These decisions were then
rewritten as dilemmas, each with a 7-point
response format: slight, moderate, or strong
agreement with a traditional or a nontraditional
alternative for resolving the dilemma
and a “cannot decide” response.


Reliability and Validity


To test the reliability and validity of the
MRDQ, the 14-item scale was administered
to a small number of individuals from five
“known groups.” The number of individuals
and the five groups from which they came
were 13 female members of a local chapter
of the National Organization of Women
(NOW); 12 female college students; 12
spouses of policemen; 13 male college students;
and 12 policemen. It is important to
note that these samples were independent.
That is, the female and male college students
were from different schools, and the spouses
of policemen and the policemen were not married
to one another.

Coefficient alpha was computed to determine
the reliability of the MRDQ. With each
item treated as a parallel measure, reliability
for the 14-item MRDQ was .78. However,
inspection of the data revealed that 2 items (1
social and 1 financial dilemma) functioned
independently of the other 12 items. The deletion
of those 2 items produced a 12-item
MRDQ scale with a reliability of .79.

The validity of the 12-item MRDQ scale
was assessed by determining whether the
responses of the known groups significantly
differed in the expected direction. The results
of these comparisons are presented in
Table 1. As can be seen there, the mean
MRDQ scores for the three groups of females
significantly differed from each other:
NOW women were more nontraditional than
college women, who in turn were more
nontraditional than the spouses of policemen. As
Table 1 also indicates, the two groups of
male respondents significantly differed from
each other: Male college students were more
nontraditional than policemen. Comparisons
between the sexes for comparable groups
revealed no significant differences, although
female college students and the spouses of
policemen were slightly more nontraditional
than their male counterparts. In all,
then, the preliminary study showed that the
12-item MRDQ had sufficiently good psychometric
properties for use as a research
instrument.¹


Table 1
Comparison of Mean Scores on the Marital Role Decisions Questionnaire

Females                          Males
Group                  Score     Group               Score

Members of NOW*        21.08
College students       40.75     College students    [...]
Wives of policemen     51.00     Policemen           57.14

Note. Scores on the Marital Role Decisions Questionnaire could range from 12 to 84, with higher scores
indicating a more traditional orientation toward sex roles and marriage. Means with different subscripts
differ significantly from each other beyond the .05 level (two-tailed).
* National Organization of Women.


1 Copies of the MRDQ, as well as instructions and
scoring procedures for its use, are available from the
first author upon request.
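
For readers who want to see how a coefficient alpha such as the one reported for the MRDQ is obtained, a minimal Python sketch follows. The response matrix is invented for illustration and is not the preliminary-study data; the function name `coefficient_alpha` is likewise an assumption of this sketch.

```python
import numpy as np

def coefficient_alpha(scores):
    """Coefficient alpha for a respondents x items matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)       # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return (n_items / (n_items - 1)) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical responses: 5 respondents x 4 items on a 7-point format.
responses = [
    [2, 3, 2, 3],
    [5, 5, 6, 5],
    [7, 6, 7, 7],
    [3, 3, 2, 4],
    [6, 5, 6, 6],
]
alpha = coefficient_alpha(responses)
print(round(alpha, 2))
```

Dropping items that function independently of the rest, as was done with 2 of the 14 dilemmas, amounts to removing the corresponding columns and recomputing the coefficient.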
Main Experiment


Method


Design


The experiment had a 2 (female or male) × 2
(traditional or nontraditional) × 2 (similar or dis-
similar) between-subjects factorial design. The first
two factors involved characteristics of the experi- 
mental participants, who served as judges in an im- 
pression-formation task. The factor of traditionality 
was determined by participants’ responses on the 
MRDQ. Individuals scoring above the true midpoint 
of the scale were classified as “traditionals,” whereas 
individuals scoring below the true midpoint were 
classified as “nontraditionals.” The third factor was 
whether the target person's score on the MRDQ was 
similar or dissimilar to the judge's score. 
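
The classification rule and the layout of the design can be sketched as follows. The sketch assumes the 12-item, 7-point scoring described in the preliminary study, so that the true midpoint is 48; how a score falling exactly on the midpoint was handled is not stated, so returning `None` for that case is an assumption, as are the function and variable names.

```python
from itertools import product

N_ITEMS = 12
POINTS = range(1, 8)                                        # 7-point response format
TRUE_MIDPOINT = N_ITEMS * (min(POINTS) + max(POINTS)) / 2   # 48.0 on the 12-84 scale

def classify(mrdq_score):
    """Label a judge by position relative to the true midpoint of the scale."""
    if mrdq_score > TRUE_MIDPOINT:
        return "traditional"
    if mrdq_score < TRUE_MIDPOINT:
        return "nontraditional"
    return None   # assumption: a score exactly at the midpoint is left unclassified

# The eight cells of the 2 x 2 x 2 between-subjects design.
cells = list(product(("female", "male"),
                     ("traditional", "nontraditional"),
                     ("similar", "dissimilar")))
participants_needed = 16 * len(cells)   # 16 judges per cell
```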


Participants 


One hundred twenty-eight college students (64 fe- 
males and 64 males) served as experimental par- 
ticipants, with 16 individuals participating in each 
cell of the 2 × 2 × 2 design. All students were en-
rolled in introductory psychology courses at Northern
Illinois University and received extra course credit 
for their voluntary participation in the experiment. 
Students were recruited during the first week of 
classes, at which time volunteers completed the 
MRDQ. The experiment was conducted several weeks 
later, with students participating in groups of 5 
to 10.


Procedure 


Under the guise of an impression-formation task, 
each student examined the MRDQ responses of an 
opposite-sex stranger. Consistent with the cover
story, no other information about the strangers was
given except that they were also college students
and of the same age group as the participants. The
strangers’ responses on the MRDQ were manipulated
to be either similar or dissimilar to the students’ re-
sponse patterns. After examining the strangers’
MRDQ responses, the students recorded their im-
pressions of the strangers on items intended to mea-
sure personal role, functional role, and general at-
traction. Finally, all students were thanked for their
participation and debriefed about the general pur-
pose and methodology of the investigation.


Similarity Manipulations 


The target person’s response pattern on the MRDQ
was manipulated to be 10 (similarity) or 30
(dissimilarity) points discrepant and opposite in di-
rection from the student's response pattern. The
rules for constructing the target's MRDQ responses
were (a) always change the target’s responses in
the direction that is opposite the student’s overall 
score on the MRDQ; (b) make three sequential 
changes of 10 points each; (c) for each sequential 
change, randomly alter as many different items as 
possible; and (d) avoid using the “cannot decide” 
response. Once these rules were employed, each stu- 
dent was randomly assigned to the similar or dis- 
similar condition. Similarity and dissimilarity were 
based on 10 and 30 points, respectively, because the 
standard deviation of scores on the MRDQ was 
10.32 for this population. 
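
One possible reading of rules (a) through (d) is sketched below in Python. The procedure is not specified in enough detail to reproduce exactly, so several things are assumptions of this sketch: items are coded 1 to 7 with 4 as the "cannot decide" response, each change moves an item one point at a time (or two points when that is needed to step over the "cannot decide" code), and the function names are hypothetical. The sketch never introduces a new "cannot decide" response, although it leaves untouched any that the judge gave.

```python
import random

CANNOT_DECIDE = 4   # assumed numeric code for the "cannot decide" response
LOW, HIGH = 1, 7    # assumed coding: 1 = strongly nontraditional, 7 = strongly traditional

def shift_responses(pattern, points, step, rng):
    """Move a response pattern `points` total scale points in direction `step`."""
    shifted = list(pattern)
    remaining = points
    while remaining > 0:
        order = list(range(len(shifted)))
        rng.shuffle(order)                       # rule (c): spread changes over many items
        moved = False
        for i in order:
            if remaining == 0:
                break
            new_value, cost = shifted[i] + step, 1
            if new_value == CANNOT_DECIDE:       # rule (d): step over "cannot decide"
                new_value, cost = new_value + step, 2
            if LOW <= new_value <= HIGH and cost <= remaining:
                shifted[i] = new_value
                remaining -= cost
                moved = True
        if not moved:
            raise ValueError("requested discrepancy cannot be reached")
    return shifted

def build_target(judge_responses, condition, rng):
    """Build a target pattern 10 (similar) or 30 (dissimilar) points from the judge's."""
    midpoint = CANNOT_DECIDE * len(judge_responses)
    step = -1 if sum(judge_responses) > midpoint else 1      # rule (a): opposite direction
    n_changes = {"similar": 1, "dissimilar": 3}[condition]   # rule (b): 10-point changes
    target = list(judge_responses)
    for _ in range(n_changes):
        target = shift_responses(target, 10, step, rng)
    return target

rng = random.Random(0)
judge = [6, 6, 5, 7, 6, 5, 6, 6, 5, 6, 7, 6]   # hypothetical traditional judge, total 71
similar_target = build_target(judge, "similar", rng)
dissimilar_target = build_target(judge, "dissimilar", rng)
```

With the reported standard deviation of 10.32, the similar target in this sketch sits roughly one standard deviation from the judge and the dissimilar target roughly three.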


Dependent Measures 


The dependent measures included items intended 
to measure personal role, functional role, and gen-
eral attraction. There were three items intended to
measure personal role attraction, or the extent to 
which the judge would enjoy interacting with the 
target person: How suitable is the target person as a 
work, dating, and marriage partner? There were 
three items intended to measure functional role at- 
traction, or the extent to which the judge would 
enjoy observing the target person perform public 
roles: How suitable is the target person as a de- 
bater, panelist, and discussant on sex roles and mar- 
riage? A single item measured how much the judge 
would generally like the target person. Four filler 
items assessed the degree to which the target per- 
son was intelligent, moral, socially adjusted, and 
knowledgeable about current affairs. All 11 items 
were assessed on 7-point response scales, with an- 
chors of “very much” and “very little.” 


Results 


Reliability 


Item analysis based upon pooled within- 
cell correlations was used to assess whether 
items designed to measure personal and func- 
tional role attraction could be legitimately 
combined. Ratings of the target persons’ suit- 
ability as work, dating, and marriage part- 
ners were highly intercorrelated with each 
other (mean r = .65). Scores on these items 
were thus averaged to create a measure of 
personal role attraction. Ratings of the target 
persons’ suitability as debaters, panelists, and 


2 In actuality, two replications of the experiment
were conducted during two consecutive semesters. 
Since an analysis for a replicated experiment pro- 
duced no reliable differences between the two se- 
mesters, the data for both semesters were combined 
to increase the power of the subsequent analyses. 
Table 2
Mean Ratings of Personal Role, General, and Functional Role Attraction for the Eight
Experimental Groups


Experimental group 




Female-nontraditional S +i 
Similar . ” <0 
Dissimilar 2.87 sa 4.13 
Female-traditional 
Similar 5.25 $31 5 
Dissimilar 5.46 = Sa 
Male-nontraditional 
Similar 5.08 ae 5.13 A 
Dissimilar 4.50 481 
Male-traditional : vee 
Similar 44 i oe? 
Dissimilar 3.54 <3 3.88 . 
Note. Ratings on each attraction measure could range from 1 to 7, with higher ratings indicating greater
attraction. Given that there were four directional predictions concerning the simple tests, a conservative
alpha level would be .0125 rather than .05.


discussants of sex roles and marriage were 
also highly intercorrelated (mean r= .60). 
Scores on these items were thus averaged to 
create a measure of functional role attraction. 
Since personal role items were only minimally 
correlated with functional role items (mean 
r = .17), discriminant evidence was also suf- 
ficient to treat these two sets of items as 
separate measures of attraction. Finally, the 
likability item was treated as a single mea- 
sure of general attraction, because its mean 
correlations with personal and functional role 
attraction were .53 and .28, respectively. 
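
The item analysis described above can be sketched in Python as follows. Pooled within-cell correlations are computed here by centering each item on its own cell mean before correlating, which removes between-cell (treatment) differences. The ratings, cell labels, and function names are hypothetical and only illustrate the computation.

```python
import numpy as np

def pooled_within_cell_corr(items, cells):
    """Correlate items after removing each experimental cell's item means."""
    items = np.asarray(items, dtype=float)
    cells = np.asarray(cells)
    centered = items.copy()
    for cell in np.unique(cells):
        in_cell = cells == cell
        centered[in_cell] -= items[in_cell].mean(axis=0)   # deviations from the cell mean
    sscp = centered.T @ centered                           # pooled within-cell cross-products
    scale = np.sqrt(np.diag(sscp))
    return sscp / np.outer(scale, scale)

def mean_off_diagonal(corr):
    """Average correlation among distinct items."""
    k = corr.shape[0]
    return (corr.sum() - np.trace(corr)) / (k * (k - 1))

# Hypothetical ratings (work, dating, marriage partner) from six judges in two cells.
ratings = np.array([[5, 6, 5], [6, 6, 7], [4, 5, 4],
                    [2, 3, 2], [3, 3, 4], [1, 2, 1]], dtype=float)
cells = ["similar"] * 3 + ["dissimilar"] * 3

corr = pooled_within_cell_corr(ratings, cells)
mean_r = mean_off_diagonal(corr)
personal_role = ratings.mean(axis=1)   # composite formed by averaging the three items
```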


Principal Hypothesis 


The three attraction measures were treated 
as separate dependent variables in a multi- 
variate analysis of variance. The results of 
this analysis showed that the multivariate F 
for the Sex × Traditionality × Similarity in-
teraction was significant, F(3, 118) = 4.42, p
< .01. The corresponding univariate Fs for 
personal role and general attraction were also 
significant: F(1, 120) = 12.34, p < .001, and 
F(1, 120) = 5.73, p < .05, respectively. The 
univariate F for functional role attraction 
was not significant, F(1, 120) = .01, p > .90. 


These findings supported the principal
hypothesis that sex and traditionality of the
judges would interact with similarity of the
targets in determining personal role and general
attraction but not functional role attraction.³


Specific Predictions 


Simple tests compared the personal role
and general attraction ratings of similar and
dissimilar target persons made by the various
judges. The results of these comparisons
are reported in Table 2. As can be seen there,
nontraditional females displayed a distinct
preference for similar as opposed to dissimilar
target persons on personal role and general
attraction (p < .001 and p < .01, respectively).
Traditional men also preferred
similar partners on personal role and general
attraction (p < .02 and p < .01, respectively).
By contrast, traditional women and
nontraditional men displayed no such preference
for similarity on either attraction measure
(ps > .15). These findings supported
Predictions 1 and 2 of the principal hypothesis.


3 [...] conducted. This analysis
[...]tions of the attraction measures: (a)
personal and general attraction versus functional role
attraction and (b) personal versus general attraction.
[...] significant, F(2, 119) = 4.94, p < .01. [...]
analysis indicated that the first contrast accounted
for a significant amount of variance in the
analysis, F(1, 120) = 9.44, p < .005, whereas the
addition of the second contrast did not [...]
.49. Thus, the results of this analysis also supported
the principal hypothesis.

Although Table 2 contains the means for 
functional role attraction, simple tests were 
not conducted because they were unwar- 
ranted. As already noted, for example, the 
triple interaction for functional role attrac- 
tion was not significant. Moreover, the main 
effect for similarity was not significant, F
(1, 120) = .97, p > .30. The Sex × Similar-
ity and the Traditionality × Similarity inter-
actions were also nonsignificant: F(1, 120)
= .77, p > .35, and F(1, 120) = 3.07, p >
.08, respectively. These findings supported
Prediction 3 of the principal hypothesis. 


Discussion 


This study tested the hypothesis that role 
and dispositional factors can moderate the 
effects of similarity on attraction. Results 
showed that nontraditional women and tradi- 
tional men displayed more liking (general 
attraction) and stronger preferences for simi- 
lar versus dissimilar others as work, dating, 
and marriage partners (personal role attrac-
tion). By contrast, traditional women and
nontraditional men were equally attracted to
similar and dissimilar others on these same
measures. Results also showed that all judges
considered similar and dissimilar others to be
equally suited to perform the roles of de-
baters, discussants, and panelists on topics of
sex roles and marriage (functional role attrac-
tion). In summary, this study demonstrated
that characteristics of judges and dimensions
of judgment can affect interpersonal attrac-
tion as much as similarity of target persons.

The significance of the present study lies
in the fact that it is apparently the first to
produce clearly replicated results (see Foot-
note 2) of the moderating effects of disposi-
tional factors on attraction responses. Thus,
the question naturally arises as to why the 
present study succeeded when past efforts 
have generally failed in this regard (see 
Byrne, 1971; Fishbein & Ajzen, 1972). 

One possibility is that previous studies 
typically used Byrne’s Interpersonal Judg- 
ment Scale (IJS), whereas the present study 
employed different attraction measures. Since
the likability and work partner items of the 
present study are psychologically similar to 
the two items comprising the IJS, an analysis 
of these combined items was conducted to 
test this possibility. Not surprisingly, the re- 
sults of this analysis were exactly parallel to 
those for personal role and general attrac-
tion: The Sex × Traditionality × Similarity
interaction was significant, F(1, 120) = 
13.80, p < .001; nontraditional women dis- 
played a distinct preference for similarity, F 
(1, 30) = 33.05, p < .001, as did traditional 
men, F(1, 30) = 7.19, p < .05; and tradi- 
tional women and nontraditional men dis-
played no such preference (Fs < 1.50). Thus,
the success of the present study cannot be 
attributed to its use of a novel measure of 
attraction. 

A second possibility is that previous studies 
failed to obtain moderating effects because 
the relationship between dispositional factors 
and similarity manipulations was more ten- 
uous than in the present study. For example, 
in the studies of authoritarianism, partici- 
pants’ scores on the F scale and their re- 
sponses on the attitude items were either un- 
related (Byrne, 1965) or only minimally cor- 
related (Sheffield & Byrne, 1967). Thus, it is 
not surprising that authoritarianism did not 
interact with the similarity of others’ re- 
sponses on issues that were generally unre- 
lated to the authoritarian ideology. In other 


4 Although moderating effects on attraction have 
been obtained with need for affiliation, different in- 
vestigations have produced different results (cf. 
Byrne, 1961, 1962). Somewhat similarly, the dis- 
positional variable of social avoidance and distress 
has significantly interacted with similarity in affect- 
ing attraction in one study (Smith, 1972) but not 
another (Gouaux, Lamberth, & Friedrich, 1972). 
studies, the tenuous relationship may have
occurred in terms of unrealistic expectations 
about the personal benefits of attitude simi- 
larity. For example, it seems unreasonable to 
expect that individuals with high need for 
approval (Ettinger, Nowicki, & Nelson, 1970) 
can be so positively affected by attitude simi- 
larity that they will display more attraction 
toward agreeing strangers than that displayed 
by individuals with low need for approval. 

By contrast, the present study ensured a 
close relationship between dispositional vari- 
ables and similarity manipulations by using 
the same items to assess the dispositional 
variable and to manipulate similarity. Fur- 
thermore, information contained in these 
items was directly relevant to the dependent 
measures of attraction. For example, faced 
with the task of rating a target person’s suit- 
ability as a potential marriage partner, par- 
ticipants could directly assess whether their 
own orientations toward marriage were com- 
patible with the target person’s orientation. 
Thus, the present study was probably success- 
ful in finding interactions because there was 
a closer correspondence between dispositional 
factors and similarity manipulations than that 
which existed in previous studies. 


Alternative Interpretations 


While the present findings were predicted 
from a reinforcement perspective, they are
compatible with informational approaches to 
attraction. For example, Ajzen (1974) views 
attraction as being a summed function of the 
subjective probabilities that a target person 
has certain attributes and the affective values 
of those attributes. Thus, if nontraditional 
women and traditional men consider similar 
others to have more positive attributes for 
personal roles than dissimilar others, they 
would evaluate similar others more positively 
than dissimilar others. If these same judges 
also consider both similar and dissimilar 
others to have equally positive attributes for 
functional roles, they would evaluate both 
target persons equally well. 
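
The summation idea attributed to Ajzen (1974) can be illustrated with a short Python sketch. The attributes, subjective probabilities, and affective values below are invented for illustration; only the rule of summing probability-by-value products comes from the description above.

```python
def attraction(beliefs):
    """Sum of (subjective probability x affective value) over a target's attributes."""
    return sum(probability * value for probability, value in beliefs)

# Hypothetical beliefs a judge might hold about two targets for a personal role:
# (probability the target has the attribute, affective value of the attribute).
similar_other = [(0.9, 3), (0.8, 2), (0.2, -2)]
dissimilar_other = [(0.3, 3), (0.4, 2), (0.7, -2)]

print(round(attraction(similar_other), 2), round(attraction(dissimilar_other), 2))
```

Under these made-up numbers the similar other is evaluated more positively (3.9 versus 0.3). If the two belief sets were equally favorable, as suggested for functional roles, the two sums would be equal.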

The present findings are also compatible
with Kaplan and Anderson’s (1973) information
integration approach to attraction. According
to this approach, variations in the
judges’ initial impressions or in the scale
values and weights assigned to pieces of
information about the target persons could
account for the results obtained here. For
example, the personal role findings could be
due to nontraditional women and traditional
men having more negative initial impressions
of dissimilar others than traditional women
and nontraditional men have of them.
Additionally, differences between personal and
functional role findings could be due to
differences in the scale values or weights
assigned to the same target when that person
is judged on different dimensions.

A final interpretation warrants comment.
The similarity manipulations of this study
bear a strong resemblance to discrepancy
manipulations in persuasion research. Thus, if
nontraditional women and traditional men
held extreme views on marriage, their dislike
of dissimilar others would merely replicate
the well-known derogation effect of discrepant
communications and communicators (Hovland,
Harvey, & Sherif, 1957; Johnson, 1966).
To determine whether any group of judges
held extreme views on marriage, mean MRDQ
scores were computed for each group. The
mean scores of both groups of nontraditional
women were nearly two standard deviations
away from the midpoint of the scale. However,
the mean scores of all other groups,
including the two groups of traditional men,
were within a standard deviation of the
midpoint.

Thus, extremity and discrepancy could
account for the nontraditional women’s dislike
of dissimilar others on personal role and general
attraction. However, these factors cannot
explain the traditional men’s derogation
of dissimilar others on these same measures.
Nor can they explain the nontraditional
women’s liking of dissimilar others on functional
role attraction. This latter point is
particularly important because functional
attraction assessed the target persons’ suitability
to perform roles (e.g., debater) that are
similar to the communicator roles for which
derogation effects have been obtained. In
summary, extremity and discrepancy could
account for part of the present findings but
not their overall pattern.
Conclusion


McGuire (1968) has argued that the prin-
ciples governing human affairs are more likely
to be contingent-interactive effects rather than 
condition-free main effects. In this regard, 
the present study represents a modest be- 
ginning in the area of interpersonal attrac-
tion. That is, the present study demonstrated
that interpersonal attraction can be a multi- 
variate function of the type of person making 
the judgment, the type of person being 
judged, and the type of dimension upon which 
the judgment is made. As investigators con- 
tinue to pursue the interactive nature of in- 
terpersonal attraction more fully, it is hoped 
that the principles underlying evaluative 
judgments of others will begin to emerge 
more explicitly. 
Simple Minitheories of Love


P. M. Bentler and G. J. Huba
University of California, Los Angeles


Two causal models of love were developed as alternatives to the Tesser and
Paulhus theory. These models were tested for adequacy of fit using maximum
likelihood methods. While the Tesser-Paulhus models can be rejected empirically,
the alternate models provide acceptable statistical representations of the
variables measuring love at two times separated by 2 weeks. One formulation is
based on the idea of unidimensionality of interpersonal attraction. The second
formulation represents a refinement of the Tesser-Paulhus view of love. The
formulation based on the idea that interpersonal attraction is primarily a uni-
dimensional construct provides the more parsimonious and interpretable theory.
Causal modeling methods have been used
in this journal to evaluate a theory of love 
(Tesser & Paulhus, 1976). As a method of 
theory testing with correlational data, causal 
modeling represents a set of statistical tech- 
niques for investigating sample data in the 
light of hypothesized population models of 
the influence of certain variables on others. In 
general, a model represents a simultaneous 
series of statements about the regression of 
particular variables on various other (“causal,” 
explanatory) variables. The most popular 
form of causal modeling, path analysis, is 
thus a sophisticated form of simultaneous 
multiple correlation/regression analysis, while 
recent developments in causal modeling em- 
bed path analysis into a factor-analytic type 
of framework. In such a framework, the re- 
gressions are at the level of the unmeasured, 
latent variables or factors, and the factors 
themselves are related by factor-analytic as- 
sumptions to the observed, manifest variables 
in the system. The goals of causal modeling 
include both the testing of a proposed model 
against data and the development of models 
that adequately account for data. The inter- 
relations among causal modeling, data analy-
sis, theory testing, and construct validation
have been discussed by Bentler (1978).
Although causal models must be evaluated
by a variety of criteria including meaning-
fulness, they can be tested statistically by
the chi-square statistic: The hypothesis that
the sample variances and covariances are
drawn from a population having a hypothe-
sized causal structure is tested against the al-
ternative hypothesis that the variables are
simply correlated. If the chi-square is large
relative to degrees of freedom, the proposed
causal structure must be rejected, because the
observed sample data would be extremely
unlikely to be obtained if the hypothesized
model were true in the population. The chi-
square statistic, in turn, is developed on the
basis of various mathematical and statistical
assumptions. A very general mathematical
system of equations for causal models (Bent-
ler, 1976) reduces to a set of equations de-
scribed by Wiley (1973) and Jöreskog (1977),
for which a computer program, LISREL,
exists, to obtain maximum-likelihood esti-
mates of model parameters and chi-square
statistics (Jöreskog & Sörbom, 1976). For
a proof of these relations, see Bentler and
Woodward (1978); for a more general model
and computer program, see Weeks (1978).


and to evaluate two new minitheories.
Tesser and Paulhus (1976) ob-
tained the variances and covariances among
four measures associated with the concept of
love, assessed for 202 subjects on two occa-
sions separated by 2 weeks. The first variable
indicates how often the participant thought
about a target love person during the previ-
ous 2 weeks (T1 and T2, where “1” and “2”
indicate the measurement occasion). Love
(L1 and L2) was assessed by Rubin’s (1970)
nine-item love scale. Reality constraints (RC1
and RC2) were assessed via a scale of new
information obtained about the target per-
son that confirmed (high RC) or contradicted
(low RC) expectations. Finally, dating (D1
and D2) was indexed by the subject's report
of number of dates with the target person
during the previous 2 weeks. Any proposed
miniature theory should account for the vari-
ances and covariances among variables on
both occasions in a statistically acceptable
manner; the means of the variables are ir-
relevant to the causal modeling formulation.

Tesser and Paulhus developed two causal
models for their data. In these models, all
variables on the first measurement occasion
are allowed to correlate freely, and certain
variables at the second occasion are func-
tions of particular first-occasion variables,
some simultaneous Occasion 2 influences, and
residual or unknown effects. The details of
the Tesser-Paulhus (1976) original model
(p. 1098) and the final model (p. 1100)
need not be described here, since, unfor-
tunately, the Tesser-Paulhus models do not
fully fit their data. Although Tesser and
Paulhus (1976, 1978) apparently believe that
a successful model has been developed to
fully describe the measures and the 36 vari-
ances and covariances, the models only fit 28
of the dispersions, and arguments about the
meaning of the models (Smith, 1978; Tesser
& Paulhus, 1978) are limited to the subset
of relationships. It can be shown that the
original model, fit to all the data, yields as
an index of fit a χ² value of 25.2, with 9 de-
grees of freedom, indicating that the sample
covariance matrix is extremely unlikely (p
= .003) to be drawn from a population hav-
ing the proposed causal structure. Although
Tesser and Paulhus’s final model is reason-
able for 28 of the 36 variances and covari-
ances (i.e., by excluding reality constraints
on the second testing), it is not capable of
explaining all the data. When reality con-
straints on the second occasion are included,
one obtains a χ² of 55.6, with 14 degrees of
freedom (p < .0001), thus requiring similar
rejection of the final model.¹
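
The two rejection decisions above rest on upper-tail chi-square probabilities. As a rough check, the following Python sketch recomputes them with the standard library only. The helper name `chi2_sf` is hypothetical (it is not part of LISREL or of the original analysis), and it assumes integer degrees of freedom so that the closed-form series for the incomplete gamma function applies.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variate with integer df.

    Hypothetical helper for illustration only. Uses the finite series for the
    regularized upper incomplete gamma function Q(df/2, x/2).
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: Q(m, h) = exp(-h) * sum_{k=0}^{m-1} h^k / k!, with m = df / 2
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= h / k
            total += term
        return math.exp(-h) * total
    # Odd df: start from Q(1/2, h) = erfc(sqrt(h)) and add one term per step
    total = math.erfc(math.sqrt(h))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * h ** (k - 0.5) / math.gamma(k + 0.5)
    return total


# Fit statistics quoted in the text: (label, chi-square, degrees of freedom)
reported = [
    ("Tesser-Paulhus original model", 25.2, 9),
    ("Tesser-Paulhus final model", 55.6, 14),
]

for label, chi2, df in reported:
    print(f"{label}: chi2({df}) = {chi2}, p = {chi2_sf(chi2, df):.4f}")
```

Under these assumptions the first value rounds to the reported .003 and the second falls well below .0001, which is all the argument in the text requires.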

Since the Tesser-Paulhus theory is not 
consistent with the data, one may inquire 
whether any other theory could be developed 
that would be successful in describing their 
data. Such a theory should properly be based 
on prior psychological theorizing. The pur- 
pose of this note is to explore two alternative 
miniature theories of love, based on research 
in interpersonal attraction, and to test these 
models in the Tesser-Paulhus data using 
maximum-likelihood estimation and chi-square 
goodness-of-fit tests. The first formulation is 
based on the well-known idea of unidimen- 
sionality of interpersonal attraction, whereas 
the second approach is a refinement of the 
Tesser-Paulhus view of love. Both formula- 
tions will be shown to be capable of explain- 
ing all the data involved. 


A Unidimensional, Latent-Variable Approach 
to Love 


Interpersonal attraction is often defined
as “attitudinal positivity” (Huston & Lev-
inger, 1978), which implies a general con-
struct of favorability-unfavorability towards
a partner. Consequently, it is not surprising
that “most researchers have assumed that
attraction is a unidimensional variable” (Ber-
scheid & Walster, 1978, p. 3). As a conse-
quence, one possible model for the Tesser-
Paulhus love data would propose that the
four variables being measured represent in-
dicators of generalized love and attraction


¹ It should be noted that the general mathematical
models and computer programs for their implementa-
tion were not widely distributed at the time Tesser
and Paulhus did their research. Given the theoretical
formulations and software available at the time, the
Tesser-Paulhus solution of deleting one variable was
quite appropriate, and their model fits the data
within that context. Newer methodology, however,
makes it possible to develop a more complete and
robust model for these data.

Figure 1. Schematic representation of a simple latent-
variable (λ) model linking general attraction (A),
love (L), thought (T), reality constraints (RC), and
dating (D) at two different times (1 and 2).


toward a partner. Unfortunately, such a
model was never evaluated for the Tesser-
Paulhus data. In such a model, one would
suppose that all of the observed variables are
reflections of a single latent construct (a
“common factor”) of attraction (A) towards
the target person. Additionally, it seems rea-
sonable to posit that each of the variables
also measures its own special quality or at-
tribute (a “unique factor”), so that one can
propose that the four overt variables on the
first occasion represent the five latent vari-
ables of general attraction, A1, and the specific
attributes T1, L1, RC1, and D1. That is, each
observed variable is composed of, or in part
attributable to, general attraction and uncor-
related specific influences including measure-
ment error. Since the times of measurement
were separated by the short period of 2 weeks,
the most simple unidimensional model of
these love measures would additionally pro-
pose that the identical measurement struc-
ture exists on the second occasion (i.e., the
loadings of the four variables on the com-
mon and unique factors are the same at both
times). Furthermore, the most parsimonious
way of interrelating the measures across time
is to posit a simple reliability model, in
which the second-occasion constructs are
linear functions of only the associated first-
occasion construct plus a residual. Thus, for
example, T2 = wT1 + e, where w is a weight
and e is a residual (i.e., the unique part of
thinking about an individual at the second
time is adequately predicted by regressing it
only on the unique part of thinking about
the individual at the earlier time). This par-
ticular latent-variable model does not ade-
quately fit the data, χ²(15) = 40.5, p < .001,
thus lending some credence to the Berscheid-
Walster (1978) criticisms of simple unidi-
mensionality. However, some slight modifica-
tions of this specialized model, constraining
the parameters of measurement structure to
be identical, do yield a formulation based
on the unidimensionality of attraction that
adequately fits all the observed variances and
covariances.
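
The measurement idea in this paragraph, that each observed variable is a weighted sum of general attraction and an uncorrelated specific factor, implies a simple variance decomposition. The Python sketch below works it through under assumptions the text does not state explicitly: both latent factors are taken to be uncorrelated with unit variance, the loadings are the Table 1 estimates as read from the printed table, and the pairing of the first four loadings with the common factor and the last four with the specific factors is an inference. The names `LOADINGS`, `implied_variance`, and `implied_covariance` are hypothetical.

```python
# Loadings as (common-factor loading, unique-factor loading), read from
# Table 1 of the source. Assumed pairing: first four parameters load on
# general attraction, the next four on the variable-specific factors.
LOADINGS = {
    "T": (3.18, 1.83),    # thought
    "L": (16.16, 10.31),  # love
    "RC": (0.29, 1.77),   # reality constraints
    "D": (1.52, 2.48),    # dating
}


def implied_variance(common: float, unique: float) -> float:
    """Variance of x = common * A + unique * U.

    Assumes (for illustration) that A and U are uncorrelated with unit variance.
    """
    return common ** 2 + unique ** 2


def implied_covariance(common_x: float, common_y: float) -> float:
    """Covariance of two variables that share only the unit-variance factor A."""
    return common_x * common_y


variances = {name: implied_variance(c, u) for name, (c, u) in LOADINGS.items()}
for name, value in variances.items():
    share = LOADINGS[name][0] ** 2 / value
    print(f"{name}: implied variance = {value:.2f}, share due to attraction = {share:.2f}")

cov_TL = implied_covariance(LOADINGS["T"][0], LOADINGS["L"][0])
print(f"implied cov(T, L) = {cov_TL:.2f}")
```

If the assumptions hold, the implied variances can be compared informally with the first-occasion variances printed in Table 2; the sketch is a plausibility check, not a re-estimation of the model.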

Figure 1 presents the modified unidimen-
sional reliability model. Since it yields a non-
significant goodness-of-fit index, χ²(19) =
24.2, p = .19, one can conclude that the ob-
served sample data are consistent with the
population model shown. The estimates for
the model parameters, their standard errors,
and critical ratios (formed by dividing the
parameter estimates, minus the null value of
zero, by the respective standard errors) are
shown in Table 1.


Table 1
Latent-Variable Model Parameter Estimates,
Standard Errors, and Critical Ratios (CR)


Parameter    Estimate    SE


Within-occasion parameters

λ1            3.18      .18
λ2           16.16     1.10
λ3             .29      .10
λ4            1.52      .18
λ5            1.83      .14
λ6           10.31      .71
λ7            1.77      .09
λ8            2.48      .13


Cross-time parameters

λ9             .70      .08
λ10            .20      .05
λ11            .30      .05
λ12           −.28      .08


Residual parameters

The figure uses a convention, now quite
standard in the methodological literature, of
representing observed variables by squares.
The latent constructs, which are unmeasured,
are represented in circles. The entire mini-
theory proposes directional causal influences,
with the causal flow originating in the latent
variables and yielding the overt variables (the
measurement model) as well as cross-time ef-
fects on the latent variables (the structural
model). The coefficients λi represent the pa-
rameters of the model to be estimated (the
estimates are given in Table 1). It can be
seen that at both occasions, there are five
latent constructs, the specific constructs as-
sociated with each variable (e.g., T1) and
the general attraction construct, Ai. The pa-
rameters of the measurement structure, λ1 to
λ8, are constrained to be identical on both
occasions. The cross-time causal effects are
given by parameters λ9 to λ12 (with the two
λ10 parameters set equal), while λ13 to λ17
represent regression residuals (i.e., the un-
predictable parts of the Time 2 latent con-
structs). The specific estimated values for
the parameters, as reported in Table 1, can
be used to reproduce the data to within sta-
tistical accuracy. However, the important
feature of Table 1 lies in the critical ratios
(CR), which are like familiar z ratios,
representing a ratio of the parameter estimate
minus the null value of zero to its standard
error. Using most familiar cutoff values for
this ratio shows that all parameters are im-
portant to the model.
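
The critical ratio described above is an ordinary estimate-over-standard-error statistic. A minimal Python sketch follows, assuming a two-sided standard normal reference distribution and the conventional 1.96 cutoff (the text says only "most familiar cutoff values"). The helper names `critical_ratio` and `two_sided_p` are hypothetical, and the example inputs are the first Table 1 row as printed (estimate 3.18, standard error .18).

```python
import math


def critical_ratio(estimate: float, std_error: float, null_value: float = 0.0) -> float:
    """(estimate - null value) / standard error, as described in the text."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    return (estimate - null_value) / std_error


def two_sided_p(z: float) -> float:
    """Two-sided tail probability of a standard normal variate (assumed reference)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


cr = critical_ratio(3.18, 0.18)
print(f"CR = {cr:.2f}, exceeds 1.96: {abs(cr) > 1.96}, p = {two_sided_p(cr):.3g}")
```

Because the printed estimates and standard errors are rounded to two decimals, a ratio recomputed this way can differ slightly from the critical ratio the authors obtained from unrounded values.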

The substantive meaning of the simple
model in Figure 1 is the following: The gen-
eral attraction construct on the second occa-
sion, A2, is caused by a simple weighted sum
of general attraction (A1) and a residual
(λ13). That is, general attraction is stable
over a 2-week period and is not altered by
specific aspects of thought, love, reality con-
straints, or dating. Each specific construct on
the second occasion is given by a weighted
sum of the identical construct on the previous
occasion plus a residual (e.g., L2 = λL1 +
λ15). Thus, each of the specific parts of the
measured variables is stable across 2 weeks.
The single exception is given by thought on
the second occasion (T2), which is indepen-
dent of a causal influence of thought at the
initial time (T1); apparently, the specific
aspect of thoughts (T2) cannot be predicted
from any variable or construct at initial test-
ing. Finally, there are two cross-variable
causal influences across time, with T1 affect-
ing RC2 and RC1 influencing L2. The mean-
ing of these influences appears to be that the
more subjects think about their dates at Time
1, the more they subsequently believe that
recent, new information about their friend
contradicts their expectations; and the more
they initially believe that information about
their friend confirms expectations, the more
they tend to love their friend later.

It is possible that the first cross-variable 
effect reflects an unrealistic quality associ- 
ated with excessive initial brooding, which 
becomes contradicted with real, intervening 
experience, and that the second cross-variable 
effect reflects the loving consequences of early 
cognitive security. While the last two links 
were necessitated by the data and not postu- 
lated in the simple reliability model, the af- 
fective consequences of having expectations 
validated seem consistent with the laboratory 
studies cited by Clore and Byrne (1977) 
and are well justified by Tesser and Paulhus 
(1976). Furthermore, the link between large 
amounts of initial thought about the partner 
and believing that new information contra- 
dicts expectations reflects a point that is made 
by Tesser and Paulhus in their original article
(1976, p. 1095) in citing Kerckhoff and
Davis (1962), who note that a “point so often
stressed in the literature is that couples go
through a period of idealization and percep-
tion distortion which may lead to disillusion-
ment (or ‘reality shock’) at a later date” (p.
302). It must be remembered, furthermore,
that the cross-time links are between those
parts of the variables not attributable to gen-
eral attraction or evaluation; within this con-
text, both links seem quite interpretable. Of
course, these explanations associated with the 
model must await testing with further data 
in the context of a larger causal model includ- 
ing measures of the hypothesized influences. 
Although our statistical methods are reas- 
suring regarding the internal validity of the 
model, external validity or generalizability to 
Figure 2. Schematic representation of a simple mani-
fest-variable (γ) model linking love (L), thought (T),
reality constraints (RC), and dating (D) at two dif-
ferent times (1 and 2).


other situations must await further research.
Smith (1978) has made a similar point.

A Manifest-Variable Approach to Love 


Although the above model based on gen-
eral attraction is a viable one, it is also pos-
sible that a model without such a construct
may fit the data. Among such models, the
simplest one is surely a reliability model for
the variables across time. This model proposes
that the four variables at Time 1 could be
freely intercorrelated but that all causation
across time consists of simple links such that
a given variable depends causally only on its
predecessor (e.g., T2 = wT1 + e). As might
be expected from the previous result, this
simultaneous regression model for the four
observed, manifest variables measured on two
occasions also could not account for all fea-
tures of the data, χ²(18) = 139.7, p < .0001.
Cross-variable causation is obviously essen-
tial to enable the model to fit the data, and
thus a 24-parameter model was developed
that would contain the features of the simple
model, while incorporating some of the most
necessary causal influences proposed by Tesser
and Paulhus. This model is presented in Fig-
ure 2, and the parameter estimates, standard
errors, and critical ratios are given in Table
2. This model, when estimated by maximum
likelihood, has an acceptable index of fit,
χ²(12) = 15.9, p = .20. Thus, although the
manifest-variable model has 24 parameters
as compared to the 17 of the previous latent-
variable model, the manifest-variable model
also cannot be statistically rejected for these
data.

The manifest-variable model can be con-
sidered to be in the spirit of the Tesser-Paul-
hus approach. It considers the variables at
Time 1 to be exogenous (noncausal) vari-
ables, with parameters γ1 to γ4 associated
with covariances among the variables and γ5
to γ8 with their variances (not shown in the
figure, but given in the table). Unlike the
Tesser-Paulhus model, however, the current
model found that the estimated covariances
of T1 and RC1 and of RC1 and D1 could be


Table 2
Manifest-Variable Model Parameter Estimates,
Standard Errors, and Critical Ratios (CR)


Parameter Estimate SE CR
Exogenous parameters
44s 79 cil 
‘a 4.62 44 
7 49.79 5.98 
y 446 - m- 10 
= (T) 12.89 ? 10 
Mo eiL) 370.50 $6.62 re 
yı = (RC) 3.24 ~ jou 
ve = o*(D;) 8.24 — 
Cross-time parameters
4 06 oH 
d 66 06 1s 
yu 17 07 168 
ve 79 05 108 
mu 08 01 3 
yu —1.09 oe 1M 
ys -13 vt 33! 
yu 1.52 “ak > 
Endogenous parameters
mm eA oh F 
a —1.78 8 Ri 
03 d 
Y ‘ A 
Residual parameters
5.04 50 9. 
bes 108.04 11.35 gst 
n 2.84 30 100 
set to zero, so that no parameters are needed
for these covariances. (Stated differently, the
observed correlations of .129 and .086 could
not be considered to be significantly different
from 0.) The causal connections across time
again required consideration of effects of
specific variables on themselves (γ9 to γ12)
as well as some cross-variable effects (γ13
to γ16). Two such effects, associated with
γ15 and γ16, were described in the context of
the previous model and would seem to have
a similar interpretation in the current model.
The remaining effects seem to indicate that
loving leads to thinking about one’s loved
one (γ13) and that extensive thinking at Time
1 leads to reduced loving at Time 2. It is as
if thinking does not have a positive effect on
loving at a later time; but this hypothesis is
contraindicated by the endogenous, within-
Time-2 effect represented by γ17, which is
positive. The remaining endogenous effects,
γ18 to γ20, are not consistent with the Tesser
and Paulhus concept of retrospective assess-
ment on T, RC, and D, which they suggested
would cause love, a contemporaneously mea-
sured variable. Rather, we find L2 having an
effect on RC2 (γ18) and on D2 (γ20), in ad-
dition to the effect of RC2 on L2 (γ19). These
causal paths, though statistically necessary
to the model, do not appear to us to be par-
ticularly interpretable. Finally, the param-
eters γ21 to γ24 reflect the residuals of the
causal model.


Discussion 


Although we have developed a path analy-
sis model at the level of the manifest variables
that would indeed account for the ob-
served data reported by Tesser and Paulhus, we
agree with Smith (1978) that causal models
of love should not be formulated at the level
of overt variables. Not only is the model of
Figure 2 likely to be modified solely as a
consequence of arbitrary changes in the re-
liability of the variables involved, but con-
sidering all variables at Time 1 as exogenous
and outside the explanatory system seems to
be shortsighted. The unidimensional latent-
variable model not only accounts for the overt
variables at Time 1, it also provides for a
very simple representation of effects across
time and does not require complicated re-
ciprocal causation among variables at Time 2.
Since the latent-variable model provides an
accurate description of the data with seven
fewer parameters, parsimony also favors the
latent-variable model. Furthermore, this la-
tent-variable model is consistent with many
research approaches to interpersonal attrac-
tion as a unidimensional construct (cf. Hus-
ton, 1974). Finally, although the manifest-
variable model is statistically acceptable for
the data, there are clear theoretical ambi-
guities that make this formulation of dubious
validity. For example, the opposing signs as-
sociated with the effects of thought on love
(γ14 vs. γ17) are not consistent with the ob-
served positive intercorrelations among all
four thought and love variables. While the
correlation between thought at Time 1 and
love at Time 2 is .612, the associated regres-
sion weight is −1.09, suggesting an uninter-
pretable suppressor effect.
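
The parsimony comparison can be checked by simple counting: p observed variables supply p(p + 1)/2 distinct variances and covariances, and a model's degrees of freedom equal that count minus its number of free parameters. The Python sketch below applies this to the figures quoted in the text. The function names are hypothetical, and the 17-parameter count for the latent-variable model is inferred here from its 19 degrees of freedom.

```python
def n_moments(p: int) -> int:
    """Number of distinct variances and covariances among p observed variables."""
    return p * (p + 1) // 2


def n_free_parameters(p: int, df: int) -> int:
    """Free parameters implied by a chi-square test with df degrees of freedom."""
    return n_moments(p) - df


# Four measures on two occasions give eight observed variables.
print(n_moments(8))  # 36 variances and covariances
print(n_moments(7))  # 28 once second-occasion reality constraints is dropped

latent = n_free_parameters(8, 19)    # latent-variable model, chi-square with 19 df
manifest = n_free_parameters(8, 12)  # manifest-variable model, chi-square with 12 df
print(latent, manifest, manifest - latent)
```

The difference between the two counts reproduces the "seven fewer parameters" cited in favor of the latent-variable model.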

A glance at Figures 1 and 2 will provide 
rapid visual evidence for the dissimilarities 
among the models, but certain similarities 
should also be observed. In particular, both 
models propose an effect for a given variable 
on itself across time. In the manifest-variable 
model, all these effects are important, while in 
the latent-variable model all but the reliabil- 
ity of the specific parts of thought (T1 and
T2) are necessary. The two cross-variable
paths (λ12 vs. γ15 and λ10 vs. γ16) have also
been noted and discussed previously. Since 
these effects appear to be similar across two 
completely different formulations of the phe- 
nomenon, it is likely that they represent fun- 
damental features of the experience of love. 

In contrast to the original models of Tesser 
and Paulhus (1976), which were debated by 
Smith (1978) and Tesser and Paulhus (1978), 
the current formulations are consistent with 
all the data collected. While we do share the 
concerns of Smith (1978) that causal model- 
ing should be theory based and a priori, the 
unidimensional model we have presented is 
consistent with current psychological formu- 
lations of general interpersonal attraction as 
operationalized in existing laboratories around 
the country, as well as being a slight modifi-
cation of the most mathematically simple
model for the data. Of course, it must be 
recognized that in moderately sized samples 
such as the present one, minor modifications 
can be made in the present models that would 
yield indexes of fit approximately comparable 
to those we have obtained; such modifica- 
tions should, however, incorporate all the 
measurement and structural influences found 
to be necessary here. 

Since causal models of nonexperimental 
data can be modified substantially when 
other crucial causal influences are simulta- 
neously considered, the current models must 
not be considered as final theoretical for- 
mulations of love, but rather as potentially 
interesting representations of the Tesser-
Paulhus data.² As additional longitudinal
data are collected about love, it should prove
useful to include the present models as a
subset of some larger, more comprehensive
formulation in order to account for changes
over time in attraction and emotional re-
sponses to significant others, and to attempt
to reject the proposed simple model in ex-
plaining comprehensively the phenomena of
interpersonal attraction and love.


² Unfortunately, the sample of Tesser and Paulhus
is too small to allow a cross-validation study of the
models.


Statistically Combining Independent Studies:
A Meta-Analysis of Sex Differences in Conformity Research


Traditional (literary) reviews of research in social psychology are compared with
a statistical approach. It is concluded on both abstract and practical grounds
that adoption of the statistical approach would lead to theoretical progress for
the research area covered. A meta-analysis “package” is described and then
applied to the question of whether there are sex differences in degree of con-
formity. The meta-analysis is yoked to a literary analysis, and conclusions of
differing direction and confidence appear. Problems in application are en-
countered, and appropriate courses of action are discussed. Finally, limitations on
the power of the procedure are outlined.


The traditional way to review research in
social psychology has been to take a literary
approach. That is, summary statements about
research areas are usually based on impres-
sions gleaned by the reviewer from a reading
of related studies. This article takes issue
with the efficiency of such an approach. An
example of a statistical technique for com-
bining the results of independent experiments
is provided, and this approach is contrasted
with a literary summary. The quantitative
procedures describe numerically the charac-
teristics of a body of evidence and give a
probability level related to the observed pat-
tern of results. Although the actual numeri-
cal manipulations require little statistical
sophistication, the theoretical progress pos-
sible through their employment may be great
(progress is here defined as more precise and
confident statements about segments of our
world). The reasons for believing that prog-
ress will follow the adoption of quantitative
procedures are both abstract and practical.
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Cooper, Center for Research in Social Behavior, Uni- 
versity of Missouri, 111 East Stewart Road, Co- 
lumbia, Missouri 65201. 
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The abstract reasons are presented first, and 
the practical reasons are given after the il- 
lustration is in hand. 

Theoretical progress within a scientific 
discipline is intimately tied to two other kinds 
of advancement, one relating to methodology 
and the other relating to the volume of re- 
search being produced. Methodologically, ad- 
vancement can occur in either research de- 
sign or analysis, but both advancements typi- 
cally involve the development of increasingly 
precise measurement instruments. In the case 
of design refinements, precision of measure- 
ment permits the observation of events that 
were inaccessible before. An example of this 
kind of advancement would be, say, the in- 
troduction of videotape to the study of non- 
verbal behavior. Analysis refinements, on the 
other hand, allow for the more exact descrip- 
tion of observed phenomena. Such an advance 
would be the development of factorial analy- 
sis of variance. Factorial designs allow re- 
searchers to reduce error (in comparison to 
one-way designs) and to study interaction, 
An increased ability to define a concept op- 
erationally, to measure it soundly and power- 
fully, and to study it in relation to other con- 
cepts will clearly influence related theoretical 
developments. 

The second kind of advance related to 
theoretical progress is the accumulation of 
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cation of the most mathematically simple 
model for the data. Of course, it must be 
recognized that in moderately sized samples 
such as the present one, minor modifications 
can be made in the present models that would 
yield indexes of fit approximately comparable 
to those we have obtained; such modifica- 
tions should, however, incorporate all the 
measurement and structural influences found 
to be necessary here. 

Since causal models of nonexperimental 
data can be modified substantially when 
other crucial causal influences are simulta- 
neously considered, the current models must 
not be considered as final theoretical for- 
mulations of love, but rather as potentially 
interesting representations of the Tesser— 
Paulhus data.” As additional longitudinal 
data are collected about love, it should prove 
useful to include the present models as a 
subset of some larger, more comprehensive 
formulation in order to account for changes 
over time in attraction and emotional re- 
sponses to significant others, and to attempt 
to reject the proposed simple model in ex- 
plaining comprehensively the phenomena of 
interpersonal attraction and love. 


2 Unfortunately, the sample of Tesser and Paulhus 


is too small to allow a cross-validation study of the 
models, 
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Traditional (literary) reviews of research in social psychology are compared with 
a statistical approach. It is concluded on both abstract and practical grounds 
that adoption of the statistical approach would lead to theoretical progress for 
the research area covered. A meta-analysis “package” is described and then 
applied to the question of whether there are sex differences in degree of con- 
formity. The meta-analysis is yoked to a literary analysis, and conclusions of 


differing direction and confidence appear. Problems in application are en- 


countered, and appropriate courses of action are discussed. Finally, limitations on 


the power of the procedure are outlined. 


The traditional way to review research in 
social psychology has been to take a literary 
approach. That is, summary statements about 
research areas are usually based on impres- 
sions gleaned by the reviewer from a reading 
of related studies. This article takes issue 

l with the efficiency of such an approach. An 
example of a statistical technique for com- 
bining the results of independent experiments 
is provided, and this approach is contrasted 
with a literary summary. The quantitative 
procedures describe numerically the charac- 
teristics of a body of evidence and give a 
probability level related to the observed pat- 
tern of results. Although the actual numeri- 

cal manipulations require little statistical 
Sophistication, the theoretical progress pos- 
sible through their employment may be great 
(progress is here defined as more precise and 
Confident statements about segments of our 
world). The reasons for believing that prog- 
Tess will follow the adoption of quantitative 
Procedures are both abstract and practical. 


i ES are extended to Robert Arkin, Bruce 
ia le, Anthony Greenwald, Robert Rosenthal, and 
reviewers for comments on this article. 
eee for reprints should be sent to Harris M. 
ce eter for Research in Social Behavior, Uni- 
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The abstract reasons are presented first, and 
the practical reasons are given after the il- 
lustration is in hand. 

Theoretical progress within a scientific 
discipline is intimately tied to two other kinds 
of advancement, one relating to methodology 
and the other relating to the volume of re- 
search being produced. Methodologically, ad- 
vancement can occur in either research de- 
sign or analysis, but both advancements typi- 
cally involve the development of increasingly 
precise measurement instruments. In the case 
of design refinements, precision of measure- 
ment permits the observation of events that 
were inaccessible before. An example of this 
kind of advancement would be, say, the in- 
troduction of videotape to the study of non- 
verbal behavior. Analysis refinements, on the 
other hand, allow for the more exact descrip- 
tion of observed phenomena. Such an advance 
would be the development of factorial analy- 
sis of variance. Factorial designs allow re- 
searchers to reduce error (in comparison to 
one-way designs) and to study interaction. 
An increased ability to define a concept operationally,
to measure it soundly and powerfully,
and to study it in relation to other concepts
will clearly influence related theoretical
developments.

The second kind of advance related to 
theoretical progress is the accumulation of
evidence concerning a specific topic. Repeated
observation allows for greater confidence in 
uncovered relations, when observations show 
consistent or similar results. Repeated re- 
sults also increase confidence in the general- 
izability of observed phenomena, with in- 
tended and unintended differences in replica- 
tions testifying to the robustness of relations. 
Statements that can be made with great con- 
fidence undoubtedly motivate future research 
to take a more precise theoretical tack. Meth- 
odological and accumulation advances are 
connected, in that increased precision of mea- 
surement reduces the number of observations 
necessary for relations to be accepted at a 
given level of confidence. In sum, then, it can 
be argued that refined measurements and the 
accumulation of aligned studies are two major 
spurs to theoretical progress. 

With these relations in mind, one can argue 
that the traditional literature review in social 
psychology lacks analytic precision in at least 
three ways. First, literary reviews may be 
more susceptible to the idiosyncrasies of a
particular reviewer’s perspective than are re- 
views using consensually validated statistical 
procedures. As Gene Glass (1976) states: 


A common method for integrating several studies 
with inconsistent findings is to carp on the design or 
analysis deficiencies of all but a few studies—those 
remaining frequently being one’s own work or that 
of one’s students or friends—and then advance the
one or two “acceptable” studies as the truth of the
matter. (p. 4)


Second, the literary review usually ignores 
the issue of relationship strength by rarely 
assessing the size of the effect under study. 
Lastly, the typical review imprecisely weights 
conclusions with respect to the volume of 
available evidence (Light & Smith, 1971). 

It seems, then, that replacing the present
literary model of research review with a statistical
model would almost necessarily increase
measurement precision. In addition,
this particular replacement relates directly to
the issue of accumulating evidence. Areas
having statistical combinations, therefore,
should benefit theoretically from the more
exact description, sounder inferences, and
greater confidence in observed phenomena
that such a technique allows. Again to quote
Glass (1976), statistical analysis “connotes
a rigorous alternative to the casual, narrative
discussions of research studies which
typify our attempts to make sense of the
rapidly expanding research literature” (p. 3).

A methodological advance cannot occur
unless the development of a field is compatible
with it. Many areas of social psychology are,
I feel, compatible with the advancement proposed
here. This judgment is based on two
observations. First, certain areas have generated
enough research on identical or conceptually
similar topics that statistical procedures
can be reliably applied to them.
Many areas in social psychology can now
cite 20, 40, or more similar hypothesis tests,
as the example to be presented here illustrates.
These areas seem “ready” for more precision.
A second reason for the proposed compatibility
between field and method is an increasing
awareness concerning the importance of legitimate
data analysis. Partially in response to
McGuire’s (1973) call for redirection, social
psychology training programs tend now to
require advanced course work in statistical
procedures (I am myself a product of this
“trend”). In addition, the major journals in
the field increasingly present research analyzed
with complex statistical techniques.
While the procedures presented here require
only a knowledge of the fundamental probability
axioms behind nearly all statistical
analyses, this increased exposure to statistics
may remove some of the “inertia” that typically
surrounds a newly introduced analytic
procedure.

The term newly introduced was used above
because the methods presented do not make
their debut in this article. Before turning


methodologies are available than the methods
presently used for drawing inferences from
multiple tests of related hypotheses. Theory,
therefore, may benefit if these
precise analytic techniques are employed
more frequently. Certain areas may be
STATISTICALLY COMBINING STUDIES


“ready” for such precision because the vol- 
ume of research available permits statistical 
testing and because researchers may now feel 
quite positive toward rigorous analytic pro- 
cedures. 


Meta-Analytics 


Glass (1976) has suggested that data anal- 
ysis can be categorized into three distinct ac- 
tivities: primary, secondary, and meta-analy- 
sis. Primary analysis occurs when the data of 
a given study are analyzed for the first time. 
Secondary analysis is the reanalysis of data. 
This may occur so that more appropriate 
statistical procedures can be applied or so 
that new questions can be answered with an 
old data set. Finally, meta-analysis occurs 
when the results of independent experiments 
are combined “for the purpose of integrating 
the findings” (p. 3). 

Methods for meta-analysis have a history 
as long as most commonly used statistical 
procedures. In 1938, Fisher and Pearson 
both presented methods for obtaining an 
overall probability for a series of studies 
(Fisher, 1938, pp. 104-106; Pearson, 1938). 
In social psychology, Mosteller and Bush 
(1954) presented a meta-analysis procedure 
in the Handbook of Social Psychology. While 
early instances of meta-analysis applications 
are rare, they now appear with some frequency.
Rosenthal (1976, p. 441) presents a
meta-analysis testing the influence of interpersonal
expectations on behavior. Over 300
studies are included. He reports that nearly
50,000 failures to find an influence would
be needed to reverse, at the .05 level of significance,
the conclusion that expectations affect
behavior. Glass (1976) cites meta-analyses
relating to teaching methods, television
instruction, and socioeconomic status relations
to IQ, to name a few. Smith and Glass
(1977) present a meta-analysis integrating
the literature on the outcomes of psychotherapy.
Their conclusion, based on 375 studies,
is that “therapy of any type under . . .
[illegible] conditions can be expected to move
[illegible] client from the 50th to the 75th
percentile of the untreated population” (Glass,
1976, p. 7). Glass also concludes that these
studies exhibit only a “trivial” advantage for
behavioral over nonbehavioral therapies. 

As the above examples illustrate, a meta- 
analysis is conducted on a group of studies 
related either because (a) they share a com- 
mon conceptual hypothesis or (b) they share 
operations for the realization of the inde- 
pendent or dependent variables, regardless of 
conceptual focus. Meta-analytics result in a 
single set of numbers describing the body of 
literature involved. Typically, a significance 
level is generated, which addresses the ques- 
tion “What is the probability that a set of 
studies exhibiting these results could have 
been generated if no actual relation existed?” 
Other statistics are usually presented also, 
such as Rosenthal’s assessment of the number 
of studies needed to reverse a conclusion and 
Glass’s description of the percentile relation 
between experimental groups. 

Meta-analytic procedures have been de- 
veloped that range in sensitivity. The least 
sensitive test simply “counts” the number of 
probability levels below .50 (chance for the 
predicted direction) and asks whether this 
number is too large. The most sensitive pro- 
cedures require that the raw data from in- 
cluded experiments be available (Light & 
Smith, 1971). Rosenthal (1978) has pre- 
sented a detailed description of available 
meta-analytic methods, including their rela- 
tive advantages and limitations. This effort 
will not be repeated here. The method to be 
illustrated presently asks only that probabil- 
ity levels from included studies be available. 
These levels are first transformed to Z scores 
(standard normal deviates) and then related 
to the number of studies involved. This pro- 
cedure is presented because it is more precise 
than the counting method and more accessible 
than methods using raw data. In addition, 
the illustrated method requires no assump- 
tions concerning data beyond those typically 
made by the authors of the original studies 
(i.e., linearity, normality, homogeneity of
variance, independence of events) and re- 
quires that each study exhibit unit variance. 
The method was introduced by Stouffer in 
1949 and then modified by Mosteller and 
Bush in 1954.
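
The counting method mentioned above can be sketched in a few lines of Python. This is an illustration only: the function name `counting_test`, the decision to set aside studies reported at exactly .50, and the example probability levels are assumptions made here and are not taken from the procedures reviewed by Rosenthal (1978).

```python
from math import comb


def counting_test(p_values):
    """Counting (sign-test) method for a set of one-tailed probability levels.

    Each p value is assumed to be computed in the predicted direction, so a
    value below .50 counts as support for the hypothesis.
    Returns (supporting, counted, tail_probability).
    """
    # Assumption of this sketch: a study at exactly .50 favors neither
    # direction and is set aside, as in an ordinary sign test.
    informative = [p for p in p_values if p != 0.5]
    counted = len(informative)
    supporting = sum(1 for p in informative if p < 0.5)
    # Exact probability of at least `supporting` successes in `counted`
    # fair coin flips.
    tail = sum(comb(counted, k) for k in range(supporting, counted + 1)) / 2 ** counted
    return supporting, counted, tail


# Hypothetical probability levels, invented for illustration only.
supporting, counted, tail = counting_test([0.01, 0.20, 0.40, 0.60])
print(supporting, counted, tail)
```

With these invented values, three of the four studies fall below .50, and the chance of three or more such results out of four is 5/16, or about .31. The example also shows why the method is insensitive: a study at .01 and a study at .40 count the same.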

The weighted and unweighted Stouffer
method. The Stouffer (1949, p. 45) method
for combining studies uses the following formula:

$$Z_{ma} = \frac{Z_{s1} + Z_{s2} + \cdots + Z_{sn}}{\sqrt{N_s}}, \tag{1}$$

where $Z_{ma}$ = the standard normal deviate for
the meta-analysis; $Z_{s1}, \ldots, Z_{sn}$ = the standard
normal deviate for each included study; and
$N_s$ = the total number of studies included.
The method asks that the reviewer take the 
following steps: (a) Record the probability 
level reported in each study associated with 
the relevant hypothesis; (b) turn to the Z 
score (standard normal deviate) table ap- 
pearing in any elementary statistics text and 
find the Z score associated with each proba- 
bility level; (c) sum these Z scores and 
divide this sum by the square root of the 
number of studies involved; and (d) refer 
this Z score back to the table and record the 
appropriate probability level. This probability 
describes the likelihood that the included 
studies’ results were generated by chance. 
The user should be alert to the fact that a 
direction for the hypothesis should be chosen 
before the procedure is begun (Mosteller & 
Bush, 1954, p. 330). This means that a reported
two-tailed p value supporting the hypothesis
under consideration should be halved
before the associated Z score is retrieved. A 
disconfirming p value should be halved and 
its associated Z score subtracted from the sum 
of the supporting studies.
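
Steps (a) through (d) can be sketched in Python, with the standard library's normal distribution standing in for the printed Z table. The function names `study_z` and `stouffer` and the three example reports are hypothetical and serve only to illustrate the arithmetic.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def study_z(p, two_tailed=False, supports=True):
    """Signed Z score for one study (steps a and b in the text).

    p          -- reported probability level, strictly between 0 and 1
    two_tailed -- True if p was reported two-tailed; it is then halved
    supports   -- False if the result runs against the chosen direction
    """
    one_tailed = p / 2.0 if two_tailed else p
    z = STANDARD_NORMAL.inv_cdf(1.0 - one_tailed)
    return z if supports else -z


def stouffer(z_scores):
    """Unweighted combination: sum of Z scores over the square root of N_s."""
    z_ma = sum(z_scores) / sqrt(len(z_scores))
    p_ma = 1.0 - STANDARD_NORMAL.cdf(z_ma)  # one-tailed overall probability
    return z_ma, p_ma


# Hypothetical reports: two supporting studies and one disconfirming study.
z_scores = [
    study_z(0.05, two_tailed=True),                  # about 1.96
    study_z(0.01, two_tailed=True),                  # about 2.58
    study_z(0.20, two_tailed=True, supports=False),  # about -1.28
]
z_ma, p_ma = stouffer(z_scores)
print(round(z_ma, 2), round(p_ma, 3))
```

For these invented reports the combined Z is about 1.88, which corresponds to a one-tailed probability of about .03. Passing the disconfirming study as a negative Z score is equivalent to subtracting it from the sum of the supporting studies.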

The Stouffer method has many advantages, 
not the least of which is ease of employment. 
As Mosteller and Bush (1954) point out, 
however, even more precision can be built 
into this procedure if the reviewer weights 
(or differentially assigns importance to) stud- 
ies within the body of literature. The Stouffer 
method allows the reviewer to weight each 
standard normal deviate by the size of the 
sample on which it is based or by other de- 
sirable characteristics such as internal or 
external validity. Thus, a reviewer can make 
adjustments based on methodological rigor 
before the statistic is generated. Studies that 
are viewed as “sound” can be weighted more 
heavily than “unsound” studies, if the re- 
viewer so desires. 


The formula for the weighted Stouffer
method, when sample size is the weighting
factor, is:

$$Z_{ma} = \frac{N_{s1} Z_{s1} + N_{s2} Z_{s2} + \cdots + N_{sn} Z_{sn}}{\sqrt{N_{s1}^2 + N_{s2}^2 + \cdots + N_{sn}^2}}, \tag{2}$$

where $N_{s1}, \ldots, N_{sn}$ = the number of subjects
in each included study, and all other quan- 
tities are defined as before. In this case, the 
modification requires that each study’s Z 
score be multiplied by the number of sub- 
jects in the study before the scores are 
summed. This numerator is then divided by 
the square root of the squared and summed 
individual study sizes. The formula given
here is weighted by sample sizes rather than
another criterion because sample size is the
most objective weighting criterion. Other less
objective criteria have been convincingly employed,
however (e.g., Stickell, 1963). The
sample-size weighting means that larger studies
are given greater importance. In comparison
to the unweighted method, the weighted
method will lead to a lower probability level
when larger studies produce larger Z scores
and a higher probability when smaller studies
produce larger Z scores.
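
The weighted procedure described above (each Z score multiplied by its weight, the products summed, and the sum divided by the square root of the summed squared weights) can be sketched as follows. The function names and the two example studies are hypothetical. Any other weighting criterion, such as a rating of methodological soundness, could be passed in place of sample size.

```python
from math import sqrt


def weighted_stouffer(z_scores, weights):
    """Weighted combination: sum(w * Z) / sqrt(sum(w ** 2))."""
    if len(z_scores) != len(weights):
        raise ValueError("one weight is needed for each Z score")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    return numerator / sqrt(sum(w * w for w in weights))


def unweighted_stouffer(z_scores):
    """Unweighted combination, for comparison."""
    return sum(z_scores) / sqrt(len(z_scores))


# Hypothetical studies: the larger study shows the larger Z score.
z_scores = [2.0, 0.0]
sample_sizes = [100, 20]
weighted = weighted_stouffer(z_scores, sample_sizes)
unweighted = unweighted_stouffer(z_scores)
print(round(weighted, 2), round(unweighted, 2))
```

Here the weighted Z (about 1.96) exceeds the unweighted Z (about 1.41), so the weighted probability level is lower, as expected when the larger study produces the larger Z score. Reversing the two sample sizes reverses the comparison.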

Studies needed to reverse a conclusion.
We can state with some confidence that no
literature review will uncover every relevant
hypothesis test ever conducted (Rosenthal,
in press, has referred to this as the “file
drawer problem”). The Stouffer method has
the advantage of allowing us to easily compute
what could be called a fail-safe N.
With the unweighted procedure we can ask the
question “How many studies totaling a null
hypothesis confirmation would be needed to
reverse the conclusion that a relationship exists?”
To state this numerically, we ask, “How
many studies showing a summed Z score
of zero would be necessary to raise the
probability level to above .05, or [illegible]
.10?” For the p < .05 level, the fail-safe
formula would be:


$$N_{fs.05} = \left(\frac{Z_{s1} + Z_{s2} + \cdots + Z_{sn}}{1.645}\right)^2 - N_s, \tag{3}$$

where $N_{fs.05}$ = the number of additional
studies needed to increase the
probability to above .05, and all other quantities
are defined as before. This formula is a
direct algebraic manipulation of the un- 
weighted formula. It requires that (a) the 
sum of the Z scores for known studies be 
divided by 1.645 (the Z score associated with
p < .05, one-tailed); (b) this quantity be
squared; and (c) the number of known stud- 
ies be subtracted from it. The resulting quan- 
tity is the number of additional studies show- 
ing a summed null relationship needed to 
increase the meta-analysis probability above 
the .05 level of significance. Other significance 
levels can be used by replacing the denomi- 
nator on the right with the critical Z value. 
Obviously, the fail-safe N should be presented 
only when a significant overall probability 
level can be reported. 
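
The three steps just listed can be sketched in Python. The function name `fail_safe_n`, its `critical_z` parameter, and the example Z scores are hypothetical. The result is generally not a whole number, and how to round it is left to the reviewer.

```python
from math import sqrt


def fail_safe_n(z_scores, critical_z=1.645):
    """Number of additional studies with a summed Z of zero needed to pull
    the unweighted combined Z down to `critical_z`."""
    return (sum(z_scores) / critical_z) ** 2 - len(z_scores)


# Hypothetical set of four studies, each with Z = 1.645.
known = [1.645] * 4
extra = fail_safe_n(known)

# Check: the extra null studies leave the sum of Z scores unchanged and
# enlarge only the number of studies under the square root.
combined = sum(known) / sqrt(len(known) + extra)
print(round(extra, 6), round(combined, 3))
```

For the invented set, 12 additional studies summing to a null result would bring the combined Z back to 1.645. Other significance levels can be examined by passing a different `critical_z`.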

The fail-safe N is an important descrip- 
tive statistic in that it allows a reader to 
easily evaluate the “strength” exhibited in a 
review against the felt completeness of the 
reviewer’s sampling procedure. However, a
limitation of the fail-safe N should also be
pointed out. It is an appropriate guide for
the reader only if the assumption of a summed 
null relation in undiscovered studies is ac- 
ceptable. It is always possible that a smaller 
number of studies exist that have a summed 
Z score of equal but negative value to the 
sum of those reviewed. The plausibility of 
this alternative also should be considered by a 
reader. The illustration below includes both
a weighted and unweighted Stouffer procedure
and a fail-safe N. All three procedures are
recommended for future reviews. 

Describing relationship strength. Observers
of psychological research have long bemoaned
the infrequent reporting of measures
of relationship strength, or effect sizes, in the
literature (e.g., Bakan, 1966; Levy, 1967).
As O’Brien and Shapiro (1968) point out,
the usual criterion for gauging the importance
of a study, its level of significance, is a number
bound tightly to the size of the sample
involved. Significance testing, then, which
compares an observed relation to the chance
of no relation, becomes less informative as
evidence supporting a phenomenon accumulates.
The question turns from “whether” to
“how much.” Effect-size measures that are
free of sample-size influence, therefore, ought
to be an integral part of all literature reviews,
especially when the reviewer concludes that, 
in fact, a relation exists. 

There is an effect-size measure appropriate 
for every kind of research design or analysis. 
Jacob Cohen (1977) presents a definitive 
catalogue of these measures. It is the present 
author’s hope that Cohen’s work will con- 
tinuously grow in importance. Here Cohen’s 
d index is illustrated: 


$$d = \frac{M_1 - M_2}{SD}, \tag{4}$$


where $M_1$ and $M_2$ = the means of the two
groups compared, and SD = the pooled esti- 
mate of the population standard deviations. 
The d index is chosen because it is simple, it 
is “scale free” (i.e., the standard deviation 
adjustment indicates that studies using differ- 
ent measurement scales can be compared), 
and it is applicable to a plurality of the stud- 
ies in social psychology. It requires that only 
two means be compared, but even in most 
factorial designs we find that each indepen- 
dent variable is represented by two condi- 
tions. If more complex hypotheses are under 
consideration, alternate effect-size indexes are 
readily available in Cohen’s book.

The d index is a number that tells how 
far apart the means of our two groups are in 
terms of their common standard deviations. 
For instance, if d= .2, it means that two 
tenths of a standard deviation separate the 
two sample means. In Cohen’s terminology, 
a d index equaling .2 corresponds to a “small” 
effect size, with .5 being “moderate,” and .8 
being “large.” This classification may leave 
something to be desired in terms of intuitive 
appeal. For this reason, Cohen (1977) also 
presents “percentage of distribution overlap” 
measures like the one in use by Smith and 
Glass (1977), cited earlier. The overlap measure
illustrated here, called U3, tells what
percentage of the population with the smaller 
mean is exceeded by 50% of the population 
with the larger mean. Put more conversa- 
tionally, “how much of the smaller-meaned 
group is exceeded by the average person in 
the larger-meaned group?” A table for converting
the d index to U3 is presented by Cohen
(1977, p. 22). As an example, assume Groups
A and B have been compared and a d index
of .2 is found, with Group A exhibiting the 
larger mean. The U3 measure tells us that
the average person in Group A has a score 
greater than 58% of the people in Group B. 
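
The d index and its conversion to the overlap measure can be sketched in Python. The sketch assumes, as the tabled values do, that both populations are normal with a common standard deviation, in which case the overlap percentage is the standard normal cumulative probability at d. The function names and the group means are hypothetical.

```python
from statistics import NormalDist


def d_index(mean_1, mean_2, pooled_sd):
    """Difference between two group means in pooled standard deviation units."""
    return (mean_1 - mean_2) / pooled_sd


def u3_percent(d):
    """Percentage of the lower-meaned population exceeded by the average
    member of the higher-meaned population (normality assumed)."""
    return 100.0 * NormalDist().cdf(d)


# Hypothetical Groups A and B: means of 10.4 and 10.0, pooled SD of 2.0.
d = d_index(10.4, 10.0, 2.0)
print(round(d, 2), round(u3_percent(d)))
```

With these invented means, d is .2 and the overlap measure rounds to 58%, which agrees with the example given in the text.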

The reader may wonder how this effect-size 
analysis can be carried out when means 
and/or standard deviations are not reported 
in an experimental write-up. Friedman (1968) 
has shown that when these quantities are not 
reported, the d index can be estimated from 
the formula: 


$$d = \frac{2t}{\sqrt{df}}, \tag{5}$$


where t = the reported t value (or $\sqrt{F}$ if
analysis of variance is used), and df = the
associated degrees of freedom for error. Instances
will exist where neither means, standard
deviations, nor t values will be reported.
This problem is addressed in conjunction with 
the illustration presented below. 
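
A minimal sketch of this estimate follows, assuming the conversion takes the form d = 2t divided by the square root of the error degrees of freedom. The function names, the `first_group_higher` flag, and the example values are hypothetical. Because an F ratio carries no sign, the direction of the difference must be supplied separately.

```python
from math import sqrt


def d_from_t(t, df_error):
    """Estimate the d index from a reported t value and its error df."""
    return 2.0 * t / sqrt(df_error)


def d_from_f(f, df_error, first_group_higher=True):
    """Same estimate from a two-group F ratio (one numerator degree of
    freedom), for which the square root of F equals the absolute t value."""
    t = sqrt(f)
    return d_from_t(t if first_group_higher else -t, df_error)


# Hypothetical report: t = 2.0 on 16 degrees of freedom.
print(d_from_t(2.0, 16), d_from_f(4.0, 16))
```

Both calls give a d index of 1.0 for the invented report, since an F of 4.0 with one numerator degree of freedom corresponds to a t of 2.0.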

To summarize, a meta-analysis “package” 
has been suggested that includes the follow- 
ing statistics: (a) a Stouffer Z score; (b) a 
weighted Stouffer Z score; (c) a fail-safe N; 
(d) a d index of relationship strength; and 
(e) a percentage of overlap of distributions 
measure, U3. The remaining tasks, then, are
to illustrate the application of these procedures
and to address the problems one encounters
in doing so.


An Illustration: Do Females Conform More 
Than Males? 


An area in social psychology where enough
data have been accumulated to warrant meta- 
analysis is that of sex differences in conform- 
ity. Maccoby and Jacklin (1974) uncovered 
47 independent tests of this relationship. Sex 
and conformity is chosen here because (a) 
the independent variable realization need not 
be explained; (b) the research designs and 
dependent measures are familiar to most social
psychologists, so limited space is needed
for their description; and most importantly,
(c) Maccoby and Jacklin’s literary combination
of studies provides an excellent counterpart
to the statistical combination. By choosing
an area in which a traditional review
already exists and by employing only the
earlier cited studies in the meta-analysis, it
is possible to compare the conclusions and
precision of the two techniques. Essentially,
Maccoby and Jacklin’s conclusions are supported,
though differences in some inferences
are also apparent.

As hypotheses for the present study, then, 
Maccoby and Jacklin’s (1974) interpretation 
of the sex differences in conformity literature 
will be used. The following passages, I believe,
best summarize their inferences from
the body of evidence:


1. All that can be said at this point is that the results
are inconsistent, and that when Asch experiments
are considered in conjunction with other conformity
studies, neither sex shows an overall tendency
to be more susceptible to social influence from
peers. (p. 271)

2. In face-to-face encounters, when an individual
must openly disagree with the opinion of others, as
is the case in the Asch situation, women somewhat
more often conform to others’ judgments, but the inconsistency
of the findings and the frequency of sex
similarity are striking. (p. 268)

3. Studies on the effects of persuasive communications
have a more consistent outcome . . . . On a [illegible]
grounds no overall difference in susceptibility to social
influence would be expected. In any case . . .
none is found. (p. 268)


Hypothesis confirmation in this meta-analysis
means not only that the predicted relations are
supported but also that Maccoby and Jacklin’s
earlier inferences have been upheld.

The studies sampled. The literature serving
as data was Maccoby and Jacklin’s
cited studies. No attempt was made to supplement
or update the earlier review. This was
done so that the statistical combination could be
yoked to the literary combination according to
evidential base.

The data included 35 studies appearing in
the Journal of Personality and Social Psychology,
5 in Developmental Psychology, [illegible] in
Child Development, 2 in the Journal of [illegible]
Psychology, and one in the Journal of Experimental
Child Psychology. The mean year of
publication is 1969, with the earliest year
of appearance 1958, and the latest [illegible].
Twenty-one of the studies were published between
1968 and 1970. Twenty-four studies
used college-age samples, with the rest choosing
younger subjects.

The most striking characteristic of the 
sample is that only published studies are in- 
cluded. Maccoby and Jacklin (1974) note 
that “some information is lost owing to the 
selection that occurs in the publication process 
itself” (p. 3). Most distressing is that, as 
Maccoby and Jacklin point out, “negative 
findings probably constitute the most frequent 
omissions from the literature” (p. 5). Essen- 
tially, two kinds of bias are operating here. 
First, we are less likely to see null results in 
published form. Second, we are less likely to 
see results that contradict previously pub- 
lished findings. This selection bias operates 
both on the decision of researchers to submit 
reports and on journal reviewers. The bias in 
the sampling process highlights the importance 
of calculating the fail-safe N before making
inferences. Readers of a review should ask 
themselves, “How many studies are missing, 
given the sampling procedure employed?” A 
fail-safe N surpassing this estimate should increase
the readers’ confidence in the conclusions.

Sampling procedures more exhaustive 
than Maccoby and Jacklin’s choice are available.
Convention programs, computer-assisted
searches, and Dissertation Abstracts International
are all alternate or supplemental sources
to professional journals. Rosenthal (1976)
applies the interesting technique of separately
analyzing dissertation research, under the assumption
that this sample contains the least
“confirmatory” bias (mitigating the “confirmatory”
bias in published papers, by the
way, is the argument that published research
is conducted by the more capable researchers).

The analysis design. To test the first hypothesis
(that no overall difference in conformity
associated with gender appears in the
data set), the studies were analyzed as a
single group. To test the second and third
hypotheses (relating to situation distinctions),
the studies were divided into three
categories. The first category contained Asch-type
(Asch, 1956) experiments, in which subjects
had face-to-face interaction with either
other subjects or an experimental confederate.
The second category included studies where
conformity to a fictitious group norm or to a
simulated other was measured. The final cate- 
gory was composed of persuasive communica- 
tion or social reinforcement (Insko, 1965; 
Insko & Cialdini, 1969) experiments. The 
Asch-type studies used primarily agreement 
with a group of peers’ erroneous perceptual 
judgment as the dependent variable. Fictitious 
group norm studies used primarily opinion 
change, and persuasive communication studies 
used mainly attitude change to measure con- 
formity. It is not clear whether Maccoby and 
Jacklin’s (1974) distinction concerning face- 
to-face interaction intended the categoriza- 
tion employed here. Their remark may refer 
to both in-person and simulated-other experi- 
ments. In either case, the results support the 
face-to-face distinction. 

This three-way classification led to the ex- 
clusion of three studies because they were 
correlational (Baumrind & Black, 1967; Sa- 
morjzyk, 1969; Whiting, 1963) and two 
studies because they related to behavior in 
experimental situations (Cook et al., 1970; 
Dillehay & Jernigan, 1970). In addition, three 
studies were excluded because they were con- 
ducted on non-American subject populations 
(Beloff, 1958; Bronfenbrenner, 1970; Frager, 
1970). Of the excluded studies, Bronfen- 
brenner reported that Russian males were 
more conforming than Russian females, and 
Cook et al. reported that females were more 
compliant to experimental deception. No other 
excluded study reported significant sex differ- 
ences. Finally, the study by Sistrunk and Mc- 
David (1971) was not included because of 
its special place in the literature, to be noted 
later. The exclusions encompassed 19% of 
the sample but permitted the following gen- 
eralizations to be made about the studies re- 
maining in the analysis: (a) They all involved 
peer influence; (b) they all were performed 
using white Americans (or results reported 
here are relevant to this population); (c) 
they all included an experimenter-controlled 
introduction of the conformity-producing 
stimulus. 

After the 9 studies not fitting into a cate- 
gory were dropped, 38 studies remained. 
Sixteen studies fell into the Asch-type category,
8 into the group norm category, and 14
into the persuasive communication category.
Tables 1 through 3 present this categorization.
The mean year of journal appearance
for each grouping is nearly identical. This
indicates that there was no research paradigm
shift in sex and conformity studies over
the covered period. Also, the tables show that
the average Asch-type experiment was conducted
using about 17 more subjects than
the average group norm experiment (Asch-type
= 141.1; group norm = 124; persuasive
communication = 132.7). All three experiment-size
means, however, are well within a standard
deviation of each other (Asch-type SD
= 87.6; group norm SD = 72.8; persuasive
communication SD = 88.5). This analysis
of category differences helps rule out alternative
hypotheses, should different results be
found for different categories.

Table 1
Asch^a-Type Conformity Experiments

                                               Maccoby &
                                               Jacklin's
                                               (1974)    Retrieved  Retrieved  Retrieved
Author                           Year     N    p         Z score    d index    U3 (%)

Allen & Newston                  1972     366  .50       -.84       -.08       46
Bishop & Beckman                 1971     144  .50       -1.64      -.28       39
Costanzo & Shaw                  1966     96   .50       1.55       .35        63
Hamm & Hoving                    1969     42   .0025     3.29       1.06       85
Mock & Toddenham                 1971     280  .05       3.29       .47        67
Carrigan & Julian                1966     96   .005      3.29       .96        83
Dodge & Muench                   1969     122  .50       0          .00        50
Schneider                        1970     96   .50       .67        .16        56
Landsbaum & Willis               1971     64   .50       .01        .00        50
Gerard, Wilhelmy, & Conolley     1968     154  .025      1.96       .33        63
Stricker, Messick, & Jackson     1967     190  .50       0          .00        50
Endler                           1966     120  .025      2.05       .41        66
Glinski, Glinski, & Slatin       1970     56   .50       0          .00        50
Hollander, Julian, & Haaland^b   1965     112  .025      2.58       .49        6[illegible]
Julian, Regula, & Hollander      1968     240  .025      3.29       .56        [illegible]
Willis & Willis                  1970     96   .50       0          .00        50

M                                1968.9   142.1 .32      1.22       .28        60

Note. Positive Z values denote more female conformity.
^a Asch (1956).
^b Hollander et al. used a chi-square statistic; Z, d, and U3 are estimated from their probability level.

Obtaining data about experiments. A number
of problems arise when the reviewer attempts
to extract from each experimental
report the statistics needed for combination
(here, t values and p levels). The most glaring
problem is that many studies finding no
differences do not report the statistics upon
which this inference is based. In the present
analysis, 12 of the 38 categorized studies
(32%) reported no statistics. Creating even
more frustration was the finding that 4
studies (11%) reported t (or F) values without
mentioning the direction of the effect.
In each case, a simple finding of “no difference”
is reported. Although the premium on
space in professional journals is appreciated,
much of this “planned obsolescence” that researchers
build into their reports can be
avoided. Instances were found in which analysis
of variance summary tables were presented
with nonsignificant F values replaced
by lines. This does not save space. The statement
that “no differences were found between
Groups A and B” can be replaced with a
statement that “a nonsignificant [illegible]
was found, with Group A higher than Group
B.” At the least, directional statements along
with their statistical basis ought to appear
for all main effects tested. The increased
length that this may entail will be amply
repaid through the increased amount of evidence
the report will contribute to the area of interest.
It should also be noted that the
Experiments With Fictitious Group Norms 
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Retrieved Retrieved Retrieved 
Author Year N t Z score d index Us (%) 

Hamm 1970 216 -50 0 0 50 
Sistrunk & McDavid 1971 80 50 0 0 50 
Lefurgy & Woloshin 1969 53 -50 0 0 50 
Wyer 1966 80 .50 1.64 .41 65 
Sampson & Hancock 1967 251 —.025 —2.05 =.30 —62 
Sistrunk* 1971 64 .50 0 0 50 
, Endler & Hoy 1967 120 -50 52 0 50 
Wyer 1968 128 -50 =.33) —.04 50 
M 1968 124 44 —0.03 01 51 


42%-incomplete reporting rate is probably 
higher than the rate that will be found in 
other areas. This is because the experiments 
included here typically studied sex differences 
only secondarily. Most researchers will take 
‘More care in reporting primary findings. 

The best strategy for dealing with incom-
plete data reports is to assume an exact find-
ing of no difference. Thus, such studies are
treated below as having uncovered a t value of
0 with a probability of .50. As meta-analysis
combined probabilities become lower, we can
assume that this procedure increases “con-
servative” bias in our estimations and infer-
ences.


A second difficulty encountered when sta-
tistics are being collected is that some studies
employ multiple dependent measures. In such
an event, alternate courses of action are open
to the reviewer. First, all dependent variables
can be included. This procedure means that
some experiments will contribute to the over-
all probability more than others. If the re-
viewer does not wish this to be so, each de-
pendent variable within a study can be
weighted so that the sum of each study’s mea-
sures is equal. A third strategy is to choose
a major dependent variable (i.e., the one
closest to the sample description) and exclude
others. This is the strategy used here.
Problems related to multiple measures also
arise when (a) separate papers report
dependent measures collected on the


Note. Negative Z values denote more male conformity. 
*Sistrunk reports more conformity among black females than males but no differences among whites. 


same subjects or when (b) the same re-
searcher (laboratory) reports several studies.
These problems (as well as the sampling bias 
referred to above) diminish the independence 
of data involved in meta-analysis. At present, 
the more elegant statistical adjustments for 
nonindependence of observations (most not- 
ably, adjustments in degrees of freedom) can 
only be subjectively approximated in meta- 
analytics. This can be accomplished through 
the reviewers’ weighting decisions. Thus, if 
Researcher X has produced four similar stud- 
ies in a 2-year period, a cautious meta-analyst 
might individually weight these studies less 
heavily than a similar single study produced 
from a different laboratory. 

Another problem in data collection arises 
when both parametric and nonparametric sta- 
tistics appear in the literature. Only one in- 
stance of nonparametric use was encountered 
in this analysis (Hollander, Julian, & Haa- 
land, 1965). The procedure used was to ex- 
trapolate all important values from the re- 
ported probability level. A more equal mix of 
statistics based on differing assumptions 
might lead to the use of other procedures (i.e., 
combining raw data from expected-frequency- 
type studies and treating these separately). 

A final problem in data retrieval relates to 
levels of significance. Care should be taken 
to recompute the relevant significance levels 
presented in the individual reports. Probabil- 
ities falling between .05 and .01 are often 
reported as p < .05. For testing the null hy- 
pothesis, this procedure is legitimate, but 
more accurate Z scores will be retrieved if 
more accurate probabilities are computed. 
With the introduction of high-speed com- 
puters, the rounding of p levels is practiced 
less often. However, the reviewer cannot easily 
discern whether this has been done. 
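
The gain from recomputing probabilities can be illustrated with a short sketch. It assumes that each one-tailed p level is converted to a standard normal deviate before combination; the function name and the example p of .03 are hypothetical and are not taken from any study reviewed here.

```python
from statistics import NormalDist


def z_from_one_tailed_p(p: float) -> float:
    """Return the standard normal deviate whose upper-tail area equals p."""
    return NormalDist().inv_cdf(1.0 - p)


# A result reported only as "p < .05" must be entered at the .05 boundary,
# whereas a recomputed exact probability yields a different Z score.
for p in (0.50, 0.05, 0.03, 0.01):
    print(f"one-tailed p = {p:.2f} -> Z = {z_from_one_tailed_p(p):.2f}")
```

Under these assumptions, a finding entered at the rounded level of .05 contributes a Z of about 1.64, whereas the same finding with a recomputed p of .03 would contribute a Z of about 1.88.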

The reader may have noted that the pro- 
cedures outlined above (gathering t values, 
examining means, recomputing probabilities) 
represent a significant departure from tradi- 
tional literary reviews regardless of whether 
studies are statistically combined. It would 
be interesting, then, to gauge separately the 
increased precision created by means of data 
retrieval and by the combining procedures. 
For this reason, the Stouffer methods have 
been performed on both the p levels gathered 
by the procedures outlined above and the p 
levels presented in Maccoby and Jacklin’s 
(1974) annotated bibliography. These latter 
probabilities take one of four values: .50, 
.05, .01, and .001. 

Results. In brief, the hypotheses can be
stated as follows:

1. When the body of evidence is examined
as a whole, neither sex will show an overall
tendency to be more susceptible to social
influence from peers.

2. In face-to-face encounters, women more
often than men will conform to others’ judg-
ments. However, (a) there will be incon-
sistency of the findings, and (b) the fre-
quency of sex similarity will be striking.

3. Studies on the influence of persuasive
communications will show no overall differ-
ence in susceptibility to social influence.

Testing Hypothesis 1 first entailed com-
bining Z scores over 46 studies (with Sistrunk
& McDavid, 1971, excluded). Table 4 presents
most resulting statistics (all p levels are re-
ported as one-tailed). When the probabilities
reported in Maccoby and Jacklin (1974)
served as data, the unweighted Stouffer
method led to a Zma of 2.13 with a one-tailed
probability of less than .015. The weighted
method produced a Zma of 1.18 with p < .12.

The Stouffer procedures were then repeated
with only the Z scores of the 38 categorized
studies included. Using the Maccoby and
Jacklin probabilities led to an unweighted
Zma of 2.35. Such a score would occur less
than 1% of the time given that females ac-
tually do not conform more than males.
When the Z scores retrieved from the primary


Persuasive Communication (Attitude Change) Experiments



                                         Maccoby &
                                         Jacklin’s
                                         (1974)     Retrieved  Retrieved  Retrieved
Author                    Year     N     p          Z score    d index    U3 (%)

Dean, Austin, & Watts     1971    161    .50         .52        .12
Eagly & Telaak            1972    118    .50         0          0          50
Greenbaum                 1966    100    .50         0          0          50
Insko                     1965     70    .50         0          0          50
Insko & Cialdini          1969    152    .50         0          0          50
Linder, Cooper, & Jones   1967     53    .50         0          0          50
Marquis                   1973     52    .50         0          0          50
Nisbett & Gordon          1967    152    .50         0          0          50
Osterhouse & Brock        1970    160    .50         0          0          50
Rosenkrantz & Crockett    1965    176    .50         0          0          50
Rule & Rehill             1970     90    .50         0          0          50
Silverman                 1968
Silverman, Shulman, &
  Wiesenthal              1970
Worchel & Brehm           1970

M                         1968.8  132.7  .50         .14        .02

Note. Positive Z values denote more female conformity.


Table 4
Summary of Combined Z Score Results of Three Types of Experiments

                                                           Fictitious            Persuasive
                  Overall              Asch*-type          group norm            communication
              Maccoby &            Maccoby &            Maccoby &            Maccoby &
              Jacklin              Jacklin              Jacklin              Jacklin
Procedure     (1974)   Retrieved   (1974)   Retrieved   (1974)   Retrieved   (1974)   Retrieved

Unweighted    2.35     3.44        4.11     4.87        -.69     -.08        .00      .51
  p           .01      .0005       .00005   .000005     .75      .53         .50      .31
Weighted      1.92     3.04        3.56     4.03        -1.23    -.91        .00      1.10
  p           .03      .002        .0005    .00005      .89      .82         .50      .14
Fail-safe N   40       129         84       125


Note. All Z scores are based on 38 studies; ps are one-tailed.
*Asch (1956).


sources were used, the unweighted Zma rose
to 3.44, an event that chance produces less
than 5 in 10,000 times. If Maccoby and Jack-
lin’s data are used, the fail-safe N, the num-
ber of null-totaling studies needed to raise
the observed probability to above 5 chances
in 100, is 40. Based on the retrieved data, a
fail-safe N of 129 is found.
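
The fail-safe N figures can be approximated with the sketch below. It assumes the usual Stouffer-based formula, in which null-summing studies are added until the combined Z drops to the one-tailed .05 critical value, and it assumes the result is rounded up to a whole study. The function name is illustrative only.

```python
from math import ceil, sqrt
from statistics import NormalDist


def fail_safe_n(z_ma: float, k: int, alpha: float = 0.05) -> int:
    """Number of additional studies summing to Z = 0 needed to lift the
    combined one-tailed probability above alpha.

    z_ma is the unweighted combined Z and k the number of studies combined.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha)  # about 1.645 for alpha = .05
    sum_z = z_ma * sqrt(k)                      # recover the sum of the Z scores
    return ceil((sum_z / z_crit) ** 2 - k)


print(fail_safe_n(2.35, 38))  # Maccoby and Jacklin probabilities, 38 studies
print(fail_safe_n(3.44, 38))  # retrieved probabilities, 38 studies
```

With the unweighted Zma values of 2.35 and 3.44 over 38 studies, this sketch returns 40 and 129, the values given in the text.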

It seems, then, that taken as a whole, the
body of evidence supports the conclusion that
females conform more than males. If we re-
strict our inference to American subjects in
experimenter-controlled, peer-influence situa-
tions, we can make this statement more con-
fidently. This conclusion is at odds with Mac-
coby and Jacklin’s, which saw “no overall
tendency” for females to be more conform-
ing. In this instance, the literary and statisti-
cal combinations led to differing interpreta-
tions of the data. In addition, the retrieving
of data from primary sources led to a more
confident conclusion of sex differences than
did the annotated bibliography. The p levels
produced by the two gathering methods differ
by 89 uncovered studies needed to reverse the
conclusion.

Tests of Hypotheses 2 and 3 show smaller
differences between the approaches.
Hypothesis 2 entailed analyzing separately
the Asch-type and group norm experiments.
The unweighted Zma for only Asch-type ex-
periments was 4.11 with Maccoby and Jack-
lin’s p levels and 4.87 with retrieved p levels.
Both occur by chance less than 1 in 10,000
times. The weighted Zma scores are again
lower, being 3.56 and 4.03, respectively, but
are still highly significant (p < .0005 and p
< .00005, respectively). The fail-safe N at
p = .05 for Maccoby and Jacklin’s data was
84, whereas the fail-safe N for retrieved data
exceeded this by 41.

Analysis of group norm experiments pro- 
duced the first set of nonsignificant results. 
As Tables 2 and 4 show, there was a slight 
tendency for group norm studies to report 
more conforming in males than females (as 
the negative Zma score indicates). This tend- 
ency was far from significant, however. 
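
The group norm result can be reproduced from the retrieved Z scores listed in the fictitious group norm table. The sketch below assumes the unweighted Stouffer statistic is the sum of the Z scores divided by the square root of the number of studies, and that the weighted version uses each study's sample size as its weight. The weighting by N is an assumption of this illustration; it happens to reproduce the tabled values.

```python
from math import sqrt
from statistics import NormalDist

# (N, retrieved Z) for the eight fictitious group norm studies, as read from
# the table; negative Z values denote more male conformity.
studies = [
    (216, 0.0), (80, 0.0), (53, 0.0), (80, 1.64),
    (251, -2.05), (64, 0.0), (120, 0.52), (128, -0.33),
]


def stouffer_unweighted(zs):
    """Sum of Z scores divided by the square root of the number of studies."""
    return sum(zs) / sqrt(len(zs))


def stouffer_weighted(zs, weights):
    """Weighted sum of Z scores divided by the root of the summed squared weights."""
    numerator = sum(w * z for w, z in zip(weights, zs))
    return numerator / sqrt(sum(w * w for w in weights))


def one_tailed_p(z):
    """Upper-tail probability, i.e., in the direction of more female conformity."""
    return 1.0 - NormalDist().cdf(z)


ns = [n for n, _ in studies]
zs = [z for _, z in studies]
z_unweighted = stouffer_unweighted(zs)
z_weighted = stouffer_weighted(zs, ns)
print(f"unweighted Zma = {z_unweighted:.2f}, p = {one_tailed_p(z_unweighted):.2f}")
print(f"weighted Zma   = {z_weighted:.2f}, p = {one_tailed_p(z_weighted):.2f}")
```

Under these assumptions the sketch gives an unweighted Zma of -.08 (p = .53) and a weighted Zma of -.91 (p = .82), matching the retrieved group norm entries in Table 4.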

Hypothesis 2, then, was clearly supported. 
As Maccoby and Jacklin (1974) conclude, 
females do seem to conform more than males 
in face-to-face situations. The difference in 
conclusion strength introduced by reporting 
precision also seems negligible. What does go 
unsupported is Maccoby and Jacklin’s conclu- 
sion that the results show inconsistency. In 
face-to-face encounters, only 2 of 16 studies 
report “small” effects contradicting the over- 
all conclusion. The appearance of these 
anomalies could have been generated by un- 
discovered interactions or by chance. To label 
the overall results as “inconsistent” because of 
them is to attach undue importance to their 
probabilistic nature. 

Also unsupported is Maccoby and Jacklin’s 
conclusion that “the frequency of sex simi- 
larity” is striking. A more legitimate state- 
ment would be that the size of the sex differ- 
ence may be smaller than the reviewers an- 
ticipated or the cultural stereotype predicted. 
The effect-size analysis allows these impres- 
sions to be quantified. As Table 1 indicates, 
the average d index for Asch-type experi- 
ments was .28. This shows that on the aver- 
age, the male and female distribution mid- 
points were .28 standard deviations apart (as- 
suming that normality holds). In terms of 
overlap (U3), it is found that the average 
female conformed more than 60% of the male 
population (of course, the sample and setting 
restriction also restrict the generalizability of 
this statement). Further, since the d index 
exhibits a standard deviation of .37, we can 
expect that future experiments will exhibit d 
indexes between .08 and .48 about 95% of 
the time. Alternatively, Asch-type studies 
should show the average female conforming 
more than between 54% and 69% of the male 
sample, 95% of the time. 
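
The relation between the d index and the overlap measure U3 can be sketched as follows. The sketch assumes two normal distributions with equal variance, so that U3 is the percentage of the lower-scoring population falling below the mean of the higher-scoring one. The function name is illustrative, and computed values may differ by about a percentage point from figures read off a printed table.

```python
from statistics import NormalDist


def u3_from_d(d: float) -> float:
    """Percentage of one population exceeded by the average member of the
    other, given a standardized mean difference d (normality assumed)."""
    return 100.0 * NormalDist().cdf(d)


# Average Asch-type d index and the bounds of the interval quoted in the text
for d in (0.08, 0.28, 0.48):
    print(f"d = {d:.2f} -> U3 = {u3_from_d(d):.1f}%")
```

A d index of 0 corresponds to a U3 of 50%, which is why studies entered as exact null findings appear with a U3 of 50 in the tables.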

Hypothesis 3 is supported by the data pre- 
sented in Tables 3 and 4. The Maccoby and 
Jacklin (1974) data produced a Zma of 0, 
whereas the weighted procedure with retrieved 
data produced the largest, though still non- 
significant, Zma of 1.10 (p < .14). Neither re- 
porting precision nor method seems to in- 
fluence this conclusion. 

To summarize, then, the statistical com- 
bination of studies led to a conclusion different 
from the literary one when it showed an over- 
all tendency for females to conform more than 
males. Increased precision in data gathering 
also weighed against the null hypothesis. 
When the studies were categorized, Maccoby 
and Jacklin’s conclusion that sex differences 
would arise only in face-to-face interactions 
was upheld. However, the interpretation of 
this effect as inconsistent and exhibiting strik- 
ing similarity between the sexes was ques- 
tioned. Finally, it should be noted that, as 
one would expect, the two methods differ 
most in the “least clear” instance. Although 
in this instance the ability to categorize studies 
meant that the differing overall conclusions 


were substantively inconsequential, this may 
not always be the case. 


The Eagly (1978) Update 


Since the completion of the above meta-
analysis, another review of sex differences in
conformity research has appeared (Eagly,
1978). This literary review, based on con-
siderably more studies than Maccoby and
Jacklin’s, also proves instructive when com-
pared to meta-analytics.

Eagly provides a three-way classification
of conformity studies nearly identical to the
one employed above. With regard to Asch-
type (group pressure) studies, the available
evidence is summarized as follows: Of the 61
studies, 38 (62%) reported no difference, 21
(34%) found females to be significantly more
conforming, and 2 (3%) found males sig-
nificantly more conforming (Eagly, 1978, p.
92).

Eagly’s criterion for significance is a main
effect reaching the .05 chance level for null
acceptance. A meta-analytic chi-square pro-
duced by Eagly’s data has a value of 55.94.¹
Such a statistic is almost impossible to pro-
duce by chance (a chi-square value of 10.8
appears by chance once in 1,000 observa-
tions). To reduce the observed frequency of
sex differences to the exact chance expecta-
tion, 361 studies each showing a null rela-
tion would have to exist but be undiscovered.
About 221 undiscovered null studies would
be needed to raise the expected number of
confirmations close enough to the presently
observed number so that chance could not be
ruled out at the .05 level.
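
The chi-square value and the count of 361 undiscovered studies can be checked with a short sketch. It uses the observed frequencies for the Asch-type studies (21 significant in the female direction, 38 null, with the 2 male-direction findings set aside) and the expected frequencies of 5 and 54 described in the accompanying footnote. The function names are illustrative, and the second function assumes that "exact chance expectation" means the significant findings make up exactly 5% of all studies.

```python
def chi_square(observed, expected):
    """Pearson goodness-of-fit statistic for matched frequency lists."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def null_studies_to_chance(significant: int, total: int, alpha: float = 0.05) -> int:
    """Undiscovered null studies needed for `significant` findings to equal
    exactly the proportion alpha of all studies."""
    return round(significant / alpha) - total


observed = [21, 38]   # significant (female direction), no difference
expected = [5, 54]    # expected frequencies taken from the footnote

chi2 = chi_square(observed, expected)
print(f"chi-square = {chi2:.2f}")
print(null_studies_to_chance(21, 59))
```

Under these assumptions the sketch returns a chi-square of 55.94 and 361 undiscovered null studies, the figures reported in the text.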

For fictitious group norm studies (not in-
volving group pressure) Eagly reports 22
studies, with 2 reporting more conformity in
women and 1 reporting more conformity in
men. Eagly’s conclusion that no difference is
found here is justified through meta-analysis.
Only 19 undiscovered null studies would re-
duce the present observations to an exact
chance rate. For attitude change (persuasion)
studies, 51 studies showing no difference, 10
showing greater female persuasibility, and 1

showing greater male persuasibility are re-
ported. Eagly (1978) interprets these studies
as indicating a very slight balance favoring
female influenceability (p. 95). The associ-
ated meta-analytic chi-square statistic is 5.45,
which is significant between the .02 and .01
levels. Undiscovered null studies numbering
139 would reduce the observed frequencies to
a chance rate. About 59 null studies would
reduce the observed patterns’ relation to
chance to above the .05 level.


¹ In generating this statistic, I have used ex-
pected values of 5 for significant differences (Rosen-
thal, 1978, suggests that no expectation less than 5
be used) and 54 for null differences. These values
correspond to a .08 level of significance, implying that
the reported chi-square value is conservative. I ig-
nored the below-expectation (3%) occurrence of
greater male conformity. A similar procedure is used
throughout this discussion.

In sum, Eagly (1978) states that “the
modal finding in the conformity literature is
no difference between the sexes” (p. 93) and
that “in group pressure settings, a substantial
minority of studies has reported differences
in the female direction” (p. 93). These state-
ments are correct but may be misinterpreted.
They ought to be accompanied by a state-
ment emphasizing that except in the case of
extremely strong effects and/or large-sized
experiments, we would expect the modal re-
sult to be a null confirmation. Further, it is
clear that in the case of sex differences in
group pressure conformity, the available “sub-
stantial minority” of null rejections is so
substantial as to be almost impossible to
create by chance.


Discussion


With the illustration in hand, it seems ap-
propriate to conclude by addressing some of
the practical advantages of statistical com-
binations, in contrast to the abstract ar-
guments presented earlier. Some limitations
should also be pointed out.


The issue of inconsistency in experi-
mental findings deserves fuller examination.
Other authors make statements whose elo-
quence cannot be improved upon here. As
Taveggia (1974) offers:


A methodological principle overlooked by writers
of research reviews is that research results are
probabilistic. What this principle suggests is that, in
and of themselves, the findings of any single research
are meaningless; they may have occurred simply by
chance. It also follows that, if a large enough number
of researches has been done on a particular topic,
chance alone dictates that studies will exist that re-
port inconsistent and contradictory findings! Thus,
what appears to be contradictory may simply be the
positive and negative details of a distribution of
findings. (p. 397)


The disadvantages of maintaining our pres- 
ent use of the word inconsistent are listed by 
Light and Smith (1971): 


First, a great deal of information, much of which 
might be potentially valuable, would be thrown away. 
Second, a decision would be postponed for at least 
the length of time required by the new research. 
Third, from the point of view of the next reviewer 
of the literature, this new research would simply be 
the fourteenth in a set of studies. No matter what 
the result, for the next reviewer the contradictions 
remain. Thus, for any research, it is worth making 
an attempt to find a way to combine and reconcile 
conflicting studies. (p. 430) 


The adoption of statistical procedures, then, 
will lead to less frequent conclusions of in- 
consistency. The use of the word inconsistent 
should occur only when experiments reveal 
strong relations in opposite directions. 

When this stricter definition of inconsist- 
ency is applicable to a research area, a second 
advantage of meta-analysis will become evi- 
dent. Often, inconsistency is an indication 
that undiscovered interaction between vari- 
ables exists. Statistical combinations allow 
for the testing of certain kinds of interaction 
without running additional subjects. Experi- 
mental differences in time, location, opera- 
tional definitions, age of subject, and so on 
can be used to cluster research (Feldman, 
1971). Statistical procedures can then test 
the importance of cluster distinctions. 

Clustering experiments will also lead to 
fuller awareness of sample restrictions rele- 
vant to the results in hand. In the present ex- 
ample, categorizing research led to uncover- 
ing what is a fairly typical generalization 
restriction in social psychological research 
(i.e., the results apply to white Americans, 21 
years old or younger, in peer influence situa- 
tions). These restrictions become possible 
mediators for future research to investigate. 
They are, however, a function of the available 
evidence and not of the combining method. 

It should be emphasized that the ability to 
test certain interactions with meta-analysis 
does not mean that all problems of concep- 
tualization and methodological artifact can 
be resolved in this manner. Whether con-
formities in Asch-type and in persuasive com-
munication experiments are truly phenomena
of a similar nature is not addressed by meta-
analysis. Further, as in evaluating a single
study, alternative conceptualization of in-
cluded independent variables may rival the
one offered. Thus, Festinger and Carlsmith’s
(1959) F statistic cannot tell us whether dis-
sonance or self-perception is behind the “more
attitude change for less reward” finding. The
Zma statistic in a present-day meta-analysis
on replications of this study would not ad-
dress this issue either. Finally, studies shar-
ing a certain methodological technique be-
come open to rival hypotheses (we hope, of
course, that undiscovered method weaknesses
are distributed throughout the research pro-
cess and are equally likely to weigh against
as for a given hypothesis). The study by
Sistrunk and McDavid (1971) provides an
example of restricting an inference by vary-
ing a method. They noted that one character-
istic of Asch-type studies was a consistent
use of perceptual dependent variables. Their
results showed that face-to-face conformity
differences may be restricted to this certain
type of judgment only.

A final advantage of meta-analysis deals 
with the power of research to uncover rela- 
tionships. As earlier stated, the probability 
of a study’s finding significance is tied to the 
number of subjects involved. An example of 
this is present in Table 1. The experiments 
by Costanzo and Shaw (1966) and by Gerard, 
Wilhelmy, and Conolley (1968) uncovered ef- 
fects of almost equal magnitude. Yet, because 
Gerard et al. ran 58 more subjects, their effect 
size proved significant. Meta-analyses should, 
as a matter of course, contain effect-size esti- 


mates, and these should be used to guide fu- 
ture research. 


it possible to compare the two methods with
conceptual issues controlled. Some statistical
conclusions differed in direction and confi-
dence from the literary ones.

Pleas for the increased use of archival data
are often heard. Add one more voice, stating
that the best archival data we have are our
own records.


The concept of self-deception has received little empirical investigation despite 
its broad implications for theories of personality and consciousness. Four cri- 
teria, based on a logico-linguistic analysis, are presented as necessary and suffi- 
cient for ascribing self-deception: To be self-deceived an individual must hold 
two contradictory beliefs; these beliefs are held simultaneously; one belief is 
not subject to awareness; and the nonawareness of this belief is motivated. Two 
experiments are described that examine whether misidentifications of the voices 
of oneself and others fulfill these criteria and are, therefore, instances of self- 
deception. In Experiment 1, it was found that when subjects were incorrect in 
their self-report identifications of voices, at some level of processing correct 
identifications had been made, and these subjects held contradictory beliefs.
Obtrusive and unobtrusive measures indicated that for the most part, subjects
were not aware of committing errors. Correlational data supported the conten-
tion that the errors were motivated. In Experiment 2, the hypothesized motiva-
tional contexts for the errors were manipulated by giving subjects pretreat-
ments either of success or of failure. As predicted, the failure group committed
more misidentifications of self, whereas the success group committed more mis- 
identifications of others. Overall, the findings confirmed that some misidentifi- 

cations of voices of self and others are instances of self-deception. 


The concept of self-deception has received
much attention in philosophical and literary
writings (e.g., Camus, 1956; Fingarette,
1969; Gide, 1955; Sartre, 1958). Numerous
psychologists have also discussed self-decep-
tion in accounting for a wide range of be-
havior. Meehl and Hathaway (1946) and
Anastasi (1961) have argued that self-decep-
tion on the part of respondents contributes

This work was supported by National Science Foundation

voked the concept of self-deception as a
possible explanation of the persistence of 
subjects in maintaining hypotheses in the 
face of disconfirmation. Murphy (1970, 1975) 
related the concept of self-deception to pos- 
sible interpretations of the experimental 
findings on perceptual defense. In fact, 
Mischel (1974) has viewed all neurotic be- 
havior as instances of self-deceptive acts. 
Despite the considerable role assigned to 
the concept of self-deception in several areas 
of study, there have been no attempts to 
demonstrate that any given set of behavior 
conforms to what is meant by self-deception. 
Philosophers have frequently remarked that 
determining what is meant by self-deception 
and demonstrating that people do deceive 
themselves have major implications for views 
concerning the structure of consciousness (cf. 
Fingarette, 1969; Sartre, 1958). For in- 
stance, findings that indicate that people do 
lie to themselves may also indicate that 
people can hold beliefs of which they are not 
aware and that the selective awareness of 
beliefs can be motivated. Within psychology 
it has been a long-standing tradition to as- 
sume that people are necessarily aware of 
their cognitions (Wundt, 1912/1973). Re- 
cently, however, studies have appeared that 
shed strong doubt on the validity of this 
view (Dixon, 1971; Erdelyi, 1974; Nisbett 
& Bellows, 1977; Nisbett & Wilson, 1977; 
Sackeim, Packer, & Gur, 1977). The claim 
that selective nonawareness of cognition can 
be motivated—the essence of what is en-
tailed by the concept of self-deception—is
particularly controversial.

Elsewhere, we have presented a logico-
linguistic analysis of what is meant by the
concept of self-deception (Sackeim & Gur,
1978). On the basis of this analysis, four
criteria have been offered as necessary and
sufficient for the ascription of self-deception.
Here, we will present two experiments de-
signed to test whether a phenomenon, mis-
identification of voices of self or others, fits
these criteria. Such findings may be viewed
as an experimental demonstration of the 
existence of self-deception. 


Criteria Necessary and Sufficient for 
Ascribing Self-Deception 


It has been noted that when it is assumed 
that people are necessarily aware of their 
cognition, the concept of self-deception is 
paradoxical (Canfield & Gustafson, 1962; 
Demos, 1960; Fingarette, 1969; Gardiner, 
1970; Penelhum, 1966; Sartre, 1958; Sieg-
ler, 1962). This paradox has been formulated
by Sartre (1958) as follows: 


The one to whom the lie is told and the one who 
lies are one and the same person, which means that 
I must know in my capacity as deceiver the truth 
which is hidden from me in my capacity as the
one deceived. Better yet I must know the truth
very exactly in order to conceal it more carefully—
and this not at two different moments, which at a
pinch would allow us to reestablish a semblance of
duality—but in the unitary structure of a single
project. How then can the lie subsist if the duality 
which conditions it is suppressed? (p. 49) 


In the psychological literature, a similar
paradox was raised by critics of research on
subliminal perception and perceptual defense
(Bruner & Postman, 1949; Eriksen & Browne,
1956; Howie, 1952). The point was made
that in order for a perceiver to avoid per-
ceiving a stimulus, the stimulus must first be
perceived. When it is assumed that percep-
tion implies awareness of the percept, no-
tions like subliminal perception and percep-
tual defense are paradoxical. Proponents of
the existence of these phenomena (Dixon,
1971; Erdelyi, 1974) have argued that it is
erroneous to assume that cognition must be
subject to awareness. In support of this posi-
tion, Nisbett and Wilson (1977) have sum-
marized a sizable body of literature that in-
dicates that people may lack awareness for
both the contents and processes involved in


Rejection of the assumption that cogni-
tion is necessarily subject to awareness is
implicit in the common use of the term self-
deception. Often when people describe an in-
dividual as self-deceived, they state that “all
along” or “deep inside” the individual holds
a belief that contradicts an avowed belief.
It is further implied in the common use of
the term that the self-deceived individual,
who holds such contradictory beliefs, does


in order to provide some psychological gain.
Just as people attribute motivational de-
terminants in cases of conscious lying, they

instances of self-deceit as implicating
we have offered the


1. The individual holds two contradictory
beliefs (that p and that not-p).

2. These two contradictory beliefs are
held simultaneously.
1. Th 
experience provides a likely can- 
t been shown that when subjects 
or videotape feed- 


n (Duval & Wicklund, 1972; 
1966; Holzman et al., 
Holzman, 1967), and 
changes in self-concept (Alkire & Brunse, 
1974; Boyd & Sisney, 1967; Duval & Wick- 
lund, 1972; Gur & Sackeim, in press; Storms, 
1973). Self-confrontation manipulations have 
also been shown to influence dream content 
(Castaldo & Holzman, 1967, 1969) and 
other behaviors that do not appear to be 
mediated by conscious awareness (Huntley, 
1940; Wolff, 1943). 

Differences in the direction and intensity 
of reactions to self-confrontation have been 
associated with experimental manipulations 
of self-esteem and with individual differences 
in personality (Davis & Brock, 1975; Duval 
& Wicklund, 1972; Gibbons & Wicklund, 
1976; see Sackeim & Gur, 1978, for a re- 
view). These findings have led to the con- 
clusion that individuals who hold negative 
attitudes about the self and score high on mea-
sures of cognitive discrepancy¹ find self-
confrontation to be aversive and will tend to
avoid it. On the other hand, people low in
cognitive discrepancy do not find self-con-
frontation aversive and in fact will seek it
out. Several studies have shown that people 
often fail to recognize their own voice, and 
sometimes people identify others as the self 
(Holzman et al., 1966; Huntley, 1940; 
Olivos, 1967; Wolff, 1943). The fact that 
there are individual differences in reactions 
to self-confrontation, and that some people 
find self-confrontation aversive while others 
seek it out, may indicate a motivational basis 
for the errors in identification. 

The question we posed for investigation is 
whether individuals are engaging in self-de- 
ception when they avoid self-confrontation 
by misidentifying the self as others and when 
they artificially seek out self-confrontation 


1 Cognitive discrepancy is defined here as the ex- 
tent to which individuals hold discrepant attitudes 
and beliefs about themselves. These discrepant cog- 
nitions may involve conflicts between what individ- 
uals believe themselves to be and what they believe 
they should be or wish to be. On paper-and-pencil 
personality inventories, items that a priori would 
tap this dimension are questions concerning dis- 
satisfaction with the self. For example, on the 
Neuroticism scale of the Eysenck Personality In- 
ventory (Eysenck & Eysenck, 1963), such an item 
would be “Do you often worry about things you 
should not have done or said?” 
by misidentifying others as the self. Both 
of these errors involve distortions of reality. 
Mistaking self for others reflects an avoid- 
ance or a denial of an aversive event. Indi- 
viduals who misidentify others as self com- 
mit a “narcissistic,” self-projecting response. 
Our hypothesis, then, is that at least some 
misidentifications of self or others are in- 
stances of self-deception. 


Experiment 1 


Operationalizing the Criteria for 
Self-Deception 


In order to demonstrate that misidentifica- 
tions of self and others are instances of self- 
deception, it must be shown that when peo- 
ple commit such errors they fulfill the four 
criteria for ascribing self-deception. The op- 
erationalization of the criteria is presently 
outlined in specific relation to the procedure 
used in Experiment 1. Subjects were ad- 
ministered a simple identification task in 
which they were asked to indicate whether 
audio stimuli were tapes of their own or 
other people’s voices. Galvanic skin responses 
(GSR) to each voice and reaction time of 
identifications were recorded. Subjects had 
completed a battery of personality inven- 
tories before the experimental session. 


The First Criterion: Holding Two 
Contradictory Beliefs 


The first criterion for ascribing self-decep- 
tion requires evidence that when subjects 
misidentify a voice they hold contradictory 
beliefs. Subjects’ self-report identifications of 
voices provide one index of beliefs. In affirm- 
ing that a given voice is, for instance, that 
of another, a statement has been made con-
cerning a particular belief. Levels of GSR
reactivity to voices may provide another in-
dex of beliefs. In every relevant study that
examined psychophysiological reactivity, it
has been found that levels of arousal are sub-
stantially higher following self-confrontation as
compared to confrontation with others (Holz-
man et al., 1966; Murray, 1963; Olivos, 1967;
Sackeim & Gur, 1978; Verwoerdt, Nowlin, &
Agnello, 1965; Dickinson & Ray, Note 1;
Sackeim, Note 2). If subjects hold contradic-
tory beliefs when they incorrectly identify
voices, then their levels of GSR reactivity
should be high when they misidentify voices
of self as others (false negative responses)
and low when they misidentify voices of
others as self (false positive responses). This
would indicate that when subjects commit
such errors, at some level of processing cor-
rect identifications had also been made. This
use of psychophysiological responding to de-
termine whether subjects hold contradictory
beliefs is similar in concept to the use of psy-
chophysiological measurement in the detec-
tion of conscious lying (Lykken, 1959, 1974;
Orne, Thackray, & Paskewitz, 1972) and to
the employment of such measures in subcep-
tion experiments (Dixon, 1971; Eriksen,
1956, 1958; Lazarus & McCleary, 1951).




The Second Criterion: Simultaneity of Beliefs


The second criterion for the ascription of
self-deception requires that the two contra-
dictory beliefs be held simultaneously. In
the present experimental context this cri-
terion is fulfilled by examining the psycho-
physiological index of beliefs at the same
time that self-report identifications are de-
livered.


The Third Criterion: Nonawareness of 
a Belief 


Fulfillment of the third criterion requires
evidence that one of the two contradictory
beliefs is not subject to awareness and in the
present investigation requires demonstration
that subjects are not aware of misidentifying
voices. Showing that subjects are unaware
of stimulus conditions, cognitions, and/or be-
havior has been particularly difficult in sev-
eral areas of psychology except through the
use of highly obtrusive measures (Martin,
Hawryluk, & Guse, 1974; Nisbett & Wilson,
1977). Martin et al. (1974) wrote, “The
most pressing methodological problem in
the present studies is the one that plagues
all studies of learning without awareness.
There seems to be no entirely defensible
method of assessing subject awareness of
the experiment” (p. 604). To overcome this
problem in the present study, two measures
of awareness of errors were taken. The first
measure relies on postexperimental self-re- 
ports of whether subjects are aware of hav- 
ing committed errors. This measure is sub- 
ject to the criticisms mentioned by Martin 
et al. (1974). The second measure, however, 
provides for this context a possible solution 
to the methodological problem, an unobtru- 
sive index of awareness. 
The unobtrusive index of awareness rests
on the claim that consciously holding a be-
lief that a voice is that of the self leads to
differential consequences in subsequent re-
sponding than not being aware of such a be-
lief. When subjects either correctly or in-
correctly identify a voice as the self, their
subsequent identification of the voice of an-
other will be faster than their responses on
other trials of voices of others. In the present
study, trials of voices are arranged so that
every presentation of the voice of self is fol-
lowed by presentation of the voice of a
stranger. If individuals correctly identify the
self, awareness of this fact should make the
next identification of the voice of another
faster than responses to voices of another not
preceded by identifications of self. This is
the case because any difference in the physi-
cal properties of the voice consciously be-
lieved to be the self and those of the sub-
sequent voice will aid in identifying the latter
voice as that of another. If subjects are un-
aware that a given voice is the self, that is, if
they commit a false negative error, they
should not be faster in identifying the fol-
lowing voice than they are on remaining
trials of voices of others. By the same argu-
ment, if subjects are not aware of committing
false positive errors, that is, of misidentifying
others as self, they should be faster in identi-
fying voices that immediately follow the
false positive errors than in identifying voices
that are not preceded by such errors. Com-
parisons of reaction times on trials following
particular response types can thus provide
an unobtrusive index of awareness.
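
The comparison behind this index can be sketched in Python. This is only an illustration under stated assumptions: the trial records, the field names (voice, response_type, rt), and the reaction times are hypothetical and are not data from the experiment. The sketch splits reaction times on other-voice trials by the response given on the immediately preceding trial and reports the difference in means.

```python
from statistics import mean

# Response codes used in this sketch (hypothetical labels):
#   "TP" self voice judged self,  "FN" self voice judged other,
#   "FP" other voice judged self, "TN" other voice judged other.

def rt_savings(trials, prior_response):
    """Mean RT on the remaining other-voice trials minus mean RT on
    other-voice trials that immediately follow `prior_response`.
    A positive value means faster responding after that response type.
    Returns None when either set of trials is empty."""
    following, remaining = [], []
    for i, trial in enumerate(trials):
        if trial["voice"] != "other":
            continue
        previous = trials[i - 1]["response_type"] if i > 0 else None
        if previous == prior_response:
            following.append(trial["rt"])
        else:
            remaining.append(trial["rt"])
    if not following or not remaining:
        return None
    return mean(remaining) - mean(following)

# Hypothetical trial sequence for one subject (RT in seconds).
trials = [
    {"voice": "other", "response_type": "TN", "rt": 2.0},
    {"voice": "self",  "response_type": "TP", "rt": 2.4},
    {"voice": "other", "response_type": "TN", "rt": 1.6},  # follows a TP
    {"voice": "other", "response_type": "TN", "rt": 2.0},
    {"voice": "self",  "response_type": "FN", "rt": 2.6},
    {"voice": "other", "response_type": "TN", "rt": 2.0},  # follows an FN
]

savings_after_tp = rt_savings(trials, "TP")
savings_after_fn = rt_savings(trials, "FN")
```

In this made-up sequence the trial after the true positive is faster than the remaining other-voice trials, whereas the trial after the false negative is not, which is the pattern the index is meant to detect. The sketch pools all trials and does not restrict comparisons to blocks.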


The Fourth Criterion: Motivational 
Determinants 


To fulfill the fourth criterion, it must be
shown that misidentifications of self and
others are motivated. This criterion is the
most difficult to establish, and Experiment 1 
is viewed as an initial step in this direction. 

Previous studies have found that manipu- 
lations of self-esteem influence the aversive- 
ness of self-confrontation (Davis & Brock, 
1975; Duval & Wicklund, 1972; Gibbons & 
Wicklund, 1976). Likewise, it has been shown 
that measures of personality and psycho- 
pathology, particularly assessments of dis- 
crepant cognitions about the self, predict 
levels of psychophysiological reactivity dur- 
ing self-confrontation and changes in affect 
following the confrontation (Sackeim, Note 
2). Given the hypothesis that subjects com- 
mit false negative errors in order to avoid 
aversive consequences of self-confrontation, 
it might be expected that subjects who com- 
mit such errors would score higher on mea- 
sures of cognitive discrepancy than subjects 
who do not make these errors. However, this 
prediction can not be made because of the 
very fact that it is hypothesized that these 
subjects are engaging in self-deception. In 
line with Meehl and Hathaway (1946), we 
have argued that self-deception be viewed as 
a generalized response set that depresses 
measures of psychopathology (Sackeim & 
Gur, 1978, 1979). On the other hand, in-
dividuals who commit false positive responses
(i.e., misidentifications of other as self) do
so because they find self-confrontation plea-
surable. The motivational determinants of
these errors are opposite in direction to 
those of the false negative responses. It is 
expected that subjects who commit such er- 
rors will be low on measures of cognitive dis- 
crepancy that predict the aversiveness of 
self-confrontation. 

To examine further the motivational con- 
texts of misidentifications of self and others 
and to test the face validity of the claim that 
both these errors may be instances of self- 
deception, subjects were administered a ques- 
tionnaire designed to assess individual dif- 
ferences in tendencies to engage in self-de- 
ception. It was predicted that subjects who 
committed either type of error would score 
higher on this measure than subjects who did 
not misidentify voices of self or others.

In summary, the following set of condi-
tions should be fulfilled in order to demon- 
strate that misidentifications of voices of 
self and others are instances of self-decep- 
tion: Psychophysiological reactivity to the 
voice of self is greater than reactivity to the 
voices of others, regardless of the correctness 
of identifications of voices. Subjects demon- 
strate that they are unaware of committing 
errors. Subjects who commit false negative 
responses or false positive responses, by and 
large, report that they are not aware of hav- 
ing made such errors. On trials following 
both true positive and false positive re- 
sponses, subjects are faster in identifying 
voices, whereas no savings occur following 
false negative responses. Subjects who com- 
mit false positive responses are lower on 
measures of cognitive discrepancy than sub- 
jects who make no such responses. Subjects 
who commit either false positive and/or false 
negative responses are higher on a measure 
of individual differences in tendencies to en- 
gage in self-deception than subjects who
commit no errors.


Method 
Subjects 


The subjects were 60 University of Pennsylvania 
undergraduate volunteers (30 male, 30 female). 
Only those students who had always lived in the 
northeastern United States were accepted as subjects. 


Procedure 


Subjects were given a set of personality and psy-
chopathology inventories to fill out, with
assurance that all results would be kept confidential.
Appointments were made for returning the forms
and participating in the experimental part of the
study. Before arriving in the laboratory, subjects
were informed only that they would be participating
in a study on “voice discrimination and personal-
ity.”

Upon arrival, subjects were taken to an experi-
mental room where speech samples of their voices
were recorded. Subjects were seated in a chair so
that their mouths were no more than one foot
from a microphone. They were asked to read in a
“normal speaking voice.” Each subject read the
same paragraph, taken from Kuhn (1962,
p. -125). This paragraph was chosen because
pilot studies had indicated that subjects had to
pay careful attention to the text to read the para-
graph correctly. This insured that few subjects
would be paying close attention to particular in-
tonations or phrasings.

After the voice sample was recorded, subjects
were escorted to another room for electrode place-
ment. GSR electrodes were placed on the volar sur-
face of the middle phalanges of the second and
third fingers of the nondominant hand. Subjects 
were then escorted to the experimental room.

Upon entering the experimental room, subjects
were seated in a chair positioned in front of a
small table. GSR electrodes were then connected to
leads from a polygraph, which was in the room
adjacent to the experimental room. When this was
accomplished, the following instructions were read 
to subjects: 


We would like you now to participate in a rather
straightforward task. I will soon place earphones
on your head and we will play a tape for you.
On the tape, there are 30 voices. There is a
period of silence separating each voice. We want
you to listen to every voice carefully. As soon as
a voice starts, you should decide whether you
think the voice is your own or that of a stranger.
As soon as you have made a decision, you should
press one of the six buttons, which tells us
whether you thought the voice was your own or
that of a stranger and how certain you are of
this decision. The three buttons on the left (right)
correspond to the choice of a stranger. The three
buttons on the right (left) correspond to the
self.² If you press Button 1 on either side, it
means you are not very certain of your decision.
If you press Button 2 on either side, it means you
are somewhat certain of your decision, whether
it be self or other. If you press Button 3 on
either side, it means that you are very certain of
your decision. Remember, press the button as
quickly as you can, indicating your decision as to
whether the voice was your own or not and how
certain you are.


Many of the voices you may hear will be for
very short periods. Do not be surprised. Your
own voice may appear many times, a few times,
or even not at all. Press only one button for each
voice.


In summary, all the task requires is that you
listen to each of the 30 voices, make a decision as
to whether each voice is that of a stranger or
your own, and press one of six buttons to indicate
what your decision was, and how certain you
were. You should do this as quickly as possible.
Do you have any questions?


2 The positions of “self” and “other” buttons on
the left and right sides of the panel were randomized.
Both sets of buttons were numbered from left to
right.


Remember, please keep as still as possible and,
in particular, do not move your left/right arm.
There will now be a resting period of a few 
minutes before the task begins. I will put the 
earphones on you so that you can get used to
them and, after a short while, come back and tell 
you when we are ready to begin. 


After a period of 5 to 7 minutes, the experimenter 
returned to inform the subject that the tape was 
about to be run. During the experiment, subjects 
were monitored through a one-way mirror. 

Each tape contained five groups of six voices,
differing in temporal duration (2-, 4-, 6-, 12-, and
24-sec periods). Ordering of the groups of voices
was sequential, starting with the 2-sec voices. A
period of silence, ranging from 20 to 40 sec (M =
30.00 sec), occurred before and after every voice.
The position of voices on the tapes was fixed, with 
the self appearing once in each group (Voices 4, 
10, 15, 24, and 28). Two voices of others were also 
repeated, each appearing once in each group. Male 
subjects were played tapes that contained male 
voices, and female subjects heard only voices of 
women. The tapes for the two sexes were identical 
in format. Each subject heard the same other voices,
and all stimuli began with the third sentence in the 
recorded paragraph. The voices of others were pro- 
vided by 17 male and 17 female undergraduates at 
the University of Pennsylvania. All had been born 
and lived continuously in the northeastern United 
States. They did not differ in age from the subjects
in this study. 
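
The tape layout described above can be checked with a short Python sketch. The constant and function names are illustrative. The count of distinct other speakers rests on an assumption not stated in the text, namely that every other-voice trial outside the two repeated voices used a different speaker.

```python
# Layout values taken from the description of the tapes.
DURATIONS_SEC = [2, 4, 6, 12, 24]     # one voice duration per block, in order
VOICES_PER_BLOCK = 6
SELF_POSITIONS = [4, 10, 15, 24, 28]  # 1-based tape positions of the self voice
REPEATED_OTHERS = 2                   # other voices heard once in every block

def block_of(position):
    """Zero-based block index for a 1-based tape position."""
    return (position - 1) // VOICES_PER_BLOCK

n_blocks = len(DURATIONS_SEC)
n_trials = n_blocks * VOICES_PER_BLOCK                 # 30 voices per tape
self_blocks = [block_of(p) for p in SELF_POSITIONS]    # one self trial per block

other_trials = n_trials - len(SELF_POSITIONS)          # 25 other-voice trials
repeated_trials = REPEATED_OTHERS * n_blocks           # 10 of these are repeats
# Assumption: each remaining other-voice trial is a different speaker.
distinct_others = REPEATED_OTHERS + (other_trials - repeated_trials)

print(n_trials, self_blocks, distinct_others)  # 30 [0, 1, 2, 3, 4] 17
```

Under that assumption the count of distinct other speakers comes to 17, which agrees with the 17 undergraduates of each sex who provided the voices of others.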

After subjects responded to the 30 voices, the 
electrodes were removed, and the subjects were 
taken to another room. They were then given a 
postexperimental questionnaire and the Self-Decep- 
tion Questionnaire (SDQ; Sackeim & Gur, 1979). 
After these tests were completed, subjects were de-
briefed.


Apparatus 


Master tapes were constructed so that silence 
periods, before and after the self, consisted of un- 
recordable leader tape. In this way, it was assured 
that voices would be present on the tapes for the 
exact temporal duration intended. No master tape 
was played on more than 11 occasions. The audio- 
tape that was used in this study was a 1.5-mil pro- 
fessional mastering tape. 

Silver/silver chloride electrodes, 11 mm in di- 
ameter, were used, with K-Y jelly serving as the 
electrolyte. Electrodes were plated prior to the be- 
ginning of the experiment. No pair of electrodes 
with a resistance greater than 250 Ω or with bias
potential greater than .5 mV was used (Edelberg,
1967; Venables & Martin, 1967). GSR was recorded
on a Grass Model 7P polygraph, using a low-level
direct-current preamplifier. For measurement of
GSR, a constant current of 50 µA was passed.

Voices on the master tapes were recorded on
the right channel. Simultaneous with the onset of
the audio stimuli on the right channel was the onset
of a 1000-cycle/sec sine wave signal on the left 
channel. These signals, of 150 msec duration, were 
recorded from a low-frequency function generator.
Tapes were played in stereo, with subjects receiving 
in both ears only the right channel, through modi- 
fication of a pair of headphones. 

The onset of the sine wave signal triggered a 
voice-operated relay that simultaneously started a 
reaction time clock and registered on one channel 
of the polygraph. When subjects pressed one of the 
six buttons on the panel before them, the reaction 
time clock was stopped, and a light was lit in the 
control room indicating which button had been 
pressed.


Materials 


Subjects were administered the Eysenck Person- 
ality Inventory (EPI; Eysenck & Eysenck, 1963) 
and a revised version of the Preexamination Ques- 
tionnaire (PEQ; Liebert & Morris, 1967; Morris & 
Liebert, 1969, 1970).³ The 10 items from the Neu-
roticism scale of the EPI used by Sackeim (Note 2)
to measure cognitive discrepancy were again used 
in this fashion, along with the Worry scale of the 
PEQ. These inventories were administered before 
the experimental session. 

The postexperimental questionnaire was concerned 
with whether subjects had hearing difficulties, how 
many times they thought they heard tape record- 
ings of their own voices, whether they thought they 
committed either type of error, how certain they 
were when they first heard their own voices, what 
their reaction was to the voice, and how frequently 
in the past they had been exposed to recorded ver- 
sions of their voices. 

In addition, subjects filled out a questionnaire 
specifically constructed to measure self-deception, 
the SDQ. The questionnaire consists of 20 questions, 
all of which are meant to be psychologically threat-
ening (e.g., Have you ever enjoyed your bowel
movements? Have you ever doubted your sexual
adequacy?). The questionnaire is answered on a 1
(not at all) to 7 (very much so) Likert-type scale,
with scores of 1 or 2 on individual questions keyed 
as self-deceptions. Total SDQ scores (range = 0-20) 
are used in the analysis of results. 
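
The scoring rule just described is simple enough to sketch in Python. The function name and the example answers are hypothetical. Only the rule itself (20 items, a 1 to 7 scale, answers of 1 or 2 keyed as self-deceptions, totals from 0 to 20) comes from the description above.

```python
def score_sdq(responses):
    """Total SDQ score: the number of items answered 1 or 2 on the
    1 (not at all) to 7 (very much so) scale."""
    if len(responses) != 20:
        raise ValueError("the SDQ has 20 items")
    if any(not 1 <= r <= 7 for r in responses):
        raise ValueError("each response must lie between 1 and 7")
    return sum(1 for r in responses if r <= 2)

# Hypothetical answers from one respondent.
example = [1, 2, 3, 7, 5, 1, 4, 2, 6, 6, 3, 1, 2, 5, 7, 4, 4, 2, 1, 3]
print(score_sdq(example))  # 8
```

Because each item contributes either 0 or 1, the total is a count of denials rather than a sum of ratings, which is why the range is 0-20 and not 20-140.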


Results 
Effects of Confrontation With Self and Others 


In line with previous studies using self-
confrontation procedures (e.g., Holzman et


3 The PEQ, which is usually administered prior to
examinations, was revised so that subjects indicated
how worried and anxious they were about filling out
the set of personality and psychopathology inven-
tories.


al., 1966; Olivos, 1967; Sackeim, Note 2),
subjects showed greater psychophysiological
reactivity to the voice of self than to the
voices of others. A three-factor (Self versus
Other Voice Conditions × Sex × Trials) re-
peated measures analysis of variance on GSR
change-in-conductance scores⁴ revealed a sig-
nificant effect of self versus other conditions,
F(1, 58) = 42.43, p < .001. Sex was also
associated with GSR reactivity, with females
more reactive than males, F(1, 58) = 4.41,
p < .05. There was a main effect of order of
trials, F(4, 232) = 4.47, p < .005, as well
as an interaction between the factor of self
versus other voice conditions and that of
order of trials, F(4, 231) = 11.82, p < .001.
This interaction is illustrated in Figure 1. 

As seen in Figure 1, the significant inter- 
action indicates that psychophysiological re- 
activity in the other voice condition habitu- 
ated with increasing number of trials, whereas 
reactivity to self did not habituate. 

A similar analysis of variance with reac- 
tion time as the dependent variable revealed 
that reaction time was longer on trials of 
self compared to trials in which the voices 
of others were presented, F(1, 58) = 18.84, 
p < .001. There was no significant effect for
sex, F(1, 58) < 1. The main effect of order
of trials was significant, F(4, 232) = 20.10,
p < .001, as was the interaction between this
factor and that of self versus other voice
conditions, F(4, 231) = 14.68, p < .001.
This interaction is illustrated in Figure 2.

Figure 2 suggests that the interaction oc-
curred because the longer reaction time in
later periods of the experiment was evident
in identifications of self, but not in identifi-
cations of others.


Figure 1. Galvanic skin response (GSR) reactivity
(in micromhos) to voices of self and others as a
function of block of trials. (Each block of trials in-
creasingly differed in voice duration, as indicated
on the abscissa. Ordinate: GSR change in conductance;
abscissa: blocks of trials, in sec.)

Figure 2. Reaction time to voices of self and others
as a function of block of trials. (Each block of
trials increasingly differed in voice duration, as in-
dicated on the abscissa. Ordinate: reaction time, in
seconds; abscissa: blocks of trials, in sec.)


Criteria for Ascribing Self-Deception 


The first criterion: Holding two contradic-
tory beliefs. To satisfy the first criterion
for ascribing self-deception, it must be shown
that GSR reactivity to the voice of self is as
high when a subject denies that the voice is
the self as when the subject correctly identi-
fies the voice as the self. Likewise, it must
be shown that GSR reactivity to the voices


4 GSR reactivity scores are measured in units of
micromhos (Edelberg, 1972). These scores are com-
puted by converting resistance measures to change-
in-conductance scores (i.e., 1000/R1 − 1000/R2). R1,
in k-ohms, is the lowest level of resistance reached
by subjects within 5 sec of voice onset. R2, in
k-ohms, is the level of resistance at voice onset.


of others is as low when a subject misiden-
tifies others as self as when the subject cor-
rectly identifies others. As stated above, the
motivational factors offered to account for
false negative and false positive errors are
opposite in direction. Subjects were therefore
divided into four groups: subjects who made
no errors (n = 15), subjects who made false
negative errors only (n = 14),⁵ subjects who
made false positive errors only (n = 18),
and subjects who made both false negative
and false positive errors (n = 13). If sub-
jects held contradictory beliefs when they 
misidentified voices, we would expect that 
in the group that committed false negative er- 
rors only, levels of GSR for true positive and 
false negative responses would not differ and 
would be higher than those for true nega- 
tive responses. In the group that committed 
false positive errors only, levels of GSR for 
true negative and false positive responses 
should not differ and should be lower than 
levels of GSR for true positive responses. 
For the group of subjects who made no er- 
rors, levels of GSR should be higher on true 
positive as compared to true negative re- 
sponses. Finally, the commission of both 
types of errors by the same subjects cannot 
be viewed as bona fide instances of self-de- 
ception, given the differing motivational fac- 
tors that presumably produce each type of 
error. Accordingly, no expectations could be 
offered for this group. 
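
The four-way grouping of subjects follows mechanically from each subject's identifications, as the Python sketch below shows. The function names, the tuple layout, and the example responses are hypothetical. The response categories and group labels are those used in the text.

```python
def response_type(voice, judged_as):
    """Classify one identification: TP/FN for the self voice,
    FP/TN for the voice of another."""
    if voice == "self":
        return "TP" if judged_as == "self" else "FN"
    return "FP" if judged_as == "self" else "TN"

def error_group(responses):
    """Assign a subject to one of the four groups from a list of
    (voice, judged_as) pairs covering all of that subject's trials."""
    kinds = {response_type(voice, judged) for voice, judged in responses}
    made_fn, made_fp = "FN" in kinds, "FP" in kinds
    if made_fn and made_fp:
        return "BOTH ERRORS"
    if made_fn:
        return "FN ONLY"
    if made_fp:
        return "FP ONLY"
    return "NO ERRORS"

# Hypothetical subject who once denied the self voice and made no other error.
subject = [("other", "other"), ("self", "self"), ("self", "other"), ("other", "other")]
print(error_group(subject))  # FN ONLY
```

A single error of a given type is enough to move a subject out of the NO ERRORS group in this sketch, which matches the group definitions given above.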

Because of the significant interaction be-
tween self and other voice conditions and
order of trials on GSR, subjects’ change-in-
conductance scores for all identifications had
to be adjusted for the effect of this interac-
tion.⁶ To do that, for each subject, trial posi-
tion was regressed on change-in-conductance
scores separately for each voice condition.
Adjusted scores, corrected for trial position,
were obtained. The mean change-in-conduct-
ance scores, adjusted for trials, were com-
puted within subjects for each response cate-
gory. The results for the four groups of sub-
jects are presented in Figure 3.
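
A minimal Python sketch of the two computations involved may help: the change-in-conductance score (resistance in k-ohms converted to conductance in micromhos) and a correction for trial position. The correction shown is one common reading of the procedure, a simple linear regression whose trend is removed while the mean is kept. Whether the authors rescaled the residuals in this way is an assumption, and all names and numbers are illustrative.

```python
def conductance_change(r_lowest_kohm, r_onset_kohm):
    """Change in conductance, in micromhos: conductance at the lowest
    resistance after voice onset minus conductance at voice onset."""
    return 1000.0 / r_lowest_kohm - 1000.0 / r_onset_kohm

def adjust_for_trial_position(positions, scores):
    """Remove the linear trend of scores over trial position for one
    subject and one voice condition, keeping the original mean.
    (Assumed form of the adjustment; needs at least two positions.)"""
    n = len(scores)
    mean_x = sum(positions) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in positions)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(positions, scores))
    slope = sxy / sxx
    return [y - slope * (x - mean_x) for x, y in zip(positions, scores)]

# Resistance falling from 125 to 100 k-ohms after voice onset.
print(conductance_change(100.0, 125.0))  # 2.0

# Hypothetical scores that rise linearly over the five self-voice positions.
positions = [4, 10, 15, 24, 28]
scores = [3.0, 6.0, 8.5, 13.0, 15.0]
adjusted = adjust_for_trial_position(positions, scores)
```

With scores that depend only on trial position, as in the made-up example, every adjusted score equals the subject's mean, so any differences that survive adjustment in real data cannot be attributed to a linear trial-order trend.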

As expected, in the group that made no
errors, GSR reactivity was greater for the
voice of the self (true positive responses) as
compared to the voices of others (true nega-
tive responses), t(14) = 4.63, p < .001, one-
tailed. The crucial test of the self-deception
hypothesis pertains to the behavior of sub-
jects in the second and third groups. Con-
sistent with this hypothesis, in the group
whose only errors were false negative re-
sponses, levels of GSR were higher for both
true positive and false negative responses
compared to true negative responses, t(12)
= 2.91, p < .01, one-tailed, and t(12) = 2.14,
p < .05, one-tailed, respectively. Levels of
GSR for true positive and false negative re-
sponses did not differ, t(12) < 1. The results
for the group whose only errors were false
positive responses were also consistent with
the self-deception hypothesis. Levels of GSR
were higher for true positive responses in
comparison to both true negative responses,
t(17) = 4.05, p < .001, one-tailed, and false
positive responses, t(17) = 2.20, p < .025,
one-tailed. False positive and true negative
responses did not differ in levels of GSR,
t(17) < 1. The results for the group that
committed both types of error differ in pat-
tern from the results of the two groups that
committed one type of error. Therefore, it
cannot be claimed that subjects who com-
mitted both types of error held contradictory
beliefs when the errors were being made. For
this last group of subjects, none of the dif-
ferences among the four response types
reached statistical significance by two-tailed
tests. The results for this group indicated that
they were relatively insensitive in their self-
other identifications on both verbal report
and psychophysiological response measures.
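
The within-subjects comparisons reported above have the form of paired t tests on each subject's mean scores for two response categories, with degrees of freedom equal to the number of subjects minus one (15 subjects give t(14), 13 give t(12), 18 give t(17)). The Python sketch below shows that computation. It is a generic paired t statistic rather than a reconstruction of the authors' exact analysis, and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two lists holding
    one mean score per subject in each of two response categories."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical mean adjusted GSR scores for four subjects.
true_positive = [2.0, 4.0, 3.0, 5.0]
true_negative = [1.0, 2.0, 2.0, 3.0]
t_value, df = paired_t(true_positive, true_negative)
```

A one-tailed test, as used in the text for directional predictions, compares this statistic against the upper tail of the t distribution with the returned degrees of freedom.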

In summary, those subjects who com- 


5 One subject, whose only errors were false nega- 
tive responses, committed five such responses and 
did not make any true positive responses. Since the 
analyses of results bearing on the first criterion are 
within-subjects comparisons of mean change-in-con- 
ductance scores for the various response categories, 
the results for this subject were not included in these 
analyses. 

6 To determine whether GSR scores were also as-
sociated with the number of times subjects cor-
rectly identified voices as self, correlations were com- 
puted between number of identifications of self 
and mean GSR scores for all voices and separately 
for mean GSR scores for each response category. 
No associations were found.


Figure 3. Mean galvanic skin response (GSR) reactivity (in micromhos), adjusted for order of
trials, as a function of types of self-report identification for groups that committed no errors
(NO ERRORS, n = 15), false negative errors (FN ONLY, n = 13), false positive errors (FP ONLY),
and false negative and false positive errors (BOTH ERRORS, n = 13).


mitted one type of error indicated by their 
GSR reactivity that when they misidentified 
voices, at some level of processing correct 
identifications had been made; these sub- 
jects, therefore, held contradictory beliefs. 

The second criterion: Simultaneity of be- 
liefs. Since the psychophysiological mea- 
surements were taken concurrent with sub- 
jects’ identification of voices, the second cri- 
terion was satisfied by the very nature of the 
experimental procedure. 

The third criterion: Nonawareness of mis- 
identifications. Subjects’ postexperimental 
reports as to whether they committed either 
type of error and their reports as to how 
many times they believed that the voice of 
self appeared on the tapes were in high agree- 
ment. Only 1 subject out of 27 who com- 
mitted false negative responses reported hav- 
ing done so. On the other hand, of the 31 sub-
jects who committed false positive responses,
16 claimed to have made such errors. Thus, 
on the basis of this obtrusive measure, vir- 
tually all subjects were not aware of having 
made false negative errors, whereas about 
half of the subjects who committed false 
positive errors lacked awareness of having 
made these responses.


The results for the unobtrusive measure
of awareness were congruent with the find-
ings from subjects’ postexperimental reports.
Within blocks of trials, reaction time on
trials of other voices immediately following
correct identifications of self (true positive
responses) was first compared to reaction
time for the remaining voices of others. This
was done in order to establish that con-
scious belief that a voice was the self re-
sulted in savings in reaction time on the sub-
sequent trial. This was the case both for
subjects who committed no false negative re-
sponses, t(32) = 2.68, p < .01, one-tailed,
and for subjects who committed false nega-
tive errors, t(25) = 2.12, p < .025, one-
tailed. Following false negative responses,
however, reaction time was not faster on the
subsequent trial of other, t(25) = …,
indicating that savings occurred only when
subjects expressed a conscious belief that the
voice was the self.

A similar comparison was made between
reaction time on trials that immediately fol-
lowed misidentifications of voices of others
as self (false positive responses) and reac-
tion time on remaining voices of others
within blocks of trials (excluding trials of
other voices preceded by the voice of self).
Here, the results were somewhat more com-
plex. When this comparison was made for all
subjects who committed false positive re-
sponses, faster reaction time on trials fol-
lowing false positive responses was not found,
t(30) < 1. However, subjects were divided
into two groups based on whether they were
aware of committing false positive errors, as
indicated by postexperimental self-reports. It
was found that subjects who claimed not to
be aware of making these errors were faster
in their identifications following false posi-
tive responses, t(14) = 2.25, p < .025, one-
tailed. Subjects who claimed to be aware of
false positive errors were not faster on trials
immediately following such responses than
on remaining trials of voices of others. This
pattern of results appears to buttress the
validity of the unobtrusive measure of aware-
ness. It also indicates that the lack of faster
responding subsequent to false negative errors
cannot be accounted for simply by the pos-
sible retarding effects of the commission of
errors per se.

In summary, the results of both the obtru-
sive and unobtrusive measures of awareness
converge in indicating that virtually all sub- 
jects who committed false negative responses 
were unaware of having made such errors. 
About half of the subjects who committed 
false positive errors were not cognizant of 
the fact. 

The fourth criterion: The motivational as-
pect. In order to test the motivational ac-
count of misidentifications of voices, scores
on the two measures of cognitive discrepancy
were compared for subjects who committed
two or more false positive errors and sub-
jects who did not commit such errors. As
predicted, the groups differed on the Dis-
crepancy scale taken from the EPI, t(40) =
…, p < …, one-tailed, and on the Worry
scale of the PEQ, t(40) = 2.23, p < .025,
one-tailed. The group hypothesized to be ac-
tively seeking self-confrontation (false
positive responders) scored low on these
predictors of the aversiveness of self-con-
frontation. Since this finding pertains only to
one type of error, and in order to test the
validity of the claim that both false negative
and false positive errors can be instances of
self-deception, SDQ scores were compared 
for subjects who had two or more misidenti- 
fications of either type and the remaining 
subjects. The results indicated that the 
former group had higher scores on this mea- 
sure of individual differences in tendencies 
to engage in self-deception, t(58) = 1.99, p
< .05, one-tailed. 

These findings provide initial support for 
the motivational account of misidentifications 
of voices of self and others. 


Consideration of Alternative Explanations 


Independence of GSR and self-reports. 
The evidence presented in support of the 
first criterion can be interpreted as indicat- 
ing that self-report identifications and identi- 
fications of voices as inferred from the GSR 
measure were relatively independent. In par- 
ticular, when subjects misidentified voices by 
their self-reports, correct identifications could 
often be made on the basis of the GSR mea- 
sure. This pattern of results is reminiscent 
of what early studies on subception were at- 
tempting to demonstrate (e.g., Lazarus & 
McCleary, 1951). In these studies, instances 
where correct identifications of stimuli could 
be attributed on the basis of GSR measures, 
while verbal reports were incorrect, were 
taken as evidence for subception. Eriksen 
(1956, 1958) criticized this interpretation 
of the findings by claiming that GSR and 
verbal reports were not independent response 
systems. He argued that GSR and verbal re- 
ports were substantially but imperfectly cor- 
related and that the subception findings could 
be simply accounted for by the uncorrelated 
error term for GSR and verbal reports. 

In order to examine whether the two re- 
sponse systems were independent, we first 
relied on Eriksen’s method of computing the 
point-biserial correlation on mean GSR scores 
for trials of self when subjects were correct 
and incorrect in their self-reports. The same 
correlation was computed for trials of the 
other. Neither correlation was significant,
rpb(25) = -.03 for voices of self, and rpb(30)
= .15 for voices of the other. These results
indicate that whether subjects identified
voices as self or other had little influence on
levels of GSR, and therefore, in this respect, 
the two response systems were independent. 
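
The point-biserial computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name and the sample data are hypothetical, and the formula is the standard textbook point-biserial coefficient relating a dichotomous variable (self-report correct versus incorrect) to a continuous one (mean GSR).

```python
import math


def point_biserial(binary, values):
    """Point-biserial correlation between a 0/1 variable and a continuous one.

    Uses r_pb = (M1 - M0) / s_n * sqrt(n1 * n0 / n**2), where s_n is the
    standard deviation of all values computed with n in the denominator.
    """
    if len(binary) != len(values):
        raise ValueError("binary and values must have the same length")
    group1 = [v for b, v in zip(binary, values) if b == 1]
    group0 = [v for b, v in zip(binary, values) if b == 0]
    n1, n0, n = len(group1), len(group0), len(values)
    if n1 == 0 or n0 == 0:
        raise ValueError("both categories must be represented")
    mean1 = sum(group1) / n1
    mean0 = sum(group0) / n0
    mean_all = sum(values) / n
    sd_all = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    if sd_all == 0:
        raise ValueError("values have zero variance")
    return (mean1 - mean0) / sd_all * math.sqrt(n1 * n0 / n ** 2)


# Hypothetical data: 1 = correct self-report, 0 = incorrect self-report,
# paired with a mean GSR score for the corresponding trials.
correct = [1, 1, 0, 1, 0, 0, 1, 0]
mean_gsr = [0.62, 0.48, 0.55, 0.51, 0.60, 0.47, 0.58, 0.53]
r_pb = point_biserial(correct, mean_gsr)
```

A coefficient near zero, as reported in the text, would mean that mean GSR carries little information about whether the self-report was correct.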

Signal detection theory affords an oppor- 
tunity to separate within response systems 
the effects of sensitivity and response bias 
in the identification of stimuli (Green & 
Swets, 1966). In the present context, such 
procedures would be particularly useful in 
establishing whether subjects who differ in 
the correctness of their self-reports do not 
differ in the sensitivity of their GSR in dis- 
criminating between voices of self and others. 
However, the small number of trials admin- 
istered to each subject and the fact that the 
voice of self was presented to each subject 
on only five occasions preclude the applica- 
tion of standard signal detection methodology 
to these data, since separate measures of 
sensitivity and response bias can not be com- 
puted for each subject. Nonetheless, for de- 
scriptive purposes it might be useful to ex- 
amine the sensitivity of GSR in groups of 
subjects pooled as a function of their self- 
report behavior. 

In order to obtain confidence levels of the 
degree to which the GSR measure indicated 
judgments of self, percentages were computed 
for the GSR on each trial relative to the 
largest GSR within each block of trials. These 
percentages were categorized into 10% units. 
The number of trials of self and other voices 
falling within each unit was totaled for each 
subject. As in the analysis of results bearing 
on the first criterion, subjects were classified 
into the four groups on the basis of their 
self-report identifications. The data for sub- 
jects were pooled within groups to form es- 
sentially four types of perceivers. 
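
The scaling step described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the treatment of values that fall between integer percentages (for example 90.5%) is an assumption, since the text does not say how such cases were assigned.

```python
import math


def percent_of_block_max(gsr_block):
    """Express each GSR in a block as a percentage of the block's largest GSR."""
    largest = max(gsr_block)
    if largest <= 0:
        raise ValueError("the largest GSR in a block must be positive")
    return [100.0 * g / largest for g in gsr_block]


def ten_percent_unit(pct):
    """Map a percentage to one of ten 10% units.

    Unit 10 covers values above 90% up to 100%, unit 9 values above 80%
    up to 90%, and so on. Unit 1 also absorbs a value of exactly 0%.
    (Boundary handling is an assumption made for this sketch.)
    """
    return min(10, max(1, math.ceil(pct / 10)))


# Hypothetical block of three trials.
percentages = percent_of_block_max([2, 5, 10])
units = [ten_percent_unit(p) for p in percentages]
```

Counting, for each subject, how many self and other trials land in each unit gives the frequencies from which cumulative probabilities can then be formed.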

Figure 4 presents receiver operating char- 
acteristic (ROC) curves for the four groups 
on the GSR measure. Cumulative probabilities 
of identifying the voice of another as self,
P(S/o), on the basis of GSR, are on the
abscissa, and cumulative probabilities of
identifying the voice of self as self, P(S/s),
are on the ordinate. A nonparametric measure
of sensitivity, the area under the ROC
curve, or P(A) (McNicol, 1972), was computed
for each group.7 As can be seen in
Figure 4, the sensitivity of the GSR was
well above chance for all four groups of subjects.
Subjects who did not commit any errors,
P(A) = .86; subjects whose only errors
were false negative responses, P(A) = [illegible];
and subjects whose only errors were false
positive responses, P(A) = .85, did not differ
in sensitivity, whereas subjects who made
both types of errors, P(A) = .71, showed
lowered sensitivity in their GSR identifications
of self and others. At least for the
groups that identified voices in a manner congruent
with the motivational account of misidentifications,
these findings indicate that
the self-report and GSR response systems
were independent.
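
As a rough illustration of how an area measure such as P(A) can be obtained from category frequencies, the sketch below accumulates hit and false-alarm rates from the strictest to the most lenient category and sums the area under the resulting points with the trapezoidal rule. The function names and counts are hypothetical, and the trapezoidal rule is assumed here as the usual way of computing a nonparametric ROC area; the text does not spell out the computation.

```python
def roc_points(self_counts, other_counts):
    """Cumulative (false-alarm rate, hit rate) pairs.

    Both lists give trial counts per GSR category, ordered from the
    highest category to the lowest. A hit is a self trial falling at or
    above the criterion; a false alarm is an other trial doing the same.
    """
    total_self = sum(self_counts)
    total_other = sum(other_counts)
    if total_self == 0 or total_other == 0:
        raise ValueError("need at least one self trial and one other trial")
    points = [(0.0, 0.0)]
    cum_self = cum_other = 0
    for s, o in zip(self_counts, other_counts):
        cum_self += s
        cum_other += o
        points.append((cum_other / total_other, cum_self / total_self))
    return points


def area_under_roc(points):
    """Trapezoidal area under a list of ROC points running from (0, 0) to (1, 1)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical pooled counts for one group, highest GSR category first.
self_trials = [20, 12, 6, 2]
other_trials = [5, 10, 25, 60]
p_a = area_under_roc(roc_points(self_trials, other_trials))
```

An area of .5 corresponds to chance discrimination and an area of 1.0 to perfect discrimination, which is the scale on which the group values in the text are expressed.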

Effects of certainty of identifications. Another
alternative account of the findings related
to the first criterion might be stated as
follows: Subjects have more difficulty and,
therefore, are less certain when they identify
their own voices as compared to voices of
others. This greater uncertainty produces
greater psychophysiological reactivity to
voices of self. Furthermore, the differences in
levels of GSR found for the four response
categories may reflect differences in levels of
certainty. In particular, the high level of
GSR found when subjects commit false negative
errors may indicate that subjects are
particularly uncertain when such misidentifications
are made. Of course, this account
would have to posit that subjects are not uncertain
when false positive errors are made.


7 The most common measure of sensitivity derived
from ROC curves is d'. However, to use d' it
must be assumed that the distributions of noise
(i.e., self) and of signal plus noise (i.e., other) are
equal in variance and that both are normal, assumptions
we are not entitled to make with the
present data. The nonparametric measure of sensitivity,
P(A), does not require fulfillment of these
assumptions. Furthermore, signal detection methodology
usually enables one to separate the effects
of sensitivity from the effects of response criteria or
response bias on performance (Green & Swets, 1966).
However, while the area under the ROC curve,
P(A), is an acceptable nonparametric equivalent of
d', there are no satisfactory nonparametric alternative
measures of response criteria (McNicol, 1972).
Therefore, response criteria effects were not examined
in this analysis.
Figure 4. Receiver operating characteristic (ROC) curves of identifications of voices of self and
others based on galvanic skin response (GSR) reactivity for groups committing no self-report
errors, false negative errors, false positive errors, and false negative and false positive errors (all
errors). [Points marked 1 correspond to the cumulative probabilities of hits, P(S/s), and false
alarms, P(S/o), for GSR scores that were between 91% and 100% of the largest GSR within
their respective block of trials. Points marked 2 correspond to the 81% to 90% interval; points
marked 3 correspond to the 71% to 80% interval; points marked 5 correspond to the 41% to
50% interval. Because of low frequencies of occurrence, the data for the intervals 21% to 30%
and 31% to 40% were combined and correspond to points marked 7. Likewise, points marked 8
correspond to the 0% to 20% interval.]

[illegible] a three-factor (Self versus Other Voices
× Sex × Trials) repeated measures analysis
of variance on certainty judgments showed
[illegible] for self versus other voices, F(1,
[illegible]), p < .001, and trials, F(4, 232)
[illegible]. Subjects were less certain
in their identifications of the voice of self
than of the voices of others, and certainty increased
with trials (see Figure 5). Sex had
no significant effect, F(1, 58) < 1, nor were
there any significant interactions. These findings
[illegible] some support for the certainty
[illegible]. There are a number of reasons, however,
for rejecting this alternative explanation.
[illegible] of identifications was included
as a covariate in both analyses of
GSR and [illegible] time, as a function of self
[illegible], sex, and trials, the patterns
of results reported earlier were unaltered.
[illegible] analysis of the first criterion, the
[illegible] was that levels of GSR did
[illegible] true positive and false negative
[illegible] on the one hand and for true negative
and false positive responses on the other.
[illegible] subjects who committed false negative


errors, mean levels of certainty of identifica- 
tions were compared for false negative and 
true positive responses. Subjects were neither 
more nor less certain when they made cor- 
rect or incorrect identifications of their own 
voices, t(25) < 1. Therefore, the equivalence
of levels of GSR despite differences in the cor- 
rectness of self-reports for these responses 
cannot be attributed to differences in the cer- 
tainty of identifications. 

A similar comparison was made for levels 
of certainty on true negative and false posi- 
tive responses for subjects who committed 
false positive errors. Here, it was found that 
when subjects correctly identified others, 
they were more certain than when they misidentified
others, t(30) = 8.12, p < .001,
two-tailed. Assuming that less certainty 
should result in greater psychophysiological 
arousal, this difference in certainty for the 
two response groups cannot account for their 
equivalence in low levels of GSR. 

Figure 5. Certainty of identifications of voices of
self and others as a function of blocks of trials.
(Each block of trials increasingly differed in voice
duration, as indicated on the abscissa.)

Finally, mean GSR for the four types of
response might be compared at each level of
certainty. Given that few subjects committed
all four types of response and that it was
impossible for any subject to make all four
responses at each level of certainty, data had
to be pooled across subjects. In Table 1,
mean GSR is presented for the four types of 
self-report response at the three levels of cer- 
tainty. The pooling of data across subjects, 
whereby subjects do not equally contribute 
to each cell of the table, precludes the use 
of standard statistical tests of significance of 
differences. However, inspection of Table 1 
indicates greater GSR reactivity for true 
positive responses with the higher levels of 
certainty. There is no clear-cut relationship
between GSR and certainty for false negative
responses. For true negative responses, there
is a trend for lower levels of certainty to be 
associated with greater levels of GSR. As in 
the case for true positive responses, higher 
levels of GSR for false positive responses 
were associated with higher levels of cer- 
tainty. In sum, the trend of the results in 
Table 1, along with the previous analysis, disconfirms
the alternative explanation that differences
in the certainty of identifications
account for the evidence originally presented
in support of the first criterion.

Effects of repeating voices. No distinction
has yet been made in the obtained results
between repeated and nonrepeated voices
of others. Perhaps repetition per se of the
voice of self on five trials led to the significant
differences in reactions to self and others.
However, this argument may be discounted.
Recall that two of the voices of others were
also repeated, each for five trials, to allow
examination of any repetition effects. The
difference in mean reaction time on repeated
and nonrepeated voices of others was not significant,
t(59) < 1. The difference in mean
levels of GSR was significant, t(59) = 3.[illegible],
p < .001, two-tailed, indicating greater habituation
of GSR on repeated trials of other
voices. Since this observed effect is opposite
in direction to that which would account for
the greater reactivity on trials of self, the
combining of repeated and nonrepeated voices
of others was justified.

Effects of previous exposure to playbacks
of self. There may be a more parsimonious
explanation of individual differences in rates
of misidentifications of the self and others
(rather than the motivational account offered
here). People differ in their experience
with tape playbacks of their own voices.


Table 1
Mean GSR Reactivity Across Subjects for True
Positive, False Negative, True Negative, and
False Positive Responses for Each of Three
Levels of Certainty of Response

[Columns: response type, certainty level (high, medium, low), n, M, SD. The cell entries are illegible in the source.]
The only previous study to examine the relationship
between frequency of previous exposure
to tape playbacks of the self and the rates of 
false negative responding found the relation- 
ship to be negative (Rousey & Holzman, 
1967). However, that study, as well as other
studies that have provided data concerning
nonrecognition of the self (e.g., Holzman &
Rousey, 1966; Holzman et al., 1966; Hunt- 
ley, 1940; Olivos, 1967; Rousey & Holzman, 
1967; Wolff, 1943), used procedures that 
tended to maximize rates of nonrecognition. 
This approach was deemed unadvisable for 
the present experiment, which attempted to 
minimize rates of nonrecognition attributable 
to factors other than self-deception. In this 
regard, the repetition of the trials of the self, 
the quality of the audio stimuli, the longer 
durations of voices, and the explicit instructions
that the task was one of identification
all served to minimize the difficulty of making
self-other discriminations. The lowest rate of
nonrecognition previously reported was in
the Olivos (1967) experiment, where false
negative responses constituted 45% of the
responses to voices of self. In the present investigation,
18.7% of the trials of self involved
false negative responses. The difference
is maintained even when the percentage
of false negatives on the first trial of self 
(31.7%) is compared with those of previous 
studies. Rates of false positive responding 
had not been reported in the literature. The 
rate of false positive responding in the present 
investigation was 3.8%. It can be concluded 
that the method employed in the present ex- 
periment succeeded in minimizing instances 
of nonrecognition. 

To examine the association between ex- 
posure to playback of the self and rates of 
misidentifications, a one-way analysis of vari- 
ance was performed for groups stratified for 
number of false negative responses (none,
one, two or more). This analysis indicated
no association, F(2, 56) < 1. A similar
analysis for groups stratified for number of
false positive responses indicated that the
groups did not differ significantly, F(2, 57)
= 2.74, .05 < p < .1. However, when subjects
who committed two or more false positive
errors were compared to the rest of the
sample, it was found that the former group
had less experience with tape playbacks,
F(1, 57) = 5.47, p < .025. These results suggest
that overall frequency of previous exposure
to playbacks of the self does not account
for misidentifications of the self and
others. As in the case of the analysis of re- 
sults bearing on the third criterion, where it 
was found that some subjects who made false 
positive responses were aware of their errors, 
it appears that at least some false positive 
errors may reflect lack of previous exposure 
to recorded voices of self and are unlikely 
instances of self-deception. 


Discussion 


The results of Experiment 1 support the 
contention that there is a phenomenon that
fits the concept of self-deception, since con- 
firming evidence was found for fulfilling the 
four criteria for its ascription. When subjects 
misidentified the voices of self and others, 
they showed that at some level of processing 
correct identifications were made; their levels 
of GSR did not differ from those when they 
correctly identified voices, and therefore, 
they simultaneously held contradictory be- 
liefs. Furthermore, subjects were not aware 
of misidentifying the voice of self and some- 
times were not aware of incorrectly identifying 
voices of others. Finally, there is initial evi- 
dence that misidentifications of the voices of 
self and others are motivated. It was claimed 
that subjects who misidentify others as self 
are narcissistic in their self-regard and do not 
find self-confrontation aversive. In fact, they 
seek it out. These subjects scored low on 
measures that predict the aversiveness of self- 
confrontation. Subjects who misidentified 
either voices of self or others scored high on 
a measure of individual differences in tend- 
encies to engage in self-deception. 

Of the four criteria for ascribing self-decep- 
tion, the results of the present study are least 
compelling in regard to satisfying the fourth 
criterion of motivation. The view that mis- 
identifications of self and others are motivated 
requires further substantiation. The evidence 
presented so far is correlational in nature and 
does not fully address the question of whether 
these misidentifications are purposeful, an issue
implicit in attributions of motivation (Irwin,
1971). Furthermore, the results bearing
on the fourth criterion pertain primarily to 
subjects who committed false positive errors. 
The claim that subjects who commit false 
negative errors do so because they find self- 
confrontation particularly aversive could not 
be examined. Additional evidence must be 
sought relating levels of cognitive discrepancy 
or self-esteem to misidentifications of both 
self and others in order to support the claim 
that these errors are motivated. 


Experiment 2


Recent studies have shown that experi- 
mental manipulations of cognitive discrepancy 
influence selective exposure to the self. Duval, 
Wicklund, and Fine (cited in Duval & Wick- 
lund, 1972, pp. 16-21) found that subjects 
who received prior negative false feedback 
about themselves departed from a room in 
which they were confronted with a mirror 
sooner than subjects who received prior posi- 
tive false feedback or subjects who were not 
confronted with a mirror. Gibbons and Wick- 
lund (1976) found that subjects whose self- 
esteem was enhanced by positive interaction 
with a female confederate subsequently spent 
more time listening to tapes of their own 
voices compared to subjects who experienced 
decreased self-esteem. Davis and Brock 
(1975) found that self-confronted subjects 
who received positive false feedback concern- 
ing a bogus test of creativity emitted more 
first person pronouns, when asked to guess 
English translations of foreign words, than 
subjects who received negative false feedback
or subjects who were not self-confronted.

The results of these studies indicate that 
after experiences of negative feedback, self- 
esteem is lowered, and confrontation with the 
self becomes more aversive. On the other 
hand, positive feedback enhances self-esteem 
and makes self-confrontation less aversive. 
If the misidentifications subjects make when 
self-confronted are indeed motivated, one 
would expect subjects who have experienced 
failure or who have lowered self-esteem to 
demonstrate greater difficulty in making identifications
of self. They should be slower in
their reaction time on trials of self, they
should be less certain in making such identifications,
and, in particular, they should commit
more false negative errors. Subjects who
have experienced success should show less difficulty
in identifying the self. Most importantly,
they should commit a greater number
of false positive errors.


Method 
Subjects 


The subjects were 60 University of Pennsylvania
undergraduates (30 male, 30 female) who volunteered
to participate in a study on intelligence and voice
discrimination. All subjects had been born and lived
continuously in the northeastern United States.


Materials and Procedure 


When subjects reported for the study, they were
informed that they would participate in two experiments,
one of which required the recording of a
speech sample. They were then asked to read the
same paragraph that was used in Experiment 1, and
their voices were recorded. Subjects were then greeted
by another experimenter, who told them that they
were to undergo an assessment of their intellectual
capacities. They were informed that the purpose of
that part of the study was to validate a short test
designed to measure intelligence among college students.
They were told that they would be presented
with 15 multiple-choice synonym problems and that
they must provide an answer to each problem within
30 sec of presentation. They were also informed
that they would receive feedback concerning their
performance on each question and on the test as
a whole.

The experimenter then presented subjects with
synonym problems taken from the Guilford-Zimmerman
Aptitude Survey, Verbal Comprehension [illegible].
Each problem was presented individually on an index
card. The experimenter used a stopwatch to time
subjects' responses. If a subject did not provide a
response within 30 sec of presentation, the experimenter
informed the subject that time had elapsed
and requested an answer. Subjects were given feedback
as to the correctness of each response, and
when errors were made, subjects were told the correct
answers. When the test was completed, subjects
were told how many of the 15 problems they had
answered correctly.

Subjects in the success condition were presented
with the 15 easiest problems on the test, and subjects
in the failure condition were presented with the
15 most difficult problems. Subjects were randomly
assigned to the two conditions. However, final acceptance
into the failure condition was contingent
on subjects not succeeding on more than 10 of the 
15 items. Similarly, final acceptance into the success 
group was contingent on correctly solving more than 
10 of the problems. Five subjects who were initially 
assigned to the failure condition succeeded in solving 
more than 10 of the problems. The results of these 
subjects were dropped from the analyses reported 
below. None of the subjects initially assigned to the 
success group failed to reach the criterion for final 
inclusion in that group. In order to avoid any bias 
in assignment of subjects to groups, the results of 
four subjects, who were initially assigned to the suc- 
cess group and who had succeeded in solving all 
problems, were also dropped from the analyses. 
These four subjects were randomly chosen from all 
subjects in the success condition who had correctly 
solved the 15 problems. This procedure resulted in 
a final sample of 30 subjects in the failure group and 
21 subjects in the success group.8

After completing the verbal problems, subjects 
were escorted to the experimental room, in which the 
voice identification task took place. Subjects were 
administered the Multiple Affect Adjective Checklist 
(Zuckerman & Lubin, 1965) and were then read the 
same instructions for the voice identification task as 
were used in Experiment 1, with the exception that 
subjects were not informed as to the number of 
voices that would appear on the tape. The tapes 
played to subjects contained 28 voices, each voice 
4 sec in duration. The voice of the self appeared four 
times on each tape in Positions 6, 11, 19, and 24. A 
period of silence, ranging from 15 to 25 sec (M = 
20 sec) occurred before and after each voice. Male 
subjects were played tapes that contained male voices, 
and female subjects heard only voices of females. 
Each subject heard the same voices of others, and 
all voices began with the third sentence in the re- 
corded paragraph. The voices of others were provided 
by 24 male and 24 female undergraduates at the 
University of Pennsylvania. All had been born and
lived continuously in the northeastern United States.
They did not differ in age from the subjects in this
study.

After the tape was played, subjects completed a
postexperimental questionnaire. Questions were designed
to check the effect of the pretreatment manipulation.
Respondents were also asked whether
they [illegible], whether they had been
[illegible] their [illegible] errors, what their reactions
were to [illegible], and how frequently in
the past they had been exposed to tape recordings
of themselves.

The apparatus was identical to that used in Experiment 1,
with the exception that GSR recordings
[illegible].


Results 
Manipulation Checks 


The Multiple Affect Adjective Checklist
(Zuckerman & Lubin, 1965) is scored on
three dimensions: anxiety, depression, and
hostility. As is seen in Figure 6, subjects in
the failure condition were more anxious, t(49)
= 2.23, p < .025, one-tailed; more depressed,
t(49) = 3.03, p < .005, one-tailed; and more
hostile, t(49) = 2.03, p < .025, one-tailed,
than subjects in the success condition.

Figure 6. Effects of failure and success manipulations
on Multiple Affect Adjective Checklist scores.

Subjects’ evaluations of their performance 
on the verbal problems were assessed on the 
postexperimental questionnaire. As is shown 
in Table 2, subjects in the success condition 
were more pleased with their performance, 
t(49) = 4.18, p < .001, one-tailed, and rated 
their performance as comparing more favorably
to that of other undergraduates, t(49)
= 3.92, p < .001, one-tailed, than subjects in 
the failure condition. Ratings of degree of 
intelligence regardless of performance did not 
differ significantly between the two groups, 
t(49) = 1.02, ns.
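
For readers who want to check a group comparison of this kind against the summary statistics in Table 2, the sketch below computes an independent-samples t from means, standard deviations, and group sizes. It assumes a pooled-variance (Student) t test, which is consistent with the reported 49 degrees of freedom but is not stated in the text; the function name is hypothetical. With the percentile ratings (success: M = 81.62, SD = 19.89, n = 21; failure: M = 58.33, SD = 20.83, n = 30) it yields a value of about 4.0, close to but not identical with the reported 3.92, as would be expected when working from rounded summaries.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-samples t statistic with pooled variance, from summary data.

    Returns the t value and its degrees of freedom (n_a + n_b - 2).
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_error, df


# Percentile self-ratings of performance, as read from Table 2.
t_value, df = pooled_t(81.62, 19.89, 21, 58.33, 20.83, 30)
```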


8 It could be argued that the five subjects initially 
assigned to the failure condition, by solving most 
of the difficult questions, had an experience of suc- 
cess and should have been reassigned to the success 
group. Were this to be done, there would be 30 
subjects in each condition. With this assignment of 
subjects to conditions, the results are somewhat 
stronger than those to be reported below. 
Table 2
Self-Ratings of Subjects in Failure and Success
Conditions About Performance on Verbal
Problems and Overall Intelligence

Measure and condition                          M       SD

How pleased or displeased were you
with your performance on the
verbal problems? (scale = 1-7)
    Failure^a                                3.63     1.64
    Success^b                                5.57     [illegible]

In percentile terms, how would you
compare your performance on
the verbal problems with that
of other undergraduates at the
University of Pennsylvania?
(scale = 0-100)
    Failure^a                               58.33    20.83
    Success^b                               81.62    19.89

How would you rate your overall
intelligence compared to other
undergraduates, regardless of
your performance today?
(scale = 0-100)
    Failure^a                               71.00    16.80
    Success^b                               76.67    22.11

a n = 30.
b n = 21.


Effects of Success and Failure on Reaction 
Time and Certainty of Identifications 


In line with the findings of Experiment 1,
overall, subjects were slower in making identifications
on trials of self than on trials of
others. A three-factor (Success versus Failure
× Self versus Other Voice Conditions ×
Trials) repeated measures analysis of variance
on reaction time revealed a significant
main effect for self versus other voice conditions,
F(1, 49) = 11.46, p = .002. The main
effect of order of trials was also significant,
F(3, 147) = 21.23, p < .001. In Experiment
1, the order of trials was confounded with
the duration of voices. The results of the
present experiment indicated that when duration
was held constant, subjects became
faster in making identifications with increasing
numbers of trials. The interaction between
success versus failure and self versus other
voice conditions was significant, F(1, 49) =
4.24, p < .05. As is seen in Figure 7, subjects
in the failure condition, when compared
to subjects in the success condition, were
slower in making identifications on trials of
self (p < .05). Subjects in the two groups
did not differ in reaction time on trials of
others. This finding indicates that the effects
of failure did not generalize in slower reactions
to all voices but resulted in retarded performance
only on trials of self.

A similar analysis was performed on certainty
of identifications. The main effects of
self versus other voices, F(1, 49) = 7.76, p
< .01, and order of trials, F(3, 147) = 5.13,
p < .005, were significant. The interaction
between success versus failure and self versus
other voices did not reach significance,
F(1, 49) = 2.29, p > .05. The results are illustrated
in Figure 8, which indicates a
trend for less certainty in identification
on trials of self in the failure group
and not in the success group.


Figure 7. Reaction time of failure and success
groups in identifying voices of self and others, as a
function of block of trials. [Ordinate: reaction time
in seconds; abscissa: blocks of trials 1 to 4; separate
curves for failure-self, failure-other, success-self,
and success-other.]
Figure 8. Certainty of failure and success groups in
identifying voices of self and others, as a function
of block of trials. [Ordinate: certainty of identifications;
abscissa: blocks of trials 1 to 4; separate
curves for failure-self, failure-other, success-self,
and success-other.]


Effects of Success and Failure on Errors of
Identification


The rationale for Experiment 2 was based
on the argument that errors made in identification
of self and others are instances of self-deceptive
acts and are therefore motivated.
Accordingly, it was expected that subjects
who had experienced failure would make more
false negative errors, whereas subjects who
had experienced success would commit more
false positive errors. Eleven out of 30 subjects
in the failure group had at least one
failure to identify self (false negative error),
while only 2 out of 21 subjects in the success
group committed at least one false negative
response. This effect was significant, χ²(1)
= 4.79, p < .05. On the other hand, 13 subjects
in the failure group made at least one
erroneous judgment of self (false positive
error), and 16 subjects in the success group
made at least one false positive response.
This effect was also significant, χ²(1) = 5.44,
p < .025. In short, when people are made to 
feel good about themselves, they tend to 
“project” and see themselves in places where 
they are not. When people are made to feel 
bad about themselves they tend to “deny” 
seeing themselves in places where they are. 
To examine whether overall rates of error 
differed for the two groups, a degree-of-self- 
projection measure was computed by scoring 
each false negative response as —1, each cor- 
rect identification as 0, and each false posi- 
tive response as 1. Subjects’ scores were 
summed for the 28 identifications, and it was 
found that the success group (M = 1.19, SD 
= 1.68) and the failure group (M = —.07, 
SD = 1.81) differed in types of errors committed,
t(49) = 2.47, p < .01, one-tailed.
Thus, the experimental pretreatment differentially
influenced rates of types of misidentifications,
thereby supporting the motivational
account of these errors.9
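
The two frequency comparisons reported in this section can be checked with a short calculation. The sketch below computes the Pearson chi-square for a 2 × 2 table without a continuity correction, which is an assumption on our part, although the uncorrected statistic does reproduce the reported values from the counts given in the text. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table

        a  b
        c  d

    using the shortcut N * (ad - bc)**2 / (row1 * row2 * col1 * col2).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Rows: failure group (n = 30), success group (n = 21).
# Columns: at least one error of the given type, no such error.
false_negative = chi_square_2x2(11, 19, 2, 19)
false_positive = chi_square_2x2(13, 17, 16, 5)
```

Rounded to two decimals, the two statistics come to 4.79 and 5.44, matching the values reported above.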


Reactions to Hearing the Voice of the Self 


In the postexperimental questionnaire, sub- 
jects were asked to rate how much they en- 
joyed hearing their own voices and how un- 
pleasant or pleasant they found their voices 
to be. Supporting the hypothesis that ma- 
nipulations of cognitive discrepancy influence 
the aversiveness of self-confrontation, sub- 
jects in the failure condition (M = 3.73, SD 
= 1.41) tended to enjoy hearing the voice of 
self less than subjects in the success condition
(M = 4.33, SD = 1.04), t(49) = 1.63,
p < .06, one-tailed. The failure group (M =
4.00, SD = 1.07) also rated their voices as
less pleasant than the success group (M =
4.81, SD = 1.05), t(49) = 2.63, p < .01, one-tailed.


9 The obtrusive and unobtrusive measures of
awareness of errors taken in Experiment 1 were
also computed for Experiment 2. Replicating the
results of Experiment 1, subjects were not aware of
committing false negative errors, whereas some subjects
were aware of having made false positive errors.
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We have argued that when people mis- 
identify voices of self and others, they are 
engaging in self-deceptive behavior (Sackeim 
& Gur, 1978). In our view, part of the at- 
tribution of self-deception necessitates demon- 
strating that these errors are motivated. We 
claimed that individuals who are dissatisfied 
with themselves find confrontation with the 
self aversive. Identification of the self is
more difficult for them, and they will avoid 
the aversive consequences of self-confronta- 
tion by failing to identify correctly the voice 
of self. On the other hand, we argued that 
individuals who hold themselves in high 
esteem do not find confrontation with the self 
aversive. Indeed, such individuals narcissisti- 
cally seek out self-confrontation. Identifica- 
tion of the self is not difficult for them, and 
they demonstrate their preference for self- 
confrontation by identifying voices of others 
as self. 

Previous studies have indicated that ex- 
perimental manipulations of cognitive dis- 
crepancy or self-esteem influence selective ex- 
posure to the self (Duval & Wicklund, 1972; 
Gibbons & Wicklund, 1976). The principal 
measure used in these studies was the amount 
of time individuals engaged in exposure to 
the self. If our claim is valid that errors made 
in identification of the self and others are 
motivated, then we would expect manipula- 
tions of self-esteem to influence the difficulty 
of making identifications of the self and the 
rates of types of errors committed in identi-
fying the self and others.

The results of Experiment 2 supported 
these predictions. Subjects who experienced 
failure were slower in making identifications 
of voice of self than subjects given pretreat- 
ments of success. The effects of the failure 
pretreatment did not generalize to identifica- 
tion of all voices, since the failure and suc- 
cess groups did not differ in the speed of 
their identifications of others. Furthermore,
the two groups differed in the types of errors
they committed. The failure group engaged
in more false negative responding (misidenti- 
fications of self); the success group made 
more false positive errors (misidentifications
of others). It could be argued that a number
of seemingly “defensive” behaviors, such as
self-serving biases in perception and memory,
need not be determined by motivational
factors (e.g., Greenwald, Note 3). However,
the pattern of results presented here indicates
that misidentifications of the self and others
involve motivated distortions of reality.


General Discussion


This investigation was an attempt to demonstrate
that a specific phenomenon, misidentification
of the voices of self and others, is
an instance of self-deception. In Experiment
1, we showed that when subjects made errors
in identifying the voices of self and others,
they provided evidence of simultaneously holding
correct and incorrect beliefs as to the
nature of the voices. We also found that subjects,
for the most part, were not aware of
holding the correct beliefs, and correlational
data suggested that these errors in identification
were motivated. Experiment 2 focused
specifically on the role of motives by experimentally
manipulating motivation to seek
out or to avoid self-confrontation. We found
that subjects who underwent a pretreatment
designed to lower self-esteem and, therefore,
to increase the aversiveness of self-confrontation
had more failures to recognize their own
voices. In contrast, subjects who received a
pretreatment designed to increase self-esteem
and, therefore, were hypothesized to be more
likely to seek out self-confrontation made more
errors of identifying voices of others as their
own.

Our finding that some misidentifications of
self and others are instances of self-deception
collaterally demonstrates that the properties
of motivated, selective nontransparency
should be attributed to consciousness. This
conclusion goes beyond recent assertions that
people can be unaware of cognitions (e.g.,
Hilgard, 1976, 1977; Nisbett & Wilson, 1977).
We claim that at times, such selective nonawareness
can be determined by motivational
demands.

A question that presents itself concerns the
relationship of the concept of self-deception
to the psychoanalytic notion of repression.
Certainly, the concept of repression also at-
tributes a motivated, selective, nontransparent
nature to consciousness. Since Freud (1914/
1957) wrote that repression is “the corner-
stone on which the whole structure of psy-
choanalysis rests” (p. 16), hundreds of psy-
chological investigations have been interpreted
as either propping up or tearing down this
cornerstone. In a review of many of these in-
vestigations, Holmes (1974) concluded that
there is no consistent experimental evidence
supporting the existence of repression. How-
ever, it should be pointed out that the criteria
for the ascription of self-deception may be
necessary, but are certainly not sufficient,
for the ascription of repression. The ascrip-
tion of repression requires not only the claim
that there is a motivated selective nonaware-
ness of beliefs, it also requires the additional
assertion that beliefs not subject to aware-
ness are stored in an unconscious. The un-
conscious is a functionally independent con-
trol system, capable of purposeful influence
on behavior. In this respect, the concept of
repression entails that consciousness is not
only nontransparent but also nonunitary. It
is for these reasons that we believe that the
demonstration of the existence of self-decep-
tion is logically necessary prior to a demon-
stration of the existence of repression and,
further, that such a demonstration in the case
of repression is likely to be a more arduous
endeavor.
Given this initial evidence that self-deception
is a real phenomenon, a number of
additional questions come to the fore.
They concern the nature of self-deception,
the [...] of individual differences
[...] behavior, and the [...] underlying
self-deception. The results of the present
investigation [...] solutions to any of
these questions, but in relation to some they
[...] for further study.

[...] in the elucidation of the [...] of
self-deception is whether acts of self-deception
[...] responses of individuals to, for
instance, threatening stimuli or whether they
are specific and situation- or stimulus-bound.
Our use in Experiment 1 of a paper-and-pencil
questionnaire to assess individual differences
in tendencies toward self-deception
was predicated on the view that
self-deception is not a stimulus-bound phe- 
nomenon but a generalized response set, or 
characteristic defense (Hilgard, 1949), the 
frequency of its use varying among people. 
In support of this view, we found that sub- 
jects who committed either false negative or 
false positive errors showed greater tendencies 
to deny psychologically threatening state- 
ments on the SDQ. Elsewhere (Sackeim & 
Gur, 1978, 1979), we have found that 
there are substantial negative correlations 
between SDQ scores and degree of self-re- 
ported psychopathology. The magnitude of 
the associations between SDQ and psycho- 
pathology measures is greater than that be- 
tween standard lie scales and self-reported 
psychopathology. These findings support the 
contention of Meehl and Hathaway (1946) 
that “what is much more important, they 
(lie scales) are mainly directed at the sort of 
conscious falsehood which most writers have 
stressed, while ignoring the more subtle tend- 
encies to self-deception which are probably 
of even greater importance in affecting scores” 
(p. 528). As a generalized response set, self- 
deception may influence behavior in contexts 
other than personality testing. For instance, 
Mischel (1974) argued that the “neurotic 
paradox” reflects a motivated attempt by in-
dividuals to keep specific ideas out of aware-
ness and that all neurotic behavior is self-
deceptive (cf. Abramson & Sackeim, 1977).
The phenomena found in experiments on 
cognitive dissonance, self-serving biases in 
attribution, and studies of subjects’ accounts 
of their behavior in obedience and conformity 
situations may be likewise interpreted in 
terms of self-deceptive acts. Since it may be 
possible to measure relative tendencies to em- 
ploy self-deception, there is now an oppor- 
tunity not only to examine individual differ- 
ences in self-deception but also the influence 
of self-deception on broad ranges of behavior. 
As Jacques Rivière (cited in Fingarette, 
1969) has suggested, “The discovery of a de- 
ceiving principle, a lying activity within us, 
can furnish an absolutely new view of all 
conscious life” (p. 1). 
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Television Viewing and Fear of Victimization: 
Is the Relationship Causal? 
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Previous findings have suggested that people who watch a lot of television are 
more likely to fear their environment than are those who report being less fre- 
quent viewers of television. From this simple correlation, previous authors have 
suggested that television causes people to overestimate the amount of danger 
that exists in their own neighborhoods. The present study attempted to replicate 
this finding and to determine if the apparent effect was due to a previously 
uncontrolled factor: the actual incidence of crime in the neighborhood. Re- 
spondents to a door-to-door survey indicated their media usage and estimated 
the likelihood of their being a victim of violence. Neighborhoods were chosen
so as to include a high- and a low-crime area in downtown Toronto and a 
high- and a low-crime area in Toronto’s suburbs. Pooling across the four areas 
sampled, the previous findings were replicated. However, the average within-
area correlation was insignificant, suggesting that when actual incidence of
crime is controlled for, there is no overall relationship between television view-
ing and fear of being a victim of crime. A multiple regression analysis and a 
canonical correlation analysis confirmed these findings. 


A variety of social problems have been at- 
tributed to television viewing. It is said that 
television makes people more violent, that it 
lowers the level of literacy in the population, 
and that it distorts the viewer’s perception of 
the world. There is little denying that the pic- 
ture of reality that comes into people’s homes 


u data necessary for 
choosing our experimental neighborhoods, The re- 


and a version of this study is published in Vi 
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is not an accurate reflection of their owns 
ciety. That we learn from television, as 
learn from every other medium, seems int 
tively plausible independent of research ! 
sults. Gerbner and Gross (1974, 1976a, 197% 
Gerbner et al., 1976), however, have Si 
gested something even more serious F 
simple learning effects: that people not oh 
learn factual information, such as the pro” 
tion of people involved in law enforcem 
but that they generalize from the infom 
tion that they get from television. In parti 
lar, Gerbner and his associates show ™ 
those who watch a lot of television are F 
likely to feel that they might be involv 
some kind of violence during a give? 4 
than do those who watch relatively little E 
vision. This same pattern of results show. 
in a variety of questions having to 4 
the viewers’ perceptions of various 
of the society in which they live. si 
et al. (1976) point out, “Their ee f 
sense of fear and mistrust is manifes ns 
their typically more apprehensive resp” 
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about their own personal safety, 
e and law enforcement, and about 
people” (p. 9). 

Obviously, heavy television viewing is not
independent of other social factors. Gerbner
and Gross (1976a) have found that “heavy
viewing is part and parcel of a complex syn-
drome which also includes lower education,
lower mobility, lower aspirations, higher anx-
ieties, and other class, age, and sex-related
characteristics” (p. 191).

Because of the problem of confounding
variables, Gerbner has been careful to break
down his data on various other characteristics
of television viewers such as age, sex, educa-
tional level, news reading, news magazine
reading, prime-time viewing, and viewing or
nonviewing of TV news. The notable finding
in all of these comparisons is that although
there may be main effects of some of these
other characteristics, in all cases, heavy view-
ers are more likely than light viewers to feel
that they might be involved in some violence.

No list of possible confounding variables
can be complete. The worry of any researcher
doing correlational research and wishing to
make a causal statement is that some other
variable would, in fact, account for the effect
apparently demonstrated. We felt that there
is one quite plausible factor that might ac-
count for the correlation between viewing
and fear of violence: People who watch a lot
of television may have a greater fear of being
victims of violent crimes because, in fact,
they live in more violent neighborhoods.
The study that this explanation suggests,
then, is quite obvious: A survey of the tele-
vision viewing habits of people and their per-
ceptions of being involved in violence should
be done in both high- and low-crime
neighborhoods. Pooling across neighborhoods,
we should be able to replicate Gerbner’s and
his associates’ findings; within neighborhoods,
however, the effect should be
reduced or eliminated.


Method 


[For purposes] of efficient distribution of resources,
the Metropolitan Toronto Police have divided To-
ronto into approximately 210 patrol areas. The size
of these patrol areas varies not only as a function
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of the resident population but also as a function of 
the number of calls of all types that the police receive 
in the area: Busy areas thus tend to be smaller in 
terms of the size of the population served and in 
terms of geographic area. The police identified for 
us the 10 patrol areas with the highest number of 
reported assaults and woundings and the 14 areas 
with the lowest number of reported assaults and 
woundings for the 7-month period ending 2 months 
before the beginning of the survey. From these data, 
four geographic areas, approximately equal in size, 
were chosen. Two (one within the city of Toronto,
the other suburban) were high in reported crime; 
two (one city, one suburban) were low in reported 
crime. It is difficult to estimate the exact rates of
crime for the four areas. However, very rough 
estimates would suggest that the rates of assaults 
and woundings per 100,000 resident population for 
the 7-month period for the four designated areas 
would be the following: high-crime city, 614; low- 
crime city, 8; high-crime suburb, 195; low-crime 
suburb, 6. It must be emphasized that these are very 
rough figures: The low-crime areas each had only two 
reported assaults (and no woundings) for the entire 
7-month period; hence the estimates are bound to 
be unstable. There were eight patrol areas constitut- 
ing the high-crime city area; one patrol area was 
sampled for the high-crime suburban area and two 
each for the low-crime areas. 
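The instability of the low-crime estimates can be seen with a short calculation. The sketch below uses only the two figures given above for the low-crime city area (two reported assaults, a rough rate of 8 per 100,000); the resident population it prints is back-calculated from those two numbers and is an assumption for illustration, not a figure reported in the study.

```python
def rate_per_100k(events: int, population: float) -> float:
    """Events per 100,000 resident population."""
    return events / population * 100_000


def implied_population(events: int, rate: float) -> float:
    """Population implied by an event count and a per-100,000 rate."""
    return events * 100_000 / rate


# Low-crime city area: 2 reported assaults, rough rate of 8 per 100,000.
# The implied population is a back-calculation, not a reported value.
low_city_population = implied_population(events=2, rate=8)
print(f"implied population: {low_city_population:,.0f}")

# One additional reported assault would move the estimate by half again.
for assaults in (1, 2, 3):
    rate = rate_per_100k(assaults, low_city_population)
    print(f"{assaults} assault(s) -> {rate:.1f} per 100,000")
```

With counts this small, a single reported incident changes the estimated rate by several points per 100,000, which is why the low-crime figures should be read only as rough orders of magnitude.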

Obviously, the four areas differ considerably on a 
large number of social variables other than reported 
crime rates. The high-crime city area contains a 
portion of the downtown commercial/entertainment 
district of the city, the largest block of public hous- 
ing in the metropolitan area, and much of the poor- 
est portion of the population. The low-crime city 
area is largely expensive, single, detached houses and 
is one of the more exclusive residential areas. The 
high-crime suburban area contains a high concentra- 
tion of low-rise public housing and is generally 
fairly poor. The low-crime suburban area is mostly
single, detached, middle-class housing. 

Random households were chosen within each of 
these areas. Interviewers, employed by a commercial 
survey company, did a door-to-door survey. The
person who answered the door was asked to list all
of the people over 18 years of age living in the
household. One of these people was then chosen at
random by the interviewer. If this person could not 
be interviewed at that time or at some mutually ac- 
ceptable time, the interviewer went on to the next 
randomly chosen household, and the procedure was 
repeated. The effect of this selection procedure was 
an oversampling of women (70.5%) and, presum- 
ably, a general oversampling of those who spend 
much of the time at home. Although this effect 
would be unfortunate if one were interested in 
estimating population values for the measures that 
were taken, it was less relevant in our study, where 
we were interested in the relationship between tele- 
vision viewing and fear of criminal victimization. 
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Table 1 
Mean Fear-of-Crime Factor Scores for Each 
of the Sampled Areas 


                   City              Suburb

Area             M        n        M        n
High crime      .28       83   [illegible]  69
Low crime   [illegible]   71      −.13      77


Note. The higher the number, the more fear. 


Respondents were first asked to indicate those
programs that they had watched during the previous
week. According to our interviewers, very few people
had any difficulty in doing this. They were then
asked to complete a 37-item fixed-alternatives ques-
tionnaire. This questionnaire consisted of six ques-
tions dealing directly with the person’s estimate of
his or her own likelihood of being a victim of a
crime; four questions dealing with estimates of the
likelihood of particular groups of people being vic-
tims; four questions dealing with the perception of
crime in general being a problem and there being a
need for more police personnel; two questions deal-
ing with the necessity to arm oneself; eight questions
of a factual nature dealing with crime; three ques-
tions dealing with society’s response to crime; four
questions dealing with the respondents’ view of To-
ronto with respect to crime; three questions dealing
with the respondents’ prediction of their response to
a request for help; and three questions dealing with
media usage. The whole interview took approxi-
mately 45 minutes on the average.

For purposes of analyzing the types of television,
we decided to use the number of programs watched
as an index of total viewing. In addition, programs
were coded by a research associate into violent and
nonviolent types before the tabulation of the other
data. It should be pointed out that this last measure
is, necessarily, somewhat subjective. However, as will
be seen, this turns out not to be a serious problem
in understanding the results.


Results 


In order to reduce the number of measures
to a somewhat workable number, a factor
analysis was performed¹ on the 34 opinion
questions. Using a varimax rotation, only one
factor accounted for a substantial amount of
the common variance. The percentages of the
common variance accounted for by the first
4 of the 11 factors were 35.9%, 12.5%,
10.8%, and 8.4%. The questions that loaded
highest on the first factor are shown in Table
4; they were the six questions related to the
respondents’ estimates of their own chances
of victimization, two of the questions dealing
with the chances of victimization of particu-
lar groups, and one of the questions dealing
with crime as a general problem. Generally
speaking, it seems fair to label this factor
“fear of crime.” Each of the next three fac-
tors had substantial loadings from only one
or two questions.

As one would expect, the residents of the
four areas differed significantly on their over-
all fear of crime. The average factor scores
for the four areas are shown in Table 1. Anal-
ysis of variance on the factor scores revealed
a main effect for high-/low-crime area that
was highly significant, F(1, 296) = 17.19, p
< .01. Neither the city/suburb effect nor the
interaction was significant. It is clear, then,
that people who live in high-crime areas are,
in fact, more afraid.

The four areas sampled also differed on
their exposure to the various media. Table 2
presents these data. Overall, people in high-
crime areas watched more television and, gen-
erally speaking, tended to watch more violent
television. Although there were interactions
between the two factors on these two mea-
sures, for the purposes of this article, these
interactions are not very important. As one
might expect, since the areas differed on so
many dimensions, there were also effects on
self-report of exposure to radio news: People
living in low-crime areas tended to report
listening to radio news more frequently. Fur-
thermore, the reported frequency of news-
paper reading was higher in low-crime areas
and in the city.

Gerbner and his associates (Gerbner &
Gross, 1974, 1976a, 1976b; Gerbner et al.,
1976) do not directly present measures of
association between the total amount of


1 The input for the factor analysis consisted only
of those 300 respondents (of the total of 408) who
answered every question. Most of the other
respondents failed to answer only a few of the ques-
tions. The proportion of complete questionnaires
varied somewhat from area to area. The numbers of
complete/total questionnaires are as follows: high-
crime city, 83/119, or 70%; low-crime city, 71/118,
or 60%; high-crime suburb, 69/85, or 81%; low-
crime suburb, 77/86, or 90%.
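The proportions in this footnote can be checked against the sample sizes given in Tables 1 and 2. The sketch below is only an arithmetic check; the low-crime city counts (71 complete of 118) are taken from the ns shown in those tables, since that entry of the footnote is not fully legible, and should be treated as an assumption.

```python
# (complete, total) questionnaires per area.
AREAS = {
    "high-crime city": (83, 119),
    "low-crime city": (71, 118),  # assumed from the ns in Tables 1 and 2
    "high-crime suburb": (69, 85),
    "low-crime suburb": (77, 86),
}


def completion_percent(complete: int, total: int) -> int:
    """Percentage of complete questionnaires, rounded to a whole number."""
    return round(100 * complete / total)


total_complete = sum(complete for complete, _ in AREAS.values())
total_interviewed = sum(total for _, total in AREAS.values())

for area, (complete, total) in AREAS.items():
    print(f"{area}: {complete}/{total} = {completion_percent(complete, total)}%")
print(f"all areas: {total_complete}/{total_interviewed}")
```

The complete questionnaires sum to the 300 respondents entered into the factor analysis, which is consistent with the per-area ns in Table 1.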


Table 2
Media Usage for the Four Areas

                      High-crime area       Low-crime area                  F value

                      City     Suburbs      City     Suburbs     High/low      City/     Inter-
Medium                (119)     (85)        (118)     (86)        crime        suburb    action

Total TV              36.25    31.71       18.89     25.03      [252172]       <1        4.98*
TV violence            6.97     3.73        2.11      3.33      [illegible]    2.72     18.11**
TV news                3.07     2.99        3.72      3.74       2.83          <1        <1
Radio news             5.07     4.96        5.44      5.37       9.79**        <1        <1
Newspaper reading      4.78     4.58        5.26      4.80       6.89**        6.01*     <1

* p < .05.
** p < .01.


television viewed by their respondents (in re-
sponse to the question “How many hours a
day do you usually watch television? Please
include morning, afternoon, and evening”)
and their fear of being a victim of a violent
[...]
about a 50-50 chance, about a 1-in-10 chance,
or about a 1-in-100 chance?”). However,
estimating from the data that are presented
in the various reports, we calculated a phi
coefficient of .13 and a contingency coefficient
of the same value.
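It may help to see why the phi coefficient and the contingency coefficient come out the same here. For a 2 × 2 table, chi-square equals N times phi squared, so Pearson's contingency coefficient C = sqrt(χ² / (χ² + N)) reduces to |phi| / sqrt(1 + phi²), which is nearly identical to phi when phi is small. The sketch below illustrates this; the cell counts are hypothetical and chosen only to give a phi near .13, since the original cell frequencies are not reproduced in this article.

```python
import math


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for the 2 x 2 table [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


def contingency_from_phi(phi: float) -> float:
    """Pearson's C for a 2 x 2 table, using chi-square = N * phi**2."""
    return abs(phi) / math.sqrt(1.0 + phi * phi)


# Hypothetical counts: rows are heavy/light viewers, columns are
# higher/lower estimated chance of being involved in violence.
hypothetical_phi = phi_coefficient(60, 40, 47, 53)

print(f"phi (hypothetical table) = {hypothetical_phi:.3f}")
print(f"C for phi = .13          = {contingency_from_phi(0.13):.3f}")
```

To two decimal places, a phi of .13 and its corresponding contingency coefficient are indistinguishable, as reported above.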

Note to Table 2. For TV viewing, numbers refer to mean number
of programs watched during the previous week. The
other measures are mean values on a scale where 1 equals
never and 6 equals daily. ns are in parentheses.

Looking at our data, then, we calculated
the (Pearson) correlation between our fear-
of-crime factor scores and our various mea-
sures of media usage. These correlations are
presented in the first column of Table 3. It
is quite clear that the basic effect is much
the same as that found by Gerbner and his
associates: Across the four areas, those who
watched the most television (or violent tele-
vision) tended to be those who were the most
afraid. However, the effect within area is
not quite so simple: Although the effect would
appear to hold in the high-crime area of the
city, it tended to disappear for the other
areas. Indeed, the average correlations (last
column of Table 3) indicate that there is es-
sentially no relationship between media usage
and fear of crime when the effect of neigh-
borhood is removed. We have suggested that
the artifact that created the first two correla-
tions in the first column might be labeled
“actual incidence of crime.” However, in
terms of the focus of this article (the rela-
tionship of media usage to fear of crime), the


Table 3
Correlations Between Media Usage and Fear-of-Crime Factor Scores for All Subjects (Pooled),
for Each of the Four Areas, and for the Average of the Four Areas

                       Pooled           High crime                Low crime
                       across                                                          Average
Medium                 all areas     City (83)   Suburb (69)   City (71)   Suburb (77)  correlation

Total TV                .18**      [illegible]*  [illegible]     .06         −.09         .09
TV violence          [illegible]      .22        [illegible]     .14         −.04         .07
TV news                 .05        [illegible]     −.04          .05          .06         .05
Radio news              .05           .18          −.09         −.02          .21         .07
Newspaper reading      −.07        [illegible]      .14          .09          .15        −.03

Note. Positive correlations indicate more fear associated with higher media usage. ns are in parentheses.
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Table 4 


Fear-of-Crime Questions and the Correlations Between Responses to Each Question and
Total TV Viewing and TV Violence for the Four Areas Pooled and the Average of the
Four Areas Calculated Individually


Question 


1. To what extent are crimes of violence 
a serious problem in your neighbor- 
hood? (399) 


2. What do you think the chances are 
that if you were to walk alone at 
night on the residential streets of 
your neighborhood each night for a 
month that you would be the victim 
of a serious crime? (391) 


3. If a child were to play alone in a park
each day for a month, what do you
think the chances are that he would
be the victim of a violent crime?
(382)


4. If you were to walk by yourself in a
park close to your home each night
for a month, what do you think the
chances are that you would be the
victim of a serious crime? (391)


5. What do you think the chances are
that an unaccompanied woman
would be the victim of a violent
crime late at night in a Toronto sub-
way station? (389)


6. What do you think the chances are
that you, one of your family, or one
of your close friends might be the
victim of an assault during the next
year? (385)


7. How likely do you think it is that you
or one of your close friends would
have their house broken into during
the next year? (405)


8. Do you ever decide not to walk alone
at night because you are afraid of
being the victim of a violent crime?
(402)


9. Is there any area around your home
(i.e., within a mile) where you
would be afraid to walk alone at
night? (403)


High TV viewing
associated with


1. Serious problem
2. High probability (1 in 10)
3. High probability (1 in 10)
4. High probability (1 in 10)
5. High probability (1 in 10)
6. High probability (1 in 10)
7. Extremely unlikely
8. Very often
9. Yes


Note. ns of respondents for each question are in parentheses.
* p < .05.


name given to this variable is unimportant:
When the effect of neighborhood is removed,
the “effect” of television is reduced to al-
most nothing.

Clearly, however, the artifact (whatever it
is called) measures in only the crudest way
the amount of crime that a person is exposed
to. For example, a person living in one part
of what we have labeled as a high-crime sec-
tion of the city might, in fact, be quite safe:
The crimes might well be in a different sec-
tion of that patrol area. However crude the
measure might be, the size of the correla-
tions does drop dramatically.
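The logic of the pooled versus within-area comparison can be made concrete with a toy example. The numbers below are invented solely to illustrate the mechanism and bear no relation to the survey data: within each of two hypothetical areas, viewing and fear are uncorrelated, but the area with more viewing also has more fear, so pooling the areas produces a sizable positive correlation.

```python
import math
from statistics import mean


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented data: two areas that differ in both viewing and fear.
areas = {
    "low-crime": {"tv": [0, 1, 2, 3], "fear": [1, 0, 0, 1]},
    "high-crime": {"tv": [4, 5, 6, 7], "fear": [3, 2, 2, 3]},
}

within = {name: pearson_r(d["tv"], d["fear"]) for name, d in areas.items()}
average_within = mean(list(within.values()))

pooled_tv = [x for d in areas.values() for x in d["tv"]]
pooled_fear = [y for d in areas.values() for y in d["fear"]]
pooled = pearson_r(pooled_tv, pooled_fear)

print(f"average within-area r = {average_within:.2f}")
print(f"pooled r              = {pooled:.2f}")
```

In this constructed case the pooled correlation is produced entirely by the difference between areas, which is the pattern the within-area analysis is designed to detect.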

It should be pointed out that correlations
are responsive to effects other than the
strength of the relationship between two
variables. In particular, as McNemar (1962)
points out, “the magnitude of the correlation
coefficient varies with the degree of heteroge-
neity ([...] correlated) of the sample” (p. 144). Given that
we have divided our overall sample (into the
four areas) in a manner that clearly relates
to both of the variables (see Tables 1 and
2), this curtailment of the variance could be
a problem. It turns out, however, not to be a
serious problem in this case. The ratios of the
standard deviations of the “curtailed” dis-
tribution (average of the four areas’ standard
deviations) to the uncurtailed distribution
(standard deviation for the four areas pooled)
are .965, .917, and .844 for the factor scores,
total TV viewing scores, and TV violence
scores, respectively. McNemar (1962) indi-
cates that “formulas for ‘correcting’ for double
curtailment are not too satisfactory” (p. 145).
However, correcting for the curtailment of
the range for the most curtailed distribution
(TV violence) would only raise the average
correlation between fear and amount of vio-
lent TV ([...] across the four
areas) [...].

The data look very much the same when
analyzed question by question. The nine
questions with the highest weight on the first
factor of the factor analysis are shown in
Table 4. In addition, the overall correlations
for all subjects pooled across areas are shown
in the first column (for each question with
total TV viewing) and the third column (for
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Table 5 
Stepwise Multiple Regression Summaryᵃ

                          R when      F when      F in final
Variable                  enteredᵇ    enteredᶜ    equationᵈ

High/low crime             .234       17.234**    16.704**
City/suburb                .235         .186        .051
Interaction: Crime
  × City/Suburb            .251        2.335       2.322
Sex                        .350       19.858**    21.176**
Age                        .376        6.605*      5.206*
Total TV                   .385        2.363        .058
TV violence                .391        1.703       1.537
Radio news                 .401        2.637       3.002
Newspaper reading          .403         .389        .389

ᵃ Variables entered in the order indicated.
ᵇ R achieved with this and all variables above it
included.
ᶜ Equivalent to a test of the significance of the par-
tial correlation between this variable and fear of
crime with all variables listed above it partialed out.
ᵈ Equivalent to a test of the null hypothesis that
the beta for this measure in the final equation in-
volving all nine variables is zero.
* p < .05 (df = 1 and > 290 for all Fs).
** p < .01.


the responses to each question and violent 
TV). All of the significant correlations are in 
the direction consistent with the Gerbner 
(Gerbner & Gross, 1974, 1976a, 1976b; Gerb-
ner et al., 1976) findings (i.e., more TV
associated with higher likelihood of victimiza- 
tion, etc.). Generally speaking, it is clear that 
the correlations tend to decrease substantially 
in size when they are run within the four 
areas and then pooled. 

An alternative method of analyzing these
data is in a stepwise multiple regression 
analysis using the fear-of-crime factor scores 
as the criterion and various other social and 
media exposure data as predictors. In order 
to control for neighborhood, this was entered 
first into the regression equation (coded as 
three variables: high/low crime, city/suburb, 
and their interaction). Next, two subject 
characteristics, sex and age, were entered, 
since both of them related to the fear-of- 
crime measure. (Not surprisingly, women 
and older people reported higher levels of 
fear than did men and younger people.) 
After these more “basic” variables had been
entered, total TV viewing and TV violence 
were entered. Finally, the frequency of listen-

Table 6
Standardized Canonical Variate Coefficients

                           High score
Variable                   indicates            Variate 1   Variate 2   Variate 3

Criterion set
  Question 1               No problem              .536       −.102        .339
  Question 2               Low chance              .064       −.108        .266
  Question 3               Low chance              .365       −.012       −.173
  Question 4               Low chance              .107       −.175        .283
  Question 5               Low chance             −.159       −.059       −.806
  Question 6               Low chance              .394        .216        .541
  Question 7               Unlikely               −.420        .131        .131
  Question 8               Never                  −.501       −.451        .417
  Question 9               No                      .028       −.577       −.281
Predictor set
  Total TV                 Much                    .079        .088       −.063
  TV violence              Much                   −.136       −.017        .335
  Radio news               Little                 −.177       −.194        .430
  Newspaper reading        Little                 −.014       −.168     [illegible]
  High/low crime area      High                   −.609        .472     [illegible]
  City/suburb              City                   −.279       −.145       −.845
  Interaction:             High suburb/            .465       −.283       −.236
    Crime × Location         low city
  Age                      Older                   .010        .163        .096
  Sex                      Female                  .229        .889       −.110
Canonical correlation                              .608        .468        .305
p valueᵃ                                          <.001       <.001       <.002

ᵃ Using Wilks’ lambda. Using the method of the greatest characteristic root, the third pair of canonical
variates is not significant. For a discussion of this problem, see Harris (1976).


ing to radio news and newspaper reading 
were entered into the equation. 

The results are shown in Table 5. It is 
clear that after the subject characteristics had 
been entered, the media questions had no 
significant predictive value. Most relevant to 
the Gerbner results is of course the lack of 
importance of total TV viewing when it first 
entered the equation.
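
The order-of-entry logic described above can be illustrated with a minimal sketch. It is not the authors' analysis: the data are synthetic, the variable names and effect sizes are hypothetical, and plain ordinary least squares with a fixed entry order stands in for the stepwise procedure. It shows only how the increment in R-squared for a later block (here, total TV viewing) can be near zero once neighborhood and subject characteristics are already in the equation.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (intercept added here)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - resid.var() / y.var()

def hierarchical_r2(blocks, y):
    """Enter predictor blocks in a fixed order.

    Returns a list of (block name, cumulative R^2, increment in R^2).
    """
    entered, results, previous = [], [], 0.0
    for name, X in blocks:
        entered.append(X)
        total = r_squared(np.column_stack(entered), y)
        results.append((name, total, total - previous))
        previous = total
    return results

# Synthetic, hypothetical data: fear depends on area, sex, and age only;
# TV viewing differs by area but has no direct effect on fear.
rng = np.random.default_rng(0)
n = 400
high_crime = rng.integers(0, 2, n)
city = rng.integers(0, 2, n)
neighborhood = np.column_stack([high_crime, city, high_crime * city])
sex = rng.integers(0, 2, n)
age = rng.normal(40, 12, n)
total_tv = rng.normal(3, 1, n) + 0.8 * high_crime
fear = 1.0 * high_crime + 0.5 * sex + 0.02 * age + rng.normal(0, 1, n)

blocks = [
    ("neighborhood", neighborhood),
    ("sex, age", np.column_stack([sex, age])),
    ("total TV", total_tv),
]
steps = hierarchical_r2(blocks, fear)
for name, total, gain in steps:
    print(f"{name:12s} cumulative R^2 = {total:.3f}  increment = {gain:.3f}")
```

Because each block is added to the ones before it, the increment for a block answers the question the text poses: what the block predicts beyond the variables already entered.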

Finally, a canonical correlation analysis 
was done, using the nine fear-of-crime ques- 
tions (see Table 4) as the criterion set and 
the same nine variables as in the multiple re- 
gression analysis (see Table 5) as the pre- 
dictor set. Three significant canonical corre- 
lations were found. The variates associated 
with these correlations are shown in Table 6. 
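
For readers unfamiliar with the technique, the following is a minimal sketch of how canonical correlations between a criterion set and a predictor set can be computed. It uses the QR-plus-SVD approach on synthetic data; the set sizes, loadings, and variable names are hypothetical and are not taken from the study, and the function assumes both sets have full column rank.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and the columns of Y.

    Each set is centered, an orthonormal basis is taken for each set (QR),
    and the singular values of the cross-product of the bases are returned.
    Assumes both sets have full column rank.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic example: one shared latent dimension links the two sets.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
X = np.outer(latent, [1.0, 0.5, 0.0]) + 0.5 * rng.normal(size=(n, 3))  # "criterion" set
Y = np.outer(latent, [1.0, 0.0]) + 0.5 * rng.normal(size=(n, 2))       # "predictor" set

rho = canonical_correlations(X, Y)
print(rho)  # one large correlation, then one near zero
```

The number of canonical correlations equals the size of the smaller set, and they come out in descending order, which is why later pairs of variates (such as the third pair in Table 6) are the ones most likely to fall short of significance.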

The first pair of canonical variates suggests
that those who do not see crimes of violence
as a problem in their neighborhood (Question
1), who do not think that a child playing
alone in a park is in danger (Question 3),
and who do not think that they themselves
are likely to be victims of an [...] (Question
6), but who are afraid that their houses will
be broken into (Question 7) and who do not
walk alone at night (Question 8) tend to be
females living in low-crime areas.

The second pair of canonical variates appears
to indicate that people who [have areas] near
them that they will not walk in alone
(Question 9) and who fear walking alone at
night (Question 8), but who do not feel
they will be victims of a violent crime
(Question 6), tend to be females living in
high-crime (city) areas who listen to [...].

The third set of variates suggests that
those who think that unaccompanied [...]
subway riders (Question 5) and children
playing alone in parks (Question 3) are
vulnerable to attacks, but who themselves do
not feel vulnerable (Question 6) and do not
[worry] about walking alone at night (Question 8),
since their neighborhoods are safe (Question
1), tend to be suburban (low-crime area)
residents who watch a lot of violent TV and
do not listen to radio news.

The total amount of television watched did
not seem to be important, and the amount of
violent television watched entered into the
interpretation only in the third canonical
variate. Even in this case, it appears that the
amount of violent TV watched related
positively [...] crime.

In summary, then, looking at these data 
from three somewhat different points of view,
it appears that the amount of television
watched did not relate to the amount of fear
a person felt about being a victim of crime
when other, more basic variables were taken
into account. 


Other Findings 


As indicated earlier, we asked 25 other 
questions. Most of these questions related, 
directly or indirectly, to the respondents’ 
views of the nature and frequency of crime 
or violence around them. In 14 of the ques- 
tions, there was a significant relationship 
(pooled or calculated individually and then 
averaged) between TV viewing and the re- 
sponse to the question. The questions and 
the relationship of each question to TV view- 
ing are shown in Table 7. What is noteworthy 
about these correlations is that there is generally
not a substantial drop when the correlations
are computed within area and then
averaged (column 2 of Table 7). Thus, it appears
that the relationship between total TV
viewing and responses to these questions is
not mediated by the area in which the respondent
resides. These were the only other
questions that correlated with TV viewing
and, at least from our point of view, they are
qualitatively different from those that were
large contributors to the “fear index” (see
Table 4). Because it is not of central interest
to this study, we did not look at other possible
factors that might account for the correlations
that we have presented.
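
The contrast between a pooled correlation and the average of correlations computed within each area can be made concrete with a small sketch. The data below are synthetic and the area effects are hypothetical; the point is only that when both variables differ by area, a sizable pooled correlation can coexist with within-area correlations near zero.

```python
import numpy as np

def pooled_and_within(x, y, area):
    """Return (correlation over all cases, mean of the within-area correlations)."""
    pooled = np.corrcoef(x, y)[0, 1]
    within = np.mean([
        np.corrcoef(x[area == a], y[area == a])[0, 1]
        for a in np.unique(area)
    ])
    return pooled, within

# Synthetic example with four areas: both variables shift with area,
# but they are unrelated to each other inside any one area.
rng = np.random.default_rng(1)
n_per_area = 100
area = np.repeat(np.arange(4), n_per_area)
area_level = np.array([0.0, 1.0, 2.0, 3.0])[area]  # hypothetical area effect
tv = area_level + rng.normal(0, 1, area.size)
fear = area_level + rng.normal(0, 1, area.size)

pooled, within = pooled_and_within(tv, fear, area)
print(f"pooled r = {pooled:.2f}, average within-area r = {within:.2f}")
```

A relationship that survives the within-area computation, as reported for the Table 7 questions, cannot be explained this way.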


Discussion 


A number of things are reasonably clear 
from these data. First of all, the basic find- 
ings of Gerbner and his associates (Gerbner 
& Gross, 1974, 1976a, 1976b; Gerbner et al., 
1976) are replicable: People who watch a 
lot of television are more likely to indicate 
fear of their environment. It is equally clear, 
however, that this relationship disappears 
when attempts are made to control for other 
variables, including the actual incidence of 
crime in the neighborhood. Thus, it would 
appear that television itself is not likely to 
be a direct cause of people’s fear of being 
victims of crime. 

Although clearly at the level of specula- 
tion, it is interesting to note that Gerbner’s 
(Gerbner & Gross, 1974) own data on this 
issue were collected by telephone interview 
in four cities: Philadelphia, Chicago, Los An- 
geles, and Dallas. One can assume that there 
exists in these cities some variability in the 
dangerousness of different neighborhoods. 
Since households for that survey were se- 
lected randomly from telephone books, it 
seems reasonable to expect that neighborhoods 
differing in actual dangerousness would be 
included from each city. This variation could 
be sufficient to produce the apparently small 
correlation that Gerbner found. More inter- 
esting, however, is the possibility that for 
some unspecifiable reason, the relationship 
only holds in high-crime areas, or in high-crime
cities in particular. As shown in Table
3, we, too, got significant correlations within 
the high-crime area of the city of Toronto. 
One possible, admittedly post hoc, explanation
for this result is that television violence
in the form of police shows and so forth 
deals mostly with high-crime city neighbor- 
hoods. It is possible that people outside of 
such areas do not feel that the violence on 
television has any relevance for them; hence, 
there is no relationship between the amount 
of television watched and the perception of 
the likelihood of being a victim.

Table 7
Questions Associated Significantly With TV Viewing: Correlations Between Responses to
Each Question and Total TV and TV Violence for the Four Areas Pooled and the
Average of the Four Areas Calculated Individually

                                                Total TV            TV violence
                                                     Average              Average
                                                     within               within    High TV viewing
Question                                   Pooled    area       Pooled    area      associated with

10. Would you imagine that you would be    [illeg.]  .09*       .04       .06       Previously known
    more likely to be seriously harmed
    by someone you knew previously
    or by a complete stranger? (400)

12. How dangerous do you think it is for   .09*      .06        .06       .06       Dangerous
    a female driver of a car to pick up
    a male hitchhiker who is a
    stranger? (404)

13. Do you think that it would be a good   .11*      .06        .12       .08       Definitely yes
    idea to spend more money on police
    patrols of your area of the city?
    (403)

17. Do you think that it is useful for     [illeg.]  .20*       [illeg.]  [illeg.]  Definitely yes
    people to keep firearms in their
    homes to protect themselves? (405)

20. Should women carry a weapon such       .18*      .17*       .19*      .18*      Definitely yes
    as a knife to protect themselves
    against sexual assault? (405)

21. Some people have suggested that one    .09*      .07        .09*      .06       Definitely yes
    way to reduce the incidence of
    violent crime is to encourage people
    to stay away from areas thought to
    be high in crime. Do you think that
    this is a good way of dealing with
    the problem of crime? (403)

22. What proportion of murders in          [illeg.]  .10*       .08*      .09*      High proportion
    Toronto do you think are committed
    by people who could be classified
    as mentally ill? (381)

23. Approximately what proportion of       .10*      .12*       .09*      .11*      High proportion
    assaults in Toronto are directed
    against members of racial minorities
    (i.e., nonwhites) by whites?
    (359)

24. What proportion of serious assaults    .08       .11*       .08       .12*      High proportion
    in Toronto do you think are carried
    out by nonwhites? (364)

25. How many murders do you think          [illeg.]  [illeg.]   [illeg.]  [illeg.]  Large number
    took place in metropolitan Toronto
    during 1975? (372)

26. During the last 5 years, how many      [illeg.]  .12*       .09*      .08*      Large number
    people do you think were murdered
    in the TTC subway? (384)

31. If you were walking alone on a         .10*      [illeg.]   [illeg.]  [illeg.]  Definitely not
    residential street at night and
    someone asked you for directions,
    would you stop and give him the
    directions? (404)

32. If a person were to have an epileptic  .10*      .07        .08       [illeg.]  Very likely
    seizure on the street in front of
    you, how likely do you think most
    people would be to help? (405)

33. If, in the middle of the night, a      .11*      [illeg.]   .05       .03       Not help
    stranger knocked on your door and
    asked to use your telephone to call
    someone to help him start his car
    that had apparently stalled on your
    street, which of the following would
    you be most likely to do? (404)

Note. ns of respondents for each question are in parentheses.
*p < .05.


The second general point that should be 
made about our data is that although the 
correlation between TV viewing and fear 
dropped off when neighborhood was used as 
a controlling factor, this same factor did not 
eliminate the relationship between TV view- 
ing and other factors (see Table 7). It is pos- 
sible that the questions listed in Table 7 are, 
in fact, related to television viewing because 
they deal with matters of a more factual na- 
ture than the questions having to do with 
the person’s own level of fear. Thus, television
may well act as a source of information
[...] questions of fact, whereas it
[...] people’s views of how afraid [...]


Retrieval Selectivity in Memory-Based Impression Judgment


John H. Lingle 
Livingston College, Rutgers— 
The State University 


Judgments about others are often based on memory for information about the 
persons being judged. Three studies are reported that use decision time to de- 
termine what information subjects selectively recall when they make memory- 
based person judgments. Each study employed a sequential judgment paradigm 
in which a subject first made an impression judgment about a person on one 
dimension while stimulus information was continuously available. Immediately 
thereafter, the subject made a second judgment about the same person on a 
different dimension without the stimulus information being available. It was 
concluded that subjects’ memory-based judgments were based on memory for 
their first impression judgments combined with a selective memory search for
negative stimulus information.


In recent years a great deal of impression 
formation research has been directed towards 
describing how people integrate multiple in- 
formation items when they make a single 
person-impression judgment. Such research 
has dealt almost exclusively with stimulus- 
based judgments. These are judgments that 
derive from descriptive information that the 
experimenter provides immediately prior to, 
or simultaneously with, the judgment that is 
made. In most day-to-day situations, how- 
ever, the judgments that people make about 
others are memory-based. That is, rather 
than being derived from a set of presented 
factual information items, the judgments are 
based on a selective sampling of information 
from a cognitive representation of the person 
that the perceiver has built up and stored 
in memory. In order to understand the majority
of interpersonal judgments that people
make, it is necessary to understand how (and
what kind of) information is sampled from
memory when memory-based interpersonal
decisions are made. To examine this question,
this article reports three experiments
that investigated the degree to which an initial
stimulus-based judgment that a person
makes about a stranger influences a subsequent
memory-based decision made about
the same individual.


Effects of a Stimulus-Based Judgment on
Subsequent Memory-Based Decisions


[... the information that a] person selectively
samples from memory and considers when
making a memory-based judgment will be
similar in composition to the information
available when an initial stimulus-based
judgment is made, especially when the
judgments occur close together in time. That
is, in both cases a person may simply review
the factual stimulus information (either as it
is presented or from memory) and [integrate]
it into a single response. Some [...]
forgetting of the stimulus information [may]
occur in the case of the memory-based judgment,
but this would not represent an important
difference in the composition of the
sampled information as long as the forgetting
was unselective.

Alternately, there are several ways in which 
an initial judgment might affect the cognitive 
representation a person forms about another 
individual as well as the information likely
to be sampled from memory when a later
memory-based decision is made. One way
would be to make available from memory
new thoughts or beliefs (apart from the items
of stimulus information) that could be recalled
and used as the basis of a decision.
Such potentially available cognitions would
include memory for the initial judgment as
well as any person characteristics normally
associated with that judgment. It has been
shown, for example, that people are sometimes
able to remember their own cognitive
responses better than the stimulus information
on which such responses are based (Greenwald,
1968). Furthermore, research exists
demonstrating that people rely on abstracted 
inferences and affective impressions in addi- 
tion to factual information in decision tasks 
(cf. Posner & Snyder, 1974). Thus, if a per- 
son were to decide upon meeting someone 
that the new acquaintance would probably 
be a good lawyer, she or he might later con- 
clude that the same person would make a 
good judge, simply because, being a “good 
lawyer type,” the acquaintance would likely 
be intelligent and judicious. Such a judgment 
process would be different from, and highly 
dependent on, the earlier stimulus-based judg- 
ment (i.e., this person would make a good 
lawyer). 

A second way in which an initial judgment
[may affect] the thoughts that are [...]
memory-based judgment [is] to influence the
likelihood [that items] of the original stimulus
[information will] later be recalled. There
[are two reasons why] stimulus information
relevant to [the initial judgment] should be
remembered [better than] irrelevant information.
First, it [...] information is processed when it
is first encountered, [the better] it will later
be remembered (Craik & Lockhart, 1972;
Craik & [Tulving], 1975). When a person makes an
initial stimulus-based judgment about someone,
it seems likely that she or he will pay
greatest attention to, process more deeply, 
and remember better information that is 
relevant to making that judgment. Second, if 
an initial judgment is later remembered, it 
could easily serve as a cuing mechanism and 
facilitate the recall of information that had 
been instrumental in reaching the judgment 
(cf. Tulving & Thomson, 1973; Tulving & 
Watkins, 1975). Both processes, then, would 
likely result in judgment-relevant informa- 
tion being remembered better than judgment- 
irrelevant information when memory-based 
person judgments are made. 


Availability Versus Retrieval Selectivity 


In considering how an initial stimulus-based 
judgment influences a subsequent memory- 
based judgment, it is necessary to distinguish 
between the entire set of thoughts a person 
might have about another and the cognitions 
that are selectively sampled in the process of 
reaching a specific decision about that per- 
son. For example, research by Geva (1977), 
Lingle and Ostrom (in press), and Lingle, 
Geva, Ostrom, Leippe, and Baumgardner (in 
press) has provided evidence that an initial 
stimulus-based judgment influences how subjects
cognitively represent a described person,
as measured by memory for the stimulus
traits and freely generated thoughts used to
describe the stimulus person. However, such
descriptions and recall measures do not identify
how an initial judgment affects the set
of cognitions that is sampled and actually
used when a later memory-based judgment is
made. To study this problem, the three experiments
reported in this article used decision
time as a basis for making inferences about
the set of cognitions subjects rely on when
they make memory-based person judgments.
Experiment 1 was designed to determine
whether decision time is affected by the degree
of similarity between a first (stimulus-based)
and second (memory-based) judgment. Experiments
2 and 3 used decision time to test
three alternative processing models capable
of accounting for the results of Experiment 1.


Experiment 1


If an initial stimulus-based judgment in- 
fluences the manner in which a person cog- 
nitively represents another, thoughts that are 
recalled when a later memory-based decision 
is required should be more relevant to a simi- 
lar subsequent judgment than to a dissimilar 
one. As a result, subjects should be able to 
make similar second judgments more quickly 
than dissimilar ones. 

Experiment 1 tested the hypothesis that an 
initial judgment influences the set of cogni- 
tions on which a later judgment is based by 
determining whether the similarity between a 
first and second judgment influenced how 
quickly the second judgment could be made. 
If the first judgment has no systematic effect 
on the cognitive representation, then decision
time for the second judgment should not be 
affected by the similarity between the two 
judgments. To test the hypothesis, a within- 
subjects design was used in which 12 subjects 
made two occupational judgments about 12 
different stimulus persons. Subjects made each 
of their second judgments without being reexposed
to the stimulus traits presented when
they made their first occupational judgment. 
For some of the trials, the initially judged 
occupation was similar to the second occupa- 
tion, whereas on other trials, the initial oc- 
cupation was dissimilar to the second. 


Method 


Subjects. Subjects in the [experiment were]
eight male and four female undergraduate students
from Ohio State University [...].

[...] It was explained that at the [...]
an initial occupation would be projected on the wall
[...] stimulus person. The subject was to consider the
suitability of the person for the job previously
shown. After the traits had been displayed for a
fixed period of time, the initial occupation was again
presented, and the subject was asked to register his
or her decision by moving the toggle switch to the
good or bad position. The stimulus traits were displayed
for a fixed 20 sec in order to minimize [...]
form an initial impression, and decide whether they
thought the person would be successful in the designated
occupation.

After a subject registered his or her initial occupational
rating, a second occupation was shown, and
the subject was asked to judge the suitability of the
same stimulus person for this second profession. The
subject was told to register the decision for this
second judgment as soon as he or she was able to
make it. Finally, a blank slide was presented to indicate
the end of the trial. The process was repeated
with a new set of traits and occupations until each
subject had made a pair of occupational judgments
for 12 different stimulus persons. On each trial, the
blank slide and initial occupation were each displayed
for 5 sec; the slide with four traits appeared
for 20 sec. The two decision slides followed one another
immediately, each being changed 3 sec after
the subject had indicated his or her judgment. Two
practice trials (one with a similar and one with a
dissimilar pair of judgments) were repeated until
both were completed without a mistake, in order
to ensure that subjects understood the procedure.
At the end of the experiment, probes for suspicion
were conducted, and subjects were debriefed as to
the purpose of the research.

Second decision time was measured from the moment
at which the slide change mechanism of a
Kodak Carousel 800H projector was activated in displaying
the stimulus slide, to the point at which
the subject moved the response switch. Decision time
was measured by a Hewlett-Packard 12.5-MHz electronic
counter and was automatically recorded.

Design and stimulus materials. Twelve groups of
four occupations were generated. Each group contained
two pairs of similar occupations, each of
which was dissimilar from the occupations of the
other pair (e.g., store clerk/salesman and lawyer/judge).
Three independent judges demonstrated 100% agreement
in judging each pair of similar occupations to
be more similar than any possible dissimilar pair of
occupations within the same occupational group (according
to the criterion that “the occupations [would]
require individuals with similar characteristics”).
Each group of four occupations was paired with four
randomly selected descriptive traits chosen from
Anderson’s (1968) trait-adjective list.

A completely within-subjects design was used in
which each of the 12 subjects judged one pair of
occupations from each of the 12 occupational groups.
The design was counterbalanced so that across the
12 subjects, each occupation appeared twice with
each of the three occupations in its group, one time
as the initial judgment and one time as the second
judgment. Consequently, over the 12 subjects, each
occupation within an occupational group was used
as an initial judgment three times, as a second dissimilar
judgment two times, and as a second similar
judgment one time. The order of presentation of
similar and dissimilar occupational pairs was counterbalanced
across the 12 subjects by means of a
Latin square (12 subjects being the minimum number
of subjects that allowed complete counterbalancing
of occupations across subjects).
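
The counts stated above follow from enumerating every ordered (first, second) pairing within one occupational group. The short sketch below checks them for a single group, using the example occupations given in the text; the "A"/"B" labels marking the two similar pairs are hypothetical, and the assignment of pairings to particular subjects through the Latin square is not modeled.

```python
from collections import Counter
from itertools import permutations

# One occupational group: two pairs of similar occupations.
# The letters mark which occupations count as similar to each other.
group = {"store clerk": "A", "salesman": "A", "lawyer": "B", "judge": "B"}

# Every ordered (initial, second) pairing of two different occupations.
ordered_pairs = list(permutations(group, 2))

initial = Counter(first for first, _ in ordered_pairs)
second_similar = Counter(
    second for first, second in ordered_pairs if group[first] == group[second]
)
second_dissimilar = Counter(
    second for first, second in ordered_pairs if group[first] != group[second]
)

print(len(ordered_pairs))       # 12 pairings, one per subject
print(dict(initial))            # each occupation is the initial judgment 3 times
print(dict(second_similar))     # ... a second similar judgment once
print(dict(second_dissimilar))  # ... a second dissimilar judgment twice
```

Four occupations yield 4 x 3 = 12 ordered pairings, which is why 12 subjects is the smallest number that covers every pairing exactly once.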


Results and Discussion 


During the debriefing, none of the subjects 
was able to verbalize any part of the experi- 
mental hypothesis, nor did any subject indi- 
cate an awareness that decision time was the 
principal dependent variable. 

A test of the experimental hypothesis depended
on an examination of subjects’ second
decision times. To review the alternative possibilities,
if subjects unselectively review the
stimulus information both when they make an
initial stimulus-based judgment and when
they make a subsequent memory-based judgment,
the similarity of the first judgment to
the second should not affect how quickly the
later decision is made. On the other hand, to
the extent that subjects base a second judgment
on a set of remembered cognitions that
is influenced by an initial judgment, they
should make second similar judgments more
quickly than second dissimilar judgments.
Consistent with this latter alternative, mean
[decision] time was significantly shorter for
[similar] occupational judgments than for dissimilar
judgments (Ms = 5.2 versus 6.3 sec,
respectively), tdep(11) = 2.78, p < .02. When
[examined] individually, 10 out of the 12 subjects
spent less time on the average in making
[similar as opposed] to dissimilar second
[judgments. This] was significant by the sign
[test] (p < .04).

[...] the subjects’ second decisions
[matched] their first decisions more
often [when] they made similar (as opposed to
dissimilar) second judgments (81% versus
63%, respectively). It was conceivable, therefore,
that the differences in second judgment
times resulted merely from subjects consistently
making two matching responses more
quickly than two nonmatching responses, not
from the difference in relevance of their remembered
impressions to their second judgments.
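
The sign-test result for 10 of 12 subjects can be checked with an exact binomial calculation. The sketch below assumes a two-tailed test with a null probability of .5, an assumption that is consistent with the reported p < .04 but is not stated in the text; the function name is illustrative.

```python
from math import comb

def sign_test_two_tailed(successes, n):
    """Exact two-tailed sign test: probability of a split at least this
    extreme when each subject is equally likely to fall either way."""
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

p = sign_test_two_tailed(10, 12)
print(round(p, 4))  # 0.0386
```

The upper tail is (66 + 12 + 1) / 4096 = 79/4096, and doubling it gives about .039, which falls just under the .04 reported.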

To examine this alternative explanation, 
the point-biserial correlation between second 
decision time and whether a second decision 
matched its corresponding first decision was 
computed for both the similar and dissimilar 
second judgment conditions. In the similar 
conditions, there was a small positive correlation
(r = .17) such that subjects showed a
slight tendency to make faster decisions when 
they made two matching judgments. How- 
ever, in the dissimilar judgment conditions, 
the correlation was slightly negative (r= 
—.15). The correlation across both experi- 
mental conditions was —.04. Thus, the pat- 
tern of correlations did not support the ex- 
planation that the decision-time differences 
resulted from subjects persistently making 
matching second responses more quickly than 
nonmatching ones. 
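
A point-biserial correlation is the ordinary Pearson correlation applied to one binary and one continuous variable. The sketch below computes it from the two group means and confirms that it agrees with the Pearson formula; the match indicators and decision times are made-up illustrative values, not data from the experiment.

```python
import numpy as np

def point_biserial(binary, values):
    """Point-biserial r from group means: (M1 - M0) * sqrt(p * q) / SD,
    where SD is the population standard deviation of all values."""
    binary = np.asarray(binary, dtype=float)
    values = np.asarray(values, dtype=float)
    m1 = values[binary == 1].mean()
    m0 = values[binary == 0].mean()
    p = binary.mean()
    return (m1 - m0) * np.sqrt(p * (1 - p)) / values.std()

# Hypothetical trials: 1 = second decision matched the first, 0 = it did not.
matched = [1, 1, 0, 1, 0, 0, 1, 1]
times = [4.8, 5.1, 6.0, 5.5, 5.2, 6.4, 4.9, 5.8]  # decision times in seconds

r_pb = point_biserial(matched, times)
print(round(r_pb, 3), round(np.corrcoef(matched, times)[0, 1], 3))  # same value twice
```

How the match variable was coded is not stated in the text, so the sign of a correlation computed this way depends on a coding choice.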

The difference in judgment times also did 
not appear to result from subjects making a 
disproportionate number of negative responses 
in either of the two conditions. The per- 
centage of negative responses was almost 
identical (54% for similar versus 57% for
dissimilar). Furthermore, the point-biserial 
correlation between second-decision valence 
and response time was negligible (the correlations
for the dissimilar, similar, and combined
conditions were .00, -.03, and .01,
respectively).

Although the results from Experiment 1 
are inconsistent with the thesis that subjects 
based their second judgments on an unse- 
lected review of the presented stimulus traits, 
they do not discriminate between any of 
the several alternative information-processing 
models capable of accounting for the differ- 
ences in similar and dissimilar judgment times. 
Experiments 2 and 3 were designed for this
purpose.


Experiment 2


In the introduction, three distinct categories
of remembered cognitions on which a
subject might base a second judgment about
a person were identified: (a) traits originally
used to describe the person, (b) a previous
judgment the subject had made about the
person, and (c) characteristics associated
with an earlier judgment. Several models of
how these three types of cognitions are used
singly, and in combination with each other,
would be capable of accounting for the differences
in judgment times obtained in Experiment
1. In this section, three such models
are described. Each model makes a different
prediction about how the pattern of second
judgment times observed in Experiment 1
would be affected by the number of traits
(set size) used to describe the stimulus person.
Experiment 2 manipulated both similarity
and set size in order to test which of
the models best describes how subjects process
information when they make second occupational
judgments. As with any such test,
the three models (and their differing predictions)
rest on several assumptions that are
briefly identified prior to our discussing the
models themselves.

The first such assumption is that the amount
of time subjects require to review a set of
information items in memory increases as the
number of items in the set increases. This assumption
is empirically well supported and
has been shown to hold true in a wide range
of cognitive processing tasks including set inclusion
tasks (Sternberg, 1969), person-trait
identification tasks (Posner & Snyder, 1974),
and paragraph comprehension tasks (Kintsch,
1974).

A second assumption is that the amount of
time subjects require to make a judgmental
response in this task, once they have reviewed
[...] empirically than the first, there is evidence
that in at least some judgment tasks, response
time is independent of the amount of information
that is reviewed in memory to make the
judgment (cf. [...]).

A final assumption of Experiment 2 (and
Experiment 3) [...] function of the relevance
of a subject’s cognitive representation to the
judgment. It is assumed that the less relevant
a set of remembered cognitions is to a
judgment, the more time a subject will have
to spend in reviewing his or her cognitive
representation in order either to find
judgment-relevant cognitions or to generate
judgment-relevant inferences.

Figure 1. Predicted second judgment times for three
different processing models when judgment similarity
and set size are both varied.

With these assumptions as background, the
three processing models tested in Experiment
2 can be described. The patterns of decision
times predicted by each model when judgment
similarity and trait set size are varied
are displayed in Panels a, b, and c of Figure 1.

Three Models of Memory-Based Impression
Judgments


An information-retrieval model. [One model]
capable of accounting for the results of Experiment
1 posits that when subjects make
their second occupational judgments [they]
serially review from memory the [stimulus]
traits until they feel they have considered a
sufficiently representative proportion of the
available factual information to reach a decision.
The major effect of the first judgment is
to increase the probability that traits [relevant]
to the first decision will more readily be recalled
to mind as compared with traits irrelevant
to that decision. Since the amount
of time required to review a set of items in
memory is assumed to increase as set size increases,
subjects in both the similar and dissimilar
conditions should take longer to make
their judgment as the amount of information
in the person description increases. However,
something else should also occur as set size
increases. Since some percentage of the traits
used to describe each stimulus person is relevant
to the first judgment (e.g., 50%), the
absolute number of relevant traits, and the
time required to recall a representative number
of these traits, will increase with set
size. When subjects are making a similar second
occupational judgment, these traits, being
relevant to the second judgment as well,
should be sufficient to make a decision. However,
when making a dissimilar second judgment,
subjects should not find these traits
particularly helpful, since the dissimilar occupation
was selected to require dissimilar
abilities. In the dissimilar condition, then,
subjects should have to recall the remaining
traits in the set. Since the number of remaining
traits will also increase with set size, subjects
will have to spend additional time remembering
and reviewing them. As a result,
the average difference in decision time for
similar and dissimilar second judgments should
increase as set size increases. This pattern of
results is depicted in Panel a of Figure 1.

A judgment-retrieval model. In the introduction,
it was argued that an initial decision
and its associated characteristics may serve
as the basis for subsequent judgments. The
second model focuses on retrieval of the initial
judgment rather than on retrieval of the
information or events on which the first impression
was based. If subjects tend to base
a second judgment on an initial judgment
(and its associated characteristics), the number
of cognitions considered in reaching a second
[judgment should not depend on the number of
traits used to] describe a person. As a result,
neither similar nor dissimilar judgment times
would be expected to increase with set size.
However, because the initial judgment and its
related attributes would be less relevant to
the second judgment in the dissimilar (as
opposed to the similar) condition, subjects
would be expected to spend additional time
reaching a decision in the former situation.
The pattern of decision times that would be
expected is pictured in Panel b of Figure 1.

A mixed model. People are flexible infor- 
mation processors capable of using myriad 
decision strategies. It is possible that the dif- 
ference in decision times in Experiment 1 
came about because subjects used different 
decision strategies in making the two types of 
judgments. When subjects are faced with 
two very similar judgments, they may tend 
to base their second decision on their first, 
without considering the original stimulus in- 
formation. However, when confronted by a 
dissimilar second judgment, subjects may 
find their first judgment insufficient grounds 
for the second and begin to review from mem- 
ory the previously presented stimulus traits. 
In such a case, decision time for dissimilar 
judgments would increase with set size, 
whereas decision time for similar judgments 
would not. The pattern of response times 
would then be expected to parallel Panel c 
of Figure 1. 
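
The qualitative predictions of the three models can be sketched in a few lines of Python. This is only an illustration of the patterns described for Panels a, b, and c of Figure 1: the function name and every parameter value (`base`, `per_item`, `relevant_share`, `dissimilar_cost`) are hypothetical and are not taken from the article.

```python
def predicted_times(set_sizes, model, base=3.0, per_item=0.25,
                    relevant_share=0.5, dissimilar_cost=0.8):
    """Return {set_size: (similar_time, dissimilar_time)} for one model.

    All numeric defaults are illustrative assumptions, not estimates
    from the reported data.
    """
    models = ("information_retrieval", "judgment_retrieval", "mixed")
    if model not in models:
        raise ValueError(f"model must be one of {models}")

    predictions = {}
    for n in set_sizes:
        if model == "information_retrieval":
            # Similar judgments review only the relevant traits;
            # dissimilar judgments must also review the remaining traits.
            similar = base + per_item * relevant_share * n
            dissimilar = base + per_item * n
        elif model == "judgment_retrieval":
            # Only the first judgment is retrieved, so set size is irrelevant;
            # dissimilar judgments carry a constant extra cost.
            similar = base
            dissimilar = base + dissimilar_cost
        else:  # mixed
            # Similar judgments reuse the first judgment; dissimilar
            # judgments fall back on reviewing the stimulus traits.
            similar = base
            dissimilar = base + per_item * n
        predictions[n] = (similar, dissimilar)
    return predictions


if __name__ == "__main__":
    for name in ("information_retrieval", "judgment_retrieval", "mixed"):
        print(name, predicted_times([2, 4, 6], name))
```

Under these assumptions, the similar-dissimilar difference grows with set size for the information-retrieval and mixed models but stays constant for the judgment-retrieval model, which is the contrast the experiments are designed to test.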

To test which of the three models best ac- 
counts for the similarity effect of Experiment 
1, two sequential replications of Experiment 
2 were conducted. In Replication 1 homoge- 
neous trait sets, comprised of either all posi- 
tive or all negative traits, were used to de- 
scribe the stimulus persons. In Replication 2,
heterogeneous trait sets, consisting of both 
positive and negative characteristics, were 
employed. The purpose of the two replica- 
tions was to extend the generalizability of 
the results to more than one configuration of 
person information. Conceivably, the decision 
strategy adopted by a subject is affected by 
the degree of heterogeneity in the informa- 
tion set. For example, people may be more 
prone to engage in information retrieval than 
judgment retrieval with heterogeneous (as 
opposed to homogeneous) stimulus ensembles. 

Both replications in this study used a 
within-subjects design in which subjects made 
pairs of occupational judgments about 12 
different stimulus persons. As in Experiment
1, subjects made their second judgments with-
out access to the stimulus traits. Half of these 
second judgments were similar to a first judg- 
ment and half dissimilar; an equal number of 
the 12 stimulus persons each subject evalu- 
ated were described by two, four, and six 
traits.


Method 
Subjects. Subjects were 48 undergraduates from
Ohio State University who participated in the ex-
periment in partial fulfillment of an introductory
psychology course requirement. In Replication 1, 11
male and 13 female subjects participated; in Replica-
tion 2, the subjects were 14 males and 10 females. As
in Experiment 1, subjects were randomly assigned to
one of the counterbalancing presentation orders.
Procedure. The experimental procedure and in- 
structions to the subjects were identical to Experi- 
ment 1. Subjects were first shown an initial occupa- 
tion, followed by a set of stimulus traits that was 
displayed for 20 sec. Thereafter, they were asked to 
register their decision concerning the described per- 
son’s suitability for the initially displayed occupation, 
as well as for a second occupation that was either 
similar or dissimilar to the first. As in Experiment 1, 


In Experiment 2, the number of practice trials
was increased to three in order to include stimulus 


by a dissimilar first occupational judgment for 
the other half. Likewise, one third of the stimulus

Figure 2. Mean second occupational judgment times
from Experiment 2 as a function of judgment-pair
similarity and set size.


dissimilar occupation. These subject pairs then formed 
the unit of counterbalancing in the rest of the design. 
For each of the two sequential replications, four
lists of 12 traits were selected from Anderson’s
(1968) trait-adjective list. In Replication 1, two of
the lists were selected from the positive half of the
scale and two from the negative half of the scale,
providing evaluatively homogeneous sets of traits. In
Replication 2, traits for all four lists were selected
from the middle three fifths of the scale, providing
heterogeneous descriptions containing both positive
and negative adjectives.
Counterbalancing of the traits across judgment
similarity and set size for the 12 subject pairs in
each replication was achieved by the use of
cyclical replications. Traits within each list were
randomly ordered from 1 to 12. The traits in each
list were then sorted into the three experimental
set sizes in 12 unique ways by simply moving the
first trait to the end of the list and grouping the re-
maining traits in order into sets of two, four, and
six. For example, (1, 2), (3, 4, 5, 6), and (7, 8, 9,
10, 11, 12) was the first group of sets; (2, 3), (4, 5,
6, 7), and (8, 9, 10, 11, 12, 1) was the second; and
so forth. Three person descriptions were thus
obtained from each of the four lists to provide the
12 trait sets needed for each subject. The 144 stimu-
lus persons generated in this manner were counter-
balanced so that each trait appeared equally often
for each level of set size and similarity. The counter-
balancing scheme assured that none of the traits
was presented to a subject more than, or less than,
once.
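
The cyclical sorting described above can be sketched as follows. This is a minimal illustration, assuming the traits of one list are simply numbered 1 to 12; the function name `cyclic_partitions` is hypothetical, and the sketch covers only the rotation-and-grouping step, not the assignment of sets to similar and dissimilar judgments.

```python
def cyclic_partitions(traits, set_sizes):
    """Rotate the trait list one position at a time and cut each rotation
    into consecutive sets of the requested sizes."""
    if sum(set_sizes) != len(traits):
        raise ValueError("set sizes must add up to the number of traits")

    groups = []
    for shift in range(len(traits)):
        # Move the first `shift` traits to the end of the list.
        rotated = traits[shift:] + traits[:shift]
        sets, start = [], 0
        for size in set_sizes:
            sets.append(tuple(rotated[start:start + size]))
            start += size
        groups.append(sets)
    return groups


if __name__ == "__main__":
    groups = cyclic_partitions(list(range(1, 13)), [2, 4, 6])
    print(groups[0])  # first group of sets
    print(groups[1])  # second group of sets
```

With 12 traits and set sizes of two, four, and six, the 12 rotations place every trait in the two-trait set twice, in the four-trait set four times, and in the six-trait set six times, so each trait is spread across set sizes in proportion to set size.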


Results and Discussion 

Replication 1. As in Experiment 1, a test
of the three models depended on a comparison
of subjects’ second, memory-based judgment
times. Natural log transformations of these
response times were performed prior to the 
data analyses in order to eliminate a strong 
correlation between the cell means and vari- 
ances (cf. Winer, 1971, p. 400). Analyses of 
subjects’ untransformed judgment times were 
also conducted and produced results similar
to the analyses of transformed scores. None
of the statistical tests for either Experiment
2 or Experiment 3 that are appropriate for
discriminating between the several models dif-
fered in significance as a function of whether
the raw or transformed scores were used in
the analysis. Subjects’ retransformed mean
decision-time scores for Replication 1 are
presented in Panel a of Figure 2. The scores
were analyzed using the multivariate analysis
of variance approach for within-subjects de-
signs as discussed by Poor (1973).¹
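
The transformation step can be illustrated with a short sketch. It assumes that a "retransformed" mean is the exponential of the mean of the natural-log response times (that is, a geometric mean); the article does not spell out the formula, so this reading, the function name, and the sample times are all assumptions.

```python
import math


def retransformed_mean(times_sec):
    """Average response times on the natural-log scale, then convert the
    mean back to seconds (assumed meaning of 'retransformed')."""
    if not times_sec or any(t <= 0 for t in times_sec):
        raise ValueError("response times must be positive")
    log_times = [math.log(t) for t in times_sec]
    return math.exp(sum(log_times) / len(log_times))


if __name__ == "__main__":
    # Hypothetical response times in seconds, including one slow response.
    times = [2.9, 3.4, 3.8, 4.1, 9.5]
    print("arithmetic mean:  ", sum(times) / len(times))
    print("retransformed mean:", retransformed_mean(times))
```

Because the log scale compresses long response times, the retransformed mean is pulled less by slow outliers than the arithmetic mean, which is consistent with the stated aim of removing the correlation between cell means and variances.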

The first replication involved homogeneous 
trait descriptions. The analysis of subjects’ 
transformed second judgment times yielded a
significant effect for similarity, F(1, 23) =
14.7, p < .001, replicating the principal find-
ing of Experiment 1. However, no significant
main effect or interaction for set size emerged
(both Fs < 1).

Replication 2. Subjects’ scores from Repli-
cation 2 (heterogeneous trait descriptions)
were analyzed in an identical manner to the
scores from Replication 1. The retransformed
means are displayed in Panel b of Figure 2.
The analysis again produced a significant
main effect for the similarity of the second
judgment, F(1, 23) = 30.8, p < .001, and a
nonsignificant main effect for set size (F <
1). The interaction term approached signifi-
cance, F(2, 22) = 2.88, p < .10. This border-
line interaction was due entirely to the qua-
dratic component of the set-size factor, F(1,
23) = [value illegible], p < .03. The interaction term
for the linear trend produced an F of less
than 1.


Although the results of both replications were
consistent with a judgment-retrieval
model, two anomalies emerged in the pattern
of results when heterogeneous, as opposed to
homogeneous, trait sets were employed. First,
in Replication 2 a quadratic interaction was
obtained, raising the possibility that a more
complicated version of the mixed model might
be required to explain the manner in which 
subjects processed heterogeneous information 
sets. Second, subjects took slightly longer to 
make a second decision when heterogeneous, 
as opposed to homogeneous, trait sets were 
used to describe the stimulus person (Ms = 
4.22 versus 3.99, respectively). A clear in- 
terpretation of the source of these differences 
was impossible, since the two replications were 
conducted during different months in the 
academic quarter, using different subject pop- 
ulations. A third experiment, to be reported 
below, was therefore conducted in which 
heterogeneity of the trait sets was manipu- 
lated as a separate within-subjects factor. 

First judgments. As discussed earlier, the 
test of the three models involved the assump- 
tion that the time required to review items 
in memory increases as set size increases. In 
Experiment 2, subjects’ first judgments were 
based on sets of differing sizes. However, the 
design of this (and the following) study pre- 
cluded using first-decision times to verify this 
assumption. As previously described, in both 
studies subjects were first shown the occupa- 
tion and then given a fixed amount of time 
to consider the traits, regardless of set size. 
Subjects were instructed to reach their judg- 
ment during the time the stimulus items were 
exposed and to report that decision when the 
traits were removed and the first decision re- 
shown. Because there was no reason for sub- 
jects to review from memory the traits when 
reporting their first judgment, decision time 
for first judgments was not recorded. 


Experiment 3 


Experiment 3 was similar to Experiment 2 
except for two changes. First, a wider range 


¹ This approach analyzes within-subjects factors
by defining a set of new variables that incorporates 
the elements of the within-subjects effects. An analy- 
sis of variance (sometimes univariate, sometimes 
multivariate) is then performed on the transformed 
variables. The principal advantage of the multi- 
variate approach is that it is based on a less re- 
strictive set of assumptions than the traditional 
within-subjects analysis. In the present report, bold-
faced F values are multivariate values using Wilks’s
Lambda criterion.

of set sizes was used in order to insure that
the findings of Experiment 2 were not the 
result of a restricted range. The set sizes used 
were one, three, five, and seven traits. The 
inclusion of a condition in which the stimulus 
person was described by a single trait also 
provided an additional test between the 
models. Because Experiment 3 was designed 
in such a way that no systematic difference 
existed in the relevance of the single trait 
descriptions to subjects’ second judgments, 
the information-retrieval model (in which 
subjects rely solely on remembered stimulus 
traits) would not predict any difference in 
decision times for similar and dissimilar judg- 
ments. On the other hand, the judgment-re- 
trieval model would predict a longer decision 
time for subjects’ dissimilar second judgments 
even with only one trait, since a subject’s 
first judgment would differ in relevance to 
his or her second in the two conditions (see 
Figure 1). 

A second change incorporated into Experi- 
ment 3 was that the heterogeneity of the per- 
son descriptions was included as a separate 
within-subjects factor. This was done because 
of the unexpected quadratic interaction with 
heterogeneous sets in Experiment 2. In Ex- 
periment 3, each subject judged an equal 
number of stimulus persons who were de- 
scribed by a homogeneous trait set (either all 
positive or all negative) and a heterogeneous 
trait set (including both positive and nega-
tive traits). 


Method 


Subjects. Subjects included 11 male and 21 female
undergraduates from Ohio State University who
participated in partial fulfillment of a course re-
quirement. These subjects were randomly assigned
to the 32 stimulus presentation orders used to coun-
terbalance the design.

Procedure. The experimental procedure and in-
structions to the subjects were similar to those used
in Experiments 1 and 2. The only difference was
that in this case, subjects were required to complete
four practice trials, which included stimulus persons
described by each of the four set sizes. For two of
these practice trials, subjects judged pairs of similar
occupations; for the other two, they judged dissimi-
lar pairs.

Design and stimulus materials. The basic design 
and stimulus materials of Experiment 3 were simi-
lar to Experiment 2. However, because four set
sizes were employed, the counterbalancing scheme
required that 32 subjects (16 pairs) be used, each of
whom made pairs of occupational judgments about
16 different stimulus persons.


8 of the 16 stimulus persons judged by each subject
were described by traits selected randomly from the


eight descriptions seen by each subject, half were
comprised of traits selected from the positive por-
tion of Anderson’s list and half from the negative
portion, thus providing evaluatively homogeneous
descriptions. As in Experiment 2, the counterbal-
ancing of traits across similarity and set size was
achieved by cyclical replications. In the case of Ex-
periment 3, each list consisted of 16 traits so that
set sizes of one, three, five, and seven could be gen-
erated (i.e., 1 + 3 + 5 + 7 = 16).


Results and Discussion 


Effects of trait heterogeneity. One purpose
of Experiment 3 was to determine if the dif-
ferences in subjects’ pattern of judgment times
for heterogeneous and homogeneous trait sets
observed in Experiment 2 were reliable. The
first question investigated was whether the
quadratic interaction between judgment simi-
larity and set size was replicated for the
heterogeneous trait sets. This was examined
in several ways. First, the design of Experi-
ment 3 permitted a statistical test of whether
the trait-set composition (homogeneous ver-
sus heterogeneous) affected the shape of the
Similarity × Set Size interaction. This test
(the three-way interaction between judgment
similarity, set size, and trait-set composition)
proved to be nonsignificant, F(3, 29) = [value illegible],
p < .20. An additional analysis was performed
on the heterogeneous data alone, and neither
the overall interaction between similarity and
set size, F(3, 29) = 1.78, p < .20, nor its
quadratic component, F(1, 31) = 1.0[digit illegible], p <
.40, was significant.

Finally, subjects’ mean judgment times for
heterogeneous trait sets were plotted and
visually examined. This examination indicated
that there was no tendency for the similar
and dissimilar judgment times to converge
between Set Sizes 5 and 7 as they had
converged between Set Sizes 4 and 6 in Ex-
periment 2. In sum, there was no evidence
that the quadratic interaction found in Ex-
periment 2 (Replication 2) was replicated in
Experiment 3. 

The second question examined was whether 
the tendency found in Experiment 2 for sub- 
jects to take longer with heterogeneous than 
with homogeneous trait sets was replicable. 
This was not the case. The main effect for 
trait-set heterogeneity on subjects’ second 
judgment times was nonsignificant (F < 1). 

Test of the three models. Because trait-set 
heterogeneity had no detectable effect on 
subjects’ second judgment times, this factor 
is not discussed further. In order to test the 
three models, the data were analyzed in a 
similar manner to the data of Experiment 2. 
Subjects’ retransformed response times, com- 
bined across trait-set replications, are pre- 
sented in Figure 3. As in both Experiments 1 
and 2, the judgment similarity main effect 
was significant, F(1, 31) = 31.9, p < .001.
Neither the set-size effects nor the Set Size ×
Judgment Similarity interaction reached sig-
nificance, F(3, 29) = 2.27, p < .15, and F
< 1, respectively. Thus, even with an in-
creased range of set sizes, the pattern of 
results obtained in Experiment 2 was repli- 
cated and best supported the judgment-re- 
trieval model. 

This model was further supported (and 
the information-retrieval model infirmed) by 
the fact that even when subjects made judg- 
ments on the basis of only one trait, 75% of 
them took longer on the average to make a 
second dissimilar judgment than a similar
one (sign test, p < .004). It therefore would
appear that even subjects in the single-trait
condition were recalling cognitions that dif-
fered in relevance to their second occupa-
tional judgments.
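
The sign test reported here can be reproduced approximately with an exact binomial calculation. The sketch assumes that 75% of the 32 subjects corresponds to 24 subjects, that there were no ties, and that the reported probability is one-tailed; none of these details is stated in the text, and the function name is hypothetical.

```python
from math import comb


def sign_test_upper_tail(n_positive, n_total):
    """Exact one-tailed sign test: probability of observing at least
    n_positive 'successes' out of n_total when each outcome is equally
    likely under the null hypothesis."""
    if not 0 <= n_positive <= n_total:
        raise ValueError("n_positive must lie between 0 and n_total")
    favourable = sum(comb(n_total, k) for k in range(n_positive, n_total + 1))
    return favourable / 2 ** n_total


if __name__ == "__main__":
    # Assumed counts: 24 of 32 subjects slower on the dissimilar judgment.
    print(sign_test_upper_tail(24, 32))  # about 0.0035
```

Under these assumptions the one-tailed probability falls just below the reported bound of .004.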

Combining Experiments 2 and 3. Although
no definitive statement can be made about
the acceptance of the null hypothesis regarding
the absence of a set-size main effect or inter-
action with similarity, it is possible to ag-
gregate the findings across experiments in a
manner that allows all the relevant available data
to be brought to bear on a particular hypothe-
sis. Winer (1971, p. 49) outlines a procedure
whereby a series of individual significance tests from a
series of experiments may be combined to
test a single hypothesis. If different samples

Figure 3. Mean second occupational judgment times
from Experiment 3 as a function of judgment-pair
similarity and set size.


are employed in each experiment, an overall 
probability statement can be computed by 
aggregating the natural logarithms of the in- 
dependent probability values and comparing 
this aggregation with the chi-square distribu- 
tion. In the present case, such a test can be 
made even more sensitive by considering just 
the linear component of the set-size effect, 
since it is the component that differentiates 
between the three models being considered. 
The first test performed on the combined 
experimental data was a test of the linear 
interaction between judgment similarity and 
set size. Such an interaction is predicted by 
both the information-retrieval and mixed- 
judgment models, but not by the judgment- 
retrieval model. The F values of this linear 
interaction for all three data sets (cf. Figures 
2 and 3) were less than 1, and the significance 
test based on the combined probabilities was 
also nonsignificant, χ²(6) = 3.46, p < .50.
The judgment-retrieval model predicts not 
only the absence of an interaction between 
judgment type and set size but also the ab- 
sence of a linear increase in decision time 
across set size. A test of this hypothesis was 
conducted by collapsing across judgment 
type and analyzing the linear trend of deci- 
sion times over set size. Whereas for both 
replications of Experiment 2 this analysis 
yielded Fs of less than 1, the corresponding 
analysis of Experiment 3 yielded a significant 
result, F(1, 31) = 6.25, p < .02. However, a
test of the hypothesis that subjects’ judgment
times increased with set size, based on the
combined probabilities of all three data sets,
was not significant, χ²(6) = 9.38, p < .20.
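
The combination procedure described in this paragraph corresponds to what is commonly called Fisher's method: the statistic is minus twice the sum of the natural logarithms of the k independent probabilities and is referred to a chi-square distribution with 2k degrees of freedom (hence 6 degrees of freedom for three data sets). The sketch below is a minimal standard-library illustration; the function name and the example probabilities are hypothetical, since the article reports only bounds on the individual p values.

```python
import math


def fisher_combined(p_values):
    """Combine independent p values with Fisher's method.

    Returns (chi_square, degrees_of_freedom, combined_p).
    """
    if not p_values or any(not 0 < p <= 1 for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    k = len(p_values)
    chi_square = -2.0 * sum(math.log(p) for p in p_values)
    df = 2 * k
    # For an even number of degrees of freedom the chi-square upper tail
    # has a closed form: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    half = chi_square / 2.0
    combined_p = math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(k)
    )
    return chi_square, df, combined_p


if __name__ == "__main__":
    # Hypothetical p values for one effect in three independent data sets.
    stat, df, p = fisher_combined([0.30, 0.45, 0.60])
    print(f"chi-square({df}) = {stat:.2f}, p = {p:.3f}")
```

The closed-form tail probability is used only to keep the sketch free of external dependencies; a statistics library's chi-square survival function would serve equally well.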

Effect of trait-set valence on decision time. 
The data analyses reported thus far support 
two conclusions. First, the pattern of subjects’ 
second judgment times across the combined 
experiments was most consistent with a judg- 
ment-retrieval model. Second, this finding was 
true when both homogeneous and heteroge- 
neous trait descriptions were considered. The 
design of Experiments 2 and 3 made it pos- 
sible to examine whether one additional vari- 
able, trait-description valence, might limit 
the generalizability of the experimental re- 
sults. This was possible because in both Ex- 
periments 2 and 3, to avoid a valence ex- 
pectancy set, half of the homogeneous trait 
descriptions were constructed of positive traits 
and half were constructed of negative traits. 

Two separate analyses indicated that the 
absence of an interaction between judgment 
similarity and set size held for both positive 
and negative trait sets. First, the three-way 
interaction between trait-set valence and the 
other two variables was not significant: The 
analysis of the linear component of the three- 
way interaction, based on the combined prob- 
abilities from Experiments 2 and 3, yielded 
a value of χ²(4) = 7.81, p < .10. Second, the
two-way interactions between judgment simi- 
larity and the set-size linear components 
were nonsignificant when subjects’ judgment 
times for positive and negative trait sets were
analyzed independently: The analyses of this 
interaction, based on the combined probabil- 
ities from the two experiments, yielded values 
of χ²(4) = 4.72, p < .50, for the positive trait
sets, and χ²(4) = 4.68, p < .50, for the nega-
tive trait sets. 

The strength of the similarity effect also 
was not significantly affected by trait valence. 
For both types of traits, subjects took longer 
to make second dissimilar, as opposed to simi- 
lar, judgments. This result was indicated by 
the absence of a significant interaction be- 
tween trait valence and judgment similarity; 
the analysis based on the appropriate com- 
bined probabilities yielded a value of χ²(4) =
2.19, p < .50. 

Finally, the question of whether trait val-
ence in any way modified the finding that de-
cision time is independent of set size was
tested by analyzing the linear interaction be-
tween trait-set valence and set size. In con-
trast to the other analyses, the linear interac-
tion between trait-set valence and set size
was significant for both Experiments 2 and 3,
F(1, 11) = 8.20, p < .02, and F(1, 15) =
7.56, p < .02, respectively,² as well as sig-
nificant for the combined analysis, χ²(4) =
16.80, p < .005.

To determine the nature of this interaction,
separate analyses of the linear trends of judg-
ment times across set size for the positive
and negative trait sets were conducted. These
analyses indicated that when subjects con-
sidered negative trait sets, the linear com-
ponent of their second judgment times did
not vary significantly across set size, F(1, 11)
= 1.14, p < .31, for Experiment 2; F(1, 15)
= 1.38, p < .26, for Experiment 3; χ²(4) =
5.06, p < .25, for the combined probabilities.
On the other hand, when subjects considered
trait sets that consisted solely of positive
traits, there was a significant linear increase
in their second judgment times as the number
of traits used to describe the stimulus person
increased, F(1, 11) = 4.99, p < .05, for Ex-
periment 2; F(1, 15) = 10.87, p < .005, for
Experiment 3; χ²(4) = 16.70, p < .005, for
the combined probabilities. This effect is
shown in Table 1, where subjects’ retrans-
formed mean decision times for positive and
negative trait sets from Experiments 2 and 3
are displayed. As can be seen, there was a
tendency for subjects’ judgment times to de-
crease as set size increased for negative trait
sets. This tendency approached in magnitude
the increase in judgment time over set size
for positive traits. The linear trend was sig-


² For design efficiency, both Experiments 2 and 3
contained a confounding between trait-set valence
and judgment similarity. However, this confounding
did not exist if pairs of subjects were considered as
the unit of analysis and the three factors (judgment
similarity, trait valence, and set size) were con-
sidered as within-(pairs of) subjects factors. It is
for this reason that the degrees of freedom in the
present analyses are halved. Such an analysis is pos-
sible using a multivariate approach, since
covariance homogeneity is not a required assump-
tion (see Poor, 1973, p. 206).

Table 1
Mean Second Occupational Judgment Times as a Function of
Trait-Set Valence and Set Size: Experiments 2 and 3

                    No. of descriptive traits
Valence

Experiment 2          2       4       6
  Positive          3.87    4.11    4.42
  Negative          4.28    3.64    3.17

Experiment 3          1       3       5       7
  Positive          4.28    4.39    5.32    5.09
  Negative          5.33    4.94    4.65    4.58

Note. Response times are subjects’ mean log scores transformed into seconds. For Experiment 2, each mean
is computed from 48 responses (24 subjects providing two responses each); for Experiment 3, each mean
represents 32 responses (32 subjects providing one response each).


nificant for positive but not for negative
traits because the mean square error of the
linear trend analysis for negative traits was
more than three times as large as the mean
square error for positive traits in both ex-
periments. This implies substantially greater
individual differences in the effect of set size
for negative information than for positive in-
formation.

The finding that subjects’ second judgment
times increased more with set size when they
considered homogeneously positive (compared
with negative) trait sets would not be pre-
dicted by any of the three processing models.
One possible explanation of this unexpected
finding is that subjects, in addition to recall-
ing their initial judgment, spent some time
searching their memory for stimulus informa-
tion that might disqualify a person from
being successful at the occupation. For nega-
tive stimulus sets, the ease (or speed)
with which a subject could recall a sufficient
number of relevant negative information items
to disqualify the person from being successful
in the occupation could be expected
to increase as set size increased. However,
when considering sets of homogeneously posi-
tive traits, as set size increased, subjects
would have to search an increasing number
of information items before proportionally
sampling the stimulus items for possible dis-
qualifying attributes.


If such a memory search strategy did ac- 
count for the obtained interaction between 
negative and positive information sets and 
set size, it would appear that it supplemented 
(rather than replaced) a judgment-retrieval 
process. The absence of an interaction be- 
tween trait valence and judgment similarity 
(along with the absence of a three-way inter- 
action between these two factors and set size) 
indicates that subjects consistently took longer 
to make dissimilar, as compared with simi- 
lar, second judgments when the stimulus per- 
sons were described by both positive and 
negative trait sets. 

Availability of stimulus information in 
memory. The primary concern of all three 
studies was to uncover the kinds of informa- 
tion people selectively review in memory prior 
to making a person judgment. As noted in the 
introduction, this concern is distinct from 
assessing the kinds of stimulus information 
that can potentially be remembered by a per- 
son. Although the data of Experiments 2 and 
3 suggest that people do not actively recall 
more stimulus items as a function of set size 
while making a judgment (at least not for 
homogeneously negative or heterogeneous 
stimulus sets), this should not be taken to 
mean that people are incapable of remember- 
ing more stimulus items when a person de-
scription contains more items of information.
In Experiment 3, after subjects had made
their last occupational judgment, they were
unexpectedly asked to list as many of the 
traits describing the final stimulus person as 
they could remember. Subjects were able to 
recall an increasing number of traits as the 
number of traits in the stimulus set increased. 
The mean number of accurately recalled traits, 
when subjects had seen descriptions contain- 
ing one, three, five, or seven traits was 1.0, 
2.6, 2.9, and 3.2, respectively, F(3, 12) = 
3.34, p < .06. Thus, subjects were capable of 
recalling more traits as set size increased, 
even though there was little evidence that 
they were doing so while making their judg- 
ments.³ Although the information-retrieval
model did not adequately account for the pat- 
tern of response times found in the present 
decision setting, this model may well be ap- 
propriate in other contexts. The recall data 
show that people can, if required, recall more 
information items as set size increases in this 
type of sequential judgment task. 


General Discussion 
Summary and Conclusions 


The research reported in this article was 
undertaken to investigate the selective re-
trieval process subjects employ when making
memory-based judgments. The first study
[...] made about a person influenced [...]
subjects made a [...]
judgment about the same individual. The fact
[...] similar second decision indi-
cated that subjects’ memory-based judg-
[...] rely on an unselective
information they had [...]
Experiments 2 and 3 was how impervious the
similarity effect was to manipulations of the
stimulus information. The magnitude of the
similarity effect was undiminished by trait
composition (heterogeneous versus homoge-
neous), trait-set valence (positive versus nega-
tive), and set size (one through seven items
of person information). Even when subjects
demonstrated 100% recall ability (as in Ex-
periment 3, where only one stimulus trait was
presented), 75% of them took longer to reach
a second dissimilar (as opposed to similar)
decision. The persistence and stability of the
similarity effect proved most consistent with
a judgment-retrieval model, which postulates
that subjects base their second decisions on
memory for their initial judgment and its
associated characteristics.
Although stimulus-information-retrieval pro-
cesses were not responsible for the judgment
similarity effect, it cannot be concluded that
such processes were totally absent in the pres-
ent judgment task. There was evidence that
subjects’ judgment processes included a se-
lective search for negative stimulus informa-
tion in addition to recall of their initial judg-
ment. When subjects were presented with a
homogeneously positive set of descriptive
traits, their mean judgment times increased
as set size increased. However, when only
negative traits were presented, there was a
nonsignificant tendency for decision time to
decrease as set size increased.
While subjects in the present decision task
appeared to rely on their first impression judg-
ments in combination with a memory search
for disqualifying attributes, there is little
doubt that their cognitive processing strategy
for making memory-based decisions would be
different in other settings. For example, had
there been a substantial period of time be-


³ The only other significant result from the analy-
sis of the memory data was a significant effect of
judgment type, F(1, 12) = 4.84, p < .05. Subjects in
the dissimilar judgment condition recalled on the
average more traits than subjects from the similar
condition (Ms = 2.73 versus 2.08, respectively). How-
ever, this effect must be regarded with caution since
no such difference was found in a comparable per-
son-impression sequential judgment task (cf. Lise ...
& Ostrom, in press).

tween the first impression judgment and the
memory-based judgment, it would be much 
less likely that subjects could have remem- 
bered any specific item of descriptive informa- 
tion. Under such circumstances, they might 
have resorted solely to a judgment-retrieval 
process. Alternatively, a person might depend 
much more heavily on an information-retrieval 
process if there were sufficient incentive. This 
might be the case, for example, if the sub- 
jects were rewarded for a correct decision, 
were required to justify their second judg- 
ment, or perceived that the experimenter’s 
hypothesis concerned memory for the traits. 


Methodological Note 


To understand the way in which people 
draw on memory when they make impression 
judgments, social psychologists will need to 
develop and adopt methodologies appropri- 
ate for investigating cognitive information 
processing. A number of investigators (cf. 
Anderson, 1977; Anderson & Hastie, 1974) 
have shown how decision time may be used 
to infer the structure of people’s cognitive 
representations of others. The present studies 
demonstrate the way in which decision time 
may also be used to make inferences about
how people draw on their memory in a person 
judgment task. 

Although decision time represents a power- 
ful, relatively unobtrusive measure of cogni- 
tive processing, there are problems associated 
with its use. Decision-time data are difficult 
to interpret without precise processing models 
that make clear alternative predictions. Such 
models invariably rest on multiple assump-
tions, whose number tends to increase as the
complexity of the decision being studied in-
creases. Consequently, decision time is much
more useful for eliminating processing
models than it is for providing definitive evi-
dence that a single model is a true repre-
sentation of the way in which people process
information. The present studies pro-
vide strong evidence that neither
the information-retrieval model nor the mixed
model, as stated in this article, accurately
reflects subjects’ judgment process in this
task. Other versions of these models, how-
ever, might account for the data better. For
example, it may have been that in the present
judgment context, subjects felt time-limited, 
as opposed to relevance-limited, in their mem- 
ory search. That is, the lack of relationship 
between set size and decision time may have 
resulted from subjects tending to sample cog- 
nitions from memory for a fixed period of 
time rather than sampling a proportional 
number of cognitions. Although the interac- 
tion between trait valence and set size sug- 
gests that subjects were not tightly time- 
bound in their processing strategy, the pos- 
sibility of this type of limitation cannot be 
ruled out. Consequently, the present support 
for judgment-retrieval processes must be re- 
garded as tentative. 

The existence of alternative explanations 
for the present findings reflects the fact that 
model building is an iterative process in which 
some models are eliminated while others are 
expanded and refined to conform with empiri- 
cal regularities that have emerged from previ- 
ous testing. We hope that the present studies 
represent only a first step in an evolutionary 
process that will result in the development 
of more complete models of the ways in which 
people selectively draw on memory when they 
make memory-based impression judgments. 
Personality Correlates of Susceptibility
to Biasing Information 
Subjects were asked to score two drawings, one allegedly made by a high-status
child and the other by a disadvantaged child. The difference between the scores
each subject attributed to the two drawings (compared to the objective
base level) was the subject’s bias score. High-bias subjects (with bias scores
one standard deviation or more above the overall mean bias) described themselves
as more reasonable than no-bias subjects (bias scores of zero), and as
less emotionally extreme in the directions of toughness, tenderness, and level
of enthusiasm. This pattern was interpreted as reflecting the need system
suggestive of the dogmatic personality. In a second study, no differences were
found between the extreme-bias groups in responses to self-report questionnaires
and the Embedded Figures Test, and in political views. However, high-bias
subjects responded more extremely than no-bias subjects to direct questions
pertaining to their political ideology, independent of its content and direction,
consistent with the conception of dogmatism as a style of thinking independent of
ideological content.


er bias effects, expectancy ef-
fulfilling prophecies of various
en repeatedly demonstrated in
Populations, settings, content
oe of experiments (Rosenthal,
In d
re attributed to the influence of
ation, which is presumably task


lations in susceptibility to bias-
lation (Rosenthal & Rubin, in


hes.


my friends Daniel J. Davis for his
ruce T. Oppenheimer for his con-
on and suggestions, and Chaim
patient assistance in data analysis.
eprints oad be sent to Elisha Y.
ication, Hebrew University of
Israel.


Variations in the influence of biasing infor- 
mation are probably caused by various fac- 
tors, including design and procedure differ- 
ences, nuances of delivery, and personality 
differences among subjects. This study focused 
on the latter factor, and an attempt was made 
to identify systematic personality correlates 
of susceptibility to biasing information. 

McFall and Schenkein (1970) divided 
their subjects into extreme groups on several 
variables (need for achievement, field de- 
pendence). Then they divided each extreme 
group into two experimental groups and ad- 
ministered Rosenthal’s (1966) photo-rating 
task to these groups, feeding one experimental 
group with “positive” expectancy and the 
other with “negative” expectancy. Laszlo and 
Rosenthal (1970) used a similar design, with 
dogmatism scores as the criterion for divid- 
ing the groups. 

In a different design, each subject receives 
both positive and negative expectancy infor- 
mation, and the resulting difference between 
the subject’s two performances serves as an 
index of susceptibility to bias. This design
seems to be more powerful than the McFall 
and Schenkein design, since each subject's 
influenceability can be directly assessed. On 
the other hand, the disadvantage of this de- 
sign is that the credibility of the cover story 
and of the biasing information (enough of a 
problem even when subjects receive only one 
type of biasing information) may be seriously 
hampered, limiting the range of experimental 
possibilities. Laszlo and Rosenthal (1970) 
used the first design and later created a mea- 
sure of individual influenceability by com- 
puting the difference between each subject’s 
mean photo rating and the mean photo rating 
obtained by the same experimenter from all 
his subjects contacted under the opposite con- 
dition of expectancy. The experimental situa- 
tion chosen for the present study—test scor- 
ing—makes it possible to provide each sub- 
ject with positive and negative expectancy in- 
formation without losing credibility. In this 
design, subjects are asked to score mental 
tests of two “examinees,” one allegedly of 
high status and the other of low status. 

This type of study measures suggestibility,
namely, the influence of the information on 
the performance of the receiver. Another pos- 
sible type of study measures communicability, 
that is, the extent to which the receiver of 
the biased information influences the behavior 
of other people (Finn, 1972). Bias in scoring 
(e.g., Babad, Mann, & Mar-Hayim, 1975) exemplifies
suggestibility effects, whereas teacher
expectancy effects (Rosenthal & Jacobson,
1968) and photo-rating bias (Rosenthal, 1966)
are examples of communicability effects. Sug- 
gestibility is a necessary, but not sufficient, 
component for communicability effects. It is 
interesting to note that McFall and Schenkein 
(1970) used tape recordings of experimenters 
in Rosenthal’s photo-rating task, thus turning 
a communicability study into a suggestibility 
study. 

In the present 
Harris Draw-A-Person 


the minimal experience it requires, and the 
short time it takes to 
tem. While the first study in this report was
exploratory, the second study was designed
to test hypotheses formulated on the basis of
the results of the first study.

Study 1 
Method 
Overview of the Study 


Introductory psychology students were taught the
scoring procedures of the Draw-A-Person Test and
were then asked to score two drawings. The drawings
were actually duplicated from the test manual,
but the students were provided with falsified
information alleging that one drawing was made by a
disadvantaged, low-class child and the other, by a
high-status, advantaged child. Later, the students
completed a self-report personality inventory.


Subjects 


Subjects were 133 students in introductory psychology
classes at the School of Education of the
Hebrew University of Jerusalem. All subjects were
freshmen.


Procedure 


The experiment was conducted in one (2-hour) session,
the purpose of which (the students were told)
was “to familiarize you with simple instruments of
psychological evaluation via direct, first-hand
experience.” The instructor presented the Draw-A-Person
Test and distributed a sheet summarizing its
73 scoring categories. The procedures for scoring
were then taught, and the subjects practiced (a) by
scoring a drawing on the board and (b) by scoring
their neighbors’ drawings. A practice sheet was
distributed, and the students were told that a
reliability study would be conducted. The sheet was
divided into two columns, each presenting a drawing
of a man and information about the child who
allegedly drew that picture. The information
consisted of name, address, school, number of [...],
and parents’ occupations and education. One child
was described as coming from a high-status
background (with a European name, Rubinstein), and
the other child was depicted as a disadvantaged child
of Moroccan origin (named Amsalem).

The drawings were actually duplicated from the
DAP manual, so that their objective manual scores
were initially known. To increase the credibility of
the cover story, the drawing selected for the
high-status child (Harris, 1963, p. 268, Figure [...]) had a
somewhat higher manual score than the drawing
attributed to the low-status child (Harris, 1963, p. [...],
Figure 72). Therefore, a difference of 3 points in
favor of the high-status child represented “zero bias.”

(To double check, we asked at a later time five
graduate students to score these drawings without
adding the expectancy information. Their scores
confirmed the 3-point difference between the two
drawings.)

The instructor explained to the students that these
drawings were photocopied from records collected in
a recent study conducted in various Jerusalem schools.
The instructor commented that “one can immediately
see the difference between the drawings of Rubinstein
and Amsalem” and proceeded to urge the students
to be objective in their scoring, reminding them
that the reliability of their scoring would be
examined.

After the completion of the scoring, the Adjective 
Check List (ACL; Gough & Heilbrun, 1965) was 
distributed. The ACL consists of 300 adjectives, and
subjects are asked to check those adjectives that
describe them as they are (i.e., “actual-self” image).
The ACL was chosen because of its nonpathological 
focus, the short time it takes to administer, and the 
availability of a Hebrew version. 


Results 


Overall Bias 


The mean bias score (i.e., the difference 
between the scores of the drawings attributed 
to Rubinstein and Amsalem) of the 133 stu- 
dents was 6.47, the standard deviation was 
5.9, and the bias scores ranged from -6 to 26
points. Considering a difference of 3 points as
“zero bias,” the t test yielded a highly significant
result (t = 6.77, p < .001). This finding
is another demonstration of bias in scoring
(see Babad et al., 1975); for the purpose of
an exploration of susceptibility to biasing
information as a personality dimension, the wide
range of individual differences in the bias
score is quite satisfactory.
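
The significance test reported above can be checked from the summary statistics alone. The following is a minimal sketch, assuming the test was a one-sample t test of the mean bias score against the 3-point "zero bias" baseline; the function name is illustrative and does not come from the original report.

```python
import math


def one_sample_t(mean, sd, n, baseline):
    """t statistic for testing a sample mean against a fixed baseline."""
    standard_error = sd / math.sqrt(n)
    return (mean - baseline) / standard_error


# Study 1 summary statistics as reported in the text.
t_study1 = one_sample_t(mean=6.47, sd=5.9, n=133, baseline=3)
print(round(t_study1, 2))
```

With the rounded summary values the statistic comes out near 6.8, in line with the reported t = 6.77; the small discrepancy presumably reflects rounding of the mean and standard deviation.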


Correlational Analysis and Construction of
Personality Scales


The analysis of the ACL responses presented
a problem. Gough and Heilbrun (1965)
offer two dozen scales for scoring
the ACL, but these scales had never been
standardized or empirically validated on
Israeli subjects. Since numerous adjectives
change considerably in the process of
translation, it was inappropriate to use the
American scales on the Israeli subjects. Some
adjectives have a very special meaning or
double meaning in American English (e.g.,
“queer”), other adjectives cannot be ade- 
quately translated for lack of an appropriate 
word in Hebrew (e.g., “zany”), and cultural 
nuances are not readily reproducible in He- 
brew. 

Being unable to use Gough and Heilbrun’s 
scales (and lacking an alternative nonpatho- 
logically oriented test in Hebrew), we were 
compelled to construct new scales from the 
available data. Factor analysis and other mul- 
tivariate methods were ruled out because of 
the small number of subjects (133) relative 
to the number of variables (300). The most 
viable alternative was to use the subjects’ 
bias scores as reference and to construct ad 
hoc scales according to the correlations of the 
ACL adjectives with that score. By chance 
alone, 15 of the 300 correlations between the 
ACL adjectives and the bias score should 
have reached the .05 level of significance, and 
30 correlations should have reached the .10 
level of significance. The results showed 34 
correlations significant at the .05 level and 57 
at the .10 level, both of which exceed chance 
level. 
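
The chance baseline in this paragraph is simple arithmetic: 300 tests times the significance level. The sketch below reproduces those expected counts and, as a rough guide only, adds a binomial upper-tail probability for the observed counts. The binomial step assumes the 300 adjective correlations are independent, which is an assumption of this illustration and not something the report claims; the function names are likewise illustrative.

```python
from math import comb


def expected_by_chance(n_tests, alpha):
    """Number of tests expected to reach significance if every null is true."""
    return n_tests * alpha


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); assumes the n tests are independent."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_adjectives = 300
for alpha, observed in [(0.05, 34), (0.10, 57)]:
    expected = expected_by_chance(n_adjectives, alpha)
    tail = binomial_upper_tail(observed, n_adjectives, alpha)
    print(f"alpha={alpha}: expected {expected:.0f}, observed {observed}, "
          f"P(X >= observed) = {tail:.2e}")
```

Because adjectives on a checklist are correlated with one another, the tail probabilities should be read as optimistic; the comparison of observed to expected counts is the part that follows directly from the text.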

Of the 34 correlations reaching the .05 level 
of significance, 28 (or 82%) were negative, 
indicating that the high-bias subjects tended 
to check these adjectives less frequently than 
the low-bias subjects. To further examine this 
finding, we computed the correlations between 
the bias score and (a) the total number of 
adjectives checked, (b) the number of favor- 
able adjectives checked, and (c) the number 
of unfavorable adjectives checked. (This 
analysis seemed more legitimate than the use 
of Gough and Heilbrun’s (1965) specific 
scales, since it dealt with overall numbers of 
adjectives.) Only the last correlation was 
statistically significant (r = -.17, p < .05),
indicating that the more biased subjects 
tended to check unfavorable adjectives less 
frequently in describing themselves. Gough 
and Heilbrun (1965) state that the low-scor- 
ing person on the number of unfavorable ad- 
jectives “is more placid, more obliging, more 
mannerly, more tactful, and probably less 
intelligent” (p. 6). 

ELISHA Y. BABAD

Table 1
Mean Scores of High-Bias and No-Bias Groups on the Four Ad Hoc Scales (Study 1)

                                          Scale
                     Reasonableness  Over-toughness  Over-tenderness  Over-enthusiasm
Group                (13 items)      (38 items)      (31 items)       (20 items)
High bias (n = 25)   6.28            4.68            5.12             3.72
No bias (n = 34)     4.97            7.29            6.74             5.59
t test               1.87*           2.25*           1.80*            2.18*

*p < .05.

Four distinct clusters emerged in the
scrutiny of the adjectives significantly related
to the bias score. About two thirds of the
“significant” adjectives fitted in these clusters,
whereas no meaningful pattern could be
detected connecting the remaining adjectives.
On the basis of these clusters, four ad hoc 
scales were defined: (a) Reasonableness. In 
this cluster, subjects described themselves as 
very reasonable, adhering to values of sobriety 
and realism. Typical adjectives are rational,
reasonable, civilized, mature, relaxed, and 
discreet. (The adjectives in this cluster were 
positively related to bias.) (b) Over-tough- 
ness. This factor emphasizes extremely force- 
ful and egocentric interaction with, and con- 
trol of, other people. Typical adjectives are 
aggressive, autocratic, cruel, rude, dominant, 
hardheaded, and hostile. (The adjectives in 
this cluster—as well as in the next two clusters 
—were negatively related to bias.) (c) Over- 
tenderness. This factor also consists of a rather 
extreme description, but the emphasis is on 
lack of forcefulness, dependence on other peo- 
ple, and an overly “soft” position. Typical 
adjectives are softhearted, timid, defensive, 
dependent, and sensitive. (d) Over-enthusiasm.
This factor relates to a behavioral style
of a particular nature, characterized by ex- 
cess energy and activity. Typical adjectives 
are enthusiastic, blustery, praising, energetic, 
spunky, and uninhibited.

The definitions of these four factors and 
the 300 ACL adjectives were submitted to 
three experienced psychologists, who were 
asked to mark all adjectives fitting these 
definitions. The three psychologists were un- 
aware of the correlations between the adjec- 
tives and the bias score. All adjectives ap- 
pearing in the lists of all three judges were 


included in the new ad hoc scales. The four
new scales consisted altogether of 102 (out
of 300) ACL adjectives, the majority of
which (66) had not been significantly related
to the bias score; only 23 were significant
at the .05 level.


Extreme Group Comparison 


To examine the relationships between the
ad hoc scales and the bias score, an extreme-group
design was chosen, comparing high-bias
to no-bias subjects. All subjects whose bias
scores were one standard deviation or more
above the overall mean of the bias scores were
included in the high-bias group (n = 25). All
subjects whose bias scores were at the “zero
bias” plus or minus one (i.e., granting Rubinstein
between 2 and 4 points more than Amsalem)
were included in the no-bias group
(n = 34). Scores on the four ad hoc scales were
computed for the subjects in these
groups, and t tests between the scores of the
high-bias and no-bias groups were calculated.
All t tests yielded statistically significant
(p < .05) differences; the pattern of results
is presented in Table 1. The high-bias
subjects described themselves as more reasonable
than the no-bias subjects, while the latter
group was higher on all three “over” scales:
over-toughness, over-tenderness, and over-enthusiasm.
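
The group definitions above can be sketched in a few lines. This is an illustration under stated assumptions: the scores are hypothetical, the function name is invented for the example, and the sample standard deviation (n - 1 denominator) is assumed because the report does not say which form was used.

```python
import statistics


def split_extreme_groups(bias_scores, zero_bias, tolerance=1):
    """Return (high_bias, no_bias) lists of subject indices.

    high_bias: score at least one standard deviation above the overall mean.
    no_bias:   score within `tolerance` points of the zero-bias baseline.
    The sample standard deviation (n - 1 denominator) is an assumption.
    """
    mean = statistics.mean(bias_scores)
    sd = statistics.stdev(bias_scores)
    high_bias = [i for i, s in enumerate(bias_scores) if s >= mean + sd]
    no_bias = [i for i, s in enumerate(bias_scores)
               if abs(s - zero_bias) <= tolerance]
    return high_bias, no_bias


# Hypothetical bias scores (Rubinstein score minus Amsalem score).
scores = [3, 3, 2, 4, 10, 15, -2, 6, 7, 20]
high, no_bias = split_extreme_groups(scores, zero_bias=3)
print(high, no_bias)
```

Subjects falling between the two criteria belong to neither group, which is what makes this an extreme-group comparison and not a median split.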


Discussion 


This pattern is extremely interesting. The
subjects who showed more susceptibility to the
biasing information, and whose scoring of the
drawings was less objective, chose to describe
themselves as more reasonable, rational, and
objective and as less emotionally extreme. On
the other hand, the subjects whose scoring
was more objective and unaffected by the
biasing information described themselves as
less reasonable and as more extreme in several
directions. To interpret these results,
it seems quite clearly indicated that the
self-descriptions should not be viewed as
representing real, “actual-self” attributes but as
reflections of a need system and an “ideal-self”
image. Thus, the high-bias person needs
to see himself or herself as more reasonable
and to deny the existence of any emotional
extremes in his or her personality. The no-bias
subjects are not necessarily less reasonable,
tougher, or more tender than the high-bias
subjects, but they can more readily admit the
existence of these extremes in their personality
and do not need to hold on so strongly to [...].

The image of the high-bias person seems
[...] from the literature on authoritarianism
and dogmatism: rigid adherence to and
emphasis of conventional values (being
reasonable); rejection of emotional
extremes and of self-awareness that may
[...] one’s adjustment; and narrow range
of consciousness. Extreme strength or weakness
must be rejected as “not me” if one’s
attempt to be over-reasonable is to succeed.
In conclusion, the only personality correlate
of susceptibility to biasing information
inferred from the results of Study 1 is dogmatism.
This is in line with the findings reported
by McFall and Schenkein (1970) and
Laszlo and Rosenthal (1970). Moreover,
the absence of findings on other personality
correlates of susceptibility to bias in this study
is also consonant with other reports (Laszlo
& Rosenthal, 1970; Rosenthal, 1966).

This conclusion in the present context
is [...] and highly tentative. Consequently,
Study 2 was designed to focus directly
on various manifestations of dogmatism
and their relations to susceptibility to biasing
information. The instruments used in
Study 2 to assess dogmatism included specific
self-report questionnaires, a performance task
of field dependence, and questions concern- 
ing political ideology. 


Study 2 
Method 
Overview of the Study 


Bias in scoring was measured in Study 2 as in
Study 1. Immediately thereafter, the Embedded
Figures Test (Oltman, Raskin, Witkin, & Karp, 1971)
was administered to measure field dependence,
followed by a self-report questionnaire with scales
measuring dogmatism, locus of control, defensiveness,
and impulsiveness. A week later, the subjects
responded to several direct questions concerning their
political views.


Subjects 


A new sample was selected for Study 2, consist- 
ing of 179 introductory psychology students at the 
School of Education of the Hebrew University of 
Jerusalem. All subjects were freshmen, and the study 
was conducted at the beginning of the school year. 


Procedure 


The DAP experiment was conducted in replication 
of Study 1. The only exception was that the drawing 
attributed to the low-status child (Amsalem) was 
replaced by another drawing duplicated from the 
DAP manual. The new drawing (Harris, 1963, Fig- 
ure 57, p. 264) had a higher manual score than the 
drawing attributed to the high-status child, but in 
superficial examination it seemed to be of poorer 
quality than the other drawing. This change was due 
to the investigator’s eagerness to maximize the bias 
effect and increase the spread of the bias scores. In 
retrospect, it seems that a simple counterbalancing 
of stimuli would have been much more appropriate, 
and this change is regrettable. Contrary to the man- 
ual scores, the five judges who scored the two draw- 
ings without the expectancy information and without 
access to the scoring of these drawings in the DAP 
manual awarded the new drawing 5 points less than 
Rubinstein’s drawing. Thus, the investigator’s over- 
eagerness resulted in a “boomerang effect.” The man- 
ual scores were therefore discarded, and a difference 
of 5 points in favor of the high-status child was 
defined as “zero bias.” 

Immediately following the scoring of the two draw- 
ings, a short (16-item) version of the Embedded 
Figures Test (Oltman et al., 1971) was administered, 
with a time limit of 4 min. The students then filled 
out a self-report questionnaire containing the short 
version of Rokeach’s Dogmatism Scale (Schulze, 
1973); a locus of control questionnaire (Rotter, 
1966); a defensiveness questionnaire (from Babad,
1974); the Impulsiveness Scale (Barratt, 1965); and
several filler items. 

In the next class session, the students were asked 
to provide ratings (on 5-point scales) to the following
questions: (a) What is your political ideology?—
from 1 (left) to 5 (right); (b) What is the intensity
of your political views?—from 1 (very low) to 5
(very high); and (c) What is your position regarding
“greater Israel”? (i.e., a dovish or a hawkish
position concerning the possibility of returning
territories on the West Bank of the Jordan River to the
Arabs)—from 1 (against—dovish stand) to 5 (for—
hawkish stand).


Results and Discussion 
Overall Bias 


The mean bias score of the 179 subjects was 
6.20, the standard deviation was 5.33, and 
the bias scores ranged from -5 to 23 points.
Considering a difference of 5 points as “zero
bias” (in accordance with the stricter scoring
of the five judges and disregarding the manual
scores), the t test yielded a significant result
(t = 3.00, p < .05). This finding is yet an- 
other demonstration of bias in scoring. In ad- 
dition, it constitutes a replication of the DAP 
method of affecting expectancy bias. 


Extreme Group Comparison 


As in Study 1, the analysis of personality 
correlates was carried out using an extreme- 
group design, and the same principles were 
employed to define the high-bias and no-bias 
groups. All subjects whose bias scores were 
one standard deviation or more above the 
overall mean of the bias scores were included 
in the high-bias group (n= 20), while all 
subjects whose bias score was 5 (“zero bias”) 
constituted the no-bias group (n = 18).

The self-report questionnaires were analyzed
in several ways. The responses of all 179 subjects
to each of the scales (dogmatism, locus
of control, defensiveness, and impulsiveness)
were factor analyzed, and new factor scores
were computed. None of the t tests (of total
scales and of new factor scores) yielded a
significant difference between the high-bias
and no-bias groups. Thus, no correlates of
susceptibility to biasing information were
identified in these self-report questionnaires.

No significant difference was found between
the scores of the high-bias and no-bias groups
on the Embedded Figures Test. This finding
is contrary to McFall and Schenkein’s (1970)
report, where a significant expectancy effect
on Rosenthal’s (1966) photo-rating task was
found for groups of subjects defined as field
dependent according to their Embedded Figures
Test performance.

Considering the three direct political questions
(political ideology, intensity of political
views, and position about returning territories
to the Arabs), no significant differences were
found between the high-bias and no-bias
groups, although a slight (insignificant) tendency
of the high-bias subjects towards the
right-wing and hawkish positions was detected.
Following Rokeach’s (1960) argument
concerning “the authoritarian of the left,”
the subjects’ responses to these questions
were rescored according to their level of
extremity, disregarding the direction of the
content. Thus, ratings of 1 and 5 were scored
as extreme (1), and 2 and 4 were scored as
moderate (2), and a rating of 3 was scored
as neutral (3). The t tests of the extremity
scores yielded a statistically significant (p <
.05) difference, the high-bias subjects consistently
taking a more extreme position in
reacting to all three questions. Thus, while
the groups did not differ in the content and
direction of their political views, the high-bias
subjects differed from the no-bias subjects in
that they consistently responded more
extremely to these questions.
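
The rescoring rule described above folds each 5-point rating around its midpoint. A minimal sketch follows; the function name is illustrative, and the mapping is taken directly from the text (1 and 5 become 1, 2 and 4 become 2, 3 stays 3).

```python
def extremity_score(rating):
    """Fold a 1-5 rating around the midpoint: 1 = extreme, 2 = moderate, 3 = neutral."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be an integer from 1 to 5")
    return 3 - abs(rating - 3)


ratings = [1, 2, 3, 4, 5]
print([extremity_score(r) for r in ratings])
```

Note that under this coding a lower score means a more extreme response, so a group that responds more extremely will show a lower mean extremity score.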

In light of this finding, the responses to the
self-report questionnaires were rescored
according to their level of extremity, and the
data were reanalyzed. No additional differences
were found. Thus, the difference between
the bias groups in the extremity of responding
was limited to the direct political questions.



General Discussion 


The findings of the two studies can be
summarized as follows: (a) Bias in scoring
was demonstrated and replicated. (b) High-bias
subjects described themselves as more
reasonable and rational than no-bias subjects
and as less emotionally extreme in the directions
of toughness, tenderness, and level of
enthusiasm. (c) High-bias subjects responded
more extremely than no-bias subjects to questions
pertaining to their political views. On
the other hand, no differences were found between
the high-bias and no-bias groups in
(a) responses to self-report scales (and subscales)
measuring dogmatism, locus of control,
defensiveness, and impulsiveness; (b)
performance on the Embedded Figures Test;
and (c) the content and direction of the subjects’
political ideology. These findings are
interpreted as being consistent with theoretical
treatments of dogmatism as a style of
thinking independent of ideology, even though
they did not correlate significantly with the
Dogmatism Scale.

1 This study was conducted prior to President
Sadat’s peace initiative in November 1977.

The Embedded Figures Test and the rod- 
and-frame test had been constructed as per- 
formance measures of field dependence. Their 
relationship with the authoritarianism-dog- 
matism personality syndrome (as measured 
by self-report questionnaires) has been debated
for over 20 years, with no clear resolution
(see, for example, Ohnmacht, 1968;
Rudin & Stagner, 1958; Witkin & Moore,
Note 1). As to the relationship between performance
on the Embedded Figures Test and
susceptibility to biasing information, here
too the reports are in conflict. McFall and
Schenkein (1970) found a significant expectancy
effect for field-dependent subjects
but not for field-independent subjects, whereas
in the present investigation high-bias subjects
did not differ from no-bias subjects in their
Embedded Figures Test performance. Moreover,
the correlation between the Embedded
Figures Test and the Dogmatism Scale on
the subjects in Study 2 did not reach
statistical significance.

Another issue that has received lively debate
in recent years is whether authoritarianism
is related to conservative and right-wing
political ideology or whether this personality
syndrome is independent of the content of
the political ideology. Rokeach (1960) wrote
quite convincingly about the “authoritarian
of the left” (although his most famous piece
of evidence was based on a sample of only 13
British communists), and the construction of
the Dogmatism (D) Scale as a substitute for 
the original authoritarianism scale, the F
scale (Adorno, Frenkel-Brunswik, Levinson,
& Sanford, 1950) was motivated (in part) by 
this conviction. More recently, the superiority 
of the D scale over the F scale with regard to 
left-wing ideology was put in serious doubt 
by Thompson and Michel (1972), who found 
that both F and D scales were significantly 
related to political conservatism and Christian 
traditionalism. There is a need for research 
directed at identifying measures that more
adequately reflect cognitive styles theoretically
related to authoritarianism and dogmatism.

The findings of Study 2 have some bearing 
on this issue. The tendency of the high-bias 
subjects toward the right-wing ideology was 
slight and insignificant, whereas their excess 
extremity in political view independent of its 
contentual direction was clear and statisti- 
cally significant. Thus, if bias in scoring can 
be considered as one facet of dogmatism, 
Rokeach’s argument that the trait of dog- 
matism is largely independent of the specific 
content of the political ideology is supported 
by these findings. 

Conclusions 

Bias in scoring was demonstrated and rep- 
licated. 

It is indicated by the data that susceptibil- 
ity to biasing information is related to a dog- 
matic style of thinking, but is independent 
of the content and direction of ideology. 

The utilization of self-report personality 
questionnaires and scales with college stu- 
dents is highly questionable and problematic: 
There are grounds to suspect that ideal self- 
image responses are provided when actual 
self-image descriptions are asked for and that 
specific questionnaires and scales loaded with 
social desirability may be too transparent for 
college students, resulting in falsified re- 
sponses. 


Reference Note 


1. Witkin, H. A., & Moore, C. A. Cognitive style and 
the teaching-learning process. Paper presented at 
the annual convention of the American Educa- 
tional Research Association, Chicago, April 1974. 
In two experiments, subjects were required to observe and predict the behavior
of a hypothetical “chooser” who made choices for him- or herself and for a
hypothetical other in a series of decomposed games. The preference for outcomes,
or social motivational orientation, of the chooser was preprogrammed
and varied across conditions. Results of the experiments confirmed that subjects
were more readily able to detect the outcome preferences of choosers
who made choices according to individualistic or competitive choice rules than
of choosers who behaved in a prosocial or negatively self-interested manner.
Furthermore, the prediction data obtained from Experiment 2 revealed that
subjects tended to perceive choosers’ own gain as an important component of
most of the choosers’ secondary motivation. Evidence from subjects’ ratings of
the choosers’ personality attributes and estimates of the relative weights the
choosers attached to their own and the other's gain (Experiment 2) indicated
that subjects formed distinctive impressions of the choosers despite differences
in predictive accuracy across conditions. A third experiment, performed to
investigate the relationship between predictive accuracy and the mathematical
complexity of the choosers’ various choice rules, found no evidence that
mathematical complexity influenced subjects’ performance on the prediction task.


The study of social exchange is concerned with situations where the
outcomes of two or more individuals are dependent upon their mutual
behaviors. Recent literature dealing with social exchange has focused on
determining the values that individuals seek to maximize in outcome
interdependence situations. McClintock (1972b), for example, has proposed
that the value of outcomes to an individual may be defined by that person
not only in terms of what he or she receives for self but also in terms
of what others receive. Experiments by Messick and McClintock (1968),
McClintock, Messick, Kuhlman, and Campos (1973), and Kuhlman and
Marshello (1975) confirm that people show individual differences in
their tendencies to maximize their own outcomes, joint outcomes, and the
relative difference between their own and others’ outcomes.

Judith x Conde Council Doctoral Fellowship to

These preferences for outcomes, or social motives, have been formally
characterized in a geometric model by Griesinger and Livingston (1973).
The Griesinger and Livingston representation takes the form of a
two-dimensional space where outcomes or payoffs to self are ordered along
the horizontal axis, and outcomes to the other are ordered along the
vertical axis. An actor’s preference for certain distributions of points
in this space is represented as a vector of infinite length extending
from the origin. Thus, an individualistic motive, or preference to
maximize one’s own gain, can be represented as a vector extending in the
positive direction along the horizontal axis. The locations of the
vectors representing cooperation (maximization of joint gain) and
competition (maximization of relative gain) are 45° removed from the
horizontal in the counterclockwise and clockwise directions, respectively
(see Figure 1).

Figure 1. The Griesinger and Livingston vector representation of social
motives. (Numbers in parentheses are examples of distributions of points
to self and other.)

Preference orderings for points in this outcome space are defined in
terms of the magnitudes of their perpendicular projections on the actor’s
motivational vector. The greater the projection of a point, the more
preferred is the associated outcome. For example, an individualistic
actor might be presented with a choice between a point that affords self
5 units and the other 4 units and a point that affords self 3 units and
the other 1 unit. As Figure 1 indicates, the point 5,4 is preferred to
the point 3,1 because of its greater projection on the actor’s
motivational vector: Receiving 5 for self is better than receiving 3.
However, if the actor is competitively oriented, 3,1 is preferred to 5,4
because it affords the actor greater relative advantage over the other
(2 units as opposed to 1). The point 5,4 is preferred by a cooperative
actor because it leads to the higher joint gain.
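
The projection rule just described can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions and is not part of the original procedure: the names `MOTIVES`, `projection`, and `preferred`, and the convention of expressing each motive as an angle in degrees measured counterclockwise from the own-outcome (horizontal) axis, are choices made here for the example.

```python
import math

# Angles in degrees, counterclockwise from the own-outcome axis.
# This encoding of the three motives is an assumption of the sketch.
MOTIVES = {
    "individualism": 0.0,   # maximize own gain
    "cooperation": 45.0,    # maximize joint gain
    "competition": -45.0,   # maximize relative gain
}


def projection(own, other, angle_deg):
    """Projection of the point (own, other) on a unit vector at angle_deg."""
    theta = math.radians(angle_deg)
    return own * math.cos(theta) + other * math.sin(theta)


def preferred(points, angle_deg):
    """Return the point with the largest projection on the motive vector."""
    return max(points, key=lambda p: projection(p[0], p[1], angle_deg))


# The two outcome points used in the text's example: (own, other).
points = [(5, 4), (3, 1)]
choices = {name: preferred(points, angle) for name, angle in MOTIVES.items()}
```

Under these assumptions, `choices` maps individualism and cooperation to the point 5,4 and competition to the point 3,1, in agreement with the verbal example.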

The previously cited work of McClintock, Messick, and their colleagues
has led to the refinement of a method for assessing the social
motivational orientations of experimental subjects utilizing a form of
payoff representation known as the decomposed game. The decomposed game
procedure involves simultaneously presenting both members of a pair of
subjects with a series of two- or multichoice decision situations. Each
of the choice alternatives in these situations affords a certain payoff
to the subject and a certain payoff to the other dyad member. Thus, each
choice alternative in a decomposed game represents a point in self/other
outcome space. Subjects are required to select their preferred
alternative on each trial in ignorance of each other’s behavior.
Classification of subjects’ motivational orientations is performed by
comparing their patterns of choices in the decomposed games with those
which would have been produced by actors displaying perfect consistency
with the individualistic, cooperative, and competitive vectors.

This technique has two major strengths as a motive assessment procedure.
First, while maintaining the essentials of outcome interdependence
between subjects, the decomposed game minimizes the possibility that
strategic considerations will influence subjects’ selections. Although
each subject’s choice affects both own and other’s outcomes, neither
subject can utilize his or her choices in an instrumental way to
influence the other’s choices. Second, the procedure allows the
investigator to confront the subject with combinations of choices
reflecting outcome structures that are dominant with respect to the
motives under consideration.1


1 The dominance structure of a game is determined by examining which
motives or values are [illegible] that an actor [illegible] other 10
units, [illegible] units and the other 30 units. Choice A will maximize
both the actor’s own gain and his or her relative gain. Choice B will
maximize joint gain.
The “Naive Psychology” of Social Motives


Although researchers have examined individual differences in motives and
how they relate to subjects’ observed laboratory game play (e.g.,
Kuhlman & Marshello, 1975), little empirical attention has been paid to
the question of whether and how subjects, as “naive psychologists”
(Heider, 1958), perceive the outcome preferences of others. Whenever
outcome interdependence and the possibility of social influence exist,
an actor intent on maximizing preferred outcomes must simultaneously
take into consideration both own and other’s goals, as well as the
relative power of self and other (Thibaut & Kelley, 1959). Thus, in most
settings of outcome interdependence, accurate information about the
other’s preferred outcomes should be instrumental in guiding the actor’s
attempts both to approximate directly preferred outcomes and to
influence strategically the other’s behavior.

Perception of motives in the Prisoner’s Dilemma Game (PDG). One of the
few investigations relevant to the topic of the naive perception of
social motives was performed by Kelley and Stahelski (1970). They
assessed the motivational orientations of subjects as cooperative or
competitive by asking them about the goals they wished to maximize in
the PDG. Subjects were then paired with other subjects who had expressed
cooperative or competitive intent. Periodically, subjects’ impressions
of their partners were obtained during the course of PDG play. The
results indicated that a competitive orientation on the part of the
other was more readily detected than a cooperative orientation. Further,
although cooperative subjects were able to detect their partner’s goals
regardless of whether these goals were cooperative or competitive,
competitive subjects often misjudged cooperative partners’ goals to be
competitive. Finally, when asked about the propensity of others to
select the two PDG choices, the competitors indicated that the vast
majority of people would be similar to themselves and opt for the
competitive goal, whereas cooperators felt that some people would be
like themselves and others would choose the competitive alternative.

An example of a decomposed game with a different dominance structure is
a situation in which the A choice affords 50 units for self and 20 for
the other and the B choice affords 40 units for self and [illegible]
units for the other. Choice A maximizes (is the dominant choice with
respect to) one’s own and joint gain, whereas B maximizes relative gain.
In the first example, the decomposed game has a similar motivational
structure to the Prisoner’s Dilemma Game (PDG); the latter example most
closely resembles the Maximizing Difference Game (MDG; McClintock &
McNeel, 1966).

Expectations concerning the goals of others 
in the decomposed game. The findings of a 
recent experiment conducted by Kuhlman and 
Wimberley (1976), however, suggest that 
the Kelley and Stahelski study may indicate 
more about the motivational structure of the 
PDG than about the beliefs that cooperators 
and competitors have concerning the relative 
predominance of these motivational orientations
in the population. Kuhlman and Wimberley
classified subjects as cooperative, competitive,
or individualistic on the basis of
their choices in a series of three-choice decom- 
posed games. Since the desire to maximize 
relative gain and the desire to maximize one’s 
own gain represent the two primary motiva- 
tional reasons for “competition” in the PDG 
(McClintock, 1972a), the Kuhlman and 
Wimberley competitors and individualists cor- 
respond to a partition of Kelley and Stahel- 
ski’s PDG competitors. 

After making a series of decomposed game 
choices, subjects in the Kuhlman and Wim- 
berley experiment were asked to estimate 
what proportion of other persons would make 
each of the three choices in four three-choice 
decomposed games having various dominance 
structures in regard to joint, relative, and 
own gain maximization. Also, subjects esti- 
mated the proportions of others who would 
make the cooperative and competitive choices 
in the PDG. Subjects then played 30 trials 
of a PDG against a uniformly cooperative or 
competitive other. 

The results obtained on choice behavior in 
the PDG and subjects’ estimates of others’ 
choices in the PDG paralleled those of Kelley 
and Stahelski (1970). However, estimates of 
population choice proportions in the decomposed
games revealed different expectations
about the preferences of others. Results from 
a game in which joint, own, and relative gain 
were each maximized by separate choices 
were particularly interesting. In this situa- 
tion, cooperative subjects thought that the 
joint-gain maximizing choice would be the 
most popular, followed by the own-gain choice
and then the relative-gain choice. Competi- 
tive subjects’ ordering of others’ preferences 
was just the reverse of that of the coopera- 
tive subjects: relative gain followed by own 
and then joint gain. Individualistic subjects 
felt that the own-gain maximizing choice 
would be preferred by most and that the 
relative-gain maximizing choice would be 
slightly more popular than the joint-gain 
choice. 

When considered together with the esti- 
mates for others’ choices in the remaining 
decomposed games, these data suggest that 
each person may view her or his own motive 
as the modal orientation in the population. 
Recent work by Ross (1977) provides addi- 
tional support for this interpretation. He ob- 
serves that individuals who lacked good base- 
line data about typical behavior in a situa- 
tion resorted to personal criteria to make 
predictions about others. In other words, one’s 
own behavioral preferences are assumed to be 
more normative than nonpreferred tendencies. 

Detecting the motives of others in the de- 
composed game. The previous research con- 
cerns how an actor’s own motivational dis- 
positions influence his or her attributions of 
motives to others. None of this research has 
asked the prior, perhaps more fundamental, 
questions of whether and how rapidly actors 
can make accurate attributions to various 
major dispositions, given that these disposi- 
tions are accurately and consistently repre- 
sented in the behavior of another. Kelley 
(Note 1) terms such dispositional acts as 
motivational moves with “action-immediate 
consequences,” that is, behaviors for which 
an observer can make a direct attribution 
concerning an actor’s goal in regard to the 
actor’s own and another’s outcomes. 

The major purpose of the studies to be reported was to determine whether
and how rapidly actors can make consistent and accurate attributions
regarding different types of social motives. Subjects in Experiments 1
and 2 were required to observe and predict the behavior of a
hypothetical person called “the chooser,” who made choices consistent
with a particular motivational vector across a series of decomposed
games. The use of the decomposed game format made it possible to
represent various motivational choice vectors in a relatively systematic
and unambiguous way. Further, the hypothetical person’s choices were
made in a context where he or she had complete control over own and
another person’s outcomes. Hence, the observer or subject had no reason
to anticipate that the hypothetical person was making choices for
strategic reasons, that is, to influence the other to behave differently
at some subsequent point in time. In effect, then, the hypothetical
person was portrayed as making motivational moves with action-immediate
consequences. And it was the observer’s task to detect the nature of
these motivational moves.

A second purpose of the first two experiments was to explore subjects’
impressions of actors behaving according to different motivational
rules. The overwhelming majority of impression-formation studies have
focused on impressions of individuals described by a series of
personality trait adjectives. There is a paucity of information,
however, about the dispositional information that overt behaviors
convey. The lack of such information is unfortunate, since behaviors are
held to play a central role in the attribution of dispositions (Jones &
Davis, 1965). Dispositional attributions, if they deal with [illegible]
characteristics of an actor’s orientation toward self and others, should
in turn be significant predictors of an observer’s response to the actor
in outcome interdependence situations.

Of particular interest to the present investigation is whether choice
behavior consistent with a particular motivational vector results in
impressions that distinguish the actor from other actors who differ in
their outcome preferences. To find out in what ways impressions might
differ, subjects in Experiments 1 and 2 rated the chooser on bipolar
adjective scales immediately following their observations and
predictions of the chooser’s choices.

Finally, since each motivational vector is defined by a different set of
mathematical operations in regard to one’s own and the other’s outcomes
(e.g., maximization of relative gain implies the algebraic operation of
subtracting other’s outcomes from own outcomes, whereas joint gain
implies adding own and other’s outcomes), a third study was conducted to
determine whether the rate of detecting a particular vector was related
to its mathematical as well as its social properties.


Experiment 1 


The major hypothesis of Experiment 1 was that subjects would experience
greater difficulty predicting the choices made by a chooser who placed
equal or greater value on other’s outcomes (i.e., behaved prosocially)
than a chooser who placed higher value on own rather than other’s
outcomes (i.e., behaved in a self-centered manner). This finding was
anticipated on the basis of previous studies that have demonstrated the
pervasiveness of self-centered choice behavior in laboratory outcome
interdependence situations (e.g., McClintock & McNeel, 1966;
McClintock & Moskowitz, 1976).

In order to get some idea of the generality of the hypothesis, a range
of motives was examined. In addition to the vectors of cooperation,
individualism, and competition, the prosocial motive, altruism
(maximizing other’s gain), was included in the design. Three other
conditions corresponded to vectors falling midway between pairs of these
“pure” motives (i.e., altruism/cooperation, cooperation/individualism,
and individualism/competition; see Figure 2). Thus, in three out of
seven conditions, namely, altruism, altruism/cooperation, and
cooperation, the chooser met the criterion of showing as much or greater
concern with other’s outcomes as with own outcomes. Poorer predictive
accuracy was expected in these conditions than in the remaining
conditions, where the chooser valued own outcomes more highly than
other’s.


Figure 2. Vector representation of the social motives studied in
Experiment 1.


Method 


Subjects. The subjects were 70 introductory psychology students (29
males, 41 females) at the University of Alberta, Canada. All subjects
received course credit for their participation. Subjects were randomly
assigned to observe and predict a chooser having one of the seven social
motives described previously.

Stimuli and apparatus. The stimulus materials 
consisted of two-choice decomposed games. Decom- 
posed game situations were presented, responses were 
made, and feedback was given at a standard tele- 
type connected to a PDP8/I computer. Payoffs to 
the chooser and the other in every decomposed game 
trial were randomly selected by a computer sub- 
routine from the digits 0 through 9. The correct 
choice on each trial was determined according to 
which of the randomly generated chooser-other al- 
ternatives had the largest projection on the chooser’s 
motivational vector. It is important to note that
the correct choice often satisfied both the particular
chooser’s motive and one or more of the other motives
also being investigated in the study. How many
other motives and which other motives were also
satisfied by the chooser’s choice varied from trial
to trial. An example of a decomposed game as it
appeared to subjects is shown in Table 1.
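
The trial-generation rule described in this paragraph can be sketched in Python as follows. This is a reconstruction for illustration, not the authors’ PDP-8/I subroutine: the angle assigned to each motive (22.5° steps from competition at -45° to altruism at 90°), the function names, and the decision to return `None` when both alternatives project equally are assumptions of the sketch, since the text does not say how ties were handled.

```python
import math
import random

# Seven chooser motives, as angles in degrees counterclockwise from the
# own-outcome axis. This encoding is an assumption made for the example.
MOTIVE_ANGLES = {
    "altruism": 90.0,
    "altruism/cooperation": 67.5,
    "cooperation": 45.0,
    "cooperation/individualism": 22.5,
    "individualism": 0.0,
    "individualism/competition": -22.5,
    "competition": -45.0,
}


def projection(own, other, angle_deg):
    """Projection of the point (own, other) on a unit vector at angle_deg."""
    theta = math.radians(angle_deg)
    return own * math.cos(theta) + other * math.sin(theta)


def correct_choice(game, angle_deg):
    """Return 'A' or 'B', whichever projects farther; None if they tie."""
    pa = projection(game["A"][0], game["A"][1], angle_deg)
    pb = projection(game["B"][0], game["B"][1], angle_deg)
    if math.isclose(pa, pb, abs_tol=1e-9):
        return None  # tie handling is not described in the text
    return "A" if pa > pb else "B"


def random_game(rng):
    """Draw (chooser, other) payoffs for each alternative from digits 0-9."""
    return {c: (rng.randint(0, 9), rng.randint(0, 9)) for c in ("A", "B")}


# The Table 1 example: A gives chooser 5 and other 3; B gives 1 and 6.
table1 = {"A": (5, 3), "B": (1, 6)}
table1_choices = {m: correct_choice(table1, a) for m, a in MOTIVE_ANGLES.items()}

# One randomly generated trial (seeded only so the sketch is repeatable).
trial = random_game(random.Random(0))
```

With the Table 1 payoffs, this sketch assigns choice B to altruism and altruism/cooperation and choice A to the remaining five motives, which matches the note to Table 1.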

Procedure. After subjects were brought to the experimental room, they
were informed that the experiment was concerned with how well one could
predict another person’s behavior and the kinds of impressions formed of
the person. Subjects were told that during the previous year, pairs of
students had been brought to the lab. One member of each pair was
selected to act as chooser on the basis of the outcome of a coin flip.
The chooser was presented with a large number of choice situations and
asked to pick one alternative from each. Both the chooser and the other
were subsequently paid by the experimenter according to the choices that
had been made. The structure of the decomposed games was then explained,
and subjects were told that during the experiment their task would be to
predict, as best they could, what the chooser chose in each situation.


Table 1
Example of a Decomposed Game Used in Experiment 1

                              Choice
Designation of points       A       B
Chooser gets                5       1
Other gets                  3       6

Note. In this example, the A choice maximizes cooperation,
cooperation/individualism, individualism, individualism/competition, and
competition; the B choice maximizes altruism and altruism/cooperation.

Three practice trials were provided (without feedback) to acquaint the
subjects with the manner in which the situations would be presented on
the teletype and the manner in which the subjects were to enter their
responses. Following the practice trials, subjects predicted the
chooser’s choices for a total of 40 trials. On the first 10 of these
trials, no feedback was provided about the correctness of predictions.
After the 10th trial, feedback was introduced. The pace of the
experiment was controlled by the subject’s rate of responding.

Upon completion of the last trial, subjects evaluated the personality
attributes of the chooser on 30 9-point bipolar adjective scales:
young-old; unCanadian-Canadian; bad-good; active-passive; weak-strong;
cooperative-competitive; rational-irrational; peaceful-warlike;
conservative-liberal; immoral-moral; religious-unreligious; rich-poor;
stable-unstable; untrustworthy-trustworthy; honest-dishonest;
impolite-polite; selfish-unselfish; cruel-gentle; calm-agitated;
wise-stupid; kind-unkind; usual-unusual; unpredictable-predictable;
serious-humorous; unfriendly-friendly; sociable-solitary;
timid-courageous; masculine-feminine; insincere-sincere; and
tolerant-intolerant. These scales were selected both on the basis of
general interest and on their potential ability to discriminate among
the choosers along psychologically meaningful dimensions, as suggested
by the factor-analytic work of Osgood, Suci, and Tannenbaum (1957) and
Kuusinen (1969).


Results


The average number of times subjects correctly predicted the chooser in
each block of


J. MAKI, W. THORNGATE, AND C. McCLINTOCK 


10 trials is shown in Figure 3. The figure provides support for the
hypothesis: Subjects were better at predicting
individualistic-through-competitive choosers than
cooperative-through-altruistic choosers. Overall, the former choosers
were predicted accurately on about 83% of the trials, whereas for the
latter choosers, the accuracy rate was only 66%.

Two separate analyses of variance were performed after submitting the
proportion of correct predictions for each subject on each trial block
to an arc sine transformation. In the first analysis, performance on the
first block of 10 trials, when no feedback was given to subjects, was
compared with mean performance on the subsequent feedback trials.
Significant main effects were obtained for chooser motive, F(1, 63) =
6.10, p < [illegible], and feedback, F(1, 63) = 21.09, p < .001, as well
as a significant Motive × Feedback interaction, F(6, 63) = 3.16,
p < .01. An examination of Figure 3 indicates that while accuracy
increased in the competitive, altruistic, and individualistic/competitive
chooser conditions with the introduction of feedback, the accuracy for
cooperative and altruistic/cooperative choosers declined slightly. The
second analysis, performed on the three blocks of trials in which
subjects received feedback, once again yielded a significant main effect
for motive, F(6, 63) = 1.91, p < .001. There was also a significant
effect for trial blocks, F(2, 126) = 13.93, p < [illegible], indicating
that subjects improved their performances with practice. The Motive ×
Trial Blocks interaction, however, was not significant, F(12, 126) < 1.
An a priori contrast performed on the overall accuracy on feedback
trials between those who observed altruistic, altruistic/cooperative,
and cooperative choosers versus cooperative/individualistic,
individualistic, individualistic/competitive, and competitive choosers
indicated a statistically reliable difference between these two groups,
t(63) = 6.66, p < [illegible].
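
The arc sine transformation mentioned in the analyses above can be sketched as follows. The report does not say which variant was applied, so the form used here, 2·arcsin(√p), is an assumption (it is one common variance-stabilizing version for proportions), and the function name and example scores are hypothetical.

```python
import math


def arcsine_transform(p):
    """Arcsine square-root transform of a proportion p in [0, 1].

    The 2 * asin(sqrt(p)) variant is assumed; the source names only
    "an arc sine transformation" without giving the formula.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p))


# Hypothetical scores: correct predictions out of a block of 10 trials.
block_scores = [7, 9]
transformed = [arcsine_transform(k / 10) for k in block_scores]
```

The transformation stretches proportions near 0 and 1, where raw proportions have compressed variance, before they are entered into an analysis of variance.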


To determine the extent to which observers formed similar or dissimilar
impressions of the choosers whom they observed, the obtained personality
attribute ratings were submitted to a discriminant function analysis
(Tatsuoka, 1970). Discriminant function analysis is a multivariate
statistical technique that finds the set of uncorrelated, weighted
linear combinations of predictor variables that maximally discriminates
among individuals classified according to some grouping variable. In
this study, groups consisted of chooser-motive treatment conditions.

Of the discriminant functions that resulted from this analysis, only one
reached statistical significance, Wilks’s lambda (180) = .0116,
p < .02. This function accounted for 47.53% of the discriminatory power
of the battery of derived functions. The locations of group means on the
function, or “centroids,” are plotted in Figure 4.

Figure 4 indicates that subjects’ ratings were able to discriminate
among the choosers along a dimension roughly corresponding to social
motivational orientation. While the ordering of centroids on this
dimension is not isomorphic with the degree of the various choosers’
prosocial intent, subjects’ ratings of choosers discriminated between
those with prosocial and those with self-centered outcome preferences.
The larger discriminant coefficients obtained from the analysis
indicated that higher ratings of agitation, stability, selfishness,
unfriendliness, badness, masculinity, and, oddly enough, politeness
tended to be associated with self-centered classification by this
function.

Figure 4. Location of group centroids on the significant discriminant
function, Experiment 1.


Discussion 


Two important findings emerged from this experiment. First, subjects
were not uniformly accurate at predicting the outcome preferences of
actors. In a third-party prediction task, the social motivational
orientations of self-centered choosers were more readily detected than
those of prosocial choosers. This finding is particularly interesting if
considered in the light of subjects’ performance on the initial
no-feedback trials, which amounted to guessing the choosers’
preferences. In four of the experimental conditions (competition,
competition/individualism, cooperation, and altruism/cooperation),
guessing accuracy was approximately equivalent (Ms = 6.6 to 7.0).
However, with the addition of feedback, subjects became more accurate at
predicting the competitive and competitive/individualistic choosers than
the cooperative and the altruistic/cooperative choosers. Second,
subjects formed impressions of the choosers that showed a coherent
relationship to the choosers’ social motives. That is, they
distinguished between self-centered and prosocial choosers in their
personality trait ratings. It is curious that while at some level
subjects apparently recognized the latter choosers’ prosocial intent, as
indicated by their trait ratings, this recognition was not reflected in
the accuracy with which they predicted these choosers’ outcome
preferences.


Experiment 2 


While the results of Experiment 1 are en- 
couraging as an indication of the potential 
usefulness of the decomposed game technique 
for investigating the perception of social mo- 
tives, they must be interpreted with caution. 
A relatively high degree of confounding of 
motives exists in the choice-dominance struc- 
tures of the two-choice decomposed game. De- 
pending upon the particular configuration of 
chooser’s and other’s payoffs in the situa- 
tions employed as stimuli in Experiment 1, 
anywhere from one to all seven of the motiva- 
tional vectors that were being investigated 
could have been represented by the same 
decomposed game choice. The most precise diagnostic information about
the chooser’s motive (from a mathematical standpoint) should have been
communicated in decomposed games in which the alternative selected by
the chooser was not the same as the alternative preferred by choosers
with any of the other motivational orientations under study. However,
this special situation could only have occurred in Experiment 1 when the
chooser’s motive was altruism or competition. Even under this
circumstance, the situation would be expected to occur relatively
infrequently because of the random nature of the computer-generated
decomposed games. Thus, it is unclear to what extent subjects in every
condition of Experiment 1 received equivalent amounts of information and
whether it was in a sufficiently unconfounded form to permit them to
accurately identify the chooser’s motivational vector.

The purpose of Experiment 2 was to extend and replicate the findings of
the previous study. In this experiment, the decomposed game observation
and prediction procedure of Experiment 1 was used but modified so that
the degree and nature of the confounding of motives could be precisely
specified and controlled across conditions. As in the previous study,
the issues of whether choice patterns consistent with different
motivational vectors are perceived by subjects with differential
accuracy and whether differences in social motivational orientation
influence impression formation were of interest.

In addition, however, Experiment 2 was also concerned with subjects’
predictions in situations where an actor’s dominant motivational vector
is not represented among the decomposed game choices, but the two
immediately adjacent choices are represented. If an actor is assumed to
be motivated by only a single vector, then the actor should be
indifferent toward two alternatives equally distant from this dominant
motive. However, if subjects believe that the chooser has a secondary
choice rule, they should not predict these equidistant alternatives with
equal frequency (see MacCrimmon & Messick, 1976). Of particular interest
was whether subjects would tend to predict the more self-centered or the
more prosocial of the alternatives equally distant from the chooser’s
primary vector, and if they did tend to predict one alternative more
than the other, whether this tendency would be related in any way to the
chooser’s social motive.

In this experiment, four of the motivational orientation conditions were
retained from Experiment 1: altruism, cooperation, individualism, and
competition. It was once again predicted that choosers who adopted an
individualistic or competitive goal would be more accurately predicted
than choosers who made choices consistent with an altruistic or
cooperative goal. As well as the vectors just noted, four motivational
vectors not previously investigated were included in the design of
Experiment 2: aggression (minimize other’s gain), martyrdom (maximize
other’s relative gain), masochism (minimize own gain), and
sadomasochism (minimize joint gain). The inclusion of these additional
vectors permitted a more general test of the hypothesis that
self-interested motives are the most accurately perceived. Figure 5
shows the locations of these vectors in own-other outcome space.

Although seldom observed in laboratory gaming situations, aggression
undoubtedly plays a significant role in many real-life outcome
interdependence situations. Studies conducted by Davis (1972) suggest
that an aggressive preference on the part of an actor is often
interpreted as a symptom of a competitive nature. Aggression differs
fundamentally from martyrdom, masochism, and sadomasochism in that one’s
own gain receives greater weight (zero) than the other’s gain (negative)
in determining outcome preference. Thus, it was anticipated that
subjects would experience relatively little difficulty in predicting an
aggressive chooser and that accuracy would resemble that observed for
other self-centered motives.

The vectors of martyrdom, masochism, and sadomasochism are rare not only
in laboratory games but also in real life. Despite their abnormal status
among social motives, subjects’ relative abilities to detect them are of
theoretical interest. To the extent that an actor repeatedly behaves in
an atypical fashion (i.e., displays a preference for negative over
positive outcomes to self), an expectancy hypothesis would predict that
observers would have difficulty identifying the actor’s choice rule.
However, it is possible that just the opposite might obtain. Because
these motives violate expectations concerning “rational” choice
behavior, they may tend to arouse greater interest or attention on the
part of observers and, hence, may potentially be detected more readily
than those motives which afford the self greater positive outcomes.


Figure 5. Vector representation of the social motives studied in
Experiment 2.


Method 


Subjects. The subjects were 89 female introductory
psychology students at the University of Cali-
fornia, Santa Barbara, who participated in the ex- 
periment in partial fulfillment of a course require- 
ment. The number of subjects who watched choosers 
with each of the eight motivational orientations 
varied between 10 and 12. An additional 12 subjects 
were run but were discarded from the analysis for 
reasons of equipment failure (10), experimenter er- 
ror (1), or failure to understand the instructions (1). 

Stimuli. Subjects alternately observed the chooser
make choices then predicted the chooser’s behavior,
without feedback, in a series of four-choice decomposed
games (see Table 2). An observation/prediction
trial block consisted of a total of 8 observation
trials followed by 12 prediction trials. This
observation/prediction sequence was repeated five times in
the course of the experiment. Although the
observation/prediction format differed from that used in
Experiment 1, no differences in the direction of the
results were expected.

The decomposed games presented during the
observation trials contained one choice that was 
dominant in terms of the chooser's motivational 
orientation. Games were constructed in a manner 
such that this dominant alternative maximized only 
the chooser’s motive in 25% of the games. In the 
remaining 75%, this dominant choice maximized not only
the chooser’s motive but also two neighboring
motives (motives falling within 90° of the chooser’s 
vector). 

Motivational structures of 8 of the 12 decom- 
posed games presented during the prediction trials 
were the same as the 8 games presented during the 
Table 2
Example of a Decomposed Game Used
in Experiment 2

Designation                        Choice
of points         A             B             C     D
Chooser gets      [illegible]   [illegible]   1     6
Other gets        1             0             3     1


Note. In this example, the A choice uniquely maxi- 
mizes sadomasochism; the B choice uniquely 
maximizes aggression; the C choice maximizes 
altruism, martyrdom, and masochism; the D 
choice maximizes competition, individualism, and 
cooperation. 


observation trials. The other 4 prediction trials con- 
sisted of (a) 2 games in which the chooser’s social 
motive would lead to a unique choice, as it did in 
the 8 games previously described, and (b) 2 games 
in which there was no unique choice consistent with 
the chooser’s motive, but rather the chooser’s mo- 
tive was equally satisfied by two outcomes repre- 
senting maximal projections on the motivational 
vectors immediately adjacent to the chooser’s vector.
Inclusion of the latter games in the prediction trials
provided a means for examining subjects’ beliefs
about the choosers’ possible secondary motives.

Payoffs on all games were randomly transformed
so that no two situations looked identical to subjects.
Thus, accurate predictions could not be made
by rote memorization of the observation trials.

Procedure. From one to four subjects participated
in each experimental session. Upon their arrival, subjects
were placed in individual cubicles. Each of
these cubicles was equipped with a TV monitor and
a response box with four letters labeled A through
D. The instructions, presented on videotape, were
read by a male graduate student.

Subjects were instructed that the purpose of the experiment was to develop a test of “social intelligence” that would [...] ability to understand and predict the behavior of another [...] was then presented and the nature of the payoffs explained. Subjects were told that the payoffs were in cents and that they were to assume that these cents came from a bowl with an unlimited number of pennies. When the chooser made a choice that resulted in a positive payoff to self and/or other, pennies were extracted from the bowl. When either the chooser or the other got a negative payoff as a result of the chooser's choice, the party or parties who received the negative payoff returned the appropriate number of cents to the bowl. The reference to a bowl with an unlimited number of pennies was introduced to reduce the likelihood that subjects would interpret the situation as zero-sum (Rapoport, 1966).

During the first part of the experiment, blocks of observation and prediction trials were clearly segregated by segments of blank videotape and appropriate warning messages: “In five seconds the observation (prediction) trials will begin.” On each observation trial, subjects first saw the decomposed game, and then, after a brief pause, the letter of the chooser’s choice was displayed along with the decomposed game choices. Prediction trials were similar to observation trials except that the word prediction? appeared beneath the decomposed game instead of the letter of the chooser's selected alternative. No feedback was provided. Subjects were allowed approximately [illegible] sec in which to indicate their predictions by pressing the appropriate button. Responses were recorded by an Esterline-Angus event recorder.
Following the observation and prediction trials, subjects were given a questionnaire similar to that used in Experiment 1, which asked them to evaluate the chooser’s personality attributes on 22 bipolar adjective scales. These included 16 scales that were selected from among the items used in Experiment 1 because the mean ratings received by the choosers in the previous experiment exhibited a range of at least 2.5 scale points: calm-agitated; weak-strong; rational-irrational; religious-irreligious; impolite-polite; bad-good; usual-unusual; insincere-sincere; active-passive; cooperative-competitive; peaceful-warlike; stable-unstable; selfish-unselfish; kind-unkind; masculine-feminine; unfriendly-friendly. Six new scales (punishes self-punishes other, hateful-loving, unfair-fair, ungenerous-generous, unconfident-confident, just-unjust) were added.

Subjects also estimated the weights that the chooser placed on own gain and other's gain on separate 9-point scales, which ranged from 1, “Chooser wanted to avoid giving cents to self (other),” to 9, “Chooser wanted to give as many cents to self (other) as possible.”


Table 3
Mean Percentage of Correct Predictions, by Chooser Motive Condition, in Experiment 2

Condition         % correct
Individualism     78.8
Cooperation       64.1
Altruism          58.8
Martyrdom         41.1
Masochism         51.1
Sadomasochism     28.8
Aggression        65.0
Competition       74.7


Results 

The overall mean percentages of correct predictions made by subjects in each condition are shown in Table 3. Figure 6 shows the mean number of correct predictions plotted as a function of trial blocks. For analysis of variance (ANOVA) purposes, those prediction trials for which there were two possible correct responses were omitted. (The results for these trials will be discussed later.) For ease of computation, subjects were randomly eliminated from conditions in which the number of subjects exceeded 10. In addition, prior to performing the ANOVA, an arc sine transformation was once again performed on the proportion of correct responses made by each subject on each block of prediction trials.

The results of the analysis revealed significant main effects for chooser motivational orientation, F(7, 72) = 5.955, p < .001, and trial blocks, F(4, 288) = 8.076, p < .001. As Table 3 and Figure 6 indicate, individualism and competition proved to be the easiest motives for subjects to detect, followed by aggression, cooperation, and altruism. Masochism, martyrdom, and sadomasochism posed the greatest difficulty, with subjects exposed to the sadomasochistic chooser performing at no better than chance level. While there was a general tendency toward improvement over the course of the chooser observation and prediction phase, as indicated by the significant trial blocks effect, the rate of improvement displayed did not differ appreciably across conditions: The Motive × Trial Blocks interaction did not reach statistical significance, F(28, 288) = 1.102.

Like the results of the overall ANOVA, a planned comparison designed to replicate the comparison performed on the data of Experiment 1 supported the original findings. The predictive accuracy of those who watched a
Figure 6. Mean number of correct predictions by
chooser motive condition and trial blocks, Experiment 2.
(Ind. = individualism; Comp. = competition;
Coop. = cooperation; Aggr. = aggression; Maso.
= masochism; Alt. = altruism; Mart. = martyrdom;
Sado. = sadomasochism.)


cooperative or altruistic chooser was significantly
poorer than that of those who watched an
individualistic or competitive chooser, t(72) =
2.43, p < .02. A second planned comparison,
between conditions in which the chooser assumed
a choice rule in the motivationally
“normal” range (i.e., altruism, cooperation,
individualism, competition, aggression) versus
conditions in which the chooser’s social motive
might be considered abnormal (i.e., martyrdom,
masochism, and sadomasochism),
demonstrated the existence of a significant
difference in favor of higher accuracy on the
part of subjects who watched the former
types of choosers, t(72) = 5.30, p < .001.

Examination of subjects’ biases concerning
the choosers’ possible secondary motives was
performed by first counting the number of
times that subjects made correct predictions
on the total of 10 trials in which the decomposed
game situation gave the chooser two
dominant choice alternatives, then determining
how often each of these two choices was
Table 4
Percentage of Choices of Each Dominant Alternative in “Chooser Indifferent” Prediction Trials

                  No. of correct
Condition         predictions      Choice                  z value
Individualism      99              16% competition          5.20**
                                   [illegible]
Cooperation       106              73% individualism        4.74**
                                   21% altruism
Altruism           81              63% cooperation          2.34*
                                   37% martyrdom
Martyrdom          63              58% altruism             1.25
                                   42% masochism
Masochism          79              60% martyrdom            1.77
                                   40% sadomasochism
Sadomasochism      38              34% masochism           -1.97*
                                   66% aggression
Aggression         97              17% sadomasochism       -6.15**
                                   83% competition
Competition       107              32% aggression          -3.73**
                                   68% individualism

*p < .05.
**p < .01.


predicted by the subjects. The results are 
shown in Table 4. From the table it is evident 
that there was a tendency on the part of sub- 
jects to predict that the chooser would prefer 
the alternative favoring self-interest as op- 
posed to the other’s interest when confronted 
by two choices equivalent in terms of their 
actual dominance. 
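
The article does not state how the z values in Table 4 were computed. As a hedged illustration only, the following Python sketch assumes a normal-approximation test of the observed proportion against an even 50/50 split between the two dominant alternatives. Under that assumption it reproduces the printed z values for the altruism and sadomasochism rows. The function name `z_against_even_split` is introduced here for illustration and is not taken from the article.

```python
import math


def z_against_even_split(proportion_first, n_predictions):
    """Normal-approximation z for a proportion tested against 0.5.

    Assumption (not stated in the article):
        z = (p - 0.5) / sqrt(0.25 / n)
    """
    standard_error = math.sqrt(0.25 / n_predictions)
    return (proportion_first - 0.5) / standard_error


# Altruism row of Table 4: 63% of 81 predictions favored cooperation.
print(round(z_against_even_split(0.63, 81), 2))  # 2.34

# Sadomasochism row: 34% of 38 predictions favored masochism.
print(round(z_against_even_split(0.34, 38), 2))  # -1.97
```

A positive z indicates that the first-listed alternative was predicted more often than an even split would imply, and a negative z indicates the reverse.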

To summarize, the prediction phase of Ex- 
periment 2 provided a general replication of 
Experiment 1, yielding three main findings. 
First, the farther the chooser’s motive was 
from own or relative gain maximization, the 
less readily it tended to be detected. Second, 
there was no evidence to suggest that the 
atypicality of motivational vectors affording 
negative outcomes to self in any way facili- 
tated their recognition. Finally, there ap- 
peared to be a general expectation among 
subjects that the chooser would behave in 
such a way as to further self-interest when 
faced with a choice between two presumably 
equally attractive alternatives, although this 
expectation became less pronounced the farther
the chooser’s actual choice rule was from
individualism or competition.
Analysis of the ratings of choosers collected during the impression phase of the study also produced results basically similar to those of Experiment 1. A discriminant function analysis performed on the ratings resulted in a single significant function accounting for 48.26% of the discriminating power of the battery of derived functions, Wilks lambda(154) = .0359, p < .001. Figure 7 indicates that choosers roughly ordered themselves from self-centered to other-centered along the dimension defined by the discriminant function, but this time clustered into three rather than two groups. Choosers with the most negative discriminant scores (the individualistic, competitive, and aggressive choosers) were those who valued their own gain more highly than the other's gain. The most positive discriminant scores were obtained by choosers who valued the other’s gain above their own gain: the altruist, the masochist, and the martyr. Intermediate between these two extremes were the sadomasochistic and cooperative choosers, who had in common the characteristic of treating self and other alike (and in the long run, equitably). (See Figure 7.)

This time the larger discriminant coeffi- 
cients indicated that “self-centered” classifi- 
cation tended to be negatively associated 
with perceived politeness, passivity, gener- 
osity, and kindness, and positively associated 
with perceived agitation, other punitiveness, 
stability, selfishness, and justness. Differ- 
ences between this function and the function 
obtained in Experiment 1 could be due to 
differences both in the sample of adjective 
scales used for rating the chooser and in the 
range of stimulus motives investigated in the
two experiments.


Figure 7. Location of group centroids on the significant
discriminant function, Experiment 2. (Groups labeled in the
figure: competition, aggression, individualism, sadomasochism,
cooperation, martyrdom, masochism, altruism.)
Figure 8. Mean observer-assigned vectors of inaccurate
observers, Experiment 2. (ns are in parentheses.
Maso. = masochism; Alt. = altruism; Mart. = martyrdom;
Comp. = competition; Ind. = individualism;
Coop. = cooperation; Aggr. = aggression; Sado. =
sadomasochism.)


Finally, subjects’ direct estimates of the 
weights that the choosers placed on their own 
and the other’s gain were examined. Since it 
was anticipated that the ability to accurately 
estimate these weights might be related to a 
subject’s success at predicting the chooser, 
weight estimates of accurate and inaccurate 
observers were considered separately. An accurate
observer was defined as a subject who
made at least 9 out of 12 correct predictions
on the last prediction trial block. Subjects
who made 8 or fewer correct predictions were
classified as inaccurate. Mean observer-assigned
weights were transformed by subtracting [...]
to give the transformed scale a
zero midpoint. Vectors derived from these
transformed weights are shown in Figures
8 and 9.

From Figures 8 and 9, it is evident that observers
who accurately predicted the chooser
on the last trial block were also more accurate
in estimating the relative weights that
the chooser placed on own and other’s gain.
For the accurate observers, the ordinal placement
of the eight vectors corresponding to
Figure 9. Mean observer-assigned vectors of accurate
observers, Experiment 2. (ns are in parentheses.
Sado. = sadomasochism; Maso. = masochism; Mart.
= martyrdom; Alt. = altruism; Coop. = cooperation;
Ind. = individualism; Comp. = competition; Aggr.
= aggression.)


the eight chooser motivational orientations 
paralleled the correct order. However, several 
errors in the ordinal placement of these 
vectors occurred for the inaccurate observers. 
Both accurate and inaccurate observers ex- 
hibited a tendency to assimilate the relative 
weights assigned to the masochistic and the 
altruistic choosers toward martyrdom. Simi- 
larly, there was a tendency for the relative 
weights assigned to the aggressive and indi- 
vidualistic choosers to be assimilated towards 
competition. In the case of inaccurate ob- 
servers, this latter assimilative tendency also 
held for the weight estimates of observers 
who viewed the cooperative and sadomasoch- 
istic choosers. 

Prior to discussing the results of these 
first two studies, we will describe a third 
study that asked whether the rate of detect- 
ing a particular motivation vector was related 
to its mathematical properties. 


Experiment 3 


A few studies have investigated concept 
learning tasks that bear a resemblance to the 
present chooser prediction paradigm. In these experiments, the subject is typically required to make interval-level responses to a series of stimuli that vary on one or more attribute dimensions. In order to attain a reasonable level of accuracy, the subject must come to identify which of the stimulus attributes are relevant to the response, to correctly assign a relative weight to each of these attributes, and to identify the direction of the correlation between each relevant attribute and the response dimension.

The chooser prediction task of Experiments 1 and 2 is similar to an interval-concept learning problem, because subjects must decide whether one or both of two attributes, chooser’s payoff and other’s payoff, are relevant to determining the chooser’s value for outcomes. Which of these attributes are actually important depends upon the motivational orientation of the chooser. To determine the value of outcomes for an individualistic chooser, for example, it is necessary to pay attention only to the chooser's payoff. However, both the chooser’s payoff and the other’s payoff must be taken into account when calculating the value of outcomes for a cooperative chooser. The relative weights placed on one’s own outcome and the other's outcome and their signs (positive or negative) are shown in Table 5 for each of the chooser motivational orientation conditions in Experiment 2.

Experiment 3 was concerned with investigating the chooser prediction paradigm in the context of a nonsocial, interval-concept learning task. In order to study just the mathematical component of the prediction task, references to a chooser making choices for self and other were deleted from the instructions and the decomposed game matrices.


Table 5
Relative Weights Placed by Chooser on Own and Other's Outcomes, Experiment 2

                  Weight on      Weight on
Condition         own outcome    other's outcome
Individualism     +1              0
Cooperation       +1             +1
Altruism           0             +1
Martyrdom         -1             +1
Masochism         -1              0
Sadomasochism     -1             -1
Aggression         0             -1
Competition       +1             -1

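
The own/other weights described above can be read as a simple scoring rule. The Python sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes that a chooser values each alternative as the weighted sum of own and other's payoffs, and the payoffs in `example_game` are hypothetical numbers chosen for illustration rather than values taken from the article. The names `MOTIVE_WEIGHTS`, `motive_value`, and `dominant_choices` are likewise introduced here.

```python
# (weight on own outcome, weight on other's outcome) for each motive,
# following the weights given for the eight conditions of Experiment 2.
MOTIVE_WEIGHTS = {
    "individualism": (1, 0),
    "cooperation": (1, 1),
    "altruism": (0, 1),
    "martyrdom": (-1, 1),
    "masochism": (-1, 0),
    "sadomasochism": (-1, -1),
    "aggression": (0, -1),
    "competition": (1, -1),
}


def motive_value(motive, own_payoff, other_payoff):
    """Value of one alternative, assuming a weighted sum of the payoffs."""
    w_own, w_other = MOTIVE_WEIGHTS[motive]
    return w_own * own_payoff + w_other * other_payoff


def dominant_choices(motive, game):
    """Return the label(s) of the alternative(s) with the highest value."""
    values = {
        label: motive_value(motive, own, other)
        for label, (own, other) in game.items()
    }
    best = max(values.values())
    return sorted(label for label, value in values.items() if value == best)


# Hypothetical four-choice game: label -> (chooser gets, other gets).
example_game = {"A": (2, 1), "B": (4, 0), "C": (1, 3), "D": (6, 1)}

for motive in MOTIVE_WEIGHTS:
    print(motive, dominant_choices(motive, example_game))
```

With these hypothetical payoffs, each motive has a single dominant alternative, and several neighboring motives share the same one, which is the kind of confounding the observation games were built to contain. A game in which `dominant_choices` returns two labels would correspond to a trial on which the chooser's motive is equally satisfied by two alternatives.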

The findings of previous studies indicate 
that subjects are better at interval-concept 
learning when relevant attributes are few as 
opposed to many (Azuma & Cronbach, 1966). 

In addition, variation in the relevant distribution of weights over attributes affects response accuracy. Uhl (1963) varied the predictive validities of three attributes and found a nonmonotonic relationship between subjects’ response accuracy and relative weight disparity. Overall, subjects performed poorly when there was slight disparity among attribute weights, improved their accuracy when attributes were equally valid, and showed the best performance when there was a large discrepancy among weights. On the basis of these findings, it was hypothesized that subjects would find the nonsocial version of the chooser prediction task easier in conditions where one of the outcomes received a weight of zero than in conditions where both outcomes were equally important in determining an alternative’s value.

Although it was recognized that the task
involves making an ordinal comparison of alternatives
to determine which is most highly
valued, in addition to the interval judgment
of the value of each alternative, this ordinal
component was not expected to differentially
affect accuracy across conditions in the nonsocial
version of the task employed in this
experiment.


Method 


Subjects. The subjects were 64 introductory psychology
students (13 males, 51 females) at the University
of California, Santa Barbara. Eight subjects
were assigned to serve in each condition.
Subjects received course credit for their participation.

Stimuli. A series of decomposed games identical
to that used in Experiment 2 was employed. The
games were referred to as “matrices” in all instructions
to subjects, and the words Blue and
Green were substituted for the words Chooser gets
and Other gets in the matrix display.

Procedure. Subjects arrived at the laboratory in
pairs and were seated on the opposite sides of a
partition by the experimenter. Both subjects could
clearly see the video monitor on which instructions 
and stimuli for the experiment were presented. 

Subjects were told that the experiment was con- 
cerned with finding out how well people could learn 
numerical concepts. They were instructed that they
would be shown alternately a series of matrices in 
which an instance of the concept to be learned (a 
column of each matrix) would be indicated and a 
series of matrices in which they would be required 
to predict which of the four columns, A, B, C, or 
D, was an instance of the concept. A “hint” was 
provided that the column that was an instance had 
something to do with the entries in the blue and/or 
green rows. The terms matrix, column, and row 
were thoroughly explained. 

As in Experiment 2, the sequence of first observing 
for 8 trials, then predicting for 12 trials was re- 
peated five times. Subjects recorded their responses 
by marking the appropriate letter beside the trial 
number in an answer booklet. 

At the end of the experiment, subjects were given 
a questionnaire designed to assess their comprehen- 
sion of the concept. The questionnaire consisted of 
three parts. In the first part, subjects were asked to 
indicate whether for instances of the concept, the 
blue (green) numbers had tended to be larger, 
smaller, or neither larger nor smaller than the other 
blue (green) numbers in the matrix. Second, subjects 
were asked to write down the rule for deciding 
which column in a matrix was an instance. Finally, 
they were asked to fill in a blue number and a 
green number in a sample matrix where columns A, 
B, and C were given in such a way as to make 
column D an instance of the concept. 


Results and Discussion 


Table 6 gives the mean percentage of cor- 
rect predictions across all trials for each con- 
dition in the present concept learning experi- 
ment. The figures for sister conditions in Ex- 
periment 2 are also shown. 

A comparison between the Experiment 2
and the Experiment 3 results indicates that,
with the exception of the condition blue: -1,
green: -1, the average accuracy for subjects
in the social prediction task of Experiment 2
was higher (in some cases, quite substantially)
than the average accuracy in the concept
learning task.

Considering the accuracy figures for Experiment 3
alone, it is evident that the experimental
hypothesis was not supported:
There was no difference in performance on
the task between conditions where either blue
or green numbers received zero weight and
conditions where both blue and green numbers
Table 6
Mean Percentage of Correct Predictions by Chooser Motive/Concept Condition,
Experiments 2 and 3

                                           % correct predictions
                                        Experiment 2    Experiment 3
Motive/concept                          (motives)       (concepts)

Individualism/blue: +1, green: 0        78.8            40.0
Cooperation/blue: +1, green: +1         64.1            37.5
Altruism/blue: 0, green: +1             58.8            30.7
Martyrdom/blue: -1, green: +1           41.1            31.3
Masochism/blue: -1, green: 0            51.1            34.1
Sadomasochism/blue: -1, green: -1       28.8            32.7
Aggression/blue: 0, green: -1           65.0            32.1
Competition/blue: +1, green: -1         74.7            38.3

Chance                                  29.2            29.2


had nonzero weight. In support of this con- 
clusion, an analysis of variance performed on 
these data after transforming them by the 
arc sine procedure yielded no significant main 
effect for condition, F(7, 56) < 1. A statistically
demonstrable improvement in accuracy
did occur across trial blocks, however,
F(4, 224) = 7.689, p < .001. 
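
The chance level of 29.2% reported in Table 6 is not derived in the text. One reading that reproduces it, offered here as an assumption rather than as the authors' stated computation, is that each block of 12 four-alternative prediction trials contained 10 trials with a single correct answer and 2 trials with two acceptable answers, as described for the prediction trials of Experiment 2. The function name and its parameters are introduced here for illustration.

```python
from fractions import Fraction


def chance_accuracy(n_unique=10, n_paired=2, n_alternatives=4):
    """Expected proportion correct under random guessing.

    Assumes n_unique trials with one correct alternative and n_paired
    trials with two acceptable alternatives, each with n_alternatives
    options chosen with equal probability.
    """
    expected_correct = (
        n_unique * Fraction(1, n_alternatives)
        + n_paired * Fraction(2, n_alternatives)
    )
    return expected_correct / (n_unique + n_paired)


chance = chance_accuracy()
print(chance)                          # 7/24
print(round(100 * float(chance), 1))   # 29.2
```

If only the 10 single-answer trials were counted, the same calculation would give 25%, so the 29.2% figure suggests that the two-answer trials were included in the percentages of Table 6.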

The results obtained from the questionnaire 
administered at the end of the experiment in- 
dicated that few of the subjects in the experi- 
ment ever grasped the concept they were try- 
ing to learn. (Since the pattern of results 
appeared to be uninfluenced by condition, it 
will be described for all subjects.) Of the 
total of 64 subjects, 42, or approximately two 
thirds, failed to answer any part of the ques- 
tionnaire correctly. Of the remaining 22 sub- 
jects, 19 were able to correctly generate an 
instance of the concept. However, because a 
large range of answers was potentially ac- 
ceptable, it is not possible to determine how 
many of these 19 correct instances could be 
attributed to guessing. Subjects’ performances 
were much poorer on the other two parts of 
the questionnaire than on the instance-gen- 
eration question. Only 8 subjects managed to 
verbalize the rule, and only 6 subjects cor- 
rectly identified the relative weights of both 
the blue and green numbers. In total, only 3 
subjects answered the entire questionnaire 
correctly. 

Thus, the results of the present experiment
do not suggest that a relationship existed between
the numerical complexity of the chooser’s
choice rule and the subject’s ability to
predict the chooser in a social context. In
fact, it appears safer to conclude the opposite:
that the social context of the prediction task
in the earlier studies facilitated learning of
the mathematical aspects of the chooser's
choice rule.


General Discussion 


The results of Experiments 1 and 2 confirmed that variations in an actor’s preference for outcomes affect the overall accuracy with which observers are able to predict the actor’s choice behavior in a series of decomposed games. Although the preprogrammed choosers employed in these experiments exhibited perfect consistency in making selections from among sets of own/other outcome distributions, subjects nevertheless experienced difficulty in predicting the chooser's choices when he or she behaved in a negatively self-interested or prosocial fashion.

These data seem best interpreted within an interaction-expectancy framework. The more popular a particular social motive is perceived to be as an interaction goal, the more likely it will be anticipated (and detected) in the behavior of others. Support for this assertion was obtained both from the accuracy data and from the existence of a self-interest bias in subjects’ predictions of what the chooser would do when confronted with

PREDICTION AND PERCEPTION OF SOCIAL MOTIVES 


paired rather than single alternatives domi- 
nant with respect to the chooser’s motiva- 
tional vector. 

Expectations concerning the chooser’s likely 
pursuit of self-interested goals appear to 
have been highly pervasive in influencing 
subjects’ responses throughout the experiments. An examination of the performance of individual subjects indicated that most of them improved their accuracy in an incremental fashion rather than by the abrupt transition that might have occurred if the subjects had been systematically testing a range of alternative hypotheses about the chooser’s motive. Subjects may have been reluctant to abandon a self-interest hypothesis even when it was less than perfectly valid. Confounding of motives within the choice-dominance structures of the decomposed games would allow subjects who held certain wrong hypotheses about the chooser (i.e., subjects who believed the chooser’s motivational vector to be within 90° of his or her true vector) to still make some correct responses. Exacerbated by a tendency to pay more attention to instances of the chooser’s behavior that confirmed a preference for self-interest than instances that provided disconfirmation (see Bourne, Ekstrand, & Dominowski, 1971), this explanation might account for some of subjects’ conservatism in utilizing the information contained in the decomposed game choices made by prosocial and negatively self-interested choosers.

As noted previously, it is curious that differences emerged in subjects’ abilities to predict self-interested and prosocial choosers, given that fairly distinctive impressions of the personality attributes of these types of choosers were formed. Research in the area of person perception has stressed the importance of personality impressions, particularly first impressions, in determining our reactions toward others (Hastorf, Schneider, & Polefka, 1970). Yet in Experiments 1 and 2 the impressions that were formed of prosocial and negatively self-interested choosers were apparently of little value to subjects in predicting these choosers’ choice behaviors, even though the choices to be predicted were of the same type as those upon which impressions were based.

One possible explanation for this apparent 
incongruity is that subjects’ ratings of the 
choosers’ personality attributes may have 
been retrospective reflections upon the choos- 
ers’ choices, as opposed to impressions formed 
early in the course of observing the choosers.
Whether the personality attribute ratings rep- 
resent early or retrospective impressions, 
however, it is interesting to note the existence 
of groupings of motives in terms of both the 
discriminant function analysis of these rat- 
ings and the vector representations derived 
from the subject-estimated weights of Ex- 
periment 2. These groupings suggest that the 
psychological representation of motivational 
vectors may differ from their formal repre- 
sentation in terms of perceived angular dis- 
tance, if not in their ordering. 

Finally, Experiment 3 investigated the 
difficulty of learning various choice rules con- 
sistent with certain social motives outside of 
a social context. No evidence was obtained to
suggest that the mathematical complexity of 
a given choice rule had any influence on in- 
creasing or decreasing an observer’s ability 
to predict. Conversely, identifying the rule- 
detection task as a social one positively, 
though differentially, affected the likelihood 
of correct detection. 

A major limitation of the present investiga- 
tions is that they have provided little infor- 
mation about the factors that mediate the 
observed relationship between the chooser’s
social motive and the subjects’ predictive accuracy.
While observer expectations about
the probability that an actor will adopt particular
social motives provide an overall
explanation for the results, other influences may
have made important contributions to the 
present findings. For instance, the social mo- 
tivational orientations of observers may have 
been a significant factor in determining their 
relative success at the prediction task. Sub- 
jects in the experiments described here ex- 
hibited a considerable degree of individual 
difference in their abilities to predict the vari- 
ous motives. If interaction expectancies are 
affected by one’s own social motive, as the 
work of Kuhlman and Wimberley (1976)
would suggest, then it follows that this motive
would influence the ability to detect
others’ social motives, especially others with 
motives identical to the observer’s own orien- 
tation. Future research will consider this re- 
lationship. 


Reference Note 


1. Kelley, H. H. Action and perception: An attribution
analysis of social interaction. The Katz-Newcomb
Lecture, University of Michigan, 18 April
1975.
Effects of External Evaluation on Artistic Creativity
This study examined the conditions under which the imposition of an extrinsic constraint upon performance of an activity can lead to decrements in creativity. Female college students worked on an art activity either with or without the expectation of external evaluation. In addition, subjects were asked to focus on either the creative or the technical aspects of the activity, or they were given no specific focus. Finally, some subjects expecting evaluation were given explicit instructions on how to make their artworks. As predicted, subjects in the evaluation groups produced artworks significantly lower on judged creativity than did subjects in the nonevaluation control groups. The only evaluation group for which this pattern was reversed had received explicit instructions on how to make artworks that would be judged creative. A possible reconciliation of these two disparate results is proposed, and practical implications are discussed.

American Psychological Association, Inc. 0022-3514/79


In his autobiography, Albert Einstein describes a serious motivational problem he encountered during his student days: His physics examinations, forcing him to “cram all this stuff into one’s mind” (Schilpp, 1949, p. 17), were so unpleasant that afterward he could not bring himself to consider scientific problems for an entire year. When he went on to advanced study in Zurich, he found ways to blunt the effects of educational constraint, “which smothers every truly scientific impulse” (p. 17). For example, he had a friend who agreed to work over the lecture materials so that Einstein would be freed from attending classes. In commenting upon this arrangement and its boost to his creativity motivation, Einstein later said,

This gave one freedom in the choice of pursuits until a few months
before the examination, a freedom which I enjoyed to a great extent and
have gladly taken into the bargain the bad conscience connected with it
as by far the lesser evil. It is, in fact, nothing short of a miracle
that the modern methods of instruction have not yet entirely strangled
the holy curiosity of inquiry; for this delicate little plant, aside
from stimulation, stands mainly in need of freedom; without this it
goes to wreck and ruin without fail. (p. 17)

Einstein’s introspections and speculations about scientific inquiry are
an elegant expression of the thesis to be advanced here: An
intrinsically motivated state is conducive to creativity, whereas an
extrinsically motivated state is detrimental. That is, if individuals
engage in some activity primarily for its own sake, they will be most
likely to produce creative work. If, however, they are led to engage in
that activity as a means to achieve some salient extrinsic goal, their
creative performance will be undermined.

The present conceptualization of creativity
proposes that intrinsically motivated indi- 
viduals will be deeply involved in the activity 
at hand because they will be free of extrane- 
ous and irrelevant concerns, concerns about 
goals extrinsic to the activity itself. They will 
be playful with ideas and materials because 
of their freedom to take risks, to explore new 
cognitive pathways, to engage in behaviors 
that might not be directly pertinent to at- 
taining a “solution.” Since they undertook 
the activity primarily for the enjoyment of 
engaging in it, they will see the activity as 
more like play than like work. Extrinsically 
motivated individuals, on the other hand, 
will be, at some level, concerned with the ex- 
trinsic goal to be attained and will thus not 
be as deeply involved in the activity. In ad- 
dition, they will feel less free to engage in 
risk taking and will therefore rely more upon 
well-worn cognitive pathways. Finally, since 
they undertook the activity primarily for 
some reason extraneous to the activity itself, 
they will see it as more like work than like 
play. 

Most recent intrinsic motivation research 
has been concerned with the “overjustifica- 
tion” hypothesis (Lepper, Greene, & Nisbett, 
1973), derived from the attribution theories 
of Bem (1967, 1972), Kelley (1967, 1973), 
and deCharms (1968). This hypothesis states 
that if a person undertakes an interesting 
task under conditions that make salient to 
him the instrumentality of his behavior as a 
means to some extrinsic end, then that person 
will show less intrinsic interest in that ac- 
tivity later, when external constraints are ab- 
sent, than a person who did not act under 
salient external constraints. Using the con- 
straint-free measure of subsequent intrinsic 
motivation, results from a number of studies 
have supported this hypothesis (Amabile, DeJong,
& Lepper, 1976; Condry, 1977; Deci,
1971; Kruglanski, 1975; Lepper & Greene, 
1975; Lepper et al., 1973). 

Recently, several theorists have begun to
speculate about the effects of extrinsic con- 
straint on immediate performance. In the 
formulation that is most relevant to the pres- 
ent research, McGraw (1978) has proposed 
a distinction between two different types of 
activities in terms of the differential effects 
that extrinsic constraint might have upon
performance of those activities. McGraw describes
tasks having algorithmic solutions as
those for which the path to the solution is
clear and straightforward; performance on
these tasks should be enhanced by increases
in extrinsic motivation. By contrast, creativity
tasks require heuristic solutions, where
it is difficult to immediately determine which
operations would be relevant to a solution.
Thus, creative performance should be adversely
affected by increases in extrinsic motivation.


Direct Empirical Evidence 


There is a handful of studies directly testing the hypothesis that
decrements in creativity will accompany the imposition of extrinsic
constraints. Of these, one was an overjustification study that included
a measure of creativity in assessing the effects of extrinsic
constraints (Kruglanski, Friedman, & Zeevi, 1971). In this experiment,
one group of students was promised a reward for participation; for a
second group, reward was not mentioned. Rewarded subjects produced less
creative responses (as judged by two independent raters) during the
experiment than did nonrewarded subjects; in addition, nonrewarded
subjects later expressed greater enjoyment of the experiment.

Other studies provide further support for the proposed relationship
between creativity and intrinsic motivation. In two experiments in
which children made drawings under either reward expectation or no
reward expectation (Greene & Lepper, 1974; Lepper et al., 1973), there
was a tendency for rewarded children to produce more drawings, but of
poorer quality (as judged by teachers), than nonrewarded children. A
study of the effects of rewards on problem-solving performance (McGraw
& McCullers, in press) found that rewarded subjects took significantly
longer to break set in solving a Luchins water jar problem than did
nonrewarded subjects. And, in an elementary school setting (White &
Owen, 1970), the creative performance of boys in a self-evaluation
group was significantly higher than that of boys in peer-evaluation or
teacher-evaluation groups.




While this research seems to support the 
intrinsic motivation view of creativity, studies 
within the behavior modification or token- 
economy traditions have obtained results 
that appear to contradict it. In one such study 
(Glover & Gary, 1976), children worked on 
a standard verbal creativity task in which 
points were awarded for fluency (number of 
different responses), flexibility (number of 
verb forms), elaboration (number of words 
per response), or originality (statistical in- 
frequency of verb forms). Consistent with the 
experimental hypotheses, all four aspects of
creativity were “demonstrated to be under
experimental control” (p. 79); when fluency
was rewarded, the children were fluent; when
originality was rewarded, they were original;
and so on. Under extinction, each aspect fell
to baseline or below. Other studies of operant
techniques, using both intersubject and be- por eour
designs, have demonstrated functional control
over creative performance (Halpin & Halpin,
1973; Johnson, 1974; Raina, 1968).

Thus, a brief review of evidence on the effects of extrinsic
constraint on creativity reveals a contradiction in results.
Overjustification studies have generally shown decrements in creativity
under reward conditions, whereas behavior modification studies have
shown increments under such conditions. It will be argued that the
results of the present study and a deeper analysis of the two paradigms
point to a possible resolution of this contradiction.


The Present Study 


A close examination of the procedures used in the overjustification
and behavior modification studies suggests that the key to the
seemingly contradictory results may lie in the type of instructions
used. As suggested by the theoretical analysis of McGraw (1978),
perhaps subjects will show decrements in creativity if extrinsic
constraints are imposed, unless they are told specifically how to
perform creatively. In order to examine this hypothesis, the present
study attempted, within one experimental design, to show both
decrements in creativity by an overjustification procedure and
increments in creativity by a behavior modification
procedure. In the former, a constraint was im- 
posed upon subjects’ engagement in an art 
activity, and they were given no specific per- 
formance instructions; in the latter, the same 
constraint was imposed upon subjects, and 
they were told specifically how to perform 
creatively. In accord with the intrinsic mo- 
tivation model of creativity, it was expected 
that the simple overjustification groups—the 
groups working under extrinsic constraint, 
without explicit instructions on how to per- 
form creatively—would show decrements, rel- 
ative to control, in both creativity and 
intrinsic interest. And, as required by the theo- 
retical argument developed here, it was ex- 
pected that, although the specific-instructions 
group might show an increment in creativity 
compared to its control, it should not show a 
corresponding increment in intrinsic motiva- 
tion. Thus, the major purpose of this study 
was to attempt a reconciliation of seemingly 
contradictory results by identifying those in- 
structional sets under which extrinsic con- 
straint might undermine creativity and those 
under which it might enhance creativity. 

The specific extrinsic constraint employed 
in this study was the expectation of evalua- 
tion. Clearly, this constitutes an externally 
imposed constraint that is extrinsic to the 
activity itself; the external evaluation of an 
art product is not intrinsic to the activity of 


art itself. The use of this particular constraint is desirable for two
reasons. First, expectation of evaluation has seldom, if ever, been
used in previous overjustification research. As noted earlier, the
extrinsic constraint usually employed is the promise of reward. Second,
and most important, the external evaluation of products that may
potentially be creative is so commonly employed in education and other
settings as to be accepted as a fact of life. Thus, a demonstration of
detrimental effects of evaluation could have significant practical
implications.

Method 


Design Overview 


The experimental design followed the usual overjustification paradigm:
Subjects were either told or not told that their artworks
would be evaluated. Additionally, some subjects
within each level of evaluation expectation were 
asked to focus on the creativity of their artworks; 
some were asked to focus on the technical aspects; 
others were not given any particular focus. Finally, 
two evaluation-expectation groups received specific 
instructions for performance: One was told how to 
make a technically good artwork, and one was told 
how to make a creative artwork. 


Subjects 


Subjects were 95 women enrolled in the introductory
psychology course at Stanford University.1
They signed up for an experiment entitled “Effects
of Various Activities on Mood” in partial fulfillment
of a course requirement. The experiment was run by
one female experimenter.


Experimental Task 


Most of the more widely used tests of creativity 
are amenable to showing individual differences that 
depend on certain specialized skills, such as verbal 
facility and flexibility or drawing ability. Since the 
present study attempted to show differences between 
groups of subjects on the basis of environmental 
manipulations, it was desirable to find a method of 
assessing creativity that would minimize individual 
differences in performance. For the purposes of the 
present investigation, then, a simple subjective 
method of assessing creativity was developed, a 
method that does not depend on specialized skills 
and yet permits reliable judgments of creativity. 
In addition, it was desirable, for the purposes of
differentiating between creativity and other performance
effects, that judgments of creativity be
separable from judgments of “goodness” and liking
for the products.

A study investigating the adequacy of an art ac- 
tivity for this task was conducted as a pretest. Using 
colored paper and glue, subjects made collages that
on several artistic dimen- 


of these factors. Thus, the art activity
fits the criteria proposed: It requires skills
no more specialized than manipulating paper and
glue; it permits reliable subjective judgments of
creativity and other artistic dimensions; and
judgments of technical competence and aesthetic
appeal can be clearly separated from judgments of
creativity.

Procedure


After asking the subject to complete a consent 
form, the experimenter explained that this study was 
actually a pretest for an experiment to be done the 
following quarter. This alleged pretest was being 
run in order to identify activities that would affect 
most individuals’ moods in certain, predictable ways.
The subject was told that she would randomly
choose one of five different activities, and after engaging
in that activity, she would complete a mood
questionnaire.

At this point, the experimenter presented five numbered cards to the
subject, turned them face down, shuffled them, and asked the subject to
choose one. After she had done so, the experimenter consulted a small
“activity card” and announced, “Number
the art activity.” The experimenter then presented the subject with her
materials: one white cardboard, 15 × 20 inches (.38 × .51 m); one small
plastic squeeze-bottle of glue; and one large round aluminum tray
containing 154 colored pieces of lightweight paper in various shapes
and sizes. After presenting the materials, the experimenter asked the
subject if she was an artist or had done much collage work. If the
subject answered in the affirmative, the session was terminated. (Only
three subjects were eliminated on this basis, and they were replaced.)
If not, the experimenter went on to introduce the subject’s task. In so
doing, she stressed that the subject had complete freedom in using the
materials to form a design, but that only the materials provided should
be used. In addition, the subject was asked to make a design that
conveyed a feeling of silliness, “as when a child is acting and feeling
silly.” This last instruction was given to insure that all subjects
would employ the same theme in their designs, in an effort to reduce
extraneous sources of variability.

Table 1

                                   Instruction
                                    Specific                   Specific
Evaluation    No        Technical   technical    Creativity    creativity
expectation   focus     focus       focus        focus         focus

Absent        C1        C2                       C3
Present       E1        E2          E3           E4            E5

Note. The letter in each box indicates whether a condition is control or experimental, and the subscripts

1 Males were not used, in an effort to reduce extraneous sources of
variability. Pretesting had shown that on this task, females produce
artworks that are judged as significantly more creative than those
produced by males. The number of subjects was approximately equal in
each of the eight conditions: nonevaluation–no focus = 13,
nonevaluation–technical focus = 12, nonevaluation–creativity focus =
11, evaluation–technical focus = 11, evaluation–creativity focus = 11,
evaluation–specific technical focus = 12, evaluation–specific
creativity focus

Following this general introduction, the instructions diverged to
produce the eight treatment conditions. (Table 1 summarizes the
experimental design.) Assignment of subjects to conditions had been
predetermined by a randomized schedule, and up to this point, the
experimenter was unaware of the subject’s treatment condition. Now, the
experimenter turned to the “manipulation” page in the packet and read
the crucial instructions. Subjects in the control conditions
(nonevaluation–no focus, nonevaluation–technical focus, and
nonevaluation–creativity focus) were told:


There is one more important point that I should 
make before you begin. We won’t be using your 
design as a source of data. We are not interested 
at all in the activity itself or what you do with 
the activity. We are only interested in the mood 
you report on the questionnaire. So we do not 
care about the design itself at all—its only purpose
is to provide you with this experience so we can
see how it affects your mood.


In addition, subjects in the nonevaluation–technical focus condition
were asked to concentrate on the technical aspects of the activity “for
this particular mood induction,” and subjects in the
nonevaluation–creativity focus condition were asked to focus on the
creative aspects.

The basic instructions for the five experimental groups (evaluation–no
focus, evaluation–technical focus, evaluation–creativity focus,
evaluation–specific technical focus, evaluation–specific creativity
focus) were:


There is one more important point that I should
make before you begin. In addition to your
questionnaire, we will be looking at your finished
design as an important source of data. We have
five graduate artists from the Stanford Art Department
working with us, and when this experiment
is over, we will have them come in to judge
each artwork. They will make a detailed evaluation
of your design, noting the good points and
criticizing the weaknesses. And since we know
that our subjects are interested in how they were
evaluated, we will send you a copy of each judge’s
evaluation of your design in about two weeks.

In addition, subjects in the evaluation–technical focus condition were
told that the judges would base their evaluation on how technically
good the designs were; subjects in the evaluation–creativity focus
condition were told that the judges would base their evaluations on how
creative the designs were. Those in the evaluation–specific technical
focus condition were told that the judges would make their technical
evaluation on the basis of these six elements: (a) the neatness of the
design, (b) the balance of the design, (c) the amount of planning
evident, (d) the level of organization in the design, (e) the presence
of actual recognizable figures or objects in the design, and (f) the
degree to which the design expresses something to them. Subjects in the
evaluation–specific creativity focus condition were told that the
judges would base their creativity evaluation on these seven elements:
(a) the novelty of the idea, (b) the novelty shown in use of the
materials, (c) the amount of variation in the shapes used, (d) how
asymmetrical the design is, (e) the amount of detail in the design, (f)
the complexity of the design, and (g) the amount of effort evident.
These components were the ones that had, in fact, clustered closely
with pretest judges’ ratings of technical goodness and creativity,
respectively.

After asking the subject if she had any questions, 
the experimenter looked at the digital clock that had 
been placed in the room and said, “Okay, it’s (time) 
right now; I’ll be back in 15 minutes.” She then left
the room. 

When the 15 minutes had elapsed, the experimenter 
reentered the experimental room, asked the subject 
to stop working, and—in keeping with the cover 
story—presented her with a “Mood Questionnaire.” 
This questionnaire contained 15 mood adjectives and 
asked the subject to rate herself on each. This was 
the first of two questionnaires presented to the sub- 
ject. The second, an “Art Activity Questionnaire,” 
included a number of items designed to assess the 
subject’s interest in and attitude toward the art ac- 
tivity. After completing this questionnaire, the sub- 
ject was asked a series of questions designed to probe 
suspicions about the experimental situation, the tasks, 
or the instructions.2 The experimenter then fully 
explained the purposes and hypotheses of the study,
noting the subject’s reactions and comments. No
subject appeared distressed by the deception that had
been employed, and all expressed interest in the
hypotheses and outcomes of the study.

2 Fifteen subjects, distributed evenly throughout
the eight conditions, expressed suspicions about being
watched during a 10-minute free-time period that
elapsed between administration of the two questionnaires.
Since the behavioral measure obtained
during that period proved not to be useful, these
subjects were retained, and their data appear in all
analyses reported here.


Dependent Measures 


Creativity. The creativity of the designs was the 
dependent measure of primary interest in this study. 
Creativity and several other artistic dimensions (in- 
cluding technical goodness) were assessed according 
to the judging procedure developed during pretesting. 
Fifteen artists, nine males and six females, served as 
judges of the designs made by subjects in this study. 
Each judge had had at least 5 years of experience 
doing studio art (painting, drawing, or design); most
were graduate students enrolled in the Stanford Art 
Department. These artists were paid for their par- 
ticipation. 

The judges worked in individual sessions, rating 
the 95 designs, which had been affixed to the walls of 
a lab room. A different random ordering of the de- 
signs was used for each judge. Before beginning a 
judging session, the experimenter gave the judge a 
brief summary of the general instructions the sub- 
jects had received, without mentioning the crucial 
evaluation/nonevaluation instructions or the par- 
ticular instructions focus that some subjects had 
received. The judge was shown a sample of the 
materials that each subject had worked with, as well 
as a sample of the continuous scale (with 5 refer- 
ence points marked) on which all ratings would 
be made. 

The judge was told that he or she would judge the 
95 designs (always in the same order for a particular 
judge) on each of 16 different artistic dimensions: 
(a) expression of meaning, (b) degree of representa- 
tionalism, (c) silliness, (d) detail, (e) degree of 
symmetry, (f) planning evident, (g) novelty of the 
idea, (h) balance, (i) novelty in use of materials, 
(j) variation of shapes, (k) effort evident, (l) complexity,
(m) neatness, (n) overall organization, (o)
creativity, and (p) technical goodness. The judge
was left alone to do each rating task, which con- 
sisted of rating all 95 designs on one of the 16 
artistic dimensions. Before rating the designs on any
given dimension, the judge was given a brief written
definition of that dimension.3 A different random
ordering of the 16 dimensions to be judged was used
for each of the 15 judges.

In essence, the rating of the designs can be viewed 
as a discrimination task for the judges. It was ex- 
pected that on many dimensions of judgment, the 
judges would, on the average, see differences between 
the artworks made by the nonevaluation groups and 
the artworks made by the evaluation groups. Could 
the judges, for example, see the artworks of the 
nonevaluation—no focus group as more creative 
than those of the evaluation—no focus group? 
Viewed in this way, the experimental design em- 
ployed here was a repeated measures design, with 
judges as “subjects”; for any one artistic dimension, 
each judge provided data on each of the eight
experimental conditions.

Other measures. As noted earlier, a questionnaire
designed to assess interest in the art activity was 
administered to subjects at the end of the experimen- 
tal session. This questionnaire contained items that, 
in previous research (Amabile et al., 1976), had 
served as indicators of intrinsic interest. In addition, 
an attempt was made to obtain a behavioral mea- 
sure of interest in the activity. Subjects were left 
alone in the experimental room for a period of 10 
minutes between administration of the Mood Ques- 
tionnaire and administration of the Art Activity 
Questionnaire. Some collage-making materials and 
pieces from a structure-building game had been left
on another table in the room, and the amount of 
time subjects spent playing with these activities dur- 
ing this free time was observed from behind a one- 
way mirror. It was expected that this behavioral 
measure would provide results analogous to those 
obtained on the intrinsic interest questionnaire. The 
behavioral measure proved not to be useful, however, 
since few subjects in any condition played with 
either set of materials during the free-time session.


Results 


Preliminary Analyses


Reliability. As a first step in analysis of the creativity results, it
was essential to determine that the subjective ratings made by
artist-judges were reliable at an acceptable level. Thus, a
Spearman–Brown interjudge reliability was calculated for each dimension
of judgment (see Nunnally, 1967, p. 233). In general, the reliabilities
calculated in this manner were quite high: 12 of the 16 dimension
reliabilities were above .80, and the median reliability was .84. Of
particular interest is the reliability of the major dependent measure
for this study, creativity; this value was quite satisfactory at .79.
Only one dimension of judgment, balance, failed to reach an acceptable
reliability level; this reliability, at .48, was so much lower than the
others that judgments of balance were excluded from further analyses.
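
The text names the Spearman–Brown procedure but does not print the formula. As an illustration only, the sketch below assumes the common stepped-up form, in which the reliability of the pooled ratings of k judges is k·r̄ / (1 + (k − 1)·r̄), with r̄ the mean correlation between pairs of judges. The function names and the data layout (one list of ratings per judge, designs in a fixed order) are hypothetical and are not taken from the study.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length rating lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def step_up(r_bar, k):
    """Spearman-Brown stepped-up reliability for k judges (assumed form)."""
    return k * r_bar / (1 + (k - 1) * r_bar)


def interjudge_reliability(ratings):
    """ratings: one list per judge, each rating the same designs in the same order."""
    r_bar = mean(pearson(a, b) for a, b in combinations(ratings, 2))
    return step_up(r_bar, len(ratings))
```

Under this assumed form, a modest mean interjudge correlation can still yield a high pooled reliability when many judges are used, which is one reason a panel of 15 judges can produce usable composite ratings.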

These generally high reliabilities take on special significance, given
the nature of the judgment task employed in this study. Judges were
asked to perform a particularly arduous task: to work alone in a small
room for about 3 hours, viewing 95 designs affixed to the walls and
making 1,520 separate judgments, spending barely 5 sec on each one.
Given these circumstances, and given particularly that the judges were
not trained in any way to agree with one another on this task, their
high level of agreement is all the more encouraging.

3 These definitions were all nondirective. For example, the creativity
“definition” was, “Using your own, subjective definition of creativity,
the degree to which the design is creative.”


Factor analysis. A factor analysis (varimax rotation) was performed on
the dimensions of judgment to determine if the eight “creativity”
dimensions and the seven “technical” dimensions did in fact cluster
together.4 The results of this analysis suggest that the dimensions did
cluster in this fashion. First, there were clearly two orthogonal
factors. Moreover, these were the two primary factors identified by the
analysis, and they do indeed seem to be a creativity factor and a
technical factor. Seven of the eight original creativity dimensions
(creativity, novelty of material use, novelty of idea, effort evident,
variation of shapes, detail, and complexity) load high and positive on
the creativity factor; all of these loadings are +.80 or higher on that
factor and .30 or lower on the technical factor. In addition, five of
the seven original technical dimensions (organization, neatness,
planning evident, balance, and expression of meaning) load high and
positive on the technical goodness factor; all of these loadings are
+.45 or higher on that factor and +.16 or lower on the creativity
factor.

The results of this analysis differed from those of the pretest in
three minor respects. Two component dimensions that had previously
loaded high on their respective factors (symmetry on the creativity
factor and representationalism on the technical factor) loaded
sufficiently low on those factors here to be excluded from further
analysis. In addition, ratings of “technical goodness” as a single
dimension did not, in fact, load higher on the technical factor than on
the creativity factor.5 Thus, the composite technical goodness measure
used in subsequent analyses did not include the technical goodness
dimension of judgment.

Creativity


It was expected that with the exception of 
the specific creativity instructions group, the 
artworks produced by subjects who expected 
evaluation would be judged lower on crea- 
tivity than the artworks produced by subjects 
who did not expect evaluation. Judge ratings 
on creativity and the six creativity component 
dimensions strongly support this hypothesis. 

A composite creativity measure was formed 
by combining the normalized rating for each 
of the creativity dimensions: novelty of ma- 
terial use, novelty of idea, effort evident, 
variation of shapes, detail, and complexity.6
Means for this composite measure are pre- 
sented in Table 2. 
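
The composite described here (and spelled out in Footnote 6) can be sketched in code. The sketch is illustrative only and rests on assumptions that the text does not confirm: "normalizing" is read as converting all judgments on a component to z scores, and the data layout, function names, and condition labels are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Convert raw ratings to z scores (assumed meaning of 'normalized')."""
    m, s = mean(values), pstdev(values)  # fails if all ratings are identical
    return [(v - m) / s for v in values]


def composite_scores(ratings, conditions):
    """Return {judge: {condition: composite}}.

    ratings[component][judge] is a list of raw ratings, one per design;
    conditions is the list of condition labels, one per design.
    """
    n = len(conditions)
    labels = sorted(set(conditions))
    per_component = []
    for by_judge in ratings.values():
        judges = list(by_judge)
        # Normalize all judgments on this component together.
        z = zscores([r for j in judges for r in by_judge[j]])
        cells = {}
        for i, j in enumerate(judges):
            zj = z[i * n:(i + 1) * n]
            # Average the normalized ratings within each condition for this judge.
            cells[j] = {c: mean(v for v, cc in zip(zj, conditions) if cc == c)
                        for c in labels}
        per_component.append(cells)
    # Simple average over components gives one number per judge per condition.
    return {j: {c: mean(pc[j][c] for pc in per_component) for c in labels}
            for j in per_component[0]}
```

Because every judge contributes one number per condition, the resulting table can be analyzed as a repeated measures design with judges in the role of subjects.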

Table 2
Mean Judge Ratings of Creativity

                                   Instruction
                                    Specific                   Specific
Evaluation    No        Technical   technical    Creativity    creativity
expectation   focus     focus       focus        focus         focus

Absent         .356      .181                     .160
Present       -.499     -.056       -.466        -.472          .177

Note. These numbers are the means of composites of six normalized components of creativity that clustered
on the factor analysis of artist judgments.

4 The factor analysis was done on the average ratings
(over all judges) for each of the 95 designs on
each of the 16 dimensions.

5 This result is not surprising, since, in this study,
artists’ judgments of technical goodness correlated
.68 with their creativity judgments. However, it
clearly implies that while the technical goodness
cluster of judgments on the factor analysis does capture
the technical aspect of the artworks, the single
technical goodness dimension does not. It appears
that, unlike the artist-judges in the pretest study,
the judges in the present study did not assess simply
the technical goodness of the designs with their ratings
on this one dimension, but an overall Gestalt
of “goodness,” which included aspects of creativity.

6 Since the experimental design was treated as a
repeated measures design for the creativity and technical
goodness measures, with judges as subjects,
each subjective measure was analyzed by first obtaining
one number for each judge on each of the
eight experimental conditions (an average for each
condition) and then carrying out repeated measures
analyses of variance, dependent t tests, and any additional
analyses on these. Thus, for example, the
creativity composite scores were formed by first
normalizing all judgments on each of the six creativity
components and then averaging the normalized
ratings for each judge on each component for each
condition. Finally, one number was obtained for
each judge on each condition by forming a composite
of the six components (a simple average).
By using this technique of analysis for subjective
measures, it is possible to examine overall rated differences
between conditions and, at the same time,
to account for differential scale usage by allowing
each judge to act as his or her own control.

An overall analysis of variance for the seven
groups excluding the specific creativity instructions
group (evaluation–specific creativity focus)
was statistically significant,
F(6, 84) = 12.13, p < .001. Thus, a planned
contrast was performed on these seven groups
to test the hypothesis that control groups
(nonevaluation) were judged higher on creativity
than experimental groups (evaluation).
This contrast was clearly significant, F(1, 84)
= 45.81, p < .001.

This pattern is borne out by a series of paired comparisons between
control groups and the relevant experimental groups. As expected, only
when evaluation subjects were given specific instructions on how to
make a creative design did they produce artworks judged as
significantly more creative than those of nonevaluation subjects. The
mean rated creativity for this specific instructions group
(evaluation–specific creativity focus) is significantly higher than
that of the relevant control, t(14) = -3.88, p < .01;7 indeed, this
group is higher than any other on judged creativity. In all other
cases, the nonevaluation groups are significantly higher on judged
creativity than the comparable evaluation groups: For the no focus
groups (nonevaluation–no focus vs. evaluation–no focus), t(14) = 9.44,
p < .001; for the technical focus groups (nonevaluation–technical focus
vs. evaluation–technical focus), t(14) = 2.07, p < .06;
(nonevaluation–technical focus vs. evaluation–specific technical
focus), t(14) = 3.62, p < .01; and for the creativity focus groups
(nonevaluation–creativity focus vs. evaluation–creativity focus),
t(14) = 3.79, p < .01.


Technical Goodness 


The technical goodness of the artworks,
that is, the degree of technical competence
displayed by subjects in their work, was chosen
as a feature that might be compared with crea-
tivity. Means for the composite technical
goodness measure (which included organiza-
tion, neatness, planning, and expression of
meaning) are presented in Table 3. As can
be seen, for the no focus groups (nonevalua-
tion – no focus and evaluation – no focus)
and the creativity focus groups (nonevalua-
tion – creativity focus, evaluation – creativity
focus, and evaluation – specific creativity fo-
cus), the evaluation groups are rated lower
on technical goodness than their nonevalua-
tion controls. Paired comparisons bear out
this observation: The nonevaluation – no fo-
cus group is rated significantly higher than
the evaluation – no focus group, t(14) = 3.16,
p < .01, and the nonevaluation – creativity
focus group is rated higher than both the
evaluation – creativity focus group, t(14) =
6.16, p < .01, and the evaluation – specific
creativity focus group, t(14) = 5.00, p<" 
However, this pattern seems to be reversed
for the technical focus groups; here, the non-
evaluation control (nonevaluation – technical
focus) is rated lowest of the three, although
it is significantly lower than only the specific
technical focus group, t(14) = -2.21, p <
.05.

It appears that at first glance, the pattern
of technical goodness results mimics the pattern


7 All significance tests reported here are two-
tailed.

8 The results from the single creativity [...]
mirror the results reported for the composite mea-
sure in all respects except that the technical focus
evaluation groups (evaluation – technical focus and
evaluation – specific technical focus) were not differ-
ent from their control (nonevaluation – technical
focus) at acceptable levels of significance.


EFFECTS OF EVALUATION ON CREATIVITY 229 


that had been obtained for creativity: With
the exception of the evaluation group
which was told exactly what to do (the spe-
cific instructions group in each case), non-
evaluation groups are rated higher than
evaluation groups. There is, however, one in-
teresting deviation from this parallelism. The
creativity focus group that was told to expect
evaluation, but was not told specifically what
to do to receive a good evaluation (evalua-
tion – creativity focus) was rated very low
on creativity (significantly lower than its
control, nonevaluation – creativity focus). By
contrast, however, the technical focus group
that was told to expect evaluation, but was
not told specifically what to do to receive a
good evaluation (evaluation – technical fo-
cus), was not rated low on technical good-
ness; in fact, it was rated higher (although
not significantly so) than its control group.


Self-Rated Interest


Several items on the Art Activity Question-
naire administered to subjects just prior to
[...] were intended to measure their
[...] toward the art activity. A composite
intrinsic interest measure was formed using
six of these items; all six loaded higher than
[...] on the intrinsic interest factor obtained
in a factor analysis of questionnaire items,
and all correlated significantly with one
another. These six items were (a) “Did you
view your engagement in the art activity as
motivated more by intrinsic factors, like your
own interest, or by extrinsic factors, like the
experimenter’s instructions?” (b) “Was the
art activity more like work or more like lei-
sure activity?” (c) “How playful did you
feel during the activity session?” (d) “How
satisfied were you with your performance on
the art activity?” (e) “How much do you
like your finished design?” and (f) “How
much pressure did you feel during the activity
session?” This last item loaded high and nega-
tive on the intrinsic interest factor, so it was
subtracted from the other five in forming the
composite.

Means for this composite measure are pre-
sented in Table 4. It was expected that over-
all, the control groups (nonevaluation) would
be higher in self-rated interest than the ex-
perimental groups (evaluation). Recall that
on the creativity measure, the specific crea-
tivity instructions group (evaluation – spe-
cific creativity focus) was expected to be an
exception to the general pattern. This excep-
tion was not predicted on the intrinsic interest
measure, however. On the contrary, it was ex-
pected that even though the specific creativ-
ity instructions subjects might exhibit supe-
rior creativity in accord with their task in-
structions, their intrinsic interest would still
be undermined by evaluation expectation.
This overall pattern of results was, in fact,
obtained. An analysis of variance on all eight
groups yielded a significant overall effect,
F(7, 87) = 2.68, p < .025, and a planned con-
trast testing the specific trend of nonevalua-
tion groups being higher on intrinsic interest
than evaluation groups was statistically sig-
nificant, F(1, 87) = 4.08, p < .05.

In comparison with the creativity results,
however, the intrinsic interest results are not
as strong. Indeed, only two experimental-
control paired comparisons are statistically
significant: that for the two no focus groups
(nonevaluation – no focus vs. evaluation – no


Table 3
Mean Judge Ratings of Technical Goodness

                                   Instruction
Evaluation                          Specific                 Specific
expectation   No       Technical    technical   Creativity   creativity
              focus    focus        focus       focus        focus
Absent        254      -.058                    .258
Present       -081     014          231         -1322        -.269

Note. These numbers are the means of composites of four normalized components of technical goodness that
clustered on the factor analysis of artist judgments.
Table 4
Mean Self-Ratings of Intrinsic Interest

                                   Instruction
Evaluation                          Specific                 Specific
expectation   No       Technical    technical   Creativity   creativity
              focus    focus        focus       focus        focus
Absent        Sil      294                      -.059
Present       -.184    222          -.568       .207         -.158

Note. These numbers are the means of composites of six normalized measures of intrinsic interest that
clustered on the factor analysis of questionnaire items.


focus), t(22) = 2.07, p < .05, and that for
the specific technical focus group and its con-
trol (nonevaluation – technical focus vs. eval-
uation – specific technical focus), t(22) =
2.77, p < .02.

Despite the failure of specific comparisons, 
however, the overall planned contrast sug- 
gests that it is reasonable to assert that in- 
trinsic interest was undermined by evaluation 
expectation in this study. This result is par- 
ticularly important when the two specific in- 
structions groups are considered: Although 
the technical instructions group (evaluation –
specific technical focus) was very high on
rated technical goodness, and the creativity
instructions group (evaluation – specific crea-
tivity focus) was very high on creativity,
both of these groups were quite low on in-
trinsic interest. In other words, as predicted,
the specific creativity instructions group
(evaluation – specific creativity focus) did
not exhibit a high level of intrinsic interest 
to match its high level of creativity. 


Discussion 


[...] earlier, only
the specific creativity instructions evaluation
group (evaluation – specific creativity focus)
was higher than its nonevaluation control on
rated creativity. Thus, the present results sug-
gest a reconciliation between the seemingly
contradictory findings on the effect of con-
straint on creativity provided by the over-
justification and behavior modification lit-
eratures.


Reinterpretation of Previous Findings 


The behavior modification studies appear
to demonstrate that creativity can be in-
creased by the offer of rewards for creative
performance. This conclusion clearly con-
tradicts the intrinsic motivation view of crea-
tivity proposed here: that the imposition of
extrinsic constraints (such as rewards or ex-
ternal evaluation) can lead to decrements in
creativity. The key to the reconciliation lies
in the nature of the task and the nature of the
instructions given. In accord with the hypo-
thesis, if a creative performance depends on
some degree of risk taking and set breaking
(some level of production beyond the obvious
and commonplace), the imposition of salient
extrinsic constraints, establishing an extrinsic
motivation, will result in lower levels of crea-
tivity. In order to evoke a spontaneously crea-
tive performance, a task must have some de-
gree of ambiguity, that is, a nonobvious [...]
or method of approach. In McGraw’s (1978)
terms, it must require a heuristic rather than
an algorithmic solution. This was clearly not
the case in the behavior modification studies
cited earlier, and it was not the case in the
specific creativity instructions condition of
the present study.


The behavior modification studies, for the
most part, used verbal tests of creativity. The
instructions given to children under reward
conditions in those studies effectively elimi-
nated any ambiguity about what constituted
a good (“creative”) performance; if the ex-
perimenters wanted to demonstrate that “flu-
ency” was under experimental control, they
told the children that they would be rewarded
for producing a large number of ideas. Not at
all remarkably, children produced large num-
bers of ideas under these conditions. And, ap-
parently, the instructions given to the specific
creativity instructions group in the present
study also succeeded in reducing the am-
biguity in the task; subjects were told to
come up with a novel idea, to use the ma-
terials in a novel way, to make a detailed
and complex design, and so on. Since, as was
evident from pretest results, judges considered
complex, detailed designs with novel ideas
[...] of materials to be creative, this
group achieved very high creativity scores.
What is crucial here is that, within the same
[...] design, when subjects were given
evaluation instructions but not told specifi-
cally what to do, their creativity was dramati-
cally [...]. And, when subjects did not expect
evaluation, their creativity remained high,
no matter what they were asked to focus on.

The present thesis proposes motivational
[...] as a mediator of the observed perform-
ance [...]. The results obtained on subjects’
[...], while generally support-
ing this proposition, are not as strong as
might have been desired. An attempt at ob-
taining a behavioral measure of intrinsic in-
terest [...] because of the generally low
level of [...] emitted by subjects
during the [...] period. The self-rating
questionnaire [...] on intrinsic interest did show
[...] interest differences consistent with
the hypothesis. However, because some
specific paired comparisons between experimental
and control groups did not yield the pre-
dicted [...] differences, further research
is needed before the observed creativity
decrements can reliably be attributed to decre-
ments in intrinsic motivation.


Practical Implications


In determining the practical implications 
of the present results, it is perhaps most in- 
structive to consider the two no focus groups 
(nonevaluation — no focus and evaluation — no 
focus). These groups permit a test of the 
basic, uncluttered version of the hypothesis: 
Evaluation group subjects will be less crea- 
tive than nonevaluation group subjects and 
will show less intrinsic interest. These two 
groups provide an analogue to conditions that 
exist in many social settings; in classrooms, 
for example, children are often given art 
projects and writing projects that will be 
graded, but the criteria for evaluation are 
left unspecified. Appropriately, these two 
groups provide the clearest, most consistent 
pattern of results obtained in this study: 
The nonevaluation group was significantly 
higher on rated creativity and technical good- 
ness and significantly higher on self-rated in- 
trinsic interest. In fact, with the exception of 
the specific creativity instructions group, the 
no focus nonevaluation group was higher than 
any other on rated creativity; moreover, it 
was the highest of all eight groups on self- 
rated intrinsic interest and second highest 
on rated technical goodness. In all respects, 
then, the nonevaluation group here performed 
better than the evaluation group and felt, 


apparently, more intrinsically motivated. 


The specific creativity instructions group 
(evaluation — specific creativity focus) is of 
particular interest when considering practical 


applications [...]
was higher in rated creativity [...]
that there is no [...]
constraint, if [...]
performance instructions [...]
however, that there are several flaws in this


approach. First, for most activities, it is diffi- 
cult or impossible to specify the behaviors 
that would lead to a creative performance. 
Second, the high creativity of subjects in the 
specific creativity instructions group was not 
accompanied by a high level of technical 
competence; in fact, the rated technical 
competence of this group was significantly 
lower than control. Finally, and perhaps most 
importantly, the high creativity of this group
was not accompanied by a high level of in- 
trinsic interest. 

It is important to note that the present 
analysis and the present study deal with rela- 
tively low, “nonprofessional” levels of creativ- 
ity. While we might expect that some profes- 
sional scientists or artists could succumb to 
the overjustification effect, it seems eminently 
clear that many highly creative people go on 
being creative in the face of numerous ex- 
trinsic constraints. (See Simonton, 1977, for 
a historical look at this question.) There are 
two possible explanations for the seeming 
impotency of the overjustification effect at 
this level. First, it is probably the case that 
these exceptional individuals have largely 
internalized the norms and standards by 
which their work would be judged. Thus, ex- 
ternal evaluation becomes less salient and 
less necessary. Second, highly creative and 
successful individuals almost certainly have a 
very high level of intrinsic interest in their 
work, and they are probably well aware of 
this interest. The overjustification effect is 
proposed to occur only when internal states 
are ambiguous or nonsalient (see Bem, 1972). 

The practical implications of these results 
are perhaps most relevant for educational 
settings. They suggest that if creativity is to 
be fostered and interest maintained, care 
must be taken in the use of control in the 
classroom. Surely, some children would have
no interest in even attempting a task—say, 
an art activity—unless some control were 
exerted over their behavior. But the present 
results suggest that if control is oversufficient, 
undesirable consequences could result. Con-
sider what the state of modern physics might 
be if Einstein had not been clever enough 
early in his career to circumvent the detri- 
mental effects of external evaluation. 
Corrections to Lynch and Cohen
e Expected Utility Theory as an Aid to | 


Helping Behavior,” by John G. Lynch, 


Personality and Social Psychology, Vol. 36, 


rrections should be made: 
140) should read 


—5.4 —2.5 —.4 


2.5 —6 


6.4 


robability levels are listed in ascending 
e higher the level of probability, the 


je slope of the curve to which it properly corresponds. 

‘third sentence in the first full paragraph on page 1146 should read, 
1 X Probability 2, Probability 1 X Utility 1x 
, Probability 1 X Probability 2 x Utility 2, 
2 interactions were all significant, with 


and Utility 1 X Probability 2 
F(8, 26) = 5.22, 9.52, 3.89, 
Menstrual Cycle Affects Kinesthetic Aftereffect,
An Index of Personality and Perceptual Style

Research suggests that kinesthetic aftereffect (KAE) scores reflect status on a
postulated stimulus intensity modulation (SIM) mechanism that damps down 
subjective stimulus intensity for some (reducing) and increases it for others 
(augmenting). Such a mechanism would help account for empirically observed 
individual differences in such behaviors as pain tolerance, sensory deprivation 
reactivity, and stimulation seeking. It was hypothesized and confirmed in three 
adult female samples that KAE varies curvilinearly over the menstrual cycle: 
Greater KAE reduction occurs at the cycle’s beginning and end. Neither tired- 
ness, oral contraception, medication, attention, nor social expectations can ex- 
plain this finding. Of the behaviors studied in the KAE literature, only five are 
also encompassed by the menstrual cycle literature. Four of these (antisocial 
behavior, acute schizophrenic episodes, accidents, and activity level) show sim- 
ilar curvilinearity over the cycle. We hypothesize that cyclical variation in the 
SIM mechanism mediates the curvilinear pattern observed for both these four 


behaviors and KAE. 


This study explores the relationship be- 
tween menstrual cycle and kinesthetic after- 
effect (KAE). Prior findings support the hy- 
pothesis that KAE scores reflect individual 
differences in a postulated mechanism that 
modulates the subjective intensity of incom- 
ing stimulation. If KAE scores vary over the 
menstrual cycle, then the stimulus intensity 
modulation (SIM) mechanism may similarly 
vary over the cycle, providing a useful frame-
work for understanding certain reported be-
havioral variations across the cycle.


Kinesthetic Aftereffect as an Index of 
Stimulus Intensity Modulation 


Several researchers (e.g., Baker, Mishara,
Kostin, & Parker, 1976; Mishara & Baker,
1974; Petrie, 1967; Silverman, 1967) have
hypothesized that the central nervous system
functions as if it had a modulator mechanism
which differs from person to person. This
mechanism acts like a volume control, modu-
lating the subjective intensity of incoming
stimulation. For people with the volume con-
trol “set low” (“reducers”), the subjective
intensity of incoming stimulation is [...]ably
attenuated. Under normal circumstances,
reducers are stimulus deprived and seek [...]
stimulation to compensate. Although they
handle high-intensity stimulation well, they
are very uncomfortable when environmental
stimulation is low, as in sensory deprivation.


In contrast, people with the volume
control “set high” (“augmenters”) are sub-
jectively [...] with stimulation, show
[...] avoidant behaviors, and cope well
with [...] involving minimal intensity.
[...] used to measure this postulated
[...] include average evoked response
[...] varying intensities (e.g., Lukas
[...], 1977; Silverman, Buchsbaum, &
[...], 1969); indexes of Russian “strength
of the nervous system” (Sales, Guydosh, &
[...]; Sales & Throop, 1972); electro[...]
(Depue & Fowles, 1976);
[...], a self-report questionnaire [...]
preferred degree of stimulus intensity
([...], 1974); a self-report index assessing
[...] of behaviors and attitudes (Baker
et al., 1976); and KAE (e.g., Buchsbaum &
Silverman, 1968; Deaux, 1976; Petrie, 1967;
Schooler & Silverman, 1969).
KAE has been the most widely used and
[...] related to three other SIM indexes
(see below). KAE involves changes from
pretest to test trials in width judgments of a
standard block that occur with intervening
[...] of an aftereffect-inducing block dif-
fering in width from the standard block. Two
[...] variants exist. In the one-hand
[...] (Petrie, 1967), the subject rubs the
[...] block with one hand while resting
[...]. There is a two-hand variant (e.g.,
[...] & Silverman, 1968); however, it
[...] less frequently used in KAE/SIM
studies than the one-hand version,
[...] tap the same psychological
[...] (Schooler & Silverman, 1969). We
[...] treat only the one-hand variant
[...]


Where, as in this study, the aftereffect-in-
ducing block for one-hand KAE is larger
than the test block, judged width of the
[...] decreases on the average following
aftereffect induction (e.g., Bakan & Thomp-
son, 1962). Reducers’ width judgments de-
crease the most following aftereffect induc-
tion; augmenters’ decrease the least (or
[...]) (Mishara, Baker, Parker, &
[...]). [...] the SIM hypothesis, reduc-
ing [...] KAE are associated with (a)
greater pain tolerance (Petrie, Collins, & Solo-
mon, 1958; Poser, 1960; Ryan & Foster,
1967; Sweeney, 1966); (b) lesser sensory- 
deprivation endurance (Petrie et al., 1958, 
Study 2); (c) higher “need for stimulation” 
(Mishara & Baker, 1974; Ryan & Foster, 
1967; Sales, 1971, 1972); (d) delinquency 
(Compton, 1967; Petrie, McCulloch, & Kaz- 
din, 1962); (e) more accidents (Petrie, 1967, 
p. 25); (f) schizophrenia (Petrie, Holland, & 
Wolk, 1963); (g) greater activity level (Pet- 
rie, 1967, p. 100; Sales, 1971); and (h) use 
of amphetamines rather than barbiturates 
(Deaux, 1976). Regarding other indexes of
SIM status, KAE reducing scores are as-
sociated with (i) reducing, as assessed by
average evoked response (Schooler, Buchs- 
baum, & Carpenter, 1976) ; (j) higher sensory 
thresholds (as well as with two other indexes 
of a “strong” nervous system) (Sales & 
Throop, 1972); and (k) higher scores on a 
self-report index (Baker et al., 1976). 
Despite these validity findings, critics have 
recently contended that KAE lacked retest 
reliability (e.g., McDonald, 1974; Morgan & 
Hilgard, 1972; Platt, Holtzman, & Larson, 
1971), showed intermittent validity (e.g., 
Sales & Throop, 1972; Weintraub, Green, & 
Herzog, 1973), and thus should not be used 
as a personality index. However, we (Baker
et al., 1976; Baker, Mishara, Parker, &
Kostin, 1978; Mishara & Baker, 1974) and 
others (Bakan & Thompson, 1962) have dem- 
onstrated that systematic individual differ- 
ences in practice effects bias second-session 
KAE scores. Low retest reliability, which is 
caused by such practice effects, can be disre- 
garded. When KAE-personality findings are 
restricted to one-session studies, there is 
clear evidence of validity (Baker et al., 1976). 
Internal consistency, the appropriate reliabil- 
ity statistic for a one-session task, is high in 
10 samples (Mishara & Baker, 1978). We 
conclude that the KAE-SIM formulation 
shows good construct validity when a one- 
session design is used. 


Kinesthetic Aftereffect and the Menstrual
Cycle


SIM is presumed to reflect not only stable 
individual differences (trait) but also tran-
sient organismic fluctuations (state) (Petrie,
1960; Petrie et al., 1963). If KAE is a sensi- 
tive index of current modulator status, it 
should reflect both between-people differences 
in trait, as the validity findings reviewed 
above indicate, and transient changes in state.
The latter possibility is supported by Gupta’s 
(1974) finding that state-altering variables 
such as Dexedrine and phenobarbital signifi- 
cantly affect KAE performance. 

The present KAE-menstrual cycle hypoth- 
esis is based on two interrelated assumptions: 
(a) Strong stimulation (whether external or
internal) is presumed to lead to a shift to-
ward reducing on the part of the SIM mech-
anism. This shift results in a damping down
of the subjective intensity of incoming stimu- 
lation (cf. Petrie, 1967, pp. 51-60). The 
KAE scores should reflect this shift toward 
reducing. (b) One important form of internal 
stimulation, pain (e.g., lower abdominal
cramps, tender breasts) is presumed to vary
curvilinearly over the menstrual cycle, with
maximal pain occurring at the cycle’s be-
ginning and end (Dalton, 1964; Moos et al.,
1969).2 If these two assumptions are correct,
then (a) there should be a shift toward SIM
reduction at the menstrual cycle’s beginning
and end, when pain is maximal, and (b) KAE
scores should index this shift in SIM. Where,
as here, the aftereffect-inducing block is
larger than the test block, we hypothesize
greater KAE reduction (larger aftereffect)
at the beginning and end of the menstrual
cycle; in other words, we hypothesize a
curvilinear relationship between KAE and
locus in the menstrual cycle.3


Relationship of This Study to the Menstrual
Cycle Literature


Very few of the reported menstrual cycle 
findings have withstood recent critical re- 
views (O'Connor, Shelly, & Stern, 1974; Par- 
lee, 1973, 1974; Redgrove, 1971; Sommer, 
1973; Zimmerman & Parlee, 1973). Improper
generalizations from sample to population, 
problems in drawing causal inferences from 
correlational data, inadequate statistical treat- 
ment, and failures in replicating earlier find- 
ings are some central problems raised by the
critics.

The reviews generally agree that a few 
consistent, replicable findings of periodicity
have been reported regarding (a) self-re-
ported symptoms such as mood changes (Cop-
pen & Kessel, 1963; Moos et al., 1969), (b)
incidence of criminal behavior (Dalton, 1961;
Morton, Additon, Addison, Hunt, & Sullivan,
1953), and (c) incidence of psychiatric crises
(Dalton, 1959; Glass, Heninger, Lansky, &
Talan, 1971; Jacobs & Charles, 1970). As
Parlee (1974) and Sommer (1973) note, few
replicated findings exist involving objectively
measured nonverbal behaviors such as per-
formance on cognitive tasks, reaction time,
and athletic performance. Parlee (1974) and
Sommer (1973) conclude that (a) most of
the replicable findings are in behavioral areas
that can be influenced by social-cultural ex-
pectations concerning how women “should”
behave or feel at different phases of the cycle
and that (b) few, if any, consistent and sig-
nificant findings exist in areas where no clear
cultural expectations exist. They hold, in ef-
fect, that most psychological and behavioral
changes reported do not reflect phenomena
intrinsically associated with the cycle (such
as hormonal or other physiological changes)
but rather are “artifactual,” arising from
social-cultural expectations.

Recently, a few carefully done studies have 
focused on objective, nonverbal behaviors


1 This assumption is supported by findings that
(a) high-intensity auditory stimulation by [...]
increases pain tolerance (Gardner & [...])
and high-intensity auditory stimulation combined
with suggestion also increases pain tolerance (Mel-
zack, Weisz, & Sprague, 1963) and decreases
reported pain (Lavine, Buchsbaum, & [...]),
that (b) pain tolerance also increases with non-
painful somatosensory stimulation (Higgins, [...],
& Schwartz, 1971; Satran & Goldstein, [...]), and
that (c) the strong internal stimulation of [...]
pain raises the pain threshold (decreases pain sensi-
tivity) in undamaged areas of the body (Ha[...]
Mueller, 1950; Mersky & Evans, 1975).

2 Our assumption here is that these changes in
self-report of pain over the menstrual cycle are
primarily due to hormonal and other physiological
changes over the cycle, a view consistent with Dalton
(1964) and Lennane and Lennane (1973), among
others. We are aware that some would [...] (e.g.,
Parlee, 1974) that such findings reflect [...] ex-
pectancy effects, an alternative hypothesis that will
be considered briefly in the discussion section.

3 Petrie (1967, p. 103) mentions, without pre-
senting any supportive data, pilot findings con-
sistent with our prediction regarding premen[...]



unlikely to be mediated by social expecta-
tions. For example, both conditionability to
an aversive stimulus (Asso & Beech, 1975)
and rod-and-frame test performance (Klaiber,
Broverman, Vogel, & Kobayashi, 1974) vary
over the menstrual cycle. More such studies
[...] be done before the significance of these
findings can be determined. Here, we attempt
to relate another objectively measured, non-
verbal behavior to the menstrual cycle.


Method 


Apparatus


Following Petrie (1967), apparatus included a 30-
inch (76.20 cm) long tapered comparison wedge, a
2.5-inch (6.35 cm) wide aftereffect induction block,
and a test block (whose width is specified below
for each sample). A mounted ruler ran the length
of the comparison wedge. Around the ruler were a
[...] of finger guides, allowing the exact location of
the subject’s hand on the wedge to be recorded at
[...] of each trial. The subject was blindfolded
and [...] in front of a table that had the apparatus
[...].

Subjects 


Three independent groups of right-handed women
(ns = [...], [...], and 55), predominantly of college age (Ms
= [...], [...], and 20.9 years, respectively), were
[...]. The samples are reported separately because
of minor procedural differences: Samples 1 and 2
were run concurrently by two female experimenters
[...] different KAE test blocks, 2 inches
(5.08 cm) and 1.5 inches (3.81 cm), respectively.
Sample 3 was run 1 year later by a different female
experimenter with the 1.5-inch (3.81 cm) wide test
block. Three comparison samples of right-handed
males (ns = [...], 10, and 59; mean ages = 19.6, 21.1,
and 21.2 years, respectively) were run concurrently
with the three female samples.


Procedure


Because of differential bias in repeated KAE
[...], as discussed above, the more ideal pro-
cedure of [...] multiple KAE administrations over
the menstrual cycle’s course was contraindicated. An
[...]8-trial KAE procedure was thus given once to each
subject, following Petrie’s (1967) detailed procedures
exactly. The [...] important features of Petrie’s stan-
dard [...] include the following: (a) On
each of [...] width judgments, subjects held the test
block with the thumb and forefinger of the right
hand and [...] its width on the tapered block
[...] forefinger of the left hand. (b)
There were [...] practice trials, four pretest trials,
then three aftereffect-induction periods (90, 90,
and 120 sec, respectively), each followed by
four test judgments (12 test trials in all). (c) Before
KAE, subjects rested their hands for 45 min so that
nothing touched their thumbs and forefingers (cf.
Petrie, 1967, pp. 107-127 for further details).


Scores 


The KAE score was the mean of the 12 test judg- 
ments minus the mean of the four pretest judgments. 
Negative scores represent the usual aftereffect (width 
judgments on test trials smaller than on pretest 
trials). The most negative scores (largest aftereffect) 
indicate reducing; minimally negative—or even posi- 
tive—scores (smallest aftereffect) represent aug- 
menting. 
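
The scoring rule above amounts to a single subtraction, sketched here in Python. The function name and the example widths are hypothetical; only the rule itself (mean of the 12 test judgments minus mean of the four pretest judgments) comes from the text.

```python
from statistics import mean


def kae_score(pretest_judgments, test_judgments):
    """Mean of the 12 test judgments minus mean of the 4 pretest judgments.

    A negative score is the usual aftereffect: the block is judged
    narrower on test trials than on pretest trials.
    """
    if len(pretest_judgments) != 4 or len(test_judgments) != 12:
        raise ValueError("expected 4 pretest and 12 test judgments")
    return mean(test_judgments) - mean(pretest_judgments)


# Hypothetical width judgments (arbitrary units read off the wedge ruler).
pretest = [2.0, 2.0, 2.0, 2.0]
test = [1.5] * 12
score = kae_score(pretest, test)  # negative, i.e., in the reducing direction
```

The sketch returns only the signed difference; the text gives no cutoff separating reducers from augmenters, so none is assumed here.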


Interview 


As part of a large questionnaire battery orally ad-
ministered by the experimenter during the 45-min
rest period, the subject estimated onset date of next 
menses. Based on this estimate, raw menstrual cycle 
score (number of days before onset of menses) was 
calculated. 

In Sample 3, a more refined procedure, occasion-
ally used in recent menstrual cycle studies (e.g.,
Golub, 1976), was adopted: Cycle length (date of
most recent onset and date of onset of next menses)
was estimated at the time of the KAE administra-
tion, and then the actual date of onset of the next
menses was verified by follow-up phone calls.4 This
permitted, for Sample 3 only, conversion of the raw
menstrual cycle score into an adjusted menstrual cycle
score5 based on a standard 29-day cycle.6

Information on possible confounding variables was 
also collected: (a) rating of tiredness now as com- 


4 The validity of subjects’ estimates can be as-
sessed (a) by correlating number of days until next
onset, based on estimated versus confirmed onset 
dates, or (b) by computing the mean absolute dif- 
ference between estimated and confirmed dates. In 
Sample 3 and in a larger follow-up study regarding 
menstrual cycle estimates, the validity correlations 
were high (rs=.92) and the absolute differences 
adequately small (1.0 and 2.1 days for Sample 3 and 
the follow-up sample, respectively). 

5 For example, suppose a particular subject had a 
cycle length of 24 days and that she was admin- 
istered KAE on the 6th day before onset of next 
menses. The day on which she was tested was first 
converted into a percent of her cycle, in this case 
25%, and then the percent was reconverted to days 
based on the standard 29-day cycle, in this case to 
7.25 days. Employment of a standard cycle is a 
more refined procedure that has occasionally been 
used in the menstrual cycle literature (e.g., Moos et 
al., 1969). 
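
The conversion described in this footnote can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic (express the test day as a fraction of the subject's own cycle, then rescale to the standard 29-day cycle); the function and parameter names are hypothetical and do not come from the original study.

```python
def adjusted_cycle_score(days_before_onset, cycle_length, standard_length=29):
    """Rescale a raw 'days before onset' score to a standard-length cycle.

    Hypothetical helper: the names are illustrative, the arithmetic follows
    the worked example in the footnote.
    """
    if cycle_length <= 0:
        raise ValueError("cycle_length must be positive")
    # Step 1: express the test day as a fraction of the subject's own cycle.
    fraction_of_cycle = days_before_onset / cycle_length
    # Step 2: convert that fraction back to days on the standard cycle.
    return fraction_of_cycle * standard_length


# Worked example from the footnote: 24-day cycle, tested 6 days before onset.
print(adjusted_cycle_score(6, 24))  # 7.25
```

For the footnote's subject, 6 of 24 days is 25% of the cycle, and 25% of 29 days is 7.25 days.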

6 The decision to specify the standard cycle as 29
days is based on the mean cycle length found in a
large study (n = 2,542) by Sheldrake and Cormack
(1976) and on the median mean cycle length
determined from a review of 21 other studies (Presser,
1974).
Figure 1. Menstrual cycle affects kinesthetic aftereffect (KAE).
[Recoverable panel label: Sample 3 (N = 55). Horizontal axis: days before onset of menses.]


pared to usually (on a 7-point scale); (b) oral con- 
traceptive usage; and (c) any kind of medication or 
drug usage during the past 12 hours. 

Since the same experimenters conducted the
interviews and administered the KAE task, the
experimenters were not told the purposes and hypothesis of
the study.


Results 


Figure 1 presents the KAE scores for each 
sample as a function of days before onset of 
next menses. To smooth the curves, reported 
day of the cycle was collapsed into 10 equal 
intervals of 3 days each. For Sample 2, no 
subjects fell in two of these intervals, so only 
8 data points are depicted. For Sample 3, the 
last interval consists of 2 days only—Days 
28 and 29 of a standard 29-day cycle (see 
Method section). 
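
As an illustration of the smoothing step just described, the following Python sketch groups cycle days into consecutive 3-day intervals. It is a minimal reading of the text, assuming intervals of Days 1-3, 4-6, and so on; the function name is hypothetical.

```python
def interval_index(day, width=3):
    """Map a 1-based cycle day to a 0-based interval index (Days 1-3 -> 0)."""
    if day < 1:
        raise ValueError("day must be 1 or greater")
    return (day - 1) // width


# Group the days of a standard 29-day cycle into 3-day intervals.
intervals = {}
for day in range(1, 30):
    intervals.setdefault(interval_index(day), []).append(day)

print(len(intervals))  # 10
print(intervals[9])    # [28, 29]
```

On a 29-day cycle this yields 10 intervals, the last of which holds only Days 28 and 29, in line with the description of Sample 3.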


These curves each show irregularity, most
notably a dip near midcycle (an issue treated
briefly in Footnote 11). Despite this
irregularity, the curves generally show the predicted
curvilinear pattern.


Test of the Main Hypothesis 


Correlational analysis. One way to test
for curvilinearity is to ascertain the degree to
which the observed data can be fit to a
curve that looks like an inverted U, with (a)
the most negative scores (largest aftereffect)
at the beginning and end of the cycle, (b) the
least negative scores (smallest aftereffect)
at the middle, and (c) days equidistant from
the midpoint having the same value.

The choice of method to test for curvilinearity
was shaped by the size of the present
samples, the goal of treating menstrual cycle
as a continuous variable (many studies have
broken up the cycle into "phases," the definition
of which differs from study to study), and
the goal of using all information available.
The following procedure met these
requirements: First, the raw menstrual cycle score
(based on subject's estimate) was transformed
into the absolute value of each day's deviation
from midcycle (Day 15).⁷ This procedure
folds one arm of the predicted inverted
U-shaped curve onto the other arm (days with
identical predicted KAE, for example, 1 and
29, have the same value). Since we expected
a curvilinear shape, we then squared these deviations
from Day 15 to produce our transformed
menstrual cycle score. According to the
curvilinear hypothesis, (a) the least negative KAE
scores (augmentation) occur at midcycle,
where the smallest values of the transformed
scale occur (e.g., raw menstrual cycle score
of Day 15 corresponds to a value of 0² on
the transformed scale); and (b) the most
negative KAE scores (reduction) occur at
both ends of the cycle, where the largest
values of the transformed scale occur (for
example, raw scores of 1 and 29 both
correspond to a transformed value of 14²). Thus,
a negative linear relationship should obtain
between KAE and the transformed scores.
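
The transformation described above can be sketched as follows. This is an illustrative Python version, not the original analysis code: the function names are hypothetical, and the KAE values in the demonstration are synthetic, constructed only to show that a perfect inverted U yields a correlation of -1 with the transformed score.

```python
from math import sqrt

MIDCYCLE_DAY = 15  # middle day of the standard 29-day cycle


def transformed_cycle_score(day, midcycle=MIDCYCLE_DAY):
    """Squared absolute deviation of a cycle day from midcycle."""
    return abs(day - midcycle) ** 2


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Folding: Days 1 and 29 map to the same value, Day 15 maps to zero.
print(transformed_cycle_score(1), transformed_cycle_score(15),
      transformed_cycle_score(29))  # 196 0 196

# Synthetic (not observed) KAE scores lying exactly on an inverted U:
days = list(range(1, 30))
synthetic_kae = [-2.0 - 0.02 * (d - 15) ** 2 for d in days]
r = pearson_r([transformed_cycle_score(d) for d in days], synthetic_kae)
print(round(r, 6))  # -1.0
```

With real data the correlation would be far weaker than in this idealized case; the sketch only shows why the curvilinear hypothesis predicts a negative linear correlation after the transformation.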
For Samples 1, 2, and 3, the Pearson
product-moment correlations between KAE and
transformed menstrual cycle scores were
[...] (p < .05), −.602 (p < .05), and
[...],⁸ respectively. The average
correlation (McNemar, 1969) was −.285 (p
< .002). Since the main hypothesis involved
a directional prediction, p values are for
one-tailed tests.
Two double checks on the correlational
analysis. These correlations approximate a
one-way analysis of variance with menstrual
cycle day as the independent variable (with 29
levels) and with curvilinearity tested by
quadratic trend analysis. Because sample
size was far too small, one-way analyses of
variance using trend analyses could only be
employed with far fewer than 29 menstrual
cycle levels. The raw menstrual cycle scores
were here collapsed into 10 levels for Samples
1 and 3 (see Figure 1) and into 4 levels for
Sample 2. Resulting Fs from the quadratic
trend analyses (dfs = 1, 44; 1, 10; 1, 52)
were [...], [...], and 3.68, respectively.
Combining the probabilities for independent tests on
the same hypothesis (Winer, 1962), [...].

For the second double check, the three
samples were pooled, and subjects were
classified into two categories occasionally used
(e.g., Dalton, 1964) in menstrual cycle
studies: (a) paramenstruum (week before and
days of menses)⁹ ¹⁰ versus (b) midcycle (the
rest of the cycle). Paramenstruum females
(M = −4.92 mm, SD = 4.08 mm) showed
greater KAE reducing, t(114) = 3.18, p <
.001, than midcycle females (M = −2.34 mm,
SD = 4.35 mm). Since this single simple
analysis has the advantage of providing a large
n for each cell, certain of the later analyses
of secondary issues are patterned on it.


Some Secondary Issues 


Male-female comparisons. Although few 
menstrual cycle researchers (e.g., Dalton,
1964) have used comparison groups of males, 
Parlee (1973) recently enunciated a rationale 
for employing this admittedly not perfect 
control procedure: She asserts that the use
of nonmenstruating individuals (for example, 
males) can help in both description and 
understanding by providing a baseline with 
which to compare behaviors associated with 
different parts of the menstrual cycle. 

Paramenstruum females showed greater re- 
duction (larger aftereffect) than males in 


7 Day 15 is the day exactly in the middle of the 
standard 29-day cycle used here. 

8 When the day of menstrual cycle raw score is 
converted into an adjusted score for a standard 
29-day cycle, which was possible only for Sample 
3, the value of the correlation becomes slightly 
higher, r = —.258. 

9 For menstruating subjects, days of menses at
time of KAE testing ranged from Day 1 to Day 5 
of menstrual flow. 

10 Because our hypothesis was based, in part, on 
cyclical variation of painful symptoms, we used the 
same cutoff points as Moos (1968), whose extensive 
(n = 839) study found that more women complain 
of painful symptoms during menstruation and during 
the week before menses relative to intermenses. When 
we review other menstrual cycle studies, we use the 
term “paramenstruum” to refer to findings clearly 
associated with the cycle’s beginning and end either 
as defined by Dalton (1964; that is, the 4 days 
before and the 4 days after menses) or as defined 
by us or including slight variations on this definition 
(e.g., 1 week before menses and 1 week after menses,
as used by Glass et al., 1971). 
each sample. For paramenstruum females,
Ms = —4.97, —5.26, and —4.86 mm (SDs 
= 4.97, 1.97, and 3.60 mm; ns = 16, 5, and 
19) for Samples 1-3, respectively. For males, 
Ms = —1.98, —1.95, and —3.45 mm (SDs 
= 5.75, 2.38, and 2.99 mm; ns = 31, 10, and 
59). Comparisons of paramenstruum female
and male KAE scores within each sample gave
t values (dfs = 45, 13, and 76) of 1.76, 2.69,
and 1.57; p < .10, p < .02, and p < .14,
respectively. Combining probabilities across
samples (Winer, 1962), χ² = 16.36, p < .02. (All
p values are for two-tailed tests, since there
was no a priori directional hypothesis here.)
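
The combined chi-square reported above can be reproduced from the three probability bounds with Fisher's method, in which the statistic is -2 times the sum of the natural logs of the p values, referred to chi-square with 2k degrees of freedom. The Python sketch below assumes that this is the procedure cited from Winer (1962), which the text does not state explicitly; the function name is hypothetical. For an even number of degrees of freedom the chi-square tail probability has a closed form, so no statistics library is needed.

```python
from math import exp, factorial, log


def fisher_combined(p_values):
    """Combine independent p values by Fisher's method.

    Returns (chi_square, degrees_of_freedom, combined_p).
    """
    k = len(p_values)
    chi_square = -2.0 * sum(log(p) for p in p_values)
    half = chi_square / 2.0
    # Closed-form upper tail of chi-square with 2k degrees of freedom.
    combined_p = exp(-half) * sum(half ** i / factorial(i) for i in range(k))
    return chi_square, 2 * k, combined_p


chi_square, df, combined_p = fisher_combined([0.10, 0.02, 0.14])
print(round(chi_square, 2), df)  # 16.36 6
print(combined_p < 0.02)         # True
```

Entering the bounds .10, .02, and .14 gives a chi-square of 16.36 on 6 degrees of freedom, which agrees with the value reported in the text.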

Midcycle females in Samples 1 and 2 (Ms
= −2.37 and −2.85 mm; SDs = 5.07 and
3.86 mm; ns = 31 and 9, respectively) also
showed greater reduction (larger aftereffects)
than the corresponding males (see above),
but this difference was clearly not significant,
ts(60) and (17) = .28 and .63, ns. In
Sample 3, midcycle females (M = −2.19 mm;
SD = 3.60 mm; n = 36) showed less
reduction (smaller aftereffect) than the
corresponding males. This difference, opposite in
direction to the nonsignificant male-female
differences for Samples 1 and 2, approached the
conventional level of significance, t(93) =
1.67, p < .10. Taken together, these data
suggest that there is no difference between
males and midcycle females on KAE.¹¹

Given that paramenstruum females do, but 
midcycle females do not, differ from males, 
one may ask whether females in general differ 
from males. For all females pooled, KAE 
scores are —3.24, —3.71, and —3.06 mm 
(SDs = 5.13, 3.43, and 3.75; ns = 47, 14, 
and 55) for Samples 1-3, respectively. (See 
preceding text for corresponding male values.)
Male-female comparisons within each sample 
gave t values (dfs = 76, 22, and 112) of .97, 
1.42, and .61, respectively, ns. In Samples 1 
and 2, females show greater reduction (larger 
aftereffect); in Sample 3, they show lesser 
reduction (smaller aftereffect). Overall there 
is no evidence of a significant sex difference 
when all females are involved in the
comparison. This finding is consonant with
reports from three earlier, single-session studies
employing the same one-hand KAE procedure
used here (Barthol, 1958; Charles & Duncan,
1959; Spitz & Lipman, 1961).

Possible confounding variables. We
analyzed three variables that might have
confounded the observed menstrual cycle-KAE
relationship: (a) tiredness, which reportedly
fluctuates over the cycle (Dalton, 1964);
(b) oral contraceptive use, which affects
mood and behavior (Kutner, Phillips, & Haag,
1971); and (c) use of drugs or medication,
which might vary systematically over the
cycle, and which can affect KAE (Gupta,
1974). Each variable was assessed by
self-report. Two separate 2 × 2 unweighted-means,
independent group analyses of variance
were undertaken on the KAE scores
with (a) menstrual status (paramenstruum
vs. midcycle) as one variable and (b) degree
of tiredness (more tired than usual vs. the
same or less tired than usual) and oral
contraceptive status (use vs. not use),
respectively, as the other variable. No significant
interaction was observed between degrees of
tiredness and menstrual status, F(1, 112) =
.06, ns, or between oral contraceptive status

11 Perhaps a true male/midcycle female difference
exists that is obscured by the presence of the dip
that each sample's curve displays somewhere in the
middle of the cycle (see Figure 1). Since our [...]
of ovulation. Indeed, Vollman (1974) regards
midcycle pain as one test of time of ovulation, stating
that for some women this pain is so severe as to be
incapacitating. Moos et al. (1969) plotted reports
of painful symptoms and found maximal values at
either end of the cycle and also an increase in
self-reported pain in the middle of the cycle. In this
context, the increase in reduction in the
middle of the curves for our three samples is
consistent with the assumption that an increase in
internal pain leads to damping down on
the part of the postulated modulator mechanism.
Exploratory t tests were undertaken in Sample 3,
where the most precise menstrual cycle measurement
was made. These indicate that the mean of Days
16-18 (depicted as Day 17 on the graph) differed
from both the mean of Days 22-24, t(11) = 1.95,
p < .02, two-tailed test, and from the mean of Days
13-15, t(91) = 2.41, p < .05. This exploratory
"finding" of a midcycle dip warrants further
investigation.



and menstrual status, F(1, 98) = .07, ns.
(The latter n is smaller here because [...].)
Finally, we assessed whether the KAE-menstrual
cycle finding might artifactually
reflect differential use of drugs and medication
over the cycle. If, for example, drugs and
medications that lead to greater reducing
(Gupta, 1974) were used more at paramenstruum
and drugs that lead to greater
augmentation were used more during midcycle,
differential drug use might have produced
our finding. To see if the findings would
hold, independent of drug usage, we simply
dropped all subjects who reported using drugs
and/or medication in the 12 hours before
testing and computed a t test between the
remaining midcycle and paramenstruum
females. There was still a menstrual cycle
effect, t(89) = 2.66, p < .005.


Discussion 


Observed Kinesthetic Aftereffect/Menstrual
Cycle Relationship


Effect magnitude. In three samples, KAE
showed the same curvilinear pattern
over the menstrual cycle. How large is this
effect? Effect magnitude can be assessed from
two perspectives, means (effect size) or
variance (explained variance): (a) Using
Cohen's d statistic, paramenstruum
and midcycle females differ in KAE means
by [...], [...], and .74 SD for Samples 1-3,
respectively. Cohen described effects of this
size as "medium" (Sample 1) and "large"
(Samples 2 and 3). However, (b) the average
correlation of .285 shows that menstrual
cycle explains 8% of the variance in KAE
scores, a significant but modest amount. We
suspect that improved methodology and
specification of the midcycle dip would strengthen
these results (see Footnote 11).
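
Both indexes of effect magnitude can be checked from summary statistics reported elsewhere in the article. The Python sketch below computes Cohen's d with a pooled standard deviation (an assumption; the text does not say which denominator was used) for the Sample 3 means given in the Results, and the proportion of variance explained by the average correlation. The function name is hypothetical.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (a minus b) using the pooled SD."""
    pooled_variance = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (
        n_a + n_b - 2
    )
    return (mean_a - mean_b) / sqrt(pooled_variance)


# Sample 3, values as reported in the Results section:
# midcycle M = -2.19, SD = 3.60, n = 36;
# paramenstruum M = -4.86, SD = 3.60, n = 19.
d_sample_3 = cohens_d(-2.19, 3.60, 36, -4.86, 3.60, 19)
print(round(d_sample_3, 2))  # 0.74

# Explained variance from the average correlation of -.285.
explained = (-0.285) ** 2
print(round(explained * 100))  # 8
```

The Sample 3 result agrees with the .74 SD reported above, and squaring the average correlation gives roughly 8% of variance explained.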
Possible confounding variables. Elimination
of possible confounding variables
strengthens the findings. Empirical analyses
of tiredness, oral contraceptive use,
and effects of drugs and/or medication
show that these variables
do not account for the results. On logical
grounds, one can eliminate a possible atten- 
tional hypothesis: If cyclical variation in 
pain and discomfort (Moos et al., 1969) di- 
rectly affects attention, then attention to the 
KAE task would presumably be lower at 
menses and premenses. Since KAE is minimal 
with lessened attention (Bakan & Thompson, 
1965), one might predict greater KAE at 
midcycle, whereas smaller aftereffects were 
found here. 

Social expectancy effects. It appears that 
the KAE findings cannot be explained in 
either of the two primary ways in which 
social expectancy effects have been invoked 
to account for certain prior menstrual cycle 
findings (Parlee, 1974; Sommer, 1973): (a) 
When interpreting a correlational relationship 
between the menstrual cycle and some other 
variable, the researcher may, because of his 
or her social expectations, ascribe causal 
status to the menstrual cycle when it is an 
effect. For example, Dalton (1964) inter- 
preted the greater frequency of accidents at 
paramenstruum as being caused by psycho- 
logical processes associated with the menstrual 
cycle and ignored the alternative hypothesis 
(Parlee, 1973) that stress associated with 
the accident accelerated menses’ onset, thus 
leading to this “finding.” We can see no way 
in which taking KAE might have any such 
effect on menstrual cycle locus. (b) The sub- 
ject, because of her social expectations, may 
behave differently at or near menses—for 
example, doing more poorly on a task because 
she believes women do poorly then. KAE, 
however, is not a “right or wrong” task that 
focuses on simple performance errors. Nor 
do we know of any social expectancies about 
when augmenting rather than reducing errors 
occur on KAE. So, neither type of social
expectancy effect is likely to occur here.

The method of this study. Precise physio- 
logical assessment (assay of blood samples, 
assessment of basal temperature, and indexes 
of major metabolites in the urine) of the 
menstrual cycle variable is important for ac- 
curate measurement, since pronounced indi- 
vidual differences exist in occurrence of ovula- 
tion and in patterning of hormonal secretions. 
But these procedures are quite expensive and 
difficult. An admittedly simpler and more
economical approach seems appropriate when 
first exploring a given domain for possible 
menstrual cycle relationships. In a situation 
in which social expectancies have an effect, 
use of self-report measures might well in- 
crease the role of such expectations, and thus 
Type I errors would be more likely. With such 
biasing effects ruled out, the decrease in ac- 
curacy in menstrual cycle assessment due to 
self-reports should simply make verification 
of a directional hypothesis such as ours more 
difficult. Studies using more precise menstrual 
cycle measurement procedures should simply 
find a stronger effect. 

Implications for KAE methodology. Is it 
feasible to continue to use KAE as a measure 
of trait since we and Gupta (1974) have 
shown that state variables affect KAE? Yes. 
Only 2 (Deaux, 1976; Schooler et al., 1976) 
of the 18 KAE validity studies cited in the 
introduction controlled for state. Thus, state 
need not be controlled to obtain significant 
personality findings with KAE. Of course, a
useful task for improving KAE’s predictive 
power would be the development of some 
means of separating state and trait variance. 


Possible Behavioral Implications 


In contrast to most menstrual cycle studies,
we observed a relationship that (a) holds
across independent samples, (b) involves
objective measurement on a laboratory-type
task, and (c) cannot be explained by social
expectancy or (d) by other obvious sources
of confounding. What are the possible
behavioral implications of this finding?

According to KAE validity studies,
between-people differences in KAE scores (trait)
correlate with between-people differences in
diverse psychological domains. Based on our
present findings for females, is what appears
to be a within-person cyclical variation in
KAE scores (state) similarly associated with
a within-person cyclical variation in some or
all of these same psychological domains?

Our literature review shows that five of
these domains have been studied in relation
to the menstrual cycle. For these five, the
KAE validity literature indicates that KAE
reducing scores are associated with delinquent
behavior, accidents, schizophrenia, greater
activity level, and higher sensory thresholds.
The present findings associate KAE reducing
scores with the paramenstruum. Is it possible
that relative to the rest of the cycle, women
at paramenstruum show more delinquent
behavior, more accidents, higher incidence of
schizophrenic symptomatology, higher
activity levels, and higher sensory thresholds?

The findings for the first three of these
five domains show just this pattern. Relative
to the rest of the cycle, there is significantly
higher incidence during the paramenstruum
of (a) crime (Dalton, 1961), unpremeditated
acts of violence (Morton et al., 1953), and
disorderly behavior among prisoners (Dalton,
1961); (b) accidents (Dalton, 1964); and
(c) admission of newly hospitalized
schizophrenics (Dalton, 1959) and hospital
emergency ward treatment of schizophrenics (Glass
et al., 1971).

Regarding the fourth domain, activity
level, Morris and Udry (1969, 1970) report in
two samples evidence that greater activity
level (assessed by pedometer) occurs at the
beginning and end of the cycle. Three peaks
of activity were found: the largest around
the time of ovulation, and two others, one
occurring early in menstruation and the other in
the premenstrual week. Although one might
describe these curves as U-shaped (with an
additional peak in the middle, much like our
curves), no statistical test for curvilinearity
was reported. At best, one can say that the
observed data are consistent with the KAE
curve over the menstrual cycle.

The results from the fifth domain, laboratory
studies of sensory thresholds, are mixed.
Only one study (of olfactory sensitivity)
supports our expectation that higher sensory
thresholds should occur at the beginning and
end of the menstrual cycle (Vierling & Rock,
1967). In contrast, a study of visual threshold
(Diamond, Diamond, & Mast, 1972),
although reporting high threshold during
menstruation, observed low threshold during the
premenstrual week. A different pattern was
reported for taste threshold (Glanville &
Kaplan, 1965), where the lowest thresholds
occurred during menstruation as compared to
other periods during the cycle.

No firm conclusions can be drawn from these
studies. With respect to sensory thresholds,
the findings are confusing. This may well mean
that the parallel that we were seeking to
draw between KAE and the menstrual cycle
and thresholds and menstrual cycle
does not obtain. Another possibility is that
the findings may reflect a problem in the
methodology employed (i.e., a confounding
of sensitivity with response bias).
[...] Tong (1974) recently applied signal
detection theory to the menstrual cycle/
sensory threshold relationship. They showed
that failure to distinguish between changes
in response criterion and changes in sensitivity
(a failure true of each of the threshold
studies reviewed above) can considerably
[...] the relationship between threshold
[...] and menstrual cycle. It remains an
issue for future research to determine which
of these two possibilities is most likely.

The first four domains reviewed each involved
naturalistic observation, and only the
fifth involved laboratory investigation.
Ordinarily, one would give greater weight to
findings based on laboratory research, because
of the greater precision and greater control
over extraneous variables normally associated
with this methodology. Here, only sensory
threshold studies used laboratory methods,
and certain methodological limitations may
[...]. Under these circumstances, there
is little basis for weighting the laboratory
evidence [...], and discounting the evidence
from the four domains involving naturalistic
observation.


For these four domains, behaviors related
to reducing in the KAE-trait literature show
an association with the cycle's beginning
and end. For each domain, other possible
explanations for this curvilinear pattern exist
([...]). Yet, within the limitations
of this evidence, there is a convergence
with the curvilinear hypothesis and with
the present curvilinear KAE findings. It is
noteworthy that recent menstrual cycle
literature reviews (see the introductory section
of this article) concur that consistent
[...] menstrual cycle findings have
been reported for criminal behavior and for
psychiatric crises (which does subsume
schizophrenic crises as one important
component).¹² Thus, for two of these four
domains, there is the most agreement that
behavioral variation does occur over the cycle.
It would be worthwhile to explore whether
other behaviors known empirically to
correlate with KAE show a similar curvilinear
pattern over the menstrual cycle.


Possible Theoretical Implication 


Our hypothesis regarding KAE was based, 
in part, on the assumption that the postulated 
SIM mechanism varies curvilinearly over the 
cycle. The finding that KAE shows the pre- 
dicted curvilinear pattern thus constitutes 
one basis for inferring that the underlying 
process of SIM (presumed to underlie indi- 
vidual differences in KAE and thus to medi- 
ate the relationship between KAE and various 
external behaviors) also varies curvilinearly 
over the menstrual cycle. Obviously, this type 
of inference requires additional independent 
confirmation, preferably from other proce- 
dures that assess the postulated process (for 
example, the average evoked response mea- 
sure of reducing-augmenting). 

Suppose that future research clearly estab- 
lishes that amount of activity level and 
incidence of criminal behavior, schizophrenic 
symptomatology, and accidents vary curvi- 
linearly over the cycle. We would then tenta- 
tively offer the single parsimonious explana- 
tion that the modulator mechanism varies 
curvilinearly over the menstrual cycle and 
that this variation mediates the cyclic pat- 
tern observed for these four behavioral do- 
mains. There are various other alternative 
hypotheses that might also explain the cyclical
pattern observed for any one of these domains.
Dalton (1964), for example, explained sev- 


12 Although there is general agreement that inci- 
dence of “psychiatric crises” shows periodicity over 
the menstrual cycle, it is important to add that 
three studies have additionally indicated that the 
particular phenomenon of concern here, schizophrenic
symptomatology, also shows such periodicity (i.e.,
Dalton, 1959; Glass et al., 1971; Jacobs & Charles,
1970).
Evaluative Judgments of Aspects of Life as a Function of
Vicarious Exposure to Hedonic Extremes


Marshall Dermer, Sidney J. Cohen, Elaine Jacobsen, and
Erling A. Anderson
University of Wisconsin-Milwaukee


In two experiments, the hypothesis was corroborated that vicarious exposure to
hedonic extremes, especially the hedonically negative, results in contrast
regarding evaluative judgments of aspects of life that have evolved or been
acquired in the course of life beyond the laboratory. In Experiment 1, participants
who wrote about hedonically negative events occurring at the turn of the
century expressed greater satisfaction on a composite index of present life quality
than participants who wrote about hedonically positive events. In Experiment
2, participants who wrote about hedonically negative events, personal tragedies,
scored higher on a composite index of satisfaction with life, health, and physical
appearance than participants who wrote about hedonically positive events. The
findings for the composites corroborate a comparison level model of evaluative
judgment. The findings for individual items, however, suggest that aspects of
life are not evaluated in terms of a single utility scale and standard, the
comparison level. Other findings are discussed that appear to contradict a simple
affective model of evaluation in which the positivity of evaluations is
postulated to increase with the positivity of affective states.


A number of psychologists have generalized
adaptation level (Helson, 1964) and judgmental
(Volkman, 1951) principles, corroborated
in traditional experimental psychology,
to social judgment. The most
[...] generalization is that of Thibaut
and Kelley (1959) who proposed a theory of
the evaluation of outcomes resulting from
social interaction. Although outcomes might
differ in their specifics, it was assumed that
outcomes could be characterized in terms
of utility or hedonic value. The com-


This research was supported by grants from the
[...] School of the University of Wisconsin-Milwaukee.
Marshall Dermer, Sidney Cohen and
Erling Anderson contributed to the design and
execution of Experiment 1, and Elaine Jacobsen
contributed to the design and execution of Experiment
2. Thanks are due Ellen Berscheid, Philip
[...], and [...] Upshaw for their encouragement
and [...] for skillfully transcribing [...].

Requests for reprints should be sent to Marshall
Dermer, Department of Psychology, University of
Wisconsin-Milwaukee, Milwaukee, Wisconsin 53201.


parison level was conceptualized as a “psycho- 
logically meaningful midpoint for the scale 
of outcomes—a neutral point of satisfaction— 
dissatisfaction,” (p. 81) and was defined as 
the “average value of all the outcomes known 
to the person (by virtue of personal or vi- 
carious experience), each outcome weighted 
by its salience (or the degree to which it is
instigated for the person at the moment)” 
(p. 81). An outcome, therefore, is judged to 
be positive or negative to the extent that its 
hedonic value is, respectively, above or below 
the comparison level. 
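
The definition just quoted amounts to a salience-weighted average, which can be written out as a short Python sketch. The hedonic values and salience weights below are hypothetical numbers chosen only to show the direction of the effect; they are not data from Thibaut and Kelley or from the present experiments, and the function name is likewise illustrative.

```python
def comparison_level(outcomes):
    """Salience-weighted mean of known outcomes.

    `outcomes` is a list of (hedonic_value, salience) pairs.
    """
    total_salience = sum(salience for _, salience in outcomes)
    if total_salience <= 0:
        raise ValueError("total salience must be positive")
    weighted_sum = sum(value * salience for value, salience in outcomes)
    return weighted_sum / total_salience


# Hypothetical outcomes known to a person, all equally salient.
known = [(4, 1), (5, 1), (3, 1)]
before = comparison_level(known)

# Making a hedonically negative outcome salient lowers the comparison level,
# so an unchanged present outcome of 4 now lies above it.
after = comparison_level(known + [(1, 1)])
print(before, after)  # 4.0 3.25
```

In this toy case an outcome valued at 4 sits exactly at the comparison level before the negative information is added and above it afterward, which is the contrast effect the theory predicts.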

Comparison level theory has been corrobo- 
rated in a number of studies in which satis- 
faction with outcomes attained within ex- 
periments has increased as a function of 
manipulations hypothesized to lower the
comparison level (e.g., Brickman, 1975;
Friedland, Arnold, & Thibaut, 1974). Judgmental
theories, however, apply to the evaluation of 
outcomes that are of greater importance to 
individuals than those usually bestowed upon 
participants in laboratory experiments. Thi- 
baut and Kelley, for example, illustrated 
their theory in terms of marital satisfaction.
Volkman (1951, p. 286) also presumed the 
generality of judgmental principles. More re- 
cently, Brickman and Campbell (1971) gen- 
eralized adaptation level theory (a conceptual 
cousin of comparison level theory) to satis- 
faction with, for example, personal wealth 
and competence. These generalizations are 
intriguing, but are without direct experimental 
support. 

In this report, two experiments are pre- 
sented in which the generality of comparison 
level theory was tested for evaluations of 
important and often vital life outcomes that 
have evolved or been acquired in the course 
of a participant’s life beyond the laboratory. 
Furthermore, a condition in which partici- 
pants were vicariously exposed to the he- 
donically negative was common to both ex- 
periments. The importance of this condition in 
testing comparison level theory, for this judg- 
mental domain, is emphasized. 


Experiment 1 ¹


In the past 10 years, a social indicator 
movement has evolved in which it has been 
proposed that the well-being of the nation 
can best be assessed by measuring a broad 
range of variables. Andrews and Withey’s 
(1976) respondents, in particular, indicated 
their feelings regarding various aspects of 
life along rating scales similar to those used 
in laboratory studies of contrast. Further- 
more, the “objects” of judgment (for exam- 
ple, the respondent’s life as a whole and 
standard of living) differed from those of 
previous laboratory studies of contrast. The 
study of contrast in relation to judgments of 
the quality of life offers, consequently, an 
opportunity to replicate the response assess- 
ment methodologies of previous experiments 
and to generalize comparison level theory to 
a new class of stimuli. Furthermore, although 
the theoretical orientation of the present re- 
port tends toward linguistic rather than per- 
ceptual or subjective interpretations of con- 
trast (see Manis, 1971) and comparison 
level theory (see Upshaw, 1969), as An- 
drews and Withey (1976) have noted, “any- 
thing that can be done to improve the human 
lot that is reflected as felt improvement is a
condition to be coveted" (p. 10).


Hedonic Manipulation 


In the present experiment, comparison level
theory was tested by presenting historical
information regarding life at the turn of the
century. In the "good old days" condition,
life was presented as the embodiment of
traditional American ideals. For example,
educational instruction was described as
personalized, and students were depicted as having
excellent foundations in reading, writing, and
arithmetic; the air was clean and the water
pure; there was time for people to genuinely
relate to one another; food was wholesome
and nutritious; people enjoyed their work
and took pride in the products and services
they provided; and government was efficient.
In the "bad old days" condition, a
description was presented of what historically
appears to have been the plight of the vast
majority of Americans. This information and
the contrast principle are described in Otto
Bettmann's (1974) prefatory remarks to
The Good Old Days—They Were Terrible!


I have always felt that our times have overrated
and unduly overplayed the fun aspects of the past.
What we have forgotten are the hunger of the
unemployed, crime, corruption, the despair of the aged,
the insane and the crippled . . . In most of our
nostalgia books . . . the period's dirty business is swept
under the carpet of oblivion. What emerges is a
glowing picture of the past, of blue-skied meadows
where children play and millionaires sip tea.

If we compare this purported Arcadia with our own
days we cannot but feel a jarring discontent, a
sense of despair that fate has dropped us into the
worst of all possible worlds. (pp. xii-xiii)


Female students evaluated the present quality of life at the
beginning of the semester and some weeks later, immediately after
having been exposed to one of the experimental conditions. It was
generally hypothesized, on the basis of comparison level theory, that
pretreatment-to-posttreatment change in evaluations would be more
positive in the bad than in the good old days condition. In
particular, it was hypothesized that evaluations of present life
quality would increase from pretreatment to posttreatment in the bad
old days condition (positive contrast) but tend to decrease (negative
contrast) somewhat or remain unchanged in the good old days condition.


1 Special thanks are due Linda S. Messi[...] and David Stamm for
preparing stimulus materials and helping conduct this experiment.


[Figure 1 shows four horizontal scales: a utility scale, the
pretreatment rating scale, the posttreatment good old days rating
scale, and the posttreatment bad old days rating scale. The rating
scales run from 1 (Terrible) to 7 (Delighted).]

Figure 1. Location of pretreatment comparison level and information presented in experimental
conditions, in relation to a hypothetical utility scale and rating scales.


The prediction of differential absolute magnitude of contrast across
experimental conditions, and therefore a stronger test of comparison
level theory in the hedonically negative than in the positive
condition, is illustrated in Figure 1. As indicated on the uppermost
horizontal (utility) scale, the good old days information was assumed,
by virtue of its embodying traditional American values, to be closer
in hedonic value to participants’ pretreatment comparison levels than
the bad old days information. The second horizontal line represents
the pretreatment rating scale, centered at the hedonic value of the
pretreatment comparison level. As Upshaw (1969) has noted, comparison
level theory implies that the “origin” or midpoint of each judge's
reference scale is anchored at the comparison level. The third
horizontal line represents the posttreatment rating scale in the good
old days condition, which has been shifted about three eighths of a
unit upward in relation to the pretreatment rating scale. This shift
follows from the definition of the comparison level as the average of
all known outcomes, with each outcome weighted by its salience,
although the absolute magnitude of the shift is arbitrary. The fourth
horizontal line represents the posttreatment rating scale


in the bad old days condition, which has been 
shifted about seven eighths of a unit down- 
ward in relation to the pretreatment rating 
scale. Again, the absolute magnitude of the 
shift is arbitrary but is consistent with the 
definition of the comparison level as an 
average—the assumed hedonic value of the 
bad old days information and the magnitude 
of the shift postulated in the good old days 
condition. 

The vertical line cutting across the three rating scales in Figure 1
represents an aspect of life that was evaluated 3 on the premeasure.
As a consequence of the hypothesized scale shifts, its posttreatment
evaluation is 2.7 in the good old days condition (a shift of .3 unit
downward) and 3.9 in the bad old days condition (a larger shift of .9
unit upward). Finally, it is important to note that although the
posttreatment scales are postulated to shift, the assumptions just
mentioned imply that posttreatment evaluations of the bad old days
information in the bad old days condition should more greatly depart
from the posttreatment scale midpoint than corresponding evaluations
in the good old days condition. This implication is illustrated by the
location of check marks on the posttreatment scales.
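
The arithmetic behind these hypothesized shifts can be sketched in a
few lines of Python. The sketch is only an illustration under stated
assumptions: the function names, the utility values, the equal
salience weights, and the additive rating rule are hypothetical and do
not come from the article. It treats the comparison level as a
salience-weighted average of known outcomes and anchors the midpoint
of a 1-to-7 rating scale at that level.

```python
def comparison_level(utilities, saliences):
    """Salience-weighted average of the utilities of all known outcomes."""
    return sum(u * w for u, w in zip(utilities, saliences)) / sum(saliences)


def rating(utility, level, midpoint=4.0):
    """Rating on a 1-7 scale whose midpoint is anchored at the comparison level.

    Assumes one rating unit per utility unit (an illustrative simplification).
    """
    return midpoint + (utility - level)


# Hypothetical pretreatment outcomes, all equally salient.
known = [4.0, 5.0, 6.0]
weights = [1.0, 1.0, 1.0]
pre_level = comparison_level(known, weights)

# An aspect of life with (hypothetical) utility 4.0 is rated 3 on the premeasure.
aspect = 4.0
pre_rating = rating(aspect, pre_level)

# Good old days information: one new outcome mildly above the old level,
# so the comparison level rises a little and the rating drops to about 2.7.
good_level = comparison_level(known + [6.2], weights + [1.0])
good_rating = rating(aspect, good_level)

# Bad old days information: one new outcome far below the old level,
# so the comparison level falls more and the rating rises to about 3.9.
bad_level = comparison_level(known + [1.4], weights + [1.0])
bad_rating = rating(aspect, bad_level)
```

With these made-up inputs the asymmetry described in the text follows
directly: the bad old days outcome lies farther from the pretreatment
comparison level than the good old days outcome, so it moves the
average, and hence the rating, by a larger amount.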


Scale Anchoring 


Mere prior exposure to a stimulus complex 
that contrasts with a stimulus under evalua- 
tion has not generally been considered suffi- 
cient for generating contrast effects (see 
Eiser & Stroebe, 1972, p. 48). Judgmental 
theorists have considered anchoring processes 
in which the response scale is coordinated 
with the continuum putatively underlying 
judgments. To the extent that only one seg- 
ment of the range of potential stimuli is pre- 
sented and judged, with the location of this 
segment varied across experimental condi- 
tions, linguistic effects may be generated. 
Pepitone and DiNubile (1976) recently used 
such a procedure in studying crime-related 
judgments. They reported, for example, that 
a homicide was judged to be a more severe 
criminal violation when participants first had 
read and publicly judged the seriousness of an 
assault case than in a condition in which the 
first case was another homicide. The overt 
recording of the first judgment appeared to be 
a necessary condition for producing a con- 
trast effect. 

In the present experiment, an anchoring manipulation was also
included. After participants had been exposed to the stimulus
materials and had written about them, all participants indicated their
feelings regarding the present quality of life in Milwaukee. In the
unanchored condition, participants simply expressed their judgments in
terms of aspects of present life quality. In the anchored condition,
participants were required to first evaluate an aspect of their
present life in terms of the quality of life in Milwaukee in 1900,
immediately before expressing their judgment of the same aspect of
present life quality. Theoretically, this manipulation should increase
the salience of the hedonic information and consequently result in
greater contrast because of greater shifts in comparison levels in the
anchored than in the unanchored conditions.


Method 
Participants 


Seventy-three women enrolled in introductory 
psychology at the University of Wisconsin—Milwau- 
kee participated in the experiment in exchange for 
extra credit in their courses. Women were selected as 
a matter of convenience. 


Pretreatment Measures 


At the beginning of the semester, a survey was distributed that
included a series of questions pertaining to the quality of life.
Women indicated their feelings regarding the following aspects of
their lives: life as a whole; personal health; standard of living;
amount of time for doing the things they want to do; the way the
police and courts in this area are presently operating; present
working conditions in Milwaukee; present quality of education in
Milwaukee; quality of their food; the present efficiency of the fire
department; the honesty of present local, elected officials; the
present physical environment in Milwaukee (the purity of the air,
lakes, and streams); the present level of public spirit and c[...]
involvement in Milwaukee; and the present quality of human
relationships (e.g., between family members, doctors and patients, and
merchants and customers). Items were included that appeared relevant
to the hedonic manipulation. Respondents answered each question by
placing a check mark anywhere along a 67-point rating scale similar to
that of Andrews and Withey (1976, p. 18):


Terrible   Unhappy   Mostly         Mixed   Mostly      Pleased   Delighted
                     dissatisfied           satisfied


Those women whose ratings were between Unhappy (12) and Pleased (56)
were invited to participate several weeks later and randomly assigned
to conditions.
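
The scoring and screening rule just described can be sketched as
follows. This is a minimal illustration, not the authors' scoring
procedure: the article gives only the values 1 (Terrible), 12
(Unhappy), 34 (Mixed), 56 (Pleased), and 67 (Delighted), so the equal
11-point spacing of the remaining labels, the inclusive treatment of
the 12-to-56 bounds, and the function names are assumptions.

```python
LABELS = [
    "Terrible", "Unhappy", "Mostly dissatisfied", "Mixed",
    "Mostly satisfied", "Pleased", "Delighted",
]
STEP = 11  # assumed equal spacing between adjacent labels on the 67-point scale

# Map each labelled anchor point (1, 12, 23, ..., 67) to its label.
ANCHORS = {1 + STEP * i: label for i, label in enumerate(LABELS)}


def nearest_label(score):
    """Return the verbal label whose anchor point is closest to a 1-67 score."""
    if not 1 <= score <= 67:
        raise ValueError("score must lie on the 1-67 scale")
    anchor = min(ANCHORS, key=lambda a: abs(a - score))
    return ANCHORS[anchor]


def eligible(score):
    """Screening rule: rating between Unhappy (12) and Pleased (56), bounds included."""
    return 12 <= score <= 56
```

Under these assumptions a mean of 45.4 falls nearest “Mostly
satisfied” and a mean of 15.1 nearest “Unhappy,” which agrees with the
verbal descriptions used in the manipulation check.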


Procedure


Participants were run in small groups ranging in size from 2 to 10
women and responded independently to materials. Participants were told
that conceptions of life in Milwaukee at the turn of the century were
being studied. Specifically, they would be asked to write a
description of a day in the life of a typical Milwaukeean living in
1900. Participants were told that since they were probably not
intimately familiar with life in 1900, materials describing this era
would be presented. Participants were instructed to imagine vividly
what life was like in Milwaukee in 1900 and to take notes if they
wished.


Audiovisual Materials and Presentation Procedure


In all conditions, participants viewed an identical series of 16
slides; materials susceptible to [...] interpretations were used.
Accompanying each slide was a tape-recorded narrative by which hedonic
information was manipulated. In the good old days condition, life was
described positively, whereas in the bad old days condition negative
aspects were emphasized. For example, portions of the narratives
associated with a slide of an elaborate fountain were as follows:


(Good old days). Beautifully carved fountains such as the one on
Kilbourn Avenue provided a cooling respite for Milwaukeeans and
contributed to the [...], slow-paced, leisurely atmosphere of the turn
of the century. Note the graceful, flowing lines of the artfully
sculpted statue that was characteristic of the quality craftsmanship
of the [...] immigrants. Fountains like this one provided a thirsty
public with cool drafts of crystal-clear spring or pure well water on
sunny afternoons.


(Bad old days). This fountain, located in the middle of a dirt highway
later known as Kilbourn Avenue, is an example of the poor public
hygiene practices of the turn of the century. Horses, birds, [...] as
well as people, drank and washed themselves in this public germ
spreader. People all drank from the same filthy communal drinking cup.
It was not known for years that fountains like this one and the public
drinking cups used with them were transmission vehicles for
tuberculosis, smallpox, and other dreaded diseases which tore at the
heart of the community by exiling infected people to isolation, or to
convalescent hospitals for extended terms. Most people struck down and
killed by tuberculosis were between the ages of 18 and 30.


Some narratives were longer than others, but for a given slide the
narratives were of approximately equal length across hedonic
conditions. After the narrative for each slide, participants were
given [...] to vividly imagine the scene without the accompanying
narrative. Overall, participants spent a[...] min viewing the slides,
listening to the [...].

[...] next distributed, in which participants described their
conceptions of a day in the life of a typical Milwaukeean by
indicating what the person [...] doing at 1-hour intervals beginning
at [...], when the person woke up, and ending at [...] night, when the
person retired to [...]. Only upon [...] participants completing the
descriptive task [...] instructed to proceed.


Posttreatment Measures and Anchoring Manipulation


Two sets of ratings were assessed after the descriptive task. The
first (Present) set was identical to the pretreatment measures. The
second (Past) set was similar to the Present set, but participants
evaluated their life in terms of the quality of life in 1900. For
example, the Past item corresponding to the first pretreatment measure
was: “How would you feel about your life as a whole if it were similar
in quality to that of a typical Milwaukeean living in 1900?” The
questions were introduced as a way that the researchers might better
understand participants’ conceptions and feelings regarding life in
Milwaukee.

In the unanchored sequence, participants first 
completed the 13 items in the Present set before 
responding to those in the Past set. In the an- 
chored sequence, participants completed the first 
item in the Past set before completing the first item 
in the Present set. The remaining items were pre- 
sented in this alternating fashion. 
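
The two item orders can be made concrete with a short sketch. The
short item labels and the function names below are illustrative
assumptions; only the ordering rule itself (all Present items and then
all Past items when unanchored, Past and Present alternating item by
item when anchored) comes from the description above.

```python
from itertools import chain

# Short labels (hypothetical wording) for the 13 life quality items.
ITEMS = [
    "life as a whole", "personal health", "standard of living",
    "time for things you want to do", "police and courts",
    "working conditions", "education", "food", "fire department",
    "elected officials", "physical environment", "public spirit",
    "human relationships",
]


def unanchored_sequence(items):
    """All Present items first, followed by all Past items."""
    return [("Present", i) for i in items] + [("Past", i) for i in items]


def anchored_sequence(items):
    """For each item, the Past version immediately precedes the Present version."""
    return list(chain.from_iterable((("Past", i), ("Present", i)) for i in items))


unanchored = unanchored_sequence(ITEMS)
anchored = anchored_sequence(ITEMS)
```

Both sequences contain the same 26 ratings; only in the anchored order
is each Present judgment made right after the matching judgment about
1900.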


Debriefing 


After completing their ratings, participants were informed that for
experimental purposes, a balanced historical description had not been
presented and that four volumes depicting life in Milwaukee in 1900
had been placed on reserve at the library. Participants were
encouraged to approach the experimenter if they had any questions they
wanted immediately answered and to contact the first author if they
had questions later. Finally, participants were asked not to discuss
the study with others. After the data had been analyzed, a complete
description of the study and the findings was mailed to all
participants.

Results 


Manipulation Check 


Responses were averaged across the 13 Past items to form an index of
the judged quality of life in 1900 (coefficient alpha, based on the
pooled within-cell covariance matrix, was .91). A univariate Hedonic ×
Anchoring Condition analysis of variance revealed only a reliable
hedonic main effect, F(1, 69) = 181.11, p < .0001. Participants in the
good old days condition indicated that if their present lives were
similar in quality to the life portrayed, they would feel on the
average “mostly satisfied” (M = 45.4), or about 11 points above the
scale midpoint (34), whereas those in the bad old days condition would
feel “unhappy” (M = 15.1), or about 19 points below the scale
midpoint. These findings are consistent with the assumed hedonic value
of the stimulus materials, as illustrated by the location of the
materials on the utility scale in Figure 1 and the check marks on the
accompanying rating scales.

It is also important to note that univariate analyses of variance for
each item in the Past set revealed reliable hedonic effects (p <
.0001) generally consistent with the overall index. The major
exception was for evaluations of the fire department, in which the
evaluations in the good old days condition (M = 32.4), although more
positive than those in the bad old days condition (M = 10.4), were
just below the scale midpoint. Finally, it is important to note that
the seven largest F ratios were detected for the personal health,
police and courts, food, elected officials, physical environment,
public spirit, and human relationship items.


Table 1
Average Judgments of Present Life Quality: Composite

                          Pre-        Post-
Condition           n     treatment   treatment   Difference   F(1, 69)   p

Unanchored
  Good old days     19    39.7        39.7          0             .00     1.00
  Bad old days      18    38.5        41.0          2.5          3.49      .07
Anchored
  Good old days     18    39.7        37.9         -1.8          1.78      .19
  Bad old days      18    39.1        44.9          5.8         18.56      .00005

Note. Scores range from 1 (Terrible), through 34 (Mixed), to 67 (Delighted).


Major Analyses 


Life quality composite. Responses were averaged across the 13
pretreatment items (coefficient alpha = .70) and the 13 posttreatment
items (coefficient alpha = .81) to form indexes of present life
quality. These composites are presented in Table 1. Analysis of
pretreatment-to-posttreatment differences revealed a hedonic main
effect, F(1, 69) = 14.12, p = .0004, as well as a Hedonic × Anchoring
interaction trend, F(1, 69) = 3.62, p = .06. The test for the
anchoring main effect was not reliable (p = .56).
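
For readers unfamiliar with the reliability index used for these
composites, the following sketch computes the ordinary form of
coefficient alpha from a respondents-by-items score matrix. The
function name and the example scores are hypothetical, and the sketch
does not reproduce the pooled within-cell variant that the article
reports for the Past items.

```python
from statistics import variance


def coefficient_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings from four respondents on three items (1-67 scale).
example = [
    [30, 35, 40],
    [40, 38, 45],
    [50, 52, 47],
    [60, 55, 62],
]
alpha = coefficient_alpha(example)
```

Alpha approaches 1 as the items rise and fall together across
respondents, which is why a composite of closely related items is
treated as a more dependable index than any single item.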

The reliable hedonic main effect is due to
pretreatment-to-posttreatment change in evaluations being more
positive in the bad (M = 4.2) than in the good (M = -.9) old days
conditions. This finding corroborates the most general comparison
level prediction. Furthermore, the more specific hypothesis of greater
positive contrast in the bad old days condition than negative contrast
in the good old days condition was also corroborated. Inspection of
the tests in Table 1 for pretreatment-to-posttreatment simple effects
explicitly reveals that the positive contrast hypothesis was weakly
corroborated in the unanchored bad old days condition (p = .07) and
strongly corroborated in the anchored bad old days condition (p =
.00005). Negative contrast, that judgments would decline in positivity
after participants were exposed to the good old days information, was,
however, detected in neither the unanchored (p = 1.00) nor the
anchored condition (p = .19). The Hedonic × Anchoring interaction
trend is, as predicted, due to the difference between the good and bad
old days conditions being more enhanced when judgments were anchored
(the difference is 7.6) than when they were unanchored (2.5).
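
The differences quoted in this paragraph can be recomputed from the
cell means in Table 1. The sketch below is a reader's check rather
than the authors' analysis: the variable and function names are
hypothetical, and weighting the cells by their n to obtain the
condition means is an assumption (it reproduces the reported 4.2 and
-.9 to within rounding).

```python
# (n, pretreatment mean, posttreatment mean) for each cell of Table 1.
TABLE_1 = {
    ("unanchored", "good"): (19, 39.7, 39.7),
    ("unanchored", "bad"): (18, 38.5, 41.0),
    ("anchored", "good"): (18, 39.7, 37.9),
    ("anchored", "bad"): (18, 39.1, 44.9),
}


def change(anchoring, hedonic):
    """Pretreatment-to-posttreatment change for one cell, to one decimal."""
    _, pre, post = TABLE_1[(anchoring, hedonic)]
    return round(post - pre, 1)


def contrast_gap(anchoring):
    """Bad-minus-good difference in change within one anchoring condition."""
    return round(change(anchoring, "bad") - change(anchoring, "good"), 1)


def pooled_change(hedonic):
    """Change for one hedonic condition, pooling cells weighted by n (assumed)."""
    cells = [(n, post - pre)
             for (_, h), (n, pre, post) in TABLE_1.items() if h == hedonic]
    return sum(n * d for n, d in cells) / sum(n for n, _ in cells)


gap_anchored = contrast_gap("anchored")      # reported in the text as 7.6
gap_unanchored = contrast_gap("unanchored")  # reported in the text as 2.5
```

The interaction trend is simply the gap between these two
bad-minus-good differences: the hedonic manipulation separates the
conditions about three times as much when judgments were anchored.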

Individual items. To investigate whether all items were uniformly
influenced by the treatments, pretreatment-to-posttreatment difference
scores were calculated for each item and a multivariate analysis of
variance was conducted.

A reliable multivariate anchoring main effect was detected, F(13, 57)
= 2.04, p [...]. This effect appears to be due to evaluations of the
fire department increasing more in the anchored (M = 8.8) as compared
to the unanchored (M = 2.0) conditions (p [...] .003).

Of greater interest, a reliable multivariate hedonic main effect was
also detected, F(13, 57) = 2.16, p = .02. Pretreatment-to-posttreatment
difference scores were reliably (p < .05) more positive in the bad
than in the good old days condition for the items listed in Table 2
except personal health (p = .06). Inspection of the Difference column
in Table 2 for pretreatment-to-posttreatment changes in judgments
reveals that, generally, evaluations tended to decrease in the good
old days condition and increase in the bad old days condition.
Furthermore, inspection of the F ratios indexing these changes
suggests that, generally, increments in judgments (positive contrast)
in the bad old days condition tended to be larger than decrements
(negative contrast) in the good old days condition.

The hedonic main effect and pretreatment-to-posttreatment simple
effect findings described here are consistent with the analyses of the
composite. The multivariate Hedonic × Anchoring interaction was,
however, not reliable, F(13, 57) = 1.49, [...].


Table 2
Average Judgments of Present Life Quality: Individual Items

                          Pre-        Post-
Item and condition        treatment   treatment   Difference   F(1, 69)   p

Personal health
  Good old days           48.0        47.2         -0.8          0.26      .61
  Bad old days            44.9        48.5          3.6          4.70      .03
Police and courts
  Good old days           36.4        31.7         -4.7          5.25      .02
  Bad old days            33.1        35.9          2.8          1.81      .18
Quality of food
  Good old days           43.7        41.7         -2.0          1.15      .29
  Bad old days            43.9        47.2          3.3          3.06      .08
Elected officials
  Good old days           33.8        28.3         -5.5          7.50      .008
  Bad old days            33.6        33.9          0.3          0.02      .89
Physical environment
  Good old days           28.0        26.7         -1.3          0.52      .47
  Bad old days            23.4        31.0          7.6         17.33      .00009
Public spirit
  Good old days           33.1        31.8         -1.3          0.48      .49
  Bad old days            30.0        36.2          6.2         10.86      .002
Human relationships
  Good old days           37.0        32.2         -4.8          3.97      .05
  Bad old days            34.9        43.4          8.5         12.14      .0009

Note. Scores range from 1 (Terrible), through 34 (Mixed), to 67 (Delighted).


Discussion

Life Quality Composite


Comparison level theory most generally predicts that
pretreatment-to-posttreatment change in judgments should be more
positive


in the bad than in the good old days condi- 
tion. This is precisely what was detected, as 
indicated by the reliable test for the hedonic 
main effect, when pretreatment-to-posttreat- 
ment differences for the composite were ana- 
lyzed. Furthermore, it was hypothesized that 
contrast effects would be greater in the bad 
than in the good old days condition. This is 
precisely what was detected, as indicated by 
the tests for pretreatment-to-posttreatment 
simple effects reported in the last two columns 
of Table 1. It must be noted, however, that 
whereas these tests are consistent with a 
more precise contrast hypothesis, the com- 
parisons must be interpreted cautiously be- 
cause of the absence of a no-treatment con- 
trol group. 

The results for the composite also support 
the hypothesis of greater contrast in the 
anchored than in the unanchored condition, 
although the test was just short of being reli- 
able at conventional levels of significance. 
This finding is similar to that reported by 
Pepitone and DiNubile (1976), who detected 
contrast only when judgments of an initial 
stimulus in a two-stimulus “sequence were 
‘anchored,’ that is overtly recorded and thus 
publicly committed” (p. 448). It is not clear, 
however, on the basis of either study, which 
aspects of such anchoring manipulations are 
necessary for producing contrast. Whereas 
Pepitone and DiNubile have stressed public 
commitment, according to comparison level 
theory, merely requiring participants to judge 
but not publicly record initial judgments 
should be sufficient to induce or enhance con- 
trast, by virtue of increasing the salience of 
the hedonic value of the prior stimulus. 


Individual Items 


The contrast hypotheses were corroborated 
for 7 of the 13 life quality items. The pattern 
of reliable hedonic effects for the set of items 
appears clearly due to differential manipula- 
tion of aspects of life. Those aspects of life 
along which the general contrast hypothesis 
was corroborated were those most strongly 
manipulated. 

Finally, it will be recalled that a reliable 
anchoring effect was detected for the fire 
department item. This finding, that evalua- 
tions of the fire department increased more 
in the anchored than in the unanchored con- 
ditions across levels of the hedonic factor, 
can be reconciled with adaptation level theory 
if not with comparison level theory.² Recall 
that the manipulation check pertaining to the 
efficiency of the fire department indicated 
that evaluations were often below the scale 
midpoint in the good old days condition and 
even more so in the bad old days condition. 
The net result in either hedonic condition 
was to increase the salience of how bad the 
fire department was in 1900. In the anchored 
conditions, the manipulation checks were 
collected before the evaluation of the fire 
department’s present efficiency, and thus, 
greater salience and positive contrast would 
be expected in this condition. 


Theoretical Implications 


The present findings, although corroborating comparison level theory,
suggest that the theory may not be sufficiently refined to adequately
represent the complexities of evaluative judgment. Upshaw (1969) has
emphasized an important theoretical distinction between comparison and
adaptation level theories. Although outcomes might differ in their
specifics, according to comparison level theory, all outcomes are
judged in terms of a single utility scale such that “a nagging wife, a
10% salary raise, and a slice of apple pie are all evaluated in terms
of a single CL [comparison level]” (pp. 347-349). This conception of
the comparison level as a generalized hedonic standard of judgment
would imply that contrast effects should have been detected, within
the limits of experimental error, across all 13 measures of life
quality. As indicated, however, the general contrast hypothesis was
corroborated for only 7 of the measures. (The differential impact of
the hedonic manipulation upon the measures is not due to differential
experimental error.)³ According to an adaptation level model of
judgment, there is a different reference scale for each aspect of
life, with each scale having a particular adaptation level (Upshaw,
1969, p. 347). This model of judgment is consistent with the finding
that the strongest contrast effects were observed along those
dimensions that were most strongly manipulated, and with the anchoring
main effect detected for evaluations of the efficiency of the fire
department.


Experiment 2⁴


In Experiment 2, we attempted to constructively replicate Experiment 1
by partially following Abraham Maslow’s (1972) suggestion regarding
exercises in deprivation:


All you have to do is to go to a hospital and hear all the simple
blessings that people never before realized were blessings—being able
to urinate, to sleep on your side, to be able to swallow, to scratch
an itch, etc. Could exercises in deprivation educate us faster about
all our blessings? (p. 108)


2 This distinction is presented in the [...]
3 [...] These error terms are for the analyses of
pretreatment-to-posttreatment difference scores.
4 Special thanks are due Margaret Grade[...] McMackin for help in
conducting this experiment and Carol Schultz for her extensive
assistance in collecting and developing stimulus materials.


Women who had completed pretreatment measures earlier in the semester
were asked to imagine vividly a series of events. In the hedonically
negative condition, the events were personal tragedies, whereas in the
comparison condition the events were all positive. After imagining
each event, participants described what they would do, think of, and
feel. Later, they expressed their level of satisfaction with various
aspects of life, including life in general, health, and physical
appearance, on scales identical to the pretreatment measures. Given
the results of an earlier experiment (see Footnote 5), and given that
the latter aspects of life appeared a priori to be most strongly
manipulated across conditions, it was predicted that on a composite of
these measures, pretreatment-to-posttreatment change would be more
positive in the hedonically negative than in the hedonically positive
condition. This prediction, of course, is analogous to the general
contrast prediction, which was strongly corroborated in Experiment 1
and was the major hypothesis tested in the study.

Participants also described their moods. According to a simple
affective model of evaluative responses, the positivity of evaluative
responses should covary directly with the positivity of affective
states (Byrne, 1971, chapter [...]). Both comparison and adaptation
level theories, however, suggest that evaluations on the composite
index would be most positive in the hedonically negative condition.
The mood descriptions also served as a manipulation check.


Method

Participants


Eighty-three women enrolled in introductory psychology at the
University of Wisconsin-Milwaukee participated in the experiment in
exchange for extra credit in their courses. Women were again selected
as a matter of convenience.


Pretreatment Measures


At the beginning of the semester, a survey was distributed that
included a series of questions regarding the respondent's satisfaction
with her life, health, physical appearance, relations with other
people, sex life, and financial situation. Respondents answered each
question by placing a check mark anywhere along the following rating
scale:


Exceptionally   Very           Dissatisfied   Slightly       [...]
dissatisfied    dissatisfied                  dissatisfied


Responses to this 71-point scale were assigned integers from 10
(Exceptionally dissatisfied) to 80 (Exceptionally satisfied). Those
women whose ratings were between 30 and 65 were invited to participate
several weeks later and were randomly assigned to conditions. (Several
women participated although they had not completed the “sex life”
question.)


Procedure


Participants were run in small groups ranging in size from two to
seven persons and responded independently to the materials. Each
session lasted about 2 hours. All instructions and questions were
contained in a single booklet.

Instructions. At the beginning of each session, 
participants were informed that they would be asked 
to read and respond to a series of articles and that 
their responses would be confidential. It was also 
explained that at any time they could decline to 
answer a question or participate further, but would 
receive full credit toward their psychology course. 

Participants were told that the ability of people to vividly imagine
“life events” was being studied. They anticipated reading a series of
life events, imagining the events happening to themselves and
describing their reactions to the events.


5 Before the present series of experiments had been conducted, a study
conceptually similar to Experiment 2 had been run (Fisher, 1976).
Women role played either being slow walkers or having been in an
automotive accident and confined to wheelchairs for the rest of their
lives. After traveling about campus for an hour and discontinuing role
playing, participants in the hedonically negative condition tended to
express more positive judgments on a composite index of satisfaction
with life, health, and physical appearance than participants in the
control condition. For various methodological reasons (e.g., the
absence of pretreatment measures and some participants failing to
comply with the role-playing instructions) the [...] Experiment 2.


Life events. In the hedonically negative condition, participants
successively imagined (a) that they were severely burned and
permanently disfigured, especially their face and hands, as a
consequence of a gas explosion that destroyed their home or apartment
and killed someone they dearly loved; (b) that they were blind; (c)
that they were in an automobile accident that resulted in their
confinement to a wheelchair for the rest of their lives; and (d) that
they were severely suffering from Hodgkin's disease, a cancer of the
lymphatic system that most often attacks young adults. In the
hedonically positive condition, participants successively imagined
that they were (a) winners of an all-expenses-paid tour for themselves
and a friend to southwestern Europe and Morocco; (b) multimillionaires
who with their loved one enjoyed a spectacular world cruise on the
Queen Elizabeth 2; (c) well-paid private secretaries who travelled
through northeastern Europe with their wealthy employer; and (d)
winners of an all-expenses-paid vacation of their own design, for
themselves and a friend, in Missouri.

To stimulate participants’ imaginations, in the negative condition
women read appropriate articles from the medical and rehabilitation
literature, whereas in the positive condition they read travel
brochures. Participants could not proceed to a new life event until
they had read and described their reactions to earlier life events.


Posttreatment Measures 


[...] understand responses. Participants first described how they felt
while role playing, by completing the Nowlis (1970) Mood [...]

[...] understanding of the effects of the treatment and to check for
experimental demand. Although many women wrote extensively, especially
in the hedonically negative condition, no one indicated knowledge of
any of the hypotheses being investigated.


Debriefing 


After the data were analyzed, a complete descrip- 
tion of the study and the findings was mailed to all 
participants. Those wishing additional information 
were invited to discuss the study with the researchers. 
Table 3
Average Mood Scores as a Function of
Experimental Condition and Phase

                          Condition
                 Hedonically   Hedonically   Univariate
                 negative      positive      hedonic tests,
Factor           (n = 42)      (n = 41)      F(1, 81)

During role-playing task
  Elation          5.2           9.8           55.82
  Anxiety          7.8           4.7           39.05
  Sadness          7.8           4.4           45.86
After satisfaction ratings
  Elation          6.3           6.3            0
  Anxiety          5.9           4.1           13.15
  Sadness          5.3           3.9           10.85

Note. High scores indicate greater elation, anxiety, and sadness. The
range of possible scores for elation is 4 to 16, and for the remaining
factors is 3 to [...]. All nonzero F statistics are reliable at p <
.002.


Results 
Manipulation Check 


Multivariate analyses of variance for mood during the role-playing
task, F(3, 79) = 34.40, p < .0001, and after the satisfaction ratings
were made, F(3, 79) = 5.09, p < [...], indicated reliable hedonic
effects. As indicated in Table 3, participants’ reports of mood were
generally more negative in the hedonically negative than in the
hedonically positive condition.


Major Analyses 


Satisfaction composite. The three satis 
tion measures hypothesized to be most Í 
fluenced by the treatments were averag 
form a composite (the alpha coefficients 
the pretreatment and posttreatment inde" 
were, respectively, .52 and .59). An 
of variance of pretreatment-to-posttreat™ i 
differences revealed, as predicted, greater 
tivity in judgments of satisfaction call 
hedonically negative than in the hei ' 
Positive condition, F(1, 81) = 3.32, ? able 
directional alternative. As indicated in pi 
4, participants’ judgments of satisfactio oni 
creased an average of 4.9 points in the hed 


le 4 
wage Judgments of Satisfaction 
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ly negative condition but only 2.1 points in 
the positive condition, The taste for pee 
a ee eat simple effects, in the 
er columns of Table 4, reveal the he- 
oi 4 negative condition to be associated 

= ‘argest.increment in evaluations. 
It will be recalled that according to a sim-
ple affective model of evaluation, the positiv-
ity of evaluations increases with the positiv-
ity of affective states. The present findings
contradict this model, since judgments of satis-
faction were most positive in the hedonically
negative condition. Nor does it appear that
the positivity of evaluative responses increases
with change toward more positive affective
states. According to such a revised model, the
participants in the hedonically negative con-
dition judged their lives most positively be-
kse, resulting from 
l ole-playing task. Con- 
oo Ae revised model, as participants’ 
e hedonically negative condition 


creas 
terms aon first to second assessment) in 
Mentto-n sess and anxiety, their pretreat- 


ection on cement changes in judged sat- 
r(40) = e composite, also di 
BS thd <5, fon s E 


30, pe sadness; r(40) = 

did not — z anxiety, . Changes in pS 

tte composite (r = —.01) with changes’ in 
Individual items. For each item, pretreat-
ment-to-posttreatment difference scores were


                          Pre-       Post-
Item                   treatment  treatment  Difference  F(1, 81)     p

Hedonically negative condition (n = 42)
  Composite               53.6       58.5        4.9       20.37    .00002
  Life                    55.0       61.1        6.1       18.78    .00004
  Health                  54.3       61.2        6.9       15.96    .0001
  Physical appearance     51.4       53.2        1.8        1.77    .19
Hedonically positive condition (n = 41)
  Composite               53.8       55.9        2.1        3.65    .06
  Life                    55.0       56.3        1.3        0.83    .36
  Health                  55.3       57.8        2.5        2.04    .16
  Physical appearance     51.0       53.6        2.6        3.61    .06

Note. Ratings range from 10 (Exceptionally dissatisfied), through 45, the midpoint, to 80 (Exceptionally satisfied).


calculated. The omnibus multivariate hedonic 
test was reliable, F(3, 79) = 2.65, p = .05, as
were the univariate tests for satisfaction with 
life, F(1, 81) = 5.63, p < .01, directional alter-
native, and health, F(1, 81) = 3.23, p < .04,
directional alternative. As indicated in Table 
4, pretreatment-to-posttreatment increments 
in judged satisfaction with life and health 
were greater in the hedonically negative than 
in the hedonically positive condition. The 
weak trend for judgments of satisfaction with 
physical appearance is contrary to that hy- 
pothesized and, of course, is not reliable, using 
a directional statistical decision rule, F(1, 81)
= .15.
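
The difference scores reported in Table 4 can be checked with a short sketch. It assumes only the pretreatment and posttreatment means printed in the table; the dictionary layout and function names are illustrative and are not part of the original analysis.

```python
# Pretreatment and posttreatment means as printed in Table 4.
table4 = {
    "negative": {"life": (55.0, 61.1), "health": (54.3, 61.2), "appearance": (51.4, 53.2)},
    "positive": {"life": (55.0, 56.3), "health": (55.3, 57.8), "appearance": (51.0, 53.6)},
}


def item_differences(condition):
    """Posttreatment minus pretreatment mean for each item, to one decimal."""
    return {item: round(post - pre, 1) for item, (pre, post) in table4[condition].items()}


def composite_difference(condition):
    """Mean of the three item differences (the composite averages the items)."""
    diffs = list(item_differences(condition).values())
    return round(sum(diffs) / len(diffs), 1)


for condition in table4:
    print(condition, item_differences(condition), composite_difference(condition))
```

Averaging the item differences reproduces the composite increments cited in the text, 4.9 points in the hedonically negative condition and 2.1 points in the positive condition.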

Similar analyses of pretreatment-to-post- 
treatment change were conducted for judg- 
ments of satisfaction with relations with other 
people and financial situation. Whereas the 
means for the relations item were in the di- 
rection specified by comparison level theory, 
F(1, 81) = 2.00, p < .08, directional alterna- 
tive, the means for the financial item were in 
the opposite direction, F(1, 81) = .02. Fur- 
thermore, although the absence of pretreat- 
ment measure scores for some participants 
precluded a pretreatment-to-posttreatment 
analysis of satisfaction with sex life, the post- 
treatment means were also opposite to what 
would be expected if the comparison level were
a generalized standard, F(1, 81) = .71.

Discussion
Satisfaction Composite 


The finding that pretreatment-to-posttreat- 
ment changes in judgments of satisfaction 
were reliably more positive in the hedonically 
negative than in the hedonically positive con- 
dition corroborates the major prediction de- 
rived from comparison level theory. Further- 
more, it is interesting to note that as in Ex- 
periment 1 (bad old days condition), the 
hedonically negative condition was associated 
with the largest increments in evaluations. 
Although the stimulus material was not 
judged by participants in terms of hedonic 
value (as in Experiment 1), the greater po- 
tency of the hedonically negative material 
may have resulted from its departing more 
from participants’ pretreatment comparison 
levels than did the hedonically positive in- 
formation. 

The findings for the satisfaction index do 
not appear to corroborate Byrne’s (1971) af- 
fective model of evaluation, since judgments 
of satisfaction were greatest in the most af-
fectively negative condition. Obviously, more
research is needed to delimit the generally ac-
cepted positive relationship between affective
states and evaluations.


Individual Items 


The contrast hypothesis was corroborated 
for two—life and health—of the three items 
comprising the satisfaction composite. Evalua- 
tive judgments of physical appearance were 
virtually uninfluenced, although the within- 
cell experimental error for this item (see Foot- 
note 3) was slightly smaller than the error 
associated with the items yielding reliable 
contrast effects. Furthermore, it will be re- 
called that satisfaction regarding three aspects 
of life besides the composite items were mea- 
sured and analyzed, but contrast effects were 
not generally detected. Although comparison 
level theory may be of heuristic value, these 
findings do not appear consistent with the as- 
sumption that all outcomes are judged in 
terms of a single utility scale and standard— 
the comparison level.

General Discussion
Summary of Major Findings 


These experiments corroborate the hypoth-
esis that vicarious exposure to hedonic ex-
tremes, especially the hedonically negative,
results in contrast effects regarding evalua-
tive judgments of aspects of life that have
evolved or been acquired in the course of
life beyond the laboratory. Although the pres-
ent results are generally consistent with com-
parison level theory, the findings appear bet-
ter described by an adaptation level model of
judgment (Upshaw, 1969, p. 347). Further-
more, the finding that the hedonically nega-
tive condition of Experiment 2 was associated
with the most negative moods, but the most
positive judgments, appears to contradict
Byrne’s (1971) affective model of evaluation.


Potential Limitations 


Participant sex. Although only women
participated in the present studies, the find-
ings should hold for males, since contrast ef-
fects have been detected with males in other
studies (e.g., Brickman, 1975).

Treatment complexity. The treatments used
in the present experiments were complex. It
is, therefore, not precisely known what as-
pects of the treatments, especially the he-
donically negative treatments, mediated ef-
fects. Clearly, participants were not simply
exposed to stimuli. For example, across both
experiments, participants described reactions
to the stimulus materials before evaluating
aspects of life.

Magnitude of effects. In relation to the
maximum contrast effects that could be de-
tected in these experiments, the effects of the
hedonically negative materials, though sta-
tistically reliable, were small. A number of
considerations are relevant to this result.

1. Several participants indicated that the
hedonically negative materials had created a
conflict. On the one hand, they were moved
to increase the positivity of their judgments,
yet they felt that this was inappropriate
since their judgments were typically
positive. These participants reported mak-
ing their more typical judgments on the
posttreatment measures, thus hindering the de-
tection of large contrast effects.
2. In designing these experiments, it was
assumed that most participants had not been
frequently or recently exposed to hedonically
negative information and that the hedonic
value of such information consequently was
substantially below participants’ pretreat-
ment comparison levels. These considerations
would suggest the production of large con-
trast effects. According to Thibaut and Kelley
(1959), however, the weights associated with
the comparison level reflect the “salience” of
outcomes, where “salience” refers to the ex-
tent that an individual might think about
outcomes before judgment. But, as illustrated
by the mood ratings in Experiment 2, hedoni-
cally negative outcomes may have immediately
aversive properties. Thinking, for example,
of injury, disease, or death generally elicits
aversive emotional responses, amounts to self-
punishment, and may therefore ordinarily be
avoided. Consequently, the salience and
weighted contribution of the hedonically nega-
tive to the comparison level may be small.
a speaking comparison and adapta- 
s are weighted averages. The de- 
hants of values and weights, however, 
a ee precisely specified. Although 
a peel negative information 
B appears 4 hour, as in Experiment 2; 
erioa ia, the contribution to 
ce these a aptation levels may be small, 
licipant’s ee reflect all of a par- 
lous experience. 


B 
f pe the Hedonically Positive 
P i 
à aa appear, in general, to have 
ented EA individuals are primarily ori- 
Making ase, s the hedonically positive when 
logica] basi essments, although the psycho- 
Ways been ‘ for this assumption has not al- 
his npn te Festinger (1954) made 
e ei for the assessment of abilities 
nselyes acc, that individuals compare 
touS research those performing better. Nu- 
Son betwee; instruments involve a com- 
Or her » 2 Tespondent’s description of 
"Nd the oo status and “ideal” status, 
cept of relative deprivation, of
course, depends upon an individual not hav-
ing that which he or she desires (see Cook,
Crosby, & Hennigan, 1971). But if our the- 
ories are general, they should also explain 
when and how the hedonically negative may 
influence judgment. The young students we 
most often study may be bombarded with ad- 
vertisements rendering the hedonically posi- 
tive salient, and they may believe that they 
will always be healthy and their outcomes 
constantly improving; but as theorists we 
should not be blinded by these aspects of our 
culture. 

Thibaut and Kelley (1959) attempted to 
specify a psychological theory of outcome 
salience in proposing that salient outcomes 
are those that an individual believes he or 
she can to some degree control. Since vicarious 
exposure to the hedonically positive may pro- 
vide information about response-positive rein- 
forcement contingencies (see Berger, 1977) 
and consequently mediate positive reinforce- 
ment, it is not surprising that individuals may 
attend to those achieving more positive out- 
comes and think about these activities and 
subsequent outcomes. Such behavior may 
momentarily decrease judgments of satisfac- 
tion with current outcomes but may mediate 
more positive future outcomes. Eventually, 
however, there may be little or no possibility 
of achieving the hedonically positive. This
may be true, for example, because of injury,
disease, old age, or social constraints. Under 
these circumstances, the hedonically negative 
may become salient even though it may 


6 Several months after Experiment 2 was com-
pleted, but before debriefing, we contacted partici-
pants by telephone and solicited their reactions to
the experiment. The participants in the hedonically
positive condition appear to have had some trouble
recalling the experiment and generally did not ex-
press much interest in it. The participants in the
hedonically negative condition, however, appeared
to have little trouble recalling the study, thought
participation was a valuable experience, and were
eager to learn more about the study. Although
thinking about the hedonically negative was aver-
sive, the participants almost without exception
deemed the experience valuable. Several partici-
pants reported being happier with their lives a few
days after participation. One participant, however,
reported discerning the major hypothesis the day
following the experiment.

momentarily elicit aversive emotional behav-
ior. The hedonically negative may provide in-
formation about response-negative reinforce- 
ment contingencies and thus facilitate an in- 
dividual’s avoiding outcomes far more aver- 
sive than those immediately resulting from 
vicarious exposure to, or thinking about, the 
hedonically negative. And, of course, as il- 
lustrated in the present experiments, such in- 
formation would be expected to enhance judg- 
ments of satisfaction with present life out- 
comes. 

Finally, it is interesting to note the applied 
implications of the present experiments. The 
students in Experiment 1 do not appear to 
have had much knowledge of the plight of the 
masses and the details of day-to-day life in 
earlier times. American historians only as re-
cently as the 1960s have come to write his-
tory from the “bottom up” (see Thernstrom,
1964), although they were long aware that 
history has ordinarily been written and 
taught from the perspective of the bourgeoisie 
and aristocracy. The current findings suggest 
that it might be beneficial to incorporate de- 
scriptions of the lives of ordinary people into 
primary and secondary history curricula. Cer- 
tainly 19th-century novelists have provided 
graphic descriptions. To paraphrase Sir Walter 
Raleigh, one may learn to be appreciative 
from comparing people’s forepast miseries 
with one’s own like errors and ill deservings. 
Similarly, the results of Experiment 2 suggest 
that we might more likely count our blessings 


if we were not so isolated from the hedonically 
negative.


Two experiments were conducted to assess (a) differences in the information
available to persons trying to understand the causes of their own behavior
(actors) versus those trying to understand the causes of another’s behavior
(observers) and (b) the effects of information differences on causal explana-
tions. In Experiment 1, actors reported positive behaviors to be less distinctive
and more consistent with past behavior than did observers, whereas the reverse
was true for negative behaviors. Consistent with this difference, actors at-
tributed desirable behaviors more to their own internal dispositions than did
observers, whereas the opposite occurred for undesirable behaviors. In Experi-
ment 2, when all subjects were given the consensus, distinctiveness, and con-
sistency information generated by actors in Study 1, both actors and observers
attributed positive acts more to internal factors than negative acts. When given
the information generated by the observers, neither actors nor observers ex-
hibited this bias. Thus, when given the same information, actors and observers
no longer showed differences in causal explanations.


Recent theory and research have suggested
that under a variety of circumstances, causal
explanations for one’s own behavior differ
from explanations of the same behavior per-
formed by someone else (Jones & Nisbett,
1972; Monson & Snyder, 1977). Jones and
Nisbett (1972) discussed three factors that
may contribute to these differential percep-
tions of causality. They proposed that actors
and observers differ in their visual perspec-
tive, motivation, and the information
available to them. Effects of visual perspective
have received much attention, and results have
generally supported the contention that actors
tend to see the causes of their behavior as
environmental, whereas observers view the
causes of the same behavior as originating
within the actor (Arkin & Duval, 1975; Nis-
bett, Caputo, Legant, & Marecek, 1973; Re-
gan & Totten, 1975; Storms, 1973). Similarly,
much research has focused on actor-observer
attribution differences that appear to be moti-
vationally based (e.g., Arkin, Gleason, &
Johnston, 1976; Bradley, 1978; Miller &
Ross, 1975; Sicoly & Ross, 1977; Snyder,
Stephan, & Rosenfield, 1976; Stevens & Jones,
1976).


This article … in partial fulfillment of the requirements for the
PhD degree. … Thanks also go to … of this manuscript.


In contrast, the question of the effects of 
actors’ and observers’ informational differences 
on causal attribution has been relatively ne-
glected. It is this question that the present
research addresses. Jones and Nisbett (1972)
suggested that because they know their own 
history, actors are in a better position than 
observers to assess how distinctive and con- 
sistent a current behavior is compared with 
past behavior. Although observers may know 
little about someone else’s past behavior, they 
probably have an idea of whether they them-
selves would (or did) perform the behavior
(consensus). Thus, observers may evaluate 
an actor’s behavior by comparing it with the 
behavior of others, while actors may evaluate 
their own behavior by comparing it with past 
behavior. These three types of information 
(distinctiveness, consistency, and consensus) 
were first proposed as useful for causal at- 
tribution by Kelley (1967) and were found to 
have a significant impact on perceptions of 
causality (Hansen & Lowe, 1976; McArthur, 
1972; Orvis, Cunningham, & Kelley, 1975). 

The difference in information available to 
actors and observers may contribute to di- 
vergent perceptions of causality (Jones & Nis- 
bett, 1972). For example, if Mr. Smith fails to 
show up for a meeting, his supervisor may in- 
fer that Smith’s unreliable behavior is wide- 
spread (low in distinctiveness and high in 
consistency) and that hardly anyone else fails 
to attend (low consensus). In line with Kel- 
ley’s (1967) model, the supervisor would at- 
tribute Smith’s transgression to his unreliable 
nature. On the other hand, Smith may know 
that he rarely misses meetings and attribute 
his absence to external factors. Consistent 
with the difference in information available 
to actors and observers, Hansen and Lowe 
(1976) found that in attributing causality for 
a behavior, actors relied more heavily on dis- 
tinctiveness and consistency information, 
whereas observers favored consensus. How- 
ever, to date there have been no attempts to 
ascertain whether actors think their behavior 
is more distinctive and less consistent with 
past behavior than observers think it is. 
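
The Smith example can be made concrete with a small sketch. The decision rule below follows a common textbook reading of Kelley's (1967) covariation model; the function name, the 50% cutoff, and the category labels are assumptions made for illustration and are not taken from the article.

```python
def covariation_attribution(consensus, distinctiveness, consistency, cutoff=50):
    """Classify the likely cause of a behavior from three percentage estimates.

    Each argument is a 0-100 estimate; `cutoff` (an assumed threshold)
    separates "low" from "high" information.
    """
    def high(value):
        return value >= cutoff

    if not high(consistency):
        # Behavior that rarely recurs points to the particular circumstances.
        return "circumstance"
    if not high(consensus) and not high(distinctiveness):
        # Few others do it, and the actor does it everywhere: the person.
        return "person"
    if high(consensus) and high(distinctiveness):
        # Most people do it, and only in this situation: the stimulus.
        return "stimulus"
    return "ambiguous"


# Supervisor's assumed view of Smith's absence: widespread unreliability
# that hardly anyone else shows.
supervisor_view = covariation_attribution(consensus=10, distinctiveness=20, consistency=90)

# Smith's own view: he rarely misses meetings, so the absence is inconsistent
# with his past behavior.
smith_view = covariation_attribution(consensus=10, distinctiveness=80, consistency=10)

print(supervisor_view, smith_view)
```

With these hypothetical estimates the supervisor arrives at a person attribution, whereas Smith arrives at an external (circumstance) attribution, which is the divergence the paragraph describes.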

The purposes of the present research were 
(a) to investigate actors’ and observers’ re- 
ports of how they think others would behave 
in given situations (consensus) and their 
knowledge or assumptions about the distinc- 
tiveness and consistency of given behaviors; 
(b) to assess actors’ and observers’ causal at-
tributions for a variety of behaviors; and
(c) to examine the effect on causal attribu- 
tions of providing actors with the information 
reported by observers, and vice versa. 

Assessment of actor—observer differences in 
information assumptions was accomplished in 
Experiment 1 by having actors and observers 
estimate consensus, distinctiveness, and con-
sistency information for a series of behaviors.
Causal attributions were studied in Experi-
ment 2 by giving one group of subjects no in-
formation, another group the information gen-
erated (assumed) by actors in Study 1, and
a third group, the information generated by
observers in Study 1. This method is analo-
gous to that used by Storms (1973). But
rather than reversing visual perspectives, ac-
tors’ and observers’ “informational perspec-
tives” were reversed.


Experiment 1 
Experiment 1 was designed to assess dif-
ferences in actors’ and observers’ estimates
of consensus, distinctiveness, and consistency
surrounding each of 12 behaviors. For this
purpose, 12 “personality-achievement inven-
tory” items were devised, each indicative of
a different behavior. Three valences (positive,
neutral, and negative) and four categories
(emotions, beliefs, actions, and accomplish-
ments) of behavior were employed, yielding
one item for each combination of valence and
category. All subjects completed the items
and were provided with false feedback inform-
ing them of the behavioral dimension each
item measured. Actors’ feedback referred to
their own responses, while observers’ feedback
referred to the responses of another subject
whose completed items they were shown.
Based on the feedback, actors generated con-
sensus, distinctiveness, and consistency for
each of their own responses, while observers
generated the same types of information for
their matched actor’s responses.

Choice of this methodology was guided by
two major considerations: (a) In order for
me to be able to generalize the findings, in-
vestigation of a variety of behaviors was de-
sired, and (b) relatively novel behaviors, for
which actors are not likely to have a
store of distinctiveness and consistency in-
formation, had to be chosen in order for the
information manipulations used in Experi-
ment 2 to be believable. Since taking a
“personality-achievement inventory” is a
fairly novel experience in the lives of most
people, it appeared to be a good mechanism
for accommodating these considerations.

From Jones and Nisbett’s (1972) sug-
gestion, it was predicted that (a) actors would
estimate higher distinctiveness and lower con-
sistency for their responses than would ob-
servers and (b) actors would be more certain
of their distinctiveness and consistency esti-
mates than would observers.

Predictions for consensus information
were more complicated. Since observers as well
as actors completed the inventory items, they
could tell whether they gave the same response
as the actor whose completed items they were
shown. Since only 12 items were employed,
observers could easily recall their responses.
Thus, observers already had “self-based” con-
sensus information (Hansen & Donoghue,
1977). Consequently, it was predicted that
(c) observers who gave the same response on
an item as their matched actor (high self-
based consensus) would estimate higher con-
sensus than would actors, whereas observers
who gave a different response (low self-based
consensus) would estimate lower consensus
than would actors, and (d) observers would
be more certain of their consensus estimates
than actors.

The experiment was a 2 × 2 × 3 × 4 design
(Sex [male, female] × Role [actor, observer] ×
Valence [positive, neutral, negative] ×
Category [emotions, beliefs, actions, ac-
complishments]).


Table 1

                          Behavioral category
Valence    Emotions       Beliefs        Actions       Accomplishments

Positive   Happy/         Progressive/   Helpful/      Verbal reasoning
           Unhappy        Conservative   Unhelpful     Correct/incorrect
Neutral    Excitable/     Skeptical/     Reserved/     Language skills
           Unexcitable    Naive          Outspoken     50%
Negative   Affable/       Tolerant/      Reliable/     Spatial relations
           Hostile        Intolerant     Unreliable    Correct/incorrect

Note. The italicized item represents the feedback given to subjects regarding their own (actors) or their matched actor’s (observers) response.
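
The item structure of Experiment 1, one item for each combination of three valences and four categories, can be enumerated with a minimal sketch. The item numbering and the dictionary keys are hypothetical; only the valence and category labels come from the text.

```python
from itertools import product

VALENCES = ("positive", "neutral", "negative")
CATEGORIES = ("emotions", "beliefs", "actions", "accomplishments")

# One inventory item per valence-category combination (numbering is illustrative).
items = [
    {"item": number, "valence": valence, "category": category}
    for number, (valence, category) in enumerate(product(VALENCES, CATEGORIES), start=1)
]

print(len(items))
for item in items:
    print(item)
```

Crossing the two factors yields the 12 items described above, with every valence represented once in each behavioral category.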


Subjects 


Eighteen male and 18 female undergraduates vol- 
unteered to serve as subjects for research on “per- 
sonality assessment,” in partial fulfillment of an in- 
troductory psychology course requirement. Half the 
subjects of each sex were randomly assigned to serve 
as actors; the remaining subjects served as observers 
and were each “yoked” to a same-sex actor. That is, 
each observer was shown the responses of a same- 
sex actor and generated information for that actor’s 
responses. Subjects were run individually by a fe- 
male experimenter. 


Personality-Achievement Items 


Twelve forced-choice self-report items from a sup- 
posed “personality-achievement inventory” were con- 
structed. Each item indicated a positive, neutral, or 
negative emotion, belief, action, or accomplishment. 
The nine personality behavior dimensions (emotions, 
beliefs, and actions) were selected from Anderson’s 
(1968) likableness ratings of 555 adjectives. Positive 
behaviors were rated in the top 100, neutral be- 
haviors between 240 and 300, and negative behaviors 
in the bottom 100. The behavioral dimensions se- 
lected appear in Table 1. Although the dimensions 
could be conceived of as traits, the items themselves 
dealt with specific, concrete situations. For example, 
the following item represented the happy-unhappy 
(positive emotion) personality dimension: 


A. When I feel depressed, my friends are usually
able to cheer me up.
B. When I’m dating someone special, I feel more
cheerful than usual.


The three achievement items were actual tasks: an 
analogy, an anagram, and an embedded-figure prob- 
lem. 
Three randomly ordered blocks of four items each 
were constructed, with the restriction that an achieve-
ment task was always the last item in each block.
This was done to facilitate timing of the problems. 
(The personality items were not timed.) Each of the 
three blocks was randomly ordered to yield three 
orders of items. 


Procedure 


Subjects were told that we were studying person- 
ality assessment techniques and that they would be 
asked to complete 12 items from a larger, widely 
used personality-achievement inventory. The items 
would be scored, the subjects would be told the re- 
sults, and they would be asked some questions 
about their responses. Participants in the experiment
were then asked to read the instructions, complete 
the first page containing three personality items, look 
at the example of the problem that was to follow 
on the next page, and then call the experimenter to 
time the “test” problem. The experimenter informed 
subjects how much time they would have to do the 
problem, instructed them to begin, and informed 
them when time was up. This procedure was fol- 
lowed until all 12 items were completed. 

Actors. After completing the items, actors were 
left alone for approximately 3 minutes while the ex- 
perimenter scored their inventories. Upon returning, 
the experimenter gave the subjects the scored items, 
along with written feedback and the dependent-mea-
sure questionnaire.

Observers. Upon completion of the inventory 
items, observers were told that their tests would be 
scored and feedback would be given later, but in 
the meantime, the experimenter would like them to 
look at someone else’s completed and scored inven-
tory and answer some questions about the other per-
son’s responses. The experimenter left the room, re-
turning several minutes later with a same-sex actor’s
scored inventory and the dependent-measure ques-
tionnaire.


Feedback 


The feedback (scoring) was presented in the de- 
pendent-measure booklet. At the top of each page 
appeared a personality or achievement dimension 
represented by one item, followed by a sentence 
specifying which end of the dimension the subject’s 
response indicated. The dimensions were numbered
to correspond to the numbers of the inventory items. 
Below is an example of one dimension and the feed- 
back given: 


Subscale: HAPPY-UNHAPPY
On this item you (actors)/this student (observers)
gave the happy/unhappy response.


For each item, the end of the dimension that the 
response supposedly indicated (the italicized word) 
was circled in red ink to appear as if the feedback 
were tailored to the individual’s response. In fact, 
everyone received the same feedback regardless of
how the item was answered. Thus, even though sub-
jects did not answer all the items in the same way,
all were told that their responses indicated the same
behavior. Debriefing revealed that subjects believed
that the items were from an actual personality-
achievement inventory and that the feedback was
valid. The experimenter went through a sample item,
explaining how the items were scored, ensuring that
subjects understood the instructions.


Dependent Measures 


Based on the feedback, subjects were asked to
estimate for each item (a) what percentage of the
other subjects in the experiment also gave the X
response on that item (consensus information); (b)
if they took the whole inventory (rather than just
the 12 items selected), on what percentage of the
items in that subscale they (or their matched actor)
would also give the X response (distinctiveness in-
formation); and (c) if they (or their matched actor)
took the inventory at several different times, what
percentage of the time they (or their matched actor)
would give the X response on that item (consistency
information).

Inferences were made on a 9-point scale from 10%
to 90% of the people, items, and time for consensus,
distinctiveness, and consistency information, respec-
tively. Subjects also rated on a 9-point scale how
certain they were about each of their information
estimates.
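
The 9-point response format can be illustrated with a minimal sketch. It assumes that the nine scale points were evenly spaced in 10% steps; the article states only the endpoints (10% and 90%), and the names and the sample response below are hypothetical.

```python
# Assumption: nine evenly spaced response options, 10% through 90%.
SCALE_POINTS = list(range(10, 100, 10))


def to_percentage(scale_point):
    """Convert a response option (1-9) to the percentage it stands for."""
    if scale_point not in range(1, 10):
        raise ValueError("scale point must be between 1 and 9")
    return SCALE_POINTS[scale_point - 1]


# One hypothetical subject's estimates for a single item.
response = {"consensus": 6, "distinctiveness": 3, "consistency": 8}
percentages = {kind: to_percentage(point) for kind, point in response.items()}
print(percentages)
```

Under this assumption each scale point maps onto one percentage, so the means reported in the Results can be read directly as estimated percentages of people, items, or occasions.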


Results 
Data Analysis 


Information estimates and certainty data
were analyzed using a 2 × 2 × 3 × 4 (Sex of
Subject × Role × Valence × Category)
analysis of variance. Sex of subject was a be-
tween-subjects factor; the remaining three
independent variables were within-subject
factors. Since actors and observers were yoked,
role was considered to be a within-subject
factor, and actor-observer pairs were the unit
of analysis.

Since more than one dependent measure
was obtained, use of multivariate analysis of
variance was considered. However, because of
the difficulty in interpreting multivariate
analysis in which there are repeated measures
on more than one factor and multiple levels
of the independent variables, univariate analy-
ses were done. An attempt was made to guard
against a Type I error by adopting the more
stringent Greenhouse-Geisser criterion for re-
jection of the null hypothesis (Winer, 1…).


All effects reported are either still significant
according to this stringent criterion, or if not,
an Fmax test for homogeneity of variance was
performed and found to be nonsignificant.
Three-way or higher order interac-
tions that are not psychologically meaningful
are not reported.


Information Estimates


Distinctiveness information. As predicted,
actors estimated their responses to be sig-
nificantly higher in distinctiveness than ob-
servers did, F(1, 16) = 5.48, p < .05. How-
ever, a Role × Valence interaction,
F(2, 32) = 14.45, p < .001, modified this
main effect, indicating that only for negative
behaviors did actors infer higher distinctive-
ness than did observers (p < .05; see Figure
1). Actors estimated significantly higher dis-
tinctiveness for negative than neutral behav-
iors and significantly lower distinctiveness for
positive than neutral behaviors (ps < .05).
Observers estimated higher distinctiveness for
negative than neutral behaviors (p < .05),
but there was no significant difference in their
estimates for positive and neutral behaviors.
A significant main effect of valence was
also obtained, F(2, 32) = 37.34, p < .001.
The more positive the behavior, the lower the
estimated distinctiveness.

[Figure 1 appears here: estimated distinctiveness and estimated consistency plotted against valence (positive, neutral, negative) for actors (X) and observers (+).]

Figure 1. Actors’ and observers’ distinctiveness and consistency estimates for three valences of
behavior. (The higher the number, the lower the estimated distinctiveness.)



Consistency information. Contrary to pre- 
diction, the main effect of role on consistency 
estimates did not approach significance. Both 
actors and observers estimated that consist- 
ency of behavior (how often they would give 
the same response on the same item) would 
be fairly high (see Figure 1). However, a sig- 
nificant Role x Valence interaction was ob- 
tained, F(2, 32) = 6.71, p < .001. Consistent 
with the findings for distinctiveness informa- 
tion, actors inferred lower consistency than 
did observers only for undesirable behaviors, 
although the difference was not significant. 
Contrary to prediction, actors estimated sig-
nificantly higher consistency than did ob-
servers for desirable behaviors (p < .05).
Actors’ consistency estimates for positive,
neutral, and negative behaviors all differed
significantly from each other (ps < .05). Ob-
servers generated lower consistency for nega-
tive than neutral responses (p < .05), but
there was no difference in their estimates for
positive versus neutral responses.

As for distinctiveness inference, a highly
significant valence main effect was also found,
F(2, 32) = 32.30, p < .001. Both actors and
observers generated highest consistency for
positive behaviors and lowest consistency for
negative behaviors.

Consensus information. As predicted, ob-
servers used their knowledge of their own be-
havior (how they themselves responded to
each item) to estimate how many others would 
perform the same behavior. For each item, 
observers’ responses were divided into two 
groups—those reflecting high self-based con- 
sensus (if their response was the same as that 
of their matched actor) and those indicating 
low self-based consensus (if their response was 
different from that of their matched actor). 
Newman-Keuls comparisons of the mean con- 
sensus estimates for these two groups of ob- 
servers’ responses with the mean consensus 
estimates of actors supported the prediction. 
For items on which observers gave the same 
response as their matched actors, generated 
consensus (M = 69) was higher than for ac- 
tors (M = 57), whereas for items on which 
observers differed, generated consensus (M = 
52) was lower than for actors. Each of the 
means differed significantly from the others 
(ps < .01). 
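
The split of observers' responses into high and low self-based consensus can be sketched in a few lines of Python. This is only an illustration of the grouping rule described above: the record layout, the key names, and the numbers are invented for the example and are not data from the study.

```python
from statistics import mean


def split_by_self_based_consensus(observer_items):
    """Split observers' item-level consensus estimates into two groups.

    Each record is a dict with hypothetical keys:
      'own_response'   - how the observer answered the item
      'actor_response' - how the matched actor answered the item
      'consensus'      - the observer's estimate (0-100) of how many
                         others would give the actor's response
    """
    high, low = [], []
    for rec in observer_items:
        if rec["own_response"] == rec["actor_response"]:
            high.append(rec["consensus"])  # high self-based consensus
        else:
            low.append(rec["consensus"])   # low self-based consensus
    return high, low


# Invented example records (not from the study).
records = [
    {"own_response": "yes", "actor_response": "yes", "consensus": 70},
    {"own_response": "no", "actor_response": "yes", "consensus": 50},
    {"own_response": "yes", "actor_response": "yes", "consensus": 68},
    {"own_response": "yes", "actor_response": "no", "consensus": 54},
]
high, low = split_by_self_based_consensus(records)
print("high self-based consensus mean:", mean(high))
print("low self-based consensus mean:", mean(low))
```

The two group means would then be compared with the actors' mean consensus estimate, as in the Newman-Keuls comparisons reported above.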

A significant main effect of valence was 
also found, F(2, 32) = 12.18, p < .01. The
more positive the behavior, the higher the 
estimated consensus (M = 65 for positive, M 
= 59 for neutral, M = 54 for negative be- 
haviors). Again, each mean differed signifi- 
cantly from each of the others (ps < .05).* 


Certainty 


As predicted, observers having high self-based consensus were
significantly more certain than actors of their consensus estimates,
t(18) = 2.27, p < .05. Observers having low self-based consensus were
no less certain than actors, t(18) < 1. Also as expected, actors were
more certain of their distinctiveness and consistency estimates than
were observers, F(1, 16) = 6.20, p < .025, for distinctiveness, and
F(1, 16) = 7.12, p < .025, for consistency.


Significant main effects of valence were also obtained on certainty
for each type of information (ps < .01). The more positive the
behavior, the more certain subjects were of their information
estimates.


Discussion 


The actor-observer differences in distinctiveness and consistency
estimates support Jones and Nisbett's (1972) suggestion that the
different information generally available to actors and observers may
lead to divergent assumptions about the consistency of behavior across
time and stimuli. However, contrary to prediction, actors showed no
general tendency to think that their behavior was more distinctive and
less consistent than observers thought it was. Only negative behaviors
were viewed as more distinctive by actors than observers; positive
behaviors were seen as more consistent by actors than observers. Thus,
although actors and observers made different assumptions about the
distinctiveness and consistency of a given behavior, the specific
differences found were not consistent with the notion that actors
"know" that a given behavior is more distinctive and/or less
consistent than an observer thinks it is. Rather, the differences in
information inference were primarily determined by desirability of the
behavior.

Focusing on causal attributions for positive and negative behaviors,
Miller and Ross (1975) and Monson and Snyder (1977) have suggested
that actor-observer differences in attribution may derive from
informational differences. Actors may attribute "good" behaviors more
to themselves because they selectively remember past instances of
desirable behavior (Monson & Snyder, 1977). Such selective recall
would allow for an actor's inference of high consistency and low
distinctiveness for positive acts, and low consistency and high
distinctiveness for negative acts. Results of the present study
support this notion: Actors generated higher distinctiveness than
observers for negative responses and higher consistency than observers
for positive responses.

However, the question that comes to mind is, What could be the basis
for actors' selective recall? There is no compelling cognitive reason
for actors to recall desirable behaviors more than undesirable ones.
It could be suggested that positive behaviors are perceived as more
common by actors than by observers. If this were the case, actors'
distinctiveness and consistency estimates might be logical inferences
based on perceived frequency of occurrence of the behaviors. (If
people are generally helpful, and I give a helpful response, then it
would be logical to think that I would usually give a helpful
response.) Low distinctiveness and high consistency would be expected
for frequently occurring behaviors. The valence main effect for
consensus estimates suggests that positive behaviors were indeed seen
as most common (highest generated consensus), followed by neutral and
negative behaviors, respectively. However, actors did not see positive
behaviors as more common than observers did; nor did they see negative
acts as more unusual. Thus, the relative frequency of positive
behaviors and uniqueness of negative behaviors does not provide an
adequate cognitive explanation for the actor-observer differences in
distinctiveness and consistency inference.

* Significant Valence X Category interactions were also obtained for
consensus, distinctiveness, and consistency estimates. They did not
appreciably change the implication of the valence main effects, but
suggest that the strength of the valence effect varied somewhat as a
function of type of behavior. Valence X Category interactions were
also [found] on measures of causal attribution in Experiment 2.

The failure to find a compelling cognitive explanation for the results
suggests that actors' information estimates may be motivationally
biased. Sicoly and Ross (1977) recently found evidence for
self-serving biases in attribution that could not be explained in
[illegible] terms. The present results suggest that such attribution
biases may be [illegible] biased estimates of the distinctiveness and
consistency of behavior, which are influenced by actors' motivations.


Experiment 2 


The purpose of the second experiment was to [examine] actor-observer
differences in causal attribution and to determine whether the
discrepancies in information inference found in Experiment 1 influence
attributions. The design was [similar] to that of Experiment 1, with
the additional between-subjects factor of information. One group of
actors and observers was [given] the consensus, distinctiveness, and
consistency information estimated by actors in Experiment 1. Another
group was given the information estimated by observers in
Experiment 1. A third group was not given any information. All
subjects made causal attributions for their own or their matched
actor's behavior. The following predictions were made:

1. Actors given no information would at- 
tribute positive behaviors more to internal 
causes than would observers, and negative be- 
haviors more to external causes than observers 
would. This prediction was based on the Role 
X Valence interactions on distinctiveness and 
consistency estimates found in Experiment 1. 

2. Since actors’ and observers’ informational 
inferences for neutral behaviors did not differ 
in Experiment 1, it was predicted that actors’ 
and observers’ causal attributions for neutral 
behaviors would not differ, given no information;
nor were they expected to differ in the
two information conditions.

3. When provided with identical informa- 
tion, actors’ and observers’ causal attribu- 
tions were not expected to differ. Thus, no 
Role x Valence interactions were expected to 
occur in the two information conditions. Spe- 
cifically, (a) actors and observers presented 
with the information estimated by actors in 
Experiment 1 would exhibit the strong posi- 
tivity bias of actors given no information 
(more internal attribution for positive than 
negative responses and more external attribution
for negative than positive responses);
(b) actors and observers given the informa- 
tion generated by observers in Experiment 1 
would exhibit the attributional tendencies of 
observers given no information. 


Method 


Subjects 


Fifty-four male and 54 female undergraduates served as subjects in
partial fulfillment of an introductory psychology course requirement.
Twenty-seven subjects of each sex were randomly assigned to serve as
actors and 27 as observers.


Procedure 


The materials and procedure were the same as 
those used in Experiment 1. 


Information Manipulations


Subjects in the information conditions were told 
that in addition to the feedback regarding their re- 
sponses, they would be given information dealing 
with the norms (consensus), validity (distinctive- 
ness), and reliability (consistency) of the inventory 
items. The information was presented for each item 
along with the feedback in the following form: 


Norms. X% of the subjects in this experiment
also gave the Y response. 

Validity. If you/this student took the whole in- 
ventory (rather than just these 12 items), you/ 
he/she would give the Y response on X% of the 
items in this subscale. 

Reliability. If you/this student took the inventory 
at different times, you/he/she would give the same 
response on this item X% of the time. 


Subjects were told that the information was based on past research
with the inventory and that they might find it useful in helping them
to decide what caused them (or their matched actor) to give a
particular response. The actual percentages provided were generated by
subjects in Experiment 1. One actor and one observer were each given
the exact consensus, distinctiveness, and consistency estimates
generated by one subject in the first experiment. Thus, 18 actors and
18 observers got the information generated by the 18 actors in
Experiment 1; 18 actors and 18 observers received the information
generated by the 18 observers in Experiment 1. An additional 18 actors
and 18 observers received no information.


Dependent Measures 


Based on the feedback from the personality-achievement items, subjects
were asked to decide what was the most probable cause of each
response. The four alternatives provided by McArthur (1972) were
presented, as follows:


1. Something about me (actors)/this student (observers)
probably caused me (actors)/him or her
(observers) to give the Y response.

2. Something about the specific item probably 
caused me/him/her to give the Y response. 

3. Something about the particular circumstance 
probably caused me/him/her to give the Y re- 
sponse. 

4. Some combination of 1, 2, and 3 above probably
caused me/him/her to give the Y response.


Subjects were asked, if they chose Number 4, to
specify the particular combination of factors they
thought could best explain why they (or their
matched actor) gave the Y response.
Results


Data Analysis


The design of this experiment was [similar] to that of Experiment 1,
with the additional between-subjects factor of information. Three
levels of information were varied: (a) information generated by actors
in Experiment 1; (b) no information; and (c) information generated by
observers in Experiment 1.

A 3 X 5 (Item Order X Locus of Attribution) chi-square performed on
the 1,296 forced-choice attributions was not significant. Therefore,
all further analyses were done without regard to item order. Analyses
of variance were performed on internal (person) and external (stimulus
and circumstance) attribution.2 (The measure of external attribution
was derived from summing stimulus, circumstance, and
stimulus-circumstance attributions.)


Causal Attribution Effects 


Internal attribution. It was predicted that actors given no
information would exhibit a "positivity" bias in causal attribution
for positive and negative behaviors that would be reduced or
eliminated when they were provided with the information generated by
observers. Observers' attributions were not expected to reflect as
strong (if any) a bias, given no information, but were expected to be
biased when given the information generated by actors. To test these
predictions, the simple interactions were computed for each level of
information. The results supported the predictions. When subjects were
given no information, a significant Role X Valence interaction,
F(2, 96) = 10.09, p < .001, revealed that actors made more internal
attributions for positive responses and fewer internal attributions
for negative responses than observers did, although only the latter
difference was significant (p < .05). Newman-Keuls tests revealed that
actors made significantly fewer internal attributions for negative
than for neutral or positive responses (ps < .05), whereas observers'
internal attributions for positive, neutral, and negative responses
did not significantly differ from each other (see Figure 2). As
predicted, there was no actor-observer difference in internal
attribution for neutral responses.

When subjects were given the information generated by the actors in
Experiment 1, a significant main effect of valence was obtained,
F(2, 96) = 21.06, p < .001; the Role X Valence interaction did not
approach significance. As shown in Figure 2, actors' attributions
reflected the same pattern as when given no information; the pattern
for observers was similar to that of actors. Both groups exhibited the
"positivity" bias shown only by actors in the no-information
condition. Actors' and observers' attributions did not significantly
differ from each other for positive, neutral, or negative behaviors.

Figure 2. The proportion of internal attributions made by actors and
observers as a function of information and behavioral valence.

Figure 3. The proportion of external attributions made by actors and
observers as a function of information and behavioral valence.
(Panels: actors' information, no information, observers' information;
horizontal axis: valence [positive, neutral, negative]; separate lines
for actors and observers.)

2 Analyses done on person-stimulus and other internal-external
combination attributions are not reported because they are not
directly relevant to the hypotheses.

When subjects were given the information 
generated by observers in Experiment 1, 
neither the main effect of valence nor the Role 
X Valence interaction was significant. Actors’ 
and observers’ attributions did not signifi- 
cantly differ from each other at any level of 
valence. However, actors still attributed nega- 
tive behaviors less to internal causes than 
neutral behaviors (p < .05), but not less than 
positive behaviors.

External attribution. The prediction was that, given no information,
actors would attribute positive responses less and negative responses
more to external factors than would observers, and there would be no
difference in external attribution for neutral responses. As
predicted, actors' and observers' external attribution for neutral
behaviors did not differ. However, contrary to prediction, there were
no actor-observer differences for positive or negative responses
either (see Figure 3). The simple Role X Valence interaction for the
no-information condition was not significant. However, the simple main
effect of valence was significant, F(2, 96) = 7.21, p < .001,
indicating that both actors and observers attributed negative
responses more externally than neutral responses, although the
difference was significant only for observers (p < .05).

Since actors’ and observers’ external at- 
tributions did not differ, given no informa- 
tion, they would not be expected to differ in 
the two information conditions either. In sup- 
port of this reasoning, neither of the simple 
Role X Valence interactions in the two infor- 
mation conditions was significant, whereas 
both of the valence main effects reached sig- 
nificance, F(2,96) = 33.65 and F(2,96) = 
7.80, ps < .001.3 


General Discussion 


The tendency for actors to attribute positive behaviors more and
negative behaviors less to their own personal dispositions than
observers do is consistent with self-serving biases found in previous
research (Bradley, 1978). Furthermore, actors' and observers'
estimates of the distinctiveness and consistency of behavior
paralleled the [differences in] attributions. Actors reported their
desirable behavior to be less distinctive and more consistent than
observers thought it was, whereas undesirable behavior was seen as
more distinctive and less consistent than observers believed it to be.

Although subjects' information estimates were quite consistent with
their internal attributions, such was not the case for their external
attributions. Despite the actor-observer differences in
distinctiveness and consistency estimates, there were no differences
between actors' and observers' attributions to external factors. The
reason for this finding may relate to a differential usefulness of
consensus, distinctiveness, and consistency for internal versus
external attribution. It has been suggested that distinctiveness and
consistency information are more relevant for internal than external
attribution because they provide [clues] as to what sort of person
someone is ([name illegible]son, 1974; McArthur, 1976). If we [know]
that Priscilla usually helps the old lady (high consistency) and that
she helps out other people, too (low distinctiveness), we [can
conclude] that she is a helpful person. In contrast, knowing that
other people help the old lady (high consensus) has no direct
implications for Priscilla's helpfulness or lack of it. Rather, it
suggests that the old lady must be a sympathetic or helpless person,
since she gets help from everyone. Thus, consensus information tells
us about an external factor (the old lady) rather than about the actor
(Priscilla). In support of the differential usefulness of consensus,
distinctiveness, and consistency information for internal and external
attribution, Garland, Hardy, and Stephenson (1975) found that when
asked to make an internal attribution, subjects more often requested
distinctiveness and consistency information than consensus, whereas
the reverse was true when they were asked to make an external
attribution. If consensus information is most useful for making an
external attribution, then actors' and observers' external
attributions would not be expected to differ in the present
experiment, since actors and observers did not differ in their
consensus estimates.

3 Significant main effects of verb category were also found for
internal and external attribution. However, since no actor-observer
differences were predicted or found, the findings pertaining to verb
category are not reported.


Influence of Information on Causal 
Attribution 


Results of the present research suggest that 
actors’ self-serving biases in causal attribu- 
tion can be reduced by providing them with 
information about the distinctiveness and 
consistency of their behavior, generated by an 
observer. Such information may be viewed as 
more objective than their own knowledge of 
distinctiveness and consistency of their be- 
havior, which, like their attributions, appears 
to reflect self-serving biases. Similarly,
observers can be swayed by actors' biased
information estimates to attribute positive
behavior more and negative behavior less to 
an actor’s personal disposition. 

It has been suggested that self-serving biases [may reflect] strategic
self-presentation by actors rather than their private beliefs
(Bradley, 1978). Attempts should be made to minimize the likelihood
that such an artifact may [account] for experimental findings. In the
[present research], subjects were told that their [responses to] the
inventory items and the de[pendent measures] would be kept
confidential. [No names] appeared on any of the forms, and only the
[experimenter] could identify the completed questionnaires. Although
observers were shown the [inventory] responses of another subject,
that person was not present [at the] time and was never identified by
name. Thus, the influence of self-presentation motives in the present
research should have been minimal.


Implications 


The [results] reported here have some interesting implications for a
variety of natural settings. The perennial dieter, who claims that [he
hardly] eats a thing yet continues to [gain] weight, comes to mind as
an example. [He may] attribute his obesity to an abnormally [slow]
metabolism or some other factor that is [out] of his control. In
accord with his attribution, he reports that he hardly ever snacks
between meals; that is, snacking is highly distinctive and
inconsistent. On the
other hand, the dieter’s wife, who attributes 
her husband’s obesity to his eating habits, 
correctly reports the late afternoon chocolate 
bars and midnight refrigerator raids. If the 
dieter were to keep a running record of every 
bite he took during the day, he would have 
the more objective information reported by 
the observer and might be convinced that his 
weight problem is the result of his excess eat- 
ing. This self-monitoring technique has been 
successfully used to help dieters in correctly 
attributing causality and, ultimately, in losing 
weight (Mahoney, 1974). Similar methods 
might be useful in other therapeutic or educa- 
tional settings where taking responsibility for 
negative behaviors is a prerequisite for posi- 
tive change. 

On the other hand, there are also situations 
in which the actor’s information may be more 
accurate or complete than the observer’s. A 
student may attribute his or her lack of at- 
tendance in Psychology 101 to the fact that 
the lectures are boring and uninformative. 
Consistent with this attribution, that student 
may know that his or her attendance at other 
classes is and always has been excellent. The 
professor may attribute the student's behavior
to an unreliable nature, based on consistent
absence throughout the semester,
regardless of what topics were being covered. In
this situation, the actor (student) has more 
consistency and distinctiveness information 
than the observer. If the professor learned 
that the student’s behavior in Psychology 101 
really was highly distinctive and inconsistent, 
he or she might be expected to alter the im- 
pression of the absent student and perhaps 
even to realize that the lectures leave something
to be desired.


Sex Differences in Eavesdropping on Nonverbal Cues


Robert Rosenthal and Bella M. DePaulo 
Harvard University 


Three series of studies investigated the hypothesis that nonverbally, women are 
more interpersonally accommodating than men. The first series of studies 
showed that women lost much of their advantage in decoding visual cues when 
the cues were based on displays too brief to be under good sender control. The 
second series of studies showed that as nonverbal cues became less intended 
(more “leaky”), women showed decreasing advantage over men in accuracy of 
decoding nonverbal cues. There was also a trend for women who were more 
skilled at eavesdropping on nonverbal cues to be seen as having less successful 
social outcomes. Women were also more biased to use (the more controllable) 
visual cues than tone of voice cues and especially so when the video cues were 
of the face rather than of the “leakier” body. The third series of studies showed 
that women were more polite in their ascription of characteristics to others, 
more accurate in decoding of nondeceptive behavior, but substantially more
likely to interpret deceptive cues as the deceiver wanted them to be interpreted. 
Finally, it was shown that women’s nonverbal cues were more easily read 
than men’s. 


The superiority of women over men in the accuracy of decoding
nonverbal cues has long been postulated and investigated. Recently
these investigations have been ex[tensively], systematically, and
quantitatively [summarized], so that the issue is no longer in
[doubt]: Women are indeed superior to men in the decoding of nonverbal
cues (Hall, 1978). [Emerging] from the results of dozens of studies
[of sensitivity] to nonverbal cues is the [general] finding that
people who are more skilled at decoding nonverbal cues are more
effective in their interpersonal relationships (Hall,
Rosenthal, Archer, DiMatteo, & Rogers, 
1978; Rosenthal, Hall, Archer, DiMatteo, & 
Rogers, 1979; Rosenthal, Hall, DiMatteo, 
Rogers, & Archer, 1979). Given that 
women in our culture have traditionally 
played more expressive, socioemotional, and 
supportive rather than instrumental roles 
(Parsons, 1955; Parsons, Bales, & Shils, 
1953; Zelditch, 1955), their superiority at 
decoding nonverbal cues, a concomitant of 
superior interpersonal relationships, seems 
quite consistent. 
Recently, however, a question has been 
raised about the conditions under which skill 
at decoding nonverbal cues might actually be 
detrimental to smooth social functioning 
(Rosenthal, Hall, DiMatteo, Rogers, & 
Archer, 1979). The evidence suggests that 
social relationships may suffer when people 
are too good at decoding nonverbal messages
they were not intended to receive. Thus, for
example, people who are very good decoders
of very briefly exposed (e.g., 42 msec) facial
and body cues may have less satisfactory
interpersonal relationships (Rosenthal, Hall,
DiMatteo, Rogers, & Archer, 1979).

If it is disruptive of smooth interpersonal
functioning for a participant to “know too 
much” about the state of the other, we 
would expect women to show relatively less 
advantage over men in decoding nonverbal 
cues when those cues are under less control of 
the sender and are more likely to be unin- 
tended than intended cues. The plausibility 
of this hypothesis is strengthened by Weitz’s 
(1976) finding that the nonverbal styles of 
females are closely adjusted to the person- 
ality traits of male partners. 

Among the various channels of nonverbal 
communication that have been studied, the 
most informative, and also the one most 
likely to be best controlled by the sender, is 
the face (Ekman & Friesen, 1969; Izard, 
1971; Rosenthal, Hall, DiMatteo, Rogers, & 
Archer, 1979). Compared to the face, each 
of the following sources of nonverbal cues is 
likely to be less informative under ordi- 
nary circumstances of interaction and less 
likely to be as controllable by the sender: 

1. Body. Ekman and Friesen (1969, 1974)
have shown that the body is more likely than 
the face to give off or “leak” deception cues. 

2. Tone of voice. Several studies (Ekman, 
Friesen, & Scherer, 1976; Streeter, Krauss, 
Geller, Olson, & Apple, 1977; Krauss, Geller, 
& Olson, Note 1) have shown that tone of 
voice is an additional source of cues to deception
or stress. It has also been shown that
tone of voice may leak one’s true feelings 
about oneself (Bugental, Henker, & Whalen, 
1976; Bugental & Love, 1975; Holzman & 
Rousey, 1966) or about others (Weitz, 1972). 

3. Very brief exposures. Very brief expo- 
sures to face or body cues appear likely to 
offer further unintended cues to the other’s 
state without being as controllable as cues 
(in the same channels) occurring in longer 
time frames (e.g., 42 msec vs. 2,000 msec)
(Ekman & Friesen, 1969; Rosenthal, Hall,
DiMatteo, Rogers, & Archer, 1979).

4. Discrepancy. Except in the special cases
of irony, sarcasm, or humor, discrepancies between
visual and auditory nonverbal cues may
tween visual and auditory nonverbal cues may 
be unintended and difficult to control (De- 
Paulo, Rosenthal, Eisenstat, Rogers, & Fink- 
elstein, 1978). 

The purpose of the present series of studies was to investigate the
hypothesis that nonverbally, women are more interpersonally
accommodating than are men. We wanted to know whether women's greater
social civili[ty ... text missing]. More specifically we wanted to
know the following:

1. Do women show relatively less advantage over men in their decoding
accuracy [as] the nonverbal cues decoded become less intended and less
controllable by the sender (i.e., are women less likely to eavesdrop
on the leakages of others, taking into account women's general
superiority at decoding nonverbal cues)?

2. Do women, more than men, attend preferentially to the more overt
and more controlled nonverbal channels?

3. Do women, more than men, ascribe socially more desirable
characteristics to the nonverbal cues they read?

4. Do women, more than men, interpret nonverbal cues in ways in which
the [sender] would like them to be interpreted?

5. Do women, more than men, permit their nonverbal cues to be read by
others?

6. Given the degree of sex role stereotyping that still exists, are
there social costs [to] women who are less interpersonally
accommodating in these nonverbal ways?


Study Series 1 
Very Brief Exposures 


A series of 10 studies was conducted to test the hypothesis that for
visual cues [of] very brief exposure, women's advantage over men in
accuracy of decoding would be [smaller] than that established for
visual cues available in longer time frames (Hall, 1978).


Method


The Brief Exposure Profile of Nonverbal Sensitivity (PONS), a test of
sensitivity to nonverbal cues with median exposure lengths of only
250 msec (Rosenthal, Hall, DiMatteo, Rogers, & Archer, 1979), was
administered to six samples of college students (N = 352), three
samples of adults (N = 87), and one sample of high school students
(N = 111). For each of the 10 samples, the magnitude of the females'
superiority was computed by subtracting the mean of the males from the
mean of the females and dividing the difference by the [common]
standard deviation of the male and female [subjects] (Cohen, 1977;
Rosenthal & Rosnow, 1975). An analysis of variance showed that the
three [types] of samples (college, adult, and high school) did not
differ in the degree to which the females outperformed the males,
F(2, 7) < 1; accordingly, all [text illegible] level of [text
illegible].


Table 1
Stem-and-Leaf Plots of Female Superiority (in σ Units) in Sensitivity
to Ordinary Versus High-Speed Visual Cues

                      Ordinary cues(a)        High-speed cues(b)
Value                 Stems(c) | Leaves(d)    Stems | Leaves

                       1.8 | 6
                       1.3 | 1
                       1.0 | 2
                        .6 | 0 5 7 9
                        .5 | 3 4
                        .4 | 0 0 8
                        .3 | 0 0                .3 | 2 8
                        .2 | 6                  .2 | 0 2 2 2 2
                        .1 | 0 1 4 4            .1 |
                        .0 | 0 0 2 3 5          .0 |
                       -.0 | 2                 -.0 | 6
                       -.1 | 2 5               -.1 | 2
                       -.2 |                   -.2 | 4
                       -.3 | 1
                       -.6 | 0

Maximum                 1.86                    .38
Quartile 3               .57                    .24
Mdn                      .26                    .22
Quartile 1               .01                   -.08
Minimum                 -.60                   -.24
Q3 - Q1                  .56                    .32
σ̂ [.75(Q3 - Q1)]         .42                    .24
S                        .49                    .20
Mean                     .32                    .14

(a) Based on 29 studies (Hall, 1978); N = 5,671 subjects.
(b) Based on 10 studies (present Study Series 1); N = 550 subjects.
(c) [Female superiority] in nonverbal sensitivity given in σ units to
one decimal place.
(d) [Second decimal] place for each study of female superiority in
nonverbal sensitivity.
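
The index of female superiority described in the Method (female mean minus male mean, divided by a standard deviation) can be sketched as follows. This is a minimal illustration under a stated assumption: the divisor is taken to be the pooled within-group standard deviation, one common reading of Cohen's (1977) d, because the exact wording in the source is not fully legible. The function name and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def effect_size_d(female_scores, male_scores):
    """Standardized mean difference (female minus male).

    Assumption: the divisor is the pooled within-group standard
    deviation, with each group's variance weighted by its degrees
    of freedom.
    """
    nf, nm = len(female_scores), len(male_scores)
    pooled_var = (
        (nf - 1) * variance(female_scores)
        + (nm - 1) * variance(male_scores)
    ) / (nf + nm - 2)
    return (mean(female_scores) - mean(male_scores)) / sqrt(pooled_var)


# Invented accuracy scores for one sample (not data from the study).
females = [12, 14, 16, 18]
males = [10, 12, 14, 16]
d = effect_size_d(females, males)
print(f"female superiority in sigma units: {d:.2f}")
```

A positive value means the females outperformed the males in that sample; one such value per sample is what Table 1 displays.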


Results

Table 1 shows the stem-and-leaf plot
(Rosenthal & Rosnow, 1975; Tukey, 1977) of our
... side by side with the stem-and-leaf
plot of the results of the 29 studies reported
by Hall (1978), in which she computed the
magnitude of female superiority in sensitivity
to visual cues exposed for at least several
seconds ... The most obvious difference
between the two distributions is the percentage
of studies of ordinary-length visual
cues in which women showed larger advantages
over men (41%) than was true in any
study of high-speed cues. In contrast, the
percentage of studies of ordinary cues in
which women showed smaller advantages
over men than was true in any study of
high-speed cues was only 7%, a difference that
was significant at p = .006 (sign test) or at
p = .08 by the less focused Kolmogorov-Smirnov
test, χ²(2) = 5.09. The summaries of
the two distributions given at the bottom of
Table 1 show that the two variances differ
substantially, F(28, 9) = 6.00, p = .008, with
the bulk of this heterogeneity due to the
studies of ordinary nonverbal cues showing very
large advantages of females over males.

Table 1 note. Based on 29 studies (Hall, 1978); N = 5,671 subjects.
Based on 10 studies (present Study Series 1); N = 550 subjects.
... in nonverbal sensitivity given in σ units to one decimal place.
... decimal place for each study of female superiority in nonverbal
sensitivity.

1 Tukey (1977) developed the stem-and-leaf plot
as a special form of frequency distribution to
facilitate the inspection of a batch of data. Each
number in the data batch is made up of one stem and
one leaf, but each stem may serve several leaves. Thus,
the fifth stem of Table 1, a .5, is followed by two
leaves of 3 and 4 representing the two numbers .53
and .54. The first digit is the stem; the second is the
leaf. The eye takes in a stem-and-leaf plot as it does
any other frequency distribution, but the original
data are preserved with greater precision in a
stem-and-leaf plot than would be the case with ordinary
frequency distributions.
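
The stem-and-leaf convention described in Footnote 1 can be illustrated with a minimal Python sketch. It assumes nonnegative effect sizes reported to two decimal places; the function name `stem_and_leaf` and the sample batch are hypothetical, and only the values .53 and .54 come from the footnote's own example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Split two-decimal, nonnegative values into stems and leaves.

    The first decimal digit is the stem and the second is the leaf,
    so .53 becomes stem 5 with leaf 3.
    """
    plot = defaultdict(list)
    for value in values:
        hundredths = round(value * 100)       # .53 -> 53
        stem, leaf = divmod(hundredths, 10)   # 53 -> (5, 3)
        plot[stem].append(leaf)
    # Sort stems and the leaves within each stem for display.
    return {stem: sorted(leaves) for stem, leaves in sorted(plot.items())}


# Hypothetical batch; .53 and .54 are the two numbers named in Footnote 1.
batch = [0.53, 0.54, 0.32, 0.14]
for stem, leaves in stem_and_leaf(batch).items():
    print(f".{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row keeps the original digits, which is the advantage over an ordinary frequency distribution that the footnote points out.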

In summary, 10 studies of sex differences 
in accuracy of judgments of briefly exposed 
cues showed that women were markedly less 
advantaged at this task than they were at 
reading visual cues of ordinary duration. 
These results suggest that women may be 
more polite than men in their decoding of 
nonverbal cues (i.e., refraining from decoding 
too efficiently those nonverbal cues under 
less control of the encoder). 


Study Series 2 
Ordered Leakage Channels
Method 


Five measures of sensitivity to nonverbal cues 
derived from the PONS were administered to samples 
of 148 high school students and 94 college students. 
Details of the first four measures are given in Rosen- 
thal, Hall, DiMatteo, Rogers, and Archer (1979), 
and details of the fifth measure are given in
DePaulo et al. (1978). (See Appendix.) Briefly, the
measures were as follows:

1. Face. A 20-item test of sensitivity to facial
expressions.

2. Body. A 20-item test of sensitivity to body
movements.

3. Tone. A 40-item test of sensitivity to speech
masked by content-filtering (Rogers, Scherer, &
Rosenthal, 1971) and random-splicing techniques
(Scherer, 1971).

4. Brief exposures. A 40-item test of sensitivity to
high-speed exposures of the face or of the body with
a median exposure length of only 250 msec.

5. Discrepancies. A 128-item test of sensitivity to
the degree of discrepancy between the tone of voice
and either facial expressions or body movements.

The five measures are listed in the order in which 
we believed them to fall on a dimension of “leaki- 
ness.” Thus, the face cues were felt to be the best 
controlled or least “leaky,” whereas discrepancy cues 
were felt to be least well controlled or most “leaky.” 
Ekman and Friesen’s (1969, 1974) work convinces
us that the body may be a leakage channel, though
Feldman (1976) suggests that this may not always
be the case. Evidence for the leakiness of
tone-of-voice cues is more pervasive (deriving from
studies of personality, psychopathology, and attitudes,
as well as from studies of deception) and is as yet
uncontested. Briefly exposed cues may be similar to
the “microaffect displays” discussed by Ekman and
Friesen (1969). These microdisplays may be mere
fragments of full displays, or they may be complete
movements that are greatly reduced in time. In
either case, they should be more difficult to control
than any of the full-channel messages communicated
at ordinary speeds. Finally, we believe that
discrepant messages would be most “leaky” or difficult
to control because (a) they involve simultaneous
communication by two different channels, each of
which might itself be difficult to control, and (b)
the combined effect of both messages together might
add a further element of subtlety or uncontrollability.

The specific ordering of these five measures was
necessarily highly speculative, since prior studies had
rarely ever compared more than two of these
channels simultaneously. However, it should be noted
that given only the placement of our most and
least leaky channels (face and discrepancies), our
particular ordering of the five measures (with
linear contrast weights of +2, +1, 0, −1, −2)
correlated very highly (r = .89) with the ordering that
does not make any discrimination among the middle
three measures (with contrast weights of +1, 0, 0, 0,
−1). Or, even more conservatively, if we had only
claimed that the face was less leaky than any of
the other channels, the resulting ordering (+4, −1,
−1, −1, −1) would still correlate very highly (r =
.71) with the present ordering.
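
The two correlations between sets of contrast weights quoted above can be checked directly. The following is a minimal Python sketch; the helper `pearson_r` and the variable names are illustrative and not part of the original article.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Weights run from face (least leaky) to discrepancies (most leaky).
full_ordering = [2, 1, 0, -1, -2]    # the ordering used in the article
ends_only = [1, 0, 0, 0, -1]         # only the two extreme channels placed
face_only = [4, -1, -1, -1, -1]      # only the face distinguished

print(round(pearson_r(full_ordering, ends_only), 2))  # 0.89
print(round(pearson_r(full_ordering, face_only), 2))  # 0.71
```

Both values match those reported, which shows that the full five-step ordering adds relatively little beyond the placement of the extreme channels.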


Results

Table 2 shows the mean intercorrelations
among the five measures for the two samples.
The fact that all correlations were positive,
most of them significantly so, shows that
these measures were clearly related to each
other, but not to as great an extent as one
would expect if all the measures were tapping
only a single underlying skill. The diagonal
of the matrix of Table 2 shows the internal
consistency reliability for each of the
measures. The median reliability of .40 was
substantially greater than the median
intercorrelation (.22) among the five measures,
supporting the relative independence of the
five measures.

The bottom row of Table 2 shows the
degree of accuracy exceeding chance (in σ
units) for each of the five measures. As one
would expect, the channels posited to be more
leaky or less controllable tended to be less
accurately decoded, r(3) = .87, p < [?],
one-tailed. However, even the most difficult
Table 2
Intercorrelations, Reliabilities, and Accuracy Levels of Five Measures of
Sensitivity to Nonverbal Cues

                                                    Brief
Measure                 Face     Body     Tone      exposures   Discrepancies
Face                    .40
Body                    [?]      .36
Tone                    .30***   .08      .28
Brief exposures         .40***   [?]      [?]       .53
Discrepancies           .16      .12      [?]       .22[?]      .59
Accuracy (in σ units)   3.88     2.82     1.82      2.58        1.12

Note. Internal consistency reliabilities are in boldface type in the diagonal. Accuracy is defined by the
degree to which performance exceeds chance (in σ units).
* p < .05.
** p < .01.
*** p < .001.


to decode, those of discrepancy, were decoded
very much better than chance (1.12σ).
Inspection of the reliabilities of the five
measures also shows that differences in reliability
could not be responsible for differences in
accuracy. The most reliable measure was also
the one with the lowest accuracy, and the
overall correlation between test accuracy and
reliability was negative, r(3) = −.33, ns.
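
The two correlations reported for Table 2 can be recomputed from the table's diagonal and bottom row. This is a minimal Python sketch: the reliabilities and accuracy levels are read from Table 2, the leakiness weights are the article's +2 to −2 contrast weights, and the helper `pearson_r` is illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Order: face, body, tone, brief exposures, discrepancies.
leakiness_weights = [2, 1, 0, -1, -2]           # +2 = least leaky channel
accuracy = [3.88, 2.82, 1.82, 2.58, 1.12]       # bottom row of Table 2
reliability = [0.40, 0.36, 0.28, 0.53, 0.59]    # diagonal of Table 2

r_accuracy_ordering = pearson_r(leakiness_weights, accuracy)
r_accuracy_reliability = pearson_r(accuracy, reliability)

print(round(r_accuracy_ordering, 2))     # 0.87
print(round(r_accuracy_reliability, 2))  # -0.33
```

With only five channels each correlation rests on 3 degrees of freedom, which is why the second value is reported as nonsignificant despite its size.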
Table 3 shows the results of our analysis
for each sample and for both combined. The
magnitude of females’ superiority over males
in decoding decreased in a linear manner,
going from less to more leaky channels. Based
on the analyses of variance testing the Sex ×
Linear Trend interaction, the college sample
showed the regression (r = .96) to a significant
degree, the high school sample showed
the regression (r = .72), but not to a significant
degree, and the combined samples showed
the regression (r = .91) to a significant degree
(p = .004). As channels became more and
more leaky, females lost more and more of
their advantage. These results, too, suggest
that women may be more polite in their
decoding of nonverbal cues (i.e., refraining more


Table 3
Female Superiority in Sensitivity to Five Types of Nonverbal Cues (in σ units)

                                        Sample
                              High schoolᵃ   Collegeᵇ
Decoding skill                (N = 109)      (N = 81)     Meanᶜ
Face                            .36            .38          .37
Body                            .24            .34          .29
Tone                            .32           −.02          .15
Brief exposures                 .18           −.18          .00
Discrepancies                   .22           −.28         −.03
F (linear contrast)            1.25           7.43          [?]
df error                        216            196
p                              .265           .007         .004
r (predicted with obtained)     .72            .96          .91

Note. Ns vary from row to row; median Ns are given at head of columns.
ᵃ The mean effect size of .26 was significantly greater than zero (p = .003).
ᵇ The mean effect size of .05 was not significantly greater than zero (p = .59).
ᶜ The mean effect size of .16 was significantly greater than zero (p = .006); the correlation over the five ...
Table 4
Loss of Female Superiority in Sensitivity to Four Types of Nonverbal Cues
for Earlier and Present Studies (in σ units)

Difference between   Earlier    No. of earlier   Present        Total studies
face and:            studies    studies          studies (2)    (weighted)
Body                  .10         2               .08            .09
Tone                  .14        10               .22            .15
Brief exposures       .19        10               .37            [?]
Discrepancies         .26         1               .40            [?]
Mdn                   .17                         .30            .19
Unweighted M          [?]                         .27            [?]
Weighted M            .16                         .27            [?]

Note. Loss of female superiority was defined as the difference between female superiority in the face channel
(in σ units) and female superiority in each of the leakier channels (in σ units).
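
The definition in this note can be worked through in a few lines of Python. The sketch below rests on two assumptions that are not stated outright in the source: the combined-sample face value in Table 3 is taken as .37 (the mean of .36 and .38), and the "weighted" totals are assumed to weight each value by its number of studies, with the present article counted as 2 samples. The function names are hypothetical.

```python
# Combined-sample female superiority (in sigma units) from Table 3.
# Assumption: the face value is .37, the mean of the two samples.
present = {
    "face": 0.37,
    "body": 0.29,
    "tone": 0.15,
    "brief exposures": 0.00,
    "discrepancies": -0.03,
}


def loss_from_face(effects):
    """Loss of female superiority: face effect minus each leakier channel."""
    return {
        channel: round(effects["face"] - effect, 2)
        for channel, effect in effects.items()
        if channel != "face"
    }


def weighted_mean(pairs):
    """Mean of (value, number_of_studies) pairs, weighted by study count."""
    total_studies = sum(n for _, n in pairs)
    return sum(value * n for value, n in pairs) / total_studies


losses = loss_from_face(present)
print(losses)

# Assumed weighting: earlier studies by their count, present samples by 2.
body_total = weighted_mean([(0.10, 2), (losses["body"], 2)])
tone_total = weighted_mean([(0.14, 10), (losses["tone"], 2)])
print(round(body_total, 2), round(tone_total, 2))  # 0.09 0.15
```

Under these assumptions the sketch reproduces the legible entries of the present-studies and total columns for the body and tone channels.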


and more from decoding effectively those
nonverbal cues that are less and less controllable
by the encoder).


Additional Evidence 


Although we knew of no other studies in 
which sex differences in sensitivity to non- 
verbal cues had been specifically examined 
for a series of channels differing in leakiness, 
we tried to find studies in which at least one 
relatively leaky channel could be compared to 
the least leaky channel, the face, with ex- 
posure length not too brief. 

We found 2 studies in which female su- 
periority had been measured on the body 
channels using the same test as employed in 
the present article (Rosenthal, Hall, DiMat- 
teo, Rogers, & Archer, 1979, chapters 6 and 
8); 10 studies in which female superiority 
had been measured in the tone-of-voice chan- 
nel (Hall, 1978); 10 studies in which female 
superiority had been measured for brief- 
exposure visual cues (the present article); 
and 1 study in which female superiority had 
been measured for discrepant cues (De- 
Paulo et al., 1978). In the 2 studies measuring 
female superiority in the body channel, fe- 
male superiority in the face channel was 
measured for the same sample. In all other 
cases, female superiority in the face channel 
was defined by Hall’s visual category data 
(mean effect size = .32σ).

Table 4 shows the loss (in σ units) of
female superiority in going from the face
channel to each of the four more leaky channels.
Just as was the case for our two samples of
Table 3, the earlier studies yielded an
ordering of loss of female superiority that was
... The degree of linear relationship between
the ordering of leakiness of channel and the
loss of female superiority was very great,
r(3) = .99, p < .001.


Social Consequences of Eavesdropping 


Teacher Ratings

Method. ... be penalized for being too
accurate at reading hidden cues, we examined
teachers’ ratings of the social effectiveness
of the high school students who had been
administered the five measures described
earlier. Several teachers (M = 2.4, range 1-5)
rated the students whom they knew on their
popularity with the same sex, popularity with
the opposite sex, and degree of social
understanding. We correlated all three ratings
with degree of skill in each of the five
nonverbal tasks, separately for males and
females, yielding a total of 30 correlations.
Because the patterns of results for the three
rating variables were similar (the
intercorrelations among the three variables
were .69, .79, and .90), we averaged the
correlations obtained from the three ratings
to obtain the summary correlations of Table 5.

Results. Table 5 shows a strong trend,
though not a significant one, for social
effectiveness to be less and less strongly
related to nonverbal sensitivity as we move ...
Table 5
Correlations Between Sensitivity to Five Types of Nonverbal Cues and
Teachers’ Ratings of Social Effectiveness

                                        Males      Females
Decoding skill                          (n = 43)   (n = 63)   Difference     M
Face                                     .36*       .38**       −.02        [?]
Body                                     [?]        .03          .07        .07
Tone                                     [?]        [?]          [?]        [?]
Brief exposures                          [?]        [?]          [?]        [?]
Discrepancies                            .03ᵃ      −.06ᵇ         .09       −.02
Correlation (n = 5) between magnitude
of relationship and degree of
“leakiness” of channel                  −.40       −.69          .60       −.56

Note. ns vary from row to row; median ns are given at head of columns.
ᵃ The analogous value for a sample of 36 junior high school males was +.18.
ᵇ The analogous value for a sample of 37 junior high school females was −.12.
* p < .05.
** p < .01.
*** p < .001.


... both males (r = −.40) and females (r =
−.69), but the trend was stronger for
females, as shown in the third column of Table
5. Relative to males, correlations between
social effectiveness and nonverbal sensitivity
... channels. These indications are
... for females than for males, although
we should note that even for females, the
average correlation still tends to be positive
rather than negative.


Self-Ratings


Method. For both the high school and college
samples described earlier, we obtained students’
self-ratings of the quality of their relationships with
members of their own and of the opposite sex. These
ratings were correlated with skill at decoding each
of the five types of nonverbal cues, separately for
males and females. Because the results for the ratings
of same- and opposite-sex relationships were similar,
we present the average of the correlations of each
rating with each skill in Table 6, separately for the
males and females of the two samples.


Table 6
Correlations Between Sensitivity to Five Types of Nonverbal Cues and
Self-Ratings of Social Relationships

                                           High school            College
                                        Males      Females     Males      Females
Decoding skill                          (n = 36)   (n = 43)    (n = 32)   (n = 49)
Face                                     .14       −.13         .09       −.09
Body                                    −.05       −.15         .22        .18
Tone                                     .03       −.04        −.23       −.18
Brief exposures                          .06       −.17         .12       −.23
Discrepancies                            .29        .03        −.03        .06
Correlation (n = 5) between magnitude
of relationships and degree of
“leakiness” of channel                   .50        .56        −.31       −.10

Note. ns vary from row to row; median ns are given at head of columns.
Table 7
Correlations Between Sensitivity to Five Types of Nonverbal Cues
and Bias Favoring Positive Affect

                                   High school            College
                                Males      Females     Males      Females
Decoding skill                  (n = 46)   (n = 63)    (n = 32)   (n = 50)   Interactionᵃ
Face                             .02        .09         .14       −.12        .33
Body                            −.09        .19        −.03       −.20        .45
Tone                            −.01       −.14         .30       −.15        [?]
Brief exposures                 −.03        .12         .22       −.04        [?]
Discrepancies                   −.17        .41         .01       −.33        [?]
Correlation (n = 5) between
magnitude of relationship
and degree of “leakiness”
of channel                      −.67       −.04        −.02       −.38

Note. ns vary from row to row; median ns are given at head of columns.
ᵃ The difference between the correlations on the diagonals of the 2 × 2 table formed from the factors sex
and age, for example, [(.09) + (.14)] − [(.02) + (−.12)] = .33.
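
The interaction index defined in the note to Table 7 is a difference between the two diagonals of a 2 × 2 (sex by age) table of correlations. A minimal Python sketch of that arithmetic follows, using the face-channel values from the note's own example; the function name is hypothetical.

```python
def interaction_contrast(hs_males, hs_females, college_males, college_females):
    """Difference between the diagonals of the 2 x 2 sex-by-age table.

    One diagonal holds high school females and college males; the other
    holds high school males and college females.
    """
    return (hs_females + college_males) - (hs_males + college_females)


# Face row of Table 7: .02, .09, .14, -.12
face = interaction_contrast(0.02, 0.09, 0.14, -0.12)
print(round(face, 2))  # 0.33
```

A positive value means that, going from high school to college, the correlations rise for males relative to females.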


Results. Table 6 shows that self-ratings 
of the quality of relationships with others 
were not strongly related to sensitivity to 
nonverbal cues (median r = .00), a finding 
that is consistent with the results of 10 earlier 
studies (median r = .06; Hall et al., 1978). 

The last row of Table 6 provides an in- 
teresting hypothesis for future testing. Among 
high school males and females, there is a 
tendency (rs = .50 and .56, respectively) for 
correlations between nonverbal skill and self- 
ratings to grow larger as channels increase 
in leakiness. Thus, for our younger students, 
skill at eavesdropping on increasingly leaky 
channels was associated with increasingly 
better social relationships, as they perceived 
these relationships. These results are exactly 
opposite to those obtained when teacher rat- 
ings had been employed as the criteria of 
social effectiveness (rs = —.40 and —.69 for 
males and females, respectively). 

When we examine the analogous results for 
the college males and females of Table 6, we 
find the trend to be reversed. For our older 
students, skill at eavesdropping on increas- 
ingly leaky channels was associated with in- 
creasingly poor social relationships, as they 
perceived these relationships. Perhaps these
students have learned, as have the younger
students’ teachers, that it may not be
beneficial to social relationships to be an effective
eavesdropper on leaked nonverbal cues. It
may well be that this knowledge is tacit rather
than explicit (Polanyi, 1962).


Biases and Nonverbal Skill 


The Nonverbal Discrepancy Test (DePaulo
et al., 1978), in which conflicting visual
and auditory cues are presented, provides
several measures other than accuracy at
decoding discrepant cues. Two of these measures
are indexes of bias in the processing of
nonverbal cues. One of these biases, the positivity
bias, is the extent to which the subject is more
influenced by positive than by negative
nonverbal cues. Thus, when either video or audio
positive cues are paired with negative cues in
the remaining channel, the positivity ...
subject rates the scenario as more ...
These bias scores are corrected for the
tendency to rate all scenarios as positive by
subtracting ratings of positiveness for
scenarios showing no discrepancy of affect
between channels. The other bias, video bias, is
the extent to which the subject is more
influenced by video than by audio nonverbal cues.

Positivity Bias 


Table 7 shows the correlations between
subjects’ degree of positivity bias and their skill
at decoding five types of nonverbal cues.
Results are given separately for the male and
female subjects of the high school and college
samples we have been discussing.
For male subjects, there is a tendency for
these correlations to be more positive at the
college level than at the high school level,
whereas for the female subjects, there is a
tendency for these correlations to be more
negative at the college than at the high school
level; for interaction of Sex × Sample, F(1,
2) = 11.85, p < .08. One interpretation of
these results is that relative to men, as women
mature, those who tend to focus more on the
“good” in nonverbal cues tend to become less
accurate decoders of nonverbal cues, perhaps
in an effort to avoid knowing “more than they
should” about others’ states. If this
interpretation were reasonable, we might expect to
find that the interaction described would be
especially true for skills of decoding the less
controllable, more leaky channels. The
statistical test for the systematic growth of a
two-way interaction along an ordered
dimension is the A × B × Linear Trend in C
interaction, which for these data was the
interaction of Sex × Sample × Linear Trend in
Types of the Nonverbal Skills. Results of
this analysis supported our interpretation,
F(…) = 18.25, p < .03. That is, the
tendency for women, relative to men, to become
less accurate at nonverbal decoding as they
mature, when they are more inclined to see
the good in nonverbal cues, becomes
systematically greater when skill at nonverbal
decoding would make them more able and
likely to eavesdrop.


Video Bias 


When people are confronted with
contradictory affective cues in the video and audio
channels, there is a general tendency for
them to be more influenced by the video than
by the audio cues (DePaulo et al., 1978).
People show considerable individual differences,
however, in the degree to which they are
biased in favor of video cues.

... of a greater video bias as a
tendency to focus more on intended than
on leaked nonverbal cues,
since video bias is most pronounced for facial
cues, and facial cues are felt to be best
controlled or least leaky. As we might expect
from the other results presented in this
article, we found females to be more biased in
favor of video nonverbal cues in both the
high school and the college samples (mean
effect size for sex difference = .55σ; df =
161; p < .001). We also found that this sex
difference in video bias was significantly
greater when the video cues were from the
face rather than from the body (mean effect
size for Sex × Channel interaction = .37σ;
df = 161; p = .02).

These results are consistent with the in- 
terpretation that women, more than men, are 
more likely to attend to better controlled 
rather than to more leaky channels of non- 
verbal communication. Still greater force is 
given to this interpretation by the finding of 
a trend for this sex difference in video bias to 
be greater when self-reported relationships 
with the opposite sex were above average in 
quality (effect size = .20σ; df = 161; p <
.16). Perhaps women are reinforced with better
opposite-sex relationships when they attend
to the better controlled channels and learn to 
avoid eavesdropping on the more poorly con- 
trolled, more leaky channels. 


Study Series 3 
Perception of Others 


In a third series of studies, 20 male and 20 
female college students were videotaped while 
describing a person they liked, one they dis- 
liked, one they were ambivalent about, one 
they were indifferent to, a person they really 
liked as though they disliked him or her, and 
a person they really disliked as though they 
liked him or her. The first two encodings 
were of clear affects, the second two encod- 
ings were of mixed affects, and the third two 
encodings were of deceptive affects. Each of 
the 40 encoders also served as decoder and 
rated the sendings of 20 of the encoders on 
the degree to which the sending reflected lik- 
ing, disliking, ambivalence, discrepancy, ten- 
sion, and deception. 

There were large overall differences in the 
average ratings assigned to encodings by de- 
coders. Perhaps as part of the pattern of see- 
ing only what one is supposed to see, females 
rated all encodings as significantly less tense 
(effect size = 1.45σ; t = 4.10, p < .001),
significantly less ambivalent (effect size =
1.48σ; t = 4.19, p < .001), and significantly
less discrepant (effect size = 0.84σ; t = 2.38,
p < .02) than did males. There were no sex
differences in ratings on the other three scales
(t < 1).


Overlooking Deception 


Each decoder’s ratings could be scored for 
accuracy of decoding. For example, a decoder 
was scored as more accurate in decoding lik- 
ing sendings if higher liking ratings were made 
to the encodings of liking than to the encod- 
ings of disliking. For both the liking and dis- 
liking accuracy variables, women were sub- 
stantially and significantly more accurate 
than men, with effect sizes of .92σ (p < .001)
for liking and .68σ (p < .001) for disliking.
Thus, at this task, as at so many others, 
women were more accurate than men at de- 
coding nonverbal cues. 

A result of greater interest, however, was 
that compared to men, women were substan- 
tially and significantly more likely to inter- 
pret the deceptive encodings as the deceiver 
wanted them interpreted rather than as the 
deceiver really felt. That is, when senders 
were describing the person they liked as 
though they disliked him or her, females were 
more likely than males to believe that the
sender was communicating disliking; analogously,
women were more likely to interpret
“dislike as though like” descriptions as
communications of liking. Effect sizes were .78σ
(p < .001) for liked as though disliked and
.95σ (p < .001) for disliked as though liked
descriptions. In decoding deception, then,
women are more likely politely to “read”
what they are supposed to read rather than
what is true.


Openness in Sending 


There is considerable evidence to suggest 
that women are better encoders than men 
(Hall, Note 2), and the suggestion has been 
offered that women have been encouraged to 
be more emotionally expressive than men 
(Buck, Miller, & Caul, 1974; Buck, Savin,
Miller, & Caul, 1972). One way to view these
results, in the context of the present article,
is that women not only eavesdrop less than
men but that they are also more
interpersonally accommodating by making themselves
easy to read nonverbally.

In the present person description study, as
in the studies reviewed by Hall (Note 2), we
found that the nonverbal messages
communicated by women were more easily decoded
than the nonverbal expressions of men (effect
size = .73σ, p < .05).

If women more than men allowed their
nonverbal behaviors to be conveniently legible,
we might also expect to find a sex difference
on Snyder’s Self-Monitoring Scale (1974).
Persons who score high on this scale are
thought to be more controlling of the cues
they emit, whereas those who score low are
thought much more to “wear their feelings on
their sleeves.” We administered the
Self-Monitoring Scale to 37 males and 44 females of
the high school sample discussed in Study
Series 2 and to the 20 males and 20 females
in the person description study. The sex
differences in self-monitoring for these two
samples, though modest in size (.42σ and .2[?]σ),
were consistent with each other and with our
expectations: Women, more than men, were
likely to express their true feelings rather than
to control or monitor their nonverbal
expressions to fit a given situation (mean effect size
= .34σ; combined p = .03).


Conclusions 


Three series of studies were conducted that
addressed the hypothesis that nonverbally
women will be more interpersonally
accommodating than will men. The first two series
of studies focused on the hypothesis that as
nonverbal cues become less intended and less
controllable by senders, women will show
relatively less advantage over men in their
decoding accuracy. Put another way, the
hypothesis was that women would be relatively
less likely than men to eavesdrop on the
nonverbal leakages of others, correcting for
women’s general superiority at decoding nonverbal
cues.

The first series of studies, based on 10
samples, showed that women’s
advantage over men in decoding visual cues was
to a large extent lost when the visual cues
were of very brief duration (median duration
of 250 msec). Exposures so short may be less
controllable by the sender and may carry
more unintended or leaked messages than
exposures in longer time frames.

The second series of studies, based on two
samples (and 23 earlier studies), showed
that when five measures of skill in decoding
nonverbal cues were arranged from most
controllable to least controllable (most leaky),
women showed a systematic decrease in their
superiority over men in going from the less
to the more leaky channels. Although our
initial ordering of these five measures of skill
on a dimension of leakiness was very tentative,
the overall results presented in this article
provide good support for the construct validity
of this particular ordering of the five
measures. Most compelling in this regard are the
accuracy results, based on over 20 studies,
which show that the initial ordering of these
five types of cues fits very well with the
ordering of the magnitude of women’s loss of
superiority in decoding these cues.

The additional evidence for the hypothesis
that women were relatively less likely than
men to eavesdrop on leaky nonverbal channels
was accompanied by suggestive evidence that
there may be social costs to eavesdropping.
The greater one’s skill at decoding the leakier
channels, the relatively less effective are one’s
interpersonal relationships as judged by
outside observers, a finding that was stronger
for women than for men. This result, and the
... students show similar results in ...
ratings of interpersonal relationships,
however, are based on only two samples
and ... be viewed only as hypotheses
for considerable further evaluation.

The second series of studies also suggested
that as women mature, those who tend to
focus more on the positive or “good” in the
nonverbal cues of others may tend to become less
accurate decoders of nonverbal cues, and more
especially so as these nonverbal cues become
less intended, less controllable, or more leaky.
This series of studies also showed that relative
to men, women were more biased to
utilize visual cues than tone-of-voice cues,
and more so when these cues were of the face
rather than of the body. Once again, then, 
women appeared more likely to use the in- 
formation from channels under better sender 
control and offering lower degrees of leaki- 
ness. This sex difference in video bias tended 
to be stronger among those with better op- 
posite-sex relationships. 

The third series of studies showed that 
women rated the behavior of others as less 
tense, less ambivalent, and less channel dis- 
crepant—results that fit well with the pat- 
tern of seeing only what it is polite to see. 
This series also showed that women were 
markedly superior to men in the decoding of 
clear, nondeceptive behavior. However, when 
deception cues were being emitted, women 
were substantially more likely to interpret 
these cues as the deceiver wanted them to be 
interpreted, rather than as the deceiver really 
felt. Finally, this series of studies suggested 
not only that women were less likely to eaves- 
drop on leaky nonverbal channels but also 
that they were more likely to be accurately 
decoded by others. 

Taking together the results of our three 
series of studies, it seems reasonable to con- 
clude that women are more polite in the non- 
verbal aspects of their social interactions than 
are men. They are more guarded in reading 
those cues that senders may be trying to 
hide, but more open in the expression of 
their own affective states. Further, there are 
indications that women who are less accom- 
modating in these nonverbal ways experience 
less successful interpersonal outcomes. Per- 
haps women in our culture have learned that 
there may be social hazards to knowing too 
much about other people’s feelings. This rela- 
tive avoidance of eavesdropping by women is 
consistent with the standards of politeness 
and social smoothing-over that are part of 
the traditional sex role ascribed to women in 
our culture, a sex role that is only now
beginning to change.

Reference Notes

1. Krauss, R. M., Geller, V., & Olson, C. Modalities
and cues in the detection of deception. Paper ...

data analysis. Red


Appendix

The first four measures of sensitivity to nonverbal
cues were derived from the PONS test, a
47-minute film consisting of 220 2-sec audio and/or
visual nonverbal stimuli. In each 2-sec segment, a
14-year-old female acts in one of 20 different
emotional situations. The 20 situations are categorized
with reference to four different types of emotion,
each created by the crossing of two affective
dimensions: positivity-negativity and
dominance-submission. Hence, there are five
positive-dominant situations (e.g., talking to a lost child), five
negative-dominant situations (e.g., expressing
strong dislike), five positive-submissive situations
(e.g., expressing deep affection), and five
negative-submissive situations (e.g., asking forgiveness).
The situations were originally categorized as either
positive or negative and as dominant or
submissive according to the ratings of two different
samples of judges (Rosenthal, Hall, Archer,
DiMatteo, & Rogers, 1979; Rosenthal, Hall,
DiMatteo, Rogers, & Archer, 1979).

The 220 PONS items consist of a random
... of these 20 situations, each represented
in 11 “channels” of nonverbal
communication. Three channels are pure video channels:
face only, body only (neck to knees), and face
... Two channels are pure audio channels:
content-filtered (CF) and “randomized spliced”
(RS). In each of these channels, verbal messages
are incomprehensible. CF preserves
sequence and rhythm (RS does not); RS saves
... The other six channels are
“mixed” channels consisting of all audiovisual
combinations of the two audio with the three
video channels.

For each of the 220 items, subjects select
one of two ... labels, one of which
correctly describes the situation and one which
incorrectly describes it. The incorrect alternative was
assigned to each item by randomly choosing one
of the 19 situation labels that was not the
correct one.

The fifth measure, the Nonverbal Discrepancy
Test (NDT), employs 8 of the 20 situations of
the PONS, 2 from each of the four affective
quadrants of the PONS. Half of the scenes in
the NDT are represented in the face channel, and
half are represented in the body channel. In
addition, half of the scenes represented in each of
these two video channels are also represented in
the content-filtered audio channel, and half of
the scenes are represented in the
randomized-spliced audio channel.

In the discrepancy test, each of the 8 scenes is paired with every
other scene twice. Hence, there are 128 items in the test (8 scenes ×
8 scenes × 2 replications). Each item consists of the simultaneous
pairing of either a face or a body with a content-filtered or
randomized-spliced voice. Every possible audio-video pairing (face-CF,
face-RS, body-CF, body-RS) occurs exactly 32 times. For one quarter of
the items, the audio and the video segments are from the same
affective quadrant (e.g., a positive-dominant face might be paired
with a positive-dominant voice). One quarter of the items consist of
audio and video segments from exactly opposite quadrants (e.g., a
positive-dominant face might be paired with a negative-submissive
voice). The audio and video segments of the remaining items differ on
only one of the affective dimensions. For example, a positive-dominant
face might be paired with a positive-submissive voice. In this case,
the discrepancy would be along the dominance dimension, since both
evaluative inputs are the same (i.e., both are positive). Or, a
positive-dominant face might be paired with a negative-dominant voice.
In this case, both inputs assume the same value on the dominance
dimension (i.e., both are dominant) but they differ on the evaluative
dimension (i.e., one is positive and the other is negative).
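
The item counts described above can be checked by enumeration. The following is a minimal sketch in Python, assuming only what the text states (8 scenes, 2 per affective quadrant, every video scene crossed with every audio scene, 2 replications); the scene labels and function names are hypothetical and are not taken from the test materials.

```python
from collections import Counter
from itertools import product

# Four affective quadrants: evaluation (positive/negative) x dominance.
QUADRANTS = list(product(("positive", "negative"), ("dominant", "submissive")))

# Two scenes per quadrant, 8 scenes in all (scene labels are hypothetical).
SCENES = [(evaluation, dominance, copy)
          for evaluation, dominance in QUADRANTS
          for copy in (1, 2)]


def discrepancy(video, audio):
    """Number of affective dimensions (0, 1, or 2) on which two scenes differ."""
    return int(video[0] != audio[0]) + int(video[1] != audio[1])


def build_items(replications=2):
    """Cross every video scene with every audio scene, `replications` times."""
    return [(video, audio)
            for video, audio in product(SCENES, SCENES)
            for _ in range(replications)]


items = build_items()
counts = Counter(discrepancy(video, audio) for video, audio in items)
print(len(items), dict(sorted(counts.items())))
```

Under these assumptions the enumeration yields 128 items, of which one quarter show no discrepancy, one half differ on a single dimension, and one quarter come from opposite quadrants, in agreement with the proportions given in the text.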

Subjects rate each scene on a 9-point scale of 
discrepancy. Their accuracy (or sensitivity) is a 
function of the degree to which they rate as more 
discrepant those scenes in which the video and 
audio channels are, in fact, more discrepant, com- 
pared to their ratings of the scenes in which the 
video and audio channels are, in fact, less discrepant.


Young Children’s Preferred Attentional Strategies for
Delaying Gratification


In four experiments, children controlled the frequency and duration of
self-exposure to sets of stimuli during a delay of gratification. The
stimuli included the real rewards for which they were waiting,
symbolic (picture) versions of those rewards, or irrelevant objects.
Previous research had shown that exposure to the real rewards hinders
self-imposed delay, but viewing irrelevant stimuli or pictures of the
rewards facilitates delay. In the present study, we recorded how long
children viewed each stimulus during the delay and also assessed their
verbal viewing preferences. Preschool children spontaneously tended to
prefer to view the real rather than the symbolic stimuli, regardless
of their relevance to the rewards in the delay contingency, both when
the delay was self-imposed and when it was imposed externally. This
generalized preference for attending to the “real rewards” rather than
to more symbolic representations helps to explain why it is so
difficult for young children to tolerate voluntary delay of
gratification. By attending to the real rewards, they may make such
delay especially frustrative and arousing, thereby defeating their own
ability to wait for what they want. In contrast, children in Grades 1,
2, and 3 systematically selected effective attentional strategies
during delay.


A great deal of experimental research has focused on the stimulus
conditions that facilitate or impede delay of gratification. These
studies have manipulated such variables as the presence or absence of
the rewards and assessed how they influence the ability and
willingness of subjects to tolerate various types of delay of
gratification (e.g., Mischel, 1968, 1974; Mischel & Baker, 1975;
Moore, Mischel, & Zeiss, 1976). The results provide much evidence
about conditions that may facilitate waiting for deferred outcomes. In
contrast, remarkably little is known about the subject’s own natural
attentional strategies for coping with various types of delay. Thus,
we know a considerable amount about delay-enhancing stimulus
conditions but almost nothing about children’s preferred attentional
strategies during delay of gratification. To begin to fill this gap,
in the present set of studies we explored young children’s verbal
preferences and actual use of different attentional strategies for
sustaining delay of gratification. Our results should help clarify the
extent to which young children know and utilize delay-facilitating
attentional strategies when faced with situations requiring them to
wait for deferred but desired outcomes.

The delay of gratification paradigm used in the earlier research
provided children with the dilemma of waiting for a more preferred but
delayed reward or terminating the delay and getting a less preferred
reward immediately (Mischel, Ebbesen, & Zeiss, 1972). Using this
paradigm, several studies examined how attention to the delayed and
immediate rewards changes children’s ability to wait for the preferred
outcome. Initially,
it was predicted that attention to the rewards would facilitate delay
of gratification. This prediction was based on Freud’s (1911/1959)
provocative idea that “cathecting” images of desired but blocked
gratifications is a cornerstone for the development of frustration
tolerance. Surprisingly, it was found that rather than facilitating
delay by helping the subject anticipate the desired but blocked goal,
attending to the delayed and immediate rewards appears to increase the
child’s arousal and frustration to the point where delay becomes too
aversive to continue (Mischel, 1974). Specifically, the presence of
the actual rewards during the delay period consistently produces
shorter delay times than does the absence of the rewards (Mischel &
Ebbesen, 1970; Mischel et al., 1972; Schack & Massari, 1973). When the
rewards are not visible during delay, instructions to think about them
have the same negative effect on delay time as the rewards themselves
(Mischel et al., 1972). But while attention and thought directed to
the real rewards hinder delay of gratification, attention to their
symbolic representations in the form of images (pictures of the
rewards) facilitates delay (Mischel & Moore, 1973; [name illegible],
1974). Taken as a whole, these and related studies (e.g., Mischel &
Baker, 1975; Moore et al., 1976) provide a coherent, meaningful set of
rules that describe how attention to different types of stimuli can
hinder or facilitate self-imposed delay of gratification. For the
present investigation, the gist of these rules is that there are at
least two basic strategies for using attention to facilitate delay of
gratification and at least two basic attentional strategies that
hinder delay. Attention to a symbolic or “cool” representation of
objects that are in the delay contingency (i.e., “relevant rewards”)
as well as attention to real objects that are irrelevant to the delay
contingency (“irrelevant rewards”) are effective strategies for
self-imposed delay of gratification. Attention to the real relevant
rewards and attention to a symbolic representation of irrelevant
rewards are both ineffective strategies that hinder delay, at least in
the types of situations sampled in these studies.

In a recent series of related studies, Miller 
and Karniol (1976a, 1976b) have distin- 
guished between self-imposed delay (SID) 
of the sort studied in the paradigm of Mischel 
and his associates and externally imposed de- 
lay (EID), in which the subject cannot vol- 
untarily terminate the delay in favor of an 
alternative, immediate, but less preferred re- 
ward. Miller and Karniol believe that the 
findings of Mischel and associates apply to 
SID but not to EID. They argue that when 
individuals have no choice but to endure the 
delay, it is adaptive to focus on the rewards, 
and they have provided some empirical sup- 
port for their view. Miller and Karniol’s con- 
clusions should still be seen as tentative, but 
their work does underline the need to dis- 
tinguish between the two types of delay, and 
therefore, in the first study to be reported 
here, we included both the SID and EID paradigms.


Experiment 1 


To investigate children’s preferred atten- 
tional strategies for delaying gratification, 
the delay paradigm developed by Mischel et 
al. (1972) was modified. Instead of placing 
fixed stimuli in front of the child for the en- 
tire delay, we designed equipment that al- 
lowed the subject to self-regulate the presen- 
tation of stimuli throughout the delay period. 
Each child had available for self-presentation 
one of the two types of stimuli that facilitate 
SID and one of the two types of stimuli that 
hinder it. Different groups of children had 
available for viewing one of the four possible 
pairings of the two SID-facilitating and two 
SID-hindering stimuli. We recorded the 
amount of time that the children actually 
viewed the different types of stimuli during 
either a self-imposed delay of gratification 
(SID) or an externally imposed delay (EID) 
of equal duration and also assessed their preferences verbally after
completion of the delay.


Table 1
SID-Facilitating and SID-Hindering Stimuli Available for Presentation
in Different Stimulus-Pair Conditions

Stimulus pair

SID-facilitating stimulus set     vs.     SID-hindering stimulus set

Picture of rewards                        Actual rewards
(Relevant symbolic)                       (Relevant real)

Irrelevant objects                        Irrelevant picture
(Irrelevant real)                         (Irrelevant symbolic)

Picture of rewards                        Irrelevant picture
(Relevant symbolic)                       (Irrelevant symbolic)

Irrelevant objects                        Actual rewards
(Irrelevant real)                         (Relevant real)

Note. SID = self-imposed delay.


Method 


Overview and Design 


Each child had one SID-facilitating and one SID- 
hindering set of stimuli available for viewing dur- 
ing the delay period. Two sets of stimuli exemplified 
the attentional strategies that have been found to 
hinder SID: (a) the actual rewards in the delay 
contingency (relevant real stimuli) and (b) a life- 
size color photograph of desirable objects that were 
irrelevant to the delay (irrelevant symbolic stimuli). 
Two other sets of stimuli exemplified the attentional 
strategies that have been found by previous research 
to facilitate SID: (c) a life-size color photograph 
of the rewards involved in the delay (relevant sym- 
bolic stimuli) and (d) a set of desirable objects that 
were irrelevant to the delay contingency (irrelevant 
real stimuli). All possible pairings of one set of delay- 
facilitating stimuli and one set of delay-hindering 
stimuli were tested. Table 1 lists the four stimulus- 
pair conditions used in this and the following studies. 
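
The four stimulus-pair conditions follow directly from the two-by-two classification of the stimuli (relevant versus irrelevant, real versus symbolic). As a minimal illustration in Python, assuming only the rule summarized above (relevant symbolic and irrelevant real stimuli facilitate SID, the other two hinder it), the pairings can be generated as follows; the function and variable names are hypothetical.

```python
from itertools import product

# Each stimulus set is described by (relevant to the delay?, real object?).
STIMULI = {
    "actual rewards": (True, True),        # relevant real
    "picture of rewards": (True, False),   # relevant symbolic
    "irrelevant objects": (False, True),   # irrelevant real
    "irrelevant picture": (False, False),  # irrelevant symbolic
}


def facilitates_sid(relevant, real):
    """Rule from prior research as summarized in the text: relevant symbolic
    and irrelevant real stimuli facilitate self-imposed delay."""
    return relevant != real


facilitating = [name for name, (relevant, real) in STIMULI.items()
                if facilitates_sid(relevant, real)]
hindering = [name for name, (relevant, real) in STIMULI.items()
             if not facilitates_sid(relevant, real)]

# Every pairing of one facilitating set with one hindering set.
stimulus_pairs = list(product(facilitating, hindering))
for facilitating_set, hindering_set in stimulus_pairs:
    print(f"{facilitating_set} vs. {hindering_set}")
```

Two of the resulting pairs contrast real with symbolic stimuli at a fixed level of relevance, and two contrast relevant with irrelevant stimuli at a fixed level of realism.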

In a 2 (SID versus EID type of delay) × 4 (stimulus
pairs) factorial design, children were exposed
to a delay of gratification while being able to self- 
present stimuli from one of the four types of stimu- 
lus pairs shown in Table 1. Duplicating the procedure 
used by Mischel and his colleagues for self-imposed 
delay of gratification (Mischel, 1974), we instructed 
children in the SID condition that they could ter- 
minate the delay at any time by pushing a button 
that rang a bell and signaled the experimenter to re- 
turn. The children were also told that if they ter- 
minated the delay, they would receive the nonpre- 
ferred reward instead of the preferred reward. Chil- 
dren in the EID condition could not signal the 
experimenter to return. Instead, these children were 
yoked to children in the SID condition for duration 
of delay. That is, if a child in the SID condition
terminated the delay before the maximum time had
elapsed, a child of the same sex in the EID condition
was required to wait the same amount of time.
The maximum delay period was 10 minutes. 
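
The yoking of EID children to SID children can be sketched as a simple same-sex matching of delay durations. The Python sketch below is illustrative only: the data layout, the function name, and the first-come matching order are assumptions, since the text specifies only that each EID child waited as long as a same-sex SID child, up to the 10-minute maximum.

```python
MAX_DELAY_MIN = 10.0  # maximum delay period stated in the text


def yoke_delays(sid_children, eid_children):
    """Assign each EID child the delay of a same-sex SID child.

    sid_children: list of (child_id, sex, delay_minutes)
    eid_children: list of (child_id, sex)
    Matching order within sex is an assumption (first available SID child).
    """
    pools = {}
    for _, sex, delay in sid_children:
        pools.setdefault(sex, []).append(min(delay, MAX_DELAY_MIN))

    assigned = {}
    for child_id, sex in eid_children:
        # Raises IndexError/KeyError if no same-sex SID child remains.
        assigned[child_id] = pools[sex].pop(0)
    return assigned


# Hypothetical example data.
sid = [("s1", "F", 4.5), ("s2", "M", 10.0), ("s3", "F", 12.0)]
eid = [("e1", "M"), ("e2", "F"), ("e3", "F")]
print(yoke_delays(sid, eid))
```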


Subjects and Experimenters 


The subjects were 96 preschoolers (48 girls and 48
boys) who attended Bing Nursery School at Stanford
University. Two male and two female experimenters
tested equal numbers of male and female children in
each cell of the design. Children were randomly
assigned to conditions so that 6 males and 6 females
participated in each cell of the design.² The children
ranged in age from 3.0 to 5.2 years, with a mean of
4.4 years.


Apparatus 


The experimental room resembled the one diagrammed in Mischel and
Moore (1973). A visual barrier concealed a box of attractive toys in
one corner. A small table and chair were set against the wall on the
opposite side of the room. The chair was positioned inside a square
that was outlined on the floor with black tape. Electromechanical
equipment on the table enabled the child to control the frequency and
duration of two stimulus sets. The buttons that the child pushed to
present alternative stimuli for viewing required extremely little
pressure to operate and were spaced 2 inches (5.1 cm) apart on the
surface of a prominent metal box. The position of these buttons in
relation to other apparatus is shown in Figure 1.

Two devices that contained the stimuli faced the child on his or her
immediate right and left. Pushing the left button exposed the stimulus
set on the left, and pushing the right button exposed the stimulus set
on the child’s right. An opaque cover was hinged to the back edge of
the lower part of each device so that a solenoid held the cover open,
exposing the set of stimuli for viewing only as long as the child
continued to press the button. The stimuli were mounted at a 45° angle
so that they could be viewed by the child. When the child was not
pressing the button, another solenoid held the cover shut, preventing
the child from viewing the stimuli without pushing a button. Only one
device could be operated at a given time; pushing both buttons
simultaneously exposed neither set of stimuli. To control for position
preferences, half of the children in each cell of the design had a
stimulus set on their right, and the other children had the same
stimulus set on their left.

Electromechanical equipment recorded the duration and frequency of the
child’s self-presentation of each stimulus set. Duration was recorded
in units of .005 minutes. Relay circuitry initiated data recording
automatically when the experimenter left the room and terminated data
recording after 10 minutes.


1 Of course, the children were observed throughout the delay (through
a one-way mirror), and the session would have been stopped immediately
if the child appeared upset.

2 Nine additional children began the experiment but were not included
in the final sample because of problems with equipment or with
comprehension of instructions.


Reward Stimuli 


The alternative rewards used in the present study differed only in
quantity; they were (a) three stick pretzels versus one stick pretzel
(each pretzel was 16 cm × .3 cm) and (b) three regular-size poker
chips (white, red, and blue) versus one regular-size poker chip
(white). Half of the children in each cell chose between (and waited
for) the food rewards, and the other children in each cell chose
between (and waited for) the nonfood rewards. Thus, for the children
who waited for food rewards, the stick pretzels were the relevant
reward stimuli, and the poker chips were irrelevant. For the children
who waited for poker chips, the poker chips were the relevant rewards,
and the pretzels were the irrelevant stimuli, providing a complete
crossover design. The stimulus pairs available for self-presentation
consisted of real and symbolic (picture) versions of the relevant and
irrelevant items in the four combinations shown in Table 1. The four
sets of stimuli from which these combinations were made were (a) three
stick pretzels, (b) three poker chips, (c) a life-size color
photograph of the three pretzels, and (d) a life-size color photograph
of the three poker chips.

Figure 1. Apparatus for self-presentation of alternative stimulus
sets, viewed from about 2 feet (61.5 cm) above the child's eye level.


Procedure


Upon entering the room, the child was shown the toys that would be
played with later. The experimenter then concealed the toys and asked
the child
to sit at a table on the other side of the room. 
The child was instructed to remain seated inside 
the square that was taped on the floor. The experi- 
menter took a box from underneath the table, ex- 
posed the scheduled reward, and asked the child 
which alternative he or she would like to have. The 
experimenter explained that he or she would leave 
the room soon but would return later to give the 
chosen items to the child. These instructions were 
repeated until the child understood them. Afterwards,
the rewards were returned to the box, which
was then concealed under the table. 

The experimenter proceeded to show the child 
how to use the stimulus presentation apparatus. So
that the children would be less inclined to push 
both buttons simultaneously (a problem noted in 
pretesting), the children were asked to fold their 
hands together, to extend their forefingers, and to 
keep their forefingers together throughout the session.
In essence, this procedure prevented the children
from pushing both buttons simultaneously.
After folding his or her hands and fingers together, 
the child was encouraged to push the buttons. 
Whenever the child pushed a button, the experi- 
menter named the presented stimulus set, noting 
whether it was a set of real objects or a picture of 
real objects (e.g., “Oh, look! That button shows
you a real ___ and a real ___!” or “Oh, look!
That button shows you a picture of ___ and a
picture of ___!”). The child was instructed to
look closely at whichever stimuli were exposed, and 
the experimenter modeled intent viewing of which- 
ever stimuli the child self-presented. 

After the child had become familiar with the 
stimuli and the apparatus by presenting each stim- 
ulus set at least three times, he or she was asked 
not to push both buttons at the same time and not 
to touch the devices or stimuli. The child was then 
quizzed on which button produced which set of 
stimuli, to be certain that the child knew which 
stimuli were available for viewing and how to 
present each of the available stimulus sets. In this 
study, the children were given no specific instruc- 
tions about how to structure their viewing of the 
alternative stimuli during the delay; they were 
simply asked to “push the buttons.” The children 
also were not given specific information about when 
the experimenter would return but simply were re- 
assured that the delay would not be “too long.” 

The experimenter left the room after reminding 
the child that he or she would play with the toys 
later. During the delay, the experimenter observed
the child through a one-way mirror and returned 
to the room after 10 minutes. The child was then 
again shown the reward items and was asked 
which he or she was supposed to receive. All chil- 
dren remembered which one they previously had 
chosen to receive, were given that object, and next 
were asked the following questions in random order: 


Question 1. If you could have just one of these
(stimulus sets) in front of you while you were
waiting, which one would you want in front of
you all the time while you were waiting?

Question 2. Which one (of the stimulus sets) did
you like to see the most while you were waiting?


Table 2
Stimulus Preferences in Each Pair on Each Measure (Experiment 1)

                                        Measure
                             Question 1   Question 2   Viewing
Stimulus pair                (choice)     (liking)     time

Relevant
  Real vs. symbolic          [illegible]  [illegible]  [illegible]
Irrelevant
  Real vs. symbolic          .83***       [illegible]  [illegible]
Symbolic
  Relevant vs. irrelevant    .79**        .62          .53
Real
  Relevant vs. irrelevant    .58          .62          .49

Note. The data are proportions: The larger the proportion, the greater
the preference for the top stimulus in each pair. n = 24 for all
measures. Stimuli that facilitate self-imposed delay (SID) are in
boldface type. In all tables, proportions for Question 1 and Question
2 are not directly comparable to proportions for viewing time (see
text). For Questions 1 and 2, binomial tests compared the data to
chance levels; for viewing time, t tests compared the means to an
ideal neutral proportion of .50. All tests are two-tailed throughout
this article.

* p = .06.
** p < .025.
*** p < .002.


After recording the child’s answer, the experi- 
menter led the child to the toys and, after they had 
played with them for 5 to 10 minutes, returned the 
child to the nursery school classroom. 


Results 


Children’s answers to the postdelay ques- 
tions and the relative amount of time they 
spent viewing alternative stimuli showed a 
consistent pattern: They preferred to view 
real stimuli regardless of their relevance to 
the delay (see Table 2) and regardless of 
the type of delay (EID or SID).

Answers to the postdelay questions were analyzed in terms of the
proportion of children in a given cell of the design who answered the
question by pointing to the delay-facilitating stimulus in the
stimulus pair.³ The relative amount of time spent viewing the
delay-facilitating stimulus in a given pair was calculated for each
child separately as the duration of viewing the delay-facilitating
stimulus divided by the total duration of viewing both stimuli.
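
The viewing-time measure defined above is a simple ratio. A minimal sketch in Python follows; the function name and the handling of a child who viewed neither stimulus are assumptions, since the text does not say how such a case would be scored.

```python
def relative_viewing_time(facilitating_minutes, hindering_minutes):
    """Proportion of total viewing time spent on the delay-facilitating set.

    Returns None when the child viewed neither stimulus set (an assumption;
    the source does not describe how such cases were handled).
    """
    total = facilitating_minutes + hindering_minutes
    if total == 0:
        return None
    return facilitating_minutes / total


# Hypothetical child: 1.5 min on the facilitating set, 4.5 min on the other.
print(relative_viewing_time(1.5, 4.5))  # 0.25
```

A value below .50 indicates that the child spent more time viewing the delay-hindering stimulus set.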

Preliminary analyses found no significant effects of food versus
nonfood rewards. Separate analyses of variance conducted on the three
measures found no significant main effects or interactions of SID
versus EID: both Fs < 1 for the two questions; F(1, 88) = 2.47,
p > .10, for relative time spent viewing stimuli.⁴

To examine the effects of stimulus pairs, we combined data for types
of delay within each stimulus-pair condition. These analyses showed a
consistent pattern on all three measures. More specifically, the
proportions in Table 2 express the relative preference for the
stimulus listed first in each pair. When given a choice between
relevant real versus relevant symbolic stimuli, the children preferred
the real stimuli: p < .002 by a two-tailed binomial test for Question
1; p = [illegible] by a two-tailed binomial test for Question 2;
t(88) = 2.71, p < .01, two-tailed, for time spent viewing stimuli.
When given a choice between irrelevant real and irrelevant symbolic
stimuli, children again preferred the real stimuli: p < .002 by a
binomial test for Question 1; p = .06 by a binomial test for Question
2; t(88) = 3.48, p < .002, for time spent viewing stimuli. In
contrast, children who could not choose between real and symbolic
stimuli, but only between relevant and irrelevant stimulus sets that
were both real or both symbolic, did not exhibit a consistent
significant preference for either relevant or irrelevant stimuli:
p < .01 in favor of relevant symbolic as opposed to irrelevant
symbolic stimuli for Question 1; but p > [illegible] for Question 2 in
the same stimulus pair; ps > .30 for both Question 1 and Question 2 in
the relevant-real versus irrelevant-real stimulus pair. In addition,
children did not show a significant preference when actually viewing
relevant versus irrelevant stimuli during the delay, regardless of
whether the stimuli were symbolic or real (ts < 1.2).


3 As recommended by Winer (1971, pp. 399-400), the
variance-stabilizing transformation $\phi_X = 2 \arcsin[(X)^{.5}]$ was
performed on each proportion, where $X$ is the original proportion and
$\phi_X$ is expressed in radians. The theoretical variance of the
transformed proportions is not dependent on the original proportions
and instead is simply the constant 1/n. In the analyses of Question 1
and Question 2 responses (for which there is only one proportion per
cell and hence no within-cell variance that can be calculated),
[illegible] just as we use the normal distribution (a t distribution
with infinite degrees of freedom) when we “know” the “true” population
variance, we use an F distribution with k and ∞ degrees of freedom
when we know the true or theoretical variance (cf. Langer & Abelson,
1972, for an example and extended discussion of this procedure).

Fourfold point correlations (Hays, 1963) showed that the three
measures were significantly and substantially correlated with each
other: φ = .43, p < .001, between Question 1 and viewing; φ = .34,
p < .005, between Question 2 and viewing; φ = .55, p [illegible],
between Question 1 and Question 2.

Before discussing the significance and implications of the results of
the first study, we describe two additional experiments; an overview
of the total pattern of data is then presented.


Experiment 2 


Because important changes in ideation and [illegible] may occur during
the preschool years (e.g., Paivio, 1971; Piaget & Inhelder, [year
illegible]), the second study was developmental and systematically
included children from three age groups: 3.00 to 4.25 years, 4.50 to
5.75 years, and 6.00 to 7.00 years. Thus, we used a 4 (stimulus pairs)
× 3 (age groups) factorial design. Given the finding in Experiment 1
that preference patterns were highly similar for SID and EID, in our
second experiment (and also in the third one to be
reported in this article) we used only the 
EID paradigm, because it controlled length 
of delay time, thus preventing stimulus-view- 
ing proportions from being confounded with 
duration of delay. Except for this change 
and for differences in the specific relevant 
and irrelevant reward stimuli used, the basic 
procedures and measures were essentially 
identical with those in Experiment 1. 


Method 


Reward Stimuli 


To extend the generalizability of findings, some- 
what different sets of reward stimuli were used. 
Specifically, the stimulus sets used in Experiment 2 
were (a) one stick pretzel and one miniature marsh- 
mallow, (b) one small red poker chip (.75 inches, 
or 1.9 cm, in diameter) and one penny, (c) a life- 
size color photograph of the pretzel and the marsh- 
mallow, and (d) a life-size color photograph of 
the poker chip and the penny. 

Half of the children in each cell of the design 
chose between receiving one pretzel and one marsh- 
mallow, and the other children in each cell chose 
between receiving one poker chip and one penny. 
Thus, for the former groups of children, the pretzel 
and marshmallow were the relevant reward items, 
and the poker chip and penny were the irrelevant 
items. For the other group of children, the poker 
chip and penny were the relevant reward items, 
and the pretzel and marshmallow were the irrelevant 
items. To avoid differences in stimulus viewing 
that might result from different durations of delay 
or from different durations of exposure to alterna- 
tive stimuli, all children were exposed to a fixed 
period (12 minutes) of EID. Children were tested 
in random order in terms of stimulus-pair condi- 
tion, age group, gender, and experimenter. 


Subjects and Experimenters 


The subjects were 96 preschool children (48 girls 
and 48 boys) who attended Bing Nursery School 
at Stanford University. Each cell of the design in- 
cluded 4 male and 4 female children. Two children 
of each sex delayed food gratification (pretzel/ 
marshmallow), and two children of each sex de- 
layed nonfood gratification (poker chip/penny). 
Two female and two male experimenters tested 
equal proportions of the children in each cell of the design.


4 These t tests used the error term from the preceding analysis of
variance as the pooled estimate of standard deviation,
$(MS_e/n)^{1/2}$. These and all t tests in this article are
two-tailed.


Table 3
Stimulus Preferences in Each Pair on Each Measure (Experiment 2)

                                        Measure
                             Question 1   Question 2   Viewing
Stimulus pair                (choice)     (liking)     time

Relevant
  Real vs. symbolic          [illegible]  [illegible]  [illegible]
Irrelevant
  Real vs. symbolic          [illegible]  [illegible]  [illegible]
Symbolic
  Relevant vs. irrelevant    .67          .67          .53
Real
  Relevant vs. irrelevant    .67          .58          .51

Note. The data are proportions: The larger the proportion, the greater
the preference for the top stimulus in each pair. Stimuli that
facilitate self-imposed delay (SID) are in boldface type. n = 24 for
all measures.

* p = .06.
** p < .025.
*** p < .002.


Results 


Children’s answers to the postdelay ques- 
tions and the relative amount of time that 
they actually spent viewing the alternative 
stimuli again showed the same consistent 
pattern. As summarized in Table 3, both 
verbally and behaviorally, children preferred 
to view real stimuli regardless of their relevance
or irrelevance to the delay contingency.⁵

Separate analyses of variance conducted 
on the three measures revealed no signifi- 
cant main effect or interactions of age group 
(all Fs < 1). To examine the effects of
stimulus pairs, subsequent analyses com- 
bined data for age groups within each stimu- 
lus-pair condition. 

These analyses showed a consistent pattern on all three measures that
closely matched the results of Experiment 1. Specifically, when given
a choice between relevant real versus relevant symbolic stimuli, the
children preferred real stimuli. Two-tailed binomial tests yielded
p < .002 for Question 1 and p = .06 for Question 2; t = 2.90, p < .01,
for time spent viewing stimuli. Similarly, when given a choice between
irrelevant real and irrelevant symbolic stimuli, children preferred
the real stimuli: p < .01 for Question 1; p = .06 for Question 2;
t(84) = 3.31, p < .002, for time spent viewing stimuli. When the
children could not choose between real and symbolic stimuli but only
between relevant and irrelevant stimulus sets that were both real or
both symbolic, no significant preferences were found on any measure.
Children who had these stimulus sets available showed a nonsignificant
tendency to prefer relevant stimuli (ps = .15 for Question 1;
ps ≥ .15 for Question 2; ts < 1.3 for time spent viewing stimuli).

Fourfold point correlations measured the strength of intermeasure
relationships. The three measures were again significantly correlated
with each other: φ = .35, p < [illegible], between Question 1 and
stimulus viewing; φ = .39, p < .001, between Question 2 and stimulus
viewing; φ = .42, p < .001, between Question 1 and Question 2. The
strength of intermeasure relationships did not change systematically
as a function of age group. Between Question 1 and stimulus viewing,
the correlations were [illegible], [illegible], and .32 for the 3.00
to 4.25-year-old, the 4.50 to 5.75-year-old, and the 6.00 to
7.00-year-old groups, respectively ([illegible] .05). Between Question
2 and stimulus viewing, correlations were .62 (p < .001), [illegible],
and .44 (p < .025) for the three age groups, respectively. The
correlations between the two questions were .22 ([illegible]),
[illegible], and .47 (p < .01) for the three age groups, respectively.


Experiment 3 


The results of Experiment 2 strengthen the generality of the findings
from Experiment 1 and are completely consistent with them. Regardless
of age, preschool children tend to prefer to view whichever stimuli
are real, during delay of gratification. Before reaching any firm
interpretations and conclusions, however, we conducted a third
experiment designed to overcome some possible ambiguities. Namely, in
the first two experiments, the viewing instructions to the children
simply requested that they push buttons exposing stimuli during the
delay period. Likewise, the postdelay questions may have been somewhat
ambiguous. Can we be sure that the children’s responses reflected
attentional strategies that they believed would help them wait?
Instead, might they have viewed whatever they found more attractive or
desirable (i.e., real objects rather than pictures of them)? Might the
children’s viewing and verbal preferences for the real stimuli
possibly have been the result of trying to “tell” the experimenter
that they wanted to receive one of the sets of real objects after the
delay, instead of [illegible]? In that case, they might have concluded
that what they viewed during the delay would influence what the
experimenter would give them when he or she returned. Likewise, the
children may have self-presented real rather than symbolic stimuli so
that when the experimenter opened the door, he or she would see the
children looking at the real stimuli and might be more likely to give
them the real objects rather than pictures of them.

To clarify the meaning of the children’s preferences more precisely,
in Experiment 3 we explicitly asked half the subjects to view those
stimuli that would best facilitate delay (efficacy strategy) and half
to view those stimuli that they wanted to see (wish strategy).
Moreover, we carefully arranged the situation and the postdelay
questions to avoid alternative interpretations.


5 As in Experiment 1, analyses of variance found no significant
effects due to the food/nonfood manipulation in Experiment 2 or in
Experiment 3.


Method


In an EID situation, half of the children in each of the four stimulus-pair
conditions were instructed to view the stimuli which, in their opinion, would
best help them wait (efficacy strategy). The remaining children in each
stimulus-pair condition were asked to view those stimuli which they wanted
to see (wish strategy). Thus, we used a 2 (viewing instructions) × 4
(stimulus pairs) factorial design.

The same rewards and stimuli used in Experi- 
ment 1 were used in Experiment 3. As in the first 
two studies, rewards and relevancy of stimuli were 
completely crossed within the design. Because Ex- 
periment 1 found no significant difference on any 
measure between SID and EID, in Experiment 3 
we again used only EID to provide maximum 
control over duration of delay. 


Subjects 


A total of 80 children (40 girls and 40 boys) 
who attended Bing Nursery School were the sub- 
jects. Four female experimenters tested equal pro- 
portions of the 5 girls and 5 boys who were ran- 
domly assigned to each cell in the design. The 
children ranged in age from 2.7 to 5.4 years, with
a mean age of 4.3 years. 
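
The cell structure described above (2 viewing instructions × 4 stimulus pairs, with 5 girls and 5 boys per cell) can be sketched in a few lines. This is only an illustration of balanced random assignment, not the authors' actual procedure: the child identifiers, the seed, and the function name `assign_children` are hypothetical, and the stimulus-pair labels are taken from the row labels of Table 4.

```python
import random
from collections import Counter
from itertools import product

# Factor levels named in the text; labels for the pairs follow Table 4.
INSTRUCTIONS = ["wish", "efficacy"]
STIMULUS_PAIRS = [
    "relevant: real vs. symbolic",
    "irrelevant: real vs. symbolic",
    "symbolic: relevant vs. irrelevant",
    "real: relevant vs. irrelevant",
]


def assign_children(n_per_sex_per_cell=5, seed=0):
    """Randomly assign hypothetical children so each cell gets equal girls and boys."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    cells = list(product(INSTRUCTIONS, STIMULUS_PAIRS))  # 2 x 4 = 8 cells
    assignment = {}
    for sex in ("girl", "boy"):
        n_children = n_per_sex_per_cell * len(cells)
        ids = [f"{sex}{i:02d}" for i in range(1, n_children + 1)]  # hypothetical IDs
        rng.shuffle(ids)
        # Deal the shuffled children round-robin into the cells.
        for index, child in enumerate(ids):
            assignment[child] = cells[index % len(cells)]
    return assignment


assignment = assign_children()
cell_sizes = Counter(assignment.values())
print(len(assignment), "children in", len(cell_sizes), "cells")
```

Shuffling within each sex before dealing children into cells keeps the sex ratio fixed at 5:5 in every cell while leaving the particular cell each child lands in to chance.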


Procedure 


The procedure used in the first two studies was 
modified as follows: The children were assured 
that the experimenter would not know anything 
about what they did during the delay. The children 
also were told that the experimenter would knock 
on the door three times before reentering the room 
and at that time they should stop looking at either 
stimulus set and, instead, should place their hands 
on top of their heads as soon as the experimenter 
knocked on the door. Furthermore, a tall (ap- 
proximately 8 feet, or 2.44 m, high) divider was 
placed between the door and the table at which the 
children sat. The children were told that in case 
they forgot to put their hands on their heads when 
they heard the experimenter knock, the divider 
would prevent the experimenter from seeing which 
stimuli they might be looking at. 

Just before leaving the room, the experimenter 
consulted a previously concealed portion of the 
schedule sheet to find out which predelay stimulus- 
viewing instructions the child was to receive. Half 
the children were given the efficacy strategy and 
were told the following: 


Push the button that shows you what helps you
wait the most ... Look at the one (stimulus
set) which makes it easiest for you to wait.


The other half were given the wish strategy and
were told to:


Push the button that shows you what you like to
see the most while you're waiting ... Look at
the one which you like to see the most.


Upon returning to the room after the delay, the experimenter
gave the child the promised reward. She
Table 4


Stimulus Preferences in Each Pair on Each Measure (Experiment 3)


Measures (column groups): Question 1 (help); Question 2 (liking); Viewing time
Viewing strategy (columns within each measure): Wish, Efficacy, Total
Stimulus pairs (rows):
Relevant: Real vs. symbolic
Irrelevant: Real vs. symbolic
Symbolic: Relevant vs. irrelevant
Real: Relevant vs. irrelevant

1,00*** -50 “fee 


70 -70 
40 
90** 


40 65 


-10 19" .67"* 50 


-80 58 


-70 .90** .80** 56 57 56 


-10 55 .60 S4 S 


53 


-60 -65 AT 60 


Note. The data are proportions: The larger the proportion, the greater the preference for the top stimulus 
in each pair. n = 10. Stimuli that facilitate self-imposed delay (SID) are in boldface type. 


then asked two questions. The first question of the
previous experiment was revised to make it more
direct:


Please tell me which one (of the two stimulus 
sets) showed you what helped you wait the most. 
Which one made it easiest for you to wait? 


The second question remained quite similar to Ques- 
tion 2 in Experiments 1 and 2: 


Please tell me which one showed you what you 
liked to see the most. Which one did you want 
to look at? 


Both questions were asked of all children in a forced- 
choice format and in random order. 


Results 


Separate analyses of variance for the three
measures yielded no significant main effects
of predelay viewing instructions (all Fs < 1).
The Instruction × Stimulus Pair interaction
did not reach significance for time
spent viewing stimuli, F(3, 72) = 2.12, p > .10,
or for Question 2, F(3, ∞) = 1.23, p > .20.
However, there was a significant interaction
of Instructions × Stimulus Pairs
on Question 1, F(3, ∞) = 6.15, p < .001.⁶


As Table 4 indicates, this interaction reflects the fact that children who
had been given the wish strategy overwhelmingly chose the relevant real
(rather than relevant symbolic) stimuli while showing no strong preference
for relevant real versus irrelevant real stimuli. In contrast, subjects given
the efficacy strategy showed no systematic preference for relevant real
versus relevant symbolic stimuli, but did choose relevant real over
irrelevant real stimuli.

The overall data are summarized in Table 4. Let us consider each measure
separately. During the delay, children tended to actually view the relevant
real more than the relevant symbolic stimuli when given a predelay wish
strategy (i.e., asked to self-present the stimuli they would like to see the
most while waiting), t(72) = [illegible], p < .01. But as Table 4 shows,
children who had been asked to self-present the stimuli that would help them
wait the most (efficacy strategy) showed no such systematic preferences;
they viewed the relevant symbolic stimuli as much as they did the relevant
real rewards.

6 In analyses of Questions 1 and 2, an F with k and ∞ degrees of freedom was
used for significance tests.

On Question 2, the postdelay measure asking which stimuli they had liked to
see most, children significantly preferred real stimuli to symbolic ones
(p < .05 for both stimulus pairs). In addition, children who were given the
efficacy viewing strategy showed a significant preference for irrelevant
real over irrelevant symbolic stimuli when asked which they “liked” to view
(p < .025), but not when asked which helped them wait best (p > .10).
Finally, consider Question 1, the postdelay question asking children which
stimuli had helped them wait the most. Here the answers again depended on
the predelay viewing instructions (as the previously noted significant
interaction indicated). Specifically, when children had been given the wish
strategy instructing them to view the stimuli they liked to see the most,
they consistently preferred relevant real over relevant symbolic stimuli
(p < .002), paralleling the results found in Experiments 1 and 2. But when
they had been instructed to view the stimuli that would help them wait most
(efficacy strategy), they did not report any significant preference for
relevant real over relevant symbolic stimuli, choosing each equally often.
Note also that on Question 1, children who had been given the predelay
efficacy strategy preferred relevant real to irrelevant real stimuli. Since
these children were waiting in an EID situation, their belief that attention
to relevant real rewards would help them wait more than attention to
irrelevant real stimuli seems to support Miller and Karniol’s (1976a, 1976b)
view that while attention to the relevant rewards impedes self-imposed
delay, it facilitates externally imposed delay. Before reaching that
conclusion, however, one must take into account some other considerations.
First, Miller and Karniol’s subjects were about 3 years older than ours.
Second, contradictory evidence comes from a fourth study that we conducted
with subjects of comparable age to those used by Miller
and Karniol, as reported next.


Experiment 4 


In this final study, we examined the stimu- 
lus preferences of older (elementary school) 
children. We concentrated on their prefer- 
ence for relevant versus irrelevant real stim- 
uli, since that is the most theoretically in- 
teresting choice, but continued to distinguish 
between EID and SID situations and between 
food and nonfood delay. We assumed that 
these older, more developmentally mature 
subjects would have a better awareness of 
the attentional strategies that would facili- 
tate delay and that this awareness would be 
reflected in their stimulus preferences. 


Method 


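A 2 (food vs. nonfood rewards) × 2 (SID vs. EID) factorial design was used,
with 12 children in each cell. Rewards and relevancy of stimuli again were
crossed completely in the design. The child’s choice in each condition was
between viewing the relevant real stimuli (the rewards in the delay
contingency) or the irrelevant real stimuli (real stimuli irrelevant to the
delay contingency).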


Subjects 


The data came from 48 children (24 girls and 24 
boys) who attended Grades 1, 2, or 3 in one of
two local elementary schools (Kirk and Valley
View Elementary Schools in San Jose, California).
One male experimenter conducted the predelay procedures;
an experimentally blind assistant administered
the postdelay questions about the delay-facilitating
characteristics of stimuli. A male and
two female adults served as assistants, for the same
proportion of children in each cell.


Procedure 

The same procedure was employed in Experiment 4
as in the other studies, with the following
modifications made because of the age of this subject
population. The rewards used in earlier studies
were changed to be more appropriate for older
children. Specifically, these children waited for
either two versus six regular-size cat’s-eye marbles
or for one half a cookie versus three cookies (half
a sugar cookie vs. a chocolate chip, an oatmeal, and


a sugar cookie, each 3 inches, or 7.6 cm, in diam- 
Table 5
Stimulus Preferences on Each Measure in Each Condition (Experiment 4)


Food Nonfood 
Question 1 Viewing Question 1 Viewing 
Condition (choice) time (choice) time 
SID > bag A2 Ci bd EK hpa 
EID Mabe. 43 Sed 169%" 
Total AB hond .42* Aiaia Bri as 


Note. SID = self-imposed delay; EID = externally imposed delay. The larger the proportions, the greater
the preference for relevant real versus irrelevant real stimuli. n = 24 for Question 1; n = 12 for viewing
time.
* p < .05.
** p < .025.
*** p < .002.


eter). To collect more data on stimulus preferences 
during the delay, the apparatus was altered slightly 
so that children could alternate stimulus sets, de- 
termining how long each set was presented. The 
child chose which stimulus set was viewed, but al- 
ways viewed one of the two stimulus sets. As in the 
previous studies, the children were assured that the 
experimenter would not know what they were do- 
ing during the delay, but no tall divider was used, 
and no hands-on-head instructions were given. As 
in Experiments 1 and 2, the children were not given 
specific viewing instructions. The delay period was
increased to 20 minutes, in view of the children’s 
age and to collect more extended data on stimulus 
viewing. 

Verbal stimulus preferences (Question 1) were 
obtained after the delay by having the assistant ask 
the child which stimulus would be best to view for 
food delay and, separately, for nonfood delay, re- 
gardless of whether the child had experienced a 
food or nonfood delay. For subjects in the SID 
condition, this question was phrased as follows: 


Suppose that you would get these (food or nonfood rewards) if you waited for
(experimenter’s name) to come back on his own. Also, suppose that you would
get (this) (these) if you pushed this (the termination) button and called
(the experimenter) back. While you're waiting, would looking at what’s under
this (assistant points to left) box lid help you wait? Or, while you're
waiting, would it help you wait more if you looked at what's under this
(experimenter points to right) box lid? Looking at which would help you wait
better?


In the EID condition, the question was rephrased
to describe an EID rather than SID situation. Half
of the children in each experimental cell were asked
the question first for food delay, and half were
asked the question first for nonfood delay.


Results


An analysis of variance for proportions of time spent viewing the
alternative stimuli found no significant main effect for SID versus EID
situations (F < 1) and no Delay Situation × Rewards interaction (F < 1).
There was, however, a significant main effect for food versus nonfood
rewards, F(1, 44) = 28.85, p < .001.

Verbal stimulus preferences also were not significantly affected by SID
versus EID situation (F < 1 for all children’s answers to the food delay
question; F < 1 for answers to nonfood delay questions). The data for
Question 1 in Table 5 show that the children said that they preferred
viewing irrelevant real stimuli significantly more than relevant real
stimuli during externally imposed, as well as during self-imposed, delay of
food gratification (ps < .025, [illegible] tests). When SID and EID
preferences were pooled, the preference for viewing irrelevant rather than
relevant real stimuli was even more significant (p < .001). Consistent with
these preferences, the children spent significantly more time actually
viewing irrelevant stimuli during delay of food rewards, t(44) = 2.11,
p < .05, two-tailed, SID and EID pooled.

In contrast, these subjects significantly preferred relevant real over
irrelevant real stimuli during delays of nonfood gratification, whether EID
(p < .025 for verbal preference; t(44) = 3.55, p < .002, for viewing
preference) or SID (p < .05 and t(44) = 4.30, p < .002). Possible reasons
for this effect are explored in the discussion.
Further analyses examined the role of age in the children’s viewing
preferences. We divided the children into three age groupings: (a) less than
or equal to 7 years (n = 9), (b) greater than 7 years but less than or equal
to 8 years (n = 17), and (c) greater than 8 years (n = 22). Chi-square tests
of the number of children in each age group who preferred to view irrelevant
or relevant rewards revealed that verbal stimulus preferences were not
affected systematically by age, χ²(2) = 1.99, p > .30, for food delay;
χ²(2) < 1 for nonfood delay. Moreover, the proportion of the delay spent
viewing relevant real stimuli was not significantly different across age
groupings. Specifically, in a three-way analysis of variance for viewing
proportions, the only significant effect or interaction was the main effect
for food versus nonfood delay, F(1, 36) = 25.96, p < .001; all other
Fs < 1.5. Correlations between age and viewing preferences (verbal and
actual) were all nonsignificant for food and nonfood delay, both for SID and
EID separately and pooled.
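
A chi-square test of the kind reported above (age group by stated preference, 2 degrees of freedom) can be sketched as follows. The cell counts are hypothetical and are NOT the study's data; only the group sizes (9, 17, and 22) come from the text. The function name `chi_square_independence` is likewise an illustrative choice, and the statistic is the ordinary Pearson chi-square, which may or may not match the exact procedure the authors used.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts (not from the article): columns are
# [preferred irrelevant, preferred relevant]; rows are the three age groups.
hypothetical_counts = [
    [6, 3],    # 7 years or younger (n = 9)
    [12, 5],   # older than 7, up to 8 years (n = 17)
    [16, 6],   # older than 8 years (n = 22)
]

statistic, df = chi_square_independence(hypothetical_counts)
print(f"chi-square({df}) = {statistic:.2f}")
```

With three age groups and two response categories the test has (3 - 1) × (2 - 1) = 2 degrees of freedom, which matches the χ²(2) notation in the text.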


General Discussion and Conclusions


Taken as a whole, the results of the present experiments lead to several
conclusions concerning the nature and consequences of children’s preferred
attentional strategies during delay of gratification. To summarize briefly,
Experiment 1 showed that in both self-imposed delay (SID) and externally
imposed delay (EID), preschool children prefer to view real stimuli rather
than symbolic (pictorial) representations of the rewards regardless of their
relevance to the delay contingency. No differences in preference patterns
were found for SID versus EID. The same pattern of preferences was
replicated in Experiment 2, with no significant effects of age across the
age categories sampled (ages 3.00 to 4.25 years, 4.50 to 5.75 years, and
6.00 to 7.00 years).

Experiment 3 allowed us to distinguish
clearly between children’s attentional prefer- 
ences during delay when they are oriented 
to attend to the stimuli that they want to 
view (wish strategy) as opposed to those 
stimuli that they believe would most help 
them wait (efficacy strategy). The results 
revealed that when following a wish strat- 
egy, children attended significantly more to 
the real rewards than to their more abstract 
(pictorial) representations, further replicat- 
ing the findings of Experiments 1 and 2. 
But when following an efficacy strategy, the 
children did not show any systematic prefer- 
ence for viewing relevant real as opposed to 
relevant symbolic stimuli: Presumably, they 
simply did not know which of these two 
types of stimuli would help more and chose 
quite randomly. 

The young subjects of the first three ex- 
periments attended as much to relevant real 
as to irrelevant real stimuli during delays 
and preferred the two equally, again indi- 
cating a lack of awareness of delay-facilitat- 
ing attentional strategies. This lack of dis- 
crimination was found consistently in all 
three experiments, and on all measures, with 
only one exception: Namely, on Question 1 
in Experiment 3, children who had been 
given the predelay efficacy strategy signifi- 
cantly preferred relevant real to irrelevant 
real stimuli. 

The data from the older children (Grades 
1, 2, 3) obtained in Experiment 4 were dra- 
matically different and suggest a change in 
delay strategies in the course of develop- 
ment, beginning around the age of 7 years. 
Specifically, when waiting for food, these 
older children systematically preferred view- 
ing nonfood stimuli and were aware that this 
strategy would help them delay gratification 
best. This was true in externally imposed as 
well as in self-imposed delay situations. Thus, 
the older children, in sharp contrast to the 
younger ones in Experiments 1, 2, and 3, 
preferred to view the irrelevant stimuli dur- 
ing food delay. This finding supports the 
view that in the course of development, the 
child becomes aware of effective delay strategies
and realizes that attention to the relevant
reward stimuli impedes effective delay
in EID as well as in SID situations. 

The fact that the older children also pre- 
ferred to view the irrelevant stimuli during 
externally imposed food delay raises impor- 
tant theoretical questions about the atten- 
tional strategies that actually facilitate or 
impede such EID. A great deal of direct evi- 
dence supports the conclusion that self-im- 
posed delay (SID) is impeded by attention 
to the relevant real rewards (e.g., Mischel, 
1974; Mischel et al., 1972; Moore et al.,
1976; Schack & Massari, 1973). In contrast, 
the conclusion of Miller and Karniol (1976a, 
1976b) that externally imposed delay is 
helped by attention to the rewards was based 
entirely on indirect evidence, such as indica- 
tions of differential frustration during reward 
absence versus reward presence in the course 
of EID, inferred from such data as subjective
time estimates. The present data therefore
raise the possibility that firm conclusions
about the facilitating effects of attention
to the rewards during EID may be
premature. If an attentional strategy that 
focuses on the rewards really facilitates EID, 
it seems unlikely that the relatively mature 
subjects sampled in Experiment 4 would 
systematically tend to reject it during food 
delay, while younger subjects were more 
likely to prefer it (in Experiments 1, 2, and 
3). Unfortunately, a direct test of the ef- 
ficacy of different attentional strategies on 
actual duration of EID is not possible. By 
definition, EID requires that the subject 
cannot control the length of actual waiting 
time, thus preventing such a direct test of 
the effects of any stimulus manipulations 
upon duration of externally imposed delay. 

So far we have discussed the preferred at- 
tentional strategy of the older children only 
during food delay. But recall that when wait- 
ing for nonfood rewards, the older children 
preferred to view the actual rewards in the 
delay contingency (i.e., the relevant real
stimuli), during both EID and SID. This 
preference might seem puzzling and contra- 
dictory. However, it becomes understandable 
when one examines more closely the children’s 
specific choice in this nonfood delay condition. Namely, in this condition
of the crossover design, food became the irrelevant stimulus. Thus, while
waiting for a nonfood object, the subject’s choice was either to view that
nonfood stimulus or a food treat not relevant to the delay contingency. It
seems likely that the older children realized that if they chose to view the
food during the delay, they would be frustrating themselves (by arousing
their desire for a consumable but unavailable object) more than they would
if they viewed the relevant nonfood object. That is, while waiting for
marbles, it may well be less frustrating to view those objects than to view
cookies. In future research, tests of preference for relevant versus
irrelevant stimuli probably should include both relevant and irrelevant food
stimuli or both relevant and irrelevant nonfood stimuli in order to avoid
conflicts between the two different types of categories.

Taken as a whole, the present series of studies suggests a clear shift in
preferred attentional strategies at the age of approximately 7 years.
Specifically, the first three experiments unequivocally show that preschool
children do not systematically prefer or use attentional strategies that
have been found to optimize self-imposed delay of gratification (SID).⁷
Indeed, the observed tendency of preschool children in Experiments 1, 2,
and 3 to prefer attending to real stimuli (rather than to pictorial or
abstract representations) may make it especially frustrative and difficult
for them to engage in sustained voluntary waiting for preferred but delayed
gratifications.

7 Follow-up studies using simpler choices to explore knowledge of delay
suggest that awareness of at least one basic delay rule (that it helps to
cover rather than expose the rewards while waiting for them) may begin to
emerge even earlier. In one study, most children knew that it would help to
cover rather than expose the rewards by the time they were 5 years old
(Mischel, Mischel, & [illegible], Note 1). Even simpler or more ingenious
methods might of course reveal important knowledge of self-control at still
younger ages.

Our finding that the younger children (Experiments 1, 2, and 3)
spontaneously preferred to view the real stimuli during delay of
gratification seems to be congruent with
Freud’s (1911/1959) classical theory of
wish-fulfilling ideation during delay: When the desired object is blocked,
the frustrated child tries to self-present it, to “have it.” But our
previous research suggests that in so doing, young children are making
self-imposed delay even more difficult for themselves by increasing their
desire and hence enhancing the frustrativeness of the delay (e.g., Mischel,
1974), especially if this ideation is consummatory rather than more abstract
(Mischel & Baker, 1975; Moore et al., 1976). The youngsters are then trapped
in a delay-defeating cycle, attending to the consummatory qualities of what
they really want and becoming increasingly frustratively aroused, thereby
making it even harder to wait successfully. These interpretations are
supported by the fact that, at least in the SID paradigm, it has been
demonstrated repeatedly that attention to the real stimuli increases the
frustrativeness of the delay and reduces the length of voluntary waiting
(Mischel, 1974; Schack & Massari, 1973;
Toner & Smith, 1977; also see Yates, 1977).
The young children’s preferences for real stimuli over abstract or symbolic
ones in the delay situation, and their inability to discriminate effective
and ineffective delay strategies, probably reflect their
cognitive-developmental immaturity. With greater cognitive development, the
child comes to recognize and prefer attentional strategies that reduce
frustrative arousal (Experiment 4). As children increase their ability to
deal with stimuli more abstractly, they can transform them in
delay-facilitating ways (Mischel, 1974; Moore et al., 1976). Specifically,
we suggest that the older child can focus more on the abstract rather than
consummatory qualities of the incentives, thereby avoiding excessive
frustration while remaining oriented to, and guided by, preferred but
delayed gratifications.


Reference Note 
Strategy Signals in Face-to-Face Interaction


Starkey Duncan, Jr., Lawrence J. Brunner, and Donald W. Fiske
University of Chicago


The notion of “strategy signal” is proposed as an element in the description of
face-to-face interaction. Strategy signals are actions by a participant that have
a distinct and consistent effect on specified subsequent actions by the partner,
but that have been rejected as signals within the organization of interaction.
Data are presented on three hypothesized strategy signals in eight two-person
conversations. Both the speaker’s head direction toward the auditor and the
number of turn cues have the effect of increasing the auditor's attempts to take
the turn under certain circumstances. Speaker smiling has the effect of increasing
auditor smiles. The role of strategy signals within the broader description
of face-to-face interaction is discussed.


We are proposing a new component of face-to-face interaction, the strategy
signal: one or more actions by a participant that have a distinct and
consistent effect on a specified subsequent action by the partner. Because
the definition of a strategy signal requires consideration of organization
(or structure) and strategy in interaction, as well as a certain approach to
research method, we turn first to these matters.

We view the organization of interaction as analogous to that of language in
terms of [illegible]. A hypothesized organization for interaction includes
elements such as signals, composed variously of actions in language,
paralanguage, and body motion, and rules for appropriately using and
responding to these signals. The organization as hypothesized provides a
structure for the conduct of the interaction. A program of research in this
laboratory
(Brunner, in press; Duncan, 1972, 1974;
Duncan & Fiske, 1977; Duncan & Niederehe, 
1974) has attempted to describe some aspects 
of the organization of certain two-person 
conversations between adults. Because the 
research is concerned with the organization 
of interaction (as opposed to individual mes- 
sages), analyses are based on sequences of 
action involving both participants. 

A complete description of face-to-face in- 
teraction, however, would include considera- 
tion of interaction strategy in addition to 
organization (Duncan & Fiske, 1977). The 
notion of interaction strategy is a basically 
simple one. A close analogy is found in games 
such as chess or baseball. If the organization 
of face-to-face interaction is analogous to 
the rules of a game, then interaction strategy 
is analogous to ways of playing the game. 
For example, one rule in tennis is that the 
player must return the ball into the oppo- 
nent’s court, this rule leaving to the player’s 
strategy the decision where to place the ball 
within that court. And within the rules for 
baseball, a batter may choose to swing or 
not at a given pitch. In this sense, our use 
of the term, though confined to describing 
face-to-face interaction, is congruent with 
one of its major meanings in common usage. 
Of course, the notion of strategy is com- 
mon in social science. 
In general, interaction strategy arises whenever
there are options attaching to the rule-based
organization of interaction. We may
think of two broad types of option. One type 
has to do with legitimate alternatives within 
the rules. In the organization hypothesized 
for aspects of some two-person conversa- 
tions (Duncan & Fiske, 1977), an example of 
a legitimate option would be the auditor’s 
choice to take the turn or not upon activation 
of the turn signal by the speaker. Each such 
choice would constitute an element of the 
auditor’s strategy in the interaction. 

Not all rules for interaction involve legiti- 
mate alternatives. For example, it would 
seem that if someone offers to shake hands 
with another, the recipient of that offer does 
not have a legitimate option to refuse, ex- 
cept under highly specialized circumstances 
(such as demonstrably dirty hands), for 
which apologies are given. In such cases 
where the rules require the taking or avoid- 
ing of certain actions, there remains the op- 
tion simply to violate the rules. The refusal 
to shake hands without extenuating circum- 
stances is presumably a rule violation; simi- 
larly, an interruption is a violation of the 
turn-signal rules. Thus, rule violation—or its 
absence—is another aspect of interaction 
strategy. 

One component of strategy—rule violation 
—is always present in interaction, whereas 
the other—legitimate alternatives—is present 
only when it is provided for by the rules. 
Of course, rules providing for legitimate 
alternatives may be prevalent in many inter- 
actions. In any event, because of the presence 
of optionality, engaging in rule-based interaction
requires a participant to choose a
course of action continuously with respect to
rule violation, and also at each point at
which there are legitimate alternatives. 


Organization, Strategy, and Action Sequences 


To this point, distinctions have been made 
(a) between organization and strategy in 
interaction and (b) between rules that pro- 
vide for legitimate alternatives and those 
that do not. These distinctions may be di- 
rectly related to research on action sequences. 
Consider a sequence A-B, in which A is an action by one participant, and B
is a subsequent action by the partner. Let us assume that a sufficient
number of A’s and B's have been observed in one or more face-to-face
interactions and that an A is sometimes, but not always, followed by a B. In
this case, if a relationship between A and B exists within the organization
of interaction, then B and not-B would have to be considered legitimate
alternatives upon the occurrence of A. An example would be the hypothesized
turn signal, which was sometimes, but not always, followed by an auditor’s
attempt to take the turn.

Given these data, there are two types of questions we may ask of the
relationship between A and B. First, given an occurrence of A, what is the
probability that it is followed by B? We shall call this the probability of
a consequent, certainly the most common perspective in studies of action
sequences. But of course we may also ask the reverse: Given an occurrence of
B, what is the probability that it is preceded by A? We shall call this the
probability of an antecedent.

The distinction between the probability of a consequent and the probability
of an antecedent provides us with a framework for implementing the
distinction between organization and strategy. When rules providing for
legitimate alternatives are at issue (as they are in this example), evidence
relating to the organization of interaction is presented in terms of
probabilities of antecedents. That is, to support a hypothesized rule for
the action sequence A-B in which B and not-B are considered legitimate
alternatives, the data would optimally show that each B was preceded by A;
that is, the probability of that antecedent is 1.00. As with the turn
signal, the hypothesized rule would state that B may appropriately occur
only after A, but need not so occur. The partner apparently exercises the
option whether to do B upon the occurrence of the participant’s A.

In contrast, evidence relating to interaction strategy is presented in
terms of probabilities of consequents. That is, given the choice to do
B or not-B after the occurrence of A, some partners may do B more than
others. For example, upon activation of the speaker turn signal, some
auditors may attempt to take the turn (leading to smooth exchanges)
more frequently than others. Thus, the probability of the consequent
(turn attempt) for one auditor may be .10, whereas for another auditor
it may be .40, even though each attempt by either auditor is preceded
by a turn signal (probability of antecedent = 1.00).

This example illustrates how there may be strong individual
differences in the probability of a consequent in action sequences
when the probability of an antecedent is essentially uniform across
participants. When results of this sort are obtained, we would
interpret the uniform antecedent probabilities as reflecting the
organization of the observed interactions; the variations in
consequent probabilities would be interpreted as reflecting
participants’ differences in interaction strategy within that
organization.

Given the hypothesized organization for observed interactions, it
seems natural that studies of interaction strategy would investigate
whether some participants tend more than others to exercise certain
legitimate alternatives (such as acting to take the turn in the
presence of a turn signal) or tend to violate certain rules (such as
interrupting).

When a rule expresses an obligatory relationship between A and B, not
providing for legitimate alternatives, we would expect B to follow A
in virtually 100% of the occurrences of A (with possible allowances
for rule violation and for noise in the data). In this case, the
probability of a consequent and the probability of an antecedent would
both be close to 1.00. The ensuing discussion will be concerned only
with action sequences involving legitimate alternatives. Obligatory
sequences will not be further considered.
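
The two probabilities defined in this section can be illustrated with a minimal Python sketch. The record format (auditor, signal active, turn attempt) and the counts are hypothetical, chosen only to reproduce the .10 and .40 figures of the example above; they are not data from the observed conversations.

```python
from collections import defaultdict

# Hypothetical records, one per unit of analysis:
# (auditor_id, signal_active, turn_attempt). Invented for illustration.
units = (
    [("aud1", True, True)] * 1 + [("aud1", True, False)] * 9
    + [("aud1", False, False)] * 10
    + [("aud2", True, True)] * 4 + [("aud2", True, False)] * 6
    + [("aud2", False, False)] * 10
)


def sequence_probabilities(records):
    """Return {auditor: (P(B | A), P(A | B))}, with A = signal, B = attempt."""
    counts = defaultdict(lambda: {"A": 0, "B": 0, "AB": 0})
    for auditor, a, b in records:
        c = counts[auditor]
        c["A"] += int(a)
        c["B"] += int(b)
        c["AB"] += int(a and b)
    result = {}
    for auditor, c in counts.items():
        # Probability of a consequent: share of A's followed by B.
        p_consequent = c["AB"] / c["A"] if c["A"] else float("nan")
        # Probability of an antecedent: share of B's preceded by A.
        p_antecedent = c["AB"] / c["B"] if c["B"] else float("nan")
        result[auditor] = (p_consequent, p_antecedent)
    return result


for auditor, (p_cons, p_ante) in sequence_probabilities(units).items():
    print(f"{auditor}: P(consequent) = {p_cons:.2f}, P(antecedent) = {p_ante:.2f}")
```

Under these assumed counts the two auditors differ in the probability of a consequent while sharing an antecedent probability of 1.00, which is the pattern the text interprets as uniform organization with individual differences in strategy.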


Strategy Effect of Organization Signals 


Consider the case in which optimal results are obtained for an
organization relationship between two actions (all B’s are preceded
by A’s), and the relationship involves legitimate alternatives (not
all A’s are followed 
by B’s). In this case, a certain type of strat- 
egy effect will be found: The occurrence of 
an A will raise the probability of a B from 
zero to some higher number. For example, if 
virtually all auditor attempts to take the 
speaking turn occur when the speaker turn 
signal is displayed, then we shall also find 
for every auditor that the speaker's display- 
ing the turn signal increases the probability 
that the auditor will attempt to take the 
turn, when compared to those points at 
which there is no turn signal. 

Of course, even when this sort of built-in 
strategy effect is found, there may be wide 
individual differences between participants in 
the probability of a B following an A. That 
is, upon display of a turn signal, some audi- 
tors may attempt to take the turn more than 
others. Nevertheless, all auditors will take 
the turn more when the turn signal is dis- 
played than when it is not. 

Thus, when the legitimate alternatives of 
doing B or not-B are involved, a specific type 
of strategy effect may be expected for all 
organization signals: The probability of B 
will be increased following the signal. But it 
should be noted that this same effect may be 
observed for other actions that regularly 
precede B, but that, for any reason, are not 
included in the hypothesized organization 
signal. How might such actions be excluded 
from an organization signal? 


Further Criteria for Organization 


Evidence for a hypothesized organization 
signal may not rest entirely on the associated 
probability of an antecedent. Other criteria 
may be invoked, depending on the nature of 
the hypothesized relationship and on the 
results of analyses. In some cases, a criterion 
will require that an action, A, precede all 
B’s, but not precede instances of some other 
event, C. For example, the turn signal was 
hypothesized to precede smooth exchanges 
of the turn, but not instances of simultaneous 
turns in the conversation. That is, it was hy- 
pothesized that auditor attempts to take the 
turn when the turn signal is active would re- 
sult in smooth exchanges of the turn and 
that auditor attempts when the turn signal 
is not active would result in instances of 
simultaneous turns. Similarly, the speaker- 
state signal (Duncan & Fiske, 1977; Dun- 
can & Niederehe, 1974) was hypothesized 
to occur at the beginnings of turn attempts 
by the auditor, but not at the beginnings of 
auditor back channels (Yngve, 1970), such 
as head nods and shakes and vocalizations 
such as “yeah,” and “m-hm.” The speaker- 
state signal was hypothesized to differentiate 
turn beginnings from auditor back channels. 
Of course, this sort of differentiation will 
not be a criterion for every signal, but it is 
useful when available. 

Parsimony may also be used as a criterion. 
For example, an action, X, may not be in- 
corporated as part of a hypothesized or- 
ganization signal, O, if O already precedes 
virtually all instances of B, and the addi- 
tion of X would serve merely to increase the 
number of displays of O. When an organiza- 
tion signal is hypothesized to have multiple 
cues, results may permit parsimony in the 
rules defining signal display. For example, 
the turn signal was hypothesized to have six 
distinct cues (described in more detail in the 
Results section). Strong results were ob- 
tained when a signal display was defined as 
a display of at least one cue, as opposed to 
more complex displays involving some spe- 
cified combination of cues or some larger 
number of cues. 

Thus, hypotheses on the organization of 
interaction will typically include criteria in 
addition to the probability of an antecedent. 
For this reason, an action that regularly pre- 
cedes some subsequent Action B may not 
become part of the organization signal for B. 
It immediately follows that, because of these 
additional criteria, some actions may show 
the strategy effect described above for or- 
ganization signals but not be part of any 
hypothesized signal, having been rejected on 
the basis of these additional criteria. 


Strategy Signals 


We are now in a position to consider the 
subject of this article: strategy signals. An 
action, S, will be considered a strategy signal 
in some observed interaction when it does not meet criteria set for
the relevant organization signal, but it does show a distinctive
relation to the probability of a subsequent action B. It also seems
reasonable to expect that it show the high degree of consistency over
participants in the interactions that is required for organization
signals.

Evidence will be presented on three hypothesized strategy signals in
the interactions we have observed. Although preliminary analyses
suggest that additional strategy signals may be hypothesized in these
interactions, space limitations prohibit their full description. In
any event, such a listing is unnecessary for the purpose of
introducing the notion and a general research approach.


Method 


Analyses presented below were based on data drawn from videotapes of
eight two-person conversations between adults. All participants were
aware of the video camera and had given permission for the taping.
Conversations are identified by the same numbering system used in
Duncan and Fiske (1977). Interaction 1 was an intake interview between
a male therapist about 40 years old and a female client in her early
[...]. Interaction 2, an informal consultation about a [...], was
between the same therapist and another therapist, also about 40 years
old; the two were friends. Interaction 3 was between two male college
students, recruited as they emerged from another psychological
experiment; they were previously unacquainted and were paid $2 for
their participation. Interaction 4 was between two female graduate
students who worked as research assistants in the same office. They
were continuing their discussion of shopping. Interactions 5-8 were
between unacquainted professional-school students taking part in a
larger study of interaction. These participants were instructed to
have a conversation, to get acquainted, or to talk about anything that
interested them. Each of these participants was paid $3. These four
interactions were between a male and a female. Interactions 1 and 2
each lasted up to 40 minutes, of which the first 19 minutes were
transcribed and analyzed. All of the videotaped interaction was
transcribed and analyzed in the remaining interactions. Interactions 3
and 4 were about 5 minutes long, and Interactions 5-8 were about 7.5
minutes long. Thus, about 78 minutes of interaction were analyzed.

Individual participants in each interaction are identified in terms of
their interaction number (1-8) and their sex (M, F). When the two
participants are of the same sex, “a” and “b” will be appended to the
identifications. Thus, the participants in Interaction 1 are 1F and
1M; the participants in Interaction 2 are 2Ma and 2Mb.

For purposes of analysis, each interaction was subdivided into a
series of units, defined in terms of actions of the speaker. Briefly,
a boundary of a unit of analysis was drawn at the onset of the first
[...] following the end of a phonemic clause (Trager & Smith, 1957),
during or immediately after which at least one of the following
actions occurred: (a) a deviation in the intonation from a level
pattern (22| in the Trager-Smith notation); (b) the utterance of a
sociocentric sequence (Bernstein, 1962) such as “you know”; (c) the
completion of a grammatical clause, involving a subject-predicate
combination; (d) paralinguistic drawl (Trager, 1958) on the final
syllable or stressed syllable of a phonemic clause; (e) termination of
a hand gesticulation or the relaxation of a tensed hand position
(e.g., a fist); (f) a decrease in paralinguistic pitch and/or
loudness, either across an entire phonemic clause or during its final
syllable or syllables; (g) an audible inhalation by the speaker; (h)
an unfilled pause (Maclay & Osgood, 1959); (i) a false start (Maclay &
Osgood, 1959); (j) turning of the speaker’s head direction toward the
partner; (k) a relaxation of the foot or feet from a position of
dorsal flexion. Counts of occurrences and co-occurrences of actions in
the transcription were based on these units of analysis.

The conversations analyzed for this article had been transcribed
previously at several different times. No interjudge agreement was
tested for these particular transcriptions because earlier analyses
had yielded highly [...] results. For example, 96% agreement
(kappa = .80) was found for transcribing [...] on a
syllable-by-syllable basis (Duncan & Fiske, 1977, p. 342; more
detailed descriptions of the conversations and procedures are given in
that volume).


Results 
Head Direction 


The speaker turn signal is defined as the display of at least one of
six cues: the first five cues listed above in the drawing of units of
analysis, plus one additional cue, a decrease in paralinguistic
loudness or pitch in conjunction with a sociocentric sequence. In the
observed conversations, the turn signal was said to operate as
follows: If the auditor acted to take the turn when the signal was
active, the speaker was obliged to yield the turn. However, the
auditor was not obliged to take the turn when the signal was active;
taking the turn in the presence of the signal is considered to be an
option that the auditor may 
exercise. 

In addition, a speaker gesticulation signal 
was hypothesized. This signal was composed 
of a single cue: one or both of the speaker’s 
hands being engaged in gesticulation or in a 
tensed hand position. The gesticulation sig- 
nal serves to inhibit any turn signal concur- 
rently being displayed. That is, when the 
gesticulation signal is displayed concurrently 
with the turn signal, the auditor’s attempts 
to take the turn are strikingly reduced. 

Two criteria were used for evaluating the 
effectiveness of a possible turn signal: (a) 
The signal must precede a maximum number 
of smooth exchanges of the speaking turn 
in the observed conversations; (b) the sig- 
nal must differentiate smooth exchanges of 
the turn from instances of simultaneous 
claimings of the turn by the two participants. 
Ideally, the signal would precede all instances 
of smooth turn exchanges and no instances 
of simultaneous turns. Notice that the sig- 
nal is defined as the display of any one of 
its constituent cues. Therefore, each of these 
cues when occurring alone is evaluated by 
these two criteria. 

Speaker’s head direction was tested as a 
possible turn-signal cue. The cue was con- 
sidered to be activated when the speaker’s 
head turned toward the auditor. The po- 
tential head-direction cue was considered to 
be active from the moment of the actual 
head turning until the third boundary of a 
unit of analysis following the head turn (pro- 
vided the head was still toward the auditor). 
Head direction was used in this case because 
it was not always possible to describe on our 
videotapes the direction of the speaker’s 
gaze. However, head direction is taken as a 
surrogate for gaze direction; results for head 
direction are understood as most likely re- 
flecting the effect of gaze. 

Speaker’s head direction was rejected as 
a possible turn cue for two reasons. First, it 
was not necessary. Virtually all smooth ex- 
changes (251 out of 254; 98.8%) were pre- 
ceded by at least one of the six turn cues 
described above. Second, inclusion of head 
direction would have substantially impaired 
the ability of the turn signal to differentiate 
Table 1 
Auditor Attempts to Take the Turn in Response to Speaker Turn-Signal
Displays With and Without Concurrent Speaker Head-Direction Cue 

              No head-direction cue    Head-direction cue present
              ---------------------    --------------------------
                       Proportion               Proportion          Ratio of
Participant    n^a     of attempts      n^a     of attempts         proportions
1F              60        .116           75        .360                3.10
1M              13        .000           87        .367                 -
2Ma             58        .034           54        .148                4.35
2Mb             28        .035           41        .219                6.25
3Ma             11        .000           50        .240                 -
3Mb             34        .058           51        .117                2.01
4Fa             11        .090           36        .222                2.46
4Fb             10        .100           30        .333                3.33
5F              24        .041           49        .265                6.46
5M              18        .055           55        .272                4.94
6F              17        .235           36        .527                2.24
6M              37        .108           71        .281                2.60
7F              62        .096           64        .265                2.76
7M              21        .095           52        .365                3.84
8F               6        .000           71        .281                 -
8M              27        .111           49        .346                3.11
Total          437        .080          871        .289                3.61

Note. F = female; M = male; a and b = first and second members of a
same-sex dyad. 
^a Number of units having a speaker turn-signal display, in the
absence of a speaker gesticulation signal and not followed by an
auditor back channel; this number is the denominator for the
proportion (e.g., for the first entry, first column, 7/60 = .116). 


smooth exchanges from simultaneous turns. 
For example, in the eight conversations, 
there were 26 occurrences of simultaneous 
turns in which neither turn signal nor gestic- 
ulation signal was active. Nineteen of these 
occurrences (73%) were preceded by the 
potential head-direction cue. Thus, in the 
absence of the turn signal, the head-direction 
cue was associated with a large proportion 
of simultaneous turns in these data. When 
it occurred alone, the cue functioned quite 
poorly in differentiating smooth exchanges 
from simultaneous turns. 
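
The two evaluation criteria can be restated as simple proportions. The following Python sketch uses only the counts reported in the text (251 of 254 smooth exchanges preceded by a turn cue; 19 of 26 signal-free simultaneous turns preceded by the head-direction cue). The function and variable names are illustrative assumptions, not terms from the original analysis.

```python
def proportion(preceded, total):
    """Share of events of one class that were preceded by the candidate cue."""
    return preceded / total


# Criterion (a): smooth exchanges preceded by at least one of the six turn cues.
turn_signal_coverage = proportion(251, 254)

# Criterion (b), applied to head direction alone: simultaneous turns
# (no turn signal or gesticulation signal active) preceded by the cue.
head_direction_before_simultaneous = proportion(19, 26)

print(f"{turn_signal_coverage:.1%}")                 # 98.8%
print(f"{head_direction_before_simultaneous:.1%}")   # 73.1%
```

A cue that satisfied criterion (b) would show a proportion near zero on the second line; the high value is what the text treats as poor differentiation.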

However, it is possible that the speaker’s 
head direction has an effect on the auditor’s 
subsequent response. Specifically, when the 
turn signal was active in the observed con- 
versations, the auditor was more likely to 
attempt to take the turn when the speaker’s 
head was toward the auditor than when it 
was not. (In these analyses, head direction 
was considered active for its full duration.) 
Table 1 shows data on the instances in which 


the turn signal was active and the gesticulation signal was not
active. (The indicated participant is the speaker.) Rate of auditor
attempts to take the turn when the head-direction cue was present is
contrasted with that rate when the head-direction cue was not present.
In this situation, 254 smooth exchanges and 33 simultaneous turns
occurred. Each auditor showed a higher proportion of turn attempts
when the speaker head-direction cue was present. Summing across all
participants, the auditor was about 3.6 times more likely to attempt
to take the turn when the head-direction cue was present. In three
cases, all of the auditor’s attempts occurred when the head-direction
cue was present. Although there was variation among the participants
both in overall rate of turn attempts and in the degree to which
attempts were increased by the head-direction cue, the basic
strategy-signal effect was present for each participant. (With all 16
participants showing an effect in the same direction, p < .0001 in a
two-tailed sign test.)
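
The sign-test figure can be checked with an exact binomial calculation. The sketch below is one standard way to compute a two-tailed sign test and is not claimed to be the procedure the authors used; the function name is an assumption.

```python
from math import comb


def sign_test_two_tailed(n_positive, n_total):
    """Exact two-tailed sign test p value under H0: P(positive) = 0.5."""
    # Use the more extreme of the two directions, then double the tail.
    k = max(n_positive, n_total - n_positive)
    tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return min(1.0, 2 * tail)


# All 16 participants show the effect in the same direction.
p = sign_test_two_tailed(16, 16)
print(f"p = {p:.6f}")  # p = 0.000031
```

With 16 of 16 differences in one direction, the exact value is 2/65,536, which is below the .0001 threshold cited.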


Speaker Smiling 


Data on smiling were derived from the four conversations (2, 5, 6, 7)
in which there was an appreciable number of smiles. In a recent study
of these four conversations, Brunner (in press) found evidence that
the beginnings of auditor smiles were distributed very much like
auditor back channels (e.g., head nods and shakes and vocalizations
such as “m-hm” and “yeah”). Like auditor back channels, auditor smile
beginnings located at or just after the boundaries of units of
analysis were systematically preceded by the speaker within-turn
signal. Briefly, the latter signal is defined as the occurrence of at
least one of two cues: completion of a grammatical clause by the
speaker and the speaker’s head facing in the auditor’s direction. In
Duncan and Fiske (1977) this head-direction cue was considered to be
deactivated three units of analysis after the speaker had turned his
or her head toward the auditor. On the basis of Brunner’s (in press)
analyses of the original back-channel data, the head-direction cue was
redefined as being present as long as the speaker’s head was facing in
the direction of the auditor. This new definition was also used for
analyses of auditor 
smile beginnings.

[...] that within the organization of the observed conversations,
speaker smiles [...] for auditor smile [...], the existing speaker
within-turn signal was quite adequate in accounting for auditor smile
beginnings: 39 of the [...] beginnings were preceded by [...] was
preceded by speaker smiling [...]. There seemed little reason [...] on
the basis of this [...]. Including speaker smiling with combinations
of the two within-turn cues also did not improve the statistical
relationship between within-turn signal and subsequent auditor smile
beginnings. Thus, it was concluded that speaker smiling was
unnecessary to account for auditor smile 
beginnings in the observed conversations, and 
that speaker smiling did not add to the rela- 
tionship between speaker signal and auditor 
smiling. Brunner’s study led to the conclu- 
sions that auditor smiling was one form of 
auditor back channel and that the speaker 
within-turn signal marked the appropriate 
points for auditor smile beginnings located 
at or just after the boundaries of units of 
analysis, in the same manner that the signal 
functioned for the other, similarly placed 
auditor back channels previously studied. 

The following hypothesis is advanced for 
speaker smiling as a strategy signal: When 
speaker smiling is present in addition to the 
speaker within-turn signal, the auditor is 
more likely to smile than when the speaker 
within-turn signal alone is present. That is, 
speaker smiling will increase the probability 
of a subsequent auditor smile beginning in 
response to the within-turn signal. Of course, 
auditor smiling is not mutually exclusive of 
other auditor back channels. The auditor 
might smile in conjunction with a head nod 
or “m-hm.” The hypothesis merely states 
that there is an increased probability that 
smiling will be part of the response. 

For each participant in the four conversa- 
tions studied, Table 2 shows the percentage 
of displays followed by an auditor smile be- 
ginning (a) for within-turn-signal displays 
without smiling and (b) for within-turn-sig- 
nal displays with concurrent speaker smiling. 
(In the table, the indicated participant is the 
speaker.) For each participant, the probabil- 
ity of auditor smiling is increased when 
speaker smiling is present (p < .01, two-tailed sign test). 


Number of Turn Cues Displayed 


Duncan and Fiske (1977) reported a posi- 
tive, linear relationship between number of 
turn cues displayed conjointly by a speaker 
and the probability of a smooth exchange of 
the turn following the display. (This rela- 
tionship was found only for turn-signal dis- 
plays not accompanied by a gesticulation 
signal.) Two correlations between number of 
turn cues displayed and probability of smooth 
exchange were calculated, one pooling data 
Table 2 
Auditor Smiling in Response to Speaker Within-Turn-Signal Displays
With and Without Concurrent Speaker Smiling 

              No speaker smiling       Speaker smiling
              ---------------------    ---------------------
                       Proportion               Proportion
                       of auditor               of auditor     Ratio of
Participant    n^a     smiling          n^a     smiling        proportions
2Ma            102       .009             1       1.000           -
2Mb            273       .014            11        .090          6.42
5F              38       .053             8        .375          7.07
5M              34       .000             5        .600           -
6F              11       .090            20        .150          1.66
6M              16       .062            16        .250          4.03
7F              28       .178            19        .263          1.47
7M              19       .105             5        .400          3.80
Total          521       .030            85        .258          8.60

Note. M = male; F = female; a and b = first and second members of a
same-sex dyad. 
^a Number of units having speaker within-turn-signal display and not
followed by an auditor attempt to take the turn; denominator for
proportion. 
^b Insufficient data. 


from Conversations 1 and 2 and the other pooling data from
Conversations 3 through 8. The correlations were .99 and .98,
respectively. These pooled data suggest that the probability of a
smooth exchange increases as the number of turn cues displayed
increases. This relationship stems from interaction strategy, as it is
based on the probability of a consequent: the probability of a smooth
exchange following a cue display.


Table 3 
Rate of Smooth Turn Exchanges as a Function of Number of Active Turn
Cues (Gesticulation Signal Absent) 

              1 or 2 active turn cues    3 to 6 active turn cues
              -----------------------    -----------------------
                       Proportion                 Proportion
                       of smooth                  of smooth        Ratio of
Participant     n      exchanges          n       exchanges        proportions
1F              75       .160             58        .344             2.15
1M              70       .200             27        .555             2.77
2Ma             80       .050             31        .161             3.22
2Mb             40       .075             29        .241             3.21
3Ma             46       .152             14        .285             1.87
3Mb             68       .073             15        .066              .90
4Fa             30       .100             16        .312             3.12
4Fb             27       .222             11        .272             1.22
5F              61       .163             11        .272             1.66
5M              54       .111             14        .357             3.21
6F              32       .312             20        .600             1.92
6M              77       .220             30        .200              .90
7F              99       .161             25        .200             1.24
7M              62       .225              7        .428             1.90
8F              60       .150             13        .538             3.58
8M              64       .203              9        .444             2.18
Total          945       .157            330        .318             2.02

Note. F = female; M = male; a and b = first and second members of a
same-sex dyad. 
Previous analyses had led to the rejection of number of turn cues
displayed as part of the definition of the turn signal within the
organization of the observed conversations. Because the display of any
single cue was found to differentiate between smooth exchanges and
simultaneous talking, it was considered to constitute a display of the
signal as a whole. In addition, it will be apparent that the reported
correlations and the results below are not simply a product of the
strategy effect of the turn signal. That strategy effect would be to
raise the probability of a smooth exchange from near zero following no
turn signal to some higher number following a signal display.

Table 3 shows data for each participant in the eight conversations
studied. Data for individual participants are insufficient to support
analysis of each of the six data points (one through six cues).
Therefore, two data points were calculated for each participant: the
probability of a smooth exchange following (a) the display of one or
two cues and (b) the display of three, four, five, or six cues. The
data were divided between two and three cues because this division
would permit sufficient observations in the two categories for each of
the participants.

The table shows that for 14 of the 16 participants, the probability of
a smooth exchange following the display of one or two cues was lower
than the probability of a smooth exchange following the display of
three, four, five, or six cues. This result is significant at the .004
level (two-tailed test). [...] the presence or absence of the
gesticulation signal [...] instances in which the gesticulation signal
is present, the same results are obtained: 14 of the 16 participants
showed the strategy-signal effect.
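
The count of 14 of 16 participants can be checked directly against the proportions in Table 3. The Python sketch below transcribes those proportions (digits damaged in the scanned table are restored here by assumption) and counts the participants whose rate of smooth exchanges is higher after three to six cues than after one or two. The variable names are illustrative.

```python
# (participant, proportion after 1-2 cues, proportion after 3-6 cues),
# transcribed from Table 3; damaged digits restored by assumption.
table3 = [
    ("1F", .160, .344), ("1M", .200, .555), ("2Ma", .050, .161),
    ("2Mb", .075, .241), ("3Ma", .152, .285), ("3Mb", .073, .066),
    ("4Fa", .100, .312), ("4Fb", .222, .272), ("5F", .163, .272),
    ("5M", .111, .357), ("6F", .312, .600), ("6M", .220, .200),
    ("7F", .161, .200), ("7M", .225, .428), ("8F", .150, .538),
    ("8M", .203, .444),
]

# Participants showing the strategy-signal effect (higher rate with more cues).
effect = [who for who, low, high in table3 if high > low]
exceptions = [who for who, low, high in table3 if high <= low]

print(len(effect), "of", len(table3), "show the effect; exceptions:", exceptions)
# 14 of 16 show the effect; exceptions: ['3Mb', '6M']
```

The two exceptions are the participants whose ratio of proportions in Table 3 falls below 1.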


Discussion 


The purposes of this article are to propose the strategy signal as a
phenomenon of face-to-face interaction, to suggest an approach to
providing evidence supporting hypotheses of such phenomena, and to
present evidence on three hypothesized strategy signals operating in
some transcribed 
interactions. 

We have suggested four criteria for hypoth- 
esizing an action to be a strategy signal: (a) 
that the strategy-signal action not meet cri- 
teria for hypothesizing it as an element in 
the relevant part of the organization of the 
interaction(s) studied; (b) that the action 
have an effect on the probability of occur- 
rence of a subsequent action; (c) that this 
effect be highly consistent across the par- 
ticipants for which it is hypothesized; and 
(d) that the effect apply to action sequences 
involving legitimate alternatives, that is, that the 
signal not be invariably followed by the sub- 
sequent action in question. By “effect” we 
mean a shift in the probability that the sub- 
sequent action will occur. We do not specify 
the exact nature or direction of this shift. 
Thus, it need not be of the kind described 
in the Results section. 


Strategy Signals and Organization Signals 


The strategy-signal criteria rely on the 
parallel distinctions between organization and 
strategy in interaction and between the prob- 
ability of an antecedent and the probability 
of a consequent in analyses of action se- 
quences. In general, we have identified the 
probability of an antecedent (the probability 
that A will precede B) with information on 
the organization or structure of interaction. 
This is because organization has to do with 
regularities in the placement of actions in 
interaction, given that they occur. More 
broadly, organization has to do with the ap- 
propriate placement of actions in the stream 
of interaction. 

We have identified the probability of a 
consequent (the probability of B following 
A) with information on the strategy used in 
the interaction. This is because strategy is 
concerned with patterns of influence (or 
patterns of reaction) in interaction: one or 
more actions changing the probability that a 
subsequent action will occur. 

Of course, hypothesizing A as an organiza- 
tion signal for B will almost certainly involve 
criteria in addition to the finding that A 
consistently precedes B. Investigators will 
attempt to formulate the simplest and most 
effective definition of the hypothesized signal 
A. And in some cases, the ability of the 
signal to differentiate different classes of 
action will be used as a criterion. An ex- 
ample would be the differentiating of smooth 
exchanges from simultaneous turns by the 
turn signal. 

Of the four criteria proposed for hypothe- 
sizing a strategy signal, the last three seem 
relatively straightforward, whereas the first 
—that the action be excluded from the rele- 
vant part of the interaction organization— 
may bear further discussion. Strategy signals 
are seen as supplementing organization signals 
in describing interactions. If an action 
is an organization signal, we gain nothing by 
also labeling it as a strategy signal. Hence, 
a careful formulation of organization should 
precede the investigation of strategy signals. 
Furthermore, each organization signal has 
its own strategy effect. In the optimal case, 
when a given organization signal is not dis- 
played, the probability of the consequent 
action associated with it is zero; but when 
the organization signal is displayed, the prob- 
ability is some value above zero. This strat- 
egy effect of the organization signal stems 
from the fact that in the optimal case, the 
signal precedes all instances of the conse- 
quent action associated with it. That is, 
given the consequent action, the probability 
of the antecedent signal is 1.00. If an in- 
vestigator finds, for example, that the oc- 
currence of Action A raises the probability 
of a consequent Action B, A may be an organization 
signal or a strategy signal. Action 
A should be hypothesized as a strategy sig- 
nal for B only after it has been rejected as 
part of an organization signal for B. This ac- 
tive rejection was part of the evaluation of 
each of the strategy signals hypothesized in 
the Results section. 

This discussion indicates that the occur- 
rence of a legitimate alternative, B, may be 
affected both by a preceding organization 
signal and by a preceding strategy signal. 
The hypothesis of a strategy signal enables 
the investigator to account for more of the 
variance in the occurrence of B than does an 
organization signal alone. However, as a 
legitimate alternative, the occurrence of B can never be entirely
predicted on the basis of preceding actions. The partner is not in any
sense “caused” to do B on the basis of preceding actions. Rather, in
choosing to do B, the partner may be influenced to a greater or lesser
extent by the participant’s preceding organization and strategy
signals.


Speaker’s Head-Direction Signal 


In the case of head direction and the turn signal, it is clear that
the speaker’s head being turned toward the auditor preceded many
auditor attempts to take the turn; otherwise, the strategy-signal
results (in terms of consequents) could not have been obtained. It was
in differentiating smooth exchanges from simultaneous turns that head
direction was markedly worse than the proposed turn-signal cues. Thus,
even though the probability that the speaker’s head direction precedes
the auditor’s turn attempts may be relatively high for many
participants, head direction fails to meet the criteria set for
turn-signal cues. The finding of a strong and consistent strategy
effect, in the absence of the required organization effect, led us to
propose a new interaction component: the strategy signal.

Despite the extensive literature on gaze direction (see Davis, 1972;
Key, [...]; Argyle & Cook, 1976, for references), there has not been
intensive study of the relation of gaze direction to turn taking.
Relevant exceptions would be an early study by Kendon (1967) and two
more recent studies further exploring the phenomena Kendon described
(Beattie, 1978; Rutter, Stephenson, Ayling, & White, 1978; see also an
accompanying commentary by Kendon, 1978). Unfortunately, none of these
studies provides data on the effect reported here: given a turn
signal, an increased probability that the auditor will attempt to take
the turn when the speaker is gazing. These investigations studied
other parameters, such as the effect of [...] gaze on length of the
pause between turns and the proportion of turn [...] in which the
speaker or auditor is looking or not looking.


The evidence presented here focuses on head direction (presumably
gaze) with respect to a specific phenomenon: the exchange of speaking
turns. Ellsworth and Langer (1976) take a broader perspective, [...]
that staring “elicits attention, [...] and a sense of interpersonal
involvement. The type of involvement inferred and [...] perceived to
be appropriate depend on contextual cues” (p. 122). Although we are
not certain that Ellsworth and Langer would consider staring to
include the intermittent gazing typically found in conversations, we
believe that the results reported here are entirely consistent with
their hypothesis. The increased arousal resulting from speaker gaze
may have prompted the higher rates of turn attempts by auditors. From
the perspective of Ellsworth and Langer, our results bear on some of
the “contextual cues” relevant to the appropriate [...] of gaze in the
observed conversations.

ing Signal 


miling, the claim that speaker 
S the probability of auditor 
haps less surprising than the 
smiling is not part of an 
$ for auditor smiling in the 
rsations. Yet, in our conversa- 
true both that some speaker 
te not reciprocated by the auditor 
Some auditor smiles were not pre- 
į er smiles, (For an interesting 
ditor failure to reciprocate 
8, see Rosenfeld, 1972.) Thus, 
ip between auditor and speaker 
necessarily a simple one. In 
, the proposal to include 
within the more general au- 

Class of actions seems 
table, is supported by a 
lalyses (Brunner, in press), 
htage of extending the prov- 
er within-turn signal (Dun- 
n & Fiske, 1977), while 
sity of hypothesizing an 
Organization signal.

Several studies have reported correlations
between interacting subjects in the number 
or rate of smiles (Duncan & Fiske, 1977; 
Rosenfeld, 1966, 1967). Such correlations 
provide indirect support for the results re- 
ported here. But because data based on 
frequencies or rates do not contain informa- 
tion on the location of smiling in the inter- 
actions, they do not allow us to evaluate the 
degree to which smiles by a participant were 
actually reciprocated by the partner. 


Number of Speaker Turn Cues as a Signal 


As a strategy signal, number of turn cues 
is somewhat different from the two just dis- 
cussed. Both head direction and smiling are 
actions that frequently precede their respec- 
tive subsequent actions. In contrast, number 
of turn cues employs the same actions as the 
turn signal, but differs in the rule describing 
the way in which turn cues are combined. Ef- 
fective results were obtained for the turn 
signal when number of turn cues displayed 
was ignored. The occurrence of any single 
cue constituted a full organization signal 
display. Number of turn cues displayed was 
considered and rejected as a possible element 
of the turn-signal definition. Analyses sug- 
gested that number was quite irrelevant to 
the criteria of (a) preceding smooth ex- 
changes and (b) differentiating smooth ex- 
changes from instances of simultaneous turns. 

However, it seems additionally true for 
the conversations we observed that the prob- 
ability of a smooth exchange increases in 
some manner as the number of turn cues in- 
creases. More specifically, for most partici- 
pants the probability of smooth exchanges 
after the display of three, four, five, or six 
turn cues is greater than after the display 
of one or two turn cues. 
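
The comparison described above (smooth exchanges after one or two cues versus after three to six cues) can be illustrated with a minimal sketch. The function name and the observation records are hypothetical and invented purely for illustration; they are not the authors' data or procedure.

```python
from collections import defaultdict


def smooth_exchange_rates(records):
    """Return P(smooth exchange | cue band) for the bands '1-2' and '3-6'.

    records: iterable of (n_cues, was_smooth) pairs, one per auditor turn attempt.
    """
    counts = defaultdict(lambda: [0, 0])  # band -> [smooth count, total count]
    for n_cues, was_smooth in records:
        if not 1 <= n_cues <= 6:
            continue  # the comparison in the text covers one to six cues only
        band = "1-2" if n_cues <= 2 else "3-6"
        counts[band][0] += int(was_smooth)
        counts[band][1] += 1
    return {band: smooth / total for band, (smooth, total) in counts.items()}


# Hypothetical observations (assumption): (number of turn cues, smooth exchange?)
example = [(1, False), (1, True), (2, False), (2, False),
           (3, True), (4, True), (5, False), (6, True)]
rates = smooth_exchange_rates(example)
```

In practice such rates would be computed separately for each participant, since the claim in the text is made "for most participants" rather than for pooled data.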


Individual Differences 


As mentioned in the Results section, Tables 
1, 2, and 3 provide data on two types of 
strategy effects. On the one hand, there is 
evidence of consistency: Each speaker’s 
head direction or smiling may be seen to 
change in the same direction the probability
of the relevant response by each auditor.
This is the strategy-signal effect. On the
other hand, variation between partners is 
evident in rates of response and in the degree 
to which that response is increased following 
the hypothesized strategy signal. This varia- 
tion reflects the individual differences one 
may expect to find in interaction strategy, 
an area of research that extends well beyond 
strategy signals. 


Generality of Results 


How general are these strategy-signal re- 
sults? We hold that generality in research 
of this sort is entirely an empirical matter, 
not a logical one. The organization and 
strategy signals discussed in this article are 
elements of social practices or conventions 
hypothesized to be operating in the observed 
interactions. The cultural boundaries of the
use of these practices are an issue that must
be determined empirically. A strict analogy
would be work on language dialects. An
investigator may interview one or more
informants, gathering sufficient information to
describe their dialect, but only dialect survey 
research can establish the geographical and 
demographic area in which the dialect is 
spoken. In dealing with social practices, we 
do not believe that it is appropriate to gen- 
eralize to a postulated a priori domain, as is 
commonly done in some other types of be- 
havioral research. 

Nevertheless, we may cite circumstantial 
evidence that the strategy effects we report 
are not entirely confined to the conversations 
studied. With the exception of Conversa- 
tions 2 and 4, the participants were unac- 
quainted with each other prior to their in- 
teraction. The participants came from a 
rather wide variety of social and educational 
backgrounds and geographical areas. Al- 
though the settings were not completely nat- 
ural, they were naturalistic and, hence, seem 
fairly representative of everyday interactions.
In terms of the three dimensions of natural- 
ness identified by Tunnell (1977), the be- 
havior was natural and the treatment was 
also natural, because the stimuli for each 
participant were provided by the partner, 


S. DUNCAN, JR., L. BRUNNER, AND D. FISKE 


and there were no experimental manipulations.
For these reasons, findings from our
studies may generalize more readily than
results from contrived laboratory investigations
or from responses in testing rooms.

Regardless of the domain to which a set
of results applies, in the process of face-to-face
interaction, a hypothesized organization
implies strategy, and vice versa. Descriptions
of strategy must be coordinated with
descriptions of organization. Research on
interaction requires careful examination of each
side of an action sequence.


Editors Named for New Journal of Personality
and Social Psychology 


The Publications and Communications Board has named Melvin Manis, Ivan 
D. Steiner, and Robert Hogan to edit the Journal of Personality and Social 
Psychology (JPSP), which will appear in a new format in January 1980. The
[...] Veteran’s Hospital, Ann Arbor; University of Massachusetts; and Johns Hop-
[...] and Hogan the section on PERSONALITY PROCESSES AND INDIVIDUAL DIFFERENCES.

An announcement on the restructuring of JPSP appeared on p. 668 of the 
June 1978 issue of JPSP. As indicated at that time, the three editors will work 
independently with separate groups of advisory editors. Their acceptances will 
appear as distinct sections in monthly issues of JPSP. Detailed descriptions of 
each section’s coverage will be published in JPSP next month. 

Manuscripts for JPSP are currently being received by Acting Editor Clyde 
Hendrick through March 31, 1979. As of April 1, 1979, authors should submit 
manuscripts to the appropriate section editor as follows: 


ATTITUDES AND SOCIAL COGNITION 


April 1 through June 15:

Melvin Manis
c/o Anita DeVivo
Suite 700
1400 North Uhle Street
Arlington, Virginia 22201

After June 15:

Melvin Manis
Research Center for Group Dynamics
Institute for Social Research
University of Michigan
Ann Arbor, Michigan 48104


INTERPERSONAL RELATIONS AND GROUP PROCESSES 


Ivan D. Steiner 

Department of Psychology 
University of Massachusetts 
Amherst, Massachusetts 01003 


PERSONALITY PROCESSES AND INDIVIDUAL DIFFERENCES 
Robert Hogan 
Johns Hopkins University 
Department of Psychology 
33rd and Charles Streets 
Baltimore, Maryland 21218


Attitudes Cause Behaviors: A Cross-Lagged Panel Analysis


John J. Berman 
University of Nebraska—Lincoln 


Cross-lagged panel correlations were computed between attitudes and behaviors 
with respect to college students’ ratings of four issues: Jimmy Carter’s presidential
candidacy, Gerald Ford’s candidacy, drinking, and religion. For all four
issues, attitudes showed causal predominance over behaviors. These results are 
interpreted as supporting McGuire’s views on the attitude-behavior relationship 
but as being contrary to Wicker’s situational determinism and self-perception 
theory. Kelman’s view of reciprocal causation between attitudes and behaviors 


hypothesis here to be defended says that this 
order of sequence is incorrect” (p. 190). 

At the most basic level, one could assume 
four types of relationships between attitudes 
and behavior: Attitudes cause behaviors; be- 
haviors cause attitudes; there is reciprocal 
causation; or the two are causally unrelated. 
Each of these positions has a contemporary 
proponent. McGuire (1976) has argued that 
attitudes generally lead to behavior. D. Bem 
(1972) has proposed that when internal cues 
are weak or ambiguous, situationally deter- 
mined behaviors will lead to attitudes. Kel- 
man (1974) maintains that attitudes guide 
behaviors but that behaviors also guide at- 
titudes. Wicker (1969) reviewed a number of 
studies on the relationship between attitudes 
and behaviors and concluded that “taken as 
a whole, these studies suggest that it is con- 
siderably more likely that attitudes will be 
unrelated or only slightly related to overt be- 
haviors than that attitudes will be closely 
related to actions” (p. 65). Hence, the basic
question that James raised has yet to be resolved.
Nevertheless, the results of many
laboratory studies supporting Bem and/or 
Wicker have created some degree of pessimism 
about the predictive utility of attitudes (cf. 
McGuire, 1976). 

Although discriminating between these 
four hypotheses in terms of which is more 
accurate is important for attitude theory and 
research, it has been difficult to contrast 
these hypotheses within one laboratory study, 
in part because, typically, only attitudes or 
only behaviors are used as independent variables.
Thus, the reviewer must rely only on
cross-study comparisons to reach any con- 
clusions. This reliance is always risky, be- 
cause of the numerous differences unrelated 
to theory that exist between different studies. 
Furthermore, random assignment of individ- 
uals to attitudes may be misleading, since the 
laboratory-induced attitudes either may be 
trivial or may interact with prior attitudes. 

An entirely new methodology may be neces- 
sary before we can progress to a new level of 
understanding empirically the relationship be- 
tween attitudes and behaviors. Kenny (1975) 
has developed a different methodology, cross- 
lagged panel correlation, which may prove 
useful for a better understanding of these 
issues. This article reports on the results of 
a cross-lagged panel correlation study of the 
relationship between attitudes and behaviors. 


Method 
Subjects 


The 463 introductory psychology students who 
filled out questionnaires at both of the times they 
were administered were the subjects for this research.
Each subject was asked questions about the candi- 
dacy of Jimmy Carter and Gerald Ford for the 
presidency, about drinking, and about religion. After 
deletion of subjects for whom at least one data 
point was missing for a particular set of variables, 
the sample size for the political issues was 362, for 
the drinking issues was 326, and for the religion 
issues was 394. 


Cross-Lagged Panel Correlation 


Because some readers may be unfamiliar with 
cross-lagged panel correlation (CLPC), a brief over- 
view of the method is given here. A comprehensive 
overview has been given by Kenny (1975).

CLPC is a quasi-experimental design useful for
testing causal hypotheses from correlational data.
Within a panel (longitudinal) study, there may be 
both panel variables, which are relatively fluctuating 
(although they should be reliably measured) vari- 
ables measured at more than one time, and control 
variables, which are relatively stable variables mea- 
sured at only one point in time (e.g., sex). In the 
simplest case in which CLPC can be used, two panel 
variables, A and B, are measured at two separate 
times, Time 1 and Time 2. There are six possible 
correlations that can be computed with these data: 
two synchronous correlations ($r_{A_1B_1}$ and $r_{A_2B_2}$), two
autocorrelations ($r_{A_1A_2}$ and $r_{B_1B_2}$), and two cross-lagged
correlations ($r_{A_1B_2}$ and $r_{B_1A_2}$). When the cross-lagged
correlations are different, as indicated by a
Pearson-Filon z test for differences between correlated
correlations (Kenny, 1975), and when various
other assumptions have been satisfied, one may
state that there is a causal relationship between the
variables. For example, if $r_{A_1B_2} > r_{B_1A_2}$, if both are
positive, and if other assumptions have been satisfied,
one may state that increases in A cause increases
in B, as opposed to stating that increases in
B cause increases in A.
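
As a minimal sketch of the simplest two-variable, two-wave case, the six correlations can be computed from raw scores as follows. The function name, dictionary keys, and simulated data are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np


def clpc_correlations(a1, b1, a2, b2):
    """Return the six CLPC correlations for panel variables A and B at Times 1 and 2."""
    r = np.corrcoef([a1, b1, a2, b2])  # row/column order: A1, B1, A2, B2
    return {
        "sync_t1": r[0, 1],      # r(A1, B1), synchronous
        "sync_t2": r[2, 3],      # r(A2, B2), synchronous
        "auto_a": r[0, 2],       # r(A1, A2), autocorrelation
        "auto_b": r[1, 3],       # r(B1, B2), autocorrelation
        "cross_a1b2": r[0, 3],   # r(A1, B2), cross-lagged
        "cross_b1a2": r[1, 2],   # r(B1, A2), cross-lagged
    }


# Simulated data (assumption, not the study's data) in which A influences later B.
rng = np.random.default_rng(0)
n = 2000
a1 = rng.normal(size=n)
b1 = 0.3 * a1 + rng.normal(size=n)
a2 = 0.6 * a1 + rng.normal(size=n)
b2 = 0.3 * b1 + 0.6 * a1 + rng.normal(size=n)
corrs = clpc_correlations(a1, b1, a2, b2)
```

With data generated this way, the A1-B2 cross-lag should exceed the B1-A2 cross-lag, which is the pattern the text describes as consistent with A causing B.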

In true experiments, randomization is used to
rule out the plausibility of spuriousness (third-variable
effects) as an alternative explanation to experimental
effects. CLPC rules out spuriousness as an
alternative explanation to research effects by means
of two assumptions about the data: synchronicity
and stationarity. Synchronicity means that the scores
that comprise the synchronous correlations measure
processes that occurred at the same time, and stationarity
means that the causal structure for the
variables did not change over time. Kenny (1973)
has formulated a mathematical model showing that, given
synchronicity, stationarity, and spuriousness,
cross-lagged correlations do not differ significantly.
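
The comparison of the two cross-lagged correlations can be sketched with the large-sample Pearson-Filon statistic for two dependent, nonoverlapping correlations as it is commonly given in the statistical literature. Whether this matches the exact variant used here following Kenny (1975) is an assumption; the function name and the example values are hypothetical.

```python
import math


def pearson_filon_z(r12, r34, r13, r14, r23, r24, n):
    """Large-sample z for the difference r12 - r34 when all four variables
    are measured on the same n subjects.

    For CLPC, take 1 = A1, 2 = B2, 3 = B1, 4 = A2, so that r12 and r34 are the
    two cross-lags, r13 and r24 the synchronous correlations, and r14 and r23
    the autocorrelations.
    """
    k = ((r13 - r12 * r23) * (r24 - r23 * r34)
         + (r14 - r13 * r34) * (r23 - r12 * r13)
         + (r13 - r14 * r34) * (r24 - r12 * r14)
         + (r14 - r12 * r24) * (r23 - r24 * r34))
    variance = (1 - r12 ** 2) ** 2 + (1 - r34 ** 2) ** 2 - k
    return math.sqrt(n) * (r12 - r34) / math.sqrt(variance)


# Hypothetical correlations (assumption): cross-lags .53 and .18,
# synchronous correlations .40, autocorrelations .60, n = 362.
z_example = pearson_filon_z(0.53, 0.18, r13=0.40, r14=0.60, r23=0.60, r24=0.40, n=362)
```

The resulting z would be referred to the standard normal distribution. When the four remaining correlations are zero, the statistic reduces to the familiar test for two independent correlations expressed in the r metric.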

Stationarity may be of three types: perfect stationarity,
proportional stationarity, or quasi stationarity.
When the structural equation of the variables
does not change over time, perfect stationarity
is obtained. A lack of change in synchronous correlations
is consistent with perfect stationarity. Proportional
stationarity refers to the case in which the
structural equations of the variables change by some
proportional constant over the time lag. Quasi stationarity
refers to the case in which the structural
equation of each variable changes over time by a
constant unique for each variable (Kenny, 1[...]).
The implication of the assumption of quasi stationarity
is that the synchronous correlations would
equal each other if corrected for attenuation due to
measurement unreliability. If a set of data meets
the assumptions of quasi stationarity but not perfect
or proportional stationarity, it is necessary to
adjust the cross-lagged correlations prior to testing
for spuriousness. This adjustment is necessary because
variables with decreasing reliability would tend to
appear to be causes, and variables with increasing
reliability would tend to appear to be effects.
One can test the various assumptions of stationarity
[...]ally (Kenny, 1973, 1975, Note 1) when one
has more than two panel variables.
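
The idea of correcting a correlation for attenuation due to unreliability can be sketched with the classical Spearman formula. This is a stand-in chosen for illustration: it is not Kenny's adjustment procedure, and the reliabilities and correlations below are hypothetical.

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical values (assumption): reliability rises from .80 to .90 between
# waves, so the observed synchronous correlations differ (.40 versus .45).
sync_t1 = disattenuate(0.40, rel_x=0.80, rel_y=0.80)
sync_t2 = disattenuate(0.45, rel_x=0.90, rel_y=0.90)
```

In this invented case the two corrected synchronous correlations agree even though the observed ones do not, which is the pattern the text associates with quasi stationarity.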


Advantages and Limitations of CLPC


CLPC has an obvious advantage over traditional
correlations in that it can be used for causal inference.
It also has some advantages over traditional
analysis of variance designs, since it does not necessarily
assume randomization and variable manipulation.
As a correlational design, it has the further advantage
that one need not discriminate between independent
and dependent variables a priori. In the
attitude-behavior controversy, for example, both attitudes
and behaviors could have equal probability
of showing up as a cause or as an effect in a CLPC
[...].

Although CLPC has several advantages over analysis
of variance designs, it also has several limitations
and weaknesses in comparison to analysis of variance
designs. The first limitation is that CLPC is a low-power
statistical procedure. Because of this low
power, the alpha level for this research was set at
[...] for the Pearson-Filon tests of differences between
cross-lagged correlations.

Another problem with CLPC is specifying the appropriate
time lag. Nothing about the procedure
specifies what the appropriate time lag might be
with respect to a particular research question. Quite
[...] social science theory also fails to guide researchers
to an appropriate time lag. This problem
[...] exists in other longitudinal research and in
[...] research. In the area of attitude change,
[...] McGuire (1960) argued that laboratory
[...] manipulations may not show
[...] effect for a week. The time lag for noticeable
attitude change in the absence of an experimental
[...] is probably often much longer.

[...] solution to this problem in the ideal
[...] sample numerous time lags. In the
[...] time lag was identical for all
issues, [...] did differ with respect to the
[...] they were expected to change. During
[...] election [...], one may assume that political perspectives
[...] under a great deal of pressure to change,
[...] and drinking perspectives may
[...] more slowly. Further advantages and
[...] CLPC are given elsewhere (Kahle,
[...]; Kenny, 1975, in press).


[...] two panel variables, attitude and
behavior, were measured for four issues: Jimmy
Carter’s presidential candidacy, Gerald Ford’s candidacy,
drinking, and religion.1 It was hypothesized
that for [...] the four issues, attitudes would [...]

Materials


The items for all of the panel variables were 
developed specifically for the purposes of this re- 
search, in order to test the constructs of interest: 
attitudes and behaviors. An initial pool of items 
was generated by consulting some previous tests of 
the relevant issues (e.g., Manson, 1963; Robinson
& Shaver, 1973) and by using the authors’ knowledge 
of the issues. These 89 items were pilot tested on 
a group of 41 introductory psychology students. 
Topics of concern in this pilot study included 
whether scale points on the questionnaire were ap- 
propriately spaced and whether the questionnaire 
contained demand characteristics or other factors 
that could potentially bias the responses. Tests of 
internal consistency (Burns, Note 2) generally guided 
decisions about what items would be included in 
the variables for the actual research, but in some 
cases intuitions overruled psychometrics, because 
the Republican party had not yet selected a presi- 
dential candidate at the time of the pilot study. 

Each subject in the actual research responded to 
all of the panel-variable items twice and all of the 
control-variable items once. Before any CLPC anal- 
yses were completed, the items for each panel vari- 
able were analyzed to test whether they were de- 
tracting from the internal consistency of the vari- 
able. Three political behavior items were eliminated 
because they detracted from the internal consistency 
of the variable that they were intended to repre- 
sent. The eliminated items were ones that had 
highly skewed distributions and very small vari- 
ances (e.g., “How many times in the past 14 days 
have you put up a sign or bumper sticker support- 
ing Jimmy Carter?”). 

Subjects rated each attitude item on a scale ranging 
from strongly agree (1) to strongly disagree (9). For 
the behavior items, subjects were asked how many 
times in the past 14 days they had engaged in the 
activity mentioned in the item. The items for each 
variable (i.e., attitude and behavior) within each
issue were summed to obtain one score for each
variable.

Table 1 presents an example of an item from each
set of questions used to measure the panel variables.
The item that correlated most highly with the total 
score for all of the other items in each panel-variable 
scale was selected for inclusion in Table 1. 

The scales appear to be adequate psychometrically. 
Table 2 presents two measures of internal consistency, 
Cronbach’s (1951) alpha and Wollin’s index of 
single-factoredness (Burns, Note 2), for each panel 
variable from both times of measurement. Each 


1 A third panel variable, stimulus condition self-selection,
was also included in Kahle (1977). This
third panel variable was used in the data reported
here for tests of stationarity, since it is necessary to
have at least three panel variables to test for stationarity.


Table 1
Representative Item From Each Set of Questions

Variable          Item

Carter issues
  Attitude        I strongly support Jimmy Carter for president. (5)
  Behavior        How many times in the past 14 days have you made a favorable comment
                  about Jimmy Carter? (7)

Ford issues
  Attitude        I strongly support Gerald Ford for president. (5)
  Behavior        How many times in the past 14 days have you made a favorable comment
                  about Gerald Ford? (7)

Drinking issues
  Attitude        I like to drink. (14)
  Behavior        How many times in the past 14 days have you had at least one drink? (9)

Religion issues
  Attitude        Being active in church is important. (9)
  Behavior        How many times in the past 14 days have you prayed in private?
                  How many times in the past 14 days have you been with people from
                  your church? (9)

Note. The number in parentheses after each item is the total number of items in that item’s variable scale.


variable was also correlated with the Bem Sex Role 
Inventory (BSRI) Social Desirability Scale (S. 
Bem, 1974). All correlations were quite low, with 
the mean r* = .01 and the median r = —.008. 
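
The two scale operations described in this section, summing items into one score per variable and checking internal consistency with Cronbach's alpha, can be sketched as follows. The function name and the ratings are hypothetical; Wollin's index is not reproduced because its formula is not given in the text.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for a subjects-by-items array of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items
    item_var_sum = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of summed scale
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical ratings (assumption): 5 subjects, 3 attitude items on a 1-9 scale.
ratings = np.array([[1, 2, 1],
                    [3, 3, 4],
                    [5, 4, 5],
                    [7, 8, 7],
                    [9, 8, 9]])
scale_scores = ratings.sum(axis=1)   # one summed score per subject
alpha = cronbach_alpha(ratings)
```

Items that detract from internal consistency, such as the highly skewed, low-variance behavior items mentioned earlier, would lower alpha when included.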

A number of control variables were measured 
along with the panel variables. At Time 1, subjects 
provided demographic information and completed 
the Bem Sex Role Inventory. At Time 2, subjects 
answered questions about their religious affiliation, 
interest in the presidential campaign, self-perceptions 
of the importance of their attitudes (“Among all of 
your attitudes, how important are your attitudes 
toward alcoholic beverages?”), and their attitudinal 
entailment (“How many attitudes do you have which 
are related to your attitudes toward alcoholic bever- 
ages?”) (Kahle, in press). A principal-components 
factor analysis with a varimax rotation was performed
on the Bem Sex Role Inventory,2 and all
items with a factor loading greater than .55 after
rotation were summed for each factor with an
eigenvalue greater than 1.6. These factors were used
as control variables. 

The purpose of control variables in CLPC is to 
increase the stationarity of the data (Kenny, 1975). 
Three criteria were employed to select which of the 
many control variables would in fact be used for 
the cross-lagged panel analysis. To be included, a 
control variable had (a) to account for some notice- 
able variance of several of the panel variables within 
a given analysis (Kenny, 1975), (b) to have some 
theoretical relevance, and (c) not to reduce sample 
size excessively because of missing data. 

Control variables for each set of panel variables 
were partialed from all panel variables in that set 
before any cross-lagged panel tests were performed 
(Kenny, Note 1). The control variables that met 


all three of the above criteria for inclusion with
respect to political variables included independence
(a factor from the Bem Sex Role Inventory), hometown
size, importance of political attitudes, and
number of debates observed. The drinking control
variables were year in college, sex, importance of
drinking, and entailment (Kahle, in press) of
drinking. The religion control variables were hours
per week of work during the school year, [...]
self-support for education, and importance of religion.
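
Partialing control variables from a panel variable can be sketched as ordinary least-squares residualization. That the partialing was carried out by regression residuals is an assumption made here for illustration; the function name and the simulated data are hypothetical.

```python
import numpy as np


def partial_out(panel, controls):
    """Return the residuals of `panel` after regressing it on `controls`
    (plus an intercept) by ordinary least squares."""
    panel = np.asarray(panel, dtype=float)
    controls = np.asarray(controls, dtype=float)
    if controls.ndim == 1:
        controls = controls[:, None]          # treat a single control as one column
    design = np.column_stack([np.ones(len(panel)), controls])
    coef, *_ = np.linalg.lstsq(design, panel, rcond=None)
    return panel - design @ coef


# Simulated scores (assumption): one control variable and one attitude scale.
rng = np.random.default_rng(1)
control = rng.normal(size=200)
attitude = 2.0 * control + rng.normal(size=200)
attitude_resid = partial_out(attitude, control)
```

The residualized scores are uncorrelated with the controls by construction, and the cross-lagged correlations would then be computed on these residuals.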


Procedure 


The potential pool of respondents was all students
enrolled in a large introductory psychology class.


2 Other researchers at the same university were administering
the BSRI to the same subjects [...].
[...] with yet another rating scale, a more adequate
measure of social desirability was not administered.
Given the availability of the data from the rest of
the BSRI, it was decided to use those data, too.
Bem’s original masculinity, femininity, and androgyny
scores did not account for major amounts of variance
in the attitude and behavior data. In general,
the factor analysis showed about the same
factor structure that one would expect based on
Bem’s (1974) reported internal consistency measures,
with some minor qualifications probably due to differences
in subject populations.

At the first meetings of this class, the instructors introduced
the course and then asked the students to
complete a “Background Information Questionnaire.”
Students attended these sessions in groups of approximately
75 during the week of August 30, 1976.
The date was less than 2 weeks after the Republican
National Convention had nominated Gerald Ford.

The students were offered half an hour of research
participation credit toward their 3-hour requirement
to complete the questionnaire again during
the last complete week prior to the presidential
election (October 25, 1976). Thus, the time lag was
selected to be appropriate for the political issues,
and it may or may not have been appropriate for 
the drinking issues or the religion issues. 


Results 


Table 3 displays the results from the tests 
of differences among the cross-lagged correla- 
tions. For all four issues, the “attitudes lead 
to behaviors” hypothesis was significantly sup- 
ported in the direction predicted. 

For those readers familiar with CLPC,
several additional details may be useful in interpreting
Table 3. All synchronous correlations
and autocorrelations were positive and
greater than .30. The degrees of freedom
for the tests were 360 for the Carter variables,
[...] for the Ford variables, 392 for the religion
variables, and 324 for the drinking
variables.
Table 2
Internal Consistency of Scales: Cronbach’s
Alpha^a and Wollin’s Index^b of Single-Factoredness


Cronbach’s alpha        Wollin’s index


Variable 


Time 1 Time 2 Time 1 Time 2 


lay 
ee 92.89 ‘93.95 


+85 -86 94 96 
-79 80 +82 82 


^a Cronbach (1951).


ttitude 

5 8 

er 2 Ste eee 
Ord issues 

Baitude 84.86 94 983 

Bich. 84 75 97 92 

ating issues 

pep Ude 9 

havi -90 92 85 91 

^b Burns (Note 2).


Table 3
Attitude-Behavior Cross-Lagged
Correlation Test Results

Issue                 $r_{A_1B_2}$   $r_{B_1A_2}$   Difference   z^a

Carter’s candidacy        .53            .18           .35       6.29**
Ford’s candidacy          .57           [...]          .44       [...]**
Religion                  .57            .49           .07       1.84*
Drinking                  .58            .48           .10       2.20**

^a Pearson-Filon test for the differences between
correlated correlations, following Kenny (1975).
* p < .1.
** p < .05.


Only the political data have been corrected 
to meet the assumptions of quasi stationarity. 
This correction was statistically necessary 
(probably because the salience of politics in- 
tensified during the time lag), in order to 
rule out changing reliability as an alternative 
explanation to obtained effects. Uncorrected 
variables with increasing reliability could
otherwise spuriously appear to be effects 
(Kenny, 1975). The disadvantage of this 
correction is that it may increase sampling 
error and thereby make the Pearson-Filon 
test more approximate. In the present case, 
in which the political data do not yield 
borderline results, the approximate nature of 
the significance test is probably not a prob- 
lem. For the religion data, the assumption 
of perfect stationarity was tenable, and for 
the drinking data, the assumption of propor- 
tional stationarity was tenable; thus, it was 
unnecessary to adjust the religion and drink- 
ing data to meet the assumptions of quasi 
stationarity.

Kenny’s weights (Note 1) for estimating 
the relative sampling error for the Carter, 
Ford, drinking, and religion data were, re- 
spectively, 2.2, 1.9, 8.2, and 7.9. In general, 
weights of less than 1 should not be inter- 
preted, and weights between 1 and 2 should 
be interpreted with caution (Kenny, Note 
1). These weights, then, imply that a slight 
amount of caution is called for in interpret- 
ing the Ford data. Kenny’s (Note 1) overall 
reliability (communality) ratios for the
Carter, Ford, drinking, and religion data
were 1.178, 1.264, 1.044, and 1.008, respectively.
Reliability ratios greater than 1 indicate
increasing reliability, whereas ratios less
than 1 indicate decreasing reliability. Here 
reliabilities increased slightly. The larger of 
the cross-lagged correlations tended to be 
slightly larger than the synchronous correla- 
tions. Since Kenny (1975) argues that a 
cross-lag larger than the synchronous correla- 
tions is indicative of a causal effect, this re- 
sult provides additional (although minor) 
evidence consistent with the inferences. 


Discussion 


The results are straightforward. Attitudes 
had causal predominance over behaviors for 
all four issues. The weak or nonexistent re- 
lationship between attitudes and behaviors 
that one would expect in the absence of ex- 
tensive situational information, if one were 
to accept Wicker’s (1969) analysis, was not 
supported here. This study provided ample 
evidence for the utility of assuming that at- 
titudes lead to behaviors. In all cases the 
correlation between attitudes at Time 1 and 
behaviors at Time 2 was greater than .5. 

This study clearly implies that knowledge 
of attitudes has an important degree of pre- 
dictive utility. For the politicians, theolo- 
gians, and alcoholism therapists who may be 
interested in knowing whether their efforts 
to change attitudes will help them to achieve 
their ultimate goals of changing behaviors, 
the answer appears to be yes. Furthermore, 
knowledge about present attitudes will help 
predict future behaviors (in the absence of 
change). The pessimism about the predictive 
utility of attitudes is to some extent dimin- 
ished. With McGuire (1976) we can call into 
question Wicker’s emphasis: 


It is currently fashionable to stress situational de- 
terminants of behavior and question whether 
there are any personal determinants at all (Endler 
and Hunt, 1968; Mischel, 1973; Alker, 1972; 
Endler, 1973; [D.] Bem and Allen, 1974). I 
might seem, therefore, to be violating the current 
orthodoxy in the yin and yang of conventional 
wisdom on this issue when I stress the role of at- 
titudes in predicting behaviors that purportedly 
are determined primarily by situational factors. 
(p. 9) 


The attitude-behavior cross-lagged correlation
differences also contradict at least the
starting point of self-perception theory (D.
Bem, 1972). Although these results do not 
“disprove” self-perception theory, since Bem 
theorizes that behaviors will lead to attitudes 
only “to the extent that internal cues are 
weak, ambiguous, or uninterpretable” (p. 2), 
the results do suggest that with respect to 
topics such as politics, religion, and drinking, 
this qualification to self-perception theory 
may need to be invoked more often than the 
theory itself. And with respect to renditions
of self-perception theory that minimize the
importance of this qualification (e.g., Kleinke,
1978), the present data are quite damaging.
The preponderance of attention given to
artificial laboratory studies during the past
several decades of attitude research may
have distorted our ability to discriminate between
major and minor naturally occurring
processes involved in attitude change. By
way of qualification, however, it is important
to note that the data reported here have at
best only slight implications concerning issues
such as the heuristic utility of self-perception
theory in areas such as attribution
research. Even if the starting point of a the- 
ory is called into question, the total theory 
may nevertheless make other valuable con- 
tributions (Kahle, 1978). 

Kenny’s (1975) contribution in developing
CLPC may have as many implications
for theory as it has methodological implications,
because of the way it allows researchers
to ask questions. In CLPC research it is not
necessary to determine irrevocably which of
several factors is causal prior to conducting
the research. Kenny’s methodology also allows
the investigation of reciprocal causation
and cyclical causation (which may be characteristic
of attitudes and behaviors) through
the sampling of multiple time lags.

With respect to reciprocal causation, the
present data do not suggest that behaviors
are causally void. Certainly, a number of
laboratory studies suggest that in some
circumstances, behaviors can lead to atti-
tudes. A particularly fruitful area to explore
in this vein may be the differences in attitudes
about objects, the typical topic of research,
and attitudes about behaviors (Fishbein &
Ajzen, 1974). Even when behaviors cause
attitudes, however, it may be that the re-
sultant attitudes themselves will lead to new
behaviors (Kelman, 1974). McGuire’s (1976)
discussion of reciprocal causation is especially
Besa 

Although the results of the present study 
clearly show that attitudes have causal pre-
dominance over behaviors on these issues, a
few words of caution are in order. In some
cases, a difference in levels of specificity be-
tween attitudes and behaviors may lead to 
inconsistency (Weigel, Vernon, & Tognacci, 
1974). People may often hold contradictory 
attitudes (Kelman, 1974). Attitudes may be 
better predictors of groups of behaviors than 
of specific behaviors (Fishbein & Ajzen,
1974). What researchers view as attitude- 
behavior consistency may not be what re- 
search subjects view as consistency (Bern- 
stein & Kahle, Note 3; cf. also Bem & Allen, 
1974). And, of course, other things besides
attitudes influence behaviors. 
Egocentric Biases in Availability and Attribution


Michael Ross and Fiore Sicoly 
University of Waterloo, Waterloo, Canada 


Five experiments were conducted to assess biases in availability of information 
in memory and attributions of responsibility for the actions and decisions that 
occurred during a previous group interaction. The subject populations sampled 
included naturally occurring discussion groups, married couples, basketball 
teams, and groups assembled in the laboratory. The data provided consistent 
evidence for egocentric biases in availability and attribution: One’s own con- 
tributions to a joint product were more readily available, that is, more fre- 
quently and easily recalled; individuals accepted more responsibility for a group 
product than other participants attributed to them. In addition, statements at-
tributed to the self were recalled more accurately and the availability bias was 
attenuated, though not eliminated, when the group product was negatively eval- 
uated (Experiment 2). Finally, when another participant’s contributions were 
made more available to the individual via a selective retrieval process, the 
individual allocated correspondingly more responsibility for the group decisions 
to the coparticipant (Experiment 5). The determinants and pervasiveness of 


the egocentric biases are considered. 


One instance of a phenomenon examined 
in the present experiments is familiar to al- 
most anyone who has conducted joint re- 
search. Consider the following: You have 
worked on a research project with another 
person, and the question arises as to who 
should be “first author” (i.e., who contributed 
more to the final product?). Often, it seems 
that both of you feel entirely justified in 
claiming that honor. Moreover, since you are 
convinced that your view of reality must be 
shared by your colleague (there being only 
one reality), you assume that the other person 
is attempting to take advantage of you. Some- 
times such concerns are settled or prevented 
by the use of arbitrary decision rules, for ex- 
ample, the rule of “alphabetical priority”— 
a favorite gambit of those whose surnames
begin with letters in the first part of the 
alphabet. 

We suggest, then, that individuals tend to 
accept more responsibility for a joint product
than other contributors attribute to them. It 
is further proposed that this is a pervasive 
phenomenon when responsibility for a joint 
venture is allocated by the participants. In 
many common endeavors, however, the par-
ticipants are unaware of their divergent views,
since there is no need to assign “authorship”;
consequently, the ubiquity of the phenomenon
is not readily apparent. The purpose of the
current research was to assess whether the
egocentric perceptions do occur in a variety
of settings and to examine associated psy-
chological processes.

In exploring the bases of such differential
perceptions, we are not so naive as to
suggest that intentional self-aggrandizement
never occurs. Nonetheless, it is likely that
perceptions can be at variance in the absence
of deliberate deceit; it is from this perspec-
tive that we approach the issue.

To allocate responsibility for a joint en-
deavor, well-intentioned participants presum-
ably attempt to recall the contributions each
made to the final product. Some aspects of
the interaction may be recalled more readily,
or be more available, than others, however.
In addition, the features that are recalled
easily may not be a random subset of the
whole. Specifically, a person may recall a
greater proportion of his or her own contribu-
tions than would other participants.

An egocentric bias in availability of in- 
formation in memory, in turn, could produce 
biased attributions of responsibility for a 
joint product. As Tversky and Kahneman 
(1973) have demonstrated, people use avail- 
ability, that is, “the ease with which relevant 
instances come to mind” (p. 209), as a basis
for estimating frequency. Thus, if self-gen-
erated inputs were indeed more available, in-
dividuals would be likely to claim more re-
sponsibility for a joint product than other 
participants would attribute to them. 

There are at least four processes that may 
be operating to increase the availability of 
one’s own contributions: (a) selective en-
coding and storage of information, (b) dif- 
ferential retrieval, (c) informational dispar- 
ities, and (d) motivational influences. 


Selective Encoding and Storage 


For a number of reasons, the availability
of the person’s own inputs may be facilitated
by differential encoding and storage of self-
generated responses. First, individuals’ own
thoughts (about what they are going to say
next, daydreams, etc.) or actions may dis-
tract their attention from the contributions
of others. Second, individuals may rehearse
or repeat their own ideas or actions; for ex-
ample, they might think out their position
before verbalizing and defending it. Conse-
quently, their own inputs may receive more
“study time,” and degree of retention is
strongly related to study time (Carver, 1972).
Third, individuals’ contributions are likely
to fit more readily into their own cognitive
schema, that is, their unique conception of
the problem based on past experience, values,
and so forth. Contributions that fit into such
preexisting schemata are more likely to be
retained (Bartlett, 1932; Bruner, 1961).


Differential Retrieval


The availability bias could also be pro- 
duced by the selective retrieval of informa- 
tion from memory. In allocating responsibil- 
ity for a joint outcome, the essential question 
from each participant’s point of view may 
be, “How much did I contribute?” Partici-
pants may, therefore, attempt to recall prin- 
cipally their own contributions and inappro- 
priately use the information so retrieved to 
estimate their relative contributions, a judg- 
ment that cannot properly be made without 
a consideration of the inputs of others as well. 


Informational Disparities 


There are likely to be differences in the 
information available to the contributors that 
could promote egocentric recall. Individuals 
have greater access to their own internal 
states, thoughts, and strategies than do ob- 
servers. Moreover, participants in a com- 
mon endeavor may differ in their knowledge 
of the frequency and significance of each 
other’s independent contributions. For ex- 
ample, faculty supervisors may be less aware 
than their student colleagues of the amount of 
time, effort, or ingenuity that students invest 
in running subjects, performing data analyses, 
and writing preliminary drafts of a paper. On 
the other hand, supervisors are more cog-
nizant of the amount and of the importance
of the thought, reading, and so on that they 
put into the study before the students’ in-
volvement begins.


Motivational Influences 


Motivational factors may also mediate an 
egocentric bias in availability. One’s sense of 
self-esteem may be enhanced by focusing on, 
or weighting more heavily, one’s own inputs. 
Similarly, a concern for personal efficacy or 
control (see deCharms, 1968; White, 1959) 
could lead individuals to dwell on their own 
contributions to a joint product. 

The preceding discussion outlines a num- 
ber of processes that may be operating to 
render one’s own inputs more available (and 
more likely to be recalled) than the contribu- 
tions of others. Consequently, it may be dif-
ficult to imagine a disconfirmation of the
hypothesis that memories and attributions are 
egocentric. As Greenwald (Note 1) has ob- 
served, however, the egocentric character of 
memory “is not a necessary truth. It is pos- 
sible, for example, to conceive of an organiza- 
tion of past experience that is more like that 
of some reference work, such as a history 
text, or the index of a thesaurus” (p. 4). In 
addition, we were unable to find published 
data directly supportive of the hypothesized 
bias in availability. Finally, recent develop- 
ments in the actor—observer literature seem 
inconsistent with the hypothesis that mem- 
ories and attributions are egocentric. Jones 
and Nisbett (1971) speculated that actors 
are disposed to locate the cause of their be- 
havior in the environment, whereas observers
attribute the same behavior to stable traits 
possessed by the actors. Though a variety of 
explanations were advanced to account for 
this effect (Jones & Nisbett, 1971), the re- 
cent emphasis has been on perceptual in- 
formation processing (Storms, 1973; Taylor 
& Fiske, 1975). The actor’s visual receptors 
are aimed toward the environment; an ob- 
server may focus directly on the actor. Thus, 
divergent aspects of the situation are salient 
to actors and observers, a disparity that is re- 
flected in their causal attributions. This pro- 
posal seems to contradict the thesis that ac- 
tors in an interaction are largely self-absorbed. 

Two studies offer suggestive evidence for
the present hypothesis. Rogers, Kuiper, and 
Kirker (1977) showed that trait adjectives 
were recalled more readily when subjects 
had been required to make a judgment about 
self-relevance (to decide whether each trait 
was descriptive of them) rather than about 
a number of other dimensions (e.g., syno- 
nymity judgments). These data imply that
self-relevance increases availability; however, 
Rogers et al. did not contrast recall of ad- 
jectives relevant to the self with recall of 
adjectives relevant to other people—a com- 
parison that would be more pertinent to the 
current discussion. Greenwald and Albert 
(1968) found that individuals recalled their 
own arguments on an attitude issue more ac- 
curately than the written arguments of other 
subjects. Since the arguments of self and 
other were always on opposite sides of the
issue, the Greenwald and Albert finding could
conceivably reflect increased familiarity with,
and memory for, arguments consistent with
one’s own attitude position rather than en-
hanced memory for self-generated statements
(although the evidence for attitude-biased
learning is equivocal, e.g., Greenwald & Saku-
mura, 1967; Malpass, 1969).

We conducted a pilot study to determine 
whether we could obtain support for the 
hypothesized bias in availability. Students in 
an undergraduate seminar were asked to
estimate the number of minutes each mem-
ber of the seminar had spoken during the
immediately preceding class period. An addi-
tional 26 subjects were obtained from natu-
rally occurring two-person groups approached
in cafeterias and lounges. The participants 
in these groups were asked to estimate the 
percentage of the total time each person had 
spoken during the current interaction. 

It was assumed that subjects would base
their time estimates on those portions of the 
conversation they could recall readily. Thus, 
if there is a bias in the direction of better 
recall of one’s own statements, individuals’
estimates of the amount of time they them-
selves spoke should exceed the average speak-
ing time attributed to them by the other
member(s) of the group.

The results were consistent with this rea-
soning. For seven of the eight students in the
undergraduate seminar, assessments of their
own discussion time exceeded the average
time estimate attributed to them by the other
participants (p < .05, sign test). Similarly,
in 10 of the 13 dyads, estimates of one’s own
discussion time exceeded that provided by the
other participant (p < .05, sign test). The
magnitude of the bias was highly significant
over the 13 dyads, F(1, 12) = 14.85, p <
.005; on the average, participants estimated
that they spoke 59% of the time. The results
provide preliminary, albeit indirect, evidence
for the hypothesized availability bias in
everyday situations.
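
The sign tests reported for the pilot study can be checked with a short calculation. The following is a minimal sketch rather than the authors' own analysis: it assumes a one-tailed test against a chance probability of .5, and the function name `sign_test_one_tailed` is illustrative.

```python
from math import comb


def sign_test_one_tailed(successes: int, n: int) -> float:
    """Return P(X >= successes) for X ~ Binomial(n, 0.5).

    Assumption: one-tailed test with no ties, so each of the n cases
    is equally likely to fall in either direction under the null.
    """
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


# Counts taken from the pilot study: 7 of 8 seminar students and
# 10 of 13 dyads showed the predicted direction.
seminar_p = sign_test_one_tailed(7, 8)
dyad_p = sign_test_one_tailed(10, 13)

print(f"seminar: p = {seminar_p:.4f}")
print(f"dyads:   p = {dyad_p:.4f}")
```

Under these assumptions both tail probabilities fall below .05, in line with the values reported in the text.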

The principal objectives of the current re-
search were (a) to assess the occurrence of
egocentric biases in availability and attribu-
tions of responsibility in different settings;
(b) to examine factors that were hypothesized
to influence these biases; and (c) to offer 
preliminary evidence of a relation between a 
bias in availability and a bias in attributions 
of responsibility. Experiment 1 assessed the 
occurrence of egocentric biases in availability 
and allocations of responsibility in a natural 
setting and examined the relation between 
the two biases. Next, a laboratory experi-
ment was conducted to address the issue of
whether the quality of the group’s perform- 
ance affects the availability bias: Is the tend- 
ency for one’s own inputs to be more avail- 
able reduced substantially when the group’s 
performance is poor, as a motivational in- 
terpretation would suggest? Experiment 3 
further examined the effects of success and 
failure in a natural setting. The experimental 
manipulations in Experiments 4 and 5 were
designed to influence availability, and changes
in attributions of responsibility were as-
sessed. The manipulation in Experiment 4
induced differential encoding; the manipula-
tion in Experiment 5 varied the retrieval
cues provided to the subjects.


Experiment 1 


In this experiment, we wished to examine
egocentric biases in naturally occurring, con-
tinuing relationships. Married couples ap-
peared to represent an ideal target group.
Spouses engage in many joint endeavors of
varying importance. This circumstance would
appear to be rife with possibilities for ego-
centric biases. Accordingly, the first experi-
ment was conducted (a) to determine if
egocentric biases in allocations of responsi-
bility occur in marital relationships; (b) to
replicate, using a different dependent measure,
the egocentric bias in availability obtained in
the pretest; and (c) to correlate the bias in
availability with the bias in responsibility.
If the bias in responsibility is caused by a
bias in availability, the two sets of data
should be related.


Method 


Subjects

The subjects were 37 married couples living in
student residences. Twenty of the couples had
children. The subjects were recruited by two female
research assistants who knocked on doors in the 
residences and briefly described the experiment. If 
the couple were willing to participate, an appoint- 
ment was made. The study was conducted in the
couple’s apartment; each couple was paid $5 for 
participating. 


Procedure 


A questionnaire was developed on the basis of 
extensive preliminary interviews with six married 
couples. In the experiment proper, the questionnaire 
was completed individually by the husband and 
wife; their anonymity was assured. The first pages 
of the questionnaire required subjects to estimate 
the extent of their responsibility for each of 20 
activities relevant to married couples by putting a 
slash through a 150-mm straight line, the endpoints 
of which were labeled “primarily wife” and “pri- 
marily husband.”¹ The twenty activities were mak-
ing breakfast, cleaning dishes, cleaning house, shop-
ping for groceries, caring for your children, planning 
joint leisure activities, deciding how money should 
be spent, deciding where to live, choosing friends, 
making important decisions that affect the two of 
you, causing arguments that occur between the two 
of you, resolving conflicts that occur between the 
two of you, making the house messy, washing the 
clothes, keeping in touch with relatives, demonstrat- 
ing affection for spouse, taking out the garbage, ir-
ritating spouse, waiting for spouse, deciding whether
to have children.

Subjects were next asked to record briefly ex- 
amples of the contributions they or their spouses 
made to each activity. Their written records were 
subsequently examined to assess if the person’s own 
inputs were generally more “available.” That is, 
did the examples reported by subjects tend to focus 
more on their own behaviors than on their spouses’? 
A rater, blind to the experimental hypothesis, re- 
corded the number of discrete examples subjects 
provided of their own and of their spouses’ con- 
tributions. A second rater coded one third of the 
data; the reliability (Pearson product-moment cor-
relation) was .81.


Results 


The responses of both spouses to each of
the responsibility questions were summed,
so that the total included the amount that the
wife viewed as her contribution and the
amount that the husband viewed as his con-
tribution. Since the response scale was 150
mm long, there were 150 “units of responsi-
bility” to be allocated. A sum of greater than
150 would indicate an egocentric bias in
perceived contribution, in that at least one
of the spouses was overestimating his or her
responsibility for that activity. To assess the
degree of over- or underestimation that
spouses revealed for each activity, 150 was
subtracted from each couple’s total. A com-
posite score was derived for the couple, aver-
aging over the 20 activities (or 19, when the
couple had no children).


¹ In the preliminary interviews, we used percentage
estimates. We found that subjects were able to re-
member the percentages they recorded and that
postquestionnaire comparisons of percentages pro-
vided a strong source of conflict between the spouses.
The use of the 150-mm scales circumvented these
difficulties; subjects were not inclined to convert
their slashes into exact percentages that could then
be disputed.
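
The scoring of the responsibility measure can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' scoring procedure: it assumes each slash has already been converted into the number of millimeters a spouse assigned to himself or herself (the distance from the other spouse's endpoint), and the names `overestimation` and `composite_score` as well as the example values are hypothetical.

```python
from statistics import mean

SCALE_MM = 150  # length of the response line, i.e. the units of responsibility


def overestimation(wife_self_mm: float, husband_self_mm: float) -> float:
    """Couple's total claimed responsibility for one activity, minus 150.

    Positive values mean the two self-ratings add up to more than the
    scale allows (egocentric bias); negative values mean underestimation.
    """
    return wife_self_mm + husband_self_mm - SCALE_MM


def composite_score(couple_items) -> float:
    """Average the per-activity scores over the activities a couple answered.

    couple_items holds one (wife_self_mm, husband_self_mm) pair per activity:
    20 pairs, or 19 for couples without children.
    """
    return mean(overestimation(w, h) for w, h in couple_items)


# Hypothetical couple with four activities, for illustration only.
example_couple = [(90, 70), (60, 85), (100, 55), (75, 75)]
score = composite_score(example_couple)
print(score)
```

A composite score above zero for a couple corresponds to the overestimation described in the text; the analysis then tests whether the mean composite score across couples differs from zero.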

An analysis of variance, using the couple 
as the unit of analysis, revealed that the 
composite scores were significantly greater
than zero, M = 4.67, F(1, 35) = 12.89, p 
< .001, indicating an egocentric bias in per- 
ceived contributions. Twenty-seven of the 37 
couples showed some degree of overestima- 
tion (p < .025, sign test). Moreover, on the
average, overestimation occurred on 16 of the 
20 items on the questionnaire, including nega- 
tive items—for example, causing arguments 
that occur between the two of you, F(1, 32) 
= 20.38, p < .001. Although the magnitude 
of the overestimation was relatively small, 
on the average, note that subjects tended to
use a restricted range of the scale. Most re-
sponses were slightly above or slightly below
the halfway mark on the scale. None of the 
items showed a significant underestimation 
effect. 

The second set of items on the question- 
naire required subjects to record examples 
of their own and of their spouses’ contribu- 
tions to each activity. A mean difference 
score was obtained over the 20 activities 
(averaging over husband and wife), with the 
number of examples of spouses’ contributions 
subtracted from the number of examples of 
own contributions. A test of the grand mean 
was highly significant, F(1, 35) = 36.0, p 
< .001; as expected, subjects provided more
examples of their own (M = 10.9) than of 
their spouses’ (M = 8.1) inputs. The cor- 
relation between this self—other difference 
score and the initial measure of perceived 
responsibility was determined. As hypothe-
sized, the greater the tendency to recall self-
relevant behaviors, the greater was the over-
estimation in perceived responsibility, r(35)
= .50, p < .01.
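
The significance of a correlation of this size can be checked by converting r to a t statistic with the standard formula t = r * sqrt(df) / sqrt(1 - r^2). The sketch below is offered only as a check on the reported values; the function name `t_from_r` is illustrative, and the critical value mentioned in the comment (about 2.72 for a two-tailed test at the .01 level with 35 degrees of freedom) is quoted from standard t tables rather than from the article.

```python
from math import sqrt


def t_from_r(r: float, df: int) -> float:
    """Convert a Pearson correlation to a t statistic with df = N - 2."""
    return r * sqrt(df) / sqrt(1 - r ** 2)


# r(35) = .50 as reported for the availability-responsibility correlation.
t_value = t_from_r(0.50, 35)

# A two-tailed .01 critical value for df = 35 is roughly 2.72 (t tables).
print(round(t_value, 2))
```

The resulting t exceeds that critical value, which agrees with the reported p < .01.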

The number of words contained in each
behavioral example reported by the subjects
was also assessed to provide a measure of
elaboration or richness of recall. The mean
number of words per example did not differ
as a function of whether the behavior was
reported to be emitted by self (M = 10.0)
or spouse (M = 10.1), F < 1. Further, this
measure was uncorrelated with the measure
of perceived responsibility, r(35) = -.15, ns.

In summary, both the measure of responsi-
bility and the measure reflecting the avail-
ability of relevant behaviors showed the hy-
pothesized egocentric biases. Moreover, there
was a significant correlation between the mag-
nitude of the bias in availability and the mag-
nitude of the bias in responsibility. This
finding is consistent with the hypothesis that
egocentric biases in attributions of responsi-
bility are mediated by biases in availability.
Finally, the amount of behavior recalled 
seemed to be the important factor, rather 
than the richness of the recall. 


Experiment 2 


The data from Experiment 1 indicate that
egocentric biases in availability and attribu-
tions of responsibility occur in ongoing rela-
tionships. The remaining experiments were
designed to demonstrate the prevalence of
these phenomena, and to investigate some
of the factors that were expected to influence
their magnitude.

The major purpose of Experiment 2 was
to evaluate the self-esteem interpretation of
the availability bias. If the availability bias
is caused primarily by the motivation to en-
hance self-esteem, recall of a joint endeavor
should facilitate an acceptance of personal
responsibility after success and a denial of
personal responsibility after failure. Conse-
quently, the self-esteem interpretation im-
plies that self-generated inputs should be
more available after success than after fail-
ure. The evidence from past research that
people accept more responsibility for a suc-
cess than for a failure is consistent with this
reasoning (e.g., Luginbuhl, Crowe, & Kahan,
1975; Sicoly & Ross, 1977; Wortman, Cos-
tanzo, & Witt, 1973).
In Experiment 2, subjects learned several
days after participating in a problem-solving
task that their group had performed either
well or poorly. It was hypothesized that
subjects would recall a greater proportion of
their own statements when the group product 
was positively evaluated. Because we moved 
to the laboratory for this experiment, it was 
possible to tape record the group’s initial in- 
teraction. This recording provided a “reality 
base” against which to compare the subse- 
quent recall of subjects. 


Method 
Subjects 


The subjects were 37 males and 7 females selected 
from lists of students living at the university. Sub-
jects were paid $5 each. All of the subjects par-
ticipated in both sessions. 


Procedure 


The experiment was conducted in two sessions
separated by a 3- or 4-day interval. Subjects re-
ported for the first session in groups of two. They
were told that the purpose of the study was to de-
termine whether groups exhibit more social aware-
ness than individuals. They were given 10 minutes
to read a case study of Paula, a psychologically
troubled person (selected from Goldstein & Palmer).
Each subject in the dyad was provided with
different portions of the case study. The subjects
were then asked questions designed to assess their
psychological understanding of Paula’s difficulties,
and were told to discuss each question and arrive
at a joint response, taking into account the differ-
ent information that each group member brought
with him or her to the case. Subjects were told
that their discussions were being tape recorded.
The experimenter informed them that she would
listen to the tapes following the session to evaluate
their answers.

Subjects returned individually for the second ses-
sion. In one half of the dyads, subjects
were led to believe that their group had performed
poorly relative to other groups in the experiment
(third from the worst). In the remaining dyads,
subjects were informed that their group had per-
formed relatively well (third from the best). Sub-
jects were then told, “Write down as much as you
can recall of your group’s discussion of Paula.
You will only have a short time to do this, so it 
is unlikely that you will be able to report all or 
even most of what was said. It is, therefore, im- 
portant that you put things down in the order that 
they come to you. . . . If you remember the idea, 
but not the exact comment, rephrase it in your own 
words.” Subjects were told not to record who said 
each statement. They were simply to write what was 
said. 

Subjects were asked to stop writing at the end of 
8 minutes and to go back over their responses to in- 
dicate who said each statement during the discussion. 
Finally, they were asked whether, in their opinion, 
each statement improved or lowered their group’s 
score, or whether they were uncertain. Subjects
were debriefed at the end of the second session. 

An observer who was blind to the subjects’ treat- 
ment conditions contrasted subjects’ recall with 
their original comments on the tape to assess ac- 
curacy. A statement was judged to be accurate if 
it represented an idea that the subject expressed 
during the interaction, even though the actual words 
used during the discussion might have differed from 
the words recalled by the subject. A second rater
scored a random one third of the tapes, and agree- 
ment was 93%. 


Results and Discussion 


Availability 


The proportion of statements that subjects 
attributed to themselves was calculated for 
each member of the dyad. The average pro- 
portion for each dyad served as the unit of 
analysis. In 21 of the 22 dyads, the subjects 
attributed the majority of the statements 
that they recalled to themselves (p < .001,
sign test). The average proportion of sub- 
jects’ own statements was .70 in the success 
condition and .60 in the failure condition. 
Each of these proportions was significantly 
greater than a .50 or chance expectancy, 
t(10) = 9.09, p < .001, and t(10) = 3.22, p
< .01, respectively. Thus, in both the suc- 
cess and failure conditions, subjects attributed 
significantly more of the recalled statements 
to themselves than would be expected by 
chance. Nevertheless, as hypothesized, sub- 
jects attributed a greater proportion of the 
recalled statements to themselves after a 
success than after a failure, F(1, 20) = 7.10, 
p < .025. The total number of statements
recalled (adding over statements attributed
to self and the other person) did not differ
significantly as a function of the group’s per-
formance (F = 1.57). 
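
The availability measure and its comparison with the .50 chance expectancy can be sketched as follows. This is a minimal illustration, assuming that each subject's recall has been reduced to two counts (statements attributed to self and to the partner); the function names and the three example dyads are hypothetical and are not the study's data.

```python
from math import sqrt
from statistics import mean, stdev


def self_proportion(n_self: int, n_partner: int) -> float:
    """Proportion of recalled statements a subject attributed to self."""
    return n_self / (n_self + n_partner)


def dyad_score(member_a, member_b) -> float:
    """Average the two members' self-proportions; the dyad is the unit of analysis."""
    return mean([self_proportion(*member_a), self_proportion(*member_b)])


def one_sample_t(scores, mu: float = 0.5):
    """One-sample t statistic against the chance value mu, with df = n - 1."""
    n = len(scores)
    t_stat = (mean(scores) - mu) / (stdev(scores) / sqrt(n))
    return t_stat, n - 1


# Hypothetical (self, partner) counts for the two members of three dyads.
dyads = [((7, 3), (6, 4)), ((8, 2), (5, 5)), ((6, 4), (6, 4))]
scores = [dyad_score(a, b) for a, b in dyads]
t_stat, df = one_sample_t(scores)
print(f"t({df}) = {t_stat:.2f}")
```

With 11 dyads in a condition, the same calculation yields a t statistic with 10 degrees of freedom, the form in which the results are reported in the text.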


Accuracy 


Subjects’ recall was compared with the 
taped record in a 2 × 2 between-within anal-
ysis of variance (Success vs. Failure × Self
vs. Partner), with the dyad as the unit of 
analysis. Subjects recalled a higher percent- 
age of their own actual statements (M = 
5.6%) than of their partner’s actual state- 
ments (M = 2.6%), F(1, 19) = 18.37, p <
.001.² Although the means seem low, note
that subjects were given only an 8-minute re- 
call period. The group’s performance level did 
not affect the percentage of actual statements 
recalled (main effect and interaction Fs < 1). 

We also compared the accuracy of the 
statements subjects attributed to themselves 
with the accuracy of the statements they at- 
tributed to their partners. Sixty-nine per- 
cent of the statements that subjects attributed 
to self were accurate reflections of self-gen- 
erated comments; 56% of the statements 
that subjects attributed to their partners 
were accurate. The difference between these 
two percentages was significant, F(1, 19) = 
7.06, p < .025. The group’s performance 
level did not significantly affect the accuracy 
of the attributed statements (success-fail 
main effect F = 1.14, interaction F < 1). 

Most of the errors that subjects made were 
of two types: They recalled material from 
the case history that had not been mentioned 
in the discussion; they reported inferences 
and conclusions that were not contained in 
the case history or in the discussion. In only 
a few instances (approximately 2% of the 
errors) did subjects take credit for statements 
made by their partners. 


Evaluations 


Finally, subjects’ evaluations of the state- 
ments were transcribed onto a 3-point scale: 
+1 (improved the group’s score), 0 (uncer-
tain), and —1 (lowered the group’s score). 
Two scores were obtained for each subject: 
the average rating of comments attributed 
to self and the average rating of statements 
attributed to the other person. An analysis of
variance, with the dyad as the unit of analy-
sis, revealed a main effect for success-fail,
F(1, 20) = 14.56, p < .005, and a Success-
Fail × Self-Other interaction, F(1, 20) =
5.19, p < .05.

The success-fail main effect indicated that
statements were evaluated more positively 
following success (M = .75) than following 
failure (M = .41). The interaction revealed 
that whereas subjects’ evaluations of their 
own comments were marginally lower in the
failure condition than in the success condi-
tion (M difference = .18, t = 1.85, p < .10),
their evaluations of the other person’s com- 
ments were significantly lower in the failure 
condition than in the success condition (M 
difference = .50, t = 5.14, p < .01). 

In summary, the present study provided 
some evidence for the self-esteem maintenance 
hypothesis. Subjects attributed a higher pro- 
portion of the recalled comments to them-
selves after success than after failure; sub-
jects’ evaluations of the recalled statements
suggested an attempt to shift the blame for
failure onto their partners. On the other
hand, contrary to the self-esteem interpreta-
tion, recall was egocentric even in the failure
condition. 

Note that the strong egocentricity ob-
tained on the recall measure and the increased
accuracy of self-generated statements may
reflect, in part, the fact that subjects initially
read different aspects of the case history.
Since they subsequently presented this ma-
terial to the other person in responding to
the questions, subjects’ own contributions
may have received more “study time.” None-
theless, this differential is ecologically valid:
A person’s inputs are often derived from his
or her previous history and experiences.


² The tapes from one of the failure groups were
lost; this group is omitted from the analysis.


Experiment 3


In Experiment 3 we examined the effects
of success and failure in a more natural set-
ting. We had the players on 12 intercollegiate
basketball teams individually complete a
questionnaire in which they were asked to
recall an important turning point in their
last game and to assess why their team had
won or lost.

It is a leap to go from the self-other com- 
parisons that we have considered in the pre- 
vious studies to own team — other team com- 
parisons. There are, however, a number of 
reasons to expect that the actions of one’s 
own team should be more available to the
attributor than the actions of the other team:
I know the names of my teammates, and
therefore, I have a ready means of organizing
the storage and retrieval of data relevant to
them; our success in future games against 
other opponents depends more on our own of- 
fensive and defensive abilities than on the 
abilities of the opposing team. Consequently, 
I may attend more closely to the actions of 
my teammates, which would enhance encoding 
and storage. Also, there are informational dis- 
parities: The strategies of my own team are 
more salient than are the strategies of the 
opposing team (Tversky & Kahneman, 1973).

If the initiatives of one’s own team are 
differentially available, players should recall 
a turning point in terms of the actions of 
their team and attribute responsibility for 
the game outcome to their team. On the basis 
of the data from Experiment 2, it may be ex- 
pected that these tendencies will be stronger 
after a win than after a loss.


Method 
Subjects 


Seventy-four female and 84 male intercollegiate
basketball players participated in the study. The
team [illegible] were contacted by telephone; all
agreed, following discussions with their players, to
have their teams participate in the study.


Procedure 


The questionnaires were administered after six
games in which the teams participating in the
study played each other. Thus, for the three male
games, three of the six male teams in the study
played against the other three male teams.
Similarly, the three female games selected
included all six of the female teams. The question-
naires were administered at the first team practice
following the target game (1 or 2 days after the
game), except in one case where, because of the
teams’ schedules of play, it was necessary to col-
lect data immediately after the game (two female
teams). The questionnaires were completed indi- 
vidually, and the respondents’ anonymity was as- 
sured. The relevant questions, from the current per- 
spective, were the following: 


1. Please describe briefly one important turning
point in the last game and indicate in which
period it occurred.


2. Our team won/lost our last game because. . . . 


The responses to the first question were examined 
to determine if the turning point was described 
as precipitated by one’s own team, both teams, or 
the other team. Responses to the second question 
were examined to assess the number of reasons for 
the win or loss that related to the actions of either 
one’s own or the opposing team. The data were 
coded by a person who was unaware of the experi- 
mental hypotheses. A second observer independently 
coded the responses from 50% of the subjects. 
There was 100% agreement for both questions. 


Results 


There were no significant sex differences on 
the two dependent measures; the results are, 
therefore, reported collapsed across gender. 
Since team members’ responses cannot be 
viewed as independent, responses were aver- 
aged, and the team served as the unit of 
analysis. 

A preliminary examination of the “turning 
point” data revealed that even within a team, 
the players were recalling quite different 
events. Nevertheless, 119 players recalled a 
turning point that they described as pre- 
cipitated by the actions of their own team; 
13 players recalled a turning point that they 
viewed as caused by both teams; 16 players 
recalled a turning point seen to be initiated 
by the actions of the opposing team (the 
remaining 10 players did not answer the 
question). Subjects described such events as 
a strong defense during the last 2 minutes of 
the game, a defensive steal, a shift in offen- 
sive strategies, and so on. 

The percentage of players who recalled a 
turning point caused by their teammates was 
derived for each team. These 12 scores were 
submitted to an analysis that compared them 
to a chance expectancy of 50%. The ob- 
tained distribution was significantly different 
from chance, F(1, 11) = 30.25, p < .001, 
with a mean of 80.25%. As hypothesized,
most reports emphasized the actions of the
players’ own team.

The percentage of players who recalled 
a turning point caused by their teammates 
was examined in relation to the team’s per-
formance. The average percentage was higher
on the losing team than on the winning team
in five of the six games (p < .11, sign test).
The mean difference between the percentages
on losing (M = 88.5) and winning (M =
72.0) teams was nonsignificant (F < 1).
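
The sign test reported above can be checked by hand. The following is a minimal sketch, assuming a one-tailed test against a null probability of .5 (the text does not state the number of tails); the function name is hypothetical and chosen only for illustration.

```python
from math import comb


def sign_test_one_tailed(successes: int, n: int) -> float:
    """Exact one-tailed sign test: P(X >= successes) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


# Five of the six games went in the same direction.
p_value = sign_test_one_tailed(5, 6)
print(round(p_value, 3))

# The two condition means average to the overall mean reported earlier.
overall_mean = (88.5 + 72.0) / 2
print(overall_mean)
```

Under these assumptions the exact probability is 7/64, which is consistent with the reported p < .11, and the two team means average to the overall mean of 80.25%.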

The players’ explanations for their team’s 
win or loss were also examined. Of the 158 
participants, only 14 provided any reasons that 
involved the actions of the opposing team. 
On the average, subjects reported 1.79 rea- 
sons for the win or loss that involved their 
own team and .09 reasons that involved the 
opposing team, F(1, 11) = 272.91, p < .001. 
Finally, the tendency to ascribe more rea- 
sons to one’s own team was nonsignificantly 
greater after a loss (M = 1.73) than after 
a win (M = 1.65), F < 1.


Discussion 


The responses to the turning point ques- 
tion indicate that the performances of sub- 
jects’ teammates were more available than 
those of opposing team members. Further, 
subjects ascribed responsibility for the game 
outcome to the actions or inactions of their 
teammates rather than to those of members 
of the opposing team. Thus, biases in avail- 
ability and judgments of responsibility can 
occur at the group level. Rather and Hers-
kowitz (1977) provide another example of
group egocentrism: “CBS (news) became a 
solid Number One after the Apollo moonshot 
in 1968. If you are a CBS person, you tend 
to say our coverage of the lunar landing 
tipped us over. If you are a NBC person, you 
tend to cite the break-up of the Huntley-
Brinkley team as the key factor” (p. 307).

Contrary to the data from Experiment 2, 
the availability bias in Experiment 3 was as 
strong after failure as after success. There 
are differences between the studies that may 
contribute to this discrepancy. The “ego- 
centric” availability and attributions in the 
basketball experiment were team rather than
self-oriented; as a result, responsibility for
failure was more diffused, and subjects’ self-
esteem was threatened less directly. Also, un-
like the group in the laboratory study, the
basketball team had a future: The players 
could enhance their control over subsequent 
game outcomes by locating causality within 
their own team. Finally, and perhaps most 
important, unlike the laboratory group, the 
team also had a past. Team members recalled 
aspects of their behavior that changed and at-
tributed the game outcomes to these varia-
tions (e.g., we win because of discipline and
hustle; we lose because of a lack of discipline
and hustle). What players seemed to ignore,
however, was that the opposing teams might
contribute to these fluctuations.

It seems likely that a tendency to perceive 
both teams as responsible for the game out- 
come might increase with the magnitude of 
the win or loss (assuming that large wins or 
losses are atypical). As Kelley (1973) noted, 
multiple causes are necessary to explain ex-
treme outcomes. Although no such tendency
was observed in the current study, there were
too few data points (games) to provide an
accurate determination.


Experiment 4


In the final two experiments, we examined
the hypothesized relation between the bias
in availability and the bias in attributions of
responsibility more directly by introducing
manipulations that should affect availability
and measuring changes in attribution. In
Experiment 4, subjects were required to
record either their own comments (self-focus
condition) or those of the other person (part-
ner-focus condition) during a problem-solv-
ing session. At a second session, subjects
were shown their notes and asked to indicate
the extent to which either they or their part-
ners had been responsible for various aspects
of the decision-making process. It was as-
sumed that the partner-focus condition would
enhance encoding and retrieval of the part-
ner’s contributions. Thus, the partner’s inputs
would be more available when assessments
of responsibility were made, and subjects
should assign their partner more responsi-
bility for group decision-making in the part-
ner-focus condition than in the self-focus
condition.


Method


Subjects 


The subjects were 40 males recruited from the
introductory psychology subject pool. 


Procedure 


Subjects were scheduled in pairs for the first ses-
sion. In a few introductory comments, the experi-
menter described the difficulty of preventing people
from smoking. Subjects were then told they were
participating in a pilot project to assess the ef-
ficacy of “brainstorming techniques” as a means of
providing possible solutions to this problem. They
were further informed that solutions generated dur-
ing their discussion would be sent to the Committee
for the Prevention of Cigarette Smoking (a govern-
ment committee). Subjects were told to follow a
four-step sequence: define the problem, generate as
many solutions as possible, discuss the pros and cons
of each proposed solution, and finally, select a pre-
ferred solution and explain the reasons for this
choice.
Subjects in the self-focus condition were asked
to keep a record of their own contributions to the
discussion. Subjects in the partner-focus condition
were instructed to keep a record of only the other per-
son’s inputs: “This will leave you free to think and
[illegible] because your partner will be
doing the writing.” Subjects were given about 45
minutes for discussion.

Subjects returned individually for a second session.
Each subject was asked to look over
the notes he had taken during the previous session,
“in order to refresh your recollection of the dis-
cussion.”

Subjects completed the dependent measures after
reviewing their notes of Session 1. The
principal dependent variable required subjects to
indicate the extent to which they had tended to
control the course and content of the discussion
during the first session. They were asked to assess
this overall, and also with respect to the
various stages of the discussion, on 150-mm
scales with endpoints labeled “the other
person” and [illegible].

Results 


A 2 × 2 analysis of variance was performed
on the data. Self- versus partner focus was a
between-subjects factor. Since the dyad was
treated as the experimental unit, the response
made by each member of the pair was con-
sidered a repeated measure.

The focus manipulation had no reliable
impact on attributions of responsibility (all 
Fs < 1). Once again, however, there was 
strong evidence of an egocentric bias in al- 
locations of responsibility. Subjects reported 
that they had exerted more control over the 
course and content of each segment of the 
discussion than their partners ascribed to 
them: solutions stage, 85 versus 68, F(1, 18)
= 7.32, p < .025; evaluation stage, 79 ver-
sus 70, F(1, 18) = 4.13, p < .07; final pro-
posal stage, 86 versus 67, F(1, 18) = 11.31,
p < .01; overall discussion, 89 versus 72,
F(1, 18) = 9.21, p < .01. Note that for each 
item, A’s self-attributions were, on the aver- 
age, beyond the midpoint of the 150-mm 
scale, indicating a perceived contribution of 
greater than 50%; on the other hand, the 
partner viewed A’s contributions as being 
less than 50% in each instance. 


Discussion 


The results revealed strong egocentric 
biases in individuals’ attributions of responsi- 
bility for segments of the problem-solving 
task. The focus manipulation had surprisingly 
little effect, however. What we had viewed to 
be a sledgehammer manipulation turned out 
to be ineffective. 

Why did the attention manipulation have 
so little impact? One possibility is that sub- 
jects may have found the written records to 
be relatively uninformative and relied more 
on their memories than on the notes in re- 
sponding to the questionnaire (hence, the 
strong egocentric biases). The notes may have 
appeared inadequate, in part because they 
were very brief, usually about one page, rela- 
tive to the length of the interaction (45 min- 
utes). Moreover, much of what subjects wrote 
may have seemed irrelevant to the final deci- 
sions made by the group. In short, we may 
not have succeeded in focusing subjects’ at- 
tention on what, from their perspective, were 
the important aspects of the interaction. To 
obtain this information they were, perhaps, 
only too willing to rely on their memories. 

Finally, note that recent research on the 
relation between attention and recall in in-
terpersonal perception settings has yielded
inconsistent results (Taylor & Fiske, 1978).
This situation stands in marked contrast to 
the strong relation evident in the cognitive 
literature (Cofer, 1977; Loftus & Loftus, 
1976). Unlike the constrained experimental 
settings utilized in cognitive research, how- 
ever, the present study and those reviewed 
by Taylor and Fiske incorporate rich social 
environments. Consequently, the manipula-
tion of attention is relatively gross; it is less
certain that the individual is attending to 
those aspects of the situation that are rele- 
vant to the dependent measures. 


Experiment 5 


In Experiment 5, we again attempted to 
vary the individual’s focus of attention so as 
to affect availability. In this experiment, how- 
ever, we employed a manipulation designed 
to promote selective retrieval of information 
directly relevant to attributions of responsi-
bility.

In our initial analysis, we suggested that 
egocentric attributions of responsibility could 
be produced by the selective retrieval of in- 
formation from memory and that retrieval 
might be guided by the kinds of questions 
that individuals ask themselves. Experiment
5 was conducted to test this hypothesis. Sub- 
jects were induced to engage in differing re- 
trieval by variations in the form in which 
questions were posed. Graduate students were 
stimulated to think about either their own 
contributions to their BA theses or the con-
tributions of their supervisors. The amount
of responsibility for the thesis that subjects
allocated to either self or supervisor was then
assessed. It was hypothesized that subjects 
would accept less responsibility for the re- 
search effort in the supervisor-focus than in 
the self-focus condition.


Method 
Subjects 


The subjects were 17 female and 12 male psy-
chology graduate students. Most had completed
either 1 or 2 years of graduate school. All of these
students had conducted experiments that served as
their BA theses in their final undergraduate year.


Procedure


The subjects were approached individually in 
their offices and asked to complete a brief ques-
tionnaire on supervisor-student relations. None re-
fused to participate. The two forms of the ques-
tionnaire were randomly distributed to the sub- 
jects; they were assured that their responses would 
be anonymous and confidential. 

One form of the questionnaire asked the subjects
to indicate their own contribution to each of a
number of activities related to their BA theses. The
questions were as follows: (a) “I suggested ____
percent of the methodology that was finally em-
ployed in the study.” (b) “I provided ____ per-
cent of the interpretation of results.” (c) “I initiated
____ percent of the thesis-relevant discussions
with my supervisor.” (d) “During thesis-related dis-
cussions I tended to control the course and content
of the discussion ____ percent of the time.” (e) “All
things considered, I was responsible for ____ per-
cent of the entire research effort.” (f) “How would
you evaluate your thesis relative to others done in
the department?”

The second form of the questionnaire was identi-
cal to the above, except that the word I (self-focus
condition) was replaced with my supervisor (super-
visor-focus condition) on Questions 1-5. Subjects
were asked to fill in the blanks in response to the
first five questions and to put a slash through a
150-mm line, with endpoints labeled “inferior” and
“superior,” in response to Question 6.


Results and Discussion 


For purposes of the analyses, it was as-
sumed that the supervisor’s and the student’s
contribution to each item would add up to
100%. Though the experiment was intro-
duced as a study of supervisor-student rela-
tions, it is possible that the students may
have considered in their estimates the inputs
of other individuals (e.g., fellow students).
Nevertheless, the current procedure provided
a conservative test of the experimental hy-
pothesis. For example, if a subject responded
20% to an item in the “I” version of the
questionnaire, it was assumed that his or her
supervisor contributed 80%. Yet the super-
visor may have contributed only 60%, with
an unspecified person providing the re-
mainder. By possibly overestimating the su-
pervisor’s contribution, however, we are bias-
ing the data against the experimental hy-
pothesis: The “I” version was expected to
reduce the percentage of responsibility al-
located to the supervisor.
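
The scoring assumption just described can be made concrete with a short sketch. The function name and the form labels "I" and "supervisor" are hypothetical, introduced only for illustration; the one rule taken from the text is that the student's and the supervisor's shares are treated as summing to 100%.

```python
def supervisor_share(response_pct: float, form: str) -> float:
    """Convert a questionnaire response into the share credited to the supervisor.

    Assumes, as the analysis does, that student and supervisor shares sum to 100%.
    `form` is a hypothetical label: "I" for the self-focus form and
    "supervisor" for the supervisor-focus form.
    """
    if not 0 <= response_pct <= 100:
        raise ValueError("response must be a percentage between 0 and 100")
    if form == "I":
        return 100.0 - response_pct
    if form == "supervisor":
        return float(response_pct)
    raise ValueError(f"unknown form: {form!r}")


# Worked example from the text: a 20% response on the "I" form.
assumed = supervisor_share(20, "I")   # share credited to the supervisor
actual = 60.0                         # hypothetical true share; a third party supplies the rest
overestimate = assumed - actual       # excess credit given to the supervisor
print(assumed, overestimate)
```

Any excess credited to the supervisor on the "I" form narrows the expected gap between the two forms, which is why the procedure is conservative with respect to the hypothesis.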


Subjects’ responses to the first five ques-
tions on the “I” form of the questionnaire
were subtracted from 100, so that higher
numbers would reflect greater contributions 
by the supervisor in both conditions. Ques- 
tion 5 dealt with overall responsibility for the 
research effort. As anticipated, subjects allo- 
cated more responsibility to the supervisor in 
the supervisor-focus (M = 33.3%) than in 
the self-focus (M = 16.5%) condition, F(1,
27) = 9.05, p < .01. The first four questions
were concerned with different aspects of the
thesis, and the average response revealed a
similar result: supervisor-focus M = 33.34;
self-focus M = 21.82; F(1, 27) = 5.34, p <
.05. Finally, subjects tended to evaluate their
thesis more positively in the self-focus condi-
tion than in the supervisor-focus condition:
112.6 versus 94.6, F(1, 27) = 3.59, p < .10.

The contrasting wording of the questions
had the anticipated impact on subjects’ allo-
cations of responsibility. The supervisor ver-
sion of the questionnaire presumably caused
subjects to recall a greater proportion of their
supervisors’ contributions than did the “I”
version of the questionnaire. This differential
availability was then reflected in the alloca-
tions of responsibility. Note, however, that
the questions were not entirely successful in
controlling subjects’ retrieval. The supervisor
was allocated only one third of the responsi-
bility for the thesis in the supervisor-focus
condition.

In light of the present data, the basketball
players’ attributions of responsibility for the
game outcome in Experiment 3 need to be
reexamined. Recall that the players were
asked to complete the sentence, “Our team
won/lost our last game because. . . .” This
question yielded a highly significant ego-
centric bias. With hindsight, it is evident
that the form of the question (“Our team
won/lost . . .”) may have prompted sub-
jects to consider the actions of their own
teams, even though the wording does not
preclude references to the opposing team. The
“turning point” question in Experiment 3
was more neutrally worded and is not sus-
ceptible to this alternative interpretation.

The questions in these studies ema-
nate from an external source; many of our
retrieval queries are self-initiated, however,
and our recall may well be biased by the
form in which we pose retrieval questions to 
ourselves. For example, basketball players 
are probably more likely to think in terms 
of “Why did we win or lose?” than in terms 
of a neutrally phrased “Which team was re- 
sponsible for the game outcome?” 


General Discussion 


The five studies employed different sub- 
ject populations, tasks, and dependent mea- 
sures. As hypothesized, the egocentric biases 
in availability and attribution appear to be 
robust and pervasive. 


Determinants of the Availability Bias 


Several processes were hypothesized to 
contribute to the increased availability of 
self-generated inputs. It is possible to con- 
sider how well each accounts for the existing 
data. Selective encoding and storage cannot 
have contributed to the effects of success 
versus failure on availability in Experiment 
2 or of supervisor- versus self-focus in Experi- 
ment 5 (since these manipulations occurred 
long after encoding and storage took place). 
Informational disparities should not have 
contributed to the pretest results (subjects’ 
time estimates were based solely on the pre- 
ceding discussion), to the tendency to at- 
tribute a higher proportion of the recalled 
statements to oneself in the success as com- 
pared to the failure condition in Experiment 
2, or to the effects of supervisor- versus self- 
focus in Experiment 5 (since neither perform- 
ance level, as operationalized here, nor focus 
could affect the information initially avail- 
able to the subjects). Two motivational pro- 
cesses were posited. Self-esteem maintenance 
does not seem pertinent to the results ob- 
tained from the two-person groups in the pre- 
test. Nor does it account for (a) the over- 
recall of self-generated inputs in the failure 
condition of Experiment 2 and (b) the find- 
ing that players on losing basketball teams 
recalled the turning point of the game in 
terms of the actions of their teammates. The 
control motivation hypothesis fares somewhat 
better. Although focusing on one’s own inputs 
in failure situations may lower self-esteem, it
does permit one to perceive personal control
over the activity. Hence, efficacy motivation 
could account for these results. Nevertheless, 
a desire for personal efficacy does not appear 
to explain all of the data. The two-person 
groups in the pretest seem to reveal a rela- 
tively “pure” information-processing effect: 
It is unlikely that people would feel a need 
to report that they dominated casual conver- 
sations. Also, the effect of supervisor- versus 
self-focus in Experiment 5 appears to be 
mediated by differential retrieval. Efficacy
considerations may have induced the subjects 
to report that they were major contributors 
to their theses; nonetheless, motivational con- 
cerns do not dictate that focusing on the 
supervisor’s contributions will reduce one’s 
need to assume responsibility.

In summary, selective encoding and stor- 
age, informational disparities, and motiva- 
tional influences do not appear to be neces- 
sary determinants of the egocentric bias in 
availability. The one remaining process that
was posited, selective retrieval, is not pre- 
cluded by any of the current data; further, 
it receives direct support from the findings in 
Experiments 2 and 5.

Nevertheless, it seems premature to elimi- 
nate any of the hypothesized processes as 
sufficient causes of the availability bias. The 
tendency of spouses to recall their own con- 
tributions in Experiment 1 may reflect infor- 
mational disparities; the desire to maintain 
self-esteem may have contributed to the 
effect of performance level in Experiment 2; 
basketball players’ responses to the turning 
point question in Experiment 3 may well have 
been influenced by selective encoding and by 
control motivation.

We suspect that, like many cognitive phe- 
nomena (cf. Erdelyi, 1974; Erdelyi & Gold- 
berg, in press; McGuire, 1973), biases in 
availability are multidetermined in real life. 
Multidetermination may seem an unsatisfy- 
ing resolution; however, it is one that social 
psychologists shall probably confront increas- 
ingly as they begin to study cognitive phe- 
nomena in situ. Researchers in other sciences 
face parallel complexities. For example, simi- 
lar cancers appear to have different etiolo- 
gies, depending, among other factors, on the
environment in which the patient lives (Good-
field, 1976).


The Link Between Availability and 
Attributions of Responsibility 


The focus of the present research has been 
on demonstrating that the hypothesized biases 
in availability and attribution exist and are 
relatively ubiquitous. It was also hypothe-
sized, however, that the egocentric bias in
attributions of responsibility would be medi- 
ated by the bias in availability. Although the 
data are suggestive, we have no definitive 
evidence that the bias in availability causes 
the bias in responsibility. The strongest 
affirmative evidence is that the two biases 
were significantly correlated in the marriage 
study and that a manipulation designed to 
induce selective retrieval influenced attribu- 
tions of responsibility (Experiment 5). In
opposition, it might be contended that the
covariation between the two biases is sus-
ceptible to a number of alternative causal
interpretations and that there is no direct
evidence that the retrieval manipulation in
Experiment 5 affected availability. Conceiva-
bly, the attributions of responsibility in Ex-
periment 5 were mediated by some other
factor not yet identified. Further evidence
will be required to establish whether the bias
in responsibility is caused by the bias in
availability. The present results suggest sev-
eral additional considerations, however, con-
cerning the determinants and pervasiveness
of the biases.


Pervasiveness of the Egocentric Biases 
The egocentric biases obtained in the cur-
rent studies may seem inconsistent with Jones
and Nisbett’s (1971) proposal that actors
locate causality for their actions
within their environment. There are a num-
ber of differences between the two paradigms
that might account for the discrepancy. Most
important, Jones and Nisbett were concerned
with interpretation, whereas we examined
recall and judgments of responsibility. Actors
could presumably overestimate their contri-
butions to a joint product and, at the same
time, locate the cause of their behavior within
the environment. For example, suppose
that a wife who reports that she does 80% of
the cleaning is asked why she cleans (the cen-
tral question for Jones and Nisbett). She
may respond that the house is dirty, an en-
vironmental attribution. Conversely, her hus-
band, who perhaps accepts 30% of the re-
sponsibility for cleaning, may answer the
same question by pointing out that his wife
has a [illegible] for cleanliness, a trait attribu-
tion.

Thus, the current data do not speak di-
rectly to the Jones and Nisbett hypothesis.
However, our data do seem to contradict
the evidence that the responses of actors
are more salient and available to observers
than to actors themselves (Storms, 1973;
Taylor & Fiske, 1975). The critical variable
may be the extent to which the observer de-
parts from a passive role and interacts with
the actor. When, as in the present research,
individuals undertake complex social interac-
tions, they alternate between the roles of
speaker (actor) and listener (observer), yet
much of their attention may be directed at
planning and executing their own responses.
Although they do not attend to themselves
visually, they may be cognitively self-
absorbed. Therefore, self-generated inputs are
more available in recall. On the
other hand, passive observers may concentrate
on other persons in their environment. Also,
actors may be less self-absorbed when
[illegible]; for example, when they enact well-prac-
ticed behaviors (Langer, 1978, has specu-
lated that a wide range of social behaviors
are enacted with minimal thought).

Such instances notwithstanding, the pres-
ent research demonstrates the prevalence of
egocentric biases in availability and judg-
ments of responsibility. In everyday life,
these egocentric tendencies may be over-
looked when joint endeavors do not require
explicit allocations of responsibility. If allo-
cations are stated distinctly, however, there
is a potential for dissension, and individuals
are unlikely to realize that their differences
in judgment could arise from honest evalua-
tions of information that is differentially
available.


Reference Note


1. Greenwald, A. G. The totalitarian ego: Fabrica-
tion and revision of personal history. Unpub-
lished manuscript, 1978.


Ambient Temperature and the Occurrence of Collective Violence: 
A New Analysis 


J. Merrill Carlsmith and Craig A. Anderson 
Stanford University 


Prevalent folklore suggests that riots tend to occur during periods of very hot 
weather. Baron and Ransberger examined 102 major riots in the United States 
between 1967 and 1971 and concluded that the frequency of collective violence 
and ambient temperature are curvilinearly related. The present article points 
out that the Baron and Ransberger analysis did not take account of the dif- 
ferent number of days in different temperature ranges. The artifact is elim- 
inated, and the probability of a riot, conditional upon temperature, is estimated. 
When this is done, the evidence strongly suggests that the conditional probabil- 
ity of a riot increases monotonically with temperature. Some general implica- 
tions of such data analyses are discussed. 


In a recent article in this journal, Baron 
and Ransberger (1978) presented an analy- 
sis of the relationship between the frequency 
of major riots and the ambient temperature 
occurring during the riots. To do this, they 
studied 102 major riots in the United States 
between 1967 and 1971. The hypothesis they 
wished to test, and for which they claimed 
confirmatory evidence, is the existence of a 
curvilinear relationship between the likeli- 
hood of a riot and the maximum ambient 
temperature at the time of the riot. This
hypothesis contrasts with the prevalent folk-
lore that riots tend to occur during periods of
very hot weather. Specifically, Baron and
Ransberger concluded that the likelihood of a
riot increases with temperature up to the
range of 81°-85° F and then decreases
sharply with further increases in temperature.
The evidence that they presented to support
this relationship is a frequency distribution
of the number of riots plotted against tem-
perature. This frequency polygon does indeed
peak in the interval 81°-85° F, falling off
sharply on either side.

The results reported herein were supported in
part by funds from Boys Town. However, the
opinions expressed or the policies advocated do not
necessarily reflect those of Boys Town. The research
was carried out while the first author was a member
of the Boys Town Center for the Study of Youth
Development at Stanford University and while the
second author was supported by a National Science
Foundation [illegible]. We are indebted to Bradley
[illegible], [illegible] Ross, and Amos Tversky for their com-
ments and suggestions on this work, and to Joy
[illegible] for assistance in the seemingly unending
task of transcribing 58,000 temperatures.
Requests for reprints should be sent to J. Merrill
Carlsmith, Department of Psychology, Stanford Uni-
versity, Stanford, California 94305.

We contend that this relationship is an 
artifact of the particular way the data were 
examined and that an appropriate reanalysis 
suggests a monotonically increasing function 
relating the probability of riots and tempera- 
ture. Basically, we argue that the Baron and 
Ransberger results stem from their having 
not taken account of base-rate differences in 
temperature. For example, if days in the 81°- 
85° F range are more common than days in 
the 91°-95° F range, there may well be more 
riots in the former range. There are, after all, 
many more opportunities for riots. But an 
appropriate analysis may well show that riots 
are relatively more common in the higher 
temperature range. To be sure, Baron and 
Ransberger did consider this possibility, but 
they rejected it. In our view, their rejection 
was premature; we consider their arguments 
and the weaknesses therein at greater length below.

Figure 1. Frequency of baseball games played (New York Mets, 1977) as a function of ambient temperature. [Vertical axis: number of games played; horizontal axis: temperature (°Fahrenheit) in 5° intervals.]

Baseball, Temperature, and Base Rates 


To see most clearly how this artifact may 
work, it is instructive to apply the same 
analysis used by Baron and Ransberger to a 
set of events that we know are not influenced 
by temperature. Figure 1 shows the same
analysis used by Baron and Ransberger ap- 
plied to the frequency of New York Mets 
baseball games played at home in 1977 (The 
Sporting News, April-October 1977). That 
is, we plot the frequency of Mets home games 
against the maximum ambient temperature in 
New York on the day of the game. (To be 
sure, baseball games occur primarily during 
the summer months, but then again, so do 
riots.) A brief study of Figure 1 shows a 
remarkable similarity to Figure 1 in the 
Baron and Ransberger (1978) article, and 
were we to follow their logic, we would have 
to conclude that “inspection of this figure 
lends support to the suggestion of a curvilinear relationship between ambient temperature and the incidence of [baseball games]”
(p. 354). In view of the fact that baseball 
games are scheduled some months in advance, 
such a conclusion hardly seems warranted. 
Another explanation seems far more plausible. Both frequency polygons lead to erroneous conclusions, and for the same reason: the base rate of different temperatures has not been taken into account. In our baseball example, it is not difficult to see that the frequency distribution of Mets home games is
heavily influenced by the base-rate of daily 
maximum temperatures in New York during 
this period. Fewer games were played at 
temperatures of 91°-95° F, not because such temperatures lowered the probability of players choosing to play baseball, but because
there were fewer such days. This simple ex- 
ample captures the essence of our critique of 
the Baron and Ransberger analysis, although 
the problem becomes a good deal more com- 
plex when we try to deal with the data they 
present.
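
The artifact can be made concrete with a small numerical sketch. The day counts and per-day rates below are invented for illustration and are not taken from either study; the point is only that a raw frequency count can peak in the most common temperature interval even when the per-day probability of the event rises steadily with temperature.

```python
# Hypothetical numbers only; none of these values come from the article.
intervals = ["71-75", "76-80", "81-85", "86-90", "91-95"]

# Assumed base rate: how many days fall in each temperature interval.
days_per_interval = [9000, 11000, 12000, 6000, 2000]

# Assumed per-day event probability, rising monotonically with temperature.
rate_per_day = [0.0010, 0.0015, 0.0020, 0.0030, 0.0045]


def expected_counts(days, rates):
    """Expected number of events per interval = days * per-day rate."""
    return [d * r for d, r in zip(days, rates)]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


counts = expected_counts(days_per_interval, rate_per_day)
peak_by_count = intervals[argmax(counts)]       # where the frequency polygon peaks
peak_by_rate = intervals[argmax(rate_per_day)]  # where the event is most likely per day

print("peak of raw frequency:", peak_by_count)
print("peak of per-day probability:", peak_by_rate)
```

Under these assumed inputs the frequency polygon peaks in a middle interval while the per-day probability is highest in the hottest interval, which is the pattern the baseball example is meant to illustrate.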

The failure to consider base rates when assessing the probability of some event is hardly unique to this example. It is a problem of general concern in the analysis of data when we wish to calculate the probability of an event, conditional on the occurrence of some other event. It is also a problem that has begun to intrigue cognitive psychologists interested in subjective assessments of probability rather than formal statistical estimation. For example, in the context of a discussion of judgmental heuristics, Tversky and Kahneman (1974) comment:


The reliance on heuristics and the prevalence of biases are not restricted to laymen. Experienced researchers are also prone to the same biases when they think intuitively. For example, the tendency to predict the outcome that best represents the data, with insufficient regard for prior probability, has been observed in the intuitive judgments of individuals who have had extensive training in statistics. Although the statistically sophisticated avoid elementary errors, such as the gambler’s fallacy, their intuitive judgments are liable to similar fallacies in more intricate and less transparent problems. (p. 1130)


Temperatures Before and After Riots 
(and Baseball Games) 


The second major group of data presented by Baron and Ransberger is a plot of the maximum daily temperatures in the [...] during the 7 days prior to each riot and the 3 days following the riot. This plot shows gradually increasing temperatures up to the day of the riot, followed by [lower] temperatures during the 3 days following the riot. It should be noted that although this picture is consistent with their hypothesis, it is also consistent with a wide variety of other possible functional relationships between temperature and the probability of riots. In particular, it is consistent with our hypothesis that there is a monotonically increasing relationship between temperature and the probability of a riot. On the other hand, the plot is also consistent with a hypothesis that on occasional rainy (and cooler) days, riots (or baseball games) are unlikely; thus, the same picture could occur regardless of the true relationship between the probability of a riot and temperature at the time of a riot. Figure 2 shows a plot, [analogous] to Baron and Ransberger’s Figure 2, done for New York Mets home games in 1977. Again we see a remarkable similarity between Baron and Ransberger’s plots and our Figure 2 for baseball games. It is our tentative hypothesis that [this effect] is mediated by a few rainy days on which temperatures tend to be cooler and games tend not to be played, but we are less certain of this artifact than we are of the artifact underlying Figure 1. In view of the fact that Baron and Ransberger’s Figure 2 is consistent with a wide variety of possible functional relationships between temperature and the probability of riots, we will not pursue this issue further but return to the more fundamental question. If the postulated curvilinear relationship between temperature and the probability of riots is artifactual, what is the nature of the true relationship?

Figure 2. Mean maximum ambient temperature on the 7 days preceding the baseball game, on the game day, and on the 3 days following the game day. [Vertical axis: mean temperature (°Fahrenheit); horizontal axis: pre-game days, game day, post-game days.]


Conditional and Unconditional Probabilities 


It is easier to point to the dangerous arti- 
fact underlying Figure 1 than to see a per- 
fect solution to it. We present two alterna- 
tive analyses below. Neither is immune to 
criticism, although both remove the obvious 
artifact. The consistency that emerges from 
these two very different analyses leads us to 
some confidence in the conclusions they im- 
ply, although we would emphasize at the 
outset our qualms about drawing any firm 
conclusions from this type of correlational 
analysis. 

To see how to remove the effect of base rates, it is instructive to formalize our discussion slightly. The quantity we wish to estimate is the conditional probability of a riot, given a particular temperature. Since we have only 102 data points with which to work (or 86 if we follow Baron and Ransberger in analyzing separately the 16 riots that occurred on the date or anniversary of the Martin Luther King assassination), we can only hope to estimate this conditional probability of a riot given that the temperature is in a particular interval. Let $E_i$ correspond to the event that the temperature is in the ith interval. Then $P(R \mid E_i)$ is the conditional probability of a riot given a temperature in the ith interval. We follow Baron and Ransberger in using 5° intervals. From a familiar relationship in elementary probability theory, we have the following equation:

$$P(R \mid E_i) = \frac{P(R \,\&\, E_i)}{P(E_i)}$$

That is, the conditional probability of a riot, given that the temperature is in a particular interval, is given by the joint probability of a riot and a temperature in a particular interval, divided by the probability that the temperature is in that interval.¹ Examination of the equation makes it clear that Baron and Ransberger essentially estimated the joint probability without correcting for the marginal temperature distribution.

Our problem, then, is to estimate that marginal distribution. It is not such an easy problem as it might appear, since the universe from which the particular riot temperatures are to be viewed as a sample is not well defined. We might attempt to conceptualize it as the distribution of all temperatures in the United States in the 5-year period in question. But a moment’s thought shows the vagueness of that conceptualization. Should that distribution be weighted by the population density in each geographical location? Does it include the temperatures at Death Valley, where there are too few people to stage a convincing riot? Alternatively, we might try the distribution of temperatures in cities larger than, say, 100,000 people, again over the 5-year period in question. But some riots occurred in much smaller cities. Furthermore, no riots occurred in Alaska. Should we then include Alaska in our universe? Our solution to this problem was to define the universe of temperatures as all temperatures occurring in the 79 cities in which there were riots over the 5-year period in question.² This definition generated a 79 × 1,826 matrix of daily temperatures. In order to estimate the distribution of temperatures in this matrix, we randomly sampled 2 of the 5 years for each city and found the daily maximum ambient temperature for each of the 57,705 days so defined. Temperatures were obtained from the Climatological Data reports of the U.S. Department of Commerce (Environmental Data Service, 1967-1971). Where no reporting station existed in the city (as was true in 6 of the smaller towns), we used the nearest station. This method, then, yielded an estimate of the probability distribution of maximum daily temperature over all riot cities over the 5-year period in question. That is, for each 5° F temperature interval, we had a count of the number of days in which the maximum temperature was in that interval. Dividing that count by 57,709 (the total number of days sampled) yielded an estimate of the probability of a day in


¹ An urn model may serve to clarify this point. Suppose we imagine an urn with a large number of marbles, each marked with a temperature. Most of the marbles are white, but a few are blue. The blue marble corresponds to a riot. To estimate the marginal distribution of temperature, we draw a sample of marbles and plot the temperatures. To estimate the probability of a riot, we draw a sample of marbles and count the number of blue marbles. To estimate the joint probability of riots and temperature, we draw a sample of marbles and count the number of blue ones in each temperature range. To estimate the conditional probability of a riot given a particular temperature, we draw a large sample, count the number of blue marbles in a particular temperature range, and divide that count by the total number of marbles in that temperature range. This final number is interpreted as follows: If we are told that we have a marble in a particular temperature range, the conditional probability is the number of chances that it is blue. So we wish to know for our data: Given that the temperature is between 81° and 85° F, what is the conditional probability of a riot? Having estimated this conditional probability for each interval, we [can see] how these conditional probabilities vary with temperature.

² We are indebted to Robert A. Baron for providing us with the list of cities and dates of the riots used in their study.

Figure 3. Conditional probability (likelihood) of a riot as a function of ambient temperature. [Vertical axis: likelihood of a riot (×10⁴); horizontal axis: temperature in 5° F intervals. Two curves: excluding King riots (N = 86) and including King riots (N = 102).]


which the maximum temperature was in that interval. To estimate the conditional probability of a riot, given that the temperature was in a particular interval, we calculated the number of occurrences of a riot with a temperature in that range and divided by the [number of days with] a temperature in that range. The resulting function is shown in Figure 3. It is clear that Figure 3 provides no evidence for a curvilinear relationship. Instead, we see a continuously increasing likelihood of riots as the temperature continues to increase, up through the temperature range of 91°-95° F. Contrary to the assertion of Baron and Ransberger, riots do not appear to be most likely at temperatures between 81° and 85° F; rather they become more likely with increasing temperature. This different function, of course, is a reflection of the fact that although there are more riots on days when the temperature is [...] F than on days when the temperature is [...] F, there are many fewer days [...]. The smaller number of riots at extremely high temperatures appears to be the result of many fewer opportunities for riots to occur; the conditional probability of a riot is larger at the higher temperature.
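
The estimate described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: the function names, the toy data, and the `universe_days` normalization (the total number of city-days from which the riots are drawn) are all hypothetical. The joint probability is estimated from riot counts, the marginal from a sample of daily maxima, and their ratio gives the conditional probability for each 5° interval.

```python
from collections import Counter


def interval_label(temp_f, width=5):
    """Map an integer temperature to a label such as '81-85' (intervals 1-5, 6-10, ...)."""
    low = ((int(temp_f) - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


def conditional_probabilities(riot_temps, sampled_temps, universe_days):
    """Estimate P(R | E_i) = P(R & E_i) / P(E_i) for each temperature interval.

    riot_temps    -- temperature on the first day of each riot
    sampled_temps -- sample of daily maxima used to estimate P(E_i)
    universe_days -- assumed total number of city-days in the universe
    """
    riots = Counter(interval_label(t) for t in riot_temps)
    days = Counter(interval_label(t) for t in sampled_temps)
    n_sample = len(sampled_temps)
    result = {}
    for label, n_days in days.items():
        p_joint = riots.get(label, 0) / universe_days   # estimate of P(R & E_i)
        p_interval = n_days / n_sample                  # estimate of P(E_i)
        result[label] = p_joint / p_interval
    return result


# Toy data, invented for illustration only.
sampled = [72] * 40 + [83] * 50 + [93] * 10
riot_temps = [72, 83, 83, 83, 93, 93]
probs = conditional_probabilities(riot_temps, sampled, universe_days=1000)

for label in sorted(probs):
    print(label, probs[label])
```

In the toy data the raw riot count is largest in the 81-85 interval, yet the estimated conditional probability is largest in the 91-95 interval, because so few sampled days fall there.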

Three brief methodological notes are in 
order about Figure 3. First, there is a total 
of only three riots in the highest three tem- 
perature intervals, making the probability 
estimates extremely unstable. Consequently, 
we have averaged the three points, and this 
average is connected to the remainder of the 
function by a dashed line. Second, in defining 
the maximum daily temperature associated 
with each riot, we used the temperature on 
the day the riot began. This method con- 
trasts with that of Baron and Ransberger, 
who took the average of the daily tempera- 
tures over the duration of the riot, for those 
riots that lasted more than 1 day. It seemed 
to us that if one wants to consider tempera- 
ture as a causative factor in the outbreak of 
riots, it is more sensible to measure the 
temperature at the time of the outbreak 
rather than to include temperatures over 
subsequent days. Clearly, the temperature on 


³ We have followed Baron and Ransberger in estimating each function twice: once including the Martin Luther King-related riots and once excluding them.

Figure 4. Frequency of collective violence (riots) as a function of the relative ambient temperature (expressed in percentiles). [Vertical axis: number of riots; horizontal axis: percentile of temperature. Two curves: excluding King riots (N = 86) and including King riots (N = 102).]


July 12 cannot be more influential in deter- 
mining whether a riot begins on July 10 than 
is the temperature on July 10. If we were to 
follow the Baron and Ransberger definition, 
there would be some minor changes in our 
Figure 3. Since we know that temperatures 
following the onset of a riot tend to be less 
extreme than temperatures on the first day 
of the riot, it is not surprising to find that 
extreme temperatures are slightly less com- 
mon if we average over the days of the riot 
and that the proportion of riots at the modal 
temperature increases slightly. Even were we 
to use this averaging, the function still fails 
to show the precipitous drop that made Fig- 
ure 1 so compelling, and the riot probability 
still reaches its maximum in the highest tem- 
perature range. Third, we present no infer- 
ential statistics in conjunction with this fig- 
ure. It is our view that such statistics are, at 
best, irrelevant to these data and, at worst, 
seriously misleading. In their analysis of 
Figure 1, Baron and Ransberger present chi- 
square statistics. As we have already seen, 
such a calculation rests on an assumption that 
all temperature intervals are equally likely— 
an assumption that is demonstrably false. 
Furthermore, the riots show strong temporal 
and geographical dependencies (for example, five riots occurred in different cities in Michigan in a 3-day interval), which make assumptions of independence untenable.


An Alternative Analysis 


Although Figure 3 casts severe doubt on the curvilinearity hypothesis, it is not [entirely] convincing by itself. We have already alluded to the somewhat arbitrary definition of the universe of temperature days. Furthermore, the base-rate information used in the calculation of Figure 3 can be overly influenced by an extreme temperature distribution in one or two cities (although the large number of data points makes this somewhat implausible). Still, we are adding across cities with quite dissimilar temperature distributions. A different analysis, which avoids these particular difficulties, involves looking at each riot temperature relative to all temperatures in the riot city. Thus, we estimate the cumulative distribution function (cdf) of all temperatures in City K and then express the temperature on the day of the riot in that city relative to the cdf of all temperatures in that city. This procedure converts each riot temperature to a percentile relative to all temperatures in the riot city. Again, we used a random sample of 2 of the 5 years in each city over the period 1967-1971 to estimate the cdf of all temperatures in that city, estimating each cdf by 730 (nonindependent) points.
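
The percentile conversion can be sketched as follows. This is a minimal illustration, assuming an empirical cdf built from each city's own temperature sample; the function names, the 10-point band width, and the two toy cities are hypothetical and not taken from the article.

```python
from bisect import bisect_right


def percentile_in_city(riot_temp, city_temps):
    """Percentile (0-100) of riot_temp within that city's own temperature sample."""
    ordered = sorted(city_temps)
    # Empirical cdf: share of sampled days at or below the riot-day temperature.
    return 100.0 * bisect_right(ordered, riot_temp) / len(ordered)


def percentile_histogram(riots, city_samples, band_width=10):
    """Count riots per percentile band; riots is a list of (city, temperature) pairs."""
    n_bands = 100 // band_width
    counts = [0] * n_bands
    for city, temp in riots:
        p = percentile_in_city(temp, city_samples[city])
        index = min(int(p // band_width), n_bands - 1)  # a percentile of 100 goes in the top band
        counts[index] += 1
    return counts


# Toy data, invented for illustration: a warm city and a city with a wider range.
city_samples = {
    "A": list(range(41, 101)),  # 60 sampled daily maxima, 41-100 F
    "B": list(range(1, 101)),   # 100 sampled daily maxima, 1-100 F
}
riots = [("A", 95), ("B", 95), ("A", 70), ("B", 50)]
histogram = percentile_histogram(riots, city_samples)
print(histogram)
```

Because each riot temperature is ranked only against temperatures from its own city, a 70° F day in the warm toy city and a 50° F day in the other city land in the same percentile band, and a flat histogram would be expected if temperature had no bearing on riots.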

Figure 4 shows the number of riots occurring at each percentile interval. By shifting
to percentiles, we have eliminated the base- 
rate problem that plagued us in Figure 1. 
When we plot the frequency of riots occur- 
ring at different percentiles, the null hypothe- 
sis clearly predicts a uniform distribution. 
Any effect of the different likelihoods of dif- 
ferent temperatures has been removed by 
expressing each riot temperature relative to all 
temperatures in that city. Figure 4 hardly 
suggests a uniform distribution, nor is there 
any evidence of curvilinearity. What is sug- 
gested is a monotonically increasing function. 
Thus, once again we conclude that the likeli- 
hood of a riot in a given city increases as 
the maximum ambient daily temperature in 
that city increases.

It should be noted that Figure 4 examines the covariation of riots and the relative temperature (relative to all temperatures in that particular city) rather than the riots’ covariation with the absolute temperature. The relationship between relative and absolute temperature is strong enough (although by no means perfect) that we see no hope of using these data to address the fascinating question of whether absolute temperature or relative temperature (relative to some adaptation level) is more important in predicting the likelihood of a riot. Rather, we see the consistency of these two functions, based on quite different methods of analysis, as lending support to the general proposition that, at least for these particular riots, the probability of a riot increases monotonically with increasing maximum ambient temperatures in potential riot cities.


Attempts to Discount the Base-Rate Artifact


Given the dramatic differences in the conclusions we draw after taking account of the temperature base rates, it is worth considering why Baron and Ransberger rejected [the base-rate artifact] as an explanation of [their results] and why we feel that the rejection was in error. They present several arguments against the plausibility of the artifactual effect, but each is particularistic and
fails to take into account the overall impact 
of the base rate. Analogous to our use of 
Mets baseball games, for example, they 
looked at celebrations following victories in 
particular major athletic contests (bowl 
games, Stanley Cup play-offs, National Bas- 
ketball Association play-offs, and the World 
Series) and found that temperatures associ- 
ated with such events do not peak at 81°- 
85° F. Unfortunately, none of the events they 
chose to study occur in the summer months. 
Thus, they failed to observe that over the 
course of the year in the riot cities, tempera- 
tures in the 81°-85° F range are indeed 
more frequent than those in any other in- 
terval. The only evidence they present rele- 
vant to the overall distribution comes from 
their second counterargument against it. They 
select 11 cities and 2 months (July and 
August) and show that temperatures in the 
81°-85° F range are not uniformly most
frequent in those cities in those months. But 
different choices of cities, months, and tem- 
perature intervals lead to different conclu- 
sions, none of which describe the overall 
temperature distribution. As we have shown 
above, temperatures in the 81°-85° F range 
are in fact the most common temperatures in
these cities. Baron and Ransberger’s other
arguments have this same particularistic 
quality, focusing primarily on temperatures 
in particular ranges in riot cities on the same 
dates in riot years versus nonriot years, and 
we do not consider them in detail here. 


A Final Note


We close with a final set of cautionary 
remarks. We feel quite confident that these 
data do not provide support for the hypothe- 
sis of a curvilinear relationship between tem- 
perature and the probability of a riot. We 
feel reasonably confident that for these 
particular riots, there is good evidence for a 
monotonically increasing relationship be- 
tween temperature and the probability of a 
riot.⁴ However, facile generalizations from these data make us very nervous. The riots are certainly not independent of one another; the dependencies cannot easily be described. The temperatures are not independent; again the dependencies are complex. Thus, from the point of view of inferential statistics, the number of true independent data points is unknown and may be small. Not only are the data fraught with all of the ambiguities of any correlational study, the data analyses are also subject to subtle, difficult, and complex effects. Even if the present data analysis seems satisfactory, there are numerous alternative explanations of the relationship. A clear understanding of the psychological effects of temperature, and particularly the effects of temperature on aggression, seems much more likely to emerge from experimental work like that of Baron (1972), Baron and Bell (1976), or Baron and Lawton (1972).

⁴ These remarks about the monotonic character of the relationship between temperature and the likelihood of riots are restricted to the normal range of temperatures. Clearly, at some point the relationship must become curvilinear. We seriously doubt that riots are likely to occur when the temperature is 120° F (although we have no data one way or the other).


Affective Space is Bipolar


James A. Russell
University of British Columbia, Vancouver, Canada


Numerous previous studies found monopolar rather than bipolar dimensions of affect (defined as emotion represented in language), but may have included methodological biases against bipolarity. The present study of self-report data (N = 150) on 11 affect scales showed that response format and acquiescence response style significantly shifted correlations between hypothesized opposites away from showing bipolarity. When these biases were taken into account, pleasure was found to be the bipolar opposite of displeasure and arousal of sleepiness. In turn, pleasure-displeasure and degree of arousal formed a two-dimensional bipolar space that accounted for almost all of the reliable variance in Thayer's four factors of activation plus a measure of depression. Dominance and submissiveness factors were also included in the study, but invalidity of the scales used precluded any conclusions regarding their bipolarity.


Early work on affect (emotion as represented in language) was based on the assumption that its descriptive dimensions would be bipolar: “An affective scale is a bipolar one” (Guilford, 1954, p. 264). Nowlis (e.g., Nowlis & Nowlis, 1956) similarly began his factor-analytic work on affect by hypothesizing four bipolar dimensions: pleasantness-unpleasantness, activation-deactivation, positive-negative social orientation, and control-[lack] of control. He gathered adjectives to [represent] each dimension and, in a series of studies, factor-analyzed the intercorrelations of responses to those adjectives. “How do these results relate to our original hypotheses? First, we had assumed that each factor would be bipolar. Actually, bipolarity [...] in the obtained axes” (Nowlis, 1965, p. 358). His hypothesized dimensions had tended to form two separate, independent monopolar factors instead of single bipolar ones.

Borgatta (1961), Clyde (1963), and Mc- 
Nair and Lorr (1964) subsequently reported 
the results of additional factor-analytic stud- 
ies on affect in which monopolar, as opposed 
to bipolar, factors were consistently obtained. 
Thayer (1967) investigated what had origi- 
nally been hypothesized to be a single, bi- 
polar affective dimension, degree of activation 
or arousal, and obtained four independent 
monopolar factors, which he termed general 
activation, high activation, deactivation-sleep, 
and general deactivation. Indeed, most in- 
vestigators of affect now argue against the 
existence of bipolar affective dimensions (e.g., 
Bradburn, 1969; Hall, 1977; Izard, 1972; 
McLachlan, 1976; Westbrook, 1976). 

In other words, considerable research has 
tended to indicate that a person can feel, for 
example, both happy and unhappy or both 
aroused and sleepy at the same time. Based 
on these results, the affect scales most com- 
monly used today were constructed to assess 
monopolar factors (e.g, Izard, 1972; Mc- 
Nair, Lorr, & Droppleman, 1971; Nowlis, 
1965; Thayer, 1967), although several scales 
continue to be based on bipolar dimensions 
(e.g, Mehrabian & Russell, 1974; Spiel- 


Copyright 1979 by the American Psychological Association, Inc. 0022-3514/79/3703-0345$00.75 


346 JAMES A. 
berger, Gorsuch, & Lushene, 1970; Zucker- 
man & Lubin, 1965). 


Bipolarity of Semantic Space 


Interestingly, a similar controversy arose 
in the semantic differential literature. Osgood, 
Suci, and Tannenbaum (1957) found that 
the affective meaning of words could be ade- 
quately summarized by three bipolar dimen- 
sions: evaluation, activity, and potency. 
Since each of these dimensions was defined 
by adjective-pair scales such as good-bad, 
fast-slow, and strong—weak, respectively, 
Green and Goldfried (1965) argued that bi- 
polarity had been assumed, indeed had been 
forced onto the results, by the rating pro- 
cedure employed. Green and Goldfried then 
attempted to empirically test the assumption 
of bipolarity by constructing a single-adjec- 
tive version of Osgood’s semantic differential 
scales and examining the actual correlation 
between alleged opposites. If evaluation, say, 
were indeed bipolar, one would expect a high 
negative correlation between responses to neg- 
ative evaluation words such as bad and those 
to positive evaluation words such as good. In 
fact, Green and Goldfried generally failed to 
find such correlations and concluded that 
semantic space should not be conceptualized 
as bipolar. If accepted, such a conclusion 
would have important implications not only 
for the widely used semantic differential tech- 
nique but for the area of affect as well, in 
that the semantic differential factors have 
been interpreted as reflecting basic dimen- 
sions of human affect (Osgood, 1969). 

Green and Goldfried’s (1965) argument 
against bipolarity of semantic space did not 
go unchallenged, however. Bentler (1969) 
argued that high negative correlations indica- 
tive of bipolarity might have been obscured 
in their study by acquiescence, which is an 
individual-difference variable in the tendency 
to agree or disagree with an item regardless 
of its content. That is, if acquiescence were 
a factor in Green and Goldfried’s data, it 
would have shifted all observed correlations 
between words in a positive direction and 
hence away from showing bipolarity (r = −1.0). Bentler tested his hypothesis by constructing a single-adjective version of the semantic differential and using a partial correlation technique to control acquiescence. The results provided striking evidence for his hypothesis. For example, a composite of positive evaluation words correlated only .03 with a composite of negative evaluation words before acquiescence was taken into account. After acquiescence was partialed out, these composites correlated −.76, thus giving strong evidence for the bipolarity of semantic space.
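
The logic of partialing out acquiescence can be illustrated with a short sketch. It is not Bentler's procedure, whose details are not given here: the scores are invented, and the sketch assumes that a separate acquiescence score is available for each respondent. It uses the standard first-order partial correlation, r_xy.z = (r_xy − r_xz r_yz) / sqrt((1 − r_xz²)(1 − r_yz²)).

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y with z held constant (first-order partial)."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented scores for 8 hypothetical respondents. The four components are
# mutually uncorrelated by construction.
content = [-1, 1, -1, 1, -1, 1, -1, 1]        # true standing on the bipolar dimension
acquiescence = [-1, -1, 1, 1, -1, -1, 1, 1]   # tendency to agree regardless of content
noise_pos = [-1, -1, -1, -1, 1, 1, 1, 1]
noise_neg = [1, -1, -1, 1, 1, -1, -1, 1]

# Positive words load +content, negative words load -content; both pick up acquiescence.
positive = [c + a + 0.5 * e for c, a, e in zip(content, acquiescence, noise_pos)]
negative = [-c + a + 0.5 * e for c, a, e in zip(content, acquiescence, noise_neg)]

raw_r = pearson(positive, negative)
partial_r = partial_correlation(positive, negative, acquiescence)
print(raw_r, partial_r)
```

In this constructed example the raw correlation between the positive and negative composites is zero even though both are driven by a single bipolar dimension; once the shared acquiescence component is held constant, a strongly negative correlation appears.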


Bipolarity of Affective Space 


In the domain of affect, Meddis (1972) has recently defended the notion of bipolar dimensions. He noted two potential sources of bias in the response format used by Nowlis (1965), and subsequently by Thayer (1967), that could account for their failure to obtain bipolar dimensions. The response rating scale used by Nowlis and Thayer, shown in Table 1, assigns 1 point to “No (definitely not),” 2 points to “? (cannot decide),” 3 points to “V (feel slightly),” and 4 points to “VV (definitely feel).” First, Meddis noted, this scale is asymmetric, in that there are two categories of acceptance (V and VV) but only one of rejection (No), and, second, the option “? (cannot decide)” may [preclude] even ordinal data, since a response to this option could mean a variety of things other than a feeling whose intensity is somewhere between the two neighboring categories.

Meddis (1972) empirically contrasted the results obtained using Nowlis’s and Thayer’s response format with results obtained using a format that was more symmetric and [...] alternative. (This last format is shown in Table 1 as the “Meddis” format.) Responses to these adjectives were gathered with each of the formats separately and factor-analyzed. As predicted, predominantly monopolar factors tended to emerge from the data gathered with Nowlis’s and Thayer’s format, whereas bipolar factors tended to emerge from the data gathered with the Meddis format. When, for example, responses to 38 adjectives similar to those studied by Thayer (1967) were gathered with the Meddis format, two bipolar factors were obtained: general activation versus deactivation-sleep and high activation versus general deactivation. In Meddis’s (1972) words, “Whether or not we obtain bipolar factors appears to depend upon the kind of rating scale we use” (p. 183).


Additional Factors Operating Against 
Bipolarity 


Many of the studies supporting monopolarity reviewed above did not employ Nowlis’s and Thayer’s response format and therefore are not subject to Meddis’s specific criticisms. The conclusion of monopolarity in these additional studies may be questioned on the grounds of other methodological problems, however, such as the possibility of alternate rotations and of the influence of difficulty factors (unequal response distributions of the variables involved) in factor-analytic solutions. More importantly, these studies may also have included biases specifically against bipolar factors. First, to the extent that the sample of emotion words studied underrepresents one end of a bipolar continuum, bipolar factors are less likely to emerge. Second, if the instructions ask subjects how they felt over an extended period of time (such as over a week; e.g., Bradburn, 1969; Hall, 1977), they may describe several, perhaps opposite, emotional experiences. Finally, the correlation between two items of a scale has been shown to be spuriously inflated in proportion to the proximity of the items to each other in time or space (Guilford, 1954; Stockford & Bissell, 1949). This effect, known as proximity error, would result in an overall shift toward [positive] correlations among a set of items, irrespective of content, to which subjects respond within a short period of time, which of course is standard practice (and, it would seem, a necessity) when measuring affect.


In short, although numerous empirical
studies of affect have concluded against
bipolarity, results from these studies may have
been biased by one or more of the following:
an acquiescence response bias; a response
format that fails to yield ordinal data;
an asymmetric response format; inadequate
sampling of affect terms; instructions to
describe feelings during a time period long
enough that several, possibly opposite,
feelings might have occurred; and a proximity
error, which is the tendency to respond
similarly to items that are nearer in time or space.


Overview of the Present Study 


In the present study, subjects described 
their current affective state on separate scales 
assessing opposite ends of hypothesized bi- 
polar dimensions so that correlations between 
these scales could be empirically established. 
Four response formats (shown in Table 1) 
were employed: Meddis’s (1972) format plus 
three others that have been extensively used 
in studies obtaining monopolar factors of af- 
fect, Nowlis’s (1965) and Thayer’s (1967), 
McNair and Lorr’s (1964), and a true-false 
format. 

To ensure an adequate sample of affect 
terms, items were chosen for the question- 
naire that appeared on a priori grounds to 
assess primarily opposite ends of pleasure- 
displeasure, arousal-sleepiness, and domin- 
ance-submissiveness, since these dimensions 
have been suggested by a variety of sources 
of evidence as basic dimensions of affect (see 
Russell, 1978, for a brief review of this evi- 
dence). All adjectives from Thayer’s (1967) 
four scales of activation were also included in 
order to explore the activation/arousal di- 
mension more thoroughly. Thayer (1970) 
showed his scales to be significantly related 
to physiological indexes of arousal, and the 
relation of his four factors to the entire af- 
fective space is thus of considerable theoreti- 
cal importance. Moreover, evidence provided 
by Russell and Mehrabian (1977) indicated 
that it was possible to subsume Thayer’s 
four dimensions within a more general, three- 
dimensional theory of affect defined by 
pleasure-displeasure, degree of arousal, and 
dominance-submissiveness. The present study 
provided further evidence on that notion and 
tested Meddis’s (1972) hypothesis that 
Thayer’s general activation is the bipolar 
opposite of deactivation-sleep and that high 
activation is the bipolar opposite of general 
deactivation. Finally, a set of adjectives that 




Table 1
Response Probability for Each Category of Four Response Formats

Format and category label                 Probability

Meddis (1972)
  XX (definitely do not feel)               .17
  X  (do not feel)                          .41
  v  (slightly feel)                        .34
  vv (definitely feel)                      .08
Nowlis (1965) & Thayer (1967)
  no (definitely do not feel)               .45
  ?  (cannot decide)                        .15
  v  (feel slightly)                        .32
  vv (definitely feel)                      .08
McNair & Lorr (1964)
  1  (not at all)                           .41
  2  (a little)                             .36
  3  (quite a lot)                          .18
  4  (extremely)                            .05
True-false
  (no, I do not feel)                       .63
  (yes, I do feel)                          .37

Note. Probabilities were calculated across 58 items and 150 subjects for each response format.


appeared to assess depression was included,
since on inspection of Thayer’s items,
depression seemed more likely to be the actual
bipolar opposite of his general activation
scale.


Method 
Subjects 


Subjects were 71 female and 79 male University 
of British Columbia undergraduates. These were 
students from two different psychology classes who 
responded to a series of four questionnaires during 
class time, each questionnaire on a separate day, ap- 
proximately 2 weeks apart. One class responded in 
the fall of 1976, the other in the following spring. 
Ten additional subjects began, but failed to
complete the study, and their data were excluded
from analysis.


Materials 


Eleven sets of adjectives, a total of 58 items, 
were used. The first 4 sets are Thayer’s (1967) fac- 
tors of activation, and the remaining 7 were con- 
structed a priori to measure pleasure, displeasure, 
arousal, sleepiness, dominance, submissiveness, and 
depression. The sets were the following: 

1. General activation (Thayer): lively, active, 
full of pep, energetic, peppy, vigorous, activated. 

2. High activation (Thayer): clutched up, jittery, 
stirred up, fearful, intense. 

3. General deactivation (Thayer): at rest, still, 
leisurely, quiescent, quiet, calm, placid. 

4. Deactivation-sleep (Thayer): sleepy, tired,
drowsy.

5. Pleasure: contented, happy, satisfied, pleased,
joyful.

6. Displeasure: discontented, unhappy,
dissatisfied, displeased, joyless.

7. Arousal: wide awake, aroused, aflame,
impassioned, alert, roused.

8. Sleepiness: inactive, half asleep, slow,
unaroused.

9. Dominance: dominant, controlling, influential,
important, autonomous.

10. Submissiveness: submissive, controlled,
influenced, awed, guided.

11. Depression: depressed, discouraged, gloomy,
sad, blue, sluggish.

Four questionnaires were formed by intermixing
these 58 items in two random orders and then
counterbalancing each. Each questionnaire began
with directions stating, “On this sheet you will find
a list of words or phrases that describe different
kinds of moods and feelings. Please use this list to
describe your feelings today.”

Next followed one of the four response formats
(shown in Table 1). The true-false format allows
two response alternatives; the remaining formats
allow four response alternatives and differ
in the verbal label attached to each alternative. For
convenience, these three formats were labeled in
the table by the name of investigators who
introduced, or at least have used, that format in
studies of affect.


Procedure 


Each subject responded four times, once to each of
the four response formats in random order. Each
subject responded only once, again in random order,
to each of the four orders of presentation of the
items. In other words, all possible combinations of
response format with order of item presentation
were employed and randomly assigned to subjects.
No feedback from their data was given to subjects
until after the last questionnaire had been completed.


Results and Discussion 


Response Format


Table 1 shows the mean probability of a
response to each category of each response
format. The Meddis format resulted in a
roughly even, symmetric distribution of
responses. The Nowlis and Thayer format, in
contrast, resulted in a bimodal distribution,
because subjects avoided the category labeled
“? (cannot decide)”, a finding that supports
the view that this category may preclude
ordinal data. The McNair and Lorr format
resulted in a skewed distribution, because
the modal response was to the first category,
“not at all.” Thus, of the four formats studied,
the Meddis format provided the best
distribution of responses and may be preferred
on these grounds alone.

It had been hypothesized that in comparison
with the Meddis format, the other formats
would shift correlations among items in
the positive direction. A sign test was thus
carried out in which each interitem correlation
from the Meddis format was compared with its
counterpart from each of the other three
formats. As expected, correlations based on the
other formats did tend to be greater than
their counterparts based on the Meddis format:
In the Nowlis and Thayer format data,
649 were smaller and 1,004 were greater
(z = 8.71, p < .01); in the McNair and
Lorr format data, 684 were smaller and
969 were greater (z = 6.98, p < .01); and in
the true-false format data, 500 were smaller
and the remainder were greater. In
addition, the true-false format resulted in
correlations that were greater than their
counterparts not only from the Meddis
format but from the others as well: Compared
with the Nowlis and Thayer format data, 559
were smaller and 1,094 were greater (z =
13.13, p < .01), and compared with the
McNair and Lorr format data, 641 were smaller
and 1,012 were greater (z = 9.10, p < .01).
There was no significant difference between 
correlations from the Nowlis and Thayer for- 
mat and those from the McNair and Lorr 
format (864 smaller and 789 larger, z= 
1.82, ns). In sum, when compared with the 
Meddis format, the three formats commonly 
used in studies of affect resulted in correla- 
tions shifted in the positive direction—away 
from evidence of bipolarity. 
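
The sign-test statistics reported in this section can be checked from the counts themselves. The following is a minimal sketch that assumes a normal approximation with a continuity correction, z = (|greater − smaller| − 1)/√n. That formula is inferred from the reported values and is not stated in the article, and the function name is illustrative only.

```python
import math

def sign_test_z(smaller: int, greater: int) -> float:
    """Normal-approximation sign test with a continuity correction.

    Assumed form (inferred from the reported values, not given in the text):
    z = (|greater - smaller| - 1) / sqrt(n), with n = smaller + greater.
    """
    n = smaller + greater
    return (abs(greater - smaller) - 1) / math.sqrt(n)

# 58 items yield 58 * 57 / 2 distinct interitem correlations per format.
n_pairs = 58 * 57 // 2

# (smaller, greater) counts as reported in the text.
comparisons = {
    "true-false vs. Nowlis & Thayer": (559, 1094),
    "true-false vs. McNair & Lorr": (641, 1012),
    "Nowlis & Thayer vs. McNair & Lorr": (864, 789),
}

for label, (smaller, greater) in comparisons.items():
    z = sign_test_z(smaller, greater)
    print(f"{label}: n = {smaller + greater} of {n_pairs}, z = {z:.2f}")
```

Each pair of counts sums to the full set of interitem correlations, which suggests that no comparisons were dropped as ties.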


Content Scales 


Reliability. Since 7 of the 11 scales were 
constructed ad hoc for this study, their in- 
ternal consistency was unknown. Rather than 
use an extensive item-selection process, how- 
ever, it was decided to simply eliminate any 
item from a scale if it failed to correlate posi- 
tively with each of the other items on that 
scale. Based on data gathered with the Meddis 
format (N = 150), only one item—“con- 
trolled” on the submissiveness scale—failed 
this criterion, and that item was excluded 
from further analyses. Measures of internal 
consistency reliability (coefficient alpha, Cron- 
bach, 1951) were then calculated for all 11 
scales, separately for each format. (Values 
for 5 scales are reported later in Table 3.) 
Three scales showed only moderate reliabil- 
ity: high activation (.59 to .69), dominance 
(.68 to .73), and submissiveness (.56 to .65). 
All other scales, however, showed adequate
reliability (.70 to .95), and at this point, all
scales were retained for further analysis. 
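
The screening rule and the reliability estimate described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually run: the data are simulated, the function names are hypothetical, and the rule is implemented iteratively (dropping the item with the most nonpositive correlations until none remain), which is one possible reading of the criterion.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for an (n_subjects, k_items) score matrix."""
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - sum_item_var / total_var)

def screen_items(items: np.ndarray) -> list[int]:
    """Return indices of items kept after screening.

    Assumed reading of the rule: repeatedly drop the item with the most
    nonpositive correlations until every remaining pair correlates positively.
    """
    keep = list(range(items.shape[1]))
    while len(keep) > 2:
        r = np.corrcoef(items[:, keep], rowvar=False)
        nonpositive = (r <= 0).sum(axis=0)  # diagonal entries are 1
        if nonpositive.max() == 0:
            break
        keep.pop(int(nonpositive.argmax()))
    return keep

# Simulated (hypothetical) scale: four items share a factor, one runs against it.
rng = np.random.default_rng(0)
factor = rng.normal(size=(150, 1))
good_items = factor + 0.8 * rng.normal(size=(150, 4))
bad_item = -factor + 0.8 * rng.normal(size=(150, 1))
scores = np.hstack([good_items, bad_item])

kept = screen_items(scores)
alpha = cronbach_alpha(scores[:, kept])
print("items kept:", kept)
print("alpha after screening:", round(alpha, 2))
```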

Product-moment correlations. Table 2 
gives zero-order correlations of 6 scales with 
hypothesized bipolar opposites, separately for 
each response format. Correlations between 
pleasure and displeasure ranged from —.43 
with the true-false format to —.71 with the 
Meddis format. Similarly, correlations be- 
tween arousal and sleepiness ranged from 
−.45 with the true-false format to −.62
with the Meddis format. Correlations be- 
tween dominance and submissiveness were 
considerably lower and mostly positive, rang- 
ing from .27 with the true-false format to 
—.05 with the McNair and Lorr format. 
Correlations between Thayer’s high activation
and its hypothesized opposite, general


Table 2
Association Between Scales and Their Hypothesized Opposites

                                                             Nowlis (1965) &      McNair &
                                        Meddis (1972)        Thayer (1967)        Lorr (1964)          True-false
                    Hypothesized               Partial              Partial              Partial              Partial
Scale               opposite            r      r      AQ     r      r      AQ     r      r      AQ     r      r      AQ
Pleasure            Displeasure         −.71*  −.78*  #      −.66*  −.80*  #      −.66*  −.78*  #      −.43*  −.71*  #
Arousal             Sleepiness          −.62*  −.71*  #      −.62*  −.76*  #      −.55*  −.72*  #      −.45*  −.66*  #
Dominance           Submissiveness      …      …             …      …      #      −.05   −.19   #       .27    .06   #
General activation  Deactivation-sleep  −.53*  −.50*         −.58*  −.66*  #      −.50*  −.55*  #      −.38*  −.52*  #
General activation  Depression          −.58*  −.60*         −.42*  −.50*  #      −.39*  −.56*  #      −.20   −.43*  #
High activation     General             −.31*  −.36*  #      −.16   −.30*  #      −.16   −.25*  #      −.14   −.34*  #
                    deactivation

Note. r refers to zero-order correlations between the two scales listed. Partial r refers to a partial correlation with an acquiescence scale
partialed. The column AQ indicates (by the appearance of a number sign, #) that acquiescence correlated significantly (p < .01)
with the first scale listed, partialing the hypothesized opposite. N = 150.

*p < .01.


deactivation, were negative, but low and 
mostly nonsignificant. There were two hy- 
pothesized opposites for Thayer’s general ac- 
tivation scale: Thayer’s deactivation-sleep 
scale and the depression scale. Correlations
between general activation and these two
hypothesized opposites were negative and mostly
significant, but again modest in size.

To test the hypothesis that response for- 
mat influenced the correlation between hy- 
pothesized opposites, each of the six correla- 
tions from the Meddis format was compared 
with its counterpart from each of the other 
three formats.¹ Two pairs of correlations
were found to be significantly different: (a) 
The correlation between pleasure and dis- 
pleasure (—.71) from the Meddis format 
differed from that (—.43) from the true- 
false format (z= 3.85), and (b) the cor- 
relation between general activation and de- 
pression (—.58) from the Meddis format 
differed from that (—.20) from the true- 
false format (z= 4.08). Thus, the true- 
false response format resulted in two sig- 
nificant differences, both in the expected 
direction of a more positive correlation than 
that produced by the Meddis format. In the 
other four comparisons between these two 
formats, the differences were in the same di- 
rection, although not significant. The Meddis 
format did not, however, produce any sig- 
nificantly more negative correlations between 
opposites in comparison with either the Nowlis 
and Thayer or the McNair and Lorr formats. 


Indeed, as shown in Table 2, although 9 of
these 12 comparisons were in the expected
direction, there were no large differences
among corresponding correlations from the
four-place formats. Contrary to Meddis’s
(1972) hypothesis, these results suggest that
the Nowlis and Thayer format contributes at
most a modest bias against obtaining bipolar
dimensions.


The Role of Acquiescence 


Partial correlations. A clearer picture of
the relationship between hypothesized
opposites emerged when acquiescence was taken
into account through partial correlation. For
each pair of scales within a given format, an
acquiescence score was calculated for each
subject by summing his or her responses to
all items (many of them antonyms), except
those items constituting the pair of scales to
be correlated. For example, the acquiescence
score partialed from the correlation between
pleasure and displeasure consisted of responses
to 48 items, which were all the items except
those for pleasure or displeasure. A partial
correlation was then calculated between each
pair of hypothesized opposites, partialing the
appropriate acquiescence score. These results
are also shown in Table 2.

¹A separate test was carried out for each of the
six pairs of scales listed in Table 2, for a total of 18
comparisons. Each test was conducted at the .001
level (one-tailed) so that the probability of a
spuriously significant result would be approximately
.018 (18 × .001). The test used for
the difference between two correlations from
the same sample is discussed in Olkin (1967), but
the formula is misprinted there. The correct equation
was supplied by James H. Steiger through personal
communication.
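
The two steps just described, forming an acquiescence score and partialing it from the correlation between opposites, can be sketched as follows. The sketch uses the standard first-order partial correlation formula. The item names, the correlations of .40 between each scale and acquiescence, and both function names are hypothetical illustrations, not values from the study.

```python
import math

def acquiescence_score(responses: dict[str, int], excluded: set[str]) -> int:
    """Sum responses to all items except those in the two scales being correlated."""
    return sum(value for item, value in responses.items() if item not in excluded)

def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical responses from one subject on a four-category format (coded 1-4).
responses = {"happy": 4, "unhappy": 1, "sleepy": 2, "alert": 3, "calm": 3}
aq = acquiescence_score(responses, excluded={"happy", "unhappy"})
print("acquiescence score:", aq)

# A zero-order correlation of -.43 between opposites; the two correlations
# with acquiescence (.40 each) are assumed values chosen for illustration.
print("partial r:", round(partial_corr(-0.43, 0.40, 0.40), 2))
```

When both scales correlate positively with acquiescence, removing that shared component moves the correlation between opposites in the negative direction, which is the pattern reported in Table 2.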

In all but one case (general activation/deactivation,
Meddis format), partialing acquiescence
shifted the correlation in the expected,
negative direction, that is, in the direction
supporting bipolarity. In order to assess
whether acquiescence contributed significantly
to the variance in the content scales, another
partial correlation was calculated: the correlation
of acquiescence with the first scale listed
in the table, partialing its hypothesized
opposite. A number sign (#) is shown in Table
2 in the column labeled AQ (acquiescence)
whenever this partial correlation was
significant at the .01 level, which occurred in
21 of the 24 cases. Interestingly, all three
exceptions occurred with the Meddis-format
data, suggesting that this format is less
subject to acquiescence bias than the others. In
contrast, the results for the true-false format
were not only significant but sizable, suggesting
that this format is particularly subject to
acquiescence bias.

Pleasure and arousal. Returning to the
partial correlations between content scales,
we can see from Table 2 that for pleasure
and displeasure, these partial correlations
were quite substantial, negative and large
in magnitude, ranging from −.71 to −.80.
This was the case even for the data from
the true-false format, in which was seen the
greatest change due to partialing acquiescence.
Similar results were obtained for
arousal and sleepiness, for which the partial
correlations ranged from −.66 to −.76. Thus,
for these two dimensions, strong evidence of
bipolarity was obtained by partialing
acquiescence, regardless of which response format
was involved.²


Dominance. Partialing acquiescence from
the correlations between dominance and
submissiveness shifted these correlations in the
expected negative direction. Nevertheless, the
partial correlations, ranging from .06 to −.19, did
not differ significantly from zero and
provided no evidence of bipolarity. In light of
the results for pleasure and displeasure and 
for arousal and sleepiness, this finding is 
puzzling. Its explanation seems to be that 
there was little valid variance in either of 
the two scales, but especially submissiveness. 
First, reliabilities for these scales were low. 
Second, a substantial proportion of the re- 
liable variance appears to have been actually 
acquiescence response style: Correlations of 
dominance with an acquiescence score (based 
on all items except dominance or submissive- 
ness ones) ranged from .20 (Meddis format) 
to .42 (true-false format); correlations of 
submissiveness with the same acquiescence 
score ranged from .32 (McNair and Lorr 
format) to .52 (true-false format). In short, 
a considerable proportion of the variance in 
each scale was either unreliable or invalid, 
leaving so little valid variance that meaning- 
ful conclusions are precluded. 

Thayer's scales and depression. Thayer’s
(1967) four scales yielded intermediate
results. The partial correlations for general
activation and its two hypothesized opposites,
depression and deactivation-sleep, ranged
from −.43 to −.66, indicating a consistently
negative relationship, although not a
substantial one. The partial correlations for high


²Since my concern was with the bipolarity of
theoretical dimensions rather than with practical
prediction problems, a correction for attenuation
(unreliability) provided a useful estimate of what these
partial correlations would be if the two scales
correlated were perfectly reliable. The pleasure-displeasure
partial correlation from the Meddis-format
data (−.78) was thus corrected for unreliability (as
estimated by alpha coefficients) in both the pleasure
and displeasure scales; the result was −.88. A similar
procedure for the arousal-sleepiness partial
correlation (−.71) yielded an estimate of −1.02. Such
corrected correlations must be interpreted with caution,
because the use of coefficient alpha as an estimate
of reliability can overcorrect the correlation
(as obviously happened in the case of arousal-sleepiness).
Also, it is not a common practice to disattenuate
partial correlations, although doing so would
appear to be biased on the conservative side, since
after partialing reliable variance, error variance
would constitute a relatively greater proportion of
the remainder. Nevertheless, these results at least
reinforce the view that as various sources of bias
and error are removed, the correlation between these
hypothesized opposites approaches −1.00.
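
The correction for attenuation described in this footnote divides the observed correlation by the square root of the product of the two reliabilities. A minimal sketch follows. The alpha values passed in are hypothetical placeholders chosen for illustration, since the reliabilities of the pleasure, displeasure, arousal, and sleepiness scales are not listed here, and the function name is an assumption.

```python
import math

def disattenuate(r: float, reliability_x: float, reliability_y: float) -> float:
    """Correct a correlation for unreliability in both measures."""
    return r / math.sqrt(reliability_x * reliability_y)

# Hypothetical alphas of .88 and .89 applied to a partial correlation of -.78.
print(round(disattenuate(-0.78, 0.88, 0.89), 2))

# With lower (again hypothetical) alphas of .70 and .69, a correlation of -.71
# is corrected past -1, which illustrates the overcorrection the footnote warns about.
print(round(disattenuate(-0.71, 0.70, 0.69), 2))
```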
Table 3
Multiple Regression Results and Reliability Coefficients for Thayer's Four Scales and Depression

                                                     Standardized regression weight           Reliability
Criterion             Response format                Pleasure   Arousal   Acquiescence   R      for criterion
General activation    Meddis (1972)                    .32        .67        .08         .84       .91
                      Nowlis (1965) & Thayer (1967)    .19        .70        .16         .84       .92
                      McNair & Lorr (1964)             .30        .72        .16         .87       .95
                      True-false                       .15        .67        .24         .80       .93
High activation       Meddis                          −.64        .38        .32         .68       .65
                      Nowlis & Thayer                 −.39        .40        .37         .57       .59
                      McNair & Lorr                   −.52        …          .35         .68       .69
                      True-false                      −.42        .35        .55         .72       .66
General deactivation  Meddis                           .43       −.69        .14         …         .78
                      Nowlis & Thayer                  .35       −.58        .33         .49       .76
                      McNair & Lorr                    .27       −.60        .34         .48       …
                      True-false                       .28       −.59        .34         .41       .81
Deactivation-sleep    Meddis                                     −.66        .23         .66       .88
                      Nowlis & Thayer                            −.83        .29         …         .90
                      McNair & Lorr                              −.80        .29         …         .92
                      True-false                                 −.81        .42         …         .88
Depression            Meddis                           …          …          .22         .90       …
                      Nowlis & Thayer                  …         −.22        .33         .87       …
                      McNair & Lorr                   −.74       −.28        .40         .84       …
                      True-false                      −.69        …          …           .84       …

Note. Each regression equation was based on 150 cases. All values of regression weights and multiple correlations
that are entered were significant at the .01 level. Reliability of each criterion was estimated by
Cronbach’s (1951) coefficient alpha measure of internal consistency.


activation and its hypothesized opposite, gen- 
eral deactivation, ranged from —.25 to —.36, 
indicating a more modest but still negative 
relationship. Thus, although inversely re- 
lated, Thayer’s scales did not form exact bi- 
polar opposites as Meddis (1972) had pre- 
dicted. 


A Bipolar Affective Space 


Regression analyses. The results so far
were far clearer for pleasure and displeasure
and for arousal and sleepiness than for
Thayer’s four scales and depression. A series
of multiple regression analyses helped to
clarify the relationships among the latter five
scales and their relationship to pleasure-displeasure
and arousal-sleepiness as well.
Multiple regression analysis was used specifically
to test the hypothesis that pleasure-displeasure
and arousal-sleepiness can adequately
account for Thayer’s four scales and
depression.

To define variables for the equations, scores
on all scales were first standardized (M =
0; SD = 1). Displeasure scores were then
subtracted from pleasure scores, and sleepiness
from arousal, to form pleasure-displeasure
and degree-of-arousal predictor
variables. For the third predictor variable, a
separate acquiescence score was calculated
for each regression equation by summing
scores on all items other than the criterion or
those scales already part of that equation.
Separate multiple regression equations were
then computed from data for each response
format for five criterion variables: each of
Thayer’s four scales plus depression. The
results are shown in Table 3.

Multiple correlation coefficients were
“shrunken” to provide estimates of the
correlations in the population between the scores
on the criterion predicted from the equation
and those actually obtained. These multiple
correlations were substantial and significant
(p < .01) in every case. Since unreliability
in either the criterion or the predictors limits
the possible magnitude of the multiple
correlation coefficient, the last column of Table
3 provides reliability estimates for each
criterion variable. As can be seen, the multiple
correlations for all scales except Thayer’s
general deactivation generally approached the
limit set by the reliabilities.
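
The construction of the predictors and the regression itself can be sketched as follows. This is a minimal illustration on simulated data, not a reanalysis: the latent structure, the variable names, and the shrinkage formula (the usual adjusted R squared) are all assumptions, since the article does not say which shrinkage formula was applied.

```python
import numpy as np

def zscore(x: np.ndarray) -> np.ndarray:
    """Standardize to M = 0, SD = 1 (column-wise for 2-D input)."""
    return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

def standardized_regression(criterion, predictors):
    """Return (standardized weights, R, shrunken R)."""
    zy = zscore(np.asarray(criterion, dtype=float))
    zx = zscore(np.asarray(predictors, dtype=float))
    beta, *_ = np.linalg.lstsq(zx, zy, rcond=None)
    r2 = 1.0 - ((zy - zx @ beta) ** 2).sum() / (zy ** 2).sum()
    n, k = zx.shape
    # Assumed shrinkage formula: adjusted R squared.
    r2_shrunken = max(0.0, 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1))
    return beta, float(np.sqrt(r2)), float(np.sqrt(r2_shrunken))

# Simulated (hypothetical) scale scores for 150 subjects.
rng = np.random.default_rng(1)
n = 150
p, a, q = rng.normal(size=(3, n))  # latent pleasure, arousal, acquiescence

def noise():
    return 0.5 * rng.normal(size=n)

pleasure, displeasure = p + 0.5 * q + noise(), -p + 0.5 * q + noise()
arousal, sleepiness = a + 0.5 * q + noise(), -a + 0.5 * q + noise()
acquiescence = q + noise()
criterion = 0.4 * p + 0.7 * a + 0.3 * q + noise()

# Differences between standardized opposites form the two bipolar predictors.
pleasure_displeasure = zscore(pleasure) - zscore(displeasure)
degree_of_arousal = zscore(arousal) - zscore(sleepiness)
X = np.column_stack([pleasure_displeasure, degree_of_arousal, acquiescence])

beta, R, R_shrunken = standardized_regression(criterion, X)
print("weights:", np.round(beta, 2), "R:", round(R, 2), "shrunken R:", round(R_shrunken, 2))
```

Subtracting standardized opposites cancels most of the acquiescence component the two scales share, which is why acquiescence is entered as its own predictor.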

A reinterpretation of Thayer's activation
dimensions. The standardized regression
weights shown in Table 3 provided definitions
for the five criterion variables that considerably
clarified their meaning. Approximately
the same information is also shown graphically
in Figure 1, which presents loadings of
the 11 affective scales on their first two principal
components, since these components
were readily interpretable as pleasure-displeasure
and degree of arousal. Both analyses
showed Thayer’s high activation and general
activation to contain the expected high
degree of arousal. The difference between
these two scales, however, was as much the
amount of pleasure as it was the degree of
arousal, with general activation involving
pleasure and high activation involving
displeasure. Similarly, Thayer’s deactivation-sleep
and general deactivation scales contained
the expected low arousal, but again
the two scales differed on pleasure, with general
deactivation involving pleasure but
deactivation-sleep involving only arousal. In
short, Thayer’s (1967, 1970) factor-analytic
results were multidimensional not because
the arousal continuum is multidimensional
but because (among other things) his items
varied on the pleasure dimension as well as
on arousal.

[Figure 1. First two principal components. Labeled points include Arousal, General Activation, High Activation, Pleasure, Dominance, and General Deactivation.]

In Figure 1, general activation is shown to 
be the approximate bipolar opposite of both 
deactivation-sleep and depression (as was 
suggested by the partial correlations of 
Table 2). General activation fell within the
high pleasure/high arousal quadrant. Its
hypothesized opposites, deactivation-sleep and
depression, both fell within the opposite
quadrant (displeasure/low arousal), but 
neither fell exactly opposite general activa- 
tion. The regression equations of Table 3 
similarly showed deactivation-sleep to involve 
too little, and depression too much, displea- 
sure to be the precise bipolar opposite of 
general activation, thus explaining the mod- 
erate magnitude of their negative correlations 
shown in Table 2. 

Similarly, Thayer’s high activation fell 
within the displeasure/high arousal quadrant, 
and its hypothesized opposite, general deac- 
tivation, fell within the opposite quadrant, 
high pleasure/low arousal. These two scales 
were only slightly negatively correlated 
rather than exact bipolar opposites, how- 
ever, because each is maximally saturated 
with a different factor: High activation is 
mainly displeasure with a moderate compo- 
nent of arousal; general deactivation is 
mainly low arousal with a moderate
component of pleasure.

Even though Thayer’s four scales did not
form exact bipolar opposites, the results sum- 
marized in Figure 1 and Table 3 show that 
they are part of a bipolar space. Presumably, 
their exact bipolar opposites could be found 
if adequate terms were included in the sam- 
ple of terms studied. From this point of view, 
Thayer’s scales are considerably mislabeled. 
Since both general activation and general de- 
activation involve pleasure, “general” could 
be more precisely replaced with “pleasant.” 
More specifically, general activation assesses 
high pleasure combined with high arousal 
and, with items such as lively and vigorous, 
could be relabeled “vigor” or “excitement.” 
General deactivation assesses high pleasure 
combined with low arousal and, with items 
such as calm and placid, could be relabeled 
“calmness” or “relaxation.” Deactivation- 
sleep (with items sleepy, tired, drowsy) is 
appropriately labeled, but high activation as- 
sesses displeasure combined with high arousal 
and, with items such as jittery and clutched 
up, could be relabeled “distress.” 


Implications 


As the obscuring influence of methodologi- 
cal biases was reduced in this study, it could 
be seen that a bipolar affective space, de- 
fined by pleasure-displeasure and degree of 
arousal, was able to represent the relation- 
ships among scales of pleasure, displeasure, 
arousal, sleepiness, depression, and Thayer’s 
(1967) four factors of activation. Evidence 
from a previous study (Russell & Mehrabian, 
1977) showed that an even greater range of 
affective states could be represented in the 
same way. In a series of regression analyses 
similar to those shown in Table 3, pleasure— 
displeasure and degree of arousal were found 
to account for almost all of the reliable vari- 
ance in a sample of 42 commonly used scales 
of affect (e.g., scales of happiness, elation, 
anger, fear, anxiety, and depression). In an- 
other study (Russell, 1978), it was shown 
that the same dimensions of pleasure and 
arousal have emerged as the two major di- 
mensions of affect from studies of verbal self- 
report, semantic differential ratings, succes- 
sive-intervals scaling, and multidimensional 
scaling of affect terms. In other words, diverse
sources of evidence are converging on a
representation of affect similar to that depicted
in Figure 1.

This proposed bipolar affective space may 
appear incomplete, however, because it is 
characterized by only two dimensions. Indeed,
evidence has consistently confirmed the
existence of dimensions of affect independent
of pleasure and arousal. For example, in the
Russell and Mehrabian (1977) study,
dominance-submissiveness accounted for a
significant proportion of the variance in scales of
affect beyond that accounted for by pleasure
and arousal. In the Russell (1978) study,
multidimensional scaling of affect terms was
used to examine more closely the dimensions
of affect beyond pleasure and arousal. Results
provided evidence for at least three such
dimensions, interpreted as
dominance/potency/control, depth or importance, and locus
of causation.

Nevertheless, I would argue against the
use of more than two orthogonal dimensions
in a conceptualization of affect. The dimensions
beyond pleasure and arousal have
consistently been found to differ from pleasure
and arousal in several important ways. In the
Russell and Mehrabian (1977) study, pleasure
and arousal accounted for by far the
major proportion of variance in affect scales,
with dominance accounting for a quite small
amount. In the Russell (1978) study, the
dimensions beyond pleasure and arousal (a)
were found to be components of some, but
not all, affect terms and (b) were interpreted
as referring to the antecedents and consequences
of the emotional state rather than
to the emotion per se.

An examination of the proposed bipolar
affective space as represented in Figure 1
indicates that this space lacks “simple
structure.” The various affective states do not
cluster about the axes, but appear to fall
meaningfully (except for dominance and
submissiveness) around the perimeter of the
space. This observation is supported by the
regression results of Table 3 and by those of
Russell and Mehrabian (1977), in which
affective states were typically found to be
defined as combinations of pleasure-displeasure
and degree of arousal. It may be arbitrary,
therefore, where the axes are located within
the space. A rotation of the axes
shown in Figure 1 would yield two
bipolar axes that could be labeled
“excitement-depression” and “distress-relaxation.”
More generally, the placement of the various
affective states in relation to each other within
the space appears to be more meaningful than
the placement of the axes within the space.
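
The effect of relocating the axes can be illustrated with a small calculation. The sketch below assumes a 45° rotation and places four idealized states on the unit circle midway between the pleasure and arousal axes. Both the angle and the coordinates are illustrative assumptions, not values read from Figure 1.

```python
import math

def rotate_axes(point: tuple[float, float], degrees: float) -> tuple[float, float]:
    """Coordinates of a point after rotating the axes counterclockwise."""
    t = math.radians(degrees)
    x, y = point
    return (x * math.cos(t) + y * math.sin(t),
            -x * math.sin(t) + y * math.cos(t))

c = math.sqrt(0.5)
# Idealized (pleasure, arousal) positions; hypothetical, for illustration only.
states = {
    "excitement": (c, c),      # high pleasure, high arousal
    "depression": (-c, -c),    # displeasure, low arousal
    "distress": (-c, c),       # displeasure, high arousal
    "relaxation": (c, -c),     # high pleasure, low arousal
}

for name, point in states.items():
    new_x, new_y = rotate_axes(point, 45)
    print(f"{name:10s} -> axis 1: {new_x:+.2f}, axis 2: {new_y:+.2f}")
```

Under these assumptions, excitement and depression fall at opposite ends of the first rotated axis, and distress and relaxation at opposite ends of the second. The positions of the states relative to one another are unchanged by the rotation.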

The proposed bipolar affective space raises
another issue concerning the conceptualization
of affect. The proposed space is based
on a definition of the domain of affect that
includes low arousal states, such as sleepiness
and relaxation, which are not commonly
included on psychologists’ lists of the emotions.
For example, Tellegen³ defines affect
as aroused feelings. His proposed conceptualization
of affect, which involves two dimensions,
positive affect and negative affect,
would therefore be expected to fall solely
within the high arousal quadrants of Figure
1, and, indeed, his two dimensions appear to
roughly correspond to the rotated dimensions
of excitement and distress, respectively,
described above. The bipolar opposites of
Tellegen’s dimensions, however, fall outside the
domain of aroused feelings. Tellegen’s dimensions
therefore ought not to be bipolar within
his domain, and this appears to be his conclusion.
The present study showed that an empirical
relationship exists between high and
low arousal states and, therefore, that it is
meaningful to group the two together within
the same domain. The Russell and Mehrabian
(1977) study reached the same conclusion
by showing that low arousal states were
definable on the same pleasure and arousal
dimensions required to define the high arousal
states. From this point of view, a fully adequate
description of a person’s affective state
at any time therefore must include the
possibility of these more mundane, frequently
encountered, low arousal states
as well as the more dramatic emotions.


³Tellegen’s conceptualization is described in Hall
(1977) and was elaborated through personal
communication. Needless to say, only the
present author’s approximation to his
conceptualization is presented here.


The Generalizability of Salience Effects


Shelley E. Taylor, Jennifer Crocker, Susan T. Fiske, 
Merle Sprinzen, and Joachim D. Winkler 


Harvard University 


Stimuli that engulf attention often have a disproportionately large impact on 
the judgment process, even when logically irrelevant. The boundary conditions 
of such salience effects were examined in a scenario where subjects observed a 
dyadic conversation in which the visual prominence of one or another partic- 
ipant was manipulated. Two hypotheses were examined: (a) Salience effects 
are dependent on quantity of information encoded and disappear at low levels 
of attention, and (b) salience effects will disappear if higher involvement in 
the situation serves to heighten and focus attention on more relevant cues. 
Neither hypothesis was supported. Rather, salience effects continue to be found 
(a) when the perceiver is distracted, (b) whether the perceiver’s impressions 
are assessed immediately or after a delay, (c) when the conversation has high 
interest value, (d) regardless of the perceiver’s cognitive tuning set, and (e) 
when the perceiver is involved in the discussion. We conclude that salience 
effects are highly generalizable and that they have a significant impact on both 
trivial and important social judgments. We suggest that salience effects are not 
customarily under the control of the social perceiver, but may rather be automatic. 


Considerable research leads to the conclusion
that logically uninformative, irrelevant,
but salient social stimuli have a disproportionately
large impact on the judgment process
(see Taylor & Fiske, 1978, for a review).
In general, the form of this “salience” effect
is that people attribute causality to stimuli
that engulf their attention. Taylor and Fiske
(1975), for example, found in two separate
studies that when one’s attention is engulfed
by a particular person in the environment,
one sees that person as the causal agent in the
conversation. Storms (1973) found that when
an individual’s own behavior or that of an- 
other person is made salient via videotaped 
playback, the salient information is a strong 
influence on judgments of causality. Duval 
and Wicklund (1972) found that the self is 
perceived as more of a causal agent when at- 
tention is self-focused than when it is not, and 
subsequent research has upheld this finding
(see Wicklund, 1975; see also McArthur & 
Post, 1977). 

Since the early studies that focused on 
judgments of causality, it has emerged that 
salient stimuli may have an impact on per- 
ceptions other than causal ones. Specifically, it 
appears that the evaluative qualities of salient 
stimuli are exaggerated, more is learned about 
them, they are perceived as more representa- 
tive of their social group, and they are more 
“available,” that is, more easily brought to 
mind (Tversky & Kahneman, 1974).¹ (See


¹ Some studies (e.g., McArthur & Post, 1977;
Storms, 1973) suggest that salient people’s behavior 
is more likely to be perceived as dispositionally rather 
than situationally based. However, the evidence on 
this point is equivocal (see Taylor and Fiske, 1978). 
Taylor & Fiske, 1978, for a review of the
literature on these points.) Taylor, Fiske, 
Close, Anderson, and Ruderman (Note 1), 
for example, found that a black (in an other- 
wise white group), male (in an otherwise fe- 
male group), or female (in an otherwise male 
group) drew off a disproportionate amount 
of attention, was perceived as being more
prominent and influential, was evaluated more 
extremely, and was somewhat more stereo- 
typed (perceived as representative) than was 
a comparable individual in a racially or sex- 
ually mixed group. Several studies have found 
that more information is retained about a 
salient other (e.g., McArthur & Post, 1977; 
Taylor & Fiske, 1975). Pryor and Kriss 
(1977) found that salient items in sentences 
(subjects as opposed to objects) were both 
more available (i.e., recalled faster, though 
not better) and perceived as more causal. 

Salience effects are of more than theoretical 
interest. They have been posited as the medi- 
ators of a number of important social pro- 
cesses. Taylor et al. (Note 1), for example, 
have suggested that the salience of a novel 
stimulus helps explain reactions to a solo 
black, a solo woman, or a solo man in a token 
integration situation. Wicklund (1975) has 
suggested that self-esteem is influenced by
whether the self is salient. The salience of
particular cognitions affects people’s social 
judgments and self-perceptions (e.g., Kiesler, 
Nisbett, & Zanna, 1969; Ross, Lepper, & 
Hubbard, 1975). Salience explanations have
been proffered for emotional disorders such as
test anxiety (Wine, 1971) and stuttering 
(Storms & McCaul, 1976). However, the 
boundary conditions around salience effects 
have yet to be adequately explored. 

One hypothesis (Taylor & Fiske, 1975, 
1978) is that salience effects depend on some 
minimal level of attention during exposure to 
the initial stimulus conditions. According to 
this hypothesis (Hypothesis 1), salience ef- 
fects would not be expected to emerge if at- 
tention (as indexed by recall) were attenu- 
ated. 
A second hypothesis (Hypothesis 2), which
is in some respects opposite to the first, is
that salience effects are characteristic of low
levels of intensity of attention and that if a
variable were introduced that increased the
individual’s intensity of attention, salience
effects would not emerge. The assumption
underlying such a hypothesis would be that
people pay attention to trivial but salient
stimuli when there is nothing else worth
watching more closely, but that when more
interesting or informative stimuli are present,
degree of attention is heightened and focused
such that otherwise salient but trivial stimuli
are ignored. Such variables might include a
high level of interest value in the content of
material exchanged or the subjects’ involvement
in the information exchanged. Since
much of the research demonstrating salience
effects has been conducted in unarousing,
uninvolving, and low information contexts, the
question of whether salience effects will
generalize to more engrossing, arousing, and
involving settings is a very important one (see
McArthur & Solomon, 1978).

Hypotheses 1 and 2 are not necessarily
incompatible. It may be that some minimal
level of attention is required to obtain salience
effects, but that at the highly intense levels of
attention produced by high degrees of subject
involvement, salience effects will disappear.

Three studies were conducted to examine
the hypothesized bases of salience effects. In
the first, subjects participated in a point-of-view
experiment in which their information
intake was either severely reduced or not by
a distraction task. In the second and third
studies, a variety of manipulations was
introduced to see if increasing the subject’s level
of attention through involvement in the content
of the material exchanged would alter salience
effects. Failure to obtain salience effects under
conditions of high distraction would constitute
evidence for Hypothesis 1. Failure to obtain
salience effects under conditions of increased
attention would constitute evidence for
Hypothesis 2.


Experiment 1 


Experiment 1 examined whether salience
effects depend on differential encoding of
information during exposure to the initial
stimulus conditions. In this study, subjects were
distracted during either the encoding stage
(i.e., when observing the conversation during
which salient information was provided) or
the retrieval stage (i.e., when rating the two
participants). Crossing these two variables
enabled us to see if salience effects represent
an encoding or a retrieval bias or a combination
of the two.


Method 


Subjects


Subjects were 100 Harvard-Radcliffe undergraduates
who signed up for a “getting-acquainted”
study. They were run in groups of 8 to 9 subjects,


subjects. 

When subjects arrived for the experimental session,
the experimenter explained that an informal
conversation would involve two of them and that
the rest of them would be observers. The experimenter
“picked” the two (male) confederates to hold
the conversation and seated them facing each other,
with the other subjects arranged as observers. Four
subjects sat behind each confederate, and if a ninth
subject was present, that person sat in the control
condition. That is, for four subjects, one of the
confederates (A) engulfed their visual attention, and
for the other four subjects, the other confederate
(B) engulfed their visual attention. Control subjects
had equal visual access to both confederates. This
procedure has been successfully used as a manipulation
of salience in previous studies (see, for example,
Taylor & Fiske, 1975).


Procedure 


The observers were instructed that the experiment
was concerned with several specific aspects of the
conversation, so that separate instruction sheets
would be passed out to each person. Subjects were
told that their task might seem odd, but that both
psychological and linguistic aspects of interaction
were being studied in the experiment. All subjects
were told to listen to the conversation without
participating in it and without taking notes. For the
undistracted condition, the instructions read:


As you are watching and listening, please observe
the conversation in general. Form an impression
of the entire discussion. Imagine you are overhearing
two people in a public place. Afterwards,
we will be asking you about your observations of
these people and the entire conversation.


The other half of the subjects received instructions
that constituted the encoding distraction manipulation:


As you are watching and listening, please observe


con 4 
ey! usage of pronouns, Count the 
ya r nay and you are used, Keep a com- 

‘or the conversation as a whole. 
Afterwards, we will be asking you about your
observations of these people and the entire
conversation.


The confederates then carried on a “getting-ac- 
quainted conversation,” as if they had just met. They 
exchanged information about summer jobs, college 
major, career plans, hometown, and the like. The
experimenter stopped the conversation after they
had exchanged the standard information. The
confederates were asked to leave and wait down the hall.

At this point, the decoding distraction manipulation 
was introduced. Half the subjects were seated at 
chairs with writing surfaces, and they were told to 
fill out the impression questionnaire. The other half 
of the subjects were taken into the next room and 
seated at a large table. The experimenter mentioned 
that someone else was working in the room, but had 
said that the subjects could use it. When the subjects
entered the room, they found a second experimenter 
setting up audiovisual materials ostensibly for an 
unrelated experiment. As these subjects attempted 
to fill out the dependent measures, the second ex- 
perimenter continually readjusted the volume of a 
tape recording, changed and focused colorful slides, 
and generally bombarded the subjects with distract- 
ing audiovisual stimuli. 


Dependent Measures 


Subjects rated both confederates, and the order of 
rating (salient first/nonsalient first) was counter- 
balanced. For each confederate, subjects were asked 
first to recall what he had said. They then rated how
friendly, talkative, and nervous he was, to what
extent each of these qualities was dispositionally
caused, and to what extent it was situationally caused.
The last five questions assessed causal influence in 
the conversation: To what extent did this person 
set the tone of the conversation between the two 
people? To what extent did this person determine 
what kind of information would be exchanged dur- 
ing this conversation? To what extent did this per- 
son cause his partner to behave as he did? To what 
extent did you form an impression of this person? 
And, how much did this person influence the con- 
versation as a whole? All responses were on 9-point 
scales with labeled endpoints. 

After subjects had completed recall, dispositional- 
situational, and conversational causality measures 
for each confederate, they were asked to report the 
results of their observation task (if any), for ex- 
ample, the number of pronouns they had counted. 
They were also asked how successful they felt at the 
observation task, how much the situation interfered 
with the observation instructions, and how much 
the situation interfered with filling out the
questionnaire. When they had finished, subjects were paid $2
each and fully debriefed. None expressed suspicion 
about the study’s purpose. 
Results


The dependent measures were analyzed by 
a four-way analysis of variance. The between- 
subjects variables were encoding distraction 
(distracted/not), retrieval distraction (dis- 
tracted/not), and confederate faced (A/B). 
Confederate rating (A/B) was a within-subjects
variable for all measures except the
manipulation checks. 


Manipulation Checks 


On the question assessing the encoding dis- 
traction manipulation, subjects who counted 
pronouns felt that the observation situation 
(i.e., counting pronouns) interfered significantly
more than did undistracted subjects,
F(1, 91) = 4.84, p < .03; Ms = 4.80 vs.
3.72. There were no other significant effects
in that analysis. 

On the question assessing ease of filling out 
the questionnaire (the retrieval distraction 
manipulation check), retrieval distraction sub- 
jects reported no more difficulty than no-re- 
trieval distraction subjects; however, retrieval 
distraction subjects took much longer to fill 
out the questionnaires than did the no-retrieval 
distraction subjects. This informal observa- 
tion indicated that despite the lack of dif- 
ference on the manipulation check, the ma- 
nipulation of retrieval distraction may have 
been successful. (On the other hand, subjects 
may have circumvented the retrieval distrac- 
tion manipulation by taking longer.) Interest- 
ingly, it was the encoding distraction subjects 
who showed a difference on this distraction 
measure, F(1, 90) = 6.64, p < .02. Apparently,
the encoding distraction subjects’ difficulty
in observing the conversation made them
find it hard to answer the questions. This was 
particularly true for the distracted subjects 
facing Confederate B, F(1, 90) = 4.80, p < .03.
There were no other significant effects.
Finally, the question assessing how successful 
subjects felt at the observation task, a po- 
tential covariate for later analyses, showed no 
effects at all. 


Causality Measures 


The five conversational causality measures 
(set tone, determine information exchanged, 
Table 1
Causality and Evaluation Ratings: Experiment 1

                                  Rating
Condition     Causality index   Friendly   Talkative   Nervous

Facing A
  Rate A           22.58          6.56       6.40       4.13
  Rate B           21.80          6.30       4.91       4.90
Facing B
  Rate A           18.85          5.72       5.42       5.10
  Rate B           26.11          7.49       5.79       4.06

Note. Causality index values range from 5 to 45;
other scales were 9-point rating scales. Higher values
indicate greater rated perception. F values for
salience interaction effect range from 5.25 to 22.53
with 1 and 75 degrees of freedom; ps range from
<.03 to <.001.


cause partner’s behavior, influence conversation,
form impression) were summed to form
a general index of causality, and the analysis
showed that the salience effect emerged
clearly, F(1, 91) = 13.29, p < .001. Means
are presented in Table 1. Importantly, this
effect did not interact with either distraction
manipulation, either separately or combined
(all Fs < 1). Thus, the data show a simple
salience effect, uninfluenced by both distraction
manipulations.
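
As a rough illustration of how such an index is formed, the following Python sketch sums five 9-point ratings into a causality index ranging from 5 to 45 and computes a simple salient-minus-nonsalient contrast from the causality index means in Table 1. The item names, function names, and the contrast itself are illustrative assumptions; the reported analysis was a four-way analysis of variance, which this sketch does not reproduce.

```python
# Hypothetical item keys for the five conversational causality questions.
CAUSALITY_ITEMS = (
    "set_tone",
    "determine_information",
    "cause_partner_behavior",
    "influence_conversation",
    "form_impression",
)


def causality_index(ratings):
    """Sum five ratings, each on a 1-9 scale, into an index from 5 to 45."""
    for item in CAUSALITY_ITEMS:
        if not 1 <= ratings[item] <= 9:
            raise ValueError(f"{item} must be between 1 and 9")
    return sum(ratings[item] for item in CAUSALITY_ITEMS)


# Causality index means from Table 1, keyed by (viewpoint, confederate rated).
TABLE_1_CAUSALITY = {
    ("facing A", "A"): 22.58,
    ("facing A", "B"): 21.80,
    ("facing B", "A"): 18.85,
    ("facing B", "B"): 26.11,
}


def salience_contrast(means):
    """Mean rating of the faced (salient) confederate minus the nonsalient one.

    This is an unweighted average of cell means, used only for illustration.
    """
    salient = (means[("facing A", "A")] + means[("facing B", "B")]) / 2
    nonsalient = (means[("facing A", "B")] + means[("facing B", "A")]) / 2
    return salient - nonsalient


if __name__ == "__main__":
    print(round(salience_contrast(TABLE_1_CAUSALITY), 2))
```

A positive contrast means that, averaged over the two viewpoints, the confederate a subject faced received the higher causality rating.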
In addition to the salience effect, the causality
ratings showed a main effect for confederate,
F(1, 91) = 8.65, p < .005, such that
one confederate (B) was consistently seen as
more causal than the other. This confederate
effect accounts for the reversal of the predicted
mean pattern in the no-distraction,
facing A cell, but does not affect the salience
results. The question “How much did this person
talk?” (also a measure of prominence)
showed salience effects such that the salient
actor’s estimated talking time was
higher, F(1, 92) = 16.90, p < .001. There
was also a confederate effect on this measure,
F(1, 92) = 6.16, p < .02.
The dispositional and situational causality
measures were summed to form separate indexes
of each. No salience effects emerged, nor were
there any other significant effects.


measles 


effect of salience on recall 
However, the encoding dis- 
g pronouns) manipulation re- 
‘subjects remembering virtually 
nversation, F(1, 92) = 20.45, 
ed with subjects who simply 
t any distraction. Recall also 
in effect for confederate, F(1, 92) 
001. 


two evaluative measures: 
ervous, Salience effects emerged 
dliness rating, F(1, 92) = 41.76, 
ch that the salient actor was seen 
lly friendly. Similarly, salience ef- 
ed on the nervousness rating, 
24, p < .02, such that the con- 
ibject faced was seen as relatively 
than the nonsalient confederate. 
ality evaluations, the salient 
differed substantially from the 
or’s rating. Since both means 
ne side of the midpoint, and the 
ratings were always more ex- 
ittern represents an exaggerated 
personality attributes due to 
Pattern similar to that found in 
dies (Taylor & Fiske, 1978). 
lables also showed confederate 
endly, F(1,92) = 23.09, p< 
l nervous, F(1,91) = 3.18, p< 
lly, the encoding distraction influ- 
ts to rate both confederates as 
F(1,92) = 9.51, p < .003. 
In sum, then, salience effects on
causality ratings and evaluations were found.
The hypothesis that such effects
depend on differential encoding of information
during exposure to initial stimulus
conditions was not supported, however.


Experiment 2 


Experiment 2 was designed as a test of
Hypothesis 2, that salience effects are eliminated
when degree of attention is heightened
by a more engaging or informative
situation. In order to examine conditions
of greater environmental generalizability, we
manipulated intensity of attention by having
subjects observe a debate on a current issue 
with which they were either involved or un- 
involved. In addition to examining the usual 
dependent variable of causality (the main 
test of the hypothesis), we included secon- 
dary measures of persuasiveness, attitudes 
toward the issue, and attitudes toward the 
debate to see if they would be affected by 
salience. Specifically, if salience exaggerates 
affect, one’s beliefs about a debater’s per- 
suasiveness, efficacy, and the like should in- 
crease when one agrees with the debater’s 
position and he or she is salient, and they 
should decrease when one disagrees with the 
debater and he or she is salient. 


Method 
Subjects 


Ninety-one Harvard-Radcliffe undergraduates (46
males, 45 females) participated as paid volunteers
and were selected on the basis of their position on
the issue debated and the importance of that issue
to them. Involvement and position on the issues
were assessed by drawing subjects from a pool of
participants in an “attitude survey” ostensibly
unconnected with the present study and run by a
different experimenter. The attitude survey asked
participants to indicate their position on each of 34
different issues and to indicate the importance of
each issue to them. Two issues were used in the
present study: banning cigarettes and alimony for
unmarried couples. Subjects selected for the
involved conditions rated the issue significantly more
important to them (M = 5.67 on a 9-point scale)
than uninvolved subjects (M = 2.50, t = 13.25,
p < .001). There was a small but nonsignificant
difference between the involved and uninvolved
subjects on their position on the issue, with involved
subjects slightly more opposed (M = 1.38) than
uninvolved subjects (M = 1.72, t = 1.40, ns).

All the survey participants were informed that 
several people in the psychology department needed 
subjects for experiments and that they should leave 
their name and number if they were interested in 
participating as paid volunteers in studies
unconnected to the survey.


Procedure 


Survey participants who were opposed on one of 
the two issues were contacted by telephone and re- 
cruited for an experiment on persuasion. The same 
procedure as in Experiment 1 was used, with the 
following changes. One involved subject and one 
uninvolved subject were assigned to each of the 
three viewpoint conditions. To support the impres- 
sion that the confederates were simply subjects 
assigned to the debater role, the confederates were
given 2 minutes to think about and jot down argu- 
ments supporting the side of the issue they had been 
assigned to argue. Two different pairs of confed- 
erates assisted with the study, and each confederate 
argued the “pro” side of the issue in some groups 
and the “anti” side in other groups. 


Dependent Measures 


Subjects rated both confederates, and the order of 
rating (salient first/nonsalient first) was counter- 
balanced. Subjects were asked to recall all the per- 
sonal information and arguments that each of the 
debaters gave. 

Subjects then completed 9-point measures assessing 
which confederate was more causal (To what extent 
did this person set the tone of the discussion? How 
much did this person talk in the discussion?);
which confederate was more persuasive (How much
did this person persuade the other?); replicability
of the outcome (If a rematch of this discussion were 
arranged, how different would the outcome be?); 
and mutability of the outcome (How likely is it that 
this person could have said anything different to be 
more persuasive?). Subjects also judged which 
confederate had won the argument, which confed- 
erate was more convincing, and which one was more 
thorough. 

Finally, the subjects were asked to indicate their 
position on the issue, and they filled out several 
manipulation checks (How interesting did you think 
the discussion was? How much did you care who 
won the argument? How much do you care about 
this issue? How personally involved did you feel 
in the discussion?).

When all the subjects in the group had completed 
the dependent measures, they were carefully de- 
briefed, asked for their impressions of the study, 
informed of its purpose, and paid $1.50 each. During
the debriefing, two subjects indicated a suspicion
that the debaters were actually confederates
of the experimenter. When questioned, they indicated 
no understanding of the hypothesis of the study, 
so the data from these subjects were included in 
the analysis. Subjects were asked not to discuss the 
study with other potential subjects. 


Results 
Manipulation Checks 


Four manipulation checks of involvement 
were included, and analysis indicated that the 
involvement manipulation was successful. In 
response to the question, How much do you 
care about this issue? involved subjects indi- 
cated that they cared significantly more (M 
= 4.67) than uninvolved subjects (M = 
3.01), F(1, 78) = 22.94, p < .001. Involved
subjects also reported feeling significantly
more personally involved in the discussion
(M = 3.53) than uninvolved subjects (M =
2.34), F(1, 78) = 7.46, p < .008. Differences
between the involved and uninvolved subjects
were not significant on their ratings of
how interesting they found the discussion
(involved M = 3.74; uninvolved M = 3.46),
F(1, 78) = .50, p < .49, or how much they
cared who won (involved M = 2.58; unin 
volved M = 2.12), F(1, 78) = 1.33, p< 6) 


Causality Measures 


Ratings on two causality measures used in
earlier studies (To what extent did this person
set the tone of the discussion? and How
much did this person talk in the discussion?)
were correlated .54 (p < .001) and did not
correlate higher than .25 with any other
dependent variables. Accordingly, a causality
measure was created for each confederate by
summing the talkativeness and the setting-the-tone
ratings. A repeated measures analysis
of variance with viewpoint (facing pro,
facing con, or control), involvement (2 levels),
and issue (2 levels) as between-subjects
factors and person rated (pro/con) as the
within-subjects factor was performed on the
causality index. The point-of-view effect
(Viewpoint × Person Rated) was weak and
nonsignificant, F(2, 79) = 1.00, p < .40.
However, the interaction of the point-of-view
effect with involvement (the Viewpoint ×
Person Rated × Involvement interaction)
was significant, F(2, 79) = 4.95, p < .01, and
revealed that the point-of-view effect was
strong for highly involved subjects and
irregular for low-involvement subjects, as the
means in Table 2 indicate. There was also a
significant Viewpoint × Person Rated ×
Involvement × Issue interaction on the causality
item, which showed that the failure to
replicate the point-of-view effect for uninvolved
subjects was mainly due to a reversal
of the predicted mean pattern for uninvolved
subjects in the smoking issue condition,
F(2, 79) = 3.65, p < .04. Finally, there was
a significant main effect for person rated, such
that subjects rated the person they agreed
with (con) as more causal than
the person they disagreed with (M = 6.51),
F(1, 79) = 4.90, p < .03.


Table 2
Mean Causality Ratings as a Function of
Involvement: Experiment 2

                        Involvement

                  High               Low
              Person rated       Person rated
Viewpoint      A       B          A       B

Facing A      7.32    6.53       5.88    7.80
Control       7.20    7.40       5.86    6.86
Facing B      5.13    8.69       7.37    7.39

Note. Subjects agreed with Confederate B and
disagreed with Confederate A; higher numbers
indicate greater causality. Viewpoint × Person
Rated × Involvement interaction was significant,
F(2, 79) = 4.95, p < .01.
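
To make the pattern behind the Viewpoint × Person Rated × Involvement interaction concrete, the following Python sketch computes a simple faced-minus-unfaced contrast from the Table 2 cell means, separately for each involvement level. The function names and the use of unweighted cell means are assumptions made for illustration; the sketch is descriptive arithmetic on the reported means, not a reanalysis of the data.

```python
# Cell means from Table 2, keyed by involvement level and then by
# (viewpoint, person rated). Control cells are listed for completeness
# but do not enter the contrast below.
TABLE_2_CAUSALITY = {
    "high": {
        ("facing A", "A"): 7.32, ("facing A", "B"): 6.53,
        ("control", "A"): 7.20, ("control", "B"): 7.40,
        ("facing B", "A"): 5.13, ("facing B", "B"): 8.69,
    },
    "low": {
        ("facing A", "A"): 5.88, ("facing A", "B"): 7.80,
        ("control", "A"): 5.86, ("control", "B"): 6.86,
        ("facing B", "A"): 7.37, ("facing B", "B"): 7.39,
    },
}


def point_of_view_contrast(cells):
    """Mean rating of the faced confederate minus the unfaced confederate."""
    faced = (cells[("facing A", "A")] + cells[("facing B", "B")]) / 2
    unfaced = (cells[("facing A", "B")] + cells[("facing B", "A")]) / 2
    return faced - unfaced


def involvement_difference(table):
    """How much larger the point-of-view contrast is under high involvement."""
    return point_of_view_contrast(table["high"]) - point_of_view_contrast(table["low"])


if __name__ == "__main__":
    for level, cells in TABLE_2_CAUSALITY.items():
        print(level, round(point_of_view_contrast(cells), 3))
```

With these means the contrast is positive for high-involvement subjects and slightly negative for low-involvement subjects, which matches the description of the effect as strong under high involvement and irregular under low involvement.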


Recall 


Subjects were asked to recall as many of
the arguments and personal details given by
each confederate as they could. There were no
significant salience effects on either recall
measure. There was a significant Person
Rated × Involvement interaction on recall of
arguments such that highly involved subjects
more of the arguments they agreed 
Ne con arguments) (M = 44.52%; 
her” for pro arguments), whereas 
Eo subjects recalled more of the 
oa: T they disagreed with (M = 
#(1,19) por pro arguments, M = 43.50%), 
hint S 4.79, P< .04, There were also 
a etable main effects for issue on sev- 
easures that will not be reported here. 


Evaluations 


No significant salience effects (Viewpoint
× Person Rated) or Salience × Involvement
interactions emerged on the secondary judgments
(persuasiveness, repeatability, mutability,
convincingness, thoroughness, winning
the argument, and attitude change).


Discussion


Why did salience effects replicate under
conditions of high involvement but not under
conditions of low involvement? If other
point-of-view studies are uninvolving, as we have
suggested, this failure to replicate the effect 
under the low-involvement condition is quite 
puzzling. One possibility is that the low-in- 
volvement condition is less involving than 
the typical point-of-view study, whereas 
the high-involvement condition is more 
involving. Although it is impossible to 
directly compare the degree of subject 
involvement across studies, this explana- 
tion seems possible. Subjects chosen 
because they indicated that an issue was not 
important to them may well be less involved 
than subjects viewing an exchange of personal 
information, who may in turn be less involved 
than subjects viewing a discussion on a topic 
that they rated as very important to them. 
Anecdotal support for this explanation is pro- 
vided by the experimenter’s report that in- 
volved subjects paid close attention to the 
arguments and suggested during the debrief- 
ing points that the debaters had missed; 
however, uninvolved subjects spent time look- 
ing around the room, playing with their pen- 
cils, and generally looking very bored. This 
explanation, then, suggests that some minimal 
degree of attention is required to achieve 
salience effects; that is, the subject must be 
paying enough attention for the salient in- 
formation to actually be salient. 

While this explanation would seem to con- 
tradict the results of Experiment 1, in fact, 
it does not. The salience manipulation is a 
visual one (i.e., who engulfs one’s visual
field). In Experiment 1, though subjects’ 
verbal recall was eliminated, their attention 
to the salience manipulation did not wane, 
whereas in Experiment 2, it was the visual 
attention of low-involvement subjects that 
seemed to wander. Further research will de- 
termine whether this explanation is viable or 
whether, since the nonreplication occurred in 
only one of the low-involvement conditions, it 
is simply a fluke of the stimulus materials. 


Experiment 3 


Experiment 3 was designed to conceptually 
replicate and extend the results of Experi- 
ment 2 and, by implication, the test of Hy- 
pothesis 2. We pursued the issue of whether 
salience effects (overrated causal role of salient
but trivial stimuli) are eliminated when
other factors conspire to heighten and focus 
attention on more relevant aspects of the 
situation. In this study, half the subjects were 
given a cognitive tuning set (Zajonc, 1960) 
to receive information, and half were given a 
transmission set to communicate information 
to others. We reasoned that a transmission 
set prompts subjects to heighten and focus 
their attention to information necessary for 
transmitting a cogent impression (e.g., an in- 
dividual’s behavior) as opposed to salient but 
trivial information (i.e., his or her visual 
prominence), whereas a receiving set more 
closely mirrors the set of subjects in previous 
studies. 

A second issue we explored in Experiment 
3 was the effect of timing on salience effects. 
It may be that when information has been 
highly salient, its impact on the judgment 
process is strong only immediately after the 
experience, when the salient information and 
the judgment process are in close contiguity, 
and not if judgments are delayed to a later
time. That is, it is possible that following a
delay, salient information assumes its right- 
ful, limited role in judgments. 


Method 
Subjects 


Eighty-eight Harvard-Radcliffe students (56 males, 
32 females) were scheduled in groups of 6. Subjects 
were told that the study’s purpose was to find out 
what happens during self-disclosure conversations 
(that is, how observers form impressions of people 
who are disclosing characteristics about themselves) 
and what observers do with information about others 
at different points in time. 

We used a self-disclosure conversation because
this kind of conversation is less ritualistic, stylized,
and formal than getting-acquainted conversations.
In addition, the confederates would be conversing in
impression-formation terms; that is, the confederates
would be describing themselves in ways that would
tend naturally to allow subjects to form impressions.
For these reasons, this kind of conversation
is more interesting to the subjects than the
getting-acquainted conversations used in previous research.


Cognitive Tuning Manipulation 


As observers, the subjects were asked to watch a 
videotape of a self-disclosure conversation while 
performing a particular task. Sheets of paper were 
passed out describing the task each was to perform. 
Half of the subjects in each group were given the
“receiver” manipulation:


Your task as you watch the videotape is to begin
to form an initial impression of both of the
participants. You should use the information each
person provides about himself to begin to think
about the kind of people they are. After you have
finished watching the videotape, during the second
part of the study, you will be receiving additional
information about the participants that should
help you form better impressions of them both. In
other words, your task while watching the videotape
is to begin to think about the sorts of people
the participants are, impressions that should be
clearer after you have received additional
information about them during the second part of the
study.


The others received the “transmitter” manipulation: 


Your task as you watch the videotape is to form 
a clear impression of both of the participants. You 
should use the information each person provides 
about himself to figure out what kind of people 
they really are. After you have finished watching 
the videotape, during the second part of the study, 
you will be asked to communicate all you [illegible] 
about both people to others. In other words, 
your task is to determine, from watching their 
conversation, the sorts of people the participants 
are in order to communicate your impressions to 
others during the second part of the study. 


When the videotape ended, subjects were asked to 
write down what their task was, as a manipulation 
check. 


Point-of-View Manipulation 


The experimenter told the subjects that the conversation 
they would be watching had been videotaped 
during another part of the experiment that 
had involved subjects having a self-disclosure conversation 
with other subjects. Subjects were told that 
cameras had been set up to give a split-screen version, 
but in the videotaping of the particular conversation 
they would be watching, one of the cameras 
had not worked properly; instead of a split-screen 
picture, they would see one in which only one of 
the two conversationalists was clearly visible. The 
subjects then watched a videotape of two confederates, 
A and B, ostensibly having a self-disclosure 
conversation. The videotape for all groups began 
with a brief introduction in which both confederates 
could be seen, and then, following a blackout in the 
film, half of the groups of subjects could only see A 
over the shoulder of B, whereas the other half of 
the groups could only see B over the shoulder of A. 
The confederates discussed characteristics that 
they liked and did not like about themselves and 
characteristics that other people liked and did not like 
about them. The confederates talked about the same 
length of time and exchanged similar kinds of 
moderately personal information. The conversation 
lasted about 15 minutes. 


Delay Manipulation 


After watching the videotape, subjects in the 
delayed conditions were dismissed; two subjects per 
group (one receiver and one transmitter) returned 
after an hour, and two subjects (one receiver and 
one transmitter) returned after a day. The remain- 
ing two subjects per group (one receiver and one 
transmitter) completed the dependent measures immediately. 
The tuning, point-of-view, and delay manipulations 
produced a fully crossed 2 × 2 × 3 between-subjects 
design. 


Dependent Measures 


[...] counterbalanced. The first part of the questionnaire, 
for each confederate, was a recall task; subjects were 
asked to report what the confederate said that 
he and others liked and did not like about himself. 
In the second part, subjects rated how friendly, talkative, 
and nervous the confederate was, on 9-point 
scales with labeled endpoints. For each of the three 
behaviors, the subject also rated the extent to which 
the behaviors were caused by dispositional qualities 
of the confederate (9-point scale) and by situational 
factors (9-point scale). In the third part, on 9-point 
scales, subjects rated how much the confederate set 
the tone of the conversation, was able to draw out 
more from the other than he had to give up about 
himself, caused his partner to behave as he did, truly 
[illegible], and influenced the conversation 
as a whole. Subjects also rated the degree to 
which they were able to form an opinion of the 
confederate, had a favorable impression of the 
confederate, and liked him. Last, they were asked to 
rate how easy it was for them to answer the questions 
in this part of the questionnaire. Subjects then 
filled out an identical set of questions about the 
other confederate. 

Subjects took approximately 20 minutes to complete 
the questionnaire. Subjects were then fully debriefed. 
Debriefing indicated that subjects were not suspicious 
of the point-of-view manipulation and did 
not guess the true nature of the experiment. 
Subjects took the cognitive tuning manipulation 
seriously; transmitters used the debriefing to relate 
their impressions of the confederates, whereas receivers 
expected to receive additional information. 
The hypotheses of the experiment were then discussed, 
and subjects were paid; subjects in the 
immediate conditions received $2 each, and subjects in 
the delayed conditions received $3 each. 
Table 3 
Mean Causality and Evaluation Ratings as a Function of Salience: Experiment 3 

                                      Rating of      Rating of 
                                      confederate    confederate 
Measure                               faced          not faced 

Causality index                       14.32          11.76 
Talking                                5.98           5.32 
Recall of positive characteristics     2.95           2.41 
Favorable impression                   4.50           [illegible] 
Ease of ratings                        6.36           5.88 
Liking                                 4.09           3.18 


Note. Causality index values range from 3 to 27; 
other scales are 9-point rating scales. Higher values 
indicate greater rated perception. F values for 
salience interaction effect range from 5.25 to 22.53, 
with 1 and 75 degrees of freedom; ps range from 
<.03 to <.001. 


Results 


Dependent measures were analyzed by 
analyses of variance using cognitive tuning, 
confederate faced, and delay as between-sub- 
jects factors. Because each subject rated both 
confederates, confederate ratings were a repeated 
measure. 


Manipulation Checks 


The manipulation check verified that the 
cognitive tuning manipulation had worked for 
most of the subjects (73 of the 88 subjects 
remembered their task after they had viewed 
the videotape). 


Causality Effects 


On the basis of high average intercorrela- 
tions, subjects’ ratings of confederates’ influ- 
ence on the tone of the conversation, influence 
on the conversation as a whole, and influence 
on partner were combined into a causality 
index for purposes of analysis. 
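
The index construction can be sketched as follows. This is a minimal illustration that assumes the index is the simple sum of the three ratings and that each 9-point scale runs from 1 to 9, which is consistent with the 3 to 27 range given in the note to Table 3; the function name and the example ratings are hypothetical.

```python
def causality_index(tone, whole, partner):
    """Sum three 9-point influence ratings into one index (range 3 to 27).

    Assumption: each scale is coded 1 through 9 and the index is an
    unweighted sum; the source reports only the resulting range.
    """
    ratings = (tone, whole, partner)
    for r in ratings:
        if not 1 <= r <= 9:
            raise ValueError(f"rating {r} is outside the 9-point scale")
    return sum(ratings)


# Hypothetical ratings of one confederate by one subject.
example = causality_index(tone=6, whole=5, partner=4)
print(example)
```
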

As Table 3 demonstrates, the salience ef- 
fect emerged. Subjects perceived that the 
confederate they observed was more causal in 
the conversation than the confederate they 
did not observe, F(1, 96) = 7.45, p < .01. 
There were no other interactions or main effects 
on this causality index. Even when subjects 
who had not remembered the tuning 
manipulation were removed, there were no 
significant cognitive tuning main effects or 
interactions with the causality index. The 
question How much did this person talk? is 
also a measure of prominence and showed 
similar salience effects, such that subjects 
perceived the salient confederate as talking 
more than the nonsalient confederate, F(1, 
75) = 5.25, p < .03. 

Given that the observer has decided who 
was influencing the conversation, will the ob- 
server go on to make situational and personal 
attributions about why the confederate be- 
haved as he did? As mentioned, for the three 
behaviors measured—friendliness, talkative- 
ness, and nervousness—subjects indicated 
how dispositionally caused and how situa- 
tionally caused each was for each confederate. 
The analysis revealed that subjects perceived 
that the nervousness of the confederate they 
observed was caused less by the personal 
characteristics of the confederate and more by 
the characteristics of the situation, whereas 
they perceived that the nervousness of the 
confederate they did not observe was caused 
more by the personal characteristics of the 
confederate and less by the characteristics of 
the situation (personal, F(1, 72) = 8.73, p < .01; 
situational, F(1, 72) = 4.76, p < .03). 
This finding constitutes a reversal of the 
mean pattern predicted by the actor-observer 
hypothesis (see Storms, 1973). However, because 
of the nature of the attribute “nervous,” 
it may be that the data pattern is consistent 
with a view of the salient person as 
“in charge.”² 


Evaluations 


Significant salience effects also emerged on 
recall of positive characteristics mentioned 
by the confederate, F(1, 76) = 22.53, p < 
.001; the degree to which the confederate 
made a favorable impression, F(1, 76) = 
9.30, p < .01; the ease with which the subject 
could rate the confederate, F(1, 76) = 
15.39, p < .001; and how much the subject 
liked the confederate, F(1, 76) = 8.18, p < 
.01. The means of these measures (see Table 
3) revealed that subjects could more easily 
form impressions, particularly favorable impressions, 
about the confederate they faced 
than about the confederate they did not. As 
in Experiment 1, the salient person’s attributes 
were exaggerated in a positive direction. 
Confederate effects also emerged on 
these impression measures, but they are 
uninterpretable and will not be reported here. 


Summary 


Experiment 3 found strong salience effects 
on a wide variety of measures in a self-disclosure 
situation. However, the impact of 
cognitive set and whether impressions were 
assessed immediately or later on the magnitude 
of those salience effects was negligible. 


General Discussion 


Three studies examined two hypotheses 
regarding the effects of salient information on 
social judgments. The first was that salience 
effects are dependent on attention and that 
differential attributions are made to stimuli 
for which large amounts of information have 
been gathered. Results indicated that as long 
as subjects actually attend to the salient 
information, salience effects occur and are not 
increased or decreased by taking in more or 
less information. Indeed, the fact that salience 
effects occurred even when subjects had 
taken in virtually none of the verbal information 
exchanged in the scenario indicates that 
a very low degree of verbal attention can 
still produce biased attributions of causality 
to a visually salient person. 

The second hypothesis tested was that salience 
effects are characteristic of low 
intensity of attention and that if some variable 
that increases intensity of attention (such 
as subject involvement in the scenario) were 
introduced, salience effects would not emerge. 
The rationale for this prediction was that 
heightened involvement in the scenario would 
increase attention to relevant cues, thereby 
reducing the impact of otherwise salient but 
trivial stimuli. Two studies tested this hypothesis 
and found that under conditions of 
involvement with the issue, more engrossing 
conversations, and a cognitive tuning set to 
transmit rather than receive, salience effects 
continued to emerge. 


2 There were some uninterpretable interaction effects 
with some of the measures: (a) a [illegible] × 
Cognitive Tuning interaction with nervous, because 
of personal characteristics, F(2, 75) = 6.[illegible]; 
(b) a [illegible] × Person Faced interaction with 
nervous, because of situational characteristics, 
F(2, 72) = 3.14, p < .05; (c) a 
Cognitive Tuning × Salience interaction on how 
easily the subjects could rate the confederate, 
F(1, 75) = 4.62, p < .01; and (d) a Cognitive Tuning 
× Person Faced interaction on perception of how 
much the confederate talked, F(1, 75) = [illegible], 
p < .01. 

The results of these studies suggest that 
salience effects have at least some degree of 
external generalizability. In addition to the 
conditions of generalizability tested in the 
present studies (involvement, tuning, distraction, 
and delay), the results of two pilot studies 
also extend the domain of applicability of 
salience effects. In a pilot study by Gartrell 
(Note 2), salience effects were obtained even 
when one actor clearly dominated the other 
in a conversation. In a second pilot study 
([illegible] & Winkler, Note 3) assessing the 
effects of arousal on salience effects, power 
analyses indicated that salience effects would 
emerge under conditions of arousal but that 
they would not be substantially 
altered by arousal. Taking all this 
evidence together, we conclude that salience 
effects generalize to more engrossing 
and involving situations than those examined 
previously and that salience effects are 
not dependent on either subject set (transmitting 
versus receiving) or on immediate 
versus delayed assessment of subjects’ impressions. The fact 
that salience effects continue to emerge in 
interesting, involving, and arousing conditions 
renders more plausible the argument 
that salience may mediate consequential 
social perceptions. 
The results of the present study are also 
highly [illegible] regarding how salience effects 
are mediated. Elimination of 
verbal information leaves the effect untouched 
(Experiment 1), whereas wandering visual 
attention may be responsible for the failure to 
replicate in Experiment 2. The obvious conclusion 
to draw is that salience effects may 
well be visually mediated. However, results of 
an effort to model the mediation process 
(Fiske, Kenny, & Taylor, Note 4) do not 
unequivocally support this contention, though 
the indicators of both verbal and visual recall 
in those two studies are by no means 
infallible. In any case, given inconclusive 
data, we prefer to hedge on the issue of 
mediation for the present. 

The present studies indicate that salience 
effects are not pushed around very much by 
the social variables that would be expected to 
interact with them. Nor are salience effects a 
function of the subject’s set or of the juxta- 
position of the salience scenario and the 
rating task. Indeed, what is striking about 
the point-of-view effect is that it is quite 
reliable and practically unmovable. What is 
suggested to us by this fact is that although 
the effect is manifested on variables of social 
interest such as causality or prominence rat- 
ings, the effect itself may be an automatic 
perceptual bias (not unlike optical illusions, 
orienting to novel stimuli, and the principles 
of figural emphasis uncovered by Gestalt psy- 
chologists). Thus, we would argue that sali- 
ence effects are automatic responses to stimu- 
lus qualities, which the perceiver has never 
learned and which occur without intention 
(cf. Schneider & Shiffrin, 1977). 
Interpersonal Consequences of Person Perception Processes 
in Two Social Contexts 


Lawrence A. Messé and Gary E. Stollak 
Michigan State University 


Gerald Y. Michaels 
University of Michigan (Ann Arbor) 


This research examined the proposition that perceiver-based perceptual pro- 
cesses affect social behavior. Approximately 1,100 undergraduates were exposed 
to a videotape that portrayed a male or female child interacting with an adult 
in a playroom. In Study 1, subjects who “saw” the child emit (a) primarily 
positive behaviors (i.e., subjects who were positively biased), (b) about equal 
numbers of positive and negative behaviors (i.e., subjects who were accurate), 
or (c) primarily negative behaviors (i.e., subjects who were negatively biased) 
then engaged in cooperative task activities with a 7-year-old child. In Study 2, 
a subset of these subjects engaged in a discussion with another undergraduate 
about three issues on which they apparently disagreed. Systematic analyses of 
these interactions suggested that perceptual processes affected social behavior; 
for example, negatively biased subjects tended to act in a more authoritarian 
manner in their encounters with the child, whereas positively biased subjects 
were the least effective in the discussion task. These and the other findings are 
discussed in terms of their implications for understanding the relationship be- 
tween person perception and overt social behavior. 


Rogers, 1959) and sociology (e.g., Cottrell, 
1966; Mead, 1934) that explains human be- 
havior in part in terms of person perception 
processes. For example, Kelly (1969) postu- 
lates that how a person “anticipates” events 
—including interpersonal encounters— 
through cognitive and perceptual processes 
affects the course of those events. Similarly, 
perhaps even more strongly, Combs and 
Snygg (1959) argue that all human behavior 
is determined by perceptual mechanisms and 
processes. This position is also similar to the 
proposition of role theory (Cottrell, 1966) 
and symbolic interactionism (Mead, 1934) 
that people’s “definitions of the situation” 
have a large influence on the course of their 
interpersonal encounters. 

Given that person perception processes are 
seen by such theorists as important determi- 
nants of social behavior, it is somewhat sur- 
prising that few studies have attempted to 
establish empirically the link between per- 
ception and behavior (cf. Hastorf, Schneider, 
& Polefka, 1970; Shrauger & Altrocchi, 1964; 
Tagiuri, 1969). Notable exceptions to this 
statement are the seminal work of Kelley 
(1950) on the consequences of a set to see 
another as warm or cold, the research on 
people’s reactions to attractive or “stigmatized” 
others (e.g., Kleck, Ono, & Hastorf, 
1966; Snyder, Tanke, & Berscheid, 1977), 
and—although not usually thought of in this 
perspective—studies of race effects on inter- 
personal encounters (e.g., Harrison, Messé, & 
Stollak, 1971). 

While studies such as Kelley’s demonstrate 
that in general, person perception mechanisms 
play a role in people’s overt social behavior, 
they do not establish the link between such 
behavior and an important class of person 
perception variables: those factors in the 
perceiver (e.g., personality, cognitive struc- 
ture, etc.) that affect his or her impressions 
and judgments (Kaplan, 1976; Shrauger & 
Altrocchi, 1964). Perceptual set, as it was 
induced by Kelley, is a situational variable, 
whereas physical appearance variables are 
primarily factors in the target person. Thus, 
the relationship between perceiver-based per- 
son perception variables and social behavior 
has not yet been clearly established. 

The present research was an attempt to 
establish the link between social behavior and 
one perceiver-based person perception vari- 
able: the differential attention that persons 
pay to the behavior of others, especially chil- 
dren. Obviously, children emit a large number 
of cues or behaviors. Adults who observe 
them often evaluate these behaviors as “good” 
(acceptable, desirable, valuable, etc.) or “bad” 
(unacceptable, undesirable, immoral, etc.). 
We speculated that some people might be 
more attuned than others to the “bad” be- 
haviors that children emit, and/or they might 
be more likely to interpret children’s am- 
biguous behavior as bad; such people could 
be called negatively biased. Others might be 
more sensitive to children’s good behaviors 
and/or be more likely to interpret behavior 
as good; these people would be positively 
biased. Still others might be more “balanced,” 
that is, not be differentially sensitive to good 
or bad behavior and/or primed to interpret 
ambiguous behavior one way or the other. 
This speculation about “interpersonal perceptual 
styles” (IPS) is similar to the position 
of Kaplan (1976) that persons can have 
general response dispositions that influence 
their judgments of others. 

In the present research, we explored the 
hypothesized link between IPS and interper- 
sonal behavior in two social contexts: In Study 
1, we examined the relationship between IPS 
and the behaviors that occurred during a 
cooperative encounter between an adult and 
a child; in Study 2, we explored the relationship 
between IPS and behavior within the context 
of a less cooperative interaction, a discussion 
between two adults of issues about 
which they disagreed. 


Study 1 


We measured IPS by examining people's 
perceptions of the same stimulus event— 
videotape excerpts from what was supposed 
to be a relatively long-term encounter between 
an adult and a child. We then observed en- 
counters between children and people who, 
through their responses to the common stimu- 
lus event, were identified as negatively biased, 
positively biased, or perceptually accurate. 
Given the rather short-term and cooperative 
nature of this encounter, we expected (a) that 
IPS would relate more strongly to the behavior 
of the perceiver (i.e., the adult) than 
to the behavior of the child with whom she 
or he interacted and (b) that the behavior of 
negatively biased persons would differ (i.e., 
be more controlling and dominant) from that 
of positively biased and accurate persons, who, 
in turn, would not differ much from each 
other. 


Method 


Subject Recruitment and Selection 

An advertisement was placed in the student newspaper 
of Michigan State University that solicited 
persons who were interested in participating in 
“behavioral research” for pay. Interested persons were 
instructed to come to a 45-minute interview (dates, 
times, and place were listed in the advertisement) 
for which they would be paid $1. The advertisement 
also stated that those persons who were chosen to 
participate further in the project could earn $5 per 
hour for 1 to 3 hours of their time. 


PERCEPTION AND INTERACTION 


Table 1 
Mean IPS Scores and Cell Frequencies 

                                   Subject’s IPS category 

                          Positive        Accurate        Negative 
Sex of     Sex of child 
subject    in videotape   Score    n      Score    n      Score     n 

F          F               6.80   14      -.24    13     -16.20    [illegible] 
F          M               7.02   12      -.07    10     -16.38    13 
M          F               9.52   12       .07    12     -18.19    12 
M          M               9.07   14       .81    13     -16.30    11 


Note. IPS = interpersonal perceptual style. An IPS score is the difference between the numbers (out of 
25 each) of positive and negative behaviors that a subject reported perceiving the target child emit. 


About 1,100 persons responded to the advertisement 
and came for the interview. These sessions 
were arranged so that potential subjects were examined 
in groups of 18-25 persons. Each person first 
completed a brief information sheet that asked his 
or her name, address, and telephone number, as well 
as the days and hours during the week that the 
respondent typically would be available to participate 
in the study. Then the group was shown the video- 
taped excerpts of the adult-child playroom encoun- 
ters, under the guise that the project was concerned 
mainly with how various groups of people—profes- 
sional psychologists, older adults, undergraduates, 
and so on—evaluated the activity of a “graduate 
student in training.” After viewing the tape, the 
members of the group individually filled out a behavior 
checklist about the children in the videotape, 
supposedly so that the interpersonal context of the 
graduate student’s evaluation could be assessed; then 
they completed instruments that were designed to 
tap their judgments about the graduate student. 

The child behavior checklist was scored by subtracting 
the number of “negative” child behaviors 
that a respondent saw from the number of “positive” 
behaviors that she or he saw; such scores 
could range from +25 to -25, since the checklist 
contained 25 behaviors of each type (see below). 
There were 1,068 scorable checklists. From these, potential 
subjects were selected for the adult-child encounters 
such that there were 30 persons of each sex 
(plus any additional persons whose IPS 
score equaled that of the 30th potential subject selected) 
who best matched the definitions of positive 
bias, balance, and negative bias. These potential subjects 
met the following criteria: (a) They checked 
at least 20 behaviors in all; (b) they were identified 
as negatively biased if they checked at least 14 more 
negative behaviors than positive behaviors; (c) they 
were identified as accurate if the number of negative 
and positive behaviors that they checked did not 
differ by more than one; (d) they were identified 
as positively biased if they checked at least 5 more 
positive behaviors than negative behaviors. Table 
1 presents the mean IPS score and the number of 
each type of respondent who were selected to participate 
further. 
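
The threshold criteria (a) through (d) can be sketched as a small classifier. This is only an illustration under stated assumptions: the function names are hypothetical, "behaviors in all" is taken to mean scored (positive plus negative) items, and the further step of keeping the 30 best-matching persons of each sex is not modeled.

```python
def ips_score(n_positive, n_negative):
    """IPS score: positive minus negative behaviors checked (each 0 to 25)."""
    return n_positive - n_negative


def ips_category(n_positive, n_negative):
    """Apply the threshold criteria; return a category label or None.

    Assumption: criterion (a) counts only the 50 scored items.
    """
    if n_positive + n_negative < 20:       # criterion (a)
        return None
    score = ips_score(n_positive, n_negative)
    if score <= -14:                       # criterion (b)
        return "negative"
    if abs(score) <= 1:                    # criterion (c)
        return "accurate"
    if score >= 5:                         # criterion (d)
        return "positive"
    return None                            # falls between the bands


# Hypothetical respondents: (positive checked, negative checked).
for counts in [(20, 3), (12, 11), (3, 20), (13, 10), (5, 4)]:
    print(counts, ips_score(*counts), ips_category(*counts))
```

Respondents whose scores fall between the bands (for example, a score of +3) are simply not selected under this reading.
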


Children Participants 


Through the cooperation of the administration of 
the East Lansing, Michigan, school district, the par- 
ents of all children in second-grade classes were con- 
tacted via letter and asked if they would permit 
their child to volunteer (for $5) to play with an 
undergraduate in a playroom that was located at the 
Department of Psychology, Michigan State Univer- 
sity. Parents who were interested returned a post- 
card to that effect, and a sample of these parents 
was contacted via telephone. In this conversation, 
the nature of the child’s participation was explained 
in detail, and if the parent agreed, an appointment 
was made to have the child come to the playroom 
for the encounter with the undergraduate. In this 
way, the cooperation and participation of 168 
7-year-old children were obtained. 


Instruments and Materials 


Standard perceptual stimulus (SPS). A script 
was constructed in which a “graduate student in 
training”—the word clinician was not used to avoid 
the possibility that the term would induce a nega- 
tive set in subjects—acted rather passively in a 
series of encounters with a child. The child was 
identified as one of several volunteers who were paid 
to participate with the student. During these encounters, 
the child, while engaging in a variety of 
activities, emitted a number of behaviors; some of 
these behaviors could be construed as negative (e.g., 
cheating at a game), some could be “seen” as positive 
(e.g., sharing food with the adult), and some 
could be perceived as neutral or ambiguous (e.g., 
painting with watercolors). An attempt was made 
to portray approximately equal numbers of positive 
and negative behaviors. 


1 The criteria used to identify negatively and 
positively biased respondents differed substantially. 
This difference occurred because relatively few respondents 
(less than 2%) checked at least 14 more 
positive behaviors than negative behaviors, so that, 
of necessity, the identification criterion for positive 
bias had to be lowered. 


Two female and two male child actors (who 
ranged in age from 9 to 11 years old) learned the 
script and were videotaped while they played this 
role with a female graduate student. The SPSs of 
the female and male actors who were judged to do 
the best, most natural jobs were used to determine 
the perceptual styles of the undergraduates, as de- 
scribed above. About half the respondents saw the 
tape of the female target child, and half saw the 
tape of the male. Table 1 presents the number of 
male and female subjects of each perceptual style 
who were exposed to the male or female target child. 

Child Behavior Checklist (CBC). The CBC, 
which was used to determine the interpersonal perceptual 
style scores of the respondents, was a modification 
of an instrument developed by Ferguson (e.g., 
Ferguson, Partyka, & Lester, 1974). In its present 
form it presented respondents with 64 “behaviors” 
(e.g., “Played with toys in a rough way”; “Did 
what the adult asked her [him] to do”) and asked 
them to check those behaviors that applied to the 
child in the videotape that they had just viewed. 
Fourteen of the items were nonscored “fillers”; the 
remainder consisted of 25 “positive” behaviors and 
25 “negative” behaviors that had differentiated pa- 
rental perceptions of clinic-referred children from 
parental perceptions of non-clinic-referred children 
(Ferguson et al., 1974) and that a sample of about 
100 undergraduates judged to be clearly positive or 
negative. As noted above, IPS was determined by 
subtracting the number of negative behaviors that 
a respondent checked from the number of positive 
behaviors that he or she checked. 

Evaluation of graduate student in training. Two 
forms were used to permit respondents to evaluate 
the graduate student—the reason they were given for 
asking them to view the SPS. One was a modification 
of rating scales that were developed by Bessell and 
Palomares (1970); the other was a checklist composed 
of 36 trait words taken from Anderson (1968).² 


Tasks 


The adult subjects interacted with the child vol- 
unteers in playroom facilities that were located at 
the Department of Psychology, Michigan State Uni- 
versity. Every adult-child pair engaged in three co- 
operative tasks, each of which lasted for 10 minutes. 
In the first (“free play”), both participants were 
told that they could do what they wanted (i.e., play 
together or separately, read, just sit there, etc.). 
In the second task, the pair was asked to reproduce 
to the best of their abilities a rather complicated 
figure on an “Etch-a-Sketch.” In this task the adult 
had control of only one knob (and, therefore, one direction), 
and the child controlled the other knob. 
Their task was to coordinate their activities so that 
they produced a reasonable facsimile of the sketch. 
In the third task, the adult was given a list of 
proverbs (e.g., “Haste makes waste”) and told to 
try to teach the meaning of them (as many as he 
or she wished) to the child. The interactions were 
videotaped through one-way mirrors. 


Design 


This study used a factorial design whose dimensions 
were 2 (sex of undergraduate) × 2 (sex of 
target child; here target child refers to both the sex 
of the child in the SPS and that of the child with 
whom the subject interacted, since care was taken 
to insure that these were always the same) × 3 
(IPS of the subject: positive, accurate, or negative, 
as defined above). Fourteen subjects, chosen at random 
from the 15 or more potential subjects in each 
condition, were examined in each cell of the design, 
but because of equipment failure and other 
mechanical problems, final cell sizes ranged from 10 
to 14 (see Table 1). 
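
As a quick check on the arithmetic of this design, the cells can be enumerated as below. The variable names are illustrative only; the factor levels and the planned 14 dyads per cell come from the text, and the planned total agrees with the 168 children recruited.

```python
from itertools import product

# Between-dyad factors of the Study 1 design.
subject_sex = ("F", "M")
child_sex = ("F", "M")
ips = ("positive", "accurate", "negative")

cells = list(product(subject_sex, child_sex, ips))
planned_per_cell = 14
planned_dyads = len(cells) * planned_per_cell

print(len(cells), planned_dyads)  # 12 cells, 168 planned adult-child dyads
```
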


Procedure 


All phases of the research were conducted by 
undergraduate assistants, although one of the 
two authors was always available in case some dif- 
ficulty arose. One assistant escorted the child into 
the playroom, and a few moments later, a second assistant 
brought the adult into the room, introduced 
him or her to the child, and then left. The first assistant 
then instructed the pair concerning the first 
task, after which she or he left the room and went 
to the observation room where a third assistant was 
videotaping the playroom activity. After 10 minutes, 
the assistant reentered the playroom to instruct the 
participants concerning the second task, again leaving 
the room once the instructions had been given. 


2 Since these instruments were not used further in 
this study, they are not described in any greater detail. 
Note, though, that there was a relationship between 
these measures and the subjects’ perceptions of the 
child in the videotape (mean r = .21). These results 
support Kaplan’s (1976) hypothesis that dispositions 
in person perception tend to be applied in general 
within a social context. Thus, in the present context 
(i.e., the videotaped excerpts), the disposition to 
see the child as negative or positive appeared to 
generalize somewhat to perceptions of the adult as 
well. 

3 Of course, the undergraduates, the children, and 
their parents all were informed of this procedure 
beforehand, and all gave their informed consent 
about this (and all other) aspects of their 
participation in the study. 




This same operating procedure was also employed 
for the third task. After the session was completed, 
both the undergraduate and the child were paid the 
specified amounts of money and thanked for their 
participation. 


Coding of the Adult-Child Interactions 


Eight undergraduates (who were kept unaware of 
the nature of the study until they had finished this 
task) were trained to use a modified circumplex system 
of classifying social behavior (Freedman, Leary, 
Ossorio, & Coffey, 1951). While the 16 specific categories 
of the system were retained, the scoring procedure 
was changed from rating scales of intensity 
to frequency counts of the number of social acts 
(Bales, 1950) emitted that fell within a given category.⁴ 
Each videotape was scored independently by 
two coders, and the number of tapes that the various 
coder pairs scored ranged from 3 to 12. 

The average interjudge reliability (correlation
coefficient across coder pairs) for each circumplex
category ranged from .29 (for suspect) to .93 (for hate),
with an overall mean of .48. Because of the somewhat
erratic interjudge reliabilities and because of
large disparities across coders with respect to the
total number of acts that were coded per tape, it
was thought best to transform the raw data by
converting them to z scores (within a coder within a
category) and calculating the mean z score per
category across each coder pair. In this way, coders who
scored more acts would not influence the results
unequally, and, as much as possible, random error
would be reduced. Moreover, the frequencies with
which interactants displayed categories of behaviors
differed markedly. Thus, only half of the categories
(structure, cooperate, passively question, dominate,
helplessness, help, submission, and reassure) accounted
for about 95% of the behaviors that were exhibited.
Given the infrequency of behavior within the
remaining categories, it was thought best not to
examine them further.
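
The transformation described in this paragraph can be illustrated with a minimal sketch. It is not the authors' actual procedure in detail: the coder labels, tape labels, and frequency counts are hypothetical, the z scores are assumed to be computed over the tapes that a given coder scored, and the population standard deviation is assumed because the text does not say which form was used.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of counts to mean 0 and SD 1 (population SD, an assumption)."""
    m = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A coder who gave every tape the same count carries no information;
        # returning zeros here is an illustrative choice, not from the article.
        return [0.0 for _ in values]
    return [(v - m) / sd for v in values]


def standardize_within_coder(counts):
    """counts: {coder: {category: {tape: frequency}}} -> same shape, holding z scores."""
    result = {}
    for coder, by_category in counts.items():
        result[coder] = {}
        for category, by_tape in by_category.items():
            tapes = sorted(by_tape)
            zs = z_scores([by_tape[t] for t in tapes])
            result[coder][category] = dict(zip(tapes, zs))
    return result


def pair_mean(z, coder_a, coder_b, category, tape):
    """Mean z score for one tape and category across the two coders who scored it."""
    return (z[coder_a][category][tape] + z[coder_b][category][tape]) / 2


# Hypothetical data: coder_B records five times as many acts as coder_A.
counts = {
    "coder_A": {"dominate": {"tape1": 2, "tape2": 4, "tape3": 6}},
    "coder_B": {"dominate": {"tape1": 10, "tape2": 20, "tape3": 30}},
}
z = standardize_within_coder(counts)
scores = {
    tape: pair_mean(z, "coder_A", "coder_B", "dominate", tape)
    for tape in ("tape1", "tape2", "tape3")
}
```

In this toy case both coders yield identical z scores for each tape even though one recorded far more acts, which is the property the transformation is meant to secure: a coder who scores more acts does not weigh more heavily in the pair mean.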


Results 


The dependent measures (the mean z
scores within a coder pair for each of the
eight most frequently displayed circumplex
categories) were subjected to a 2 (sex of
subject) × 2 (sex of target child) × 3 (IPS
of subject) × 2 (interactant’s score, adult’s or
child’s, a within-dyad measure) multivariate
analysis of variance (MANOVA).⁵


Effects of IPS 




Given the theoretical framework of the
present research, we expected that effects of
IPS would appear as a function of who (adult
or child) was emitting the behavior; that is,
we expected IPS to affect the adult’s behavior
in this relatively brief encounter to a greater
extent (and in different ways, if any) than
it would the child’s behavior. Thus, we
expected the analysis of variance to yield a
significant multivariate IPS × Interactant
interaction. The MANOVA, in fact, indicated that
this was the case, F(16, 248) = 1.53, p < .08.

Subsequent examination of the univariate 
IPS x Interactant effects revealed that five 
of the eight dependent measures yielded a 
significant interaction: dominate, F(2, 131) 
= 5.47, p < .005; structure, F(2, 131) = 2.84, 
p < .075; help, F(2, 131) = 2.77, p < .075; 
cooperate, F(2, 131) = 2.88, p < .075; and
submit, F(2, 131) = 4.55, p < .02. These
effects were analyzed further via tests of simple
effects, and where appropriate, individual 
comparisons (Winer, 1971, p. 384) were per- 
formed between the negative and accurate 
conditions and between the accurate and posi- 
tive conditions. Table 2 summarizes the re- 
sults of these analyses by presenting those 
cell means that generated significant simple 
main effects for IPS. 

The MANOVA also indicated that the Sex of
Subject x IPS X Interactant multivariate in- 


4 We felt that the circumplex system was a very
judgmental task for undergraduates, even in the
simpler form. The somewhat erratic interrater
reliabilities that were obtained confirmed that there was
a need to be concerned about this issue.

5 There is some controversy concerning the
appropriate procedure to use to select α for a
multivariate effect, especially when, as in the case of the
present research, a number of dependent measures
are used to test rather specific hypotheses (see, for
example, Bock, 1975, pp. 422-425). The present
research used a slight modification of a procedure
suggested by Bock (1975, p. 425) in which the
overall, multivariate α is a function of the product of
increasingly more stringent univariate βs. For both
studies, the β for the “first” univariate was set at .9,
and each successive β was raised by a multiple of
.025 or .015. Thus, given that Study 1 examined
eight dependent measures:

$$\alpha_o = 1 - [(.9)(.925)(.95)(.975)(.99)(.9925)(.995)(.9975)] = .248$$

Likewise, for the 16 dependent measures in Study
2, $\alpha_o = .250$.
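
The arithmetic in the footnote on selecting the multivariate α can be checked with a short sketch. It assumes, as the printed product implies, that the overall α is one minus the product of the eight univariate values listed for Study 1; the function and variable names are illustrative and do not come from Bock (1975).

```python
from math import prod


def overall_alpha(univariate_values):
    """Overall alpha as 1 minus the product of the per-measure values."""
    return 1 - prod(univariate_values)


# The eight values printed in the footnote for Study 1.
study1_values = [0.9, 0.925, 0.95, 0.975, 0.99, 0.9925, 0.995, 0.9975]
alpha_study1 = overall_alpha(study1_values)
print(round(alpha_study1, 3))  # 0.248
```

The result agrees with the value of .248 reported for Study 1. The 16 values used for Study 2 are not listed in the footnote, so the reported .250 cannot be reproduced the same way.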
Table 2
Mean Social Behavior Scores Reflecting Significant Effects Involving IPS

                                  IPS category of adult
                           Negative                 Positive
Behavior                   bias        Accurate     bias          Fᵃ

Adult's behaviors
  Dominate                  .128ᵇ      −.060        −.088         [illegible]
  Structure                 .168ᵇ      −.193        [illegible]   [illegible]
  Help by males            −.111ᵇ       .068         .227         [illegible]
  Submit                   −.146ᵇ       .110         .053         2.69
Child's behaviors
  Dominate                 −.195ᵇ       .078         .138         [illegible]
  Help with male adults    −.006ᵇ       .235        −.125ᵇ        [illegible]

Note. IPS = interpersonal perceptual style. Values are mean z scores (standardized within each coder and
averaged over each coder pair) for each circumplex category that was analyzed. Thus, each social behavior
score had an overall mean of 0 and a σ equal to 1.
ᵃ df = 2, 262.
ᵇ Value differs significantly from accurate IPS mean.
*p < .10.
**p < .05.


teraction was significant, F(16, 248) = 1.43, 
p < .13. This effect reflected one significant 
univariate interaction, for help, F(2, 131) = 
3.17, p < .075, which was examined further 
by simple effects tests and individual com- 
parisons. These findings are also summarized 
in Table 2. 
As Table 2 indicates, a number of reasonable
results emerged from these comparisons.
Negatively biased adults tended to display
more dominance and less submission than
did accurate perceivers; they also engaged in
more acts of structuring. Moreover, negatively
biased male subjects were less helpful than
were their accurate counterparts. When
compared to accurate perceivers, positively biased
subjects also engaged in more structuring
activity, whereas positively biased male adults
tended to be more helpful.

Table 3 presents the mean scores for co- 
operative behaviors displayed as a function of 
IPS and interactant. While the simple effects 
analysis of this dependent variable did not re- 
veal any significant effects for IPS within 
interactants, it did yield an interesting pat- 
tern of results when this variable was further 
examined from the perspective of differences 
between the interactants within each IPS 


category. Such a perspective is especially
appropriate for cooperative behavior, because
this activity can only be effective when it is
expressed by both parties. The analysis
indicated that children displayed a greater amount
of this behavior than did the negatively biased
adults with whom they were paired, while
the reverse occurred when the adults were
positively biased. However, the difference
between children and adults was not significant
when the adults had an accurate IPS. Thus,
the dyads composed of a child and a
perceptually balanced adult yielded the most even
distribution of cooperative activity.

Three other findings relevant to IPS are
worthy of note, since they also involve the
behavior of children (see Table 2). Analyses
revealed that children displayed less
dominance toward negatively biased adults
than towards accurate adults, and children
were more helpful with accurate males than
with either negatively biased or positively
biased males.

Thus, taken together, these findings support
the position that perceiver-based person
perception processes are related to interpersonal
behavior.


Sex Differences


The analyses also yielded two significant
multivariate effects involving the sex of the 
interactants: the Sex of Subject X Interactant 
interaction, F(8, 124) = 3.24, p < .005, and 
the Sex of Child X Interactant interaction, 
F(8, 124) = 2.02, p < .05. While not a focus 
of this research, the results of the systematic 
exploration of the underlying significant (p 
< .05) univariate differences are summarized 
briefly here, in part because they reflect on the 
validity of the observational data: (a) Male
adults emitted more acts of dominance than
did females; (b) children displayed more 
dominance toward female adults; (c) females 
were more cooperative than males; (d) chil- 
dren were more cooperative with male adults; 
(e) male subjects were less submissive; (f) 
children were more submissive to males; and 
(g) male children displayed more helpful acts 
than did their female counterparts. 


Discussion 


The results supported the basic premise 
that served as the impetus for this research: 
Social behavior is related to perceiver-based 
person perception processes. Moreover, the
specific findings that did emerge, on the whole, 
were reasonable. 

Negatively biased adults tended to act to 
constrain the behavior of the child. This re- 
sult is especially striking, given the social psy- 
chology of the observational setting—that is, 
the interactants were strangers to each other, 
their tasks were highly cooperative in nature, 


Table 3
Mean Cooperative Behavior Scores as a Function of IPS and Interactant

                         IPS of adult
Interactant     Negative    Accurate    Positive
Adult            −.047       −.041        .102
Child             .122       −.089       −.048

Note. IPS = interpersonal perceptual style. As in
Table 2, values are mean standardized scores, in
this case of the frequencies of cooperative behaviors
that interactants displayed. Again, the overall
mean was 0, and σ was 1.

they knew they were being observed, and
they interacted for a relatively brief period 
of time (30 minutes)—in which both adult 
and child most likely were “putting their best 
foot forward.” The pattern of social behavior 
that negatively biased persons exhibited 
would be expected of those people who have 
a propensity to focus on children’s negative 
behaviors, since it is likely that persons of 
this type would feel that they could not trust 
the child to act “appropriately” without “ade- 
quate” supervision. 

On the other hand, positively biased per- 
sons, especially males, tended to be more 
helpful, if also somewhat more structuring. 
Again, such activity is reasonable in someone 
who is relatively inattentive to what most 
people would consider to be negative behavior 
in children. In this cooperative social context, 
though, a positive bias probably was con- 
gruent with the child’s actual behavior in the 
playroom. 

In addition, the behavior of the children to 
some extent was related to the adult’s IPS, but 
not as much as we had expected. Perhaps the 
magnitude of the effect of adult IPS on the 
child’s behavior is reasonable, given that this 
initial encounter between adult and child was 
brief and known to be transitory. It is in- 
teresting to speculate, however, that if IPS 
continues to affect adult behavior past the 
initial stage of an interpersonal relationship 
with a child, it could come to have a sub- 
stantial impact on the child’s behavior as well. 

One conclusion that could be derived from 
the present findings is that a positive per- 
ceptual bias promotes positive interpersonal 
functioning. No doubt this was somewhat the 
case for the social context that we examined 
in Study 1—a cooperative, brief encounter 
with a child. However, it appeared reasonable 
to explore the possibility that positive bias 
would be related to less effective behavior in 
other social contexts—for example, when per- 
sons must deal with each other in an encounter 
that is not entirely cooperative. This possibil- 
ity was the focus of Study 2. 


Study 2 


Specifically, in Study 2 the relationship between
IPS and interpersonal behavior in a mildly
confrontational dyadic interaction was
examined via three separate sets of dependent 
measures: a poststudy questionnaire, which 
measured the participants’ reactions to their 
partner; observers’ ratings of the subject and 
of the total interaction; and structural mea- 
sures of the interaction, which consisted of the 
times to completion and the types of outcomes 
that occurred in a revealed differences task. 
We expected that, as in Study 1, IPS would 
be related to the social behaviors that occurred 
in an initial encounter between the perceiver 
and another person; and, since this encounter 
required that persons actively defend their 
opinions to someone whose position was op- 
posed to theirs, we further expected that posi- 
tively biased interactants would behave the 
least effectively. 


Method


Subjects and Research Assistants


Subjects were 48 male and 48 female undergrad- 
uates who previously had participated in Study 1. 
Eight subjects were randomly chosen from the 10-14 
subjects in each condition of that study, and each 
was paid $5 to participate in Study 2.⁶

Four confederates (two male and two female), 
four coders (two male and two female), and a male 
experimenter were selected from undergraduate psy- 
chology majors. These assistants were kept unaware 
of the purpose of the research and the reasons why 
subjects were selected to participate until all data
were collected. They received course credits for
participation in the study.


Design 


This study used a 2 X 2 X 3 factorial design, with 
sex of the subject, sex of the confederate, and IPS,
respectively, as independent variables. Each cell
contained eight dyads (subject and confederate). Subjects
were paired with a confederate of the same sex as 
that of the target child whom they had viewed in 
the standard perceptual stimulus. 
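
As a quick check on the design described above, the cells of the 2 × 2 × 3 factorial can be enumerated and the total number of dyads compared with the 96 subjects reported. This is only an illustrative sketch; the level labels are shorthand chosen here, not terms fixed by the article.

```python
from itertools import product

SUBJECT_SEX = ("male", "female")
CONFEDERATE_SEX = ("male", "female")
IPS_LEVELS = ("negative bias", "accurate", "positive bias")
DYADS_PER_CELL = 8  # stated in the Design paragraph

# Every combination of the three independent variables is one cell.
cells = list(product(SUBJECT_SEX, CONFEDERATE_SEX, IPS_LEVELS))
total_dyads = len(cells) * DYADS_PER_CELL

# Dyads containing a male subject, for comparison with the 48 male subjects.
male_subject_dyads = DYADS_PER_CELL * sum(
    1 for subject_sex, _, _ in cells if subject_sex == "male"
)
```

The enumeration gives 12 cells and 96 dyads, 48 of them with a male subject, which is consistent with the 48 male and 48 female undergraduates described under Subjects.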


Procedure and Instrumentation 


The subjects were observed interacting with a 
peer (confederate) in a structured situation that 
consisted of a revealed differences task. After the
subject and confederate had completed a 17-item
attitude questionnaire, the experimenter selected 3
items (those the subject had responded to with a
strong opinion) and instructed the dyad members to
try to reach an agreement about the items during
the next 30 minutes.⁷ All confederates were trained
such that they could always advocate, in a rational 
and nonthreatening manner, a position opposite to 
whatever the subject defended. The procedure used 
in this study was very similar to the revealed differ- 
ences technique developed by Strodtbeck (1951). 
After completion of the interaction segment of the 
study, the subject and confederate were asked to fill 
out a poststudy questionnaire. In the first part, the 
respondents indicated along a 7-point scale how well 
each of 36 personality traits described their partner. 
The endpoints for the 7-point scales were labeled 
“very” and “not at all.” The 18 positive and 18 
negative trait scores were averaged to yield a total 
positive score and a total negative score. The second 
part of the questionnaire contained five questions. 
Four items consisted of evaluating the respondent's 
partner on 7-point dimensions of persuasiveness, lik- 
ability, ability to win affection and liking from others, 
and ability to fit in with the respondent’s circle of 
close friends. The remaining item assessed how com- 
fortable the individual felt during the study. 
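
The scoring of the first part of the poststudy questionnaire can be sketched as follows. The trait names are placeholders, since the article does not list the 36 traits, and the numeric coding of the scale (1 = “not at all”, 7 = “very”) is an assumption; the article gives only the endpoint labels.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 7  # assumed coding: 1 = "not at all", 7 = "very"


def total_trait_scores(ratings, positive_traits, negative_traits):
    """Return (total positive score, total negative score) as mean ratings."""
    for trait, rating in ratings.items():
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"rating for {trait!r} is off the 7-point scale")
    positive = mean(ratings[t] for t in positive_traits)
    negative = mean(ratings[t] for t in negative_traits)
    return positive, negative


# Placeholder trait names: 18 positive and 18 negative, as in the article.
positive_traits = [f"positive_trait_{i:02d}" for i in range(1, 19)]
negative_traits = [f"negative_trait_{i:02d}" for i in range(1, 19)]

# Hypothetical respondent who rates the partner 6 on every positive trait
# and 2 on every negative trait.
ratings = {t: 6 for t in positive_traits}
ratings.update({t: 2 for t in negative_traits})

total_positive, total_negative = total_trait_scores(
    ratings, positive_traits, negative_traits
)
```

Averaging rather than summing keeps both totals on the original 7-point metric, so they can be read against the same scale as the single-item ratings in the second part of the questionnaire.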


Coding of Peer Interaction 


Two coders independently rated the interaction
occurring in a session. Each coder rated an equal
number of subjects from each condition. Coders
viewed the session through a one-way mirror and
heard the interaction through headphones. The coders
rated the dyad and subject using five 7-point semantic
differential scales. The scales consisted of rating the
total interaction (positive-negative) and the subject’s
friendliness, intelligence, comfort, and nondefensiveness.⁸


6 Note that 2 to 3 weeks elapsed between subjects’
participation in the two studies. Note also that
subjects never were informed that the two projects
were connected in any way, nor were they told why
they were selected to participate. They were led to
believe, when they originally responded to the
advertisement, that a group of researchers who would
pay people to participate in their separate projects
were pooling their recruitment efforts. Of course, the
personnel that subjects saw in each study were
entirely different.

7 Actually, to make training of the confederates
a more reasonable task, only 8 of the 17 issues were
ever chosen for discussion: (a) legalization of
marijuana use, (b) women’s roles, (c) amnesty for draft
resisters, (d) the future of the family, (e) Ford’s
pardon of Nixon, (f) the Protestant Ethic, (g)
ecology versus progress, and (h) the role of the
university. Because of their salience to subjects at
the time that the study was conducted, Topics [illegible]
e were almost always used, with the remainder used
less frequently and more or less equally. Note,
however, that there was no relationship between the issues
chosen for discussion and subjects’ sex and IPS.


Table 4
Means of Measures Yielding Significant IPS Effects

                                                 IPS category
                                         Negative              Positive
Variable                                 bias       Accurate   bias       F

Coders’ overall rating of dyads
  containing male subjectsᵃ              4.93       5.21       3.46       4.87***
Proportion of time spent reaching
  moderate disagreement with
  female confederate                     .245       .278       .0575      3.24**
Coders’ rating of friendliness when
  confederate was femaleᵃ                5.57       5.67       4.55ᵇ      2.87*
Coders’ overall rating of dyads
  containing female confederatesᵃ        5.14       4.92       3.55ᵇ      4.04**
Interactants’ rating of each other’s
  persuasibility when confederate
  was femaleᵃ                            4.03       3.70       2.93       3.04**

Note. IPS = interpersonal perceptual style.
ᵃ Scores for these variables could range from 1 (very negative, unfriendly, etc.) to 7 (very positive, friendly, etc.).
ᵇ Value differs significantly from mean for IPS accurate category.
[illegible]p < .075.


Results


The dyad was used as the unit of analysis,
with the exception of the four global ratings
that the coders made only of the subject. The
16 dependent measures, comprised of the
coders’ ratings, the structural measures, and
the poststudy questionnaire, were subjected
to a 2 (sex of subject) × 2 (sex of confederate)
× 3 (IPS) multivariate analysis of
variance.⁹

The MANOVA revealed two significant
multivariate effects involving IPS: the IPS × Sex
of Subject interaction, F(32, 100) = 1.34,
p < [illegible], and the IPS × Sex of Confederate
interaction, F(32, 100) = 1.25, p < .21. These
effects reflected six significant univariate
effects. The IPS × Sex of Subject interaction
was significant for the coders’ overall rating of
dyads, F(2, 65) = 5.24, p < .01; the IPS
× Sex of Confederate interaction was significant
for the proportion of time that dyads
spent discussing issues that resulted in
moderate disagreement, F(2, 80) = 4.38, p < .02;
for the coders’ rating of the subjects’
friendliness, F(2, 65) = 2.47, p < .1; for
nondefensiveness, F(2, 65) = 2.59, p < .1; for overall
rating, F(2, 65) = 3.36, p < .05; and for the
interactants’ rating of each other’s
persuasibility, F(2, 80) = 3.18, p < .05.

8 Coders also scored the interactions in a more
molecular fashion, using Borgatta’s (1963) Behavior
Scores System. While these data yielded a number of
reasonable differences as a function of IPS, these
findings tended to be redundant with, and somewhat
less sensitive than, the data derived from the coders’
semantic differential ratings. Thus, for reasons of
efficiency, the results of the molecular coding are
not presented in this report.

9 Because of mechanical and procedural difficulties,
we were unable to obtain coders’ ratings for 15
dyads; moreover, assorted data were missing for
4 other dyads as well. Thus, the sample for the
MANOVA had to be restricted to the 77 dyads for
which there were complete data. However, it was
thought best, where possible, to examine univariate
effects (when the multivariate effect was significant)
from analyses of variance that were performed on
the larger sample.


Table 4 summarizes the results of univariate
simple effects analysis by presenting the relevant
means and significant F ratios. As predicted,
the data presented in Table 4 suggest
that positively biased subjects were the
least effective participants in this mildly con- 
frontational social encounter. Coders rated the 
interactions involving positively biased males 
the most negatively; likewise, coders were the 
least positive in their impression of interac- 
tions between positively biased subjects and 
female confederates. They also judged posi- 
tively biased subjects to be the least friendly 
toward their female partners. Moreover, this 
combination of interactants spent the least 
time coming to moderate disagreement— 
which probably was the most appropriate out- 
come, since confederates were programmed to 
concede points very slowly—and rated each 
other as the least persuasive. 


Sex Effects 


The MANOVA also revealed that the multivariate
main effect of sex of confederate was
significant, F(16, 50) = 4.24, p < .0001. Ex- 
amination of the underlying univariate effects 
revealed that while the coders judged sub- 
jects to be more intelligent and comfortable 
when they interacted with a female confed- 
erate, the interactants themselves were much 
less favorably impressed with each other (i.e., 
they rated each other as more negative, less 
likable, less persuasive, and so forth) when 
the confederate was female. 


Discussion 


The results from Study 2 replicated those 
of Study 1, in that, as predicted, IPS was 
related to overt social behavior. More spe- 
cifically, in the somewhat confrontational so- 
cial context of Study 2, positively biased per- 
ceivers engaged in the most dysfunctional 
interaction, especially when paired with a fe- 
male partner. During the interaction, the fe- 
male confederates were assertive, confident, 
and rational, which is contrary to the common 
stereotype of females as submissive, weak, 
and emotional. Perhaps the positively biased 
perceivers had the most difficulty with female
confederates because their self-image is firmly
grounded on the traditional stereotype of
what a male and female should be. Therefore,
when the female did not behave according to
her traditional role, the positively biased
perceivers might have felt the most threatened.

Taken together, the two studies suggest 
that persons with an accurate IPS exhibit the 
most adaptive social behavior across social 
contexts. Persons with a negative IPS were
reasonably adept at interacting within a
situation that contained an element of conflict (i.e.,
the revealed differences discussion), but were 
less able to behave appropriately in a more 
cooperative setting. Persons with a positive 
IPS showed the opposite pattern; they did 
very well when interacting with a child who 
was on his or her “good behavior,” but ap- 
peared to be somewhat inept when confronted 
with a mild social conflict. 

Thus, the two studies demonstrate that
there is a link between at least one
perceiver-based person perception process and social
behavior. Clearly, though, further work is
needed. The theoretical perspective that served
as the basis of this work (e.g., Combs & Snygg,
1959) postulates that perceptual mechanisms
like IPS cause social behavior. It also could
be, however, that IPS and interpersonal
behavior are related in part because they both
are manifestations of some other variable
(e.g., need for security). This latter
interpretation is plausible, and it is likely to be
valid to some extent. IPS is likely to be
determined by a number of personality variables
that also have a direct impact on social
behavior. But it is also likely that there is a
direct, causal link between person perception
and social behavior as well. As with many
phenomena in social psychology, it is probably
the case that person perception processes
like IPS are both concomitant variables of,
and causes of, interpersonal behavior.
Establishing empirically the precise relationship
between IPS, other components of personality,
and overt social behavior would be a
difficult task. However, the findings of the
present research suggest that the undertaking
of this task would be a worthwhile and
fruitful endeavor.


Affective States, Expressive Behavior,
and Learning in Children 


John C. Masters, R. Christopher Barden, and Martin E. Ford 
University of Minnesota


Two experiments with children are presented that illustrate the effects of emo- 
tional states on learning and validate experimental affect-induction procedures 
in which individuals dwell upon thoughts of affect-provoking experiences. Posi- 
tive affective states enhanced learning, and negative states retarded it dramat- 
ically. Ratings of children’s facial expressions confirmed that positive affect- 
induction procedures elicited happy expressions, and negative inductions elicited 
sad ones. Additionally, positive affect inductions enhanced children’s apparent 
interest, involvement, and arousal, and negative inductions decreased them. 
These measures were related to learning but proved not to be the sole mediators 
of the impact of affective states on learning. The thoughts children generated 
for affect induction illustrated their recognition of naturalistic experiences that 
induce affective states. These results indicate that young children possess the 
potential for the cognitive self-control of their own affective states, and the 
effects on learning indicate that even transient mood states may produce lasting
changes in behavior.


Experimentally Induced Affective States 


Studies of experimentally induced affective 
states pose a significant contribution to the 
scientific study of such states and the deter- 
mining role that emotions may play in shap- 
ing other behavior patterns. It has already 
been clearly demonstrated that induced affec- 
tive states in children affect patterns of self- 
gratification (Moore, Underwood, & Rosen- 
han, 1973; Rosenhan, Underwood, & Moore, 
1974), altruism (Harris & Siebel, 1975; 
Moore et al., 1973; Rosenhan et al., 1974;
Underwood, Froming, & Moore, 1977), and 
aggression (Harris & Siebel, 1975), and mood 
states thus seem likely to determine the per- 
formance of a broad range of social behaviors, 
many of which are so far unstudied. There
are also compelling arguments for the
possibility that affective states may influence
learning and consequently have far-reaching
effects on the development of new behavior
patterns through social learning or on
intellectual functioning in achievement settings.


Preliminary versions of this experiment constituted
an undergraduate summa cum laude honors thesis
by the second-named author. The able assistance of
Craig Binger is gratefully acknowledged.

Requests for reprints should be sent to John C.
Masters, Institute of Child Development, 51 East
River Road, University of Minnesota, Minneapolis,
Minnesota 55455.


Consequences of Affective States for Behavior
and Learning


Theoretically, the effects of a given affective
state may be due to a variety of mediating
processes, perhaps different ones, depending upon
the behavior pattern being affected. For
behaviors such as self-gratification or altruism,
it has been proposed that the pleasant or aversive
nature of a given state serves as a motivational
determinant for behaviors that
maintain the state if it is positive or [illegible]
the state if it is negative (Moore et al., 1973;
Rosenhan et al., 1974; Underwood et al.,
1977). The influence of affective variables
on persistence at effortful behavior may be
mediated by reinforcement effects, if the
variables are of positive or negative valence, such
as favorable or unfavorable self-evaluations
(Masters & Santrock, 1976), and if they are
consequent to the effortful behavior. There is
some evidence, however, that the impact of
self-evaluations and their emotional
concomitants may affect learning through incentive
or other motivational mechanisms that
are not consequent to learning but actually
occur in anticipation of intellectual mastery
(Masters, Furman, & Barden, 1977). Thus,
mood states bearing no contingent relationship
to performance may affect performance,
learning, or mastery not through reinforcement
processes but through motivational or
arousal components.

Affective states may also have an indirect
effect on behavior and learning: an impact
mediated by the actions of others whose
behavior towards an individual is influenced by
that individual’s apparent affective state.
There is ample evidence that people’s beliefs
and biases about the characteristics of others
influence their behavior towards these others
(Mischel, 1968; Rosenthal, 1966, 1969).
Certainly, a child who appears happy or sad,
interested, involved, or aroused may elicit
different treatment from parents, teachers, peers,
or other agents of socialization, and this
treatment is then likely to influence behavior and
apparent competence in social or intellectual
contexts. Facial expressions of emotion are
common and socially significant. They are
also scientifically significant and have been
used successfully to index internal affective
states (Ekman, Friesen, & Ellsworth, 1971;
Ekman et al., 1972). However, investigations
utilizing experimental affect-induction
procedures have failed to gather direct validating
evidence, such as changes in affective facial
expressions. The demonstration of appropriate
changes in facial expressions following
different affect-induction manipulations would
strengthen the assumption that affective states
are actually induced, as well as give some
indication of changes in a child’s social stimulus
characteristics as a function of the
affect-inducing experience.

Two experiments are presented that examine
the effects of positive (happy), neutral, and
negative (sad) affective states on children’s
mastery of a learning problem. In addition to
the affective valence, the tempo of a given
emotional state was independently manipulated
within an induction procedure to establish
active or passive forms of the positive,
neutral, and negative affective states.¹ Three
basic predictions were made: (a) Positive af- 
fective states will enhance learning, and nega- 
tive states will interfere; (b) a tempo com- 
ponent that is active will promote effective 
learning, and one that is passive will not; 
and (c) induced active tempo will increase 
the speed of ongoing learning behavior (aver- 
age time to solve individual problems). These 
predictions were based on the following rea- 
soning: 

1. Positive states may also enhance the ef- 
fectiveness of social reinforcers (right-wrong 
feedback delivered by an experimenter; 
Byrne & Clore, 1970; Gouaux & Gouaux, 
1971; Izard, 1964). 

2. Negative affective states will interfere
with learning through distraction, by divert- 
ing attention away from the task and onto 
attributes of the self (Mischel, Ebbesen, & 
Zeiss, 1973) and by failing to induce any in- 
centive for mastery (Masters et al., 1977). 

3. Positive affective states may also mo- 
tivate greater attention to the task plus a 
striving for success, since success experiences 
will maintain the affective states (Masters et 
al., 1977; Mischel et al., 1973; Rosenhan et 
al., 1974). 

4. Induced active tempo will enhance learning
accuracy through increased arousal, interest,
and involvement in the task, compatible
with the long history of research bearing
upon the facilitative effects of moderate
arousal on learning (e.g., Izard, 1964; Izard, 
Wehmer, Livsey, & Jennings, 1965; Velten, 
1968; Woodworth & Schlosberg, 1954). 

Experiment 1 was conducted to evaluate
the accuracy of the predictions regarding the
effects of affect and tempo on learning.
Experiment 2 replicated Experiment 1 and in
addition provided data relevant to hypotheses
concerning other consequences of affective
states. In this experiment, children’s facial
expressions were recorded on videotape following
the affect and tempo induction procedure;
these expressions were rated according
to the degrees of various emotions that
were apparent and according to expressional
indexes of factors related to learning, such as
interest, involvement, and arousal (Ekman
et al., 1971; Ekman et al., 1972). For purposes
of convergent and discriminant validation
of the affect-induction procedure, children’s
facial expressions were rated on convergent
dimensions of happiness (positive
affective state) and sadness (negative affective
state) and on divergent dimensions as well
(e.g., fear, disgust, surprise). To provide
additional validity evidence, the thoughts that
children generated for the affect induction
were independently rated for affective tone
by a panel of adult raters.


¹ It should be noted at this point that the interaction
of arousal and valence may produce different
labels for affective states (e.g., active-positive might
be termed elation; passive-positive could be serenity;
active-negative might be agitated depression or
anger; passive-negative might be depression, sadness,
or grief). Such labels will not be used here, nor were
they utilized in the experiments, since the influence
of such labeling was not under investigation.


Method 
Subjects 


For each experiment, 48 4-year-old children served
as subjects. Children were drawn from nursery
schools in a large metropolitan area. They were
equally divided by sex and randomly distributed
according to socioeconomic status and racial
background. The experimenter was a male adult.


Procedural Overview 


the learning task, At 
recall the thought ti 
affect induction, 


Experimental Manipulations 


There were three levels of affect and two of tempo,
resulting in six experimental conditions. The
affect-induction procedure was similar to that employed
by earlier investigators (Masters & Furman, 1976;
Rosenhan et al., 1974) and involved asking the child
to generate a thought of the appropriate affect and
tempo and to concentrate on it for 30 sec. The child
did not communicate the content of his or her
thought to the experimenter until the experiment was
completed, to minimize any experimenter bias. At
that time, a record was made of the thought content.
The affect and tempo induction was as follows:


You know, (subject’s name), sometimes things 
happen to us that make us feel happy, and some- 
times things happen to us that make us feel sad, 
and sometimes things happen to us that don’t make 
much difference to us. Can you remember some- 
thing that happened to you that made you feel: 


(positive affect, active tempo) so happy that you 
just wanted to jump up and down? 

(positive affect, passive tempo) so happy that you 
just wanted to sit and smile? 

(neutral affect, active tempo) like jumping up and 
down? 

(neutral affect, passive tempo) just like sitting? 

(negative affect, active tempo) so sad that you 
wanted to jump up and down? 

(negative affect, passive tempo) so sad that you 
just wanted to sit and frown? 


Now (subject’s name), I want you to use your
imagination, all right? Our imagination really
works sometimes, doesn’t it? It can really make us
believe things, can’t it? All right now, (subject’s
name), just look at my light box (the learning
equipment) and really use your imagination for
a few seconds and think of something that makes
you feel (same as above). That’s right, think of
something that makes you feel (same as above).


The child was then given 30 sec to stare at a plain
gray box and think.


Learning Task 


Following the affect/tempo induction procedure,
children were introduced to the learning task,
involving shape discrimination (Masters et al., 1977).
There were 12 individual discrimination problems
making up a trial block, and a “tower of lights”
indicated a child’s performance by illuminating
[...] of lights, one for each problem solved correctly
by the child. Children were given up to 10 trial blocks
to achieve perfect mastery of the 12 problems; an
additional trial block was run for those children
whose performance was less than perfect on the [...]
trial, in order to make certain that the right-wrong
feedback during that trial had not allowed them
finally to achieve mastery. There were three [...]
in each problem, one of which was designated correct,
and their order in a horizontal array [...]
for each presentation of the problem. The measures
of rate and accuracy of learning utilized for purposes
of analysis were the amount of time taken to reach
perfect mastery, the number of trials to perfect
mastery, the total number of errors, and the mean
amount of time spent considering individual problems.


Dependent Measures of Expressed Affect


In Experiment 2, children’s facial expressions were
recorded for a 30-sec period at two times during
the experimental session. The first recording occurred
immediately following the affect-induction period and
prior to the learning task (pretask), providing
an index of the impact of the induction procedure
and allowing the assessment of affect, attention, and
[...] that might be predictive of learning.
The second recording occurred immediately following
the completion of the learning task (posttask)
and tapped the continued impact of the induction
procedures as they had persisted throughout
the learning task, plus any effects generated by
the learning experience itself.

Facial expression ratings were completed according
to the categories defined by Ekman (Ekman et
al., [...]). The following categories were utilized:
happiness, sadness, pleasantness, interest, involvement,
arousal, anger, disgust, fear, pain, and surprise. Two
raters independently viewed the videotaped segments,
unaware of the nature of the experimental
conditions or the hypotheses involved. They were
asked to consider the duration of a facial expression,
its relative intensity, and the frequency of its
occurrence during the 30-sec period, in order to
[...] dimensions (median agreement = 94%).


[Figure 1: two panels, Experiment 1 and Experiment 2;
vertical axis, number of trial blocks to mastery;
horizontal axis, affect induced (positive, neutral,
negative); separate lines for active and passive tempo.]

Figure 1. Mean number of trial blocks (maximum = 10) to perfect mastery for both experiments
as a function of affective valence and active or passive tempo.


Thoughts Utilized in Affect and Tempo 
Inductions 


The thoughts that children generated on the af- 
fect induction in Experiment 2 were submitted to 
five adult judges who rated them on both affective 
tone and tempo. After the initial ratings were com- 
pleted, the raters were informed of the experimental 
manipulations and were allowed to read the experi- 
mental instructions for all conditions. They were 
then asked to review the thoughts once more and 
estimate (a) which affect induction the child had 
experienced and (b) which tempo induction the child 
had experienced. Average interrater agreement across
all five raters ranged from .65 to .92, with a median
value of .73.


Results 


Effects of Induced Affective States on Speed 
and Accuracy of Learning 


The four learning measures (total time to 
mastery, number of trials to mastery, total 
number of errors, and mean amount of time 
spent on each problem) were significantly in- 
tercorrelated; consequently, a single multi- 
variate analysis of variance was performed. 
The results were identical for the two experi- 
ments, and since Experiment 2 also contained 
the measures of affective expression, only the 
statistical results of Experiment 2 are spe- 
cifically reported. 

The multivariate analysis revealed significant
main effects due to both affect and
tempo, F(8, 144) = 53.59, p < .001, and
F(4, 144) = 43.11, p < .001, respectively.
The effects of affect and tempo as manifested
on one dependent variable, the number of
trials to mastery, are depicted in Figure 1.
Individual group comparisons (a comparison
to the neutral condition for the affect variables)
were conducted using the Duncan
multiple-range test. These comparisons revealed
that positive affective states and active tempo
significantly increased the overall rate of
learning as well as the speed with which children
processed individual problems, whereas
both negative affective states and passive
tempo produced relatively poor learning rates
and retarded the speed with which individual
problems were processed.

The interaction between affect and tempo
also proved significant, F(8, 144) = 12.11,
p < .001. In general, the difference between
active and passive tempo was enhanced for
the negative affect condition, although this
was less true for the mean time spent on each
problem than it was for the other learning
measures. This interaction, which can be seen
particularly well in Figure 1 with respect to
one learning variable, may be due to a partial
ceiling effect for the positive affect conditions,
in which children often learned in as few trial
blocks as possible, many showing perfect
mastery after only the second block of learning
trials (accuracy on the first trial block
was chance, since correct solutions were not
given in advance).

It is of interest to note here the contrast 
between the different effects of affective states 
on the overall number of trials to task mas- 
tery and the speed with which children pro- 
duced solutions for individual problems. 
Those affect and tempo conditions that pro- 
moted fewer errors and rapid overall mastery 
in fewer trials also increased the speed with 
which children solved individual problems. 
Under conditions of various affect and tempo 
inductions, the “impulsive,” rapid answer was 
more likely to be correct, and the slower,
“reflective” response incorrect. The correlation
between number of trials to mastery and
mean time per problem was r(48) = .47,
p < .01.


Relations Between Expressed Affect
and Learning


A multiple regression analysis was completed
to provide information about the relations
between affective states and learning.
Measures of expressed happiness, sadness,
interest, involvement, and arousal were sig- 
nificantly related to all measures of learning, 
with the posttask measures being slightly 
more efficient predictors than the pretask 
measures. Children appearing happier, more 
interested, involved, and aroused required 
fewer trials for mastery, made fewer errors, 
and produced their answers to individual 
problems more rapidly. Children appearing 
sadder and not particularly interested, in- 
volved, or aroused required more trials, made 
more errors, and spent longer on the solution 
of individual problems.

Since the measures of expressed affect and
of interest, involvement, and arousal were
quite highly correlated with one another,
median r(48) = .51, further multiple regression
analyses were undertaken to determine
the degree to which expressed affect and
interest/arousal variables contributed to the
variance in learning. The multiple correlations
between the prediction set and two measures
of overall rate and accuracy of learning
were .72 and .64 (df = 6, 41, p < .01) for
trials to mastery and number of errors,
respectively. Significance tests for individual
beta weights revealed that expressed happiness
and sadness were the most important
predictors of the rate and accuracy of
learning, and expressions of apparent interest,
involvement, or arousal contributed little. The
multiple correlation using affect and interest/
arousal variables to predict processing speed
for individual problems was R(6, 41) = [...],
p < .05. Only expressed sadness had a significant
beta weight in the predictor equation
for the amount of time spent on individual
problems, a result that interestingly parallels
the clinical picture of depression as one of
psychomotor retardation.

To test further the implication that expressed
pretask and posttask interest, involvement,
and arousal were less powerfully
related to learning than affect, a stepwise
multiple regression analysis was performed,
this time entering the predictors in the equation
in two orders. First the affect predictors
were entered, followed by the interest/arousal
variables, and then the reverse. For the rate
and accuracy of overall task mastery, each
set of variables accounted for a significant
amount of variance, but the magnitude of the
contributions was very different. The affect
measures added between 20% and 30% more
variance to the equation when added after
the interest/involvement/arousal variables,
whereas the latter added only from 1% to
3% more variance when the affect measures
had been included first (but this slight addition
was still significant). With respect to
the amount of time children spent on individual
problems, the stepwise regression
analysis revealed that the interest/involvement/
arousal variables did not contribute
significantly to the predictor equation.
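
The two-order entry comparison described above can be illustrated with a minimal sketch. It is not the authors' analysis: the data are synthetic, the variable names (`affect`, `interest`, `trials`) and the generating coefficients are assumptions chosen only so that the affect set carries most of the unique variance, and ordinary least squares stands in for whatever stepwise routine was actually used.

```python
import numpy as np


def r_squared(X, y):
    """R^2 from an ordinary least-squares fit of y on X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


def incremental_r2(first, second, y):
    """Variance added by `second` once `first` is already in the equation."""
    full = r_squared(np.column_stack([first, second]), y)
    return full - r_squared(first, y)


# Synthetic illustration only (hypothetical values, not the study's data).
rng = np.random.default_rng(0)
n = 48                                   # same sample size as each experiment
affect = rng.normal(size=(n, 2))         # stand-ins for happiness, sadness
# Interest/involvement/arousal ratings correlated with the first affect rating.
interest = 0.5 * affect[:, [0]] + rng.normal(size=(n, 3))
trials = (-1.0 * affect[:, 0] + 0.8 * affect[:, 1]
          + 0.1 * interest[:, 0] + rng.normal(scale=0.7, size=n))

# Order 1: interest/arousal first, then affect. Order 2: the reverse.
gain_affect = incremental_r2(interest, affect, trials)
gain_interest = incremental_r2(affect, interest, trials)
print(f"affect added after interest/arousal: {gain_affect:.3f}")
print(f"interest/arousal added after affect: {gain_interest:.3f}")
```

Because the two predictor sets are correlated, the variance credited to each set depends on the order of entry; comparing both orders, as the text does, isolates each set's unique contribution.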

These results indicate that positive and
negative expressed affective states are strongly
associated with the overall rate and accuracy
of children’s learning, and negative states
lessen the speed of cognitive processing, or
at least the rapidity with which a solution is
produced. Apparent interest, involvement, and
arousal are also related significantly to learning,
but the relation is modest at best, and it
is unlikely that these variables are major
mediating factors in the relation between
affective states and learning.


Convergent and Discriminant Validation of
Affect-Induction Procedures


Facial expression ratings. In order to assess
the convergent and discriminant validity
of the affect-induction procedures, analyses
were performed to determine the effects of
the experimental conditions on pretask
ratings of affective states from the facial
expressions. Two types of convergence
and discrimination were examined: (a) the
degree to which affect-induction manipulations
influenced the appropriate facial expression
ratings whereas tempo-induction manipulations
did not and (b) the degree to which the
affect-induction procedures influenced the
appropriate affect ratings (pleasantness, happiness,
sadness) in an understandable fashion
and did not influence ratings of divergent
affective states (anger, disgust, surprise,
pain).


Table 1
Mean Degree of Expressed Affect Following
Positive, Neutral, and Negative Affect
Inductions

                    Affective expression rating(a)
Affect induction    Happiness    Sadness

Positive            6.76         1.03
Neutral             3.91         1.44
Negative            2.91         3.04

Note. n = 16 for each mean.
(a) Maximum = 9.00; minimum = 1.00.

Analyses of the effects of sex, affect induc- 
tion, and tempo induction on pretask ratings 
of happiness and sadness revealed significant 
effects only for affect induction (happiness:
F(2, 36) = 24.99, p < .01; sadness: F(2, 36)
= 5.44, p < .01). Means are presented in
Table 1, and they reveal clearly that the ef- 
fects were appropriate for the different induc- 
tion conditions. Individual group comparisons 
indicated that for happiness, the positive af- 
fect condition differed significantly from both 
the neutral and negative conditions. For sad- 
ness, the negative affect condition differed 
significantly from the neutral and positive 
conditions, and the positive condition dif- 
fered from the neutral condition as well. Fig- 
ure 2 illustrates the facial expressions of four 
children, two following the induction of a 
positive state and two following the induc- 
tion of a negative state. 

Discriminant validity was revealed by an
examination of the effects of the affect-induction
procedures on the ratings of divergent
emotional states. There were no significant
effects whatsoever from the affect-induction
procedures on ratings of anger, disgust, surprise,
or pain. A significant main effect of
induced affect was observed for ratings of
fear, F(2, 36) = 5.01, p < .05. These ratings
were all quite low, and children in the neutral
condition were rated as showing somewhat
more fear (M = 2.16) than children in
either the positive (M = 1.07) or negative
(M = 1.25) affect conditions. This finding
may indicate simply a mild degree of apprehension
in children after the difficult assignment
of determining a neither happy nor sad
experience to dwell upon, with no feedback
given regarding their success.


Figure 2. Facial expressions, photographed from videotape, of four children in
Experiment 2: two from the positive affect-induction conditions and two from the
negative affect-induction conditions.


Analyses of affect- and tempo-inducing
thoughts. The thoughts generated by children
were rated for their degree of intrinsic
happiness, sadness, activity, or passivity.
These ratings were analyzed to determine the
degree to which the experimental conditions
of affect and tempo induction inclined young
children to generate thoughts that were
identifiable by adults as matching the intended
affect or tempo manipulation. The
ratings were submitted to two-way analyses
of variance with primary dimensions of affect
and tempo induction. The means for all analyses
are presented in Table 2.

The analysis of happiness ratings revealed
a significant main effect for the affect induction,
F(2, 23) = 22.67, p < .01. The thoughts

produced by children to effect the positive
affect induction were rated significantly more
happy than those produced by children in
the neutral condition, who in turn produced
thoughts judged happier than those of
children in the sad condition. Analysis of sadness
ratings also revealed a significant main effect
of affect induction, F(2, 23) = [...], p <
.01. The thoughts produced by children in
the sad condition were rated more sad than
those from children in the neutral condition,
who in turn produced thoughts rated more
sad than those from children in the happy
condition.

Analysis of the activity and passivity ratings
revealed main effects of tempo, Fs(1,
39) = 5.82 and 6.51, p < .01, for activity
and passivity, respectively, and a significant
interaction between activation and affect
induction, Fs(2, 23) = 8.47 and 7.4[...], p < [...].
Thoughts from children in the active tempo
conditions were rated significantly more active
than those from children in the passive conditions,
and the reverse was true for the
passivity ratings. However, these findings
were qualified by the interactions. Thoughts
from children in the neutral condition were
clearly recognizable as being from either the
active or passive activation induction according
to both rating dimensions. However, this
discrimination was greatly reduced for thoughts
from children in the positive affect condition.
Furthermore, the thoughts generated by children
in the negative affect condition were
actually misjudged by raters. There was a
tendency to judge thoughts from the passive-sad
condition to be more active than those
from the active-sad condition, and vice versa
for the passive ratings. These results indicate
that the presence of a clear affective tone,
especially a negative one, tends either to confuse
children in their generation of active or
passive thoughts or to produce thoughts that are
active or passive to them but not to adult
raters judging those thoughts. The latter
seems the more likely explanation, since
the experimental results revealed effects of
tempo on learning, without any significant
[...] as a function of the affect induced.


Table 2
Mean Ratings of the Happiness, Sadness, Activity, and Passivity of Children’s
Generated Thoughts as a Function of Affect and Tempo Inductions

                                    Affect induction
Rating       Tempo        Positive    Neutral     Negative
dimension    induction    (n = 16)    (n = 16)    (n = 16)

Happiness    Active       7.95        5.35        2.16
             Passive      7.07        4.05        1.93
Sadness      Active       1.25        2.60        6.72
             Passive      1.23        2.60        6.43
Activity     Active       5.85        7.05        3.60
             Passive      4.50        2.70        5.10
Passivity    Active       2.70        2.20        5.12
             Passive      4.07        6.30        3.93


Discussion 


Affect in Social Learning 


The finding that affect strongly influences
learning advances our understanding of the
role that affective variables may play in
development. Mood states may be transient,
but if they influence adaptive learning, their
consequences may be widespread and of long
duration. The current affective state of an
individual in any context is an important 
person variable that is likely to mediate any 
influence of cognitive or contextual learning 
variables and may well constrain or amplify 
the possibility for prior experience and learn- 
ing to mold adaptive behavior. The present 
findings also indicate that the impact of emo- 
tional states should not be underestimated. 
The affective states induced in this experi- 
ment were surely less varied than the states 
induced by uncontrollable experiences children 
have in the natural environment. Most of the 
thoughts generated by children represented 
social experience with parents, teachers, or 
peers, and the thoughts must surely have in- 
duced weaker affective states than did the 
original experiences. 

The implications of the present findings 
extend beyond the academic or intellectual 
context. Although the present experiment 
utilized an intellectual learning task, there 
seems little reason to believe that affective 
states will not also affect children’s social 
learning in nonintellectual contexts. Thus, 
children’s sensitivity to social reinforcement 
or punishment or to the behavior of peer or 
adult models may depend at least in part upon 
the children’s ongoing affective state. The
effectiveness of social learning variables,
punishment for instance, may also be mediated
by the affective states that are induced by
the social learning variable itself during the
learning interaction. The role of affect as a
participant or mediating variable in intellectual
or social learning simply has not yet been
explored.

The fact that children’s self-generated 
thoughts can create intended affective states 
raises the possibility that children possess at 
least the potential ability to manipulate their 
own affective states and possibly intervene in 
a therapeutic fashion to terminate the per- 
sistence of a negative affective state or en- 
hance an existing positive state. There is some 
evidence to indicate that children may ameli- 
orate unpleasant experiences and affective 
states by selective exposure to contexts that 
are likely to induce opposing states or pro- 
vide desired experiences (Masters, Ford, & 
Arend, Note 1). The present results indicate 
that the “power of positive thinking” or a 
rational-emotive analysis (Ellis, 1962, 1971) 
is not without application to young children, 
and cognitive abilities to induce affective 
states may very early become part of an in- 
dividual’s armamentarium of self-regulatory 
skills, providing some degree of personal
control over mood states and their consequences.


Tempo and Affect 


The results of the present two investiga- 
tions indicate that the tempo component of 
an affective state may be an important con- 
tributor to the effects achieved on a broad 
range of behaviors. The learning data also 
revealed that inductions with an active tempo 
enhanced learning in much the same way that 
the induction of a positive affective state did. 
Nevertheless, in many instances there was a
significant interaction between activation and
affect, illustrating an apparent interdependence
of the valence and activation components
of specific affective states. Labels exist
in the language that appear to be appropriate
for different combinations of affect and tempo
(e.g., positive/active = elation or joy;
positive/passive = serenity, pleasure, or perhaps
contentment), and the present findings indicate
that a conceptual dissection of emotional
states into hypothesized dimensions beyond
merely that of affective valence is a heuristic
procedure for future research.


Children’s Comprehension of Affect and 
Affect-Inducing Experiences 


Little is known about children’s concepts of 
affect, but the present results indicate that 
even those of preschool children correspond 
quite well to those of adults. Preschool-age 
children’s comprehension of the affect that 
they do feel or are expected to feel in various 
circumstances is well developed, and they com- 
prehend the determinants of a given affective 
state well enough to propose appropriate 
antecedent conditions. In the present experiment,
it was difficult to tell which of the
thoughts that children generated were recollections
of their own past experiences and
which were more normative nominations of
experiences anyone might have that would
produce the affective state in question. Some
thoughts were of everyday experiences: sitting
and watching TV (neutral), helping to clean
up the house (negative), jumping on one foot
(neutral), or going on a vacation to [...]
(positive). Others, however, appeared to be
concocted rather than recalled, such as having
a turtle throw a shell at one (negative) or
when Santa comes and leaves no [...]
(negative). The response of at least one child
indicated an awareness of vicarious experiences
that may produce an affective state:
watching a friend on a bike make a [...]
jump (positive). Very young children
appear to have the ability to identify
affect-inducing experiences and a comprehensive
understanding of culturally defined affect
labels. The dimensions of these skills and
knowledge are at present only poorly understood.

The diversity of content spontaneously utilized
by children in generating the affect-inducing
thoughts also suggests the need for
further investigation into the classes of experiences
that may generate affective states.
Just as the effects of success or failure on
levels of aspiration are constrained by the
contextual character of that experience, so
that, for example, academic failure would not
necessarily lower aspirations for athletic
accomplishments, the contextual nature of
an affect-inducing experience may focus the
impact of the induced state upon relevant or
related subsequent behaviors. For example, in
the present experiments, some children generated
thoughts that were distinctly social in
nature (dad helping me jump up and down;
when grandma visits; when my father spanked
me), whereas others thought about contexts
that were private (candy; sitting and watching
TV; breaking toys accidentally). A
private-social dimension such as this may be
relatively unimportant for a private activity
like learning but be influential in determining
subsequent social behavior such as altruism.
Similarly, some children generated thoughts
about experiences with intrinsic social comparison
components (everybody’s eating candy
but me), whereas others nominated contexts
that contained peers, but without any comparative
overtones (when my friend comes
over and then leaves in a huff; when a friend
calls me a copycat). In short, the effects of
affective states on various social and intellectual
behaviors may be discriminated according
to aspects of the precipitating experiences
other than simply the affective valence.


Factors Mediating the Behavioral
Consequences of Affective States


The factors mediating the effects of affective
states on behavior have rarely been explored.
In the present instance, expressive
indicators of both pretask and posttask interest,
involvement, and arousal very clearly
were influenced by induced affective states
and were related to children’s learning. However,
careful analysis revealed that interest/
arousal factors were related to learning in a
manner independent of the affective factors,
and they were clearly not a prime mediator
for the effects of affective states on learning.
The role of attentional and arousal
factors cannot be ruled out, because of the
possibility that facial expressions are not the
best indicators of ongoing attentional behavior
during learning. There is growing evidence,
however, that changes in attentional
behavior are not a major consequence of
affective states. Underwood et al. (1977) have
already shown that measures of incidental
attention are unaffected by induced affective
states. More recently, Isen, Shalker, Clark,
and Karp (1978) found that positive affective
states affected the recall of positive memories,
which suggests that the impact of affective
states on learning may occur during retrieval
or recall performance rather than during
attentional or encoding phases.


Validation of Experimental Affect-Induction 
Procedure 


The inclusion of independent assessments 
of affective states provided clear convergent 
and discriminant validity for the experimental 
affect-induction procedure. The availability of 
techniques to manipulate emotional states 
under controlled conditions and in normal 
individuals is an important methodological 
advance, and there should be further attempts 
to demonstrate the validity of this manipula- 
tion through the use of independent indexes 
of induced affective states. In the present 
case, the use of expressed affect and interest/ 
arousal indexes also provides information 
about another aspect of emotional states, the 
degree to which they may alter the social 
stimulus characteristics of the person experi- 
encing them. Among the factors mediating the 
effects of emotional or mood states on be- 
havior are the social reactions of others, re- 
actions which may in part be determined by 
the appearance of the person experiencing the 
affective state. The present results indicate 
that expressional changes during affect may 
involve aspects or judgments of appearance 
other than those given affective labels per se. 
Thus, for example, the impact of affective 
states on learning in the intellectual context 
may be shaped by a teacher’s judgments of a 
pupil’s interest or involvement rather than by
judgments of an ongoing mood state. The
external, social consequences of internal and
presumably private states merit direct
investigation.


Conclusion 


In summary, the role that affective states 
may play in intellectual and social behavior 
is becoming ever more documented. The present
results indicate that the experimental
procedure of inducing affective states is valid 
and effective and can provide an important 
investigative tool in exploring the role that 
affective states may play in shaping cognitive 
and social aspects of both the normal and 
disordered behavior of children and adults. 
Further descriptive investigations are cer- 
tainly in order, providing both examples of 
the social and private experiences that are 
likely to induce affective states and examples 
of the range of behaviors and capabilities that 
are sensitive to affective determinants. Also 
of importance is the further investigation of 
the nature of affective states, including the 
elucidation of companion factors, such as 
tempo, that influence the impact of affective 
states and the cognitive, attentional, and so- 
cial mechanisms that mediate observed effects 
on learning and behavior. 


Reference Note


Stimulus Recognition May Mediate Exposure Effects


Michael H. Birnbaum and Barbara A. Mellers
University of Illinois at Urbana-Champaign


Moreland and Zajonc presented stimuli with differential numbers of exposures
to subjects and obtained measures of affect (e.g., ratings of liking) and ratings
of familiarity. Exposure frequency and ratings of familiarity were both significant
predictors of affect in a multiple regression equation. Moreland and Zajonc
concluded that there are two independent effects, and thus the exposure effect
could not be explained by a stimulus recognition factor alone. However, these
results can be explained by the theory that exposure frequency affects a single
mediator that is imperfectly correlated with ratings of familiarity and affect.
Thus, the null hypothesis that recognition mediates the exposure effect cannot
be refuted by the partial correlation and regression analyses of Moreland and
Zajonc.


: 

Moreland and Zajonc (1977) varied the 
frequency with which stimuli were presented 
and obtained several dependent variables 
including ratings of familiarity and a measure 
of liking (affect). Since the rating of affect 
could be predicted from rated familiarity and 
exposure frequency and since both regression 
coefficients were significant, it was argued 
that there are two “independent” effects. 
However, the present article shows that the 
theory that one factor, subjective recognition, 
mediates the effect of the independent variable 
on both dependent variables can predict this 
outcome. 


Single-Mediator Theory 


Suppose the manipulated variable, exposure 
frequency, affects a mediator, subjective 
recognition, 

$$S = g(A^*) + s, \tag{1}$$ 

where $S$ is subjective recognition, $A^*$ is actual 
exposure frequency, $s$ is a random residual, 
and $g$ is some function (e.g., logarithmic) of 
actual frequency. 


We thank Richard Moreland for providing us with 
the correlation matrices in Table 1. 

Requests for reprints should be addressed to Michael 
H. Birnbaum, Department of Psychology, University 
of Illinois, Champaign, Illinois 61820. 


Suppose that ratings of liking ($L$) and 
ratings of familiarity ($R$) are correlated with 
subjective recognition, 

$$L = aS + l, \tag{2}$$ 

$$R = bS + r, \tag{3}$$ 

where $a$ and $b$ are linear constants, and $l$ and $r$ 
are residuals that are uncorrelated with $S$, $s$, 
or each other. 

It follows that the intercorrelation between 
each pair of variables can be expressed as the 
product of the correlations between the two 
variables and the mediating factor, subjective 
recognition: 

$$\rho_{ij} = f_i f_j, \quad i \neq j, \tag{4}$$ 

where $\rho_{ij}$ is the correlation between variables 
$i$ and $j$, and $f_i$ and $f_j$ are the correlations between 
the variables and subjective recognition. Since 
each correlation is the product of the correlations 
with a mediating factor, $S$, the path 
model defined by Equations 1, 2, and 3 is 
equivalent to a one-factor model. 
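The product rule of Equation 4 can be illustrated by simulation. The following Python sketch is only a demonstration under stated assumptions: the normal distributions, the constants 0.5 and 0.9, the residual standard deviations, the sample size, and the names `pearson` and `simulate` are all hypothetical choices, not quantities taken from Moreland and Zajonc's data.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, seed=0):
    """Draw n cases from Equations 1-3 using hypothetical parameters."""
    rng = random.Random(seed)
    data = {"A": [], "S": [], "L": [], "R": []}
    for _ in range(n):
        a_val = rng.gauss(0.0, 1.0)                # A = g(A*), transformed frequency
        s_val = a_val + rng.gauss(0.0, 0.7)        # Equation 1: S = g(A*) + s
        l_val = 0.5 * s_val + rng.gauss(0.0, 1.0)  # Equation 2: L = aS + l
        r_val = 0.9 * s_val + rng.gauss(0.0, 0.7)  # Equation 3: R = bS + r
        for key, value in zip("ASLR", (a_val, s_val, l_val, r_val)):
            data[key].append(value)
    return data


data = simulate()
# f_i: correlation of each observed variable with the mediator S.
loading = {v: pearson(data[v], data["S"]) for v in "ALR"}

checks = {}
for i, j in (("A", "L"), ("A", "R"), ("L", "R")):
    observed = pearson(data[i], data[j])
    predicted = loading[i] * loading[j]            # Equation 4
    checks[i + j] = (observed, predicted)
    print(f"rho_{i}{j}: observed {observed:.3f}, f_{i} * f_{j} = {predicted:.3f}")
```

Within sampling error, each observed correlation should match the product of the two correlations with the mediator, which is what makes the path model equivalent to a one-factor model.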


Interpretation of Partial Correlations in Single- 
Mediator Theory 


The least squares regression coefficient in the 
(standardized) equation predicting ratings 
of liking from transformed exposure frequency, 
$A = g(A^*)$, and rated familiarity is given by 


Copyright 1979 by the American Psychological Association, Inc. 0022-3514/79/3703-0391$00.75 




Table 1 
Correlations of Variables from Moreland and Zajonc (1977) 

Variable            1      2      3      4      5 

1. Frequency              .42    .64    .53    .41 
2. Affect          .66           .40    .35    .25 
3. Familiarity     .58    .53           .47    .54 
4. Confidence      .34    .25    .29           .48 
5. Accuracy        .41    .30    .55    .42 

Note. Correlations above and below diagonal are 
for Experiments 1 and 2, respectively. Frequency 
represents log (f + 1), where f is actual exposure 
frequency. Liking and affect are corresponding 
dependent variables for Experiments 1 and 2, 
respectively. 


the expression: 

$$\beta_{LR \cdot A} = \frac{\rho_{LR} - \rho_{RA}\,\rho_{AL}}{1 - \rho_{RA}^2}, \tag{5}$$ 

where $\beta_{LR \cdot A}$ is the beta coefficient for rated 
familiarity in the equation predicting rated 
liking with transformed frequency ($A$) partialed 
out. (The corresponding partial correlation 
has the same numerator as Equation 5.) 

Substituting Equation 4 into Equation 5 
yields 

$$\beta_{LR \cdot A} = \frac{f_L f_R (1 - f_A^2)}{1 - \rho_{RA}^2} = \frac{\rho_{LR} (1 - f_A^2)}{1 - \rho_{RA}^2}, \tag{6}$$ 

which shows that unless subjective recognition 
is perfectly correlated with frequency ($f_A^2 = 1$), 
the beta is expected to be non-zero and to 
have the same sign as $\rho_{LR}$. 

Similarly, the beta for $A$ in the equation 
predicting liking will be given by 

$$\beta_{LA \cdot R} = \frac{\rho_{LA} (1 - f_R^2)}{1 - \rho_{RA}^2}, \tag{7}$$ 

which shows that the coefficient of frequency 
is expected to be non-zero and to have the 
same sign as the correlation between exposure 
frequency and liking, unless rated familiarity 
is perfectly correlated with subjective recognition 
($f_R^2 = 1$). Thus, the finding of a 
positive partial correlation between liking 
and frequency with rated familiarity partialed 
out is consistent with both the single-mediator 
(null) hypothesis and the alternative. This 
partial correlation is therefore not diagnostic 
of the presence of two independent effects on 
liking, as contended by Moreland and Zajonc 
(1977). 
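A short numerical sketch may make the argument of Equations 5 through 7 concrete. It builds correlations from a single mediator and shows that both regression weights are nevertheless positive. The loadings .82, .78, and .51 are used purely as illustrative inputs, and the function name `beta_with_other_partialed` is a hypothetical helper.

```python
def beta_with_other_partialed(r_y1, r_y2, r_12):
    """Standardized weight of predictor 1 when predictor 2 is also in the
    equation (the general two-predictor form of Equation 5)."""
    return (r_y1 - r_12 * r_y2) / (1.0 - r_12 ** 2)


# Illustrative correlations with the mediator (assumed inputs).
f_A, f_R, f_L = 0.82, 0.78, 0.51

# Equation 4: every observed correlation is a product of two loadings.
rho_LA = f_L * f_A
rho_LR = f_L * f_R
rho_RA = f_R * f_A

# Equation 5 and its counterpart for frequency.
beta_LR_A = beta_with_other_partialed(rho_LR, rho_LA, rho_RA)
beta_LA_R = beta_with_other_partialed(rho_LA, rho_LR, rho_RA)

# Closed forms from Equations 6 and 7.
eq6 = rho_LR * (1.0 - f_A ** 2) / (1.0 - rho_RA ** 2)
eq7 = rho_LA * (1.0 - f_R ** 2) / (1.0 - rho_RA ** 2)

print(f"beta for familiarity: {beta_LR_A:.3f} (Equation 6 gives {eq6:.3f})")
print(f"beta for frequency:   {beta_LA_R:.3f} (Equation 7 gives {eq7:.3f})")
```

Both weights come out positive even though, by construction, frequency influences liking only through the single mediator.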


One or Two Mediators? 


The correlations between exposure fre- 
quency, rated familiarity, and affect (liking) 
for Experiments 1 and 2 of Moreland and 
Zajonc (1977) are shown in Table 1. The 
off-diagonal correlations among these three 
variables can be perfectly fit by the single- 
mediator model in each case. For Experiment 
1, the correlations with the mediator are .82, 
.78, and .51, respectively; for Experiment 2, 
the correlations with the recognition mediator 
are .85, .68, and .78 for frequency, rated 
familiarity, and affect (liking), respectively. 
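The loadings quoted above can be recovered from the three observed correlations with the standard triad formula for a one-factor model, $f_i = \sqrt{\rho_{ij}\rho_{ik}/\rho_{jk}}$. The sketch below is an illustration rather than the authors' own computation; the function name `triad_loadings` is hypothetical, and the Experiment 1 frequency-affect correlation is taken to be .42.

```python
import math


def triad_loadings(r_ab, r_ac, r_bc):
    """Loadings of three variables a, b, c on a single common factor,
    given their three pairwise correlations (all assumed positive)."""
    f_a = math.sqrt(r_ab * r_ac / r_bc)
    f_b = math.sqrt(r_ab * r_bc / r_ac)
    f_c = math.sqrt(r_ac * r_bc / r_ab)
    return f_a, f_b, f_c


# Variable order: a = frequency, b = rated familiarity, c = affect (liking).
exp1 = triad_loadings(r_ab=0.64, r_ac=0.42, r_bc=0.40)
exp2 = triad_loadings(r_ab=0.58, r_ac=0.66, r_bc=0.53)

for label, loadings in (("Experiment 1", exp1), ("Experiment 2", exp2)):
    print(label, [round(f, 2) for f in loadings])
```

Because three correlations determine three loadings exactly, a fit of this kind is guaranteed whenever the resulting loadings do not exceed 1.0.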

Consequently, the three correlations are 
consistent with the representation shown 
below, in which each correlation among observed 
variables is the product of their correlations 
with subjective recognition (Equation 4): 

Frequency --($f_A$)--> Subjective Recognition --($f_R$)--> Rated Familiarity 
                                              --($f_L$)--> Rated Affect (liking)      (8) 


This model can be extended to describe 
the intercorrelations among the other dependent 
variables measuring recognition. 

The alternative to Equation 4 is that there 
is an additional effect of frequency on liking 
that operates independently of the effect 
on recognition. This theory can be 
represented as follows: 

[Path diagram: Subjective Recognition --($f_R$)--> Rated Familiarity; Unconscious Affect --(1.0)--> Rated Affect]      (9) 

1 Should a partial correlation have a sign opposite 
that of the original correlation (i.e., $\rho_{ij \cdot k}\,\rho_{ij} < 0$), it 
would not be consistent with a one-mediator model. 
Though it may seem counterintuitive, a negative 
partial correlation between exposure frequency and 
rated familiarity with liking partialed out would have 
been inconsistent with the single-mediator model, but 
in agreement with the presence of a second effect of 
frequency on liking. 

2 For a three-variable, positive intercorrelation 
matrix, the path coefficients (factor loadings) are 
given by the following equation: 

$$f_i = \sqrt{\frac{\rho_{ij}\,\rho_{ik}}{\rho_{jk}}},$$ 

where $f_i$ is the correlation between Variable $i$ and the 
mediating variable. 

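The single-mediator model (Expression 8, 
Equation 4) implies that the correlations 
must satisfy 

$$\rho_{ij} \le \frac{\rho_{ik}}{\rho_{jk}} \le \frac{1}{\rho_{ij}}, \tag{10}$$ 

because the correlations with subjective recognition 
($f_i$) are less than or equal to 1.0, and 
Equation 10 is equivalent to the following: 

$$f_i f_j \le \frac{f_i}{f_j} \le \frac{1}{f_i f_j}. \tag{11}$$ 

For example, the single-mediator model would 
not be able to account for correlations for 
$\rho_{LR}$, $\rho_{LA}$, and $\rho_{RA}$ of .7, .7, and .35, respectively, 
since $\rho_{LA}/\rho_{RA} = 2$, which is outside 
the limits set by .7 and 1/.7. However, Equation 
10 was satisfied for exposure, affect, and 
rated familiarity. 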

These analyses show that the regression 
and partial correlation results reported by 
Moreland and Zajonc (1977) can be explained 
by the null hypothesis of a single mediator. 
The fit of the null hypothesis (Expression 8) 
that recognition mediates the exposure effect 
does not rule out the possible presence of 
unconscious affect (as in Expression 9). However, 
the burden of proof clearly remains on 
those who would conclude, with Moreland 
and Zajonc (1977, p. 191), that “the relationship 
between stimulus exposure and affect 
does not depend on the operation of higher 
cognitive processes.” 
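The bounds of Equation 10 are easy to check mechanically. The sketch below applies the inequality to every assignment of the three variables to $i$, $j$, and $k$, which amounts to requiring that no implied loading exceed 1.0. The function name `satisfies_equation_10` is hypothetical, and the Experiment 1 frequency-affect correlation is taken to be .42.

```python
from itertools import permutations


def satisfies_equation_10(r_ab, r_ac, r_bc):
    """Return True if rho_ij <= rho_ik / rho_jk <= 1 / rho_ij holds for
    every ordering (i, j, k) of the three variables a, b, c."""
    table = {("a", "b"): r_ab, ("a", "c"): r_ac, ("b", "c"): r_bc}

    def rho(x, y):
        # Correlations are symmetric, so look the pair up in either order.
        return table[(x, y)] if (x, y) in table else table[(y, x)]

    for i, j, k in permutations("abc"):
        ratio = rho(i, k) / rho(j, k)
        if not (rho(i, j) <= ratio <= 1.0 / rho(i, j)):
            return False
    return True


# Counterexample from the text: correlations of .7, .7, and .35.
counterexample_ok = satisfies_equation_10(0.7, 0.7, 0.35)

# a = frequency, b = rated familiarity, c = affect (liking).
experiment_1_ok = satisfies_equation_10(0.64, 0.42, 0.40)
experiment_2_ok = satisfies_equation_10(0.58, 0.66, 0.53)

print(counterexample_ok, experiment_1_ok, experiment_2_ok)
```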


Parallel With Subception 


The present issue has interesting parallels 
with [...] (Garner, Hake, & Eriksen, 1956) 
and [...] (Dulany, 1968). Eriksen (1956, 
1960) replicated and reinterpreted an experiment 
by Lazarus and McCleary (1951) that 
purported to demonstrate subception. In Eriksen’s 
(1956) replication, subjects were trained 
to give a verbal response to each of several 
squares, one of which was paired with shock. 
In the testing phase, galvanic skin response 
(GSR) and verbal responses were measured 
to each stimulus. It was found that the 
stimulus-GSR relationship was maintained 
when verbal response was partialed out. 
Eriksen (1960) noted that the results are 
consistent with the hypothesis that the stimulus 
leads to a perception that is imperfectly 
correlated with the two dependent variables 
with uncorrelated errors. To find an absence 
of the partial correlation between stimulus 
and GSR with verbal response partialed out 
would require that the verbal response be 
perfectly correlated with perception. In terms 
of Thurstone’s law of categorical judgment, 
the category boundaries would have to show 
zero variance. Should the subject shift the 
limens during the experiment, the partial 
correlation will be expected to be non-zero. 


Causation and Correlation 


Brewer, Campbell, and Crano (1970) have 
cautioned against the use of multiple regres- 
sion and partial correlation in nonexperimental 
settings. They reviewed 10 psychological 
studies in which partial correlations had been 
used inappropriately to argue for multiple 
effects without testing the rival, one-factor 
hypothesis. This paper shows that the criti- 
cisms of partial correlation analysis made by 
Brewer et al. (1970) can apply to certain 
experimental as well as nonexperimental 
studies. 

Although factor and path analyses may be 
useful for examining the number and nature 
of the effects in the dependent variables, they 
do not allow one to draw inferences about 
causation. Unless the variables are experi- 
mentally manipulated, numerous alternative 
causal models remain consistent with the data. 
To argue for multiple causes, an investigator 
must demonstrate that two or more experi- 
mentally manipulated variables contribute 
independently to the measured dependent 
variables. One should not draw causal in- 
ferences from correlations unless the in- 
dependent variables are truly independent, 
that is, manipulated experimental variables. 


Conclusions 


The present article shows that the one- 
mediator model may be a reasonable null 
hypothesis for data obtained in experimental 
studies in which one variable is manipulated 
and several are measured. Indeed, a positive 
partial correlation between exposure fre- 
quency and liking with rated familiarity 
partialed out does not evaluate the null 
hypothesis that a single variable mediates the 
effect of the independent variable on both 
dependent variables, as shown by Equation 7. 
Thus, the theory that stimulus recognition 
mediates the exposure effect on liking is not 
refuted by the partial correlation and regression 
analyses of Moreland and Zajonc (1977). 


A Psychological Taxonomy of Trait-Descriptive Terms: 
The Interpersonal Domain 


Jerry S. Wiggins 
University of British Columbia, Vancouver, Canada 


The eventual aim of the research reported is the development of a comprehensive 
taxonomy of trait-descriptive terms in the English language. Building on 
earlier work of Allport, Norman, and Goldberg, preliminary a priori distinctions 
were made among different domains of trait categories. General procedures for 
developing structured taxonomies within domains were illustrated with reference 
to the interpersonal domain. Theoretical considerations dictated the definition 
of the universe of content, the choice of measurement model, and the procedures 
for classifying terms within the domain. Eight adjectival scales were developed 
as markers of the principal vectors of the interpersonal domain. The 
substantive, structural, and psychometric characteristics of these scales were 
found to be highly satisfactory. Hence, they may prove useful both as assessment 
devices in their own right and as reference points for the classification of 
variables in personality and social psychology. 


Personality is that branch of psychology 
which is concerned with providing a systematic 
account of the ways in which individuals 
differ from one another (Wiggins, Renner, 
Clore, & Rose, 1971). Individuals differ 
from one another in a variety of ways: their 
anatomical and physiognomic characteristics; 
their personal appearance, grooming, and 
manner of dress; their social backgrounds 
and other demographic characteristics; 
their effect on others or social stimulus value; 
and, at any given moment in time, their 
temporary states, moods, attitudes, and activities. 
The principal goal of personality 
study is to provide a systematic account of 
individual differences in human tendencies 
(proclivities, propensities, dispositions, inclinations) 
to act or not to act in certain ways 
on certain occasions. 

Although such tendencies are commonly 
called “traits,” the use of that term in con- 
temporary psychological discourse carries with 
it implications of a particular theoretical com- 
mitment, a preferred method of scientific in- 
vestigation, and a philosophical preference for 
certain kinds of explanation in theory con- 
struction. Hence, it is necessary to make it 
clear at the outset that an interest in human 
tendencies (traits) does not imply a theo- 
retical precommitment to such issues as 
whether traits are manifestations of genera- 
tive or causal mechanisms (Allport, 1937); 
whether trait attributions reflect specific cog- 
nitive processes of observers (Heider, 1958); 
whether traits are best construed idiographi- 
cally or nomothetically (Allport, 1937; Bem 
& Allen, 1974; Kelly, 1955); or whether con- 
sistencies in human tendencies are largely due 
to environmental or situational constancies 
(Mischel, 1968). 

In my view, consistent patterns of human 
conduct constitute the basic data of person- 
ality study, which require rather than provide 
explanation (Wiggins, Note 1). In approaching 
this task of accounting for individual 
differences in human tendencies, I share with 
others the conviction that the natural lan- 
guage provides a convenient starting place 
(Allport, 1937; Cattell, 1957; Goldberg, Note 
2; Norman, Note 3). The universe of content 
of human tendencies is contained within the 
covers of an unabridged dictionary of the 
English language. I also share with these 
writers the conviction that an adequate tax- 
onomy of trait-descriptive terms must pre- 
cede meaningful empirical studies of human 
tendencies. There have been a number of 
systematic efforts to develop a personality 
taxonomy over the past 40 years, and my own 
work has attempted to capitalize on these 
earlier efforts. 

Allport and Odbert (1936) examined the 
approximately half million separate entries or 
derivatives included in Webster’s New Inter- 
national Dictionary (1924) for terms that 
appeared “to distinguish the behavior of one 
human being from that of another” (p. 24) 
and identified a pool of 17,953 terms having 
this characteristic. Norman (Note 3) scanned 
the entire contents of Webster’s Third New 
International Dictionary Unabridged (1961) 
for additional terms that had not been included 
in the Allport-Odbert list. The universe 
of content thus defined was approximately 
27,000 terms. Subsequently, Norman and 
Goldberg were able to reduce this list by 
eliminating obscure, inappropriate, and ar- 
chaic terms. The focus of all of these investi- 
gations has been on a subset of approximately 
3,600 terms that Allport initially called “sta- 
ble biophysical traits.” Allport considered 
these to be “real” traits as opposed to tem- 
porary states, moods, social roles, physical 
characteristics, and so forth. 

However one conceives of stable biophysical 
traits, I see them as no more nor less real 
than any other kinds of human character- 
istics. I agree with Guilford (1959) that “a 
trait is any distinguishable, relatively en- 
during way in which one individual differs 
from others” (p. 6). But I see the major 
taxonomic task as that of specifying the dif- 
ferent kinds of ways in which individuals 
differ from each other. One kind of way in 
which individuals differ from each other is in 
terms of what they do to each other. Thus, I 
think that one of the most important categories 
of traits is that which may be designated 
“interpersonal.” Within the realm of 
things that people do to each other, it is 
desirable to make a further theoretical dis- 
tinction between interpersonal exchanges 
based on love and status and interpersonal 
exchanges based on goods, money, and ser- 
vices (Foa & Foa, 1974). The former are 
called interpersonal traits and the latter ma- 
terial traits. 

Another equally “real” way in which indi- 
viduals are distinguishable from one another 
is in terms of their styles of emotional reac- 
tivity, which we refer to as “temperament.” 
It is also possible to distinguish individuals 
on the basis of the particular roles and status 
they hold within the framework of our social 
institutions. Additionally, there are “character” 
terms that represent appraisals of an 
individual based on a code of proper behavior. 
And finally, there are many words that 
refer to individual differences in qualities of 
mind as manifested in thought, perception, 
and speech. 

Consider the following trait-descriptive adjectives: 
aggressive, miserly, lively, ceremonious, 
dishonest, and analytical. I would classify 
these terms as representative of the six trait 
categories just mentioned, namely, interpersonal 
traits, material traits, temperament 
traits, social roles, character, and mental 
predicates. Here I differ from Allport, Norman, 
and Goldberg, who view all of these 
terms as “stable biophysical traits,” without 
attempting to differentiate the different kinds 
of descriptive jobs such terms perform. 
To these earlier authors’ important distinction 
between stable traits and temporary 
states, moods, social evaluations, and so forth, 
I would add finer distinctions within the category 
of stable traits. 

The present article describes our efforts 
to develop a taxonomy of 
trait-descriptive terms in the English language. 
The universe of content is taken to be 
Norman’s (Note 3) total lexicon of approximately 
27,000 terms. Within this broader framework, we 
intend to focus on the “prime” categories of 
Norman, which involve 4,063 relatively familiar 
and nonobscure terms. Even restricting 
our attention to prime terms, we are 
concerned with a list of staggering size. Our 
task requires a clear-cut game plan or 
strategy to guide us through this sea 
of words. We began by defining a limited taxonomy 
on an a priori basis. We attempted to specify 
the domains of human characteristics 
in terms of the different descriptive jobs performed 
by the words within the domains. 
We then made some preliminary a priori classifications 
of terms within domains, mainly as a means 
of reducing the number of terms and lightening our 
burdens. The overall strategy for 
developing a taxonomy within a single domain 
may be honorifically described as “iterative,” 
although “trial and error” may be a more apt 
descriptor. Our main concern was the avoidance 
of premature “fixing” of a taxonomy. 
Hence, preliminary classifications were 
considered tentative and subject to continual 
revision in light of other developing categories. 
This rather complex interactive strategy 
will be illustrated with reference to the 
development of a taxonomy within the domain 
of interpersonal traits. 


[Figure 1: the upper panel shows the facets Object (self, other) and Resource (love, status); the lower panel codes each interpersonal variable as + or − on these facets: NO (gregarious-extraverted), PA (ambitious-dominant), BC (arrogant-calculating), DE (cold-quarrelsome), FG (aloof-introverted), HI (lazy-submissive), JK (unassuming-ingenuous), LM (warm-agreeable).] 

Figure 1. Facet composition of interpersonal variables (after Foa & Foa, 
1974). (+ = acceptance; − = rejection.) 


The Domain of Interpersonal Traits 


The line of reasoning to be presented origi- 
nated in the interpersonal theory of Sullivan 
(1953) and was later operationalized in a set 
of personality measurement procedures by 
Leary (1957) and his colleagues. Foa (1961, 
1965) integrated this theory with Guttman’s 
(1954) order analysis, and the general orien- 
tation was extended to domains other than 
the interpersonal by Rinn (1965). Notable 
interpersonal systems based on a circumplex 
model have also been described by Becker and 
Krug (1964), Benjamin (1974), Lorr and 
McNair (1963), Schaefer (1959), and Stern 
(1970). This general framework was inte- 
grated with interpersonal exchange theory 
(Homans, 1961) by Carson (1969) and by 
Foa and Foa (1974). 

In their most recent theoretical statement, 
Foa and Foa (1974) describe the development 
of cognitive categories of social perception as 
a progressive differentiation of structure in- 
volving facets of directionality, object, and 
resource. A somewhat simplified version of 
Foa and Foa’s representation of this structure 
is illustrated at the top of Figure 1. The earliest 
cognitive schemata are based on the discrimination 
of the directionality of social 
events (give vs. take, accept vs. reject). With 
the acquisition of the concept of social object 
(self vs. other) four categories of social mean- 
ing are discriminated (e.g., give to self, take 
away from other). Out of an initially undif- 
ferentiated matrix of resource classes, services 
are differentiated from love, and the latter is 
further differentiated into love and status. 
With the distinction between love and status, 
earlier facets become reorganized, and it is 
possible to distinguish eight features of social 
meaning (e.g., granting status to oneself, 
denying love to another). These eight features 
may be thought of as part of a semantic code 
strip that provides the basic discriminations 
for encoding and decoding interpersonal events 
(Osgood, 1970). 

Within the above context, interpersonal 
events may be defined as dyadic interactions 
that have relatively clear-cut social (status) 
and emotional (love) consequences for both 
participants (self and other). This definition 
provides a theoretical basis for distinguishing 
interpersonal traits from other categories of 
trait descriptors, such as temperament, moods, 
cognitive traits, and physical characteristics. 

Under the semantic features in Figure 1 
are listed eight theoretical interpersonal vari- 
ables. The organization of these variables is 
thought to be determined by an interrelated 
set of societal rules that impart meaning to 
social events (Wiggins, Note 1). Thus, ac- 
tions that have the same profile of semantic 
features are categorized as belonging to the 
same response class. The semantic features of 
each of the eight variables appear as rows in 
Figure 1. Note that the first variable (NO) 
is coded on all the positive (accept) categories 
for self and other with respect to both love 
and status. The first variable is in marked 
contrast to FG for which all values are negative 
(reject). Variables NO and FG have no 
features in common, but since acceptance and 
rejection are logically opposed concepts, it 
would be expected that the two variables 
would be strongly negatively correlated. Similar 
relationships can be seen to exist between 
PA and HI, BC and JK, and DE and LM. Note 
also that each variable differs from its adjacent 
variable by only one element. This is 
also true of the first (NO) and last (LM) 
variables, and hence the structural relations 
among this set of variables can be represented 
as a circumplex (Guttman, 1954). 
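The structural claims in this paragraph can be checked on a concrete coding. The sketch below uses one set of feature profiles that is consistent with what the text states; it should not be read as a transcription of Figure 1. Only the NO, FG, and HI rows are given in the text, the PA row follows from PA sharing no features with HI, and the BC, DE, JK, and LM rows are assumptions chosen to satisfy the one-element rule. The names `PROFILES`, `ORDER`, and `shared_features` are hypothetical.

```python
# Feature order: (self-love, self-status, other-love, other-status).
# +1 = acceptance (granting), -1 = rejection (denying).
PROFILES = {
    "NO": (+1, +1, +1, +1),  # stated in the text: all positive
    "PA": (+1, +1, +1, -1),  # implied: no feature in common with HI
    "BC": (+1, +1, -1, -1),  # assumed
    "DE": (-1, +1, -1, -1),  # assumed
    "FG": (-1, -1, -1, -1),  # stated in the text: all negative
    "HI": (-1, -1, -1, +1),  # stated in the text
    "JK": (-1, -1, +1, +1),  # assumed
    "LM": (+1, -1, +1, +1),  # assumed
}
ORDER = ["NO", "PA", "BC", "DE", "FG", "HI", "JK", "LM"]


def shared_features(a, b):
    """Number of semantic features on which two variables agree."""
    return sum(x == y for x, y in zip(PROFILES[a], PROFILES[b]))


n = len(ORDER)
# Adjacent variables, including the wrap-around from LM back to NO.
adjacent = [shared_features(ORDER[k], ORDER[(k + 1) % n]) for k in range(n)]
# Variables two steps apart on the circle.
orthogonal = [shared_features(ORDER[k], ORDER[(k + 2) % n]) for k in range(n)]
# Variables four steps apart, that is, opposite each other.
opposite = [shared_features(ORDER[k], ORDER[(k + 4) % n]) for k in range(n)]

print("adjacent:  ", adjacent)
print("orthogonal:", orthogonal)
print("opposite:  ", opposite)
```

Under this coding the number of shared features falls from three to none as one moves from adjacent to opposite variables, which is the ordering a circumplex requires.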

The labels that have been attached to the 
interpersonal variables in Figure 1 (gregari- 
ous-extraverted, ambitious-dominant, etc.) 
are meant to capture the flavor of terms that 
share the same profile of semantic features 
and may serve more as tags than as definitions. 
Thus, for example, the rather inelegant label 
of “lazy-submissive” is attached to interpersonal 
transactions involving incompetence, 
passive resistance, submission, or obedience. 
These otherwise diverse attributes share in 
common the semantic features of denying 
status to self, denying love to both self and 
other, and granting status to other. A fuller 
listing of representative terms in each category 
may be found in Table 2, later in this 
article. 

Were we to collect personality measure- 
ments on the eight variables listed in Figure 
1, their intercorrelations would, in theory, 
show the pattern illustrated in Table 1. The 
correlation of a variable with itself is assumed 
to be unity, so that the principal diagonal 
of this matrix contains ones. The correlations 
along this main diagonal are large 
and positive, and they decrease across successive 
minor diagonals to the n/2nd variable, 
where they are a minimum. The correlations 
then increase up to a large positive value in 
the off-diagonal matrix. The circumplexity of 
this or any other set of variables can be 
evaluated directly from the intercorrelation 
matrix. A rigorous procedure for evaluating the 
circumplexity of a set of variables involves 
plotting the correlations of each variable 
(ordinate) with the other variables in sequence 
(abscissa). This procedure should generate 
a series of overlapping sine curves that 
can be evaluated for goodness of fit (Stern, 
1970). 

An alternative procedure for evaluating circumplexity 
is to extract the first two principal 
components from the matrix of intercorrelations 
and to examine the plot of the variables 
on the two components. Figure 2 presents a 
plot of the eight variables on two principal 
components extracted from the intercorrelations 
in Table 1. As can be seen from this 
figure, the intercorrelations among variables 
in Table 1 form a perfect, equally spaced circumplex. 
This pattern follows, both theoretically 
and empirically, from the pattern of 
shared semantic features among the variables 
illustrated at the bottom of Figure 1. Variables 
that share three features in common are adjacent 
to each other on the circle; variables 
that have no features in common are opposite 
each other. 


Table 1 

Variable      PA      BC      DE      FG      HI      JK      LM      NO 

PA          1.00 
BC           .50    1.00 
DE           .00     .50    1.00 
FG          −.50     .00     .50    1.00 
HI         −1.00    −.50     .00     .50    1.00 
JK          −.50   −1.00    −.50     .00     .50    1.00 
LM           .00    −.50   −1.00    −.50     .00     .50    1.00 
NO           .50     .00    −.50   −1.00    −.50     .00     .50    1.00 

Note. PA = ambitious-dominant; BC = arrogant-calculating; DE = cold-quarrelsome; 
FG = aloof-introverted; HI = lazy-submissive; JK = unassuming-ingenuous; 
LM = warm-agreeable; NO = gregarious-extraverted. 
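The principal-components check described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used in the original analysis: the matrix is rebuilt from the repeating pattern 1.00, .50, .00, −.50, −1.00 of the theoretical table, and power iteration with deflation is substituted for a library eigenvalue routine so that the example needs only the standard library. All function and variable names are hypothetical.

```python
import math

LABELS = ["PA", "BC", "DE", "FG", "HI", "JK", "LM", "NO"]
# Correlation as a function of the number of steps separating two variables.
PATTERN = [1.0, 0.5, 0.0, -0.5, -1.0, -0.5, 0.0, 0.5]
R = [[PATTERN[(j - i) % 8] for j in range(8)] for i in range(8)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def leading_component(m, start, iterations=200):
    """Power iteration: returns (eigenvalue, unit-length eigenvector)."""
    v = normalize(start)
    for _ in range(iterations):
        v = normalize(matvec(m, v))
    eigenvalue = sum(a * b for a, b in zip(v, matvec(m, v)))
    return eigenvalue, v


def deflate(m, eigenvalue, v):
    """Remove an extracted component so that the next one can be found."""
    n = len(v)
    return [[m[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
            for i in range(n)]


value1, vector1 = leading_component(R, [1.0] + [0.0] * 7)
value2, vector2 = leading_component(deflate(R, value1, vector1),
                                    [0.0, 1.0] + [0.0] * 6)

# Component loadings are eigenvectors scaled by the root of the eigenvalue.
loadings = [(math.sqrt(value1) * a, math.sqrt(value2) * b)
            for a, b in zip(vector1, vector2)]
radii = [math.hypot(x, y) for x, y in loadings]
angles = [math.degrees(math.atan2(y, x)) % 360.0 for x, y in loadings]
# Angular gap between each variable and the next one around the circle.
spacing = [(angles[(k + 1) % 8] - angles[k]) % 360.0 for k in range(8)]

for label, radius, gap in zip(LABELS, radii, spacing):
    print(f"{label}: radius {radius:.3f}, gap to next variable {gap:.1f} degrees")
```

Equal radii and equal angular gaps are what "perfect, equally spaced" means here; with empirical correlations both would vary and could be compared against this ideal.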


[Figure 2: circular plot of the eight variables, with PA (ambitious-dominant) at the top, followed counterclockwise by BC (arrogant-calculating), DE (cold-quarrelsome), FG (aloof-introverted), HI (lazy-submissive) at the bottom, JK (unassuming-ingenuous), LM (warm-agreeable), and NO (gregarious-extraverted).] 

Figure 2. Perfect circumplex of interpersonal variables. 


There are two distinct advantages to the 
representation of interpersonal variables by a 
two-dimensional circumplex. The first is that 
it provides an explicit conceptual definition 
of the universe of content of interpersonal behavior. 
Any behavior that meets the definition 
of a meaningful interpersonal event given 
above must be capable of being represented as 
a vector originating from the center of the 
circle. Thus, the specific system proposed 
here is potentially falsifiable. The second advantage 
of the circumplex model is that it 
alerts the investigator to noticeable “gaps” in 
the interpersonal space of a given set of vari- 
ables. Strictly empirical procedures of vari- 
able selection are likely to deemphasize the 
importance of certain variables that are im- 
plied by the logic of the circumplex system 
but that are underrepresented in the English 
language. Although the system of interper- 
sonal variables under discussion is limited to 
eight variables, it could in principle be equally 
well represented by 16, 32, or 64 variables. 
The thinness with which we slice the circum- 
plex pie is limited by the reliability with 
which respondents can distinguish between 
closely synonymous words or phrases. 

It should be evident that the preceding 
theoretical considerations substantially con- 
strain the final form that a taxonomy of in- 
terpersonal traits may assume. The definition 
of interpersonal events, the specification of 
their underlying facet structure, and the selec- 
tion of a measurement model to represent re- 
lationships among variables all express a 
commitment to a particular, albeit widely 
shared, conceptualization of the domain under 
investigation. In this sense, the eventual tax- 
onomy of trait-descriptive terms will be a 
“psychological” taxonomy rather than a 
strictly “semantic” taxonomy based on dic- 
tionary definitions. It is assumed that the 
semantic structures underlying social percep- 
tion in this culture cannot be inferred in any 
obvious way from dictionary definitions. In- 
vestigators who start with different assump- 
tions and who utilize different measurement 
models would undoubtedly devise somewhat 
different taxonomies. Structural relationships 
of the kind at issue here are not “discovered” 
(Loevinger, 1957). They are postulated and 
then evaluated for goodness of fit. 


Development of Interpersonal Clusters 


Rational Categorization 


In a preliminary attempt to make the dis- 
tinction among kinds of trait terms discussed 
above, I classified all of the trait descriptors 
in Goldberg’s (Note 2) pool of 1,710 adjec- 
tives into one or another of seven a priori 
categories. At that stage in the development 
of the taxonomy, the principal distinctions 
were between interpersonal trait terms, temperamental 
trait terms, and mental predicate 
terms. Tentative categories of “attitudinal 
terms” and “constancy” were employed at this 
point, as well as a “miscellaneous” category 
that involved various dimensions that could 
not be clearly classified into other categories. 
In the initial sorting of the 1,710 adjectives, 
approximately 800 terms were identified as 
“interpersonal” under our working definition 
of interpersonal traits. 

For purposes of preliminary categorization, 
we selected Leary’s (1957) system of inter- 
personal traits, because it appeared to be the 
most explicit system described in the litera- 
ture, an entire book being devoted to the topic. 
Two colleagues¹ and I thoroughly familiarized 
ourselves with the system and its theoretical 
framework. Working as a team, we considered 
each of the approximately 800 terms 
previously classified as interpersonal and attempted 
to classify each within one of the 16 
interpersonal vectors. Adjectives that could 
not be classified within the 16 dimensions and 
that did not appear to belong in one of the 
other five a priori categories were temporarily 
set aside for further analysis. There were remarkably 
few such adjectives. With considerable 
effort we were able to distribute 567 
adjectives across the 16 categories with unanimous 
agreement among three raters. 


Selection of Preliminary Markers 


Social desirability scale values for all ad- 
jectives were available from a previous study 
by Norman (Note 3), and mean self-endorse- 
ment frequencies were available from a more 
recent study by Goldberg (Note 2). On the 
basis of these itemmetric data, items were 
selected to serve as preliminary nuclear 
clusters within each of the 16 interpersonal 
categories. These items were selected in such 
a way as to represent the category unambigu- 
ously across the range of endorsement fre- 
quencies and desirability values. Approxi- 
mately 12 items were selected for each cate- 
gory. 


1 I am grateful to James M. Kilkowski and Alex- 
ander Galvin for their help and support in this 
enterprise. 

In Goldberg’s sample of 70 male and 117 
female University of Oregon undergraduates, 
self-ratings were obtained on a 9-place scale 
for each of the 1,710 trait descriptors. Using 
Goldberg’s data, our next step was to obtain 
the intercorrelations among terms selected to 
be preliminary markers of each of the cate- 
gories of interpersonal behavior. By a cluster- 
ing procedure, a smaller and more homoge- 
neous subset of items was selected within each 
category, with approximately six items per 
cluster. 

The next step was to consider all items 
that had been rationally classified as falling 
within the 16 categories as potential candi- 
dates for addition to or deletion from each 
nuclear interpersonal cluster. For this pur- 
pose, we scored the preliminary clusters as 16 
scales and examined the correlations of indi- 
vidual items with these scales with an eye 
toward their circumplex patterning. Thus, we 
examined the correlation of each of the 567 
items with the 16 preliminary clusters ordered 
by their hypothetical position in the circum- 
plex. Items were sought that had positive cor- 
relations with adjacent clusters, zero correla- 
tions with orthogonal clusters, and negative 
correlations with opposite clusters. For most 
items, this circumplex patterning across 16 
clusters was far from perfect. Item selection 
was further complicated by the fact that some 
of the initial nuclear clusters were clearly out 
of place on the circumplex, so that we con- 
stantly had to keep in mind the inadequacies 
of our original clusters. By this bootstrap 
procedure, we selected a set of eight adjectives 
for each of the 16 vectors that, we hoped, 
would increase the homogeneity of their vector 
and correct the previous shortcomings of that 
vector. 
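
The patterning criterion described above can be illustrated with a short sketch. It compares an item's correlations with the 16 ordered clusters against an idealized profile. The cosine form of that profile, the function names, and the sample correlations are assumptions made for illustration; the text specifies only the expected signs for adjacent, orthogonal, and opposite clusters.

```python
from math import cos, pi, sqrt


def ideal_profile(home, n_clusters=16):
    """Cosine of the angular distance between the home cluster and each cluster."""
    return [cos(2 * pi * (j - home) / n_clusters) for j in range(n_clusters)]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pattern_fit(item_corrs, home):
    """Agreement (Pearson r) between an item's observed and idealized profiles."""
    return pearson(item_corrs, ideal_profile(home, len(item_corrs)))


# Hypothetical correlations of one item with the 16 clusters (illustrative only).
observed = [0.55, 0.48, 0.30, 0.21, 0.02, -0.15, -0.33, -0.41,
            -0.52, -0.38, -0.30, -0.12, 0.05, 0.18, 0.35, 0.44]
print(round(pattern_fit(observed, home=0), 2))
```

An item whose fit is high for its assigned cluster and strongly negative for the opposite cluster would be a candidate for retention under this reading of the criterion.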

We assembled the 128 adjectives chosen to 
mark the 16 vectors into a test format and 
administered it to a small group of students 
at the University of British Columbia. The 
students were requested to rate the self-ap- 
plicability of the adjectives on a 9-place Likert 
scale. Item responses were summed for both 
octant and sixteenth variables. Intercorrela- 
tions among the variables were factored by 
the method of principal components. Figure 
3 displays the pattern of octant variables 
loading on the first two factors in the Univer- 
sity of British Columbia sample. The pattern 
of loadings for 16 interpersonal variables is 
considerably less orderly than this, due no 
doubt to the fact that the vector scores are 
based on fewer items in a relatively small 
sample of subjects. But the pattern of octant 
scores clearly displays some of the difficulties 
we encountered in our attempt to map out 
Leary’s (1957) system. 

The most striking feature of Figure 3 is 
the lack of variables in the upper right-hand 
quadrant. This gap in the Leary (1957) cir- 
cumplex has been previously noted by several 
authors including Stern (1970), who felt that 
the failure of Octants PA and NO to “close” 
raises the question of whether the Leary sys- 
tem is a circumplex. Lorr and McNair (1965) 
noted this gap also and attempted to close it 
with additional substantive variables. An- 
other notable departure from expectations is 
the location of Octant NO, which not only 
fails to appear in the upper quadrant but 
appears below Octant LM. In a sense, then, 
we partially succeeded in replicating the 
Leary system with trait-descriptive adjectives, 
but in so doing we carried over the faults of 
the system as well.² 


Development of Bipolar Clusters 


At this point it occurred to us that a 16- 
variable circumplex can be rather easily con- 
structed from eight genuine bipolar dimen- 
sions. One of the conceptual difficulties we 
experienced in working with the Leary sys- 
tem was a decided lack of bipolarity between 
vectors that appeared opposite each other on 
the circle. In particular, the following im- 
plicitly bipolar contrasts did not make a 
great deal of sense: success versus masochism, 
narcissism versus conformity, rebellion versus 


2 Juris I. Berzins (personal communication, March 
8, 1977) plotted the loadings of the original Leary 
(1957) Interpersonal Checklist (ICL) octants on the 
first two principal components in samples of 685 
high school students and 1,109 college students. The 
plots for both samples are indistinguishable from 
Figure 3. (See also Rinn, 1965, p. 458, Figure 6.) 


[Figure 3 plot: the eight Leary (1957) octants located on the first two rotated components. Plotted octants: Success-Power (PA), Narcissism-Exploitation (BC), Punishment-Hostility (DE), Rebellion-Distrust (FG), Masochism-Weakness (HI), Conformity-Trust (JK), Collaboration-Love (LM), Tenderness-Generosity (NO).] 

Figure 3. Rotated components of Leary (1957) interpersonal variables. 


tenderness, distrust versus generosity, and 
punishment versus collaboration. 

In Leary’s (1957) original system the ten- 
derness-generosity octant (NO) was concep- 
tualized as the bipolar contrast to the rebel- 
lious-distrustful octant (FG). This likely ac- 
counts for the noticeable gap in the upper 
right quadrant of the system (Figure 3). 
Tenderness and generosity are not bipolar 
contrasts to rebellious and distrustful behav- 
ior. Tenderness and generosity are too weak 
and loving to be placed this high on the circle. 
As Lorr and McNair (1965) noted earlier, 
the NO octant, which falls between dominance 
and love, reflects a socially exhibitionistic 
style of behaving (gregarious-extraverted). 

With a little effort, our revised representa- 
tion of the 16 interpersonal variables can be 
read from Figure 2. Each variable represents 
a bipolar contrast to the variable appearing 
opposite it on the circle. The contrasts are 
dominant-submissive, arrogant-unassuming, 
calculating-ingenuous, cold-warm, quarrel- 
some-agreeable, aloof-gregarious, introverted- 
extraverted, and ambitious-lazy. 

In developing our bipolar clusters, we at- 
tempted to select items that were highly nega- 
tively correlated with their opposite cluster 
and that had zero correlations with their 
theoretically orthogonal clusters. Thus, a 
good dominant item should have a high posi- 
tive correlation with a dominant cluster, a 
high negative correlation with a submissive 
cluster, and essentially zero correlations with 
quarrelsome and agreeable clusters. We used 
a tentative set of 16 four-item bipolar clusters 
as markers to select items having the prop- 
erties just described. This enabled us to re- 
vise and expand our tentative clusters to 
eight-item variables. Factor analysis of the 
eight-item variables in Goldberg’s (Note 2) 
sample of 187 subjects revealed the clearest 
circumplex structure we had yet encountered 
in our own work or in the literature. This re- 
sult was replicated in an additional sample 
of 119 subjects from the University of British 
Columbia. 


Development of Final Scales 


The items in our eight-item bipolar inter- 
personal clusters were developed from a 
pool of 567 adjectives classified as interper- 
sonal in our initial rational sorting of the 
total list of 1,710 adjectives. The possibility 
clearly existed that some adjectives from 
among the 1,143 not considered would be 
appropriately placed within the 16 interper- 
sonal categories. Consequently, a program 
was written to examine the relationships be- 
tween all 1,710 adjectives and the 16 bipolar 
interpersonal clusters. The program correlates 
each adjective with the 16 bipolar clusters, 
selects the cluster with which the adjective 
is most highly correlated, and then prints out 
an ordered list of variables within each of the 
16 categories. This listing provided the cor- 
relation of each item with the cluster with 
which it was most highly correlated. It also 
provided the correlation with the bipolar op- 
posite cluster and with the two orthogonal 
clusters. This procedure produced 16 lists 
that located all 1,710 adjectives with respect 
to the interpersonal cluster with which they 
were most highly correlated. 
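
The logic of such a program can be sketched briefly. The original program is not reproduced in the text, so the function names, the data layout, and the five-subject ratings below are hypothetical. The sketch also leaves an item in its own cluster score when correlating, a simplification that the source does not address.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cluster_scores(ratings, clusters):
    """Score each cluster as the per-subject sum of its items' ratings."""
    return {
        name: [sum(vals) for vals in zip(*(ratings[item] for item in items))]
        for name, items in clusters.items()
    }


def assign_adjectives(ratings, clusters):
    """Place every adjective in the cluster it correlates with most highly."""
    scores = cluster_scores(ratings, clusters)
    lists = {name: [] for name in clusters}
    for adjective, values in ratings.items():
        corrs = {name: pearson(values, s) for name, s in scores.items()}
        best = max(corrs, key=corrs.get)
        lists[best].append((adjective, round(corrs[best], 2)))
    for name in lists:
        # Order each list from the highest to the lowest correlation.
        lists[name].sort(key=lambda pair: pair[1], reverse=True)
    return lists


# Hypothetical 9-place self-ratings from five subjects (illustrative only).
ratings = {
    "assertive": [8, 7, 3, 2, 6],
    "forceful": [9, 6, 2, 3, 7],
    "timid": [2, 3, 8, 7, 3],
    "meek": [1, 4, 9, 8, 2],
    "firm": [7, 7, 4, 2, 6],
}
clusters = {"dominant": ["assertive", "forceful"], "submissive": ["timid", "meek"]}
result = assign_adjectives(ratings, clusters)
for name, members in result.items():
    print(name, members)
```

In this toy example the unclassified adjective "firm" is listed under the dominant cluster, which is the kind of placement the ordered lists were meant to reveal.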

Two of us then went through these lists 
and examined the correlation of each item with 
the cluster from the category to which it had 
been assigned, as well as the correlation of 
the item with its opposite and orthogonal 
clusters.³ Items that clearly belonged in the 
interpersonal category to which they had been 
assigned were retained. Items that did not 
appear to belong in the interpersonal category 
to which they were assigned were placed in 
one of eight other taxonomic categories that 
then existed (e.g., temperament, character, 
material traits, etc.). 

At this point, an attempt was made to de- 
velop more refined subcategories within the 
broader taxa. Preliminary groupings of clusters 
were formed within the domains of tempera- 
ment, character, attitudes, mental predicates, 
and social roles. This classification added 38 
categories to the 16 interpersonal clusters, 
making a total of 54 categories within the 
preliminary taxonomy. These 54 categories 
were scored as scales by computing, for each 
subject, the sum of his or her ratings to the 
items in that category. In the case of the 16 
interpersonal categories, the 25 best items, as 
determined from item-cluster correlations, 
served as the reference scale for each category. 
This stage of analysis was designed to 
determine more exactly the classification of 
doubtful adjectives in the preceding taxonomic 
sort. These included adjectives [illegible] 
but that seemed to be in the 
[illegible] interpersonal category, adjectives that 
were unclassifiable in any category, and ad- 
jectives that were candidates for deletion on 
the grounds of ambiguity, obscurity, or lack 
of personological relevance. In addition, all 
adjectives in the interpersonal category other 
than the 400 (16 × 25) marker adjectives 
were classified as doubtful for the purpose of 
this analysis. There were 768 such adjectives 
designated as questionable by this criterion. 
Each of the 768 adjectives was correlated with 
all of the 54 taxonomic category scales. Two 
of us then examined this pattern of correla- 
tions for all 768 words and assigned them to 
one of the 54 categories on the basis of both 
conceptual and empirical (correlational) con- 
siderations. Some items were retained in their 
original interpersonal category, others moved 
to other categories, and some deleted from 
further consideration. By these procedures, a 
revised 54-category taxonomy was developed. 
Within this revised taxonomy, 864 adjectives 
were classified as interpersonal. 

The selection of items for inclusion in the 
final interpersonal trait scales involved one 
more cycle in our iterative procedures. The 
25 “best” items previously identified in each 
of the 16 interpersonal categories were scored 
as reference clusters. These reference clusters 
provided a broader representation of the con- 
tent of the final interpersonal variables than 
did our earlier clusters. Within each category, 
each item was correlated with its own 25-item 
cluster and with its 25-item opposite and 
orthogonal clusters. The items were then 
ordered within each category by their cor- 
relations with their own clusters. 

Working independently, three of us selected 
the “best” eight items in each of the 16 in- 
terpersonal categories. One rater made selec- 
tions strictly on the basis of a summary nu- 
merical index based on item correlations with 
same, opposite, and orthogonal clusters, a 
procedure referred to as empirical. Another 
rater attended to the same item correlations, 
but attempted to choose the more “meaning- 
ful” adjectives from pairs that had roughly 
similar empirical characteristics, a procedure 
designated quasi-empirical. The third rater 
selected sets of eight adjectives that were 
“psychologically cohesive” in terms of his 
conception of the constructs under investiga- 
tion, the rational procedure. 


3 I would like to acknowledge the substantial help 
of Ana Holzmuller in this and many other phases of 
the project. 

The empirical and quasi-empirical item- 
selection procedures yielded sets of 16 scales 
with highly similar psychometric properties. 
For a specific sixteenth, it was possible to 
identify one or the other of the two scales 
as being preferable, either on grounds of a 
higher alpha coefficient or on the grounds of 
a better position in the 16-variable circum- 
plex plot. Consequently, a combination set of 
8 variables was formed from the best of the 
empirical and quasi-empirical sixteenths. 

An additional procedure for item selection 
was based on a more fine-grained analysis of 
the circumplex properties of individual items. 
Each item was correlated with the 16 25-item 
clusters that served as markers of the inter- 
personal variables. This pattern of correla- 
tions was then inspected for the number of 
departures from perfect circumplex ordering 
that occurred. Items with the smallest num- 
ber of departures from this pattern were re- 
tained in a set of scales labeled “item order- 
ing.” 
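
One way to count departures from circumplex ordering is sketched below. The text does not state the counting rule that was used, so the rule here is an assumption: moving away from an item's own cluster in either direction around the circle, each step at which the correlation rises instead of falling is counted as one departure. The function name and the example profile are hypothetical.

```python
from math import cos, pi


def ordering_departures(corrs, home):
    """Count steps, in both directions from home, where the correlation rises."""
    n = len(corrs)
    half = n // 2
    departures = 0
    for direction in (1, -1):
        for step in range(half):
            nearer = corrs[(home + direction * step) % n]
            farther = corrs[(home + direction * (step + 1)) % n]
            if farther > nearer:
                departures += 1
    return departures


# A perfectly ordered (cosine-shaped) profile for an item in cluster 0.
perfect = [cos(2 * pi * j / 16) for j in range(16)]

# The same profile with one correlation out of order (illustrative only).
disturbed = list(perfect)
disturbed[2] = 0.99

print(ordering_departures(perfect, home=0))
print(ordering_departures(disturbed, home=0))
```

Under this rule, items with the fewest departures would be the ones retained in the item-ordering scales.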

The final selection of a 128-item set of in- 
terpersonal variables was based on an index 
of circumplexity that permits comparison 
among competing scale sets (Wiggins & Mar- 
ston, Note 4). Very briefly, two factors were 
extracted from the intercorrelations among 
interpersonal variables in a given scale set. A 
hypothetical factor matrix was constructed to 
represent a perfect circumplex solution for 
variables of this level of reliability (as esti- 
mated from communalities). The sum of the 
absolute differences between elements in the 
hypothetical factor matrix and elements in 
the obtained factor matrix was taken as an 
index of circumplexity. Each of the five sets 
of interpersonal scales was then compared on 
this index. Both octants and sixteenths were 
evaluated in Goldberg’s (Note 2) sample of 
187 University of Oregon undergraduates and 
in Norman’s sample of 123 University of 
Western Australia undergraduates. An over- 
all index of circumplexity, based on both 8- 
and 16-variable solutions in American and 
Australian samples, suggested that the com- 
bination set of 128 adjectives had a slight 
edge over the item-ordering, empirical, and 
quasi-empirical procedures and a substantial 
advantage over the rational procedure, which 
fared rather badly. This combination set of 
adjectives, which has been employed in all 
subsequent investigations, is listed in Table 2. 
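
The index of circumplexity can be sketched under some stated assumptions. The details of the Wiggins and Marston procedure are not given in the text, so the following treats the hypothetical matrix as equally spaced vectors whose lengths are the square roots of the obtained communalities, with the first variable placed at a caller-supplied starting angle. All names and values are illustrative.

```python
from math import cos, pi, sin, sqrt


def hypothetical_loadings(communalities, start_angle=0.0):
    """Two-factor loadings for equally spaced variables of given communality."""
    n = len(communalities)
    rows = []
    for i, h2 in enumerate(communalities):
        theta = start_angle + 2 * pi * i / n
        radius = sqrt(h2)
        rows.append((radius * cos(theta), radius * sin(theta)))
    return rows


def circumplexity_index(obtained, start_angle=0.0):
    """Sum of absolute differences between obtained and hypothetical loadings."""
    communalities = [a * a + b * b for a, b in obtained]
    ideal = hypothetical_loadings(communalities, start_angle)
    return sum(
        abs(o - h)
        for obtained_row, ideal_row in zip(obtained, ideal)
        for o, h in zip(obtained_row, ideal_row)
    )


# Eight octants in perfect order, each with communality .64 (illustrative only).
perfect = hypothetical_loadings([0.64] * 8)

# The same loadings with two neighbouring octants interchanged.
swapped = list(perfect)
swapped[1], swapped[2] = swapped[2], swapped[1]

print(round(circumplexity_index(perfect), 3))
print(round(circumplexity_index(swapped), 3))
```

Smaller values indicate a closer approach to the hypothetical structure, so competing scale sets can be ranked by this sum.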


Generalizability of Circumplex Structure 


The present taxonomy of the interpersonal 
domain is based on an explicit structural 
model (Guttman, 1954) that follows from a 
facet analysis of cognitive categories of so- 
cial perception (Foa & Foa, 1974). On the 
basis of both theoretical and psychometric 
considerations, a set of eight 16-item scales 
were developed as marker variables of the 
principal vectors of this system. In addition 
to being potentially useful personality assess- 
ment measures in their own right, these scales 
enable one to classify any interpersonal trait 
descriptor by establishing its location within 
the circumplex space. However, the mean- 
ingfulness of such classification depends 
heavily on the generalizability of the present 
circumplex structure to samples other than 
those involved in the derivation of the eight 
interpersonal scales. The scales were derived 
primarily from Goldberg’s (Note 2) sample 
of American students and cross-validated at 
various stages in samples of Canadian stu- 
dents and in Norman’s sample of Australian 
students. Although relatively diverse samples 
of students were employed in scale deriva- 
tion, evidence for the generalizability of the 
circumplex structure must come from other 
sources. Four samples of subjects were tested 
for this purpose. 


Cross-Validation Samples 


Sample A. Subjects were recruited by ad- 
vertisements to participate in a “personal- 
ity study” that involved two separate testing 
sessions of approximately 2 hours each 
(Marston, Note 5). Subjects were paid [amount illegible] 
each for their participation and were provided 
individual feedback if requested. The 128 in- 
terpersonal adjectives were embedded in 
a larger list of adjectives whose order was ran- 
domized for each subject. Four 
personality inventories were administered in 
[illegible] orders along with the total list of 
adjectives. The 152 subjects (51 men and 
101 women) included both undergraduate and 
graduate students from a variety of academic 
[illegible] at the University of British Columbia. 

Sample B. Samples of volunteer [text illegible] 
professional workers were recruited 
from three different social service agencies in 
the Greater Vancouver area (Merritt, Note 
6). The 128 interpersonal adjectives were ad- 
ministered in a single booklet along with a 
value survey and a standardized personality 
inventory. Testing was accomplished indi- 
vidually, in small groups, or on a take-home 
basis. One hundred subjects (29 men and 71 
women) completed the interpersonal adjec- 
tive form. 


Table 2 
Final Set of Interpersonal Adjective Scales 

P. Ambitious        A. Dominant         B. Arrogant         C. Calculating 
(Success)           (Power)             (Narcissism)        (Exploitation) 
Persevering         Dominant            Bigheaded           Sly 
Persistent          Assertive           Boisterous          Tricky 
Industrious         Forceful            Conceited           Wily 
Self-disciplined    Domineering         Boastful            Cunning 
Organized           Firm                Overforward         Overcunning 
Deliberative        Self-confident      Swellheaded         Crafty 
Stable              Self-assured        Cocky               Calculating 
Steady              Un-self-conscious   Flaunty             Exploitative 

D. Cold             E. Quarrelsome      F. Aloof            G. Introverted 
(Hate)              (Hostility)         (Disaffiliation)    (Withdrawal) 
Warmthless          Impolite            Antisocial          Silent 
Unsympathetic       Uncordial           Unneighborly        Shy 
Ironhearted         Discourteous        Impersonal          Introverted 
Uncharitable        Ungracious          Unsociable          Bashful 
Coldhearted         Disrespectful       Distant             Inward 
Hardhearted         Uncooperative       Dissocial           Unrevealing 
Cruel               Ill-mannered        Unsmiling           Unsparkling 
Ruthless            Uncivil             Uncheery            Undemonstrative 

H. Lazy             I. Submissive       J. Unassuming       K. Ingenuous 
(Failure)           (Weakness)          (Modesty)           (Trust) 
Unproductive        Self-doubting       Nonegotistical      Uncunning 
Lazy                Self-effacing       Undemanding         Uncalculating 
Unthorough          Timid               Unvain              Uncrafty 
Unindustrious       Meek                Unwild              Unwily 
Inconsistent        Unbold              Unargumentative     Unsly 
Disorganized        Unaggressive        Boastless           Guileless 
Unbusinesslike      Forceless           Pretenseless        Undevious 
Impractical         Unauthoritative     Conceitless         Undeceptive 

L. Warm             M. Agreeable        N. Gregarious       O. Extraverted 
(Love)              (Collaboration)     (Affiliation)       (Outgoingness) 
Tenderhearted       Courteous           Friendly            Outgoing 
Gentlehearted       Charitable          Genial              Extraverted 
Tender              Well-mannered       Neighborly          Vivacious 
Kind                Respectful          Companionable       Jovial 
Emotional           Cordial             Approachable        Enthusiastic 
Sympathetic         Cooperative         Congenial           Cheerful 
Softhearted         Accommodating       Good-natured        Perky 
Appreciative        Forgiving           Pleasant            Unshy 

Sample C. Students in the second term of 
an introductory psychology class at the Uni- 
versity of British Columbia were administered 
the 128 adjectives, embedded in a larger list 
of adjectives, during a regular class period. 
The students were given a guest lecture on 
personality testing in exchange for their par- 
ticipation. Data were collected for 132 sub- 
jects (57 men and 75 women). 

Sample D. Summer students enrolled in 
an introductory psychology class at the Uni- 
versity of British Columbia were administered 
the 128 adjectives, embedded in a larger list, 
during a regular class period. The students 
later received their own standardized scores 
on the interpersonal variables and a lecture 
on interpersonal behavior. Data were collected 
for 139 subjects (58 men and 81 women). 


[Figure 4 plots: loadings of the eight interpersonal scales on two rotated principal components, one panel each for Samples A, B, C, and D.] 

Figure 4. Structure of final interpersonal variables in four samples. (PA = ambitious-dominant; 
BC = arrogant-calculating; DE = cold-quarrelsome; FG = aloof-introverted; HI = lazy-submis- 
sive; JK = unassuming-ingenuous; LM = warm-agreeable; NO = gregarious-extraverted.) 


Circumplex Analyses 


Intercorrelations were obtained among the 
eight interpersonal adjective scales in each of 
the four samples just described. Two prin- 
cipal components were then extracted from 
each of the intercorrelation matrices. In the 
respective analyses, these two components ac- 
counted for 76.1% (Sample A), 61.9% (Sam- 
ple B), 72.0% (Sample C), and 74.8% (Sample 
D) of the total variance. For comparative ease 
of inspection, the second component in each 
solution was hand rotated to pass through 
the ambitious-dominant vector (PA). Figure 
4 depicts the loadings of the eight interper- 
sonal scales on two hand-rotated principal 
components in each of four samples. 
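
The rotation step can be illustrated with a small sketch. The rotation reported in the text was done by hand; the analytic version below is a stand-in that assumes a rigid rotation of the two-component loadings, chosen so that a reference scale falls on the positive second axis. The function name and the loadings are hypothetical.

```python
from math import atan2, cos, pi, sin


def rotate_to_reference(loadings, reference):
    """Rigidly rotate 2-D loadings so the reference scale lies on the second axis."""
    x, y = loadings[reference]
    # Angle that carries the reference vector to 90 degrees.
    phi = pi / 2 - atan2(y, x)
    c, s = cos(phi), sin(phi)
    return {
        name: (c * a - s * b, s * a + c * b)
        for name, (a, b) in loadings.items()
    }


# Hypothetical unrotated loadings for two of the eight scales (illustrative only).
loadings = {"PA": (0.6, 0.6), "LM": (0.6, -0.6)}
rotated = rotate_to_reference(loadings, "PA")
for name, (first, second) in rotated.items():
    print(name, round(first, 3), round(second, 3))
```

Because the rotation is rigid, the angular separations among the scales and their communalities are unchanged; only the orientation of the plot differs, which is what makes the four samples easy to compare by eye.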

Perfect, evenly spaced circumplexes (Fig- 
ure 2) are not expected in real data, because 
of measurement error, and hence the struc- 
tures in Figure 4 are more properly described 
as quasi-circumplexes (Guttman, 1954). Al- 
though the amount of [illegible] one is 
prepared to tolerate in empirical [illegible] is to 
some extent a matter of taste, it [illegible] 
to say that the quasi-circumplex structures in 
Figure 4 are among the clearest reported in 
the personality literature to date. The cir- 
cumplex ordering obtained in scale derivation 
[text missing] relatively di- 
verse samples of subjects tested in differing 
contexts. Hence, the eight scales may prove 
useful both as measuring instruments and as 
reference points for the classification of items 
and scales in the interpersonal domain. 


Table 3 
Psychometric Characteristics of Interpersonal Adjective Scales 

                  Scale M and SD (a)                               Internal consistency 
                                              Social desir-        (coefficient alpha) 
          Men         Women        Total      ability rating    Total sample   Range from the 
       (n = 236)    (n = 374)    (N = 610)      (N = 100)        (N = 610)     4 subsamples 
Scale   M     SD     M     SD     M     SD      M      SD 
PA     5.87   .95   5.72  1.01   5.79   .99    6.17   1.10         .855          .831-.880 
BC     4.30  1.02   3.60  1.07   3.87  1.05    3.66    .98         .870          .845-.885 
DE     2.95   .89   2.48   .80   2.66   .84    2.31    .81         .877          .862-.889 
FG     4.28  1.19   3.87  1.17   4.03  1.17    3.54    .71         .891          .887-.894 
HI     3.95   .94   4.13  1.01   4.06   .99    3.37    .61         .809          .769-.838 
JK     4.6?   .83   5.14  1.00   4.95   .97    5.73    .95         .801          .743-.840 
LM     6.4?   .80   7.08   .76   6.8?    ?     7.53    .73         .865          .841-.879 
NO     6.17   .96   6.54   .9?   6.40   .96    7.52    .66         .897          .887-.917 

Note. Scale labels stand for ambitious-dominant (PA); arrogant-calculating (BC); 
cold-quarrelsome (DE); aloof-introverted (FG); lazy-submissive (HI); 
unassuming-ingenuous (JK); warm-agreeable (LM); and gregarious-extraverted (NO). 
a Scale values range from 1 (extremely inaccurate description) to 9 
(extremely accurate description). 


Psychometric Characteristics 


Sex Differences 


Table 3 presents normative and psycho- 
metric data for each of the eight interpersonal 
scales. The first six columns con- 
tain means and standard deviations in a com- 
bined sample of 610 North American univer- 
sity students. Inspection of the separate means 
for men and women reveals clear-cut, and to 
some extent predictable, sex differences in 
self-report. In comparison with women, men 
present themselves as more ambitious-domi- 
nant, arrogant-calculating, cold-quarrelsome, 
and aloof-introverted. Women present them- 
selves as more lazy-submissive, unassuming- 
ingenuous, warm-agreeable, and gregarious- 
extraverted. All of these mean differences are 
statistically reliable. Surprisingly, the smallest 
differences are on ambitious-dominant (p 
< .03) and lazy-submissive (p < .01); all 
other differences are highly reliable (p < 
.0001). 


Sex differences in self-report on the inter- 
personal circumplex provide a graphic repre- 
sentation of sex role stereotypes in North 
American society (Wiggins & Holzmuller, 
1978). These differences are sample dependent 
and may possibly decrease in future years. 
Samples of university students are not uni- 
formly “stereotyped.” They contain differing 
proportions of complexly “sex-reversed” sub- 
jects, such as men who score high on lazy- 
submissive and on aloof-introverted and 
women who score high on ambitious-dominant 
and on gregarious-extraverted (Wiggins & 
Holzmuller, Note 7). 


Social Desirability 


Social desirability ratings of the individual 
adjectives were obtained from Norman’s 
monograph (Note 3). One hundred Univer- 
sity of Michigan undergraduates (50 men and 
50 women) rated the adjectives on a 9-place 
social desirability rating scale. The means 
and standard deviations for the 16 adjectives 
in each scale appear in columns 7 and 8 of 
Table 3. Comparing the total-sample mean 
self-report scale scores with corresponding 
mean social desirability ratings, it is clear that 
the well-known relationship between endorse- 
ment and desirability is found within the in- 
terpersonal domain (Edwards, 1957a). More- 
over, the similarity between mean endorse- 
ment and mean desirability with respect to 
the patterning or ordering of interpersonal 
variables is open to alternative interpretations 
of the circumplex structure in terms of stylistic 
response variables.⁴ 

A variety of procedures have been sug- 
gested for coping with the endorsement-de- 
sirability confound in self-report personality 
data. On the assumption that in principle, re- 
sponses within all interpersonal vectors should 
be equiprobable, LaForge and Suczek (1955) 
tried to write desirable phrases for undesir- 
able dimensions (“Can be strict if necessary”) 
and undesirable phrases for desirable dimen- 
sions (“Spoils people with kindness”). Jack- 
son’s (1970) differential reliability index re- 
moves desirability variance and enhances con- 
tent saturation at the stage of item selection. 
Norman (Note 9) has described analysis of 
variance procedures for removing desirability 
variance from the Subject x Item response 
matrix. Finally, several individual difference 
measures of social desirability response style 
have been constructed to permit the assess- 
ment of desirability responding for each sub- 
ject (e.g., Crowne & Marlowe, 1960; Edwards, 
1957b; Wiggins, 1959). 

The extent to which [...]personal vari[...] 
desirability (or did not reveal sex differences) 
would be a [...] life categories of social 
perception. On the other hand, basic [...] of 
responding to [...] may require vari[...] 
scores on the in[...] be made with refer[...] 
of the kind provided [...] judged only with 
reference to the scores of one's [...] 


Internal Consistency 


In the selection of item sets to mark 
vectors of the interpersonal domain, the prin- 
cipal itemmetric criterion was that of struc- 
tural fidelity (Loevinger, 1957). By a variety 
of procedures, we attempted to ensure that 
each item selected was properly located with 
reference to a circumplex that we had adopted 
as the most appropriate measurement model 
for representing the relationships among cate- 
gories of interpersonal perception. Items were 
also selected that had relatively high correla- 
tions with appropriate preliminary scales. 
Final scales thus constructed are likely to 
be internally consistent, although this must 
be demonstrated rather than assumed. 

The next-to-last column of Table 3 presents 
coefficient alpha (Cronbach, 1951) estimates 
of internal consistency for each of the inter- 
personal scales in a combined sample of uni- 
versity students. The final column of the table 
indicates the range of alpha coefficients ob- 
tained in the four samples that constituted the 
combined student group. Although some of 
the interpersonal scales are more reliable 
than others, all of them meet a fairly 
stringent requirement of acceptable internal 
consistency (α > .80). The range of alpha 
coefficients in the different samples tends to 
be small, and this, together with the repre- 
sentativeness with which the universe of con- 
tent was sampled, makes it reasonable to as- 
sume that the mean alpha values reported 
are good estimates of the characteristic degree 
of interitem structure for the different inter- 
personal variables (Loevinger, 1957). In this 
respect, the extraversion-introversion dimen- 
sion (NO-FG) is the most internally cohesive, 
and the variables of unassuming-ingenuous 
(JK) and lazy-submissive (HI) are the least. 
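
For readers who wish to compute the same kind of estimate, coefficient alpha (Cronbach, 1951) can be obtained from item variances and the variance of the summed scale. The sketch below uses the standard formula; the function names and the small data set are hypothetical and are not the ratings analyzed in Table 3.

```python
def variance(values):
    """Unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def coefficient_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list of ratings per item."""
    k = len(item_scores)
    # Scale score for each subject is the sum of that subject's item ratings.
    totals = [sum(subject) for subject in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 9-place ratings: four items answered by six subjects.
items = [
    [7, 5, 8, 3, 6, 4],
    [6, 5, 9, 2, 7, 4],
    [8, 4, 7, 3, 6, 5],
    [7, 6, 8, 4, 5, 3],
]
print(round(coefficient_alpha(items), 3))
```

Alpha rises toward 1.0 as the items covary more strongly relative to their separate variances, which is the sense in which the scales of Table 3 are described as internally consistent.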


Discussion 


The long-range goal of the research re- 
ported in this article was the development of 
a psychological taxonomy that would encom- 
pass the approximately 4,000 relatively fa- 
miliar trait-descriptive terms identified by 
Norman (Note 3). The research strategy 
adopted for this task relied heavily on cer- 
tain a priori distinctions between different 
domains of human characteristics. In the pres- 
ent work, the domain of interpersonal traits 
was defined in such a way as to distinguish 
it from other domains such as temperament, 
character, moods, and cognitive traits. These, 
or similar, distinctions have been made by 
most personality theorists concerned with 
taxonomic issues (e.g., Allport, 1937; Cattell, 
[year illegible]; Murray, 1938). However, within the 
interpersonal domain, the adoption of a 
two-dimensional real-space circumplex model 
(Benjamin, 1974) to capture the interrela- 
tionships among trait terms represents a per- 
sonal methodological preference that is per- 
haps not as widely shared. On the basis of 
the work reported in this article, it seems 
[illegible] to conclude that the domain of interper- 
sonal trait descriptors can be classified, quite 
precisely, with reference to a circumplex model, 
a conclusion that was not obvious when we 
began. Since there are, no doubt, many other 
ways in which the interpersonal domain could 
be classified, under different definitions and 
different models, the specific advantages of the 
present framework must be discussed in more 
detail. 


4 For example, Douglas N. Jackson (personal com- 
munication, November 27, 1975) is convinced that 
the present interpersonal circumplex structure, as 
well as others reported in the literature, can be 
accounted for in terms of his threshold theory of 
desirability responding (Rogers, 1971; Jackson, Note 
8). 

Two-dimensional circular orderings of inter- 
personal variables have been reported in the 
literature for more than 40 years (Schaefer, 
1961), and the general conceptual model un- 
derlying this structure can be traced as far 
back as Galen (Roback, 1928). [...] 
“convergences” in conception and structure 
have been noted for studies varying widely 
in variables, populations, and measurement 
procedures (Foa, 1961; Schaefer, 1961; Wig- 
gins, 1968). These convergences do not stem 
from the similarity of generative mechanisms 
postulated by different theorists (needs, in- 
terests, dynamisms, etc.) to account for sur- 
face patterns. Instead, the convergences re- 
flect a set of semantic categories that impart 
a common meaning to social events observed 
[...]. The categories 
[...] by the ordinary language of per- 
[...] are not sharply demarcated, and 
[...] best thought of as “fuzzy 
sets” (Zadeh, Fu, Tanaka, & Shimura, 1975) 
rather than as precise logical distinctions. The 
circumplex model appears to be particularly 
well suited for representing elements (in the 
present case, adjectives) whose class mem- 
bership is continuous rather than discrete. 

Although often working in apparent isola- 
tion from one another, numerous investigators 
have proposed highly similar models of inter- 
personal behavior for the study of parents 
(Chance, 1959; Roe & Siegelman, 1963; 
Schaefer, 1959), children (Baumrind & Black, 
1967; Becker & Krug, 1964; Schaefer & Bay- 
ley, 1963), parents and children (Benjamin, 
1974; Foa, Triandis, & Katz, 1966), normal 
and abnormal adults (Leary, 1957; Lorr & 
McNair, 1963), psychotics (Lorr, Klett, & 
McNair, 1963), and college students (Stern, 
1970). Within many of these diverse pro- 
grams of research, the circumplex model has 
provided a nomological network that has 
enhanced the meaning and significance of 
the separate interpersonal variables employed 
(Schaefer, 1961). Unfortunately, the potential 
of the circumplex as an integrative conceptual 
model has not been as widely recognized by 
those whose research paradigms have focused 
on the intensive study of single dimensions in 
personality and social psychology. 

In a recent book that presents summaries 
of research on the major dimensions of per- 
sonality, the editors state: 


There obviously has been no overarching plan or
theory, implicit or explicit, guiding the selection of
topics for trait researchers. Indeed, the editors were
forced to organize the book by means of the
unsophisticated tactic of simply placing the chapters in
alphabetical order. (London & Exner, 1978, p. xiv)


But surely the major research topics in contemporary
personality and social psychology,
such as achievement and power (PA),
Machiavellianism (BC), aggression (DE),
introversion (FG), obedience (HI), interpersonal
trust (JK), altruism and helping behavior
(LM), affiliation and extraversion (NO),
can be related to each other by schemes more
[...]s not to claim an es[...]
the research topics just enumerated [...]
present interpersonal taxonomy; rather, it is
to illustrate the kinds of questions whose
answers might bring conceptual clarity to
both taxonomic and experimental research.
For example, once it is recognized that the 
dimensions measured by Bem’s (1974) masculinity
and femininity scales are the orthogonal
interpersonal dimensions of
ambitious-dominant (PA) and warm-agreeable
(LM), then the issue of the “independence”
of masculinity and femininity, so conceived, is 
seen in a new light (Wiggins & Holzmuller, 
1978). 

The alphabetical list of human “needs” 
provided by Henry Murray (1938) has proved 
to be an exceptionally fertile source of per- 
sonality dimensions. Within the framework of 
Murray’s personology, several of these dimen- 
sions have been studied in considerable depth 
(e.g., Atkinson, 1958; McClelland, 1961;
Winter, 1973), although little attention has 
been given to the interrelationships among 
the needs in the overall system, perhaps be- 
cause an alphabetical taxonomy provides no 
guidance in this matter. Needs selected from 
Murray’s list have formed the basis for two 
multiscale inventories (Edwards, 1959; Jack- 
son, 1967) and for scoring keys on an adjec- 
tive checklist (Gough & Heilbrun, 1965). 
There appears to be a consistent factor struc- 
ture associated with the dozen needs these 
instruments happen to share in common, but 
attempts to interpret this structure have not 
been particularly enlightening from a substan- 
tive point of view (Huba & Hamilton, 1976). 
In contrast, the strangely neglected work of 
Stern (1958, 1970) presents convincing evi- 
dence that a circumplex model provides a 
meaningful representation of the full range of 
Murray’s need variables. 

Two-dimensional circumplex models may 
have utility outside the realm of ordinary 
language-trait description associated with per- 
sonality inventories and adjective checklists. 
The circular ordering of intercorrelations 
among clinical scales from the MMPI has 
been noted (Schaefer, 1961), and it has 
been argued that these scales can be inter- 
preted within a two-dimensional circular 
model of personality (Kassebaum, Couch,
& Slater, 1959). Given the manner in which
the MMPI clinical scales were constructed,
this argument appears rather farfetched on
initial consideration. However, a recent study
by Plutchik and Platman (1977) suggests a
possible reason why this might be the case.
Twelve trait-descriptive adjectives were selected
to represent an interpersonal circumplex
(Schaefer, 1961). Psychiatrists were
then asked to rate seven diagnostic categories
(e.g., paranoid, schizoid, hysterical) with
respect to these adjectives. The mean adjective
profiles of the seven diagnostic categories were
intercorrelated, and two factors were
extracted. The circular ordering of the diagnostic
categories (Plutchik & Platman, 1977,
p. 421) bears a striking resemblance to the
circular ordering of corresponding MMPI
scales reported elsewhere (Schaefer, 1961,
p. 135). The constellations of traits implied
by diagnostic labels can be represented in
a two-dimensional interpersonal circumplex
(Schaefer & Plutchik, 1966). To the extent
that the MMPI clinical scales reflect conventional
diagnostic labeling, they too would be
expected to exhibit a circular ordering.
The present taxonomy of interpersonal
traits may prove useful to investigators who
employ single adjectives as stimulus material
in studies of interpersonal perception, impression
formation, and trait attribution. The
representativeness of the stimulus materials is
always an issue in such studies, and the present
taxonomy provides a systematic basis for
sampling the entire domain of interpersonal
traits as an alternative to, for example, the
less differentiated strategy of sampling
“favorable” traits. Elaborate sampling procedures
for the study of one or more specific dimensions
of interpersonal behavior are made possible
by the availability of a list of 864 terms
that have been classified within 16 categories.
The position of an adjective within a given
category is indexed by the correlation of that
adjective with its own, its opposite, and its
orthogonal clusters. In addition, the earlier
itemmetric work of Norman (Note 3) provides
information on these same adjectives
with respect to such characteristics as social
desirability, difficulty level, self-endorsement,
and attributions to liked, indifferent, and
disliked others. Although such itemmetric data
are useful primarily for calibration of stimulus
materials, they may themselves be profitably
studied for the light they shed on basic
issues of social perception (Goldberg, 1978).
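
The indexing of an adjective against its own, opposite, and orthogonal clusters can be pictured with a small sketch. It assumes an idealized circumplex in which the 16 categories are equally spaced around a circle and the expected correlation between two categories is the cosine of their angular separation. The cosine rule, the zero-based category indices, and the function names are illustrative assumptions, not the scoring procedure reported in the article.

```python
import math

N_CATEGORIES = 16  # number of categories in the taxonomy, assumed equally spaced


def expected_correlation(i, j, n=N_CATEGORIES):
    """Idealized circumplex value: cosine of the angle between categories i and j."""
    angle = 2 * math.pi * ((i - j) % n) / n
    return math.cos(angle)


def reference_clusters(i, n=N_CATEGORIES):
    """Indices of the own, opposite, and two orthogonal clusters for category i."""
    return {
        "own": i,
        "opposite": (i + n // 2) % n,
        "orthogonal": ((i + n // 4) % n, (i - n // 4) % n),
    }


if __name__ == "__main__":
    refs = reference_clusters(0)
    print(refs)
    print(round(expected_correlation(0, refs["opposite"]), 2))
```

Under these assumptions an adjective that sits cleanly in one category would correlate near +1 with its own cluster, near -1 with the opposite cluster, and near 0 with the two orthogonal clusters.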
Self-Awareness, Psychological Perspective, and
Self-Reinforcement in Relation to Personal and Social Standards


Ed Diener and Thomas K. Srull 
University of Illinois at Urbana-Champaign 


The present study was designed to assess whether subjects would be more likely 
to judge their own behavior from a social perspective when they were self-aware 
than when they were non-self-aware. Participants reinforced themselves after 
receiving feedback about a performance task that indicated they had surpassed 
their own standard, a social standard, both standards, or neither standard. Sub- 
jects did not alter their reinforcement as a function of the self-awareness ma- 
nipulation when they surpassed both standards or neither standard. However, 
when subjects learned that they had surpassed their own standard but not the 
social standard, they rewarded themselves significantly more (p < .01) and
felt more satisfied with their performance (p < .01) when they were
non-self-aware. When subjects surpassed the social standard but not the personal
standard, they reinforced themselves significantly more (p < .01) and felt more
satisfied (p < .01) when self-aware. It was also found that self-awareness
decreased feelings of choice in administering self-reinforcement (p < .001). The
possibility that self-awareness may create a social perspective from which one’s
own behavior is evaluated and the implications this possibility has for future
theory development are discussed.


Duval and Wicklund (1972) outlined a
theory of self-awareness that has been extended
and refined by Wicklund (1975, in
press). The theory maintains that attention
may be focused either outward or upon the
self as an object. One major emphasis of the
theory of self-awareness has been on
standard-related behavior. The theory maintains
that self-awareness should increase adherence
to normative standards, because the
self-aware person will be highly aware of any
deviations between his or her behavior and
these standards. Moreover, since perceived
discrepancies between one’s behavior and salient
standards will cause negative affect, the
self-aware person is more likely to avoid
antinormative behavior.

Several studies have demonstrated the impact
of self-awareness on normative behavior.
For example, Diener and Wallbom (1976)
found a large reduction in cheating when college
student subjects were made self-aware.
It has also been found that objective
self-awareness increases aggression when subjects
think aggression is normative (Carver, 1974),
but decreases aggression when they believe it
is antinormative (Rule, Nesdale, & Dyck,
1975; Scheier, Fenigstein, & Buss, 1974). The
present experiment concerns two ambiguities
present in current conceptualizations of
self-awareness. The first is whether self-awareness
necessarily increases adherence to all types of
standards. The second is whether self-awareness
produces a purely introspective examination
of oneself or whether, under some conditions,
self-awareness may create a social perspective
from which one’s own behavior is
evaluated.
Types of Standards


The major purpose of the present study was 
to explore the question of whether self-aware- 
ness increases adherence to all types of norms, 
only to social norms, or only to personal 
norms. According to the original theory of 
objective self-awareness, compliance with both 
personal and social standards should increase 
in the state of objective self-awareness. In 
other words, the theory maintains that any 
salient standard will exert more influence if 
the person is objectively self-aware, regardless
of whether the standard is social or personal.
It is possible, however, that self-awareness
operates mainly upon one or the other
type of standard. Alternatively, Fenigstein,
Scheier, and Buss (1975) have identified two
types of self-awareness, public and private,
and it is possible that adherence to one type
of standard will depend on which type of
self-awareness is manipulated. It is clear that
social and personal standards will often covary
to some degree, and no effort has been
made in past research to examine their
independent effects.

Demonstrations of increased norm adher- 
ence have relied mostly on socially imposed 
standards (cf. Carver, 1974; Scheier et al., 
1974). It is difficult, however, to draw any 
firm conclusions regarding the relative influ- 
ences of personal and social standards from 
this literature. In most situations, including 
those used in past self-awareness research, 
social and personal standards are undoubtedly 
intertwined. For example, Carver (1975) 
found that self-reported belief in punishment 
led to greater administering of electric shock, 
but only when participants were self-aware. 
In contrast, Scheier (1976) found that self- 
awareness, but not personal attitudes toward 
punishment, led to heightened aggression 
among angry subjects. However, these findings
do not indicate that social or personal
standards were necessarily more powerful or
were the locus of the self-awareness effects.
Note that in each of the studies mentioned
above, personal and social standards were
both relevant to the behavior in question.
Moreover, the social standard had to be
filtered by the subject’s perceptions. Thus,
based on past research, it is difficult to separate
the relative influence of the two types of
standards. Even in cases where self-awareness
has increased compliance to personal stan- 
dards, it is quite possible that subjects per- 
ceived the social standard to be congruent 
with their own personal standard. Indeed, 
there is evidence to show that in the absence 
of contradictory information, people assume 
a close relationship between self- and social 
standards (Ross, Greene, & House, 1977). 


Psychological Perspective and Attention to 
Standards 


Another ambiguity associated with past 
research is suggested by Mead’s (1934) self 
theory, upon which the self-awareness model 
was originally based. According to Mead, self- 
awareness is the ability to look at oneself as 
others do, to adopt an outside social percep- 
tion of oneself. This suggests that the self- 
aware person might become more concerned 
with the opinions of others and more empathic 
with their viewpoint and therefore conform 
more to social expectations. On the other 
hand, the self-aware person might be more 
aware of personal standards because of self- 
focused attention and therefore conform more 
to them. This raises the further question of
whether the relative salience of personal and
social standards is the same. More specifically,
it is possible that the salience of personal
and social standards varies dramatically
across different situations, and this factor has
also been neglected in past research. In the
present study, personal and social standards
were both made highly explicit and varied
independently. This procedure allowed an
examination of the relative impact of personal
and social standards when they are similar as
well as when they are dissimilar.

It is undoubtedly the case that adherence
to personal and social standards is not mutually
exclusive, and one of the most important
immediate problems is to delineate the
situations in which the relative impact of one
type of standard is greater than the other. By
making both standards highly salient, and
thereby controlling for differences in attention,
the present study is concerned with one
such important class of theoretically [...]
situations.
In sum, it is suggested that self-awareness
has two principal components: attention to
aspects of the self and a social perspective of
oneself. It is possible that these two components
underlie the two individual difference
patterns of self-awareness (private and public)
identified by Fenigstein et al. (1975). It
is hypothesized that when social and personal
standards are independently varied, and attention
to standards is held constant, self-awareness
will have a greater impact on adherence
to social than personal standards, because
of the social perspective subjects have
of themselves. The differential reliance on
standards was assessed in a self-reinforcement
paradigm. Self-reinforcement was
adopted because it reflects in a concrete way
an individual’s evaluation of his or her behavior.
In addition, Diener (in press) has
recently stressed the interplay between
self-awareness and self-reinforcement in developing
a theory of deindividuation. Thus, the
use of self-reinforcement as a dependent measure
provides a potential operational bridge
between the theories of self-awareness and
deindividuation.


Method

Overview


Participants made a series of 12 perceptual judgments
and generated a standard of accuracy within
which they would consider an individual’s performance
to be successful. Later they were given systematically
varied bogus feedback concerning their
own performance levels, their self-standards of success,
and social standards of success ostensibly generated
by their peers. Trials on which the subject
surpassed both standards, one or the other standard,
or neither standard were thereby created. On the
basis of the social and self-standards and their own
performance levels, participants were allowed to
administer reinforcement to themselves under
self-aware and non-self-aware conditions.


Participants


Participants were 24 male and 24 female University
of Illinois introductory psychology students.
Participation in the experiment partially fulfilled a
requirement.


Procedure

Subjects were run in individual sessions. Each participant
was greeted by the female experimenter, who
introduced her male assistant. Both the experimenter
and assistant were blind to the experimental hypothesis.
The assistant requested the subject to recite
his or her name and hometown into a microphone.
This information was recorded on a tape loop for
later use.

The subject was told that he or she was participating
in the development of a perceptual judgment
task. The experimenter explained that she was interested
in the maximum performance of which
people are capable under actual testing conditions.
She stated that there are a variety of distractions,
such as the presence of proctors and stray noises,
present in testing situations and that concentration
is important in determining how well one performs.
She explained that there would therefore be a variety
of distractions present during the experiment.
She stated that as an incentive to perform well, the
subject would later be able to reward him- or her- 
self by taking from 0 to 9 “points” on each trial. 
Furthermore, at the end of the experiment, the total 
number of points accumulated would be converted 
to money. 

A slide projector was used to provide an example 
of the perceptual task that subjects would be asked 
to complete. After displaying a practice slide, the 
experimenter explained that each of the test slides 
would contain a large number of dots that would 
be shown for approximately 4 sec. The subject’s task 
consisted of two parts. First, he or she was to estimate
the number of dots on the slide. Second, the
subject was told: 


Obviously you will not be able to guess the exact 
number of dots after seeing the slide for only four 
seconds. Therefore, I would also like you to tell me 
within what range you think you should be for 
it to be a successful performance. That is, within 
how many dots should your estimate be for you 
to consider it a success? For example, you might 
estimate this slide to have, say, 160 dots and think 
you should be within 25 dots to have a successful 
estimate. 


The subject was then told that he or she would go 
through a series of test trials. Later, the experimenter
would review each trial and report the participant's 
actual performance level, his or her self-standard 
for a successful performance, and a social standard 
of success that was based on the performance of 250 
introductory psychology students from the previous 
semester. The subject would then be able to reward 
him- or herself for each trial. 

After any questions had been answered, the first 
test slide was presented. With the onset of the first 
slide, white noise began to filter into the room, os- 
tensibly as the first distractor. Throughout this 
procedure, the experimenter’s assistant could be 
seen sitting at a desk, recording the subject’s re- 
sponses and performing a number of calculations on 
a calculator. Although the actual performance of 
subjects on these tasks was of no import to the 
study, obtaining the estimates was necessary to make 
credible the later feedback about their performance 
and self-standards. 
At the end of the 12th slide, the experimenter explained
that her assistant had been calculating how
well the participant had performed on each trial in
terms of percentage of wrong answers. She then
took a booklet from her assistant and began reviewing
the 12 slides. For each trial, the subject was
told his or her own “performance level,” his or her
stated “self-standard” of success, and the “social
standard” of success. These three figures were represented
in supposed percentages and shown on a
large visual display to facilitate comparison. Obtain- 
ing estimates and standards in terms of raw numbers 
and later representing them as percentages allowed 
the experimenter to provide the false feedback with- 
out suspicion. After receiving the feedback for each 
trial, the subject took from zero to nine marbles 
from a container, placed them in an empty bowl as 
a reward, and told the experimenter how many 
marbles were taken. The number of marbles taken 
each time was recorded by the experimenter. The 
subject then made several short ratings, and the 
experimenter went on to the next trial. 

At the end of the 12th trial, subjects were com- 
pletely debriefed. Participants were told that the 
feedback and standards were bogus. The experi- 
menter carefully explained why this was a necessary 
part of the experiment and said that since she had 
promised that the marbles would be converted to 
money, she was paying each participant $2. The 
experimenter fully discussed any concerns of the
subject and was alert to any signs of upset about 
the deception. 

Self-awareness manipulation. Each subject was
exposed to both self-aware and non-self-aware ma- 
nipulations and was led to believe that these were 
different types of distractors. The self-aware condi- 
tion was created by having the subject’s image dis- 
played on a television screen. It has been shown that 
seeing one’s own image increases self-focused atten- 
tion (Carver & Scheier, 1978; Geller & Shaver, 
1976). In addition, the subject’s tape-recorded 
voice was played on a continuous loop, repeatedly 
saying, “My name is ___ and I am from ___.”
Thus, subjects in the self-aware condition were ex- 
posed to a voice (ostensibly similar to a person talk- 
ing) and a televised face (ostensibly similar to a 
proctor watching them). In the non-self-aware con- 
dition, an exciting automobile chase scene was played 
on the television. The videotape included both 
picture and sound and was presented as the type of 
distractor that might occur while a person is studying.
The order of the self-aware and non-self-aware
conditions was counterbalanced across subjects. Each 
subject rewarded him- or herself on Trials 1 through 
6 while in one condition and on Trials 7 through 12 
in the other condition. 

Feedback conditions. As noted above, systematic 
feedback was given to each subject for his or her 
“performance level,” “self-standard” of success, and 
“social standard” of success. A total of six feedback
conditions was created by having the subject’s ostensible
performance on different trials surpass either
(a) both the self- and social standard, (b) the self-
but not the social standard, (c) the social but not
the self-standard, (d) neither standard, (e) the
self-standard, with no information about the social
standard, or (f) the social standard, with no information
about the subject’s previous self-standard.
Each subject received the six feedback conditions in
a different random order. The feedback was constructed
in such a way that the percentage figures
given subjects varied considerably from trial to trial
in how close they were to the actual number of dots
[...]lievable. This maneuver seemed reasonable, since
some trials appeared very easy, while others appeared
very difficult. However, the discrepancy between
the subject’s ostensible performance and
either of the two standards was, although not exactly
[...] close for each trial. These discrepancies were
controlled by making them equivalent when summed
over subjects in any given condition.

Self-report measures. A variety of self-report
measures were obtained to explore the psychological
processes underlying subjects’ behavior in self-aware
and non-self-aware conditions, to check on the
validity of the manipulations, and to check for possible
suspicion. After the subject rewarded him- or
herself on each trial, two rating scales were completed.
One asked, “How satisfied were you with
your performance on the last task?” and the other
asked, “How much choice did you feel you had in
deciding how many points you should take?” At
the end of the 6th and 12th trials (i.e., at the end of
the self-aware and non-self-aware conditions), subjects
were also asked the extent to which they considered
their own standards of success and the
extent to which they considered the way they might
appear to others in rewarding themselves over the
previous 6 trials. All of the ratings were made on
11-point scales. The intrusion of additional rating
scales after the 6th and 12th trials was not as
conspicuous as it might seem, because they came
at the same time the assistant was changing from
one set of “distractors” to another. At the end of
the 12 trials, subjects were asked to guess the hypothesis
of the study, to say whether they had
heard about the experiment beforehand, and to say
what thoughts they had about the feedback they
received.


Experimental Design

Subjects were exposed to each feedback condition
under both self-aware and non-self-aware conditions.
The feedback conditions were presented in a
different randomized order for each subject. Thus,
the experiment was a 2 (sex) × 2 (order of self-aware
and non-self-aware conditions) × 2 (self-aware
vs. non-self-aware) × 6 (feedback conditions)
factorial design with repeated measures on the last
two factors.
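
The factorial layout described above can be enumerated directly. The sketch below simply crosses the factor levels named in the text; the dictionary names, the level labels, and the even split of subjects across the between-subjects cells are assumptions made for illustration.

```python
from itertools import product

# Between-subjects factors (each subject belongs to exactly one combination).
BETWEEN = {
    "sex": ["male", "female"],
    "order": ["self-aware first", "non-self-aware first"],
}

# Within-subjects factors (every subject completes every combination).
WITHIN = {
    "awareness": ["self-aware", "non-self-aware"],
    "feedback": [
        "surpass both",
        "surpass self, not social",
        "surpass social, not self",
        "surpass neither",
        "surpass self only (social unknown)",
        "surpass social only (self unknown)",
    ],
}

N_SUBJECTS = 48  # 24 male and 24 female participants


def cells(factors):
    """Return every combination of levels for the given factors."""
    return list(product(*factors.values()))


between_cells = cells(BETWEEN)  # groups of subjects
within_cells = cells(WITHIN)    # one trial per combination for each subject

# Assumption: subjects are divided evenly among the between-subjects cells.
subjects_per_group = N_SUBJECTS // len(between_cells)

if __name__ == "__main__":
    print(len(between_cells), len(within_cells), subjects_per_group)
```

The 12 within-subjects combinations correspond to the 12 trials on which each subject administered self-reinforcement.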
Table 1
Mean Number of Marbles Taken and Ratings of Personal Satisfaction
Under Self-Aware and Non-Self-Aware Conditions as a Function of
Type of Feedback

                   Surpass   Surpass   Surpass
                   self-     self-     social              Surpass   Surpass
                   and       but not   but not   Surpass   self-     social
                   social    social    self-     neither   standard  standard
Condition          standard  standard  standard  standard  only      only      Total

Number of marbles
  Own TV image
    and voice      6.81      3.44      5.75       .88      4.15      6.73      4.63
  Chase scene      6.73      5.50      3.65       .98      6.10      4.13      4.51
  Total            6.77      4.47      4.70       .93      5.13      5.43      4.57
Satisfaction
  Own TV image
    and voice      7.21      3.81      6.27      2.48      5.27      7.50      5.42
  Chase scene      7.42      6.00      4.50      2.56      6.65      5.25      5.40
  Total            7.31      4.90      5.39      2.52      5.96      6.38      5.41

Note. The possible range of marbles was 0 to 9 (higher numbers indicating greater self-reinforcement), and
the possible range of satisfaction ratings was 0 to 10 (higher values indicating greater satisfaction).
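
The pattern in the marbles rows of Table 1 is easiest to see as differences between the two awareness conditions. The sketch below only subtracts the tabled means (self-aware minus non-self-aware); the short condition labels and variable names are illustrative, and the differences carry no significance test of their own.

```python
# Mean number of marbles taken, transcribed from Table 1.
FEEDBACK = [
    "both",
    "self but not social",
    "social but not self",
    "neither",
    "self only",
    "social only",
]
SELF_AWARE = [6.81, 3.44, 5.75, 0.88, 4.15, 6.73]      # own TV image and voice
NON_SELF_AWARE = [6.73, 5.50, 3.65, 0.98, 6.10, 4.13]  # chase scene


def awareness_differences():
    """Self-aware minus non-self-aware mean reward for each feedback condition."""
    return {
        name: round(aware - unaware, 2)
        for name, aware, unaware in zip(FEEDBACK, SELF_AWARE, NON_SELF_AWARE)
    }


if __name__ == "__main__":
    for name, diff in awareness_differences().items():
        print(f"{name:22s} {diff:+.2f}")
```

Positive values mean more reward was taken when self-aware. The differences are near zero when both or neither standard was surpassed and change sign between the self-standard and social-standard conditions.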


Results 


The order in which subjects experienced 
the self-aware (own televised image and own 
voice) and non-self-aware (televised chase 
scene) conditions did not produce a significant 
difference on any of the dependent variables. 
Sex of subject was significant to the extent
that, across conditions, males took more marbles
(M = 4.95) than females (M = 4.19),
F(1, 44) = 9.28, p < .01. The order of the
self-awareness manipulation and sex of subject
were thus eliminated as variables, and
the design was considered a 2 (self-awareness
vs. non-self-awareness) × 6 (feedback conditions)
within-subjects factorial for all further
analyses. It should also be noted that no subject
had heard about the experiment beforehand,
suspected the feedback was false, or was
able to guess the hypothesis of the study.
These possibilities were assessed both in the
written questionnaire and in verbal interviews.


Self-Administered Reward and Personal 
Satisfaction


The mean number of marbles taken by subjects
in the 12 conditions is presented in
Table 1. There was a significant main effect
for feedback condition, F(5, 235) = 167.04, 
p < .001, and, consistent with the hypothe- 
sis, a highly significant interaction between 
feedback condition and the self-awareness 
manipulation, F(5, 235) = 60.62, p < .001. 
Specific sets of comparisons are particularly 
germane to the original hypothesis. Those 
trials in which subjects received feedback in- 
dicating that they surpassed their self-stan- 
dard but not the social standard, or surpassed 
the social standard but not their self-stan- 
dard, should indicate the relative weight sub- 
jects put on each type of standard in admin- 
istering reward under self-aware and non- 
self-aware conditions. Similarly, those trials 
in which subjects surpassed only the self- or 
only the social standard will indicate the 
relative importance of these standards across 
self-aware and non-self-aware conditions. 
Post hoc comparisons using procedures sug- 
gested by Tukey (see Keppel, 1973) were 
therefore computed on selected pairs of means. 
These comparisons revealed that when sub- 
jects surpassed their own but not the social 
standard, they rewarded themselves signifi- 
cantly more highly in the non-self-aware than 
in the self-aware condition (p < .01). In contrast,
when subjects surpassed the social but
not the self-standard, they rewarded themselves
more highly in the self-aware than in
the non-self-aware condition (p < .01). Further,
an identical pattern appears when subjects
were only given feedback with respect
to one of the standards. Specifically, when
subjects surpassed their self-standard but
were not told the social standard, they rewarded
themselves significantly more highly
in the non-self-aware than in the self-aware
condition (p < .01). However, when subjects
surpassed the social standard but were not
told their original self-standard, they rewarded
themselves more in the self-aware
than in the non-self-aware condition (p <
.01). Finally, it is also important to note that
there was no difference in self-reinforcement
as a function of self-awareness when subjects
surpassed both standards or neither standard.


Table 2
Mean Ratings of Choice Under Self-Aware and Non-Self-Aware
Conditions as a Function of Type of Feedback

                   Surpass   Surpass   Surpass
                   self-     self-     social              Surpass   Surpass
                   and       but not   but not   Surpass   self-     social
                   social    social    self-     neither   standard  standard
Condition          standard  standard  standard  standard  only      only

Own TV image
  and voice        6.19      5.29      5.31      4.60      5.42      5.63
Chase scene        7.90      7.60      7.69      7.56      7.73      7.48
Difference         1.71      2.31      2.38      2.96      2.31      1.85

Note. The possible range of ratings was 0 to 10, with higher values indicating a greater feeling of choice.


Subjects were asked how satisfied they were
with their performance following the feedback
on each trial. It was thought that these
ratings would also reflect the relative importance
subjects placed on surpassing their
own and the social standards in the two conditions.
The mean ratings of satisfaction,
which are also presented in Table 1, parallel
those for self-reinforcement quite closely.
There was a main effect for feedback condition,
F(5, 235) = 66.06, p < .001, and a significant
interaction between feedback condition
and the self-awareness manipulation,
F(5, 235) = 25.95, p < .001.

Post hoc comparisons revealed a pattern
identical to that for self-reinforcement. Specifically,
subjects rated themselves more satisfied
with their performance in the non-self-aware
than in the self-aware condition when
they surpassed their own but not the social
standard (p < .01). When they surpassed the
social but not their personal standard, however,
subjects rated themselves as more satisfied
in the self-aware than in the non-self-aware
conditions (p < .01). Similarly, when
subjects surpassed their own standard but did
not know the social standard, they rated
themselves as more satisfied in the non-self-aware
[...] self-standard. Here, subjects rated themselves
as more satisfied in the self-aware than in
the non-self-aware condition (p < .01). Consistent
with the hypothesis, there was also
no difference as a function of self-awareness
when subjects surpassed both standards or
neither standard.

One might expect ratings of satisfaction
and amount of self-administered reward to be
very redundant measures. However, the correlations
between these two variables, within
each of the conditions, ranged from .13 to [...],
with a mean of .41. This result suggests that
each of the measures contributed some
independent information.


Additional Ratings


Following each trial, subjects rated how
much choice they felt they had in determining
how many points to take. The mean ratings
of subjects for each of the 12 conditions are
presented in Table 2. There was a significant
feedback condition main effect, F(5, 235) =
2.98, p < .02. Examination of the means
reveals that, as one would expect, subjects felt
that they had the most choice when they sur- 
passed both standards and the least choice 
when they surpassed neither standard, with 
intermediate values in the other conditions.
More interesting was a strong main effect for 
the self-awareness manipulation, F(1, 47) =
50.94, p < .001. Post hoc comparisons
indicate that subjects felt that they had more
choice when non-self-aware (overall M = 
7.66) than when self-aware (overall M = 
5.41) in every one of the six feedback
conditions (all at least at p < .05).

Each subject was asked to make several 
additional ratings after the 6th and 12th 
trials (i.e., following the entire block of self- 
aware and non-self-aware conditions). One 
question asked to what extent subjects con- 
sidered their own standards of success in 
rewarding themselves over the previous 6 
trials. Results indicate that subjects consid- 
ered their own standards of success more in 
non-self-aware (M = 7.98) than in self-aware
(M = 6.13) conditions, t(47) = 5.64, p <
.001. A second question asked to what extent
subjects considered the way they might
appear to others in rewarding themselves over
the previous six trials. Here, subjects
considered the way they might appear to others
more in self-aware (M = 7.77) than in
non-self-aware (M = 5.25) conditions, t(47) =
1.63, p < .001. These data provide a check
on the intended manipulation of perspective 
and add further support to the general con- 
tention that subjects see themselves as so- 
cial objects and are more concerned with so- 
cial standards when they are made self-aware. 
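
The within-subject comparisons above (df = 47) are consistent with a paired-samples t test on 48 subjects, each rated once per self-awareness block. The following is only a minimal sketch of that computation, not the authors' analysis: the function name `paired_t` and the three pairs of ratings are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def paired_t(cond_a, cond_b):
    """Return (t, df) for a paired-samples t test on two equal-length lists."""
    if len(cond_a) != len(cond_b) or len(cond_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    # Each subject contributes one difference score (condition A minus B).
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    # Standard error of the mean difference (sample SD with n - 1 in the denominator).
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1


# Hypothetical ratings for three subjects (illustration only).
non_self_aware = [8, 9, 7]
self_aware = [7, 7, 4]
t, df = paired_t(non_self_aware, self_aware)
print(round(t, 3), df)  # prints: 3.464 2
```

With 48 subjects the same function would return df = 47, which matches the degrees of freedom reported for these ratings.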


Discussion 
Summary of Findings 


The major finding of the present study
was that subjects relied more on their
personal standards of accuracy for evaluating
their performance when they were
non-self-aware, whereas they relied more on
the social standard of accuracy when they were
self-aware. Even though the social group from
whom the public standard was derived was
not present, self-awareness increased reliance
on its standard. Several different measures
all pointed to the fact that self-aware
subjects were more concerned with their social
selves and felt more pressure to comply with
the social standard. The self-reinforcement
and satisfaction ratings proved to be totally 
consistent in demonstrating the self-aware 
person’s heavy reliance on social standards. 
In addition, self-reinforcement was very simi- 
lar across self-aware conditions in those 
situations in which subjects surpassed both 
standards or neither standard. This finding 
suggests that self-awareness has no effect
when both personal and social standards are
highly salient and are consistent with each
other.

The increased importance of social stan- 
dards to the self-aware subjects was clearly 
confirmed in every one of the four conditions 
in which subjects received feedback that they 
had surpassed one or another of the stan- 
dards. In addition, whether subjects received 
feedback on only one or two standards was 
unimportant; the social standard produced 
the larger impact on self-aware participants in
all conditions. It is important to note that 
since this was an entirely within-subjects 
design, these differences represent a dramatic 
shift in the relative reliance on one standard 
or the other as a function of the self-aware 
conditions.

The responses to questions asked at the 
end of the self-aware and non-self-aware pe- 
riods confirmed the greater weighting of 
social standards by self-aware subjects. When 
asked how concerned they were with their 
personal standards, non-self-aware subjects 
reported being significantly more concerned 
than objectively self-aware subjects. How- 
ever, when asked about their concern for 
how they would appear to others, objectively 
self-aware subjects were the most concerned. 
Finally, self-aware subjects reported feeling 
less freedom in choosing their level of self- 
reward, indicating that they felt less self- 
directed and more constrained by the situa- 
tion. Thus, multiple measures across a variety 
of conditions all demonstrated the objec- 
tively self-aware person’s greater reliance on 
social standards. These findings clearly dem- 
onstrate that when standards are manipu- 
lated in a social situation and attention to 
the standards is held constant, self-aware 
persons will heavily weight the social
standard. It would appear that self-aware
persons take on a social perspective of
themselves in a social situation and judge their
performance from this external vantage point.

A close examination of the self-reinforce- 
ment and satisfaction data reveals an in- 
teresting finding in addition to the clear 
shift noted above. A number of comparisons 
all indicate that subjects in one condition or 
the other weighted a certain standard more 
heavily, but did not ignore the other stan- 
dard. That is, all subjects took both standards 
into account, but simply shifted the relative 
weight placed on them. When one compares 
the cells in which only one standard was given 
with the comparable surpassed-one-but-not- 
the-other cells, it can be seen that the former 
are higher for every comparison. This indi- 
cates that the standard that was given less 
weight nevertheless had a definite effect. The 
fact that the two surpass-both-standard cells 
have higher values than all others also indi- 
cates that one standard was never ignored 
completely. Thus, self-awareness shifted the
weight given the two types of standards, but 
both standards were always considered. 

It is informative to examine more closely 
the data concerning subjective feelings of 
choice. In the non-self-aware conditions, sub- 
jects felt that they had free choice in deter- 
mining their reward even when they sur- 
passed neither standard. Thus, even when 
they gave themselves little reward, they felt 
that they had a great deal of choice. When 
self-aware, however, subjects felt much less 
freedom when they surpassed neither stan- 
dard. Even more impressive is the fact that 
self-aware subjects felt that they had less 
choice in choosing their reward when they 
surpassed both standards than they did in 
any of the non-self-aware conditions. In fact, 
the self-aware participants’ feelings of choice 
were highly affected by their performance, 
whereas this was much less true for non-self- 
aware subjects. Thus, self-awareness appears 
not only to decrease feelings of free choice 
overall but also tends to produce a greater 
feeling of constraint on the person who has 
performed poorly. When a person is non-self- 
aware, he or she appears to feel free in re- 
warding and punishing him- or herself, sug- 
gesting a feeling of internal self-direction. 

One interesting finding of the present study
was that self-awareness had no impact on
self-reinforcement or feelings of satisfaction
when subjects surpassed both standards or
failed by both standards. This indicates that,
at least when standards are clear and explicit,
self-aware subjects do not judge themselves
more harshly when wrong or reward
themselves more liberally when successful. If
attention to standards were allowed to vary,
however, it is possible that the self-aware
person would be more aware of
behavior-standard discrepancies and thus
reinforce himself or herself differently.
Indeed, this is what is typically found in the
self-awareness literature (Wicklund, 1975).
However, the present data indicate that when
attention to standards remains high,
self-aware subjects do not reinforce
themselves differently across all types of
standards, but simply rely more heavily on the
social standard.


Psychological Perspective and Attention to Standards


As noted earlier, one recent line of evidence
in the objective self-awareness literature
suggests that there are two types of trait
self-awareness, private and public (Fenigstein
et al., 1975). Private self-awareness is said to
be a focus of attention on the covert, inner
aspects of oneself, whereas public
self-awareness is what is commonly thought of
as “self-consciousness,” an awareness of the
self as a social object. A person who is
privately self-aware should experience a
heightened awareness of his or her own beliefs
and feelings, whereas a publicly self-aware
person should be more aware of his or her
social appearance and the impression he or she
makes on others. This distinction in the
conceptualization of dispositional
self-awareness suggests a parallel two-factor
approach to the underlying processes involved
in state self-awareness, one that emphasizes
both perspective and attention. This
distinction raises several interesting
possibilities that are discussed below.

There are several possible explanations for
the apparent divergence between the present
findings and past results showing greater
adherence to personal standards among
self-aware subjects. A strong possibility is that
personal standards are often adhered to more
closely by self-aware subjects because of their
close perceived correspondence with social
norms (cf. Ross et al., 1977). Another
possibility is that self-aware subjects are more aware
of both social and personal standards than are 
non-self-aware subjects, but where the two 
are clearly discrepant, they weight social stan- 
dards more heavily than usual because of 
their social perspective of themselves. 

One explanation for the absence of in- 
creased adherence to personal standards in the 
present study concerns the difference pointed 
out earlier between attention and perspective. 
In most studies, self-awareness is
manipulated, and attention is free to vary,
focusing either on aspects of the self or on the
environment. Thus, if the typical
self-awareness paradigm had been employed,
attention would have been free to focus on the
external environment or on social or private
standards. In such a case, the person might
attend more to both personal and public
standards under self-aware conditions and thus
be influenced more by both of them. This
interpretation is supported by the findings of
Rule et al. (1975), who found that self-aware
subjects complied with their own beliefs more
when in a situation where no standards had
been made explicit.

The procedure used in the present study,
however, presented standards in such a way
that they were explicit and easily comparable.
Thus, attention to the two types of standards
was likely to be high in both self-aware and
non-self-aware conditions. However, the
viewpoint or perspective of the subject was
likely to change across the two conditions. The
self-aware person was likely to see him- or
herself as a social object, and this apparently
led subjects to weight the social standard much
more heavily. In the non-self-aware condition,
subjects were also likely to be equally aware
of both types of standards, but here they gave
greater weight to the personal standard,
apparently because of their egocentric
perspective.


This analysis suggests an important
distinction between the attention given to
different types of standards and psychological
perspective in the analysis of self-awareness
effects. In order to determine the importance of this
distinction, future research will need to ex- 
amine adherence to the two types of standards 
when they vary in explicitness, as a function 
of self-awareness. Since attention to various 
standards may vary in many situations, a 
theory explaining which standards will be 
most salient (and hence most magnified in im- 
portance by self-awareness) when standards 
are not highly explicit is needed. Wicklund (in
press) has offered the concept of “end points”
(roughly analogous to goals) to explain which 
standards the self-aware person is likely to 
attend to, and this approach should be vigor- 
ously explored. 


Possible Limitations 


One factor may limit the generality of the 
present findings and suggests the need for 
additional study. It is possible that the present 
results are due to the fact that subjects did 
not have a firm basis for their self-standard 
and thus quickly abandoned it in favor of the 
social standard when made self-aware. If the 
person had a strong relevant personal stan- 
dard, self-awareness might serve to enhance 
its influence. 

Another potential limitation of the present 
findings is that they may be applicable only 
to public self-awareness (also referred to as 
self-consciousness). That is, it is possible that 
the self-awareness manipulation combined 
with the interaction with the two experi- 
menters induced public self-awareness. Pre- 
liminary analyses of a subsequent study in 
which the self-reinforcement situation was 
less social in nature indicate that the present 
pattern of findings may not emerge in a non- 
social situation. Hence, the findings of in- 
creased reliance on social standards may be 
most applicable to public self-awareness or 
self-awareness produced in a social situation. 
Indeed, the process underlying public self- 
awareness may be the psychological perspec- 
tive effect varied in the present design. It is 
interesting to speculate that the effects of 
private self-awareness depend on attentional 
differences that were intentionally held con- 
stant in the present study. 


Future Research


Although it is clear that self-aware sub- 
jects relied more upon the social standard, it 
is also apparent that non-self-aware subjects 
relied more upon their personal standard. It 
seems likely that subjects approached the sit- 
uation with an initial bias toward weighting 
their personal standard more heavily, possibly 
because of cultural values related to inde- 
pendence and self-reliance. One possible ex- 
planation for the reliance on personal stan- 
dards in the non-self-aware condition is that 
the non-self-aware person may have an ego- 
centric orientation that favors personal goals 
and standards when he or she is aware of 
them. This possibility deserves further
research. The different effects of self-awareness
on adherence to performance standards (e.g., 
those oriented toward reaching a specific goal) 
and norms (e.g., those used to judge moral or 
value-related behavior) should also be ex- 
plored. 

Finally, the distinction drawn earlier be- 
tween the person’s perspective or viewpoint 
and the focus of attention in the self-aware 
state also provides an intriguing avenue for 
future investigation. It will be important to 
examine whether self-aware persons are 
equally aware of both social and personal 
standards. If so, it will also be important to 
determine the precise mechanism through 
which one type of standard is given more 
weight than the other. It is possible that 
public self-awareness predisposes one to take 
the view of others and thus to emphasize so- 
cial standards. However, private self-aware- 
ness may focus one’s attention on internal 
factors and thus increase adherence to per- 
sonal standards. This suggests that the public 
and private individual difference dimensions 
may rest on two psychological processes— 
focusing attention on aspects of the self and 
taking a social perspective of oneself—that 
together comprise what has become known as 
“self-awareness.” Moreover, it may be pos- 
sible to vary these two factors independently. 
Thus, future research should explore the
relation of perspective and attention to public
and private self-awareness. One important di- 
rection for future theoretical development will 
be to discover the effects that different types 
of self-awareness have on adherence to
personal and social standards.


Conclusions 


The present experiment indicates that
self-aware persons may rely more heavily on
social standards when judging their performance
than non-self-aware persons do. However, the
present findings may be limited to variations
in social self-awareness. Results also indicated
that self-aware subjects felt less freedom in
choosing the level of reinforcement for their
behavior, suggesting that they felt more
constrained by a social perspective of themselves.
Focused attention to various aspects of oneself
and taking a social perspective of oneself
were postulated to be two distinct components
of self-awareness. It is suggested that these
two components parallel the public and private
individual difference dimensions of
self-consciousness reported by Fenigstein et al. (1975).


Inferior Performance as a Selective Response to Expectancy:
Taking a Dive to Make a Point


Roy F. Baumeister, Joel Cooper, and Bryan A. Skib
Princeton University


An experiment investigated factors that determine whether persons conform to 
what is expected of them or not. Female subjects took a bogus personality test 
that established either publicly or privately that they had a particular (fic- 
tional) trait, which was presented either as a very desirable or a very undesir- 
able trait. Subjects then attempted an anagram-solving task after being informed 
that persons with this trait usually perform poorly on such tasks. Subjects for 
whom the expectancy was presented as derived from the desirable trait, when 
it was publicly established that they had this trait, solved significantly fewer 
anagrams than subjects in the other three conditions, implying that self-presen- 
tational concerns are important in determining the extent to which an expect- 
ancy influences behavior. Moreover, analyses of the differences between the 
number of anagram solutions subjects reported during the experiment and the 
number of actually correct solutions suggested that when the expectancy was
linked to an undesirable trait, subjects actively sought to disconfirm it.


Past research on expectancies has explored 
whether expectancies increase the likelihood 
of the expected behavior (e.g., Rosenthal & 
Fode, 1963). In particular, studies dealing 
with task performance have sought to de- 
termine whether persons will tend to adjust 
their performance levels to agree with what- 
ever level they have been led to expect. The 
main emphasis of this research has been to 
determine whether the subject’s cognitions of 
task and expectancy seem to generate some 
internal processes (such as drives toward 
cognitive consistency) that induce him or her 
to perform at the expected level. Thus, Aron- 
son and Carlsmith (1962) proposed that if 
one expects poor performance, one will avoid 
good performance because the latter would 
cause “cognitive dissonance” (Festinger, 
1957). Subsequent research, however, has not 
consistently found this “rejection of success” 
nor established what cognitions or situational 
factors produce it (Brock, Edelman, Edwards,
& Schuck, 1965; Cottrell, 1965; Lowin &
Epstein, 1965; Marecek & Mettee, 1972;
Mettee, 1971; Ward & Sandvold, 1963).

Requests for reprints should be sent to Joel Cooper,
Department of Psychology, Princeton University,
Princeton, New Jersey 08540.

Archibald (1974) reviewed the literature
dealing with such “self-fulfilling prophecies,”
that is, with events that tend to occur
because someone expects them. Archibald raised
several unresolved questions regarding the
nature of such expectancy effects and closed
with a vigorous call for further research.
Specifically, he proposed several possible internal
cognitive dynamics (such as dissonance,
anxiety, and defensiveness) that might account
for the self-fulfilling prophecy effects.

The present investigation was motivated
by our contention that the effects of an
expectancy on behavior are not necessarily or
exclusively a product of the internal dynamics
of the individual, especially when the
expectancy originates (as it usually does)
outside the individual. Rather, the desire to
convey a particular impression of oneself to
others in the situation may in some cases be
a more important determinant of the
individual’s response to an expectancy than the
individual’s private drives toward maintaining
a certain self-concept or achieving cognitive
equilibrium (Tedeschi, Schlenker, &
Bonoma, 1971).


The Public Context: Self-Presentation 


It is generally assumed that persons try to 
maintain favorable evaluations of themselves 
and that one basic criterion of self-esteem is 
how successfully a person feels he or she has 
managed to live up to an “ideal self” (e.g., 
Cohen, 1959). Separate from one’s desire to 
be the ideal self, however, is a motivation to 
convince others that one’s personality re- 
sembles one’s ideal self. Schlenker (1975) 
has provided evidence that the motivation to 
present a highly favorable image of oneself 
is not dependent on having such a favorable 
self-image. Our point is that behavior in
response to expectancies furnished by others¹
is more likely to be motivated by the latter 
(communicative or self-presentational) inter- 
est than by internal goals of protecting one’s 
self-esteem or achieving cognitive balance. 

Our investigation therefore emphasized the 
effect of the context of an expectancy rather 
than a self-fulfilling effect of the expectancy 
itself. Depending on the context, the same
expectancy could conceivably elicit different
behaviors. This brings up the distinction
between the content and the basis of an
expectancy. The content is what the target
person is expected to do. The basis is whatever
attributional reasoning led to the formation
of the expectancy. For example, if a man is
believed to be a lazy person, one might expect
him to spend his spare time watching
television rather than playing tennis. The
content of this expectancy is the behavioral
preference for watching television; the basis is
his laziness.

It is plausible that the target’s response to
an expectancy will depend not only on how
he evaluates the expected behavior but also
on how he feels about the supposedly
underlying trait that is attributed to him. One
reason for a target person’s sensitivity to the
basis for an expectancy may lie in the
significance of the target’s behavior. The target
person must take into account that for him to
conform to the behavioral expectancy will
confirm the observer’s attribution to him of the
supposedly underlying trait. Even if the target
person believes there is no relation between
the trait and the behavior, even if he would 
deny that he has that trait, he must acknowl- 
edge that the observer will interpret the ex- 
pected behavior in relation to the attributed 
trait. After all, if the target person (for what- 
ever reason) behaves consistently with the 
observer’s expectancy, the observer will in 
all probability feel simply that the hypoth- 
esis has been confirmed. 


Predictions 


The present experiment was designed to in- 
vestigate whether differences in responses to 
a given expectancy as a function of its con- 
text are due to public, self-presentational con- 
cerns or to private, internal motivations. Our 
subjects were informed that they were ex- 
pected to perform poorly based on a trait 
they (supposedly) had, which was presented 
either as a desirable or an undesirable trait. 
Half our subjects were led to believe that 
they alone knew that they had the underly- 
ing trait and were therefore expected to do 
poorly; the remainder of the subjects were
identified publicly as having the (good or 
bad) trait that predicted poor performance. 
We predicted that the expectancy would suc- 
cessfully elicit poor performance only when 
the expectancy was public and only when it 
was publicly linked to the desirable trait. 
However, a viewpoint based on the belief in 
the internal dynamics of the individual as the 
essential or exclusive mediators of expectancy 
effects would predict that the results would 
be the same in the public as in the private
treatment conditions.


Relation to Past Work 


Although previous research has largely
ignored the issue of how responses to
expectancies are determined by the public or
private nature of their contexts, various
findings can be interpreted as suggesting a
connection between self-presentational
concerns and the beliefs and expectancies of
others. For example, Zanna and Pack (1975)
found that women will present themselves
consistently with the sex role attitudes of an
attractive male. Similarly, Deutsch and Gerard
(1955) demonstrated that conformity to an
obviously false majority judgment is greatly
reduced if subjects express their judgments
anonymously.

¹ We are not dealing with expectancies that
originate within the actor. Such expectancies may
well operate independently of public or
self-presentational interests.

A study by Sigall, Aronson, and Van Hoose 
(1970) is relevant to the present discussion.
Their subjects copied lists of telephone
numbers after being told that they were expected
to perform at a level either below or above 
that of the subjects’ own “practice round” 
level. In a third condition, subjects were given 
the expectancy of performance lower than 
their own and told that subjects who per- 
formed well on this task had obsessive-com- 
pulsive personalities. Sigall et al. found that 
this third condition produced poorer per- 
formance (i.e., fewer numbers copied) than 
a no-expectancy control condition, whereas 
the first two conditions elicited superior per- 
formance. 

However, Sigall et al. were primarily in- 
terested in a methodological issue, and their 
results (although suggestive) do not provide 
conclusive evidence about reactions to ex- 
pectancies. In the first place, subjects per- 
formed better than controls after an expected 
level of performance was communicated to 
them. But it made no difference whether this 
expected performance level was set above or 
below the subject’s pretest level. Both pro- 
duced improved performance. Sigall et al. 
suggest the possibility that subjects did not 
perceive the expectancy in relation to their 
pretest performances. Therefore, they may 
not have known whether an increase or a de- 
crease was expected. 

In fact, Sigall et al. (1970) reported a sec- 
ond experiment in which the expectancy ma- 
nipulation was omitted altogether, and im- 
proved performance was found as a result of 
simply telling the subject more about the 
study (and informing the subject that a hy- 
pothesis existed, without telling the subject 
what it might be). These results therefore 
seem to imply only that experimental subjects
work harder if the experimenter makes a
greater effort to explain the purpose and
function of the research than if the experimenter
simply tells them what to do. Although this
is an interesting finding, it does not have a
direct bearing on how persons are affected by
what others expect them to do.

The results of the obsessive-compulsive
treatment also can be understood without
reference to the expectancy about the subject’s
performance level. Subjects were told that
only obsessive-compulsive personalities
performed well on that task. The task was
presented as the only indication of whether
the subject was an obsessive-compulsive.
Rather than presenting the subject with an
expectancy based on an already made
inference about his or her personality, the subject
was presented with this task as a means of
determining whether the subject was an
obsessive-compulsive or not.

Thus, although the procedures and results
of Sigall et al. raise a number of issues
germane to the present investigation, they
cannot be considered to provide clear or
definitive answers to these questions. Of course,
the most important difference between the
procedures of Sigall et al. and ours is that
their work provides no indication as to
whether any such motivation toward inferior
performance derives essentially from a
desire of the individual to protect self-esteem
or from self-presentational interests. The
present investigation was directly concerned with
this issue.


Method 
Subjects 


Forty-one undergraduate women volunteered for a
study on “personality and behavior.” Data from one
subject were discarded because she undermined a
manipulation by announcing her pretest results
aloud when they were supposed to be known only
to her.

Subjects participated in the study individually,
although a male confederate, who pretended to be
another subject, was present for all sessions.
Subjects were offered and paid $2.50 for their
participation. They were assigned at random among
the treatment conditions.

Female subjects were used exclusively because of
availability and to ensure a reasonably homogeneous 
sample. The cover story and operations were adapted 
so as to be suitable for an exclusively female sample. 
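
Random assignment of the 40 retained subjects (41 volunteers minus the one discarded) to the two crossed factors, trait desirability and public versus private knowledge, can be illustrated as follows. This is a hedged sketch only: the text does not state how randomization was carried out or that cell sizes were equal, so the balanced round-robin scheme, the function name `assign_balanced`, and the subject numbering are all assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import product

# The two manipulated factors named in the text, crossed into four cells.
CONDITIONS = list(product(("good", "bad"), ("public", "private")))


def assign_balanced(subject_ids, seed=None):
    """Shuffle subjects, then deal them round-robin into the four cells.

    Equal cell sizes are an assumption of this sketch, not a reported fact.
    """
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)
    return {sid: CONDITIONS[i % len(CONDITIONS)] for i, sid in enumerate(ids)}


# Hypothetical subject numbers 1..40.
assignment = assign_balanced(range(1, 41), seed=0)
cell_sizes = Counter(assignment.values())
print(cell_sizes)
```

Dealing shuffled subjects round-robin keeps the order random while guaranteeing 10 subjects per cell under the equal-cell assumption; simple independent randomization per subject would not guarantee balance.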


Procedure 


Pretest. When the subject arrived at the
experimental room, the male confederate was not yet
present. The female experimenter² said that the first
procedure involved a personality inventory that was
designed for the female subjects. Since the other
participant scheduled for that session was male, the
present subject might as well begin the personality 
measure without waiting for him to arrive. 

The pretest was in fact bogus. Its only function 
was to provide a vehicle for convincing the subject 
that she scored high on the bogus personality trait 
from which the behavioral expectancy would later 
be derived. 

In order to facilitate the confidentiality of the 
subject’s score and in order to avoid suspicion con- 
cerning the results of the test, subjects scored their 
own tests. The tests were structured so that all 
subjects would score high on the trait in question. 

The male confederate waited until the female sub- 
ject was nearly finished taking her test. He entered 
with a confused apology for being late. The experi- 
menter instructed him to sit across a large table 
from the subject. A large barrier precluded visual 
contact. 

Manipulation of trait desirability. The subject 
heard one of two tape-recorded briefings. Both
began by informing subjects that the preceding
personality test was a measure of the trait of “surgency.”
Our use of the term surgency followed Ickes,
Wicklund, and Ferris (1973), who suggested that the
term is well suited to denote a bogus personality trait
because it sounds like authentic psychological jargon
but does not have a meaning that is familiar to
most subjects.

Both initial briefings stated that it was expected
that many Princeton women would be high in
surgency because of their unusual developmental
backgrounds. The importance of the research on
surgency was stressed. Moreover, the recorded voice
said that even if the subject did not score high on
surgency, her participation was valuable in order to
provide a comparison with the high-surgency group.
In addition, a male control group was to be included.

There followed the explication of surgency. In the
“good” condition, the recording portrayed
high-surgency women as unusually mature and
well-adjusted, sensitive, intelligent, and insightful. A
speculative developmental scenario suggested that the
introduction of some aspects of traditionally male
socialization resulted in an optimal “best of both
worlds” personality development in the
high-surgency woman.

In the “bad” condition, high-surgency women were
portrayed as immature and ill-adjusted, shallow,
insensitive, defensive, and insecure. The implication
of the developmental argument was that exposing
the young female to some aspects of traditionally
male socialization made her insecure about her
femininity.

Manipulation of public/private. At this point, the 
subject was asked to score her own answer sheet for 
surgency. She was given a scoring template and a 
red pen and instructed to mark it. In the “public” 
condition, the experimenter asked the subject aloud 
what her score was, stated publicly, “That is above 
13, so you are ‘high’ in surgency,” and made a show 
of recording the subject’s name, score, and “high” 
in a data record book. Although the confederate made 
no comment, it was obvious that he heard all the 
proceedings. The experimenter then collected the sub- 
ject’s answer sheet, after making certain that the 
subject had written her name on it. 

In the “private” condition, the subject was in- 
structed not to write her name on the answer sheet 
(or on any of the experimental materials). She was
assigned an arbitrary three-digit “subject number” 
by which her data would later be assembled. When it 
was time for her to score her answer sheet, the experi- 
menter gave her the template and explained how to 
score it and how to determine if she was in the 
high-surgency group. The experimenter then left the 
subject alone to do the scoring and recording, after 
saying that she (the experimenter) was not sup-
posed to know the subject’s score. When the subject
was finished, she folded her answer sheet and stuffed 
it into a box full of other answer sheets, closed the 
data book, and returned the materials to the ex-
perimenter.

The criterion “13 high/12 low” was written on the 
blackboard and was thus apparent to all subjects, al- 
though until they scored the tests it presumably had 
no significance for them. 

Anagram task. The tape-recorded instructions
announced that the next procedure would involve 
solving anagrams. A brief statement of the rationale 
for including this measure in the experiment con- 
tained the critical prediction that high-surgency 
women were expected to perform poorly on the 
anagram task. 

In the “good” condition, it was explained that 
anagram solving is a simple, mechanistic cognitive 
task and that because the talents of the high-sur- 
gency woman are geared toward tasks involving 
meaning, creative tasks, and tasks involving insight 
into and understanding of other persons, she would 
not perform well on exercises like anagram solving. 
In the “bad” condition, it was asserted that the de- 
fensive vigilance of the high-surgency woman makes 
her reluctant to immerse herself in a simple, mecha- 
nistic cognitive task, because she prefers to keep her 
attention on interpersonal matters “where the source 
of her insecurities lies.” 


2 The authors wish to thank Elaine Seligman for
her very capable performance as the experimenter. 
Table 1
Mean Number of Anagrams Reported and Actually Solved

                      Trait
Condition         Bad      Good

Private
  Reported        21.5     23.2
  Actual          20.3     22.8
Public
  Reported        22.7     14.4
  Actual          21.8     14.3

Note. n = 10 per cell; maximum possible number of
anagrams = 40.


Following the prediction of poor performance 
among high-surgency women, the tape recording 
said that performance among low-surgency women 
was “considerably less uniform.” Low-surgency 
women might therefore do well or poorly. Specific 
instructions for the task were then given: It would 
involve two 8-minute sessions, each with a list of 
20 anagrams; all sets of letters could be unscrambled 
to make English words (this was true); the subject 
should solve as many as possible in the allotted time.

Anagram performance was to be public for all 
subjects. The initial instructions said that after each 
list, the experimenter would ask each subject how 
many anagrams he or she had solved. Following the 
first list, the experimenter asked first the subject and 
then the confederate how many anagrams (out of 
20) they had solved. The confederate invariably re- 
ported the same number as the subject had. Follow- 
ing the second list, this was repeated, except that 
the confederate reported having solved two fewer 
than the subject. The number of anagrams solved by 
the subject was the main dependent measure. 

Finally, a postexperimental questionnaire con- 
tained manipulation checks and solicited reactions 
to or feelings about the procedures. A careful de- 
briefing followed in which the nature and necessity 
of all deceptions were fully explained. 


Results 
Manipulation Checks 


An item on the postexperimental question- 
naire asked subjects to “Indicate how you 
felt about the results of the Personality test.” 
Four subjects left this item blank. Subjects in 
the desirable trait condition felt much better 
about having high surgency than did subjects 
for whom surgency was presented as an un- 
desirable trait, F(1, 32) = 6.84, p<.02. 
Neither the effect for the public/private vari-
able (F < 1) nor the interaction (F = 1.66)
approached significance. A second item asked
whether the other subject (i.e., the confed-
erate) had been aware of the results of the
subject’s personality test. The appropriate
answer would of course be “yes” for the
public conditions and “no” for the private
conditions. All but three subjects answered
this item correctly. We did not discard the
data from these subjects. However, omission
of their data would not reduce the significance
levels of the analyses.


Anagram Solutions 


The experimenter recorded how many ana-
grams the subjects reported having solved.
Another experimenter later scored all the
problem sheets for the actual number of cor-
rect solutions. Because there were occasional
discrepancies between the number of anagrams
actually solved and the number reported
aloud, separate analyses were performed of
these two measures.

Mean anagram scores for both reported and
actual solutions are presented in Table 1.
Both analyses of variance revealed significant
interactions between the two independent
variables: for reported scores, F(1, 36) =
8.00, p < .01; for actual scores, F(1, 36) =
7.86, p < .01. In addition, the main effect for
public-private was significant, F(1, 36) =
4.62, p < .05, for the reported scores, but not
for the actual scores. It is clear from Table 1,
however, that the significant effects are at-
tributable to the disparity between the per-
formance of the good trait/public group and
the other three conditions. Using Tukey’s
honestly significant difference (HSD;
Kirk, 1968) technique, the good trait/public
cell mean was found to be significantly lower
than each of the other means for the reported
scores (p < .05) and lower than each of the
others except the bad trait/private cell mean
for the actual scores. The difference between
the good trait/public and the bad trait/pri-
vate cell means for the actual scores is,
however, significant (p < .05) according to
Fisher’s LSD (least significant difference)
technique. No other pairwise difference be-
tween means was significant. It seems, then,
that the good trait/public treatment was
much more effective than the others in elicit-
ing behavior consistent with the expectancy.
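
The pattern behind the interaction can be illustrated directly from the cell means in Table 1. The following Python sketch is not part of the original analysis; the variable and function names are illustrative assumptions. It uses only the published means for actual solutions, so it shows the direction and size of the contrast but cannot reproduce the F tests, which require the within-cell variances.

```python
# Cell means for actual anagram solutions, copied from Table 1.
# Keys are (public/private condition, trait desirability).
actual = {
    ("private", "bad"): 20.3,
    ("private", "good"): 22.8,
    ("public", "bad"): 21.8,
    ("public", "good"): 14.3,
}


def good_minus_bad(means, setting):
    """Simple effect of trait desirability within one public/private setting."""
    return round(means[(setting, "good")] - means[(setting, "bad")], 1)


private_effect = good_minus_bad(actual, "private")
public_effect = good_minus_bad(actual, "public")

# Interaction contrast: how much the good-minus-bad difference changes
# between the private and the public conditions.
interaction_contrast = round(private_effect - public_effect, 1)

# Gap between the good trait/public cell and the mean of the other three cells.
others = [v for cell, v in actual.items() if cell != ("public", "good")]
gap = round(sum(others) / len(others) - actual[("public", "good")], 1)

print(private_effect, public_effect, interaction_contrast, gap)
```

The good-minus-bad difference is small and positive in private but large and negative in public, which is the sense in which the good trait/public cell carries the interaction.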


Discrepancy Between Reported Score and 
Actual Score 


Some subjects reported scores that were 
different from the actual number of anagrams 
they had solved correctly. When this differ- 
ence occurred, the reported scores were in- 
variably greater than the actual correct scores, 
implying that these were not simply random, 
honest mistakes. We shall use the term en- 
hancing to refer to a subject’s reporting a 
score greater than the correct one. A measure 
of enhancing was obtained by subtracting the
actual number of anagrams each subject 
solved correctly from the number she re- 
ported during the experiment. Analysis of 
variance on these enhancing scores revealed 
only one significant effect, and that was the 
main effect for the desirability of the trait, 
F(1, 36) = 6.44, p< .02. Subjects in the 
bad trait conditions enhanced significantly
more than subjects in the good trait condi-
tions.
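
As a rough check on the direction of this effect, mean enhancing per cell can be recovered from Table 1, because with equal cell sizes the mean of the difference scores equals the difference of the cell means. The Python sketch below is illustrative only: the names are assumptions, and the values inherit the rounding of the published means.

```python
# Cell means copied from Table 1; keys are (condition, trait).
reported = {
    ("private", "bad"): 21.5,
    ("private", "good"): 23.2,
    ("public", "bad"): 22.7,
    ("public", "good"): 14.4,
}
actual = {
    ("private", "bad"): 20.3,
    ("private", "good"): 22.8,
    ("public", "bad"): 21.8,
    ("public", "good"): 14.3,
}

# Enhancing = reported minus actual, as defined in the text.
enhancing = {cell: round(reported[cell] - actual[cell], 1) for cell in reported}


def trait_mean(trait):
    """Average enhancing over the public and private cells for one trait level."""
    values = [v for (_, t), v in enhancing.items() if t == trait]
    return round(sum(values) / len(values), 2)


bad_mean = trait_mean("bad")
good_mean = trait_mean("good")
print(enhancing, bad_mean, good_mean)
```

On these figures, enhancing averages about one anagram in the bad trait cells and about a quarter of an anagram in the good trait cells.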

That this difference was a general trend 
and not due to a few isolated individuals can 
be verified by considering the number of sub-
jects who enhanced. Only 2 of the 20 sub-
jects in the good trait conditions enhanced. In
the bad trait conditions, however, 12 of the
20 subjects enhanced. The chi-square analy-
sis of this difference is highly significant,
χ²(1) = 10.99, p < .001.
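
The chi-square value can be recomputed from the counts given above. The sketch below assumes the Pearson statistic for a 2 × 2 table without a continuity correction; the text does not state which form was used, but this form agrees with the reported value. The function name is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: good trait, bad trait. Columns: enhanced, did not enhance.
chi2 = chi_square_2x2(2, 18, 12, 8)
print(round(chi2, 2))
```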


Difference Between the Two Anagram Lists 


Subjects worked two lists (labeled A and B)
of anagrams. All subjects did List A first.
The difference between the performance levels 
on the two lists was not influenced systemati- 
cally by the independent variables. 

However, the patterns of enhancing on the
two lists did vary among the different condi-
tions. For each subject, the difference between
the enhancing on the first (A) test and on the
second (B) test was obtained by subtracting
the enhancing score (reported minus actual
anagram solutions) on the B list from that on
the A list. Analysis of variance on these dif-
ferences showed only one significant effect: an
interaction between the public-private and 
the good-bad trait variables, F(1, 36) = 4.53, 
p < .05. Because enhancing was negligible in 
the good trait conditions, the interaction 
arises chiefly from the bad trait cells. The 
bad trait/private condition subjects enhanced 
more on the A list than on the B list, whereas 
the bad trait/public condition subjects en- 
hanced more on the B list than on the A list. 
The difference between the bad trait/private 
and the bad trait/public means on this mea-
sure is significant (p < .05) according to
Tukey’s HSD technique. 


Final Questionnaire 


Several questions on the postexperimental 
questionnaire asked subjects to rate on 5- 
point scales how hard they had tried to do 
well on the anagram task and how hard they 
had tried to conform to the level of perform- 
ance that was expected of them. No significant 
differences were found on these measures. 


Discussion 


Experimental subjects typically perform 
tasks to the best of their ability, motivated
presumably by implicit demand character- 
istics, habit, and concern with self-esteem and 
public esteem. Our subjects systematically 
violated this practice and performed at a level 
significantly below that of their capabilities 
when they were publicly expected to do 
poorly and this expectation reflected a de- 
sirable quality in their personalities. 

There is no evidence from our self-report 
(or other) data of any conscious intention to 
achieve inferior performance. Still, it seems 
apparent that a person is at some level sensi- 
tive to how others will interpret her behavior 
and that she will modify her performance 
level in order to convey the desired message. 
Because the inferior performance levels were 
found when the expectation about the subject 
was public but not when it was private, it is 
clear that the public nature of the expectation 
was indispensable in eliciting conformity. 
We interpret this result as meaning that 
modification of performance level was due to
essentially communicative interests. It has 
been shown previously that the presence or 
orientation of an audience can raise concerns 
that may interfere with performance (Brown, 
1968, 1970; Brown & Garland, 1971; Cot- 
trell, Wack, Sekerak, & Rittle, 1968; Deutsch 
& Gerard, 1955; Henchy & Glass, 1968). In 
the present study, the performance was re- 
duced to a mere tool for the communication 
of information that had no analytical rela- 
tion to the task itself. We submit that the 
concerns raised by the audience expectancies 
in the present study were not merely distrac- 
tions that interfered by competing with the 
task for the subject’s attention (for such 
concerns would have been equally if not more 
potent in the public expectancy/bad trait 
condition) but actually preempted the de- 
sire to do well. 


Illicit Enhancement 


Illicit performance enhancement did seem 
to reflect some internal motivation, in that it 
occurred in both the public and the private 
versions of the situation in which the ex- 
pectancy of poor performance was linked to 
an undesirable (bad) trait. When poor per- 
formance was expected on the basis of an 
undesirable trait, our subjects were eager to 
do well. Although they performed no better 
than subjects in the private expectancy/good 
trait condition, they resorted more to illicit 
techniques such as reporting nonwords as 
solutions and simply miscalculating their 
score in their favor. This suggests that they 
were strongly motivated to achieve high scores, 
possibly in order to discredit the attribution 
of the bad trait to them by disconfirming the 
prediction based on it. 

An alternative interpretation of this finding
can be derived from the work of Aronson and 
Mettee (1968). These authors found that 
subjects were more willing to engage in dis- 
honest behavior after receiving an unfavor- 
able evaluation than after receiving a favor- 
able one. By cheating at a game, subjects 
could demonstrate that they apparently had 
extrasensory perception (ESP), a possibly 
very desirable trait that had not been men-
tioned in the unfavorable evaluation. The
cheating of these subjects may therefore have
represented a self-enhancement designed to
compensate for the bad evaluation (Bau-
meister & Jones, 1978). It is possible that if
one knows that one is evaluated unfavorably
by others, this knowledge effectively reduces
the inhibitions against using illicit means in
order to look good. The illicit performance
enhancement of our subjects following the
bad trait treatment can be seen as an instance
of this possibility.

Moreover, there was a difference in the pat-
terns of illicit performance enhancement be-
tween the public and the private versions of
the condition in which the expectancy was
based on an undesirable trait. When the ex-
pectancy was private, subjects resorted to il-
licit enhancement more on the first of the two
anagram tasks, but when it was public, they
did so more on the second test. The only dif-
ference in the situation between the two tests
was that after the first test, the confederate
reported a score equal to that of the subject.
Thus, if the subject enhanced
her performance on the first list and not on
the second list, she was apparently content
with performing just as well as the confed-
erate; whereas if the subject enhanced her
score more on the second test, she was ap-
parently not content with matching the con-
federate but wanted to do better than he.

We did not predict this finding, but we
venture a post hoc speculative interpretation.
In the private condition, the subject’s internal
desire to discredit the notion that she had
high surgency was satisfied by doing as well
as the confederate. In the public condition,
however, she had the additional concern of
discrediting the attribution (of high surgency
to her) in the eyes of the others present, and
she could not assume that they would be as
easily convinced as she herself might be. In
other words, she could not be certain that the
others would consider the expectancy of poor
performance disconfirmed if she merely per-
formed at the same level as the confederate;
therefore, after learning that her score on the
first test equaled his, she was tempted to re-
sort to illicit enhancement on the second test
in the hope of exceeding his score.


Subjects in the private expectancy/good 
trait condition showed no sign of trying either 
to raise or to lower their performance level 
from the level appropriate to their ability. 
There is no reason to infer that they endowed 
the task with any surplus meaning. That is, 
there is no evidence that they believed poor 
anagram performance would convince the au- 
dience that they had high surgency. In this 
respect they were unlike the subjects in the 
other three conditions. 


Origin of Expectancy 


A possible boundary condition of the present
findings is implied by the fact that the sub- 
ject herself provided the basis for the ex- 
pectancy by her own behavior (that is, the ex- 
pectancy was based on her test responses). 
The findings of Gurwitz and Topol (1978), 
although not based on behavioral measures, 
suggest a relation between whether a person
has given evidence in support of an expectancy 
and how the person will subsequently intend 
to behave vis-à-vis that expectancy. 
Because our subjects were women, our find-
ings might be considered in the context of 
literature that has suggested that women per- 
form below the level of their abilities because 
of a “fear of success” (Horner, 1972). Our 
study was not designed to address this issue. 
Still, it might be noted that in our sample, 
any motivation to perform below the level of 
one’s ability was derived from immediate pub- 
lic expectancies and self-presentational in- 
terests rather than from internalized stereo-
types or dispositional fears of success.


Conclusions 


The present findings do not mean simply
that women perform poorly when bad per-
formance means something good. Rather, our
evidence indicates that an expectancy exerts
effective power over a woman² toward elicit-
ing behavior confirming the expectancy if the
following two conditions are satisfied: The
expectancy is based on a desirable rather than
an undesirable aspect of the target’s person-
ality, and other persons present in the situa-
tion are aware that this trait (on which the
expectancy is based) is attributed to the
target person. We do not deny that internal
expectancies can affect behavior. Nonethe- 
less, our findings imply that in order to under- 
stand responses to overt public expectancies, 
one must take into account not only the in- 
ternal cognitive dynamics of the individual 
but also the individual’s self-presentational in- 
terests. The influence of expectancy on be- 
havior is in part determined by the expec- 
tancy’s derivation and context, which give 
extra meaning to the behavior in the face of 
the expectancy and which therefore affect the 
self-presentational strategy. 


2 Our conclusions are stated in reference to women 
because our experimental sample consisted of women. 
We do not, however, wish to imply that our hy- 
potheses would not be true for men. 
Sex Differences in the Allocation of Pay


Charlene M. Callahan-Levy 
New College, University of South Florida 


Lawrence A. Messé 
The present research examined the possibility that relative to males, females
perceive less of a connection between their work and monetary rewards. In 
Study 1, females and males determined either their own pay or the pay of 
another person. The results supported the hypothesis, in that (a) females paid 
themselves less than did males, and (b) females paid themselves less than other 
people (males or females) paid females. Moreover, the results suggested that 
subjects were more generous when they paid females. In Study 2, sex differences 
in self-pay were examined in samples of first-, fourth-, seventh-, and 10th-grade 
subjects. Results replicated those of Study 1, in that at every grade level, 
females paid themselves less than did males. In addition, the extent to which 
females allocated pay the way their male counterparts did was found to be 
highly related to the masculinity of their career goals. Some implications of 
these results for the understanding of inequitable consequences of traditional
sex roles are discussed.


One of the more consistent findings gen- 
erated by studies of reward allocation be- 
havior is that American female allocators pay 
themselves less than do males (Lane & Messé, 
1971; Leventhal & Lane, 1970; Mikula, 
1974; Messé & Lichtman, Note 1). While this 
sex difference has been demonstrated a num- 
ber of times, there have been few empirical 
investigations of the numerous speculations 
that have been offered to account for this 
phenomenon (e.g., Leventhal, Note 2). Thus,
the present research examined one possible 
explanation, derived from the position of Med- 
nick and Tangri (1972) and Chesler and 
Goodman (1976), that relative to males, fe-
males perceive that money is less salient a re-
ward for the work that they perform. 

Mednick and Tangri (1972) and Chesler 
and Goodman (1976) speculated that females 
learn that performances of the sex role- 
related tasks traditionally assigned to them 
(e.g., mothering, housekeeping, etc.) usually 
are not rewarded monetarily in any direct 
way. It is possible that the lack of a relation- 
ship between money and the traditional tasks 
for women generalizes to some extent to other 
work situations as well. Moreover, this per- 
ception may be reinforced indirectly by socio- 
cultural biases about females’ relative “needs” 
for monetary rewards. For example, differ- 
ences in pay as a function of sex often have 
been “justified” by assertions that it is ac- 
ceptable for women to earn less than men 
because women usually do not provide the sole 
means of economic support for their families. 
Males, on the other hand, might learn to dis- 
tinguish between family-oriented tasks, for 
which no money is expected, and “work,” 
which they perform for pay. 

Males also may come to see money not 
only as a means to acquire goods and services 
but also as a symbol of how their work is 
evaluated and, thus, as a measure of their 
worth. Females, however, may not perceive 
their pay to have this evaluative component,
since the traditional tasks at which women
are “supposed to do well” and to which their
self-concepts are supposed to be tied (e.g.,
mothering, cleaning, cooking, etc.) are not 
“jobs” for which pay is expected. Chesler 
and Goodman (1976) speculate that women 
might use money as an index of self-worth to 
the extent that they can acquire it without 
working for it in the traditional sense, that 
is, when it is “magic money.” 

Given that females might have internalized 
less of a connection between work and pay 
and between pay and self-esteem, it follows 
that females’ views of money as pay would be 
influenced by a number of factors that males 
would perceive as less relevant. For example, 
females could be concerned with appearing 
“nice,” “not greedy,” or “not concerned with 
money,” and so forth, whereas males could 
be concerned almost entirely with getting paid 
at least what they deserve.

This article reports two studies that exam- 
ined the implications of this argument. In 
Study 1, we explored the extent to which sex 
differences in reward distribution found in 
past studies were due, at least in part, to the 
tendency of females to pay themselves less 
than males pay themselves. In Study 2, we 
attempted to replicate some of the major find- 
ings of Study 1 and to extend them by exam- 
ining self-pay allocation in other subject 
populations and by relating this behavior 
more directly to sex role socialization.


Study 1 


The “less-of-a-connection” hypothesis can 
be viewed within the perspective of equity
theory (Berkowitz & Walster, 1976). In one
version of the theory (e.g., Lane & Messé,
1972; Weick, 1966), persons are assumed to 
have, among other criteria of equity, an in- 
ternal standard by which they judge the ade- 
quacy of the compensation that is available 
for a given level of work inputs. This standard 
can be applied both to one’s own rewards 
(“own equity”) and the rewards of other per- 
sons (“other equity”). It could be that males 
are at least as likely to apply this internal 
standard of fair pay to outcomes that are po-
tentially available for their own inputs as
they are to apply it to the potential outcomes
of others. Females, on the other hand, could 
be more likely to apply the internal standard 
to the outcomes of others than they are to 
their own outcomes. In Weick’s (1966) terms, 
the more tenuous connection between work 
and pay that is hypothesized to be found in 
females could be manifested in a weaker sense 
of own equity. Further, as discussed above, 
this hypothesized weaker sense of own equity 
would make females more susceptible to the 
influence of such factors as concerns with im- 
pression management, which would tend to af-
fect adversely their judgment about what is 
appropriate pay for themselves. 

An alternative interpretation, again cast 
within the framework of equity theory, is that 
persons, both males and females, apply differ- 
ent standards of fairness to the pay of females 
and males. Thus, it could be that both males 
and females believe that women should be paid 
less than men for a given amount of work. 
This difference in the perception of what is 
fair pay would mean that the “double stan- 
dard” of different pay for the same work by 
males and females would have the status of a 
general norm. 

Both explanations involve a sex difference 
in the connection between work and pay, and 
both would account, at least in part, for the 
reward allocation findings of past research. 
In these studies (e.g., Leventhal & Lane, 
1970; Mikula, 1974; Messé & Lichtman, Note 
1), allocators typically were asked to divide 
anonymously a monetary reward between
themselves and another person. In this zero-
sum situation, it could be that females gave
the other person more in part because they
conformed to a norm that specified that
women (including themselves) deserved less.
On the other hand, it also could be that the
female subjects in these studies allocated rela-
tively more reward to the other person in part
because they might have perceived less of a
connection between their own work and the
money that they could give themselves, and
thus, they could have been more susceptible
to being influenced by concerns with the other
person’s welfare, by the norm of equality, and
so forth.
However, the procedure that typically was
used in these past studies, where subjects
divided a reward between themselves and an-
other person, does not separate the impact on
allocation behavior of people’s internal stan-
dards from the impact of their interpersonal
concerns. The present study, on the other
hand, used a procedure that reduced the sali-
ence of interpersonal concerns and thereby
permitted a more direct exploration of the
hypothesis that females perceive less money as
an appropriate reward for themselves than do
males. In addition, it also permitted a direct
examination of the extent to which the tend-
ency in females to pay themselves less is a
result of a weaker sense of their own equity
or of a general norm that persons apply to all
women.

In the present research, female and male
subjects were asked to pay either themselves
or another female or male for a given period
of work. If females do, in fact, perceive less
of a connection between work and money and,
as a result, are more influenced by additional
concerns, they should pay themselves less than
do males. Moreover, given that the hypothe-
sized less-of-a-connection is reflected in a
weaker sense of own equity, then females
should pay themselves less than both females
and males pay other females. However, if the
sex difference in reward allocation is based on
a general norm, then females should pay
themselves less, and both females and males
should pay females less than they pay males.


Method 
Subjects and Recruitment 


One hundred twenty-six undergraduates, 66 females
and 60 males, served as subjects in this experiment.
These persons were chosen by chance from approximately
400 respondents to an advertisement in the Michigan
State University student newspaper that solicited
subjects for pay.


Three dichotomous independent variables were ex-
amined via a factorial design. The variables were
(a) the target of the allocation (the allocator or an-
other person), (b) the sex of the target person, and
(c, included as a control to complete the design)
the sex of the nontarget person (same as or different
from the target). When the subject was choosing pay
for herself (himself), the sex of the nontarget merely 
referred to the sex of the people in the other group 
(see below). When the subject was paying the other 
person, the sex of the nontarget referred to the sex 
of the allocator. Viewed from a simpler, more 
straightforward perspective, the design involved the 
sex of the allocator, the sex of the people in the 
other group, and the target of the allocation (the 
allocator or a person in the other group). However, 
the comparisons of greatest interest, which were sub- 
sumed under the Target × Sex of Target interaction,
could be examined most directly when the design 
was cast in the more complicated format. 


Procedure 


Subjects were examined in 16 groups of six or 
eight persons.1 Four of the groups were composed 
entirely of females, four entirely of males, and the 
remainder were composed so that half of the par- 
ticipants were male and half were female. In two 
same-sex groups and four mixed-sex groups, each 
subject allocated money only to herself (himself) 
after she (he) had completed a pretask; in the re- 
maining groups, half the subjects each allocated pay 
to another person, who also had completed the pre- 
task but who did not allocate payment. 

When subjects arrived for the experiment, they 
were divided between two large tables in a rather 
spacious room. In mixed-sex groups, females were 
always assigned to one table and males to the other. 
Two experimenters, one male and one female, handed 
out the pretask materials and explained what the 
subjects were to do. They always started by ex- 
plaining that the subjects were participating in a 
study that attempted to simulate an industrial set- 
ting. This rationale has been used a number of times 
in past reward distribution studies (e.g., Leventhal
& Michaels, 1969). 

The pretask consisted of a number of essay ques- 
tions pertaining to relevant campus-related issues.2
Subjects were asked to spend approximately 5 min- 
utes writing their own opinions on each question or 
topic. It was stressed that this task was nonevalua- 
tive, in the sense that there were no right or wrong
answers to the questions.


1 It should be noted that a group of subjects per-
formed only the initial task in the same room. As 
explained in detail below, subjects always allocated 
pay and completed the postallocation questionnaire 
in separate cubicles. 

2 This task appears to be an appropriate choice for 
a nonevaluative work input, given that 223 males 
and females, when asked to work on and evaluate 
it, generated no sex differences in their perception of 
their absolute or relative enjoyment, skill, effort, and 
quality of performance; multivariate F(8, 214) = .90,


p<. 
At the end of 50 minutes, the experimenters col-
lected the finished questionnaires; then they told the 
subjects that the remainder of the session would be 
devoted to paying them. In the self-allocation condi- 
tions, each subject was led to a cubicle, where she 
or he was given $6 in bills and change, two en- 
velopes—one marked “My pay” and the other “Re- 
mainder”—and a short questionnaire that asked (a) 
the amount of money allocated, (b) the reason, in 
the subject’s own words, why this amount was 
chosen, (c) the amount of money (up to $6) that 
the subject thought would be fair pay for this work, 
and (d) the sex of the person in the “other group.” 
The experimenter explained that each subject should 
pay herself (himself) any amount, up to the maxi- 
mum of $6, by placing the desired amount in the 
“My pay” envelope. This amount was hers (his) to
keep. The subject was to take the remainder and 
place it in the appropriately marked second envelope. 
He (she) was informed that the amount that was 
left over would be returned to the general research 
fund, and thus, his (her) allocation would not affect 
the payment of any other person. After allocating 
the money, she (he) was then to complete the ques- 
tionnaire, fold it, place it in the “Remainder” en- 
velope, seal the envelope, and deposit it in a box 
that was placed at the exit to the room. 

In the other-allocation conditions, only half the 
subjects (always sitting at the same table) were led 
to the cubicles, while the remainder stayed at their 
original places at their table. Once in the cubicles, 
subjects were told that they were to allocate up to 
$6 to pay one of the other subjects who was left in 
the original room. The specific person who would 
be the recipient of the allocation was to be selected 
by chance, and her (his) identity would never be 
disclosed to the allocator, nor vice versa. (However, 
in a matter-of-fact manner the experimenter stated 
verbally, and the written instructions indicated, 
whether the recipient was female or male.) Again, 
each subject in the cubicles was handed two en- 
velopes—this time, however, one was marked “Her 
(or His) pay” and the other was marked “Re- 
mainder”—and a similar short questionnaire. The 
experimenter stressed that the allocator’s behavior in 
no way influenced his or her own pay, which had 
been determined in advance but would not be re- 
vealed or dispensed until this task was completed. 
The procedure for allocating pay to the other per- 
son was essentially the same as that used in the self- 
pay conditions, except that the allocator placed both 
envelopes in the box before she (he) was given a 
sealed envelope containing $2.50. The experimenter 
did, in fact, take the pay envelopes and distribute 
them at random to the subjects who were waiting in 
the original room. 


Results 


There were two payment scores obtained 
from each allocator. One score was the amount 
of money that subjects (on the postallocation 
questionnaire) reported giving to the target
of the allocation; the other was the amount
of money that the allocator indicated was
“fair” pay.3 These scores were subjected to
separate 2 (target: allocator or another person)
× 2 (sex of target) × 2 (sex of nontarget)
unweighted-means analyses of variance.
The results of these analyses are reported below
in terms of their implications for the hypotheses.4

Both the less-of-a-connection hypothesis
and the double standard hypothesis predict
that females should underpay themselves relative
to males. In addition, however, females
should pay themselves less than other people
pay females if the lesser connection is manifested
in a weaker sense of own equity,
whereas both males and females should pay
other females less than they pay males if
there exists a general, lower standard of fair
pay for women. Thus, we expected either a
significant Target × Sex of Target interaction
(if the less-of-a-connection was expressed
in weaker own equity) or a significant sex of
target main effect (if the sex difference in
pay was a reflection of different standards).
The Target × Sex of Target interaction was
significant in both analyses, Fs(1, 66) =
12.50 and 11.35, ps < .001, for actual and


3 As noted above, an item on the postallocation
questionnaire asked subjects to indicate the sex of
the other person. Since this variable was part of
the design, and it was crucial when subjects allocated
pay to the other, the data of nine subjects
who answered this question incorrectly were excluded
from the analysis. In addition, the data for two
other subjects were not used: One subject, on the
postquestionnaire, indicated that he was told when
he had been telephoned and asked to participate that
he was to be paid at a rate of $2/hour, which in
fact is what he paid himself. This occurred only
once, since he was one of the first subjects to be
examined. He was told this in answer to a question,
and we took steps to insure that the assistants who
were in charge of scheduling answered such questions
more appropriately in the future. A second
subject gave very bizarre reasons for his allocation
(he paid a female $6).

4 It is important to note, though, that as expected,
the sex of the nontarget had little
effect on allocations, since no F ratio involving this
variable even approached significance. The largest of
these F ratios were .940 and .739, for actual and
fair pay, respectively.
fair pay, respectively. Table 1, which presents
the relevant cell means, indicates that the direction
of this effect supports the position
that the more tenuous connection between
work and pay for women is reflected in a
weaker sense of own equity.

Inspection of Table 1 and appropriate individual
comparisons (Winer, 1971, p. 384)
indicate that females tended to pay themselves
less than males paid themselves, t(66)
= 1.89, p < .03 (actual pay), and t(66) =
3.50, p < .001 (fair pay), and less than other
people paid females, t(66) = 3.86, p < .001
(actual pay), and t(66) = 2.87, p < .005
(fair pay).

Since the weaker own equity interpretation
posits that females will pay themselves less
than they pay other females, it seemed reasonable,
given the results presented above,
to perform individual comparisons of these
means. These comparisons provided further
support for the own equity explanation, since
females were found to pay themselves (see
Table 1) less than female allocators paid
other females: M = $5.19, t(66) = 2.86, p
< .005, for actual pay; and M = $3.77, t(66)
= 2.36, p < .02, for fair pay. Thus, our evidence
indicates that females pay themselves


Table 1
Mean Payment Allocated to Females and Males

                        Target of allocation
                   Self(a)               Other(b)
Sex of          Amount                Amount
target          (in $)     n          (in $)       n

Actual pay allocation
Female           3.45     22       [illegible]    17
Male             4.26     17          3.42        18

“Fair” pay allocation
Female           2.49     22          3.62        17
Male             3.88     17          3.16        18

a Values presented in this column are the average
amounts that the subjects of each sex allocated to
themselves.
b Values presented in this column are the average
amounts that subjects (both males and females)
allocated to other females and males.
less than males, and this sex difference appears
due to a weaker sense of own equity in
women. 

In addition, the simple effects analysis of 
the Target X Sex of Target interaction indi- 
cated that allocators tended to pay other 
males less than they did other females, 
F(1, 66) = 9.62, p<.005 (actual pay); 
F(1, 66) = 2.98, p < .10 (fair pay). 

Although (as noted above) data collected 
on a separate sample of students indicated 
that males and females evaluate their own 
performance on the task similarly, it still 
could be that in general, writing essays—a 
task that requires some reasonable degree of 
verbal fluency to perform well—is perceived 
as a task more suited for the “typical” fe- 
male. To explore this possibility, 307 subjects 
—students in introductory psychology and so- 
cial psychology classes—completed a “Cam- 
pus Issues Questionnaire” and evaluated on a 
5-point scale either their own performance 
or the expected performance of other females 
or males. These data indicated that there 
was no difference in the evaluation of per- 
formance, either one’s own or that of others, 
as a function of the sex of respondent, 
F(1, 301) = .15. However, there was a differ- 
ence in evaluation of others’ expected per- 
formance as a function of their sex, F(1, 301) 
= 8.24, p < .01; females were expected to do
a better job than males. Thus, it appears that 
the differential pay given to other females 
and males was due, at least in part, to the 
differential expectations that allocators might 
have had about performance as a function of 
sex. 
In addition, both males and females evalu- 
ated their own performance more favorably 
than they did the expected performance of 
other persons of either sex, F(1, 301) = 
105.44, p < .0001. Of particular interest is 
the finding that females evaluated their own 
performance more favorably than they did the 
expected performance of either other women, 
F(1, 301) = 23.92, p < .0001, or other men, 
F(1, 301) = 78.64, p < .0001. Table 2 presents
the cell means that are relevant to these effects.
Table 2
Evaluation of Performance on Campus Issues Questionnaire

                     Target of evaluation
                   Respondent          Other
Sex of target
of evaluation     Score     n      Score     n
Female             4.52    118      3.96    44
Male               4.66    105      3.48    40


Note. Scores on this measure could range from 1.0 
(i.e., poor performance) to 5.0 (i.e., excellent per- 
formance). 


Discussion 


The results of this study indicate that per- 
sons who were recruited for pay, who actually 
worked for a period of time, and who received 
payment for their work differed in the amount 
of money that they paid themselves (and con- 
sidered to be fair) as a function of their sex. 
These findings lend support to Mednick and 
Tangri’s (1972) and Chesler and Goodman’s 
(1976) contention that females expect less 
monetary reward than males for the work that 
they do. 

Results also indicated that this more tenu- 
ous connection between work and pay is ex- 
pressed primarily through a weaker sense of 
own equity in females. Females paid them- 
selves less than other people paid females, al- 
though females also evaluated their own per- 
formance more favorably than they did the 
performance of other females. Thus, it ap- 
pears that females’ payment to themselves 
was not based entirely on the evaluation of 
their own performance. On the other hand, 
both females and males paid other females 
more than they paid other males, and they 
evaluated the expected performance of other 
females more favorably than they did that of 
males. Since this correspondence between eval- 
uation and pay did not occur when females 
allocated money to themselves, it appears 
that women were less apt to apply a standard 
of fair pay to their own behavior than they 
were to the behavior of others. 

It should be noted, however, that these 
findings do not necessarily invalidate com- 
pletely other alternative explanations. Some 
explanations, for example, that females are
more influenced by a norm of equality than
are males (Leventhal, 1976; Sampson, 1975),
are not antithetical to the less-of-a-connection
hypothesis. In fact, the weaker sense of
own equity in females could be a necessary
condition that permits these other variables
to operate. For example, females might subscribe
to the norm of equality because their
weaker sense of own equity alone does not
provide them with an adequate basis for making
a decision about an appropriate distribution
of rewards.

The data do, however, tend to diminish the
plausibility of a number of alternative explanations.
For example, one possible alternative
explanation is that the female subjects in
this study had been victims of wage discrimination
in their pasts, which may have lowered
their expectations with regard to monetary
payment for work performed. A second alternative
suggests that females are less experienced
than males (and therefore less
“skilled”) at the task of determining monetary
outcomes. Thus, females may have
tended to underreward themselves because of
a relative lack of experience in dealing with
such matters. Neither of these explanations,
however, can account for the amount of
money women allocated to female targets.
That is, any lowered expectations regarding
pay for work as a result of past experience
(or inexperience) would most likely generalize
to paying other females as well and, therefore,
should have resulted in female allocators
treating other females in the same way as
they treated themselves. Results indicate,
however, that this was not the case.

Another potential alternative explanation
that is rendered more implausible by the results
of the present research is that females
are more “moral” or socially responsible than
males. The data that are summarized in
Table 1 suggest that both males and females
tended to actually take more for themselves
than they indicated they felt was appropriate.
Thus, while females paid themselves less than
did males, they (as did males) still took “advantage”
of the situation and paid themselves
more than the amount they reported was appropriate
immediately after that allocation.
Thus, there is no indication that females
acted in a more moral manner than did males. 
As noted above, the greater payment made 
by allocators to other females, compared to 
that made to other males, was congruent with 
differential evaluations of performance as a 
function of sex. However, examination of the 
relevant means (Tables 1 and 2) suggests 
that the amount actually paid to other fe- 
males was excessive, even given an expected 
difference in performance. Moreover, the dif- 
ference between what subjects actually paid 
other females and males was greater than the 
difference between what they said was fair 
pay for the two sexes. Thus, a case can be 
made that allocators in the present study 
were being “kind” to females. This conclu- 
sion is congruent with the findings of Gruder 
and Cook (1971) that persons are more help- 
ful to females than to males and those of 
Messé and Callahan-Levy (Note 3) that allocators
tend to be kinder to females when
distributing a reward between themselves and
another person. The notion of being kinder
to females is congruent with the passive, dependent
aspects of the traditional female sex
role. Thus, if females are persons towards
whom others are nurturant and protective,
there is no need for them to be concerned
with acquiring a reasonable amount of money
for their work.

The original impetus for the present research
came from an attempt to explain, at
least in part, the sex differences that were
rather consistently found in studies of reward
allocation in which subjects were asked to
divide money between themselves
and another person. The present findings
suggest that one reason why females appear
more generous in such situations is their
indifference, relative to males, to receiving
“appropriate” money for their work. Study
2, described below, attempted to replicate
this result across a different task and a number
of age groups and to relate pay allocation
to the consequences of sex role socialization.


Study 2 


Given the results of Study 1, it seemed reasonable
to explore two related questions: (a)
whether sex differences in self-pay allocation
occur across a number of ages and (b) 
whether these differences, if they exist, are 
related to sex role socialization. To this end, 
in Study 2, we examined self-pay allocation 
in first-, fourth-, seventh-, and 10th-grade fe- 
males and males and measured these subjects’ 
preferences for masculine and feminine oc- 
cupations and activities. 

Studies that have examined reward distribution
in children have yielded mixed results.
For example, Leventhal and Anderson
(1970) found that kindergarten-age females 
allocated less reward to themselves than did 
males. On the other hand, Lane and Coon 
(1972), in a replication of the Leventhal and 
Anderson study, found no sex difference, and 
Lerner (1974), in a study that used a similar 
methodology, found that 5-year-old females 
allocated more to themselves than did 5-year- 
old males. The reasons for these discrepant 
findings are not clear. Although the meth- 
odologies employed have much in common, 
they also differed in many potentially rele- 
vant ways—for example, the type of reward 
that the subjects were asked to distribute— 
and these differences in procedure could have 
caused the differences in results. 

Given the age of the children who were ex- 
amined in these studies, it could be that these 
differences in research findings were due, at 
least in part, to the complex interpersonal 
nature of the reward distribution task. It is 
possible that a more straightforward task— 
for example, self-pay allocation—would yield 
more consistent, interpretable findings. 

Hartley (1960) found that both males and 
females, who were as young as 5 years old, 
were capable of making clear distinctions be- 
tween men’s and women’s work roles. Her 
subjects saw women’s work to be centered on 
homemaking duties (which receive no direct 
monetary pay), whereas men’s work was per- 
ceived as a number of tasks for which money 
was paid. Assuming (a) that the weaker 
sense of own equity in females is connected 
to sex role socialization regarding appropriate 
occupations for men and women and (b) 
that the consequences of the socialization ap- 
pear at least by the age of 5 years, it follows 
that a sex difference in self-pay allocation 
Table 3
Mean Ages (and Frequencies) of Female and Male Subjects at the
Four Grade Levels

                                Grade
             First         Fourth        Seventh         10th
            Age           Age           Age           Age
Sex       (years)   n   (years)   n   (years)   n   (years)   n
Female     6.28     7    9.33    12   12.57    11   15.40    10
Male       6.14    14    9.33     9   12.45     7   15.60    10


would occur quite early in children’s psycho- 
logical development. However, as children be- 
come more experienced with sex role-related 
activities, it also seems likely that this dif- 
ference would be increased. Thus, we expected 
that the sex difference in self-pay allocation 
would be greater among adolescent subjects 
than it would be among younger children. 

A fundamental premise of the less-of-a-con- 
nection hypothesis is that it is related to sex 
role socialization, especially those areas of 
sex role that involve occupational choice. In 
Study 2, we attempted to examine the premise 
more directly by measuring preferences for 
sex-typed occupations and activities and ex- 
ploring in females the relationship between 
these measures and their self-pay allocation 
behavior. We expected that girls who ex- 
pressed a preference for masculine occupa- 
tions and activities would be more likely to 


pay themselves as males do, whereas girls
who selected more feminine pursuits would
be more likely to underpay themselves relative
to males.


Table 4
Presentation Order and Femininity Ranking of Items in the
Activities and Occupations Scales


Method 
Subjects 


The subjects were male and female students in the
first, fourth, seventh, and 10th grades of schools in a
district that serves a lower-middle-class to middle-class
suburb of Lansing, Michigan. Table 3 presents
the frequencies and mean ages of the subjects. In
each class the teacher introduced the experimenter to
his or her students. The experimenter explained to
the class that the students were needed to perform a
job: participating in an interview that involved
“things that you might like to do.” Parental permission
was obtained for all those students who agreed
to participate. Of all the students who were contacted,
only one, a female 10th-grade student, declined
to take part.


                     Activities                     Occupations
Order of
presentation   Item                  Ranking   Item               Ranking
 1             Playing football        10      Firefighter        [illegible]
 2             Sewing or knitting       1      Secretary          [illegible]
 3             Shoveling snow           9      Football player    [illegible]
 4             Cooking or baking        2      Nurse                  2
 5             Going camping            8      Astronaut              8
 6             Dancing                  3      Teacher                3
 7             Going bowling            7      Police officer         7
 8             Swimming                 6      Parent                 4
 9             Ice skating              4      Doctor                 6
10             Reading                  5      Baker                  5
Interview Schedule


The instruments used in the study were administered
to subjects via an interview protocol, which
was the “task” for which subjects were paid. This 
interview required approximately 25 minutes to 
complete. A female or male undergraduate experi- 
menter (counterbalanced with respect to sex of sub- 
ject) administered the interview individually. 

The interview began with a number of questions
about personal and family demographic characteristics,
for example, age of subject, family size, and
parents’ employment status. Then, subjects were presented
with a list of 10 activities. This list was obtained
through research that established, via a modified
Thurstone procedure, that these activities differed
significantly with respect to the degree of
femininity and masculinity associated with them.
Table 4 presents these activities ranked as a function
of their perceived femininity, and it also indicates
the order in which the activities were given to the
subjects. The experimenter read through the list
twice and then asked each subject to indicate her or
his three most preferred activities.

Next, subjects were presented with 10 occupations.
These occupations were selected to vary along the
feminine-masculine dimension and were obtained via
the same procedure that we used to construct the
Activities scale. The 10 occupations, their rank order
in terms of femininity, and their presentation order
also are given in Table 4. The Occupations scale was
administered to subjects in exactly the same manner
as was the Activities scale.

Following the presentation of the Occupations
scale, the experimenter administered a short form
(10 items) of the Bialer-Cromwell Locus of Control
Scale for Children (Bialer, 1961). This scale consists
of statements that require a yes-no answer.
Next, subjects were given instructions concerning
self-pay allocation (see below), which they performed.
Then, the experimenter asked three questions
concerning the subject’s evaluation of his or her performance.
The exact wording of these questions is presented
in Table 5. Finally, the experimenter asked
the subject what she or he thought was the “fair
amount” of pay for doing this work.5


Pay 


Prior to Study 2, via extensive pilot testing, we
obtained an evaluation of what was considered
appropriate pay for participation in an experiment
of this type. This evaluation was necessary so
that subjects could be presented with an amount of
money from which to pay themselves that was in
great excess of a consensus standard of what was
fair. This pilot research revealed that children considered
appropriate pay to be less than $1, and thus,
$3 in dimes was the amount of money presented to
each subject.
Table 5
Mean Responses of Female and Male Subjects to Questions About Their
Own Performance

                                          Females      Males
Interview question                        (n = 40)    (n = 40)
1. How well do you feel you did
   at this “job”?                           2.52        2.70
2. Do you think you did better than
   most boys your age would do?          [illegible]    2.18
3. Do you think you did better than
   most girls your age would do?            2.10        2.30

Note. Subjects answered all questions using a 3-point
scale. Thus, scores could range from 1 to 3,
with higher scores indicating a higher evaluation.


Also, there was some question about the appropri- 
ateness of money as a reward for 6-year-old first 
graders. Pilot testing indicated that most first graders 
were confused about the value of coins, but that 
they were interested in, and understood, the value 
of Hershey “kisses.” This pretesting also indicated 
that children in first grade saw less than 10 “kisses” 
as fair pay for being interviewed. Thus, we presented 
subjects in this grade level with 30 Hershey “kisses,” 
so that they could take what they wanted as their pay.


Variables and Experimental Design 


The design of Study 2 was both experimental and 
correlational. Sex of subject and grade level were ex- 
amined through a 2 (female, male) X 4 (first, fourth, 
seventh, and 10th grades) factorial design with two 
dependent measures: (a) the amount of reward that 
subjects actually allocated to themselves and (b) the 
amount of reward that subjects indicated was fair 
pay. In addition, the two pay measures were cor- 
related with the two measures of sex role preference, 
scores on the locus of control measure, and the 12 
biographic-demographic questions (e.g., age of sub- 
ject, number of brothers, employment status of the 
father, etc.). 
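
As a consistency check on the factorial design just described, the degrees of freedom implied by a 2 × 4 between-subjects layout can be worked out from the cell frequencies in Table 3. The sketch below is illustrative only and is not the authors' analysis; the names `CELL_SIZES` and `factorial_dfs` are hypothetical, and it assumes the standard rule that the error degrees of freedom equal the total number of subjects minus the number of cells.

```python
# Cell frequencies taken from Table 3: grade -> (n females, n males).
CELL_SIZES = {
    "first": (7, 14),
    "fourth": (12, 9),
    "seventh": (11, 7),
    "10th": (10, 10),
}


def factorial_dfs(cell_sizes):
    """Degrees of freedom for a two-way between-subjects design.

    One factor is sex (2 levels); the other is grade, with one level
    per key of `cell_sizes`.
    """
    n_total = sum(n_f + n_m for n_f, n_m in cell_sizes.values())
    levels_sex = 2
    levels_grade = len(cell_sizes)
    n_cells = levels_sex * levels_grade
    return {
        "N": n_total,
        "sex": levels_sex - 1,
        "grade": levels_grade - 1,
        "sex_x_grade": (levels_sex - 1) * (levels_grade - 1),
        "error": n_total - n_cells,  # subjects minus number of cells
    }


dfs = factorial_dfs(CELL_SIZES)
print(dfs)
```

With the 80 subjects of Table 3 spread over eight cells, the error term has 72 degrees of freedom, which matches the F(1, 72) and F(3, 72) tests reported in the Results. Applying the same rule to the 40 females alone in a one-way design over four grades gives 36, matching the F(3, 36) tests on the proportion scores.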


5 The interview was worded exactly the same for 
all subjects. Thus, care was taken to insure, via pilot 
testing, that the younger children could understand 
the meaning of the questions. 
Procedure


A major goal in designing the present research was 
to replicate as closely as possible the essential fea- 
tures of the design and procedure used in Study 1.6 
Study 2 was conducted in large classrooms at the 
schools that the subjects normally attended. Ap- 
proximately 10 subjects, both male and female, were 
interviewed at the same time, but each was inter- 
viewed by a different experimenter, and they were 
always well separated within the room. Subjects were 
seated so that they faced the experimenter, who had 
her or his back to the wall, and so that no subject 
could see or communicate with other subjects in 
the room.

At first, subjects were given general instructions 
that informed them that they were about to com- 
plete a “job” for a class of students at Michigan 
State University. They were asked to take this job 
as seriously as possible and to do their best to answer 
the interview questions honestly and accurately, al- 
though there would be no right or wrong answers to 
whatever they were asked. They also were reminded 
that as promised earlier, they were to be paid for 
their time after they completed the job. Then the 
experimenter administered the interview schedule to 
the subject, a procedure that took approximately 25 
minutes to complete. 

At the completion of the interview, the subject 
was given the opportunity to pay herself or himself. 
The subject was told that her or his job was com- 
pleted and that it was time to be paid. The ra- 
tionale that was given for this self-payment pro- 
cedure was as follows: 


We have some money to pay you. The only prob- 
lem is, we don’t know what fair pay is for a job 
like this for someone like you. So we've decided to 
let each person take what they feel is a fair amount 
of money for their work. This means we want you 
to take what you feel you deserve. 


The experimenter also stressed that there was no 
“correct” amount and that the subject would ac- 
tually keep the amount that she or he chose. 

After the subject received these instructions, the 
experimenter placed 30 dimes (or candies, for first- 
grade subjects) on a large cardboard and gave the 
subject a blank envelope and an envelope marked 
“Leftover.” Subjects were instructed to put what 
they wanted in the blank envelope, which they 
would keep, and to put the remainder in the en- 
velope marked “Leftover.” The experimenter stressed 
that she or he would turn around and not look or, 
if possible, leave the room, so that no one would 
know how much money (or candy) the subject actually
kept. The subject also was told that any leftover
money (or candy) would be returned to Michigan
State University to be used for other studies.


Results 


As in Study 1, there were two measures of 
allocation behavior: (a) the amount of money 
that subjects actually allocated to themselves
as pay and (b) the amount that subjects indicated
they felt was fair pay on a postallocation
question. Summary data for the self-allocation
dependent measures are presented in
Table 6, which presents the mean number of
payment units (candy in the first grade, dimes
in the fourth, seventh, and 10th grades) actually
kept and indicated as fair pay by
males and females at the four grade levels.

The amount of reward subjects actually
kept and the amount they indicated was fair
pay on the relevant postallocation question
were subjected to 2 (sex of subject) × 4
(grade level) unweighted-means analyses of
variance (ANOVAs). As in Study 1, the two
ANOVAs produced essentially the same results.
The ANOVAs revealed two significant effects:
the main effect for sex of subject, F(1, 72) =
15.54, p < .001 (actual pay), and F(1, 72)
= 9.22, p < .005 (fair pay), and the main
effect for grade level, F(3, 72) = 13.39 (actual
pay) and F(3, 72) = 8.37 (fair pay),
both ps < .001. As expected, Table 6 shows
that the main effect for the sex of subject reflected
a tendency in males to allocate greater
pay to themselves than did females and to
perceive that more pay was fair. Table 6 also
shows that as the grade level increased, the
amount of reward that they felt was fair pay
tended to decrease. Comparisons performed
on the two scores between adjacent grade
levels across both sexes indicated that there
was a very significant decrease in the amount
kept and thought to be fair between the
fourth and seventh grades, F(1, 72) =


6 Study 1, however, recruited college-age subjects
for pay through an advertisement in the
newspaper. Although the subjects in Study 2 were
essentially volunteers who were also recruited for
pay, the act of volunteering required very little
effort on their part. In fact, many of the elementary,
junior high, and senior high students seemed to
welcome the opportunity to be excused from class
for the time required by the study. Thus, an essential
difference between the two studies is the motivation
of subjects for the pay offered. It is not
possible to determine the degree to which this difference
might have affected the interpretation of
results, but it is assumed that these differences in
motivation would be uniform across sex and
grades.
Table 6
Mean Payment (in Units of Pay) That Females and Males Allocated
to Themselves at Each Grade Level

                                Grade level
              First          Fourth         Seventh          10th
Sex of
subject     Amount   n     Amount   n     Amount   n     Amount   n

Actual pay allocation
Female      10.86    7     12.92   12      3.73   11      2.40   10
Male        16.57   14     18.56    9      6.57    7     10.70   10

“Fair” pay allocation
Female      10.14    7      9.17   12      4.64   11      3.00   10
Male        14.86   14     15.44    9      5.29    7      8.50   10

(actual pay) and 12.43 (fair pay), both ps <

Analysis of variance tests the significance
of interval rather than ratio differences, and
these interval differences tend to be constrained
by the total score over the cells of a
given comparison. Both male and female subjects
in the higher grades took much less than
did their younger counterparts; in fact, by
the seventh grade, the mean self-allocation
was only about five units, across sex. This
relatively small amount of pay had the effect
of constraining the size of the interval differences
between the conditions of sex within
the older grades and, thus, would tend to
minimize the magnitude of the obtained Sex
× Grade Level interaction (F < 1.00, in both
analyses). Therefore, in an attempt to explore
the possibility that sex, in fact, did influence
allocation of pay differentially across
grade levels, actual and “fair” female scores
were transformed to proportion scores, which



represented the females’ proportion of the 
mean males’ score at each grade level. For 
example, each first grade female’s actual and 
fair scores were divided by the respective 
actual and fair means of the first grade males’ 
scores. In this way, an analysis could be per- 
formed on ratio scores, rather than on the 
more constrained interval units. 
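
The transformation described above can be illustrated with a minimal Python sketch. It assumes only the group means reported in Table 6; the individual scores are not available here, so the function and variable names are hypothetical. Because every female score in a grade is divided by the same male mean, the mean proportion score for a grade equals the female mean divided by the male mean.

```python
# Group means from Table 6 (units of pay), keyed by pay type and grade level.
MALE_MEANS = {
    "actual": {"first": 16.57, "fourth": 18.56, "seventh": 6.57, "10th": 10.70},
    "fair": {"first": 14.86, "fourth": 15.44, "seventh": 5.29, "10th": 8.50},
}
FEMALE_MEANS = {
    "actual": {"first": 10.86, "fourth": 12.92, "seventh": 3.73, "10th": 2.40},
    "fair": {"first": 10.14, "fourth": 9.17, "seventh": 4.64, "10th": 3.00},
}


def proportion_score(female_score, male_mean):
    """Express one female's self-payment as a proportion of the male mean for her grade."""
    if male_mean <= 0:
        raise ValueError("male mean must be positive")
    return female_score / male_mean


def mean_proportions(kind):
    """Mean proportion score per grade, computed from the Table 6 group means."""
    return {
        grade: round(proportion_score(FEMALE_MEANS[kind][grade], male_mean), 2)
        for grade, male_mean in MALE_MEANS[kind].items()
    }


if __name__ == "__main__":
    for kind in ("actual", "fair"):
        print(kind, mean_proportions(kind))
```

Values recomputed this way from the rounded means can differ from the published proportions by about .01, since the original analysis worked from unrounded individual scores.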

These female proportion scores were sub- 
jected to one-way (grade level) analyses of 
variance. These analyses revealed that the
effect of grade level was significant for actual 
pay, F(3, 36) = 4.94, p< .01, but not for 
fair pay, F(3, 36) = 1.20, p > .1. The means 
of these proportion pay scores over the four 
grade levels are presented in Table 7. Planned 
comparisons (Winer, 1971, p. 384) performed 
between adjacent grade levels revealed that as 
predicted, the actual pay proportion scores 
for the 10th-grade females were significantly 
less than the scores for seventh-grade girls, 
t(34) = 2.55, p < .02. This also was the case
for fair pay proportion scores, t(34) = 1.88,
p < .05, even though the overall effect of
grade level was not significant.


Table 7
Mean Proportion of Average Male Pay That Females
Paid Themselves in Each Grade

                          Grade level
             First      Fourth     Seventh    10th
Pay score    (n = 7)    (n = 12)   (n = 11)   (n = 10)
Actual        .66        .69        .57        .22
Fair          .68        .59        .88        .35

The possibility that the sex difference in 
self-pay was due to differences in the males’ 
and females’ evaluations of their own per- 
formances was examined through 2 (sex of 
subject) X 4 (grade level) analyses of vari- 
ance, performed on the three questions that 
were relevant to this issue that were asked 
during the postallocation portion of the in- 
terview. 

Table 5, which presents the mean responses 
to these questions as a function of sex of sub- 
ject, indicates that there was little difference 
between males and females in their self- 
evaluations. The results of the three analyses 
of variance supported this conclusion, since 
no effect attributable to sex of subject was 
statistically significant (Fs = .08 to 2.60).
Thus—as was the case for the findings for 
Study 1—the sex difference in self-pay al- 
location does not appear to be due to any 
underlying difference in performance evalua- 
tion. 

We expected that scores on the sex role 
preference scales would correlate with self- 
allocation behavior. It should be noted that 
t tests comparing the median rank of the 
three items that were selected by a male or 
female subject for both the Activities scale 
(female M = 5.85, male M = 8.50) and the 
Occupations scale (female M = 3.98, male M 
= 8.43) were highly significant, ts(78) = 
8.35 and 16.70, respectively, ps < .001. The 
two scales also were significantly correlated 
with each other, r(76) = .65, p < .001. As
mentioned previously, we expected that scores 
on the sex role preference scales would cor- 
relate with self-allocation behavior in females, 
such that more masculine preferences would 
relate to more masculine payment behavior. 
To test this prediction directly, the propor- 
tion of the average male payment in her grade 
level that each female took for herself was 
correlated with the median rankings of her 
selected occupations and activities. The re- 
sultant correlation coefficient revealed that 
scores on the Occupations scale correlated 
substantially with proportion scores, r(38)
= .66, p < .001, indicating that as expected,
the more masculine the occupational orientation,
the more masculine the allocation behavior.
The correlation coefficient computed
for scores on the Activities scale and the proportion
payment scores was in the predicted
direction, but was not statistically significant,
r(38) = .15.⁷

The payment scores also were correlated
with the locus of control measure and the
demographic variables. None of these correlations,
however, yielded statistically significant
findings.


Discussion 


The findings of Study 2 replicate closely
those of Study 1. Females, even the females
in the first-grade sample, tended to pay
themselves less than did males at the equivalent
grade level, as did the college student
females who were the subjects in the first study.
Moreover, the sex differences in self-pay that
were found in both studies do not appear to
be a function of differential evaluations of
task performance. Thus, findings of the two
studies taken together provide strong, albeit
indirect, support for the proposition that
females are socialized to have a weaker sense
of their own equity.

There were some slight differences between
the findings of the two studies, however. For
example, the 10th-grade sample of subjects
was closest in age to the subjects in Study 1,
and yet the sex difference at this grade level
was substantially larger than was evident
in the college student sample. It is likely that
this difference in results is attributable to
differences between the two studies in terms of
such factors as (a) the manner in which subjects
were recruited to participate, which
probably caused systematic differences in subjects’
motivation for money, and (b) the
tasks that were used as the basis for asking


7 It is of interest to note that the [...]
correlations for males (i.e., relating [...] masculinity
of occupational and activity [...]
pay) were similar to those for females, [...]
magnitude. Specifically, the correlation [...] occupational
preference and self-pay [...]
.075, while, again, it was considerably [...]
activity preference and self-pay ([...]).

subjects to pay themselves. It is possible that
these and other factors mediate the magni- 
tude of the sex difference. However, it is 
noteworthy that the sex difference in self-pay 
allocation did appear under such different 
conditions and in somewhat different subject 
populations. 

Given Hartley’s (1960) findings that young 
children to some extent have internalized the 
traditional division of labor as a function of 
sex, we expected some degree of sex differ- 
ence in self-pay behavior in our first-grade 
sample. However, it seemed reasonable that 
this sex difference would be greatest when the 
allocators were closest to the age at which 
they were expected to assume the adult sex 
role. Thus, we also expected that the relative
difference between male and female self-pay 
would be greatest for our 10th-grade sample. 

Findings supported both these expectations
to some extent. In fact, though, the difference
in the first-grade sample was somewhat
stronger than we had originally expected. The
magnitude of this finding suggests that the
consequences of that portion of sex role socialization
that involves economic activities
become evident at a rather early age. The
proportional difference in self-allocation as a
function of sex was greatest, as expected, in
the 10th grade, but, as noted above, this increase
was not evident when interval differences
were examined. This disparity, as we
suggested, could be due to the constraints
placed on the seventh- and 10th-grade data
due to the relatively small amount of money
that both males and females in these grades
paid themselves. On the other hand, there is
the possibility that the hypothesized increase
is not really that strong. Thus, it seems reasonable
that future research should examine
this issue (and that of the comparability of
[...] and college students) by asking
samples of seventh-grade, 10th-grade, and
college student male and female subjects to
pay themselves after they work on a more
[...] task, one that would provide
[...] a sound basis for taking a reasonable
amount of money.

We hypothesized that females and males
differ in allocation behavior as a consequence
of differences in sex role socialization,
and thus, this behavior in females should be
related to other manifestations of their sex 
role. Findings of Study 2 indicated that this 
was the case, since self-pay allocation in fe- 
males correlated strongly with the femininity- 
masculinity of occupation preferences. It is 
interesting that a less specifically economic 
index of femininity—masculinity (i.e., activ- 
ities) did not yield a significant relationship 
with self-pay allocation. If this specific find- 
ing is a valid reflection of this relationship in 
general, it would suggest that the sense of 
one’s own equity is a consequence of a some- 
what narrow range of socialization experi- 
ences: those that involve rather specifically 
the economic area of human endeavors. 


Conclusions 


The findings of the two studies also appear 
to have more general implications. The well- 
documented lack of parity in pay that exists 
between female and male members of the 
work force in American society to some ex- 
tent could be the result of a willingness on 
the part of women to work for lower wages 
than men. Mednick and Tangri (1972) and 
Chesler and Goodman (1976) contend that 
one factor that has contributed to this will- 
ingness is the different degree to which work 
and money are related in females and males. 
The present research does provide evidence in 
support of this assertion. 

In addition, the results of the present work, 
as well as the preponderance of past studies, 
strongly suggest that women tend not to act 
in ways that maximally benefit them eco- 
nomically. It could be argued that the “femi- 
nine” orientation toward money, in fact, is 
more psychologically healthy. However, it is 
nevertheless the case that such an orientation 
literally is costly to those persons who hold it, 
especially since it makes them vulnerable to 
exploitation by persons who have more self- 
serving ideas about money and work. 

These conclusions have important implica- 
tions for the evaluative examination of tradi- 
tional sex roles that currently is being under- 
taken within and outside of psychology. For 
this reason alone, they merit and require fur-
ther examination.


Self-Esteem, Self-Consciousness, and Task Performance:
Replications, Extensions, and Possible Explanations


Joel Brockner
State University of New York College at Brockport


Previous research has suggested that the task performance of low-self-esteem 
individuals (low SEs) is impaired under conditions designed to increase self- 
focused attention. Task-focusing, rather than self-focusing, manipulations have 
actually bolstered the achievement of low SEs. The present studies were de- 
signed to replicate and extend these findings. The results of two experiments 
demonstrated that the performance of low SEs (but not of individuals with 
higher self-esteem) was affected by a variety of attentional manipulations. As 
before, task-focusing instructions enhanced and self-focusing stimuli impaired 
their performance on a concept formation task. Similar results were obtained 
for individuals who scored high (but not for those who scored low) on an indi- 
vidual difference measure of self-consciousness. Study 2 also demonstrated that 
when the task-focusing manipulation worked, it neutralized the adverse effects 
of the self-focusing stimulus on the low SEs’ performance. Supplementary data 
suggested that the manipulations generally had their intended effects on atten- 
tional focus and that attentional focus influenced performance. It is hypoth- 
esized that the low SEs’ level of self-focused attention influenced performance 
through the mediating effects of anxiety. 


Self-esteem (SE) theorists and researchers
have established that relative to high-self-esteem
people (high SEs), low-self-esteem individuals
(low SEs) are more likely to suffer
from a variety of emotional and behavioral
problems (e.g., greater anxiety, less happiness;
Coopersmith, 1967) and more likely to
become alcoholics (Wahl, 1956). Perhaps because
low SE has so many undesirable correlates,
clinical, social, and personality psychologists
have long been concerned with uncovering
effective methods of enhancing the
self-evaluations of low SEs. Unfortunately,
this task has proven quite difficult, since low
SEs seem to be trapped in a vicious, self-defeating
cycle of low SE. For example, many
studies (Hamachek, 1971; Shrauger, 1972) 
have shown that low SEs do more poorly than 
high SEs in achievement settings (i.e., situa- 
tions that have evaluative implications for 
one’s SE). Poor performances, in turn, would 
seem to set the stage for continued self-criti- 
cism and low SE. Thus, of potentially critical 
importance are the precise reasons for the 
often deflated task performance of low SEs. 


Low Self-Esteem and Self-Focused Attention


One of several possible causes of low SEs’ 
poor task performance may be their focus of 
attention while engaging in the task. Spe- 
cifically, there is converging evidence that 
low SEs are generally more self-conscious or 
self-focused than high SEs. Turner, Scheier, 
Carver, and Ickes (1978) discovered a sig- 
nificant negative correlation between indi- 
vidual difference measures of self-esteem and 
self-consciousness (Fenigstein, Scheier, & 
Buss, 1975). In addition, recent findings from
the objective self-awareness literature suggest
an inverse relationship between self-focused 
attention and SE. For instance, Ickes, Wick- 
lund, and Ferris (1973) found that subjects 
who completed a SE inventory in the presence 
of a self-focusing stimulus (a mirror) rated 
themselves significantly lower than those in 
a no-mirror condition. Also, self-focusing stim- 
uli cause people to behave like low SEs. For 
example, both Duval (1976) and Wicklund 
and Duval (1971) found greater conformity 
when subjects were exposed to self-focusing 
stimuli. In parallel fashion, low SEs have also 
exhibited higher levels of conformity in some 
studies (e.g., Janis, 1954), although the evi- 
dence here is not totally consistent. Taken 
together, however, the evidence seems to im- 
plicate an inverse relationship between SE 
and self-focused attention. 


Self-Focused Attention and Task Performance 


The task performance of low SEs may be 
impaired by their heightened self-focused at- 
tention for at least two related reasons: (a) 
Given that attentional capacities are finite, 
low SEs will not be able to devote sufficient 
attention to the task, and (b) low SEs may 
well focus on their own negative character- 
istics (Mischel, Ebbesen, & Zeiss, 1976), 
leading to anxiety. Anxiety, in turn, may im- 
pair performance, particularly on complex 
tasks (e.g., Spence, Farber, & McFann, 1956). 

This is not to say, however, that low SEs 
will always be more self-conscious than high 
SEs and/or perform more poorly than high 
SEs. For example, if the task is inherently 
engaging, one might expect all subjects’ self- 
focused attention to be reduced. Under such 
conditions there may be no relationship be- 
tween SE and performance. In fact, Brockner 
and Hulton (1978), Shrauger (1972), and 
Shrauger and Terbovic (1976) found that low
and high SEs performed equally well on a 
concept formation task that naturally cap- 
tured the subjects’ attention. Nevertheless, 
when self-focusing stimuli were introduced 
into the experimental situation (i.e., an audi- 
ence in the Shrauger, 1972, study and an 
audience and mirror in the Brockner and Hul- 
ton, 1978, experiment), the performance of 
the low SEs dropped markedly, whereas the
high SEs maintained their high level of performance.
Hence, even if low SEs are not already
more self-conscious than high SEs, it
would seem that their susceptibility to becoming
self-focused is considerably greater
than high SEs’.

If the task performance of low SEs is impaired
by heightened self-focused attention
and/or a greater propensity to become self-aware,
then any manipulation that enhances
task-focused and decreases self-focused attention
should improve their performance. Indeed,
Brockner and Hulton (1978) discovered
that low SEs actually performed better
than high SEs when both groups were given
a set of instructions designed to increase task
focusing. These data may be of considerable
practical significance, in that the favorable
performance of the low SEs may serve as a
starting point for breaking the vicious cycle
of low SE. That is, it is entirely possible that
with repeated success experiences in achievement
settings, low SEs will elevate their self-evaluations
to a more positive level.


Questions Addressed by the Present Research


The tendencies for low SEs in the Brockner
and Hulton (1978) study to perform poorly
in the presence of self-focusing stimuli and
favorably when instructed to focus on the
task raised a number of theoretical and practical
questions. First, could the various findings
be replicated? Do self-focusing stimuli
consistently impair the low SEs’ task performance?
This is an important issue. From
a practical or applied vantage point, if
self-focusing stimuli consistently impair low
SEs, this may help us identify (and perhaps
start to change) the factors that might ultimately
perpetuate low SE. From a more
theoretical viewpoint, experimental social psychologists
have witnessed a great resurgence
of interest in the notion of self-consciousness.
To study the effects of self-consciousness,
many investigators have manipulated self-focused
attention with mirrors, video cameras,
and audiences and tested its effect on a variety
of behaviors (see Duval & Wicklund,
1972). The present formulation views self-esteem
as a factor related to a person’s
self-focusing tendencies. Thus, it would be useful
to know how this individual difference vari- 
able interacts with situational manipulations 
of self-consciousness. 

Second, could the task instructions results 
be replicated? Given that the instructions en- 
hanced the low SEs’ performance and that 
enhanced performance might begin to break 
the vicious cycle of low SE, it would be im- 
portant to know if these instructions reliably 
improve the low SEs’ performance. 

Third, can the results be extended? An important
condition missing from the Brockner
and Hulton (1978) study is one in which the
self-focusing stimuli and task-focusing instructions
are combined. What happens when
these two manipulations with opposite effects
on performance and presumably opposite effects
on attention are pitted against each
other? Do the two stimuli tend to cancel each
other, does one override the other, or do they
produce some other kind of interaction effect?

Fourth, if the results can be replicated and
extended, what are the appropriate explanations
for the data? To what extent are attentional
processes influencing performance?
Brockner and Hulton (1978) used several
manipulation checks of attentional focus, and
while the results on these items were generally
consistent with performance, the manipulation
check data were somewhat weak. The
present research included additional manipulation
checks that may give a more valid indication
of subjects’ focus of attention. Two
studies were conducted to answer the above
questions.


Study 1 


Subjects were given a pretest measure of
self-esteem and completed the identical
concept formation task used by Brockner and
Hulton (1978). One third of the subjects
completed the task in the presence of self-focusing
stimuli. The stimuli used in this
study were a video camera and a mirror.
This condition will be referred to as
the video-mirror condition. Note that a different
manipulation was used by Brockner and
Hulton, who had subjects work on the task
before an audience that was observing them
from behind a one-way mirror. To strengthen
a possible attentional interpretation of the re- 
sults, a different operationalization of self- 
focused attention was employed in the present 
experiment. One third of the subjects were 
provided with the identical task-focusing in- 
structions used by Brockner and Hulton (the 
task condition). The other third of the sub- 
jects received both the self-focusing and task- 
focusing manipulations (the combined condi- 
tion). In sum, a two-factor (SE x Attentional 
Treatment) between-subjects experiment was 
conducted. It was expected that the manip- 
ulations of attentional focus would have the 
most pronounced effects on the low SEs’ task 
performance. Following the results of Brock- 
ner and Hulton (1978), it was predicted that 
compared to high SEs, low SEs would per-
form significantly worse in the video-mirror
condition and significantly better in the task 
condition. Given the seemingly greater sali- 
ence of the self-focusing versus the task-fo- 
cusing manipulation, it was expected that the 
low SEs would perform somewhat worse than 
high SEs in the combined condition. 


Method 


Participants 


The 88 volunteer subjects (53 male, 35 female) 
were summer school students at a Vermont state 
university. Each subject received $3 for participating. 


Materials 


Self-esteem inventory. The inventory was identical
to the one used by Brockner and Hulton (1978)
and Shrauger (1972). The scale measured the subjects’
perceived competence across 16 various situations
(e.g., academic, athletic, social). Subjects had
to indicate the percentage of time a particular behavior
or outcome applied to them (e.g., “When
meeting new people for the first time, what percentage
of the time are you able to impress them favorably
and form good relations?”). Nine items were
worded so that higher percentages indicated higher
SE, whereas seven were phrased so that lower percentages
indicated higher SE. For the latter items,
the subjects’ percentage ratings were subtracted from
100. The average score for all 16 items was computed
such that higher scores represented higher SE. The
mean score for all subjects was 70.36 (SD = 9.40).
Subjects were classified as high or low SEs on the
basis of a median split (high SEs: M = 77.27, SD
= 5.91; low SEs: M = 63.14, SD = 6.49). Subjects
were randomly assigned to predetermined attention
conditions. As a result there was an unequal distribution
of subjects across conditions. Nevertheless, within
each SE level the mean SE scores across conditions
were virtually identical.
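
The scoring rule just described (reverse the seven negatively keyed items by subtracting from 100, average all 16 items, then split at the median) can be sketched in Python as follows. This is an illustration under stated assumptions, not the original scoring procedure: the positions of the reverse-keyed items and the handling of scores that fall exactly on the median are not given in the text, so both are hypothetical choices here.

```python
from statistics import mean, median

N_ITEMS = 16


def score_inventory(ratings, reversed_items):
    """Average 16 percentage ratings after reversing the negatively keyed items.

    ratings: sequence of 16 percentages (0-100).
    reversed_items: zero-based positions of items where a lower percentage
        means higher self-esteem (hypothetical positions; the article does
        not list them).
    """
    if len(ratings) != N_ITEMS:
        raise ValueError("expected 16 item ratings")
    adjusted = [
        100 - rating if index in reversed_items else rating
        for index, rating in enumerate(ratings)
    ]
    return mean(adjusted)


def median_split(scores):
    """Label each score 'high' or 'low' relative to the sample median.

    Assumption: scores equal to the median are labeled 'low'; the article
    does not say how ties were resolved.
    """
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


if __name__ == "__main__":
    reverse_keyed = set(range(9, 16))  # hypothetical: last seven items
    example = [80] * 9 + [20] * 7      # consistent responder
    print(score_inventory(example, reverse_keyed))
    print(median_split([60, 70, 80, 90]))
```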

Concept formation task. The task was identical to 
the one employed by Brockner and Hulton (1978). 
Specifically, 32 squares were attached to a 14 X 25 X
.25 inch (36 X 64 X .6 cm) wooden board. Each
square had five characteristics: number (one or two), 
size (large or small), color (black or white), shape 
(circle or square), and position (left or right). Be- 
neath each square a small hole had been drilled in the 
wooden board. 

The subject was told that the concept would con- 
sist of one, two, or three characteristics. The experi- 
menter selected a square that was an example of the 
correct concept. The subject’s task was to determine 
the concept by lifting up other squares, one at a 
time. If the square was an example of the correct 
concept, the word yes was written on a piece of oak
tag that had been inserted below the board. If
the square was not an example of the correct con-
cept, the word no had been written below. When
subjects believed that they had ascertained the cor-
rect concept, they wrote it on a slip of paper pro-
vided to them. At no time were subjects ever in-
formed of the accuracy of their responses. Subjects
were told that while “we are interested in the speed
and accuracy with which you determine the concepts,
it is more important to be accurate.” There were
seven trials, which all participants completed in an 
identical random order. After subjects completed each 
trial, the experimenter removed the underlying piece 
of oak tag for that trial and inserted the piece con- 
taining the feedback for the next trial. 
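
The structure of the task can be made concrete with a short Python sketch. It assumes that the 32 stimuli are the full set of combinations of the five two-valued characteristics (2^5 = 32), which matches the count given above but is not stated explicitly; the function names are hypothetical.

```python
from itertools import product

# The five two-valued characteristics named in the task description.
CHARACTERISTICS = {
    "number": ("one", "two"),
    "size": ("large", "small"),
    "color": ("black", "white"),
    "shape": ("circle", "square"),
    "position": ("left", "right"),
}


def all_stimuli():
    """Return every combination of characteristic values (assumed to be the 32 stimuli)."""
    names = list(CHARACTERISTICS)
    return [dict(zip(names, values)) for values in product(*CHARACTERISTICS.values())]


def feedback(stimulus, concept):
    """Return the word found under a stimulus: 'yes' if it matches every
    characteristic in the concept, otherwise 'no'."""
    matches = all(stimulus[name] == value for name, value in concept.items())
    return "yes" if matches else "no"


if __name__ == "__main__":
    stimuli = all_stimuli()
    concept = {"position": "left", "shape": "square"}  # a two-characteristic concept
    positives = [s for s in stimuli if feedback(s, concept) == "yes"]
    print(len(stimuli), len(positives))
```

Under this assumption a concept with k characteristics is exemplified by 2^(5 - k) of the stimuli, so concepts with more characteristics yield fewer "yes" responses.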


Procedure 


All participants were run one at a time. Upon en- 
tering the laboratory they were led to a cubicle and 
were told that the experiment would consist of a 
number of parts. In the first part they were asked 
to complete the self-esteem inventory. After com- 
pleting this measure they were led to another cubicle 
where they found the concept formation task and 
the accompanying instructions. Subjects were asked 
to read the instructions to themselves while the ex- 
perimenter simultaneously read them aloud. The in- 
structions informed subjects how to determine the 
concepts. In addition, subjects were provided a prac- 
tice trial to familiarize them further with the task. 
During the practice trial the experimenter directed 
the subject toward the correct concept whenever 
necessary. However, subjects always had to deter-
mine the concept themselves on the practice trial.

After the practice trial was completed, the focus- 
of-attention manipulations were introduced. In the 
video-mirror condition, the experimenter placed a 
large mirror (24 X 36 inches—61 X 91 cm), which 
had been hitherto unexposed, directly in front of
the subject. Because the mirror was so large, subjects
could see not only themselves but also the experimenter
observing them as they worked on the task.
Also, the experimenter uncovered a video camera and
pointed it directly at the subjects, saying:


Since we are interested in how people go about
forming concepts, a video camera has been placed
to your right. We will be making a videotape of
you and your concept formation strategies. This
tape will aid us greatly, as we will be able to
observe you an unlimited number of times in the
future when we play back the tape. Furthermore,
you will notice the mirror in front of you. This
will help the experimenter observe you as you are
working on the task.


The experimenter then switched on the video-recording
unit. In the task condition the experimenter
told the subjects:


OK, now that you are familiar with the task, let
me just give you the following advice. This task
is not extremely difficult, although it can be somewhat
tricky. Therefore, I can’t emphasize strongly
enough just how important it is that you maintain
your complete and undivided attention on the
task at all times. People who have done this experiment
in the past frequently mentioned the importance
of concentrating on the task at hand. So
just remember to keep focusing on the task as
much as possible. Again, I can’t emphasize this
strongly enough.


In the combined condition subjects received both
of the manipulations used in the conditions mentioned
above. The order in which the self-focusing
versus task-focusing stimuli were introduced was
counterbalanced. Since the order variable produced
no effects on performance, it will not be considered
further.

After the focus-of-attention manipulations had
been introduced, subjects began to work on the task.
The experimenter was seated adjacent to the subject,
recording the amount of time that she or he
took and the number of squares looked at for each
trial. While the experimenter was clearly aware of
the subjects’ focus-of-attention conditions, the
experimenter was not aware of their self-esteem scores.
After the seventh trial was completed, the experimenter asked
subjects to complete a postexperimental questionnaire
measuring their attentional focus and
anxiety while working on the task. In addition, subjects
were asked to write their perceptions of the
purposes of the study. None of the subjects was able
to guess the true nature of the study, and their
suspicions were unrelated to the actual [...] of
the experiment. Finally, participants were
debriefed, thanked for their participation, and
requested not to speak to future subjects about the
study.


Table 1
Task Performance, Study 1

                               Self-esteem
                          High                 Low
Measure               n     M     SD       n     M     SD

Performance in each condition
Focus of attention
  Video-mirror       14   1.86   1.59     16   2.81   2.55
  Combined           17   2.23   2.31     12   3.92   2.10
  Task               14   3.14   2.45     15   3.20   2.79

Internal analysis
Task concentration
  High               23   2.35   2.03     20   2.05   1.75
  Low                22   2.45   2.41     23   4.30   2.69

Note. Performance was measured by errors. Scores ranged from 0 to 9. Thus, the higher the score, the poorer
the performance.


Results and Discussion


Performance¹


Both errors of commission and errors of
omission were summed over the seven trials
to determine task performance. Thus, if the
subject had written a characteristic that was
not part of the concept, it was scored as an
error. If the subject had omitted a characteristic
that was part of the concept, an error
was assessed. For example, if the subject had
written “one” when the concept was “left
square,” three errors were assessed.
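
The scoring rule above amounts to counting the characteristics that appear in exactly one of the two sets (the written answer and the correct concept). A minimal Python sketch, with hypothetical function names, reproduces the worked example from the text:

```python
def count_errors(written, concept):
    """Errors on one trial: characteristics written but not in the concept
    (commission) plus characteristics in the concept but not written (omission)."""
    written, concept = set(written), set(concept)
    commission = len(written - concept)
    omission = len(concept - written)
    return commission + omission


def task_score(trials):
    """Total errors over all trials; each trial is a (written, concept) pair."""
    return sum(count_errors(written, concept) for written, concept in trials)


if __name__ == "__main__":
    # The example from the text: "one" written when the concept was "left square".
    print(count_errors({"one"}, {"left", "square"}))
```

In the example, "one" is a single error of commission and the missing "left" and "square" are two errors of omission, giving the three errors reported.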

A preliminary t test yielded no effect as a
function of sex of subject. Thus, all data were
analyzed in a two-factor unweighted-means
analysis of variance. The only effect to approach
significance was the self-esteem main
effect, F(1, 82) = 2.98, p < .10, with low
SEs making more errors. However, from
Table 1 it can be seen that this effect was
[...] in the video-mirror and combined
conditions (i.e., when the self-focusing
stimuli were present). Several preplanned t
tests² revealed that low SEs made more errors
than high SEs in the presence of the mirror
and camera, t(57) = 2.08, p < .025, one-tailed,
with no difference between groups in
the absence of these stimuli (i.e., in the task
condition; t < 1).

Focus-of-Attention Manipulation Checks 


Contrary to the results of Brockner and 
Hulton (1978), the task-focusing instructions 
had no ameliorating effect on the low SEs’ 
performance in the task condition. However, 
the focus-of-attention measures will also show 
that the task-focus instructions did not have 
their predicted effects (i.e., the manipulation 
did not work). To test for their attentional 
focus during task performance, subjects com- 
pleted the following items on the postexperi- 
mental questionnaire (all questions were com- 
pleted using 41-point scales, with higher 
scores representing greater endorsement of 
the statement): “While you were completing 
the task, (a) how self-conscious did you feel, 
and (b) how hard were you concentrating on 
the task?” On the self-consciousness item, 


1 For both studies only the error data will be re- 
ported. The amount of time subjects took and the 
number of squares they looked at for each trial were 
also recorded. However, individual subject variability 
was quite high on both measures. As a result there 
were no significant effects for either measure in the 
two studies (all Fs <1). 

2 Winer (1971) states that t tests are permissible
for meaningful comparisons planned prior to the
inspection of the data. Since an a priori directional
hypothesis had been proposed, the one-tailed test was
employed.

the only effect to emerge was a focus-of-atten-
tion main effect, F(2, 82) = 4.02, p < .025.
All subjects were more self-conscious in the 
task condition than in the other two. Like- 
wise, on the task concentration item, there 
was no evidence that the task-focusing in- 
structions enhanced the low SEs’ task-fo- 
cused attention. In fact, low SEs were least 
task focused in the task condition. 

In essence, then, the task-focusing instruc- 
tions were ineffectual in this study, as the per- 
formance and attentional focus data suggest. 
If so, one would expect low SEs to become 
more self-conscious and perform more poorly 
than high SEs not only in the video-mirror 
condition but also in the combined condition.
As already stated, low SEs did perform worse 
than high SEs in both conditions. Moreover, 
there is additional evidence that low SEs 
were more self-conscious in both conditions. 
In addition to completing the self-conscious- 
ness manipulation check, the subjects in the 
video-mirror and combined conditions only 
were asked, with the 41-point scale format, 
“While you were completing the task, (a) 
how aware of the mirror were you, and (b) 
how aware of the video camera were you?” 
Each of these items correlated highly with 
reported self-consciousness: self-consciousness 
and mirror awareness, r(57) = .48; self-consciousness and video camera awareness, r(57) = .58; both ps < .001. Therefore, all three items were summed into a self-focused attention index. As predicted, low SEs were significantly more self-focused than high SEs, t(57) = 1.85, p < .035, one-tailed.
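
The one-tailed probability attached to a planned t comparison of this kind can be checked numerically. The sketch below is illustrative only and is not the procedure reported in the article: it integrates the Student t density with Simpson's rule, and the function names `t_pdf` and `one_tailed_p` are hypothetical helpers introduced here.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # lgamma keeps the normalizing constant stable for large df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t, df, steps=2000):
    """Upper-tail probability P(T >= t) for t >= 0.

    Computed as 0.5 minus the Simpson's-rule integral of the density
    over [0, t]; steps must be even for Simpson's rule.
    """
    if t < 0:
        raise ValueError("this sketch only handles t >= 0")
    if steps % 2:
        steps += 1
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# The comparison reported above: t(57) = 1.85, one-tailed.
p = one_tailed_p(1.85, 57)
print(f"one-tailed p = {p:.4f}")
```

Under these assumptions the computed value falls between .025 and .05, which is consistent with the reported p < .035.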

Internal analysis. Although the task-focus 
instructions did not enhance the low SEs’ 
task concentration, one may still wonder if 
the low SEs who were concentrating on the 
task performed better than those who were 
not. Therefore, subjects were classified, by a 
median split on their task concentration 
scores, as high or low in task-focused atten- 
tion. The summary data for the subsequent 
two-factor (SE x Task Concentration) anal- 
ysis of variance are presented in Table 1. In- 
terestingly, an interaction effect was obtained, 
F(1, 84) = 4.73, p < .05, demonstrating that 
low SEs only performed worse than high SEs 
when subjects were not highly task focused. 


In fact, when both groups reported high task concentration, there was a nonsignificant tendency for low SEs to perform better than high SEs.

In sum, the present study replicated Brockner and Hulton’s (1978) finding that self-focusing stimuli cause low SEs to perform more poorly than high SEs. However, the task-focusing instructions did not work in this study: they did not increase task-focused, relative to self-focused, attention or enhance the performance of low SEs. Therefore, there were several reasons to perform an additional study. First, it was unclear if the task condition results were simply due to a failure of the manipulation. An alternative explanation is that the task-focus instructions also enhanced the low SEs’ ego involvement or evaluation apprehension. Quite possibly, these processes negated the potentially beneficial effects of the task-focus instruction on the low SEs’ performance. For example, Sarason (1972) has convincingly demonstrated that evaluative or ego-involving situations cause severe performance decrements for highly test-anxious subjects, a group who greatly resemble low SEs. Thus, it would be useful to know how low SEs would perform if there were independent evidence that the task-focus manipulation was successful. Specifically, if the low SEs’ performance also improved, the task condition results of Study 1 would seem to be due to a failure of the manipulation. However, if their performance did not improve even when there was evidence that the manipulation was successful, then the manipulation failure hypothesis would be less plausible.

Second, in light of the failure of the 

That is, is it the case that the self-focusing stimuli actually overpowered the task-focus instructions? Or were the task-focus instructions, as in the task condition, not successfully manipulated? It would seem that a more convincing test of the combined effects of self focusing and task focusing would be possible if it were at least shown that each stimulus reliably affected attention when manipulated in isolation.


Study 2 


The procedures used were in many ways 
identical to those employed in Study 1. There 
were, however, several important differences 
that should be noted. First, a control condi- 
tion was included, in which subjects received 
neither the self-focusing nor the task-focusing 
manipulations. Three studies (Brockner & 
Hulton, 1978; Shrauger, 1972; Shrauger & 
Terbovic, 1976) have shown no SE differences 
in performance on this task in control condi- 
tions. However, the addition of this condition 
made it possible to combine factorially the 
task-focusing and self-focusing variables. 
Thus, this experiment consisted of a three- 
factor design (SE X Task Focus X Self-Fo- 
cus). 

Second, the specific operationalization of 
self-focused attention was altered somewhat 
to provide further converging evidence on the 
joint effects of SE and self-focused attention 
on performance. That is, only the mirror, and 
not the video camera, was used in this study. 
Third, in addition to responding to the SE 
inventory, subjects completed an individual 
difference measure of self-consciousness—the 
scale recently developed by Fenigstein et al. 
(1975)—prior to performing the concept for- 
mation task. If the interfering effects of self- 
focusing stimuli and the enhancing effects of 
task focusing for low SEs are related to the 
higher self-consciousness of low SEs, then 
highly self-conscious subjects (high SCs) 
should respond like low SEs. Thus, the mirror should impair and the task-focus instructions should bolster mainly the high SCs. Similar findings for low SEs and high SCs (and for high SEs and subjects low in self-consciousness, low SCs) would provide converging evidence that performance was influenced by attentional processes.


Method 
Participants 


The 119 participants were drawn from an introductory psychology class at a New York state college. They received extra course credit for their participation. Of the 119 subjects, seven were eliminated for failure to understand the instructions. The proportion of eliminated subjects was not related to any independent variables. The final sample consisted of 112 subjects (50 male, 62 female) who were unequally distributed across the various experimental conditions.


Materials 


Self-esteem inventory. The scale was identical to 
the one used in Study 1. The mean score for all sub- 
jects was 68.41 (SD =9.57). Subjects were divided 
into three groups. The upper third of the distribution were labeled high SEs (M = 79.05, SD = 5.49); the middle third, medium SEs (M = 68.21, SD = 2.59); and the bottom third, low SEs (M = 57.97, SD = 4.21).

Self-consciousness inventory. Subjects were asked to indicate on 6-point scales (minimum = 0, maximum = 5) how much each statement was characteristic of them (e.g., “I reflect about myself a lot,” “I’m usually aware of my appearance”). Eighteen 
items were phrased so that a higher score repre- 
sented higher self-consciousness, and three items were 
worded so that lower scores reflected higher self- 
consciousness. For the latter items each score was 
subtracted from the maximum rating of 5. The 
scores for each subject were then summed over the 
21 items (minimum=0, maximum = 105), such 
that higher scores represented greater self-focused at- 
tention. The average score for all subjects was 64.72 
(SD = 11.81). A median split was employed to 
classify subjects as high or low SCs (M for high 
SCs = 73.9, SD = 6.56; M for low SCs = 54.87, SD 
= 7.46). As expected, SE and self-consciousness were 
negatively related, r(110) = —.30, p < .01. 
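
The scoring rule just described (reverse the three negatively worded items, sum the 21 ratings, then split the sample at the median) can be sketched as follows. This is a minimal illustration, not the authors' scoring program: the positions of the reverse-worded items and the treatment of scores that fall exactly on the median are assumptions, and `score_inventory` and `median_split` are hypothetical names.

```python
from statistics import median

MAX_RATING = 5  # each item is rated from 0 to 5


def score_inventory(ratings, reversed_items):
    """Sum the item ratings after flipping reverse-worded items.

    ratings: list of integer ratings (0..5), one per item.
    reversed_items: set of item positions (0-based) whose rating is
        subtracted from the maximum of 5 before summing. Which items
        are reversed is an assumption of this sketch.
    """
    total = 0
    for position, rating in enumerate(ratings):
        if not 0 <= rating <= MAX_RATING:
            raise ValueError(f"rating out of range at item {position}: {rating}")
        if position in reversed_items:
            total += MAX_RATING - rating
        else:
            total += rating
    return total


def median_split(scores):
    """Label each score 'high' if it exceeds the sample median, else 'low'.

    Placing ties in the 'low' group is an assumption; the article does
    not say how scores at the median were classified.
    """
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


# Hypothetical respondent: 21 items, the last three reverse-worded.
ratings = [5] * 18 + [0] * 3
print(score_inventory(ratings, reversed_items={18, 19, 20}))
print(median_split([50, 60, 70, 80]))
```

With 21 items rated 0 to 5, the total necessarily falls between 0 and 105, which matches the range stated above.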

Concept formation task. The task was the same 
one used in Study 1. 


Procedure 


As in Study 1, subjects were told that the study 
would consist of several parts. In the first part they 
were taken to a small cubicle and asked to complete 
the SE and self-consciousness inventories. Afterwards 
they were led to another cubicle to perform the con- 
cept formation task. When the instructions and 
practice trial were completed, the experimenter intro- 
duced the focus-of-attention manipulations. In the 
two self-focus conditions, subjects were told: 


One more point. Since we are interested in how 
people go about forming concepts, the experi- 
menter will be watching you as you are working 


3 The content of 2 items from this 23-item scale 
was very closely related to several procedural aspects 
of this study. Specifically, they were, “One of the 
last things I do before I leave my house is look in 
the mirror,” and “I have trouble working when 
someone is watching me.” To prevent undue suspi- 
cions from arising, these items were deleted from 
the questionnaire. 

Table 2 
Task Performance in Relation to Self-Esteem, Study 2 

                                          Self-esteem
                              High             Medium              Low
Self-focus  Task focus     n    M     SD     n    M     SD     n    M     SD
Present     Present        7  4.43  2.56     9  2.00  1.76    10  4.40  3.20
            (combined)
            Absent         6  2.50  2.63    14  3.93  2.40     9  7.22  5.81
            (mirror only)
Absent      Present       13  3.00  3.04     8  2.88  1.69     8  1.75  1.20
            (task only)
            Absent        11  4.18  1.99     7  3.86  1.81    10  3.60  2.15
            (control)

Note. Performance was measured by errors. Scores ranged from 0 to 20. Thus, the higher the score, the poorer the performance. 


on the task. Also, a mirror will be placed in 
front of you, making it easier for the experimenter 
to observe your behavior more closely. 


With that, the experimenter placed the large mir- 
ror directly in front of the subject. In the two 
task-focus conditions, subjects were given the same 
task-focus instructions used in Study 1. Essentially, subjects were randomly assigned to one of four focus-of-attention treatments: (a) mirror condition: subjects were only exposed to the mirror manipulation; (b) task condition: subjects were only exposed to the instructions designed to increase task-focused and decrease self-focused attention; (c) combined condition: subjects received the task-focus instructions first, followed by the introduction of the mirror; and (d) control condition: subjects received neither attentional focus manipulation. As in Study 1 the experimenter was not aware of the subjects’ SE or self-consciousness scores.

After completing the task subjects made ratings 
of their attentional focus and anxiety while perform- 
ing the task. In addition, they were asked to write 
their hypotheses and suspicions about the experi- 
ment, none of which was related to the true purpose 
of the study. Finally, all participants were fully 
debriefed. 


Results and Discussion 
Performance 


Performance in relation to self-esteem. A 
preliminary t test yielded no effect as a function of sex of subject. Thus, all data were analyzed with a three-factor unweighted-means analysis of variance.⁴ The summary 
data for this measure are presented in Table 
2. The analysis revealed a significant main 


effect for task-focus instruction, F(1, 100) = 3.94, p < .05, with subjects performing better when instructed to focus on the task. Although the SE × Task Focus interaction was not significant (p < .15), simple effects analyses revealed that the task-focus instructions only had a significant impact on the low SEs, F(1, 100) = 5.52, p < .025. High SEs were unaffected by the task-focus instructions (F < 1), and medium SEs only marginally benefited from being told to focus on the task (p < .20). As additional evidence that primarily low SEs were aided by the task-focus instructions, there was a marginal inverse relationship between SE and performance in the no-task-focus conditions, F(2, 100) = 2.36, p < .10, whereas no such tendency appeared in the task-focus conditions.
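
To make the pattern behind these simple effects concrete, the cell means in Table 2 can be averaged without weighting by cell size, which is the sense of "unweighted means" used here. The sketch below is only illustrative: it reproduces descriptive averages, not the F tests, it reads the two combined-condition entries as 4.43 and 2.00, and the names `TABLE2_MEANS` and `unweighted_mean` are hypothetical.

```python
# Mean errors from Table 2, keyed by (self_focus, task_focus) and then by SE level.
TABLE2_MEANS = {
    ("present", "present"): {"high": 4.43, "medium": 2.00, "low": 4.40},  # combined
    ("present", "absent"): {"high": 2.50, "medium": 3.93, "low": 7.22},   # mirror only
    ("absent", "present"): {"high": 3.00, "medium": 2.88, "low": 1.75},   # task only
    ("absent", "absent"): {"high": 4.18, "medium": 3.86, "low": 3.60},    # control
}


def unweighted_mean(se_level, task_focus):
    """Average the cell means for one SE level at one level of task focus.

    Each cell counts equally regardless of its n, so this collapses
    over the self-focus factor without weighting by sample size.
    """
    cells = [
        means[se_level]
        for (_self_focus, task), means in TABLE2_MEANS.items()
        if task == task_focus
    ]
    return sum(cells) / len(cells)


for level in ("high", "medium", "low"):
    with_task = unweighted_mean(level, "present")
    without_task = unweighted_mean(level, "absent")
    print(f"{level:>6} SE: task focus {with_task:.3f}, no task focus {without_task:.3f}")
```

On these figures low SEs average 3.075 errors with the task-focus instructions and 5.41 without them, whereas high SEs average 3.715 and 3.34, which is the direction the simple effects describe.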

The only other significant effect to emerge was a SE × Self-Focus interaction, F(2, 100) = 3.93, p < .025. Simple effects analyses revealed that only the performance of the low 


4 As in Study 1 the SE scores (within each level) did not differ by condition. Also, two experimenters (one male and one female) collected data in this study. A preliminary t test showed that subjects made more errors in the presence of one of the experimenters, t(110) = 2.56, p < .025. However, there was no Sex of Subject × Sex of Experimenter interaction. Moreover, since the experimenters ran approximately equal proportions of subjects in each condition, the data were collapsed across the experimenter variable. 


Table 3 
Task Performance in Relation to Self-Consciousness, Study 2 

                                Self-consciousness
                               High                Low
Self-focus  Task focus     n    M     SD      n    M     SD
Present     Present       11  3.64  3.25     15  3.53  2.50
            (combined)
            Absent        13  7.08  5.01     16  2.69  1.86
            (mirror only)
Absent      Present       18  3.00  2.56     11  2.00  1.86
            (task only)
            Absent        16  4.00  1.50     12  3.15  2.55
            (control)

Note. Performance was measured by errors. Scores ranged from 0 to 20. The higher the score, the poorer the performance. 


SEs was adversely affected by the mirror, F(1, 100) = 9.95, p < .01. Medium and high SEs performed equally well in the self-focus and no-self-focus conditions. In addition, a relationship between SE and performance was observed only in the self-focus conditions, F(2, 100) = 4.68, p < .01, with low SEs performing significantly worse than medium and high SEs (who did not differ). In the no-self-focus conditions, there was a nonsignificant (F < 1) inverse relationship between SE and performance. 

Performance in relation to self-consciousness. Summary data for this analysis are presented in Table 3. The analysis of variance revealed a significant main effect for self-consciousness, F(1, 104) = 6.68, p < .025, with high SCs making more errors than low SCs. A significant triple interaction, F(1, 104) = 5.27, p < .025, revealed that high SCs did particularly worse than low SCs in the mirror condition.

Of course, the main effect for task focusing was still present, F(1, 104) = 5.96, p < .025. Moreover, simple effects analyses showed that only the high SCs benefited from the task-focus instructions, F(1, 104) = 8.21, p < .01, paralleling the earlier findings for low SEs. The performance of the low SCs (like that of the high SEs) was not affected by the task-focus instructions.

Fenigstein et al. (1975) reported that the self-consciousness scale consisted of three separate factors: private self-consciousness, public self-consciousness, and social anxiety.⁵ They reported that 


Private self-consciousness was concerned with attend- 
ing to one’s inner thoughts and feelings, public self- 
consciousness was defined by a general awareness of 
the self as a social object that has an effect on others, 
and social anxiety was defined by a discomfort in 
the presence of others. Public and private self-consciousness refer to a process of self-focused attention; social anxiety refers to a reaction to this process. (p. 523) 


Accordingly, subjects were classified as high 
or low, with a median split for each of these 
three components of self-consciousness. Three 
three-factor analyses of variance were per- 
formed, with private self-consciousness, public 
self-consciousness, social anxiety, and the two 
focus-of-attention factors as independent vari- 
ables. Interestingly, individual differences in 
private and public self-consciousness produced 
no main effects and did not interact with 
either of the attentional focus manipulations. 
However, subjects high in social anxiety 
(high SAs) performed significantly worse 
than low-social-anxiety participants (low 
SAs), F(1, 104) = 6.61, p < .025. Simple ef- 


5 Although Turner et al. (1978) reported negative correlations between SE and each component of self-consciousness, only a correlation between SE and social anxiety was obtained in the present study, r(110) = −.54, p < .01. 

Table 4 
Self-Focused Attention Index, Study 2 

                                            Self-esteem
                               High               Medium                Low
Self-focus  Task focus     n     M      SD     n     M      SD     n     M      SD
Present     Present        7  37.14  24.87     9  19.11  25.58    10  39.10  26.2
            (combined)
            Absent         6  12.17  31.32    14  24.14  28.05     9  46.00  20.99
            (mirror only)
Absent      Present       13  28.15  32.15     8  23.50  26.46     8  20.25  18.1
            (task only)
            Absent        11  27.64  39.77     7   4.57  25.98    10  39.80  26.68
            (control)

Note. Index was summed over four questions. Scores ranged from −31 to 92, with higher scores reflecting greater self-focused attention. 


fects analyses demonstrated that this was par- 
ticularly true in the self-focus conditions, 
F(1, 104) = 6.92, p < .025. When the mirror 
was not present, high and low SAs did not 
reliably differ, F(1, 104) = 1.01. In addition, 
a triple interaction effect emerged, the nature 
of which was somewhat unexpected. Specifi- 
cally, it was only in the control condition that 
high SAs did not make more errors than low 
SAs, F(1, 104) = 4.42, p < .05. 


Focus-of-Attention Manipulation Checks 


Subjects completed the following items on 
the postexperimental questionnaire (once 
again, all questions were completed on 41- 
point scales, with higher scores representing 
greater endorsement of the statement): 
“While you are completing the task, (a) how 
self-conscious did you feel? (b) How much 
were you aware of the experimenter’s pres- 
ence? (c) How much did you find yourself 
thinking about how well or how badly you 
seemed to be doing? and (d) How much 
did you find yourself completely concen- 
trating on the task and nothing else? That 
is, how much would you say you were con- 
centrating on the task to the exclusion of 
any other thoughts?” All items correlated with Item 1, the most direct measure of self-consciousness (r for Items 1 and 2 = .52; r for Items 1 and 3 = .40; r for Items 1 and 4 = −.30; df = 110; p < .01 in all cases). Therefore, a self-focused attention index was computed for each subject by summing Items 1, 2, and 3 and subtracting Item 4. Higher scores corresponded to greater amounts of self-focused attention. 
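
As a concrete illustration of the index just defined, the composite is simply the three self-focus ratings added together minus the task-concentration rating. The sketch below assumes nothing about the numeric endpoints of the 41-point scales, and the function name `self_focus_index` and the example ratings are hypothetical.

```python
def self_focus_index(item1, item2, item3, item4):
    """Self-focused attention index: Items 1 + 2 + 3 minus Item 4.

    item1: rated self-consciousness during the task
    item2: awareness of the experimenter's presence
    item3: thinking about how well or badly one was doing
    item4: complete concentration on the task (subtracted, because it
           correlated negatively with Item 1)
    """
    return item1 + item2 + item3 - item4


# Hypothetical ratings for two respondents.
print(self_focus_index(30, 25, 20, 10))  # strongly self-focused respondent
print(self_focus_index(5, 8, 4, 35))     # strongly task-focused respondent
```

Because Item 4 is subtracted, a respondent who reports high task concentration and little self-consciousness receives a negative score, which is why the observed range of the index extends below zero.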
A three-factor unweighted-means analysis of variance was performed, for which the summary data can be found in Table 4. From inspecting the data one finds that the scores tend to be aligned with the error data presented in Table 2. The only anomalous finding occurred in the medium SE-control condition. There the self-focused attention scores were unusually low and not at all consistent with the error data observed in that cell. The only significant effect to emerge was a main effect for SE, F(2, 100) = 3.37, p < .05, with low SEs most and medium SEs least self-focused. However, several simple effects paralleled those found for the performance data. For example, in the no-task-focus conditions low SEs were considerably more self-focused than medium and high SEs, F(2, 100) = [value illegible], p < .025. In the task-focus conditions, there were no differences between SE groups. Also, in the self-focus conditions, low SEs were marginally more self-focused than medium and high SEs, F(2, 100) = 2.52, p < .10. No such trend emerged in the no-self-focus conditions.
Additional evidence of the success of the task-focus variable on the low SEs stems from the results on the measure of task-focused attention (Item 4). Just as the performance data showed that primarily low SEs were affected by the task-focus factor, the analyses on this measure revealed that the low SEs were most influenced by this manipulation. While the SE × Task Focus interaction was only a trend (p < .11), simple effects analyses showed that in the no-task-focus conditions, low SEs were considerably less task focused than medium and high SEs, F(2, 100) = 3.40, p < .05, whereas all three groups were equally task focused in the task-focus conditions (F < 1). Low SEs also tended to be more task focused in the task-focus than in the no-task-focus conditions, F(1, 100) = 3.34, p < .07, whereas medium and high SEs were not (both Fs < 1). 

As further evidence of the role of attentional processes on performance, there was a significant correlation between subjects’ scores on the self-focused attention index and the number of errors they committed, r(110) = [value illegible], p < .01. 


General Discussion 
Summary and Replications 


Both studies demonstrated that (only) the performance of the low SEs was quite malleable across conditions. In Study 2 the self-focusing stimulus caused low SEs to perform worse than (a) medium and high SEs who were exposed to the same stimulus and (b) low SEs who were not exposed to the mirror. In Study 1 low SEs only performed more poorly than high SEs in the presence of the self-focusing stimuli. These data replicate Brockner and Hulton (1978) and Shrauger (1972) by showing that self-focusing stimuli (operationalized differently here than in the previous studies) can exacerbate performance decrements for low SEs. 

Similarly, task focusing seemed to have positive effects primarily for the low SEs. In Study 2 the task-focus instructions caused low SEs to perform better than low SEs who were not provided with such instructions, whereas medium and high SEs were unaffected by this manipulation. In Study 1 the internal analysis showed that low SEs who reported high levels of task concentration made significantly fewer errors than low SEs whose task concentration was low. That anal- 
ysis also showed that high SEs did equally 
well, regardless of their task concentration 
level. Both studies thus confirm Brockner and 
Hulton’s (1978) finding that low SEs espe- 
cially benefit from increased task-focused at- 
tention. However, there are some important 
differences between the present results and 
Brockner and Hulton’s data that should be 
made explicit. Brockner and Hulton discov- 
ered that low SEs made significantly fewer 
errors than high SEs when both groups were 
instructed to focus on the task. Low SE-task 
condition subjects, however, did not perform 
significantly better than low SEs in their con- 
trol condition, although the means were in 
that direction. In the present studies the find- 
ing that low SEs performed better than high 
SEs when task focusing was high was not 
replicated (i.e., in the task condition in 
Study 2 and the internal analysis of SE and 
task concentration in Study 1), although the 
means were in that direction in both studies. 
However, unlike Brockner and Hulton, both of 
the studies reported here demonstrated that 
low SEs who were task focused performed 
significantly better than less task-focused 
low SEs. 

That the self-focusing and task-focusing 
stimuli generally produced replications of 
previous results is all the more striking given 
the differences in subject populations. As sug- 
gestive evidence of the heterogeneity of the 
samples, the subjects in the present Study 2 
overall made 71% more errors than Brockner 
and Hulton’s subjects, while the Study 1 sam- 
ple made 32% more errors. However, in spite 
of these overall performance differences, the 
relative differences in performance between 
conditions were in most cases remarkably con- 
sistent across all three studies. 


Extensions 


The combined conditions, which were ex- 
tensions of the Brockner and Hulton (1978) 
study, produced inconsistent results. The low 
SEs’ performance in Study 2 demonstrated 
that when the task-focus instructions work 
they can actually neutralize the effects of the 
mirror. From the low SE column of Table 2 it 
can be seen that relative to the control condition, performance was impaired in the mirror 
condition and enhanced in the task condition. 
The low SEs’ performance in the combined 
condition, however, was nearly identical to 
that found in the control condition. It is in- 
teresting that the task-focus instructions 
could functionally neutralize the mirror, given 
the relative objective salience of the two 
stimuli. That is, the task-focus instructions 
consisted of several sentences that were only 
read to subjects right before they started to 
work on the task. The large mirror, however, 
was a very salient self-focusing stimulus that 
remained present throughout the course of 
the experiment. One could reasonably argue 
that the possible ameliorating effects of the 
task-focus instructions were provided a fairly 
conservative test when combined with the 
omnipresent mirror. That the instructions 
actually neutralized the mirror suggests their 
potential utility for low SEs in situations that 
are likely to cause high self-consciousness, 
poor performance, and perhaps exacerbated 
low SE. 

Of course, the above remarks must be tem- 
pered by the combined condition results 
found in Study 1. However, given that the 
task-focus instructions produced no beneficial 
effects for low SEs when administered in isola- 
tion (i.e., in the task condition of Study 1), 
it is perhaps not surprising that they were 
ineffective when used in conjunction with the 
mirror and video camera. 


Attentional Focus and Performance 


One weakness of the Brockner and Hulton 
(1978) study was its failure to provide 
highly convincing evidence that the low SEs’ 
performance was influenced by attentional 
processes. Fortunately, there is now consid- 
erable evidence to suggest that for low SEs, 
(a) the manipulations had their intended ef- 
fects on attentional focus (except for the 
task-focus instructions in Study 1) and (b) 
attentional focus influenced performance. 
First, both studies showed that self-focusing 
stimuli had their most pronounced effects on 
the self-focused attention of low SEs. Ma- 
nipulation checks from Study 2 (see Table 4) 


showed that when the data from the mirror and combined conditions were taken together, low SEs were more self-focused than medium and high SEs. The data from Study 1 also revealed that low SEs were more self-focused than high SEs in the presence of the mirror and video camera.⁶ Second, there was evidence that the task-focus instructions in Study 2 increased their task-focused attention. Specifically, when subjects were not told to task focus, low SEs were significantly more self-conscious and less task focused than medium and high SEs. However, when subjects were instructed to concentrate on the task, there were no self-focusing or task-focusing differences between the three SE groups. 
Given that the manipulations generally had their desired effects, what evidence was there that attentional focus affected performance? First, recall that in Study 2 there was a significant correlation between subjects’ scores on the self-focused attention index and the number of errors they committed. Furthermore, from Study 2 the relationships between the individual difference measure of self-consciousness, the attentional manipulations, and performance were particularly illuminating. As stated earlier, if the performance of the low SEs in the various conditions is related to their higher self-consciousness, then one should find similar performance data for low SEs and high SCs, and for high SEs and low SCs. Both predictions received firm support. Like low SEs, only high SCs (a) were impaired by the self-focusing stimulus, particularly in the mirror condition, and (b) improved when instructed to task focus. 
In Study 1 the task-focus manipulation did not have its desired effect on the low SEs’ focus of attention. Given the present evidence that attention influences performance, it is not surprising that the task-focus instructions had no appreciable effect on their performance. Therefore, the most parsimonious explanation of the null effect of the task-focus 



6 These data serve to extend Carver and Scheier’s (1978) recent verification that mirrors increased self-focused attention. That is, their analysis would seem especially true for low SEs. 

instructions on the low SEs’ performance may 
be the failure of the manipulation to influence 
attention as intended, rather than the ego 
involvement hypothesis posed earlier. Note 
that in studies showing independent evidence 
of the task-focus instructions’ effect on the 
low SEs’ attention (Study 2; Brockner & Hul- 
ton, 1978), the performance of the low SEs 
improved significantly. Moreover, even when 
the task-focus instructions did not influence 
the low SEs’ attention (Study 1) the internal 
analysis suggested that attentional factors af- 
fected their performance. 


Self-Esteem, Attentional Focus, Anxiety, and 
Performance 


If the low SEs’ attentional focus influenced 
their performance, one still needs to con- 
sider the theoretical bases for this relation- 
ship. Two hypotheses were proposed earlier. 
First, if low SEs are too self-preoccupied, they 
simply will not be able to devote adequate 
attention to the task. Second, self-focused at- 
tention may be especially anxiety provoking 
for low SEs, leading to performance decrements on complex tasks like the one used in 
the present studies. With no attempt to dis- 
count the first possibility (indeed, the two 
may be complementary rather than competing 
hypotheses), evidence from these and other 
studies suggests that (a) self-focusing stimuli 
increased the low SEs’ anxiety in both studies, 
whereas the task-focus instructions decreased 
their anxiety in Study 2, and (b) anxiety, in 
turn, mediated task performance. 

Low SE, self-focused attention, and anxiety. 
On theoretical grounds there is good reason 
to believe that the self-focusing stimuli increased the low SEs’ anxiety. It seems reasonable that the low SEs entered the test situation with higher levels of anxiety than high SEs. Indeed, researchers have consistently discovered inverse relationships between SE and general or test anxiety (Crandall, 
1973). Moreover, recent research by Scheier 
and Carver (1977) showed that self-focused 
attention serves to intensify an individual’s 
affective or emotional experience. Thus, it 
would appear that low SEs should have become even more aware of their anxiety when self-focused. In fact, both studies showed that low SEs reported greater anxiety than high SEs in the presence of self-focusing stimuli.⁷ 
In Study 2 this tendency was expressed as a 
SE x Self-Focus interaction, F(2, 100) = 
3.32, p < .05, such that in the self-focus, but 
not in the no-self-focus, conditions there was 
an inverse relationship between SE and anxiety. In Study 1 the t test showed that low SEs were also more anxious than high SEs in the video-mirror and combined conditions, t(57) = 1.61, p < .06, one-tailed, but not in the absence of the self-focusing stimuli (task condition, t < 1). 

Furthermore, recall that Fenigstein et al.’s 
(1975) social anxiety subscale measures the 
extent to which people become anxious when 
self-aware. As additional evidence of the link between self-focused attention and anxiety for low SEs, there was a pronounced inverse correlation between SE and social anxiety in Study 2 (r = −.54).⁸ 

Self-focused attention, anxiety, and per- 
formance. There is also evidence that the 
anxiety that results from self-focused atten- 
tion impaired performance on the task used 
in the present studies. Specifically, it was 
found that in the self-focus conditions in 
Study 2, subjects high in social anxiety per- 
formed significantly worse than low SAs. 
However, in the no-self-focus conditions, high 
and low SAs did not differ. Furthermore, test 
anxiety researchers (Geen & Gange, 1977; 
Wine, 1971) have reported that the presence 
of an audience produces additional perform- 
ance decrements for highly test-anxious sub- 
jects. Given that audiences increase self-fo- 
cused attention (Carver & Scheier, 1978), it 
is possible that they increased the already 
high anxiety awareness level of test-anxious 


7 Subjects were asked, “While you were completing 
the task, how anxious did you feel?” Their answers 
were recorded on 41-point scales. 

8 While SE and social anxiety are inversely related, 
the nature of the anxiety produced by the low SEs’ 
self-focused attention is still unclear. One possibility 
is that the anxiety is due to evaluation apprehension. 
This interpretation seems plausible, given that sub- 
jects high in social anxiety seem to be more con- 
cerned or worried about being evaluated by others 
(Turner, 1977). 

subjects (Scheier & Carver, 1977), leading 
to additional performance decrements. 

Just as the self-focusing stimuli seemed to 
intensify the perceived anxiety levels of low 
SEs, it is possible that the task-focus in- 
structions reduced felt anxiety. In Study 2 
low SEs in the task condition did report lower 
levels of anxiety than low SE-control condition
subjects, t(16) = 2.26, p < .05, two-tailed.
As already noted, low SE-task subjects
made fewer errors than low SE-controls. 
Moreover, the test anxiety literature shows 
that manipulations designed to increase task- 
focused and decrease self-focused attention 
have caused dramatic improvements in the 
performance of highly test-anxious subjects 
(Sarason, 1958, 1973, 1975). It may be that 
their performance increments are accom- 
panied or even caused by decreases in their 
usually overinflated levels of felt anxiety. 


Conclusion 


The present studies generally confirm the 
hypothesis that the task performance of low 
SEs is inversely related to their self-focused 
attention. To address an important practical 
issue, future investigators should consider the 
effect of the low SEs’ performance on their 
subsequent SE. The implication set forth 
earlier was that subsequent SE should covary 
with task performance. However, the relation- 
ship is undoubtedly more complex. For ex- 
ample, how much performance improvement 
is necessary to bolster the self-evaluation of 
low SEs? How long lasting are such effects? 
In short, under what conditions will the low 
SEs’ performance affect their subsequent SE? 
To address an important theoretical issue, fu- 
ture researchers should provide additional evi- 
dence of the influence of attentional processes 
on performance. A promising technique was 
recently employed by Diener and Dweck 
(1978), who unobtrusively assessed their 
subjects’ task-focusing and self-focusing be- 
haviors.


Effects of Contrast and Generalization
on the Attitude Similarity–Attraction Relationship


John J. Seta, Lenny Martin, and George Capehart
University of North Carolina at Greensboro


An experiment was performed examining the phenomena of contrast and
generalization within the attitude similarity–attraction paradigm. Subjects read of
the attitudes of two strangers (A and B) whom they had met and who had both
initially agreed with these subjects on 50% of a number of topics. After a
brief period of time, each subject again met the two strangers. During this
second meeting, Stranger A continued to agree with the subject at a 50% rate,
whereas Stranger B agreed with the subject at a 100% rate. In one condition,
the two strangers were depicted as members of the same group, and in a second
condition, the two strangers were depicted as members of different groups. The
results indicated that after the second meeting, there was less of a difference
between the attractiveness of Strangers A and B in the same-group condition
compared to the different-group condition. In addition, within the
different-group condition, Stranger A was liked less after the second meeting than after
the first meeting, whereas within the same-group condition, Stranger A was
liked more after the second meeting than after the first meeting. These results
support the notion that the effects of contrast are accentuated when two
individuals being rated are distinct entities (members of different groups), whereas
generalization is accentuated when the two individuals are not distinct entities
(members of the same group). The above results are interpreted within the
Byrne-Clore reinforcement affect model.


In the area of attraction, a great deal of research
has supported the notion that attitude
similarity affects attraction (e.g., Byrne, 1969,
1971; Byrne & Nelson, 1965; Clore & Baldridge,
1968; Griffitt, 1971; Schachter, 1951).
The effects of attitude similarity on attraction
can best be described as a “positive linear
function of the sum of the weighted similar
attitudes divided by the total number of
weighted similar and dissimilar attitudes”
(Byrne, 1971, p. 71). The Byrne-Clore
reinforcement affect model (e.g., Byrne & Clore,
1970) has been the most parsimonious account
of the attitude similarity–attraction
paradigm. Briefly, they propose (Byrne,
1971) that the affective response elicited by
an individual (X) “is a positive linear function
of the sum of the weighted positive
reinforcements associated with X divided by the
total number of weighted positive and negative
reinforcements associated with X” (pp.
279-280).
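
The proportion quoted above from Byrne (1971) can be illustrated with a minimal Python sketch. The function names, the equal weights, and the slope and intercept values are assumptions introduced here for illustration only; the source states merely that the function is positive and linear.

```python
def proportion_positive(positive_weights, negative_weights):
    """Sum of weighted positive reinforcements divided by the total
    of weighted positive and negative reinforcements."""
    positive = sum(positive_weights)
    negative = sum(negative_weights)
    total = positive + negative
    if total == 0:
        raise ValueError("at least one weighted reinforcement is required")
    return positive / total


def predicted_response(positive_weights, negative_weights, slope, intercept):
    """Positive linear function of the weighted proportion.

    slope and intercept are hypothetical parameters, not values
    reported in the article.
    """
    if slope <= 0:
        raise ValueError("the function is described as positive, so slope > 0")
    return slope * proportion_positive(positive_weights, negative_weights) + intercept


# Second meeting, six topics, every topic weighted 1 (an assumption):
stranger_a = proportion_positive([1, 1, 1], [1, 1, 1])       # agrees on 3 of 6
stranger_b = proportion_positive([1, 1, 1, 1, 1, 1], [])     # agrees on 6 of 6
```

With equal weights the proportion reduces to the simple agreement rate, so Stranger A yields .50 and Stranger B yields 1.00. The question the article raises is whether those weights stay fixed when a second stranger is present.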

The present article deals with how the
weight or value of a positive reinforcer or
punisher associated with X is affected by the
presence of a second individual. That is, the
attractiveness of a particular individual should
depend not only on the reinforcers (e.g., agreements)
and punishers (e.g., disagreements)
obtained from that particular individual but
also on the reinforcers and punishers obtained
from other individuals in the situation. A
situation in which an individual meets two
strangers is briefly depicted. Individual X
simultaneously meets two strangers (Stranger
A and Stranger B) who both initially agree
with X on 50% of a number of topics. After
a brief period of time, X again meets these
two strangers and discusses a number of dif- 
ferent, yet equally important, topics. During 
this second meeting, Stranger A continues to 
agree with X at a 50% rate, whereas Stranger 
B now agrees with X on 100% of the topics. 


Opposing Predictions 


The question that can be asked is “Would 
the attractiveness of each of the strangers de- 
pend on the presence of the other?” Further, 
would Stranger A be liked more by X after 
their first or after their second meeting? Like- 
wise, would Stranger B be liked more after 
their second meeting? Predictions concerning 
this question can be derived from two sources. 
The first is the phenomenon of contrast (e.g., 
Helson, 1964). Since contrast occurs when the 
differences between two stimuli are accentu- 
ated, one would predict that X would like 
Stranger A less after their second rather than 
their first meeting, even though Stranger A’s 
proportion of agreements has not changed. 
Stranger A’s rate of agreement after the sec- 
ond meeting should seem less valued, since 
Stranger B has increased his or her agreements
from 50% to 100%. In contrast,
Stranger B should be liked more after the second
meeting, since not only has the rate of
agreement increased but the rate has also
increased relative to that of Stranger A.
Experiments by Mascaro and Graves (1973) and
[illegible] (1971) support the notion of a contrast
effect within the attitude similarity–attraction
paradigm. Mascaro and Graves
(1973) found that a second stranger’s attractiveness
was affected by the first stranger’s rate
of agreement.¹ That is, the second stranger
was liked less as the first stranger’s rate of
agreement increased relative to that of the
second stranger.

An opposing prediction can be made from
the effects of stimulus generalization. Since
stimulus generalization occurs when different
stimuli evoke a common response, one would
predict that X would like Stranger A more
after their second meeting than after their
first. That is, Stranger B’s 100% agreement
might positively affect the value of
Stranger A’s 50% rate, since A and B’s agreement
rate might be averaged if the two individuals
are perceived as a unit. An experiment
by Wyant, Lippert, Wyant, and Moring
(1977) seems to support the concept of
generalization within the attitude similarity–attraction
paradigm. They found that the effects
of one stranger’s evaluation generalized to a
second stranger. In this experiment, the second
stranger was depicted as being similar to
the first.


Contrast and Generalization 


The phenomena of contrast and generaliza- 
tion may occur under somewhat different con- 
ditions. Contrast should predominate when 
the individuals being judged are seen as dis- 
tinct entities or members of different groups, 
whereas generalization should predominate 
when the individuals being judged are not 
distinct, being members of the same group. In 
essence, the attractiveness of a particular indi- 
vidual should depend not only on the rein- 
forcers and punishers obtained from that par- 
ticular individual but also on the reinforcers 
and punishers obtained from other individuals. 
If there is an association, then the affective 
response associated with each of the individ- 
uals should generalize. That is, there should 
be a tendency to average the reinforcers ob- 
tained from each individual and therefore 
minimize differences between them. However, 
if the two individuals come from different 
groups, a contrast effect should be pro- 
nounced. That is, the reinforcers and pun- 
ishers and, hence, the affective response as- 
sociated with one individual should be com- 
pared to the affective response associated with 
the second individual. Given that there are
differences in the absolute magnitude of
reinforcement obtained from the two individuals,
a contrast effect would accentuate these
differences.


1 In this study, subjects first met one stranger and
then at some later time met a second stranger.
Specifically, their results supported the notion of what
can be labeled successive contrast. However, their
results do not support the existence of what can be
labeled simultaneous contrast. Simultaneous contrast
would be supported if the subjects met the two
strangers simultaneously and a contrast effect was
obtained. The present experiment tests the notion of
simultaneous contrast within the attitude similarity–attraction
paradigm.

The present experiment was performed in 
an attempt to answer the following questions: 
First, does the evaluation that is given to an 
individual depend on the percentage of agree- 
ments (rewards) given by a second individ- 
ual? Second, under what conditions would a 
contrast effect be accentuated and under what 
conditions might a generalization effect be 
accentuated? In this experiment, each subject
initially met two strangers (A and B) who
independently agreed with the subject on 50%
of the topics. After rating both strangers, the
subjects were presented with additional at- 
titudes of the two strangers on six new topics 
and again were asked to independently rate 
the attractiveness of the two strangers. Dur- 
ing this second meeting, Stranger A continued 
to agree with the subject at a 50% rate, 
whereas Stranger B now agreed with the sub- 
ject at a 100% rate. In the first condition 
(different-group condition), Strangers A and 
B were depicted as members of different 
groups (one stranger was a math major, and 
the other was a drama major), whereas in a 
second condition (same-group condition), both 
strangers were depicted as members of the 
same group. (Both strangers were math ma- 
jors or both strangers were drama majors.) 
As earlier mentioned, the effects of contrast 
should be accentuated when an individual con- 
centrates on the distinctive properties of 
others (different-group condition), whereas 
generalization should be accentuated when an 
averaging of differences is made (same-group 
condition). That is, after the second meeting 
there should be less of a difference between 
the attractiveness of Strangers A and B in 
the same-group condition compared to the 
different-group condition. Differences should 
be accentuated in the different-group condi- 
tion (contrast predominates) and minimized 
in the same-group condition (generalization 
predominates). 


Method 
Subjects 


Thirty-six students (31 females and 5 males)
from two junior-level psychology classes were randomly
assigned to either the same- or different-group
condition. There were 18 subjects (15 females and
3 males) in the different-group condition and 18
subjects (16 females and 2 males) in the same-group
condition. Subjects were run in groups. One group
consisted of 24 subjects (12 in the same-group condition
and 12 in the different-group condition), and
the second group consisted of 12 subjects (6 in the
same-group condition and 6 in the different-group
condition). All subjects volunteered their participation.


Experimental Design 


A one between (same or different) and two within
[Persons A or B and trials (first or second meeting)]
repeated measures analysis of variance was performed.
In the same-group condition, Strangers A
and B were depicted as either both math or both
drama majors, whereas in the different-group condition
one stranger was depicted as a math major and
the other as a drama major. The order of depiction
was counterbalanced.


Procedure 


Consistent with the phantom-other procedure (e.g.,
Byrne & McGraw, 1964), subjects first filled out a
standard questionnaire defining their attitudes (cf.
Byrne & Nelson, 1965). Approximately 3 weeks after
filling out the standard questionnaire defining their
attitudes, all subjects were presented with typed
instructions and the purported attitudes of Strangers
A and B. After reading the instructions, subjects in
both conditions read of the attitudes of Strangers A
and B (both agreed on 50% of the six topics) and
were then asked to rate each stranger separately.
All subjects were asked to respond to three questions.
Each question contained a 7-point scale, where
1 corresponded to like and 7 corresponded to dislike.
The three questions were: (a) How much would you
enjoy working further with this person? (b) How
much did you like the stranger? and (c) How much
would you like to continue a relationship with the
stranger? After rating the strangers, all subjects were
presented with additional attitudes of Strangers A
and B. This time Stranger A agreed with the subject
on 50% of the six new topics, whereas Stranger
B agreed with the subject on 100% of the six new
topics. All subjects were then asked by typed
instructions to again rate Strangers A and B.

As a manipulation check, a preexperimental questionnaire
was given to 12 subjects who were similar
in academic standing to the subjects used in the
present study. Using only information that was given
to them by the experimenter, they were asked to
answer two questions: (a) Who do you feel are more
similar, two math majors or a math major and a
drama major? (b) Who do you feel are more similar,
two drama majors or a math major and a drama
major? The phrasing of the above questions was
counterbalanced. All subjects felt that the two math
majors were more similar than the math major and
the drama major and that the two drama majors
were more similar than the math major and the
drama major. A postexperimental interview was
also given to 10 subjects. They were asked if they
had read the entire instructions and attitudes of the
strangers. In addition, they were asked if they knew
that they were rating two different individuals and
if they remembered if they had rated two individuals
who had the same major or two individuals who had
different majors.


Results and Discussion 


A between (same or different)–within [Persons
A or B and trials (first or second meeting)]
repeated measures analysis of variance
was performed. Since all three measures were
nearly identical, the total of all three measures
was used in the analysis. (See Table 1
for the means of each of the experimental
conditions.) The analysis revealed a significant
Group (same or different) × Person
(Strangers A or B) × Trials interaction,
F(1, 34) = 55.30, p < .001; a Person ×
Trial interaction, F(1, 34) = 83.99, p < .001;
and a Group × Person interaction, F(1, 34)
= 5.93, p < .05. In addition, consistent with
the Byrne-Clore (1970) framework, a person
main effect, F(1, 34) = 87.32, p < .001,
and a trials main effect were obtained,
F(1, 34) = 106.43, p < .001. All subjects
that were asked stated that they had read all
of the instructions and the attitudes of the
strangers. In addition, the subjects were aware
of whom they had rated (e.g., two math
majors).


Results Supporting Contrast and 
Generalization 


To interpret the significant interactions a
Tukey (a) test was performed on all logical
comparisons of paired means. (See Table 1
for a summary of the comparisons.) Within
the different-group condition, Stranger A was
liked less after the second meeting than after
the first meeting (p < .05), whereas within
the same-group condition, Stranger A was
liked more after the second meeting than after
the first meeting (p < .05). These results support
the contrast–generalization interpretation,
since within the different-group condition,
contrast should predominate, whereas
in the same-group condition, generalization
should predominate. In the different-group
condition, Stranger A’s 50% rate of agreement
should be contrasted to Stranger B’s
100% rate, resulting in a reduced level of
attraction for Stranger A. Within the same-group
condition, Stranger B’s 100% agreement
rate should generalize and positively
affect the value of Stranger A’s 50% rate.
Although the comparisons did not reach
significance, after the second meeting Stranger
B was liked somewhat more in the different-group
condition than in the same-group condition,
whereas after the second meeting
Stranger A was liked somewhat more in the
same-group condition than in the different-group
condition. This result is consistent with
a contrast-generalization interpretation. In
the different-group condition, contrast should
predominate, whereas in the same-group condition,
generalization should predominate.


Table 1
Attraction Score Means for Each Condition

                       Person A            Person B
Condition           Trial 1  Trial 2    Trial 1  Trial 2
Different-group      11.50    13.61      13.55     7.66
Same-group           13.33    11.55      12.61    10.00

Note. The higher the score, the less the attraction.
Range of scores is 3-21. The following is a summary
of the significant logical comparisons among paired
means using the Tukey (a) test with the level of
significance set at .05. A pooled error term was
employed as suggested by Winer (1971). For
comparisons between Trials 1 and 2 for Person A
or Person B, the critical difference was 1.43. In the
different-group condition, Person A was liked
significantly less after the second than after the first
meeting. In the same-group condition, Person A was
liked significantly more after the second than after
the first meeting. In both the same and different
conditions, Person B was liked significantly more
after the second than after the first meeting. None
of the comparisons between the different- and
same-group conditions reached significance. The
critical difference for these comparisons was 4.75.
For comparisons between Persons A and B within
either the different- or same-group conditions, the
critical difference was 1.26. During the first meeting,
Person A was liked significantly more than Person B
in the different-group condition. During the second
meeting, Person B was liked significantly more than
Person A in both the same- and different-group
conditions.

Stranger B was liked more after the second 
meeting than after the first meeting in both 
the same- and different-group conditions (p 
< .05). In both the same-group and different- 
group conditions, Stranger B’s attraction 
should have increased from the first to the 
second meeting because of Stranger B’s in- 
crease in agreement rate (50%-100%). How- 
ever, the increase in attraction from the first 
to the second meeting was greater for B in 
the different-group than in the same-group 
condition. (In the different-group condition 
the difference was 5.89, whereas in the same- 
group condition the difference was 2.61.) This 
should have been the case, since in the same- 
group condition, generalization should pre- 
dominate, whereas in the different-group con- 
dition, contrast should predominate. In the 
same-group condition, Stranger B’s attraction 
might have been lessened due to B’s associa- 
tion with A (A’s rate is lower than B’s); in 
the different-group condition, Stranger B’s at- 
traction should have increased because of A’s 
low rate of agreement. Finally, after the sec- 
ond meeting, there was less of a difference be- 
tween the attractiveness of Strangers A and 
B in the same-group compared to the differ- 
ent-group condition. An analysis of variance 
on difference scores revealed that after the 
second meeting, there was less of a difference 
between the attractiveness of Strangers A and 
B in the same-group condition (28 difference)
compared to the different-group condition
(107 difference), F(1, 34) = 67.9, p < .001.
This should have been the case, since differ- 
ences should be accentuated in the contrast 
condition and minimized in the generalization 
condition. 
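
The differences discussed in this section can be recomputed from the Table 1 means. The following Python sketch is illustrative only; the function and constant names are introduced here and do not come from the article. Reading the reported values of 28 and 107 as difference scores summed over the 18 subjects in each condition is an assumption, although the arithmetic is consistent with it.

```python
# Table 1 means as (Trial 1, Trial 2); higher scores mean less attraction.
MEANS = {
    "different": {"A": (11.50, 13.61), "B": (13.55, 7.66)},
    "same": {"A": (13.33, 11.55), "B": (12.61, 10.00)},
}
N_PER_CONDITION = 18            # subjects per condition (Method section)
CRITICAL_BETWEEN_TRIALS = 1.43  # Tukey critical difference, Trial 1 vs. 2
CRITICAL_BETWEEN_PERSONS = 1.26  # Tukey critical difference, A vs. B


def trial_change(group, person):
    """Trial 2 mean minus Trial 1 mean; negative means liked more."""
    first, second = MEANS[group][person]
    return round(second - first, 2)


def second_meeting_gap(group):
    """Person A mean minus Person B mean after the second meeting."""
    return round(MEANS[group]["A"][1] - MEANS[group]["B"][1], 2)


def summed_gap(group):
    """Mean gap times subjects per condition (assumed reading of 28/107)."""
    return round(second_meeting_gap(group) * N_PER_CONDITION)


for group in ("different", "same"):
    print(group,
          "A change:", trial_change(group, "A"),
          "B change:", trial_change(group, "B"),
          "A-B gap:", second_meeting_gap(group),
          "summed gap:", summed_gap(group))
```

Under these assumptions Stranger B's gain is 5.89 points in the different-group condition and 2.61 in the same-group condition, matching the values in the text, and every Trial 1 versus Trial 2 change exceeds the critical difference of 1.43.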

The above results support the notion that
the attractiveness of a particular individual
should depend not only on the reinforcers and
punishers obtained from that particular individual
but also on the reinforcers and punishers
obtained from other individuals. In addition,
the magnitude of the reinforcers and
punishers should depend on whether there is
or is not a strong commonality between the
two individuals. If there is an association,
then the affective response associated with
each of the individuals should generalize.
There should be a tendency to average the
reinforcers obtained from each individual. However,
if the two individuals are perceived as
distinct entities, a contrast effect should be
pronounced.


Consensual Validation 


As one can see from Table 1, after the first
meeting, subjects liked Stranger A more than
Stranger B in the different-group condition
(p < .05). Although no a priori predictions
were made concerning subjects’ first ratings
of Strangers A and B (both agreed with the
subject on 50% of the topics), these results
are quite interesting and explicable within
Byrne & Clore’s (1967) theoretical framework
if one assumes that a 50% agreement
rate is below that which the subjects are
accustomed to receiving. Byrne and Clore
(1967) argued that a possible reason why
increasing rates of agreement result in increments
in attraction is that attitudinal agreement
supplies an individual with consensual
validation. Hence, there should be greater
invalidation the greater the number of [illegible]
disagreeing with the subject. Due to the
above, subjects in the different-group condition
should like Stranger B less than Stranger
A because Stranger B’s general disagreement
results in a greater degree of consensual
invalidation. Further, since the weight of
consensual invalidation is greater for B than for
A, there should be a greater probability of
discounting the first stranger’s general
disagreement.

In sum, the results of the present experiment
support the notion that both contrast
and generalization must be considered when
investigating the effects of attraction. In
addition, these results suggest that the probability
of obtaining a generalization effect is
increased as the perceived commonality between
two individuals is increased. Further, the
probability of obtaining a contrast effect is
increased as the perceived commonality between
two individuals is decreased. The present
results indicate that certain personality
types should enhance the contrast effect:
Individuals who concentrate on individual differences
should demonstrate a pronounced
contrast effect when compared to individuals
who concentrate on the similarities among
people.
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The cognitive processes by which people infer whether individuals who possess 
one trait also possess another were examined in two reaction time experiments.
Subjects took less time to affirm and more time to deny that two traits
co-occurred, the greater the semantic similarity of the traits. As the amount of
recalled evidence required to affirm that two traits co-occurred was increased by
altering the nature of the co-occurrence statements, true response times
increased and false response times decreased. Although it was not possible to
determine whether the stored semantic “features” of a trait are locations on
meaning dimensions, specific behaviors, known people characterized by the trait, or
something else, the results nevertheless strongly suggest that implicit personality
inferences result from a two-stage process in which the second and more
detailed memory search stage is entered only if the similarity of the semantic
features of the traits falls in between two task-established decision criteria.


People are willing to infer that an individual
has one trait or characteristic from
information about the other traits or characteristics
possessed by that individual (Asch,
1946; Schneider, 1973). Inferences about the
nature of someone’s personality are made
with relative confidence even in the absence 
of direct (or behavioral) indicators of many 
of the inferred traits (Asch, 1946; D’An- 
drade, 1974; Fiske, 1978; Mischel, 1968; 
Shweder, 1975). This tendency to infer 
some traits from knowledge about other 
traits has been termed “implicit personality 
theory” (Bruner & Tagiuri, 1954; Schneider, 
1973). Several issues within this general topic 
area have received considerable research at- 
tention. Some studies have attempted to 
characterize the structure of the interrelation- 
ships among trait terms (see Rosenberg & 
Sedlak, 1972, for a review). Factor analysis 
models (e.g., Mulaik, 1964; Passini & Nor- 
man, 1966), multidimensional scaling models 
(e.g., Rosenberg & Sedlak, 1972), and hierarchical
clustering models (Johnson, 1967)
have all been explored and generally provide 
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interpretable representations. Other studies 
have been concerned with the way groups of 
traits combine to affect the amount of another
trait (e.g., likeableness) that is inferred
or perceived (Anderson, 1974). Finally, some
have examined the role of implicit personality
theory in other phenomena, such as memory
for a list of trait terms or behavior
descriptions (e.g., Cantor & Mischel, 1977).
Although these various approaches to the
study of implicit personality theory have
yielded a wealth of evidence about the type,
strength, and structure of the relationships
among people’s trait conceptions, little is
known about the nature of [...] processes
that allow these relationships to emerge
in different tasks. There is, however, a growing
body of literature within cognitive psychology
that focuses on such issues, but for
[...] information sets. For
example, in a classic study, Collins and Quillian
(1969) used a binary forced-choice
(true/false) decision task to study the nature
of semantic memory. Subjects were
asked to make [...] for models of the
structure of semantic information, of the
retrieval of that information from memory,
and of the inference or [...] processes that
governed how that information was used in
the construction of answers to such questions.

[...] who has Trait X [...] Y
(Bruner, Shapiro, & Tagiuri, 1958; Hays,
1958; Lay & Jackson, [...]) [...]
formulate such questions so [...]
Quillian by [...] verify whether “a(n) X
person is a(n) Y person,” where X and Y
are trait terms. Although these questions are
not identical to the noun questions typically
used in cognitive research, we can still examine
the length of time it takes subjects to
indicate whether a given class of objects
(people, in our case) possess particular
attributes (traits, in our case). The decision
time data can then be used in the same way
Collins and Quillian (1969) and others (e.g.,
Smith, Shoben, & Rips, 1974) have used
them.

Although the major predictions of such 
decision time experiments depend on the spe- 
cific nature of the search and decision pro- 
cesses that are proposed, it is usually the 
case that search processes are constructed to 
fit a particular theory about the structure of 
semantic memory. Two different structural 
representations that have been proposed for 
object categories seem most reasonable in 
the case of personality traits. In one view, a 
trait can be characterized in terms of the set 
of “people I have known” who possess that 
trait. The meaning of a trait stems from the
set of person examples associated with that
trait. People know what the trait term generous
means because they can retrieve from
memory one or more examples who seem to
possess the trait of generosity. In other
words, a set of remembered examples defines
the trait term.

A second view characterizes the meaning 
of a trait term as a list of more elementary 
features. Some of these features might be 
locations on underlying dimensions (see Ros- 
enberg & Sedlak, 1972) such as socially 
good-bad and/or dominant-submissive. Fea- 
tures might also be behaviors typically linked 
to that trait. For example, donating money 
to charity could be a feature of the trait of 
generosity. Here, the meaning of a trait term 
is derived from the list of basic features 
rather than from the sample of known people 
it activates in memory. These two models of 
the structure of personality trait terms sug- 
gest, on an intuitive basis, somewhat different 
inference processes for the type of binary 
forced-choice tasks outlined above. Let us 
consider these next. 


Exemplar Scanning Model 


If personality traits are represented in
memory not as a list of more basic features
but rather as sets of persons who have the
relevant characteristic (see Walker, 1975),
then, when asked to verify trait co-occurrence
statements (e.g., “A witty person is also
generous”), people might retrieve one or more
of the known persons who possess the first 
trait (witty) and examine this sample to de- 
termine whether the second trait (generous) 
is an attribute of the people in the sample. 
We can further assume, as Walker has done 
for object categories, that a “true” response 
will be made when a sufficient number of 
the person examples in the sample possess 
the second trait and that a “false” response 
will be emitted when an insufficient number 
of the exemplars in the sample possess the 
second trait. It should be emphasized that 
there is no need to assume that a subject 
considers the entire set of known people who 
possess the first trait. We need merely as- 
sume that a subject creates a sample, prob- 
ably consisting of the most available and/or 
typical people in the entire set (Kahneman 
& Tversky, 1972; Walker, 1975). In this 
model, the length of time a person takes to 
reach a decision should increase as the number
of examples that are examined increases.
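
A minimal Python sketch may help make the exemplar scanning idea concrete. Everything named below is an assumption made for illustration: the article says only that a “sufficient number” of sampled exemplars must possess the second trait, so the `required_proportion` threshold, the representation of each remembered person as a set of traits, and the use of sample size as a stand-in for decision time are all hypothetical.

```python
def exemplar_scan(sample, second_trait, required_proportion):
    """Verify a trait co-occurrence statement by scanning exemplars.

    sample: remembered people who possess the first trait, each given
        as a set of that person's traits.
    second_trait: the trait whose co-occurrence is being verified.
    required_proportion: hypothetical criterion for "a sufficient
        number" of exemplars possessing the second trait.

    Returns (response, exemplars_examined). The number examined stands
    in for decision time, which the model says grows with the number
    of examples examined.
    """
    if not sample:
        raise ValueError("the sample must contain at least one exemplar")
    examined = len(sample)
    matches = sum(1 for traits in sample if second_trait in traits)
    response = (matches / examined) >= required_proportion
    return response, examined


# Hypothetical sample of three remembered witty people.
witty_people = [
    {"witty", "generous"},
    {"witty", "generous", "shy"},
    {"witty", "stubborn"},
]
print(exemplar_scan(witty_people, "generous", required_proportion=0.5))
```

In this sketch a larger sample always means more exemplars examined, which is the model's decision time prediction.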


Feature Comparison Model 


Regardless of the intuitive plausibility of 
the person-exemplar model, beginning with 
Smith et al. (1974), several reaction time ex- 
periments, using the Collins and Quillian 
(1969) paradigm, have supported a some- 
what different view. Specifically, people might 
verify assertions about object categories on 
the basis of the overall similarity between 
lists of features that the two concepts (X 
and Y) activate in memory. This model as- 
sumes that when the similarity of the two 
feature lists is above some task-established
cut point, C_T, the participant responds
“true.” However, when the similarity of the
two feature lists is below another cut point,
C_F, a “false” response is made. When the
similarity of the feature lists falls between
these two cut points, as might happen for
assertions such as “A robin is an animal”
or “A dolphin is a fish,” a second stage of
processing is hypothesized, and therefore the 
time to reach a decision regarding such assertions 
should be longer. While Smith et al. 
(1974) argued that this additional process- 
ing consisted of an examination of the simi- 
larity of only the defining (as opposed to the 
most typical) features of the two concepts, it 
is sufficient for our purposes to argue that 
the additional processing requires that more 
features be examined than in the initial stage 
(or that the same ones be reexamined) with- 
out specifying the exact nature of the second 
stage. The most straightforward decision time 
prediction of this model is that the time to 
make “false” decisions should increase with 
the similarity of the two feature lists, whereas 
the time to make “true” decisions should 
decrease. 
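
The two-stage decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure specified by Smith et al. (1974): the function name `feature_comparison_decision`, the numeric similarity scale, and the `second_stage` callback are all hypothetical, and the second stage is left abstract because the text deliberately does not specify its nature.

```python
def feature_comparison_decision(similarity, c_false, c_true, second_stage):
    """Return (response, stages_used) for one assertion.

    similarity   -- overall similarity of the two feature lists (assumed numeric)
    c_false      -- lower cut point; below it a fast "false" response is made
    c_true       -- upper cut point; above it a fast "true" response is made
    second_stage -- hypothetical callable returning True or False; it stands in
                    for the unspecified second stage of processing
    """
    if c_false > c_true:
        raise ValueError("c_false must not exceed c_true")
    if similarity > c_true:
        return True, 1            # first-stage "true"
    if similarity < c_false:
        return False, 1           # first-stage "false"
    # Similarity falls between the cut points: extra processing is needed,
    # which is why such assertions are expected to take longer.
    return bool(second_stage()), 2


# Illustrative values only (not data from the study).
print(feature_comparison_decision(0.9, 0.3, 0.7, lambda: True))
print(feature_comparison_decision(0.5, 0.3, 0.7, lambda: False))
```

In this sketch the number of stages is a stand-in for decision time: the closer the similarity lies to the region between the two cut points, the more likely the slower second stage is invoked.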


Effects of Quantification in Each Model 


Quantification refers to the fact that we 
can alter an assertion by replacing “a per- 
son” with “all people” or “some people.” 
Thus, rather than asking participants to 
verify the sentence “A witty person is also 
intelligent,” we could ask them to verify the 
universal assertion “All witty people are also 
intelligent” or the particular assertion “Some 
witty people are also intelligent.” The effects 
of these alterations on decision times can 
offer insights into the decision and search 
processes used in generating responses to our 
statements (see Meyer, 1970; Rips, 1975). 

Exemplar scanning. If a participant does 
retrieve from memory a sample of person 
exemplars with the X trait and examines 
them to determine whether they also possess 
the Y trait, decision times should increase as 
the number of exemplars that are examined 
increases. Consider universal assertions. Here, 
participants are attempting to decide whether 
all Xs are Ys. It seems reasonable to expect 
that the participants would respond “true” 
only if all of the X exemplars in the re- 
trieved sample were found to possess the Y 
trait. However, once the first X exemplar 
that did not possess the Y trait was en- 
countered, this should provide sufficient evi- 
dence to stop the examination and respond 
“false.” The likelihood of finding an X exem- 
plar without the Y trait in the sample will 
depend on the co-occurrence of the X and Y 
traits in the set of known people. If every 
known X exemplar possesses the Y trait 
(complete co-occurrence), long, “true” de- 
cision times should be found. If, on the other 
hand, none of the X exemplars possesses the 
Y trait (no co-occurrence), then the first 
retrieved example will always provide suf- 
ficient evidence for fast, “false” decision 
times. When some, but not all, of the known 
Xs are Ys (partial co-occurrence), then the 
length of time should depend on the size of 
the sample retrieved and the proportion of X 
exemplars with the Y trait that are in the 
retrieved sample. The greater the number of 
X exemplars that are examined before a 
negative instance is encountered, the longer 
the time to reach a “false” decision.¹ 

It should be noted, however, that if the 
sample size is constant, and greater than one, 
all of the latter “false” decision times should 
be shorter than “true” decision times (be- 
cause fewer examples are examined) and 
longer (on the average) than the “false” 
times for the no-co-occurrence case (since 
only one example needs to be examined in 
the latter instance). Thus, for universal as- 
sertions, participants should take the longest 
time to verify statements in which there is 
complete co-occurrence, the shortest time to 
negate statements in which the X and Y 
traits do not co-occur, and an intermediate 
amount of time to negate statements in which 
the X and Y traits co-occur for some but 
not all of the exemplars. 

Applying similar reasoning to particular 
assertions predicts that participants should 
emit fast, “true” responses when there is 
complete co-occurrence, because the first ex- 
ample should provide sufficient evidence that 
at least one X exemplar has the Y trait.² 
On the other hand, long, “false” response 
times should be observed when there is no 
co-occurrence. The entire sample should be 
examined before the participants are sure 
that not even one X exemplar has the Y 
trait. Moderately long “true” times should 
result when there is partial co-occurrence, 
because, on the average, more than one but 
less than the entire sample of exemplars 
should be examined before a positive case is 
found. 
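
The self-terminating scan described in this section can be sketched as follows. This is a minimal illustration, assuming that a retrieved sample can be represented as a list of booleans (whether each X exemplar also has the Y trait) and that decision time grows with the number of exemplars examined; the function name `scan_exemplars` is hypothetical and is not taken from Walker (1975).

```python
def scan_exemplars(sample, quantifier):
    """Return (response, exemplars_examined) for a self-terminating scan.

    sample     -- list of booleans; True means that X exemplar also has trait Y
    quantifier -- "all" (universal assertion) or "some" (particular assertion)
    """
    if quantifier not in ("all", "some"):
        raise ValueError("quantifier must be 'all' or 'some'")
    # "all":  the scan stops at the first exemplar lacking Y (answer "false").
    # "some": the scan stops at the first exemplar having Y (answer "true").
    stop_value = (quantifier == "some")
    for examined, has_y in enumerate(sample, start=1):
        if bool(has_y) == stop_value:
            return stop_value, examined
    # No stopping case was found, so the whole sample was examined.
    return (not stop_value), len(sample)


complete = [True, True, True, True]      # every X exemplar has Y
partial = [True, True, False, True]      # some, but not all, have Y
none = [False, False, False, False]      # no X exemplar has Y

for label, sample in (("complete", complete), ("partial", partial), ("none", none)):
    print(label, scan_exemplars(sample, "all"), scan_exemplars(sample, "some"))
```

With a fixed sample size, the counts reproduce the ordering argued for in the text: for all, complete co-occurrence requires the most examinations and no co-occurrence the fewest; for some, the ordering is reversed.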


Feature comparison. Rips (1975) has outlined 
how the feature comparison model 
might be affected by quantifying assertions 
with all and some. His argument consisted of 
several major points. First, he noted that the 
similarity of the X and Y feature lists will 
generally be confounded with co-occurrence 
relationships. The feature lists of X and Y 
should be most similar when there is complete 
co-occurrence and least similar when 
there is none. Second, he argued that quantification 
of assertions would affect the placement 
of C_T (but not C_F) on the feature similarity 
dimension. C_T should be set high (making 
it harder to say “true”) for universally 
quantified sentences and moderately low 
(making it easier to say “true”) for sentences 
quantified by some. Note that if the position 
of C_F does not change, then the number of 
sentences requiring second-stage processing 
should be smaller for particular than for 
universal assertions. The distance between 
C_F and C_T should be smaller, and therefore 
the probability of any given X-Y similarity 
estimate falling between them should be reduced. 
Finally, to make exact predictions, we 
must know how the similarities of the three 
types of trait pairs are distributed. Rips 
(1975) assumed that for object categories, 
the average similarity of Xs and Ys in which 
only some of the X exemplars could be classified 
as Y types (e.g., mother and doctor) 
would be closer to the average similarity of 
X-Y pairs in which all of the X exemplars 
could be classified as Y types (e.g., mother 


1 Because we are proposing that the decision is 
based on an examination of a sample rather than 
on all of the known persons, decision errors can be 
easily accounted for by this model. For example, 
a subject could respond “true” to a universal assertion 
even though only some of the known persons 
who possess the X trait also possess the Y 
trait. We need merely assume that the retrieved 
sample consisted of X exemplars, all of whom possessed 
the Y trait. 

2 To make this prediction we have assumed that 
subjects interpret the quantifier some to mean “at 
least one.” It should be noted, however, that similar 
predictions to those made in the text can be made 
if we assume that the term some is interpreted to 
mean “a few.” In this case, we would assume that 
a “true” response would be emitted only if “a few” 
(say two or three) exemplars were found to possess 
the relevant trait. 
and woman) than to the average similarity 
of X-Y pairs in which none of the X exemplars 
could be classified as Y types (e.g., 
mother and chair). Extending Rips’s reasoning 
to the co-occurrence of trait terms, Figure 1 
shows three distributions of feature 
list similarities for the three types of co-occurrence 
relationships whose spacing conforms 
to Rips’s assumption.³ Also shown is 
the postulated effect on C_T of changing the 
quantifier from all to some. 


The decision time predictions of this model 
are based on the proportion of trait pairs that 
fall between C_F and C_T and therefore require 
second-stage processing in any given condition. 
One prediction is that decision times 
should be longer for universal than particular 
assertions, since, as can be seen in Figure 1, a 
smaller number of trait pairs falls between 
C_F and C_T in the latter than in the former 
case. A second prediction is that “false” 
response times should be positively related and 
“true” response times should be negatively 
related to the similarity of trait pairs, even 
for those trait pairs in the same co-occurrence 
relationship.⁴ This prediction follows 
from the fact that as the similarity increases, 
there is a higher likelihood that a “false” 
response, and a lower likelihood that a “true” 
response, has resulted from second-stage processing. 
Predictions for particular types of 
trait pairs (e.g., those in a specific co-occurrence 
relationship) are easily derived by noting 
the relative areas of each curve that fall 
between C_F and C_T in Figure 1. 

The two models outlined above can be 
contrasted in terms of the predictions they 
make regarding the effects on decision time 
of the co-occurrence relationship between 
X and Y traits and the use of the quantifiers 
all and some. Table 1 presents the predictions 
of the two models in terms of relative 
decision times. As can be seen, a major 
difference between the two models is their 
predictions for the partial co-occurrence trait 
pairs. The exemplar scanning model predicts 
that the times for these statements will fall 
midway between those for the other two 
types, whereas the feature comparison model 
predicts that these trait pairs should yield 
the longest decision times. Additional specific 
predictions can be found by further study of 
Table 1. 

[Figure 1: similarity distributions for no, partial, and complete co-occurrence 
trait pairs along a similarity axis, with the cut points C_F and C_T shown 
separately for the all and some quantifiers.] 

Figure 1. Spacing of similarity distributions assumed 
by the feature comparison model for 
trait pairs that are perceived not to co-occur (no), pairs 
that are perceived to co-occur in some but not all people (partial), and pairs 
that are perceived to co-occur in all people (complete). 
(Also shown is the effect on the true criterion, 
C_T, of changing the quantifier from all to some.) 

3 The predictions of the feature comparison model depend 
directly on the assumed spacing of the similarity 
distributions for different types of trait pairs. Therefore, 
it is useful to have independent evidence about 
the validity of these assumptions. Without such 
evidence, it is possible to account for a much wider 
range of outcomes merely by postulating different 
similarity distributions. 

4 To the extent that our semantic similarity and 
co-occurrence estimates are highly correlated, this 
result could be explained by the exemplar scanning 
model as well. We need merely assume that the 
proportion of positive exemplars in a sample increases 
monotonically with semantic similarity. The 
number of exemplars that would have to be scanned 
before a “true” response could be emitted would 
then decrease as semantic similarity increased. 

Unlike the noun sentences typical in cognitive 
research, there is no a priori method of 
Table 1 
Order Relationships Among Average Response Times Predicted by Exemplar Scanning and by 
Feature Comparison Models for Different Quantifiers and Extent of Trait Co-Occurrence 

                                          Trait co-occurrence 
Model                  Quantifier      Complete       Partial        None 
Exemplar scanning      All             True      >    False     >    False 
                       All vs. Some      >              ᵃ              < 
                       Some            True      <    True      <    False 
Feature comparisonᵇ    All             True      <    False     >    False 
                       All vs. Some      >              >              = 
                       Some            True      <    True      >    False 

Note. True/false entries refer to the type of response being made in the condition. 
ᵃ The relationship between these two outcomes depends on the proportion of X exemplars that possess the 
Y trait in the sample and the order in which the exemplars are examined. Assuming a random order of 
examination, if the proportion is greater than .5, then the false times to universal (all) assertions should 
be longer than the true times to particular (some) assertions. If less than .5, then the reverse should be found. 
ᵇ The predictions of this model are based on the similarity distributions depicted in Figure 1. 


deciding into which type of relationship a 
given trait pair, such as witty and intelligent, 
falls. While almost everyone would agree that 
the set of known robin exemplars is a subset 
of the set of known birds, the relationship 
between the set of known witty persons and 
the set of known intelligent persons is less 
clear. We can, however, determine from sub- 
jects’ responses to sentences quantified by all 
and some what the majority of people believe 
about a given sentence. If the majority of 
subjects respond “true” to both the all and 
some versions of a sentence, then (for the 
majority) all of the X exemplars may be con- 
sidered to possess the Y trait. When the ma- 
jority of people respond “false” to the all 
version and “true” to the some version, then 
only some of the X exemplars possess the Y 
trait. Finally, when the majority respond 
“false” to both versions, the X and Y traits 
will not be considered to co-occur. 
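
The classification rule just described can be written out as a short sketch. It assumes that each subject's response to a sentence is recorded as a boolean (True for a “true” response); the function names and the treatment of exact ties as "not a true majority" are assumptions made only for illustration.

```python
def majority_true(responses):
    """Return True when more than half of the responses are 'true'."""
    responses = list(responses)
    # Assumption: an exact tie does not count as a "true" majority.
    return sum(bool(r) for r in responses) > len(responses) / 2


def classify_trait_pair(all_responses, some_responses):
    """Label a trait pair from the majority responses to its two versions.

    all_responses  -- responses to the version quantified by all
    some_responses -- responses to the version quantified by some
    """
    all_true = majority_true(all_responses)
    some_true = majority_true(some_responses)
    if all_true and some_true:
        return "complete"      # all X exemplars are taken to possess Y
    if some_true:
        return "partial"       # only some X exemplars possess Y
    if not all_true:
        return "none"          # X and Y are not considered to co-occur
    # "true" to all but "false" to some is not covered by the rule in the text.
    return "unclassified"


# Hypothetical responses from five subjects (not data from the study).
print(classify_trait_pair([True, True, True, False, True],
                          [True, True, True, True, True]))
print(classify_trait_pair([False, False, True, False, False],
                          [True, True, False, True, True]))
```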


Similarity of Trait Feature Lists 


The similarity between feature lists for the 
different traits was assessed by requiring that 
subjects rate the similarity in meaning of all 
relevant trait pairs. We assumed that these 
ratings would provide monotonic indicators 
of the overall similarity of each pair of fea- 
ture lists (Rips, 1975; Smith et al., 1974). 


Experiment 1 


Experiment 1 was designed to examine the 
contrasting predictions of the two models. 
Ratings of the similarity of trait pairs and 
decision times to verify universal, particular, 
and “a person” assertions about these trait 
pairs were recorded. 


Method 
Subjects 


Nine male and nine female college students each 
participated in three 1-hour sessions in partial fulfillment 
of an introductory psychology course requirement. 
All of the participants were native speakers 
of English. 


Apparatus and Software 


The decision time data were collected using a 
PDP-12 laboratory computer. Responses were made 
to stimuli, displayed on the video screen, on two 
telegraph keys, one labeled “true,” the other “false.” 
A user-oriented computer program was used to 
control and randomize stimulus pairs and to record 
the response latency and which of the two keys 
had been pressed. Decision times were accurate to 
within 1 msec. 


Procedure 


Participants were told that the study dealt with 
their reactions to various personality characteristics 
of people and that they would have to participate in 
three separate sessions, each lasting from 45 minutes 
to 1 hour and separated by 3 to 5 days. 

Semantic similarity and co-occurrence ratings. 
All of the participants completed two questionnaires. 
One was filled out in the first session and one in 
the third, counterbalanced for order over partici- 
pants. Both questionnaires presented all possible 
pairs of 12 different trait adjectives (see Figure 2). 
The semantic similarity questionnaire asked the par- 
ticipants to indicate “how similar in meaning” the 
two terms were on a 20-cm unmarked scale anchored 
by “not similar in meaning” and “identical in mean- 
ing.” The other, co-occurrence, questionnaire asked 
the participants to indicate, “How likely is a(n) 
____ person to also be ____?” Unmarked 20-cm 
scales were again used, but in this case they were 
anchored by “not likely” and “extremely likely.” The 
participants were told to complete the semantic simi- 
larity scales on the basis of each term’s “dictionary 
definition” but were not given a particular instruc- 
tion regarding the co-occurrence ratings. Partici- 
pants were run individually in an isolated cubicle 
and took as much time as they needed to complete 
each questionnaire. 

Forced-choice decision task. The procedures for 
the three decision tasks were virtually identical. One 
was conducted in the first session after the first 
questionnaire was completed, and two were con- 
ducted in the second session. Order of task was 
counterbalanced across the participants. Upon ar- 
rival at the computer laboratory, participants were 
seated in front of the PDP-12 and video stimulus 
display. They were told that pairs of trait terms 
would appear on the video screen in blank spaces in 
a cardboard mask that was placed on the video 
screen. In a particular session, one of the three 
following sentences was written on the mask: “A(n) 
____ person is also a(n) ____ person”;⁵ “Some 
____ people are also ____”; and “All ____ 
people are also ____.” The subjects were told that 
when a trait pair appeared on the screen, they were 
to indicate whether the entire sentence was true or 
false by pressing the right (left for half of the 
subjects) telegraph key if they thought it was true 
and the left (right) key if false. They were told to 
read the entire sentence each time that a new pair 
of trait terms was presented. They were also told to 
think about each sentence carefully before responding, 
but as soon as they were sure of the answer to press 
one of the keys. After demonstrating to the subjects 
how to position their hands near the telegraph keys, 
the experimenter began a short program that presented 
five practice trait pairs (not in the actual 
experimental list). The experimenter observed the 
participants while they completed these practice 
trials, reinstructing them if necessary. After these 
trials were completed, and following a 2- to 3-minute 
delay, the main program was initiated. The experi- 
menter left the room, instructing the participant 
that the word finished would appear in both blank 
spaces on the video display when the task was 
completed. 

The main program presented each word in the 
trait pair virtually simultaneously. Measurement of 
the response latency began the moment that the 
two words appeared on the screen and ended either 
when one of the two keys was pressed or when a 
15-sec maximum had been reached. There was a 
4-sec intertrial interval, during which time the 
display was blank, between the presentations of the 
stimulus pairs. No “ready” signal was used in this 
study. Each participant was presented with a given 
pair of trait terms only once. The two terms in a 
given pair were presented in only one order, which 
was held constant over participants. 


Results 


The data were analyzed in an attempt to 
examine several major issues. Of initial con- 
cern was whether the semantic similarity rat- 
ings (uncommon in implicit personality re- 
search but common in cognitive research) 
would yield a multidimensional representa- 
tion similar to that obtained from the co- 
occurrence ratings (which are more common 
in studies of implicit personality). The second 
analysis studied the effect of the universal 
and particular quantifiers on decision times 
for the three types of trait pairs: complete, 
partial, and no co-occurrence. In the third 
analysis, the relationship between semantic 
similarity and “true” and “false” decision 
times was examined. The fourth analysis ex- 
amined the ability of the semantic similarity 
ratings to explain the effects of the co-occur- 
rence relationships on decision times. 


Multidimensional Structure of Trait Terms 


The structure inherent in the semantic 
similarity ratings was compared to that ob- 
tained with the more common co-occurrence 
rating method. The average (over partici- 
pants) semantic similarity and the average 
co-occurrence rating between each pair of the 
12 trait terms used in the study were com- 
puted. The two half-matrices (the trait terms 
were paired in only one order) were then 
separately analyzed using KYST2 (see Krus- 
kal, Young, & Seery, Note 1), a monotone 
multidimensional scaling program. The entries 
in each lower half-matrix of the two 
types of ratings were analyzed as direct distance 
estimates. Two-dimensional solutions 
seemed to provide reasonable representations 
of both data sets. The one-dimensional stress 
(Formula 1) values were .2976 for the co-occurrence 
ratings and .2302 for the semantic 
similarity ratings. A two-dimensional fit reduced 
these values to .1245 and .0817, respectively. 
It seemed inappropriate to fit a 
three-dimensional solution to the data, given 
the small stimulus set. 

5 For reasons of clarity, the results for this decision 
task will not be presented. It should be noted, 
however, that they were completely consistent with 
the results from Experiment 2. 

Figure 2. Two multidimensional scaling solutions 
for the 12 trait terms used in Experiment 1. (The 
top panel shows the solution for co-occurrence ratings, 
rotated to its principal components. The bottom 
panel shows the solution for semantic similarity 
ratings of the same traits, rotated to minimize the 
difference between the two solutions. The high correspondence 
between the two solutions is of most 
interest.) 

The top panel in Figure 2 contains the two- 
dimensional solution for the co-occurrence rat- 
ings, rotated to its principal components. The 
bottom panel contains the results for the 
semantic similarities rotated to minimize the 
difference between the two solutions. The two 
solutions yielded remarkably similar results. 
As in Rosenberg and Sedlak’s (1972) work, 
the two most important dimensions in this 
trait set seem to be socially good-bad and 
intellectually good-bad. In short, not sur- 
prisingly, the participants in our study 
seemed to have an implicit personality theory 
similar to that found for other subject sam- 
ples in other laboratories. More important, 
however, the obvious congruence of the se- 
mantic similarity and co-occurrence struc- 
tures implies that processes involved in im- 
plicit personality may be related to more 
basic semantic memory processes, as we have 
argued. 



Complete, Partial, and No Co-occurrence 


Of major concern in this experiment were 
the effects on decision time of universal and 
particular assertions involving trait pairs that 
were defined as being in a complete, partial, 
or no co-occurrence relationship. In order to 
examine the predictions outlined in Table 1, 
the 66 sentences quantified by all were divided 
into two groups according to the majority 
response (true vs. false) given to each. The 
same was done for the corresponding 66 sentences 
quantified by some. The mean decision 
time (over participants) for those participants 
whose response agreed with the majority 
was then computed for each universal 
assertion and for each particular assertion. 
The 66 trait pairs and their associated decision 
times were divided into three groups: 
those for which the majority response was 
“true” to both the universal and particular 
versions (complete co-occurrence), those for 
which it was “false” and “true,” respectively 
(partial co-occurrence), and those for which 
it was “false” for both versions (no co-occurrence). 
There were no trait pairs in which the 




Table 2 
Mean Decision Time (in sec) to Respond to the Complete, Partial, and No Co-Occurrence Trait 
Pairs When Quantified by All and Some 

                                   Trait co-occurrence 
               Complete (n = 16)    Partial (n = 33)     No (n = 17) 
Quantifier     Time    Response     Time    Response     Time    Response 
All            3.719   True         3.540   False        3.134   False 
Some           3.314   True         3.811   True         3.730   False 

Note. n = the number of trait pairs in each co-occurrence category. 


majority response was “true” for the uni- 
versal, but “false” for the particular, version. 

Table 2 presents the mean (over trait 
pairs) decision times for the all and some 
quantifiers applied to the three types of trait 
pairs. A 2 (quantifier) × 3 (type of trait 
pair) analysis of variance was performed.⁶ 
Contrary to the feature comparison model, 
the main effect of the type of trait pair was 
not significant, F(2,63) = 1.14. Also con- 
trary to the feature comparison model, the 
time to verify sentences quantified by some 
was significantly longer (rather than shorter) 
than the time to verify those quantified by 
all, F(1,126) = 5.57, p < .05. The interac- 
tion between quantifier and trait pair type 
was highly significant, however, F(2, 126) = 
18.33, p < .001. For universal assertions, 
subjects negated the no co-occurrence pairs 
much faster than they affirmed the complete 
co-occurrence pairs. In contrast, for the particular 
assertions, subjects negated the no and 
partial co-occurrence trait pairs much more 
slowly than they affirmed the complete co- 
occurrence pairs. This pattern of mean reac- 
tion times seems to conform more closely to 
that predicted by the exemplar scanning 
model than to that predicted by the feature 
comparison model (compare Tables 1 and 2). 


Decision Times and Semantic Similarity 


Recall that the feature comparison model 
argues that “true” response times should 
decrease and “false” response times should 
increase as the semantic similarity of the X 
and Y trait terms increases. To examine 
these predictions, the decision times were 


divided according to the participant’s “true” 
and “false” responses, and by quantifier. Each 
set was then correlated with its associated 
semantic similarity ratings, separately for 
each participant. This analysis yielded one 
“true” correlation and one “false” correla- 
tion per quantifier for each participant. Fish- 
er’s z transformations of these correlations 
served as the raw data for the analyses below. 
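
For readers unfamiliar with the transformation, the per-participant step can be sketched as follows: compute a Pearson correlation between similarity ratings and decision times, then apply Fisher's z, which is the inverse hyperbolic tangent of r. The data below are invented purely for illustration, and the helper names `pearson_r` and `fisher_z` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher's z transformation: z = atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)  # raises ValueError when r is exactly 1 or -1


# Invented example: similarity ratings and "true" decision times (sec)
# for one hypothetical participant.
similarity = [2.0, 5.5, 9.0, 12.5, 16.0]
true_times = [4.1, 3.9, 3.6, 3.7, 3.2]
r = pearson_r(similarity, true_times)
print(r, fisher_z(r))
```

The transformed values, rather than the raw correlations, are what would then be averaged over participants and tested against zero.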

Table 3 presents the mean “true” and 
“false” correlations for the two quantifiers. 
Only the mean “true” correlations were sig- 
nificantly different from zero for the sen- 
tences quantified with some, t(17) = 4.62, p 
< .01, whereas only the mean “false” corre- 
lations were significant for the sentences 
quantified with all, t(17) = 2.80, p < .05. 
The correlations between rated similarity and 
decision time were examined for each subject. 
These were significant (p < .05) and in the 
predicted direction for two “true” and six 
“false” cases when all was used, and for eight 
“true” and three “false” cases when some was 
used. Thus, the average t test results seem to 


6 Actually, two different analyses of variance were 
computed. In one, the fact that there were more 
intersection pairs than the other types was treated 
as representative of the entire population of trait 
terms, and a least squares solution was therefore em- 
ployed. In the second analysis, the unequal ns were 
assumed to be artifactual, and an unweighted-means 
analysis was used. Since the conclusions were identical, 
only the results of the second analysis are presented 
in the text. In addition, it should be noted 
that the error terms in both of these analyses were 
based on variability over sentences rather than subjects. 
The logic of using sentences to generate the 
error term has been outlined by Clark (1973). 
Table 3 
Mean of Fisher z Transformed Correlations 
Between Decision Times and Semantic 
Similarity Ratings for “True” and “False” 
Responses to Sentences Quantified With All and 
Some 

                   Quantifier 
Response      All            Some 
True          −.06           −.26* 
False         [illegible]    .08 

* p < .05 (two-tailed). ** p < .01. 


represent the proportion of subjects for whom 
the correlation between semantic similarity and 
decision time was as predicted. 

A 2 (true/false) x 2 (quantifier) mixed 
analysis of variance on these transformed 
correlations indicated that the “true” re- 
sponse correlations were significantly differ- 
ent from the “false” response correlations, 
F(1,17) = 9.82, p < .01. The effect for type 
of statement was not significant, F(1, 17) = 
1.48, p > .05, and the interaction between 
true/false and quantifier was also not significant, 
F(1, 17) = 2.85, p > .10. Separate t 
tests on the difference between the mean 
“true” and “false” correlations for each 
quantifier supported this interaction result; 
the difference was significant in both cases 
(p < .05). The finding that “false” response 
times were more positively related to semantic 
similarity than “true” response times across 
both quantifiers offers relatively strong sup- 
port for the feature comparison model. 

It appears that the results of the co-occur- 
rence analysis (Table 2) supported the ex- 
emplar scanning model, whereas the results 
from the semantic similarities (Table 3) sup- 
ported the feature comparison model. One 
plausible explanation for this outcome is that 
one or more of our assumptions about the 
nature of the models may require modifica- 
tion. In particular, recall that the predictions 
of the feature comparison model described in 
Table 1 were based on an extension of a 
suggestion made by Rips (1975), namely, 
that the distributions of semantic similarities 
for complete, partial, and no co-occurrence 
trait pairs would be spaced as depicted in 
Figure 1. To examine the validity of this 
assumption, we computed the mean similarity 
rating for each trait pair (only for those 
participants whose “true”/“false” response 
agreed with the majority). The average (over 
sentences) similarities for the three types of 
sentences were then derived, and they are 
presented in Table 4. An unweighted-means 
2 (quantifier) × 3 (type of trait pair) analysis 
of variance of these data indicated that 
there was a highly significant difference 
among the three types of trait pairs, F(2, 63) 
= 350.66, p < .001. The trait pairs defined 
as being in a complete co-occurrence relationship 
were indeed seen as much more similar 
than the no co-occurrence pairs. Partial co-occurrence 
trait pairs were midway between 
these two, but, contrary to expectation, they 
were closer to the no co-occurrence than to 
the complete co-occurrence trait pairs. No 
other effects were found to be significant. 
As we will discuss below, these similarity 
results suggest an explanation for the pattern 
in the previously presented mean decision 
times that is based on a revised feature com- 
parison model. With this explanation in mind, 
we examined the ability of the semantic similarity 
estimates to predict variations in response 
times within each of the three types of 
trait pairs. The feature comparison model 
makes the strong prediction that “true” response 
times should be negatively related and 
“false” times positively related to similarity 
even within these trait pair types. Table 5 
presents the correlation (over sentences) between 
the mean decision times and the mean 
similarities for each trait pair 
type. As can be seen, the correlations were 
significantly different from zero in two of the 
six cases: the true times to the complete and 
partial co-occurrence pairs that were quantified 
by some. The sign of each of the correlations 
was in the direction predicted by the 
feature comparison model, however. 

Table 4 
Mean Semantic Similarity of Complete, Partial, 
and No Co-Occurrence Trait Pairs Quantified 
by All and Some 

                    Trait co-occurrence 
Quantifier     Complete     Partial     No 
All            14.19        6.36        [illegible] 
Some           13.52        8.77        [illegible] 

Note. Similarity was rated on a 20-point scale. 
Higher numbers indicate greater similarity. 


Discussion 


Taken together, the results from this ex- 
periment offered some support for both models 
proposed in the introduction. 


Mean Decision Times 


With one exception, the pattern of mean 
decision times for the three types of trait 
pairs seemed to support the exemplar scan- 
ning model fairly well. It took less time for 
subjects to affirm that some X were Y when 
all of the X exemplars possessed the Y trait
than when only some did. Similarly, it took 
less time to say that all Xs were not Ys when 
none of the X exemplars possessed the Y trait 
than when some of the X exemplars did. On 
the other hand, the time to verify that some 
Xs were Ys when only some X exemplars 
possessed the Y trait was not midway between 
the complete and no co-occurrence trait pairs, 
as predicted (Tables 1 and 2); instead, it 
was the longest time.⁷


Decision Time and Semantic Similarity 


Although the mean decision times seemed to 
support the exemplar scanning model, some 
aspects of the correlations between semantic 
similarity and decision time supported pre- 
dictions of the feature comparison model. As 
predicted, semantic similarity ratings corre- 
lated more positively with “true” decision 
times than with “false” decision times (Table 
3). However, the fact that the correlations 
for “false” responses to particular assertions 
and “true” responses to universal assertions 
were not significantly different from zero is 
somewhat inconsistent with the feature
comparison model. The correlations within each
type of trait pair also provided only weak
support for this model.


The Feature Comparison Model Revisited 


It is possible to offer a consistent explanation
for the entire pattern of results by revising
our assumptions about the effects of
quantifiers on the cut points, C_T and C_F, in
the feature comparison model and by considering
some of the methodological inadequacies
of this experiment. Take the latter point first.
In this study, each participant produced only
one decision time and one semantic similarity
rating per trait pair. If the error variability
in the decision times were reasonably large,
it could account for the failure to find
consistently significant correlations for individuals
between semantic similarity ratings and
decision times (Table 3). Furthermore, the
fact that the standard deviation of the mean
semantic similarity ratings (over trait pairs
within a given type) was very small (min. σ
= .826, max. σ = 1.974, on a 20-point scale)
could explain the failure to find significant
correlations within each type of trait pair
(Table 5).


Table 5
Correlation Between Mean Decision Times and
Mean Semantic Similarity Ratings for Each
Type of Trait Pair and for Each Quantifier


                 Trait co-occurrence

              Complete    Partial     No
Quantifier    (n = 16)    (n = 33)    (n = 17)
All           -.24        -.04        .01
Some          -.67*       -.41*       .06


Note. n = the number of trait pairs used to compute
each correlation.
* p < .05.


Our original predictions for the feature 
comparison model were based on Rips’s 
(1975) assumption that the semantic simi- 
larity distribution for the partial co-occur- 
rence trait pairs would be more similar to 


7 It should be noted that the average semantic
similarity results can be used to explain this
outcome. If a large majority of the X exemplars did
not possess the Y trait in the partial co-occurrence
trait pairs, then the participants would be expected
to take a long time to determine that at least one
exemplar possessed the Y trait. On the average, most
of the exemplars in the sample would have to be
examined before an answer was given. Thus, the
decision time should be almost as long as the time
to assert that not even one exemplar possessed the
relevant trait.


[Figure 3 omitted. Horizontal axis: similarity. Distributions
labeled no, partial, and complete co-occurrence; cut points
marked for some.]

Figure 3. Spacing of similarity distributions obtained
for the three types of trait pairs (no, partial, and
complete co-occurrence) in Experiment 1. (Also
shown is the effect on the “true,” C_T, and the
“false,” C_F, criteria of changing the quantifier from
all to some, as assumed by the revised feature
comparison model.)


the complete than to the no co-occurrence
similarity distributions (see Figure 1) and
that changing the quantifiers would only
affect the position of C_T. The mean semantic
similarity results (Table 4) suggest that for
our trait terms, the first of these assumptions
was not satisfied. The average similarity of
the X-Y terms in the partial co-occurrence
pairs was closer to that for the no co-occurrence
than the complete co-occurrence pairs.
With this finding in mind, it is possible to
explain the exact pattern of means obtained
for the two types of sentences (Table 2) by
assuming that the change in quantifier affected
the placement of both C_T and C_F. That is,
changing from universal to particular assertions
may lower both C_T and C_F rather than
just C_T. The effect of such a change in
assumptions can be seen in Figure 3. Three
distributions of semantic similarities, one for
each type of trait pair, are presented so that
the differences in their means correspond,
roughly, to the differences in semantic
similarity ratings reported in Table 4.

The predictions for this new version of the
feature comparison model can be derived by
noting the area of each distribution that falls
between the cut points. The area is an
index of the proportion of trait pairs (within
the trait-pair type) that requires [...]
processing time. Note that in the all case, most
of the complete co-occurrence pairs fall
between the two cut points, thereby producing
long “true” decision times. A moderate
number of the partial co-occurrence pairs fall
between the cut points, resulting in moderate
times for “false” responses to this type.
Since most of the no co-occurrence pairs fall
below C_F, very fast “false” responses would
be expected. On the other hand, in the some
case, most of the complete co-occurrence pairs
are above the new C_T cut point, so fast
“true” responses would be expected. Almost
all of the partial co-occurrence pairs fall
between C_T and C_F, resulting in very long
(rather than moderate, as predicted [...])
“true” responses. Finally, the large
proportion of no co-occurrence pairs falling between
C_T and C_F should produce longer “false”
response times for these pairs. As can be seen,
this describes the exact pattern of decision
time means that was obtained (Table 2).
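
The area argument can be made concrete with a small numerical sketch. It assumes, purely for illustration, that each similarity distribution is normal with a common standard deviation of 3 scale points and that the cut points sit at the hypothetical values shown; none of these values was estimated from the experiment. The complete and partial means are the Table 4 means, whereas the no co-occurrence mean (4.0) is a placeholder assumption.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def stage_proportions(mean, sd, c_false, c_true):
    """Split one similarity distribution at the two cut points.

    Pairs below c_false get a fast "false", pairs above c_true get a fast
    "true", and pairs in between need the slower second stage.
    """
    fast_false = normal_cdf(c_false, mean, sd)
    fast_true = 1.0 - normal_cdf(c_true, mean, sd)
    second_stage = 1.0 - fast_false - fast_true
    return {
        "fast_false": fast_false,
        "second_stage": second_stage,
        "fast_true": fast_true,
    }


SD = 3.0  # assumed common spread, not estimated from the data
# Complete and partial means come from Table 4; the "no" mean is a placeholder.
MEANS = {
    "all": {"complete": 14.19, "partial": 6.36, "no": 4.0},
    "some": {"complete": 13.52, "partial": 8.77, "no": 4.0},
}
# Hypothetical cut points (c_false, c_true); both are lower for "some".
CUTS = {"all": (8.0, 16.0), "some": (3.0, 12.0)}

RESULTS = {
    quantifier: {
        pair_type: stage_proportions(mean, SD, *CUTS[quantifier])
        for pair_type, mean in pair_means.items()
    }
    for quantifier, pair_means in MEANS.items()
}

if __name__ == "__main__":
    for quantifier, by_type in RESULTS.items():
        for pair_type, props in by_type.items():
            rounded = {name: round(value, 2) for name, value in props.items()}
            print(quantifier, pair_type, rounded)
```

Under these assumed values, lowering both cut points moves most of the complete pairs above the "true" cut point and pulls a much larger share of the no co-occurrence pairs into the slow second stage, which is the qualitative pattern the revised model relies on.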

Given the relatively better fit of the
revised feature comparison model, it is
important to ask why Rips (1975) did not find
evidence for a shift in C_F and we did. One
possibility is that the mean of the similarity
distribution for the no co-occurrence pairs
used by Rips and others (e.g., Meyer, [...])
may have been so low and different from the
other distributions that a change in C_F went
undetected. The psychological distance
between an “intelligent” and a “dishonest”
person may be much smaller than that between
a “woman” and a “chair.” One implication
of this point is that future research should
include evidence about the average similarities
of different types of pairs. Unless such
evidence is presented, almost any result will
be interpretable within the context of the
feature comparison model.


8 If the similarity distributions and cut points are
as depicted in Figure 3, we would expect that the
mean “true” response times, over all types of trait
pairs, should be longer when all as opposed to some
is used as the quantifier. However, the reverse
should obtain for the mean “false” response times.
This prediction follows from the fact that more
fast (single-stage) “true” responses (those above C_T)
would be expected with some than with all as the
quantifier. On the other hand, many more slow
(second-stage) “false” responses would be expected
for some than for all sentences. This pattern of
predicted effects was obtained. For universal assertions,
the mean “true” response time was 3.85 sec, and
the mean “false” time was 3.65 sec, but for
particular assertions, the means were 3.61 and [...] sec,
respectively. The interaction was significant, F(1,
17) = 6.64, p < .05.


The revision in the feature comparison 
model greatly increases the similarity between 
what initially appeared to be very different 
views of implicit personality trait inferences. 
One of the two crucial aspects that initially 
differentiated the two models does so no 
longer, namely, the fact that the criterion for 
a “false” response varied with the quantifier 
in the exemplar scanning model and not in 
the feature comparison model. That is, in the 
former model we assumed that a change from 
a universal to a particular quantifier altered 
the evidence the participant needed for a 
“false” as well as a “true” response. In the 
universal case, one negative example was 
sufficient for a “false” response, but this was 
not so in the particular case. By allowing C_F
to vary, as well as C_T, the same effect is now
achieved in the feature comparison model: 
moderate feature list similarities are suffi- 
cient for a “false” response when all is used 
but not when some is used. 

The commonalities between the two models 
can be emphasized by noting that both are 
sensitive to the psychological distance between 
trait pairs (feature list overlap vs. degree of 
co-occurrence of attributes in exemplars), and 
both assume that a different evidentiary cri- 
terion is used for “true” than for “false” 
responses. In the feature comparison model,
this last point is made explicit by the dual
cut points; however, the exemplar scanning
model also postulated a dual criterion (e.g.,
for particular assertions, the first positive 
example was sufficient for a “true” response, 
but a number of negative examples were re- 
quired for a “false” response). Thus, the two 
models, while making quite different assump- 
tions about the way in which traits are repre- 
sented in memory, now predict virtually 
identical mean reaction time results as a
function of alterations in decision criteria 
because of common processing assumptions.


Experiment 2


It seemed necessary to postulate the
existence of a dual-criterion decision process
(whether based on features or exemplars) in
order to explain the results from Experiment
1. This conclusion greatly limits the class of
models that can be used to understand
implicit personality inferences of the type being
examined here. On the other hand, the
evidence for this strong conclusion was not as
consistent as we would have liked. For this
reason, a second experiment was designed in
which more reliable estimates of reaction
times and of psychological distances were
obtained. The participants were presented with
each trait pair six times, three times in one
order and three times in the other, both when
making binary forced-choice decisions and
when rating the similarity of the trait pairs.

Another purpose of this second experiment 
was to provide evidence relevant to one of the 
few remaining differential predictions of the 
two models. Specifically, the exemplar scan- 
ning model predicts that the relationship be- 
tween psychological similarity and “true” and 
“false” decision times should, in general, vary 
with the evidentiary criteria used to reach a 
decision. While “false” times should increase 
and “true” times should decrease as the simi- 
larity increases, regardless of the decision 
criteria, the absolute value of the slopes of 
these functions should change as the criteria 
change. When a smaller number of exemplars 
is required for a false than for a true response 
(e.g., as in the all assertions), the slope for
the “false” times should be steeper than for 
the “true” times. The reverse is expected 
when a smaller number of exemplars is re- 
quired for a “true” than for a “false” re- 
sponse. This prediction follows from the fact 
that the length of a self-terminating search of 
a sample can extend over a wider range when 
termination depends on the discovery of a 
few instances in the sample than when it 
depends on the discovery of many instances 
in a sample.⁹
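
The range argument behind this prediction can be illustrated with a brute-force sketch. It assumes a hypothetical sample of 8 exemplars scanned in random order and a search that stops as soon as k instances of the trait have been found (or after the whole sample if fewer than k exist). The sample size, the function names, and the stopping rule are illustrative assumptions, not the procedure used in the experiments.

```python
from itertools import combinations


def search_length(sample, k):
    """Number of exemplars examined before the k-th positive is found.

    Returns len(sample) when fewer than k positives exist, that is, when
    the scan has to run through the whole sample.
    """
    found = 0
    for position, has_trait in enumerate(sample, start=1):
        if has_trait:
            found += 1
            if found == k:
                return position
    return len(sample)


def mean_search_length(n, m, k):
    """Average search length over every ordering of m positives among n."""
    lengths = []
    for positions in combinations(range(n), m):
        sample = [index in positions for index in range(n)]
        lengths.append(search_length(sample, k))
    return sum(lengths) / len(lengths)


def length_range(n, k):
    """Spread of the mean search length as positives go from 0 to n."""
    means = [mean_search_length(n, m, k) for m in range(n + 1)]
    return max(means) - min(means)


if __name__ == "__main__":
    for k in (1, 4):
        print("instances required:", k, "range:", length_range(8, k))
```

In this sketch the mean search length can vary over 7 positions when a single instance terminates the search but over only 4 positions when four instances are required, which is the sense in which a few-instance criterion leaves room for a steeper slope.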


9 Although this prediction can be tested against
the data from Experiment 1, the failure to find
significant correlations between similarity and decision
time in some conditions prevented a reasonable test
from being made.

It should be noted that the revised feature
comparison model is quite capable of explain- 
ing the pattern described above (were it to be 
found), provided the distributions of se- 
mantic similarities have appropriate forms. 
Nevertheless, a failure to find the predicted 
slope changes can be taken as direct evidence 
against the current version of the exemplar 
scanning model. Thus, if the predicted 
changes in slope are found, neither model 
will receive selective support; but if the pre- 
dicted changes are not found, the exemplar 
scanning model, as proposed, can be rejected. 
Naturally, this prediction need only be exam- 
ined after other aspects of the results, which 
are predicted by both models, are found to 
conform to the common predictions of both 
models. 

As can be seen, an essential aspect of any 
experiment that attempts to provide evi- 
dence relevant to either of the models is that 
the evidentiary criteria for “true” and for 
“false” responses be varied in some manner. 
Rather than repeat the same quantification 
design used in Experiment 1, we chose to 
broaden the empirical domain of both models 
by varying these criteria in a less common 
manner than is typical of semantic memory 
research with object categories. Specifically, 
one half of the participants were asked to 
verify co-occurrence assertions (in which “a 
person” was used as a general quantifier), 
and the other half were asked to verify the 
similarity of the dictionary definition of the 
trait terms. We assumed that the participants
would use a more stringent “true” criterion 
and a less stringent “false” criterion in the 
latter than in the former condition. In addi- 
tion, half of the participants in each of the 
conditions mentioned above rated the likeli- 
hood of the co-occurrence of trait pairs, and 
half rated their similarity in meaning. This 
resulted in a 2 (co-occurrence vs. semantic 
decisions) × 2 (co-occurrence vs. semantic
ratings) factorial design. 


Method 
Subjects 


Forty-eight male and female undergraduate college 
students, who were all native speakers of English, 
served as participants in exchange for introductory 
psychology course credit. Six males and six females 
were randomly assigned to each condition.


Apparatus and Software


The decision times and ratings were collected
using a PDP-8A laboratory computer equipped with
a video teletype. A user-oriented, experimental-control
system, FOEP (Allen, 1978), presented the
questions, randomized the trait terms, and recorded
the decision times and ratings for later analysis. For
both the forced choices and the ratings, the entire
question was written onto the video screen at high
speed (960 characters/sec).


Procedure 


At the beginning of the session, the participants
were told that the study dealt with their judgments
of various personality characteristics of people.
The participants, who were run individually, were
then randomly assigned to one of four experimental
conditions in which they were asked to make either
semantic similarity or co-occurrence ratings and
either semantic similarity or co-occurrence binary
forced-choice decisions. To provide more stable
estimates of the data points, there were three
replications for both the ratings and forced-choice
decisions. The order of question presentation was
randomized for each replication. The order of the
ratings and forced-choice judgment tasks was
counterbalanced across subjects, and all of the replications
for the two tasks were blocked together. The data
collection was completed in a 1-hour session.

Semantic similarity and co-occurrence ratings.
The question “How similar is the meaning of the
trait term ____ to the meaning of the trait term
____?” was used to obtain semantic similarity
estimates for all of the trait pairs. The question
“How likely is a person who is ____ also to be
____?” was used to obtain co-occurrence
estimates. The blanks were completed with all possible
combinations of pairs of 10 trait terms. These terms
were similar to those used in Experiment 1. For the
semantic similarity ratings, the scale was anchored
by “not very similar” and “very similar.” For the
co-occurrence ratings, the scale was anchored by
“not very likely” and “very likely.” The
participants responded by typing a number between 1 and
20, using the digits on the teletype keyboard. For
half of the participants, the positive end of the
scale was assigned a value of 20; for the other half
of the participants, the positive end of the scale
was assigned a value of 1.

Forced-choice decision task. The question “Is
the meaning of the trait term ____ similar to the
meaning of the trait term ____?” was used for
the forced-choice semantic decisions, and the
question “Is a ____ person also likely to be ____?”
was used for the co-occurrence version.¹⁰ The
participants were asked to respond to the questions by
pressing one of two teletype keys with their right
and left index fingers. The pairing of response key
and response direction was balanced across
participants. Unlike Experiment 1, the participants in this
study were instructed to respond as rapidly but as
accurately as possible. It was hoped that this
instruction would reduce the variance in decision
times. We therefore expected that the average times
would be considerably shorter in this study than in
the first one.


10 It should be noted that this is a different
quantified statement than those presented in
Experiment 1. It seems reasonable that the quantifier
a would be processed much like the quantifier some.
Therefore, relatively little evidence should be
required for a “true” response.

Each decision trial started with the presentation 
of the word ready on the screen for 4 sec. One-half 
second after the termination of the ready signal, a 
question was written on the screen, and the deci- 
sion time clock was started. When the subject re- 
sponded, or if no response had been made after 15 
sec, the question was erased, and the next trial was 
started 1 sec later. There was a pause of approxi- 
mately 1 minute between replications. 


Results 
Mean Decision Times 


As in Experiment 1, the data were
analyzed in several different ways. Of initial
interest were the effects of question type on
the mean “true” and “false” response times.
Table 6 presents the average of each subject’s
mean “true” and mean “false” response times
in each condition of the 2 × 2 design. As can
be seen, in a manner consistent with both
models, people negated semantic assertions
faster than they affirmed them; but the
reverse was true for the co-occurrence
assertions. A 2 (true/false) × 2 (semantic/co-occurrence
rating) × 2 (semantic/co-occurrence
decision) analysis of variance of the
mean decision times indicated that this
interaction was highly significant, F(1, 44) =
29.91, p < .001. There was also a significant
tendency for “false” responses to be emitted
faster than “true” ones, F(1, 44) = 4.96, p <
.05. No other results were found to be
significant. Both models explain the findings
mentioned above by assuming that more
confirming evidence is needed for a “true” response
and less for a “false” response when a
semantic rather than a co-occurrence question
is asked. Apparently, our attempt to
manipulate the true and the false evidentiary
criteria was successful.


Table 6
Mean Decision Times (in sec) for Semantic
and Co-Occurrence Questions


                          Type of response

                    Semantic           Co-occurrence
Type of rating      True     False     True     False
Semantic            2.085    1.766     2.171    2.254
Co-occurrence       2.201    1.676     1.839    2.103


Table 7
Means of Fisher z Transformed Correlations
Between Similarity Ratings and “True” and
“False” Decision Times (in log sec)


                          Type of response

                    Semantic           Co-occurrence
Type of rating      True     False     True     False
Semantic            -.26     .35       -.21     .29
Co-occurrence       -.33     .18       -.20     .22


Decision Times and Similarity 


One of the most important predictions of 
both models, which yielded ambiguous results 
in Experiment 1, concerns the relationship
between psychological similarity and “true” 
versus “false” decision times. In the present 
experiment, both models predict a negative 
relationship for “true” response times and a 
positive one for “false” response times. Table 
7 presents the mean correlation for each
condition in the 2 × 2 design. Correlations were
computed for each subject by taking the mean 
decision time for the three replications for 
each trait pair and treating the different or- 
ders of trait terms in a pair as unique ob- 
servations. The “true” and “false” means 
were then separately correlated with the as- 
sociated mean similarity ratings for corre- 
sponding orders. Not only was the pattern 
of correlations exactly as predicted, but in 
addition, with the more sensitive procedure 
used in Experiment 2, all of the average
correlations were significantly different from
zero, minimum t(11) = -2.51, p < .05.
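
The per-subject analysis described here can be sketched in a few lines. The sketch computes one correlation per subject, applies Fisher's r-to-z transformation, and tests the mean z against zero with a one-sample t test. The data are invented for three hypothetical subjects and four trait pairs; each condition in the experiment had 12 subjects, which is where 11 degrees of freedom come from. The helper names are illustrative assumptions.

```python
import math
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher's r-to-z transformation (inverse hyperbolic tangent)."""
    return math.atanh(r)


def one_sample_t(values):
    """t statistic and degrees of freedom for H0: population mean = 0."""
    n = len(values)
    return mean(values) / (stdev(values) / math.sqrt(n)), n - 1


# Made-up data: similarity ratings for four trait pairs and, for each of
# three hypothetical subjects, log decision times for "true" responses.
similarity = [4, 9, 13, 18]
log_times = [
    [0.95, 0.80, 0.74, 0.60],
    [0.70, 0.72, 0.55, 0.50],
    [1.10, 0.90, 0.95, 0.70],
]

z_scores = [fisher_z(pearson_r(similarity, times)) for times in log_times]
t_value, df = one_sample_t(z_scores)

if __name__ == "__main__":
    print("mean z:", round(mean(z_scores), 2), "t:", round(t_value, 2), "df:", df)
```

Averaging in the z metric rather than averaging raw correlations is the reason Table 7 reports means of transformed values.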

A 2 (true/false) × 2 (semantic/co-occurrence
rating) × 2 (semantic/co-occurrence
decision) analysis of variance of the results 
indicated that the true/false difference was 
highly significant, F(1, 44) = 252.11, p <
.0001. No other effects were found to be
significant. In short, as predicted by both
models, “false” response times increased and
“true” response times decreased as the simi- 
larity of the pairs increased, whether the simi- 
larities of the pairs were measured by se- 
mantic overlap or trait co-occurrence, and 
whether the decision times were produced from 
verification of a semantic or a co-occurrence 
assertion. 

The previous analyses had established the 
necessary conditions for an examination of 
the effects of criteria alterations on the slopes 
of the relationship between reaction times 
and similarity. These slopes were computed 
separately for each subject in the same
fashion as were the correlations. A 2 × 2 × 2
analysis of variance of these data indicated
that the interaction between true/false and
type of question predicted by the exemplar
scanning model was not significant (F < 1).
No other effects were significant. Apparently, 
the mean absolute values of the slopes were 
the same in all conditions. 


General Discussion 


The results from both experiments can be 
easily summarized. First, when an assertion 
about a pair of traits was altered (either by 
changing the quantifier or by changing the 
nature of the assertion), such that greater 
psychological similarity was required for a 
positive response, the time to reach a positive 
decision increased, and the time to reach a 
negative decision decreased (Tables 2 and 6). 
Second, as the psychological similarity be- 
tween trait terms (whether assessed by defi- 
nitional or co-occurrence ratings) increased, 
the latency of positive decisions about the 
two traits decreased, and the latency of
negative decisions increased (Tables 3 and 7).
Third, the strength of the relationships be- 
tween psychological similarity and “true” and 
“false” latencies did not vary as the eviden- 
tiary criteria varied. Taken together, these 
results seem more consistent with the revised 
feature comparison model than with the ex- 
emplar scanning model. 

The former model explains the findings
summarized above by postulating a two-stage
decision process in which some decisions are
made after a first stage, whereas others are
made after a second stage. The more
time-consuming second-stage processing is required
when the similarity of the two feature lists
falls in between a “true” and a “false” cut
point. The placement of both cut points
depends on the exact nature of the assertion.
The relationship between similarity and
“true” and “false” response times follows
from the fact that an increase in similarity
increases the probability that “false”
responses will result from second-stage
processing and decreases that probability for
“true” responses. The effects of quantifier
and type of assertion are explained in terms
of shifting cut points.

Given the intuitive plausibility of an
exemplar representation of traits, it is
important to ask whether the exemplar scanning
model can be revised in a manner that retains
this representation of traits but which does
not predict the slope interaction. One
possibility is to alter our assumptions about
decision processes. Assume that subjects take
more than one exemplar from memory and
that their “true”/“false” decision is based
on whether the overall proportion of X
exemplars with the Y trait is greater than a
“true” criterion or less than a (different)
“false” criterion. Suppose further that
subjects will continue to draw samples until the
overall proportion is above the “true” or
below the “false” cut point. Finally, assume
that decision times are an increasing function
of the number of samples that are drawn.
This model does not predict the slope
changes. It also makes decision time
predictions that are identical to the feature
comparison model. In fact, it should be obvious
that the two new models assume virtually
identical decision processes and differ
primarily in the way in which trait terms are
represented in memory, and therefore in the
dimension that underlies the true/false
criterial decision.
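
A minimal simulation of this revised sampling process might look as follows. It assumes that exemplars are drawn in fixed-size batches, that each exemplar independently has the Y trait with probability p_trait, and that the search gives up after a fixed number of batches. The batch size, the cap, the "undecided" outcome, and the function names are hypothetical conveniences rather than parts of the model as stated.

```python
import random


def sampled_decision(p_trait, c_true, c_false, batch_size=3, max_batches=50, rng=None):
    """Draw batches of exemplars until the running proportion crosses a cut point.

    Returns (response, batches_drawn). The number of batches stands in for
    decision time, which the model treats as an increasing function of the
    number of samples drawn.
    """
    rng = rng if rng is not None else random.Random()
    positives = 0
    drawn = 0
    for batch in range(1, max_batches + 1):
        for _ in range(batch_size):
            if rng.random() < p_trait:
                positives += 1
            drawn += 1
        proportion = positives / drawn
        if proportion > c_true:
            return "true", batch
        if proportion < c_false:
            return "false", batch
    # Assumed fallback: stop without a response once the cap is reached.
    return "undecided", max_batches


def mean_batches(p_trait, c_true, c_false, trials=200, seed=0):
    """Average number of batches drawn over repeated simulated decisions."""
    rng = random.Random(seed)
    counts = [
        sampled_decision(p_trait, c_true, c_false, rng=rng)[1]
        for _ in range(trials)
    ]
    return sum(counts) / trials


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print("p_trait:", p, "mean batches:", mean_batches(p, 0.7, 0.3))
```

Because both responses depend on where the running proportion falls relative to two cut points, trait pairs with intermediate co-occurrence need more batches than pairs near either extreme, mirroring the second-stage region of the feature comparison model.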

While the exemplar scanning model can
perhaps be salvaged with the modifications
suggested above, it is important to realize
that the data have constrained that model so
that the decision processes in this new model
have much in common with those of the
feature comparison model. Although it appears
that we cannot decide from the data
presented here whether traits are represented in
memory as feature lists, exemplars, or even 
both, consistent with the original goal of this 
research, we can conclude that (a) whatever 
the mental representation of traits may be, 
the similarity of those representations seems 
to play a major role in implicit personality 
inferences; (b) decisions about people’s per- 
sonality characteristics seem to be governed 
by a multistage process in which some an- 
swers are based on a less extensive (or de- 
tailed) search of memory than other answers 
(the search is not exhaustive in all cases); 
(c) a different criterion seems to be used 
for “true” than for “false” responses; and 
(d) both evidentiary criteria seem to change 
as the nature of the assertion changes. Thus, 
models that argue that all decisions are made 
after an exhaustive scan or processing of 
the relevant evidence cannot explain the cur- 
rent results. Such models would, in general, 
not be able to explain the effects of
quantification on decision times found in these
experiments. Similarly, models that assume that
exhaustive scans or processing occur as a
default option when sufficient evidence for a
true response is not found (e.g., Collins & 
Quillian, 1969) are incapable of explaining 
the changes in mean “false” response times 
observed in both experiments. 

On the other hand, almost any view of 
the representation of trait terms (e.g.,
prototype, schema, script, network, feature list,
set of exemplars, visual image, etc.) could be 
postulated and combined with a multistage 
processing model similar to that described 
here, so as to explain the results from these 
experiments. This fact suggests the need for 
caution when claiming that social information 
takes a particular form in memory. As we 
noted in the introduction, most models of 
cognitive-social phenomena include
assumptions about the structure of information in
memory, the mechanisms by which that in- 
formation is retrieved from memory, and the 
decision process that operates on that in- 
formation. It can happen, therefore, that very 
different assumptions about one or more of 
these aspects of semantic decisions can yield 
identical predictions, provided other aspects 
of the model take an appropriate form (see
Anderson, 1978, and Townsend, 1974, for
similar views). For this reason, we believe
caution must be exercised when concluding
that a given set of results validates a
particular structural view or a particular
retrieval process.

It is of interest to note that a process
model that provides a useful explanation of
implicit personality inferences also provides
a useful explanation of decisions about
entirely different kinds of semantic information.
Apparently, the same model can be used to
understand how people decide whether a
mother is a chair as whether an intelligent
person is also honest. The same model also
seems to apply to other social stereotypes
besides traits. For example, Cohen (1977)
reported results of a reaction time
experiment in which people decided whether
particular attributes were characteristic of given
occupations. “False” reaction times increased
and “true” reaction times decreased as the
rated typicality of the attribute to the
occupation increased. It even appears that
decisions about the relevance of particular
attributes to oneself are governed by a
dual-criterion multistage decision process. Markus
(1977) found that reaction times for “like
me” decisions about trait attributes decreased
as the rated self-applicability of the trait
terms increased. In short, there is reason to
believe that questions about semantic
information, whether that information is about
traits, objects, occupations, oneself, or
something else, seem to be answered by a
multistage process in which the similarity of the
elements in the question is compared to
task-established evidentiary criteria.

It is both somewhat surprising and
important that the present dual-criterion multi-
stage model appears to have such wide gen- 
erality. As mentioned above, it seems likely 
that similar, if not identical, processes may 
be found for a variety of social psychological 
inference processes besides those involved in 
implicit personality inferences. 


Reference Note 


1. Kruskal, J. B., Young, W., & Seery, J. B. How 
to use KYST, a very flexible program to do mul- 
tidimensional scaling and unfolding. Bell Labora- 
tories technical memorandum, 1973.
Effects of Exogenous Changes in Heart Rate on
Facilitation of Thought and Resistance to Persuasion 


John T. Cacioppo 
University of Notre Dame 


Two experiments were conducted to examine the effects of an accelerated heart 
rate on information processing and resistance to persuasion. Experiment 1 
addressed the effects on cognitive performance of manipulating heart rate 
exogenously for brief periods of time. Fourteen subjects wearing implanted de- 
mand-type cardiac pacemakers performed reading comprehension and sentence 
generation tasks while their heart rate was either accelerated or not accelerated. 
Results revealed that performance was better when heart rate was accelerated 
than when it was not accelerated. Experiment 2 addressed the effects on counter- 
argumentation and resistance to persuasion of manipulating heart rate using the 
cardiac-pacing technique employed in Experiment 1. Subjects read highly in- 
volving counterattitudinal communications while their heart rate was either 
ostensibly or actually accelerated. Accelerated heart rate resulted in the generation
of more total thoughts and counterarguments than did basal heart rate; resistance
to persuasion was related significantly to the number of counterarguments
generated. The methodology used provides a means by which social
psychologists can study the effects on social processes of actual but unperceived
changes in physiological processes.


Much of our social behavior is affected by 
the state of our physiological systems. The 
important work on this issue by social psy- 
chologists has been concerned with the effects 
on cognition and behavior of perceived 
changes in the functioning of the autonomic 
nervous system (ANS; e.g., Detweiler &
Zanna, 1976; Schachter, 1964; Valins, 1966). 
However, little has been done with respect to 
the effects of actual but unperceived changes
in ANS functioning. Methodological rather
than theoretical problems seem responsible 
for this empirical deficit (cf. J. Lacey, 1967; 
J. Lacey & B. Lacey, 1974). Recently, a tech- 
nique has been found (Cacioppo, 1977) that 
may alleviate the latter shortcoming. The
purpose of this article is to describe this pro- 
cedure as well as to demonstrate that heart 
rate influences cognitive elaboration. This
will be done in two ways: (a) by showing 
that performance on an intellectual task re- 
quiring cognitive elaboration is enhanced by 
increased heart rate (Experiment 1) and (b) 
by showing that when resistance to persuasion 
depends on cognitive elaboration, greater 
resistance (counterarguing) occurs with high 
heart rates (Experiment 2). 


Cardiac Activity and Information Processing 


Angell and Thompson (1899) were perhaps 
the first to conclude that performing complex 
cognitive tasks led to an accelerated heart 
rate. Somewhat later, Darrow (1929a) distinguished
between stimuli that led to “excitation
of sensory end organs causing immediate
physiological effects but calling for no
extensive association of ideas” from those 
that led to “excitation in which the stimula- 
tion of sensory end organs is but incidental to 
the initiation of associative processes” (p. 
185). He concluded from a review of the lit- 
erature that (a) momentary sensory stimula- 
tion was accompanied generally by a decel- 
erated heart rate and (b) “ideational stim- 
uli” were associated with an accelerated heart 
rate. This review and an experimental study 
by Darrow (1929b) were followed by a lull 
of almost three decades in this area of re- 
search (J. Lacey, 1959; J. Lacey & B. Lacey, 
1958). 

Contemporary investigations of cardio- 
vascular psychophysiology have included (a) 
monitoring heart rate during the anticipation 
and performance of a wide variety of cogni- 
tive (e.g., sentence generation) and sensory 
(e.g., viewing flashing lights) tasks (see J. 
Lacey, 1959; J. Lacey, Kagan, B. Lacey, & 
Moss, 1963; Obrist, 1963); (b) the measure- 
ment of the differences in heart rate change 
during performance by individuals who differ 
dispositionally in their mode of processing 
task information (e.g., Blatt, 1961; Kagan & 
Rosman, 1964); (c) the manipulation of at- 
tributes of the task while monitoring heart 
rate (e.g., Cacioppo & Sandman, 1978; Cam- 
pos & Johnson, 1967; B. Lacey & J. Lacey, 
1974; Tursky, Schwartz, & Crider, 1970); 
(d) monitoring heart rate and somatic ac- 
tivity during the anticipation and performance 
of tasks, often under varying levels of motivation
(e.g., Elliott, 1974; Obrist et al.,
1974); and (e) monitoring sensory thresholds, 
reaction time, or cognitive and attitudinal 
responses following endogenous changes in 
heart rate (e.g., Cacioppo, Sandman, &
Walker, 1978; Sandman, McCanne, Kaiser, 
& Diamond, 1977; Surwillo, 1971). 


Methodological Considerations 


Although heart rate has been found generally
to covary with the cognitive complexity,
or difficulty, of a task (e.g., Cacioppo,
1977), the biological basis of this association 
has been and continues to be debated. The 
Laceys (J. Lacey, 1967; J. Lacey & B. Lacey,
1958) have formulated a neurophysiological
account of the variations of heart rate observed
during these tasks. They have speculated
that, ceteris paribus, an accelerated
heart rate is associated with and facilitates
cognitive elaboration, whereas a decelerated
heart rate is associated with and facilitates
sensory reception. Obrist and his colleagues
(Obrist, Gaebelein, Shanks, Langer, & Botticelli,
1976; Obrist, Webb, Sutterer, & Howard,
1970), however, have contended that the
observed changes in heart rate are accompanied
by changes in somatic activity, both of
which are mediated by a common central nervous
system (CNS) integrating mechanism.
Unfortunately, most of the research with
humans on which these writers base their
arguments is correlational rather than experimental.

Several design strategies might be adopted 
to provide information about the effects of 
changes in heart rate on information processing.
For instance, heart rate might be
varied exogenously without the individual's
knowledge, to determine if changes in heart
rate are sufficient for altering information
processing;¹ or physiological activity except
for heart rate could be varied systematically
to determine if changes in heart rate are
necessary for altering information processing.

The first strategy was adopted in the present
research. In Experiment 1, cardiac-pacing
techniques were employed to accelerate heart
rate during the performance of two cognitive
tasks (reading comprehension and sentence
generation). These procedures involve only
minimal risks; nevertheless, the research was
conducted in a clinical setting under the
supervision of a cardiologist. Unfortunately,


1 Exogenous manipulations of heart rate are
appropriate for investigating causal relations
between changes in heart rate and cognitive/sensory
behavior because (a) exogenous manipulation of
heart rate can be accomplished without the subject
knowing when and in what direction heart rate is
varied, thus circumventing the confounding effect of
the subject’s knowledge or belief of heart rate change
(cf. Valins, 1966); and (b) if behavioral effects are
observed to result from an exogenous change in
heart rate, then causal direction of the observed
association is established.
this setting contained no resources for the
measurement of any physiological process
besides heart rate, but previous research has
demonstrated that paced changes in heart
rate (within the range of rates employed in
the present research, i.e., 72-88 beats per
minute) are unaccompanied by significant
changes in normal visceral, somatic, or cardiac-output
responses (e.g., Gill, Jakobi, Morton,
& Wechsler, 1975; Karlof, Bevegard, &
Ovenfors, 1973).

The selection of heart rates for investiga- 
tion and the length of the experimental ses- 
sions in this research also were restricted. The 
pacemaker worn by each subject had been 
implanted for at least 3 months and was 
necessary to maintain heart rate in most instances
at 72 beats per minute (bpm).² Except
for surgical procedures, paced heart rate
could be changed only by the use of a mag- 
net, which, when placed properly, immedi- 
ately accelerated and maintained heart rate 
at 88 bpm. Thus, although it would have been 
desirable, the effects of a decelerated heart 
rate on performance could not be studied in 
this research. And time limitations restricted 
the type and number of tasks on which obser- 
vations could be made. 

Complex cognitive tasks were chosen for 
the present study because of the theory and 
research that suggest that an accelerated 
heart rate should facilitate performance on 
these tasks (e.g., J. Lacey, 1967; J. Lacey
et al., 1963). The sentence generation task 
has been used in previous research and was 
found to be associated with an accelerated 
heart rate (J. Lacey et al., 1963). A task
similar to reading comprehension was employed
by Spence, Lugo, and Youdin (1972):
Subjects were instructed to attend to, and 
were later asked to recall, sentences reflecting 
a certain theme while they listened to a 17- 
minute passage from a clinical interview. 
Spence et al. found that subjects displayed a 
decelerated heart rate (presumably denoting 
sensory intake) followed by an accelerated 
heart rate (presumably denoting cognitive 
elaboration or encoding) during the presen- 
tation of thematic sentences that were subse- 
quently recalled. This waveform was distinct 
from those displayed in the absence of thematic
sentences and in the presence of unrecalled
thematic sentences. Since the present
procedure allowed the subject to take as long 
as he or she wished to read the passages 
(hence, allowing subjects to control sensory 
intake by their allocation of reading time to 
items), we expected that an accelerated heart 
rate would facilitate the cognitive elaboration, 
or encoding, of the stimuli and thereby im- 
prove reading comprehension. 


Experiment 1 
Method 


type cardiac pacemaker for at least 3 months prior to 
participation in the experiment. All subjects were 
examined by a cardiologist and, immediately after- 


One could not read; (b) one was extremely nervous 
and refused to follow instructions once the experi- 
ment had begun; 
when her daughter entered the testing room and
began speaking with her; and (d) one was replaced 
because the placement of the magnet to increase 
heart rate could not 
menter. The remaining 14 subjects completed the 
pretest satisfactorily: 
heart rate of 72 bpm when a capped magnet was 
placed over a reed of the 
erated heart rate of 88 bpm when an uncapped mag- 
net was placed over a 
(c) an inability to identify the intervals during 
which heart rate was accelerated 
questions asked of the subject during 




2 Cardiac Pacemakers, Inc. (CPI) was the manu- 
facturer of the pacemakers. The demand-type pace- 
maker is characterized by its pacing of a person’s 
heart rate at a constant rate when natural pacing 
produces a rate below the preset level of the cardiac 
pacemaker. To our knowledge, this is the only pace- 
maker that provides a nonsurgical means of altering 
heart rate simply, and without the 
subject’s awareness. The preset level of the CPI 
pacemakers used in the present study was 72 bpm. 
The subjects employed in Experiments 1 and 2 
required cardiac pacing when seated to attain a 
heart rate of 72 bpm. 

Importantly, subjects in the present experiments 
were unable to report accurately when and how 
frequently their pacemaker was pulsing their heart. 
This finding is consistent with that of Nowlin, Eis- 
dorfer, Whalen, and Troyer (1971); they reported 
subjects were unaware of paced changes in heart 
rate that exceeded 110 bpm. 
Subjects were told to discontinue their participation
at any time they wished. Four subjects left after
their completion of the first series of tests (i.e., the
reading comprehension test); reasons for leaving 
included feeling fatigued, feeling nervous, and being 
late for another appointment. 

Materials and apparatus. Two types of cognitive 
tests were administered. The instructions, example 
passage and questions, and four test passages and
questions were selected from a college-level reading 
comprehension test for use in the first portion of 
the experiment. In the second portion of the experi- 
ment, a sentence generation task was employed. The 
task involved spending 90 sec generating sentences 
that (a) consisted of exactly five words, (b) made 
sense, (c) had good grammatical structure, (d) con- 
sisted of at least three words not contained in a 
previous sentence, and (e) began with the same
letter. (The letter for the practice trial was O; the 
first test-trial letter was E; and the letter used for 
the second test trial was A.) An Apollo stopwatch 
was used to time subjects during the tasks, and a 
CPI horseshoe magnet was used to manipulate heart 
rate. 

Design. A 2 × 2 factorial design was used. Heart
rate (basal vs. accelerated) served as a within-subjects
factor, and the order in which heart rate
was varied served as a between-subjects factor. The
tasks were administered in the same sequence to all
subjects. However, half of the subjects (a) read the
first and third passages of the reading comprehension
test and (b) performed the first test trial of the
sentence generation task with a basal heart rate of
72 bpm; these subjects (c) read the second and
fourth passages of the reading comprehension test
and (d) performed the second test trial of the sentence
generation task with an accelerated heart rate
of 88 bpm. The heart rate of the remaining subjects
was varied in the opposite sequence during the
tasks. Subjects were assigned randomly to the order
condition. (Passages served as an additional within-subjects
factor for the analyses of the reading comprehension
measures.)
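
The counterbalancing described in the Design paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `TRIALS`, `BASAL_BPM`, `ACCELERATED_BPM`, and `heart_rate_schedule` are hypothetical labels, and the schedule for the second order is inferred from the statement that heart rate was varied in the opposite sequence.

```python
BASAL_BPM = 72        # paced rate under the capped magnet
ACCELERATED_BPM = 88  # paced rate under the uncapped magnet

# Test trials in the fixed sequence given to every subject (labels are hypothetical).
TRIALS = [
    "passage 1", "passage 2", "passage 3", "passage 4",
    "sentence trial 1", "sentence trial 2",
]


def heart_rate_schedule(order):
    """Return {trial: bpm} for order 1 (basal first) or order 2 (accelerated first)."""
    if order not in (1, 2):
        raise ValueError("order must be 1 or 2")
    if order == 1:
        first, second = BASAL_BPM, ACCELERATED_BPM
    else:
        first, second = ACCELERATED_BPM, BASAL_BPM
    # Heart rate alternates from trial to trial, so each subject contributes
    # both basal and accelerated observations (the within-subjects factor).
    return {trial: (first if i % 2 == 0 else second)
            for i, trial in enumerate(TRIALS)}


if __name__ == "__main__":
    for order in (1, 2):
        print(order, heart_rate_schedule(order))
```

Under this reading, every passage and every sentence trial is performed at each heart rate by half of the subjects, which is what allows order to be treated as a between-subjects factor.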


Procedure. Subjects were tested individually while
sitting calmly at a table in a quiet room. The experimental
tasks were described, and an example
passage and questions for the reading comprehension
test were administered. Once subjects understood
what they were to do and were relaxed with the
procedure, the reading comprehension test was administered.

Either a capped (basal heart rate) or uncapped
(accelerated heart rate) magnet was placed over a
reed of the subject’s pacemaker while he or she read
each practice and test passage. Immediately following
the subject’s response that he or she had
completed reading the se
they had reported having read it. Four passages and
multiple-choice tests constituted the reading comprehension
test.

Next, the nature of and instructions for the sentence
generation task were explained. Subjects practiced
the procedure until the requirements of the
task were understood. Each subject's heart rate was
monitored, and either a capped or uncapped magnet
was placed over the pacemaker during the 90 sec in
which sentences were generated. At the onset of the
trial, the experimenter announced the letter with
which each sentence was to begin, and during the
trial, subjects listed the sentences on ruled paper
as they generated them.

Dependent variables. The percentage of the questions
for each passage that were answered correctly
served as the measure of reading comprehension;
the time that was spent reading each passage was
also recorded and analyzed.

The dependent measure of the sentence generation
task was calculated in the following manner: (a)
Each sentence that was generated in accordance with
all of the instructions listed was assigned a value
of 1.0; (b) sentences with errors were assigned a
value of (1.0 − .2X), where X denoted the number
of errors (0 was the least a sentence could be assigned).
The total points accumulated served as
the measure of cognitive performance.
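
The scoring rule for the sentence generation task can be written out as a short Python sketch. The function names `sentence_score` and `total_points` are hypothetical, and the sketch assumes that an "error" is one violated instruction, so that a sentence breaking all five instructions scores exactly 0.

```python
def sentence_score(n_errors):
    """Score one sentence: 1.0 if error-free, otherwise 1.0 - 0.2 * errors, never below 0."""
    if n_errors < 0:
        raise ValueError("n_errors must be non-negative")
    return max(0.0, 1.0 - 0.2 * n_errors)


def total_points(error_counts):
    """Sum the sentence scores for one 90-sec trial.

    error_counts holds one integer per generated sentence.
    """
    return sum(sentence_score(n) for n in error_counts)


if __name__ == "__main__":
    # Hypothetical trial: four sentences with 0, 0, 1, and 3 errors.
    print(round(total_points([0, 0, 1, 3]), 2))
```

Because the measure is a sum over sentences, it rewards both the number of sentences produced and their conformity to the instructions.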


Results and Discussion 


Fourteen subjects completed the reading 
comprehension task, and 10 of the 14 subjects 
completed the sentence generation task. A 
multivariate analysis of variance was con- 
ducted for the cognitive performance of the 
10 subjects who completed the experiment;
the dependent measures were the percentage
of correct answers in the reading comprehension
test and the number of points accumulated
in the sentence generation task. The
analyses indicated that accelerated heart rate
facilitated performance, F(3, 6) = 5.63, p <
.05.

Univariate analyses of variance revealed


3 Since pacemakers are sometimes implanted in
other than the upright position, location of the reed
was determined by monitoring heart rate while
placing the magnet on the skin over the upper and
lower portions of the implanted pacemaker. When
placement of the magnet resulted in an accelerated
heart rate of 88 bpm, the placement was marked
with a felt pen. Heart rate was also monitored during
the pretest and test intervals. In all instances,
heart rate did not vary from the paced levels.

4 The person who scored the sentences was blind
to the experimental conditions to which the subject
had been assigned.
that (a) reading comprehension was better
during accelerated heart rate (M = 48.68%)
than during basal heart rate (M = 39.33%),
F(1, 12) = 5.24, p < .025, one-tailed comparison;
(b) performance of the sentence
task was improved marginally by accelerated
heart rate (M = 3.88) compared to basal
heart rate (M = 3.27), F(1, 8) = 2.87, p <
.08, one-tailed comparison; and (c) reading
time was not affected by heart rate or order
(Fs < 1).
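
Each of these univariate tests has one numerator degree of freedom, so the F ratio equals the square of the corresponding t statistic; that identity is what makes a directional (one-tailed) comparison meaningful here. The sketch below, with the hypothetical helper `t_from_f`, simply restates the reported values in those terms and is not part of the original analysis.

```python
import math


def t_from_f(f_value):
    """For a test with 1 numerator df, |t| equals the square root of F."""
    if f_value < 0:
        raise ValueError("F cannot be negative")
    return math.sqrt(f_value)


# Values reported in the text.
reading_t = t_from_f(5.24)    # reading comprehension, 12 error df
sentence_t = t_from_f(2.87)   # sentence generation, 8 error df

# Differences between the reported condition means.
reading_gain = 48.68 - 39.33  # percentage points
sentence_gain = 3.88 - 3.27   # points

if __name__ == "__main__":
    print(f"reading: t(12) = {reading_t:.2f}, gain = {reading_gain:.2f} points")
    print(f"sentence: t(8) = {sentence_t:.2f}, gain = {sentence_gain:.2f} points")
```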

A significant Heart Rate × Order interaction,
F(1, 12) = 51.68, p < .01, for reading
time signified that the second and fourth passages
of the reading comprehension test (M
= 78.1 sec) took longer to read, on the average,
than did the first and third passages
(M = 45.4 sec). Since this interaction was 
not significant for the percentage of correct 
answers on the reading comprehension test 
(F < 1), these results indicate that reading 
time did not affect the role of heart rate on 
cognitive performance in the present study. 

Together, these results indicate that heart 
rate accelerated peripherally is sufficient to 
facilitate performance of a difficult cognitive 
task when the change in heart rate is brief 
(e.g., 30-90 sec) and ongoing physiological
activity reflects a normal state.⁵


Experiment 2 


Previously, multiple sessions of discrimina- 
tive operant training have been employed to 
modify heart rate while subjects maintained 
a relatively constant level of somatic and 
respiratory activity (Cacioppo et al., 1978). 
The results revealed that individuals gen- 
erated more counterarguments and were more 
resistant to persuasion when heart rate was 
accelerated than when it was decelerated. It 
appeared that brief accelerations of heart 
rate when unaccompanied by large increases 
in somatic activity produced resistance to 
persuasion by stimulating the cognitive elab- 
oration of and responding to the counteratti- 
tudinal communication. Thus, it was expected 
that the brief increase in heart rate produced 
by using cardiac-pacing techniques would facilitate
thought production (e.g., counterargumentation)
and, ultimately, resistance to
persuasion. The purpose of Experiment 2 was
to provide a direct test of this hypothesis. 


Method 


Subjects. Twenty-four healthy and articulate out- 
patients at a large midwestern university hospital 
volunteered to participate in the experiment. Experi- 
ment 2 was conducted in the same setting and in a 
similar manner to Experiment 1. As before, subjects 
were examined first by their cardiologist and by an 
experimenter who was unaware of the experimental 
hypotheses. Each subject displayed (a) a basal 
heart rate of 72 bpm when a capped magnet was 
placed over a reed of the pacemaker, (b) an accel- 
erated heart rate of 88 bpm when an uncapped mag- 
net was placed over the reed of the pacemaker, and 
(c) an unawareness of the intervals in which heart 
rate was manipulated. Two subjects were deleted 
from analyses because the experimenter failed to 
follow the experimental procedure correctly. 

Materials and apparatus. Subjects were given the 
first half of the reading comprehension test em- 
ployed in Experiment 1 during a preliminary exami- 
nation and instruction period. These stimuli consti- 
tuted the materials used in the preliminary task. 
Materials for the experimental task were developed 
in pilot testing with four elderly subjects; communi- 
cations were developed that were highly involving 
and counterattitudinal. The advocacies selected were 
that (a) all Social Security and Medicare programs 
be eliminated and (b) the drinking and voting age 
in Ohio be lowered to 13 years of age. 

A CPI horseshoe magnet again was used to close
the reed in the pacemaker and thus raise heart rate, 
and an Apollo stopwatch was used to time reading 
intervals. 

Design. A 2 × 2 factorial design was used. Heart
rate (basal vs. accelerated) served as the within- 
subjects factor, and the order in which heart rate 
was varied (basal-accelerated vs. accelerated-basal) 


5 To assess if it was necessary to vary heart rate
within a constant pattern of peripheral nervous system
activity to affect cognitive performance, a portion
of the experiment was replicated using college
undergraduates who had no pacemakers. They either
stood or lay down while performing a reading
comprehension task; these postural variations caused
not only heart rate to differ but also respiration,
muscle tension, and blood pressure. In this situation
neither reading time nor reading comprehension was
affected by the heart rate of the subject during the
performance of the task. One should be cautious in
comparing these results with the previous experimental
results because of the noncomparability of the
subject samples. Nevertheless, these studies together
suggest that increased heart rate facilitates cognitive
performance, particularly if the increase is relatively
independent of major changes in physiological activity.
served as the between-subjects factor. Again, the
stimuli were presented to all subjects in the same 
sequence. 

Procedure. The procedure was similar to that 
used in Experiment 1 except that subjects engaged 
in a longer preliminary task to allow adaptation to
the experimental procedures prior to the administra- 
tion of the experimental task. Subjects were tested 
individually while sitting at a table in a quiet room. 
Subjects were told that they would first read two 
passages and answer questions about the content of 
each passage; afterwards, they would hear two 
advocacies, and their opinions about each would be 
solicited. 

The preliminary task was then begun. Two pas- 
sages of the reading comprehension test were 
administered. During the performance of the prelimi- 
nary task, (a) heart rate while a capped and un- 
capped magnet was placed over a reed of the pace- 
maker was again determined; (b) task requirements 
were clarified; and (c) all subjects were assured that 
the changes in their heart rate were not hazardous 
to their health. (Greater precautions were taken in 
the present experiment than in Experiment 1 because
of the involving nature of the experimental
task.) 

Subjects then performed the experimental task.
Subjects were informed that two proposals perti- 
nent to them were under consideration and that 
messages had been prepared to explain the proposals.
Subjects read each advocacy at their own
speed. After reading each, subjects were given 2.5 
min. to list everything about which they had thought 
while reading (thoughts were reported aloud by 
speaking into a tape recorder) and were asked to 
rate their agreement with the advocacy.⁶

Dependent variables. The tape-recorded listings 
of thoughts were transcribed by a judge who was 
unaware of the experimental hypotheses. Transcribed 
as a “cognitive response” 
expressing a single thought or idea; although gram- 


cognitive responding, 

A second judge, 
ditions to which 
transcribed cognitive ri e 
scored as favorable, unfavorable (i.e., counterargument),
or neutral/irrelevant toward the


A third judge, who also was un; i 
mental conditions ti i P a 


the uniform nature 
discussion below). 


“How much do you 
you just read about?” 
scale in which 1 was labeled “disagree completely”
and 15 was labeled “agree completely.” 


Although there were procedural differences among 
the subjects during the preliminary task of this 
experiment, the percentage of questions answered 
correctly and the reading time were also monitored 
and analyzed. 


Results 


Twenty-two subjects completed the pre- 
liminary and experimental tasks. Multivariate 
analyses of variance were performed for the 
set of eight dependent measures; heart rate 
and order served as the criteria. The analyses 
revealed that heart rate affected cognitive and 
affective response, F (8, 13) = 3.17, p < .05. 
Univariate analyses of variance were then 
performed. The means are summarized in 
Table 1. 

Preliminary task. Neither heart rate nor 
order affected reading comprehension or read- 
ing time during the preliminary task. Heart 
Rate X Order interactions were found for 
both reading comprehension, F(1, 20) = 
7.39, p < .05, and reading time, F(1, 20) = 
11.65, p < .01. These effects denoted only 
that the first passage was read more quickly 
and comprehended less completely than the 
second passage (see Table 1). This result is 
due possibly to the subjects’ apprehension


6 Subjects were asked to verbally report everything
about which they thought during their reading
of the advocacy, because pilot testing indicated subjects
had noticeable difficulty in writing. This difficulty
was also noted during Experiment 1, and may
possibly account for the attenuated effects of heart
rate on the performance of the sentence generation
task observed in Experiment 1.

7 The procedures employed for scoring were
adapted from Petty and Cacioppo (1977) and
Cacioppo and Petty (1979). Statements directed
against the advocated position that mentioned specific
unfavorable consequences, statements of alternative
methods, challenges to the validity of arguments
in the message, and statements of affect opposing
the advocated position were counted as
counterarguments. Statements in favor of the advocated
position that mentioned specific favorable consequences,
statements eliminating alternatives, statements
that supported the validity of the message,
and statements of positive affect regarding the advocacy
were scored as favorable thoughts. All other
statements were classified as neutral/irrelevant
thoughts.
Table 1
Summary of Cognitive Measures for Preliminary and Experimental Tasks, Experiment 2

                                              Heart rate

                                 Basal (72 beats          Accelerated (88 beats
                                   per minute)                 per minute)

                               Order 1      Order 2       Order 1      Order 2
Measure                      (Message 2)  (Message 1)   (Message 1)  (Message 2)

Preliminary task
  Reading comprehension          41.73        26.36         19.73        38.91
  Reading time                   96.73        47.54         47.36        92.91
Experimental task
  Total thoughts                  3.82         3.00          4.82         3.55
  Counterarguments                3.73         2.91          4.73         3.27
  Favorable thoughts               .09          0             0            0
  Neutral/irrelevant thoughts      0            .09           .09          .27
  Attitude                        1.67         3.82          1.91         3.36
  Reading time                  145.00       121.64        134.82       138.64

Note. Entries for reading comprehension are in terms of percentage of correct answers, whereas entries for
reading time are in seconds. Higher means on the attitude measure indicate more agreement with the speaker's
position, where 1 indicates “disagree completely” and 15 indicates “agree completely.” Remaining entries
designate the mean frequency observed for each type of cognitive response measured.


about the experiment and confusion about
what they were to do when they started.

Experimental task. It was hypothesized
that an accelerated heart rate would facilitate
cognitive responding. The analyses supported
the hypothesis: More total thoughts, F(1, 20)
= 20.64, p < .001, and counterarguments,
F(1, 20) = 12.36, p < .01, were produced
when heart rate was accelerated than when
it was not accelerated (see Table 1).

It was also hypothesized that heart rate
would affect resistance to persuasion. The
analyses of variance provided no support for
this hypothesis; no treatment or interaction
of treatments affected subjects’ rating of
agreement with the advocacies. However, an
inspection of the raw data revealed that 59%
of the subjects’ responses to the attitude
measures were 1 (“disagree completely”);
thus, the attitude measure was probably insensitive
to treatment effects. In other words,
subjects were able to counterargue and reject
completely the advocacies, whether their heart
rate was accelerated (thus, facilitating cognitive
responding) or not accelerated. Accordingly,
a total of one favorable thought and
few neutral/irrelevant thoughts were generated
during the entire experiment. Within-cell
correlations between the cognitive response
and attitude measures were calculated
and indicated that agreement with the advocacy
was related (a) negatively with the
number of counterarguments generated (r =
−.46, p < .05); (b) positively with the number
of neutral/irrelevant thoughts generated
(r = .42, p < .05); and (c) negatively with
the number of total thoughts generated (r =
−.44, p < .05). Counterargumentation and
total thoughts were correlated positively and
highly (r = .99, p < .001), denoting that
most of the thoughts generated were unfavorable
toward the advocacies (see Table 1).
Finally, counterargumentation and the number
of neutral/irrelevant thoughts were related
negatively (r = −.35, p < .05).
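
The very high correlation between counterarguments and total thoughts follows from how the thought categories add up in Table 1. The Python sketch below checks that the three category means sum to the total-thoughts mean in each cell. It reads the fractional table entries as .09 and .27; the cell keys and the name `max_discrepancy` are hypothetical.

```python
# Cell means from Table 1, keyed by (heart rate condition, order).
CELLS = [("basal", 1), ("basal", 2), ("accelerated", 1), ("accelerated", 2)]

total_thoughts = dict(zip(CELLS, [3.82, 3.00, 4.82, 3.55]))
counterarguments = dict(zip(CELLS, [3.73, 2.91, 4.73, 3.27]))
favorable = dict(zip(CELLS, [0.09, 0.0, 0.0, 0.0]))
neutral_irrelevant = dict(zip(CELLS, [0.0, 0.09, 0.09, 0.27]))


def max_discrepancy():
    """Largest gap between the reported total and the sum of the three categories."""
    return max(
        abs(total_thoughts[c]
            - (counterarguments[c] + favorable[c] + neutral_irrelevant[c]))
        for c in CELLS
    )


def counterargument_share(cell):
    """Proportion of a cell's mean total thoughts that were counterarguments."""
    return counterarguments[cell] / total_thoughts[cell]


if __name__ == "__main__":
    print(f"max discrepancy: {max_discrepancy():.2f}")
    for cell in CELLS:
        print(cell, f"{counterargument_share(cell):.2f}")
```

The category means reproduce the totals to within rounding, and counterarguments make up more than nine tenths of the thoughts in every cell.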

The positive association between the num- 
ber of neutral/irrelevant thoughts generated 
and susceptibility to persuasion was unex- 
pected. The production of neutral/irrelevant 
thoughts may indicate that a person was 
momentarily or partially distracted from re- 
sponding cognitively to the advocacy while 
reading. Previous research has demonstrated 
that distraction during the presentation of a 
counterattitudinal communication increases 
susceptibility to persuasion by inhibiting the 
production of counterarguments (cf. Petty, 
Wells, & Brock, 1976). Thus, the generation 
Encoding of Personal Information:
Self-Other Differences
Self-other differences in processing personal information were investigated by
having subjects make self-referent (describes you?) or other-referent (describes
experimenter?) ratings of personal adjectives. Results from five studies indicated
that self-ratings were consistently judged as easier to make, and subjects
always placed more confidence in these judgments. An analysis of rating
times showed that only adjectives with long rating times were recalled for the
unknown-other-referent task (Experiments 2 and 3). In contrast, the recalled
words for the self-referent task had very short rating times. This difference is explained
via a “two-process” interpretation. Unknown-other-referent processing
involves a relatively inefficient rehearsal or effort strategy, whereas self-referent
processing involves the self as a highly organized and efficient schema. Finally,
the effects of familiarity on other-referent processing were examined. A model
of other processing is formulated to account for the observed changes in processing
information about a familiar other (Experiments 4 and 5).


The present article is concerned with how
information about the self and others is processed.
The focal concern is upon the kinds
of memory traces produced by judgments
about the self and judgments about others.
Our major goal is to delineate processing differences
between these two kinds of judgments.

Some recent research and theory in personality
and social psychology has been concerned
with how people process personal information
(Cantor & Mischel, 1977; Markus,
1977; Rogers, Kuiper, & Kirker, 1977; Ross,
1977). These researchers have begun to map
out some of the biases and facilitations that
emerge when people process input data about
themselves and others. Some of the findings
include the following: (a) Adjectives processed
using self-reference produce extremely
deep and elaborate memory encodings (Rogers
et al., 1977); (b) subjects can produce
more behavioral exemplars for traits that
they themselves possess (Markus, 1977); (c)
there are several kinds of biases in recognition
memory that are related to the personality
of either a target person (Cantor &
Mischel, 1977) or the subject (Rogers, Rogers,
& Kuiper, in press); and (d) resistance
to incorrect personal data is related to self-perception
(Markus, 1977). Each of these
findings sheds some light on the kinds of
transformations that occur to personal data
as they are absorbed into the human cognitive
system.


The Role of Cognitive Prototypes or 
Schemata 


The theoretical notion of prototype or 
schema has been advanced to account for the 
findings mentioned above. For example, Markus (1977) postulates that a
person’s organization, summation, and explanation of personal data involve
the use of “self-schemata,” which are “cognitive generalizations about the
self, derived from past experience, that organize and guide the processing of
self-related information” (p. 63). These
schemata or cognitive structures are thought 
to interact with incoming data and produce 
some of the biasing and facilitating effects 
outlined earlier. For example, Cantor and 
Mischel (1977) showed subjects a series of 
statements describing an introvert (as well as 
some other statements). The subjects were 
then shown a list of items and asked to indi- 
cate whether each item had been in the first 
series of statements. These subjects showed 
a consistent bias to identify novel, but highly 
introvert-related, items as having been in 
the first series, when they had not. These
data indicate that the subjects had abstracted 
the concept of introvert during the initial 
list presentation and used this abstraction as 
a comparator or reference during the identification task. When novel items representing
similarity to this abstraction were presented, 
the subjects tended to misidentify them. The 
abstraction represents a prototype or schema, 
and its biasing effect on information process- 
ing is clearly reflected in the Cantor and 
Mischel data (see also Rogers et al., in 
press). The notion of schema or prototype 
has found support in most of the work con- 
cerned with processing personal data (e.g., 
Markus, 1977; Rogers et al., 1977). 
Schemata or prototypes have been postu- 
lated to underlie processing of other kinds 
of information as well. For example, Posner 
and Keele (1970) demonstrate biases, not 
unlike those shown by Cantor and Mischel 
(1977), in a pattern recognition situation. 
Memory for connected discourse (Bartlett, 
1932), faces (Reed, 1972), and embedded 
sentences (Bransford & Franks, 1971) has
been interpreted using schemata or prototypes
as the major theoretical construct.


Self-Other Differences in Processing 
Personal Information 


The present article is concerned with the kinds of structures that are
involved in processing information about other people. It is assumed that some
kind of cognitive structure (possibly a schema or prototype) is involved when
we process information about others. The critical question is whether these
“other structures” are any different from the “self-schema” documented in the
previously outlined research. In an effort to explore this question, the
present article involves a comparison between memory encodings generated from
tasks involving judgments of self and those from tasks involving judgments of
others. If it is found that the self-schema and the other schema generate
similar memory encodings, the contention that all personal information (both
self- and other-referent) is processed using a common schema system will be
supported. If, on the other hand, self- and other information produces
different memory encodings, we will have evidence supporting the self as a
distinctive cognitive schema. Further, it will be possible to explore the
kinds of differences between self- and other information processing, with an
eye to formulating a model of how we process information about other people.

The self-other comparison used in the present experiments also speaks to an
important methodological point embedded in the Rogers et al. (1977)
experiments. These experiments documented the relative strength of
self-reference as an encoding device, thereby supporting the analyses
indicating that the self is a schema. In these studies, subjects first made a
series of ratings of personal adjectives. For example, subjects had to make
synonymity judgments for some of the words. Different words were rated as to
whether they described the subject. These rating tasks produced memory traces.
The strength of the memory trace was assessed using an incidental recall task,
after the ratings were completed. Subjects were instructed to remember as many
as they could of the words they had rated. The results of these experiments
revealed a clear recall superiority for the self-reference rating task. This
result was interpreted as indicating that self-reference produces strong and
elaborate memory traces and in turn was offered as support for the self as a
schema.

There is one important weakness in this argument. Self-reference was compared
(and found superior) to rating tasks involving judgments of synonymity,
semantic specificity, rhyming quality, and type size. None of these comparison
tasks involved making judgments about people. It is possible that the superior
recall for the self-reference task is due to the involvement of a person in
the task and that any rating task involving a person would produce superior
recall. This “generalized person hypothesis” suggests that any rating task
with an explicit person referent (self or anyone else) enhances incidental
recall in this paradigm. If this hypothesis were correct, both the conclusion
that the self produces an elaborate memory trace and the formulation of the
self as a schema would be questioned.

The present article presents a series of experiments intended to explore the
processing of self- and other-referent information using the incidental recall
paradigm from Rogers et al. (1977), which derives from Craik and Tulving
(1975). The first two experiments explored self-other differences in this
paradigm; the later experiments were designed to elaborate the obtained
results. The goal of this experimental series was to clarify the nature of
self- and other-referent processing of personal data, in an attempt to better
understand the cognitive structures involved in processing information about
people.


The Present Experiments


The first two experiments were simple attempts to determine if there are
incidental recall differences following self- or other-referent rating tasks.
However, before the data are presented, it is instructive to consider what
kinds of processes may be involved when a person is required to rate whether a
given adjective (e.g., jolly) describes another person. These kinds of rating
tasks are not foreign to the personality domain, since they are used in the
assessment literature (e.g., peer ratings) and person perception research
(Schneider, 1973). Our cognitive position persuades us to explore the kinds of
processes and structures that may be involved in this task.

The cognitive literature gives a starting
point for considering processes involved in
judging other people. Several researchers have
explored memory traces involved in making
various ratings about people, using a photo 
as the stimulus. For example, subjects may 
rate either physical qualities of the person 
(e.g., size of nose) or psychological qualities 
(e.g., honesty). In a subsequent recognition 
task, subjects are better able to identify the 
faces given the psychological (deeper) rat- 
ings (Bower & Karlin, 1974). These data in- 
dicate that ratings of others that involve 
psychological qualities tend to create elabo- 
rate memory traces, compared to physical 
(shallow) tasks. The studies are not overly 
instructive as to what the underlying mecha- 
nisms might be during the rating task. The 
one hint is Bower and Karlin’s suggestion 
that prototypes mediate the judgments of 
others. 

Another area that relates to judgments 
about others is person perception (e.g., 
Schneider, 1973; Wiggins, 1973). One finding 
that consistently emerges in this domain is 
that the lay perceiver finds it easy to go 
from behavioral data to inferences about dis- 
positions (Hastorf, Schneider, & Polefka, 
1970). In rating another person, the rater 
is apt to use factors such as appearance, be- 
havior, social relationships, typical contexts, 
personal origins, and internal properties 
(Fiske & Cox, in press). The mechanisms
involved in making these inferential leaps 
are usually thought to involve certain classes 
of conceptual structures that prescribe cer- 
tain behavior-trait relationships. These struc- 
tures, sometimes referred to as implicit per- 
sonality theories, facilitate generation of dis- 
positional inferences. 

These two domains of research (rating 
faces and implicit personality theory) give 
us some ideas as to how people may make 
ratings of other people. Presumably, the cog- 
nitive structures associated with the person’s 
implicit personality theory form an integral 
part of this rating process. This being the 
case, rating another person with respect to 
psychological characteristics should provide 
a relatively elaborate (or deep) memory 
trace. The research involving faces supports 
this contention. In toto, the data suggest 
that trait or “deep” ratings of others should 
enhance incidental recall. The critical question
is whether this enhancement is comparable
to that documented for self-referent
judgments in the Rogers et al. (1977)
experiments.

Since both the self- and other-referent tasks involve cognitive structures
(self and implicit personality theories, respectively), it may be possible
that [...]tions as well. It is most likely that both cognitive structures
allow [...] reduce the incoming data to a manageable size. Implicit
personality [...] to allow us to “simplify the complex world of other people”
[...] whereas the self-sch[...]stracted essence [...] him or herself,” [...]
encountered over a lifetime” (Rogers et al., 1977, p. 677). Thus, [...]
structures that appear to possess very similar functions. This situ[...]
differentiation between [...] tasks may be difficult, [...] ratings have been
included. It is hoped that [...] as to the nature of self- [...]tion
processing.


This first experiment [...] methodology of the incidental [...] and offered
initial data [...] experiment using [...] somewhat different methodology.

Method 


In all of the following experiments, [...] stages were involved. First,
subjects [...] of adjectives. The rati[...] stage and were [...] the
presentation of the adjective. [...] ratings, [...] remember the words they
rated.
Materials. The 48 adjectives (40 target adjectives plus 4 recency and 4
primacy buffers) represented a broad set of personality characteristics,
chosen from the scale descriptions of the Personality Research Form (Jackson,
1967). The details of selection are presented in Rogers et al. (1977, p. 680).
The rating tasks used are detailed in Table 1. Four task orders were generated
to ensure that each adjective was rated under each task. Within each list one
quarter of the words were rated under each task. Task order was randomized in
blocks of four, such that each rating task was performed once in a block.
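
The blocked randomization and counterbalancing just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the seed, and in particular the label-rotation scheme used to derive four lists are hypothetical, since the text says only that four task orders were generated and does not specify how.

```python
import random

TASKS = ["structural", "semantic", "other-referent", "self-referent"]


def blocked_task_order(n_trials, tasks=TASKS, seed=None):
    """Return a task sequence in which every consecutive block of
    len(tasks) trials contains each task exactly once, in random order."""
    if n_trials % len(tasks) != 0:
        raise ValueError("n_trials must be a multiple of the number of tasks")
    rng = random.Random(seed)
    order = []
    for _ in range(n_trials // len(tasks)):
        block = list(tasks)
        rng.shuffle(block)  # randomize task order within this block
        order.extend(block)
    return order


def rotated_lists(order, tasks=TASKS):
    """Derive len(tasks) lists by rotating the task labels (an assumed
    scheme), so that each serial position, and hence each adjective,
    is paired with every task exactly once across the lists."""
    k = len(tasks)
    index = {task: i for i, task in enumerate(tasks)}
    return [[tasks[(index[t] + shift) % k] for t in order] for shift in range(k)]


base_order = blocked_task_order(48, seed=1)  # 48 adjectives, buffers included
task_lists = rotated_lists(base_order)       # four counterbalanced lists
```

Rotating labels keeps the block structure intact in every list, because a rotation maps a permutation of the four tasks onto another permutation.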
Procedure. Data were collected during the first session of an undergraduate
psychology class. With the exception of substituting “describes him” for the
phonemic cue question, the procedure was identical to that in Experiment 2 of
Rogers et al. (1977).
Experimenter. The target for the other-referent ratings was a male psychology
graduate student. The experimental session was his first encounter with these
subjects, who were students in his statistics lab.
Subjects. Twelve (11 female) undergraduate psychology students served as
subjects. Their mean age was 20.2 years.


Results 


Recall performance. The recall protocols were scored under the following
conditions: (a) The first and last four adjectives pre[...] effects; (b)
grammatical transformations of the words were scored as wrong; and (c) a
proportion correct score was employed to insure that differential numbers of
Yes and No ratings were not affecting the recall scores (see Rogers et al.,
1977, pp. 683-684, for details and rationale of this proportionality score).
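
As an illustration of the proportion correct score in (c), the following minimal Python sketch computes, for each task and rating, the fraction of the adjectives given that rating that were later recalled. The data layout, the function name, and the example trials are hypothetical; the actual scoring details are those given in Rogers et al. (1977).

```python
from collections import defaultdict


def adjusted_recall(trials):
    """Compute the proportion recalled for each (task, rating) cell.

    trials: iterable of (task, rating, recalled) tuples, where rating is
    'yes' or 'no' and recalled is a bool. Cells in which no word received
    the rating are omitted, because the proportion is undefined there.
    """
    rated = defaultdict(int)
    recalled = defaultdict(int)
    for task, rating, was_recalled in trials:
        rated[(task, rating)] += 1
        if was_recalled:
            recalled[(task, rating)] += 1
    return {cell: recalled[cell] / n for cell, n in rated.items()}


# Hypothetical protocol for one subject (not data from the experiment).
example_trials = [
    ("self-referent", "yes", True),
    ("self-referent", "yes", False),
    ("self-referent", "no", False),
    ("other-referent", "yes", False),
    ("other-referent", "no", True),
    ("other-referent", "no", True),
    ("other-referent", "no", False),
    ("other-referent", "no", False),
]
scores = adjusted_recall(example_trials)
```

Dividing by the number of words that received each rating is what keeps a subject who says Yes to many adjectives under one task from appearing to recall more under that task simply because more words fall in that cell.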

The mean proportions correct as a function of rating task and rating (Yes or
No) are presented in Table 1. As can be seen, the [...] The main effect of
rating task was significant, F([...]) = 2.98, p < .05, as was the difference
between the self- and other-referent tasks (p < .05). The post hoc test also
revealed that other-referent recall was not superior to the semantic task. For
the structural, semantic, and self-referent tasks, the pattern of present
results replicates exactly the findings of Rogers et al. (1977).

Table 1
Rating Tasks, Mean Adjusted Recall, and Confidence and Difficulty Ratings for Experiment 1

                                  Mean adjusted recallᵃ
                                  Yes      No               Mean          Mean
Rating task     Question          rating   rating   M       difficultyᵇ   confidenceᶜ
Structural      Long?             [...]    [...]    [...]   [...]         [...]
Semantic        Specific?         .16      [...]    .15     [...]         [...]
Other-referent  Describes him?    .09      .27      .18     5.00          2.75
Self-referent   Describes you?    .42      .21      .31     2.00          6.00
M                                 .20      .20

Definitions of the rating tasks:
Structural: Rate whether you feel the word is long or short.
Semantic: Rate whether you feel the word has a specific meaning or relates to a specific situation.
Other-referent: Rate whether you feel the word describes the experimenter.
Self-referent: Rate whether you feel the word describes you.

ᵃ Adjusted recall values can range from 0 to 1.00. A value of 1.00 indicates that all adjectives receiving a particular rating (either yes or no) for a particular task were recalled. A value of 0 indicates that none of these adjectives were recalled.
ᵇ Difficulty ratings were made on a 7-point scale with endpoints of (1) “not at all difficult” and (7) “extremely difficult.”
ᶜ Confidence ratings were made on a 7-point scale with endpoints of (1) “not at all confident” and (7) “extremely confident.”

Clustering. One possible explanation of the recall findings is that subjects
simply wrote down every adjective they felt was self-descriptive during the
recall phase. If they did this, their self-referent recall would appear to be
excellent relative to the other tasks. This hypothesis predicts two things.
First, there should be a large number of intrusions in the recall protocols.
This was not found (M = .25 intrusions per subject). Second, if subjects were
merely listing the features stored in their self-concepts, all words rated
under the self-referent task should be emitted together, or in a definite
cluster. The measure of clustering outlined by Dalrymple-Alford (1970) was
calculated for the recall protocols to assess this possibility. This measure
permits a comparison between the number of repetitions observed in a series
and the number expected if the words were emitted randomly. The
Dalrymple-Alford measure also adjusts for length of protocol and the number of
each type of element observed. Significant deviations between the observed and
expected number of repetitions would support the hypothesis that the superior
self-referent recall is due to a listing of the features comprising the
self-concept.
The series analyzed for each subject was 
the task under which each successive item 
in the recall protocol was rated. While there 
was a tendency for there to be more repeti- 
tions observed than expected (M = 3.17 and 
2.61, respectively), these figures never ap- 
proached statistical significance, x(11)= 
1.13, p (sign test) = .61, suggesting that out- 
put order was close to that expected by 
chance. This difference remained nonsignifi- 
cant even when several variations of the Dal- 
rymple-Alford measure were used. To save 
space, it will be mentioned here that the same 
results were obtained for all recall data re- 
ported in this article, suggesting that there is 
not a meaningful amount of clustering in the
recall data of these experiments. As suggested 
in Rogers et al. (1977), the incidental recall 
data appear to reflect more encoding than 
retrieval processes. 
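
The comparison of observed with expected repetitions can be made concrete with a short Python sketch. It counts adjacent same-task pairs in a recall protocol and computes the number expected if the same items were emitted in random order, using the standard chance expectation for a random permutation, the sum over tasks of n_i(n_i - 1) divided by N. This is offered as an assumption-laden illustration; it is not claimed to reproduce the exact Dalrymple-Alford (1970) formula or its variations, and the example protocol is hypothetical.

```python
from collections import Counter


def observed_repetitions(sequence):
    """Count adjacent pairs of recalled items that share a rating task."""
    return sum(1 for a, b in zip(sequence, sequence[1:]) if a == b)


def expected_repetitions(sequence):
    """Expected number of adjacent same-task pairs if the same items
    were emitted in a random order.

    With n_i items from task i and N items in all, a given adjacent pair
    matches with probability sum(n_i * (n_i - 1)) / (N * (N - 1)); there
    are N - 1 adjacent pairs, which gives sum(n_i * (n_i - 1)) / N.
    """
    n = len(sequence)
    if n == 0:
        return 0.0
    counts = Counter(sequence)
    return sum(c * (c - 1) for c in counts.values()) / n


# Hypothetical recall protocol, coded by the task under which each
# successively recalled word had been rated.
protocol = ["self", "self", "other", "semantic", "self"]
obs = observed_repetitions(protocol)   # one adjacent same-task pair
exp = expected_repetitions(protocol)   # 3 * 2 / 5 = 1.2
```

An observed count well above the chance expectation would indicate that words from one task were being emitted together; the values reported above (3.17 observed vs. 2.61 expected) did not differ reliably.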

Difficulty and confidence ratings. The 
means of the postexperimental ratings of the 
encoding tasks are presented in Table 1. 
One-way analyses of variance of these rat- 
ings revealed significant effects for both the 
difficulty, F(3, 33) = 9.04, p < .001, and
confidence, F(3, 33) = 16.22, p < .001 rat- 
ings. Post hoc tests revealed that the sub- 
jects found the other-referent task signifi- 
cantly more difficult than the self-referent 
rating (p < .01). They also placed a significantly
greater degree of confidence in the
self-referent, compared to the other-referent,
task (p < .01).


Experiment 2: Self-Other Recall Differences 
Using the Individual Method 


Method 


Experiment 2 was conducted to extend the gen- 
erality of the self-other difference found in the first 
study by incorporating three methodological changes. 
First, a different other-target was used to ascertain 
if the target was an important aspect of the ratings. 
Second, the semantic rating task was changed to 
a synonymity rating. Third, subjects were tested 
individually, which permitted assessment of the 
time required to make the ratings (RT). The procedures
outlined in Rogers et al. (1977, Experiment
1) were used. The only change was that the
rhythmic task was replaced with an other-referent
task.

Materials. The adjectives from Experiment 1 
were used. The necessary synonym list for the new 
semantic task was taken from Experiment 1 of
Rogers et al. (1977). 

Procedure. All of the stimuli were presented on a television monitor driven by
a PDP8, which also recorded the ratings and rating times (in msec). Each of
the 48 trials consisted of (a) a 3-sec presentation of the cue ques[...] and
(e) a 2-sec inter[...] 48 trials the subject was allowed 3 min to recall, in
any order, as many of the words as he or she could.

[...] years, and each [...]cipation. They were [...] eight lists.


[...] The mean propor[...] a function of rating [...] of variance of these
recall data revealed a main effect of rating task, F(3, 69) = 156, p < .001,
replicating Experiment 1. However, a fine-grained analysis of this main effect
revealed that the self- and other-referent recall difference did not reach
statistical significance.
Response time. RT was defined as the time from the onset of the adjective on
the monitor until the subject pressed the button revealing a Yes or No
response. These RTs were calculated separately for each rating task for each
rating (yes or no); the means are presented in the second panel of Table 2. A
two-way analysis of variance of the data involved in these means revealed a
single meaningful main effect of rating task, F(3, 69) = 27.05, p < .001.

The focal concern of this study is the memory trace laid down by the rating
tasks. When considering RTs for all ratings, we are dealing with nonrecalled
words as well as recalled ones. Since nonrecalled words generated
nonfunctional traces, the overall RTs do not speak directly to the issue of
useful memory traces. Because of this, mean RT was analyzed as a function of
whether a word was recalled. For the two person tasks, the mean RT for
recalled and nonrecalled words was calculated, and the means are entered in
the third panel of Table 2.¹ The critical observation of these data is the
848-msec difference between the two rating tasks for recalled words. Analysis
of variance of the 2 x 2 matrices (n = 23) revealed a main effect of rating
task, F(1, 22) = 12.02, p < .01, and an interaction, F(1, 22) = 4.42, p < .05.
Post hoc tests revealed that for the other-referent task, the recalled words
had meaningfully higher RTs than the nonrecalled words. Further, considering
only the recalled words, the self-referent task produced significantly shorter
RTs than the other-referent task. All other comparisons were nonsignificant.
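
The arithmetic behind these comparisons can be checked directly from the cell means. The Python sketch below uses the four means for the person tasks from the third panel of Table 2; the self-referent recalled mean is taken as 2,375 msec, the value given in the Discussion and the one consistent with the 848-msec difference. The variable and function names are illustrative only, and the contrasts are descriptive differences of means, not the analysis of variance itself.

```python
# Mean rating times (msec) for the two person tasks, Experiment 2 (n = 23).
mean_rt = {
    ("other-referent", "recalled"): 3223,
    ("other-referent", "nonrecalled"): 2714,
    ("self-referent", "recalled"): 2375,
    ("self-referent", "nonrecalled"): 2558,
}


def recall_effect(task):
    """Recalled minus nonrecalled mean RT for one rating task."""
    return mean_rt[(task, "recalled")] - mean_rt[(task, "nonrecalled")]


# Difference between the two tasks for recalled words only.
task_gap_recalled = (
    mean_rt[("other-referent", "recalled")] - mean_rt[("self-referent", "recalled")]
)  # 848 msec

# Interaction contrast: how much larger the recall effect is for the
# other-referent task than for the self-referent task.
interaction_contrast = recall_effect("other-referent") - recall_effect("self-referent")
# (3223 - 2714) - (2375 - 2558) = 509 - (-183) = 692 msec
```

The opposite signs of the two recall effects (+509 msec for other-referent, -183 msec for self-referent) are what the Rating Task x Recall interaction reflects.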

Confidence and difficulty ratings. As can be seen from the lower panel of
Table 2, the confidence and difficulty ratings replicate those shown in Table
1. Analysis of variance of the confidence rating means revealed a significant
effect of rating task, F(3, 69) = 25.29, p < .001, with post hoc tests
revealing significance for the self/other-referent comparison (p < .01). For
the difficulty ratings, the self/other comparison of the significant main
effect of rating task, F(3, 69) = 54.73, p < .001, revealed that the
other-referent task was rated as significantly (p < .01) more difficult to
perform than the self-referent judgment.

¹ It was not possible to fill sufficient cells for an analysis if the rating
(yes or no) variable was retained; hence it was not included. One subject was
eliminated from this analysis because of an empty cell.


Discussion 


The results of these two experiments indicate that (a) more adjectives rated
under the self-reference task were remembered compared to the remaining
encoding tasks; (b) clustering analyses revealed no unusual patterning in the
recall protocols; (c) RT for encoding interacted with rating task and recall
of words; and (d) difficulty and confidence ratings indicated that
self-reference judgments were easier and were made more confidently than the
other-referent judgments. These results, found using two different methods,
paint an interesting picture of the differences between the self- and
other-referent rating tasks.

First, the recall data suggest that the “general person hypothesis” is not
tenable. In Experiment 1 an encoding task involving another person did not
produce recall equivalent to the self-reference task. This finding suggests
that it is not simply the involvement of a person per se in the encoding
rating that facilitated recall. Rather, the superior self-reference recall
documented here and in Rogers et al. (1977) is the result of the self being
active during the encoding task. Experiment 2 showed a similar pattern of
results, but fine-grained analysis revealed that the self- and other-referent
recall were not significantly different. This result could be due to any
number of factors relating to the different methods employed (e.g., the group
method is experimenter paced, whereas the individual method is self-paced).
This failure to find self-other differences could be damaging for our self
theory, were it not for the consistent pattern of results emerging from the
other dependent measures, particularly the RT and rating data.

Table 2
Summary of Results for Experiment 2

                                                Rating task
                                                         Other-     Self-
Measure                          Structural   Semantic   referent   referent   M
Mean proportion correct recallᵇ
  Yes rating                     .09          .20        .26        .35        [...]
  No rating                      .07          .12        .25        .30        [...]
  M                              .08          .16        .26        .33
Overall mean RTᵃ (in msec)
  Yes rating                     1,280        2,639      2,855      2,536      2,327
  No rating                      1,449        2,632      2,837      2,674      2,398
  M                              1,364        2,635      2,846      2,605
Mean RTᵃ (n = 23)
  Recalled words                                         3,223      2,375      2,799
  Nonrecalled words                                      2,714      2,558      2,636
  M                                                      2,968      2,466
Mean difficulty ratingᶜ          1.33         3.20       5.66       2.83
Mean confidence ratingᵈ          6.45         4.83       2.79       5.37

ᵃ RT = response time.
ᵇ Adjusted recall values can range from 0 to 1.00. A value of 1.00 indicates that all adjectives receiving a particular rating (either yes or no) for a particular task were recalled. A value of 0 indicates that none of the adjectives were recalled.
ᶜ Difficulty ratings were made on a 7-point scale with endpoints of (1) “not at all difficult” and (7) “extremely difficult.”
ᵈ Confidence ratings were made on a 7-point scale with endpoints of (1) “not at all confident” and (7) “extremely confident.”

The major difference between the self- and 
other-referent tasks was that words recalled 
under the other-referent task were those upon 
which the subject expended a large amount 
of effort, as witnessed by the inflated RTs 
(M = 3,223 msec). When effort was not ex- 
pended during the other-referent rating, the 
word tended to not be recalled (M = 2,714 
msec). The opposite pattern of results was 
obtained for the self-rating task, with re- 
called words showing faster RTs (M = 2,375 
msec) compared to those that were not re- 
called (M = 2,558 msec). Thus, recall under 
the other-referent condition is “effort tied,” 
whereas self-reference recall is not. The find- 
ing that subjects found the self-referent task 
easier and placed more confidence in their 
self-ratings compared to the other-referent 
task underscores the differences between these 
two encoding tasks. Overall, these results 
indicate that the self- and other-referent tasks 
involve two different types of cognitive pro- 
cesses. 

The other-referent task can be characterized as a rehearsal or frequency type
of process. Here, if the subject works hard during encoding (e.g., rehearses
or repeats the adjective), he or she is apt to recall the item. The absence of
this effort during encoding predicts poor recall. This type of process is
reminiscent of verbal learning frequency theory (e.g., Klatzky, 1975) in that
there is a simple linear relationship between RT and recall. The effort during
encoding could be sheer rote repetition (though this is unlikely) or related
to trying to “pigeonhole” or form an impression of the target person. This
process may involve some kind of a schema or prototype, albeit a somewhat weak
one. Subjects clearly indicated they were not confident of their
other-referent ratings and found them difficult to make, which reinforces this
rehearsal-type interpretation.

The data suggest a very different process 
for the self-referent ratings. The recalled 
words do not show inflated RTs during the 
ratings. Rather, there is even a hint that the 
words that were easiest to rate are those 
that were recalled. This suggests that words 
that were easily integrated into the self- 
schema were those that were recalled. The 
finding that subjects found the self-referent 
task easy and had confidence in their ratings 
supports this interpretation. It appears that 
the self-referent task is based on an efficient 
schema. Presumably, the adjective is com- 
pared to the self-concept as it is rated (see 
Rogers, 1974), and the appropriate response 
is generated. This involvement of the self 
produces a rich and functional memory trace, 
which facilitates recall. The present data do 
not permit precise inferences about how the 
self-ratings are actually performed. However, 
the results indicate that these ratings in- 
volve the use of a highly organized and ef- 
ficient self-schema and that the process in- 
volved is very different from the other-refer- 
ent rating. 

These data, then, suggest a “two-process” 
interpretation of self-other differences in pro- 
cessing information. Data about others are 
processed using a frequency of rehearsal pro- 
cess, whereas self-referent data involve the 
use of a highly organized cognitive schema. 


Experiment 3: A Closer Look at the 
Process Differences 


Despite the relatively strong support for the two-process interpretation of
self- and other-referent tasks provided by the first two experiments, several
weaknesses remain: First, it is a post hoc explanation and thereby demands
independent testing. Second, it was not possible to analyze the RT data of
Experiment 2 for the effects of rating (yes or no). Third, lack of sufficient
observations forced elimination of one subject in the previous experiment,
which attenuates generalizability. Experiment 3 was designed to overcome these
problems by (a) providing a direct test of the two-process interpretation, (b)
ensuring a sufficient number of ratings to permit assessment of yes/no
variations, and (c) increasing the number of ratings per task to eliminate any
possibility of empty cells in the subject data matrices. In order to fulfill
(b) and (c) without making the task unmanageable, the structural and semantic
tasks were eliminated, thereby doubling the number of self- and other-referent
ratings to 20 per task. A further change was the use of yet a third target for
the other-referent task.

Predictions in this study relate to the two-process interpretation presented
above. If the other-referent task involves frequency or rehearsal processes
and the self-referent rating is based on an efficient schema, RTs should be
greater for the other-referent task. Furthermore, this effect should be
stronger for recalled words, predicting an interaction between rating task and
recall (recalled and nonrecalled words). It is even possible that this effect
would emerge as a three-way interaction when the rating (yes or no) variable
is added.


Method


All aspects of this study were identical to Experiment 2 except that only
self- and other-referent tasks were used. This meant that only two stimulus
orders were necessary to counterbalance adjectives and rating tasks.

Experimenter. A female psychology graduate student ran the subjects and served
as the target for the other-referent ratings. She was not the same
experimenter used in Experiment 2.

Subjects. Twelve undergraduate volunteers (six female) served as subjects.
Each was paid $1.50 for participating. Their mean age was 20.3 years.


Results and Discussion 


The adjusted recall as a function of rating task and rating was calculated.
The other-referent recall (M = .23) was less than the self-referent recall
(M = .28), replicating the previous experiments. However, as with Experiment
2, this difference was not significant, [...]11) = 1.01. Furthermore, the
remaining terms in the adjusted recall analysis of variance were also
nonsignificant.

The mean times required to make the self- and other-referent ratings were
calculated for each subject. A t test on these data revealed that
other-referent ratings required significantly more time (M = 3,784 msec) than
did the self-referent (M = 3,191 msec) ratings, t(11) = 2.98, p < .01. This
pattern of results also replicates Experiment 2 and offers preliminary support
for the two-process interpretation.

Table 3
Mean Reaction Time (in msec) as a Function of Rating Task, Recall, and Rating for Experiment 3

                          Rating task
                      Other-     Self-
Rating                referent   referent   M
Recalled words
  Yes                 5,133      2,936      4,035
  No                  4,054      5,124      4,589
  M                   4,594      4,030      4,312
Nonrecalled words
  Yes                 3,461      3,119      3,290
  No                  3,843      3,488      3,666
  M                   3,652      3,304      3,478

Note. n = 12.

The important analysis in this experiment 
involves RTs as a function of rating task, 
recall, and rating. Such a matrix was calcu- 
lated for each subject; the overall means 
based on all subjects are presented in Table 
3. The noteworthy aspect of these means is 
the evident “crossover” of RTs in relation to 
rating and rating task for the recalled words 
only. This crossover emerges statistically as a 
significant three-way interaction, F(1,11) = 
6.77, p < .025, in the analysis of variance 
conducted on these data. Other significant 
terms in this analysis were (a) the main effect 
of recall, F(1,11) = 21.34, p < .001, indi- 
cating that RTs were higher for recalled 
words, and (b) the Rating Task x Rating 
interaction, F(1, 11) = 8.50, p < .025, which 
is clearly qualified by the triple interaction. 
The main effect of rating task was not significant.²


² This finding may appear incongruous with the
earlier “overall RT” analysis, but it is not. The
categorization procedure used to derive the eight
mean RTs per subject weights the raw RTs unequally.
The overall RT analysis assigned an equal
weight to each raw observation.
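
The weighting point in Footnote 2 can be illustrated from the Table 3 cell means alone. The sketch below is only an illustration (the variable and function names are ours, not part of the original analysis): it averages the four cell means for each rating task with equal weight. The resulting values differ from the overall means of 3,784 and 3,191 msec reported in the text, which gave equal weight to every raw observation rather than to every cell.

```python
# Cell means (msec) copied from Table 3, keyed by (recall, rating).
other_referent = {("recalled", "yes"): 5133, ("recalled", "no"): 4054,
                  ("nonrecalled", "yes"): 3461, ("nonrecalled", "no"): 3843}
self_referent = {("recalled", "yes"): 2936, ("recalled", "no"): 5124,
                 ("nonrecalled", "yes"): 3119, ("nonrecalled", "no"): 3488}


def unweighted_mean(cells):
    """Average the cell means, giving every cell the same weight."""
    return sum(cells.values()) / len(cells)


other_cell_mean = unweighted_mean(other_referent)
self_cell_mean = unweighted_mean(self_referent)
print(other_cell_mean, self_cell_mean)
```

Because cells with few observations count as much as cells with many, the cell-based analysis and the overall RT analysis need not agree about the main effect of rating task.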
These data indicate that the Yes-rated RTs
neatly replicate the pattern of results observed 
in Experiment 2 (Table 2). These data are a 
second verification of the two-process inter- 
pretation, wherein it appears that the self- 
and other-referent tasks involve schema-based 
and rehearsal-based processes, respectively. 

The words receiving No ratings paint an
almost opposite picture to the Yes-rated adjectives.
The longest RT in the No condition
was for recalled self-referent words (5,124 
msec), which was the shortest RT for Yes 
words (2,936 msec). However, post hoc tests 
failed to reveal any significant differences 
within the No words, which suggests that the 
Yes-rated words carried the useful informa- 
tion for differentiating other- and self-refer- 
ent processing. This result reinforces the con- 
clusions in Rogers et al. (1977), where em- 
phasis was placed on the Yes-rated words in 
their schema interpretation. 
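
The “crossover” for recalled words can be made concrete with a descriptive contrast on the Table 3 cell means. The following is a sketch for illustration only: the function name and data layout are our own, and the contrast is not a significance test and does not reproduce the reported analysis of variance.

```python
# Mean RTs (msec) from Table 3, indexed as rt[recall][rating][task].
rt = {
    "recalled": {"yes": {"other": 5133, "self": 2936},
                 "no": {"other": 4054, "self": 5124}},
    "nonrecalled": {"yes": {"other": 3461, "self": 3119},
                    "no": {"other": 3843, "self": 3488}},
}


def task_by_rating_contrast(cells):
    """(other - self) for Yes ratings minus (other - self) for No ratings."""
    yes_diff = cells["yes"]["other"] - cells["yes"]["self"]
    no_diff = cells["no"]["other"] - cells["no"]["self"]
    return yes_diff - no_diff


recalled_contrast = task_by_rating_contrast(rt["recalled"])
nonrecalled_contrast = task_by_rating_contrast(rt["nonrecalled"])
three_way_contrast = recalled_contrast - nonrecalled_contrast
print(recalled_contrast, nonrecalled_contrast, three_way_contrast)
```

The Rating Task x Rating contrast is large for recalled words and close to zero for nonrecalled words, which is the pattern the three-way interaction describes.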


Experiment 4: Familiarity of the 
Other-Target 


The three studies reported above were 
concerned with an unfamiliar other. Care was 
taken to ensure a minimum of contact be- 
tween the subjects and the other-referent 
target. This was done to minimize possible 
confounds in the designs associated with how 
well the subjects knew the other-target. How- 
ever, it is possible that familiarity is an im- 
portant aspect of other-referent ratings, and 
as such deserves systematic investigation. Ex- 
periments 4 and 5 considered this possibility, 
using the same paradigm as the previous 
studies.

There are some hints in the literature that 
familiarity does affect processing of informa- 
tion about others. Koltuv (1962) observed a
simpler multivariate structure for judgments
of an unfamiliar other, compared to known
targets. This led Hastorf et al. (1970) to
suggest the following:


Implicit theories of personality in the form of assumptions about
which traits are related to one another operate most strongly when
the perceiver faces an ambiguous person, one he does not know well.
On the other hand, when the perceiver is rating people he knows
something about, such stereotyped inferences are modified to fit
actual characteristics of the other more closely. (p. 44)


There is also evidence to suggest that degree of acquaintance has a
powerful effect on the validity of trait inferences (Norman &
Goldberg, 1966). Willingness to ascribe a trait to a person has been
found to decrease with familiarity (Nisbett, Caputo, Legant, &
Marecek, 1973). Each of these studies suggests that familiarity with
the other is a salient attribute of how we process personal
information.

The ratings of an unfamiliar other in the first three studies of the
present article were probably based on a stereotype or expectancy of
the psychology graduate student who served as the target (Jones &
Gerard, 1967). During this time the subjects were trying to develop a
cognitive structure for the target, but the lack of available
information forced recourse to a stereotype (Secord & Backman, 1974).
This suggests that effort in making the judgment is likely to be an
important variable, which was reflected in the RT analyses. However,
as familiarity increases, two critical elements change. First, there
is more information available, and second, there is more time to
formulate a cognitive organization or structure for that person. These
two factors should lead to a reasonably accurate and organized
cognitive structure as familiarity increases (e.g., Hastorf et al.,
1970). Other-referent decisions should then be based on this new
organization and produce reasonably elaborate memory encodings.
Furthermore, during the recall phase, this organization can become
part of the retrieval environment and thereby facilitate recall. All
of this suggests that other-referent ratings for a familiar person
should produce excellent recall in the incidental recall paradigm.


Method


Experiment 4 was an exact replication of Experiment 1, involving the
same subjects and the same experimenter/other-target. The only
difference was that the experiment was run 11 weeks after Experiment
1. During this period the experimenter became more familiar to the
subjects, since he was their lab instructor in the course. One feature
added after the recall period was the collection of several
familiarity ratings.


Results and Discussion


The familiarity ratings indicated that the subjects felt they knew the
experimenter “moderately well” and a little less than a “casual
acquaintance.” Almost all subjects indicated that their interaction
with the experimenter had been limited to the classroom and the
instructor’s office.

The adjusted recall figures, the means of which are presented in
Table 4, were subjected to an analysis of variance with rating tasks
and ratings as factors. The only significant effect was the main
effect of rating task, F(3, 33) = 4.23, p < .025. Post hoc tests
failed to reveal a difference between the self-referent and
other-referent tasks, but recall for both person tasks was superior to
that of the semantic task.

The difficulty ratings showed a pattern of significance similar to
that of Experiment 1, F(3, 33) = 7.09, p < .001, with the
self-referent judgments being rated as significantly easier than the
other-referent judgments (p < .05). The confidence ratings also
replicated those of Experiment 1, F(3, 33) = 14.96, p < .001, with the
self-referent judgments receiving significantly higher confidence
ratings (p < .05) than all of the remaining tasks.

These data confirm the predictions deriving from a view that suggests
that familiarity results in the development of an accurate
organization. The enhanced recall under the other-referent rating
(M = .18 in Experiment 1 and .40 in Experiment 4) appears to be due,
in part, to the increased familiarity with the other-target, derived
from 11 weeks of experience with him. The self-referent recall is
roughly the same for the two experiments (Ms = .31 and .32,
respectively). Hence, the effect of familiarity is to dramatically
enhance incidental recall for words rated under an other-referent
task. This result suggests the development of a cognitive structure
for the now-familiar other.


Table 4
Mean Adjusted Recall and Confidence and Difficulty Ratings for
Experiments 4 and 5

                                         Rating task
Rating                    Structural  Semantic  Other-referent  Self-referent    M

Experiment 4 (n = 12)
  Mean adjusted recallᵃ
    Yes                      .10        .24          .30             .35        .24
    No                       .22        .29          .50             .27        .32
    M                        .16        .26          .40             .32
  Mean difficultyᵇ          3.66       4.66         3.83            2.16
  Mean confidenceᶜ          3.42       3.00         3.16            6.00

Experiment 5 (n = 15)
  Mean adjusted recallᵃ
    Yes                      .09        .20          .31             .39        .26
    No                       .17        .07          .28             .29        .20
    M                        .13        .13          .32             .34
  Mean difficultyᵇ          3.20       4.53         4.60            1.60
  Mean confidenceᶜ          3.66       3.53         3.06            6.13

ᵃ Adjusted recall can range from 0 to 1.00. A value of 1.00 indicates
that all adjectives receiving a given rating (either yes or no) under a
particular task were recalled. A value of 0 indicates that none of
these adjectives were recalled.
ᵇ Difficulty ratings were made on a 7-point scale with endpoints of
(1) “not at all difficult” and (7) “extremely difficult.”
ᶜ Confidence ratings were made on a 7-point scale with endpoints of
(1) “not at all confident” and (7) “extremely confident.”
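
The adjusted recall measure described in the note to Table 4 can be sketched as a simple proportion. This is an assumption on our part: the note states only the range and the meaning of the endpoints, so the function below (a hypothetical helper, with hypothetical counts) shows one computation consistent with that description rather than the authors' exact scoring procedure.

```python
def adjusted_recall(n_recalled, n_given_rating):
    """Proportion of the adjectives that received a given rating
    (yes or no) under one rating task that were later recalled."""
    if n_given_rating == 0:
        raise ValueError("no adjectives received this rating")
    return n_recalled / n_given_rating


# Hypothetical subject: 6 adjectives rated Yes under the self-referent
# task, 3 of which were later recalled.
example_value = adjusted_recall(3, 6)
print(example_value)
```

Scoring recall relative to the number of words given each rating keeps a subject who rarely answers Yes from appearing to recall few Yes-rated words merely because few were available.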
Experiment 5: Intentional Versus
Incidental Recall


The repeated experimental design em- 
ployed in Experiments 1 and 4 brings one 
obvious problem with it. Since subjects had 
been tested previously in the same situation, 
possibly the observed effects of Experiment 
4 were due to a sensitizing effect (see Camp- 
bell & Stanley, 1967, p. 18). The second 
study (Experiment 4) represented an inten- 
tional learning situation, whereas Experiment 
1 was an incidental paradigm. An examina- 
tion of the cognitive literature (Postman, 
1964) indicates that an intentional/incidental 
manipulation may, under certain circum- 
stances, affect subsequent recall levels. To
rectify this, a second group of subjects was
run. These subjects were in the same course
as the previous subjects, but in a different 
lab section. The same experimenter was the 
lab instructor for this section. When tested 
at the same time of year as the subjects in 
Experiment 4, these subjects had an equiva- 
lent degree of familiarity with the experi- 
menter—but had not been sensitized with 
a pretest. If the results of Experiment 4 are 
replicated, the sensitizing effect will be ruled 
out as an alternate explanation. 


Method 


The same procedures and materials employed in 
Experiment 4 were used in this study. Subjects 
were 15 students (3 male) in an undergraduate 
psychology course, with a mean age of 19.8 years. 
They were run during a lab session. 


Results and Discussion 


The familiarity ratings of this group were 
the same as those found for Experiment 4. 
The data from this experiment are summa- 
rized in Table 4. The analyses of these data 
revealed the same pattern as observed with 
Experiment 4: (a) a main effect of rating task, F(3, 42) = 9.89,
p < .001; (b) equivalent levels of recall for both the self- and
other-referent tasks; (c) significant recall differences between the
two person tasks and the semantic judgment; and (d) significantly
lower confidence (p < .01) and higher difficulty (p < .01) ratings for
the other-referent task compared to the self-reference condition. The
major impact of these findings is that they permit rejection of a
sensitizing effect as an explanation of the data in Experiment 4.

Taken in combination, the results from Experiments 4 and 5 suggest
that familiarity with a person tends to enhance the probability of
generating an elaborate and functional memory trace in the incidental
recall paradigm. It appears that familiarity aids in the development
of a cognitive structure that can be used to process information about
that person. Presumably, this structure becomes more elaborate and
embellished with increased familiarity.


General Discussion 


The data from the five experiments construct a consistent argument for
several propositions regarding the processing of information about
people. The experiments leave little doubt that the processes
attending other- and self-referent decisions are different. This
difference is reflected in the consistent differences in RTs observed
in Experiments 2 and 3. Not only did other-referent decisions require
more time than self-referent ratings, but the RTs also showed some
interesting interactions with the various independent variables. Of
particular interest is the finding that words recalled from the
other-referent condition were those requiring a greater amount of time
for the rating to be performed. All five experiments revealed
consistent patterns in the confidence and difficulty ratings,
indicating that self-referent decisions were easier and produced more
confidence in the given response. Within the constraints of the RT
findings and the rating data, the recall results from the five studies
also support the contention that self- and other-referent decisions
represent different processes. This two-process interpretation has a
number of implications for the study of processing personal
information.


The Self as a Cognitive Schema 


First, the pattern of results supports the proposition that the self
functions as a schema. In all five studies, words receiving Yes
ratings under the self-referent conditions were better recalled than
No-rated words. This finding suggests that self-descriptive words,
presumably those that “fit” with the self, produce strong and
elaborate memory traces. Such results would be expected if a schema is
the basis of the self-referent decision.

Another aspect of the present data supports the self-as-schema view.
Not only were self-referent decisions faster than other-referent
judgments but they also showed an opposite pattern in regard to which
adjectives were recalled. In the other-referent task, words with long
RTs were recalled, whereas with the self-referent task, there is a
hint that words with short RTs (presumably those that “fit” with the
self) were those that were recalled. Again, a view that suggests the
self is a schema predicts this pattern of results. When these two
pieces of supporting evidence are combined with published data, an
impressive case for the schema property of the self can be made. Some
of this convergent evidence includes the following: (a) Greater
amounts of information are available for terms in the self-concept,
which is to be expected from a schema (Markus, 1977); (b) resistance
to incorrect personal feedback is related to the degree of
self-description (Markus, 1977), which demonstrates the activity of
the self-schema during the processing of new information; and
(c) several biases documented in recognition memory mirror those found
for experimentally manipulated schemata (Rogers et al., in press).

All of this evidence suggests that the self is an active agent in the
processing of personal information. Each of these lines of evidence
can be predicted from a position that defines the self as an abstract
cognitive structure that contains both general trait-like categories
and some specific behavioral exemplars or instances. This memory
structure is active during the input and interpretation of
self-related information and provides a degree of “meaning” or
embellishment to the incoming information. The self can be seen as a
hook or interpretative frame for the encoding of personal data. The
clustering analysis of the present data clearly indicates that the
self is active during encoding. If the data were a retrieval
phenomenon or simply reflected a listing of self-referent features,
some degree of clustering in the output protocols should have been
observed. Since this was not found, the present data reinforce the
proposition that the self is a powerful agent during the encoding of
personal information.


Processing Information About 
Unknown Others 


A second aspect of the present data is the 
information they provide us about how we 
rate other people. The results for the other- 
referent conditions in the current experiments 
provide several suggestions about the mecha- 
nisms involved in making decisions about 
the psychological characteristics of another 
person. Ratings of an unfamiliar other person 
appear to involve a relatively inefficient fre- 
quency type of process, with the memory 
trace relating to the amount of effort that 
the rater puts into his or her response. This 
effort probably relates to attempting to form 
an impression of the other person. Secord and 
Backman (1974) suggest that the context, 
the target, and the observer contribute to the 
initial view of an unfamiliar other. People do 
use contextual information in forming im- 
pressions (e.g., Price, 1974). This fact some- 
times emerges in the form of the context de- 
termining what information about a person is 
salient for a given situation. The target per- 
son elicits a considerable number of cues, 
even in a short exposure. Several researchers 
have provided categories of the kinds of in- 
formation that might be involved (Beach & 
Wertheimer, 1961; Secord & Backman, 
1974). Fiske and Cox (in press) suggest the following categories:
(a) traits, (b) demographic parameters (e.g., race, sex), (c) specific
behaviors (e.g., verbal output and nonverbal cues; see Secord &
Backman, 1974, p. 43), and (d) appearance (e.g., Jones & Gerard, 1967).


³ As indicated in Rogers et al. (1977), this difference fails to reach
significance in individual studies. However, when we combine the
previous research with the present five studies, we have nine separate
verifications of Yes-rated words showing significantly better recall
than No-rated words in the self-reference task, p = 1/512 = .002.
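
The probability quoted in Footnote 3 can be checked with a short calculation. The sketch below assumes the value comes from a one-tailed sign test, in which each of the nine studies is equally likely under the null hypothesis to favor Yes-rated or No-rated words; the footnote does not name the test, so that reading and the variable names are ours.

```python
from fractions import Fraction

n_studies = 9  # nine separate verifications cited in Footnote 3

# Chance that all nine studies fall in the same predicted direction
# when each direction has probability 1/2.
p_all_same_direction = Fraction(1, 2) ** n_studies
p_rounded = round(float(p_all_same_direction), 3)
print(p_all_same_direction, p_rounded)
```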

A third source of information about an 
unfamiliar other is the observer. Here we 
encounter implicit personality theories and 
stereotypes that prescribe various expecta- 
tions or predictions about another person 
(e.g., Hastorf et al., 1970). These allow us 
to “fill in the holes” left by an inadequate 
opportunity to observe the target and there- 
by to embellish an incomplete memory rep- 
resentation. The observer’s self may also be 
involved in the rating of another person. The 
possibilities here range from making relativ- 
istic appraisals of the other person (e.g., 
“He’s more aggressive than I am—hence, he’s 
really aggressive”) to the imposition or projection
of self-referent characteristics onto the other person
(e.g., Koltuv’s 1962 projectivist hypothesis). The
increased availability of behavioral exemplars for
self-descriptive qualities (Markus, 1977) suggests
that such qualities may be used more often 
in judging other people. 

All of these sources of information no 
doubt combine in subtle and complex ways to 
produce the subjects’ ratings, and indeed the 
domain of person perception addresses these 
kinds of issues (Tagiuri, 1969). Although 
the present data do not permit statements 
about how these sources may combine, they 
do clearly indicate that the memory trace for 
the rating event is very different from that 
created during a self-referent rating. The 
other-referent data show that increased time 
performing the rating—which could be due 
to either seeking out additional information 
or combining available inputs—enhances the 
probability of subsequent recall. This finding 
suggests a frequency or strength model, in 
which increased input makes increments in 
some internal memory store and thereby fa- 
cilitates subsequent performance. 


Familiarity and a Model for Other Processing 


A third proposition about the processing 
of personal information comes from manipu- 
lating the familiarity of the target for the 
other-referent rating. Apparently, familiarity with the target
significantly modifies the processes involved in the judgment. Memory
performance is enhanced after the subject comes to know the target
(Experiments 4 and 5). This enhancement suggests that familiarity with
the target permits a more efficient cognitive system for representing
the other person than was available in the case of an unfamiliar
target. The present data do not permit statements about whether these
ratings are based on attempts to “type” a person (Cantor & Mischel,
1977), implicit personality theories (Schneider, 1973), or some form
of self-involvement in the processing of information about others
(Lemon & Warren, 1974). Rather, they do tell us that there are
substantial changes in the cognitive system as we come to know a
person better. These changes relate to the increasing information we
acquire about another person with increased familiarity. At some point
during the accumulation of information about another person, an
overload state must be encountered. That is, there must be a time when
we can no longer keep track of all of the actually observed
information we have about that other person. In order to cope with
this, either the cognitive structure must change or we will have to
ignore new information. If the cognitive structure changes, it
probably shows an increasing degree of abstraction in content. Rather
than remembering a series of specific instances, a general statement
representing the commonality of the other’s behaviors may be the
stored element. To be certain, specific instances can be generated
from this general statement, but the stored element is the
generalization about the other person’s behavior. Fiske and Cox (in
press) present evidence consistent with this proposed interpretation.
When individuals are asked to describe an unknown other (stranger),
they rely heavily on specific behavioral exemplars or scripts, rather
than abstracted traits. However, when asked to describe a known other,
subjects generate descriptive protocols that indicate a higher
frequency of abstracted traits than of specific observed behavior.
Furthermore, traits are generated much earlier in the protocols for
known others, suggesting that they do not serve a summarizing
function. For strangers, traits begin to emerge in the protocol only
after the opportunity for observation has been provided. Hence,
increased exposure to the target enhances the probability of
abstraction.

The critical question is, What guides the abstraction process that
allows us to summarize the available information about the other
person? Implicit personality theories, stereotypes, and the kinds of
attributes outlined by Fiske and Cox (in press) are no doubt part of
this process. However, it is possible that the self is also very
active here. The self is a cognitive structure that has evolved to
help summarize available self-related information. This means it
serves the same function as the abstraction process involved with
other people. It is an intriguing possibility that how we summarize
information about other people is bound up with our own view of self.
Several lines of research evidence are compatible with this
suggestion.




Self-Involvement in Other-Processing 


First, it appears that the same traits may 
be used in the perception of other people as 
are used in self-perception. For example, 
Lemon and Warren (1974) found that per- 
sonally relevant traits are used more often 
and earlier in free descriptions of other
people (see also Shrauger & Patterson, 1974).
This suggests that the self may shape our
views of others.

Second, the research involving self-based 
consensus implies that people surmise popula- 
tion performance from their own behavior 
poset, 1958). More recently, Ross (1977) 

has formulated the notion of a “false-consensus bias” in attributions,
which suggests that the self serves as a reference point in the
processing of information about others. Briefly, the false-consensus
bias postulates that laypersons see their own behavioral choices and
judgments as relatively common and appropriate to existing
circumstances, while viewing alternate responses as uncommon, deviant,
and inappropriate (Ross, 1977). An important corollary of this
hypothesis is that the layperson judges those responses that differ
from his or her own to be more revealing of
another’s stable disposition than responses
that are similar (Ross, 1977). To the extent 
that persons use their own views to interpret 
information about others, the self is prob- 
ably involved in processing information about 
others. 

A third line of research and theory that 
speaks to the issue of the self being involved 
in the ratings of other people emerges under 
the idea that the self functions like a fixed 
cognitive reference point. Snygg and Combs 
(1959) suggested that the self is the refer- 
ence point for the processing of all informa- 
tion. With respect to the processing of in- 
formation about others, they suggested that 
“people are not really fat, unless they are 
fatter than we are” (Snygg & Combs, 1959, 
p. 145). Comparisons of others to the self 
also form the keystone of Festinger’s (1954) 
“social comparison” theory and Jones and 
Gerard’s (1967) “comparative appraisal” 
theory of self-development. Lemon and War- 
ren (1974) and Shrauger and Patterson 
(1974) offer empirical evidence consistent 
with this notion of an implicit and automatic 
self-other comparison being an integral part 
of how we process information about other 
people. 

These three lines of research are compat- 
ible with the notion that the self, as a cogni- 
tive structure, is involved in the processing 
of information about other people. This idea 
does not imply that the content of a cognitive 
structure for another person is the same as 
the content for the self. Rather, it suggests 
that the self is involved in the development 
of the cognitive representation of another 
person. The manner in which the self helps 
shape this development and the kinds of 
biases that may emerge are fodder for a con- 
siderable amount of future research. For the 
present, it appears that there may be a com- 
plex involvement of the self in the processing 
of information about other people. 

In summary, the present data do verify
that self- and other-referent tasks are per- 
formed using different processes when the 
target other is unfamiliar. There is also an 
indication that the cognitive structure asso- 
ciated with another person undergoes some 
substantive changes as familiarity increases.
However, these findings relate to only one
kind of decision about other people and, as 
such, beg for further research to explore 
their generality. We hope that this research 
will be hardheaded experimental work that 
begins to shed light on the developmental 
sequence of the cognitive structures involved 
in processing information about other people.


Making Trust Easier and Harder
Through Two Forms of Sequential Interaction

Philip Brickman, Lawrence J. Becker, and Sidney Castle 
Northwestern University 


If people in a trust dilemma take turns choosing first and second
(alternating interaction), there is no longer a dilemma for the people
choosing second. Their motives are completely revealed by their
choices. In most real-world situations, however, although choices are
sequential, each choice is both a response to another person’s past
move and a stimulus for the other person’s next move (continuing
interaction). It was hypothesized that cooperation would be harder to
achieve in continuing interaction than in alternating interaction,
since in continuing interaction, like simultaneous interaction,
people’s motives are ambiguous. Since males have been shown to be more
concerned with communication in trust dilemmas, it was also
hypothesized that alternating interaction would benefit males more
than females. Data from two studies using a total of 42 male and 42
female dyads supported this reasoning. Thus, we can relax one of the
most artificial constraints of the formal trust dilemma, simultaneous
choice, and still retain its important psychological properties.


It is enormously important to understand when people will trust each
other and when they will not. Experimental social psychology has
invested great energy in studying this problem through what are called
trust dilemmas or Prisoner’s Dilemmas (see Wrightsman, O’Connor, &
Baker, 1972, for an extensive review of this literature). In a trust
dilemma each person is much better off competing if the other person
cooperates (because one exploits the other person’s generosity) and is
also better off competing if the other person competes (because one
defends oneself against exploitation by the other person). Thus,
regardless of what the other person does, competing is always a more
profitable choice. However, and this is the dilemma, both parties will
be worse off if they each compete than they would be if they could
both trust each other and cooperate.
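
The structure of the dilemma just described can be checked against a small payoff table. The numbers below are hypothetical (the passage gives no payoffs) and were chosen only so that the two properties named in the text hold; the names `payoff` and `best_reply` are likewise illustrative.

```python
# Hypothetical payoffs to one player; keys are (own choice, other's choice).
payoff = {
    ("compete", "cooperate"): 5,    # exploiting the other's generosity
    ("cooperate", "cooperate"): 3,  # mutual trust
    ("compete", "compete"): 1,      # mutual competition
    ("cooperate", "compete"): 0,    # being exploited
}


def best_reply(other_choice):
    """Return the own choice that pays more against other_choice."""
    return max(("compete", "cooperate"),
               key=lambda own: payoff[(own, other_choice)])


# Competing is the better reply whatever the other person does ...
competing_dominates = all(best_reply(other) == "compete"
                          for other in ("cooperate", "compete"))
# ... yet both parties do better under mutual cooperation.
mutual_trust_is_better = (payoff[("cooperate", "cooperate")]
                          > payoff[("compete", "compete")])
print(competing_dominates, mutual_trust_is_better)
```

Any payoffs with the same ordering would serve equally well; the dilemma depends on the ordering, not on the particular values.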

The question is increasingly raised, how- 
ever, as to whether people decide to trust 
someone else in laboratory experiments on 
the same basis that they do outside the lab- 
oratory (see Wrightsman, O'Connor, & Baker, 
1972, also for a discussion of these issues). 
The present studies address themselves to 
one possible issue in this debate. In labora- 
tory experiments on trust dilemmas, both 
parties to the dilemma are usually asked to 
decide whether to trust each other simultane- 
ously and in ignorance of each other’s choice. 
Mutual simultaneous choice is necessary for a 
situation to be considered a formal trust di- 
lemma (Rapoport & Chammah, 1965a). How- 
ever, in social interaction situations outside 
the laboratory, people rarely make choices 
that are precisely simultaneous and continue 
to be so over an extended period of time. 
Social interaction ordinarily consists of a 
sequence of events in which each party’s ac- 
tion in turn stimulates subsequent action by 
the other party. If making choices in se- 
quence removes the logical and psychological 
dilemma in trusting, the dynamic of the trust
dilemma is applicable only to a very narrow
and special set of circumstances. 

We wish to demonstrate that there are two 
forms of sequential interaction. One of them 
removes the dilemma, to the extent that the 
dilemma exists because there is ambiguity 
in deciding why someone has chosen not to 
cooperate. The other form, however, contin-
ues to make it difficult to decide whether peo- 
ple are interested in exploitation or only in 
protecting themselves and is also a much more 
reasonable analogue to ordinary social inter- 
action than either of the other two conditions 
we will be considering. 


Ambiguity of Simultaneous Choice 


Although choosing to compete is always 
more profitable in a trust dilemma than choos- 
ing to cooperate, the motivation for choosing 
to compete may be quite different depending 
upon whether the other party is expected to 
cooperate (in which case competing is a mat- 
ter of exploitation) or expected to compete 
(in which case competing is a matter of 
self-defense). If one knows only that someone 
has chosen to compete, it is impossible to 
know whether his or her motivation was ex- 
ploitation or self-defense. In a situation 
where both people are choosing simultane- 
ously, it will be impossible to know with cer- 
tainty whether a person is competing to gain 
advantage or to protect him- or herself. Kelley
and Stahelski (1970b) have mapped out
some of the problems that occur because 
people make different assumptions about 
what others will do, and McClintock (1972) 
has developed a sophisticated analysis of the 
different social motives potentially operative 
in trust dilemmas. 


Clarity of Alternating First and 
Second Choices 


The ambiguity involved in choosing to com- 
pete is fully removed if the person competing 
has already been informed whether the other 
party is competing or cooperating. If the 
other party is cooperating, the only reason to 
compete would be a desire to exploit. If the 
other party is competing, a desire to defend 
oneself is sufficient motivation to compete. 
Conversely, if the other party is competing,
the only reason for cooperating in the second
position would be a desire to create trust.
There is no dilemma for a person choosing
second in a sequential choice situation, and
there is little difficulty in inferring the mo-
tives of a person in this position (cf. Kelley
& Stahelski, 1970a). If people choose sequen-
tially rather than simultaneously and take
turns choosing first and second, each person
should have a variety of occasions on which
to signal the other person without ambiguity
as to whether he or she is interested in exploi-
tation, or only in self-defense, or actually in
creating trust. If people compete in trust
dilemmas in part merely because they cannot
send or receive unambiguous messages of
cooperative intent (cf. Deutsch, 1958, 1960)
or because ambiguity serves as a smoke screen
that makes it easier to get away with being
exploitative, an interaction in which they
alternate choosing first and choosing second
should produce a greater degree of coopera-
tion than a situation in which both parties
choose simultaneously.
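
The inference rule for a second mover described above can be sketched as a small lookup. This is only an illustration of the argument, not part of the original procedure; the function name and the label used for cooperation that follows cooperation (a case the text does not name) are assumptions.

```python
# Illustrative sketch: motive attributed to the person choosing second
# when the first move has already been announced.
# The label for (cooperate, cooperate) is an assumption; the text names
# only the other three cases.
MOTIVES = {
    ("cooperate", "compete"): "exploitation",
    ("compete", "compete"): "self-defense",
    ("compete", "cooperate"): "creating trust",
    ("cooperate", "cooperate"): "reciprocating cooperation",
}


def infer_motive(first_move: str, second_move: str) -> str:
    """Return the motive implied by a second move, given the known first move."""
    return MOTIVES[(first_move, second_move)]


if __name__ == "__main__":
    for (first, second), motive in MOTIVES.items():
        print(f"first={first:9s} second={second:9s} -> {motive}")
```

Under simultaneous choice no such table can be applied, because the second argument is chosen without knowledge of the first.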

Direct tests of this hypothesis so far have
yielded ambiguous results. Swinth (1967)
found a trend toward more cooperation fol-
lowing an alternating-choice situation, but
Wrightsman, Bruininks, Lucker, & O’Connor
(1972) found no effect. Deutsch (1958,
1960) found more cooperation in simultaneous
play, but he did not have parties alternate
choosing first and choosing second. Oskamp
(1974) also did not have parties alternate
first choices. He, too, found more cooperation
in simultaneous play, but only when the pay-
off matrix had a positive average expected
payoff. When the average expected payoff
was zero, there was more cooperation in se-
quential play.




Ambiguity of Continuing Interaction 


Knowing whether the other person has al-
ready decided to compete or cooperate, however,
may not remove the dilemma from one’s deci-
sion of whether in turn to compete or coop-
erate. In ordinary social intercourse, each
action is both a response to a previous action
by the other party and a stimulus to a subse-
quent action by the other party. As a re-
sponse, the action of competing or cooperat-
ing may be taken as a second choice in a trust
dilemma, fixing at that moment how much
reward each party gets as a function of their
joint current orientations toward competing 
and cooperating. As a stimulus, the action of 
competing or cooperating may be taken as a 
first choice in a trust dilemma, setting the 
stage for the other person to determine by his 
or her subsequent choice how much reward 
each party gets as a function of their current 
orientations. 

In what may be called continuing interac- 
tion, since each move is both a first move (a 
stimulus) and a second move (a response), 
there is no longer a pure second move, and 
one can no longer infer unambiguously 
whether a person competes out of a bold 
desire to exploit or a cautious desire to pro- 
tect him- or herself. Even if the other person 
has just chosen cooperatively, the subject 
cannot be sure that this is not a device to 
induce a cooperative choice that will subse- 
quently be exploited to the net advantage of 
the other person. A continuing interaction of 
this sort conveys more information about par- 
ticipants’ motives than simultaneous choos- 
ing, but less than an alternating situation in 
which people’s moves count only as either
first moves or second moves, not both. We
would thus expect that the problem of
whether to trust someone, as determined by
the precision with which that person’s moti-
vation to cooperate can be estimated, would
be easiest with alternating interaction, harder
with continuing interaction, and hardest with
simultaneous interaction. The present research
is designed to test this analysis.


Sex Differences 


Because sex differences have been an almost
universal finding for trust dilemmas from the
time they were explored by Rapoport and
Chammah (1965b), they may be expected in
the present research as well. Most past re-
search has found that females are less likely
to cooperate in trust dilemmas than are males
(cf. Hottes & Kahn, 1974). The reasons for
this finding are still somewhat obscure, but a
number of provocative analyses (Bedell &
Sistrunk, 1973; Conrath, 1972; Hottes &
Kahn, 1974; Johnston, Markey, & Messé,
1973) have suggested that females are simply
less aware that communicating cooperative 
intentions is a critical feature of the situa- 
tion. If this is so, we would expect that the 
enhanced opportunity to communicate coop- 
erative intentions in the continuing and al- 
ternating games would be of more benefit to 
males than to females. 

The design for the present research was a 
2 x 3 factorial using sex of subject and inter- 
action structure (simultaneous, continuing, or 
alternating) as independent variables. Two 
studies were carried out. The purpose of 
Study 2 was to assess the reliability of the 
findings in Study 1, to remove a potential 
confound involving sex of experimenter, and 
to obtain explicit measures of subjects’ at- 
tributions of intention for various choice se- 
quences in each interaction condition. Since 
the procedures for the second study were the 
same as for the first study, except where 
noted, the two studies will be described to-
gether.


Method


Subjects and Experimenters 


In Study 1, 20 male and 22 female undergraduates 
acted as experimenters. Same-sex students were 
paired off for a total of 10 male and 11 female 
experimenter teams. Each team of experimenters 
recruited and ran three pairs of undergraduate 
volunteers of the same sex as the team, one pair 
randomly assigned to each of the three experimental 
conditions. Thus 20 males and 22 females served in 
each of the three conditions. 

In Study 2, six male and four female undergradu- 
ates acted as experimenters. Same-sex students were 
paired off for a total of two male and two female 
experimenter teams (one of the male teams had four 
members). Since one of the purposes of Study 2 was
to avoid the confounding of subject and experi-
menter sex that was set up in Study 1, each team of
experimenters in Study 2 recruited and ran six pairs
of undergraduate volunteers, three male pairs and
three female pairs. Each team randomly assigned one
pair of each sex to each of three experimental condi- 
tions. Thus, eight males and eight females served in 
each of the three conditions. The order in which 
conditions were run by each experimenter team in 
both Study 1 and Study 2 was randomly deter- 
mined for each team. 


Procedure 


The experiments were conducted in dormitories 
and other living units. Subjects were seated in sepa-
rate rooms, with care being taken to conceal each
subject’s identity from the other. The experimenters 
were also separated, each taking the responsibility 
for briefing one of the subjects. 

Subjects were told that the study was one of how 
people made choices in a social situation. Their 
ability to handle some difficult social situations 
would be tested by having them play a special game. 
Evaluations would be based on the number of points 
each person earned in the game. The better a person 
handled such situations, the more points he or she 
would earn.

In the game, subjects were told, each player 
would have two possible moves, A or B, with the 
amount a person won or lost in each round de- 
pending both on the choice the person made and on 
the choice the other player made. To facilitate sub- 
jects’ understanding of the game, they were handed 
a sheet of paper listing all the possible combina- 
tions of moves, along with their associated payoffs. 
The sheet showed that if both parties chose A, each 
would earn 10 points. If the subject chose A and the 
other person chose B, the other person would win 50 
points and the subject would lose 50 points. If the 
subject chose B and the other person chose A, the 
subject would win 50 points and the other person 
would lose 50 points. If both parties chose B, each 
would lose 20 points. Subjects were quizzed to be 
sure that they understood these rules and were also 
told that they could keep their copy of the handout 
showing the various moves and payoffs and refer to 
it whenever they wished during the game. 
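
The payoff rules on the handout can be written out as a small table. The sketch below is illustrative only; the names `PAYOFFS` and `payoff` are hypothetical. It also checks the defining property of the dilemma: B (competing) pays more than A (cooperating) whatever the other player does, yet mutual A pays more than mutual B.

```python
# Payoffs as (own points, other's points), keyed by (own move, other's move),
# taken from the handout described in the Procedure section.
PAYOFFS = {
    ("A", "A"): (10, 10),
    ("A", "B"): (-50, 50),
    ("B", "A"): (50, -50),
    ("B", "B"): (-20, -20),
}


def payoff(own: str, other: str) -> int:
    """Points earned by a player choosing `own` against `other`."""
    return PAYOFFS[(own, other)][0]


def b_dominates_a() -> bool:
    """True if B pays more than A against every possible move by the other."""
    return all(payoff("B", other) > payoff("A", other) for other in ("A", "B"))


if __name__ == "__main__":
    print("B dominates A:", b_dominates_a())
    print("Mutual A beats mutual B:", payoff("A", "A") > payoff("B", "B"))
```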

At this point subjects were given a tally sheet on 
which columns were marked for them to record their 
move, the other person’s move, their payoff, and the 
other person’s payoff for each round. The manipula- 
tion of the interaction condition was then intro- 
duced, with players being instructed how to fill out 
their tally sheet on each round. To prevent end game 
effects, subjects were told only that they would play 
“a fair number” of rounds; they were actually 
allowed to play 18 rounds. Following play, sub-
jects in Study 2 were asked to rate the extent
to which they would infer either a desire to exploit 
or a desire to create trust from various moves made 
by the other player after their own previous move 
had been either competitive or cooperative. These 
ratings were made on scales ranging from 1 for “not 
at all” to 7 for “very much.” Finally, subjects were 
brought together and allowed to talk about their 
experiences and to ask questions of the experiment- 
ers. The experimenters explained the study, gave 
subjects a feedback sheet that they could take with 
them to review at their leisure, and thanked the 
subjects for their participation. 


Manipulation of Interaction Structure 


In the simultaneous-choice condition, subjects were 
asked to write down their move for each round 
before being informed of the other player’s move. 
After both subjects had made their choices, the 
experimenters informed each subject of what the 
other’s choice had been. 

In the alternating-choice condition, subjects were
told that one would choose first and have his or her
choice announced to the other, who would then in
turn make a choice. Subjects were also told that
they would take turns being in the position of
moving first and that the person who was to start
the game would be chosen by lot. It was suggested
that subjects keep track of which player had chosen
first on each round by circling on their tally sheet
the choice made by the player who chose first.

In the continuing-choice condition, as in the 
alternating-choice condition, subjects were told that 
choices would be made in sequence and that it 
would be determined by lot who would start the 
game. It was then explained that each choice they 
made would always count double (except for the 
person starting the game)—once as the last move in 
one round and once as the first move in the next 
round. If the subject were moving last in, say, 
Round 7, he or she would write down that move 
once in the appropriate column for Round 7 and 
then once again directly below this, in the same 
column, as the move for Round 8. Similarly, when 
the other player’s move for Round 8 was announced, 
the subject would write it down once as the last 
move for Round 8 and once again as the first move 
for Round 9. Again, subjects were asked to keep 
track of who had chosen first on each round by 
circling the choice of the player who had gone first.
Subjects in all cases were quizzed to be sure that 
they understood the tally sheet and the sequence of 
moves before the game itself was started. 
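
The difference between the alternating and continuing tally procedures can be sketched as two ways of grouping one sequence of announced moves into rounds. This is a hypothetical illustration (function names are assumptions): each round is represented as a pair (first move, second move), and in the continuing condition every move after the opening one is counted twice.

```python
from typing import List, Tuple

Round = Tuple[str, str]  # (first move in the round, second move in the round)


def alternating_rounds(moves: List[str]) -> List[Round]:
    """Group moves into disjoint pairs: each move counts in exactly one round."""
    return [(moves[i], moves[i + 1]) for i in range(0, len(moves) - 1, 2)]


def continuing_rounds(moves: List[str]) -> List[Round]:
    """Overlapping pairs: each move closes one round and opens the next."""
    return [(moves[i], moves[i + 1]) for i in range(len(moves) - 1)]


if __name__ == "__main__":
    sequence = ["A", "B", "A", "A"]
    print("alternating:", alternating_rounds(sequence))
    print("continuing: ", continuing_rounds(sequence))
```

On this reading, 18 rounds require 36 moves in the alternating condition but only 19 in the continuing condition.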


Results 


Results of Study 1 were analyzed by means
of 2 × 3 analyses of variance using sex of
subject and interaction condition as indepen-
dent variables. Sex of experimenter was added
as a factor in the analysis of the choice be-
havior measures in Study 2. No significant
main effect or interactions involving sex of
experimenter were found.

The proportion of cooperative choices in a
dyad in Study 1 and in Study 2 is shown in
Table 1. In both studies there was a signifi-
cant interaction between sex of subject and
interaction condition: F(2, 57) = 4.24, p <
.019, for Study 1, and F(2, 12) = 4.00, p <
.047, for Study 2. In each case an analysis of
simple main effects showed that interaction
condition had a significant effect only for
males: F(2, 57) = 7.37, p < .005, for Study
1, and F(2, 12) = 8.43, p < .01, for Study 2.
In both studies a Newman-Keuls analysis
indicated that males in the alternating condi-
tion had a significantly (α = .05) higher pro-
portion of cooperative choices in their dyads
than males in the simultaneous and continu-
ing conditions and that between those latter
two conditions there was no significant dif-
ference.
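
As a rough check on the simple main effect reported for males in Study 1, the F ratio can be recomputed from the cell means, standard deviations, and pair counts given in Table 1. The sketch below is not the authors' analysis. It assumes that the pooled within-cell variance of all six cells serves as the error term, and it reads the standard deviation for males in the continuing condition as .17, an entry that is damaged in the scanned table.

```python
from typing import List, Tuple


def simple_effect_f(
    means: List[float],
    n_per_cell: int,
    cells: List[Tuple[float, int]],
) -> Tuple[float, int, int]:
    """F ratio for a one-way effect, using a pooled error term.

    means      -- cell means for the groups being compared
    n_per_cell -- number of dyads behind each of those means
    cells      -- (SD, n) for every cell contributing to the error term
    """
    k = len(means)
    grand = sum(means) / k
    ms_between = n_per_cell * sum((m - grand) ** 2 for m in means) / (k - 1)
    df_error = sum(n - 1 for _, n in cells)
    ms_error = sum((n - 1) * sd ** 2 for sd, n in cells) / df_error
    return ms_between / ms_error, k - 1, df_error


# Study 1 values as read from Table 1 (proportions of cooperative choices).
male_means = [0.34, 0.39, 0.69]
all_cells = [
    (0.21, 10), (0.17, 10), (0.28, 10),  # male SDs, 10 pairs per condition
    (0.28, 11), (0.14, 11), (0.20, 11),  # female SDs, 11 pairs per condition
]

if __name__ == "__main__":
    f_ratio, df1, df2 = simple_effect_f(male_means, 10, all_cells)
    print(f"F({df1}, {df2}) = {f_ratio:.2f}")
```

The result lands close to the reported F(2, 57) = 7.37; a small discrepancy is to be expected because the table entries are rounded to two decimals.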

In Study 1 the proportion of cooperative 
moves in a dyad, given a cooperative prior 
move by the other party, was significantly 
greater for males, F(1, 56) = 7.00, p < .011.
Although the interaction effect was not sta-
tistically significant, males in the alternating
condition responded to a prior cooperative
move by the other party with a much higher
proportion of cooperative choices than males
in the other two interaction conditions (see
Table 2). The significant sex-of-subject effect
on this measure was not replicated in Study 2.
In neither study were there any significant
treatment effects on the proportion of coop- 
erative choices in a dyad given a competitive 
prior move by the other party. 

Using the arc sine transformation sug- 
gested by Myers (1972) for proportions did 


Table 1
Means and Standard Deviations of the
Proportion of Cooperative Choices in a Dyad

                      Interaction condition
Sex of           Simul-     Contin-    Alter-
subject          taneous    uing       nating

Study 1 (a)
  Male
    M             .34        .39        .69
    SD            .21        .17        .28
  Female
    M             .36        .29        .32
    SD            .28        .14        .20
Study 2 (b)
  Male
    M             .31        .18        .67
    SD            .12        .13        .20
  Female
    M             .49        .46        .48
    SD            .24        [illegible] [illegible]

(a) There were 10 male pairs and 11 female pairs in
each interaction condition.
(b) To facilitate comparisons with Study 1, the data
for male and female experimenters were combined.
There were 4 male and 4 female pairs in each inter-
action condition.
Table 2
Means and Standard Deviations of the
Proportion of Cooperative Moves in a Dyad
Given a Cooperative Prior Move by the Other
Party, Study 1

                      Interaction condition
Sex of           Simul-     Contin-    Alter-
subject          taneous    uing       nating

Male
  M               .33        .47        .70
  SD              .26        .31        .29
  n               10         10         10
Female
  M               .32        .25        .34
  SD              .33        .26        .27
  n               10         11         11


not essentially change any of the results of 
the analyses of variance reported above. 
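
For readers unfamiliar with the arc sine transformation mentioned above, a common form is φ = 2 arcsin(√p), which stretches proportions near 0 and 1 so that their variance depends less on the mean. Whether Myers (1972) uses exactly this scaling is an assumption here; the sketch only illustrates the general idea.

```python
import math


def arcsine_transform(p: float) -> float:
    """Variance-stabilizing transform for a proportion p in [0, 1].

    Uses the common form 2 * arcsin(sqrt(p)); the factor of 2 is a
    convention and is assumed here rather than taken from the source.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p))


if __name__ == "__main__":
    for p in (0.34, 0.39, 0.69):  # male cell means from Study 1
        print(f"p = {p:.2f} -> phi = {arcsine_transform(p):.3f}")
```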

The data collected in Study 2 on subjects’ 
attributions of either a desire to exploit or a 
desire to create trust in hypothetical choice 
sequences are presented in Table 3. As can be 
seen, when a cooperative choice of their own 
is followed by a competitive choice by their 
partner, subjects are much readier to say that 
their partners want to exploit them in the 
alternating and continuing conditions than in 
the simultaneous condition, and especially in 
the alternating condition, F(2, 42) = 6.99,
p < .01. When a competitive move of their
own is followed by a cooperative move by
their partner, subjects are much more willing
to attribute a desire to create trust to their
partner in the alternating condition than in
either the simultaneous or the continuing
conditions, F(2, 42) = 3.34, p < .05. When
a cooperative move of their own is followed
by a cooperative choice by their partner, sub-
jects are also somewhat readier to say that
their partners want to create trust in the
alternating and continuing conditions than in
the simultaneous condition, especially in the
alternating condition, F(2, 42) = 3.18, p <
.06. Finally, competition that follows compe-
tition is not seen as particularly indicative of
a desire to exploit in any of the three condi-
tions. There were no sex differences on these
variables and no Sex of Subject × Interac-
tion Condition interactions.
Table 3
Mean Attributions of Intentions to Other
Person as a Function of Choice Sequence
and Interaction Condition

Choice sequence                 Interaction condition
Own          Other's
previous     next           Simul-     Contin-    Alter-
choice       choice         taneous    uing       nating

Attributed desire to exploit as motive for competing
Cooperate    Compete         3.00       4.56       5.06
Compete      Compete         2.12       2.50       2.31

Attributed desire to create trust as motive for cooperating
Compete      Cooperate       3.19       3.63       5.06
Cooperate    Cooperate       4.12       5.12       5.56

Note. n = 8 for each interaction condition. All scales
range from 1 for “not at all” to 7 for “very much.”


Discussion 


There are three main conclusions suggested 
by this research. The first is that trust dilem- 
mas appear to generalize to continuing inter- 
actions, in which parties choose in turn but 
each party’s choice counts as both a response 
to the partner’s previous move and a stimu- 
lus for the partner’s next move. The rational 
dilemma of whether to trust somebody and 
the attributional dilemma of how to interpret 
that person’s intentions hold in modified form 
for continuing interaction as well as for si- 
multaneous interaction, in which both par- 
ties choose at the same time in ignorance of 
each other’s current choice. This result has 
important implications for the generalizabil- 
ity of the findings of laboratory studies of 
trust dilemmas and for the conduct of future 
laboratory studies. These studies have typi- 
cally involved simultaneous interaction. Out- 
side the laboratory, however, interactions are 
much more likely to be continuing than 
simultaneous. (Consider, for example, such 
phenomena as arms races, contract negotia- 
tions, courtship, seduction, and contract 
bridge.) By showing that trust dilemmas are 
relevant to continuing interactions as well 
as to simultaneous ones, the present study
encourages us to believe that the dynamics
of trust dilemmas underlie problems of sus-
picion and distrust outside the laboratory as
well as inside it. Furthermore, by indicating
that we can relax one of the most demanding
constraints of the formal trust dilemma
(simultaneous choice) and still retain many
of the dilemma’s properties, the present
research opens the door to the use in future
laboratory studies of better experimental
analogues, namely, those involving continuing
interaction.

The second conclusion is that at least for
males, trust is hard to achieve in part simply
because it is hard to make unambiguous in-
ferences about other people’s motivations in
the ordinary simultaneous-choice trust di-
lemma. Even if the subject has chosen co-
operatively on his or her own previous move,
a subsequent competitive move by the other
party is not seen as indicating a clear desire
to exploit in a simultaneous game (see Table
3). Even if the subject has chosen competi-
tively on his or her own previous move, a
subsequent cooperative move by the other
party is not seen as indicating a clear desire
to establish trust in a simultaneous game. On
the other hand, both these inferences are
clearly made for second moves in an alter-
nating game, in which the first move is
known in advance by the party making the
second move. This knowledge proved suf-
ficient to establish a much higher degree of
cooperation among males in the alternating
game than in the simultaneous game. Male
subjects were apparently willing and able to
signal their interest in cooperation in an
alternating game and to respond coopera-
tively to such signals given by their partners.
Where a disposition to cooperate exists, the
present data indicate that we may take a
long step toward resolving trust dilemmas
merely by altering the structure of choice in
a way that makes each party’s motivation
clear to the other party.

It should be noted that alternating inter-
action is a clear advantage for cooperation
only when the question of inferring another’s
motives is paramount. This is not the case,
for example, in a situation in which two
parties unknowingly determine each other’s
fate, called the “minimal social situation.”
The main questions for the actors in the
minimal social situation are discovering which
choice the other person prefers and com- 
municating to the other person which choice 
they prefer. Kelley, Thibaut, Radloff, and 
Mundy (1962) have demonstrated that under 
these circumstances, a cooperative solution 
is more likely if the parties choose simul- 
taneously rather than in alternation. 

The third conclusion is that the basis for 
failure to achieve trust in a trust dilemma is 
to some extent different for males and fe- 
males. Like males, females made more un- 
equivocal inferences of motivation in an al- 
ternating interaction than in a simultaneous 
one. Unlike males, however, females were not 
thereby more cooperative under alternating 
than under simultaneous conditions. It would 
appear that the testing and communicating 
of cooperative intentions is a critical factor 
for males in resolving trust dilemmas, but 
not for females. This sex difference may ac- 
count for Wrightsman, Bruininks, Lucker, & 
O'Connor’s (1972) not finding increased co- 
operation in the alternating condition. They 
had twice as many female as male subjects 
and did not include sex of subject as a factor 
in the design. The puzzle of female distrust 
is all the greater because in other situations, 
females are more cooperative and more com- 
passionate than males (Uesugi & Vinacke, 
1963; Vinacke, 1959). We suspect that fe- 
males are not defining compassion as relevant 
to the standard trust dilemma and that if 
they were induced to do so, this would be
the key to elevating their tendency to co-
operate.
Three experiments


examined the influence of 
physical attractiven 


» 15 females made 
by a photograph 
bility and attrac- 
’ judgments. 


Two factors have frequently been hypothe- 
sized to play an important role in dating 
choice (e.g., Walster, Aronson, Abrahams, & 
Rottman, 1966). One factor is the attractive- 
ness of a prospective date, and the other is 
the perceived probability of being accepted 
by the date. The importance of attractive- 
ness, particularly the physical appearance of 
the date, has been repeatedly demonstrated 
(Byrne, Ervin, & Lamberth, 1970; Kleck & 
Rubenstein, 1975; Walster et al., 1966).
However, there has been no consistent evi-
dence about the importance of probability of
acceptance. The primary purpose of this re-
search is to determine what influence proba-
bility has on dating choice.

Previous attempts to evaluate probability
of acceptance have generally compared the
dating choices of attractive and unattractive
subjects. Unattractive subjects are assumed
to have a lower expectancy of being accepted
by any given date. Therefore, these subjects
would presumably choose less attractive dates
than would more attractive subjects. This
should be particularly true when the saliency
of rejection is emphasized.

The results using this between-subjects ap-
proach have not been conclusive. Berscheid,
Dion, Walster, and Walster (1971) found
that unattractive subjects did choose slightly
less attractive dates than did attractive sub-
jects. However, they failed to find any dif-
ferential effects when rejection was made
more salient. In comparison, Huston (1973)
found no difference in the dates chosen by
attractive and unattractive subjects. How-
ever, he did find that all subjects chose more
attractive dates when acceptance was assured.
Other studies have generally led to equally
inconclusive results.


In contrast to previous efforts, the present
research was designed to study the influence
of probability of acceptance for individual
subjects. That is, rather than studying the
effects of probability by a comparison be-
tween groups of subjects, analyses were made
by using a within-subjects approach. The
research strategy is thus based on analyzing
the impact that probability has on the cog-
nitive processes used by individuals in mak-
ing dating choices.

A second but related purpose of this re-
search was to investigate how probability is
used in dating choice. That is, assuming 
probability is important, how is it combined 
with other information such as physical at- 
tractiveness? Several approaches suggest that 
attractiveness and probability should com- 
bine by multiplying. For instance, level of 
aspiration theory predicts that the desira- 
bility of a goal should be multiplied by the 
probability of attaining that goal (Lewin, 
Dembo, Festinger, & Sears, 1944). According 
to this view, dating choice should depend on 
the product of the desirability (or attractive- 
ness) of the date and the probability of ac- 
ceptance (Walster et al., 1966). 

From a rather different viewpoint, sub-
jective expected utility theory states that
choices between risky alternatives (such as
gambles) should depend on the product of
the payoff times the probability of attaining
the payoff (Edwards, 1961). Support for
this multiplying rule of risky decision making
has been found in several types of gambling
situations (Shanteau, 1974, 1975). However,
the rule has not been evaluated in more com-
plex social judgment settings such as dating
choice.


Formally, the multiplying rule can be writ- 
ten as 


R = P × PA, (1)


where the desirability of a date (R) equals 
the product of the probability of acceptance 
(P) and the physical attractiveness (PA) 
of the date. Statistically, Equation 1 predicts 
a significant interaction between probability 
and attractiveness that should be concen-
trated in the bilinear (Linear × Linear)
trend component (Anderson & Shanteau,
1970; Shanteau, 1977).
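
The statistical prediction of Equation 1 can be illustrated numerically. The sketch below is not the procedure of Shanteau (1977); it is a simplified decomposition on a table of cell means, with the deviations of the marginal means used as the linear contrast coefficients, and all numerical values are hypothetical. For a table generated exactly as R = P × PA, the whole interaction sum of squares falls in the bilinear component and the residual is zero.

```python
from typing import List, Tuple


def bilinear_decomposition(table: List[List[float]]) -> Tuple[float, float, float]:
    """Split the interaction of a two-way table of cell means.

    Returns (SS interaction, SS bilinear component, SS residual), where the
    bilinear contrast uses deviations of row and column marginal means.
    """
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    a = [m - grand for m in row_means]   # row contrast coefficients
    b = [m - grand for m in col_means]   # column contrast coefficients

    ss_interaction = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n_rows) for j in range(n_cols)
    )
    contrast = sum(a[i] * b[j] * table[i][j]
                   for i in range(n_rows) for j in range(n_cols))
    denom = sum(x * x for x in a) * sum(y * y for y in b)
    ss_bilinear = contrast ** 2 / denom if denom > 0 else 0.0
    return ss_interaction, ss_bilinear, ss_interaction - ss_bilinear


if __name__ == "__main__":
    # Hypothetical subjective values, for illustration only.
    probabilities = [0.0, 0.25, 0.5, 0.75, 1.0]
    attractiveness = [1.0, 2.0, 3.0, 5.0]
    multiplicative = [[p * pa for pa in attractiveness] for p in probabilities]
    ss_int, ss_bil, ss_res = bilinear_decomposition(multiplicative)
    print(f"interaction={ss_int:.4f} bilinear={ss_bil:.4f} residual={ss_res:.4f}")
```

A table with a crossover pattern that the marginal means cannot capture would instead leave a nonzero residual, which is what the test of the residual component is meant to detect.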

Thus, the present studies were designed
to determine whether probability is impor- 
tant in dating and, if so, to evaluate how 
probability is combined with attractiveness 
to arrive at a dating choice. In the first of 
three experiments, probability of acceptance 
was explicitly provided along with a photo- 
graph of the prospective date. This procedure 
allowed a direct assessment of the impact of 
probability and attractiveness on dating 
choice. In the second and third experiments, 
only photographs were provided, with the 
probability unspecified. Since probability was 
left up to each subject to infer, these studies 
were intended to mirror more closely actual 
dating situations. 


Experiment 1 


In Experiment 1, female subjects were 
asked to choose between two male dates. 
Each date was described by a photograph 
and a verbal statement of the probability 
that he would agree to go out with the sub- 
ject. Both the attractiveness of the photo- 
graphs and the probability of acceptance 
were systematically varied in a factorial de- 
sign. This design permitted a straightforward 
analysis of whether probability influenced 
the subjects’ judgments. It also allowed a 
direct assessment of how probability is com- 
bined with physical attractiveness in making 
a dating choice. 


Method 


Subjects. Fifteen female undergraduates were 
each paid $1.50 per hour to participate in the re- 
search. The females were all unmarried, free to 
date, and between 17 and 22 years old. All sub- 
jects were run individually. 

Stimuli. Subjects made preferential choices be-
tween two dates such as “Tom/Fairly likely” and 
“Joe/Unlikely.” The names referred to photographs 
on a nearby display board; the verbal probability 
phrase represented the likelihood that the pictured 
male would be willing to go out with the subject 
on a date.

The stimulus information was presented on index 
cards with one date on the left side of the card 
and the other on the right. There were 140 date
pairs from a 7 (left probability) × 5 (left photo) ×
2 (right probability) × 2 (right photo) factorial
design. For the left date, the seven probability
phrases were Sure thing, Highly likely, Fairly likely,
Toss-up, Somewhat unlikely, Very unlikely, and No
chance. These were combined with five photographs 
that varied from quite attractive to unattractive. 
For the right date, the probabilities were either 
Likely or Unlikely, and either an attractive or an 
unattractive photograph was used. The left and 
right dates were counterbalanced so that each pho- 
tograph and probability appeared equally often on 
each side of the card. 
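
The size of the stimulus set follows directly from crossing the four factors. The sketch below enumerates the design; the left-photo names are those that appear in the Results, while the right-photo labels are placeholders, since the text does not say which photographs were used on the right.

```python
from itertools import product

LEFT_PROBABILITIES = [
    "Sure thing", "Highly likely", "Fairly likely", "Toss-up",
    "Somewhat unlikely", "Very unlikely", "No chance",
]
LEFT_PHOTOS = ["Tom", "Bob", "John", "Jim", "Bill"]
RIGHT_PROBABILITIES = ["Likely", "Unlikely"]
# Placeholder labels (assumption): the source names no specific right photos.
RIGHT_PHOTOS = ["attractive photo", "unattractive photo"]

# Each stimulus card pairs one left date with one right date.
date_pairs = list(product(LEFT_PROBABILITIES, LEFT_PHOTOS,
                          RIGHT_PROBABILITIES, RIGHT_PHOTOS))

if __name__ == "__main__":
    print(len(date_pairs), "date pairs")
    left_prob, left_photo, right_prob, right_photo = date_pairs[0]
    print(f"{left_photo}/{left_prob}  vs.  {right_photo}/{right_prob}")
```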

The photographs were candid yearbook shots and 
were selected on the basis of preliminary ratings 
to cover a wide range of physical attractiveness. 
The names identifying the photographs were the 
seven most common male names (John, Bob, Bill, 
Jim, Tom, Joe, and Dick) listed by Battig and 
Montague (1969, p. 35); the names were randomly 
assigned to the photographs. The verbal proba- 
bility phrases were selected on the basis of norma- 
tive ratings (Lichtenstein & Newman, 1967) to 
cover a wide range; these same phrases have also 
been used in prior research (Shanteau, 1974, 1975). 

Procedure. The subject’s task was to compare 
the two dates on a stimulus card and to “make a 
choice as to which you would prefer.” Although the 
subject was aware that she would not actually go 
out on a date, an attempt was made to make the 
task meaningful. For instance, photographs were 
taken of all subjects comparable to the pictures of 
male dates. The subjects were to assume that the 
probability phrase represented the male’s reaction 
after seeing her photograph. 

The subject indicated her date preference by lo- 
cating a pointer on a 40-cm unmarked response 
bar. The left end of the bar was defined as a “very 
strong preference for the date on the left,” and the 
right end was defined as a “very strong preference 
for the date on the right.” The middle of the 
scale represented “no preference between the two 
dates.” The ends of the response bar were also ex- 
emplified by anchor stimuli that were more extreme 
than any of the experimental stimuli. These anchors 
were used to decrease the possibility of floor or 
ceiling effects. 

After presenting each dating pair, the experimenter 
recorded the subject’s response to the nearest milli- 
meter, using a ruler on the rear of the scale. The 
scale ranged from +100 on the left to —100 on the 
right. 

The experiment consisted of three daily 1-hour 
sessions. The initial part of the first session was 
devoted to instructions and practice. Following the 
practice, the subject was required to correctly sum- 
marize the instructions back to the experimenter. 
The entire set of experimental stimuli was then pre- 
sented. Sessions 2 and 3 began with a brief sum- 
mary of the instructions. In Session 2, the com- 
plete set of stimuli was presented for a second time; 
two more replications were then administered in
Session 3. Prior to each replication, the stimulus
cards were shuffled to provide a different
presentation order.

Results


The group mean preference responses for
the left dates averaged over the right dates
are plotted in Figure 1. For example, the top
right point corresponds to the mean prefer-
ence for the photograph of Tom when it was
paired with a Sure thing probability; the
mean response in this case was roughly +65.

The names have been spaced along the
horizontal axis according to their marginal
means and reflect the attractiveness of the
corresponding photos: the farther to the
right, the more attractive the photograph.
Thus, Tom and Bob were seen as highly at-
tractive, John and Jim were of intermediate
attractiveness, and Bill was viewed as fairly un-
attractive.

Each curve in Figure 1 corresponds to the
listed probability. As can be seen, preference
for dates increases as the probability becomes
more certain. For instance, “Tom/Sure
thing” has a much higher preference than
“Tom/No chance.”

The results in Figure 1 not only suggest 
that probability of acceptance is important 
but also bear on the combination rule used 
to integrate probability with attractiveness. 
The multiplying model in Equation 1 predicts 
that the seven curves in Figure 1 should form 
a fan of straight diverging lines (Anderson & 
Shanteau, 1970). As can be seen, the curves 
are very nearly linear and diverge in the pre- 
dicted fashion. Similar-appearing results were 
obtained for all 15 females when individual 
subject data were plotted. 
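The fan prediction can be illustrated with a minimal sketch. It assumes the simplest form of a multiplying rule (preference equals attractiveness times probability); the scale values below are hypothetical and are not taken from the study, and the function names are ours.

```python
import math


def multiplying_model(attractiveness, probability):
    """Predicted preference under a simple multiplying rule (an assumption)."""
    return attractiveness * probability


# Hypothetical subjective values, equally spaced for convenience.
attractiveness_values = [10.0, 30.0, 50.0, 70.0, 90.0]
probability_values = [0.0, 0.25, 0.5, 0.75, 1.0]

# One curve per probability level, plotted against attractiveness.
curves = {
    p: [multiplying_model(a, p) for a in attractiveness_values]
    for p in probability_values
}


def slope(curve):
    """Slope between the first and last point of a curve."""
    rise = curve[-1] - curve[0]
    run = attractiveness_values[-1] - attractiveness_values[0]
    return rise / run


def is_linear(curve):
    """True when every segment of an equally spaced curve has the same step."""
    steps = [b - a for a, b in zip(curve, curve[1:])]
    return all(math.isclose(s, steps[0], abs_tol=1e-9) for s in steps)


# Under the multiplying rule each curve is a straight line whose slope
# equals its probability level, so the curves fan out from a common point.
slopes = {p: slope(c) for p, c in curves.items()}
```

Because the slope of each line grows with the probability level, the lines diverge toward the attractive end of the axis, which is the pattern the model predicts for Figure 1.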

Figure 1. Mean preference responses for dates described by a photograph and probability of 
acceptance, Experiment 1. (Each point is averaged over four alternative dates. Photographs are 
spaced along the horizontal axis according to their marginal means; probabilities are listed as 
curve parameters. Filled points are discussed in the text.) 

Statistical analyses. The results for each 
subject were analyzed by separate analyses 
of variance. For all subjects, the probability 
and photograph main effects were both highly 
significant. In addition, all 15 subjects re- 
vealed a significant Photograph × Probabil- 
ity interaction as shown in the second column 
of Table 1. While these results are expected 
from Equation 1, the model makes more specific 
predictions about the form of the interaction. 
In particular, the Photograph × Probability 
interaction should be concentrated in the bi- 
linear trend component. This means that 
after the bilinear term is removed, the re- 
sidual component should be nonsignificant. 
Using the procedure described in Shanteau 
(1977), 11 of the 15 subjects were found 
to have nonsignificant residual components, 
as listed in the third column of Table 1. 
While the remaining 4 subjects showed sig- 
nificant residuals, the Fs are quite small, and 
plots of individual data revealed no sys- 
tematic pattern of deviations from bilinearity. 
Similar analyses were also performed sepa- 
rately for each choice alternative. As listed 
in the last four columns of Table 1, approxi- 
mately one quarter of the residuals were 
found to be significant. As in the overall 
analysis, the significant residual components 
were of small magnitude and appeared to 
have no systematic locus. There was, how- 
ever, a slight trend for more significant re- 
siduals with Alternative 4; there was no 
identifiable reason for this result. In all, the 
results of Experiment 1 provided general 
support for the multiplicative model in Equa- 
tion 1. 


Table 1 
F Ratios for Interaction and Residual Terms Collapsed Over All Alternatives, 
and for Each Alternative Separately 

         Photo × 
         Probability     Overall         Residual for    Residual for    Residual for    Residual for 
Subject  interaction     residual        alternative 1   alternative 2   alternative 3   alternative 4 
No.      (df = 24, 420)  (df = 23, 420)  (df = 23, 105)  (df = 23, 105)  (df = 23, 105)  (df = 23, 105) 
 1        4.79*          1.45            2.29*           1.02             .54            1.27 
 2       95.75*           .81             .89             .96            [illegible]     1.18 
 3       23.84*          1.93            6.12*            .75            1.30             .93 
 4       20.45*           .93            3.09*           2.26*           1.38             .91 
 5       41.74*          1.83             .97             .90            1.27            1.68 
 6       29.09*          2.18*           1.69            1.40             .96            2.43* 
 7        8.28*          2.50*            .90            1.05            1.20            5.69* 
 8       16.45*          1.53            2.03*           1.98*            .62            1.67 
 9       15.58*          1.50            1.28             .93             .59            2.01* 
10       19.10*           .94             .49            2.04*            .64            2.00* 
11       40.25*          1.20            1.44            2.14*           2.01*           2.10* 
12       36.93*          1.51            1.02            1.44             .54            1.05 
13       17.50*          2.00*           2.64*            .66            1.07            1.96* 
14       24.92*          3.34*           3.41*           1.37            2.43*           1.00 
15       29.11*          1.63             .81            1.37            1.70            2.57* 
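The logic of the bilinear test can be sketched as follows. This is a simplified illustration on a table of cell means, assuming the marginal means serve as coefficients for the bilinear contrast; it is not necessarily the exact procedure of Shanteau (1977), it omits replications and F ratios, and the function name and example values are hypothetical.

```python
def bilinear_decomposition(cell_means):
    """Split the interaction sum of squares of a two-way table of cell means
    into a bilinear (linear x linear) part and a residual part.

    Returns (ss_interaction, ss_bilinear, ss_residual), all computed on the
    cell means themselves (no scaling by the number of replications).
    """
    n_rows, n_cols = len(cell_means), len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in cell_means]
    col_means = [
        sum(row[j] for row in cell_means) / n_rows for j in range(n_cols)
    ]

    ss_interaction = 0.0
    cross = 0.0
    norm = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Interaction effect: what is left after both main effects.
            effect = cell_means[i][j] - row_means[i] - col_means[j] + grand
            # Bilinear contrast coefficient built from the marginal means.
            coef = (row_means[i] - grand) * (col_means[j] - grand)
            ss_interaction += effect ** 2
            cross += effect * coef
            norm += coef ** 2

    ss_bilinear = cross ** 2 / norm if norm > 0 else 0.0
    return ss_interaction, ss_bilinear, ss_interaction - ss_bilinear


# Hypothetical table generated by an exact multiplying rule:
# rows are photographs, columns are probability levels.
example = [[a * p for p in (0.1, 0.5, 0.9)] for a in (20.0, 50.0, 80.0)]
ss_interaction, ss_bilinear, ss_residual = bilinear_decomposition(example)
```

For data that follow a multiplying rule exactly, the whole interaction falls in the bilinear component and the residual is zero apart from rounding error; a nonzero residual signals a departure from bilinearity.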


Discussion 


The results from Experiment 1 clearly 
show that females not only make use of 
probability information but also appear to com- 
bine it with attractiveness in a multiplica- 
tive fashion. Thus, a date who is low on 
either factor (i.e., very unattractive or very 
unlikely to accept a date) will be seen as 
undesirable. 

It is commonly believed that there is an 
inverse relation between attractiveness and 
probability (e.g., Walster et al., 1966). This 
means that highly attractive dates should 
have a low probability of acceptance, and 
un[...] preferred. This effect is [...] 
Figure 1. Fi[...] probability (Sure thing), 
[...] (Tom) [...] 

Of course, this prediction depends on the 
presence of a negative relation between at- 
tractiveness and probability. One alternative 
is that subjects may not see any connection 
between attractiveness and probability (or 
may see similar probabilities for different 
levels of attractiveness). This would imply 
that the dating preferences would be an in- 
creasing function of physical attractiveness 
alone, so that the most attractive dates would 
also be the most preferred. One purpose of 
Experiments 2 and 3 was to examine the re- 
lation between probability and attractiveness 
and to determine whether subjects would in 
fact prefer dates of intermediate attractiveness. 

To study the natural relation between 
probability and attractiveness, it is clear 
that probability cannot be artificially manip- 
ulated as in Experiment 1. Indeed, probabil- 
ity information is not explicitly available in 
most dating situations. Therefore, Experi- 
ments 2 and 3 were designed so that subjects 
would have to infer or evaluate probability 
based on other information (i.e., a photo- 
graph). 


Experiments 2 and 3 


In Experiments 2 and 3, subjects made 
preferential choices between pairs of dates 
described by photographs alone. This allowed 
a direct assessment of whether factors other 
than physical attractiveness influence dating 
choice. One additional purpose of Experi- 
ments 2 and 3 was to evaluate the multiply- 
ing rule for attractiveness and probability in 
a setting where probability was not explicitly 
provided. 

Experiment 2 was run as a continuation of 
Experiment 1 using the same subjects. How- 
ever, it is possible that the explicit presenta- 
tion of probabilities in the first experiment 
may have alerted these subjects to the use 
of probability in the second experiment. 
Therefore, Experiment 3 was run on a new 
group of female subjects who had not gone 
through Experiment 1 initially. 


Method 


Subjects. In Experiment 2, the 15 females used 
in Experiment 1 returned for one additional 
session. In Experiment 3, 14 new subjects with 
similar characteristics were run under the same 
conditions. 

Procedure. Subjects were initially presented with 
the seven photographs of males used in Experiment 
1 and were asked to carefully consider the dating 
potential of each of the males. It was then ex- 
plained that several facets of dating desirability 
would be examined in the experiment. 

Subjects began by estimating the probability that 
each of the males would accept a date with her 
after viewing her photograph. In making these 
estimates, subjects were instructed to keep in mind 
that the male would be assured of her willingness 
to accept a date; this was done because several pilot 
subjects indicated that unattractive males might not 
choose them because the males would fear rejection. 

Subjects responded by locating a pointer along a 
40-cm unmarked bar. The left end of the bar was 
defined as “No chance” and the right end was de- 
fined as “Sure thing.” The experimenter read the 
responses from the rear face of the scale using a 
0-100 scale. The stimuli were presented four times 
in all, with a different random order each time. 


Subjects were then asked to rate the physical 
attractiveness of each of the pictured males. The 
subject’s task was to “disregard any other feelings 
you may have about the picture while making this 
rating; tell me only how physically attractive you 
consider this person to be.” This instruction was 
provided because some pilot subjects stated that 
they were taking into account the possibility of re- 
ciprocal interest by the males. The basic experi- 
mental procedure was the same as for the proba- 
bility ratings, except that the ends of the scale were 
defined as “Extremely unattractive” and “Extremely 
attractive.” 


Finally, subjects made preferential choices be- 
tween pairs of dates in which each date was de- 
scribed by just a photograph. As in Experiment 1, 
subjects were aware that they would not actually 
go on a date. However, subjects were instructed 
to assume that they were in an actual dating situa- 
tion and to make choices taking into account all 
factors that might be important in choosing a 
date. Furthermore, it was emphasized that in this 
dating situation, like any other, it would be unclear 
whether their dating choice would reciprocate. 

Twenty-one pairs of dates were formed from 
all possible combinations of the seven photographs. 
For each pair, subjects responded by first giving 
the name of the chosen date. They then indicated 
the degree of their preference by using the same 
response scale as in Experiment 1. The scale was 
defined as before, except that anchor stimuli were 
not used to define the ends of the scale. The 21 
choice pairs were presented four times, with a dif- 
ferent shuffled order each time. Finally, each sub- 
ject was debriefed and interviewed regarding her 
thoughts on dating and her approach to making 
dating choices. 
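The pairing and shuffling described above can be sketched briefly. The seven names below are those mentioned in the text; that they are exactly the seven photographs used is an assumption, as are the function name and the seed, and the left/right position within each pair is not modeled.

```python
import itertools
import random

# Names mentioned in the text (assumed to be the seven photographs).
photographs = ["Tom", "Bob", "John", "Jim", "Bill", "Dick", "Joe"]

# All unordered pairs of seven photographs: 7 * 6 / 2 = 21 choice pairs.
pairs = list(itertools.combinations(photographs, 2))


def make_replications(choice_pairs, n_replications=4, seed=0):
    """Return one independently shuffled copy of the pairs per replication."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    replications = []
    for _ in range(n_replications):
        order = list(choice_pairs)
        rng.shuffle(order)
        replications.append(order)
    return replications


replications = make_replications(pairs)
```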


Results of Experiment 2 


Results for three representative subjects 
are graphically presented in Figure 2. The 
spacing along the horizontal axis for each of 
the three panels reflects the subject’s physi- 
cal attractiveness ratings of the photographs. 
Subject (S)1, for example, considered Dick 
to be the most attractive, Joe to be least at- 
tractive, and Bob to be of roughly intermedi- 
ate attractiveness. The starred points at the 
top of each panel represent the subject’s 
probability estimates (e.g., S1 assigned the 
lowest probability to Jim and the highest to 
Joe). The filled points indicate the subject’s 
mean preference for each date relative to all 
others (e.g., S1 showed the greatest mean 
preference for Tom and the lowest preference 
for Joe). 

Comparison of the three panels reveals 
several types of individual differences. In the 
physical attractiveness ratings, for instance, 
Bob was given the highest rating by S5 and 
S13, whereas S1 rated Bob as only moder- 
ately attractive. There were also considerable 
differences in subjects’ probability ratings, 
with S1 and S5 giving low probability esti- 
mates to highly attractive dates, whereas S13 
rated all dates fairly high on probability. 
Perhaps the most pronounced differences were 
in the preference values; although Dick was 
highly preferred by S13, he was one of the 
least preferred by S1 and S5. Individual dif- 
ferences of comparable magnitude were ob- 
served for the remaining subjects as well. 

In spite of these differences, subjects re- 
vealed some important similarities in the 
shapes of their preference curves. The left 
and center panels of Figure 2 are representa- 
tive of results obtained for 11 of the 15 sub- 
jects; the plots show a peaked preference 
curve with the greatest preference for dates 
of intermediate attractiveness. Further, there 
is a marked similarity between the proba- 
bility and the preference curves. Thus, these 
11 subjects preferred dates of intermediate 
attractiveness, and it appeared that probabil- 
ity was related to these intermediate prefer- 
ences. 

As exemplified by S13, the remaining four 
subjects revealed a rather different preference 
pattern. These subjects showed an increasing 
curve and preferred the most attractive 
dates; moreover, their probability estimates 
were generally flat, with little relation to the 
preference curve. Therefore, these four sub- 
jects appeared to be basing their preferences 
primarily on the attractiveness of the dates. 

Figure 2. Responses to dates described only by photographs for three typical subjects (S), Ex- 
periment 2. (Horizontal spacing reflects physical attractiveness ratings for each date. Stars show 
probability-of-acceptance ratings for each date. Filled points show mean preferences for each date 
compared to all others. Open points give values derived from a multiple regression model.) 

Statistical analyses. In order to quanti- 
tatively describe each individual’s responses, 
separate multiple regression analyses were 
performed. The basic strategy was to derive 
a regression equation to predict each sub- 
ject’s preference values based on (a) her 
probability ratings for each date, (b) her 
physical attractiveness ratings for each date, 
and (c) the cross-product values derived 
by multiplying probability and attractiveness 
ratings together. Thus, the multiple regres- 
sion model was based on 

$$J = \beta_1 (\text{Physical Attractiveness}) + \beta_2 (\text{Probability}) + \beta_3 (\text{Physical Attractiveness} \times \text{Probability}). \tag{2}$$

The subject’s mean preference judgment (J) 
is expressed as a function of the attractive- 
ness and probability ratings; the weighting 
constants (β) represent standardized weights 
derived from a multiple regression analysis. 
The prediction terms that accounted for a 
significant proportion of variance in the pref- 
erences of each subject are listed in the sec- 
ond column of Table 2.¹ The terms are listed 
in their order of variance accounted for. 
The subjects were placed into two groups 
depending on whether they preferred dates of 
intermediate attractiveness (peaked curve) 
or dates of greatest attractiveness (increas- 
ing curve). 
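A regression of the form in Equation 2 can be sketched as follows. The ratings are hypothetical, and several choices are assumptions rather than the authors' procedure: all three predictors and the preferences are standardized, the cross-product is formed from the raw ratings before standardizing, and the weights come from solving the normal equations directly.

```python
import math


def standardize(values):
    """Convert values to z scores (population standard deviation)."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def standardized_weights(attractiveness, probability, preference):
    """Return [b1, b2, b3] for PA, Prob., and PA x Prob. as in Equation 2."""
    predictors = [
        standardize(attractiveness),
        standardize(probability),
        standardize([a * p for a, p in zip(attractiveness, probability)]),
    ]
    outcome = standardize(preference)
    gram = [
        [sum(x * y for x, y in zip(u, v)) for v in predictors]
        for u in predictors
    ]
    moments = [sum(x * y for x, y in zip(u, outcome)) for u in predictors]
    return solve(gram, moments)


# Hypothetical ratings for seven dates on 0-100 scales, with probability
# falling as attractiveness rises.
attractiveness = [10, 25, 40, 55, 70, 85, 95]
probability = [95, 90, 80, 60, 40, 20, 10]
# Preferences generated by an exact multiplying rule, giving a peaked curve.
preference = [a * p / 100 for a, p in zip(attractiveness, probability)]

weights = standardized_weights(attractiveness, probability, preference)
```

With preferences generated by a pure multiplying rule, the weight on the cross-product term is 1 and the other two weights are 0 (up to rounding), which corresponds to a subject for whom only the (PA × Prob.) term is needed.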

The physical attractiveness 
term accounted for the most variance for 
three of the four subjects who chose the most 
attractive dates. Of more interest, the cross 
product was dominant for 9 of the 11 sub- 
jects who preferred dates of intermediate 
attractiveness; for these subjects, probability 
did appear to influence dating preferences 
in the form of a multiplicative combination 
of attractiveness and probability. 


¹ When predictor terms are intercorrelated, as is the 
case here, the squared semipartial correlation 
is the best indicant of a term’s significance ([...]- 
ington, 1969). The squared semipartial correlation 
measures the contribution of a predictor after 
the variance accounted for by all preceding terms 
has been removed. Thus, when predictors are inter- 
correlated, the semipartial correlation depends heav- 
ily on the order in which terms are entered into 
the regression analysis. To alleviate this problem, 
the order should be determined on the basis of 
theory or some other predetermined criterion (Ker- 
linger & Pedhazur, 1973). In this experiment, the 
shapes of the preference curves were predicted by 
certain combination rules (e.g., a multiplying rule 
for subjects with a peaked preference curve); these 
were therefore used to determine the order in which 
terms were entered into the regression analysis. 
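The order dependence of the squared semipartial correlation can be illustrated for the two-predictor case. The sketch uses the standard formula for R² from pairwise correlations; the data and variable names are hypothetical and do not come from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def squared_semipartials(first, second, outcome):
    """Variance credited to each predictor when `first` is entered first.

    Returns (share of `first`, share of `second` after `first`).
    """
    r_y1 = pearson(first, outcome)
    r_y2 = pearson(second, outcome)
    r_12 = pearson(first, second)
    # R squared for two predictors, from the pairwise correlations.
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (
        1 - r_12 ** 2
    )
    return r_y1 ** 2, r_squared - r_y1 ** 2


# Hypothetical, intercorrelated predictors and an outcome.
predictor_a = [1, 2, 3, 4, 5, 6, 7]
predictor_b = [2, 1, 4, 3, 6, 5, 7]
outcome = [1, 3, 2, 5, 4, 6, 7]

# Enter predictor_a first, then predictor_b.
a_first, b_second = squared_semipartials(predictor_a, predictor_b, outcome)
# Reverse the order of entry.
b_first, a_second = squared_semipartials(predictor_b, predictor_a, outcome)
```

Both orders of entry account for the same total variance, but the share credited to each predictor changes with the order, which is why the order must be fixed in advance on theoretical grounds.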

Of additional interest, several of the sub- 
jects who chose dates of intermediate at- 
tractiveness also revealed a significant attrac- 
tiveness term (e.g., S5). These subjects gen- 
erally had preference curves in which the 
greatest preference was for a date of inter- 
mediate physical attractiveness, but there 
was also a moderately high preference for 
the most attractive date (as seen for S5 in 
Figure 2). This may represent a compromise 
strategy between a simple peaked strategy 
(e.g., S1) and an increasing preference strat- 
egy (e.g., S13). 


To determine whether each subject’s regres- 
sion equation could adequately describe her 
preference judgments, the discrepancies be- 
tween actual and derived preferences were 
examined. As illustrated by the open circles 
in Figure 2, the discrepancies were visually 
quite small. This result suggests that the 
predictor terms listed in Table 2 for each 
subject provided a reasonable account of the 
preferences. 

At a statistical level, however, analyses of 
the discrepancies revealed small but consist- 
ent significant differences. When tested 
against within-cell error, significant discrep- 
ancies were observed for 10 of the 15 sub- 
jects, including S5 and S13 in Figure 2. Ex- 
amination of these discrepancies generally 
revealed that they were quite small. There- 
fore, it appears that the regression equations 
provide at least a reasonable first approxima- 
tion to the preferences. 


Table 2 
Significant Predictor Terms and Number of Correctly Predicted Choices From 
Regression Equations and Attractiveness Ratings for Subjects Showing 
Peaked and Increasing Preference Curves 

                                          Number of          Number of 
                                          correctly pre-     correctly pre- 
                                          dicted choices     dicted choices 
                                          from regres-       from attrac- 
Subject No.  Significant predictor terms  sion equation      tiveness ratings 

Peaked curve 
 1           (PA × Prob.), PA             [illegible]        [illegible] 
 2           PA, (PA × Prob.), Prob.      [illegible]        [illegible] 
 3           (PA × Prob.)                 [illegible]        [illegible] 
 5           (PA × Prob.), PA             [illegible]        [illegible] 
 6           Prob.                        [illegible]        16 
 7           (PA × Prob.), Prob.          [illegible]        [illegible] 
 8           (PA × Prob.), PA             [illegible]        21 
 9           (PA × Prob.), PA             [illegible]        [illegible] 
10           (PA × Prob.)                 [illegible]        16 
12           (PA × Prob.), PA             [illegible]        [illegible] 
14           (PA × Prob.), PA             [illegible]        [illegible] 

Increasing curve 
 4           [illegible]                  [illegible]        [illegible] 
11           Prob., (PA × Prob.)          [illegible]        21 
13           PA                           [illegible]        21 
15           PA, (PA × Prob.)             [illegible]        [illegible] 

Note. PA = physical attractiveness; Prob. = probability. 


To further examine the usefulness of the 
derived regression equations, three post hoc 
analyses were conducted. The first was based 
on the ability of each subject’s equation to 
account for her dating choices. The number 
of choices that could be correctly predicted is 
presented in the third column of Table 2. 
Out of a total of 21 choices, an average of 
19.0 choices were accounted for by the re- 
gression equations (which are based on at- 
tractiveness and probability ratings). This 
ranged from a low of 15 to a high of all 21 
correct. More detailed inspection revealed 
few errors of any magnitude, and most errors 
occurred when there was little or no dif- 
ference in the preference ratings. Thus, the 
choices of most subjects could be accounted 
for quite well by the predictions obtained 
from the regression equations. 

The second post hoc analysis involved a 
comparison between choice predictions based 
on the regression equations and choice predic- 
tions based on physical attractiveness values 
alone. This analysis bears directly on the 
issue of whether probability is an important 
component in subjects’ preferences. The num- 
ber of choices that could be correctly pre- 
dicted based on attractiveness ratings is 
shown in the right column of Table 2. Out of 
a total of 21 choices, attractiveness alone was 
able to account for an average of 17.0 correct 
choices, with a low of 11 and a high of 21 
correct. Many of the errors, however, were 
quite large. As expected, attractiveness alone 
was able to account for the choices of the 
four subjects whose preference curves in- 
creased with attractiveness (e.g., S13). In 
comparison, attractiveness alone was inferior 
to the regression equation for most of the 11 
subjects who preferred dates of intermediate 
attractiveness (e.g., S1 and S5). 
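The two choice-prediction counts can be sketched for a single subject. All values are hypothetical and describe an imagined subject with a peaked preference curve; treating ties as incorrect predictions is our assumption, and the names are those mentioned in the text.

```python
import itertools


def count_correct(scores, choices):
    """Count pairs in which the higher-scoring date was the one chosen.

    Ties in the scores are counted as incorrect (an assumption).
    """
    correct = 0
    for (left, right), chosen in choices.items():
        if scores[left] == scores[right]:
            continue
        predicted = left if scores[left] > scores[right] else right
        if predicted == chosen:
            correct += 1
    return correct


# Hypothetical values for one subject.
attractiveness_rating = {
    "Dick": 90, "Tom": 75, "Bob": 60, "John": 50,
    "Jim": 40, "Bill": 25, "Joe": 10,
}
regression_prediction = {
    "Dick": 30, "Tom": 60, "Bob": 70, "John": 55,
    "Jim": 45, "Bill": 20, "Joe": 5,
}
observed_preference = {
    "Dick": 35, "Tom": 65, "Bob": 62, "John": 58,
    "Jim": 40, "Bill": 22, "Joe": 8,
}

# The observed choice in each of the 21 pairs is the date with the
# higher observed preference.
observed_choices = {
    (a, b): a if observed_preference[a] > observed_preference[b] else b
    for a, b in itertools.combinations(observed_preference, 2)
}

regression_correct = count_correct(regression_prediction, observed_choices)
attractiveness_correct = count_correct(attractiveness_rating, observed_choices)
```

In this illustration the regression-based scores miss only the Tom/Bob pair (20 of 21 correct), whereas attractiveness alone misses every pair in which the most attractive date was passed over for a date of intermediate attractiveness (17 of 21 correct).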

The final post hoc analysis was designed to 
examine the credibility of the direct ratings 
of attractiveness. The regression model in 
Equation 2 depends crucially on these rat- 
ings, and any bias or distortion in the ratings 
can greatly influence the form of the regres- 
sion equation (Schmidt, 1973). However, 
substitution of attractiveness values derived 
for each subject in Experiment 1 into Equa- 
tion 2 produced no substantial differences in 
the resulting regression equations. Thus, the 
present direct rating procedure apparently 
produced reasonable estimates of individual 
subjective values.² 


Results of Experiment 3 


The results obtained from the new set of 
subjects substantially replicated the findings 
of Experiment 2. For instance, analysis of the 
preference curves revealed that 9 of the 14 
subjects preferred dates of intermediate at- 
tractiveness, whereas five subjects preferred 
the most attractive dates. Comparison of 
actual preferences with the preferences de- 
rived from the regression equation produced 
only small visual differences; however, the 
discrepancies were significant for 11 of the 14 
subjects. Finally, 18.1 of the 21 choices could 
be correctly predicted from the regression 
equations. In all, the results from Experiment 
3 were quite similar to those of Experiment 2. 
This suggests that the results are not de- 
pendent on the subjects having prior experi- 
ence in Experiment 1. 


General Discussion 


There were two questions addressed in this 
research: (a) Is probability of acceptance 
important in choosing a date? (b) How is 
probability combined with physical attrac- 
tiveness in making a choice? The discussion 
will consider the implications of the present 
findings for these two questions. 


Is Probability Important? 


The answer from all three of the 
experiments is clear and unequivocal: Yes, 
probability is important in choosing a date. 


² This analysis also bears on an important issue 
involving scale constants. The addition of constants 
to any of the terms in Equation 2 can potentially 
change the form of the derived regression equation. 
Moreover, since only the direct ratings of probabil- 
ity can be assumed to be on a ratio scale (because 
probability has a meaningful zero value), the in- 
terval scale of attractiveness values may be par- 
ticularly suspect. The fit of the multiplying model 
in Experiment 1, however, provides attractiveness 
estimates on a ratio scale (Shanteau, 1978). Since 
the substitution of these values into the regression 
analysis did not change the basic results, the derived 
regression equations are apparently not subject to 
problems of scale constants. 


PROBABILITY OF ACCEPTANCE IN DATING CHOICE 


Probability was used by subjects when it was 
explicitly provided in Experiment 1 and when 
it was left for subjects to infer in Experi- 
ments 2 and 3. There is little doubt, there- 
fore, that the females in this research used 
probability of acceptance in making their 
dating choices. 

The present findings on the importance of 
probability contrast with previous research 
that reports little or no effect of probability 
(e.g., Berscheid et al., 1971; Huston, 1973). 
There are, however, several design and pro- 
cedure differences that may account for the 
discrepancy in results. 

To begin with, the present research was 
designed to isolate the effects of probability 
without concern for the subject’s own level 
of physical attractiveness. In contrast, prior 
studies have generally studied probability by 
comparing the dating choices of attractive and 
unattractive subjects. However, if there is a 
small or a diffuse relation between subjects’ 
attractiveness and their use of probability, 
then this research approach may not be sensi- 
tive to the effects of probability. 

An attempt was made in the present study 
to determine whether there was any connec- 
tion between a subject’s attractiveness and 
her use of probability.³ However, no evidence 
was found of any relationship between a sub- 
ject’s appearance and her probability in- 
ferences. A further argument against compar- 
ing groups comes from the widespread indi- 
vidual differences in Experiments 2 and 3. 
These differences suggest that subjects’ per- 
ceptions of probability and attractiveness may 
vary quite widely. Therefore, analyses based 
on comparing group differences between at- 
tractive and unattractive subjects may well 
be incapable of detecting the presence of 
probability effects. 

Another difference from past research was 
that the present subjects did not actually ex- 
pect to go out with their dating choice. Pre- 
vious studies, in contrast, have generally at- 
tempted to simulate actual dating situations 
in various ways (Huston & Levinger, 1978). 
It may be, then, that the present subjects 
were isolated from “real world” demands, and 
they might have used probability in the lab- 
oratory when in fact they would not normally 
use such information. 

Several lines of evidence, however, suggest 
that the subjects in this research did approach 
the present tasks in a manner analogous to 
choosing real dates: (a) Subjects’ verbal re- 
ports indicated that they were thinking about 
the present tasks in much the same way as 
they would think about real dates; (b) many 
subjects became quite personally involved 
in the research, even to the point of wanting 
to know how they could find out more about 
the pictured males; and (c) the widespread 
individual differences between subjects sug- 
gest that each subject was carefully evaluat- 
ing the dates by her own standards and not 
simply following some stereotypic set of peer 
standards. Therefore, it would appear that 
probability was an important factor for the 
present subjects in their general approach to 
dating. 


How Is Probability Used? 


The answer is that, at least to a first ap- 
proximation, probability is combined with 
physical attractiveness by multiplying. The 
visual evidence in support of multiplying, 
especially in Figure 1, seems quite compelling. 
However, the presence of small but consistent 
statistical discrepancies in Experiments 2 and 
3 means that the model may not be entirely 
adequate. 

³ If the subject’s own physical attractiveness is 
related to her expectancy of acceptance, then the 
subject’s attractiveness should be related to the 
dating strategy employed. For example, subjects who 
always chose the most attractive dates might be 
expected to be more attractive than subjects who 
chose dates of intermediate physical attractiveness. 
Contrary to this prediction, no such relation was 
observed. Ratings of the subjects’ photographs by 
60 male undergraduates revealed little or no rela- 
tionship between the physical attractiveness of a 
subject and the dating strategy she employed. For 
example, S1 and S13 in Figure 2 were nearly equal 
in rated attractiveness, but employed very different 
dating strategies. It appears, therefore, that the 
subject’s attractiveness does not determine her ex- 
pectancy for acceptance. Of course, the possibility 
remains that self-perceived attractiveness (as op- 
posed to outward attractiveness) may be related to 
probability. 


One possible explanation for these discrep- 
ancies from multiplying is that subjects may 
use other information besides probability and 
attractiveness in making their choices. One 
conjecture, by Shanteau and Nagy (1976), is 
that females may also rely on their inferred 
compatibility with prospective dates. Evi- 
dence for this possibility came from com- 
ments by the subjects to the effect that “he 
is (or is not) my type,” and so “we would 
(or would not) get along together.” There- 
fore, the present discrepancies from multiply- 
ing may indicate that more than attractive- 
ness and probability information was being 
combined when subjects were making choices. 

The use of factors other than probability 
and attractiveness might explain why the 
multiplying model fit well in Experiment 1 
but less well in Experiments 2 and 3. In the 
first experiment, probability and attractive- 
ness were explicitly manipulated, and the data 
supported multiplying; there was little op- 
portunity in this experiment for other varia- 
bles to have any effect. In Experiments 2 and 
3, however, only physical attractiveness was 
manipulated, and subjects were encouraged 
to use any other information they might ac- 
tually rely on in choosing dates. Thus, there 
was an opportunity for compatibility (or 
other factors) to influence the choices; this 
would then decrease the fit of the multiplying 
model, which is based only on attractiveness 
and probability. 

Nevertheless, the model still did impres- 
sively well in accounting for subjects’ dating 
choices in Experiments 2 and 3. The derived 
model accounted for over 90% of the sub- 
jects’ choices; in comparison, a model based 
on physical attractiveness alone was able to 
account for only 75% of the choices for sub- 
jects who preferred dates of intermediate 
physical attractiveness. Thus, combining 
probability and attractiveness information 
does an appreciably better job than attrac- 
tiveness alone. Including information on com- 
patibility or other factors may, of course, 
increase the percentage even more. 
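The reported mean numbers of correctly predicted choices can be converted to percentages with simple arithmetic. The means are those reported in the Results sections; the 75% figure quoted above applies only to the subjects with peaked curves and cannot be recomputed from the overall means, so it is not included. The labels are ours.

```python
TOTAL_CHOICES = 21  # choice pairs per subject

# Mean numbers of correctly predicted choices reported in the text.
mean_correct = {
    "Experiment 2, regression equations": 19.0,
    "Experiment 2, attractiveness alone": 17.0,
    "Experiment 3, regression equations": 18.1,
}

# Convert each mean to a percentage of the 21 choices.
percent_correct = {
    label: round(100 * n / TOTAL_CHOICES, 1)
    for label, n in mean_correct.items()
}
```

On these figures, 19.0 of 21 corresponds to about 90.5% and 18.1 of 21 to about 86.2%, while attractiveness alone (17.0 of 21, all Experiment 2 subjects) corresponds to about 81.0%.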

Finally, it is appropriate to make some 
comment on the general research strategy 
used in this study. The progression from con- 
trolled to more naturalistic research in a 
single study has been praised by some (Byrne 
et al., 1970) and criticized by others (Eb- 
besen & Konečni, 1975). In the present re- 
search, this strategy proved quite useful in, 
among other things, evaluating the strengths 
and weaknesses of the multiplying rule. The 
results from the three experiments suggest 
that probability and attractiveness probably 
do combine by multiplying, but in more 
naturalistic settings other factors may enter 
in as well. 

The progression of studies was also useful 
in empirically deriving the prediction from 
Experiment 1 that subjects might prefer dates 
of intermediate attractiveness. This predic- 
tion was borne out in Experiments 2 and 3 
for roughly two thirds of the subjects. Even 
the deviations from this pattern were in- 
teresting, since they indicate that a sub- 
stantial minority of females may rely pri- 
marily on physical attractiveness in choosing 
dates. 

In all, the present research illustrates how 
a combination of controlled and more natu- 
ralistic research may be more beneficial than 
either approach taken alone. Together with 
other studies using a similar strategy (e.g., 
Phelps & Shanteau, 1978), this study sug- 
gests that combining controlled and natural- 
istic research may prove quite useful in fu- 
ture research involving complex human judg- 
ment. 


“Responsiveness” is defined in terms of two sequential response contingencies: 
(a) the probability with which each person in an interaction responds to the 
communicative behaviors of the other, and (b) the proportion of responses that 
are related in content to the preceding behaviors of the other. Two experiments 
examined the effects of responsiveness in a verbal exchange on attraction. Under 
the guise of a study of the “acquaintanceship process,” 176 male and female 
subjects exchanged information about themselves with another subject (actually 
a same-sex confederate) by taking turns choosing and answering one of either 
two or three questions about themselves on each trial. For Experiment 1, sub- 
jects were required to answer on all trials, whereas the probability and fre- 
quency with which the confederate responded to the subject were orthogonally 
manipulated. For Experiment 2, the proportion of content-related responses was 
varied. The confederate answered the same question as the subject on either 
80% or 20% of the trials. Both the probability of response and the proportion 
of content-related responses were positively related to (a) attraction to the 
confederate, (b) subjects’ perceptions of the confederate’s attraction to them- 
selves, and (c) the degree to which subjects felt that they and the confederate 
had become acquainted with one another. 


Social interaction has been described as 
“the very stuff of human life” (Goldschmidt, 
1972, p. 59). Certainly, at the least, it may 
be described as the “very stuff” of human 
relationships. Relationships are formed and 
maintained through a series of interactions 
and are often formed simply for the sake of 
interaction. 

Despite the importance of interaction, by 
far the majority of empirical research on love, 
attraction, and friendship has focused on the 
effects of static characteristics of others (e.g., 
similarity, physical appearance, personality or 
behavior descriptions, etc.). Although such 
characteristics are undeniably important, ade- 
quate analysis of the determinants of attrac- 
tion must also relate the process of interac- 
tion to attraction. The present article is con- 
cerned with one dimension of interaction— 
responsiveness—which, it is argued, functions 
as a major determinant of the quality of in- 
teraction and, hence, attraction. 

Because the present concept of responsive- 
ness is new to social psychology, before de- 
scribing two empirical studies relating re- 
sponsiveness to attraction we will (a) define 
responsiveness and distinguish it from other 
related concepts, (b) describe the functions 
of responsiveness and their implications for 
attraction, and (c) summarize relevant lit- 
erature. 


The Concept of Responsiveness 


Responsiveness is defined in terms of two 
sequential contingencies: (a) the probability 
with which each participant responds ( 


RESPONSIVENESS AND INTERPERSONAL ATTRACTION 


verbally or nonverbally) to the communica- 
tive behaviors of the other and (b) the pro- 
portion of responses that are related in con- 
tent to the preceding communicative behavior 
of the other. It should be noted that the first 
contingency (probability of response) is ap- 
plicable to both related and unrelated re- 
sponses. It is assumed that an unrelated re- 
sponse is better than no response. With re- 
gard to the second contingency (relatedness of 
response content), three conceptual distinc- 
tions should be emphasized. 
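
The two contingencies can be made concrete with a small computational sketch. The `Turn` record and its field names are hypothetical, introduced only for illustration; they are not a coding scheme taken from the studies. Relatedness is treated as a judgment already made by the perceiver, and the second contingency is computed over responses only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Turn:
    """One communicative act and the partner's reaction to it (hypothetical record)."""
    responded: bool                  # did the partner respond at all?
    related: Optional[bool] = None   # if so, was the response perceived as related in content?


def responsiveness(turns: List[Turn]) -> Tuple[float, float]:
    """Return (probability of response, proportion of responses related in content)."""
    if not turns:
        raise ValueError("at least one turn is required")
    responses = [t for t in turns if t.responded]
    p_response = len(responses) / len(turns)
    # The second contingency is defined over responses, not over all turns.
    n_related = sum(1 for t in responses if t.related)
    p_related = n_related / len(responses) if responses else 0.0
    return p_response, p_related


# Hypothetical record: 6 acts, 4 answered, 3 of those answers related in content.
example = [
    Turn(True, True), Turn(True, True), Turn(False),
    Turn(True, False), Turn(True, True), Turn(False),
]
p_response, p_related = responsiveness(example)
print(p_response, p_related)
```

Keeping the two values separate mirrors the point made above: an unrelated response still counts toward the first contingency.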

Perception of relatedness versus logic or 
intent of the speaker. We would like to 
stress that what is important for attraction 
is the perception that the other’s responses 
are related to one’s own communications. Re- 
sponsiveness is “in the ear of the beholder,” 
so to speak. If a response is perceived to be 
unrelated, it will be functionally equivalent to 
any other unrelated response, regardless of 
whether it was logically related, or intended 
by the speaker to address the preceding com- 
munication. 

Content versus causality. The relationship 
between the content of successive responses 
should not be confused with the nature of the 
causal relationship between them. Jones and 
Gerard (1967), for example, classify interac- 
tions into four categories that vary in the 
extent to which each participant’s behaviors 
are determined by the preceding behavior of 
the other versus by his or her own “plan” for 
the interaction. Likewise, Mehrabian’s (1969, 
1970, 1971) definition of responsiveness fo- 
cuses on the extent to which one’s behaviors 
are influenced by those of the other. For 
Mehrabian, responsiveness is indexed by the 
amount of change in one’s behaviors (e.g., 
facial and vocal expressions, rate and volume 
of speech, etc.) resulting from another’s 
presence or behavior. 

In contrast, the present concept of related- 
ness of response content is concerned only 
with the degree to which the response is per- 
ceived to address itself to the content of the 
preceding message. Responses that are deter- 
mined or influenced by the other’s preceding 
behavior may nevertheless be unrelated in 
content—as in the case of unrelated associa- 
tions that may be triggered by the other’s be- 
havior. These responses will, however, be more 
frequently characterized by related content 
than those not influenced by the preceding 
behavior. On the other hand, responses that 
are not causally related may be related in 
content. In this case, the frequency with which 
related responses will occur will depend upon 
the frequency with which responses that are 
related in content arise independently. 

Relatedness versus valence. A related re- 
sponse need not be positive. It may be 
friendly, hostile, agreeing, disagreeing, defi- 
ant, cooperative, and so on. A punch in the 
nose and a pat on the back may be equally 
related to a preceding behavior, although 
different in valence. We are concerned here 
only with the degree to which each behavior 
addresses the content of the preceding be- 
havior, not with the way in which it does so. 
This is not to say that valence has no effect 
on attraction, but rather that valence and 
relatedness are independent in their effects. 


The Functions of Responsiveness— 
Why is it Important? 


When two persons interact, each expects 
the other to be responsive—to respond to his 
or her communicative behaviors, and to re- 
spond with content that addresses itself to 
the intended message. As Ruesch and Bate- 
son (1951) have pointed out, each behavior 
in an interaction has a certain “demand” 
aspect, requiring a response of a certain kind. 
Certainly, in order to fulfill the demand, the 
response must address the content of the pre- 
ceding behavior. We expect a question to be 
followed by an answer, or an expression of 
disgust with the weather by a comment on 
the same. Changes of subject are, of course, 
acceptable, but only after at least a brief 
related response. Such mutual responsiveness 
between the participants will serve at least 
four functions for the interaction, each with 
potential implications for attraction. 

Maintenance of interaction. Perhaps most 
fundamental of the functions of responsive- 
ness is its role in the maintenance of interac- 
tion. Although interactions may persist in the 
face of occasional failure to respond (as in- 
deed occurs in virtually all interactions), 
repeated failure to respond will typically 
result in cessation of interaction. Likewise, 
irrelevant responses do not facilitate interac- 
tion and, in fact, are frequently used to 
either terminate an interaction or to change 
the subject (Watzlawick, Beavin, & Jackson, 
1967). Attraction has been shown to be a 
positive function of the opportunity for inter- 
action (e.g., Ebbesen, Kjos, & Konečni, 1976; 
Festinger, Schachter, & Back, 1950) and of 
the frequency of interaction (Brockner & 
Swap, 1976; Byrne, 1961; Kipness, 1957; 
Saegert, Swap, & Zajonc, 1973). Through its 
influence on the maintenance of interaction, 
responsiveness should likewise facilitate at- 
traction. 

Predictability, control, and stress. To the 
extent that another is responsive, his or her 
behavior will be in some degree predictable. 
Although the exact content of a response is 
typically not predictable, one may at least 
expect (a) that the other will respond and 
(b) that the response will address the content 
of the preceding message. If the other is un- 
responsive, even this degree of predictability 
is lost. 

Similarly, the degree to which the other is 
responsive will affect the magnitude of control 
one may exert over the course of the inter- 
action. Clearly, one cannot often exert com- 
plete control. Neither the exact content of the 
other’s responses nor the specific nature of 
the behaviors the other will initiate will typi- 
cally be controllable. However, responsiveness 
from the other will allow some degree of 
control, in that one can (a) cause the other 
to respond and (b) determine the general 
topic area (albeit not the specific content) of 
the response. 

Because of the constant uncertainty asso- 
ciated with unpredictability and the frustra- 
tion associated with lack of control, inter- 
action with an unresponsive other will be 
stressful. Arnold, Veitch, and Arkkelin (Note 
1), for example, have recently demonstrated 
that social situations that are perceived as 
most uncomfortable are those characterized 
by both high arousal and lack of control. 
Since others who are associated with stress 
should be less attractive than those who are 
not, again, it follows that responsiveness will 
influence attraction. 

Facilitation of interaction goals. To the 
extent that responsiveness facilitates control 
over the course of the interaction, it should 
also facilitate interaction goals. Most behav- 
iors or communicative acts in interaction have 
a purpose. It may be as simple as the main- 
tenance of interaction or as complex as an 
attempt to manipulate or persuade. What- 
ever the goal of the act, response from the 
other is necessary, but not sufficient, for its 
fulfillment. Although response may or may 
not facilitate the initiator’s purpose, failure to 
respond will almost certainly frustrate it. 
Similarly, related responses will be more likely 
to facilitate interaction goals than will irrele- 
vant responses. Since others who are associ- 
ated with reward (facilitation of interaction 
goals) should be more attractive than those 
associated with punishment (frustration), re- 
sponsiveness would again be expected to facili- 
tate attraction. 

Communication of interpersonal affect. 
The degree of responsiveness between partici- 
pants in an interaction reflects the relation- 
ship between them—their interest in one an- 
other, as well as in their respective communi- 
cations. When interacting with another who 
frequently either fails to respond or responds 
irrelevantly, one is tempted to conclude that 
the other is interested neither in oneself nor 
in what one has to say. Certainly, one is left 
with the feeling that no real “interaction” has 
taken place, and that no “relationship” has 
been developed. Responsive interaction, on 
the other hand, will facilitate the perception 
of a “relationship” between the participants, 
as well as the perception of mutual interest 
and attraction. 

As Heider (1958) has argued, the percep- 
tion of a “unit” relationship should produce 
a corresponding sentiment relationship (at- 
traction). Similarly, attraction to another is 
facilitated by the perception that he or she is 
attracted to oneself (e.g., Backman & Secord, 
1959; Harvey, Kelley, & Shapiro, 1957; 
Jones, Gergen, & Davis, 1962). Responsive- 
ness, to the extent that it strengthens the 
perception of a bond between the partici- 
pants and of mutual interest and attraction, 
should likewise facilitate attraction. 


Empirical Evidence 


Although research on attraction has not 
dealt explicitly with responsiveness, evidence 
of its importance has appeared in several lines 
of research. Responsiveness is defined in terms 
of two response contingencies—probability of 
response and relatedness of response content. 
The existing literature relevant to the two 
response contingencies is summarized below. 

Probability of response. Response to an- 
other’s behavior may be as simple as an ori- 
enting response or other indication of atten- 
tion or as complex as an elaborate compli- 
ment. The power of attention as a reinforcer 
has been well documented by the countless 
reports of the use of “differential attention” 
as a technique for behavior modification (see 
Hersen & Barlow, 1976, pp. 339-352). Verbal 
and nonverbal responses indicating attention 
to another have also been shown to facilitate 
attraction. Research on eye contact (cf. Ells- 
worth & Ludwig, 1972), for example, has 
demonstrated that under appropriate circum- 
stances, visual attention to another will fa- 
cilitate attraction. Likewise, Rosenfeld’s 
(1966) work on affiliation has documented the 
importance of verbal indicators of attention 
or acknowledgment. Rosenfeld recorded 
various verbal and nonverbal behaviors of 
subjects instructed to either seek or avoid 
approval from a naive interlocutor (inter- 
locutress). Approval seekers used more “rec- 
ognitions” (e.g., brief utterances such as 
“Mmhmm,” “Hmmm,” “Yeah,” “No kid- 
ding?” “Really?”, etc.) than approval avoid- 
ers; and furthermore, “recognition” scores 
were positively correlated with approval ac- 
tually received from subjects’ naive partners. 
Finally, research on physical pleasuring 
(Davis & Brock, in press; Davis & Martin, 
1978; Davis, Rainey, & Brock, 1976) has 
shown that attraction to a recipient of physi- 
cal pleasure is facilitated by responsiveness. 
Recipients who responded to receiving plea- 
sure (by expressing enjoyment) were liked 
better than those who remained silent. 

The importance of response has been 
further documented by studies of mother- 
infant interaction. Maternal responsiveness 
has been shown to facilitate both attachment 
to the mother and the development of behav- 
ioral competence for the infant (see Apple- 
ton, Clifton, & Goldberg, 1975, and Martin, 
1975, for reviews of this literature). Konner 
(1975) has also stressed the importance of 
responsiveness, pointing out that children 
often prefer to interact with older children or 
adults, rather than peers, because the former 
are more “contingently responsive”—presum- 
ably with respect to both probability and 
relatedness of response. 

Finally, in a series of studies of responsive- 
ness and social attraction between rodents, 
Latané and his colleagues have demonstrated 
that while static qualities of the animals (e.g., 
color, texture, smell, etc.) do not affect at- 
traction, attraction is strongly influenced by 
factors affecting the capacity for movement 
and response (see reviews by Latané & Hoth- 
ersall, 1972; Werner & Latané, 1974). Rats 
are relatively unattracted, for example, to 
caged, stuffed, or anesthetized rats, whose 
capacity for response is restricted, and to 
unresponsive objects (e.g., tennis balls, warm 
water bottles, Plexiglas tubes, etc.), but are 
quite attracted to a responsive human hand. 
Additionally, attraction is facilitated when 
the capacity for response is enhanced by in- 
jection of adrenaline or caffeine, and it is 
inhibited when rats are made sluggish by in- 
jection of chlorpromazine or alcohol. 

While the results of the research described 
above are consistent with the hypothesis that 
attraction will vary positively with the prob- 
ability of response, the various manipula- 
tions (e.g., anesthetizing, stuffing, eye con- 
tact) and measures (e.g., recognitions) have 
involved initiative as well as responsive be- 
haviors, and have not varied the probability 
of response independently of response valence, 
and in relation to specified units of perceiver 
behavior. 

Relatedness of response content. Although 
the concept of relatedness of response content 
has not been prominent in social psychologi- 
cal theories of love and attraction, it has 
played an important role in other areas of 
research. 

The problem of unresponsive interaction, 
under the label “disqualification,” has been 
prominent in discussions of mental illness and 
family and marital dysfunction. “Disqualifi- 
cations” are responses that “invalidate” the 
messages that evoke them (Danziger, 1976). 
Such invalidation may be achieved by chang- 
ing the subject, failure to respond, ignoring 
the message, misunderstanding, interruption, 
tangentialization, literal interpretation of 
metaphors and vice versa, and so forth—es- 
sentially, by irrelevant or unrelated responses. 
The concept of disqualification has been cen- 
tral to the double-bind theory of schizo- 
phrenia (e.g., Bateson, Jackson, Haley, & 
Weakland, 1956; Sluzki, Beavin, Tarnopol- 
sky, & Vernon, 1967), and interactions 
marked by frequent disqualifications have 
been indicted as contributors to inadequate 
development of the self-concept (e.g., Laing, 
1961) and have been shown to characterize 
dysfunctional marriages and families (e.g., 
Watzlawick et al., 1967; Williams, 1969; 
Follingstad & Haynes, Note 2). 

Deriving from the client-centered tradition, 
responsiveness has played an important role 
in counseling and psychotherapeutic theory 
and technique—under the label “empathy.” 
“Empathic understanding,” for example, as 
defined by Carkhuff and Berenson (1967), is 
defined in terms of the extent to which the 
content of the therapist’s responses reflects 
understanding of, and addresses itself to, the 
true meaning (both verbal and emotional) of 
the client’s communications. The ability of 
the therapist to facilitate the client’s prob- 
lem solving and growth, they argue, is di- 
rectly related to the depth of the counselor’s 
empathic understanding. 

Finally, in contrast to the intimacy of 
therapist-client or family and marital rela- 
tionships, Petty and Brock (1976) have 
studied the effects of relevant versus irrele- 
vant responses from a public speaker to a 
heckler. The degree to which the speaker’s 
responses were related in content to the heck- 
ler’s comments significantly affected the 
audience’s agreement with, and reactions to, 
the speaker. Relevant responses produced 
significantly greater agreement with the mes- 
sage than irrelevant responses, and the re- 
sponsive speaker was perceived as more com- 
petent than the unresponsive one. 

Again, although the literature summarized 
above is suggestive of the relationship be- 
tween responsiveness and attraction, direct 
empirical evidence is lacking. The present two 
experiments were designed to test the hy- 
pothesis that attraction will vary positively 
with responsiveness. Experiment 1 dealt with 
the first component (probability of response) 
and Experiment 2 with the second component 
(relatedness of response content). As argued 
earlier, responsiveness will serve to communi- 
cate interest and attraction, as well as to 
strengthen the perception of a “relationship” 
between participants. Thus, it was also ex- 
pected that responsiveness would affect sub- 
jects’ perception of the other’s attraction 
toward them and the degree to which they 
felt that they and the other had become 
acquainted with one another. 


Experiment 1: Probability of Response 


Proper manipulation of the probability of 
response requires identification of the behav- 
ioral unit that “demands” response. While in 
the “passive” or “listener” role, each partici- 
pant in an interaction will emit a number of 
brief responses such as Rosenfeld’s (1966) 
recognitions (e.g., smiles, head nods and 
shakes, gestures, facial expressions, brief 
utterances such as “I see,” “Yeah,” etc.), 
which provide important feedback for the 
other. Such responses have been variously 
termed “back-channel responses” (Duncan, 
1975; Duncan & Fiske, 1977; Krauss, Gar- 
lock, Bricker, & McMahon, 1977; Yngve, 
Note 3), “listener responses” (Dittmann & 
Llewellyn, 1967, 1968), “concurrent feed- 
back” (Krauss & Weinheimer, 1966), or “sig- 
nals of continued attention” (Fries, 1952). 
While the concept of responsiveness is in- 
tended to include such “back-channel” re- 
sponses, because of the difficulty of identifying 
the unit of behavior that “demands” response, 
the present study dealt only with the proba- 
bility of response at those junctures at which 
the exchange of the “active” (or speaker) 
role is demanded. 

Under the guise of a study of the “ac- 
quaintanceship process,” subjects exchanged 
information about themselves with another 
“subject” (actually a tape recording) by 
taking turns choosing and answering one of 
two questions on each of 6, 12, or 24 trials. 
Subjects were instructed to answer a question 
on each trial. However, they were told that 
their partner would be allowed to choose, on 
each trial, whether or not to answer a ques- 
tion. The tape responded for either 33% or 
66% of the trials. 


Table 1 
Design of Experiment 1 

                 Probability of response 
Frequency 
of response        33%          66% 
     4          12 trials     6 trials 
     8          24 trials    12 trials 


In the present situation, if the number of 
trials were held constant, probability and 
frequency of response would be confounded. 
Thus, differences in attraction as a function 
of probability of response, if obtained, could 
be interpreted as resulting from either differ- 
ing amounts of information about the stimu- 
lus person (see Sloan & Ostrom, 1974) or 
different frequencies of exposure to him or her 
(e.g., to the sound of his or her voice; see 
Zajonc, 1968). In order to separate the effects 
of the two variables, probability (33% versus 
66%) and frequency (4 versus 8) of response 
were factorially combined in the present ex- 
periment, while the number of trials varied 
between conditions. 
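
The way the number of trials follows from the two manipulated variables can be sketched as below. The function name is hypothetical, and 33% and 66% are treated as exactly one third and two thirds, which is an assumption; the text reports rounded percentages.

```python
from fractions import Fraction


def trials_needed(frequency: int, probability: Fraction) -> int:
    """Number of trials on which `frequency` responses occur at rate `probability`."""
    trials = frequency / probability
    if trials.denominator != 1:
        raise ValueError("frequency and probability do not give a whole number of trials")
    return int(trials)


# Assumption: 33% and 66% stand for exactly 1/3 and 2/3.
probabilities = {"33%": Fraction(1, 3), "66%": Fraction(2, 3)}
frequencies = [4, 8]

design = {
    (label, freq): trials_needed(freq, p)
    for label, p in probabilities.items()
    for freq in frequencies
}
for (label, freq), trials in design.items():
    print(f"probability {label}, frequency {freq}: {trials} trials")
```

Because the trial count is determined by the other two variables, it cannot be held constant across all four cells; only the two 12-trial cells share it.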


Method 


Subjects. Subjects were 96 Caucasian introductory 
psychology students, 48 males and 48 females, who 
participated for partial fulfillment of course require- 
ments. 

Design. In a 2 (probability of response) × 2 
(frequency of response) × 2 (sex of subject) design, 
subjects chose and answered one of two questions 
on each of 6, 12, or 24 trials. Subjects were responded 
to on either 33% or 66% of the trials by a same-sex 
confederate’s tape-recorded answer to the same ques- 
tion as that chosen by the subject. In order not to 
confound probability and frequency of response, the 
total number of trials was varied in relation to the 
two variables: In the 33% condition, the tape re- 
sponded on either 4 of 12 or 8 of 24 trials; and in 
the 66% condition, the tape responded on either 4 
of 6 or 8 of 12 trials. (The design of the experiment 
is illustrated in Table 1.) 
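
As a rough check on the design, the cells and the degrees of freedom of a fully between-subjects analysis of variance can be enumerated as follows. Equal allocation of subjects to cells is an assumption; the text gives 48 males and 48 females but no per-cell counts. The error degrees of freedom obtained this way agree with the F(1, 88) values reported in the Results.

```python
from itertools import product

N_SUBJECTS = 96
factors = {
    "probability": ["33%", "66%"],
    "frequency": [4, 8],
    "sex": ["male", "female"],
}

# Every combination of factor levels is one cell of the 2 x 2 x 2 design.
cells = list(product(*factors.values()))

# Assumption: subjects were divided equally among cells.
n_per_cell = N_SUBJECTS // len(cells)

# Error degrees of freedom for a between-subjects ANOVA: N minus the number of cells.
df_error = N_SUBJECTS - len(cells)

# Each two-level factor contributes (levels - 1) degrees of freedom to its main effect.
df_main = {name: len(levels) - 1 for name, levels in factors.items()}

print(len(cells), n_per_cell, df_error, df_main["probability"])
```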

Questions were designed to elicit information about 
the subject (e.g., “How do you usually spend sum- 
mers?” “Do you keep house plants?” “What do you 
expect to be doing at 65?” “What would you do if 
you suddenly inherited a million dollars?” “Did you 
work during high school?” “Where do you spend 
most of your time outside of class?”) and were 
designed to be neutral in content in order to mini- 
mize the effects of similarity. 

The order of pairs of questions was Latin square 
counterbalanced for each condition. For the 6- and 
12-trial conditions, the 24 pairs of questions were 
divided into four and two subgroups, respectively. 
The order of the 6 or 12 pairs was then Latin square 
counterbalanced within each subgroup, and each 
subject saw one of the orders within one of the sub- 
groups. 
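
One way such counterbalancing could be generated is sketched below. The text does not say which kind of Latin square was used or how pairs were assigned to subgroups, so the cyclic square and the consecutive split are assumptions, and both function names are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def cyclic_latin_square(items: Sequence[T]) -> List[List[T]]:
    """Return n orders of n items; each item occupies each position exactly once."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def split_into_subgroups(items: Sequence[T], size: int) -> List[List[T]]:
    """Divide items into consecutive subgroups of `size` (an assumed assignment rule)."""
    if len(items) % size != 0:
        raise ValueError("items do not divide evenly into subgroups")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


pairs = list(range(1, 25))  # the 24 question pairs, labeled 1..24 for illustration

# 6-trial conditions: four subgroups of 6 pairs, each with its own 6 x 6 square.
subgroups = split_into_subgroups(pairs, 6)
squares = [cyclic_latin_square(group) for group in subgroups]

# A subject in a 6-trial condition would see one row (order) from one square.
print(len(subgroups), squares[0][0], squares[0][1])
```

For the 12-trial conditions the same functions apply with a subgroup size of 12, giving two subgroups.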

Apparatus. The subjects’ cubicle contained an 
intercom box approximately 60.96 cm wide × 30.48 
cm high × 30.48 cm deep. Centered on its front 
were two panels labeled “Talk” and “Listen,” and 
affixed to it were two buttons, one attached by 
wire and the other on the box under the “Talk” 
panel. The two buttons were used by the subjects to 
signal when ready to begin speaking and when 
finished speaking, respectively. 

Subjects were instructed to signal when ready to 
begin speaking by pushing the button attached by 
wire. This, they believed, would cause the other's 
“Listen” panel to illuminate. When finished speak- 
ing, subjects pushed the panel button, which they 
believed would signal the other to begin by causing 
his or her “Talk” panel to illuminate. For the 
responsive trials, the panel button activated a con- 
trol box in an adjacent room, which, after a fixed 
delay, caused the subject's “Listen” panel to illumi- 
nate and automatically initiated the appropriate 
taped response. For unresponsive trials, the experi- 
menter pushed a button that caused the light to 
sequence back through the “Listen” panel to the 
“Talk” panel without activating the taped response. 
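
The sequence of events in a single trial, as seen from the subject's cubicle, can be summarized in a short sketch. The function and the event labels are hypothetical. The return of the light to the "Talk" panel after a taped answer is inferred from the written instructions rather than from the apparatus description, so it should be read as an assumption.

```python
from typing import List


def subject_panel_sequence(responsive: bool) -> List[str]:
    """Events seen from the subject's cubicle during one trial (illustrative labels)."""
    events = [
        "Talk panel lit",                # subject may begin
        "subject presses wire button",   # signals readiness to speak
        "subject answers question",
        "subject presses panel button",  # signals that the answer is finished
        "Listen panel lit",              # lit on both responsive and unresponsive trials
    ]
    if responsive:
        # Only responsive trials play the confederate's taped answer.
        events.append("taped answer plays")
    # Assumption: on both kinds of trial the light ends on "Talk" for the next trial.
    events.append("Talk panel lit")
    return events


for kind in (True, False):
    print("responsive" if kind else "unresponsive", subject_panel_sequence(kind))
```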

Responses to the 48 questions were recorded sepa- 
rately on 96 cassette reels, 48 for a male and 48 for a 
female confederate. In this way, the experimenter 
could easily insert the appropriate tape after hearing 
the subject's choice of question. The tape was always 
an answer to the same question as the one chosen 
by the subject. 

Procedure. Upon arrival, subjects were seated in 
a small cubicle containing an intercom system and a 
set of written instructions. The experimenter ex- 
plained that another subject was expected, instructed 
the subject to read the written instructions, and left 
him or her to wait, closing the door behind him. 
After a brief delay, the experimenter pretended to 
lead another “subject” to the adjacent cubicle, 
audibly instructed the “subject” to read the instruc- 
tions, and closed the door. The true subject's in- 
structions were labeled “Instructions-Person A,” and 
read as follows: 


This experiment is concerned with the acquaint- 
anceship process as it occurs in brief encounters 
with another person, and as it occurs without the 
use of nonverbal cues such as physical appearance, 
dress style, eye contact, body posture, etc. 


You will be communicating with another person 
using the intercom system (the box in front of 
you). Both of you will have a booklet, with two 
questions on each page. You (Person A) will begin 
by choosing and answering one of the two ques- 
tions on the first page of the booklet, and you will 
be required to answer one of the two questions 
on all of the trials. 


You may begin your answer when the light is in 
the panel labeled “Talk.” When you are ready to 
answer, push the button attached by wire to the 
box. This will activate the intercom and signal 
the other person that you are going to talk, by 
causing his or her “Listen” panel to illuminate. 


Please begin by telling the other person which 
question you are going to answer (A or B), and 
then give your answer. When you have finished, 
push the red button under the “Talk” panel. This 
will signal the other person that you are finished 
by causing his or her “Talk” panel to illuminate. 


Person B will then have the option of either 
responding or not responding on each trial. If 
Person B chooses to respond, when he or she is 
ready to talk, he (she) will push the button at- 
tached by wire to the box, again activating the 
intercom, and causing your “Listen” panel to il- 
luminate. When finished, he or she will push the 
red button under the “Talk” panel, causing your 
“Talk” light to come on. If Person B chooses not to 
respond, he or she will just push the red button, 
and the lights will sequence back to your “Talk” 
panel. 


Please do not communicate in any way other than 
by answering the questions. 


When subjects had been allowed enough time to 
read the instructions, the experimenter opened both 
doors and, from the hallway facing the two cubicles, 
e upon the written instruc- 
tions for the subject (Person A) and the nonexistent 


Upon completion of the trials, subjects responded 
to a questionnaire designed to assess their attraction 
to the confederate (eight items), perceptions of the 
confederate’s 


the experiment: think the experi- 
menter expects to learn from the experiment?” 
“What was he 


gu 
experiment, Subjects were i 
end of the semester. 


Results 


Attraction* to confederate. It was pre- 
dicted that subjects would like the confederate 
who responded on 66% 


five columns of Table 2 indicate, respectively, 
the dependent measure of attraction and 
means for the four treatment combinations. 

The predicted main effect of probability 
of response was highly significant for seven 
out of the eight questions and in the predicted 
direction for the eighth. Columns 6 and 7 of 
Table 2 indicate the Fs and significance lev- 
els, respectively. Analysis of variance on the 
mean of the eight questions also yielded a 
significant main effect of probability of re- 
sponse, F(1,88) = 14.89, p < .001. 

There were no significant main effects of 
either sex or frequency of response, and in 
no case did probability of response interact 
with either variable. 

Perceptions of confederate reactions to sub- 
ject. Two of the postexperimental question- 
naire items reflected subjects’ perceptions of 
how much the confederate liked them (“How 
much do you think your partner likes you?”) 
and the confederate’s interest in their answers 
to the questions (“Did you think your part- 
ner was interested in the information you 
sent him/her?”). Although subjects tended 
to believe that the responsive confederate 
liked them more than the unresponsive one 
(Ms = 5.23 and 4.90, respectively) and that 
he or she was more interested in their an- 
swers to the questions (Ms = 4.40 and [illegible], 
respectively), the main effect of probability 
of response was only marginally significant 
for the two questions separately, Fs(1, 88) = 
1.81 and 3.44, ps < .17 and < .06, respec- 
tively, and for their mean, F(1, 88) = [illegible], 
p < .06. Again, there were no significant 
effects involving frequency of response or sex. 

Table 2 
Attraction to Confederate as a Function of Probability and Frequency of Response 

                                                      Probability of response 
                                                     33%                66% 
                                                  Frequency          Frequency        Main effect, 
                                                 of response        of response       probability 
                                                                                      of response 
Questionnaire item                                4       8          4       8       F(1,88)     p 
How much do you like your partner?               5.31    5.71       6.38    6.63      12.89    .001 
How well do you think you would get along with 
  your partner?                                  5.08    5.00       6.17    6.63      14.85    .001 
Were you interested in the information your 
  partner sent you?                              5.46    5.79       6.71    6.92      11.90    .001 
How much would you enjoy a casual conversation 
  with your partner, over a beer or coke?        6.21    5.38       7.08    6.96      12.72    .001 
How much would you enjoy working on a 
  problem with your partner?                     5.08    5.42       6.63    6.29      11.70    .001 
Is your partner the type of person you could 
  become close friends with?                     4.58    4.54       5.33    5.37       5.63    .02 
Is your partner a good listener?                 5.58    5.42       6.13    5.67       1.28    .26 
How friendly is your partner?                    5.63    5.92       6.79    6.46       4.97    .03 
Mean of the eight items                          5.35    5.40       6.40    6.37      14.89    .001 

Note. Ratings ranged from 0 (“not at all”) to 10 (“very”). 

* All questionnaire ratings were made on 11-point 
scales. 

Perceptions of acquaintanceship. Respon- 
siveness affected the degree to which subjects 
felt that they and the confederate had become 
acquainted with one another. Responses to 
the postexperimental question “How well did 
you feel that you got to know your partner?” 
indicated that subjects felt they got to know 
the confederate who responded on 66% of the 
trials (M = 4.09) better than the one who 
responded on only 33% of the trials (M = 
3.16); F(1,88) = 8.11, p < .005. 

In contrast, in response to the question 
“How well do you think that your partner 
got to know you?” subjects indicated that the 
confederate got to know them better in the 
33% condition than in the 66% condition 
(Ms = 4.90 and 4.19, respectively), although 
the effect was only marginally significant, 
F(1,88) = 3.17, p < .08. The latter effect is 
not surprising, however, since subjects an- 
swered twice as many questions in the 33% 
as in the 66% condition (i.e., for 12 and 24 
versus 6 and 12 trials). In fact, the two 12- 
trial conditions did not differ from each other 
(4.83 versus 4.67), whereas the 6-trial con- 
dition was much lower (3.70) and the 24- 
trial condition higher (4.96). 

Again, there were no main effects of either 
sex or frequency of response and no interac- 
tions of either variable with probability of 
response. It is interesting that frequency of 
response failed to affect the extent to which 
subjects felt that they had become acquainted 
with the confederate, despite the fact that 
they heard the confederate answer twice as 
many questions in the high frequency condi- 
tion. On the other hand, probability of re- 
sponse did affect perceived acquaintanceship, 
even though the number of answers subjects 
heard from the confederate was constant 
across the two conditions. 
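
The cell means reported above for the question “How well do you think that your partner got to know you?” can be checked against the two condition means with a few lines of arithmetic. The sketch assumes equal cell sizes, so that a condition mean is the unweighted average of its two cells. The text does not say which 12-trial mean belongs to which probability condition, so both assignments are tried.

```python
# Cell means taken from the text; the pairing of the two 12-trial means with
# the probability conditions is not stated, so both pairings are evaluated.
twelve_trial_means = (4.83, 4.67)
six_trial_mean = 3.70          # 66% condition, 4 responses in 6 trials
twenty_four_trial_mean = 4.96  # 33% condition, 8 responses in 24 trials


def marginal(a: float, b: float) -> float:
    """Unweighted mean of two cell means (assumes equal cell sizes)."""
    return (a + b) / 2


results = []
for m33_cell, m66_cell in (twelve_trial_means, twelve_trial_means[::-1]):
    m33 = marginal(m33_cell, twenty_four_trial_mean)
    m66 = marginal(m66_cell, six_trial_mean)
    results.append((m33, m66))
    print(f"33%: {m33:.3f}   66%: {m66:.3f}")
```

Under the first assignment (4.83 in the 33% condition, 4.67 in the 66% condition) the averages are approximately 4.895 and 4.185, which agree with the reported 4.90 and 4.19 to rounding.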

Ratings of responsiveness. Five trait rat- 
ings reflected subjects’ perceptions of confed- 
erate responsiveness. Again the main effect of 
probability of response was significant, indi- 
cating that responsive confederates (66% 
response) were perceived as more responsive, 
F(1, 88) = 26.24, p < .001; attentive,
F(1, 88) = 6.26, p < .02; talkative,
F(1, 88) = 33.95, p < .001; active,
F(1, 88) = 5.13, p < .025; and sympathetic,
F(1, 88) = 3.80, p < .05, than the more
unresponsive (33% response) confederate.
There were no main
effects of sex or frequency of response and no 
interactions of sex with probability of re- 
sponse for any of the above ratings. Although 
probability and frequency of response inter- 
acted for the trait “active,” F(1,88) = 4.59, 
p < .04, the two variables did not signifi- 
cantly interact for the other four ratings. 

Differences in number of trials. In order 
to independently manipulate probability and 
frequency of response, the number of trials 
varied between conditions (see Table 1). The 
two 33% conditions involved 12 and 24 trials, 
whereas the two 66% conditions involved 6 
and 12 trials. The effect of probability of re- 
sponse may be tested, however, while holding 
the number of trials constant. 
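
The trial counts in this design follow from the two manipulated variables: a confederate who is to respond a fixed number of times (frequency) at a fixed rate (probability) needs frequency divided by probability trials. The sketch below is not part of the original report. It reproduces the four cell sizes on the assumption that 33% and 66% stand for exactly one third and two thirds, and the function and variable names are illustrative only.

```python
from fractions import Fraction


def trials_needed(frequency: int, probability: Fraction) -> int:
    """Trials required for `frequency` responses at rate `probability`."""
    trials = Fraction(frequency) / probability
    if trials.denominator != 1:
        raise ValueError("combination does not give a whole number of trials")
    return int(trials)


# Assumption: 33% and 66% are treated as exactly 1/3 and 2/3.
probabilities = (Fraction(1, 3), Fraction(2, 3))
frequencies = (4, 8)  # number of confederate responses, as given in the text

design = {
    (freq, prob): trials_needed(freq, prob)
    for prob in probabilities
    for freq in frequencies
}

for (freq, prob), trials in sorted(design.items(), key=lambda kv: kv[1]):
    print(f"frequency={freq}, probability={prob}: {trials} trials")
```

The two cells that share 12 trials (frequency = 4 at one third, frequency = 8 at two thirds) are the ones that allow probability of response to be compared with the number of trials held constant.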

t tests were performed to test the difference
between the two cells involving 12
trials (frequency = 4, probability = 33%
versus frequency = 8, probability = 66%) on
the means of the three major dependent
variable categories. The comparison was significant
for all three: attraction to the confederate,
t(46) = 14.48, p < .001; perceptions
of confederate attraction to the subject,
t(46) = 9.88, p < .001; and the degree to
which subjects felt they got to know the confederate,
t(46) = 15.96, p < .001.

Although probability and frequency of re- 
sponse are confounded for the above com- 
parison, it is doubtful that differences in fre- 
quency of response accounted for the results. 
The overall main effect of frequency of re- 
sponse was not significant, and for the other 
possible comparison (frequency = 8, proba- 
bility = 33% versus frequency = 4, proba- 
bility = 66%) the means were in the opposite 
direction (i.e., frequency of response was in- 
versely related to the three dependent varia- 
bles). The failure of frequency of response to 
affect attraction is, of course, consistent with 
existing evidence that attraction varies with 
the proportion, but not number, of positive 
reinforcers (e.g., Byrne & Nelson, 1965). 


Discussion 
The results of Experiment 1 are clearly
consistent with the hypothesis that attraction
will vary positively with responsiveness. Subjects
liked the confederate who responded on
66% of the trials more, believed that he or
she liked them more, and felt that they got to
know him/her better than the confederate
who responded on only 33% of the trials.
However, since only related responses were
employed (the confederate always answered
the same question as the subject), there is no
direct evidence that the effect of probability
of response would hold for unrelated responses
as well. Indirect evidence is available, however,
through comparison of the present results
with those of Experiment 2, which will
be described in the general discussion of
the two experiments.


Experiment 2: Proportion of Related
Responses 


Experiment 2 was designed to test the hypothesis
that attraction will be a positive
function of the proportion of related responses.
As in Experiment 1, subjects exchanged information
about themselves with another subject
(actually a confederate) by taking turns
choosing and answering questions about
themselves. The subject and confederate answered
one of three questions on each of 10 trials.
Relatedness was varied through the confederate’s
choice of question. He or she chose
either the same question as the subject (related
response) or one of the two different
questions (unrelated response). The same
question was chosen on either 80% or 20%
of the trials.

For half of the subjects, the confederate
answered after the subject on each trial, in
which case his or her choice could be perceived
as a response to the subject’s answer.
For the other half, in order to control for the
possibility that perceived similarity (as a
result of choosing the same questions) could
affect attraction, the confederate answered
first. In this case, his or her responses could
in no way be perceived as contingent upon
those of the subject and, thus, could not be
perceived as a response to the subject. Therefore,
for the confederate-first condition, the
proportion of related responses should not
affect attraction.


Method 


Subjects. Subjects were 80 Caucasian introductory
psychology students, 40 males and 40 females, who
participated for partial fulfillment of course
requirements.

Design. In a 2 (sex of dyad) × 2 (order) × 2
(proportion of related responses) factorial design,
subjects and a same-sex confederate² alternately
chose and responded to one of three questions on 
each of 10 trials. Using prearranged responses, the 
confederate answered the same question as the sub- 
ject on either 80% or 20% of the trials. For half of 
the subjects, the confederate answered first on all 10 
trials, and for the other half, the subject answered 
first for the 10 trials. 
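
The bookkeeping implied by this design can be checked with a short sketch, which is not part of the original report. It enumerates the cells of the 2 × 2 × 2 factorial and derives the number of related-response trials per condition. Equal cell sizes are an assumption (the text reports only 80 subjects, 40 of each sex), and all names are illustrative.

```python
from itertools import product

# Factors of the 2 x 2 x 2 design described above.
factors = {
    "sex_of_dyad": ("male", "female"),
    "order": ("subject_first", "confederate_first"),
    "proportion_related": (0.8, 0.2),
}
n_subjects = 80
n_trials = 10

cells = list(product(*factors.values()))

# Assumption: subjects were divided equally among the cells.
per_cell = n_subjects // len(cells)

# Within-cells (error) degrees of freedom for a between-subjects ANOVA.
error_df = n_subjects - len(cells)

# Number of trials on which the confederate answers the subject's question.
related_trials = {p: round(p * n_trials) for p in factors["proportion_related"]}

print(len(cells), per_cell, error_df, related_trials)
```

Under the equal-cell assumption this gives 10 subjects per cell and 72 error degrees of freedom, which agrees with the F(1, 72) tests reported in the Results, and 8 versus 2 related trials out of 10.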

As in Experiment 1, questions were designed to 
elicit information about the subject (e.g., “What did
you do the last time it was a nice day?” “What 
family traditions do you have around holidays?” 
“What are your parents’ occupations?” “What mode 
of transportation do you use to get around town?” 
“What classes are you taking this semester?”) and to 
be neutral and noncontroversial in content. In addi- 
tion, the three questions for each trial were scaled 
and equated for intimacy level. 

The order of the 10 triads of questions was Latin 
square counterbalanced, and the two responses that 
represented the 20% (either same or different) al- 
ways occurred on Trials 2 and 8. 

Procedure. Upon arrival, subjects were seated in 
a small cubicle containing an intercom system. The 
experimenter explained that another “subject” was 
expected and left the subject to wait, closing the 
door behind him. After a brief delay, another “sub- 
ject” (actually a same-sex confederate) was audibly 
led to the adjacent cubicle. The experimenter then
opened both doors and instructed both subject and
confederate from the hallway facing the two rooms.
At no time during the experiment did the subject
and confederate come into visual contact.

Subjects were told that the purpose of the experiment
was to study the “acquaintanceship process” as
it occurs through the exchange of information alone,
without the use of the physical and nonverbal cues
normally present in everyday interactions. The
experimenter explained that in this case, the
participants would be communicating over the intercom
system and getting acquainted simply by taking
turns answering questions about themselves.

Each subject and confederate were given a 10-page
booklet containing three questions per page and were
instructed to take turns choosing and responding to
one of the three questions available on each of the 10
trials. The experimenter explained that they should
begin each turn by announcing which question they
were going to answer and signal when finished by
saying “Okay.” It was stressed that conversation
should be strictly confined to answering the questions.
No other conversation, such as extraneous
questions or comments, was permitted.

For the confederate-first condition, in order for
the confederate to make the appropriate number of
related responses, subjects were instructed to choose
and record their choice of questions on a “choice
form,” which was collected by the experimenter
before the trials began and used to choose the
confederates’ responses. Subjects also circled their choices
on their own booklets, and the experimenter stressed
that they were not to alter their choices after the
experiment began. Subjects believed that the same
procedure was followed by the confederate.

This procedure was employed in order to ensure 
that for the confederate-first condition, the confed- 
erate’s choices would not be perceived as contingent 
upon those of the subject. In the subject-first condi- 
tion, choice forms were not employed, and both 
the subjects’ and confederates’ responses were chosen 
as the trials progressed. 

When the instructions and the choice forms (for 
the confederate-first condition) were complete, the 
experimenter allowed the subject and confederate to 
greet each other in order to become familiar with 
the intercom system, instructed either the subject or 
the confederate to begin, and closed both doors. 

To keep the confederate blind to condition, a 
visual cue from the experimenter, who was monitor- 
ing the subject’s responses via headphones located in 
the confederate’s cubicle, informed the confederate 
when, and to which question, to give the prearranged
response.³

Upon completion of the trials, subjects completed 
the same questionnaire as the one employed in Ex- 
periment 1, and again were debriefed by mail at the 
end of the semester. 


Results 


Attraction to confederate. It was hypothe- 
sized that the proportion of related responses 
would influence attraction in the subject-first 
condition, but not in the confederate-first 
condition. Again, the postexperimental ques- 
tionnaire included eight questions designed to 
measure liking of the confederate (see Table 
3). The first five columns of Table 3 indicate, 
respectively, the dependent measure of at- 
traction and the means for the four treatment 


combinations.

The predicted interaction of order with
proportion of related responses was statistically
significant for six of the eight questions
(see column 6 of Table 3) and for the mean
of the eight questions, F(1, 72) = 8.78, p <
.004. For the other two questions, the interaction
failed to reach significance by only a
small margin.

²One confederate of each sex was employed.

³If there tended to be distinct patterns for subjects’
choices among the three questions for the 10
triads, then there would also be distinct patterns
for the questions answered by the confederate. In
other words, subjects who answered with 80% related
responses would typically answer different
questions than those who responded with 80% unrelated
responses. Differences in attraction might then
be explained by differences in questions answered by
the confederate. In order to examine this possibility,
the experimenter kept a record of all subject and
confederate choices. Analyses of this record indicated
that subjects chose all questions approximately
equally often and that confederate choices did not
differ across conditions.

The simple effect tests for proportion of 
related responses within the subject-first and 
confederate-first conditions (columns 7 and 
8 of Table 3) indicated that as predicted, 
related responses facilitated attraction when 
the subject answered first, but did not sig- 
nificantly affect attraction when the confed- 
erate answered first. There were no interac- 
tions involving sex for any of the question- 
naire items. 

Perceptions of confederate reactions to sub- 
ject. Two of the postexperimental question- 
naire items reflected subjects’ perceptions of 
how much the confederate liked them and the 
confederate’s interest in their answers. Again, 
the proportion of related responses interacted 
with order, such that when subjects answered 
first, they believed that a confederate who 
answered the same question on 80% as opposed
to 20% of the trials liked them more
(Ms = 6.55 and 5.55, respectively), simple
F(1, 72) = 5.05, p < .03, and was more interested
in their answers (Ms = 6.45 and
4.40, respectively), simple F(1, 72) = 13.54,
p < .001; but when the confederate answered
first, the proportion of related responses had
no effect, Fs(1, 72) < 1.53, ns. The interaction
was significant for both questions, Fs(1,
72) = 6.07 and 8.15, ps < .02 and < .006,
respectively, and for their mean, F(1, 72) =
6.59, p < .01.

Perceptions of acquaintanceship. The pro- 
portion of related responses affected the ex- 
tent to which subjects felt that they and the 
confederate became acquainted with one an- 
other and the extent to which they enjoyed 
the experiment, again only in the subject-first 
condition. When subjects answered first, they 
felt that they got to know the confederate 
better (Ms = 6.60 and 4.95), simple F(1, 72) 
= 8.07, p < .006, and that the confederate 
got to know them better (Ms = 6.30 and
4.45), simple F(1, 72) = 11.73, p < .001,
when the confederate answered the same question
for 80%, as opposed to 20%, of the
trials. There were no significant differences
for the confederate-first condition. The interaction
was significant for both questions
(Fs(1, 72) = 5.35 and 5.86, ps < .03 and
< .02, respectively).

Similarity. Neither order nor the proportion
of related responses affected subjects’
ratings of their partner’s similarity to themselves.
This result, in combination with the
absence of an effect of relatedness for the
confederate-first condition, argues against the
alternative explanation of the results that the
proportion of related responses affected attraction
through its effect on perceived similarity.

Ratings of responsiveness. Subjects rated
the confederate as more “responsive” when he
or she answered 80% of the same questions
rather than 20%, F(1, 72) = 3.49, p < .07,
but only for the subject-first condition, interaction
F(1, 72) = 3.21, p < .08. These effects
were only marginally significant, however, and
responsiveness did not affect ratings of “attentive,”
“talkative,” “active,” and “sympathetic,”
as it did in Experiment 1. This difference
probably occurred because subjects’
personal definitions of these adjectives are
more tied to probability than to relatedness
of response.


General Discussion 


The results of the two experiments clearly
supported the hypothesis that attraction is
facilitated by responsiveness. For Experiment
1, probability of response facilitated attraction,
as well as the perception of acquaintanceship
with the confederate. Subjects also
tended to believe that the confederate who
responded with high probability liked them
more and was more interested in their answers
to the questions than the one who responded
with low probability.

For Experiment 2, those in the subject-first
condition not only liked the confederate who
gave 80% related responses more, but also
felt that the confederate liked them more,
was more interested in their answers, and that
the two of them had become better acquainted
with one another. These results are especially
impressive, given the large amount of information
(constant across conditions) that
was conveyed by the confederate’s answers
to the 10 questions and by his or her style of
verbal interaction, information that would
doubtless carry a great deal of weight for
impressions formed of the confederate. It is
also impressive that responsiveness affected
attraction in the present situation, where
relevant responses should not have been necessarily
expected or demanded by the situation,
and where unrelated responses would not
reflect negatively upon the character of the
confederate.

As mentioned earlier, since Experiment 1 
employed only related responses, there was no 
direct evidence that the effect of probability 
of response would hold for unrelated re- 
sponses. The subject-first, 20% related condi- 
tion of Experiment 2 represents a comparison 
in which the probability of related response 
was even lower (20%) than for the low-prob- 
ability-of-response (33%) condition in Ex- 
periment 1. Despite the fact that the former 
condition involved lower probability of re- 
lated responses, attraction to the confederate 
(M = 6.33), perceptions of confederate at- 
traction to the subject (M = 4.98), and the 
degree to which subjects felt that they had 
gotten to know the confederate (M = 4.95) 
were greater than the comparable values for 
the 33% response condition of Experiment 1 
(Ms = 5.38, 4.30, and 3.13, respectively). 
While comparisons between experiments are 
tenuous, it is plausible to argue that the dif- 
ferences were due to the relative probabilities 
of unrelated responses (80% versus 0%). 


Perceived Acquaintanceship 


Perhaps one of the most interesting results 
of the two experiments was the relationship 
between responsiveness and perceived ac- 
quaintanceship. For Experiment 1, probability 
of response affected the degree to which sub- 
jects felt that they became acquainted with 
the confederate, even though the number of 
answers they heard from the confederate was 
constant across the two conditions. Frequency 
of response, on the other hand, did not affect 
perceived acquaintanceship, despite the fact 
that subjects heard the confederate answer 
twice as many questions in the high frequency
condition. For Experiment 2, the
proportion of related responses affected perceived
acquaintanceship: Subjects felt that
they became better acquainted with the confederate
and that the confederate had become
better acquainted with them when the confederate
responded with 80% related responses.
Apparently, responsiveness affected something
more basic than attraction, namely, the perception
of a “bond” or “relationship” between
the subject and confederate.

Literature on pathological communication
(e.g., Danziger, 1976; Laing, 1961; Ruesch
& Bateson, 1951; Sluzki et al., 1967; Watzlawick
et al., 1967) suggests that the relationship
between the successive communicative
acts in an interaction serves to define the
relationship between the participants. As long
as a response addresses the content of the
preceding message (whether positively, negatively,
with agreement or disagreement), it is
assumed to provide confirmation of the other
as a person. In contrast, an irrelevant response
or failure to respond (a “disqualification”)
is assumed to communicate the interpersonal
message “You do not exist.” Danziger (1976)
has expressed the point well:


In recognizing the meaning and intent of a message
and addressing myself to its implied demand I also
recognize the originator of the message as a person.
In disqualifying a message and refusing to recognize
its nature I invalidate the claim to self-definition
that it implies. If I show by my inappropriate
response that a message has not registered I am at
that moment refusing confirmation of the
personality that is the source of the message. (p. 13


Certainly, in order to “get to know” an
individual, one must first recognize his or her
claim to personhood. It may be that for the
present studies, responsiveness communicated
to the subject that the confederate was responding
to him or her as a person, thereby inducing
the perception of greater acquaintanceship.
At the least, the data indicate that responsiveness
communicated to the subjects that the
confederate liked them and was interested in
their answers to the questions.

The results for the confederate-first condition
in Experiment 2 attest to the importance
of the perception of contingency between
responses, both for the definition of a relationship
(or the perception of acquaintanceship)
and for attraction. Since the responses
were chosen before the interaction began, and
since the subject answered after the confederate,
the confederate’s responses could not
have been perceived as responses to the subject
and consequently could not have been
interpreted as a comment on their “relationship.”
Thus, not surprisingly, the proportion
of related responses affected neither attraction
to the confederate, nor subjects’ perceptions
of how much the confederate liked them,
nor the degree to which subjects felt that
they and the confederate became acquainted
with one another.

A final piece of evidence concerning the 
importance of related responses came from 
pilot subjects in the confederate-first 20%
related condition. Until the instructions were
modified to strictly and redundantly forbid
that the choice of questions be altered during
the interaction, subjects frequently changed
responses that should have been different 
from the confederate’s in order to answer the 
same question. When confronted, they com- 
plained, “But how can we get to know each 
other if we just keep answering different 
questions that have nothing to do with each 
other?” Apparently, subjects felt that it was 
necessary to be responsive in order to become 
acquainted with the other person. 


Implications for Self-Disclosure Research 


Research on self-disclosure (e.g., Altman,
1973; Altman & Taylor, 1973) has been
concerned with the content (i.e., intimacy
level) of information exchanged in dyadic
interaction. The present analysis suggests that
the sequential process of disclosure may be as
important as the specific content. As suggested
above, greater feelings of acquaintanceship
and attraction, as well as greater perception
of control, will develop from interactions
in which each participant’s responses
[…] address previous disclosures or other
behaviors of the perceiver will produce greatest
attraction.

The majority of self-disclosure research
has focused on the absolute intimacy level of
another’s disclosures, independently of the
perceiver’s own. However, it may be argued
that disclosures at the same level of intimacy
will be perceived to be directly related to the
preceding communication with greater probability
than those at a different level of intimacy.
It may also be argued that one will
feel more in control of the intimacy level of
the interaction to the extent that another’s
disclosures are of the same level of intimacy
as one’s own. If so, another will be more attractive
the more closely his or her disclosures
match the intimacy level of the perceiver’s.


Directions for Future Research 


Although the results of the present two experiments
were clearly consistent with the
hypothesis that attraction is facilitated by
responsiveness, the experimental interactions
were highly structured and in many ways
quite different from more naturalistic interaction.
Both the format of the interaction (i.e.,
taking turns answering questions) and the
range of alternative responses (choice among
two or three topics) were defined by the
experimenter. In addition, unresponsiveness was
specifically condoned by the experimenter.
For Experiment 1, the confederate was told
that he or she could choose not to respond,
and for Experiment 2, that he or she could
answer any of the three questions. Because of
the structure of the situation, the attributions
resulting from the confederate’s unresponsiveness
(both regarding his or her social graces
and his or her feelings toward the subject)
should have been less extreme than those
concerning unresponsiveness in more naturalistic
interactions. Responsiveness should also have
been less related to maintenance of interaction,
predictability, control, stress, and facilitation
of interaction goals in the present
situation. Thus, one of the first priorities of
future research should be to manipulate
responsiveness in the context of less structured,
more naturalistic interactions, where it would
be expected to have even greater impact on
attraction.

Given the importance of responsiveness for
human interaction, future research might also
profitably explore the determinants of two
persons’ capacity for mutually responsive
interaction. At least five conditions are necessary for
responsive interaction. In order to be responsive,
one must first attend to the other person,
both in order to detect behaviors that demand 
response and in order to process those behav- 
iors with sufficient accuracy to be able to re- 
spond with related content. Many situational 
characteristics, such as other persons present, 
task demands, noise, or other distracting stim- 
uli, will affect the degree to which attention 
is available for the person with whom one is 
interacting. Internal demands for attention, 
such as self-consciousness, preoccupation with 
irrelevant thoughts or with bodily sensations, 
and so forth will also reduce attention to the 
other person. Finally, stimulus characteristics 
of the other person may influence the amount 
of attention devoted to him or her. It is rea- 
sonable to assume, for example, that one will 
devote more attention to those whom one 
likes, those who are of high status or in con- 
trol of rewards, or those who are physically 
attractive. 

A second condition necessary for responsive
interaction is the motivation to be responsive.
Responsiveness has two effects, which
may or may not be desirable to the participants:
(a) It tends to prolong the interaction
and (b) it tends to lead the participants to
feel more of a relationship (or more intimacy)
with one another. If either or both of the
two effects are undesirable to the participants, 
the motivation to be responsive will be cor- 
respondingly low. As Watzlawick et al. (1967) 
have pointed out, unrelated or irrelevant re- 
sponses (disqualifications) are commonly em- 
ployed as a means to either terminate an in- 
teraction or to change the subject. Thus, the 
motivation to be responsive should be affected 
by such factors as the attractiveness of the 
other person and the degree to which he or 
she controls rewards, as well as by temporary 
factors affecting the desire to interact at a 
specific time or place or on a particular topic. 


Given that a person has attended to the
other and is motivated to be responsive, he or
she must have the capacity or energy to respond
at the rate “demanded” by the other.
When interacting with another who talks
rapidly and at length, one must have the
energy to respond rapidly and at length. If
the “interaction rates” of two persons are not
matched, the more active participant will feel
frustrated by the other’s unresponsiveness,
while the less active participant will feel
drained by the constant demand for response.
Chapple (1939, 1974) has argued for years
that two persons’ capacity for smooth interaction
will be determined in part by the match
of their activity levels. Similarly, Latané and
Hothersall (1972) have argued that the relative
preference of rodents for within- versus
between-species others results in part from
incompatible activity levels, or “interaction
rates,” between species.

In order to respond with related content,
one must first accurately interpret the meaning
of the other person’s behavior or communication.
Accuracy of understanding will
be affected, in part, by the content of the
communication. The more ambiguous the subject
matter, the more difficult it will be to interpret
accurately, and therefore the less responsive
will be the interaction. The cognitive
skills of the participants will also affect
the capacity for responsive interaction. The
ability to accurately understand communications
will be determined both by general verbal
skills and by the ability to take on the
frame of reference of the other person. Thus,
the capacity for responsiveness will vary
with the intelligence, general knowledgeability,
cognitive development, and cognitive complexity
of the participants, and with their
familiarity with one another.

Understanding will also be facilitated by
many forms of similarity. It would be expected,
for example, that any shared characteristic
(such as race, culture or subculture,
socioeconomic class, profession, interests, cognitive
structure, and even shared experience)
that would lead two persons to share ways of
thinking or of perceiving the world, of using
language, or of interpreting behavior would
facilitate understanding and therefore responsiveness.
Similarity of culture (Triandis,
1964, 1975), socioeconomic class (…,
1961), and cognitive structure (Runkel, 1956;
Shibuya, 1962; Triandis, 1959, 1960a, 1960b)
have, in fact, been shown to facilitate
communication.

The fifth, and final, condition necessary for
responsive interaction is that the participants
possess the response repertoire necessary for
related responses. Again, similarity, as well as
general knowledgeability, will be important.
Since persons most often initiate conversations
dealing with topics in which they are
interested or knowledgeable, or with which
they are currently concerned, persons who
have similar interests, knowledge, or topics
of concern will more often engage in responsive
interaction than those who do not.


Conclusions 


The present demonstration of the importance
of responsiveness illustrates the utility
of exploration of interaction-based determinants
of attraction. Such sequential response
contingencies as those described here may be
profitably explored as both independent and
dependent variables, as illustrated above. As
a dependent variable, responsiveness or other
sequential response contingencies may allow
identification and greater understanding of
variables that may influence attraction either
partially or entirely through their influence on
interaction (e.g., similarity).
Behavioral Change in a Constant Environment:
Shift to More Difficult Tasks 
With Constant Probability of Success 


Julius Kuhl    Virginia Blankenship
University of Michigan (Ann Arbor)


The present experiment was designed as a first attempt to test hypotheses
derived from Atkinson & Birch’s dynamic theory of action, which proposes a
theoretical reorientation from an episodic to a dynamic view of motivation.
Traditional episodic theories of achievement motivation predict constant risk
preference over a series of free choices from various difficulty levels when the
assumed situation-specific determinant (probability of success) remains constant.
In contrast to this, the dynamic theory predicts a shift to more and more
difficult tasks, both for success-oriented and for failure-oriented subjects. In
addition, the dynamic theory predicts that the initial ambivalence between very easy
and very difficult tasks, predicted by traditional theory of achievement motivation
for failure-oriented subjects, is quickly replaced by a consistent preference
for very easy tasks in that motive group. The results of the present experiment
support the predictions derived from the dynamic theory.


This article was prepared during an exchange
… granted to the first author by the German …

When subjects are allowed to choose tasks
from a given set of different difficulty levels
over an extended period of time, a shift of
preference to more and more difficult tasks
is the common finding (e.g., Atkinson, Bastian,
Earl, & Litwin, 1960; Atkinson &
Feather, 1966). The cognitive theory of
achievement motivation (Atkinson, 1957;
Atkinson & Feather, 1966; Raynor, 1969)
and its reformulation within an attributional
theory of motivation (Heckhausen, 1973,
1977; Weiner et al., 1971) attribute such
shifts in risk preference to respective changes
in the relevant situation-specific determinant
of risk preference, namely, subjective probability
of success. According to this hypothesis,
subjects move to more difficult levels
during a series of free choices among various 
difficulty levels because subjective probability 
of success increases as a result of subjective 
improvement. With probability of success 
held constant, no systematic shift in pre- 
ferred difficulty would be expected. In recent 
years, a dynamic theory of motivation has 
been developed that explains behavioral 
change even if the situation-specific determi- 
nants remain constant (Atkinson & Birch, 
1970, 1974). The experiment to be described 
here was designed to test this hypothesis, 
which was derived from The Dynamics of 
Action (Atkinson & Birch, 1970), stating 
that shifts to more and more difficult tasks 
occur even if subjective probability of suc- 
cess is held constant (i.e., has stabilized after 
a sufficiently long practice period). 


The Dynamics of Action 


The basic concepts and assumptions of this theory (Atkinson & Birch, 1970,
1974) are summarized in a recent article on computer simulation of
imaginative behavior (Atkinson, Bongort, & Price, 1977):


If a certain kind of activity has been intrinsically satisfying or previously
rewarded in a particular situation, there will be an instigating force (F) for
that activity, attributable in part to strength of motive in the person and in
part to the magnitude of incentive for that activity in that situation. This
will cause a more or less rapid arousal and increase in the strength of an
inclination to engage in that activity, an action tendency (T), depending on
the magnitude of the force. If a certain kind of activity has been frustrated
or punished in the past, there will be an inhibitory force (I) and a more or
less rapid growth in the strength of a disinclination to act. This is what we
now call a negaction tendency (N) and conceive as a tendency not to do it. The
duration of these forces will determine how strong the action tendency or
negaction tendency becomes. The latter, the tendency not to do something, will
produce resistance to the activity. It opposes, blocks, dampens; that is, it
subtracts from the action tendency to determine the resultant action tendency
(T̄ = T - N). The resultant action tendency competes with other action
tendencies for other incompatible activities. The strongest of them is
expressed in behavior. The expression of an action tendency in behavior is
what reduces it. Engaging in activity produces a consummatory force (C), which
depends in part on the consummatory value (c) of the particular activity and
in part on the strength of tendency being expressed in behavior (i.e.,
C = c·T). Similarly, the resistance to an action tendency, produced by the
opposition of a negaction tendency, constitutes an analogous force of
resistance (R), which reduces, in a comparable way, the strength of the
negaction tendency.


The force of resistance (R) is a function of the negaction tendency and the
resistance value (r) of the particular activity (i.e., R = r·N). In order to
derive predictions from the theory concerning changes in risk preference
during a series of free choices among various difficulty levels, definitions
have to be made coordinating the basic parameters (F, I, c, and r) to the
assumed antecedents in achievement-related situations.
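One consequence of these relations can be checked numerically. The following
minimal sketch is not the simulation program used by the authors. It assumes a
single activity that is expressed continuously, so that its tendency changes
at the rate F - c·T, and it integrates that rate in small time steps. The
function name, the step size, and the parameter values are illustrative
assumptions. Under these assumptions the tendency levels off at F/c, the
asymptotic level referred to later in the text.

```python
def simulate_tendency(F, c, T0=0.0, dt=0.01, steps=5000):
    """Euler integration of dT/dt = F - c*T for one continuously expressed activity.

    F  : instigating force (assumed constant)
    c  : consummatory value of the activity
    T0 : initial strength of the action tendency
    """
    T = T0
    history = [T]
    for _ in range(steps):
        C = c * T            # consummatory force, C = c*T
        T += (F - C) * dt    # instigation raises T, consummation lowers it
        history.append(T)
    return history


# Hypothetical values: F = 2.0 and c = 0.5 give an asymptote of F/c = 4.0.
trace = simulate_tendency(F=2.0, c=0.5)
final_strength = trace[-1]
```

A tendency that starts near zero rises quickly at first and then more slowly
as consummation catches up with instigation. A full simulation of choice among
several difficulty levels would additionally need the negaction tendencies and
the competition among activities described in the quoted passage.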


Coordinating Definitions 


The instigating force (F_i) to choose Difficulty Level i is defined by the
product of the motive to succeed (Ms), the subjective probability of success
at that level (Ps_i), and the complement of Ps_i (1 - Ps_i):

$$F_i = M_s \times P_{s_i} \times (1 - P_{s_i}) \tag{1}$$

Equation 1 differs from the original theory of achievement motivation
(Atkinson, 1957; Atkinson & Feather, 1966) in relating the product of
motivational determinants to a parameter controlling the rate of increase
(F_i) rather than the absolute strength (T_i) of a motivational tendency (cf.
Atkinson & Birch, 1970, 1974).

Similarly, the inhibitory force (I_i) controlling the rate of increase in the
tendency to avoid working on Difficulty Level i is defined as

$$I_i = M_f \times (-P_{s_i}) \times (1 - P_{s_i}) \tag{2}$$

where Mf describes the motive to avoid failure.
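As a hedged illustration of the two coordinating definitions, the sketch below
evaluates the instigating force and the magnitude of the inhibitory force at
the five probability levels used later in the experiment. The motive strengths
(2.0 and 1.0) are arbitrary illustrative values, the function names are
hypothetical, and treating the inhibitory force as a magnitude (ignoring the
sign of the incentive term) is an assumption of this sketch.

```python
def instigating_force(m_s, p_s):
    """Equation 1: F_i = M_s * P_s * (1 - P_s)."""
    return m_s * p_s * (1.0 - p_s)


def inhibitory_force_magnitude(m_f, p_s):
    """Magnitude of Equation 2: M_f * P_s * (1 - P_s)."""
    return m_f * p_s * (1.0 - p_s)


# Subjective probabilities of success for the five difficulty levels.
levels = [0.83, 0.67, 0.50, 0.33, 0.17]

# Illustrative motive strengths (assumed, not taken from the study).
forces = [instigating_force(2.0, p) for p in levels]
inhibitions = [inhibitory_force_magnitude(1.0, p) for p in levels]

# The level with the strongest instigating force.
strongest = levels[forces.index(max(forces))]
```

Both forces peak at the intermediate level and fall off symmetrically toward
the very easy and very difficult levels, so they differ only in the motive
that scales them.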

Consummatory value (c), that is, the rate of reduction in the tendency to
continue choosing a given level of difficulty, is assumed to be greater for
success than for failure (cs > cf). This is consistent with results obtained
by Weiner (1965) showing that for success-oriented subjects, level of
performance and persistence are greater following failure than following
success. The average consummatory value (c̄) in a series of repeated attempts
at the same level of difficulty may be assumed to be a function of the
objective probability of success (ps):

$$\bar{c} = c_s p_s + c_f p_f \tag{3}$$


In the experiment to be described below, objective and subjective probability
of success are assumed to be identical.

Similarly, it is assumed here that resistance value (r), which controls the
reduction of negaction tendencies, varies as a function of success and
failure. A theoretical assumption would be that resistance as a result of fear
of failure is reduced after success to a greater extent than after failure.
This is consistent with results suggesting that failure-oriented subjects show
better performance and greater persistence after success than after failure
(Weiner, 1965). The average force of resistance (r̄) in a series of repeated
attempts at the same level of difficulty may be assumed to be a function of
the objective probability of success (ps):

$$\bar{r} = r_s p_s + r_f p_f \tag{4}$$
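The two averaging rules have the same form: the value that applies after
success and the value that applies after failure are weighted by the
probabilities of those outcomes. The sketch below is a minimal illustration
with a hypothetical function name and made-up values (0.3 after success, 0.1
after failure). It takes the probability of failure to be the complement of
the probability of success.

```python
def average_value(value_after_success, value_after_failure, p_success):
    """Outcome-weighted average, the common form of Equations 3 and 4."""
    p_failure = 1.0 - p_success
    return value_after_success * p_success + value_after_failure * p_failure


# Illustrative values only: a larger value after success than after failure.
levels = [0.83, 0.67, 0.50, 0.33, 0.17]
averages = [average_value(0.3, 0.1, p) for p in levels]
```

When the value after success exceeds the value after failure, the average is
larger at easy levels than at difficult ones, which is the property the text
relies on when it explains the shift toward more difficult tasks.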


Computer simulations of the dynamic theory using Equations 3 and 4 for
defining average c and r values, respectively, resulted in predictions that
were equivalent to those obtained in trial-by-trial simulations using the
appropriate cs or cf value after each success or failure outcome, respectively
(Kuhl & Blankenship, in press). It was concluded from this result that
Equations 3 and 4 may be used to derive predictions from the dynamic theory.


Predicted Change in Risk Preference 


Equations 1 to 4 may be used to derive predictions concerning changes in risk
preference in a series of free choices from a given set of different
difficulty levels. Though mathematical derivations are not possible because of
the complexity of the theory and the situation it is applied to, the
implications of the theory for a given situation may be found by computer
simulations of the theory. Simulations of the dynamics of action yielded the
following hypotheses when the basic parameters (F_i, I_i, c_i, r_i) were
defined as suggested by Equations 1 to 4. A description of the simulation
program (Bongort, Note 1), the simulation procedure, and the simulated results
can be found elsewhere (Kuhl & Blankenship, in press).
Hypothesis 1. For success-oriented subjects (Ms > Mf), the slope of the
regression line describing preferred difficulty (1 - Ps) as a function of time
(trials) is greater than zero, indicating a gradual shift to more difficult
tasks.

Hypothesis 2. The intercept of this regression line does not differ from the
intermediate difficulty level (Ps = .5). This indicates a preference for
intermediately difficult tasks (Ps = .5) at the first trial when Ms > Mf.

Hypothesis 3. For failure-oriented subjects (Mf > Ms), the slope of the
regression line describing preferred difficulty (1 - Ps) as a function of time
(trials) is greater than zero, indicating a gradual shift to more and more
difficult tasks.

Hypothesis 4. The intercept of this regression line is at a difficulty level
lower than the intermediate one (Ps > .5). This indicates a preference for
tasks easier than intermediately difficult at the first trial when Mf > Ms.

Hypothesis 5. For subjects with low motivation (Ms ≈ Mf), the slope of the
regression line describing preferred difficulty (1 - Ps) as a function of time
(trials) does not differ significantly from zero.

Hypothesis 6. The mean slope of the regression line is greater for success-
than for failure-oriented subjects.

These simulated predictions were confirmed by trial-by-trial simulations using
specific cs or cf values after each success or failure outcome rather than
average c and r values based on Equations 3 and 4.
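The hypotheses are stated in terms of the slope and intercept of a regression
line relating preferred difficulty to time. As a minimal sketch of that
computation, the code below fits an ordinary least-squares line to one
subject's preferred difficulty over 10 blocks of trials. The data are invented
for illustration, the function name is hypothetical, and the intercept is the
fitted value at block 0 under this coding of time, which may differ from the
coding used in the original analysis.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented preferred Ps values for 10 successive blocks of trials.
preferred_ps = [0.67, 0.67, 0.50, 0.50, 0.50, 0.33, 0.50, 0.33, 0.33, 0.17]
difficulty = [1.0 - p for p in preferred_ps]   # preferred difficulty, 1 - Ps
blocks = list(range(1, 11))

slope, intercept = fit_line(blocks, difficulty)
```

A positive slope on the difficulty scale corresponds to the predicted shift to
more difficult tasks. The same line fitted to Ps itself would have a slope of
equal size and opposite sign.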


Initial Risk Preference 


Hypothesis 2 corresponds to the well- 
known implication of the original theory of 
achievement motivation: Success-oriented 
subjects prefer intermediately difficult tasks 
(Atkinson, 1957). For failure-oriented sub- 
jects, the simulations of the dynamics of 
action yielded ambivalence between the easi- 
est and the most difficult task (Kuhl & 
Blankenship, in press), as predicted by the 
original theory (Atkinson & Feather, 1966). 
This prediction, however, described the very 
first trials only. After the first few trials, a 
clear-cut preference for the easiest difficulty 
level is predicted because of the reduction
of negaction tendency, which is assumed to
be greater for easier than for more difficult
tasks when rs > rf (Equation 4). Since the
hypotheses are based on the assumption of
stabilized Ps values, a practice period is
necessary before the beginning of the main phase
of the experiment. Hypothesis 4 states the 
prediction for the main phase, assuming that 
the short period of ambivalence between ex- 
treme difficulty levels has already been fin- 
ished during the practice period (Kuhl & 
Blankenship, in press). 


Temporal Change in Risk Preference 


Hypotheses 1 and 3 contain the main implications to be tested in this
experiment. In contrast to the original theory, a shift to more and more
difficult tasks is predicted for both success- and failure-oriented subjects
even when Ps is assumed to have stabilized at the end of the practice period.
This is explained as a result of consummation, which is assumed to be greater
at easy than at more difficult tasks when cs > cf (Equation 3).
Hypothesis 3 is also in contrast to predic- 
tions derived from a recent application of 
some concepts of the dynamics of action 
(Revelle & Michaels, 1976). Based on a mis- 
taken interpretation of the coordinating defi- 
nitions made in the dynamics of action (At- 
kinson & Birch, 1970), Revelle & Michaels 
(1976, p. 396) predict that failure-oriented 
subjects will show a strategy that is the re- 
verse of what is expected from success-ori- 
ented subjects (Kuhl & Blankenship, in 
press). According to the dynamics of action, 
failure-oriented subjects show—like success- 
oriented subjects—a trend to increasingly 
more difficult tasks, because resistance due 
to negaction tendencies is more quickly re- 
duced for easier tasks than for more difficult 
tasks (cf. Equation 4). As a result, a tend- 
ency to choose more and more difficult tasks 
will emerge, after a delay that is propor- 
tional to the time required to overcome ini- 
tial resistance. 


Forced Choice 


Revelle & Michaels’s discussion of a spe- 
cial case of the dynamics of action is based 
on the assumption of continuous operation 
of consummatory forces, as if a subject is 
working on all difficulty levels at the same 
time. This is, of course, an unrealistic as- 
sumption for a situation in which a person 
makes a choice of one among tasks that differ 
in difficulty. The ideal case of working on 
all difficulty levels at the same time can be 
approximated, however, by forcing the sub- 
ject to work successively on all difficulty 
levels included in the experiment before 
repeating the series. Such a “forced-choice” 
situation is introduced in the experiment de- 
scribed below. 

For this experimental condition, the dynamics of action as well as Revelle &
Michaels’s special case derived from it predict a trend to increasingly more
difficult tasks, provided initial tendency strengths (T_i) are close to zero.
When a considerable number of practice trials precede the main phase of the
experiment, however, both theories predict this trend to weaken to the extent
that the T_i's approach the respective asymptotic levels. This change in trend
can be seen from Figure 1, which is based on a computer simulation of the
forced-choice condition using initial tendency strengths markedly above zero.
These predictions are in contrast to what is expected for the free-choice
condition: Computer simulations showed that in this situation, the dynamics of
action predict the trend to more difficult tasks, both when the T_i's are
close to zero and when the T_i's are close to the respective (F/c) value
(asymptotic level; Kuhl & Blankenship, in press).

JULIUS KUHL AND VIRGINIA BLANKENSHIP

The reason for this implication of the theory seems to be related to a
positive-feedback effect typical of the free-choice condition. Under
free-choice conditions, the strength of a tendency rises considerably higher
than the asymptotic level it would approach if the expression of any competing
tendency were prevented: Whenever a tendency is in the “dormant” state (i.e.,
another tendency has gained dominance), it is no longer a function of
consummatory force (the necessary determinant of its asymptotic level in the
dormant state) and may, as a result, rise above its asymptotic level if the
latter is lower than the momentary strength of the dominant tendency. In sum,
the dynamics of action predict a stronger trend to more difficult tasks in the
free-choice than in the forced-choice condition, whereas Revelle & Michaels’s
(1976) theory predicts no interaction, since it is limited to the
forced-choice paradigm.


Method 
Subjects 


Forty male and 37 female subjects were obtained from the introductory
psychology course at the University of Michigan (Ann Arbor) during the 1976
fall semester. Participation in the psychology experiment was a requirement of
the course. All 77 subjects were tested in group sessions. Of these, 64
subjects (32 males, 32 females) were subsequently selected at random and
assigned at random to the two experimental groups for individual testing.

Figure 1. Simulated choice tendencies for the five probability of success (Ps)
levels for a forced-choice situation (horizontal axis: trials).


Group Sessions 


The two experimenters administered a Thematic Apperception Test (TAT;
McClelland, Atkinson, Clark, & Lowell, 1953) using sex-specific verbal cues
from Horner (1974) and a short form of the Test Anxiety Questionnaire (TAQ;
Mandler & Sarason, 1952) in three group sessions that varied in size from 19
to 29 subjects. The TAT Need for Achievement Scale (n Ach) was administered
with standard neutral instructions.


Individual Sessions 


Procedures for both experimental groups (forced choice and free choice) were
the same in the first part of the individual sessions. The subjects were told
that it was the purpose of the experiment to obtain information about how
people feel about a series of tasks. In order to give knowledgeable opinions,
they would have to work on a series of tasks first. Subsequently, subjects
were instructed in the task, which was a modified version of a perceptual
reasoning task used by Feather (1968). Figure 2 illustrates an example of the
task, which has been studied extensively by Kuhl (1977).

The object of the task is to connect the numbered dots in order with an
imaginary line starting at the lowest-numbered dot and proceeding to the
highest-numbered dot. The correct answer to the task is the number of units
(line segments) in the shortest line that will connect the dots in sequential
order without violating any of the following rules: (a) The line may not
cross, touch, or retrace itself. (b) The borderline of the grid may not be
touched or traced. (c) Only the lines within the grid may be traced.
Subjects were told that the tasks were similar to 
items on typical IQ tests and that their opinions 
were being sought concerning the difficulty of the 
tasks, problems they saw regarding the tasks, and 
so on, which would be elicited on a questionnaire 
to be administered at the end of a prolonged ses- 
sion during which the subjects would gain sufficient 
familiarity with the tasks. 

Figure 2. Example of a “perceptual reasoning” task.

One puzzle from each difficulty level was solved verbally by the subjects so
that the experimenter could check the subjects’ understanding and application
of the rules. Characteristics of the task were discussed to avoid sudden
changes in solution time due to learning. To further reduce the effects of
learning on the probability levels to be established, subjects were presented
with 10 tasks from each difficulty level and urged to work at tasks from each
level until they felt that performance at that level had stabilized and
improvement was no longer forthcoming. In addition, the experimenter timed the
subjects and was able to verify the stabilization of solution times at each
level when the solution times for three consecutive trials varied no more than
±4 sec.
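The stabilization criterion can be written down as a simple check. The sketch
below is one possible reading, not the authors' procedure: it assumes that
"varied no more than ±4 sec" means that each of three consecutive solution
times lies within 4 sec of the mean of those three times. The function name
and the example times are hypothetical.

```python
def first_stable_trial(times, window=3, tolerance=4.0):
    """Return the 1-based trial number at which the criterion is first met.

    The criterion is met when every time in the last `window` trials lies
    within `tolerance` seconds of the mean of those trials. Returns None if
    the criterion is never met.
    """
    for end in range(window, len(times) + 1):
        recent = times[end - window:end]
        center = sum(recent) / window
        if all(abs(t - center) <= tolerance for t in recent):
            return end
    return None


# Invented solution times (in seconds) for successive practice trials.
practice_times = [60, 48, 40, 37, 35, 36]
stable_at = first_stable_trial(practice_times)
```

With these invented times the criterion is first met at the fifth trial, once
the large early improvements have leveled off.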

When the performance on all five difficulty levels had stabilized, the
experimenter selected for the upcoming probability-establishing trials a time
limit that would maximize the chances of inducing probabilities of .8, .6, .5,
.3, and .1 for the respective difficulty levels. Six tasks from each of the
five difficulty levels were presented, and if necessary, slight adjustments
were made in the time limit to produce 5, 4, 3, 2, and 1 successes at the
respective difficulty levels. The answers were then checked in front of the
subject, and the corresponding probabilities, .83, .67, .50, .33, and .17,
were computed. These odd and asymmetric probabilities were chosen to maximize
credibility of personal probabilities. Labels with these personal probability
values were placed in view next to the stacks of corresponding tasks.
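The personal probabilities follow directly from the number of successes out of
six tasks at each level. The short sketch below reproduces that arithmetic and
the resulting average across levels. The variable names are illustrative.

```python
TASKS_PER_LEVEL = 6

# Successes produced at the five difficulty levels, easiest level first.
successes = [5, 4, 3, 2, 1]

# Proportion of successes at each level, rounded to two decimals.
probabilities = [round(k / TASKS_PER_LEVEL, 2) for k in successes]

# Average induced probability across the five levels.
average_probability = sum(successes) / (TASKS_PER_LEVEL * len(successes))
```

The average across the five levels is exactly one half, which is the induced
average probability of .50 used as the reference value in later analyses.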


Free-Choice Condition 


The next phase of the experiment varied, depending on whether the subject had
randomly been assigned to the free-choice situation or the forced-choice
situation.

[...] level, and could switch as often as desired. The subject was instructed
to indicate when a choice had been made so that the experimenter [...] all
difficulty levels [...]

If the subject finished before the time limit was up, he or she recorded the
answer, transferred the puzzle to a box on the table, and then turned the
sheet of paper behind the puzzle over, revealing the correct answer. The
subject [...] was called before an answer was obtained, the subject drew a
line through that trial, indicating failure. The subject then deposited [...],
along with the puzzle, in the box and made the choice for the next trial. At
the start of this group of 50 trials, the experimenter turned his or her chair
so that it faced away from the subject. This was explained as being necessary
to keep the experimenter from being distracted from timekeeping. The purpose
was to minimize social-desirability factors by making it clear to the subject
that the experimenter could not know which difficulty level had been chosen.
Only after the subject had left the room did the experimenter learn which
choices had been made, by examining the sequence of puzzles that had been
deposited in the box.


Forced-Choice Condition 


In the forced-choice situation, the subject was told that 50 more trials were
to be completed in 10 groups of five tasks, one from each difficulty level.
Subjects were shown tables of random permutations and told that the order of
the difficulty levels in each group of 5 trials would be random. Again, the
experimenter turned his or her chair to face away from the subject. The
subject worked the puzzles with the modal time limit established in the
probability-establishing trials, checking the answers with the keys behind the
puzzles in the same way as the free-choice subjects did. At the end of 5
trials the subject was asked to indicate on the answer sheet which level of
task (easy, moderately easy, intermediate, moderately difficult, or difficult)
would be chosen if a free choice were allowed. It was emphasized that this
choice would not be acted upon. Each subject in the forced-choice situation
made 10 of these choices.
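The forced-choice schedule, 10 groups of five tasks with each difficulty level
appearing once per group in random order, can be sketched as follows. The
original study used printed tables of random permutations. The pseudorandom
generator, the seed, and the function name here are assumptions made only for
illustration.

```python
import random


def forced_choice_schedule(levels, n_blocks=10, seed=None):
    """Build n_blocks random orderings, each containing every level once."""
    rng = random.Random(seed)
    return [rng.sample(levels, len(levels)) for _ in range(n_blocks)]


# Probability-of-success labels for the five difficulty levels.
levels = [0.83, 0.67, 0.50, 0.33, 0.17]
schedule = forced_choice_schedule(levels, seed=1979)
total_trials = sum(len(block) for block in schedule)
```

Because every block is a permutation of the same five levels, each subject
works on every difficulty level equally often, which is what lets this
condition approximate continuous work on all levels.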


The Questionnaire


At the end of the last 50 trials, subjects in both the free-choice and
forced-choice conditions were given the questionnaire that had been mentioned
earlier. The main purpose of most questions was to maintain the credibility of
what subjects had been told earlier about the purpose of the experiment. The
questions referred to standard setting, expended effort, and the expectation
of success. The question regarding the predicted number of successes given 10
more trials at each difficulty level is particularly pertinent to this study,
since it was used to select those subjects for whom the manipulation of
situational factors (time limit, success and failure feedback, etc.) was
successful in inducing and maintaining the personal probabilities established
with the subject in the early portion of the experiment. If the average of the
probabilities reported by the subject on the questionnaire was more than .1
higher or lower than the average of the established probabilities (.50), the
subject was excluded from further analysis. On the basis of this criterion,
25% of the subjects were excluded, 4 because of too low and 12 because of
too high subjective probabilities.
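The exclusion rule can be expressed as a short check. The sketch below assumes
that each subject reports a predicted number of successes out of 10 further
trials at each of the five levels and that these counts are converted to
probabilities by dividing by 10. The function name and the example responses
are hypothetical.

```python
def is_retained(predicted_successes, trials_per_level=10,
                induced_mean=0.50, tolerance=0.10):
    """Return True if the mean reported probability is within the tolerance.

    A subject is excluded when the mean reported probability is more than
    `tolerance` above or below `induced_mean`.
    """
    probabilities = [n / trials_per_level for n in predicted_successes]
    mean_probability = sum(probabilities) / len(probabilities)
    # A small epsilon guards against floating-point error at the boundary.
    return abs(mean_probability - induced_mean) <= tolerance + 1e-9


# Invented questionnaire responses, easiest level first.
calibrated = is_retained([8, 7, 5, 3, 2])       # mean .50
too_optimistic = is_retained([10, 9, 8, 6, 5])  # mean .76
too_pessimistic = is_retained([4, 3, 2, 1, 0])  # mean .20
```

Applied to 64 subjects, excluding 16 of them (4 too low, 12 too high)
corresponds to the 25% reported in the text.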


Treatment of the Data


Only after the individual sessions had been completed were the TAT protocols
scored for n Ach by the second author, who had acquired a reliability
correlation of .88 with expert-scored testing materials (Atkinson, 1958). The
use of code numbers on the TAT protocols and TAQs further decreased the
possibility of experimenter knowledge of subject classification during
individual sessions. The TAQ was scored as described by Mandler and Cowen
(1958) except that 5 intervals instead of 10 were used. The rank order
correlation between TAT and TAQ scores was -.09 (n = 64), a nonsignificant
relationship. Because there was a significant difference between the male and
female scores on the TAQ, F(1, 60) = 9.53, p < .01, data were analyzed
separately for males and females. Within each sex, subjects were ranked on
both measures, with the lowest score given a rank of 1, and so on. Differences
between ranks were obtained by algebraically subtracting the subject's rank on
the TAQ from his or her rank on n Ach. Those for whom this calculation
produced a high positive number were relatively higher on n Ach than on test
anxiety and were considered more nearly Ms > Mf individuals. Subjects with
resulting high negative rank differences were relatively higher on test
anxiety than on n Ach and were considered more nearly Mf > Ms individuals.
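The rank-difference classification can be sketched in a few lines. The example
below uses invented scores for four subjects of one sex and a hypothetical
ranking helper that does not handle tied scores. How ties were resolved in the
original analysis is not stated, so that simplification is an assumption.

```python
def ranks(scores):
    """Rank scores so that the lowest score receives rank 1 (no tie handling)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_differences(n_ach_scores, taq_scores):
    """Rank on n Ach minus rank on the TAQ, computed within one sex."""
    return [a - b for a, b in zip(ranks(n_ach_scores), ranks(taq_scores))]


# Invented scores for four subjects of the same sex.
n_ach = [2, 9, 5, 7]
taq = [30, 12, 25, 18]
differences = rank_differences(n_ach, taq)
```

Positive differences mark subjects who rank relatively higher on n Ach than on
test anxiety (more nearly success oriented), and negative differences mark the
reverse.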


Results 
Stability of Ps 


To check the assumption of stabilized subjective probability of success during
the main phase of the experiment, the mean Ps values were computed from Ps
reported at the end of the experiment for each difficulty level. The Ps values
were computed for each group of subjects and averaged across the five
difficulty levels. As shown in Table 1, the mean Ps values are very close to
the average subjective probabilities (Ps = .50) induced at the five difficulty
levels prior to the main phase. A three-way analysis of variance did not show
any significant differences in Ps between groups. These findings were
supported by separate analyses of the difficulty levels using the Newman-Keuls
test. None of the tests performed indicated a significant increase in
subjective probability over the respective induced value.


Table 1
Mean Probability of Success Reported at the End of the Experiment

                 Free choice                  Forced choice
Group            M             SD             M             SD
Males
  Ms > Mf        .50           .06            .39a          .04
  Mf > Ms        .46           .04            .52           .06
Females
  Ms > Mf        [illegible]   .10            [illegible]   [illegible]
  Mf > Ms        .52           .06            .50           [illegible]

Note. Ms = motive to succeed; Mf = motive to avoid failure.
a Departs significantly (.05 level) from induced average probability of
success (Ps = .50).


Ps Preferences


Because frequency of response is commonly considered an index of motivational
preference (e.g., Atkinson & Birch, 1970), the mode of Ps values associated
with the preferred difficulty levels was determined for each successive block
of five trials from the free-choice situation, and the slope and intercept of
the regression line describing Ps preference as a function of time (trials)
were computed. This was done for each person within each of the four groups in
the free-choice situation. A four-way analysis of variance (Sex × Choice ×
Motive × Blocks) performed on preferred Ps values revealed a significant main
effect of the trial blocks, F(9, 360) = 4.6, p < .0001, and a significant
interaction between choice conditions and trial effect. A trend analysis
performed separately for the free-choice and the forced-choice group revealed
a highly significant linear trend in the free-choice group, F(1, 207) = 36.7,
p < 10^-7, and a significant linear trend in the forced-choice group,
F(1, 207) = 5.1, p < .05. Both trends indicate the expected shift to
increasingly more difficult tasks during the course of the experiment. No
other trends were significant.
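The first step of this analysis, reducing a subject's 50 free choices to one
modal Ps value per block of five trials, can be sketched as follows. The
choice sequence is invented and shows only two blocks, and the function name
is hypothetical. How blocks without a unique mode were handled in the original
analysis is not stated. This sketch relies on the behavior of
`statistics.mode` in Python 3.8 and later, which returns the first mode
encountered.

```python
from statistics import mode


def block_modes(choices, block_size=5):
    """Most frequently chosen Ps value within each successive block of trials."""
    return [mode(choices[i:i + block_size])
            for i in range(0, len(choices), block_size)]


# Invented choices for the first two blocks of five trials.
choices = [0.50, 0.50, 0.50, 0.67, 0.33,
           0.33, 0.33, 0.50, 0.33, 0.17]
modes = block_modes(choices)
```

The resulting sequence of modal values, one per block, is the series to which
a regression line is then fitted for each subject.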

Figure 3 shows the mean preferred Ps levels for the 10 sets of five trials for
success- and failure-oriented males. The mean slope of the regression line
describing the linear trend to more difficult choices was M = .43 in the
success-oriented male group and M = .27 in the failure-oriented male group
(cf. Table 2). The two coefficients depart significantly from zero, as was
expected by Hypotheses 1 and 3; t(5) = 3.46, p < .01, for success-oriented
males, and t(5) = 4.80, p < .0025, for failure-oriented males. Similar results
were obtained for female subjects (cf. Figure 4). The mean slope of the
regression line is M = .26 for success-oriented females and M = .22 for
failure-oriented females. The first value approaches statistical significance,
t(5) = 1.55, p < .09, whereas the second value is significantly above zero,
t(5) = 2.59, p < .025. The differences in mean slopes between male and female
groups are not significant; t(10) = 1.24, ns, for success-oriented subjects,
and t(10) = [illegible], ns, for failure-oriented subjects. The mean slopes
from pooled male and female groups are M = .32 for success-oriented subjects
and M = .31 for failure-oriented subjects. Both values depart significantly
from zero, t(11) = 3.51, p < .0025, and t(11) = 4.43, p < .0005, respectively.

Figure 3. Mean preferred probability of success (Ps) level as a function of
motive type (Ms = motive to succeed; Mf = motive to avoid failure) and trials
(free choice, males).

Table 2
Mean Slopes of Regression Lines for Eight Subject Groups

                 Free choice          Forced choice
Group            Male     Female      Male     Female
Ms > Mf          .43*     .26         .03      .23
Mf > Ms          .27*     .22*        -.06     .01

Note. n = 6 in each cell. Ms = motive to succeed; Mf = motive to avoid failure.
* Significantly above zero (.05 level).

The variance of the probabilities preferred by failure-oriented subjects on
the first trial of the main phase does not differ significantly from the
respective variance for success-oriented subjects, as would be expected if
there were any ambivalence between low and high difficulty in failure-oriented
subjects, F(5, 5) = 3.02, ns.
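The slope tests reported in this section compare the mean of a group's
individual regression slopes with zero. A minimal sketch of such a one-sample
t statistic is given below. The six slope values are invented, the function
name is hypothetical, and the sketch stops at the t statistic and its degrees
of freedom rather than computing a p value.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu=0.0):
    """One-sample t statistic and degrees of freedom for H0: mean == mu."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)   # stdev uses n - 1 in the denominator
    return (mean(values) - mu) / standard_error, n - 1


# Invented slopes for the six subjects of one cell.
slopes = [0.2, 0.4, 0.6, 0.4, 0.2, 0.6]
t_value, df = one_sample_t(slopes)
```

With six subjects per cell the test has 5 degrees of freedom, matching the
t(5) values reported for the individual groups. Pooling two cells gives 12
subjects and 11 degrees of freedom.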


The hypotheses regarding preferred difficulty level at the beginning of the
main phase of the experiment (Hypotheses 2 and 4) were confirmed. As can be
seen from Table 3, the mean intercepts of the regression line are .63 and .59
for success-oriented males and females, respectively. These values do not
differ significantly from the expected value (Ps = .5); t(5) = 2.37, ns, for
males, and t(5) = 1.02, ns, for females. For failure-oriented subjects, the
mean intercepts are .65 for males and .71 for females. These values are, as
expected (Hypothesis 4), significantly above the intermediate Ps level;
t(5) = 3.37, p < .01, for males, and t(5) = 3.18, p < .05, for females.
Similar results were obtained when first-trial data rather than intercepts
were analyzed: The mean Ps level chosen at the first trial was .57 for
success-oriented subjects, which does not differ significantly from the
expected value (.5); t(11) = 1.7, ns. For failure-oriented subjects, the mean
Ps level chosen at the first trial was .66. This value is, as expected
(Hypothesis 4), significantly higher than the intermediate Ps level;
t(11) = 4.17, p < .001. The difference in mean Ps level between motive types
approaches significance, t(22) = 1.53, p < .07 (cf. Hypotheses 3 and 4).

Table 3
Mean Intercepts of Regression Lines for Eight Subject Groups

                 Free choice          Forced choice
Group            Male     Female      Male     Female
Ms > Mf          .59      .63         .44      .61
Mf > Ms          .65*     .71*        .49      .57

Note. n = 6 in each cell. Ms = motive to succeed; Mf = motive to avoid failure.
* Significantly different from probability of success (Ps) = .50.

Figure 4. Mean preferred probability of success (Ps) level as a function of
motive type (Ms = motive to succeed; Mf = motive to avoid failure) and trials
(free choice, females).


Forced Choice


The mean Ps preferences reported after each block of five tasks in the
forced-choice condition are shown in Figure 5 for the four groups according to
motive type and sex. The mean slope of the regression line describing the
linear trend in changes of risk preference over time is M = .03 in the
success-oriented male group and M = -.06 in the failure-oriented male group.
Neither value differs significantly from zero, t(5) = .57, ns, and
t(5) = -.64, ns, respectively. The mean slopes of the regression lines in the
female groups were M = .23 for success-oriented females and M = .01 for
failure-oriented females. Neither value differs significantly from zero,
t(5) = .6, ns, and t(5) = .21, ns, respectively. The mean intercepts do not
differ significantly from Ps = .5 in any of the four groups of the
forced-choice condition (cf. Table 3).

Figure 5. Mean preferred probability of success (Ps) level as a function of
motive type (Ms = motive to succeed; Mf = motive to avoid failure), sex, and
trials (forced choice).


Discussion


The results from the free-choice condition
support the predictions derived from the
dynamics of action (Atkinson & Birch, 1970):
If one allows subjects to actually choose their
preferred difficulty levels, they show the
predicted shift from intermediate (if Ms > MF)
or easy (if MF > Ms) to more and more
difficult tasks. The fact that this effect is most
pronounced in the success-oriented male group
and least pronounced in the success-oriented
female group is what is expected when the
difference between the sexes in anxiety scores
is taken into account: Females scored
significantly higher on the TAQ than males.
Consequently, it may be assumed that the males
with TAT-TAQ rank above the median are
more characteristic of high n Ach, low-anxiety
individuals and that the males with
TAT-TAQ rank below the median are more
characteristic of low n Ach, high-anxiety
individuals. Females with TAT-TAQ rank above
the median may be assumed to have low
motivation (Ms ≈ MF), since their positive
motivation may be dampened by negative
motivation, which is significantly greater for
females than for males. According to this
argument, the insignificant trend obtained by
success-oriented females is in line with
Hypothesis 5.

The results suggest that there are factors
influencing temporal change in risk preference
that cannot be explained by changes in
Ps. The important implication of this conclusion
is that the shift to harder tasks on repeated
trials cannot be explained by traditional
episodic theories of achievement motivation
(Atkinson, 1957; Heckhausen, 1977;
Weiner et al., 1971), because subjective
probability of success (Ps) was held constant
throughout the main phase of the experiment.
This is suggested by average Ps judgments
obtained at the end of the experiment (Table
1). In the free-choice condition, none of the
reported Ps values measured at the end of
the experiment differs significantly from the
average induced value of .50. This is what
was expected on the basis of the assumption
that Ps values were stable over trials during
the main phase. The correspondence between
objective initial Ps values and reported final
Ps values does not, however, exclude initial
discrepancies between objective and subjective
probabilities, nor does it exclude increments
of latent (not reported) probabilities.

We are facing an old and fundamental
methodological problem here: Repeated
measurements of subjects’ reports do not solve the
problem, because they may constitute reactive
conditions destroying the validity of the
conclusions. The application of objective
methods for assessing variables requires, in
most cases, more knowledge about the
relationship between objective indexes and their
subjective determinants than we actually
have. Accordingly, our method of checking
the assumption of stable Ps values can be
viewed as a preliminary compromise only.
Yet, the results (Table 1) do show that one
possible attempt to falsify the assumption of
stable Ps values failed. The conclusiveness of
studies attempting to check hypotheses
like ours may be increased if we succeed
in developing objective (rather than
self-report) techniques of assessing subjective
probabilities. A measure of decision time has
already yielded some promising results in this
direction (Schneider, 1973).

In spite of the encouraging results dis- 
cussed so far, the validity and generality of 
the conclusions still have to be investigated 
in further studies. Results obtained from 
similar experiments with different tasks and 
different types of subjects are consistent, 
though, with the present data in suggesting 
that there are reasons for the tendency to 
choose increasingly more difficult tasks that 
cannot be explained on the basis of respective 
shifts in expectancy of success alone (Schnei- 
der & Posse, 1978; Kuhl, Note 2). These 
studies also demonstrate that the obtained 
results cannot be explained by effects due to 
the spatial arrangement or order of presenta- 
tion of the various difficulty levels. The ex- 
planation offered by the dynamics of action 
is based on the assumption of greater consum- 
matory value of success than of failure, which 
leads to relatively quick reduction of tenden- 
cies to work on easy tasks because success is 
more likely than with difficult tasks.

The shift to harder tasks in spite of constant
Ps was also predicted by an application
of some concepts from the dynamics of
action by Revelle and Michaels’s theory 
(1976). This theory, however, sacrifices some 
of the predictive power of the original theory 
(Atkinson & Birch, 1970) by neglecting the 
motivational effects of success and failure on 
negaction tendencies in failure-oriented sub- 
jects. The fact that failure-oriented subjects 
show a significant though weaker trend to 
more difficult tasks contradicts Revelle and 
Michaels’s reverse-strategy hypothesis for this 
motive type. In addition, the limited dynamic 
theory includes the assumption of constantly 
active consummation of all competing ten- 
dencies, no matter whether a tendency is ex- 
pressed in behavior. This simplifying as- 
sumption has the implication that the theory 
can only be tested in a situation in which 
constantly present consummation of all ten- 
dencies is approximated by forcing the sub- 
jects to work on consecutive series consisting 
of tasks of all difficulty levels (Revelle, Note 
3). The significant interaction between choice
condition and trial effect (compare Figures 2
and 3 with Figure 4) stresses the predictive 
power of the original theory, which predicted 
the differential effects in both the free-choice 
and the forced-choice conditions. The gen- 
erality of the simplified dynamic theory is 
limited to the forced-choice situation. The 
main focus of the dynamics of action, how- 
ever, is directed towards the “ecologically 
more representative” (Brunswick, 1952) sit- 
uations allowing free expression of preferred 
tendencies over extended periods of time. 

A final word may be said about a phenom- 
enological interpretation of the results. The 
shift to more difficult tasks in spite of constant
Ps may be described in terms of a “cognitive
strategy” involving a shift to a higher
difficulty level after one or more successes on 
a given difficulty level have been experienced. 
Cognitive interpretations of behavioral data 
have been contrasted to “mechanistic” expla- 
nations (i.e., explanations that do not de- 
scribe cognitive mediators explicitly; Weiner, 
1972). According to our view regarding this 
controversy, the difference between a phe- 
nomenological and a functional explanation, 
as we prefer to call the one offered in this 
article, is not a difference between contra- 
dicting theoretical positions, but a difference 
between two epistemological perspectives on 
the same phenomenon. Accordingly, we be- 
lieve that the functional explanation offered 
by the dynamics of action in terms of dif- 
ferential consummatory values after success 
and failure is compatible with a phenomeno- 
logical explanation describing the cognitive 
processes that may be coordinated to the 
functional parameters discussed here.


Effects of Paced Respiration and Expectations on
Physiological and Psychological Responses to Threat


Kevin D. McCaul, Sheldon Solomon, and David S. Holmes 
University of Kansas 


While waiting to receive electric shocks, 105 males either (a) regulated their
breathing at one half the normal rate, (b) regulated their breathing at the normal
rate, or (c) did not regulate their breathing rate. Half of the subjects in each
breathing condition were told that their breathing task would aid them in relax- 
ing, whereas the other half were not given that expectation. Subjects in a no- 
threat condition were not threatened with shocks, did not regulate their breath- 
ing, and were not provided with expectations. The results indicated that slowing 
respiration rate reduced physiological arousal as measured by skin resistance and 
finger pulse volume (but not heart rate) and reduced self-reports of anxiety. 
Expectations did not influence arousal. These data provide evidence for the 
effectiveness of paced respiration as a coping strategy, and they resolve the con- 
flicting findings of previous investigations. 


Considerable attention is being devoted to 
determining the effectiveness of various strat- 
egies for controlling the physiological arousal 
and anxiety experienced during threatening 
situations. One such strategy involves the 
control of respiration. Controlling respiration 
may be effective for reducing physiological 
arousal because there are strong general re- 
lationships among the various physiological 
response systems (cf. Davies & Neilson, 
1967; Levenson, 1976; Obrist, Webb, & Sut- 
terer, 1969; Stroufe, 1971). Control over 
respiration may also be effective for reducing 
anxiety, because if it does reduce physiologi- 
cal arousal, there would be fewer cues to 
stimulate self-reported anxiety (Schachter, 
1964). 

Surprisingly, only two experiments have
been conducted specifically to test the effects
of controlled respiration on persons’ responses
to threat. In the first investigation (Harris,
Katkin, Lick, & Habberfield, 1976), subjects
participated in either (a) a paced-respiration
condition in which they were instructed to
synchronize their respiration to a light that
came on and went off at a rate of 8 cycles
per minute, a rate half that of the normal
resting respiration rate; (b) an attention
control condition in which they
counted the light cycles and signaled the
experimenter after every 10 cycles; or (c) a
baseline control condition in which they
were not given any instructions concerning
the flashing light. All subjects were
presented with two tones that they were told
might be followed by electrical shocks (the
first one was). The investigators reported
that when the first tone was presented,
subjects in the paced-respiration condition
evidenced smaller electrodermal responses
(though not smaller heart rate responses)
than did subjects in the other two conditions.
Regrettably, these results must be interpreted
with caution, because the investigation
suffered from a number of methodological and
statistical problems (see Holmes, McCaul, &
Solomon, 1978).

In the second investigation (Holmes et al.,
1978) subjects were first asked to rest for a
10-min. period during which their respiration
patterns were recorded. Subjects then
participated in either (a) a respiration-trac- 
ing condition in which they were shown a 
record of their resting respiratory patterns 
and were instructed to breathe in such a way 
that the pen tracing their present respiratory 
pattern traced the line representing their 
resting respiratory pattern—a procedure that 
required subjects to duplicate in all respects 
(rate, depth, and phasing of rate and depth) 
their resting respiratory activity; (b) an 
attention-tracing control condition in which
they were shown a record of their resting 
respiratory patterns and were instructed to 
manipulate a knob in such a way that a pen 
connected to the knob traced the line repre- 
senting their resting respiratory pattern; or 
(c) a no-tracing control condition in which 
they were not shown a record of their resting 
respiratory pattern and simply sat quietly. 
Once the respiration manipulation was com- 
pleted, half of the subjects in each condition 
were threatened with electrical shocks, whereas
the other half were not. Contrary to what
was expected, analyses of heart rate and
self-report data indicated that the respiration
manipulation was not effective for decreasing
arousal. (Unfortunately, apparently because
of technical problems with the equipment,
in this experiment the skin resistance measure
was insensitive to changes in arousal
and thus could not be used for testing the
hypothesis.) In contrast to the first experiment,
then, the second experiment offered
no evidence for the effectiveness of controlled
respiration for reducing arousal under
threatening conditions.

Apart from their results, the two experiments
discussed above differed in a number
of important respects; for example, (a) one
experiment employed a respiration rate that
was one half the normal resting rate, whereas
the other employed the normal resting rate;
(b) respiration was controlled with different
techniques; and (c) one experiment found
the predicted effects only with the skin
resistance measure, whereas that measure was
not usable in the other. Because of these
differences, it did not seem possible to resolve
the conflict in the results of the two experiments,
and thus it was not possible to draw a
firm conclusion concerning the effectiveness
of controlled respiration for reducing arousal
under threatening conditions.

The present experiment was conducted in 
an attempt to resolve the conflicts between 
the two previous experiments. Because it was 
felt that slowed respiration might be the 
crucial difference between the earlier experi- 
ments, in the present experiment subjects 
were threatened with painful electrical shocks 
while (a) their respiration rates were regu- 
lated at one half of the normal resting rate 
(slow-breathing condition); (b) their respi- 
ration rates were regulated at the normal 
resting rate (normal-breathing condition); 
or (c) their respiration rates were not regu- 
lated (nonregulated-breathing condition). It 
was predicted that subjects who had their 
breathing regulated at a slow rate would evi- 
dence less physiological arousal and report 
less anxiety than subjects who had their 
breathing regulated at a normal rate or sub- 
jects who did not have their breathing regu- 
lated. In addition, it was expected that 
breathing at a normal rate would not reduce 
arousal or anxiety relative to breathing at a 
nonregulated rate. 

Although respiration rate appears to repre- 
sent the chief difference between Harris et 
al. (1976) and Holmes et al. (1978), it is 
conceivable that the respiration manipula- 
tions used in those experiments also influ- 
enced subjects’ respiration depth. Unfortu- 
nately, data concerning respiration depth 
were not reported in either of those earlier 
studies. This issue is important because of
the results of several studies that have ex- 
amined the influence of paced respiration in 
nonthreatening situations. In one of those 
studies it was concluded that respiration rate 
per se influenced heart rate (Levenson, 
1976), but other investigations suggested 
either that rate has no effects (Engel & 
Chism, 1967; Epstein & Webster, 1975) or 
that it is the combination of changes in rate 
and depth that influences measures of arousal 
(Deane, 1965; Stroufe, 1971). Thus, even
though the experimental manipulation employed
in the present investigation was focused
on respiration rate, both rate and depth
were monitored.

In addition to examining respiration control,
this experiment also investigated the
possible effects of the expectations subjects 
hold about the coping strategy of respiration 
control. Because expectations concerning the 
effects of the breathing manipulations might 
have an influence on arousal (i.e., a placebo- 
like effect), half of the subjects were given 
explicit expectations that what they were do- 
ing would aid them in reducing arousal (high- 
expectations condition), whereas the other 
half were not given those expectations (no-
expectations condition).

So that the influence of the threat, respira- 
tion, and expectation manipulations could be 
assessed relative to a nonthreatened group 
whose respiration was not regulated, a con- 
trol condition was employed in which sub- 
jects were not threatened, breathing was not 
regulated, and expectations were not provided 
(no-threat control condition). 


Method 
Design 


This experiment employed a 3 (slow breathing,
normal breathing, nonregulated breathing) × 2 (high
expectations, no expectations) factorial design in
which all subjects were exposed to a threat, plus a
one-cell control condition (nonregulated breathing,
no expectations) in which subjects were not exposed
to a threat.


Subjects and Laboratory Facilities 


One hundred and ten male un 
University of Kansas 
the experiment as pai 
ment for a general 


procedures, and 
the threat conditions elected not 


1 ined a Beckman poly- 
graph, a Sony video system, and a tape record: 


rd physiological responses,
present visual materials, and present the instructions.


Procedure 


Upon arrival, the subject was told that the
experiment involved studying physiological responses
to various types of stimulation. After the subject
agreed to participate by signing an informed-consent
statement, he was seated, and the physiological
sensors were attached. Specifically, a Beckman
photoelectric plethysmographic finger transducer was
attached to the middle finger of the subject’s
nondominant hand; 20 × 25 mm monopolar Lafayette
chrome skin resistance electrodes were attached to
the second and fourth fingers of the same hand;
and a Beckman strain gauge was placed around the
subject’s chest.² In addition, a large electrode was
strapped to the subject’s left wrist. That electrode
was later described as a shock electrode to the
subjects in the threat conditions.

The subject was then asked to sit quietly and
relax for 5 min. After 5 min., the subject was
instructed to complete an 18-item anxiety checklist
that used adjectives taken from the Affect Adjective
Checklist (Zuckerman, 1960). For each adjective,
the subject checked a 4-point scale (1 = “not at
all”; 4 = “a great deal”) that reflected the degree
to which that adjective described how he felt at
that time. Responses to this checklist were used as
a measure of the subject’s initial level of anxiety.

Threat manipulation. After the subject completed
the anxiety checklist, the experimenter reentered the
room and gave the subject a second informed-consent
statement. The form given to subjects in the
threat conditions indicated that the stimulation being
studied was “painful electric shock” and
asked the subject to rate on a 61-point scale (1 =
“not at all”; 61 = “extremely”) how concerned he
was about receiving shocks. After the subject agreed
to continue participating and indicated his
concern about receiving shocks, the experimenter left
the room and started the audiotape that described
the shock procedure. The description indicated that
after the instructions were completed, a green signal
light would come on, indicating the beginning
of a 90-sec “waiting (anticipation) period” to be
followed by the onset of a red signal light indicating
the beginning of a 90-sec “shock (threat)
period,” during which up to three shocks would be
administered on a random schedule through the
electrode on the subject’s left wrist.


¹ Male subjects were used because of their
availability in the subject pool. It might be noted that
in the previous, similar experiment (Holmes et al.,
1978), no differences between males and females
were found on either physiological or self-report
measures. Therefore, there was little reason to
expect differences in this experiment.

² Skin resistance was recorded using constant
voltage (see Venables & Christie, 1973).

In contrast, the second informed-consent statement
given to each subject in the no-threat condition
indicated that the stimulation being studied
was “visual stimulation.” After the subject agreed 
to continue participating, the experimenter left the 
room and started an audiotape that described the 
procedure for receiving the visual stimulation. The 
description indicated that after the instructions were 
completed, a green signal light would come on, in- 
dicating the beginning of a 90-sec “waiting period” 
to be followed by the onset of a red signal light 
that would provide 90 sec of visual stimulation. 

Breathing manipulation. After receiving the in- 
structions for the shock or visual stimulation, the 
subject was directed to look at the video monitor 
mounted in front of him. In the slow-breathing and 
normal-breathing conditions, a Sony video recorder 
was used to project a sine curve that moved from 
right to left across the monitor. The subject was 
instructed to watch the curving line as it passed the 
red line that divided the screen and to breathe so 
that he inhaled as the line went up and exhaled as 
the line went down. In addition, the subject was 
admonished not to hold his breath or to take huge 
breaths, but to breathe slowly and smoothly through- 
out each curve. 

In the slow-breathing condition, the videotape
was recorded to regulate the subject’s respiration at
a rate of approximately 8 breaths per minute. This is
one half the normal rate, and it is the rate that
previous investigators have employed in studying
the effects of slowed breathing on arousal (see
Harris et al., 1976). In the normal-breathing condition,
the videotape was recorded to regulate the
subject’s respiration at a rate of approximately 16
breaths per minute. This rate was selected because
previous research indicated that this was the rate at
which subjects in similar experiments breathed when
not under stress (see Harris et al., 1976; Holmes et
al., 1978). Each subject in the nonregulated-breathing
conditions was simply instructed to watch a
test pattern on the video monitor. In all conditions,
the subject was instructed to continue the task
assigned to him until the end of the experiment.
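
As an illustration of the two pacing rates, the following is a minimal sketch of a sine pacing curve, assuming that one full cycle of the curve corresponds to one breath (inhale while the curve rises, exhale while it falls). The function names and the 10-Hz sampling rate are hypothetical choices for this sketch; the original study used a prerecorded videotape, not software.

```python
import math


def breath_period_sec(breaths_per_minute):
    """Seconds taken by one full breath (inhale plus exhale)."""
    return 60.0 / breaths_per_minute


def pacing_curve(breaths_per_minute, duration_sec, sample_hz=10):
    """Return (time, height) samples of a sine pacing curve.

    A rising height cues inhalation and a falling height cues exhalation.
    sample_hz is an assumed display rate, not a value from the study.
    """
    period = breath_period_sec(breaths_per_minute)
    n_samples = int(duration_sec * sample_hz)
    times = [i / sample_hz for i in range(n_samples)]
    return [(t, math.sin(2 * math.pi * t / period)) for t in times]


def count_peaks(curve):
    """Count local maxima, i.e., the number of completed inhalations."""
    heights = [h for _, h in curve]
    return sum(
        1
        for i in range(1, len(heights) - 1)
        if heights[i - 1] < heights[i] > heights[i + 1]
    )


# Slow condition: about 8 breaths per minute; normal condition: about 16.
slow_curve = pacing_curve(8, duration_sec=90)
normal_curve = pacing_curve(16, duration_sec=90)
```

Under these assumptions a slow-condition breath lasts 7.5 sec and a normal-condition breath lasts 3.75 sec, so a 90-sec period contains twice as many paced breaths in the normal condition as in the slow condition.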

Expectation manipulation. Half of the subjects in
each breathing condition were assigned to the
high-expectations condition. After practicing the breathing
task or watching the test pattern for approximately
[...], they were told that their task was effective
for reducing stress; that is, the subject was told
either that regulated breathing or attending to a
video monitor was an exercise designed to help people
relax and stay calm. Furthermore, each of these
subjects was informed that the exercise had been
found to be effective for reducing stress in many
previous experiments.

The other half of the subjects in the threat conditions,
as well as subjects in the no-threat condition,
were assigned to the no-expectations condition. They
were told that regulating breathing or that
attending to the video monitor was desirable in
experiments using physiological measures because it
kept their breathing or attention constant while
allowing other physiological indexes like heart rate
and skin resistance to vary. They were thus given no 
particular expectation that the tasks would be useful 
for reducing stress. After the expectation manipula- 
tion was completed, each subject in the threat con- 
dition was briefly interrupted and asked to report 
on a single-item questionnaire how relaxed he ex- 
pected to be during the shock period. Responses to 
this item served as a check on the expectation ma- 
nipulation. 

When those procedures were completed, each sub- 
ject was reminded to continue performing his spe- 
cific task during the wait and during the shock/ 
stimulation periods. The lights then came on as an- 
nounced, but no shocks were actually administered. 
Following the 90-sec shock/stimulation period, each 
subject was asked to complete another anxiety check- 
list describing how he felt during the shock/stimu- 
lation period. Each subject in the threat condition 
was also asked to describe his thoughts and feelings 
while waiting for the shock and to report what he 
thought about to keep calm during the shock period. 
The session then concluded, and the subject was 
completely debriefed concerning the procedures, de- 
ception, and purpose of the experiment. 


Results and Discussion 


Scoring and Preparation of Physiological Data 


The skin resistance data were scored for 
seven 30-sec periods: The highest level of 
resistance (lowest arousal) was determined 
for the last 30-sec period of the baseline 
period that preceded any of the experimental 
manipulations, and the lowest level of resistance
(highest level of arousal) was determined
for each of the three 30-sec periods in
the anticipation period and each of the 30-sec
periods in the threat period (or “visual
stimulation” period, in the case of subjects in
the no-threat condition).³ To eliminate the
influence of initial levels of arousal on subsequent
levels of arousal (i.e., the “law of initial
values”; Lacey, 1956; Wilder, 1962), the
scores from the anticipation and threat periods
were residualized (Cronbach & Furby,
1970) using the scores from the baseline
period as the measures of initial levels of
arousal. (A residualized score consists of the
difference between the score obtained during
a given period and the score predicted by
linear regression from the baseline score.)
The scores for the periods within the anticipation
period were averaged to obtain one
score for the anticipation period, and the same
was done for the threat period.

³ The highest skin resistance level (lowest arousal)
was chosen as the best initial measure of skin
resistance, to avoid taking into account brief fluctuations
in the final 30 sec of the baseline procedure.
This procedure appears to have been effective, since
the average pretest-posttest correlation for skin
resistance level was greater than .90.
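
The residualization described above can be sketched as follows. This is a minimal illustration, assuming a simple least-squares regression of each later score on the baseline score across subjects; the function name and the sample values are hypothetical and are not data from the experiment.

```python
def residualize(baseline, later):
    """Return residualized scores: observed later score minus the score
    predicted by linear regression of later scores on baseline scores."""
    n = len(baseline)
    mean_x = sum(baseline) / n
    mean_y = sum(later) / n
    # Least-squares slope and intercept for predicting later from baseline.
    sxx = sum((x - mean_x) ** 2 for x in baseline)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(baseline, later))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(baseline, later)]


# Hypothetical skin resistance levels for four subjects (arbitrary units).
baseline_scores = [100.0, 120.0, 140.0, 160.0]
threat_scores = [90.0, 115.0, 125.0, 150.0]
residuals = residualize(baseline_scores, threat_scores)
```

By construction the residuals sum to zero and are uncorrelated with the baseline scores, which is the sense in which the influence of initial level is removed from the later scores.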

Heart rate and finger pulse volume were 
scored for the same seven 30-sec periods. 
Heart rate was defined as the number of 
beats occurring in the period. Finger pulse 
volume (in mm) was determined by averaging 
the largest volume (lowest arousal) and the 
smallest volume (highest arousal) during each 
period. Both the heart rate and finger pulse
volume scores were then residualized and
averaged like the skin resistance scores.

Respiration rate was scored for seven 30-sec
periods: the 30-sec period after the breathing
manipulation, the three 30-sec periods
during the anticipation period, and the three
30-sec periods within the threat period. The
scores within the anticipation and threat
periods were then averaged. Respiration rate
was simply defined as the total number of
breaths taken during the 30 sec. Respiration
depth, measured (in mm) from the bottom to
the top of an inspiration, was scored for the
breath approximately in the middle of each of
the same seven periods, as well as the baseline
period. These scores were then adjusted and
averaged like the skin resistance scores.


Effectiveness of Experimental Manipulations 


Threat manipulation. Comparisons of the
skin resistance scores of the subjects in the
six threat conditions with the skin resistance
scores of the subjects in the no-threat condition
revealed that across the anticipation and
threat periods, subjects in the threat conditions
showed lower resistance (higher
arousal) than subjects in the no-threat condition,
F(1, 92) = 7.07, MSe = 8057.73, p <
.01.⁴ Comparable comparisons conducted on
the heart rate data revealed that across periods,
subjects in the threat conditions evidenced
higher heart rates than subjects in
the no-threat condition, F(1, 97) = 6.38, MSe
= 18.28, p = .01. Analyses of the finger pulse
volume data showed similar results, F(1, 97)
= 47.25, MSe = 22.40, p < .01. Finally, it
was found that subjects in the six threat
conditions reported significantly higher anxiety
than did subjects in the no-threat condition,
F(1, 96) = 20.89, MSe = 68.00, p < .01.
These results consistently indicate that the
threat manipulation was effective in increasing
subjects’ physiological arousal and
reported anxiety.

Respiration manipulation. A 3 (respiration)
× 2 (expectations) × 3 (periods) analysis
of variance with repeated measures
conducted on the respiration rate scores revealed
that only the respiration effect was statistically
significant, F(2, 80) = 114.71, MSe =
3.46, p < .01. Subsequent comparisons
indicated that subjects in the slow-breathing
condition showed a slower respiration rate than
subjects in either the normal-breathing or
nonregulated-breathing conditions, ts([...])
= 9.92 and 10.83, respectively, ps < [...].
There was no difference in respiration rate
between the latter conditions (Ms = [...],
7.75, and 8.17, respectively).

A 3 (respiration) × 2 (expectations) × [...]
(periods) analysis of variance with repeated
measures conducted on the respiration
[...] 204.73, p < .01. Subsequent comparisons
(Duncan’s) indicated that subjects in
the slow-breathing condition breathed more
deeply than subjects in the other respiration
conditions (ps < .05). Respiration depth for
subjects in the normal and nonregulated
respiration conditions did not differ (Ms =
25.63, 19.94, and 16.39, respectively).

⁴ Initial analyses indicated that there were no
pretest differences among conditions on
the measures used as covariates. Other analyses
indicated that the consideration of 30-sec periods
within the anticipation and stress periods contributed
no new information; therefore, for both heart rate
and skin resistance, the factor representing
periods was eliminated to make the analyses less
complex and to reduce the possibility of chance
findings.

⁵ [...] six [...] subjects
could not be used for self-report [...].
Comparisons between experimental and control subjects
on the anxiety checklist were performed using
only one covariate: initial anxiety checklist scores.

The findings concerning respiration indi- 
cate that the respiration manipulation was 
effective for slowing subjects’ respiration rate. 
At the same time, however, the manipulation 
affected respiration depth; subjects who were 
induced to breathe slowly took deeper breaths. 
Furthermore, since none of the other effects 
generated by these analyses was statistically 
significant, it appears that respiration was 
not influenced by the other manipulations 
and that subjects maintained their respira- 
tion rate and depth over time. 

Expectations manipulation. A 3 (respiration)
× 2 (expectations) analysis of covariance
was conducted on the scores indicating
how relaxed subjects expected to be during
the threat period; subjects’ initial levels on
the fear of shock and the anxiety checklist
measures were used as covariates. This analysis
revealed that only the expectations effect
was statistically significant, F(1, 81) = 5.29,
MSe = 99.27, p = .02. Inspection of the
adjusted means indicated that subjects in the
high-expectations condition reported expecting
to be more relaxed than subjects in the
no-expectations condition (Ms = 29.31 and
24.33, respectively). The fact that none of
the other effects generated by the analysis
was statistically significant indicates that
subjects’ expectations were not influenced by the
respiration manipulation.
Summary. The results clearly indicate
that (a) the threat manipulation was effective
for increasing subjects’ physiological arousal
and reported anxiety, (b) the respiration
manipulation was effective for slowing
subjects’ respiration rate, and (c) the expectations
manipulation was effective for increasing
the degree to which subjects thought they
would be able to relax while waiting for
shocks. In view of the effectiveness of the
experimental manipulations, it is appropriate
to consider whether slowed respiration
and expectations were effective in reducing
the arousal and anxiety of subjects undergoing
threat.

Figure 1. Mean residualized skin resistance scores for
the breathing conditions and the no-threat control
condition during the anticipation and threat periods.
(Cell means are based on the following number of
subjects: no-threat control, n = 14; slow breathing,
n = 29; nonregulated breathing, n = 28; normal
breathing, n = 28.)


Effects of Slowed Respiration 


Physiological responses. A 3 (slow breathing,
normal breathing, nonregulated breathing)
× 2 (high expectations, no expectations)
× 2 (anticipation period, threat period)
analysis of variance with repeated measures
conducted on the skin resistance data revealed
a respiration effect that closely approached
significance, F(2, 92) = 2.87, MSe =
15827.44, p = .06, and a Respiration × Periods
interaction that was statistically significant,
F(2, 92) = 3.48, MSe = 288.02, p =
.03. The means contributing to this interaction
are presented graphically in Figure 1.

To test the prediction that slowed respira- 
tion would be effective for reducing physio- 
logical arousal, a contrast was performed com- 
paring the skin resistance level of subjects in 
the slow-breathing condition to the skin re- 
sistance level of subjects in the combined 
normal and nonregulated conditions. This 
contrast was significant for both the antici- 
pation period, F(1,92) = 3.95, MSe =
8057.73, p = .05, and the threat period, F(1,
92) = 5.42, MSe = 8057.73, p = .02. The
skin resistance levels of subjects in the nor- 
mal and nonregulated conditions were not
significantly different in either period. Thus,
slow breathing was effective in reducing the 
skin resistance of threatened subjects relative 
to threatened subjects who were in the nor- 
mal and nonregulated breathing conditions. It 
should also be noted that the skin resistance 
levels of threatened subjects in the slow- 
breathing condition did not differ signifi- 
cantly from the levels exhibited by subjects 
in the no-threat condition. 

A 3 (respiration) X 2 (expectations) X 2 
(periods) repeated measures analysis of vari- 
ance was conducted on the finger pulse volume 
data. This analysis revealed a main effect for 
trials, F(1, 97) = 8.65, MSe = 342, p <
.01; the means indicated that subjects ex-
hibited a lower pulse volume level (higher 
arousal) during the threat than during the 
anticipation period. The analysis also re- 
vealed a significant effect for respiration, 
F(2,97) = 3.30, MSe = 41.39, p = .04.

To test the prediction that slowed respira- 
tion would be effective for reducing physio- 
logical arousal, a contrast was performed 
comparing the finger pulse volume level of 
subjects in the slow-breathing condition to 
the finger pulse volume level of subjects in 
the combined normal and nonregulated condi- 
tions. Across periods this contrast was signifi- 
cant, F(1,97) = 5.43, MSe = 22.40, p = .02.
The finger pulse volume levels of subjects in 
the normal and nonregulated conditions were 
not significantly different (Ms = 9.22, 6.22, 
and 7.23, respectively). Thus, slow breathing 
was effective in reducing the finger pulse vol- 
ume of threatened subjects relative to threat- 
ened subjects who were in the normal and 
nonregulated breathing conditions. It should 
be noted, however, that subjects in the slow- 
breathing condition exhibited more arousal 
than subjects who were not threatened, F(1, 
97) = 25.81, MSe = 22.40, p < .01.
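
The planned contrast described above (slow breathing versus the combined normal and nonregulated conditions) can be sketched as a short calculation. This is only an illustration under stated assumptions, not the authors' procedure: the function name `contrast_f` is hypothetical, the three means are assumed to be ordered slow, normal, nonregulated, the cell sizes are taken from the Figure 1 caption and may not match the finger pulse volume analysis exactly, and the weights (2, -1, -1) are the conventional choice for comparing one group with the average of two others.

```python
def contrast_f(means, ns, weights, ms_error):
    """Return the F ratio (1 numerator df) for a planned contrast among group means."""
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")
    # Contrast estimate: weighted sum of the group means.
    estimate = sum(w * m for w, m in zip(weights, means))
    # Squared weights scaled by cell size give the sampling-variance multiplier.
    scale = sum(w * w / n for w, n in zip(weights, ns))
    return estimate ** 2 / (ms_error * scale)


# Assumed ordering: slow, normal, nonregulated (finger pulse volume means).
means = [9.22, 6.22, 7.23]
# Assumed cell sizes, borrowed from the Figure 1 caption.
ns = [29, 28, 28]
# Slow breathing versus the average of the other two breathing conditions.
weights = [2, -1, -1]

f_value = contrast_f(means, ns, weights, ms_error=22.40)
print(round(f_value, 2))
```

With these assumed inputs the ratio comes out near 5.3, close to but not identical to the reported F(1,97) = 5.43; the difference presumably reflects rounding of the means or cell sizes that differ from those assumed here.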

Analyses conducted on the heart rate data 
did not reveal any statistically significant 
main effects or interactions involving respira- 
tion. 

Self-report responses. A 3 (respiration) 
X 2 (expectations) analysis of covariance
was conducted on the anxiety scores. Sub- 
jects’ scores on both the self-report measure 
of initial fear of shock and on the first anxi- 
ety checklist were used as covariates. The
analysis of covariance revealed only a respira-
tion effect that approached statistical signifi-
cance, F(2,81) = 2.54, MSe = 65.76, p =
.08. To test the prediction that slowed respi-
ration would be effective for reducing psycho-
logical anxiety, a contrast was performed
comparing the self-reported anxiety of sub-
jects in the slow-breathing condition to the
anxiety reported by subjects in the combined
normal and nonregulated breathing condi-
tions. This contrast was statistically signifi-
cant, F(1, 81) = 4.89, MSe = 65.76, p =
.03. A comparison of the anxiety reported by
subjects in the normal-breathing and non-
regulated-breathing conditions was not sig-
nificant (Ms = .75, 5.37, and 4.26, respec-
tively). These data suggest that slow breath-
ing was a successful technique for reducing
psychological stress. It should be noted, how-
ever, that subjects in the slow-breathing con-
dition reported more anxiety than did sub-
jects who were not threatened with shock,
F(1,96) = 10.96, MSe = 68.00, p < .01.


Effects of Expectations


The factorial analyses of variance, described
above, that were conducted on the skin resis-
tance, finger pulse volume, heart rate, and
self-report data did not reveal any main ef-
fects or interactions involving expectations
that reached or approached statistical signifi-
cance. The consistent absence of those effects
and the absence of meaningful ordering of
means clearly indicates that in this investi-
gation, differential expectations concerning
the effectiveness of the coping strategy the sub-
jects were employing did not influence the
actual effectiveness of the strategies.


Conclusions and Implications


This experiment provides support for the
conclusions of Harris et al. (1976) that pacing
respiration can be effective for reducing
arousal under threatening conditions. The
data also suggest that the conclusion of
Holmes et al. (1978) concerning controlled
respiration must be qualified. Specifically,
when respiration was
controlled by slowing it to a rate lower than
normal, as in Harris et al., subjects exhibited
higher skin resistance and higher finger pulse 
volume levels, and they reported lower levels 
of subjective anxiety than did subjects in the 
other conditions. In contrast, when respira- 
tion was controlled at a normal, resting rate, 
as in Holmes et al., no beneficial effects for 
controlled respiration were found. The simi-
larity of these results across both physiologi-
cal and self-report measures provides con-
verging validity for the effects of the slow-
breathing strategy.

Despite the strength of these results, it is 
important to recognize two points. First, on 
both the self-report measure of anxiety and 
on the finger pulse volume measure, subjects 
who were threatened and who used slow 
breathing exhibited greater arousal than non- 
threatened subjects. Clearly, by itself, the
slowed-breathing strategy is not a panacea for 
problems of arousal. 

Second, the data concerning respiration 
demonstrated that subjects who were induced 
to change their respiration rate also changed 
their respiration depth; thus, it is not clear 
whether it was the change in rate, depth, or 
some combination of rate and depth that re-
sulted in the reduction of arousal. From a
practical standpoint (e.g., using the slow-
breathing strategy to reduce anxiety), it is
clear that the manipulation of respiration rate
reduces arousal, and the question of whether
rate or depth or some combination mediates
the effects may not be crucial. From a
theoretical standpoint, however, this question
is important if we are to understand the phe-
nomenon, and it may deserve further atten-
tion.

In summary, the present experiment em-
phasizes the importance of examining physio-
logical strategies for controlling physiological
and psychological stress. If social-personality
psychologists are going to make a maximal
contribution to the process of stress control,
they should include physiological com-
ponents along with the cognitive components
they have already studied and found to be


Interpersonal Attraction in Aversive Environments:
A Problem for the Classical Conditioning Paradigm? 


Douglas T. Kenrick and Gregory A. Johnson 
Montana State University 


Several studies have found decreased attraction for a stranger rated under 
aversive conditions, and these studies have been used to support a model that 
views attraction as a function of generalization of affect. A number of studies, 
however, have failed to support such a model, not only failing to show decreases 
in attraction as a function of negative circumstances but often showing the 
reverse, that is, enhanced attraction under aversive conditions. It is suggested 
that studies that have supported the affective generalization model may be 
limited in ecological validity as a function of their use of a “simulated stranger”
paradigm and that results from studies in which “real” strangers are rated can 
be understood within a negative reinforcement model. Original data from a 
study in which female subjects rated both a fellow subject and a bogus stranger 
under conditions of either aversive noise or low-level noise are presented to
support such a resolution.


Research stimulated by the reinforcement 
model of attraction (Byrne, 1971; Clore & 
Byrne, 1974; Lott & Lott, 1974) has pro- 
duced an abundance of evidence supporting 
the idea that persons will come to like others 
associated with positive affect and dislike 
others associated with negative affect. With 
regard to the latter phrase of this proposition, 
a number of studies done to test a classical 
conditioning model of attraction have in fact 
shown that arousal of aversive affect leads to 
decreased interpersonal attraction (Gouaux, 
1971; Griffitt, 1970; Griffitt & Veitch, 1971;
Veitch & Griffitt, 1976). There is at the same 
time, however, a large body of seemingly 
contradictory findings. For instance, anecdotal 
evidence suggesting that individuals often 
become highly attracted to one another under 
aversive circumstances (Byrne, Allgeier, Wins- 
low, & Buckman, 1975; Rubin, 1973, pp. 


The authors wish to thank Robert B. Cialdini for 
his comments on an earlier version of this manu- 
script. We would also like to express our apprecia- 
tion to Robert A. Baron, whose help extended well 
beyond the call of duty of a topic editor. 

Requests for reprints should be sent to Douglas T. 
Kenrick, Department of Psychology, Montana State 
University, Bozeman, Montana 59717.


5-7) is supported by similar correlational
findings (Driscoll, Davis, & Lipetz, 1972;
Rubin, 1973, pp. 228-230). Likewise, a num-
ber of experimental studies of attraction un-
der aversive circumstances (Bell, 1978; Bell
& Baron, 1974, 1976; Byrne et al., 1975;
Dutton & Aron, 1974; Kenrick, Cialdini, &
Linder, 1979; Latané, Eckman, & Joy, 1966;
Morris et al., 1976; Brehm, Gatz, Goethals,
McCrimmon, & Ward, Note 1) have failed to
show a classical conditioning effect and often
show the opposite pattern (i.e., aversive cir-
cumstances increasing attraction).


Negative Reinforcement or Generalization of
Negative Affect?


Although it is possible to interpret these lat-
ter findings as inconsistent with reinforce-
ment principles (Berscheid & Walster, 1974),
Kenrick and Cialdini (1977) suggested that
these data could be parsimoniously interpreted
as consistent with a negative reinforcement
model. Briefly, since the presence of others
has been found to reduce aversive arousal aris-
ing from such situations (Back & Bogdonoff,
1964; Bovard, 1959; Conger, Sawrey, & Tur-
rell, 1958; Wrightsman, 1960), subjects in
these experiments would be expected to find
another’s presence reinforcing, and hence re- 
ported attraction would be quite consistent 
with reinforcement principles. In the Dutton
and Aron (1974) study, for instance, there 
was evidence that the target person did in 
fact reduce subjects’ anxiety levels. Accord- 
ing to this formulation, the reinforcement 
model of attraction is not incorrect at all. In 
this case, it must simply be kept in mind that 
positive affect is aroused not only by the on- 
set of a rewarding stimulus (positive rein- 
forcement) but also by the offset of a punish- 
ing stimulus (negative reinforcement).1
Although such an account is consistent with 
the reinforcement formulation of attraction 
(Byrne, 1971; Clore & Byrne, 1974; Lott & 
Lott, 1974), there remains nevertheless a dis- 
turbing inconsistency in the literature, with 
one group of studies suggesting that we will 
come to dislike others we meet under aversive 
circumstances (e.g., Griffitt, 1970; Griffitt & 
Veitch, 1971) and another group of studies 
contradicting this suggestion (e.g., Driscoll et
al., 1972; Dutton & Aron, 1974). We would
thus seem to have a situation in which rein-
forcement theory has little predictive utility.
However, if one compares those studies sup-
porting the simple affective generalization
model with those which fail to support the
model, one very consistent difference emerges.
In those studies lending support to such a
model, the target person was not physically
present. In the remaining studies, on the other
hand, which support an expanded negative
reinforcement version of the model, the target
was actually present contiguous with
aversive stimuli.


Influence of Aversive Affect on Ratings of a
“Bogus Stranger”


In the Griffitt and Veitch (1971) experi-
ment, subjects were exposed to conditions
varying from “hot and crowded” to roomy
and comfortable. As predicted, ratings of
strangers were negatively influenced by both
uncomfortable heat and by crowding. The
strangers rated, however, were not the other
subjects present in the experimental chamber
but were simulated strangers (Byrne, 1971).
The simulated stranger in the Griffitt & Veitch
(1971) experiment was described as a stu- 
dent of the same sex as the subject, who had 
participated in another study during the pre- 
vious semester. The subject was shown a copy 
of an attitude survey ostensibly filled out by 
this person (in actuality, the attitude ratings 
were filled out by the experimenter so as to 
appear to be either similar or dissimilar to 
the subject’s own attitudes). Similarly, sub- 
jects in the Griffitt (1970), Gouaux (1971), 
and Veitch and Griffitt (1976) studies rated 
such an artificially simulated stranger. Hence, 
when a target person is not physically present, 
there is reason to believe that negative af- 
fect generated by unpleasant circumstances 
will generalize to that other, leading to a 
negative impression. 


Aversive Affect and Ratings of Another who 
is Physically Present 


The Dutton and Aron (1974) and Brehm
et al. (Note 1) studies, on the other hand,
had subjects (who were or were not expect-
ing painful electric shock) rate a physically
present confederate who posed as a fellow
subject. In these experiments, there was in-
creased attraction for the confederate under
the relatively more aversive circumstances.
Byrne et al. (1975) likewise found increased 
attraction for same-sex others under rela- 
tively more aversive circumstances and sug- 
gested that these results “create potential 
problems within the larger context of the 
reinforcement-affect model of attraction” 
(Byrne et al., 1975, p. 11). Although this 
study also used a bogus attitude profile, how- 
ever, as did studies by Bell (1978) and Bell 
and Baron (1974, 1976), subjects in these 
studies were led to believe that the profile 
had been filled out either by a confederate or 
by another subject who actually participated 
in the same experiment along with the sub- 


1 This term is often mistakenly interchanged with 
the term punishment. As used by contemporary 
learning theorists (e.g., Rachlin, 1970), however,
negative reinforcement refers to the termination of
an aversive stimulus, whereas punishment refers to
its onset, which of course has more or less opposite
consequences.


ject.2 Similarly, Latané et al. (1966) reported
highest attraction for same-sex others who 
were present under relatively more fear-pro- 
ducing conditions, whereas in correlational 
studies of the influence of adverse conditions 
on attraction (Driscoll et al., 1972; Rubin, 
1973), subjects reported their attraction to- 
ward actual dating partners. Here again, re- 
sults seem contradictory to a simple classical 
conditioning model, indicating increased at- 
traction under relatively more adverse cir- 
cumstances. It would seem, then, that results 
obtained using the bogus stranger method 
might not be readily generalizable to most 
situations involving actual interpersonal judg- 
ment under aversive conditions. 


Validity of the Bogus Stranger Paradigm 


If our reasoning up to this point is correct, 
the inconsistent results of those studies in- 
vestigating the influence of negative situa- 
tions on attraction pose a problem not for the 
classical conditioning model as a whole but 
only for the generality of one derivation from 
that model, which has received support and 
general acceptance because of the use of a 
particular limited methodology. 

At this point it should be noted that the 
validity of the bogus stranger technique has 
been the subject of a great deal of debate in 
a different area of interpersonal attraction 
research, namely, in studying the relationship 
between similarity and attraction. Critics have 
suggested that results from such bogus 
stranger studies might not always generalize 
beyond the confines of the paradigm (Lev- 
inger, 1972; Murstein, 1971; Wright, 1971). 
In brief, the content of these arguments has 
been that results from observations of actual 
relationships will likely show a much weaker 
connection between similarity and attraction 
than that obtained using the laboratory para- 
digm. With regard to the connection between 
negative affect and interpersonal attraction, 
however, it would seem that the influence of 
negative environmental stimuli is not only 
weaker than one would expect on the basis of 
studies done within the bogus stranger para- 
digm but would actually tend to be opposite
in its influence on actual interpersonal attrac- 
tion, even within the confines of a short first
encounter in a laboratory setting. In effect,
while it may generally be profitable to view
a target person as a neutral stimulus to which
reinforcement value may be conditioned, that
same other individual, even if she or he is a
complete stranger, may actually take on a
positive valence in threatening or aversive
situations.

It is our contention that the data presented
thus far seem, in themselves, to strongly sup-
port the resolution we have outlined. In order
to control for the possible operation of other,
perhaps less obvious, factors, however, we
conducted an experiment in which subjects
evaluated both bogus strangers and actual
strangers under identical aversive conditions.
It was predicted that contiguously presented
aversive stimuli would have the effect of de-
creasing attraction toward the bogus stranger
while increasing attraction toward a stranger
who had been physically present with the
subject.


Method 
Subjects 


Subjects were 66 undergraduate females enrolled
in an introductory psychology course at [...]
State University. They received credit toward their
course grade for their participation. Subjects par-
ticipated in groups of two. There was no theoretical
reason for limiting the subjects to females. Given
the necessity of having both subjects show up, fe-
males were chosen, because in our experience
they had been more reliable. Each member of a
dyad came from a separate large introductory
[...]
with their experimental partner, [...]
dropped because of suspicion about the pro-
files. The remaining 60 subjects were equally divided
into the high and low noise conditions (see
below).


Procedure 


Subjects arrived for an experiment entitled “Environ-
ment and judgment.” The two subjects


2 Consistent with our own reasoning, Bell and
Baron (1976) suggest that their failure to replicate
the findings of Griffitt (1970) and Griffitt and
Veitch (1971) may have been due to the fact that
“any negative affect of high ambient [...]
may have been neutralized by a positive [...]
‘shared suffering’” (pp. 28-29).


separate locations for any given experimental session 
and were brought (individually) to the experiment 
room. Here the experimenter introduced them, asked
if they had prior acquaintance, and then explained: 


As the experimental title indicated, we’re interested
in the effects of environmental stimuli on judg-
ment. We’re particularly interested in the effects
of environmental noise. You two will be in the
high (low) noise condition.


The first task you’ll be working on is simply an
attitude survey. We ask that you not discuss this
with one another, since people are often strongly
influenced by others on these things. We'll also be 
returning to a similar task later—so again, please 
do not discuss these with one another (all sub- 
jects complied with this request). 


Noise manipulation. The experimenter turned on
a noise tape at this point. Subjects in the aversive
noise condition heard a tape that consisted of in-
termittent, unpredictable bursts of very loud noise
(peaks were 95 dB[A]) determined to be highly
aversive in preexperimental testing.3 Subjects in the
low noise condition heard the same tape played at
a very low level (peaks were 32 dB).

Attraction ratings. After completing the 26-item
attitude survey questionnaire, subjects were given a
[...] task while the experimenter constructed
bogus attitude surveys based on each subject’s re-
sponses. Using a method based on Byrne’s (1971)
“method of constant discrepancy,” a moderately
[...] was constructed for each subject (aver-
aging [...] points on a 6-point scale). The
experimenter then returned and explained:


We’ll be separating the two of you for the re-
mainder of the experiment. I need one of you to
come with me; it doesn’t really matter which of
you. There’s another speaker in the other
room playing the same noise tape.


The experimenter always chose the subject to his
right and in actuality took her to an adjoining
room where the same condition noise tape was being
played. After separating the subjects, the experi-
menter explained to each:


This part of the experiment involves interpersonal
judgment when only a small amount of informa-
tion is available. You’ll be rating a girl who was in
a [...] different experiment earlier this year
and filled out an attitude survey similar to the
one you’ve just completed. Please read the per-
son’s responses carefully and try to form an opinion
about her.

The subject was given an attitude profile sheet
with the name “Jean Haverland” on it. The other
subject was given the same instructions, but was
told that she would “be rating (name of co-subject),
the other [...] in the experiment with you.” Subjects
were told that their ratings would remain “com-
pletely confidential” and were asked to give a “frank 
and honest judgment.” Subjects who first rated the 
bogus stranger were then asked to rate the “other 
subject” and vice versa. In each case the subject re- 
ceived another profile of exactly the same degree of 
discrepancy, except that item directionality was 
varied. For instance, if the subject’s responses to 
Items 8, 9, 10, and 11 were 3, 4, 4, and 3, respec- 
tively, one profile might read 1, 2, 6, and 5 on those 
items, whereas the other might read 5, 6, 2, and 1 
(in sum, the mean discrepancy of each profile was
identical).
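
The construction of the two matched profiles can be sketched in a few lines of code. This is a minimal illustration of the idea in the paragraph above, not Byrne’s (1971) actual procedure: the function names `shift` and `build_profile_pair`, the fixed 2-point discrepancy, and the fallback rule for responses near the ends of the scale are all assumptions introduced here.

```python
def shift(response, direction, discrepancy, low=1, high=6):
    """Move a response by a fixed discrepancy, staying on the rating scale."""
    candidate = response + direction * discrepancy
    if low <= candidate <= high:
        return candidate
    # Assumed fallback: if the preferred direction leaves the scale, go the other way.
    fallback = response - direction * discrepancy
    if low <= fallback <= high:
        return fallback
    raise ValueError("discrepancy too large for this scale")


def build_profile_pair(responses, directions, discrepancy=2):
    """Return two profiles with the same item-by-item discrepancy but opposite directions."""
    first = [shift(r, d, discrepancy) for r, d in zip(responses, directions)]
    second = [shift(r, -d, discrepancy) for r, d in zip(responses, directions)]
    return first, second


def mean_discrepancy(responses, profile):
    """Average absolute difference between a subject's responses and a profile."""
    return sum(abs(r - p) for r, p in zip(responses, profile)) / len(responses)


# The worked example from the text: responses to Items 8, 9, 10, and 11.
subject = [3, 4, 4, 3]
profile_a, profile_b = build_profile_pair(subject, directions=[-1, -1, 1, 1])
print(profile_a, profile_b)
print(mean_discrepancy(subject, profile_a), mean_discrepancy(subject, profile_b))
```

For the responses in the example, the two profiles differ only in the direction of each item’s displacement, so their mean discrepancy from the subject’s own answers is the same.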

For half the subjects, the “other subject” had in- 
dicated that she would shortly be leaving the state 
(in response to an item tapping attitudes about dif- 
ferential funding for the major rival branch of the 
state university system), whereas half of the sub- 
jects saw the same comment on the profile attributed 
to the bogus stranger. Order was randomized. Simi-
lar statements have been included in earlier research
using this technique (Byrne, 1971) to lead the sub-
ject not to expect future interaction with the partner. 

The noise tape (either aversive or control) was 
continued during the interpersonal judgment mea- 
sures. The ratings were made on the standard In-
terpersonal Judgment Scale (Byrne, 1971). The “lik- 
ing” and “desire to work” items were, in the typical 
fashion, summed to yield the major dependent vari- 
able. Following the ratings, the noise was terminated, 
and subjects were fully debriefed and thanked for 
their participation. 

Predictions. It was expected that the results for 
the attraction measure would show an interaction of 
noise condition and type of stranger rated, indicating 
relatively more positive ratings of the other subject 
in the high noise condition, but an opposite effect 
for ratings of the simulated stranger. 


Results 


An initial 2 X 2 X 2 analysis indicated no 
effect of the target person’s intention to leave 


3 Thirty-two female subjects from the same pop- 
ulation used in the experiment proper were exposed 
to either the high or low noise conditions while per- 
forming the initial task engaged in by experimental 
subjects (filling out the attitude questionnaire). They
then responded to two questions designed to assess
the aversiveness of the noise: “How annoying do
you find this noise?” (0 = “not at all annoying”;
100 = “extremely annoying”) and “How pleasant do
you find this environment to be?” (−3 = “not at all
pleasant”; +3 = “extremely pleasant”). Mean ratings
on the first item were 67.87 and 31.67 for the high
and low noise conditions, respectively, F(1, 30) =
17.30, p < .001. Similarly, the second item yielded
mean ratings of −2.29 for the high noise and −.72
for the low noise conditions, F(1, 30) = 14.27, p <
.001.


Figure 1. Attraction ratings as a function of noise
condition and type of stranger rated. 


the state nor any interaction of this factor 
with the central manipulation in the present 
study (all Fs < 1), and this variable will not 
be discussed further. Results are displayed in
Figure 1 according to a 2 (between-groups:
noise level) X 2 (within-groups: type of
stranger) analysis of variance and indicate 
that our hypotheses were clearly supported, 
F(1, 114) = 7.02, p < .01, for this interac- 
tion. Ratings of the fictional stranger were, 
as predicted, significantly more negative in 
the aversive noise condition than in the low 
noise condition, F(1, 56) = 4.80, p < .05.
Ratings of the other subject indicated that 
these were affected in the opposite direction 
by the noise manipulation, F(1, 56) = 6.21, 
p < .03. In addition, the overall main effect 
of the type of stranger indicated that other 
subjects were generally rated more positively 
than bogus strangers, F(1, 114) = 62.11, p <
.001. The overall main effect of noise level
was not significant, F(1, 114) < 1. 


Discussion 


Our results are clear in supporting the hy- 
pothesis that, under negative conditions, the 
bogus stranger method yields results that are 
contradictory to those obtained using “real” 
others as target persons, even in a short-term 
laboratory encounter. The results we obtained
for bogus strangers are directly in line with
those obtained in prior studies using this
method (e.g., Griffitt, 1970; Griffitt & Veitch,
1971), while our results for ratings of con-
tiguously present others are consistent with 
the general findings of previous studies using 
this method (e.g., Byrne et al., 1975; Dutton 
& Aron, 1974; Latané et al., 1966). 

A recent paper by Rotton, Barry, Frey, &
Soler (1978) reported results that are com-
patible with those of the present investiga-
tion. In their first experiment, these authors
failed to find the predicted decreases in at-
traction for a stranger as a function of un-
pleasant odors. In this case, however, the at-
titude profile rated by the subject was pre-
sented as having been filled out by another
subject in an adjoining room (and presum-
ably “sharing stress” with the subject). In
Rotton et al.’s (1978) second experiment,
however, it was made clear to the subjects
that they were alone in the situation and
that the person being rated was not a fellow
participant. In this study, unpleasant odors
led to significant decreases in attraction for
the rated other. These authors concluded that
“negative affect generalized onto [...]
other persons in the environment” (p. 67) but
added the qualification that “air pollution re-
duces attraction only when a person is iso-
lated” (p. 68). We would argue, of course,
that the results of Rotton et al.’s first study
are more likely to generalize to actual inter-
personal encounters, since attraction would
seem to be rarely developed when one is in
isolation.

Thus, the use of the bogus stranger tech-
nique in the present line of investigation is
even more strongly subject to criticism of its
validity than is the case in similarity-attrac-
tion research (Levinger, 1972; Murstein,
1971; Wright, 1971), where the general ac-
curacy of the relationship, if not its magnitude,
seems to be ecologically valid (Byrne, Ervin,
& Lamberth, 1970; Clore & Byrne, 1974).


Theoretical Considerations 


While our results oppose one specific deriva-
tion from the classical conditioning model,
that persons come to dislike strangers whom
they meet under aversive circumstances, they
can in fact be seen to be in essential agree-
ment with the overall reinforcement model.
As we stated earlier, the failure of the simple
classical conditioning model in this instance
may stem from the fact that other persons 
function as generalized reinforcers under aver- 
sive circumstances (Back & Bogdonoff, 1964; 
Kenrick & Cialdini, 1977; Schachter, 1959; 
Wrightsman, 1960), and their negative rein- 
forcement value (as potential reducers of 
aversive arousal) can generally be expected 
to neutralize or outweigh any immediate gen- 
eralization of negative affect. Thus, we remain 
in agreement with the basic assumptions of
the Clore and Byrne (1974) model, while 
calling into question certain lower level de- 
ductions from that model. 

The present data suggest an important con- 
ceptual qualification on the simple affective 
generalization model, however. Certainly mere 
contiguity with a negatively toned context is 
not sufficient, in itself, to produce disliking
for another. Instead, it seems necessary to
consider the subject’s cognitions to explain
at least part of the present experiment’s pat-
tern of results. Subjects’ ratings of their fel-
low subject showed a pattern that did not fol-
low from simple contiguity assumptions, but
instead seemed to indicate an active and dis-
criminating attributional analysis of the sit-
uation. Such a consideration of cognitive in-
formation processing is, of course, compatible
with recent statements of the reinforcement-
affect model of attraction (Clore & Byrne,
1974).


Although the suggested negative reinforce-
ment model is consistent with prior literature
and provides a plausible explanation of the
present results, it should be noted that we do
not directly demonstrate the reduction of
aversive arousal in our data. It might be the
case, for instance, that attraction for the
physically present other in the aversive condi-
tion was not increased because of the reduc-
tion (or the anticipated reduction) of aver-
sive arousal, but instead because the experi-
ence of shared stress may have led to an
increased perception of similarity or “unit re-
lationship” with the fellow subject. Since the
subjects had a great deal of information relat-
ing to the target person’s attitudes on con-
troversial positions, it seems to us unlikely
that this one additional cognitive input would 
greatly alter the extent to which the target 
was seen as similar/dissimilar, but such an 
account cannot be ruled out, particularly 
since “shared stress” was certainly a salient 
issue in this case. Further research investigat- 
ing the mediator of these effects would be 
worthwhile. 

Future research is also needed to determine 
which components of the other subject’s pres- 
ence mediate the demonstrated increases in 
attraction under aversive circumstances. For 
instance, subjects in the aversive condition 
may have engaged in higher levels of mutual 
eye contact than those in the low noise condi- 
tion. Research by Morris et al. (1976) sup- 
ports the suggestion that aversive conditions 
do produce differential patterns of interaction 
between subjects, although the research of 
Rotton et al. (1978), discussed above, sug- 
gests that the simple knowledge that the other 
is sharing a stressful experience with oneself 
may sometimes be sufficient to produce at- 
traction. 


When Will Negative Affect Decrease 
Attraction? 


The major point of these data is to suggest 
the severe ecological limitations of those ear- 
lier findings (e.g., Griffitt, 1970; Griffitt &
Veitch, 1971), which social psychologists have 
widely accepted to mean that “any event that 
arouses a negative emotional state may create
a temporary dislike for whoever is nearby” 
(Tedeschi & Lindskold, 1976, pp. 455-456). 
In assessing the effect of simple negative 
arousal on interpersonal attraction, results 
from studies using the bogus stranger para- 
digm would seem to explain only a rather 
limited range of situations (i.e., those in 
which one is evaluating another person who is 
not physically present). Whenever the object 
person is present, negative conditions would 
be expected to result in rather different effects 
on attraction. 

It should be noted here that we are dealing
only with the case of impersonal aversive or 
threatening environmental stimuli, from which 
it has been determined that individuals do 
in fact seek solace in the company of others
(Burlingame & Freud, 1943; Marshall, 1951;
McIntyre, 1971; Sarnoff & Zimbardo, 1961;
Schachter, 1959; Wrightsman, 1960). We 
would not, for instance, expect the present re- 
sults to hold in situations involving potential 
interpersonal embarrassment, where the pres- 
ence of others has punishment, as opposed to 
reinforcement, value. For instance, Sarnoff 
and Zimbardo (1961) found that whereas 
95% of their subjects who expected painful 
electric shocks wished to wait in the company 
of others, the majority of those expecting to 
perform potentially embarrassing behaviors 
expressed a desire to wait alone. Consistently, 
Griffitt and Guay (1969) found that their 
subjects showed decreased attraction for others 
who were present when the subject received 
esteem-damaging feedback (negative creativity 
ratings), even when that other was not per- 
sonally responsible for the feedback. On the 
other hand, Jacobs, Berscheid, and Walster
(1971) found that females were more at- 
tracted to a male they met after receiving 
self-esteem-lowering feedback. In this case, 
however, the male was not present during the 
receipt of the feedback, but served to repair 
the subject’s damaged self-esteem by express- 
ing romantic interest in her. 

Finally, it would be expected that other 
persons who are themselves directly responsible
for inducing negative affect in an individual
will come to be disliked, and there is
an abundance of data to support this supposition
(e.g., Burnstein & Worchel, 1962;
Byrne & Rhamey, 1965; Griffitt & Guay,
1969; Hendrick & Taylor, 1971; Sigall &
Aronson, 1969).


Reference Note 


1. Brehm, J., Gatz, M., Goethals, G., McCrimmon,
J., & Ward, L. Psychological arousal and interpersonal
attraction. Unpublished manuscript, Duke
University, 1967.
Three experiments testing the effectiveness of the foot-in-the-door technique for
recruiting blood donors consistently failed to demonstrate that this procedure 
influences either verbal or behavioral compliance, suggesting that the generality 
of the foot-in-the-door phenomenon is limited. Experiment 1 attempted to dem- 
onstrate that an earlier failure of this technique was due to poor operationaliza- 
tion rather than to the magnitude of the critical request or to the invalidity of 
the phenomenon, but it failed to do so. Experiment 2, designed to more closely
resemble other foot-in-the-door studies by using telephone contacts and an 
initial request for persons to answer questions, was conducted to examine other 
possible explanations for the two previous failures. This experiment also failed 
to show any foot-in-the-door effect. Experiment 3 was a conceptual replication 
of Experiment 2 but used personal contacts. One apparent foot-in-the-door 
effect emerged in this case, but it was more likely due to a factor other than 
the experimental treatment. It is concluded that although the foot-in-the-door
procedure may indeed influence verbal compliance with requests for minimal
forms of aid, it probably will not significantly affect people's willingness to
comply with more substantial requests involving behaviors that are psychologically
costly to perform.


A number of recent investigations of the 
effects of compliance with a small request on 
subsequent compliance with a more substan- 
tial request (the “foot-in-the-door” phenome- 
non) suggest the possibility of a direct and 
exciting application of social psychological 
knowledge to practical matters. It has been
shown repeatedly that persons who are induced
to comply with a small request are
much more likely to comply with a subsequent
(critical) request of greater magnitude
than would be the case if there had been no
preliminary request (Cann, Sherman, & Elkes,
1975; Freedman & Fraser, 1966; Pliner, Hart,
Kohl, & Saari, 1974; Seligman, Bush, &
Kirsch, 1976; Snyder & Cunningham, 1975).
The apparent reason for this effect is that
initial compliance alters a person’s self-perception,
and as a result, the person, seeing
himself/herself as more of a helper than previously,
is more likely to help a stranger if
asked (Snyder & Cunningham, 1975) or
merely when the situation presents itself
(Uranowitz, 1975).

One clear implication of these findings is
that various service organizations such as the
American Cancer Society, the American
[illegible] Association, the American Red Cross might
be able to make good use of this technique to
EFFECTIVENESS OF THE FOOT-IN-THE-DOOR TECHNIQUE


induce compliance with their requests for aid 
in their respective tasks. In fact, Pliner et al.
(1974) were able to nearly double the number
of contributors to the Cancer Fund in a
suburban neighborhood of Toronto, Canada,
by using this technique. However, with this
exception and one other, all studies of the
foot-in-the-door phenomenon have used rather
trivial requests, and a large number have employed
a single research paradigm, which is
to make requests over the telephone for persons
to answer varying numbers of questions.

When the foot-in-the-door technique was
used in a different and more realistic field
setting in an attempt to increase blood donations,
it failed (Cialdini & Ascani, 1976). In
that study, the second request was made im- 
mediately after the first, which may not have 
allowed time for a person’s self-perception to 
be sufficiently altered by compliance with the 
initial request. However, Cann et al. (1975) 
have shown, albeit in the somewhat artificial 
telephone interview paradigm, that timing 
of the second request apparently is not cru- 
cial for the foot-in-the-door effect to emerge. 
Therefore, it may simply be that moving from 
a rather trivial request such as answering 
questions in a telephone interview or donating
money to a charity¹ to a more substantial
one such as asking for a blood donation provides
more of an obstacle than the foot-in-the-door
technique can overcome.

Given that several studies of helping have
documented a substantial rate of attrition
between volunteering to do something and actually
doing it (cf. Gross, Wallston, & Piliavin,
[year illegible]; Kazdin & Bryan, 1971), the
generalizability of the foot-in-the-door phenomenon
to settings in which such a technique
can have some practical utility is further
open to question; only Pliner et al. (1974) and
Cialdini and Ascani (1976) have studied actual
behavioral compliance with the critical
request.

The present studies were designed to clarify
some issues that are of importance for the
practical application of the foot-in-the-door
technique. In particular, it would be useful
to know (a) whether the effect does reliably
occur with respect to overt behavior as
well as verbal behavior, (b) whether the effect
extends to other than question-answering
behavior, and (c) how strong the effect is. 
Since Cialdini and Ascani’s (1976) negative 
results amount to an acceptance of the null 
hypothesis, a replication of this result would 
be desirable before much confidence is placed 
in it. Therefore, the present series of experi- 
ments also used the foot-in-the-door tech- 
nique in an attempt to increase blood dona- 
tions during the visit of a bloodmobile to a 
university campus. Experiment 1 is a modified 
replication of Cialdini and Ascani’s experi- 
ment in which timing of the critical request 
was manipulated to see whether it was the 
timing or the magnitude of the critical re- 
quest that was more likely responsible for the 
failure of the foot-in-the-door effect to ma- 
terialize in that study. 


Experiment 1 


Method 


Research participants were 76 dormitory residents 
at a small, rural university. For each hall in each of 
the 10 dormitories on campus, a room number was 
randomly selected. Two female experimenters indi- 
vidually approached selected rooms between 6 p.m. 
and 10 p.m. and made the following initial request 
to the person answering the door: 


Hi, my name is ——, and I am working as a 
volunteer for the Red Cross. Our bloodmobile is 
to be here next week [on Tuesday] [tomorrow] 
and Wednesday from 12:00 to 6:00 p.m. Would 
you help us advertise its visit by putting this 
poster on your door? 


Persons who agreed were given a poster that stated 
the time, place, and date of the bloodmobile visit 
and that contained the American Red Cross logo. 
The experimenter then took the subject’s name and 
recorded his/her sex. All persons were thanked for 
their help, or if they did not cooperate, they were 
thanked just for listening. If no one answered the 
door of a preselected room, another room on the 
same hall was randomly selected and approached. 

Timing between initial and critical requests was
manipulated by making the initial request on the
Wednesday (5-day delay) or Friday (3-day delay)
of the week preceding the bloodmobile visit or on
the Monday evening (no delay) prior to the bloodmobile
visit. The critical request was then made on
the Monday evening preceding the bloodmobile visit.

1 Forty-eight percent donated with no prior request
in the Pliner, Hart, Kohl, and Saari (1974)
study, indicating that donating small amounts of
money (M < $1) to a good cause is a behavior [...]

Table 1
Percentage of Verbal and Behavioral Compliance by
Experimental Condition—Experiment 1

                                          Condition (%)
Compliance               5-day delay    3-day delay    No delay        Control         χ²      p
Verbal compliance
  Recontacted sample     53 (9 of 17)   54 (7 of 13)   55 (11 of 20)   53 (10 of 19)   .046    [illegible]
Behavioral compliance
  Recontacted sample     18 (3 of 17)   38 (5 of 13)   25 (5 of 20)    26 (5 of 19)    1.672   >.50
  Entire sample          14 (3 of 22)   33 (5 of 15)   25 (5 of 20)    26 (5 of 19)    2.095   [illegible]

Note. Ns are given in parentheses. Data for verbal compliance are for 69 persons who were successfully
contacted for both the initial and critical request (recontacted sample). Data for behavioral compliance are
reported for these persons (recontacted sample) and for these persons plus 7 persons who could not be
contacted for the critical request but whose behavioral compliance could nevertheless be determined (entire
sample).

In the 5-day and 3-day delay conditions, persons 
were recontacted by a different experimenter who 
also presented herself as a Red Cross volunteer and 
asked if the person would be willing to donate a 
pint of blood. 

In the no-delay condition, after the person had 
agreed to the initial request to display the poster, 
the experimenter continued with the following re- 
quest: “There is one other thing you can do for us. 
We are asking students to donate one pint of blood 
tomorrow and I wondered if you would volunteer to 
donate?” In the control condition, persons were sim- 
ply contacted for the first time on the Monday 
evening preceding the bloodmobile visit and asked if 
they would donate. A list of the names of all persons 
who appeared at the donation center and either 
donated or were deferred for medical reasons was 
obtained from the American Red Cross. This list 
was used to determine how many persons in each 
experimental condition actually complied behavior- 
ally with the critical request as well as how many 
of those in each condition were previous blood 
donors. 


Results 


There were no sex differences for verbal 
compliance, with 55% of males (18 of 33) 
and 53% of females (19 of 36) verbally 
agreeing to donate, nor were there any differences
for behavioral compliance, with 27%
of males (9 of 33) and 25% of females (9
of 36) actually appearing at the donation
center.² Therefore, male and female data were
combined for the following analyses. 

2 Seven persons could not be recontacted for the
critical request (five in the 5-day delay condition and
two in the 3-day delay condition).

Verbal compliance. Table 1 indicates that
there was practically no variation in compliance
across conditions, and the differences
did not approach statistical significance,
χ²(3) < 1. Combining all experimental conditions,
the overall verbal compliance was 54%
(27 of 50), which is virtually identical to the
53% compliance rate in the control condition,
χ²(1) < 1.

Behavioral compliance. It was expected
that behavioral compliance would be substantially
less than verbal compliance, and this
turned out to be the case (see Table 1), with
only about half of those who verbally complied
actually showing up to donate. Again,
there were no significant differences in compliance
rates between conditions, χ²(3) =
1.67, p > .50. The compliance rate in the
combined experimental conditions was 26%
(13 of 50), which is identical to the compliance
rate in the control condition.

In an attempt to determine why behavioral
compliance was so much lower than verbal
compliance, information about the previous
donation history of all persons who verbally
complied was obtained from local
records. As Table 2 indicates, this
information is vital in explaining the variation
in behavioral compliance among those who
verbally complied. Of those who had donated
before, 82% (14 of 17) carried through on
their verbal commitment, whereas only 21%
(4 of 19) of those who had never donated before
did so, χ²(1) = 11.15, p < .001 (φ =
.61). Thirty-seven percent of the variation
in behavioral compliance in this group was accounted
for by donation history. Furthermore,
it is evident from the table that this relationship
between donation history and behavioral
compliance holds up within both the experimental
condition, χ²(1) = 7.96, p < .005 (φ
= .63), and the control condition (Fisher’s
exact test, p = .083, φ = .65), suggesting that
the experimental treatment had no effect on
the relationship between these variables.
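
As a check on the arithmetic, the 2 × 2 statistics reported for donation history can be recomputed from the cell counts in Table 2. The Python sketch below is not part of the original analysis: the function names are illustrative assumptions, the chi-square applies the continuity correction described in Footnote 3, phi is computed from the uncorrected cross-product, and the Fisher probability is taken as one-tailed, an assumption that appears to match the reported value.

```python
from math import comb, sqrt


def yates_chi_square(a, b, c, d):
    """Chi-square with continuity correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shrink |ad - bc| by n/2 (continuity correction), never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    return n * corrected ** 2 / margins


def phi(a, b, c, d):
    """Phi coefficient (no correction) for the same table."""
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    return (a * d - b * c) / sqrt(margins)


def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher exact p: tables whose top-left cell is >= a, margins fixed."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)
    hits = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return hits / total


# Rows: previous donors, previous nondonors. Columns: donated, did not donate.
all_verbal_compliers = (14, 3, 4, 15)  # 14 of 17 donors vs. 4 of 19 nondonors
experimental = (9, 1, 4, 12)           # Table 2, experimental conditions
control = (5, 2, 0, 3)                 # Table 2, control condition

print(yates_chi_square(*all_verbal_compliers))  # text reports 11.15
print(phi(*all_verbal_compliers) ** 2)          # text reports 37% of the variation
print(yates_chi_square(*experimental))          # text reports 7.96
print(phi(*experimental))                       # text reports .63
print(fisher_one_tailed(*control))              # text reports .083
print(phi(*control))                            # text reports .65
```

Squaring phi gives the proportion of variation in behavioral compliance associated with donation history, which is how the 37% figure follows from φ = .61.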


Discussion 


The present results call into question the
external validity of the foot-in-the-door phenomenon.
While it previously appeared that
Cialdini and Ascani (1976) may not have obtained
the usual effect because of an improper
operationalization of the technique,
the present study rules out this interpretation
and leaves us with the suspicion that this
phenomenon may not be robust enough to affect
those behaviors that people are initially
hesitant to engage in. It is well-known
that most people, if they have never
donated blood before, are quite reluctant to
do so (American Red Cross, undated). On
the other hand, examining the amount of compliance
in control conditions of previous
foot-in-the-door studies, we find that
anywhere from 17% (Freedman & Fraser,
1966) to [illegible]% (Cann et al., 1975) of the
persons who were simply asked initially to
engage in the criterion behavior were willing
to do so. This suggests that none of these
behaviors were particularly noxious in the
first place. (In fact, with the exception of the
Freedman and Fraser experiments, the criterion
behaviors were quite minimally inconveniencing
or costly.)⁴

Another possible reason for the failure of
the foot-in-the-door effect to materialize for
Cialdini and Ascani (1976) as well as in the
present experiment has to do with situational
factors that may have intervened to prevent
any alteration in self-perception as a result
of compliance with the initial request. If
there is strong external justification for a
behavior, that behavior is likely to be discounted
rather than taken as an indication
of a person’s character (Bem, 1972; Kelley,
1971). It is quite possible that the initial request
(to help advertise the presence of a
mobile blood-collection unit on campus) was
for such a good cause that people perceived
a great deal of external justification for doing
so—“Sure I’ll do that, anybody would.”—
and, hence, did not attribute “helper” characteristics
to themselves subsequent to the
initial behavior.

Table 2
Behavioral Compliance of Previous Donors
and Previous Nondonors—Experiment 1

                            Condition
                 Experimentalᵃ         Controlᵇ
Compliance       Donor   Nondonor      Donor   Nondonor
Yes                9         4           5         0
No                 1        12           2         3
Total             10        16           7         3

Note. This table includes only those persons who
verbally complied with the request to donate blood
plus two who failed to verbally comply but donated
anyway.
ᵃ In the experimental condition, χ²(1) = 7.96, p < .005, φ = .63.
ᵇ In the control condition, Fisher exact, p = .083, φ = .65.

3 All chi-square values reported with 1 degree of
freedom are corrected for continuity.

4 It might be argued that the 32% verbal compliance
rate obtained by Cialdini and Ascani (1976)
and the 53% verbal compliance in the present study
provide evidence that people are also fairly willing to
donate blood; however, such a conclusion is not justified.
The baseline compliance rate in blood donation
studies is elevated by the presence in the sample of
previous donors, who are much more likely to agree
to give blood than persons who have never donated
(Foss, Note 5). Consequently, this baseline rate is
not an accurate indicator of the reluctance of people
in general to engage in this particular criterion behavior.

In the present study, both the 53% verbal compliance
and the 26% behavioral compliance are abnormally
high and, in fact, are much higher than
those reported by Cialdini and Ascani (1976). This
can be accounted for by the fact that there is a
well-organized, continuing effort to obtain blood donations
on the campus at which the present study was conducted,
providing a climate of normative support
for this activity (cf. Barton, 1969), and many stu-
[text missing]
conducive to blood donation, verbal compliance was
only 25% and behavioral compliance a mere 3%.

There is yet another possible explanation 
for the failure of the foot-in-the-door tech- 
nique to influence blood donation. It may be 
that size of the initial and critical requests 
must be correlated such that for a larger 
critical request, one must use a more sub- 
stantial initial request. Seligman et al. (1976) 
found this to be the case using the telephone 
interview paradigm, and it is plausible that 
one’s self-perception will be more strongly in- 
fluenced by an initial agreement to a moder- 
ate request than to a small request. There- 
fore, it may still be possible to develop prac- 
tical applications of this technique in areas in 
which human aid or assistance is needed
but in which people are reluctant to provide 
such services. However, by having to in- 
crease the magnitude of the initial request, 
one is of necessity decreasing the number of 
persons who can be subjected to the technique,
because of the decreased likelihood of
compliance with the initial request. One of 
the main appeals of the foot-in-the-door tech- 
nique is the possibility that even the most 
trivial of initial requests may be sufficient to 
increase subsequent compliance. However, if 
this is not the case, and considering the sub- 
stantial effort that would be necessary to 
carry out a recruitment project using this 
technique, the minimal payoffs may not be 
worth the effort. Nevertheless, in the interest
of a better understanding of this phenomenon
and to provide a more thorough specification
of relevant parameters, two further experiments
were conducted.


Experiment 2 


A second study was conducted to examine 
the possibilities suggested earlier (a) that because
of the compelling nature of the request
used, altered self-perceptions might not be
induced or (b) that the initial request was
not large enough to alter self-perceptions sufficiently
to influence compliance with so large
a request as donating blood. Contacts were
made by telephone, as has been the case in
most other foot-in-the-door studies, and a
condition similar to that typically used in
such studies was added to approximate more
closely the procedures used in studies in
which this technique has been successful. This
experiment was also conducted in connection
with a 2-day bloodmobile visit to a university
campus.


Method 


Research participants were 135 dormitory residents
at a large, urban university. Participants were
randomly selected from a listing of dormitory residents
and assigned to one of three experimental conditions
or to a control group. Persons in the experimental
conditions were then contacted by phone on
either Wednesday or Thursday evening between 6
p.m. and 9:30 p.m. and were asked if they would
be willing to do one of the following tasks, depending
on the condition to which they had been assigned:

Routine initial request. Subjects were asked to
answer a few questions about blood donation. If
they agreed, five questions were asked, including
whether they had ever donated, and they were
thanked for their cooperation.

Poster request. Subjects were asked to allow the
blood organization to put a poster on their dormitory
room door to advertise the bloodmobile visit. If
they agreed, they were asked the location of their
room, their age, and whether they had ever donated
blood. They were then thanked for their cooperation.

Large initial request. Subjects were asked to recruit
four friends to donate blood as part of a new
approach to donor recruitment being tried at
the upcoming bloodmobile visit. If they agreed, they
were simply told to contact four friends, tell
them that the bloodmobile would be on campus the
following Monday and Tuesday, and ask them to
donate. They were then thanked for their cooperation.

On the following Sunday and Monday (4
days later), all experimental subjects and members
of the control group were called by a different
person, who informed them that the bloodmobile
would be on campus for two days and asked
them if they would be willing to donate
blood.⁵ Persons who agreed were told of the time
and place and were reminded to eat a good meal before
coming to donate.

To summarize, there were three experimental conditions
ranging from a small initial request, which
is the type normally used in foot-in-the-door research,
to a moderate initial request that was essentially
identical to that used in Experiment 1, to
a large initial request that should have a greater effect
on self-perceptions if persons agree to it. There
was also a control condition with no initial request.


Results


Verbal compliance. Twenty-five persons
could not be recontacted for the critical request
(7 each in the poster and large request
conditions and 11 in the routine condition);
hence, for these people the amount of verbal
compliance is unknown.⁶ Persons who said
they might be able to donate were classified as refusals.

Table 3 presents the amount of verbal compliance
before and after persons who reported
that they were previous donors were removed
from the analysis. In neither case was
the overall chi-square significant. In both instances
the greatest amount of verbal compliance
was obtained in the large initial request
condition, but in neither case was this
significantly different from the control condition.
Finally, combining the experimental
groups, the overall compliance rate was 26%
(21 of 80), which does not differ from the
20% compliance obtained in the control condition,
χ²(1) < 1. With previous donors removed,
the combined compliance rate was
25% (18 of 71), which is not significantly
different from the 22% compliance in the
control condition, χ²(1) < 1.

Table 3
Percentage of Verbal and Behavioral Compliance by
Experimental Condition—Experiment 2

                                         Condition (%)
Item                     Routine request   Poster request   Large request   Control         χ²            p
Verbal compliance
  Recontacted sample     20 (6 of 30)      29 (6 of 21)     31 (9 of 29)    20 (6 of 30)    [illegible]   [illegible]
  Previous nondonors     19 (5 of 26)      22 (4 of 18)     33 (9 of 27)    22 (5 of 23)    1.67          >.60
Behavioral compliance
  Entire sample          5 (2 of 41)       4 (1 of 28)      0 (0 of 36)     3 (1 of 30)     —             [illegible]

Note. Ns are given in parentheses. Data for verbal compliance are for 110 persons who were successfully
contacted for both the initial and critical request. Data for behavioral compliance are for these persons plus
25 others who could not be contacted for the critical request but whose behavioral compliance could nevertheless
be determined (one of the donors in the routine condition was in this group).

Behavioral compliance. There was virtually 
no behavioral compliance, with only four 
persons donating, one of whom had not been 
successfully contacted for the critical request. 
This result underscores one conclusion of 
another study of blood donation that personal 
contact is a far more effective recruitment 
technique than telephone contact (Ford & 
Wallace, 1975). Unfortunately, the extremely 
small amount of behavioral compliance ob- 
tained in the present study (3%) precludes 
a definitive test of the possible effects of dif- 
fering initial requests, although the failure 
of effects to materialize with verbal com- 


5 All callers were unaware of the hypothesis being
tested. Although some persons in the large request
condition may have failed to behaviorally comply
with the initial request, only 2 persons (poster condition)
would not agree to the initial request. Due to
the time pressures generated by having to contact
135 persons within two fixed 3½-hour time periods,
these 2 persons were not recontacted for the critical
request but were included in the analyses where possible
(i.e., behavioral compliance).

6 This is a disturbingly large attrition rate; how- 
ever, there is little reason to believe that these per- 
sons differ in any important way from those we 
could successfully recontact. Because of the logistics 
of contacting a large number of persons during a 
relatively short time period, only a limited number of 
call backs were possible, and these persons simply 
were not in when we called. 
pliance suggests once again that the foot-in-
the-door technique simply does not work 
with blood donation. Nevertheless, a third ex- 
periment was conducted in a further attempt 
to determine if it might yet work under the 
right circumstances. 


Experiment 3 
Method 


This experiment was conducted on the same cam- 
pus as Experiment 1, 9 months later, during a dif- 
ferent academic year, and again in connection with 
a bloodmobile visit to campus. Procedures were 
similar to those in Experiment 2, but dormitory 
rooms rather than individuals were sampled to maxi- 
mize the sample size with which we would ultimately 
be able to make two contacts. Five or 6 days prior 
to the bloodmobile visit, research participants were 
contacted in person by an experimenter who de- 
livered one of three initial requests to whomever 
answered the door, depending on the condition to 
which the room had been assigned: 

Routine initial request. Subjects were asked to 
answer six or seven questions about their knowledge 
of and experience with blood donation. 

Poster request. Subjects were asked to put a 
poster on their door advertising the bloodmobile 
visit to campus. 

Large initial request. Subjects were asked to re- 
cruit four of their friends to donate blood. 

Persons who agreed to the request were asked the 
questions, given the poster, or told how to recruit 
friends as in Experiment 2. They were then asked 
their names and whether they had ever donated 
blood. For purposes of secondary analysis, a record
was kept of all persons who reported conditions that 
would legitimately disqualify them from donating 
blood (chronic illness, underweight, temporary ill- 
ness, etc.). 

On the evening preceding the bloodmobile visit, 
all persons who had received an initial request and 
persons in the control group were contacted by a 
different experimenter (who feigned ignorance of 
the earlier contact by an experimenter)? and asked 
to donate blood the following day. Those who agreed 
to donate were reminded of the time and place and 
to eat a good meal before coming to donate. Everyone
was thanked for his/her time, even if he/she
had refused to donate.


Results 


Of the 127 persons initially contacted, 95 
were successfully recontacted and asked to 
donate blood. In addition, 36 persons in the 
control group were contacted for the first time 
and asked to donate. Since there were no 
significant differences between males and
females for verbal compliance (36% vs. 39%)
or for behavioral compliance (12% vs. 14%),
data were combined for the following
analyses.®

Verbal compliance. Although there were
some moderate variations in verbal compliance
across conditions (see Table 4), they
did not approach statistical significance, χ²(3)
= 2.30, p > .50. Compliance was greatest in
the poster condition (44%), although it did
not differ significantly from the control condition,
χ²(1) = 1.27, p > .20. The 40% (38
of 95) compliance rate among combined experimental
conditions also did not differ significantly
from the control condition, χ²(1)
< 1.

Behavioral compliance. Since 32 perso 
could not be recontacted for the critical 
quest (8, 11, and 13 in the routine, poster 
and large request conditions, respectively) 
two sets of analyses were conducted. Althou 
these persons were not directly asked 
donate, they were in experimental conditions 
that, in view of the processes apparently in 
volved in the foot-in-the-door phenomenon, 
should have made them more likely to do si 
when the opportunity presented itself ( 
Uranowitz, 1975). On the campus at whi 
this study was conducted, almost everyon 
knows when the bloodmobile is on a 
and, therefore, is aware of the opportunity 
to donate. bk 

First, in the recontacted sample (see Tal H 
4), which provides the easier task for the foot, 


(i 


= 


‘Both verbal and behavioral compliance Si 
were substantially higher in Experiment 1. TA ad 
ing appears to have been due primarily to wintt 
that Experiment 3 was conducted during the vi 
in the midst of a flu epidemic, whereas Expél 
1 was conducted in the spring. the 

8 Although recruiters were initially blind i 
respondent’s experimental condition, thoes 
Poster condition were easily recognized as suc unfor: 
the recruiter approached the room. This iS e 
tunate but is probably not serious, since H ypotit 
were unaware of the specific nature of the uesti 
ses being tested. Furthermore, the crucial aa 
became (see below) whether any or all expe! athe! 
conditions differed from the control grouP > er 
than whether there were any differences be 
perimental conditions. si 
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f Verbal and Behavioral Compliance by 


al Condition—Experiment 3 


Condition (%) 


Routine Poster Large 
request request request Control x? Pp 
compliance 
itacted sample 43 (16 0f 37) 44 (14 of 32) 31 (8 of 26) 31 (11 of 36) 2.30 >.50 
il compliance 
contacted sample 16 (60f37) 31 (10 of 32) 12 (3 of 26) 6 (2 of 36) 8.83 <.05 
‘ire sample 13 (60f45) 23 (100f 43) 8 Gof39) 6 (of 36) 6.79 <.08 


in-the-door technique, there was a significant
overall effect, χ²(3) = 8.83, p < .05; this was
accounted for by the poster condition, which
differed significantly from the control condition,
χ²(1) = 6.03, p < .02. Neither of the other
experimental conditions differed significantly
from the control group. The combined
experimental conditions differed from the control
group at a marginal significance level
(20% vs. 6%), χ²(1) = 3.05, p < .09, but
again this is due primarily to the contribution
of the poster condition.
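
The overall tests for behavioral compliance can be checked against the frequencies in Table 4. The sketch below is a minimal illustration, not the authors' computation: it assumes an uncorrected Pearson chi-square on the 4 × 2 table of donors and non-donors, and the function name `pearson_chi_square` is hypothetical.

```python
def pearson_chi_square(donated, totals):
    """Pearson chi-square for a k x 2 table of donors vs. non-donors."""
    not_donated = [n - d for d, n in zip(donated, totals)]
    grand_total = sum(totals)
    chi2 = 0.0
    for column in (donated, not_donated):
        column_total = sum(column)
        for observed, row_total in zip(column, totals):
            # Expected count under independence of condition and donation.
            expected = row_total * column_total / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Behavioral compliance in Experiment 3, in the order
# routine request, poster request, large request, control.
donated = [6, 10, 3, 2]
recontacted = pearson_chi_square(donated, [37, 32, 26, 36])
entire = pearson_chi_square(donated, [45, 43, 39, 36])

print(f"recontacted sample: chi2(3) = {recontacted:.2f}")
print(f"entire sample:      chi2(3) = {entire:.2f}")
```

Under these assumptions the two statistics round to the reported values of 8.83 and 6.79.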

In the entire sample, behavioral compliance
in the experimental conditions was again
somewhat higher than in the control condition,
and the overall effect approached conventional
levels of statistical significance,
χ²(3) = 6.79, p < .08. The poster condition
accounted for this effect and was significantly
different from the control condition
(23% vs. 6%), χ²(1) = 3.49, p < .06,
whereas neither of the other two experimental
conditions differed significantly from the control
group. It should be pointed out that none
of the 32 persons who were not recontacted
actually donated, and it is arguable that the
fairer test of the foot-in-the-door effect is
the previous one involving only those contacted
for both the initial and critical requests.


Donation history. Among those persons
who reported having donated previously, 59%
(23 of 39) agreed to donate, and among those
who had not donated previously, 30% (25 of


Note. Frequencies are given in parentheses. Data for the recontacted sample are for 131 persons who were successfully
contacted for both the initial and critical request. Data for the entire sample also include 32 persons who
could not be contacted for the critical request but whose behavioral compliance could nevertheless be
determined.


83) agreed to donate, χ²(1) = 8.09, p < .01.
Actual donation was also more common among
previous donors (24%, 13 of 54) than among
persons who had never donated (8%, 8 of
100), χ²(1) = 6.39, p < .02.

In the present experiment, previous dona- 
tion history did not play such a strong role 
in determining whether persons who agreed 
to donate would actually do so as it did in 
Experiment 1. Nearly half (48%) of those 
who agreed to donate were previous donors, 
and of those, 39% actually donated, whereas 
28% of those who agreed to donate but were 
not previous donors actually did so, χ²(1)
< 1. Still, donation history is an important
determinant of behavioral compliance, with
those who have donated before three times as
likely to donate as those who have not previ-
ously done so.11 In the present experiment,


9 A third set of analyses was conducted after
excluding the 44 persons who reported a legitimate
medical excuse for not donating, as well as the 32
persons not recontacted. This altered the percentages,
but the same pattern of results as in the above
analyses emerged for both verbal and behavioral
compliance.

10 Because of an oversight, information about 
donation history was not obtained from nine per- 
sons in the control group. 

11 In this study, information on donation history
was obtained from the persons themselves, whereas
in Experiment 1, it was obtained from actual records.
It is possible that some persons felt a need to deceive
us about their donation history and, therefore, that
the results of this analysis are less reliable than those
from Experiment 1.
previous donors constituted only one third
(35%) of the sample, yet nearly two thirds
(62%) of the donations were made by per- 
sons who had previously donated. 
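
The donation-history figures quoted above are mutually consistent, which a short calculation makes visible. This is a sketch under one labeled assumption: the source does not say which chi-square variant was used, and the code applies Yates' continuity correction to the 2 × 2 table because that choice reproduces the reported statistic. The function name `yates_chi_square` is hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square with Yates' continuity correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Actual donation in Experiment 3, as reported in the text.
prev_donated, prev_total = 13, 54      # previous donors
never_donated, never_total = 8, 100    # persons who had never donated

chi2 = yates_chi_square(
    prev_donated, prev_total - prev_donated,
    never_donated, never_total - never_donated,
)
rate_ratio = (prev_donated / prev_total) / (never_donated / never_total)
share_of_sample = prev_total / (prev_total + never_total)
share_of_donations = prev_donated / (prev_donated + never_donated)

print(f"chi2(1) = {chi2:.2f}")
print(f"previous donors donate {rate_ratio:.1f} times as often")
print(f"{share_of_sample:.0%} of the sample, {share_of_donations:.0%} of the donations")
```

With these inputs the statistic rounds to 6.39, the rate ratio to about 3, and the two shares to 35% and 62%, matching the figures in the text.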


General Discussion 


Since the experiments reported here pri- 
marily provide confirmation of a null hy- 
pothesis, a brief discussion of methodological 
issues surrounding this procedure is in order. 
Traditionally, in social psychology, experi- 
mental results that confirm the null hypothe- 
sis have been viewed with more than a little 
skepticism—often with good reason (Aronson 
& Carlsmith, 1968). Even if a study is well 
designed and well executed, one plausible al- 
ternative explanation of experimental results 
that confirm a null hypothesis is that the 
statistical test used may not have been pow- 
erful enough to detect a true effect. Thus, 
conventional canons of research dictate that 
very large samples are necessary to effectively 
test a null hypothesis. 

The samples in the present study clearly 
are not particularly large, averaging about 31 
persons per condition, which leaves open the 
possibility that the foot-in-the-door tech- 
nique does indeed have an effect on compli- 
ance with requests to donate blood, but the 
effect is too small to detect with such small 
sample sizes. However, the purpose of the 
present studies was not the traditional one of 
examining a theoretical proposition but rather
to examine the utility of a phenomenon in
practical application. In such a situation, we 
are not particularly interested in small effects, 
real though they may be. Even though a 
compliance-inducing procedure might have the 
(statistically reliable) effect of increasing 
compliance by, say, 25% over normal pro- 
cedures (an effect that would not be detected 
with a moderate sample size), this might very 
well be too small an effect to be of value in 
practical applications.12 In organizational
donor recruitment programs, cost-benefit
analyses must be conducted, and on balance,
procedures that produce real but small effects
are not feasible unless they involve essen- 
tially no extra effort or expense to put into 
practice. To this we would add that in the 
present series of experiments, there is little 
evidence to suggest that there is even a small
foot-in-the-door effect that would have been
detected with larger sample sizes. The
fluctuation in compliance rates across experiments
and conditions appears to be essentially random.
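
The sample-size argument can be made concrete with a rough power calculation. The sketch below is illustrative only and is not an analysis the authors report: it assumes a two-sided test at the .05 level, a normal approximation with an unpooled standard error, the 12% baseline compliance rate and 25% relative increase mentioned in the authors' footnote, and about 31 persons per condition. The function names are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def approx_power(p_control, p_treatment, n_per_group, z_crit=1.959964):
    """Approximate power of a two-sided two-proportion z test (unpooled SE)."""
    se = sqrt(
        p_control * (1 - p_control) / n_per_group
        + p_treatment * (1 - p_treatment) / n_per_group
    )
    shift = abs(p_treatment - p_control) / se
    return normal_cdf(shift - z_crit) + normal_cdf(-shift - z_crit)


baseline = 0.12             # baseline compliance rate taken from the footnote
improved = baseline * 1.25  # a 25% relative increase, i.e. 15%

for n in (31, 310, 3100):
    print(n, round(approx_power(baseline, improved, n), 2))
```

Under these assumptions, power at 31 persons per condition is well under 10%, and samples on the order of thousands per condition would be needed to detect such an effect reliably. This is consistent with the authors' point that an effect of this size is undetectable here and of little practical value.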

In the three experiments conducted,
comprising a total of nine experimental groups
evaluated on two dependent measures each,
only one foot-in-the-door effect occurred
(Experiment 3, poster condition, behavioral
compliance). The fact that this effect occurred in
a condition that had no discernible effect in
several other comparisons in Experiments
1 and 2 and that a behavioral compliance
effect occurred in a condition in which there
was no verbal compliance effect was somewhat
puzzling. Further examination of the data
yielded an explanation for this result. It was
discovered that this one condition contained
a substantially larger proportion of previous
donors than any other condition (47% vs.
28%, 29%, and 37% in the routine, large
request, and control conditions, respectively).
Given the much stronger inclination of
previous donors to donate, this provides a more
parsimonious explanation of the greater
behavioral compliance in this condition than
does the foot-in-the-door explanation.

Given the consistent failure across three
experiments to find a foot-in-the-door effect
despite different procedures (both telephone
and in-person requests), study of different
populations (urban and rural universities),
and testing some of the most plausible
alternative explanations for the initial failure, the
only reasonable conclusion we are left with is
that the foot-in-the-door technique does not
work with blood donation. It appears that
this behavior is too costly for an altered
self-concept, which the foot-in-the-door
procedure presumably produces, to increase
a person’s willingness to donate.


12 Assuming a 12% rate of compliance with a
straightforward request to donate blood (the rate in
control groups of studies in which persons were
personally asked), a 25% increase above that would
amount to only 3% more donors, which would hardly
justify the expense of doubling the number of
personal contacts in order to bring about the
foot-in-the-door effect.




The present series of experiments provides
evidence that the power of the foot-in-the-
door technique for practical applications may 
have been overstated. Two of the most com- 
prehensive and popular recent social psychol- 
ogy textbooks maintain that the effectiveness 
of this technique is well established (Baron 
& Byrne, 1977; Wrightsman, 1977). However,
although the existence of the foot-in-
the-door phenomenon is well documented, its 
utility in practical applications is not, with 
only a single study demonstrating an effect 
on behavioral compliance with a request for 
a meaningful form of aid. The present series 
of failures, in addition to at least two others 
(Cann, 1975; Cialdini & Ascani, 1976), com- 
bined with the shortcomings of other studies 
previously mentioned, suggest that this tech- 
nique may be of only limited value to organi- 
zations seeking ways to increase compliance 
with their requests for aid. 

We suggest that at least part of the reason
that the misconception about the power and
practical usefulness of this technique has
developed is because of the characteristics of
the research publication system, which
discourage the submission and acceptance for
publication of studies that report negative
results, that is, those studies that support the
null hypothesis (Greenwald, 1975). Such
biases against accepting the null hypothesis
are particularly serious when results of
social psychological inquiry are likely to be
rapidly applied. If only positive findings
emerge, a very misleading conception of a
phenomenon can develop, and this appears to
be what has occurred with the
foot-in-the-door phenomenon. The technique does
appear to work well in some fairly limited
situations, but the limitations have not been
spelled out. The present series of experiments
is a first step in the direction of identifying
the parameters that influence if, when,
and perhaps how this technique works.
Much remains to be done in order
that future reports on the foot-in-the-door
technique can be accompanied by
cautionary remarks detailing the limitations
of this procedure.

One final point is worth mentioning. Although
the present series of experiments demonstrates
convincingly that the foot-in-the-door effect
does not influence blood donation,
some caution should be exerted in interpret- 
ing these findings. On the basis of this single 
series of experiments plus one other (Cial- 
dini & Ascani, 1976), it would be risky to 
conclude that all costly behaviors are re- 
sistant to the foot-in-the-door phenomenon, 
although the implication is certainly there. In 
some ways, blood donation is perhaps an un- 
usual form of helping. For example, previous 
donation is an extremely powerful factor in 
present behavior, which is primarily a reflec- 
tion of the irrational fears people have about 
donating blood (cf. Foss, Note 1). On the 
other hand, it would be equally unwise to 
attempt to explain away the present results 
as being due to the peculiar nature of blood 
donation. Other behaviors such as donation 
of organs, bone marrow, and large amounts 
of time or money have equally high costs (of 
varying sorts) and are also behaviors that a 
given proportion of any population simply 
cannot engage in because of physical or other 
unalterable limitations. Finally, if blood do- 
nation were also resistant to other, thoroughly 
documented, forms of social influence, we 
would be justified in concluding that the 
current results tell us more about blood dona- 
tion than about the foot-in-the-door phenom- 
enon. However, blood donation is quite sus- 
ceptible to various other forms of social in- 
fluence (cf. Cialdini & Ascani, 1976; Condie, 
Warner, & Gillman, 1976; Ford & Wallace, 
1975; Foss, Note 1). Therefore, we may 
conclude with some confidence that the
peculiar nature of blood donation alone does
not explain our consistent negative findings.


Reference Note 


1. Foss, R. D. The role of social influence in blood
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Toronto, 
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Erotica and Aggression: The Influence of Sexual Arousal, 
Positive Affect, and Negative Affect on Aggressive Behavior 


Leonard A. White 


Purdue University 


Male college students were first either angered or not angered by a confederate 
of the experimenter and were then ostensibly given an opportunity to aggress 
against the confederate by means of electric shock. Prior to aggressing, subjects 
were shown one of four sets of stimuli chosen to effect a factorial variation in 
the intensity of positive sexual arousal (high, low) and negative affect (high, 
low) elicited by exposure to such material. In addition, one group of angered 
subjects (no-exposure control) was included who did not view any of the four 
sets of stimuli prior to being given an opportunity to aggress. Results indicated 
that exposure to affectively positive erotic stimuli significantly reduced
retaliatory behavior by angered males to a level below that exhibited by subjects
exposed to neutral stimuli and by those in the no-exposure control group. In
contrast, relative to baseline controls, subjects’ exposure to erotic stimuli that were
reported to be disgusting and unpleasant slightly enhanced subsequent
aggressive behavior. A number of possible mechanisms (e.g., attentional shifts,
incompatible responses, cognitive labeling) are discussed in relation to these results.


During recent years, it has become
apparent that exposure to erotic stimuli may
have a variety of effects on subsequent
aggressive behavior. In past studies, exposure to
sexual imagery has been found at times to
decrease attacks against others (Baron,
1974a, 1974b; Baron & Bell, 1977), at other
times to increase subsequent aggression
(Zillmann, 1971), and at still other times to have
seemingly little effect on such behavior.
Experimental investigations (e.g., Baron &
Bell, 1977) aimed at delineating more
precisely the types of erotic material that
inhibit or facilitate subsequent aggression have
implicated the sexual arousal potential of 
such stimuli. Specifically, mildly arousing 
stimuli have been found to decrease subse- 
quent aggression, whereas exposure to more 
highly arousing sexual imagery tends to fa- 
cilitate such behavior. In addition, a close 
examination of past research reveals that 
affective responses to erotic imagery may also 
need to be taken into account. 


Affective Response to Erotic Stimuli and 
Aggression 


Some suggestion as to the way affective 
responses may relate to aggression is provided 
by the results of a recent study by Baron (in 
press). In the study by Baron, angered fe- 
males showed increases in aggression follow- 
ing exposure to explicit erotic scenes, which 
had been found in a previous study (Baron & 
Bell, 1977) involving males to inhibit such 
behavior. To reconcile these seemingly
discrepant findings, Baron pointed out that
females reported finding these depictions of 
explicit sexual activity to be disgusting and 
unpleasant, whereas for males, sexual imagery 
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was regarded as positively exciting and 
pleasant. These contrasting affective re- 
sponses, in turn, were hypothesized to have 
led to different levels of aggression by modi- 
fying subjects’ attributions concerning the 
source of their arousal (cf. Baron, 1977; Rule 
& Nesdale, 1976). Specifically, for females, 
increments in sexual arousal that were re- 
garded as unpleasant may have been more 
readily interpreted as anger and consequently 
increased subsequent aggression. Alterna- 
tively, it was suggested that males, who re- 
sponded with greater positive affect, may have 
been willing to label their arousal more
positively (e.g., as due to the sexual stimuli,
rather than as anger), with the result that 
subsequent retaliatory behavior was reduced. 
At first glance, it may appear to follow 
directly from the studies described above that 
exposure to erotic materials that are labeled 
and experienced as unpleasant and disgust- 
ing will enhance subsequent aggression, 
whereas exposure to more positively valenced 
sexual stimuli effectively inhibits such be- 
havior. Upon closer examination, however, it 
is apparent that the results described above 
must be regarded as tentative and clearly 
demand further experimental scrutiny. For 
example, in the studies by Baron (Baron, in 
press; Baron & Bell, 1977) outlined above, 
while depictions of explicit sexual activity 
may have evoked contrasting affective re- 
sponses by males and females, these feelings 
were not measured directly in the study in- 
volving males, but inferred from subjects’ 
comments during debriefing. In addition, since 
baseline control groups have not been in- 
cluded in past studies of erotica and aggres- 
sion, it is possible that what has been in- 
terpreted as an aggression-inhibiting effect of 
positively valenced sexual imagery was ac- 
tually due to an aggression-facilitating effect 
of more boring and less arousing control 
stimuli (cf. Zillmann & Sapolsky, 1977). In 
addition, no evidence is available concerning 
the impact of more negatively valenced sexual 
imagery on subsequent aggression by males. 
Thus, the present study was designed to pro- 
vide data relevant to answering some of the 
questions surrounding the past research de- 
scribed above, through an examination of the 
influence of erotica-produced affect and
arousal responses on aggression.


Dimensions of Affective Response 


One final concern surrounding the preparation
of specific sets of stimuli involved the
measurement of affective responses to erotica
and their relationship to sexual arousal.
Toward this end, the present study incorporated
the results of recent work (Byrne,
Fisher, Lamberth, & Mitchell, 1974) that
delineates these relationships more precisely.
The results of this research indicate that
sexual arousal and affective responses to
erotica may be conceptualized as varying
along two orthogonal dimensions: one dimension
that I will refer to as positive sexual
arousal, composed of subjective sexual arousal
and positive affective responses related to
sexual arousal, and an orthogonal dimension
of negative affect, characterized by responses
of disgust, nausea, and guilt. Within this
two-dimensional framework, four distinct patterns
of response to erotic stimuli can be identified.
Exposure to erotic imagery may lead to
increases in sexual arousal that are labeled and
experienced as pleasant and entertaining
(positive response), evoke a combination of
positive affect and negative feelings
(ambivalent response), lead primarily to negative
affect (negative response), or elicit feelings
of relative indifference (neutral response).


Experimental Design and Hypotheses 


Procedures in the present investigation
followed closely those used by Baron (e.g.,
Baron & Bell, 1977) in past research relating
exposure to erotica and aggression. Male
college students were either angered or not
angered by a confederate and were then
provided with an opportunity to aggress against
the confederate by means of electric shock.
Prior to aggressing, participants were exposed
to one of four sets of stimuli chosen to effect
a factorial variation in the intensity of
positive sexual arousal (high, low) and negative
affect (high, low) associated with each
stimulus. Each set of stimuli was chosen to
elicit one of the four categories of response to
erotic imagery outlined above: positive,
negative, ambivalent, and neutral.

Based on the discussion above, an
interaction was anticipated such that aggressive
responding would be (a) inhibited by exposure
to erotic stimuli that evoked primarily
positive affect (positive response), (b)
enhanced by exposure to arousing erotic imagery
that evoked a combination of positive and
negative affect (ambivalent response), and
(c) neither reduced nor facilitated by
exposure to neutral stimuli or erotic imagery
that led primarily to heightened negative
feelings. Simple increases in negative affect
resulting from exposure to sexual stimuli
(negative response) were not expected to
increase subsequent aggression, since the
facilitation of aggression by erotica presumably
requires heightened levels of sexual arousal
(e.g., Baron & Bell, 1977). Although this
predicted pattern of results was expected to
be more clearly evident for angered males
(relative to nonangered), this prediction must
be regarded with caution, since exposure to
erotic stimuli has been found (e.g., Baron &
Bell, 1977) to alter aggression by nonangered
persons.

Finally, to determine if the anticipated
reduction in aggression by exposure to erotic
imagery that evokes primarily positive affect
may actually represent an aggression-facilitating
effect of neutral control stimuli, a
group of provoked subjects was included who
did not view any of the four sets of stimuli
prior to being given an opportunity to aggress.

Pilot Study 
Method 


Prior to the main experiment, a pilot study was
conducted to determine if the chosen sets of stimuli
were suitable for inducing a factorial variation in
positive sexual arousal and negative affect.


Thematic Content of the Experimental Stimuli


Each of the four sets of experimental stimuli
consisted of 10 color slides that depicted the
following themes: (a) positive response: mutual genital
fondling, sexual intercourse, and mildly explicit
fellatio; (b) ambivalent response: moderately
explicit mutual oral-genital contact, fellatio, and
cunnilingus; (c) negative response: same-sex
masturbation and highly explicit cunnilingus; (d) neutral
response: clothed and partially clothed males and
females standing alone or mutually involved in what
seemed to be a nonsexual activity.


Subjects and Procedure 


After providing informed consent, 48 male under- 
graduates were randomly assigned to view one of 
the four sets of stimuli until a total of 12 subjects 
had rated each set. Following presentation of the
stimuli, each participant completed a Feelings Scale
(Byrne et al., 1974) that required a rating of sexual
arousal and self-ratings of affect along 10
dimensions: disgusted, entertained, anxious, bored, angry,
nauseated, depressed, guilty, excited, and
disappointed with oneself. Males then responded to items
pertaining to sexual-physiological reactions (Schmidt 
& Sigusch, 1970) and to six semantic-differential 
scales. 


Results and Discussion 


As an initial step in the analysis, responses 
to the Feelings Scale (for all four sets of 
stimuli combined) were factor analyzed and 
orthogonally rotated using a varimax criterion 
(Kaiser, 1958). In line with expectations
based on past research (e.g., Byrne et al.,
1974), two orthogonal dimensions of response
were found. The first factor, accounting for
35% of the variance, was labeled “positive
sexual arousal” and consisted of the rating of
sexual arousal and four positive affective
responses related to sexual arousal (excited,
anxious, entertained, and not bored).1 The
second factor, accounting for 30% of the 
variance, was labeled “negative affect” and 
consisted of responses to the items angry, dis- 
gusted, nauseated, disappointed with oneself, 
depressed, and guilty. 

Subsequent analyses sought to determine if 
the chosen sets of stimuli produced the in- 
tended differentiations in response along 
these two dimensions. Multivariate analysis 
of variance applied to responses along these 
two dimensions yielded the intended differen- 
tiations, showing only a main effect of posi- 
tive sexual arousal, multivariate F(5,40) = 


1 Anxiety as a positive response would appear to 
denote feelings of positive anticipation and excite- 
ment that preceded the onset of each stimulus. 
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Table 1 


Mean Affect and Arousal Responses to Experimental Stimuli in the Pilot Study

                                             Positive sexual arousal         Negative affect

                                                       Positive affect                 Negative
                                             Sexual    related to                      affect
Condition                                    arousal   sexual arousal       Disgust    (with disgust)

High sexual arousal/high negative affect
  (ambivalent response)                      4.17      4.18                 3.17       2.30
High sexual arousal/low negative affect
  (positive response)                        4.17      4.10                 1.25       1.23
Low sexual arousal/high negative affect
  (negative response)                        1.25      1.95                 4.33       3.37
Low sexual arousal/low negative affect
  (neutral response)                         1.67      2.12                 1.00       1.18

Note. n = 12 within each cell. A higher number (on a 7-point scale) indicates more intense affect or arousal
responses. Within each dependent measure, means that do not share a common subscript differ significantly
at the .05 level by Duncan’s multiple-range test.


10.02, p < .001, and a main effect of negative
affect, multivariate F(6,39) = 12.09, p <
.001.2 As summarized in Table 1, males
exposed to erotic imagery chosen to elicit high
(as opposed to low) negative affect reported
much greater disgust (p < .0001), as well as
more intense anger, nausea, guilt (for each
item, p < .001), and self-disappointment (p
< .02). In addition, and as intended, sets of
stimuli chosen to elicit high positive sexual 
excitement (as opposed to low) were reported 
to be more sexually arousing, more exciting, 
more entertaining, less boring, and more anxi- 
ety provoking (for each item, p < .001). 

Corroborating the above findings, males 
more frequently reported either a partial or 
full erection, F(1, 40) = 23.33, p < .001, and
greater physical excitement, F(1,40) = 
27.97, p < .001, in response to those stimuli 
chosen to elicit high (as opposed to low) 
levels of positive sexual arousal. In addition, 
males reported feeling more negative,
unpleasant, uncomfortable, sad, and bad (for
each item, p < .005) following exposure to
those stimuli selected to elicit more intense
negative affect.

Taken collectively, the findings described 
above indicate that the four sets of stimuli 
can be used effectively to create the intended 
factorial variation in positive sexual arousal 
and negative affect.3


Main Experiment 


Method 
Subjects 


Participants were 95 males who were at let 
years of age and were attending a major midwestern
university. Fifty-one males who had not taken a
course in psychology and had not participated in a
previous psychology experiment were recruited from
classes in introductory communication. Each
received extra credit in his communication class in
exchange for his participation. Since only a limited
number of students were available via communication
classes, an additional 44 males were recruited
by means of advertisements in the student news-


2 All of the multivariate analyses of variance re-
ported in the pilot study and in the main experiment
used Pillai’s trace criterion, described by 
(1967), as the multivariate test statistic. 

3 As a part of the pilot study, subjects’ dispositional
affective orientation to sexuality was measured
by the Sexual Opinion Survey (White, Fisher,
Byrne, & Kingma, Note 1). Validation studies have
indicated that subjects who score high on this measure
(erotophiles) report more intense arousal,
greater positive affect, and less negative affect
following exposure to erotica, relative to those
subjects who score low (erotophobes). An analysis of
responses to the Feelings Scale that included this
measure revealed that the desired factorial variation
in positive sexual arousal and negative affect
was evident for both erotophiles and erotophobes.
Thus, the experimental manipulations appear to be
generally effective across individual differences in
affective orientation to sexuality.


paper, in which it was stipulated that volunteers
should not have taken a course in psychology and
that each volunteer would be paid $2.00.4


Design and Apparatus 


The main experiment was arranged as a 2 × 2 × 2
between-subjects factorial with factors corresponding
to two levels of provocation (angry, not angry) and
the intended characteristics of the experimental
stimuli, positive sexual arousal (high, low) and
negative affect (high, low). One group of angered
subjects was also included who were not shown any
of the four sets of stimuli prior to the opportunity
to aggress. Participants were randomly assigned to
one of the nine cells with the stipulation that each
group (n = 10) contain an approximately equal
number of paid volunteers and students from
communication classes.
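
The cell structure described above can be enumerated as a quick consistency check. This is only an illustrative sketch; the dictionary keys and variable names are hypothetical labels, and the counts are the ones given in the text (95 recruits, 5 exclusions, 10 subjects per cell).

```python
from itertools import product

provocation = ("angry", "not angry")
positive_sexual_arousal = ("high", "low")
negative_affect = ("high", "low")

# The eight cells of the 2 x 2 x 2 between-subjects factorial.
cells = [
    {"provocation": p, "positive_sexual_arousal": a, "negative_affect": n}
    for p, a, n in product(provocation, positive_sexual_arousal, negative_affect)
]

# The additional no-exposure control group: angered subjects who saw no stimuli.
cells.append(
    {"provocation": "angry", "positive_sexual_arousal": None, "negative_affect": None}
)

n_per_cell = 10
recruited, excluded = 95, 5

print(len(cells), "cells,", len(cells) * n_per_cell, "subjects analyzed")
print(recruited - excluded, "subjects remaining after exclusions")
```

Nine cells of 10 subjects account for 90 participants, which equals the 95 recruits less the 5 whose data were excluded.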
The apparatus consisted of a modified Buss (1961)
“aggression machine,” lists of 40%-60% Glaze
association nonsense syllables, and a bogus intercom
system. Ten push-button switches numbered from 1
to 10 were arranged in a row across the front of
the aggression machine and were ostensibly to be
used to deliver shocks of varying intensity to
another person. A Lafayette stop clock (Model 54015)
was used to record shock duration.


Procedure


The procedure followed closely that described by 
Baron and Bell (1977) and as a result will only be 
briefly outlined here. 
After obtaining informed consent, the male
experimenter escorted both the subject and the
confederate to the experimental room, where it was
explained that they would work on an
impression-formation task. For purposes of this task, both
individuals began by writing a brief summary of the
main aspects of their own personalities, then
examined each other’s descriptions, and finally
exchanged their mutual impressions of each other.
Anger manipulation. As in previous studies
(Baron, 1978; Baron & Bell, 1977), the confederate’s
rating of the subject was prepared in advance
by the experimenter in order to manipulate the
subject’s anger toward the confederate. In the angry
condition, the personality rating ostensibly provided
by the confederate was very unfavorable and quite
negative. In the nonangry condition, the rating by
the confederate was more favorable. To enhance the
anger manipulation still further, the confederate
described himself as being “rather intolerant of
others” in the angry condition and, in a more neutral
manner, as “an average sort of guy” in the nonangry
condition.
Aggression phase. Following the impression-formation
task, the subject and the confederate were
seated in a second room where the aggression
machine was located. It was then explained that the
second part of the experiment would now begin,
and was to be concerned with the effects of
exposure to unpleasant stimuli on physiological re-
sponses. During this part of the experiment, one 
person would serve as a responder and receive a 
series of electric shocks delivered by the stimulator. 
Through the use of a rigged drawing, the confed- 
erate was designated to be the responder and the 
subject to be the stimulator. 

The responder was then taken to an adjacent 
room, where the experimenter attached two sets of 
electrodes to the confederate’s hand and wrist in 
full view of the subject. The electrodes were osten- 
sibly to be used to record the responder’s
physiological responses and for the delivery of finger shock.
The recording electrodes, in turn, were attached to a 
polygraph, which the subject could see. 

The experimenter then proceeded to explain to the 
subject his role as the stimulator. It was initially 
made clear that the responder would be trying to 
memorize a list of eight paired-associate trigrams, 
which the stimulator would administer to him using 
the intercom located next to the aggression machine. 
Presumably, the use of this memory task would 
allow the experimenter to obtain more accurate evi- 
dence concerning the effects of unpleasant stimuli on 
physiological responses by effectively distracting the 
responder, and thereby preventing him from trying 
to “get ready” for each shock. Following these re- 
marks, the experimenter informed the subject that 
at certain times a red signal light would be
illuminated, at which time he (the stimulator) was to
deliver an electric shock to the responder by
depressing one of the 10 buttons on the aggression
machine. It was made clear that the occurrence of
the red signal light was random and was in no way 
related to the responder’s performance on the 
paired-associates task. In order to familiarize the
subject with the levels of shock to be used, sample
shocks from buttons 4 and [illegible] were ad-
ministered to the stimulator.

Presentation of the experimental stimuli. Follow-
ing the sample shocks, the subject was told that
there would be a short delay before continuing the
experiment, in order to insure that the responder’s
physiological responses had returned to baseline
levels. To fill this waiting time, the experimenter
indicated that he had been asking subjects to rate
some slides [...] to view and rate the [...]. All
subjects agreed and were then shown one of the four
sets of 10 color slides described in the pilot study.
Each slide was automatically projected once onto a
screen for 20 seconds. Two seconds elapsed between
the offset of one slide and the onset of the next
slide. Following presentation of the stimuli, subjects
completed a Feelings Scale (cf. Pilot Study).

Data from five subjects were not included in the
analysis. Four subjects (two angered and two
nonangered) were suspicious of some aspect of the
experimental manipulations, and one subject [...]
who failed to follow instructions properly.

Since the results of a preliminary analysis indicated
that the type of incentive used to recruit subjects
(i.e., paid vs. course credit) had no significant
effect on either the intensity or duration of shocks
delivered to the confederate, this classification
variable will not be mentioned further.

While the subject was viewing the stimuli and 
reporting his reactions, the experimenter waited in 
the next room, where the confederate was located. 
Two minutes after the slides were presented, the
experimenter informed the subject that the responder 
was ready. The experimental trials were then initi-
ated. The red shock signal was given 20 times during
the session. Both shock intensity and shock duration
were recorded on each of the 20 trials.

No-exposure control. Ten angered subjects did 
not view or rate any of the four sets of stimuli. In-
dividuals in this group
menter would use the a 


polygraph, and no men 
The experimenter then 
the confederate was sea 


ready, and then initia: 


Postexperimental questionnaire and debriefing.


subjects completed a 


the ex, 
anxious). A brief verbal 
administered, followed by 


Results 


Affect and Arousal Responses to the 
Experimental Stimuli 


Statistical procedures used to determine if 
the experimental stimuli produced the in-
tended differentiations in affect and arousal
response followed closely those used for this 
purpose in the pilot study. As in the pilot 
study, subjects’ ratings of themselves as 
sexually aroused, not bored, entertained, ex- 
cited, and anxious, loaded on a common fac- 
tor, and were combined to form a single multi- 
variate F value, which was used to assess the 
effectiveness of the manipulation of positive 
sexual arousal. Analysis yielded the required 
differences, showing only a main effect of 
positive sexual arousal, multivariate F(5, 68)
= 9.37, p < .001. Participants reported being
more sexually aroused, [...] less bored (all
ps < .001) [...] (p < .02) following exposure to
sets of stimuli chosen to elicit high (as opposed
to low) positive arousal: M = 3.44 for high sexual
arousal, and M = 1.85 for low sexual arousal
on a 7-point scale.

Next, subjects’ responses to items relating
to disgust, anger, nausea, guilt, depression,
and self-disappointment were combined to
form a single multivariate F value that was
used to determine the effectiveness of the
manipulation of negative affect. Responses to
these six items each shared high loadings on a
single factor, as in the pilot study. Analysis
revealed only the desired main effect of nega-
tive affect, multivariate F(6, 67) = 6.41, p
< .001. Responses to those stimuli chosen to
elicit high (rather than low) negative affect
were most strongly differentiated with respect
to feelings of disgust (p < .0001); M = 3.[illegible]
for high negative affect, and M = 1.70 for
low negative affect, on a 7-point scale. In ad-
dition, subjects reported greater nausea (p <
.002), depression (p < .002), guilt (p < .01),
anger (p < .02), and self-disappointment (p
< .10) in response to those stimuli chosen to
evoke more intense negative feelings.

Taken as a whole, the results presented
above, in conjunction with those of the pilot
study, strongly indicate that the experimental
manipulations were effective.

Aggressive Behavior 


Subjects’ shock intensity, shock duration,
and a transformed Shock Intensity × Shock
Duration score (composite measure) were
combined to form a single multivariate F
value, which was interpreted as an index of
aggressive behavior toward the confederate.
Table 2 presents the mean level of aggression
(for all three measures) in each of the


* The transformed Shock Intensity × Shock Dura-
tion measure has been used in past studies of human
aggression (e.g., Baron, 1978; Baron & B[...]).
This composite measure was calculated from sub-
jects’ shock intensity and shock duration scores
using a simple transformation, X = [formula illegible],
where x equals the product of shock intensity and
shock duration.
Table 2
Aggression as a Function of Exposure to Erotic Stimuli and Prior Provocation

                            Shock intensity        Shock duration (sec)     Composite measure
Condition                   Nonangry   Angry       Nonangry   Angry         Nonangry   Angry

High sexual arousal/
  high negative affect
  (ambivalent response)     4.66ab     5.29a[?]    .46bc      [illegible]   3.23bc     4.06[?]
High sexual arousal/
  low negative affect
  (positive response)       3.95bc     3.86[?]     .36b       .44b          2.68[?]    2.84b[?]
Low sexual arousal/
  high negative affect
  (negative response)       3.22[?]    5.28[?]     .29[?]     .64[?]        2.37[?]    3.86[?]
Low sexual arousal/
  low negative affect
  (neutral response)        4.43[?]    [illegible] .36b       .51ab         2.84[?]    3.61ac
No-exposure control (a)                5.11                   .44                      3.26

Note. n = 10 within each cell. A higher number indicates more intense shock (on a 10-point scale) and shock
of longer duration (range = .01 to 2.68 sec). For each dependent measure within the eight basic experi-
mental groups, means that do not share a common subscript differ significantly at the .05 level by Duncan’s
multiple-range test.
(a) For shock intensity, high sexual arousal/low negative affect versus no-exposure control, F(1, 27)
= 2.26, p < .05 (one-tailed), by Dunnett’s test.


experimental groups. Multivariate analysis of
variance applied to aggression scores yielded
a significant main effect of provocation, multi-
variate F(3, 70) = 4.67, p < .005, and a main
effect due to negative affect, multivariate F(3,
70) = 3.02, p < .05, which was further quali-
fied by a significant interaction between nega-
tive affect and positive sexual arousal, multi-
variate F(3, 70) = 2.82, p < .05.

As an initial step in further probing these
significant findings, the main effect due to
provocation was examined more closely. Uni-
variate tests revealed that individuals who re-
ceived a negative evaluation from the confed-
erate (relative to nonangered subjects) evi-
denced stronger aggression on each of the
three measures of such behavior: F(1, 72) =
[value illegible], p < .01, for shock intensity; F(1, 72)
= 11.61, p < .001, for shock duration; and
F(1, 72) = 14.37, p < .001, for the com-
posite measure. Thus, it appears that the
manipulation of anger was generally effective.

Turning next to the effect of exposure to
the various types of erotic stimuli on aggres-
sion, univariate tests revealed that the main
effect due to negative affect failed to attain
significance on any of the three dependent
measures of aggression: Fs(1, 72) = .21, 3.10,
and 3.24, all ps > .05, for shock intensity,
shock duration, and the composite measure,
respectively. A closer examination of the sig-
nificant multivariate interaction between pos-
itive sexual arousal and negative affect, how-
ever, yielded a parallel univariate effect for
shock intensity, F(1, 72) = 6.86, p < .05,
and the composite measure, F(1, 72) = [value illegible],
p < .05, but not for shock duration, F(1, 72)
= 1.71, p > .20. Simple effects tests were then
used to probe differences among the four 
group means comprising these significant uni- 
variate interactions. Analysis revealed that ag- 
gressive behavior toward the confederate was 
reduced by exposure to erotic stimuli that 
evoked primarily positive responses, relative 
to the effect of exposure to neutral material; 
F(1, 72) = 4.93, p < .05, for shock intensity;
and relative to ambivalent stimuli; F(1, 72)
= 4.74, p < .05, for shock intensity; and
F(1, 72) = 8.60, p < .01, for the composite
measure. 

Because further inspection of the means 
presented in Table 2 suggested that a more 
complex pattern of results might underlie the
interactive effects presented above, the means 
of all eight groups were compared (for all 
three measures) using Duncan’s multiple 
range tests. As can be seen in Table 2, the re- 
sults of these comparisons revealed that ex- 
posure to erotic stimuli that evoked primarily 
positive affect reduced retaliatory behavior by 
angered individuals (p < .05), but had only
weak or negligible effects on aggressive be-
havior by nonangry persons. Thus, the effect
of exposure to the various types of erotic 
stimuli on subsequent aggressive behavior to- 
ward the confederate was most clearly evident 
for angered males. 

Finally, Dunnett’s test (Dunnett, 1955) 
was used to compare each of the four group 
means within the angry condition against the 
no-exposure control. The results of this analy- 
sis indicated that the intensity of aggressive 
behavior was inhibited, F(1, 27) = 2.26, p <
.05 (one-tailed test), by exposure to erotic
stimuli that led primarily to increases in posi-
tive affect. No further comparisons involving
the no-exposure control attained significance.

In summary, aggressive behavior toward 
the confederate was inhibited by exposure to 
sexually arousing erotic stimuli that evoked 
primarily positive affect and slightly enhanced 
by increases in sexual arousal that were re- 
sponded to with ambivalence. 


Postexperimental Questionnaire 


Analysis of responses to the postexperimen-
tal questionnaire indicated that males who
received a negative evaluation from the con-
federate (as opposed to a neutral one) rated
themselves as more angry, F(1, 72) = 93.17,
p < .001; reported being more negative,
F(1, 72) = 32.91, 
F(1, 72) = 104.16, p < .001, and unpleasant, 


[...] participants reportedly felt there was not a cor-
rect number of high intensity (M = 7.15) or
low intensity (M = 7.21) shocks that they
were supposed to administer to the confed-
erate and reported no obvious intention (M
= 5.71) to deliver shock in a manner that
mimicked the behavior of other participants.
Taken as a whole, subjects’ mean responses
to these items and the failure to find any sig-
nificant effects for these items due to the in-
dependent variables suggest that the experi-
mental manipulations did not operate simply
to evoke an expectation that one pattern of
shock was more correct or desirable than
another.


Discussion 


It appears that exposure to sexually arous-
ing erotic stimuli may sometimes inhibit ag-
gression, and at other times slightly increase
such behavior, depending on how exposure to
such material makes one feel. In the present
study, there was a slight tendency for aggres-
sive behavior to be enhanced by exposure to
arousing erotic stimuli that were reported to
be disgusting and unpleasant. In contrast with
this group, subjects’ exposure to sexual im-
agery reported to be entertaining and posi-
tively exciting decreased retaliatory behavior
by angered individuals to a level that fell
slightly below that of subjects who were not
provoked. Further comparison with a base-
line control group strongly suggested that ex-
posure to such erotica acted to inhibit aggres-
sive behavior by angry males.

The inhibition of motivated aggression by
exposure to positively arousing erotic stim-
uli observed in the present study may be
viewed as corroborating and extending earlier
findings (e.g., Baron, 1974a, 1974b; Baron &
Bell, 1977; Zillmann & Sapolsky, 1977).
Moreover, since only those stimuli which were
labeled and experienced as entertaining and
exciting led to reductions in retaliatory be-
havior, it appears that subjects’ affective re-
sponse to such material is a critical factor
leading to the inhibition of aggression by
erotica.


Alternative Explanations for the Major 
Findings 

Attentional shift. The pattern of results
described above may be viewed as
failing to support the hypothesis (e.g., Don-
nerstein, Donnerstein, & Evans, 1975) that
exposure to erotic stimuli inhibits aggressive
behavior by shifting subjects’ attention away
from the source of their anger. Based on the
arguments presented by Donnerstein et al.,
it would seem reasonable to expect that ex-
posure to virtually any type of mildly arous-
ing erotic stimuli would be more distracting
than sexually arousing and, therefore, would
be effective in reducing subsequent aggression.
In the present study, however, exposure to
seemingly attention absorbing erotic imagery
that led to heightened negative affect not only
failed to reduce aggression but tended slightly
to enhance subsequent aggressive behavior.
Cognitive labeling. One explanation for the
results of the present study may be that sub-
jects’ contrasting affective responses to arous-
ing erotic imagery led to different levels of
aggression by modifying the type of label
males applied to any arousal they experienced
(Baron, 1977; in press). This interpretation
is in keeping with other work (e.g., Geen,
Rakosky, & Pigg, 1972; Zillmann & Bryant,
1974) suggesting that whether or not a given
source of arousal will act to facilitate or to
inhibit subsequent aggression depends on a
person’s cognitions regarding the source of
his or her arousal. Thus, in the present study,
following exposure to erotic stimuli that
evoked primarily positive responses, males
may have been more inclined to label any
arousal they experienced in a positive man-
ner (e.g., as sexual excitement or entertain-
ment, rather than as anger), with the result
that subsequent retaliatory behavior was re-
duced. In contrast, following exposure to
sexual material that was reported to be un-
pleasant and disgusting, subjects may have
been more willing to interpret any arousal
they experienced as anger (rather than re-
sulting from exposure to the erotic stimuli),
with the result that aggressive behavior to-
ward the confederate was slightly facilitated,
or at least was not inhibited.

The cognitive interpretation presented above 
assumes that the facilitation of aggression by 
erotica requires an increase in arousal deriv-
ing from exposure to such stimuli and the
adoption of a cognitive label appropriate to
the expression of aggression (cf. Zillmann &
Sapolsky, 1977). In line with this view, 
simple increases in negative affect resulting 
from exposure to erotica (negative response) 
were not found to enhance subsequent aggres- 
sion in the present study. Continuing this 
line of reasoning, it is less clear, however, why 
exposure to reportedly ambivalent material 
that led to increases in both sexual arousal 
and negative affect failed to increase subse- 
quent aggressive behavior more significantly. 
While a number of reasons may be offered to 
explain this result (e.g., dissipation of nega- 
tive affect after rating the stimuli), it seems 
reasonable to suggest that the ambivalent 
erotic imagery used in the present study 
simply failed to produce a sufficient degree of 
sexual arousal (cf. Baron & Bell, 1977). As 
such, exposure to this set of stimuli may not 
have generated a sufficiently intense excita- 
tory reaction (i.e., arousal response) whose 
residue could intensify subsequent aggressive
behavior (cf. Zillmann, 1971). In support of 
this interpretation, a closer examination of 
the present results revealed that angered in- 
dividuals who reported the highest levels of 
sexual arousal following exposure to ambiva- 
lent stimuli were subsequently found to be 
the most aggressive: t(8) = 3.20, p < .02, for
shock intensity. In general, however, subjects 
reported only moderate levels of sexual ex- 
citement following exposure to such material 
(M = 3.25, on a 7-point scale) and did not 
exhibit heightened aggressive attacks against 
the confederate. 

Incompatible responses. Although ambiv- 
alent erotic depictions may not have been 
sufficiently arousing to enhance subsequent 
aggression, exposure to affectively positive 
and arousing erotic stimuli was found to in- 
hibit aggression as was predicted. As discussed 
above, while exposure to such erotica may 
have effectively reduced aggressive behavior 
by affording subjects more positive cognitive 
labels, other, though not necessarily incom- 
patible, explanations are also possible. In par- 
ticular, positive responses of pleasure, enter- 
tainment, and excitement deriving from ex- 
posure to sexual stimuli may be hedonically 
incompatible with feelings of anger and/or 
the performance of aggressive acts and con-
sequently may inhibit such behavior directly
(e.g., Baron, 1974a, 1974b, 1976, 1978; Zill- 
mann & Sapolsky, 1977). That is, once an 
angry individual becomes immersed in pleas- 
ant and entertaining erotic stimuli, it may be 
difficult to maintain heightened feelings of 
annoyance or to engage in intense retaliatory 
behavior. Following the implications of this 
analysis, it may be that exposure to virtually 
any pleasant stimulus (e.g., art, scenery) is 
capable of reducing aggressive attacks by 
angry individuals. In line with this interpreta- 
tion, exposure to enjoyable and entertaining 
cartoons has been found (Baron, 1978; Baron 
& Ball, 1974) to reduce aggressive behavior 
by angry persons. 


Conclusion 


In summary, the present study serves to 
extend previous research by demonstrating 
that the effect of exposure to erotic stimuli on 
subsequent aggression depends, to an impor- 
tant extent, on the type of affective response 
elicited by exposure to such stimuli. One use-
ful direction for extending the results of the
present study and related research would ap-
pear to be to explore further the efficacy of
training individuals who have difficulty con-
trolling their anger to emit pleasant images
and other incompatible behaviors in response
to provocation, as a means of inhibiting anger
and reducing the intensity of retaliatory be-
havior (cf. Baron, 1977; Novaco, 1977;
Smith, 1973).


Out of convenience, male subjects and male con-
federates were used in the present study, which
leaves open the question of whether exposure to
the various types of erotic stimuli used in the
present study would have similar effects on ag-
gressive behavior toward males and females and/or
by females toward these target persons. Although
empirical research that examines the combinations of
aggressors and victims described above may be re-
quired to answer these questions definitively for
specific types of stimuli, a number of studies are
available that may provide at least a partial answer
to these concerns. With respect to the sex of the
victim, exposure to erotic material has been found
(Donnerstein & Barrett, 1978; Jaffe, Malamuth,
Feingold, & Feshbach, 1974) to have similar effects
on aggressive behavior toward males and females.
Further, it has been argued (Baron, in press) that
exposure to sexual imagery that evokes similar pat-
terns of affect and arousal responses in males and
females will lead to similar levels of subsequent
aggression. Thus, on the basis of the evidence
available, it may be tentatively assumed that the
conclusions of the present study would apply inde-
pendently of the sex of the aggressor or the sex of
the victim. It should be emphasized that the specific
characteristics of those stimuli (e.g., thematic con-
tent) which evoke positive, negative, ambivalent,
and neutral responses may differ greatly for males
and females (e.g., Schmidt & Sigusch, 1970).
Predictions of Others’ Responses in a Mixed-Motive Game:
Self-Justification or False Consensus? 


Lawrence A. Messé and John M. Sivacek 
Michigan State University 


This study examined speculations by Dawes, McTavish, and Shaklee concerning 
the extent to which persons use their own responses in mixed-motive situations 
as a basis for predicting the behavior of others. Female subjects played a one- 
trial Prisoner’s Dilemma game and predicted the responses of their partner and 
a person in another dyad. As predicted, subjects, irrespective of the particular 
choice they had made, tended to attribute their own response to others. In 
many subjects, this attribution appeared to be self-justifying, in that it was 
specific to the subjects’ partner. However, it was even more frequently the case 
that subjects attributed their own response to both partner and nonpartner (i.e., 
a false consensus), although they also tended to be more confident about their 


prediction of their partner's choice. 


Dawes, McTavish, and Shaklee (1977) re- 
cently speculated that behavior in some 
mixed-motive situations might not be a con- 
sequence of an attribution about the other(s), 
but, rather, that the reverse relationship might 
be operating—that is, the attribution about 
the other’s behavior is a consequence of the 
subject’s own behavior. Dawes et al. used a 
multiperson one-trial dilemma to examine the 
validity of their position. They compared 
the attributions to others that were made by 
participants in a commons dilemma to the 
attributions that were made by observers who 
did not participate in the situation but merely 
predicted the behavior of those who did. They 
expected that participants would be more 
variable in their attributions than would ob- 
servers, given that the former should base 
their judgments of others on their own be- 
havior. Their findings did support this be-
havior-to-attribution hypothesis.


This research was conducted while the second au-
thor was a National Institute of Mental Health [...]
[...] David Lovell, John Lovell, and Steven Fox for [...]
Requests for reprints should be sent to Lawrence [...]

Unfortunately, the methodology that Dawes
et al. employed did not permit the examina-
tion of more specific components of their
argument. For example, they speculated that
the attributions of persons are influenced by
their own actions, irrespective of whether they
compete or cooperate. While Dawes et al.’s
results suggest that this was the case in their
study, their design and analysis did not per-
mit separate examination of the data of
subjects who cooperated and of those who
competed. Moreover, prior studies that have
examined this hypothesis with [...] of
short duration (e.g., Kanouse & Wiest, 19[...];
Terhune, 1968) have yielded somewhat incon-
sistent findings, although, on the whole, they
support the notion that there is a great deal
of correspondence between a subject’s choice
and the behavior he or she attributes to
others.

Dawes et al. (1977) also suggested (but
did not examine) the notion that one or both
of two mechanisms might account for the hy-
pothesized tendency in subjects in mixed-mo-
tive situations to base their judgments about
others on their own behaviors. They suggested
that subjects might engage in this attribu-
tional process in order to justify their re-
sponses; subjects who compete might feel the
need to justify what could be considered ex-
ploitative and antisocial behavior, and sub-
jects who cooperate might attribute a tend-
ency toward similar behavior to others in
order to “avoid feeling duped.” In addition,
they suggested that subjects might use their
own behavior as a guide for guessing “what
everybody does.” This latter explanation has
been termed the “false consensus” phenome-
non (see Ross, Greene, & House, 1977).

It should be noted that Dawes et al. em- 
ployed the concept of self-justification some- 
what narrowly. Ross et al. (1977), in their 
discussion of false consensus judgments, sug- 
gested that such behavior could be based on a 
number of underlying processes, including the 
need to justify oneself. Clearly, if self-justifi- 
cation is a salient motive, it is most efficiently 
expressed by attributing one’s own behavior 
only to one’s partner. But, in a larger sense, 
this narrow attribution might not be sufficient, 
perhaps because one needs to believe that 
one’s partner (who often is a stranger) is just 
like everyone else, or perhaps because there is 
greater comfort in thinking that one is not 
atypical in general. On the other hand, other 
explanations for the false consensus phenome- 
non also are plausible—for example, that one 
uses his or her own behavior as the best evi- 
dence (perhaps as the only evidence) of what 
people in general do. In any event, it seemed 
of interest to attempt to examine the extent 
to which persons in short-duration mixed- 
motive situations appear to use their predic- 
tions of others to self-justify in the narrow 
sense suggested by Dawes et al. (1977) or to 
make more general predictions about the be- 
havior of others, for whatever reason. 

Thus, the present study examined the be- 
havior and predictions of subjects in a one- 
trial dilemma to explore some of the issues 
proposed but left unresolved by Dawes et al.
First, we examined the extent to which both
persons who competed and persons who co-
operated attributed their own responses to
other participants. Second, we used a pro-
cedure that permitted differentiation between
the false consensus and (narrow) self-justi-
fication hypotheses. In the present study, all
subjects served as both participants and ob-
servers. Thus, they responded in a Prisoner’s
Dilemma and made predictions about the
responses of both their partner and a person
with whom they were not paired. This within- 
subjects procedure permitted examination of 
both the self-justification and false consensus 
explanations. To the extent that self-justifica- 
tion is a valid explanation, we would expect 
that subjects would attribute to their partner 
a response similar to their own, but their
response would not be related to their predic- 
tion about the person with whom they were 
not paired. On the other hand, to the extent 
that false consensus is a relevant phenomenon, 
we would expect that subjects would be 
likely to attribute responses that were similar 
to their own to both their own partner and 
the individual with whom they were not 
paired. 
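
The two explanations make different predictions about the pattern formed by each subject's own choice and her two predictions. A minimal sketch of that logic follows; it assumes responses are coded as the strings "cooperate" and "compete", and the function name, the category labels, and the sample tuples are hypothetical illustrations rather than the authors' analysis or data.

```python
from collections import Counter


def attribution_pattern(own, pred_partner, pred_other):
    """Label how a subject's predictions relate to her own choice.

    own, pred_partner, pred_other: "cooperate" or "compete" (assumed coding).
    """
    same_as_partner = pred_partner == own
    same_as_other = pred_other == own
    if same_as_partner and same_as_other:
        # Own response attributed to both partner and nonpartner.
        return "false consensus"
    if same_as_partner:
        # Own response attributed to the partner only.
        return "narrow self-justification"
    if same_as_other:
        return "attributed to nonpartner only"
    return "attributed to neither"


# Hypothetical responses for illustration only (not data from the study).
responses = [
    ("compete", "compete", "compete"),
    ("cooperate", "cooperate", "compete"),
    ("cooperate", "compete", "compete"),
]
counts = Counter(attribution_pattern(*r) for r in responses)
print(counts)
```

Tabulating these labels separately for subjects who cooperated and subjects who competed would give the kind of breakdown that the within-subjects procedure makes possible.
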
Method 


Subjects 


One hundred seventy-two female students from 
introductory psychology classes at Michigan State 
University volunteered to participate in the research 
for extra course credit. In addition, as explained be- 
low, they also received up to 80¢ each, depending 
upon the decision that they and their partners made 
when playing a one-trial Prisoner’s Dilemma (PD) 


game. 


Design 

In addition to making a “cooperative” or “com-
petitive” choice in the PD game, subjects also made
predictions about their partner’s (P) choice and the
choice of another person (O) with whom they were
not paired. The order in which they made these deci-
sions was varied such that half the subjects made
their own choice first, whereas the remainder first
made a prediction about the decision of one of the
other people. Also, the order of the predictions was
varied such that half the subjects predicted their
partner’s choice first, whereas the other half pre-
dicted the nonpartner’s choice first. This procedure
generated a design whose dimensions were 2 (sub-
jects’ choice: cooperate or compete) × 2 (order of
subjects’ choice: before or after their first prediction)
× 2 (order of predictions: about P first or about O
first). The latter two factors were included primarily 
for reasons of experimental control. In actuality, we 
were primarily interested in (a) the subjects’ own 
choice and (b) the extent to which they attributed 
the same choice to P and/or O. 


Procedure 

Subjects were examined in groups of four. When 
the four subjects in a session had assembled, each 
was given 30¢ and was paired at random with a 
male experimenter. One of the experimenters shuffled 
Table 1
Outcome Values (in cents) in Prisoner's
Dilemma Game

                        Person B's choice
Person A's choice       Cooperate     Compete

Cooperate               25, 25        -25, 50
Compete                 50, -25       -10, -10

Note. The first value in each cell is the outcome to
Person A.


a deck of four cards (numbered 1, 2, 3, and 4) and 
dealt one face down to each subject. Then, each ex- 
perimenter escorted the subject assigned to him to 
one of four cubicles. From this point on, subjects had 
no further contact with each other. 

Once in the cubicle, the experimenter seated the 
subject at a small table and presented her with a 
“typical” 2 × 2 PD matrix. (Table 1 presents the
values that were contained in the matrix.) The ex- 
perimenter explained in detail the nature of the one- 
trial PD game—there are two players, each has two 
choices, and so on. At the end of his explanation, 
which was a script that he had memorized, the ex- 
perimenter made sure that the subject understood 
the nature of the game by soliciting questions and 
reiterating points about which the subject appeared 
uncertain. The experimenter then informed the sub- 
ject that the four of them had been divided at ran- 
dom into two pairs and that she had been paired 
with the person who had drawn “Card Number X” 
(where X was a number on one of the cards that 
had been dealt to the other subjects). 
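
The outcome structure in Table 1 can be written as a small lookup. The sketch below is illustrative only: the names are hypothetical, and the assumption that a subject's final payment equals the 30¢ given at the start of the session plus her game outcome is an inference from the stated maximum of 80¢, not something the text spells out.

```python
# Outcome values (in cents) from Table 1, keyed by (A's choice, B's choice).
PAYOFFS = {
    ("cooperate", "cooperate"): (25, 25),
    ("cooperate", "compete"): (-25, 50),
    ("compete", "cooperate"): (50, -25),
    ("compete", "compete"): (-10, -10),
}

ENDOWMENT = 30  # cents given to each subject at the start of the session


def outcome(choice_a, choice_b):
    """Return (outcome to A, outcome to B) in cents."""
    return PAYOFFS[(choice_a, choice_b)]


def final_payment(choice_a, choice_b):
    """Assumed rule: endowment plus game outcome for each player."""
    out_a, out_b = outcome(choice_a, choice_b)
    return ENDOWMENT + out_a, ENDOWMENT + out_b


def is_prisoners_dilemma(payoffs):
    """Check the standard ordering T > R > P > S and 2R > T + S."""
    reward = payoffs[("cooperate", "cooperate")][0]
    sucker = payoffs[("cooperate", "compete")][0]
    temptation = payoffs[("compete", "cooperate")][0]
    punishment = payoffs[("compete", "compete")][0]
    return temptation > reward > punishment > sucker and 2 * reward > temptation + sucker


print(final_payment("compete", "cooperate"))
print(is_prisoners_dilemma(PAYOFFS))
```

Under the assumed payment rule, the largest possible payment is 30 + 50 = 80¢, which agrees with the amount reported in the Subjects section.
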

At this point, depending upon the condition to 
which she had been randomly assigned, the subject 
was asked (a) to make a choice of PD alternatives— 
with the understanding that her choice, coupled with 
that of her partner, actually determined payment to 
both—or (b) to predict her partner’s PD choice, or 
(c) to predict the PD choice of one of the persons 
in the other pair. In addition, whenever a subject 
made a prediction about her partner’s or the other 
person’s choice, she was asked immediately after- 
wards to indicate on a scale of 50 to 100 how con- 
fident she was in her prediction. She was told to 
mark a 50 (indicating a “50-50 guess”) if she was 
merely guessing, a 100 if she was completely con- 
fident, or an appropriate number in between to in- 
dicate an intermediate degree of confidence. Depend- 
ing on the nature of the first task, the subject then 
was asked either to make her own choice and the 
remaining prediction (and confidence rating) or to 
make the two predictions (and confidence ratings) in 
the order dictated by the experimental design. 
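
The four task orders implied by the two order factors can be enumerated as follows. This is a sketch under the reading that a subject who predicts first makes her own choice second and the remaining prediction third; the function name and task labels are hypothetical.

```python
from itertools import product


def task_sequence(own_choice_first, partner_predicted_first):
    """Return the three tasks in the order a subject would complete them."""
    predictions = ["predict partner (P)", "predict nonpartner (O)"]
    if not partner_predicted_first:
        predictions.reverse()
    if own_choice_first:
        return ["own choice"] + predictions
    # Own choice comes after the first prediction and before the second.
    return [predictions[0], "own choice", predictions[1]]


# Every combination of the two order factors.
conditions = {
    (own_first, partner_first): task_sequence(own_first, partner_first)
    for own_first, partner_first in product([True, False], repeat=2)
}

for key, sequence in conditions.items():
    print(key, sequence)
```

Each sequence contains the same three tasks; only their order changes, which is why these factors serve mainly as experimental controls.
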

It should be noted that the subject was not in- 
formed that she was to make a series of decisions; 
that is, she was told about the second task only 
after she had completed the first, and about the third 
task only after she had completed the second. It also
should be noted that the subject made her judgments
in private. The experimenter did not observe her
while she was making her decision or her predictions
and confidence ratings. In each instance, she was
told to fold the card on which she had marked her
response(s) before returning it to the experimenter.

When the subject had completed the three tasks,
the experimenter excused himself for a moment to
consult with the experimenter who had been assigned
to her partner. Together they determined the outcome
of the one-trial PD game for the pair. Each
then returned to his respective subject and adjusted
her payment accordingly. Every subject was debriefed,
sworn to secrecy, and thanked for her participation.
[...] a way that she did not come into contact with the
other participants in her session.


Results


The subjects’ own PD choices and their
predictions about the choices of P and O
were examined via chi-square analyses. Similarly,
their confidence ratings were examined
through analysis of variance (ANOVA). Preliminary
analyses were performed to explore
the possibility that the order in which subjects
performed the three tasks influenced
their responses. Then, the data were cast and
analyzed in a manner that most directly addressed
Dawes et al.’s (1977) speculations.


Examination of Possible Order Effects 


Three chi-square tests were calculated to
explore the possibility that the four combinations
of order of own choice and order of predictions
about others’ choices influenced the
subjects’ responses in the PD game, their predictions
about their partners’ choices, and their
predictions about the choice of one person in
the other dyad. None of the obtained chi-squares
reached significance, χ²s(3) = [...],
1.08, 4.88, respectively, ps > .20. Likewise, a
preliminary ANOVA indicated that
relative confidence in a prediction was not affected
by order of PD decision, F(1, [...])
= .12, p > .70; order of predictions, F(1, [...])
= .70, p > .40; or their interaction, F(1, [...])
= 0, p > .99. Thus, order of response was
not a [...] in [...]
disregarded in subsequent analyses. It is likely
that response orders had little effect, because
subjects probably rehearsed their own choice
even before they were explicitly told that
they were to make one.


¹ Note that 7 subjects had missing data, [...]
they did not give one or both confidence ratings
and/or experimenters failed to record their [...]
conditions. Thus, the total sample size for these [...]
was 165.


Tests of Dawes et al.’s Speculations 


As noted above, Dawes et al. (1977) specu- 
lated that subjects predict that others will act 
in a PD situation as the subjects themselves
did, irrespective of the specific choice that the 
subjects made (i.e., irrespective of whether 
they competed or cooperated). In addition, 
they also speculated that the tendency to 
attribute one’s own behavior to others might 
be due to one or both of two mechanisms, 
self-justification and false consensus. The 
data and analyses presented below bear directly
on these speculations.

Predictions of subjects who cooperated or
competed. Table 2 presents the proportions
of the 73 subjects who chose cooperatively
and the 99 subjects who chose competitively
who attributed similar or dissimilar responses
to their partner and to the person in the other
dyad.² These data provide strong support for
Dawes et al.’s (1977) speculation, since there
was little difference in the frequencies with
which persons who cooperated and persons
who competed attributed their response to
their partner, χ²(1) = .39, or to the other person,
χ²(1) = .92. In addition, there was a
strong tendency for subjects to attribute the
same response as their own to their partner,
χ²(1) = 38.82, p < .001, and, to a somewhat
lesser extent, to the other person, χ²(1) =
6.44, p < .025.


Table 2
Proportions of Competitive and Cooperative
Subjects Who Predicted That Others Would
Make the Same or a Different Response as
Their Own

                               Predicted response
                             Same        Different
Subject’s response           as own      from own

Prediction about partner
  Competition                 .72          .28
  Cooperation                 .76          .24
  Total sample                .74          .26

Prediction about person in other dyad
  Competition                 .63          .37
  Cooperation                 .56          .44
  Total sample                .60          .40


Table 3
Proportions of Subjects Who Made Similar
or Dissimilar Predictions About Their
Partner’s and the Other’s Responses

                             Prediction about
                             partner’s response
Prediction about             Same as     Different
other’s response             subject     from subject

Same as subject               .44          .16
Different from subject        .30          .10

Examination of the self-justification and 
false consensus predictions. Dawes et al. 
(1977) speculated that the attribution of 
one’s own behavior to one’s partner probably 
was based on one or both of two mechanisms: 
an attempt to justify one’s decision and the 
tendency to use one’s own behavior as the 
basis for generalizing about everyone’s be- 
havior. To examine the validity of these pre- 
dictions, data were organized by tabulating 
the number of subjects who attributed their 
own behavior to both their partner and the 
other person, the number who attributed such 
behavior only to their partner, the number 
who attributed their own response only to the 
other, and the number who attributed their 
response to neither party. (Since there was 
no difference in the attributions of similarity 
by competitors and cooperators, the data were 
collapsed across this variable.) Table 3 pre- 
sents these data (in the form of proportions). 

A chi-square test of this array (McNemar’s
test for correlated proportions; see Hays,
1963, p. 602) yielded a significant value,
χ²(1) = 7.29, p < .01. Thus, there was a
greater tendency to attribute similar behavior
to one’s partner than to a nonpartner.
Although this finding provides support for
the self-justification hypothesis, the false consensus
hypothesis also appears to be supported
to some extent, since the frequency
with which subjects attributed their
own behavior to both P and O was significantly
greater than the frequency with which
subjects attributed such behavior only to
their partner, χ²(1) = 4.50, p < .05.


² Note that data are presented in the tables in the
form of proportions, to make the comparison of the
data of subjects who cooperated and of those who
competed more straightforward, given that competitive
responses were made somewhat more frequently,
χ²(1) = 3.70, p < .075. Of course, in every
case the chi-square analyses were performed on the
raw frequencies.


Table 4
Mean Confidence Ratings of Subject’s
Predictions of Partner’s and Other’s Responses

                                        Confidence in
                                        prediction of
Prediction about partner                Partner    Other

Same response as subject (124)           69.38     63.93
Different response from subject (44)     62.92     64.94

Note. Scores could range from 50 (indicating no confidence)
to 100 (indicating complete confidence).
Numbers in parentheses indicate cell frequencies.
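
The two chi-square values reported for the Table 3 array can be checked with a short Python sketch. It rests on assumptions that are not in the article: only proportions are reported, so the cell counts below are inferred by multiplying the Table 3 proportions by the 172 subjects and rounding, and the function names are illustrative. The article also does not say whether a continuity correction was applied; the sketch applies one to McNemar's test and none to the second comparison, because that combination happens to agree with the reported figures.

```python
# Cell counts are NOT reported in the article. They are inferred here from
# the Table 3 proportions (.44, .30, .16, .10) times N = 172, then rounded.
both = 76          # own response attributed to partner and to other
partner_only = 52  # attributed to partner only
other_only = 27    # attributed to other only
neither = 17       # attributed to neither


def mcnemar_chi2(b, c, corrected=True):
    """McNemar chi-square (1 df) from the two discordant cell counts."""
    diff = abs(b - c) - (1 if corrected else 0)
    return diff ** 2 / (b + c)


def equal_split_chi2(x, y):
    """Chi-square (1 df) testing whether two frequencies are equal."""
    return (x - y) ** 2 / (x + y)


# Partner vs. nonpartner: are "same as own" attributions equally frequent?
partner_vs_other = mcnemar_chi2(partner_only, other_only)

# "Both" vs. "partner only": the comparison used for false consensus.
both_vs_partner_only = equal_split_chi2(both, partner_only)

print(round(partner_vs_other, 2), round(both_vs_partner_only, 2))
```

Under these assumed counts the two statistics come out at about 7.29 and 4.50, in line with the values in the text. Other roundings of the proportions would give slightly different results, so the agreement supports the inferred counts without proving them.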


Analysis of Confidence Scores 


Recall that subjects were asked to indicate 
how confident they were in each of their pre-
dictions. These data were analyzed via a 2
(prediction about partner) × 2 (prediction
about other) × 2 (target of confidence rat-
ing: partner/other) ANOVA with repeated
measures on the third factor.³ This analysis
yielded one significant effect: the Prediction 
About Partner x Target of Confidence Rating 
interaction, F(1, 164) = 6.76, p < .01. Table 
4 presents the relevant cell means. Subsequent 
simple effects analyses (Winer, 1971, pp. 347- 
351) of these data revealed that subjects who 
predicted that their partners would behave
as they did expressed more confidence in
their prediction about their partner than in
their prediction about the person in the other
dyad, F(1, 164) = 7.15, p < .01, irrespective
of whether they predicted that the nonpartner 
would respond similarly to, or differently 
from, them. However, subjects who predicted 
that their partner would choose differently
from them were no more confident in their
prediction about their partner than in their 
prediction about the other person, F(1, 164) 
= .89. Thus, these findings indicated that 
even those subjects who predicted that both 
P and O would choose as they did were more 
confident in their predictions about P than 
about O. 


Discussion 


The results provide strong support for
Dawes et al.’s (1977) speculation that people
predict that others will behave as they themselves
did in one-trial PD situations, irrespective
of whether their behavior was cooperative
or competitive. Moreover, this attributional
phenomenon appears to be due to both
a need to justify one’s response and a tendency
to think of one’s own behavior as typical
of everyone. As noted above, however, Dawes
et al. employed the concept of self-justification
somewhat narrowly. It could be, in fact,
that those subjects who attributed their own
response to both partner and other did so in
an attempt to justify their own behavior
(e.g., “I’m just doing what my partner and
everyone else in this situation would do”).
The finding that these subjects had more
confidence in their prediction of their partner’s
behavior than they had in their prediction
of the other’s behavior lends indirect support
for this interpretation. This result suggests
that subjects might have possessed a
greater need to believe that their partner
would choose as they did than to believe that
someone whose behavior was irrelevant to
them would do so.

Two related findings of the present research
appear to merit special attention;
they suggest that, for whatever reason, the
judgments of subjects were not entirely rational.
As noted above, subjects were more
likely to predict that their partner would
behave as they did than they were to attribute
this choice to the nonpartner; also, they had
greater confidence in the attribution of a
similar choice when they made this prediction
about their partner. These results occurred
in spite of the fact that the procedure used
precluded the possibility that subjects could
objectively differentiate their partner from
the nonpartner. Recall that in a given session,
all four subjects were led into separate cubicles,
and they could know only that their
partner and the other person about whom
they made a prediction were two of the three
people in the other cubicles. There was no
way that they could identify these two people
specifically. Thus, subjects had no objective
basis for making differential predictions or
confidence ratings; yet they tended to do so.
These findings suggest that there is some
within-subject process (or processes)—a need
to justify one’s own behavior, a need to feel
in control (Langer, 1975), and so forth—that
produce(s) the illusion in subjects that the
two targets of their predictions are identifiably
different.


³ Preliminary analyses revealed no main effects or
interactions for subjects’ own choice (compete or
cooperate). Thus, the data were collapsed across this
factor. It should also be noted that 4 subjects failed
to provide confidence scores for their predictions.
This analysis, therefore, was based on data from 168
subjects.


The systematic study of the relationship
between people’s behaviors and their predictions
of others’ behaviors in mixed-motive
situations began with Kelley and Stahelski’s
(1970) examination of the “triangle” hypothesis.
They proposed that people who compete
in mixed-motive situations do so because they
have a biased perception that other people
are competitive; on the other hand, cooperators
are less likely to stereotype in this
fashion, and thus, their behavior is influenced
by the actual behavior of the other person.
Dawes et al. (1977) proposed their explanation
in contrasting position to Kelley and
Stahelski’s triangle hypothesis. They concluded
that their findings—and most likely
they would conclude that our findings as well—disconfirm
Kelley and Stahelski’s [hypothesis].
In fairness, however, it should be
noted that this conclusion is somewhat ex[treme].
[Although] there is ample evidence (e.g., the
present study; Dawes et al., 1977; Kanouse
& Wiest, 1967; Miller & Holmes, 1975) that
the triangle hypothesis is not valid in all
mixed-motive situations, Kelley and Stahelski
(1970) themselves noted that biased perceptions
should affect responses only under limited
circumstances. And there is evidence
(e.g., their own study; Miller & Holmes,
1975) that the triangle hypothesis is valid in
substance under the appropriate conditions
(i.e., mixed-motive situations of extended
duration in which contingent cooperation is
both rational and congruent with social
norms; see, for example, Luce & Raiffa, 1964,
p. 101).

Thus, taken as a whole, the present and 
previous studies provide evidence that no 
single explanation can account for the link 
between behavior and attributions in all 
mixed-motive situations. Instead, the pro- 
cesses that underlie this linkage appear rea- 
sonably complex. Therefore, attempts to dem- 
onstrate the general utility of one perspective 
over another seem to be a less fruitful pursuit 
than are attempts to understand the condi-
tions under which the different processes pro-
posed by these perspectives operate.


Two experiments examined the effects of various operations of personal control
on reactions to stress. The first study incorporated two features into the blood- 
drawing procedure at a blood bank: providing donors with accurate information 
and allowing donors to choose the arm to be used. Measurement of nurses’ 
actions to prevent donors from fainting and self-reports of discomfort revealed 
that the combination of choice and information was somewhat effective in re- 
ducing distress. However, providing either information or choice alone was more 
effective. In a second laboratory study using a cold pressor stimulus as stressor, 
subjects given a choice (the option to terminate the aversive stimulus and 
choice of hand used) showed a reduction of aftereffects on a measure of atten- 
tion to detail. Subjects given information but not choice also showed this reduc- 
tion. Combining information and choice was no different from either treatment 
alone. Taken together, the results of both studies indicate that moderate levels 
of choice and information are optimal for coping with stress. An explanation 
was suggested based on a contextually determined relationship among choice,
information, and perceived control.


Laboratory and field studies have shown 
that negative effects of threatening stimuli are 
frequently lessened when subjects believe that 
they can escape or avoid them (e.g., Glass & 
Singer, 1972; Seligman, 1975) or when sub- 
jects are provided with information about 
the threatening event (e.g., Johnson, 1973;
Staub & Kellet, 1972). In addition, providing
subjects with options or choices may also
affect behavioral and physiological stress reactions
(e.g., Corah & Boffa, 1970; Langer
& Rodin, 1976; Stotland & Blumenthal,
1964). These operations have been conceptualized
as forms of personal control because
they presumably allow subjects to alter or
affect outcomes. Recent investigations have
explored the applications of personal control
to naturalistic health care situations that are
inherently stressful. In this regard, information,
participation, and choice have been implicated
as important variables that may influence
outcomes in both health care
settings and the institutional environment
(Krantz & Schulz, in press; Langer & Rodin,
1976; Leventhal & Everhart, 1978).


Situational Context and Control Interventions


A recent review, however, concluded that
various operations of personal control are related
to stress in a complex fashion (Averill,
1973). The particular situational context [and]
the meaning of the control response to the individual
determines whether it will be effective
in reducing stress. This conclusion is
supported by studies demonstrating an interaction
among some operations of personal
control. For example, Corah and Boffa (1970)
manipulated the ability to escape (terminate)
or not escape a loud noise. Half of the subjects
were also given instructions that heightened
their sense of choice about what to do,
and the remainder were not given a choice.
Results indicated that ability to escape was
stress reducing. Choice reduced stress for the
no-escape subjects, but was not effective (and
even increased rated discomfort) for subjects
given an escape response. Other research indicates
that under certain conditions, making
responses available to subjects or providing
information may actually be stress inducing
(Averill & Rosenn, 1972; Epstein, 1973). One
conclusion that can be drawn from this research
is that the situational context affects
[...] of interventions. Therefore, to
[...] the specific effects of real-life operations
of personal control, it is necessary to conduct
[...] in naturally occurring settings.

The first study in this article examines the
[...] of personal control on
[...] reactions to giving blood (phlebotomy).
Blood donors were selected as the
[...] population because giving blood is a
[...] phenomenon that lends itself easily
to experimental interventions. We manipulated
information and choice because both
these variables are of theoretical interest in
medical settings.


Effects of Information and Choice


Information has been conceptualized as a
form of cognitive control because it often results
in the interpretation of an aversive event
so that threat is lessened. Research suggests
that [...] information about sensations
is effective in reducing distress (cf.
Johnson, 1973), presumably because it increases
ability to prepare for and validate experiences
with aversive events. Sensory information
has been shown to reduce distress
associated with noxious medical examinations
and unpleasant procedures (Johnson & Leventhal,
1974; Fuller, Endress, & Johnson, Note
1) as well as to facilitate recovery from surgery
(Johnson, Rice, Fuller, & Endress, Note
2).

Participation and choice often lead to an 
increase in perceived control. Choice is seen 
as a form of control because it may provide 
subjects with the perception (correct or not) 
that they can have an effect on outcomes. 

Both choice and participation have also 
been described as effective means of facilitat- 
ing favorable health outcomes and reducing 
distress associated with various medical pro- 
cedures (Cromwell, Butterfield, Brayfield, & 
Curry, 1977; Krantz & Schulz, in press; 
Taylor & Levin, 1976). Langer and Rodin 
(1976), in a study of institutionalized elderly 
people, found that providing elderly residents 
with the freedom to make choices about daily 
matters and enhancing their sense of responsi- 
bility for rather routine events resulted in 
heightened activity, happiness, and well-being. 
Moreover, a long-term follow-up found that 
the choice and responsibility interventions 
had sustained positive effects on measures of 
health and mortality (Rodin & Langer, 1977). 


Rationale and Hypotheses 


Although recent discussions of behavioral 
approaches to health care encourage the ac- 
tive involvement of clients in their own 
treatment (e.g., Cromwell et al., 1977; Taylor 
& Levin, 1976), there has been relatively 
little systematic investigation of the effects 
of various types of patient control. In a 
blood-bank setting, allowing donors to choose 
the arm to be used was a means of increasing 
donor involvement that could easily be in- 
corporated into the ongoing procedure. Pro- 
viding information about sensations and pro- 
cedures is another means of enhancing control 
that could be easily incorporated into the 
procedure. Since little is known about how in- 
formation and choice might operate together, 
research combining these operations can pro- 
vide insight into possible positive or nega- 
tive effects of providing for patient control 
over some aspects of treatment.


Experiment 1


The present study is designed to evaluate 
the effectiveness of information and choice 
in a real-life setting. Accordingly, we manip- 
ulated both information about the blood- 
drawing procedures and participation in the 
process by allowing donors to choose the arm 
to be used. Assuming that enhancing personal 
control would reduce distress in blood donors, 
it was expected that information and choice 
manipulations would each reduce distress in- 
volved in the phlebotomy procedure. We also 
expected that the combination of information 
and choice would reduce distress and perhaps 
be more effective than either treatment alone. 


Method 


Overview 


Forty blood donors were assigned to one of four 
conditions in a 2 × 2 design. The two manipulated
factors included information (high/low) and choice 
(high/low). High-information subjects heard taped 
communications describing the upcoming procedures 
and discussing physical and psychological sensations 
donors frequently experience. Low-information sub- 
jects heard a Red Cross documentary tape of com- 
parable length revealing little about procedures. 
High-choice subjects were allowed to select the arm 
to be used for phlebotomy, whereas low-choice sub- 
jects were merely instructed that the nondominant 
arm would be used. After the manipulations, a nurse 
drew blood according to the standard American Red 
Cross procedure. Donor stress reactions were in- 
dexed by observing the necessary health mainte- 
nance actions taken by a trained nurse and also 
through self-reports of discomfort, pain, and anxiety 
experienced at several points in the procedure. The 
Stroop color-word test, a response-competition per- 
formance measure, was administered during the donor 
recovery period. 


Subjects and Recruitment 


Subjects were 40 first-time donors who passed 
American Red Cross screening at a local blood bank 
in Southern California. Twelve additional donors 
were approached, but they declined to participate in 
the study. There were 19 males and 21 females, rang- 
ing in age from 17 to 50 years, with a mean age of 
26.7. Subjects were randomly assigned to conditions 
with the qualification that groups contain an ap-
proximately equal number of males and females.


Procedure


Subjects were tested individually. After initial 
screening, each subject was escorted to the testing 
area by a nurse. The potential subject was informed 
by the first experimenter of the general nature of 
the study, which was “to find ways to make donors 
more comfortable while giving blood.” After agreeing 
to participate and signing a consent form, subjects 
completed brief self-report scales including prema-
nipulation ratings of pain, discomfort, and anxiety
on “visual analogue” scales (Ohnhaus & Adler, 1975).
These scales contained printed instructions and were
anchored with the words none at one end and severe
at the other. Subjects were then assigned randomly
to one of four conditions that were factorial com-
binations of the two experimental treatments. The
manipulation for each subject was introduced by an 
appropriate tape-recorded message. 

Information manipulation. Half of the subjects
(high information) listened to a 2-minute high-in-
formation tape that described the step-by-step pro-
cedures involved in donating blood and that also
discussed some of the physiological and psychologi-
cal sensations commonly experienced by donors. Ex-
cerpts of the message used are as follows:


There are three steps involved in donating blood:
the preparation of the arm, the donation itself, and
a short rest period afterwards. . . . We will clean
your arm with soap and an antiseptic, then one of
the nurses will place a blood pressure cuff on your
arm, inflate it, select the largest vein [...]
area that has been cleaned, and insert [...].
Some people feel a slight sting initially, [...]
the medication that’s in the needle. Once the
needle is in place, blood will flow into the collection
bag. After about 10 minutes the nurse will
remove the cuff and the needle. If you [...] feel
a slight numbness or tingling in your arm [...]
blood is being drawn, it will go away [...]
cuff is removed. Then you will relax. [...]
feel a little faint, there’s no reason to be concerned—you
need a few moments to adjust.


The remaining subjects (low information) heard a
taped message of comparable length that [...]
the Red Cross’s blood donor program but revealed
little about the upcoming procedures. [...],
containing statistical and very general [...]
information, was excerpted from a recent [...]
donor recruitment (Sherer, 1976).


Choice manipulation. The tape continued with
an additional 2-minute message introducing the appropriate
choice manipulation. Half of the subjects
(high choice) heard a message designed to heighten
their feelings of active participation in the procedure
by telling them that they would be allowed
to choose the arm from which blood would be
drawn. The message was designed to lead the donors
to use a standard arm (nondominant arm), though
they were actually given the opportunity to choose
whichever arm they preferred (18 of 20 subjects
chose the nondominant arm). Excerpts of the message
used are as follows:


It is now time for you to choose the arm that we 
will use. . . . Other donating centers have their 
donors use the nondominant arm. That’s so that 
if there’s any soreness due to the needle afterwards, 
it will be on the nondominant arm and will not 
interfere with their normal routine. But remember, 
this study is interested in your role in the blood- 
drawing process. Therefore, regardless of what 
other centers have decided, here, the choice about 
your arm is completely up to you. Which arm do 
you choose? [Subjects then chose the arm that 
they preferred to use for the phlebotomy.] 


The remaining half of the subjects (low choice)
were merely instructed that the nondominant arm
would be used in the blood-drawing procedure. The
justification for use of this arm was to “prevent
soreness from interfering with normal routine.” Subjects
were given no choice in selection of the arm.


Dependent Measures 


Following the taped messages, subjects were escorted
to the blood-drawing area where an experienced
nurse from the blood center (who was blind to
the subject’s experimental condition) drew blood
according to the standard Red Cross procedure.

[...] of nursing interventions. During the
[...] procedure, the nurse performing the
phlebotomy closely observed the subjects for signs of
[...] and distress. The nurse was instructed to
[...] as she judged necessary and to respond to
[...] of the intensity of donor reactions in
[...], graded fashion. The type and severity of
[...] action between nurse and donor
[could] therefore be used as a measure of donor distress.
[...] of nursing interventions was coded
[by an] observer who was also blind to the subject’s
experimental condition. Earlier pilot work as
well as [...] collected during the study revealed that
nurse-donor interactions could be standardized in a
way [that] reflected intensity of action required by
[...]. Nursing interventions were coded on a 4-point
scale [...]: 1 = brief verbal communication to
insure that the donor was not having an adverse
reaction; 2 = extended verbal communication to reassure
the [...] (and check on his/her status); 3 =
physical [...]; and 4 = the use of cold
towels and/or ammonia capsules to revive the patient.
Use of [...] made it possible to systematically
[...] the nursing interventions and at the
same time to provide necessary care to donors. In
addition, the nurse and observer independently recorded
yes or no to whether they had observed adverse
physical reactions in donors during the phlebotomy.
Physical reactions were defined as overt
signs of [...], tension, dizziness, or fainting.

Postsession self-ratings and performance measures.
After the blood drawing was concluded, the second
observer administered scales containing instructions 
that asked subjects to retrospectively rate the degree 
of pain, anxiety, and discomfort they experienced 
during the time blood was being drawn. Subjects also 
rated their postsession reactions, indicating how they 
felt after the procedures were over. Next, according 
to usual Red Cross procedures to facilitate recovery, 
subjects were given juice and a snack. After these re- 
freshments were offered, subjects were given the 
Stroop color-word test. This timed response-competi- 
tion task, given in three parts, has been used as a 
measure of behavioral aftereffects of stress (Glass & 
Singer, 1972). Manipulation checks and debriefing 
concluded the procedure. 


Results¹


Checks on Experimental Manipulations 


The manipulations were effective in induc- 
ing both differential perceptions of choice and 
differential perceptions of information re- 
ceived from the taped communications. An 
item on the postexperimental questionnaire 
asked subjects to indicate where they received 
the most information about the blood-draw- 
ing procedure. Seventy-nine percent of sub- 
jects in the high-information conditions rated 
the taped message as their primary source of 
information, whereas 37% of those in the 
low-information conditions did the same, 
χ²(1) = 5.29, p < .03. Low-information sub-
jects indicated that other sources had pro- 
vided them with the most information (e.g., 
companions, nursing staff). 
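
The chi-square reported for the information manipulation check can be approximated from the two percentages. The following is a minimal Python sketch under stated assumptions: the article says only that manipulation-check data were recoverable for 38 subjects, so an even split of 19 respondents per information level is assumed, which makes 79% and 37% correspond to 15 and 7 subjects. The function name is illustrative, and the use of Yates's continuity correction is a guess that happens to agree with the reported value.

```python
def yates_chi2(a, b, c, d):
    """Chi-square (1 df) with Yates's correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Assumed counts (not reported in the article): 19 respondents per level.
high_info_tape, high_info_other = 15, 4   # 15/19 is about 79%
low_info_tape, low_info_other = 7, 12     # 7/19 is about 37%

chi2 = yates_chi2(high_info_tape, high_info_other,
                  low_info_tape, low_info_other)
print(round(chi2, 2))
```

With these assumed counts the corrected statistic is about 5.29, matching the text; without the correction the same counts give roughly 6.91.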

Subjects responded to the question, “To
what extent did you have a choice in selecting
which arm would be used for blood drawing?”
on a 5-point scale. A 2 (choice) × 2
(information) analysis of variance of this
item revealed the expected main effect for
choice, F(1, 34) = 110.90, p < .001. There
was also a main effect for information,
F(1, 34) = 4.16, p < .05, with high-information
groups rating more perceived choice.
There were no other significant effects. Examination
of the means for each condition
indicated that the high-information-high-choice
and low-information-high-choice groups
rated the most choice (M = 4.9 and 4.7, respectively)
with ratings for high-information-low-choice
(M = 2.0) and low-information-low-choice
groups (M = 1.0) significantly
lower (p < .05 by Newman-Keuls test).
High-information-low-choice subjects also apparently
perceived somewhat more choice
than their low-information-low-choice counterparts
(p < .05).


¹ Data were recoverable for only 38 subjects on
manipulation checks and for 39 subjects on self-rating
measures of distress.


Table 1
Number of Subjects Requiring Each Type of Nurse Intervention and Mean Intervention Score

                                     Severity of intervention
                                         Verbal                        M
                                         status             Cold       intervention
Condition                        None    check     Touch    towels     scoreᵃ

Low information-low choice         1       2         2        5          1.9
Low information-high choice        7       2         0        1          1.3
High information-low choice        8       1         0        1          1.2
High information-high choice       4       2         1        3          1.6

Note. There were 10 subjects in each condition.
ᵃ This score was computed by dichotomizing nurse actions as none versus intervention (verbal, touch, or
towels). Higher scores on this scale reflect greater severity.
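
Table 1 can be checked against the dichotomized 2 × 2 analysis of variance reported for nurse interventions. The Python sketch below is illustrative only. It assumes the dichotomized score was coded 1 for no intervention and 2 for any intervention, a coding that is not stated in the article but reproduces the tabled means, and it uses the standard sums of squares for a balanced design with 10 subjects per cell. Variable and function names are hypothetical.

```python
# Counts from Table 1 as (no intervention, any intervention), keyed by
# (information, choice). "Any" pools verbal check, touch, and cold towels.
counts = {
    ("low", "low"): (1, 9),
    ("low", "high"): (7, 3),
    ("high", "low"): (8, 2),
    ("high", "high"): (4, 6),
}

# Assumed coding: 1 = no intervention, 2 = intervention.
scores = {cell: [1] * none + [2] * some for cell, (none, some) in counts.items()}
means = {cell: sum(s) / len(s) for cell, s in scores.items()}

n = 10  # subjects per cell
ss_within = sum((x - means[cell]) ** 2 for cell, s in scores.items() for x in s)
df_within = sum(len(s) for s in scores.values()) - len(scores)  # 40 - 4 = 36
ms_within = ss_within / df_within


def f_ratio(contrast):
    """F(1, df_within) for a +1/-1 contrast on the four cell means."""
    return (n * contrast ** 2 / 4) / ms_within


m = means
f_information = f_ratio(m["high", "low"] + m["high", "high"]
                        - m["low", "low"] - m["low", "high"])
f_choice = f_ratio(m["low", "high"] + m["high", "high"]
                   - m["low", "low"] - m["high", "low"])
f_interaction = f_ratio(m["low", "low"] - m["low", "high"]
                        - m["high", "low"] + m["high", "high"])

print({cell: round(v, 1) for cell, v in means.items()})
print(round(f_information, 2), round(f_choice, 2), round(f_interaction, 2))
```

Under this coding the cell means are 1.9, 1.3, 1.2, and 1.6, and the interaction comes out at about F(1, 36) = 12.86, the value given in the text, while both main-effect ratios are small.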


Nurse Interventions During Blood Drawing 


The principal measure of donor distress 
during the phlebotomy procedure consisted of 
observer ratings of nurse interventions coded 
on a 4-point scale of intensity (no adverse re- 
action to the application of cold towels). In 
addition, both the nurse and second observer
recorded whether an adverse reaction (overt
weakness, dizziness, or faintness) was present
or absent. Evidence for the reliability of
nurse intervention intensity as a measure of
distress was provided by the fact that every- 
one (10 subjects) who received cold towels 
was rated by the nurse as having had a reac- 
tion. Across the range of interventions there 
was 80% agreement between independent re- 
ports of reactions made by the nurse and the 
second observer. 

It was hypothesized that donors who were 
singly given information or choice would show 


presented in Table 1. These data were not
normally distributed and, strictly speaking, 
were not appropriate for analysis of variance. 
To overcome this distributional problem, the 
dependent measure was first dichotomized as
no intervention versus intervention (verbal
communication, touch, or towels) and then
subjected to a 2 (choice) x 2 (information)
analysis of variance.² This analysis revealed
no significant main effects (p > .15). Instead,
there was a significant interaction between
choice and information treatments, F(1, 36)
= 12.86, p < .002. This effect indicated that
although the high-information-high-choice (M
= 1.6) condition reduced the severity of in-
terventions relative to the low-information-
low-choice condition (M = 1.9, p < .05, by
Newman-Keuls test), both conditions re-
quired significantly greater nurse interven-
tions (p < .05) than either the high-informa-
tion-low-choice (M = 1.2) or low-informa-
tion-high-choice groups (M = 1.3). In short,
these results indicated that although combin-
ing information and choice was effective in
reducing stress reactions, information or
choice given alone was more effective. This
conclusion is further supported by a multi-
variate analysis of variance conducted on the
cluster of measures of distress during blood
drawing (namely, nurse intervention intensity 
and rated discomfort, pain, and anxiety dur- 


2 Analysis of variance on the original 4-point nurse
intervention data in Table 1 revealed the identical
pattern of results, as did post hoc contrasts.


ing the proce
significant interaction
and choice treatments, F(4,
01, and no main effects.
effects for these variables are
where in the results section.


Self-Ratings


were asked to rate degree
in, and anxiety on
point scales for
with high
The first rating
livery of the exp
indicating greater
was conduc
al manipulations. The
20).
tsession following blood drawing,
were made. Subjects retrospec-
tively rated discomfort, pain, and anxiety
felt during
d then rated their
present
affected self-ratings,
sures analyses of covariance³
choice of either
discomfort, pain, or anxiety, respective pre-
ed as the covariate
postmanipulation measures as
Adjusted means for
are presented in Table 2.


Table 2
Mean Self-Ratings of Three Sections of Experiment

                                 Self-rating
                                 measure        Pre-            Blood
Experimental condition           of distress    manipulation    drawingᵃ    Postsessionᵃ
Low information-low choiceᵇ      Pain           4.2             4.3         1.4
                                 Discomfort     2.5             3.8         2.1
                                 Anxiety        4.6             4.2         1.6
Low information-high choiceᵇ     Pain           1.3             3.4         1.2
                                 Discomfort     1.9             1.5         1.1
                                 Anxiety        3.8             2.5         1.1
High information-low choiceᶜ     Pain           1.2             1.8         1.6
                                 Discomfort     1.9             tt          1.7
                                 Anxiety        3.3             3.8         2.6
High information-high choiceᵇ    Pain           1.7             3.5         2.5
                                 Discomfort     2.8             4.0         2.3
                                 Anxiety        5.1             4.2         2.4

Note. These ratings were taken on visual scales ranging from 1 = none to 14 = severe.
ᵃ Means for this measure are adjusted for premanipulation score.
ᵇ n = 10.
ᶜ n = 9.

discomfort revealed
a significant trials effect, F(1, 35) = 12.69,
p < .002, reflecting greater discomfort dur-
ing blood drawing than for the postmeasure.
Simple effects were not significant; however,
there was a reliable interaction between choice
and information treatments, F(1, 34) = 4.35,
p < .05. The interaction parallels the finding
for nurse interventions and indicates that 


3 Although there were no significant between-con-
premeasures, covariance analyses
d for postmanipulation ratings to correct
A repeated-measures
iance on these measures of distress re-
but somewhat stronger pattern of
results.
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both the low-information-low-choice and high- 
information-high-choice groups rated more 
discomfort than low-information-high-choice 
and high-information-low-choice subjects.
Moreover, a significant interaction between 
choice, information, and trials, F(1, 35) = 
4.60, p < .04, reinforced the fact that be- 
tween-conditions differences were greatest for 
rated discomfort during blood drawing. This 
conclusion was confirmed by separate 2 X 2 
analyses conducted on ratings of each trial. 
For rated discomfort during blood drawing, 
the Choice x Information interaction was 
significant (p < .03). For discomfort in post- 
session, this effect was weaker (p < .09). 

Repeated measures covariance analysis of 
pain ratings once again revealed a trial ef- 
fect, F(1, 35) = 11.81, p < .003. The only 
other significant effect was an interaction be- 
tween information treatments and trials, 
F(1, 35) = 4.44, p < .05. The nature of this 
interaction was revealed by separate 2 x 2 
analyses for each trial. For ratings of pain 
during blood drawing, there were no signifi- 
cant effects (ps > .17), whereas for self-re- 
ported pain in postsession, a main effect for 
information treatments emerged, F(1, 35) = 
4.45, p < .05. This effect indicated higher
postsession pain ratings for the high-informa- 
tion groups. Newman-Keuls tests revealed 
that this effect was largely due to the high- 
information-high-choice group rating more 
pain than all the other groups in postsession 
(p < .05) and the low-information-high- 
choice group rating less pain than all the 
others (p < .05). 

Finally, repeated measures covariance anal- 
ysis of anxiety ratings revealed only a sig- 
nificant trials effect, F(1, 36) = 7.67, p < .01,
reflecting a decrease in rated anxiety across 
trials. 

Pearson correlation coefficients were com- 
puted between the 4-point nurse intervention 
measure and self-ratings of experience during 
blood drawing. Obtained coefficients for the 
39 cases were .31 for pain, .30 for discomfort, 
and .46 for anxiety. These correlations of
moderate magnitude were all significant (ps 
< .03). In sum, results for the self-rating data 
are generally supportive of the results ob- 
tained on the behavioral measure of nurse 
interventions. Single manipulations of infor-
mation and choice were effective in reducing
distress indexed by rated discomfort and, to
a weaker extent, by a postmeasure of pain. 
There was some indication that ratings were 
highest for the group receiving the combina- 
tion of information and choice. For rated 
anxiety, the treatments did not appear to 
have any differential effect. 


Stroop Color-Word Interference Test 


Analysis of this measure revealed no sig-
nificant main effects or interactions.


Discussion


Results of this experiment suggest that in-
formation and choice manipulations, admin-
istered alone, are effective means of reducing
distress during a phlebotomy procedure in a
blood-bank setting. However, when these
treatments are combined and given together,
their effects are not additive. On a measure
of nursing interventions, the data indicate
that high levels of both information and
choice are somewhat less effective than single
manipulations of either variable. The pattern
of results obtained on self-rating data is
more complex, but still supports the general
conclusion that moderate degrees of personal
control appear to be optimal for coping with
stress in this setting.

Although relatively few studies have ex- 
amined the effects of choice in conjunction 
with other operations of personal control, re- 
sults of the present study resemble those ob- 
tained by Corah and Boffa (1970), who ex- 
amined the effects of combining choice with 
the ability to escape aversive stimulation. 
They found no evidence that these combined 
treatments were more effective than single 
manipulations of either variable. In fact, 
Corah and Boffa found that subjects given 
both choice and ability to escape aversive 
noise actually rated the noise as producing 
as much discomfort as the group receiving 
neither treatment. This is the same effect for 
rated discomfort obtained in the present 
study. 

Although much of stress research is charac- 
terized by a dissociation among behavioral, 
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self-report, and physiological measures (cf. 
Glass & Singer, 1972; Lazarus, 1966), a
problematic feature of the present study is 
a degree of inconsistency among the various 
self-rating indexes of distress. Although rated 
discomfort and the postsession measure of 
pain generally support the conclusion that 
moderate levels of personal control reduce 
distress, neither self-rated anxiety nor rated 
pain during blood drawing was affected by
the manipulations in this study. It should be 
noted that ratings tended to be rather low— 
indicating little stress. This may have been 
so because donors chose to give blood and to 
be subjects. Moreover, rating data were 
gathered during the postsession after nursing 
measures were taken to assist donors in cop- 
ing with the physical distress of giving blood. 

Reflecting on the original hypotheses of the 
study, we sought to determine if two treat- 
ments that could be easily incorporated into 
the ongoing blood-donating procedure would 
lessen donor distress. The surprising result on 
the nurse intervention measure indicated that 
in the high-information-high-choice group, 
there were actually more severe reactions than 
in the groups receiving only one manipula- 
tion. Fortunately, this condition still resulted 
in less severe reactions than those in the group 
given neither information nor choice. In a 
purely descriptive sense, this finding rein-
forces Averill’s (1973) conclusion that the
context and meaning of a “control response”
determines whether it will be effective in re-
ducing stress. However, we might speculate
briefly about the cognitive mechanisms that
could have produced these results. One pos-
sibility is that the combination of informa-
tion and choice manipulations gave donors
more of a role in the blood-drawing procedure
than they preferred. That is, after
being informed of the procedures and sensa-
tions, donors might
be responsible for the rest of
This explanation raises the
possibility that participation or r
in a decrease in “perceived con-
trol” (ability to affect outcomes) if the in-
dividual prefers not to have it or believes that
it will not be effective (see Houston, 1972). 


Under those circumstances, 
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or participation may even heighten stress. Al- 
though others have speculated that this pos- 
sibility may apply to health care outcomes 
(Freidson, 1975; Rodin & Langer, 1977), this 
issue has not been extensively researched and 
deserves further investigation. 

Finally, there were no reliable effects on 
Stroop test performance at the end of the 
study. An examination of the blood-bank pro- 
cedure, however, revealed several reasons why 
group differences in performance on an after- 
effects task may not have been observed in 
that setting. In addition to receiving cold 
towels and ammonia capsules, subjects with 
adverse reactions were reassured and reori- 
ented; their stress was reduced. We suspected 
that the nursing interventions during phle- 
botomy and refreshments given to donors fol- 
lowing the procedure precluded an adequate 
assessment of aftereffects in that setting. We 
therefore conducted a partial laboratory rep- 
lication of the blood donor study to determine 
if information and choice can prevent after- 
effects of exposure to aversive stimulation. 


Experiment 2 


Research (Glass & Singer, 1972) suggests 
that a variety of stressors—at least when they 
are unpredictable and uncontrollable—pro- 
duce aftereffects that appear relatively soon 
after stimulation is terminated. Aftereffects 
have been observed as impairments in per- 
formance of a variety of tasks including mea- 
sures of frustration tolerance and attention 
to detail. Moreover, Glass and Singer (1972) 
have shown that if subjects believe they have 
control over the onset or offset of aversive 
stimuli, these negative aftereffects of stress 
arousal are reduced. In several studies, this 
“perceived control” variable was manipulated
by providing subjects with the option of 
terminating an aversive stimulus.* The ex- 



4 Glass and Singer (1972) have conceptualized op- 
erations similar to this manipulation as “perceived 
control,” whereas others (e.g., Corah & Boffa, 1970)
have termed it choice. Although we feel it is reason- 
able that choice and perceived control do not always 
vary together, we have evidence that the choice 
manipulation in Experiment 2 did heighten feelings 
of control (see Results section). 
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plicit relationship between providing informa- 
tion about a stressor and behavioral afteref- 
fects has not been investigated previously. 

To create a rough analogue to the blood- 
drawing procedure, a cold pressor stimulus 
was used in which subjects immersed their 
hand in ice water. Half of the subjects (high 
information) received information about the 
sensations of a cold pressor stimulus (cold-
ness, aching, etc.), whereas the remaining
half (low information) received no informa- 
tion. Crosscutting this treatment, half of the 
subjects (high choice) were allowed to choose 
the hand to be used and could elect to remove 
their hand from the cold water if they chose 
to do so. The remaining subjects (low choice) 
were not given these options. Dependent mea-
sures included performance on an aftereffects
task and self-ratings. Based on the results of
the previous study, it was predicted that sin-
gle manipulations of information and choice
would be effective in preventing aftereffects
on a poststress task. The combined manipula-
tions were not expected to be any more ef-
fective than either treatment alone and per-
haps somewhat less effective as observed on
some measures in the first study.


Method 
Subjects 


Procedure 


Subjects were tested individually. After being greeted 
by the experimenter, 
table and was instru 
experiment was to “ 
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The scales were anchored by 1=none at all, 4= 
moderate amount, and 7=most ever experienced. 
After the subjects completed these premeasures, the 
tape was turned on again, and prerecorded messages 
presented the appropriate experimental manipula- 
tions. 

Information manipulation. Half of the subjects 
(high information) listened to the high-information 
manipulation that described the physiological and 
psychological sensations commonly experienced by
persons who placed their hands in ice water. The
manipulation was as follows: 


We would like to let you know what you can an- 
ticipate from holding your hand in the ice water.
Most people report a coldness, aching, tightness of
the skin across the hand. Also, they experience a
“pins and needles” sensation and numbness. Such
reactions are normal, and after you remove your 
hand from the water, the reactions will go away. 


The remaining half of the subjects (low informa- 
tion) received no information at this point and 
heard only the subsequent messages. 


Choice manipulation. Half of the subjects (high 
choice) heard instructions that emphasized their 
ability to determine which hand would be inserted 
into the water. (Instructions encouraged choice of
the nondominant hand.) In addition, although dis- 
couraged from doing so, they were told they would 
be able to withdraw their hand if they chose. Ex- 
cerpts of the message are as follows: 


The next step is holding your hand in the ice
water . . . You can place whichever hand you want
into the water . . . the choice of hands is entirely
up to you. It is important that you do not ex-
change hands after you have placed one of them in
the water. Secondly, while we would like the sub-
jects to keep their hand in the water for a pre-
determined length of time, you are free to with-
draw your hand at any time you may wish. Now
pick the hand you will place in the water. We find
that most people pick the hand that they do not
use for writing so that you can fill out the scales
with the free hand. When I tell you, place your
hand in the metal bowl, resting your palm flat on
the bottom surface. Although a few people who
come here don’t keep their hand submerged the
full time, most do. For the success of the experi-
ment, I prefer you to keep your hand submerged
until I tell you to remove it, but that’s entirely up
to you.


The remaining half of the subjects (low choice) 
were told which hand they would insert into the
water and were told not to remove their hand. The
message was as follows:


It is important that I control the length of time
you keep your hand in the water. For this study to
be successful, all the subjects must be exposed to
the ice water for the same length of time.
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please keep your hand in the water until I tell 
you to remove it. Also, we have chosen the hand 
you will place in water. Place the hand that you 
do not use for writing in the water so that it will 
be easier for you to fill out the scales. When I tell 
you, I will want you to place your hand in the 
metal bowl, resting your palm flat on the bottom 
surface. Leave your hand there until I tell you to 
remove it. 


At this point all subjects submerged their hand 
in a flat-bottomed metal pan containing ice water at 
5 °C. The tape recorder remained on and after 2
minutes delivered a message instructing subjects to
again complete the four self-rating scales. This mea-
sure reflected self-reports during exposure to the
cold pressor stimulus. After a total of 4 minutes had 
elapsed from the time of submersion, the tape in- 
structed subjects to remove their hand from the 
water. At this time the experimenter measured blood
pressure with a cuff and administered postmeasures. 
Postsession ratings. After subjects removed their 
hand from the water, they were asked to indicate 
once more how they had felt when their hand was 
submerged in the ice water and, finally, to indicate 
how they felt at that moment. As with the pre-
measure, these ratings were taken on scales measur-
ing discomfort, pain, and anxiety.

Aftereffects task. The final task for the subject 
was to proofread a simplified version of a 10-page 
passage used extensively in previous research (e.g.,
Glass & Singer, 1972; Krantz & Stone, 1978) to 
measure attention to detail. This passage contained 
typographical errors, misspellings, and grammatical 
and punctuation mistakes. Subjects were instructed 
to proofread the passage, detect the errors, and un- 
derline them. Instructions emphasized both accuracy 
and speed. After instructions were given, the sub-
ject was left to work alone for 6 minutes, although
nothing was said about time. In accord with previ-
ous research using this measure (e.g., Glass & Singer,
1972; Krantz & Stone, 1978), a percentage correct
score was computed as a measure of performance.
This score consisted of the percentage of correctly
located errors in the portion of the passage completed
by subjects when asked to stop working. This was
a measure of performance accuracy correcting for
amount read. “Total amount read” was also ana-
lyzed. After the proofreading task, manipulation 
checks were administered and subjects were debriefed. 
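
The percentage correct score described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: errors are identified by the line of the passage on which they occur, at most one error per line is considered, and the function name and the example values are hypothetical rather than taken from the study.

```python
def percent_correct(error_lines, underlined_lines, last_line_read):
    """Percentage of errors located within the portion actually read.

    error_lines      -- lines of the passage that contain a planted error
    underlined_lines -- lines the subject marked as containing an error
    last_line_read   -- last line reached when the subject was stopped
    """
    # Only errors inside the completed portion count toward the score,
    # which is what corrects accuracy for amount read.
    reachable = {line for line in error_lines if line <= last_line_read}
    if not reachable:
        raise ValueError("no errors fall within the portion read")
    found = reachable & set(underlined_lines)
    return 100.0 * len(found) / len(reachable)


# Hypothetical subject: stopped at line 20, so the error on line 31 is
# ignored, and the false alarm on line 27 earns no credit.
score = percent_correct(
    error_lines=[2, 5, 9, 14, 20, 31],
    underlined_lines=[2, 9, 14, 27],
    last_line_read=20,
)
print(f"{score:.1f}% of reachable errors located")
```

Because the denominator counts only errors in the completed portion, a slow but careful reader is not penalized for text never reached; speed is captured separately by the amount read.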




Results⁵
Manipulation Checks 


Data from postexperimental questions indi- 
cated that the manipulations were effective. 
Subjects were first asked to rate (on a 7-point 
scale) the degree of choice they had in select- 
ing the hand to be placed in the cold water. 
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Table 3 
Mean Percentage of Correctly Located Errors 
on Proofreading Task 


Information    High choice    Low choice
High              71.2           61.8
Low               74.2           52.3


Note. n = 11 subjects per condition. Higher scores 
reflect better performance. 


A 2 (high-low information) X 2 (high-low 
choice) analysis of variance revealed that 
high-choice subjects indeed rated more choice,
F(1, 40) = 89.30, p < .001. There were no
other significant effects. A similar analysis 
was conducted on an item asking subjects to 
rate the degree of choice they felt in deter- 
mining when their hand would be removed 
from the water. Once again, only the ap- 
propriate main effect for choice emerged, 
F(1, 40) = 71.40, p < .001. 

To check on the information manipulation, 
subjects were asked how much information 
they had received about sensations they would 
experience from the cold water. Analysis of 
these 7-point scale ratings revealed only an 
information main effect, F(1, 40) = 62.57, p
< .001. 


Dependent Measures 


Aftereffects task. It was predicted that 
single manipulations of information and 
choice would prevent behavioral aftereffects 
of exposure to the experimental stressor. High 
levels of both information and choice were 
not expected to produce any additional reme- 
dial effect. A two-way analysis of variance 
was conducted on percentage of errors cor- 
rectly identified on the proofreading task. 
Means for this measure are presented in 
Table 3. There was a significant main effect 
for choice, F(1, 40) = 12.52, p < .001, with


5 Four subjects in the high-choice conditions re- 
moved their hand from the ice water before the full 
4-minute submersion period. Data both excluding 
and including these individuals on all dependent mea- 
sures were analyzed. Results on all measures were 
virtually identical both ways. Therefore, data for the 
full complement of 44 subjects are presented. 
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high-choice subjects (M = 72.7) being more 
accurate than low-choice subjects (M = 60.1). 
In addition, there was an unreliable tendency 
toward an information main effect (p < .09).
A significant Choice x Information interac-
tion also was evident, F(1, 40) = 6.66, p <
.02. This interaction indicated that under
low-choice conditions, providing information
led to improved performance (p < .05 by
Newman-Keuls test). Under high-choice con-
ditions, providing information made little dif-
ference and even depressed performance
slightly, though not reliably. Newman-Keuls
tests revealed that the low-choice-low-infor-
mation group was significantly less accurate
than the other three groups (p < .05). The
latter three groups, however, did not differ 
reliably from one another. 

An analysis was also conducted on amount 
of the proofreading passage read (i.e., number 
of lines of text) during the 6-minute test 
period. This 2 x 2 variance analysis revealed 
only a significant Choice x Information inter- 
action, F(1, 40) = 4.00, p < .052. This in- 
teraction indicated that low-information-high- 
choice subjects read the most (M = 52.2 
lines), followed in decreasing order by high- 
information-low-choice (M = 48.7 lines), 
high-information-high-choice (M = 43.5 
lines), and low-information-low-choice (M
= 39.8 lines). These individual group mean
differences did not reach reliable levels of sig-
nificance (ps > .05 by Newman-Keuls); how-
ever, these data demonstrate that on a mea-
sure of performance speed as well as accuracy,
the effects of information and choice are not
purely additive in preventing aftereffects.
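
The non-additivity claim can be illustrated directly from the four reported group means for lines read. The sketch below is descriptive arithmetic only, not a reanalysis: it asks what the high-information-high-choice mean would be if each treatment contributed the same gain when combined as it did alone. The function name is hypothetical.

```python
# Mean lines read, keyed by (information, choice), as reported in the text.
LINES_READ = {
    ("low", "low"): 39.8,
    ("low", "high"): 52.2,
    ("high", "low"): 48.7,
    ("high", "high"): 43.5,
}


def additive_prediction(means):
    """High-high mean expected if the two treatment effects simply added."""
    base = means[("low", "low")]
    info_effect = means[("high", "low")] - base    # information alone
    choice_effect = means[("low", "high")] - base  # choice alone
    return base + info_effect + choice_effect


predicted = additive_prediction(LINES_READ)
observed = LINES_READ[("high", "high")]
print(f"additive prediction: {predicted:.1f} lines")
print(f"observed:            {observed:.1f} lines")
print(f"shortfall:           {predicted - observed:.1f} lines")
```

The gap between the additive prediction and the observed mean is the interaction contrast that the Choice x Information term of the analysis of variance tests.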


Self-Report Measures 


were con- 


discomfort,
pain, and anxiety. These analyses revealed
no initial premanipulation rating differences
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or interactions for any of the measures. On 
7-point scales, mean self-ratings during sub- 
mersion ranged between 4.5 and 5.1 for dis-
comfort, 4.0 and 4.4 for pain, and 3.1 and 3.9
for anxiety. For each measure of distress, 
however, there was a significant trials effect 
(ps < .01) reflecting heightened aversive ex- 
perience during the cold pressor test and sub- 
sequent relief in postsession. Examination of 
the means revealed that relative to premea- 
sures, experience during cold pressor was rated 
with heightened discomfort, pain, and anxiety 
by subjects in all conditions. In addition,
Pearson correlation coefficients were com-
puted between rated experience during the
cold pressor test and the aftereffects measure
of percentage correct on proofreading. These
correlations were uniformly low and nonsig-
nificant, indicating a lack of relationship be-
tween these measures.


Other Relevant Data 


lowing the cold pressor test revealed no re- 
liable effects. However, it is worth noting that 


groups tended 
to show somewhat higher systolic and diastolic 
blood pressures. Additionally, in a postexperi-
mental questionnaire, subjects rated the de-


Discussion 


Results on the behavioral aftereffects of


was given alone; combining the treatments
added little to subsequent performance. The
data on the proofreading measure also reveal
that the choice manipulation was more po-
tent than providing subjects with relevant in-
formation in preventing behavioral afteref-
fects.

The manipulations in Experiment 2 did
not affect any of the self-reports of discom-
fort, pain, or anxiety; and correlations be-
tween aftereffects and self-ratings were non-
significant, as in the Glass and Singer (1972)
research. Although related studies (e.g., Glass
& Singer, 1972, pp. 47-55; Sherrod, Hage,
Halpern, & Moore, 1977) have reported that
control manipulations did not affect rated
aversiveness, other studies have found a tend-
ency for predictable and controllable stimuli
to be rated as less aversive (Corah & Boffa,
1970; Fuller et al., Note 1). We have no
ready explanation for these apparent incon-
sistencies. Further investigation is necessary
to clarify the conditions under which self-re-
port measures are or are not affected by ma-
nipulations of personal control.

In our discussion of the results of Experi-
ment 1, we speculated that too much patient
control might not be desired or expected in a
blood bank. In accord with this explanation,
we found in the first study that both of the
groups receiving only one manipulation ex-
perienced reduced stress reactions compared
to the high-information-high-choice condition.
Results of the cold pressor study (Experiment
2) indicate that groups receiving either com-
bined or single manipulations evidenced re-
duced stress on an aftereffects task. We at-
tribute this somewhat different pattern of re-
sults to characteristics of the respective set-
tings in which the studies were conducted and
also to the particular operations used to ma-
nipulate choice.

First, subjects in a laboratory study prob-
ably do not enter the situation with estab-
lished expectancies about how much control
they will be allowed. Therefore, in contrast
to a clinical setting, there is little expectancy
violation involved in being given control over
a laboratory stressor. Second, the nature of
the choice manipulation in the cold pressor
study (i.e., the option to terminate the stim-
ulus) made it explicitly associated with “per-
ceived control” (Glass & Singer, 1972). Sub-
jects had every reason to believe that this
form of control would be effective in terminat-
ing the aversive stimulation, thereby promot-
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ing the stress-reducing properties of this treat- 
ment. (Note the choice main effect.) This 
logic argues that choice manipulations will 
reduce stress to the extent that they are per- 
ceived to be effective in regulating aversive- 
ness. 

The two studies presented here shed light 
on some of the complexities of the control-
stress relationship. Based on these findings,
we can conclude that information and choice 
can play a useful role in remediating adverse 
physical reactions to an aversive health-care 
procedure. Reasonably similar manipulations 
were shown to affect the magnitude of behav-
ioral aftereffects induced by a laboratory
stressor. In both studies, there was little or 
no gain in effectiveness from combining the 
treatments. In light of the economic and medi- 
cal benefits of self-care and the growing ap- 
plication of behavioral science principles to 
health-care settings (e.g., Krantz & Schulz,
in press; Leventhal & Everhart, 1978), it is 
important to investigate further the effects of 
personal control in its various operational 
forms. Moreover, subtle differences between 
the clinical and laboratory settings suggest 
that naturalistic study of this problem can 
be maximally beneficial. 


Reference Notes 
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Measuring Causal Attributions for Success and Failure 


Timothy W. Elig and Irene Hanson Frieze 
University of Pittsburgh 


A great deal of research has been generated by the Weiner et al. attribution 
model of achievement behavior. Although generally supportive of the model, 
the literature is marred by a lack of concern with the reliability and validity 
of the measurements used. In this study, causal attributions for a manipulated 
success-failure event were collected from college students on five different 
measuring instruments. Results indicated that the open-ended response measure 
showed poorer interest correlation validity than did the structured measures.
Rating scales showed a better fit to attribution conceptualizations than did the 
percentage method. Overall, scale measures seem to be the method of choice, 
although possible uses for open-ended measures of attributions are suggested. 


Although a great deal of research based on
Weiner et al.'s (1971) model of causal at-
tributions for success and failure events has
been published during the 1970s, relatively
little attention has been given to the question
of how causal attributions should best be
measured (Deaux & Farris, Note 1; Smith,
Note 2). Although a few articles (e.g., Elig &
Frieze, 1975; Frieze, 1976; Weiner, 1974;
McHugh, Note 3) have referred to the variety
of measures used for assessing attributions,
there has been no formal study of the im-
plications of using one attribution measure
over another, and researchers have tended to
be unsystematic in their selection and use of
the common techniques. This article explores
the interrelationship of several measures of
causal attributions to assess their validity and
to make recommendations concerning the se-
lection of instruments to be used in future
studies of causal attributions for success and
failure events.

As shown in Table 1, there are a number
of commonly used techniques for assessing
causal attributions: open-ended responses, in-
dependent ratings, ipsative ratings, choice of
one major cause, and bipolar ratings. Each of
these methods has its own advantages and
disadvantages in terms of practical considera-
tions.

Most attribution studies use structured rat-
ings rather than open-ended data. An open-
ended procedure involves asking subjects to
state in their own words why a particular
event has occurred. These verbal responses
can then be classified by a
any of a set of previously defined attribu-
tional categories. Such
and Darom (in press)
Cooper and Burger (Note 4). The necessity
for training coders
nature of this type of causa
tributes to its rare use in attribution re-
search. However, there are also problems with
structured response measures, which confine
subjects to a limited set of factors defined in
advance by the experimenter as important for
the situation. This set may not include the
factors of importance for some subjects. Open-

Portions of this article were presented at the meet-
ing of the American Psychological Association in
1978 and served as part of the requirements for a
master’s degree for the first author.
The authors would
Rule,
Smith, and Bernard Weiner for comments on earlier
drafts of this article.
Requests for reprints should be sent to
Department of
Table 1
Methods of Assessing Causes of Success and Failure

Method                   Method used by                     Example

Unstructured
  Open-ended             Elig & Frieze (1974, 1975)         Why do you think you
                         Frieze (1976)                      succeeded on this task?
                         McHugh (Note 5)
Structured
 1. Independent
  Unipolar ratings       Feather & Simon (1971b, 1972)      Rate the extent to which
                         Valle & Frieze (1976)              these factors caused your success:
                                                            1. If the factor to no extent
                                                               caused the outcome,
                                                            9. If the factor to an extremely
                                                               high extent caused the outcome.
                                                            Your high general intelligence = ___
                                                            How easy this type of task is = ___
 2. Ipsative
  Percentage assessment  Meyer (1970)                       To what extent was your
                                                            success caused by
                                                            your high general intelligence ___
                                                            how easy this type of task is ___
  Choice of one cause                                       Which of the following contributed
                                                            the most to the outcome?
                                                            Ability
                                                            Effort
  Bipolar ratings        Feather (1969)                     My outcome was mainly due to:
                         Feather & Simon (1971a)            Ability-Luck(a)
                         Weiner, Nierenberg, & Goldstein
                           (1976)
  Paired comparisons     McMahan (1973)                     Of each pair, circle which is
                         Bailey, Helm, & Gladstone (1975)   more responsible for your outcome.
                                                            Ability, luck
                                                            Ability, effort

a Measured on a 5-point scale.




ended response measures avoid this problem 
as well as the cuing of subjects toward con- 
sidering causal possibilities that they may not 
have spontaneously considered (Frieze, 1976; 
Smith, Note 2). 

Weiner et al. (1971) postulated that individuals
attribute the causes of success and
failure primarily to four of the causal elements
discussed by Heider (1958): ability,
effort, task difficulty, and luck. Several studies
using structured measures supported the
belief that these factors are used by subjects
in systematic ways to explain achievement
outcomes (e.g., Frieze & Weiner, 1971;
Weiner, Heckhausen, Meyer, & Cook, 1972;
Weiner & Kukla, 1970). Frieze (1976) employed
open-ended questionnaires to ascertain
what causes college students naturally
used to explain success or failure in two 
achievement tasks (an exam and an unspeci- 
fied game). The results indicated that the 
four causal factors postulated by the Weiner 
et al. (1971) model were used by subjects for 
these situations and accounted for the large 
majority of causal attributions. However, two
additional causal factors, mood and other people,
were also indicated. Other recent studies
(Elig & Frieze, 1975; McHugh, Note 5)
that used open-ended questions for a variety
of social and achievement situations found
still other causal factors that are frequently
used to explain success and failure. Among
these are personality, interest in the task, and
physical appearance. This research supports
the importance of the four causes proposed 
originally but suggests the importance of 
other factors not previously considered, espe- 
cially for dealing with nonacademic situations. 
These results raise serious questions about 
studies relying only on the four causal factors
of ability, effort, luck, and task difficulty
in a structured format. 

An additional advantage of open response 
assessment is that subjects may find open re- 
sponse questions easier and more natural to 
respond to. However, even though it is hy- 
pothesized in this study that subjects will 
prefer open-ended questions, it is also hy- 
pothesized that open-ended questions will be 
psychometrically inferior to more structured 
responses. First, the added step of coding the 
unstructured responses should lead to lower 
reliability for open-ended questions than for 
structured measures. Second, structured mea- 
sures provide a closer approximation to inter- 
val or ratio measurement. Third, structured 
measures allow for degrees of attribution on 
various dimensions rather than the simple 
presence-absence or frequency of appearance 
measures typical for open-ended response 
coding. 

A major distinction between various struc- 
tured attribution measures is whether they 
involve ipsative or independent judgments.
Ipsative measures are measures in which the
score of one attribution must influence the
score of other attributions, thus inducing negative
correlations. Negative correlations are
not forced by measures using independent 
judgments. With independence of ratings 
comes ease of analysis, since each attribution 
can be tested separately. However, indepen- 
dent ratings do not give as direct an assess- 
ment of the relative importance of attribu- 
tion factors as is given by ipsative judgments. 
This ease of comparison is the major attrac- 
tion of ipsative measurement. 
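
The distinction can be illustrated with a small simulation. This is a sketch and not part of the original study: the subjects, the 1-9 ratings, and the function names are all hypothetical. Each simulated subject rates four causes independently, and the same ratings are then rescaled so that they sum to 100%, as a percentage measure requires. When the causes are interchangeable, the constant sum alone forces an expected correlation of -1/(k - 1) between any two of k causes.

```python
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_subjects=2000, n_causes=4, seed=0):
    """Return (r_independent, r_ipsative) for the first two causes."""
    rng = random.Random(seed)
    # Independent judgments: each cause is rated on its own 1-9 scale.
    ratings = [[rng.uniform(1, 9) for _ in range(n_causes)]
               for _ in range(n_subjects)]
    # Ipsative judgments: the same ratings rescaled so that each
    # subject's causes sum to 100%.
    percents = [[100 * v / sum(row) for v in row] for row in ratings]
    r_independent = pearson([r[0] for r in ratings], [r[1] for r in ratings])
    r_ipsative = pearson([p[0] for p in percents], [p[1] for p in percents])
    return r_independent, r_ipsative


r_independent, r_ipsative = simulate()
print(f"independent r = {r_independent:.2f}, ipsative r = {r_ipsative:.2f}")
```

With four causes the ipsative correlation should fall near -.33, whereas the independent ratings should correlate near zero.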

Among the ipsative measures, the assign- 
ment of percentages to various causes is per- 
haps the best developed. Percentage ratings 
(e.g., Meyer, 1970) make explicit the basic 
assumption of interdependent judgments that 
the causes being rated account for the totality
[...] cause of an event can be parceled out to
various particular causes. McHugh (Note 3) 
pointed out that percentage ratings give the 
clearest indication of the relative importance 
of various causes (e.g., luck rated relative to 
other causes seems unimportant). 

Another widely used method of causal as- 
sessment, the single bipolar scale anchored by 
ability and luck used by Feather (1969) and 
Feather and Simon (1971a), does not allow 
for even the four causal factors identified by 
Weiner et al. (1971). It measures only two 
of the possible causes of events. This measure 
also confounds two dimensions that have 
been shown to be important for attributions: 
stability and locus of control (Weiner et al., 
1971). However, multiple bipolar scales could 
be used in such a way as to overcome the particular
problems of using a single bipolar
scale (Weiner, Nierenberg, & Goldstein,
1976). Weiner et al. (1976) limited comparisons
on each scale to a difference within only
one dimension (e.g., luck and ability are not
on a scale since they differ on both the stability
and the location of cause dimensions).
Such a limitation solves the theoretical problem
of which dimension the causes were
judged on as well as easing the practical
problem of the very large number of comparisons
needed in a paired comparison of
many attributions. Within-dimension bipolar
scales (Weiner et al., 1976) yield more sensitive
measures than the paired comparisons used
by McMahan (1973), since the range of possible
scores is increased.


separate rating scales and percentage mea- 
sures appeared to be the most interesting 
measures. These were selected for further 
study along with an unstructured, open-ended 
measure. 

A few researchers have begun to question
the indiscriminate use of the various attribution
measures (Deaux & Farris, Note 1;
Smith, Note 2). Deaux and Farris analyzed a
variety of measures by looking at differentiation
across success and failure and at correlations
across measures. They found some convergence,
but they also found significant differences
between what the various scales
measure. This study extends their analysis by
using a multitrait-multimethod test of the
reliability of three of the most commonly used 
measures: open-ended measures, scale ratings, 
and percentage judgments. In addition, a for- 
mal assessment is made about subjects’ per- 
ceptions of the face validity of the various 
measures. 


Hypotheses 


Two specific hypotheses that can be 
made concerning various attribution measures 
involve a comparison of the open-ended response
measure with the two structured response
measures: (a) Convergent and discriminant
validities will be lower for the open-ended
response measure than for either of the
two structured response measures and (b) the
face validity to subjects of the open-ended
response question will be better than that obtained
by either structured response measure.
That is, subjects will rank the open-ended
measure as better and easier to answer. Another
hypothesis is made based on the apparent
superiority of the independent scale
judgments: (c) The independent ratings will 
be superior to the percentage ratings in terms 
of convergent and discriminant validities. 


Method 
Subjects 


The subjects were 252 students in introductory 
social psychology and personality psychology classes. 
Students were requested to participate in a study 
that would serve as a focal point for class discussions 
of experimental and measurement procedures in 
psychology. Subjects were free to participate or not
as they chose. Two of a total of 254 students elected
not to participate because of their dislike of anagrams.


Attribution Measures 
Three types of attribution measures [...].
The measures were worded [...]
“Why do you think you
succeeded (failed) on this task?” The other attribution
measures were structured response measures.

Eight causal attributions were assessed in the
structured response measures. These included the
four basic Weiner et al. (1971) causes: high/low
general intelligence, task ease/difficulty, good/bad
luck, and high/low unusual effort. These were selected
for theoretical reasons and because of their
frequent use in the literature. The other four at- 
tributions were high/low stable effort (stable = con- 
sistent), high/low task interest, good/bad mood, and 
high/low motivation. These have been suggested as 
important causal factors for this type of situation 
by open-ended studies (Elig & Frieze, 1975; Frieze, 
1976). The attributions were worded in the format 
suggested by Elig and Frieze (1975) and by Valle 
and Frieze (1976). Table 2 presents these causes in 
a three-dimensional taxonomy (cf. Elig & Frieze, 
1975; Rosenbaum, 1972).

Two structured response measures were used:
unipolar 9-point rating scales and percentage ratings 
(see Table 1). Both of these were administered with 
two alternate forms, (a) determination (e.g., “Please 
rate [indicate] how important you think each of 
the following factors was in determining your suc- 
cess or failure on the anagram task”) and (b) cause 
(e.g., “Please rate [indicate] the extent to which
each of the following factors caused your success or
failure on the anagram task”). Each of these was
then followed by instructions to respond to a set
of success or failure causes. Exact wording of the
causes is shown in Table 2 for the success and failure
conditions. Structured response measures were (a)
rating scale of determinance, (b) percentage scale of
determinance, (c) rating scale of causality, and (d)
percentage scale of causality. With the open-ended
measure, this resulted in a total of five different at- 
tribution measures. 

The two forms were needed for alternate-form 
reliability estimates for the scale and percentage 
methods of assessing causal attributions. Reliability 
estimates for the open-ended measure were based on 
intercoder reliability. Since these causal attributions 
represent a causal analysis of a particular success-failure
event, other forms of reliability estimates such
as test-retest or coefficient alpha were not appropriate.

Subjects were given the four structured attribution
measures and one unstructured attribution measure
in a within-subject design. Sixteen orders of the
four structured measures were used in a 2 × 2 × 2 [...]
the same method were together or separate, (b) [...]
percentage method was asked
first, and (c and d) whether the determination [...]
and for the scale method. To an
unknown extent, this within-subject design will lead 
to an inflation of intermethod correlation estimates. 
Subjects may strive for Consistency in their responses 
even when they are instructed to start each measure 
afresh and not to look forward or back in the test 
sheets. Such a problem confronts any assessment of 


* Another issue, not dealt with in this article, is 
the way in which attributions are worded. This also 
needs to be systematically analyzed. 
Table 2
Measured Causal Attributions Varying According to Stability, Locus of
Control, and Intentionality of the Attribution

                                      Locus of control
Stability         Internal                                        External

Stable
  Unintentional   1. Your high (low) general intelligence         2. How easy (difficult) this
                                                                     type of task is
  Intentional     5. The high (low) level of effort which you
                     consistently demonstrate in whatever you do
                  8. Your high (low) desire to do well in
                     everything you do
Unstable
  Unintentional   7. Your good (bad) mood                         3. Good (bad) luck
  Intentional     4. How high (low) your interest in this task was
                  6. The unusually high (low) effort which you
                     put forth in doing this task

Note. Numbers indicate the order in which the attributions were presented. Failure wordings are in
parentheses.


a momentary state rather than of a stable trait. 
However, since our interest is in the pattern of
correlations rather than in the overall absolute
levels, this problem was not too troublesome. Varying
order also allowed the assessment of order effects,
but since no consistent order effects were found, the
results for various orders are collapsed in the discussion
of results.
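
The count of sixteen orders is what a full crossing of four two-level ordering factors produces (2 × 2 × 2 × 2 = 16). The sketch below only enumerates such a crossing. The factor names and levels are assumptions made for illustration, since the description of the design is incomplete in this copy of the text.

```python
from itertools import product

# Hypothetical labels for four two-level ordering factors; the exact
# definitions are not recoverable from the text and are assumed here.
factors = {
    "forms_of_same_method": ("together", "separate"),
    "first_method": ("scale", "percentage"),
    "first_form_for_percentage": ("determination", "cause"),
    "first_form_for_scale": ("determination", "cause"),
}

# Every combination of one level per factor is one questionnaire order.
orders = [dict(zip(factors, levels)) for levels in product(*factors.values())]

print(len(orders))  # 16
print(orders[0])
```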


Procedure 


The experiment was administered during class time 
by a male experimenter in five classes. Subjects were 
told the following:


In a few minutes you will be asked to do a task
consisting of 15 anagrams. An anagram is a group
of scrambled letters that can be unscrambled to
form a word. . . . The brightest 25% of college
students can solve at least eight of these ana- 
grams. If you solve eight or more of these ana- 
grams, you will have succeeded at the task; if you 
solve seven or less of the anagrams, you will have 
failed on this task. . . . 


Subjects were given 30 sec to answer each of the 15 
anagrams. The difficulty of the anagrams was ma- 
nipulated so that half of the subjects received a set 
of mostly easy anagrams that led them to succeed 
(e.g., MNEGAA, WADNET, BOLWE), and the other half 
of the subjects was given a set of mostly difficult 
anagrams that led them to fail (e.g., SEALGT, IUMSC, 
Oorruc) (Bar-Tal & Frieze, 1977). 

After completing the anagram task, subjects were 
asked to total the number of anagrams solved and 
to rate their subjective feelings of success and failure 
on a 9-point scale. Subjects were then given as 


much time as they needed to answer the attribution 
questions previously discussed and a final set of 
questions that assessed the face validity of the at- 
tribution measures. Subjects were asked for their 
general impressions of the methods and then ranked 
them for difficulty and for global impressions of 
best and worst. 

Subjects answered the questions at their own pace; 
no subject required more than 45 minutes. In each 
class, when all students were finished, the purpose 
and general nature of the study were discussed. All 
subjects were debriefed on the deceptions involved 
in the presentation of the anagram task.


Results and Discussion 


Although the majority of subjects responded
to the success-failure manipulation
of anagram difficulty, there were a few sub- 
jects who did not. Thirty-six subjects who 
were given easy anagrams failed to solve 
eight of them and therefore failed the success 
criterion. Seventeen other subjects made up 
words for enough of the unsolvable anagrams 
to score a “success.” This raised a question 
about whether these 53 subjects should be 
included in the analyses. A preliminary anal- 
ysis indicated that there were no systematic 
differences in the responses of these “erring” 
subjects as compared to the responses of the 
other subjects. Therefore, all subjects were in- 
cluded in the analyses listed below. Similar 
data were obtained when these 53 subjects 
were eliminated from the analyses. 
Table 3
Open-Ended Responses

Attribution        Equivalent           Intercoder       No. subjects   % codable
category           structured measure   reliability(a)   mentioning     responses(b)

Ability            Intelligence          .91              51             20.8
Task difficulty    Task difficulty       .99              65             26.5
Luck               Luck                 1.00               2               .8
Intrinsic motives  Interest              .97              17              6.9
Stable effort      Stable effort         .57              38             15.5
Unstable effort    Unstable effort       .51              11              4.5
Mood               Mood                  .91              34             13.9
Personality        --                    .70               8              3.3
Task × Ability
  interaction      --                    .74              19              7.8

a n = 252. These are correlations of the frequency of the attribution being coded for the subject.
b A total of 247 attributions were made that were codable; that is, they did not simply repeat the outcome
nor explain the process of labeling the outcome a success or failure. One codable attribution was given by
102 subjects, whereas 64 subjects made more than one codable attribution.


Open-Ended Attributions 


The responses to the open-ended attribution
measure were coded according to the scheme
developed by Elig and Frieze (1975). In this
coding scheme, each subject’s response is segmented
into phrases containing individual
causes, each of which can be scored for the
specific attribution category and for the 
dimensions of stability, location of cause, and 
intentionality. The index of each attribution 
category (such as ability and task ease-task 
difficulty) used in this study is the number 
of times it was mentioned by the subject. 
Two coders coded all responses; the percent- 
age of agreement for the codings ranged be- 
tween 82.8% and 93.5%. Percentage agree- 
ment, although a useful concept, is not a re- 
liability estimate. Reliability was estimated 
by using the correlations reported in Table 3. 
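
The difference between percentage agreement and a correlational reliability estimate can be seen in a toy case. The sketch below uses invented codings from two hypothetical coders for a rarely mentioned category. It is not the study's data, and the helper names are illustrative.

```python
def percent_agreement(a, b):
    """Share of subjects for whom both coders recorded the same count."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sxx = sum((p - mean_x) ** 2 for p in x)
    syy = sum((q - mean_y) ** 2 for q in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: how many times each coder scored a rare category
# (say, luck) in the responses of ten subjects.
coder_1 = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
coder_2 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

agreement = percent_agreement(coder_1, coder_2)
reliability = pearson(coder_1, coder_2)
print(f"agreement = {agreement:.0%}, r = {reliability:.2f}")
# prints: agreement = 80%, r = -0.11
```

Because the coders agree mostly on the absence of the category, agreement is high even though they never agree on when it occurs. This is why the correlations, and not the agreement percentages, serve as the reliability estimates.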

Table 3 also lists the attributions found 
for the open-ended question, and it shows that 
all but a small proportion of these attributions 
were also measured by the structured re- 
sponse measures. Two types of responses are 
not shown in Table 3. Sixty-one responses 
(16.4% of total responses) simply repeated
the outcome or denied the outcome. Almost
all of these responses were from the subjects
who had failed to follow instructions. Another
65 responses (17.5% of total responses)
were uncodable as attributions. Most of these


responses were from subjects who misunder- 
stood the question as asking for the reasons 
why they labeled the outcome as a success or 
failure rather than asking for the causes of 
the outcome. 


Intertest Correlation Validity


Intertest correlation validities (convergent
and discriminant) were assessed in multivariable-multimethod
matrices (Campbell &
Fiske, 1959; Magnusson, 1967). Since alternate
forms (cause and determination) of
the structured response methods were used,
separate multivariable-multimethod matrices 
were prepared for each. Table 4 presents the 
causal forms of the structured method. The 
matrix for the determination forms was highly 
similar.2

Factor analysis procedures were used in 
the interpretation of these validities. The factor-analytic
solution chosen was principal
components analysis with varimax rotation.


This solution accounts for reliable true vari- 
ance with independent factors. We were most 
interested in the variance common to two or 
more attribution measures but had to allow 


2 For purposes of brevity, much of the specific
comparison data obtained has been omitted from this
article. These data are available on request from
[illegible].
Table 5
Factor Loadings of Cause and Open-Response Measures

Attribution/method    Factor: 1 2 3 4 5 6 7 8    Communality
87 
Stable effort/scale 89 Fil 
Stable effort/percentage 69 Ri 
Motivation/scale 88 ute + 
Motivation /percentage .79 en ue 
Intelligence/scale 70 3 5 
Mood/percentage 87 k 
Mood/scale Uy gil s 
Mood/open 61 S 
Unstable effort/percentage 67 —40 a 
Unstable effort/scale 36 78 7 4 
Interest/scale 74 pS 
Luck/scale 86 ue 
Luck/percentage 87 8 x 
Task ee 88 7 
Task ease/difficult: , 
sas —.44 —.35 .62 fee 
Intelligence/percentage 39 80 $ a 
Stable effort/open —.53 44 oS $ 
Intrinsic motives/open 73 3 
Luck/open 57 4 
Task ease/difficulty/open —.70 F 
Ability/open —.28 61 5S 
Unstable effort/open 48 40 


for specific factors to emerge. Table 5 gives 
the factor loadings of the factor analysis of 
the multivariable-multimethod matrix presented
in Table 4.3 This analysis accounts for
91% of the reliable variance. 

The main diagonal of the matrix (Table
4) gives the reliability estimates for each attribution-method
unit. For the structured methods,
these are alternate form estimates; for the
open response method, they are intercoder
reliabilities. With few exceptions, these are
satisfactorily high. The open response measure
shows low reliabilities for the effort attributions
and shows generally low communalities,
indicating a need for further refinement
of coding instructions to achieve more uni- 
form, acceptable reliabilities for this measure. 
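
For orthogonal factors such as those produced by a varimax rotation, a measure's communality is the sum of its squared loadings across factors. The sketch below shows the arithmetic. The loadings of .36 and .78 are what appears to be tabled for the scale measure of unstable effort in Table 5, but that table is damaged in this copy, so the values should be treated as an assumed reading. Loadings too small to be tabled are ignored, which makes the result a lower bound.

```python
def communality(loadings):
    """Sum of squared loadings of one measure on orthogonal factors."""
    return sum(loading ** 2 for loading in loadings)


# Assumed reading of two tabled loadings for one measure.
print(round(communality([0.36, 0.78]), 2))  # 0.74
```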

The monomethod triangles (under the re- 
liability diagonals) show the intercorrelations 
of the attributions measured the same way. 
It is obvious that the open response measure 
shows the smallest (in absolute value) inter- 
correlations, indicating either relatively good 
independence for the attributions measured 
or a poor representation of the true interrelationships
of the attributions. The true correlations
may be attenuated by the small
range of frequencies of the open response attributions.

The monomethod triangle for the open re- 
sponse method is more similar to the triangle 
for the percentage than for the scale method. 
In part, this seems due to the quasi-ipsative 
nature of the method. No subject in this study 
gave more than four responses to the open- 
ended question; thus, once an attribution was 
made, there was less chance for any other attribution
to be made also, inducing small
negative correlations. Negative correlations
were directly induced by the ipsative per- 
centage measures. The relatively high positive 
intercorrelations of the attributions measured 


3 Because the percentage measures are ipsative, 
one percentage measure had to be dropped from the
analysis. Interest was dropped in the analysis reported
here. Though this is not a totally satisfactory
solution, it does eliminate the singularity problem. 
Interest is included in other analyses, which are 
available from the authors. 
by the scale method may in part have been 
due to a response bias of some subjects to 
use only one part of the 9-point scale. This 
was a response bias that cannot influence the 
percentage method or the open response 
method. Method variance can thus be seen to 
influence the monomethod intercorrelation 
pattern. The percentage method and the open 
response method seem to present a pattern 
different from the one evident in the scale 
measures. 

Factor analyses can aid us in clarifying 
these patterns. Returning our attention to 
Table 5, we find the major evidence for 
method variance in the factor loadings of the 
open response measures, which form distinct 
method factors (Factors 7 and 8). However, 
with the exception of the intelligence attribu- 
tion, the structured measures of a specific at- 
tribution both load on the same factor (see 
Footnote 3). Also, the monomethod blocks of 
the structured methods can be factor analyzed 
separately to clarify their patterns of cor- 
relations and the role of method variance. 
(The open response measure cannot be factor 
analyzed within method, since there is no 
alternate form for it as there are for the 
structured measures.) Tables 6 and 7 present 
the factor analyses of percentage and scale
measures, respectively. The factors in these
analyses account for 97% and 100% of the
reliable variance of the percentage and scale
measures, respectively. The strongest similarity
between the percentage and scale monomethod
triangles of the multivariable-multimethod
matrix was that in each triangle, the
largest correlation was for stable effort with
motivation (motives in the open measure).
These two attributions form the nucleus of
the strongest factor (which appears to be a
stable achievement factor) in all analyses.
The analyses also agree in having mood and
luck constitute independent factors.

There are some disagreements between the
scale and percentage measures. Factor 1 includes
task ease-difficulty for percentage measures
and intelligence for scale measures. The
factor analysis of the scale measures is more
easily interpreted than that of the percentage
measures. Factor 2 of the scale analysis can
be called unstable achievement. Interest in a
task should generate unusual effort leading to
achievement. Interest and unstable effort correlate
when measured by scale cause and
scale determination, .52 and .43, respectively,
indicating only 27% and 18% overlap in
variance, respectively.

Tables 6 and 7 exhibit remarkably clear


Table 6
Factor Loadings of Percentage Measures

Attribution-alternate form    Factor: 1 2 3 4 5    Communality
Motivation-cause .88 oS 
Motivation-determination 87 ae 
Stable effort-determination 65 A 
Stable effort-cause .12 i 
Task difficult: 

amines Bea: —.55 — 39 —.40 = Al —.33 89 
Task ease/difficulty a 

—determination —.55 — Al -.37 —.36 36 5 7 
Intelligence-cause 95 a 
Intelligence-determination .95 a a 
Unstable effort-cause ist ee 
Unstable effort-determination : ae =p 
Mood-determination ae i ie 
Mood-cause : A a 
Luck-determination a 5 
Luck-cause : 96 


Note. Interest was dropped from this analysis because of the ipsative nature of percentage measures.


630 TIMOTHY W. ELIG AND IRENE HANSON FRIEZE 
Table 7 
Factor Loadings of Scale Measures 
Factor 
Commun- 
Attribution-alternate form 1 2 3 4 5 $ ality a 
Motivation-cause 88 25 A 
Motivation-determination 87 -26 8 : 
Stable effort-cause 86 29 ie 
Stable effort-determination 84 Sl : 4 
Intelligence-determination 84 H 3 
Intelligence-cause -84 bi 
Interest-determination -82 7 
Interest-cause 85 7 
Unstable effort-cause 36 77 us 
Unstable effort-determination 39 71 i i 
Mood-cause 91 s 
Mood-determination 91 94 
Luck-cause 93 i 89 
Luck-determination .92 88 
Task ease/difficulty 
— cause 93 87 
Task ease/difficulty 
—determination 
structures. The alternate forms of the structured
methods have similar loadings, substantiating
their interchangeability.

Returning our attention to Table 4, we note 
the convergent validities. The convergence of 
the two structured response methods seems 
high enough to warrant our further attention. 
The range of these validities is .43 to .63 for 
cause measures and .30 to .69 for determination
measures. For attributions measured by
cause, there is a mean overlap of 31.1% in
variation of the structured methods, whereas
the mean is 30.4% for the determination
measures.

The convergence of the two structured 
methods can be clearly seen in Table 5. In 
this analysis, the scale and percentage mea- 
sures of an attribution have their highest 
loading on the same factor except for intelligence.
The two structured measures of intelligence
converge to the point of loading on
the same two factors. (Interest measured by 
percentages was not in the analysis; see Foot- 
note 3.) 

The convergent validities of the open response
measure with the structured measures
are woefully low (ranging from .00 to .37).
These figures indicate that two studies claim- 
ing to measure the same attribution, one with 


a structured measure and one with the open 
response measure, will only have an average 
of 4.7% of the variance in common (and a 
maximum of 14%), with the remainder of 
the variance being due to noncommon factors 
(method and error). Mood is the only open 
response measure to show any amount of con- 
vergence. It correlated .37 with the percentage 
measure and .22 with the scale measure. Mood 
is the only open response measure to load on 
the same factor as its structured equivalent 
(see Table 5). 
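
The percentages of common variance quoted above follow from squaring the correlations. A minimal sketch of that arithmetic for the two mood correlations reported in the text (the function name is illustrative):

```python
def shared_variance(r):
    """Proportion of variance two measures have in common, given r."""
    return r ** 2


# Mood, open response measure vs. percentage (.37) and vs. scale (.22).
for r in (0.37, 0.22):
    print(f"r = {r:.2f} -> {shared_variance(r):.1%} shared variance")
# prints:
# r = 0.37 -> 13.7% shared variance
# r = 0.22 -> 4.8% shared variance
```

The largest convergent validity, .37, thus corresponds to the maximum of about 14% common variance mentioned above.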

On either side of the convergent validity 
diagonals lie heterovariable-heteromethod tri- 
angles. Correlations in these triangles are of 
variables having neither trait nor method 
variance in common. From this consideration 
follows the second type of validity, discrimi- 
nant validity. Discriminant validity is shown 
when a validity diagonal value is higher than 
the values lying in its column and row in the 
heterovariable-heteromethod triangles. For 
example, attributions to intelligence measured 
in percentages should correlate more highly 
with intelligence measured by a scale than 
they correlate with any other attribution measured
by a scale.
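
The discriminant validity check just described can be written out as a short routine. The sketch below counts, for each attribution, how many values in its row and column of a heteromethod block equal or exceed its validity diagonal value. The block itself is invented for illustration and does not reproduce Table 4.

```python
def discrimination_failures(block):
    """For each trait i, count the off-diagonal values in row i and
    column i of a heteromethod block that are at least as large as the
    validity diagonal value block[i][i]."""
    n = len(block)
    failures = []
    for i in range(n):
        validity = block[i][i]
        rivals = [block[i][j] for j in range(n) if j != i]
        rivals += [block[j][i] for j in range(n) if j != i]
        failures.append(sum(value >= validity for value in rivals))
    return failures


# Hypothetical block: rows are attributions measured by percentages,
# columns are the same attributions measured by scales.
traits = ["intelligence", "luck", "mood"]
block = [
    [0.50, 0.10, 0.05],
    [0.12, 0.45, 0.48],
    [0.02, 0.08, 0.60],
]
print(dict(zip(traits, discrimination_failures(block))))
# prints: {'intelligence': 0, 'luck': 1, 'mood': 0}
```

In this invented block, luck fails to discriminate once, because luck by percentage correlates more highly with mood by scale (.48) than with luck by scale (.45).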

The structured responses discriminate fairly
well between variables. For the cause form,
no convergent validity estimate is exceeded 
by a heterovariable-heteromethod value. In 
the determination form, only task ease—dif- 
ficulty is problematic. Its convergent validity 
is exceeded by four other values. This relative 
lack of discrimination showed up in the factor 
analysis of the determination form (Table 6) 
by task ease-difficulty loading somewhat on 
every factor. 

The discrimination shown by the open re- 
sponse method is not nearly as good. Only 
mood always discriminated. The other attribu- 
tions failed to show discrimination an average 
of 4.7 times out of 13 comparisons. It seems 
reasonable to blame the open response rather 
than the structured response measures for 
this lack of discrimination, since good dis- 
crimination was shown between the structured 
measures themselves. 

As predicted, results indicate some superior- 
ity of structured measures. Structured re- 
sponse reliabilities are higher and are sub- 
stantiated by communalities, whereas the 
open response measure intercoder reliabilities 
are not substantiated by communalities. Con- 
vergent and discriminant validities for struc- 
tured measures are satisfactory, whereas open 
response convergent validities are low (except
for mood). Except for mood, other open
response convergence values are exceeded by 
an average of one third of the values in their 
row and column of the heterovariable—hetero- 
method triangles, indicating poor discrimina- 
tion. 


Face Validation 


The second hypothesis concerns the reac- 
tions of those subjects whose attributions were 
being measured. This hypothesis states that 
to subjects, the face appearance of the open 
response measure will be more positive than 
that of either structured response measure. 
Table 8 presents the frequency with which 
the subjects termed each method easiest, hard- 
est, best, and worst. As can be seen, subjects 
differed in ranking the methods on all criteria. 
Contrary to the hypothesis, however, the dif- 
ferences are not between the open response 
method compared with structured methods. 
There were no significant differences between 
the open response and scale methods; even 
Table 8 
Frequency of Attribution Measurement Methods Selected as Easiest, Hardest, 
Best, and Worst 

                          Method 
              Open 
Evaluation    response    Scale    Percent    χ² 
Easiest          76         79        47       9.282** 
Hardest          52         49        97      21.909*** 
Best             77         58        47       7.595* 
Worst            45         51        89      18.462*** 

* p < .05. 
** p < .01. 
*** p < .001. 


the largest difference between these methods 
on the evaluation of best is not significant, 
χ²(1) = 2.4. Differences arose from a general 
disliking of the percentage method. Subjects 
said they felt that the percentage measure 
was hard to compute and was not the best re- 
flection of what they felt were the reasons 
for the outcome. 
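
The chi-square values in Table 8 can be checked with a short calculation. The sketch below assumes (the article does not say so explicitly) that each row was tested against equal expected frequencies across methods, and that the two-method comparison of open response and scale on "best" used a continuity correction. The function name is illustrative only. Under these assumptions the "hardest" row and the reported χ²(1) = 2.4 are reproduced.

```python
def chi_square_equal_split(counts, continuity=False):
    """Pearson chi-square against equal expected frequencies.

    Assumption: every method is expected to be chosen equally often.
    With continuity=True, 0.5 is subtracted from each absolute deviation
    (Yates' correction), which only makes sense for two categories.
    """
    expected = sum(counts) / len(counts)
    total = 0.0
    for observed in counts:
        deviation = abs(observed - expected)
        if continuity:
            deviation = max(deviation - 0.5, 0.0)
        total += deviation ** 2 / expected
    return total


# "Hardest" row of Table 8: open response, scale, percent.
hardest = [52, 49, 97]
print(round(chi_square_equal_split(hardest), 3))  # 21.909

# Open response versus scale on "best", with continuity correction.
best_open_vs_scale = [77, 58]
print(round(chi_square_equal_split(best_open_vs_scale, continuity=True), 1))  # 2.4
```

The other rows come out within a few thousandths of the tabled values under the same assumption, which suggests rounding differences rather than a different test.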


Outcome Effects 

The final analysis concerned the variations 
across the methods for success as compared 
with failure conditions. Results of the analysis 
of variance for each attribution are presented 
in Table 9. As can be seen, success was more 
attributed than failure to ability, stable ef- 
fort, and motivation for all four structured 
measures. These outcome effects were not 
found for the open response measures. There 
was convergence of the open response mea- 
sure and both forms of the percentage method 
(but not the scale method) in that both had 
greater attributions to low interest and task 
difficulty for failure than to high interest or 
task ease for success. Both scale measures also 
yielded an outcome effect such that success 
was more attributed to luck, unstable effort, 
and mood. 


Conclusions 


As predicted, open-ended response measures 
of causal attributions have poorer intertest 
validity and reliability than structured re- 
sponse measures. However, contrary to pre- 
diction, structured scale measures were seen 
Table 9 
Mean Attributions for Success and Failure 

                                   Structured response measure 
                             Percentage              9-point scale 
Outcome and                            Determi-                 Determi-      Open 
attribution                Cause       nation      Cause        nation        response 
Success 
  High intelligence        22.10**     21.41***    5.42         [illegible]   [illegible] 
  Task ease                17.02***    [illegible] 5.14         [illegible]   [illegible] 
  Good luck                 8.47        8.04       2.97         3.06          [illegible] 
  High interest            [illegible] 11.24***    4.86         5.08          [illegible] 
  High stable effort       12.40***    12.60***    [illegible]  [illegible]   [illegible] 
  High unstable effort      7.50        7.76       [illegible]  [illegible]   [illegible] 
  Good mood                 6.60        6.80       [illegible]  [illegible]   [illegible] 
  High motivation          14.35       14.90***    6.21         [illegible]   [illegible] 
  Personality                                                                 [illegible] 
  Task-ability match                                                          [illegible] 
Failure 
  Low intelligence          8.89        9.72       2.31         3.38          [illegible] 
  Task difficulty          32.27       32.63       5.02         4.90          [illegible] 
  Bad luck                  7.64        7.33       2.33         2.30          [illegible] 
  Low interest             22.50       21.77       4.51         4.96          [illegible] 
  Low stable effort         4.85        4.18       2.08         2.36          [illegible] 
  Low unstable effort       9.03        8.82       3.14         3.34          [illegible] 
  Bad mood                  9.13        9.37       2.65         2.74          [illegible] 
  Low motivation            5.38        5.49       2.11         2.37          [illegible] 
  Personality                                                                 [illegible] 
  Task-ability mismatch                                                       [illegible] 

Note. Significant differences between success and failure attributions are indicated on the success means. 
* p < .05. 
** p < .01. 
*** p < .001. 


by subjects as easy to respond to, like the 
open-ended measures. Percentage ratings were 
seen by these subjects as having the least face 
validity. Thus, one of the postulated advan- 
tages of the percentage measure was not con- 
firmed. And a supposed weakness of scale 
measures was not found. Thus, at least for 
college students solving anagrams or doing 
similar tasks in which the basic causal cate- 
gories are well understood, the scale method 
is clearly a superior technique. This strong 
support for the scale method of independent 
assessment of the contribution of various 
causal factors also confirms findings of Deaux 
and Farris (Note 1), who used other criteria 
for their conclusions. They also found little 
support for the percentage method. 

Although open-ended procedures are weaker 
on the basis of psychometric criteria than 
scale ratings, they have utility for the re- 
searcher who is asking for causal attributions 
in a new situation. Clearly, people use differ- 
ent categories of causal explanations in dif- 
ferent settings (e.g., Elig & Frieze, 1975), 
and without open-ended pretesting, the ex- 
perimenter cannot know which causal factors 
to include in structured measures. Using an 
open-ended question in later stages of re- 
search can serve as a continuing validation 
of the attribution scales that subjects are 
asked to rate. 

Data from the factor analyses and the suc- 
cess-failure comparisons suggest that open- 
ended and scale and percentage ratings not 
only vary in their psychometric properties but 
also yield different types of data. Hypotheses 
that concern particular causal attributions 
can be supported by one type of measurement 
and disconfirmed by another; this was dem- 
onstrated in the outcome effects reported 
here. 

The attributional measures discussed in 
this article have been the measures typically 
used and are representative of three general 
methods of attribution measurement. Other 
methods and other within-method variations 
are possible (e.g., “How much ability did he 
have?” vs. “How much was ability a cause 
of his success?”). Many of these possible vari- 
ations have been used in attribution research. 
As indicated in this article, these measure- 
ment variations may account for some of the 
discrepancies in the attribution literature. 

Fiske (1971) has suggested that in cases 
in which multiple measures yield conflicting 
results, the construct be operationally defined 
by a single measure. There are multiple cri- 
teria for selection of a measure, but many of 
them would lead to the use of the scale mea- 
sure for attribution research. Results reported 
here indicate that scale measures have mod- 
erately good intermethod correlations with 
percentage measures, do not force intercorre- 
lations among attributions, and have good 
face validity. Another important criterion for 
selecting the specific measurement technique 
that defines the construct is the construct 
validity of various procedures. Other data not 
reported here (see Elig & Frieze, Note 6) in- 
dicate that the scale method used in this study 
provides generally better support for some 
of the basic theoretical relationships between 
causal attributions and future expectancies 
and affect than do either the percentage or 
open response methods. Even if attribution 
researchers do not agree on a single measure- 
ment technique, these reliability and validity 
issues should be carefully considered before 
selecting the measures to be used in future 
investigations of causal attributions. 
The Popularity of Conspiracy Theories of Presidential 
Assassination: A Bayesian Analysis 


Clark McCauley and Susan Jacques 
Bryn Mawr College 


Journalist Tom Bethell has advanced the hypothesis that conspiracy explana- 
tions of Presidential assassination are popular because people have an irrational 
need to explain big and important events with proportionately big and impor- 
tant causes. This is a species of consistency hypothesis and clearly predicts that 
a shot that kills the President is more likely than a miss to be attributed to a 
conspiracy. Four studies are reported that support this prediction. Three of the 
four studies provided a check on whether conspiracy was overly favored, in the 
case of successful assassination, by comparison with the normative Bayesian 
formulation. No evidence of this kind of departure from rationality was found. 
It appears that people associate conspiracy with successful assassination, not 
because of any kind of special need for proportionality of cause and effect, but 
because of a belief that conspiracies are more effective and successful than lone 
assassins. 


John F. Kennedy was shot dead in Dallas 
in 1963, and the Warren Commission reported 
in 1964 that the assassination was the work 
of a lone assassin, Lee Harvey Oswald. In 
1979, the issue is evidently not yet settled. 
The House Select Committee on Assassina- 
tions has concluded its work, still uncertain 
about a fourth shot. Polls indicate that the 
majority of Americans, around 80% in fact, 
believe that others besides Oswald were in- 
volved in the assassination (Gallup, 1976). 
Books and articles theorizing about the as- 
sassination are still appearing regularly. Re- 
cent examples of the genre include They've 
Killed the President by Robert Sam Anson 
(1975), Coincidence or Conspiracy? by Ber- 
nard Fensterwald (1977) and the Committee 
to Investigate Assassinations, and Legend: 
The Secret World of Lee Harvey Oswald by 
Edward Jay Epstein (1978). Clearly, Ameri- 
cans are not satisfied with the conclusion of 
the Warren Commission. 

Requests for reprints should be sent to Clark Mc- 
Cauley, Department of Psychology, Bryn Mawr Col- 
lege, Bryn Mawr, Pennsylvania 19010. 

Whatever the merits or defects of the War- 
ren Commission report, the continuing popu- 
larity of conspiracy theories is itself a re- 
markable fact. Most events, no matter how 
traumatic, do not last in the public awareness 
as the Kennedy assassination has. Most 
events, no matter how great, quickly drop 
from headlines to history. Accidents, scan- 
dals, great men, and even wars are left be- 
hind, forgotten or at least much faded. But 
new books about John F. Kennedy’s assassina- 
tion are selling 15 years after the event, sell- 
ing in supermarkets and in drug stores against 
competition from pop psychology and sex 
books. It is the premise of this article that 
the continued popularity of conspiracy the- 
ories of the Kennedy assassination is a sur- 
prising social fact that is worthy of investiga- 
tion. 

The studies reported here are aimed at test- 
ing one hypothesis about the popularity of 
conspiracy theories, namely, that people ir- 
rationally seek big causes to explain big 
effects. According to this hypothesis, ad- 
vanced by journalist Tom Bethell in The 
Washington Monthly (1975), a lone assassin 
is too small and insignificant a cause to pro- 
vide a satisfactory explanation for such large 
effects on policy and people as follow a Presi- 
dent’s death. 


We are expected to believe, according to the of- 
ficial explanation, that the Johnson Administration 
and all that it entailed, possibly including the 
debacle of Vietnam, was set in motion by one man 
who had quarreled with his wife; who had, as it 
were, gotten out of bed on the wrong side that 
morning, and found a gun lying there. 

The cause doesn’t fit the effect. But the fact is, 
when great power is vested in one man, as in the 
President of the United States, it is always pos- 
sible that a small cause (a microbe in his blood, 
for example, leading to a fatal disease, leading to 
a new President, leading to a “Vietnam”) can 
trigger a large effect. 

In such cases many people will seek a new cause 
that is commensurate with the effect—seek, in 
other words, large and global explanations and 
thereby imbue the event with appropriate meaning. 
In the case of the Kennedy assassination, of course, 
this means looking for a conspiracy—preferably a 
large one. (Bethell, 1975, p. 39) 

Bethell’s hypothesis is recognizable as a 
species of consistency hypothesis, of which 
dissonance theory is the most prominent pre- 
vious example (Brown, 1965, chapter 11). A 
need for consistency of cause and effect clearly 
implies that the need for a big cause should 
be greater, the greater the effect to be ex- 
plained. Thus the Bethell hypothesis predicts 
that [a conspiracy is more likely to be inferred 
when the President is shot and killed than] 
when he is shot at but 
missed. Study 1 was designed to test this 
prediction. 
Study 1 
Method 


Subjects. Subjects were 20 undergraduate students 
of Bryn Mawr and Haverford Colleges, 10 female 
and 10 male. 

Questionnaire. The questionnaire consisted of two 
pages. On the top of each page was the following 
introduction: “News reports of violent events and 
their causes are sometimes surprising and sometimes 
not. This study aims to measure your personal feel- 
ing about the likelihood of several stories of several 
events.” Below the introduction was typed a head- 
line in capital letters: “A MAN SHOOTS AT THE PRESI- 
DENT AND MISSES” or “A MAN SHOOTS AT THE PRESI- 
DENT AND KILLS HIM.” On each page, after the head- 
line, the same two questions appeared: “What is 
the probability that this man is acting alone and 
unaided?” and “What is the probability that this 
man is acting as a member of a group organized to 
kill the president?” 

Procedure. The order of the two pages was re- 
versed for half the subjects. Two female experi- 
menters each obtained 10 completed questionnaires. 

Results 


In this and succeeding studies, it is the 
relative probabilities associated with group 
and individual explanations of assassination 
that are of interest. These relative probabil- 
ities are naturally expressed as ratios, and we 
report our data in the form of median ratios. 
Mean ratios do not properly represent the cen- 
tral tendency of distributions of these ratios, 
since a few ratios in every distribution are 
likely to be very extreme values. When most 
subjects are giving ratios such as 2:1, 5:1, 
or 1:4, one subject giving a ratio of 100:1 
can make the mean ratio totally unrepre- 
sentative. Thus, means and parametric sta- 
tistics are inappropriate with our data, and 
we use medians and nonparametric statistics 
instead. 
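
The point about extreme ratios can be seen in a short calculation. The values below are hypothetical and simply echo the example ratios in the text (2:1, 5:1, 1:4, and one subject at 100:1); they are not data from the studies.

```python
from statistics import mean, median

# Hypothetical odds for conspiracy, written as decimals:
# 2:1, 5:1, 1:4, plus one extreme respondent at 100:1.
typical = [2 / 1, 5 / 1, 1 / 4]
with_outlier = typical + [100 / 1]

print(median(typical), median(with_outlier))  # 2.0 3.5
print(round(mean(typical), 2), mean(with_outlier))  # 2.42 26.8125
```

A single extreme respondent moves the mean by an order of magnitude while the median shifts only modestly, which is the stated reason for reporting medians.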


The data of Study 1 did not appear to de- 
pend on the order of the pages of the ques- 
tionnaire or on the experimenter, so the data 
of all 20 subjects were pooled for analysis. 
The first column of Table 1 shows that the 
median odds for conspiracy were 1:1 when 
the President had been killed, but were 1:2 
when the President had been missed. That is, 
the likelihood of conspiracy relative to the 
likelihood of a lone assassin was typically 
seen as greater when the assassination was 
successful. We can use a sign test to test the 
significance of this tendency. Of the 20 
subjects in Study 1, 13 subjects gave odds of 
conspiracy higher when the President was 
killed than when he was missed. Six subjects 
indicated no difference in these odds, and one 
subject reported the reverse difference in odds. 
These data indicate (p< .05 by one-tailed 
sign test for correlated samples) that a suc- 
cessful assassination is more likely than a 
failure to be attributed to a conspiracy. 
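
The sign test on these counts is easy to reproduce. The sketch below assumes the usual convention that tied subjects are dropped and that the one-tailed p value is the binomial probability, under a chance rate of .5, of at least the observed number of subjects in the predicted direction. The function name is illustrative.

```python
from math import comb


def sign_test_one_tailed(n_predicted, n_reverse):
    """One-tailed sign test; ties are assumed to be excluded beforehand.

    Returns P(X >= n_predicted) for X ~ Binomial(n_predicted + n_reverse, 0.5).
    """
    n = n_predicted + n_reverse
    favorable = sum(comb(n, k) for k in range(n_predicted, n + 1))
    return favorable / 2 ** n


# Study 1: 13 subjects in the predicted direction, 1 reversed, 6 ties dropped.
p = sign_test_one_tailed(13, 1)
print(round(p, 4))  # 0.0009
```

With 13 of 14 untied subjects in the predicted direction, the probability falls well below the .05 criterion reported in the text.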


Discussion 


Study 1 supported the prediction that in- 
formation about success or failure of an as- 
sassination attempt makes a big difference in 
the popularity of a conspiracy explanation. 
This prediction came from the general hy- 
pothesis that people need a big cause to ex- 
plain a big effect. A special need for con- 
sistency in the size of cause and effect is not, 
however, directly demonstrated by the data 
of Study 1. For instance, it might well be 
that the effect of information about success 
comes rationally from a judgment that groups 
are more effective and likely to succeed than 
individuals. In order to demonstrate the hy- 
pothetical consistency need, it must be shown 
that people systematically exaggerate the 
probability of a conspiracy beyond what the 
news of a successful assassination rationally 
calls for. Clearly, this demonstration requires 
a formulation of the rational impact of in- 
formation, and Bayes’ rule provides just this 
normative formulation. 
In probability form, Bayes’ rule requires 
that 

$$p(\text{conspiracy}/\text{President killed}) = p(\text{conspiracy}) \times \frac{p(\text{President killed}/\text{conspiracy})}{p(\text{President killed})}.$$

Note that this formulation calls for revision 
in the prior probability of a conspiracy to the 
extent that the ratio reflecting the efficacy of 
a conspiracy, p(President dead/conspiracy)/ 
p(President dead), is greater than 1.0. More 
useful for present purposes is Bayes’ rule 
in odds form: 

$$\frac{p(\text{conspiracy}/\text{President killed})}{p(\text{individual}/\text{President killed})} = \frac{p(\text{conspiracy})}{p(\text{individual})} \times \frac{p(\text{President killed}/\text{conspiracy})}{p(\text{President killed}/\text{individual})}. \tag{1}$$

This form indicates that the odds favoring a 
conspiracy over an individual assassin, given 
that the President is killed, need to be re- 
vised and increased over the prior odds favor- 
ing a conspiracy to the extent that a con- 
spiracy is seen as more likely than an indi- 
vidual to succeed in killing the President. If 
the posterior odds are found to be systemati- 
cally higher than called for by the prior odds 
and the efficacy ratio, then the hypothesis of 
a consistency need to explain the departure 
from rational prescription would be strongly 
supported. 
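
The odds form lends itself to a simple check for any one respondent: multiply the prior odds by the efficacy ratio and compare the product with the posterior odds the respondent reports. The sketch below is illustrative only; the function names and the respondent's values are hypothetical and are not taken from the data.

```python
from fractions import Fraction


def bayes_posterior_odds(prior_odds, efficacy_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * efficacy_ratio


def compare_with_bayes(reported_posterior, prior_odds, efficacy_ratio):
    """Classify a reported posterior against the Bayesian prescription."""
    prescribed = bayes_posterior_odds(prior_odds, efficacy_ratio)
    if reported_posterior > prescribed:
        return "too high"
    if reported_posterior < prescribed:
        return "too low"
    return "as prescribed"


# Hypothetical respondent: prior odds for conspiracy 1:2, groups judged
# three times as likely to succeed (3:1), reported posterior odds 2:1.
prior = Fraction(1, 2)
efficacy = Fraction(3, 1)
reported = Fraction(2, 1)

print(bayes_posterior_odds(prior, efficacy))  # 3/2
print(compare_with_bayes(reported, prior, efficacy))  # too high
```

A consistency need of the kind Bethell describes would show up as a preponderance of "too high" classifications across respondents.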




Studies 2 and 3 
Method 


Subjects. The subjects of Study 2 were six males 
and six females recruited individually by a female 
experimenter in the environs of the Bryn Mawr 
train station. The 12 subjects ranged in (estimated) 
age from early 20s to late 50s. Subjects of Study 3 
were 15 males and 9 females recruited by a different 
female experimenter in a restaurant and bar near 
Bryn Mawr College. These subjects appeared to be 
in their 20s and 30s and were generally approached 
as same-sex groups (though subjects filled out the 
questionnaire without discussion with their friends). 

Questionnaire. The questionnaire for Studies 2 
and 3 was composed of four pages, two of which 
asked questions much like the two questions used 
in Study 1. At the top of these two pages was the 
instruction: “Imagine the following news headline”: 
There followed, on one page, “MAN SHOOTS, KILLS 
PRESIDENT,” and on the other page, “MAN SHOOTS AT 
PRESIDENT, MISSES.” On both pages the succeeding 
question was the same: “Which is more likely? (A) 
The man is acting alone and unaided or (B) the man 
is acting as a member of an organized group.” The 
question continued with a quantification: “If you 
checked A, how much more likely is A? (Twice as 
likely as B? Three times as likely? Five times? Ten 
times?).” The parallel alternative was also given: 
“If you checked B . . . ,” etc. 

The prior odds favoring conspiracy were assessed 
on a third page: “The next person to try to kill the 
President will likely be . . . (check one).” There fol- 
lowed the same A versus B choice and the same 
quantification of that choice as just described for 
the first two pages. 

The likelihood ratio reflecting the relative efficacy 
of a conspiracy was assessed on a fourth page, as 
follows: “Suppose that a man acting alone and un- 
aided is trying to kill the President. Suppose also that 
a man acting as a member of an organized group or 
conspiracy is trying to kill the President. Which is 
more likely to succeed in killing the President?” 
There followed the same A versus B choice and 
quantification of that choice as already described. 

Procedure. In Study 2, the prior-odds question 
was always on the last page, and the 6 possible orders 
of the other three pages appeared twice each to 
form the 12 questionnaires. In Study 3, all 24 pos- 
sible orders of the four pages appeared once to make 
up the 24 questionnaires. 


Results 


Columns 2 and 3 in Table 1 show that, as 
in Study 1, the conspiracy explanation was 
typically more popular when the President 
was killed than when he was missed (median 
odds of 2:1 vs. 1:3 for Study 2, and median 
odds of 2.5:1 vs. 1:3 for Study 3). This 
result is confirmed (p < .05 by one-tailed sign 
test for correlated samples) by noting that 7 
subjects in Study 2 judged conspiracy odds 
higher when the President was killed than 
when he was missed, whereas only 1 judged 
the reverse. For Study 3, the corresponding 
numbers were 13 and 2, respectively (also p 
< .05 by sign test). 

The two additional questions and the odds 
format of all the questions permit assessment 
of the degree to which subjects exaggerate the 
chance of conspiracy. For each subject, the 
posterior odds of conspiracy given that the 
President was killed were compared with the 
product of the prior odds for conspiracy and 
the efficacy ratio. In Study 2, 5 subjects gave 
posterior odds of conspiracy higher than re- 
quired by Bayes’ rule, and 6 subjects gave 
posterior odds too low. In Study 3, 8 subjects 
gave posterior odds too high, and 13 gave 
posterior odds too low. Clearly, there is no 
evidence here of systematic departure from 
rationality; conspiracy is not overly favored 
when the President is killed. 

Earlier we supposed that higher odds for 
conspiracy, given successful assassination, 
might be rational if people believe that groups 
are more effective than lone assassins. That 
supposition received some support in Studies 
2 and 3, since the median efficacy ratios 
(third row in Table 1) of 2.5:1 and 2:1 
indicate that groups are typically seen as 
more likely than individuals to succeed in an 
attempt to kill the President. In Study 2, 9 
subjects thought groups were more likely to 
succeed, 2 subjects thought individuals were 
more likely to succeed, and 1 subject thought 
there was no difference (p < .05, one-tailed 
binomial test). In Study 3, 14 subjects thought 
groups were more effective, and 10 subjects 
thought the reverse (p < .27, one-tailed bi- 
nomial test, ns). Thus, Study 2, but not Study 
3, shows significantly more than half the sub- 
jects judging groups as more likely than indi- 
viduals to succeed in an attempt to kill the 
President. 


Study 4 


It appears that Studies 2 and 3 have shown 
that the odds for conspiracy are not exag- 
gerated, compared to the perception of the 
prior odds of conspiracy and the diagnosticity 
of the news that the President has been killed. 
Before accepting this conclusion, however, 
there is a flaw to be considered in the ques- 
tionnaire used in Studies 2 and 3. That is, 
the aim was to ask about prior and posterior 
odds of conspiracy that differ only in giving 
the information that the President was killed 
in the posterior assessment. Unfortunately, 
the difference between our prior (“The next 
person to try to kill the President . . .”) and 
our posterior (“Man shoots, kills President”) 
question is two pieces of information: that 
the attempt got as far as getting a shot off 
and that the shot was successful. Our ef- 
ficacy ratio, on the other hand, assessed only 
the information value of success and not the 
information value of getting a shot off. In 
order to be sure that this confounding is not 
important to our results, we revised the ques- 
tionnaire and used it in a new study. 


Method 


Subjects. Subjects were 15 males and 9 females 
recruited at a shopping center near Bryn Mawr Col- 
lege by the same female experimenter who ran 
Study 3. The subjects ranged in (estimated) age 
from early 20s to late 50s. 


Questionnaire. The questionnaire was identical to 
that used in Studies 2 and 3 except for the wording 
of the prior-odds question and the relative efficacy 
question. Whereas the previous questionnaire asked 
about “The next person to try to kill the President 
. . . ,” the revision asked, “The next person to shoot 
the President will likely be. . . .” And whereas the 
previous form asked about the success expected of 
an individual or group “trying to kill the President,” 
the revision went as follows: “Suppose that a man 
acting alone and unaided gets a shot at the Presi- 
dent. Suppose also that a man acting as a member 
of an organized group gets a shot at the President. 
Which is more likely to succeed in shooting and kill- 
ing the President?” 

Procedure. The procedure was the same as in 
Studies 2 and 3. 


Results 


Column 4 of Table 1 shows that, as in 
Studies 1-3, the conspiracy explanation was 
more favored when the President was killed 
than when he was missed (median odds of 
2:1 vs. 1:2). This result is confirmed by 
noting that 13 subjects judged conspiracy 
more likely when the President was killed 
than when he was missed, whereas only 4 
subjects judged the reverse (p < .05 by one- 
tailed sign test for correlated samples). 

As in Studies 2 and 3, an analysis at the 
level of the individual compared the posterior 
odds of conspiracy with the product of the 
prior odds for conspiracy and the likelihood 
ratio giving the relative efficacy of conspiracy. 
Nine subjects gave posterior odds of con- 
spiracy higher than required by Bayes’ rule, 
and 15 subjects gave posterior odds too low. 
In short, the results of Study 4 are like those 
obtained with the flawed questionnaire in 
Studies 2 and 3. Studies 2-4 are consistent in 
finding no systematic exaggeration, compared 
to the Bayesian prescription, of the probabil- 
ity of a conspiracy when the President is 
killed. 


Table 1 
Median Probability Ratio Associated With Conspiracy (Group) Versus Lone Assassin 
(Individual) Explanations of Presidential Assassination 

Probability ratio                          Study 1   Study 2   Study 3       Study 4 
p(group/President killed) / 
  p(individual/President killed) (a)       1:1       2:1       2.5:1         2:1 
p(group/President missed) / 
  p(individual/President missed) (a)       1:2       1:3       1:3           1:2 
p(President killed/group try) / 
  p(President killed/individual try) (b)             2.5:1     2:1           3:1 
p(group try) / 
  p(individual try)                                  1:2       [illegible]   1:1 

(a) In each of the four studies, a conspiracy was judged more likely when the President was killed than when the 
President was missed (p < .05 by one-tailed sign test for correlated samples). 
(b) In Studies 2 and 4, but not in Study 3, more than half the subjects judged a group more likely to succeed 
than an individual (p < .05 by one-tailed binomial test). 

Likewise, the tendency to see groups as 
more effective than individuals is confirmed 
in Study 4. The efficacy ratios judged by sub- 
jects in Study 4 were for the case of an as- 
sassin who had gotten as far as a shot at the 
President: the relative probability of killing 
the President for a member of an organized 
group versus an individual acting alone and 
unaided. The median efficacy ratio (third 
row in Table 1) of 3:1 indicates that group 
members are typically seen as better shots. 
In Study 4, 19 subjects thought group 
members were more effective, and 5 subjects 
thought the reverse (p < .05, one-tailed bi- 
nomial test). Thus, Study 4 shows signifi- 
cantly more than half the subjects judging 
groups as more likely than individuals to 
succeed in killing the President, once having 
gotten off a shot. 


“Pure” Data From Studies 1-4 


The within-subjects design of the present 
studies, where each subject answered two 
(Study 1) or four (Studies 2-4) questions, 
raises the possibility of sensitization effects. 
Perhaps subjects answer a question differently 
because of biases or hypotheses engendered 
by having already answered previous ques- 
tions. In order to be sure that our results do 
not suffer from sensitization problems, we 
brought together “pure” data corresponding 
to the first three rows of Table 1, that is, for 
the ratios about which we have made claims 
by statistical test. The “pure” data are from 
only the first pages of questionnaires, that 
is, responses to questions by subjects who 
had not previously answered other questions. 
These data cannot therefore be prejudiced by 
any kind of sensitization effect. 

Putting together “pure” data from Studies 
1-4, we had 26 judgments of the odds for 
conspiracy given success (first row in Table 
1) and 26 judgments of the odds for con- 
spiracy given failure (second row in Table 1). 
The median odds for conspiracy when the 
President was killed were 2:1; they were 
1:2 when the President was missed. These 
medians are numerically quite [...] 
medians for all data in rows 1 
and 2 (respectively) of Table 1, [...] 
that sensitization effects were not [...] 
medians are significantly different (p < .05 
by one-tailed median test) and clearly in- 
dicate that in a cross-groups design, we still 
find that the odds for conspiracy are higher 
when assassination is successful than when 
it fails. 


Similarly, [...] 
Studies 2-4 to get a total of 15 “pure” [...] 
“pure” median offers [...] 
efficacy ratio data do [...] 
sensitization problems. 
General Discussion 


We began with Tom Bethell’s (1975) hy- 
pothesis that conspiracy theories of Presi- 
dential assassination are popular because of 
an irrational need for big causes to explain 
big events. From this hypothesis we predicted 
that the success or failure of an assassination 
attempt should make a big difference in the 
perceived likelihood of conspiracy. Studies 
1-4 confirmed this prediction: People are 
much more likely to entertain a conspiracy 
explanation when the President is shot and 
killed than when he is shot at and missed. 

This effect is all the more striking given 
that our manipulation of the information of 
success and failure was only a few words in 
an imaginary newspaper headline. We pro- 
vided none of the richness of detail and de- 
scription that one might suppose necessary 
for compounding a conspiracy theory. The 
cold and abstract nature of our manipulation 
was intentional, since Bethell’s consistency 
hypothesis depends only on an abstract pro- 
portionality of [...] 
greater popularity of conspiracy explanations 
when the President is dead is an effect that 
does not depend on the specifics of a particu- 
lar assassination story. 

Studies 2-4 were aimed at discovering 
whether the effect of success versus failure is 
irrational, that is, whether the posterior odds 
for conspiracy are exaggerated compared to 
the prior odds for conspiracy and the diag- 
nosticity of the information that the Presi- 
dent has been killed. No evidence of such 
exaggeration was found. 

Here it should be noted that the differ- 
ence between the odds for conspiracy when the 
President is killed and the odds for conspiracy 
when the President is missed, the difference 
suggested by Bethell’s hypothesis and found 
by us in each of the four studies reported here, 
might still be irrational in a fashion not 
tested for in our studies. That is, it might be 
that the posterior odds for conspiracy are 
systematically underestimated compared to 
the prior odds for conspiracy and the diag- 
nosticity of the information that the Presi- 
dent has been missed. 
What we know from all four studies is that 


$$\frac{p(\text{group}/\text{President killed})}{p(\text{individual}/\text{President killed})} > \frac{p(\text{group}/\text{President missed})}{p(\text{individual}/\text{President missed})}. \tag{2}$$


Bethell’s consistency hypothesis suggests that 
people have an irrational need to see big 
causes for big effects, which implies an in- 
equality in place of the normative Bayesian 
equality: 


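$$\frac{p(\text{group}/\text{President killed})}{p(\text{individual}/\text{President killed})} > \frac{p(\text{group try})}{p(\text{individual try})} \times \frac{p(\text{President killed}/\text{group try})}{p(\text{President killed}/\text{individual try})}. \tag{3}$$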


We tested for this systematic exaggeration 
and found no evidence of it, that is, no evi- 
dence that the left side of Equation 2 is 
typically too large. But it might yet be true 
that Equation 2 is irrational to the extent 
that the right side of it is too small: 


$$\frac{p(\text{group} \mid \text{President missed})}{p(\text{individual} \mid \text{President missed})} < \frac{p(\text{group try})}{p(\text{individual try})} \times \frac{p(\text{President missed} \mid \text{group try})}{p(\text{President missed} \mid \text{individual try})}. \tag{4}$$

This amounts to the hypothesis that people
have an irrational need to see small causes
for small effects. Although Equation 4 is
also a species of consistency hypothesis, it
is different from and less intuitive than
Bethell’s hypothesis. We did not test for Equation 4,
which requires an odds judgment,
$p(\text{President missed} \mid \text{group try}) / p(\text{President missed} \mid \text{individual try})$,
that we did not ask of our subjects.
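The odds form of Bayes' rule that underlies Equations 3 and 4 can be illustrated with a short Python sketch. The function names, the 5% tolerance, and the numbers in the usage note are assumptions chosen for illustration only; none of them are taken from the four studies.

```python
def normative_posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


def classify_judgment(judged_posterior_odds: float,
                      prior_odds: float,
                      likelihood_ratio: float,
                      tolerance: float = 0.05) -> str:
    """Compare a judged posterior odds value with the normative Bayesian value.

    The tolerance band is a hypothetical device for this sketch, not a
    criterion used in the article.
    """
    normative = normative_posterior_odds(prior_odds, likelihood_ratio)
    if judged_posterior_odds > normative * (1 + tolerance):
        return "exaggerated"      # left side too large, as in Equation 3
    if judged_posterior_odds < normative * (1 - tolerance):
        return "underestimated"   # left side too small, as in Equation 4
    return "consistent"
```

With hypothetical prior odds of 0.5 (group to individual) and a likelihood ratio of 4.0, the normative posterior odds are 2.0; a judged value of 3.0 would count as exaggerated and a judged value of 1.0 as underestimated.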


Thus, people do associate successful assassination
with conspiracy, but not because
of an irrational need to find big causes for
big effects. Rather, our data indicate that
people believe conspiracies are more efficient
and effective than lone assassins. Whether 
conspiracies are in fact more dangerous than 
lone assassins is unknown, save perhaps to 
the Federal Bureau of Investigation, but the 
belief in group efficiency does make sense of 
the substantial probabilities people attach to 
conspiracy explanations of Presidential as- 
sassination. 

The association of successful assassination 
with conspiracy and the belief in group ef- 
ficacy make sense in another way. They are 
an interesting example of support for Kelley’s
(1967, 1972) analysis of variance model of
subjective causality. According to this model, 
“the more extreme the effect to be attributed, 
the more likely the attributor is to assume that 
it entails multiple necessary causes” (Kel- 
ley, 1972, p. 6). With the relatively easy as- 
sumption that the members of a conspiracy 
are thought of as multiple necessary causes, 
it follows from Kelley’s model that a con- 
spiracy is seen as more likely when the Presi- 
dent is dead—an extreme effect—than when 
he has been missed. 

At this point we must recognize that we 
have not entirely answered the question with 
which we began. We set out to understand 
the popularity of conspiracy explanations of 
the John F. Kennedy assassination. In one
sense we have done that, since the belief that
conspiracies are more effective than lone assassinations
makes understandable the substantial
likelihood of conspiracy that people
perceive when the President is killed. But the 
level of public interest in conspiracy explana- 
tions of the Kennedy assassination, as dis- 
tinct from the level of confidence in these 
explanations, is yet to be understood. The 
strength and duration of public interest in 
conspiracy explanations suggests a strong mo- 
tive at work, but the Bayesian analysis—a 
static, cognitive analysis—cannot by itself re- 
veal the motive we seek. If the Bayesian anal- 
ysis had given us evidence of systematic exag- 
geration in the probability of conspiracy, we 
would have inferred the kind of consistency 
motivation we hypothesized at the beginning 
of this research. In the absence of this kind 
of exaggeration, however, we must look elsewhere
for the motivation that makes the conspiracy
explanation of the Kennedy assassination
so interesting.

One interesting possibility is that Americans
are experiencing a gigantic Zeigarnik effect.
Zeigarnik (1927) found that interrupted
tasks are remembered better than tasks that 
are completed. Although this effect dissipated 
within a day or two for the laboratory tasks 
Zeigarnik used, stronger effects of the same 
kind may occur with more involving tasks. It 
seems likely, for instance, that the same lack 
of closure Zeigarnik was studying can make
a fisherman dwell on the one fish that got
away long after he has forgotten the many
he boated. Perhaps as long as the official explanation
conflicts with a common belief linking
successful assassination with conspiracy,
Americans will see the John F. Kennedy assassination
as the kind of unresolved problem
that continues in memory and attention.
Implicit Theories of Relationship:
An Intergenerational Study 


Marylyn Rands and George Levinger 
College students and senior citizens two generations apart in age estimated the
probabilities of each of 30 behaviors for each of 14 pair relationships varying 
in their closeness (casual acquaintances, good friends, close relationship, married 
partners) and in their sex composition. The younger respondents rated the pair 
relationships of 22-year-olds today, while the older ones rated those of 22-year- 
old pairs of 50 years ago. The behavior items referred to a variety of social 
behaviors pertaining to joint activity, self-disclosure, other-enhancement, other- 
disparagement, physical contact, and norm regulation. Closeness and sex com- 
position of the relationships as well as content of the behaviors strongly affected 
the probability estimates. The raters’ generation also exerted strong effects; 
today’s pairs, especially good or very close heterosexual friends, were believed 
to be much more likely to express positive and negative feelings and to have
physical contact.


How do social relationships of today differ 
from those of yesterday? And how do they 
correspond? In what ways have norms 
changed from generation to generation? An- 
swers to such questions are difficult to come 
by. 

Historical analyses sometimes provide us 
with theories or hypotheses about contempo- 
rary conduct by comparing it with that of 
earlier times. For instance, in his historical 
novel Angle of Repose, Wallace Stegner 
(1971) draws a picture of the intimate rela- 
tionships of three pairs of characters, each 
pair separated by two generations. The main 
character is a 58-year-old historian, Lyman 
Ward, who writes a book about his grand- 
parents’ marriage, contrasting it intermittently 
with his own and that of his 20-year-old
secretary, Shelly. Lyman’s two grandparents 
lived together for over 60 years, until parted 
by death; each suffered the other’s transgres- 
sions silently, but their relationship deterio- 
rated over the course of their long marriage. 
Lyman’s own long marriage was recently 
ended, leaving him feeling helpless and be- 
trayed. His secretary, meanwhile, unsure of 
her commitment, alternately lives with and 
hides from her lover. 

These three pairs display vast differences 
in their overt behaviors, their expectations of
right and wrong, their comforts and discom- 
forts. Lyman’s grandparents display stoicism, 
politeness, discipline, and inhibition toward 
each other, their kin, and close friends. Ly- 
man and his wife, too, maintain their rela- 
tionship publicly in the face of private dif- 
ficulties, but their failure to resolve their 
problems ultimately results in the breakup 
of their marriage. In contrast, Shelly’s and 
her lover’s on-and-off affair is marked by self- 
expression, directness, and unpredictability. 

This set of characters conveys the idea that 
norms of interpersonal conduct have changed 
dramatically over the past century and that 
each generation both suffers and benefits from 
its own constellation of circumstances. The 
novel’s message derives from Stegner’s personal
impressions, from his theory of changes
in interpersonal conduct from yesterday to 
today. 

Are there ways of confirming such theories 
about past and present interpersonal rela- 
tionships? The present study was an effort to 
do so. It was a systematic attempt to study 
people’s “implicit theories” about pair rela- 
tionships—people’s expectations about what 
behavior will occur in different pairings. Each 
of us is assumed to hold expectations for appropriate
conduct that depend on our relationship
with another, whether as a child,
sibling, friend, spouse, parent, co-worker, or 
colleague. 

The term “implicit theory of relationship” 
was first suggested to us by Wolfe (Note 1), 
who proposed that people share “notions 
about what characteristics of relationships 
should co-occur” (p. 20). Research on such
notions appeared to be a way of pursuing our 
own interest in distinguishing among levels of 
pair relatedness (Levinger, 1974; Levinger & 
Snoek, 1972). Before ascertaining what sorts 
of characteristics are expected to co-occur, 
however, it seemed important first to discover 
what behaviors people expect to occur, as
was explored earlier in our research program 
by Richard Mack (see Levinger, 1974). 


Previous Research 


Several previous studies have investigated 
perceptions of interpersonal relationships.
DeSoto and Kuethe (1959) examined male college
students’ subjective probabilities for expecting
positive or negative feelings in hypothetical
pairs of male acquaintances. Respondents
estimated the probability that,
under various conditions, A would like B
(or trust, feel superior to, dislike, lie to,
listen to, or dominate B, and so on). Respondents
were found to assign higher probabilities
to positive than to negative affective
relations. Some A-B relations were perceived 
as symmetric, others as asymmetric; some as 
transitive, others as intransitive. 

Marwell and Hage (1970) obtained ratings
of 100 different dyadic role relationships
(such as father-son, best friend-best friend,
landlord-tenant, and actor-agent) on a set of 16
scales measuring characteristics of interaction
(such as its frequency, the effort required, or
its voluntariness). A factor analysis of these 
ratings uncovered three dimensions of pair 
relationships: intimacy, visibility, and regula- 
tion of behavior. 

A third study located cross-cultural gen- 
eralities in close relationships (Triandis, Vas- 
siliou, & Nassiakou, 1968). American and 
Greek respondents rated the appropriateness 
of 120 behaviors (e.g., show concern for, dis- 
agree with, kiss, help, argue with other) for 
100 dyadic roles. Four factors common to 
both cultures emerged: affect, intimacy, domi- 
nance, and hostility. 

In a fourth study, Wish, Deutsch, and Kap- 
lan (1976) searched for fundamental dimen- 
sions in people’s perceptions of interpersonal 
relationships by asking subjects to rate 25 
“typical” role relationships and 20 of their 
own interpersonal relationships on scales 
pertaining to a large variety of traits (such 
as cooperative-competitive, emotional-intel- 
lectual, flexible-rigid). Their multidimensional 
scaling analysis revealed four major dimen- 
sions of relationships, which can be inter- 
preted as mutual cooperativeness, equality, 
intensity, and informality. 

All these studies found aspects of close- 
ness, positivity, and social influence to be 
important for characterizing variations among 
social relationships. Only one of these studies 
(Triandis et al., 1968), however, examined 
interpersonal behaviors, and it did not select
those behaviors from any defined population 
of actions. Nor did any of these studies have 
an explicit schema for selecting its sample of 
relationships; the findings did not permit one 
to draw conclusions about the effects of sys- 
tematic variations in relationship. 


The Present Research


The present work is concerned with the ef- 
fects of interpersonal closeness on people's 
perceptions of appropriate behavior in pairs. 
Following from Levinger and Snoek’s (1972) 
suggestion of a continuum of pair relatedness,
Richard Mack had earlier obtained pertinent
descriptive data (see Levinger, 1974, pp. 109- 
111). Mack’s study focused on the behaviors 
of heterosexual couples of four different degrees
of intimacy, from “casually acquainted”
to “much in love and fully committed.” His
informants considered 52 different possible be- 
haviors—from “smile at the other” to “refuse 
to date other persons”—and rated how likely 
each behavior was in each of the four sorts 
of relationships. For most of the behaviors, 
likelihood of pair interaction was perceived 
to vary directly with intimacy (see Levinger, 
1974, p. 110). 

In the present study, it was decided to vary 
not only the two partners’ closeness but also 
the pair’s sex composition, so as to study per- 
ceptions of a systematically selected set of 
pair relationships. Such relationships were to 
be rated by raters of both sexes who were 
drawn from two different age groups. A set of 
guiding hypotheses and questions is listed 
below. 

Relationships: Degree of closeness. It was 
hypothesized that expectations of interaction 
would increase as the perceived closeness of 
a pair increased. According to current con- 
ceptions of relationships (Altman & Taylor, 
1973; Levinger & Snoek, 1972), the breadth 
and the intensity of mutual behavior grows 
along with increasing interpersonal involvement.

Furthermore, respondent characteristics were 
assumed to affect such expectations. In line 
with earlier findings in our research program 
(Levinger, Rands, & Talaber, Note 2), we 
hypothesized no differences between the judg- 
ments by raters of different sex, but we did 
look for significant age differences. Raters 
from today’s college generation were hypoth- 
esized to expect more pair interaction than 
raters from their grandparents’ generation rat- 
ing young pairs of 50 years ago. 

Relationships: Sex composition. We hy- 
pothesized that different behaviors would be 
expected in same-sex pairs than in cross-sex 
ones, as well as in male-male as compared to
female-female pairs. For example, cross-sex 
pairs would be perceived as more likely to 
have physical contact than would same-sex 
pairs, and female-female pairs as more likely 
to disclose intimacies than would male-male 
dyads. 

Furthermore, it was hypothesized that 
younger raters would differentiate less be- 
tween same-sex and cross-sex relationships 
than would older raters. This hypothesis was
based on the assumption that today’s sex role 
stereotypes are less distinct than those of 50 
years ago. 

Considering the sex of the actor, we hy- 
pothesized that raters would judge female 
partners to be more likely than males to initiate
expressive actions, and males to be more likely
to initiate actions pertaining to the management
of overt activities. Such differences,
it was thought, would be greater for older 
than for younger respondents. 

Questions were also posed concerning the 
perceived likelihood of various sorts of be- 
haviors that might occur in close pairs. We 
hypothesized that activity-oriented behaviors 
would be perceived as less affected by the 
partners’ closeness or sex composition than 
would those expressing positive or negative 
feelings and that expectations of physical con- 
tact would be most affected by variations in 
closeness and sex composition. 


Method 


Participants in the study were of two age groups 
and of two sexes. Each person rated the likelihood 
of 30 different behaviors for each of 14 relationships. 
A sample of young, college-age raters judged “typi- 
cal relationships” between 22-year-old partners to- 
day. The age of 22 was chosen to be near enough to 
that of the college-age respondents while old enough 
for people to be stably married. A sample of older
raters, generally in their 70s, described relationships 
of 22-year-olds as they recalled them from the time 
when they themselves were in their early 20s. 


Raters 


The group of young raters consisted of 40 Uni- 
versity of Massachusetts students, half male and 
half female. They came to group sessions in re- 
sponse to either an announcement in an introductory 
sociology class or an advertisement posted in the 
psychology building. Thirty-four raters were paid 
$1.50 each for filling out the questionnaire, and six 
chose to receive experimental credit from a psychol- 
ogy class. They ranged in age from 18 to 27 years, 
with a mean age of 19.9 years.

The group of older raters were 40 senior citizens, 
also half male and half female, who lived in a re- 
tirement community near Philadelphia. The com- 
munity’s “Reserve Fund” received $1.50 for each 
rater’s participation. These respondents ranged in 
age from 65 to 87 years, with a mean age of 75.7 
years. Respondents in both age groups were all Caucasian
and predominantly from middle- to upper-middle-class
background.


Relationships 


Each person rated 14 relationships representing 
four levels of closeness (casual acquaintances, good 
friends, close relationship, and married pairs). At 
each degree of closeness except marriage, the two 
partners could be either both male, both female, or 
one male and one female. In the heterosexual pairs, 
the sex of the actor was also varied. Thus each rater 
considered eight heterosexual and six same-sex re- 
lationships. 
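The count of 14 relationships follows from the design just described, and can be checked with a brief Python sketch. The variable and function names are assumptions introduced for illustration; the design facts (three unmarried closeness levels with all sex pairings, plus married cross-sex pairs with the actor varied) come from the paragraph above.

```python
from itertools import product

CLOSENESS_LEVELS = [
    "casual acquaintances",
    "good friends",
    "close relationship",
    "married partners",
]


def build_relationships():
    """Enumerate (closeness, actor sex, target sex) for every rated pairing."""
    relationships = []
    for level in CLOSENESS_LEVELS:
        if level == "married partners":
            # Marriage was rated for cross-sex pairs only, with the actor varied.
            pairs = [("M", "F"), ("F", "M")]
        else:
            # Unmarried levels: M-M, M-F, F-M, and F-F.
            pairs = list(product("MF", repeat=2))
        for actor, target in pairs:
            relationships.append((level, actor, target))
    return relationships


relationships = build_relationships()
heterosexual = [r for r in relationships if r[1] != r[2]]
same_sex = [r for r in relationships if r[1] == r[2]]
```

Under these assumptions the enumeration yields the eight heterosexual and six same-sex relationships reported in the text.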

The questionnaire first introduced the rater to 
four degrees of relatedness. Each pair relationship
was pictured as two circles, varying in their degree 
of “intersection” (Levinger & Snoek, 1972) from 
very small to substantial. They were described as 
follows: (a) casual acquaintances—two people who 
feel friendly toward each other; the partners have
many other relationships like this one; (b) good 
friends—two people who care about each other; each 
person has several other relationships like this one; 
(c) close relationship—two people who care about 
each other very much; the partners have no other 
relationship like this one; and (d) married partners— 
two people who care about each other very much; 
the partners have no other relationship like this 
one; they have made their relationship permanent. 

Each rater received a booklet with the 14 rela- 
tionships placed in a different random order. For 
each pairing, the relationship was described and the 
sex of each partner indicated, with the first person 
in each pair designated as the actor. 


Behaviors 


The 30 items used in this study (see Table 1) 
were selected from an initial list of 50 items to 
represent overt actions in the categories of joint ac- 
tivity, self-disclosure, other-enhancement, other-dis- 
paragement, norm regulation, and physical contact 
(positive and negative). The items chosen repre- 
sented both physical activity and emotional expres- 
sion, both verbal and nonverbal acts, and both posi- 
tive and negative behaviors. A preliminary study 
showed that these items discriminated among the 
four degrees of closeness and covered a range of 
activities appropriate for most social relationships. 

Choices of behaviors were guided by several cri- 
teria: (a) The behaviors were selected to represent 
specific acts, not abstractions such as many of the 
verbs in Osgood’s (1970) study, nor behavioral in- 
tentions. (b) Items were to represent initiatory ac- 
tions, since reactive behaviors would depend on their 
context. (c) Behaviors were to be applicable to all 
possible relationships; except for “making love,” none
of the items appeared likely to seem improper for
same-sex pairs but proper for cross-sex ones.

Each behavior was rated on a 10-point scale ranging
from 0 (not at all likely; i.e., would occur in
0%-10% of all relationships of the kind rated) to
9 (extremely likely; i.e., would occur in 90%-100%
of all relationships of the kind rated).
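The correspondence between scale points and percentage bands can be made explicit with a small Python sketch. Only the two endpoints (0 and 9) are defined in the text; treating the intermediate points as equal 10% bands is an assumption, and the function name is hypothetical.

```python
def rating_to_percent_band(rating: int) -> tuple[int, int]:
    """Map a 0-9 rating to the (lower, upper) percentage band it stands for.

    Assumes equal-width 10% bands between the two anchored endpoints.
    """
    if not 0 <= rating <= 9:
        raise ValueError("rating must be an integer from 0 to 9")
    lower = rating * 10
    return (lower, lower + 10)
```

For example, a rating of 0 corresponds to the 0%-10% band and a rating of 9 to the 90%-100% band, matching the anchors given to the raters.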

The 30 items were presented 10 to a page, with 
3 pages per relationship. The order of the three 
pages was randomly varied across the 14 pairings, so 
that no two respondents made their ratings in exactly 
the same order.¹

A note on the rater's time perspective. While the 
young raters judged relationships among today’s 22- 
year-olds, their contemporaries, older raters were 
asked to recall behavior norms of about 50 years ago. 
Although judgments of their own contemporaries 
seemed more meaningful than if senior citizens were 
to express norms about young pairs today, this procedure
introduced a difficulty for interpreting the
data: The older raters’ judgments depended on their
recollections. This difference will be discussed further
at the end of the article.


Results 


Our findings will be presented as follows: 
First we will consider the implicit dimensions 
of closeness that seem to underlie the re- 
spondents’ ratings; the relationships judged 
by each group of raters will be plotted into 
a two-dimensional space. Next we will look at 
mean expected probabilities for the average 
social behavior across relationships and across 
generations. Following that, we will examine 
variations in different kinds of behaviors, 
using the results of a five-factor analysis of 
variance. Finally, we will focus intensively on 
intergenerational differences in expectations 
for appropriate interaction in close pairs. 


Dimensions for Distinguishing Among the 
Relationships 


In order to compare the 14 relationships 
simultaneously across several dimensions, a
multidimensional scaling analysis was performed
on the ratings of each age group. Individual
Differences Scaling (INDSCAL), developed
by Carroll and Chang (1970), was


¹ A preliminary study, using 50 behavior items,
was carried out with two other groups of 40 young
and 40 old respondents (Lacey, 1977). Because of a
design error, its results could not be submitted to
multidimensional scaling analyses. The methods of
that study otherwise parallel the present ones, and
its results will be referred to wherever they are
relevant to the present findings.




Figure 1. Relationship dimensions derived from an INDSCAL (Carroll & Chang, 1970) analysis of
the younger respondents’ behavior ratings. (For each pair, the first letter denotes the actor, the
second the target of the action. M = male; F = female.)
[Plot axes: Dimension 1, low to high behavioral interdependence; Dimension 2, low to high
affective interdependence. Labeled groupings: casual acquaintances, good friends, close relationship.]

used; the statistical program allows simultaneous
analyses of all 14 relationships and
provides unique solutions without any rotation
of axes.² The INDSCAL analyses produced
a relationship space for each group of raters
with several dimensions and weights on each
dimension for each behavior, to reflect the
importance of each item for contributing to
that dimension.³ Distances between points in
the space indicated the perceived dissimilarity
among relationships. The analyses showed
that a one-dimensional solution accounted
for 81% and 74% of the variance for the
young and the old raters, respectively; a two-dimensional
solution accounted for 89% and
83%, respectively; and a three-dimensional
solution accounted for 91% and 86% of the
variance for the young and old raters, respectively.
Since the further increment from two
to three dimensions had little meaning, the
two-dimensional solution was used.
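The choice of dimensionality rests on how much variance each added dimension contributes. A minimal Python sketch of that arithmetic follows; the percentages are the ones reported above, while the function and variable names are assumptions introduced for illustration.

```python
def variance_increments(cumulative_percent):
    """Return the gain in percentage of variance from each added dimension."""
    return [later - earlier
            for earlier, later in zip(cumulative_percent, cumulative_percent[1:])]


# Cumulative percentage of variance for one-, two-, and three-dimensional solutions.
YOUNG_RATERS = [81, 89, 91]
OLD_RATERS = [74, 83, 86]

young_gains = variance_increments(YOUNG_RATERS)
old_gains = variance_increments(OLD_RATERS)
```

Moving from two to three dimensions adds only 2 and 3 percentage points for the young and old raters, compared with 8 and 9 points for the step from one to two dimensions.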

Figure 1 shows the 14 relationships judged 


² Input for the INDSCAL analyses was the dissimilarity
matrix for each relationship, derived from the
following formula:

$$\delta_{jk} = \left[ \frac{1}{N} \sum_{i} (x_{ijs} - x_{iks})^{2} \right]^{1/2},$$

where $\delta_{jk}$ is the dissimilarity between relationships;
$x_{ijs}$ and $x_{iks}$ are Subject $i$’s ratings of Relationships $j$
and $k$ on Behavior $s$; and $N$ is the number of subjects
who rated the relationships on Scale $s$ (see also
Wish, Deutsch, & Kaplan, 1976, p. 411).

³ These dimension weights are analogous to partial
correlations, whereas the R values (see last column 
of Table 1) are analogous to multiple correlations. 
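The dissimilarity input described in Footnote 2 can be sketched in Python. The sketch assumes the formula is a root-mean-square difference between subjects' ratings of two relationships on a single behavior; the printed formula is only partly legible, so that reading, along with the function name and data layout, is an assumption.

```python
import math


def dissimilarity_matrix(ratings):
    """Build a relationship-by-relationship dissimilarity matrix for one behavior.

    ratings[i][j] is subject i's rating of relationship j on that behavior.
    Entry [j][k] is the root-mean-square difference across subjects
    (assumed reading of the footnote formula).
    """
    n_subjects = len(ratings)
    n_relationships = len(ratings[0])
    matrix = [[0.0] * n_relationships for _ in range(n_relationships)]
    for j in range(n_relationships):
        for k in range(n_relationships):
            squared_sum = sum((row[j] - row[k]) ** 2 for row in ratings)
            matrix[j][k] = math.sqrt(squared_sum / n_subjects)
    return matrix
```

One such matrix would be computed per behavior, so that each behavior plays the role that an individual subject usually plays in an individual-differences scaling.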
Table 1
Behavior Weights for Two Dimensions of Relationships

                                                  Dimension 1        Dimension 2
                                                  (High-low          (High-low
                                                  behavioral         affective
                                                  interdependence)   interdependence)          R
Itemᵃ                                             Young    Old       Young    Old       Young    Old

Plan a joint project                              1.00     .89       -.06       ?          ?      ?
Drop by unannounced                               1.00     .86       -.04     .14          ?      ?
Loan O money ($10 or more)                        1.00     .84       -.03     .17          ?      ?
Offer to do an errand for O                        .98     .84       -.01     .19          ?      ?
Plan an outing with O                              .96     .84        .05     .19          ?      ?
Help O learn a hobby                               .96     .92       -.01     .06        .99      ?
Criticize O                                        .96     .67        .03       ?          ?      ?
Ask for advice about one’s work                    .95     .92       -.04     .01          ?      ?
Confide one’s personal problems                    .95     .80        .06     .25          ?      ?
Pat O on the back                                  .95     .68        .04     .25          ?      ?
Express irritation with O                          .94     .59        .05     .40        .96      ?
Use O’s belongings without permission              .90     .81        .08       ?          ?      ?
Praise O                                           .89     .61        .14     .45          ?      ?
Do something special to surprise O                 .88     .50          ?     .56          ?      ?
Give O an expensive gift (worth $10 or more)       .88     .40          ?     .68        .98    .96
Play a game with O (e.g., ball game, card game)    .87     .92       -.08     .01          ?      ?
Ask about O’s personal problems                    .86     .46        .22     .63          ?      ?
Nag O                                              .85     .30        .21     .63          ?      ?
Express one’s innermost feelings                   .83     .54        .24     .56          ?      ?
Spend a social evening alone with O                .82       ?        .28     .55        .98      ?
Give up friends O doesn’t like                     .80     .38        .16     .65        .88      ?
Go for a walk with O                               .72     .80        .37     .20        .95      ?
Slap or hit O                                      .64     .35        .10     .56          ?      ?
Express affection for O                            .64     .35        .50     .14        .98      ?
Stand close to O (within 1 foot)                   .61     .16        .52     .88        .97    .93
Hug O                                              .53     .07        .57     .89        .95      ?
Avoid touching O                                   .24     .34        .79     .62        .94      ?
Compete with O                                       ?     .24        .58       ?        .69      ?
Make love (have intimate sexual contact)           .10    -.08        .87     .88        .92      ?
Hold hands                                         .01    -.26        .96    1.00        .96      ?

Percentage of variance for two-dimensional solution                                       89     83

Note. O = the other.
ᵃ Items are listed according to decreasing weights for the young raters on Dimension 1.


by the young raters. They are arrayed on two 
dimensions. Dimension 1, accounting for the 
largest amount of variance, can be interpreted 
as a pair’s degree of behavioral interdepen- 
dence. Most of the 30 behaviors had high 
weights on this dimension (e.g., planning a 
joint project, helping the other learn a hobby, 
or asking for advice about work; see Table 
1). Casual pairs were rated lowest on behavioral
interdependence and married pairs
highest, with good friends and close relationships
intermediate.

Dimension 2 refers to the pair’s degree of
affective interdependence. This dimension is
characterized by physical contact, by behaviors
such as making love, hugging, holding
hands, or showing affection (and, for the
older raters, by self-disclosure to and criticism
of the other). Heterosexual pairs were considered
highest on Dimension 2, male pairs
lowest, and female pairs intermediate. 

The correspondence between the young 
sample’s and the older sample’s use of these 
two dimensions was estimated by correlating 
their respective coordinate positions in their 
two-dimensional solutions. Product-moment 
correlations were extremely high for both Di- 
mension 1 (.969) and Dimension 2 (.936). 
Accordingly, the separate solutions for the 
young and old raters were rotated slightly so 
that they could be plotted in the same rela- 
tionship space. Figure 2 shows the resulting 
intergenerational comparison. 
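The comparison of the two samples' coordinates relies on the product-moment correlation, which can be sketched in Python as follows. The function name is an assumption for illustration, and the relationship coordinates themselves are not reported in the text, so no real data are used.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length coordinate lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(variation_x * variation_y)
```

Applied to the 14 relationships' positions on one dimension in the young and old solutions, a value near 1 would indicate that the two samples ordered and spaced the relationships in nearly the same way on that dimension.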

While behavioral and affective interdependence
were important for describing the ratings
of both samples of raters, Figure 2 also
indicates important differences. For example,
older raters perceived significantly more behavioral
interdependence between same-sex
than between cross-sex friends, whereas young 
raters expected, if anything, the opposite. Al- 
though young raters distinguished little be- 
tween married and close heterosexual pairs, 
old raters distinguished considerably between 
these intimate pairs; they also rated good 
friends as less interdependent on both dimen- 
sions than did the young raters. Furthermore, 
the older raters saw relationships at each 
degree of contact as less affectively differ- 
entiated by sex composition than did the 
younger raters. This may have occurred because
the affective interdependence dimension
was largely based on physical contact behaviors;
the older participants rated such contact
quite improbable across all unmarried relationships,
whereas the younger ones’ ratings
varied from low to very high.

Figure 2. Relationship dimensions derived from both younger and older respondents’ behavior
ratings. (For each pair, the first letter denotes the actor, the second the target of the action.
M, m = male; F, f = female.)
[Plot axes: Dimension 1, low to high behavioral interdependence; Dimension 2, low to high
affective interdependence. Labeled groupings: casual acquaintances, good friends, close relationship.
Legend: ● younger respondents’ ratings (F, M); ○ older respondents’ ratings (f, m).]


Behavior Probabilities Across Relationships 
and Generations 


The multidimensional analysis does not in 
itself reveal absolute percentage differences 
between the younger and the older samples, 
or among the relationships. Figure 3 shows 
such mean differences. It shows the extent to 
which probability of interaction was per- 
ceived to increase with relationship closeness. 
At each degree of closeness, and for almost 
all behaviors, the younger raters reported a 
significantly higher expectation of probable 
interaction than did the older raters. This 
generational difference was greatest for the 
two intermediate pairings, good friends and 
close relationship. 


Perceived Variation in Behaviors Across 
Relationships 


The analyses described above have ex- 
amined differences across relationships and 


Figure 3. Mean probabilities of interaction across four degrees of closeness as estimated by both
younger and older raters.
[Categories: casual acquaintances, good friends, close relationship, married; each shown for
younger (Yng) and older (Old) raters.]
Table 2
Six Behavior Clusters Used in the
Analysis of Variance

Social contact 
Play game with O (e.g., ball game, card game) 
Ask for advice about one’s work 
Help O learn a hobby 
Offer to do an errand for O 
Plan an outing with O 
Go for a walk with O 


Self-disclosure 
Confide one’s personal problems 
Ask about O's personal problems 
Express one’s innermost feelings 


Other-enhancement 
Praise O 
Pat O on back 
Do something special to surprise O 
Express affection for O 


Other-disparagement 
Criticize O 
Express irritation with O 
Nag O 


Norm regulation 
Drop by unannounced 
Use O's belongings without permission 
Give up friends O doesn’t like 


Physical contact 
Hold hands 
Hug O 
Make love (have intimate sexual contact) 
Stand close to O (within 1 foot) 


Note. O = the other. 


between generations but have given little 
systematic information about variations in 
dyadic behavior. To consider such variations, 
the behaviors were divided according to our 
own a priori classification into six distinct 
clusters, as shown in Table 2.⁴ Behaviors in
the first five clusters were weighted most on
Dimension 1 of the INDSCAL analysis, particularly
for the young sample, while the behaviors
in the sixth cluster were heavily
weighted on Dimension 2. The construction 
of these six clusters permitted an analysis of 
variance across the 12 nonmarried relation- 


⁴ Seven items were omitted from this analysis of
variance across categories because they did not
clearly fit into any of the six behavior categories.
Table 3

                        Joint     Self-       Other-       Other-         Norm        Physical
Relationship            activity  disclosure  enhancement  disparagement  regulation  contact   Totalᵃ

Casual acquaintances
  Same sex
    M-M                 .48       ?           ?            .24            ?           .13       ?
    F-F                 .51       .25         .37          .24            .19         .21       ?
  Cross-sex
    M-F                 .50       .23         .40          .23            .16         ?         ?
    F-M                 .46       .22         .33          .22            .15         ?         ?
  Total                 .49       .22         .35          .23            ?           .23       .28
Good friends
  Same sex
    M-M                 ?         ?           .53          .37            .40         .23       .45
    F-F                 .73       .55         .61          .39            ?           .36       .51
  Cross-sex
    M-F                 .70       .44         .59          .35            .33         .48       .48
    F-M                 .69       .45         .56          .35            .31         .48       .48
  Total                 ?         ?           .58          ?              .36         .39       .48
Close relationship
  Same sex
    M-M                 .81       .61         .66          ?              .55         .33       .57
    F-F                 .81       .70         .73          ?              ?           .48       .62
  Cross-sex
    M-F                 .80       .68         ?            ?              .53         .75       .67
    F-M                 .80       ?           ?            .48            .50         .72       ?
  Total                 .81       .67         ?            .48            .53         .57       .63
Marriageᵇ
  Cross-sex
    M-F                 .86       .83         .86          .61            .68         .90       ?
    F-M                 .88       .84         .87          .61            .68         .90       .80
  Total                 .87       .83         .86          .61            .68         .90       .79
All relationships       ?         .55         .63          .42            ?           .52       .55

Note. For each pair, the first letter denotes the actor; the second, the target of the action. M = male; F = female.
ᵃ Any discrepancy between the probabilities reported in this column and in Figure 1 is due to the different
number of items on which the means are based.
ᵇ Probabilities for the married partners are shown here, even though the married relationship was not included
in the analysis of variance.

ships. The one between-group factor was the
raters’ age (young or old);⁵ the four within-subjects
factors were the closeness of the relationship
(casual acquaintances, good friends,
close relationship), sex composition (same
sex, cross-sex), sex of actor (male, female),
and the type of behavior (joint activity, self-disclosure,
other-enhancement, other-disparagement,
norm regulation, and positive physical
contact). Each of these variables exerted
a statistically significant effect on the ratings.⁶
Table 3 depicts the mean perceived probabilities
across all four within-subjects variables;


5 Sex of rater was omitted from this analysis,
since previous analyses had shown no significant main
effect (p > .10) nor any significant interaction effects
of the raters’ sex.

6 Parallel findings were obtained for the data collected
in the preliminary study, which also had included
many additional behaviors.


Table 4
Mean Probabilities for Six Behavior Clusters Across Three Levels of Closeness

                                              Other-     Other-     Norm
                        Joint      Self-      enhance-   dispar-    regula-    Physical
Relationship            activity   disclosure ment       agement    tion       contact    Total

Casual acquaintances
  Same sex              [?]        .21        .33        .24        .18        [?]        [?]
  Cross-sex             .49        .23        .37        .23        .16        .29        [?]
Good friends
  Same sex              .73        .49        .57        .38        .40        .30        [?]
  Cross-sex             .69        .45        .58        .35        .32        .48        [?]
Close relationship
  Same sex              .81        .65        .70        .48        .55        .41        [?]
  Cross-sex             .80        .69        .76        .47        .51        .73        [?]



for the sake of simplicity, it disregards gen- 
erational differences, which will be consid- 
ered later. 
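
The factorial layout described above can be checked by simple enumeration. The Python sketch below is illustrative only (the variable names and labels are assumptions, not the authors' own coding): it crosses the three nonmarried closeness levels with sex composition and sex of actor to obtain the 12 relationships, and then crosses those with the six behavior clusters to obtain the within-subjects cells.

```python
from itertools import product

# Within-subjects factors as named in the text; the labels are illustrative.
closeness = ["casual acquaintances", "good friends", "close relationship"]
sex_composition = ["same sex", "cross-sex"]
sex_of_actor = ["male", "female"]
behavior_clusters = [
    "joint activity",
    "self-disclosure",
    "other-enhancement",
    "other-disparagement",
    "norm regulation",
    "physical contact",
]

# Each nonmarried relationship is one combination of the first three factors.
relationships = list(product(closeness, sex_composition, sex_of_actor))

# Each within-subjects cell pairs a relationship with a behavior cluster.
cells = list(product(relationships, behavior_clusters))

print(len(relationships))  # 12
print(len(cells))          # 72
```

The count of 12 matches the "12 nonmarried relationships" named in the text; the married pair is excluded because it has no same-sex counterpart in this design.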

Effects of closeness. As anticipated, interaction
was expected to increase directly with
the increasing closeness of the partners; mean 
probabilities were 28% for casual acquaint- 
ances, 48% for good friends, and 63% for 
the close relationship. These differences were 
highly significant statistically, F(2, 156) = 
463.62, p < .0001. 

The six clusters of behavior items differed 
significantly, F(5, 390) = 229.64, p < .001. 
For some clusters, increases in likelihood 
across degrees of relationship closeness were 
steep and others were gradual; some began 
from a low baseline and others from a moder- 
ate baseline (see Table 3). Joint activity and 
other-enhancement were perceived to be mod- 
erately probable (49% and 35%, respectively) 
among casual acquaintances and to rise high 
(87% and 86%, respectively) for the married 
pair; most other behaviors were considered 
to be improbable between casual acquaint- 
ances (less than 25%). Self-disclosure and 
physical contact were perceived to rise steeply 
(to 83% and 90%, respectively) as closeness 
increased. The behaviors subsumed under 
norm regulation and other-disparagement lev- 
eled off at a lower ceiling (68% and 61%, re- 
spectively). 

Sex composition. The sex composition of a 
relationship was also significantly associated 
with the expected probability of behavior,
F(1, 78) = 23.76, p < .0001. Overall, cross- 


sex pairs were expected to interact somewhat 
more than same-sex pairs (see Table 3). That 
effect was conditioned, however, both by the 
degree of closeness (there was no such differ- 
ence for good friends) and by the sex of the 
same-sex pairs (female-female pairs were believed
more likely than male-male pairs to
initiate interaction across almost all sorts of 
behavior). Further, while cross-sex means 
tended to be about the same regardless of who 
was the actor, in same-sex dyads females were 
expected to interact more than were males 
(see row totals in Table 3).

How did same-sex and cross-sex pairs differ 
in regard to particular behavior clusters? Con- 
sidering first the rather neutral behaviors 
classed under “joint activity,” interaction was 
expected to be somewhat greater in same-sex 
pairs—for good friends, significantly so (see 
Table 4). Regarding self-disclosure, same-sex 
and cross-sex means were overall about equal, 
but same-sex good friends and cross-sex close 
ones were believed to disclose the most. Other- 
enhancement was believed to be significantly 
higher between cross-sex partners, but this 
difference was entirely due to the low means 
for male-male dyads. Other-disparagement, 
in contrast, was expected to be higher between
same-sex than cross-sex partners, primarily
for good friends. The cluster of norm
regulation behaviors (i.e., dropping by unannounced,
borrowing belongings, and giving up
friends the other does not like) was believed
more characteristic of same-sex than of cross-sex
interaction, but mainly for the good
friends. Finally, as hypothesized, the largest
same- versus cross-sex difference, at every 
level of closeness, occurred for behaviors as- 
sociated with physical contact. 

Other notable sex differences may be seen 
in comparing the means for male and female 
same-sex dyads in Table 5. Overall, interac- 
tion was estimated as higher in the female 
pairs at each level of closeness; but these dif- 
ferences were due almost entirely to three of 
the six behavior clusters: to greater self-dis- 
closure, other-enhancement, and physical af- 
fection between females. There were insignifi- 
cant differences between male and female 
pairs with regard to expected joint activity, 
other-disparagement, or norm regulation. 

Sex of actor. The analysis of variance lo- 
cated a significant effect of the actor’s sex, 
F(1, 72) = 14.19, p < .0001, but only in the 
heterosexual pairs is it meaningful to con- 
sider male versus female actor differences. 
Turning to the data on cross-sex pairs in 
Table 5, one can examine differences in be- 
havioral interaction attributable to the sex of 
the actor. In general, the male was expected 
to initiate more physical activity, ranging 
from working to playing games and making 
love, whereas the female partner was be- 
lieved more likely to initiate verbal activity, 
such as giving praise or disclosing feelings. 

Such male-female differences appeared to 
decrease with increasing pair closeness. The 
Total column of Table 3 shows that in the 
casual acquaintanceship, the male partner 
was believed more likely than the female partner
to initiate all behaviors, particularly other-enhancement
and physical contact. When the
two persons are good friends or closer, how- 
ever, the difference becomes negligible and is 
actually reversed in the marriage relationship. 
In other words, our respondents perceived it 
as more appropriate for male than for fe- 
male members of casual relationships to ini- 
tiate interaction, but in more intimate rela- 
tionships, initiating an interchange was seen 
as equally appropriate for either sex. 


Generational Differences 


As shown above, the raters’ generation 
strongly influenced their expectations of ap- 
propriate behavior in various relationships, 
F(1, 78) = 45.40, p < .0001. Altogether, older 
raters expected less interaction between part- 
ners than did young raters (44% vs. 58%). 
We now consider additional differences be- 
tween groups both for relationships and for 
behaviors. We also explore within-group varia- 
tions in behavioral expectation. 

Relationship differences. Although older 
raters expected less interaction than did 
young raters, this difference was largest for 
the pairs intermediate in closeness (i.e., for 
the good friends and close relationship; see 
Figure 3). To assess these differences between 
the young and the old sample’s mean ratings, 
a set of 420 significance tests was computed— 
one for each of the 14 relationships, for each 
of the 30 behaviors. About 35% of these comparisons
were significant (by t test) at the
.003 level or higher—the appropriate signifi- 
cance level for the conservative Bonferroni 


Table 5
Mean Probabilities for 6 Behavior Clusters for 12 Relationships Differing in Sex Composition

[Table values not recoverable from the source.]

Note. For each pair, the first letter denotes the actor; the second, the target of the action. M = male; F = female.


Table 6
Mean Probabilities for Six Behavior Clusters Estimated by Young and Old Raters

                                              Other-     Other-     Norm
                        Joint      Self-      enhance-   dispar-    regula-    Physical
Group                   activity   disclosure ment       agement    tion       contact    Total

Young                   .73        .58        .65        .50        .46        .5?        .58
Old                     .66        .44        .54        .29        .?4        .39        [?]
Total                   .70        .51        .60        .39        .40        .47        [?]
Young-Old difference    .07        [?]        .12        .2?        .12        .15        [?]




t test comparison. These differences between 
the young and the old raters’ means were 
significant for only 15% of the casual ac- 
quaintances and 27% of the married pair rat- 
ings. In contrast, the young-old mean differ- 
ences were significant for 41% of the good 
friends and for 55% of the close relationship 
ratings.
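
The Bonferroni logic behind this criterion can be illustrated with a minimal Python sketch. The family-wise alpha of .05, the two family sizes, and the example p values below are assumptions made for illustration; the text reports only the .003 per-test level and does not say how it was derived.

```python
def bonferroni_threshold(family_alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate
    at or below family_alpha across n_tests comparisons."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


def share_significant(p_values, threshold: float) -> float:
    """Proportion of p values at or below the per-test threshold."""
    if not p_values:
        raise ValueError("p_values must not be empty")
    return sum(p <= threshold for p in p_values) / len(p_values)


FAMILY_ALPHA = 0.05  # assumed; not stated in the text

# Family = the 14 relationships rated for a single behavior (assumption).
per_behavior = bonferroni_threshold(FAMILY_ALPHA, 14)
# Family = all 14 x 30 = 420 comparisons (assumption).
all_tests = bonferroni_threshold(FAMILY_ALPHA, 14 * 30)

print(round(per_behavior, 4))  # 0.0036
print(round(all_tests, 5))     # 0.00012

# Hypothetical p values, not data from the study.
example_p = [0.0004, 0.002, 0.01, 0.2]
print(share_significant(example_p, per_behavior))  # 0.5
```

Under these assumptions, a family of 14 comparisons gives a threshold near .0036, which is close to the reported .003 level, whereas treating all 420 tests as a single family would give a much stricter threshold.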

Generational differences were more notice- 
able for ratings of cross-sex than of same-sex 
relationships (40% vs. 29% of the compari- 
sons), and these differences were significant 
most frequently for female-initiated interac- 
tion in the female-male pair: 49% of these 
contrasts were significant compared to 32% 
for the male-female, 30% for the female-female,
and 28% for the male-male pairs.

Behavior differences. Although the young 
raters consistently perceived more interaction 
across the 14 relationships than did the old 
raters, these differences varied according to 
the type of behavior. Table 6 shows the mean 
probabilities for the six behavior clusters. 
(The following discussion utilizes other data 
for individual items and relationships that are 
not presented here.) 

Intergenerational differences were least evi- 
dent in expectations about activity-centered 
interaction; the data indicate little change 
over the years, except for today’s females 
being thought more likely than their counterparts
of 50 years ago to initiate interaction
with their close male partner.

The largest difference occurred in the older
generation’s perceptions of greater inhibition
in regard to criticizing another; today’s raters
saw as more permissible the disparagement
of the partner, including such actions as nagging,
criticizing, and expressing irritation.

Generational differences were also quite
large for disclosing feelings to another. For
each relationship other than the casual acquaintanceship,
today’s 22-year-olds were perceived
as significantly more likely than those
of 50 years ago to express their feelings, to
ask about a partner’s personal problems, and
to confide their own personal problems.

Other-enhancement also was perceived as
more likely between today’s than yesterday’s
22-year-olds. Particularly in the close relationship,
friends today were perceived as
more likely to praise the other and to express
affection; especially, it was indicated that females
are freer to acknowledge affection toward
males than they were two generations
ago.

Physical contact was another noticeable
area of generational difference. Heterosexual
partners today were believed far more likely
to pat one another on the back, to hug, or to
make love; and such interactions today were
considered nearly as likely to be initiated by
females as by males. Such differences were
extended to same-sex pairings; young people
believed it significantly more likely that both
female and male pairs would express
physical affection for each other. (Another
finding, concerning negative physical contact
behaviors not included in the six major clusters—for
instance, slapping or hitting a partner—was
that they, too, were perceived as
more likely in today’s relationships.) The expression
of all sorts of feelings appears more
permissible today than yesterday.

Within-group age variations. After finding
such pervasive differences between the
young and the old group of raters, it seemed
important to ask whether similar trends
might be detected within a single age group.
If younger raters within an age group were
consistently found to expect more interaction
than older raters in that same group, such a
finding would strengthen the validity of the 
between-groups comparisons. 

The ages of the college raters were very 
homogeneous, with a range of only 8 years, 
but among the senior citizens the age range 
was 22 years (from 65 to 87). Therefore we 
expected to find no meaningful correlations 
within the young group but did hope to find 
negative Age × Probability correlations within
the old sample. Within each age group of 
raters, and for each of the 14 relationships, 
we computed correlations between the raters’ 
ages and their probability ratings on a sub- 
sample of six behaviors that had earlier shown 
significant between-groups differences.⁷

For the young group, as was expected, no 
overall trend emerged. For the older age 
group, 95% of the correlations were negative; 
21% of them were significant at the 5% 
level. In other words, 65-year-old raters de- 
scribing relationships of 45 years ago tended 
to recall a higher probability of interaction 
than did 75- or 85-year-old raters describing 
pairs of 55 or 65 years ago. This correla- 
tional finding gives strong confirmation to the 
age effects found in the between-groups comparisons.⁸
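
The within-group check described above amounts to correlating each rater's age with that rater's probability estimate for a given relationship and behavior. The sketch below shows a plain Pearson correlation in Python; the ages and ratings are hypothetical values invented for illustration, not data from the study, and the function name is an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical older raters: age and the rated probability (in percent)
# of one behavior in one relationship.
ages = [65, 68, 72, 75, 80, 87]
ratings = [70, 65, 60, 55, 50, 40]

r = pearson_r(ages, ratings)
print(r < 0)  # True: older raters gave lower estimates in this made-up sample
```

In the study, one such coefficient would be computed per relationship and behavior within each age group; the reported result concerns the share of those coefficients that were negative and the share that reached significance.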


Discussion 


It was found that variations in the close- 
ness and sex composition of pair relationships 
exerted strong effects on normative expecta- 
tions of interaction across a wide variety of 
dyadic behaviors. Those findings applied both 
to college students judging interaction be- 
tween today’s 22-year-olds and to senior citi- 
zens judging interaction between 22-year-olds 
when they themselves were young, although 
there also were important intergenerational 
differences. There were no significant effects 
of the judges’ own sex in either of the two 
samples. After reviewing such effects and dis- 
cussing their implications, we will assess the 
usefulness of this method for comparative re- 
search on close relationships. 


Effects of Closeness


As a relationship moves from casual to
close, its interaction probabilities were perceived
to increase correspondingly. Behaviors
pertaining to touching or disclosing oneself 
to the partner were believed to be most af- 
fected by increases in closeness, whereas less 
intimate joint activities, such as planning a 
joint project or doing an errand for the other, 
were believed to be least affected. These 
findings confirm earlier theoretical proposals 
about the meaning of a growing person-other 
“intersection” (Levinger, 1974; Levinger & 
Snoek, 1972). With the exception of fairly 
neutral task-oriented activity or of strongly 
affective sexual activity, the closeness of a 
relationship exerted an increasing, monotonic 
effect on perceived probabilities of interaction. 
Nonetheless, the relationships of good friends 
were perceived to be much more similar to 
those of close friends than to those of casual 
friends, as was indicated by their location on 
Dimension 1 of Figures 1 and 2 and by the 
means shown in Figure 3. This finding paral- 
lels those obtained in Mack’s study of per- 
ceived behaviors in heterosexual couples 
(cited in Levinger, 1974) and in Stambul’s 
(1975) retrospective investigation of stages 
in the courtships of premarital pairs; Stambul 
found that the stage of serious dating was 
much more similar to the stage of engagement 
than to that of casual dating. 

The two age groups were largely parallel 
in these effects but also showed some differences.
At each level of closeness, older respondents
rated the interaction probabilities
significantly lower than did the younger re- 
spondents. Greater young-old contrasts oc- 
curred for the relationships intermediate in 
closeness (good friends and close relation- 
ship) than for those at the extremes of casual- 
ness (casual acquaintances) and closeness 
(married); such contrasts were noted particularly
in the older respondents’ lower ratings
of physical contact and other-disparagement.


7 For this analysis we selected a sample of six behaviors
from different clusters that had shown larger
intergenerational differences: stand close to the other,
ask about the other’s personal problems, hug the
other, criticize the other, confide one’s personal problems,
and express irritation with the other.

8 In the preliminary study, too, the older sample of
raters had shown significant negative correlations
between raters’ ages and the behavior probabilities
they estimated for the pairs of friends.


Effects of Sex Composition 


Norms for cross-sex pairs were seen to 
differ from those for same-sex pairs, but those 
for female and male same-sex pairs were per- 
ceived as even more dissimilar. The sex of 
the actor, too, had an effect on ratings of be- 
havioral expectations. 

Cross-sex versus same-sex differences. The 
largest differences between cross-sex and 
same-sex pairs occurred for the close rela- 
tionship. At this level of closeness, cross-sex 
pairs were perceived as more likely than 
same-sex pairs to engage in self-disclosure, 
other-enhancement, and physical contact. In
contrast, for good friends, same-sex pairs were
seen as more likely to interact than were 
cross-sex ones (except for having physical 
contact). 

Male-male versus female-female differences.
The effect of sex composition is revealed further
in the ratings of interaction in same-sex
pairs. Female pairs were expected to engage
in greater self-disclosure, other-enhancement,
and physical contact than were male pairs.
These expectations parallel findings from other
studies. Previous research has found that female
friends do indeed disclose more intimate
details about their lives than do male friends
(Jourard, 1971), report greater emotional involvement
in their friendships (Rubin, 1973;
Purdy, Note 3), have longer interactions
(Wheeler & Nezlek, 1977), and do things together
more spontaneously (Booth, 1972).
Weiss and Lowenthal (1975) found that 
males emphasize sharing activities and in- 
terests more than do females, whereas females 
emphasize the importance of warm support- 
iveness. 

Pleck (1975) has suggested that women’s
friendships are traditionally intimate, whereas
men’s friendships are more sociable than intimate,
a distinction he derived from an analogy
to Lewin’s (1948) cross-cultural analysis
of German versus American relationships. Our
data support Pleck’s suggestion. The multidimensional
analysis shows that at each degree
of closeness, most sorts of interactions
were expected to be lowest for same-sex male
pairs, intermediate for female pairs, and high- 
est for cross-sex pairs. Males, then, are be- 
lieved to obtain emotional gratification from 
cross-sex relationships, whereas females are 
intimate with friends of either sex. This inequality,
also alluded to in studies of adolescent
friendships by Douvan and Adelson
(1966) and Komarovsky (1976), implies that
males are more emotionally dependent than
females on their heterosexual relationships, in
that they have learned to minimize the emotional
value of their same-sex friendship.

Sex of actor. Differences between male and 
female actors were in accord with tradition. 
Males were believed more likely to initiate 
overt physical activities, such as playing 
games, and females to initiate verbal interac- 
tion, but such sex differences were expected 
to diminish with the deepening of a rela- 
tionship. 

This finding accords well with independent 
empirical research. For example, in a study 
of discussions in heterosexual pairs, Heiss 
(1962) found that among “casual” daters,
males were more task oriented and females 
more social-emotional, but “committed” part- 
ners differed little along such traditional lines. 
Furthermore, two studies of married couples 
(Levinger, 1964; Raush, Barry, Hertel, & 
Swain, 1974) found little difference between 
male and female spouses in their dyadic social-emotional
behavior. In other words, increasing
closeness is associated with a reduction
of sex typing.


Variations in Behavior 


Multivariate analyses indicated the pres- 
ence of a primary dimension indicating gen- 
eral activity or behavioral interdependence 
and of a secondary dimension indicating in- 
timacy or affective interdependence. The first 
dimension means that to the extent that partners
share one type of interaction, they are
judged likely to engage also in other kinds
of exchanges. A wide range of behaviors was
believed to become more probable as relationships
become closer.

The second dimension refers to behaviors
that are associated with affection; they include
physical touch, hugging, making love,
and expressing affection. For the older raters,
negative remarks to the partner as well as 
strongly positive disclosures were also a com- 
ponent of this “affectivity” dimension. 

The present study was designed to com- 
pare relationships only with regard to varia- 
tions in their expected behavior; the two 
orthogonal dimensions obtained in the present 
study, therefore, are undoubtedly influenced 
by the selection of the particular behaviors. 
Nonetheless, our two dimensions of behavioral 
and affective interdependence appear gen- 
erally meaningful. Behavioral interdependence 
refers to the sheer frequency or likelihood of 
joint action across a wide variety of behav- 
iors. Affective interdependence, in contrast, 
refers to the frequency of emotional expres- 
sion toward the other. 

Selection of behaviors in the present study. 
Previous studies compared relationships with 
regard to traits (e.g., Wish et al., 1976)
or interpersonal feelings (DeSoto & Kuethe, 
1959), but in this study we focused on overt 
action. In doing so, we found great diffi- 
culty in selecting a satisfactory sample of 
interpersonal behaviors, both in getting lists 
of pair behaviors from samples of naive judges 
and in constructing such lists ourselves. The 
list of 30 behaviors was derived largely from 
the suggestions of college students and other 
younger persons. It may have been less rep- 
resentative of behaviors that older respon- 
dents would have considered important, even 
though they were able to respond to these 
items. 

Our preliminary study of 50 behaviors had 
included equal numbers of associative and 
dissociative acts, but raters had found nega- 
tive acts to be less clearly interpretable. Be- 
cause of that ambiguity, and also because the 
many negative (unlikely) acts may have 
lowered rater motivation to consider relation- 
ships realistically, the main study included 
only seven negative behavior items. These 
points are mentioned in the hope that future 
investigators will arrive at a more elegant 
sample of behavior items. 


Effects of the Raters’ Generation


Older raters, recalling the behavior of 22-year-old
pairs when they themselves were
that age, showed systematically different impressions
of pair interaction from those of
college-age raters. For almost all relationships 
and behaviors, old raters recalled significantly 
lower probabilities of interaction than did 
young raters; the differences were particu- 
larly pronounced for the good friends and for 
the close relationship. Today’s pairs were 
expected to be more expressive, to do more 
things together, to reveal a greater proportion 
of their positive and negative feelings, and 
to have more physical contact. Such young- 
old differences were least for superficial activities
(e.g., playing games or spending a
social evening together) and were less marked 
for the casual and the married pairs. Parallel 
young-old differences were found in an in- 
ternal analysis of the two samples of older 
raters; for a set of representative behaviors, 
the raters’ ages and their estimates of inter- 
action probabilities were inversely correlated. 

One other age difference pertained to the 
heterosexual relationships. Old raters made a 
significantly greater distinction between the 
behaviors of female and male actors. Yester- 
day’s female was perceived as less equal to 
her male partner than is today’s. The genera- 
tional difference in this study referred par- 
ticularly to the female’s initiation of physical 
contact and of joint activity. 

Limitations of the young—old comparison. 
It was noted earlier that these data refer to 
raters’ perceptions of typical relationships. 
We do not know how nearly such ratings cor- 
respond to “reality.” It is possible, for ex- 
ample, that older raters, who had to think 
back 50 years to recall their own and other 
people’s friendly relationships when they were 
young, would have distorted their ratings of 
interaction in some systematic way. In other 
words, it is possible that yesterday’s stereo- 
types were less accurate than today’s. It is 
also likely that 50 or so years ago, there were 
in fact fewer instances of “close” heterosexual 
relationships short of marriage between 22- 
year-olds; our questionnaire did not ask re- 
spondents to indicate the expected frequency 
of such relationships, but that fact may have 
influenced these results. Furthermore, while 
most of the older raters had a middle-class, 
fairly well educated socioeconomic background,
it is impossible to make a clear-cut
socioeconomic comparison between two age 
groups at such very different stages in their 
lives. 

Nonetheless, our correlational results do 
support the idea of growing trends toward 
more openness and expression. Despite the 
necessary absence of customary experimental 
controls, our results do confirm intuitive anal- 
yses (Stegner, 1971) and documentary evi- 
dence (Gadlin, 1977; Murstein, 1974) of 
historical changes in intimate relationships. 


Comparative Assessment of Relationships 


The present study has examined a tech- 
nique for collecting and analyzing “snap- 
shots” of different stages of interpersonal re- 
lationships, beginning with casual acquaint- 
anceship and moving through friendship to 
very close relationship and marriage. With- 
out tracking the interaction of actual pairs 
over time, it enables one to gather systematic 
data about people’s implicit theories about the 
conduct and the development of relationships. 

Considered alone, the approach yields sum- 
maries of people’s subjective impressions that 
permit intergroup comparisons. Considered in 
conjunction with other more objective ob- 
servations, a useful overall picture may 
emerge. Despite its susceptibility to possible 
distortion, the present procedure permitted 
a systematic intergenerational comparison that 
can be integrated with documentary data. 
Furthermore, a variety of other empirical stud- 
ies (see Chaikin & Derlega, 1974; Huston & 
Levinger, 1978; Levinger, 1974; Morton, 
1978) have shown that relations at super- 
ficial levels are oriented largely toward do- 
ing a particular activity with any other per- 
son; at deeper levels they become increasingly 
oriented toward doing any activity with a 
particular other. The present research tends 
to confirm this.

Our survey method provides a mapping of 
people’s beliefs about behavior—a means of
comparing differences across relationships that 
vary in depth and sex composition. A weak- 
ness in earlier relationship research has been 
its lack of system, its tendency to study some 
variables in one kind of relationship (e.g.,
marriage) and quite other variables for another
relationship (e.g., same-sex friendship
or first impressions of an acquaintance; Hus- 
ton & Levinger, 1978). The present method 
may encourage better comparative research.
It may be useful for historical and cross-cul- 
tural comparisons and thus may facilitate the 
recognition that the meaning of interpersonal 
behavior is embedded in its sociocultural context.


In a simulated industrial setting, subjects performed a clerical task, believing
that their pay was being determined by a peer allocator. After being treated 
inequitably, subjects were able either to threaten the (fictitious) allocator or 
appeal to fairness principles, or they had no say (“mute” condition). During a 
second series of pay periods, subjects’ pay either remained constant, improved 
such that they and the allocator received equal shares, or improved such that 
subjects received more than the allocator (“compensation” condition). The total
pay was identical in all conditions and created a context of overall inequity. Increased
satisfaction and perceived fairness were observed with improved outcomes
in both the mute and the threat conditions. In the appeal conditions,
satisfaction and perceived fairness were highest in the equality cell. These results
are interpreted in terms of relative deprivation. Implications for responses of
recipients in ameliorative social programs are presented.


The concept of relative deprivation was in- 
troduced by Stouffer et al. (1949) to account 
for “paradoxical” observations in their stud- 
ies of the American soldier during World War 
II. Theoretical development in this area has 
continued to the present day (e.g., Crosby, 
1976; Davis, 1959; Gurr, 1970; Runciman, 
1966). In all of these formulations, felt in- 
justice has been characterized by a discrep- 
ancy between the individual’s outcomes and 
experiences and those that he or she feels are 
deserved. A similar rationale is evident in 
recent formulations of equity theory (Adams, 
1965; Berkowitz & Walster, 1976), distributive
justice (Homans, 1974), and justice theory
(Lerner, 1977). The consequences of injustice
involve psychological states roughly
equivalent to anger and dissatisfaction. These 
states are hypothesized to motivate behav- 
ioral attempts to remedy the injustice, and 
there exists a strong historical connection be- 
tween these states and conflict, violence, and 
revolution (cf. Davies, 1962; Sears & McConahay,
1973). Though the results of attempts
to rectify injustice have often been
in the direction of objective outcome improve- 
ment, the satisfaction associated with these 
increases in outcome level is sometimes tem- 
porary. This is because the subjective stan- 
dard by which outcome levels are evaluated 
may also rise when outcomes improve, a process
referred to as the “hedonic treadmill”
by Brickman and Campbell (1971).

The improvement itself may be a source of
rising expectations, by making salient valued
end states to which the individual feels en-
titled. To the extent that these rising, legiti-
mate expectations are violated by improve-
ments that fail to rise at the same rate or
occur in a capricious manner, a second condi-
tion for dissatisfaction is present. Finally, if
the violation occurs by illegitimate means, as
in the case in which an allocator of outcomes
is perceived to be intentionally holding back
the rate or extent of improvement, the experi-
ence of injustice should be acute. Under these
conditions, outcome improvement may be pre-
dicted to lead to dissatisfaction and hostility
rather than to satisfaction (cf. Brown, ' 

Brown & Herrnstein, 1976; Crosby, 1976).

EVALUATION OF OUTCOME IMPROVEMENT


Research Evidence 


Anecdotal evidence for this proposition 
abounds. For example, de Tocqueville ob- 
served that “evils which are patiently en- 
dured when they seem inevitable become in- 
tolerable when once the idea of escape from 
them is suggested” (quoted in Davies, 1962, 
p. 6). However, experimental evidence con- 
cerning the link between improvement and 
discontent is scarce. Thibaut (1950) found 
that disadvantaged parties who were success- 
ful in influencing the experimenter to allow 
them to perform the interesting tasks in an 
experimental game exhibited more hostility 
toward the formerly advantaged group than 
did those who were unsuccessful. 

A recent 2 × 2 × 2 experiment by Folger
(1977) examined these issues more closely. 
He constructed a simulated industrial setting 
in which sixth-grade boys performed a cleri- 
cal task and were paid over a series of trials 
by a (fictitious) manager. After an initial 
period of inequitable treatment, half the boys 
were allowed to express allocation desires to 
the manager (“voice” condition), whereas half
were not (“mute” condition). During the re- 
maining pay periods, the subjects’ pay either 
improved or remained constant, and the final 
totals were either inequitable (3-3 in the 
manager’s favor) or equitable (4-4). The re- 
sults were that in the inequity half, improve- 
ment led to greater perceptions of fairness 
and lower requests for more pay in a bonus
session than did no improvement when it was
granted arbitrarily (mute condition), but the
reverse was true in the voice conditions. In
the equity half, perceived fairness was high
in all of the cells. What could account for
these data, particularly the pattern of means
in the inequity half?

Folger suggested that the manager’s posi-
tive response to the subject’s expression of al-
location desires (i.e., the improvement) rep-
resented an acknowledgment that the appeal
was legitimate and created the expectation
that fair allocations would follow. The in-
equity that remained in the final totals dashed
these hopes in a particularly cruel fashion. 
From the subjects’ perspective, the manager 
confirmed their right to higher pay and then
blithely continued allocating for his own bene-
fit. The negative response (no improvement)
did not endow the appeal with this legitimacy 
and may even have represented a denial of 
subjects’ right to more pay. This response 
undercut any basis for strong discontent. In 
the mute conditions, on the other hand, the 
allocations were experienced relative to those 
of the first set of pay periods. The improved 
outcomes were satisfying; the constant ones 
were not. 


Rationale for the Experiment 


The Folger (1977) account relies very 
heavily on the nature of the interchange be- 
tween worker and manager. The “voice” was 
clearly an appeal to principles of fairness. 
Subjects filled out a “fair pay card” to com- 
municate how they thought pay ought to be 
allocated. There are other possible interpreta- 
tions, however, that are simpler in the sense 
that they suggest that voice per se may have 
produced the differing evaluations of the im- 
provement in the mute and voice conditions. 
Discontent in the voice-improvement cell, 
relative to that in the mute-improvement cell, 
may have reflected a greater willingness on 
the part of these subjects to express their feel- 
ings (“expression hypothesis”). These sub- 
jects, after all, had already been encouraged 
once before by the experimenter to voice their 
discontent, whereas those in the mute condi- 
tions had not. Voice-improvement subjects 
may have thought the manager less fair be- 
cause they attributed the improvement to the 
“external pressure” of their complaint, rather 
than to the manager’s spontaneous desire to 
increase their outcomes (“attribution hypoth-
esis”; cf. Kelley, 1971). Finally, voice-im-
provement subjects may have viewed the man-
ager as a bully who “acted tough” in the first
few pay periods but who was easily cowed 
when they confronted him, leading to a low- 
ered evaluation of him and his actions (“der- 
ogation hypothesis”). 

A direct way to disentangle these possibil- 
ities is to manipulate the nature of the op- 
portunity for expression available to sub- 
jects. If it is voice per se that produces the 
effect, then allocations following from differ-
ent kinds of voice should be evaluated simi-
larly. If the effect depends on the arousal of
fairness principles, then communications to 
the allocator not involving issues of fairness 
should not result in evaluations of the sub- 
sequent allocations similar to those that do. 
This raises the general issue of the impact of 
the procedures by which improvements are 
wrested from advantaged parties on how these 
“gains” are evaluated. 

A particularly interesting type of voice, 
not involving fairness, is one in which the 
worker is able to threaten the allocator with 
dire consequences if the pay is not increased. 
Note that this maneuver involves expression, 
allows the improvement to be attributed to 
“external” pressure, and involves social in- 
fluence. Thus, from the perspectives of the 
expression, attribution, and derogation hy- 
potheses it is equivalent to an appeal to fair- 
ness. If, on the other hand, fairness prin- 
ciples are necessary for discontent, are there 
any grounds for predicting how an improve- 
ment following a threat to the allocator will 
be evaluated? The most straightforward pre- 
diction is that subjects will feel satisfied 
when their outcomes increase and dissatisfied 
when they do not. Satisfaction should be in- 
creased when the improved outcomes exceed 
those experienced previously; it should also 
be increased by the sense of mastery or con- 
trol inherent in being able to force an appar- 
ently unwilling allocator to move away from 
the initial allocation pattern. It is also pos- 
sible that subjects in these threat conditions 
would be more satisfied even than those in 
the mute conditions, where the increase comes 
about beyond the subjects’ control (cf. Wort- 
man & Brehm, 1975). Thus, this “threat” 
condition is one of special interest, because it
could result in either satisfaction or dissatis- 
faction with the improvement. 

To this point, outcome improvement has 
been discussed in general terms. It may be in-
teresting, however, to examine how improve-
ments of different objective magnitudes are
experienced. Several alternatives are available 
to an allocator confronted by a request from 
a disadvantaged party for higher outcomes. 
One is to maintain the status quo (i.e., to 
continue the deprivation). With respect to
improvement, the allocator could terminate
the existing discrimination and begin allocat-
ing fairly from that point on (“equality”),
or he or she could compensate the worker for
the past inequity by giving the (former) vic-
tim a better than even share of the remaining
allocations (“compensation”). It is clear that
the magnitude of the improvement in the
compensation case is objectively greater than
that in the equality case, which in turn ex-
ceeds that in the constant case. The issue of
what is fair compensation for victims of dis-
crimination is a complex and explosive one,
as the furor over allegations of “reverse dis-
crimination” indicates. The focus here, how-
ever, is on how the victims themselves ex-
perience these improvements.


Hypotheses


The hypotheses will be presented in terms
of how the improvements should be experi-
enced within each level of the voice factor.
It should be noted that all of the experimen-
tal conditions occurred within the context of
overall inequity, since it was in these condi-
tions that Folger (1977) found his most in-
teresting results.

In the mute conditions, where improve-
ments are evaluated against initial alloca-
tions, satisfaction and perceived fairness
should increase as the new outcome level ex-
ceeds the old (i.e., be greatest in the com-
pensation case and least in the constant case).
In the appeal conditions, this pattern should
be reversed. If the constant allocations repre-
sent a denial of the worker’s right to higher
pay, as Folger suggested, there should be
little, if any, discontent when no improve-
ment is forthcoming. In the improvement
cases, the positive response should increase
the perceived legitimacy of the subject’s
claim. Further, it is likely that the magnitude
of the improvement itself should affect sub-
jects’ perception of their own deservingness.
The further subjects are taken in the direc-
tion of a complete restoration of equity
(where Folger found high satisfaction), the
more they should expect it and feel entitled
to it. These subjects should also feel more
chagrined when this standard is not reached.
In a sense, it is like a “goal gradient,” where
they are taken ever closer to the goal before
it is snatched away. Thus, satisfaction and
perceived fairness should be lowest in the
compensation case and greatest in the con-
stant case.

The threat conditions could go either way. 
With no fairness principles involved, it is 
likely that the improvements would be experi- 
enced as increasingly satisfying as they ex- 
ceed the initial outcome level. Further, the 
larger the increase that had been forced from 
the unwilling allocator, the more control or 
mastery of the situation subjects should feel, 
which should also increase their satisfaction. 
On the other hand, the expression, attribu- 
tion, and derogation hypotheses would pre- 
dict a pattern similar to that described above 
for the appeal conditions, since the increase 
in both cases resulted from “voice.” 

These predictions suggest a Voice × Im-
provement interaction. A 3 × 3 experiment
crossing outcome improvement (constant, 
equality, compensation) and opportunity for 
expression (mute, threat, appeal) was con- 
ducted to examine these issues. 
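
The layout of this factorial design can be sketched in a few lines of Python. This is only an illustration of how the nine cells arise from crossing the two factors; the variable names are ours, and the figure of 10 subjects per cell is the one reported under Participants.

```python
from itertools import product

# Factor levels as named in the text.
VOICE = ("mute", "threat", "appeal")
IMPROVEMENT = ("constant", "equality", "compensation")

# Cell size reported in the Participants section.
SUBJECTS_PER_CELL = 10

# Crossing the two factors yields one experimental cell per combination.
cells = list(product(VOICE, IMPROVEMENT))
total_subjects = len(cells) * SUBJECTS_PER_CELL

for voice, improvement in cells:
    print(f"{voice:>6} x {improvement}")
print(f"{len(cells)} cells, {total_subjects} subjects in total")
```

Listing the cells this way makes it easy to see that each voice level is paired with every level of improvement, which is what allows the Voice × Improvement interaction to be tested.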


Method 
Participants 


Ninety-seven persons (63 females, 34 males) were 
recruited via sign-up sheets posted around the Uni- 
versity of North Carolina at Chapel Hill. The ex- 
perimenter scheduled each person by phone. Subjects 
were run two at a time, but since interaction between
them was minimal, the individual was used as the
unit of analysis. Five females and two males turned
out to be “professional subjects” who attended any
experiment involving payment. These subjects ex-
pressed some suspicion about the identity of the
allocator. Their data were excluded from the re-
ported analyses. The remaining 58 females and 32
males were randomly assigned to conditions until 
there were 10 respondents per cell. 


Procedure 


Two subjects at a time reported for an experiment 
called “Organizational Behavior” and were immedi- 
ately ushered into separate rooms, which were linked 
to the experimenter’s control room via an intercom 
system and a set of signal lights. The experimenter 
began by explaining that the focus of the experi-
ment was on “factors affecting job satisfaction and
performance.” Subjects were told that the organiza-
tion in which they would be working would con-
sist of two roles: worker and allocator.

At this point subjects were assigned to roles by 
the use of a series of signal lights. All subjects dis- 
covered that they would be the worker and assumed 
that the other subject would be the allocator. In fact, 
the role of allocator was played by the experimenter. 
Each subject was aware of the sex of the allocator, 
but there was no systematic relation between experi- 
mental condition and sex of subject or the per- 
ceived sex of the allocator. The experimenter visited 
each subject and gave him or her a “worker’s folder” 
containing experimental materials. 

The experimenter returned to the control room 
and specified the duties of the two roles. The worker 
was told that he or she would perform a clerical 
task—transcribing numbers set up in the form of 
business transactions from a computer printout onto 
a work sheet. The number of items transcribed was 
to be used as a measure of productivity. Subjects 
were told that they would be paid by the allocator 
over a series of 10 pay periods, each lasting 1 
minute. The allocator was to receive 24¢ in each pay 
period from the experimenter and would then divide 
up this sum between him- or herself and the worker. 
The experimenter indicated that he would announce 
the allocator’s pay decisions after each pay period.
Subjects were asked to write down these allocations 
on the payment record card in their folders. Subjects 
were also told that the allocator would check the 
worker’s performance, but would only be able to do
so at the end of the final pay period. In order to
make the experimental situation more plausible, the 
experimenter indicated that this phase of the study 
was concerned with the degree of contact among 
organization members. They were told that this was 
the minimal contact condition, which accounted for 
their being in separate rooms and the allocator divid- 
ing up the pay at each pay period despite only being 
able to check the worker’s performance at the end.

At this point, the experimenter visited each subject 
and answered questions about the procedure. Sub- 
jects were asked to fill out a fair pay card that 
asked what they thought would be a fair way for 
the allocator to divide up the pay. 

The experimenter then announced that the first 
series of pay periods was about to begin and that 
subjects should begin their tasks. At 1-minute in-
tervals, the experimenter informed them of the al-
locations. During this first series, the allocations were
consistently unfair to the worker. After Pay Period 5, 
subjects were asked to total up the allocations on 
the payment record card and to fill out the first 
questionnaire.

The opportunity for expression manipulation was 
introduced at this point. In the mute conditions, the 
experimenter visited the subjects and asked whether 
they had any questions about the procedure. In the 
voice conditions, subjects were able to select a mes- 
sage for the experimenter to deliver to the allocator. 

During the remaining five pay periods, the im-
provement manipulation was introduced. In the con-
stant conditions, the worker received the same out-
comes as in the first half, whereas in both improve-
ment conditions, his or her outcomes increased. In
the equality conditions, the worker and allocator re- 
ceived equal shares, whereas in the overcompensation 
conditions, the allocator gave the worker more than 
he or she kept. 

After Pay Period 10, the experimenter announced 
that the work session was over. He asked subjects to 
calculate their second-half and final allocation totals 
and to fill out the final questionnaire. When sub- 
jects were finished, they were led into a common 
room, fully debriefed, and paid a standard sum of 
$2 each for their participation. 


Independent Variables 


Two factors were manipulated: (a) opportunity 
for expression and (b) outcome improvement. The 
opportunity-for-expression factor was introduced be- 
tween the two series of pay periods. The purpose of 
this manipulation was to vary the “tone” of the 
interchange with the allocator and cast the subse- 
quent allocations in different lights. In the mute con- 
ditions, there was no opportunity for the subject to 
contact the allocator. The experimenter simply visited 
the subject’s room and answered any questions the 
subject might have had. In both voice conditions, 
however, the subject was able to select one of three 
messages for the experimenter to pass to the al-
locator. One of these messages expressed satisfaction
with the initial allocations, one suggested that the 
allocator should take more in the second half, and 
the third expressed dissatisfaction. Of course, it was 
expected that subjects would choose the third mes-
sage.

This message went on to include further informa- 
tion specific to the threat or appeal conditions. In 
the threat conditions, the tone was clearly power- 
oriented and belligerent. Subjects believed that there 
would be a short bonus session following the 10th 
pay period, during which they would be able to 
bargain with the allocator. During this session, they 
were told they would be able to deliver loud bursts 
of noise into the allocator’s headphones to back up 
their demands. The message that the subjects were 
able to choose emphasized this capability and stated 
that if their outcomes were not improved in the 
second half, they would use this weapon during the 
bonus session. The message concluded by demanding
that the total shares of pay end up equal. Thus, this 
message did not mince words and had a distinctly 
“or else” quality about it. 

In the appeal conditions, however, the message ap-
pealed to the allocator’s sense of fairness and tried
to convince him or her of the legitimacy of the re-
quest for a larger share. This message also stated
that the total shares ought to end up equal. To fur-
ther enhance the legitimacy of the worker’s request
for more pay, the experimenter supposedly gave the
allocator the subject’s first questionnaire, which
spelled out his or her dissatisfaction and felt in-
justice, and the subject’s work sheet, which detailed
the worker’s performance during the first half. Thus,
this message was based on an appeal to fairness and
emphasized the worker’s entitlement to higher pay.

The opportunity-for-expression factor was crossed
factorially by a manipulation of outcome improve-
ment. This manipulation was defined in terms of
the amount allocated in the second half relative to
that allocated in the first half. In the constant condi-
tions, the allocator took two thirds of the available
pay in both halves. The totals were $1.60 for the
allocator and 80¢ for the worker. In the two im-
provement conditions, the final totals were identical,
but the outcomes were distributed differently over
the 10 pay periods. In the equality conditions, the
allocator gave the worker an exactly even split (60¢-
60¢) during the second half, whereas in the com-
pensation conditions, the allocator gave the worker
more (70¢-50¢) than he or she kept. Table 1 pro-
vides an array of the allocation schedules on a trial-
by-trial basis for both worker and allocator.
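
The arithmetic of the three schedules can be checked with a short Python sketch. The worker pay values are those listed in Table 1 and the 24¢ pot per pay period is taken from the Procedure; the function name `summarize` and the dictionary layout are our own illustrative choices, not part of the experimental materials.

```python
# Cents available to the allocator in each pay period (from the Procedure).
POT = 24

# Worker pay in cents for Pay Periods 1-10, by improvement condition (Table 1).
worker_pay = {
    "constant":     [8, 8, 9, 7, 8, 8, 7, 8, 9, 8],
    "equality":     [4, 4, 5, 3, 4, 12, 12, 12, 12, 12],
    "compensation": [2, 2, 3, 1, 2, 14, 15, 14, 13, 14],
}


def summarize(schedule, split=5):
    """Return half and overall totals for one worker pay schedule."""
    first_half = sum(schedule[:split])
    second_half = sum(schedule[split:])
    worker_total = first_half + second_half
    # Whatever the worker does not receive from the pot stays with the allocator.
    allocator_total = POT * len(schedule) - worker_total
    return {
        "first_half": first_half,
        "second_half": second_half,
        "worker_total": worker_total,
        "allocator_total": allocator_total,
    }


summary = {name: summarize(schedule) for name, schedule in worker_pay.items()}
for name, totals in summary.items():
    print(name, totals)
```

Under these values the worker ends with 80¢ and the allocator with $1.60 in every condition, so the conditions differ only in how the same overall totals are distributed across the two halves.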


Dependent Variables 


The dependent variables were assessed at various 
points in the procedure. Before the pay periods, sub- 
jects filled out a fair pay card, which was a check on 
the perceived equity norm. Following Pay Period 5, 
subjects completed the first questionnaire, which as-
sessed their satisfaction with, and perceptions of the
fairness of, the initial allocations. Finally, at the
conclusion of the work session, subjects completed
the final questionnaire, which contained the main
dependent variables. Subjects were asked how satis-
fied they were with their share of the pay in the
second half (“satisfaction”), and how fairly they 
felt the allocator was dividing up the pay in the 
second half (“fairness”). They were also queried on 
their feelings about the total allocations, and the 
number of items they transcribed was tallied. 


Results 
Effectiveness of the Manipulations 


The improvement manipulation was checked
by asking subjects if their share of the pay
changed from the first to the second half.
Eighty-nine of the 90 subjects responded cor-
rectly (“no” in the constant conditions, “yes”
in the improvement ones). They were also
asked how much of a change there was in
their share of the pay from the first to the
second half (1 = none, 11 = very much). The
means for the constant, equality, and com-
pensation conditions were 1.2, 8.9, and ”"” 
respectively, F(2, 72) = 315.52, p < .001.

Table 1
Worker and Allocator Pay Schedules for the 10 Pay Periods

                   Constant            Equality          Compensation
Pay period     Worker  Allocator   Worker  Allocator   Worker  Allocator

First half
  1               8       16          4       20          2       22
  2               8       16          4       20          2       22
  3               9       15          5       19          3       21
  4               7       17          3       21          1       23
  5               8       16          4       20          2       22
  Sum            40       80         20      100         10      110
Second half
  6               8       16         12       12         14       10
  7               7       17         12       12         15        9
  8               8       16         12       12         14       10
  9               9       15         12       12         13       11
  10              8       16         12       12         14       10
  Sum            40       80         60       60         70       50
Total            80      160         80      160         80      160

Note. The amounts are in cents.

The opportunity-for-expression manipula-
tion was checked by giving subjects a choice
of four alternatives with which to sum up the
content of the message sent to the allocator
after Pay Period 5. The choices were (a)
sending no message, (b) threatening possible
retaliation, (c) appealing to fairness prin-
ciples, and (d) other (specify). All 30 mute-
condition subjects chose (a), 23 of the 30
threat-condition subjects chose (b), and 28
of the 30 appeal-condition subjects chose (c),
χ²(6) = 148.10, p < .001.

Finally, subjects were asked how much in- 
fluence they had been able to exert over the 
allocator to change the division of pay from 
the first to the second half (1 = none, 11 = 
very much). The Voice × Improvement inter-
action was significant, F(4, 72) = 9.52, p <
.01. The mean for the four voice-improvement
cells was 6.25, whereas the mean for the three 
mute cells and the two voice-constant cells 
was 1.16. 


Evaluation of the Improvement 


The analyses reported below were initially
run using sex of subject as an additional fac-
tor to check on the generality of the effects.
However, there were no significant effects in- 
volving this variable, so the data for male 
and female subjects were collapsed. 

The first effect of interest is the signifi- 
cant improvement main effect: satisfaction, 
F(2, 72) = 45.53, p < .001, and fairness,
F(2, 72) = 63.56, p < .001. The pattern of 
means indicated that in general, subjects were 
more satisfied and perceived the allocator to 
have been more fair when their outcomes 
improved in the second half than when they 
did not (see Table 2). The voice main effect 
was also significant for the satisfaction mea- 
sure, F(2, 72) = 5.20, p < .01, but not for 
the fairness measure (p > .35).

These main effects were qualified by the 
hypothesized Voice × Improvement interac-
tion: satisfaction, F(4, 72) = 4.62, p < .01,
and fairness, F(4, 72) = 2.43, p < .055. The
interaction was explored with Scheffé tests to 
specify more closely the nature of the effect. 
Only those tests achieving significance (p < 
.05) on both measures will be presented. 
Within both the mute and threat conditions,
subjects felt more satisfied and more fairly
treated when their outcomes improved than
when they did not. The equality and com-
pensation cells did not differ. Within the ap-
peal condition, however, the improvement in
the equality cell resulted in increased satis-
faction and perceived fairness compared with
the constant cell. The objectively larger im-
provement in the compensation cell did not
produce mean differences from the allocations
in the constant cell and resulted in signifi-
cantly lower satisfaction and fairness than
did that in the equality cell. In fact, the ap-
peal-compensation cell did not differ from
the average of all the constant cells and was
significantly lower than the average of the
rest of the improvement cells. The means for
the satisfaction and perceived fairness mea-
sures are presented in Table 2.

Table 2
Means for the Evaluation-of-Improvement Items

                                          Opportunity for expression
                           Mute                              Threat                             Appeal
Measure        Constant  Equality  Compensation  Constant  Equality  Compensation  Constant  Equality  Compensation
Satisfaction      1.6       6.1        7.3          2.0       8.6        8.6          1.3       8.3        3.6
Fairness          1.5       8.6        8.0          1.9       8.0        6.9          1.5       9.3        4.4

Note. Higher numbers indicate higher satisfaction and fairness (1 = not at all, 11 = very much).


Total Satisfaction and Fairness Measures


Subjects were asked to indicate their satis-
faction and perceptions of fairness of the to-
tal allocations. The improvement main effect
was significant: total satisfaction, F(2, 72) =
7.93, p < .001, and total fairness, F(2, 72)
= 4.14, p < .05. The pattern of means indi-
cated that subjects were more satisfied with,
and felt there was more fairness in, the total
allocations when there was an improvement
in the second half than when there was not.
It should be noted again that subjects’ total
allocations were identical in every condition.
Finally, the behavioral measure of number of
items transcribed yielded no significant effects.


Discussion


The focus of the present research was to 
examine how disadvantaged parties in a rela- 
tionship experience improvements of varying 
magnitudes as a function of the procedures 
by which these increases in their objective 
outcome level are introduced. 


Evaluation of the Improvement 


The main effect for the improvement factor
indicates that subjects were more satisfied
and felt that the allocator had been more
fair when their pay improved in the second
half than when it did not. This is consistent
with a hedonistic view of human beings as
motivated to maximize their outcomes. The
demand for compensation is a popular re-
sponse to inequity (Leventhal & Bergman,
1969), because the disadvantaged party bene-
fits materially, as well as psychologically
(with the reduction of “inequity distress”).
The present results indicate that in general,
compensation will be preferred to no com-
pensation, even if the amount is not sufficient
to restore overall equity.

This main effect was qualified by a Voice ×
Improvement interaction. To facilitate the
presentation, the evaluation of the improve-
ment will be discussed within each level of
voice separately.

Mute conditions. In the mute conditions,
subjects were more satisfied when their out-
comes increased than when they did not. The
two levels of improvement did not produce
differential satisfaction, although the com-
pensation mean was higher than the equality
mean. It is likely that the positive experience
of the improvement resulted from the in- 
creased outcomes exceeding subjects’ compari- 
son levels (CL; Thibaut & Kelley, 1959). 
Since the increase occurred beyond subjects’ 
control, their CL was probably not affected 
by the higher outcomes and remained rooted 
in the allocations of the first half. The result 
was a positive outcome-CL discrepancy in 
the improvement cells, which was absent in 
the constant cell where outcomes remained 
at CL. 
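
The outcome-CL discrepancy described above can be given a rough numerical illustration. The sketch below assumes, purely for illustration, that the comparison level equals the worker's mean per-period pay in the first half; Thibaut and Kelley's CL is a broader construct, so this operationalization and the name `outcome_cl_discrepancy` are ours and were not part of the reported analyses. Pay values are the worker columns of Table 1.

```python
from statistics import mean

# Worker pay in cents for Pay Periods 1-10, by improvement condition (Table 1).
worker_pay = {
    "constant":     [8, 8, 9, 7, 8, 8, 7, 8, 9, 8],
    "equality":     [4, 4, 5, 3, 4, 12, 12, 12, 12, 12],
    "compensation": [2, 2, 3, 1, 2, 14, 15, 14, 13, 14],
}


def outcome_cl_discrepancy(schedule, split=5):
    """Mean second-half pay minus the assumed CL (mean first-half pay)."""
    comparison_level = mean(schedule[:split])  # illustrative stand-in for CL
    return mean(schedule[split:]) - comparison_level


discrepancy = {
    name: outcome_cl_discrepancy(schedule)
    for name, schedule in worker_pay.items()
}
for name, value in discrepancy.items():
    print(f"{name}: {value} cents per pay period above the assumed CL")
```

With a CL anchored in the first half, the constant schedule yields no discrepancy, whereas both improvement schedules yield a positive one, which matches the direction of the interpretation offered for the mute conditions.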

Threat conditions. In the threat condi- 
tions, the pattern was similar to that de- 
scribed for the mute conditions. Subjects were 
more satisfied when they were successful in 
forcing the allocator to give them a larger 
share of the pay than when they were unsuc- 
cessful. This pattern of means is clearly 
counter to the predictions from the expres- 
sion, attribution, and derogation hypotheses. 
It seems that voice per se is not sufficient to 
produce discontent with improving outcomes. 
Subjects in the threat conditions were able 
to voice their feelings during the procedure, 
the improvements represented a retreat by 
the allocator from his or her initial “tough 
stance,” and the threat certainly could have 
been interpreted as “external pressure,” yet 
subjects were more satisfied in the improve- 
ment cells than in the constant cell. Thus, it 
may require special types of voice to produce 
the discontent effect. This possibility will be 
discussed further below. 

The question for now is, Why were sub- 
jects satisfied in the improvement conditions? 
Certainly their outcomes were higher than 
they had been in the first half. It may also 
have been that the successful exercise of power 
and the sense of efficacy it provided was it- 
self satisfying (cf. Wortman & Brehm, 1975). 
The voice main effect, which showed highest 
overall satisfaction in the threat condition, is 
consistent with this combination of hedonic 
satisfaction and perceived control interpreta- 
tion. An alternative explanation of the voice 
main effect relies on subjects’ beliefs about 
the “bonus session,” during which they 
thought they would actually have an op-
portunity to carry out their threat to deliver
a burst of noise into the allocator’s head-
phones. Perhaps the anticipated future con-
tact caused them to be more attracted to the
allocator (cf. Darley & Berscheid, 1967) and 
thus to evaluate his or her actions positively 
as well. Measures of liking for the allocator 
were included on the final questionnaire to 
tap these feelings in the second half, and over- 
all as well. In both cases, the voice main ef- 
fect was not significant (Fs < 1), making this 
explanation quite unlikely. A final point is 
that Thibaut and Kelley (1959) would pre- 
dict that the improvement, having come about 
under subjects’ control, would contribute to 
the CL, causing it to rise and eventually blunt 
the satisfaction associated with the increase. 
We will say more about this later. 
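
The ordering behind the voice main effect can be recovered from the satisfaction means in Table 2. The sketch below assumes equal cell sizes (10 per cell, as reported), so that each marginal mean is the simple average of its three cell means; the variable names are ours, and the cell means are as read from Table 2.

```python
from statistics import mean

# Satisfaction cell means from Table 2, ordered constant, equality, compensation.
satisfaction = {
    "mute":   [1.6, 6.1, 7.3],
    "threat": [2.0, 8.6, 8.6],
    "appeal": [1.3, 8.3, 3.6],
}

# With equal n per cell, the marginal mean for a voice level is the
# unweighted average of its cell means.
voice_means = {
    voice: round(mean(cell_means), 2)
    for voice, cell_means in satisfaction.items()
}
highest = max(voice_means, key=voice_means.get)

for voice, value in voice_means.items():
    print(f"{voice}: {value}")
print("highest overall satisfaction:", highest)
```

On these figures the marginal means are 5.0 (mute), 6.4 (threat), and 4.4 (appeal), consistent with the statement that overall satisfaction was highest in the threat condition.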

Appeal conditions. The pattern in the ap- 
peal conditions deviated somewhat from the 
predictions, but what did happen seems to 
be of great theoretical interest. Recall that
the predicted pattern required discontent to 
be highest in the compensation cell and least 
in the constant cell. 

Subjects in the constant cell did not show 
signs of accepting their continued low out- 
comes, as one might expect if they had per- 
ceived the constant allocations as represent- 
ing a denial of their right to more pay. Satis- 
faction in this cell was not higher than in 
the improvement conditions, as had been hy- 
pothesized, and in fact did not differ from 
the constant cells in the mute and threat 
conditions. Perhaps this difference between 
the present study and Folger’s (1977) may 
be explained by differences in the clarity of 
the inequity induction. The opportunity to 
voice objections came after five periods of in- 
equitable allocation in the present study as 
compared with only two in Folger’s study. 
Our subjects’ beliefs that they had been un- 
fairly treated and were deserving of com- 
pensation may have been less subject to 
change following the allocator’s response be- 
cause of this increased experience. In this 
regard, the present study probably represents 
more adequately the state of disadvantaged 
groups in society, whose beliefs about their 
own entitlement are not easily shaken. 

In the improvement cells, satisfaction was 
lower in the compensation than in the equal-
ity cell despite the fact that the improvement
was objectively larger in the former. It had 
been expected that bringing subjects closer 
to the goal of overall equality without reach- 
ing it would constitute a greater violation of 
expectations and produce more dissatisfaction. 
Overall equity had been assumed as the 
criterion of fairness, since satisfaction was 
high in all of Folger’s “equity” cells and 
since the messages sent by subjects in our 
voice conditions stated that the total shares 
should end up equal. However, the mean 
satisfaction in the equality cell (8.3 on the 11- 
point scale) is really too high to be called 
discontent. Some theoretical modifications ap- 
pear necessary. 

Subjects were clearly dissatisfied with the 
initial allocations (according to their responses 
to the first questionnaire, to be discussed be- 
low) and felt that they ought to be com- 
pensated (28 of 30 subjects in the appeal 
conditions actually chose the “appeal” mes- 
sage). However, they may have been some- 
what uncertain about the level of compensa- 
tion to expect until they experienced the im- 
provements during the initial pay periods of 
the second half. We are now suggesting that 
the magnitude of the improvement given by 
the allocator served to legitimize one of sev- 
eral acceptable standards. In the equality cell, 
subjects experienced equal shares (12¢-12¢) 
in each of the initial periods of the second 
half. The allocator was thus communicating 
that a “proximal” standard of “fairness from 
now on” was the appropriate fairness cri- 
terion. In fact, the remaining allocations ex- 
actly matched this standard. In terms of 
relative deprivation, subjects received what 
they and the allocator had decided was fair, 
and they felt quite satisfied. In the compensa- 
tion cells, the initial improvement gave the 
subjects more than the allocator (14¢-10¢ 
and 15¢-9¢ in Pay Periods 6 and 7, respec- 
tively). This division communicated to the 
subjects that they deserved not only equal 
treatment but also compensation for the initial
unfairness (a “distal” standard). Later allocations
failed to match these expectations,
resulting in a strong sense of discontent. In 
the words of one subject in this condition who 
wrote in the margin of the final questionnaire,
the allocator “attempted to co-opt me by giving
me a little, but not enough to make fair
allocation.” 

Further evidence on this point comes from 
the fact that the fairness measure was highly 
correlated with the satisfaction measure we 
have been discussing (within-cells correlation
= +.59). Subjects not only felt less satisfied
in the compensation than in the equality
cell but also felt the allocations to have been
less fair. This is exactly what one would expect
if there existed a greater discrepancy between
deserved and experienced outcomes in
the compensation cell. In addition, a recently
completed study by deCarufel (in press) directly
manipulated the level of the standard
and the magnitude of the improvement. The
results indicated strong discontent when the
distal standard was legitimized but “only”
the proximal standard was matched by the
improvement.


The Role of Initial Inequity 


One final matter requiring comment is the
role of the initial inequity. The precondition
of initial deprivation was assessed with the
first questionnaire, after Pay Period 5. Subjects
were clearly dissatisfied (M = 1.75)
and felt the allocations to have been unfair
(M = 1.67). It is obvious, however, that in
order to manipulate the second-half allocations
systematically, this difference had to
result in different total allocations or correspondingly
greater initial inequity in the improvement
cells. Because of our interest in
the evaluation of identical totals differentially
distributed over time, we decided to allow
initial inequity to vary with the improvement.
This is also, by the way, how Folger (1977)
chose to deal with this dilemma. Fortunately,
in the present study subjects’ responses to the
first questionnaire afford a method of assessing
whether this inevitable confounding in
any way damages the interpretations outlined
above. It might be suggested, for one thing,
that if subjects were initially more dissatisfied
in the compensation cell, then the results
in the appeal conditions could simply be
a continuation of these trends. Alternatively,
it might have been the greater initial inequity
that caused subjects in the equality and compensation
cells to use different standards to
evaluate the improvement right from the start. 
The first point to note is that there was no 
difference in satisfaction after the first half 
across levels of the improvement factor (p >
.15) despite the objective differential inequity.
There was, however, a main effect for
the improvement factor on the fairness mea- 
sure, F(2, 72) = 6.75, p < .01. The pattern 
of means (constant = 2.5; equality = 1.0; 
compensation = 1.5) indicated that subjects 
were less dissatisfied in the constant cells
than in either of the improvement cells (p
< .01 in each case), which did not differ,
t(58) = .17, ns. Thus, with the lack of a con- 
sistent satisfaction—fairness difference across 
the improvement factor and no difference be- 
tween the improvement cells on either mea- 
sure, any conjectures based on the effects of 
this differential inequity on evaluations of 
the subsequent improvement would appear 
less plausible. What these measures do show, 
however, is massive deprivation in all of the 
cells prior to the second half (recall that the 
means were under 2 on the 11-point scale). 

Finally, it might be noted that the de- 
Carufel (in press) study mentioned above 
held initial inequity constant and allowed the 
final totals to reflect the improvement. Simi- 
lar patterns of discontent with improvement 
were evident in that study as well, reinforcing 
the fact that experiencing discontent follow- 
ing outcome improvement does not require 
initial differences in degree of inequity. 


Total Satisfaction and Fairness Measures 


The main effect for the improvement factor 
was significant for both the total satisfaction 
and total fairness measures, despite the fact 
that the total outcomes in every condition 
were identical. This result clearly reflects the 
fact that the second-half allocations carried 
more weight in determining total satisfaction 
and fairness than did those of the first half. 
This recency effect may be due to the later 
allocations replacing the earlier ones (cf. 
Jones & Goethals, 1971) as indicators of the 
allocator’s intentions. Evidence for this point
comes from an auxiliary measure from the
final questionnaire that asked subjects to indicate
what they believed the allocator now
felt they deserved. The main effect for im- 
provement was significant, F(2, 72) = 9.67, 
p < .001, indicating that subjects in the im- 
provement cells perceived that the allocator 
regarded them as more deserving than did 
subjects in the constant cell. These order ef- 
fects point out the importance of taking into 
account how outcomes are distributed in time, 
as well as how the final totals come out, in 
considering the effects of outcome distribu- 
tions on satisfaction and perceived fairness. 


General Discussion and Conclusions 


The major theoretical tool guiding the 
present experiment has been that of relative 
deprivation. The alternative mechanisms based 
on the operation of voice per se (expression, 
attribution, derogation) received no support 
from the pattern in the threat conditions. 
However, the analysis of the appeal condi- 
tions based on the “illegitimate violation 
of legitimate expectations” received support, 
both from the divergence of the threat and 
appeal patterns and from the convergence of 
the satisfaction and fairness measures within 
the appeal cells. 

Most of the research conducted within the 
relative deprivation framework is concerned 
with “static” comparisons of one person’s 
outcome total with another's total outcomes. 
The present article suggests two additional 
matters that contribute to an individual’s 
satisfaction with his or her outcome level: 
“dynamic” comparisons of outcome distribu- 
tions over time and procedural circumstances 
surrounding the distributive actions of the 
allocator (cf. Thibaut & Walker, 1975). 

The effects of allocation schedule and pro- 
cedure may contribute to satisfaction beyond 
the effects of the final totals, because these 
factors may precipitate changes in the stan- 
dards by which outcomes are evaluated. It 
was mentioned earlier that perhaps satisfac- 
tion in the threat conditions would eventually 
be lower than in the mute conditions. The CL 
would rise, since the outcomes increased under 
the subject’s control, and blunt the effect of 
the improvement. This is the classic analysis
of CL derived from Thibaut and Kelley
(1959). Let us refer to this as a “market- 
type” CL, because it rises and falls with ex- 
perienced outcomes. 

There may be other standards, however, 
that are developed out of social interaction. In 
the appeal conditions, it was suggested that 
the nature of the voicing of opinions and the 
allocation schedule interacted to invoke vari- 
ous “normative” standards of fairness such 
as equality or full equity, against which the 
subsequent allocations were evaluated. Let us 
refer to this type of CL as a “contract-type” 
standard, because the fairness principle binds 
the participants to a particular standard. This 
would seem to be an instance of a more gen- 
eral process by which standards are negoti- 
ated by two (or more) parties and in which 
the outcomes of the interaction are evaluated 
in terms of each’s obligations under that con- 
tract. This type of standard has not been ex- 
plicitly recognized in the relative deprivation 
literature, which has focused almost exclusively
on temporal (past outcomes) and social
(others’ outcomes) standards. Yet the inter- 
action of type of standard and pattern of 
allocation is potentially important. For ex- 
ample, with a market-type CL, gradually im- 
proving outcomes that exceed CL would lead 
to the greatest satisfaction, whereas with a 
contract-type CL, sudden shifts, corresponding
to renegotiations, followed by stable periods
in which the allocations match the terms
of the contract would produce greatest satis- 
faction. According to this logic, the threat 
conditions and the appeal conditions could 
still be distinguished when examined over a 
longer time period. 

The present results provide another dem- 
onstration of the importance of the subjective 
evaluation of outcomes, rather than the ob- 
jective level of outcomes, in determining satis- 
faction and feelings of fairness with dynamic, 
or changing, outcome distributions. This find- 
ing is particularly striking because outcome 
improvement would seem, on an intuitive 
basis, most likely to result in high satisfac- 
tion. There are some interesting implications 
for the evaluation of improvements by dis- 
advantaged groups in society. One especially 
disturbing one is that those who use violent
means to extract the increase would feel satisfied,
and those who use nonviolent means
would not. A future research direction might
be to examine how subjects who have been in
some of the conditions of this experiment
would respond to a choice of voicing strategies,
violent and nonviolent, before an additional
series of pay periods. One might predict
that the threat subjects who felt good
about their recent outcomes would choose the
threat message again. Those in the appeal-compensation
cell, who objectively benefited
from their nonviolent approach yet remained
dissatisfied, might be prone to choose something
other than the appeal again, perhaps
the exercise of whatever form of power is
available to them. These possibilities underscore
the importance for administrators of
ameliorative social programs to be very care- 
ful about triggering rising expectations in 
nonviolent groups and reinforcing the actions 
of violent ones.


To understand how people communicate information about others and make
memory-based interpersonal judgments, it is necessary to understand how per- 
son impressions become organized in memory. Consistent with recent research 
in cognitive psychology on thematic organization of information in memory, 
Experiments 1 and 2 found that subjects were able to recall (as well as rec- 
ognize up to a week later) descriptive traits that were relevant, as opposed to 
irrelevant, to an initial impression judgment. Furthermore, they tended to use 
judgment-relevant characteristics when asked to freely describe the stimulus 
person. Subjects’ memory for the initial impression judgment itself was excel- 
lent and not measurably affected by the relevance of the stimulus information 
upon which it had been based. A third experiment investigated two possible 
ways in which thematic organization of an impression might influence a subse- 
quent memory-based judgment about a person. Consistent with other recent 
research, it was found that when making memory-based decisions, subjects relied
on memory for an earlier thematic judgment rather than on memory for
descriptive stimulus cues.


A great deal of research in impression for- 
mation has been directed toward studying 
how people integrate items of person informa- 
tion for the purpose of making a single, stim- 
ulus-based judgment. “Stimulus based” is 
used to refer to the fact that such judgments 
are based on descriptive information provided 
by the experimenter simultaneously with, or 
immediately prior to, the time the judgment 
is made. 

Two limitations to this research can be 
identified. Most day-to-day judgments that
people make about each other are “memory
based” rather than stimulus based. That is,
they are based on a sampling of thoughts from
a cognitive representation of the person that
a perceiver has built up over time (years in
most cases) and stored in memory. To understand
the way in which people make memory-based
judgments, it is necessary to understand
how they organize person information
in memory and draw on that information in
later social behavior.

A second limitation of most previous work
in impression formation is that attention has
been focused almost exclusively on the process
of making judgments. Yet, a great deal
of interpersonal communication involves transmitting
facts rather than making judgments.
People share remembrances of others’
appearance and other characteristics. Sometimes
it is necessary to recall information
about others, whereas at other times it is
necessary to decide whether someone else’s 
remembrances are accurate. (These two situa- 
tions correspond to the recall and recognition 
tasks used in memory research.) 

There are, therefore, two important social 
consequences of how person impressions are 
organized in memory. These are (a) how peo- 
ple remember previously acquired person in- 
formation when it must be transmitted or 
judged for accuracy, and (b) how memory is 
drawn upon when a judgment is made. Both 
these questions are examined in the present 
article. 


Thematic Organization of Impressions 


Verbal learning research has demonstrated 
that memory for stimulus events can be or- 
ganized around a superordinate theme. For 
example, it has been shown that sentences are 
more easily remembered when they are pre- 
ceded by a thematic title than when no title 
is provided (Bransford & Johnson, 1973; 
Dooling & Mullet, 1973). Other research investigating
how people remember written
texts indicates that a salient theme helps determine
what specific items, from a set of
items, will later be recalled (Frederiksen,
1975; Rumelhart, 1975; Schank, 1975a,
1975b; Thorndyke, 1977). Finally, it has 
been found that a salient theme not only in- 
fluences what factual information is remem- 
bered but also what inferences are likely to 
be made about a stimulus event (Picek, Sher- 
man, & Shiffrin, 1975; Potts, 1972; Sulin & 
Dooling, 1974). Generally, the inferences peo- 
ple generate and remember tend to be rele- 
vant to, and congruent with, the theme that is 
salient when the stimulus information is ini- 
tially considered. 

Little is known about the themes that serve 
to organize people’s impressions. Some work 
has shown that distinctive characteristics of a
person may serve as an organizing theme.
Asch (1946) originally suggested that certain
central traits, such as warm and cold,
serve as organizing frameworks for people’s
impressions. It has been shown that memory
for information about a person can be organized
around traits (Cantor & Mischel, 1977)
or around a person’s occupation (Anderson,
1977; Anderson & Hastie, 1974). 

Rather than seek additional classes of per- 
son characteristics that are used as organizing 
themes, the present research examines whether 
the act of making an impression judgment 
will, in and of itself, influence the nature of 
memory organization. It is often the case that 
a person’s first encounter with another is in a 
context that calls for a decision or judgment. 
For example, it may involve something as 
simple as deciding whether to continue a con- 
versation or something as consequential as of- 
fering a job to the other. In both cases the 
person is receiving and organizing informa- 
tion about the other while actively engaged in 
making a judgment. 

All three studies in this article required 
subjects to make an occupational suitability 
judgment (e.g., would this person be success- 
ful or unsuccessful as a dentist?) at the time 
they initially received information about a 
stimulus person. If such a judgment serves 
as an organizing theme for a person’s impres- 
sion, this judgment should affect memory 
(and subsequent memory-based social behav- 
ior) in the ways discussed above. Memory 
for the stimulus features of the person should 
be affected by the occupational theme. Oc- 
cupation (theme)-relevant stimulus informa- 
tion should be better remembered than theme- 
irrelevant information. Furthermore, subse- 
quent judgments and inferences about the 
stimulus person should be congruent with the 
thematic organization introduced by the judg- 
ment. 

The first two studies reported here investi- 
gated the effects of a judgment on memory 
for information used to describe a person. 
Both recall (Experiment 1) and recognition 
(Experiment 2) memory were examined. Ex- 
periment 3 investigated two possible alterna- 
tive ways in which thematic organization of 
an impression might influence a subsequent 
memory-based person judgment. 


Experiment 1 


If an initial judgment organizes a person’s
impression of another, this should be reflected
both in the information that is later remembered
about the person and in the attributes
that are used to describe the person. In Experiment 1,
subjects were asked to make occupational
suitability judgments about stimulus
persons described by traits that were both
relevant and irrelevant for making the judgment.
Afterwards, the subjects were asked to
recall the descriptive traits and list additional
traits they thought the person might possess.
These inferred traits were evaluated for relevance
to the occupational judgment. It was
expected that subjects would (a) recall more
judgment-relevant, as opposed to judgment-irrelevant,
traits and (b) describe the person
with more judgment-relevant (as opposed to
irrelevant) characteristics.

Table 1
Mean Number of Judgment-Relevant and
Judgment-Irrelevant Traits Recalled:
Experiment 1

Stimulus person        Trait relevance
replication          Relevant   Irrelevant
     1                 2.69        2.25
     2                 2.44        2.06
     3                 2.56        1.81
     4                 2.25        2.25
     M                 2.49        2.09

Note. Cell n = 16. Scores in each cell could range
from 0 to 4 traits.


Method 


Subjects. Subjects were 116 male and female 
undergraduates from Ohio State University who par- 
ticipated in partial fulfillment of an introductory 
psychology course requirement. Of these, 12 assisted
in selecting the stimulus materials, 64 participated
in the experiment proper, and 40 were used
to judge the relevance of the subject-generated descriptions.
Where appropriate, subjects were randomly
assigned to the experimental conditions.

Stimulus materials. Two pairs of occupations and
four eight-trait person descriptions (two for each
occupation pair) were used as stimulus materials.
The occupations were academician and sportsman
(Occupation Pair 1) and comedian and pilot (Occupation
Pair 2). The person descriptions accompanying
the first occupational pair contained four
traits relevant to performance as an academician
(e.g., logical, studious) and four traits relevant to
performance as a sportsman (e.g., fearless, fast moving).
The person descriptions accompanying the other
occupational pair contained four traits related to performance
as a comedian (e.g., humorous, theatrical)
and four traits related to performance as a pilot
(e.g., accurate, decisive). The traits for each person’s
description were selected by having five subjects generate
adjectives they considered to be relevant to one
occupation within a pair but not to the other (e.g.,
relevant to academician but not to sportsman). These
traits were subsequently rated by seven independent
judges, and those traits were selected that received
the highest ratings of relevance to one of the occupations
within a pair but not to the other. Four traits
relevant to academician in one of the Pair 1 person
descriptions and four traits relevant to comedian in
one of the Pair 2 descriptions were negative in evaluative
tone. All the other traits were positive in tone.

Experimental procedures and design. Subjects were
tested in small groups, and the stimulus materials
were presented in booklet format. An introductory
page explained that the research was concerned with
how people predict success in an activity on the
basis of personality trait information. Page 2 listed
eight person traits and asked the subjects to consider
the suitability of such a person for a designated
occupation. Subjects were permitted to study the
traits for 60 sec before turning the page. On three
successive pages they were then asked (without looking
back at the stimulus traits) to (a) record how
well they thought the person would perform in the
occupation mentioned on page 2 (a 21-point scale
was used on which -10 was labeled “unsuccessful”
and +10 was labeled “successful”), (b) recall as
many traits as possible that had described the person,
and (c) list additional traits that they thought
would be characteristic of the person.

The experimental design was a 4 × 2 × 2 mixed
design with eight subjects per cell. The two between-subjects
factors were (a) the four stimulus person
replications and (b) which occupation (out of the
occupation pair associated with a particular stimulus
person) was judged by a group of subjects. The
within-subjects factor was recall for the judgment-relevant
versus judgment-irrelevant traits. This design
insured that each subset of four traits was used
equally often as a relevant and an irrelevant subset
across the eight groups of subjects.
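
The counterbalancing just described can be sketched as a short enumeration. This is an illustrative sketch only: the names `REPLICATIONS`, `SUBJECTS_PER_GROUP`, and `build_groups` are hypothetical, and the assignment of replication numbers 1 and 2 to Occupation Pair 1 (and 3 and 4 to Pair 2) is an assumption, since the Method does not state which replication number went with which pair.

```python
from collections import Counter

# Assumed mapping of the four stimulus person replications to occupation
# pairs (two descriptions per pair; the numbering itself is hypothetical).
REPLICATIONS = {
    1: ("academician", "sportsman"),
    2: ("academician", "sportsman"),
    3: ("comedian", "pilot"),
    4: ("comedian", "pilot"),
}
SUBJECTS_PER_GROUP = 8  # "eight subjects per cell"


def build_groups():
    """Return the between-subjects groups (replication x judged occupation)."""
    groups = []
    for rep, pair in REPLICATIONS.items():
        for judged in pair:
            other = pair[0] if judged == pair[1] else pair[1]
            groups.append({
                "replication": rep,
                "judged_occupation": judged,
                # The four traits tied to the judged occupation are the
                # relevant subset; the other four are the irrelevant subset.
                "relevant_subset": judged,
                "irrelevant_subset": other,
            })
    return groups


groups = build_groups()
total_subjects = len(groups) * SUBJECTS_PER_GROUP

# Count how often each trait subset serves in each role.
relevant_uses = Counter((g["replication"], g["relevant_subset"]) for g in groups)
irrelevant_uses = Counter((g["replication"], g["irrelevant_subset"]) for g in groups)
```

Under these assumptions the enumeration yields eight groups and 64 subjects, and every four-trait subset appears once as the relevant subset and once as the irrelevant subset, which is the property the design is said to insure.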


Results 


Recall of stimulus traits. The mean number
of relevant and irrelevant traits recalled
for the four stimulus person replications is
displayed in Table 1. In support of the experimental
hypothesis, subjects recalled significantly
more judgment-relevant traits than
judgment-irrelevant traits, F(1, 56) = 10.57,
p < .002. Although no difference was found
for Replication 4, no significant interaction
between the trait relevance and stimulus person
replication factors was obtained (F <
1.0).
Subject-generated descriptions. The second 
experimental hypothesis was that the traits 
subjects use to describe the stimulus person 
would also be judgment relevant. To test this, 
it was necessary to determine for each sub- 
ject the number of generated characteristics 
that were relevant to the judged occupation 
and the number that were irrelevant (i.e.,
more relevant to the other occupation in the 
pair). To do this, the traits that each subject 
generated were shown individually to two new 
groups of 10 subject judges. Each group of 
judges was asked to decide for each trait 
whether it indicated success or failure for one 
of the two occupations associated with the 
person description. 
The criterion for determining the relevance 
of a generated trait to one of the two corre- 
sponding occupations was the absolute amount 
of consensus among the 10 judges as to 
whether it indicated success or failure for a 
particular occupation. For example, if 4 
judges rated “outgoing” as indicating success 
as an academician and 6 judges rated it as 
indicating failure in this profession, while 7 
out of 10 judges from the second group rated 
the trait as associated with success as a sports- 
man, the trait was scored as being more rele- 
vant to sportsman than to academician. This 
is because there was greater consensus among 
the judges as to the trait’s implications for 
sportsman (7/10) as compared to academician 
(6/10). Traits for which the degree of consensus
was a tie were not included in the data
analyses.
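
The consensus criterion described above can be expressed as a small scoring rule. The following is a minimal sketch, assuming each judge gives a binary success/failure rating; the function names `consensus` and `more_relevant_occupation` are hypothetical and do not come from the original procedure.

```python
def consensus(success_votes, n_judges=10):
    """Absolute consensus: the size of the majority, in either direction."""
    return max(success_votes, n_judges - success_votes)


def more_relevant_occupation(votes_a, votes_b, n_judges=10):
    """Compare consensus for occupations 'a' and 'b' within a pair.

    Returns 'a' or 'b' for the occupation with greater consensus,
    or None when the two are tied (tied traits were excluded).
    """
    c_a = consensus(votes_a, n_judges)
    c_b = consensus(votes_b, n_judges)
    if c_a == c_b:
        return None
    return "a" if c_a > c_b else "b"


# Worked example from the text: "outgoing" received 4 success ratings for
# academician (6 of 10 judges agreed on failure) and 7 success ratings for
# sportsman (7 of 10 agreed on success).
result = more_relevant_occupation(votes_a=4, votes_b=7)  # 'b' = sportsman
```

Note that the rule compares the strength of agreement, not its direction: a trait that judges agree signals failure in an occupation counts as just as relevant to that occupation as one they agree signals success.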
The analysis of subjects’ inferred characteristics
paralleled the 4 × 2 × 2 mixed design
used for analyzing the recall data. The within-subjects
factor in this case consisted of the
two categories (judgment relevant versus
judgment irrelevant) to which the inferred
traits had been assigned.

The mean numbers of relevant and irrele- 
vant inferred traits for the four stimulus per- 
son replications are displayed in Table 2. The 
means are generally consistent with the experimental
hypothesis. The statistical analysis
produced a main effect of borderline significance
for occupation relevance, F(1, 56)
= 3.29, p < .08, and no interaction between
relevance and person description replications
(F < 1.0).

Table 2
Mean Number of Judgment-Relevant and
Judgment-Irrelevant Inferred Traits:
Experiment 1

Stimulus person        Trait relevance
replication          Relevant   Irrelevant
     1                 2.19        1.50
     2                 2.19        1.25
     3                 1.81        1.88
     4                 2.25        1.44
     M                 2.11        1.52

Note. Cell n = 16. Scores in each cell could range
from zero to as many additional traits describing
the person as the subject wished to list.

Because of the borderline significance of the 
relevance effect, an additional analysis was 
conducted on the percentage of judgment- 
relevant items out of each subject’s total num- 
ber of generated implicational associates. This 
analysis revealed a significant effect for relevance
in the predicted direction, F(1, 54) =
5.83, p < .02.
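
The percentage measure used in this additional analysis can be sketched as follows. This is an illustration under assumptions: the function name `percent_relevant` is hypothetical, the example counts are invented for demonstration and are not data from the study, and returning `None` for a subject with no scorable traits is an assumed convention, not a procedure stated in the article.

```python
def percent_relevant(n_relevant, n_irrelevant):
    """Percentage of a subject's scored generated traits that were judgment relevant.

    Returns None when the subject has no scored traits, because the
    percentage is undefined in that case (assumed handling).
    """
    total = n_relevant + n_irrelevant
    if total == 0:
        return None
    return 100.0 * n_relevant / total


# Hypothetical subject: 3 judgment-relevant and 1 judgment-irrelevant trait.
example = percent_relevant(3, 1)  # 75.0
```

Expressing relevance as a proportion removes differences in how many traits each subject chose to list, which a raw count does not.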


Discussion


Subjects in Experiment 1 remembered judgment-relevant
traits better than judgment-irrelevant
traits and used more judgment-relevant
characteristics to describe the stimulus
person. This finding is consistent with the
notion that an initial judgment thematically
organizes an impression. However, before such
a conclusion can be drawn, additional evidence 
is needed. In the present experiment subjects 
were asked to remember the stimulus traits 
immediately following their judgment. Con- 
sequently, their recall may have represented 
short-term, as opposed to long-term, mem- 
ory organization. Before concluding that a 
judgment organizes an impression, it would 
be necessary to demonstrate the persistence of 
the effect over an extended period. 

A second type of evidence needed to establish
the organizational properties of a judgment
on an impression would be to demonstrate
the relevance effect in a memory recognition
test as opposed to a recall test. This
would have both practical and theoretical im- 
plications. Practically, during social interac- 
tions people engage in both types of memory 
tasks. Theoretically, the two kinds of mem- 
ory tasks depend to varying degrees on differ- 
ent cognitive processes. Several theorists (An- 
derson & Bower, 1972; Kintsch, 1968) view 
recall as a two-step process in which informa- 
tion items are first searched for in memory 
and then compared with the item being sought. 
A recall test involves both the retrieval and 
comparison processes. In a recognition test, 
the items are supplied, making the compari- 
son processes of primary importance. Testing 
the organizational properties of a judgment 
using a recognition task would have implica- 
tions for whether the locus of the relevance 
effect lay in the retrieval or comparison pro- 
cesses. 

It has been shown that an organizing theme 
is generally better remembered than the in- 
dividual items of information upon which it 
is based (Bartlett, 1932; Frederiksen, 1975; 
Kintsch, Kozminsky, Streby, McKoon, & 
Keenan, 1975; Mandler & Johnson, 1977). 
Consequently, a further type of evidence that 
would help establish the organizational prop- 
erties of a judgment would be evidence that 
the judgment (i.e., the organizing theme) is 
well remembered and unaffected by the rele- 
vance of the information upon which it is 
based or by the passage of time.

To further investigate how a judgment or- 
ganizes an impression, Experiment 2 examined 
(a) whether the relevance effect would be 
present for recognition memory, (b) the persistence
of the relevance effect over time, and
(c) subjects’ memory for an early judgment
made about a person.


Experiment 2 


Subjects in Experiment 2 were asked to 
make an occupational suitability judgment 
about a stimulus person and then to (a) re- 
call the judgment they had made and (b) 
identify in a recognition test the traits that
had been used to describe the person. The independent
variables in the 2 × 2 × 2 between-subjects
design were (a) whether the memory
tests were administered 1 day or 1 week after
the occupational judgment, (b) whether the
stimulus traits were relevant or irrelevant to
the occupational judgment, and (c) a stimulus
person replication factor. Several hypotheses
were proposed concerning the effects of
these factors on subjects’ memory for their
judgment and the stimulus traits.

As in Experiment 1, it was predicted that
subjects would be more accurate when the
traits were relevant, as opposed to irrelevant,
to an earlier occupational judgment they had
made. This was expected to be true both 1
day and 1 week following the judgment.

Several investigators (Bartlett, 1932; Sulin
& Dooling, 1974) have found evidence that
subjects use a salient theme to help “reconstruct”
related events or stimuli. Such reliance
on the theme would likely increase
with the passage of time as ancillary cues
from the experimental setting become less
available to assist subjects in remembering
the stimulus traits. The assumption that subjects
increasingly use the theme to help reconstruct
the traits in a memory test led to
several additional predictions. The first of
these was that subjects’ intrusion errors (i.e.,
traits identified as stimuli that were not part
of the original person description) would be
judgment relevant. It was further hypothesized
that after a week, as compared to a day,
(a) subjects would make more mistakes
recognizing the traits, and (b) the relevance
effect for both recognition errors and intrusion
errors would be greater.

This reasoning also implies that memory
for the theme should be superior to memory
for individual items of person information.
Unfortunately, there is no way of directly
comparing memory for an occupational judg-
ment with memory for descriptive person
traits. First, each may be of different famil-
iarity to the subjects. Second, in the course
of the experiment, subjects are exposed to
several traits but to only one occupation.
Nevertheless, the question of subjects’ ability
to remember their occupational judgment is
interesting for two reasons. First, there has
been little previous research that directly
examines memory for an initial impression
judgment. Second, as explained, subjects’ abil-
ity to remember a judgment is pertinent to
the question of whether a judgment themati-
cally organizes an impression. For these rea-
sons, subjects were asked to recall their oc-
cupational judgment when they returned for
the second experimental session. It was hy-
pothesized that (a) the relevance of the stim-
ulus information would not affect how well
the judgment could be remembered and (b)
memory for the judgment would be good and
unaffected by whether the memory test was
administered 1 day or 1 week later.


Method 


Subjects. Subjects were 130 male and female 
undergraduates from Ohio State University who par- 
ticipated in partial fulfillment of an introductory 
psychology course requirement. Each subject was 
randomly assigned to one of the eight experimental 
conditions.

Stimulus materials. The stimulus materials used
in Experiment 2 consisted of two occupations—pilot
and comedian—and two groups of 22 traits. These
traits were selected from the traits generated by sub-
jects in Experiment 1 to describe the stimulus per-
son evaluated as a comedian or a pilot. The 22 traits
in one group consisted of traits that the judges
agreed implied success as a comedian, but about
which there was low agreement as to whether they
implied success as a pilot (e.g., talkative and artistic).
The second group of traits was selected to be rele-
vant for judging success as a pilot but low in rele-
vance for judging success as a comedian (e.g., ac-
curate and punctual). From each group, 11 traits
were randomly selected to serve as a stimulus per-
son description, and the remaining 11 traits were
used as foils in the trait recognition test.
Experimental procedure and design. The subjects 
were tested in small groups of two to four people, 
and the stimulus materials were presented in booklet 
format. An introductory page explained that the re- 
search concerned people’s ability to predict success 
in a specific occupation on the basis of personality 
trait information. Page 2 asked subjects to consider 
the suitability of a stimulus person for one of the 
two occupations. This person was described by 11 
traits and a small photograph of a neatly dressed
male. The subjects were permitted to review the 
traits and picture for 60 sec before being told to 
turn the page. On the next page subjects were asked 
to record their judgment of the stimulus person's 
suitability for the occupation on a 21-point scale 
anchored by —10 (very unsuccessful) and +10 (very 
successful). After recording their rating, subjects were 
scheduled for a second session either the next day 
or 1 week later. It was implied that the tasks under- 
taken in the second session would be unrelated to 
those completed in the first. 
During the second session the subjects received a
booklet headed by a picture of the stimulus person. 
They were first asked to recall the occupation on 
which they had previously evaluated the person. 
Next, on an evaluation scale identical to the one 
used in the first session, they were asked to recall 
their exact evaluation. Finally, on a separate page, 
the subjects were instructed to identify the 11 traits 
that had described the person. These traits were em- 
bedded among 22 additional traits, 11 of which were 
the remaining traits relevant to the occupational 
judgment the subject had made and 11 of which 
were relevant to the occupation the subject had not 
seen. Subjects were told to identify exactly 11 traits 
from the list of 33.

To summarize the design, half of the subjects were 
scheduled to return for the second session the next 
day, and half were scheduled for 1 week later. Half 
of the subjects judged the stimulus person on an oc-
cupation for which the stimulus traits were relevant,
whereas half judged an occupation for which the 
traits were irrelevant. Finally, half of the subjects 
judged a stimulus person described by 11 comedian- 
relevant traits, and half judged a person described by 
11 pilot-relevant traits. 
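
The 2 × 2 × 2 design summarized above can be written out as a short enumeration. The following Python sketch is illustrative only: the function names, the dictionary keys, and the use of simple random assignment with a fixed seed are assumptions made for the example, not a description of how assignment was actually carried out.

```python
import itertools
import random

DELAYS = ("1 day", "1 week")
RELEVANCE = ("relevant", "irrelevant")
TRAIT_SETS = ("comedian", "pilot")  # occupation implied by the 11 descriptive traits
OTHER = {"comedian": "pilot", "pilot": "comedian"}


def build_conditions():
    """Enumerate the eight cells of the delay x relevance x trait-set design."""
    cells = []
    for delay, relevance, trait_set in itertools.product(DELAYS, RELEVANCE, TRAIT_SETS):
        # In a relevant cell the judged occupation matches the traits;
        # in an irrelevant cell the subject judges the other occupation.
        judged = trait_set if relevance == "relevant" else OTHER[trait_set]
        cells.append({"delay": delay, "relevance": relevance,
                      "trait_set": trait_set, "judged_occupation": judged})
    return cells


def assign_subjects(n_subjects, seed=0):
    """Assumed procedure: draw one cell at random for each subject."""
    rng = random.Random(seed)
    cells = build_conditions()
    return [rng.choice(cells) for _ in range(n_subjects)]
```

Note that independent random draws of this kind do not guarantee equal cell sizes; a blocked randomization would be needed for exact halves.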


Results 


Seven out of the 130 subjects did not re- 
turn for the second session. Consequently, the 
reported analyses are based on data from 123 
subjects. The stimulus person replication fac-
tor (i.e., whether the stimulus person pos-
sessed comedian- or pilot-relevant traits) did
not interact with either the judgment rele- 
vance or time factor for any of the dependent 
measures and is, therefore, not discussed fur- 
ther. 

Recognition accuracy for the stimulus traits. 
Consistent with the results of Experiment 1, 
subjects made more recognition errors when 
the traits were irrelevant, as opposed to rele- 
vant, to their occupational judgment (Ms = 
2.87 vs. 2.11, respectively), F(1, 115) = 9.68,
p < .002. Furthermore, subjects made more
errors in identifying the traits after 1 week as 
opposed to 1 day (Ms = 2.92 vs. 2.00, re- 
spectively), F(1, 115) = 13.8, p < .001. How- 
ever, the prediction that the effect of trait 
relevance would be greater after 1 week than 
1 day was not supported, F(1, 115) < 1.0 
for the interaction. 

Nature of subjects’ intrusion errors. The 
22 incorrect foils in the recognition test seen 
by each subject consisted of 11 traits that 
were congruent with the traits that had been 
used to describe the stimulus person (i.e., rele-
vant to the same occupation) and 11 traits
that were incongruent with the description 
(i.e., relevant to the other occupation). When 
subjects made a judgment that was relevant 
to the descriptive traits, there was no reason 
for them to have intrusion errors from the 
11 traits that were incongruent with the de- 
scription, since these traits were incongruent 
with the judgment they had made as well. 
However, in the irrelevant conditions the 11 
traits that were incongruent with the descrip- 
tion were congruent with the judged occupa- 
tion. Thus, to the degree that the judgment 
thematically organized subjects’ impressions, 
there should have been an increase in the 
proportion of intrusion errors from the 11 
description-incongruent traits. Furthermore, 
this tendency was expected to increase over 
time if subjects increasingly came to depend 
on a judgment-based reconstructive process in 
recognizing the traits. 

An examination of subjects’ recognition 
protocols indicated that significantly fewer 
than 50% of the intrusion errors were from 
the description-incongruent trait lists (over-
all M = 16.2%), F(1, 105) = 172.5, p <
.001.¹ Even in the irrelevant condition, only
20.8% of the intrusion errors were from the 
set of traits that were incongruent with the 
person description. The predicted tendency 
for subjects to have an increased percentage 
of description-incongruent intrusion errors 
when the occupational judgment they made 
was irrelevant (as opposed to relevant) to the 
person-description traits was marginally sig- 
nificant (Ms = 20.8% vs. 11.0%, respec- 
tively), F(1, 105) = 3.64, p < .06. However, 
this tendency did not increase with time (the 
Fs for both the time main effect and its in- 
teraction with the relevance effect were less 
than one). 
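
The intrusion-error measure described above can be illustrated with a small Python sketch. The function name and the trait labels in the usage note are hypothetical placeholders; only the logic (intrusions are chosen traits that were not in the description, and the measure is the share of those drawn from the description-incongruent foils) follows the text. Because subjects chose exactly 11 of the 33 traits, a subject with no recognition errors has no intrusions, so the percentage is undefined for that subject.

```python
def incongruent_intrusion_pct(chosen, described, incongruent_foils):
    """Percentage of a subject's intrusion errors that come from the
    description-incongruent foils; None when there are no intrusions."""
    intrusions = set(chosen) - set(described)
    if not intrusions:
        return None  # undefined: the subject made no recognition errors
    n_incongruent = len(intrusions & set(incongruent_foils))
    return 100.0 * n_incongruent / len(intrusions)
```

For example, a subject who selects nine of the described traits plus one congruent and one incongruent foil has two intrusions, half of them description-incongruent.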

Recall of the occupational judgment. It 
was hypothesized that subjects’ memory for 
the organizing theme (i.e., the occupational 
judgment) would be good and unaffected by 
the passage of time or the relevance of the 
traits on which it had been based. To test 
this, an index was constructed that reflected 
each subject’s ability to recall both the judged 
occupation and the position she or he marked
on the evaluative scale. This index was formed
by giving a subject a score of 2 for remember-
ing both the occupation and the exact scale
position, 1 for remembering only the scale
position or the occupation, and 0 for remem-
bering neither.
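
The scoring rule for the recall index amounts to counting how many of the two components were remembered. A minimal Python sketch follows; the function and parameter names are assumptions introduced for illustration.

```python
def recall_index(occupation_correct: bool, scale_position_correct: bool) -> int:
    """2 = both remembered, 1 = exactly one remembered, 0 = neither."""
    return int(occupation_correct) + int(scale_position_correct)
```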

In support of the hypothesis that neither 
trait relevance nor the passage of time would 
affect subjects’ ability to recall their judg-
ment, there was no significant difference in
the recall index between subject groups who
returned 1 week, as opposed to 1 day, later
(Ms = 1.72 vs. 1.79, respectively), F(1, 115)
< 1.0, nor a significant difference between the
relevant and irrelevant trait groups (Ms =
1.79 vs. 1.67, respectively), F(1, 115) = 1.91,
p < .20. The interaction between these factors
was also nonsignificant, F(1, 115) < 1.0.²

It would be expected that memory for the 
judgment would be better than memory for 
the traits. Although, as has been noted, the 
two cannot be appropriately compared, it is
worthwhile to comment on subjects’ excellent
ability to remember their judgment. Across
all conditions, 96% of the subjects remem- 
bered the occupation they had judged, and 
77% were able to reproduce their exact rating 
on the 21-point scale; 97% were able to re- 
produce their initial rating within one scale 
position. 


Discussion 


The results of Experiment 2 provide four 
types of evidence supportive of the notion 
that an initial judgment thematically orga-
nizes a person’s impression of another. First,
consistent with the results of Experiment 1,


1 It was necessary to omit 10 subjects from this
analysis, since they made no recognition errors.

2 In addition to the combined index, subjects’ mem-
ory for their evaluation and judged occupation was
analyzed individually. Subjects’ ability to remember
their evaluation was unaffected by either the rele-
vance manipulation or the time factor. In addition,
there was no effect for relevance on subjects’ mem-
ory for the occupation they had judged. The only
significant finding was a difference in subjects’ mem-
ory for the occupation after a week as compared to a
day. Five subjects out of 123 were unable to remem-
ber the occupation they had judged, and all
were in the week-delay condition ($ 4o. J 
subjects were better able to recognize a set of
descriptive person traits when they were rele- 
vant, as opposed to irrelevant, to a judgment 
they had made. Second, this relevance effect 
persisted over time and was present both 1 
day and 1 week after subjects had made their 
judgment. Third, there was a tendency for 
an initial judgment to increase the percentage 
of intrusion errors from a set of traits that 
had not been part of the person description, 
but which were relevant to the judgment. 
Fourth, subjects’ memory for the judgment 
they had made was excellent and not sub- 
stantially affected by either the passage of 
time or variations in the relevance of the stim-
ulus traits.

This fourth conclusion must be viewed with 
caution. Subjects’ memory for their judgment
was so good that a ceiling effect may have 
occurred that masked the effect of time and 
trait relevance on memory for the judgment. 
Nevertheless, the fact that subjects’ memory 
for their early decision was exceptionally ac- 
curate up to a week later is consistent with 
the idea that the decision thematically or- 
ganized their impression and that it remains 
prominently available in memory as part of 
their impression of the stimulus person. A 
consequence of this prominence is examined 
in Experiment 3. 

The predictions stemming from the assump- 
tion that over time subjects increasingly rely 
on their judgment to reconstruct the stimu- 
lus traits were not supported. While there 
was a tendency for subjects’ judgments to in-
fluence intrusion error rates, time did not
strengthen this effect. Further-
more, time did not increase the tendency for
judgment-relevant traits to be recognized bet- 
ter than judgment-irrelevant traits. 

There are at least two possible explana- 
tions why an interaction between the rele- 
vance and time factors did not emerge for 
either type of recognition error. First, subjects 
may not have relied on a judgment-based re-
constructive process to identify the traits. If
this were the case, it suggests that the rele-
vance effect is primarily the result of differ-
ential encoding of the stimulus traits rather
than reconstructive retrieval processes.

A second possible explanation for the fail-
ure to find an interaction between relevance
and time is a lack of sensitivity of the experi- 
mental method. This could have occurred for 
several reasons. First, as noted, recognition 
tests are more sensitive to comparison pro- 
cesses than to retrieval processes. The hypoth- 
esized effect may have been visible if a re- 
call, rather than recognition, measure had 
been employed. Second, the person descrip- 
tions tended to provide subjects with an al- 
ternative theme to help organize and recall 
the stimuli. For example, when subjects in 
the irrelevant condition made a judgment 
about a pilot, all 11 descriptive traits were 
relevant to the profession of comedian. Thus, 
it was easy for them to spontaneously con- 
clude that the person would make a good 
comic and use this decision to organize their 
impression. Such an occurrence could account 
for the high rate of description-congruent in- 
trusion errors even when subjects made an 
irrelevant occupational judgment. In support 
of this interpretation, four of the five sub- 
jects who erred in recalling the occupation 
remembered having judged an occupation that 
was congruent with the descriptive traits they 
had seen (e.g., TV entertainer, showman). 

Experiments 1 and 2 used recall and recog- 
nition memory to provide evidence about the 
organizational properties of a judgment on a 
person impression. These findings have di- 
rect implications for people’s ability to ac- 
curately convey person information to others. 
First impressions are indeed lasting, at least 
in their effects on the information people later 
have available for communication. 

There is a second issue of equal concern:
If a judgment organizes an impression, it 
should be reflected in subsequent memory- 
based judgments that rest on the impression. 
This was tested in Experiment 3. 


Experiment 3 


Figure 1. Possible patterns of attribute ratings as a function of judgment relevance and description
valence depending on whether the ratings are trait based (Panel a) or judgment based
(Panel b).


Experiment 3 investigated how organiza-
tion of an impression by a judgment affects
subsequent judgments. The rationale and al-
ternative experimental hypotheses can best
be understood in light of the experimental
task. This task consisted of describing a stim-
ulus person with all positive, all negative, or
all neutral traits and asking subjects to make
memory-based ratings of the person’s intel- 
ligence and friendliness. Before making these 
ratings, however (and while the traits were 
still available), subjects were asked to make 
an occupational judgment about the person. 
The occupational judgment was relevant to 
one of the attributes but not the other (e.g., 
research physicist to intelligence, but not to 
friendliness, and waiter to friendliness, but 
not to intelligence). It was expected that sub- 
jects’ ratings of each stimulus person’s in- 
telligence and friendliness would be influenced 
by the relevance of the occupation they 
judged to each attribute. 

The manner in which an occupational judg- 
ment might affect a memory-based attribute 
rating would depend on the way in which 
subjects draw on their memory for the per- 
son. For example, one way would be for them 
to recall traits that had been used to describe 
a person. Experiments 1 and 2 suggest that 
subjects would best remember traits relevant 
to their initial occupational judgment. If such 
memory bias exists, it should have the effect 
of polarizing subjects’ attribute ratings when 
the occupation is relevant, as opposed to ir- 
relevant, to the attribute being judged. 

To understand this prediction, consider the 
specific case in which a subject is presented
with a negatively described stimulus person.
The subject is likely to conclude that the per-
son would not be successful, regardless of
whether the occupation being judged was re-
search physicist or waiter. However, the oc-
cupation judged should affect later memory
for the traits. In the case of research physicist,
the subject would be likely to remember more
intelligence-relevant traits, since intelligence
is an important attribute for deciding if
someone would be a good physicist, as com-
pared to deciding if that person would be a
good waiter. The more intelligence-relevant
negative traits a subject can remember, the
less intelligent the stimulus person is likely
to seem (see Hamilton & Fallot, 1974, for a
demonstration of this phenomenon in stimu-
lus-based judgments). Just the opposite would
be expected when a subject considers a stimu-
lus person described by positive traits. The
increased number of intelligence-relevant de-
scriptive traits remembered following a rele-
vant (as compared to an irrelevant) occupa-
tional judgment would be positive, thus mak-
ing the person seem more intelligent. The
interaction between person description val-
ence and judgment relevance that would be
expected if subjects based their attribute
ratings on memory for the stimulus traits is
graphically depicted in Figure 1(a).

It may be that subjects do not draw on
memory for descriptive stimulus traits when
making memory-based judgments. Several
studies have shown that people rely on their 
own cognitive responses to stimuli when mak- 
ing judgments (Greenwald, 1968; Lingle & 
Ostrom, 1979, in press; Petty, Ostrom, & 
Brock, in press). Lingle and Ostrom (1979), 
for example, had subjects make pairs of oc- 
cupational judgments. They concluded that 
subjects based their second judgment on 
memory for their first judgment, since sec- 
ond-decision time was not affected by the 
number of traits that described a stimulus 
person, but was affected by the similarity of 
the first occupation to the second. Certainly, 
in the present Experiment 2, subjects’ excep- 
tional accuracy in remembering their occupa- 
tional judgments suggests that such judg- 
ments are readily available in memory to be 
used as the basis for subsequent decisions. 

If subjects were to base attribute ratings 
on memory for an organizational theme (i.e.,
the occupational judgment), rather than on 
memory for stimulus traits, the ratings should 
reflect the degree to which a particular at- 
tribute is stereotypical of the theme (occupa- 
tion). For example, a subject would seem 
more likely to judge a person as intelligent if 
she or he had first judged the person to be 
a good physicist (as compared to a good 
waiter), since physicists are stereotypically 
more intelligent than waiters. One would also 
be likely to view a negatively described per- 
son as more intelligent following an intelli- 
gence-relevant occupational judgment, since 
a bad physicist is still likely to be stereotypi-
cally viewed as more intelligent than a bad
waiter. Thus, if subjects were to base their 
attribute ratings on memory for the theme 
induced by an earlier occupational judgment, 
the expected pattern of results would be those 
depicted in Figure 1(b).

A 3 × 2 × 2 within-subjects design was
used to test if, and in what way, an initial oc-
cupational judgment would influence a sub-
sequent attribute judgment. The three factors
varied were (a) person description valence
(positive, negative, neutral), (b) relevance of
the occupation to the attribute (relevant or
irrelevant), and (c) an attribute-rating repli-
cation factor (friendliness or intelligence).


Method


Subjects. Subjects were 82 male and female in- 
troductory psychology students who participated in 
partial fulfillment of a course requirement. Of these, 
48 took part in the experiment proper, and 34 as-
sisted in preparation of the stimulus materials. Sub-
jects were randomly assigned to different stimulus
material presentation orders used to counterbalance 
the experimental design. 

Stimulus materials. Two classes of six occupa- 
tions were used in the experiment. For one class, 
friendliness (but not intelligence) was the more rele-
vant attribute for success (i.e., cab driver, airline
steward, waiter, baggage porter, telephone solicitor,
and shoe salesman); for the other class, intelligence
(but not friendliness) was the more relevant attribute
(i.e., geologist, research physicist, organic chemist,
statistician, medical researcher, and aeronautical 
technician). These occupations were selected as fol- 
lows: Two of the authors first chose 30 occupations 
that they believed to be of the intelligence-relevant 
class and 29 occupations they believed to be of the 
friendliness-relevant class from Hopke’s (1975) list 
of occupations. Next, 18 introductory psychology 
students rated each of these occupations on 8-point 
scales according to how important they thought 
knowing about friendliness and intelligence would 
be for judging a person’s aptitude for that occupa- 
tion. The six intelligence-relevant occupations and 
the six friendliness-relevant occupations with the 
largest mean differences between friendliness and 
intelligence ratings were chosen for use in the ex- 
periment. 
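
The selection rule for the occupations (keep, within each candidate class, the six occupations with the largest mean difference between the two importance ratings) can be sketched in Python as follows. The function names, the data layout, and the ratings in the usage note are hypothetical; the source reports only the rule, not the raw ratings.

```python
from statistics import mean


def mean_difference(ratings):
    """Mean friendliness importance minus mean intelligence importance.
    `ratings` is a list of (friendliness, intelligence) pairs, one per rater."""
    return mean(f for f, _ in ratings) - mean(i for _, i in ratings)


def pick_top(candidates, k, favoured):
    """Return the k occupations whose mean difference most favours `favoured`
    ('friendliness' or 'intelligence'), largest difference first."""
    sign = 1 if favoured == "friendliness" else -1
    scores = {occ: sign * mean_difference(r) for occ, r in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

With made-up ratings such as `{"waiter": [(8, 2), (7, 3)], "cab driver": [(6, 3), (7, 2)], "clerk": [(5, 5), (4, 4)]}`, calling `pick_top(candidates, 2, "friendliness")` keeps waiter and cab driver and drops the occupation on which the two ratings do not differ.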

Three types of stimulus persons (positive, nega- 
tive, and neutral) were constructed to be used in 
the experiment. This was done by first selecting 32 
traits from three different ranges in Edwards's (Note
1) rescaling of Anderson’s (1968) trait list: 250-349
(moderately negative), 350-449 (neutral), and 450-
549 (moderately positive). The traits were randomly
selected from within each range with the constraint
that traits strongly implying anything about either
friendliness (e.g., warm) or intelligence (e.g., stupid)
were eliminated. Sixteen stimulus persons were then
constructed by going through each set of 32 traits
twice and randomly selecting (without replacement)
eight groups of 4 traits each. To make the stimulus
person descriptions of approximately equal relevance
for judgments about both the intelligence and friend-
liness occupations, the 48 generated stimulus persons
(3 × 16) were rated on 7-point friendliness and in-
telligence scales by a new group of 16 subjects. From
each of the three groups of 16 scaled stimulus per-
son descriptions, the 4 descriptions having the small-
est absolute difference between their intelligence and
friendliness mean ratings were selected to be used
in the experiment.
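
The two-step construction just described (random grouping of 32 traits into descriptions, then selection of the most balanced descriptions) can be sketched in Python. This is an illustration under stated assumptions: the function names, the seed, and the placeholder trait labels are hypothetical, and shuffling a pool and cutting it into consecutive groups is one way of sampling groups without replacement, not necessarily the procedure used.

```python
import random


def build_stimulus_persons(traits, passes=2, group_size=4, seed=0):
    """Go through the trait set `passes` times; on each pass, partition it at
    random into groups of `group_size` traits (one group = one stimulus person)."""
    rng = random.Random(seed)
    persons = []
    for _ in range(passes):
        pool = list(traits)
        rng.shuffle(pool)  # random order, then consecutive cuts = no replacement
        persons.extend(tuple(pool[i:i + group_size])
                       for i in range(0, len(pool), group_size))
    return persons


def most_balanced(mean_ratings, k=4):
    """`mean_ratings` maps a description id to (mean friendliness, mean
    intelligence); return the k ids with the smallest absolute difference."""
    return sorted(mean_ratings,
                  key=lambda d: abs(mean_ratings[d][0] - mean_ratings[d][1]))[:k]
```

With 32 traits, two passes, and groups of four, the first function yields the 16 descriptions per valence level mentioned in the text.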

Experimental procedure and design. Subjects were 
tested in groups of 10 to 20, and stimulus materials 
were presented in booklet format. The first two pages 
of the booklet explained that the experiment was 
concerned with how people form person impressions
from a limited number of descriptive traits and de- 
scribed the experimental task. This task consisted of 
having subjects make judgments of 12 different stim- 
ulus persons. Subjects first saw a list of four traits 
describing a person and were asked (on the same 
page) to rate the person on a 7-point scale according 
to how well they thought the person would perform 
in a particular occupation. After making this judg- 
ment, subjects were asked to turn the page and rate 
the person (without looking back at the stimulus 
traits) on 10 different attributes, again using 7-point 
rating scales. Two of the 10 traits were friendliness 
and intelligence. The eight attributes rated in addi- 
tion to friendliness and intelligence were eight traits 
from Anderson’s (1968) trait adjective list, which 
three of the investigators judged were unrelated to 
either friendliness or intelligence. These additional 
attributes were included to conceal the fact that the 
attributes of intelligence and friendliness were the 
only ratings of interest. The eight filler ratings were 
not analyzed. 

Each subject judged all 12 stimulus persons, 1 on 
each of the 12 occupations. The 6 occupations in 
each class were paired with 2 stimulus persons de- 
scribed by negative traits, 2 described by neutral 
traits, and 2 described by positive traits. Twelve dif- 
ferent rating-scale pages were constructed to be 
counterbalanced with the stimulus person and oc- 
cupations. These pages differed only in terms of the 
order in which the attribute scales appeared. Each 
was constructed by first randomly assigning the 
friendliness and intelligence scales to two of the first 
five positions. The eight additional scales were then 
randomly assigned to the remaining positions on a 
page. Thus, for each of the 12 rating-scale pages 
that a subject saw during the experiment, the friend- 
liness and intelligence ratings always occurred early 
in the list of ratings, but the order in which the 
rating scales appeared varied. 

Counterbalancing of the 12 stimulus persons, 12 
occupations, and 12 rating-scale pages was achieved 
by a Greco-Latin square. Twelve subjects were re-
quired in order for each possible stimulus combina-
tion to appear once. Four replications were under-
taken by going through the Greco-Latin square four
times, bringing the total sample size to 48. Two
random orders were used for presenting the 12 stim-
ulus sets.
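
For readers unfamiliar with Greco-Latin counterbalancing, the following Python sketch builds one 12 × 12 Greco-Latin square. It is not the square used in the experiment, which the text does not report. The construction (combining an order-3 pair of orthogonal Latin squares with an order-4 pair built over the four-element field) and the mapping of rows to subjects, columns to stimulus persons, Latin symbols to occupations, and Greek symbols to rating-scale pages are assumptions made for illustration, as are all function names.

```python
# Multiplication by the element 2 in the four-element field GF(4),
# where addition is bitwise XOR on the labels 0..3.
GF4_TIMES_2 = (0, 2, 3, 1)


def cell(row, col):
    """Return (latin, greek), each in 0..11, for one cell of the square."""
    r3, r4 = divmod(row, 4)  # row = 4 * r3 + r4, with r3 in 0..2 and r4 in 0..3
    c3, c4 = divmod(col, 4)
    latin = 4 * ((r3 + c3) % 3) + (r4 ^ c4)
    greek = 4 * ((2 * r3 + c3) % 3) + (GF4_TIMES_2[r4] ^ c4)
    return latin, greek


def greco_latin_12():
    """12 x 12 grid of (latin, greek) pairs; every pair occurs exactly once."""
    return [[cell(r, c) for c in range(12)] for r in range(12)]


def valence_of(col):
    """Assumed coding: columns 0-3, 4-7, 8-11 are the three valence levels."""
    return col // 4


def occupation_class(latin):
    """Assumed coding: six occupations per class, split on the low two bits."""
    return "friendliness" if latin % 4 < 2 else "intelligence"
```

Under these assumed codings, each row (subject) pairs every occupation class with two stimulus persons of each valence, which matches the pairing constraint stated in the procedure; other squares and codings could satisfy it equally well.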

When a subject initially judged a stimulus person 
on a friendliness occupation, the subsequent friendli- 
ness rating was considered relevant and the intelli- 
gence rating irrelevant. It was just the reverse when 
an intelligence occupation was used. Thus, this was
a 3 (stimulus person type) × 2 (relevant or irrelevant
occupational judgment) × 2 (friendliness or intelli-
gence rating scale) design.


Results 


Occupational suitability ratings. To test
whether subjects’ attribute ratings were trait
based or judgment based, it was first necessary
to show that the three levels of stimulus in-
formation led to distinctly positive, neutral,
and negative occupational judgments on the
7-point rating scales. A repeated measures
analysis of variance revealed a strong overall
effect for type of stimulus person on sub-
jects’ occupation suitability ratings, F(2, 29)
= 223, p < .001. The positive, neutral, and
negative stimulus persons received ratings of
5.56, 4.08, and 1.88, respectively. There were
no significant differences in the suitability
ratings for the two occupation types, F(1, 47)
= 1.57, nor for the Stimulus Person × Occu-
pation interaction, F(2, 94) = 1.23.

Intelligence and friendliness ratings. If
subjects’ attribute ratings were based solely
on memory for the initial stimulus informa-
tion, occupational relevance should have po-
larized the ratings. If they were based on
memory for the organizing theme, the rele-
vance manipulation should have produced
more positive trait ratings. It can be seen in
Figure 2 that a strong relevance main effect
was obtained, F(1, 47) = 72.6, p < .001.
Even when the thematic judgment was nega-
tive (e.g., the person would make a poor re-
search physicist), attribute ratings were more
positive following a relevant, as opposed to
an irrelevant, thematic judgment. The inter-
action between relevance and stimulus per-
son valence, as predicted by the memory-for-
traits hypothesis, was not significant, F(2, 29)
= 1.67.

The data analyses did yield two interac-
tions. The first of these was a three-way in-
teraction between relevance, description val-
ence, and attribute rating, F(2, 94) = 3…,
p < .04. Figure 2 shows that the two attribute
replications differed primarily in how rele-
vance affected ratings of the negative stimulus
person. The second interaction was a two-
way interaction between relevance and the
attribute ratings, F(1, 47) = 10.6, p < .002,
resulting from the relevance effect being
stronger for the intelligence attribute than
for the friendliness attribute.


General Discussion 


THEMATIC ORGANIZATION OF IMPRESSIONS 685

Figure 2. Mean friendliness and intelligence ratings as a function of occupation relevance and per-
son description valence; Experiment 3. (Panels: Friendliness and Intelligence attribute replications; horizontal axis: irrelevant vs. relevant occupation; separate lines for positive, neutral, and negative traits.)


Psychologists have long been interested in
the ways impressions are organized. Perhaps
this is because so much social interaction in-
volves transmitting, receiving, evaluating, and
acting on information about others. It has 
generally been assumed that each of these ac- 
tions depends not only on specific facts that 
someone knows about another, but on how 
such facts are organized into a unified im-
pression. In spite of such interest, there has
been little unanimity as to how impression 
organization should be conceptualized and 
studied. Asch (1946), for example, concep- 
tualized an impression as a unified gestalt or- 
ganized by central traits that provide mean- 
ing to, and relationships among, peripheral 
characteristics. More recent theorists (e.g.,
Cantor & Mischel, 1977; Markus, 1977) have 
viewed impression organization as the forma- 
tion of prototypes or cognitive schemata that 
develop out of stimulus cues or experiences 
when these cues converge on a common con-
cept.

The approach exposited in the present ar-
ticle conceptualizes impressions as being or-
ganized around a central theme that orders
the availability in memory of impression-re-
lated cognitions. While stimulus configura-
tions may be capable of imparting such or-
ganization (as when they are highly interas-
sociated), the present studies investigated the
organizational consequences of a judgment 
made when information about a person is 
initially received. The importance of investi-
gating the thematic organizational properties
of a judgment, as opposed to some other or-
ganizing factor, arises from the fact that peo-
ple often receive information about others in
a judgment context. From national elections 
to job interviews to psychology experiments, 
people perceive, organize, and encode infor- 
mation about others within a judgment con- 
text. 

In support of the hypothesis that an early 
judgment thematically organizes an impres- 
sion, Experiments 1 and 2 found that (a) the 
judgment itself was extremely well remem- 
bered up to a week after it had been made, 
(b) judgment-relevant information was bet- 
ter remembered than judgment-irrelevant in- 
formation (again, up to a week later), and 
(c) the characteristics subjects used to freely 
describe the person tended to be judgment 
relevant. Each of these properties is consistent 
with the properties that research in cognitive 
psychology has established as characteristic 
of an organizing theme in prose or discourse. 

While Experiments 1 and 2 investigated 
the general availability of person information 
within the context of a recognition or recall 
test, Experiment 3 examined what subjects 
access from memory in a judgment situation. 
Although integration theorists (e.g., Ander- 
son, 1974) have developed algebraic models 
capable of describing how people combine in- 
formation cues in a stimulus-based decision, 
there has been little research investigating 
what thoughts people access when they make 
memory-based judgments. Experiment 3 in- 
dicated that although subjects may be cap- 
able of recalling subsets of descriptive stimuli 
(as indicated in Experiments 1 and 2), in 
certain judgment situations they draw more 
on their memory for a previous thematic 
judgment and its associated characteristics. 
One important, but unanswered, question con- 
cerns identifying the conditions under which 
subjects will make the effort to recall previ- 
ously learned person characteristics and base 
a decision on memory for these stimulus fea- 
tures (rather than on memory for the theme). 

Another question that has not been an- 
swered by the present research is which spe- 
cific cognitive processes are responsible for 
the organizing properties of an early judg- 
ment on an impression. The fact that the rele- 
vance effect was obtained for recognition as 
well as recall memory suggests that encod- 
ing processes may have been partially re- 
sponsible. This is also indicated by the find- 
ing that the relevance effect did not increase 
with time. If encoding processes do play a 
role, several mechanisms may be involved. 
For example, it may be that while consider- 
ing the person descriptions, subjects generate 
and store in memory characteristics that are 
jointly associated with both the descriptive 
traits and the judgment they have to make. 
Several studies have demonstrated that peo- 
ple’s internally generated thoughts are stored 
in memory and well remembered (Dosher & 
Russo, 1976; Greenwald, 1968). These in- 
ferred characteristics may later be recalled 
and serve both as a basis for subjects’ de- 
scriptions and as a cue for remembering the 
stimulus traits. 

An alternative explanation of the relevance 
effect could be that the judgment-relevant 
stimulus traits are processed more deeply 
(Craik & Lockhart, 1972; Craik & Tulving, 
1975) and therefore are remembered better 
than judgment-irrelevant traits. Memory for 
these traits might then have served as the 
basis for subjects’ judgment-relevant free de- 
scriptions. These, as well as other possible 
mechanisms, need to be explored by further 
research in order to establish the conditions 
under which an early (and perhaps a later) 
judgment will and will not serve as an or- 
ganizing theme for an impression. 

As a final note, a question of long-standing 
interest in the social psychological literature 
is whether attitude-consistent information is 
better remembered than attitude-inconsistent 
information. While some studies have found 
better memory for attitude-consistent informa- 
tion (e.g., Levine & Murphy, 1943), most sub- 
sequent investigations (e.g., Greenwald & Sa- 
kumura, 1967) have not found this to be true. 
If person impressions are similar to attitudes 
about other objects, the present research sug- 
gests that the best remembered information 


the reaching of a new attitude judgment, in- 
formation should be best remembered when it 
is relevant to attitude change, not associated 
with attitude constancy. 


Reference Note 


1. Edwards, J. D. Revised likableness ratings of 55 
personality-trait adjectives. Unpublished manu- 
script, Ohio State University, 1967. 

Pain experience is conceptualized as a combination of stimulus sensations (e.g., 
aching) and emotional distress. In Experiment 1, less distress was reported to 
cold pressor stimulation by subjects first told about stimulus sensations than by 
subjects who were uninformed or were told about symptoms of bodily arousal 
(e.g., tension). Adding a pain warning to sensation information blocked distress 
reduction, presumably by eliciting an emotional interpretation of the stimulus. 
In Experiment 2, subjects attending only to hand sensations reported less dis- 
tress than subjects attending to their bodies. This decrease in the power of the 
stimulus to provoke emotion is presumably mediated by a schema of hand 
sensations formed by attention. In Experiment 3, subjects attending to hand 
sensations early in the immersion and distracting themselves later reported the 
same low levels of distress as did subjects who attended to hand sensations 
throughout. Subjects distracted throughout and subjects attending to hand sen- 
sations later showed no distress reduction. Therefore, stimulus schematization 
must precede distress reduction. Implications for distress control are discussed. 


A number of controlled laboratory (John- 
son, 1973; Staub & Kellet, 1972) and field 
(Cohen & Lazarus, 1973; Egbert, Battit, 
Welch, & Bartlett, 1964; Johnson, Kirchoff, & 
Endress, 1975; Johnson & Leventhal, 1974; 
Johnson, Morrissey, & Leventhal, 1973; Sime, 
1976; Wolfer & Davis, 1970) experiments 
have demonstrated the value of preparatory 
information for the reduction of distress dur- 
ing exposure to noxious stimulation. Despite 
these positive findings, our picture as to how 
preparation reduces distress is not entirely 
clear. We do not know which components of 
the preparation package affect particular in- 
dicators of distress, and we lack a detailed 
view of the process underlying distress reduc- 
tion. Given these limits, it is not surprising 
that a variety of specific, often apparently 
competing, hypotheses are offered to account 
for the distress-reducing effects of prepara- 
tion. For example, preparation is said to re- 
duce distress if it stimulates moderate fear 
levels and preparatory behaviors (Janis, 
1967); if it provides specific, accurate ex- 
pectations about stimulus properties (Epstein, 
1973; Janis, 1958; Johnson, 1973, 1975; 
Johnson & Leventhal, 1974); if it provides a 
neutral label to account for diffuse bodily re- 
actions that would otherwise be labeled as 
fear (Nisbett & Schachter, 1966; Ross, Rodin, 
& Zimbardo, 1969); and/or if it enhances 
the individual’s sense of control over the 
stressor (Geer, Davison, & Gatchel, 1970; 
Glass & Singer, 1972; Weiss, 1971). 
None of the specific hypotheses appears 
adequate to account for the empirical findings. 

EFFECTS OF PREPARATORY INFORMATION ON DISTRESS 

Thus, moderate levels of fear prior to 
stressor exposure are reported to inoculate 
against distress during exposure in some stud- 
ies (Janis, 1958) but not in others (Cohen & 
Lazarus, 1973; Johnson, Leventhal, & Dabbs, 
1971; Sime, 1976; Johnson, Rice, Fuller & 
Endress, Note 1). Perceived control is found 
to reduce distress from noxious stimulation in 
some studies (Geer et al., 1970; Staub, Tur- 
sky, & Schwartz, 1971, Experiment 2) but 
not in others (Brady, Porter, Conrad, & 
Mason, 1958; Pervin, 1963; Staub et al., 
1971, Experiment 1). When control reduces 
distress, it does not always do so during ac- 
tual exposure to the stressor (Glass & Singer, 
1972). While no single study can possibly un- 
tangle the complete set of factors underlying 
these apparent inconsistencies, the three ex- 
periments reported here are designed to clar- 
ify at least one aspect of the problem, the role 
of stimulus interpretation or stimulus coding 
(e.g., Holmes & Houston, 1974; Lazarus, 
1966; Leventhal, 1970; Sokolov, 1963) in 
distress reduction. This focus should not be 
seen as a rejection of the importance of prior 
fear level or of coping as important contribu- 
tors to distress reduction; all of these factors 
are important elements in a complete picture 
of distress reduction. It is our assumption, 
however, that they can be better understood 
if we clarify the way the coding or interpreta- 
tion of the stressor influences distress during 
impact. 

The general thesis underlying the present 
experiments is that distress during stressor im- 
pact consists of both informational (sensory; 
stimulus attributes) and emotional (subjective 
distress or fear) components, which are usu- 
ally integrated into a common experience 
(Beecher, 1959; Leventhal, 1970, in press; 
Leventhal & Everhart, in press). Three ques- 
tions are raised respecting the effect of cod- 
ing or interpretation on the distress experi- 
ence: (a) Can interpretation actually alter 
the total experience of stressor impact? (b) 
What aspects of the experience are actually 
altered (e.g., the sensory information or the 
emotional distress)? (c) How does the al- 
teration take place? The three experiments re- 
ported here show that distress on impact can 
indeed be reduced by preparation. Second, 
they suggest that emotional distress (or suf- 
fering) is the component of the experience 
most likely to be altered by preparation. 
Third, they suggest that this change happens 
because identifying or focusing on the sensory 
properties of the stimulus in a nonthreatening 
setting leads to the formation of a template 
or schema of the stimulus that causes it to 
lose the power to provoke emotional reactions 
(Sokolov, 1963). 

All three investigations study distress in- 
duced by cold pressor stimulation. The cold 
pressor test seems ideally suited for the study 
of coding or interpretation of noxious stimula- 
tion because it provides a rich array of sen- 
sory features (Mountcastle, 1968; Wolf & 
Hardy, 1943); has a relatively slow onset of 
distress induction, providing time for the 
impact of mental operations (Wolf & Hardy, 
1943); can be affected by psychological ma- 
nipulations such as set (Barber & Hahn, 
1962) and hypnotic analgesia (Hilgard, 1969, 
1971); and is not so threatening an experi- 
ence as to prevent manipulating threat level 
while holding physical stimulation constant 
(e.g., Teichner, 1965). 


Experiment 1 


Experiment 1 compared the level of dis- 
tress reported during a cold pressor test of 
groups of subjects given three different types 
of preparatory information: (a) information 
on the distinctive features or sensory prop- 
erties of the noxious stimulus; (b) informa- 
tion on the individual’s likely overt and co- 
vert emotional or arousal reactions to the 
stimulus; and (c) information on the po- 
tential strength, painfulness, or magnitude of 
the noxious stimulus. Both sensation informa- 
tion and arousal information have been used 
to reduce distress and fear behaviors on im- 
pact with noxious stimulation, and both are 
presumed to do so because they affect ex- 
pectations or interpretations of the stimulus. 
Magnitude or threat information, on the 
other hand, is expected to limit the operation 
of both sensation and arousal information. 
Experiment 1 compared six conditions: sensa- 
tion information, arousal information, and 
control, each presented with or without high- 
magnitude pain warnings. The specific rea- 
soning and hypotheses for each of the condi- 
tions are presented in the following sections. 


Sensation Information 


Sensation information prepares the indi- 
vidual with a detailed preview of the tactile, 
thermal, and visual changes that he or she 
will experience during stressor impact. John- 
son (1973) gave subjects sensation informa- 
tion about ischemic pain (“to feel pressure 
and sensations such as tingling and aching, 
followed by numbness, . . . arm and hand will 
be... pale and blotched”) and in two experi- 
ments found substantially lower levels of re- 
ported distress for these subjects in compari- 
son to control subjects given information on 
experimental procedures. There were no treat- 
ment differences, however, for reported 
strength of stimulus sensations; preparation 
affected only the emotional component of 
experience. 

Beneficial effects of sensation information 
were also suggested, though less clearly so, 
in a study by Neufeld and Davidson (1971) 
and in a study by Staub and Kellet (1972), 
although in the latter study the investigators 
did not expect to find the reduction in dis- 
tress reported for electric shocks by sensa- 
tion-informed subjects, and it occurred only 
when the sensation-informed subjects were 
reassured that the shock was harmless. Fi- 
nally, reductions have been observed in be- 
havioral signs of distress (e.g., gagging and 
heart rates) for patients given sensation in- 
formation the night before an endoscopic ex- 
amination—a diagnostic test of the upper 
gastrointestinal tract in which the patient 
must stay awake and vary his or her position 
after swallowing and admitting a fiber optic 
tube (Johnson & Leventhal, 1974; Johnson 
et al., 1973). Similar effects in emotional 
distress reduction have been found during 


cast removal with children (Johnson et al., 
1975). 


Two hypotheses have been offered to ac- 
count for the reduction of distress by sensa- 
tion information. The first argues that the in- 
formation provides the subject with accurate 
expectations against which to check stressor 
impact (Johnson, 1973, 1975). When expecta- 
tions are accurate, uncertainty, surprise, and 
startle are minimized, and the individual can 
believe that the situation is under control. 
The second hypothesis takes the more in- 
clusive view that sensation information alters 
the way the noxious stimulus is processed 
(i.e., it changes the schemata or codes that 
are integrated with the noxious input). If the 
subject processes the noxious input in terms 
of its informational properties and forms a 
schema of its sensory features, the input will 
lose the power to stimulate attentional and 
emotional responses, and there will be a 
gradual decline or habituation of the emo- 
tional (distress) component of the experi- 
ence. This analysis suggests that the accuracy 
of the sensation information per se is not the 
critical factor in producing distress reduction. 
The critical factor is the schematization or 
coding of the input as a set of objective fea- 
tures, such as coldness, numbness, and pins 
and needles, rather than the schematization 
of the input as pain, fear, and uncertainty as 
to whether the skin will break and bleed, the 
flesh will be damaged, and so on (Engel, 
1959; Leventhal, 1970, 1975, in press; Leven- 
thal & Everhart, in press). 

If the accuracy hypothesis is a sufficient 
explanation for the beneficial effects of sensa- 
tion information, one might expect that any 
accurate information, whether it describes the 
sensory properties of the noxious stimulus or 
its painful magnitude, will reduce surprise 
and distress. Thus, warnings about the pain- 
fulness of the stimulus should be distress re- 
ducing, as is information about its sensory 
features. The processing interpretation, how- 
ever, suggests that warning information is 
precisely the kind of preparatory input that 
might encourage encoding a stimulus as a 
threatening event (i.e., integrating the stim- 
ulus with emotional and pain memories that 
stimulate expectations and uncertainty about 
possibly harmful outcomes). This type of 
processing can occur regardless of the ac- 
curacy of the warning. The processing analy- 
sis is supported by Epstein’s (1973) penetrat- 
ing review of studies of the effects of 
warnings on subjects’ reactions to the impact 
of noxious events. He points out that distress 
and emotionality are not reduced by high- 
magnitude information (warnings that a stim- 
ulus will be painful), even if this information 
is accurate. Epstein found that subjects who 
expected the 10th of a series of tones to be 
very severe were more anxious than subjects 
who expected the tone to be of moderate or 
mild volume, and he found no evidence that 
the accurate information about stimulus mag- 
nitude reduced distress on impact (as mea- 
sured by heart rate and skin conductance; 
Epstein & Clarke, 1970). To the contrary, 
the evidence suggested that magnitude in- 
formation that established “an accurate ex- 
pectation can amplify reactivity to a suffi- 
ciently threatening stimulus, at least initially” 
(Epstein, 1973, p. 103). The Staub and Kel- 
let (1972) findings are consistent with this 
interpretation. 

The accuracy and processing notions lead 
to quite different expectations, therefore, re- 
specting the effects of preparation on distress 
to cold pressor pain. The accuracy hypothesis 
suggests that accurate information on sensa- 
tions and magnitude of a stressor should re- 
duce distress on impact because both types 
of information limit uncertainty and define 
anticipated harm. The processing hypothesis 
suggests that magnitude and sensation infor- 
mation will have different effects; sensation 
information will permit the schematization 
of the noxious stimulus and distress reduc- 
tion, whereas magnitude information will 
establish a set for integrating the stimulus 
with emotional memory schemata that will en- 
hance distress reactions. Thus magnitude in- 
formation will cancel the effect of sensation 
information. 

If we apply the reasoning described above 
to cold pressor stimulation, in which stimu- 
lus sensations and distress gradually become 
more intense, we can anticipate the follow- 
ing: If the accuracy hypothesis is true, (a) 
the group receiving sensation information and 
the high-magnitude pain warning will show 
the lowest levels of distress, the unprepared 
control group will show the highest levels of 
distress, and the sensation-only and pain- 
warning-only conditions will fall in between; 
and (b) the effect on emotional distress 
should appear early, or as soon as stimula- 
tion confirms the preparation. On the other 
hand, if distress reduction involves the more 
complex process of coding or schematizing 
the objective features of the stimulus—a 
process that can be disrupted by a pain warn- 
ing—one would expect (a) lower distress rat- 
ings with sensation information and no pain 
warning, with the remaining conditions more 
or less the same, and (b) condition differ- 
ences toward the latter part of the exposure 
period (i.e., after schematization takes place). 
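
The two sets of predictions above can be laid side by side in a short sketch. This is only an illustration, not part of the original method: the condition labels, the dictionary names, and the helper function are hypothetical, and the numeric ranks are an assumption that encodes nothing more than the ordinal statements in the text (1 = lowest expected distress, with tied ranks where the text says conditions fall "in between" or are "more or less the same"). Only the four conditions named in the predictions are included.

```python
# Ordinal predictions for reported distress (1 = lowest). The ranks are an
# illustrative encoding of the verbal predictions, not data from the study.
ACCURACY_PREDICTION = {
    ("sensation", "pain_warning"): 1,     # most accurate preparation
    ("sensation", "no_pain_warning"): 2,  # "in between" (tie is assumed)
    ("control", "pain_warning"): 2,       # "in between" (tie is assumed)
    ("control", "no_pain_warning"): 3,    # unprepared control, highest distress
}

PROCESSING_PREDICTION = {
    ("sensation", "no_pain_warning"): 1,  # schematization undisturbed
    ("sensation", "pain_warning"): 2,     # warning disrupts schematization
    ("control", "pain_warning"): 2,       # "more or less the same"
    ("control", "no_pain_warning"): 2,    # "more or less the same"
}


def lowest_distress_cells(prediction):
    """Return the condition(s) a hypothesis expects to show the least distress."""
    best = min(prediction.values())
    return sorted(cell for cell, rank in prediction.items() if rank == best)


if __name__ == "__main__":
    print("accuracy:  ", lowest_distress_cells(ACCURACY_PREDICTION))
    print("processing:", lowest_distress_cells(PROCESSING_PREDICTION))
```

Laid out this way, the critical contrast is the sensation-plus-warning cell: the accuracy hypothesis expects it to show the least distress, whereas the processing hypothesis expects it to look like the control conditions.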


Arousal Information 


Arousal information involves descriptions 
of emotional behaviors—the objective and 
subjective signs of arousal such as heart beat- 
ing, hand sweating, tenseness, and so forth— 
which allow individuals to anticipate their 
overt and covert reactions to a noxious stimu- 
lus. Stimulated by Schachter’s (Schachter, 
1964; Schachter & Singer, 1962) revival of 
cognition-arousal theories of emotion, investi- 
gators have developed a variety of experi- 
mental paradigms in which they give subjects 
arousal information prior to exposure to a 
noxious stimulus. This information is usually 
presented along with misattribution informa- 
tion (i.e., information incorrectly linking the 
arousal signs to a neutral source). For ex- 
ample, Nisbett and Schachter (1966) gave 
subjects a placebo pill before exposing them 
to a series of increasingly painful electric 
shocks. The subjects were told that the pill 
would cause changes (misattribution) in 
their emotional reactions (arousal informa- 
tion), for example, heart rate, hand sweating, 
trembling, and so forth. Because the subjects 
were led to assume that their arousal behav- 
iors were caused by the neutral pill rather 
than by the painful shock, they were expected 
to, and did, tolerate higher levels of shock. 

Although the misattribution studies are in- 
terpreted as demonstrating that the neutral 
cognitive label redefines the subjects’ emo- 
tional state (they feel less frightened when 
they associate their arousal with a neutral 
pill than with a threatening shock), it is un- 
clear whether the effect of the manipulation 
is due to the misattribution to the neutral 
source or to informing the subjects about 
their arousal behaviors (Leventhal, 1970, p. 
172; 1974). Indeed, the major difference be- 
tween misattribution and control subjects is 
the receipt of information on arousal behav- 
iors by the former and the receipt of informa- 
tion on a series of irrelevant signs (sugges- 
tions that they will feel numbness, itching, 
etc.) by the latter. Consistent with this, Cal- 
vert-Boyanowsky and Leventhal (1975) found 
that arousal information per se was important 
in reducing subjects’ fear-motivated avoid- 
ance behavior (i.e., arousal-informed subjects 
worked longer at puzzles to win money than 
at puzzles to avoid shock, whether their 
arousal was attributed to a “neutral” noise or 
to a threatening shock). They concluded, 
therefore, that arousal information and not 
the source of attribution (neutral noise vs. 
threatening shock) is the factor responsible 
for reduction in fear-motivated avoidance 
behavior. 

The effects of arousal information are simi- 
lar in some respects to those for sensation 
information, since arousal information appears 
to reduce avoidance behavior and is less ef- 
fective in doing so when accompanied by 
warnings of high magnitudes of danger (e.g., 
Nisbett & Schachter, 1966). But there is little 
evidence to suggest that arousal information 
actually reduces distress on stressor impact. 
For example, while Calvert-Boyanowsky and 
Leventhal (1975) found that arousal informa- 
tion reduced the subjects’ avoidance behavior, 
it did not change their subjective emotional 
state. Nisbett and Schachter (1966) also seem 
to have found much more substantial differ- 
ences for avoidance behavior (shock toler- 
ance) than for reports of distress, holding 
levels of shock constant. These findings raise 
the question, therefore, as to whether arousal 
information will actually reduce distress dur- 
ing cold pressor impact, and will do so only 
in the absence of high-magnitude pain warn- 
ings, or whether the experience of distress 
during cold pressor stimulation will be as 


arousal information 
out awareness of stimulus sensations. What 
seemed likely was that arousal information 
would prevent the subjects from becoming 
frightened by their own fear reactions, re- 
ducing, therefore, signs of distress-motivated 
avoidance. 


Method 


Design and Subjects 


The experimental design crossed magnitude in- 
formation (pain message vs. no pain message) with 
three information conditions: (a) sensory informa- 
tion, (b) arousal information, and (c) a control 
group given information on the procedure. Judg- 
ments or trials were a within-subjects factor. The 
magnitude information was included in a single sen- 
tence that informed the subjects that they would 
experience strong pain while their hands were in the 
water. The reference to pain was omitted in the 
low-magnitude conditions. The sensory information 
message described the physical sensations the sub- 
jects would feel during and after the immersion of 
their hands in the water. The arousal information 
described specific arousal reactions that would occur 
during this same time period, and the control mes- 
sage described general details of the experimental 
procedure. 
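
As a minimal sketch of the crossed design described in this paragraph, the six between-subjects cells can be enumerated as follows. The string labels and the function name are illustrative assumptions; the source specifies only the two factors (magnitude information and type of information) and their levels.

```python
from itertools import product

# Factor levels as named in the design description; labels are illustrative.
MAGNITUDE = ("pain_warning", "no_pain_warning")
INFORMATION = ("sensory", "arousal", "procedural_control")


def design_cells():
    """Return the between-subjects cells of the 3 x 2 crossed design."""
    return [(info, magnitude) for info, magnitude in product(INFORMATION, MAGNITUDE)]


cells = design_cells()

if __name__ == "__main__":
    for info, magnitude in cells:
        print(f"{info:20s} x {magnitude}")
```

Judgments (trials) are a within-subjects factor, so every subject in each of these six cells contributes repeated ratings.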
A total of 50 males were included in the analysis 
of the study (see Figure 1). While we know of no 
important sex differences in response to cold pressor 
or to the experimental treatments, the quit rate is 
slightly higher for female subjects, and to ma 
experimental time we decided to use male subjects 
only in all three studies. These subjects were re- 
cruited from the introductory psychology Pripad 
had completed and met the criteria for ho d 
set by a questionnaire. The questionnaire was used 
to disqualify subjects who reported medical prob- 
lems such as hypertension or other circulatory prob- 
lems, rheumatic fever, heart disease or heart abnor- 
malities, and asthmatic attacks or being on medica- 
tion for asthma. Reports of any form of er, he 
prior frostbite, taking psychoactive drugs within the 
past 12 hours, or special medical problems under 
a physician’s care (e.g., stomach ulcers) were also 
grounds for disqualification. One subject was dis- 
qualified because of a broken arm. 


Experimental Setting 

The subject was seated in a comfortable chair in a 
room containing little visible equipment. There was 
a low platform that held the ice water tank, three 
large metal filing cabinets, and a one-way mirror that 
was almost completely covered by a series of 8½ × 
11-inch color prints. A portable cart held the ma- 
terials for attaching heart rate electrodes and the 
questionnaires for the initial portion of the experi- 
ment. 
All equipment (a Grass polygraph used to record 
heart rate and skin temperature, a Panasonic video- 
tape recorder used to record subjects’ facial expres- 
sions, and a Sony tape recorder used to present in- 
structions) was in an adjacent room. The two rooms 
were connected by an intercom, and the experimenter 
could view the subject through the spaces between 
the prints that covered the one-way mirror. 


Procedure 


Subjects were randomly assigned to the six ex- 
perimental conditions, with the restriction that a 
complete replication of the design (a subject in each 
of the six cells) was run before beginning a second 
set of six. There were four experimenters, each aided 
by an assistant experimenter. The assistant placed 
the appropriate instruction folder on the subject's 
table, randomized and assigned the subjects to condi- 
tions, and gave the experimenter a code letter that 
indicated which instruction tape to play to the sub- 
ject. The further precaution was taken of assigning 
new code letters to the experimental tapes halfway 
through the study. 
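
The restricted randomization described above (one subject in each of the six cells before a second set of six is begun) corresponds to what is now usually called blocked randomization. The following is a hedged sketch of that scheme only; the original assignment was carried out by an assistant experimenter using coded instruction tapes, not by software, and the function name, condition labels, and `seed` parameter are hypothetical.

```python
import random
from itertools import product

# The six cells of the design; labels are illustrative, not from the source.
CONDITIONS = list(product(("sensory", "arousal", "procedural_control"),
                          ("pain_warning", "no_pain_warning")))


def blocked_assignment(n_subjects, conditions, seed=None):
    """Assign subjects to conditions in shuffled blocks.

    Each consecutive block of len(conditions) subjects contains every
    condition exactly once, in a random order.
    """
    rng = random.Random(seed)  # seed is optional and only aids reproducibility
    schedule = []
    while len(schedule) < n_subjects:
        block = list(conditions)
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_subjects]


if __name__ == "__main__":
    for subject, cell in enumerate(blocked_assignment(12, CONDITIONS, seed=1), 1):
        print(subject, cell)
```

Running complete replications in sequence keeps cell sizes nearly equal at any stopping point and spreads each condition evenly across the period of data collection.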

The subject was conducted to the experimental 
room, where he donned earphones and listened to a 
brief introduction that stated that the study was 
one of a series of investigations on “psychological 
and physiological reactions to cold temperature” and 
advised the student that he could “quit the experi- 
ment now or at any time . . . without forfeiting 
(his) experimental credit.” 


Informational Conditions 


The instructions for each condition were as fol- 
lows: 


(Sensory information). The cold temperature 
treatment that we mentioned will involve sub- 
merging your hand (to the wrist) in an ice bath. 
When you put your hand in the water, the first 
sensation will be one of extreme coldness. The feel- 
ing of coldness will last for a short period of time 
(around 20 or 30 seconds), and then you will be- 
gin to feel a number of different sensations. The 
first of these is a sensation of pain, which will 
begin to get very strong around this time. (This 
last sentence was eliminated in the low-magnitude 
message.) Along with this, you will begin to get 
a feeling of strong pressure on your hand. You 
may notice that the feeling of discomfort is not 
spread evenly around your hand but rather is 
concentrated in certain areas. For instance, you 
may begin to feel a tingling sensation in your 
fingers which seems to bite or burn. Your whole 
hand may throb after some additional time, and 
the joints of your fingers will begin to feel some- 
what stiff. 


After a while, the strong sensations will begin to 
fade. At this time you will feel a pinpricking sensa- 
tion or a feeling that your skin is being pulled 
tightly across the back of your hand. This sensa- 
tion will fade in your fingers and lower hand until 
you can feel only numbness. The prickly feeling 
will remain only in a ring at the point where your 
hand enters the water. 


After your hand is removed from the water solu- 
tion, you will be aware of a new set of sensations. 
First, there is a prickly feeling which is similar to 
a limb waking up. It starts in your fingers and 
moves slowly up the sides of your palm. You will 
also detect nervous twitching in the muscular pads 
of your palm and in your fingers. 


(Arousal information). The cold temperature 
treatment that we mentioned will involve sub- 
merging your hand (to the wrist) in an ice bath. 
This experience produces many feelings and reac- 
tions in addition to the sensations you will feel in 
your hand. 


When you first put your hand in the water, you 
will feel a sense of apprehension or anticipation. 
Almost immediately your whole body will begin to 
react to the temperature change. One of the first 
feelings you will notice is the sensation of pain, 
which will begin to get very strong about this time. 
(This sentence was omitted in the low-magnitude 
message.) You'll also be aware of additional feel- 
ings and reactions. Some of these will be similar to 
those feelings you have when you are experiencing 
an emotion such as fear or excitement. You may 
even be experiencing some of these reactions right 
now, such as butterflies in the stomach. You may 
notice that your other hand has begun to sweat. 
Along with this, you will feel yourself becoming 
more alert or awake. Generally, you will feel your 
whole body is exerting a great deal of effort. Your 
facial muscles, in particular, will show an increase 
in tension. You might feel your forehead raise and 
wrinkle. Tension will sometimes spread to other 
parts of your body—your arms, shoulders, and 
chest. After a few minutes, this muscle tension may 
cause a feeling of weakness in various joints and 
muscles in the legs and chest. After a while your 
emotional reactions and your feelings of tension 
will begin to fade. 

Once you have removed your hand from the 
water, you will probably feel a lot of pressure 
being removed from your body muscles—this is 
relief. This relief may be accompanied by a surge 
of warmth and by the feeling of tension rapidly 
leaving your chest and arms. 


(Procedural information control). The cold tem- 
perature treatment that we mentioned will involve 
an ice bath. This bath consists of a small bucket 
of ice-cold water. A cylindrical wire screen is used 
to keep the ice out of the center area of the bucket. 
Considering your normal body temperature of 98°, 
ice temperature represents a large environmental 
change. For this reason, after your hand has been 
in the water a short period of time, the sensation 
of pain will begin to get very strong. (The last 
sentence was omitted in the low-magnitude mes- 
sage.) 


The experimenter is presently preparing to take 
the physiological measures we need. He has al- 
ready attached some measuring electrodes. These 
electrodes are part of a recording apparatus and 
are connected to a penlike oscillograph located in 
the next room. Before the next part of the pro- 
cedure, the experimenter will take another physio- 
logical measure—blood pressure. The entire time 
your hand is in the water, the experimenter will be 
next door operating all of the recording apparatus, 
since it is necessary for him to adjust the equip- 
ment occasionally as your physiological reactions 
reach a peak and begin to fade, and he must make 
small notations on the settings which are being 
used for recording this rise and fall in your physio- 
logical reactions. 


After you remove your hand from the water, the 
electrodes will remain attached for a short period 
of time. Finally, the experimenter will have some 
questions to ask you. 


Each message terminated with the following in- 
structions: “Your hand will be in the water for a 
6-minute period unless you decide to remove it be- 
fore that time. Now let’s go over that again so 
that you will remember it clearly.” All subjects were 
given a written copy of the appropriate message for 
their group. 

Instructions on safety. At the close of the in- 
structions specific to his condition, the subject was 
given the following information, which stressed that 
the experiment was harmless: 


Some people who have participated in this ex- 
periment have expressed concern about injury to 
their hand. We want to assure you that under 
these conditions even ice-cold water cannot and 
will not cause any damage during the period of 
time you will be exposed to it. However, as a 
safety factor, we are monitoring skin temperature 
and will make certain that your hand does not 
reach freezing temperatures. There is absolutely 
no danger of any damage to your hand. 


Measures 


Mood ratings. Subjects were next instructed to 
fill out a Mood Adjective Checklist (MACL), because 
“your emotional feelings or moods at the moment 
may be related to some of the measures of physio- 
logical activity which we are taking. For this reason 
we would like you to fill out the mood and feelings 
rating scale located under the manila folder. We’d 
also like you to fill out the attached questions on 
your expectations at this point.” When the MACL 
was completed, the experimenter again entered the 
room to take the subject’s blood pressure. Two TV 
cameras and some additional TV equipment were on 
one of the cabinet shelves. One of the cameras was 
live and transmitting to the next room, so that the 
experimenter could observe the subject’s behavior 
and score expressive reactions. None of the expres- 
sive ratings are reported, since very little expressive 
behavior was observed. 


Sensation and distress ratings. After the experi- 
menter took the subject’s blood pressure, the ice 
bath and table were moved to the subject’s right, 
and the experimenter displayed a sensation and a 
distress scale, which the subject was to use to make 
ratings during and following the immersion of his 
hand in the cold water. When the experimenter de- 
parted, the subject heard the taped instructions for 
rating sensation and distress during and after im- 
mersion of his hand in the cold water. He was told 
that sensation was “the physical intensity of what 
you will be feeling,” and distress was “the amount 
of upset or distress the sensations cause.” Each scale 
was printed in the vertical orientation on a separate 
8½ × 11-inch page. The points along the sensation 
scale were marked with numbers, and the points 
along the distress scale with numbers and verbal 
labels of distress magnitude. Instructions for rating 
sensation and distress were the same as Johnson’s 
(1973). 

Following the instructions, the subject was told to 
put his hand in the water and to make sensation and 
distress judgments whenever the tape indicated. The 
signals for making the ratings were given 10, 35, and 
60 sec after the subject immersed his hand in the 
water, every 30 sec thereafter for 4 additional judg- 
ments, and then once every minute until a 6-min 
period had elapsed (10 judgments in all). 
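
The signal timing described above can be checked with a short sketch. The function name `immersion_schedule` and its parameter are illustrative assumptions, not part of the original procedure; the sketch only reproduces the stated rule (signals at 10, 35, and 60 sec, four more at 30-sec steps, then one per minute until the 6-min limit).

```python
def immersion_schedule(total_sec=360):
    """Return the judgment times, in seconds after immersion (hypothetical helper)."""
    times = [10, 35, 60]                 # the three signals stated explicitly
    for _ in range(4):                   # four additional judgments, 30 sec apart
        times.append(times[-1] + 30)
    while times[-1] + 60 <= total_sec:   # then once every minute until 6 min
        times.append(times[-1] + 60)
    return times


schedule = immersion_schedule()
print(schedule)        # judgment times in seconds
print(len(schedule))   # number of judgments during immersion
```

Under these assumptions the rule yields 10 judgments, with the fourth falling at 90 sec, the sixth at 150 sec, and the seventh at 180 sec, which agrees with the judgment numbers and times cited in the Results.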

The subject was then asked to remove his hand 
from the water and to continue to make sensation 
and distress judgments when asked. Postimmersion 
judgments were made after the subject removed his 
hand from the water and continued at 30-sec inter- 
vals. The experimenter then took the last blood 
pressure reading, and the subject completed 
another MACL. 

Postexperimental questionnaires. Following the 
completion of the MACL, the subject was taken to 
another room to complete a series of questionnaires 
that asked about each of the sensations, arousal signs, 
and procedural events mentioned in the instructions, 
along with questions on other sensations and feel- 
ings not mentioned. All items used 20-point Likert 
scales. After the questionnaires, the subject was 
offered his choice of coffee, tea, coke, or milk for a 
short break. A thorough debriefing completed the 
procedure. 
Results 
Manipulation Checks 


The postexperimental questionnaire checked 
on the effectiveness of the manipulations. For 
each of the 12 sensations mentioned in the 
sensory message, each of the 12 arousal signs 
mentioned in the arousal message, and each of 
the various events mentioned in the procedural 
information message, the subject was asked 
to make ratings with respect to the following 
four questions: (a) How strongly was it ex- 
pected? (b) How strongly did he remember 
being told to expect it? (c) How strongly was 
he aware of the sensation, sign, or event? and 
(d) How closely did his experience match his 
expectations? Several other items were in- 
cluded on the questionnaire, and they will be 
mentioned when results are reported. 

Information manipulations. Overall means 
of subjects’ responses to the 12 sensory and 
the 12 arousal questions are presented in 
Table 1. Compared to subjects in the other 
conditions, subjects in the sensory informa- 
tion condition recalled being exposed to the 
sensory information, F(2, 44) = 69.26, p < 
.001, and reported stronger expectations that 
they would experience these specific sensa- 
tions, F(2, 44) = 5.54, p < .05. Subjects given 
arousal information recalled being exposed to 
that information, F(2, 44) = 45.50, p < .001, 
but when compared to other subjects they did 
not report significantly stronger expectations 
that they would experience these arousal 
symptoms, F(2, 44) < 1, p > .25. Finally, 
subjects in the high pain-magnitude condi- 
tions recalled being told to expect strong pain, 
F(1, 44) = 9.10, p < .005, and reported 
stronger expectations that they would feel 
pain than did subjects in the no-pain condi- 
tion, F(1, 44) = 7.21, p < .01. 

The results are very clear in showing that 
subjects remember what they have been told. 
But they also show that what subjects have 
been told is not always equivalent to what 
they expect. The high-magnitude pain warn- 
ing clearly produced differences in expecta- 
tions for experiences of pain. The sensory in- 
formation was also reasonably effective in 
producing differences in expectations regard- 
ing these experiences. The one exception to 
this was the relatively high expectation of 
sensations for the subjects in the high-magni- 
tude pain warning control condition. The 
overall mean of the 12 items measuring ex- 
pected sensations for this group was high be- 


Table 1 
Mean Responses by Condition for Questions on Sensation Information and 
Behavioral Arousal Information 

Experimental condition 


Warning Information Remember Expect Awareness Match 
Sensation information questions S i 
13.0 : 
High pain warning Sensation lge lgs mando 19480 
BSCE 3.875 11.125 12.180 13:90 
No pain warnin Sensation 13.404 11.980 10.270 13.030 
A sal 4.586 8.838 à 12940 
pe: 2.111 9.040 10.270 i 
On! s 
Behavioral arousal questions ae ig 
i j 7.568 ! : 
High pain warning Semon ESP apt Gass, 
E sae 1568 =" 8102 ane 14.140 
es as 
q No pain warning Sensation aE 528 8.020 11.700 
Ara 2.404 6.054 5.323 15.530 
'ontro! s 


Note. Scores can range from 1 to 20. 
cause the high-magnitude pain warning pro- 
duced a pronounced elevation in three of the 
expectation items: (a) expected coldness, (b) 
expected stiffness of the fingers, and (c) ex- 
pected pinpricking sensations. Each of these 
questions refers to a sensation known to be 
associated with severe cold, and the pain 
warning seems to have alerted subjects to ex- 
pect these symptoms. Finally, the data sug- 
gest that arousal information, although clearly 
remembered, failed to provide any strong ex- 
pectation or set for the subjects. 

There are two reasons for caution in in- 
terpreting the effects of the instructions on 
subjects’ expectations. First, the responses 
are retrospective reports given after the sub- 
jects had been exposed to the cold water stim- 
ulus, and we do not know whether the re- 
ports at this point would correspond precisely 
to reports given prior to exposure. For ex- 
ample, it is possible that expectations im- 
mediately prior to exposure are more precise, 
clear, or certain in informed than in unin- 
formed subjects. Second, there is no reason 
to assume that expectations will have the 
same effect on later behavior if they are based 
in one instance on information from the ex- 
perimenter (sensory informed) and in another 
instance on information provided by the sub- 
ject himself (e.g., high-magnitude or pain 
warning control subjects). 

Table 1 also indicates that the preparatory 
information did not produce suggestion ef- 
fects, since there were no significant differ- 
ences between conditions on awareness of 
sensation or arousal signs. There also were 
no differences between conditions in reported 
matches between experience and expectations. 
But we cannot assume that subjects in different 
conditions were equally attentive to the sen- 
sory experiences and arousal signs, since the 
question merely asks if the signs were no- 
ticed. Similarly, we do not know if the re- 
ported match of experience and expectations 
in unprepared subjects was a postexperimen- 
tal conclusion saying, “Yes, I guess that was 
what I did expect,” and for prepared sub- 
jects was a conclusion during immersion say- 
ing, “Yes, I’m feeling just what they told me 
to expect.” 
Response to Stressor Impact 


Four major dependent measures were ob- 
tained during exposure to the noxious cold 
pressor stimulus: distress ratings, strength- 
of-sensation ratings, finger temperature, and 
heart rate. Though there were changes in 
heart rate over time, the average 2 beats/ 
minute increase during instructions and the 
average 8-12 beats/minute decrease during 
immersion being statistically significant, there 
were no significant treatment differences. 
Thus, the heart rate data will not be reported 
or discussed further. Reports of trial effects 
are omitted, since they are all highly signifi- 
cant. 

Distress ratings. Figure 1 presents the 10 
distress ratings for the sensation, arousal, and 
control conditions in high-magnitude pain 
warning conditions, and Figure 2 presents the 
same data for the low-magnitude (no pain 
warning) conditions. It is immediately obvi- 
ous that distress increased quite sharply dur- 
ing the first minute and a half (the first four 
judgments) of exposure and then leveled off. 
But distress reporting was not identical across 
the three conditions; the sensory-informed 
group showed a precipitous drop in reported 
distress, beginning with Judgment 6 (2½ 
min.), in the no pain warning conditions. The 
decline in distress is far smaller and occurs 
somewhat later, at Judgment 7 (3 min.), in 
the high-magnitude pain warning conditions. 
The difference between the three types of 
condition is reflected in a highly significant 
interaction between treatments and judgment 
trials, F(18, 396) = 3.44, p < .001. 

It is obvious from the pattern of results 
that treatment effects became manifest only 
during the final 4 min. of immersion; all 
groups acted similarly during the first 2 min. 
of cold water impact. Separate planned com- 
parisons were run on Judgments 5 through 
10. Subjects in the sensory information groups 
reported significantly less distress than did 
subjects in the control groups, F = 24.49, p < 
.001, and the sensory-informed subjects also 
reported significantly less distress than did 
subjects in the arousal information groups, 
F(1, 220) = 9.76. The level of distress re- 
ported in the arousal information groups did 
not differ from that in the control conditions. 
These tests show that sensory information 
significantly reduced distress during the final 
judgments (last 4 min. of immersion) in rela- 
tion to both control (procedural) and arousal 
information, while arousal information had 
no substantial effect on distress reduction. 

Figure 1. Distress ratings for the sensation, arousal, and control conditions with high-magnitude pain warning (Experiment 1). 

The final critical comparison is between 
the sensory information group with no pain 
warning and the sensory information group 
with the high-magnitude pain warning. If the 
pain warning blocked distress reduction by 
sensation information, it would support the 
processing notions and lead us to reject the 
simpler version of the accuracy hypothesis. 
A separate analysis of variance of the last 
judgments, 5 through 10, for these two con- 
ditions (sensation–high pain vs. sensation– 
no pain) yielded a nearly significant main ef- 
fect, with less overall distress reported in the 
no pain warning–sensation group, F(1, 15) 
= 4.29, p < .06. The reduction of distress in 
the sensory information group with no pain 
warning (M decrease = 34.12) was greater 
than that in the sensory information group 
with the high pain warning (M = 20.97), but 
the overall interaction of pain warning with 
trials was not significant, F(5, 75) = 1.84, p 
< .15. Given the virtual identity in rated 
distress for the first five trials, it appears 
that the pain warning interfered with distress 
reduction in the final 3½ min. of exposure. 
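
The degrees of freedom reported for the trials analyses can be cross-checked with a small sketch. It assumes a conventional split-plot (mixed) analysis of variance in which an effect-by-trials term has df equal to the effect df times (trials − 1), tested against the between-subjects error df times (trials − 1). That model, and the helper name `trials_df`, are assumptions for illustration; the text does not spell out the analysis.

```python
def trials_df(effect_df, error_df, n_trials):
    """df for an Effect x Trials term in a split-plot ANOVA (assumed model)."""
    within_df = n_trials - 1
    return effect_df * within_df, error_df * within_df


# Information (3 levels, df = 2) x 10 judgments; between-subjects error df = 44
info_by_trials = trials_df(2, 44, 10)

# Pain warning (df = 1) x Judgments 5-10 (6 trials) in the two sensation groups;
# between-subjects error df = 15
warning_by_trials = trials_df(1, 15, 6)

# Error df for planned comparisons pooled over Judgments 5-10 (6 trials)
planned_error_df = trials_df(1, 44, 6)[1]

print(info_by_trials, warning_by_trials, planned_error_df)
```

Under this assumed model the values match the reported F(18, 396), F(5, 75), and the error df of 220 used for the planned comparisons.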
Sensation ratings. Sensory information was 
expected to change distress ratings but to 
have no effect on the reported strength of 
sensations. The data showed, however, a 
greater drop in the rated strength of sensa- 
tions in the sensory information groups than 
in the control groups and low ratings in the 
arousal information–no pain warning condi- 
tion. This produced a significant Condition × 
Trials interaction, F(18, 396) = 3.98, p < 
.001. Despite this effect, the condition differ- 
ences were less pronounced than for the distress 
ratings, and there was no difference between 
the sensory information–high-magnitude pain 
warning and sensory information–no pain 
warning conditions. It is possible that both 
distress and sensation ratings changed in simi- 
lar ways because subjects failed to discrimi- 
nate sharply between them, but the more sub- 
stantial impact of sensation information on 
the distress ratings suggests that the differ- 
entiation was made by at least some of the 
subjects. Separate figures are not reported for 
the sensation data because of their basic simi- 
larity to the distress findings. 
Finger temperature. Figure 3 presents the 
finger temperature readings for all six experi- 
mental groups at 10 different times during the 
final 3 min. of cold water immersion. Finger 
temperatures for the first 3 min. were ex- 
cluded from the figure and analyses because 
Teichner’s (1965) findings showed the onset 
of vasodilation to be in the neighborhood of 
240 sec. It is immediately obvious from Fig- 
ure 3 that the no pain warning sensory in- 
formation condition shows a distinct rise in 
skin temperature. The difference between the 
3- and 6-min. finger temperature was com- 
puted for subjects in all six conditions, and 
a planned comparison showed significantly 
greater increases in skin temperature for sub- 
jects in the sensory information conditions 
than in the other conditions, F(1, 342) = 41.31, 
and greater increases in skin temperature for 
the sensory-informed–no pain warning group in 
comparison to the sensory-informed–high- 
magnitude pain warning group, F(1, 342) = 
15.83, p < .001. The temperature data pro- 
vide support for the hypothesis that sensory 
information reduced distress only in the ab- 
sence of pain warnings, a prediction follow- 
ing from the view that processing the stressor 
input as objective rather than as subjective 
emotional information is the critical factor 
in distress reduction. 


Postimmersion Questionnaire Responses 


The postimmersion questionnaire was de- 
signed to help clarify the processes underly- 
ing the treatment effects. Significant results 
were found for one of two sets of questions, a 
3-item scale that attempted to assess sub- 
jects’ strategies for coping with the distress 
experience (Kanfer & Goldfoot, 1966). This 
scale focused on the subject’s tendency to de- 
velop an “objective set” toward the noxious 
stimulus: (a) “While my hand was in the 
water, I tried to think of it as something 
separate from me, so that I was not actually 
experiencing the discomfort, but rather, my 
hand was”; (b) “When my hand was in the 
water, I tried to think of the whole experience 
as something which would be informative and 
interesting to observe”; (c) “While my hand 
was in the water, I tried to concentrate on all 
the sensations that it was experiencing—sen- 
sations such as tingling, pinpricking, etc.” As 
can be seen in Table 2, subjects receiving sen- 
sory information gave significantly higher 
ratings to these items than arousal informa- 
tion or control subjects, F(1, 44) = 4.38, p 
< .05. This did not happen merely because sensory- 
informed subjects agreed to Item 3, since the 
second item (informative and interesting to 
observe) was the best single discriminator of 
the groups, F(1, 44) = 4.90, p < .04. 

Four other significant findings suggested 
that both sensation and arousal information 
Table 2 
Self-Ratings on Objective Set Items 


Information condition 
Warning condition Sensation Arousal Control 


6.458 
5.074 


7.417 
7.074 


5.048 
5.148 


High pain warning 
No pain warning 


Note. Scores can range from 1 to 15. 


affected emotional states. First, when asked 
if they had “thoughts and worries about how 
the cold water might damage your hand,” sub- 
jects in the arousal information condition re- 
ported significantly less worry than did sub- 
jects in either the sensory or the control 
group, F(1, 44) = 4.57, p < .05 (see Table 
3). Second, on a postexperimental question- 
naire measure of avoidance motivation, sub- 
jects in the arousal condition reported less de- 
sire to remove their hand from the cold water 
during the immersion period than subjects in 
the control conditions (Table 4), F(1,44) = 
4.32, p < .05. The postexperimental question- 
naire data support the hypothesis that arousal- 
informed subjects will show reduced avoidance 
motivation. 

Two mood scales also showed significant 
treatment differences on the postexperimental 
measures: Subjects in the control condition 
were more tense and anxious (M = 6.5) than 
were subjects in either the sensory (M = 
2.2) or arousal (M = 3.2) information con- 
ditions, F(1, 43) = 8.26, p < .01, and con- 
trol subjects reported a greater sense of hope- 
lessness and defenselessness (M = 4.5) than 
did subjects given sensory (M = 1.6) or 
arousal (M = 2.3) information, F(1, 43) = 


Table 3 
Means of Subjects’ Reports of 
Thoughts and Worries 

                          Information condition 
Warning condition      Sensation   Arousal   Control 

High pain warning        5.625      1.857     9.125 
No pain warning          3.111      2.222     4.222 
M                        4.294      2.062     6.259 

Note. Scores can range from 1 to 15. 
Table 4 
Means of Subjects’ Reported Desire to 
Remove Their Hand from the Water 

                          Information condition 
Warning condition      Sensation   Arousal   Control 

High pain warning        13.50       8.86     15.38 
No pain warning          10.11       8.89     12.44 
M                        11.71       8.88     13.82 

Note. Scores can range from 1 to 20. 


7.85, p < .05. These differences appeared only 
in the high-pain warning conditions and only 
for the postexperimental (post-cold pressor) 
measures; there were no significant differences 
on the mood scales given prior to the cold 
pressor treatment. 


Discussion 


The results lend clear support to past find- 
ings (Johnson, 1973, 1975; Johnson et al., 
1975; Johnson & Leventhal, 1974; Johnson 
et al., 1973; Staub & Kellet, 1972) that sensa- 
tion information reduces distress during con- 
tact with a noxious stressor. The sensation- 
informed group reported less distress, had 
higher finger temperatures, and reported 
somewhat weaker sensations from the cold 
pressor stimulus than did the control groups 
or the groups given arousal information. But 
the distress-reducing effect of sensory infor- 
mation was strongest in the absence of a pain 
warning. When subjects were told that the 
stressor was very painful, there were small 
differences between the sensory, arousal, and 
control conditions. That a pain warning can 
block distress reduction by sensory informa- 
tion establishes an important limitation on 
the effectiveness of sensory information. And 
the finding is consistent with prior studies 
showing (a) increased tolerance of shock 
when sensation information was accompanied 
by reassurance that shock was harmless 
(Staub & Kellet, 1972); (b) the blocking of 
peripheral vasodilation to cold pressor when 
subjects were threatened by electric shock 
(Teichner, 1965); and (c) increased reports 
of pain to electric shock when the subjects 
were given a pain warning (Hall & Stride, 
1954). 
The findings suggest that a simple accuracy 
hypothesis does not readily account for the 
pain-reduction effects of sensory information 
and indicate that different kinds of prepara- 
tory information alter the way the noxious 
stimulus information is processed to change 
the subject’s experience of the stressor. Sen- 
sory information leads to an objective, non- 
affective experience of the stimulus; magni- 
tude information, or a pain warning, leads to 
an emotional experience of the stimulus. When 
objectively processed, the noxious input is 
coded or categorized in terms of specific sen- 
sory features, such as coldness, numbness, 
pins and needles, and so forth, and emotional 
distress reactions habituate. When emotion- 
ally processed, the information is encoded or 
integrated in an emotional schema or pain 
memory (Engel, 1959; Leventhal & Everhart, 
in press), and the stimulus and coding con- 
tinue to stimulate distress. 
The hypothesis that the pain warning ac- 
tivates an emotional pain memory for the 
encoding of the noxious experience of the 
cold pressor suggests that the warning might 
convert each of the sensation cues into a series 
of threat cues; Epstein (1973) offers a simi- 
lar interpretation of the effects of pain warn- 
ings for subjects’ responses to successive num- 
bers in response to a loud noise presented 
during a sequential countdown procedure. 
Correlations between heart rates and the num- 
ber of sensations expected by subjects (mea- 
sured by the 12 items in Table 1) support 
this expectation in the present data. Sensa- 
tion-informed subjects who received a pain 
warning showed higher heart rates the more 
sensations they expected (rs = .16 to .62 for 
the nine heart rates recorded during immer- 
sion), whereas sensation-informed subjects 
who did not receive a pain warning showed 
lower heart rates the more sensations they ex- 
pected (rs = −.46 to −.67 for the nine heart 
rates recorded during immersion). 

Other features of the data support the idea 
that sensation information leads to an objec- 
tive or nonthreatening schematization of the 
stimulus, which then facilitates habituation 
of distress. First, the subjects given sensation 
information clearly adopted a more objective, 
detached set toward the exposed hand. While 
this finding held whether or not the subjects 
received a pain warning, the sensation-in- 
formed subjects given a pain warning also 
reported considerable worry about the noxious 
event; the objective schematization would 
be in competition, therefore, with an emo- 
tional representation of the experience that 
would be distress stimulating. Second, it is 
also clear that sensation information produces 
its effects gradually; the conditions all reached 
approximately equal peaks of distress at 1½ 
min. after immersion, and differences between 
conditions were clear only during the latter 
3½ min. of exposure. And while the data were 
not as clear as we might like, it is reported 
distress rather than reported strength of sen- 
sations that shows the more pronounced de- 
cline during this final period, and increased 
hand temperature (for sensation-informed– 
no pain warning subjects) follows approxi- 
mately 2 min. after the beginning of the de- 
cline in distress reports. It would appear, 
therefore, that the acquisition of an objective 
schema of the noxious stimulus facilitates the 
habituation or reduction of distress, and this 
in turn reduces the sympathetic discharge 
that sustains peripheral vasoconstriction 
(Teichner, 1965). 

Arousal information had relatively little 
impact on reports of stimulus strength or dis- 
tress during stimulus impact. It did, how- 
ever, sharply reduce postexperimental reports 
of worries, thoughts of harm, and reported 
desire to remove one’s hand from the water. 
Thus, being aware of and expecting a host of 
bodily reactions seems to have reduced the 
level of fear and avoidance motivation stimu- 
lated by the noxious stimulus sensations and 
distress. And these effects occurred even 
though all subjects, arousal-informed or not, 
expected these responses. Thus, being told 
about them by the experimenter appears to 
be a form of reassurance, perhaps because it 
suggests that one is responding as do other 
subjects and need not be threatened or wor- 
ried by one’s reactions. This outcome is com- 
patible with Calvert-Boyanowsky and Leven- 
thal’s (1975) interpretation of the misattribu- 
tion studies. 
Experiment 1 suggested that sensory in- 
formation reduced distress because it altered 
the way subjects processed the noxious in- 
put; it prepared the subjects to generate an 
objective schema of the stimulus features, 
which facilitated the habituation of distress. 
An alternative interpretation of this effect is 
that the act of analysis itself and its accom- 
panying “distancing” of the self from stimula- 
tion, rather than schematization of the nox- 
ious event per se, is the critical factor in 
distress reduction. Analysis of features may 
create an attentional set inconsistent with the 
holistic, integrative set that seems to be neces- 
sary for the integration of stimulus percep- 
tion and bodily reactions into an emotional 
experience (Blitz & Dinnerstein, 1971; 
Krueger, 1928; Leventhal, in press; Schachter 
& Singer, 1962). Emotional experience is de- 
scribed as holistic at a phenomenal level 
(Krueger, 1928), and experimental data show 
that analytic sets disrupt emotional experi- 
ences associated with positive, humorous stim- 
ulation (Cupchik & Leventhal, 1974; Leven- 
thal & Cupchik, 1976). 

The considerations discussed above sug- 
gested a simple study in which subjects were 
assigned to one of three conditions: (a) an 
attention set, in which subjects were asked to 
monitor the site of impact (i.e., the experi- 
ences they were having in their hand while it 
was in the cold water); (b) an attention-to- 
hand-and-body set, where subjects were asked 
to monitor both their hand and their body 
reactions to the stressor; and (c) an unin- 
structed control group. 

The comparison of attention to hand and 
body to the control condition and to the at- 
tention-to-hand-alone condition would allow 
us to evaluate a variety of alternative hypoth- 
eses. If subjects in the combined attention-to- 
hand-and-body condition reported less distress 
than subjects in both control and attention- 
to-hand-only conditions, it would support the 
idea that focusing attention on all details of 
the situation is distress reducing (i.e., that an 
analytic set and coding of specific features is 
distress reducing). This outcome also would 
seem plausible if one believed that a combina- 
tion of sensation and body information would 
allow for maximal preparation. 

If objective schematization of the input is 
the critical factor in distress reduction and 
habituation, it would be essential for the sub- 
ject to monitor the site of cold pressor impact, 
and it would be most efficient for schema 
formation if he paid exclusive attention to 
the objective features of stimulation to the 
hand. Assuming that the formation of an 
objective schema of the input is necessary 
for habituation also suggests that attention 
to the hand should produce a decline in dis- 
tress during the latter 3½ to 4 min. of ex- 
posure to the noxious stimulus. This hypoth- 
esis assumes that sensation information af- 
fected distress in Experiment 1 because it 
encouraged schematization of objective stimu- 
lus features by focusing attention on them. 
This reasoning is supported by Epstein’s 
(1973) finding that habituation to simple 
tones occurs only when the subject directly 
attends to them. When attention is directed 
elsewhere, there is no habituation to the tones. 


Method 
Subjects and Design 


A total of 68 right-handed males volunteered for 
this study. All were enrolled in introductory psy- 
chology classes at the University of Wisconsin. 
. . . the cold pressor task; and one subject was dropped 
because of extremely atypical behavior before the 
experiment. The remaining 56 subjects were ran- 
domly assigned to the three conditions: (a) attention 
to hand only, (b) attention to hand and body, and 
(c) control. 


Procedure 


The subject was seated comfortably in an arm 
chair and heard the introductory instructions over 
a loudspeaker. These instructions explained the pro- 
cedure for the attachment of a thermistor for measur- 
ing hand temperature and assured the subject that 
the cold pressor treatment would be harmless. The 
subject completed a medical questionnaire, to screen 
out people with health problems. The subject then 
read and signed a voluntary consent form that indi- 
cated that he could quit the experiment without 
losing credit for participation. All subjects 
agreed to take part in the study. The thermistor was 
then attached to the middle finger of the subject’s 
right hand, and the experiment proceeded as follows. 

Stress task. Two insulated water tanks were lo- 
cated to the right of the subject—one with water at 
room temperature and the other with cold water. 
The room-temperature tank (22 °C) was used to be 
sure of a common initial hand temperature for all 
subjects. The cold water was maintained at a con- 
stant temperature of 7 °C by a circulating pump 
that sent the water through an auxiliary ice bucket. 
Before leaving the room, the experimenter asked the 
subject to immerse his right hand in the room-tem- 
perature tank. The remainder of the instructions 
were heard over a loudspeaker. 

Distress ratings. The instructions for rating dis- 
tress were identical for all the groups. Subjects were 
told: 


While your hand is in the cold water, you will 
be asked to record your judgment on the scale of 
distress. The distress scale . . . refers to the amount 
of upset or distress the sensations cause. Zero . . . 
means you are feeling no upset or bother at all. 
One hundred means the maximum amount you 
can imagine of the upset or distress you will ex- 
perience. 


The subject was told to make his judgments when 
the word judgment was said in the tape-recorded in- 
structions. He was told to “say the number on the 
scale which most accurately describes how you feel 
at that moment. . . . Each judgment is independent 
of previous ones. That means your ratings may go 
up, down, or stay at the same level.” 

Attention instructions. At this point, the experi- 
menter randomly selected a ballot to assign the sub- 
ject to one of the three conditions and played back 
the appropriate instructional tape for that condition. 
In the attention-to-hand-only condition, the sub-
ject heard the following instructions:


During the time your hand will be in the colder 
water, you will feel many sensations in your 
hand. The sensations will change over time. We
would like you to pay attention to the sensations
or feelings in your hand only, so that you will be
able to describe each of the specific feelings and
sensations that you experienced. I will ask many
questions about the different sensations in your
hand afterwards, and this is the most important
part of the study. So ignore everything else and
pay close attention to the sensations in your hand.


In the condition for attention to hand and body,
subjects heard the same instructions, with the ex-
ception that the phrase hand and body was in-
serted for hand. The control group did not get any
attention manipulation instructions.

Distress ratings. The recording then asked the
subject to move his hand to the cold water tank and
called for the distress judgments. The subject made
13 distress ratings: 9 during the 6 min. of the cold
pressor task and 4 after removing his hand from the
water. The first judgment was requested 10 sec
after immersion in the cold water; 2 additional
judgments were then requested at 25-sec intervals,
and the interval was expanded to 50 sec for the re-
maining judgments. Six minutes after he placed his
hand in the cold water, the subject was instructed 
to remove it, and 4 additional judgments were re- 
quested at 30-sec intervals. The subject made his 
ratings by calling out a number from 0 (not at all) 
to 100 (maximum) on the 100-point distress scale. 
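
The prompt schedule just described can be laid out as cumulative times. The following is only an illustrative sketch, not part of the original procedure: the names `IMMERSION_GAPS`, `RECOVERY_GAPS`, and `judgment_times` are hypothetical, and the sketch assumes that the first recovery prompt came 30 sec after hand removal, which the text does not state explicitly.

```python
from itertools import accumulate

# Gaps between successive prompts during immersion, in seconds:
# first prompt 10 sec after immersion, two more at 25-sec intervals,
# then 50-sec intervals for the remaining six immersion judgments.
IMMERSION_GAPS = [10] + [25] * 2 + [50] * 6

# Assumption: recovery prompts start 30 sec after removal, then every 30 sec.
RECOVERY_GAPS = [30] * 4

REMOVAL_SEC = 6 * 60  # hand removed 6 min after immersion


def judgment_times():
    """Return (immersion_times, recovery_times) in seconds after immersion."""
    immersion = list(accumulate(IMMERSION_GAPS))
    recovery = [REMOVAL_SEC + t for t in accumulate(RECOVERY_GAPS)]
    return immersion, recovery


immersion_times, recovery_times = judgment_times()
print(immersion_times)
print(recovery_times)
```

Under this reading the ninth immersion prompt coincides with the instruction to remove the hand, and the 9 + 4 prompts account for all 13 ratings.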

Distress judgments were the only judgments called 
for in this experiment and in Experiment 3. Inter- 
views with pilot subjects showed that it was more 
difficult for them to make the distinction between 
distress and sensation than in Experiment 1, a dif- 
ficulty that could stem from the absence of sensa- 
tion instructions that helped the subjects in Experi- 
ment 1 focus more clearly on particular sensory 
features. Use of a pain rating was rejected, since 
mentioning the word pain prior to cold water im- 
mersion might obliterate distress reduction and treat- 
ment differences. 

Finger temperature. Finger temperature was re-
corded on a meter located in the control room. The
first temperature recording was made while the sub- 
ject’s hand was in the room-temperature tank, and 
three additional recordings were made, one at each 
of the distress ratings.

Postexperimental questionnaires. At the close of 
the procedure, the experimenter removed the therm- 
istor and gave the subject two questionnaires. The 
first was a sensation and feature checklist asking the 
subject to rate (on a 21-point scale) the degree to 
which he noticed various sensations in his hand 
(“hand throbbing” or “tingling, biting sensations in 
fingers”), sensations in his body (“speeded-up breath- 
ing” or “feeling as if your body is exerting a great 
deal of effort”), and various features of the environ-
ment (e.g., instructions, procedure, and room).

The second questionnaire included direct questions 
about the percentage of the time the subject focused 
his attention to his hand and to his body. The sub- 
ject made these judgments with reference to each 
of three time periods—the beginning, middle, and 
end of the cold pressor task. Thus, there were three 
questions for attention to hand and three for atten- 
tion to body. The second questionnaire also included 
a question inquiring about the level of tiredness and 
tenseness that the subject felt toward the end of the 
immersion episode. At the conclusion of the session, 
the subject was debriefed, and the experimenter an- 
swered any of his questions about the study. 


Results 


Distress Ratings 


Separate analyses were conducted on the 9
impact ratings given during the exposure to
the cold water and the 4 recovery ratings
given after the hand was removed from the
water. As can be seen in Figure 4, subjects in
all three groups reported sharp increases in
distress during the first minute of immersion
(Judgments 1 to 3), which peaked at Judg-
ment 4 and then leveled off and dropped
slightly during the remaining time the hand
was in the water. As can also be seen from
the figure, the groups began at the same level,
but distress increased much more rapidly
in both the control and atten-
tion-to-hand-and-body conditions, so that by
the second judgment these two groups were at
substantially higher levels of reported distress
than the attention-to-hand subjects, and the
difference remained at relatively the same
magnitude throughout the 6-min. period.

Figure 4. Distress ratings during cold pressor immersion for attention to hand (n = 17), attention
to hand and body (n = 17), and control (n = 24) groups; data from Experiment 2.

The component of the main effect that tests
the hypothesis that attention-to-hand sub-
jects are less distressed (M = 52.99) than
control subjects (M = 66.68) is statistically
significant, F(1, 55) = 4.45, p < .05. The at-
tention-to-hand-and-body group (M = 61.96)
was intermediate, but much closer to the con-
trol than to the hand-only subjects.
Separate tests show that it did not differ sig-
nificantly from either the hand-alone (F
= 2.12, ns) or the control condition (F
= .58, ns). Finally, there is no significant
Treatment x Trials interaction.


Recovery Ratings


Although distress declined sharply during
the recovery phase, there were no treatment
differences.


Finger Temperature 


There were no significant treatment differ-
ences or interactions of treatments with trials.
The failure to find effects may be due
to the change in water temperature from
Experiment 1 (7 °C in Experiment 2), diffi-
culties with the temperature meters and elec-
trodes, and factors related to the time of year
at which the data were collected.


Postexperimental Questionnaire Reports 


Awareness. Subjects estimated the amount 
of attention they paid to their immersed hand 
during the beginning, middle, and end of the 
6-min. immersion in the cold water and the 
amount of attention they paid to their body 
reactions during each of these same three 
periods. These ratings were made on 11-point 
scales ranging from 0% to 100% of attention.
Subjects also completed a checklist on which
they rated the degree to which they felt each
of 35 sensations (1 = not at all aware of it,
21 = very strongly aware of it). Eight of
these items related to the hand only, and 11
to the body reactions only. The remaining 16 
items focused primarily on attention to situa- 
tional factors such as the procedure. 

Significant treatment effects appeared on 
both measures: Subjects in the attention-to- 
hand-and-body condition reported paying a 
greater percentage of attention to their body
reaction (M = 32.05%) than did subjects
in either the attention-to-hand-only condition
(M = 17.35%) or the control condition (M
= 14.82%), F(2, 100) = 5.68, p < .01; the
control and attention-to-hand-only con-
ditions were not significantly different. Sub-
jects in the attention-to-hand-and-body con-
dition also checked greater awareness of spe-
cific body reactions (M = 79.7); the control
group was intermediate (M = 59.5); and the
least aware were subjects in the hand-only con-
dition (M = 39.4), F(2, 47) = 5.77, p < .01.
Each of the groups was significantly different
from the others at the .05 level by t test.
There were no significant treatment effects
on the measures of attention to the hand.
Fatigue. On three 21-point scales, sub-
jects responded to the following items: “To-
ward the end of the time your hand was in
the water, did you feel (a) tired, (b) tense
and jumpy, (c) detached and daydreamy?”
Subjects instructed to attend to their hands
only reported lower levels of tiredness (M =
[illegible]) than was reported by subjects instructed
to attend to hand and body (M = [illegible]) or by
control subjects (M = 5.6), F(2, 42) = 3.94,
p < .05.


Discussion


Attention to the site of stressor impact, the 
hand only, reduced levels of reported distress 
during cold pressor. Instructions to attend to 
both hand and body did not, however, pro- 
duce statistically significant effects. Thus, the 
mere act of selectively deploying attention 
and analyzing the stimulus field is not suf-
ficient for distress reduction; focusing of at-
tention on the site of stressor impact is criti-
cal for distress reduction, through a habitua- 
tion-type process. 

While the data are generally supportive of 
a processing interpretation (i.e., that lower 
levels of distress and reduced tiredness are 
due to the formation of an objective schema 
of stimulus properties and the habituation of 
distress), there is one feature of the data that 
is not fully consistent with this interpreta- 
tion: The differences between conditions 
should have been greatest toward the 3rd and
5th minutes of the immersion period, rather
than during the 1st minute, as shown in
Figure 4. The data suggest that the condi-
tions differentiated very rapidly, and this was
reflected in the analysis, which found a main
effect between conditions but no Treatment
x Trials interaction. A second problem was
the failure to detect differences between con- 
ditions in the measure of the percentage of 
time subjects reported that they attended to 
their hand. Indeed, control subjects reported 
having attended as much to their hand as did 
attention-instructed subjects. Postexperimen- 
tal interviews suggested that this attention 
was not the same. For the control subjects, 
the attention seemed automatic, involuntary, 
and stimulus engendered; they could not ig- 
nore the distress of the stimulus impact. For 
the instructed subjects, attention seemed self- 
engendered, controlled, or voluntary, a de- 
liberate focusing of awareness on stimulus 
properties. Indeed, a comparison of findings 
and impressions from the postexperimental 
interviews of Experiments 1 and 2 suggests
that sensation information may be superior
to direct attentional instructions in reduc-
ing distress, since the former defines the stim-
ulus features to be monitored and provides
schemata to be used in coding the input
(Bruner, 1957).


Experiment 3


Experiment 3 was designed to obtain better 
evidence with respect to the effects of the at- 
tentional instructions on the cues subjects 
monitor and to provide a better test of the 
hypothesis that schematization of the stim- 
ulus preceded distress reduction and habitua- 
tion. Evidence on the deployment of atten- 
tion was obtained by introducing a specific 
distractor to direct attention in the control 
condition. Evidence was obtained for the as- 
sumption that objective schema formation 
precedes reductions in distress by systemati- 
cally varying the deployment of attention 
during the early and later parts of the cold 
water immersion. The strength of sensations 
and distress from the cold pressor stimula- 
tion begin at relatively low levels and grad- 
ually build in intensity. It should be relatively 
easy to direct attention to and schematize 
stimulus features at this time. As the stimu- 
lus becomes more intense, the schema should 
be reinforced by repetition of the features de- 
tected earlier. On the other hand, subjects 
instructed or motivated (by threat) not to 
attend to the stimulus may have difficulty 
deliberately directing their attention to stimu- 
lus features if they are first instructed to 
monitor the stimulus later in exposure. De- 
laying attention and permitting the stimulus 
to build to maximum strength, to be in-
tegrated with pain memories, and to be ex-
perienced as a distressing event will make it
difficult to monitor and schematize, since the 
pain and distress may block attention to 
specific features. 

The assumptions mentioned above are gen- 
erally compatible with empirical observations 
made on studies of behavior and neural ha- 
bituation in the cat (Groves & Thompson, 
1970). Habituation occurs more rapidly to 
stimuli that increase gradually in strength 
early in a series of stimulus exposures, and 
habituation is relatively slow and sometimes 
fails to occur for strong stimuli. To test the 
assumption with cold pressor, a four-group
experiment was designed varying attention to 
and distraction from the hand at different 
times during immersion in ice water. In one 
condition subjects were instructed to monitor 
the hand throughout the immersion period
(attention-attention). In a second condition
(attention-distraction) they were instructed
to monitor the hand for the first half of the
immersion period, during the time of in-
creasing strength of sensations when schema
formation is critical, and they were dis-
tracted by slides in the second half. In a third
group (distraction-attention) subjects were
instructed to attend to and monitor the hand
for the last half of the immersion period,
after the input had presumably been classi-
fied and responded to as painful, thus block-
ing the positive effects of monitoring. A
fourth group (distraction-distraction) was
distracted for the entire immersion period.
The hypothesis under test was that monitor-
ing throughout the immersion period and
monitoring during the first half of the im-
mersion period would be equally effective in
schema formation and would facilitate habit-
uation during the second half of exposure.
Monitoring during the last half of immersion
and continual distraction should be equally
ineffective for distress reduction.


Method 
Subjects 


Seventy-six male undergraduates at the University
of Wisconsin served as subjects. They were selected
from a larger pool of subjects, and before being per-
mitted to participate, they were screened and ex-
cluded if they reported a health problem (a
history of diabetes, rheumatic fever, heart abnor-
mality, or high blood pressure) or if they were left-
handed. An incentive of $2 per hour was offered to
encourage participation. Nine subjects quit during
exposure to the cold water treatment (they did not
cluster in any particular condition), and one refused
to participate when told about the noxious stimulus.
Two subjects were dropped, one because he was
hard of hearing and did not hear all of the in-
structions and another because he meditated during
the experiment. The 64 subjects who remained were
randomly assigned to one of the four conditions
with the restriction that a subject was assigned to
each of the four conditions before assignment
to another replication. This procedure was
repeated eight times for each of the two male ex-
perimenters.
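
The restricted random assignment described in this paragraph (each of the four conditions filled once before a new replication begins, eight replications per experimenter) corresponds to what is now usually called blocked randomization. The sketch below is an illustration under stated assumptions, not the authors' actual procedure: the function name `blocked_assignment`, the experimenter labels, and the random seed are all hypothetical.

```python
import random

CONDITIONS = [
    "attention-attention",
    "attention-distraction",
    "distraction-attention",
    "distraction-distraction",
]


def blocked_assignment(n_replications, rng):
    """Return a condition sequence built from shuffled blocks of all four conditions."""
    order = []
    for _ in range(n_replications):
        block = CONDITIONS[:]   # one subject per condition in every replication
        rng.shuffle(block)      # random order within the replication
        order.extend(block)
    return order


rng = random.Random(0)  # arbitrary seed, used only to make the sketch repeatable
schedule = {
    experimenter: blocked_assignment(8, rng)
    for experimenter in ("experimenter_1", "experimenter_2")  # hypothetical labels
}
total_subjects = sum(len(seq) for seq in schedule.values())
print(total_subjects)
```

Eight replications of four conditions for each of two experimenters account for the 64 subjects, 16 per condition.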


Distraction Materials 


In postexperimental interviews for an earlier experiment,
subjects suggested that it was difficult to distract
themselves from their hand in the absence of an
alternative stimulus. For this reason, 48 color slides 
of landscapes and people were used as distractors 
throughout the nonattention condition of Experi- 
ment 3, and 24 of the slides were used as distractors 
for the late and early nonattention (distraction) 
periods of the early and late attention conditions. 

The 48 slides were selected from a larger set of 
250. The experimenter first eliminated those slides
suggesting pain or coldness (e.g., winter landscapes, 
scenes including water, etc.). The 60 slides remain- 
ing after this elimination were then shown to a 
group of six males and six females who had previ- 
ously been exposed to the cold water stimulus and 
who rated them as to how much they thought the 
slide could distract them from the (cold pressor) 
pain. The 15 slides having the highest mean score 
(none scored less than M = 4.75 on a 7-point scale 
where 1 = attends to the pain and 7 = distracts from 
the pain) were selected as distraction stimuli and 
were used as standards for the selection of 33 ad- 
ditional distractors from yet another set of slides. 
The additional slides were matched to the first 15 
in subject matter and amount of detail. We can best 
characterize these slides by saying that they were 
sufficiently interesting to attract attention and pre- 
vent deliberate monitoring of the hand, yet not so 
interesting as to provoke a strong alternative emo-
tional response that would compete with the emo-
tional distress evoked by the noxious stimulus.


Procedure 


The procedure was virtually identical to that in 
Experiment 2. One major change was to raise the 
temperature of the laboratory to 26 °C and the tem-
perature of the water used to establish a common
baseline to 30 °C. It was hoped that these changes
would facilitate recording skin temperature change.
Subjects in each of the four conditions went
through the same seven steps in the following order:
(a) completion of the medical screening question-
naire and reading and signing the consent form; (b)
listening to an overall description of the aim and
procedure of the study; (c) immersing the right
hand in the warm (30 °C) water; (d) hearing an
explanation of the distress rating scale and the rat-
ing procedure; (e) receiving attention and/or dis-
traction instructions appropriate for the experimen-
tal condition; (f) removing the hand from the warm
water and immersing it in the cold water (7 °C) for
5 min. and 15 sec; (g) removing the hand from the
cold water tank and making judgments of distress at
30-sec intervals during a 2-min. postimmersion period.
Times for judgments and so forth are
presented later on.

Attention instructions. The attention-distraction
instructions for each condition were presented after
the description of the distress rating procedure. Sub-
jects assigned to the attention-attention condition
heard the same attentional instruction as that given
in Experiment 2. The instructions for the distrac-
tion-distraction group differed from those in Ex-
periment 1, since slides were used as distractors in
the current study. The instructions were as follows:


During the time your hand will be in the cold 
water, you will see landscape and art slides. We 
would like you to pay attention to these slides, so 
you will be able to answer questions about them 
afterwards. This is the most important part of 
the study, so ignore everything else and pay close 
attention to the slides. 


The instructions were modified somewhat when 
combined for the attention-distraction condition, and 
their order was reversed for the distraction-attention
condition. The instructions were as follows: 


The time your hand will be in the colder water 
is divided into two parts. In the first part, we
would like you to pay close attention to the sensa- 
tion in your right hand so that you will be able 
to describe each of the specific feelings and sensa- 
tions that you experienced. You will be asked 
many questions about the different sensations in 
your right hand afterwards. In this part try to 
ignore everything else and pay close attention to 
the sensation in your right hand. 


In the second part, you will see landscape and art
slides. In this part we would like you to pay at-
tention only to these slides, so you will be able to
answer questions about them afterwards. Try to 
ignore everything else and pay close attention to 
the slides. It is very important that you follow 
the instructions, so pay close attention to the sensa- 
tions in your right hand in the first part and try 
to ignore them and pay attention to the slides in 
the second part. 


Timing of procedures. Tape-recorded instructions
told the subject when to shift his hand from the
warm to the cold water tank and also told him when
to make the distress ratings. The subject made 12
judgments of distress during the 5 min. and 15 sec
that his hand was immersed in the cold water; the
distress scale was the same as that used in Experi-
ments 1 and 2.

The immersion in the cold water was divided into
two equal periods of 2 min. and 35 sec each, sepa-
rated by a 5-sec interval in which the instructions
appropriate for the last 2-min. and 35-sec period
were repeated. Because it was necessary to repeat the
instructions in the conditions in which subjects
switched the focus of their attention, the instructions
were also repeated in the attention-attention and
distraction-distraction conditions. Six distress judg-
ments were made in each 2-min. and 35-sec period.
The first was requested 15 sec after the start of each
period (15 sec after the instructions to immerse the
hand in the water or 15 sec after completion of the
between-periods instructions). The second judgment
followed 15 sec later. A 30-sec interval was used be-
tween the remaining four judgments. To further en-
sure that subjects followed the appropriate pro-
cedure, two brief reminders were given during each
period, one 45 sec after its start and the second a
minute later.
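
The timing just described can be assembled into a single timeline. The sketch below is illustrative only: the names `immersion_timeline`, `JUDGMENT_OFFSETS`, and `REMINDER_OFFSETS` are hypothetical, and it assumes that the second period begins as soon as the 5-sec between-periods interval ends.

```python
PERIOD_SEC = 2 * 60 + 35   # each period lasts 2 min. and 35 sec
GAP_SEC = 5                # between-periods interval for repeated instructions
TOTAL_SEC = 2 * PERIOD_SEC + GAP_SEC

# Offsets within a period, in seconds from the start of that period.
JUDGMENT_OFFSETS = [15, 30, 60, 90, 120, 150]  # 15, +15, then four 30-sec steps
REMINDER_OFFSETS = [45, 105]                   # 45 sec in, then a minute later


def immersion_timeline():
    """Return sorted (seconds_after_immersion, event_label) pairs."""
    events = []
    # Assumption: period 2 starts immediately after the 5-sec interval.
    for period, start in enumerate([0, PERIOD_SEC + GAP_SEC], start=1):
        events += [(start + t, f"judgment (period {period})") for t in JUDGMENT_OFFSETS]
        events += [(start + t, f"reminder (period {period})") for t in REMINDER_OFFSETS]
    return sorted(events)


timeline = immersion_timeline()
judgments = [t for t, label in timeline if label.startswith("judgment")]
print(TOTAL_SEC, judgments)
```

On this reading the two periods and the interval add up to the stated 5 min. and 15 sec, and the fourth judgment falls 90 sec after immersion.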

Two different sets of 24 slides were used during 
each of the distraction periods. The slides were ex- 
posed for 5 sec each, and the between-slides interval 
equaled the time taken for the slide to change. The 
screen was darkened during the first of the three 
repetitions of the instructions and for 5 sec during 
each of the six ratings. The subject was told to re- 
move his hand from the cold water and make four 
additional distress judgments at 30-sec intervals at 
the end of the second period. 

Postexperimental questionnaires. After the 16 
judgment trials, the thermistor was removed, and 
the subject completed two questionnaires. The first 
was the sensations and features checklist used in 
Experiment 2. The second questionnaire included 
modified versions of the direct questions asking for 
estimates of the deployment of attention and the 
questions about tiredness and tenseness. The atten-
tion questions had the subject estimate the percent-
age of time he attended to his hand, his body, the
slides, and other features of the setting. An 11-point 
scale was used for each of the four foci of attention 
(in 10% intervals from 0% to 100%). The subject 
was told that his four estimates had to sum to 100%.
The estimates were made twice, once for each of
the two periods of the cold pressor.
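
The constraints on the attention estimates (four foci, 10% steps from 0% to 100%, summing to 100%) can be expressed as a simple validity check. This is a hypothetical sketch for illustration; the function name `valid_estimates` and the focus labels are assumptions and do not come from the original questionnaire.

```python
FOCI = ("hand", "body", "slides", "other")  # hypothetical labels for the four foci
ALLOWED = range(0, 101, 10)                 # 11-point scale in 10% steps


def valid_estimates(estimates):
    """Check one subject's four percentage estimates for a single period."""
    return (
        set(estimates) == set(FOCI)
        and all(value in ALLOWED for value in estimates.values())
        and sum(estimates.values()) == 100
    )


example = {"hand": 70, "body": 10, "slides": 10, "other": 10}
print(valid_estimates(example))
```

A response set such as 80/10/10/10 would be rejected because it sums to 110%, and 75/5/10/10 because two values fall between scale points.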


Results 
The data for the four conditions were ana-
lyzed as a 2 X 2 factorial design with a third
repeated measure factor. The first factor was
the attentional set (attention vs. distraction)
during the first period of the experiment, the
second factor was the attentional set (atten-
tion vs. distraction) during the second pe-
riod, and the repeated measure factor was
judgment trials.
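
The mapping from the four named conditions onto the two between-subjects factors can be made explicit. The sketch below is only an illustration of the factor coding; the names `design_cells` and `conditions_at` are hypothetical.

```python
from itertools import product

LEVELS = ("attention", "distraction")


def design_cells():
    """Return the four conditions keyed by name, with their level on each factor."""
    return {
        f"{first}-{second}": {"first_period": first, "second_period": second}
        for first, second in product(LEVELS, repeat=2)
    }


def conditions_at(factor, level):
    """List the conditions that share a given level of one factor."""
    return sorted(
        name for name, cell in design_cells().items() if cell[factor] == level
    )


print(conditions_at("first_period", "attention"))
print(conditions_at("second_period", "attention"))
```

A main effect of the first factor therefore compares the attention-attention and attention-distraction groups, taken together, with the two groups that were distracted during the first period.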

As can be seen in Figure 5, distress showed a
sharp initial increase for the first minute of
immersion, peaked at approximately 1½ min.
into the procedure, and then declined, though
differentially, in all groups. Figure 5 also
shows that the distress levels declined consider-
ably more in the attention-attention and atten-
tion-distraction conditions than in the dis-
traction-distraction and distraction-attention
conditions. The difference in rate of decline in
reported distress resulted in a significant main
effect, with the two groups paying attention
to the hand in the first period reporting an
average level of distress (M = 59.04) sub-
stantially lower than the average level of dis-
tress reported for the two groups that were
distracted by slides during the first immer-
sion period (M = 68.18), F(1, 60) = 4.61, p
< .05. More important, however, there was
an interaction between trials and the atten-
tional set adopted during the first 2-min. and
35-sec period, F(11, 660) = 5.15, p < .001.
The level of distress reported during the sec-
ond period of cold pressor was substantially
lower for the two groups attending to the hand
during the first period than was the level of
distress for the two groups watching slides
during the first period. There were no sig-
nificant effects as a function of attention or
distraction during the second period of the
cold pressor.

Figure 5. Distress ratings during cold pressor immersion for the four experimental groups in Ex-
periment 3. (DD = distraction-distraction; DA = distraction-attention; AA = attention-attention;
and AD = attention-distraction. n = 16 in each group.)

A pair of subsidiary analyses were con- 
ducted, one on the six ratings before the 
change in instructions from attention to dis- 
traction and another on the six ratings made 
after the change. The analysis of the first pe-
riod showed an interaction between atten-
tional set and trials, F(5, 300) = 2.87, p
< .05, because the groups, although virtually
identical for judgments 1-3, began to separate
by the fourth judgment. The peak for the two
early distraction groups (M = 77.0) was only
slightly greater, however, than the peak for
the two early attention groups (M = 71.8).

In the second set of six judgments, the
drop in distress was significantly greater for
the groups that were instructed to pay at-
tention to the hand during the first period
(the attention-attention and attention-dis-
traction groups) than for the groups that
were instructed to watch slides during the
first period (the distraction-distraction and
distraction-attention conditions), interaction
F(5, 300) = 2.47, p < .05. The difference
between the two pairs was sufficiently great
to produce a significant main effect as well,
F(1, 60) = 8.46, p < .01. The analyses sup-
port the pattern visible in Figure 5 and sug-
gest that early attention to the hand began
to affect distress reports 1½ min. after im-
mersion in the cold water and that these ef-
fects lasted and became still stronger in the
second period of exposure. (Once again, trial
effects were highly significant.)
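
The degrees of freedom reported for these tests follow from the standard formulas for a design with between-subjects groups and a repeated trials factor. The sketch below is a consistency check under the assumption of complete data for every subject; the function name `mixed_design_df` is hypothetical.

```python
def mixed_design_df(n_subjects, n_groups, n_trials):
    """Error and trials degrees of freedom for a groups-by-trials mixed design."""
    between_error = n_subjects - n_groups   # subjects within groups
    trials = n_trials - 1                   # repeated measure factor
    within_error = between_error * trials   # trials x subjects within groups
    return {
        "between_error": between_error,
        "trials": trials,
        "within_error": within_error,
    }


# Experiment 3: 64 subjects in 4 groups, 12 immersion judgments,
# then each 6-judgment period analyzed separately.
full = mixed_design_df(64, 4, 12)
half = mixed_design_df(64, 4, 6)
print(full, half)
```

With 64 subjects in four groups the between-subjects error term has 60 degrees of freedom, and the trials interactions for a two-level factor have (11, 660) and (5, 300) degrees of freedom for 12 and 6 judgments, respectively, matching the reported values.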


Finger Temperature 


Finger temperature dropped during the 
early part of immersion and rose during the 
postimmersion period for all conditions, but 
there were no treatment effects. 


Postexperimental Questionnaire Reactions 


Focus of attention. Subjects instructed to 
attend to their hand indicated that they spent 
a substantially higher proportion of their time 
monitoring their hand than did subjects not 
so instructed, in both the first (M attend = 
76.5%, M distract = 50.3%), F(1, 59) = 
50.75, p < .001, and the second period 
(M attend = 79.6%, M distract = 42.9%), 
F(1, 59) = 76.05, p < .001, of cold pressor. 
There were no significant differences for at- 
tention to body reactions or to other events 
in the setting. 

The sensation and features checklist showed 
that more attention was paid to the hand 
(sum of eight items) during the first (M =
73.7) than during the second (M = 58.7)
period of the cold pressor, F(1, 59) = 17.12,
p < .001, whereas the sum of the 11 body
sensation items showed less awareness of 
body responses in the first (M = 51.7) than 
in the second (M = 66.2) cold pressor period, 
F(1, 59) = 21.97, p < .001. These shifts in 
awareness from hand sensations in the first 
half of the cold pressor to body sensations in 
the second half held for all groups. 

Feeling of fatigue. The same three-part 
question used in Experiment 2, “Toward the 
end of the time your hand was in the water, 
did you feel (a) tired, (b) tense and jumpy, 
(c) detached and daydreamy?” showed a 
significant effect for the tiredness rating (made 
on a 21-point scale); subjects who were in- 
structed to attend to their hand in the first 
half of the cold pressor exposure reported 
being less tired (M = 4.30) than did subjects 
who were distracted by slides during the 
early phase (M = 6.14), F(1, 58) = [illegible], p
< .05. There were no other significant effects.


Discussion


The results of Experiment 3 confirm and 
extend those of Experiment 2. First, atten- 
tion to the site of cold pressor impact pro- 
duced substantial reductions in reported dis- 
tress. Second, it was clear that attention dur- 
ing the first portion of exposure was critical 
for later distress reduction; this finding held 
for reported distress and for reported fatigue 
on the postexperimental questionnaire. The 
introduction of the distractor slides facilitated 
the deliberate direction of attention and the 
ease with which subjects could report on its
deployment, making clear that subjects gave 
a higher proportion of attention to the im- 
pact site during this early period. And the at- 
tentional effects seem to have begun very 
early, taking hold by the end of the fourth 
distress judgment or about 1½ min. into the
cold pressor test. 

Third, there were no signs of decline in re- 
ported distress or reported fatigue for sub- 
jects who monitored their hand during the 
second half of exposure; the reports of these 
subjects did not differ from those of subjects 
who were continually distracted with slides. 
It seems that attention during the latter part 
of exposure is relatively ineffective for dis- 
tress reduction. While there are many reasons 
why this may be so, the present data and 
past observations point to one conclusion in 
particular. First, it is clear that distress is
already at very high levels when atten-
tion is first deployed to the hand in the dis-
traction-attention group. Second, it is clear
that the experience of the cold pressor changes 
over time; the sensation checklist showed 
declining awareness of hand sensations and 
growing awareness of body sensations from 
the first to the second half of cold pressor 
immersion. Wolf and Hardy (1943) have de- 
tailed the many complex changes in the ex- 
perience of cold pressor from the coldness 
and bright, pricking, pain sensations early in 
exposure to the deeper aching and numbness 
experienced later on, changes which have been 
attributed to activity in different fiber sys- 
tems (Mountcastle, 1968). The earlier ex- 
periences are far more localized and informa- 
tional in nature than the diffuse, unchanging 
aching sensations that begin to dominate the
later pain experience. These latter sensations
seem to be of the type that are more easily
summated with other sources of input than
are the clear, discrete, and localized sensations
of early pain experience (Melzack, 1973). In
conclusion, it would appear that subjects at-
tending to the impact site in the latter half
of exposure would have relatively less infor-
mation available for the construction of a
schema of the stimulus input and relatively
more information (diffuse aching) for the
formation of a generalized state of arousal
and distress. The pattern of findings tends to
confirm the hypothesis that monitoring dur-
ing the onset of stimulation, when features
are rich and informative and diffuse, aching
distress is relatively low, is optimal for the
formation of a stimulus schema and the ha-
bituation (or cessation of recruitment) of
distress emotion (Sokolov, 1963).


General Discussion 


. . . process is empirically similar to that reported in
habituation studies in animals (Groves &
Thompson, 1970), though the data from the
present study point to a mechanism similar
to that suggested by Sokolov (1963). It should
be understood that the findings do not rule
out alternative modes of distress reduction.
Analgesic sets, under hypnotic ([illegible],
1969, 1971) or nonhypnotic conditions (Bar-
ber & Hahn, 1962; Melzack, Weisz, & Sprague,
1963; Spanos, Horton, & Chaves, 1975), may
also reduce distress to noxious inputs, but the
empirical findings in these areas (immediate
distress reduction, successful reduction for
relatively short periods of time) suggest a
different mechanism, namely, the blockage of
sensations and distress from conscious aware-
ness, rather than the habituation of distress
per se (Hilgard, 1973; Hilgard, Morgan, &
Macdonald, 1975). And it should also be clear
that monitoring stimulus inputs does not nec-
essarily lead to distress reduction. If the sub-
ject is threatened and regards the stimulus
sensations as warnings of further harm and
danger, monitoring may enhance emotional
distress rather than facilitate emotional ha-
bituation. Monitoring may be essential for the
occurrence of habituation, but it is not suffi-
cient.

Our interpretation of the cold pressor ef-
fects assumes, therefore, two quite different
ways of coding or schematizing noxious stim-
ulus inputs: an objective informational cod-
ing in which cues are coded for their concrete,
immediate meaning and an emotional mem-
ory schematization in which cues are coded
for their anticipatory, threat value (Engel,
1959; Leventhal, in press; Leventhal & Ever-
hart, in press). The relative strength of these
two types of coding will obviously vary as a
function of set and subject disposition. The
theoretical interpretation is the same as that
offered several years ago for the “new look”
studies of perception (Bruner, 1957) and in
more recent theorizing respecting the role
of categorization in awareness (Broadbent,
1977).

While the approach is not novel, it does 
allow the synthesis of a wide range of data. 
For example, Morgan and Pollock (1977) 
studied world-class runners and found that 
elite runners carefully monitor leg and mus-
cular sensations while performing. Outstand-
ing but nonelite runners distract from these
sensations; they regard them as threat cues
or signs of an anticipated wall or limit to
their endurance. Thus, monitoring per se is
not the critical factor in the runner’s control
of distress; it is the schematization of the
cues that is central. Similarly, cases of phan-
tom pain appear to involve a pain schema of
body parts and a vivid central neural repre-
sentation of the somesthetic and distress re-
sponses associated with preoperative injuries
to amputated body parts (Melzack, 1971;
Morgenstern, 1970; Simmel, 1962). Case ex-
amples illustrate that these painful experi-
ences may vanish if the patient is exposed, by
means of epinephrine injection to nerve roots,
to a rich panoply of concrete sensations from
the missing limb without pain inputs; moni-
toring the fresh, nonpainful set of stimulus
features allows the reassertion of the neutral,
nonpain schema of the missing body part.

It also seems that under some conditions, 
emotional schemata (pain memories) are more 
available than so-called objective codes. In- 
deed, the availability of these schemata in 
real-life situations is very likely responsible 
for the differences in experimental findings 
between laboratory and clinical studies of 
pain (Beecher, 1959). We have been im- 
pressed with the reliability with which one 
can find distress reduction by using monitor- 
ing strategies in hospital settings. The patient 
is in a threatening situation, pain and emo- 
tional schematization are highly likely, and 
distress reduction is readily achieved by com- 
binations of monitoring strategies (sensation 
information) and reassurance that the pro- 
cedures will help (Johnson et al., 1975; John- 
son & Leventhal, 1974; Johnson et al., 1973; 
Johnson et al., Note 1; Fuller, Endress, & 
Johnson, Note 2; Wilson, Note 3). 

The field studies also make clear that cop- 
ing or control is an important factor in the 
distress-reduction process. The availability of 
a coping reaction may not only substitute in- 
strumental responding for emotional respond- 
ing but will also help sustain attention to the 
objective features of repeated, noxious stimu- 
lus inputs, facilitate placing benign interpreta- 
tions on these inputs, and permit habituation 
(see Johnson & Leventhal, 1974; Johnson et 
al., Note 1). For example, a postsurgical 
cardiac by-pass patient must engage in 
stretching and breathing exercises that will 
generate intense sensations and distress from 
the chest. Knowing what the sensations will 
be, carefully monitoring them as he or she 
breathes, and interpreting them as signs of 
healing and return to normal function will 
likely encourage the patient’s objective 
schematization of the cues and the habitua- 
tion of distress reactions. By contrast, an un- 
prepared patient who does not expect the 
signs and has no way of interpreting them as 
benign might inhibit coping reactions in order
to minimize and distract him- or herself from
pain and distress and fears of potential injury.

A complete model of the stress control
process will involve, therefore, a series of 
steps going from the representation of stimu- 
lus features to their coding in objective or 
emotional memory schemata, the execution 
of coping reactions guided by these cues, and 
the interpretation or evaluation of feedback 
from the coping behaviors (Lazarus, 1966). 
The three experiments reported here ad- 
dressed only part of the problem. It is also 
clear that we have merely touched on the 
process of distress habituation. Habituation 
may be a passive dropping out of responding 
(e.g., Groves & Thompson, 1970) or a product 
of one or more active, inhibitory processes 
related to gating of noxious stimulus impulses 
by informational factors (Melzack & Wall, 
1965, 1970) or by inhibition generated by 
endorphins in the medial forebrain system 
(Snyder, 1977). Interesting questions remain 
respecting the mechanisms underlying the 
behavioral effects reported here, and these 
mechanisms may suggest new psychological 
manipulations for pain control.

In three experiments, subjects listened to recordings of male speakers answering
two interview questions and rated the speakers on a variety of scales. The re-
cordings had been altered so that the pitch of the speakers’ voices was raised or
lowered by 20% or left at its normal level, and speech rate was expanded or
compressed by 30% or left at its normal rate. The results provided clear evi-
dence that listeners use these acoustic properties in making personal attributions
to speakers. Speakers with high-pitched voices were judged less truthful, less 
emphatic, less “potent” (smaller, thinner, faster), and more nervous. Slow-talk- 
ing speakers were judged less truthful, less fluent, and less persuasive and were 
seen as more “passive” (slower, colder, passive, weaker) but more “potent.” 
However, the effects of the acoustic manipulations on personal attributions also 
depended on the particular question that elicited the response. 


Human speech provides a listener with at
least two sources of information: a verbal
channel, encoding the message’s linguistic con-
tent, and a vocal channel, conveying paralin-
guistic information by variations in pitch,
speech rate, loudness, and the like.

One important type of information com-
municated via the vocal channel concerns a
speaker’s affective state. The vocal character-
istics associated with the expression of emo-
tion are beginning to be understood (see, for
example, Fairbanks, 1940; Hecker, Stevens,
von Bismarck, & Williams, 1968; Williams &
Stevens, 1969). It is now well
established that stressful situations raise the
average fundamental frequency (the number
of glottal pulses per second) and that “ac-
tive” emotions such as anger and fear tend
to be reflected in increased mean pitch and
pitch variance, whereas “low energy” states
such as sorrow and indifference are associated 
with a lower mean pitch and a slower speech 
rate. 

Given that emotional states do differ re- 
liably in their paralinguistic expression, to 
what extent do listeners use these vocal cues 
in judging the immediate affective state or 
more enduring personality traits of a speaker? 
An early investigation of the noncontent as- 
pects of speech by Allport and Cantril (1934) 
demonstrated that listeners could judge, at 
better than chance levels, a speaker’s age and 
at least some personality characteristics from
voice alone: In four of six experiments, speak-
ers’ scores on a test of ascendance-submis-
sion (Allport’s A-S reaction study) were
judged with significant accuracy. However,
reviewing much of the voice-attribution work,
Kramer (1963) concluded that more com-
mon than accuracy in such judgment studies
was the finding of “vocal stereotypes.” That
is, certain voices were reliably, though some-
times incorrectly, judged as belonging to cer-
tain personality types.

Unfortunately, as Kramer noted, the de-
scription of the vocal parameters that under-
lie such stereotypes leaves much to be desired.
While not all the earlier research on voice 
attributions can be faulted on these grounds, 
studies attempting to specify critical stimu- 
lus dimensions often have neglected or con- 
founded important paralinguistic cues. For 
example, in reporting two recent field experi- 
ments, Miller, Maruyama, Beaber, and Va- 
lone (1976) concluded that the persuasive- 
ness of a communication can be directly re- 
lated to the rate at which it is delivered. More 
rapid speech was found to be more persuasive, 
presumably because a fast talker is viewed as 
more credible. Although Miller et al. have 
ruled out certain alternative explanations
(such as the effect of having limited oppor-
tunity to counterargue against a rapid pre-
sentation), the stimulus materials used in
their experiments warrant closer examina-
tion. In both experiments the persuasive mes-
sages were recorded by the same speaker at
either a slow or a fast speech rate. This was
accomplished “by simply instructing the
speaker to practice delivering the same speech
as rapidly and slowly as possible while con-
trolling his level of enthusiasm and involve-
ment” (Miller et al., 1976, p. 618). The stim-
ulus recordings in the two conditions do, of
course, differ in speech rate, but it is quite
likely that they differ in other respects as well.
In natural speech, such vocal parameters as
amplitude, pitch, and rate tend to covary.
For example, rapid speech is likely to be
louder and higher pitched than normal speech
(Black, 1961). Consequently, it is quite pos-
sible that subjects in the Miller et al. study
were responding to pitch and/or loudness cues 
as well as to rate. 

Because it is so difficult to assess, in a con- 
trolled way, the contribution of various vocal 
parameters to the attribution process using 
natural speech, a number of workers have at- 
tempted to deal with the problem by using 
nonspeech stimuli. For example, Scherer 
(1974) presented listeners with simple tone 
sequences generated on a Moog synthesizer, 
in which a minimal set of acoustic cues 
(pitch level and variation, amplitude level 
and variation, and tempo) were varied fac- 
torially; listeners rated the emotional quality 
of these tone sequences on a set of semantic
differential scales. Scherer found that judg-
ments were most influenced by tempo and
pitch variations: Fast tempo led to attribu-
tions of highly active and potent emotions
(e.g., interest, anger, and happiness) and
slow tempo to attributions of sadness, disgust,
and boredom; extreme pitch variation with
rising contours produced ratings of highly
pleasant, active, and potent emotions (hap-
piness, surprise, interest). Other parameters
influenced the perception of specific emotional
qualities. While Scherer’s method effectively
deals with the confounding of vocal qualities
in natural speech and suggests the effect that
independently varied acoustic cues can exert,
in view of the artificial nature of the stimuli
used, it is not clear how directly the results
can be generalized to the processing of speech.

Other investigators, using natural speech,
have achieved a degree of control over stimu-
lus materials by means of editing techniques.
For example, investigators have added or
removed such speech disfluencies as filled
pauses and repetitions from stimulus audio-
tapes and asked listeners to judge speakers’
credibility and other personal attributes (Lay
& Burron, 1968; Miller & Hewgill, 1964).
Typically, highly hesitant or disfluent speak-
ers are assigned relatively undesirable per-
sonality traits, and their communications are
judged to be low in credibility.

While the methods thus far described have
all afforded some degree of stimulus control,
recent developments in speech synthesis tech-
nology permit investigators to vary indepen-
dently one or more parameters of natural
speech. This approach has been exploited by
Brown and his colleagues (Brown, Strong, &
Rencher, 1973, 1974; Smith, Brown, Strong,
& Rencher, 1975), who have manipulated
speech rate, mean fundamental frequency,
and fundamental frequency variance, via
a computer-based analysis-synthesis system.
Two major personality dimensions
(“competence” and “benevolence”) have
emerged from factor analyses of judgments
of such manipulated speech. Generally speak-
ing, higher pitch seems to result in a person
being judged less competent and less benev-
olent (Brown et al., 1974), whereas faster
speech rates produce judgments of higher
competence but yield an inverted-U rela-
tionship on the benevolence dimension (Smith
et al., 1975).

The studies of Brown and his associates 
make an important contribution to our under- 
standing of the significance that listeners 
ascribe to vocal qualities. Nevertheless, they 
are not free of methodological problems. 
They typically have relied on a small number 
of stimulus voices, all of whose acoustic per-
mutations are presented to the same raters.
For example, the Brown et al. (1974) study
used only two adult male speakers uttering
the same sentence (“We were away a year
ago”), which was then manipulated to fill
the 27 cells of a 3 X 3 X 3 factorial design.
It is not clear that judgments based on hear-
ing 54 repetitions of the same content bear
much similarity to the kinds of everyday per-
sonality ascriptions listeners make under more
natural conditions. Additionally, the particu-
lar parametric values used in these studies
are problematic. Brown et al. (1974) reported
that in the high-pitch condition, the original
fundamental frequency was multiplied by a
factor of 1.8. Although they did not report
average fundamental frequency values for
their two speakers, assuming that they fell
into the normal range for males, Brown et
al.'s high-pitch manipulation would have
raised a male voice into the female range.
Similar criticisms apply to their choice of
speech-rate scale factors.

The present article reports an exploratory
series of experiments designed to demonstrate
the effects of two acoustic parameters (aver-
age fundamental frequency and speech rate)
on judgments of several state and trait vari-
ables. It was hoped that by using a large
number of speakers and naturally produced
utterances, together with more conservative
values for acoustic alteration, and by inde-
pendently manipulating the two parameters,
we could overcome the methodo-
logical problems noted above.


Experiment 1 


Experiment 1 derives from a finding re-
ported by Streeter, Krauss, Geller, Olson, and
Apple (1977), who demonstrated that the
mean fundamental frequency of speakers’
voices increased when they lied, relative to 
a truth-telling baseline, and that this effect 
was stronger when the speaker had been mo- 
tivated to lie effectively. In addition, Streeter 
et al. found that although judges’ ratings of 
truthfulness were essentially uncorrelated 
with pitch level, there was a significant nega- 
tive correlation between judged truthfulness 
and average pitch for listeners who heard the 
speech after it had been passed through a con-
tent filter, a device that destroys intelligibil-
ity without affecting vocal features of the
utterances. Apparently, the natural pitch in-
crements during lying (on the average about
3 Hz) were too small to affect the judgments 
of listeners who had content available but 
were taken into account by listeners who 
could not understand the responses’ verbal 
content. However, since pitch increments and 
speech rate decrements were correlated in 
their data, Streeter et al. could not rule out 
the possibility that listeners were attending 
to rate differences rather than pitch differences 
between true and false utterances. From the 
finding of Miller et al. (1976), one would 
expect slower speech to result in lower per- 
ceived speaker credibility. In Experiment 1 
we assessed the effects of these two variables
on truthfulness judgments.


Method 


Stimulus materials. Forty male Columbia Col-
lege undergraduates, all native speakers of English,
answered questions for use as stimulus materials. They
were individually recorded in a sound-isolated booth
using high-quality audio equipment, and they received
[...] for their participation. Each speaker answered
six standard questions dealing with [...]. The questions were
[...] in a fixed order, and speakers were in-
structed to answer all questions honestly and frankly.
They were to [...] more than a yes-or-no response. [...]
concerning how their interviews might be
used and signed informed-consent releases permitting
[...] of the recordings.

Two interview questions were selected [...].
One asked the subject’s [...] college admissions quotas designed to
favor minority groups (Question 3); the other asked
[...]. These two questions [...]
a diversity of responses [...]
on a dimension of personal salience for the respon-
dents. (The question of quotas is a matter of gen-
uine concern and frequent discussion among Co-
lumbia undergraduates.) Twenty-seven speakers, who
gave responses that were not excessively long and 
represented the spectrum of opinions on the two 
questions, were selected. 

Each of the 27 speakers was then randomly as- 
signed to one of nine cells of a 3 (rate: slow, un- 
manipulated, fast) X 3 (pitch: low, unmanipulated, 
high) completely crossed factorial design, with three 
speakers in each cell. All speech material was digitized 
and analyzed by the linear predictive coding (LPC) 
method of Atal and Hanauer (1971). (The LPC 
analysis calculates 14 parameters every 10 msec. 
Twelve of these parameters represent pseudoarea 
functions of the vocal tract; the other two are 
values for amplitude and fundamental frequency. 
The advantage of LPC analysis is that such vari-
ables as rate, pitch, and amplitude can be manipu-
lated without changing other voice parameters. In
addition, since the parameters are derived from
the original speech, the quality of the resulting syn-
thesis is high.) The two responses of each speaker
were manipulated on a DDP-224 computer using an
interactive program (Nakatani, Note 1) that displays
the parameters as functions of time and allows the
user to manipulate any or all of the 14 parameters
as well as to linearly expand or compress the
time base. The altered utterances can then be syn-
thesized and recorded on audiotape.1
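
The random assignment of the 27 speakers to the nine cells of the Rate X Pitch design, three speakers per cell, can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function name `assign_speakers`, the integer speaker IDs, and the seed are hypothetical, and nothing is known about the randomization procedure the authors actually used.

```python
import random
from itertools import product

# Level labels follow the design described in the text.
RATE_LEVELS = ("slow", "unmanipulated", "fast")
PITCH_LEVELS = ("low", "unmanipulated", "high")


def assign_speakers(speaker_ids, per_cell=3, seed=None):
    """Randomly assign speakers to the cells of a crossed Rate x Pitch design.

    Hypothetical helper: returns a dict mapping (rate, pitch) to a list of
    speaker IDs, with exactly `per_cell` speakers in every cell.
    """
    cells = list(product(RATE_LEVELS, PITCH_LEVELS))
    speakers = list(speaker_ids)
    if len(speakers) != per_cell * len(cells):
        raise ValueError("need exactly per_cell speakers for every cell")
    random.Random(seed).shuffle(speakers)
    return {
        cell: speakers[i * per_cell:(i + 1) * per_cell]
        for i, cell in enumerate(cells)
    }


# Speaker IDs 1..27 are placeholders for the 27 selected speakers.
assignment = assign_speakers(range(1, 28), seed=0)
for (rate, pitch), members in assignment.items():
    print(f"rate={rate:<13} pitch={pitch:<13} speakers={members}")
```

Because each speaker appears in only one cell, speakers are nested within the rate and pitch factors, which is why the later analyses treat speakers as a random effect.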

The scale factors chosen for the low- and high-
pitch manipulations were 80% and 120%, respec-
tively, of the speaker’s unmanipulated fundamental
frequency. The values chosen for the slow and fast
speech rate manipulations were 130% expansion
and 70% compression, respectively, of the utterance’s
time base, resulting in speech rates that were 77%
and 143% of the unmanipulated rates. These scale
values were chosen because the resulting speech still
sounded natural and the acoustic properties remained
more or less within the normal range of values. (See
Hanley, 1951; Mysak, 1959; Peterson & Barney,
1952; and Terango, 1966, for data on pitch and
Goldman-Eisler, 1968, for normative speech rate
data.) The mean premanipulation fundamental fre-
quency and speech rate for our speakers were 109.79
Hz (SD = 14.82) and 3.27 syllables/sec (SD = .66).
Following manipulation, all utterances were resyn-
thesized to produce a stimulus audiotape of the
answers to the quota question (in one random
order) and a tape of the answers to the money
question (in a different order). The tapes were for-
matted to allow 15 sec of silence between each re-
sponse.
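
The arithmetic behind these scale factors can be made concrete with a small worked sketch. It assumes only that pitch is multiplied directly by its scale factor and that speech rate varies inversely with the time-base factor (n syllables in k·s seconds gives a rate of n/(k·s)); the function names are illustrative, and the group means are used purely as sample inputs.

```python
def scaled_pitch(f0_hz, pitch_factor):
    """Fundamental frequency after multiplying by the pitch scale factor."""
    return f0_hz * pitch_factor


def scaled_rate(rate_syll_per_sec, time_base_factor):
    """Speech rate after the time base is multiplied by time_base_factor.

    Compressing the time base (factor < 1) speeds speech up; expanding it
    (factor > 1) slows speech down.
    """
    return rate_syll_per_sec / time_base_factor


def relative_rate_percent(time_base_factor):
    """Resulting rate as a percentage of the unmanipulated rate."""
    return 100.0 / time_base_factor


# Mean premanipulation values reported in the text, used as sample inputs.
MEAN_F0_HZ = 109.79
MEAN_RATE = 3.27

print("low pitch :", round(scaled_pitch(MEAN_F0_HZ, 0.80), 1), "Hz")
print("high pitch:", round(scaled_pitch(MEAN_F0_HZ, 1.20), 1), "Hz")
print("fast rate :", round(scaled_rate(MEAN_RATE, 0.70), 2), "syllables/sec")
print("slow rate :", round(scaled_rate(MEAN_RATE, 1.30), 2), "syllables/sec")
print("70% time base  ->", round(relative_rate_percent(0.70)), "% of original rate")
print("130% time base ->", round(relative_rate_percent(1.30)), "% of original rate")
```

The inverse relation is the reason a 30% change in the time base does not produce a symmetric 30% change in rate.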

Procedure. Twenty undergraduates, 14 males and
6 females, were paid for their participation.2 Raters
were run in groups of three to seven. They were told
that the purpose of the study was to determine how
well people can tell, from the sound of the speaker’s
voice, whether someone is lying or telling the truth.
They were informed that approximately half the
speakers had been instructed to tell the truth, whereas
the remaining half had been told to lie, that is, to
give answers that did not correspond to their actual
beliefs or feelings.

The two stimulus tapes were played over high-
quality audio equipment with the order of tape
presentation counterbalanced. A 500-Hz warning tone
preceded and followed each answer by 1 sec. Raters
were given 15 sec in which to rate the truthfulness
of the answer on a 7-point scale ranging from “not
at all truthful” (1) to “entirely truthful” (7).

Because of variations in the quality of the record-
ings, raters were also told that recordings had been
made using a variety of equipment in several differ-
ent environments. The experimental session lasted
about an hour, and there was a 5-min rest period
between tapes. Participants in the three rating studies
were sent a report describing the purpose and detail-
ing some of the findings of the study several weeks
after its conclusion.


Results 


Some of the premanipulation characteristics
of the stimulus materials are summarized in
Table 1. Note that the more involving topic,
college admissions quotas, resulted in signifi-
cantly longer, slower, and higher pitched re-
sponses. Since the quota topic always preceded
the money topic in the original order, we
cannot rule out the possibility that these dif-
ferences are due to serial position effects.


1 The rate manipulation algorithm linearly ex-
panded or compressed consonant and vowel informa-
tion alike. In natural speech, rate variation tends to
be reflected more in changes of vowel duration than
in consonant duration. However, the rate manipula-
tion used resulted in reasonably natural-sounding
speech.
It may not be immediately apparent why 70%
compression of an utterance’s time base results in
a speech rate that is 143% of the original. Con-
sider a response consisting of n syllables spoken in s
seconds of time; that utterance’s rate would be n/s
syllables/sec. Seventy-percent compression changes
the rate to n/(.7s), or 1.43 n/s, a faster speech rate
that is 143% of the original. Likewise, a 130% ex-
pansion of an utterance’s time base results in a
slower speech rate that is 77% of the original.
2 One additional rater was run, but his data were
discarded after he expressed some suspicion regard-
ing possible splicing of the tapes. No other raters
voiced any suspicion that the stimuli had been
altered.
3 There is some quality variation across speakers in
the LPC synthesis; in particular, speakers who tend
to mumble and have a high degree of nasality
seem to suffer the greatest quality degradation.

Table 1
Premanipulation Characteristics of Stimulus Tapes

                                Quota tape       Money tape
Measure                         M       SD       M       SD       t
Response length (syllables)   111.1    40.7     84.6    42.7    3.89**
Response time (sec)            36.8    13.8     25.6    13.2    4.68***
Speech rate (syllables/sec)     3.0     .50      3.4     .74    2.12*
Fundamental frequency (Hz)    112.2    14.9    107.4    14.6    5.60***

Note. N = 27 segments per tape.
* p < .05, two-tailed. ** p < .01, two-tailed. *** p < .001, two-tailed.


However, it seems more plausible to regard
these differences as reflections of differences in
our subjects’ personal involvement with the
two questions and/or the cognitive complex-
ity of the answers they called for (see Gold-
man-Eisler, 1968, and Williams & Stevens,
1969).

To check the adequacy of the random as-
signment, premanipulation speech rate (syl-
lables/sec) and average pitch were subjected
to analyses of variance (3 rate levels X 3
pitch levels X 2 questions) with speakers
nested within the rate and pitch factors. In
addition to the between-questions differences
already noted, the analysis of the premanip-
ulation pitch failed to reveal other significant
effects. However, there was a marginally sig-
nificant difference in premanipulation speech
rate across the three assigned rate conditions,
F(2, 18) = 3.00, p < .08, primarily because
of some slower speakers’ random assignment
to the slow rate condition. Therefore, prior
to all further analyses, we adjusted the raw
data for covariation on speakers’ premanipu-
lation speech rate and fundamental frequency.
The appropriate beta weights were derived
from linear regression of the data collapsed
across raters. This adjustment has the virtue
of controlling for spurious rate and question
effects in the analyses of variance. Having
thus transformed each dependent variable, we
computed min F′ ratios (and their approxi-
mate degrees of freedom) for the analyses of
variance. The F′ statistic we report (Winer,
1971, pp. 375-378) treats both speakers and
raters as random effects, permitting simul-
taneous generalization over both groups.

A 3 (pitch levels) X 3 (rate levels) X 2
(questions) analysis of variance was per-
formed on the mean truthfulness ratings with
repeated measures on the last factor. Pre-
manipulation speech rate and premanipula-
tion fundamental frequency were used as co-
variates, and all means reported below have
been adjusted for these covariates. A signifi-
cant main effect was found for the pitch ma-
nipulation, F′(3, 34) = 3.37, p < .05, and a
marginally significant effect for rate, F′(3, 51)
= 2.26, p < .10. The mean truthfulness rat-
ings for the three pitch conditions (going
from low to high pitch) were 4.62, 4.40, and
4.09, respectively, indicating that lower pitch
enhanced credibility. The corresponding means
for the rate manipulation were, going from
slow to fast rate, 4.10, 4.63, and 4.37; the
unmanipulated rate was judged most credible
and the slow rate least credible.


No significant effect was found for the

4 To ensure that our subjects’ ratings were not
affected by variations in acoustic quality across our
nine experimental conditions, we had six undergrad-
uates rate the 54 recorded segments for intelligibility
and correlated these ratings with our subjects’ ratings
of truthfulness. The two sets of ratings were not
significantly correlated (r = .15). Similarly, to ensure
that response content was well distributed across
conditions, we had 12 undergraduates rate from a
transcript how pro- or antiquota (for the quota
question) or generous or selfish (for the money
question) each response was. Their ratings were then
subjected to a 3 (pitch levels) X 3 (rate levels) anal-
ysis of variance. For neither question were significant
main effects or interactions found.

5 In cases where the min F′ value (F′) is marginal,
we will also report the conventional F ratios for
raters (considering speakers as a fixed effect) and
speakers (considering raters as a fixed effect). These
values represent less conservative tests.

Figure 1. Average rated truthfulness plotted as a
function of pitch condition (low, normal, high) for
each of the two question topics.


Pitch X Rate interaction. However, there was
a nearly significant Pitch X Question effect,
F′(2, 32) = 3.02, p < .10; raters’ F(2, 38)
= 10.25, p < .001; speakers’ F(2, 18) = 3.84,
p < .05. For the quota question, truthfulness
ratings and pitch were curvilinearly related;
for the money question, low-pitched speakers
were judged as most truthful. The two sets
of means are shown in Figure 1.6
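
For readers unfamiliar with the min F′ statistic, the sketch below shows one common way to obtain it from the two conventional F ratios: the by-raters F and the by-speakers F are combined as F1·F2/(F1 + F2), with an approximate denominator df. This is the textbook lower-bound formula, offered here as an assumption; the text does not spell out the exact computation, and the reported values were derived from covariate-adjusted data, so the ratio need not coincide with the published F′. The function name is hypothetical.

```python
def min_f_prime(f_raters, df_err_raters, f_speakers, df_err_speakers):
    """Return (min F', approximate denominator df).

    f_raters / f_speakers are the conventional F ratios that treat the other
    sampling factor as fixed; both share the same numerator df.
    """
    total = f_raters + f_speakers
    ratio = f_raters * f_speakers / total
    # Each squared F is divided by the error df of the *other* analysis.
    df_error = total ** 2 / (
        f_raters ** 2 / df_err_speakers + f_speakers ** 2 / df_err_raters
    )
    return ratio, df_error


# Inputs: the two conventional F ratios reported for the Pitch X Question effect.
value, df_error = min_f_prime(10.25, 38, 3.84, 18)
print(f"min F' = {value:.2f}, denominator df = {df_error:.1f}")
```

Because min F′ can never exceed the smaller of the two conventional F ratios, it is the more conservative test, which is consistent with an effect that is significant by raters and by speakers separately being only marginal on F′.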


Discussion 


The results demonstrate that the acoustic 
manipulations performed on the speech stimuli 
affected judgments of truthfulness. Consistent 
with the findings of Streeter et al. (1977), 
judges rated high-pitched voices as less truth- 
ful than lower pitched voices. Perhaps listen- 
ers perceived high pitch to be an indication of 
stress and attributed such stress to attempted 
deception. The pitch manipulations used, al- 
though not extreme enough to place voices 
outside the normal male pitch range, were 
evidently large enough to produce attribu- 
tions of lying from naive listeners. It will be 
recalled that the smaller, naturally occurring 
pitch increments accompanying deception in 
the Streeter et al. experiment did not evoke 
such attributions, except in the filtered listen-
ing condition, in which verbal content was un-
intelligible.

While the rate effect was only marginally
significant, it appears that the effect of rate
on truthfulness judgments is not linear;
rather, the pattern is an inverted-U function,
with slower speech perceived as least credible.
These results are consistent with the findings
of Miller et al. (1976), who demonstrated
that a faster speaker was perceived as more
intelligent, knowledgeable, and objective than
a slower speaker. Since Miller et al. used only
two levels of speech rate, it is, of course, not
possible to establish the effect of intermediate
rates on credibility judgments in their study.
It is relevant, however, that Brown and co-
workers have reported a similar inverted-U
relationship between manipulated speech rate
and their “benevolence” dimension, on which
the adjective pair sincere-insincere loads sig-
nificantly (see Smith, Brown, Strong, &
Rencher, 1975). To the extent that ratings
of sincerity correspond to this study’s truth-
fulness measure, the two results are consistent.

The most plausible explanation of the Ques-
tion X Pitch interaction (Figure 1) is that
listeners took question content into considera-
tion in making their truthfulness judgments.
When listening to a potentially “loaded” topic
(quota question), raters were willing to call
both low- and normal-pitched voices more
truthful than high-pitched voices. For the less
involving question (money), raters were will-
ing to call only low voices more truthful. The
interaction argues for a pitch threshold, above
which deception is signaled to raters. Such a
threshold would interact with response con-
tent, so that for an emotionally involving
topic, it would be set higher than for a less
involving topic. With an emotionally involv-
ing topic, some of the vocally reflected stress
can be attributed to the topic; given a non-
involving topic, the high-pitch responses are
likely to be attributed to attempted decep-
tion. However, since only two questions were
used, further research is needed to test this
content-attribution hypothesis.

6 The corresponding analysis performed on the in-
telligibility ratings revealed a significant effect
for the rate manipulation, with fastest speech show-
ing greatest loss in intelligibility.

The truthfulness ratings reflect judgment
processes that give rise to attributions of a
speaker’s transient state. If listeners had not
been told that certain speakers were lying,
differences in the acoustically manipulated 
variables might have been seen as enduring 
vocal properties reflecting stable personal pre- 
dispositions. Such properties have been re- 
ferred to by the linguist Trager (1958) as 
the “voice set” and involve “the physiologi- 
cal and physical peculiarities resulting in the 
patterned identification of individuals as . . . 
persons of a certain sex, age, state of health, 
body build, rhythm state” (p. 4). The voice 
set, therefore, acts as a relatively permanent 
background against which transient vocal 
changes are superimposed. In the absence of 
situational factors (e.g., the possibility that 
the speaker was lying) that could explain the 
voice qualities produced by our acoustic ma- 
nipulations, listeners would be likely to as- 
cribe such qualities to the voice set. How 
such variables affect person perception and 
contribute to vocal stereotypes was explored in 
Experiment 2. 


Experiment 2 
Method 


Stimulus materials. The quota and money tapes 
from Experiment 1 were used. 

Procedure. Eleven college students, nine males and
two females, were paid for their participation as
raters. The procedure was essentially the same as
that used in Experiment 1, with the following
differences: Raters were instructed that the study’s
purpose was to investigate how listeners form
impressions of speakers from the things they say as
well as from the way they say them. Accordingly,
raters were told to focus both on content and
delivery when making their judgments.

The speaker of each recorded segment was rated
on nine bipolar adjective pairs taken from the
semantic differential (Osgood, Suci, & Tannenbaum,
1957); scales were chosen that had high loadings on
one of Osgood et al.’s semantic space factors and
low loadings on the other two. The scales for
the evaluation factor were sour-sweet, awful-nice,
and bad-good. Scales for the potency factor were
thin-thick, small-large, and weak-strong. Those for
the activity factor were slow-fast, cold-hot, and
passive-active.

A sine tone followed each recorded segment,
signaling judges to begin making the nine ratings on
the scales. (The second adjective of each pair
was scored as 7.) Instructions stressed that ratings
should reflect the listener’s impression of the speaker
and not agreement or disagreement with the
content of the answer. Subjects were told that all
speakers had answered the questions truthfully.


Results 


Means of the nine scales were computed 
for each recorded segment, and the intercor- 
relation matrix was factor analyzed using 
principal factoring followed by a varimax rota- 
tion. A three-factor solution accounted for 
84.5% of the total variance and roughly cor- 
responded to the three dimensions of Osgood 
et al.’s (1957) semantic space. We chose 
scales loading greater than .60 (in absolute 
value) as representative of the respective fac- 
tors. Using that cutoff, Factor 1 consisted of 
all three activity scales (slow-fast, cold-hot, 
and passive-active) as well as the strong- 
weak scale; it accounted for 54.4% of the 
variance. Factor 2 was a pure evaluation di-
mension (with only the three evaluative scales
loading appreciably: sour-sweet, awful-nice, 
bad-good); it accounted for 20.2% of the 
variance. The third factor consisted of two 
potency scales (thin-thick, small-large) and 
an activity scale (slow-fast); it accounted 
for 9.9% of the variance. 

Each rater’s data were reduced to three 
factor scores weighting the original scales by 
the factor loadings; the factor scores were ad- 
justed for covariates (as in Experiment 1) 
and entered into univariate analyses of vari- 
ance of the same design as that used in Ex- 
periment 1. 
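
The selection of representative scales and the loading-weighted factor scores described above can be sketched as follows. This is only an illustration: the loading values are hypothetical (the article reports the pattern of loadings, not the numbers), and the helper names `representative_scales` and `factor_score` are assumptions introduced here, not part of the original analysis.

```python
# Hypothetical rotated loadings, NOT the study's values.
# Each scale maps to its loadings on (Factor 1, Factor 2, Factor 3).
LOADINGS = {
    "slow-fast":      (0.70, 0.10, -0.62),
    "cold-hot":       (0.85, 0.20, 0.05),
    "passive-active": (0.90, 0.05, 0.15),
    "weak-strong":    (0.75, 0.10, 0.30),
    "sour-sweet":     (0.10, 0.88, 0.02),
    "awful-nice":     (0.15, 0.91, 0.08),
    "bad-good":       (0.20, 0.86, 0.11),
    "thin-thick":     (0.12, 0.05, 0.80),
    "small-large":    (0.25, 0.03, 0.78),
}

CUTOFF = 0.60  # the cutoff named in the text, applied to absolute values


def representative_scales(loadings, factor_index, cutoff=CUTOFF):
    """Return the scales whose absolute loading on one factor exceeds the cutoff."""
    return [scale for scale, row in loadings.items()
            if abs(row[factor_index]) > cutoff]


def factor_score(ratings, loadings, factor_index):
    """Weight each scale rating by its loading on one factor and sum."""
    return sum(loadings[scale][factor_index] * ratings[scale]
               for scale in loadings)


# One rater's (hypothetical) ratings of one segment on the nine scales.
example_ratings = {scale: 1 for scale in LOADINGS}
factor_1_scales = representative_scales(LOADINGS, 0)
factor_3_scales = representative_scales(LOADINGS, 2)
factor_2_score = factor_score(example_ratings, LOADINGS, 1)
```

Using the absolute value matters here: a scale such as slow-fast can represent a factor even when its loading is negative.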

For Factor 1, a significant rate effect,
F′(2, 32) = 10.49, p < .001, and a nearly
significant Rate X Question interaction,
F′ = 3.02, p < .10, were obtained;
raters’ F(2, 20) = 9.72, p < .01; speakers’
F(2, 18) = 3.87, p < .05. The configuration
of means is shown in Figure 2. In both cases,
slow speakers were perceived as less active.
No significant main effects or interactions
were found for Factor 2. For Factor 3, significant
main effects were found for rate, F′(2, 36)
= 12.87, p < .001, and pitch, F′(2, 33) =
7.94, p < .01. In both cases the relationship
was monotonic, with increasing pitch and
rate resulting in judgments of decreasing
potency. No other significant effects were found.
Figure 2. Averages for the activity factor plotted as
a function of rate condition (slow, normal, fast)
and question topic.


Discussion 


These results extend the findings of Ex- 
periment 1 to judgments of more stable 
speaker dispositions. Men speaking in higher 
pitched voices were perceived as less potent 
(smaller, thinner, slower) and slow-speaking 
men were perceived as more passive (slower, 
colder, more passive, weaker) and more 
potent. 

These findings are to some extent con- 
sistent with correlational evidence provided 
by Scherer, Koivumaki, and Rosenthal 
(1972). In their experiment, listeners rated 
taped segments, taken from a recorded play, 
on semantic differential scales similar to the 
ones used here, as well as on scales reflecting 
the segments’ acoustic properties (e.g., bass- 
treble, soft-loud). Unlike the present study, 
raters judged the emotion portrayed and not 
their impression of the speaker, and a variety 
of listening conditions were used to degrade 
semantic content and prosodic features. 
Nevertheless, Scherer et al. found a marginally 
significant relationship between pitch rating 
(bass-treble) and Potency ratings (strong— 
weak) paralleling the main effect reported 
above: Lower pitched speech was placed to- 
ward the “stronger” pole in the potency di- 
mension of the emotional-meaning space. In 
contrast to our results, they also found a 
marginal correlation between articulation rate 
(slow-fast) and potency ratings: Segments
with slower speech were rated as “weaker.”
Not surprisingly, Scherer et al. also found
that segments with faster speech were heavily
loaded on the activity dimension.

Our findings are likewise consistent with
the results of Brown et al. (1974). Although
the methodological differences previously
noted preclude direct comparison, Brown
found that high fundamental frequency
decreased competence ratings, a scale probably
related to our potency dimension.

The Rate X Question interaction (Figure
2) again suggests the influence that content
exerts on raters’ judgments: When speakers
talked about quotas, ratings on the
passive-active dimension were linear with
manipulated rate; for the money question, they
were not.


Experiment 3


Experiment 3 returned this work to the
area of judgments of the speaker’s affective
state. Impressions of nervousness, emphaticness,
seriousness, fluency, and persuasiveness
illustrate how these acoustic variables serve
to convey a speaker’s self-presentation under
conditions in which raters believe that answers
are being given honestly. These
ratings (with the exception of persuasiveness)
were chosen because Krauss, Geller, and Olson
(Note 2) found significant correlations between
them and truthfulness ratings in a
previous study of deception interactions.


Method 


Stimulus materials. The quota and money tapes
from Experiment 1 were used.

Procedure. Ten college-student subjects, two males
and eight females, were paid to rate all segments on
five state variables: fluency, emphaticness,
persuasiveness, nervousness, and seriousness. The
scales spanned 7 points, with 1 indicating that the
smallest amount of a variable was judged and 7
indicating that the largest amount was judged. For
example, anchors of the nervousness scale were “Not
at all nervous” and “Very nervous.” Subjects were
encouraged to adopt their own criteria for their
ratings; no external standards were given. Instructions
were virtually identical to those used in
Experiment 2. Again, listeners were asked to take into
consideration both the content of an answer and
the manner in which it was delivered.
Table 2
Min-F Values for the Five State Variables

                  Rate (R)   Pitch (P)   Question (Q)
State variable    effect     effect      X R            Q X P

Persuasiveness    3.93**     2.94*       3.54**
Fluency           6.03***                4.07**         2.13*
Emphaticness      6.66***                3.12**         3.33**
Nervousness       6.16***    5.90***
Seriousness                                             4.60**

* p < .10. ** p < .05. *** p < .01.


All other procedural details were identical to 
those of Experiment 2, except that this study was 
run at Bell Laboratories, using students from a 
number of colleges who were home on summer vaca- 
tion. 


Results 


Segment means were computed for all state 
measures and analyzed using covariance anal- 
yses as described above. Table 2 summarizes 
the findings. Note that all variables with the 
exception of seriousness show main effects 
for the rate manipulation. These effects are 
all of the inverted-U type with the normal 
(unmanipulated) speakers judged most fluent, 
persuasive, and so forth, and slow speakers 
judged lowest on these scales. (Ratings of 
nervousness go in the direction opposite to
the other three scales.)

Only nervousness yielded a significant main 
effect for pitch; the pitch manipulation was 
marginally significant for persuasiveness. 
Rated nervousness increased with higher 
pitch, whereas rated persuasiveness decreased. 

In addition, ratings of emphaticness and 
seriousness showed significant Question X 
Pitch interactions. The shape of these inter- 
actions, shown in Figure 3, is quite similar to 
the corresponding effect on judged truthfulness
(Figure 1). For the quota question, only
the highest pitched group was underrated
on emphaticness and seriousness, whereas
for the money question, both the normal- and
high-pitched groups were underrated. The
fluency measure showed a comparable
interaction.

There were Question X Rate interactions 
for fluency, persuasiveness, and emphaticness. 
All interactions had a similar shape; for the
quota question only the slowest group suffered
low ratings, while for the money question 
both the slow and fast groups received low 
ratings. The effect for seriousness judgments 
was marginally significant, but of the same 
shape. 


Discussion 

These results again demonstrate the effect 
acoustic variables have on person perception 
processes. Decreasing speech rate has a par- 
ticularly deleterious effect on a speaker’s per- 
ceived persuasiveness, fluency, and emphatic- 
ness. Similarly, increased pitch lowers ratings 
of persuasiveness and increases greatly the 
impression of nervousness.

Figure 3. Average emphaticness and seriousness
ratings plotted as a function of pitch condition (low,
normal, high) and question topic.

Our findings also suggest that context plays
a role in the attribution process, as evidenced
by the question interactions. When a speaker
answered the quota question, higher pitch
levels could be discounted by attributing them 
to the stressfulness of the topic. Similarly, 
it may have been inappropriate on the money 
topic to talk too slowly (perhaps because of
the topic’s relative simplicity) or too rapidly 
(perhaps because rapid speech is perceived 
as an attempt to be inappropriately serious 
or persuasive). On the other hand, rapid 
speech on the quota question might have been 
attributed to the speaker’s conviction regard- 
ing his argument. 


General Discussion 


The three experiments taken as a whole 
provide clear evidence that acoustic properties 
of a message have considerable impact on 
judgments of a variety of state and trait vari- 
ables. The impressions of high-pitched or 
slow-talking speakers seem particularly nega- 
tive. For example, men with high-pitched 
voices are judged less truthful, less persua- 
sive, weaker, and more nervous. Similarly, 
slow-talking men are judged to be less truth- 
ful, fluent, emphatic, serious, and persuasive, 
and more passive, although they are also seen 
as more potent. 

By using a large number of speakers and 
factorially varying speech rate and pitch, we 
have a reasonable measure of confidence in 
the validity of our findings. However, since 
no female voices were used, we cannot gen- 
eralize these results to women; it is conceiv- 
able that these same acoustic variables would 
produce different effects on perceptions of 
women speakers.

Message context, presumably mediated by
the two question topics, was also demonstrated
to influence attribution processes. However,
since there was only one question of each
type, what follows must remain somewhat
speculative. The question interactions suggest
an interpretation along the lines of
Kelley’s (1971) discounting principle. Kelley
suggests that the more factors a situation
contains, any one of which might plausibly
have resulted in an observed outcome, the
less likely is any one factor to be perceived as
the cause of that outcome. With fewer possible
causes present, the cause-to-effect
attribution is more compelling. We hoped that
our acoustic manipulations would be potent
enough to affect speaker state or trait
attributions. However, it is evident that
listeners may have, at least partially, attributed
our acoustic alterations to the question topic.
The quota question called for an answer that
was both complex and emotionally involving
for our college-student speakers and raters.
As Table 1 shows, before any manipulations
were done, the quota answers were longer and
slower (suggesting greater cognitive complexity;
see Goldman Eisler, 1968) and higher
pitched (suggesting greater stressfulness; see
Hecker et al., 1968). Because of this, slower
and higher pitched answers might have been
perceived as more appropriate for quota
responses than for money responses.⁷ But even
when premanipulation speech rate and pitch
were covaried, the question interactions
remained, supporting the discounting principle.
For example, higher pitch levels seemed to
be discounted on ratings of truthfulness,
fluency, emphaticness, and seriousness for
speakers answering the quota, but not the
money, question.

7 Subsequently, in connection with another study,
we had 12 undergraduates rate a long list of
potential interview topics (including the two in
the present study) as to how stressful and how
complex they would be for a typical Columbia
undergraduate to discuss. Subjects judged the quota
question to be significantly more stressful and more
complex than the money question.

It should be noted that manipulating
fundamental frequency by multiplying by a scale
factor as we did has the effect of multiplying
the variance of the fundamental frequency by
the square of the scale factor. Thus,
high-pitched segments were both high-pitched and
high pitch-variance segments, and vice versa.
Thus, we cannot rule out the interpretation
that the pitch effects observed could be
pitch-variance effects. However, this interpretation
seems unlikely considering the findings of
Brown et al. (1974), who did the appropriate
factorial experiment and found that increased
mean fundamental frequency lowered judgments
of speakers’ competence and benevolence,
while decreased variance also lowered
these ratings. Thus, it seems that average
pitch and pitch variance affect judgments on
a wide range of scales in opposite ways. In the
present study, in which both pitch and pitch
variance were positively correlated, the
pitch-correlated variance could only have attenuated
the effects of average pitch.
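
The confound between mean pitch and pitch variance follows from a general property of scaling: multiplying every value by a factor k multiplies the mean by k and the variance by k squared. The sketch below checks this numerically. The fundamental-frequency contour and the factor of 1.2 are hypothetical values chosen for illustration, and `scale_f0` is an assumed helper name; none of them is taken from the study.

```python
import statistics


def scale_f0(f0_hz, factor):
    """Multiply every fundamental-frequency sample by the same scale factor."""
    return [value * factor for value in f0_hz]


# Hypothetical F0 contour in Hz for one short segment (illustrative only).
f0_contour = [110.0, 118.0, 125.0, 121.0, 112.0, 104.0]
factor = 1.2  # hypothetical scale factor, not the one used in the experiments

raised = scale_f0(f0_contour, factor)

# The mean is multiplied by the factor ...
mean_ratio = statistics.fmean(raised) / statistics.fmean(f0_contour)
# ... while the variance is multiplied by the square of the factor.
variance_ratio = statistics.pvariance(raised) / statistics.pvariance(f0_contour)
```

With a factor of 1.2 the mean rises by 20% while the variance rises by 44%, so any manipulation of this kind changes both properties together.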

Given that the experimental manipulations 
used here affected the perception of speakers’ 
personality and state, it remains to be ex- 
plained why these particular data patterns 
were observed. Miller et al. (1976), for ex- 
ample, conclude that the effect of speech rate 
on message persuasiveness is mediated by the 
effect that variable has on the perception of 
a speaker’s credibility. This, they assert, is a 
“less rationalistic view” of attitude change
than other interpretations (e.g., change mediated
by comprehension effects or counterargument
disruption, two hypotheses their experiments
ruled out).

Certainly such a conclusion would be jus- 
tified if it were the case that variations in 
voice quality bore no relation to the actual 
internal state or predisposition of the speaker. 
However, there is considerable evidence that 
stressful situations do produce discernible
changes in voice quality (Fairbanks, 1940; 
Hecker et al., 1968; Williams & Stevens, 
1969, 1972), and it seems reasonable to as- 
sume that listeners in the present study used 
such variations appropriately to infer a speak- 
er’s state from the quality of his speech: For 
example, Hecker et al. demonstrated that 
task-induced stress raised the fundamental 
frequency of those speakers who did not talk 
more softly under stress. In our experi- 
ment, attributions of increased nervousness to 
higher-pitched speakers are quite “rational- 
istic,” given the similarity of our pitch ma- 
nipulations to the effect of real-life stress. 
Similarly, Streeter et al. (1977) demonstrated 
that pitch increments accompany deceptive 
responses; listeners’ truthfulness judgments 
in Experiment 1 appropriately reflect this re- 
lationship. 

The effects of speech rate can be similarly
interpreted in light of Goldman Eisler’s 
(1968) finding that rate and the cognitive 
complexity of the topic were negatively re- 
lated. Listeners may have assumed that lying 
increases speakers’ cognitive load, resulting in 
slower rates. Unpublished speech rate data
from the Streeter et al. (1977) study lend
plausibility to this argument. A marginally 
significant interaction (p< .08) indicated 
that subjects did, in fact, speak more slowly 
when lying than when telling the truth, pro- 
vided they had been given instructions engag- 
ing their motivation to lie effectively. How- 
ever, speakers not receiving such instructions 
spoke more rapidly when lying. 

Judgments of what is or is not “rational- 
istic” are probably to a large extent matters 
of personal preference, but it does seem to 
us that a listener would be ill-advised to ig- 
nore reliable information concerning a speak- 
er’s internal state in evaluating the speaker’s 
message, especially when the internal state 
seems incongruent with the situation or with 
the message’s content. 

Furthermore, the significant Question X 
Manipulation interactions for ratings of state 
variables suggest some qualifications on the 
findings of Miller et al. (1976). Fast speakers 
are not always more persuasive; talking too 
quickly in response to the money question 
produced lower persuasiveness ratings than 
did responses at a normal rate. Apparently, 
listeners take more into account than meets 
the ear—at least, more than simply the 
acoustic data. 

Evidence of veridicality for vocally based 
attributions of enduring personality traits is 
less firm. We have found no reliable data to 
indicate that fast talkers actually are more 
active people or that higher pitched men are 
weaker than their lower pitched counterparts. 
Apart from studies of psychiatric patients, 
much of the work in this area deals with the 
traits of introversion-extraversion and
dominance. Mallory and Miller (1958) found
small but significant negative correlations be- 
tween judged pitch and rate and the domi- 
nance scale of the Bernreuter Personality In- 
ventory. While these findings support our re- 
sults on judged potency along the pitch 
dimension (Experiment 2), it is possible that 
Mallory and Miller’s acoustic judgments are 
either inaccurate or subject to biasing effects 
from other sources, since raters judged speak- 
ers in a live situation. Furthermore, our re- 
sults for judged potency on the speech-rate 
variable appear opposite to Mallory and 
Miller’s. A more recent study (Ramsay, 1966)
found that the speech of subjects classified as 
extraverts on the Eysenck Personality Inven- 
tory had longer unbroken phonation times 
and shorter silences than those of introverts 
across a variety of speaking tasks; no data on 
speech rates were presented. 

The acoustic stimulus, of course, contains 
more information than average pitch and rate. 
In addition, there is sequential information 
(provided by intonation contours and dura- 
tion pattern), loudness, and variability of 
both pitch and loudness over time. There is 
also voice quality information (e.g., “breathy”
or “raspy” voices) that may not be so readily 
specified in terms of physical parameters. All 
of these factors can be expected to enter into 
the person perception process via stereotypes 
with larger or smaller kernels of truth. 

On none of the measures we examined was 
the Rate X Pitch interaction statistically sig- 
nificant. The median F interaction was 1.12— 
close to its expected value under the null hy- 
pothesis. This absence of interaction argues 
for an additive model, in which pitch and 
rate exert independent effects on listeners’ 
judgments. It remains to be seen whether
support for such a model will continue as the 
role of additional vocal factors is explored. 
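
What an additive model implies for a table of cell means can be illustrated with a short sketch. Under additivity, each cell mean is the grand mean plus a rate effect plus a pitch effect, so the interaction residual (cell mean minus row mean minus column mean plus grand mean) is zero in every cell. All numbers below are hypothetical and the function names are assumptions made for this illustration; they are not the study's data or analysis code.

```python
def build_additive_table(grand_mean, rate_effects, pitch_effects):
    """Cell means for a purely additive Rate x Pitch design."""
    return [[grand_mean + r + p for p in pitch_effects] for r in rate_effects]


def interaction_residuals(cell_means):
    """Cell mean minus row mean minus column mean plus grand mean."""
    n_rows, n_cols = len(cell_means), len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    return [[cell_means[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_cols)]
            for i in range(n_rows)]


# Hypothetical effects on a 7-point rating scale: rows are slow, normal, and
# fast rate; columns are low, normal, and high pitch.
additive = build_additive_table(4.0, [-0.6, 0.4, 0.2], [0.3, 0.1, -0.4])
additive_residuals = interaction_residuals(additive)

# Perturbing a single cell breaks additivity and produces nonzero residuals.
perturbed = [row[:] for row in additive]
perturbed[0][0] += 0.9
perturbed_residuals = interaction_residuals(perturbed)
```

An F ratio for the interaction near 1, as reported above, corresponds to observed residuals of this kind being no larger than sampling error would produce.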


Reference Notes 


1. Nakatani, L. H. SYNLOG: An interactive system 
for manipulating speech. Unpublished manuscript, 
ities and cues in the detection of deception. Paper
Corrections to Koretzky, Kohn, and Jeger


In the article “Cross-Situational Consistency Among Problem Adolescents:
An Application of the Two-Factor Model” by Martin B. Koretzky, Martin
Kohn, and Abraham M. Jeger (Journal of Personality and Social Psychology,
1978, Vol. 36, No. 9, pp. 1054-1059), there are errors in two correlations
reported in the first paragraph on page 1058. The Factor I and Factor II scores
for the Koretzky (1976) study cited there should be reversed. The correct
correlation for Factor I is .40, and for Factor II it is .60. Thus, the sentence
should read, “Consistency correlations between classroom and residence settings
were even stronger in this experiment for Factor II (r = .60) and were also
respectable for Factor I (r = .40).”

Martin B. Koretzky’s affiliation was erroneously given as the Veterans
Administration Hospital, Bronx, New York. His affiliation at the time of the
original research was the State University of New York at Stony Brook.


Journal of Personality and Social Psychology 
1979, Vol. 37, No. 5, 728-734 


Smiles Can Be Back Channels 


Lawrence J. Brunner 
Department of Behavioral Sciences 
University of Chicago 


The placement of auditor smile beginnings in the stream of dyadic interaction 
was investigated, using detailed transcriptions of the language, paralanguage, 
and body motion of the participants in four two-person conversations recorded
on videotape. Auditor smile beginnings showed a strong tendency to occur at
the same kinds of location as “back channel” responses (such as “yeah,”
“uh-huh,” and head nods). This finding indicates that the smile can function
as a type of back channel. It is argued that smiles, like other forms of back 
channel, make communication more efficient by providing the speaker with 
feedback on a number of levels simultaneously. 


Most quantitative investigations of smiling 
in adults (see reviews by Berlyne, 1969; 
Chapman & Foot, 1976; and Goldstein & Mc- 
Ghee, 1972) have treated it as expressive of 
underlying states such as mirth or pleasurable 
arousal. Possibly because researchers in so- 
cial psychology and personality tend to focus 
their attention more on hypothetical con- 
structs than on the behavior that those con- 
structs are intended to explain (a thesis that 
Fiske, 1978, argues cogently and in detail), 
the frequent failure to find significant cor- 
relations between ratings of amusement and 
actions such as smiling and laughing (Chap- 
man, 1976, pp. 156-157) has been attributed 
(by Berlyne, 1969, and Chapman, 1976, for 
example) to the greater “sensitivity” of
self-report measures.

The present study is deliberately opposite
in approach. Without denying that people
often smile when they are happy or when
something strikes them as funny, it is
concerned with smiles as actions rather than
expressions of unobservable inner states. It is
an attempt to partly specify the “meaning” of
smiles (specifically, the smiles of the person
who is not speaking in a two-person
conversation) in what Goffman (1971, p. 235) calls
the “structural” sense of the term. That is,
it is an attempt to describe their placement
in the stream of behavior.

The study is part of a program of
investigation (Duncan, 1972, 1974; Duncan & Fiske,
1977; Duncan & Niederehe, 1974) whose
purpose is to discover the organization of
face-to-face interaction. The term organization
reflects the assumption that just as
language operates according to rules that give it
a definite structure, so does the process of
face-to-face interaction in which language
ordinarily occurs.

The research of Duncan and his associates
on the structure of interaction is based on
detailed transcription of speech and body
motion in two-person conversations. Exploration
of these materials has led to the formulation
of a set of hypothesized signals and rules
that people apparently use to exchange speaking
turns and otherwise regulate the flow of
interaction in conversations. These rules
depend on a distinction between the “speaker”
(the person who holds the speaking turn)
and the “auditor” (a person who is in the
conversation without holding the turn). To 
hold the speaking turn is not the same as to 
emit speech sounds; in the conversations we 
have studied, brief pauses seldom if ever con- 
stitute relinquishment of the speaking turn, 
and it is possible for a person to talk without 
claiming the speaking turn (for example, to 
say things like “yeah” and “uh-huh”). 


Speaking Turns 


Duncan (1972) hypothesized a turn signal 
composed of six separate cues in language, 
paralanguage, and body motion. If any one or 
any combination of these cues is displayed, 
the auditor may take the turn if he or she 
wishes to. The auditor is not required to 
take the turn when the turn signal is dis- 
played; he or she merely has the option of 
doing so. There is a marked tendency for the 
turn signal to precede orderly (smooth) ex- 
changes of the speaking turn, but it seldom 
precedes instances of “simultaneous turns” 
in which both participants attempt to hold 
the floor at the same time. 

Complementary to the turn signal is the 
“gesticulation” signal. While the speaker is 
gesticulating, the auditor should not attempt 
to take the turn, regardless of whether the 
turn signal is displayed. The evidence for 
this formulation is that the auditor is much 
less likely to attempt the turn when the 
speaker is gesturing and that if the auditor 
does, the result is usually simultaneous turns, 
even when one or more turn cues are also 
present. 


Back Channels 


The turn signal and the gesticulation signal 
are both conceived to facilitate the smooth 
exchange of turns. In further development of 
the turn system, attention has been directed 
toward speaker-auditor interchanges that occur
during speaking turns. Duncan (1974) reported
a study of auditor “back channels”
(head nods and statements such as “yeah”
and “uh-huh”) that explored some of the
differences between back channels and turn 
attempts and related auditor back channels 
to the actions of the speaker. 
Unlike turn attempts, auditor back channels
are not suppressed by the speaker’s gesticula- 
tion. In addition, back channels and turn at- 
tempts are distributed differently with re- 
spect to the boundaries of segments into 
which the conversations were divided for pur- 
poses of statistical analysis. It is likely that 
these segments reflect structural properties 
of speech and communicative body motion 
(Duncan & Fiske, 1977, discuss them in de- 
tail). Turn attempts usually occur exactly 
between segments, whereas back channels are 
more loosely distributed. 

Another important distinction between au- 
ditor back channels and turn attempts is that 
they seem to be related to different sets of 
cues given by the speaker. Duncan (1974) 
hypothesized a speaker “within-turn” signal 
originally consisting of two cues: grammatical 
completion and turning of the head toward 
the auditor. Brunner (1977) has reanalyzed 
the data and found evidence for changing 
the head turn cue to simple head direction 
toward the auditor. Responses to the within- 
turn signal, like responses to the turn signal, 
are conceived to be optional; the auditor is
not obliged to give a back channel, but may
do so if a signal is present.
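
The hypothesized signals described in the preceding sections can be summarized as simple decision rules. The sketch below is a deliberate simplification written for illustration: the class and function names are assumptions, the rules describe when an auditor action is structurally appropriate rather than when it will occur (responses are optional), and the individual cues that make up each signal are not modeled.

```python
from dataclasses import dataclass


@dataclass
class SpeakerDisplay:
    """What the speaker is displaying at one point in the conversation."""
    turn_cues: int            # how many of the six turn cues are displayed
    gesticulating: bool       # the gesticulation signal is active
    within_turn_signal: bool  # at least one within-turn cue is displayed


def turn_attempt_appropriate(display: SpeakerDisplay) -> bool:
    """An auditor may take the turn when any turn cue is displayed,
    unless the speaker is gesticulating."""
    return display.turn_cues >= 1 and not display.gesticulating


def back_channel_appropriate(display: SpeakerDisplay) -> bool:
    """An auditor may give a back channel when the within-turn signal is
    displayed; gesticulation does not suppress back channels."""
    return display.within_turn_signal


# The speaker shows two turn cues and the within-turn signal while gesturing.
gesturing = SpeakerDisplay(turn_cues=2, gesticulating=True,
                           within_turn_signal=True)
may_take_turn = turn_attempt_appropriate(gesturing)
may_back_channel = back_channel_appropriate(gesturing)
```

In this example a turn attempt is ruled out by the gesture while a back channel remains appropriate, which is the contrast the present study uses to classify smiles.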

The thesis of the present study is that the 
placement of smile beginnings in conversa- 
tions is closely parallel to that of back chan- 
nels and that, in fact, smiling can be a type 
of back channel. This idea was suggested by 
the entirely serendipitous finding that most 
auditor smile beginnings were preceded by 
the within-turn signal. In order to claim, however,
that the auditor’s smile beginnings occupy
a place that is similar to back channels
in the structure of interaction, three other
requirements must be met.

1. There must be an acceptable statistical
relation between occurrence of a smile beginning
and previous display of the within-turn
signal.

2. Auditor smile beginnings must be more
loosely distributed with respect to segment
boundaries than auditor turn attempts are.

3. The auditor’s smile beginning must not
be suppressed by the speaker’s gesticulation.
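
Requirement 1 amounts to a contingency test over units of analysis: each unit is classified by whether the within-turn signal was displayed and whether a smile beginning followed. One common way to evaluate such a 2 x 2 table is a chi-square statistic, sketched below. The counts are hypothetical and the function name is an assumption; this is not necessarily the test used in the study.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (1 df, no continuity correction) for the table

                        smile begins   no smile
        signal present        a            b
        signal absent         c            d
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts of units of analysis (illustrative only).
signal_smile, signal_no_smile = 30, 70
no_signal_smile, no_signal_no_smile = 10, 90

chi_square = chi_square_2x2(signal_smile, signal_no_smile,
                            no_signal_smile, no_signal_no_smile)

# Proportion of units followed by a smile beginning, with and without the signal.
rate_with_signal = signal_smile / (signal_smile + signal_no_smile)
rate_without_signal = no_signal_smile / (no_signal_smile + no_signal_no_smile)
```

A statistic above 3.84, the .05 critical value for one degree of freedom, would indicate that smile beginnings follow the signal more often than chance alone predicts. The table needs all four cells, including the units in which neither action occurred.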
Conversations and Transcription


The four conversations on which this report is 
based are a subset of the ones described by Duncan 
and Fiske (1977). Conversations 1, 2, and 3 are 7 
minutes long. All three were between female law 
students and male social work students at the Uni- 
versity of Chicago; the participants were paid vol- 
unteers who were previously unacquainted. Conversa- 
tion 4 is 19 minutes long and, unlike the other three 
conversations, would have occurred whether or not 
it was videotaped. It was between two 40-year-old 
male psychotherapists from the Counseling Center at 
the University of Chicago. They were good friends 
and had known each other for about 10 years. The 
subject of their conversation was a client whom both 
of them had seen recently. Duncan and Fiske
give more detailed background information about
these conversations. They refer to Conversations 1
through 4 as numbers 5, 6, 7, and 2, respectively. 

In all four conversations, the participants were 
seated in adjacent chairs turned slightly toward each 
other, facing a video camera. The participants in the 
first three conversations were requested to get ac- 
quainted and carry on a short conversation, and in 
the fourth they were asked to continue a conversation
that they had been unable to complete on a
previous occasion.

The conversations have been transcribed in con- 
siderable detail, but for the present study, only the 
following actions were important: spoken syllables, 
completions of subject-predicate clauses, intonation 
(according to Trager & Smith’s, 1957, system), ges- 
tures, head movements, and smiles. In this study, the 
term smile refers to apparent bilateral or unilateral 
contraction of the risorius and buccinator muscles. 
Each event was carefully located with respect to the 
other events in the conversation so as to produce a 
transcript in which the sequence of actions is repre- 
sented, accurate to one syllable. 

In transcribing body motion for this study, the 
author and a research assistant worked together with 
the goal of making a transcription on which they 
completely agreed; in cases of doubt, other people 
familiar with the transcription system were con- 
sulted. Typically, agreement between observers is 
very high without discussion, and previous studies 
of the reliability of the transcription system used 
in the present investigation have yielded a median 
kappa (Cohen, 1960) of .81 (roughly 95% agree- 
ment) between separate observers (Duncan & Fiske, 
1977, p. 342). 
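
The relation between raw agreement and kappa quoted above can be made concrete with a short sketch. The function names and the counts below are hypothetical illustrations, not data from the reliability studies cited; the second function simply inverts the kappa formula to show what level of chance agreement would be implied by a kappa of .81 at 95% raw agreement.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of events coded i by observer A and j by observer B.
    """
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("empty agreement table")
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_props = [sum(row) / n for row in table]
    col_props = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    expected = sum(r * c for r, c in zip(row_props, col_props))
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def implied_chance_agreement(observed, kappa):
    """Chance agreement implied by a given raw agreement and kappa."""
    return (observed - kappa) / (1 - kappa)


# Hypothetical counts for two observers coding 100 events as present/absent.
example_table = [[45, 5], [5, 45]]
example_kappa = cohens_kappa(example_table)

# Raw agreement of .95 with kappa of .81, the two figures quoted in the text.
implied = implied_chance_agreement(0.95, 0.81)
```

Under these assumptions the quoted figures correspond to chance agreement of roughly .74, which is plausible when most coded intervals contain no event.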


Data Analysis 


Unit of analysis. In order to assess behavioral 
regularities within an interaction, it must be divided 
into units to which scores are assigned. The reason is 
that in order to present evidence that some action, 
A, precedes B more often than one would expect by 
chance, it is necessary to know not only how many
times A preceded B and how many times each action
occurred separately but also how often each action
did not occur. In the analyses to be reported here,
the unit of analysis is a stretch of speech that can
be as short as a single phonemic clause (Trager &
Smith, 1957) and as long as an entire speaking turn.

The boundaries of the units of analysis (which
will be called “segments” for convenience) occur
after the boundaries of phonemic clauses that contain
one or more of the following actions: unfilled
pause, audible inhalation or exhalation, turning of
the speaker’s head toward the auditor, a drop in paralinguistic
pitch or loudness, any pitch level-terminal
juncture combination other than 2 2 | at the end of
a phonemic clause, paralinguistic drawl on the final
or the stressed syllable of a phonemic clause, termination
of a gesture or relaxation of a tensed hand
position by the speaker, use of a stereotyped expression
such as “but uh” or “you know,” or the completion
of a grammatical clause consisting of a subject
and predicate. Duncan and Fiske (1977) explain
segments in much more detail.

In addition to serving as the basic units of analysis,
segments are used to describe the distribution of
events within an interaction. The locations of the
auditor’s actions with respect to segment boundaries
can be represented in a fourfold typology. The four
categories of location are the following:

1. Speech overlap—from roughly the middle to
slightly before the end of the segment.

2. Pause—between the last syllable of the unit in
question and the first syllable of the next unit.

3. Sociocentric sequence—before or during a sociocentric
sequence such as “but uh” or “you know.”

4. Postboundary—from the very beginning to
roughly the middle of the segment after the one in
question.

Statistics. Evidence for the relation between the
speaker’s within-turn signal and the auditor’s smile
beginning will be presented in a fourfold table where
the columns indicate presence versus absence of the
signal, and the rows indicate presence versus absence
of a subsequent smile beginning by the auditor. If
the results are parallel to those obtained for back
channels, and the speaker’s within-turn signal permits
an auditor smile beginning but does not require it,
almost every auditor smile beginning should be
preceded by the signal. If the signal operated perfectly,
the cell representing auditor smile beginnings
in the absence of the within-turn signal would be
zero. There is no requirement, however, that the
other cell in that diagonal (failures to respond to
the signal) be zero or even have a particularly low
frequency; permissive signals allow the option of
not responding.

Consequently, the most useful index of association
between signal and response for the table just described
is one that reaches its maximum when
one cell is empty. The statistic Q (described by Yule,
1900, and further developed by Goodman & Kruskal,
1954, 1959, 1963, 1972) has this property: It is
1.0 or -1.0 when one cell or both cells on one diagonal
of a 2 X 2 table equals zero. The Q also has
the advantage of being a true index of association,
directly interpretable as percentage of improvement
in predicting the dependent variable when the independent
variable is known. The standard error of Q
can be calculated, and the hypothesis that Q is significantly
different from zero may be tested by
dividing Q by its standard error and referring the
result to the normal curve. The Q and its two-tailed
probability value will be reported for all the 2 X 2 
tables in this article. 
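
The computation described in this paragraph can be sketched as follows. The cell labels, function names, and example counts are hypothetical; the standard error uses the common large-sample approximation for Yule's Q, which is an assumption here because the article does not state which formula was applied. When any cell is zero that approximation cannot be evaluated, so the sketch returns no probability value in that case.

```python
import math


def yules_q(a, b, c, d):
    """Yule's Q for the 2 x 2 table [[a, b], [c, d]]; a and d form the main diagonal."""
    concordant = a * d
    discordant = b * c
    if concordant + discordant == 0:
        raise ValueError("Q is undefined when both diagonal products are zero")
    return (concordant - discordant) / (concordant + discordant)


def q_standard_error(a, b, c, d):
    """Large-sample standard error of Q (assumed formula); None if any cell is zero."""
    if min(a, b, c, d) == 0:
        return None
    q = yules_q(a, b, c, d)
    return 0.5 * (1 - q * q) * math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


def q_two_tailed_p(a, b, c, d):
    """Two-tailed normal-curve probability for Q / SE(Q); None if SE is unavailable."""
    se = q_standard_error(a, b, c, d)
    if se is None or se == 0:
        return None
    z = yules_q(a, b, c, d) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical segment counts (not taken from Table 1):
# a = signal present, smile begins     b = signal present, no smile
# c = signal absent, smile begins      d = signal absent, no smile
a, b, c, d = 20, 60, 2, 80
q_value = yules_q(a, b, c, d)
p_value = q_two_tailed_p(a, b, c, d)
```

Note that Q reaches 1.0 as soon as the "signal absent, smile begins" cell is empty, regardless of how often the auditor fails to respond to the signal, which is the property the text asks of an index for a permissive signal.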

Replication. Since the results to be reported here 
are the products of very thorough exploration rather 
than hypothesis testing, they must be replicated in 
order to ensure that they do not merely reflect ran- 
dom contingencies in the data. I have used Con- 
versations 1 and 2 for exploration and Conversations 
3 and 4 for replication. Although pooled data are 
reported for both the exploratory and replication 
samples, the results for all eight participants were 
examined separately to make sure that the same 
trends held for all individuals. 

The analyses to be reported do not include seg- 
ments with auditor turn attempts or smiles continued 
from the previous segment. Segments with turn 
attempts were excluded because when a person smiled 
as he or she claimed the speaking turn, it was often 
impossible to decide whether to call the action a 
speaker smile or an auditor smile. Segments with
auditor smile continuations were excluded because 
the auditor could not begin to smile when he or she
was already smiling. 


Results 
The Within-Turn Signal 


Previous research has shown that the 
speaker’s within-turn signal regularly pre- 
cedes auditor back channels whose location 
with respect to segment boundaries is late or 
between units, but not those which are early 
or occur before or during a sociocentric se- 
quence. This is also the case for auditor smile 
beginnings. 

Table 1 shows that in both the exploratory 
and replication conversations, the speaker’s 
within-turn signal (composed of grammatical 
completion or head direction toward the au- 
ditor or both) preceded a high percentage of 
the auditor’s late and between-units smile be- 
ginnings and was significantly related to their 
occurrence. In Conversations 1 and 2, 23 of 
24, or 95.8%, of the auditor’s smile begin- 
nings occurred while the within-turn signal 
was present. For the 2 x 2 table relating the 


Table 1
Relationship Between Display of Speaker’s Within-Turn Signal and Auditor’s Smile Beginning

Columns: speaker’s within-turn signal (proportion of segments with signal absent, proportion of segments with signal present). Rows: auditor’s smile beginning (absent, present).

Pause and postboundary smiles, exploratory sample (Conversations 1 and 2): Q = .79, p = .00006.
Pause and postboundary smiles, replication sample (Conversations 3 and 4): Q = 1.00.
Speech overlap and sociocentric sequence smiles: Q = -.26, p = .36.

auditor’s smile beginning to prior display of
the speaker’s within-turn signal, Q = .79, p =
.00006. In Conversations 3 and 4, all 15 between-units
and late auditor smile beginnings
happened in the presence of the within-turn 
signal. Q = 1.0, and the associated probability 
cannot be calculated because the standard 
error is zero. 

The bottom third of Table 1 shows that 
there is an insignificant negative relation be- 
tween presence of the speaker’s within-turn 
signal and auditor smile beginnings located 
at speech overlaps or sociocentric sequences. 
Similar results are reported by Duncan and 
Fiske (1977) for vocal back channels and 
head nods that occur in these locations. 


Effect of Speaker’s Gesticulation 


Auditor turn attempts are suppressed by 
the speaker’s gesticulation, but back channels 
are not. To test whether smile beginnings are 
also unaffected by the speaker’s gesticulation, 
data were arranged in 2 x 2 tables where the 
rows indicated presence versus absence of the 
speaker’s gesticulation and the columns indi- 
cated presence versus absence of the auditor’s 


Table 2
Locations of Auditor Smile Beginnings and Turn Attempts With Respect to Segment Boundaries

                          Auditor turn attempts     Auditor smile beginnings
Location                  Frequency     %           Frequency     %

Exploratory sample (Conversations 1 and 2)
Speech overlap               16        15.4
Pause                        84        80.8            16
Postboundary                  2         1.9             8
Sociocentric sequence         2         1.9             1         3.1
Total                       104                        33

Replication sample (Conversations 3 and 4)
Speech overlap               13        22.0             2        11.8
Pause                        40        67.8             9        52.9
Postboundary                  2         3.4             6        35.3
Sociocentric sequence         4         6.8             0         0
Total                        59                        17

smile beginning. In the exploratory sample
(Conversations 1 and 2), the auditor’s smile
and the speaker’s gesticulation were not significantly
related (Q = -.36, p = .14).

In the replication sample (Conversations
3 and 4), it seemed at first that the auditor
was somewhat less likely to smile while the
speaker was gesticulating (Q = -.45, p =
.04). These are not strong results, but it is
strange that smile beginnings should show a
structural similarity to turn attempts rather
than back channels, particularly when this
did not happen in the exploratory study.

Examination of the data from Conversations
3 and 4 separately indicates that there
is virtually no relationship between the speaker’s
gesticulation and the auditor’s smile in
either conversation. Conversation 4 has a low
rate of auditor smiling and a high rate of
speaker gesticulating; Conversation 3 has a
relatively high rate of auditor smiling and an
extremely low rate of speaker gesticulating.
Consequently, although there is no relation
between the speaker’s gesticulation and the
auditor’s smile beginning in either conversation,
an artificial one appears when data from
the two are pooled. This finding shows how
important it is to examine subsets of data
separately before combining them.
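
The pooling artifact described here can be reproduced with invented counts. The sketch below is illustrative only: both tables are hypothetical and were chosen so that each conversation shows no association (Q = 0) while differing in base rates in the way the text describes, one with rare smiling and frequent gesticulation, the other with frequent smiling and rare gesticulation.

```python
def yules_q(a, b, c, d):
    """Yule's Q for the 2 x 2 table [[a, b], [c, d]]."""
    concordant = a * d
    discordant = b * c
    if concordant + discordant == 0:
        raise ValueError("Q is undefined when both diagonal products are zero")
    return (concordant - discordant) / (concordant + discordant)


# Cell order for each hypothetical table:
# (gesticulating & smile, gesticulating & no smile,
#  not gesticulating & smile, not gesticulating & no smile)

# Low smile rate (4%), high gesticulation rate (80%).
conversation_low_smiling = (4, 96, 1, 24)

# High smile rate (50%), low gesticulation rate (about 9%).
conversation_high_smiling = (5, 5, 50, 50)

# Pooling adds the two tables cell by cell.
pooled = tuple(
    x + y for x, y in zip(conversation_low_smiling, conversation_high_smiling)
)

q_low = yules_q(*conversation_low_smiling)
q_high = yules_q(*conversation_high_smiling)
q_pooled = yules_q(*pooled)
```

With these made-up counts the two separate Q values are zero, yet the pooled Q is strongly negative, because gesticulation and smiling are each concentrated in a different conversation.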

Table 2 compares the distributions of auditor
smiles and turn attempts with respect
to segment boundaries. Auditor smile beginnings
are less concentrated in the “pause”
location (the same finding that has been obtained
for back channels), although this tendency
is not as pronounced in the replication
study. 
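
The percentage columns of Table 2 follow directly from the frequencies, as the sketch below shows for the replication sample. The variable and function names are illustrative. The postboundary turn-attempt count is taken as 2, which is an assumption based on the printed 3.4% and the column total of 59.

```python
LOCATIONS = ["Speech overlap", "Pause", "Postboundary", "Sociocentric sequence"]

# Replication sample (Conversations 3 and 4), frequencies from Table 2.
turn_attempts = [13, 40, 2, 4]      # postboundary count of 2 is assumed, see lead-in
smile_beginnings = [2, 9, 6, 0]


def percentages(freqs):
    """Percentage of the column total at each location, rounded to one decimal."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("column total is zero")
    return [round(100 * f / total, 1) for f in freqs]


turn_pct = dict(zip(LOCATIONS, percentages(turn_attempts)))
smile_pct = dict(zip(LOCATIONS, percentages(smile_beginnings)))

# Share of each action that falls in the "pause" location.
pause_gap = turn_pct["Pause"] - smile_pct["Pause"]
```

In this sample about two thirds of turn attempts but only about half of smile beginnings fall in the pause location, which is the looser distribution the text refers to.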


Discussion 


The organization of auditor smile beginnings
in these conversations is parallel to that
of back channels. They are both consistently
preceded by the speaker’s within-turn signal,
they are both unaffected by the speaker’s
gesticulation, and they are both more loosely
distributed than turn attempts with respect
to segment boundaries. Taken together, these
results are grounds for placing auditor smile
beginnings in the category “back channel”
along with nods and statements such as
“yeah” and “uh-huh.”




Nods and vocal back channels seem intuitively
to belong together, because they both
seem to say, in effect, “Yes, you are understood.
Proceed.” Although it is not immediately
obvious that smiles belong in this category,
I believe that auditor smile beginnings,
like auditor back channels, make communication
more efficient by providing the speaker
with feedback.

The idea of communicative efficiency lies 
behind much of the turn system, though it 
has not been given a great deal of emphasis. 
For example, the turn signal was constructed 
so that at least one cue would be present be- 
fore each smooth exchange of turns, but not 
before instances of simultaneous turns. The 
reasoning was that a system of rules for taking
speaking turns ought to result in smooth
exchanges when it is working properly, because
human beings have only a limited capacity
to send and receive information simultaneously.

The people who have written about the actions
that we call back channels generally
agree that the auditor uses them to provide
the speaker with information. There is less
unanimity, however, concerning the exact sort
of information that back channels convey.

Fries (1952, p. 49) suggests that they are
primarily signs of attention and involvement
in the conversation. Yngve (Note 1) agrees,
but proposes that they are also “very important
in providing for monitoring of the
quality of the communication” (p. 568). Dittmann
and Llewellyn (1967) imply that in
addition, they are sometimes used for requesting
further information or explanation
(this is consistent, of course, with
Duncan’s treatment of short questions as back
channels).

Kendon (1967) distinguishes between
“attention” and “assenting” signals, both
of which Duncan would describe as back channels.
This distinction merely reflects the fact that
auditor feedback usually indicates that
the auditor is following what the speaker says,
and sometimes it also indicates agreement. It
is important to mention in this connection
that head shakes and verbal statements of
disagreement were treated as back channels
in the present study and in Duncan’s earlier
investigations. This was done mostly on intuitive
grounds (they “felt” like back channels),
but the implicit rationale was that feedback
need not always be positive in tone
although it usually is.

There is a consistency among the various 
sorts of information that back channels are 
supposed to provide. I would like to suggest 
that back channels (including smiles) give 
the speaker feedback on three levels. On the 
first level, back channels signal the auditor’s 
involvement and participation in the interac- 
tion. They indicate that the auditor is attend- 
ing to what the speaker says and that a con- 
versation, not a monologue, is occurring. 

On the second level, back channels provide 
information about the auditor’s level of un- 
derstanding, allowing the speaker to adjust 
his or her communicative endeavor so as to 
get the ideas across efficiently. “Mhm’s” and 
head nods tell the speaker that the message 
is being decoded successfully; when the loca- 
tion of these signals is “speech overlap” 
(before the end of a segment of speech is 
reached), they may indicate that the auditor 
is “ahead” of the speaker and a little less 
elaboration would be in order; a puzzled or 
blank look may indicate that more informa- 
tion is needed (no work has been done on the 
actions that signify the auditor’s lack of un- 
derstanding, except for clarifying questions). 

On the third level, back channels can signal 
the auditor’s personal response to what the 
speaker has just said. This might mean agree- 
ment or disagreement, shock, amusement, 
scorn, or any number of other reactions. In 
general, if an action functions on a higher level 
it also functions on the ones below it; any 
indication of a personal reaction implies un- 
derstanding (or the lack of it), and any in- 
dication of a degree of understanding implies 
some involvement in the conversation. Smiles 
are higher level back channels, providing 
feedback on all three levels. They indicate a 
positive personal reaction (often merely polite),
as well as understanding of the preceding
statement and participation in the conversation.

We do not know how far the results of the
present investigation can be generalized. The
findings of the exploratory study were fully
replicated, which indicates that at least they
are not unique to a particular conversation. 
Definitive information about the limits within 
which they hold can be provided only by fur- 
ther studies of smiling in which many people 
are observed in many different settings. 


Reference Note 


1. Yngve, V. H. On getting a word in edgewise. In
Papers from the sixth regional meeting of the 
Chicago Linguistic Society. Chicago: Chicago Lin- 
guistic Society, 1970. 
Motoric and Symbolic Mediation in Observational Learning


Seymour M. Berger, Linda L. Carli, Kathy S. Hammersla,
Judith F. Karshmer, and M. Estela Sanchez
University of Massachusetts—Amherst


The observers’ motoric and symbolic representations of a model’s behavior are
important mediators in observational learning. The observers’ spontaneous use
of these mediators may be influenced by their familiarity with responses performed
by a model and by their intention to learn these responses. Unfamiliar
observers do not have symbolic codes available for the model’s responses, so
they may rely on motor mimicry. Familiar observers have symbolic codes available,
so they may employ these codes and, possibly, motor mimicry as mediators.
Spontaneous mediation may also depend on whether observers intend to
learn these responses. Experiments revealed that familiar observers use motoric
and symbolic mediators, whereas unfamiliar observers primarily use motor
mimicry. Symbolic coding facilitates familiar observers’ but not unfamiliar
observers’ learning; unfamiliar observers’ learning is related to motor mimicry.
Intention to learn increases motor mimicry but not symbolic coding. An interpretation
is offered for the observers’ pervasive use of motor mimicry.


Bandura (1974) has emphasized the role of
imaginal and verbal coding, and rehearsal of
these symbolic codes, in observational learning.
Berger (1966, 1968), on the other hand,
has focused on the role of motor mimicry,
rehearsal, and conditioning in observational
learning. Studies supporting these analyses,
however, either have not examined the
observers’ spontaneous use of different forms
of mediation or else have instructed observers
to use mediators (e.g., Bandura,
Grusec, & Menlove, 1966; Bandura & Jeffery,
1973; Berger, Irwin, & Frommer, 1970;
Bernal & Berger, 1976; Gerst, 1971; Jeffery,
1976). The present experiments examine
conditions that affect the observers’
use of motoric and symbolic mediation,
and the relationship between these mediators,
in an observational learning situation.


The authors are listed in alphabetical order. The
authors involved in each experiment are identified
separately in the report of each experiment.
Requests for reprints should be sent to Seymour
M. Berger, Department of Psychology, University of
Massachusetts, Amherst, Massachusetts 01003.
Experiment 1: Letter Code Availability and
Learning Instructions 1


Bandura (1974) argues that observers rep- 
resent a model’s behavior through symbolic 
codes and that these codes guide subsequent 
motoric reproduction of the model’s behavior. 
This analysis assumes that observers are able 
to code and that the response components al- 
ready exist in the observers’ behavioral reper- 
toire. If observers cannot symbolically code 
because there is insufficient time for them to 
employ appropriate symbolic codes, then ob- 
servers may attempt some other form of medi- 
ation, such as mimicry, in order to represent 



1 The authors of Experiment 1 were Seymour M.
Berger and Kathy S. Hammersla. A portion of this
experiment was presented under the general title of
this article at the annual convention of the American
Psychological Association, Toronto, Canada, in August.
This project was formulated, with the assistance
of Norbert Vanbeselaere, while the first
author was a Fulbright-Hays Senior Research Scholar
at the Laboratory for Experimental Social Psychology,
Catholic University of Leuven, Belgium. The
authors wish to thank Margo McMahon for her assistance
in selecting the hand signals and in checking
the experimenter’s reliability.


Association, Inc. (0022-3514/79/3705-0735$00.75 
the model’s behavior. Furthermore, if the response
components of the model’s behavior
are not part of the observers’ repertoire, these 
components may be acquired during the ex- 
posure period through the observers’ mimetic 
practice. Thus, mimicry may facilitate learn- 
ing for observers who are unable to symbol- 
ically code and/or unable to perform the ap- 
propriate response components. 

When observers have symbolic codes and 
appropriate response components available, 
there appears to be no need for them to ac- 
tually mimic the model; in this case, symbolic 
codes would compensate or substitute for mim- 
icry, and mimicry would be reduced or not 
occur at all. It also is conceivable, however, 
that motoric and symbolic mediators comple- 
ment rather than compensate for one another. 
In this case, observers would engage in both 
types of mediation. Motoric and symbolic 
mediators may be elicited directly by the 
model’s behavior, or observers may use both 
types of mediation because they have found, 
through past experience, that using multiple 
mediators is more effective for learning than 
relying on a single form of mediation. 

Half of the observers in this experiment 
were familiar with the manual alphabet for 
the deaf and half were not. It has been shown 
that observers tend to overtly mimic these 
gestures and that such mimicry is positively 
correlated with observer learning (e.g., Berger, 
1966). If the compensatory theory of media- 
tion is correct, then observers who know the 
manual alphabet can use the letter associated 
with each gesture as a mediator; therefore, 
they should engage in less overt mimicry than 
observers who are unfamiliar with the alpha- 
bet. If the complementary theory of media- 
tion is correct, then familiarity with the 
manual alphabet should not be associated 
with a reduction in overt mimicry. 

In addition to varying response familiarity, 
observers were instructed either to learn from 
the model’s performance or merely to watch. 
This was done for two reasons: (a) The ob- 
servers’ spontaneous use of motor and sym- 
bolic mediators may depend on whether they 

intend to learn; and (b) we wanted to pro- 
vide a comparison with previous research. 
Method


Subjects. Thirty-two undergraduates at the University
of Massachusetts—Amherst were included in
this study.2 Twenty-five of these subjects received
course credit for their participation; seven subjects
were volunteers from a course in sign language.

Procedure. Subjects were run individually and
were assigned alternately to either the learning or
watch-only conditions. Since observers tend not to
overtly mimic the model if they are being watched
(Berger, 1966), observers were led to believe that
they were alone.

Briefly, observers were told that the experimenter
had inadvertently signed up two subjects for the
same time period, but that the observer could watch
the experiment over closed-circuit television from
an adjacent “recording” room. After being seated
in the recording room, half of the observers were
told to watch the television monitor and to follow
the instructions given to the subject in the other
room. Since the experiment was described as a study
of nonverbal learning, the experimenter explained
that she would return to test the observer after all
the learning materials were presented. The other
half of the observers were told that they would receive
participation credit if they merely watched
the experiment on their monitor. (Seven observers
could not receive credit. This presented no problem
with three subjects who were assigned to the learning
instruction condition, since they had volunteered
anyway. The remaining four subjects were in the
watch-only condition, and they were told they could
help the experimenter by evaluating her performance.)

The experimenter explained that she would turn on
the camera when she returned to the subject in the
adjacent room. Upon returning to that room, she
started the videotape player and seated herself at
the one-way mirror in order to record the observer’s
mimicry. The observer was led to believe that the
mirror was covered, but the experimenter could see
the observer without being noticed.

The videotape showed the experimenter reentering
the adjacent room and giving the instructions to the
other subject (model). These instructions stated that
the subject had to learn pairs of gestures and that
the subject would be tested after the pairs were presented.
The pairs were presented by means of a
slide projector. Each gesture appeared on a separate
slide for 5 sec, with a blank 5-sec period in
between each pair of gestures. The pairs were gestures
associated with X and Q, R and A, P and ...,
and K and B in the manual alphabet. The observer


2 Forty-nine subjects were recruited for this study.
Eight of these subjects were eliminated because of
procedural errors or because procedural requirements
could not be met. The remaining 41 subjects formed
the pool from which the 32 observers were selected,
as explained in the procedure.
saw a same-sex model perform each gesture once
while it was projected on the screen. 

After all the pairs were shown, but before the
subject was tested, the experimenter explained that 
she had to go into the next room for a moment. The 
videotape was stopped, as if the camera had been 
turned off. She returned to the recording room and 
asked the observer to show her all the pairs that 
the observer could recall. After noting the observer’s 
response, she asked the observer to complete a ques- 
tionnaire. 

Observers were asked to indicate how familiar they 
were with the manual alphabet and (with the aid 
of printed drawings of the gestures used) to indicate
in separate items which gestures they actually
performed, which they formed images of, which they 
identified by letter, and which they represented in 
some “other” way (e.g., words, numbers, etc.).3 The
final questions dealt with the observers’ understand- 
ing of the instructions and whether they were “sus- 
picious about the instructions . . . received at the 
time they were given or while . . . watching the 
TV set”; if observers responded affirmatively, they 
were to answer the following question: “What were 
you suspicious about?” 

According to their questionnaire responses, 16 of 
the 41 observers knew some or all of the gestures 
from the manual alphabet. Seven of these observers 
were volunteers from the sign language course; the 
remaining nine familiar observers may have been 
attracted to the experiment because it was described 
as a study of nonverbal learning involving gestures. 
Eight of these familiar observers (six females and 
two males) were instructed to learn, and the re- 
maining eight observers (all females) were instructed 
to merely watch. In order to obtain equal numbers 
of unfamiliar observers, eight observers (six females 
and two males) under instructions to learn and
eight observers (six females and two males) under 
instructions to merely watch were randomly selected 
from the remaining pool of 25 unfamiliar observers. 


Results 


The number of different gestures performed 
was used as a dependent measure in this ex- 
periment (and the experiments that follow), 
rather than total amount of performance, be- 
cause the former measure was used previously 
(Berger, 1966) and the two measures are 
highly correlated (.85). The experimenter’s 
records of different gestures performed cor- 
related .98 (p< .001) with an independent 
assessment by another person during a pretest 
reliability check and .73 (p < .001) with the
observers’ self-reported mimicry of different
gestures in this experiment. 

In general, the experimenter recorded more 
mimicry of different gestures (M = 5.00) than 
observers reported (M = 3.03). For observers
who were seen to mimic or reported mimicking,
this difference is highly significant, t(24)
= 4.40, p < .001. Observers apparently un- 
derestimated the number of different signals 
they performed. 
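
The comparison above sets each observer's recorded mimicry against that same observer's self-report, so the 24 degrees of freedom are consistent with a paired t test over 25 observers (an inference from the reported statistic, not something the text states). A minimal sketch of that computation follows; the function name and the scores are hypothetical and are not the study's data.

```python
import math


def paired_t(first, second):
    """Paired-samples t statistic and degrees of freedom for two matched score lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased variance of the difference scores.
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("t is undefined when all differences are identical")
    t_value = mean_diff / math.sqrt(var_diff / n)
    return t_value, n - 1


# Hypothetical scores: gestures recorded by the experimenter versus
# gestures the same observers reported performing.
recorded = [5, 6, 8]
reported = [4, 4, 5]
t_value, df = paired_t(recorded, reported)
```

A positive t under this ordering means the experimenter recorded more different gestures than observers reported, which is the direction of the difference described in the text.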

The data did not meet the criteria for anal- 
ysis of variance (ANOVA). The most serious 
violation was the lack of variability in cer- 
tain cells involving unfamiliar observers. 
Therefore, nonparametric tests were used to 
verify each significant effect found in the 
ANOVAs, and only those effects which were
supported by the nonparametric tests are re- 
ported below. The results of the F tests are 
reported, and the significance levels obtained 
from the nonparametric tests are reported in 
parentheses next to F-test levels. All signifi- 
cance tests are two-tailed. 

Self-reported mediation. The mixed design
ANOVA revealed that familiar observers
engaged in more total mediation than un- 
familiar observers, F(1, 28) = 34.16, p< 
.001(<.002), and that there were differences, 
F(3, 84) = 7.56, p < .001(<.01), among the 
observers’ uses of the various mediators (i.e., 
mimicry, images, letters, or “other”). These 
main effects, however, have to be interpreted 
in terms of the Mediators x Familiarity, 
F(3, 84) = 3.30, p < .05, and the Mediator x 
Learning Instruction, F(3, 84) = 4.04, p < 
.01, interactions. 

The means for the Mediation x Familiarity 
interaction are reported in Table 1, along 
with the percentages of observers who en- 
gaged in each type of mediation. A Newman- 
Keuls test of these means indicated that fa- 
miliar and unfamiliar observers did not differ 
in their mimicry or in their use of “other” 
mediators. Familiar observers used more im- 


3 There were two series of questions regarding the 
observers’ mediational activities. The first series 
asked about mediation during the video presentation, 
whereas the second series dealt with mediation fol- 
lowing the presentation. Since the observers’ mimicry 
following the presentation may have been influenced 
by the observers’ expectation that the experimenter 
would return and see them performing, the ques- 
tionnaire responses to the second series were ex- 
cluded from analysis in Experiments 1 and 2 because 
they could have been influenced by this situational 


artifact. 
Table 1
Mediation X Familiarity Interaction: Means and Percentages

                                    Mediator
              Mimicry       Images        Letters       Other         Total
Observer      M      %      M      %      M      %      M      %      M      %
Unfamiliar    2.81   69     .44    13     0      0      .31    13     .89
Familiar      3.25   69     3.38   75     2.75   63     1.25   44     2.66

Note. There were eight gestures, so each mean could range between 0 and 8.


ages, α = .01(<.02), and more letters, α =
.01(<.002), than unfamiliar observers.

Familiar observers engaged in more mimetic 
and imaginal mediation than “other” forms 
of mediation, α = .05(<.05), for both comparisons.
None of the other differences between
mediators for familiar observers was
significant. 

Unfamiliar observers used mimicry more 
than images, letters, or “other” mediators, α
= .01(<.01), for each comparison. There
were no significant differences between images, 
letters, and “other” mediators for unfamiliar 
observers. These observers tended to employ 
mimicry rather than symbolic mediators. 

This analysis supports the complementary 
theory of mediation, because symbolic coding 
did not reduce motor mimicry. Familiar and 
unfamiliar observers engaged in about the 
same amount of mimicry even though familiar 
observers employed significantly more imag- 
inal and letter coding. 

The Newman-Keuls test showed that the Mediator × Learning Instruction
interaction was produced by observers who were instructed to learn
engaging in more mimicry than any other form of mediation by these
observers or by the watch-only observers, α = .01(<.01), for all
comparisons. Table 2 presents the means for this interaction along with
the percentages of observers who engaged in each form of mediation.
There were no differences in symbolic mediation as a function of
learning instructions. Since the instruction to learn increased motor
mimicry but not symbolic mediation, it appears that observers believed
that mimicry would facilitate their learning more than symbolic
mediation.


Table 2
Mediation × Learning Instruction Interaction: Means and Percentages

                                      Mediator
                        Mimicry     Images      Letters     Other       Total
Condition               M     %     M     %     M     %     M     %     M     %
Watch only              1.62  44    1.88  50    1.06  25    .94   31    1.38  81
Learning instructions   4.44  94    1.94  38    1.69  38    .62   19    2.17  100

Note. There were eight gestures, so each mean could range between 0 and 8.


Observed mimicry. The ANOVA based on the experimenter’s records
revealed only a significant main effect of learning instructions,
F(1, 28) = 20.14, p < .001(<.002), and no interaction with familiarity.
Observers who were instructed to learn performed more different
gestures (M = 7.06) than the watch-only observers (M = 2.94). This
finding supports the results of the self-report analysis regarding the
effect of instructions to learn.

Observer learning. The ANOVA of the number of pairs learned yielded
only a main effect for familiarity, F(1, 28) = 10.51, p < .005(<.02),
and no interaction with learning instructions. Familiar observers
learned more pairs (M = 1.69) than unfamiliar observers (M = .44).

Correlations between mediators and between
mediators and number of pairs learned.
None of the correlations between the self-reported
mediators was significant. Unfamiliar
observers’ self-reported mimicry correlated
.62 (p < .05) with pair learning, and these
observers’ observed mimicry correlated .53 
(p < .05) with pair learning. For familiar ob- 
servers, letter coding correlated .55 (p < .05) 
with pair learning, and their imaginal coding 
correlated —.51 (p < .05) with pair learning. 
Because of a lack of variability, it was not 
possible to correlate unfamiliar observers’ 
symbolic coding (i.e., images, letters, and 
“others”) with pair learning. Familiar ob- 
servers’ “other” coding, self-reported mimicry, 
and observed mimicry were not significantly 
correlated with pair learning. 
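
The significance levels attached to these correlations can be checked with the usual t transformation of a Pearson r. The following is only a sketch under stated assumptions: the group size of 16 observers per familiarity condition is inferred from the error degrees of freedom of the ANOVAs and is not stated in this passage, the critical value is taken from a standard t table, and the function and variable names are illustrative.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """Return the t statistic for testing a Pearson r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Assumption: 16 observers per familiarity group (inferred, not given in the text).
N_PER_GROUP = 16

# Two-tailed critical t for alpha = .05 with 14 df, from a standard t table.
T_CRIT_05_DF14 = 2.145

# Correlations with pair learning reported as significant in the text.
REPORTED_R = {
    "unfamiliar, self-reported mimicry": 0.62,
    "unfamiliar, observed mimicry": 0.53,
    "familiar, letter coding": 0.55,
    "familiar, imaginal coding": -0.51,
}

for label, r in REPORTED_R.items():
    t = t_from_r(r, N_PER_GROUP)
    verdict = "exceeds" if abs(t) > T_CRIT_05_DF14 else "does not exceed"
    print(f"{label}: r = {r:+.2f}, t(14) = {t:+.2f}, {verdict} the critical value")
```

Under the assumed group size, each of the four reported coefficients yields an absolute t larger than the tabled critical value, which is consistent with the p < .05 levels given above.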

These correlations suggest that in the absence
of other forms of coding and well-practiced
responses, motor mimicry prevails as an
effective mediator in learning; but given a
readily available code that is associated with
well-practiced responses, symbolic coding is
more effective than mimicry.

Manipulation checks and suspiciousness.
Most observers correctly recalled the instructions
given them. Approximately one third
of the observers instructed to learn and one
third of the observers instructed to merely
watch reported being suspicious. The major
sources of suspicion were the cover story,
particularly with regard to whether the other
subject was real and whether observers were
being watched. None of the observers guessed
the actual purpose of the study.


Discussion 


The finding that unfamiliar observers relied
almost exclusively on motoric mediation contrasts
sharply with the familiar observers’
use of a variety of mediators. Unfamiliar observers
could have used imaginal and other
forms of coding, but they did not do so. An
explanation for the familiar observers’
behavior, as well as for similar
results in the studies that follow, will be
presented in the final section of this article.
There was a significant negative correlation
between imaginal coding and learning,
and a tendency (which was not significant) 
for imaginal coding to be negatively correlated 
(—.44) with letter coding for familiar ob- 
servers. Apparently, letter coding was a more 
effective form of mediation than imaginal cod- 
ing for these observers. Perhaps familiar ob- 
servers who chose to code imaginally did not 
have time to letter code as well, and their 
learning suffered as a consequence. Other stud- 
ies report that imaginal coding facilitates ob- 
servational learning (e.g., Gerst, 1971; Ito, 
1975; Jeffery, 1976). These studies may have 
provided more time for mediation, but they 
also differed from the present study in several
other ways. Therefore, it is not possible
to identify the basis for these inconsistent
findings.


Experiment 2: Symbolic Mediation With
Relatively Available Verbal Codes 4


In Experiment 1, familiar observers’ learn- 
ing was positively correlated with letter cod- 
ing. In order to determine whether this cor- 
relation reflected a causal relationship, an 
attempt was made in Experiment 2 to bring 
the observers’ use of verbal coding under ex- 
perimental control by either exposing ob- 
servers to a verbal coding model (verbal cod- 
ing condition) or instructing them to use 
verbal codes to help them learn the model’s 
behavior (instruction condition).

Exposure to a model who engages in overt verbal coding may have one or
more consequences for observer mediation and learning. The model’s
coding behavior may suggest a learning strategy to observers who might
not otherwise use verbal coding. Furthermore, the model’s coding
behavior may provide verbal codes to observers who intend to use verbal
codes but cannot think of any. Finally, it also is possible that the
model’s coding behavior will elicit unintentional imitation and,
thereby, facilitate observational learning. No matter which of these
processes is involved, there is ample reason to believe that a verbal
coding model will increase the observers’ verbal coding.


4 The authors of Experiment 2 were Seymour M. Berger, Linda L. Carli,
Judith F. Karshmer, M. Estela Sanchez, and Kathy S. Hammersla.

Instructing observers to code verbally is another way to increase the
observers’ coding behavior, provided that they have an available verbal
code. Since several studies of observational learning have used such
instructions (e.g., Bandura et al., 1966; Bandura & Jeffery, 1973;
Gerst, 1971; Jeffery, 1976; van Hekken, 1969), it seemed desirable to
compare this treatment with treatments involving a verbal coding model
and a nonverbal coding model.

A secondary purpose of this investigation was to demonstrate that the
mimicry phenomenon has some degree of generality. Instead of gestures
from the manual alphabet, gestures commonly used in United States
culture were selected, because all observers probably had performed
these gestures previously and would have a readily available word or
phrase that they could associate with each gesture (e.g., “come here,”
“good-bye,” “stop”). Thus, observers in this experiment were comparable
to familiar observers in Experiment 1 in that they had a highly salient
symbolic code available for mediation.

It was hypothesized that observers exposed 
to a model who performed the gestures but 
did not engage in overt verbal coding (non- 
verbal coding condition) would use less verbal 
coding and would recall fewer pairs and fewer 
responses than observers in either the verbal 
coding or instruction conditions. It could be 
argued, of course, that almost all (if not all) 
observers would respond to the gestures with 
the verbal codes regardless of the treatments 
they received, because the verbal codes are 
readily elicited when the gestures are per- 
formed. This ceiling effect was not expected 
to occur, however, because the model per- 
formed the responses relatively rapidly, thus 
reducing the observers’ time to react with an 
appropriate verbal code. No differences were 
expected between conditions in the amount of 
mimicry, since the model’s performance was 
the same in all conditions and Experiment 1 
demonstrated that observer mimicry was in- 
dependent of the observers’ use of symbolic 
coding. 
Method


Subjects. Sixty undergraduates were recruited from a variety of
undergraduate courses in psychology at the University of
Massachusetts—Amherst.5 Subjects received extra course credit for their
participation.

Procedure. Observers were run individually by one of four different
female experimenters. Each experimenter ran five subjects in each of
the three conditions.

The procedure for this experiment was similar to that used in
Experiment 1. Observers were asked to learn pairs of gestures and were
told that two subjects were being run at the same time for purposes of
efficiency. The gestures would be presented by means of videotape to
the subject in the adjacent room, but the observer would be able to see
the presentation over closed-circuit television, because [...]. The
observer was seated in front of a small television monitor located
directly in front of a one-way mirror that appeared to be completely
covered.

After giving the instructions, the experimenter left the room,
explaining that she had to turn on the camera and the videotape
rec[...] room (observers saw this room and a videotape [...] when they
arrived for the experiment). She stated that she would return to test
the observer after the presentation. The experimenter actually went to
the adjacent room, turned on the videotaped model presentation, and
positioned herself behind the one-way mirror, which afforded her a
clear view of the observer through a small uncovered space at the
bottom of the mirror.

After recording any gestures performed, she turned off the videotape,
returned to the observer’s room, asked the observer to perform the
pairs of gestures, and recorded all gestures correctly performed.
Observers then completed a questionnaire similar to the one described
in Experiment 1, but with drawings of the new gestures. On this
questionnaire, observers also were asked to specify which gestures they
associated with words and the words they used.

The videotaped gesture presentation began with a medium close-up
picture of a large television set on top of a desk. This videotaped
television set was used to present additional instructions and gestures
[...] were actually performed. The instructions, [...] a male voice,
indicated that the experiment was about to begin. The subject would see
three pairs of gestures. Each gesture would be presented separately,
and the first member of each pair would be followed by the second
member of that pair, with a space appearing between the pairs of
gestures. At the end of these instructions, the male voice asked the
subject to get ready. At this point, the female model (presumably, the
other subject) moved into the picture that the observer saw, as if she
were leaning toward the television set in front of her to get a better
view. In all conditions, this model performed each of the gestures
immediately after it was presented.


5 Because incomplete records were kept of the sex of the subjects, the
distribution of males and females cannot be given with complete
certainty. It appears that there were an equal number of males and
females in each condition. Nine additional subjects were run, but these
were excluded from the analysis either because their data were
unintelligible or because they did not meet procedural requirements.

The six gestures used were culturally common 
signs for “come here” (a slightly closed hand with a 
beckoning index finger), “OK” (index finger and 
thumb forming a circle, with the remaining fingers 
vertical and slightly separated), “good-bye” (thumb 
stationary while the remaining fingers bent down 
and up), “you” (index finger pointing directly to- 
ward the viewer, with remaining fingers forming a 
fist), “stop” (palm facing viewer, with all fingers 
vertical, thrust toward the viewer), and “two” (index 
and middle finger forming a vertical V, with thumb 
overlapping remaining bent fingers). 

The gestures were presented in this sequence so 
that each gesture appeared on the screen for about 
sec. There was a 2-sec interval between pair mem- 
bers and a 34-sec interval between the pairs. 
Observers were assigned to one of three conditions in the order in
which they appeared for the experiment. Observers assigned to the
verbal coding condition (VC) heard the female model say the word
associated with the gesture as she performed each gesture. Observers
assigned to either the nonverbal coding condition (NC) or the
instruction condition (IC) saw a tape in which the model performed each
gesture but did not say anything; however, IC observers received the
following instruction from the experimenter prior to the videotaped
presentation: “Each gesture can be associated with a common English
word. Use the word you associate with each gesture to help you learn
the pairs.”


Results 


The reliability of the four experimenters’ records of the number of
different gestures performed was assessed by having two pairs of these
experimenters watch a different set of observers. The reliability
correlation for each pair of experimenters was .99 (p < .001), based on
samples of 10 and 15 observers. The experimenters’ records correlated
.77 (p < .001) with the observers’ self-reported mimicry of different
gestures.

As in Experiment 1, observed mimicry of different gestures (M = 4.13)
was greater than self-reported mimicry (M = 3.37). This difference,
based on observers who mimicked (observed or reported), was
significant, t(39) = 2.63, p < .02.

The observers’ self-reported use of “other” mediators (e.g., numbers,
letters, etc.) was not included in the data analysis because only four
observers reported using “other” mediators (one in VC, one in IC, and
two in NC). This also eliminated the ANOVA problem encountered in
Experiment 1, because of cells having little or no variability. All
significance tests are two-tailed.

Self-reported and observed mediation. The mixed design ANOVA yielded a
main effect for mediators, F(2, 114) = 5.89, p < .01, and a significant
Conditions × Mediators interaction, F(4, 114) = 3.97, p < .01. The
means for these significant effects are reported in Table 3, along with
the percentages of observers using the mediators.


Table 3
Means and Percentages for Self-Reported Mediation: Experiment 2

                                  Mediator
                     Mimicry          Images      Words       Total
Condition            M           %    M     %     M     %     M     %
[two rows illegible in source]
[label illegible]    [illegible]      3.00  75    4.20  90    3.57  100
Total                3.37        82   2.48  67    3.65  87    3.17  100

Note. There were six gestures, so each mean could range between 0 and 6.


The main effect for mediators must be interpreted in terms of the
Conditions × Mediators interaction. In order to carry out 18 preplanned
comparisons of the means involved in this interaction, the Bonferroni t
(see Myers, 1978) familywise error rate was set at .10; thus the alpha
level for any specific comparison was .006. Looking first at the
pattern of mediation within each condition, one can see that VC
observers used significantly more words than either images or mimicry;
the difference between images and mimicry was not significant. The NC
observers used significantly more mimicry than images; their use of
mimicry and images did not differ significantly from their use of
words. There were no significant differences in the IC observers’ use
of these mediators. Thus, each condition manifests a somewhat different
pattern of self-reported mediation; the observers’ use of images tended
to be relatively low in all conditions, which accounts for the mediator
main effect.
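
The per-comparison alpha level cited for the 18 preplanned comparisons follows from dividing the familywise error rate by the number of comparisons. The sketch below illustrates only that arithmetic; the function name is illustrative and is not taken from the source or from Myers.

```python
def bonferroni_alpha(familywise_alpha: float, n_comparisons: int) -> float:
    """Return the per-comparison alpha under the Bonferroni procedure."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return familywise_alpha / n_comparisons


# Values stated in the text: familywise rate of .10 spread over 18 comparisons.
per_comparison = bonferroni_alpha(0.10, 18)
print(f"alpha per comparison = {per_comparison:.4f}")
```

Rounded to three decimal places, the result agrees with the .006 level reported for each specific comparison.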

The comparisons across conditions for
particular mediators are most relevant to the
main purpose of the study. As predicted, NC 
observers used significantly fewer words than 
either VC or IC observers; the difference be- 
tween VC and IC observers’ use of words was 
not significant. There were no significant dif- 
ferences in the comparisons between condi- 
tions for images or for mimicry. (This is a 
somewhat conservative test, because compari- 
sons within the rows are not independent of 
the comparisons within the columns of means 
reported in Table 3.) 

An ANOVA using the experimenters’ reports 
of observed mimicry also showed no signifi- 
cant difference between conditions, F(2, 57) 
= 1.26. Therefore, the hypothesis regarding 
mediation is supported, because exposure to 
a verbal coding model or being given instruc- 
tions to use a verbal code facilitated the ob- 
servers’ use of verbal coding as compared to 
observers who were exposed to a nonverbal 
coding model. Also, it is clear that observers 
spontaneously mimic these cultural gestures, 
so mimicry is not limited to gestures from the 
manual alphabet. 

A separate analysis was conducted to de- 
termine whether observers who used verbal 
codes tended to use exactly the same words 
for gestures as the VC model. The mean per- 
centage of words that were exactly the same 
as the VC model’s words was 90% for VC 
observers, 66% for NC observers, and 67% 
for IC observers. 


BERGER, CARLI, HAMMERSLA, KARSHMER, SANCHEZ 


Observer learning. The ANOVA comparing the number of pairs learned
between conditions yielded F(2, 57) = 2.79, which is not significant.
Nevertheless, comparisons between pairs of conditions were carried out,
because specific predictions were made regarding the pattern of
differences. The trend in the means was as predicted. The mean number
of pairs learned by NC observers (1.05) was less than the mean number
of pairs learned by VC observers (1.55) and by IC observers (1.7).
However, the Bonferroni t with a familywise error rate set at .10 for
the comparison of three means revealed that only the difference between
the number of pairs learned by NC and IC observers was significant
(α = .03 for each specific comparison).

A second measure of learning involved merely counting the number of
different gestures recalled by each observer, regardless of whether
they were properly paired. The ANOVA for this measure of learning
yielded a highly significant conditions effect, F(2, 57) = 8.55,
p < .001. The NC observers recalled significantly fewer gestures (3.95)
than either the VC observers (4.85) or the IC observers (5.00) (the VC
and IC observers’ learning was not significantly different). These were
highly significant differences: The error rate for each comparison was
less than .003, using the Bonferroni t. The learning hypothesis is
strongly supported by this analysis of number of gestures learned and
is partially supported by the analysis of pair learning.

Manipulation checks and suspiciousness. The manipulation checks
indicated that most observers correctly recalled the instructions they
received and the model’s performance of gestures and words (in the VC
condition). As in Experiment 1, approximately one third of the
observers indicated that they were suspicious. These observers were
asked, “What were you suspicious about and why did you become
suspicious?” Most of these observers questioned whether the other
subject was real. Some observers reported that they were suspicious
about all psychology experiments. A few observers thought that someone
might be watching them; two thought that someone might be watching to
see how they learned.
Discussion


The lack of a clear conditions main effect for number of pairs learned
is somewhat surprising given the highly significant conditions main
effect for number of gestures learned. Perhaps a stronger conditions
effect for pair learning might have been achieved if the learning task
had included more pairs, longer exposure to the gestures, or a longer
interval between the pairs of gestures. Several observers complained
that the exposure time was too short for them to think of words and to
make the necessary associations; other observers complained that the
distinction between pairs was not clear in the videotaped presentation.


Experiment 3: Symbolic Mediation With
Relatively Unavailable Verbal Codes 6


In Experiments 1 and 2, the use of symbolic mediators (i.e., letters
and words, respectively) was related to increased learning. This
relationship existed only when observers had symbolic codes available
for the model’s behavior. Experiment 3 investigated the influence of
verbal mediators when observers had no prior symbolic (verbal) code for
the model’s behavior.

In Experiment 1, the unfamiliar observers did not engage in symbolic
coding; their learning was related to their mimicry of the model’s
responses. Without prior practice in performing the model’s responses
and without salient verbal codes for these responses, it seemed
unlikely that unfamiliar observers would profit from the use of
symbolic codes. In order to test this hypothesis, observers who were
unfamiliar with the manual alphabet were instructed to learn certain
pairs of gestures from the alphabet. Half of the observers also were
instructed to associate words with gestures to help them learn
(instructed condition); the remaining observers did not receive these
instructions (noninstructed condition). Observers in the instructed
condition were expected to engage in more verbal mediation than
observers in the noninstructed condition, but no differences were
expected between these conditions in learning. Furthermore, since
mimicry was found to be unaffected by verbal mediation in Experiments 1
and 2, no differences in mimicry were expected. Observers in the
instructed condition, however, were expected to engage in more imaginal
and “other” types of coding than those in the noninstructed condition,
because the results of Experiments 1 and 2 seemed to indicate that
those observers who engaged in symbolic coding also engaged in imaginal
and “other” forms of coding.


Method


Subjects. Forty undergraduates were recruited from a variety of
undergraduate summer session courses at the University of
Massachusetts—Amherst.7 All subjects received course credit for their
participation.

Procedure. Observers were run individually by one of three different
female experimenters. Two of these experimenters each ran 7 observers,
and one ran 6 observers in each condition, so that there were 20
observers in each condition. The recruiting folder stated that the
study involved gesture learning and that students who were familiar
with the manual alphabet for the deaf should not participate.

Observers were assigned to conditions in the order in which they
appeared for the experiment. All observers were brought into the
subject room and told that this was a study of nonverbal learning. They
were asked to learn pairs of gestures that would be presented by means
of videotape. It was explained that the videotape showed a subject in a
previous experiment. Only one tape was used—the one employed in
Experiment 1 showing a female model. The instructions given to all
observers were those used in the learning condition in Experiment 1.

The 6 male and 14 female observers assigned to the instructed condition
(IC) were given the following additional instruction: “Previous
experiments have shown that if you associate words with gestures it
will help you learn. So please use this form of verbal coding to help
you learn.” The remaining 7 male and 13 female noninstructed observers
(NC) did not receive this added instruction.

The one-way mirror in the room was completely 
covered by a fiberboard. The videotape deck was 
located in the subject’s room alongside the small tele- 
vision monitor. The experimenter explained that she 
would turn on the deck and leave the room so as not 
to distract the observer during the presentation and 


6 The authors of Experiment 3 were Seymour M. Berger, M. Estela
Sanchez, Judith F. Karshmer, and Linda L. Carli.

7 Two additional subjects were excluded from the data analysis because
they did not meet the procedural requirements.
Table 4
Means and Percentages for Self-Reported Mediation: Experiment 3

                              Mediator
                Mimicry     Images      Words       Other       Total
Condition       M     %     M     %     M     %     M     %     M     %
Noninstructed   5.50  100   2.70  55    .40   15    1.65  35    2.56  100
Instructed      4.55  80    2.40  55    3.25  65    1.20  30    2.85  100
Total           5.03  90    2.55  55    1.83  40    1.43  33    2.71  100

Note. There were eight gestures, so each mean could range between 0 and 8.


that she would return after the presentation was over to test the
observer’s learning of the pairs of gestures. If there were no
questions, the experimenter started the videotape and left the room,
leaving the door slightly ajar. In this way, she could hear when the
presentation ended and return to the room to test the observer.

After the test, in which she recorded each gesture correctly performed,
the experimenter asked the observer to complete the questionnaire. The
questionnaire was the same as that used in Experiment 2, except that
there were no questions about mediation after the television
presentation; the appropriate gestures from the manual alphabet were
substituted for cultural gestures.


Results 


Self-reported mediation. The mixed design
ANOVA revealed a significant main effect for
mediators, F(3, 114) = 17.76, p < .001, and
a significant Conditions × Mediators interaction,
F(3, 114) = 5.10, p < .005. The means
and percentages involved in these comparisons
are reported in Table 4.
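
The degrees of freedom attached to these F ratios can be reproduced from the sample sizes given in the Method sections. The sketch below assumes a standard two-way mixed design with conditions as the between-subjects factor and mediators as the within-subjects factor, and no sphericity correction; that design is an assumption, since the text describes the analysis only as a mixed design ANOVA, and the function name is illustrative.

```python
def mixed_anova_df(n_subjects: int, n_groups: int, n_levels: int) -> dict:
    """Degrees of freedom for the within-subjects tests of a two-way mixed ANOVA.

    n_groups is the number of between-subjects conditions and n_levels is the
    number of within-subjects levels (here, mediators).
    """
    return {
        "within": n_levels - 1,
        "interaction": (n_groups - 1) * (n_levels - 1),
        "error_within": (n_subjects - n_groups) * (n_levels - 1),
    }


# Experiment 3: 40 observers, 2 conditions, 4 mediators.
print("Experiment 3:", mixed_anova_df(40, 2, 4))
# Experiment 2: 60 observers, 3 conditions, 3 mediators ("other" excluded).
print("Experiment 2:", mixed_anova_df(60, 3, 3))
```

Under these assumptions the sketch gives 3 and 114 degrees of freedom for both the mediator effect and the interaction in Experiment 3, and 2, 4, and 114 for Experiment 2, which agrees with the F ratios reported for the two experiments.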

The Conditions × Mediators interaction indicates that both conditions
showed a similar pattern of mediation except with regard to the use of
words. As expected, the IC observers used significantly more words than
NC observers. Within [...] mimicry was [...] “others,” [...] of images
[...] on a Bonferroni t test with a familywise error rate set at .10,
so that the alpha level [...] for the 16 specific comparisons.
Observers’ learning. Separate ANOVAs for the two learning measures
(i.e., number of pairs learned and number of gestures learned) did not
yield significant differences between conditions. In pair learning, the
mean number of pairs learned for NC observers was 1.25; for IC
observers it was 1.35. The mean for number of gestures learned was
4[...] for NC and 5.40 for IC observers.

Manipulation checks and suspiciousness. The manipulation checks
revealed that the vast majority of observers correctly recalled the
instructions they received and that the model had performed the
gestures. [...] observers were suspicious about the [...]tions. Three
of these observers were suspicious because they were not given
sufficient [...]tion for the learning task, and two were suspicious
because the distinction between pairs of gestures in the videotaped
presentation was not clear.


Discussion 


Given the absence of both imaginal and “other” forms of coding found in
Experiment 1 for unfamiliar observers, the occurrence of both of these
forms of coding in Experiment 3 among NC observers was unexpected.
Since observers in Experiment 3 were told that they were watching a
videotape of a previous subject, perhaps the experimenter’s selection
of this particular subject’s performance [...] a demand to mimic as
well as app[...] the use of coding strategies in general. Another
possibility is that observers pay more attention
to a “live” model (Experiment 1)
and, therefore, are somewhat distracted from 
considering or engaging in symbolic coding 
when they have no available symbolic code. 
Finally, the summer school undergraduates re- 
cruited for Experiment 3 may have been 
more motivated or more skilled in learning 
strategies than regular-session students; sev- 
eral subjects were somewhat older than tradi- 
tional college undergraduates. Whether mode 
of presentation or subject selection accounts 
for these differences in mediation between Ex- 
periments 1 and 3 cannot be determined with- 
out further investigation. 


General Discussion 


Although observational learning was related
to symbolic coding (i.e., letters in Experiment 1
and words in Experiment 2)
among observers with a salient code for the 
model’s behavior, these observers also en- 
gaged in a considerable amount of mimicry 
and imaginal coding that had no apparent re- 
lationship to their learning. It is clear from 
the observers’ questionnaire responses that 
these observers mimicked and imaginally 
coded because they believed that these forms 
of mediation would help them learn. 

Since these observers had learned to per- 
form the gestures previously, it is not surpris- 
ing that the additional mimetic practice failed 
to directly enhance their learning. It is pos- 
sible that mimicry was an unintentional reac- 
tion to the model’s performance, because the 
model performed the gestures in all conditions. 
This interpretation would account for mimicry 
but not for imaginal coding. 

Perhaps a dual or multiple-stage coding pro- 
cess is involved, similar to the one suggested 
by Bandura and Jeffery (1973) and Jeffery 
(1976). For observers with a highly salient 
and well-learned letter or word code, it may 
be more efficient for them to store and retrieve 
the model’s behavior in terms of these symbolic codes. Use of mimicry
(or other nonverbal codes) may help observers retain, temporarily,
those model responses which they are unable to letter- or word-code
immediately. Thus, using mimicry (or other nonverbal codes) provides
more time for observers to employ their well-learned letter or word
codes for these behaviors.

It also is possible that the self-report mea- 
sures of mediation are not valid indexes of 
the observers’ actual mediational behavior. 
Although substantial correlations were found 
in Experiments 1 and 2 between observed and 
self-reported mimicry, self-reported mimicry 
was significantly lower than observed mimicry. 
Furthermore, even observed mimicry tends 
to underestimate the total amount of rele- 
vant motor activity in this type of situation, 
as revealed by electromyographic recordings 
(Berger et al., 1970). Nevertheless, even 
though observed and self-reported mimicry 
measures probably lack absolute accuracy, 
these measures are assumed to be reasonably 
sensitive to relative differences in the ob- 
servers’ mimetic activities. They both show 
comparable relationships with observational 
learning in Experiments 1 and 2 (where both 
measures were obtained). If the self-report 
measure of mimicry is considered to have 
some degree of validity, then it is reasonable 
to assume that self-reports of other forms of 
mediation have some degree of validity as 
well. 

Finally, two additional points should be 
noted. The correlations reported in these 
studies probably are attenuated because of 
the low variability in the measures of media- 
tion and learning. The test of learning was 
administered within a few minutes after the 
observers’ exposure to the model. The long- 
term effects of mediation in learning, which 
are of considerable interest to observational 
learning theorists, were not examined in these
experiments.
Prior research has indicated that although negative mood-induction
procedures reliably lead to enhanced helping in adults, such procedures
do not produce increased helping in young children. Consistent with the
negative state relief model, it was expected that, relative to neutral
mood subjects, children in a negative mood would be more generous if
the helping opportunity offered the potential for direct reward through
social approval. This expectation was supported in a pair of studies
wherein first- through third-grade children were asked to imagine
either neutral or sad experiences and were then given the opportunity
to be charitable either in public or in private. Experiment 2 provided
evidence that the enhanced public helping of negative mood subjects is
more parsimoniously interpreted as an attempt to remove negative mood
than to repair public image. Finally, a three-step account of the
development of altruism as a self-reward is proposed.


[...] comments on an earlier version of this article. Thanks are also
due the administration of Kyrene del [...] and Broadmor Elementary
Schools, Tempe, [...] and Herbert Hackett.
Requests for reprints should be addressed to Douglas T. Kenrick,
Department of Psychology, Montana State University, Bozeman, Montana
59717.

[...] Psychological Association, Inc. 0022-3514/[...]


A large number of studies have shown increased helping as a function of
experiences that are likely to have induced negative moods in adult
subjects (e.g., Aderman & Berkowitz, [...]; Apsler, 1975; Filter &
Gross, 1975; [...] & Doob, 1968; Freedman, Wallington, & Bless, 1967;
Greenglass, 1969; Kidd & [...], 1976; Konečni, 1972; Rawlings, [...];
Steele, 1975). Cialdini and his associates (Cialdini, Darby, & Vincent,
1973; Cialdini & Kenrick, 1976) have suggested a negative state relief
model of altruism to explain these data. According to this notion,
altruistic behavior is self-gratifying and will serve to alleviate
temporary negative mood
states in the same way that any other mood- 
enhancing experience will. Such a notion is 
consistent with research demonstrating that 
negative mood will increase the likelihood of 
other forms of self-reward (Rosenhan, Under- 
wood, & Moore, 1974; Underwood, Moore, & 
Rosenhan, 1973) and receives direct support 
from the study by Cialdini et al. (1973). 


These authors found that although a negative 


mood-inducing experience generally led to 


increased helping, this relationship did not 
hold for subjects who received a gratifying 
event (money Or praise) between the nega- 
tive experience and the helping opportunity. 
Apparently, the negative affective state was 
cancelled for these subjects by the applica- 
tion of a reinforcer, and it was no longer nec- 
essary to employ the helping response for 
that purpose. 

In contrast to the studies showing a general tendency for mood-lowering
procedures to lead to increased helping, there is a body of data indicating
that such procedures typically lead to some degree of decreased helping
(e.g., Isen, Horn, & Rosenhan, 1973, Studies 1 and 3; Moore, Underwood, &
Rosenhan, 1973; Rosenhan et al., 1974; Underwood, Froming, & Moore, 1977).
Cialdini and
Kenrick (1976) sought to reconcile these 
seemingly discrepant findings by pointing out 
that the data showing a direct relationship 
between lowered mood and helping have come 
from adult subjects, whereas the data failing 
to show such a relationship have generally 
come from work on children. Cialdini and 
Kenrick reviewed evidence to support the 
position that altruism acquires its self-rein- 
forcing qualities as a function of the socializa- 
tion process, wherein helping behavior is sys- 
tematically paired with various forms of di- 
rect reinforcement (e.g., praise, reciprocated 
aid, thanks, etc.). On the basis of that re- 
view, it was expected that young children 
who had not completed the socialization ex- 
perience would not find helping behavior grat- 
ifying and, hence, would not use it to dispel 
a negative mood state. However, it was ex- 
pected that with age, individuals would in- 
creasingly employ prosocial activity as a self- 
gratifier when in a negative mood. In support 
of these expectations, Cialdini and Kenrick 
found that the relationship between negative 
mood and helping progressively changed from 
an inverse to a direct one over three groups 
of increasingly older subjects, only the oldest 
(high school students) of whom showed sig- 
nificantly enhanced helping under negative 
mood. 


Normative Versus Autonomous Altruism 


One implication of the negative state re- 
lief formulation is that young children will 
not show enhanced helping while in a nega- 
tive affective state unless the helping leads 
to some kind of direct, extrinsic reward (e.g.,
social approval). Rosenhan (1970) has suggested
that altruism can be divided into two
categories: normative altruism, based on sanc- 
tions associated with social norms, and au- 
tonomous altruism, which occurs in the ab- 
sence of external reward and which must be 
internalized. However, in the above-cited 
studies investigating the influence of negative 
mood on children’s helping, only autonomous 
altruism has been examined, since the oppor- 
tunity to behave charitably has always been 
anonymous. In keeping with the negative
state relief formulation, we would expect that,
were the helping situation to provide the
opportunity for normative social reward, negative
mood would increase help giving in young
children.

There are data to suggest that primary school children are aware of the
normative sanctions for helping. For example, Ugurel-Semin (1952) found in
6- to 9-year-old children an increasing recognition of the pressures exerted
by norms and customs on the decision to share. Similarly, Eisenberg-Berg (in
press) found a substantial tendency for 7- and 9-year-olds to employ social
approval reasons as justification for prosocial action. Further, Bryan and
Walbek (1970) found that the majority of their 8- to 9-year-old subjects
verbally indicated to an adult experimenter knowledge of the social
desirability of helping. These authors found, however, that their subjects’
verbalizations about the appropriateness of helping did not predict their
actual generosity in the absence of the experimenter. Thus, it appears that
while young children are aware that social approval is likely to follow their
prosocial activity, they have not yet internalized the norms for helping.

The present investigation was designed to test the hypothesis that primary
school children should show increased helping as a function of negative mood
induction in a situation involving normative (or publicly observable)
altruism, since the potential for external reinforcement would be present in
the form of social approval. In Experiment 1, first- through third-grade
children were given either a private or a public opportunity to behave
charitably. In the public condition an adult was present during an opportunity
for the child to contribute some rewards to other children. In the private
condition, no adult was present during this time. In accord with the negative
state relief model, it was expected that negative mood would produce
significant increases in the helping behavior of children, as it does for
adults, but only in the public condition. That is, negative mood would serve
to increase normative altruism but not autonomous altruism, since children at
this age would not be expected to have yet internalized the reward properties
of prosocial behavior.


Experiment 1 


Method 


Subjects. Subjects were 72 first- through third- 
grade children (35 males, 37 females) from an ele- 
mentary school in a middle-class suburban area. 

Procedure. The procedure used in the present in- 
vestigation was identical to that used by Cialdini 
and Kenrick (1976) with two exceptions: the inclu- 
sion of a brief check on the mood manipulation and 
the introduction of a public helping condition. 

Briefly, subjects were brought individually to an 
experimental room containing a table and chairs, 
books, a tape recorder, and a box labeled “Coupons 
for other students.” Subjects were told that they 
would be participating in two separate tasks, an 
initial “hearing task” and a subsequent “imagina- 
tion task.” They were told that for their participa- 
tion they would receive several coupons that could 
be exchanged for a valuable prize at the end of the 
week. It was stressed that larger accumulations of
coupons would result in better prizes. The experi- 
menter directed the subjects’ attention to the sealed 
box and explained that they could donate some of 
their winnings to the other students who would not 
get a chance to participate in the experiments. 

Subjects then performed a hearing task and were 
given 10 coupons for their participation. The first 
experimenter then left, and a second experimenter, 
always of the same sex as the subject, entered and 
began the imagination task, which constituted the 
mood manipulation. Each subject received 5 coupons
before the onset of this task. 

Mood manipulation. Half the subjects (negative 
mood) were asked to reminisce about a sad experi- 
ence, and half were asked to imagine a book and a 
chair (neutral mood). This was done in a manner 
identical to that of Cialdini and Kenrick (1976), 
except that following the mood manipulation, subjects were asked, “Did what
you just thought about make you feel better than you usually feel, about
the same as you usually feel, or worse than you usually feel?” The sequence
in which these possibilities were presented was counterbalanced across
subjects. Responses were coded on a 1-3 scale, with 1 indicating the most
positive response.

Helping opportunity. After the mood manipulation, the second experimenter
left the room, and the first experimenter (blind to the subject’s mood
condition) returned. In the private condition, the experimenter explained
that the tasks were done and ([...] resetting the tape recorder) thanked
the subject and said there would be a short delay before the subject would
be accompanied back to the classroom. The experimenter again called
attention to the box labeled “Coupons for other students,” reminded the
subject of the option to share coupons, and left the subject alone for 90 sec.

In the public condition, the experimenter per- 
formed the same behaviors except that instead of 
leaving after the reminder, he or she remained in 
the room and observed the subject’s behavior during 
the 90-sec helping opportunity. 

For ethical reasons, the experimenter had all negative
mood subjects imagine happy events at the
completion of the experiment. A second manipula- 
tion-check procedure indicated that the negative 
mood was removed as a result. Subjects were then 
asked not to discuss the tests with any of their 
classmates until the end of the week. 

Independent variables. Three factors were varied: 
mood (negative vs. neutral), type of helping op- 
portunity (public vs. private), and sex of subject. 

Dependent variables. Two dependent variables 
were employed. The subjects’ responses to the mood 
manipulation check constituted one dependent vari- 
able. The major dependent variable was the mean 
number of coupons contributed per cell. 

Predictions. It was expected that for our young 
subjects, helping would be substantially enhanced 
only when they were in a negative mood and the 
possibility of external reward existed. Consequently, 
the negative mood/public condition was predicted 
to produce more helping than the other three condi- 
tions, which were not expected to differ among themselves.


Results 


A 2 × 2 × 2 analysis of variance revealed
that sex of subject had no effect on either de- 
pendent variable; thus the means for both 
the number of coupons contributed and the 
mood manipulation check are collapsed over 
sex and are presented in Table 1. 

Table 1
Mood and Helping Scores for Experiment 1

                          Type of helping opportunity
                          Private               Public
                      Neutral  Negative    Neutral  Negative
Measure                mood     mood        mood     mood
Mood check             1.72     2.06        1.50     2.55
M coupons
  contributed          1.44     1.22        1.50     3.50

Note. Range is 1-3 on mood check (1 = positive mood, 3 = negative mood)
and 1-15 on coupons contributed. n per cell = 18.

Manipulation check. The results of the analysis of variance on the
mood-check item showed the predicted main effect for the mood induction.
Mean values (displayed in Table 1) suggest that negative mood subjects
indicated more negative feelings as a result of their ideations,
F(1, 64) = 17.08, p < .001. In addition, there was an unexpected
interaction of mood induction with type of helping opportunity,
F(1, 64) = 4.32, p < .05, suggesting that the mood manipulation, although
producing a similar effect for both public-condition and private-condition
subjects, was somewhat stronger in magnitude for subjects who were
subsequently given a public helping opportunity. No other effects were
significant for this variable.

Coupon sharing behavior. Since the pre- 
diction was made that the largest number of 
coupons would be contributed during a public 
helping opportunity following a negative mood 
induction, an a priori contrast (Hays, 1963, 
p. 466) was performed to test this prediction. 
This contrast pitted the public negative mood 
group against the mean of all other groups. 
In addition, the residual between-groups vari- 
ance was tested to determine whether any 
remaining effects due to the independent 
variables existed. It was expected that the be- 
tween-groups variance for the a priori con- 
trast would account for the majority of the 
variance.

Results indicated that the prediction was 
borne out. Subjects in the negative mood/ 
public condition contributed substantially 
more coupons than did subjects in the other 
cells (means are displayed in Table 1). The 
a priori contrast indicated that this effect was 
significant, F(1, 68) = 9.55, p < .003. In ad- 
dition, this contrast accounted for 99% of 
the between-groups variance, whereas the re- 
maining comparisons testing all other or- 
thogonal effects were nonsignificant (F < 1) 
and accounted for only 1% of the variance.
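
The share of between-groups variance attributed to the a priori contrast can be checked against the coupon means in Table 1. The sketch below is only an illustration, not the authors' own computation: it assumes equal cell sizes (n = 18) and contrast weights of 3, -1, -1, -1 (negative mood/public against the other three cells), and the function and variable names are hypothetical.

```python
# Illustrative check of the planned contrast using the Table 1 cell means.
# Assumptions (not given as formulas in the article): equal n = 18 per cell
# and contrast weights (3, -1, -1, -1) for negative/public vs. the other cells.

def contrast_share(means, weights, n):
    """Return (SS_contrast, SS_between, share) for equal-n cell means."""
    grand_mean = sum(means) / len(means)
    ss_between = n * sum((m - grand_mean) ** 2 for m in means)
    estimate = sum(w * m for w, m in zip(weights, means))
    ss_contrast = n * estimate ** 2 / sum(w ** 2 for w in weights)
    return ss_contrast, ss_between, ss_contrast / ss_between

# Cell order: private/neutral, private/negative, public/neutral, public/negative
coupon_means = [1.44, 1.22, 1.50, 3.50]
contrast_weights = [-1, -1, -1, 3]

ss_contrast, ss_between, share = contrast_share(coupon_means, contrast_weights, n=18)
print(f"SS contrast = {ss_contrast:.2f}, SS between = {ss_between:.2f}, share = {share:.3f}")
```

Under these assumptions the contrast takes up roughly 99% of the between-groups sum of squares, which agrees with the figure reported in the text.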


Discussion 


As expected, subjects who were exposed to negative mood-inducing operations
and who had the opportunity to help in the presence of an experimenter
donated the most coupons. Negative mood subjects whose opportunity to help
occurred in private, however, were not especially likely to share their
coupons. In fact, they were slightly less generous as a result of the
negative induction procedure, a result that is consistent with the prior
literature involving anonymous giving in young children. It appears, then,
that negative mood did not lead to increased helping in our young subjects
unless the helping opportunity offered the likelihood of direct social reward
for generosity. It is our argument that helping will be employed by
individuals in a lowered mood state in order to relieve their negative
affective state. For adults, who have internalized the reward value of
benevolence, the help may occur in private and still retain its
self-gratifying function. For young children, on the other hand, altruism has
not yet become autonomous and, thus, will be gratifying only when it is
instrumental to an extrinsic form of reward like public approval.

Negative mood relief or image repair

The finding that our young subjects were more likely to help when in a
negative mood and in public is compatible with some earlier results by Isen
et al. (1973). They found (Experiments 2 and 3) that a public failure had a
positive effect on young children’s charitable donations made in the presence
of an experimenter who knew of the failure. The authors accounted for the
finding with an image repair explanation: The visible failing produced a
lowered public image that subjects then sought to repair via prosocial
action. Our own interpretation of those data would be somewhat different: The
public failure produced a lowered mood that subjects then sought to remove
via prosocial action. Setting aside the question of how to best interpret the
Isen et al. results, the image repair notion of these authors presents a
potential alternative explanation for the data of the present study. Perhaps,
in asking negative mood subjects to reminisce aloud about a sad experience,
we caused a large percentage of them to divulge image-damaging things about
themselves. If so, the enhanced coupon donating that occurred in the negative
mood/public condition may have been a result of attempts to repair a lowered
public image rather than to relieve a negative mood.

To test the negative state relief versus image repair interpretations of our
findings, a
second experiment designed to replicate and 
extend the results of Experiment 1 was con- 
ducted. A test of these alternative formula- 
tions gains importance because the image re- 
pair interpretation could account for many of 
the studies showing that certain negative 
mood procedures resulted in greater helping 
among adult subjects. Although Cialdini and 
Kenrick (1976) interpreted such results ac- 
cording to the negative state relief model, in- 
creased aid following a public transgression 
(e.g., Konečni, 1972), receipt of an insult
(Steele, 1975), receipt of a deviant score on 
a personality test (Filter & Gross, 1975), or 
public embarrassment (Apsler, 1975) could 
conceivably have occurred in an attempt to 
restore public image. 

A second reason for performing a replica- 
tion of Experiment 1 involved the unexpected 
interaction that appeared on our mood ma- 
nipulation check in that study. Even though 
the means in Table 1 suggest that the mood 
induction procedure was successful for subjects
who were subsequently randomly assigned
to both public and private helping opportunity
conditions, the induction appears
to have been stronger among those in the pub- 
lic cells. While the pattern of means makes it 
unlikely that this factor mediated our results, 
it was felt that confidence in the results of 
Experiment 1 would be enhanced by a replica- 
tion showing equivalently strong negative 
mood induction among public and private con- 
dition subjects. To these ends, a replication 
of Experiment 1 was conducted that included 
an additional manipulated variable—whether 
the negative mood-induction procedure required
subjects to imagine negative experiences
that were image related or image unrelated.


Experiment 2 
Method


Subjects. Subjects were 77 first- through third-grade children (42 males,
35 females) from an elementary school in a middle-class suburban area. This
was [...] the same school used in Experiment 1.

Procedure. The procedure was identical to that used in Experiment 1 with
the exception that subjects in the negative mood cells were divided into
an image and a nonimage condition. These conditions were operationalized in
the following way: Experiences spontaneously generated by subjects in
Experiment 1 were divided into image and nonimage categories by two
independent raters. There was 90% agreement between raters. The four most
frequently occurring responses in each category had 100% agreement between
raters and were selected for experimental conditions.¹

Negative mood was manipulated by asking one 
half of the subjects if they had ever experienced the 
most frequently occurring non-image-related re- 
sponse, whereas the other half were asked the same 
question about the most frequently occurring image- 
related response. If subjects responded “yes” to this 
first probe, the experimenter proceeded to induce a 
negative mood state in a manner identical to Experi- 
ment 1. If subjects responded “no,” the experimenter 
continued down the list of image or nonimage ex- 
periences until an affirmative response was generated. 
None of the subjects responded “no” to all four 
probes. 

Predictions. It was expected that helping would 
again be greatest in the negative mood/public condi- 
tion and that this effect would not vary as a func- 
tion of the image manipulation. 


Results 

A 3 × 2 × 2 analysis of variance again revealed that sex of subject had no
effect on either dependent variable; thus, the means for these measures were
collapsed over sex and can be found in Table 2.

Manipulation check. Once again, the manipulation check showed a significant
main effect of the mood manipulation, F(2, 65) = 20.68, p < .001. In this
experiment, neither the type of helping opportunity nor the interaction with
mood induction was significant for this variable (both Fs < 1). No other
effects approached significance.

Coupon sharing behavior. The initial 3 × 2 × 2 analysis of variance also
revealed that no significant differences occurred as a function of the image
manipulation (all Fs < 1). Examination of the means in Table 2 indicates that
if anything, image-related negative mood led to relatively less helping than
nonimage-related negative mood.

Table 2
Mood and Helping Scores for Experiment 2

                                 Type of helping opportunity
                          Private                          Public
                            Negative mood                    Negative mood
               Neutral                         Neutral
               mood       Nonimage  Image      mood       Nonimage  Image
Measure        (n = 13)   (n = 12)  (n = 11)   (n = 14)   (n = 13)  (n = 14)
Mood check     1.92       2.66      2.72       1.68       2.69      2.78
M coupons
  contributed  1.39       2.00      1.00       2.07       3.15      2.93

Note. Range is 1-3 on mood check (1 = positive mood, 3 = negative mood) and
1-15 on coupons contributed.

¹ The nonimage condition contained the following experiences: (a) Have you
ever had a pet [...] die? (b) Have you ever had a toy you [...] lost, or
stolen? (c) Have you ever been hurt in an accident? (d) Have you ever had
someone you [...] away? The image condition included (a) Have you ever been
called a bad name? (b) Have you ever been caught [...]? (c) Have you ever
been punished for doing something wrong? (d) Have you ever been left out of
a game you wanted to be in?

Since there was no difference between the 
image and the nonimage cells, the means for 
the major dependent measure were collapsed 
over type of negative mood induction, and 
the negative mood/public group was pitted 
against all other groups, as in Experiment 1. 
Once again, results indicated the most helping 
by subjects in the negative mood/public cell 
(see Table 2).² The a priori contrast for this
effect was significant, F(1, 73) = 3.05, p <
.05, one-tailed test. Further examination of
these results indicated that this contrast ac- 
counted for 90% of the between-groups vari- 
ance, whereas the remaining orthogonal com- 
parisons were nonsignificant (F < 1) and ac- 
counted for 10% of the variance. 
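
With unequal cell sizes, the same kind of check can be run on the Table 2 values. The sketch below is an illustration under stated assumptions rather than the authors' procedure: image and nonimage cells are pooled using n-weighted means, and the contrast compares the pooled negative mood/public group with all remaining subjects pooled together. All names are hypothetical.

```python
# Illustrative check for Experiment 2 using the Table 2 cell sizes and means.
# Assumptions: image and nonimage cells are pooled with n-weighted means, and
# the contrast compares negative/public with the pooled remaining cells.

def pool(cells):
    """Combine a list of (n, mean) cells into a single (n, mean) pair."""
    total_n = sum(n for n, _ in cells)
    return total_n, sum(n * m for n, m in cells) / total_n

groups = {
    "private_neutral": (13, 1.39),
    "private_negative": pool([(12, 2.00), (11, 1.00)]),
    "public_neutral": (14, 2.07),
    "public_negative": pool([(13, 3.15), (14, 2.93)]),
}

# Between-groups sum of squares around the n-weighted grand mean.
_, grand_mean = pool(list(groups.values()))
ss_between = sum(n * (m - grand_mean) ** 2 for n, m in groups.values())

# Sum of squares for "negative/public versus everyone else".
n_target, mean_target = groups["public_negative"]
n_rest, mean_rest = pool([g for name, g in groups.items() if name != "public_negative"])
ss_contrast = (n_target * n_rest / (n_target + n_rest)) * (mean_target - mean_rest) ** 2

share = ss_contrast / ss_between
print(f"share of between-groups variance = {share:.3f}")
```

Computed this way, the contrast accounts for about 90% of the between-groups sum of squares, matching the proportion given in the text.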

Rosenthal (1978) has recently reviewed a 
number of methods for combining the results 
of independent studies, suggesting that the 
most serviceable technique among those available
is the method of adding Z scores (Mosteller
& Bush, 1954). When looked at in this
manner, the combined results of Experiments 
1 and 2 strongly support the hypothesis that 
the greatest coupon sharing would occur in 


the negative mood/public condition, Z = 2.96, 
p < .003. 
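
The method of adding Z scores can be sketched as follows: each study's one-tailed p value is converted to a standard normal deviate, the deviates are summed, and the sum is divided by the square root of the number of studies. The p values used below are hypothetical inputs chosen for illustration; the article does not report the exact per-study values that entered its combined test, so this sketch is not meant to reproduce Z = 2.96.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def p_to_z(p_one_tailed):
    """Convert a one-tailed p value to the corresponding standard normal deviate."""
    return STANDARD_NORMAL.inv_cdf(1.0 - p_one_tailed)


def combine_z(z_scores):
    """Add the Z scores and divide by the square root of their count."""
    return sum(z_scores) / sqrt(len(z_scores))


def z_to_p(z):
    """One-tailed p value associated with a standard normal deviate."""
    return 1.0 - STANDARD_NORMAL.cdf(z)


# Hypothetical one-tailed p values for two independent studies.
hypothetical_p_values = [0.01, 0.05]
z_scores = [p_to_z(p) for p in hypothetical_p_values]
combined = combine_z(z_scores)
print(f"combined Z = {combined:.2f}, one-tailed p = {z_to_p(combined):.4f}")
```

Because the combined statistic pools evidence across studies, two individually modest results pointing in the same direction can yield a smaller combined p value than either study alone.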


General Discussion 


Taken together, the results of both experiments provided clear support for
our major hypothesis. While primary-grade children in a situation requiring
autonomous (i.e., internalized) altruism did not show the adult pattern of
increased helping in a negative mood, the pattern shown in the public
condition indicated that normative (i.e., sanction-based) helping was
increased by negative mood. This latter outcome fits well with findings
indicating increased helping in adults (e.g., Apsler, 1975; Cialdini et al.,
1973; Freedman et al., 1967) as well as increased direct self-reinforcement
in children (Rosenhan et al., 1974; Underwood et al., 1973) as a function of
negative mood induction. The results are also consistent with the Isen et al.
(1973, Study 3) finding that an image-damaging incident will lead to an
increase in helping, although the present results do not seem to be explained
via the image repair model.

It should be noted that the negative state relief model and the image repair
model are in no way incompatible. In fact, damage to image is doubtless one
common and [...] form of negative state induction. It is our position,
however, that image repair can be subsumed under the more general negative
state relief model. Image damage is, by [...], sufficient but not necessary
to lead to increases in normative helping. Other negative experiences will
have the same result. Further, research by Cialdini et al. (1973) has
indicated that relief of a negative mood state abolishes the
negative-mood-leads-to-helping relationship even when such relief is in no
way related to image improvement.

² As the means in Table 2 also indicate, an analysis of variance revealed a
trend toward a main effect for type of helping opportunity, F(1, [...]) =
[...], p < .12, with public opportunity to help producing somewhat more aid.
A similar trend was found in Experiment 1, F(1, 64) = 3.72, p < .058.


Qualifications on the Negative State 
Relief Model 


A further point of clarification regarding 
the negative state relief model should be 
made here. The model suggests that one helps 
when one is experiencing a transitory feeling 
of sadness or depression, in order to feel 
better. As indicated in previous statements of
the model (Cialdini et al., 1973; Cialdini &
Kenrick, 1976), negative experiences likely
to arouse feelings of frustration or anger would
not be expected to result in altruistic behavior.
Additionally, if the helping is effortful or
unlikely to yield great benefit, negative mood 
might be expected to reduce helping even in 
adults. This latter suggestion is supported in 
a recent article by Weyant (1978). He found 
that a negative mood induction increased helping in his subjects, but only
when the cost of helping was low and the potential benefits were high.
Conversely, when cost of helping was high but potential benefits were low,
helping was somewhat decreased by negative mood.

There is also recent evidence that negative mood can sometimes reduce
helping by producing an attentional deficit. McMillen, [...], and Solomon
(1977) found that negative mood subjects were less attentive to their
environment than neutral mood subjects and, consistently, were less helpful
to a [...] confederate in distress, unless she drew attention to herself. In
the latter condition, negative mood produced greater helping than [...]
neutral or positive mood conditions. The tendency to engage in altruistic
behavior as a means of self-reward following a negative mood induction, then,
seems to be limited by the reward value and the salience of the helping
opportunity. In keeping with the hedonistic basis of the model, it appears
that if a helping opportunity is not perceived as likely to lead to a better
mood, negative mood will not enhance helping. These findings are consistent
with the notion that helping will occur in a negative mood in order to
alleviate such aversive feelings.³

It was interesting that whereas means for 
helping in the public conditions were invari- 
ably higher than those in the relevant private 
comparison, the public-private effect was not 
more pronounced in the neutral mood condi- 
tions. It would seem that the children gen- 
erally judged the value of the unknown prize 
to be potentially more valuable than the im- 
mediate social approval of the experimenter, 
but that negative mood enhanced the im- 
portance of such immediate approval. 


The Socialization of Altruism as a 
Self-Reinforcer: A Three-Step Model 


It will be recalled that the Cialdini and 
Kenrick (1976) data indicate that the rela- 
tionship between helping and negative mood 
changes from an inverse to a direct one as an 
individual passes through the socialization 
process. When those data are taken in com- 
bination with the results of the present in- 
vestigations, a three-step model of the so- 
cialization of altruism as a self-reward 
emerges. Initially, helpful behavior would be 
rarely performed by an unsocialized individ- 
ual, since it involves the loss of rewards. 
There is substantial evidence that preschool- 
age children help infrequently and that when 
they do help, it is unrelated to social approval 
(Eisenberg-Berg & Hand, in press; Fischer, 
1963; Ugurel-Semin, 1952). At this step,
helpfulness would be unlikely under conditions
of negative mood, since it would be
experienced as self-punishment.

³ A recent study by Underwood, Berenson, [...] (1977) failed to find
increased helping in [...] of an atten[...] authors’ suggestions [...]
(1978) more re[...] that negative mood subjects who were leaving movie [...]
very different films, it is [...] uncontrolled factors may [...] in this
study. Further research [...] to clarify the cont[...] (1977) and McMillen
et al. (1977) studies.

⁴ We are indebted to Robert A. Baron for his contributions regarding the
thinking contained in this paragraph.

The second step involves children’s acquisi- 
tion of the norms for prosocial behavior, 
without concomitant internalization of these 
norms. Children at this step, usually in the 
primary grades, are aware that helping can 
lead to systematic social reinforcement (Bryan 
& Walbek, 1970; Eisenberg-Berg, in press; 
Ugurel-Semin, 1952) and employ a public 
helping opportunity to obtain external social 
reward when in a negative mood, as sug- 
gested by the data of the present studies. 
However, such children may still tend to be 
less helpful under negative mood when the 
helpfulness is anonymous (Cialdini & Ken- 
rick, 1976; Moore et al., 1973; Rosenhan et 
al., 1974; Underwood, Froming, & Moore,
1977). 

Finally, after sufficient experience with ex- 
ternal reward for charitable action, that be- 
havior itself takes on the quality of a sec- 
ondary reinforcer (Weiss, Buchanan, Alstatt, 
& Lombardo, 1971). Weiss et al. found that 
for college students, the opportunity to be- 
have altruistically acted as a reinforcer in a 
learning task and concluded that altruism it- 
self was rewarding for their subjects. A study 
by Pisarowicz (Note 1) has provided addi- 
tional support for the notion that altruism is 
self-rewarding for adults. Subjects in Pisaro- 
wicz’s study indicated a greater sense of self- 
satisfaction and a lessened tendency to en- 
gage in direct self-gratification (prize tokens) 
following altruistic behavior. Consistently, 
charitable behavior in adults occurs with enhanced
frequency under conditions of negative
affect (see Cialdini & Kenrick, 1976, for
a review), presumably to dispel the bad mood. 
For the individual who has reached this level 
of socialization, usually by high school age, 
helpfulness will have become self-gratifying 
and will frequently occur privately without
the expectation of material or social reward. 


Reference Note 


1. Pisarowicz, J. A. Self-reinforcement [...] altruistic behavior, [...]
The Organizational Functions of the Self: An Alternative
to the Duval and Wicklund Model of Self-Awareness


Jay G. Hull and Alan S. Levy 
Duke University 


The importance of an adequate theory of 
self for any complete account of human social 
behavior has long been recognized by philoso- 
phers and psychologists alike (Freud, 1927; 
James, 1890/1950; [...], 1935; Mead, 1934). These theorists have for
the most part distinguished two functions of self: that [...]
aspect of the self that has been emphasized
in modern social psychology (Bem, 1972; 
Festinger, 1957; Nisbett & Valins, 1971). 
According to this view, behavior follows from 
an evaluation of past activities and as such 
represents a conscious attempt at self-regula- 
tion. While not denying that self-evaluation 
can serve to mediate behavior in this way, we 
feel that these theories have tended to restrict
unnecessarily the role of the self to a
process of self-concept regulation, essentially
ignoring its broader function of organizing
the individual’s understanding of
the social environment. This is most easily
seen in a recent theory of objective self-awareness
(Duval & Wicklund, 1972; Wicklund,
1975).

In terms of the Duval and Wicklund model,
self-awareness¹ is composed of four
stages: self-focused attention → self-evaluation
→ affective reaction → discrepancy
reduction. Very briefly, attention
can be focused in one of two directions:
inward on the self or outward on the
environment. Self-focused attention is
hypothesized to induce an evaluation of the
present self in terms of the individual’s ideal
image. Recognition of a real-ideal discrepancy
is hypothesized to elicit an affective reaction,
positive or negative depending on the
direction of the discrepancy. Upon recognizing
that one’s ideals exceed one’s present self and
experiencing a negative reaction, the individual
is postulated to be motivated to reduce
the degree of negative affect by either (a)
working to reduce the size of the real-ideal
discrepancy or (b) avoiding the self-aware
state. The Duval and Wicklund (1972) theory
thus explicitly defines the relation between
self and behavior in terms of a regulatory
process of self-evaluation and in addition
specifies a boundary condition on this process
in terms of self-focused attention.

1 Duval and Wicklund (1972) originally defined
self-awareness in terms of the ratio of an objective
(self-focused) to a subjective (environment-focused)
state. In contrasting their theory with others,
high self-awareness was postulated to involve
higher levels of objective as opposed to subjective
self-awareness. Following the lead of others, we have
substituted the term “self-awareness” for the more
cumbersome “objective self-awareness” in connection
with their position throughout the present
article.


An Encoding Theory of Self-Awareness 


In contrast to Duval and Wicklund (1972), 
we would propose that the defining features 
of self-awareness do not exist in terms of self- 
focused attention and the operation of a 
particular kind of self-evaluative process, but 
rather exist in terms of the individual’s or- 
ganization of the social environment prior to 
focal attention and comparison processes. 
Specifically, we propose that self-awareness
corresponds to the encoding of information in
terms of its relevance for the self and as
such directly entails a greater responsivity to
the self-relevant aspects of the environment.
Situationally, this level of encoding may be
a consequence of the nature of the assigned
task, the specific instructions, or the presence
of self-symbolic cues (e.g., mirrors, videotape
and audiotape recordings). Dispositionally, it
may represent either a general propensity on
the part of the individual or a by-product 
of more elaborate cognitive structures corre- 
sponding to relationships between the self 
and the environment (Markus, 1977; Rogers, 
Kuiper, & Kirker, 1977).

The present distinction between encoding
and focal attention/comparison processes is
basic to several recent theories of cognition
(…, 1967; Shiffrin & Schneider,
1977; Sternberg, 1975). In terms of a model
proposed by Shiffrin and Schneider, feature
encoding corresponds to the automatic activation
of specific cognitive structures in response
to a particular input configuration and as
such operates through a set of associative
connections in long-term memory. On the 
other hand, controlled processing corresponds 
to a capacity-limited comparison or search 
process dependent on focused attention and 
under the control of the individual. Since 
feature encoding defines the form of informa- 
tion available to the individual, it need not be 
associated with different kinds of controlled 
processes to lead to different behavioral ef- 
fects. Thus, encoding may be associated with 
automatic responses that do not require con- 
trolled processing, or, by its inherently con- 
structive nature, encoding may determine the 
outcomes of those controlled processes that 
do occur. 

From our own perspective, then, self-aware- 
ness corresponds to a particular form of en- 
coding process that has its effects on behavior 
independent of focal attention and compari- 
son processes by rendering the individual sen- 
sitive to those aspects of the environment that 
are potentially self-relevant. It follows from 
such an analysis that before one can define 
the behavioral effects of self-awareness, one 
must specify the ways in which particular 
aspects of the situation are relevant for the 
self. We propose that information about one’s 
present situational context is principally self- 
relevant insofar as it specifies contingencies 
related to the individual’s present activities 
or projects (Mead, 1934). As such, self-aware- 
ness will serve to increase the individual's un- 
derstanding of the immediate situation in 
terms of contingencies bearing on those ac- 
tivities. To the extent that such self-relevant 
contingencies are defined by the responses of 
others, self-awareness prior to behavior will 
lead to actions consistent with
definitions of appropriate conduct
(e.g., …, 1975; Diener & Wallbom, 1976;
Duval, 1976; Liebling & Shaver, 1973;
Scheier, Fenigstein, & Buss, 1974; Wicklund
& Duval, 1971). While self-awareness will
thus be associated with greater responsivity
to situational variations in self-relevant
contingencies, it will also be associated with
behavioral consistency across situations involving
activities that define similar self-relevant
contingencies (e.g., Pryor, Gibbons, Wicklund,
Fazio, & Hood, 1977).

In addition, we propose that information
about past performances and present physio- 
logical states also constitutes self-relevant in- 
formation insofar as it specifies particular 
kinds of relationships between self and en- 
vironment. Self-awareness as a greater sensi- 
tivity to such information will be associated 
with a greater responsivity to the evaluative 
and affective connotations of the information 
(e.g., Gibbons & Wicklund, 1976; Steenbarger
& Aderman, in press; Scheier & Carver,
1977; Gibbons, Note 1). The claim, however,
that self-awareness involves a naturally self- 
evaluative and affect-inducing state (Duval 
& Wicklund, 1972; Wicklund, 1975) is simply 
unsubstantiated in the literature to date. 

In terms of research issues, our alternative 
raises questions about three separate aspects 
of self-awareness theory: (a) To what extent
is a state of self-awareness based on a
state of self-focused attention? (b) To what
extent are the behavioral effects of self-aware- 
ness mediated by processes of self-evaluation 
and affect reduction? (c) To what extent are 
the attributional effects of self-awareness 
based on attentional differences? Previous re- 
search bearing on each of these issues will be 
briefly considered along with three studies 
specifically designed to illustrate our own
perspective. 


Focal Attention or Encoding Processes? 


Two of the main tests of the focus of at- 
tention hypothesis have used projective mea- 
sures (Carver & Scheier, 1978; Davis & 
Brock, 1975). While these studies did indeed
find that self-awareness manipulations
led to increases in self-referent responses, we
would contend that this was not because of
an autistic self-focus, but rather reflects a general
propensity to construct the environment
in terms of its relationship to the self. In
contrast to such projective techniques, Geller
and Shaver (1976) have recently used the
Stroop color-word test (1938) to demonstrate
basic processing differences among differentially
self-aware subjects. Thus, self-aware
subjects were found to be more sensitive
to irrelevant aspects of a color-naming
task when those aspects involved self-relevant
as opposed to self-irrelevant information.
Non-self-aware subjects were not responsive to this
difference. It should be noted, however, that 
self-aware subjects did not show the general 
decrement in processing speed that should be 
expected with an attentional focus away from 
the task (Gopher & Kahneman, 1971; Shif- 
frin & Schneider, 1977). We would propose 
that the lack of such an overall processing 
decrement suggests that self-awareness does 
not correspond to an attentional focus on the 
self and away from the environment, but 
rather corresponds to a particular form of in- 
formational encoding that is cued by the pres- 
ence of self-symbolic stimuli. 

To provide support for the notion of self- 
awareness as an encoding process, the present 
authors designed an experiment to examine 
differences in self-relevant encoding among 
individuals varying in degrees of dispositional 
self-awareness. It was predicted that individuals
with high scores on a measure of private
self-consciousness (see Fenigstein, Scheier, & 
Buss, 1975) would be more likely to encode 
self-relevant information at a deeper level 
than (a) information encoded on other dimen- 
sions and (b) all information (including self- 
relevant) processed by individuals low in 
self-consciousness. The primary dependent 
measure of encoding depth involved incidental 
recall following a word evaluation task (Craik 
& Tulving, 1975; Rogers et al., 1977). The 
encoding dimensions of word length and mean- 
ingfulness were selected as control baselines 
for comparisons with the encoding dimension 
of self-relevance. The present design thus 
varied three levels of encoding dimensions 
along with two levels of self-consciousness. 


Experiment 1 


Method 
Subjects 


Subjects were 66 undergraduates of Duke University
who participated in partial fulfillment of a
course requirement. All subjects had participated in
a group testing session 2 months earlier, during which
the Fenigstein et al. (1975) self-consciousness
inventory was administered.


Procedure 


Subjects were run in small groups varying in size
from 4 to 11 people. The procedure consisted of the
following instructions read verbatim:


In this experiment you will be asked to evaluate
a series of 30 words which will be presented one at
a time. Each word is to be evaluated on one
specific dimension. These dimensions are listed in
order on the rating sheets you have been provided.


I will call out a number and you are then to read
the statement associated with that number. I will
then call out a single word. You are then to place
a check in the appropriate column to designate
whether or not the statement is applicable to that
word: in the left column if yes; in the right if no.
Some of the statements ask for an evaluation of
the length of the word. For these statements check
the left if the word is long, the right if short.


Each word is to be evaluated only with respect to
the single statement associated with the word
called out. Are there any questions about this
procedure?


As long as there were no questions, the experimenter
continued by calling out a series of 30 words. In
each instance he would first read aloud the number
associated with the word, wait 5 sec, read aloud the
word itself, and then wait 18 sec before calling out
the next number.

Evaluation statements. Each word was to be
evaluated with respect to one of the three statements
used by Rogers et al. (1977, Experiment 2).
The specific criteria used in the present study were
(a) how long or short the word was; (b) whether
the word was meaningful to the subject; and (c)
whether the word described the subject. The first
criterion is hypothesized to require the shallowest
level of processing; the last, the deepest. The order
in which the statements appeared was counterbalanced
such that in every three trials, each statement
appeared once. Subjects were randomly assigned one
of three rating sheets, each of which contained a
different order of evaluation statements. Thus, within
a testing session a given stimulus word was evaluated
according to all three dimensions, the specific
dimension depending on which sheet the subject had
been randomly assigned.
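
The counterbalancing described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text specifies that each statement appeared once in every three trials and that the three sheets used different orders, but it does not give the actual orders, so the cyclic rotation below (and the names `DIMENSIONS`, `rating_sheet`, and `sheets`) is hypothetical.

```python
from itertools import cycle, islice

# The three evaluation dimensions named in the text.
DIMENSIONS = ("length", "meaningfulness", "self-relevance")
N_WORDS = 30


def rating_sheet(offset, n_words=N_WORDS):
    """Return the dimension assigned to each of the n_words trials on one sheet.

    Assumption: each sheet repeats a rotated copy of DIMENSIONS, so every
    block of three consecutive trials contains each statement exactly once.
    """
    rotated = DIMENSIONS[offset:] + DIMENSIONS[:offset]
    return list(islice(cycle(rotated), n_words))


# Three hypothetical sheets, one per rotation of the statement order.
sheets = [rating_sheet(offset) for offset in range(3)]
```

With this rotation, any given word is rated on a different dimension on each sheet, so across the three sheets every word is evaluated on all three dimensions, which is the property the procedure requires.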
Dependent measures. After evaluating all 30 words,
subjects were instructed to turn over their rating
sheets and write down as many of the words as
they could remember. Since stimuli had been presented
orally, they were available only in memory.
Subjects were informed that they would have 3 min
to complete the task. At the end of 3 min, subjects
were asked to place their names at the top of the
rating sheets and return them to the experimenter.
The purpose of the experiment was then explained,
and subjects were asked not to discuss the nature of
the experiment with others.


Results and Discussion²


For the randomly assigned evaluation orders,
there were 24 subjects in Order 1, 21 in Order
2, and 21 in Order 3. Within each order, subjects
were divided into high and low self-conscious
groups by means of a median split
performed on private self-consciousness scores
(Fenigstein et al., 1975). Two subjects were 
randomly discarded in order to equalize the 
number of high and low self-conscious sub- 
jects in each evaluation order. 
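
A median split followed by random discarding can be sketched as below. This is a minimal illustration, not the authors' procedure: the text does not say how scores tied at the median were handled (here they go to the low group), and the function names and demonstration scores are hypothetical.

```python
import random
from statistics import median


def median_split(scores):
    """Split a {subject_id: score} mapping into (high, low) lists.

    Assumption: subjects scoring exactly at the median join the low group.
    """
    cut = median(scores.values())
    high = [s for s, v in scores.items() if v > cut]
    low = [s for s, v in scores.items() if v <= cut]
    return high, low


def equalize(high, low, rng):
    """Randomly discard subjects from the larger group until sizes match."""
    high, low = list(high), list(low)
    while len(high) > len(low):
        high.remove(rng.choice(high))
    while len(low) > len(high):
        low.remove(rng.choice(low))
    return high, low


# Made-up scores for five subjects within one evaluation order.
demo_scores = {"s1": 10, "s2": 22, "s3": 15, "s4": 30, "s5": 18}
demo_high, demo_low = median_split(demo_scores)
balanced_high, balanced_low = equalize(demo_high, demo_low, random.Random(0))
```

Splitting an odd-sized group necessarily leaves one group larger than the other, which is why a step like `equalize` is needed before the groups can be compared with equal cell sizes.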

The data were submitted to a mixed-design 
analysis of variance with the number of words 
remembered within each evaluation category 
used as a repeated measure. The main effect 
of self-consciousness was marginally signifi- 
cant, F(1, 62) = 3.03, p < .10. The within- 
subjects analysis consisted of three planned 
comparisons. As predicted, high self-conscious 
subjects recalled more words encoded as self- 
relevant than did low self-conscious subjects, 
F(1, 124) = 5.61, p < .025. In addition, high 
self-conscious subjects recalled more words 
encoded according to their self-relevance than 
words encoded according to the other con- 
trol dimensions, F(1, 124) = 3.88, p=.05. 
Finally, low self-conscious subjects recalled 
nonsignificantly fewer words encoded accord- 
ing to their self-relevance than words encoded 
according to other dimensions, F(1, 124) = 
Alas. A comparison testing the interaction 
of self-consciousness and self-relevant versus 
control words did not achieve a conventional 
level of significance, F(1, 124) = 2.58, p =
.11.³
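
The degrees of freedom reported above can be checked with a short calculation. The sketch assumes, as the reported values suggest, that the error terms were computed over the two self-consciousness groups and the three encoding dimensions only; the variable names are illustrative.

```python
# Subjects retained after two were discarded to equalize group sizes.
n_subjects = 66 - 2

n_groups = 2   # high vs. low private self-consciousness (between subjects)
n_levels = 3   # length, meaningfulness, self-relevance (within subjects)

# Error df for the between-subjects main effect of self-consciousness.
df_between_error = n_subjects - n_groups

# Error df for the within-subjects comparisons.
df_within_error = df_between_error * (n_levels - 1)
```

These values correspond to the denominators in F(1, 62) and F(1, 124).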


2 This experiment was originally run as two sepa- 
rate experiments, the second being an exact replica- 
tion of the first. The results of each were consistent 
with our hypotheses, and there were no independent 
effects associated with “experimental session” when 
this was included as a factor in the design. Therefore, 
for the sake of brevity, the results are reported col- 
lapsed across experiments. 

3 Six of the 10 items from the private self-consciousness
scale were significantly correlated with
incidental memory of self-referenced words (.21 <
r < .31, p < .05, one-tailed). Two of the items (“I
sometimes have the feeling that I’m off somewhere
watching myself” and “I’m aware of the way my
mind works when I work through a problem”) failed
to predict in either experimental session: Session 1
(n = 20), rs = .08 and .02, respectively; Session 2
(n = 44), rs = .09 and −.06, respectively. While we
feel that these items are more closely associated with
a focal attention interpretation of self-awareness than
are the remaining items, this distinction is obviously
a personal judgment on our part.

Figure 1. Number of words remembered as a function
of self-consciousness and encoding task: Experiment 1.

As can be seen in Figure 1, the pattern of
results conformed nicely with predictions. For
individuals high in self-consciousness, there is

a striking difference

(Rogers et al., 1977).
To determine the effects of the alternative
subscales of the self-consciousness inventory,

analysis provided by Rogers et al. (1977),

that high self-
possess a more
uniform and well-structured cognitive organization
corresponding to the self, or (b) this
superordinate framework is more easily activated
by environmental cues to self-reference.
Finally, it should be noted that although
self-consciousness was associated with an increased
sensitivity to self-relevant information,
there was no concomitant decrease in
encoding efficiency of non-self-relevant information.
We would suggest that these findings,
along with those of Geller and Shaver
(1976), indicate that self-awareness does not
correspond to a bidirectional attentional phenomenon,
but rather corresponds to an organizational
phenomenon associated with a
greater sensitivity to specific forms of environmental
information.


Self-Awareness and Self-Evaluation 


In addition to denying that self-awareness 
corresponds to a state of self-focused atten- 
tion, our analysis is in basic disagreement 
with the Duval and Wicklund account of the 
self-evaluative consequences of self-awareness
in the absence of performance feedback. The
primary evidence for their position exists in
the form of an experiment by Ickes, Wicklund,
and Ferris (1973) in which self-aware subjects
described themselves less favorably relative
to their ideals on a “real-ideal self questionnaire.”
Proponents of the theory claim
this study as evidence that under natural conditions,
self-aware subjects tend to exaggerate
the size of discrepancies between their real
and ideal selves. Indirect evidence on subjects
whose self-esteem has not been manipulated,
however, suggests that self-awareness is associated
neither with self-criticism (Carver &
Scheier, 1978) nor with the avoidance of generally
self-referent responses (Davis & Brock,
1975).

According to our own analysis, a mediational
process of self-evaluation and affect reduction
is not necessary in order to account
for the behavioral consequences of self-awareness.
Self-awareness as an encoding process
is hypothesized to directly mediate behavioral
responses by prefiguring the important and
relevant aspects of social situations without
engaging a self-evaluative or affect-reduction
process. According to this analysis, subjects’
self-critical responses in the experiment by
Ickes et al. (1973) were not an inevitable consequence
of self-awareness, but rather a consequence
of the particular characteristics of
the situational context within which the response
occurred. It seems likely, for example,
that self-deprecation exists as a common
self-presentational response to situations similar
to that used by Ickes et al. (1973). Lefebvre
(Note 2) has shown that female subjects
adopt self-deprecation as an ingratiation tactic
in the presence of other females and self-enhancement
in the presence of males.
Self-deprecation in the Ickes et al. study occurred
within just such a female (subject)–female
(experimenter) interaction. If self-awareness
is postulated to heighten the individual’s sensitivity
to self-relevant aspects of the immediate
situation, it might be expected to increase
self-deprecation in just this kind of
situation.
Although both analyses can explain the results
of this particular experiment, they
clearly predict different consequences when
other variables are introduced. In terms of
our analysis, self-deprecation is a response to
the immediate circumstances surrounding the
self-description. Altering those circumstances
by manipulating the characteristics of the
audience (e.g., through use of a male rather
than a female experimenter) or by rendering
the self-descriptions anonymous would be expected
to alter the quality of the self-aware
response. From the perspective of the Duval
and Wicklund (1972) model, such manipulations
would not be expected to affect the
self-evaluative consequences of self-awareness.

The following experiment was designed to test these
alternative analyses. In accord with our own
analysis, it was predicted that self-awareness
would lead to self-deprecation when responses
were publicly available to a female experimenter,
but the reverse would be true in
the presence of a male. In addition, anonymity
(eliminating the self-presentational
aspects of the situation) was predicted to
eliminate both of these effects by rendering
the sex of the experimenter irrelevant to the
subject’s behavior. Finally, self-awareness was
not predicted to affect subjects’ mood. In contrast,
Duval and Wicklund (1972) would predict
that self-awareness should lead to
self-deprecation regardless of other situational
variations and that negative affect should be a
consequence of this negative self-evaluation.


Experiment 2 
Method 
Subjects 


The subjects were 101 female undergraduates from 
Duke University who participated in partial fulfill- 
ment of a course requirement. Subjects were inten- 
tionally restricted to females in order to replicate 
the conditions of the Ickes et al. (1973) and Le- 
febvre (Note 2) experiments. Subjects were run in- 
dividually and randomly assigned to one of eight 
experimental conditions. Five of these subjects were
discarded according to predetermined criteria.⁴ The
remaining subjects were evenly distributed across 
conditions. 


Procedure 


Upon entering the experimental room, subjects 
were seated before a large (48-inch × 32-inch,
122-cm × 81-cm) mirror that was turned so that its
reflective front (high self-aware conditions) or non- 
reflective back (low self-aware conditions) faced the 
subject. Subjects’ self-descriptions were then obtained 
under the guise of an experiment described as in- 
vestigating sex differences in aesthetic reactions. Each 
subject was shown a stack of art reproductions, along 
with a series of rating forms for each painting. It 
was explained that personality and mood differences 
had to be statistically controlled and that prior to 
reacting to the pictures it would be necessary for 
the subjects to fill out two brief self-descriptive
questionnaires.

Public and private conditions. Subjects assigned
to public conditions were told to write their names
and social security numbers on the booklets and to
hand them to the experimenter after completing
them. In addition, they were told that the booklets
would be scored while they rated the pictures, and
feedback would be supplied at the end of the
experiment.

Subjects randomly assigned to the private conditions
were told not to write their names or social
security numbers on the booklets and after completing
them to place them somewhere in the middle of
a pile of booklets behind them. No mention was made
of the possibility of feedback.


4 It was determined in advance to discard any subjects
in the public conditions who did not put their
name on the booklet (two individuals did not) and
any subjects in the private conditions who did not
place their booklet in the anonymous pile (three
individuals did not). The results of all tests of
significance are unchanged when these subjects are
included in the analysis.


Table 1
Impact of Self-Awareness on Task-Related
Self-Descriptions: Experiment 2

                  Female experimenter    Male experimenter
Self-awareness    Public     Private     Public     Private

High
  M               12.67      14.07       14.22      14.11
  SD               1.42       1.74        2.01       1.91
Low
  M               15.04      13.11       13.11      14.26
  SD               2.03       2.38        1.85       1.18

Note. Higher numbers refer to more positive
self-descriptions on a 20-point scale; all statistics are
based on 12 subjects per cell.


Target other. The characteristics of the individual
who would have access to subjects’

about the procedure. As long as there were no questions,
the experimenter concluded by commenting
that he/she had to share the room with another
experimenter and that the bulk of the equipment in the

tape recorder, and various other experimental stimuli.
The subject was then left to work on her own.

priori basis into

task-relevant evaluative (sensitive,

and impressionable); general evaluative (careful,
intelligent, honest, disagreeable, and crude); and
sex-related (masculine, ambitious, forgiving,
cooperative, and jealous).⁵

The second questionnaire consisted of

After completion of these questionnaires and the
return of the experimenter, subjects were informed
that the experiment was over and were probed for
suspicion. The actual nature of the experiment was
explained to them, and they were thanked for their
participation. All subjects were asked not to discuss
the experiment with other potential participants.


Results 


For the purposes of analysis, the data were
combined into the personality and mood categories
described above. They were then submitted
to a 2 (self-aware vs. non-self-aware)
× 2 (public vs. private) × 2 (female vs. male
experimenter) analysis of variance.⁶


Personality Self-Descriptions 


Task-relevant evaluations. Subjects’ responses
to these five items were combined by
subtracting negative (unskilled) from positive
(analytic, sensitive, creative, and impressionable)
self-evaluations. The means for
each of these eight conditions are reported in
Table 1. An analysis of variance on these combined
scores revealed a significant triple interaction,
F(1, 88) = 9.27, p < .005. There
were no other significant effects. Post hoc
analyses indicated that differences in public
conditions were primarily responsible for this
interaction. Thus, the two-way interaction of
self-awareness and sex of experimenter was
significant in public (Scheffé procedure, p <
.001) but not in private conditions (p > .25).
Self-awareness thus led to more negative
self-descriptions in the presence of a female
experimenter and more positive self-descriptions
in the presence of a male. This was only the
case, however, when those self-descriptions
were publicly available to those experimenters.
Self-descriptions in private conditions were
seemingly unaffected by either self-awareness
or sex of the experimenter.


5 The first three task-relevant evaluative and the
first four general evaluative traits appeared in the
original Ickes et al. (1973) study. An analysis of
these seven traits revealed effects identical to the
task-relevant and general evaluative dimensions
reported below. The two remaining task-relevant traits
were regarded as positive skills in the immediate
situation of judging art reproductions.

6 If the distinction is not made between general
evaluative, task-relevant, and sex-related traits, and
items are combined only on the basis of their
positive-negative qualities, the results remain basically
the same. Thus, both the triple interaction, F(1, 88)
= 14.92, p < .001, and the sex of experimenter main
effect, F(1, 88) = 3.2, p < .06, are significant. As
can be seen in the more extensive analysis, however,
the former effect is restricted to general evaluative
and task-relevant self-descriptions, whereas the latter
only appears on sex-related self-descriptions. Since
a combined analysis does not reveal these important
differences, separate analyses based on preplanned
distinctions were deemed more appropriate.

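
The form of the triple interaction can be read directly off the cell means in Table 1. The sketch below is an illustrative calculation on those published means only; it is not the authors' analysis, it does not reproduce the F tests, and the names `MEANS` and `two_way_contrast` are hypothetical.

```python
# Cell means from Table 1 (task-relevant self-descriptions), keyed by
# (self-awareness, sex of experimenter, publicity of the response).
MEANS = {
    ("high", "female", "public"): 12.67,
    ("high", "female", "private"): 14.07,
    ("high", "male", "public"): 14.22,
    ("high", "male", "private"): 14.11,
    ("low", "female", "public"): 15.04,
    ("low", "female", "private"): 13.11,
    ("low", "male", "public"): 13.11,
    ("low", "male", "private"): 14.26,
}


def two_way_contrast(publicity):
    """Self-awareness x sex-of-experimenter contrast at one publicity level.

    Computed as the high-minus-low difference with a female experimenter
    minus the same difference with a male experimenter.
    """
    female = MEANS[("high", "female", publicity)] - MEANS[("low", "female", publicity)]
    male = MEANS[("high", "male", publicity)] - MEANS[("low", "male", publicity)]
    return female - male


public_contrast = two_way_contrast("public")    # about -3.48 scale points
private_contrast = two_way_contrast("private")  # about +1.11 scale points
triple_contrast = public_contrast - private_contrast
```

The public contrast is roughly three times the size of the private one and opposite in sign, which is the pattern the post hoc tests describe: the two-way interaction is carried by the public conditions.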

General evaluations. Subjects’ responses to
these five items were combined by subtracting
negative (crude, disagreeable) from positive
(intelligent, honest, careful) self-evaluations.
The means for each of the eight conditions
are reported in Table 2. An analysis of variance
on these combined scores revealed a significant
triple interaction of the same form
as that found on the task-relevant items,
F(1, 88) = 9.49, p < .005. Post hoc analyses
again indicated that differences in public
self-descriptions were primarily responsible
for this interaction. Self-awareness interacted
strongly with sex of experimenter in public
conditions (p < .001), but again there were
no differences in private (p > .25).⁷


Table 2
Impact of Self-Awareness on General
Self-Descriptions: Experiment 2

                  Female experimenter    Male experimenter
Self-awareness    Public     Private     Public     Private

High
  M               13.14      15.48       15.78      14.59
  SD               1.81       2.07        1.18       2.51
Low
  M               15.89      14.63       15.67      15.19
  SD               1.70       2.34

Note. Higher numbers refer to more positive
self-descriptions on a 20-point scale; all statistics are
based on 12 subjects per cell.


Sex-related evaluations. Subjects’ responses
to these five items were combined by subtracting
masculine (masculine, ambitious) from
feminine (forgiving, cooperative, jealous)
self-evaluations. The means for each of the eight
conditions are reported in Table 3. The triple
interaction found on the other indexes failed
to attain significance on these sex-related
items, F(1, 88) = 1.78, p < .20. There was,
however, a significant main effect for sex of
experimenter, F(1, 88) = 6.98, p < .01. Thus,
females described themselves as more feminine
in the presence of a male experimenter than in
the presence of a female experimenter. This
difference did not appear to be contingent on
the public quality of the self-description. The
most fruitful interpretation of this effect may
be in terms of differential trait salience as a
function of environmental cues (McGuire,
McGuire, Child, & Fujioka, 1978; McGuire
& Padawer-Singer, 1976). According to this
analysis, the present findings of increased feminine
trait descriptions in the presence of a
male as opposed to a female experimenter
would be because of the differential salience
of those traits as a function of their situational
rarity or distinctiveness.


Table 3
Impact of Self-Awareness on Sex-Related
Self-Descriptions: Experiment 2

                  Female experimenter    Male experimenter
Self-awareness    Public     Private     Public     Private

High
  M               11.33      11.19       12.59      12.41
  SD               1.36       1.30        2.21       1.57
Low
  M               12.37      11.22       12.11      12.89
  SD               2.14       2.07        2.03       1.44

Note. Higher numbers refer to more feminine
self-descriptions on a 20-point scale; all statistics are
based on 12 subjects per cell.

Mood Self-Descriptions


Subjects’ mood responses were combined into three
categories for the purpose of analysis: positive,
negative, and defiant. There were no significant
effects on any of these indexes.


7 Inspection of the means suggests that self-aware
subjects in private conditions involving a male
experimenter described themselves slightly less favorably
than non-self-aware subjects. However, insofar
as the difference is not significant on this measure
or for the task-relevant measure, F(1, 88) = .03, we
feel that it is not meaningful.


Discussion 


Contrary to the Duval and Wicklund 
(1972) model, self-awareness did not result 
in consistently negative self-evaluations and 
had no apparent effects on reported mood. In- 
stead, the effects of self-awareness appeared 
contingent on several situational factors. Self- 
awareness thus interacted with the manipu- 
lated anonymity of the response and the sex 
of the experimenter for both task-relevant 
and general evaluative self-descriptions. 

Consistent with our alternative analysis, 
the pattern of these results suggests that self- 
awareness serves to increase sensitivity to 
self-relevant aspects of the immediate situa- 
tion. The effects of self-awareness were thus 
contingent on characteristics of the target for 
whom the self-descriptions were available.
Additional analyses made clear, however, that 
these effects were restricted to situations in 
which subjects’ self-descriptions were pub- 
licly available. Rendering responses anony- 
mous serves the theoretical purpose of elim- 
inating the potential social consequences of 
the self-descriptions and thus the self-rele- 
vance of the sex of the experimenter as a sit- 
uational cue. Consistent with the present anal-
ysis, such a manipulation effectively elimi-
nated the effects of self-awareness.

This is not to say that self-awareness in- 
volves “self-presentation” any more than it 
is to say that self-awareness involves “self- 
evaluation.” The form of the self-aware re- 
sponse is of necessity dependent on the nature 
of the information that is encoded as self- 
relevant. Thus, encoding information about 
past performances in terms of their self-rele- 
vance can involve an increased sensitivity to 
the evaluative connotation of the individual’s 
relationship to the environment. The claim,
however, that self-awareness is, in and of it-
self, a self-evaluative and affect-inducing state
is simply unsubstantiated in the literature
to date.


Self-Awareness and Causal Attribution 


In contrast to an attentional notion of
self-awareness, we have proposed that self-
awareness involves a particular form of en-
coding process. Thus, while personally agree-
ing with the general proposition that attribu-
tions follow the individual’s focus of attention
(Duval & Wicklund, 1972; Jones & Nisbett,
1971; Taylor & Fiske, 1978), we are forced
to disagree with the hypothesis that self-
awareness affects attribution by increasing
self-focused attention (Arkin & Duval,
1975; Buss & Scheier, 1976; Duval & Wick-
lund, 1973). Rather, we would propose that
self-awareness affects attribution by increas-
ing the individual’s sensitivity to the self-
relevant aspects of the immediate situation
surrounding the attributional response.

In line with such an analysis, attribu-
tional theorists have begun to acknowledge
the complex influence of situational factors
surrounding attributions of responsibility
(Schlenker, 1975; Bradley, 1978; Frey,
1978). Thus, a recent experiment by Frey
has demonstrated that when others are
aware of a particular outcome, public self-
attributions of responsibility are constrained
by self-presentational factors of strategic
modesty following failure and a complex mix
of modesty and self-enhancement following
success. On the other hand, private attribu-
tions were found to be more responsive to
purely defensive concerns.

According to our own analysis, self-aware-
ness should be associated with a heightened
sensitivity to just these kinds of self-pre-
sentational and defensive concerns. In [...]
circumstances, then, we would predict that
the effect of self-awareness on attributions
for negative outcomes would depend on the
public-private character of the situation.

8 Since only two experimenters were involved, tech-
nically it cannot be concluded that the sex of the
experimenter was the difference responsible for the
experimental effects. However, given the Le[...]
(Note 2) analysis, the brevity of the experimental
encounter, and the main effect on sex-related traits, it
is reasonable to assume that subjects were respond-
ing on the basis of this obvious distinction.

9 In the present experiment we used hypothetical
situations whose outcomes were public knowledge by
virtue of having been written by the experimenter
and assigned to the subject. The manipulations thus
involved public versus private attributions for low-
involving situations. Rendering outcomes private or


In contrast, the focus of attention hypoth- 
esis would predict a main effect of self- 
awareness on self-attribution. An experiment 
was designed to test these opposing predic- 
tions by assessing public and private at- 
tributions for negative outcomes using pro- 
cedures based on those developed by Duval 
and Wicklund (1973) and Buss and Scheier 
(1976). 


Experiment 3 
Method 


Subjects 


The subjects were 52 male undergraduates from
Duke University who participated in partial fulfill-
ment of a course requirement. The subject popula-
tion was restricted to one sex for the sake of conve-
nience. Subjects were run individually and randomly
assigned to one of four experimental conditions. Eight
of these subjects were eliminated from the analysis on
the basis of failure to follow instructions. These
subjects were discarded according to predetermined
criteria. The remaining subjects were evenly dis-
tributed across conditions.10


Procedure 


Upon entering the experimental room, subjects
were seated before a 48-inch × 32-inch (122-cm ×
81-cm) mirror that was turned so that its reflective
front or nonreflective back faced the subject. All
subjects were then introduced to the study in terms
similar to those used by Duval and Wicklund
(1973) and Buss and Scheier (1976). They were
informed that the study involved the development
of a social perception questionnaire for use during
the following semester and were given a booklet
that contained a series of hypothetical situations.
They were asked to imagine themselves to be the
main character in each of these situations and to
respond to each by indicating their degree of re-
sponsibility for various hypothetical outcomes. Thus,
for example:


You pull up behind a bus that’s stopped at a
stop sign, and you want to turn right at this
intersection. After waiting 15 or 2 minutes, the bus
hasn’t moved. Finally, not knowing what he’s
going to do, you decide to pull out around him
and have to cut back in front to turn right at
the corner. Just as you do, he pulls out and runs
right into you.


increasing the subject’s involvement might well be
expected to alter their evaluative connotations and,
thus, the character of the self-aware response (cf.
Federoff & Harvey, 1976; Frey, 1978).


After answering any questions the subject had
about this procedure, the experimenter continued
with the public-private manipulation.

Public and private conditions. Subjects in the 
private conditions were asked not to put their 
names on the booklets and to place the booklets 
upon completion in the middle of a large pile. Sub-
jects in the public condition were asked to put
their names and social security numbers on the 
booklets, to hand them to the experimenter upon 
completion, and to anticipate a brief discussion of 
their responses at the end of the experiment. 

Explanation for mirror. For all subjects the ex-
perimenter concluded by casually mentioning that
he had to share the room with another experi- 
menter and that much of the equipment in the 
room belonged to the other experimenter's study. 
He asked the subject not to disturb any of these 
things, indicating the mirror and various other ob- 
jects. The subject was then left to work on his own. 
After completion of the dependent measures, all 
subjects were carefully probed for suspicion and ex- 
tensively debriefed. 


Dependent measures 


The main dependent measures consisted of the 
percentage of responsibility attributed to the self in 
each of the five negative-outcome situations used 
by Duval and Wicklund (1973, Experiment 2). In 
addition, the amount of time taken to complete 
the questionnaire was recorded for each subject. 
This exploratory measure yielded no significant ef- 
fects and will not be discussed further. 


Results 


The data were submitted to a mixed-de-
sign analysis of variance with two levels each
of two between-subjects factors (self-aware
vs. non-self-aware; public vs. private) and
five levels of the within-subjects factor of
hypothetical situation. Contrary to the focus
of attention hypothesis, the self-awareness
main effect was not significant (F < 1.00).
However, in accord with our own analysis,
the Self-Awareness × Anonymity interaction
was significant, F(1, 40) = 4.38, p < .05.


10 It was decided in advance to discard any sub-
jects in the public conditions who did not put their
name on the booklet (three subjects did not) and
any subjects in the private conditions who did not
place their booklets in the anonymous pile (two sub-
jects did not).
Table 4
Percentage of Attribution to Self: Experiment 3

                        Anonymity
Self-awareness      Public    Private
High
  M                  60.0      51.1
  SD                 10.9      10.3
Low
  M                  53.1      58.2
  SD                 13.1       9.9


Note. Higher numbers refer to greater attribution to
the self on a 100-point scale; all statistics are based
on 11 subjects per cell.


Self-aware subjects attributed more responsi-
bility to themselves than non-self-aware sub-
jects in the public condition, but this pattern
of results was reversed in private. The only
other significant result was a main effect for
hypothetical situation, F(4, 160) = 27.34, p <
.001, which simply reflects differences among
the hypothetical situations. The means of
the conditions collapsed across hypothetical
situations appear in Table 4.
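
The crossover in Table 4 and the degrees of freedom reported above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis: it uses only the four cell means from Table 4 and the design sizes stated in the text (44 retained subjects, 4 between-subjects cells, 5 hypothetical situations). The function names are illustrative assumptions.

```python
# Cell means from Table 4 (percentage of responsibility attributed to self).
# Keys are (self-awareness level, anonymity condition).
MEANS = {
    ("high", "public"): 60.0,
    ("high", "private"): 51.1,
    ("low", "public"): 53.1,
    ("low", "private"): 58.2,
}


def simple_effect(anonymity):
    """Self-aware minus non-self-aware mean within one anonymity condition."""
    return MEANS[("high", anonymity)] - MEANS[("low", anonymity)]


def interaction_contrast():
    """Difference between the two simple effects (public minus private)."""
    return simple_effect("public") - simple_effect("private")


def mixed_design_df(n_subjects, n_cells, n_within_levels):
    """Error df (between), effect df (within), and error df (within)."""
    between_error = n_subjects - n_cells
    within_effect = n_within_levels - 1
    within_error = within_effect * between_error
    return between_error, within_effect, within_error


if __name__ == "__main__":
    print(round(simple_effect("public"), 1))    # 6.9
    print(round(simple_effect("private"), 1))   # -7.1
    print(round(interaction_contrast(), 1))     # 14.0
    print(mixed_design_df(44, 4, 5))            # (40, 4, 160)
```

The opposite signs of the two simple effects are the reversal described in the text, and the degrees of freedom agree with the reported F(1, 40) and F(4, 160).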


Discussion 


The present pattern of results suggests that
the focus of attention hypothesis by itself is
an inadequate conceptualization of the ef-
fect of self-awareness on the attribution pro-
cess: Self-awareness led to greater self-at-
tribution of responsibility than non-self-
awareness only under public conditions, the
reverse being true in private. Thus, while it
may be useful to conceive of causal attribu-
tions as affected by the individual’s focus
of attention (Jones & Nisbett, 1971; Taylor
& Fiske, 1978), it is not useful to conceive
of the attributional effects of self-awareness
in such terms. The present experiment would
suggest that self-awareness evokes a much
more complex attributional process than
merely self-focused attention. Specifically, it
is proposed that self-awareness involves an
increased sensitivity to the situationally de-
fined meaning of one’s actions and, as such,
entails a greater responsivity to the pre-
sentational constraints and evaluative con-
notations of self-attributed responsibility.
Thus, self-awareness cannot be expected to
have a unidirectional impact on attribution
(Duval & Wicklund, 1972, p. 103), but
rather its effects must be considered in terms
of the self-relevant aspects of the attribu-
tional context. Anonymity, by rendering dif-
ferent classes of variables self-relevant in
the immediate situation, serves to moderate
the self-aware response.


Conclusion 


In conclusion we would contend that the
effects of self-awareness should not be de-
fined by the operation of a particular form
of self-regulative process, but rather must
be defined in terms of its broader functions
of organizing the individual’s understand-
ing of the social environment. As such, we
would define self-awareness in terms of a
heightened sensitivity to particular forms
of available information: specifically, the self-
relevant contingencies associated with present
activity and the self-definitional qualities of
information feedback. The three experiments
reported in the present article are each at-
tempts to develop the concept of self-aware-
ness along these lines. Thus, Experiment 1
illustrated the particular form of informa-
tion encoding associated with self-awareness,
and Experiments 2 and 3 used conceptual
replications of previous research to dem-
onstrate that self-criticism and self-attribu-
tion are not characteristic of self-awareness
per se, but rather depend on aspects of the
immediate situation. Future research, we feel,
should be directed toward specifying con-
ditions that determine the self-relevance of
various forms of information and thus the
effective boundary conditions on the behav-
ioral influences of self-awareness.
Disruptive Effects of Disconfirmed Expectancies About Crowding


Kitty Klein and Bruce Harris 
North Carolina State University 


The experiment utilized a 2 (high vs. low room density) × 2 (forewarning of a
crowded room vs. no forewarning) × 2 (simple vs. complex task) design to
examine the effects of anticipation of crowding on task performance. More tasks
were attempted and efficiency was higher when expectancies about the crowd
were confirmed. Subjects not told to anticipate a crowd who actually worked
under high density and subjects warned about a crowd that did not materialize
performed most poorly. These differences were largest for the complex task.
Baum and Greenberg’s results were replicated with the performance data. Per-
ceptions of the experimental room also differed as a function of anticipation,
but failure to obtain a Crowding × Anticipation interaction did not support
their hypothesis that anticipating a crowd induces identical perceptions to those
obtained under actual crowding. The results are discussed in terms of discon-
firmed expectancies being disruptive of performance, particularly complex task
performance.


Although studies of crowding with human 
populations continue to proliferate, the dy- 
namics of crowding are still largely a mystery. 
In particular, the search for inhibitory effects 
of crowding on task performance has not been 
very successful (Freedman, Klevansky, & 
Ehrlich, 1971; Sherrod, 1974; Stokols, Rall, 
Pinner, & Schopler, 1973). Paulus et al. 
(1976) have suggested three possible reasons 
to explain the lack of the clear-cut perform- 
ance deficits predicted by many researchers. 
First, density levels may have been too low. 
Second, task variables sensitive to crowding 
may not have been used; and third, coping 
mechanisms have not been considered. 


Density Levels 


Paulus et al. (1976) have suggested that ex-
perimental levels of density have not been
sufficiently aversive to affect task perform-
ance. However, a review of this literature in-
dicates that task and no-task effects occur in
relatively similar ratios of persons to space
for the high-density subjects. Differences in
density that do exist arise from variations in
operationalizing low-density (uncrowded)
conditions. Investigators reporting no effects
of density on task performance, such as
Freedman et al. (1971), allowed high-density
(crowded) subjects 3.88 square feet (.36 m²)
per person, whereas low-density subjects were
afforded 17.70 square feet (1.65 m²) per per-
son. Similarly, Sherrod’s (1974) high-density
subjects had 4.62 square feet (.43 m²) each,
and his low-density subjects, 18.75 square
feet (1.74 m²) each. Stokols et al. (1973)
utilized 5.64 square feet (.52 m²) per person
for their high-density conditions and 16.73
square feet (1.55 m²) per person for their
low-density condition.
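
The per-person areas quoted above can be cross-checked with a straightforward unit conversion. The sketch below is illustrative only: the square-foot figures are taken from the text, the conversion factor is the exact definition of the international foot, and the function and variable names are assumptions. Last-digit differences from the printed metric values (for example, 17.70 square feet converts to about 1.64 m²) can arise from rounding in the original.

```python
# Exact conversion: 1 ft = 0.3048 m, so 1 sq ft = 0.3048 ** 2 sq m.
SQ_M_PER_SQ_FT = 0.09290304


def sq_ft_to_sq_m(area_sq_ft):
    """Convert an area from square feet to square meters."""
    return area_sq_ft * SQ_M_PER_SQ_FT


# (high-density, low-density) square feet per person, as quoted in the text.
REPORTED_SQ_FT = {
    "Freedman et al. (1971)": (3.88, 17.70),
    "Sherrod (1974)": (4.62, 18.75),
    "Stokols et al. (1973)": (5.64, 16.73),
}

if __name__ == "__main__":
    for study, (high, low) in REPORTED_SQ_FT.items():
        print(
            f"{study}: high {sq_ft_to_sq_m(high):.2f} m^2, "
            f"low {sq_ft_to_sq_m(low):.2f} m^2, "
            f"low/high ratio {low / high:.1f}"
        )
```

Printing the low/high ratio alongside the converted areas makes it easy to see how much the studies differ in their low-density conditions relative to their high-density ones.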
Density levels in experiments in which per-
formance differences occur are very similar
to the levels reported above. Heller, Groff,
and Solomon (1977), who found a density
effect on task performance when subjects in-
teracted closely with each other, used den-
sities of 3-5 square feet (.28-.47 m²) per
person (high density) and 13-15 square feet
(1.2-1.4 m²) per person (low density).
Paulus et al. (1976), who also report effects
of density on task performance, allowed sub-
jects 8.8 square feet (.82 m²) per person
(high-density condition) in the second ex-
periment reported, while their low-density
subjects had 26.1 square feet (2.4 m²) per
person. Other investigators reporting some ef-
fect of density on task performance include
Epstein and Karlin (1975), whose high-den-
sity condition permitted 2.66 square feet (.25
m²) per person and whose low-density con-
dition permitted 99 square feet (9.2 m²) per
person, and Aiello, DeRisi, Epstein, and Kar-
lin (1977), who utilized 2.50 square feet (.23
m²) per person (high density) and 48 square
feet (4.4 m²) per person (low density). Un-
less subjects are extremely sensitive to these
variations in density, density per se is not
an adequate explanation of the incon-
sistent results in the effects of crowding on
task performance.

Task Variables 


A variety of tasks have been used to assess
performance in crowded situations. Density
has been reported to have no effects on per-
formance on quiz games (Stokols et al., 1973),
[...] and number cross-outs (Freedman
et al., 1971), or on simple arithmetic
[...] color-word tests (Sherrod, 1974).
[...] crowded subjects, however, did not
[...] long as uncrowded subjects at an
[...] puzzle in [...] com-
plex task (Guilford’s Creative Uses test), but
previously crowded subjects did perform bet-
ter on the simple (counting a’s) task. In an-
other study of the delayed effects of crowd-
ing, Aiello et al. (1977) employed Torrance’s
unusual use task and line completion tasks.
On some subscores of these tasks, previously
crowded subjects were less creative than non-
crowded ones.

Heller, Groff, and Solomon (1977) did find
performance decrements on a collating task
as a function of high density and high physi-
cal interaction. This task decrement is not
surprising in light of the physical obstruction
caused by other subjects in the high-density
condition.

Taken together, the results of Paulus et al.,
Epstein and Karlin, and Aiello et al. do sug-
gest that crowding can facilitate simple task
performance and interfere with complex tasks.
However, the two latter studies measured
performance in noncrowded situations follow-
ing a period in which subjects were or were
not crowded. Therefore, the change in condi-
tions rather than the crowding itself con-
founds their results.


Coping Mechanisms 


In Sundstrom’s (1975) model of crowding,
coping responses are a possible outcome to
stress induced by unwanted social inputs.

One variable that may affect coping re-
sponses is the ability to anticipate future
conditions. Averill (1973) has applied the
term “cognitive control” to information con-
cerning impending stress. Averill’s review of
laboratory and surgery studies in which in-
formation was provided led him to conclude
that information is generally preferred but
that “there is no consistent relationship be-
tween such information and initial re[...]-
ity” (p. 295). Applied to the problem of
crowding and human behavior, informing sub-
jects of crowded conditions has received little
attention.

Langer and Saegert (1977) have proposed
a concept of attentional overload, which is
caused by a combination of close proximity
to others and the necessity to predict future
environmental states and coordinate [...].
Langer and Saegert recruited shoppers during
crowded and uncrowded hours to shop for a
list of 50 items. Half the subjects were
warned that crowded supermarkets induce
anxiety. The subjects thus informed showed
more positive affect and found more of the
items on the list. Shoppers during crowded
hours attempted fewer items, correctly com-
pleted fewer items, and had more negative at-
titudes. There were no Density × Informa-
tion interactions.

Although Baum and Greenberg (1975) and
Baum and Koman (1976) led some subjects
to anticipate a crowded experimental room,
the crowd in fact never materialized. Whether
the warning facilitated or impeded coping
under conditions of actual crowding was not
addressed in their research. Subjects who an-
ticipated a crowded experimental room chose
different seat positions and were more nega-
tive in their ratings of the experimental room
and the confederates than subjects who did
not anticipate a crowd.

In the present study, the effects of anticipa-
tion and the confirmation or nonconfirmation
thereof were tested directly. If anticipating a
crowd serves to decrease sensitivity to the
stress invoked, subjects who are forewarned
and do experience the crowd should report
less discomfort and perform better than
those who do not have advance warning. If
anticipation heightens sensitivity to stress,
forewarned individuals whose anticipation is
confirmed should feel more stressed when the
crowd actually appears.

The design utilized two types of tasks for
the purpose of testing differences in the abil-
ity to cope with a crowd as a function of fore-
warning. Applying social facilitation theory
(Cottrell, Wack, Sekerak, & Rittle, 1968;
Zajonc, 1965), if anticipation of a crowd in-
creases arousal, performance on simple tasks
should improve, compared to a no-anticipa-
tion condition. Likewise, arousal-increasing
anticipation should have deleterious effects on
[...] though there may be some facilitation. Fi-
nally, if Baum and Greenberg (1975) are
correct regarding the dynamics of crowding,
subjects anticipating a crowd should perceive
the experimental room differently than those
subjects not anticipating a crowd, and these
perceptions should be similar to those of
crowded subjects not forewarned.
Subjects arrived at the experiment in groups of
either two or five. They were greeted by the ex-
perimenter and told to leave all belongings, except
for a pencil, on a table in a large room that adjoined
a small experimental room, which was 1.75 m² and
2.70 m high.


No-Anticipation Condition 


The door to the experimental room was partially
closed for those subjects not led to anticipate a
crowd. When these subjects (no-anticipation group)
were ready with their pencils, the door to the ex-
perimental room was opened and subjects were led
into the room. In the no-crowd condition, two sub-
jects entered the room, which contained two chairs.
In the crowded condition, five subjects entered the
room, which contained five chairs. The tasks were
placed face down on the chairs before the subjects
arrived. Subjects then read the instructions of their
tasks, either crossing out and counting vowels from 
a list of random letters or performing five logical 
word puzzles taken from Wylie (1957). The instruc- 
tions urged subjects to work as quickly and ac- 
curately as possible and to time their own solutions. 
The experimenter then closed the door for a speci- 
fied period of time (13 minutes for the vowel cross- 
out and 23 minutes for the word puzzles). A pre- 
test with the same tasks had indicated that the ma- 
jority of subjects completed the puzzle task in 23 
minutes, whereas the vowel-counting task required
only 13 minutes to complete. All subjects in the
room worked on the same type of task. Subjects
were asked to record their answers on an answer 
sheet along with the time of day that each puzzle or 
vowel was completed. That is, subjects wrote down 
the time, according to a clock hanging in the experi- 
mental room, after completing each of the five puz- 
zles, or after counting each of the five vowels. In 
addition to the starting time, then, five times were 
recorded on the answer sheet if the subject com- 
pleted all five tasks. 
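
The recording scheme just described (a starting time followed by one clock time per completed item) lends itself to a simple computation of minutes spent per item. The sketch below is only an illustration under stated assumptions: the "HH:MM" format, the function names, and the example times are hypothetical and do not come from the study, and the sketch treats every recorded item alike without checking whether its answer was correct.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M"  # assumed clock format; not specified in the text


def minutes_per_item(start, completion_times):
    """Minutes between consecutive recorded clock times.

    `start` is the starting time; `completion_times` holds one time per
    completed puzzle or vowel count, in the order they were finished.
    """
    stamps = [
        datetime.strptime(t, TIME_FORMAT) for t in [start, *completion_times]
    ]
    return [
        (later - earlier).total_seconds() / 60
        for earlier, later in zip(stamps, stamps[1:])
    ]


def mean_minutes_per_item(start, completion_times):
    """Average minutes per completed item, or None if nothing was completed."""
    durations = minutes_per_item(start, completion_times)
    return sum(durations) / len(durations) if durations else None


if __name__ == "__main__":
    # Hypothetical answer sheet: started at 14:00, finished three items.
    print(minutes_per_item("14:00", ["14:03", "14:07", "14:12"]))
    print(mean_minutes_per_item("14:00", ["14:03", "14:07", "14:12"]))
```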

After the time period elapsed, the experimenter
opened the door, collected the papers, and handed
the subjects a questionnaire to assess their feelings
about the experimental room. This questionnaire con-
sisted of 12 7-point scales; subjects rated the room
in terms of its temperature, how pretty it was, il-
lumination levels, stuffiness, size, adequacy, cheerful-
ness, suitability, dampness, pleasantness, how crowded
it appeared, and how much physical discomfort it
caused them. When this questionnaire was com-
pleted, subjects left the experimental room and were
debriefed and thanked.


Anticipation-of-Crowding Condition 


For those subjects who were led to believe that
five subjects would participate, the procedure was
the same with the following exceptions: The door to
the experimental room was open when subjects ar-
rived at the large anteroom, and in all cases, five
chairs with five papers (the tasks) on them were
clearly visible. Before telling the subjects to lay
their belongings on a table, the experimenter pointed
to the experimental room and stated that the small
room was the location of the experiment. When
subjects were ready with their pencils, the experi-
menter showed them a list of five names, and ask-
ing each subject his/her name, conspicuously checked
it off. In the condition in which crowding occurred,
five subjects entered the experimental room, and
the door was closed. In the condition in which the
anticipation of crowding was not confirmed, the two
subjects entered the experimental room after having
their names checked off the list of five names and
were told by the experimenter that he would let the
others in as soon as they arrived. The experimenter
then closed the door, leaving the two subjects to
work in a room with three empty chairs. As in the
no-anticipation condition, subjects were debriefed
and thanked after completing the final questionnaire.


Dependent Measures 


The tasks were scored in the following manner:
For the vowel-counting task, there were five correct
answers (one for each vowel), and a subject was
either correct or incorrect in recording the frequency
of each vowel. Of the five logical word puzzles, two
required only one answer to solve the puzzle cor-
rectly. Of the remaining three puzzles, one required
two answers for successful completion, one required
three answers, and the third required four answers.
To be scored correct, all answers of the given puzzle
had to be shown correct. In the analyses that fol-
low, standard scores were computed using the over-
all means of all subjects assigned to the puzzle condi-
tions and of all subjects in the vowel conditions.

Three different measures of performance were
analyzed: simple number of correct puzzle solutions
and correct vowel counts; number of puzzles and
vowel problems attempted; and the number of min-
utes spent for each solution on both tasks. Because
of the nature of the instructions, the minutes-per-solution
measure was treated as an index of efficiency simi-
lar to Paulus, Gatchel, and Seta’s (1978) efficiency
(rate). In order to test both the puzzle and
vowel data in an analysis that included type of task
as an independent variable, each subject’s scores
were transformed to standard scores, using the over-
all means of all subjects assigned to the appropriate
task condition.
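
The transformation described above, standardizing each score against all subjects who worked on the same task, can be sketched as follows. This is an illustration under assumptions rather than the authors' procedure: the text does not say whether a population or sample standard deviation was used (the population form is assumed here), and the function name and example scores are hypothetical.

```python
from statistics import mean, pstdev


def standardize_within_task(scores_by_task):
    """Convert raw scores to standard (z) scores separately for each task.

    `scores_by_task` maps a task label to the raw scores of every subject
    assigned to that task. Each score is centered on its own task's mean
    and divided by its own task's standard deviation (population SD is an
    assumption; the source does not specify).
    """
    standardized = {}
    for task, scores in scores_by_task.items():
        center = mean(scores)
        spread = pstdev(scores)
        if spread == 0:
            raise ValueError(f"all scores identical for task {task!r}")
        standardized[task] = [(score - center) / spread for score in scores]
    return standardized


if __name__ == "__main__":
    # Hypothetical minutes-per-solution scores, not data from the study.
    raw = {"puzzle": [4.0, 5.0, 6.0], "vowel": [2.5, 3.0, 3.5]}
    for task, z_scores in standardize_within_task(raw).items():
        print(task, [round(z, 2) for z in z_scores])
```

Standardizing within task puts puzzle and vowel scores on a common scale, which is what allows type of task to enter a single analysis as an independent variable.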


Results 


Preliminary Analyses 


All three dependent measures were
analyzed to determine if there were any ef-
fects of sex of subject, sexual composition of
the group, or the groups themselves. Sex,
crowding, and whether the crowding was
anticipated were used as factors in a multi-
variate analysis of variance (MANOVA) on the
three measures for each task. Neither main ef-
fects of sex nor interactions were found for
either the puzzles or the vowels.

The groups, which consisted of either 
two persons (uncrowded) or five persons 
(crowded) were further categorized as all 
male, all female, or mixed in order to deter- 
mine whether sexual composition of the group 
affected performance. Again, a multivariate 
analysis of variance indicated neither a main 
effect nor interactions for sexual composition 
for either task. 

Finally, a MANOVA using crowding, antici-
pation, and groups as a nested factor was con-
ducted for each task. The groups effect was 
not significant in either the puzzle or the 
vowel conditions. Because of the lack of ef- 
fects for sex, sexual composition of the group, 
or the groups themselves, the data below are 
collapsed across these variables. 


Task Performance 


The effects of forewarning of a crowd on
simple and complex task performance were
tested in a three-way MANOVA (Anticipation
× Crowding × Task) using the number of
correct solutions, the number attempted, and
the time per correct solution (efficiency).
There were significant interactions: Anticipa-
tion × Crowding, F(3, 70) = 3.81, p < .01;
Crowding × Task, F(3, 70) = 2.77, p <
.05; and Anticipation × Crowding × Task,
F(3, 70) = 2.58, p < .06; and a significant
main effect of tasks, F(3, 70) = 74.74, p <
.0001.

The univariate tests on each dependent
variable indicated that on the number cor-
rectly completed, only a task effect was ob-
tained. Subjects who counted vowels actually
got fewer correct than did subjects who
solved puzzles (1.40 correct for the vowel
count versus 1.97 for the puzzles), F(1, 72)
= 6.89, p < .01. This difference may arise
from the different nature of the feedback in-
volved in each task. The vowel counting of-
fered no check on the correctness of the solu-
tion; that is, errors in counting were unlikely
Table 1
Mean Number of Tasks Attempted as a
Function of Forewarning and Crowding

Condition          Crowded    Not crowded
Anticipation       4.6a,b     4.15
No anticipation    4.4a,b     4.75


Note. For each mean, n = 20. Means with different
subscripts differ significantly using Duncan's multi-
ple-range tests. Number of tasks attempted (both
puzzles and vowels) ranged from 3 to 5; maximum
possible = 5.


to be detected. An incorrect guess concerning 
the logical puzzle solution could be more 
easily rejected, because subsequent guesses 
would prove incorrect. 

On the number of tasks attempted, the
Anticipation × Crowding interaction was sig-
nificant, F(1, 72) = 6.03, p < .016. Across
both tasks, subjects whose expectancies were
confirmed attempted more problems than did
subjects whose expectancies were discon-
firmed. Subjects who were forewarned about
a crowd and were actually crowded tried
more tasks than subjects forewarned and not
actually crowded. Subjects not forewarned
and not crowded tried more tasks than sub-
jects surprised by the crowd (see Table 1).

On the third dependent variable, the
amount of time spent per correct solution,
the three-way interaction (Anticipation ×
Crowding × Task) was significant, F(1, 72)
= 5.11, p < .02. As can be seen in Figure 1,
it is clearly in the complex task that crowd-
ing and anticipation had their greatest effect.

To examine these results more closely,
a 2 (anticipation vs. no anticipation) ×
2 (crowded vs. not crowded) analysis of vari-
ance was performed on all three dependent
variables. In this analysis, no effects of an-
ticipation or crowding appeared for either the
number of vowels correctly counted or the
number attempted. There was an effect of
anticipation on efficiency (number of minutes
spent working on each letter) on the vowel-
counting task, however, F(1, 36) = 4.38, p
< .04. Subjects who anticipated a crowd spent
more time on each vowel count than subjects
not anticipating a crowd (3.14 minutes vs.
2.68 minutes per vowel count).
Figure 1. Efficiency as a function of density, an-
ticipation, and type of task.


On the puzzle task, the Crowding × An-
ticipation interaction was significant on both
the number of puzzles attempted, F(1, 36) =
7.04, p < .012, and on the mean time per
solution, F(1, 36) = 9.10, p < .004. As can
be seen in Table 2, fewer attempts to solve 
were made when crowding was expected but 
failed to occur and when crowding was not 
expected but did occur. In both these cells, 
subjects were caught “unaware.” Those who 
had been forewarned that they would be per- 
forming in a high-density environment but 
actually worked in low-density conditions 


Table 2
Mean Number of Puzzles Attempted and Time (in minutes) per Solution as a Function
of Forewarning and Crowding

                          Crowded                  Not crowded
Condition          No. attempted   Time     No. attempted   Time
Anticipation
No anticipation


The same pattern of effects appears in the
time required for each solution (see Table
2). In this interaction, subjects who expected
a crowd and were indeed crowded or who did
not expect a crowd and were not crowded per-
formed more efficiently than subjects who
were surprised by the experimental condition.


Perceptions of the Experimental Room 


To determine whether the anticipation fac-
tor also affects perceptions of the environ-
ment, a multivariate analysis of variance was
performed on the scales subjects used to rate
the experimental room. As in Baum and Green-
berg’s (1975) study, there was a significant
main effect of anticipation, F(12, 65) = 2.36,
p < .012.

Univariate tests of anticipation indicated
that anticipation had significant effects on
how hot, F(1, 72) = 7.32, p < .01; how com-
fortable, F(1, 72) = 4.65, p < .05; how ade-
quate, F(1, 72) = 7.99, p < .01; and how
suitable, F(1, 72) = 10.26, p < .01, the room
appeared (see Table 3). Subjects who antici-
pated a crowd rated the room as hotter, less
comfortable, less adequate, and less suitable
than subjects who were not forewarned.
Whether or not the crowd actually material-
ized also approached significance on the room
perception data, F(12, 65) = 1.82.
The only rating on which crowding had a
significant main effect was the perception of
how crowded the room appeared, F(1, 72) =
14.50, p < .001. Crowded subjects did per-
ceive the room as more crowded than non-
crowded subjects (M = 4.25 vs. M = 2.62,
respectively).

[Table 2, values recovered under "Not crowded": 5.63, 4.25, 4.20, 4.80; cell assignment not recoverable.]

The multivariate F for the Crowding X
Anticipation interaction was not significant,
F(12, 65) = 1.03, p < .4. Type of task also
had no effect on room perceptions, even
though puzzle subjects spent a longer time
in the experimental room than did subjects
counting vowels.


Discussion 
Performance 


The present data raise a number of issues 
heretofore unaddressed in either the social 
facilitation or crowding literature. Subjects 
in uncrowded conditions who were not misled 
to anticipate a crowd performed better on 
both simple and complex tasks compared to 
subjects warned of a crowd that failed to 
materialize. Likewise, being crowded when
one was not forewarned hurt performance on
simple and complex tasks compared to work-
ing in crowded conditions with forewarning,
[...] these differences were not significant.
On the simple vowel-counting task, antici-
pation decreased efficiency for both crowded
and noncrowded conditions. This finding fits
neither a learned drive (Cottrell, 1972) nor
a distraction explanation (Baron, Moore, &
[...]) of social facilitation. If fore-
warning about crowding acts to increase
arousal, subjects’ performance on
the simple vowel-counting task should have
been superior to the performance of subjects
given no information.
Nor are these results analogous to those
reported by investigators concerned with the
effects of information about impending stress.
In those studies, the amount or kind of in-
formation has varied, rather than its accuracy.
As Averill (1973) notes, predictability has
been found both to enhance and to inhibit
[...]. However, in the data pre-
sented here, subjects with equivalent infor-
mation about the experimental setting (that
is, the anticipation-crowded and the no-an-
ticipation-not-crowded conditions) were not
equally efficient vowel counters.


Table 3
Mean Ratings of the Experimental Room as a
Function of Anticipating a Crowd

                        Descriptive dimension
Condition          Hot      Comfortable   Adequate   Suitable
Anticipation       3.70     4.47          4.50       4.25
No anticipation    4.20     3.67          3.42       3.02
F                  7.32**   4.65*         7.99**     10.26**

Note. n = 40. Ratings are on a 7-point scale. The
lower the mean, the closer to the descriptive dimen-
sion.
* p < .05. ** p < .01.


Further indication that anticipation does
not affect the actual crowding experience in
any straightforward manner is found in the
data from the complex task conditions. Per-
formance [...] by anticipation itself. Subjects
who anticipated and did experience the crowd
performed better than those who had no advance
warning of the crowd. For crowded subjects then,
anticipation enhanced performance. Subjects
not forewarned and not crowded performed
equally well, but those who mistakenly an-
ticipated a crowd were less efficient.

These complex task results fit a social fa-
cilitation paradigm only if one assumes that
it is the disconfirmation of an expectancy that
contributes to arousal or distraction. The dis-
confirmation explanation also helps explain
the data in terms of the information about
aversive stimuli research. Lack of predictabil-
ity decreases complex task efficiency.

As in previous studies (Freedman et al.,
[...]; Stokols et al., 1973), there were
no differences in performance as
a result of crowding alone. Studies
(Aiello et al., 1977; Epstein & Karlin, [...])
in which delayed effects of crowding are re-
ported may inadvertently have confounded
expectations about crowding in the test situa-
tion with the pretest density manipulation.
Subjects exposed to a crowd may anticipate
further crowding; their surprise at their sub-
sequent isolation during task performance
may affect their performance as much as the
crowd they previously experienced.

The performance data only partially repli- 
cate the findings of Langer and Saegert 
(1977). In that study, forewarning proved 
beneficial whether or not crowding actually 
occurred. This discrepancy may arise from 
the different nature of the tasks used or the 
type of forewarning manipulation. Langer 
and Saegert explicitly warned subjects that 
crowding induced anxiety; in the present 
study, the forewarning was information only, 
and subjects were not led to expect any par- 
ticular psychological consequences. 


Room Perceptions 


The room perception data do show main 
effects of anticipation. On several scales, sub- 
jects who anticipated the crowd rated the 
room as more uncomfortable than subjects 
not forewarned. The room ratings were made 
at the end of the experimental session. The 
absence of any Crowding X Anticipation in- 
teraction on room perceptions suggests that 
the density levels used here did not create 
additional discomfort beyond that generated 
by the forewarning at the beginning of the 
experiment. 

Baum and Greenberg’s (1975) contention 
that crowding and anticipation are dynami- 
cally similar states receives equivocal support 
in the present results. A strict test of Baum 
and Greenberg’s hypothesis would require 
similar perceptions from subjects anticipating 
a crowd that did not materialize and sub- 
jects who did not anticipate a crowd but did 
find one. Complete support for this interpreta- 
tion of Baum and Greenberg’s hypothesis 
would necessitate an interaction between 
crowding and anticipation such that antici- 
pating-not-crowded subjects perceived the 
room in the same way as nonanticipating— 
crowded subjects. No univariate tests of the 
Crowding X Anticipation interactions on the 
room ratings were significant. Thus, although 
anticipation did cause differences in percep- 
tions of the room, anticipating a crowd did 
not lead to the same perceptions as actually 
being in one. 


There is more support for Baum and
Greenberg’s proposal in the performance data.
Subjects who anticipated a crowd that never
materialized did not perform significantly
differently than subjects not anticipating a
crowd who worked under crowded conditions.

In the present study, reliable interactions
were obtained despite the indirect nature of
the forewarning and a crowding experience
that was perceived as only moderately uncom-
fortable. Whether anticipation would have
had similar effects if the crowding had truly
been stressful has yet to be determined. The
degree to which the forewarning is simply an
objective description of future conditions,
advises on coping mechanisms, or forecasts
physical or emotional distress as a conse-
quence of severe crowding also should be
examined. Human beings encounter naturally
crowded situations both with (e.g., holiday
shopping) and without (e.g., delays in air
terminals) forewarning. It would seem ap-
propriate to investigate under what conditions
different types of foreknowledge increase the
ability to cope with a crowd.


Persistence of Opinion Change Induced Under Conditions


Ninety-six high school students participated in a study investigating the im-
mediate and delayed effects of forewarning of persuasive intent. It was predicted
that subjects would change less immediately after reading persuasive com-
munications because the forewarning would serve as a discounting cue, but that
over time, they would tend to forget or dissociate this cue, thus allowing the full
impact of the communication to emerge. The results strongly supported this
hypothesis.

A second experiment involving 104 high school students was conducted to
replicate the first study and to extend the same reasoning to the case of distrac-
tion. Distraction was expected to facilitate immediate opinion change, presumably
because of interference with counterarguments; but because of its detrimental
effect on comprehension and a presumed tendency for subjects to think of op-
posing arguments after leaving the experimental situation, the change was ex-
pected to dissipate more rapidly than in the nondistracted conditions. The data
confirmed predictions regarding both forewarning and distraction.


Although forewarning of persuasive intent 
has been studied extensively (e.g., McGuire, 
1966, 1969; McGuire & Papageorgis, 1962;
Papageorgis, 1968; Petty & Cacioppo, 1977),
there have been no reported investigations of
its long-term, or delayed, effects. If forewarn-
ing has an immediate detrimental effect on
opinion change, which is by no means uni-
versally found (e.g., Papageorgis, 1968),
there is reason to believe that the inhibiting
influence will dissipate over time, thus allow-
ing a greater delayed than immediate impact
of the message. That is, if forewarning leads
subjects to think of counterarguments either
before or while reading a communication (e.g.,
McGuire & Papageorgis, 1962; Petty &
Cacioppo, 1977), arouses psychological re-
actance (Brehm, 1966; Hass & Grady, 1975),
or causes the subjects to perceive the com-
municator as less fair (e.g., Hass & Grady,
1975), an immediate reduction in persuasive
impact should result. However, unless
forewarning significantly interferes with learn-
ing the message content, which seems unlikely
(e.g., Freedman & Sears, 1965), subjects may
tend to forget or spontaneously dissociate
these initial reactions over time, thus al-
lowing the full persuasive impact of the ma-
terial to emerge. For example, Hass and
Grady (1975) point out that despite the
presence of reactance, one may be persuaded
by the informational value of the arguments
presented by the communicator. It would log-
ically follow that given compelling arguments,
once the initial reactance subsides, the impact
of the information should produce maximum
opinion change over time (e.g., Gruder et al.,
1978). Similarly, if forewarning tends to
create the perception of a biased communicator
(e.g., Hass & Grady, 1975), as this discount-
ing cue becomes dissociated from the message
content over time, a delayed increase in
agreement might be expected. This is
analogous to the original interpretation of
the sleeper effect in persuasion (Hovland,
Lumsdaine, & Sheffield, 1949).


Sincere appreciation is extended to Robert S.
Wyer, Jr., and Richard E. Petty for their most help-
ful and astute comments when reading earlier ver-
sions of this manuscript.

Requests for reprints should be sent to William
A. Watts, Department of Education, University of
California, Berkeley, California 94720.


Issue Involvement as a Factor in Forewarning


In attempting to account for the conflict- 
ing results in studies of forewarning, Papa- 
georgis (1968) has differentiated between 
cases involving the mere prior announcement 
of the topics and directions of persuasive com- 
munications, termed “warnings,” and those 
in which the subjects are specifically told 
that the experiment deals with persuasion, 
dubbed “persuasion contexts.” He hypothe-
sized that persuasion contexts served to re-
duce the impact of communications about is- 
sues with high involvement or those that con- 
cern controversial topics or make emotional 
appeals, whereas for communications about 
issues with low involvement or those that 
argue about cultural truisms or make factual 
appeals, persuasion contexts have no more ef- 
fect than disguised (no forewarning) con- 
texts when other characteristics of the persua- 
sion situation are neutral. 

There are at least two theoretical reasons
for expecting different results for issues of
high and low involvement: (a) Psychological
reactance (Brehm, 1966) should be greater
for important issues, and (b) subjects are
probably less able and/or motivated (e.g.,
Vinokur & Burnstein, 1978) to think of
counterarguments in the case of cultural tru-
isms or esoteric topics. Dean, Austin, and
Watts (1971) found no support for these con-
jectures; however, their studies were limited
in the sense that each employed only two is-
sues, one high and the other low in involve-
ment. Naturally, the topics varied in a num-
ber of other respects that may have influenced
the obtained results.

In the present study, eight issues were used
in an attempt to minimize the aforementioned
problem: Four dealt with familiar, contro-
versial issues; and the others, with esoteric
topics. An example of the former would be
the proliferation of nuclear weapons, and of
the latter, the increasing length of the geo-
logical day. If the reasoning of Papageorgis
(1968) and perhaps that of Apsler and Sears
(1968) is correct, the immediate effects of
forewarning should be to inhibit change on
the familiar topics and to facilitate it or have 
no effect on subjects’ opinions concerning the 
esoteric issues. 

The first study was designed to test these 
conjectures. Since each subject read messages 
on four different topics, it was not feasible to 
warn them of the positions advocated in each 
case. Rather, subjects were warned of the 
persuasive intent (Papageorgis, 1968) of the 
materials they were about to read. This ma- 
nipulation has been used in several previous 
studies (e.g., Hass & Grady, 1975; Kiesler & 
Kiesler, 1964). Under these circumstances 
there is no opportunity for anticipatory 
counterarguing (Petty & Cacioppo, 1977); 
consequently, there is no reason for a delay 
between forewarning and receipt of the mes- 
sages (e.g., Hass & Grady, 1975). Similarly,
there is little opportunity for anticipatory
belief change (e.g., McGuire & Millman,
1965), since the subjects are unaware of the
topics and positions advocated until the mes-
sages are received.


Distraction as a Variable Influencing
Persistence of Opinion Change


A second study focused on the single and
joint effects of forewarning and distraction.
Whereas forewarning has been shown previ-
ously to facilitate production of counterargu-
ments (e.g., Petty & Cacioppo, 1977), dis-
traction apparently interferes with this pro-
cess (e.g., [...] Baron, & Miller, 1973; [...]
1964; Osterhouse & [...]; Petty, Wells, &
Brock, 1976). [...] comprehended.
There is considerable evidence
that distraction also interferes with compre-
hension (e.g., [...]; Petty et al., 1976;
Zimbardo, Snyder, [...]). Whenever a variable
affects the comprehension
and yielding components of persua-
sion in opposite directions, the resultant effect
often may be nonmonotonic (e.g., McGuire,
1968, 1969; Wyer, 1974, chap. 7). That is,
a moderate amount of distraction will facili- 
tate opinion change because the reduction in 
counterarguing more than offsets the decre- 
ment in comprehension. As distraction in- 
creases, however, the loss in comprehension 
should become greater, thus leading to a 
decrease in opinion change, since, in the ex- 
treme case, one cannot conform to the posi- 
tion advocated in the communication without 
understanding the side taken. Even though 
the immediate effect of distraction may be to 
facilitate opinion change, there are at least 
two reasons to expect the gain to be short- 
lived. First, several prior studies indicate 
that experimental treatments affecting initial 
learning influence persistence of induced opin- 
ion change as well (see Cook & Flay, 1978). 
Thus, the decay in opinion change may be 
quite rapid, depending on the extent that 
distraction interferes with comprehension. 
Note that in the case of distraction, consid-
eration of its effects on comprehension leads
to the exact opposite prediction of that de-
rived for forewarning, namely, that the in- 
duced opinion change would dissipate more 
rapidly for distracted subjects because of 
poorer learning of the message content. 
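
The nonmonotonic prediction can be illustrated with a small numerical sketch. It assumes, purely for illustration, that opinion change is the product of a comprehension probability that falls linearly with distraction and a yielding probability that rises linearly with it. The functional forms, the 0.4 baseline, and the function names are hypothetical and are not taken from the studies cited above.

```python
def comprehension(distraction: float) -> float:
    """Hypothetical: probability of understanding the message falls linearly."""
    return 1.0 - distraction


def yielding(distraction: float) -> float:
    """Hypothetical: yielding rises as distraction disrupts counterarguing."""
    return 0.4 + 0.6 * distraction


def opinion_change(distraction: float) -> float:
    """Two-component sketch: change requires both comprehension and yielding."""
    return comprehension(distraction) * yielding(distraction)


if __name__ == "__main__":
    # Distraction is scaled from 0 (none) to 1 (message cannot be understood).
    for level in (0.0, 0.1, 0.2, 0.5, 1.0):
        print(f"distraction={level:.1f}  change={opinion_change(level):.3f}")
```

Under these assumed curves, predicted change is slightly higher at moderate distraction than with none, and it falls to zero when nothing is comprehended.
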
Another reason for more rapid decay of 
opinion change under conditions of distrac- 
tion hinges on postexperimental counterargu- 
ing. Once the subject has left the experimental 
room, he or she is free of distraction and in 
a position to reflect on the persuasive mes- 
sages. It seems likely that in the process, the 
individual would be able to think of a greater 
number of opposing arguments that would 
serve to dampen his/her newly formed opin- 
ion. Naturally, the people who were not dis- 
tracted may also cogitate upon the issues 
afterward and think of additional counter- 
arguments. However, the latter should be 
smaller in number and have less impact be-
cause these individuals would already have
considered many of the counterarguments 
during the experimental session and have 
taken them into account at that time (e.g., 
Vinokur & Burnstein, 1978; Vinokur, Trope, 
& Burnstein, 1975). Actually, the distracted 
subjects may be more motivated to think
about the topics after the experimental ses-
sion, since reading communications under
conditions of distraction may require greater
effort (e.g., Baron et al., 1973), and if sub-
jects have worked harder on a task, it would
seem likely that they would feel increased
involvement.

To test these conjectures, in the second
study, forewarning versus no forewarning and
distraction versus no distraction were varied
orthogonally so that their single and joint
effects on immediate and delayed opinion
change could be examined.


Experiment 1 
Method 
Subjects and Design 


The subjects consisted of 96 students from four
senior-level high school classes who participated in
both sessions of the experiment, separated by a 1-
week interval, during their normal class periods.
There were approximately equal numbers of males
and females.

Eight messages were randomly divided into two
subsets of four each with the restriction that they
had to contain two familiar and two esoteric
topics. These forms of the materials were alternated.
Subjects read one of the subsets of persuasive mes-
sages and then stated their opinions on all eight
issues. Hence, each person served as an experimental
subject on four issues and as a control on the re-
maining four topics. A 2 X 2 X 2 X 2 factorial design
was used involving one between-subjects variable
(forewarning vs. no forewarning) and three within-
subjects variables (message type: esoteric vs. fa-
miliar, time of measurement, and experimental vs.
control). The order of presentation of the two sets
of messages was counterbalanced. Since subjects were
randomly assigned to experimental treatments, a
posttest-only design was employed, inasmuch as a
pretest itself would serve to some extent as a
warning of persuasion.


Procedure

The study was represented as an investigation of
factors related to learning and retention of
communications. It was explained that such factors
as the controversiality of the topics and whether the es-
says were written in an emotional or a [...]
were thought to play important roles in determining
learning efficiency and recall of message content. In
the first session, subjects read four persuasive com-
munications, each averaging about 300 words in
length. Two of the messages in each set dealt with
familiar, somewhat controversial topics, whereas the
other two concerned esoteric issues.¹

The forewarning manipulation was incorporated 
into the written materials to facilitate random as- 
signment of this variable to subjects within a given 
class. Approximately one half of the subjects re-
ceived instructions indicating that the messages were
designed to change their opinions on certain issues.
The directions were worded as follows:


On the following pages you will find four pas- 
sages; they are designed to persuade you. Each 
message will attempt to change your opinion about 
a particular topic. To ensure a careful reading, 
please pick out and underline the shortest phrase, 
or phrases, in each paragraph which convey the 
idea expressed. As you know, we are interested in 
the manner in which subject matter affects in- 
formation processing; and its persuasive intent is 
thought to be one important factor. Your close at- 
tention to the task will contribute to a more ade- 
quate understanding of the process. Thank you. 


In the instructions for the nonforewarned sub-
jects, all references to the persuasive intent of the
materials were deleted, and the previous verbal state-
ment regarding the influence of familiarity and con-
troversiality in information processing was repeated.
The time interval between warning and exposure to 
the persuasive materials was no more than a few 
seconds, which Hass and Grady (1975) have shown 
is quite adequate. 

After the four messages were read and the ma-
terials collected, subjects’ opinions were assessed on
all eight issues, thus providing no-message control
data for four topics. The justification of the opinion
measurement was in terms of obtaining indices of
controversiality of the issues for the population
being studied. Opinions were measured on 100-
point probability-of-truth scales calibrated in units
of 10, ranging from 0 (Very improbable) to 100
(Very probable). The rating scale was presented im-
mediately below each statement. One opinion item
pertained to each topic. Examples include “The
number of countries producing nuclear weapons is
[...] rapidly” and “A substantial falling off
of consumer purchasing power in the United States
is presently occurring.” After the opinion ratings
had been collected, each subject completed a mul-
tiple-choice test of comprehension, with three items
pertaining to each issue. In addition, subjects rated,
on 7-point scales, how fair the articles were, how
interesting they found the subject matter, and to
what extent they were thinking of opposing argu-
ments as they read the communications. Afterward,
subjects were thanked for their participation. No
mention was made of any follow-up testing.

In the second session, after a 1-week interval, sub-
jects again supplied opinion ratings and completed
the comprehension test. Thereafter, the true purpose
of the study was explained. Since the experiment
was presented as one dealing with comprehension
and retention of message content as functions of
controversiality and other characteristics of the ma-
terial, the second request for opinion ratings was
justified on the basis of providing indices of tem-
poral fluctuations, since controversiality is
often quite changeable.


Results and Discussion 


The opinion data were analyzed with a 2 X
2 X 2 X 2 factorial analysis of variance con-
sisting of one between-subjects variable (fore-
warning vs. no forewarning) and three within-
subjects variables (message type: esoteric
vs. familiar, time of measurement, and ex-
perimental vs. control). To facilitate analy-
sis, data from two subjects in the forewarned
condition were randomly discarded in order
to obtain equal cell frequencies. The overall
persuasive impact of the messages and the
effects of the experimental treatments were
determined by comparing the final opinion
scores for the message topics with the no-
message control scores. Within the familiar
and esoteric message conditions, subjects’
opinion scores were averaged across
issues.


Message Effects 


The overall persuasive effect of the mes-
sages, without regard to experimental condi-
tions, was impressive, F(1, 92) = 62.[...], p <
.01, for the main effect of experimental versus
control treatments. As expected, subjects
changed their opinions considerably more on
the low-involvement, esoteric issues than on
the familiar, more involving ones, F(1, 92)
= 13.75, p < .01, for the interaction
of message type with experimental versus
control condition. Although the interaction
between forewarning and experimental versus
control conditions was significant, F(1, 92)
= 9.93, p < .01, the second-order interaction
involving forewarning, message type, and
experimental versus control conditions that
would be expected from the work of Papa-
georgis (1968) was trivial (F = .10). That
is, although subjects changed their opinions
much more when the topics were low rather
than high in involvement, compared to no-
message controls, the inhibitory effects of
forewarning were the same for both types of
issues. While the data offered no support for
Apsler and Sears’ (1968) multiplier hy-
pothesis, one should keep in mind that these
investigators warned subjects of the specific
topics and sides to be taken, whereas we
simply forewarned them of persuasive intent
without specifying the topic or side. Finally,
in retrospect, it seems that perhaps even the
familiar issues were relatively low in personal
involvement, particularly for the population
studied. This may explain the weakness of
some of the anticipated effects.


Table 1
Mean Opinion Scores for Each of the Experimental and Control Conditions

                          Familiar issues                             Esoteric issues
               Not forewarned        Forewarned            Not forewarned        Forewarned
Condition      Immediate  Delayed    Immediate  Delayed    Immediate  Delayed    Immediate  Delayed
Experimental   72.76      61.92      61.38      65.74      73.40      63.19      57.34      63.62
Control        55.32      55.21      63.62      60.32      44.04      49.36      45.42      49.68

Note. Immediate and Delayed refer to time of measurement. These means are averaged
across the two similar issues in each condition and are based on 100-point
scales ranging from 0 (Very improbable) to 100 (Very probable). Cell ns = 47.


¹ The four familiar topics dealt with the prolifera-
tion of nuclear weapons, economic [...] devel-
oping African nations, the in[...] number of [...]
the periphe[...] amount of fuel and power in [...]
yield of [...] influence of New York City banks on
U.S. government bonds.


Persistence of Forewarning Effects 


The primary hypothesis tested in the pres-
ent study involved the interaction between
forewarning and time of measurement. It was
predicted that forewarning would inhibit the
immediate change resulting from the per-
suasive communications, but that this effect
would be short-lived, and over time, the fore-
warned group would show an increment as
the forewarning is forgotten or spontaneously
dissociated, whereas those subjects who were
not forewarned would show a typical decay.
The test for this hypothesis is the Forewarning
X Time X Experimental versus Control inter-
action, which was significant beyond the .01
level, F(1, 92) = 22.12. The pattern of means
displayed in Table 1 shows that the direction
of this interaction was as predicted, with
subjects who were forewarned showing an
initial decrement of 13.72 points, averaged
across message type, compared to those in-
dividuals who were not forewarned, and, in-
deed, changing only trivially more than the
no-message controls. After a 1-week interval,
however, the forewarned subjects had shown
an absolute increase of 5.32 points and were
now slightly higher than the nonforewarned
group and substantially higher than their con-
trols. Although the data show a relative
sleeper effect (Cook & Flay, 1978), the in-
crease did not reach significance when tested
by Dunn’s (1961) method; an absolute
sleeper effect was not obtained.

It is interesting to note the similarity of
the delayed measurement means across ex-
perimental treatments, indicating that regard-
less of the immediate effects of forewarning,
the long-term (1-week interval) impact is
about the same as for subjects who were not
forewarned. The findings cannot be attributed
to all persons reverting to the [...]
during the time interval, since the overall de-
layed experimental mean, aggregated across
treatments, was much higher than for the
similar control conditions (63.62 vs.
53.64).
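
The decrement and recovery figures reported above for the forewarned subjects can be checked against the experimental-condition means in Table 1. The sketch below is only an arithmetic check: the cell means are entered as they are read from the table, and the dictionary layout and function names are illustrative assumptions rather than anything used in the original analysis.

```python
# Experimental-condition means as read from Table 1 (0-100 probability scale).
# Keys: (message type, forewarning condition, time of measurement).
EXPERIMENTAL = {
    ("familiar", "not_forewarned", "immediate"): 72.76,
    ("familiar", "not_forewarned", "delayed"): 61.92,
    ("familiar", "forewarned", "immediate"): 61.38,
    ("familiar", "forewarned", "delayed"): 65.74,
    ("esoteric", "not_forewarned", "immediate"): 73.40,
    ("esoteric", "not_forewarned", "delayed"): 63.19,
    ("esoteric", "forewarned", "immediate"): 57.34,
    ("esoteric", "forewarned", "delayed"): 63.62,
}
MESSAGE_TYPES = ("familiar", "esoteric")


def mean_over_types(forewarning: str, time: str) -> float:
    """Average a forewarning-by-time cell across the two message types."""
    values = [EXPERIMENTAL[(m, forewarning, time)] for m in MESSAGE_TYPES]
    return sum(values) / len(values)


def initial_decrement() -> float:
    """Immediate score of nonforewarned minus forewarned subjects."""
    return (mean_over_types("not_forewarned", "immediate")
            - mean_over_types("forewarned", "immediate"))


def forewarned_gain() -> float:
    """Change in the forewarned group from immediate to delayed testing."""
    return (mean_over_types("forewarned", "delayed")
            - mean_over_types("forewarned", "immediate"))


if __name__ == "__main__":
    print(f"initial decrement: {initial_decrement():.2f}")  # 13.72
    print(f"forewarned gain:   {forewarned_gain():.2f}")    # 5.32
```

Both values agree with the 13.72-point decrement and the 5.32-point increase stated in the text.
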


Forewarning Effects on the Control Issues


As stated in the introduction, it would
have been virtually impossible for antici-
patory belief change (such as that observed
by McGuire & Millman, 1965, and others)
to have occurred in the present study, since
neither the topics nor the positions advocated
were known prior to being read by the sub-
jects. It is quite conceivable, however, that 
the forewarning might have led to some gen- 
eral reaction at the time of completing the 
postexperimental test, particularly since each 
subject served in both the experimental and 
control conditions. It can be seen in Table 1 
that control subjects in the forewarned condi- 
tions were more favorable toward the familiar 
issues, both immediately afterward and dur- 
ing the delayed testing, than those subjects 
who were not forewarned; but no such effect 
occurred for the esoteric topics. 

Dinner, Lewkowicz, and Cooper (1972) 
found greater change in high-self-esteem sub- 
jects anticipating communications concern- 
ing familiar topics, presumably so they would 

avoid appearing gullible. While Dinner et al. 
announced the topics and sides to be taken, 
it is possible that in the present study, sub- 
jects surmised the directions that would be 
advocated on the control issues and responded 
accordingly to avoid subsequent change 
should they later receive communications on 
these topics. However, considering the rela-
tive lack of sophistication of the high school
subjects, it seems far more probable that this
was just a chance occurrence, particularly
since it did not appear for the esoteric topics
and was not replicated in the second study
for familiar issues.


Comprehension 


These data were analyzed in the same man-
ner as the opinion scores, with subjects’ re-
sponses averaged across the two topics within
each experimental condition. Naturally, there
were no control scores for this variable, since
subjects could only be asked to recall what
they had read. The data indicated a slight
superiority across time for students in the
forewarned conditions, who obtained a mean
of 1.40 correct responses out of a possible 3,
compared to 1.17 correct for those who were
not forewarned. The difference between these
means is of borderline significance, F(1, 92)
= 3.85, p < .06. Forgetting during the 1-
week interval was trivial (F = 1.02), and the
only other significant effect was markedly
superior memory for the familiar, compared
to the esoteric, issues, F(1, 92) = 24.60, p
< .01.


Reactions to the Persuasive Communication 


Three questions were included in the im- 
mediate posttest to measure subjects’ reac- 
tions to the persuasive communication: “How 
fair and unbiased did you find the communi- 
cations?”; “How interesting did you find the 
messages?”; and “As you read the messages,
to what extent did you find yourself thinking 
of arguments on the other side?” Subjects re- 
sponded to each question by checking a 7- 
point scale ranging from 1 (Not at all) to 
7 (Very). 

Only the differences in ratings of fairness
reached significance (t = 3.07, p < .01), with [...]

[...] evidence for the predicted sleeper effect due
to forewarning of persuasive intent. They 
are readily interpreted in terms of forewarn- 
ing serving as a discounting cue to acceptance 
without interfering with learning—a cue that 
over time (1 week) subjects are likely to for- 
get or spontaneously dissociate, thus allow- 
ing the persuasive materials to reach their full 
impact. Indeed, this method of studying the 
sleeper effect may be superior to the classic 
approach of attributing a compelling message 
to a negative source, since the latter often 
must arouse feelings of incredulity on the 
part of the subjects. 


Experiment 2 


The second study was designed to replicate 
the first and, in addition, to test the conjec- 
ture that while moderate distraction may fa- 
cilitate immediate opinion change, the latter 
will rapidly dissipate, because once removed 
from the distraction conditions, the person 
will think of additional counterarguments to 
the communications, and, to some extent, 
comprehension of the persuasive messages 
will have been impaired. Each of these factors 
should operate to shorten any immediate ad- 
vantages realized. 


Method 
Subjects and Design 


One hundred four high school students participated 
in both sessions of the experiment, which were held 
1 week apart, during their normal class periods. The 
number of males and females was divided about 
equally. Each subject read two out of four persua- 
sive messages and then stated his or her opinion on 
all four, thus providing control scores for two issues. 
Only one of the issues was read under conditions 
of distraction, and subjects were either forewarned 
or not for both issues.

A posttest-only design was employed, with sub- 
jects randomly assigned to the experimental condi- 
tions. Since distraction was a within-subjects vari- 
able, the order of presentation of the two messages 
(under conditions of distraction and nondistraction) 
was counterbalanced. The two levels of forewarning 
and distraction and the two times of measurement 
constituted a 2 × 2 × 2 factorial design with one
between-subjects (forewarning) and two within-subjects
treatments. An equal number of subjects (52)
served in each condition.


Procedure


As in Experiment 1, the study purported to
investigate the effects of types of reading material on
retention of message content.

In the first session, each subject read two (randomly
selected from a set of four) persuasive messages
averaging about 300 words in length. These
communications dealt with current topics of moderate
familiarity, namely, economic aid to African
nations, consumer purchasing power in the United
States, the increasing number of trained ministers,
and proliferation of nuclear weapons. One message
was read under distraction conditions. The distraction was
designed to be relatively neutral and consisted of
printing the messages in white type on a black background,
rather than the conventional black on white.
Pretests had shown that this reverse negative procedure
was indeed distracting without being annoying,
at least for the relatively short periods of time
involved. Furthermore, the distraction was quite
consistent with the rationale given that the study
investigated the influence of such factors on comprehension.

As in the previous study, the forewarning manipulation
was incorporated into the written materials
in order to facilitate random assignment of subjects
within a given class to the different experimental
conditions. For one half of the subjects, the instructions
forewarned that the communications were designed
to change their opinions on certain issues; for
the other half, these statements were deleted; otherwise
the same directions were issued to the forewarned
and nonforewarned groups. All other aspects,
including the purported rationale for the [...]
opinion ratings and comprehension testing, were
identical to those in Experiment 1. Of course, as
before, the true purpose of the experiment was
eventually revealed.


Results 


In the first study, the types of messages introduced
the possibility that subjects could
hold different initial opinions for esoteric
and familiar topics, as indeed was the case.
Therefore, inclusion of the control scores was
necessary for the main analysis. Since no
such variables were included in the second
study, and there was no counterpart for the
distraction manipulation under the control
conditions, the data for the experimental conditions
were analyzed separately; and comparisons
were made with the controls only
when they were theoretically interesting.


3 The materials were pretested by [...]
to rate the extent to which they found [...]
distracted or annoyed. In addition, comprehension
was found to be poorer [...]
subjects, a characteristic that has [...]
with distraction in several other studies.


Persistence of Forewarning and 
Distraction Effects 


The opinion means for each of the experi- 
mental conditions are presented in Table 2, 
where it can be seen that the data generally 
conform to the predicted pattern. 

An analysis of variance with repeated mea- 
sures indicated a main effect of time of mea- 
surement, F(1, 102) = 39.02, p < .01, with 
subjects being less favorable after a week had 
elapsed; and significant interactions appeared 
between both forewarning, F(1, 102) = 6.77, 
p< .01, and distraction, F(1, 102) = 34.55, 
p< .01, and time of measurement. Further- 
more, the second-order interaction involving 
all three variables was significant beyond the
.01 level, F(1, 102) = 7.69.

The directions of the first-order interactions
were as predicted: Forewarning initially
inhibited opinion change, but this disadvantage
vanished over time; and distraction facilitated
immediate opinion change, but the
initial advantage rapidly dissipated. The significant
second-order interaction appears to
be due primarily to the fact that in the
immediate-measurement condition, the distraction
completely nullified the effects of forewarning,
yielding a difference of more than
20 points between the distracted and nondistracted
subjects’ means. This fact is particularly
interesting, since it is exactly what
would be expected if the major mediating
variable was the number of counterarguments
produced. That is, if the inhibiting effect of
forewarning of persuasive intent is primarily
due to the increased production of counterarguments
while reading the messages, then,
when subjects are distracted from counterarguing,
the usual effects produced by forewarning
should be nullified.
Table 2
Mean Opinion Scores for Each of the
Experimental and Control Conditions

                   Not forewarned         Forewarned
                   Measurement            Measurement
Condition          Immediate   Delayed    Immediate   Delayed
No distraction     70.77       63.85      53.85       61.73
Distraction        75.38       59.42      74.42       57.12
Controls           54.42       54.23      47.12       44.52

Note. These means are based on 100-point scales
ranging from 0 (Very improbable) to 100 (Very
probable). Cell ns = 52.


It is of interest to determine whether the
data replicate the earlier finding of a sleeper
effect, an interaction between forewarning and
time of opinion measurement when subjects
are not distracted. The relevant data are contained
in the no-distraction conditions of
Table 2 (identical to the earlier study). The
immediate effect of forewarning of persuasive
intent was to reduce substantially the impact
of the messages (Ms = 53.85 vs. 70.77 in the
nonforewarned conditions). After a week’s
delay, however, this initial difference had all
but disappeared, with the forewarned group
showing an increase of 7.88 points. Planned
comparisons (Dunn, 1961) indicated that the
interaction and the aforementioned increase
were significant beyond the .05 level. Hence,
this study showed both relative and absolute 
sleeper effects for the forewarned groups. 
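
The arithmetic behind these two claims can be checked against the no-distraction rows of Table 2. The following is only an illustrative sketch: the function names are hypothetical, and the working definitions (an absolute effect as a rise in the forewarned mean over time, a relative effect as a shrinking gap between nonforewarned and forewarned means) are an assumed reading of the terms. It does not address statistical significance, which the planned comparisons cover.

```python
# Mean opinion scores from Table 2, no-distraction rows only.
NO_DISTRACTION = {
    "not_forewarned": {"immediate": 70.77, "delayed": 63.85},
    "forewarned": {"immediate": 53.85, "delayed": 61.73},
}


def change_over_time(cell):
    """Delayed minus immediate mean for one group (hypothetical helper)."""
    return round(cell["delayed"] - cell["immediate"], 2)


def forewarning_gap(means, time):
    """Nonforewarned minus forewarned mean at one measurement time."""
    return round(means["not_forewarned"][time] - means["forewarned"][time], 2)


forewarned_change = change_over_time(NO_DISTRACTION["forewarned"])
nonforewarned_change = change_over_time(NO_DISTRACTION["not_forewarned"])
gap_immediate = forewarning_gap(NO_DISTRACTION, "immediate")
gap_delayed = forewarning_gap(NO_DISTRACTION, "delayed")

# Assumed definitions, for illustration only.
absolute_sleeper = forewarned_change > 0
relative_sleeper = gap_delayed < gap_immediate

print(f"Forewarned change over one week:    {forewarned_change:+.2f}")
print(f"Nonforewarned change over one week: {nonforewarned_change:+.2f}")
print(f"Gap at immediate test: {gap_immediate:.2f}; at delayed test: {gap_delayed:.2f}")
print(f"Absolute pattern: {absolute_sleeper}; relative pattern: {relative_sleeper}")
```

On these means the forewarned group rises while the nonforewarned group declines, so the gap between the groups narrows from both directions.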

In regard to the effects of distraction, the 
data for the no-forewarning conditions, which 
would be most similar to the typical distrac- 
tion study, showed the predicted interaction. 
Distraction facilitated opinion change im- 
mediately after reading the communications 
(M = 75.38 for the distracted individuals, 
compared to 70.77 for those persons in the 
no-distraction conditions), but after a week, 
this pattern had reversed, with the nondis- 
tracted subjects showing somewhat superior 
retention of opinion change. This interaction 
was also significant beyond the .05 level as 
tested by Dunn’s (1961) method. 

In summary, there appears to be consid- 
erable support for the conjecture that mod- 
erate distraction may have a facilitating, but 
short-lived, effect on opinion change. There 
was no evidence in the present study that 
forewarning increased the favorability of subjects’
responses to the familiar control issues.
In contrast, when only familiar issues were
presented, subjects who were forewarned 
agreed somewhat less strongly with the opin- 
ion statements on both the immediate and 
delayed measurements. There seems to be 
no ready explanation for the different results 
obtained with the control issues in the two 
experiments. 


Comprehension 


The prediction that forewarning would 
serve as a discounting cue and lead to a 
sleeper effect was based on the assumption 
that it would not interfere with learning of 
the message content. In contrast, the predic- 
tions involving distraction were based in 
part on the assumption that distraction would 
interfere with learning and that, consequently, 
induced change would be quite ephemeral. 
The mean comprehension scores for each of 
the experimental conditions are presented in 
Table 3. 

Table 3
Mean Comprehension Scores for Each of the
Experimental Conditions

                   Not forewarned         Forewarned
                   Measurement            Measurement
Condition          Immediate   Delayed    Immediate   Delayed
No distraction
Distraction

Note. These means are based on the number of
correct answers to three multiple-choice questions.
Cell ns = 52.


Analysis of variance indicated that the main
effect of distraction was significant, F(1, 102)
= 28.33, p < .01, and in the predicted direction.
The main effect of forewarning was in
the same direction as that obtained in the
first study, F(1, 102) = 2.57, p = .11. In
contrast to the earlier study, time of measurement
had a significant effect, F(1, 102) =
52.36, p < .01, with subjects remembering
less of the messages’ contents at the time of
delayed measurement. The only other significant
effect was the interaction between forewarning
and distraction, F(1, 102) = 41[...],
p < .05. There was a greater facilitating effect
of forewarning on learning when subjects
were not distracted. Hence, in the present
study, subjects learned significantly less under
conditions of distraction even though
their opinions were changed much more.
These results are quite similar to those obtained
by Insko, Turnbull, and Yandell
(1974) and Petty et al. (1976), in whose
studies, at least under some conditions, distraction
interfered with recall but facilitated
opinion change. Together, these studies suggest
some limitations of the generalization
advanced by previous researchers (e.g., Festinger
& Maccoby, 1964; Osterhouse & Brock,
1970; Regan & Cheng, 1973) that distraction
facilitates opinion change only in cases where
it does not interfere with reception of the
message. Naturally, if the interference with
comprehension were severe, the distraction
would inhibit opinion change, since one can
only change his or her opinion in the direction
advocated if he or she knows what that
position is.


General Discussion 
Effects of Forewarning


Both studies indicated that forewarning
of persuasive intent of a communication produced
a sleeper effect whereby subjects were
more influenced by the message after a 1-week
interval than they were immediately after reading
it. There were at least three reasons for
expecting such a delayed-action effect. The
first pertained to the number of counterarguments
evoked by forewarning. It was argued
that if forewarning increased counterarguing,
less immediate opinion change would result,
but that over time, the subjects’ counterarguments
would be forgotten or dissociated from the
messages, thus allowing the information to
have a greater impact. In the first study,
forewarned subjects showed a slightly greater
tendency toward counterarguing, but the effect
fell short of significance. This may have
been due, in part, to the insensitivity of the
response rating scale employed.

The second reason for expecting a delayed-action
effect rested on the assumption that
forewarning would increase the subjects’ tendencies
to derogate the communications. As
found in earlier studies (e.g., Dean et al.,
1971; Hass & Grady, 1975), subjects rated
the messages as less fair and unbiased when
forewarned of their persuasive intent. Thus,
forewarning apparently served as a discounting
cue (much like a negative source), thereby
reducing the immediate impact of the communication.
However, with the passage of
time, subjects tended to forget or dissociate
the discounting cue, thus allowing the full
impact of the persuasive material to emerge,
providing that forewarning did not interfere
with comprehension. Consequently, the forewarned
subjects would be expected to reach
the same level as the nonforewarned group
over time. Obviously, whether an absolute increase
occurred, as witnessed in the second
study, would depend on such factors as the
period of time elapsed and the decay rate
for the nonforewarned group. This reasoning
parallels the original interpretation of the
sleeper effect (Hovland et al., 1949), in
which it was presumed that the propaganda-type
context led to an initial dampening of
the film’s persuasive impact, but that as this
discounting cue was forgotten over time, the
subjects accepted the message to a greater extent.
Despite the fact that Gillig and Greenwald
(1974) have written its obituary, and
Capon and Hulbert (1973) have concluded
that there is no strong evidence for a generalized
sleeper effect, the data in these studies
fully supported an opposing stance. Indeed,
Gruder et al. (1978) have recently shown
that absolute sleeper effects can be obtained
when certain conditions are met for strong
tests: (a) when a persuasive message has a
substantial initial impact on attitudes, (b)
when this change is totally inhibited by a discounting
cue, (c) when the cue and message
are dissociated over time, and (d) when this
dissociation occurs quickly enough so that the
message still has some impact. The present
studies generally would appear to meet these
criteria.

As previously mentioned, forewarning may
be a better method of studying the sleeper
effect than attributing a negative source to
a compelling message, since this must often
arouse feelings of incredulity, and furthermore,
the negative source may be overshadowed
by the mention of a number of
positive sources in the communication itself.
For example, a health message may be attributed
to a low-prestige source, such as a
high school student, but within the text, various
facts may be presented that presumably
come from medical journals, learned professors,
or prominent physicians. The result
could conceivably end in confusion.

The third possibility, that forewarning in- 
creases reactance, was not tested in these 
studies; but again, this view would lead to 
the predicted delayed-action effect. As Hass 
and Grady (1975) pointed out, despite the 
presence of reactance, well-written communi- 
cations containing compelling arguments often 
lead to persuasion. The arguments should 
have greater influence on the subject once the 
reactance has dissipated.

Actually, all three of the interpretations 
mentioned above can be viewed as variations 
of the discounting cue hypothesis, with coun- 
terarguments, derogation of source and mes- 
sage, and feelings of reactance each serving 
as an initial rejection cue. 


Effects of Distraction 


More rapid decay of opinion change, induced
under conditions of distraction, was
expected for two reasons. First, distraction
has been shown to interfere with comprehension
of message content (e.g., Petty et al.,
1976; Zimbardo et al., 1970). While correlational
studies of memory and persistence have
produced somewhat inconsistent results (e.g.,
Cook & Flay, 1978; Miller & Campbell, 1959;
Watts & McGuire, 1964), Cook and Flay
point out that experimental manipulations
affecting learning have usually influenced persistence
of opinion change. Thus, while the
immediate effects of distraction may be to
increase opinion change (presumably because
more is gained from the reduction in counterarguing
than is lost through poor comprehension),
the long-term prognosis would be a
rapid reversion to the level of the nondistracted
subjects, if not to a lower level. The
data in the present study are consistent with
this view, inasmuch as distracted subjects
changed more immediately afterward, but not
during the delayed measures of opinions; and
they scored significantly lower on the comprehension
test at both time intervals.

The second reason for expecting the obtained
temporal effects depends on postexperimental
counterarguing. Although this variable
was not assessed in the present study,
the fact that distraction completely nullified
the immediate effects of forewarning strongly
suggests that counterarguing is a major mediating
process. After the subject has left the
experimental room, it is only reasonable to
assume that he or she would be able to think
of a greater number of counterarguments to
the communications when free to reflect on
them without being distracted. This phenomenon
[...] serve to move the [...]
level. It is regrettable that both immediate
[...] measured in the second study, [...]
that was used in [...]
adequate [...] more thorough thought-[...]
have been too complicated for our within-subjects
design.


[...] experiments. Journal of Personality and Social
Psychology, [...]

Dinner, S. H., Lewkowicz, B. E., & Cooper, J. Anticipatory
attitude change as a function of self-esteem
and issue familiarity. Journal of Personality
and Social Psychology, 1972, 24, 407-412.

Dunn, O. J. Multiple comparisons among means.
Journal of the American Statistical Association,
1961, 56, 52-64.

Festinger, L., & Maccoby, N. On resistance to persuasive
communications. Journal of Abnormal and
Social Psychology, 1964, 68, 359-366.

Freedman, J. L., & Sears, D. O. Warning, distraction, [...]

Gillig, P. M., & Greenwald, A. G. Is it time to lay
the sleeper effect to rest? Journal of Personality
and Social Psychology, 1974, 29, 132-139.

Gruder, C. L., Cook, T. D., Hennigan, K. M., Flay,
B. R., Alessis, C., & Halamaj, J. Empirical tests
of the absolute sleeper effect predicted from the
discounting cue hypothesis. Journal of Personality
and Social Psychology, 1978, 36, 1061-1074.

Haaland, G. A., & Venkatesan, M. Resistance to persuasive
communications: An examination of the
distraction hypotheses. Journal of Personality and
Social Psychology, 1968, 9, 167-170.

Hass, R. G., & Grady, K. Temporal delay, type of
forewarning, and resistance to influence. Journal of
Experimental Social Psychology, 1975, 1[...]

Hovland, C. I., Lumsdaine, A. A., & Sheffield, F. D.
Experiments on mass communication. Princeton,
N.J.: Princeton University Press, 1949.

Insko, C. A., Turnbull, W., & Yandell, B. Facilitative
and inhibiting effects of distraction on attitude
change. Sociometry, 1974, 37, 508-528.

Kiesler, C. A., & Kiesler, S. B. Role of forewarning
in persuasive communications. Journal of A[...]
[...]n, M. Resistance to [...]


Two experiments were conducted to explore the adequacy with which balance
theory can account for attraction and agreement effects in p-o-x triads. The first
experiment provided evidence for the importance of assumed reciprocated sentiment
in the production of attraction effects, and the second experiment provided
evidence for the importance of assumed p-o similarity in the production
of agreement effects. In addition, both experiments replicated past results concerning
the importance of contact for attraction and agreement effects and
also found that attraction and agreement effects were more apparent with
affective scales (pleasantness and harmony) than with more cognitive scales
(expectancy, consistency, and stability). An unexpected finding of the second
experiment was that the p-o similarity (or social comparison) interpretation
of the agreement effect held only for the more cognitive scales.


[...] Insko, Department of Psychology, University of [...]
Copyright 1979 by the American Psychological Association, Inc. 0022-3514/79/3705-0790$00.75


The present article addresses the general
issue of the extent to which balance theory
can be elaborated to account for subjects’ reactions
to hypothetical social situations, like
Heider’s (1946, 1958) p-o-x (or self-other-inanimate
object) triad. What is the reason
for studying the subjects’ reactions to hypothetical
social situations? There are at least
two possible answers to such a question. One
is that such studies are important because
they are role-playing simulations of recurrent
social situations. (See the Summer, 1977, issue
of the Personality and Social Psychology
Bulletin for a series of articles arguing the
pros and cons of such role-playing methodology.)
A second, and more compelling, answer
to the question asked above relates to the
fact that such studies provide an arena for
theory development.

There is a tendency to downgrade the importance
of research in hypothetical situations
because it is “irrelevant.” To some extent
such a reaction is based on a negative


reaction to balance theory, and to some extent
it is based on the failure to recognize that,
from a balance theory perspective, phenomenology
is phenomenology, whether it relates
to actual or hypothetical situations. There is
thus the possibility of working out principles
that may generally apply. Research on hypothetical
triads can be conceived of as playing
a role for the balance theorist that is analogous
to the role played by the Skinner box
for the operant psychologist. The Skinner
box, like the hypothetical triad, holds little
intrinsic interest or “relevance.” On the other
hand, the Skinner box does provide a situation
in which variables believed to be of
theoretical importance can be easily manipulated
and behavior can be easily [...].
Thus, principles may be worked out that can
be applied to other situations.

Insko, Songer, and McGarvey (1974)
argued that Heider’s p-o-x triad can be
conceptualized from the perspective of a
three-factor analysis of variance model. The
first factor is the positive or negative sign of
the p-o relation, the second is the positive or
negative sign of the p-x relation, and the
third is the positive or negative sign of the
o-x relation. Within-subjects ratings of the
eight possible triads (typically for degree of 
pleasantness) give seven possible orthogonal 
effects (three main effects, three double in- 
teractions, and a triple interaction). Numer- 
ous studies (see Zajonc, 1968, for a review) 
have found evidence for three of the seven 
effects. These are the main effect for the p-o 
relation, or attraction effect; the interaction 
between the p-x and o-x relations, or agreement
effect; and the triple interaction, or
three-sign balance effect.¹ The attraction effect
is a tendency for subjects to rate p-likes-o
triads as more pleasant than p-dislikes-o 
triads. The agreement effect is a tendency 
for subjects to rate triads in which p and o 
both like or both dislike x as more pleasant 
than triads in which p and o do not agree 
about x. The three-sign balance effect is a 
tendency for subjects to rate the triads in 
which the product of the three signs is posi- 
tive as more pleasant than triads in which 
the product of the three signs is negative. 
Stated another way, it is the tendency to rate 
triads in which p likes an o with whom there 
is agreement, or dislikes an o with whom 
there is disagreement, as more pleasant than 
the triads in which p likes an o with whom 
there is disagreement, or dislikes an o with 
whom there is agreement.
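
The three definitions above can be made concrete by enumerating the eight sign patterns. The following is only a sketch under assumed conventions: signs are coded +1 and -1, and the name `describe_triad` and the dictionary keys are hypothetical labels introduced for illustration. It classifies each triad and checks that the two statements of the three-sign balance effect pick out the same triads.

```python
from itertools import product


def describe_triad(po, px, ox):
    """Classify one p-o-x triad from the signs (+1 or -1) of its relations."""
    return {
        "signs": (po, px, ox),
        "p_likes_o": po > 0,            # attraction: p likes o
        "agreement": px == ox,          # p and o both like or both dislike x
        "balanced": po * px * ox > 0,   # product of the three signs is positive
    }


# All eight possible triads (2 x 2 x 2 sign combinations).
triads = [describe_triad(*signs) for signs in product((+1, -1), repeat=3)]

for t in triads:
    # "Stated another way": a triad is balanced exactly when p likes an
    # agreeing o or dislikes a disagreeing o.
    assert t["balanced"] == (t["p_likes_o"] == t["agreement"])
    print(t)

balanced_count = sum(t["balanced"] for t in triads)
print(f"{len(triads)} triads, {balanced_count} balanced")
```

Because each of the three classifications splits the eight triads four against four, each corresponds to one of the seven orthogonal contrasts described earlier.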

It can be argued that attraction and agreement
effects are embarrassing to balance theory
(cf. Zajonc, 1968). Insko et al. (1974),
however, have pointed out that various balance
interpretations are possible. These interpretations
are based on the general assumption
that subjects supply additional
cognitive bands beyond the three that are experimentally
given. For example, according to
Aderman (1969), when subjects consider the
p-o-x triad, there is a tendency for some of
them to think about a specific o with whom
they interact or have contact. In the language
of balance theory, such social interaction
is a positive unit relation between p and
o. Insko et al. argued that the addition of this
positive unit relation to the p-o-x triad (or
cycle) produces two additional cycles that
provide balance interpretations of attraction
and agreement effects.
Counting the research in Insko et al. as
one experiment, and including the two experiments
reported by McGarvey (1974) as
well as an additional unpublished experiment
(LaTour et al., Note 1), there are four experiments
that have studied the important
contact conditions. There are four such conditions:
standard (with no mention of contact),
future contact, no contact, and breaking
contact. According to the unit-relation
interpretation, the algebraic magnitudes of
the attraction and agreement effects should
be as follows: future contact > standard >
no contact > breaking contact. Furthermore,
the direction of the effects should be positive
in the future contact and standard conditions,
neither positive nor negative in the no-contact
condition, and negative in the breaking-contact
condition. With regard to the attraction
effect, results have been generally consistent
with the predictions given above. The
exception is a significant attraction effect in
the no-contact condition in one of the four
experiments. With regard to the agreement
effect, results have been less supportive of
the unit-relation interpretation. All experiments
have obtained the same rank order of conditions:
future contact > standard > no contact
> breaking contact. Furthermore, three
of the four experiments obtained a significant
agreement effect in the no-contact condition,
and one failed to find a significant negative
(or reversed) agreement effect in the breaking-contact
condition. Clearly the contact interpretation
does not adequately account for
all of the reliable variance in the agreement
effect.

One further complication relates to the
fact that some of the studies mentioned above
had two future contact conditions: future
contact with discussion of x and future contact
without discussion of x. The difference
between these two conditions does not obviously
reflect either contact or similarity, and
the reason for the previously obtained finding
of a greater agreement effect with than without
discussion (e.g., Insko et al., 1974) is not
clear. Perhaps the explicit elimination of discussion
regarding x decreases the importance
or weight of the agreement cycle, or perhaps
discussion of x produces greater concern with
the consequences of disagreement.


1 Insko, Songer, & McGarvey (1974) refer to attraction
effects as “positivity” effects. Since usage
seems to be following Zajonc’s precedent, we will
conform.


As previously indicated, there are several
possible interpretations of the attraction effect.
One relates to contact, another to reciprocated
sentiment. Existing evidence (McGarvey,
1974) indicates that at least some
subjects do assume same-sign reciprocation
of the p-o relation. [...] McGarvey (1974)
[...] of reciprocated [...] magnitude
of the attraction effect. Why should this
be? Reciprocated [...]

[...] factors relates to the type
of rating scale. Existing research by Crockett
(1974); Gutman and Knox (1972); Gutman,
Knox, and Storm (1974); and Miller and
Norman (1976) suggests that pleasantness
scales reveal larger attraction effects than do
consistency and/or expectancy ratings. Experiment 1
contains a scales factor that includes
five different rating scales: pleasantness,
harmony, expectancy, consistency, and
stability.

It is reasonable to assume that affect is
self-related and that an affective rating makes
the self more salient than does a more cognitive
rating. [...]
reciprocated sentiment (with its implications
of agreement or disagreement regarding the
worth of the self), affective ratings will produce
a larger attraction effect than will cognitive
ratings. In terms of the present design,
this is a prediction that a contrast effect comparing
affective (pleasantness and harmony)
scales with cognitive (expectancy, consistency,
and stability) scales will be significant.
Recall that McGarvey (1974) found a
main effect of reciprocation on attraction
using just a pleasantness scale. We predicted
that in the present study there would be an
interaction between reciprocation and rating
scales such that the effect of reciprocation
would be more apparent for the affective
scales (pleasantness and harmony) than for
the relatively more cognitive scales (expectancy,
consistency, and stability). The theoretical
rationale for this prediction is that
reciprocated sentiment produces agreement
or disagreement regarding the self-concept,
and the affective scales are more directly relevant
to cycles including the self-concept as
an element (e.g., p interacts with o, p approves
of his/her self-concept, o disapproves
of p’s self-concept). We are assuming that
affective scales are most relevant to cycles
that include the self-concept as a [...]
(i.e., cycles in which p’s self-evaluation provides
one of the signs that determines balance
or imbalance in the cycle).
A third factor having a bearing on the reciprocated sentiment
interpretation of attraction effects is involvement, or a comparison
of p-o-x and q-o-x triads. With p-o-x triads, the subject is told that
p, or "I," is himself/herself. With q-o-x triads, the subject is asked
to assume that q is someone who is known but has never been met.
Subjects are asked to rate the q-o-x triads according to how they as
outside observers would find the situation. In view of the argument
detailed above regarding the role of self-relevancy in the production
of the attraction effect, high-involvement (p-o-x) triads should
produce a greater attraction effect than low-involvement (q-o-x)
triads. However, a study by LaTour et al. (Note 1) did not support
this prediction. It is possible, however, that subjects tend to judge
q-o-x triads by vicariously identifying with q. Some evidence for
vicarious identification in the context of p-o-q (three-person) triads
was obtained by Insko et al. (1974). In that study, o-to-q attraction
was found under the special circumstances of q-to-o same-sign
reciprocation. Since same-sign reciprocation also increased the
magnitude of p-q attraction, it was argued that o-q attraction
resulted from p's vicarious identification with o. Thus, it is
plausible that subjects judge the interpersonal situation through at
least partial vicarious identification.

We believed it unlikely that vicarious identification would be
sufficient to explain all of the subtleties present in p-o-x triads.
If this is true, it is reasonable to suppose that the prediction given
above regarding the greater effect of reciprocation with affective
than cognitive scales would be more apparent for p-o-x than q-o-x
triads. This is a triple interaction: Involvement × Reciprocation ×
Scales. To the extent that this interaction occurs, we would have
further support for the supposition that the self-concept is important
in an adequate conceptualization of attraction effects.

Finally, we come to the three-sign balance effect. Past results have
uniformly indicated a greater effect in the standard condition than in
the remaining contact conditions. Such results possibly indicate an
attentional shift away from the three-sign balance cycle (that does
not include the unit relation) to the attraction and agreement cycles
(that do include the unit relation). No further predictions regarding
three-sign balance were made.

Experiment 1 contained four between-subjects factors: reciprocation
(standard no mention, or explicitly mentioned), scales (pleasantness,
harmony, expectancy, consistency, or stability), involvement (p-o-x or
q-o-x triads), and contact (standard, future contact with discussion
of x, future contact without discussion of x, no contact, or breaking
contact). The contact factor was included in order to examine the
extent to which contact results would generalize across scales.

Experiment 1 did not include a manipula- 
tion of triad order. This was not done for two 
reasons. First, the existing design of four be- 
tween-subjects factors generates 100 cells, 
and that is the capacity of the available mul- 
tivariate analysis of variance program. Sec- 
ond, past research (Insko et al., 1974; Mc- 
Garvey, 1974) has not found any interactive 
effects between triad order and other ma- 
nipulated factors. 
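
The cell count quoted above can be checked by crossing the four between-subjects factors. The following is only a sketch: the short level labels are illustrative names rather than the authors' own, and the equal split of 1,000 subjects across cells is an assumption (it is consistent with the error degrees of freedom of 900 reported in the Results, but the text does not state it outright).

```python
from itertools import product

# Levels of the four between-subjects factors as described in the text.
# The short labels are hypothetical, chosen only for this illustration.
FACTORS = {
    "involvement": ["p-o-x", "q-o-x"],
    "reciprocation": ["standard", "same-sign"],
    "contact": [
        "standard",
        "future, with discussion",
        "future, without discussion",
        "no contact",
        "breaking contact",
    ],
    "scale": ["pleasantness", "harmony", "expectancy", "consistency", "stability"],
}


def design_cells(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


cells = design_cells(FACTORS)

# Assumption: the 1,000 subjects were divided evenly among the cells.
subjects_per_cell = 1000 // len(cells)

print(len(cells), subjects_per_cell)
```

Adding a two-level triad-order factor to this dictionary would double the number of cells, which is the capacity problem the paragraph above describes.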


Experiment 1 


Method 
Subjects 


The subjects were 1,000 male and female students
from the introductory psychology course at the
University of North Carolina.


Independent and Dependent Variables 


Each subject rated eight triads. These ratings can be thought of as
generating three within-subjects factors: positive or negative sign of
the first relation (p to o or q to o), positive or negative sign of
the second relation (p to x or q to x), and positive or negative sign
of the third relation (o to x). These within-subjects factors were
used to generate seven orthogonal difference scores corresponding to
the main effects and interactions of these factors. Three of these
difference scores, the attraction, agreement, and three-sign balance
effects, were used as the main dependent variables in a multivariate
analysis of variance that included four between-subjects factors. The
four between-subjects factors were involvement (p-o-x or q-o-x
triads), reciprocation (standard no mention of reciprocation, or
same-sign reciprocation of p- or q-to-o relation), contact (standard,
future contact with discussion of x, future contact without discussion
of x, no contact, or breaking contact), and rating scales
(pleasantness, expectancy, harmony, stability, or consistency).
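
The three main difference scores can be sketched as contrasts over the eight sign combinations. This is a minimal illustration, not the authors' scoring program. It assumes that attraction is the main effect of the first relation, that agreement is the interaction of the second and third relations, and that three-sign balance is the triple interaction. Each score is computed as the mean of the four triads on the positive side of the contrast minus the mean of the remaining four. The function name and the example ratings are hypothetical.

```python
from itertools import product
from statistics import mean


def difference_scores(ratings):
    """Compute three contrasts from one subject's eight triad ratings.

    ratings maps a sign triple (first, second, third relation), each
    +1 or -1, to that subject's rating of the corresponding triad.
    """
    def contrast(sign_of):
        plus = [r for triad, r in ratings.items() if sign_of(triad) > 0]
        minus = [r for triad, r in ratings.items() if sign_of(triad) < 0]
        return mean(plus) - mean(minus)

    return {
        # Assumed: main effect of the first relation (p to o).
        "attraction": contrast(lambda t: t[0]),
        # Assumed: second and third relations share a sign.
        "agreement": contrast(lambda t: t[1] * t[2]),
        # Assumed: product of all three signs is positive.
        "three_sign_balance": contrast(lambda t: t[0] * t[1] * t[2]),
    }


# Hypothetical subject who rates every balanced triad 7 and every
# imbalanced triad 1 on a 7-point scale.
ratings = {
    triad: (7 if triad[0] * triad[1] * triad[2] > 0 else 1)
    for triad in product((1, -1), repeat=3)
}
scores = difference_scores(ratings)
print(scores)
```

For this hypothetical subject only the three-sign balance score departs from zero, which illustrates why the contrasts are described as orthogonal.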


Procedure 


Subjects were tested in small groups of up to approximately 20 people.
In any given group, every subject was in a different one of the 100
conditions or cells. After subjects had been briefly instructed
regarding the procedure for rating each hypothetical situation, they
were given a nine-page booklet. Subjects were asked to regard each
situation as independent of the others and not to alter a previous
response after a page had been turned. It was emphasized that they
should rate each of the eight situations and not leave any of the
rating scales blank. Remarkably, this instruction was followed by all
1,000 subjects, and no subjects were eliminated from the final sample.

The first page of the booklet contained instructions regarding the
experimental procedure and introduced two of the between-subjects
factors, involvement and rating scales. For example, the
self-involved-consistency instruction was as follows:


On each of the following eight pages is a description of a situation
involving two persons, I and O, and an impersonal object or issue, X.
As you read each situation, try and imagine yourself as Person I.
Think about what it would be like to be involved in the situation
described. Then rate each situation according to the extent to which
you as a person, I, would find such an interpersonal situation
consistent or inconsistent. Make the rating by circling the number on
the scale corresponding to how you as Person I find the situation.
Below is an example of the scale.


Most inconsistent 1 2 3 4 5 6 7 Most consistent 


For the self-uninvolved-harmonious-scale condi- 
tion, the instruction was the following: 


tion of a situation involving two persons, Q and O, 
X. Imagine that 
that you know 
X. Rate each 
to which you 
such an inter- 


Most discordant 1 2 3 4 5 6 7 Most harmonious 


As previously indicated, five different rating scales were used (most
unpleasant to most pleasant, most unexpected to most expected, most
inconsistent to most consistent, most discordant to most harmonious,
most changeable to most stable). These rating scales were repeated on
each of the eight pages. Each page stated one of the triadic
situations, indicated reciprocation (in the appropriate condition),
stated assumptions regarding contact (in the nonstandard conditions),
and finally gave a rating scale. Here is an example.


Situation: I like O, I like X, O dislikes X.
O likes I (me).

Assume: I and O have not had personal contact
in the past and will never have personal contact
in the future. However, I and O know enough
about each other to form some feelings about
each other.


Rating of situation:
Most unpleasant 1 2 3 4 5 6 7 Most pleasant


Reciprocation was manipulated by stating O's feeling for I (or Q) as
the same as I's (or Q's) feeling for O, or by not mentioning anything
about O's feeling. Contact was manipulated by stating (in all but the
standard condition) an assumption regarding contact. The assumptions
were as follows: I and O have had contact in the past and will have
contact in the future and will discuss X; I and O have had personal
contact in the past and will have personal contact in the future but
will not discuss X; I and O have not had personal contact in the past
and will never have personal contact in the future. However, I and O
know enough about each other to form some feelings about each other;
and I and O have had personal contact in the past but will never have
personal contact in the future.

At the conclusion of the experiment, subjects were
told about the purposes of the study.


Dependent Variables 


It is worth emphasizing that the dependent variables were the seven
orthogonal difference scores created by taking the difference between
the mean of the ratings for a given four triads and the mean of the
remaining four triads. It is true that the subjects rated the triads
for pleasantness, consistency, and so forth. The dependent variables,
however, were not pleasantness and consistency, but difference scores.
From this perspective the rating scales constituted an independent
variable.


Results 
Grand Mean Tests 


The initial analysis involved a test for difference from zero for each
of the seven orthogonal difference scores (or com[...]). This analysis
revealed the usual significant effects for attraction, F(1, 900) =
[...], p < .01; agreement, F(1, 900) = 171.[...], p < .01; and
three-sign balance, F(1, 900) = 288.18, p < .01. In addition there was
a significant effect for the second relation (p-x or q-x), F(1, 900) =
11.36, p < .01. This effect indicates a tendency for the p-likes-x and
q-likes-x triads to be rated as more [...]able (pleasant, etc.) than
the triads in which p dislikes x and q dislikes x. Insko et al. (1974)
found a similar effect. A possible balance interpretation of this
effect would involve a conceptualization of an x with which p or q has
a positive unit relation, a recently purchased bicycle, for example.
However, since none of the between-subjects factors were designed to
explore this matter, since the effect was small relative to the other
three effects, and since the analysis is already complex, reported
results will relate solely to the attraction, agreement, and
three-sign balance effects.


Table 1

Source                  Attraction F   Agreement F   Three-sign balance F   Multivariate F
Involvement (I)           13.48**         .22            4.17**              [illegible]
Reciprocation (R)          4.79*         2.47           [illegible]          [illegible]
Contact (C)               80.39**       37.68**         22.44**              21.36**
Scales (S)                36.60**       22.44**          6.73**               9.70**
I × R                       .34        [illegible]        .30                  .55
I × C                      2.03          1.60            3.19*                1.29
I × S                       .96          1.90            1.46                  .99
R × C                      1.57          3.84**          1.69                 1.55*
R × S                      6.90**        1.40             .12                 1.79**
C × S                      4.32**         .73            2.25**               1.77**
I × R × C                   .11          1.06             .85                  .84
I × R × S                  2.90*          .66            1.57                 1.63*
I × C × S                  1.17          1.11            1.44                 1.09
R × C × S                  1.02          1.21            1.57                 1.25
Quadruple interaction      1.70*          .96            1.53                 1.18

* p < .05. ** p < .01.


Multivariate Analysis of Variance 


A four-factor multivariate analysis of variance of the attraction,
agreement, and three-sign balance effects is presented in Table 1.
This analysis revealed significant multivariate Fs for all four main
effects, as well as for four of the interactions. Discussion of the
significant univariate Fs accompanying the significant multivariate Fs
is complex. This material will be presented under three subheadings:
rating scales main effect, involvement and reciprocation, and contact.


Rating Scales Main Effect


There was a significant main effect for rating scales. Table 2
contains the marginal means. A contrast comparing the affective scales
(pleasantness and harmony) with the more cognitive scales (expectancy,
stability, and consistency) was significant for both attraction and
agreement. As expected, attraction, F(1, 900) = 162.12, p < .01, and
agreement, F(1, 900) = 89.06, p < .01, effects were greater with the
affective scales than with the cognitive.


Table 2
Rating Scale Means for Experiment 1

Condition            Pleasantness  Harmony  Expectancy  Consistency    Stability
Attraction              5.71**      5.90**     .49        1.02*           .67
Agreement               3.59**      3.84**     .05      [illegible]**    1.78**
Three-sign balance      2.28**      2.08**    1.84**      2.14**         3.92**

Note. Each subject rated eight triads on one of five 7-point scales.
These ratings were then converted to attraction, agreement, and
three-sign balance scores by taking the difference between the mean of
the appropriate four triads and the mean of the remaining four triads.
These difference scores (with a possible +7 to −7 range) were then
averaged across subjects to give the tabled means.
* p < .05, df = 1, 900. ** p < .01, df = 1, 900.

Grand mean tests were conducted on each 
of the cell means in Table 2. The significance 
of these tests of differences from zero is in- 
dicated by the asterisks above each mean. All 
of the means except three (expectancy ratings 
for attraction and agreement and stability 
ratings for attraction) differ from zero. The 
cognitive scales tend to show smaller attrac- 
tion and agreement effects, with almost no 
effects at all appearing in ratings of ex- 
pectancy. 


Involvement and Reciprocation 


Involvement main effect. There was a significant involvement effect
for both attraction and three-sign balance. The marginal means
indicated that the attraction effect was greater with self-uninvolved
triads (q-o-x) than with self-involved (p-o-x) triads (M q-o-x = 3.52,
M p-o-x = 2.00), while just the opposite was true for the three-sign
balance effect (M p-o-x = 2.75, M q-o-x = 2.16).
This finding suggests a tendency for subjects 
to react more to the simpler aspects of the 
triads when they are not personally involved 
and to react more to the complex aspects of 
the triads when they are personally involved. 
Attraction involves one relation, and three- 
sign balance involves three relations. 

Reciprocation main effect. The only significant reciprocation main
effect was for attraction. Consistent with McGarvey's (1974) results
(which related only to pleasantness ratings of p-o-x triads),
same-sign reciprocation of the p-o or q-o relation increased the
magnitude of the attraction effect (3.21) over standard (no stated)
reciprocation (2.31). As previously indicated, assuming a positively
evaluated self-concept, reciprocated positive sentiment produces
agreement regarding the self, and reciprocated negative sentiment
produces disagreement regarding the self. Thus, from the perspective
of reciprocated sentiment, the attraction effect can be regarded as an
agreement effect.

Reciprocation × Scales interaction. The main effect for reciprocation
on attraction was qualified by several interactions, one of which was
the Reciprocation × Scales interaction. The cell means and the
significance of the grand mean tests for the Reciprocation × Scales ×
Involvement interaction are given in Table 3. The Reciprocation ×
Scales interaction for attraction was explored by testing the
interaction involving reciprocation and the contrast between affective
scales (pleasantness and harmony) and cognitive scales (expectancy,
stability, and consistency). This contrast was significant, F(1, 900)
= 23.4[...], p < .01. The interaction involving this contrast
indicates that the main effect for reciprocation is a resultant of the
affective scales. The pattern can be seen in Table 3.

An alternative view of this interaction is a qualification of the
tendency for affective scales to produce a larger attraction effect
than cognitive scales. This aspect of the main effect for rating
scales is more apparent with stated reciprocation than with standard
(no stated) reciprocation. Even with standard reciprocation, however,
the effect was still significant, F(1, 900) = 30.49, p < .01.


Table 3
Involvement × Reciprocation × Scales Attraction Means for Experiment 1

Involvement and reciprocation             Pleasantness  Harmony  Expectancy  Consistency  Stability
p-o-x triad
  Standard (no mention of) reciprocation     3.28**      2.06*      .60         1.56      [illegible]
  Reciprocation                              7.04**      7.74**     .18        −1.22        −.86
q-o-x triad
  Standard (no mention of) reciprocation     4.90**      6.32**    −.46         1.86*       3.30**
  Reciprocation                              7.60**      7.50**    1.64         1.90*        .58

Note. p-o-x triad indicates high personal involvement; q-o-x triad
indicates low personal involvement.
* p < .05, df = 1, 900. ** p < .01, df = 1, 900.


Table 4
Contact Means for Experiment 1

                      Standard (no   Future contact   Future contact
                      mention of)    with             without           No          Breaking
Condition             contact        discussion       discussion        contact     contact
Attraction              2.86*          7.45*            4.20*            3.19*       −3.88*
Agreement               2.75           4.79*          [illegible]      [illegible]    −.06
Three-sign balance      4.58*          2.06*             .86*            3.43*        1.35*

* p < .01, df = 1, 900.


The Reciprocation X Scales interaction is 
consistent with our theoretical argument re- 
garding the effect of reciprocation on attrac- 
tion. One of the arguments advanced to ex- 
plain the existence of attraction effects re- 
lates to the assumption of reciprocated senti- 
ment. Reciprocated positive sentiment pro- 
duces agreement regarding the worth of the 
self, whereas reciprocated negative sentiment 
produces disagreement regarding the worth of 
the self. Given that the affective scales are 
most self-related, it is reasonable that they 
would most clearly reveal an attraction ef- 
fect and would also most clearly show an al- 
teration in this effect as a function of the 
reciprocation manipulation. 

Involvement × Reciprocation × Scales interaction. The reasoning
outlined above implies that the double interaction pattern shown in
Table 3 would reveal itself most clearly in self-involved (p-o-x)
triads rather than in self-uninvolved (q-o-x) triads. This, in fact,
occurred. The reciprocation main effect was strongest with affective
scales and self-involved triads. This interpretation of the
Involvement × Reciprocation × Scales interaction was supported by a
significant triple interaction involving the affective versus
cognitive scales contrast, F(1, 900) = 5.12, p < .05.


Contact 


The remaining results all relate to contact. Consistent with past
results, there was a significant contact main effect on attraction,
agreement, and three-sign balance. Table 4 contains the marginal
means. Four contrasts were planned for both attraction and agreement.
For attraction, all contrasts were significant. The first contrast
indicated that the three conditions in which contact was either
implicitly or explicitly positive showed a greater attraction effect
than the two conditions in which contact was either null or negative,
F(1, 900) = 77.12, p < .01. The second contrast indicated that the
standard condition, in which contact was implicit, showed a smaller
attraction effect than the two future-contact conditions, in which
contact was explicit, F(1, 900) = 281.73, p < .01. The third contrast
indicated that attraction was greater with than without discussion,
F(1, 900) = 26.82, p < .01, and the fourth contrast indicated that
attraction was algebraically greater in the no-contact condition than
in the breaking-contact condition, F(1, 900) = 130.51, p < .01.

The asterisks in Table 4 indicate the sig- 
nificance for grand mean tests (tests of dif- 
ference from zero) for each of the cell means. 
As Table 4 indicates, all of the cell means 
for attraction were significantly different from 
zero. Except for the significant mean in the 
no-contact condition, these results are con- 
sistent with the contact interpretation of at- 
traction. 

The same planned contrasts that were used for attraction were used for
agreement. The first contrast indicated that the three conditions in
which contact was either implicitly or explicitly positive showed more
agreement than the two conditions in which contact was either null or
negative, F(1, 900) = 122.50, p < .01. This contrast was significant
in spite of the unexpectedly low mean for future contact without
discussion. Furthermore, this unexpectedly low mean appears
responsible for the nonsignificance of the second contrast comparing
the standard and two future-contact conditions. The third and fourth
contrasts revealed the expected results. There was a greater agreement
effect with than without discussion, F(1, 900) = 115.03, p < .01, and
agreement was algebraically greater with no contact than with breaking
contact, F(1, 900) = 78.51, p < .01. The grand mean tests indicated
that all of the cell means, except the ones for breaking contact and
future contact without discussion, were significantly different from
zero.


Table 5
Contact × Rating Scales Attraction Means for Experiment 1

Contact                              Pleasantness  Harmony  Expectancy  Consistency
Standard (no mention)                   7.18**      5.50**     .30         1.70
Future contact with discussion          7.85**      8.95**   10.20**       6.63**
Future contact without discussion       8.00**      7.60**     .30         1.50
No contact                              7.70**      7.30**     .63          .93
Breaking contact                       −2.10*        .22     −8.93**     [illegible]

* p < .05, df = 1, 900. ** p < .01, df = 1, 900.


The only planned contrast for three-sign 
balance was a comparison of the standard con- 
dition with the four remaining conditions. 
Consistent with past results, this contrast was 
significant, F(1,900) = 16.1, p< .01. The 
effect was larger in the standard condition. 

As Table 1 indicates, the large contact main effect was qualified by
two descriptively smaller interactions: the Reciprocation × Contact
interaction for agreement only and the Contact × Rating Scales
interaction for attraction and three-sign balance.

The Reciprocation × Contact interaction indicates that the contact
predictions for agreement were more strongly supported in the standard
condition than in the reciprocation condition. Theoretically,
reciprocation relates most directly to attraction, not agreement. This
unexpected interaction may have resulted from an attention shift away
from agreement and the bearing of contact on agreement when there was
explicit reciprocation.

The potentially most important interaction, Contact × Rating Scales,
was not significant for agreement, and the effects for attraction and
three-sign balance were not replicated in Experiment 2. The attraction
means are contained in Table 5.² One of the more interesting aspects
of this interaction relates to the means in the no-contact condition.
Note that the pleasantness and harmony means in the no-contact
condition both differed significantly from zero, contrary to the
contact interpretation. Furthermore, they were significantly greater
than the means for the cognitive scales, F(1, 900) = 6.20, p < .01.
Such a result makes sense, if it is recalled that [...] of a positive
attraction effect [...] reciprocated sentiment. The effect was not
predicted, but it is consistent with the initially stated theoretical
orientation.


Discussion 


Experiment 1 had two general purposes, one descriptive and one
theoretical. The descriptive purpose was to look simultaneously at the
differential effects of five scales (pleasantness, harmony,
consistency, expectancy, and stability). Although [...] previous
literature for the harmony scale (Heider's term for balance), the
[...] studies were interpreted as indicating that affective scales
(pleasantness and harmony) would produce larger attraction and
agreement effects than cognitive scales (consistency, expectancy, and
stability). This, in fact, occurred. Such results may suggest that
only the cognitive scales are accurate measures of balance. It should
be recognized, however, that all five scales produced significant
three-sign balance effects, that the consistency scale produced
significant attraction and agreement effects, and that the stability
scale produced a significant agreement effect (see Table 2). Only the
expectancy scale exhibited a total absence of attraction and agreement
effects. On the other hand, the means in the final row of Table 2
descriptively indicate that the expectancy scale produced the smallest
three-sign balance effect.


² An analysis of this interaction [...] orthogonal contrasts is
available from the [...].

What is the meaning of the term balance? Heider (1958) described
balanced relations as “harmonious” (p. 180). Clearly he was expressing
himself metaphorically. Nonetheless, the metaphor does have an
interesting double meaning. For the layman, harmony means affectively
pleasing, but for the musician, harmony has more of a cognitive,
rational meaning. To the extent that balance has both rational and
affective connotations, the term harmony may be an appropriate
metaphor.

The theoretical purpose of Experiment 1 was to provide support for the
reciprocated sentiment interpretation of the attraction effect (the
greater pleasantness, consistency, and so forth of the p-likes-o than
the p-dislikes-o relations). When McGarvey (1974) interviewed a small
sample of subjects after they had rated the eight triads, he found
that some of them did in fact report that they had considered what the
other person might think of them (i.e., considered the nonspecified
o-to-p relation). Furthermore, many of these subjects assumed
same-sign reciprocation of sentiment (i.e., o likes p when p-likes-o
was specified, and o dislikes p when p-dislikes-o was specified).
McGarvey went on to show that a direct manipulation of reciprocated
sentiment increased the magnitude of the attraction effect (on
pleasantness ratings).

From a balance theory perspective, the problem is to understand why
reciprocated same-sign sentiment should increase the magnitude of the
attraction effect. Insofar as attention is directed solely toward the
p-o, o-p cycle or dyad, both reciprocated liking and reciprocated
disliking are balanced. Thus, for this one cycle, there is no way for
balance theory to predict McGarvey's results. However, Heider assumed
that p likes p: that the typical individual holds himself/herself in
high regard. If the second p is regarded as a self-concept with which
the first p has a positive unit relation, it is balanced for p likes p
(Wiest, 1965). Thus, it follows that reciprocated positive sentiment
produces agreement regarding the worth of the self and that
reciprocated negative sentiment produces disagreement regarding the
worth of the self. This means, of course, that from the perspective of
reciprocated sentiment, the attraction effect is an agreement effect.
For balance theory the problem is one of whether agreement effects can
be reduced to cycles that are consistent with the multiplicative rule.
These matters will be discussed in greater detail in the context of
Experiment 2, which was more directly concerned with the agreement
effect. Whatever the correct account of the agreement effect, the
discussion above does make obvious that reciprocated sentiment relates
directly to self-evaluation. For that reason, the finding in the
current study that the main effect of reciprocation on attraction only
occurred with more affective scales (pleasantness and harmony) [...]
means for this Reciprocation × Scales interaction are contained in
Table 3. The occurrence of this interaction provides assurance that
the effect of reciprocated sentiment on attraction is indeed mediated
by self-relevancy considerations. (This assertion rests on the
assumption that pleasure and pain are more obviously self-related than
are consistency, stability, and expectancy.) The self-relevancy
perspective provides a partial explanation for the finding that the
affective scales produced a larger attraction effect than did the
cognitive scales. (Note that this scales effect was present even in
the standard condition, in which there was no explicit statement of
reciprocation.) To the extent that some subjects assumed reciprocated
sentiment with its implications for self-evaluation, it makes sense
that affective scales would be more sensitive to attraction.
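
The argument in the preceding paragraph can be made concrete with a small sketch of the multiplicative rule, under which a cycle is balanced when the product of its signs is positive. The code is an illustration only and rests on assumptions: treating "agreement regarding the worth of the self" as the product of p's self-evaluation and o's sentiment toward p is one reading of the text, and the function and constant names are hypothetical.

```python
from math import prod


def cycle_sign(signs):
    """Multiplicative rule: +1 (balanced) if the product of signs is positive, else -1."""
    return 1 if prod(signs) > 0 else -1


# Assumption taken from the text: p holds himself/herself in high regard.
P_LIKES_SELF = +1


def describe(sentiment):
    """Evaluate same-sign reciprocated sentiment (+1 liking, -1 disliking)."""
    return {
        # The p-o, o-p dyad taken as a single cycle.
        "dyad": cycle_sign([sentiment, sentiment]),
        # Assumed reading: p's and o's evaluations of p agree or disagree.
        "self_agreement": cycle_sign([P_LIKES_SELF, sentiment]),
    }


liking = describe(+1)
disliking = describe(-1)
print(liking, disliking)
```

In this sketch the dyad comes out balanced for both reciprocated liking and reciprocated disliking, so the dyad alone cannot separate the two cases. Only the term involving the self-concept distinguishes them.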

One potential problem with this line of argument is the finding of
LaTour et al. (Note 1) that the attraction effect in a
high-involvement (p-o-x) triad was not greater than in a
low-involvement (q-o-x) triad. The present study, in fact, found that
the attraction effect was larger with q-o-x than p-o-x triads. This
result is exactly opposite to what one would expect on the basis of a
simple self-relevancy perspective on attraction effects.

It is possible that subjects did not realize that q was someone other
than themselves. We were, however, sensitive to the problem, and made
the instructions as explicit as possible. Subjects were twice
instructed to rate the situations according to how “you as the outside
observer” would find the situation. We initially theorized that
subjects judged q-o-x triads through partial, vicarious
identification. To the extent that such identification does occur, the
comparison of p-o-x and q-o-x triads is less obviously a manipulation
of self-involvement. It was further theorized that vicarious
identification would not be sufficiently complete to detect all of the
subtleties in p-o-x triads, and thus the interaction of reciprocation
and affective versus cognitive scales should be more apparent with
p-o-x than with q-o-x triads. This triple interaction occurred. The
means in Table 3 indicate that the tendency for reciprocation to
increase the magnitude of the attraction effect was most apparent with
affective scales and p-o-x triads. The occurrence of this triple
interaction is the heart of the present study and strongly implies
that some of the variance in attraction effects is mediated by
self-relevancy considerations.

A further finding that in retrospect appears consistent with the
argument detailed above is that the three-sign balance effect was
larger with p-o-x than with q-o-x triads. The attraction effect was
larger with q-o-x triads, but the three-sign balance effect was larger
with p-o-x triads. The vicarious identification tendency does rather
well in reproducing the simpler effects, but less well in reproducing
the more complex effects. (Attraction is a main effect and three-sign
balance a triple interaction.)

The remaining results all related to contact. Past studies of these
matters had used only pleasantness and expectancy scales (LaTour et
al., Note 1) or only pleasantness scales (Aderman, 1969; Insko et al.,
1974; McGarvey, 1974). The present results add generality to the
contact-cycle analysis of Insko et al. Despite the fact that half of
the cells involved an explicit statement of reciprocation that appears
to have partially distracted attention from the contact cycles, the
results by and large were consistent with past results. As the means
in Table 4 indicate, attraction was maximized by the explicit
statement of future contact and was reversed, or negative, in the
breaking-contact condition. The major finding that was not consistent
with the contact-cycle analysis was the fact that the no-contact mean
of 3.19 (see Table 4) was significantly greater than zero. According
to the contact-cycle analysis, this mean should be within error
variance of zero. This theoretical analysis relates most directly to
p-o-x triads in which there is no stated reciprocation. However, even
within this condition, the mean of 2.42 still differed from zero,
F(1, 900) = 54[...], p < .01.

A further possibility is that some of the subjects in the
standard-reciprocation, no-contact condition assumed reciprocation
that produced an attraction effect. Some circumstantial evidence for
this possibility comes from the finding that the affective scales
(pleasantness and harmony) were responsible for the attraction effect
under no contact. This pattern can be clearly seen along the
no-contact row of Table 5.


Experiment 2


The general purpose of the present research was to examine the extent
to which balance theory is capable of providing interpretations of
attraction and agreement effects in the perception of hypothetical
triads. Experiment 1 provided support for the hypothesis that contact
and cycles containing the contact unit relation can account for at
least some of the variance. Previous research (Aderman, 1969; Insko et
al., 1974; McGarvey, 1974; LaTour et al., Note 1) has consistently
demonstrated the importance of contact when subjects respond on
pleasantness scales. Experiment 1 found that although there were
differences among various scales (see Table 5), the effects of contact
were evident for all scales. Experiment 1 went further and produced
evidence that reciprocated sentiment and self-relevancy considerations
were also important in explaining attraction effects. The theoretical
account provides a possible explanation for the fact that affective
scales revealed a larger attraction effect than did more cognitive
scales.

The major remaining problem relates not 
to attraction, but to agreement. The above- 
cited studies, and also Experiment 1, all found 
that the agreement effect was altered in 
theoretically predictable ways by the manipu- 
lation of contact. In general, however, the 
contact predictions for the agreement effect 
have been confirmed less frequently than has 
been the case for the attraction effect. Ex- 
periment 1 is a case in point. Note that the 
contact prediction of a reversed (negative) 
attraction effect with the breaking-contact 
condition was supported, but the analogous 
prediction of a reversed agreement effect with 
breaking contact was not supported. The 
agreement mean in this condition was -.06
(see Table 4). Even within the more
appropriate standard-reciprocation condition, the
breaking-contact mean was -.73, still not
significantly less than zero. Another way of
making this point is to note that the contact
F for agreement (37.68) was descriptively
smaller than the contact F for attraction
(80.39).

It is clear that contact does affect agree- 
ment. It is, of course, intuitively compelling 
that disagreement with an o with whom there 
is direct contact has more impact than dis- 
agreement with an o with whom there is no 
direct contact. Milgram’s (1965)
“immediacy”-of-the-victim finding is a possible
illustration of an analogous effect. On the
other hand, it is also intuitively compelling
that disagreement has an impact even when
there is no contact with o. Why is this?
Experiment 2 was designed to explore one
possible explanation.

The best known theory of social influence
is, of course, social comparison theory
(Festinger, 1954). Social comparison posits a
self-evaluative drive that results in a
motivational force to validate opinions. Validation
of opinions (or more specifically, nonverifiable
opinions) is sought through agreement
with others. Thus, social comparison theory
provides a possible explanation of the
agreement effect.

Acceptance of the social comparison theory 
explanation of agreement effects would in- 
volve an admission that balance theory could 
not account for the no-contact variance in 
the agreement effect. This would not mean 
that balance theory was incorrect; but it 
would mean that balance theory was of more 
limited generality. Balance theory
explanations proceed by identifying some implicit
aspect of p's phenomenology that can be
reduced to a balanced cycle. The theory itself
does not specify the important aspect of the
phenomenology, and the subject himself/herself
may not be explicitly aware of the process
that is causally important (cf. Nisbett &
Wilson, 1977). This matter is illustrated by
previously obtained results regarding contact.
When McGarvey interviewed a sample of 18 
subjects after they had rated the standard 
eight triads for pleasantness, he found that 
13 of the subjects reported thinking about an 
o with whom there was interaction. Heider's
[...] contact in the attraction [...]
(McGarvey did not specifically ask about this).

The current challenge is to identify some 
implicit aspect of p's phenomenology.
Examination of McGarvey’s interview results
indicates that in 8 out of 18 instances, the
[...] subjects [...] that these subjects
conceptualized o as someone who was similar:
similar in cultural [...] and in reasoning
capacity. They
did not conceptualize o as a “man from 
Mars,” a member of another culture or sub- 
culture, a resident of a mental institution, or 
a child. From a balance theory perspective, 
similarity is a unit relation. Therefore, even 
under conditions of no contact, there would 
be an agreement cycle (p agrees with a
similar o or p disagrees with a similar o) and
also an attraction cycle (p likes a similar o
or p dislikes a dissimilar o). The analysis
parallels Insko et al.’s (1974, p. 56) cycles
interpretation of contact, except that in this
instance the unit relation involves similarity
and not contact.

It is intriguing that social comparison 
theory appears to take an analogous position 
regarding similarity. According to Festinger 
(1954), “a person who believes that Negroes 
are the intellectual equals of whites does not 
evaluate his opinions by comparison with the 
opinion of a person who belongs to some very 
anti-Negro group” (p. 21). Why does social 
comparison theory make this assumption? 
Social comparison theory itself does not an- 
swer this question. Festinger appears to have 
intuited a balance implication without being 
explicit about it. 

The manipulation of similarity in Experi- 
ment 2 had three levels: a standard condition 
(with no mention of similarity), a similar 
condition (“O generally has interests and at- 
titudes similar to yours”), and a dissimilar 
condition (“O generally has interests and at- 
titudes dissimilar to yours”). In addition, 
there were manipulations of scales
(pleasantness, harmony, consistency, expectancy, and
stability) and contact (standard, future
contact, no contact, and breaking contact). Since
the theoretical import of future contact with
and without discussion of x is not clear,
reference to discussion of x was simply omitted.
Altogether the design involved 60
between-subjects cells (3 × 5 × 4).
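
The crossing of the three between-subjects factors can be enumerated directly. The following is a minimal sketch in Python: the factor levels are taken from the text, the figure of 10 subjects per cell comes from the Method section, and the variable names are chosen here purely for illustration.

```python
from itertools import product

# Factor levels as named in the text.
SIMILARITY = ["standard", "similar", "dissimilar"]
SCALES = ["pleasantness", "harmony", "consistency", "expectancy", "stability"]
CONTACT = ["standard", "future contact", "no contact", "breaking contact"]

# Every between-subjects cell is one (similarity, scale, contact) combination.
cells = list(product(SIMILARITY, SCALES, CONTACT))

SUBJECTS_PER_CELL = 10  # reported in the Method section
n_cells = len(cells)
n_subjects = n_cells * SUBJECTS_PER_CELL

print(n_cells, n_subjects)
```

The cell count and the implied sample size agree with the 60 cells and 600 subjects reported for Experiment 2.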

The scales factor was included to gain some 
information regarding the generality of the 
similarity manipulation, and the contact fac- 
tor was included to examine the extent to 
which similarity produced effects beyond those 
resulting from positive, absent, or negative 
contact. The factorial combinations produced 
by the present design present a theoretical 
dilemma regarding the effects of contact 
across the levels of similarity. It is possible 
that the effects of contact may summate (or 
average) with those of similarity, or that the 
effects of contact will be more apparent with 
standard similarity (when there is no explicit 
mention of similarity). The latter possibility
could be the result of attention shifts and/or
information overload.

Previous research has obtained results that
appear to be the result of attention shifts.
First, Experiment 1, in agreement with
previous research (Aderman, 1969; Insko et al.,
1974; McGarvey, 1974; LaTour et al., Note
1), found that the explicit mention of contact
reduced the magnitude of the three-sign
balance effect from its magnitude in the
standard condition (with no mention of
contact). Second, Experiment 1 found that the
explicit statement of reciprocation reduced the
magnitude of the agreement effect from its
magnitude in the standard-reciprocation
condition in which there was no mention of
reciprocation. Third, manipulations of triad
order by Insko et al., and also McGarvey,
found that the magnitude of the three-sign
balance effect was greater if the first triad
was three-sign imbalanced (and thus more
atypical and attention getting) rather than
three-sign balanced.


Method 
Subjects


The subjects were 600 male and female students
from the introductory psychology course at the
University of North Carolina.


Independent and Dependent Variables 


As in Experiment 1, each subject rated the
standard eight triads, and these ratings were used to
generate seven orthogonal effects. The major analyses,
however, concerned the attraction, agreement, and
three-sign balance effects.

There were three between-subjects factors:
similarity (standard, similar, or dissimilar), contact
(standard, future contact, no contact, or breaking
contact), and rating scales (pleasantness,
expectancy, harmony, stability, or consistency). The
design involved 60 between-subjects cells and three
dependent variables with 10 subjects per cell.
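
How eight triad ratings yield seven orthogonal effects can be sketched as a 2 × 2 × 2 sign-coded contrast set. The sketch below is an illustration under stated assumptions, not the authors' scoring procedure: it assumes the three signs are ordered (p-o, p-x, o-x), that attraction is the p-o main effect, that agreement is the p-x by o-x product, and that three-sign balance is the triple product. The scores are unscaled sums, since the scaling behind the reported means is not given in this section, and the example ratings are hypothetical.

```python
from itertools import product

# Signs of the three relations in a triad, assumed order: (p-o, p-x, o-x).
TRIADS = list(product((+1, -1), repeat=3))  # the eight standard triads


def contrast_weights(po, px, ox):
    """Weights of one triad on each of the seven orthogonal contrasts."""
    return {
        "attraction": po,                    # p likes vs. dislikes o
        "p-x sign": px,
        "o-x sign": ox,
        "first x second": po * px,
        "first x third": po * ox,
        "agreement": px * ox,                # p and o feel the same way about x
        "three-sign balance": po * px * ox,  # product of all three signs
    }


def effect_scores(ratings):
    """Unscaled contrast sums for one subject's eight triad ratings."""
    totals = {}
    for triad in TRIADS:
        for name, weight in contrast_weights(*triad).items():
            totals[name] = totals.get(name, 0) + weight * ratings[triad]
    return totals


# Hypothetical subject whose ratings reflect only attraction and agreement.
example = {t: 2 * t[0] + 1 * (t[1] * t[2]) for t in TRIADS}
scores = effect_scores(example)
print(scores["attraction"], scores["agreement"], scores["three-sign balance"])
```

Because the seven weight vectors are mutually orthogonal, a rating pattern driven only by attraction and agreement produces a zero three-sign balance score in this sketch.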

Procedure


The procedure closely paralleled that of Experiment 1.
Subjects were tested in groups of approximately
25 people, and no two subjects in any group
were in the same experimental condition. Each
subject received a nine-page booklet, the first page of
which gave general instructions and the remaining
pages of which listed each of the eight triads (one
per page), along with a statement of one or more
assumptions and a rating scale. The similarity
assumptions were stated as follows.


Assume: O generally has interests and attitudes
similar to yours.


Or


Assume: O generally has interests and attitudes
dissimilar to yours.


As in Experiment 1, no subjects failed to rate all
eight triads, and thus no subjects were eliminated
from the analysis.


3 Through an oversight the negative endpoint of
the stability scale was changed from “Most [...]able”
to “Most unstable,” and the negative endpoint
of the harmony scale was changed from “Most
discordant” to “Most unharmonious.”


Table 6
Multivariate Analysis of Variance for Experiment 2

                                              Three-sign
Source           Attraction F   Agreement F   balance F    Multivariate F
Similarity (S)      6.65**        21.60**       3.18*       [illegible]
Contact (C)        65.21**         9.50**       5.99**       11.05**
Scales (L)         17.93**        24.19**       1.50          6.40**
S × C               3.17*          2.45*        1.67          1.92**
S × L               4.10**         2.20*        1.21          1.96**
C × L               1.24            .52         1.06          1.20
S × C × L            .84        [illegible]      .80          1.29

* p < .05. ** p < .01.


Results 
Grand Mean Tests 


Grand mean tests of the seven analysis of
variance contrasts for the eight triads reveal
the usual attraction, F(1, 540) = 110.88,
p < .01; agreement, F(1, 540) = 149.80,
p < .01; and three-sign balance effects, F(1,
540) = 252.67, p < .01. This analysis also
found an unexpected First × Second Sign
interaction, F(1, 540) = 6.70, p < .01. Triads in
which the p-o and p-x relation were of the
same sign were rated higher than triads in
which these two relations were of unlike sign.
Subsequent analyses focused on the
descriptively larger attraction, agreement, and
three-sign balance effects.


Multivariate Analysis of Variance 


A three-factor multivariate analysis of
variance of the attraction, agreement, and
three-sign balance effects is presented in Table 6.
This analysis revealed significant multivariate 
Fs for all three main effects, as well as for 
two of the interactions. Since both of the 
interactions involve similarity, we will begin 
with a description of the main effects for 
rating scales and contact. 


Rating Scales Main Effect 


There was a significant rating scales main 
effect for both attraction and agreement. 
Table 7 contains the marginal means. A con- 
trast comparison of the affective scales 
(pleasantness and harmony) with the more 
cognitive scales (expectancy, stability, and 
consistency) was significant for both
attraction, F(1, 540) = 60.16, p < .01, and
agreement, F(1, 540) = 71.50, p < .01, indicating
a greater effect with the affective scales and 
replicating a similar finding in Experiment 1. 
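
The direction and descriptive size of the affective versus cognitive contrast can be recovered from the marginal means alone. The sketch below uses the agreement row of Table 7; the contrast weights (equal positive weights for the two affective scales, equal negative weights for the three cognitive scales) are an assumption about how such a comparison is typically set up, and the variable names are illustrative. It does not reproduce the reported F, which also requires the error term.

```python
# Agreement marginal means from Table 7.
AGREEMENT_MEANS = {
    "pleasantness": 4.21,
    "harmony": 4.44,
    "expectancy": -0.74,
    "consistency": 1.24,
    "stability": 2.97,
}
AFFECTIVE = ("pleasantness", "harmony")
COGNITIVE = ("expectancy", "consistency", "stability")

# Assumed weights: +1/2 per affective scale, -1/3 per cognitive scale.
weights = {scale: 1 / len(AFFECTIVE) for scale in AFFECTIVE}
weights.update({scale: -1 / len(COGNITIVE) for scale in COGNITIVE})

# Weighted sum of means: mean(affective) minus mean(cognitive).
contrast_value = sum(weights[s] * AGREEMENT_MEANS[s] for s in AGREEMENT_MEANS)
print(round(contrast_value, 2))
```

A positive value indicates a larger agreement effect on the affective scales, which matches the direction described in the text.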

A second contrast comparing expectancy 
and consistency scales with the stability scale 
indicated a greater effect with stability for
both attraction, F(1, 540) = 15.66, p < .01,
and agreement, F(1, 540) = 25.10, p < .01.
This contrast, which was significant for
agreement but not attraction in Experiment 1,
possibly means that stability has a more affective
connotation than do expectancy and consistency.

Grand mean tests were conducted on each 
of the cell means in Table 7, which also in- 
cludes the means for three-sign balance. Ex- 
cept for the failure of consistency to reveal 
significant attraction and agreement effects 
and the presence of a significant attraction 
effect for stability, the results replicated the 
findings of Experiment 1.


Table 7
Rating Scale Means for Experiment 2

Condition            Pleasantness   Harmony       Expectancy   Consistency   Stability
Attraction              4.34**      [illegible]     -.26          .78         3.02**
Agreement               4.21**       4.44**         -.74         1.24         2.97*
Three-sign balance      3.44*        2.39*          2.79*        3.31*        2.40*

* p < .05, df = 1, 540. ** p < .01, df = 1, 540.


Contact Main Effect 


In agreement with Experiment 1 and all 
past results, there was a significant contact 
main effect on attraction, agreement, and 
three-sign balance. Table 8 contains the mar- 
ginal means. Three contrasts were planned 
for attraction and agreement. First, attrac- 
tion was greater in the standard and future- 
contact conditions than in the no-contact and 
breaking-contact conditions, F(1, 540) = 
55.54, p < .01, as was also agreement, F(1, 
540) = 18.57, p<.01. Second, attraction 
was greater in the future-contact than in the 
standard condition, F(1, 540) = 84.23, p <
.01, but the contrast was not significant for
agreement. Third, attraction was greater with 
no contact than with breaking contact, F(1, 
540) = 55.86, p< .01, as was also agreement, 
F(1, 540) = 7.83, p <.01. 

The asterisks in Table 8 indicate the
significance of the grand mean tests (tests of
differences from zero) for each of the cell
means. The failure of the attraction mean in
the standard condition to reach significance
is inconsistent with the results of Experiment
1 (see Table 4) and is not in agreement with
the general assumption that some subjects in a
standard condition implicitly assume contact.
As is indicated below, however, the standard
contact mean was significantly greater than
zero within the standard similarity condition
(see Table 10).

The only planned contrast for three-sign
balance was a comparison of the standard
condition with the four remaining conditions.
In agreement with Experiment 1, the contrast
was significant, F(1, 560) = 10.24, p < .01.


Table 8
Contact Means for Experiment 2

                      Standard
                      (no mention     Future       No           Breaking
Condition             of) contact     contact      contact      contact
Attraction               1.27          7.88**       3.47**      [illegible]
Agreement                2.87*         3.68**       2.35*       [illegible]
Three-sign balance       3.65**        2.23*        3.60**      [illegible]

* p < .05, df = 1, 540. ** p < .01, df = 1, 540.


Similarity Main Effect


There was a significant similarity main
effect for attraction, agreement, and three-sign
balance; the marginal means for attraction
and agreement are in the right-hand column
of Table 9. A contrast of the standard
condition (with implicit similarity) and similar
condition with the dissimilar condition was
significant for both attraction, F(1, 540) =
13.25, p < .01, and agreement, F(1, 540) =
27.69, p < .01. A second contrast of the
standard and similar conditions was significant
for agreement only, F(1, 540) = 15.51, p <
.01. According to the similarity interpretation
of attraction and agreement, all four of these
contrasts should have been significant. The
agreement results were more in accord with
prediction than were the attraction results
(in contrast to the contact results, for which
the opposite occurred). Grand mean tests (see
Table 9) indicated that significant attraction
and agreement effects occurred in all but the
dissimilar condition. According to the
similarity interpretation, the attraction and
agreement effects should have been negative in the
dissimilar condition. In reality, of course,
contact and reciprocation also have some
bearing on these means.

The results for three-sign balance indicated
that only the first contrast was significant,
F(1, 540) = 5.66, p < .05. Three-sign balance
was greater in the standard and similar
conditions than in the dissimilar condition.


Table 9
Similarity × Rating Scales Means

Condition and
similarity        Pleasantness  Harmony   Expectancy   Consistency   Stability     Marginal
Attraction
  Standard
  (no mention)       2.88*      6.08**       .85         2.60*       [illegible]   [illegible]
  Similarity         4.00**     5.60**      -.40       [illegible]   [illegible]   [illegible]
  Dissimilarity      6.15**     4.90**    [illegible]   -3.33**        .35          1.37
Agreement
  Standard
  (no mention)       3.53*      4.63**      -.60         1.50         1.98*         2.21*
  Similarity         5.00**     5.00**      1.00         4.00**       5.58**        4.12**
  Dissimilarity      4.10**     3.70**     -2.63**      -1.78*        1.35           .95

* p < .05, df = 1, 540. ** p < .01, df = 1, 540.


Similarity x Rating Scales Interaction 


Unfortunately for the sake of both initial
prediction and simplicity of discussion, the
main effect for similarity mentioned above
was qualified by a Similarity × Rating Scales
interaction. The attraction and agreement
means are contained in Table 9. A series of
contrasts was calculated to explore this
pattern of results. These contrasts indicated
that greater attraction and agreement effects
with the standard and similar conditions than
in the dissimilar condition were only found
for the cognitive scales (expectancy,
consistency, and stability). For attraction the effect
with pleasantness and harmony was even in
the wrong direction, although only slightly.
For agreement, the effect was in the predicted
direction, although not significantly so. These
results indicate that the social comparison
theory perspective is only supported by the
more cognitive scales. Put less technically,
the subjects indicated that they regarded
agreement with a dissimilar other as
inconsistent but not unpleasant.


Similarity x Contact Interaction 


There was a significant Similarity × Contact
interaction for attraction and agreement.
As the means in Table 10 indicate, the
interaction implies that the contact predictions
were more strongly supported in the
standard-similarity condition. Such a result is in
agreement with the anticipation that the factorial
combination of similarity and contact might
result in attention shifts and/or information
overload. It is, furthermore, reassuring to
find that the attraction effect with standard
contact is significantly greater than zero when
there is standard similarity. Across levels
of similarity, the standard contact mean was
not significantly greater than zero (see Table
8).
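
The relation between the two tables can be checked arithmetically. With equal cell sizes (10 subjects per cell, as stated in the Method section), each contact marginal mean in Table 8 should equal the average of the three similarity cells in the matching column of Table 10. The sketch below makes that check for the three attraction columns whose Table 10 entries are fully legible; the dictionary and function names are illustrative assumptions.

```python
# Attraction cell means from Table 10, ordered as
# (standard, similar, dissimilar) similarity within each contact condition.
TABLE_10_ATTRACTION = {
    "standard contact": (2.76, 1.48, -0.42),
    "future contact": (9.92, 7.36, 6.36),
    "no contact": (4.12, 4.04, 2.26),
}

# Attraction marginal means reported in Table 8.
TABLE_8_ATTRACTION = {
    "standard contact": 1.27,
    "future contact": 7.88,
    "no contact": 3.47,
}


def marginal_mean(cell_means):
    """Unweighted average of cell means, appropriate for equal cell sizes."""
    return sum(cell_means) / len(cell_means)


marginals = {
    condition: round(marginal_mean(cells), 2)
    for condition, cells in TABLE_10_ATTRACTION.items()
}
print(marginals)
```

Under the equal-n assumption the computed averages reproduce the Table 8 values, which illustrates how a marginal mean near zero can mask a clearly positive mean in the standard-similarity row.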


4 A detailed summary of these analyses is
available from the first author.


Table 10
Similarity × Contact Means

                           Standard
Condition and              (no mention    Future       No          Breaking
similarity                 of) contact    contact      contact     contact
Attraction
  Standard (no mention)       2.76*        9.92**       4.12**      -3.74*
  Similarity                  1.48         7.36**       4.04**     [illegible]
  Dissimilarity               -.42         6.36*        2.26*       -2.72*
Agreement
  Standard (no mention)       3.96**     [illegible]    2.04*        -.50
  Similarity                  4.80**       5.56**       3.56**       2.54**
  Dissimilarity               -.14         2.16*        1.46         1.32

* p < .05, df = 1, 540. ** p < .01, df = 1, 540.


Scales Within the No-Contact Condition


One further finding should be mentioned.
This finding is relevant to the argument that
reciprocated sentiment and self-relevancy
considerations are important in accounting for
the attraction effect in the no-contact
condition. To the extent that self-relevancy
considerations are important in the no-contact
condition, the affective scales should produce 
greater attraction than the more cognitive 
scales. This happened in Experiment 1 (see 
Table 5). A planned test of the same con- 
trast in Experiment 2 was also significant, 
F(1, 540) = 35.60, p < .01. Mean attraction 
effects for the pleasantness, harmony, expect- 
ancy, consistency, and stability scales were 
4.47, 8.60, —.03, 1.53, and 2.80, respectively. 


Postexperimental Questionnaire 


Finally, we should like to mention that 
Experiment 2 also collected some self-report 
data—data that will only be briefly described. 
Since the one variable that appears to be most 
consistently important in a balance account of 
attraction and agreement effects is contact, and 
since McGarvey’s informal postexperimental 
interviews on this matter were only done with 
18 subjects, we were interested in collecting 
data from a somewhat larger sample. The 
first question, which was asked only in the 
standard contact condition, inquired whether 
the subjects thought of a specific o (another
person) with whom they had contact. The 
results were even more striking than Mc- 
Garvey’s since 136 of 150 subjects (90.7%) 
answered in the affirmative. The second 
question, which was also asked in the non- 
standard conditions, inquired about the time 
duration of the contact. All but one subject 
in the standard condition specified present 
contact. In the nonstandard conditions, the 
question constituted a manipulation check on 


past and future contact and overwhelmingly 
yielded the expected results. 

The final set of questions was directed at a
matter with which we were most concerned.
This is the potential criticism that the
previously obtained contact results were an
artifact of choice. Perhaps when subjects were
told to assume no more contact or continued
contact in the future they imagined that this
change was the result of their own free choice.
A set of questions designed to get at this
problem revealed that some subjects imagined
that their contact was the result of p’s choice;
some, of o’s choice; some, of mutual choices;
some, of no-choice circumstances; and some,
of a combination of choice and circumstance.
Since the modal response category was
“circumstances,” this category was compared with
all of the other choice categories. Except for
two instances, this analysis revealed no
significant effects. In the future-contact
condition, the agreement effect was greater for
no-choice than for choice subjects; and in the
breaking-contact condition, the attraction
effect was more negative for no-choice than for
choice subjects. In general, the results do not
seem to indicate that the usual contact
results are an artifact of choice.


Discussion 

Experiment 2 generally replicated the
Experiment 1 results relating to rating scales
and contact. The two experiments agreed on
the major finding that the affective scales
(pleasantness and harmony) produced greater
attraction and agreement effects than did the
cognitive scales (expectancy, consistency, and
stability). It is also interesting that in neither
Experiment 1 nor Experiment 2 did
expectancy ratings reveal significant attraction or
agreement effects. The means in Experiment 2
are, in fact, slightly negative. LaTour et al.
(Note 1) also did not find that expectancy
ratings produced significant attraction or
agreement effects. All three experiments found
that expectancy ratings did not reveal
significant attraction or agreement effects, but
did reveal significant three-sign balance effects.
These results possibly indicate that when
subjects make expectancy judgments, they rely
to a great extent on the relative frequencies
of past social experiences (i.e., they rely on
something analogous to the long-run relative
frequency interpretation of probability). From
this perspective, subjects do not regard
p-dislikes-o or p-disagrees-with-o situations as
infrequent. On the other hand, three-sign
imbalanced situations in which p likes someone
and disagrees with that person or dislikes
someone and agrees with him/her may have
a relatively low frequency of past occurrence,
possibly because past group relationships
tend to be ones in which there is both
attraction and agreement (cf. Newcomb, 1978).

The results for contact were generally in
accord with the findings of Experiment 1. It
is again apparent that more of the contact
predictions were supported for attraction than
for agreement. In view of the fact that the
explicit mention of similarity and/or
dissimilarity appeared to lessen the magnitudes
of some of the predicted contact effects, this
general replication of basic results is gratifying.

The main focus of Experiment 2 was on
the anticipated effect of similarity on
attraction and agreement. Experiment 2 can be
regarded as a test of the social comparison
theory perspective on agreement and
attraction effects. This perspective is that we seek
agreement with and are attracted to similar
others. The anticipated main effect of
similarity on both attraction and agreement
occurred. Both attraction and agreement effects
were greater in the standard and similar
conditions than in the dissimilar condition.
Furthermore, the agreement effect was greater in
the similar condition than in the standard 
condition. Beyond this, however, the attrac- 
tion and agreement effects were not signifi- 
cantly less than zero in the dissimilar con- 
dition, and, most important of all, the main 
effect of similarity was qualified by the rating 
scales factor. With the affective scales, the 
manipulation of similarity had no significant 
effect on either attraction or agreement (see 
Table 9). Agreement with similar and dis- 
similar others was equally pleasant, but not 
equally consistent. Although this finding is 
intuitively plausible, it was not anticipated. 
There does not seem to be anything in social 
comparison theory that would have indicated 
such results. Indeed, the basic assumption of
a drive for self-evaluation might be inter- 
preted as implying a greater effect on the 
affective than on the cognitive scales—cer- 
tainly not a lesser effect. 

From the perspective of our initial expecta- 
tions, Experiment 2 was a qualified success. 
Some additional agreement variance, beyond 
that explained by contact, was accounted for, 
but there was obviously unexplained variance, 
particularly on the affective scales. The prob- 
lem becomes acute when it is recognized that 
it was the affective scales that produced the 
most agreement variance. Also it is important 
to note that a balance account of the re- 
ciprocated sentiment results for attraction is 
dependent on the adequacy with which bal- 
ance theory can account for agreement vari- 
ance. At the present time it is apparent that 
the balance theory is better able to account 
for variance on the cognitive scales than for 
variance on the affective scales.

We regard the results of the two experi- 
ments as sufficiently encouraging to warrant 
continued investigation of the possibility that 
balance theory can account for the reliable 
variance in the phenomenology of hypothetical 
social situations. As previously indicated, the 
rationale for studying hypothetical social sit- 
uations is not necessarily to learn about actual 
social situations. Although the phenomenology 
of actual social situations may bear some 
resemblance to the phenomenology of hy- 
pothetical social situations (cf. Sampson & 
Insko, 1964, for a study of nonhypothetical
p-o-x triads), there are some important
differences. At the very least the phenomenology
of actual situations will be more complex 
and span a greater period of time. Hypo- 
thetical social situations are studied because 
the ease of manipulating variables makes 
these situations a convenient arena for testing 
specific balance hypotheses. Balance theory
should be able to account for certain aspects
of human thought, whatever their basis in
reality.


Reference Note 


1. LaTour, et al. Pleasantness and expectancy ratings
in p-o-x and q-o-x triads: The partially
overlapping predictions of psychological hedonism and
balance theory. Unpublished manuscript,
University of North Carolina, 1975.


Feeling More Than We Can Know:
Exposure Effects Without Learning

William Raft Wilson
School of Nursing
University of Michigan

Given the joint occurrence of positive attitudinal and learning effects that
typically result from repeated stimulus exposure, most researchers have
concluded that what an individual knows and what he or she feels about the
familiar stimulus are not independent. In fact, it is generally assumed that
[...] both [...] and mediates the [...] such as [...] unconfound affect and
[...] employed a dichotic-listening procedure for stimulus exposure, in which the
critical stimuli were exposed on the unattended channel. Reliable attitudinal
[...] familiar [...] obtained in both experi[...] significant predictor of
[...]. Positive feelings toward a previously encountered object are not
dependent on consciously knowing or perceiving that the object is familiar.
Implications for the role of awareness in processes of attitude formation
under conditions of mere exposure are discussed.


Research on the changes that an individual
[...] as a function of repeated stimulus
exposure has consistently obtained two results.
At a cognitive level, the individual's ability to
recognize the stimulus increases (e.g., [...],
1968). At an affective level, the individual's
attitude toward the stimulus becomes
increasingly positive (e.g., Zajonc, 1968). Given
the joint occurrence of these two effects, most
researchers have concluded that changes in
what an individual knows and in what he or
she feels about the familiar stimulus are
highly related. In fact, it is generally assumed
that some form of learning that improves the
individual's ability to identify, classify, or
recognize the stimulus both precedes and
mediates the affective response (Berlyne,
1970; Harrison, 1968; Stang, 1975).

Although the above assumption has been 
incorporated in all major explanations of the 
attitudinal effects of stimulus exposure, little 
research has been designed to test it experi- 
mentally. Furthermore, clear support for stim- 
ulus recognition or feelings of familiarity as 
a necessary condition for obtaining exposure 
effects has not emerged in past research. For 
example, Matlin (1971) reported that ob- 
jectively (de facto) familiar stimuli, which 
were correctly recognized and subjectively 
familiar, were rated more positively than
objectively familiar stimuli that were
incorrectly identified and not subjectively familiar.
Surprisingly, however, feelings of subjective
familiarity did not lead to more positive
ratings of stimuli if they were indeed novel.
Confidence in these findings should be limited,
since comparisons were based on two very
infrequent recognition errors. It is possible that
these judgment errors may have resulted
from certain biases elicited by particular
features of the stimuli that also influence affect.

In a direct attempt to connect learning to 
exposure effects, Stang (1975) performed a 
number of experiments in which he varied 
exposure duration and measured subjects’ af- 
fective ratings of, and ability to recognize or 
recall, stimuli. Learning and affect curves 
were found to vary, almost identically, as a
function of exposure duration. Stang
concluded that repeated stimulus exposure leads
to learning that is intrinsically rewarding, and 
that this reward value is associatively linked 
to the stimulus. Thus, when asked to rate 
the familiar stimulus, an individual will re- 
flect this accumulation of reward value by 
giving a positive rating. 

The conclusions drawn from Stang’s data 
are questionable. First, subjects were told 
that they were participating in a learning 
study. This may have inflated the relation be- 
tween stimulus recognition and affect rating, 
since “good” performance was dependent on 
accuracy of learning. Other related research 
suggests that subjects respond differently to 
stimuli during exposure, once they are
informed that recognition will later be measured
(e.g., Berlyne, Craw, Salapatek, & Lewis,
1963). Secondly, the joint occurrence of
stimulus recognition and positive affect ratings
does not in itself demonstrate that learning is 
a necessary condition for obtaining exposure 
effects. That learning is not a necessary condi- 
tion was suggested by recent research that 
reported a significant partial correlation be- 
tween exposure and affect, independent of 
the effects of subjective familiarity (More- 
land & Zajonc, 1977). 

The fact that, under typical conditions of 
exposure, subjective familiarity judgments and 
affective ratings are highly correlated cannot
be disputed. The critical question that re- 
mains to be investigated, however, is whether 
these effects are partially independent prod- 
ucts of repeated stimulus presentations, or 
whether the enhanced affective response is 
dependent on increments in subjective famil- 
iarity. What is needed is an alternative method 
for stimulus exposure that can unconfound 
attitudinal and learning effects. One obvious 
approach is to establish exposure conditions 
in which the form of perceptual learning that 
commonly enhances stimulus recognition is 
severely attenuated or reduced to chance 
levels, even after repeated stimulus presenta- 
tions. 

Although repeated presentation of a stimu- 
lus to an individual usually does increase the 
probability that it will be recognized or recalled,
studies of selective attention suggest
that this is not an inevitable result when the
information is not attended to during presentation.
For example, the dichotic-listening
procedure involves simultaneous transmission
of different messages to each of the subject’s 
ears. The subject is required to attend to the 
information being transmitted to one ear and 
to ignore the information coming to the other 
ear. Under these conditions, information in
the unattended channel rarely enters consciousness,
and its later recognition or recall
rarely exceeds chance levels, even when processing
up to a semantic level is evident (Corteen
& Wood, 1972; Moray, 1959). Thus, the
dichotic-listening procedure appears to be an
excellent method for establishing conditions 
of mere exposure, that is, for making the 
stimulus accessible to the individual’s perception
(Zajonc, 1968) without increasing the
probability of stimulus recognition.

The purpose of the present study was to 
examine the contribution of objective and 
subjective familiarity to exposure effects ob- 
tained in the absence of stimulus recognition. 
It was assumed that objectively familiar stim- 
uli would be rated more positively than ob- 
jectively unfamiliar stimuli, independent of 
how familiar these stimuli may have appeared 
to the individual. At the extreme, I sought
to obtain the exposure effect in the absence 
of stimulus recognition. 


Experiment 1 


A modified dichotic-listening procedure for 
stimulus exposure was employed. Subjects
were asked to attend to a distractor message
in one ear while the critical stimuli were
being transmitted to the unattended ear. The
purpose of this experiment was to determine
if attitudinal enhancement effects of mere exposure
could be obtained when the critical
stimuli were presented on the unattended
channel.


Method 


Subjects


The subjects were 24 undergraduate women en- 
rolled in introductory psychology courses at the 
University of Michigan. They were randomly drawn 
from the Department of Psychology voluntary subject
pool. Female students were selected for reasons
of convenience.


Materials and Apparatus 


The critical stimuli were six tone sequences. Each
sequence was approximately 10 sec long and appeared,
to the listener, to be a simple melody. Each
[text illegible] having a given frequency [selected from]
among 10 frequencies varying [range illegible]
Hz, having a given duration selected [from]
among five durations varying [range illegible]
msec, and having one of three [text illegible]. The
six sequences were of intermediate complexity level,
[selected] from among 18 [text illegible]
(1964; [text illegible]). These sequences were
found to be [text illegible]able when judged for pleasantness,
[text illegible]tion, and complexity (Crozier,
[text illegible] 1966).

Six tapes were prepared for exposure purposes,
each of which contained three of the six critical 
stimuli repeated five times each, for a total of 15 
exposure trials per tape with 5-sec intertrial intervals. 
The order in which the tone sequences were played 
on any given tape was randomly determined. For
purposes of postexposure ratings, a test tape, com- 
prised of all six critical stimuli, randomly ordered, 
was prepared. Use of each tone sequence for exposure 
and/or testing was balanced so that half of the sub- 
jects were exposed to it prior to testing, while the 
other half encountered the stimulus only during 
testing. 
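
The balancing constraint described above can be illustrated with a short Python sketch. It is not the author's procedure: the text does not say which three sequences went on which tape, so the cyclic assignment below (and the names `build_tapes`, `N_STIMULI`, `PER_TAPE`, `REPEATS`) are assumptions chosen only because they satisfy the stated constraints.

```python
import random

N_STIMULI = 6   # six critical tone sequences
PER_TAPE = 3    # three sequences per exposure tape
REPEATS = 5     # each sequence repeated five times per tape


def build_tapes(seed=0):
    """Return six exposure tapes, each a randomly ordered list of 15 trials."""
    rng = random.Random(seed)
    tapes = []
    for start in range(N_STIMULI):
        # Assumed scheme: cyclic triples (0,1,2), (1,2,3), ..., (5,0,1),
        # so every sequence appears on exactly three of the six tapes.
        chosen = [(start + k) % N_STIMULI for k in range(PER_TAPE)]
        trials = chosen * REPEATS      # 3 sequences x 5 repetitions = 15 trials
        rng.shuffle(trials)            # random presentation order within a tape
        tapes.append(trials)
    return tapes


tapes = build_tapes()
# Number of tapes on which each sequence appears.
exposure_counts = [sum(s in tape for tape in tapes) for s in range(N_STIMULI)]
```

If the 24 subjects were spread evenly over the six tapes (four per tape, which is also an assumption), each sequence would be heard during exposure by 12 subjects and encountered only at test by the other 12.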

The distractor message, to which subjects were 
asked to attend during the exposure phase, was a 
taped passage of a story segment taken from Daphne 
DuMaurier’s The Birds. This passage was read in a 
monotone female voice at about 150 words per 
min. A typewritten version of this passage was prepared
which contained a number of “errors.” For
example, the subject would hear “Autumn was best 
for this, better than spring,” whereas the written 
passage would read, “Author was been for this, bet- 
ter them spring.” There were either 3 or 4 randomly 
determined errors per line. Whether any particular 
line had 3 or 4 errors was also randomly determined. 

As in previous research (e.g., Zajonc, Markus, &
Wilson, 1974), a 7-point bipolar scale was employed 
to measure affect. The extreme points on this scale 
were labeled “like” and “dislike.” The recognition 
responses were recorded on a separate sheet. On 
the left side of the page, the subject checked whether 
the test stimulus was “OLD” or “NEW.” To the right
of this judgment were spaces where the subject indi- 
cated whether she was sure, half-sure, or guessing 
when making the recognition judgment. 

Both the critical stimuli and the distractor message 
were recorded and played back on a stereo tape recorder.
All stimuli were heard by the subjects over
stereo headphones. The subjects were seated at 
library-type desks, which prevented them from seeing
the experimenter, or any of the equipment, during
the experiment. Finally, volume levels for both
the distractor message and the critical stimuli were 
measured by a sound meter. 


Procedure 


Exposure phase. The subjects were asked to sit 
down immediately upon arriving and were read the 
following instructions: 


Before you is a copy of a literary passage. I
would like you to do two things: (1) listen to
someone recite that passage while you simultaneously
read the text and (2) put a slash through
all words in the written text which do not cor- 
respond to the words you hear. This is like a 
proofreading task where all noncorresponding 
words are considered errors. In order to do your 
best at this task, you will have to listen very 
closely to what the speaker is saying. There will 
be many errors in the written text and your job
is to put a slash through as many as you can dis- 
cover. If you have no questions, we will now have 
a short practice session. 


The subject was then given eight lines of practice. 
Then the entire passage was played to the subject's 
right ear without any interruptions. Approximately
30 sec after the passage had started, the critical 
stimuli were played to the subject’s left ear. The 
distractor passage continued for 30 sec after the 
exposure phase was completed. The subjects were 
not forewarned about the occurrence of the critical 
stimuli, but they were instructed to ignore any ex- 
traneous sounds, regardless of the source, and to 
focus their complete attention on the literary pas- 
sage. The distractor passage was played at a volume 
of approximately 60 dB (B)² and the critical stimuli
at approximately 55 dB. 

Testing phase. After the exposure phase, the sub- 
jects were given the following instructions for the 
recognition test: 


I am now going to play six tone sequences that
sound much like simple melodies. Three of the
tone sequences you will hear were played to you
during the proofreading task. Three of the tone
sequences you will hear will be ones played to
you for the first time. What I would like you to 
do is to indicate, for each tone sequence, whether 
or not you think it was played during the proof- 
reading task. 


The subjects were then shown how to use the recognition
and confidence scales. The same set of instructions
was provided for the affective ratings,
with the exception that the subjects were asked to 
indicate how they felt toward each tone sequence. 
Up to 30 sec per stimulus was allowed for making a 
recognition or affective judgment. 

To control for possible order effects, judgment of 
affect and recognition were counterbalanced. Half of 
the subjects made recognition judgments first and
then gave affect ratings, while the opposite was true
for the other subjects. 


Results 
Recognition 


On the average, the subjects were able to 
identify correctly 3.54 of the test stimuli as 
OLD or NEW.³ Analysis of this average, in comparison
with the average recognition that
would be expected by chance (3.00), reveals 
that the subjects did better than chance, 
t(23) = 2.41, p < .05. Thus, the method of 
exposure employed in this experiment appears 
to be relatively effective for attenuating recognition;
however, it is not sufficient for eliminating
recognition entirely.⁴
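
The comparison against chance reported above is a one-sample t test of the number of correct classifications against an expected value of 3.00 (six test stimuli, two response alternatives). A minimal Python sketch of that computation follows; the function name `one_sample_t` and the example scores are hypothetical, since the individual subject scores are not reported.

```python
import math
from statistics import mean, stdev


def one_sample_t(scores, chance):
    """Return (t, df) for a one-sample t test of mean(scores) against chance."""
    n = len(scores)
    standard_error = stdev(scores) / math.sqrt(n)  # sample SD with n - 1 in the denominator
    return (mean(scores) - chance) / standard_error, n - 1


# Hypothetical scores for four subjects (not data from the study).
t_value, df = one_sample_t([3, 4, 5, 4], chance=3.0)
```

Working backward from the reported values, a mean of 3.54 and t(23) = 2.41 imply a standard error of roughly (3.54 - 3.00) / 2.41 ≈ 0.22 correct classifications.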

Affect


Responses on the affect scale were converted
to numbers ranging from 0 to 6, with
6 indicating the most positive response. Each
subject's ratings of stimuli in each of the four
cells (e.g., “OLD”/NEW) were then averaged.
If a subject did not provide ratings for stimuli
in a particular cell, for example, if no OLD
stimuli were identified as “OLD” by that subject,
that cell was given an estimated value
in the form of the subject’s average rating
of stimuli in all other cells combined. This
procedure allowed for a straightforward comparison
of the effects of objective and subjective
familiarity on affective rating of stimuli.
The means of both the adjusted and
unadjusted ratings appear in Table 1.
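
The cell-averaging and estimation rule just described can be sketched in Python as follows. This is an illustration under stated assumptions, not the original analysis code: the names `CELLS` and `cell_means`, the lowercase labels for the subject's judgments, and the example ratings are all hypothetical.

```python
from statistics import mean

# (objective familiarity, subject's judgment); uppercase = true status,
# lowercase = the subject's "OLD"/"NEW" response.
CELLS = [("OLD", "old"), ("OLD", "new"), ("NEW", "old"), ("NEW", "new")]


def cell_means(ratings):
    """ratings: list of (objective, judged, affect 0-6) tuples for one subject."""
    by_cell = {cell: [] for cell in CELLS}
    for objective, judged, affect in ratings:
        by_cell[(objective, judged)].append(affect)

    means = {}
    for cell, values in by_cell.items():
        if values:
            means[cell] = mean(values)
        else:
            # Empty cell: estimate it with the subject's average rating of
            # the stimuli in all other cells combined.
            others = [a for c, vals in by_cell.items() if c != cell for a in vals]
            means[cell] = mean(others)
    return means


# Hypothetical subject who never called a NEW stimulus "old".
example = [("OLD", "old", 5), ("OLD", "new", 4), ("OLD", "new", 3),
           ("NEW", "new", 2), ("NEW", "new", 2), ("NEW", "new", 2)]
adjusted = cell_means(example)
```

For this hypothetical subject the empty NEW/"old" cell is estimated as the mean of all six ratings (3.0), which corresponds to the "adjusted" values, whereas an unadjusted analysis would simply omit that cell.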

A 2 × 2 × 2 analysis of variance for repeated
measures, with one between-subjects
factor (order of judgment: affect or recognition
first) and two within-subjects factors
(objective familiarity: OLD vs. NEW; and
subjective familiarity: “OLD” vs. “NEW”),
was performed on the adjusted affect ratings.
The results show no significant main effect
for Order of Judgment, F(1, 22) < 1. They
do reveal, however, that the subjects rated
OLD stimuli more positively than NEW stimuli,
regardless of whether they were judged “OLD”
or “NEW,” F(1, 22) = 11.66, p < .005 (Table 1).


¹ The left ear was selected because research suggests
that, in general, musical patterns are better
processed and better recognized later when played
to the right hemisphere (cf. Bever & Chiarello, 1974;
Kimura, 1973).

² re 20 μN/m².

³ Stimuli that appeared in the training series are
designated as OLD, whereas those that appeared in
testing only are designated as NEW. The subjects’
responses are designated as “OLD” and “NEW.”

⁴ In earlier pilot research to establish the validity
of the recognition measure and to confirm the discriminability
of the tone sequences, 12 subjects were
asked to attend to the critical stimuli (rather than
the distractor passage) during the exposure phase
and then to do the recognition test. As might be
expected under these conditions, the subjects made
no errors (100% accuracy) in classifying previously
exposed tone sequences as “OLD” and tone sequences
exposed for the first time during testing as “NEW.”
Thus, there appears to be quite a substantial reduction
in recognition accuracy when the tone sequences
are presented on the unattended channel.


There was also a tendency for the subjects to rate stimuli that they
judged as “OLD” more positively than stimuli
judged as “NEW,” regardless of whether or
not the stimulus was objectively OLD or NEW,
F(1, 22) = 3.41, p < .10.

Even though the Objective Familiarity ×
Subjective Familiarity interaction is not significant,
it is of interest to examine, separately,
the ratings of objectively familiar stimuli
called “OLD” and “NEW.” For stimuli judged
“OLD,” truly OLD stimuli were liked better
than NEW stimuli, although this effect is not
significant, F(1, 22) = 2.23. This effect, however,
is reliable for stimuli judged “NEW”:
Truly OLD stimuli were liked better than truly
NEW stimuli, F(1, 22) = 7.40, p < .05. Thus
the exposure effect was obtained even in the
absence of the subject’s awareness of his own
differential experience with the stimuli.

The relationship between affect ratings and
subjective familiarity was further examined
by combining recognition and confidence judgments
into a single measure of subjective familiarity.
Each stimulus was classified as to
its degree of oldness, ranging from “OLD-SURE”
(6) to “NEW-SURE” (1). Then each subject's
average oldness rating and average affect
rating were correlated for OLD and NEW
stimuli, separately. The correlations obtained


Table 1
Average Stimulus Affect Ratings as a Function
of Objective Familiarity (OLD-NEW) and
Subjective Familiarity (“OLD”-“NEW”)

                 Subjective familiarity
Objective
familiarity     “OLD”     “NEW”      M

OLD              4.22      4.00      4.12
                (4.16)    (4.11)    (4.12)
NEW              3.17      3.04      3.30
                (3.33)    (3.19)    (3.23)
M             [illegible]  3.52
                (3.83)    (3.60)

Note. Figures without parentheses include estimated
values for some subjects. The number of subjects
for which values are estimated is 2, 4, 6, and [illegible]
for the “OLD”/OLD, “NEW”/OLD, “OLD”/NEW, and
“NEW”/NEW cells, respectively. Responses on the
affect scale range from 0 to 6, with 6 indicating the
most positive evaluation.


Table 2
Intercorrelations Among the Measures

Measure                        1         2           3         4

1. Affect                     1.00
2. Objective familiarity      .26**     1.00
3. Subjective familiarity     .07       .18*        1.00
4. Oldness                    .09     [illegible]   .88***    1.00

Note. Objective familiarity was controlled by the
experimenter and refers to the true degree of contact
that the subject had with the stimulus. The other
variables were under the subject's control. The oldness
response is comprised of both the subject's
familiarity (“OLD” or “NEW”) judgment and the
confidence (“SURE,” “HALF-SURE,” or “GUESS”)
rating of that judgment.


for NEW and OLD stimuli were −.29 and .14
(df = 22), respectively. In the case of objectively
NEW stimuli, a negative correlation indicates
that the more certain the subject was
that the stimulus was “NEW,” the better she
liked it. Conversely, in the case of OLD stimuli,
a positive correlation indicates that the
more certain the subject was that the stimulus
was “OLD,” the better she liked it. In short,
although neither of the above correlations is
significant, they are in the direction of a positive
relationship between liking and degree of
subjective certainty about the true familiarity
of the stimulus. Because the correlation was
positive for OLD stimuli and negative for NEW
stimuli, these results do not indicate that affective
ratings depend, to any significant degree,
on oldness ratings per se. If the latter
were true, we would expect a positive correlation
between oldness and affect ratings, for
both OLD and NEW stimuli.
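
For illustration, the oldness measure and the correlation computed from it can be sketched in Python. Only the endpoints of the scale (“OLD-SURE” = 6, “NEW-SURE” = 1) are given in the text; the intermediate codes in `OLDNESS` below are an assumed ordering, and `pearson_r` is a generic product-moment correlation rather than the author's own routine.

```python
import math
from statistics import mean

# Endpoints (6 and 1) come from the text; the four middle values are assumed.
OLDNESS = {
    ("old", "sure"): 6,
    ("old", "half-sure"): 5,
    ("old", "guess"): 4,
    ("new", "guess"): 3,
    ("new", "half-sure"): 2,
    ("new", "sure"): 1,
}


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

With one average oldness score and one average affect score per subject, computed separately for OLD and NEW stimuli, `pearson_r` would be applied to the 24 pairs in each set, giving df = 22 as reported.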

Intercorrelations among all the measures 
are provided in Table 2. 


Experiment 2 


The procedure employed in the first ex- 
periment proved to be relatively successful in 
attenuating recognition accuracy, but it did 
not reduce it to a chance level. Nevertheless, 
under those conditions, objective familiarity 
was found to influence affective ratings, independently
of other processes. To some extent,
subjective familiarity and subjective certainty 
were found to be related to affective judg- 
ments, as well. Whether these same effects 
would be observed under conditions in which 
subjects could not recognize the critical stim- 
uli at better than chance levels remained to 
be demonstrated. Therefore, in this second 
experiment, a modified procedure was em- 
ployed in order to eliminate all reliable recog- 
nition; during stimulus exposure, the subjects 
were required to verbally shadow the dis- 
tractor message and simultaneously do the 
proofreading task.

To gain a better understanding of the cog- 
nitive and affective changes that result when 
the critical stimuli are exposed in this unob- 
trusive fashion, two additional measures, 
found in past research to be sensitive to 
levels of stimulus familiarity, were employed. 
The first measure was latency of recognition
judgments. In general, familiar stimuli are
recognized not only better but also faster than
less familiar stimuli are (cf. Matlin, 1971).
It is possible that subjects respond faster 
to objectively familiar stimuli, even when 
they do not recognize them as familiar. This 
finding would suggest that, even in the ab- 
sence of stimulus recognition, there may be 
some form of differential cognitive processing 
of de facto familiar and unfamiliar stimuli. 

The second new measure was an indicator 
of the subject’s level of arousal when evaluat- 
ing each stimulus, namely, galvanic skin re- 
sponse (GSR). Both Berlyne et al. (1963) 
and Zajonc (1968) have reported that novel 
stimuli evoke a greater drop in skin resistance 
than do familiar stimuli. Even in the absence 
of stimulus recognition, such physiological 
differences in response to de facto familiar 
and unfamiliar stimuli may occur, given that 
changes in skin resistance are a function of 
information (about the stimulus) that is not
used in making the recognition judgment (see
Eriksen, 1960).


Method 
Subjects 


The subjects were 24 undergraduate women enrolled
in introductory psychology courses at the University
of Michigan. They were randomly drawn
from the Department of Psychology voluntary subject
pool.


Materials and Apparatus 

The critical stimuli and the distractor message
were the same as those employed in Experiment 1.
All the equipment used in the first experiment was
employed in this experiment. In addition, a response
panel was constructed to register the subject’s recognition
and confidence judgments. The subject made
her recognition and confidence judgments by pressing
the appropriate buttons on the response panel:
“OLD” or “NEW” for recognition, and “SURE,” “HALF-SURE,”
or “GUESS” for confidence judgments. The
subject’s response activated a light that indicated
to the experimenter which response had been made.
The latency of the subject’s response was recorded
by a clock, outside of the subject’s vision.

Changes in skin resistance (GSR) were measured
on an event recorder, connected to a control unit and
to Lykken zinc-zinc sulphate electrodes. The skin
resistance was recorded for each subject on a scale
calibrated at 500 k-ohms, with a 600 k-ohms limit.
This system relies on a 10-μV current (see Kaplan
& Hobart, 1964). Change in skin resistance was
arbitrarily defined as the distance in ohm units between
the lowest and highest point of resistance
within a 20-sec period immediately following stimulus
onset. Affect ratings were obtained in the same
way as in Experiment 1.
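
The definition of change in skin resistance given above amounts to taking the range of the resistance trace over a fixed window. A minimal Python sketch follows; the function name `gsr_change`, the representation of the trace as (time, resistance) pairs, and the inclusive window boundaries are assumptions made for illustration.

```python
def gsr_change(samples, onset, window=20.0):
    """Range of skin resistance (ohms) within `window` sec after stimulus onset.

    samples: list of (time_sec, resistance_ohms) pairs.
    """
    in_window = [r for t, r in samples if onset <= t <= onset + window]
    if not in_window:
        raise ValueError("no samples fall within the scoring window")
    # Distance between the highest and lowest point of resistance.
    return max(in_window) - min(in_window)


# Hypothetical trace: only the samples at 5, 10, and 25 sec fall in the window.
trace = [(0, 300_000), (5, 290_000), (10, 280_000), (25, 200_000), (35, 100_000)]
change = gsr_change(trace, onset=5)
```

Because the score is a range, it is always nonnegative and does not by itself indicate whether resistance rose or fell after the stimulus.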


Procedure


Exposure phase. When the subject arrived, the
experimenter connected the electrodes to the [illegible]
and third fingers of the subject’s nonpreferred hand.
The subject was told that the electrodes would cause
no inconvenience, other than that due to their presence,
and that the purpose of the electrodes would
be explained after the experiment. The subject was
then seated at the experimental desk and read the
following instructions:


You are about to hear, over these headphones, a
literary passage read by a young woman. I would
like you to do two things. First, repeat each word
that this person says, immediately after you hear
it. She will be speaking at a fairly fast rate, so
you will have to work hard to keep up with [...]
proofreading task where all noncorresponding
words are considered errors. In order to do your 
best at this task, you will have to listen very 
closely to what the speaker is saying. There will
be many errors in the written text, and your job 
is to put a slash through as many as you can dis- 
cover. If you have no questions, we will now have 
a short practice session. 


The subject was then given eight lines of practice,
followed by a restatement of the task requirements.
The rest of the exposure phase was the same as in
Experiment 1.

Testing phase. The instructions and procedure employed
for testing were comparable to those used in
Experiment 1. In this experiment, however, the subject
was instructed on how to use the apparatus for
indicating recognition judgments; also, the fact that
half of the stimuli were truly OLD was emphasized
again, just prior to testing.


Results 
Recognition 


The subjects identified stimuli as “OLD” or
“NEW,” independently of their true exposure
history, with an overall average level of accuracy
of only 3.17, t(23) < 1. Thus, the
method of exposure employed in this experiment
achieved the desired condition of chance
recognition.


Table 3
Average Stimulus Affect Ratings as a Function
of Objective Familiarity (OLD-NEW) and
Subjective Familiarity (“OLD”-“NEW”)

                 Subjective familiarity
Objective
familiarity     “OLD”     “NEW”      M

OLD              3.51      3.85      3.66
                (3.61)    (3.87)    (3.72)
NEW              3.03      3.02      3.03
                (3.08)    (3.06)    (3.07)
M                3.29      3.40
                (3.36)    (3.42)

Note. Figures without parentheses include estimated
values for some subjects. The number of subjects
for which values are estimated is 2, 5, 2, and 2
for the “OLD”/OLD, “NEW”/OLD, “OLD”/NEW, and
“NEW”/NEW cells, respectively. Responses on the
affect scale range from 0 to 6, with 6 indicating the
most positive evaluation.


Table 4
Average Recognition Judgment Latencies (in
sec) as a Function of Objective Familiarity
(OLD-NEW) and Subjective Familiarity
(“OLD”-“NEW”)

                 Subjective familiarity
Objective
familiarity     “OLD”      “NEW”       M

OLD             12.12      11.76      11.97
               (12.34)    (11.53)    (11.99)
NEW             13.35      11.17      12.26
               (13.52)    (11.61)    (12.57)
M               12.70      11.44
               (12.89)    (11.57)

Note. Figures without parentheses include estimated
values for some subjects. The number of subjects
for which values are estimated is 2, 5, 2, and 2
for the “OLD”/OLD, “NEW”/OLD, “OLD”/NEW, and
“NEW”/NEW cells, respectively.


Affect 


Liking ratings were scored and analyzed as
described in the first experiment. Again, no
significant effects were obtained for Order of
Judgment, F(1, 22) < 1. As we can see in
Table 3, OLD stimuli were rated more positively
than NEW stimuli, regardless of whether
they were judged “OLD” or “NEW,” F(1, 22)
= 5.71, p < .05, a finding that is consistent
with the one obtained in Experiment 1. Now,
however, subjective familiarity has no influence
on affect ratings, F(1, 22) < 1, and
once again, no significant effects are observed
for the Objective Familiarity × Subjective
Familiarity interaction, or for any other interaction.

When the main effect of objective familiarity
is decomposed, the pattern of results obtained
is similar to that found in the first experiment.
OLD stimuli were liked somewhat
more than NEW stimuli, even when both were
identified as “OLD”; but the effect was not
significant, F(1, 22) = 1.51. However, the
exposure effect was again obtained without
the subject having been aware of her differential
experience with the tone sequences, in
that OLD stimuli were liked significantly more
than NEW stimuli, even when both were
judged as “NEW,” F(1, 22) = 6.89, p < .05.

As in Experiment 1, a 6-point oldness measure
of subjective familiarity was obtained
by combining recognition and confidence judgments.
The average oldness ratings and the
average affect ratings were correlated for OLD
and NEW stimuli, separately. The coefficients
suggest little relation between affect and subjective
familiarity or subjective certainty:
for OLD stimuli r(22) = .09, and for NEW
stimuli r(22) = −.02.


GSR and Latency 


GSR and recognition latency scores were 
subjected to the same analysis as affect rat- 
ings. Little systematic variation in skin re- 
sistance was obtained among the different 
cells. Analysis revealed no significant effect
for order of judgment, objective familiarity,
subjective familiarity, or the OLD × “OLD”
interaction. 

Table 4 presents average latency responses. 
Neither order of judgment, F(1, 22) = 1.49, 
nor objective familiarity, F(1, 22) < 1, 
emerged as a significant factor. There is a reliable
difference between latency of “OLD” and
“NEW” judgments, F(1, 22) = 8.36, p < .01,
the average difference being 1.25 sec. There
were no significant interactions. 

Intercorrelations among all the measures 
are provided in Table 5. 


General Discussion 


When the attitudinal and learning effects
produced by repeated stimulus exposure are
unconfounded, an interesting pattern of results
emerges. First, reliable exposure effects
are obtained, that is, previously encountered
stimuli are rated more positively than novel
stimuli. Second, subjective judgments of stimulus
familiarity are not reliably related to affective
evaluations, even when confidence
ratings are considered. Together, these results
suggest that an individual’s positive feelings
toward a previously encountered object are
not dependent on his consciously knowing or
perceiving that the object is familiar. Such
findings have important implications for the
role of awareness in processes of attitude
formation under conditions of mere exposure.


Table 5
Intercorrelations Among the Measures

Measure                        1       2       3       4       5       6

1. Affect                     1.00
2. Objective familiarity      .18*    1.00
3. Subjective familiarity    −.02     .07     1.00
4. Oldness                   −.05     .10     .85**   1.00
5. Latency                    .06    −.06     .13     .08     1.00
6. Galvanic skin response    −.05    −.01     .05     .03    −.10    1.00

Note. Objective familiarity was controlled by the experimenter and
refers to the true degree of contact that
the subject had with the stimulus. The other variables were under the subject's control. The oldness response
is comprised of both the subject’s familiarity (“OLD” or “NEW”) judgment and the confidence (“SURE,”
“HALF-SURE,” or “GUESS”) rating of that judgment.
*p < .05.
**p < .001.


Explanations of the Exposure Effect


The first major explanation of the attitudinal
effects of exposure is the response
competition hypothesis proposed by Harrison
(1968). He suggested that when a stimulus
is first encountered, it elicits a number of
response tendencies that compete for emission.
This induction of response competition leads
to a state of tension in the individual, characterized
by “negative affect and a motivational
impetus to eliminate the tension” (p. 363).
With repeated exposure of the stimulus, however,
only a limited number of compatible
response tendencies achieve ascendancy. The
end result of repeated stimulus presentations
is a reduction in response competition, which
is reflected by an increment of positive affect
for the familiar stimulus. Although the point
is never made explicitly by Harrison, it is 
clear that stimulus recognition is a sine qua 
non for the reduction of response competition, 
that is, access to, and retrieval of, previously 
stored associations is contingent on the elicita- 
tion of a stable encoding response by the in- 
dividual over successive exposures (cf. Höff- 
ding, 1891). Thus, an individual must cor- 
rectly identify the stimulus in order to have 
access to past associations (response tend- 
encies) that have been linked to it (cf. Law- 
rence, 1963; Martin, 1968). 

A second major explanation of exposure ef- 
fects is a two-factor learning theory proposed 
by Berlyne (1970). He postulated that in- 
dividuals have a preference for some optimal 
level of arousal. When extremely novel or 
complex stimuli are encountered, arousal is 
increased above the optimal level and the 
stimuli elicit low or even negative affect in 
the individual. But with repeated encounters, 
the arousal potential of the stimuli declines 
because they are processed with greater ease 
and are better understood. Thus, familiar 
stimuli will be favorably evaluated following 
exposure, because the individual’s arousal 
level is modified back toward the optimum. 
Berlyne reasoned that “opportunities for fur- 
ther acquaintance with the pattern will pro- 
vide scope for the perceptual and ideational 
processing that enables uncertainty and con- 
flict to be resolved as elements are discrimi- 
nated, classified, recognized, and grouped to- 
gether as sub-wholes. This can evidently be a 
source of pleasure, presumably dependent on 
the arousal-reduction mechanism” (p. 285). 
Clearly, the role of stimulus recognition is
central to this theory as well.

Obviously, the present results raise ques- 
tions about the validity of the above theories 
or any other explanations of exposure effects 
that implicitly include overt stimulus recogni- 
tion as a prerequisite for obtaining enhanced 
affect (e.g., Burgess & Sales, 1971; Crandall,
1967). In the current experiments, it is un- 
likely that the subjects were able to engage 
in strategies to reduce uncertainties, form associations,
and/or systematically explore the
critical stimuli in order to promote subsequent
recognition ability. Furthermore, the data
clearly suggest that subjective familiarity,
veridical or otherwise, does not influence af- 
fective responses to previously encountered 
stimuli in a simple fashion, in such a way 
that the former always precedes and mediates 
the latter. Thus, one must question any the- 
ory that assumes that individuals must be 
aware of their past experiences with the stim- 
ulus as a sine qua non for obtaining the fre- 
quency-affect relationship. 

In a related vein, some researchers (e.g.,
Stang, 1974) have suggested that exposure 
effects may result from demand character- 
istics, because the subject is “aware” of the 
familiarity-affect hypothesis and responds accordingly.
Even granting that such awareness
was present in these experiments, it would be 
of little use to the subjects, since they could 
not reliably determine which stimuli they 
had heard previously. Thus it is difficult to 
interpret the present results as artifactual. 
One might argue, however, that if subjects 
were aware of the familiarity-affect hypothe- 
sis and were conforming to demand character- 
istics, they would consistently assign high af- 
fect ratings to stimuli that they judged to 
be familiar and low ratings to stimuli judged 
unfamiliar. The data reveal no such effects. 


The Role of Awareness 


Although the results from this study do 
not provide a sufficient basis for generating 
an alternative model to account for exposure 
effects, they do suggest that if learning pro- 
cesses underlie these effects, those processes 
may not require conscious awareness or effort 
by the subject.⁵ In other words, one might
argue that the subjects did engage in some 
form of learning during exposure (i.e., developed
expectancies, reduced uncertainty, alleviated
response competition, and the like)
but not at the level of conscious awareness. 
Such lack of awareness during exposure may 
have prevented the subjects from later reliably
retrieving, from memory, information
about the occurrence of the stimulus (cf.
Melton & Martin, 1972), but the affective response
was not impaired because learning
without awareness had occurred. The plausibility
of such an explanation is suggested by
recently developed models of human information
processing that question the necessity
(Erdelyi, 1974) and even the efficacy (Nisbett
& Wilson, 1977) of conscious awareness
of processes underlying affective reactions.


———

5 It is interesting to note that frequency information
appears to be encoded with little effort or conscious
intent by individuals (Hasher & Chromiak,
1977; Howell, 1973a, 1973b).

Related demonstrations of such effects have 
been reported in studies of perception and 
learning without awareness. For example, pic- 
tures presented too briefly (1 msec) to ob- 
tain correct verbal identification or recogni- 
tion responses nevertheless produced reliable 
differences in evoked potentials (average 
evoked response) and interpretable free as- 
sociations (Shevrin & Fritzler, 1968; Shevrin, 
Smith, & Fritzler, 1971). Employing a di- 
chotic-listening procedure, Konecni and Sla- 
mecka (1972) found that verbal conditioning 
effects could be obtained even when subjects 
were not aware of the verbal reinforcer presented
on the unattended channel.

The merit of a learning-without-awareness 
explanation of exposure effects needs, how- 
ever, to be demonstrated. No evidence in the 
present results can be invoked to support or 
deny such a process. Support for a theory
that postulates that the exposure effect is
mediated by a completely covert learning
process must demonstrate that this process
occurs prior to affect changes, and must employ
measures that are independent of such
changes.


Conclusions 


If exposure effects need not involve the in- 
dividual’s awareness of the familiarization 
process, then the influences of proximity and
familiarity on attitude formation may be 
even more pervasive than originally sus- 
pected. It has been established that objects 
and individuals we knowingly encounter or 
interact with (even incidentally) are likely 
to come to be regarded positively (Festinger,
Schachter, & Back, 1950; Saegert, Swap, & 
Zajonc, 1973; Segal, 1974; Zajonc, 1968).
The present results raise the possibility that
the objects and individuals we encounter in 
our environment are likely to engender a 
positive orientation, regardless of our knowl- 
edge of past encounters with them. Thus, in 
some cases, what we feel about objects or in- 
dividuals may precede our “awareness” of 
previous contact! One might speculate that, 
in the extreme case, unconsciously formed 
positive attitudes toward “novel” objects or
“strangers” in our environment stimulate both 
an awareness of their existence and conscious 
attempts to obtain knowledge about them. 
Whereas prior research has focused largely 
on how learning mediates affect, we should 
consider more seriously the Moreland and 
Zajonc (1977) proposition that changes in 
affect following exposure may influence learning.⁶

Obviously, the validity of such speculations 
must be determined by additional research. 
The results from the present study suggest 
that future work may require a departure 
from traditional methods of studying exposure 
effects and, to some extent, a change in as- 
sumptions about the conditions under which 
these effects can occur. Such research must 
determine whether familiarity mediates affect 
directly, without the intervention of other 
processes, and, if so, whether changes in affect
influence learning processes. Finally, we need 
to examine more explicitly the role of aware- 
ness in exposure effects, in order to better 
understand how familiarity breeds liking. 


— 


6 In a recent article, Birnbaum and Mellers (1979)
are critical of the partial correlation technique em- 
ployed by Moreland and Zajonc (1977) to deter- 
mine that affect and recognition are two independent 
outcomes of stimulus exposure (i.e., that these responses
are mediated by two different mechanisms).
They argue that a single mediator, labeled subjec- 
tive recognition, could account for both the affect 
and the recognition responses obtained by Moreland 
and Zajonc under the very likely conditions that 
measures of these two responses are not perfectly 
correlated. The present results appear to make some 
aspects of their thesis questionable. In contrast to 
Moreland and Zajonc’s technique, the recognition 
factor in the present study was partialed out of the 
effects experimentally rather than statistically, but 
affect was still found to be reliably related to ob- 
jective frequency. 


Many Hands Make Light the Work:
The Causes and Consequences of Social Loafing 


Bibb Latané, Kipling Williams, and Stephen Harkins 
Ohio State University 


Two experiments found that when asked to perform the physically exerting 
tasks of clapping and shouting, people exhibit a sizable decrease in individual 
effort when performing in groups as compared to when they perform alone. This 
decrease, which we call social loafing, is in addition to losses due to faulty co- 
ordination of group efforts. Social loafing is discussed in terms of its experi- 
mental generality and theoretical importance. The widespread occurrence, the 
negative consequences for society, and some conditions that can minimize social
loafing are also explored.


There is an old saying that “many hands 
make light the work.” This saying is interesting
for two reasons. First, it captures one of
the promises of social life—that with social 
organization people can fulfill their individual 
goals more easily through collective action. 
When many hands are available, people often 
do not have to work as hard as when only a 
few are present. The saying is interesting in 
a second, less hopeful way—it seems that 
when many hands are available, people ac- 
tually work less hard than they ought to. 

Over 50 years ago a German psychologist 
named Ringelmann did a study that he never 
managed to get published. In rare proof that 
unpublished work does not necessarily perish, 
the results of that study, reported only in 
summary form in German by Moede (1927), 
have been cited by Dashiell (1935), Davis 
(1969), Köhler (1927), and Zajonc (1966)
and extensively analyzed by Steiner (1966,
1972) and Ingham, Levinger, Graves, and 
Peckham (1974). Apparently Ringelmann 
simply asked German workers to pull as hard 
as they could on a rope, alone or with one, 
two, or seven other people, and then he used 
a strain gauge to measure how hard they 
pulled in kilograms of pressure. 

Rope pulling is, in Steiner’s (1972) useful 
classification of tasks, maximizing, unitary, 
and additive. In a maximizing task, success 
depends on how much or how rapidly some- 
thing is accomplished and presumably on how 
much effort is expended, as opposed to an 
optimizing task, in which precision, accuracy, 
or correctness are paramount. A unitary task
cannot be divided into separate subtasks—all 
members work together doing the same thing 
and no division of labor is possible. In an 
additive task, group success depends on the 
sum of the individual efforts, rather than on 
the performance of any subset of members. 
From these characteristics, we should expect 
three people pulling together on a rope with 
perfect efficiency to be able to exert three
times as much force as one person can, and 
eight people to exert eight times as much 
force.

Ringelmann’s results, however, were strik- 
ingly different. When pulling one at a time, 
individuals averaged a very respectable 63 kg 
of pressure. Groups of three people were able 
to exert a force of 160 kg, only two and a
half times the average individual performance,
and groups of eight pulled at 248 kg,
less than four times the solo rate. Thus the 
collective group performance, while increasing 
somewhat with group size, was substantially 
less than the sum of the individual efforts, 
with dyads pulling at 93% of the sum of 
their individual efforts, trios at 85%, and
groups of eight at only 49%. In a way some- 
what different from how the old saw would 
have it, many hands apparently made light 
the work. 
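
The trio and eight-person percentages can be rechecked from the forces reported above. The following is a minimal sketch, not part of Ringelmann's or Moede's analysis; the function name is ours, and the 63-kg solo average is taken from the text.

```python
def efficiency(group_force_kg: float, group_size: int, solo_force_kg: float = 63.0) -> float:
    """Group pull as a fraction of the sum of the members' solo pulls."""
    return group_force_kg / (group_size * solo_force_kg)


if __name__ == "__main__":
    # Forces reported in the text for trios and groups of eight.
    for size, force in [(3, 160.0), (8, 248.0)]:
        print(f"{size} people: {100 * efficiency(force, size):.0f}% of potential")
```

Run as a script, this prints 85% for trios and 49% for groups of eight.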

The Ringelmann effect is interesting be- 
cause it seems to violate both common stereo- 
type and social psychological theory. Com- 
mon stereotype tells us that the sense of team 
participation leads to increased effort, that 
group morale and cohesiveness spur individual 
enthusiasm, that by pulling together groups 
can achieve any goal, that in unity there is 
strength. Social psychological theory holds 
that, at least for simple, well-learned tasks 
involving dominant responses, the presence 
of other people, whether as co-workers or 
spectators, should facilitate performance. It 
is thus important to find out whether Ringel- 
mann’s effect is replicable and whether it can 
be obtained with other tasks. 

The Ringelmann effect is also interesting 
because it provides a different arena for test- 
ing a new theory of social impact (Latané, 
1973). Social impact theory holds that when 
a person stands as a target of social forces 
coming from other persons, the amount of 
social pressure on the target person should 
increase as a multiplicative function of the 
strength, immediacy, and number of these 
other persons. However, if a person is a
member of a group that is the target of social 
forces from outside the group, the impact of 
these forces on any given member should 
diminish in inverse proportion to the strength, 
immediacy, and number of group members. 
Impact is divided up among the group members,
in much the same way that the responsibility
for helping seems to be divided
among witnesses to an emergency (Latané & Darley,
1970). Latané further suggests that just as
psychophysical reactions to external stimuli
can be described in terms of a power law
(Stevens, 1957), so also should reactions to
social stimuli, but with an exponent having
an absolute value less than 1, so that the nth
person should have less effect than the
(n − 1)th. Ringelmann’s asking his workers
to pull on a rope can be considered social 
pressure. The more people who are the target 
of this pressure, the less pressure should be 
felt by any one person. Since people are likely 
to work hard in proportion to the pressure 
they feel to do so, we should expect increased 
group size to result in reduced efforts on the 
part of individual group members. These re- 
duced efforts can be called “social loafing”—a 
decrease in individual effort due to the social 
presence of other persons. With respect to 
the Ringelmann phenomenon, social impact 
theory suggests that at least some of the ef- 
fect should be due to reduced efforts on the 
part of group participants, and that this re- 
duced effort should follow the form of an 
inverse power function having an exponent 
with an absolute value less than one. 
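
The shape of this prediction can be sketched numerically. The code below is an illustration under stated assumptions rather than the theory's formal statement: the function names are ours, and the exponent of 0.5 is a hypothetical value chosen only because it lies between 0 and 1, as the theory requires.

```python
def per_person_effort(group_size: int, exponent: float = 0.5, solo_effort: float = 1.0) -> float:
    """Effort per member as an inverse power function of group size.

    `exponent` is a hypothetical value; the theory only requires 0 < exponent < 1.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    if not 0.0 < exponent < 1.0:
        raise ValueError("exponent must lie strictly between 0 and 1")
    return solo_effort * group_size ** (-exponent)


def group_output(group_size: int, exponent: float = 0.5, solo_effort: float = 1.0) -> float:
    """Total output if members' efforts simply add (no coordination loss)."""
    return group_size * per_person_effort(group_size, exponent, solo_effort)


if __name__ == "__main__":
    for n in (1, 2, 4, 6, 8):
        print(n, round(per_person_effort(n), 3), round(group_output(n), 3))
```

With any exponent in this range, effort per person falls as the group grows while total output still rises, which is the qualitative pattern the Ringelmann data show.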

The Ringelmann effect is interesting for a 
third reason: If it represents a general phe- 
nomenon and is not restricted to pulling on a 
rope, it poses the important practical question 
of when and why collective efforts are less 
efficient than individual ones. Since many 
components of our standard of life are pro- 
duced through one form or another of collec- 
tive action, research identifying the causes 
and conditions of inefficient group output 
and suggesting strategies to overcome these 
inefficiencies is clearly desirable. 

For these three and other reasons, we de- 
cided to initiate a program of research into 
the collective performance of individuals in
groups.


Experiment 1
Clap Your Hands and Shout Out Loud


One of the disadvantages of Ringelmann’s 
rope pulling task is that the equipment and 
procedures are relatively cumbersome and in- 
efficient. Therefore, we decided to keep our 
ears open for other tasks that would allow us 
to replicate the Ringelmann finding concep- 
tually and would provide the basis for ex- 
tended empirical and theoretical analysis. We 
chose cheering and clapping, two activities
that people commonly do together in social
settings and that are maximizing, unitary, 
and additive. As with rope pulling, output 
can be measured in simple physical units that 
make up a ratio scale. 


Method 


On eight separate occasions, groups of six under- 
graduate males were recruited from introductory 
psychology classes at Ohio State University; they 
were seated in a semicircle, 1 m apart, in a large 
soundproofed laboratory and told, “We are interested 
in judgments of how much noise people make in 
social settings, namely cheering and applause, and 
how loud they seem to those who hear them. Thus,
we want each of you to do two things: (1) Make 
noises, and (2) judge noises.” They were told that 
on each trial “the experimenter will tell you the 
trial number, who is to perform and whether you 
are to cheer (Rah!) or clap. When you are to begin, 
the experimenter will count backwards from three 
and raise his hand. Continue until he lowers it. We 
would like you to clap or cheer for 5 seconds as 
loud as you can.” On each trial, both the performers
and the observers were also asked to make magnitude 
estimates of how much noise had been produced 
(Stevens, 1966). Since these data are not relevant 
to our concerns, we will not mention them further.

After some practice at both producing and judg- 
ing noise, there were 36 trials of yelling and 36 trials 
of clapping. Within each modality, each person per- 
formed twice alone, four times in pairs, four times 
in groups of four, and six times in groups of six. 
These frequencies were chosen as a compromise 
between equating the number of occasions on which 
we measured people making noise alone or in groups 
(which would have required more noisemaking in 
fours and sixes) and equating the number of indi- 
vidual performances contributing to our measure- 
ments in the various group sizes (which would have 
required more noisemaking by individuals and pairs). 
We also arranged the sequence of performances to 
space and counterbalance the order of conditions 
over each block of 36 trials, while making sure that 
no one had to perform more than twice in a row. 

Performances were measured with a General Radio 
sound-level meter, Model 1565A, using the C scale 
and the slow time constant, which was placed ex- 
actly 4 m away from each performer. The C scale 
was used so that sounds varying only in frequency 
or pitch would be recorded as equally loud. Sound- 
level meters are read in decibel (dB) units, which 
are intended to approximate the human reaction to 
sound. For our purposes, however, the appropriate 
measure is the effort used in generating noise, not 
how loud it sounds. Therefore, our results are presented
in terms of dynes/cm², the physical unit of
work involved in producing sound pressure. 
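
The conversion from meter readings to sound pressure can be sketched as follows. The text does not give the formula; this sketch assumes the standard sound-pressure-level reference of 0.0002 dynes/cm² (20 micropascals), and the function and constant names are ours.

```python
# Standard reference pressure for dB SPL; assumed here, not stated in the text.
REFERENCE_PRESSURE = 0.0002  # dynes/cm^2 (equal to 20 micropascals)


def db_to_dynes_per_cm2(level_db: float) -> float:
    """Convert a sound pressure level in dB to sound pressure in dynes/cm^2."""
    return REFERENCE_PRESSURE * 10 ** (level_db / 20)


if __name__ == "__main__":
    for level in (84, 87, 91, 95):
        print(level, "dB ->", round(db_to_dynes_per_cm2(level), 2), "dynes/cm^2")
```

Because pressure grows as 10 raised to dB/20, a 6-dB increment multiplies pressure by about 2, and a reading of 84 dB corresponds to roughly 3.2 dynes/cm² under this reference.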

Because people shouted and clapped in full view 
and earshot of each other, each person’s performance
could affect and be affected by the others. For this
reason, the group, rather than the individual, was 
the unit of analysis, and each score was based on 
the average output per person. Results were analyzed
in a 4 × 2 × 2 analysis of variance, with Group
Size (1, 2, 4, 6), Response Mode (clapping vs. shout- 
ing), and Replications (1, 2) as factors. 

Results 


Participants seemed to adapt to the task 
with good humor if not great enthusiasm.
Nobody refused to clap or shout, even though 
a number seemed somewhat embarrassed or 
shy about making these noises in public. 
Despite this, they did manage to produce a 
good deal of noise. Individuals averaged 84 
dB (C) clapping and 87 dB cheering, while 
groups of six clapped at 91 dB and shouted at 
95 dB (an increment of 6 dB represents a 
doubling of sound pressure). 

As might be expected, the more people 
clapping or cheering together, the more in- 
tense the noise and the more the sound pres- 
sure produced. However, it did not grow in 
proportion to the number of people: The 
average sound pressure generated per person 
decreased with increasing group size, F (3, 21) 
= 41.5, p < .001. People averaged about 3.7 
dynes/cm² alone, 2.6 in pairs, 1.8 in foursomes,
and about 1.5 in groups of six (Figure
1). Put another way, two-person groups performed
at only 71% of the sum of their individual
capacity, four-person groups at
51%, and six-person groups at 40%. As in 
pulling ropes, it appears that when it comes 
to clapping and shouting out loud, many 
hands do, in fact, make light the work. 

People also produced about 60% more 
sound power when they shouted than when 
they clapped, F(1, 7) = 8.79, p < .01, presumably
reflecting physical capacity rather
than any psychological process. There was no
effect due to blocks of trials, indicating that
the subjects needed little or no practice and
that their performance was not deleteriously
affected by fatigue. In addition, there were
no interactions among the variables.


Discussion 


The results provide a strong replication of
Ringelmann’s original findings, using a completely
different task and in a different historical
epoch and culture. At least when people
are making noise as part of a task imposed
by someone else, voices raised together
do not seem to be raised as much as voices 
raised alone, and the sound of 12 hands clap- 
ping is not even three times as intense as the 
sound of 2. 

Zajonc’s (1965) elegant theory of social 
facilitation suggests that people are aroused 
by the mere presence of others and are thus 
likely to work harder (though not necessarily 
to achieve more) when together. Although 
social facilitation theory might seem to pre- 
dict enhanced group performance on a simple 
task like clapping or shouting, in the present 
case it would not predict any effect due to 
group size, since the number of people present 
was always eight, six participants and two 
experimenters. Evaluation apprehension the- 
ory (Cottrell, 1972) would also not predict 
any effect as long as it is assumed that co- 
actors and audience members are equally ef- 
fective in arousing performance anxiety. 
Therefore, these theories are not inconsistent 
with our position that an unrelated social 
process is involved. The results of Experiment 
1 also can be taken as support for Latané’s 
(1973) theory of social impact: The impact 
that the experimenters have on an individual 
seems to decrease as the number of coper- 
formers increases, leading to an apparent 
drop in individual performance, a phenomenon
we call social loafing.

However, there is an alternative explana- 
tion to these results. It may be, not that 
people exert less effort in groups, but that 
the group product suffers as a result of group
inefficiency. In his invaluable theoretical anal- 
ysis of group productivity, Steiner (1972) 
suggests that the discrepancy between a 
group’s potential productivity (in this case
n times the average individual output) and 
its actual productivity may be attributed to
faulty social process. In the case of Ringelmann’s
rope pull, Steiner identifies one source
of process loss as inadequate social coordination.
As group size increases, the number of
“coordination links,” and thus the possibility
of faulty coordination (pulling in different
directions at different times), also increases.
Steiner shows that for Ringelmann’s original
data the decrement in obtained productivity
is exactly proportional to the number of coordination
links.

[Figure 1. Intensity of noise as a function of group size and response mode, Experiment 1. Vertical axis: sound pressure per person in dynes per cm²; horizontal axis: group size.]
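
The proportionality claim can be checked roughly against the Ringelmann figures quoted earlier. This is a sketch under an assumption the text does not spell out: that coordination links are counted as the number of distinct pairs of members, n(n − 1)/2. The names are ours, and the dyad force is reconstructed from the reported 93%.

```python
SOLO_KG = 63.0  # average solo pull reported for Ringelmann's workers


def coordination_links(n: int) -> int:
    """Number of distinct pairs among n members (assumed definition of a link)."""
    return n * (n - 1) // 2


def loss_per_link(n: int, group_force_kg: float) -> float:
    """Shortfall from potential productivity, divided by the number of links."""
    potential = n * SOLO_KG
    return (potential - group_force_kg) / coordination_links(n)


# Group size -> group pull in kg; the dyad value is derived from the reported 93%.
OBSERVED = {2: 0.93 * 2 * SOLO_KG, 3: 160.0, 8: 248.0}

if __name__ == "__main__":
    for n, force in OBSERVED.items():
        print(n, coordination_links(n), round(loss_per_link(n, force), 1))
```

With these rounded inputs the loss comes to roughly 9 kg per link at each group size, which is in line with the proportionality Steiner describes.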

Ingham et al. (1974) designed an ingenious 
experiment to determine whether the process 
losses found in rope pulling were mainly due 
to problems of coordinating individual ef- 
forts and the physics of the task, or whether 
they resulted from reductions in personal ex- 
ertion (what we have called social loafing). 
First, they conducted a careful replication 
of Ringelmann’s original rope-pulling study 
and found similar results—dyads pulled at 
91% of the sum of their individual capacities, 
trios at 82%, and groups of six at only 78%. 

In a second experiment, Ingham et al. 
cleverly arranged things so that only the indi- 
vidual’s perception of group size was varied. 
Individuals were blindfolded and led to be- 
lieve that others were pulling with them, but 
in fact, they always pulled alone. Under these 
conditions, of course, there is no possibility 
of loss due to faulty synchronization. Still 
there was a substantial drop in output with 
increases in perceived group size: Individuals
pulled at 90% of their alone rate when
they believed one other person was also pull- 
ing, and at only 85% with two to six others 
believed pulling. It appears that virtually all 
of the performance decrement in rope pull- 
ing observed by Ingham et al. can be ac- 
counted for in terms of reduced effort or so- 
cial loafing. 

With respect to clapping and especially 
shouting, however, there are several possible 
sources of coordination loss that might have 
operated in addition to social loafing: (a) 
Sound cancellation will occur to the extent 
that sound pressure waves interfere with each 
other, (b) directional coordination losses will 
occur to the extent that voices are projected 
toward different locations, and (c) temporal 
coordination losses will occur to the extent 
that moment-to-moment individual variations 
in intensity are not in synchrony. Our second 
experiment was designed to assess the rela- 
tive effects of coordination loss and social 
loafing in explaining the failure of group 
cheering to be as intense as the sum of individual
noise outputs.


Experiment 2 
Coordination Loss or Reduced Effort? 


For Experiment 2 we arranged things so 
that people could not hear each other shout;
participants were asked to wear headphones,
and during each trial a constant 90-dB recording
of six people shouting was played
over the earphones, ostensibly to reduce auditory
feedback and to signal each trial. As a
consequence, individuals could be led to believe
they were shouting in groups while actually
shouting alone. Ingham et al. (1974)
accomplished this through the use of “pseudosubjects,”
confederates who pretended to be
pulling with the participants but who in fact
did not pull any weight at all. That is an
expensive procedure: each of the 36 participants
tested by Ingham et al. required the
services of 5 pseudosubjects as well as the 
experimenter. We were able to devise a pro- 
cedure whereby, on any given trial, one per- 
son could be led to believe that he was per- 
forming in a group, while the rest thought
he was performing alone. Thus, we were able
to test six real participants at one time. 
Additionally, although we find the inter- 
pretation offered by Ingham et al. plausible 
and convincing, the results of their second 
experiment are susceptible to an alternative 
explanation. When participants were not pulling
the rope, they stood and watched the
pseudosubjects pull. This would lead people 
accurately to believe that while they were 
pulling the rope, idle participants would be 
watching (Levinger, Note 1). Thus, as the 
number of performers decreased, the size of 
the audience increased. According to Cottrell’s 
evaluation apprehension hypothesis (1972), 
the presence of an evaluative audience should 
enhance performance for a simple, well- 
learned task such as rope pulling, and, al- 
though there is little supportive evidence, it 
seems reasonable that the larger the audience, 
the greater the enhancement (Martens & 
Landers, 1969; Seta, Paulus, & Schkade, 
1976). Thus, it is not clear whether there 
was a reduced effort put forth by group mem- 
bers because they believed other people were 
pulling with them, or an increase in the effort 
exerted by individuals because they believed 
other people were watching them. In Experi- 
ment 2, therefore, we arranged to hold the 
size of the audience constant, even while vary- 
ing the number of people working together. 


Method 


Six groups of six male undergraduate volunteers 
heard the following instructions: 


In our experiment today we are interested in 
the effects of sensory feedback on the production
of sound in social groups. We will ask you to
produce sounds in groups of one, two, or six, and
we will record the sound output on the sound-level
meter that you can see up here in front. Although
this is not a competition and you will not
learn your scores until the end of the experiment,
we would like you to make your sounds as loud
as possible. Since we are interested in sensory feedback,
we will ask you to wear blindfolds and earphones
and, as you will see, will arrange it so
that you will not be able to hear yourself as you
shout.


We realize it may seem strange to you to shout
as loud as you can, especially since other people
are around. Remember that the room is soundproofed
and that people outside the room will not
be able to hear you. In addition, because you will
be wearing blindfolds and headsets, the other par- 
ticipants will not be able to hear you or to see 
you. Please, therefore, feel free to let loose and 
really shout. As I said, we are interested in how 
loud you can shout, and there is no reason not to 
do your best. Here’s your chance to really give it a 
try. Do you have any questions? 


Once participants had donned their headsets and 
blindfolds, they went through a series of 13 trials, 
in which each person shouted four times in a group 
of six, once in a group of two, and once by himself. 
Before each trial they heard the identification letters 
of those people who were to shout. 

Interspersed with these trials were 12 trials, two 
for each participant, in which the individual's head- 
set was switched to a separate track on the stereo- 
phonic instruction tape. On these trials, everybody 
else was told that only the focal person should 
shout, but that individual was led to believe either 
that one other person would shout with him or that 
all six would shout. 

Thus, each person shouted by himself, in actual 
groups of two and six, and in pseudogroups of two 
and six, with trials arranged so that each person 
would have approximately equal rest periods between 
the trials on which he performed. Each trial was 
preceded by the specification of who was to perform. 
The yells were coordinated by a tape-recorded voice 
counting backwards from three, followed by a constant
90-dB 5-sec recording of the sound of six
people shouting. This background noise made it im- 
possible for performers to determine whether or 
how loudly other people were shouting, or, for that 
matter, to hear themselves shout. Each trial was 
terminated by the sound of a bell. This sequence of 
25 trials was repeated three times, for a total of 75 
trials, in the course of which each subject shouted 
24 times.
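
The trial counts in this design can be tallied as a consistency check. This is a sketch only: the function name is ours, and it assumes that each pair trial used two different people, so that six single pair performances fill three trials.

```python
def trial_counts(group_size: int = 6, blocks: int = 3) -> dict:
    """Tally trials and shouts implied by the Experiment 2 design."""
    # Actual-group trials per block: everyone shouts four times in the full
    # group, once in a pair, and once alone.
    six_trials = 4
    pair_trials = group_size // 2   # assumes non-overlapping pairs
    alone_trials = group_size
    actual = six_trials + pair_trials + alone_trials
    # Pseudogroup trials per block: two per participant.
    pseudo = 2 * group_size
    per_block = actual + pseudo
    shouts_per_person_per_block = 4 + 1 + 1 + 2
    return {
        "actual_trials": actual,
        "pseudo_trials": pseudo,
        "trials_per_block": per_block,
        "total_trials": blocks * per_block,
        "shouts_per_person": blocks * shouts_per_person_per_block,
    }


if __name__ == "__main__":
    print(trial_counts())
```

Under that assumption the tally reproduces the 13 actual-group trials, 12 pseudogroup trials, 25 trials per block, 75 trials overall, and 24 shouts per participant described in the text.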

As in Experiment 1, the data were transformed 
into dynes/cm² and subjected to analyses of variance,
with the group as the unit of analysis and
each score based on the average output per person.
Two separate 3 × 3 analyses of variance with group
size (1, 2, 6) and trial block (1-3) were run, one on
the output of trials in which groups actually shouted
together, and one on the pseudogroup trials in which
only one person actually shouted. 


Results 


[Figure 2. Intensity of sound produced per person when cheering in actual or perceived groups of 1, 2, and 6, as a result of reduced effort and faulty coordination of group efforts, Experiment 2. Vertical axis: sound pressure per person in dynes per cm²; horizontal axis: group size (1, 2, 6).]

Overall, participants shouted with considerably
more intensity in Experiment 2 than
in Experiment 1, averaging [illegible] dynes/cm²
when shouting alone, compared with [illegible]
dynes/cm², t(12) = 4.05, p < .01. There are
several plausible reasons for this difference.
The new rationale involving the effects of reduced
sensory feedback may have interested
or challenged individuals to perform well.
The constant 90-dB background noise may 
have led people to shout with more intensity, 
just as someone listening to music through 
headphones will often speak inappropriately 
loudly (the Lombard reflex). The performers 
may have felt less embarrassed because the 
room was soundproof and the others were 
unable to see or hear them. Finally, through 
eliminating the possibility of hearing each 
other, individuals could no longer be in- 
fluenced by the output of the others, thereby 
lifting the pressure of social conformity. 

As in Experiment 1, as the number of actual
performers increased, the total sound output
also increased, but at a slower rate than
would be expected from the sum of the indi- 
vidual outputs. Actual groups of two shouted 
at only 66% of capacity, and groups of six 
at 36%, F(2, 10) = 226, p < .001. The comparable
figures for Experiment 1 are 71% and
40%. These similarities between experiments 
suggest that our procedural changes, even 
though they made people unable to hear or 
see each other, did not eliminate their feeling
of being in a group or reduce the amount
of incoordination or social loafing. 

The line connecting the solid circles in 
Figure 2 shows the decreased output per per- 
son when actually performing in groups. The 
dashed line along the top represents potential 
productivity—the output to be expected if 
there were no losses due to faulty coordina- 
tion or to social loafing. The striped area at 
the bottom represents the obtained output 
per person in actual groups. Output is obviously
lower than potential productivity, and
this decrease can be considered as represent- 
ing the sum of the losses due to incoordina- 
tion and to reduced individual effort. 

In addition to shouting in actual groups, 
individuals also performed in pseudogroups in 
which they believed that others shouted with 
them but in which they actually shouted 
alone, thus preventing coordination loss from 
affecting output. As shown in Figure 2, peo- 
ple shouted with less intensity in pseudo- 
groups than when alone, F(2, 10) = 37.0, p 
< .0001. Thus, group size made a significant 
difference even in pseudogroups in which co- 
ordination loss is not a factor and only social 
loafing can operate. 

When performers believed one other person 
was yelling, they shouted 82% as intensely as 
when alone, and when they believed five 
others to be yelling, they shouted 74% as 
intensely. The stippled area defined at the 
top of Figure 2 by the data from the pseudo- 
groups represents the amount of loss due to 
social loafing. By subtraction, we can infer 
that the white area of Figure 2 represents the 
amount of loss due to faulty coordination.
Since the latter comprises about the same area 
as the former, we can conclude that, for shout- 
ing, half the performance loss decrement is 
due to incoordination and half is due to so- 
cial loafing. 


Discussion 


Despite the methodological differences be- 
tween Experiments 1 and 2, both experiments 
showed that there is a reduction in sound 
pressure produced per person when people
make noise in groups compared to when 
alone. People in Experiment 1 applauded and 
cheered in full view of each other, with all
the excitement, embarrassment, and conformity
that goes along with such a situation. In
Experiment 2, no one could see or hear any 
other person. Only the experimenters could 
see the people perform. And finally, the ra- 
tionale changed drastically, from the experi- 
menters’ interest in “judgments of how much 
noise people make in social settings” to their
interest in “the effects of sensory feedback 
on the production of sound in social groups.” 
Yet, despite differences in the task charac- 
teristics and supposed purpose, the two studies 
produced similar results. This points to the 
robust nature of both the phenomenon and 
the paradigm. 


General Discussion 
Noise Production as Group Performance 


Although we do not usually think about it 
that way, making noise can be hard work, in 
both the physical and the psychological sense. 
In the present case, the participants were 
asked to produce sound pressure waves, either
by rapidly vibrating their laryngeal membranes
or by vigorously striking their hands
together. Although superficially similar in
consequence, this task should not be con- 
fused with more normal outbreaks of shouting 
and clapping that occur as spontaneous outbursts
of exuberant expressiveness. Our participants
shouted and clapped because we
asked them to, not because they wanted to.
This effortful and fatiguing task resulted in
sound pressure waves, which, although invisible,
can be easily and accurately measured
in physical units that are proportional to the
amount of work performed. The making of
noise is a useful task for the study of group
processes from the standpoint both of production
and of measurement—people are practiced
and skilled at making noise and can do
so without the help of expensive or cumbersome
apparatus, and acoustics and audio engineering
are sufficiently advanced to permit
sophisticated data collection. We seem to
have found a paradigm wherein people get
involved enough to try hard and become somewhat
enthusiastic, yet the task is still effortful
enough so that they loaf when given the
opportunity.


The Causes of Social Loafing


The present research shows that groups 
can inhibit the productivity of individuals so 
that people reduce their exertions when it 
comes to shouting and clapping with others. 
Why does this occur? We suggest three lines 
of explanation, the first having to do with at- 
tribution and equity, the second with sub- 
maximal goal setting, and the third with the 
lessening of the contingency between individ- 
ual inputs and outcomes. 

1. Attribution and equity. It may be that 
participants engaged in a faulty attribution 
process, leading to an attempt to maintain an 
equitable division of labor. There are at least 
three aspects of the physics and psychophysics 
of producing sound that could have led peo- 
ple to believe that the other persons in their 
group were not working as hard or effectively 
as themselves. First, individuals judged their 
own outputs to be louder than those of the 
others, simply because they were closer to 
the sound source. Second, even if everyone 
worked to capacity, sound cancellation would 
cause group outputs to seem much less than 
the sum of their individual performances. 
Finally, the perception of the amount of 
sound produced in a group should be much 
less than the actual amount—growing only 
as the .67 power of the actual amount of 
sound, according to Stevens’s psychophysical 
power law (1975). 
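
The third point can be made concrete with a small calculation. The sketch below is illustrative only: it takes the .67 exponent from the text and assumes, for simplicity, that n people produce n times the sound of one person (that is, it ignores sound cancellation). The function name is ours, not part of the original study.

```python
# Illustrative sketch: perceived magnitude under a psychophysical power law.
# The .67 exponent is taken from the text; everything else is an assumption.
LOUDNESS_EXPONENT = 0.67


def perceived_ratio(actual_ratio, exponent=LOUDNESS_EXPONENT):
    """Return how many times louder a sound seems when it is actual_ratio times larger."""
    return actual_ratio ** exponent


for group_size in (1, 2, 6):
    # Assumes n people produce n times the sound of one person.
    print(group_size, round(perceived_ratio(group_size), 2))
```

Under these assumptions, six people shouting at full capacity would seem only a little more than three times as loud as one person, which could feed the impression that the others are holding back.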

These factors may have led individuals to 
believe that the other participants were less 
motivated or less skillful than themselves— 
in short, were shirkers or incompetents. Thus, 
differences in the perception of sound produc- 
tion that were essentially the result of physi- 
cal and psychophysical processes may have 
been mistakenly attributed to a lack of either 
skill or motivation on the part of the others, 
leading individuals to produce less sound in 
groups because there is no reason to work 
hard in aid of shirkers or those who are less
competent.

This process cannot explain the results of
Experiment 2, since the capacity to judge the
loudness of one’s own output, much less that
of others, was severely impaired by the 90-dB 
background masking noise used to signal the 
trials. However, rather than “discovering” so- 
cial loafing while participating in the experi- 
ment, the participants may have arrived with 
the preexisting notion that people often do 
not pull their own weight in groups. Thus, 
despite being unable to hear or see one an- 
other, lack of trust and the propensity to at- 
tribute laziness or ineptitude to others could 
have led people to work less hard themselves. 

2. Submaximal goal setting. It may be 
that despite our instructions, participants re- 
defined the task and adopted a goal, not of 
making as much noise as possible, but merely 
of making enough noise or of matching some 
more or less well-defined standard. Individ- 
uals would clearly expect it to be easier to 
achieve this goal when others are helping, and 
might work less hard as a consequence. This, 
of course, would change the nature of noise 
production from what Steiner (1972) would 
term a maximizing task to an optimizing 
task. A maximizing task makes success a func- 
tion of how much or how rapidly something 
is accomplished. For an optimizing task, however,
success is a function of how closely the
individual or group approximates a prede- 
termined “best” or correct outcome. If par- 
ticipants in our experiments perceived sound 
production as an optimizing rather than a 
maximizing task, they might feel the optimal 
level of sound output could be reached more 
easily in groups than alone, thereby allowing 
them to exert less effort. 

The participants in Experiment 2 could 
hear neither themselves nor others and would 
not be able to determine whether their output 
was obnoxious or to develop a group standard 
for an optimal level. Furthermore, in both 
experiments, the experimenters reiterated 
their request to yell “as loud as you can, 
every time,” over and over again. Before the 
first trial they would ask the group how loud 
they were supposed to yell. In unison, the 
group would reply, “As loud as we can!” We 
think it unlikely that participants perceived 
the task to be anything other than maxi- 
mizing. 

3. Lessened contingency between input and
outcome. It may be that participants felt
that the contingency between their input and 
the outcome was lessened when performing 
in groups. Individuals could “hide in the 
crowd” (Davis, 1969) and avoid the negative 
consequences of slacking off, or they may 
have felt “lost in the crowd” and unable to 
obtain their fair share of the positive con- 
sequences for working hard. Since individual 
scores are unidentifiable when groups perform 
together, people can receive neither precise 
credit nor appropriate blame for their per- 
formance. Only when performing alone can 
individual outputs be exactly evaluated and 
rewarded.

Let us assume that group members expect 
approval or other reward proportional to the 
total output of a group of n performers, but 
that since individual efforts are indistinguish- 
able, the reward is psychologically divided 
equally among the participants, each getting 
1/n units of reward. Under these assump- 
tions, the average group, if it performed up 
to capacity and suffered no process loss, could 
expect to divide up n times the reward of
the average individual, resulting in each member’s
getting n × 1/n, or n/n, units of reward,
the same amount as an individual.

Although the total amount of reward may 
be the same, the contingency on individual 
output is not. Any given individual under 
these assumptions will get back only one nth
of his own contribution to the group; the 
rest will be shared by the others. Even though 
he may also receive unearned one nth of 
each other person’s contribution, he will be 
tempted, to the extent that his own perform- 
ance is costly or effortful, to become a “free 
rider” (Olson, 1965). Thus, under these as- 
sumptions, if his own performance cannot be 
individually monitored, an individual’s incentive
to perform should be proportional to
1/n.
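
The argument above can be sketched numerically. The code below is a minimal illustration under the stated assumptions (reward proportional to total group output, divided equally among n members); the function names and the unit effort values are hypothetical.

```python
# Minimal sketch of the equal-division assumption described above.
# Reward is taken to equal total group output, split equally among members.


def reward_per_member(efforts):
    """Each member's share when the group's total output is divided equally."""
    return sum(efforts) / len(efforts)


def marginal_return(group_size, extra_effort=1.0):
    """Extra reward one member gets back per unit of extra effort he or she adds."""
    baseline = [1.0] * group_size
    boosted = [1.0 + extra_effort] + [1.0] * (group_size - 1)
    return (reward_per_member(boosted) - reward_per_member(baseline)) / extra_effort


for n in (1, 2, 6):
    # With everyone working at capacity, each member still receives 1.0 unit,
    # but the return on one's own additional effort shrinks to 1/n.
    print(n, reward_per_member([1.0] * n), round(marginal_return(n), 3))
```

The total reward per member is unchanged by group size, while the return on one's own additional effort falls as 1/n, which is the contingency loss the argument turns on.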

Seligman (1975) has shown that animals 
and people become lethargic and depressed 
when confronted with tasks in which they 
have little or no control over the outcomes.
Likewise, in our experiments, people may 
have felt a loss of control over their fair 
share of the rewards when they performed in 
groups, leading them also to become, if not 
lethargic and depressed, at least less enthusiastic
about making lots of noise.

Since people were asked to shout both alone 
and in groups, they may have felt it smart to 
save their strength in groups and to shout as 
lustily as possible when scores were individ- 
ually identifiable, marshalling their energy 
for the occasions when they could earn rewards.
This line of reasoning suggests that if
inputs were made identifiable and rewards 
contingent on them, even when in groups, it 
would be impossible for performers to get a 
free ride and they would have an incentive to 
work equally hard in groups of different sizes. 


Social Loafing and Social Impact Theory 


Each of these three lines of explanation 
may be described in terms of Latané’s (1973) 
theory of social impact. If a person is the 
target of social forces, increasing the number 
of other persons also in the target group 
should diminish the pressures on each indi- 
vidual because the impact is divided among 
the group members. In a group performance 
situation in which pressures to work come 
from outside the group and individual out- 
puts are not identifiable, this division of im- 
pact should lead each individual to work less 
hard. Thus, whether the subject is dividing 
up the amount of work he thinks should be 
performed or whether he is dividing up the 
amount of reward he expects to earn with his 
work, he should work less hard in groups.

The theory of social impact further stipu- 
lates the form that the decrease in output 
should follow. Just as perceptual judgments 
of physical stimuli follow power functions 
(Stevens, 1957), so also should judgments of 
social stimuli, and the psychosocial
power function should have an exponent
of less than one, resulting in a marginally decreasing
impact of additional people. Thus,
social impact theory suggests that the amount 
of effort expended on group tasks should de- 
crease as an inverse power function of the 
number of people in the group. This implication
cannot be tested in Experiment 1 or with
the actual groups of Experiment 2, inasmuch
as coordination loss is confounded with social
loafing. However, a power function with an
exponent of -.14 accounted for 93% of the
variance for the pseudogroups of Experiment 
2. It appears that social impact theory pro- 
vides a good account of both the existence 
and the magnitude of social loafing. 
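
For readers who want to see what such a fit involves, the sketch below fits a power function to the rounded pseudogroup percentages quoted earlier (100% alone, 82% with one believed other, 74% with five believed others). It is an illustration only: it assumes an ordinary least-squares fit in log-log coordinates, which may not be the procedure used for the reported figures, and because it works from rounded group means it should not be expected to reproduce the reported exponent or variance accounted for exactly.

```python
import math

# Pseudogroup output relative to shouting alone, as rounded in the text.
# Keys are believed group sizes (self plus the others believed to be shouting).
observed = {1: 1.00, 2: 0.82, 6: 0.74}


def fit_power_function(points):
    """Least-squares fit of log(y) = log(a) + b * log(n).

    Returns (a, b, r_squared) for the model y = a * n ** b.
    """
    xs = [math.log(n) for n in points]
    ys = [math.log(y) for y in points.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b, sxy ** 2 / (sxx * syy)


coefficient, exponent, r_squared = fit_power_function(observed)
print(f"output = {coefficient:.2f} * n ** {exponent:.2f}, r^2 = {r_squared:.2f}")
```

With these rounded inputs the fitted exponent is negative and small in magnitude, in the same neighborhood as the reported value though somewhat steeper.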


The Transsituational and Transcultural 
Generality of Social Loafing 


The present research demonstrates that 
performance losses in groups occur with tasks 
other than rope pulling and with people other 
than prewar German workers. There are, in 
addition, other instances of experimental re- 
search that demonstrate similar cases of social 
loafing. For example, Marriott (1949) and 
Campbell (1952) have shown that factory 
workers produce less per person in larger 
groups than in smaller ones. Latané and Dar- 
ley (1970) have found that the likelihood 
that a bystander will intervene in a situation 
in which someone requires assistance is sub- 
stantially reduced by the addition of other 
bystanders who share in the responsibility for 
help. Wicker (1969) has found that the pro- 
portion of members taking part in church 
activities is lower in large than in small 
churches, presumably because the responsibil- 
ity for taking part is more diffuse. Similarly, 
Petty, Harkins, Williams, and Latané (1977) 
found that people perceived themselves as 
exerting less cognitive effort on evaluating 
poems and editorials when they were among 
groups of other unidentifiable evaluators than 
when they alone were responsible for the task. 

These experimental findings have demon- 
strated that a clear potential exists in human 
nature for social loafing. We suspect that the 
effects of social loafing have far-reaching and 
profound consequences both in our culture 
and in other cultures. For example, on col- 
lective farms (kolkhoz) in Russia, the peas- 
ants “move all over huge areas, working one 
field and one task one day, another field the 
next, having no sense of responsibility and 
no direct dependence on the results of their 
labor” (Smith, 1976, p. 281). Each peasant
family is also allowed a private plot of up to 
an acre in size that may be worked after the 
responsibility to the collective is discharged. 
The produce of these plots, for which the 
peasants are individually responsible, may be
used as they see fit. Although these plots oc- 
cupy less than 1% of the nation’s agricultural 
lands (about 26 million acres), they pro- 
duce 27% of the total value of Soviet farm 
output (about $32.5 billion worth) (Yemelyanov,
1975, cited in Smith, 1976, p. 266).
It is not, however, that the private sector is 
so highly efficient; rather, it is that the ef- 
ficiency of the public sector is so low (Wa- 
dekin, 1973, p. 67). 

However, before we become overly pessi- 
mistic about the potential of collective effort, 
we should consider the Israeli kibbutz, an 
example that suggests that the effects of so- 
cial loafing can be circumvented. Despite the 
fact that kibbutzim are often located in re- 
mote and undeveloped areas on the periphery 
of Israel to protect the borders and develop 
these regions, these communes have been very 
successful. For example, in dairying, 1963 
yields per cow on the kibbutz were 27% 
higher than for the rest of Israel’s herds, and 
in 1960 yields were 75% higher than in En- 
gland. In 1959, kibbutz chickens were pro- 
ducing 22% of the eggs with only 16% of 
the chickens (Leon, 1969). The kibbutz and 
the kolkhoz represent the range of possi- 
bilities for collective effort, and comparisons 
of these two types of collective enterprise may 
suggest conditions under which per person 
output would be greater in groups than individually.


Social Loafing as a Social Disease 


Although some people still think science 
should be value free, we must confess that we 
think social loafing can be regarded as a 
kind of social disease. It is a “disease” in 
that it has negative consequences for indi- 
viduals, social institutions, and societies. So- 
cial loafing results in a reduction in human 
efficiency, which leads to lowered profits and 
lowered benefits for all. It is “social” in that 
it results from the presence or actions of other 
people. 

The “cure,” however, is not to do away 
with groups, because despite their inefficiency, 
groups make possible the achievement of 
many goals that individuals alone could not 
possibly accomplish. Collective action is a
vital aspect of our lives: From time immemo- 
rial it has made possible the construction of 
monuments, but today it is necessary to the 
provision of even our food and shelter. We 
think the cure will come from finding ways 
of channeling social forces so that the group 
can serve as a means of intensifying individual 
responsibility rather than diffusing it.


Reference Note 
1. Levinger, G. Personal communication, June 1976.
Validity of Experimental Settings in Social Psychology


Daryl J. Bem
Cornell University


Charles G. Lord
Stanford University


The question of ecological validity—Do people behave in real life as they be- 
have in our laboratories?—is central to experimental social psychology. The 
template-matching technique is introduced as a way to explore such validity. 
First, each laboratory behavior is characterized by a template, a personality 
description (Q sort) of the hypothetical ideal person most likely to display that 
behavior. These templates characterize subjects’ potential behaviors inside the
laboratory. Next, Q-sort personality descriptions of the subjects are obtained
from their peers; these characterize the subjects’ behaviors outside the
laboratory. Evidence for the ecological validity of the laboratory setting is then
sought in the match between these two sets of data. Using this technique,
support was found for the ecological validity of the mixed-motive game, a
widely used laboratory paradigm whose validity is often questioned. The
heuristic use of Q-sort data to draw clinical portraits of the subjects is also
illustrated.


Do people behave in real life as they be-
have in our experimental laboratories? This
question, the question of ecological validity,
is central to experimental social psychology,
and the possibility that the answer might be
no is central to the field’s recent crisis of con-
fidence (e.g., McGuire, 1973; Ring, 1967).
Not surprisingly, the strongest doubts are
provoked by the field’s more dramatic experi-
ments: Is there really a plausible analogue
between a subject’s obediently delivering
shocks to a fellow subject in Milgram’s lab-
oratory and an Eichmann’s obediently de-
livering Jews to their deaths in Nazi Germany
(Baumrind, 1964; Milgram, 1974; Orne &
Holland, 1968)? Does a Zimbardo prison with 
college students playing the roles of guards 
and prisoners really capture any of the func- 
tional properties of actual incarceration (Ba- 
nuazizi & Movahedi, 1975; Haney, Banks, 
& Zimbardo, 1973)? But even the field’s more 
mundane experimental paradigms must bear 
the same burden of proof. Do the cooperative 
and competitive choices of subjects in the 
Prisoner’s Dilemma game, for example, have 
any parallels with the cooperative and competitive
choices of individuals in the real
world—let alone the foreign policy choices of 
nation states? The concern is not new: Even 
well before the current crisis, Carl Hovland 
was agonizing over the fact that laboratory 
and field studies of persuasive communication 
produced widely divergent results (Hovland, 
1959). 

Social psychologists have usually attempted to
establish the external validity of their lab- 
oratory procedures in two basic ways. The 
first strategy, often proposed but rarely per- 
formed, is to conduct conceptually parallel 
laboratory experiments and field studies in 
tandem, in order to confirm the assumed 
isomorphism directly. The second strategy is
to demonstrate the construct validity of the 
laboratory-assessed concepts, to show that 
the behavior relates to other variables as the 
theory says it should. For example, one’s faith 
in the ring-toss game as a situation that 
elicits achievement motivation is bolstered 
by findings that individuals high on the mo- 
tive stand a moderate distance from the peg 
as the theory of achievement motivation im- 
plies (McClelland, 1958). Similarly, one’s 
confidence in a paradigm’s external validity 
is enhanced if individual difference variables 
relate to the laboratory behaviors in theo- 
retically compelling ways. 


The Template-Matching Technique 


In this article, we propose a novel exten- 
sion of the individual-differences approach to 
assessing the degree to which subjects’ behav- 
iors in the laboratory match their behaviors 
in everyday life. It is called the template- 
matching technique, a general method intro- 
duced by Bem and Funder (1978) for charac- 
terizing situations in the language of person- 
ality description. Applied to the problem of 
ecological validity, the technique consists of 
two basic steps. First, each conceptually dis- 
tinct behavior in the laboratory setting under 
investigation is characterized by a template, 
a personality description of the hypothetical 
ideal person who is most likely to display 
the designated behavior in that situation. 
These templates serve to characterize the 
potential behaviors of subjects inside the 
laboratory. Second, personality descriptions 
of the subjects are obtained from close ac- 
quaintances; these descriptions characterize 
the subjects’ behaviors outside the laboratory. 
The ecological validity of the laboratory set- 
ting, then, is sought in the match between 
these two sets of data. Specifically, the model 
predicts that the description of the individual’s
personality will match most closely the
template that characterizes the behavior he 
or she actually displays in the laboratory. A 
complete discussion of the procedure is provided
by Bem and Funder (1978), who have
applied it to the delay-of-gratification situa- 
tion and the forced-compliance experiment, 
demonstrating how it can be used to pinpoint
sources of cross-situational inconsistency and
to test rival theories of psychological phe- 
nomena. 


The Mixed-Motive Game 


To illustrate the utility of the template-matching
technique for addressing the issue
of ecological validity, we have chosen to apply
it to one of the most frequently used—
and frequently criticized—experimental paradigms
in social psychology, the mixed-motive
game (e.g., the Prisoner’s Dilemma game).
It will be recalled that in this procedure, two
subjects are required simultaneously to make
a series of decisions independently of each
other and with no communication between
them. The payoff to each subject is contingent
on both of their choices. Thus in the Prisoner’s
Dilemma game, the most familiar of the
mixed-motive games, when both subjects select
the cooperative response, their joint payoff
is high; when both select the competitive
response, their joint payoff is low; when one
subject chooses the cooperative response and
the other the competitive response, the latter
(the defector) earns points or money at the
expense of the cooperator. The mixed-motive
game is a particularly appropriate test case
here, not only because it has been extensively
used, but also because the debate between its
proponents (e.g., Kelley et al., 1970) and its
critics (e.g., Knox & Douglas, 1971; Nemeth,
1972; Pruitt, 1967) nicely epitomizes the
field’s more general concern with the issue of
ecological validity.

Most of the studies using the mixed-motive
game are concerned not with its external
validity but with the internal structure of
the situation, the ways in which a subject’s
playing strategy varies as a function of the
payoff matrix, communication variables, or
preprogrammed features of the opponent’s
strategy. (See, for example, reviews by
Nemeth, 1972, and Oskamp, 1971. See also
Kuhlman & Marshello, 1975.) Such studies
yield an internally consistent picture of cooperation
and competition within the paradigm
but provide very little in the way of independent
evidence for its ecological validity.

There appear to be no studies that directly
compare a laboratory experiment and a
situation, and most of the studies that do
bear upon the issue of external validity are 
those that explore group or individual differ- 
ences. For example, army cadets are more 
competitive in mixed-motive games than 
Berkeley students (Pilisuk, Kiritz, & Clam- 
pitt, 1971); individuals high on the need for 
power are more competitive than those low 
on this motive (Terhune, 1968); isolationists 
are more competitive than internationalists 
(McClintock, Harrison, Strand, & Gallo, 
1963); high authoritarians are more competi- 
tive than low authoritarians (Wrightsman, 
1966); externals are more competitive than 
internals (Tiller, 1970). And, in a study that 
goes beyond the simple cooperative-competi- 
tive dichotomy by utilizing the decomposed 
versions of mixed-motive games introduced 
by Messick and McClintock (1968), it was 
found that subjects who systematically at- 
tempted to maximize the joint gain of both 
players (called J’s) scored higher on the Af- 
filiation and Intraception scales but lower on 
the Aggression scale of the Edwards Personal 
Preference Schedule (Edwards, 1959) than 
did subjects who attempted to maximize their 
own gain (O’s) or subjects who attempted to 
maximize their gain relative to the other per- 
son (R’s) (Bennett & Carbonari, 1976). 

In the study reported here, we also used 
the decomposed mixed-motive games. Spe- 
cifically, we constructed templates of three 
hypothetical persons: the ideal J, the ideal O,
and the ideal R. Each subject in the actual 
experiment was then classified on the basis 
of his or her responses as a J, an O, or an R, 
and it was predicted that the subject's simi- 
larity to the corresponding template would 
be higher than his or her similarity to the 
templates characterizing the two alternative 
strategies. 


Method 


The Mixed-Motive Game and Its 
Decomposition 


In a mixed-motive game, two subjects are separated
from one another and required simultaneously to
select from Alternatives A and B in the payoff
matrix shown in Table 1. This matrix is an example
of the Prisoner’s Dilemma game in which joint
cooperation (A responses) earns the players the maximum
joint payoff, joint competition (B responses)
yields minimum joint payoff, and defection to the
competitive response (B) earns the defector maximum
payoff when the other player selects the cooperative
response (A).

Table 1
Payoff Matrix for the Prisoner’s Dilemma Game

Other person’s                Your choice
choice              A                  B
A                   You earn 100       You earn 110
                    Other earns 100    Other earns 70
B                   You earn 70        You earn 80
                    Other earns 110    Other earns 80

Every mixed-motive game can also be decomposed 
into a situation in which the subject makes his or 
her own choices without direct simultaneous refer- 
ence to the other subject’s behavior. For example, 
the full-matrix mixed-motive game decomposes into 
the following decision alternatives for such a subject: 


                    Your choice

                    A      B
You earn            60     70
Other earns         40     10


Here the subject chooses between A and B while 
being told that the other subject is making the same 
choice concurrently. Note that if the subject chooses 
Alternative A, he or she assigns 60 points to the self 
and 40 points to the other subject; if he or she 
chooses Alternative B, 70 points are assigned to the 
self and 10 points to the other subject. The game is 
isomorphic to the full-matrix version because if both 
subjects select Alternative A, each earns 100 points; 
if both select B, each earns 80 points; and if one 
selects A and the other selects B, the former earns 
70 points and the latter earns 110 points. Thus
the two games are formally equivalent, but the de- 
composed game is played without trial-by-trial feed- 
back, and it focuses attention on each player’s indi- 
vidual strategy, unconfounded by two-person inter- 
actions. Note, for example, that a subject who wishes 
to maximize the joint outcome of both players will 
choose Alternative A, whereas a subject who wishes 
to maximize his or her own payoff will select Alterna- 
tive B. The elegance of this feature is best illustrated 
by the Triple Dominance game in which subjects 
choose from among three alternatives: 


                    Your choice

                    A      B      C
You earn            40     50     40
Other earns         40     20     0

Note that each alternative satisfies one, and only
one, of the three strategies (J, O, or R) discussed 
earlier. The subject who selects Alternative A as- 
signs 40 points to the self and 40 points to the other 
person, thus maximizing the joint return for both 
players (80 points versus 70 or 40 points for the 
other alternatives); the subject who chooses Alternative
B assigns 50 points to the self, thereby maximizing
his or her own return; and the subject who
chooses Alternative C assigns 40 points to the self 
and 0 points to the other person, maximizing his or 
her return relative to the other person (40 points 
versus 0 or 30 points). 
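
The arithmetic behind both decomposed games can be checked with a short sketch. The payoff values are taken from the text; the dictionary layout and function names are our own illustrative choices.

```python
# Decomposed games: each choice maps to (points to self, points to other).
DECOMPOSED_PD = {"A": (60, 40), "B": (70, 10)}
TRIPLE_DOMINANCE = {"A": (40, 40), "B": (50, 20), "C": (40, 0)}


def total_payoff(own_choice, other_choice, game=DECOMPOSED_PD):
    """Points a player ends up with: own allocation to self plus the other's allocation to them."""
    return game[own_choice][0] + game[other_choice][1]


def best_choice(game, criterion):
    """Alternative that maximizes criterion(points_to_self, points_to_other)."""
    return max(game, key=lambda choice: criterion(*game[choice]))


# Recombining the decomposed Prisoner's Dilemma gives back the Table 1 payoffs.
for own in "AB":
    for other in "AB":
        print(f"you {own}, other {other}: you earn {total_payoff(own, other)}")

# Each strategy singles out a different Triple Dominance alternative.
strategies = {
    "J": best_choice(TRIPLE_DOMINANCE, lambda self_pts, other_pts: self_pts + other_pts),
    "O": best_choice(TRIPLE_DOMINANCE, lambda self_pts, other_pts: self_pts),
    "R": best_choice(TRIPLE_DOMINANCE, lambda self_pts, other_pts: self_pts - other_pts),
}
print(strategies)
```

Because each criterion has a unique maximum in the Triple Dominance game, a subject's choice identifies his or her strategy without ambiguity.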

The Prisoner’s Dilemma game and the Triple 
Dominance game are but two of the possible mixed- 
motive games that can be used. (See, for example, 
Kuhlman & Marshello, 1975.) 


Constructing the Templates 


To operationalize the template concept, Bem and 
Funder (1978) employ the Q-sort technique, utiliz- 
ing a modification of the California Q set of items 
devised by Block (1961/1978). The Q set consists of 
100 descriptive personality statements (e.g., “behaves
in a giving way toward others”), which are sorted 
by an assessor into nine categories ranging from the 
least to the most characteristic of the person being 
described; each item thus receives a score from 1 
to 9. Two Q sorts can be compared with one another
by computing a Pearson product-moment correlation
across items, thus expressing directly
and quantitatively the degree of similarity between 
the two profiles. Similarly, if one has constructed a 
Q sort of some hypothetical ideal personality (our 
template), one can correlate the Q sorts of actual 
individuals with this idealized sort in order to as- 
sess their similarities to the ideal type. 
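
As an illustration of this similarity index, the sketch below correlates two Q sorts item by item. The five-item sorts are hypothetical stand-ins for the 100-item Q set, and the function name is ours.

```python
from math import sqrt


def q_sort_similarity(sort_a, sort_b):
    """Pearson product-moment correlation between two Q sorts, computed across items."""
    if len(sort_a) != len(sort_b):
        raise ValueError("Q sorts must rate the same items")
    n = len(sort_a)
    mean_a = sum(sort_a) / n
    mean_b = sum(sort_b) / n
    covariation = sum((a - mean_a) * (b - mean_b) for a, b in zip(sort_a, sort_b))
    variation_a = sum((a - mean_a) ** 2 for a in sort_a)
    variation_b = sum((b - mean_b) ** 2 for b in sort_b)
    return covariation / sqrt(variation_a * variation_b)


# Hypothetical item scores on the 1-9 scale (real sorts have 100 items).
person = [9, 7, 5, 3, 1]
template = [8, 6, 5, 4, 2]
opposite = [1, 3, 5, 7, 9]

print(round(q_sort_similarity(person, template), 2))
print(round(q_sort_similarity(person, opposite), 2))
```

A sort that orders the items the same way as the template yields a similarity near +1, and one that reverses the ordering yields a similarity near -1.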

As noted above, the Q sort of an individual is ob- 
tained by having an assessor actually sort the 100 
items into nine categories. Constructing a template,
however, requires a different method for assigning 
numerical values to the items because only a small 
number of items will be relevant to any particular 
template. Accordingly, Bem and Funder (1978) constructed
templates by “adjusting” the composite Q
sort of the entire subject sample according to the
following formula:


$Q_i = M_i + w_i \sigma_i$


where $Q_i$ is the adjusted or template value of the
ith item, $M_i$ and $\sigma_i$ are the mean and standard deviation,
respectively, of the ith item for the subject
sample as a whole, and $w_i$ is a weighting factor that
reflects the relevance of the ith item to the criterion
behavior being characterized.

Suppose, for example, that we wish to construct 
a template of the person who pursues the J strategy, 
the strategy that maximizes the joint return of both 
players. The Q item “is physically attractive” is
conceptually irrelevant to this behavior, and accordingly,
it is assigned a relevance weight ($w_i$) of
zero. The adjusted value of this item in the template,
then, simply becomes the mean of the item in the
subject sample. But now consider the item “behaves 
in a giving way toward others,” which would seem 
relevant to the behavior implied by the J strategy. If 
we assign this a weight of, say, +2, then the tem- 
plate value of this item will be adjusted upward from 
the sample mean by two standard deviations, and— 
the model predicts—the resulting template will be 
more similar to the Q sorts of individuals who pursue 
the J strategy than to the Q sorts of individuals 
pursuing the alternative strategies. A complete dis- 
cussion of this template-construction algorithm will 
be found in Bem and Funder (1978).
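
The adjustment just described can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names (`build_template`, `pearson`), the toy two-item data, and the choice of the sample standard deviation are not taken from Bem and Funder (1978), which should be consulted for the actual algorithm.

```python
from math import sqrt
from statistics import mean, stdev


def build_template(sample_sorts, weights):
    """Adjust each item's sample mean by weight * standard deviation.

    sample_sorts: list of Q sorts (one list of item values per subject).
    weights: one relevance weight per item (hypothetical values here).
    """
    template = []
    for i, w in enumerate(weights):
        values = [sort[i] for sort in sample_sorts]
        # Assumption: sample (n - 1) standard deviation; the source does not say.
        template.append(mean(values) + w * stdev(values))
    return template


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy data: four subjects, two items (not real Q-sort data).
# Item 0 stands in for an irrelevant item (weight 0),
# item 1 for a relevant item (weight +2).
sample_sorts = [[4, 3], [6, 5], [5, 7], [5, 5]]
weights = [0, 2]
template = build_template(sample_sorts, weights)
```

In this toy case the irrelevant item keeps its sample mean, and the relevant item is moved two standard deviations above its mean. With a full 100-item sort, a subject's similarity to the template would be `pearson(subject_sort, template)`.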

In the present study, we constructed the tem- 
plates by recruiting five graduate students in psy- 
chology to serve as judges. They were supplied with 
a written description of the three strategies along 
with the illustrative Triple Dominance game shown 
above and the complete list of Q-sort items. Each 
judge went through the list and gave an integer
rating to each item, ranging from −2 (“very
uncharacteristic of a person following this strategy”)
through 0 (“not relevant; neither characteristic nor
uncharacteristic of the person following this
strategy”) to +2 (“very characteristic of the person
following this strategy”). For example, the item
“behaves in a giving way toward others” received a
mean rating of +1.6 for the J strategy, and the item
“is power oriented; values power in self and others”
received a mean rating of +1.2 for the R strategy,
the strategy in which subjects attempt to maximize
their earnings relative to the other person. The
judges were explicitly directed not to play
personality theorist, not to speculate about remote
personality correlates of the behavior, but rather to
judge only whether an item might or might not
characterize the behavior involved in the strategy
itself. The judges completed this procedure for each
of the three strategies. The mean rating given an
item by the five judges then became the relevance
weight ($w_i$) for that item in the template-generation
algorithm described previously. Separate templates
were constructed for each sex by using the same set
of weights ($w_i$s) but the separate means and standard
deviations ($M_i$s and $\sigma_i$s) for each sex. This
procedure, then, yielded a set of templates for the


situation, each characterizing one of the three “pure”
strategies that could be pursued.

1 Even though this index of similarity is computed
like a product-moment correlation, it cannot [illegible];
there is [illegible] of zero. Moreover, the magnitude
of the correlation is a function of factors other than
the component of template-specific similarity in which
we are interested. (See Bem & Funder, 1978.) To avoid
confusion on this issue, we routinely convert all
similarity scores to standardized T scores with means
of 50 and standard deviations of 10.

For purposes of comparison, a second set of tem- 
plates was constructed by enlisting the aid of 
Spencer Kagan, an investigator who has worked ex- 
tensively with the decomposed mixed-motive games 
and who, in fact, distinguishes not 3, but 16 distinct 
strategies in his own work (Kagan, 1972, 1977). 
Moreover, because Kagan is trained as a clinician, 
we asked him to supply a set of weights ($w_i$s) for 
the three strategies by drawing as heavily as he 
wished on his accumulated experience and clinical 
intuitions, instructions quite different from the con- 
servative and restrictive guidelines we imposed upon 
our graduate-student judges. 


The Experiment 


Forty-eight male and 44 female Stanford Univer- 
sity students who had previously provided self Q 
sorts in an unrelated setting were recruited to par- 
ticipate in a 1-hr “decision-making” experiment. For 
their participation, they received experimental credit 
points in introductory psychology as well as the 
money they earned during the experiment. In the 
experimental session, four subjects of the same sex 
were seated in separate experimental booths and told 
that they would be anonymously paired with each 
of the other subjects in turn and asked to make a 
series of decisions. In fact, the subjects did not ac- 
tually play against one another; rather, they played 
simultaneously against a preprogrammed sequence of 
decisions throughout the experiment. They then made 
choices on 24 decomposed mixed-motive games with 
no feedback concerning the “other subject's” deci- 
sions. Each payoff matrix was projected on a screen 
which could be seen by all subjects from their
separate booths. They then played 48 full-matrix mixed-
motive games (allegedly 16 with each of the other 
three subjects in turn) with the usual trial-by-trial 
feedback about the joint payoff. After the session, 
they were debriefed and paid $2, the amount they 
would have earned with optimal play and maximally 
cooperative opponents. 

During the following academic quarter, we contacted
10 subjects who had systematically adopted
the J strategy, 10 subjects who had adopted the O
strategy, and the entire sample of 10 subjects who
had adopted the R strategy during the experiment.
Each group contained 5 males and 5 females. All 
these subjects gave us permission to obtain confiden- 
tial Q sort descriptions of them from one of their 
roommates. The roommates came to our laboratory
and were paid $2 to complete the Q sorts; they were
not told about the subjects’ participation in our
experiment the previous quarter. (Since students in
introductory psychology participate in several studies,
it is unlikely that these roommates knew of the
experiment or its relation to the information they
were providing.) Finally, the experimenters for the
several phases of the study were always blind with
respect to all other information about the subjects
and the study.


Results and Discussion


The first step in the analysis of the data 
was to classify each subject as a J, an O, or 
an R, according to the criterion that he or 
she must have selected 50% or more of the 
choices compatible with that strategy on each 
of the four types of decomposed games. Under 
this criterion, we classified 36% of our sub- 
jects as J’s, 39% as O’s, and 11% as R’s, 
leaving 14% unclassified. This grouping con- 
tains relatively fewer R’s and unclassified 
subjects than reported by Kuhlman and 
Marshello (1975), who used the same clas- 
sification criterion. 
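
The classification rule can be stated compactly in code. The sketch below assumes a hypothetical input format: for each strategy, the proportion of a subject's choices compatible with it on each of the four types of decomposed games. The text does not say how a subject who met the criterion for more than one strategy would be handled; here such cases are simply left unclassified.

```python
THRESHOLD = 0.50  # "50% or more" of compatible choices


def classify(shares):
    """Return 'J', 'O', 'R', or None (unclassified).

    shares: dict mapping a strategy label to four proportions, one per
    type of decomposed game (hypothetical input format).
    """
    qualifying = [
        strategy
        for strategy, per_type in shares.items()
        if all(p >= THRESHOLD for p in per_type)
    ]
    # Assumption: exactly one strategy must qualify; otherwise unclassified.
    return qualifying[0] if len(qualifying) == 1 else None


# Hypothetical subject who favors the joint-return choice on every game type.
example = {
    "J": [0.83, 0.67, 0.50, 1.00],
    "O": [0.17, 0.33, 0.33, 0.00],
    "R": [0.00, 0.00, 0.17, 0.00],
}
label = classify(example)
```

A subject who falls below 50% on even one game type for every strategy is returned as `None`, corresponding to the unclassified group.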

We next confirmed that behavior on the 
decomposed games does, in fact, predict be- 
havior on the more familiar full-matrix games 
employed in most previous studies. For ex- 
ample, those who adopted the strategy of 
trying to maximize the joint return of both 
themselves and the other person in the de- 
composed games (J’s) selected the coopera- 
tive response on full-matrix games 1% of 
the time; those who adopted the strategy of 
maximizing their own return (O's) selected 
the cooperative response 57% of the time; 
and those who adopted the strategy of try- 
ing to maximize their return relative to the 
other subject (R’s) selected the cooperative 
response 31% of the time. This result is 
highly significant, F = 20.47, p < .001, with 
no overlap between the 95% confidence in- 
tervals for the three groups. Clearly, then, we 
can generalize from the decomposed games to 
the full-matrix games, a fact that has not 
previously been established using a variety of 
full-matrix mixed-motive games (cf. Kuhl- 
man & Marshello, 1975). 


Assessing Ecological Validity 

As we noted earlier, the roommates’ Q 
sorts of our subjects characterize their be- 
haviors outside the laboratory, and the tem- 
plates characterize their behaviors within the 
laboratory. Accordingly, the ecological valid- 
ity of the laboratory situation is sought in 
the match between these two sets of data. 
Specifically, we predict that each subject’s 
roommate Q sort will be more similar to the 
template characterizing his or her adopted
strategy in the laboratory than to the templates
characterizing the two alternative
strategies.

To test this hypothesis, the subjects’ room- 
mate Q sorts were correlated with each of the 
three sex-appropriate templates in turn; and 
for each template, these correlations were 
then converted to standardized T scores with 
means of 50 and standard deviations of 10 
so that all templates would have the same 
baseline, that is, would equally characterize 
the average or composite subject. (Also see 
Footnote 1.) All analyses were then con- 
ducted on these standardized template-simi- 
larity scores. 
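
The standardization step can be sketched as follows. This is a minimal illustration, assuming that scores are standardized across subjects separately for each template and that the population standard deviation is used; the function name and the example correlations are hypothetical.

```python
from statistics import mean, pstdev


def to_t_scores(similarities):
    """Rescale scores to mean 50 and standard deviation 10."""
    m = mean(similarities)
    s = pstdev(similarities)  # Assumption: population SD across subjects.
    return [50 + 10 * (r - m) / s for r in similarities]


# Hypothetical Q-sort/template correlations for five subjects on one template.
raw = [0.10, 0.25, -0.05, 0.40, 0.30]
t_scores = to_t_scores(raw)
```

After this rescaling every template has the same mean and spread across subjects, so differences between a subject's scores on two templates are not driven by one template being generally more "average" than another.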

The results show that our subjects’ room- 
mate Q sorts are, in fact, more similar to the 
templates characterizing their actual playing
strategies than they are to the alternative
templates (52.2 vs. 48.9), t(29) = 2.33,
p < .025, one-tailed. Moreover, this effect is
consistent across subjects: 22 of the 30 difference
scores were positive, z = 2.37, p =
.009, by a one-tailed sign test. The templates
generated by our clinical expert, Spencer
Kagan, were similarly successful (52.5 vs.
48.8), t(29) = 2.63, p < .01, one-tailed, with
21 of the 30 difference scores positive, z = 
2.01, p = .022, one-tailed. The template- 
matching technique has thus provided one 
kind of support for the ecological validity of 
mixed-motive games as analogues to real- 
world situations. As seen by their peers, our 
subjects behave outside the laboratory as we 
saw them behave within it. The relationship 
between this kind of evidence and the full assessment
of ecological validity will be discussed later.²
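
The sign-test figures reported in this paragraph (22 and 21 positive difference scores out of 30) are consistent with a normal approximation that includes a continuity correction. The text does not state how the z values were computed, so the sketch below is an assumption about the method rather than a record of it; the function names are illustrative.

```python
from math import erfc, sqrt


def sign_test_z(n_positive, n_total):
    """Normal approximation to the sign test, with continuity correction."""
    expected = n_total / 2      # expected positives under the null
    sd = sqrt(n_total) / 2      # binomial SD with p = .5
    return (n_positive - 0.5 - expected) / sd


def upper_tail_p(z):
    """One-tailed probability of a standard normal deviate exceeding z."""
    return 0.5 * erfc(z / sqrt(2))


z_judges = sign_test_z(22, 30)  # judge-based templates
z_expert = sign_test_z(21, 30)  # expert-based templates
```

Rounded to two decimals these give z = 2.37 and z = 2.01, with one-tailed probabilities of about .009 and .022.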


Clinical Portraits of the Subjects 


Although it is the template-matching techniques
that provide the formal mechanism
for testing hypotheses about situations, it is
the Q-sort information that has the most
heuristic potential for revealing the functional
properties of the setting and probing the
meaning of the behaviors observed (Bem &
Funder, 1978). In particular, Q sorts draw
clinical portraits of the subjects and identify
those attributes that discriminate one type of
subject from another. The following descriptions
are based upon the data presented in
detail in the Appendix (Table A1). On each
Q item, each subject type is contrasted with
its same-sex complement (e.g., male J’s with
male O’s and R’s combined), using the error
term from the one-way analysis of variance
on the item. This is done separately for the
roommate Q sorts and the self-sorts of the
subjects. Within each subject type, the item
is also tested to see if the roommates and the
subjects disagree significantly.

Again, it is the heuristic value of this analysis
that we wish to emphasize; the statistical
tests should be used only as rough guidelines.
First, we conducted a separate analysis on
each item; some of the significant contrasts
may have emerged by chance. Second, the
contrasts comparing each subject type against
its complement are not orthogonal; if a group
is significantly higher than its complement on
an item, one or both of the other types are
likely to show up, nonindependently, as significantly
lower than their complements on
that same item. And finally, even though a
subject type might be significantly higher
than its complement on a particular item, this
does not necessarily mean that the item is
descriptive of the type in absolute terms.
(The absolute standing of an item can be
inferred roughly from Table A1, which lists
the items in rank order for the sample as a
whole, based on the roommates’ sorts. Thus,
the top items are those that are most descriptive
of the subject sample; the bottom items
are those least descriptive of the sample.)

With these caveats in mind, we can venture
into the clinical thicket of Table A1. Since
the complete data base is provided, the reader
is free to agree or disagree with our interpretations
and to venture more or less boldly
than we shall do.

2 No significant results are obtained if the self-
sorts rather than the roommates’ sorts are used in
the template-matching analysis. We shall see the
reason for this in what follows.

The J’s. The J strategy is the cooperative
or even altruistic strategy that maximizes the
players’ joint return as well as the other
player’s return. Relative to the complement,
the male J is described by his roommate as
thinking in unusual ways, being repressive, 
not arousing liking and acceptance in people 
nor having social poise, not being gregarious, 
and not being particularly interesting. Relative
to the complement, the male J describes
himself as being introspective, concerned
with philosophical problems, and feeling a
lack of personal meaning in life, and he agrees
with his roommate that he thinks and associates
ideas in unusual ways. He also sees himself
as having relatively little social poise and
as being less masculine and less power oriented
than the complement males consider
themselves to be. He doesn’t think he is as
repressive as his roommate does, however.
These various findings are reminiscent of the
Bennett and Carbonari (1976) finding that
J’s score higher on the Intraception scale but
lower on the Aggression scale of the Edwards
Personal Preference Schedule than the complement
groups do and lower on the Heterosexuality
scale than the R’s do.

The female J looks a bit more rigid than
the male J. She is described by her roommate
as being concerned with philosophical problems,
but she is also seen as moralistic, fastidious,
fussy about minor things, and having
persistent, preoccupying thoughts. She describes
herself as being basically distrustful
of people in general, but this is only in contrast
to the complement groups’ self-ratings;
the item still falls well into the uncharacteristic
end of her self Q sort. In general, her
roommate sees her less positively than she
sees herself, rating her significantly higher on
such items as “delays gratification unnecessarily,”
“concerned with own bodily functions,”
“thin-skinned; sensitive to criticism,”
“would be disorganized under stress,” and
“has persistent, preoccupying thoughts.” Her
roommate rates her significantly lower on the
items “is cheerful,” “enjoys aesthetic impressions,”
and “behaves in an assertive fashion.”

In general, the J’s turned out to be more 
withdrawn and less charming than the warm, 
outgoing, generous people our judges 
anticipated. Their generosity in the mixed-motive
games seems to spring more from an
ethical or moral position (and a rather rigid
one among the women) than from a spontaneous
love of people. Thus, the J templates,
while still characterizing the J’s better than 
the O’s or the R’s, tended to err on the side 
of being too positive about this group. In- 
deed, the item “behaves in a giving way to- 
ward others,” the assumed sine qua non of 
the J strategy, does not discriminate the J’s 
from the other groups. 

The O’s. The O strategy attempts to maxi- 
mize the individual’s own earnings irrespec- 
tive of the other player’s outcomes. This was 
the modal strategy, being adopted by 39% 
of the subject sample. It also turned out to 
be the normative strategy in the sense that it 
conveys the least amount of individuating in- 
formation about those who adopt it. Although 
there are several discriminating items in the 
Q sorts of the O’s, they do not form any ob- 
vious clusters, nor do any of them seem par- 
ticularly related to the O strategy as such. 
Rather, it appears that the J’s and the R’s 
define the distinctive groups in the study, 
with the O’s constituting the residual group. 
The templates were least successful at cap- 
turing this group, tending to describe them as 
much more selfish and self-centered than they 
turned out to be. In fact, the O’s would appear
to be the roommates of choice, particularly
the women, who are described as cheerful,
verbally fluent, aesthetically sensitive,
interpersonally consistent, and not particu- 
larly thin-skinned or fussy. 

The R’s. The R strategy is the most com- 
petitive in that it manages to maximize the 
difference between the individual’s return and 
the other player’s return, as well as to mini- 
mize the absolute return for the other player. 
This was the most deviant strategy in our
Stanford sample of 92 subjects, and our subsample
of 30 on whom we have roommate
sorts contains the entire set of 5 men and 5
women who adopted it. They are deviant in
their personal attributes as well.

Consider the males first. Being competitive 
is, of course, sex-role appropriate for men in 
our culture, and the male R seems to epito- 
mize the stereotypic extraverted American 
male. He is assertive, masculine, socially 
poised, conservative, conventional, and some- 
what condescending. He is not introspective, 
guilt ridden, anxious, or plagued by doubts
about the meaning of life. He describes himself
as dependable, straightforward, and candid.
(Bennett & Carbonari, 1976, found the
R’s to be higher on Aggression and Hetero- 
sexuality than the J’s were, but not signifi- 
cantly different from the O’s.) Such a pic- 
ture may not be deviant in the larger culture, 
but the male R’s are clearly in the minority 
among Stanford males, and this may explain 
why our study uncovered fewer R’s than re- 
ported by other investigators. 

The female R’s are the most interesting 
group of all. Here is a group of women who 
pursue a strategy quite at variance with the 
sex-role stereotype of the female, and yet they 
describe themselves as quintessentially femi- 
nine: feminine, physically attractive, socially 
poised, arousing liking and acceptance in 
others, not condescending, and not distrust- 
ful. Their roommates do not agree, finding 
them instead to be aloof from people, power 
oriented, negativistic, tending to undermine, 
obstruct or sabotage, not sympathetic, con- 
siderate, or giving, and not relating to every- 
one in the same way. Compared to the female 
R’s self-image, her roommate sees her as 
hostile, distrustful, critical, histrionic, not 
cheerful, not gregarious, and as having little 
or no insight into her own motives and be- 
havior. The female R’s and their roommates 
disagree significantly on 25 of the 100 Q-sort 
items, representing 30% of all the room- 
mate-self discrepancies observed among the 
six groups. This is a proportion almost twice 
that expected by chance, z = 3.22, p < .002, 
two-tailed. All in all, the female R’s present 
a rather chilling portrait. It is also clear why 
the self-sorts did not function well in the tem- 
plate-matching procedure. In the laboratory, 
we saw the women their roommates saw. 


Q Sorts, Template Matching, and Ecological 
Validity 


In our view, the major payoff from the 
present investigation resides in the data-rich 
Table A1, however unwieldy and statistically
unaesthetic it may be. We believe that investigators
ought to collect Q-sort data routinely
even when they have no interest in individual
differences per se, particularly when
a new laboratory paradigm is first being developed.
By examining the Q items that correlate
with the behavior under investigation,
one can learn an enormous amount about the
subjects and the experimental setting simultaneously.
(See Bem & Funder, 1978.) Although
we advocate collecting Q-sort data
from peers as well as from the subjects themselves,
the more easily obtained self-sorts
would probably suffice for many purposes.
Even the ranking of Q-sort items that charac- 
terizes a subject sample as a whole provides 
valuable information. For example, when an 
experiment fails to replicate, suspicions often 
center upon subject-sample differences, but 
the sparse demographic information usually 
available on subject samples provides little 
help in assessing this possibility. In contrast,
composite Q sorts of the subject samples involved
would provide instant leads about the
possible sources of different results patterns
across seemingly identical investigations. In
the present study, for example, it is clear
from the composite Q sort of the sample as
a whole (summarized by Table A1) that we
should expect to see few R’s in our mixed-motive
games; Stanford students, it appears,
are a rather saintly group. In the “context of
discovery,” there is no match for Q-sort data. 
The template-matching technique itself ex 
tends the heuristic utility of Q-sort data to 
the “context of justification” as well. In the 
present study, we constructed our templates 
on the basis of observers’ ratings of the laboratory
setting under investigation. We then
demonstrated support for the ecological validity
of mixed-motive games by showing that
these templates matched peer Q-sort descriptions
of the subjects’ real-life behaviors outside
the laboratory. Bem and Funder (1978)
have demonstrated that the templates can
also be constructed post hoc on the basis of
the data and then cross-validated; they used
this variation of the technique to probe the
functional properties of a delay-of-gratification
situation. In a second study, they showed
how templates can also be derived from formal
theories, employing this variation to test
rival explanations of the forced-compliance
experiment against one another. The template-matching
technique is thus a very general
method that enables investigators to display
their virtuosity at characterizing situations,
and to do so in a formally legitimate
way, that is, within the context of justification.

3 The descriptive use of Q-sort data did not originate
[illegible] research program [illegible]
(e.g., Block, 1977; Block & Petersen, [illegible]).
[illegible] (1960, 1965) makes similar use of the
[illegible] Check List. In our view this technique has
not received the attention it has deserved nor
has its heuristic value been fully appreciated.

Even so, however, it is important in the 
present study to recognize just what success- 
ful template matching does and does not 
imply about the ecological validity of an ex- 
perimental setting. As we have noted, it does 
demonstrate that the subjects are behaving 
in the setting as they behave in the world 
outside the laboratory. But this is a weak 
form of ecological validity. As an anonymous 
reviewer of this article correctly noted, the 
concept of ecological validity requires that 
the relationships between situational variables 
and the behavior in the setting replicate the 
relationships between situational variables 
and the behavior outside the laboratory. For 
example, researchers who use mixed-motive 
games are typically interested in how vari- 
ables such as the payoff matrices, communica- 
tion opportunities, and features of the op- 
ponent’s playing strategies affect the subject's 
choices. For the game to have ecological va- 
lidity in the strong sense, it must be demon- 
strated that these kinds of variables affect 
behavior in similar ways in the game’s real- 
world counterparts. Template matching does 
not speak to this issue. Similarly, Hovland’s 
(1959) lament about the divergence between 
laboratory and field studies of persuasion was 
that source and message variables did not 
seem to show similar effects or interactions in 
the two spheres. Hovland would not have 
been much comforted to learn that a person 
who is easily persuaded in the laboratory is 
also easily persuaded in the field. 

Successful template matching, then, would 
appear to be a necessary but not sufficient 
condition for establishing the full ecological 
validity of a laboratory setting. Nevertheless,
if the template-matching technique (or some
equivalent procedure) were to be employed
more widely, then we might have more confidence
that people are behaving in our laboratories
as they behave in real life.


Factors Affecting the Evaluation of Improvement:
The Role of Normative Standards and Allocator Resources


André deCarufel
Faculty of Administration
University of Ottawa, Ottawa, Canada


In a simulated industrial setting, subjects performed a clerical task, believing 
that their pay was being determined by a (fictitious) peer allocator. After being
treated inequitably, subjects were able to request a (fictitious) third party to
review these allocations. The third party established one of two “normative”
standards to govern a second series of allocations: either “fairness from now
on” or “fairness now, plus compensation for the initial discrimination.” Subjects
subsequently received increased outcomes that matched one or the other of
these standards. Concurrently, the resources available to the allocator either
remained constant, increased, or decreased. The results indicated dissatisfaction
despite the improvement in outcome level when the increase was felt to be
inadequate relative to a normative standard and to the “social” standard formed
by the allocator’s own outcomes. Related hypotheses based on a relative deprivation
interpretation of discontent with improvement are also presented.


Numerous social agencies and programs 
have been set up with the intention of im- 
proving the outcomes accruing to disadvan- 
taged parties in our society. A particularly 
fascinating observation is that the recipients 
in these programs often become discontented 
with the aid and alienated from their bene- 
factors. This pattern has been reported in a 
wide variety of settings, involving foreign aid 
recipients (Gergen & Gergen, 1971), disabled 
persons (Ladieu, Hanfman, & Dembo, 1947), 
and welfare clients (Briar, 1966). Similar observations
have been reported in connection
with the occurrence of riot and revolution
during times of increasing prosperity (e.g.,
Davies, 1962; Gurr, 1970; Tocqueville, 1856).
Clearly these negative responses by recipients
are at variance with those envisioned by well-intentioned
aid donors and social reformers.
It is also clear that improvement does not
always lead to discontent, but the prevalence
of these “paradoxical” responses suggests that
a more thorough examination of the experience
of receiving improved outcomes is in
order.

A major theoretical approach to the evaluation
of improvement by disadvantaged parties
has its roots in the concept of relative deprivation
(Crosby, 1976; Stouffer et al., 1949),
which suggests that outcomes are evaluated
not in objective terms but rather in relation
to a subjective standard of deserving. An individual
who receives less than he or she
feels is deserved is hypothesized to experience
a sense of injustice, which leads to cognitive
and behavioral efforts to restore justice.

There are at least two ways for discontent to 
arise in the specific case of outcome improve- 
ment (deCarufel & Schopler, 1979). If the 
standard of deservingness rises and falls with
experienced outcomes, then the improvement
itself may provoke feelings of entitlement. To
the extent that the actual improvement lags
behind these “rising expectations,” discontent
results. Alternatively, the parties involved
may negotiate, formally or informally, a “con- 
tract” to govern subsequent allocations. Dis- 
content arises in this case to the extent that 
the improvement does not match the terms of 
the contract. 


Research and Rationale 


Despite a large relative deprivation litera- 
ture, few experimental studies have examined 
“dynamic,” or changing, outcome distribu- 
tions, particularly in the case of improvement 
over time. Two recent studies (deCarufel & 
Schopler, 1979; Folger, 1977) examined a 
simulated work setting in which subjects in 
the role of a worker were initially unfairly 
treated by a (fictitious) allocator. Folger 
found discontent with a subsequent improve- 
ment when it followed an expression of allo- 
cation desires by the subject (“voice”) and 
the improvement itself failed to restore over- 
all equality of outcomes with the allocator. 
Folger suggested that the voice-improvement 
sequence was interpreted by subjects as an 
acknowledgment that they deserved more pay 
and created the expectation that fair alloca- 
tions would now follow. These hopes were, of 
course, dashed by the subsequent allocations. 

deCarufel and Schopler examined a number 
of alternative interpretations based on the 
type of “voicing” available to subjects and 
also included a broader range of outcome 
improvements provided by the allocator within 
the context of overall inequality of outcomes. 
The results indicated increased satisfaction 
with the improvement when it was introduced 
arbitrarily by the allocator or was in response 
to a threat from the worker. When the 
“voice” took the form of an appeal to 
principles of fairness, an improvement provid- 
ing the worker and allocator with equal out- 
comes in a second series of allocations led to 
high satisfaction. However, an objectively 
larger improvement, which gave the worker 
more than the allocator in the second half 
as a kind of “compensation” for the initial 
unfairness, led to no more satisfaction than no
improvement at all! deCarufel and Schopler
suggested that the size of the improvement
may have implied different standards of fairness
and set the stage for discontent. Receiving
equal outcomes may have suggested a
“proximal” standard of “fairness from now
on.” This in fact was matched, leading to
high satisfaction. The compensation, however,
may have implied a “distal” standard of full
overall equality, which was not matched, leading
to dissatisfaction.

Although this evidence is suggestive of the 
operation of rising expectations, there remain 
several points of ambiguity. The first is that 
the improvement has been interpreted not 
only as an increase in the subject’s outcome 
level but also as an implicit legitimizing by 
the allocator of a standard of fairness for the 
remaining allocations. It would seem desira- 
ble to clarify the dynamics of the voice- 
improvement conditions with independent 
manipulations of the improvement and the 
standard. Second, the role of social comparison
with the allocator’s outcomes has been
largely neglected in the interpretation of subjects’
evaluation of the improvement. Subjects
may have felt that the allocator did not meet
the standard in order to maintain his/her own
outcomes at a high level. Thus, a manipulation
of the resources available to the allocator
with which to provide various levels of
improvement would be of some interest. Finally,
there exists an alternative to deCarufel
and Schopler’s interpretation of their legitimacy
conditions. This interpretation suggests
that subjects may have perceived the allocator
as having been overly generous in the
compensation conditions. The dissatisfaction
may therefore have been due to a sense of
“guilt” rather than to discontent. This hypothesis,
which casts doubt on the rising expectations
account, also deserves experimental
attention.

Hypothesis 1. If the magnitude of the improvement
fails to raise the outcome level of
the individual to that specified by the legitimate
standard, the individual will feel dissatisfied
despite the increase in his/her absolute
outcome level.

This hypothesis is a direct test of the process
assumed to have mediated the discontent
in the Folger (1977) and the deCarufel and
Schopler (1979) studies. This hypothesis is 
consistent with the relative deprivation litera- 
ture and with a fundamental proposition of 
equity theory that individuals become dis- 
tressed when they find themselves participat- 
ing in an inequitable relationship (Walster, 
Berscheid, & Walster, 1976). Discontent 
should be most clearly evident here in those 
conditions where the distal standard of full 
equality is legitimized but an improvement 
matching “only” the proximal standard is pro- 
vided by the allocator. If the proximal and 
distal standards are referred to as the low 
and high standards, respectively, and the im- 
provements matching these standards are 
called the small and large improvements, re- 
spectively, discontent should be evident in the 
high-standard-small-improvement conditions. 
There are two relevant comparisons. The first 
is the high-standard—large-improvement con- 
ditions. In this case, the distal standard is 
legitimized in both sets of conditions, but the 
small improvement does not match the stan- 
dard, whereas the large one does. The statis- 
tical comparison is the simple main effect of 
improvement within the high-standard condi- 
tions. The second comparison is the low-stan- 
dard-small-improvement conditions. Here, the 
magnitude of the improvement is small in both 
cases but it is compared to either the proxi- 
mal or the distal standard. In the first case 
the standard is matched, whereas in the sec-
ond it is not. The statistical comparison is
the simple effect of standard within the small- 
improvement conditions. The hypothesis will 
be accepted only if both simple effects com- 
parisons are significant. 
Recall that there exists an alternative in- 
terpretation of the deCarufel and Schopler 
(1979) legitimacy conditions, based on the 
subject’s experiencing guilt over the excessive 
magnitude of the improvement in the over- 
compensation cell. It has been suggested by 
Adams (1965) and by Walster et al. (1976) 
that oversufficient as well as insufficient re-
wards may cause uneasiness in the recipient.
Several studies (e.g., Austin & Walster, 1974; 
Gergen, Ellsworth, Maslach, & Seipel, 1975) 
have in fact demonstrated a preference for 
equitably distributed rewards to those that 
were inequitable in either direction. This pos-
sibility can be examined in the present design
by comparing the low-standard-large-improve- 
ment conditions, where the legitimate standard 
is exceeded by the improvement, with the low- 
standard-small-improvement conditions, where 
the low standard is matched, and with the 
high-standard-large-improvement conditions, 
where the high standard is matched by the 
improvement. The statistical comparisons are 
the simple main effect of improvement within 
the low-standard conditions and the simple 
main effect of standard within the large-im- 
provement conditions, respectively. This sug- 
gestion about the possible operation of guilt 
is being offered in the spirit of the devil’s 
advocate, however, since the thrust of this 
article has been to argue for a rising-expecta- 
tions interpretation. Nevertheless, if both 
statistical comparisons are significant, the 
operation of guilt over the magnitude of the 
improvements will be included in the theo- 
retical model. 

Hypothesis 2. When the legitimate stan- 
dard of fair allocation is matched by the im- 
provement, satisfaction should be equally high 
regardless of the absolute level of the standard 
itself or the magnitude of the improvement 
required to match it. 

This hypothesis is a direct implication of 
the central tenet of relative deprivation the-
ory. The theory states that dissatisfaction
with outcome depends on the existence of a 
discrepancy between experienced outcomes 
and the standard used to evaluate them rather 
than on the absolute outcome level itself. It 
follows that eliminating the discrepancy, by 
raising the outcomes or by lowering the stan- 
dards, should also eliminate the discontent. 
It should also be true that if two individuals 
have different standards and both experience 
improvements that may differ in magnitude 
but in each case are sufficient to reduce the 
outcome-standard discrepancy to zero, then 
each should experience equal satisfaction. 
This hypothesis may be tested by comparing 
the satisfaction of subjects in the two sets 
of conditions where the normative standard is 
matched by the improvement: the low-stan- 
dard-small-improvement and the high-stan- 
dard-large-improvement conditions. In this 
case one would predict that the statistical
contrast should be nonsignificant, indicating 
no difference in satisfaction levels. This null 
hypothesis prediction may seem rather un- 
usual, but in this case it is a legitimate deriva- 
tion, and within the context of the other pre- 
dicted significant effects, it provides a strin- 
gent test of the theory. 
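
The discrepancy logic behind Hypotheses 1 and 2 can be made concrete with a short sketch. It is offered only as an illustration: the function name `shortfall` is hypothetical, and the cent values are the standards (12¢ and 16¢) and second-half worker payments (12¢ or 16¢) reported in the Method section. On this reading, predicted discontent tracks the amount by which the outcome falls short of the legitimate standard, not the absolute outcome level.

```python
def shortfall(standard, outcome):
    """Cents by which the outcome falls below the legitimate standard.

    Returns 0 when the standard is matched or exceeded (an assumption of this
    sketch: exceeding the standard is treated the same as matching it).
    """
    return max(standard - outcome, 0)


# Standards suggested by the government and improved worker pay,
# in cents per pay period (values taken from the Method section).
STANDARDS = {"low": 12, "high": 16}
IMPROVEMENTS = {"small": 12, "large": 16}

shortfalls = {
    (s_name, i_name): shortfall(s_cents, i_cents)
    for s_name, s_cents in STANDARDS.items()
    for i_name, i_cents in IMPROVEMENTS.items()
}

for cell, cents in shortfalls.items():
    print(cell, cents)
```

Only the high-standard-small-improvement cell shows a nonzero shortfall, which is the cell in which Hypothesis 1 predicts discontent. The two "matched" cells (low-standard-small-improvement and high-standard-large-improvement) both show a shortfall of zero despite different absolute pay, which is the basis of the null prediction in Hypothesis 2.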

Hypothesis 3. The individual’s satisfaction 
with a given level of improvement should be 
independently affected by the level of the al- 
locator’s resources. In general, satisfaction 
should be lower when the allocator’s resources 
increase than when they remain constant or 
diminish. 

This hypothesis suggests that subjects will 
use the allocator’s outcomes as one standard 
against which to evaluate their own outcomes. 
Social comparison (Festinger, 1954) is a pro- 
cess that is central to the experience of rela- 
tive deprivation. The outcomes received by 
others are a powerful determinant of what we 
decide we ourselves deserve. Crosby (1976) 
made it her first precondition (seeing others 
possess x), and numerous studies (e.g., Weick 
& Nesset, 1968) have demonstrated explicitly 
the role of social comparison as a contributor 
to equity judgments. Further, the literature 
in the area of riot participation is also con- 
sistent with the primacy of social comparison 
(e.g., Pettigrew, 1967; Sears & McConahay, 
1973). In this paradigm, the greater the re- 
sources of the allocator, the more will be left 
for him/her after providing the improvement. 
In other words, the greater the resources, the 
less the improvement narrows the gap between 
subject’s and the allocator’s outcomes created 
by the initial inequity. This would contribute 
to the overall degree of inequity between the 
participants and should lead to increasing 
dissatisfaction. The predicted statistical effect 
is a main effect for the resource factor. 

Hypothesis 4. Satisfaction with the mag- 
nitude of the improvement should be affected 
by the level of the allocator’s resources. 

This hypothesis suggests that satisfaction 
with the magnitude of the improvement will 
depend on the perceived ability of the allo- 
cator to provide adequate compensation and 
upon his/her apparent effort to allocate fairly. 
The larger the allocator’s resources, the larger 
is the improvement that could be provided.
Should only a small rather than a large in-
crease be provided, illegitimate motives such
as the allocator’s greed may be suggested. This
hypothesis predicts an Improvement × Re-
source interaction.

Finally, although it has not been specifically
predicted by any of the separate hypotheses,
it has been implied that all three independent
variables should exert powerful effects on sub-
jects’ satisfaction with the allocations. Thus,
an overall Standard × Improvement × Re-
sources interaction might be expected to form
a context within which to examine the specific
predicted effects. This interaction should in-
dicate the most marked discontent in the high-
standard-small-improvement cells, especially
where the allocator’s resources increase. Here,
the subject would experience deprivation rela-
tive to both standards (normative and social
comparison) simultaneously.


Method 
Participants 


The participants were 108 male and female under-
graduates at the University of North Carolina at
Chapel Hill. All participated to fulfill a course re-
quirement. Subjects were run in groups of three,
but since interaction among them was minimal and
confined to a brief period before the introduction
of any of the manipulations, the individual was used
as the unit of analysis. Nine subjects were assigned
to each of 12 experimental conditions.


Procedure 


Subjects reported in groups of three for an ex-
periment called “Organizational Behavior” and were
immediately ushered into separate rooms, which were
linked to the experimenter’s control room via an
intercom system and a set of signal lights. The ex-
perimenter explained that the focus of the experi-
ment was on the behavior of individuals in organi-
zations. Subjects were told that the organization they
would be participating in would have three roles,
worker, allocator, and government.

At this point subjects were assigned to roles by
the experimenter, using a series of signal lights. All
subjects discovered that they were the worker in
the organization and assumed that the other
roles were being played by the other subjects. In
fact, these other roles were played by the experi-
menter. The experimenter then visited each subject
and gave him/her a “worker’s folder” containing
experimental materials.


The experimenter returned to the control room and
resumed the instructions, which specified the duties 
of the worker and allocator roles. The worker was 
told that his/her duties would be to perform a cleri- 
cal task that consisted of transcribing sets of num- 
bers from a computer printout onto a work sheet. 
The number of items transcribed was to be used as 
a measure of worker productivity. Subjects were 
also told that they would be paid by the allocator 
over a series of eight pay periods, each of which 
would last 1 minute. The allocator was to receive 
a sum of money from the experimenter at each pay 
period and would then divide up that sum between 
himself/herself and the worker. This money was to 
be exchanged for tickets in a $25 lottery to be held 
at the end of the semester. Subjects were asked to 
keep a record of the allocations on the “payment 
record card” in their folder. The allocator was also 
told that later she/he would check the worker's 
performance.

In order to make the experimental situation more 
plausible and to set the stage for the introduction 
of the independent variables, subjects were told that 
the study was concerned with the degree of contact 
among organization members and the market condi- 
tions in which the organization was to operate. The 
contact aspect was used to account for the subjects 
being in separate rooms and to make it plausible 
that the allocator would be dividing up the pay 
without knowing how the worker was performing 
and would check the worker’s performance only at 
the end of the final pay period. The “market con- 
ditions” were described simply as the amount of 
money made available by the experimenter to the 
allocator at each pay period. This explanation set 
up the resource manipulation in such a way as to 
allow the allocator’s available pay to change without 
making him/her appear responsible for the change 
(in equity theory terms, to keep the allocator’s 
“inputs” equal in all conditions, while allowing his/ 
her outcomes to vary). At this point, the experi- 
menter visited each subject to answer any questions 
about the procedure. 

The experimenter then announced that the first 
series of pay periods was about to begin, and the 
subjects were instructed to begin their tasks. At
1-minute intervals, the experimenter interrupted the
workers to inform them of the amount of money
available to the allocator at that pay period and
how the sum had been divided between them. All
subjects were told at each of the first four pay pe-
riods that the allocator had had 24¢ available and
had kept 16¢ for himself/herself and had given 8¢
to the worker. Thus, the degree of initial inequity
was identical in all conditions.¹ At the conclusion
of Pay Period 4, subjects totaled up on their pay-
ment record cards the allocations to that point.


Introduction of the Independent Variables


When everyone had finished, the experimenter con-
tinued. He indicated that the role of the government
would be to mediate any possible dispute between
the allocator and worker. Both were asked to fill
out a “payment review request card.” This would 
enable them to request a government review of the 
allocations of the first half and to express alloca- 
tion desires for the second half. The government 
would then review these appeals and offer a fair 
standard for the allocations in the second half. The 
standard was set by a third party to make the ex- 
perimental procedures more credible, especially in 
those cases in which the standard would not be met. 
This arrangement also avoided the possibility that 
the allocator would be perceived as having changed 
his/her mind about what was fair in the interval 
between the appeal and the second series of pay 
periods. It was emphasized, though, that the allo- 
cator could still allocate freely, so that the subject 
would perceive him/her to be choosing to match or 
not to match the standard in the second half. 

When the payment review request cards had been 
filled out, the experimenter collected them, osten- 
sibly delivered them to the government, and after a 
short interval returned them to the subjects. The 
experimenter returned to the control booth and said 
that the government had examined the appeals and 
had suggested that the worker should receive either 
12¢ during the second half (the proximal standard) 
or 16¢ during the second half (the distal standard). 
Each of these standards was described as the “fair- 
est way to allocate in the second half, according to 
a neutral third party.” Subjects were asked to write 
down the standard at the top of their payment rec-
ord cards.

The second half of the pay periods resumed at this
point. During these four pay periods, the worker
received either 12¢ or 16¢ per pay period, and the
amount available to the allocator to allocate on each
pay period either remained the same (24¢), increased
(to 36¢), or decreased (to 18¢).


1 This is in contrast to both the Folger (1977) and
deCarufel and Schopler (1979) studies. The former
confounded initial inequity with both his equity and
improvement manipulations, whereas the latter con-
founded it with their improvement manipulation.
The present study does contain a confounding of
the size of the improvement with the amount re-
tained by the allocator in the various resource con-
ditions (e.g., in the diminished-resource
conditions: 12¢ and 6¢ vs. 16¢ and 2¢ for the worker
and allocator, respectively). This type of confound-
ing is inevitable. To remove it would have required
a confound of size of improvement with either initial
inequity or with allocator resources. The significance
of this confounding has been discussed in the text
in reference to the “gaps” between worker and allo-
cator outcomes, especially in connection with Hy-
potheses 3 and 4. In the discussion of Hypothesis
1, this gap may provide an alternative to the norma-
tive-standard-outcome-discrepancy interpretation of
the simple effect of the improvement within the
high-standard conditions. However, it does not ac-
count for the parallel results obtained on the simple
effects test of standard within the small-improvement
conditions. This being the case, parsimony would
dictate that the interpretation based on discrepancies
from the normative standard would be the preferred
one.
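
The 12 cells of the design and the pay "gaps" they create can be tabulated with a short sketch. This is an illustration under stated assumptions, not part of the original procedure: the names `design_cells`, `allocator_pay`, and `gap` are hypothetical, the allocator is assumed to keep whatever the worker does not receive, and the cent values are those given in the text (first half: 24¢ available, 8¢ to the worker; second half: 12¢ or 16¢ to the worker out of 18¢, 24¢, or 36¢).

```python
from itertools import product

# Second-half values in cents per pay period, as reported in the text.
STANDARDS = {"low": 12, "high": 16}        # government-suggested worker pay
IMPROVEMENTS = {"small": 12, "large": 16}  # pay the worker actually received
RESOURCES = {"diminished": 18, "constant": 24, "increased": 36}

# First-half allocation, identical in every condition.
FIRST_HALF_TOTAL, FIRST_HALF_WORKER = 24, 8
FIRST_HALF_GAP = (FIRST_HALF_TOTAL - FIRST_HALF_WORKER) - FIRST_HALF_WORKER


def design_cells():
    """Return one record per cell of the Standard x Improvement x Resource design."""
    cells = []
    for standard, improvement, resource in product(STANDARDS, IMPROVEMENTS, RESOURCES):
        worker_pay = IMPROVEMENTS[improvement]
        # Assumption: the allocator keeps the remainder of the available sum.
        allocator_pay = RESOURCES[resource] - worker_pay
        cells.append({
            "standard": standard,
            "improvement": improvement,
            "resource": resource,
            "worker_pay": worker_pay,
            "allocator_pay": allocator_pay,
            # Positive gap: the allocator still receives more than the worker.
            "gap": allocator_pay - worker_pay,
        })
    return cells


cells = design_cells()
for cell in cells:
    print(cell)
```

With nine subjects per cell, the 12 cells account for the 108 participants. In this sketch the first-half gap is 8¢ in the allocator's favor; a small improvement under increased resources widens it to 12¢, whereas the same improvement under constant resources closes it entirely.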


Dependent Variables 


The main dependent variables were assessed on 
the “job satisfaction questionnaire,” which was ad- 
ministered directly after the final pay period. Five 
measures were selected to tap various aspects of the 
evaluation of the improvement and were analyzed 
together as a multivariate set. These measures were 
(a) satisfaction with pay (“How satisfied were you 
with your pay during the second series of pay pe- 
riods?”); (b) allocator fairness (“How fairly was 
the allocator dividing up the pay during the second 
series of pay periods?”); (c) satisfaction with pay 
change (“How satisfied were you with your second- 
half pay in relation to your first-half pay?”); (d) 
satisfaction with standard (“How satisfied were you 
with your second-half pay in relation to the sug- 
gestion made by the government regarding second- 
half allocation?”); (e) satisfaction with allocator's 
pay (“How satisfied were you with your second- 
half pay in relation to what the allocator received 
in the second half?”). All of these variables were
measured on 9-point scales (1 = not at all, 9 = very 
much). 

In addition, measures were taken of subjects’ satis- 
faction with, and perceptions of fairness of, the total 
allocations, and subjects’ work performance (number 
of items transcribed per pay period) was tallied. 
Finally, several additional measures were included 
to test the hypothesized mediation of the predicted 
effects. These were a magnitude estimation measure 
of the size of the improvement in relation to the 
government standard and some “intent” measures to 
assess subjects’ perceptions of the allocator’s motives. 

When subjects had completed this form and an 
“information questionnaire,” which assessed the im- 
pact of the manipulations, the experimenter brought 
them into a common room and fully debriefed them. 


Results and Discussion 


Effectiveness of the Manipulations 


The allocations during the first four pay 
periods were designed to create in the sub- 
ject the experience of having received an in- 
equitably small share of the available pay. 
In fact, 99 of the 108 subjects requested a 
government review of these allocations, and 
all of these subjects indicated that the reason 
for their appeal was that they felt they had 
received too little pay in the first half. The 
9 subjects not requesting a review were dis- 
tributed across eight cells of the design. The 
improvement manipulation was checked with
an item from the “information questionnaire,”
which was administered after the final pay
period (“How much of a change was there in
your pay from the first to the second series
of pay periods?”—1 = very little, 9 = very
much). The means for the small- and large-
improvement conditions were 5.8 and 7.6, re-
spectively, F(1, 96) = 43.85, p < .001. The
standard manipulation was also checked with
an item from the information questionnaire
(“How much of a change in your pay did
the government suggest?”—1 = very little, 9
= very much). The means for the low- and
high-standard conditions were 6.1 and 7.9,
respectively, F(1, 96) = 48.25, p < .001. Fi-
nally, the resource manipulation was checked
by asking subjects to indicate how much pay
had been available to the allocator during the
second half. Counting 18¢, 24¢, and 36¢ as
correct for the diminished, constant, and in-
creased resource conditions, respectively, 105
of the 108 subjects (97%) responded cor-
rectly.


Overall Impact of the Independent Variables 


Two measures asked subjects to consider
the magnitude of the improvement in relation
to the government standard. These were the
magnitude estimation measure (“How did the
change in pay you received from the allocator
compare with the amount suggested as appro-
priate by the government?”—1 = too little,
9 = too much), and the satisfaction-with-
standard measure from the evaluation of the
improvement set of measures. Both of these
showed a Standard × Improvement × Re-
source interaction, F(2, 96) = 6.48, p < .01,
and F(2, 96) = 5.39, p < .01, respectively.
Two other measures were marginally signifi-
cant for this effect: how satisfied subjects
were with their total pay (p < .06) and the
extent to which they perceived that the allo-
cator was “trying” to allocate fairly during
the second half, F(2, 96) = 2.65, p < .08.
The means for these were in accord with
expectations. Satisfaction was lower and
subjects felt that the improvement compared
unfavorably with the standard in the high-
standard-small-improvement conditions,
especially where the allocator’s resources
increased. The triple interaction was also sig-
nificant on the behavioral measure of number
of items transcribed during the second series
of pay periods, F(2, 96) = 4.04, p < .05.²


Tests of the Hypotheses ³


Hypothesis 1. This was the major “discon- 
tent” hypothesis, which predicted that sub- 
jects would be more dissatisfied with the 
increase in their outcome level if the improve- 
ment failed to meet the normative standard 
than if it did. Before proceeding to the sim- 
ple effects tests, it should be noted that the 
overall Standard × Improvement interaction
on the items in the improvement set was sig-
nificant, multivariate F(5, 92) = 5.78, p <
.001.

The simple effect of improvement within 
the high-standard conditions was significant, 
multivariate F(5, 92) = 8.08, p < .001. The 
F values for the individual improvement mea- 
sures organized by test of hypothesis are 
presented in Table 1. The means for these 
items are presented in Table 2. Further evi- 
dence comes from two auxiliary measures, the 
extent to which subjects perceived that the 
allocator “went far enough” in his/her efforts 
to be fair and the “trying” measure discussed
above. Both of these were significant for the
simple effect of improvement within the high-
standard conditions, F(1, 96) = 21.34, p <
.001, and F(1, 96) = 13.38, p < .001, re-
spectively. Thus when the distal standard
had been legitimized, subjects felt more 
dissatisfied with the small improvement that 
failed to match it than they did with the large 
improvement that did. They also felt that 
the allocator was not trying as hard to be fair. 

The simple effect of standard within the 
small-improvement conditions was also signifi-
cant, multivariate F(5, 92) = 6.83, p < .001.
The two intention measures were supportive
as well: “went far enough,” F(1, 96) = 7.30,
p < .01, and “trying” (p < .10). This effect
suggests that subjects were more satisfied with 
the small improvement when it was evaluated 
against the proximal rather than against the 
distal standard. 


These data of course do not prove that the 
arousal, legitimation, and subsequent viola-
tion of normative standards were responsible
for the Folger (1977) and deCarufel and 
Schopler (1979) results. They do show, how- 
ever, that if the conditions they speculated 
about are experimentally created by separat- 
ing the procedure of setting the normative 
standard from the actual delivery of the im- 
provement, discontent is evident in those cases 
where the improvement is evaluated as inade- 
quate. This was true whether discontent was 
measured using a single standard and two 
levels of improvement that either did or did 
not match it, or with a single level of im- 
provement being evaluated against either a 
low or a high standard. 

Recall that the guilt interpretation of de- 
Carufel and Schopler’s (1979) study was also 
evaluated. The simple effect of improvement 
within the low-standard conditions was sig- 
nificant, multivariate F(5, 92) = 2.71, p< 


2 There had been no behavioral differences during
the first half when the allocations were equal across
conditions. The pattern was somewhat different from
that described above, but still contained strong links
to the theoretical framework. The resource effect
(Hypothesis 3, below) was evident, but only under
the large-improvement conditions. In fact, the Im-
provement × Resource interaction was also signifi-
cant, F(2, 96) = 3.83, p < .025. In the small-improve-
ment conditions, productivity varied by standard.
In the low-standard conditions, productivity was
highest in the constant resource condition, whereas
it was lowest in the constant resource cell under the
high standard. Two things need to be said about
these productivity effects. The first is that neither
Folger (1977) nor deCarufel and Schopler (1979)
found them in similar paradigms. Perhaps this was
because in those studies, the standards of fairness
were largely implicit as subjects arrived at an inter-
pretation of the voice-improvement sequences. In the
present study, the standards were explicit and allowed 
subjects to be aware of the unfairness throughout 
the second half and, thus, to alter their work output 
accordingly. Second, it is interesting that the hy- 
pothesized pattern of means was most clearly evident 
in the constant resource conditions. Here the norma- 
tive standard was most clearly relevant to the sub- 
sequent allocations because there was no change in 
the allocator's resources to suggest the appropriate- 
ness of any other standard. This is consistent with 
the speculations offered above that productivity ef- 
fects would be most likely under conditions of clear- 
cut violation or no violation of the standard. 

3 There were no effects for sex of subject, so the 
data for males and females were combined. 
Table 1


F Values for Measures in the Improvement Set, by Hypothesis


Satisfaction 
with pay 6.00" ist <i 
Allocator 
fairness In 16 165 
Satisfaction 
with pay 
change 6.69 288 <i 
Satisfaction 
with 
standard 34.28400 30.15%% < 
Satisfaction 
with 
allocator's 
outcome <i <i <i 
* p < .05.
** p < .01.
*** p < .001.


.05. Inspection of the means for the items in
the improvement set indicates, however, that 
the direction of the difference is the opposite 
of a guilt effect! Subjects evaluated the im- 
provement more positively when the standard 
was exceeded than when it was matched. This 
effect, combined with the simple effect of im- 
provement within the high-standard condi- 
tions, resulted in a significant improvement 
main effect, multivariate F(5, 92) = 4.98, 


Table 2 
Means for Measures in the Improvement Set 


Low standard High standard = 
Small improvement Large improvement a 
Resource Resource i 
Dimin- Con- In- Dimin- Con- In- - cat 
Measure ished stant creased ished stant creased creased — 
Satisfaction
with pay 8.11 8.00 2.44 8.44 7.89 4.89
Allocator
fairness 7.22 7.78 2.33 6.11 6.44 5.00
Satisfaction
with pay
increase 8.44 8.33 3.44 8.89 7.89 6.78
Satisfaction
with
standard 8.56 8.78 3.89 8.00 8.00 4.56
Satisfaction 
with 
allocator’s 6.89 
pay 6.33 7.67 1.89 5.78 7.22 4.67 7.56 6.78 1.56 6.22 H 


Note. All of the measures in the improvement set used 9-point scales (1 = not at all, 9 = very much). 
Hypothesis 3: Hypothesis 4:
Resource Improvement 
resource resource resource Resource 


main effect x 
in 


<1 214 81,7200° 1.88 

<i 349 36.93%% 6.51% 
1.59 5.67* 57.16%% 9.08" 
<i 2.26 $0.45% 5.36% 
<i 109 4.41% 


38.57%% 


p < .001. Clearly, there was no evidence of 
guilt here.

The simple effect of standard within the
large-improvement conditions was not signifi-
cant (.10 < p < .15) at the multivariate level,
again indicating the lack of a guilt effect.
The major factor in this paradigm that would
seem to blunt a guilt effect is the subjects’
experience of initial inequity. In studies done
within the “obligation tradition” (e.g., Ger-
gen et al., 1975), where guilt effects are
common, the subjects’ need usually arises be- 
cause of their bad luck or the difficulty of 
the task. In these cases, the aid may be viewed 
as help, which may imply the need to re- 
ciprocate at a later date. In the present 
situation, however, the subject is initially 
mistreated by the allocator. Here, the im- 
provement is viewed as compensation. This 
interpretation of the aid by the recipient 
seems to have little capacity to arouse the 
need to repay the former tormenter (e.g., de- 
Carufel & Carere, Note 1). 

Hypothesis 2. Hypothesis 2 examined sub-
jects’ satisfaction in those conditions where
the standards were matched. Specifically, it 
was predicted that since in neither the low- 
standard-small-improvement nor in the high- 
standard-large-improvement condition was 
there a discrepancy between experienced and 
deserved outcomes, there should be no sig- 
nificant difference in satisfaction between 
them despite the difference in absolute out- 
come level. A special contrast of these condi- 
tions was run separately in each of the levels 
of the resource factor in anticipation of the 
concurrent resource main effect (Hypothesis 
3, below). Thus, each contrast examined only 
the effect of matching the normative stan- 
dards, holding constant the level of the allo- 
cator’s resources. Consistent with the hypothe- 
sis, this contrast was nonsignificant at the 
multivariate level in each of the three re- 
source rows (ps > .20). Thus, the relative 
deprivation hypothesis that dissatisfaction is 
based on a discrepancy between experienced
outcomes and the standard used to evaluate 
them rather than on the basis of their absolute 
level received further support. 

Hypothesis 3. Hypothesis 3 suggested that
subjects would use not only the normative 
standards established by the government to 
evaluate the improvement but also the out- 
comes received by the allocator. It was ex- 
pected that satisfaction would be lowest in 
those conditions where the allocator’s re-
sources increased. This prediction was con-
firmed. The resource main effect was signifi-
cant, multivariate F(10, 184) = 18.22, p <
.001. The effect was also evident on the
intention items: “went far enough,” F(2, 96)
= 92.79, p < .001, and “trying,” F(2, 96) =
61.73, p < .001, as well as satisfaction with 
total pay, F(2, 96) = 39.02, p < .001, and 
fairness of the total allocations, F(2, 96) = 
41.17, p < .001. Subjects also transcribed 
fewer numbers onto their work sheets when 
the allocator’s resources increased, F(2, 96) 
= 3.00, p < .06. 
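
The degrees of freedom reported for these tests can be checked against the design with a short sketch. It rests on assumptions the article does not state: that the univariate error term is the within-cell term from all 108 subjects in 12 cells, and that the multivariate F values are the exact F conversions of Wilks' lambda, which exist for effects with 1 or 2 hypothesis degrees of freedom. The function names are hypothetical.

```python
def univariate_df(n_subjects, n_cells, effect_df):
    """(numerator, denominator) df for a univariate between-subjects F test."""
    return effect_df, n_subjects - n_cells


def multivariate_df(n_subjects, n_cells, effect_df, n_measures):
    """df for the exact F conversion of Wilks' lambda (assumed criterion).

    Exact conversions are used only for effects with 1 or 2 hypothesis df,
    which covers every effect in a 2 x 2 x 3 design.
    """
    error_df = n_subjects - n_cells
    if effect_df == 1:
        return n_measures, error_df - n_measures + 1
    if effect_df == 2:
        return 2 * n_measures, 2 * (error_df - n_measures + 1)
    raise ValueError("exact conversion sketched only for effect_df of 1 or 2")


N_SUBJECTS, N_CELLS, N_MEASURES = 108, 12, 5

print(univariate_df(N_SUBJECTS, N_CELLS, 1))                # standard or improvement effects
print(univariate_df(N_SUBJECTS, N_CELLS, 2))                # effects involving resource
print(multivariate_df(N_SUBJECTS, N_CELLS, 1, N_MEASURES))  # five-measure set, 1-df effect
print(multivariate_df(N_SUBJECTS, N_CELLS, 2, N_MEASURES))  # five-measure set, 2-df effect
```

Under these assumptions the sketch reproduces the F(1, 96), F(2, 96), F(5, 92), and F(10, 184) degrees of freedom that appear in the text.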

Thus, it is clear that subjects did base
their perceptions of fairness and feelings of
satisfaction at least in part on social com- 
parison with the allocator’s outcomes. In the 
increased resource conditions, the allocator 
was able to provide the required compensa- 
tion without depriving himself/herself of a 
high outcome level. When the allocator con- 
tinued to prosper and the gap created by the 
initial unfairness was not reduced, subjects 
felt that no real effort to be fair was being 
made. This dissatisfaction is reminiscent of 
Pettigrew’s (1967) observation that blacks 
felt deprived relative to whites despite their 
economic and educational gains during the 
1960s because the initial gap between blacks 
and whites had not been reduced and in some 
cases had grown wider. This perception of in- 
sufficient effort may have been reinforced in 
the present experiment by the fact that the in- 
creased resources were provided to the al- 
locator by the experimenter rather than re- 
sulting from increased effort on the allocator’s 
part. 
Further evidence of the operation of social 
comparison processes over and above the ef- 
fect of matching the normative standards 
comes from the fact that the resource main 
effect was significant within the low-standard— 
small-improvement and the high-standard— 
large-improvement conditions, multivariate 
F(10, 184) = 7.11, p < .001, and multivari-
ate F(10, 184) = 6.63, p < .001, respectively.
Thus, it seems clear that the “external” re- 
source factor caused subjects to feel that 
matching either normative standard was an 
insufficient indication of the allocator’s good- 
will when his/her resources increased. The 
likely result would be a desire to renegotiate 
the standard, beginning anew the process of 
voice and (possible) improvement. 

Hypothesis 4. Hypothesis 4 predicted that 
if the allocator were to be perceived as being 
able to provide a large improvement because
of increased resources, subjects would be 
dissatisfied if “only” a small one were 
granted. The predicted Improvement X Re- 
source interaction was significant, multivari- 
ate F(10, 184) = 3.79, p < .001. The inten- 
tion measures also showed the effect: “went 
far enough,” F(2, 96) = 2.92, p < .06, and 
“trying,” F(2, 96) = 12.45, p < .001, as did 
the fairness of the total allocations measure, 
F(2, 96) = 4.15, p < .05. It is clear that 
subjects took the allocator’s ability to provide 
adequate compensation into account in their 
evaluation of the improvement. The small in- 
crease was seen as inadequate when the al- 
locator’s resources increased and led to lower 
satisfaction than did the large one. 


General Discussion and Conclusions 


The present study was concerned with the 
evaluation of the adequacy of compensation 
received by victims of injustice in relation to 
salient legitimate standards of fairness. The 
major theoretical perspective was that of rela- 
tive deprivation. The theory was extended to 
predict that individuals develop expectations 
about what they ought to receive and that dis- 
satisfaction may result even with improving 
outcomes if the increases do not match ex- 
pectations. In fact there is evidence (Folger, 
1977) that insufficient improvement may in- 
tensify discontent by legitimizing the vic- 
tim’s perceived right to higher outcomes and 
by bringing the former injustice into sharper 
relief. 

Discontent with improvement was evident 
in two sections of the present study, in the 
high-standard-small-improvement conditions 
and in those where the allocator’s pay in- 
creased. In both cases, subjects’ legitimate ex- 
pectations of fair allocation were violated by 
the allocator, apparently to his/her own bene- 
fit. In the former conditions, normative stan- 
dards were legitimized by the government and 
not matched by the improvement; and in the 
latter, subjects felt spontaneously deprived 
relative to the allocator’s own improved out- 
comes. This discontent was matched by per- 
ceptions of unfairness of the allocations and 
with the attribution that the allocator had not 
tried hard enough to be fair. These demon- 
strations are important because they sel 
clarify the dynamics of the discontent 
Folger (1977) and deCarufel and Së 
(1979) studies. In those studies, the Op 
tion of normative standards was a 
conjecture, and the role of social stal 
and social comparison was virtually igmg 
Further, the present study found no 
to support a guilt-based interpr 
these previous results. Thus, with the 
of this experiment, the relative dep 
account of discontent with improvem 
this paradigm has been considerably stren 
ened, 
Finally, since this article opened 
discussion of discontent in “real wo! 
tings, the focus of the discussion 
broadened briefly to show the rel 
the present research to those siti 
Normative standards of allocation 
distributive justice are used in a 
settings as a criterion of fair pay 
1965), and when they are violated, di 
may be expressed in strike action 
teeism from the work place. Similar 
data of Sears and McConahay (1973 
cated that Watts riot participants 
mative standards such as the formal 
ian ideology of the U.S. Constitution 
basis for their discontent with racial i 
ity. The normative standards used 
present study’are of direct relevance 
often invoked to deal with the pro 
the disadvantaged. The proximal stat 
similar to statutes that prohibit injust 
that point on, whereas the distal 
represents the contention of many VIC 
injustice that they should receive COn 
tion for their past misfortune. This 
finds expression in quota systems for 
awards of back pay lost through ¢ 
tion, and so on. Social standards 
clearly used, as Pettigrew (1967) 4 
have noted. As in the present study, 
comes received by the (previous) hat 
are often those selected by the di 
as the criterion of their own dese 
Unlike many social psychological P! 
ena studied in laboratory settings, 
the task is to establish some “ex 
lidity,” the literature in the area Olg 
deprivation contains many striking instances 
of revolution, riot, and so on that have been 
interpreted in this framework in a post hoc 
fashion. What is lacking here is a sufficiently 
elaborated and tested theory that can make 
a priori predictions about the instigating con- 
ditions for discontent. The results of the 
present experiment offer cause for optimism 
that the rising-expectations framework under 
consideration here may prove fruitful in this 
regard. 


Reference Note 


1. deCarufel, A., & Carere, R. The evaluation of 
outcome improvement by disadvantaged parties. 
Paper presented at the annual meeting of the 
Canadian Psychological Association, Ottawa, Can- 
ada, June 1978. 


Cardiovascular Changes During Social Competition 
in a Mixed-Motive Game 


Lawrence F. Van Egeren 
Department of Psychiatry 
Michigan State University 


Male and female subjects played a mixed-motive game against a male con- 
federate under either a 20% cooperative or an 80% cooperative strategy while 
cardiovascular responses were computer monitored. Females had larger heart 
rate responses than males during play against the competitive strategy, and 
the opposite was true during play against the cooperative strategy. Subjects 
who were more competitive during the game or who scored higher on a cor- 
onary-prone (Type A) behavior scale or who reported having an action orien- 
tation toward life stress tended to have larger heart rate responses during the 
game than the remaining subjects. The results draw attention to the importance 
of covert autonomic responses for understanding overt behavioral choices in 
mixed-motive games and to the potential utility of this behavioral model for 
studying the role of psychosocial factors in psychosomatic illnesses. 


Cardiovascular responses in competitive 
conflict situations are potentially interesting 
for two reasons: (a) Competitiveness (strug- 
gles for dominance, control) is a putative 
etiologic factor in cardiovascular disorders 
(Friedman & Rosenman, 1974; Henry & Ely, 
1976), and (b) in human evolution, conflicts 
over resources were usually settled by vigor- 
ous action requiring cardiovascular arousal 
(Tiger & Fox, 1971); people today may carry 
important vestigial, now metabolically exces- 
sive, competition-released cardiovascular re- 
sponse tendencies. What the form of cardio- 
vascular responses during mixed-motive games 
might be is not clear, because the nature of 
these games as a stressor is not clear. Cardio- 
vascular stress studies by Obrist (e.g., Obrist 
et al., 1978) indicate that the type of re- 
sponse depends on whether stress is control- 
lable (i.e., whether some active response is 
available to deal with it). When it is con- 
trollable, the heart pumps more vigorously, 
presumably in preparation for the action, and 
systolic blood pressure rises. When it is un- 
controllable, the diastolic blood pressure rises, 
apparently as the result of constriction of 
peripheral blood vessels. 
Two mixed-motive game experi! i 
volving cardiovascular variables have 7 
reported. In one, after an anger-induction 
perience, subjects played a game 48? 
confederate of the experimenters, 
instigator, under either predictable or g 
dictable conditions (Van Egeren, Abelson 
Thornton, 1978). When the confederi 
behavior was unpredictable and subj 4 
little control over outcomes, bl A 
pulses were transmitted faster from theag 
to a finger, and diastolic blood pressure u 
creased less during the game than wia f 
confederate was predictable and subjects A 
greater control. These results sean 
with the Obrist et al. (1978) findings for 
social stressors. In another experiment, Blas- 
covich, Nash, and Ginsburg (1978) related 
resting heart rates prior to a mixed-motive 
game to game behavior. Since the games were 
zero-sum, it was possible to divide the sub- 
jects into winners and losers. Male winners 
(i.e., the more competitive males) had faster 
pregame heart rates than male losers, 82 ver- 
sus 74 beats per minute. Females had the op- 
posite pattern, 82 versus 83 for winners versus 
losers, but this difference was nonsignificant. 

Three individual differences in game re- 
sponse were explored in the present experi- 
ment: action orientation towards stress, a be- 
havior pattern called Type A, and sex mem- 
bership. The Type A behavior pattern, de- 
scribed by two cardiologists (Friedman & 
Rosenman, 1974), includes intense competi- 
tiveness, high achievement drive, impatience, 
and intense facial and vocal mannerisms. The 
more unhurried and relaxed person is referred 
to as Type B. In predictive studies, Type A’s 
have had from two to five times the incidence 
of ischemic heart disease as Type B’s (Rosen- 
man, 1974; Rosenman et al., 1966). Sex dif- 
ferences in game behavior are often incon- 
sistent (Davis, Laughlin, & Komorita, 1976), 
but males seem to be more concerned with 
winning than females (Komorita, 1965). Be- 
cause the present experiment was conducted 
before the appearance of the Blascovich et al. 
(1978) study, no prediction of an interaction 
between sex membership and competitiveness 
in heart rate was made. It was simply felt that 
because of male-female differences in aggres- 
sion training, coded in part in sex role stereo- 
types surrounding aggression (Frodi, Ma- 
caulay, & Thome, 1977), social competition 
has a different emotional significance for men 
and women and that this would influence 
cardiovascular responses during a mixed-mo- 
tive game. Specifically, it was predicted that 
when subjects were paired with a competi- 
tive opponent, cardiovascular arousal would 
be greater for males than females, greater for 
Type A’s than Type B’s, and greater for ac- 
tion-oriented subjects than passive-oriented 
subjects. The opponent for all subjects was a 
male confederate of the experimenter. Sub- 
jects appeared to play the game against the 
confederate but actually played against a 
laboratory computer that was programmed to 
be either very competitive or very coopera- 
tive. An electrocardiogram and blood volume 
pulses in a finger were computer monitored 
during the game. 


Figure 1. Matrix representation of payoffs. Rows 
give the subject’s choice and columns the confeder- 
ate’s choice (cooperate or compete). (The 
first number in each cell is the payoff in points to 
the subject. The second number is the payoff to the 
opponent [confederate].) 


Method 


Subjects 


The subjects were 16 male and 16 female Michi- 
gan State University students enrolled in intro- 
ductory psychology courses. Subjects of each sex 
were randomly assigned to treatment conditions. 


Payoff Matrix 


All subjects played against a first-year male medi- 
cal student (confederate), using the payoff matrix 
shown in Figure 1. 
The payoff structure was designed to make it 
to cooperate, tempting to compete, and more 
palatable to give in to the opposing player's competi- 
tiveness than is true in the Prisoner's Dilemma game. 
The major force of the game lies in the direction of 
a struggle for the “compete” response, but the asym- 
metric outcome is not as “unfair” to the loser as it 
is in Prisoner's Dilemma. Subjects were paid 2¢ 
per point. 


Experimental Setting 


A laboratory computer (PDP 11/40 system) presented information to sub- 
jects, paced the game, collected behavioral and 
physiological responses, and reduced the data after 
each subject was run. An equipment room and a 
subject room were connected by one-way mirrors 
and an intercom system. The subject and confederate 
sat side by side, 2 feet (61 cm) apart, separated by 
a curtain that dropped from the ceiling to the seat 
of the chair. The two players interacted by pressing 
buttons on a response panel placed in front of them. 
The computer “read” these button presses, decoded 
them, and indicated the players’ moves (behaviors) 
on a television receiver placed so that both players 
could see it. Electrophysiological signals were re- 
corded continuously from the subject and sampled 
intermittently at critical times by the computer (de- 
scribed below). 


Procedure 


After brief orienting instructions, electrodes and 
transducers were applied to the subject and the con- 
federate. Players were handed typed instructions, 
taken from Rapoport and Chammah (1965) with 
slight modifications. A short rest was followed by 
a 10-trial practice (Prisoner's Dilemma) game to 
familiarize players with the procedures. A second 
brief rest was followed by new instructions and 45 
plays of the game using the payoffs shown in 
Figure 1. The new instructions indicated that points 
were now worth 2¢ each. Following the game, the 
subject filled out two questionnaires (described be- 
low) and was debriefed and paid. 

A play was initiated by display of the message 
“MAKE DECISION” on the TV screen, followed by a 
message “MAKE RESPONSE,” which terminated after 
both subjects responded. Players responded by press- 
ing either a green (cooperate) or red (compete) but- 
ton. After the subject responded there was a 3-5-sec 
delay before the outcome of the play was displayed 
for 10 sec. The display included the play number, 
players’ choices, players’ payoffs, and cumulative 
gains or losses. The interplay interval was random 
over the range of 10-20 sec. 

The confederate pressed buttons during the games, 
but the button presses were not read by the com- 
puter. The computer followed a preprogrammed 
strategy to play all subjects 50% cooperative in a 
fixed play sequence during the practice game. The 
computer played 80% cooperative in a cooperative 
strategy condition and 20% cooperative in a com- 
petitive strategy condition during the game that fol- 
lowed the practice game. Eight males and eight fe- 
males were assigned to each strategy condition. 
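
The programmed strategies described above can be illustrated with a minimal sketch. The article does not report how the fixed play sequence was constructed, so the seeded shuffle, the function name `programmed_sequence`, and the seed value are assumptions made only for illustration.

```python
import random


def programmed_sequence(n_plays, pct_cooperative, seed=0):
    """Return a fixed list of 'C' (cooperate) / 'D' (compete) moves.

    Hypothetical helper: the ordering used in the experiment is not
    reported, so a seeded shuffle stands in for the fixed play sequence.
    """
    n_coop = round(n_plays * pct_cooperative / 100)
    moves = ["C"] * n_coop + ["D"] * (n_plays - n_coop)
    # The same seed yields the same sequence for every subject.
    random.Random(seed).shuffle(moves)
    return moves


practice = programmed_sequence(10, 50)      # 10-trial practice game, 50% cooperative
cooperative = programmed_sequence(45, 80)   # cooperative strategy condition
competitive = programmed_sequence(45, 20)   # competitive strategy condition

print(cooperative.count("C"), competitive.count("C"), practice.count("C"))
# prints: 36 9 5
```

Under these assumptions the computer cooperates on 36 of 45 plays in the cooperative condition and on 9 of 45 plays in the competitive condition.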


Physiological Measures 


Physiological responses were computer sampled 
every third play during the game, starting with 
Play Number 3. Three 6-sec samples were taken on 
each play sampled: (a) a prestimulus sample (PRE) 
prior to onset of the message “MAKE DECISION,” 
(b) an anticipatory response sample (ANT) during 
the message “MAKE DECISION,” and (c) an outcome 
sample (OUT) during display of the outcome mes- 
sage. Samples b and c were initiated 1 sec after 
onset of the messages. 

1. Electrocardiogram. A single lead (Va) was re- 
corded. The output of a Grass 7P3 cardiotachometer 
was sampled by the computer at 10 Hz and con- 
verted to heart rate in beats per minute (bpm). 

2. Digital blood volume pulse. Relative changes 
in blood volume pulse were recorded using a trans- 
missive photoplethysmograph transducer placed on 
the first phalanx of the left-hand little finger. The 
output of a driver amplifier coupled to a Grass 
preamplifier was computer sampled at 20 Hz. A 
computer routine for the analysis of blood volume 
pulses computed the trough and peak of the pulse 
associated with each heart beat. The term “pulse am- 
plitude” refers to peak-minus-trough values. Relative 
digital pulse amplitude (DPA; mean DPA during 
ANT and OUT samples divided by mean DPA dur- 
ing PRE sample) will be reported. 

3. Respiration. Tracings of respiratory excursions 
were recorded in order to detect the interference with 
cardiovascular activity caused by respiratory ma- 
neuvers. The tracing itself was not scored. 
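
The sampling schedule and the relative pulse amplitude measure described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the amplitude values are invented rather than taken from the experiment, and the ratio is assumed to be computed separately for the ANT and OUT samples.

```python
def sampled_plays(n_plays=45, first=3, step=3):
    """Plays on which physiological samples are taken (every third play from Play 3)."""
    return list(range(first, n_plays + 1, step))


def relative_dpa(pre, post):
    """Mean pulse amplitude of a poststimulus sample divided by the PRE mean.

    Each argument is a list of peak-minus-trough values, one per heart beat.
    """
    mean_pre = sum(pre) / len(pre)
    mean_post = sum(post) / len(post)
    return mean_post / mean_pre


plays = sampled_plays()
print(len(plays), plays[0], plays[-1])  # prints: 15 3 45

# Invented amplitudes in arbitrary units, for illustration only.
pre = [2.0, 2.5, 1.5]
ant = [1.5, 1.75, 1.25]
out = [2.0, 3.0, 2.5]
print(relative_dpa(pre, ant), relative_dpa(pre, out))  # prints: 0.75 1.25
```

Sampling every third play of a 45-play game gives the 15 sampled trials over which the heart rate measures are averaged. A ratio below 1 indicates a smaller pulse amplitude than in the prestimulus sample.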


Questionnaires 


Two questionnaires were administered at the € 
of the experiment: a student version of the Jen 
Activity Survey for Health Prediction (JAS) 
the Habit Survey of Nervous Tension (HSNT). 1 
JAS is a set of items sampling major componen y 
the Type A behavior pattern (competitiveness, f 
patience, time urgency, etc.) developed by a 
(1965). In a validation study of 2,960 men, a J 
agreed with A-B classifications based on a CM 
interview conducted 3 to 5 years earlier in 13 
the cases (Jenkins, Zyzanski, & Rosen 
A version of the JAS adapted for college sí 
by Krantz, Glass, and Snyder (1974) was 
istered. The HSNT is a checklist of cognitive, 
ioral, emotional, and autonomic responses to SUT: 
developed by Thomas and Ross (1963). A E 
dicating the degree of action orientato f 
stress was derived by scoring the following fv i 
in the direction indicated: depressed f a 
increased activity (+), decreased activity ( Be. 
creased urge to sleep (—), and urge to be by ont 
and get away from it all (—). 


Results 


The mean percentages of competitive © 
sponses (red button presses) of E A 
females playing against cooperative an 
petitive programmed strategies were 
and 64% (males) and 57% and 62% 
males), respectively. The percentages i 
computed over 45 plays of the game i 
analysis of variance of percentages of com- 
petitive responses indicated that neither main 
effect (sex and strategy) nor the interaction 
was statistically significant at a 5% alpha 
level. 

Physiological responses during the pre- 
stimulus (PRE) period were chosen as refer- 
ence points for evaluating the effects of 
stimuli during the anticipatory and outcome 
periods surrounding a behavioral choice. A 
two-way (Sex × Strategy) analysis of vari- 
ance of PRE period heart rates, averaged 
over 15 trials, was computed. The sex and 
strategy main effects were not significant at 
a 5% alpha level, but the interaction was 
significant, F(1, 28) = 6.01, p < .02. Females 
playing against the competitive prepro- 
grammed strategy had larger PRE heart rates 
(84 bpm) than females playing against the 
cooperative strategy (78 bpm); the opposite 
was true for males, for whom the heart rates 
were 70 and 81 bpm, respectively. 

Two-way analyses of covariance of maxi- 
mum heart rate, minimum heart rate, and 
heart rate range (maximum minus minimum) 
during the anticipatory and outcome periods 
were computed, with mean PRE heart rate 
the covariate. Measures were averaged over 
trials before the analysis, and tests were two- 
tailed. Again, the main effects were not sig- 
nificant, but there was a significant Sex × 
Strategy interaction for the maximum heart 
rate, F(1, 27) = 7.50, p < .02, and heart rate 
range, F(1, 27) = 6.94, p < .02, during the 
anticipatory periods, and for the minimum 
heart rate, F(1, 27) = 5.43, p < .03, and 
heart rate range, F(1, 27) = 4.34, p < .05, 
during the outcome period. The adjusted heart 
rates, displayed in Figure 2, indicate that fe- 
males who played against the competitive 
Strategy accelerated more prior to making a 
behavioral choice and decelerated more when 
the results were displayed than females who 
played against the cooperative strategy; the 
opposite was true for males. A significant 
Sex × Strategy interaction appeared in within-trial 
heart rate responses as well as between-trial 
heart rate levels, even when the 
former were adjusted for group differences in 
the latter. 

Figure 2. Comparison of cardiovascular activity of 
males and females during anticipatory (ANT) and 
outcome (OUT) periods surrounding behavioral 
choices while playing against either a cooperative or 
competitive programmed strategy. Panels show maximum 
heart rate, minimum heart rate, and pulse amplitude (%) 
by trial period under each strategy. (Each data point 
is based on 16 subjects. The heart rates displayed 
have been adjusted for group differences in pre- 
stimulus heart rate level. BPM = beats per minute.) 


The method for measuring blood volume 
pulses used in this experiment does not yield 
valid estimates of absolute blood volume. The 
prestimulus period measures were therefore 
not analyzed, since only relative changes from 
an unstimulated to a stimulated state are 
physiologically meaningful. Two-way analy- 
ses of variance of percentage changes in pulse 
amplitude from the prestimulus period to the 
anticipatory and outcome periods were com- 
puted. Neither the main effects nor the in- 
teraction was significant. 

Three behavioral measures (percentage of 
competitive responses, Type A behavior 
scores, and action-oriented coping scores) 
and six cardiovascular measures (maximum 
heart rate, minimum heart rate, and digital 
pulse amplitude during the anticipatory and 
outcome periods) were intercorrelated across 
Table 1 


Competitive Cooperative 
% Com- Action- % Com- Action- 
petitive Type A oriented petitive TypeA ori 
Measure responses behavior coping responses behavior 
Anticipatory period 
Max. heart rate 00 —.05 -15 22 
Min. heart rate .00 07 09 s39 
Digital pulse amplitude 22 Ase 24 —.21 
Outcome period 
Max. heart rate 20 Aj 61° 43° 
Min. heart rate .22 As* .59* —.20 
Digital pulse amplitude 2) 40 —,58° —,.28 


*p <05. 


subjects in each strategy condition. The post- 
stimulus heart rate measures were “adjusted” 
for prestimulus heart rate levels prior to the 
correlation analysis by computing the devia- 
tion of poststimulus heart rates from a pre- 
stimulus-poststimulus line of regression. The 
results appearing in Table 1 may be sum- 
marized as follows. First, the more the sub- 
ject competed during play against the co- 
operative program the greater his/her heart 
rate acceleration when the outcomes were dis- 
played. Second, the higher the subject’s Type 
A questionnaire score, the larger his/her heart 
rate response when outcomes were displayed 
during play against a competitive strategy 
(i.e., the higher the subject’s maximum and 
minimum heart rate following an adjustment 
for the subject’s mean prestimulus heart rate). 
Third, subjects who reported having an ac- 
tion orientation towards stressful life events 
also had larger maximum and minimum heart 
rate responses when outcomes were displayed 
during play against the competitive strategy. 
The three behavioral measures themselves 
were not significantly correlated; the largest 
intercorrelation was .11. 
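
The adjustment of poststimulus heart rates for prestimulus levels can be sketched as a residual from a least-squares line. The code below is a minimal illustration with a hypothetical function name and invented heart rates; it is not the routine used in the experiment.

```python
def regression_residuals(pre, post):
    """Deviation of each poststimulus value from the line post = a + b * pre."""
    n = len(pre)
    mean_pre = sum(pre) / n
    mean_post = sum(post) / n
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = mean_post - slope * mean_pre
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


# Invented heart rates (bpm) for four hypothetical subjects.
pre = [68.0, 72.0, 76.0, 80.0]
post = [70.0, 77.0, 79.0, 86.0]
adjusted = regression_residuals(pre, post)
print(adjusted)  # prints: [-0.5, 1.5, -1.5, 0.5]
```

The residuals sum to zero and are uncorrelated with the prestimulus values, so a correlation between an adjusted heart rate and a behavioral measure cannot be attributed to differences in prestimulus level.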


Discussion 


Subjects in this experiment may be clas- 
sified as either “winners” or “losers” on the 
basis of the strategy played against them. 


Correlations Between Behavioral and Cardiovascular Measures 
Strategy condition 


With two exceptions, subjects who played 
against the cooperative strategy won the 
game (gained more points than the confed- 
erate), and subjects who played against the 
competitive strategy lost the game. Then, the 
unexpected finding of a Sex × Outcome in- 
teraction in heart rate agrees with the Blasco- 
vich et al. (1978) results. For both a pregame 
period (Blascovich et al.) and pretrial periods 
during the game (present experiment), win- 
ning was associated with faster pumping of 
blood and losing with slower pumping for 
males; the opposite was true for females. Fur- 
thermore, the interaction continu a 
trials, male winners having larger heart ral a 
celerations prior to making behavior ‘hel 
and greater slowing of the heart when ed 
comes of plays were displayed pa “ul 
and again the opposite was true for ee 
Males seemed to find play against H6 
operative program, during which they 
more “stimulating” (eliciting 4 larger 
mum heart rate response and a gear 
of response) than play against the A 
tive program, during which they lost; 
opposite was true for females. How 
these rather surprising results be i 
Males and females may have ee 
the one-on-one competitive game di g 
Males may have focused more 0n be Ff 
(Komorita, 1965) and females more “ 
chances of losing (e.g, winning mY ~ 
been more exciting to males and losing more 
fear provoking to females). Frodi et al. 
(1977) comment on sex differences in affec- 
tive response to aggressive cues in their re- 
cent extensive review as follows: “Some evi- 
dence suggests that men and women react 
differently to external aggressive cues and 
provocation. . . . What may be anger-provok- 
ing for men may be anxiety-provoking for 
women” (p. 634). 

Sex differences in dominance-subordination 
relations are one of the foci of the current 
sociobiology debate (e.g., Tiger & Fox, 1971; 
Wilson, 1978). Viewed this way, heart rate 
responses were greatest when the males were 
dominant (winning) and the females were 
subordinate (losing) in this experiment. 
Whether men and women are culturally or 
genetically conditioned to respond in such 
ways is one of the issues in the debate. 
Blascovich et al. (1978) mentioned that men 
and women are socialized differently for 
competition and have different degrees of 
familiarity with competitive contests. Socio- 
biologists point out that the hunting-gather- 
ing way of life, in which men hunted wild 
animals and women gathered wild plants, was 
a human universal for more than a million 
years (Lee & DeVore, 1976). Surely this 
major adaptation over such a long period of 
time, with its sharp sexual division of labor, 
exerted different selection pressures on males 
and females. Thus, it is not surprising that 
men and women attach a different socioemo- 
tional significance to modern equivalents of 
the prehistoric hunt such as sporting con- 
tests (and mixed-motive games?), or so the 
argument goes (see Tiger & Fox, 1971). 

An explanation of the results based on sex 
role stereotypes and conformity to social 
norms is also worth considering. In Western 
cultures males are expected to be aggressive 
and females to be nonaggressive. It may have 
been more difficult (conflictual, distressing) 
for the males interacting with a friendly con- 
federate, and the females interacting with an 
unfriendly confederate, to perform sex role- 
expected behaviors than was true of their 
same-sex counterparts. Thus, the subjects who 
interacted with a confederate whose behavior 
was noncomplementary to their own sex role- 
expected behavior had larger heart rate re- 
sponses than did subjects who interacted with 
a “complementary” confederate, regardless of 
sex of subject. Others have shown in modified 
Asch situations that pressures encouraging 
deviations from social norms or expectations 
produce physiological arousal (Back & Bog- 
donoff, 1963; Costell & Leiderman, 1968). 
Regardless of explanation, the difference be- 
tween males and females in heart rate re- 
sponse in a competitive situation in the ab- 
sence of a significant behavioral difference is 
quite interesting. 

The nonsex individual differences in be- 
havior were most clearly related to cardio- 
vascular changes during the display of out- 
comes against a competitive opponent. Sub- 
jects with higher Type A questionnaire scores 
or action-orientation scores tended to have 
larger heart rate responses during this period. 
It was as though, for these subjects, the “ac- 
tion” was not yet over when the outcome was 
displayed against a competitive opponent; the 
heart was still preparing for action. The 
larger heart rate response toward a competi- 
tive opponent of subjects with high Type A 
questionnaire scores is in agreement with 
Type A behavior theory (Friedman & Rosen- 
man, 1974), whereas the lack of correlation 
between the questionnaire and the actual com- 
petitiveness of subjects during the game is 
not. A recent experiment in our laboratory 
showed that subjects taken from the extremes 
of the questionnaire distribution do differ in 
competitiveness in a mixed-motive game (Van 
Egeren, 1979). 

Interpretation of the present results for 
sex is limited by the fact that both males and 
females played against a male confederate. 
Subjects in the Blascovich et al. ex- 
periment, however, were paired male-male 
and female-female and also exhibited a Sex 
× Outcome interaction in heart rate. It is 
worth noting that the difference between the 
female winners and losers in heart rate re- 
sponse was larger in the present experiment 
than in the Blascovich et al. experiment. Thus, 
the effect of females playing against an op- 
posite-sex opponent appeared to be to ac- 
centuate the interaction in heart rate re- 
ported in both experiments. 


Conclusions 


First, men and women differed in cardiac 
response to someone else’s aggressiveness, 
women responding more strongly to a com- 
petitive opponent and men more to a co- 
operative opponent. This suggests that com- 
petitive mixed-motive games have a different 
sociophysiological significance for men and 
women. 

Second, cardiovascular events during mixed- 
motive games may help clarify motivational 
aspects of these games and aid in testing hy- 
potheses about social and interpersonal con- 
tributions to heart disease. 


Vocal Interruptions in Dyadic Communication 
as a Function of Speech and Social Anxiety 
Interruptions have been defined as a breach of the “turn-taking” contract in 
interpersonal communication. The relation between a speaker’s personality and 
his or her propensity to interrupt was examined in 30-min unstructured con- 
versations for 36 dyads (12 male, 12 female, and 12 mixed sex). The following 
predictions were made: (a) Interruptive behavior is inversely related to speech 
anxiety and positively related to confidence as a speaker; (b) interruptive be- 
havior is inversely related to social anxiety (avoidance-distress; fear of nega- 
tive evaluation). A stepwise multiple regression analysis was performed, con- 
the conversational partner’s personality 
r’s use of back-channel responses. These 


Ervin-Tripp (1964, 1969) and Hymes 
(1972) have proposed a taxonomy of the 
“situational determinants” of speech choice, 
with situational determinants being primarily 
defined as either endogenous or exogenous. 
Many studies (see Giles & Powesland, 1975, 
for a review) have shown speech modifica- 
tions to be affected by topic, general context, 
formality-informality, and so forth, thus dem- 
onstrating the influence of external factors. 
However, the influence of internal variables, 
particularly personality states, on the use of 
certain patterns of speech has not been re- 
searched to any great extent. The few studies 
that have directly examined the relation of 
personality to conversational style have used 
form, as opposed to content, in constructing 
measures of vocal behavior. Markel, Bein, 
Campbell, and Shaw (1976) have shown 
amount of speech to be positively related to 
extroversion. In a series of studies, Natale 
(1975a, 1975b, 1976b) has demonstrated that 
the magnitude of engaged “matching” (model- 
ing) for response latency, pause duration, and 
vocal intensity is directly related to a per- 
son’s need for social approval (Social Desira- 
bility scale, Crowne & Marlowe, 1960). The 
general purpose of the present investigation 
was to further explore the relation of person- 
ality characteristics to speech behavior in an 
unconstrained and freewheeling conversation. 


To pursue this research goal, some salient 
aspect of conversational behavior should be 
examined. Investigators have suggested that 
alternation of speakers is an integral part of 
dyadic communication that is governed by 
rules (Jaffe & Feldstein, 1970; Schegloff, 
1968; Yngve, 1970). Empirical support of the 
conversational “turn-taking” mechanism has 
been provided by Duncan (1972, 1975), who 
demonstrated that behaviors in several com- 
munication modalities actively served as turn- 
taking signals. A relevant index of speaker 
switching is the interruption. At first glance, 
an interruption might be construed as always 
representing a battle for the conversational 
floor, but some evidence runs counter to that 
notion: (a) Stephenson, Ayling, and Rutter 
(1976) have reported that visual access dur- 
ing conversation promotes spontaneity and is 
associated with increased speech interruptions 
(Rutter & Stephenson, 1977), and (b) Gal- 
lois and Markel (1975) have provided evi- 
dence suggesting that speech interruptions 
may have different psychological relevance 
(e.g., dominance or discomfort) during differ- 
ent phases of a conversation. In other words, 
“it would be a mistake . . . to infer that each 
interruption event is a miniature battle for 
ascendancy” (Meltzer, Morris, & Hayes, 
1971, p. 392). On the other hand, it may 
serve heuristic purposes to consider interrup- 
tions as a breach of the turn-taking process 
(Schegloff, 1968). If simultaneous speech is 
viewed as a violation of norms regulating con- 
versational exchange, it can be expected that 
anxiety associated with interpersonal com- 
munication would be inversely related to an 
individual’s propensity to interrupt. The pur- 
pose of the present study was to test this 
“commonsense” hypothesis in an attempt to 
explore the psychological factors contributing 
to interruption, as a salient form of speech
behavior. 

There have been several previous studies 
that attempted to relate simultaneous speech
to personality, but the results are not con- 
clusive. A recent study by Feldstein, Ben- 
Debba, and Alberti (Note 1) has shown that 
the frequency of initiated simultaneous speech 
is a stable and characteristic vocal feature of 
an individual. Gallois and Markel (1975) 
published an experiment that was purported 
to examine the relation of social personality 
to conversational style. However, the only sig- 
nificant finding of the study was that the fre- 
quency of successful interruptions changed 
over the course of spontaneous conversations, 
and nothing was reported concerning the per- 
sonality of the interrupter. We know of only 
two previous studies that directly assessed 
the relation between personality and conversa- 
tional style for simultaneous speech. Feld- 
stein, Alberti, BenDebba, and Welkowitz 
(Note 2) used Cattell’s Sixteen Personality
Factor Questionnaire (16PF) to measure dimensions
of personality and found that “persons
who are relaxed, complacent, secure, and
not positively dependent on the approval of 
others tend to initiate more simultaneous 
speech than those who are generally ap- 
prehensive, self-reproaching, tense, and frus- 
trated.” A recent study by Ferguson (1977) 
examined the relation of dominance to the 
propensity to interrupt and found that inter- 
ruptive speech was not affected by the domi- 
nance measure. The results of Feldstein et 
al.’s study (Note 2) are inconclusive for two 
reasons: (a) The personality descriptions 
were based on factor scores from the 16PF, 
yet it has been emphasized by Bourchard 
and Porer (cited in Buros, 1972, p. 329) that
because of weak predictive validity, personality
assessment from Cattell’s scales or profile
interpretation is not desirable; (b) the
results of Feldstein et al. were obtained from
an uncross-validated regression analysis of
only 24 subjects, thus obscuring the generalizability
of the findings.

To sum up, Feldstein et al.’s (Note 2) 
data do suggest that tense and apprehensive 
people are less likely to initiate simultaneous 
speech. This tentative finding is consonant 
with the hypothesis of the present study: The 
more fear individuals have concerning their 
speaking role (i.e., speech anxiety), the less
likely it is that they will interrupt. This no- 
tion was tested in unconstrained conversa- 
tions of previously unacquainted individuals 
so that “task” factors were minimized. We 
also “covaried out” the effects of potential
moderator variables (speaking style and personality
of dyadic partner) so that the relation
of an individual’s personality to interruption
behavior could be distinctly assessed.


Method 


Subjects 


The subjects were 72 students (ages 18 to 20) of
Ohio University enrolled in introductory psychology
courses who volunteered and who received additional
credit for their participation. The subjects were randomly
paired: 12 male dyads, 12 female dyads, and
12 mixed-sex dyads. It was also a necessary condition
that the dyadic partners not be previously



SPEECH INTERRUPTIONS AND ANXIETY 


acquainted with each other and that English be their
native language.


Personality Measures 


Self-reports of speech anxiety. Four self-report
measures were selected to assess anxiety during con- 
versation and the generalization of anxiety to other 
social situations. 

1. An overall self-report rating of General Speech 
Anxiety on a scale from 1 (no anxiety) to 9 (ex- 
treme anxiety). This simple measure has been success- 
fully used by Fremouw and Harmatz (1975), and 
because of its face validity we deemed it appropriate. 

2. The Personal Report of Confidence as a Speaker 
(Paul, 1966) requires the subject to complete 30 
true-false items that assess the respondent’s feelings, 
thoughts, and behavior during the most recent speech. 
This scale has been shown to be both reliable and 
valid (Paul, 1966, p. 48). 

3. The Social Avoidance and Distress scale consists
of 28 items that assess an individual's generalized
anxiety in many interpersonal situations. Watson
and Friend (1969) demonstrated this scale to have
average reliability (r = .85) and to predict social-avoidance
behavior.

4. The Fear of Negative Evaluation scale (30 
items) measures the magnitude of a subject’s concern 
about other people’s evaluations of him. Watson and 
Friend (1969) showed this scale to have acceptable 
reliability (r = .72) and to predict social-avoidance
behavior. 

All four of these measures can be used to assess
speech anxiety (Fremouw & Harmatz, 1975); however,
it should be noted that the speaker anxiety and
confidence measures, Scales 1 and 2, evaluate anxiety
directly associated with the speech act, whereas Scales
3 and 4 assess general anxiety associated with the
communication situation.

Social desirability. A measure of the need for approval
was obtained by using the Marlowe-Crowne
Scale (Crowne & Marlowe, 1960). Although not directly
related to the main hypothesis of the present
study, this scale was administered because prior research
(Natale, 1976b) suggested that an individual’s
need for social desirability was related to his propensity
to interrupt.


Definition of Speech Interruption Behavior 


For the purposes of the present study, an interruption
was defined as the occurrence of simultaneous
speech and was assigned to the participant who
initiated speech while not possessing the “conversational
floor.” It should be noted that for any individual,
possession of the conversational floor begins
with the first sound that person utters alone
and ends with the first unilateral sound uttered by
the other participant. This definition of speech interruption
has been used by other investigators (Jaffe
& Feldstein, 1970; Meltzer et al., 1971).
Jaffe and Feldstein (1970) summarized a decade
of interview research that clearly demonstrated the
ability of an analogue-to-digital computer (Automatic
Vocal Transaction Analyzer, AVTA) to measure the
noise and silence pattern of recorded dialogue. Using
AVTA, speech was measured at 300-msec intervals,
and a separate analysis of speech parameters was
provided for each subject in a dyad. The interested
reader is referred to Jaffe (1970) for a sophisticated
discussion of the methodology and validity of speech
measures derived from the AVTA system. There is,
however, an issue of measurement that deserves note.
There is a possibility that brief utterances, such as
“yes” or “uh huh,” which the listener intersperses
during the speaker’s pauses and speech, may be classified
by the computer (AVTA) as interruptions.1 These
brief utterances (back-channel responses) are easily
recognized and have been shown to function as the
opposite of a speaker-switching device, in that they
appear to ensure that the speaker who is holding
the conversational floor continues to do so (Duncan,
1975). These vocal back-channel signals are not attempts
to change the conversational floor and therefore
should not be included in the analysis of interruptions
(Meltzer et al., 1971, p. 395).2 To this end,
back-channel responses were scored by a trained rater.
Each recorded dialogue was heard twice by this
judge so that each conversationalist's back-channel
responses were scored separately. Concerning the reliability
of the trained judge’s scoring of back-channel
responses, a pilot showed the rater to have a 91%
agreement with one of the experimenters (MN). It
should be noted that back-channel responses were the
only speech measure not obtained from AVTA.

Total speech interruptions. An interruption was
defined as the occurrence of simultaneous speech and
was assigned to the participant who initiated speech
while not possessing the conversational floor. Possession
of the conversational floor begins with the
first sound a subject utters alone and ends with the
first unilateral sound uttered by the conversational
partner. As a measure, total speech interruptions does
not discriminate between interruptions that result in
a change of the conversational floor and those in
which the interrupted speaker retains possession of
the conversational floor.

Successful speech interruptions. A speech interruption
that results in a speaker switch is a successful
interruption. This measure is assigned to the speaker
who did not originally possess the conversational floor.

Percentage of successful interruptions. It should
be noted that unsuccessful and successful interruptions
summate to total speech interruptions. The


1 These vocalizations have been variously called
“listener responses” (Dittman & Llewellyn, 1967),
“attention signal” (Kendon, 1967), and back-channel
responses (Duncan, 1975; Yngve, 1970). In this
[...] we employ the latter term.

2 The authors thank Robert Krauss, Department
of Psychology, Columbia University, for clarifying
this issue.
proportion of total interruptions that were successful
(i.e., that resulted in a speaker switch) was calculated.

Mean duration of interruption. Jaffe and Feldstein’s
(1970) AVTA allows for the calculation of duration
of interruption in .1-sec units. The mean duration
was calculated on each subject’s total speech
interruptions.

Total speech time. The total vocalization time of
each conversationalist was calculated, via AVTA, using
300-msec recording intervals.
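
The floor-possession rule and the interruption measures defined above can be illustrated in code. The Python below is a minimal sketch, not the AVTA procedure. It assumes that each speaker's vocal activity has already been reduced to one on/off value per 300-msec interval, and it treats an interruption as successful when the interrupter is the next person to speak alone. The names `Interruption`, `find_interruptions`, and `summarize` are hypothetical, and the two sequences are made-up data. Back-channel responses are not removed here; in the study they were scored by a rater.

```python
from dataclasses import dataclass

INTERVAL_SEC = 0.3  # sampling interval stated in the text (300 msec)


@dataclass
class Interruption:
    interrupter: str  # "A" or "B": the speaker who did not hold the floor
    length: int       # consecutive intervals of simultaneous speech
    successful: bool  # True if the interrupter is the next to speak alone


def find_interruptions(a, b):
    """Scan two equal-length on/off sequences and return interruption events."""
    floor = None  # current floor holder; None until someone speaks alone
    events = []
    t, n = 0, len(a)
    while t < n:
        if a[t] and b[t]:
            start = t
            while t < n and a[t] and b[t]:
                t += 1
            # Look ahead to the next interval in which exactly one person speaks.
            u = t
            while u < n and bool(a[u]) == bool(b[u]):
                u += 1
            if floor is not None:
                interrupter = "B" if floor == "A" else "A"
                next_alone = None if u == n else ("A" if a[u] else "B")
                events.append(
                    Interruption(interrupter, t - start, next_alone == interrupter)
                )
            continue
        if a[t] or b[t]:
            floor = "A" if a[t] else "B"  # a unilateral sound takes the floor
        t += 1
    return events


def summarize(events, speaker):
    """Return total, successful, percentage successful, and mean duration (sec)."""
    mine = [e for e in events if e.interrupter == speaker]
    total = len(mine)
    successful = sum(e.successful for e in mine)
    pct = 100.0 * successful / total if total else 0.0
    mean_sec = INTERVAL_SEC * sum(e.length for e in mine) / total if total else 0.0
    return total, successful, pct, mean_sec


# Made-up example: A holds the floor; B interrupts twice and succeeds once.
a = [1, 1, 1, 1, 1, 1, 0, 0]
b = [0, 0, 1, 0, 1, 1, 1, 1]
print(summarize(find_interruptions(a, b), "B"))
```

Keeping the event list separate from the summary makes it easy to drop events that a rater has marked as back-channel responses before the totals are computed.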


Apparatus 


The conversation between the pair of subjects took 
place in one of two conditions: (a) with the conver- 
sationalists in the same room (10 X 18 ft, or 3X5 
m), sitting 15 ft. apart in half-view of each other 
(from the waist up); (b) with the subjects in sepa- 
rate rooms but in half-view of each other, sitting 
2 ft. from a full window and directly facing each 
other. Approximately half of the conversations were 
recorded in each condition.3

All subjects wore a headset consisting of a Kellogg 
earset and a Sony omnidirectional microphone that 
rested approximately 1 in. from the mouth of the 
subject. The close proximity of the microphone al- 
lowed us to maintain a low gain on the recorder, 
the result of this procedure being that each conversationalist’s
speech was clearly recorded without
spillover from his dyadic partner. The conversational- 
ists heard each other monophonically via the earset. 
The recordings were made on a Sony stereo tape 
recorder that was out of view of the subjects. 

As stated, Jaffe and Feldstein’s (1970) AVTA sys- 
tem was used to obtain all the speech measures with 
the exception of back-channel responses, which were 
scored by an independent rater. AVTA is an analogue- 
to-digital computer that accepts stereo recordings of 
a dialogue and provides the previously described 
speech parameters for each conversationalist. 


Procedure 


Prior to the conversational task, a pair of subjects
was asked to fill out the personality measures
described earlier (25 to 40 min); then some prelimi- 
nary instructions were given to ensure proper tape- 
recording. The subjects were told that this was 
merely an experiment on interpersonal communication 
and that the only experimental task was to talk 
freely with their partner. At this point, the research 
assistant activated the tape recorder and left the 
room. Each recorded dialogue lasted exactly 30 min, 
whereupon the research assistant returned to the 
room, terminated the conversation, and explained the 
purpose of the study to the participants. 


Statistical Analysis 


The 36 dyads were randomly divided into two
equal-sized samples, with the stipulation that each
sample have an equal number of male, female, and
mixed-sex dyads. One was called the initial sample
and the other the cross-validated sample. All analyses
were first conducted on the initial sample of 18
dyads.4 In case the regression analysis (described
later) capitalized upon chance, the results were validated
using the cross-validated sample.

Stepwise regression. For the initial sample, a stepwise
regression analysis was used to evaluate the contribution
of an individual’s self-reported speech
anxiety and desire for social approval (predictors) to
the use of the following interruption behaviors (dependent
variables): total speech interruptions, successful
speech interruptions, and mean duration of
speech interruptions. A separate regression analysis
was performed for each dependent variable.

Since the purpose of the present study was to determine
the precise effect of specific personality measures
on an individual’s propensity to interrupt,
[...] the following modifying variables in the regression analysis
for the initial sample. (a) It is obvious that an individual
will have greater opportunity to interrupt
a talkative partner and less opportunity to interrupt
a taciturn partner; therefore, the effect of the
partner’s total speech time should be eliminated [...].
(b) [...] (back-channel responses) are not intended by the speaker
to indicate a desire to grab the conversational floor;
hence, any study wishing to examine interruptions
as a speaker-switching mechanism should eliminate
the role of back-channel responses (Meltzer et al.,
1971, p. 395). (c) A person perception variable may
also influence an individual’s behavior in the interpersonal
situation; therefore, the personality traits
of the dyadic partner should be taken into account
when assessing the individual's behavior. (d) It has
been suggested that the individual’s sex and the
dyad’s sex affect the use of interruptions (Zimmerman
& West, 1975); hence, these sex variables must
be taken into account.

In this light, the stepwise regression of an individual’s
personality measures to his or her interruptive
behavior was performed using the [...]
back-channel responses, the speaking time, and
sex as covariates.

Marsden, Kalter, and Ericson (1974) have pointed
out that this analysis can easily be achieved by


3 The statistical technique used in this study was
stepwise multiple regression. This regression procedure
capitalizes on the particular characteristics of
the sample; therefore, a slight variation in recording
procedures was viewed by the [...]
to provide robustness to the findings.

4 There were 18 dyads in the initial sample, but
each dyad member was treated as a [...] in the
analysis. Hence, N = 36 for the initial sample.




[...]ing the order of the predictors. Stepwise multiple
regression works by adding one variable at a time
to the regression while testing whether
the independent variables significantly increase the
prediction of the dependent variable. If one wants 
to treat certain predictors as covariates, all that is 
necessary is to force the entry of these covariates 
into the multiple regression prior to the other free- 
floating predictors (see Cohen & Cohen, 1975, for a 
detailed discussion of this technique). In other words, 
there were two sets of independent variables used in 
the stepwise regression procedure: a set of covariates 
followed by a set of predictors of theoretical value 
(variates). The order of the independent variables 
within each set of predictors was “free,” that is, 
selected by the computer to maximize prediction of 
the dependent variable. This straightforward pro- 
cedure was used in the present study. To restate, the 
multiple regression analysis tested the relation be- 
tween an individual’s personality and speech-inter- 
ruption behavior, while controlling for the systematic 
effects of sex, of the conversational partner’s personal- 
ity and amount of speech, and of the speaker’s use 
of back-channel responses. 
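
The two-set entry procedure described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the program used in the study: it uses ordinary least squares from NumPy, enters every predictor (no significance-based stopping rule), and within each set picks at each step the predictor that most increases R². The function names `r_squared` and `stepwise_in_sets`, the variable names, and the random data are all hypothetical.

```python
import numpy as np


def r_squared(X, y):
    """R^2 from a least-squares fit of y on the columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())


def stepwise_in_sets(X, y, names, sets):
    """Enter the sets in the given order; order within a set is chosen greedily."""
    entered, steps, r2_prev = [], [], 0.0
    for group in sets:
        remaining = [names.index(name) for name in group]
        while remaining:
            # Pick the remaining predictor in this set that maximizes R^2.
            best = max(remaining, key=lambda j: r_squared(X[:, entered + [j]], y))
            r2 = r_squared(X[:, entered + [best]], y)
            steps.append((names[best], r2, r2 - r2_prev))
            entered.append(best)
            remaining.remove(best)
            r2_prev = r2
    return steps


# Made-up data: two covariates forced in first, then two variates.
rng = np.random.default_rng(0)
names = ["partner_speech_time", "sex", "speech_anxiety", "need_for_approval"]
X = rng.normal(size=(36, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(size=36)
steps = stepwise_in_sets(X, y, names, sets=[names[:2], names[2:]])
for name, r2, increment in steps:
    print(f"{name:22s} R2={r2:.3f} increment={increment:.3f}")
```

Because the covariate set is exhausted before any variate is considered, each variate's R² increment is the variance it explains over and above the covariates, which is the quantity the tables report.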

Cross-validation technique. As stated earlier, a 
stepwise regression was used in the initial sample. 
This statistical procedure adds (steps) one variable 
at a time to the regression for the purpose of in- 
creasing the prediction of the criterion. When the 
computer is free to determine the order of the pre- 
dictors, it is possible that the stepwise procedure 
will capitalize on chance relationships. In the present 
study, the predictors were grouped as two sets (covariates
and variates), and the computer freely determined
their order within the set. Thus a cross-validation
was not absolutely necessary but was performed
for the purpose of testing the replicability of the 
findings. 

The regression formula (beta weights) from the
first sample was imposed on the independent variables
of the second, cross-validation, sample. This
procedure produces predicted values of the dependent
variable in the cross-validated sample. For the second
sample, a zero-order correlation between the
estimated and the actual dependent variables was
computed at each step of the regression and was
considered the cross-validated multiple regression coefficient.
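
The cross-validation step can be sketched in a few lines. This is a minimal illustration, not the original analysis: it fits raw (unstandardized) weights with an intercept on one sample, whereas the text refers to beta weights, and it evaluates only the final equation rather than every step. The function names `fit_weights` and `cross_validated_r` and the simulated samples are hypothetical.

```python
import numpy as np


def fit_weights(X, y):
    """Least-squares weights (intercept first) estimated on the initial sample."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def cross_validated_r(coef, X_new, y_new):
    """Correlate predictions from the initial-sample weights with new observations."""
    predicted = np.column_stack([np.ones(len(y_new)), X_new]) @ coef
    return float(np.corrcoef(predicted, y_new)[0, 1])


# Made-up data: two independent samples of 36 drawn from the same model.
rng = np.random.default_rng(1)
true_w = np.array([1.5, -0.8, 0.0])
X1, X2 = rng.normal(size=(36, 3)), rng.normal(size=(36, 3))
y1 = X1 @ true_w + rng.normal(size=36)
y2 = X2 @ true_w + rng.normal(size=36)

coef = fit_weights(X1, y1)
r_cv = cross_validated_r(coef, X2, y2)
print(round(r_cv, 3))
```

Since the weights are never re-estimated on the second sample, the resulting correlation cannot benefit from chance relationships in the first sample.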


Results 


Table 1 presents the initial stepwise correlations
obtained for Sample 1. A separate regression
analysis was performed for each dependent
variable. The set of covariates was
entered as predictors prior to the set of variates,
but within each set the computer was
free to order the predictors so as to maximize
prediction. The regression of each independent
variable to the dependent variable was tested
for statistical significance by an F ratio
(Cohen & Cohen, 1975, p. 107) that took into
account the effect of other independent variables.
In other words, the F test indicated
whether the increment in variance accounted
for was significant when a predictor was added.
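
As a worked illustration of this F test, the sketch below computes the ratio of an R² increment to the error variance per degree of freedom, the form given in the note to Table 1. The inputs are the Table 1 values for the speaker's need for social approval in the total-interruptions analysis (increment .045, R² = .829). Treating it as the tenth predictor entered, and evaluating R² and k at that step rather than for the full equation, are assumptions made here for illustration; the function name `f_for_increment` is hypothetical.

```python
def f_for_increment(sr2, r2, n, k):
    """F ratio for an R^2 increment, with df = (1, n - k - 1).

    sr2: squared semipartial correlation (the R^2 increment)
    r2:  R^2 of the equation containing k predictors
    n:   number of subjects
    """
    df_error = n - k - 1
    f_ratio = sr2 / ((1.0 - r2) / df_error)
    return f_ratio, (1, df_error)


# Table 1 values; k = 10 is an assumption (nine covariates plus this variate).
f_ratio, df = f_for_increment(sr2=0.045, r2=0.829, n=36, k=10)
print(round(f_ratio, 2), df)
```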

The multiple correlations obtained in Sam- 
ple 1 (Table 1) were cross-validated in a sec- 
ond sample. The cross-validation procedure 
(see Method) used all of the predictors (co- 
variates and variates), but for the sake of 
brevity, the cross-validated coefficients are 
reported (see Table 2) only for the significant 
predictors of the first sample. 


Total Interruptions 


In reference to the total amount of interruptions
initiated by a speaker in the dyadic
situation, the total speaking time of the conversational
partner was a strong positive factor
(r = .79, 63% of the variance). Obvious
logic explains this finding: the more a person
talks, the more likely it is that he or she will
be interrupted. Examination of Table 1 also
shows another covariate to have had an effect
on a speaker’s use of interruptions. It
was found that males interrupted more often
than females (correlation increased from .79
to .83; additional variance accounted for was
7%), a finding that Zimmerman and West
(1975) have also reported. There was no significant
contribution of the conversational
partner’s personality traits to the speaker's
propensity to interrupt. Of direct relevance
to the hypothesis of the present study, examination
of the variates in Table 1 shows the
speaker’s desire for social approval to be related
to the total use of interruptions.
Examination of Table 2 shows that even
after a cross-validation procedure a person's
desire for social approval was still shown to
be significantly related to his or her use of
interruptions. The relationship of partner's
speaking time and sex to rate of vocal interruptions
also withstood cross-validation (see
Table 2).


Successful Interruptions


This dependent measure is an expression 
of interruptions that result in the interrupter 
gaining the conversational floor. In this sense, 
Table 1
Stepwise Multiple Regression Relating Personality Characteristics to
Engaged Speech Interruption

                                      Multiple regression analysis
Variable^a                            R^b      R²      R² increment

Total interruptions (criterion)
  Covariates
    P's speech time                   +.790    .625    .625****
    S's sex                           -.834    .696    .071**
    P's avoidance-distress            +.859    .722    .026
    P's negative evaluation           +.863    .754    .022
    Dyad's sex                         .871    .759    .014
    S's back channel                  +.876    .769    .010
    P's need for social approval      -.884    .781    .013
    P's speech anxiety                -.885    .783    .002
    P's confidence                    +.885    .784    .001
  Variates
    S's need for social approval      +.911    .829    .045*
    S's speech anxiety                -.913    .837    .003
    S's confidence                    +.918    [illegible]
    S's negative evaluation           -.921    [illegible]
    S's avoidance-distress            -.921    [illegible]

Successful interruptions (criterion)
  Covariates
    S's back channel                  +.466    .217    .217**
    P's speech time                   +.522    .272    .055*
    P's need for social approval      +.561    .315    .043
    P's confidence                    -.592    .350    .035
    P's speech anxiety                -.600    .360    .010
    S's sex                            .612    .375    .015
    P's negative evaluation           +.616    .379    .004
    P's avoidance-distress            -.622    .387    .008
    Dyad's sex                         .623    .387    .000
  Variates
    S's speech anxiety                -.754    .568    .180***
    S's need for social approval      +.781    .611    .043
    S's negative evaluation           -.802    .643    .032
    S's avoidance-distress            -.805    .649    .006
    S's confidence                    -.806    .649    .001

Percentage of successful interruptions (criterion)
  Covariates
    P's confidence                    +.425    .179    .179*
    P's speech anxiety                -.505    .255    .077
    P's avoidance-distress            +.527    .277    .022
    P's need for social approval      -.534    .286    .008
    P's back channel                  -.538    .289    .004
    S's sex                            .541    .292    .003
    P's speech time                   [illegible]
    [illegible]                       -.54[illegible]
    Dyad's sex                         .545    .297    .000
  Variates
    S's avoidance-distress            -.696    .484    .187**
    S's need for social approval      +.712    .507    .023
    S's speech anxiety                -.716    .512    .005
    S's negative evaluation           -.716    .513    .001
    S's confidence                    +.716    .513    .000

Mean duration of interruption (criterion)
  Covariates
    P's speech time                   +.645    .416    .416****
    S's sex                            .678    .459    .044
    Dyad's sex                         .700    .489    .030
    P's speech anxiety                -.708    .502    .011
    P's confidence                    +.715    .512    .010
    P's negative evaluation           +.729    .532    .020
    P's need for social approval      -.730    .533    .001
    S's back channel                  +.731    .534    .001
    S's avoidance-distress            +.731    .534    .000
  Variates
    S's avoidance-distress            -.816    .666    [illegible]
    S's need for social approval      +.845    .713    [illegible]
    S's confidence                    [illegible]      .000
    S's negative evaluation           [illegible] .719 .000
    S's speech anxiety                [illegible]

Note. Sample 1, n = 36. A separate multiple regression analysis was
performed for each dependent variable (criterion). The ability of each
independent variable to make a nonzero contribution to R² was tested
by examining the semipartial correlation (sr) of each predictor to the
criterion as follows:

$F = \dfrac{sr^2}{(1 - R^2)/(N - k - 1)}$, with $df = 1, (N - k - 1)$,

where N = number of subjects (36) and k = number of independent
variables (Cohen & Cohen, 1975, p. 107).
a P denotes partner and S denotes speaker.
b + or - indicates the direction of the relationship of each predictor
to the dependent variable.
* p < .05.
** p < .025.
*** p < .01.
**** p < .001.


a multiple regression with this dependent measure
provides a more stringent test of the
hypothesized relations between personality
and interruptive behavior. Looking at Table
1, one sees that a person’s use of back-channel
responses was significantly related to his or
her overall rate of successful interruptions
(r = .46, 21% of the variance). It was also
the case that the conversational partner's
amount of speech was positively related to a
speaker’s use of successful interruptions (multiple
R increased from .47 to .52). The personality
characteristics of the conversational
partner were not significantly related to the
magnitude of successful interruptions used by
the speaker. Over and above the effects of covariates
on the incidence of successful interruptions,
the hypothesized effect of the speaker’s personality
was demonstrated. Table 1 shows a
speaker’s general speech anxiety to be inversely
related to his or her rate of successful
interruptions; the increase in the accounted-for
variance was rather large (18%). Table
2 shows this finding to be generalizable under
cross-validation procedures. The positive relation
of a speaker’s back-channel responses
to the frequency of successful interruptions
used by the same speaker was marginally significant
under cross-validation procedures (r
= .29, p < .07).


Percentage of Successful Interruptions 


This measure reflects the proportion of in- 
itiated interruptions that result in a speaker 
switch in favor of the interrupter. Table 1 
shows that a speaker’s percentage of success- 
ful interruptions was positively related to his 
or her conversational partner’s confidence as 
a speaker (r = .43, 18.7% of the variance). 
The speaker’s general anxiety about social 
situations (avoidance—distress) was negatively 
related to his or her propensity to effect a 
speaker switch once he or she initiated an 
interruption. In other words, a person who 
has high social anxiety is less likely to per- 
sist in his or her interruptions. It should be 
noted that the speaker’s social avoidance-distress
accounted for a substantial amount
of the variance over and above the effects of
the covariates (18.7%, see Table 1). This relation
of percentage of successful interruptions
to speaker’s avoidance-distress was also cross-validated
(see Table 2).


Mean Duration of Interruption


This measure can be conceptualized as an
index of a person’s tenacity in effecting a
speaker switch. In other words, when an individual
initiates an interruption, does that
individual prolong or shorten the simultaneous
speech situation, and how does this relate to
his or her personality? Before answering this
question, we look at the effect of the covariates
on this dependent measure. Table 1 shows
the conversational partner’s amount of speech
to be positively related to how long an individual
persists in an interruption (r = .64,
41.6% of variance, Table 1). There was a nonsignificant
trend (p < .08) for males to engage
in longer interruptions (Table 1). This
effect of sex on interruptive behavior agrees
with the findings of Zimmerman and West
(1975).


Table 2
Cross-Validated Multiple Regression Correlations Between Speech
Interruptions and Personality

Dependent variable                        Multiple regression coefficient^a

Total interruptions
    P's speech time                       [illegible]
    S's sex                               [illegible]
    S's need for social approval          .44**
Successful interruptions
    S's back channel                      .29
    S's speech anxiety                    .36*
Percentage of successful interruptions
    P's confidence                        .21
    S's avoidance-distress                .38*
Mean duration of interruption
    P's speech time                       .39*
    S's avoidance-distress                .45**

Note. Sample 2, n = 36. The product-moment correlation between the
predicted and the observed dependent variable at each step of the
regression was defined as the cross-validated regression coefficient
([...]-tailed test). Only those covariates and variates that showed
significant predictive power in the initial regression (see Table 1)
were used in the above analysis.
a P denotes partner and S denotes speaker.
* p < .05.
** p < .01.
The hypothesized effect of personality on
interruption was again supported by the re- 
gression analysis. Looking at the variates in 
Table 1, we see that an individual’s general 
social anxiety (avoidance—distress) was nega- 
tively related to the mean duration of the 
interruptions he or she initiated. The pre- 
dicted variance contributed by this variable 
was 13.2%. 


Table 3
Stepwise Multiple Regression Relating Personality to Speaking Time
and Back-Channel Responses

                                      Multiple regression analysis
Variable^a                            R^b      R²      R² increment

Total speech time (criterion)
  Covariates
    Sex                                .283    .080    .080
    P's confidence                    +.360    .129    .049
    P's avoidance-distress            -.520    .270    .141*
    P's need for social approval      +.530    .280    .010
    S's back channel                  +.533    .284    .003
    P's negative evaluation           -.534    .285    .001
    Dyad's sex                         .535    .287    .002
    P's speech anxiety                +.536    .287    .000
  Variates
    S's speech anxiety                -.707    .499    .212**
    S's avoidance-distress            +.717    .515    .015
    S's confidence                    +.745    .555    .040
    S's negative evaluation           +.747    .559    .004
    S's need for social approval      -.749    .561    .003

Back-channel responses (criterion)
  Covariates
    P's confidence                    -.349    .122    .122
    P's anxiety                       -.461    .213    .091
    P's avoidance-distress            -.528    .278    .065
    P's need for social approval      -.552    .304    .027
    Dyad's sex                         .576    .332    .028
    P's negative evaluation           -.583    .339    .007
    Sex                                .584    .341    .002
  Variates
    S's negative evaluation           +.697    .490    .149*
    S's confidence                    +.707    .500    .013
    S's speech anxiety                +.723    .506    .015
    S's avoidance-distress            +.735    .522    .022
    S's need for social approval      +.736    .522    .000

Note. Sample 1, n = 36. R² increments were tested for each predictor
by examining the significance of each semipartial coefficient (Cohen &
Cohen, 1975, p. 107) of each predictor to the dependent variable. See
note to Table 1 for further details.
a P denotes partner and S denotes speaker.
b + or - indicates the direction of the relationship.
* p < .025.
** p < .01.


Total Speech Time


The relation of personality to speaking time
was not directly [...]
of the present study. However, a multiple regression
analysis was performed for the sake
of completeness, since the information was
available. Table 3 shows that the sex and the
personality characteristics of the conversational
partner were poorly related to the total
amount of speech time of a speaker. Looking
at the variates in Table 3, it is apparent that
the subjects’ speech anxiety was inversely related
to their speech time.


Back-Channel Responses 


Although research has suggested that cer- 
tain brief interjections are “listener responses” 
(Duncan, 1975), their relation to the per- 
sonality characteristics of the listener or the 
speaker has not been examined. Data from 
the present study allowed us to address this 
question. Table 3 shows a person’s use of 
back-channel interruptions to be related 
uniquely to the personality of the individual 
who is emitting the listener response. The 
conversational partner’s personality was not 
significantly related to the use of back-chan- 
nel responses by an individual (see Table 3). 
A speaker’s use of back-channel responses was 
found to be positively related to his or her 
fear of negative evaluation; 14% of the vari- 
ance was accounted for by this variable 
(Table 3). Table 4 shows this finding to with- 
stand cross-validation. 


Discussion 
Specific Findings 


In evaluating the meaningfulness of the re- 
sults, two questions should be kept in mind. 
(a) When a predictor shows a significant relation to the
dependent variable, what is the amount of variance accounted
for? In other words, even if the addition of a variable to
the multiple regression increases R² significantly, the
increase may be quite small. (b) Does the significance of
the predictors sustain cross-validation procedures? With
these points in mind, a discussion follows.


Table 4

Cross-Validated Multiple Correlations Between Total Speech Time,
Back-Channel Responses, and Personality

Dependent variable    Multiple regression coefficient*

Concerning the total rate of vocal interruption that a
person displays (see Table 1), the amount of speaking time
of a conversational partner is a strong determinant. The
more people talk, the greater their chance of being
interrupted. This very basic finding indicates the limits
of a psychological explanation of speech interruptions. In
other words, the present study shows that roughly two
thirds of all vocal interruptions can be explained in terms
of physical opportunity as opposed to personality factors.
Within this limit, a significant but small proportion of
total interruptions was predicted by the sex of the
speaker; males interrupted more often than females. This
effect of sex on interruptive behavior verifies the earlier
findings of Zimmerman and West (1975), which showed males
to interrupt more often than females.

Total speech time
P’s avoidance-distress
S’s speech anxiety
Back-channel responses
S’s negative evaluation


−.38*
.43**


Note. n = 36. The product-moment correlation between the
predicted and the observed dependent variable at the final
step of the regression was defined as the cross-validated
regression coefficient. Only those covariates and variates
that showed significant predictive power in the initial
analysis (see Table 3) were used in the above analysis.
P denotes partner and S denotes speaker.
* p < .05.
** p < .01.


Personality characteristics of the conversational partner
did not significantly contribute to the total use of
interruptions. This finding, together with the positive
relation of a speaker’s desire for social approval to vocal
interruption behavior (see Table 1), may indicate the
inappropriateness of conceptualizing speech interruptions
as invariably representing a contest for the conversational
floor.
In the introduction we noted that speech interruptions
correlate positively with intimacy (e.g., Rutter &
Stephenson, 1977). This interpretation of speech
interruptions may explain why a person who has a high need
for social approval tends to interrupt more often.
Interruptions, rather than representing a violation of
speaker-switching roles, may serve to express joint
enthusiasm and the need for an individual to achieve this
level of communication (a measure of social desirability).
However, simultaneous speech may represent conversational
floor grabbing in the situations where successful
interruptions were examined. Support for this notion comes
from the multiple regression analysis (Table 1) that used
successful interruptions as the dependent variable: An
individual’s speech anxiety had a sizable negative relation
to the number of successful interruptions. In this light,
partial support for the original hypothesis can be stated
as follows: Vocal interruptions represent an “offense”
against the turn-taking rules and as such are negatively
related to an individual’s speech anxiety, but only in the
case of successful interruptions. This finding adds
validity to the notion that certain brief interjections
(i.e., back-channel responses) are emitted for the sole
purpose of reinforcing the listening role. The data in
Tables 3 and 4 show fear of negative evaluation to be
significantly related to the use of back-channel responses.
Specifically, a person’s fear of negative evaluation was
positively related to his or her propensity to utter brief
interruptions (i.e., back-channel responses). The positive
social function of back-channel responses is appar-


etable, thus 


dicating the need of future Ti a 


The percentage of successful interruptions
can be taken as a measure of a person’s in-

does an individual continue until he or she
takes over the conversational floor? Interest- 
ingly, the percentage measure of interruptions 
yielded a solid relation to the partner’s per- 
sonality (see Table 1). The more confident 
the partner felt about speaking, the higher 
the proportion of successful interruptions by 
the other subject (approximately 18% of the 
predicted variance was accounted for by the 
partner’s speech confidence). One possible in- 
terpretation of this finding is that the partner 
communicates a positive attitude toward in- 
terpersonal communication and thus somehow 
gives an interrupter more confidence to at- 

tempt speaker switches. Conversely, if we 

perceive a person to be highly sensitive about 

conversing, we might avoid trying tenaciously 

to grab the conversational floor. Future re- 

search will have to clarify the role of person 

perception variables in the use of simulta- 

neous speech. 

The intensity measure (percentage success- 
ful) of interruptions was strongly related to 
a speaker’s social anxiety. The more anxiety 
(avoidance-distress) a person has concerning 
social interaction, the smaller his or her pro- 
portion of successful interruptions. This find- 
ing offers direct support of the general hy- 
pothesis of the present study: The use of 
speech interruptions was negatively related to 
interpersonal anxiety.

The strong relation between partner’s
speaking time and interruption duration (see 
Table 1) is self-explanatory. The duration 
of the interruption was negatively related to 
social anxiety (avoidance-distress), thus rep- 
licating a similar relation between social anx- 
iety and the percentage of successful inter- 
ruptions. The findings presented in Tables 1 
and 2 can be summed up as follows: 

1. A significant negative relation between 
interruption and a speaker’s interpersonal anx- 
iety was demonstrated for all interruption 
measures with the exception of total inter- 
ruptions. This finding indicates that analysis 
of speech interruptions necessitates the de- 
lineation of the types of simultaneous speech. 

2. The fact that a person’s desire for social 
approval had a modest but significant posi- 
tive relation to interruptive behavior indicates 
that speech interruptions may not always 
represent contests for the conversational floor.
An interruption can also express a positive 
state of excitement and the like. 

3. The finding that males engaged in vocal
interruption more than females fully supports
the earlier research of Zimmerman and
West (1975).

4. The present findings indicate that the 
partner’s speech time was strongly related to 
a subject’s interruptive rate. The more people 
talk, the more likely they will be interrupted. 

It was also found (Tables 3 and 4) that a 
person’s speech anxiety was negatively related 
to total speaking time. To the extent that 
speech anxiety determines speaking time of 
an individual, it also affects the interruption 
rate of the conversational partner. The per- 
sonalities of the dyadic partners interact and 
influence each other’s behavior, but it was the 
speaker’s, as opposed to the partner’s, per- 
sonality that was the strongest predictor of 
vocal interruptions. 


General Theoretical Relevance 


Of general theoretical interest is the rela- 
tion between verbal behavior and personality. 
Clinical psychology has traditionally empha- 
sized the relation of verbal content to the 
presence of psychopathology (e.g., thought 
disorder, delusions, schizophrenia, subjective 
unhappiness, depression). Over the past dec- 
ade, researchers have tried to develop objec- 
tive scoring systems for verbal behavior in 
an attempt to overcome some of the problems 
of unreliability (e.g., spectral analysis of voice 
for anxiety, Smith, 1974; interaction chronography,
Jaffe & Feldstein, 1970, and Matarazzo
& Wiens, 1972). Although automation of vocal
measures has vastly improved reliability, the 
relation of these speech measures to variables 
of psychological interest has not been shown 
to be strong. This may be due to the fact that 
formal aspects of speech may represent an 
endogenous (neurolinguistic) characteristic of 
communication that is to a great degree in- 
variant. Evidence for this notion comes from 
Jaffe and Feldstein (1970), who have shown 
that speech timing of monologues and dia- 
logues can be accounted for by a Markovian 


(stochastic) process.

If speech timing of dyadic interaction is partially
predicted by a “random walk” model, then it follows that
psychological factors might only partially affect
conversational timing. This may account for the fact that
only a minor, albeit significant, proportion of speech
interruptions was affected by the personality measures used
in the present study. The stochastic nature of dialogic
speech timing may also account for Ferguson’s (1977) meager
empirical support of the long-held clinical notion that
interruptions represent dominance in family interactions.

However, there is some evidence that suggests that the
Markovian nature of dialogic speech timing (Jaffe &
Feldstein, 1970) breaks down in the course of a
conversation. Cobb (1973) showed that there were long-term
periodicities in the speech timing of individuals who knew
each other. The presence of speech rhythms obviously
violates a fundamental assumption of Jaffe and Feldstein’s
stochastic model; the transitional probabilities of the
vocal parameters cannot be assumed to be constant. It
appears that the timing of speech, and therefore
interruptions, roughly obeys stochastic principles which
are nonetheless modified by external variables
(personality, task orientations, etc.). This was also found
to be the case for gaze behavior. Natale (1976a) found the
timing of gaze behavior generally to obey Markovian
principles, but it was also observed that prolonged looks
could not be accounted for by the stochastic model.
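
The constant-probability assumption discussed above can be made concrete with a small simulation. The sketch below is only an illustration and is not Jaffe and Feldstein's model: it assumes a two-state chain (talking vs. silent) with hypothetical per-time-unit stay probabilities of .9 and .8, and the function names are invented for this example. With constant transition probabilities, the length of a talking run follows a geometric distribution with mean 1/(1 − p), so systematic departures from that pattern (such as long-term periodicities) would signal that the probabilities are not constant.

```python
import random


def expected_run_length(p_stay):
    """Mean number of consecutive time units spent in a state when the
    probability of staying in it is constant (geometric distribution)."""
    return 1.0 / (1.0 - p_stay)


def simulate_talking_runs(p_stay_talk, p_stay_silent, n_steps, seed=0):
    """Simulate a two-state chain and return the lengths of all completed
    talking runs. Both stay probabilities are hypothetical constants."""
    rng = random.Random(seed)
    talking = True
    run = 0
    runs = []
    for _ in range(n_steps):
        if talking:
            run += 1
            if rng.random() >= p_stay_talk:    # leave the talking state
                runs.append(run)
                run = 0
                talking = False
        else:
            if rng.random() >= p_stay_silent:  # resume talking
                talking = True
    return runs


runs = simulate_talking_runs(0.9, 0.8, 200_000)
mean_run = sum(runs) / len(runs)
print(f"expected mean run: {expected_run_length(0.9):.2f}")
print(f"simulated mean run: {mean_run:.2f}")
```

The simulated mean should sit close to the theoretical value; the comparison is the point of the sketch, not the particular probabilities chosen.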


In summary, it appears that although an 
TY, ppe: (including spee l 


ble to influence. The present stu 
only speech interruption behavior 
versations. It may be that a variety © ; 
tional variables are more likely to En 
interruptions of a psychological E fa 
stein and Welkowitz (1978, P- 3 


ct £0 

5 Gaze timing, often hypothesized to e ect 
nitive (Rutter & Stephenson, 197 ea to obe 
state (Natale, 1977), has also been ite Stephe 
stochastic principles (Natale, 1976a; Ra 
son, Lazzerini, Ayling, & White, 1977)- 
stated that an AVTA (Jaffe & Feldstein, 1970)
analysis of free conversations shows speech
interruptions (successful and nonsuccessful)
generally to have durations of approximately
.4 sec. (The present study obtained the same
finding.) Feldstein and Welkowitz have aptly
noted that Meltzer et al. (1971) reported
instances of simultaneous speech as long as 6
sec and suggest that the difference in findings
may be a function of task. Meltzer et al.
recorded speech interruptions during
problem-solving sessions, whereas Feldstein and
Welkowitz (1978) and the present study examined
naturalistic conversations. The relevant point
is that casual social conversation and task- 
oriented dialogue may elicit different patterns 
of speech interruption reflecting different re- 
lation to personality. It may be that inter- 
personal anxiety accounts for a larger propor- 
tion of interruption in interviews, therapy, 
and the like. Future research must examine 
these questions. 

Another theoretical issue relevant to the 
present findings is the extent to which the 
outcome of simultaneous speech is determined 
by cognitive variables. Morris (1971) and 
Meltzer et al. (1971) have provided sound 
evidence for the role of vocal amplitude in 
determining interruption outcome. These in- 
vestigators have suggested that the briefness 
of simultaneous speech indicates the small role 
of semantic content and hence the importance 
of physicalistic cues (vocal amplitude) in
interruption outcome. But Feldstein and Welkowitz
have stated that “amplitude may function as a
mediating factor that allows speakers to respond
quickly to content, or it may be so highly related
to content . . . that its role in determining
outcome is artifactual” (1978, p. 353). The
relation of interruption outcome to cognitive
processes is not ruled out and does indeed receive
indirect support from the present study. Successful
interruptions were negatively related to speech
anxiety and social anxiety, but the accounted-for
variance was moderate. Also, duration of
interruption, which averaged a mere .45 sec, was
significantly related to personality. The fact that
interruption behavior is subject to individual
differences (Notes 1 and 2), and is related in some
degree to personality, does suggest that the role of
cognitive processes in determining interruption
behavior cannot be ruled out.

Finally, the issue of whether an interrup- 
tion represents a fight for the conversational 
floor was not directly examined in the present 
study. It will be recalled that for the sake of 
examining the relation of simultaneous speech 
to personality, we assumed that vocal
interruptions represent a contest for speaking.
However, there is evidence that suggests that 
interruptions represent spontaneity and posi- 
tive states (Stephenson et al., 1976). This 
positive relation between interpersonal com- 
munication and interruptions could account 
for the moderate negative relation between in- 
terpersonal anxiety and the initiation of si- 
multaneous speech. It could be that vocal in- 
terruptions are either positive (emotional) or 
negative (competitive) according to situational
determinants. Future research should examine the
relation of cognitive and personality variables to
formal aspects of dialogue so that the magic of
“good” and “bad” conversations might be discovered.
Multidimensional Input in Equity Theory


Arthur J. Farkas and Norman H. Anderson 
University of California, San Diego 


Multiple dimensions of input, which are typical of equity judgments in prac- 
tical situations, may be handled in two ways. The standard assumption of input 
integration implies that the input cues are summed for each individual to yield 
a single unitary value of input for each separate person; the ratio model of 
equity division then applies to these unitary input values. Input integration 
thus transforms the multidimensional stimulus field into a one-dimensional 
value of input that mediates the equity division. The alternative assumption of
equity integration implies a reverse order of processing: First, the ratio model
is applied to make an implicit equity division along each separate input
dimension; the resulting equity ratios are then averaged to yield a final judgment.
The difference in processing order implies different patterns of interaction that
thus serve to test the two assumptions. The results of the present research
agreed very well with the hypothesized rule of equity integration when the
input dimensions were dissimilar. When the input dimensions were similar, the
results agreed moderately well with the hypothesized rule of input integration,
although there were small, worrisome discrepancies. Overall, the results provide
some promising support for a cognitive algebra of equity.


It is widely assumed that equity should ob- 
tain when a person’s outcome is proportional 
to input. But how can this simple rule apply 
when input is multidimensional? In many social
situations a person’s deservingness depends
not only on concrete accomplishment
but also on how hard the person tried. How 
can a one-dimensional outcome be propor- 
tional to a two-dimensional input? How can 
proportionality be maintained when the per- 
son tried hard but accomplished very little? 
Or accomplished a great deal with little effort? 

This problem of multidimensional input is
fundamental in equity theory. Adams (1965)
has emphasized how numerous factors (seniority,
age, educational and family background,
for example) can all affect the perceived value 
of input. Similarly, a person might consider 
his or her intentions, efforts, or even need, as 
part of input, over and above actual perform- 
ance. In social practice, therefore, input will 
generally be multidimensional. Any theory of 
equity must solve the problem of multidimen- 
sional input. 

An obvious solution is to assume a one- 
dimensional mediator. All relevant stimulus 
information about the person would be integrated
to obtain a single value of deservingness.
The person’s outcome would then be
proportional to this value. Under this hypothesis
of input integration, the multidimensionality
resides in the stimulus field but the effective
mediator is itself one-dimensional.

This hypothesis of input integration has 
been taken more or less for granted. Adams
(1965) and Walster, Berscheid, and Walster
(1976) have assumed an input summation
rule, while Anderson (1976a) and Leventhal
(1976) have suggested an averaging rule.
Only one experiment (Anderson, 1976a) has
provided an experimental test, however, and 
these results raised doubt about the validity 
of any rule of input integration. 

The assumption of input integration is strin- 
gent. It requires that all the diverse deter- 
minants of input be reduced to a common 
currency. That might be hard to do with 
qualitatively distinct dimensions, such as ac- 
tual accomplishment and effort, for example, 
or productivity and seniority. To integrate 
such different dimensions into a unitary, one- 
dimensional mediator may be difficult. 

There may be an easier way to handle 
multidimensional input. The first step would 
be an implicit equity division for each sepa- 
rate dimension of input. That involves com- 
paring the several persons on only one di- 
mension at a time. Further, these implicit 
divisions are themselves comparable across 
different dimensions, because they all repre- 
sent relative deservingness values. Hence, 
they should be easily integrated to obtain an 
overall division. 

This scheme may be called equity integra- 
tion (Anderson, 1976a), because it is the im- 
plicit equity divisions that are integrated. 
Both steps in equity integration involve 
quantities that are naturally comparable. 
Thus, equity integration offers a psychologi- 
cally attractive way to handle multidimen- 
sional input. 

The present experiment was designed to 
test between the two hypotheses of input in- 
tegration and equity integration. This turns 
out to be fairly straightforward by means of 
cognitive algebra. However, it requires an ex- 
plicit statement of the two hypotheses. For 
simplicity, this statement will be restricted 
to the equity model employed in information 
integration theory (Anderson, 1976a; Ander- 
son & Farkas, 1975; Farkas, 1977; Farkas & 
Anderson, Note 1; Anderson, Note 2). Other 
equity models are taken up in the discussion. 

In this model, the state of equity for two 
persons, A and B, is defined by the equality 
of two proportionate ratios: 

$$I_A/(I_A + I_B) = O_A/(O_A + O_B), \tag{1}$$

where $I_A$ and $I_B$ are the inputs for Persons A
and B, and $O_A$ and $O_B$ are their corresponding
outcomes.

With a fixed amount $T$ to divide between
A and B, Equation 1 becomes

$$O_A = [I_A/(I_A + I_B)]T. \tag{2}$$

Thus, A’s fair share is just A’s proportionate
input. This model, however, assumes that $I_A$
and $I_B$ are one-dimensional. The theoretical
problem concerns multidimensional input.
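
As a worked illustration of Equations 1 and 2, the short sketch below computes A's fair share for one-dimensional inputs. The function name and the numerical values (inputs of 3 and 1, and a total of 200 to match the 200-mm response scale) are assumptions chosen only for the example.

```python
def fair_share_a(input_a, input_b, total):
    """Equation 2: A's outcome is A's proportionate input times the total."""
    return input_a / (input_a + input_b) * total


# Hypothetical one-dimensional inputs for Persons A and B.
input_a, input_b, total = 3.0, 1.0, 200.0

outcome_a = fair_share_a(input_a, input_b, total)
outcome_b = total - outcome_a

# Equation 1: the input ratio equals the outcome ratio at equity.
input_ratio = input_a / (input_a + input_b)
outcome_ratio = outcome_a / (outcome_a + outcome_b)
print(outcome_a, outcome_b, input_ratio, outcome_ratio)
```

With these values A contributes three quarters of the total input and so receives three quarters of the reward, which is all that the ratio model asserts in the one-dimensional case.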

In the present experiments, information was
given about two input dimensions: effort (E),
or how hard the person tried, and performance
(P), or how much the person actually
accomplished. The rule of input integration
could then be written as an adding-type
model:

$$I_A = P_A + E_A, \quad I_B = P_B + E_B.$$

Weight parameters, $w_P$ and $w_E$, are omitted
for convenience, since they are effectively
absorbed in the units of the scales of P and E.
Substitution of these values into Equation 2
yields the model for input summation:

$$O_A = [(P_A + E_A)/(P_A + E_A + P_B + E_B)]T. \tag{3}$$


Under equity integration, separate equity
calculations are made for performance and
for effort, and these two equity ratios are then
integrated. Under an averaging rule,

$$O_A = w_P[P_A/(P_A + P_B)]T + w_E[E_A/(E_A + E_B)]T. \tag{4}$$

Here, $w_P$ and $w_E$ are the relative weights
associated with the two input dimensions, with
$w_P + w_E = 1$.
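
The two rules can be compared numerically. The following sketch is a minimal illustration under stated assumptions: the scale values are hypothetical numbers on an arbitrary positive scale, the weights in Equation 4 are set equal (w_P = w_E = .5) purely for convenience, and the function names are invented for this example. It takes the case raised in the introduction, in which A tried hard but accomplished very little.

```python
def input_integration(p_a, e_a, p_b, e_b, total):
    """Equation 3: sum the cues within each person, then apply the ratio model."""
    i_a = p_a + e_a
    i_b = p_b + e_b
    return i_a / (i_a + i_b) * total


def equity_integration(p_a, e_a, p_b, e_b, total, w_p=0.5):
    """Equation 4: apply the ratio model on each dimension, then average."""
    w_e = 1.0 - w_p
    performance_ratio = p_a / (p_a + p_b)
    effort_ratio = e_a / (e_a + e_b)
    return (w_p * performance_ratio + w_e * effort_ratio) * total


# Hypothetical case: A has low performance (1) but high effort (9);
# B is average (5) on both dimensions. The total reward is 100.
share_input = input_integration(1.0, 9.0, 5.0, 5.0, 100.0)
share_equity = equity_integration(1.0, 9.0, 5.0, 5.0, 100.0)
print(round(share_input, 2), round(share_equity, 2))
```

Under input summation the two persons have equal summed inputs and split the reward evenly, whereas under equity integration A receives less than half, because A's small performance ratio is not fully offset by the larger effort ratio. The size of the difference depends on the assumed scale values and weights.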

It is simple to test between input integration
and equity integration because they predict
quite different data patterns. For example,
Equation 4 implies that $P_B$ and $E_A$
combine additively because they are separated
by a plus sign. The corresponding two-way
factorial graph should thus exhibit a pattern
of parallelism.

In contrast, Equation 3 implies that $P_B$
and $E_A$ combine nonadditively because they
are separated by a division sign rather than
by a plus sign. In this particular case, the
corresponding two-way factorial graph should
exhibit the pattern of a diverging fan of
curves.

This two-way $P_B \times E_A$ graph thus provides
a simple sign to diagnose which, if either, of
the two integration rules underlies the overt 
judgment. Other two-way interactions may 
be analyzed in the same way. This method 
of diagnosis requires some care, because the 
theoretical deviations from parallelism may 
be too small to detect under certain condi- 
tions. With due care, however, these two-way 
data patterns can be used to delineate the 
structure of the underlying integration pro- 
cesses. 
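
The parallelism diagnosis can be sketched numerically. The example below is an illustration under stated assumptions, not the analysis reported in this article: it builds a 3 × 3 table for the P_B × E_A interaction with hypothetical scale values (1, 5, 9), holds A's performance and B's effort fixed at a hypothetical value of 1, uses equal weights in Equation 4, and measures deviation from parallelism as the largest absolute interaction residual (cell minus row mean minus column mean plus grand mean). All names are invented for the example.

```python
def input_integration(p_a, e_a, p_b, e_b, total=100.0):
    """Equation 3: ratio model applied to summed inputs."""
    i_a, i_b = p_a + e_a, p_b + e_b
    return i_a / (i_a + i_b) * total


def equity_integration(p_a, e_a, p_b, e_b, total=100.0, w_p=0.5):
    """Equation 4: weighted average of the per-dimension equity ratios."""
    return (w_p * p_a / (p_a + p_b) + (1.0 - w_p) * e_a / (e_a + e_b)) * total


def interaction_residuals(table):
    """Deviations from additivity; all zeros means the curves are parallel."""
    n_rows, n_cols = len(table), len(table[0])
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(row[j] for row in table) / n_rows for j in range(n_cols)]
    grand = sum(row_means) / n_rows
    return [[table[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_cols)] for i in range(n_rows)]


def max_abs(residuals):
    return max(abs(x) for row in residuals for x in row)


LEVELS = (1.0, 5.0, 9.0)          # hypothetical scale values
P_A_FIXED, E_B_FIXED = 1.0, 1.0   # hypothetical values of the other factors


def pb_by_ea_table(model):
    """Rows are levels of A's effort; columns are levels of B's performance."""
    return [[model(P_A_FIXED, e_a, p_b, E_B_FIXED) for p_b in LEVELS]
            for e_a in LEVELS]


dev_input = max_abs(interaction_residuals(pb_by_ea_table(input_integration)))
dev_equity = max_abs(interaction_residuals(pb_by_ea_table(equity_integration)))
print(f"Equation 3 deviation: {dev_input:.3f}")
print(f"Equation 4 deviation: {dev_equity:.3f}")
```

Equation 4 yields residuals that are zero up to rounding error, which is the parallelism pattern. Equation 3 yields nonzero residuals, but with these assumed values they amount to only a little over 1 point on a 100-point scale, which illustrates why the deviations from parallelism may be hard to detect under some conditions.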


Method 


The subjects received information about the inputs
of two persons, A and B, who had worked together
on a common task. This input information specified
effort (how hard A and B had tried) and performance
(how much A and B had actually accomplished).
The subject’s job was to divide a fixed reward
fairly between A and B on the basis of their
inputs. Judgments were made on a 200-mm graphic
scale labeled “Person A” and “Person B” at the ends.


Experiment 1 


In Parts 1 and 2, the five levels of performance and
effort were specified as follows: extremely above
average, above average, average, below average, and
extremely below average. The input information for
each judgment was typed on an index card, and the
deck of cards for each design was presented in a
different shuffled order for each subject. Preliminary
practice was given for each design. Subjects were 20
student volunteers from introductory psychology
classes who received extra class credit for their
services. They were run individually, served in all four
designs, and made one judgment to every stimulus
combination.

Part 1. Two 5 × 5 designs were used, the first of
which specified the performances of A and B and the
second of which specified the efforts of A and B.
The first design was presented twice, with the first
presentation serving as added practice, and the second
design was presented once.

Part 2. One 5 × 5 design was used, in which both
the performance of A and the effort of A were varied.
Subjects were told that B had worked at the task
but that they would not be informed of what B
had done.

Part 3. One 3^4 design was used. Both performance
and effort were specified for A and B. Except for a
difference in the levels (namely, above average,
average, and below average), the procedure was the same
in this part as in Parts 1 and 2.


Experiment 2 


This experiment had two main purposes. The first
was to replicate the main design of Experiment 1
(Part 3) with more extreme levels of the stimuli. The
second was to attempt to induce a strategy of input 
summation by making the input information homo- 
geneous. By presenting two pieces of performance 
information about each person, it was thought that 
subjects would integrate this homogeneous informa- 
tion into a preliminary judgment of deservingness for 
each person and then make an equity division based 
on these two deservingness values. 

Each factor had three levels in all designs, namely, 
very much above average, average, and very much 
below average. Subjects were 32 new subjects from 
the same pool as in Experiment 1. General procedure 
was similar to that of Experiment 1. 

Part 1. Two 3^4 designs were used. In one,
performance and effort were specified for A and for B,
as in Part 3 of Experiment 1, except for the change
in the phrases used to define the levels. In the
other, two performances from separate and independent
days were specified for A and for B. For
each design, four examples were used as part of the
instructions. These were followed by four end anchors
and then 18 representative trials from the regular
design. The order of presentation of these two
designs was balanced across subjects.

Part 2. Two 3^2 designs were used. In one,
performance was specified for A and for B. In the other,
performance was specified on 2 separate and independent
days for A, and B’s performance over the 2
days combined was specified as average.


Results 


The main goal in the data analysis was to 
test the two alternative rules, input integra- 
tion and equity integration. These two rules 
predict quite different data patterns. These 
data patterns, especially the two-way factorial 
plots, may be used to diagnose which, if 
either, model underlies the observed judg- 
ments. Accordingly, the main exposition will 
be based on the visual appearance of the fac- 
torial graphs from the four-factor designs. 
Technical aspects of the model analyses are 
considered in more detail in a separate section. 


Effort and Performance: Experiment 1 


Theoretical interest centers on the
factorial graphs in the top layer of Figure 1.
There are six graphs in this top layer, one for 
each two-way interaction in the main design. 
The two on the left have the shape of a 
slanted barrel; the remaining four exhibit 
parallelism. These six patterns of data must 
be related to the theoretical predictions. 

Equity integration. The hypothesis of 
equity integration provides a complete account
of these data. All four panels on the
right are predicted to be parallel by the reasoning
given in connection with Equation 4
in the introduction. To illustrate, consider the
third graph in the row. This shows that the
fair share for A is an inverse function of B’s
performance (listed on the horizontal axis)
and a direct function of A’s effort (listed as
curve parameter). From Equation 4, B’s performance
($P_B$) and A’s effort ($E_A$) combine
additively, since they are separated by a plus
sign. Accordingly, the theoretically expected
shape is one of parallelism, and that is verified
in the graph. Exactly the same reasoning applies
to the three graphs farther to the right
because each represents effort for one factor,
and performance for the other.

The slanted barrel shapes of the two graphs 
at the left also agree with equity integration. 
To illustrate, consider the leftmost graph. It 
shows that the fair share for A is an inverse 
function of B’s performance (listed on the 
horizontal axis) and a direct function of A’s 
performance (listed as curve parameter). The 
barrel shape reflects the basic equity ratio,
$P_A/(P_A + P_B)$, in Equation 4 (e.g., Anderson
& Farkas, 1975; see also below). In the
same way, the barrel shape of the second left
graph reflects the basic equity ratio,
$E_A/(E_A + E_B)$.

All six two-factor graphs thus agree with the
predictions from the hypothesis of equity
integration. This hypothesis also predicts that
the higher-order interactions are nonsignificant, a
prediction that was largely supported, as noted
below.

Input integration. The hypothesis of input
integration disagrees with four of the six
graphs in the top layer of Figure 1. It
predicts that the two leftmost graphs should be
very nearly parallel, whereas they
show a barrel shape. It also predicts that the
two rightmost graphs should form converging
or diverging fans, whereas they are
parallel. Only the near parallelism of the two
middle graphs agrees with prediction. The
data, therefore, are sharply inconsistent with
the hypothesis of input integration.

It needs mention that Equation 3 for input
integration in principle predicts nonparallelism
for all six graphs in Figure 1. In some
cases, however, predicted deviations from
parallelism may be negligible, as for the two
leftmost graphs of Figure 1. And although not of
present concern, parallelism can occur
when only two levels of each variable are
employed (Anderson & Butzin, 19 
Figure 1. Mean reward allotted to Person A as a function of several stimulus variables. (Top and
bottom layers show graphs of the six two-way interactions from corresponding designs of
Experiments 1 and 2, respectively. HI = high; AVE = average; LO = low.)
Unfortunately, the algebraic structure of
Equation 3 is not very informative about the sizes
of the interactions. Accordingly, it appears 
necessary to use numerical analysis, as illus- 
trated below in the section on model analysis. 

Goodness of fit. Formal tests of the model 
may be obtained directly from the analysis of 
variance. Because of its theoretical impor- 
tance, the complete analysis is presented in 
Table 1. Of primary interest are the F ratios 
for the two-way interactions: These F ratios 
test for deviations from parallelism for the 
two-factor graphs in the top layer of Figure 
1. The Fs for the two leftmost graphs are 
32.68 and 18.07, p < 10°. The sizes of these 
F ratios reflect the high reliability of the data, 
which allows ready detection of what are 
rather small deviations from parallelism. The 
other four graphs, in left-to-right order,
correspond to nonsignificant Fs of 1.47, .67, .19,
and 2.24, respectively. These tests thus
confirm the above graphical analysis and support
the hypothesis of equity integration. 
Equity integration also implies that the
higher-order interactions should be nonsig- 
nificant. This implication follows from Equa- 
tion 4, because any set of three factors con- 
tains two that are separated only by a plus 
sign. In fact, none of the three-factor interac- 
tions in Table 1 is significant. The four-factor 
F of 1.83 is technically significant, but it is 
small and did not seem meaningful. Overall, 
therefore, the data provide strong support 
for equity integration and clear evidence 
against input integration. 


Effort and Performance: Experiment 2 


Further support for equity integration is 
given by Experiment 2, which included a 
nearly exact replication of the main design 
of Experiment 1 considered just above. The 
two-way graphs are plotted in the bottom 
layer of Figure 1. Each of these graphs shows
essentially the same pattern as the corresponding
graph from Experiment 1 in the top
layer. The two leftmost graphs show the predicted
barrel shape, and the four rightmost
graphs show parallelism. Just as in Experiment 1,
therefore, these data provide clear evidence
against the hypothesis of input integration
and strong support for the hypothesis
of equity integration.


Table 1
Summary Analyses of Variance for Four-Way Designs

There is one difference between the two 
experiments in Figure 1. The graphs show 
greater vertical spread in Experiment 2 than 
in Experiment 1. Also, the performance-effort 
asymmetry is considerably larger, as can be 
seen by comparing the two leftmost graphs in 
each layer. These effects arise from the 
changes in the definition of the stimulus levels 
and do not require any qualification of the 
theoretical interpretation. 

However, one complication arose in the 
statistical analysis for Experiment 2. Although 
the two-way interactions were much the same 
as in Experiment 1, Table 1 shows that the 
three-factor and four-factor interactions are 
all substantial, contrary to the equity integra- 
tion model. At first it was thought that these 
interactions represented a mild degree of in-
put integration. However, more detailed anal-
ysis showed that the interactions predicted
by input integration were much too small.

Figure 2. Mean reward allotted to Person A as a function of several stimulus variables, Experiment 2, Part 1. (Upper graphs show two representative two-way interactions; lower graphs show a representative three-way interaction. HI = high; AVE = average; LO = low.)


Similar patterns of interaction were also
obtained in collateral work on children's
judgments of equity (Anderson & Butzin,
1978). These data suggested the presence of
a configural element within the basic strategy [...]
performance). Thus, a dimension of difference
appeared to influence the judgment more than
a dimension of no difference. When this con-
figural assumption was incorporated into the
model, it provided a remarkably good ac-
count of the data, as is shown in the more
detailed model analysis below.


Performance and Performance: Experiment 2


The rule of input integration was expected
to apply when the input consisted of two
pieces of performance information about each
person. Since these two pieces of input in-
formation are similar, it should be easy to
integrate them to obtain a single value of de-
servingness for each person. These deserving-
ness values would then constitute the
input values for the equity division between
the two persons.

The model of input integration provided an
approximate account of the data, but there
were small, worrisome discrepancies. Figure
2 gives an overview of the data. The graph at
the upper right is representative of the four
two-way interactions that involved one piece
of information about each person. Theoreti-
cally, these graphs should be almost parallel.
However, all four showed a slight barreling,
and these deviations from parallelism were
significant, as shown in Table 1.

A more serious difficulty for input integra- 
tion is the pattern in the two-way graph in 
the upper left of Figure 2. This graph shows 
a mild convergence to the right, whereas the 
model predicts a moderate divergence. The 
arithmetic rationale for the divergence fol- 
lows directly from the fact that both terms 
in this interaction, being performance levels 
of B, appear in the denominator of the model. 
The larger the value of one term, therefore, 
the smaller the predicted effects of the other.
This discrepancy is reliable, since it also ap-
peared for A. The given arithmetic rationale
implies that this discrepancy cannot be recti-
fied by replacing the sum $P_A + E_A$ in Equa-
tion 3 by a more general input integration
function, $f(P_A, E_A)$. These data therefore pre-
sent a serious problem for the hypothesis of
input integration.

A representative three-way interaction is
shown in the lower layer of Figure 2. The left
and right panels show mild divergence and
convergence, respectively, whereas the center
graph exhibits the barrel shape. The three
graphs are not geometrically congruent, and
this incongruence produced a significant
three-way interaction. The other three-way
interactions had a similar pattern, a conse-
quence of the symmetry in the design. Such
a change in pattern across the three panels,
from divergence to barrel shape to conver-
gence, is predicted by input summation, and
it was initially thought that these three-way
interactions provided good support for the
model. However, more detailed analyses
showed that the predicted interactions were of
substantially smaller magnitude than the ob-
served interactions. Accordingly, these in-
teraction patterns also present a difficulty for
the hypothesis of input integration.

It deserves mention that the interaction
patterns in Figure 2 provide a strong test of 
the input integration model of Equation 3, be- 
cause that model has only two free parameters 
in the present experimental design. All four 
performance factors in the design are equiva- 
lent by symmetry, and the value of the mid- 
dle level may arbitrarily be set equal to unity. 
Thus, the only two free parameters are the 
values of the high and low levels of per- 
formance. Since these values are essentially 
determined by the main effects, the interac- 
tion patterns provide clear tests of the model. 
It is possible, of course, that some configural
assumption similar to that considered above 
would resolve the discrepancies. However, sev- 
eral attempts in this direction met with little 
success. Even though the discrepancies are 
small, therefore, they raise a serious question 
about the validity of the hypothesis of input
integration.

A natural question is whether the rule of
equity integration, which did well in the other
main design, might also apply to the data in
Figure 2. That this is not the case may be
seen by comparing the two-way graphs be-
tween Figures 1 and 2. In Figure 2, all four
graphs that included one piece of information
about each person had the same slight barrel
shape shown in the upper right. In Figure 1,
the graphs that included one piece of
information about each person are the four
leftmost graphs in either layer. These do not
all have the same shape. Further, none is
similar to the upper-right graph of Figure 2.
From this comparison it is clear that the rule
of equity integration does not account for
these data.

It is worth noting that a simple linear,
adding-subtracting model would describe the
data rather well, for the deviations from
parallelism in Figure 2 are visibly small. For
practical applications, at least, an adding-type
model could provide a good approximation.
Of course, that does not mean that the underly-
ing integration process is basically additive
([...], 1977), although the possibility that
the subjects tended to simplify the task in
this manner deserves consideration
(Shanteau & Anderson, 1972).

Figure 3. Mean reward allotted to Person A. (Each panel represents one of the two-way [...] low.)


Other Tests of the Equity Ratio Model 


Both experiments contained designs in
which the equity judgment was based on two
pieces of information. These data do not
bear on the main question of input integra-
tion, but they are relevant to the basic ratio
model of equity theory given by Equations 1
and 2.


The results from the three designs of Ex-
periment 1 are given in Figure 3. These
graphs show little or no sign of the slanted
barrel shape that is predicted by the basic
ratio model. Indeed, each panel appears to be
roughly parallel, as though an adding-type
integration obtained. The deviations from par-
allelism are statistically significant in each
case, F(16, 304) = 2.72, 4.04, and 3[...] for
the three respective graphs. However, no pat-
tern is apparent in these deviations, and they
certainly do not follow the theoretical barrel
shape.

A different picture was found in Experi-
ment 2, as can be seen in Figure 4. Here,
both graphs show a clear barrel shape, in
agreement with the equity model. The devia-
tions from parallelism are highly significant,
F(4, 124) = 17.20 and 24.86 in the re-
spective graphs. The basic model of Equation
2 gave a good fit to these data, as is shown
below.

The five graphs of Figures 3 and 4 give a
mixed picture on the equity ratio model. The
data of Experiment 2 support the ratio model,
whereas those of Experiment 1 appear to sup-
port an adding-type formulation. Of course,
the ratio model was not expected to hold for 
the case of heterogeneous information, 
namely, the Performance-A X Effort-A design 
in the rightmost panel of Figure 3. Similar 
results had been obtained by Anderson 
(1976a, Figure 3), and those results had led 
to the hypothesis of equity integration that 
was studied in the present experiments. How-
ever, the other two graphs of Figure 3 repre-
sent homogeneous stimulus information, and
the apparent failure of the ratio model in 
these two cases is more disturbing. 

One possible source of this difference in 
results lies in the different ordering of the two- 
way and the four-way designs in the two ex- 
periments. The two-way designs came first in 
Experiment 1, last in Experiment 2. It is pos- 
sible, therefore, that the extended practice at 
integrating four pieces of information pro-
duced a tendency to use a ratio model that
carried over to the later two-way designs in
Experiment 2. When these two-way designs
came first, as in Experiment 1, a simpler add- 
ing-type strategy may have prevailed. 


Model Analyses 


This section presents the more detailed 
analyses for the models that have been dis-
cussed above. For simplicity, only the data
of Experiment 2 will be explicitly considered. 


Equity Integration 


Constant-weight model. The main concern 
was to verify the statement above that Equa- 
tion 4 for equity integration provides a good 
account of the two-factor graphs of Figure 1. 
Since the main concern was with qualitative 
shapes or patterns, not with exact numerical 
fits, the model parameters were chosen by 
eye, together with a little trial and error. 

For the performance-effort design of Ex- 
periment 2, scale values for the three levels 
of performance were set equal to 3, 1, and 
1/3; similarly, scale values for effort were 
set equal to 4, 1, and 1/4. The weight param- 
eter for performance was set at twice that for 
effort. These parameters were used in Equa- 
tion 4, and the resulting predictions were mul- 
tiplied by an arbitrary scale factor to bring 
them into the same approximate range as the 
measured response. This yielded the theoreti-
cal two-factor graphs shown in the top layer
of Figure 5. 
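
The constant-weight predictions described above can be sketched numerically. The block below is only an illustration: Equation 4 is not reproduced in this section, so its form here (a weighted sum of one relative ratio per input dimension, with relative weights summing to 1) is an assumption based on the verbal description, and all function names are hypothetical. The arbitrary scale factor is omitted because it does not affect the shape of the graphs.

```python
from itertools import product

LEVELS = ("HI", "AVE", "LO")
PERF = {"HI": 3.0, "AVE": 1.0, "LO": 1 / 3}     # scale values given in the text
EFFORT = {"HI": 4.0, "AVE": 1.0, "LO": 1 / 4}
W_P, W_E = 2 / 3, 1 / 3                          # performance weight twice the effort weight


def equity_integration(pa, ea, pb, eb):
    """Assumed form of Equation 4: one relative ratio per dimension, then a weighted sum."""
    return W_P * pa / (pa + pb) + W_E * ea / (ea + eb)


def cell(levels):
    """Prediction for one cell of the four-way design (P_A, E_A, P_B, E_B)."""
    pa, ea, pb, eb = levels
    return equity_integration(PERF[pa], EFFORT[ea], PERF[pb], EFFORT[eb])


def two_way_table(i, j):
    """Mean prediction for factors i and j (0=P_A, 1=E_A, 2=P_B, 3=E_B), averaged over the rest."""
    table = {}
    for li, lj in product(LEVELS, repeat=2):
        vals = [cell(c) for c in product(LEVELS, repeat=4) if c[i] == li and c[j] == lj]
        table[(li, lj)] = sum(vals) / len(vals)
    return table


def max_interaction(table):
    """Largest absolute deviation from parallelism (double-centered residual)."""
    grand = sum(table.values()) / 9
    row = {l: sum(table[(l, m)] for m in LEVELS) / 3 for l in LEVELS}
    col = {m: sum(table[(l, m)] for l in LEVELS) / 3 for m in LEVELS}
    return max(abs(table[(l, m)] - row[l] - col[m] + grand) for l, m in table)


same_dim = max_interaction(two_way_table(0, 2))    # P_A x P_B: nonparallel (barrel)
cross_dim = max_interaction(two_way_table(0, 1))   # P_A x E_A: parallel
print(same_dim, cross_dim)
```

Under this assumed form, two factors that enter different ratio terms combine additively, so their two-way graph is parallel, whereas two factors that share a ratio term produce a nonzero interaction.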

The assessment of the model is made by 
comparing the theoretical graphs in the top 
layer of Figure 5 with the empirical graphs 
in the bottom layer of Figure 1. The patterns 
are essentially identical across both figures: 
The two left-hand panels show slanted bar- 
rels, whereas the four right-hand panels show 
parallel curves.

An interesting note is that the parameter 
values relate directly to the relative shapes 
of the two left-hand graphs. The greater 
weight for performance reflects the greater 
vertical spread in the leftmost graph; the 
greater range of scale values for effort re- 
flects the greater proportionate barreling of 
the next-to-left graph. Extreme levels of ef-
fort thus appear to have greater valence but
less influence than extreme levels of perform-
ance.

Configural-weight model. Although Equa-
tion 4 for equity integration provided a good
account of the two-factor graphs, it failed to
account for the three-factor interactions that
appeared in Experiment 2. Inspection of the
data suggested that these three-factor inter-
actions resulted from configural weighting.
Each input dimension seemed to have greater
influence on the equity division when the in-
puts of the two parties were unequal than
when they were equal.

This configural weighting can be more read-
ily appreciated by looking at Equation 4, in
which each input dimension corresponds to a
separate relative ratio. When A and B are
equal on performance, say, then the perform-
ance ratio, $P_A/(P_A + P_B)$, equals 1/2. The
weight parameter $w_P$ then appeared to be
lower than when A and B were unequal on
performance.

In terms of Equation 4, therefore, the
weight parameter $w_P$ is not constant but has
two values, the lower applying when $P_A = P_B$,
the higher applying when $P_A \neq P_B$. The
weight $w_E$ for effort is assumed to obey a simi-
lar configural relation. Mathematically, it is
convenient to replace the relative weights used
in Equation 4 by the absolute weights, de-
noted here by primes. The relative and ab-
solute weights are related thus:


$$w_P = w_P' / (w_P' + w_E'),$$
$$w_E = w_E' / (w_P' + w_E').$$


Since the main concern was with qualita-
tive trends, parameters were estimated by eye,
as above. Scale values were the same as
used for the constant-weight model above. The
low and high levels of the absolute weights
were set at 2 and 4 for the performance di-
mension and at 1 and 2 for the effort dimen-
sion. These parameters were used in Equation
4 to generate theoretical curves for the two-
factor and three-factor interaction graphs.
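
A minimal sketch of the configural weighting may help here. It again assumes that Equation 4 is a weighted sum of one relative ratio per dimension (the equation is not reproduced in this section), uses the scale values and absolute weights stated in the text, and introduces hypothetical function names. For each level of B's performance it measures the deviation from parallelism in the A-performance by A-effort table.

```python
from itertools import product

LEVELS = ("HI", "AVE", "LO")
PERF = {"HI": 3.0, "AVE": 1.0, "LO": 1 / 3}
EFFORT = {"HI": 4.0, "AVE": 1.0, "LO": 1 / 4}


def predict(pa, ea, pb, eb, configural=True):
    """Assumed Equation 4, with absolute weights converted to relative weights."""
    if configural:
        wp_abs = 2.0 if pa == pb else 4.0   # lower weight when A and B are equal
        we_abs = 1.0 if ea == eb else 2.0
    else:
        wp_abs, we_abs = 2.0, 1.0           # constant weights, performance twice effort
    wp = wp_abs / (wp_abs + we_abs)
    we = we_abs / (wp_abs + we_abs)
    return (wp * PERF[pa] / (PERF[pa] + PERF[pb])
            + we * EFFORT[ea] / (EFFORT[ea] + EFFORT[eb]))


def pa_by_ea_residual(pb, configural):
    """Max deviation from parallelism in the P_A x E_A table at one level of P_B (averaged over E_B)."""
    t = {(pa, ea): sum(predict(pa, ea, pb, eb, configural) for eb in LEVELS) / 3
         for pa, ea in product(LEVELS, repeat=2)}
    grand = sum(t.values()) / 9
    row = {pa: sum(t[(pa, ea)] for ea in LEVELS) / 3 for pa in LEVELS}
    col = {ea: sum(t[(pa, ea)] for pa in LEVELS) / 3 for ea in LEVELS}
    return max(abs(t[(pa, ea)] - row[pa] - col[ea] + grand) for pa, ea in t)


constant = [pa_by_ea_residual(pb, False) for pb in LEVELS]
configural = [pa_by_ea_residual(pb, True) for pb in LEVELS]
print(constant, configural)
```

With constant weights the residuals vanish at every level of B's performance. With configural weights they do not, and which row of the table is affected depends on which level of A's performance matches B's, which is the kind of pattern that produces a three-factor interaction.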

The predicted two-way graphs are given
in the middle layer of Figure 5. Each two-way
graph has the same shape as the correspond-
ing graph from the constant-weight model
just above. Thus the configural-weight and
constant-weight models yield virtually identi- 
cal two-factor patterns. This pattern similar- 
ity comes about because the performance and 
effort dimensions are represented by separate 
ratio terms in Equation 4. 

The main purpose of the configural model 
was to account for the three-factor interac- 
tions. This it does in remarkable detail. One 
of the empirical interactions is shown as the 
three 3 X 3 graphs in the upper layer of 
Figure 6. All three graphs show the same basic 
shape: a slanted barrel with a middle curve 
of greater slope. That is the shape of the
basic equity ratio model. However, the three 
graphs differ somewhat in shape, and that 
difference is the source of the three-factor 
interaction. In the left graph, the top curve 
is bowed, but the bottom curve is almost 
straight; also, the top and bottom curves di- 
verge to the right. The right graph shows the 
complementary pattern: The bottom curve is 
bowed, the top curve is nearly straight, and 
the two curves converge to the right. Finally,
the center graph shows the pure form of the
basic equity ratio model. 

The theoretical predictions from the con- 
figural-weight model are shown in the lower 
layer of Figure 6. These have virtually the 
identical pattern as the data. The other three- 
factor interactions all had similar shape and 
were accounted for in the same manner. Thus, 
the configural-weight model for equity in- 
tegration provides a detailed account of the 
main patterns of the data. 

It is worth adding that the configural model 
does not account for the data when the 
weighting is reversed to use the larger weight 
when the levels of performance and effort are 
equal. This weighting reverses the shape of 
the leftmost predicted graph in the lower layer 
of Figure 6. The bottom curve becomes 
bowed, the top curve straight, and these two 
curves converge instead of diverging to the 
right; the rightmost predicted graph shows 
the complementary changes. The configural 
weighting is thus quite sensitive to specific 
pattern features in the data.


Input Integration


One remaining question in the model analy- 
sis is whether the hypothesis of input integra- 
tion can account for the data from the main 
designs that included information about both 
performance and effort. The model for equity 
integration gave a good account of these data, 
but perhaps the model for input integration 
could do the same. 

To obtain a definite answer to this ques- 
tion, an exact fit of the model was made in 
the following way. Scale values for the me- 
dium levels of performance and effort were 
both set equal to unity; this is no restriction, 
since it merely sets the scale units. The 
weights are not identifiable parameters in this 
model, since they are absorbed into the scale 
units and so need not be estimated. Further,
the scale values for Persons A and B are the 
same by virtue of the symmetry in the design. 
Thus, there are only four free parameters, the 
high and low levels of performance and of 
effort. The values of these four parameters 
are determined essentially by the sizes of the 
main effects. Accordingly, the shapes of the
two-factor graphs will provide strong tests of
the model.

The four parameters were estimated by 
fitting the model of Equation 3 to the two 
center two-way graphs for Experiment 2 in 
the bottom layer of Figure 1. This choice 
was made because it was evident that the 
model would provide a good fit to these par- 
ticular data, as indeed it did. The estimation 
was done using Chandler’s (1969) STEPIT
program. The estimated values were 2.32, 1, 
and .33 for high, medium, and low levels of 
performance, respectively, and 1.69, 1, and 
.56 for high, medium, and low levels of ef- 
fort, respectively. 

These parameters were then used to obtain 
predicted values for all six two-way interac-
tion graphs. These theoretical graphs, shown
in the bottom layer of Figure 5, may be com- 
pared to the theoretical graphs from the 
equity integration model in the middle and 
upper layers. There is considerable disagree- 
ment for the two left-hand graphs: These ex-
hibit the slanted barrel shape in the upper
layers, but are virtually parallel in the bot-
tom layer. There is also considerable disagree-
ment for the two right-hand graphs: These
are virtually parallel in the upper layers, but
are markedly nonparallel in the bottom layer.
Since the theoretical graphs for equity in-
tegration fit the data very well, it follows that
the model for input integration fits very ill.
Configural-weight versions of the model for
input integration were also considered, but
these did not seem to give much improvement.
It would seem that the algebraic structure of 
the model for input integration precludes a 
fit to these patterns of response. 
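
The two-way predictions of the input integration model can be sketched in the same way. Equation 3 is not reproduced in this section, so the block below assumes the simplest form consistent with the description (each person's performance and effort values are summed, and the sums enter a relative ratio). The scale values are the estimates reported above; the function names are hypothetical. It prints a deviation-from-parallelism measure for all six two-way graphs so that they can be compared with the verbal description.

```python
from itertools import product

LEVELS = ("HI", "AVE", "LO")
PERF = {"HI": 2.32, "AVE": 1.0, "LO": 0.33}     # estimates reported in the text
EFFORT = {"HI": 1.69, "AVE": 1.0, "LO": 0.56}
FACTORS = ("P_A", "E_A", "P_B", "E_B")


def input_integration(pa, ea, pb, eb):
    """Assumed form of Equation 3: inputs are summed within each person before the ratio."""
    a = PERF[pa] + EFFORT[ea]
    b = PERF[pb] + EFFORT[eb]
    return a / (a + b)


def two_way_residual(i, j):
    """Max deviation from parallelism for factors i and j, averaging over the other two."""
    t = {}
    for li, lj in product(LEVELS, repeat=2):
        vals = [input_integration(*c) for c in product(LEVELS, repeat=4)
                if c[i] == li and c[j] == lj]
        t[(li, lj)] = sum(vals) / len(vals)
    grand = sum(t.values()) / 9
    row = {l: sum(t[(l, m)] for m in LEVELS) / 3 for l in LEVELS}
    col = {m: sum(t[(l, m)] for l in LEVELS) / 3 for m in LEVELS}
    return max(abs(t[(l, m)] - row[l] - col[m] + grand) for l, m in t)


residuals = {f"{FACTORS[i]} x {FACTORS[j]}": two_way_residual(i, j)
             for i in range(4) for j in range(i + 1, 4)}
for name, value in residuals.items():
    print(name, round(value, 4))
```

Because the design treats the two persons symmetrically, the residual for A's two inputs equals that for B's two inputs, which serves as a quick sanity check on the computation.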


Some Properties of the Equity Ratio Model 


The relative ratio model of Equation 2 has 
some interesting mathematical properties when 
the two persons are treated symmetrically. 
These properties provide a basis for exact 
tests of fit and also for estimating the effec- 
tive stimulus values of the inputs. These 
properties hold both for Equation 2 and for 
the two corresponding two-factor graphs of 
the equity integration model of Equation 4. 
To show these properties, the 3 X 3 design 
of Table 2 will be used, in which the input 
values of both persons are designated as H, 
M, and L, with H > M > L. Most of these 
properties can be derived using simple algebra, 
and so no proofs will be given. 

The theoretical expression in each cell of
Table 2 is simply the equity ratio. These pre-
dictions will be illustrated using the numeri-
cal values from the left panel of Figure 4,
which are listed below each theoretical ex-
pression in Table 2. These observed values
are given in terms of the 200-mm response
measure, so that a theoretical ratio of 1/2
corresponds to 100 mm, and a theoretical ratio
of 1 corresponds to 200 mm.

Table 2
Theoretical Analysis of Ratio Model, Symmetrical Case

                               Column level
Row level      H                   M                   L
H              (a) H/(H + H)       (b) H/(H + M)       (c) H/(H + L)
               101.9               137.2               157.8
M              (d) M/(M + H)       (e) M/(M + M)       (f) M/(M + L)
               65.3                102.2               133.0
L              (g) L/(L + H)       (h) L/(L + M)       (i) L/(L + L)
               43.7                67.8                102.1

Note. H, M, and L denote the three values of the input cues, assumed equal for the two parties. In each cell
of the design, the ratio of values of input cues represents theoretical prediction; numerical value represents
data from Experiment 2, Part 2. Letters a-i are used for identification in text.

Symmetry. The diagonal elements should
equal 1/2, an immediate consequence of the
symmetry assumption. All three observed
values in Table 2 are very close to 102 mm,
slightly larger than the nominal midpoint of
the response scale.

Complementarity. Entries in cells that are
symmetrically located about the main diagonal
should add to unity. In Table 2, therefore,

$$b + d = c + g = f + h = 1. \tag{5}$$

The observed sums are 202.5, 201.5, and
200.8, respectively, slightly greater than the
theoretical value of 200. Complementarity is
also an immediate consequence of the sym-
metry assumption. These data, like those just
noted, support the further use of the sym-
metry assumption for this task.

Parallelism subtests. Table 2 contains three
2 X 2 subtables that should obey the parallel-
ism prediction. These can be designated as
follows:

$$a - d = b - e,$$
$$e - h = f - i, \tag{6}$$
$$a - g = c - i.$$

Each of these tests can be made as a 2 X 2
analysis of variance in which the critical term
is the interaction. Since the observed devia-
tions from parallelism are small, no formal
test was performed.

Barrel shape. In the interaction tests, dis-
cussed in a previous section, considerable em-
phasis was placed on the prediction of a bar-
rel shape. In terms of Table 2, the barrel
shape is equivalent to

$$b - h > a - g = c - i. \tag{7}$$

The effect was substantial, since the differ-
ence on the left is 69 mm, and the two differ-
ences on the right are 58 and 56 mm, re-
spectively. The mean net difference is thus
12 mm, considerably larger than the differ-
ence of 3.5 mm that would be required to
reach significance.
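
The arithmetic behind the complementarity, parallelism, and barrel-shape checks can be reproduced directly from the nine observed means in Table 2. The block below is a small illustration; the variable names are hypothetical, and only the cell means (not the per-subject data needed for significance tests) are used.

```python
# Observed cell means (mm) from Table 2; keys follow the cell letters a-i.
obs = {"a": 101.9, "b": 137.2, "c": 157.8,
       "d": 65.3, "e": 102.2, "f": 133.0,
       "g": 43.7, "h": 67.8, "i": 102.1}

# Complementarity (Equation 5): mirror-image cells should sum to 200 mm.
comp = {pair: obs[pair[0]] + obs[pair[1]] for pair in ("bd", "cg", "fh")}

# Parallelism subtests (Equation 6): the two differences in each pair should agree.
parallel = {
    "a-d vs b-e": (obs["a"] - obs["d"], obs["b"] - obs["e"]),
    "e-h vs f-i": (obs["e"] - obs["h"], obs["f"] - obs["i"]),
    "a-g vs c-i": (obs["a"] - obs["g"], obs["c"] - obs["i"]),
}

# Barrel shape (Equation 7): the middle difference should exceed the two outer ones.
left = obs["b"] - obs["h"]
right = (obs["a"] - obs["g"], obs["c"] - obs["i"])
net = left - sum(right) / 2

print({k: round(v, 1) for k, v in comp.items()})
print({k: (round(x, 1), round(y, 1)) for k, (x, y) in parallel.items()})
print(round(left, 1), [round(r, 1) for r in right], round(net, 2))
```

The complementarity sums and the barrel-shape differences agree with the values quoted in the text after rounding.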

An asymmetry. There is also an interest-
ing asymmetry that is implied by the ratio
model, though its size and sign depend on
the exact values of the parameters. For the
3 X 3 design, the middle curve will generally
have greater slope than the top and bottom
curves. In terms of Table 2, this asymmetry
is measured by

$$(c - f) - (d - g) = \frac{HL - M^2}{(H + M)(M + L)}. \tag{8}$$

Accordingly, the expression on the left has
the same sign as $HL - M^2$. Since M can be
set equal to 1 without loss of generality, the
expression on the right will be positive when
H > 1/L. This condition was satisfied by the
parameter estimates below, though the effect
was not overly large.

In the data of Table 2, c— f= 24.8 mm, 
and d — g = 21.6 mm. The difference of 3.2 
mm, with F(1, 30) = 3.47, was somewhat 
short of the 3.5-mm value that would be re- 
quired to reach significance. Considerably 
stronger asymmetries appear in many of the 
graphs in Figures 1 and 2. 
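
Since no proof of the asymmetry relation is given, a quick exact check may be useful. The sketch below evaluates both sides of Equation 8 with rational arithmetic for the direct estimates reported in the text (H = 2.17, L = .52, M = 1) and also recomputes the observed asymmetry from the Table 2 means. The function names are hypothetical.

```python
from fractions import Fraction


def ratio_cells(H, M, L):
    """Cells c, f, d, g of the symmetric 3 x 3 ratio table (rows = A's level, columns = B's level)."""
    return {"c": H / (H + L), "f": M / (M + L), "d": M / (M + H), "g": L / (L + H)}


def asymmetry(H, M, L):
    """Left and right sides of Equation 8."""
    t = ratio_cells(H, M, L)
    lhs = (t["c"] - t["f"]) - (t["d"] - t["g"])
    rhs = (H * L - M * M) / ((H + M) * (M + L))
    return lhs, rhs


# Exact check with rational arithmetic.
lhs, rhs = asymmetry(Fraction(217, 100), Fraction(1), Fraction(52, 100))

# Observed asymmetry (mm) from the Table 2 means: (c - f) - (d - g).
observed_mm = (157.8 - 133.0) - (65.3 - 43.7)

print(lhs == rhs, rhs > 0, round(observed_mm, 1))
```

Both sides agree exactly, and the right side is positive because H > 1/L for these values.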

Parameter-free test. A simple test of fit is 
available that does not require estimation of 
any parameters. This test rests on the rela-
tion,

$$\frac{c}{1 - c} = \frac{b}{1 - b} \times \frac{f}{1 - f}. \tag{9}$$

To perform this test, the terms on the left
and right were calculated for each subject,
and a t test was applied to the 32 difference
scores. The mean difference of .2 did not ap-
proach significance, which supports the model.
It should be noted, incidentally, that using 
the means from Table 2 in the expression 
above would introduce a small bias, since, in 
general, the ratio of the means is not the 
mean ratio. 
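
The relation behind the parameter-free test can be illustrated with a short sketch. Each term is the odds of a cell proportion, and under the ratio model the odds for cell c equal H/L, which is the product of the odds for cells b (H/M) and f (M/L). The helper names below are hypothetical, the parameter values are arbitrary, and the per-subject data needed for the actual t test are not available here.

```python
from fractions import Fraction


def odds(p):
    """Odds p / (1 - p) of a response expressed as a proportion of the scale."""
    return p / (1 - p)


def difference_score(b, c, f):
    """Left side minus right side of Equation 9 for one subject's cell proportions."""
    return odds(c) - odds(b) * odds(f)


# Cells generated from the ratio model itself give a difference score of exactly zero.
H, M, L = Fraction(5, 2), Fraction(1), Fraction(2, 5)   # arbitrary illustrative values
b, c, f = H / (H + M), H / (H + L), M / (M + L)
model_score = difference_score(b, c, f)
print(model_score)
```

In an application, `difference_score` would be computed once per subject from that subject's responses divided by the 200-mm scale length, and the resulting scores would be tested against zero.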

Direct estimation of parameters. Three 
parameters appear in Table 2, but one of 
them can be set to any convenient value. 
This reflects the fact that multiplying all 
three parameters by a constant does not 
change the values of the ratios in the model. 
Accordingly, M was set equal to 1. 

An estimate of H is then obtainable by 
equating the theoretical and observed values 
in Cell b of Table 2. This yields H = 2.17. 
Similarly, equating the theoretical and ob- 
served values in Cell h yields an estimate of 
L = .52. These estimates are not the best, be-
cause they do not use all of the data. Also,
they suffer a slight bias relative to the mean 
of the corresponding individual estimates. 
However, they differ little from the improved 
estimates discussed next, and they would be 
adequate for many purposes. For example, 
the mean magnitude of the discrepancies be- 
tween the observed means of Table 2 and the 
values predicted from the estimates given 
above was only 1.4 mm. 

Improved parameter estimates. The pa- 
rameter estimates will be more stable when 
they are based on all of the data. Such esti- 
mates were obtained by fitting the ratio model 
separately to each subject’s data, using a 
least squares criterion. Owing to the non- 
linear nature of the model, an iterative pro- 
cedure is required, and accordingly, Chand-
ler’s (1969) STEPIT program was used. The
mean values of the estimates of H and L 
were 2.15 and .53, respectively—quite close 
to those obtained by the less efficient direct 
method given above. The mean magnitude of 
the nine deviations between the mean ob- 
served and the mean predicted value was only 
1.2 mm. 
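
The two estimation steps can be sketched as follows. This is not the authors' procedure: it fits the nine cell means of Table 2 rather than each subject's data, and it uses a simple derivative-free pattern search written for this illustration in place of the STEPIT program. The direct estimates from Cells b and h serve as starting values. Because of these differences, the resulting values need not match the reported estimates exactly.

```python
# Table 2 cell means (mm); rows = A's level (H, M, L), columns = B's level (H, M, L).
OBS = [[101.9, 137.2, 157.8],
       [65.3, 102.2, 133.0],
       [43.7, 67.8, 102.1]]
SCALE = 200.0   # a theoretical ratio of 1 corresponds to 200 mm


def sse(h, l):
    """Sum of squared deviations between the ratio model (M fixed at 1) and the cell means."""
    vals = (h, 1.0, l)
    return sum((SCALE * vals[r] / (vals[r] + vals[c]) - OBS[r][c]) ** 2
               for r in range(3) for c in range(3))


def direct_estimates():
    """Solve b = H/(H + 1) and h = L/(L + 1) for H and L using Cells b and h."""
    b = OBS[0][1] / SCALE
    h_cell = OBS[2][1] / SCALE
    return b / (1 - b), h_cell / (1 - h_cell)


def pattern_search(f, start, step=0.1, tol=1e-6):
    """Coordinate pattern search: try +/- step on each parameter, halve the step when stuck."""
    x, best = list(start), f(*start)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                if trial[i] <= 0:          # scale values must stay positive
                    continue
                val = f(*trial)
                if val < best:
                    x, best, improved = trial, val, True
        if not improved:
            step /= 2
    return x, best


h0, l0 = direct_estimates()
(h_hat, l_hat), err = pattern_search(sse, (h0, l0))
print(round(h0, 3), round(l0, 3), round(h_hat, 3), round(l_hat, 3))
```

Fixing M at 1 only sets the unit of the scale, as noted in the text, so the search runs over two parameters.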

Goodness of fit. Getting an exact statisti-
cal test of the ratio model is complicated by
the nonlinearity, which prevents an exact tally
of the degrees of freedom used in estimating
the parameters. Fortunately, this complica-
tion can be circumvented using a new test
procedure introduced by Leon and Anderson
(1974). In brief, predicted values were ob-
tained for each subject based on the param-
eter estimates for that subject. A 3 X 3 matrix
of deviation scores was then obtained for
each subject, and these data were tested in
an ordinary repeated measurements design.
The model implies that rows, columns, and
interaction should all be nonsignificant in
this analysis, and that prediction was sup-
ported by the results: The corresponding F
ratios were F(2, 60) = .15, F(2, 60) = .57,
and F(4, 120) = 2.07. On the basis of this
exact test of goodness of fit, therefore, the
ratio model does rather well. This one set of
data is only illustrative, of course, but the
same methods can be used more generally.
With the iterative procedure, parameter esti-
mation and the goodness-of-fit test do not re-
quire the assumption of symmetry.


Discussion 
Theoretical Implications 


Input integration versus equity integration. 
Multidimensional input is characteristic of 
everyday equity judgments. Rarely are such 
judgments made on the basis of a single piece 
of information. Typically, they entail the use 
of different kinds or types of information. 
That creates an obvious difficulty, noted in 
the introduction, in dividing a one-dimen- 
sional outcome in proportion to a multidimen- 
sional input. 


This problem has been largely avoided in
previous discussions of equity by the attrac-
tive assumption of input integration. The mul-
tiple aspects and dimensions of the stimulus
field are assumed to be integrated to form
one overall “deservingness” value. The com-
plex stimulus field is thus reduced to a one-
dimensional, effective input that mediates the
division. Unfortunately, this assumption of
input integration ran into difficulty in its
first experimental test (Anderson, 1976a, Part
3, p. 295). The present results confirm the
difficulty and indicate that the hypothesis of
input integration has limited applicability. 

The alternative hypothesis of equity inte- 
gration has the advantage that it does not 
require integration of qualitatively heteroge- 
neous dimensions of input. Instead, each input 
dimension acts independently to determine its 
own implicit equity division. These implicit 
divisions, which are themselves qualitatively
similar, are then integrated to determine the 
overt equity division. 

This assumption of equity integration did 
remarkably well for heterogeneous input di- 
mensions (performance and effort) in the pres- 
ent experiments. The one complication, from 
the higher-order interactions in Experiment 2, 
seemed to be easily handled in terms of con- 
figural weighting. The generality of the pres- 
ent results certainly needs to be assessed, 
using other scenarios, more than two dimen- 
sions of input, and more than two parties. As 
they stand, however, these data provide strong 
support both for the hypothesis of equity in- 
tegration and for the basic relative ratio model. 

Unfortunately, equal success cannot be 
claimed for the case of homogeneous input 
dimensions (actual performance on two occa- 
sions). The hypothesis of input integration 
had been expected to hold in this case, and 
the model predictions were verified to a good 
numerical approximation. Nevertheless, the 
pattern of discrepancies suggested that the 
underlying cognitive algebra was different 
than assumed in the model. Indeed, a linear 
model based on simple adding and subtracting 
of the four input dimensions did about as well. 

This difficulty with the model for input in- 
tegration is perplexing. Although the problem 
of integrating input for a single person has 
not received much explicit study, there seems
little doubt that subjects can do this (see
Anderson, 1978a, pp. 40ff). Also, the basic
relative ratio model for division between two
persons did well for the case of heterogeneous
input dimensions, as already noted. Simple
concatenation of these two operations would
yield the input integration model. In further
work on the problem of input integration,
therefore, it may be useful to require initial
judgments of deservingness about each sepa-
rate party. That requirement would presum-
ably force input integration as a preliminary
to equity division and would help localize the 
discrepancies from the model. 

Molar unitization. When input integration 
holds, then the effective input can be treated 
as a molar unit (Anderson, in press-a, in 
press-b; Anderson & Graesser, 1976) without 
regard for its molecular composition. No mat- 
ter how intricate the stimulus field, no matter 
how complicated the process of input integra- 
tion, the net result is a single value. This value 
can be quantified with functional measure- 
ment and used for description and for theo- 
retical tests without knowledge of historical 
antecedents. When input integration holds, 
therefore, it can markedly simplify the theo- 
retical analysis. 

When equity integration holds, however, 
then input cannot be treated as a molar unit. 
Each dimension of input must be considered 
separately. That may be feasible when the 
input dimensions are under experimental con- 
trol. However, it could be a serious problem in 
naturalistic situations in which the input di- 
mensions may not all be known. Despite the 
present success of the rule of equity integra- 
tion, the failure of molar unitization is dis- 
heartening because it points to unexpected 
complexities in equity theory. 

One such complexity concerns the boundaries
of equity theory. Some writers (e.g.,
Deutsch, 1975; Lerner, 1975) have assumed
that parity (all persons treated alike) or need,
for example, were distinct concepts that lay
outside the boundaries of equity theory. Other
writers (e.g., Walster et al., 1976) have
claimed equity theory as a general theory, presumably
encompassing such concepts as need,
effort, and so forth. However, neither group
of writers has discussed how this issue may
be tested.

Equity theory does provide a simple con- 
ceptual test: If need affects input, then need 
lies within the domain of equity theory. This 
conceptual test would become operational if 
input integration held. Then the pattern of 
data could reveal whether need affected input 
in the same way as work contribution or acted 
as a separate force (Anderson, Note 2). But
if equity integration holds, then need acts
separately, regardless of whether it is a proper
determinant of input. Under equity integration,
therefore, it may not be easy to progress
beyond phenomenological speculation about 
the boundaries of equity theory. 

Despite these disadvantages, the rule of 
equity integration does have one notable ad- 
vantage in relation to the problem of multi- 
dimensional outcome (Farkas, 1977). In gen- 
eral, outcome is not one-dimensional. Besides 
money, it may include praise, status symbols, 
working conditions, and so forth. A simple 
rule of input integration would require that 
all outcomes be divided in the same propor- 
tion. However, a rule of equity integration 
makes direct allowance for weights that repre- 
sent input-outcome relevance. Thus, more 
money might be given the person with the 
greater work contribution, more praise to the 
person who tried harder, and more status 
symbols to the person who needed them more. 
From this standpoint, therefore, the rule of 
equity integration may be conducive to a 
more just society. 


Overview of Equity Models 


The discussion above has been restricted to 
the equity model from information integration 
theory, as given by Equations 1 and 2 in the 
introduction. This section gives an overview 
of other models that have been proposed in 
equity theory, followed by a brief summary 
of negative evidence on the basic ratio as- 
sumption employed in many of these models. 
For simplicity, it will be assumed that inputs 
are one-dimensional and positive (see also An- 
derson, Note 2, p. 162). 

Three ratio models. The relative ratio 
model from information integration theory 
(Equations 1-2) is closely related to ratio 
models proposed by Aristotle (quoted in 
Walster & Walster, 1975) and by Adams 
(1965). Aristotle’s definition of equity is given 
by the equation: 


    O_A/O_B = I_A/I_B. (10)

Adams defines equity by the equation:

    O_A/I_A = O_B/I_B. (11)


These models imply different psychological
comparison processes. In Aristotle’s model,
the two ratios involve comparison of like
quantities. In Adams’ model, the two ratios
involve comparison of unlike quantities. Alge- 
braically, however, they are equivalent to each 
other and to the relative ratio model of inte- 
gration theory. Equity judgments do not dis- 
tinguish among these three models. 
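
The algebraic equivalence stated here can be checked in two lines. The derivation below is a sketch under stated assumptions: all inputs and outcomes are taken to be positive, and the relative ratio model of Equations 1-2 (not reproduced in this section) is assumed to have the form O_A/(O_A + O_B) = I_A/(I_A + I_B).

```latex
\begin{align*}
\frac{O_A}{O_B} = \frac{I_A}{I_B}
  &\iff O_A I_B = O_B I_A
   \iff \frac{O_A}{I_A} = \frac{O_B}{I_B}, \\
\frac{O_A}{O_A + O_B} = \frac{I_A}{I_A + I_B}
  &\iff O_A I_A + O_A I_B = O_A I_A + O_B I_A
   \iff O_A I_B = O_B I_A .
\end{align*}
```

All three conditions reduce to the same cross-product identity, so no equitable division can separate them.
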
However, it is possible to distinguish among
the three models by obtaining judgments of
inequity (Anderson & Farkas, 1975). In these
tests, both the relative ratio model and Aristotle’s
model did quite well. Adams’s model,
however, did quite poorly, an outcome that
has been corroborated in an extensive study
of unfairness judgments by Farkas (1977).

Walster, Berscheid, and Walster. Walster 
et al. (1976; see also Walster, 1975) defined 
equity with the following equation: 


    (O_A - I_A)/|I_A|^{k_A} = (O_B - I_B)/|I_B|^{k_B}, (12)

where the ks are +1 or -1 according as
(O - I) and I have the same or opposite
signs. This model was introduced in an attempt
to avoid apparent paradoxes that arise
with negative input for the three ratio models
considered in the preceding subsection. However,
simple thought experiments show that
the Walster model also leads to paradoxes,
even indeed with positive inputs and outcomes
(Anderson, Note 2, p. 161; Note 3, p. 16).

To illustrate, let I_A = 10 and I_B = 5, in
2:1 ratio. With 18 units to be divided, an
equitable distribution is clearly O_A = 12 and
O_B = 6. The three ratio models and the Walster
model all agree on this.

Now suppose that there are only half as
many outcome units to be divided. It seems
obvious that the outcome for each person
should also be reduced in half. Indeed, the
first three models imply O_A = 6 and O_B = 3.
But the Walster model requires O_A = 8 and
O_B = 1. It seems hardly just that B should
lose 4 out of 5 input units, whereas A loses
only 2 out of 10.

Finally, suppose that there are only 3 units
to be divided. Then the Walster model not
only denies that O_A = 2 and O_B = 1 is equitable,
but claims that equity requires O_A = 6
and O_B = -3. Because B’s input is less
than A’s, B is required to make an added
payment to A. Thus, B’s net loss comes to
8 units, whereas A’s net loss is only 4 units.
This thought experiment shows that the Walster
model cannot be valid.
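
The arithmetic of this thought experiment can be checked mechanically. The sketch below is an illustration only and is not part of the original analysis: it solves Equation 12 for a two-party division of a fixed total by trying each sign assignment for the two exponents and keeping the solutions whose signs are self-consistent. The function names (`ratio_division`, `walster_division`) and the use of exact fractions are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product


def ratio_division(i_a, i_b, total):
    """Proportional split implied by the three ratio models (positive integer inputs)."""
    o_a = Fraction(total) * Fraction(i_a, i_a + i_b)
    return o_a, Fraction(total) - o_a


def _k_consistent(k, gain, inp):
    """k must be +1 when (O - I) and I share a sign, -1 when they differ."""
    if gain == 0:
        return True  # either exponent gives the same value, 0
    same_sign = (gain > 0) == (inp > 0)
    return k == (1 if same_sign else -1)


def walster_division(i_a, i_b, total):
    """Divisions of a fixed total that satisfy Equation 12 (nonzero inputs)."""
    solutions = []
    for k_a, k_b in product((1, -1), repeat=2):
        # Equation 12 reads a*(O_A - I_A) = b*(O_B - I_B), with O_B = total - O_A.
        a = 1 / Fraction(abs(i_a)) ** k_a
        b = 1 / Fraction(abs(i_b)) ** k_b
        o_a = (a * i_a + b * (total - i_b)) / (a + b)
        o_b = total - o_a
        if (_k_consistent(k_a, o_a - i_a, i_a)
                and _k_consistent(k_b, o_b - i_b, i_b)
                and (o_a, o_b) not in solutions):
            solutions.append((o_a, o_b))
    return solutions


if __name__ == "__main__":
    for total in (18, 9, 3):
        print(total, ratio_division(10, 5, total), walster_division(10, 5, total))
```

For inputs of 10 and 5, the two rules coincide when 18 units are divided and diverge for totals of 9 and 3, which is the pattern described in the text.
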

Harris. A somewhat different approach to 
equity models has been taken by Harris 
(1976), who considers a number of formulas 
in terms of rational criteria that they might 
be expected to satisfy (e.g., outcome should 
be an increasing function of input). Only
Harris’s preferred “linear formula” (slightly
amended by Harris, Note 4) is considered 
here. For a two-party division, it defines 
equity by the equations, 


    O_A = aI_A + r
    O_B = aI_B + r, (13)


where the parameters a > 0 and r are the
same for both parties, but may depend on
I_A and I_B.

This formula is unsatisfactory because it
can always give a perfect fit to two-party divisions.
The formula is not truly linear because
a need not and generally could not be
constant across situations. In general, Harris
allows a to be some arbitrary, unknown function
of the inputs and outcomes of all the
parties to a given division. Equations 13 are
thus two equations in two unknowns and so
can always be solved exactly for a two-party
division. In the present two-factor designs of
Figures 3 and 4, for example, Harris’s formula
would provide a perfect fit to every data
point. These perfect fits do not support the
formula, of course, but instead show that it
is not testable with two-party divisions. Perhaps
some theoretical foundation for the a
parameters could be found that would allow
a test similar to that allowed by the models
discussed above. As long as the formula is not
testable with two-party divisions, however, it
does not seem to be very satisfactory.
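
The point about two equations in two unknowns can be made concrete. The sketch below is an illustration under stated assumptions, not Harris’s own procedure: the helper name `fit_harris_linear` is hypothetical, and unequal inputs are assumed so that the system has a unique solution. It solves Equations 13 for a and r from one observed two-party division and confirms that the fitted values reproduce both outcomes exactly.

```python
from fractions import Fraction


def fit_harris_linear(i_a, i_b, o_a, o_b):
    """Solve O_A = a*I_A + r and O_B = a*I_B + r for (a, r).

    Assumes I_A != I_B; with equal inputs the two equations either
    coincide or contradict each other.
    """
    if i_a == i_b:
        raise ValueError("inputs must differ for a unique solution")
    a = Fraction(o_a - o_b, i_a - i_b)  # slope through the two (I, O) points
    r = o_a - a * i_a                   # intercept
    return a, r


if __name__ == "__main__":
    # Three different divisions for inputs 10 and 5: each is fit without error.
    for o_a, o_b in [(12, 6), (8, 1), (6, 3)]:
        a, r = fit_harris_linear(10, 5, o_a, o_b)
        assert a * 10 + r == o_a and a * 5 + r == o_b
        print(f"O_A={o_a}, O_B={o_b}: a={a}, r={r}")
```

The constraint a > 0 is satisfied whenever the party with the larger input also receives the larger outcome, so any such division is fit perfectly by some choice of a and r.
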

Subjective values. As has been emphasized
by Adams (1965) and by Walster et al.
(1976), equity theory in psychology deals
with personal values and feelings. In general,
therefore, tests of equity models must be capable
of measuring psychological values of input
and outcome on true linear or ratio scales.
Failure to recognize the need for subjective
values can undercut the theoretical reasoning.
This difficulty can be seen in the literature
on children’s judgments of equity. Nearly all 
this work has relied on physical counts as 
measures of input and outcome. Input might 
be measured by number of problems solved, 
for example, and outcome by number of can- 
dies received. Various theoretical interpreta- 
tions rest on the implicit assumption that 
subjective values are proportional to these 
physical values. This assumption is arbitrary 
and uncertain, however, and so some of this 
work does not have any clear meaning (Ander- 
son & Butzin, 1978). 

It should be emphasized that many studies 
of equity do not require linear or ratio scales. 
A monotone, rank-order response scale can be 
entirely adequate for certain purposes (see, 
for example, Anderson, 1976b, p. 689), and 
it may not then be worthwhile to seek for 
linear scales. For other purposes, however, 
linear or ratio scales are vital. One reason for 
studying algebraic models of equity is that 
they can provide a base and frame for such 
measurement scales. 

Evidence on the relative ratio model. Although
the relative ratio model has done remarkably
well in some situations, it has done
more or less poorly in others. Existing data 
point to three areas of poor performance that 
deserve mention. The most severe discrep- 
ancies were obtained in the sole test in which 
total outcome was not a fixed sum. In this 
case, the model has a ratio form that implies 
a linear fan pattern of response. The obtained 
response pattern showed marked discrepancies 
from this theoretical linear fan (Anderson, 
1976a, Figure 1 and p. 295; 1978b). A second 
kind of discrepancy appeared in Experiment 
2 above for the task in which two pieces of 
similar performance information were given 
about each person. Although the rule of input 
integration was perhaps not too far off, the 
meaning and seriousness of the discrepancies 
remain unknown.

The third discrepancy is that rather similar 
tasks sometimes yield the theoretical pattern 
of a slanted barrel, but sometimes a pattern 
of approximate parallelism. Illustrations ap- 
pear in Figures 3 and 4 above, and in Figures 
2 and 3 of Anderson (1976a). Failure to get 
the theoretical barrel shape does not seem to 
be due to lack of power. Although the obtained
barrel patterns show only modest deviation
from parallelism, they have typically
been highly significant statistically. It is pos- 
sible that the approximate parallelism results 
from a tendency to simplify the task by using 
adding or subtracting rules in place of the 
seemingly more difficult ratio rule. This is 
only speculation, however, for there is almost 
no evidence on the causes of this third kind 
of discrepancy. 

This brief summary indicates that the cog- 
nitive algebra of equity (and inequity) is in 
an uncertain state. Still, the relative ratio 
model has had some substantial successes, and 
there is hope that the discrepancies will find 
a simple theoretical interpretation.


Race, Sex, and the Expression of Self-Fulfilling
Prophecies in a Laboratory Teaching Situation 


Marylee C. Taylor 
Harvard University 


The aims of the study were (a) to identify patterns of teacher behavior that 
vary with expectations for pupil ability and thus may mediate teacher expect- 
ancy effects and (b) to investigate ways in which pupil race and sex may inter- 
act with perceived pupil ability in affecting teacher behavior. For the experi- 
ment, 100 white female undergraduates enrolled in teacher training programs 
presented lessons to “phantom” students, who were alleged to be observing and 
responding from behind one-way glass. Race, sex, and ability manipulations 
were embedded in descriptions of these phantom pupils. Numerous verbal and 
nonverbal aspects of teacher behavior were recorded. Subjects with low-ability
“pupils” taught less material, allowed less response opportunity, gave lengthier
praise after successful performance, and showed less vocal nervousness. Partic- 
ularly interesting Race X Ability interactions were found, showing that teachers 
direct most positive affect to high-ability black and low-ability white pupils. 


In 1968 Robert Rosenthal and Lenore 
Jacobson published the report of a study dem- 
onstrating that experimentally manipulated 
teacher expectancies for pupil progress can
affect pupils’ intellectual growth. Subsequent 
research has provided steadily accumulating 
evidence of self-fulfilling prophecy effects in 
classroom education (Crano & Mellon, 1978; 
Rosenthal, 1973, 1976). One implication is 
that the much discussed relation between 
pupil academic achievement and such background
characteristics as race and socioeconomic
status may be partially attributable to
teacher expectations for pupil performance. 

Although little question remains about the 
existence of expectancy effects, there are gaps
in our understanding of how these effects operate.
The present study addresses two kinds
of questions about the operation of teacher 
expectancy effects: First, what particular pat- 
terns of teacher behavior vary with expecta- 
tions for pupil ability and thus may mediate 
the relation between teacher expectations and 
pupil performance? And secondly, how do 
teacher expectations about pupil ability in- 
teract with other pupil characteristics—par- 
ticularly pupil race, and also pupil sex—in 
affecting teacher behavior? 


Ability-Expectancy Effects 


Systematic investigation of the mediation 
of teacher expectancy effects has been under- 
taken by a number of researchers, and an in- 
tegration of these research findings has been 
presented by Rosenthal (1973). In his review,
Rosenthal offers a four-factor theory
of teacher expectancy mediation, naming four 
major categories of mediators, each suggested 
by at least five empirical studies to be related 
to pupil-ability expectations held by teachers. 
Rosenthal’s climate factor represents general, 
noncontingent teacher warmth, communicated 
through such behaviors as smiling, head nodding,
leaning forward, and maintaining eye
contact. Greater teacher warmth is generally 
directed to pupils for whom positive expecta- 
tions are held. The second factor is input; 
teachers try to teach more material and more 
difficult material to their high-ability pupils. 
Represented in Rosenthal’s output factor is 
the evidence that teachers encourage greater 
responsiveness from high-ability pupils. This 
may take such forms as offering more frequent 
response opportunities, asking harder ques- 
tions, or being more patient in waiting for 
pupil responses. 

Research evidence pertaining to Rosenthal’s 
fourth factor, feedback, is more complex. In 
general, it seems to be the case that feedback 
is given more clearly and more consistently 
to high-ability than to low-ability pupils. 
There are qualifications, however: When criti- 
cism after unsuccessful performance is the 
feedback at issue, low-ability pupils have 
sometimes been the disfavored group; and 
particularly high levels of teacher praise have 
occasionally been found to be directed toward 
low-achieving and disliked pupils (Brophy & 
Good, 1974). Of the four categories suggested 
by Rosenthal, feedback seems to be the one 
most in need of conceptual and empirical 
clarification. 

A goal of the present study was to examine 
effects of experimentally induced teacher ex- 
pectations on behaviors representing all four 
of the mediation types Rosenthal suggests. 


Pupil-Race Effects 


Previous research provides substantial evi- 
dence for effects of pupil race as well as abil- 
ity expectations on teacher orientation and 
behavior, with black children the recipients 
of the less desirable evaluations and treatment 
at the hands of (typically white) teachers. 
(See, for example, Antonoplos, 1972; Datta, 
Schaefer, & Davis, 1968; Leacock, 1969; Yee, 
1968). Yet these studies focused on classroom 
teachers and their actual black and white 
pupils, so that pupil race was confounded with 
other pupil characteristics that could plausi- 
bly affect teacher orientation. Studies that 
have failed to find effects of pupil race typi- 
cally employed a “phantom other” design, in 
which subjects were asked to rate written descriptions
of imaginary black and white students
(see Dietz & Parkey, 1969; Long &
Henderson, 1974; Mazer, 1971; Miller, 1973),
and there is reason for concern about generalizability.
In a series of studies that attempt
to address both internal and external validity
concerns, safeguards against confounding with
pupil race and realism were introduced in
the design (Coates, 1972; Crowl & MacGinitie,
1974; Harvey & Slater, 1975; Jensen
& Rosenfeld, 1974); in these, pupil-race effects
were found, but the dependent variables
were teacher ratings rather than behavioral
measures. Thus, there do remain questions
about the effect of pupil race per se on teacher
behavior.


The Repression-of-Affect Hypothesis


An intriguing hypothesis specifying a pattern
of incongruous race effects comes not
from educational research but from a study
of interracial interaction. Weitz (1972) administered
a racial-attitude questionnaire to
her subjects, and later obtained a number of
measures of their orientation to a fictitious,
soon-to-appear co-worker described to them
as being either black or white. The most interesting
aspect of the Weitz findings was the
negative correlations among some of her measures
of white-to-black attitudes and behaviors.
Weitz concluded that whites with negative
racial feelings may, under pressures of a
liberal environment, attempt to portray a problack
posture but leak their underlying negative
feelings in behaviors not amenable to
conscious control. The result would be the
emission of double messages, suggested (Bateson,
Jackson, Haley, & Weakland, 1956) to
be harder for children to handle than consistent
negative messages. The repression-of-affect
proposition is certainly important
enough in its implications to warrant
testing.


The Interaction of Pupil Race and Ability


Central to the present project is the question
of how pupil race may combine with
ability expectations to influence teacher behavior.
Most discussions of pupil-race and
ability expectations assume a simple additive 
model: Whiteness and a high-ability label 
serve as two strikes in favor of a child, black- 
ness and a low-ability reputation, two strikes 
against. A disturbing alternate possibility is, 
however, suggested by the findings of Rubo- 
vits and Maehr (1973). When these research- 
ers assigned white university students to teach 
black and white children who had been ran- 
domly assigned labels of “gifted” or “non- 
gifted,” there was an interaction of race and 
ability in their effect on teacher behavior: 
The gifted label seemed to be an advantage 
for the white children but a disadvantage for 
the black children. 

The Rubovits and Maehr evidence does not, 
however, stand without contradiction. Other 
studies have yielded results suggesting Race 
X Ability interaction in the other direction, 
with low-ability black pupils being the recipi- 
ents of particularly negative teacher orienta- 
tion (see Datta et al., 1968; Henderson & 
Long, 1973). In the case of these latter studies,
however, the dependent measure has been
teacher ratings rather than behaviors. Thus 
there remains the need to examine Race X 
Ability effects on teacher behavior, using a 
design that eliminates the possible confounding
of pupil race with other factors. Such an
examination was a major goal of the present
study.


Pupil Sex and Its Interaction With 
Race and Ability 


As a supplementary benefit, the study offered
the opportunity to examine teacher behavior
for evidence of pupil-sex effects. From
their review of relevant research, Brophy and
Good (1974) concluded that there is evidence
of teachers’ preference for female pupils, as
well as evidence that teachers underestimate
both the potential and the achievements of
male pupils. However, they failed to find convincing
evidence of favoritism toward girls in
teacher behavior and concluded that previously
noted pupil-sex effects on teacher behavior
were accounted for by sex differences
in the behavior of the children. The present
study allowed reconsideration of the no-difference
conclusion drawn from previous research
on sex effects. Perhaps more interesting, it 
allowed examination of evidence regarding 
possible interactive effects on teacher behav- 
ior of pupil sex and ability, and of pupil sex 
and race, topics about which little is, as yet, 
understood. 


Overview of the Study 


The present study was designed to address 
simultaneously a number of the hypotheses 
and questions suggested by the literature re- 
viewed here. White female undergraduates 
enrolled in teacher training programs were 
asked to teach two lessons to a phantom stu- 
dent, who was alleged to be observing and re- 
sponding to them from behind one-way glass. 
On a light panel in the experimental room, the 
subjects received standard responses to their 
questions from their “6-year-old pupil,” de- 
scribed as being either black or white, male 
or female, and of high or low ability. Numer- 
ous kinesic and paralinguistic measures of 
teacher behavior during the lesson presenta- 
tion were obtained, along with measures of 
various aspects of teaching style as well as 
teacher ratings of the pupils. 

The variety of behavioral measures in the 
present study, including representatives of 
each of Rosenthal’s (1973) four factors, al- 
lowed for examination of effects of pupil char- 
acteristics on the patterning of those teacher 
behaviors suggested to be most important in 
the transmission of self-fulfilling prophecy ef- 
fects in one-to-one teaching situations. This 
design allowed testing for interactive effects 
of ability expectations and pupil race on 
teacher behavior, without the possibility, pres- 
ent in the Rubovits and Maehr (1973) study, 
of confounding effects from actual differences 
in pupil behavior. Built into the present study 
was a direct search for evidence of Weitz’s 
(1972) repression of affect in subjects teach- 
ing black “pupils”: At the midpoint of the 
experimental session there was a manipula- 
tion of demand characteristics; half the sub- 
jects received a message designed to encourage 
attempts at problack behavior and thus to 
maximize the probability of leakage and the 
contradictory pattern Weitz discusses. Finally, 
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auxiliary questions, such as those regarding 
main and interactive effects of pupil sex on 
teacher behavior, were examined. 


Method 
Subjects 


For the experiment, 105 white female¹ undergraduates
at the University of Massachusetts, Amherst,
served as subjects. All were either majoring in
education or working for teaching credentials. All
but 2 were upper-class students, all but 12 had practice
teaching experience or the equivalent, and the
majority were enrolled in elementary education programs.
A few subjects volunteered in response to
posted announcements of the study, but the vast
majority were identified from university records and
recruited by telephone. Recruits were told only that
they would receive $2 compensation for 1 hour of
participation in an experiment that would concern
teaching.


Setting, Equipment, and Materials 


The experimental room, in which the subjects performed
their teaching task, was simply furnished
with a long table, a rolling chair, a floor microphone
to be used during lesson presentations, and an armchair
and coffee table for the subjects’ use during
other phases of the experimental routine. Furnishings
were positioned so that during their lesson
presentations the subjects could sit at an angle from
the unobstructed section of a one-way glass window
into the observation room, facing the window but
unable to see their own reflection in it. Beneath the
observation room window stood a light panel on
which the “pupil” responses were to be transmitted
to the subjects. The panel contained 10 lights, labeled
to represent 10 possible responses to the prescribed
questions accompanying the lessons.

Two lessons were constructed for the experiment.
Both concerned mathematics, and the two paralleled
each other as nearly as possible. Each lesson contained
four sections, the internal structure of the
sections being identical. A given section contained
the following: a statement of the concepts and definitions
to be read to the “pupil” by the subject; a
set of examples, half of which were illustrated; question-answering
instructions to be read to the “pupil”;
and a set of illustrated questions.

Standard response schedules were followed by an
experimental assistant who controlled the light panel.
For each lesson question, the schedule specified not
only the response(s) to be given but also the latency 
of the response(s). (In some cases incorrect answers 
were given initially, and correct responses given only 
after repeated questioning; for such questions, the 
sequence of responses to be given by the experimental 
assistant was specified by the schedule.) 

The lessons were constructed in consultation with
a specialist in elementary mathematics education.
Response schedules were constructed after pretesting 
the lessons on children the age of the alleged pupils 
of the experiment. Thus, the appearance of the les- 
sons as legitimate educational vehicles and the plausi- 
bility of the allegation that an actual child was re- 
sponding to the questions were maximized. 


Experimental Manipulations and Design 


The three major independent variables were manipulated
through a Pupil Information Sheet, which
contained pupil race, sex, and ability designations
embedded in other information. All pupil information
(including age, 6 years) was identical except for
the pupil’s name (John or Betty Powell), sex, ethnic
origin (Afro-American or English American), and
ability (“at the high end of the class range” and
“in one of the highest classroom groups” or “at the
low end of the class range” and “in one of the lowest
classroom groups”). The awareness manipulation,
designed to intensify demand characteristics and possible
repression of affect, was included as part of
a written reminder for the subjects to try to behave
as if they were in a natural teaching situation. For
those in the aware condition, this reminder was preceded
by the statement: “A major purpose of this
study is to compare the ways teachers behave toward
black and white students.”

A completely crossed repeated-measures design was
used. Subjects in the 16 (Ability X Race X Sex X
Awareness) conditions were measured twice on each
of the dependent variables, once during each lesson
presentation. In order to minimize the possibility
that any effects of time or order would be confounded
with those of the independent variables, assignment
to experimental condition was performed within each
block of 16 subjects scheduled for consecutive testing.
Because some supplementary subjects were tested
and their data retained in the analysis, cell ns were
slightly uneven, ranging from five to eight.
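
The blocked assignment described in this paragraph can be sketched in code. This is a hypothetical illustration only: the source says that assignment was performed within each block of 16 consecutively scheduled subjects but does not say how, so the random shuffling, the function name `assign_conditions`, the factor labels (including "unaware"), and the seed are all assumptions.

```python
import random
from itertools import product

# Factor levels; the labels are illustrative stand-ins for the manipulations.
FACTORS = {
    "ability": ("high", "low"),
    "race": ("black", "white"),
    "sex": ("male", "female"),
    "awareness": ("aware", "unaware"),
}


def assign_conditions(n_subjects, seed=0):
    """Assign subjects, in testing order, to the 16 crossed conditions.

    Each consecutive block of 16 subjects receives every condition once,
    in shuffled order; a final partial block receives a random subset.
    """
    rng = random.Random(seed)
    conditions = list(product(*FACTORS.values()))  # 2 x 2 x 2 x 2 = 16 cells
    schedule = []
    while len(schedule) < n_subjects:
        block = conditions[:]
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_subjects]


if __name__ == "__main__":
    schedule = assign_conditions(105)
    print(schedule[:3])
```

Because every complete block contains each cell exactly once, slow drifts across the testing period are spread evenly over conditions rather than piling up in any one cell.
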


Procedure 


As each subject was greeted by the (female) experimenter,
she was presented with written material
containing instructions and an explanation that the
research purpose was to examine certain aspects of
teaching behavior in a situation where feedback
from pupil to teacher was limited. The subject was
then taken to the experimental room and allowed
5 min to read the Pupil Information Sheet, which
contained the pupil race, sex, and ability manipulations;
to review the materials pertaining to the first
of the two lessons she would present (the order
in which the two lessons were presented was simply
alternated from one subject to the next); and,
finally, to rate estimated lesson difficulty for the
pupil. At the conclusion of this interval, the subject
was signaled to move the rolling chair and floor
microphone to a position of her choice alongside the
large table, and to begin presenting the lesson to
the pupil who was allegedly listening and watching
through the observation room window. Lesson presentation
involved proceeding through the sections
of the lesson, repeating a sequence that consisted of
exposition, optional presentation of illustrated or nonillustrated
examples, and questioning pertinent to the
material of that section. At the end of 10 min, the
subject was signaled to conclude the lesson presentation,
and was given 4 min in which to rate the learning
and ability of the pupil.

¹ With four independent measures of central interest
here, it was deemed infeasible to vary experimenter
sex and race as well. White females were the
group targeted for study because of their historical
and continuing prevalence in elementary teaching.
There is certainly the need for analogous investigation
of black and male populations, an inviting task
for the future.

At this point the subject read the reminder in 
which the awareness manipulation was included. 
Then the second lesson was presented, procedures for 
this parallel presentation being identical to those for 
the first. 

Before being excused, the subjects completed a 
written manipulation check and probe for hypothe- 
ses about the nature of the experiment, and were 
paid the $2 compensation. Later, after all the sub- 
jects had been tested, a thorough debriefing letter 
was mailed to all the participants, and an announce- 
ment was made of a session at which the subjects 
could discuss the study with the experimenter. 

Two assistants were present in the observation 
room during all testing sessions, one controlling re- 
cording equipment and the light panel that allegedly 
conveyed pupil responses to questions, and the other 
rating kinesic and proxemic behaviors of the subjects.
When possible, the experimenter made independent
ratings so that interjudge reliabilities could
be computed. 

As a safeguard against experimenter expectancy
effects, all those at the research site were completely
blind to subject condition, except in those cases
where the subject used the name of her “pupil,” thus
revealing the sex of the “pupil” to the assistant who
was performing auditory monitoring of the teaching
sessions. This assistant was positioned so as to
be shielded from visual cues, which might have biased
responses on the light-panel controls. Analogously,
the assistant performing proxemic and kinesic ratings
was shielded from vocal cues, which might have
contaminated the ratings.

Aside from the teacher ratings and the kinesic
measures recorded by the rater positioned in the
observation room, dependent measures were taken
from the audio recordings of the teaching sessions.
Most involved simple coding performed directly from
the audiotapes, the exception being the voice ratings,
which required a more elaborate procedure.

For purposes of obtaining voice ratings, four 3-sec
clips of each subject’s lesson presentation were excerpted,
one from the beginning and one from the
middle of each lesson presentation. These taped excerpts
were content-filtered, a process in which certain
frequencies are deleted from the recording, rendering
the verbal content of the speech incomprehensible
but preserving much of the tonal quality
and rhythm of the speech. (For a more detailed description
of this technique, see Rogers, Scherer, &
Rosenthal, 1971.) For raters, 16 female Harvard
summer school students were recruited. The raters,
half black and half white, ranged in age from 18
to 21 years; eight had had some teaching experience.
Each rater was presented with a stimulus tape containing
half of the 3-sec excerpts, one excerpt representing
each subject during each lesson presentation.
During each of six playings of the tape, these judges
rated the content-filtered excerpts on a dimension
specified by the researcher.


Dependent Measures²


The climate factor. Rosenthal’s (1973) climate factor was
represented by four behavioral measures:

1. Smiling (number of smiles during three 1-min periods of a
lesson presentation);

2. Head nodding (number of up-and-down head nods during three
1-min periods of a lesson presentation);

3. Forward leaning (seconds of leaning toward the observation
room window during three 1-min periods of a lesson
presentation);

4. Vocal warmth (mean of 16 rater judgments of content-filtered
excerpts on a 9-point cold-warm scale, where warm is high).

The input factor. The input factor named by Rosenthal (1973)
was represented by the following measure:

5. Teaching pace (number of nonoptional lesson segments
presented during the 10-min session, coded so that for each of
the four lesson sections, completion of all nonoptional
segments would yield a score of 75).


2 The dependent measures listed here are those to be discussed
in the pages to follow. Their selection from a larger set was
based on considerations of interpretability, theoretical
import, nonredundancy, and empirical interest. Data collected
but not discussed here include the following: a proxemic
measure [...]; kinesic measures of backward leaning and [...]
shifting; various indicators of the way in which [...] time
was apportioned; measures of several kinds of assistance given
during question-and-answer sessions; voice ratings on
dimensions defined by the poles businesslike-personal,
condescending-admiring, discouraging-encouraging, and not
competent-competent; subjects’ ratings of their pupils; and
scores on a test of Afro-American history and culture
administered to the subjects. Information about effects on
these measures and relationships among them can be obtained
from the author.
The output factor. Several assessments of the response
opportunities offered the pupil were made:

6. Impatience (number of times, per question asked, that the
subject interrupted the “pupil” response, i.e., interrupted the
latency period between question and response called for by the
response schedule);

7. Persistence after “pupil” errors (percentage of correct
answers obtained; since correct answers could be obtained only
by repeating questioning, this is a measure of persistence in
repeating incorrectly answered questions).

The feedback factor. The complexity of previous research
findings concerning teacher feedback suggested that multiple
measures should be used. These were

8. Praise feedback incidence (percentage of correct responses
followed by explicit praise, e.g., “very good”; statements
merely confirming the correctness of a response were not
included);

9. Praise feedback length (average seconds spent praising
correct responses per occasion on which such praise was given);

10. Incorrectness feedback length (average seconds [...]
incorrect [...] per occasion [...]);

11. Positive feedback omission (percentage of correct responses
followed by either silence or an [...]);

12. [...] feedback omission (percentage of incorrect responses
[...]).

Vocal anxiety. [...] included in the study:

13. Speech errors (number of times the subject corrected
herself, made obvious grammatical errors, [...] words, misread
sections of [...] “tongue twisted”);

14. [...] (seconds of pausing during feedback, [...] of silence
[...]);

15. Vocal nervousness (mean of 16 rater judgments of
content-filtered excerpts on a [...] nervous-calm scale, where
calm is high).

Inadvertent helpfulness. The final behavioral measure [...] the
audiotapes:

16. Helpful slips (number of times the subject made an apparent
slip of the tongue and gave away the answer while trying to ask
a question).

3 Subject behavior after correct “pupil” responses was coded
into three all-inclusive categories: praise, correctness
feedback (without praise), and feedback omitted. Thus the
incidence of praise and feedback omission measures reported
here [...] information about type of [...] following correct
“pupil” responses.

4 Subject behavior after incorrect “pupil” responses [...]
(e.g., cases in which subjects [...] correct answers), coded
into two categories [...] given [...] or omitted. Thus,
virtually all [...] information [...] omission variable.


Preliminary Analyses


[...] remaining responses being noncommittal. When asked about
ethnic origin of their “pupils,” 96 responded appropriately, 8
failed to answer, and 1 subject in the Afro-American condition
reported “Anglo-Saxon.” This last subject, who apparently
misread the manipulation, was the only case excluded from the
data analysis on the basis of the manipulation check; four
others were excluded because of some deviation from standard
experimental procedure. Thus, primary analyses were carried out
on data from 100 subjects.

The subjects’ ratings of their “pupils” were used to assess the
potency of the ability manipulation. The subjects had estimated
“lesson difficulty for the pupil” and “pupil intellectual
potential,” in each case yielding a score from 0 to 130. Those
in the high-ability condition estimated that the lessons would
be easier for their pupils (M = 71.2) than did those in the
low-ability condition (M = 53.7), F(1, 77) = 17.19, p < .001,
MSe = 807.4, d = .945. And subjects in the high-ability
condition also gave higher ratings of their pupils’
intellectual potential (M = 89.1) than did those in the
low-ability condition (M = 77.4), F(1, 82) = 10.29, p = .002,
MSe = 671.9, d = .708.

Subjects’ hypotheses regarding study. When asked about the
purpose of the study, nearly one half (44) of the subjects
named a theme related in some way to the teacher expectancy
idea (e.g., suggesting that the study’s purpose was to assess
the impact of pupil background on teaching style). Most of the
remaining group repeated the cover story told them by the
experimenter, or named an unrelated hypothesis. (As will be
reported subsequently, where effects of ability on teaching
behavior were found, these were confirmed in analyses that
excluded the 44 subjects holding accurate suspicions about the
research purpose.) Only eight subjects revealed skepticism
about the existence
Table 1
Summary of Analysis of Variance Effects

Effect                  Measure category      Measure                         p
Ability (A)             Input                 Teaching pace                   .022
                        Output                Impatience                      .028
                        Feedback              Praise feedback length          .009
                        Vocal anxiety         Speech errors                   .033
                                              Vocal nervousness               .075
Race (B)                Feedback              Positive feedback omission      .056
                                              Incorrectness feedback length   .054
                        Freudian helpfulness  Helpful slips                   .017
Sex (C)                 Climate               Head nodding                    .026
B × C                   Feedback              Positive feedback omission      .049
                        Freudian helpfulness  Helpful slips                   .009
B × Awareness × Time    Output                Impatience                      .065
                        Feedback              Positive feedback omission      .019
B × A                   Climate               Vocal warmth                    .023
                                              Forward leaning                 .051
                        Freudian helpfulness  Helpful slips                   .026
C × A                   Feedback              Praise feedback incidence       .009
                        Freudian helpfulness  Helpful slips                   .026

Note. Relevant Fs and effect size estimates are presented in the
text. Subgroup means for main effects are presented in the text,
those for interaction effects in Tables 2-5.
of an actual child in the observation room, and many of the
subjects responded in a way that confirmed their belief in the
existence of an actual pupil, for example, by showing concern
for the reactions of a child to the experimental procedure or
by suggesting that pupil learning was the dependent variable.

Reliabilities and intercorrelations. Reliabilities of the
kinesic ratings were estimated using the independent ratings of
a second rater for a subsample of the subjects. For smiling,
r(11) = .[?]77 and r(20) = .400 for the first and second lesson
presentations, respectively; for head nodding, r(12) = .708 and
r(29) = .850 for the first and second lesson presentations; and
for leaning forward, r(13) = .863 and r(30) = [...] for the
first and second lessons.

Since neither rater race nor timing of [...] interacted
appreciably with experimental condition in affecting voice
ratings, analyses were performed on the mean of the 16 ratings
available for each subject for each lesson presentation.
Reliabilities [...] a procedure that [...] presentations,
respectively; [...] r(92) = .758 and r(90) = .267 for the first
and second lesson presentations.
Results


An overview of the analysis of variance ef- 
fects to be reported and discussed is presented 
in Table 1. 


Behavioral Mediators of Ability-Expectancy 
Effects 


Main effects of pupil ability in the expected direction were
shown on central indicators of two of the four categories of
mediators suggested by Rosenthal (1973): the input and output
factors. Subject teachers in the low-ability condition taught
at a slower pace and thus presented less material to their
pupils in the allotted time (M = 103.3) than did those in the
high-ability condition (M = 115.7), F(1, 84) = 5.45, p = .022,
MSe = 1,377.1, d = .509.⁶ Perhaps more interestingly, subjects
in the low-ability condition were less patient in waiting for
responses to questions, interrupting the latency period between
question and response (M = .39) more frequently than subjects
in the high-ability condition (M = .23), F(1, 83) = 5.06,
p = .028, MSe = 2,400.8, d = .494.

Among the several feedback indicators, there were main effects
of pupil ability on only one: Statements praising pupils after
correct responses were more lengthy among subjects in the
low-ability condition (M = 2.87) than among those in the
high-ability condition (M = 2.03), F(1, 78) = 7.24, p = .009,
MSe = 447.1, d = .609. The null results on the feedback
incidence measures represent a failure to confirm Rosenthal’s
(1973) suggestion that high-ability pupils receive clearer and
more consistent feedback from teachers,⁷ but the significant
effects on feedback length do not represent a reversal of his
prediction: Average length of positive feedback was not an
aspect of teacher behavior discussed by Rosenthal or assessed
in the studies he reviewed, and intercorrelations among
measures in the present study showed feedback length to be
unrelated to those incidence measures Rosenthal did discuss.

Rosenthal’s fourth factor, climate, was the one category that
showed no significant main effects of pupil ability, although,
as will be reported later, there was a highly interesting
Race × Ability interaction on a primary climate indicator,
vocal warmth.

The effects of the ability manipulation on the measures of
vocal anxiety suggest a category of expectancy mediator that
Rosenthal did not discuss. Low-ability subjects made fewer
speech errors (M = 3.15) than did high-ability subjects
(M = 4.14), F(1, 84) = 4.72, p = .033, MSe = 10.05, d = .474.
And there was a nearly significant effect of pupil ability on
rated vocal nervousness, the content-filtered speech of
subjects in the low-ability condition being rated as less
nervous (M = 5.36) than that of those in the high-ability
condition (M = 5.08), F(1, 84) = 3.25, p = .075, MSe = 12,018,
d = .393.⁸

In summary, these data support Rosenthal’s hypothesis that
low-ability pupils are taught less and are given fewer response
opportunities than are high-ability pupils. The low-ability
group did not give less clear or consistent feedback in the
present study; praise given after correct responses was more
extensive in this group than among the high-ability subjects.
And the climate provided by the low-ability group as a whole
was not less warm than that provided by the high-ability group.
Vocal anxiety is suggested by these results to be an additional
category of teacher expectancy mediator; the subjects teaching
low-ability pupils manifested less nervousness.⁹


6 All analysis of variance results reported are from repeated
measures analyses with four between-subjects factors (pupil
ability, pupil race, pupil sex, and awareness) and one
within-subjects factor (Time, i.e., first vs. second lesson
presentation).

Throughout the article $d = 2\sqrt{F}/\sqrt{df}$, where $df$ is
the degrees of freedom for the error term. Thus any given
between-groups difference is standardized by the within-cell
standard deviation estimated from the sums of squares of the
error term of the corresponding F. Conventionally, d = .20 is
considered a small effect, d = .50 a medium effect size, and
d = .80 large (Cohen, 1969, p. 38). Minor fluctuations in
degrees of freedom for these analyses reflect occasional
missing data.

7 The feedback-omission indicators are the most relevant to
Rosenthal’s hypothesis, and the modest reliabilities of these
measures suggest one possible explanation for the absence of
the predicted ability effects.

8 One reader suggested that the indications of anxiety among
subjects in the high-ability condition might be tied to the
relatively rapid teaching pace of this group. Intercorrelations
show, however, that there was no substantial relation between
teaching pace and either speech errors or vocal nervousness.
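
The effect-size convention stated in Footnote 6 can be checked against the statistics reported above. The following is a minimal Python sketch, not part of the original analysis; the function name `d_from_f` is hypothetical, and the sketch assumes the formula is applied only to F ratios with one numerator degree of freedom, as in the effects reported here.

```python
import math


def d_from_f(f_value: float, df_error: int) -> float:
    """Return d = 2 * sqrt(F) / sqrt(df_error), the Footnote 6 convention.

    Assumes an F ratio with 1 numerator degree of freedom; the function
    name is illustrative and does not come from the article.
    """
    if f_value < 0 or df_error <= 0:
        raise ValueError("F must be non-negative and df_error positive")
    return 2.0 * math.sqrt(f_value) / math.sqrt(df_error)


# (label, F, error df) copied from the ability main effects in the text.
reported = [
    ("teaching pace", 5.45, 84),
    ("impatience", 5.06, 83),
    ("praise feedback length", 7.24, 78),
]

for label, f_value, df_error in reported:
    print(f"{label}: d = {d_from_f(f_value, df_error):.3f}")
```

Rounded to three places, these three calls reproduce the reported values of .509, .494, and .609.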


Pupil-Race Effects


The subjects’ ratings of their pupils, mentioned earlier as a
means of assessing the potency of the ability manipulation,
revealed subjects in the black-pupil condition to profess
[...]. Subjects whose pupils were black indicated in their
ratings that they expected the lesson to be easier for their
pupils (M = 67.4) than did those in the white-pupil condition
(M = 57.6), F(1, 77) = 5.38, p = .023, MSe = 807.4, d = .529.

However, the teacher behaviors that are the dependent variables
of the present study did not reveal a pattern of particularly
favorable treatment directed at black pupils. The only effect
plausibly interpretable as favorable to blacks is a trend for
those with black “pupils” to give briefer feedback after
incorrect responses (M = 2.94) than those with white “pupils”
(M = 4.65), F(1, 42) = 3.94, p = .054, MSe = 2,026.2, d = .613.
The two other main effects of pupil race on teaching behavior
suggest negative teacher orientation toward black students, if
anything. Omission of positive feedback after correct responses
was more frequent in the black-pupil condition (M = 5.49) than
in the white-pupil condition (M = 2.45), F(1, 84) = 3.76,
p = .056, MSe = 119.7, d = .423. And subjects with black
“pupils” were less likely to make helpful slips of the tongue
that gave away answers (M = .04) than were those with white
“pupils” (M = .15), F(1, 84) = 6.04, p = .017, MSe = .084,
d = .536.¹⁰ These effects should be interpreted, however, in
conjunction with the Race × Sex interaction on these two
measures.

The Interactive Effects of Pupil Race and Sex


Pupil race and sex were found to interact in their effects on
positive feedback omission and helpful slips. Relevant cell
means are presented in Table 2. The subjects in the black-male
and white-female groups were less likely to offer their pupils
either kind of assistance (positive feedback after correct
responses, F(1, 84) = 4.00, p = .049, MSe = 119.7, d = .436, or
helpful slips while asking questions, F(1, 84) = 7.30,
p = .009, MSe = .084, d = .590¹¹) than were the subjects in the
white-male and black-female groups. The combined impact of
these interaction effects and the main effects of pupil race on
these two variables is evident in Table 2.


Table 2
Race × Sex Interactions

                        Pupil sex
Pupil race          Male      Female

Positive feedback omission a
  Black             7.18       3.19
  White             1.61       3.29
Helpful slips b
  Black              .02        .07
  White              .24        .06

a Scores summarized by these cell means are percentages of
correct pupil responses that met with no teacher feedback. The
actual range of scores is from 0 to 70.

b Scores summarized by these cell means represent total number
of times a subject made an apparent slip of the tongue and gave
away the answer while trying to ask a question. The range of
scores is from 0 to 2.
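
The crossover pattern in Table 2 can be summarized with a simple interaction contrast on the cell means. The sketch below is illustrative only: the function name `interaction_contrast` is hypothetical, the contrast uses unweighted cell means, and it is not the repeated measures analysis of variance reported in the article.

```python
def interaction_contrast(cells: dict) -> float:
    """2 x 2 interaction contrast on unweighted cell means:
    (black male - black female) - (white male - white female)."""
    black_diff = cells[("black", "male")] - cells[("black", "female")]
    white_diff = cells[("white", "male")] - cells[("white", "female")]
    return black_diff - white_diff


# Cell means copied from Table 2.
positive_feedback_omission = {
    ("black", "male"): 7.18, ("black", "female"): 3.19,
    ("white", "male"): 1.61, ("white", "female"): 3.29,
}
helpful_slips = {
    ("black", "male"): 0.02, ("black", "female"): 0.07,
    ("white", "male"): 0.24, ("white", "female"): 0.06,
}

print(round(interaction_contrast(positive_feedback_omission), 2))
print(round(interaction_contrast(helpful_slips), 2))
```

A contrast of zero would mean that the male-female difference is the same for both pupil races. Here the contrasts are 5.67 percentage points for positive feedback omission and -.23 for helpful slips, which matches the description of the black-male group as receiving the most feedback omission and the fewest helpful slips.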


9 As mentioned earlier, 44 subjects reported sus- 
picions that the purpose of the study was to investi- 
gate effects of pupil background on teacher behavior. 
Supplementary analyses showed that the ability ef- 
fects reported here were more pronounced among 
the naive subjects than in this suspicious group for 
all except one dependent measure, praise feedback 
length. On the praise-feedback-length indicator, the 
ability effect within the naive group was large enough 
to have been significant had the entire group of sub- 
jects remained naive. Apparently, then, the ability 
effects reported above are not artifacts created by 
cooperative subjects.

10 The helpful slips measure is badly skewed, since 
only 17 of the 100 subjects made any such slips, 
so the probability associated with the F should be 
accepted with caution. To confirm the conclusion 
that the effect was significant, the likelihood of the 
observed distribution of those 17 cases between 
pupil-race conditions was assessed using chi-square. 
For this effect χ²(1) = 7.56, p < .01.

11 For this effect, χ²(1) = 5.74, p < .025. (See explanation
in Footnote 10.)
Table 3
Race × Awareness × Time Interactions

                          Time
Group                 1          2

Positive feedback omission a
  Black
    Aware            11.1        5.4
    Unaware           3.4        2.0
  White
    Aware             1.2        1.7
    Unaware           6.8        0
Impatience b
  Black
    Aware             .37        [illegible]
    Unaware           .31        .34
  White
    Aware             .24        .32
    Unaware           .32        .28

a Scores summarized by these cell means are percentages of
correct pupil responses that met with no teacher feedback. The
actual range of scores is from 0 to 70.

b Scores summarized by these cell means represent average
number of times, per question asked, that subjects interrupted
the latency period between question and response called for by
the response schedule. The range of scores is from 0 to 1.8.


Positive feedback is most often withheld in 
the black-male group, most often given in the 
white-male group; and helpful slips are least 
frequent in the black-male group, most fre- 
quent in the white-male group. 


The Awareness Manipulation Test for
Repression of Affect


The repression-of-affect idea was directly tested here via the
awareness manipulation, and it is the three-way interactions
involving pupil race, awareness, and time that announce the
verdict. Support for the repression-of-affect notion would be
provided by a pattern of such interaction effects showing
controllable, positive behaviors to increase disproportionately
among subjects in the aware–black-pupil condition after the
awareness manipulation was given, and uncontrollable negative
behaviors similarly to increase among this group. Cell means
relevant to the two significant or nearly significant
Race × Awareness × Time interactions are presented in Table 3.
It can be seen that the awareness manipulation encouraged the
decrease of positive feedback omission among those with black
“pupils,” while depressing such a decrease among those with
white “pupils,” F(1, 84) = 5.74, p = .019, MSe = 70.5,
d = .523. And there was a nearly significant tendency for the
awareness manipulation to discourage impatience in subjects in
the black-pupil condition but encourage impatience in subjects
in the white-pupil condition, F(1, 83) = 3.51, p = .065,
MSe = 319.8, d = .411.

These effects suggest that the awareness manipulation may have
produced the intended demand characteristics: The manipulation
may have increased, for black-pupil condition subjects relative
to white-pupil condition subjects, the incidence of behaviors
[...] practice. However, there is no evidence that the relative
advantages that may have been conferred on black pupils by the
awareness manipulation were accompanied by disproportionate
increases in uncontrollable disadvantageous teaching behaviors.
And there are difficulties even in the interpretation of the
two effects that did occur: The pattern of means does not
suggest that certain demand characteristics were produced
uniquely in the aware–black-pupil condition; and
premanipulation (Time 1) differences between aware and unaware
groups throw doubt on any conclusions about the manipulation’s
effect. All in all, it must be said that we fail to find any
support for the repression-of-affect idea in these data.¹²


The Interactive Effects of Pupil Race and Ability


It is the interactive effects of pupil race and ability that
are probably the most intriguing results of the present study.
Cell means relevant to the three such interactive effects that
reached or approached significance are presented in Table 4.
Relative to those in the other two Race × Ability cells,
subjects who believed they were teaching high-ability–black and
low-ability–white pupils spoke in warmer tones, F(1, 84) = 5.39,
p = .023, MSe = 16,227.6, d = .507. Additionally, those in the
high-ability–black and low-ability–white pupil groups were more
likely to make helpful slips as they asked questions,
F(1, 84) = 5.19, p = .026, MSe = .084, d = .497.¹³ And there
was a nearly significant tendency for these groups to lean
forward less often as they addressed their pupil,
F(1, 82) = 3.95, p = .051, MSe = 8,666.1, d = .439.¹⁴


Table 4
Race × Ability Interactions

                      Pupil ability
Pupil race          High       Low

Vocal warmth a
  Black             5.29       4.80
  White             4.84       5.20
Helpful slips b
  Black              .09        .00
  White              .09        .16
Forward leaning c
  Black            121.8      148.4
  White            134.9      108.1

a Scores summarized by these cell means are means of 16 rater
judgments of content-filtered excerpts on a 9-point cold-warm
scale, where warm is high. The range of scores is from 2.31 to
7.50.

b Scores summarized by these cell means represent total number
of times a subject made an apparent slip of the tongue and gave
away the answer while trying to ask a question. The range of
scores is from 0 to 2.

c Scores summarized by these cell means represent seconds of
leaning toward the observation room window during three 1-min
periods of a lesson presentation. The actual range of scores is
from 0 to 180.


12 It was in the pattern of correlations among measures that
Weitz (1972) found evidence for the repression-of-affect
proposition. The correlations among measures for the various
Race × Awareness subgroups after the awareness manipulation had
been implemented were examined here as well. The
repression-of-affect prediction would be the following: for
some pairs of behaviors, one member controllable and the other
not, there would be a correlation in one direction in the
white-pupil group, reflecting affective congruence of the two
behaviors; for the unaware-black group there would be a weaker
correlation, or a correlation in the opposite direction,
reflecting the combination of dissembling and leakage; and for
the aware-black group, subject to still greater deviation from
the white-condition relationship. Pairs of measures were judged
to represent this pattern if the correlations of the
aware-white and unaware-white groups were not significantly
different from each other, if the aware-black group was
significantly different from the white group, and if the
correlation of the unaware-black group was intermediate and not
significantly different from the pooled correlations of the
aware–black-pupil and aware–white-pupil conditions.

Requirements for this pattern were technically met by three
pairs of variables in the present study, but none of the three
cases provides clear support of the Weitz repression and
leakage idea. The correlation between speech errors and
persistence was negative in the white condition,
r(50) = -.275, weak and positive in the unaware-black group,
r(19) = .184, and positive and slightly stronger in the
black-aware group, r(25) = .278. However, it is not clear that
persistence and speech errors have contradictory affective
implications. Persistence and head nodding were the second pair
of measures that met the pattern: The correlation between them
was weak and positive, r(49) = .157, in the white-pupil
condition, near zero, r(19) = .005, in the black-unaware
condition, and moderate and negative, r(25) = -.422, in the
aware-black condition. The third pair was head nodding and
positive feedback omission: The correlation was near zero,
r(24) = .022, in the white-pupil condition, weak and positive,
r(19) = .116, in the unaware-black condition, and moderate and
positive, r(25) = .440, in the aware-black condition. In each
of the second and third pairs, it is possible to interpret the
pattern of behavior typical of the aware-black group as
affectively incongruent, head nodding being associated with
lack of persistence in the first case and with omission of
positive feedback in the second. However, for both of these
pairs of measures the differences between the correlations for
the white and the unaware-black groups were trivial. And the
Weitz discussion suggests that evidence of the dissembling and
leakage phenomenon should be detectable even in the absence of
conditions such as the awareness manipulation in the present
study, which may heighten it.

In short, evidence from subgroup correlations offered no
stronger support for the repression-of-affect idea than did
interactive effects of pupil race, awareness, and time.
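
Footnote 12 turns on whether subgroup correlations differ significantly, but the article does not say which test was used. One standard choice for two independent correlations is Fisher's r-to-z transformation, sketched below in Python under that assumption. The function name `compare_correlations` is hypothetical, and the group sizes are inferred as n = df + 2 from the reported r(df) values.

```python
import math


def compare_correlations(r1: float, n1: int, r2: float, n2: int):
    """Fisher r-to-z test for two independent correlations.

    Returns (z, two_tailed_p). This is an assumed procedure; the
    article does not state which test it used.
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each group needs more than 3 cases")
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # SE of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-tailed normal p
    return z, p


# Speech errors with persistence: white condition r(50) = -.275 versus
# aware-black condition r(25) = .278, taking n = df + 2 (an assumption).
z, p = compare_correlations(-0.275, 52, 0.278, 27)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```

Under these assumptions the white versus aware-black difference for this pair gives z of about -2.28, which is in line with the footnote's statement that the aware-black group differed significantly from the white group.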

The direction of the Race × Ability effect on forward leaning
is puzzling in light of the vocal-warmth and helpful-slips
effects. It appears that leaning should not be assumed to serve
as an alternate indicator of warmth. The negative correlations
between the two measures, r(93) = -.144, p < .10, for the first
lesson presentation, and r(91) = -.249, p < .05, for the second
lesson presentation, support this conclusion. Future
investigation of the correlates of leaning behavior seems in
order. In the meantime, we can note that the more direct and
interpretable indicator of climate, vocal warmth, along with
inadvertent helpfulness, was most likely to exist among those
in the high-ability–black and low-ability–white conditions.
This finding seems to contradict the Rubovits and Maehr (1973)
conclusion that high-ability–black and low-ability–white groups
are the recipients of the most detrimental teacher behaviors.


13 χ²(1) = 3.48, p < .10. (See explanation in Footnote 10.)

14 Supplementary analyses (see Footnote 9) showed these effects
to be stronger for naive subjects than for suspicious subjects,
except in the case of the helpful-slips measure.


Table 5
Sex × Ability Interactions

                        Pupil sex
Pupil ability       Male      Female

Praise feedback incidence a
  High              67.3       58.3
  Low               56.5       74.2
Helpful slips b
  High               .08        .11
  Low                .17        .02

a Scores summarized by these cell means are percentages of
correct responses followed by explicit praise. The actual range
is from 0 to 100.

b Scores summarized by these cell means represent total number
of times a subject made an apparent slip of the tongue and gave
away the answer while trying to ask a question. The range of
scores is from 0 to 2.


Pupil Sex Effects and the Interactive Effects 
of Sex and Ability 


Main effects of pupil sex are negligible. The only behavior
showing a significant effect of pupil sex was head nodding,
girls receiving more frequent up-and-down head nods (M = 2.66)
than boys (M = 1.62), F(1, 82) = 5.19, p = .026, MSe = 10.1,
d = .503. And even this effect is difficult to interpret. Head


MARYLEE C. TAYLOR 


nodding had been assumed to be an indicator of climate or
warmth, but that assumption is thrown into serious question by
the fact that head nodding did not show substantial
correlations with any of the other kinesic measures or with
ratings of vocal warmth. We thus must say that there were no
readily interpretable effects of pupil sex on teacher behavior.
This finding is congruent with the conclusions of Brophy and
Good (1974) that such pupil-sex effects on teacher behavior as
do exist in classroom situations are the product of actual
behavior differences among male and female pupils.

Interactive effects of pupil sex and ability, like main effects
of pupil sex, are neither numerous nor readily interpretable.
Relevant cell means are presented in Table 5. On the one hand,
the high-ability–female and low-ability–male groups revealed a
lower incidence of praise feedback than did the other two
Sex × Ability groups, F(1, 84) = [...], p = .009,
MSe = 1,194.4, d = .588.¹⁵ However, it was also subjects in the
high-ability–female and low-ability–male groups who were most
likely to make helpful slips of the tongue, F(1, 84) = 5.19,
p = .026, MSe = .084, d = .497.¹⁶ Thus, we see the least
intentional encouragement, in the form of praise after correct
responses, being directed to those groups that receive the
greatest unintentional assistance in the form of helpful slips.


15 Supplementary analyses (see Footnote 9) showed the effect on
helpful slips to be stronger among naive subjects than among
suspicious subjects. The interaction effect on praise feedback
incidence is slightly stronger for suspicious subjects than for
naive subjects, but the effect within the naive group is large
enough to have been significant had the entire group of
subjects remained naive.

16 χ²(1) = 3.48, p < .10. (See explanation in Footnote 10.)


Discussion


The present findings provide strong support for Rosenthal’s
(1973) conclusion that two of the modes in which teachers may
adapt their behavior to reflect differential expectations of
their students are by adjusting the pace of their teaching and
by varying the response opportunities allowed the children.
The pattern of feedback behavior noted here merits more
detailed comment. First, the virtual independence of the
various measures of teacher feedback suggests that it is not
appropriate simply to speak of feedback as being more or less
good, or strong, or clear. Differentiated questions must be
asked about rates of teacher feedback, quality of feedback, and
extensiveness of feedback. The finding that subjects in the
low-ability condition offered longer statements of praise than
did those in the high-ability condition was unexpected;
however, several plausible interpretations suggest themselves.
There may have been a contrast effect at work, a tendency for
subjects to express surprise at the accomplishment of
low-ability pupils by giving elaborate praise. Or, the subjects
may have been taught self-consciously to use praise as a
shaping technique, and perhaps were most diligent in the
application of this technique when addressing a low-ability
pupil. Alternately, focusing on the relative brevity of praise
feedback given to the high-ability group, perhaps subjects who
believed their pupils to be bright functioned under pressure of
demand characteristics dictating that the measure of their
teaching ability was the amount of material covered; cutting
short the positive verbal feedback was one way of speeding
things up. (Correlations between teaching pace and length of
praise feedback are negative and moderately strong.)

Similarly, it is not difficult to find plausible explanations
for the pupil-ability effect on vocal nervousness. Teachers may
see low-ability pupils as less threatening than high-ability
pupils, and thus may approach them with less anxiety.

The absence of pupil-ability effects paralleling those found on
climate indicators in prior research remains unexplained.
Perhaps climate was the aspect of teacher behavior least
amenable to natural expression in the absence of actual
face-to-face interaction. The results here do suggest that the
various kinesic and paralinguistic measures assumed to
represent climate do not necessarily correlate strongly or even
positively with each other, and therefore should not be assumed
to be indicators of the same construct.

It must be remembered that only half of 
the mediation question was addressed in the 
present study. We have examined effects of 
pupil ability on teacher behavior, but behaviors
spotlighted as mediators of teacher expectancy
effects must have their impact on
student performance, as well as show an im- 
pact of pupil-ability label. The teacher behav- 
iors that showed pupil-ability effects here 
probably vary in their influence on pupil per- 
formance. In actual classroom situations, it 
may often be sound pedagogical practice to 
adjust lesson pacing in accordance with pupil 
ability. Thus, the effect of pupil-ability label 
on teaching pace may only be a problem 
where ability assessments are faulty. In con- 
trast, the observed tendency to give children 
with low-ability labels less time to answer 
questions must clearly be dysfunctional for 
those children, the more so because the cor- 
relational analysis showed interruptions of the 
latency period between question and response 
often to take the form of additional questions, 
requiring different responses. Bombardment 
with a series of follow-up questions, before 
the pupil has had the chance to respond to 
the original question, would make successful 
performance difficult for any child; for the 
child who may indeed already have learning 
problems, such teacher behavior may present 
an insurmountable obstacle to successful performance.
Length of praise feedback and vocal
nervousness fall into still a third category, 
showing effects of pupil-ability label but not 
having prima facie positive or negative impli- 
cations for pupil performance. 

Clearly, the results of numerous field and 
lab studies, examining effects of teacher be- 
haviors as well as their causes, must be ac- 
cumulated before we can be satisfied with our 
understanding of the mediation of teacher expectancy
effects.


The Joint Effects of Pupil Race and Sex 


The effects of pupil race on teacher behavior
in the present study were limited. There
does seem to be a tendency for black “pupils”
to be recipients of the less positive teacher behaviors.
Perhaps more interesting, however, is
the observation that on the two indicators that
appeared to show a bias against black pupils
(the positive-feedback-omission and helpful-slips
variables) there were also Race × Sex
interactions, the two effects together implying 
that white male students receive the most 
favorable treatment, black male students, the 
most unfavorable. The parallel between this 
result and recent Census Bureau statistics is 
noteworthy: In 1974, educational attainment 
was highest among white males, lowest among 
black males (with white and black females in- 
termediate) (Farley, 1977). 


The Repressed-Affect Model 


It is encouraging that the awareness ma- 
nipulation, designed to evoke conscious ef- 
forts at problack behavior, did not also create 
detectable leakage of negative affect. The ex- 
planation for the incongruence of the present 
results and those of Weitz (1972) remains to 
be ascertained. 


The Null Results of the Pupil-Sex 
Manipulation 


It is the lack of a pronounced pattern of 
pupil-sex or Sex × Ability effects that is interesting
here. Perhaps Brophy and Good
(1974) are correct in suggesting that classroom
teachers merely respond to behavioral
differences in male and female pupils, differ- 
ences that originate elsewhere. Other possi- 
bilities merit exploration, however. Certain 
kinds of pressure for sex role specialization in- 
tensify as children enter adolescence. It may 
be the case that differential teacher treatment 
of male and female pupils is pronounced, but 
for children older than the phantom 6-year-old 
of this study. It may also be the case that ef- 
fects of pupil sex do exist, even among teach- 
ers of young children, but on dimensions of 
behavior different from those selected for at- 
tention in the present study. For example, 
Dweck (1976) discusses the differential rates 
of praise and criticism teachers direct at male 
and female children for nonintellectual aspects 
of their performance. Dweck believes that experience
with nonspecific feedback affects
pupil sensitivity to positive and negative feedback
on intellectual tasks, and that the sex
difference in receipt of nonspecific feedback
creates a sex difference in response to failure.
This potentially important dimension of
teacher behavior, nonspecific feedback, is one
of those that obviously could not be represented
here, since our teacher subjects had the opportunity
to give feedback specific only to
intellectual tasks.


The Race by Ability Interaction Effect on
Vocal Warmth


One of the most interesting observations of
the present study is that pupil race and ability
interacted to influence teacher affective orientation
in a direction favoring high-ability black
and low-ability white pupil groups. As
noted earlier, this effect is opposite in direction
from that found by Rubovits and Maehr
(1973). Why might teachers be positively disposed
toward high-ability black and low-ability
white pupils? Findings of Datta et al.
(1968) suggest an explanation.

When Datta et al. asked a predominantly
white group of seventh-grade teachers to rate
selected black and white pupils in their
classes on a variety of dimensions, the high-ability
black and low-ability white groups
were rated higher on positive task orientation
and lower on verbal aggression than were low-ability
blacks and high-ability whites.

Intuition suggests that most teachers would
prefer pupils to manifest the orderly and submissive
patterns of behavior represented by
high task orientation and low verbal aggressiveness,
and in this case there is ample empirical
evidence to support the intuitive conclusion.
Feshbach (1971) finds, for example,
that teachers most prefer rigid, conforming,
and orderly pupil descriptions, and least prefer
those that are independent, active, and assertive.
The interaction effect noted in the present
study may, then, reflect teacher anticipation
that high-ability blacks and low-ability whites
will be relatively trouble-free students, and
low-ability blacks and high-ability whites
relatively troublesome.


External Validity and the Need for
Naturalistic Research


The limitations of this phantom pupil design
must be acknowledged. In the present
study we tried to simulate one-to-one interaction
instead of the typical group teaching situation,
and for one of the interactants we depended
on our cover story’s credibility, the
subjects’ imaginations, and a remote-control
light panel. The reader is invited to be as
concerned about the external validity of the
study as we are. On the other hand, as in virtually
all social psychology research, we necessarily
had to accept some limitations in
order to avoid others. A great deal of research
has been done in more naturalistic settings
where, typically, the lack of control precluded
some behavioral measures and meant that extraneous
pupil characteristics were possibly
confounded with those of primary interest. In
order to employ the range of refined measures
represented here, and to be assured that we
could look at clean effects of pupil race, sex,
and ability, a design was selected that leaves
open some questions about generalizability.
If we are to reap maximum value from this
study’s strengths and minimize the ramifications
of its potential weaknesses, the present
results must be considered along with evidence
from more naturalistic classroom studies.


Concluding Comments 


Directions that could profitably be taken in
future research are undoubtedly obvious to
the reader. There is the need for further data
to be gathered in actual classrooms and in
controlled laboratory settings, focusing on
black teachers and males, as well as on white
female teachers. Conceptualization and empirical
work relevant to the dimensionality of
teacher behavior will provide valuable additions
to existing understandings. And, if we
are to talk of the impact that various teacher
orientations may have on pupil performance,
there needs to be much more extensive evidence
collected concerning the effect of teacher
behaviors on pupils.


Attitude Organization and the
Attitude-Behavior Relationship


The validity of a single-component model of attitude was assessed and compared
with a two-component (affective/cognitive) conceptualization of attitude.
Confirmatory factor analysis was employed in a reanalysis of data previously
reported by Fishbein and Ajzen in the Psychological Review. Convergent validity,
in the sense of uniform consistency of responses, was found not to hold
for the single-component model but was achieved for the two-component model.
Further, the two-component model of attitude was found to predict scaled, but
not unscaled, multiple-act behavior criteria; and the predictive validity of the
relationship between the two-component model and single-act behavior criteria
was not supported.


Considerable research attention has been
directed at the attitude-behavior relationship.
Much of this research has yielded disappointingly
weak results (e.g., Berg, 1966; Bray,
1950; Kutner, Wilkins, & Yarrow, 1952; La
Piere, 1934; Nemeth, 1970). Wicker (1969)
reviewed over 30 studies concerned with the
attitude-behavior relationship and concluded
that in most cases attitude is either unrelated
or only slightly related to behavior.

The failure to find consistent, strong relationships
between attitude and behavior has
led to elaboration and clarification of the attitude
construct and to further specification
of its relationship to behavior. It is frequently
argued that attitude is a multidimensional construct
and that simultaneous consideration of
these several dimensions will provide better
predictions of behavior than the consideration
of a single attitudinal dimension or component
alone (Krech, Crutchfield, & Ballachey, 1962;
Norman, 1975). Other researchers argue that
attitude may be appropriately regarded as a
single dimension of affect and that the relationship
between attitude and behavior would
be stronger if both attitudinal predictors and 
behavioral criteria were measured at an 
equivalent level of generality (e.g., Fishbein, 
1967b; Fishbein & Ajzen, 1975). Fishbein 
and his associates maintain that a general at- 
titude should be predictive of a general set of 
behaviors, but it should bear no necessary re- 
lationship to any specific behavior the re- 
searcher might be interested in (Ajzen & Fish- 
bein, 1977; Fishbein, 1973; Fishbein & Ajzen, 
1974).¹

This article addresses both of these issues. 
In doing so, it provides empirical tests of the 
validity of the multicomponent and single- 
component models of attitude. Second, it 
tests the validity of the relationship between 
a general attitude and single-act criteria and 
between a general attitude and multiple-act 
criteria. Finally, it examines the relationship
between attitude and scaled measures of behavior.


¹ Many other factors have been cited to explain
the lack of consistently strong positive relationships
between attitude and behavior. These will be treated
in the discussion section of this article, but for a
fuller treatment see Calder and Ross (1973), Fishbein
and Ajzen (1975), Kelman (1974), and Wicker
(1969).


Attitude Organization


Those who hold a multidimensional view of 
attitude consider it to be a complex construct 
comprised of an affective (i.e., feeling) component,
a cognitive (i.e., knowledge) component,
and a conative (i.e., behavioral) component
(Krech et al., 1962; Smith, 1947).
Rosenberg and Hovland (1960) regard at- 
titude as a predisposition to respond to some 
class of stimuli with cognitive, affective, and 
behavioral responses. They suggest that each 
response class is mediated by a separate com- 
ponent of attitude. Greenwald (1968) main- 
tains that these components may have sepa- 
rate antecedents. He further suggests that 
affect may be formed through classical condi- 
tioning, cognitions through cognitive learning, 
and conations through instrumental learning 
or operant conditioning. 

It has been pointed out, in accordance with 
a multicomponent view of attitude, that peo- 
ple may hold the same attitude as measured 
by one index of attitude and yet hold very 
different attitudes in terms of their standing 
on the other unmeasured components (e.g., 
Rosenberg & Hovland, 1960). It is implied 
that the attitude-behavior relationship will be 
stronger when the components are consistent 
than when they are inconsistent (e.g., Rosen- 
berg, 1968; Rosenberg & Hovland, 1960). 
Failure to find a consistent, direct relation- 
ship between attitude and behavior may be 
due, then, to failure to measure people’s standing
on all three components of attitude and
employ these as simultaneous predictors of
behavior. The three-component view suggests
that cognitive, affective, and conative evaluations
of objects are distinguishable aspects of
attitude and that simultaneous consideration
of all three components should be most predictive
of overt behavior.

The major alternative to the three-com- 
ponent view treats attitude as a single dimen- 
sion of affect for or against an object. In accordance
with this position, Fishbein (1967b)
points out that although definitions of attitude
usually refer to three components, it is
the evaluative or affective component that is
usually measured and treated as attitude. He
argues that all attitude-scaling techniques
have in common the characteristic that they
place individuals on a dimension of affect for 
or against a psychological object. According 
to this view, alternative approaches to the 
measurement of attitude provide alternative 
measures of the same thing (i.e., affect) and 
should yield the same results. Obtained differ- 
ences among alternative attitude instruments 
would be due, therefore, to measurement error
and not the assessment of valid alternative
components.

A third and intermediate position is main- 
tained in the present article. Attitude is con- 
sidered to be a complex construct comprised 
of cognitive and affective components. These 
components are believed to simultaneously account
for behavioral intentions or actual behavior.²

Katz and Stotland (1959) and Rosenberg 
(1968) assert that all true attitudes must 
have both cognitive and affective content al- 
though they need not include a conative com- 
ponent. Rosenberg (1968) notes that with 
the exception of cognitive dissonance, 


The authors and proponents of most of the other 
consistency theories pay ready lip service to the 
definition of attitude as an internally consistent struc- 
ture of affective, cognitive and behavioral com- 
ponents. But, in practice, the last of these com- 
ponents is usually slighted. Behavior (in the sense 
of externally visible, overt action) toward the at- 
titude object is usually relegated to the status of a
dependent variable; implicitly it is assumed that the
person will simply act toward an attitude object in
a manner consistent with his coordinated affective- 
cognitive orientation toward that object. (p. 101) 


The two-component attitude position taken 
here recognizes and is consistent with the fact 
that self-reported behavior and stated inten- 
tions to respond have frequently been treated 
as dependent effects of affective and/or cognitive
variables (e.g., DeFleur & Westie, [year illegible];
Linn, 1965; Nemeth, 1970; Rogers & Thistlethwaite,
1970; Sample & Warland, 1973; Tittle & Hill, 1967).
Intentions seem to be at a lower level of abstraction
(i.e., closer to observable behavior) than cognitions
or affect.


² It is also recognized that behavior may lead to
changes in the affective and cognitive components of
attitude as a result of learning and self-perception or
dissonance effects. However, explicit consideration of
these feedback effects is beyond the scope of this
research.


ATTITUDE-BEHAVIOR RELATIONSHIP


Several attempts have been made to provide
empirical support for a multicomponent con- 
ceptualization of attitude. The results of these 
studies have been equivocal at best. Wood- 
mansee and Cook (1967) factor analyzed 
scales designed to measure attitude toward 
blacks. They failed to obtain unique factors 
that could be interpreted as representing cog- 
nitive, affective, and conative components of 
attitude. Based on judgment procedures, Os- 
trom (1969) and Kothandapani (1971) de- 
veloped Guttman, Likert, Thurstone, and 
Guilford self-rating attitude scales for each 
component of attitude. People then responded 
to cognitive, affective, and conative scales. 
Using Campbell and Fiske’s (1959) multi- 
trait-multimethod matrix approach and 
criteria, both researchers concluded that ob- 
tained correlations among the attitude com- 
ponent scores provided evidence for con- 
vergent and discriminant validity for the 
affective, cognitive, and conative components.
However, in a reanalysis of both data sets, 
Bagozzi (1978) found that discriminant 
validity was obtained for the Ostrom, but 
not the Kothandapani, data. Thus, only the 
Ostrom study provides clear support for a 
multicomponent attitude model. 
An objective of the present article is to
provide empirical tests of the validity of the
two-component and single-component models 
of attitude. This research involves a reanaly- 
sis of data presented by Fishbein and Ajzen 
(1974). In that study, each subject’s attitude 
toward “being religious” was assessed on 
Thurstone, Guttman, Likert, semantic differ- 
ential, and Guilford self-rating scales. In ad- 
dition, subjects provided responses to either 
100 behavioral intention items or 100 self-reported
behaviors. While the approach taken
by Fishbein and Ajzen was consistent with a
unidimensional approach to attitude in that
all five attitude scales were treated as alternative
measures of attitude, it is our contention
that the evaluative dimension of the semantic
differential and the Guilford self-rating scale
represent alternative measures of the affective
component of attitude and that the Guttman,
Likert, and Thurstone scales represent alternative
measures of the cognitive component of
attitude.

Theory and research tend to support this 
distinction. Katz and Stotland (1959) identi- 
fied the affective component with attributions 
of good or bad qualities. Osgood, Suci, and
Tannenbaum (1957) defined attitude as the 
projection of a concept on the evaluative di- 
mension of semantic space. Similarly, various 
other researchers have presented arguments 
and evidence for using the evaluative dimen- 
sion of semantic space as a measure of affect 
(cf. Davidson & Jaccard, 1975; Fishbein & 
Raven, 1962; McGuire, 1969; Norman, 1975). 
In a parallel fashion, the Guilford self-rating 
favorable/unfavorable scale can be used to 
measure affect. Conceptual and empirical sup- 
port for this claim can be found in Ostrom 
(1969), Kothandapani (1971), and Norman 
(1975).

The affective component measures the de- 
gree of emotional attraction toward an atti- 
tude object. The cognitive component, in con- 
trast, accounts for the perceived relationship 
between attitude object and other objects or 
concepts. The cognitive component typically 
covers “beliefs about the object, character- 
istics of the object, and relationships of the 
object with other objects” (Ostrom, 1969, p. 
16). With regard to the specific cognitions 
employed in the present research, Fishbein 
and Ajzen (1974) maintain that “the other
three attitude measures [i.e., the Guttman,
Likert, and Thurstone scale] were standard
religiosity scales based on opinion items” (p.
62). Elsewhere, Fishbein and Ajzen (1975,
p. 12) consider opinions to be cognitions, and 
they distinguish them from affect and cona- 
tion. In their study, Fishbein and Ajzen 
(1974) employed a Likert scale from Bardis 
(1961), a Guttman scale from Faulkner and 
DeJong (1969), and a Thurstone scale from 
Poppleton and Pilkington (1963). Examina- 
tion of these scales shows them to be con- 
sistent with the preceding definitions of the 
cognitive component. For instance, the state- 
ments employed are of the following form: 

1. “Religious faith is merely another name 
for belief which is contrary to reason” (Pop- 
pleton & Pilkington, 1963, p. 34). 

2. “Religious truth is higher than any other
form of trust” (Faulkner & DeJong, 1969, p.
567). 

It is possible to distinguish empirically be- 
tween the single-component and two-com- 
ponent views of attitude using the data of 
Fishbein and Ajzen (1974). To do this, one 
may examine and compare the convergent 
validity of each model. Although the Camp- 
bell and Fiske (1959) criterion of high cor- 
relations among the measures of each com- 
ponent provides one indication of convergent 
validity, the procedure is an arbitrary one, 
and the more rigorous confirmatory factor 
analysis methodology was employed in the 
present study. Among other benefits (cf. 
Bagozzi, 1978), Kenny (1976) notes the fol- 
lowing advantages of confirmatory factor 
analysis: 


The application of confirmatory factor analysis to 
the multitrait-multimethod matrix has a number of 
advantages over the traditional Campbell-Fiske set 
of criteria: (a) Confirmatory factor analysis gives 
estimates of parameters while the Campbell-Fiske 
criteria are only rules of thumb. (b) Significance 
tests are possible with confirmatory factor analysis. 
(c) Given marked differences in the reliability of 
measures, the Campbell-Fiske criteria are misleading
. . . while confirmatory factor analysis takes into
account differential reliability. (p. 248)


To test for convergent validity using con- 
firmatory factor analysis, the null hypotheses 
expressed in Figure 1 must be examined. Panel 
a of Figure 1 shows the null hypothesis for 
the single-component model. Under the single- 
component model, for convergent validity to 
be sustained, it is necessary that the responses 
of subjects to all attitude measures achieve a 
high degree of correspondence. Rather than 
using the criterion of correspondence sug- 
gested by Campbell and Fiske, which is not 
as stringent as confirmatory factor analysis 
and which can be misleading, as noted above, 
the likelihood ratio chi-square goodness-of-fit 
test can be used. Similarly, Panel b of Figure 
1 displays the null hypothesis for the two- 
component model. Under this model, con- 
vergent validity will be obtained when the 
responses of subjects achieve a high degree
of correspondence for (a) the affective component
and (b) the cognitive component and
when the correlations of attitude measures
across the two components are both consistent
in their pattern and significantly lower in
magnitude than the within-component atti- 
tude measure correlations. Again, the likeli- 
hood ratio chi-square goodness-of-fit test can 
be used to test this hypothesis. The formal 
statement of the convergent validity null hy- 
potheses will be made below under Method. 
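
The two conditions just described can be illustrated informally. The following Python sketch compares mean within-component and cross-component correlations for the five scales. It is only a rule-of-thumb check in the spirit of Campbell and Fiske, not the likelihood ratio test itself, and every correlation value below is hypothetical (invented for illustration, not taken from the Fishbein and Ajzen data).

```python
from itertools import combinations

# Assignment of scales to components follows the two-component view in the text.
AFFECTIVE = ["semantic_differential", "guilford"]
COGNITIVE = ["guttman", "likert", "thurstone"]

# Hypothetical correlations among the five attitude scales (illustrative only).
CORRELATIONS = {
    frozenset(["semantic_differential", "guilford"]): 0.80,
    frozenset(["guttman", "likert"]): 0.75,
    frozenset(["guttman", "thurstone"]): 0.70,
    frozenset(["likert", "thurstone"]): 0.80,
    frozenset(["semantic_differential", "guttman"]): 0.55,
    frozenset(["semantic_differential", "likert"]): 0.60,
    frozenset(["semantic_differential", "thurstone"]): 0.50,
    frozenset(["guilford", "guttman"]): 0.65,
    frozenset(["guilford", "likert"]): 0.55,
    frozenset(["guilford", "thurstone"]): 0.60,
}


def mean_correlation(pairs):
    """Average the stored correlation over an iterable of (scale, scale) pairs."""
    values = [CORRELATIONS[frozenset(pair)] for pair in pairs]
    return sum(values) / len(values)


within_affective = mean_correlation(combinations(AFFECTIVE, 2))
within_cognitive = mean_correlation(combinations(COGNITIVE, 2))
cross_component = mean_correlation((a, c) for a in AFFECTIVE for c in COGNITIVE)

# Informal pattern expected under the two-component view:
# cross-component correlations lower than both within-component averages.
pattern_holds = cross_component < min(within_affective, within_cognitive)
print(within_affective, within_cognitive, cross_component, pattern_holds)
```

A pattern of this kind is only suggestive; the formal test described in the text additionally requires a significance test on the fit of the whole model.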

The data of Fishbein and Ajzen (1974) do 
not permit one to distinguish between the 
prediction that behavior is accounted for by 
intentions alone and the prediction that behavior
is accounted for by the simultaneous
consideration of the conative, cognitive, and
affective components of attitude. This is because
behavior and behavioral intention measures
were not both obtained from the same
individuals.³ However, a finding that the cog-
nitive and affective components of attitude 
simultaneously account both for behavioral 
intention and for verbal reports of past be- 
havior would be consistent with the two-com- 
ponent view suggested here. 


Attitude-Behavior Relationship 


It has been argued (e.g., Ajzen & Fishbein, 
1977; Fishbein, 1967a, 1973; Fishbein & 
Ajzen, 1974) that a major reason for the weak 
relationship obtained between attitude and be- 
havior is the lack of correspondence between 
attitudinal predictors and behavior criteria. 
Fishbein (1973) notes that although a rela- 
tionship is frequently assumed between at- 
titude and whatever behavior the researcher 
happens to be interested in, there is no neces- 
sary relationship between a general attitude 
toward an object and the performance of a
specific behavior with respect to that object.
However, a general attitude toward an object
should be predictive of the overall patterns
of behavior engaged in with respect to
that object. In their most complete elaboration
of this view, Ajzen and Fishbein (1977)
maintain that “the strength of an attitude-behavior
relationship depends in large part
on the degree of correspondence between attitudinal
and behavioral entities” (p. 891).
When correspondence is high, the strength of
the relationship between attitude and behavior
should also be high. They suggest that attitudinal
and behavioral entities may be considered
in terms of four different elements: an
action element, a target toward which the action
is directed, a context in which the action
is performed, and a time at which the action
is performed. As a result, an attitude toward
a specific act to be engaged in at a given time
in a specific context should be predictive of
the actual performance of that act in that
time and situation.


³ Fishbein and Ajzen (1974) obtained self-reports
of behavioral intentions from a sample of 63 subjects
and self-reports of past behavior from a separate
sample of 62 subjects. They argue that while both of
these behavioral criteria are verbal responses, self-reports
of past behavior go beyond intentions and provide
meaningful approximations of behavior. More
recently, Weigel and Newman (1976) employed overt
behaviors as criteria, and they obtained results that
are consistent with those obtained by Fishbein and
Ajzen.

However, an attitude toward an object (i.e.,
time, action, and context are unspecified)
should not necessarily predict any particular
behavior logically related to the object. One
would not, therefore, necessarily expect a correlation
between attitude toward an object
and single-act behavior criteria or even multiple-act
behavior criteria unless the content
of the criteria were scaled at comparable levels
of generality and/or the multiple-act criteria
encompassed a wide range of relevant behaviors.
The attitude employed in the present
study was attitude toward “being religious”
and is thus nonspecific with respect to action,
context, and time. Consequently, it may be
hypothesized that attitude will predict scaled
multiple-act criteria but not necessarily single-act
criteria or unscaled multiple-act criteria.

Fishbein and Ajzen (1974) have shown support
for their contention that high correlations
between measures of attitude and behavior
should be expected only under conditions
of high attitude-behavior correspondence.
Average correlations between attitude and
single-act criteria were low, ranging from .121
to .202. When a multiple-act criterion was
constructed by summing responses to 100 behavioral
intention statements or 100 self-reports
of past behavior, the correlations between
attitude and this criterion were high,
ranging from .604 to .749. Similar results were
obtained by Weigel and Newman (1976) in
an assessment of overt behaviors obtained 3 
months after administration of an attitude 
instrument. Fishbein and Ajzen also found 
that correlations between attitude and mul- 
tiple-act criteria tended to be equally high 
and significant when the behavioral criteria 
were scaled in accordance with Guttman, 
Likert, or Thurstone scaling procedures. 
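
The contrast between single-act and summed multiple-act criteria can be sketched with simulated data. The Python example below is a minimal illustration under stated assumptions: the data are randomly generated (not the Fishbein and Ajzen responses), each simulated behavior depends only weakly on attitude, and the names `simulate`, `signal`, and the sample sizes are hypothetical choices made for the sketch.

```python
import math
import random


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_subjects=200, n_behaviors=100, signal=0.3, seed=0):
    """Generate attitude scores and dichotomous behaviors weakly tied to attitude."""
    rng = random.Random(seed)
    attitude = [rng.gauss(0, 1) for _ in range(n_subjects)]
    # Each behavior is performed (1) or not (0); noise dominates the attitude signal.
    behaviors = [
        [1 if signal * a + rng.gauss(0, 1) > 0 else 0 for a in attitude]
        for _ in range(n_behaviors)
    ]
    return attitude, behaviors


attitude, behaviors = simulate()

# Single-act criterion: average of the 100 separate attitude-behavior correlations.
single_act_rs = [pearson_r(attitude, b) for b in behaviors]
mean_single_act_r = sum(single_act_rs) / len(single_act_rs)

# Multiple-act criterion: each subject's sum over all 100 behaviors.
multiple_act = [sum(column) for column in zip(*behaviors)]
multiple_act_r = pearson_r(attitude, multiple_act)

print(round(mean_single_act_r, 3), round(multiple_act_r, 3))
```

In this simulation every behavior is, by construction, related to attitude, so summing simply averages out noise. The text's point about scaling concerns the opposite case, in which many of the 100 items are unrelated to the attitude and add error to the summed criterion.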

It is rather surprising that correlations be- 
tween attitude and scaled multiple-act criteria 
were not stronger than correlations obtained 
between attitude and a simple summative cri- 
terion. As Fishbein and Ajzen (1974) pointed 
out, “Many (if not most) behaviors that are 
performed with respect to or in the presence 
of a given object, cannot be considered valid 
indicants of a person’s attitude towards that 
object” (p. 66). Use of a standard scaling 
procedure “is one way of insuring that the be- 
havioral items selected constitute valid be- 
havioral criteria, that is, that they are in some 
way related to the attitude under considera- 
tion” (Fishbein & Ajzen, 1974, p. 65). Fish- 
bein and Ajzen reported that when scaling was 
employed, most of the 100 items were rejected 
by all three scaling procedures. This would 
suggest that most behaviors (or intentions) 
were unrelated to people’s attitudes toward 
being religious. These behaviors may be per- 
formed for essentially nonattitudinal reasons. 
The inclusion of these behaviors in a multiple- 
act criterion would incorporate considerable 
error in the instrument. We would expect, 
therefore, that a strong relationship would be 
obtained between attitude and behavior only 
when both attitude and the multiple-act cri- 
terion are appropriately scaled (see discussion 
below and under Method). 

To test for the predictive validity of the 
attitude-behavior relationship using confirma- 
tory factor analysis, the null hypotheses of 
Figures 2 and 3 must be investigated. Al- 
though a formal, detailed description of these 
hypotheses will be provided in the Method 
section, for now it should be noted that the 
path diagrams provide a representation of the 
relationship between attitude as a construct 
and the appropriate behavior criterion. The 
construct to use in these hypotheses is the one 
achieving convergent validity. Intuitively, the
null hypotheses entailed by Figures 2 and 3
state that the more favorable the attitude, A,
the greater the number of behaviors, B, performed.
The parameter γ represents the degree
of predictive validity and is analogous to
a beta coefficient in regression. It should be
stressed, however, that the hypotheses in confirmatory
factor analysis are more stringent
than those suggested by, say, correlation or
regression analyses. That is, a failure to reject
the null hypotheses implied by Figures 2 and 3
requires that the relationship γ be statistically
significant and that (a) the intercorrelations 
among the attitude measures of A achieve a 
high degree of correspondence, (b) each mea- 
sure of A relate to the behavioral criteria in 
a statistically significant and sufficiently large 
way, and (c) the pattern of intercorrelations 
between measures of A and B be uniform. 
The likelihood ratio chi-square goodness-of- 
fit test can be used to evaluate this hypothesis. 

Notice that it is possible for each individual 
attitude measure to correlate highly with the 
behavior criterion, yet if the pattern of inter- 
correlations is not consistent, then one would 
reject the null hypothesis. This condition is 
analogous to Campbell and Fiske’s (1959) 
rule of thumb that “the same pattern of trait 
interrelationship be shown in all of the hetero- 
trait triangles of both the monomethod and 
heteromethod blocks” (p. 83). This aspect of 
the null hypotheses is also what is meant, in 
part, by Kenny’s (1976) observation that con- 
firmatory factor analysis “takes into account 
differential reliability” (p. 248). With regard
to the attitude-behavior relationship, it
is hypothesized that the attitude construct 
will not predict the single-act or multiple-act 
behavior criteria even though correlations be- 
tween individual measures of attitude and 
single acts or multiple acts might be signifi- 
cant. Because the correspondence between the 
attitude measures and both the single-act criterion
(i.e., the mean correlation with 100
behaviors) and the multiple-act criterion (i.e.,
the sum over 100 behaviors) is low (in the
sense of being at different levels of specificity
with regard to time, action, and context), one
would not expect the pattern of correlations
between measures of A and B to be uniform.
Consequently, it is hypothesized that the
relationship between attitude and single-act and
multiple-act behavior criteria should be rejected.
In contrast, it is hypothesized that the
relationship between attitude and scaled mul- 
tiple-act criteria will be sustained, because 
the attitude and behavior measures are formed 
at comparable levels of specificity. 

In sum, we expected that a reanalysis of 
the Fishbein and Ajzen (1974) data using 
confirmatory factor analysis would support 
the following hypotheses: 

1. Convergent validity will be obtained 
when the cognitive and affective measures of 
attitude are treated as separate components, 
but convergent validity will not be obtained 
when all five instruments are treated as al- 
ternative measures of the same underlying 
construct. 

2. The attitude construct will not predict 
single-act behavior criteria. 

3. The attitude construct will not predict 
multiple-act behavior criteria. 

4. The attitude construct will predict scaled 
multiple-act behavior criteria. 


Method 
Data Source 


To test the hypotheses, a reanalysis of data re- 
ported in Fishbein and Ajzen (1974) was performed. 
Fishbein and Ajzen had used two samples of re- 
spondents. In the first, 62 male and female under- 
graduates indicated which items from a set of 100 
behaviors they had performed (the self-reported
behaviors sample). The set of behaviors consisted of a
list of 70 actions dealing with religious matters (e.g.,
pray before or after meals, donate money to a
religious institution, etc.) and 30 additional actions in
a refusal format selected from the original 70 (e.g.,
refuse to state a religious preference during
university registration). The second sample was composed
of 63 male and female undergraduates who indicated
which items from the set of 100 behaviors they
would perform (the behavioral intentions sample).

In addition, all subjects completed five scales mea- 
suring attitudes toward religion. A Guilford self-rat- 
ing scale measured attitudes towards being religious 
on an 11-point scale ranging from extremely favorable
to extremely unfavorable. A semantic differential
scale measured evaluations of “being religious” on
five 11-point bipolar scales having the following
endpoints: good-bad, harmful-beneficial, wise-foolish,
pleasant-unpleasant, and sick-healthy. The Guilford
and semantic differential scales are considered
in the present study to tap the affective dimensions
of attitudes. Support for this contention was
provided in the preceding sections of this article.

To measure the cognitive component of attitude
toward religion, three standard religiosity scales were
employed. These included Likert (Bardis, 1961),
Guttman (Faulkner & DeJong, 1969), and Thurstone
(Poppleton & Pilkington, 1963) scales.


The Fishbein and Ajzen (1974) data set provides a
better test of the predictive validity of attitude than
the data of either Kothandapani (1971) or Ostrom
(1969). This is because Kothandapani used a single
dichotomous measure of behavior (i.e., used or did
not use contraceptives) and Ostrom employed eight
measures of religious behaviors but had to omit 44
subjects who completed the attitude scales but failed
to complete all of the behavioral self-reports. The
Fishbein and Ajzen data offer the advantages of 100
measures of behavior or behavioral intentions with
no attrition.

Figure 1. Path diagrams for the single-factor model (Panel a) and the
two-factor model (Panel b) of attitude measures. ($A_{a+c}$ = single-factor
attitude construct; $A_a$ = affective attitude component; $A_c$ = cognitive
attitude component; SR = Guilford self-report, SD = semantic differential,
G = Guttman, L = Likert, and T = Thurstone scales. The remaining symbols
represent parameters to be estimated and are defined in the text.)


Methods of Analysis 


Convergent validity. Before a test can be made of 
hypotheses relating attitude to behavior, it is neces- 
sary to establish the validity of the attitude mea- 
sures. Although a full test of construct validity would 
require the use of the multitrait-multimethod matrix 
design (cf. Ostrom, 1969), this was not possible in 
the present study because the same set of multiple 
measures was not used on each of the hypothesized 
attitudinal components. However, because two measures
of affect and three measures of beliefs were obtained,
it is possible to ascertain the degree of convergent
validity for the attitudinal measures.

Two rival models bear upon the issue of conver- 
gence in the present research. The first is termed the 
single-factor model and hypothesizes that the vari- 
ance in individual responses (except for random 
error in measurement) is accounted for by one 
underlying attitudinal construct. Panel a of Figure 1 
depicts the single-factor model, in which the five
attitudinal scales are shown measuring one overall
attitudinal construct, affective and cognitive (i.e., $A_{a+c}$).
Random error in measurements is indicated by the
$z$s. The single-factor model is the attitude model
implicitly employed by Fishbein and Ajzen (1974), in
that they did not differentiate the five scales into af- 
fective and cognitive dimensions but rather treated 
all five as independent measures of the same under- 
lying “attitude towards religion” construct. Panel a 
of Figure 1 represents a diagram of a hypothesis that 
can be used to test Fishbein and Ajzen’s (1974) as- 
sumption. 

The two-factor model (Figure 1, Panel b) posits
that attitudes are represented as two conceptually
independent, yet empirically related, constructs: (a)
an affective component ($A_a$) and (b) a cognitive or
belief dimension ($A_c$). The empirical correlation
between $A_a$ and $A_c$ is shown as $\phi_{A_a A_c}$.

As a point of interpretation, it should be noted 
that the path diagrams in Figure 1 follow current 
practice in the confirmatory factor analysis literature 
in that they represent null hypotheses to be tested 
on data (cf. Jöreskog, 1969; Kenny, 1976). Although
the application of confirmatory factor analysis does 
not permit one to discriminate between the single- 
and two-factor models per se, it does allow the re- 
searcher the opportunity of examining both the 
single-factor model as a null hypothesis and the two- 
factor model as a null hypothesis. The confirmatory 
factor analysis in the present context permits the re- 
searcher to test whether the one-factor, the two- 
factor, or both the one-factor and two-factor models 
are consistent with the data. In this way, the pro- 
cedures yield an empirical test of the convergent va- 
lidity for each attitude model and can be used to 
identify internally consistent constructs. The test is
important to apply before investigation of the
relationship between attitude and behavior because it is
a matter of logical and empirical necessity that
convergent validity be achieved in order for one to
accept the theoretical and predictive utility of the
attitude model. If one or both of the convergent
validity hypotheses were rejected, then it would not be
legitimate to examine further hypotheses as to the 
predictive validity of the nonconvergent attitudinal 
construct. In the present investigation, it is sug- 
gested that the two-component, but not the single- 
component, model will achieve convergent validity. 

To test the models of Figure 1, the following general
confirmatory factor analysis model can be examined:

$$y = \Lambda x + z, \tag{1}$$

where $y$ is the vector of $p$ attitude measures; $x$ is a
$k < p$ vector of hypothesized attitude components;
$k$ is the number of attitude components; $z$ is a
vector of $p$ unique scores (i.e., errors in measurement);
and $\Lambda$ is a $p \times k$ matrix of factor loadings
relating attitude measurements to their respective
components. For Figure 1, Panel a, $p = 5$ and $k = 1$,
while for Figure 1, Panel b, $p = 5$ and $k = 2$. With
the assumptions that $E(x) = E(z) = 0$ (i.e., measures
are expressed as deviations from their means),
$E(xx') = \Phi$ (where $\Phi$ is the intercorrelation between
attitude components), and $E(zz') = \Psi$ (where $\Psi$ is a
diagonal matrix of error variances), the variance-covariance
matrix of measurements, $\Sigma$, may be expressed as

$$\Sigma = \Lambda \Phi \Lambda' + \Psi. \tag{2}$$

Jöreskog (1969) derives a maximum likelihood procedure
for estimating the parameters in $\Lambda$, $\Phi$, and $\Psi$.
Further, the methodology yields an overall chi-square
goodness-of-fit measure that tests the null hypothesis
specified by Figure 1 and Equations 1 and 2 against
the alternative hypothesis that the variance-covariance
matrix is any positive definite matrix. The computer
program LISREL can be used to test the models
of Figure 1 and Equations 1 and 2 (cf. Jöreskog &
van Thillo, 1972).
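
The covariance structure in Equation 2 can be made concrete with a small
sketch. The Python below is an illustration only, not the LISREL program:
the loadings and factor correlation are hypothetical round numbers (not
estimates from the article), the helper names are invented for this example,
and the discrepancy function is the standard maximum likelihood form
$F = \ln|\Sigma| - \ln|S| + \mathrm{tr}(S\Sigma^{-1}) - p$, which is assumed
here rather than quoted from the text.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def implied_cov(lam, phi, psi_diag):
    """Equation 2: Sigma = Lambda Phi Lambda' + Psi, with Psi diagonal."""
    sigma = matmul(matmul(lam, phi), transpose(lam))
    for i, psi in enumerate(psi_diag):
        sigma[i][i] += psi
    return sigma

def inverse_and_logdet(m):
    """Gauss-Jordan inverse and log-determinant; assumes m is positive definite."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    logdet = 0.0
    for c in range(n):
        pivot = aug[c][c]              # positive for a positive definite matrix
        logdet += math.log(pivot)
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug], logdet

def ml_discrepancy(s, sigma):
    """F = ln|Sigma| - ln|S| + tr(S Sigma^-1) - p; zero when Sigma equals S."""
    p = len(s)
    sigma_inv, logdet_sigma = inverse_and_logdet(sigma)
    _, logdet_s = inverse_and_logdet(s)
    trace = sum(s[i][k] * sigma_inv[k][i] for i in range(p) for k in range(p))
    return logdet_sigma - logdet_s + trace - p

def degrees_of_freedom(p, n_free):
    """Distinct variances and covariances minus free parameters."""
    return p * (p + 1) // 2 - n_free

# Hypothetical standardized two-factor values (NOT estimates from the article).
lam = [[0.80, 0.0], [0.90, 0.0], [0.0, 0.85], [0.0, 0.90], [0.0, 0.85]]
phi = [[1.0, 0.80], [0.80, 1.0]]
psi = [1.0 - max(row) ** 2 for row in lam]   # gives each measure unit variance
sigma = implied_cov(lam, phi, psi)
```

With five measures, a single-factor model has 10 free parameters (five
loadings, five error variances) and a two-factor model has 11 (one added
factor correlation), so `degrees_of_freedom(5, 10)` and
`degrees_of_freedom(5, 11)` give the 5 and 4 degrees of freedom attached to
the chi-square tests of the two models.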

Single-act and multiple-act criteria. To test the 
hypotheses that attitude predicts single-act and multi- 
ple-act behavioral criteria, the path model of Figure 
2 was used. In this figure, A represents the attitude
construct achieving convergent validity; B stands for
the appropriate single-act or multiple-act criterion
measures; and attitude measures (i.e., $y_i$s) are drawn
to indicate that the test applies to the general case.
The single-act behavioral criterion measures are simply
the individual responses to each of the 100
self-reports of behaviors or behavioral intentions. The
multiple-act behavioral criterion measure is the sum
over 100 self-reports of behaviors or behavioral
intentions.

In the present study, it is hypothesized that the
attitude construct will not act as a significant
predictor of either single-act or multiple-act behavior
criteria (see the earlier discussion on attitude-behavior
hypotheses). Looking first at the relation between
attitude and the single-act behavioral criterion, it 
is expected that the relationship will be weak or non- 
significant because the degree of correspondence be- 
tween attitude and behavior is minimal (see earlier 
comments and comments in Discussion section). This 
hypothesis is the same as that posed by Fishbein and 
Ajzen (1974). 

Looking next at the relation between attitude and 
the multiple-act behavioral criterion, however, it is 
hypothesized that the overall predictive model of 
Figure 2 will be rejected despite the presence of high 
absolute correlations between each individual atti- 
tude measure and the multiple-act criterion measure. 
This perhaps counterintuitive hypothesis is contrary 
to that suggested by Fishbein and Ajzen (1974). The 
rationale for this hypothesis is that the high observed 
correlations between each individual attitude measure 
and the multiple-act criterion arise as a consequence 
of the definition and operationalization of the multi- 
ple-act criterion. Because this criterion is defined as 
the sum of occurrences over 100 different behaviors, 
it is probable that each respondent’s score expresses 
a different set of behaviors, and thus the numerical 
value for the multiple-act criterion assigned to each 
person would represent an inconsistent mapping in 
measurement. Two or more people can have identical
scores on the multiple-act criterion yet exhibit vastly
different behaviors as a function of the content of
their responses. As a result, the procedure used to
construct the criterion violates the rule of numerical
assignment theory that requires a 1:1 correspondence
between number and referent. Even though one expects
high correlations between individual measures
of attitude and the criterion, it can be anticipated
that the entire pattern of correlations among attitude
measures and the criterion would not be consistent
with the model hypothesized in Figure 2.
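
The point about inconsistent mapping can be seen in a toy case. The sketch
below uses two hypothetical respondents and six hypothetical behaviors
(the actual criterion sums over 100 items); the function name is invented
for this illustration.

```python
def multiple_act_criterion(responses):
    """Unscaled multiple-act criterion: number of behaviors endorsed (1 = performed)."""
    return sum(responses)

# Hypothetical respondents over six hypothetical behaviors.
respondent_a = [1, 1, 1, 0, 0, 0]
respondent_b = [0, 0, 0, 1, 1, 1]

score_a = multiple_act_criterion(respondent_a)
score_b = multiple_act_criterion(respondent_b)

# Number of behaviors that both respondents performed.
shared = sum(a and b for a, b in zip(respondent_a, respondent_b))
print(score_a, score_b, shared)
```

Both respondents receive the same criterion score although they share no
behavior, which is the sense in which one number can refer to different
behavioral content.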

To test the model of Figure 2, Jöreskog’s (1970)
analysis of covariance structures methodology can
be used in conjunction with LISREL. The appropriate
model to test is

$$B = \gamma A + \zeta \tag{3}$$

$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \\ y_b \end{bmatrix}
=
\begin{bmatrix} \alpha_1 & 0 \\ \alpha_2 & 0 \\ \vdots & \vdots \\ \alpha_n & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} A \\ B \end{bmatrix}
+
\begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \\ 0 \end{bmatrix},
\tag{4}
$$

where $B$ is the appropriate single-act or multiple-act
behavior criterion; $A$ is the attitude component
achieving convergent validity in the previously
described tests; $\gamma$ is a parameter analogous to a
regression coefficient relating attitude to behavior;
$\zeta$ is an error in equation term; $y_1, y_2, \ldots, y_n$ are
attitude measurements; $y_b$ is the actual behavior
criterion measure; the $\alpha$s are parameters relating $A$
to its measures; and the $z$s are random measurement
errors.


Figure 2. Path diagram for testing validity of single-act
and multiple-act criteria models of behavior. (A
= attitude construct; B = behavior criterion. The
remaining symbols represent parameters to be estimated
and are defined in the text.)


Prediction of scaled behavior. To test the
hypothesis that attitude predicts scaled behavioral
measures, the path diagram of Figure 3 may be used.
In contrast to the single-act and multiple-act
behavioral criteria, the scaled behaviors are meaningful
measures and approach the same level of specificity
as the attitudinal scales. For a description of the
scaling procedures used, the reader is referred to Fishbein
and Ajzen (1974). Although Fishbein and Ajzen
hypothesized that attitude would predict scaled
behavioral measures, they tested the hypothesis only by
examining the individual correlations between each
attitude measure and each scaled behavior measure.
A more rigorous and meaningful test of the hypothesis
can be performed by using the general analysis
of covariance structures model, which is a generalization
of confirmatory factor analysis. This method
examines the entire pattern of intercorrelations as a
unit rather than examining only the individual
relationships. Further, it provides a chi-square
goodness-of-fit test, estimates parameters, and takes into
account differential reliability, as noted earlier. The
appropriate equations to test are

$$B_s = \gamma A + \zeta \tag{5}$$

$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \\ y_{n+1} \\ \vdots \\ y_{n+m} \end{bmatrix}
=
\begin{bmatrix} \alpha_1 & 0 \\ \alpha_2 & 0 \\ \vdots & \vdots \\ \alpha_n & 0 \\ 0 & \lambda_1 \\ \vdots & \vdots \\ 0 & \lambda_m \end{bmatrix}
\begin{bmatrix} A \\ B_s \end{bmatrix}
+
\begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \\ e_1 \\ \vdots \\ e_m \end{bmatrix},
\tag{6}
$$

where $B_s$ is the scaled behavioral construct; $A$ is the
attitude component; $y_1, y_2, \ldots, y_n$ are the $n$ measures
of $A$; $y_{n+1}, y_{n+2}, \ldots, y_{n+m}$ are the $m$ measures
of $B_s$; the $\alpha$s and $\lambda$s are parameters relating $A$ and
$B_s$ to their respective measures; the $z$ and $e$ are
random measurement errors; and $\gamma$ and $\zeta$ are defined
as above.


Figure 3. Path diagram for determining the relationship between attitudes and scaled measures of
behavior. ($A$ = attitude construct; $B_s$ = scaled multiple-act behavior criterion. The remaining
symbols represent parameters to be estimated and are defined in the text.)


Results 


Table 1 summarizes the intercorrelations of
variables for both samples. Looking at these
correlations, Fishbein and Ajzen (1974) concluded
that the five verbal attitude scales
showed “a high degree of convergent validity”
and “all attitude scales correlated highly with
the multiple-act criterion, while the prediction
of single-act criteria tended to be low and
nonsignificant” (p. 62). The latter was taken
as evidence that a person’s attitude towards
an object should be related to the overall pattern
of one’s behaviors but not necessarily to
any single behavior that may be performed
with respect to the object. Each of these
hypotheses and the ones relating attitude to
scaled behavior measures are examined below
using the more rigorous confirmatory factor
analysis methodology. An extension of the
attitude model is also investigated.


Table 1
Intercorrelations of Verbal Attitude Scales, Single-Act and Multiple-Act
Behavioral Criteria, and Behavioral Scalesᵃ

Variable                       1      2      3      4      5      6      7      8      9     10
 1. Self-report                –    .800   .519   .652   .584   .137   .640   .451   .582   .701
 2. Semantic differential    .765ᵇ    –    .644   .762   .685   .149   .714   .591   .688   .727
 3. Guttman                  .688   .773     –    .790   .744   .121   .608   .531   .647   .570
 4. Likert                   .743   .837   .878     –    .785   .142   .684   .660   .656   .611
 5. Thurstone                .672   .666   .818   .786     –    .131   .628   .575   .624   .542
 6. Single-act criterionᶜ    .162   .178   .176   .202   .170     –      –    .143   .178   .156
 7. Multiple-act criterionᵈ  .604   .658   .656   .749   .648     –      –    .699   .898   .789
 8. Guttman behavior         .444   .517   .501   .563   .483   .189   .750     –    .776   .631
 9. Likert behavior          .594   .640   .656   .727   .649   .229   .915   .851     –    .792
10. Thurstone behavior       .545   .566   .663   .677   .628   .202   .797   .553   .793     –

ᵃ Above diagonal for respondents indicating self-reported behaviors (n = 62); below diagonal for respondents
indicating behavioral intentions (n = 63).
ᵇ $r_{.05}$ = .250; $r_{.01}$ = .325.
ᶜ Mean correlation with 100 behaviors or intentions.
ᵈ Sum over 100 behaviors or intentions.


Convergent Validity


Single-factor model. Application of the
confirmatory factor analysis model of Figure 1,
Panel a, and LISREL to the data shows that
the hypothesis of convergent validity must be
rejected for both the self-reported behaviors
sample, $\chi^2(5) = 23.91$, $p < .01$, and the
behavioral intentions sample, $\chi^2(5) = 13.98$, $p$ [...].
Thus, contrary to the original claim
made by Fishbein and Ajzen (1974), one cannot
accept the hypothesis that the five attitudinal
scales are measures of a single underlying
attitude construct.
Two-factor model. Given the failure to
achieve convergent validity for the single-factor
model, the two-factor, two-component
model was tested (Figure 1, Panel b). Application
of LISREL to the data indicates that the
hypothesis of convergent validity is supported
for the self-reported behaviors sample, $\chi^2(4)$
= [...], $p \approx .68$, and the behavioral intentions
sample, $\chi^2(4) = 8.30$, $p \approx .08$.⁴ Hence, convergent
validity is established for the two-component
model but not for the single-component
model. Further analysis of the predictive
validity of attitude in the present study
is thus justified only for the two-component
conceptualization of attitude.
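
For readers who wish to check the probability levels attached to the
chi-square values, the following sketch computes the upper-tail probability
of a chi-square variate for integer degrees of freedom using closed-form
series. It is an independent illustration written for this purpose (the
function name `chi2_sf` is invented here) and is not part of the LISREL
output reported in the article.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        total = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus half-integer gamma terms.
    total = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

p_two_factor_intentions = chi2_sf(8.30, 4)    # two-factor model, intentions sample
p_single_factor_reports = chi2_sf(23.91, 5)   # single-factor model, self-reports
```

Evaluating `chi2_sf(8.30, 4)` gives a value close to .08, in line with the
borderline fit noted for the behavioral intentions sample, and
`chi2_sf(23.91, 5)` falls well below .01.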


Single-Act and Multiple-Act Criteria


Single-act criterion. As posited, the hypothesis
maintaining that attitude will be a
significant predictor of single-act criterion
measures was not supported. Looking first at
the self-reported behaviors sample, it can be
seen that, although the pattern of correlations
among attitude and behavioral criterion
measures was consistent with predictions using
either affect, $\chi^2(1) = .02$, $p \approx .89$, or cognitions,⁵
$\chi^2(3) = .05$, $p = .99$, as predictors,
the standardized parameter estimates for
the $\gamma$s, relating attitude to behavior, were
insignificant, $\gamma_a = .157$, ns; $\gamma_c = .148$, ns.
Similarly, looking at the behavioral intentions
sample, it can be observed that, despite finding
a pattern of correlations consistent with
predictions for affect, $\chi^2(1) = .04$, $p \approx .83$,
and cognitions, $\chi^2(3) = .34$, $p \approx .95$, the
standardized parameter estimates, $\gamma$s, were
insignificant, $\gamma_a = .190$, ns; $\gamma_c = .193$, ns. In
sum, it was found that attitude does not act
as a significant predictor of single-act criterion
measures. These results are typical of those
found in the literature relating attitude to
single-act behaviors (cf. Wicker, 1969).
Multiple-act criterion. Although Table 1
indicates that high correlations exist between
each of the five measures of attitude and the
multiple-act behavior criterion, this does not
necessarily imply that the attitude construct
itself will be related validly to the multiple-act
behavior criterion. The prediction of behavior
indicated by Figure 2 requires that the entire
pattern of correlations among all measurements
meet a set of logical and empirical conditions.
Not only must the correlations among
attitude measures be significant and achieve
convergence, but simultaneously the composition
of correlations between each attitude measure
and the criterion must be consistent with
the structure implied by Figure 2. It is
hypothesized, however, that the construction of
the multiple-act criterion will not sustain the
prediction of behavior because the content of
the criterion measure fails to achieve construct
validity and is not at the same level of
specificity as the attitude construct.


⁴ Although the chi-square value for the behavioral
intentions sample indicates a borderline fit (i.e., $p \approx$
.08), when Bartlett’s (1951) small-sample correction
factor is applied, the model reaches acceptable levels
of significance, $\chi^2(4) = 7.65$, $p \approx .10$.

⁵ To test the prediction of behavior hypotheses,
each attitude component was examined as a separate
predictor rather than observing both simultaneously.
This was done in order to avoid what Burt (1976)
terms “interpretational confounding.” Interpretational
confounding is similar in form to the problem of
multicollinearity in multiple regression analysis, and
the condition can affect the precision of the parameter
estimates relating attitude to behavior. Briefly,
the problem arises because of the structure of
intercorrelations among attitude and behavior measures.
Logical considerations dictate that measures of the
same construct should correlate highly among themselves
and that most, if not all, correlations between
measures of different constructs should be lower than
the within-construct correlations (Campbell & Fiske,
1959). However, as shown in Table 1, many of the
cross-construct correlations that include the criterion
measure violate this rule, even though the criterion
correlates highly with all measures of attitudes in an
absolute sense. As a result, to examine the validity of
attitude as a predictor of behavior, it is more meaningful
to examine the predictive ability of the affective
and cognitive components separately.

Application of Jöreskog’s analysis of covariance
structures to the data and the model
of Figure 2 shows that the hypothesis referring
to the prediction of the multiple-act criterion
must be rejected, as maintained. Looking
first at the self-reported behaviors sample,
the hypothesis is rejected for both the model
using affect as a predictor, $\chi^2(1) = 14.05$, $p
< .01$, and the model using cognitions as a
predictor, $\chi^2(3) = 12.89$, $p < .01$. Similarly,
for the behavioral intentions sample, the
hypothesis must be rejected for both the model
using affect as a predictor, $\chi^2(1) = 10.93$, $p
< .01$, and the model using cognitions as a
predictor, $\chi^2(3) = 21.41$, $p < .01$.
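
The consistency requirement behind this rejection can be illustrated with a
back-of-the-envelope calculation. Under the model of Figure 2 with
standardized variables, the correlation between two attitude measures is
$\alpha_i \alpha_j$ and the correlation of measure $i$ with the criterion is
$\alpha_i \gamma$, so every measure should imply the same $\gamma$. The
sketch below applies this to the three cognitive scales and the multiple-act
criterion in the self-reported behaviors sample, using correlations read from
Table 1. It is a simple triad calculation assumed for illustration, not the
maximum likelihood analysis reported above, and it cannot by itself say
whether the spread in the implied values is statistically significant.

```python
import math

# Self-reported behaviors sample (n = 62), correlations read from Table 1.
r_gl, r_gt, r_lt = 0.790, 0.744, 0.785          # Guttman, Likert, Thurstone scales
r_crit = {"G": 0.608, "L": 0.684, "T": 0.628}   # each scale with multiple-act criterion

# One-factor triad solution: r_ij = alpha_i * alpha_j.
alpha = {
    "G": math.sqrt(r_gl * r_gt / r_lt),
    "L": math.sqrt(r_gl * r_lt / r_gt),
    "T": math.sqrt(r_gt * r_lt / r_gl),
}

# Under Figure 2, r(y_i, criterion) = alpha_i * gamma, so each ratio estimates gamma.
gamma_hat = {scale: r_crit[scale] / alpha[scale] for scale in alpha}
spread = max(gamma_hat.values()) - min(gamma_hat.values())
```

If the model held exactly, the three entries of `gamma_hat` would coincide;
the formal chi-square test evaluates whether departures of this kind, taken
over the whole correlation matrix, exceed what sampling error would allow.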


Prediction of Behavior: Scaled 
Behavior Measures 


As posited, the hypothesis that attitude will
predict scaled behavior was supported (see
Figure 3). Looking first at the self-reported
behaviors sample, it can be seen that the
hypothesis⁶ is supported for (a) the model with
affect as a predictor and either Guttman and
Likert scaled behavior as a criterion, $\chi^2(1) =
1.01$, $p = .32$, or Likert and Thurstone scaled
behavior as a criterion, $\chi^2(1) = 2.77$, $p = .10$,
and (b) the model with cognitions as a predictor
and either Guttman and Likert scaled
behavior as a criterion, $\chi^2(4) = 5.38$, $p = .25$,
or Likert and Thurstone scaled behavior as a
criterion, $\chi^2(4) = .85$, $p \approx .93$. Similarly, for
the behavioral intentions sample, the hypothesis
is supported for (a) the model with affect
as a predictor and either Guttman and Likert
scaled behavior as a criterion, $\chi^2(1) = 1.38$,
$p \approx .24$, or Likert and Thurstone scaled
behavior as a criterion, $\chi^2(1) = .14$, $p$ = [...],
and (b) the model with cognitions as a predictor
and either Guttman and Likert scaled
behavior as a criterion, $\chi^2(4) = 6.35$, $p = .17$,
or Likert and Thurstone scaled behavior as a
criterion, $\chi^2(4) = 4.79$, $p = .31$.

To examine the joint effect of affect and
cognitions on scaled behavior, the path model
of Figure 4 can be used.⁷ The structural equations
for this model can be written as

$$
\begin{bmatrix} SR \\ SD \\ G \\ L \\ T \end{bmatrix}
=
\begin{bmatrix} \alpha_1 & 0 \\ \alpha_2 & 0 \\ 0 & \alpha_3 \\ 0 & \alpha_4 \\ 0 & \alpha_5 \end{bmatrix}
\begin{bmatrix} A_a \\ A_c \end{bmatrix}
+
\begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \\ z_5 \end{bmatrix},
\tag{9}
$$

where $B_s$ is the scaled behavior construct;
$A_a$ and $A_c$ are the affective and cognitive components
of attitude, respectively; $L_s$ and $T_s$
are the Likert and Thurstone scaled behavior
measures, respectively; SR = Guilford self-report,
SD = semantic differential, G = Guttman,
L = Likert, T = Thurstone scales; and
the remaining symbols are as defined earlier.


⁶ Predictive validity models are run with the
Guttman/Likert and Likert/Thurstone operationalizations
rather than with a single Guttman/Likert/Thurstone
operationalization of behavior because inspection of
the correlation matrix in Table 1 reveals that the
Guttman and Thurstone measures correlate at a lower
level than the other two pairs of correlations in both
samples. As Burt (1976) notes, such a pattern of
intercorrelations among the measures of the same
construct can result in “interpretational confounding,”
particularly if each measure correlates at a different
degree with a criterion measure. The details of
interpretational confounding are quite mathematically
complex, and for a complete description of the problem,
the reader is referred to Burt (1976). At an intuitive
level, what interpretational confounding means in the
present context is that the covariances of the measures
of behavior with the attitude measures are not
proportional but reflect some unmeasured influence or
excessive measurement error. The condition is present
only when all three measures of behavior are used
or when the Guttman and Thurstone scales are used.
The condition is not present when the Guttman/Likert
and Likert/Thurstone combinations are employed.
Hence, the predictive validity models are
examined with the latter two measures of behavior.

⁷ The joint effect of affect and cognitions on
behavior should be interpreted with caution. The
procedure should be employed only when interpretational
confounding is not a serious problem (Burt,
1976). See Footnote 5.


Figure 4. Testing for the relative impact of affect and cognitions on behavior. ($A_a$ = affective
attitude component; $A_c$ = cognitive attitude component; $B_s$ = scaled multiple-act behavior criterion;
$L_s$ = Likert scaled behavior measure; $T_s$ = Thurstone scaled behavior measure; SR = Guilford
self-report, SD = semantic differential, G = Guttman, L = Likert, and T = Thurstone scales. The
remaining symbols represent parameters to be estimated and are defined in the text.)
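
Read together with the parameters listed in Table 2 ($\gamma_1$, $\gamma_2$,
$\lambda_1$, $\lambda_2$), the Figure 4 model also implies a structural relation
for $B_s$ and a measurement relation for the two scaled behavior measures. The
block below is a hedged sketch of those relations under the same conventions as
Equations 5 and 6; the error symbols $e_1$ and $e_2$ and the layout are
assumptions made for illustration, not a quotation of the published equations.

```latex
% Structural relation: both attitude components predict scaled behavior.
B_s = \gamma_1 A_a + \gamma_2 A_c + \zeta

% Measurement relation for the scaled behavior construct (assumed form).
\begin{bmatrix} L_s \\ T_s \end{bmatrix}
=
\begin{bmatrix} \lambda_1 \\ \lambda_2 \end{bmatrix} B_s
+
\begin{bmatrix} e_1 \\ e_2 \end{bmatrix}
```

Here $\gamma_1$ and $\gamma_2$ carry the relative impact of affect and
cognitions, and $\lambda_1$ and $\lambda_2$ relate $B_s$ to its Likert and
Thurstone measures.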

Application of Jöreskog’s analysis of covariance
structures to the data and model of
Figure 4 yields the results shown in Table 2
where, because of problems of interpretational
confounding, only the findings for the
self-reported behaviors sample are displayed. In
general, the model fits the data well, $\chi^2(10) =
13.11$, $p \approx .22$. Further, the affective component
can be seen to be approximately three
times as forceful in its impact on behavior as
the cognitive component ($\gamma_1 = .651$ and $\gamma_2 =
.226$).
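
The relative impact can be worked through numerically. The sketch below
takes the standardized estimates as read from Table 2 and assumes that the
two attitude components and the behavior construct are standardized to unit
variance (an assumption of this illustration); the variable names are invented
here. It computes the ratio of the two path coefficients, the correlation each
component would then have with scaled behavior, and the proportion of
variance in the behavior construct accounted for jointly.

```python
# Standardized estimates as read from Table 2 (self-reported behaviors sample).
gamma_affect = 0.651      # path from affective component to scaled behavior
gamma_cognition = 0.226   # path from cognitive component to scaled behavior
phi = 0.834               # correlation between the two components

# Ratio of the two path coefficients.
ratio = gamma_affect / gamma_cognition

# Implied correlations with behavior: direct path plus the path through
# the correlated component (assumes unit-variance latent variables).
r_affect_behavior = gamma_affect + gamma_cognition * phi
r_cognition_behavior = gamma_cognition + gamma_affect * phi

# Variance in scaled behavior accounted for by both components together.
explained = (gamma_affect ** 2 + gamma_cognition ** 2
             + 2 * gamma_affect * gamma_cognition * phi)
```

The ratio comes to roughly 2.9, which is the basis for describing the
affective component as approximately three times as forceful. Because the
components are highly correlated, their implied correlations with behavior
differ much less than the path coefficients do.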


Table 2
Goodness-of-Fit Tests and Standardized
Parameter Estimates for a Model Testing
the Relative Impact of Affect and
Cognitions on Behavior

Parameter              Self-reported behavior
$\alpha_1$             .838
$\alpha_2$             .955
$\alpha_3$             .849
$\alpha_4$             .930
$\alpha_5$             .853
$\gamma_1$             .651
$\gamma_2$             .226
$\lambda_1$            .879
$\lambda_2$            .901
$\phi_{A_a A_c}$       .834

Note. Parameters in this table correspond to those
indicated in Figure 4 and are defined further in the
text. Test of model: $\chi^2(10) = 13.11$, $p = .22$.


Discussion
Attitude Organization


The results obtained from a reanalysis of
the Fishbein and Ajzen (1974) data support
a two-component model of attitude and lead
to a rejection of the single-component model.
When Guilford self-rating, semantic differen- 
tial, Guttman, Likert, and Thurstone scales 
were treated as alternative measures of a sin- 
gle underlying construct (i.e., attitude), the 
analysis failed to provide convergent validity. 
When Guilford self-rating and semantic differ- 
ential scales were treated as alternative mea- 
sures of an affective component of attitude, 
and Guttman, Likert, and Thurstone scales 
were treated as alternative measures of a cog- 
nitive component of attitude, convergent 
validity was obtained. The validity of the 
two-component model was supported for the 
self-report and behavioral intention samples. 
Further support for the validity of the two- 
component model is provided by a considera- 
tion of nomological validity. Following Camp- 
bell (1960), nomological validity is defined 
as the degree to which predictions from a 
concept in a theoretical system of concepts 
are confirmed. It was found that each component
of the two-component model separately
predicted scaled multiple-act criteria. Furthermore,
both components contributed simultaneously
to the prediction of behavior. Because
of interpretational confounding, only
self-report data were employed in the latter 
analysis. While both components simultane- 
ously accounted for behavior, the affective 
component was approximately three times as
powerful as the cognitive component. In sum,
then, the results provide support for the nomo- 
logical validity of the two-component attitude 
model. 

These results and those obtained from Ba- 
gozzi’s (1978) reanalysis of Ostrom’s (1969) 
data provide support for a multicomponent 
treatment of attitude. Most attitude research, 
however, is based on measurement of only 
one component of attitude (Fishbein & Ajzen, 
1975). Both in research dealing with attitude 
change and in research considering the atti- 
tude-behavior relationship, conclusions are 
generally made after measuring either the
cognitive or affective component of attitude.
The results obtained from the present research
are consistent with the argument that a complete
accounting of attitude and the prediction
of behavior requires measurement of both
the cognitive and affective components.


Attitude—Behavior Relationship 


The validity of the attitude—behavior rela- 
tionship was assessed separately for single-act 
criteria, scaled multiple-act criteria, and un- 
scaled multiple-act criteria. The hypothesis 
that the attitude construct predicts scaled 
multiple-act criteria received strong support 
for both the behavioral intentions and self- 
reported behavior samples. The relationship 
between the attitude construct and the un- 
scaled multiple-act criterion was found not to 
hold. This finding was true for both the be- 
havioral intention and self-reported behavior 
samples. Finally, as predicted, the relation- 
ship between the attitude construct and single 
act criteria was found not to hold in either 
sample. 

These results provide partial suppor 
Ajzen and Fishbein’s (1977) contention t 
the strength of the attitude-behavior relation” 
ship is directly related to the degree of corre- 
spondence between “attitudinal and behavior 
entities.” According to Ajzen and Fishbein's 
conceptualization, however, correspondence 
should also be high for the multiple-act C" 
terion derived from summation across 
behaviors, because the action, context, 
time elements are left unspecified in both 
attitudinal and behavioral entities. vi 
though correspondence was high in this ses 


t for 
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the summation across 100 behaviors is likely 
to have incorporated considerable error in the 
final criterion score. In an earlier paper, Fish- 
bein (1973) notes that scaling is necessary in 
the construction of a multiple-act criterion to 
eliminate invalid items (i.e., items unrelated 
to favorability for or against the concept un- 
der consideration). The error introduced by 
the inclusion of “invalid items” in the un-
scaled multiple-act criterion is likely to have
led to an inconsistent pattern among the
correlations between attitude and behavior 
even though the magnitude of the relation- 
ship remained quite high. 

Therefore, we would argue, in contrast to 
Ajzen and Fishbein, that correspondence be- 
tween attitude and behavior is only high when 
attitude is related to scaled multiple-act cri-
teria. The attitudinal construct was based on
scaling procedures that ensured that only be- 
liefs and feelings achieving some degree of 
construct validity were used. The unscaled 
multiple-act criterion was not based on such 
a restrictive procedure. Correspondence will 
be high only when the behavioral entity is 
also based on scaling procedures that ensure 
that the criterion achieves convergent valid- 
ity and the content of the attitudinal and be- 
havioral items is at comparable levels of gen- 
erality. This was found only for the scaled 
multiple-act criteria.

Failure to sustain the validity of the rela- 
tionship between attitude and single-act cri- 
teria provides further support to Fishbein 
and Ajzen’s argument that a general attitude 
bears no necessary relationship to single-act 
criteria. This does not alter the frequently ob-
tained finding of significant relationships be-
tween attitude and single acts (e.g., Campbell
et al., 1960; Fishbein & Ajzen, 1974). It
does point out, however, that these relation-
ships cannot be automatically assumed. The
question remains as to how we can determine
which single acts can be expected to bear a
valid relationship to attitude.

Failure to obtain direct relationships be- 
tween general attitude and single acts may be 
due to the fact that frequently little a priori 
thought is given to why a relationship should 
logically be expected between attitude and 
action in a given context and time. Not only
should a rationale exist for hypothesizing a
specific link between attitude and single acts, 
but other nonattitudinal factors should also 
be considered. Normative prescriptions and 
social constraints have frequently been iden- 
tified as factors that moderate the strength of 
the attitude—behavior relationship (e.g., Gross 
& Niman, 1975; Kelman, 1974; Liska, 1974; 
Wicker, 1969).* We may suggest that the at- 
titude-behavior relationship will be strong to 
the extent that the behavior does not involve 
inconsistent social or normative constraints. 
Warner and DeFleur (1969) provide support 
for the contention that the attitude-behavior 
relationship will be stronger when normative 
pressures are consistent with attitude and 
weaker when normative pressures are incon- 
sistent with attitude. Similar results have been 
obtained by Schofield (1975). In contrast to
such public behaviors as appearing in a photo-
graph with a member of another racial group
(cf. Linn, 1965), voting behavior takes place
in the privacy of the voting booth. Strong
relationships have been obtained between at-
titude toward a candidate and the act of vot-
ing for that candidate (e.g., Campbell et al.,
1960). It may be, then, that those behaviors
most directly related to attitude toward re-
ligion are behaviors toward religion that are
either private or in other respects unlikely to
be subject to inconsistent normative pressures.
While it is beyond the purpose of this article
to develop and test the implications of a
priori criteria that will affect predictions of
whether a behavior will be related to atti-
tude, it is suggested that continued considera-
tion of the meaning of single acts in conjunc-
tion with these and other moderating varia-
bles would further our understanding of the
attitude-behavior relationship.

Further study of behaviors unrelated to at-
titude would also be desirable. Most behav-
iors employed in the Fishbein and Ajzen
(1974) study were rejected by all three scal-
ing procedures and were essentially unrelated
to attitude. Nevertheless, performance of these
behaviors may be strongly related to other
psychologically relevant dimensions such as
motivation, attribution processes, or personal-
ity variables. If we are to enhance our under-
standing of and ability to predict behavior,
then it would seem desirable to direct further
attention to psychological dimensions (other
than attitude) accounting for behavior.


* It has also been maintained that other attitudes
and motives are important mediators of the attitude-
behavior relationship and that the behaviors must be
perceived to be relevant to the attitude (i.e., Gross
& Niman, 1975; Insko & Schopler, 1967; Kelman,
1974; Liska, 1974; Wicker, 1969).


Conclusion 


This study was directed at further clarify-
ing attitude organization and attitude-behav-
ior issues. Evidence was obtained to support
an affective-cognitive conceptualization of at-
titude. Convergent validity was obtained for
this two-component model, and it was found
that attitude accounted for self-reported be-
havior and for behavioral intentions. It was
suggested that further research with attitude
should include measures of both components
and treat them as simultaneous predictors of
behavioral criteria.

Strong support was found for the hypothe-
sis that attitude would predict scaled multiple-
act criteria. Criteria based on summation over
a large set of behaviors were found not to be
related systematically to the attitude con-
struct. Furthermore, the hypothesis that atti-
tude would predict single-act criteria was not
supported. These findings provide partial sup-
port for Ajzen and Fishbein’s concept of atti-
and lead to 


likely to result in 

ships. It was speculated that the rela-
tionship would be strong when
elements of the attitude and behavior corre-
spond and when the behavior under considera-
tion is relatively devoid of normative con-
straints.
Special Section on Interpretive Controversies in
Personality and Social Psychology 


Editor’s Foreword 


The convergence of 13 articles on five interpretive controversies in this issue 
of the Journal of Personality and Social Psychology has been a product of co- 


senting controversial exchanges. In preparing the articles, parties to each ex-
change were asked to read each other’s prepublication drafts in order to bring
to the editor’s attention misstatements of each other’s positions. Insofar as
was possible (and because of disagreements between authors, it was not always
possible), apparent misstatements were modified or deleted prior to publica-


peripheral matters. In publishing the articles, a novel format was used to avoid 
an obvious last-word advantage for any author. The author (Author A) of
the first-submitted article in an exchange was not invited to write a rebuttal 


way as to bring in new issues that might legitimate further comment by 
Author B. 


torial Advisor, American Psychological Association, 1200 Seventeenth Street 
N.W., Washington, D.C. 20036.—A.G.G. 
A Reply to Norman H. Anderson’s Critique of the
Subject-Verb-Object Approach to Social Cognition


Harry F. Gollob 


University of Denver 


The Subject—Verb-Object (S-V-O) approach to social cognition provides a 
method of conceptualizing several social cognition problems within the context 
of item sets that contain eight subject-verb-object sentences, such as “The 
kind man avoided the intelligent swindler” and “Conservative voters enjoy 
golf.” Anderson recently criticized the S-V-O model on numerous grounds that 
were primarily technical in nature. The present reply refutes all of Anderson's 
major criticisms. In discussing an alternative to the S-V-O model, Anderson 
recommends that problems raised by the pervasive tendency of subjects to re- 
spond to information configurally be dealt with by collecting data that can be 
described by simple algebraic models. It is argued that Anderson’s alternative 
avoids, rather than deals with, the important issues raised by the configural 
responding of subjects. Although the specific content of the present article is 
primarily concerned with the S-V-O model, much of the discussion is relevant 
to issues of general importance to investigators who use analysis of variance 
models and/or simple algebraic models in their work. A postscript replies to
the article by Anderson that follows this one.


Anderson (1977) recently wrote a detailed
critique of Gollob’s (1974a, 1974b) Subject-
Verb-Object (S-V-O) approach to social cog-
nition as well as of Wyer’s (1975) and Insko,
Songer, and McGarvey’s (1974) applications
of that approach. Wyer and his co-workers
(Wyer, 1974, 1975; Wyer, Henninger, &
Wolfson, 1975) have extended Gollob’s model
and have applied it in a broad variety of areas
concerned with social inference and attribu-
tion theory. Insko et al. applied the S-V-O
model to the analysis of pleasantness judg-
ments in a traditional balance-theory para-
digm.

aureen Gollob for helpful discussions concern-
ing the postscript.
lob, Department of Psychology, University of
Denver, Denver, Colorado 80208.

In his critique, Anderson claims that Gol- 
lob and his colleagues have used analysis of 
variance techniques inappropriately by failing 
to use tests of goodness of fit, by failing to 
recognize that monotonic transformations can 
affect results, by failing to realize that a com- 
plete analysis of variance decomposition can 
perfectly account for any set of data, by fail- 
ing to realize that correlation coefficients can 
be misleading, by failing to recognize that 
people often respond configurally to informa- 
tion in sentences, and by committing various 
other foolish or careless errors. Anderson also 
suggests that all would have been fine if only 
Gollob and his colleagues had used Anderson’s 
preferred methods to guide their data collec- 
tion and analyses. 

The present reply argues that all of An- 
derson’s major criticisms of the S-V-O model 
are either wrong, irrelevant, or misleading. 
Since Anderson’s criticisms focus primarily on
Gollob’s work, this reply emphasizes compari-
sons of Gollob’s (1974a, 1974b; Rossman &
Gollob, 1976) claims and practices with those 
attributed to him by Anderson. Although the 
specific content of the reply is concerned with 
the S-V-O model, much of the discussion is 
relevant to issues of general importance to 
investigators who use analysis of variance 
models or other relatively simple mathemati- 
cal models in their work. 

Anderson’s critique frequently ignores the 
necessity of considering the theoretical and 
psychological meaningfulness of a model in 
evaluating its usefulness. To enable meaning- 
fulness to be given adequate consideration, 
the present article summarizes the major 
characteristics of the S-V-O model before 
giving detailed consideration to Anderson’s 
criticisms. In typical applications of the 
S-V-O approach, considerations of judges’ 
phenomenology play a crucial role, but since 
most of Anderson’s criticisms are primarily 
technical in nature, the following description 
emphasizes technical aspects of the model. 


The S-V-O Approach to Social Cognition


It should be noted at the outset that al- 
though the historical links between balance 
theory and the S-V-O approach to social 
cognition are strong, the similarity between
the two systems is much weaker than Ander-
son implies by his decision to refer to the
S-V-O model as the “balance triad model.”
The S-V-O approach differs from balance 
theory in several ways: The S-V-O approach 
considers six cognitive biases in addition to 
Heiderian balance and proposes a framework 
within which the joint effects of the hypothe- 
sized biases can be investigated; it deals with 
stimuli expressed as subject—verb-object sen- 
tences rather than as triads; it emphasizes 
that the relative importance of cognitive biases 
in an item set will vary greatly depending 
on the specific content of the item set and 
on the type of inference (or other judgment) 
being made; it focuses on the role of infor- 
mational processes in affecting social judg- 
ments; and it makes several theoretical claims 
that are not part of traditional balance theory. 

The S-V-O approach to social cognition 
provides a systematic method of investigating
and conceptualizing several social cognition
problems within the context of subject-verb- 
object sentences such as “Friendly Bill dis- 
likes criminals” or “Intelligent men oppose 
antiobscenity laws.” The substantive examples 
Anderson uses in his critique involve social 
inferences made on the basis of information 
that is highly evaluative in nature. There- 
fore, in order to facilitate discussion, I shall 
summarize only those aspects of the S-V-O 
approach that are relevant when dealing with 
item sets in which social inference judgments 
are obtained, and in which each of the sen- 
tence components is described either by a 
positively evaluated term (+1) or by a nega- 
tively evaluated term (—1). 

Although Anderson considers only verb in- 
ferences, the S-V-O model distinguishes be- 
tween three major types of social inference 
judgment: subject inferences, verb inferences, 
and object inferences. When making verb in- 
ferences a perceiver judges the probability 
that an actor with specified characteristics 
behaves or feels in a particular way toward 
a recipient who also has some specified char- 
acteristics—for example, “How likely is it 
that the kind man hates Mr. B who is a
psychologist?” Examples of subject and ob- 
ject inferences, respectively, are “The man 
hates Mr. B who is a psychologist. How likely 
is it that the man is kind?” and “The kind 
man hates Mr. B. How likely is it that Mr. 
B is a psychologist?” 

To apply the S-V-O approach, item sets 
are constructed in which two sentence sub- 
jects, two verbs, and two objects are facto-
rially combined to yield the eight basic sen-
tence types shown in Table 1. For example, 
the code “+ + —” represents the sentence 
type in which the subject and the verb are 
positively evaluated and the object is nega- 
tively evaluated. As shown in Table 1, each 
of the eight sentence types can be described 
according to the presence or absence of seven 
different cognitive biases—in the case of eval- 
uative content there are three positivity 
biases and four balance biases. The presence 
or absence of each cognitive bias is easily 
determined. Let S’, V’, and O’, respectively,
denote the sign of the sentence subject, verb,
and object. Then, whenever S’, V’, or O’ is
positive, or the product S’V’, S’O’, V’O’, or
S’V’O’ is positive, the specified sentence type
is said to possess the corresponding cognitive
bias. The names of the seven biases correspond
to the names of the seven sources of varia-
tion one would have in a Subject × Verb ×
Object analysis of variance.

I shall now briefly discuss interpretations
of the seven cognitive biases that often seem
appropriate when dealing with evaluative con-
tent that is relevant to interpersonal relation-
ships. S, V, and O positivity are cognitive
biases that should be important in affecting
inferences to the extent that judges tend to
assume that, on the average, an actor, behav-
ior, or object will have positively evaluated
characteristics rather than negatively evalu-
ated characteristics. It is hypothesized that
SV balance will be important in affecting an
inference to the extent that the perceiver ex-
pects the sentence subject and verb to have
the same evaluative sign regardless of the
characteristics of the recipient of the act (e.g.,
perceivers expect good people to do good
things and bad people to do bad things). Simi-
lar lines of reasoning are used to relate SO
balance and VO balance to relevant aspects
of perceivers’ phenomenology. SVO balance
is simply traditional Heiderian balance. For
more detailed discussion of the psychological
significance of the biases in both evaluative
and nonevaluative cases, see Gollob (1974b,
pp. 288-292, 316-319).

In the S-V-O approach, each of the cog-
nitive biases is thought of as providing a psy-
chologically meaningful unit of information
that is combined with other biases to yield
a likelihood judgment. Formally, the most
general version of the model assumes that the
likelihood judgment, $y_i$, associated with the
$i$th sentence type in an item set is given by
the equation

$$y_i = \sum_{j=1}^{7} w_j B_{ij} + \text{constant}, \tag{1}$$

where $B_{ij}$ is equal to +1 or -1, depending
on whether the $j$th cognitive bias is present
or absent in the $i$th sentence in the item set,
and $w_j$ is the weight given to the $j$th cogni-
tive bias for the item set (Gollob, 1974b).
The absolute value of the weight as-
signed to a bias reflects the importance of the
bias in describing the inferences in the item
set.


Table 1
Presence (+1) and Absence (-1) of Cognitive Biases in Each Sentence Type

Sentence       Positivity            Balance
type          S    V    O     SV   SO   VO   SVO

+ + +         1    1    1      1    1    1     1
+ + -         1    1   -1      1   -1   -1    -1
+ - +         1   -1    1     -1    1   -1    -1
+ - -         1   -1   -1     -1   -1    1     1
- + +        -1    1    1     -1   -1    1    -1
- + -        -1    1   -1     -1    1   -1     1
- - +        -1   -1    1      1   -1   -1     1
- - -        -1   -1   -1      1    1    1    -1

Note. S = subject, V = verb, O = object.
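
The coding rule for the seven biases and the form of Equation 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `bias_codes`, `SENTENCE_TYPES`, and `predict`, and the example weights, are introduced here and do not come from the S-V-O literature.

```python
from itertools import product

BIASES = ("S", "V", "O", "SV", "SO", "VO", "SVO")

# The eight sentence types as (S', V', O') signs, in the order
# +++, ++-, +-+, +--, -++, -+-, --+, ---
SENTENCE_TYPES = list(product((1, -1), repeat=3))


def bias_codes(s, v, o):
    """Return +1 (present) or -1 (absent) for each of the seven biases."""
    return {
        "S": s, "V": v, "O": o,
        "SV": s * v, "SO": s * o, "VO": v * o,
        "SVO": s * v * o,
    }


def predict(weights, constant=0.0):
    """Equation 1: y_i = sum_j w_j * B_ij + constant for every sentence type.

    `weights` maps bias names to weights; biases left out get weight zero.
    """
    predictions = {}
    for s, v, o in SENTENCE_TYPES:
        codes = bias_codes(s, v, o)
        predictions[(s, v, o)] = constant + sum(
            weights.get(b, 0.0) * codes[b] for b in BIASES
        )
    return predictions


if __name__ == "__main__":
    # Hypothetical weights, chosen only to show the mechanics.
    for sentence_type, y in predict({"V": 1.0, "SV": 2.0}, constant=5.0).items():
        print(sentence_type, y)
```

Representing each bias as a product of signs makes the link to the Subject × Verb × Object analysis of variance explicit: each bias is one main-effect or interaction contrast.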

When no restrictions are placed on the 
weights, Equation 1 is simply one way of 
writing the standard analysis of variance de-
composition for a 2 × 2 × 2 table. Signifi-
cance tests of corresponding sources of varia-
tion in an analysis of variance are tests of
the significance of the regression weights in 
Equation 1. In order to give the model some 
predictive power, some restrictions must be 
placed on the weights. The major restrictions 
discussed by Gollob (1974b, pp. 292-295) 
are summarized in the following section. 
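
The equivalence between the unrestricted form of Equation 1 and the complete analysis of variance decomposition can be checked numerically. The sketch below assumes the data are the eight cell means of one item set; the function names and the example ratings are hypothetical. Because the seven ±1 bias columns are mutually orthogonal and each sums to zero, every weight is simply the mean of the ratings multiplied by the corresponding bias codes, and the fitted values reproduce the data exactly.

```python
from itertools import product

# (S', V', O') signs in the order +++, ++-, +-+, +--, -++, -+-, --+, ---
TYPES = list(product((1, -1), repeat=3))


def bias_codes(s, v, o):
    """Codes for S, V, O, SV, SO, VO, SVO, in that order."""
    return [s, v, o, s * v, s * o, v * o, s * v * o]


ROWS = [bias_codes(*t) for t in TYPES]


def fit_unrestricted(y):
    """Return (weights, constant) for Equation 1 with no weight restrictions.

    `y` holds eight cell means in the order given by TYPES.
    """
    constant = sum(y) / 8
    weights = [
        sum(row[j] * yi for row, yi in zip(ROWS, y)) / 8 for j in range(7)
    ]
    return weights, constant


def reconstruct(weights, constant):
    """Evaluate Equation 1 for all eight sentence types."""
    return [
        constant + sum(w * b for w, b in zip(weights, row)) for row in ROWS
    ]


if __name__ == "__main__":
    ratings = [7.0, 3.0, 2.0, 6.0, 1.0, 5.0, 4.0, 8.0]  # arbitrary numbers
    w, c = fit_unrestricted(ratings)
    print(reconstruct(w, c))  # matches `ratings` up to rounding error
```

With seven weights plus a constant and only eight cell means, a perfect fit is guaranteed for any data, which is why restrictions on the weights are needed before the model can make testable predictions.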


Weight Restrictions for the S-V-O Model 


Nonnegative weight hypothesis. This hy- 
pothesis states that any cognitive bias useful 
in describing the judgment will be given a 
positive weight. The interpretations given 
above for the biases implicitly assumed that 
this hypothesis will hold when dealing with 
evaluative content.

Negatively accelerated weight hypothesis. 
This hypothesis states that when the abso- 
lute values of the weights in Equation 1 are 
put in rank order from largest to smallest, 
the size of the nonzero weights decreases at 
a slower and slower rate (Gollob, 1974b, p. 
295; Rossman & Gollob, 1976, p. 377). Em- 
pirical results suggest that this negatively ac- 
celerated curve usually is fairly steep, so that 
typically only two or three cognitive biases
in a particular item set are given substantial
weights (Gollob, 1974b; Rossman & Gollob, 
1976; Wyer, 1974). 

Relevant-bias hypothesis. This hypothesis 
states that the three cognitive biases that do 
not involve the sentence component to be in-
ferred will be given zero weights. For example,
in the case of verb inferences the hypothesis
states that S positivity, O positivity, and SO
balance will be given weights of zero, and
that only V positivity and SV, VO, and SVO
balance can be relevant in accounting for
verb inferences. The relevant-bias hypothesis
is expected to apply in most evaluative item
sets in which each of the sentence compo-
nents is described in terms of bipolar oppo-
sites (e.g., friendly-unfriendly, likes-dislikes,
supports-opposes, liberals-conservatives). Use
of such bipolar opposites, however, does not
ensure that the relevant-bias hypothesis will
hold (see Gollob, 1974b, pp. 294, 318-319).

Some intuitive feeling for the psychological 
meaningfulness of the relevant-bias hypothe- 
sis may be gained by considering an example. 
In making a verb inference, values along the 
verb dimension (e.g., helps-harms) often have
implications for the judgment, both when they 
are considered alone and when they are con- 
sidered in conjunction with other sentence 
components. More specifically, to the extent 
that a judge feels that it is generally true 
that people help rather than hurt others, V 
positivity should contribute to his or her in-
ference; to the extent that he or she believes
that the evaluative characteristics of the actor
(i.e., the sentence subject) affect the likeli-
hood of the actor helping another person, SV
balance should contribute; to the extent that 
he or she believes that the evaluative char- 
acteristics of the recipient (i.e., the sentence 
object) affect the likelihood that the recipi- 
ent will be helped, VO balance should con- 
tribute; finally, to the extent that the evalua- 
tive characteristics of both the sentence sub- 
ject and object are jointly considered, SVO 
balance should contribute. 

When the nonnegative weight hypothesis,
the negatively accelerated weight hypothesis,
and the relevant-bias hypothesis all hold, a
different special case of the general S-V-O
model is obtained for subject, verb, and ob-
ject inferences. Gollob (1974b) has shown
that each of these submodels implies that only
24 different orderings of the eight basic sen-
tence types can be obtained in data. Since
there are 40,320 possible orderings of the
eight sentence types, it is clear that these
special cases of the general model have con-
siderable predictive power.

Other weight restrictions. In addition to
the above hypotheses, which have rather
broad applicability to social inference item
sets, it is of course possible to use additional
theoretical and intuitive considerations to
make more specific a priori predictions con-
cerning the relative importance of the biases.
Such considerations depend on the content
of the item sets, the experimental manipula-
tions used, the characteristics of the judges,
and the types of judgments obtained. For ex-
ample, since the phenomenological basis for
the relevant-bias hypothesis is applicable only
to inference judgments, Rossman and Gollob
(1976, pp. 379-380) did not use this hy-
pothesis to make predictions concerning
pleasantness judgments. Instead they used
the nonnegative weight hypothesis and vari-
ous theoretical and empirical considerations
of subjects’ phenomenology to predict that
VO and SVO balance would dominate pleas-
antness judgments and would be given posi-
tive weights. The resulting special case of
the general model was reasonably well sup-
ported by the data. In general, most S-V-O
research to date has involved testing various
special cases of the general model given by
Equation 1.

Having reviewed the main ideas in the
S-V-O formulation, let us now consider the
specific issues Anderson raises in his critique.


Technical Issues 


Most of Anderson’s technical criticisms of
the S-V-O work of Gollob and his colleagues
are on issues concerned with tests of
goodness of fit.


Testing Goodness of Fit 


Can the S-V-O model fit data perfectly?
Anderson (1977) begins his formal critique
of the S-V-O model with the comment: “An
odd difficulty faces the user of the balance
triad [S-V-O] model: It can always fit the
data perfectly. As is well known, the com-
plete analysis of variance model holds by
statistical definition and so is not in general
a model of substantive process” (p. 142).
Proceeding from this base, Anderson repeat-
edly claims that the S-V-O model has no
way of testing goodness of fit and therefore
is defenseless against various types of mea-
surement problems. These claims are wrong
because most S-V-O work by Gollob, Wyer,
and their co-workers has investigated special
cases of the complete analysis of variance
model, for which appropriate goodness-of-fit
tests are available. Incidentally, the meth-
ods of analysis typically advocated by An-
derson also involve testing special cases of
the complete analysis of variance model.
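
The distinction between the complete model and a special case can be illustrated with a short Python sketch. It fits a verb-inference submodel in which, following the relevant-bias hypothesis, the weights for S positivity, O positivity, and SO balance are fixed at zero. The function name, the least-squares fitting choice, and the example data are assumptions made for illustration; the sketch does not enforce the nonnegative or negatively accelerated weight hypotheses, and it is not the testing procedure used in the cited papers.

```python
from itertools import product

# (S', V', O') signs in the order +++, ++-, +-+, +--, -++, -+-, --+, ---
TYPES = list(product((1, -1), repeat=3))

VERB_BIASES = ("V", "SV", "VO", "SVO")  # biases involving the verb


def codes(s, v, o):
    return {
        "S": s, "V": v, "O": o,
        "SV": s * v, "SO": s * o, "VO": v * o,
        "SVO": s * v * o,
    }


def fit_verb_submodel(y):
    """Least-squares fit using only the four verb-relevant biases.

    Returns (weights, predicted values, residual sum of squares).
    The bias columns are orthogonal, so each weight is a simple projection.
    """
    rows = [codes(*t) for t in TYPES]
    constant = sum(y) / 8
    weights = {
        b: sum(r[b] * yi for r, yi in zip(rows, y)) / 8 for b in VERB_BIASES
    }
    predicted = [
        constant + sum(weights[b] * r[b] for b in VERB_BIASES) for r in rows
    ]
    residual_ss = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    return weights, predicted, residual_ss


if __name__ == "__main__":
    # Hypothetical data containing an S positivity effect the submodel excludes.
    data = [5 + 2 * s * v + 1 * v + 1.5 * s for s, v, o in TYPES]
    _, _, rss = fit_verb_submodel(data)
    print(rss)  # nonzero: the special case is not forced to fit perfectly
```

The residual sum of squares is the variation attributable to the excluded biases, so a special case of this kind can fail, and that failure can be tested.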

Gollob (1974b, pp. 293-301) spends eight
pages discussing plausible restrictions that
can be placed on the weights in the complete
S-V-O model. Moreover, most of the em-
pirical results reported in Gollob (1974a,
1974b) were directed toward testing special
cases of the model that resulted when vari-
ous combinations of the nonnegative weight,
negatively accelerated weight, and relevant-
bias hypotheses were used to restrict the
parameters of the general model. Other in-
tuitive and theoretical considerations con-
cerning subjects’ phenomenology also have
been used to make a priori predictions con-
cerning the relative importance of biases in
different item sets and under different ex-
perimental manipulations (e.g., Rossman &
Gollob, 1976; Wyer, 1974, 1975; Wyer,
Henninger, & Wolfson, 1975). Each different
set of predictions yielded a different special
case of the model, which was not forced to
fit the data perfectly. Since Anderson (1977,
p. 146) explicitly acknowledges that special
cases of the complete model can readily be
tested, it is clearly misleading for him to
ignore the fact that it is precisely such special
cases of the complete model that have been
the focus of most S-V-O research done to
date.

In addition to making the misleading
claim that the S-V-O model can always fit
data perfectly, Anderson misrepresents the
specific methods that Gollob and his col-
leagues actually have used to assess the
goodness of fit of the S-V-O model.

Specific tests of goodness of fit used in
S-V-O research. In this section I shall
briefly describe several published analyses 
that were specifically designed to assess the 
goodness of fit of various special cases of 
the complete S-V-O model. In discussing 
the verb inference submodel, Gollob (1974a, 
pp. 162-163, 165, 167-168; 1974b, p. 304)
presented several figures that compared ob- 
tained mean ratings with the best fitting pre- 
dicted order of the mean ratings. These fig- 
ures showed all occurrences of errors in the 
predicted order and indicated whether or 
not each error was statistically significant. 
This enabled discrepancies from prediction 
to be examined in detail. These are precisely 
the kinds of goodness-of-fit tests that An-
derson (1977, pp. 145-146) claims are miss-
ing from evaluations of the S-V-O model.

Other methods of evaluating goodness of
fit also have been used in S-V-O research.
Gollob (1974a, pp. 170-171) and Rossman 
and Gollob (1976, pp. 382-385) presented 
clearly labeled sections that discussed several 
analyses explicitly directed toward evaluating 
the degree to which various hypotheses con- 
cerning the general S-V-O model held; 
failures of the hypotheses to hold would 
help to identify specific aspects of lack of 
fit of the associated special cases of the 
model. In particular, both of these papers 
reported indices of the magnitude and/or 
statistical significance of the data’s discrep- 
ancies from the patterns predicted by the 
relevant-bias hypothesis, the nonnegative 
weight hypothesis, and the negatively ac-
celerated weight (or a closely related) hy-
pothesis. Wyer (1975) and Wyer, Henninger,
and Wolfson (1975) also have used appro- 
priate methods to test the statistical sig- 
nificance of sources of variation attributable 
to biases that they predicted would be un- 
important in accounting for their data. 

Although there are some exceptions (cf. 
Gollob, 1974b, pp. 294, 318-319), the over- 
all results of the above tests of goodness of 
fit provide strong support for various special 
cases of the model that were investigated. 
The following section rebuts Anderson’s
(1977) criticisms of Gollob’s use of correla-
tionlike indices.


Table 2
S-V-O Weights for Anderson’s Item Sets A and C

             Item Set A           Item Set C
           (helps/harms)        (helps/hinders)

Bias       Weight     %         Weight     %

S           [?]       4          -.03      0
V          1.30       9           .12      0
O           .39       1           [?]      0
SV         3.70      71          2.50     59
SO          .19       3           .07      0
VO         1.09       6           .78      6
SVO        1.05       6          1.94     35

Note. S = subject; V = verb; O = object. The
weights were taken from Anderson’s (1977, p. 149)
Table 2. Figures in the columns labeled “%” give
the percentage of variation in the item set that is
accounted for by each bias. In both item sets the
levels for the sentence subject are kind/cruel and the
levels for the object are physicians/criminals.

Appropriate uses of correlationlike indices 
in S-V-O research. Anderson belabors the 
well-known fact that correlations between 
model predictions and data can be extremely 
high even when the model is seriously in error. 
He then, without presenting specific examples 
or any qualifying conditions, uses this in- 
nocuous premise as the basis for various 
criticisms concerning Gollob’s use of corre- 
lation coefficients, percentages of variation, 
and other correlationlike indices. Anderson’s 
criticisms include such sweeping statements 
as “Clearly, correlations can be extremely 
misleading in model analysis, for they ob- 
scure and misrepresent the facts... . The 
correlation coefficient and similar statistics 
are not valid as tests of fit... . The not 
infrequent practice of comparing several 
models in terms of correlations simply com- 
pounds the problem” (1977, pp. 145-146). 
So strong is his damnation of correlations 
that Anderson does not give even a hint that 
correlations have appropriate, as well as in- 
appropriate, uses in model analysis. 

We will now use the results of S-V-O 
analyses presented by Anderson (1977, p. 
149) to illustrate the major uses that Gollob 
(1974a, 1974b; Rossman & Gollob, 1976) 
has made of correlationlike indices in work-
ing with the S-V-O model. The data are
presented in Table 2, but the descriptions of
the specific item content are deferred to a
point in the discussion where they are more
relevant. For convenience, the following dis-
cussion assumes that the percentages being
compared are significantly different from
each other.

In one or more of the papers cited above,
Gollob has made statements analogous to
the following: (a) In Anderson’s Item Sets
A and C, respectively, SVO balance accounts
for 6% and 35% of the item-set variation;
therefore SVO balance is relatively more
important in accounting for variation in Item
Set C than in Item Set A. (b) In Anderson’s
Item Set A, SV balance and SVO balance,
respectively, account for 71% and 6% of
the item-set variation; therefore SV balance
is relatively more important than SVO bal-
ance in accounting for variation in Item Set
A. (c) The S-V-O verb inference submodel
accounts for 92% of the variation in An-
derson’s Item Set A and for 98% of the
variation in his Item Set C; therefore the
verb inference submodel is more effective in
accounting for variation in Item Set C than
in Item Set A. These are appropriate uses
of percentage of variation indices; they pro-
vide useful summary descriptions of various
aspects of the data.
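
For readers who want to see how such percentages relate to the weights in Equation 1, the following Python sketch may help. It assumes the percentages are computed over the eight cell means of an item set; under the ±1 coding each bias then contributes a sum of squares of 8 times its squared weight, so its share of the variation is its squared weight divided by the sum of all squared weights. The function names and the weights in the example are hypothetical, and this is not presented as the exact procedure used in the cited papers.

```python
def percent_of_variation(weights):
    """Share of item-set variation per bias, given Equation 1 weights.

    `weights` maps bias names to weights; the share for bias j is
    100 * w_j**2 / sum_k w_k**2.
    """
    total = sum(w * w for w in weights.values())
    return {bias: 100.0 * w * w / total for bias, w in weights.items()}


def submodel_share(weights, biases):
    """Percentage of variation accounted for by a subset of biases."""
    shares = percent_of_variation(weights)
    return sum(shares[b] for b in biases)


if __name__ == "__main__":
    # Hypothetical weights, for illustration only.
    w = {"S": 0.5, "V": 1.0, "O": 0.5, "SV": 3.0, "SO": 0.5, "VO": 1.0, "SVO": 1.0}
    print(percent_of_variation(w))
    print(submodel_share(w, ("V", "SV", "VO", "SVO")))  # verb inference submodel
```

Summing the shares of the four verb-relevant biases gives the kind of figure quoted in statement (c) for the verb inference submodel.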

Finally, Anderson (1977) asserts that the
common practice of comparing several models
in terms of correlations or percentages of
variation is not a valid procedure. For a
counterexample to Anderson’s assertion, con-
sider the fact that the verb inference sub-
model and the object inference submodel
account for 92% and 6%, respectively, of
the variation in Anderson’s Item Set A. As-
suming the verb inference submodel is at
least as psychologically meaningful as the
object inference submodel, surely these results
provide legitimate grounds for concluding
that the verb inference submodel provides
a better account of the data than does its
competitor.

Goodness of fit is not enough. Although
no evaluation of a model can be complete
without considering the results of detailed
tests of goodness of fit, more is needed. In
general, it is necessary to evaluate not only
discrepancies from the fit of the model, but
also to evaluate the fit itself. Knowledge that 
there are no reliable or important deviations 
from a model’s fit does not imply that the 
model is useful in accounting for the data. 
One typically also wants to know how well 
the model accounts for the data and whether 
it accounts for a statistically significant 
amount of variation. Moreover, contrary to 
the flavor of Anderson’s remarks, it is not 
enough to consider only statistical tests in 
evaluating the fit of a model. It is also neces- 
sary to consider the psychological meaning- 
fulness of the values obtained for the model’s 
parameters. Work using the S-V-O approach 
has been encouraging in this regard. 

Summary. In summary, all of Anderson’s 
claims concerning methods Gollob has used 
to test goodness of fit can be refuted. In 
particular, Gollob’s S-V-O articles have fo- 
cused on studying special cases of the model, 
which need not fit the data perfectly; ap- 
propriate techniques have been used to test 
the goodness of fit of these special cases; 
and correlationlike indices have been used 
appropriately to provide useful summaries 
of various aspects of the model’s relation 
to data. Let us now discuss scaling issues— 
the second major category of technical issues 
that Anderson raises. 


The Importance of Using Appropriate Scales 


The need for interval response scales. 
Anderson uses a set of fictitious S-V-O data 
to illustrate that a perfect additive fit can 
be destroyed by moving a single point, and 
that when this is done in a 2 × 2 × 2 design
all four interactions, which were initially
zero, are changed to have the same nonzero
value. He correctly points out that in this
instance it would make little sense to inter-
pret the four interactions as indicating that
four distinct balance tendencies were operat-
ing. He makes the traditional suggestion
that it would be more parsimonious to inter-
pret the interactions as resulting from an
aberrant response, and in this example the
aberrant response could well be the result of
a floor effect. Anderson’s (1977, p. 143)
general argument is that scaling difficulties
can cause observed interactions to contain
undecipherable confoundings, and that the
S-V-O model has no way to guard against 
such confoundings because it lacks a test of 
goodness of fit. Let us now examine some of 
the details of his argument. 

It is argued above that in spite of An- 
derson’s claim to the contrary, the S-V-O 
model does have tests of goodness of fit. For 
example, a simple test shows that the ficti- 
tious verb inference data that Anderson 
(1977, pp. 142-143) uses to illustrate a floor 
effect cannot be fit adequately by the S-V-O 
submodel for verb inferences. In this appli- 
cation the relevant-bias hypothesis predicts 
that the S, O, and SO sources of variation 
will be unimportant. In fact, however, these 
sources account for over 70% of the varia- 
tion in Anderson’s example, and the sub- 
model obviously fails to meet the goodness- 
of-fit test. Consequently, the S-V-O model 
would not be expected to be helpful in de- 
ciphering the meaning of the observed inter- 
actions; other interpretations would be 
sought—including, of course, interpretations 
that allow for the possibility of scaling arti-
facts.¹

Anderson is correct in pointing out that 
rating scales are subject to biases which can 
make it risky to give substantive interpreta- 
tions to main effects and/or interactions. 
Fortunately, he also is correct when he later 
points out that “relatively modest experi- 
mental precautions, such as the use of stimu- 
lus end anchors, can largely eliminate these 
biases in many experimental situations” 
(1977, p. 144). Gollob and Wyer agree and 
have consistently used such precautions in 
their research.

Although a floor effect or other artifact
can parsimoniously account for Anderson’s
single set of fictitious data, it seems highly
unlikely that scaling artifacts can explain the
results that have been observed in the many
S-V-O item sets studied by Gollob, Wyer,
and their co-workers. In fact, Anderson
makes no effort to apply “scaling artifact”
interpretations to even a single set of actual
data.


¹ Examples of content-based interpretations of
some verb inference item sets in which the relevant-
bias hypothesis was severely disconfirmed were given
by Gollob (1974b, pp. 318-319). These examples
involved item sets whose sentence subjects and ob-
jects were nonevaluative. Such severe disconfirma-
tions of the relevant-bias hypothesis have rarely been
observed in item sets whose sentence components
are bipolar and primarily evaluative in content (but
for a probable exception see Gollob, 1974b, p. 294).


In summary, it is agreed that measurement 
issues must be considered when any model is 
used to help understand psychological data. 
By using appropriate experimental precau- 
tions to guard against scale artifacts, by 
treating rating-scale data as though they are 
interval data, and by assessing the goodness 
of fit of special cases of the model, Gollob, 
Wyer, and their co-workers have repeatedly 
obtained psychologically meaningful results 
within the S-V-O framework. 

The need for ratio stimulus scales. In 
Anderson’s section on “The Need for Ratio 
Stimulus Scales,” he points out that the areas 
of a set of rectangles can be represented (a) 
as the product of Base × Height or (b) by
an analysis of variance decomposition that 
expresses the area of each rectangle as the 
sum of a grand mean, two main effects, and 
an interaction. He also points out that an 
analogous situation exists for the volume of 
rectangular solids. Anderson argues that the 
analysis of variance decomposition, if taken 
literally, would imply that the area and/or 
volume is the sum of several independent 
processes, and that the main effects and in- 
teractions in the analysis of variance de- 
composition are terms “whose meaning is 
essentially artificial. . . . These examples, of 
course, depend on the assumption that the 
true integration rule is multiplicative in the 
two or three variables. . . . Thus, the prob- 
lem of stimulus zeros seems to be relevant 
to at least some applications of the balance 
triad model” (1977, p. 145). 

Anderson seems casual about assuming 
which of two mathematically equivalent 
models describes “the true integration rule.” 
Rather than choosing between two mathe- 
matically equivalent models on the basis of 
an assumption that one knows “the true
integration rule,” it probably is more rea- 
sonable to ask which of the two models is 
more helpful to use in thinking about the 
problems of interest. In the case of Ander-
son’s rectangles, it is obvious that the multi-
plicative model is more helpful. In the case
of S-V-O item sets, on the other hand, the
multiplicative model discussed by Anderson
rarely provides a reasonable alternative to
the S-V-O model. For example, using con-
joint-measurement techniques (Krantz &
Tversky, 1971, pp. 160-163), it can be
shown that even if one allows arbitrary
monotonic transformations of the response
scales, it is impossible for a multiplicative
model to account for the data in even one of
the six S-V-O item sets Anderson presents
in his Table 1 (1977, p. 148).² More im-
portant, except when grossly implausible re-
strictions are met, the three-way multiplica-
tive model and the relevant-bias hypothesis
are inconsistent with each other. For ex-
ample, in the case of verb inferences, the
relevant-bias hypothesis implies that S, O,
and SO sources of variation are all equal to
zero. Except in trivial cases, the resulting
S-V-O submodel for verb inferences and the
three-way multiplicative model can both be
correct if, and only if, $w_{SVO} \neq 0$ and
$w_V w_{SVO} = w_{SV} w_{VO}$.³ These conditions
imply that SVO
balance can never have a weight of zero.
Moreover, if SV balance and/or VO balance
have a weight of zero, then V positivity must
also have a weight of zero, and vice versa.
Since these highly restrictive and psychologi-
cally vacuous conditions are rarely met, the
multiplicative model usually cannot account
for the data obtained in the many studies
that have found strong support for the rele-
vant-bias hypothesis (Gollob, 1974a, 1974b;
Rossman & Gollob, 1976; Wyer, 1974, 1975;
Wyer, Henninger, & Wolfson, 1975). More-
over, since we seek parsimonious understand-
ing of many sets of data—not merely of iso-
lated cases—the S-V-O representation usu-
ally would be preferable even in those rare
cases where the multiplicative model could
adequately describe the data.


² In order for a multiplicative rule to fit the data
in an S-V-O item set, it is necessary that the order-
ing for the four combinations of S and V either be
the same or be exactly reversed for the two levels
of O. Similar relations must hold for the ordering
of the four combinations of S and O at each level
of V, and for the four combinations of V and O
at each level of S. None of Anderson’s item sets
satisfies these necessary conditions; many of the fail-
ures are substantial and seem unlikely to be due to
error in the data.

³ This footnote shows that the claim made in the
text is correct. If the multiplicative model holds,
we have

\[
y_{ijk} = S_i V_j O_k + \text{constant},
\]

where $y_{ijk}$ is the likelihood judgment of the $ijk$th
sentence in the item set, and where $i$, $j$, and $k$ each
are either positive or negative. Thus, $y_{+-+}$ denotes
the sentence with a positive subject, negative verb,
and positive object. $S_+$ and $S_-$, respectively, refer to
the scale values for the positive and negative sen-
tence subject, and $V_+$, $V_-$, $O_+$, and $O_-$ are analogously
defined. In order to avoid trivial cases of the multi-
plicative model we assume that $S_+ \neq S_-$, $V_+ \neq V_-$,
and $O_+ \neq O_-$, that is, we assume that the positive
and negative scale values for any given sentence
component are not equal to each other.

The weight for an S-V-O bias is equal to half the
difference between the mean of the four sentences
that have the bias and the mean of the four sen-
tences that do not have the bias. Thus, when the
multiplicative model holds, the S positivity weight
can be expressed as

\[
w_S = (y_{+\cdot\cdot} - y_{-\cdot\cdot})/2
    = (S_+ V_\cdot O_\cdot - S_- V_\cdot O_\cdot)/2
    = V_\cdot O_\cdot (S_+ - S_-)/2,
\]

where a dot replacing a subscript indicates that a
mean has been computed over the levels of the re-
placed subscript. When similar calculations are done
for $w_O$ and $w_{SO}$, we obtain

\[
w_O = S_\cdot V_\cdot (O_+ - O_-)/2
\]
\[
w_{SO} = V_\cdot (S_+ - S_-)(O_+ - O_-)/4.
\]

Since $S_+ \neq S_-$ and $O_+ \neq O_-$, we see that the verb in-
ference submodel that assumes $w_S = w_O = w_{SO} = 0$
can hold if, and only if, $V_\cdot = 0$. Proceeding as we
did for $w_S$, and using the fact that $V_\cdot = 0$ implies
that $V_- = -V_+$, we obtain the following expressions
for the weights for V positivity, SV, VO, and SVO
balance:

\[
w_V = S_\cdot O_\cdot V_+
\]
\[
w_{SV} = O_\cdot V_+ (S_+ - S_-)/2
\]
\[
w_{VO} = S_\cdot V_+ (O_+ - O_-)/2
\]
\[
w_{SVO} = V_+ (S_+ - S_-)(O_+ - O_-)/4.
\]

Since $V_+ = -V_-$ and $V_+ \neq V_-$, we see that $V_+ \neq 0$,
and therefore the equation for $w_{SVO}$ implies that
the weight for SVO balance can never be zero.
It is easy to verify that the above equations for the
weights also imply that $w_V w_{SVO} = w_{SV} w_{VO}$. Thus, if
these conditions are not met, then the multiplicative
model and the S-V-O submodel for verb inferences
cannot both be correct. Similar conclusions hold for
the subject inference and object inference submodels.
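
The algebra in Footnote 3 can be checked numerically. The following Python sketch is offered only as an illustration: the scale values and the constant are hypothetical, chosen so that the mean verb value is zero, and the helper name `weight` is introduced for this example. It builds the eight judgments implied by the multiplicative model and computes each weight as half the difference between the mean of the sentences that have the bias and the mean of those that do not.

```python
from itertools import product

# Hypothetical scale values, chosen so that V+ = -V- (mean verb value is zero).
S = {+1: 3.0, -1: 1.0}
V = {+1: 2.0, -1: -2.0}
O = {+1: 4.0, -1: 1.0}
CONSTANT = 10.0

# Judgments under the multiplicative model y_ijk = S_i * V_j * O_k + constant.
y = {(i, j, k): S[i] * V[j] * O[k] + CONSTANT
     for i, j, k in product((+1, -1), repeat=3)}

def weight(cells, positions):
    """Half the difference between the mean of the sentences that have the
    bias and the mean of those that do not (a +1/-1 contrast)."""
    total = 0.0
    for signs, value in cells.items():
        coefficient = 1
        for p in positions:
            coefficient *= signs[p]
        total += coefficient * value
    return total / len(cells)

# Positions 0, 1, 2 refer to the subject, verb, and object signs.
w_S, w_V, w_O = weight(y, (0,)), weight(y, (1,)), weight(y, (2,))
w_SV, w_SO, w_VO = weight(y, (0, 1)), weight(y, (0, 2)), weight(y, (1, 2))
w_SVO = weight(y, (0, 1, 2))

print("S, O, SO weights:", w_S, w_O, w_SO)
print("V, SV, VO, SVO weights:", w_V, w_SV, w_VO, w_SVO)
print("w_V * w_SVO equals w_SV * w_VO:", w_V * w_SVO == w_SV * w_VO)
```

With these particular values the S, O, and SO weights vanish, the SVO weight is nonzero, and the product condition holds, in line with the closed-form expressions in the footnote.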
Summary


Anderson, of course, is correct in asserting 
that measurement issues and tests of good- 
ness of fit must be considered when evaluat- 
ing models that are intended to help us 
understand psychological data. The above
discussion demonstrates that contrary to the 
charges Anderson makes in his critique, Gol- 
lob and his colleagues have dealt with these 
matters in careful, effective, and responsible 
ways. The technical portion of Anderson’s 
critique, however, ignores the fact that no 
matter how sound one’s measurement meth- 
ods or how well one’s model fits some data, 
it is necessary to consider the question of 
whether the model makes sense. Surely an 
investigator oriented toward using models 
that make good psychological sense would 
not seriously suggest using an analysis of 
variance decomposition to analyze areas of 
rectangles! For a model to be most useful, 
the values obtained for its parameters should 
have sensible interpretations in terms of our 
best understanding of the psychological pro- 
cesses involved. In the case of the S-V-O 
model, it has been found repeatedly that sub- 
stantive interpretations of the weights given 
to the hypothesized cognitive biases make 
good psychological sense. At this point it is 
natural to consider Anderson’s criticisms con- 
cerning substantive issues. 


Substantive Issues 


Configurality 


Anderson’s position concerning ways in 
which the S-V-O model is affected by con- 
figural response patterns is based on a mis- 
understanding of how weights in the S-V-O 
model should be interpreted. For this reason, 
I shall first discuss correct methods of inter- 
preting the weights in S-V-O analyses, and 
shall then discuss ways that the S-V-O 
model deals with configural response pat-
terns.

Interpretation of weights in S-V-O anal-
yses. Anderson considers the possibility that
the weights in an S-V-O analysis can be
interpreted as estimates of the weights of
biases in individual sentences. He concludes
that this interpretation makes no sense. Gol-
lob and Wyer agree, and they neither made
nor intended this interpretation. Although 
the S-V-O model assumes that the judgment 
of a given sentence is a linear combination 
of the various psychological biases, the 
S-V-O method of analysis does not yield 
direct estimates of the weights for that linear 
combination. Gollob (1974b, p. 292) and 
Wyer (1974) emphasize that the weights 
obtained in an S-V-O analysis are inter- 
preted as reflecting the direction and impor- 
tance of biases in accounting for variation 
within individual item sets—not within in- 
dividual sentences. To compare Gollob’s in- 
terpretation of the weights obtained in an 
S-V-O analysis with the interpretation at- 
tributed to him by Anderson, let us con- 
sider the results of S-V-O analyses of An- 
derson’s Item Sets A and C, which are pre- 
sented in Table 2. 

The data for Anderson’s Item Set A con- 
sist of college students’ responses to eight 
sentences of the following form: “Bill is 
kind (cruel); how probable is it that Bill 
helps (harms) physicians (criminals)?” The 
italicized words denote evaluatively positive 
(negative) levels of the sentence subject, 
verb, and object, respectively. Responses 
were made orally using a 0-20 rating scale 
whose ends were defined by the terms ex- 
tremely improbable and extremely probable. 
Anderson’s Item Set C was based on re- 
sponses to eight sentences that were the same 
as those in Item Set A except that the verb 
hinders was substituted for the verb harms. 
Thus, the four sentences that contain the 
verb helps are common to both item sets.

Anderson reminds the reader that the rat- 
ings associated with each sentence in each 
item set can be recovered by applying the 
weights in Table 2 to the contrasts that 
define the biases (see Table 1) and empha- 
sizes that there are large and statistically 
significant differences in the patterns of 
weights in the two item sets. On the basis of 
this observation, he correctly points out that 
each of the four sentences common to Item 
Sets A and C (i.e., the sentences that contain 
the verb helps) have two distinct representa- 
tions, one in Item Set A and one in Item Set 
C. He then argues that there is no way to 
choose between these two representations and
from this draws the conclusion that the 
S-V-O method of analysis “would seem to 
have no use” (1977, p. 149). The confusion 
here seems to stem from Anderson’s failure 
to recognize that the two different repre- 
sentations are not inconsistent with each 
other, and therefore it is not necessary to 
choose between them—both can be correct. 
The weights obtained in the S-V-O analysis 
of the two item sets express the four com- 
mon sentences in terms that are relative to 
the other content in each item set. Thus, in 
Item Set A, the four sentences that contain 
the verb helps are expressed in terms relative 
to the four sentences that contain the verb 
harms, whereas in Item Set C, they are ex- 
pressed in terms relative to the four sentences 
that contain the verb hinders. Obviously the 
results of an analysis of variance are changed 
when the data input to it are changed. 
Since the issue of how to interpret the 
weights in an S-V-O analysis is at the core 
of the matter, the following paragraphs dis- 
cuss interpretation of the weights in detail. 

To illustrate the relativistic interpretation 
of the weights obtained in an S-V-O analy- 
sis, let us consider the results Anderson ob- 
tained for verb positivity. In Item Set A, 
verb positivity was given a weight of 1.30, 
and in Item Set C it was given a weight of 
.12. Using correct methods of interpretation,
one would view the different weights given
to verb positivity in Item Sets A and C as
indicating that for the particular sentence
subjects and objects used, verb positivity is
more important in accounting for verb in-
ferences when the verbs are helps and harms
(Item Set A) than when the verbs are helps
and hinders (Item Set C). That is, on the
average, the tendency for judges to infer
that a person will help rather than harm,
regardless of whether the person is described
as kind or cruel, and regardless of whether
the object is physicians or criminals, is
stronger than the comparable tendency for
judges to infer that a person will help rather 
than hinder. 

Further insight into the meaning of the 
weights obtained in an S-V-O analysis can 
be gained by recalling that the analysis of 
variance contrast that defines a particular 
bias assigns +1 values to the sentences that
have the bias and assigns −1 values to the
sentences that do not have the bias. Thus,
the weight for a bias is closely related to the
difference between the mean of the four
sentences that have the bias and the mean of
the four sentences that do not have the bias.
Specifically, the weight given to a bias is
equal to half the difference between these
two means. In his uses of the S-V-O model,
Wyer does not report the S-V-O weights
per se; instead, he reports the difference
between the mean rating of the sentences
having the bias and that of the sentences not
having the bias in question. Let us now use
this perspective to look once again at the
difference between the verb positivity weights
in Anderson’s Item Sets A and C. The mean
rating of the four sentences containing the
verb helps was 9.94 for Set A and 9.49 for
Set C. (Since the same four sentences were
used in both item sets, probably the differ-
ence between these two means is due to
measurement error.) The mean rating given
to the sentences that used the verb harms
was 7.34, whereas that for the four sentences
that used the verb hinders was 9.25. Thus,
as one would expect, the average judged
probability that kind or cruel people will
harm physicians or criminals is considerably
less than the average judged probability that
kind or cruel people will hinder physicians
or criminals. The difference between the
weights for verb positivity in the two item
sets reflects these characteristics of the data.
Thus, the weight for verb positivity in Set A
was (9.94 − 7.34)/2 = 1.30, and the weight
in Set C was (9.49 − 9.25)/2 = .12.
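
The arithmetic in the preceding paragraph can be reproduced with a few lines of Python. This is only a sketch: the function names are introduced for illustration, and the four means are the ones quoted in the text.

```python
def bias_weight(mean_with_bias, mean_without_bias):
    """S-V-O weight: half the difference between the two group means."""
    return (mean_with_bias - mean_without_bias) / 2

def mean_difference(mean_with_bias, mean_without_bias):
    """The index Wyer reports: the full difference, i.e. twice the weight."""
    return mean_with_bias - mean_without_bias

# Mean ratings quoted in the text for the helps and harms/hinders sentences.
w_v_set_a = bias_weight(9.94, 7.34)   # helps versus harms (Item Set A)
w_v_set_c = bias_weight(9.49, 9.25)   # helps versus hinders (Item Set C)

print(round(w_v_set_a, 2), round(w_v_set_c, 2))
print(round(mean_difference(9.94, 7.34), 2), round(mean_difference(9.49, 9.25), 2))
```

The two indices differ only by a factor of two, so comparisons across item sets come out the same whichever one is reported.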

In a less technical way, let us now con-
sider interpretation of Anderson’s results con-
cerning SV and SVO balance in the two
item sets. Suppose that the tendency among
judges to think that a person who harms
someone is bad is much stronger than their
corresponding tendency to think that a per-
son who hinders someone is bad, that is,
the evaluative implications of hindering
someone depend to a much greater extent on
the characteristics of the person who is hin-
dered. This phenomenological assumption
would lead SV balance to contribute more
to judgments in the help/harm item set, and
because SVO balance allows for the impor-
tance of the characteristics of the object,
the assumption would lead SVO balance to 
be more important in the help/hinder item 
set. As shown in Table 2, this is what oc- 
curred in Anderson’s data. Thus, when in- 
terpreted correctly, findings that Anderson 
emphasizes as invalidating the S-V-O model 
are seen in reality to be amenable to a psy- 
chologically plausible interpretation in terms 
of the model. 

Having shown that Anderson’s position is 
based on an incorrect interpretation of the 
weights in S-V-O analyses, and having dis- 
cussed correct methods of interpreting the 
weights, I shall now discuss more specifically 
the question of how the S-V-O model deals 
with configural responding. 

Configural effects and the S-V-O model. 
As an example of a configural response, An- 
derson emphasizes that with respect to verb 
inference judgments, the difference between 
harming and hindering physicians is much 
less than the difference between harming 
and hindering criminals. The judged likeli- 
hood of the verb depends in part on the 
object. Anderson’s discussion conveys the 
incorrect impression that S-V-O researchers 
have not been concerned with such configural 
effects. Gollob and Wyer have discussed doz- 
ens of S-V-O analyses of item sets that in- 
volve highly configural responses. For ex- 
ample, Item Sets 1 and 4 in Gollob’s (1974a, 
p. 165) Table 2 include data showing that 
with respect to verb inference judgments, the 
difference between helping and admiring 
friendly people is much less than the differ- 
ence between helping and admiring un- 
friendly people. His subjects inferred that a 
friendly person was reasonably likely to help 
both friendly and unfriendly people, but they 
inferred that admiration was considerably 
more likely to be directed toward friendly, 
rather than unfriendly, people. This help- 
admire example closely parallels Anderson’s 
harm-hinder example of configural respond- 
ing. Gollob (1974a, pp. 166-168; 1974b, p. 
303) grouped together several item sets with 
configural patterns similar to the help-admire 
example and found that the hypothesized 
cognitive biases of the S-V-O model were 
helpful in thinking about the results. 
In general, since the S-V-O model and its
hypothesized cognitive biases are psycho- 
logically meaningful and can account for 
highly configural data such as those pre- 
sented by Anderson, it follows that config- 
ural responding poses no special problems 
for the formulation. The model simply con- 
ceptualizes configural responding in terms of 
the effect such responding has on the relative 
importance of the hypothesized cognitive 
biases of the model. 

We have seen that the S-V-O system can
handle configurality issues of the type raised
by Anderson. There also is another sense in 
which the S-V-O system is intimately con- 
cerned with configural aspects of judges’ 
responses. The S-V-O conceptualization rec- 
ognizes that a given piece of information can 
contribute to judgments simultaneously in a 
variety of ways, and a basic assumption of 
the model is that judges often use configura- 
tions of information (combinations of S, V, 
and O) as single informational units. In the 
present article, these informational units are 
referred to as cognitive biases; Wyer often 
refers to them as informational cues. At one 
level, the fact that notions of configurality 
are essential in defining the hypothesized 
biases is one reason that nonconfigural 
models, such as the multiplying model advo- 
cated by Anderson, typically are unable to 
account for data that can be conveniently 
conceptualized in terms of the S-V-O model 
(see Footnotes 2 and 3). 

In summary, if one is to obtain sensible
interpretations of the weights obtained in 
S-V-O analyses, one must interpret them 
in the context of the item sets from which 
they are obtained. When this is done, con- 
figural response patterns pose no special 
problem, and the weights often can be used 
to compare the contributions of hypothe- 
sized biases across different item sets in a 
systematic and psychologically meaningful 
way. 


Nonadditivity


Anderson argues that if the hypothesized
S-V-O biases are psychologically real, they
probably are combined by an averaging,
rather than an adding, rule. The basic argu-
ment Anderson presents in support of his
position is that because the averaging rule 
has worked well in numerous investigations, 
it probably also would work well if applied 
to cognitive biases. The question is one that 
can be investigated empirically, and there 
seems to be little value in engaging in further 
speculation on the issue. It is important, 
however, to realize that the ratio of the 
weight of one bias to another is the same 
regardless of whether the biases are com- 
bined in accordance with an additive or with 
a simple weighted-average rule. Thus, the 
“adding versus averaging” question raised by 
Anderson is irrelevant if one’s objective is 
to compare the relative importance of the 
hypothesized biases in affecting judgments. 
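
The invariance of the weight ratios can be made explicit with a short derivation. The notation below (bias values $b_k$ and weights $w_k$) is introduced here for illustration only, and the simple weighted-average rule is assumed to be one that divides the weighted sum by the sum of the weights.

```latex
\[
y_{\mathrm{add}} = \sum_k w_k b_k ,
\qquad
y_{\mathrm{avg}} = \frac{\sum_k w_k b_k}{\sum_m w_m}
                 = \sum_k w_k' b_k ,
\quad\text{where } w_k' = \frac{w_k}{\sum_m w_m} .
\]
\[
\frac{w_i'}{w_j'} = \frac{w_i / \sum_m w_m}{w_j / \sum_m w_m} = \frac{w_i}{w_j} .
\]
```

Under this assumption the normalizing sum cancels, so the relative importance of any two biases is the same under either rule.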


An Alternative Approach 


Anderson’s (1977) section entitled “An 
Alternative Approach” presents (a) an al-
ternative approach for the analysis of S-V-O
item sets, (b) a criticism of a procedure
Gollob (1974b, pp. 305-306, 313-316) used
to transform 2 × 2 item sets to make them
amenable to S-V-O analyses, (c) a multi-
plicative model that is sometimes useful in
the analysis of verb inference data when
sentences are constructed by using one verb
in combination with several levels of sentence
subject and object, and (d) a set of general
arguments advocating use of simple algebraic
models to study problems in implicit per-
sonality theory. I shall consider each of these
topics in turn.


Anderson’s Method for Analyzing 
S-V-O Item Sets 


In his subsection entitled “A Multiplying
Model for Subject-Object Integration”
(1977, pp. 151-153), Anderson implicitly ad-
vises that one should avoid collecting data
in the 2 × 2 × 2 form of an S-V-O item
set. If, however, one ends up with verb in-
ference data in this form, he recommends
that the S-V-O item sets be investigated as
two separate 2 × 2 designs—one for each
of the two verbs. He recommends that each
of the 2 × 2 designs be conceptualized and
analyzed in terms of a multiplicative model
that states that the judged likelihood of the
verb can be expressed as the product of a
number assigned to the sentence subject and
a number assigned to the sentence object.
He then says that “the 2 × 2 design does
not provide a proper test of goodness of fit
for the multiplying model” (p. 153), and he
goes on to argue that a better test of the
multiplying model can be obtained by quan-
tifying the subject and object term (see his
Figure 3, p. 151).

Anderson’s comment highlights a limita- 
tion of using the procedure he recommends 
for the analysis of S-V-O item sets. The 
limitation, however, does not apply to S-V-O
analyses of intact 2 × 2 × 2 item sets be-
cause, as was emphasized above, appropri-
ate tests of goodness of fit are possible and
have been used routinely in S-V-O research.
Moreover, the S-V-O model enables one to
conceptualize the data in an intact item set
in terms of the hypothesized cognitive biases,
whereas the analysis method proposed by
Anderson breaks the design into two separate
pieces and provides no natural way of inte-
grating the results of the two analyses. In
addition, the S-V-O model is used not only
in the investigation of verb inferences, which
are discussed by Anderson, but also in the
investigation of subject inferences, object
inferences, pleasantness, and other types of
social judgments.


Transforming 2 × 2 Analyses into
Equivalent 2 × 2 × 2 Analyses


Anderson (1977, pp. 152-153) uses the
terms “artificial,” “misleading,” and “awk-
ward” to describe a procedure Gollob (1974b,
pp. 305, 313) used to enable Rodrigues’s
(1968) verb inference data and Osgood and
Tannenbaum’s (1955) subject and object
inference data to be analyzed using the
S-V-O model. The procedure involved trans-
forming two-way analyses into equivalent
three-way analyses. Rodrigues’s verb infer-
ence data will be used to describe the pro-
cedure and the reasoning that motivated its
use; generalization of the procedure to sub-
ject and object inference data is obvious.

The transformation. In each of three dif-
ferent conditions, Rodrigues presented ex-
perimental subjects with four items that fit
the following format: “You dislike [or like]
o; how does o feel about x [an issue the
experimental subject either favored or op- 
posed]?” In each condition 20 subjects re- 
sponded to each item by indicating whether 
he or she predicted that o had a positive 
feeling or a negative feeling toward the issue. 
When both the person, o (i.e., the sentence 
subject), and the issue, x (i.e., the sentence 
object), were negatively evaluated, 13 of 
Rodrigues’s 20 experimental subjects inferred 
a positive verb and 7 inferred a negative 
verb; Gollob expressed the triads as S-V-O 
sentences and in this case assigned a “prob- 
ability score” of 13 to the — + — sentence 
and a score of 7 to the — — — sentence. The 
other three items were dealt with in a similar 
manner, so that an S-V-O item set consisting 
of eight sentences was generated. The S-V-O 
model was then used to analyze the resulting 
2 × 2 × 2 item set.

When S-V-O item sets are generated in 
this manner, the relevant-bias hypothesis is 
forced to hold perfectly and therefore cannot 
be tested in data of the type collected by 
Rodrigues. Gollob (1974b, pp. 305-306, 313) 
emphasized this point and discussed ways 
of taking this fact into account in interpret- 
ing results of S-V-O analyses that are con- 
ducted on data of this type. It should be 
emphasized, however, that the transforma- 
tion is rarely needed because S-V-O research 
typically involves collection of the type of 
2 × 2 × 2 data sets for which it was de-
signed.

Anderson’s recommendation. Rather than 
apply the transformation used by Gollob, 
Anderson (1977, pp. 151-153) recommends 
that 2 X 2 analyses of variance be applied 
to data of the type collected by Rodrigues. 
Let us use Rodrigues’s data to discuss the 
interpretation of such an analysis. Averaging 
over Rodrigues’s three conditions, the mean 
probability scores for the sentences containing
a positive verb were + + + (18.7), + + —
(4.0), — + + (7.7), and — + — (13.0).
For convenience of exposition we
subtract the “midpoint” of 10 from each 
of these scores before doing the analysis of 
variance; if the grand mean is larger than 
zero on this new scale, we conclude that
over half of the experimental subjects in-
ferred the positive verb. In the two-way 
analysis recommended by Anderson, the 
weights given to the standard analysis of 
variance contrasts that define the grand 
mean, S, O, and SO effects are .85, .50, 2.35, 
and 5.00, respectively.⁴ The large weight
given to the SO interaction suggests that on 
the average, people we like are judged more 
likely to agree with us on issues than are 
people whom we dislike. Thus, in the 2 x 2 
design Rodrigues used, the predicted effects 
of Heiderian balance are shown by the 
presence of a large SO interaction. In an
S-V-O analysis, on the other hand, Heider- 
ian balance leads to a prediction of an SVO 
interaction. The following discussion shows 
that this difference is no cause for alarm. 
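The arithmetic behind these weights can be checked directly. The following is a minimal Python sketch, not part of the original analysis. It assumes the four mean scores are 18.7, 4.0, 7.7, and 13.0 for the + + +, + + -, - + +, and - + - sentences (the 4.0 value is an assumption, chosen because it is consistent with the weights reported above), and the function name `two_way_weights` is hypothetical.

```python
# Mean probability scores (out of 20 subjects) for the positive-verb sentences,
# keyed by (subject sign, object sign). The (+, -) value of 4.0 is an
# assumption that is consistent with the weights reported in the text.
scores = {(+1, +1): 18.7, (+1, -1): 4.0, (-1, +1): 7.7, (-1, -1): 13.0}
MIDPOINT = 10.0


def two_way_weights(cell_scores, midpoint=MIDPOINT):
    """Return the grand mean, S, O, and SO contrasts of the centered scores."""
    n = len(cell_scores)
    centered = {cell: y - midpoint for cell, y in cell_scores.items()}
    return {
        "grand mean": sum(centered.values()) / n,
        "S": sum(s * d for (s, o), d in centered.items()) / n,
        "O": sum(o * d for (s, o), d in centered.items()) / n,
        "SO": sum(s * o * d for (s, o), d in centered.items()) / n,
    }


weights = two_way_weights(scores)
for name, value in weights.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the four contrasts come out as .85, .50, 2.35, and 5.00, matching the weights cited in the text.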

Comparison of the two analyses. The 
most important thing to realize in comparing 
the two analyses of verb inferences is that 
there is an exact correspondence between the 
grand mean, S, O, and SO sources, respec- 
tively, in the two-way analysis recommended 
by Anderson, and the V, SV, VO, and SVO 
sources in the three-way analysis used by
Gollob. Standard analysis of variance pro- 
cedures give identical weights to the con- 
trasts that define appropriately paired 
sources of variation in the two designs. The 
weights given to the grand mean, S, O, and 
SO in the three-way design are all necessarily 
equal to zero. Thus, except for the names 
given to the sources of variation, the two 
methods of analysis yield identical results. 
Since the two analyses have identical psy- 
chological implications, the choice between 
them should be made on the basis of which 
labels are most convenient to use. 
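The correspondence between the two analyses can be illustrated numerically. The sketch below is a hedged illustration rather than the procedure used in either paper. It assumes the same four positive-verb means as in the text (with 4.0 assumed for the + + - sentence) and assumes that each negative-verb score is the complement out of 20 judges; `three_way_weights` is a hypothetical helper name.

```python
from itertools import product

N_SUBJECTS = 20   # judges per condition, so the two verb scores sum to 20
MIDPOINT = 10.0

# Positive-verb scores keyed by (subject sign, object sign); 4.0 is assumed.
positive_verb = {(+1, +1): 18.7, (+1, -1): 4.0, (-1, +1): 7.7, (-1, -1): 13.0}

# Build the 2 x 2 x 2 item set: each negative-verb score is the complement.
item_set = {}
for (s, o), y in positive_verb.items():
    item_set[(s, +1, o)] = y
    item_set[(s, -1, o)] = N_SUBJECTS - y


def three_way_weights(cells, midpoint=MIDPOINT):
    """Contrast weights for all eight sources of a 2 x 2 x 2 item set."""
    sources = {}
    for use_s, use_v, use_o in product((0, 1), repeat=3):
        name = "S" * use_s + "V" * use_v + "O" * use_o or "grand mean"
        total = 0.0
        for (s, v, o), y in cells.items():
            sign = (s if use_s else 1) * (v if use_v else 1) * (o if use_o else 1)
            total += sign * (y - midpoint)
        sources[name] = total / len(cells)
    return sources


three_way = three_way_weights(item_set)
for name in ("V", "SV", "VO", "SVO", "grand mean", "S", "O", "SO"):
    print(f"{name}: {three_way[name]:.2f}")
```

In this sketch the V, SV, VO, and SVO weights reproduce the two-way values of .85, .50, 2.35, and 5.00, while the grand mean, S, O, and SO weights are zero up to floating-point rounding.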

When Gollob’s three-way analysis is used, 
one obtains results directly interpretable in 
terms of the hypothesized cognitive biases. 
When Anderson’s two-way analysis is used, 
however, much unnecessary confusion results.
Consider, for example, a situation in which 
Heiderian balance is predicted to have an 
important effect on subject, verb, and ob- 
ject inference judgments. When using three- 
way analyses, an SVO interaction is implied 
by the Heiderian balance prediction regard- 
less of the type of inference. When using 
two-way analyses, however, the same hypothesized
balance tendency leads one to predict a VO
interaction in the case of subject inferences,
an SO interaction in the case of verb
inferences, and an SV interaction in
the case of object inferences. The confusion
is further compounded by the fact that in 
standard S-V-O item sets, SO, SV, and VO
refer to three different hypothesized cogni-
tive biases—none of which are equivalent to
traditional Heiderian balance. The method
Gollob used to transform two-way analyses 
into equivalent three-way analyses avoids 
these difficulties and enables the S-V-O model’s
concepts and notation to be used in a
consistent manner in investigating data involving
different types of judgments and
different types of data collection procedures.⁵

In summary, in order to use the S-V-O
model to reanalyze their data, Gollob transformed
Rodrigues’s (1968) and Osgood and
Tannenbaum’s (1955) 2 × 2 data sets into
equivalent 2 × 2 × 2 data sets. Contrary to
Anderson’s (1977, p. 152) assertion, there is
nothing artificial or misleading about the
transformation. It avoids the notational confusion
that Anderson’s method introduces,
and provides a convenient way to conceptualize
the results in a 2 × 2 data set in
terms of the hypothesized cognitive biases
used in the S-V-O system.

⁴ When an analysis is done on the probability
scores for the sentences containing negative verbs,
the grand mean and weights have the same absolute
value but are reversed in sign.

⁵ Although Anderson used a rating scale example
to present his criticism of the procedure Gollob used
to apply the S-V-O model to two-way designs, the
issues involved are identical to those discussed above.


A Multiplicative Model for Verb
Inference Data


As an alternative to the S-V-O approach, Anderson
recommends using Oden and Anderson’s (1974)
multiplicative model to study verb inference
judgments that are obtained in response to
sets of sentences constructed by using one
verb in combination with several quantified
levels of sentence subject and object. The
data Anderson uses to illustrate the model
consist of judges’ responses to stimuli of the
following type: “Mr. X is very sociable; how
likely is it that he will socialize with moderately
unsociable people?” In both the sentence
subject and object positions, the adjective
was graded in five levels from very unsociable
to very sociable and the verb socialize with
was used in all sentences. Although the method
Anderson (1977) and Oden and Anderson
(1974) used to test the goodness of fit of
their multiplicative model is not entirely
appropriate,⁶ it is clear that if a constant term
is added to their model, it can provide a reasonable
account of the data in Anderson’s
Figure 3 (1977, p. 151).
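The role of the constant term can be made concrete with a small numerical sketch. The Python below is an illustration under stated assumptions, not a reanalysis of Oden and Anderson's data: the scale values `A` and `B`, the value of `CONSTANT`, and the helper `interaction` are all hypothetical. It shows that a purely multiplicative table and the same table with a constant added have identical interaction components, so a test that examines only the fan-shaped interaction pattern cannot tell the two models apart.

```python
def interaction(table):
    """Double-center a two-way table: remove grand mean, row, and column effects."""
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(map(sum, table)) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(row[j] for row in table) / n_rows for j in range(n_cols)]
    return [
        [table[i][j] - row_means[i] - col_means[j] + grand for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical scale values for five subject levels and five object levels.
A = [1.0, 2.0, 3.0, 4.0, 5.0]
B = [0.5, 1.0, 1.5, 2.0, 2.5]
CONSTANT = 3.0  # hypothetical additive constant

pure = [[a * b for b in B] for a in A]                 # y_ij = A_i * B_j
shifted = [[a * b + CONSTANT for b in B] for a in A]   # y_ij = A_i * B_j + constant

max_gap = max(
    abs(p - q)
    for row_p, row_q in zip(interaction(pure), interaction(shifted))
    for p, q in zip(row_p, row_q)
)
print(f"largest difference between the two interaction tables: {max_gap:.1e}")
```

The difference is zero up to rounding error, which is why some independent reason for taking the constant to be zero would be needed before such a test could be read as supporting the purely multiplicative form.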

Anderson recommends use of a nonconfigural
model. Without attempting to explain
the apparent paradox, Anderson presents the
nonconfigural multiplicative model as providing
a method of attacking problems due to
configurality (1977, pp. 147-148, 156). Anderson
seems to be saying we should study
problems due to subjects’ configural responding
by avoiding them. For example, to study
verb inferences using the approach Anderson
recommends, the data sets investigated must
meet several restrictions explicitly designed to
eliminate configural responding on the part
of the judges: (a) The data set must contain
only one verb, (b) the semantic relationship
between the sentence subject and verb, and
also the semantic relationship between the
verb and object, must somehow be qualitatively
fixed in all sentences in the data set,
(c) the sentence subject and verb must be
capable of being quantitatively varied without
disturbing the fixed qualitative semantic
relationships, and, finally, (d) the likelihood
judgment must be a multiplicative function
of the scale values assigned to the sentence
subjects and objects. Oden and Anderson’s
results are of interest in their own right because
they show that data sets in which these
stringent requirements are met can in fact
be constructed. Since, however, none of the
conditions required by Oden and Anderson’s
method need to be met when using the S-V-O
approach, it is not clear what relevance their
results have for assessing the usefulness of
the S-V-O formulation.

Anderson’s approach and the S-V-O approach
are concerned with different problems
for which different models and different data
collection procedures are appropriate. If one’s
interest is in identifying sets of social judgment
data that are highly quantified and can
be fit by nonconfigural models, the approach
recommended by Anderson is preferable to
the S-V-O approach. On the other hand, the
S-V-O approach is preferable if one’s interest
is in studying social judgments in the
context of 2 × 2 × 2 item sets in which configural
responding can be investigated in
terms of the hypothesized cognitive biases.⁷

⁶ Except for a change of notation, the multiplicative
model presented by Anderson (1977) and Oden
and Anderson (1974) is

$y_{ij} = A_i B_j,$

where $y_{ij}$ denotes the judged likelihood for the
sentence containing the ith sentence subject and the jth
object, and $A_i$ and $B_j$ denote the scale values
associated with the stimuli. Anderson (1977, p. 151)
notes that the data in his Figure 3 form a diverging
fan of straight lines, which is characteristic of the
multiplying model, and on this basis he concludes
that the data support the model given above. Oden
and Anderson (1974) report that an analysis of
variance test of goodness of fit indicated that the
data did not deviate significantly from the linear
fan pattern. However, unless a transformation of
the response scale is allowed (an option not mentioned
by Anderson or by Oden and Anderson),
these methods test the goodness of fit of the model

$y_{ij} = A_i B_j + \text{constant},$

rather than the model proposed by Oden and Anderson.
Thus, in the absence of some independent
a priori certainty that the constant term in the
model is zero, the test of goodness of fit used by
Anderson (1977) and by Oden and Anderson (1974)
is inappropriate.

⁷ Although some work relevant to quantifying
various notions of cognitive bias has been reported
(e.g., Gollob, 1968b; Gollob & Fischer, 1973; Gollob,
Rossman, & Abelson, 1973, p. 29; Heise, 1969), at
present the S-V-O model assesses the biases only at
the level of “present” or “absent.”


Use of Simple Algebraic Models in the
Analysis of Social Judgment


In the last few pages of his critique, Anderson
(1977, pp. 153-156) argues that problems
in implicit personality theory and general
social inference can best be investigated by
use of the methods of data analysis and data
collection that he typically uses in his own
work (cf. Anderson, 1970, 1971). Although
these methods are based on simple algebraic
models that are essentially nonconfigural in
nature, it is clear that many important problems
in social judgment can be effectively
approached by their use. Gollob has used such
methods when they have been appropriate for
the problems under investigation.

One such example is provided by Gollob’s 
(1965, 1968b) study of impression formation 
and word combination in simple subject-verb- 
object sentences. In that study, judges made 
evaluative ratings of men described by sen- 
tences that were constructed from factorial 
combination of several adjectives, verbs, and 
objects (e.g., “The vicious man likes beggars,” 
“The kind man praises communists”). As 
part of the larger study, Gollob used several 
different models to analyze the mean evalua- 
tive ratings (averaged over observations and 
adjectives) of verb-object combinations. One 
set of analyses compared two multiplicative 
models of verb—object integration. One model 
treated the direct evaluative ratings of the 
isolated verbs and objects as interval-level 
scale values, and the other treated the verb 
and object main effects as interval-level scale 
values in the model. (The latter model is the 
type of algebraic model whose use is advo- 
cated by Anderson.) Interestingly, although 
the model based on the main effects estimated 
2.5 times as many free parameters as did the 
model based on direct evaluative ratings of 
isolated sentence components, it was slightly 
less effective in accounting for variation in 
the ratings of the verb-object combinations.

Gollob (1965, 1968b) also tested models 
more complex than the ones just described.
Using Gollob’s (1968a, 1968c; Tucker, 1968)
Factor Analysis of Variance (FANOVA) model,
it was found that two multiplicative terms 
were needed in order to adequately account 
for the variation in ratings of the verb-object 
combinations. Each of the multiplicative terms 
suggested a different aspect of meaning that 
contributed to the evaluative ratings. The 
types of model whose use is advocated by 
Anderson are nonconfigural and do not allow 
for the possibility that more than one multi- 
plicative term can affect judgments. Wyer and 
Hinkle (1976) also have reported social judg- 
ment data in which the FANOVA heuristic of 
representing the data by a series of multiplicative
components was found to be useful.
And finally, Gollob and Rossman (1973)
found that both evaluative and potency judg- 
ments of sentence components were important 
in accounting for potency judgments of sen- 
tence subjects. These few examples emphasize 
that although simple algebraic models such 
as those advocated by Anderson often are 
useful, they are far from complete in their 
ability to describe social judgment data. One
must be prepared to modify one’s models and 
one’s methods of analysis as dictated by the 
characteristics of the problems of interest and 
the data at hand.

Anderson’s critique implies that if a model 
has a simple algebraic form and passes appropriate
tests of goodness of fit, then the
model and its associated scale values are correct,
valid, and subjectively real. Anderson
seems to forget that radically different models
can be roughly equivalent in their ability to
fit data and that the choice between such
models must be made on the basis of convenience,
theoretical considerations, or other substantive
grounds. Anderson might agree, however,
that it is important to temper one’s
enthusiasm at finding a model that fits data
well by considering Tukey’s (1977, p. 586) 
admonition concerning model fitting: “Even 
when we see a very good fit—something we 
know has to be a very useful summary of 
the data—we dare not assume that what is
fitted is of exactly the right form; we dare 
not believe we have found a natural law.” 

In summary, Anderson’s critique ends by
advising researchers in implicit personality
theory to focus their work sharply on the
investigation of problems that are amenable to
quantitative analysis by use of simple algebraic
models. In contrast to Anderson’s recommendation
that we tailor our questions to
fit a particular research paradigm and method
of analysis, the present article recommends 
that we tailor our research paradigms and data 
analysis methods to fit our questions. 


Postscript: Anderson’s Reply Is Not
Responsive to the Issues 


This postscript is a rejoinder to the immedi- 
ately following article by Anderson (1979). The 
preceding part of the present article directly re- 
sponds to every major criticism Anderson (1977) 
presented in his original critique of the S-V-O 
formulation. Anderson’s (1979) article, on the
other hand, does not attempt to rebut the points 
made above; it merely repeats a few of the many 
questionable claims Anderson made in his origi- 
nal paper. This postscript considers (a) Ander- 
son’s shift of emphasis away from technical is- 
sues, (b) Anderson’s insistence that weights ob- 
tained from S-V-O analyses be applied to single 
sentences, whereas Gollob has specifically said 
that they apply to item sets, and (c) Anderson’s 
claims concerning goodness of fit. 


Indeterminate Criticism 

Apparently the central thrust of Anderson’s
criticism of the S-V-O formulation has shifted.
In his reply, Anderson (1979) complains that 
the main point of his original paper is becoming 
lost in a mass of technical detail. This is a 
surprising complaint. Anderson’s (1977) original 
paper spent a dozen pages presenting technical 
criticisms, but only three pages on what he now 
describes as “the main issue.” Thus, contrary to 
the implication of his reply, Anderson’s critique 
was primarily technical in nature. In the first 
part of the present article I have shown that 
Anderson’s (1977) technical criticisms of the
S-V-O approach are refutable, and that the meth- 
ods Anderson proposes for dealing with problems 
of configurality are ineffectual. Rather than dis- 
cuss my refutations of his criticisms concerning 
technical issues, Anderson now states that tech- 
nical issues played a minor part in his original 
critique. 


Choice of Problem 


Anderson begins his reply by saying that the 
“main point” of his original critique and of his 
reply is that an S-V-O analysis “yields a theoretical
representation of the response to a single
sentence that is arbitrary and indeterminate” (p. 
950, emphasis added). Thus, Anderson has ig- 
nored the explicit statement that “the weights 
obtained in an S-V-O analysis are interpreted as 
reflecting the direction and importance of biases 
in accounting for variation within individual item 
sets—not within individual sentences” (p. 940 
above). Moreover, at the end of his reply, An- 
derson (1979) implies that Gollob (1974b, Equa- 
tion 1, p. 292) presented the basic S-V-O model 
as applying to single sentences, regardless of 
context. Contrary to Anderson’s implication, 
Gollob (1974b, p. 292) specifically stated that 
the model equation and weights apply to sen- 
tences in the context of item sets. Since the 
weights obtained in S-V-O analyses do not ap- 
ply to single isolated sentences, Anderson’s “main 
point” is irrelevant in evaluating the usefulness 
of S-V-O analyses. 

Anderson (1979) complains that Gollob has 
not emphasized that the weights obtained in 
S-V-O analyses must be interpreted in the con- 
text of the particular item sets being analyzed. 
This is an odd complaint. The derivation of the 
relevant-bias hypothesis, discussions of how
weights should be interpreted, and the use ac- 
tually made of weights in empirical work all 
show that S-V-O weights must be interpreted 
in the context of the item sets being analyzed. 
Moreover, with the single exception of Ander- 
son (1977), every study conducted within the 
S-V-O framework has considered theoretical 
bases for making a priori predictions of the spe- 
cific ways in which the weights given to biases 
will vary as a function of the particular content 
of the item sets being investigated. Finally, the 
results of most S-V-O studies demonstrate how 
the approach is used to compare the contribu- 
tions of hypothesized cognitive biases across item 
sets of varying content in a systematic and psychologically
meaningful way.

Anderson (1979) defines “operative weights” 
as “the weights that were operative in the actual
judgment of each sentence” (p. 950). Because
S-V-O analyses do not yield estimates of the
operative weights, Anderson concludes that the
S-V-O model is indeterminate, meaningless, untestable,
and unable to deal with configural
responses. It seems that Anderson’s concern with
operative weights prevents him from seeing any 
value whatsoever in studying and making pre- 
dictions about other kinds of weights. He ignores 
the fact that the whole purpose of the S-V-O
model is to conceptualize social cognition prob- 
lems in the context of item sets of eight sen- 
tences; the S-V-O model was not designed to 
provide estimates of operative weights. 

Although the problem of estimating operative 
weights is not the problem addressed by S-V-O
analyses of 2 × 2 × 2 item sets, estimation of
such weights has been the focus of other research
conducted by Gollob and his colleagues.
For example, studies of impression formation by
Gollob (1968b), Gollob and Fischer (1973), and
Gollob and Rossman (1973) have estimated the
values of operative weights that are given to
various cognitive biases. In these studies, quantified
indices of some of the S-V-O biases are
used in place of the qualitative indices considered
in the 2 × 2 × 2 analyses. The basic ideas underlying
the S-V-O biases have been useful in
both types of studies. 

In summary, Anderson (1977, 1979) does not 
attempt to evaluate what the S-V-O approach 
does, how it has been used, or even what claims 
have been made for it. His criticisms might be
germane to other models constructed for other 
purposes; they surely are not relevant to the 
S-V-O model. 


Goodness of Fit 


Anderson (1979) emphasizes that he and Gollob
are in agreement on several important methods
that should be used in testing goodness of
fit. Agreement about what methods should be
used, however, is not relevant to the present
debate. Anderson’s original critique did not say that
Gollob’s beliefs were in error; it said that the
methods Gollob actually used to test goodness
of fit were in error. The nonpostscript portion of
the present article presents several specific examples
that show Anderson’s criticism is incorrect
and misleading.⁸

Anderson (1977, 1979) sets up a classic “straw
man” about goodness of fit by emphasizing that
if all seven biases in the S-V-O model are assumed
to be meaningful in every item set, the
model cannot be invalidated. Although this
statement is correct, it ignores the fact that
only restricted versions of the model are discussed
in S-V-O work. I am not aware of even
one S-V-O study that has assumed that all seven
biases will be relevant in every item set studied.
It is obvious, and has been repeatedly stated,
that the cognitive biases of the model must be
psychologically meaningful on a priori grounds
in order to be taken seriously. As far as I know,
Anderson is the only person who talks as though
anyone would expect all seven biases to be psychologically
meaningful in every item set.

Contrary to Anderson’s (1977, 1979) accusations,
the present article has demonstrated that
Gollob, Wyer, and their colleagues have consistently
used appropriate methods to test the
goodness of fit of several restricted versions of
the S-V-O model. Anderson’s (1979) sole response
to this demonstration is to assert that
restricted versions of the model are “merely
plausible hunches” and have “no special theoretical
status.” Such polemic cannot take the
place of careful model analysis and evaluation.

⁸ Since Anderson is well known for the admonitions
he dispenses to those who fail to test
goodness of fit, it is interesting to consider the
following points. In discussing goodness of fit, Anderson
(1979) refers to Table 2 of his original article
and says that the restricted S-V-O model implied
by the relevant-bias hypothesis “failed badly in the
representation of Design A.” It is ironic that neither
his original paper nor his reply tells the reader
whether the restricted model showed statistically
significant lack of fit. Nor does he mention that the
restricted model accounted for 92% of the item set
variation, or that the nonrelevant biases all had
smaller weights than did any of the relevant biases.


Conclusion 


In summary, Anderson’s (1979) reply presents 
as extreme a distortion of the S-V-O approach 
as did his original critique. I urge readers who 
are interested in more than the heat involved in 
this exchange to read Anderson’s original cri- 
tique, together with some examples of research 
conducted within the S-V-O framework. The 
research examples are important in showing how 
the S-V-O framework has been used and in 
demonstrating its usefulness in dealing with sub- 
stantive issues in social cognition. 


Journal of Personality and Social Psychology 

Indeterminate Theory: Reply to Gollob


Norman H. Anderson 
University of California, San Diego 


The central criticism of the balance triad model presented by Anderson concerned
its theoretical indeterminacy: The same sentence can receive quite
different theoretical representations, depending on arbitrary choice of other
sentences. In other words, the balance triad model gives an indeterminate
description of the cognitive processes involved in the judgment of any given
sentence. Gollob’s reply agrees on this point.


The Main Issue 


The central criticism in my discussion of 
the balance triad model is conceded by Gollob 
to be correct (Anderson, 1977; Gollob, 1979). 
Since this central issue is becoming lost under 
a mass of technical detail, it deserves re- 
emphasis in simple terms: The balance triad 
model yields a theoretical representation of 
the response to a single sentence that is ar- 
bitrary and indeterminate. 

At face value, the balance triad model is 
simple. Seven balance tendencies or “cogni- 
tive biases” are hypothesized. They deter- 
mine the judgment of each single sentence 
by a weighted sum model. The model con- 
ceptualizes “how the information carried by 
the various cognitive biases is combined by 
perceivers in making social inference judg- 
ments.” Specifically, it assumes that the seven 
cognitive biases “are combined linearly to 
yield the judged probability of a sentence 
type” (Gollob, 1974, p. 292). Of primary con- 
cern, therefore, is the importance or operative 
weight of each cognitive bias. 

Unfortunately, the model analysis yields 
indeterminate estimates of these weights. To 
estimate the weights requires the use of seven 
additional sentences to form a 2³ item set.
In principle, that is legitimate and proper.
In this particular case, it has an illegitimate
consequence: The estimated weights for the
given sentence can have quite different values,
depending on arbitrary choice of the additional
sentences. Each sentence thus has
many different theoretical representations;
each representation tells a different theoretical
story. Since the choice among them is
arbitrary, the theoretical representation of
the cognitive biases is indeterminate.

This indeterminacy was the central criticism
of my article. This was the point of
the “thought experiment.” This point was
verified in the actual experiment. The same
sentence did indeed have two quite different
theoretical representations. Each representation
created a quite different theoretical picture
of the underlying cognitive processes.

In his reply, Gollob (1979) agrees with
this criticism: The model “assumes that the
judgment of a given sentence is a linear combination
of the various psychological biases,”
but the model analysis “does not yield direct
estimates of the weights for that linear combination”
(p. 940).

We are thus faced with a peculiar theoretical
situation in which there are two different
sets of weights. One set refers to the
basic cognitive model and may be called
operative weights; these are the weights that
were operative in the actual judgment of
each sentence. The other set may be called
estimated weights; these are the weights that
are obtained when the same basic model is
actually applied to these same judgments.
But these estimated weights are not the op- 
erative weights. This is a strange kind of 
model analysis. What can it mean? 

Gollob argues that the model analysis ac- 
tually measures relative importance of the 
cognitive biases, not for any one sentence but 
for an item set of eight sentences. But this 
argument begs the question, for it rests on 
mere assumption that the model is correct. 
If the model is not correct for any single sen- 
tence, averaging over a set of sentences does 
not necessarily improve it. 

The basic problem, as I have pointed out 
in other discussions of balance theory, is the 
problem of configurality. In my experiment 
(Anderson, 1977), for example, harms and 
hinders were interpreted in terms of config- 
ural relations to the other parts of the sen- 
tence. Such configurality is an important 
problem, but one to which the balance triad 
model is insensitive. No doubt Gollob is correct
in saying that configurality poses no special
problem to the balance triad model. That
is not because the model solves the problem, 
however, but because it buries it by averaging 
over the varied configuralities involved in 
eight different sentences. 

As it stands, therefore, the balance triad 
model should perhaps not be called a model 
at all. Instead, it illustrates the Procrustean, 
curve-fitting properties of analysis of variance. 

In my original article (Anderson, 1977), I 
concluded that 


The balance triad model of Gollob (1974b), Insko
et al. (1974), and Wyer (1975b) represents a notable
attempt to bring needed precision into balance
theory. If successful, this model could fractionate
and measure each of several coacting balance tendencies.
In Gollob’s applications especially, this attempt
went beyond the affective relations of traditional
balance theory to consider semantic relations
and a more informational analysis.

Despite the present criticisms, the work of Gollob,
Insko, and Wyer is important in representing
thoughtful and serious attempts to break through
the limitations of traditional balance theory to a
new approach. There is no doubt about the interest
and potential value of this approach. There is,
of course, the question of its validity. (p. 154)


The question of validity still remains, more 
serious than before.


Goodness of Fit


Since the central criticism of the balance 
triad model was correct, technical details no 
longer seem to have much relevance. Although 
there is much to question in Gollob’s com- 
ments on technical details, they require only 
minor qualifications of my original discussion. 
Accordingly, I wish to comment only on Gol- 
lob’s implication that I have not correctly 
represented his position on goodness of fit. 

On goodness of fit, I explicitly noted that 
“as Gollob (1974b) has carefully pointed out, 
the balance triad model can always fit the 
data perfectly” and that Wyer and especially 
Gollob “have given serious attention to this 
problem” of obtaining a test of goodness of 
fit (Anderson, 1977, p. 145). I also explicitly 
noted that “restricted versions of the com- 
plete balance triad model can readily be 
tested” (p. 146), again in agreement with 
Gollob’s position. And in general, I completely 
agree with Gollob that common sense is a 
much-needed ingredient in model analysis 
(e.g., Anderson, 1976, in press). 

Of course, the basic problem in testing the 
balance triad model still remains. The re- 
stricted versions of the model that can be 
tested have no special theoretical status. They
are merely plausible hunches. If the test re- 
jects the null hypothesis that some weight is 
zero, that only means that the hunch was 
wrong and that the weight is not zero after all. 

Just this situation arose in the comparison 
of Designs A and C in Table 2 of my article. 
The two designs yielded two quite different 
theoretical representations of their four com- 
mon sentences. The relevant-bias hypothesis 
succeeded in the representation of Design C 
but failed badly in the representation of De- 
sign A. I did not take the latter failure as 
a criticism of the theory, however, because 
the general model places no restriction on the 
weights. Gollob evidently concurs; referring
to these two theoretical representations in his 
reply, he states that “both can be correct.” 
In other words, failure of the relevant-bias 
hypothesis is not considered as a failure of
the general model. 

Indeed, Gollob (1974) explicitly asserts 
that “when no restrictions are placed in the 
weights . . . the model is useful as a method 
of decomposing the data from an item set
into psychologically interesting components” 
(p. 293). But when no restrictions are placed 
on the weights, the model is admittedly un- 
testable. To say that it yields a meaningful 
decomposition is to assume its truth regard- 
less of testability. In short, the general model 
disallows only its own disproof. 
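The point about testability can be seen in a short numerical sketch. The Python below is an illustration under stated assumptions, not an analysis from either article: the eight scores are hypothetical, the helper names are invented for this example, and the particular restricted version (grand mean, V, SV, VO, and SVO) is an illustrative choice. With a grand mean and seven bias weights fitted to eight sentences, the unrestricted model reproduces any data exactly, whereas a restricted version can leave residuals and can therefore be tested.

```python
from itertools import product

SIGNS = (+1, -1)
CELLS = list(product(SIGNS, repeat=3))       # (s, v, o) signs for the 8 sentences
SOURCES = list(product((0, 1), repeat=3))    # which of S, V, O enter each contrast


def contrast(cell, source):
    """Sign (+1 or -1) of one sentence on one contrast."""
    sign = 1
    for level, used in zip(cell, source):
        if used:
            sign *= level
    return sign


def weights(scores):
    """Weight of each source: mean of contrast sign times score."""
    return {
        src: sum(contrast(c, src) * scores[c] for c in CELLS) / len(CELLS)
        for src in SOURCES
    }


def fitted(w, keep=SOURCES):
    """Scores reproduced from the weights of the sources listed in `keep`."""
    return {c: sum(w[src] * contrast(c, src) for src in keep) for c in CELLS}


# Hypothetical judgments for one item set (any eight numbers would do).
scores = dict(zip(CELLS, [17.0, 4.0, 6.0, 15.0, 9.0, 12.0, 8.0, 3.0]))
w = weights(scores)

full_fit = fitted(w)
full_error = max(abs(full_fit[c] - scores[c]) for c in CELLS)

# Illustrative restricted version: grand mean, V, SV, VO, and SVO only.
restricted = [(0, 0, 0), (0, 1, 0), (1, 1, 0), (0, 1, 1), (1, 1, 1)]
restricted_fit = fitted(w, restricted)
restricted_error = max(abs(restricted_fit[c] - scores[c]) for c in CELLS)

print(f"unrestricted model, largest residual: {full_error:.1e}")
print(f"restricted model, largest residual:   {restricted_error:.2f}")
```

In this sketch the unrestricted residual is zero up to rounding for any choice of the eight scores, while the restricted residual is nonzero whenever the omitted weights are nonzero.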

One further comment on goodness of fit 
should be added in light of Gollob’s reply. It 
now is clear that the estimated weights ob- 
tained from the model analysis are not the 
weights in the operative model. Such model 
analyses would seem to have no relevance 
whatever to testing the operative model. 


A Question of Understanding 


In his reply, Gollob states that he never 
intended to make and never did make the 
theoretical interpretation to which my cen- 
tral criticism was directed. His present posi- 
tion, certainly, is quite different. I apologize 
for any deficiencies in my understanding, but 
I did spend considerable time on the article 
trying to be correct and fair. Since Gollob 
finds fault with my understanding, I feel it 
appropriate to add three remarks in self- 
defense. 

First, I sent the original and the revised 
preprint of my article to Gollob and to other 
interested parties, together with covering let- 
ters asking for comments and criticism. Not 
one of the replies that I received suggested 
that my central criticism rested on a misun- 
derstanding of the balance triad model. 

Second, it was an editorial requirement that 
I verify the thought experiment with an ac- 
tual experiment. But the sole purpose of the 
thought experiment was to illustrate the theo- 
retical indeterminacy. Neither it nor the ac- 
tual experiment had any other point or pur- 
pose. If Gollob’s present position had been 
clear then, both the thought experiment and 
the actual experiment would have been un- 
necessary. Since this point was not raised in 
the editorial process, it seems proper to say 
that I was not alone in my understanding of 
Gollob’s position. 

Finally, my understanding of Gollob’s 
(1974) theoretical position still seems to me 
to agree with his statements in that article. 
It is true, to be sure, that the numerical cal- 
culations to estimate the weights require use 
of a 2³ item set of sentences. The issue is 
not how the weights are calculated, however, 
but what they mean. Gollob (1974) repeat- 
edly refers to the weights of the cognitive 
biases in inferences or judgments to single 
sentences; these are the operative weights. 
No warning is given about any other set 
of weights. No warning is given that the esti- 
mated weights obtained from the model anal- 
ysis are not the operative weights for any 
sentence. 

Quite the contrary. The basic model (Gol- 
lob, 1974, Equation 1, p. 292) explicitly rep- 
resents the judgment of the given single sen- 
tence as a weighted sum of cognitive biases; 
the same paragraph asserts a few lines later 
that the model analysis provides “least squares 
estimates of the weights.” If Gollob (1974) 
considered that two different sets of weights 
were involved, he should have said so then. 
Since Gollob (1974) does not even allude to 
two different sets of weights, the complaints 
in Gollob (1979) seem ill-taken. 


Negative Emotional Biasing of Unexplained Arousal 


Christina Maslach 
University of California, Berkeley 


Schachter and Singer’s two-factor model of emotion argued that a state of un- 
explained arousal could as easily be labeled “euphoria” as “anger,” depending 
on the cognitions available in the immediate situation. A theoretical critique of 
this model suggests that the individual’s search for a cognitive explanation of 
an arousal state is (a) more extensive than originally hypothesized and (b) 
biased towards negative emotional labels. Because of methodological weak- 
nesses in the original study, a new paradigm, utilizing amnesia for hypnotically 
induced arousal, was developed to test the contrasting hypotheses. Subjects 
either did or did not experience unexplained arousal in the presence of a con- 
federate who was displaying either happy or angry emotions. The results of 
this modified replication, as well as of an exact replication by Marshall and 
Zimbardo, fail to support the Schachter and Singer model and demonstrate the 
negative bias in the process of identifying unexplained internal states. 


The study of emotion has been slowed by 
a lack of consensus about the scientific con- 
ceptualization of this phenomenon that we 
each know so well at a personal level. While 
some researchers define emotion in terms of 
motives or behavioral responses, others focus 
on the sensations of bodily changes, subjec- 
tive experience, or cognitive processes. With- 
in the more cognitive models of emotion, the 
most influential work has been the classic 
experiment by Schachter and Singer (1962). 
Two dimensions of emotion were indepen- 
dently manipulated in the study: physiologi- 
cal arousal and cognitive cues. Some physio- 
logically aroused subjects were correctly 
informed as to the source of their arousal, 
while others were either uninformed or given 
an erroneous explanation. Crosscutting this 
manipulation of arousal was the manipula- 
tion of the emotional cues presented in the 
immediate situation. Subjects interacted with 
a confederate who was acting either euphoric 
or angry. The results appear to show that 
aroused subjects without a prior appropriate 
explanation used the available situational 
cues to label their state in emotional terms. 
If they were with a euphoric person, they 
labeled their own arousal as “euphoria,” but 
if they were with an angry person, the same 
physiological arousal was labeled as “anger.” 
Unaroused subjects or those with an appro- 
priate explanation for their arousal were re- 
ported to be uninfluenced by the emotional 
cues provided by the confederate. 

This experiment represented an important 
conceptual change from previous theoretical 
positions, all of which postulated a direct 
causal relationship between physiological 
arousal and cognitions about emotional 
states. In contrast, Schachter and Singer 
proposed that the factors of physiological 
arousal and emotional cognition could func- 
tion independently and that they interacted 
with each other to produce a true emotion. 
In addition to this interactionist model, 
Schachter and Singer made a second signifi- 
cant contribution by demonstrating the im- 
portance of situational determinants of emo- 
tion. The theoretical impact of this study 
cannot be overestimated, foreshadowing as 
it did the rise of cognitive analysis in social 
psychology. This general line of thinking 
played an important role in the current de- 
velopment of attribution theory, as a result 
of its emphasis on the role of cognition in 
defining reality. Furthermore, it has inspired 
several new lines of theory and research 
about a variety of internal feeling states, in- 
cluding fear, hunger, passion, and pain (see 
London & Nisbett, 1974; Mandler, 1975; 
Schachter, 1971; Walster, 1971). 

However, despite its ingenious paradigm 
and its theoretical contributions, the Schach- 
ter and Singer (1962) study cannot be 
accepted uncritically. Although the conclu- 
sion is provocative, a careful inspection of 
the findings suggests that it is not, in fact, 
supported by the evidence. The initial data 
analysis reveals no significant difference in 
self-reported emotion between any of the 
unexplained arousal groups and the placebo 
control groups. Although the authors point 
to the predicted pattern of means within 
each emotion condition, the absolute size of 
the effects is minimal, since the difference in 
emotion between any of the groups amounts 
to less than one single scale unit on a 9-point 
scale. The theory also requires absolute 
qualitative differences in emotion as well as 
relative differences in emotion ratings be- 
tween the experimental groups of the 
euphoria and anger conditions. However, the 
means of all conditions reflect a feeling of 
“slight happiness,” regardless of the emo- 
tional cues provided by the euphoric and 
angry confederates. 

The behavioral data are also weak. The 
comparison between unexplained arousal 
groups and placebo controls reaches statis- 
tical significance only in the anger condition, 
not in the euphoria one. Even though this 
pattern of “anger units” is in line with the 
authors’ theory, the size of this response cer- 
tainly is not. The low mean score for subjects 
with unexplained arousal could represent 
merely one instance of verbal agreement with 
the angry confederate’s position over nearly 
20 minutes of interaction. Somewhat stronger 
behavioral differences emerged only after a 
sizeable number of subjects were discarded 
in an internal analysis. 

One possible reason for the lack of em- 
pirical support may be that there are me- 
thodological flaws in the design of the study 
(see Maslach, 1979, for a review of these 
problems; see also the critiques by Lazarus, 
1968; Leventhal, 1974; and Plutchik & Ax, 
1967). Given the theoretical significance of 
the Schachter and Singer study, one would 
expect that more methodologically rigorous 
replications would have already been done. 
Surprisingly, only one attempt has been 
made. Marshall and Zimbardo (1979) tested 
80 male subjects in a partial replication of 
the Schachter and Singer paradigm for the 
euphoria condition. However, they found no 
empirical support for Schachter and Singer’s 
conclusions. Subjects who had received the 
standard epinephrine injection, along with 
an “inadequate” explanation of its side 
effects, did not differ from placebo controls 
in either emotional affect or behavior. Fur- 
thermore, when the arousal was made 
stronger and more salient, subjects reported 
a negative emotional state, rather than the 
positive “euphoria” predicted by Schachter 
and Singer. 


Alternative Views 


The lack of empirical support for the con- 
clusions advanced by Schachter and Singer 
(1962) forces us to consider alternative hy- 
potheses. One possibility is that the Schach- 
ter and Singer theory is incorrect. The quali- 
tative differentiation of emotions may have 
some basis other than situational cues. More- 
over, an undifferentiated autonomic arousal 
may not be the physiological basis for emo- 
tion. The Schachter and Singer model may 
also be incorrect in its assumption that auto- 
nomic arousal is a neutral activator steered 
by available cognitions. Marshall and Zim- 
bardo’s findings suggest that attending to 
such arousal may have a negative quality 
that will overshadow positive emotional cog- 
nitions. If so, this would argue for the 
importance of physiological feedback in de- 
termining the quality (and not just the 
quantity) of the emotion. 


An alternative interpretation is that the 
Schachter and Singer theory is correct but 
that the particular paradigm did not provide 
an adequate test of it. Schachter and Singer 
present their model as an explanation of 
emotion in everyday situations, in which the 
physiological and cognitive factors are com- 
pletely entwined. However, they test their 
model by separating the two factors and 
creating a situation of unexplained auto- 
nomic arousal. Such a situation is rather rare 
and unusual in everyday life, and it may be 
erroneous to generalize from this special 
case to more common emotional experiences. 
Because of its novelty, the experience of 
unexplained arousal may not be affectively 
neutral, but may instead produce a bias in 
the search for an emotional cognition. 

Negative biasing. The idea that unex- 
plained arousal is an affectively neutral state 
seems intuitively correct only for someone 
who is experiencing such arousal for the first 
time (such as a child), and who must look 
to other people and/or the situation to make 
sense of this strange and puzzling experience. 
The concept does not apply as well to people 
who have had numerous experiences pairing 
arousal with emotional labels and with ap- 
propriate arousal instigators. If they were to 
feel aroused without knowing why, it would 
not only be an unusual experience but a dis- 
turbing or even frightening one, since this 
uncertainty would be felt as a loss of per- 
sonal control over their own internal states. 
Indeed, the concept of unexplained arousal 
is very close to the clinical definition of 
“free-floating anxiety,” which is always char- 
acterized by negative emotional affect. 
Therefore, unexplained arousal may generate 
a biased, rather than an unbiased, scanning 
of evidence and alternative hypotheses. 
That is, the search for an appropriate cogni- 
tive label for the arousal could be biased 
toward available negative instances. A nega- 
tive bias also seems more probable because 
noticeable physical arousal is a far more 
common concomitant of negative emotional 
experiences than of positive ones (with the 
possible exception of sexual excitement). 

An expanded search. This alternative 
view still accepts the Schachter and Singer 
notion of a cognitive search for an emotional 
explanation. However, their experimental 
test of the search process was a narrow one 
that did not consider the full range of in- 
formation that a person might have used. 
They only assessed the impact of the con- 
federate’s mood on the subject and ignored 
other information in the immediate situation 
or in the subject’s life experiences. For ex- 
ample, they did not consider the possibility 
that a person will search his or her memory 
for past events or prior comparable situations 
that would provide an appropriate explana- 
tion for the current state of arousal. Subjects 
might have tried to recall similar situations 
in which they had felt aroused (such as be- 
ing in a testing situation or visiting a doc- 
tor). They might have then used these 
memories to derive an appropriate emotional 
cognition (“I must be getting upset about 
the tests they are going to do, since I always 
worry about failing” or “I’ve never liked 
getting stuck with needles by doctors, so 
that’s why this injection bothers me”). To 
the extent that subjects’ explanations were 
based on their past history, on attributions 
about their health, or on nonsocial aspects 
of the environment, it might explain the 
relatively weak impact of the confederate’s 
mood or the substantial within-condition 
variance that attenuated treatment effects. 

Although Schachter and Singer (1962) 
postulate that the relevant emotional cog- 
nitions arise from the immediate situation, 
they operationalize this concept in terms of 
a person who serves as a social comparison 
referent for the subjects. They predict that 
it is the emotion displayed by this compari- 
son person that will be the “appropriate cog- 
nition” which the subjects use to label their 
own arousal. However, such a prediction 
must rest on the assumptions that subjects 
will view the other person as (a) being some- 
one similar and comparable to them, (b) 
responding to a set of situational demands 
that are also impinging on the subjects, and 
(c) behaving in an appropriate way to these 
demands. 

The first assumption may be justified in 
the experiment, since the comparison person 
was of the same sex and age as the subject 
and had a similar educational background. 
There is less evidence to support the other 
two assumptions. An examination of the ex- 
perimental procedure reveals that the com- 
parison person was not always responding 
to the same situation that confronted the 
subject. For example, there is nothing in- 
herent in the messy experimental room that 
should cause people to experience euphoria. 
Given this lack of situational inducements,¹ 
the confederate’s playful behavior says more 
about his particular personality than it does 
about the appropriate way to respond to a 
situation. This dispositional interpretation is 
made especially salient by the confederate’s 
own statement that “this is one of my good 
days” (Schachter & Singer, 1962, p. 384). 
From the subject’s point of view, whatever 
is causing the confederate to feel euphoric 
is something unique to him and not some- 
thing that they share in common. In con- 
trast, the anger condition does provide some 
relevant situational demands, since the con- 
federate’s anger is directed at a long and 
insulting questionnaire that the subject is 
also completing. 

Finally, there is insufficient evidence that 
subjects would view the confederate’s be- 
havior as appropriate to the situation. In 
both experimental conditions, the confederate 
behaved in unusual ways (agitated, manic, 
and assertive), especially compared to the 
norm of quiet compliance that is generally 
characteristic of research subjects. Spontane- 
ously playing with paper airplanes and hula 
hoops is not a typical thing to do in a scien- 
tific laboratory, nor is abruptly walking out 
of the experiment. To the extent that sub- 
jects perceive the confederate as acting in 
idiosyncratic, unusual, or inappropriate ways, 
he loses his social comparison reference 
power, and subjects would not be likely to 
label their own arousal in terms of their 
cognitive interpretation of his mood. 


Current Study 


The current research was designed to further 
investigate the process of a cognitive search 
for the causal antecedents of unexplained 
arousal. It involved a modified replication in 
which procedural elements were altered to 
correct some of the original methodological 
problems. These improvements included the 
use of a hypnotic induction to produce an 
arousal state, rather than a drug injection. 
Previous research has demonstrated that 
marked physiological changes can be pro- 
duced hypnotically (see the review by Sarbin 
& Slagle, 1972). In addition, the use of hyp- 
nosis allowed for a better degree of experi- 
mental control over the onset and the main- 
tenance of the physiological arousal. The 
lack of an explanation was produced by a 
posthypnotic suggestion for amnesia of the 
true cause of the arousal. Subjects experi- 
enced physiological arousal but did not know 
why the arousal was occurring and did not 
remember that it was due to the hypnotic 
suggestion. Posthypnotic amnesia is a phe- 
nomenon that has been demonstrated in 
numerous studies (see Cooper’s review, 1972) 
and is believed to involve a temporary dis- 
ruption of the normal retrieval process 
(Evans & Kihlstrom, 1973). Cues that nor- 
mally aid the process of recall are not uti- 
lized as they would be in normal waking 
memory. 

Second, the behaviors of the two emotion 
conditions were made directly comparable— 
the nature, level, and timing of the activity 
were equivalent in both conditions, and only 
the confederate’s obvious emotional state 
(happy or angry) was varied. The reasons 
given for the confederate’s emotion were 
ones that the subject could readily share, 
and the confederate’s behavior appeared be- 
lievable and appropriate in the laboratory 
setting. Finally, several measures were added 
to the study, both to assess the validity of 
the experimental manipulations and to reflect 
more precisely any differences in emotional 
behavior. 

¹ In a personal communication (1976), Bibb 
Latané (who was the confederate in the euphoria 
condition) pointed out that the confederate exerted 
a great deal of pressure on the subject to join him 
in the “euphoric” behavior and that this constituted 
a relevant situational inducement. However, this 
raises even more problems in interpreting the data, 
since the subject’s behavior would then reflect so- 
cial conformity rather than an experienced emotion. 
One could also argue that once subjects had been 
induced to imitate the confederate, they would make 
judgments of their emotional state on the basis of 
observing their own behavior (Bem, 1972; Walters 
& Amoroso, 1967). According to this self-perception 
interpretation, the “appropriate cognition” is not 
the mood of the confederate but the behavior of 
oneself (which has been induced by the confederate). 

Hypotheses. The proposed modification 
and extension of the concept of cognitive 
search leads to some different predictions 
from those made by Schachter and Singer 
(1962). These competing hypotheses center 
on the questions of (a) whether the search 
is limited to, or goes beyond, the immediate 
situation and (b) whether the search is un- 
biased or biased. On the first question, 
Schachter and Singer imply that subjects 
will rely primarily on the information pro- 
vided by the confederate to label their own 
unexplained arousal. In contrast, the current 
study predicts that subjects will also utilize 
information from both their past and present 
life. Thus, their emotional response will be 
more independent of the confederate’s mood 
and less constrained by the immediate situ- 
ation. On the second question, Schachter and 
Singer argue that subjects with inadequately 
explained arousal will engage in an unbiased 
search for a cognitive label. In contrast, the 
current study proposes that regardless of 
the confederate’s display of affect, the cog- 
nitive search will be biased towards labels 
of negative emotions, since a state of un- 
explained arousal is initially anxiety provok- 
ing for adult subjects. For those not experi- 
encing arousal or those given an appropriate 
explanation for their arousal, both theories 
agree on the absence of a search for cogni- 
tive labels of emotion, and thus an absence 
of an experience of emotion. 


Method 


Subjects 


Forty-eight undergraduate subjects (25 males and 
23 females)² were drawn from the introductory 
psychology course subject pool at Stanford Univer- 
sity and were paid for their participation in this experi- 
ment. All of the subjects had received scores of high 
hypnotic susceptibility on the Harvard Group Scale 
of Hypnotic Susceptibility (Shor & Orne, 1962).³ 
Some of these subjects were randomly assigned 
to training in hypnosis prior to the experiment, while 
other subjects were randomly assigned to control 
conditions that did not receive such training. The 
training, which averaged about 10 hours 
per subject, was conducted in small groups and 
utilized a relaxed and permissive approach (see 
Zimbardo, Maslach, & Marshall, 1972). The training 
program sampled a wide variety of hypnotic ex- 
periences and included as criterion tests (a) the 
successful completion of posthypnotic suggestions 
with accompanying amnesia and (b) increased tol- 
erance of ischemic pain (a measure developed by 
Lenox, 1970). Throughout the experimental session, 
trained and untrained subjects were treated identi- 
cally except for a brief hypnotic induction for the 
trained subjects. 


Procedure 


The study consisted of two experiments that fol- 
lowed the same procedural format. The testing 
session was divided into two parts. The purpose of 
Part 1 was to establish a verbally conditioned 
arousal response to a specific cue and to demon- 
strate that this arousal involved substantial physio- 
logical changes. In Part 2, this arousal was either 
elicited or not in the presence of a same-sex con- 
federate who was behaving happily or angrily. 

Part 1: Physiological arousal. Subjects were told 
that the study involved several types of individual 
assessment, including personality characteristics, 
physiological responsiveness, and learning perform- 
ance. The subject was seated in a comfortable 
chair in an acoustic chamber, and the experimenter 
wiped clean the recording areas with alcohol. 
Beckman silver-silver chloride skin electrodes (17 
mm in diameter) filled with Beckman Electrolyte 
Gel were attached by adhesive discs to the thenar 
eminence of the subject’s nondominant palm and 
the volar surface of the two forearms. For the re- 
mainder of the period in the acoustic chamber, the 
subject’s heart rate and galvanic skin response 
(GSR), as measured by skin resistance, were con- 
tinuously recorded on a four-channel Offner Type 
R dynagraph with a #9857 cardiotachometer cou- 
pler and a #9842 galvanic skin response coupler. 
Standardized, prerecorded instructions were pre- 
sented to the subjects over an intercom. The 5-min 
hypnotic induction procedure (which was also pre- 
recorded) was eliminated for the unhypnotized con- 
trol groups, who were merely told to relax for an 
equivalent period of time while baseline recordings 
were being made. 

² There were no significant differences between the 
responses of male and female subjects, so their data 
were combined. 

³ The selection of high-scoring subjects was done 
for practical reasons, since they can complete the 
hypnotic training and pass the criterion tests suc- 
cessfully in a shorter period of time than low-scoring 
subjects. There is no evidence that they differ in 
substantial ways from the general population as a 
whole, although they tend to exhibit a greater im- 
agination (Hilgard, 1970). 

All subjects then heard the following verbal con- 
ditioning instructions: 


In this session, the following reactions will occur 
whenever you see the word start. When you see 
the word start, your heart will beat faster, your 
breathing will increase, there will be a sinking 
feeling in your stomach, and your hands will get 
moist. You will feel all of these sensations as 
soon as you see the word start, and they will 
last until I say to you, “That’s all for now.” 
When I say, “That’s all for now,” you will return 
to your normal state and feel relaxed and good. 
However, when you see the word start and ex- 
perience these reactions, you will not know why 
you are feeling the way you are, or remember my 
telling you anything about it. 


The four symptoms that were cited in these in- 
structions had all been reported by a pretest sample 
of volunteer subjects who had received an injection 
of epinephrine. Also, the relatively quick onset of 
the hypnotically induced arousal parallels the rapid 
“rush” that is typical of many people’s response 
to epinephrine. 

Subjects then described their current mood state, 
using the Nowlis Mood-Adjective Checklist (1963). 
A series of stimulus lights were then individually 
illuminated. The first of three neutral cues was a 
red light, the second was a white light, the third 
was labeled “stop,” and the cue word “start” was 
given as the fourth signal. Following this arousal 
cue, subjects filled out a second Mood-Adjective 
Checklist, describing their feelings at that moment. 
Next, the termination cue was given for the dis- 
appearance of any experienced symptoms and for 
the return to a normal physiological state and a 
relaxed psychological state. There was a repetition 
of the posthypnotic suggestion, and then the hyp- 
notized subjects were brought out of their hypnotic 
state. 

Part 2: Emotional arousal. The subject returned 
to the first room and joined a waiting confederate 
(ostensibly another subject). In front of each of 
them was a memory drum, a learning test sheet, 
and a folder containing bogus test materials, such 
as Thematic Apperception Test (TAT) pictures, the 
Stroop color-word test, and a puzzle. The experi- 
menters observed the behavior of the subject and 
the confederate through a one-way mirror and 
communicated via an intercom system. 

A learning task was introduced in order to have 
a nonobvious way of presenting the arousal cue 
without either the confederate or observers being 
aware of the manipulation. Two 15-word lists were 
serially presented on the memory drum with a 4-sec 
exposure for each word and a 1-min recall test 
administered immediately after each list. The final 
word on the second list was either the arousal cue 
word (“start”) or a neutral word (“speedy”). Sub- 
jects in the aroused and unhypnotized conditions 
saw the arousal cue word, whereas subjects in the 
unaroused conditions saw the neutral one. Next, the 
subject and confederate were told to look through 
the materials in the folder and were then left alone 
for 4 min. During this period, the confederate went 
through a prearranged series of either “angry” or 
“happy” behaviors. 

The confederate’s behavior followed the same 
general pattern in both the happy and angry con- 
ditions, although it differed in affect and the reasons 
given to justify that affect. During the first minute, 
the confederate worked with the folder materials 
and made a few comments about them, while in the 
second minute he or she talked about the experi- 
ment and school. During the third minute, the con- 
federate directed some questions at the subject, and 
during the fourth minute, he or she finished exam- 
ining the materials and “accidentally” dropped them 
on the floor. In both conditions, the confederate’s 
affective level was mild at first (annoyance or posi- 
tive interest) but became more intense over time 
(visible anger or exuberant laughter). Both confed- 
erates explained their emotional reaction in terms of 
impending tests and term papers. The angry confed- 
erate resented the time away from study caused by 
this uninteresting and slow-paced experiment. The 
happy confederate enjoyed the temporary distrac- 
tion from these academic demands that the ex- 
periment made possible. 

At the end of this emotional modeling period, 
the pair was separated and the subject was asked 
to complete two questionnaires, one of which fo- 
cused on the subject’s present physiological state 
and experienced emotion, while the other assessed 
reactions to the confederate. Next, the experimenter 
took the subject to an adjacent room to be inter- 
viewed by a clinical psychologist. This person was 
not affiliated with the experimenters and did not 
know what had taken place in the experiment. The 
clinician’s task was to learn what the subject was 
feeling at the time, discover the cause of any ex- 
pressed affect, and (if the subject was feeling at all 
distressed) make him or her feel better. The inter- 
views were open-ended, were about 30 min in 
length, and were tape recorded for subsequent anal- 
ysis. The purpose of these interviews was to assess 
whether the amnesia induction was powerful enough 
to resist skilled probing by someone unconnected 
with the hypnosis experience. 

After the interview was completed, the experi- 
menter entered the room and removed the hypnotic 
suggestions for the arousal response and the am- 
nesia. [...] their reactions to the study. All subjects promised 
not to talk about the experiment until it was com- 
pleted. Experimental silence was maintained, and 
all subjects later participated in an elaborate de- 
briefing session conducted in a series of 
small group sessions. There was no evidence of any 
emotional carry-over beyond the immediate ex- 
perimental setting. 


Summary of Design 


The first experiment consisted of a 3 × 2 fac- 
torial design. There were three levels of arousal: 
aroused (hypnotized + arousal cue), unaroused 
(hypnotized + no arousal cue), and unhypnotized 
(unhypnotized + arousal cue). There were two 
levels of confederate emotion, happy and angry. 
Subjects who were trained in hypnosis were ran- 
domly assigned to the four hypnosis groups, and 
untrained subjects were randomly assigned to the 
two unhypnotized groups, all with six subjects per 
group. 

Because the results of this first experiment did 
not replicate those found by Schachter and Singer 
(1962) for the “euphoria” treatment, two addi- 
tional experimental conditions were tested to dis- 
cover the basis for these differing outcomes. In both 
conditions, subjects were given the arousal treat- 
ment (hypnotized + arousal cue) and were exposed 
to the display of happy emotion by the con- 
federate. One of these additional groups (informed) 
remained aware of the basis for their 
arousal, since they were not given any suggestion 
of posthypnotic amnesia. This group parallels that 
of the informed treatment in the Schachter and 
Singer (1962) study. The other group (simplified) 
was given only two physiological symp- 
toms (heart beating faster, respiration increasing), 
in order to create a more purely neutral arousal. It 
is conceivable that the other two symptoms (sink- 
ing feeling in the stomach, moist hands), although 
common in reactions to epinephrine, too strongly 
suggest a negative arousal state. As before, sub- 
jects trained in hypnosis were randomly assigned 
to these two conditions, with six subjects per group. 


Results 


Physiological Arousal

There was a striking difference in physiological
response between the hypnotized and unhypnotized
conditions (see Table 1). For each subject, the
heart rate record was divided into 5-sec
intervals, and both the highest and mean heart
rates for each interval were calculated. There
were six such intervals following each stimulus,
and their individual scores were combined into a
total [...] score for each stimulus. A repeated
measures analysis of variance for the highest
heart rate scores showed that there was a
significant main effect of visual stimuli, such
that the highest heart rate following the arousal
cue was significantly higher, F(1, 34) = [...],
p < .001,⁴ than the highest heart rate following
the neutral stimuli (which did not differ from
each other). However, there was also a significant
interaction between stimuli and hypnosis
condition, F(1, 34) = 5.35, p < [...]. The F tests
for the two hypnosis conditions reveal that this
main effect was due entirely to the hypnotized
subjects, F(1, 34) = 37.17, p < .001. The
unhypnotized subjects did not show a significant
increase in highest heart rate in response to the
arousal cue (F = 2.19, ns). An analysis of mean
heart rate revealed exactly the same pattern
of results.

Table 1
Conditioned Physiological Arousal

                                    Cue
Condition                  n    Neutral   Arousal   Increase
Heart rateᵃ
  Hypnotized               24
    Mean                        80.8      88.0      7.2
    High                        84.6      94.1      9.5
  Unhypnotized             12
    Mean                        76.6      78.8      2.2
    High                        80.6      83.8      3.2
Galvanic skin responseᵇ
  Hypnotized               24   2.6       4.4       1.8
  Unhypnotized             12   2.1       2.4       .3

a Beats per minute.
b Number per period.
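
The interval scoring described above can be sketched in a few lines. This is only an illustration: the sample format (pairs of time in seconds and beats per minute), the function names, and the choice to average the six interval scores into one score per stimulus are all assumptions, since the text does not say how the interval scores were combined.

```python
from statistics import mean

INTERVAL_SEC = 5   # interval width stated in the text
N_INTERVALS = 6    # six intervals follow each stimulus


def interval_scores(samples, stimulus_time):
    """Return (highest, mean) heart rate for each 5-sec interval.

    samples is a list of (time_sec, beats_per_min) pairs; this input
    format is hypothetical and chosen only for the sketch.
    """
    scores = []
    for k in range(N_INTERVALS):
        start = stimulus_time + k * INTERVAL_SEC
        end = start + INTERVAL_SEC
        window = [bpm for t, bpm in samples if start <= t < end]
        if not window:
            raise ValueError(f"no heart rate samples in interval {k}")
        scores.append((max(window), mean(window)))
    return scores


def stimulus_totals(samples, stimulus_time):
    """Combine the six interval scores into one 'high' and one 'mean' score.

    Averaging across intervals is an assumption, not the paper's stated rule.
    """
    scores = interval_scores(samples, stimulus_time)
    high = mean(h for h, _ in scores)
    avg = mean(m for _, m in scores)
    return high, avg
```

Calling `stimulus_totals` once per stimulus would give the per-stimulus "High" and "Mean" values that a repeated measures analysis could then compare across cue types.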

A similar pattern of arousal was reflected 
in the total number of GSRs following each 
stimulus. There was a significantly greater 
number of GSRs following the arousal cue
than following the neutral stimuli, F(1, 34) 
= 20.87, p < .001. However, there was also 
a significant interaction between stimuli and 
hypnosis condition, F(1, 34) = 6.04, p < 
.025, and the F tests for the two hypnosis 
conditions show that the main effect was due 
entirely to the hypnotized subjects, F(1, 34) 
= 26.51, p < .001, and not to the unhypno- 
tized group (F = .40, ns). This difference in 
arousal response is even more striking given 
the fact that both hypnotized and unhypno- 
tized subjects showed a more rapid recogni- 
tion of the cue word (as evidenced by a 
shorter latency and higher amplitude of the 
first GSR following the cue). 

4 All reported values are two-tailed.

Two Mood-Adjective Checklists were completed by
each subject, one before the arousal stimulus and
one after. For purposes
of analysis, the adjectives on this list were 
divided a priori into four groups—positive 
(e.g., affectionate, overjoyed); negative (e.g.,
angry, clutched-up); active (e.g., energetic, 
vigorous); and passive (e.g., calm, quiet). 
Changes in mood responses were assessed by 
computing separate mean change scores for 
each category. The change scores for the
hypnotized subjects indicate the marked shift
in emotional state that resulted from their
arousal. Their mood became significantly
more negative, t(23) = 7.22, p < .001; less
positive, t(23) = 7.68, p < .001; and less
passive, t(23) = 7.68, p < .001. In contrast,
the unhypnotized subjects showed no changes 
on the negative, active, or passive mood in- 
dexes, although there was a tendency for 
them to report less positive feelings, t(11) 
= 2.07, p < .10. Overall, the hypnotized sub- 
jects showed a more extreme shift on a com- 
posite of these four indexes of mood than 
did the unhypnotized subjects, F(1, 34) = 
17.14, p < .001. 
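
A minimal sketch of the change-score computation described above, assuming each checklist is stored as a mapping from adjective to a numeric rating. Only the two example adjectives per category named in the text are listed; the full adjective lists and the rating scale are not given here, so the values used with this function would be hypothetical.

```python
from statistics import mean

# A priori grouping; only the example adjectives from the text are shown.
CATEGORIES = {
    "positive": ["affectionate", "overjoyed"],
    "negative": ["angry", "clutched-up"],
    "active": ["energetic", "vigorous"],
    "passive": ["calm", "quiet"],
}


def category_change_scores(before, after):
    """Mean change (after minus before) for each mood category.

    before and after map each adjective to a numeric rating
    (hypothetical storage format for the two checklists).
    """
    return {
        category: mean(after[adj] - before[adj] for adj in adjectives)
        for category, adjectives in CATEGORIES.items()
    }
```

A positive score means the subject endorsed that category more strongly after the arousal stimulus than before it; a negative score means less strongly.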

Because hypnotic subjects responded phys- 
iologically to the arousal cue in the acoustic 
sound chamber, it was assumed that they 
would respond similarly when they again saw 
this stimulus during the learning task. Un- 
fortunately, we could not directly assess the 
validity of this assumption because telemetry 
equipment was not available to us at the 
time of the study. Nevertheless, two in- 
direct measures of arousal do provide sup- 
port for our reasoning. One was the subjects’ 
recall performance on the learning task. 
Since the second of the two lists either did 
or did not contain the arousal cue, it was 
predicted that differences in subsequent 
arousal states might reduce cue utilization 
and affect the recall process (cf. Easterbrook, 
1959). Although there was no between- 
groups difference in overall recall accuracy 
on either list (about 60% accuracy), sub- 
jects in the arousal groups made significantly 
fewer errors of commission (M =.7) than 
did the unaroused and unhypnotized groups 
(M = 1.7) on the second learning list, t(34) 
= 2.54, p<.02. It appears that subjects 
performing under the distracting influence of 
arousal wrote down only the words they had 
stored in readily accessible memory and then 
stopped, without making guesses about the
remaining words. 

The second indirect measure of arousal 
was part of the final questionnaire. Subjects 
were given a list of eight physiological
symptoms and asked to indicate which (if any)
they were now experiencing. The percentage 
of subjects who checked each symptom was 
calculated for each of the experimental con- 
ditions. From 92% to 100% of the subjects 
in the aroused conditions reported experienc- 
ing each of the four suggested symptoms, 
whereas only 26% to 52% of the unaroused 
and unhypnotized subjects reported having 
even a single one of the symptoms. The
between-groups differences between these
proportions for each of the four symptoms
computed separately are all beyond the .02 level
of significance.

Assessment of the success of the posthypnotic
suggestion was provided by open-ended
questions on the final questionnaire, as well
as by the in-depth postexperimental interviews
conducted by the clinical psychologist.
Both sources of data clearly support the
conclusion that these trained hypnotic subjects
who were given an amnesia suggestion were
unaware of the true cause of their
experimentally induced arousal.


Perception of Confederate 


A check on the confederate manipulation
was derived from the mean score of [...]
bipolar scale items of seven alternatives
about the confederate’s emotional behavior
(sad-happy, angry-peaceful, friendly-unfriendly).
These data were, without exception,
in opposite directions for subjects exposed
to the happy versus the angry confederate.
Subjects in the happy condition
evaluated the confederate’s mood as significantly
more positive (M = +1.2) than the
neutral midpoint of zero, t(17) = 4.94,
p < .001. In contrast, the treatment mean for
subjects in the angry condition (M = [...])
was significantly more negative than zero,
t(17) = 7.45, p < .001. The difference
between the two confederate conditions was
highly significant, F(1, 30) = 83.16, p <
.001.
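
The comparisons against the neutral midpoint reported above are one-sample t tests. The sketch below shows the arithmetic under that reading; the function name and any ratings passed to it are hypothetical, and it returns only the t statistic and degrees of freedom, not a p value.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(scores, midpoint=0.0):
    """Return (t, df) for a one-sample t test of scores against midpoint.

    t = (sample mean - midpoint) / (sample SD / sqrt(n)), with df = n - 1.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    sd = stdev(scores)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("scores have zero variance; t is undefined")
    t = (mean(scores) - midpoint) / (sd / sqrt(n))
    return t, n - 1
```

With 18 ratings per confederate condition, the degrees of freedom would be 17, matching the t(17) values in the text.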
Table 2

                M no. of sociable behaviors      Self-reported emotion
                Verbal           Nonverbal       (happy minus angry)ᵇ
                Confederate      Confederate     Confederate
Conditionᵃ      Happy   Angry    Happy   Angry   Happy   Angry
Aroused         4.4     [...]    10.5    8       -1.0    [...]
Unaroused       4.3     [...]    8.2     4       +1.8    +2.5
Unhypnotized    2.5     9        6.0     1.6     +.8     +4

a n = 12 for each condition (6 per cell).
b Mean scores are based on the Schachter and Singer (1962) index, which ranges from +4 (most happy) to
-4 (most angry).


Overt Emotional Behavior 


Measures of subjects’ behavior were coded 
by two judges (interrater reliability was at 
least .80) who observed each subject during 
the 4-min emotional modeling period. For 
each of the 4 min, the judges coded the sub- 
ject’s verbal and nonverbal responses. Some 
of these responses were sociable ones (i.e., 
smiling, nodding, agreeing with the confederate),
while others were not (i.e., disagreeing,
ignoring, frowning). The overt behavior
of the subjects varied as a function of the 
confederate’s mood, but seemed unaffected 
by experienced arousal. As shown in Table 2, 
subjects in the presence of a happy confeder- 
ate exhibited a significantly higher amount 
of sociable behaviors than did subjects who 
were with the angry confederate. This was 
true for both verbal behaviors, F(1, 30) = 
20.97, p< .001, and nonverbal behaviors, 
F(1, 30) = 44.80, p < .001. 


Reported Emotional Experience 


Two scale items tapping extent of experi- 
enced anger and happiness were adopted 
from Schachter and Singer (1962). To the 
items, “How irritated, angry, or annoyed
would you say you feel at present?” and
“How good or happy would you say you feel
at present?” subjects responded on 5-point
scales ranging from 0 (not at all) to 4 (extremely).
Analyses were made of the difference
scores (happy minus angry), as well as of the
happy and the angry ratings separately. A 
broader portrait of the emotional state of 


each subject comes from eight additional bi- 
polar scales (e.g., happy-sad, angry-peace- 
ful, confident-apprehensive) with seven re- 
sponse alternatives. These responses were 
analyzed separately and then combined into 
a single index of emotional response ranging 
from extremely negative to extremely posi- 
tive. 
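
The difference score used for the Schachter and Singer index is simple enough to state as code. A minimal sketch, assuming each rating is an integer from 0 to 4 as described above; the function name is hypothetical.

```python
def emotion_index(happy, angry):
    """Happy-minus-angry difference score.

    Each rating is on the 0 (not at all) to 4 (extremely) scale, so the
    index ranges from +4 (most happy) to -4 (most angry).
    """
    for name, value in (("happy", happy), ("angry", angry)):
        if not 0 <= value <= 4:
            raise ValueError(f"{name} rating must be between 0 and 4")
    return happy - angry
```

A subject rating happiness at 2 and anger at 3 would receive an index of -1, slightly on the angry side of the neutral point.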

Although the confederate’s mood had a 
demonstrable effect on the subjects’ overt 
behavior, it did not exert the expected influ- 
ence on their reported emotional state. For 
the Schachter and Singer index, there was a 
significant main effect of arousal, F(2, 30) 
= 5.35, p<.025. A comparison of means 
(shown in Table 2) revealed that this was 
due to the more negative scores of the 
aroused condition (p < .005).⁵ Analyses of
each of the happy and the angry ratings sep- 
arately revealed the same significant differ- 
ence in emotional reactions between the 
aroused subjects and the other subjects with- 
out arousal. 

The mean ratings on the eight-scale index 
of emotional response are shown in Figure 1. 
Again, there was a significant main effect of 
arousal condition, F(2, 30) = 15.94, p< 
.001, and a comparison of means revealed 
that the aroused groups reported a signifi- 
cantly more negative emotional state (p < 
.001) than both the unaroused and the un- 
hypnotized groups (which did not differ from 


5 All comparisons between means following a find- 
ing of a significant F ratio were performed using 
Tukey’s honestly significant difference test. 
each other). This same pattern of results
was found for each individual scale. On the 
overall index, the aroused conditions reported 
negative emotions that were significantly dif- 
ferent from the neutral midpoint of zero, 
t(11) = 6.47, p < .001. While the unhypno- 
tized groups were neutral in their emotions 
(their mean score did not differ significantly 
from zero), the unaroused subjects rated 
themselves as feeling a positive emotional 
state, t(11) = 2.32, p < .05. Apparently, the
unaroused subjects were responding appro- 
priately to the posthypnotic suggestion that 
they would feel relaxed and good when they 
came out of hypnosis. The positive feelings 
of the unaroused group offer a dramatic com- 
parison level for the negative reaction re- 
ported by the aroused subjects (despite the 
same posthypnotic suggestion to feel good and
identical prior hypnotic training). This
positive response is particularly noticeable in 
the unaroused angry condition, although it 
did not differ statistically from the unaroused 
happy group. This finding may be due to a 
contrast effect (the unaroused subjects may 
have felt particularly good in comparison to 
the negative reactions of the angry confeder- 
ate) or to a self-persuasion effect (many of 
the unaroused subjects made reference to 
their positive feelings in an attempt to cheer 
up the angry confederate). 
Attribution of Causality


On the final questionnaire, subjects were 
asked if they knew why they felt as they 
did and, if so, to state the reason. These re- 
sponses were scored by judges who were 
blind to both the purpose of the experiment 
and the condition of the subject. The aroused 
subjects always reported reasons for a nega- 
tive emotional state, regardless of the mood 
expressed by the confederate. Some subjects 
said they were nervous about performing 
well on the experimental tests, others said 
they were feeling tense about upcoming 
final exams, and still others labeled their 
arousal as irritation generated by the con- 
federate’s “talkativeness.” In only a single 
instance did a subject explicitly state that 
the confederate’s mood determined his own: 
“The other guy was really hassled—it 
rubbed off.” 

The possibility that the posthypnotic sug- 
gestion had the effect of blocking any search 
for an explanation of the experienced arousal, 
rather than merely preventing the “correct” 
explanation, does not appear tenable. The 
number of aroused subjects (67%) who 
thought they knew the cause of their current 
feelings was as high as that of the unaroused 
and unhypnotized groups. However, at no
time did the aroused subjects state that the
reason was related to the experimentally
induced cause of posthypnotic suggestions.

Figure 1. Self-reported emotion (mean self-ratings on an eight-scale index of emotion).


Comparison of Arousal Control Groups


Both of the arousal control groups (informed
arousal and simplified arousal) displayed
the same pattern of physiological
arousal as the happy aroused subjects in the
first experiment. They showed the same increase
in heart rate, similar changes on the
Mood-Adjective Checklist, the same pattern
of recall errors, but not a similar change
in GSR.⁶ On the self-report of physiological
symptoms, all of the informed aroused subjects
reported experiencing the four suggested
arousal symptoms. The simplified
arousal subjects reported increased breathing
and heart rate, as expected. Surprisingly,
however, all of them also reported sweaty
palms, and two thirds reported a sinking
feeling in the stomach. Even though they
had not received any suggestion for these
[...] two symptoms, they generated them
spontaneously as correlates of changes in
respiration and heart rate. The perception
of the confederate by the two control groups
was the same as that of the happy aroused
group, and they showed the same pattern of
sociable behaviors (both verbal and nonverbal)
in their interaction with the confederate.

Overall, the two arousal control groups
were similar to the happy aroused group in
terms of physiological arousal, veridical
perception of the confederate, and overt
behavior. However, they differed in experienced
emotional state. The informed arousal subjects
reported a neutral set of feelings (which
did not differ from the zero midpoint), as
opposed to the negative feelings expressed by
the happy aroused group, t(10) = 1.90,
p < [...]. As expected, they all stated that the
reason for their experienced arousal was the
hypnotic suggestion. In contrast, the simplified
arousal subjects reported the same negative
emotion as the happy aroused group (on
[...] of the self-report measures) and gave
the same type of causal explanations. As can
be seen in Figure 1, their mean rating on
the emotion index was significantly more
negative than the neutral midpoint of zero,
t(5) = 4.15, p < .01. In other words, despite
a posthypnotic suggestion of only the two 
most basic physiological concomitants of 
arousal, these subjects who were unaware 
of the “appropriate” explanation for their 
arousal did not label their emotion as 
“happy” in accordance with the available 
cognitions provided by the confederate’s be- 
havior. Rather, they experienced a negative 
emotional state. 


Discussion and Conclusions 


The current study was designed as a modi- 
fied replication and extension of Schachter 
and Singer’s (1962) two-factor theory of 
emotion. The necessary conditions of strong 
physiological arousal, lack of an explanation 
for that arousal, and varied emotional-situa- 
tional cognitions were all achieved in an 
experimental setting that controlled for many 
of the methodological problems of the origi- 
nal study. However, the results reveal a re- 
markably consistent and coherent pattern of 
emotional response (especially noteworthy 
given the small sample size) that is at vari- 
ance with the interaction pattern predicted 
by Schachter and Singer. In all cases, subjects
with unexplained arousal reported negative
emotions, irrespective of the confederate’s
mood.⁷ This finding stands in opposition
to the position that unexplained physiological
arousal is a neutral, energizing variable
that does not contribute directly to the
qualitative labeling of the emotion. Whereas
Schachter and Singer postulated that the
lack of explanation for arousal would motivate
people to search in an unbiased way for
an appropriate cognition, the present study
suggests that it not only motivates but biases
the search, since it tints perception of that
arousal with negative affect. The fact that
Marshall and Zimbardo (1979) also found
self-reports of negative affect among subjects
who lacked an explanation for their strong
physiological arousal (and focused their
attention on that discordant state) lends
additional support to this alternative view.⁸

What might account for this negative emotional
bias? One possibility is that since
there are more terms referring to negative
emotions than to positive ones (e.g., Izard,
1971), people might report negative emotions
more often on a purely statistical basis alone.
Second, if noticeable physical arousal is indeed
a more frequent correlate of negative
emotional experiences in everyday life, then
the negative affect reported by subjects in
the arousal conditions could be regarded as a
logical learned response based on prior
experiences. Third, as proposed earlier, it may
be that unexplained arousal is always
characterized by negative affect because it is
akin to a state of free-floating anxiety.

The results of the current study suggest
that a negative emotional bias will have
implications for the person’s search for causal
information. Subjects who reported negative
feelings almost always provided an explanation
for them, such as “I’m upset because I
don’t do well on tests” or “I’m annoyed because
this guy keeps joking all the time.”


6 The only inconsistency in the comparison of both
the simplified arousal and informed arousal groups
with the aroused condition tested initially is the
failure of the former to display the same expected
changes in GSR. While this may reflect a lower
level of physiological arousal, this interpretation is
weakened by the fact that these subjects did show
the same pattern and degree of change in heart rate
and reported awareness of physiological symptoms.
A more probable explanation is that since these
two groups were run several months after the original
conditions, the source of invalidity might have
been instrument variation, through unmonitored
changes in the electrodes or GSR coupler.

7 It might be argued that the minimal impact of
the confederate on the subject’s emotional state
was due to the fact that subjects first experienced
the physiological arousal while alone (in the acoustic
chamber), prior to being aroused for a second
time in the presence of the confederate. However,
the same pattern of consistently negative affect
occurred during the pretesting for the experiment,
when subjects received the hypnotic arousal induction
in the presence of the comparison person.
Furthermore, in subsequent research (Zimbardo &
Maslach, Note 1), subjects received the hypnotic arousal
induction in the presence of at least one other
comparison person, and again there was the same
pattern of negative affect. The fact that the data from
these two studies parallel the results in the current
experiment suggests that the procedure surrounding
the initial induction (i.e., subject alone vs. subject
with confederate) was not a crucial element in the
subject’s search for an emotional explanation.
In contrast, subjects reporting positive or
neutral emotions rarely indicated their cause
and gave, instead, such nonexplanatory statements
as “I don’t know, it’s just the way I
normally feel.”⁹ In line with the proposition
by Jones and Davis (1965), it seems as if 
negative experiences are more likely to mo- 
tivate a search for causal information. People 
want to know why they feel upset, fright- 
ened, or angry, perhaps because they want 
to control (and thus reduce) the future oc- 
currence of such experiences. On the other 
hand, they are less motivated to know why 
they feel happy, pleased, or content, perhaps 
because they consider such feelings to be 
their normal baseline condition as opposed 
to a “deviant” response that demands ex- 
planation. 


Social Influences on Emotion 


An important conclusion to be derived
from the current research is that experienced
and expressed emotions are not perfectly
correlated. Although subjects with unexplained
arousal displayed sociable behaviors while
in the presence of the happy confederate,
their self-reported mood state was not positive.
Their comments imply that their overt
behavior was more influenced by social
contingencies (i.e., norms of social appropriateness)
than was their more private feeling
state. Such deliberate “management” of one’s
emotional expression is a fairly common
experience for people, as when, for example,
one laughs and acts happily with friends
even though feeling depressed or hurt by
some remark. To assume that one’s behavior
and self-reported mood are equivalent
indexes of emotion is to ignore functional
differences between these two response classes.


8 It is important to note that Marshall and
Zimbardo’s (1979) research provides a connecting link
between the current study (which found a significant
negative emotional bias) and the Schachter
and Singer (1962) study (which found no systematic
biases from the baseline of the placebo control
condition). In the conditions that were an exact
replication of Schachter and Singer’s experiment,
Marshall and Zimbardo “replicated” the original finding
of no difference between epinephrine and placebo
subjects. Some of the physiological data suggested
that the induced arousal was not very strong or
salient for many of the subjects, and this may
explain why their responses did not differ from the
unaroused placebo controls. However, when Marshall
and Zimbardo did make the physiological arousal
more salient for the subjects, they reported a
negative emotional state. The hypnotic manipulation
used in the current study also made the arousal
salient for the subjects, and their consistently
negative emotional responses replicate those found by
Marshall and Zimbardo in their “salient arousal”
conditions.

9 In a personal communication (1976), [...]
Blum reported that subjects in his research on
[...]tive arousal (Blum, 1972) showed a similar pattern
of responses, giving more causal statements about
their negative feelings and fewer about their
positive feelings.

This issue is relevant for interpreting the 
results of Schachter’s follow-up study on the 
effects of variations in unexplained arousal 
on amusement (Schachter & Wheeler, 1962). 
In this experiment, subjects received an in- 
jection of either epinephrine, a placebo, or 
chlorpromazine (a depressant drug) and 
were led to believe that there would be no 
side effects. They then watched an excerpt 
from a slapstick movie. Whether or not sub- 
jects labeled their arousal state in terms of 
the available cognition of “amusement” was 
measured by self-reported evaluations of the 
film and observation of their amusement 
behavior. Even with large numbers of subjects,
there were no significant differences
between the three conditions on the self-report
measures. The epinephrine subjects
and the unaroused placebo controls did not
differ on the overall behavioral index (although
there was a significant difference on
the number of laughs, as opposed to smiles
and grins). The major finding was the depressed
behavioral response of the chlorpromazine
subjects, which may have been a
direct somatic effect of the tranquilizer.
This lack of correspondence between amuse- 
ment behaviors and self-reported evaluations 
of amusement, the lack of variation across 
treatments in self-report, and the minimal 
differences between the epinephrine subjects 
and the placebo controls obviate any serious 
challenge of the Schachter and Wheeler 
(1962) study to the results or line of reasoning
of the present study.

Another conclusion to be drawn from the
current research is that people utilize far
more sources of information in generating
their emotional explanations than was initially
suggested by Schachter and Singer’s
(1962) analysis. The aroused subjects in the
current study perceived correctly the emotion
being expressed by the confederate, but they
did not necessarily use this emotion to label
or explain their own state. Their emotional
explanations appeared to be a complex function
of their past experience, current life
situation, and the immediate situational
circumstances, rather than just the mood of
the confederate. Not only does this finding
extend the notion of “cognitive search” but
it also points to the necessity of reconceptualizing
the type of information that is presented
by the confederate.

Schachter and Singer’s model is based on 
a theory of social comparison (Festinger, 
1954) that argues that self-evaluation is often 
accomplished by comparing oneself with 
other people. This is particularly true when 
one has doubts about the appropriateness or 
correctness of one’s feelings, beliefs, or be- 
havior. According to this viewpoint, what is 
provided by the confederate (the social com- 
parison referent) is normative information. 
The confederate models the “appropriate” 
way to label the arousal, and the subject 
adopts a similar response in order to appear 
(both to self and others) as normal. While 
normative information indicates how one 
should feel and act, it does not necessarily 
indicate why one should feel that way. In 
other words, normative information is not 
always causal information as well. Schachter
and Singer argue that the lack of causal information
about one’s aroused state is what
motivates the search for an emotional cognition.
However, in postulating that this cognition
is obtained via social comparison processes,
they are suggesting that the ultimate
solution to the search is normative informa- 
tion. If only normative information were 
needed, then the confederate’s reaction would 
provide a sufficient basis for the subject to 
evaluate his or her emotion. However, if the 
subject wants information as to why the 
arousal response is occurring, then the con- 
federate’s overt reaction becomes evidence 
only for the fact that an unknown cause is 
generating a strong reaction in others as well. 
Therefore, the overt reaction is not itself the
cause. The subject will look beyond the 
confederate’s reaction for such causal in- 
formation and may arrive at an emotional 
cognition that differs from the emotion being 
expressed by the confederate. 

As mentioned earlier, a state of unex- 
plained physiological arousal may not be an 
appropriate paradigm for studying common 
emotional experiences. However, the negative 
emotional bias associated with unexplained 
arousal suggests that it may be a useful 
paradigm for studying emotional pathology, 
particularly if it is viewed as just one of a 
class of anomalous personal and social ex- 
periences. Such anomalous reactions are anxi- 
ety provoking because they represent a 
threat to self-control, and they will usually 
generate a search for suitable causal linkages. 
A rational, information-processing model
might assume that people engage in a scientific
(unbiased) search, in which they seek
out and consider all possible evidence in an
objective way. However, it may be more
correct to assume that they are biased
towards certain types of explanations as a
result of their past history of reinforcement
for particular classes of explanation or ways
of thinking about new experiences. We are
currently conducting exploratory research on
this biasing process, again using hypnosis as
a methodological tool to create unexplained
arousal.


10 In a personal communication (1977), Stanley
Schachter argued that the discrepancy in results
between the current research and the Schachter and
Singer (1962) study may be due to differences in
the timing of the confederate’s behavior. In their
study, an attempt was made to have the confederate’s
routine precede or occur simultaneously with
the subject’s experienced arousal. In the current
research, the confederate’s behavior followed the onset
of arousal by about one minute (the duration of
the recall task). According to Schachter, subjects
in the current experiment would not have used the
confederate’s emotion to label their own, since it
occurred after the onset of arousal and thus does
not qualify as a prior cause. However, this argument
has several flaws. First of all, it continues to
confuse whether normative or causal information is
being provided by the confederate. If it is normative
information (as Schachter and Singer’s theory
implies), then the confederate’s behavior is a clue
to a possibly shared prior cause but is not the cause
itself. The confederate’s behavior need only occur
at about the same time as the subject’s arousal to
suggest a common, proximal causal event; it is not
necessarily required to occur prior to it. Second,
Schachter’s argument ignores the fact that Marshall
and Zimbardo (1979) used the same timing procedure
as in the Schachter and Singer study and obtained
results supportive of the current research.
Third, even if Schachter’s argument were correct, it
would not explain the highly consistent negative
bias of the aroused subjects. If they were indeed
forced to interpret their arousal in terms of events
preceding the confederate’s behavior, then we would
expect to find far more variability in the affective
quality of their emotional explanations than was
actually the case.

Furthermore, following from an earlier ar- 
gument, the study of the process of searching 
for causal explanations would benefit from 
an experimental analysis that focused on 
children’s reactions to novel or anomalous 
events (e.g., first headache or nightmare). 
This line of work suggests that emotional 
pathology may not be an impairment of one’s 
cognitive abilities, but a normal cognitive
process used to explain an unusual state or
event. It extends and expands Maher’s 
(1974) hypothesis about the perceptual gene- 
sis of delusional thinking and may provide 
a unique social psychological perspective on 
the dynamics of abnormal behavior. 


Reference Note 


1. Zimbardo, P. G., & Maslach, C. Biased searches 
for causal explanations of experienced discon- 
tinuities. Unpublished manuscript, 1979. 
Postscript


In their attempt to dismiss the relevance of 
the data generated by the present paradigm for 
understanding the role of unexplained arousal in 
emotion, Schachter and Singer (1979) focus 
solely on the issue of timing. The points I will 
address briefly in this rejoinder are designed to 
(a) clarify the precise timing sequence in the 
present study; (b) question the generalizability 
of a model of emotion that depends on such a 
restricted temporal synchrony between arousal 
and social cognitions; (c) highlight the impor- 
tance of the subject’s awareness of a discontinu- 
ous arousal state; and (d) reaffirm the distinc- 
tion between two alternative processes a person 
may engage in when “bodily symptoms search 
for an explanation”: the normative search and
the causal search.


The Timing Sequence and Awareness 


In my study, the arousal cue was followed by 
a 1-min recall test and a 6-sec experimenter
statement that “the next part of the study will 
take a few minutes to set up. In the meantime, 
you might want to look through the folder ma- 
terials.” The very first minute of the confeder- 
ate’s routine was not affectively neutral, since it 
included smiles, humming, and verbal pleasan- 
tries in the happy condition and frowns and 
verbal expressions of annoyance in the anger 
condition. Thus, between 1 min, 6 sec and no 
more than 1 min, 30 sec elapsed from arousal 
cue to situational manipulation of “happy-angry”
social cognition. I submit that this temporal
sequence is adequate to allow the confederate’s
behavior to serve as a reasonable social
context for the subject’s unexplained arousal.


Indeed, if it is not, then neither is the timing
in the original Schachter-Singer (1962) anger 
condition. The stooge’s display of affect in that 
study might well have occurred after most sub- 
jects started experiencing their epinephrine-in- 
duced reactions. About 1 min is reported to 
have elapsed before subject and stooge were 
brought together. Then they had to take some 
additional minutes to complete the first of five 
pages of questionnaires, which “start off inno- 
cently by requesting face sheet information. . . . 
The stooge . . . paces his own answers so that 
at all times subject and stooge are working on 
the same question” (p. 385). This procedure 
allows loose subject-controlled temporal vari- 
ations where Schachter and Singer demand tight 
experimenter control. Since the onset latency 
of epinephrine symptoms varies across subjects, 
some “anger” subjects would have been aroused 
several minutes before the first anger cognitions 
were available, some at about the same time, 
and some slightly after. The timing sequence is 
less well regulated than in my study and violates 
Schachter and Singer’s condition that social cog- 
nitions must precede unexplained arousal in or- 
der to be utilized as cues to emotional state. 
Furthermore, if precise timing is as critical as 
Schachter and Singer state in their comments 
on my research, then it must be individually 
determined for each subject and not left to “on 
the average” variations in approximate time
before symptom onset and approximate time 
before stooge’s display of affect. 

Although I address the timing issue directly 
in Footnotes 7 and 10, Schachter and Singer 
(1979) have deliberately chosen to dismiss it 
with their cavalier “footnotes notwithstanding” 
(p. 990). In rebuttal to their description of the 
procedures experienced by subjects in the pres- 
ent study and to their comments in their Foot- 
note 1, I assert that the pretest I conducted did 
not involve the critical timing issue raised by 
Schachter and Singer. Subjects were induced to
experience unexplained arousal in the presence
of another subject who was already expressing 
anger or happiness. Similarly, in the unpublished 
study by Zimbardo and Maslach (Note 1), the 
timing criticism is again not an issue, and the 
consequence of experiencing unexplained arousal 
is still negative. 

If, as noted earlier, 1-minute differences are
so critical, then the generalizability of this model
of emotion is severely restricted. It is perception
of one’s arousal state that motivates a search
for a causal explanation. The psychologically
relevant variable is the timing of the subject’s
awareness of his or her arousal and available
cognitions, and not the time when the drug was
injected, when an arousal cue was given, or even
when measurable physiological symptoms first
appeared. We know that awareness varies considerably
across subjects in terms of individual
differences in self-monitoring and sensitivity to 
autonomic discontinuities. The proximal social 
cognitions that immediately precede this per- 
ception of arousal may be the most salient can- 
didates for a causal explanation, but they are 
not the only ones. Schachter and Singer (1962) 
manipulated one class of attributes for the ex- 
perienced arousal, namely a physically present 
other person. As befits social psychologists, they 
chose social cognitions as the reasonable context 
for unexplained arousal. But the subjects could 
have chosen otherwise, from events in their past 
or anticipated future, from nonsocial cognitions 
in the current physical environment, or from 
health and biological sources. The immediacy of 
the social cognition increases its probability of 
being used as the basis for transforming physio- 
logical arousal into a psychological emotional 
state, but it is not a sufficient condition for that
to occur. Furthermore, there is no reason to 
assume that the search process is usually com- 
pleted in a matter of minutes. Provisional ex- 
planations may be generated, tried out, rejected 
for others, changed by new feedback, and so on. 
Yes, the confederate in the Schachter-Singer 
paradigm will be of little effect if he starts after
the subject has already chosen a mood label (as 
Footnote 1 of their 1979 Comments states), but 
that choice may require some time before it is 
final, especially in novel situations. 


The Normative-Causal Distinction 


Schachter and Singer’s (1979) Footnote 1
credits me with a “new” distinction between the
normative and causal aspects of the cognitive
search to account for unexplained arousal. (I
will accept that as a compliment and not interpret
“new” to mean untenable because not yet
empirically validated.) More than the timing
issue is involved in recognizing the way these
different psychological processes operate when
an individual tries to make sense of a bodily
state of insufficiently explained arousal.

The normative search entails seeking cognitions
that one’s reactions are situationally appropriate
and socially acceptable. The person
wants to know how to behave in an unfamiliar
setting or to discover what range of responses
is expected. In the process of looking for this
“how,” the person often comes up with a suitable
“what” as well. “What” is the label, or
socially agreed upon term, that normalizes an
otherwise idiosyncratic and ambiguous reaction.
The causal search entails seeking cognitions
that provide information about likely causes and
consequences of one’s perceived arousal. This
search for “why” starts with an effect to be
explained in terms of reasonable causal antecedents.
Such an analysis need not necessarily
involve any social cognitions (although these, 
of course, are central to any normative search 
process). 

The search for a causal explanation, then, is 
an attempt to establish effect-cause linkages that 
help make sense of or bring rationality out of 
uncertainty. On the other hand, the search for 
an emotion label and for the “right” way to 
react may induce emotional contagion merely as 
a compliance phenomenon.

Schachter has operated primarily out of a 
social comparison framework since his early 
work on affiliation and anxiety. Because of this 
theoretical orientation, he either emphasizes the 
normative search or fails to acknowledge ade- 
quately the causal alternative. When subjects in 
novel or uncertain situations seek to discover 
how to behave and what to feel, the confeder- 
ate’s affective display is of little relevance to 
explaining why the subject is suddenly aroused. 
It serves only a labeling function (“I don’t know 
why, but I am feeling angry, or acting happy,” 
etc.). Or it serves to normalize strange reactions 
(“I don’t know why I’m reacting like this, but 
at least I’m not the only one”). 

If the search is normative, then the timing of 
subject arousal and confederate’s behavior 
should be of little consequence as long as both 
occur in the same setting (so that the emotional
state is perceived as shared), the other person 
is an acceptable social comparison referent, and 
there is no other readily available label for the 
arousal. Timing should be of importance in the 
causal search only to the extent that the rela- 
tive contiguity of the confederate’s response 
narrows the search to common elements of the 
situation that might have caused both it and the 
subject’s own internal reaction. 

In conclusion, I am willing to accept the sub- 
stance of Schachter and Singer’s characterization 
of what my study “proves,” when stated in the 
following form: When subjects are aroused for 
no apparent reason, they report themselves to be 
in a “lousy mood” regardless of whether a con- 
federate has been acting happy or angry in their 
presence and regardless of whether they them- 
selves have displayed happy or angry behaviors. 
They do so more than do unaroused controls or 
those who are aroused and know the cause. This 
is indeed the major finding of my study: that 
strong unexplained arousal per se is typically 
perceived as a negative state by adults and not 
as an undifferentiated, affectively neutral state, 
as the Schachter-Singer model proposes. At 
this point, I assume further that both the search 
for a causal explanation and the normative 
search for an appropriate label will be biased in 
negative directions as a result of prior experi- 
ences in which perception of strong arousal is 
more often associated with aversive events than
with pleasant ones.
Affective Consequences of Inadequately Explained
Physiological Arousal 


Gary D. Marshall and Philip G. Zimbardo 
This research reexamined Schachter and Singer’s reported demonstration of
cognitively influenced physiological determinants in the experience of emotion. 
In the present study, reports of affective state by placebo-injected and epi- 
nephrine-injected subjects, who were misinformed about possible somatic effects 
and exposed to a euphoric confederate, did not differ. Closer inspection of the 
Schachter and Singer results also indicates a failure of inadequately explained 
epinephrine-produced physiological arousal to enhance affective susceptibility to 
the social environment. Additional evidence provided by the present study re- 
veals a tendency for inadequately explained epinephrine-induced physiological 
arousal to produce negatively toned reports of affect. The results for additional 
experimental conditions indicate that (a) manipulations to heighten the experi- 
ence of physiological arousal produced greater negative affect; (b) similarly, 
expectation of epinephrine side effects produced negative affect; and (c) the 
affective tone of the social environment had minimal effect on arousal subjects. 
These negative affective reactions occurred in a social context where a
confederate was perceived to be happy.


The theorizing and research on emotions 
by Schachter and Singer (1962) must be 
counted among the primary forces responsi- 
ble for redirecting the field of social psychol- 
ogy toward a more cognitive orientation 
(Mandler, 1966, 1975). Their classic study 
strongly emphasized the important role played
by cognitive influences in determining the
role of physiological arousal in affective
experience.

The Schachter and Singer study examined 
the contribution of both cognitive and physio- 
logical determinants by using a distinctively 
social psychological research strategy in which 
both factors were manipulated independently. 
Physiological arousal was controlled by providing
injections of either epinephrine or a
placebo. Cognitive factors regarding the
arousal were controlled by providing accurate
or inaccurate information regarding symptoms.
In addition, the social situation was experimentally
manipulated by means of a confederate
who behaved either euphorically or
angrily.

The experience of emotion in this situation 
was reportedly shown to depend on the in- 
teraction of (a) the presence of physiological 
arousal (via epinephrine injection), (b) the 
lack of an adequate explanation for this 
arousal, and (c) the particular affective in- 
formation provided by the social environment. 


Copyright 1979 by the American Psychological Association, Inc. 0022-3514/79/3706-0970$00.75 
EMOTION AND INADEQUATELY EXPLAINED AROUSAL


It was intriguing to discover the reported ease
with which the same condition of physiological
arousal could be steered in as seemingly
opposite affective directions as happiness or
anger as a result of these cognitive manipulations.
The Schachter and Singer report thus
constituted a strong argument for a common
physiological substrate for different emotions
(Fehr & Stern, 1970). It also suggested the
existence of an “emotional plasticity” that
had apparent therapeutic significance (Nisbett
& Schachter, 1966; Ross, Rodin, & Zimbardo,
1969; Storms & Nisbett, 1970, among others).

Despite its importance, there have been
few replications of the Schachter and Singer
(1962) study reported in the literature. However,
several critiques of the study have appeared
(Averill & Opton, 1968; Lazarus,
1967; Leventhal, 1974; Plutchik & Ax, 1967;
Shapiro & Schwartz, 1970; Stein, 1967).
These critiques have questioned the adequacy
of some of Schachter and Singer’s basic conclusions,
and, given the widespread citation
of the study, the importance of a direct attempt
at replication is obvious.

The present research design was limited to
the “euphoric social cognition” condition, because
of constraints imposed by the Stanford
University Medical School Human Subjects
Committee.1 Furthermore, we chose to concentrate
on the information condition that
produced the most euphoric outcome for
the physiologically aroused subjects in the
Schachter and Singer study, the one in which
subjects were actively misinformed regarding
[...] receive an injection, [...] support the
cover story, obtaining pretest measures in
order to directly assess change in affect as a
result of the manipulations, and
continuous physiological monitoring, including
additional self-report measures (see
Marshall, 1976, for further details).

To further explore the arousal-affect rela- 
tionship, four additional comparison groups 
were tested. These were introduced to assess 
the effects of heightened physiological arousal, 
the influence of the confederate’s behavior, 
and the effects of information about the 
likely somatic effects of the injection. The 
fundamental comparison for assessing inade- 
quately explained physiological arousal, how- 
ever, is that between the basic placebo sub- 
jects and the basic epinephrine-injected sub- 
jects (who were misinformed about what to 
expect from the injection). Both groups were 
exposed to a euphoric confederate. 


Method 


Overview 


The basic design of the study involved manipula- 
tion of physiological arousal via injections of epi- 
nephrine or placebo, misinformation about the drug’s 
anticipated somatic effects, and exposure to a con- 
federate who behaved euphorically. In the additional 
conditions, the following factors were systematically 
altered: the method of determining epinephrine dosage,
the focus of the subject’s attention, the confederate’s
behavior, and the type of somatic effects
expected. Heart rate was continuously monitored
by telemetry. Pre- and posttreatment self-reports
of affect were collected. The subject’s behavior while
in the presence of the confederate was both rated
directly by observers and videotaped for subsequent 
scoring. 

As volunteers for an alleged study of the effects
of a special vitamin on vision, a pair of subjects
(one a confederate) went through the different
phases of experimental procedures together. First,
following an explanatory overview of the session,
they completed a series of questionnaires that provided
an evaluation of their preinjection physical
and affective state. Then they took specious
“pre-vitamin-injection” vision tests and had a baseline
heart rate measurement while performing visual tasks.
Next a physician administered the injection (alleged
to at least subtly affect visual activity), with
accompanying information regarding expected side
effects. During a 15-minute waiting period, supposedly
before the drug would begin to take effect,
the confederate engaged in a prearranged sequence
of euphoric activities. At the end of the waiting
period, measures of self-reported affect, experienced
symptoms, and an evaluation of the confederate
were obtained. The specious “drug condition” vision
tests were then readministered in keeping with the
purported rationale of the study. The physician and
the experimenter then examined and interviewed each
subject, who was dismissed only when his pulse and
blood pressure were normal and there were no manifest
signs of arousal. The duration of each session
was about 90 minutes. Finally, a separate and thorough
debriefing session was held for all subjects at
the completion of the study.


1 The “angry” condition was disallowed because
the committee believed it unethical to induce anger
in unsuspecting subjects (assuming the validity of
the Schachter and Singer conclusions). The concern
of the committee about possible adverse reactions to
epinephrine injections in normal young adult males
was based on an unreferenced comment in a general
pharmacology reference text. In over 100 administrations
of epinephrine of the dosages reported
in this article, there was not a single adverse reaction.
The precautions we took of screening for potential
overreactions, availability of a cardiologist,
and continuous monitoring of physiological functioning
proved to be sufficient safeguards and should
be used in further investigations of the psychological
effects of this drug in normal populations.


Subjects 


The subjects were 85 male undergraduates attend- 
ing Stanford University. They were initially re- 
cruited by means of a notice asking for males over 
18 years of age to join the “Stanford Subject Pool 
for Drug Research.” The medical records of these 
subjects (maintained at the student health center) 
were examined by a physician. Subjects were ex- 
cluded from the subject pool if they met any of 
several criteria that were contraindicative of a
normal-range reaction to epinephrine or that might
indicate a prior history of epinephrine use (e.g.,
asthma, a respiratory condition, high blood pressure)
or if they were currently using medication. Fifteen
volunteers were thus judged ineligible to participate.

Subjects were individually contacted by phone 
and invited to participate in a study “being con- 
ducted by a member of the ophthalmology depart- 
ment concerning the influences of an injection of a 
vitamin supplement on visual activity.” A standard
medical interview was conducted over the phone,
both as a further screening device and to lend
credence to the cover story. (Several additional
volunteers were eliminated through this second phase of
screening.) The eligible volunteers were then randomly
preassigned to one of six experimental conditions,
which are described subsequently.


Procedure 


In keeping with the purported medical nature of 
the study, the experiment was performed in the stu- 
dent health center, where appropriate signs directed 
the incoming subjects to the “vision testing area.” 
The subject and a male experimental confederate 
of comparable age went through all of the phases
together. At the beginning of the session, the experimenter
presented an overview of the experiment to
them. He repeated the description of the study as
an ophthalmology experiment dealing with vitamins
and visual activity and described the tandem procedures
for the repeated vision tests and “physiological
monitoring.” Subjects were told that the
physician would administer the vitamin injection
following the completion of the initial series of
tests. The injection was described as having no
major side effects, although the experimenter indicated
that some minor, transitory effects had been
reported by some previous subjects. Misinformation
was given regarding these somatic effects; instead
of the usual epinephrine sequelae, subjects were led
to expect dryness of the throat, slight headache,
and some slight coolness in toes and fingers. Subjects
were told that they would have to wait about
15 minutes after the injection before the vitamin
would be affecting the appropriate visual structures,
and then they would proceed with the second series
of eye tests, physiological recordings, and questionnaires.

After the experimenter delivered this overview,
the first set of questionnaires was administered.
These contained self-ratings concerning vision, physical
condition, and current affective state. When the
forms were completed, the participants proceeded
to the “baseline testing” phase of the experiment.
The vision tests consisted of reading a standard eye
chart with and without a pair of field-inverting
goggles and taking the Speed of Color Discrimination
Test (Messick, 1964). The “preliminary physiological
monitoring” period, which included several visual
tasks, initiated the continuous recording of the subject’s
heart rate as he proceeded through the rest
of the experiment. Silver disk surface electrodes were
attached to the center of the subject’s upper and
lower sternum. These were then connected to a small
radio transmitter (Narco Bio-Systems, FM1100-…)
placed on the back of the subject’s shoulder. The
transmitter was described to the subject as a “relay
box” that was connected to the polygraph (E & M
Physiograph) immediately behind him. In actuality,
the signal from the transmitter was picked up by
two FM receivers (Narco Bio-Systems, FM1100-6).
One was connected to the polygraph in the vicinity
of the subject; the other was connected to a polygraph
(Beckman Type R dynagraph with a 9806-A
coupler and a 9857 cardiotachometer coupler) concealed
from view in another room.

After the subject performed several “eye movement”
tasks, a 1-minute resting baseline measure of
heart rate was obtained. As the subject’s “initial
physiological monitoring” period was being completed,
the physician entered the room and pretended
to administer an injection to the confederate, who
had just completed his baseline vision testing. The
experimenter conspicuously disconnected the “relay
box” from the adjacent polygraph and informed the
subject that no further physiological recording would
be done until the second part of the experiment. In
actuality, the subject’s heart rate information was
being continuously telemetered by the radio transmitter
to the receiver and polygraph in the other
room.

Manipulation of physiological arousal. After the 
physician finished with the confederate, he entered 
the physiological recording area where the subject 
was waiting. As the physician prepared to give the 
injection, he repeated both its earlier description as 
a “multicomponent vitamin supplement called D-27” 
and the earlier statement regarding its likely somatic 
effects. The injection was then administered
subcutaneously to the subject’s upper right arm.
Subjects in the epinephrine-basic group received an
injection of .5 cc epinephrine (a 1:1000 solution of
Parke-Davis Adrenalin). The placebo-basic group 
received a .5-cc injection of bacteriostatic water. 
Only the physician (a cardiologist) was aware of 
the contents of specific injections. The experimenter 
then escorted the subject to the adjacent waiting 
area, where the confederate was already seated at a 
table. The rationale for the 15-minute waiting period
was restated. After instructing the two participants
simply to make themselves comfortable, the
experimenter excused himself and left them alone
in the room.


Manipulation of social environment. The confed- 
erate enacted a carefully rehearsed sequence of be- 
haviors that was closely patterned on the behavior 
of the “euphoric” accomplice in the Schachter and 
Singer (1962) study. The confederate had been in- 
structed to be friendly and responsive to the subject 
but to maintain a primarily independent role. 

The programmed “euphoric” script of the confederate,
both verbal and nonverbal, followed this
approximate chronology (which began within a minute
after the subject’s injection):

1. Minutes 1 through 3. Looks around, examines
materials on the table, sketches or doodles, hums.

2. Minutes 4 through 5. Crumples paper into a
ball, tosses it at wastepaper can; repeats with a
couple more sheets of paper; gets up, retrieves misses,
tries different types of shots from different positions;
tosses one to the subject, inviting him to try a shot.

3. Minutes 6 through 8. Sits down again, picks up
a piece of paper, makes a paper airplane and flies
it; makes another one, flies it; retrieves plane from
the floor and flies it in the direction of subject;
repeats flying plane.

4. Minute 9. Notices and tries a swivel-type foot
exerciser on the floor.

5. Minutes 10 through 12. Picks up paper airplane,
sits down and plays with materials on the
table; makes a slingshot with a rubber band; tears
off a piece of the airplane to use as “ammunition”
and shoots it at a clock on the far wall; while
retrieving shot, notices old, empty folders on a [illegible]
of equipment and sets them up as a target; makes
more “ammunition” and continues to shoot at the
target.

6. Minutes 13 through 15. While picking up “ammunition”
from floor, notices hula hoops against the
wall behind some exercise equipment (in the adjoining
physical therapy room), picks one up and tries
a couple of times to rotate it freely around his hips,
and then spins it across the room toward the subject;
continues spinning the hoop on the floor until
the sound of a door opening indicates the experimenter’s
imminent return; replaces hoop and returns
to his seat.

When the experimenter reentered, he stated that 
as soon as the second set of questionnaires was com- 
pleted, the participants would proceed with the 
physiological monitoring and vision tests “under 
vitamin conditions.” These questionnaires were the 
same as the first series but also included both a 
checklist and ratings of a variety of experienced 
physical symptoms. An additional form asked both 
participants for a rating of their experimental part- 
ner. The purpose of this “partner evaluation” (really 
a check on the perceived mood of the confederate 
and also of suspicion as to his role) was rationalized 
as necessary to match subjects for a future phase 
of the study. When the questionnaires had been 
completed, the initial vision tests and tasks were 
repeated. Following this, the subject was brought to 
another room where the physician checked his pulse 
rate and blood pressure. The experimenter gave a 
brief exposition about the experiment, paid the sub- 
ject, and requested that he not discuss the experi- 
ment. A complete debriefing session was postponed 
for several weeks until all of the subjects had been 
tested. At these subsequent sessions subjects filled 
out a modified form of the Autonomic Perception 
Questionnaire (Mandler, Mandler, & Uviller, 1958) 
and were then completely debriefed. 


Experimental Conditions 


Sixteen subjects were randomly assigned to each
of the two basic experimental conditions. The
epinephrine-basic group received the epinephrine
injection, while the placebo-basic group received the
placebo injection. In all other respects the procedure
outlined above was identical for both conditions.
Four other groups (with 12 subjects in each) were
utilized to provide essential comparative data. In
the epinephrine–neutral confederate condition, the
confederate did not behave euphorically, but instead
he was merely passively pleasant. In the placebo–arousal
symptoms group, the information regarding
anticipated consequences of the injection was changed
to that appropriate for epinephrine. The final two
groups represented an attempt to strengthen the
manipulation of physiological arousal. In the
epinephrine–increased arousal group, the dosage of the
epinephrine injection was individually adjusted for
the subject’s body weight. In the epinephrine–increased
salience group, subjects not only received
the adjusted dosage but also engaged in activities
that were designed to increase the salience of their
arousal.

Epinephrine–neutral confederate group. In this
condition every aspect of the procedure, with the
exception of the confederate’s behavior, was identical
to that for the subjects in the epinephrine-basic
group. The confederate, instead of acting in a
euphoric manner, offered no dramatic cues as to his
emotional state. When the subject and confederate 
were brought together in the waiting room after the 
injection, the confederate casually removed a paper- 
back book from his back pocket and proceeded to 
read it for the entire waiting period. He responded 
in a friendly, pleasant manner to any questions that 
were asked, but then returned to his reading. 

The primary reason for this neutral confederate 
condition was to provide a comparison level for
assessing the extent of affect induction elicited by 
the euphoric confederate in epinephrine-injected sub- 
jects. The neutral confederate condition permits a 
direct evaluation of the impact of varied social en- 
vironments, holding inadequately explained physio- 
logical arousal constant. 

Placebo–arousal symptoms group. In this condition,
a group of placebo subjects was told that the
symptoms expected from the injection included those 
actually appropriate to an injection of epinephrine. 
This information was provided during the overview 
at the beginning of the session and again just prior 
to the injection. Subjects in the placebo–arousal
symptoms group were told, “You may notice that 
your hand will start to shake, your heart will start 
to pound, and your face may get warm and flushed.” 
With the exception of this change in symptom ex- 
pectation, the procedures for this group were iden- 
tical to those for the placebo-basic group. 

The primary reason for this experimental condi- 
tion was to further examine the effects of the symp- 
tom-expectancy manipulation. This is the particular 
information Schachter and Singer (1962) employed 
to provide some subjects in epinephrine-injection 
conditions with an “adequate” explanation for 
their ensuing physiological arousal. The arousal 
symptoms group illustrates the effects of providing 
arousal-appropriate information, independent of the 
subsequent induced state of physiological arousal.
In particular, we wanted to test the possibility that 
telling subjects to expect their heart to pound and 
so forth might have independent, direct effects on 
affect in addition to simply providing a possible ex-
planation for any subsequent experience of physio-
logical arousal.

Heightened arousal groups. In these conditions, 
instead of receiving a “standard” dosage of epinephrine,
subjects received dosages in proportion to their 
body weight. The average body weight of the sub- 
jects in the epinephrine-basic group was approxi- 
mately 72 kg; thus, the standard injection of .5 cc 
represents an average injection of about .007 cc/kg. 
Heavier-weight subjects in the constant dosage con- 
dition, of course, received relatively smaller dosages 
than lighter ones. 

The 12 subjects randomly assigned to the epi- 
nephrine — increased arousal treatment received an in- 
jection of .01 cc of epinephrine per kilogram of 
body weight. The individual dosages ranged from
.55 cc to .82 cc (M = .72 cc; SD = .08). With the
exception of this change in dose determination, the 
procedures and manipulations for this group were 
identical to those for the epinephrine-basic group. 

This method of selecting dose levels controls for 


GARY D. MARSHALL AND PHILIP G. ZIMBARDO 


any differential effects resulting from individual dif- 
ferences in body weight. The average of the actual 
doses administered to this group is equivalent to 
1.44 times the fixed dosage administered in the other 
epinephrine conditions and in the Schachter and 
Singer study. As such, this method also represents 
a somewhat stronger level of the physiological 
arousal manipulation for the group as a whole. 
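
The dose arithmetic in this passage can be checked with a short sketch. It uses only the figures given in the text (.5 cc fixed dose, a mean weight of roughly 72 kg, the .01 cc/kg adjusted rate, and the .72 cc mean adjusted dose); the constant and function names are illustrative assumptions and are not part of the original procedure.

```python
# Figures taken from the text; names are hypothetical.
FIXED_DOSE_CC = 0.5               # standard fixed injection
MEAN_WEIGHT_KG = 72.0             # approximate mean weight, epinephrine-basic group
ADJUSTED_RATE_CC_PER_KG = 0.01    # weight-adjusted rate in the heightened arousal groups
MEAN_ADJUSTED_DOSE_CC = 0.72      # reported mean dose, increased arousal group


def weight_adjusted_dose(weight_kg, rate=ADJUSTED_RATE_CC_PER_KG):
    """Return the injection volume in cc for a subject of the given weight."""
    return rate * weight_kg


# Per-kilogram rate implied by the fixed dose for an average-weight subject.
fixed_rate = FIXED_DOSE_CC / MEAN_WEIGHT_KG

# Mean adjusted dose expressed as a multiple of the fixed dose.
ratio = MEAN_ADJUSTED_DOSE_CC / FIXED_DOSE_CC

print(round(fixed_rate, 3))   # about .007 cc/kg, as stated in the text
print(round(ratio, 2))        # 1.44, as stated in the text
```

Under these figures a 72-kg subject would receive about .72 cc at the adjusted rate, which is why the adjusted groups as a whole received a somewhat stronger manipulation than the fixed-dose groups.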

The 12 subjects randomly assigned to the epi- 
nephrine — increased salience condition were also ad- 
ministered the body-weight-adjusted drug dosage of 
.01 cc/kg. Individual dosages ranged from .64 cc
to .93 cc (M = .77 cc; SD = .08). In addition, for
these subjects the postinjection waiting period, while 
euphoric in tone, was designed to increase the sa- 
lience of the induced physiological arousal. The 
subject and confederate spent the first 5 minutes 
of the waiting period writing a story about a The- 
matic Apperception Test (TAT) picture (to increase 
the salience of hand tremor). The remaining 10 min- 
utes were spent sitting with their eyes closed (to 
focus attention on increased palpitations and changes 
in respiration). The confederate enacted his “eu- 
phoric” role both verbally and nonverbally. He 
joked, hummed, and gave various indications of 
being happy. 

This change in task demands during exposure to 
a “euphoric” confederate, combined with the some- 
what stronger arousal manipulation resulting from 
the weight-adjusted drug dosages, provided condi- 
tions under which the awareness of physiological 
arousal should have been enhanced. The critical com-
ponents were still present: arousal that was inade-
quately explained and a generally euphoric con-
federate.


Dependent Measures 


Physiological, behavioral, and self-report responses 
were measured. 

Physiological measures. Each subject’s heart rate 
was monitored continuously using a cardiotachom- 
eter. The three measurement periods were (a) base- 
line—the 1-minute relaxation period during the in- 
itial physiological monitoring; (b) preinjection—the 
1-minute period immediately preceding the injection; 
and (c) waiting period—the period beginning at 1 
minute after the injection and lasting until the end 
of the waiting period. For scoring purposes, the rate 
indicated by the cardiotachometer record was read 
at 10-sec intervals for each of the appropriate pe-
riods. Means were then calculated for the appro-
priate epochs.
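
The scoring rule just described (readings taken every 10 sec, then averaged within each period) can be sketched as follows. The readings are made-up illustrative values, and the function names are assumptions rather than anything specified in the original procedure.

```python
from statistics import mean


def epoch_mean(readings_bpm):
    """Average the 10-sec cardiotachometer readings for one measurement period."""
    return mean(readings_bpm)


def heart_rate_change(preinjection, waiting):
    """Change in mean heart rate from the preinjection minute to the waiting period."""
    return epoch_mean(waiting) - epoch_mean(preinjection)


# Hypothetical subject: six readings cover the 1-minute preinjection period.
pre = [70, 72, 71, 73, 72, 74]
wait = [78, 80, 79, 81]

print(heart_rate_change(pre, wait))
```

A positive value indicates an increase in heart rate after the injection, which is the quantity examined in the manipulation checks.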

Behavioral measures. During the waiting period, 
the subject was both observed and videotaped
through a one-way mirror. Each discrete occurrence
of smiling or of laughter was recorded on an event 
recorder (Esterline-Angus). The total frequency of 
each of these behaviors was determined for each 
subject. A composite index of positive emotional ex-
pression was calculated by weighting the number of
smiles by one and the number of laughs by two and 
then summing these scores. The videotape of each
subject’s activities was scored by an observer who
was blind to the experimental condition. The type, 
frequency, and duration of the subject’s various ac- 
tivities, as well as their temporal occurrence with 
respect to the confederate’s actions, were noted. Our 
scoring procedure was based on a weighting of type 
of activity, following Schachter and Singer (1962, 
p. 391). Thus, 5 points were given for hula hooping, 
4 for shooting with the slingshot, 3 for paper air- 
planes, 2 for paper basketballs, 1 for doodling, and 
0 for doing nothing. Scores for amount of “imita- 
tion” (follows confederate’s lead) and “initiation” 
(innovates new activities or encourages further par- 
ticipation of confederate) were also determined. To 
assess the reliability of these various measures, 10 
of the videotapes were selected at random and rated 
by a second independent observer. The scores for 
both judges of the total frequency for each activ- 
ity for each of the 10 subjects differed by only 1 
unit or less 97% of the time and were in complete 
agreement 85% of the time. The imitation and in- 
itiation scores were in exact agreement 90% and 10% 
of the time, respectively; and for both categories 
the scores were within 1 point of each other 100% 
of the time. 
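
The behavioral scoring rules above lend themselves to a compact sketch. The weights (smiles by one, laughs by two; 5 to 0 for the listed activities) come from the text. The dictionary keys, the function names, and the decision to multiply each activity's frequency by its weight are illustrative assumptions.

```python
# Activity weights as listed in the text; key names are hypothetical.
ACTIVITY_WEIGHTS = {
    "hula_hoop": 5,
    "slingshot": 4,
    "paper_airplane": 3,
    "paper_basketball": 2,
    "doodling": 1,
    "nothing": 0,
}


def positive_expression_index(smiles, laughs):
    """Composite index: smiles weighted by one, laughs weighted by two."""
    return 1 * smiles + 2 * laughs


def euphoric_activity_index(counts):
    """Sum of (frequency x weight) over the activities a subject performed."""
    return sum(ACTIVITY_WEIGHTS[name] * n for name, n in counts.items())


def percent_agreement(rater_a, rater_b, tolerance=0):
    """Percentage of paired scores that differ by no more than `tolerance`."""
    hits = sum(abs(a - b) <= tolerance for a, b in zip(rater_a, rater_b))
    return 100.0 * hits / len(rater_a)


print(positive_expression_index(smiles=4, laughs=3))
print(euphoric_activity_index({"hula_hoop": 2, "doodling": 3}))
print(percent_agreement([1, 2, 3, 4], [1, 2, 4, 6], tolerance=1))
```

Setting `tolerance=0` gives exact agreement and `tolerance=1` gives agreement within one unit, the two reliability figures reported for the second observer.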

Self-report responses. Questionnaires were used to 
assess the following: (a) the subject's description 
of his affective state, (b) his perception of physio- 
logical arousal, and (c) his evaluation of the con-
federate. The measures concerning affective state
were administered both at the beginning of the ex- 
periment and following the waiting period. The 
reports of physical state and the confederate eval- 
uations were obtained only after the waiting period. 
In order to disguise the actual focus of the ques- 
tionnaires, several “dummy” questions were included 
(such as ratings of hunger, thirst, and degree of 
alertness, as well as ratings of different aspects of 
visual activity). The questionnaires also contained 
several open-ended questions encouraging the subject 
to comment on various aspects of his experience. 

Two of the affect self-report items provide the
major dependent measure for purposes of replica-
tion of the previous findings by Schachter and
Singer (1962): “How irritated, angry, or annoyed
would you say you feel at present?” and “How good
or happy would you say you feel at present?” Sub-
jects selected one of the response alternatives: 0
(not at all), 1 (a little), 2 (quite), 3 (very), and
4 (extremely). An additional series of 11 items
sampled a broader affective domain. These took
the form of bipolar adjective pairs, with the subject
rating himself on a 7-point scale ranging from −3
for one adjective to +3 for the other.
[illegible]-calm, [illegible], and angry-[illegible]
are examples of these 11 item pairs.

The questionnaires also contained two items per-
taining to the subject’s perception of various
symptoms. The experience of palpitation and the
feeling of tremor were rated on the following scale:
0 (not at all), 1 (a slight amount), 2 (a moderate
amount), and 3 (an intense amount). The subject
also completed a checklist noting the presence or
absence of a variety of symptoms, including heart
beating faster, breathing faster, numbness, and watery
eyes.

The subject rated the confederate on a scale from 
−3 to +3 for each of nine adjective pairs (e.g.,
friendly-unfriendly, relaxed-agitated). One pair, 
happy-sad, permitted a direct check on the ade- 
quacy of the confederate’s portrayal of a “euphoric” 
mood. 

An additional questionnaire was administered at 
the debriefing session. This questionnaire was de- 
veloped from the Autonomic Perception Questionnaire 
and was used to measure the extent to which a 
subject experienced specific physical symptoms with 
different states of affect. The standard deviations of 
symptom ratings were used to derive an affect symp- 
tom-discrimination index. Subjects with a low index 
score did not report experiencing symptoms that 
varied with the different emotions (e.g., if a subject
rated himself as “perspiring a great deal” regardless 
of the specific emotion, that symptom would have 
a standard deviation of zero). Subjects with a high 
index score, on the other hand, did report experienc- 
ing different symptoms with different emotions. 
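
A minimal sketch of such an index follows. The text says only that the standard deviations of symptom ratings were used to derive the index; averaging the per-symptom standard deviations, the use of the population formula, and all names and ratings below are assumptions made for illustration.

```python
from statistics import mean, pstdev


def symptom_sd(ratings_across_emotions):
    """Spread of one symptom's ratings across the different emotions."""
    return pstdev(ratings_across_emotions)


def discrimination_index(ratings):
    """Mean per-symptom SD (assumed aggregation); `ratings` maps symptom -> ratings."""
    return mean(symptom_sd(r) for r in ratings.values())


# Hypothetical subject who perspires equally under every emotion
# but whose heart pounding varies with the emotion.
subject = {
    "perspiring": [3, 3, 3],
    "heart_pounding": [0, 2, 4],
}

print(symptom_sd(subject["perspiring"]))       # zero SD, as in the text's example
print(round(discrimination_index(subject), 3))
```

A subject whose ratings never vary across emotions scores zero on every symptom, matching the "perspiring a great deal" example in the text.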


Results 


Five subjects met an a priori exclusion 
criterion based on skepticism of the validity 
of the confederate’s role (three subjects from 
placebo and two from epinephrine groups). 
Exclusion resulted when a subject noted him- 
self as “suspicious” on the adjective checklist 
and questioned the credibility of the confed- 
erate on the open-ended questionnaire items. 
These subjects were replaced with additional 
subjects from the available subject pool. 

This study was conducted in two phases: 
First, the basic epinephrine and placebo 
groups, both paired with a euphoric confed- 
erate, were tested; a week later, testing of 
the four additional groups began. All sub- 
jects came from the same population, were 
tested within a comparable time period, and 
were randomly assigned to experimental con- 
ditions. For purposes of data analysis, the 
six groups are treated as if they had been
tested contemporaneously.


Checks on the Manipulations 


The manipulations appear to have been 
effective in creating the conditions necessary 
to evaluate the hypotheses under considera- 
tion. Heart rate increases as a function of 
the injection were found only for the epi-
nephrine–heightened arousal groups, in both
of which the dosage of epinephrine had been
individually adjusted for body weight. The
injections of epinephrine, however, clearly did
lead subjects to perceive more symptoms of
physiological arousal than were perceived by
placebo subjects. The confederate was per-
ceived as being happy and euphoric in all
experimental conditions where he behaved
that way and as significantly less so where he
enacted an affectively neutral role. Finally,
the misinformation regarding the symptoms
associated with the injection was reflected in
the failure of a majority of the subjects to
mention the injection as a possible explana-
tion for their feeling state.

Physiological arousal. Because of appa-
ratus problems, only 68% of the sample had
complete physiological records. Examination
of the change in heart rate from the period
immediately preceding the injection to the
average heart rate during the 15-minute wait-
ing period revealed the following results.
There were no significant changes in heart
rate for the two epinephrine conditions given
the standard, fixed dosage of .5 cc of epi-
nephrine: M = −2.3 beats per minute (bpm)
for the epinephrine-basic group, and M = +.9
bpm for the epinephrine–neutral confederate
group. There were, however, increases in heart
rate for the two heightened arousal conditions.
The mean increase for the epinephrine–in-
creased arousal group was +6.7 bpm, t(7) =
3.12, p < .02, while the mean increase for
the epinephrine–increased salience group was
+6.0 bpm, t(8) = 2.26, p < .10. The placebo-
basic group did not show a significant change
in heart rate (M = −3.0 bpm), while the
placebo–arousal symptoms group had a sig-
nificant decrease in heart rate (M = −6.7
bpm), t(8) = 2.54, p < .05.

Table 1
Reported Experience of Physiological Symptoms

Placebo groups: Basic; Arousal symptoms
Epinephrine groups: Basic; Neutral confederate; Increased arousal; Increased salience

Measure
Mean degree of tremor                  [illegible]   1.6   1.5   1.6   1.7
Mean degree of palpitations            [illegible]   .9   [illegible]   1.4   1.6
Mean number of epinephrine symptoms    .5   2.8   2.4   3.4   3.4

Perceived arousal symptoms. Table 1 pre-
sents the mean ratings of palpitation and of
tremor. Four of the items on the symptom 
checklist (“moist, sweaty palms,” “heart is 
beating faster,” “stomach feels upset,” and 
“breathing is faster”) were considered as in-
dicative of epinephrine-induced arousal, and
their mean frequency is also reported in Table 
1. Subjects in the two placebo groups rarely 
reported tremor and palpitations, and only 
occasionally did they report experiencing one 
of the epinephrine symptoms. In contrast, the 
subjects in the four epinephrine conditions on 
the average reported experiencing slight to 
moderate amounts of tremor and palpitations, 
as well as at least several of the epinephrine 
symptoms. Thus, with respect to reports of 
both the degree of tremor and palpitation and 
also to reports of the number of symptoms 
experienced, the injection of epinephrine did 
definitely produce discernible symptoms of 
arousal that were not found with the placebo 
injection. 

Confederate’s affective state. The success 
of the “social cognition” manipulation was 
determined from the ratings of the confeder- 
ate completed by each subject after the wait- 
ing period. The means of the ratings in each 
of the five groups paired with a euphoric 
confederate indicate that he was indeed per- 
ceived as being happy. On a scale from +3 
(happy) to —3 (sad), the confederate was 
given mean ratings that ranged from +1.5 
to +2.2 for the five conditions having a eu-
phoric confederate. When tested against a
neutral score of zero, each of these mean 
ratings is highly significant (p < .001). This 
perception of the confederate as being happy 
was even found in the ratings of the more 
restrained confederate in the epinephrine–in-
creased salience condition (M = +1.7). In
contrast, the epinephrine group with the af-
fectively “neutral” confederate gave him a
mean rating of +.7 (not significantly differ-
ent from the neutral midpoint of zero).

Explanation of symptoms. The subjects
were misinformed regarding the possible
symptoms associated with the injection. Fol-
lowing Schachter and Singer’s (1962) proce-
dure, subjects were subsequently categorized
as “self-informed” or “not self-informed”
about the source of their physiological arousal.
Subjects who mentioned something about the
injection (e.g., “suspect that it’s the vitamin
influence”; “the shot”), when asked to ex-
plain their feeling state on the final question-
naire, were classified as “self-informed.” Those
who made no reference to the injection (e.g.,
“test situation”; “the guy I was waiting
with”) were classified as “not self-informed.”
There were indeed subjects in each group who
met the “self-informed” criterion. In the epi-
nephrine conditions, this varied from 25%
of the basic group and 17% of the increased
arousal group to 42% of the neutral con-
federate group and 50% of the increased sa-
lience group, both of whom were exposed to
a relatively inactive and less distracting con-
federate. In the placebo conditions, 12% of
the subjects in the basic group and 8% of
the arousal symptoms group also met the
criterion. The reports in these latter groups
suggest that at least some of the references
to the injection in the epinephrine groups may
have occurred independently of induced physi-
ological arousal. Although the misinformation
regarding likely side effects did not neces-
sarily [...] injection. This supports the apparent
face validity of the subjects’ not having a readily
available “adequate” explanation for any en-
suing physiological arousal.


The epinephrine-basic subjects did not ex-
press more positive affect (on either the behav-
ioral or the self-report measures) than
the placebo-basic group. In fact, what dif-
ferences did exist between the two condi-
tions were in the opposite direction of more
positive affect for the placebo group. Height-
ening the arousal did not increase reports or
displays of positive affect, but instead pro-
duced signs of more negative affect. For sub-
jects with inadequately explained physiologi-
cal arousal, it seemed to make little difference 
if they were in the presence of an affectively 
positive or neutral confederate. Placebo sub- 
jects given information to expect symptoms 
appropriate to epinephrine evidenced a nega- 
tive affective bias prior to the actual injection. 

Preinjection. With the exception of the 
information regarding possible somatic effects, 
the experimental procedures were identical for 
all subjects prior to the injection. Subjects 
in the placebo — arousal symptoms group were 
told to expect epinephrine-appropriate symp- 
toms; subjects in the other groups were told 
to expect only minor, transitory symptoms 
(unrelated to predominant epinephrine ef- 
fects). An index of general affect was derived 
from the set of 11 bipolar affect-adjective 
scales by combining the ratings on the items 
such that a higher score represented more pos- 
itive affect in general. The placebo - arousal 
symptoms group was the only one that did 
not have a positive index score significantly 
different from zero (all other ps at least
<.02). The a priori comparison of the mean 
for the placebo —arousal symptoms group 
(M = 2.5) with the average of the scores for 
the other groups (M = 7.4) indicates that 
these subjects had a significantly lower or less 
positive index score (MSe = 55.59), t(74) =
2.12, p < .05, after being informed about pos- 
sible symptoms. 

The effect of the symptom-information ma- 
nipulation was also assessed by the a priori 
comparison of the change in heart rate for 
the placebo - arousal symptoms group in an- 
ticipation of the injection (M = 14.8 bpm) 
with the average of the change in heart rate 
for the other five groups (M = 6.4 bpm).
This comparison indicates that subjects in 
the arousal symptoms condition had a signifi- 
cantly greater increase in heart rate in an- 
ticipation of their injection than did the other 
subjects who had been told to expect only 
minor, transitory symptoms (MSe = 46.70),
t(47) = 3.40, p < .01. This and the general
affect index demonstrate that the two de-
scriptions of side effects were not equivalent
and that the symptom-information manipula-
tion produced effects prior to the injection.

Table 2
Self-Report Ratings of Experienced Affect

                                  Experimental condition
                                  Placebo-     Epi-         Epi-         Epi-         Epi-         Placebo-
                                  basic        basic        increased    increased    neutral      arousal
                                                            arousal      salience     confederate  symptoms
Measure                           (n = 16)     (n = 16)     (n = 12)     (n = 12)     (n = 12)     (n = 12)
Schachter-Singer index
  Post score (a)                  1.5          1.4          .9           1.1          [illegible]  1.5
  “Adjusted” pre-post
  change score (b)                +.40*        +.28         +.03         −.07         −.25         +.39
General affect index
  Post score (a)                  10.4         7.4          1.8          .5           8.2          8.3
  “Adjusted” pre-post
  change score (b)                +3.64*       +.19         −4.31*       −6.41**      −1.00        +4.20*

Note. Epi = epinephrine.
(a) The higher the score, the more positive the reported affect.
(b) A positive score indicates a positive change in affect over time; a negative score indicates a negative
change. Scores were adjusted by regression for any differences in initial values.
* p < .05. ** p < .01.

Self-report of affect. One of the self-report
measures of experienced affect was identical
to that employed by Schachter and Singer 
(1962): Subjects rated how “good or happy” 
and how “irritated, angry, or annoyed” they 
felt. Following Schachter and Singer’s proce- 
dure, the “angry” score was subtracted from 
the “happy” score to yield a single index of 
affect. As can be seen in Table 2, all of the 
six conditions had a positive postscore on 
this affect index. This measure was adminis- 
tered both before and after the experimental 
manipulations in order to assess changes in 
reports of affect. An analysis of these changes 
(“adjusted” by regression on the initial 
scores) revealed no significant changes for any 
of the epinephrine conditions. However, the
placebo-basic group showed a positive change,
t(15) = 2.20, p < .05, and the placebo–
arousal symptoms group also tended to do so,
t(11) = 1.86, p < .10.
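
The index and the regression adjustment described in this paragraph can be sketched as follows. The happy-minus-angry index is taken from the text. The adjustment shown (removing from each raw change score the part predicted by the subject's initial score, centered on the mean initial score) is one common way of adjusting change for initial values; it is an assumption here, since the text does not spell out the exact procedure, and all names and data are hypothetical.

```python
from statistics import mean


def schachter_singer_index(happy, angry):
    """Single affect index: the "angry" rating subtracted from the "happy" rating."""
    return happy - angry


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def adjusted_change(pre, post):
    """Raw change scores with the linear effect of the initial score removed."""
    change = [after - before for before, after in zip(pre, post)]
    b = slope(pre, change)
    mean_pre = mean(pre)
    return [c - b * (p - mean_pre) for c, p in zip(change, pre)]


# Hypothetical data: everyone ends at 2, so raw change depends entirely on the pre-score.
pre_scores = [0, 1, 2, 3]
post_scores = [2, 2, 2, 2]
print(adjusted_change(pre_scores, post_scores))
```

In this made-up case every adjusted score equals the mean raw change, because all of the variation in raw change is attributable to where each subject started.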

The other self-report measure of experi- 
enced affect was the general affect index de- 
rived from the set of 11 bipolar affect-adjec- 
tive scales. Table 2 presents the mean scores 
on this index for the different groups. A posi-
tive general index score (significantly differ-
ent from the neutral midpoint of zero) was
obtained by both the epinephrine-basic group,
t(15) = 3.73, p < .01, and the placebo-basic
group, t(15) = 6.40, p < .001. Similarly posi-
tive scores were also obtained by the placebo–
arousal symptoms group, t(11) = 7.13, p <
.001, and the epinephrine–neutral confeder-
ate group, t(11) = 2.92, p < .02. In contrast,
the two epinephrine—heightened arousal 
groups had scores that did not differ from 
zero. 
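
Tests of a group mean against the neutral midpoint of zero, as reported above, follow the standard one-sample t formula. The sketch below assumes that formula; the function name and the scores are made up for illustration and are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(scores, mu=0.0):
    """Return (t, df) for a test of the sample mean against `mu`."""
    n = len(scores)
    standard_error = stdev(scores) / sqrt(n)   # sample SD with n - 1 in the denominator
    return (mean(scores) - mu) / standard_error, n - 1


# Hypothetical general affect index scores for a small group.
t_value, df = one_sample_t([1, 2, 3, 4, 5])
print(round(t_value, 2), df)
```

The degrees of freedom are n − 1, which is why groups of 16 and 12 subjects appear in the text as t(15) and t(11).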


The general index score of the epinephrine-
basic group was slightly less positive than that
of the placebo-basic group (contrary to the 
Schachter and Singer prediction), and that 
of the epinephrine — increased arousal group 
was significantly less positive (p < .01). In 
the epinephrine conditions, there was no sig- 
nificant difference between the basic and 
neutral confederate groups, while the scores 
of both of the heightened arousal groups were 
significantly less positive than that for the 
basic group (p at least <.05).

An examination of the pre- to postexperi-
mental changes in general index scores (“ad-
justed” by regression) reveals an interesting
pattern (see Table 2). In the epinephrine con-
ditions, both the basic and neutral confederate
groups showed no significant change on this
affect index. However, both of the heightened
arousal groups had significantly negative
changes on this index: for the increased
arousal group, t(11) = 2.39, p < .05; for the
increased salience group, t(11) = 3.56, p <
.01. In contrast to all of the epinephrine
groups, the two placebo conditions showed a
significantly positive change in this general
index: for the basic group, t(15) = 2.34, p <
.05; for the arousal symptoms group, t(11)
= 2.28, p < .05.

A comparison of the epinephrine-basic
group with the placebo-basic group reveals
no significant difference between their gen-
eral index change scores. The epinephrine–
increased arousal condition had significantly
more negative change scores than the placebo-
basic group (p < .01). In the epinephrine
conditions, the change scores for the increased
salience group were significantly more nega-
tive than that for the basic group (p < .01),
and those for the increased arousal group also
tended to be more negative (p < .10). There
was no significant difference between the epi-
nephrine-basic group and the neutral con-
federate group.

Emotional behavior. The analysis of the 
subjects’ euphoric activities is restricted to 
the four groups in which the subjects spent 
the experimental waiting period with an ac- 
tive, euphoric confederate who went through 
the standardized Schachter-Singer (1962) [...]

[...] groups had the highest scores on this index
(M = 28.1 for the basic group; M = 22.5
for the arousal symptoms group). The mean
score for the epinephrine-basic group (M =
21.7), although lower, did not differ signifi-
cantly from the placebo-basic condition.
However, the scores for the epinephrine–in-
creased arousal group (M = 18.2) were
significantly lower (p < .05) than those of
the placebo-basic group.

Table 3
Behavioral Measures of Euphoric Activity

                    Experimental condition
                    Placebo-     Epi-         Epi-         Placebo-
                    basic        basic        increased    arousal
                                              arousal      symptoms
Measure             (n = 15)     (n = 11)     (n = 12)     (n = 11)
Euphoric activity   35.0         25.6         14.5         19.8
Imitation           1.4          1.2          1.0          1.3
Initiation          [illegible]  [illegible]  [illegible]  [illegible]

Note. Epi = epinephrine. Higher numbers reflect
greater degrees of activity. The videotapes for some
subjects were not of scorable quality.


Similar to Schachter and Singer’s procedure, 
the number of times each of the activities was 
performed by a subject was multiplied by 
the weight assigned to the presumed euphoric 
quality of that activity. All of these scores 
were then combined into an overall euphoric 
activity index (see Table 3). There was a 
considerable range of scores within each of 
the groups (the smallest intragroup range 
was 53.0), with some subjects remaining rela- 
tively inactive and others performing a great 
deal of euphoric behavior. The placebo-basic 
group had the highest score on this index— 
as opposed to either of the epinephrine groups. 
While their scores did not differ significantly 
from those of the epinephrine-basic group, 
they tended to be significantly higher than 
the activity scores for the epinephrine — in- 
creased arousal group (p < .10). The differ- 
ence between the two epinephrine conditions 
did not approach significance. The number of 
different activities (i.e., the range of euphoric 
behavior) in which the subject engaged was 
also examined. Each subject was classified as 
either (a) participating in none or only one 
activity or (b) participating in more than 
one activity. The percentage of subjects in 
the second category was 67% for the pla-
cebo-basic group, 55% for the placebo-
arousal symptoms group, 45% for the epi- 
nephrine-basic group, and 33% for the epi- 
nephrine — increased arousal group. The dif- 
ference between the placebo-basic and 
epinephrine-basic groups was not significant, 
while the epinephrine–increased arousal
group had a significantly lower percentage of
subjects exhibiting a range of euphoric be- 
haviors than the placebo-basic group (z= 
1.76, p < .05). 
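
A comparison of two percentages of this kind can be sketched with a pooled two-proportion z test. The text does not say which z procedure was used, so the pooled formula is an assumption and the result need not reproduce the reported value. The counts (10 of 15 and 4 of 12) are likewise inferred from the reported percentages and group sizes rather than given in the text.

```python
from math import sqrt


def two_proportion_z(hits_1, n_1, hits_2, n_2):
    """Pooled z statistic for the difference between two independent proportions."""
    p_1 = hits_1 / n_1
    p_2 = hits_2 / n_2
    pooled = (hits_1 + hits_2) / (n_1 + n_2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    return (p_1 - p_2) / standard_error


# Inferred counts: placebo-basic (67% of 15) vs. epinephrine-increased arousal (33% of 12).
z = two_proportion_z(10, 15, 4, 12)
print(round(z, 2))
```

Other common choices, such as an unpooled standard error or an arcsine transformation of the proportions, give slightly different values from the same counts.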

Similar to the Schachter and Singer (1962) 
procedure, the number of confederate activi- 
ties that a subject performed after the con- 
federate had done so was used to constitute 
an imitation score, and the number of inno- 
vative euphoric behaviors performed by the 
subject was used as his initiation score. As 
can be seen in Table 3, all of the experi- 
mental groups both imitated and initiated 
euphoric behavior. There were no significant 
differences between groups on either of these 
measures. 

Additional analyses. As noted earlier, if 
subjects cited receiving an injection as a pos- 
sible cause for their feeling state, they were 
categorized as “self-informed.” Being self- 
informed, according to Schachter and Singer, 
should render subjects functionally unsus- 
ceptible to the affect manipulations. Conse- 
quently, their reactions should differ from 
cohorts not so self-informed. To test the 
significance of this variable on the pattern 
of results in the present study, t tests were
performed between self-informed and not 
self-informed subjects on each of the de- 
pendent measures. There was no trend nor 
even tendency for differences between them 
on any of the self-report or behavioral mea- 
sures of affect (all p values > .10). In other 
words, subjects responded similarly to the 
affect manipulations, regardless of their at- 
tributions regarding the injection. 

The subjects’ responses to the physiologi- 
cal arousal manipulation may have been in- 
fluenced by their perceived association of 
particular symptoms with different states of 
affect. Correlations between the affect self- 
report scores and three scores derived from 
the subsequently administered Autonomic 
Perception Questionnaire (APQ) were com- 
puted. The APQ scores included the affect 
symptom-discrimination index (i.e., the ex- 
tent to which the subject reported experienc- 
ing different symptoms with different emo- 
tions) and the standard deviation for the 
ratings on the specific item regarding ex- 
periencing “increased heart beat.” The final 
APQ score was the mean on an item assess-
ing how “bothered” the subject was by these
“bodily reactions.” For the placebo subjects, 
there were no significant correlations be- 
tween the APQ scores and the self-reports of 
affect. However, for comparable subjects who 
had received an injection of epinephrine, 
there was a consistent pattern of significant 
(p<.05) negative correlations (ranging 
from —.67 to —.49) between affect reported 
in the experimental setting and the affect 
symptom-discrimination index and also the 
other APQ scores. The subjects who re- 
ported more negative responses to their ex- 
perimentally induced, inadequately explained 
physiological arousal were also those who 
were more likely to report doing the follow- 
ing: (a) discriminating between different 
states of affect in terms of physical symp- 
toms, (b) experiencing increased heart beat 
differentially with various emotions, and (c) 
being bothered by these bodily reactions. 
Thus, symptom-sensitive subjects showed a 
negative affective bias in response to symp- 
toms associated with the epinephrine-induced 
physiological arousal. 
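
The correlations reported here relate each subject's affect self-report to an APQ-derived score. A minimal sketch of the product-moment correlation follows; it assumes the standard Pearson formula, and the function name and data are hypothetical rather than values from the study.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up scores: higher symptom discrimination paired with lower reported affect.
discrimination = [1, 2, 3, 4]
affect = [8, 6, 4, 2]
print(pearson_r(discrimination, affect))
```

A negative coefficient, as in this made-up case, corresponds to the pattern described in the text: subjects scoring higher on the APQ measures reported less positive affect.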


Discussion and Conclusions 


There is no evidence in the data generated 
by this study to warrant acceptance of the 
Schachter and Singer (1962) reported dem- 
onstration of the interaction of cognitive and 
physiological determinants of emotion. In 
particular, there was no instance in which 
subjects with inadequately explained epi- 
nephrine-produced arousal were significantly 
more susceptible than placebo controls to 
the induction of affect by exposure to a 
confederate who modeled euphoric behavior.
If, as Schachter (1971) has suggested, the 
disguised injection of epinephrine actually 
did create “a bodily state ‘in search of’ an 
appropriate cognition” (p. 33), the majority 
of these subjects did not discover that cog- 
nition in the mood displayed by the euphoric 
confederate. Whatever arousal treatment dif- 
ferences did exist were likely to suggest more 
positive affect in this situation for placebo 
subjects than for epinephrine subjects. 

The particular physiological substrate in-
duced by exogenous epinephrine seems, from
the consistent pattern of our data, to have
produced a relatively negative affective bias,
even given the presence of a euphoric con-
federate. This finding became more evident
under conditions of increased arousal and
increased salience of the arousal. The pres-
ence of a euphoric confederate exerted some
functional control over the “emotional” be-
haviors of the subjects, but had minimal
impact on their self-reported affect. “It’s like
being at a party,” one epinephrine subject
later reported, “when everyone is clearly
having a good time and you have a headache
or feel depressed for some unknown reason.
You don’t want to rain on their parade;
you might even laugh and try to join in,
but you don’t really feel happy.” Nowhere
was there evidence of inadequately explained
epinephrine-produced arousal enhancing af-
fective susceptibility to situational influences.


Reappraisal of the Schachter and 


înger Conclusions 

As previously mentioned, Schachter and
Singer (1962) similarly found no significant
differences between their placebo controls
and epinephrine aroused groups (see their
Table 2, p. 390). However, they considered
the direction of the differences to provide
“tentative but consistent” support for their
predictions regarding cognitively steered
physiological arousal. Their actual conclu-
sions concerning the role of physiological
arousal, however, are primarily based on sev-


By these means, they were able to obtain
some “predicted” effects within groups on the
behavioral measures. They were still unable,
however, to demonstrate similar effects on
the self-report measures. In the present study,
procedural improvements were introduced to
strengthen the possibility of finding any be-
tween-group arousal treatment effects. De-
spite this, there were still no significant dif-
ferences obtained between the arousal treat-
ment conditions. Opposite to what Schachter
and Singer found, this time the direction of
differences that did exist was in the favor of
more positive affect in the presence of a
euphoric confederate for the placebo group.
Furthermore, this pattern of results was sup- 
ported by the data from additional compari- 
son groups. These data suggest a negative 
affective bias for epinephrine-produced 
arousal. Thus, it seems most accurate to con- 
clude that inadequately explained epineph- 
rine-produced physiological arousal does not 
enhance general affective susceptibility to 
environmental influences.

This lack of “emotional plasticity” is also 
suggested by other results in the Schachter 
and Singer study. Interestingly, Schachter 
and Singer do not compare the effects of the 
different social environments presented in 
their study. When such a comparison is done, 
it is apparent that the difference in reported 
affect between epinephrine-ignorant (without 
an adequate explanation) subjects exposed 
to an angry confederate and those exposed 
to a euphoric confederate is rather small (.4 
scale units) compared to the difference found 
for epinephrine-informed (with an adequate 
explanation) subjects (.9 scale units). This 
marginal situational effect for inadequately 
explained physiological arousal is similar to 
the minimal influence found for the affective 
tone of the social environment in the present 
study. Thus, there is additional evidence to 
support the conclusion that inadequately ex- 
plained epinephrine-produced physiological 
arousal does not enhance general affective 
susceptibility to environmental influences.

Although they only employed postexperi- 
mental measures, Schachter and Singer inter- 
preted their within-confederate condition re- 
sults in terms of changes in affect for their 
unexplained arousal groups. In the present 
study, the use of pre- and postexperimental 
measures permitted a direct assessment of 
actual change. No significant change in affect 
was found for subjects who were comparable 
to Schachter and Singer’s unexplained 
arousal group. This suggests the possibility 
that the group differences they obtained were 
actually due to changes by their accurately 
informed epinephrine subjects. As already 
pointed out, this condition was the one to 
show the greatest situational effect. Addi-
tional evidence suggesting that the accurate
information condition was the one to have
the changes in affect is provided by the re-
sults in the present study for the placebo
subjects who were told to expect epineph- 
rine-type symptoms (the “accurate” informa- 
tion manipulation). 

Placebo subjects who were told to expect 
the somatic effects associated with epineph- 
rine reported relatively more negative af- 
fect prior to the injection and had a rela- 
tively greater increase in heart rate in an- 
ticipation of the injection. This information 
about possible effects of the injection seems 
to have produced a negative affective re- 
sponse by itself. This response appears to 
have dissipated over the course of the experi- 
ment, presumably because the anticipated 
effects did not occur (since in fact they re- 
ceived an injection of a placebo). If these 
subjects had indeed received an epinephrine 
injection, then it is probable their initial neg- 
ative reaction would have persisted over the 
course of the contact with the confederate. 
That condition is, of course, the epineph-
rine-informed treatment in the Schachter
and Singer study. Therefore, it seems likely 
that their “accurate information” manipula- 
tion, instead of precluding any affective 
change, actually produced one. If the differ- 
ences found by Schachter and Singer are 
actually due to changes for their informed 
condition instead of their unexplained condi- 
tion, then their posited motivated search to 
account for inadequately explained arousal 
is not demonstrated by their results.

Schachter and Singer (1962) have clearly
made valuable theoretical contributions to 
our understanding of emotion. However, 
there does not appear to be an adequate evi- 
dential basis for the conclusions they have 
drawn. The conclusions suggested by the 
present study also seem to question the body 
of literature on the cognitive alteration of 
feeling states (see London & Nisbett, 1974), 
which was largely based on the Schachter 
and Singer work. We recognize, of course, 
the power of cognitions to control a range 
of human reactions (Zimbardo, 1969), as is 
also evidenced by the results of the current 
placebo group with arousal expectations. We 
equally believe, however, that there are defi- 
nite limitations to the power of cognitive 
control. We argue for a much greater com-
plexity in the interaction of physiological
substrate and cognitive cues in determining
emotions. Immediately available social cog- 
nitions are but one of many sources of in- 
formation used to give arousal an emotional 
quality. When other determinants of various 
classes of responses are not explicitly ac-
knowledged in research designs but still exert
an influence, their presence is manifested in 
excessive error variance. Thus, many of the 
studies that followed in the wake of the 
Schachter and Singer study have failed to
find predicted effects across both behavioral 
and experiential variables (see Bem, 1974). 
Or, as in the case of the extension of Schach-
ter’s theorizing to the cognitive manipulation
of pain (Nisbett & Schachter, 1966), the 
statistically significant effects are conceptu- 
ally insignificant. In that study, for example, 
no differences were found on any of three 
measures between subjects who attributed 
their high degree of fear to a pill or to an
impending shock. The only reliable result is 
the least interesting one, namely, low-fear 
subjects who accurately attributed their 
arousal to the shock had a lower tolerance 
threshold than all other groups (none of 
which differed among themselves). 


Inadequately Explained Arousal 


The idea of dissociating physiological from
cognitive elements of emotion had been sug-
gested to Schachter and Singer by the earlier
work of Marañon (1924). Marañon, it will
be recalled, reported two types of reaction
to epinephrine injections: “true” emotions
and “cold” or “as if” emotions. In the for-
mer, not only were there the same somatic
elements as in the mock emotions but also
the “psychic element.” It is interesting to
note that the true emotions, though less fre-
quent, were typically of a negative affective
tone. A true emotion appeared spontaneously
in only some cases; “in other cases, to bring
it about the investigator must suggest a
memory of great affective energy or force”
(p. 307).

Later research (Cantril & Hunt, 1932;
Landis & Hunt, 1932) confirmed Marañon’s
finding. In these and subsequent studies,
when cases of “true” emotion were judged to
be present, the emotion was generally
described as “fear,” “anguish,” or “anxiety.”
Reviews of this literature offer various ex-
planations. Breggin (1964) suggests that the
appearance of these negative emotions may
have been due to the superimposed stress of
the experimental procedures. A more inter-
esting possibility is that these true negative
emotions “arose from the particular (learned)
associations the somatic sensations may
have” (Rothballer, 1959, p. 511). These as-
sociations, in turn, are likely to have arisen
from prior experiences with endogenous epi-
nephrine. In the present study, the results
indicate that subjects who reported distin-
guishing between emotions in terms of so-
matic sensations also reported more negative


of anxiety or in threatening situations of
uncertain or unpredictable nature in which
active coping behavior may be required but
has not been achieved” (Schildkraut & Kety,
1967, p. 23).

The physiological substrate associated with
at least a certain degree of epinephrine also
seems to be somewhat uniquely associated
with negative affect. Parenthetically, it is
interesting to note that the incidence of sub-
jects being “self-informed” in the Schachter
and Singer study varied with the social en-
vironment. It was greater with the appar-
ently dissimilar euphoric confederate (32%)
than with the clearly negative angry con-
federate (13%); here the incidence was al-
most identical with the rate obtained for
placebo subjects in the present study (12%).
This finding suggests that the confederate’s
bizarre angry behavior provided a sufficiently
plausible explanation for the subject’s own
negatively toned experience of arousal symp-
toms. It is likely, of course, that there are
certain critical thresholds that must be
reached before epinephrine, whether endoge-
nous or exogenous, is associated with nega-
tive affect. In the present study, the proce-
dure of both increasing the arousal (by tak-
ing individual differences in body weight into
account) and additionally increasing its sali-
ence consistently produced indications of
more negative affect. The variance in the
literature in the reports of affect associated 
with epinephrine may be partially accounted 
for by individual differences in pharmaco- 
kinetics, such as rates of absorption, distri- 
bution, and metabolism (Lasagna & Laties, 
1959; Smith & Rawlins, 1973), which would 
contribute to determining whether a thresh- 
old had been reached. 

The general picture that emerges from the 
present study, the companion study by Mas- 
lach (1979), and a reexamination of the 
Schachter and Singer (1962) study indicates 
that epinephrine-related physiological arousal 
does not provide “emotional plasticity,” but 
rather shows a consistent association with 
negative affect. The specific expression or 
manifestation of this negative affective bias 
may be influenced by cognitive manipula- 
tions (Lazarus, 1967), but we believe that 
the range of outcomes would still be limited 
to a negative affective domain. It is somewhat
reassuring, especially considering their pos- 
sible adaptive significance, that our true emo- 
tions may be more rationally determined and 
less susceptible to transient or whimsical 
situational determinants than has been sug- 
gested by Schachter and Singer. Perhaps we 
social psychologists should better appreciate 
our biological “hardware” or, to paraphrase 
a frequent media message, “It’s not so easy 
to fool Mother Nature.”


Postscript


[The] analyses sometimes suffer the imprint of
someone convinced that he’s right and willing to
force his data to prove it. (Schachter, 1971, p. 141)


One function of replication studies is the in- 
dependent assessment of the validity of conclu- 
sions drawn by investigators who sometimes 
force their data to fit their preexisting belief 
systems, thereby violating basic canons of scien-
tific evidence. We began our research fully ex-
pecting to replicate Schachter and Singer’s
(1962) conclusion that inadequately explained,
epinephrine-induced physiological arousal is
readily manipulable into the disparate feeling
states of euphoria and anger by varying avail-
able social cognitions. Our interest was to extend
their analysis to the area of emotional path-
ology. After extensive testing, experimental vari-
ations, and procedural refinements designed to
yield the predicted finding, we were forced to
acknowledge the failure to replicate this now
classic conclusion.

Obviously, there are many ways not to find
an effect and few to demonstrate it. We now be- 
lieve that both the basic paradigm and the 
methodology employed in the Schachter and 
Singer study and in our own may not be ade- 
quate to uncover the “emotional plasticity” pre- 
dicted by their theory. But in addition, our data 
and that of Maslach lead us to propose the 
hypothesis that regardless of the experimental 
conditions utilized, awareness of one’s strong 
arousal without adequate explanation is most 
likely to be interpreted as a negative mood 
state. Thus it may be that the extent to which 
states of unexplained physiological arousal are 
manipulable into disparate psychological mood 
states is limited to conditions of mild arousal 
(as Schachter and Singer’s, 1979, current com-
ments suggest, pp. 992-993). Some support for
this constraint on the latitude of emotional plas- 
ticity comes from Nisbett and Schachter’s 
(1966) inability to manipulate the emotion of 
subjects in a high-fear state, obtaining the mis-
attribution effect only for subjects in a low-
fear condition. Such clarification of the range
or areas of applicability of a theory is another 
contribution of replication studies. 

However, when Schachter and Singer's re- 
joinder now declares that they too “never found 
a difference,” our failure to replicate becomes 
a replication of the null hypothesis and to some 
extent belittles our efforts. How could we—and, 
we presume, most other psychologists—have 
been led to misunderstand that the conclusions 
of the Schachter and Singer (1962) study were 
merely “tentative,” based on nonsignificant 
treatment differences and internal analyses of
questionable status? We were all led there by 
bold assertions to the contrary, by the rhetoric 
of conviction that Schachter and Singer chose to 
describe what they “really” found. “Conforma-
tion of Data to Theoretical Expectations” (p.
393) is their section heading, which belies any
equivocation. It is important in this exchange 
to show briefly that both in their original article 
and in their present comments on our study and 
Maslach’s, Schachter and Singer sometimes use 
their considerable literary talent to misdirect 
the reader's attention toward inadequately justi-
fied conclusions such that what was “never
found” becomes one of the most cited findings
in psychology textbooks.

Let us consider one instance of this misdirec-
tion. A one-sentence (cautiously worded) dis-
claimer of their findings can indeed be found:
“However, the fact that we were forced, to some
extent, to rely on internal analyses in order to
partial out the effects of experimental artifacts
inevitably makes our conclusions somewhat ten-
tative” (Schachter & Singer, 1962, p. 396). This
lean statement is sandwiched between an opening 
sentence that proclaims, “The pattern of data, 
then, falls neatly in line with theoretical expec- 
tations,” and the following forceful conclusion: 


It has been suggested, first, that given a state of 
physiological arousal for which an individual has 
no explanation, he will label this state in terms of 
the cognitions available to him. This implies, of 
course, that by manipulating the cognitions of an 
individual in such a state we can manipulate his 
feelings in diverse directions. Experimental results 
support this proposition for following the injection 
of epinephrine, those subjects who had no explana- 
tion for the bodily state thus produced, gave be- 
havioral and self-report indications that they had 
been readily manipulable into the disparate feeling 
states of euphoria and anger. 


From this first proposition, it must follow that 
given a state of physiological arousal for which 
the individual has a completely satisfactory ex- 
planation, he will not label this state in terms of 
the alternative cognitions available. Experimental 
evidence strongly supports this expectation. (pp. 
395-396; reiterated verbatim in Schachter, 1971,
p. 23)


We would like to propose that these conclu- 
sions should be recoded by psychologists as fol- 
lows: There is no evidential basis for the em- 
pirical conclusion that epinephrine-aroused sub- 
jects can be readily induced to experience (or 
label that arousal as) euphoria or anger by 
manipulating available social cognitions (through 
a confederate’s behavior). 

Schachter and Singer’s (1979) rejoinder cites 
the results of a study by Erdmann and Janke 
(1978) whose “results are consistent with ours 
and present none of the interpretative problems 
of the Marshall and Zimbardo and Maslach 
studies” (p. 992). However, they did not state 
what those “interpretative problems” are, nor 
did they mention several critical points about 
the Erdmann and Janke study. Unlike epineph- 
rine, the drug used in the study (ephedrine) 
has strong central effects; the study failed to 
find any expected physiological differences, and 
the final conclusion advanced by the authors was
the following:


Regarding Schachter’s theory the results and the 
conclusions of the present experiment imply that 
the relationship between emotional and physiologi- 
cal arousal might be less tight and/or less general 
than the theory proposes. (Erdmann & Janke, 1978,
p. 73)


Dosage Levels, Source Awareness, and Plasticity


Since we found no differences between placebo 
controls and epinephrine-injected subjects, the 
drug dosage was increased (by tailoring level to 
body weight, a standard procedure in drug 
studies) to generate sufficient arousal differences 
for demonstrating possible emotionality effects 
of the manipulated social cognitions. Schachter 
(1971) led us to believe that emotionality was 
a direct, not a curvilinear, function of arousal 
by his reference to “the proposition that the 
degree of emotionality is directly related to the 
degree of physiological arousal” (p. 35). In their 
rejoinder, Schachter and Singer (1979) say that 
it is so obvious that the function is U-shaped 
as to be unworthy of mention. What may be 
worthy of mention are the parameters of that 
function, the specification of which should have 
been systematically investigated in humans. 
The weak effects they found in subjects given
their standard dose of epinephrine, together with
the negative affect induced by our increased dosage
(with inadequate explanation), do not justify their
conclusion that our dose level put subjects on the
other side of the curve. Clearly one cannot fit
a U function with only two points. Moreover, 
it does not seem evident that the increase 
from .007 cc/kg (the equivalent of Schachter 
and Singer’s .5 cc) to our .01 cc/kg would rep- 
resent a vast difference on an inverted-U curve 
leading to performance reversal. We know of no 
empirical literature to support such an inference. 
The magnitude of the physiological reactions 
and self-reports of our increased arousal subjects 
were not qualitatively different from standard 
arousal subjects. (As can be seen in the data
in our Table 1, the mean palpitation index was
.9 for the standard dose subjects and only
slightly higher, 1.4, for the increased arousal 
group.) In the animal literature cited by Schach- 
ter (1971), no differences were found for a 2:1 
dosage ratio (p. 38), and to get dramatic de- 
bilitating effects the dosage ratio was pushed 
to 20:1. 
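
The two-point objection can be made concrete with a short worked sketch. It assumes, purely for illustration, that the inverted-U relation is quadratic; the symbols $E$ (reported affect), $d$ (dose in cc/kg), $s$, and the coefficients $\beta_0$, $\beta_1$, $\beta_2$ are hypothetical and appear in neither study. Only the two dose levels are taken from the text.

```latex
\begin{align*}
% Assumed (hypothetical) quadratic form of an inverted-U dose-affect curve
E(d) &= \beta_0 + \beta_1 d + \beta_2 d^{2}, \qquad \beta_2 < 0 \\
% Two observed dose levels from the text: d_1 = .007 cc/kg, d_2 = .01 cc/kg
E_1 &= \beta_0 + \beta_1 d_1 + \beta_2 d_1^{2}, \qquad
E_2 = \beta_0 + \beta_1 d_2 + \beta_2 d_2^{2} \\
% Subtracting the two equations leaves one constraint on two unknowns
s &\equiv \frac{E_2 - E_1}{d_2 - d_1} = \beta_1 + \beta_2 (d_1 + d_2) \\
% Location of the peak, written in terms of the still-free coefficient \beta_2
d^{*} &= -\frac{\beta_1}{2\beta_2} = \frac{d_1 + d_2}{2} - \frac{s}{2\beta_2}
\end{align*}
```

Under these assumptions the two group means fix only the slope $s$. Because $\beta_2$ remains free, a negative $s$ places the peak $d^{*}$ anywhere below the midpoint of .0085 cc/kg, and the limiting case $\beta_2 \to 0$ (a straight line) passes through the same two points equally well.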

Switching from rats to Schachter and Singer’s 
(1979) phenomenological reports of their drug 
reactions, not one of our subjects ever reacted 
with anything resembling their near-to-death 
sensitivity. Of course, we and our graduate as- 
sistants pretested all drug doses on ourselves 
before administering them to any subject. The 
effects were remarkably variable. Marshall was 
hardly bothered by the .01-cc/kg dose, Zimbar- 
do’s experience was more marked, and the re- 
actions of four others were arrayed from mild 
to strong.

When one’s immediate physical reaction to a
drug is intense, then the direct consequence is 
the perception that something is “very wrong” 
and the feeling is “lousy.” In short, such a con- 
dition is not amenable to manipulation into 
positive moods by social cognitions—according 
to the current comments by Schachter and 
Singer (1979, p. 991). They state further that 
adrenaline alone produces a negative state when 
animals are in disturbing situations (pp. 991- 
992). Following their lines of argument, we would 
restate our position as follows: In unfamiliar set- 
tings (like those faced by the subjects in these 
experiments), the awareness of strong physical 
arousal symptoms in the absence of an adequate 
explanation creates a disturbing situation that is 
likely to be appraised in negative affect terms. 
Strong unexplained arousal states, then, should 
exhibit a negative bias, which is likely to be 
resistant to the blandishments of external social 
cognitions that are positive. That is a central 
conclusion of our research. 

All of the evidence advanced by Schachter 
and Singer (1979) regarding the drug studies 
done in the 1920s and 30s (p. 994) is irrelevant 
both to our point and the necessary conditions 
for testing their two-factor theory of emotion. 
Subjects must find themselves aroused without 
adequate explanation in order for there to be 
a “bodily state in search of an appropriate cog-
nition.” In the studies by Marañon, Cantril and
Hunt, and Landis and Hunt, cited by Schachter 
and Singer (1979), the subjects were fully aware 
of their arousal; they had a totally adequate 
explanation of its source in the injection they 
received. The “cold” or “as if” emotions shown 
by the majority of these subjects are not per- 
tinent to our analysis, because we are interested 
in understanding the psychological experience of 
affect when particular somatic elements are 
present as part of an emotion, and not simply 
in a person’s description of his or her somatic 
state. 

It must be noted that repeated exposure to 
drug-induced arousal with a known physical 
source was typical for many of the Cantril and 
Hunt subjects. Such a condition is qualitatively 
different from those we are concerned with and 
have studied in the analysis of the consequences 
of unexplained arousal. Moreover, the “genuine- 
emotion subject” selected by Schachter and 
Singer (1979, p. 994) to illustrate that reactions 
to adrenaline can be pleasant, also reported 
“cold emotions” at another episode and also 
reported strongly negative consequences of 
arousal: “Extreme fear was present, but no con-
tent for it at the time. . . . afraid” (p. 305).

Unfortunately, the bias of researchers willing
to force the evidence to prove their point also 
is evident if one reads the original Landis and 
Hunt paper and Schachter and Singer’s (1979) 
summary of it. Hysterical laughter does not tell 
us anything about the internal subjective state 
experienced by the institutionalized “lunatics” 
used as subjects. Landis and Hunt (1932) make 
quite explicit the difference between “outward 
and manifest expressive emotional reactions.” 
Despite the overt hysteria, all cases labeled 
“emotion-subjective” were negative. This is ex- 
actly what Schachter and Singer would have 
predicted if they were using these data to sup- 
port their earlier point about the negative effects 
of large doses of adrenaline. The Landis and 
Hunt subjects received much larger doses than 
our increased arousal subjects. Their reactions 
were quite variable but intense. One subject 
(classified “dementia praecox”) reported feeling 
“pretty nervous” at 1.0 cc and laughed hysteri-
cally at 1.5 cc, while another in a manic phase
had a crying spell at 1.0 cc and hysterical
laughter at 1.5 cc.

This evidence again underscores the need for
Schachter and Singer and all who work in the
area of emotion to acknowledge the difference
between public expressions of emotional reac-
tions and the subjective experience of emotion. 
The two are rarely correlated in any of the 
studies on “emotional plasticity” or misattribu-
tion. The overt behavior may not reflect under-
lying affect at all, but rather conformity to situ-
ational contingencies. In our research and that
of Maslach, many subjects faced with the antics
of a euphoric confederate gave positive overt
reactions, even laughed, but that behavior was
independent of their self-reported affect, which
was usually negative.

This same inconsistency shows up in Schach-
ter and Singer’s (1962) original data (the in-
ternal analyses used to salvage the study do not
even present self-reported affect data) and is
more glaring in Schachter and Wheeler (1962). 
The significant finding that is emphasized is 
that placebo subjects gave less overt laughter 
to a slapstick movie than did epinephrine sub- 
jects. However, on an overall amusement index, 
placebo subjects did not differ significantly from
the epinephrine subjects (see their Table 3, p.
126), despite Schachter and Singer's (1979)
fanciful relishing the contrary in their Footnote
3. Of greater import is the inability of Schachter
and Wheeler to find any difference on self-re-
ported reactions to the film. The means in their
Table 4 are virtually identical for epinephrine,
placebo, and chlorpromazine subjects: “funny,” 
4.1, 4.0, and 3.8, respectively; “enjoy it,” 4.0, 
3.9, and 3.8, respectively; “recommend it,” 2.0, 
1.9, and 1.8, respectively. All subjects reported 
that they felt the film was “mildly funny,” they 
“enjoyed it a little,” and would “recommend it
moderately.” This is hardly a pattern of data 
one would predict from Schachter and Singer’s 
theoretical propositions. 


Experimental Instructions 


In response to an injection of epinephrine, 
“as far as the subject is concerned, the major 
subjective symptoms are palpitation, tremor, 
and sometimes a feeling of flushing and accel- 
erated breathing” (Schachter & Singer, 1962, 
p. 382). Rothballer (1959), an expert on the 
effects of the drug, finds that tremor is “almost 
invariably” noted; palpitation is “the next most 
common finding” (p. 510). Thus, to mislead 
subjects about their anticipated reactions, symp- 
toms other than these primary ones were men- 
tioned as the side effects of their injection by 
both Schachter and Singer and us. 

In the original Schachter and Singer (1962) 
study, the symptoms suggested were headache, 
numbness, and itching. In our replication we 
also mentioned the irrelevant symptom of head- 
aches, but changed numbness to coolness and 
itching to dryness of mouth. Our reason for 
doing so was intentionally to include some 
minor, occasionally reported symptoms associ- 
ated with epinephrine in order to increase the 
validity of the general instructions. We felt that 
if subjects experienced some weak effects that 
were attributed to the drug, they would be more 
likely to believe in the validity of the physi- 
cian’s instructions. This should have given them 
less reason to suspect that their major reactions 
(tremor and palpitations) were drug-related, 
since the credible communicator did not include 
them in his description of anticipated side 
effects. 

In Schachter and Singer's (1962) study, the
epinephrine-misinformed subjects tended to re-
port fewer of the false symptoms, headaches,
numbness, itching, than some or all of the other
groups. In rejecting the symptoms specified by
the physician, these subjects may have also 
doubted her competence. In a thorough critique 
of this feature of Schachter and Singer’s work, 
Kemper (1978) suggests that the epinephrine- 
misinformed subjects were likely to have per- 
ceived the whole study as “unscientific” because 
of the totally wrong symptom expectations they
were given. Our instructions, thus, were an at-
tempt to make the misinformation treatment 
function more appropriately in operationally 
evaluating the two-factor model of emotion. 

Schachter and Singer (1979) argue that in 
doing so we did not properly misinform our 
subjects, and that is the reason we failed to 
find even a “shred of support for [their] origi- 
nal findings.” They remind us that Schachter 
and Wheeler (1962) found a significant “dry-
ness of mouth” effect. Examination of those data
reveals that on a 5-point scale, the mean re-
ported degree of dryness for epinephrine sub- 
jects was a scant .7, while placebo subjects were 
different by only .4 units. We believe that the 
dominant symptoms experienced by our subjects 
were unaccounted for and that the conceptual 
essence of the misinformation procedure is 
satisfied. 

We would argue further that there are more 
substantial procedural differences between our 
study and Schachter and Singer’s that readers 
should note. They dropped epinephrine subjects 
who reported “virtually no” palpitations or 
changes in heart rate before making their be- 
tween-conditions analysis. This questionable sub- 
ject selection excludes those who may have been 
physically inactive (activity alone will accelerate 
heart functions). These are the very subjects
likely to have been focusing on their internal 
arousal rather than hula hooping and thus re- 
sponding more negatively to their unexplained 
arousal (as we found). Another key difference 
may be that their subjects did not know prior 
to their appearance that they were to be in- 
jected. Ours consented to being injected before 
finding themselves in a fait accompli situation. 

There are other differences that might be 
dredged up to account for our failure to repli- 
cate Schachter and Singer’s (1962) findings— 
but sight is lost of the fact that we replicated
their null effects only too well, despite many
surface differences in procedures. On the Schach- 
ter and Singer affect index, the mean of their 
placebo group is 1.6, while ours is 1.5. The com- 
parison between the two epinephrine-misin- 
formed groups shows their mean is 1.9; ours is 
1.4. Although they never report standard devia- 
tions, these small differences (.5 scale units on 
a 9-point scale) are well within sampling error 
range. Thus, we found virtually the same degree 
of inconsequential treatment difference as they 
did originally. 

“The most pressing and crucial need with 
regard to Schachter and Singer results is a 
replication,” states Kemper (1978, p. 174), who 
goes on to comment that “the credence with 
which the findings have been greeted is some- 
what surprising in light of the absence of con- 
firmatory results from another experimenter in 
which exactly the same findings were sought” 
(italics in original). 

We indeed sought to replicate what we 
thought were the important results found by 
Schachter and Singer and could not. We tried 
to salvage something of value from an unsatis- 
fying failure to replicate this classic by adding 
treatments that extended and tested some boun- 
daries of their theory of emotion. 

All the available data from our study, Mas- 
lach’s, and Schachter and Singer’s should be 
open to the close scrutiny of readers who by 
now must be wary of possibly being misin- 
formed by the personal biases of the researchers 
involved in this argument. We hope some im- 
portant clarifications have emerged from this 
exchange and apologize for occasionally replying 
in kind to personal attacks on our good sense 
and sensibility. In the future we will heed Oscar 
Wilde’s (1908) admonition: “Arguments are to 
be avoided; they are always vulgar and often 
convincing” (p. 52). 
This article addresses interpretive questions raised in the immediately preceding
articles by Christina Maslach and by Gary D. Marshall and Philip G. Zimbardo.


First man: How’s your wife? 
Second man: Compared to what? 
(H. Youngman) 


The Maslach Study 


On page 384 of our write-up (Schachter &
Singer, 1962) of the epinephrine-emotion experiment
we wrote, “Immediately⁶ after the
subject had been injected, the physician left
the room and the experimenter returned with
a stooge.” Footnote 6 read:


It was, of course, imperative that the sequence with
the stooge begin before the subject felt his first symptoms
for otherwise the subject would be virtually
forced to interpret his feelings in terms of events
preceding the stooge’s entrance. Pretests had indicated
that, for most subjects, epinephrine-caused
symptoms began within 3-5 minutes after injection.
A deliberate attempt was made then to bring in the
stooge within 1 minute after the subject’s injection.


The matter of timing is obviously crucial
in all of these experiments. If the subject experiences
palpitations, accelerated breathing,
and so on before the emotion-inducing manipulation,
there is obviously little possibility
that he can “attribute” or “interpret” his
physiological state in terms of the experimental
manipulation and he must interpret it in
terms of events preceding the manipulation
(e.g., he is nervous about being in the experiment,
excited about something that happened
to him, getting sick, etc.).


Requests for reprints should be sent either to
Stanley Schachter, Department of Psychology, Columbia
University, New York, New York 10027 or
to Jerome E. Singer, Medical Psychology, Uniformed
Services University of the Health Sciences, Bethesda,
Maryland 20014.


In the crucial phase (Part 2) of Maslach’s 
(1979) experiment, the sequence of events is 
as follows: 

1. For roughly a 1-minute period the sub- 
ject is exposed to a 15-word list on a memory 
drum. 

2. There is a 1-minute recall test. 

3. For the next minute, the subject is ex- 
posed to another 15-word list. The final word 
on this list is either the neutral word or the 
arousal word start. 

4. There is a 1-minute recall test.

5. “Next, the subject and confederate were 
told to look through the materials in the folder 
and were then left alone” (p. 958). On the 
assumption that there was at least some brief 
description of what they were supposed to do 
with the “materials in the folder,” this could 
have taken anywhere from a few seconds to a 
few minutes. 

6. Only after they start looking through 
the folder does the confederate go into her 
act, and from the description, it seems that 
the first minute of this routine is essentially 
neutral, with little affective overtone. 

Maslach’s posthypnotic arousal suggestion 
was the following: “When you see the word 
start, your heart will beat faster, your breath- 
ing will increase, there will be a sinking feel- 
ing in your stomach, and your hands will get 
moist. You will feel all of these sensations 
as soon as you see the word start” (p. 958). 


The word start appeared on the second
word list, which was followed by a 1-minute
recall test. Since Maslach uses performance
on this recall test as proof that the posthypnotic
suggestion has taken, it must follow
that her subjects were in a state of arousal
for some minutes before the confederate even
began his or her act.

In addition, we note that this is not the 
first time during this experimental session that 
subjects were thrown into this state of marked 
physiological arousal with no explanation pos- 
sible from the preplotted script of the con- 
federate. To check on the effectiveness of the 
hypnotic induction of arousal, the first part 
of the experiment consisted of a private ses- 
sion with the subject during which he or she 
was hypnotized, the arousal suggestion was 
planted, the subject was exposed to the cue 
word start, and his or her physiological re- 
sponses were measured. In short, twice dur- 
ing the experimental hour, a subject experi- 
enced, suddenly and with no explanation (if 
we credit the effectiveness of the amnesic
suggestion), palpitations, speeded-up breathing,
a sinking feeling in his or her stomach,
and sweaty hands. The first time this happened
the subject had not yet met the confederate;
the second time it happened, the
subject was working on a memory drum task 
and the confederate had not yet said a word. 

Because of these matters of timing, Mas- 
lach’s experiment and results have little to 
do with our experiment or with our hypothe- 
ses. The study does, however, unequivocally 
prove that when for no apparent reason, a 
subject’s heart starts pounding, his or her 
breathing accelerates, palms turn sweaty, and
stomach sinks, his or her mood is not likely 
to be affected by the irrelevant behavior of 
a confederate that follows after the appear- 
ance of this bodily state. The study also 
proves that subjects who, again for no ap- 
parent reason, find themselves in this bodily 
condition are likely to describe themselves as 
in a lousier mood than do subjects to whom 
this does not happen.

Footnotes notwithstanding, within her ex- 
perimental context there is only one way for 
Maslach to prove her position, and that is to 
run a set of conditions in which the word 
start appears a minute or two after the con- 
federate starts his or her happy or angry 
routine and where there has been no prior 
posthypnotic arousal. If she is right, she will 
find the same results in these conditions as 
in those already run. If we are right, she will 
find that subjects in the happy condition are
happy.


The Marshall and Zimbardo Study: 
1. Dosage Levels 


As to the Marshall and Zimbardo (1979) 
study, we find ourselves both bemused and 
perplexed by their strategy of replication. The 
simple fact is that their experiment repli- 
cates ours all too well. We never found a dif- 
ference between the conditions they chose to 
replicate, and neither do they. Having found 
what we did, they then ran an additional con- 
dition with markedly increased doses of epi- 
nephrine and found that this produced a state 
of negative affect. Then ignoring virtually 
everything else that we and others have done,
they conclude on the basis of this, almost
their sole positive finding, that “epinephrine-
related physiological arousal does not provide 
‘emotional plasticity,’ but rather shows a con- 
sistent association with negative affect” (p. 
983). As Maslach (1979) summarizes these 
data, “When the arousal was made stronger 
and more salient, subjects reported a negative 
emotional state, rather than the positive ‘eu- 
phoria’ predicted by Schachter and Singer” 
(p. 954). 

1 Apparently in response to comments made by us
in previous communications, Maslach discusses these
issues in her Footnotes 7 and 10. Footnote 7 rejects
the prior-arousal possibility by citing, without data,
a pretest that could contain the faulty-timing artifact,
and by citing an unpublished manuscript (Zimbardo
& Maslach, Note 1). Footnote 10 raises three
arguments against the faulty-timing objection. The
first raises a new distinction—between normative
and causal information—and claims that for causal
information the confederate’s actions must occur
merely at the same time as the subject’s arousal,
not necessarily before the arousal. But that misses
our point: The confederate need not start before
the arousal but will certainly be of little effect if
he or she starts after the arousal has occurred and
a label has already been chosen. The reason for
starting early is to assure that the unexplained
arousal does not occur in the absence of a reasonable
social context. Arguments 2 and 3 presuppose uncritical
acceptance of the Marshall and Zimbardo
article, while Marshall and Zimbardo presuppose uncritical
acceptance of the Maslach article.

In fact, Schachter and Singer would have
predicted nothing of the sort. Maslach, Marshall,
and Zimbardo do not mention the fact,
but we have been explicitly concerned with 
the effects of dose on emotional behavior, and 
in Schachter (1971), we reported experi- 
ments and reviewed evidence that small doses 
of epinephrine facilitate emotionally mediated 
behaviors in animals, whereas large doses in- 
terfere with such behavior. As dose increases, 
animals move less, get sick, and at large 
enough doses, die. Obviously we have done 
no counterpart to these dose experiments with 
humans, but it hardly seemed necessary, for 
in searching for a feasible dosage of epinephrine
in our original experiment, we ourselves
tried doses as heavy as, and heavier than, that 
used by Marshall and Zimbardo in their in- 
creased arousal condition. We would not do 
it again and can only assume that Marshall 
and Zimbardo never tried their increased dose 
on themselves or, if they did, that they are 
made of considerably stronger stuff than are 
we. At this dose, we did not have palpitations—our
hearts pounded; we did not have
tremors—we shook. We might have been convinced
by someone that we were about to die,
but no amount of social psychological tomfoolery
could have convinced us that we were
euphoric, or angry, or excited, or indeed anything
but that something was very wrong and
that we felt lousy. The point has always
seemed too obvious to dignify with the status
of the formal hypothesis that for humans, too,
there is an inverted-U relationship between
adrenaline dosage and the effectiveness of an
emotion-inducing manipulation.


Other Research Evidence 


We concede that we read Marshall and
Zimbardo’s (1979) description of our results
with wistful envy. When they report that they
were intrigued “to discover the reported ease
with which the same condition of physiological
arousal could be steered in as seemingly
opposite affective directions as happiness or
anger as a result of these cognitive manipulations”
(p. 971), they certainly are not describing
our experiment. The elaborate stage
setting, the large cast, and the attention to
exquisite nuances of dosage and timing yield
results requiring two internal analyses and
three supplementary studies to make our
point. Our conclusions about emotion were
hardly based on this experiment alone, and
we believe that there was no write-up of this
study in which we did not make explicit what
we believed was wrong with the experiment
and did not point out the gross inadequacies
of the data and then go on to describe the
further research that did support our propositions.
It was not that we were particularly
humble about our work; we simply did not
want to seem to be nitwits by basing a theoretical
structure on obviously inadequate data.²

Since Maslach, Marshall, and Zimbardo 
either disregard or, to our mind, misinterpret 
this supporting material, we shall review this 
literature. First, Schachter and Wheeler 
(1962) examined subjects’ reactions to a slap- 
stick movie. In terms of quiet reactions, such 
as smiling, there were no differences between 
subjects injected with adrenaline and those 
injected with placebo. In terms of broad re- 
actions such as laughter, epinephrine subjects 
laughed significantly more than did placebo 
subjects. As far as raucous reactions are con- 
cerned, 16% of the epinephrine subjects belly 
laughed; not a single placebo subject did so 
(p < .01)—hardly the pattern of findings one 
anticipates from an agent that produces a 
bias toward “negatively toned reports of
affect.”³

2 Relevant to the issue raised on pages 984-986 of
the Marshall-Zimbardo Postscript, we suggest reading
all of page 396 of our original article rather than
the few sentences that Marshall and Zimbardo have
selected from this page. Readers can then decide
whether we have based our conclusions on this single
experiment or whether we base them on the body
of research that was generated by this study.

3 Maslach also describes this experiment in her
article. Anyone who relished Hastorf and Cantril’s
(1954) observations on the Princeton-Dartmouth
game of 1951 should enjoy comparing these two descriptions
to Table 3 and the pertinent paragraphs
on pages 125-126 of the original article (Schachter
& Wheeler, 1962).

On negative emotions such as fear, we did
a number of animal studies, of which the most
relevant was Singer’s (1963) demonstration
that in neutral situations, an animal injected
with adrenaline is no more disturbed or fearful
than a control animal. In frightening situations,
however, the adrenaline animal is markedly
more frightened than a placebo animal.
This is a result that in various forms has been
replicated several times, most recently by
Haroutunian and Riccio (1977). In and of
itself, then, adrenaline does not produce a
negative state in animals. Only in disturbing
situations does it do so.

Table 1
The Effects of Ephedrine and Placebo on
Mood in Various Experimental Contexts*

                         Setting
Condition     Happiness    Neutral    Anger
Ephedrine       33.2         27.2      24.3
Placebo         28.5         30.7      30.8

Note. Higher scores indicate more pleasant ratings.
* From Erdmann and Janke (1978).

As to replications and related studies con- 
ducted since our group was working on these 
studies in the early 1960s, we simply have not 
followed the literature systematically, and in 
preparing these comments we have not had 
sufficient time to institute a thorough library 
search. However, what we have learned rele- 
vant to the issue of “emotional plasticity” 
seems to us to support our original findings. 
Though only we and Marshall and Zimbardo 
seem to have used epinephrine, on the whole, 
studies employing other sympathomimetic 
agents report findings consistent with our 
original findings. Erdmann and Janke (1978)
relied on a well-disguised oral administration
of the sympathomimetic agent ephedrine to
manipulate arousal. They had an eight-condition
experiment—four placebo conditions and
four (in our terms) ephedrine-ignorant con- 
ditions (i.e., because of the mode of adminis- 
tration, subjects could have no idea of why 
they felt as they did). Crosscutting these con- 
ditions, Erdmann and Janke manipulated four 
emotional contexts: neutrality, happiness, an- 
ger, and anxiety. They measured the joint 
effects of context and arousal on self-reports 
of mood using both rating scales and a stan- 
dardized adjective checklist. For the condi- 
tions relevant to this article, their results were 
as shown in Table 1 for the mood scale de- 
rived from the adjective checklist. 

The three placebo conditions do not differ. 
With ephedrine, however, the conditions diverge,
and in the predicted directions. The
ephedrine-happiness group is more euphoric 
than the ephedrine-neutral group, which in 
turn is happier than the ephedrine-anger 
group. These results are consistent with ours 
and present none of the interpretive problems 
of the Marshall and Zimbardo and Maslach 
studies. Given the Maslach-Marshall-Zim- 
bardo hypothesis that sympathomimetic agents 
bias people toward negative affect, it is of
interest to note that only in the anxiety con- 
ditions did ephedrine fail to have an effect— 
a finding that Erdmann and Janke attribute 
to the ceiling effect. 

We note, too, that our interactionist view 
of emotion has become something of a model 
for studies of a variety of psychopharmaco- 
logical agents. For at least two agents, mari- 
juana and alcohol, there have been studies 
that clearly demonstrate that whether or not 
these agents produce positive, negative, or 
neutral affective states depends on the inter- 
action of the physiological effects of these 
agents with cognitive, social, and situational 
factors (Pliner & Cappell, 1974; Rossi, 
Kuehnle, & Mendelson, 1978). Obviously, all 
of the work on the effects of set on the action 
of drug agents makes much the same point 
(Wikler, 1957). 


The Marshall-Zimbardo Study:
2. Experimental Instructions 


Given this background material, it hardly 
seems an exotic hypothesis to repeat our sug- 
gestion of almost 20 years ago that adrenaline, 
too, in mild doses, produces a state of auto- 
nomic arousal that of itself is neither pleasant 
nor unpleasant but to a considerable extent 
takes on its affective coloration from the situ- 
ation. How then can we account for Marshall 
and Zimbardo’s results in their basic epinephrine
and placebo conditions? In these conditions,
unlike the increased arousal condition,
their dose was the same as ours, yet consis- 
tently on self-report and activity measures 
their epinephrine subjects were somewhat un- 
happier than placebo subjects. Although, 
given the inadequacies of our experimental 
design, we do not expect significant differences 
between these conditions, at the very least
our line of thought would lead us to expect
that, as in our experiment, the epinephrine— 
basic subjects would be slightly more, not less, 
euphoric than placebo-basic subjects. 

Since the two experiments differ in numer- 
ous, possibly consequential, ways, it is ob- 
viously guesswork to try to understand why 
Marshall and Zimbardo fail to find even this 
shred of support for our original findings, but 
we have a pretty good guess. Marshall and
Zimbardo attempted to replicate what we
called the epinephrine—misinformed condition.
The original instructions in this condition 
were: 


I should tell you that some of our subjects have experienced
side effects from the Suproxin. These side
effects are transitory, that is, they will only last for
about 15 or 20 minutes. What will probably happen
is that your feet will feel numb, you will have an
itching sensation over parts of your body, and you
may get a slight headache. (Schachter & Singer, 1962,
p. 383)


None⁴ of these symptoms, of course, are consequences
of an injection, and in our study
these instructions were deliberately designed 
to provide the subject with an inappropriate 
explanation of his bodily feelings. 

For no reason we can fathom,⁵ Marshall and
Zimbardo decided to change our symptoms, 
and their instructions were the following: 


I should mention that although there are usually no
side effects to the injection, a few of our subjects
have reported some minor transitory reactions that
may have resulted from the injection. These have
typically included a feeling of dryness in the back
of the mouth or throat, or a feeling of coolness in
the hands or feet, and a couple of people have reported
a slight headache. (Marshall, 1976, p. 13)


Unfortunately, both dryness of mouth and 
coolness in the hands or feet are consequences 
of an injection of adrenaline, and rather than 
having the counterpart of our epinephrine— 
misinformed condition, Marshall and Zim- 
bardo to some extent have the counterpart of 
our epinephrine-informed condition. In the
Schachter and Wheeler (1962) study, one of 
the questions asked to assess the effects of the 
drug manipulations was, “Does your mouth 
feel dry?” In Table 1 of that article it can 
be seen that subjects injected with epinephrine 
reported significantly (p < .01) more dryness
than subjects injected with placebo. On the
question of coolness of limbs, it is well known
that adrenaline causes marked peripheral vaso- 
constriction, and several studies (e.g., Bier- 
man, 1941; Wenger et al., 1960) have dem- 
onstrated that epinephrine causes a marked 
decrease (as much as 5 °C) in the temperature 
of the limbs. An injection of epinephrine will 
produce both a feeling of dryness in the mouth 
and a feeling of coolness in the hands and feet. 

The Marshall and Zimbardo basic epineph- 
rine condition can hardly, then, be considered 
the counterpart of our epinephrine—misin- 
formed condition. If anything, it is somewhat 
comparable to our epinephrine—informed con- 
dition, in which the subjects were given an 
accurate description of their symptoms. This 
condition was designed to test the hypothesis 
that when subjects have an appropriate ex- 
planation for their physiological symptoms, 
they will be unlikely to label their feelings in 
terms of the alternative cognitions available. 
This proved to be the case, since subjects in 
the euphoria-epinephrine-informed condition 
of our experiment were less happy than placebo 
subjects. We suggest that this finding is in- 
advertently replicated in the Marshall-Zim- 
bardo experiment. 

4 The pharmacological texts (Goodman & Gilman,
1975) list “headache” as one of the effects of an injection
of epinephrine. At the doses we used, this
proved not to be the case, as can be seen in Table
1 of the Schachter and Singer (1962) article. In our
pretests, however, headaches did sometimes accompany
heavier doses of adrenaline.

5 Relevant to Marshall and Zimbardo’s postscript
justification for these changes in procedure, we note
that we have read Marshall’s PhD dissertation carefully
and nowhere in the 224 pages of text that is
heavily concerned with differences between his study
and ours is there any justification for changing the
symptom description from “numbness to coolness
and itching to dryness of mouth” (Marshall & Zimbardo,
1979, p. 987). In fact, nowhere in the entire
text is this change even mentioned. These are simply
referred to again and again as “irrelevant symptoms.”

Obviously, we are unconvinced that either
the Maslach or the Marshall-Zimbardo experiments
demonstrate what is claimed for
them. We believe that the various experimental
problems we have identified invalidate the
conclusions drawn from these studies. Oddly
enough, however, the simplest and most direct
test of the major conclusion drawn by these
authors would bear little resemblance to either
their experiments or ours.


Evidence on Emotional Plasticity 


Clearly, Marshall, Maslach, and Zimbardo 
feel that their experiments are more than just 
a failure to replicate; they signal a fundamen- 
tal deficiency in our model. Emotions, they 
claim, are not really so plastic. While there 
is no fundamental disagreement between us 
concerning the need for both arousal and an 
appropriate cognition, they suggest that the 
ultimate label of the physiological state asso- 
ciated with adrenaline is always likely to be 
unpleasant. Marshall and Zimbardo (1979) 
summarize their own and Maslach’s work in 
these words, “The general picture that emerges 
... indicates that epinephrine-related physio- 
logical arousal does not provide ‘emotional 
plasticity,’ but rather shows a consistent association
with negative affect” (p. 983).⁶

The test of the hypothesis that such arousal 
is unpleasant requires neither our cumbersome 
set of studies nor the equally complicated 
Marshall, Maslach, and Zimbardo replications. 
All it requires is that a number of people be
injected with adrenaline and asked to describe
how they feel—precisely the studies done in
the 1920s and ’30s by Marañon (1924), Cantril
and Hunt (1932), and Landis and Hunt
(1932). These studies are cited by Marshall
and Zimbardo, but our reading of them differs
considerably from theirs. They restrict themselves
only to Marañon’s subjects who report
“true” emotions and neglect the larger group
who reported “cold” or “as if” ones. But the
point at issue is not whether exogenous adren- 
aline produces an emotion rated in Marañon’s 
context as genuine, but whether it produces 
a feeling of unpleasantness. The “cold” emo- 
tions, as far as we can determine from Mar- 
shall, Maslach, and Zimbardo’s theory, are 
as pertinent to this point as any. This is es- 
pecially true since Marañon suggests that 
“real” emotions following the injection are 
most likely to occur in those people with a 
prior endocrine disturbance or a history of 
affective disorder. 

Marañon (1924) does not systematically 
categorize the reactions of his subjects, but 
from the various quotations throughout his
article, it is evident that his subjects ran the
full gamut of reactions from the negative “I 
feel as if I were afraid” to the exultant “some- 
thing which comes up from the guts as when 
one is awaiting a great happiness” (p. 309). 
Cantril and Hunt (1932) find no evidence of 
a uniformly negative affective reaction to 
adrenaline. In their Table 1, entitled “Sum- 
mary of the Results of the Experiment,” one 
out of the four cases of genuine emotion was a 
pleasant one, and in 11 cases of reported affect,
3 were pleasant, 3 were unpleasant, and
5 were indifferent. The following self-description,
from a genuine-emotion subject, certainly
does not appear to us as the product of a bias 
toward negativism: 


Excitement (have felt like this in large class when 
I think I have a good idea, when raising my hand). 
No elation, but a background of elation. Feel light, 
as I used to in a burst of creative activity (writing). 
A pleasant state, possibly by association with such 
activity. Not unfamiliar. Used to attempt to bring 
on such states deliberately in adolescence. . . . During 
times of tremendous intellectual experiences in the 
past, times when my vision seemed peculiarly clear 
and far reaching, I have experienced similar reac- 
tions, but not even then so strongly. A tremendous 
illusion of power, vitality. (p. 303) 


6 It is difficult to determine if Marshall and Zimbardo
feel that the state of arousal induced by epinephrine
is of itself sufficient to produce negative
affect or if it requires an inadequate explanation as
well. The quotation above suggests that the arousal
state alone is sufficient; in other places they seem
to indicate that an “inadequate explanation” is also
required. In either case, their experiment has nothing
to do with the question, for they do not manipulate
the adequacy of the explanation. Our own results
suggest that contrary to what we believe are Marshall
and Zimbardo’s expectations, providing a subject
with an adequate explanation of his or her arousal
state will, in euphoria conditions, worsen his or her
mood. Our epinephrine-informed subjects were deliberately
told what they would feel and why. Such
subjects reported themselves as significantly less euphoric
than subjects who had no adequate explanation
for their bodily state.

Landis and Hunt (1932) report results that
may be unrepresentative, since they used institutionalized
lunatics as subjects, but here,
too, the authors report considerable variety
in emotional response. Following the injection,
three of their subjects had weeping fits, and
three of their subjects broke into giggles or
hysterical laughter—hardly the uniformly
negative affective tone required by the Marshall,
Maslach, and Zimbardo formulation.
Landis and Hunt conclude, “The adrenaline
syndrome may be the basis for either pleasant
or unpleasant emotional experiences” (p. 484).

To add our own bit, the sensation was for
us curious and exhilarating—a feeling of emotional
déjà vu. We had felt this way often in
the past, but in this laboratory context it made
no sense to feel this way—a pure arousal
state uncluttered by passion or fear, affection
or hatred. It was neither pleasant nor unpleasant;
it was, though, absorbing enough
that at mild doses, we were delighted to
serve as our own subjects again and again.
In these days of ethical guidelines and human
subjects committees, this may very well
be the end of the matter, for it is unlikely
that anyone will do experiments such as ours
or Marshall and Zimbardo’s for quite a while,
if ever again. On the particular issue at stake,
however, this is probably of little moment,
for this is one issue on which the readers can
serve as their own subjects. If they will do a
thorough introspective job after convincing a
physician to inject them with .5 cc of a
1:1000 solution of epinephrine, they can decide
which of us is right—or would this, too,
require the approval of a human subjects
committee?


Reference Note 


1. Zimbardo, P. G., & Maslach, C. Biased searches for
causal explanations of experienced discontinuities.
Unpublished manuscript, 1979.
Bem Sex Role Inventory:
A Theoretical and Methodological Critique 


Elazar J. Pedhazur
New York University

Toby J. Tetenbaum
Fordham University


The rationale underlying the construction of the Bem Sex Role Inventory
(BSRI), and the psychometric properties of the resulting instrument are scrutinized.
Following a brief discussion of different types of sex role research, the
results of two studies are presented and discussed. In Study 1, 1,464 graduate
students rated the desirability of the BSRI traits for one of three referents:
man, woman, or adult in American society. In Study 2, 571 graduate students
used the BSRI for self-ratings. It was found that, regardless of the referents
used, the “masculine” traits were relatively high in desirability but some of the
“feminine” traits were low in desirability. Discriminant function analyses revealed
that discrimination among groups was primarily due to the differential
ratings of the two traits Masculine and Feminine for the different referents.
Results from factor analyses of the ratings of desirability and the self-ratings
indicated that (a) Bem’s classification of the BSRI traits into masculine, feminine,
and neutral is not tenable; (b) the dimensions that underlie desirability
ratings differ from those that underlie self-ratings; and (c) the dimensions of
self-ratings of males differ from those of females. The article concludes with a
discussion of the implications of the findings for the measurement of androgyny.


Although sex roles have been a subject of 
study for some time, interest in them has 
greatly intensified as a direct result of the 
activities and writings dealing with, or ema- 
nating from, the women’s liberation move- 
ment. (For a review of sex role research, see 
Hochschild, 1973, and, more recently, Ruble, 
Frieze, & Parsons, 1976.) Investigations of 
masculinity, femininity, and androgyny have 
proliferated, emphasizing such aspects as historical
perspective (Bullough, 1973; Hunter,
1976; Taylor, 1973), development (Block,
1973; Block, Van der Lippe, & Block, 1973;
Maccoby & Jacklin, 1974; Rebecca, Hefner, 
& Oleshansky, 1976), psychological and sociological
correlates (Maccoby & Jacklin,
1974; Parsons, Ruble, Hodges, & Small,
1976), and paradigms for change (Bernard,
1975, 1976; Lipman-Blumen, 1973). Defini- 
tions of masculinity (e.g., Brannon, 1976; 
Pleck, 1975, 1976), femininity (e.g., Sherman, 
1971, 1976; Steinmann & Fox, 1966), and 
androgyny (e.g., Bem, 1974, 1975) similarly 
abound. 

There is a wide variety of instruments purportedly
designed to measure masculinity and
femininity—for example, the Fe scale of the
California Psychological Inventory (Gough,
1957); the Mf scale of the Minnesota Multiphasic
Personality Inventory (Hathaway &
McKinley, 1943); the GAMIN M scale (Guilford
& Zimmerman, 1949); and the Stereotype
Questionnaire (Rosenkrantz, Vogel, Bee,
Broverman, & Broverman, 1968). In an extensive
review of such instruments, Constantinople
(1973) concluded that they were inadequate
on theoretical as well as psychometric
grounds. In addition to questioning
definitions of masculinity and femininity as 
operationalized by existing measures, Con- 
stantinople’s primary criticism was directed 
at the assumptions of unidimensionality and 
bipolarity that seem to underlie the construc- 
tion and use of masculinity—femininity mea- 
sures. She concluded that the available evi- 
dence indicates that masculinity and femi- 
ninity are neither bipolar nor unidimensional. 

Reflecting similar concerns about the treat- 
ment of masculinity and femininity as anti- 
thetical, Bem (1974) constructed the Bem 
Sex Role Inventory (BSRI) to measure the 
two attributes as orthogonal dimensions, and 
also to yield an androgyny score. In the pres- 
ent article the rationale underlying the con- 
struction of the BSRI and its psychometric 
properties are scrutinized. It should be noted 
that much of what is said about the BSRI 
applies also to various other sex role measures 
currently in use. The BSRI was chosen be- 
cause of its great popularity, and because, 
from a perusal of research studies in which it 
was used and from statements about it in recent
textbooks (e.g., Oskamp, 1977; Wrightsman,
1977), it appears that its validity has
been accepted without much critical questioning.

Hochschild (1973) has identified four broad 
types of research in sex roles. One type, for 
example, deals with sex roles as reflected in 
actual or attributed sex differences in person- 
ality traits, interests, and the like. Another 
type focuses on appropriate behaviors for 
males and females. Regardless of the relative 
merits of the different types of sex role re- 
search, failure on the part of a researcher to 
use an unambiguous definition of the type of 
sex roles being investigated is bound to result 
in the construction or use of measures of 
dubious validity, as well as in ambiguities in
the conceptualization and execution of the research.
Moreover, explicit delineations and
specifications are necessary within each type
of sex role research. For example, is one
investigating role expectations or role enactments
(Sarbin & Allen, 1968)? Are sex differences
in traits studied normatively or prescriptively?
(For a detailed discussion, see
Spence, Note 1.)

Bem’s work was done in the context of 
trait stereotyping. Consequently, it is im- 
portant to bear in mind the potential prob- 
lems and ambiguities associated with research 
in stereotyping. For example, some researchers 
use the “everyday” or “commonsense” con- 
ception of stereotypes, which is usually con- 
sidered to be narrow and pejorative. Other re- 
searchers are primarily interested in one or 
more of the following aspects of stereotypes: 
their complexity, function, accuracy, resistance 
to change, to name but a few (Fishman, 1956; 
Vinacke, 1957). Bem offers no clear state- 
ment about the aspect, or aspects, of stereo- 
types in which she is interested, except to say 
that her concern is with positive traits only. 

Also, it has been demonstrated that the 
format and the conditions under which stereotypes
are elicited affect the results. For example,
Erlich and Rinehart (1965) provided
respondents with either a trait checklist or an 
open-ended format for the purpose of de- 
scribing different ethnic groups. It was found 
that respondents given the checklist used a 
larger number of traits and exhibited greater 
consensus than did those who responded to the 
open-ended format. A probably more impor- 
tant finding was that the trait list obtained 
with the open-ended format was considerably 
different from the one used in the checklist 
format. Dealing specifically with stereotyping 
of a woman, Clifton, McGrath, and Wick 
(1976) have demonstrated that different traits 
were attributed depending on the role ascribed 
to her, for example, housewife, bunny, athlete. 

With the above comments in mind, we 
shall proceed to examine the method used in 
the construction of the BSRI. Basically, 
judges were asked to rate the desirability of 
approximately 400 personality traits either for 
a man or for a woman in American society. 
Tests of significance of the difference between 
the mean ratings of each item were performed.
Twenty traits rated significantly more desirable
for a man than for a woman were designated
as masculine, and 20 traits rated significantly
more desirable for a woman than
for a man were designated as feminine. In
addition, 20 traits whose mean ratings of desirability
for a man and for a woman did not
differ significantly were designated as neutral. 
Having thus selected the traits, Bem proposed 
that the BSRI be used as a self-rating instru- 
ment, in which the respondent is asked to in- 
dicate on a 7-point scale, ranging from “never 
or almost never true” to “always or almost 
always true,” the degree to which each trait 
describes himself or herself. “Masculinity 
equals the mean self-rating for all endorsed 
[sic] masculine items and Femininity equals 
the mean self-rating for all endorsed [sic] 
femininity items” (Bem, 1974, p. 158). Two 
additional scores can be obtained from the re- 
sponses to the BSRI: an Androgyny score, 
which is essentially the discrepancy between 
the Masculinity and the Femininity scores, 
and a Social Desirability score, which is the 
mean self-rating on the 20 “neutral” traits. 
(For further details, see Bem, 1974.) 
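
The scoring rules just described can be made concrete with a short sketch. It is illustrative only: the function name `score_bsri`, the item numbering (1-20 masculine, 21-40 feminine, 41-60 neutral, following the grouping used in Table 1 rather than the order of items on the BSRI), and the sample ratings are assumptions. The Androgyny score is simplified here to the plain difference between the Femininity and Masculinity means, since the text says only that it is essentially the discrepancy between the two.

```python
from statistics import mean


def score_bsri(ratings, masculine_items, feminine_items, neutral_items):
    """Score one respondent's 7-point self-ratings.

    ratings maps an item number to a rating from 1 to 7; the three item
    collections say which items belong to which subscale.
    """
    for item, value in ratings.items():
        if not 1 <= value <= 7:
            raise ValueError(f"item {item}: rating {value} is outside 1-7")
    masculinity = mean(ratings[i] for i in masculine_items)
    femininity = mean(ratings[i] for i in feminine_items)
    return {
        "masculinity": masculinity,
        "femininity": femininity,
        # Simplification (assumption): plain difference of the subscale means.
        "androgyny_difference": femininity - masculinity,
        "social_desirability": mean(ratings[i] for i in neutral_items),
    }


# Hypothetical respondent: 5 on every masculine item, 6 on every feminine
# item, and 4 on every neutral item.
ratings = {i: 5 for i in range(1, 21)}
ratings.update({i: 6 for i in range(21, 41)})
ratings.update({i: 4 for i in range(41, 61)})

scores = score_bsri(ratings, range(1, 21), range(21, 41), range(41, 61))
print(scores)
```

Note that each subscale score is a simple mean, so it is interpretable as a single quantity only if the items of that subscale measure one dimension, which is the issue taken up below.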

Of the various problems and issues related 
to the construction and use of the BSRI, only 
several will be noted at this stage. Instead of 
defining the domains of masculinity and femi- 
ninity and attempting to construct measures 
consistent with the definitions, Bem has chosen 
a strictly empirical approach. While such an
approach may have limited utility when the
primary concern is with criterion-related validity,
its appropriateness is very doubtful
when the focus is on construct validity (e.g.,
Cronbach & Meehl, 1955; Travers, 1951).
Regardless of one’s viewpoint on this issue, however,
it is important to examine some of the
possible consequences and to indicate the 
basic requirements of the method of scale con- 
struction that Bem has used. 

It is clear that Bem was studying the desirability
of traits, or personality characteristics,
for males and females. But since the
meaning of the term desirable in the instructions
was not clarified, it was open to different
interpretations by respondents (e.g., normatively,
prescriptively; see Strahan, 1975, and
Spence, Note 1).

In the discussion of the rationale for the 
construction of the BSRI, the distinction be- 
tween traits and behaviors is occasionally 
blurred, as is evidenced by the statement that 
“because the BSRI was founded on the conception
of the sex-typed person as someone
who has internalized society’s sex-typed standards
of desirable behavior for men and
women, these personality characteristics were
selected as masculine or feminine on the
basis of sex-typed desirability” (Bem, 1974,
p. 155; italics added).

The fact that 400 t tests were treated as if
they were independent of each other seems to 
require no elaboration. It is, however, im- 
portant to note that when relying solely on 
tests of significance one runs the risk of over- 
looking the distinction between statistically 
significant and substantively meaningful find- 
ings. 
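
The consequence of running about 400 separate tests can be illustrated with simple arithmetic. The sketch below assumes a per-test significance level of .05 (an assumption made here for illustration) and asks how many traits would be expected to reach significance if no true differences existed. The expected count holds whether or not the tests are independent; the familywise figure assumes independence, which does not hold for ratings made by the same judges, so it is only a rough guide.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of significant results when every null hypothesis is true."""
    return n_tests * alpha


def familywise_error_rate(n_tests, alpha):
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests


n_tests = 400   # approximate number of traits tested
alpha = 0.05    # assumed per-test significance level

expected = expected_false_positives(n_tests, alpha)
familywise = familywise_error_rate(n_tests, alpha)
print(f"expected false positives: {expected:.0f}")
print(f"familywise error rate under independence: {familywise:.6f}")
```

Under these assumptions, about 20 traits would be flagged by chance alone, the same number that were selected for each of the Masculinity and Femininity subscales.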

The criterion employed for the designation 
of traits as masculine, feminine, or neutral 
seems to be based on the assumption that 
traits used to characterize one of the sexes 
are not, or should not be, used to characterize 
the other sex. No justification is offered for 
this assumption, which appears to be un- 
tenable on the basis of research on ethnic or 
racial stereotyping (see, for example, Erlich 
& Rinehart, 1965). One possible consequence 
of the trait selection procedure is that it may 
have resulted in the inclusion of traits that 
are not desirable but, rather, less undesirable 
for one of the sexes. In sum, the method of 
trait selection for the BSRI is not unlike the 
method occasionally encountered in other 
areas (e.g., achievement testing) where ex- 
clusive reliance on the statistical character- 
istics of the items (e.g., difficulty, discrimi- 
nation) for their selection may lead a test 
constructor to neglect the most important 
property of the test: its validity.

It is necessary to identify the dimensions 
that underlie the relations among the ratings 
of trait desirability, and those that underlie 
the relations among ratings of traits when 
they are used for self-ratings. The demonstra- 
tion that each of the proposed subscales is 
unidimensional is necessary for their valid 
use as summated rating scales, regardless of 
whether they were designed to be consistent 
with theoretical definitions of the constructs 
or whether they were arrived at empirically 
(Green, 1954). No evidence about the dimen- 
sions underlying the BSRI is provided by 
Bem. 


To provide support for and amplify some
of the observations made above, two studies
were undertaken: The first dealt with the 
ratings of desirability of the BSRI traits, and 
the second was concerned with the properties 
of the BSRI when it is used as a self-rating 
instrument. In an attempt to replicate Bem’s 
work as closely as possible, her format and 
instructions were used despite some of the 
reservations expressed above. 


Study 1: Desirability Ratings 


The 60-item BSRI was administered to 
1,464 graduate students of education attend- 
ing one of three universities in New York 
City. The instructions were those used by 
Bem (1974) in the trait-selection phase of 
her study; that is, the subjects were asked to 
rate the desirability of each trait for a man 
or a woman in American society. In addition, 
the present study sought ratings of desirabil- 
ity for an adult in American society. 


Descriptive Statistics 


Means and standard deviations of desirabil- 
ity ratings are reported in Table 1. Also in- 
cluded in the table, for comparative purposes, 
are mean likableness ratings of traits common 
to the BSRI and to trait lists used in studies 
by Anderson (1968) and Bryson and Corey 
(1977). 

Several points will be noted. The traits 
labeled by Bem as masculine are generally 
perceived as desirable when applied to a man.
All the item means are greater than 5 on a
7-point scale, ranging from “not at all desirable”
(1) to “extremely desirable” (7). The
means for the same traits, while still on the 
desirable end of the continuum, are generally 
lower when the traits are applied to an adult, 
and lowest when they are applied to a woman. 
Note, however, that the standard deviations 
associated with the trait-desirability ratings 
for a woman are generally larger than those 
for a man or an adult, indicating less agree- 
ment among respondents when the traits are 
applied to a woman.

Overall, the mean ratings for traits desig- 
nated as feminine tend to be lower than those 
designated as masculine, even when the former
are applied to a woman. This is also noted
when contrasting the mean ratings of the 
“masculine” and “feminine” traits for an 
adult. More important, however, is the find- 
ing that some of the “feminine” items are per- 
ceived as relatively undesirable, or negative. 
See, for example, the means for the traits 
Shy, Gullible, and Childlike. Not only are 
these traits rated low in desirability when 
they are applied to a man or an adult, but 
their desirability is also low even when the 
referent is a woman. These findings cast doubt
on Bem’s (1974, 1975, 1977) statements
that the “masculine” and “feminine” traits
are positive, or socially desirable. In fact,
Bem indicates that she included 20 “neutral” 
traits, 10 positive and 10 negative ones, as a 
means of checking whether the responses to 
the “masculine” and “feminine” traits are af- 
fected by a social desirability response set. 
The negativity of some of the “feminine”
traits is underscored when their mean ratings
are compared with the mean ratings of “neutral”
traits designated by Bem as being negative.
For example, the mean ratings of the
“feminine” traits Gullible and Childlike are
lower than the mean ratings of the “neutral”
traits Theatrical, Unpredictable, Jealous, and
Secretive.

About half of the traits on the Masculinity 
and Femininity subscales of the BSRI appear 
also on the list of traits rated for likableness 
in a study by Anderson (1968), who says: 
“Sex-linked words were . . . deliberately
omitted from the list” (p. 277). The overlap 
between the two lists is actually greater when 
one considers not only identical words but 
also close synonyms. Rosnow, Wainer, and 
Arms (1969) indicate that certain of the 
traits used by Anderson are sex linked, and 
others are not. Among the latter are Loyal 
and Shy, which Bem designated as feminine 
traits, and Ambitious, which according to 
Bem is a masculine trait. The situation be- 
comes more complicated as one considers ad- 
ditional studies of trait ratings. For example, 
Bryson and Corey (1977) asked subjects to 
indicate, in addition to ratings of likability, 
the sex-relatedness of a set of traits. They 
found that for some traits designated as masculine
or feminine there were neither substantively
meaningful nor statistically significant
differences between the mean ratings of
their likability for males and for females. Examples
of such traits are Conceited, which
was classified as masculine, and Gullible,
which was classified as feminine. Using Bem’s
criterion for trait selection, both traits would
have to be classified as neutral. Parker (1969)
provides a list of feminine traits, which includes
the following: Helpful, Moody, Conscientious,
Sincere, Friendly, and Conventional.
These are among the traits classified
as neutral by Bem. Applying Bem’s criterion
for the designation of traits as sex related to
the results of the present study, one would
have to conclude that 16 of the traits classified
as neutral on the BSRI are actually sex related
because the differences in the mean ratings
of their desirability for a man and for a
woman are statistically significant. If one
were to add a criterion of meaningfulness,
some of the traits would still have to be classified
as sex linked. For example, using both
criteria, the traits Helpful, Theatrical, Unpredictable,
Sincere, Happy, Friendly, and Tactful
would have to be classified as feminine,
whereas on the BSRI they are treated as
neutral.


Table 1
Mean Desirability Ratings and Standard Deviations for the 60 BSRI Traits

                                                                             Anderson  Bryson and Corey (1977)
Trait                                 Man(a)       Woman(b)     Adult(c)     (1968)    Male    Female  General
1. Self-reliant                       6.40 (.82)   4.45 (1.56)  6.06 (1.03)  4.62
2. Defends own beliefs                5.81 (1.12)  4.43 (1.53)  5.46 (1.29)  --
3. Independent                        6.31 (1.00)  4.19 (1.70)  5.91 (1.16)  4.55      7.06    7.12    7.42
4. Athletic                           [?] (1.27)   3.54 (1.47)  4.72 (1.31)  --        6.69    3.93    6.67
5. Assertive                          5.88 (1.13)  3.71 (1.63)  5.13 (1.32)  --
6. Strong personality                 5.91 (1.08)  4.08 (1.69)  5.32 (1.26)  --
7. Forceful                           5.38 (1.43)  [?] (1.53)   4.47 (1.48)  --
8. Analytical                         5.09 (1.29)  3.68 (1.50)  4.63 (1.34)  --
9. Has leadership abilities           6.04 (1.04)  3.95 (1.64)  5.58 (1.08)  --
10. Willing to take risks             5.42 (1.15)  4.06 (1.53)  4.93 (1.24)  --        5.14    5.58    5.21
11. Makes decisions easily            5.47 (1.26)  4.29 (1.39)  5.13 (1.33)  --        6.83    6.64    6.77
12. Self-sufficient                   6.22 (.95)   4.41 (1.63)  6.00 (1.03)  4.12
13. Dominant                          5.11 (1.65)  2.48 (1.52)  3.75 (1.65)  1.53
14. Masculine                         6.37 (1.02)  1.51 (1.05)  5.12 (1.58)  --        6.96    1.83    6.16
15. Willing to take a stand           5.80 (.98)   4.17 (1.60)  5.46 (1.12)  --
16. Aggressive                        5.45 (1.32)  2.97 (1.61)  4.48 ([?])   3.04      6.08    5.17    5.56
17. Acts as a leader                  [?] ([?])    [?] ([?])    [?] ([?])    --
18. Individualistic                   [?] ([?])    [?] ([?])    [?] ([?])    4.67
19. Competitive                       [?] (1.27)   [?] (1.54)   [?] (1.32)   --
20. Ambitious                         6.08 (1.00)  4.03 (1.65)  [?] (1.11)   4.84      7.26    6.92    6.98
21. Yielding                          [?] (1.46)   [?] (1.49)   [?] (1.33)   [?]
22. Cheerful                          4.85 (1.28)  [?] (1.01)   [?] (1.30)   5.04
23. Shy                               1.80 (1.03)  3.22 (1.44)  1.98 (1.01)  2.91      3.76    4.11    4.49
24. Affectionate                      4.63 (1.39)  6.04 (.93)   4.83 (1.29)  [?]       6.94    7.19    7.52
25. Flatterable                       3.40 (1.67)  4.63 (1.58)  3.32 (1.58)  --
26. Loyal                             5.46 (1.27)  5.95 (1.41)  5.63 (1.27)  5.47
27. Feminine                          1.39 (.96)   6.17 (1.18)  4.28 (1.76)  --        1.22    6.78    5.38
28. Sympathetic                       4.48 (1.32)  5.77 (1.06)  4.99 (1.27)  4.59      6.42    6.72    6.67
29. Sensitive to the needs of others  4.78 (1.39)  6.03 (.94)   5.45 (1.29)  --        6.42    6.97    6.72
30. Understanding                     4.94 (1.34)  6.08 (.88)   5.48 (1.18)  5.49
31. Compassionate                     4.54 (1.42)  5.99 (.99)   5.15 (1.29)  --
32. Eager to soothe hurt feelings     3.96 (1.46)  5.58 (1.27)  4.43 (1.41)  --
33. Soft-spoken                       2.85 (1.44)  4.80 (1.62)  3.31 (1.43)  3.80
34. Warm                              4.63 (1.42)  5.98 (1.02)  5.04 (1.27)  5.22      7.04    7.66    7.75
35. Tender                            4.13 (1.52)  6.03 (1.02)  4.57 (1.43)  4.56      6.18    6.65    6.74
36. Gullible                          1.53 (1.04)  2.50 (1.57)  1.56 (.93)   2.19      2.69    2.56    3.29
37. Childlike                         1.59 (1.00)  2.79 (1.63)  1.80 (1.14)  --
38. Does not use harsh language       3.42 (1.49)  4.63 (1.89)  3.90 (1.60)  --
39. Loves children                    4.69 (1.30)  5.98 (1.22)  5.01 (1.40)  [?]
40. Gentle                            4.17 (1.55)  6.00 (1.08)  4.66 (1.47)  5.03      6.50    6.90    7.33
41. Helpful                           4.97 (1.26)  5.75 (1.01)  5.30 (1.22)  4.92
42. Moody                             1.70 ([?])   1.83 (1.25)  1.61 (.95)   [?]
43. Conscientious                     5.79 (1.05)  5.34 (1.18)  5.88 (1.15)  4.81
44. Theatrical                        2.65 ([?])   3.46 (1.51)  2.82 (1.25)  [?]
45. Happy                             5.15 (1.31)  5.81 (1.07)  5.60 ([?])   [?]
46. Unpredictable                     2.18 (1.39)  2.87 (1.62)  [?] (1.32)   [?]
47. Reliable                          5.95 (.99)   5.69 (1.05)  6.04 (.94)   [?]
48. Jealous                           2.73 (1.45)  2.92 (1.66)  2.03 ([?])   1.04      3.01    2.24    [?]
49. Truthful                          5.32 (1.37)  5.67 (1.17)  [?] ([?])    [?]
50. Secretive                         2.71 (1.50)  3.04 (1.62)  [?] (1.46)   --
51. Sincere                           5.10 ([?])   5.88 ([?])   [?] ([?])    [?]
52. Conceited                         [?] ([?])    [?] (1.33)   [?] (1.27)   [?]       1.81    1.78    1.69
53. Likable                           5.19 (1.27)  5.74 (1.08)  5.46 (1.28)  [?]
54. Solemn                            3.08 (1.38)  2.74 (1.49)  2.80 (1.40)  2.89
55. Friendly                          5.16 (1.19)  5.88 (.97)   5.52 (1.13)  [?]
56. Inefficient                       1.30 ([?])   1.83 (1.29)  1.26 (.77)   [?]
57. Adaptable                         5.21 (1.21)  5.16 (1.21)  5.58 (1.18)  --
58. Unsystematic                      1.85 (1.17)  2.30 (1.38)  1.79 (1.16)  --
59. Tactful                           4.87 (1.19)  5.50 (1.16)  5.23 (1.21)  4.94
60. Conventional                      3.94 (1.50)  4.34 (1.60)  4.13 (1.56)  2.60

Note. N = 1,464. BSRI = Bem Sex Role Inventory. Standard deviations appear in parentheses. To facilitate
reading and interpretation, traits have been grouped by scale rather than as they appear on the BSRI.
Thus, Traits 1-20 are considered by Bem to be masculine; Traits 21-40, feminine; and Traits 41-60, social
desirability. Traits on the BSRI and Anderson were rated on a 7-point scale, whereas traits on Bryson and
Corey were rated on a 10-point scale. Since the BSRI scale ranges from 1 to 7, whereas the Anderson scale
ranges from 0 to 6, for purposes of comparability a constant of 1 was added to each mean on the Anderson
data.
(a) n = 493.
(b) n = 426.
(c) n = 545.

The examples above highlight some of the
difficulties that may be encountered when
trait selection is based solely on the criterion 
of statistical tests of significance. The power 
of the statistical test depends, among other 
things, on the size of the groups used. Con- 
sequently, given sufficiently large groups, any 
difference between their means may reach a 
prespecified level of statistical significance. 
The contradictory findings reported above
may also serve to underscore the dangers in 
drawing conclusions on the basis of results 
obtained from “samples” of convenience. 
While this state of affairs is not unique to re- 
search in sex roles, it is unfortunate that some 
researchers in this area present their findings 
as “norms,” and that other researchers pro- 
ceed and interpret their own findings in rela- 
tion to such “norms.”
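
The dependence of statistical significance on group size can be shown with a small numerical sketch. The means, standard deviation, and group sizes below are hypothetical values chosen for illustration and are not taken from Table 1; the function names are likewise assumptions. The mean difference is held fixed at 0.2 scale points while the group size grows.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent groups, computed from summary statistics."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def standardized_difference(mean1, sd1, mean2, sd2):
    """Mean difference in pooled-SD units (equal group sizes assumed)."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean1 - mean2) / pooled_sd


# Hypothetical ratings: a 0.2-point difference on a 7-point scale, SD = 1.2.
effect = standardized_difference(5.2, 1.2, 5.0, 1.2)
results = {}
for n in (50, 500, 5000):
    results[n] = welch_t(5.2, 1.2, n, 5.0, 1.2, n)
    print(n, round(results[n], 2), round(effect, 2))
```

The t statistic grows with the square root of the group size, whereas the standardized difference stays the same, which is why a criterion of meaningfulness is needed in addition to a test of significance.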
Discriminant function analysis. It will be
recalled that Bem has performed about 400
univariate tests of significance in the trait-selection
phase of her study. In the present
study, the ratings of the desirability of the
60 BSRI traits for an adult, for a man, and
for a woman were subjected to a stepwise discriminant
function analysis. In addition, a
separate analysis was done of the ratings for
a man and for a woman, in order to assess
the type of results Bem might have obtained
had she performed a multivariate analysis.¹
Despite our expectation that the major contribution
to the separation among the groups
would be made by the two traits Masculine
and Feminine, no hierarchy for the inclusion
of traits in the analyses was preestablished.
The results of the analyses are summarized in
Table 2.


Table 2
Discriminant Analysis of Trait Desirability for a Man, for an Adult, and for a Woman

                                                                % of    % correct       Centroids
Group   Traits               Function  R²    Λ     χ²       df  trace   classification  1       2
Man     60                   1         .817  .183  2434.47  61  89.35   88.9            [?]     [?]
Adult                        2         .348  .652  612.13   59  10.65                   [?]     [?]
Woman                                                                                   -2.99   .48

Man     Masculine, Feminine  1         .803  .197  2372.66  3   95.40   87.0            [?]     [?]
Adult                        2         .164  .836  261.23   1   4.60                    .16     .57
Woman                                                                                   -2.83   -.30

Man     60                   1         .892  .108  1974.13  60  100.00  98.3            2.66
Woman                                                                                   -3.08

Man     Masculine, Feminine  1         .883  .117  1965.35  2   100.00  97.7            -2.55
Woman                                                                                   2.95

Note. R² = squared canonical correlation. All the chi-squares are significant beyond the .001 level. ns = 493
(man), 545 (adult), and 426 (woman).


It will be noted that while both functions
are statistically significant, the first one accounts
for the bulk of the differences among
the three groups, namely 89% of the trace.
From the centroids associated with the first
function, it is evident the three groups are
separated about evenly on this dimension,
with the adult group being in the middle. The
second function, on the other hand, reflects
primarily differences between trait desirability
for a man and for a woman, as contrasted
with the adult group.

Among indices used to study the relative
contribution of each of a set of variables to
a discriminant function are the standardized
coefficients and the structure coefficients, the
latter being the correlations of each variable
with the discriminant function (see, for example,
Cooley & Lohnes, 1971). Although we
prefer the use of structure coefficients, both
types of indices will be discussed. Because of
space limitations the entire functions and the
structure coefficients associated with them
will not be reported. Instead, it will be noted
that while the standardized coefficients associated
with the 2 traits Feminine and Masculine
in the first discriminant function are -.60
and .73, respectively, the absolute values of
the coefficients associated with 44 other traits
are less than .05, the remaining 14 traits having
coefficients whose absolute values range
from .05 to .12. The structure coefficients associated
with the traits Feminine and Masculine
are -.642 and .731, respectively, while
those associated with the remaining traits are
small except for 9 which are around .30.

¹ Each of the groups was composed of male and
female respondents. Except for a very small number
for whom sex identification is missing, the breakdown
was as follows: 196 males and 343 females
rated the desirability of the traits for an adult; 147
males and 317 females rated the traits for a man; 142
males and 266 females rated the traits for a woman.
Discriminant analyses between males and females for
each of the conditions were performed in order to
determine whether there were meaningful differences
in ratings due to the sex of the respondents. In the
three analyses the differences between male and female
raters were small, warranting pooling the data
across sex. One interesting exception will be noted.
There was a meaningful difference between male and
female ratings of the desirability of the trait feminine
for an adult, the mean rating by males being 3.69,
and that by females 4.63. This result may be contrasted
with the mean ratings of the two groups of
the trait masculine for an adult: 5.20 and 5.09 by
males and females, respectively.

The first trait to enter in the stepwise 
analysis was Masculine. This was followed 
by the trait Feminine. The remaining 58 
traits add very little to the discrimination 
among the groups. Specifically, the overall Λ
(lambda) associated with the 60 traits is
.119, whereas that associated with the 2 traits
Masculine and Feminine is .165. Also, the
functions based on the 60 traits provide 89% 
of correct classification into the three groups, 
as compared with 87% correct classifications 
provided by the functions based on the traits 
Masculine and Feminine only. Looking back 
at the mean ratings of desirability (see Table 
1), it is noted that the differences between 
the ratings of Masculine and Feminine for a 
man and for a woman far exceed the mean 
differences of the ratings of the remaining 
traits. Specifically, the mean ratings of Femi- 
nine for a woman and for a man are 6.17 and 
1.39, respectively, and those of Masculine for 
a man and for a woman are 6.37 and 1.51,
respectively.

The results of the present study are even 
clearer when one considers the discriminant 
analysis based on the ratings of desirability 
for a man and for a woman only—the two 
conditions used by Bem. The Λ associated
with the 60 traits is .108, while that associ- 
ated with the traits Masculine and Feminine 
is .117. The use of the function based on the 
60 traits provides 98.3% correct classification 
in the two groups, as compared with 97.7% 
accomplished by using the function based on 
the traits Masculine and Feminine only.
Clearly, it is these two traits that carry the 
major burden of the discrimination between 
the two groups. 


Factor analysis. As was noted earlier, the 
selection of traits for the BSRI and their classification
into one of three categories (masculine,
feminine, and neutral) was accomplished
on the basis of tests of significance of
the differences between the mean ratings of 
the desirability of each trait when applied to 
a man and when applied to a woman. Even if 
one were to accept this procedure, it is im- 
portant to bear in mind that mean differences 
reveal nothing about the dimensions that un- 
derlie the relations among the ratings of the 
traits. The justification for treating a set of 
traits as a separate summated rating scale 
rests on the assumption that it is unidimen- 
sional. To the best of our knowledge, no at- 
tempt has been made by Bem, or by other re- 
searchers, to study the dimensions that under- 
lie the ratings of desirability of the traits 
included in the BSRI. 

In order to study the dimensionality of the 
BSRI, three separate factor analyses of the 
intercorrelations among the desirability rat- 
ings of the 60 traits when applied to a man, 
to a woman, or to an adult were performed. 
Squared multiple correlations were used as 
initial estimates of the communalities. Both 
varimax orthogonal and oblique (delta = 0) 
rotations were used. The two solutions were 
very similar, and the correlations among the 
factors in the oblique solutions were close to
zero. Consequently, only the orthogonal solu- 
tions are reported. 
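
The use of squared multiple correlations (SMCs) as initial communality estimates can be sketched as follows. The SMC of each variable with all the others equals 1 − 1/r(ii), where r(ii) is the corresponding diagonal element of the inverse of the correlation matrix. The three-variable correlation matrix and the helper names below are hypothetical and serve only to illustrate the calculation; they are not the BSRI data.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def squared_multiple_correlations(corr):
    """SMC of each variable with all the others: 1 - 1 / diag(inverse of R)."""
    inverse = invert(corr)
    return [1 - 1 / inverse[i][i] for i in range(len(corr))]


# Hypothetical correlation matrix: three traits that all correlate .50.
corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
smc = squared_multiple_correlations(corr)
print([round(value, 3) for value in smc])
```

In a common factor analysis these values would replace the 1s on the diagonal of the correlation matrix before factors are extracted.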

The rotated factor structures for the three 
referents are reported in Table 3. Three interpretable
factors were retained in each analysis.
Attempts to extract additional factors resulted
in excessive fragmentation. It should
be noted, however, that the three factor solutions
account for about 80%, 81%, and 72%
of the common factor variance for a man, a
woman, and an adult, respectively.

Using a factor loading > .40 as criterion
for meaningfulness, it will be noted that Factor
1 for the man and the adult referents and
Factor 2 for the woman referent are very similar.
The same 13 “feminine” traits and [?]
“neutral” ones have meaningful loadings on
this factor. Among the traits with the highest
loadings in the three solutions are the following:
Compassionate, Sensitive to the needs of
others, Understanding, Sympathetic, Warm,
Tender, Gentle, Sincere, and Friendly. All
express concern with, and positive affect toward,
others. Consequently, the factor is
named Interpersonal Sensitivity.

² Separate analyses were first performed for males
and for females in each condition. Since the results
were virtually the same, pooled analyses across sex
within each condition are reported.

As will be noted from Table 3, several
traits have meaningful loadings in only one
of the solutions, the most notable being the
trait Feminine, with a meaningful loading
only when the referent is a woman. This is
not surprising when one considers that all the
items with high loadings on this factor have
been rated relatively high in desirability for
the three referents, except for the trait Feminine,
which was rated high in desirability only
when applied to a woman. Considering that
the trait Feminine does not have a meaningful
loading on the analogous factors associated
with a man and an adult, and that it has the
lowest of all the meaningful loadings on Factor
2 for a woman, it would be inappropriate to
treat the traits that have high loadings on this
factor as “feminine” traits, or to name the
factor Femininity.

The second set of factors with similar patterns
of loadings is composed of Factor 2 for
a man and an adult, and Factor 1 for a
woman. All the “masculine” traits have meaningful
loadings for a man. Seventeen of these
traits have meaningful loadings for an adult.
Among the three traits that have no meaningful
loadings on this factor is the trait Masculine,
which is also the only trait that has no
meaningful loading on Factor 1 for a woman.
This latter exception is particularly notable
considering that the meaningful loadings for
a woman on Factor 1 are generally much
larger than the loadings of the same traits on
the respective factors for a man and for an
adult. Moreover, while the loadings for the
remaining 40 traits are low on Factor 2 for a
man and for an adult, there is a trend toward
bipolarity in Factor 1 when the referent is a
woman. Note, for example, the negative loadings
on this factor for the traits Yielding,
Flatterable, Soft-spoken, Childlike, and Conventional.
The relative strength of this factor in
the three solutions can be discerned from its
eigenvalue, which is 5.77 for an adult, 8.12
for a man, and 13.43 for a woman.

The traits with the highest loadings on this 
factor for the three solutions are the follow- 
ing: Assertive, Forceful, Has leadership abil- 
ities, Acts as leader, Ambitious, Strong person- 
ality. The factor may be named Assertiveness 
or Instrumentality. Note, again, that the trait 
Masculine has a meaningful loading only 
when the referent is a man. Consequently, it
would be a misnomer to name this factor 
Masculinity. (See Johnson, 1975, for a discus- 
sion that questions the validity of equating 
dependence and expressiveness with feminin- 
ity, and equating independence, instrumen- 
tality, and assertiveness with masculinity.) 

The third set of factors with similar pat- 
terns of factor loadings is composed of Factor 
3 in the three solutions. Two of the “femi- 
nine” traits, Gullible and Childlike, have 
meaningful loadings in the three solutions. 
In addition, the trait Shy has a meaningful 
loading for a woman, and Flatterable for a 
man. Of the 10 “neutral” traits that are nega- 
tive, 8 have meaningful loadings on this fac-
tor in the three solutions. Among traits with
the highest loadings are Jealous, Secretive, 
Conceited, Gullible, and Childlike, all of 
which are probably best captured by the 
label Immaturity. 

In sum, there is no evidence that the traits 
on the BSRI comprise three subsets of Mas-
culinity, Femininity, and Social Desirability.
While most of the “masculine” traits comprise 
a single factor, it appears that it would be 
most appropriate to refer to it as Instrumen- 
tality or Assertiveness. Moreover, it should 
be noted that this factor reflects a relatively 
narrow domain, as it is composed of some 
traits that are almost synonymous and others 
that are very close in meaning (e.g., Has
leadership abilities and Acts as a leader; Self- 
reliant, Self-sufficient, and Independent). 

The remaining two factors are each com- 
posed of a mixture of “feminine” and “neu- 
tral” traits. The distinction between the two 
factors is that one is composed of traits con- 
sidered generally desirable for the three refer- 
ents, whereas the other consists of traits con- 
sidered relatively undesirable for the three

Table 3
Trait Desirability for an Adult, for a Man, and for a Woman,
Four-Factor Solution, Orthogonal Rotation

                      Man factors     Woman factors     Adult factors
Trait^a               1    2    3     1    2    3       1    2    3
1. Self-reliant 084 541 —.111 74l —.003 —.151 208 427 — 26 
2. Defends own beliefs 348 «400 -.018 750 O44 —138 i 347 301 —.094 
3. Independent 057 «S75 —.059 «833 —.068 —.163 153 504 — 1066 
4. Athletic —144 530 .209 501 083 —.052 084 334 19 
5. Assertive —.083  .667 —.107 847 —.090 —.060 —.073 628 —,042 
6. Strong personality 018 688 —.014 :795 —.004 —.048 .136 587 082 
7. Forceful —.258 658 082 783 —.098 .007 098 606 085 
8. Analytical 148 402 060 .737 071 .080 251 406 028 
9. Has leadership abilities  .069 699 —.012 871 —.059 —.100 062 .630 —,092 
10. Willing to take risks 415 503 070 += -710 —.006 —.044 1173 406 136 
11. Makes decisions easily —.012 579 .070 .634 105 —.129 .128 454 —.031 
12. Self-sufficient i3 oio —.069 835 030 —.146 = 229 539 248 
13. Dominant —.382 soL 169 666 —.204 188. —.273 54l, 20l 
14. Masculine —109 540 049 341 245 100 —.046 326 030 
15. Willing to take a stand .261 -666 056 .837 —.026 —.082 .295 .542 —.058 
16. Aggressive a 086 S0, 717 —.153 1154. —.266 552 10 
17. Acts as a leader —.060 .728 .031 817 —.063 —.040 .065 633. —.008 
18. Individualistic 231 458 046 779.011 —.139 221 434 00) 
19, Competitive —228 675 119 781 —.085 .045 —.184 564 134 
20. Ambitious 006 648 —.021 834 —.038 —.075 036 608 _—058 
24. Yielding AML -176 163 =.369 260 272 231 —.099 2% 
22. Cheerful 602 —.012 —.001 —.103 661 072 579 085 027 
23. Shy 260 —.193 404 —.309 .246 439 274 —.158 30l 
24. Affectionate 733 —.083 064 —.110 .632 1120 642 —.000 _.068 
25. Flatterable 187 140 405.367 330 378 116.023 —«346 
26. Loyal 514.232 —.163 —.124 630.013 532 .097 —.096 
y ered q 24 -24 177-308 479 1065 203 020 os 
; etic 764 —.083 u = = 0h 
= See pened 021 134 658 016 724 006 
of others 846 —.118 —.001 092.700 032 —.1 
Understanding Meant. nse dy A ET ee ‘026 —.047 
a pekan. 816 098 006 —.088 722 —.072 i814 010 —.068 
hurt feelings 738 —.007 227 
: : £ —.254 038 209 
33. Solt-apoken E tog E aa aea ai “tos — 007 CA 
35. Tende 773 —113 093 —.128 701 0o44 791 010 A0 
Se Tendik 768 —.102 136 —.161 692 153 744 —.016 JI 
6. Gullible 012 —.205 564 -340 044 614 —.019 —.142 390 
37, Childlike 160-177 468 «301 | $ o 3 455 
38. Does not use $ ‘ E 509 049r 10S 
harsh language .397 016 = 083 
39. Loves children (as 009 “149 Zan E tos 164 
40. Gent! F : p } k i í 
41. Helpful AET E 753 00 te 
42. Moody ye ee -J 604.077 632 043 —.08) 
43. Conscientious 364377 ee 2 028 351 
a. Thats tt, Ge eT o ue 290-4 
45. Happy Minpes e eS 090 —.0ll 
iy Rowdee 023.005 4564 = 009 —.009 554 026 081 S18 
47. Reliable 419 349 —.203 319, : i 297 —.221 
e “12 3G se -a im es im o9 M 
¥ f 5 037 —.142 253 Pop ; 19 —-4 
50. Secretive SO 197 812 HON te Seok 128 1s S2 


51. Sincere 798 020 —.092 136 Re 
= b : F IIE ie 773 -039 
52. Conceited Sassi irgo aor | 067 i083 1 870- — 2481761 3S 


Table 3 (continued)

                      Man factors               Woman factors             Adult factors
Trait^a               1        2        3       1        2        3       1        2        3
53. Likable          .525     .230    -.035   -.075     .644     .066    .436     .138    -.013
54. Solemn           .060     .190     .405    .174     .036     .338    .180     .123     .419
55. Friendly         .651     .168    -.025    .113     .655     .054    .676     .055    -.027
56. Inefficient     -.015    -.093     .532   -.234    -.109     .606   -.113    -.153    [illegible]
57. Adaptable        .453     .148    -.079    .201     .346    -.112    .319     .250    -.113
58. Unsystematic     .006    -.062     .453   -.201    -.033     .521    .042    -.047     .430
59. Tactful          .618  [illegible] .065    .104  [illegible] -.111   .524  [illegible] .002
60. Conventional    -.113     .149     .239   -.413     .207     .308   -.037  [illegible] [illegible]
Eigenvalue         11.699    8.116    3.757  13.431    9.592    4.257  10.377    5.772    3.960

Note. Factor loadings >.40 are in boldface. ns = 545 (adult), 493 (man), and 426 (woman).
^a To facilitate reading and interpretation, traits have been grouped by scale rather than presented as they
appear on the Bem Sex Role Inventory. Thus, Traits 1-20 are considered by Bem to be masculine; Traits
21-40, feminine; and Traits 41-60, neutral.


referents. The first was named Interpersonal 
Sensitivity, and the second, Immaturity. 

Basically, the same factor structure is ap-
parent in the three solutions, indicating that
the crucial aspect in the rating of desirability
of the traits is not the referent to which they
are applied, but the respondents’ conceptions
of which traits are part of the same domain. A
plausible explanation of the results could be
made within the context of implicit person-
ality theory (Schneider, 1973). The specific
theory and the specific names of the factors
notwithstanding, the results of this study in-
dicate that it is inappropriate to treat the 20
“feminine” traits and the 20 “neutral” ones as
two separate unidimensional sets.

Finally, it is obvious that one cannot ob- 
tain from a factor analysis more than one 
puts into it. Bearing in mind the approach in 
the selection of traits for the BSRI, it is not 
surprising that various dimensions and traits 
considered by other researchers as aspects of 
the domains of masculinity or femininity do 
not appear at all on the BSRI (see, for ex-
ample, Bryson & Corey, 1977; Parker, 1969).


Study 2: Self-Ratings


This study was designed to investigate the
pattern of responses to the BSRI traits, and
the dimensions underlying the relations among
them, when the instrument is used for self-
rating. It should be noted at the outset that
since the BSRI is a self-report inventory, it is
subject to the limitations and deficiencies as- 
sociated with such inventories, particularly 
those designed to measure self-concept (see, 
for example, Edwards, 1970; Messick, 1963; 
Wiggins, 1973; Wylie, 1968, 1974). 

Only the 20 “masculine” and the 20 “femi- 
nine” traits were used in this study, because 
that is the form most often used in research 
related to sex roles, and because the dubious 
nature of the “neutral” traits was amply dem- 
onstrated in Study 1. A total of 571 graduate 
students of education attending one of two 
universities in New York City responded to 
the BSRI using Bem's directions. 


Descriptive Statistics 


Means and standard deviations for the 40
items for males and females are reported in
Table 4. The most important thing to be
noted is the general similarity of the mean
ratings for males and for females. Except for
the very large differences in the mean ratings
of the traits Masculine (5.89 and 2.02 for
males and females, respectively) and Feminine
(5.85 and 2.09 for females and males, re-
spectively), and the difference of 1.08 between
the mean ratings of the trait Athletic, all
other differences between the mean ratings of
males and females are small, ranging (in ab-
solute value) from .01 to .47, with a median
of .16. The similarity of the ratings can also
be seen when one compares the overall mean
ratings for males and females on Masculinity
and Femininity, in which the traits Masculine
and Feminine are included, and mean ratings
in which those two traits are excluded. When
the trait Masculine is included, the overall
mean ratings of Masculinity for males and
females, respectively, are 5.30 and 4.95, a
difference of .35. When the trait Masculine
is excluded, the respective means for males
and females are 5.27 and 5.09, a difference of
.18, or about half of the original difference.
The overall mean ratings of Femininity, with
the trait Feminine included, are 5.21 and 4.89
for females and males, respectively, the dif-
ference being .32. When the trait Feminine is
excluded, the means for females and males
are 5.17 and 5.03, respectively, a difference of
.14, or less than half of the original difference.

Table 4
Mean Self-Ratings and Standard Deviations for the 20 Masculine Traits
and 20 Feminine Traits of the BSRI for Males and for Females

                                        Males           Females
Trait                                 M      SD       M      SD
 1. Self-reliant                     5.82    .94     5.96    .92
 2. Defends own beliefs              5.63   1.11     5.79   1.04
 3. Independent                      [illegible]     5.87    .95
 4. Athletic                         4.66   1.83     3.58   1.75
 5. Assertive                        4.93   1.14     5.04   1.18
 6. Strong personality               5.39   1.19     5.40   1.19
 7. Forceful                         4.84   1.23     4.85   1.26
 8. Analytical                       5.61   1.02     5.29   1.29
 9. Has leadership abilities         5.49   1.13     [illegible]
10. Willing to take risks            4.99   1.22     4.84   1.29
11. Makes decisions easily           4.89   1.24     4.49   1.40
12. Self-sufficient                  5.71    .89     5.77    .99
13. Dominant                         4.76   1.15     4.57   1.36
14. Masculine                        5.89   1.18     2.02   1.23
15. Willing to take a stand          5.60   1.02     5.46   1.11
16. Aggressive                       4.58   1.16     4.30   1.39
17. Acts as a leader                 5.06   1.17     4.74   1.33
18. Individualistic                  5.56   1.05     5.47   1.14
19. Competitive                      5.14   1.28     4.69   1.51
20. Ambitious                        5.60   1.19     5.57   1.28
21. Yielding                         4.44   1.10     4.29   1.16
22. Cheerful                         5.47    .93     5.60    .98
23. Shy                              3.75   1.41     3.69   1.34
24. Affectionate                     5.42   1.09     5.74   1.07
25. Flatterable                      4.46   1.41     4.32   1.41
26. Loyal                            6.18    .86     6.37    .79
27. Feminine                         2.09   1.25     5.85   1.06
28. Sympathetic                      5.73    .93     5.94    .96
29. Sensitive to needs of others     5.92    .89     6.09    .86
30. Understanding                    5.85    .76     5.98    .81
31. Compassionate                    5.77    .96     5.91    .91
32. Eager to soothe hurt feelings    5.50   1.11     5.65   1.08
33. Soft-spoken                      4.71   1.36     4.41   1.56
34. Warm                             5.53    .95     5.78    .92
35. Tender                           5.17   1.09     5.62   1.04
36. Gullible                         3.15   1.35     3.62   1.55
37. Childlike                        2.98   1.29     3.19   1.43
38. Does not use harsh language      4.43   1.66     4.44   1.81
39. Loves children                   5.71   1.20     5.96   1.12
40. Gentle                           5.51    .97     5.71   1.08

Note. ns = 171 (males) and 400 (females). BSRI
= Bem Sex Role Inventory. Traits 1-20 are con-
sidered by Bem to be masculine and Traits 21-40,
feminine.
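
The overall means quoted above can be checked with simple arithmetic: if a 20-item scale has mean m and one item has mean x, the mean of the remaining 19 items is (20m - x)/19. The sketch below is only an illustration; the function name is hypothetical, and the inputs are the rounded values given in the text and in Table 4, so the results can be expected to agree with the reported means only to within rounding (about .02).

```python
def mean_without_item(scale_mean, item_mean, n_items=20):
    """Mean of the remaining items after one item is dropped from a scale.

    Assumes the scale mean is the simple average of n_items item means.
    """
    return (scale_mean * n_items - item_mean) / (n_items - 1)


# Scale means (item included) and item means, as quoted in the text and Table 4.
masc_males = mean_without_item(5.30, 5.89)    # Masculinity, males, Masculine dropped
masc_females = mean_without_item(4.95, 2.02)  # Masculinity, females, Masculine dropped
fem_females = mean_without_item(5.21, 5.85)   # Femininity, females, Feminine dropped
fem_males = mean_without_item(4.89, 2.09)     # Femininity, males, Feminine dropped

masc_gap = masc_males - masc_females
fem_gap = fem_females - fem_males

print(round(masc_males, 2), round(masc_females, 2), round(masc_gap, 2))
print(round(fem_females, 2), round(fem_males, 2), round(fem_gap, 2))
```

Dropping a single item roughly halves each sex difference, which is the point the comparison is making.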

From the overall mean ratings, it is clear 
that, on the average, both males and females 
indicate that “it is often true” (a rating of 5) 
that the traits in question describe them. This 
is even more pronounced when one excludes 
from consideration the traits Masculine and 
Feminine and the few traits that have been 
shown to be low in desirability (e.g., Gullible, 
Childlike). Since most of the traits have been 
rated toward the desirable end of the con- 
tinuum when applied to an adult (see Table 
1), it is conceivable that both males and fe- 
males respond to the BSRI in a socially de- 
sirable manner or, as Smith (1968) has put 
it, with a “bias toward thinking as well of 
oneself as one can get away with” (p. 368). 
The available data cannot shed further light 
on this issue. It will be noted, however, that 
in a study by Parker and Veldman (1969), 
the traits Cheerful, Loyal, Sympathetic, 
Warm, and Gentle are part of a Social De- 
sirability factor. Also, except for one or two 
traits, all the traits common to the BSRI and 
the Anderson (1968) list have been rated 
high on likability in several studies. 

Lest one think that the relatively high
mean ratings are unique to the present study,
it is noted that other researchers have ob-
tained similar results in the mean ratings of
Masculinity and Femininity (cf. Bem, 1974,
1977). Since neither Bem nor other research-
ers report mean ratings on individual traits,
more detailed comparisons are precluded.

Discriminant function analysis. Using over-
all mean ratings, Bem (1974) demonstrated
that males score significantly higher than fe- 
males on Masculinity, and that females score 
significantly higher than males on Femininity. 
If, as it appears, this was intended to lend 
support to the validity of the scales, several 
problems ensue, the most important being 
that Bem’s contention that a person may be 
both masculine and feminine seems incom- 
patible with the use of sex differences on Mas- 
culinity and Femininity as evidence of scale 
validity. This is particularly problematic when 
one uses groups that may be nonrepresenta-
tive of any population of males and females.

In any case, since Bem does not provide 
any information about mean ratings of indi- 
vidual traits, it is not possible to tell how 
much of the difference between males and fe- 
males was due to the differences in the self- 
ratings on the traits Masculine and Feminine. 
It was shown above that the ratings on these
two traits accounted for about half of the
mean differences between males and females
on Masculinity and on Femininity.

To further illustrate the relative potency
of the traits Masculine and Feminine and
that of the remaining 38 traits to discriminate
between males and females, the responses of
the two groups to the 40 traits were subjected
to a stepwise discriminant function analysis.
The findings are summarized in Table 5. Note
that the Λ associated with the 40 traits is
.212 and the Λ associated with the 2 traits
Masculine and Feminine only is .241. Stated
differently, the R² of the regression of group
membership (male-female coded as a dummy
variable) on the 40 traits is .788, while that
on only the 2 traits Masculine and Feminine
is .759. While 96.7% of the males and fe-
males are correctly classified on the basis of
the full function, 93.5% of them are cor-
rectly classified by using the function based
on the traits Masculine and Feminine. To
further highlight the findings, it will be noted
that the standardized coefficients associated
with the traits Feminine and Masculine in the
full function are —.70 and .55, respectively.
Of the remaining traits, 32 have coefficients
of <.10, and six have coefficients between .10
and .15. Probably even more revealing are
the structure coefficients, which for the traits
Feminine and Masculine are —.80 and .76,
respectively. Except for the trait Athletic,
whose structure coefficient is .14, and the trait
Tender, whose structure coefficient is —.10,
all the remaining traits have structure coef-
ficients of <.10, with most being close to zero.
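
The two ways of stating the result are linked by a simple identity: with two groups, Wilks's Λ equals 1 - R², where R² comes from regressing the dummy-coded group variable on the traits. The sketch below checks that identity against the reported figures and also computes Bartlett's chi-square approximation for Λ. Two assumptions are worth flagging: the function names are hypothetical, and the text does not say which test statistic appears in Table 5, so the comparison with the tabled values of 850.47 and 808.2 is only suggestive.

```python
import math


def r_squared_from_lambda(wilks_lambda):
    """For two groups, R^2 of the dummy-variable regression is 1 - Lambda."""
    return 1.0 - wilks_lambda


def bartlett_chi_square(wilks_lambda, n_cases, n_vars, n_groups=2):
    """Bartlett's approximation: -[N - 1 - (p + g)/2] * ln(Lambda), df = p(g - 1)."""
    multiplier = n_cases - 1 - (n_vars + n_groups) / 2
    chi2 = -multiplier * math.log(wilks_lambda)
    df = n_vars * (n_groups - 1)
    return chi2, df


n_cases = 171 + 400  # males + females in Study 2

r2_full = r_squared_from_lambda(0.212)  # all 40 traits
r2_two = r_squared_from_lambda(0.241)   # Masculine and Feminine only

chi2_full, df_full = bartlett_chi_square(0.212, n_cases, n_vars=40)
chi2_two, df_two = bartlett_chi_square(0.241, n_cases, n_vars=2)

print(round(r2_full, 3), round(r2_two, 3))
print(round(chi2_full, 1), df_full, round(chi2_two, 1), df_two)
```

Because Λ is reported to three decimals, the chi-square values recomputed this way can differ slightly from the tabled ones.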

In sum, all the indices strongly support 
the notion that the discrimination between 
males and females is almost exclusively due 
to their self-ratings on the two traits Mascu-
line and Feminine. Stated differently, knowl-
edge of respondents’ self-ratings on the re-
maining 38 traits adds little to the knowledge 
obtained from their ratings on Masculine and 
Feminine. Not surprisingly, males rate them- 
selves high on Masculine and low on Femi-
nine. The converse is true for females.

Table 5
Discriminant Analysis of Trait Self-Ratings for Males and for Females

                                                       % correct
Traits                    R²     Λ     df     χ²     classification   Centroids
40                       .788   .212   40   850.47       96.7         -1.26, [illegible]
Masculine and Feminine   .759   .241    2   808.2        93.5         -2.71, [illegible]

Note. ns = 171 (males) and 400 (females).

Factor analysis. It was stated earlier that
we have not found any published reports of
factor analyses of the desirability ratings of
the BSRI traits. We did find one published
study in which self-ratings on the BSRI were
factor analyzed (Gaudreau, 1977). This study
suffers from several methodological shortcom- 
ings. Not only were males and females used 
in a single analysis but also the subjects came 
from widely diverse groups (i.e., clerical work- 
ers, supervisors, middle managers, executives, 
police officers, and housewives). Warnings 
against inducing spurious correlations among 
variables by using heterogeneous groups have 
been sounded frequently throughout the brief 
history of social science research. Spearman 
(1904), for example, criticized researchers 
who “have purposely thrown together subjects 
of all sorts and ages, and thus have gone out 
of their way to invite fallacious elements into 
their work” (p. 223). Guilford (1952) elabo- 
rated on the dangers of using diverse groups 
in a single factor analysis.

Another flaw in Gaudreau’s (1977) analy- 
sis is that in addition to the individual traits, 
she included the scores on Masculinity, Femi- 
ninity, and Androgyny (i.e., the discrepancy 
between Masculinity and Femininity), thus 
introducing three linear and experimental de- 
pendencies that may lead to artifactors rather 
than factors (see, for example, Gorsuch, 1974, 
pp. 267-268; Guilford, 1952). Because of
these problems Gaudreau’s findings will not 
be discussed. 

In the present study, separate principal-
axes factor analyses were performed for the
male and female groups. Squared multiple cor-
relations were used as initial estimates of the
communalities, and varimax orthogonal and
oblique (delta = 0) rotations were performed. Since
the solutions were very similar, only the
orthogonal ones are reported.
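
The procedure just described can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the program used in the study: the function names are hypothetical, the communalities are iterated to convergence (the text says only that squared multiple correlations were the initial estimates), the data are synthetic with a known two-factor structure, and the oblique (delta = 0) rotation is not covered.

```python
import numpy as np


def smc(R):
    """Squared multiple correlation of each variable with all the others."""
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))


def principal_axis(R, n_factors, max_iter=200, tol=1e-6):
    """Principal-axis factoring with SMCs as initial communality estimates."""
    h2 = smc(R)
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)          # communalities replace the unit diagonal
        eigvals, eigvecs = np.linalg.eigh(reduced)
        order = np.argsort(eigvals)[::-1][:n_factors]
        lam = np.clip(eigvals[order], 0.0, None)
        loadings = eigvecs[:, order] * np.sqrt(lam)
        new_h2 = (loadings ** 2).sum(axis=1)
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    return loadings, h2


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation (SVD-based algorithm)."""
    p, k = loadings.shape
    T = np.eye(k)
    crit = 0.0
    for _ in range(max_iter):
        LR = loadings @ T
        col_ss = np.diag((LR ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ (LR ** 3 - LR @ col_ss / p))
        T = u @ vt
        new_crit = s.sum()
        if new_crit < crit * (1 + tol):
            break
        crit = new_crit
    return loadings @ T


# Synthetic data: six items, two uncorrelated factors, three items per factor.
rng = np.random.default_rng(0)
n_cases = 500
factors = rng.normal(size=(n_cases, 2))
pattern = np.array([[.8, 0], [.8, 0], [.8, 0], [0, .8], [0, .8], [0, .8]])
data = factors @ pattern.T + 0.6 * rng.normal(size=(n_cases, 6))

R = np.corrcoef(data, rowvar=False)
unrotated, communalities = principal_axis(R, n_factors=2)
rotated = varimax(unrotated)
print(np.round(rotated, 2))
```

An orthogonal rotation redistributes variance among the factors but leaves each item's communality unchanged, which is why the unrotated and rotated solutions account for the same common-factor variance.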

Four interpretable factors emerged from 
each analysis, accounting for about 83% and 
73% of the common-factor variance for fe-
males and males, respectively. Extraction of
a fifth factor for the males resulted in ac- 
counting for an additional 5% of the com- 
mon-factor variance, but in a less interpretable 
factor structure. The two orthogonally rotated 
four-factor solutions are reported in Table 6. 

Looking first at the factor structure for
females and treating loadings of >.40 as
meaningful, 17 traits load meaningfully on
Factor 1. All but one are what Bem classifies
as masculine traits, the exception being the
“feminine” trait Shy, which loads negatively
on this factor. The highest loadings are as-
sociated with the following traits: Assertive,
Forceful, Acts as leader, Has leadership abil-
ities, Strong personality, Dominant, and Ag-
gressive. It is clearly an assertiveness factor.
Factor 2 has 12 traits with meaningful load-
ings, all of which are classified as feminine by
Bem. The traits with the highest loadings are
Compassionate, Tender, Understanding, Sym-
pathetic, Sensitive to the needs of others, and
Gentle, all reflecting interpersonal sensitiv-
ity. Factor 3 is bipolar with three traits (Self-
reliant, Independent, and Self-sufficient) load-
ing positively, and two traits (Gullible and
Childlike) loading negatively. This cluster of
traits appears to reflect one’s feelings of self-
sufficiency; that is, individuals who perceive
themselves as being self-sufficient, or relying
on their own resources, quite naturally tend
also to perceive themselves as not being gul-
lible and childlike. Factor 4 is bipolar with
only two traits having meaningful loadings on
it: Feminine, with a positive loading, and
Masculine, with a negative loading. Bem’s
contention that Masculinity and Femininity
are orthogonal dimensions notwithstanding,
females who rate themselves high on Femi-
nine tend to rate themselves low on Mascu-
line, regardless of their self-ratings on the
remaining traits.

Compared with the females, the males tend
to make a greater and stronger differentiation
between assertiveness and self-sufficiency as
reflected by Factors 2 and 3, respectively.
Note that, as in the solution for females, the
trait Shy has a negative loading on the As-
sertiveness factor. Also, Loyal, which is pre-
sumably a feminine trait, has a meaningful
loading on the Self-sufficiency factor. Factor
1 for males is similar to Factor 2 for females,
except that the loadings in the former tend
to be higher than in the latter (eigenvalues
of 5.46 and 4.90, respectively). Eleven of the
“feminine” traits have meaningful loadings
on this factor, which clearly reflects the notion
of interpersonal sensitivity. As in the solution
for the females, Factor 4 is bipolar. But it
differs from the corresponding factor for fe-
males in that, in addition to the trait Mascu-
line loading positively and the trait Feminine

Table 6
Masculine and Feminine BSRI Traits for Females and for Males,
Four-Factor Solution, Orthogonal Rotation


                      Female factors        Male factors
Trait                 1    2    3    4      1    2    3    4
1, Self-reliant 360 135 492 007 ir 

2. Defends own beliefs .506 «6176 106 —.133 17) 91 a3 ‘as 
3. Independent 442 098 480 —.049 033 247 657 053 
4. Athletic 168 —.008 018 —.192 035 178 086 350 
5. Assertive 770 —.015 018 —.073 027 680 ASi ~on 
6. Strong personality 689 044 058 005 O64 632 46 0n 
7. Forceful 768 —.016 —.011 —.047 002 oss 377 AAS 
8. Analytical 363 006 .178 —.037 162 iT) ns œ 
9, Has leadership abilities 731 096 127 051 28708 382 149 
10. Willing to take risks 497 075 036 —.071 214 367 405 OD 
11, Makes decisions easily 404 145 .236 —.009 oss 30o aso à m 
12. Self-sufficient 468 109 460 ~=.119 00 187 5097 Mo 
13. Dominant 687 —.080 —.106 —.128 O0 68 2809 On 
14, Masculine 133 —.265 —.064 —.S07 130 101 059 416 
15. Willing to take a stand 615 m4 122 =.071 215 J15 (S88 AMS 
16. Aggressive 6m -151 —190 —177 101.607.227.048 
17. Acts as a leader 738 = «108 «086,075 23 627 36 sO 
18. Individualistic s9 n6 2 -I = 48 A OTS 
19. Competitive 506 —.091 —.081 233 219 398 078 0 
20. Ambitious 502 083 —.045 168 30 S9 -0 10 
21, Yielding —.262 340 —.093 .083 9 = 203 =010 ~ 0H 
22. Cheerful 192 423. «174007 A20 24 aBa 205 
23. Shy —.422 —.105 —.128 —.021 055 —400 —.102} —.007 
24, Affectionate 187 S45 —.195 146 46 280 = 065 ~.116 
25. Flatterable 13 6 339 183 276 05 =188 ~=.215 
26, Loyal 088 .396 081 —.008 149 126 å an å Mm 
27. Feminine yt e ae) i x “a a =e 
28. Sympathetic n o a it ‘ou 


29, Sensitive to the needs of others AIS 
ar 1071339129 


065 

702 

z 
30. Understanding A 
31, Compassionate 127.741, 087 —.107 ELU 
32, Hage to soothe hurt feelings —031 $41 —.206 —,018 SH -0N H 
33. Soft-spoken -27 W M 19 J3 -230 ON 
34. Warm i6 626 -077 219 ™ MS a on 
35. Tender 05 727 -05 W 735 å J9 -0 17 
36. Gullible —.140 198 —480 -08 ons =o z5 =» -= 
37. Childlike —016 025 —461 —,.120 os -0i ~ A 
38. Does not use harsh language -113 21 078 08 AS oe za ne 
39. Loves children -09 4% 03 O w -j pd 
40. Gentle wo o7 -00 189 r7 OW 

` 693 490 167 107 $46 Wo 1o LP 

Note. Factor loadings ≥.40 are in boldface. ns = 400 (females) and 171 (males). BSRI = Bem Sex Role
Inventory. To facilitate reading and interpretation, traits have been grouped by scale rather than as they
appear on the BSRI. Traits 1-20 are considered by Bem to be masculine, and Traits 21-40, feminine.


loading negatively, Gullible and Childlike
also have negative loadings. Males who rate
themselves high on Masculine tend to rate
themselves low on Feminine as well as Gulli-
ble and Childlike, suggesting that they perceive
these traits as related to their conception of
Feminine. It will be recalled that Gullible and
Childlike had negative loadings on the Self-
sufficiency factor in the solution for females.

In sum, the factor analyses of self-ratings
for males and females do not reflect the di-
mensions of masculinity and femininity pro-
posed by Bem. Not only did four factors 
emerge in each solution, but also, despite some 
similarities, they are sufficiently different from
each other to lead one to believe that the pat- 
terns of self-ratings for males and for females 
differ. The fact that the traits Masculine and 
Feminine describe a separate bipolar factor 
also casts doubt on the validity of the classi- 
fication of the remaining items as masculine 
and feminine. It will also be noted that some 
traits (e.g., Yielding, Soft-spoken) did not
have meaningful loadings in either of the
solutions.

Because some researchers seem to consider 
internal consistency reliability as evidence of 
the unidimensionality of summated rating 
scales, we present some reliability data on 
the Femininity scale to demonstrate the du- 
bious nature of this approach. In the present 
study, the alphas for the Femininity scale 
(consisting of the 20 “feminine” items) are 
.79 and .77 for females and males, respec- 
tively. Contrast these with the alphas of .87 
and .89 for females and males, respectively, 
on the 11 items with meaningful loadings on 
the Interpersonal Sensitivity factor (see Table 
6, Factor 2 for females and Factor 1 for 
males; the trait Feminine, which has a mean- 
ingful loading on this factor for females only, 
was excluded from this analysis for compara- 
tive purposes). 
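
The point that a respectable alpha does not establish unidimensionality can be illustrated with simulated data. The sketch below is hypothetical throughout (the function name, loadings, and sample size are assumptions, not values from the study): a 20-item scale is built from two uncorrelated factors of 10 items each, and coefficient alpha is computed for the full scale and for one 10-item cluster.

```python
import numpy as np


def cronbach_alpha(items):
    """Coefficient alpha for a cases-by-items array of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - sum_item_var / total_var)


# Simulated scale: two uncorrelated factors, 10 items each, loading .7.
rng = np.random.default_rng(0)
n_cases, loading = 2000, 0.7
noise_sd = np.sqrt(1.0 - loading ** 2)
f = rng.normal(size=(n_cases, 2))
cluster_a = loading * f[:, [0]] + noise_sd * rng.normal(size=(n_cases, 10))
cluster_b = loading * f[:, [1]] + noise_sd * rng.normal(size=(n_cases, 10))
full_scale = np.hstack([cluster_a, cluster_b])

alpha_full = cronbach_alpha(full_scale)  # 20 items, two dimensions
alpha_sub = cronbach_alpha(cluster_a)    # 10 items, one dimension
print(round(alpha_full, 2), round(alpha_sub, 2))
```

Under these assumptions the two-dimensional scale still reaches an alpha in the mid .80s, while the shorter unidimensional cluster does better, which parallels the contrast between the 20-item Femininity scale and the 11 Interpersonal Sensitivity items.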

It would be interesting to compare the fac-
tor structures obtained under the conditions 
of self-ratings with those obtained under the 
conditions of desirability ratings (Study 1). 
It should be noted, however, that the two 
studies used different subjects and that Study 
1 included also the “neutral” traits. Never- 
theless, on the basis of the differences in the 
nature of self-ratings and desirability ratings, 
and on the basis of the overall pattern of the 
two sets of factor structures, it seems plausi- 
ble to speculate that the dimensions that un- 
derlie trait ratings differ when the task is to 
rate oneself and when it is to rate the de-
sirability of the traits for an abstract referent.
Assuming that similar findings are obtained
in future research, one would have to ques-
tion the validity of arriving at subscales for
self-ratings on the basis of analysis of desira-
bility ratings.


Discussion 


Bem’s effort to construct measures of mas- 
culinity and femininity was destined to fail, 
as it was based solely on an empirical ap- 
proach in which trait selection was determined 
by a multitude of nonindependent univariate 
tests of significance. It is not surprising that 
some of the traits designated as sex related 
by Bem are considered neutral, and vice versa, 
by other researchers, who have resorted to dif- 
ferent empirical approaches in their search 
for sex-related traits. The absence of theo- 
retical definitions of the constructs precludes 
attempts to determine whether or not a given 
set of traits is representative of a given do- 
main. How can one assess the validity of a 
measure when the construct it is supposed 
to be measuring is undefined? The present 
discussion is necessarily limited to a considera- 
tion of the properties of the BSRI regardless 
of one’s definitions of sex roles. 

It will be recalled that Bem’s sole criterion 
for the designation of a trait as sex related 
was whether or not there was a significant 
difference in the ratings of its desirability for 
a man and for a woman. As was demonstrated 
in Study 1, this approach yielded 20 “mascu- 
line” traits which are all positive, or relatively 
high on desirability, and 20 “feminine” traits, 
some of which are negative even when the 
referent is a woman. This, of course, has im- 
plications for the total scores on Masculinity 
and Femininity, in view of people’s general 
tendency to attribute to themselves positive 
traits, and their reluctance to attribute to 
themselves negative traits, when they respond 
to a self-report instrument. This problem is 
particularly serious because the difference be- 
tween the scores on Masculinity and Feminin- 
ity is used to arrive at an Androgyny score. 

Before dealing with the issues of androgyny, 
however, it is important to note that no at- 
tempt was made by Bem, or other researchers 
using the BSRI, to study the dimensions that 
underlie the ratings of the desirability of the 
BSRI traits. It was demonstrated in Study 
1 that the factor structure of the ratings of
trait desirability is not consistent with the
BSRI trait classification. While most of the
“masculine” traits have meaningful loadings
on a single factor, reflecting a narrow domain 
of assertiveness, the “feminine” and the “neu- 
tral” traits are indistinguishable from each 
other, and break down into two factors: one 
consisting of positive traits reflecting inter- 
personal sensitivity; the other being composed 
of negative traits reflecting emotional imma- 
turity. Consequently, it is inappropriate to 
treat the 20 “feminine” traits as a single 
summated rating scale. 

The situation becomes even more compli- 
cated when one recalls that the factor struc- 
ture of self-ratings (Study 2) was more com- 
plex in that the “masculine” items also split 
into two factors. Moreover, the factor struc- 
ture for females differed from the one for 
males. 

The above noted findings are sufficient to 
reject Bem’s operational definition of an- 
drogyny solely on empirical grounds. Since, 
however, the popularity of the BSRI is proba- 
bly due primarily to the Androgyny score that 
it presumably provides, it is of interest to 
sketch the metamorphosis that the operational 
definition of androgyny has undergone during 
its brief history. Except for indicating that 
being androgynous means being “both mascu- 
line and feminine, both assertive and yielding, 
both instrumental and expressive” (Bem, 
1974, p. 555), Bem provides no discussion of 
androgyny. Instead, it is operationally defined 
on the basis of a test of significance between 
the means of Masculinity and Femininity, 
treating the self-ratings of one individual on 
40 traits as being independent of each other. 
It seems unnecessary to discuss the problems 
in such an approach, as this has already been 
done by Strahan (1975). Nor is it necessary 
to discuss the methodological problems at-
tendant on the use of difference scores, as de-
tailed discussions are available (e.g., Harris,
1963). It will only be pointed out that the
use of a difference score as a definition of a
construct is highly questionable (see, for ex-
ample, Cronbach, 1958; Cronbach & Furby,
1970).

Responding to criticisms (e.g., Spence,
Helmreich, & Stapp, 1975; Strahan, 1975)
that her operational definition of androgyny
does not distinguish between different kinds
of the same numerical discrepancy between
Masculinity and Femininity (e.g., high-high,
average-average, low-low), Bem proposed
that median splits be used instead. According 
to the new definition, people who are above 
the median on both scales are classified as 
androgynous; those who are above the median 
on Masculinity and below the median on 
Femininity are classified as masculine; people 
who are above the median on Femininity and 
below the median on Masculinity are classi- 
fied as feminine; and those below the median 
on both scales are considered undifferentiated. 
This is probably the crudest and the least 
useful method for arriving at a typology (for 
a review of available methods see Bailey, 
1974). When using median splits one runs 
the risk of classifying some of the people 
whose scores on both scales are relatively 
similar as being different types, and of classi- 
fying some whose scores are relatively dis- 
similar as being of the same type. If, in addi- 
tion, it is noted that all the groups used in 
studies with the BSRI were groups of con- 
venience and not samples of any defined 
population, it becomes evident that a person 
may be classified as being one type or an- 
other depending upon the aggregate of people 
to which he, or she, is considered to belong 
because the BSRI was administered to them 
(e.g., introductory psychology classes). Fi- 
nally, the use of median splits is unwarranted 
in view of the factorial complexity of the 
scales. 
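
The dependence of the typology on the group that happens to be tested can be sketched as follows. The function applies the fourfold median-split rule described above. The respondent scores, the group names, and the treatment of a score equal to the median (counted here as "below") are assumptions for illustration only.

```python
from statistics import median


def median_split_types(scores):
    """Classify respondents by median splits on both scales.

    scores maps a respondent id to (masculinity, femininity).
    The medians are taken from the group passed in. A score equal to
    the median counts as "below" (an assumption; ties are not
    discussed in the text).
    """
    m_med = median(m for m, _ in scores.values())
    f_med = median(f for _, f in scores.values())
    types = {}
    for rid, (m, f) in scores.items():
        if m > m_med and f > f_med:
            types[rid] = "androgynous"
        elif m > m_med:
            types[rid] = "masculine"
        elif f > f_med:
            types[rid] = "feminine"
        else:
            types[rid] = "undifferentiated"
    return types


# The same hypothetical respondent "p" placed in two groups of convenience.
group_a = {"p": (4.6, 4.6), "a1": (3.0, 3.5), "a2": (4.0, 3.0),
           "a3": (3.5, 4.0), "a4": (5.0, 5.0)}
group_b = {"p": (4.6, 4.6), "b1": (5.0, 5.5), "b2": (6.0, 5.0),
           "b3": (5.5, 6.0), "b4": (4.0, 4.0)}

type_in_a = median_split_types(group_a)["p"]
type_in_b = median_split_types(group_b)["p"]
```

With these made-up scores, the respondent is androgynous in the first group and undifferentiated in the second, although the respondent's own scores are unchanged.
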

In a recent discussion of the relative merits 
of difference scores and median splits, Bem 
(1977) relies again on statistical tests of sig- 
nificance as the sole criterion. Using an as- 
sortment of scales (e.g., Attitudes Toward 
Women, Internal-External Locus of Control, 
Machiavellianism, Self-Esteem), Bem tested 
whether high-high individuals differed sig- 
nificantly from low-low individuals on each. 
Having found significant differences on some 
scales, but not on others, she concluded that 
a distinction between high-high and low-low 
was warranted, though she acknowledged that 
she was unable to specify the conditions un- 
der which the high-high will differ from the 
low-low. Is one to deduce that sometimes 
high-high and low-low are the same kind of 
androgynous people, and sometimes they are 
different (androgynous?) types, though it is 
not possible to say when? 

Bem (1977) concludes her discussion by 
stating, “Finally, we urge investigators to fur- 
ther analyze their data without categorizing 
individual subjects in any way, i.e., through 
the use of multiple regression techniques” (p. 
204). While endorsing what appears to be a 
suggestion to conduct studies within the 
framework of trait-treatment interactions, one 
cannot help wondering: Where has androgyny 
gone? 


Postscript 


We confine this comment on Bem’s (1979) 
reply to the present article to the question of the 
theory behind the construction of the BSRI. No- 
where in her earlier statements regarding the con- 
struction of the BSRI (e.g., Bem, 1974) were 
we able to find an allusion to the theory of sex 
roles that is said by Bem (1979) to be the basis 
for the BSRI. Bem does make reference to state- 
ments by other people about possible motiva- 
tional processes of sex-typed individuals. But no- 
where does she present “a theory about both the 
cognitive processing and the motivational dynam- 
ics of sex-typed and androgynous individuals” 
(Bem, 1979, p. 1048). Quite the contrary, Bem 
has stated, “My hypotheses have derived from 
no formal theory” (Bem, Note 2, p. 1, italics 
added). In our opinion, the rationale presented 
in Bem’s (1979) reply to this article does not 
constitute even a rudimentary theory. Asserting, 
as Bem does, that some individuals are moti- 
vated to conform to sex-typed cultural norms and 
that others are not motivated to do so, or that 
some individuals are “consistent” and others are 
“inconsistent”, is tautological, unless one articu- 
lates a theoretical explanation for such phenomena. 

Thus, in our opinion, it is Bem’s approach to 
scale construction, not the culture, that “has 
clustered a quite heterogeneous collection of at- 
tributes into two mutually exclusive categories” 
(Bem, 1979, p. 1048, italics added). It is Bem, 
not the culture, that “groups a hodgepodge of 
attributes into a category it calls ‘femininity’ or 
‘masculinity’” (Bem, 1979, p. 1049, italics added). 
And it is Bem who treats a hodgepodge of items 
as if it comprised a unidimensional scale. Bem 
maintains that her “theory deliberately does not 
specify the particular contents of these defini- 
tions, however, because these will vary from cul- 
ture to culture” (Bem, 1979, p. 1049) and because 
hers is a theory of “process” not of “content.” 
How, then, does one reconcile this orientation 
with the approach used by Bem in the revision 
of the BSRI (Bem, 1979, p. 1051) in which 
items are eliminated because their correlations 
with the total score are low, or because they load 
on separate factors? Seeking to determine which 
items are “clustered,” Bem appropriately resorts 
to some of the same methods whose absence in 
the construction of the original BSRI we have 
criticized. In sum, we do not see how the retained 
items in the short BSRI reflect a theory of “pro- 
cess” that is not enunciated. 


Psychological Androgyny: A Case of Mistaken Identity? 


Anne Locksley 


This article critically discusses theoretical and methodological issues raised by 
psychological androgyny research and theory. A number of problematic assump- 
tions shared by this approach and the previous masculinity/femininity approach 
are detailed. The first part of the article considers whether inventories devised 
to tap general sex stereotypes should be used as individual personality measures. 
Alternative forms of cognitive structures, linking sex, other person features, and 
behavioral rules, are described and hypothesized to have sex-differentiating 
effects on behavior. The second part of the article discusses problems created 
by the persistent use of indicators of adaptation and mental health as criterion 
variables in research on sex identity and sex roles. It is suggested that a psycho- 
logical theory of sex identity and sex roles should recognize the fact that sex is 
a structural feature of situations and of ongoing organizations of life experience. 
In a society in which sex plays a role in the very structuration of experience, 
the notion of psychological androgyny, with its implication of freedom from 
sex-related social and biological effects on personality and behavior, is arbitrary. 


Reviews of 40 years of psychological re- 
search and theory about masculinity/feminin- 
ity (Constantinople, 1973; Harrison, 1975; 
Maccoby & Jacklin, 1974; Pleck, 1975, 1976) 
reveal that Terman and Miles’s (1936) origi- 
nal formulation of the masculinity/femininity 
construct initiated a line of research that con- 


cently, the concept of psychological androgyny 
(Bem, 1974, 1975, 1976; Block, 1973; Heil- 
brun, 1973; Spence & Helmreich, 1978; 
Spence, Helmreich, & Stapp, 1974, 1975; 
Bem, Note 1) has emerged in the wake of 
masculinity/femininity research. It has al- 
ready been hailed as an exemplar for a new 
paradigm for psychological research on sex 
identity (cf. Harrison, 1975; Kaplan & Bean, 
1976; Pleck, 1975, 1976; Rebecca, Hefner, & 
Oleshansky, 1976). 

At first glance, the concept of psychological 
androgyny is very appealing. The concept, 
and the theory in which it is nested, seems to 
be saying not only that there is no intrinsic 
link between one’s anatomical sex and one’s 
behavior and interests, but also that people 
whose personalities fail to conform to sex-role 
standards are better off for it. However, re- 
search and theory about psychological an- 
drogyny deserve more than a first glance for 
their proper evaluation. The politicization of 
the argument has tended to obscure the fact 
that it makes specifically psychological asser- 
tions about the nature of sex identity, defini- 
tions of masculinity and femininity, and ap- 
propriate means of measuring these phe- 
nomena. The validity of these psychological 
assumptions and techniques for measurement 
should be assessed on scientific as well as po- 
litical grounds. After all, the general area of 
sex roles and sex differences in psychology 
can only benefit from testing the theoretical 
and methodological mettle of what has already 
been acclaimed as an important new contribu- 
tion to the field. 

Accordingly, the present article is intended 
to contribute to a reflective and critical anal- 
ysis of psychological androgyny research and 
theory. It should be noted at the outset that 
the primary focus of this paper is on the logic 
of psychological androgyny research, as ex- 
pressed in theoretical formulations and meth- 
odological practices. We will not address psy- 
chometric issues raised by existing measures 
of psychological androgyny, except when they 
are indicative of the conceptualization of the 
phenomena in question. Thus recent studies 
of the instability of Bem’s (1974) item-selec- 
tion procedure (cf. Edwards & Ashworth, 
1977; Walkup & Abbott, 1978) or of the Bem 
Sex Role Inventory (BSRI) factor structure 
(Gaudreau, 1977; Moreland et al., 1978) will 
not be discussed. Nor will other purely psy- 
chometric considerations, such as the absence 
of homogeneity across classes of trait terms 
(cf. Hogan, DeSoto, & Solano, 1977) on the 
BSRI (cf. also Strahan, 1975), be explored. 
The fundamental issues addressed here are 
(a) the feasibility of using inventories devel- 
oped to tap general perceptions of aggregate 
differences as measures of individual differ- 
ences and (b) the appropriateness of a tra- 
ditional individual-differences approach to the 
phenomena of sex roles, sex differences in per- 
sonality or behavior, and sex identity. It will 
become apparent that psychological androg- 
yny research and theory, far from constituting 
a radical departure, share a number of prob- 
lematic assumptions with masculinity/femi- 
ninity research and theory. We discuss some 
of these problems with the hope of encourag- 
ing a better approach to the phenomena of 
sex roles, sex differences in personality, and 
sex identity. 


Definition of Psychological Androgyny 


The concept of psychological androgyny 
has been operationally defined in differing 
ways, though there is a general consensus 
about the properties that define an androgy- 
nous person. These properties are best under- 
stood in relation to the earlier formulation 
of masculinity/femininity. The various mascu- 
linity/femininity measures (Gough, 1952; 
Guilford & Guilford, 1936; Hathaway & Mc- 
Kinley, 1951; Strong, 1936; Terman & Miles, 
1936) had all been constructed with the as- 
sumption that masculinity and femininity 
were opposite poles of a single dimension; 
thus the less masculine a given respondent’s 
score, the more feminine the respondent was 
deemed to be, ipso facto. In contrast, the con- 
cept of psychological androgyny presupposes 
that masculinity and femininity are orthog- 
onal personality constructs, making it possi- 
ble to categorize respondents on the basis of 
their scores on both masculinity and feminin- 
ity scales. 

Psychological androgyny has been defined 
simply as a function of masculinity and femi- 
ninity scores. Currently two scales are widely 
used in psychological androgyny research: the 
Bem Sex Role Inventory (BSRI), developed 
by Bem (1974), and the Personal Attributes 
Questionnaire (PAQ), developed by Spence, 
Helmreich, and Stapp (1974) with items 
culled from the Sex Role Stereotype Ques- 
tionnaire (Rosenkrantz, Vogel, Bee, Brover- 
man, & Broverman, 1968). Bem originally 
used the BSRI to categorize respondents into 
three groups: masculine, feminine, and an- 
drogynous. However, the Spence group argued 
that truly androgynous people are those who 
endorse both masculine and feminine traits, 
while those who endorse neither are undiffer- 
entiated with respect to masculinity and femi- 
ninity. Accordingly, Bem (1977) explored the 
utility of this distinction. Although her reanal- 
yses of data from earlier experimental studies, 
and her investigation of relationships between 
sex type and other personality constructs, pro- 
duced equivocal results, she concluded that 
the distinction between undifferentiated and 
androgynous respondents is warranted. So 
psychological androgyny is the presence of 
both masculinity and femininity. In turn, 
masculinity and femininity are defined as ad- 
ditive combinations of trait terms judged to 
be significantly more desirable for, or more 
characteristic of, each sex relative to the cate 

Bem and Spence and Helmreich are in ac- 
cord with respect to the essential content of 
the masculinity and femininity constructs. 
Both consider femininity to involve nurtur- 
ance, expressivity, and empathy, and mascu- 
linity to involve agency, instrumentality, and 
dominance. However, they interpret scores on 
their own instruments somewhat differently. 
Bem deliberately invokes cognitive and dis- 
positional processes for interpreting responses 
to the BSRI. At one point she writes that 
cross-situational consistency in behavior can 
be attributed to a “motivational disturbance” 
that “causes a person to respond to dissimilar 
situations as if they were equivalent” because 
she or he “is defensively motivated to main- 
tain some sort of image” (Note 1, p. 8). 
Highly “sex-typed” persons are seen to be de- 
fensively motivated in this sense. Thus Bem 
presents the BSRI as a measure of the “degree 
of sex-role stereotyping in the person’s self- 
concept” (1976, p. 51). 

In contrast, Spence and Helmreich avoid 
appeals to cognitive or motivational processes 
and argue that responses to the PAQ can be 
understood as relatively veridical assessments 
of individuals’ repertoires of traits or personal- 
ity characteristics (cf. Spence & Helmreich, 
1978, pp. 14-16). They are especially com- 
mitted to distinguishing masculinity and femi- 
ninity as personality constructs from sex role- 
related behavior, as well as from dispositions 
to actualize concepts of masculinity and 
femininity. 

From our perspective, the theoretical and 
methodological similarities of the two bodies 
of work outweigh their differences and justify 
their joint treatment. Most of the criticisms 
to be presented here are intended to speak 
both to Bem’s and to Spence and Helmreich’s 
work on psychological androgyny. However, 
as a result of their differing interpretations 
of their scales, some criticisms will refer only 
to Bem’s work,¹ and other criticisms only to 
Spence and Helmreich’s work. In such cases 
the referent will be explicit. 


Problems With the Conceptualization and 
Measurement of Psychological Androgyny 


The development and labeling of any per- 
sonality typology involve claims of reference 
and of identity. The investigator in effect as- 
serts that all persons falling within each cate- 
gory of the typology can be considered as 
equivalent, at least with respect to the quality 
identified by the category title. The strategy 
by which the typology or measure is con- 
structed essentially represents the logic of 
these claims. The labeling of the typology, 
particularly when appropriating terms current 
in ordinary discourse, conveys the nature of 
the quality or construct at issue and thereby 
evokes commonly associated expectations and 
assumptions. This section considers the logic 
governing the development and application 
of the BSRI and the PAQ, traces the origin 
of that logic in masculinity/femininity re- 
search, and seriously questions its validity. 
Whether defining masculinity and femininity 
as opposite ends of a continuum or as orthog- 
onal dimensional constructs, psychologists 
have invariably derived their content from 
aggregate differentiation of the two sexes, 
biologically defined, and have created con- 
structs subsequently applied to members of 
either sex. Masculinity/femininity research 
was generally based on the assumption that 
sex type is a function of behaviors or attri- 
butes significantly more characteristic of one 
sex than of the other. Sex-differing attributes 
were established empirically; sex differences 
in response constituted the criterion for item 
selection in developing the masculinity/femi- 
ninity scales. 


¹ Because Bem invokes cognitive and dispositional 
processes when interpreting responses to the BSRI, 
the burden of proof, as it were, is heavier for her 
than for Spence and Helmreich. Bem wants to inter- 
pret item endorsement as indicative of the extent 
to which sex stereotypes have been internalized and 
consequently control one’s behavior, proscribing and 
prescribing various acts. However, the nature of the 
BSRI items and the manner of the administration 
of the BSRI do not warrant such an inference. For 
example, consider two items on the BSRI Masculin- 
ity scale: “Acts as a leader” and “has leadership 
abilities.” In order to know whether or not one has 
leadership abilities, one has to have been in situa- 
tions in which such a position was accessible. It 
hardly needs to be pointed out that tremendous dis- 
crimination against women’s assumption of leader- 
ship positions is a pervasive feature of our society. 
Endorsement of such an item, or lack thereof, may 
be a function of experiences and of the cumulative 
effects of those experiences resulting from a number 
of causes other than identification with a sex stereo- 
type. In general, responses may reflect behavior in- 
duced by systematic situational factors as well as 
behavior indicative of personal dispositions. The 
lack of homogeneity (cf. Hogan et al., 1977) among 
the trait terms also undermines the assumption that 
exclusive endorsement of masculinity items or ex- 
clusive endorsement of femininity items necessarily 
reflects a global underlying disposition to emulate 
masculine or feminine stereotypes. Factor analyses 
(Gaudreau, 1977; Moreland et al., 1978) have pro- 
duced four factors more internally consistent than 
the original three BSRI scales, further weakening 
the assumption of unidimensional constructs con- 
forming to the original labels. 

Psychological androgyny researchers differ 
from masculinity/femininity researchers not 
only in treating masculinity and femininity as 
orthogonal constructs but also in relying on 
general perceptions of aggregate sex differ- 
ences in personality attributes. Items for both 
the BSRI and the PAQ masculinity and femi- 
ninity scales were selected because they were 
judged, on the average, to be differentially 
desirable or characteristic of the typical man 
and the typical woman. This concept of mas- 
culinity and femininity, as composites of at- 
tributes distinguishing males from females in 
general, has been a persistent feature of sex- 
stereotype research (e.g., Broverman, Vogel, 
Broverman, Clarkson, & Rosenkrantz, 1972; 
McKee & Sherriffs, 1957, 1959; Rosenkrantz 
et al., 1968; Sherriffs & Jarrett, 1953; Sher- 
riffs & McKee, 1957). Indeed, the BSRI and 
the PAQ are constructed in exactly the same 
manner as the sex-stereotype scales. The psy- 
chological androgyny researchers have as- 
sumed that sex type is a function of covaria- 
tion between individual traits and traits popu- 
larly thought to be differentially desirable or 
typical of men and of women. 

The distinction between actual and stereo- 
typically perceived sex differences is an im- 
portant one. Obviously the effect on behavior 
of generally held notions about personality 
covariates of sex may be quite different from 
the effect of anatomical sex on behavior. There 
may be sex differences in behavior caused by 
social or biological factors that are not per- 
ceived to be a function of sex and that are 
consequently not even partially attributable 
to the press of differential socialization prac- 
tices or normative expectations. The two 
sorts of effects on behavior were confounded 
in masculinity/femininity research by the use 
of sex differences in item response as the cri- 
terion for item selection. In addition, the re- 
striction of scale content to items perceived 
to be linked to sex forces consideration of the 
cognitive mediation of a sense of self by con- 
cepts connecting sex and other attributes, and 
of the effects of these concepts on behavior. 
The reliance on self-report for the measure- 
ment of individual differences further under- 
scores the need for recognizing the role of 
cognitive aspects of sex typing or sex iden- 
tity on sex differentiation in behavior. 

But like the masculinity/femininity re- 
searchers, the psychological androgyny re- 
searchers use personality inventories devel- 
oped around aggregate distinctions between 
the categories of biological sex to measure 
individual differences in personality regardless 
of biological sex. This procedure raises two 
basic theoretical and methodological questions. 
First, if the intent is to define and measure 
individual differences in masculinity and femi- 
ninity, should the definition and measurement 
of these constructs be based on stereotypically 
perceived aggregate sex differences? Second, 
can an inventory developed to tap beliefs 
about aggregate sex differences be used as a 
measure of individual differences? Specifically, 
can the sense provided by the context in 
which the measure is developed be sustained 
in the context of its application? 


Limitations of the Stereotypic Conception of 
Masculinity and Femininity 


The rationale for deriving the BSRI and 
the PAQ from sex stereotypes is based on 
the assumption that different frequencies of 
attributing adjectives to the typical woman 
or the typical man indicate beliefs about co- 
variates of sex. The findings of sex-stereotype 
research do not necessarily support this as- 
sumption. Because sex segregation in family 
and work roles is a widespread feature of 
American society, those adjectives that meet 
the criterion may be linked to prototypically 
female and male family and work roles rather 
than to sex per se. The evidence suggests that 
this might be the case. Clifton, McGrath, and 
Wick (1976) asked a sample of respondents 
to check those adjectives, from a list of 153, 
that describe the typical housewife, bunny, 
club woman, career woman, and woman ath- 
lete. They reasoned that if a general stereo- 
type of women existed, then similar clusters 
of adjectives should be used across all roles. 
Instead, they found a surprisingly low com- 
monality of adjectives used across all roles. 
Indeed, only one adjective was used to de- 
scribe each of the five types of woman: ac- 
tive. The adjectives used to describe the 
typical housewife were very similar to those 
found on the BSRI and the PAQ femininity 
scales. However, none of the adjectives used 
to describe bunny appear on these scales. The 
other three roles, club woman, career woman, 
and woman athlete, were all described with a 
preponderance of adjectives similar to those 
on the BSRI and the PAQ masculinity scales. 
Inspection of the content of the PAQ and 
the BSRI further corroborates the premise 
that respondents’ ideas of characteristics gen- 
erally distinguishing males and females and 
their ideas of characteristics specific to highly 
sex-segregated roles (e.g., housewife/mother 
and career man) may be confounded. It is not 
difficult to derive traits of nurturance, em- 
pathy, and expressiveness from the American 
ideal of motherhood, or to derive traits of 
agency, instrumentality, and dominance from 
the American ideal of the father providing for 
his family. 

Like all sex-stereotype scales developed to 
date, the BSRI and the PAQ development 
procedures left respondents with no choice 
but to try to convey the significance of non- 
trait person features with trait terms. The 
consequences may be misleading. For ex- 
ample, Johnson (1975) has pointed out that 
the instrumentality/expressiveness dichotomy 
is not a good indication of the masculinity/ 
femininity dichotomy because traditionally 
female activities, such as mothering, require 
instrumental behavior in order to be carried 
out effectively. Nonetheless, the instrumental- 
ity/expressiveness dichotomy recurs consis- 
tently throughout the sex-stereotype literature 
and is the basis of the BSRI and the PAQ. 
We suspect that this distinction derives from 
the prototypical middle-class housewife who, 
as a result of present-day affluence, does not 
procure resources for the family. When the 
family unit is less affluent, mothering respon- 
sibilities often include actions that obtain re- 
sources needed for family survival. When trait 
terms are the only means by which respon- 
dents can distinguish females from males, the 
terms may be used to signify something other 
than their ordinarily understood meanings. 
Although the typical housewife may not work 
for money or use social networks to procure 
major survival resources, her actual behav- 
ior may require “agency” or “instrumental- 
ity,” and her instrumentality is distinguishable 
from that of her husband not by adjectival 
characterizations of behavior but by the con- 
text in which it occurs and the purposes it 
serves. These observations suggest that the 
content of general sex stereotypes may be 
nothing other than reified personality charac- 
teristics associated with ideal representatives 
of adult, sex-segregated social roles. 

The probable nonindependence of concep- 
tions of prototypically sex-segregated social 
roles and conceptions of general covariates 
with sex has several problematic implications 
for psychological androgyny research and the- 
ory. To begin with, most psychological an- 
drogyny research has been conducted on col- 
lege students and even on high school students. 
The variance in scores permitting the cate- 
gorization of androgynous persons may indi- 
cate only the lack of press from the definitions 
and constraints of adult sex-segregated social 
roles rather than the absence of relevant sex- 
differentiated traits. In fact, the preoccupa- 
tion of adolescents and young adults with sex- 
uality and courtship suggests an alternative 
list of adjectives to that of the BSRI or the 
PAQ. Furthermore, Clifton et al.’s (1976) 
failure to obtain general stereotypes when in- 
formation in addition to that of the stimulus 
person’s sex was provided is corroborated by 
reflection on ordinary language rules. For ex- 
ample, we make discriminations based on such 
person features as age, sex, and kinship posi- 
tion when we characterize people with sex- 
related qualities. We may say, “The Marlboro 
man is the epitome of masculinity,” but we 
don’t say, “He’s a masculine father,” even 
though it is syntactically acceptable to do so. 
Similarly, we may characterize a man as ef- 
feminate, but not a woman. These observa- 
tions suggest that general sex stereotypes may 
be too global for interpreting and guiding be- 
havior at the level of individual self-percep- 
tion or self-direction. 

The assumption that traits differentially at- 
tributed to the typical woman and the typi- 
cal man retain their sex-associated character 
when used for self-description purposes is un- 
dermined by a recent series of experiments 
by Nisbett and Zukier (Note 2). They dem- 
onstrate that the impact of stereotypes on 
social inference and prediction is diluted by 
the presence of “nondiagnostic” information, 
that is, information unrelated to the stereo- 
type. Obviously, personal experience provides 
a wealth of nondiagnostic information with 
respect to global sex stereotypes. The results 
of Nisbett and Zukier’s studies suggest that 
global sex stereotypes are rarely activated for 
self-perception or self-description purposes. 
In fact it is probable that general sex stereo- 
types are most likely to have effects under 
either of the following conditions: (a) in 
situations comparable to those in which sex 
stereotypes are assessed, namely, in situations 
in which only the sex of the target person is 
known or in which sex is the most informative 
characteristic known, and (b) in situations in 
which sex is the salient factor distinguishing 
two persons and in which its general covariates 
are representative of the situational requisites. 
Support for the first condition is provided by 
studies of preferential hiring decisions. Blau 
and Jusenius (1976) show that sex discrimi- 
nation in employment is most extensive at 
the port of entry for internal job ladders pre- 
cisely because few other person features (e.g., 
dependability, companionability) relevant for 
long-term association are known. Support for 
the second condition is suggested by the popu- 
lar recognition of the impact of childbirth and 
of parenting during early infancy on parents’ 
acceptance of, and behavioral adherence to, 
sex stereotypes. 

Psychological androgyny researchers define 
masculinity and femininity as stereotypically 
perceived composites of traits believed to dif- 
ferentiate the sexes on the average. However, 
if the intent is to use concepts of sex covari- 
ates at the individual level of analysis, in an 
effort to discern either the influence of these 
concepts on behavior or the presence and 
ramifications of those covariates, it may be 
more reasonable and effective to tap those 
concepts most pertinent for self-description 
and self-perception. Otherwise, psychological 
androgyny may reflect only the failure of the 
particular masculinity and femininity mea- 
sures used to tap the most salient dimensions 
of sex differentiation in personality and be- 
havior. Certainly the form and content of 
ideas of sex covariates have yet to be deter- 
mined. They may be like disconnected sets 
of propositions (e.g., propositions about les- 
bians may be quite different from propositions 
about glamorous actresses); they may dis- 
tinguish people on the basis of kinship cate- 
gories or on the basis of traits; they may be 
embedded in concepts of settings or environ- 
ments. Wittgenstein’s (1958) notion of a fam- 
ily of concepts with varying criteria of family 
resemblances may be more appropriate here 
than the notion of a solitary and general sex 
stereotype. In any case, the determination of 
the nature of ordinary concepts linking sex, 
other attributes, and behavioral rules is im- 
portant because of its ramifications for the 
types of predictions one would make about 
the effects of such concepts or cognitive struc- 
tures on behavior. 

Recent work in cognitive psychology offers 
hypotheses about the nature and structure ofi 
ideas of masculinity and femininity alternative! 
to those presupposed by Bem and by Spend 
and Helmreich. The BSRI and PAQ develop 
ment and scoring procedures attribute mas- 
culinity to an individual when all traits be- $p 
lieved to be significantly more desirable for, f 
or characteristic of, males rather than females 
are properties of that individual. The attribu- 
tion of femininity follows a parallel logic. 
However, the structure of such concepts need 
not be so Aristotelian. For example, Rosch 
(1975) demonstrated that the structure of 
cognitive representations of categories can be
of a prototypical sort. Such a category is
represented by a particular object, called a
good instance of the category, surrounded by
other objects of decreasing similarity to the
prototype and correspondingly decreasing de-
grees of membership. Analogously, popular
ideas of what it is to be a woman and what
it is to be a man may be represented by proto-
typical persona, or conceptual structures that
cluster salient information definitive of a type 
or category of individuals. Unlike trait-dimen-
sional summaries of people, person prototypes
may cluster diverse types of information (e.g.,
physical appearance, occupation, social pres-
tige, as well as behavioral dispositions and
characteristics), including information about 
vivid, immediately perceptible person features. 
Such features may activate the prototype, 
causing the individual to generate numerous 
inferences about the stimulus person’s proba- 
ble behavioral dispositions in the absence of 
any information about their actual behavior 
(e.g., Snyder, Tanke, & Berscheid, 1977;
Taylor & Crocker, Note 3). This type of rea- 
soning can affect self-perception as well. For 
example, when the poet Maya Angelou was 15 
years old, she decided that she must be a les-
bian because, unlike her girlfriends, her
breasts remained small, her hips narrow, and
her voice deep (Angelou, 1971). For her, a
boyish physique was as evocative of the cate-
gory of lesbianism as homoerotic feelings.

Certainly the investigator retains the right 
to label operationally defined personological 
phenomena as she or he chooses. But should 
the scientific meaning of terms be arbitrary 
with respect to their ordinary sense? The term
psychological androgyny clearly implies “of
indeterminate sex, or not sex related.” How-
ever, its scientific operationalization defines
persons as androgynous only in relation to
global sex stereotypes. Such persons may still
be as influenced as sex-typed persons by con-
cepts of sex-linked attributes, or may evidence
sex-differentiated personality or behavioral 
qualities. 


Methodological Problems of Using Sex 
Stereotypes for Individual-Difference 
Measures 


The validity of transferring the sense pro- 
vided by the context in which the BSRI and 
the PAQ were developed to the context in 
which they are administered can be ques- 
tioned on certain methodological grounds. The 
conceptual and referential context in which 
judgments are made about the self is en-
tirely different from the context in which
judgments are made about such abstractions
as the typical man and the typical woman. 

To begin with, self-rating on an attribute 
entails comparison processes. When female 
respondents rate themselves, are they com- 
paring themselves to their female peers, their 
peers of both sexes, to a general notion of the
average woman, or to some specific exemplar 
of femininity? The same question can be 
asked regarding male respondents. For ex-
ample, a woman may endorse “athletic” as
a true description of herself because she is
athletic relative to other women, although she 
may not be particularly athletic relative to 
most men. Similarly, a man may endorse “com-
passionate” because he considers himself to
be more compassionate than most men, though
less compassionate than most women. Fur-
thermore, the behaviors referred to by these
terms may not be the same for men as for 
women. The compassionate man may be think- 
ing of his indignation over inequities and in- 
justices in the world, whereas the compas- 
sionate woman may be thinking of her em- 
pathy for the problems of her friends. En- 
dorsement of the same items may thus conceal 
sex differences, again raising the question of 
the validity of the behavioral reflections of 
a label such as androgyny. 

In addition, respondents asked to describe 
themselves use a different inferential context, 
which may alter the sense of a given term 
from that afforded by its original context. For 
example, when thinking of the ideal house- 
wife/mother, loyalty (an item on the BSRI 
femininity scale) would seem to be a salient 
feature of the mother’s feelings about her 
family. On the other hand, a football jock 
may think of loyalty in the sense of loyalty 
to his buddies, or to his team and to his 
school. This concept of loyalty is hardly in- 
dicative of femininity or of emulation of a 
female sex stereotype. 

It could be argued that all personality in-
ventories feature indeterminate inferential
contexts, and so singling out the BSRI and 
the PAQ is unfair. But the problem is not so 
much one of general ambiguity, but rather 
one of biases in self-report, or automatic ad- 
justments to one’s sex, systematically related 
to the notion of what is measured. The sorts
of referential vagaries noted above contribute
to a surface impression of equivalence not 
necessarily supported by the underlying evi- 
dence. Indeed, sex differences in the results 
of Bem’s (1976) experimental studies lend 
credence to this premise, that responses differ 
in their significance by sex.

Fundamentally, the appropriateness of a 
traditional individual-differences approach to 
this general domain is questionable. There are 
special characteristics of masculinity, femi- 
ninity, sexual identity, and sex differentiation, 
such as it is, in personality or behavior, that 
do not hold for any other individual-differ-
ence construct (with the possible exception of
intelligence). For one thing, sex, the anchor 
for concepts of masculinity and femininity, is 
a universal attribute of the organism. The 
lay person and, until recently, the psycholo- 
gist tend to assume the bipolar nature of 
masculinity and femininity because biological 
sex is thought to be causally implicated in 
trait differentiation. Masculinity and feminin- 
ity are conceptually related to biological sex. 
They are also, at least in their ordinary sense, 
conceptually related to sexuality and sexual 
relations, universal phenomena of extraordi- 
nary salience and consequence. Finally, their 
relation to biological sex suggests another con- 
sideration. Sex is a major criterion for the 
structural differentiation of experience over 
the life cycle, given an extensively sex-segre- 
gated society. The implications of these con- 
siderations for the viability of psychological 
androgyny research and theory are explored 
in the following section. 


Masculinity, Femininity, and Adaptation 


Masculinity/femininity research and the- 
ory had always been characterized by an em- 
phasis on the contribution of the construct 
to overall adaptation or adjustment. As a per- 
sonality construct, masculinity/femininity was 
postulated to account for individual varia- 
tion in indicators of adaptation or adjustment. 
The construct was assumed to be valid if it
discriminated criterion groups that could be 
distinguished on the basis of a “relevant” 
adjustment indicator. Homosexuality in men,
adjudged to be pathological and maladaptive,
was one diagnostic criterion advocated by
Terman and Miles (1936), and a notable
amount of research was conducted investigat-
ing the predictive validity of various mascu-
linity/femininity scales regarding male sexual
object choice (e.g., Burton, 1947; Krippner,
1964; Manosevitz, 1970). In a similar vein,
research was conducted to determine the con-
tribution of masculinity/femininity to global
adjustment (e.g., Mussen, 1961), schizophre-
nia (e.g., Reed, 1956), anxiety (e.g., Con-
sentino & Heilbrun, 1964; Gotts & Phillips,
1968; Gray, 1957), obesity (e.g., Lefley,
1971), and alcoholism (e.g., Jensen & Hoff- 
man, 1973), among other adjustment indica- 
tors. The persistent assumption was that mas- 
culinity in males and femininity in females is 
functional for adaptation. The assumption was 
never substantiated, though the limitations of 
the masculinity/femininity scales make it dif- 
ficult to know whether these findings are at- 
tributable to construct validity problems or 
if they constitute valid disconfirmations of the 
hypothesis. 

Psychological androgyny research has in- 
herited the focus on adaptation from the mas- 
culinity/femininity research. However, the di- 
rection of the argument has been reversed: 
Androgyny (rather than exclusive masculinity 
or exclusive femininity) is postulated to fa- 
cilitate adaptation or adjustment. Bem (1974, 
1976, Note 1) has always nested androgyny
in a broader notion of the basis of mental
health, and her experimental studies are de-
signed to demonstrate that androgynous sub-
jects are more able to cope with the requisites
of situations than are masculine or feminine 
subjects. Indeed, she describes her sex-typed 
subjects as having “behavioral deficits,” argu- 
ing that they don’t have the abilities neces- 
sary for effective action in certain types of 
situations. Similarly, Spence et al. (1975) 
demonstrated that psychologically androgy- 
nous subjects have higher self-esteem than do 
masculine, feminine, or undifferentiated sub- 
jects. They explain this finding as indicative 
of androgynous subjects’ greater personal and 
social effectiveness. 

Two major problems, however, stem from 
the focus of this research on the prediction of 
indicators of adaptation or adjustment. The 
first involves a persistent blurring between no-
tions of predictive validity and of construct
validity. The second and more serious problem 
involves contradictions between psychological 
and sociological levels of analysis. 

Predictive validity is equivalent to the 
“truth value” or “power” of a theory, which
increases as the hypotheses deduced from the 
theory are confirmed (Wiggins, 1973); it is 
not synonymous with construct validity. Con- 
struct validity is a feature of a particular 
measure of a personality construct and is es- 
tablished by a complex set of procedures de- 
termining the plausibility that the measure 
measures what it purports to measure (Camp- 
bell & Fiske, 1959; Cronbach & Meehl, 1955; 
Nunnally, 1967; Wiggins, 1973). The con- 
struct validity of a measure cannot be estab- 
lished by demonstrating its predictive valid- 
ity. Yet the psychological androgyny research- 
ers have attempted to do precisely that. They 
have reasoned that if androgynous subjects 
are more flexible in handling diverse situations 
or evidence more positive self-regard com-
pared to the other categories of subjects, this
must be due to differences in the sex type
between the four categories of subjects. The 
notion of the function of sex type for adapta- 
bility or adjustment is so entrenched that the 
construct validity of the BSRI and the PAQ 
is taken for granted. Oddly, there are numer- 
ous behaviors that are popularly recognized 
to covary with masculinity and femininity, 
but neither the masculinity/femininity re- 
searchers nor the psychological androgyny 
researchers have tried to determine their re- 
lation to the sex-type measures. Consider, for 
example, masculine males. They are generally 
assumed to “sow their wild oats,” to put a 
high priority on physical attractiveness in the 
choice of a mate, to value male friendships, 
and to socialize frequently in male groups. In 
contrast, feminine females are generally as- 
sumed to invest considerable amounts of time 
in their physical appearance, to prefer the 
company of males over females, and to sub- 
ordinate commitments to female friends to 
commitments to male friends. Feminine males 
are assumed to be given to stereotypically
feminine gestures and to enjoy sitting about
and gossiping. These examples are obviously
suggested by popularly disseminated carica-
tures. Readers may think of still other behav-
iors that are widely believed to be indicative
of masculinity and femininity. What is impor- 
tant for establishing the construct validity of 
the BSRI and the PAQ is their discrimina- 
tion of behaviors that reflect dispositional sex- 
differentiated preferences on the part of the 
actor, rather than behaviors that reflect such 
relatively non-sex-related attributes as self- 
esteem, self-confidence, intelligence, or con- 
servatism. 

The confusion between construct validity 
and predictive validity characteristic of psy- 
chological androgyny research and theory can 
easily be rectified by further research. A more 
serious problem of contradictions between psy- 
chological and sociological levels of analysis 
is created by the focus on adaptation or ad- 
justment as criterion variables for sex-identity 
research. Given the overall argument of psy- 
chological research and theory, the use of cri- 
terion indicators of adaptability or adjustment 
ignores the fact that many situations continue 
to be structured with respect to sex, such that 
sex-appropriate behaviors generally procure 
psychological and material rewards, whereas 
sex-inappropriate behaviors may be costly. If 
situational appropriateness is to be a criterion 
of adjustment, it would be “maladjusted” to 
behave in a sex-inappropriate manner. And, 
given that behavior is at least partially a 
function of situations (e.g., Mischel, 1968), 
men and women will act differently because 
they have different ongoing tasks, responsi- 
bilities, and access to resources. The exis- 
tence of sex segregation and discrimination 
in the paid labor force has been amply docu- 
mented, revealing differential sex-dependent 
contingencies surrounding entry into occupa- 
tions, considerable sex segregation in occupa- 
tional distribution, and striking sex differences 
in earnings (Blau & Jusenius, 1976; Kanter, 
1977; Tangri, 1972). Also, in spite of the in- 
creasing participation of married women in 
the work force, time-budget studies show that 
sex-based division of labor in the family has 
persisted (Robinson, 1978; Walker & Woods, 
1976). Within the coeducational college en- 
vironment, which is the setting for practically 
all of the psychological androgyny research, 
and which is, perhaps, the least sex-segregated
setting of any institutionalized environment,
one still discerns ample evidence for differ- 
ential constraints: in athletic programs, in 
student political organizations, and in extra- 
curricular activities (Douvan, 1975). 

At a level that is more immediately experi- 
enced by the average college student, norms 
governing appropriate behavior in interper- 
sonal relationships in informal as well as for- 
mal settings still differ depending on the ac- 
tor’s sex. People are relatively aware of these 
norms. Their impact on behavior is suggested 
by the rather distressing findings of a study 
(Zanna & Pack, 1975) that found that when 
a potential male partner was described with 
desirable characteristics, female subjects’ self- 
descriptions were influenced by information 
about his ideal woman: They tended to de- 
scribe themselves in conventional terms when 
the partner’s ideal was conventional and in 
unconventional terms when the partner’s ideal 
was unconventional. When the partner was 
undesirable, information about his ideal had 
little impact on the subjects’ self-descriptions. 
The women students clearly geared their be- 
havior to reward contingencies present in the 
situation. 

Snyder and Skrypnek provide further evi- 
dence (described in Snyder, Note 4) of the 
behavioral impact of sex-contingent expecta- 
tions. They located two individuals in separate 
rooms and required the pair to divide a series 
of tasks varying in their sex specificity. Since 
the dyads could communicate only by means 
of a signaling system, neither member could
detect the sex of the other. In one condition,
the “perceiver” was told that the “target”
was female; in a second condition, the per- 
ceiver was told that the target was male. 
Snyder and Skrypnek found not only that the 
perceived sex of the target influenced the out- 
come of the division of labor process, but also 
that the targets themselves began to initiate 
choices consistent with their assigned sex.

Once we consider that situations involve
different contingencies and behavioral norms 
for males and for females, the use of adapta- 
bility or adjustment in androgyny research 
becomes contradictory. Bem’s experimental 
studies demonstrate that androgynous subjects 
are capable of sex-inappropriate behavior
when the demand characteristics of the ex-
periment clearly tell the subjects that sex-
inappropriate behavior is situationally appro-
priate. The studies do not demonstrate that
androgynous subjects engage in sex-inappro- 
priate behavior when sex-appropriate behavior 
is situationally appropriate or rewarding. And 
if androgynous subjects are to be distinguished 
by their ability to adapt to situational requi- 
sites, will they not act in sex-role-congruent 
ways when such acts enable them to obtain 
psychological, interpersonal, and material re- 
wards? The fact that androgynous subjects 
endorse both masculine and feminine traits in 
no way militates against this point. Consider 
the female college student who is analytical, 
assertive, competitive, independent, individ- 
ualistic, and self-reliant in her studies, as all 
good students are expected to be, and affec- 
tionate, childlike, flatterable, loyal, soft- 
spoken, sympathetic, understanding, warm, 
and yielding with her boyfriend, as all good 
girlfriends are expected to be. She would be 
classified as androgynous by Bem, yet her 
behavior is not independent of sex role norms. 

Even though many of the traits may be 
statistically orthogonal and although individ- 
uals, for example, may be both dominant and 
submissive between situations, within situa- 
tions their behavior may still be constrained 
by interaction rules that affect the sexes dif- 
ferently. Bem argues that androgynous indi- 
viduals are better off than masculine or femi- 
nine individuals because they are capable of 
being, to take one example, yielding in situa- 
tions in which it is appropriate to yield and, 
to take another example, assertive in situa- 
tions in which it is appropriate to be assertive. 
This argument neglects the fact that norms 
and rules governing behavior in settings are 
often sex specific and are contingent on the 
sex composition of the actors in the setting 
(cf. Ruble & Higgins, 1976). For example, in 
a given social interaction it may be appro-
priate for a woman to yield to a man, but
highly inappropriate for a man to yield to a 
woman. In other settings, the woman may be 
assertive, but such behavior may not procure 
or maintain authority as the man’s behavior 
does in interaction with the woman. 

Because many situations are structured with
respect to sex in such a way that sex-inappro-
priate behavior is costly, we should expect
to find sex differences in the frequencies of 
certain types of behavior, which, in turn, have 
implications for how one might realistically 
describe oneself as an actor in general. Bem’s 
(1974) data does provide some evidence for 
this line of reasoning, though she has never 
considered the implications of the following 
findings. She found significant sex differences 
in androgyny scores, with females scoring on 
the feminine side of the scale and males scor- 
ing on the masculine side. In addition, only 
6% of the males at Stanford and 9% at Foot- 
hill were feminine, and only 8% of the females 
at Stanford and 8% at Foothill were mascu- 
line. These findings suggest that there are 
definite constraints either influencing respon- 
dents’ willingness to endorse cross-sex-associ- 
ated items exclusively or influencing respon- 
dents’ frequency of acting in ways that could 
be described with cross-sex-associated items. 
Either explanation would be consistent with 
the notions that situations are structured with 
respect to sex at both institutionalized and in-
formal social levels, and that differential con-
tingencies influence the behavior of males and
females, their dispositions to behave in vary-
ing ways, and consequently their perceptions
of self as an actor in general.

Spence and Helmreich (1979) argue that 
the influence of sex-contingent norms of be- 
havior “cannot be used as a criticism of per- 
sonality instruments such as the PAQ; it 
merely serves as a warning that one should not 
make predictions about behavior as a function 
of PAQ scores without taking situational vari- 
ables into account” (pp. 1037-1038). Else- 
where they stipulate that “expression of the 
personality traits tapped by the PAQ is likely 
to be exaggerated when these response dis- 
positions are congruent with situational de- 
mands . . . and likely to be suppressed or 
modified when they are incongruent” (p. 
1037). For support of this point they cite a 
study by Megargee (1969), who fully crossed 
sex and high- and low-dominant subjects in 
dyads and asked each pair to choose a leader 
for working on one of two laboratory tasks. 
In by far the majority of mixed-sex pairs, 
the male wound up as the leader, regardless
of either partner’s dominance rating. Domi-
nant women were more likely to participate
verbally in the choice of a leader but were 
not more likely to be a leader than nondomi- 
nant women were, whether paired with a 
high- or a low-dominant male. Spence and 
Helmreich neglect to point out that this 
effect occurred regardless of whether the 
task was of a “masculine” sort, involving 
industrial mechanical work, or of a “femi- 
nine” sort, involving clerical work and verbal 
talent. 

Clearly situational demands and press can 
inhibit the expression of a personality trait. 
However, the investigator ought to be able 
to specify, in a theoretically plausible fashion, 
those situations in which the personality trait 
should predict behavior and those situations 
in which the trait should not predict behavior. 
Congruence with situational demands is not a 
valid criterion, at least unless the contribu- 
tion of situational demands can be distin- 
guished from the contribution of the personal- 
ity disposition. After all, there is no a priori 
reason to assume that situational demands 
always overshadow the influence of personal- 
ity dispositions. The Megargee study is a case 
in point. It is not clear to us that one would 
have predicted no effect of the sex-typing of 
the task on the expression of the personality 
trait dominance on the part of the women 
subjects. Furthermore, if situational contin- 
gencies and social expectations repeatedly 
overwhelm the influence of assumed disposi- 
tions on behavior, one can hardly expect the 
targets of these powerful forces to sustain dis-
positions or self-perceptions largely indepen-
dent of their behavior.²


² Interestingly, most of Bem’s experimental studies
involve behaviors directly tapping the behavioral
content of the items on the BSRI masculinity and
femininity scales. Spence (Note 5) cautions against
assuming that either the PAQ or the BSRI masculin-
ity and femininity scales relate to other sex-related
aspects of behavior or personality. No one has found
any association, for example, between psychological
androgyny and attitudes towards women’s rights
(Jones et al., 1978; Spence, Note 5). Spence and
Helmreich observe that in general “the psychological
dimensions of masculinity and femininity . . . are
only weakly related within each sex to the broad
spectrum of sex-role behaviors” (1978, p. 3). They
argue that this state of affairs is legitimate and even
to be expected, given that the strength of some sex
role norms can create homogeneity in behavior re-
gardless of personal inclination (1978, cf. p. 28).
This caveat creates certain difficulties. On the face of
it, at least, it is unsatisfactory to show that the
BSRI and the PAQ scales simply tap the traits they
contain. If they are only trait lists, why are they
labeled Masculinity and Femininity scales? It is in-
sufficient to derive their sense purely from aggregate
distinctions. That would be equivalent to develop-
ing and labeling a scale “Whiteness” because it com-
prises intelligence items that discriminate the races
on the average, and using the scale to categorize in-
dividuals as white, nonwhite, and beyond coloration
regardless of their race. The labels and the research
that has been conducted with these scales clearly
suggest the assumptions that there is something about
individual personalities that is being measured, and
that that something is sex related. Furthermore,
there are situations so weakly governed by sex-role
norms that personality dispositions ought to predict
whether the person’s behavior is sex role conforming
or deviant. These situations ought to be specifiable
in advance.


Underlying the notion of psychological an-
drogyny is the assumption that sex typing
per se is dysfunctional or maladaptive, and
that men are as disadvantaged as women, since
they are expected to conform or aspire to a
stereotypic ideal of masculinity, just as women
are expected to conform to a stereotypic ideal
of femininity.

Neither data produced by Bem and by
Spence and Helmreich nor observations of
the relative well-being of men and women in
our society support the conclusion that indi-
viduals evidencing exclusively masculine traits
are in fact as disadvantaged as those evidenc-
ing exclusively feminine characteristics. Bem’s
findings show that feminine females are the
least capable of the lot and that little differ-
ence obtains between masculine and androgy-
nous females. Spence and Helmreich’s sub-
jects scoring high-feminine/low-masculine dis-
play considerably lower self-esteem than those
scoring high-masculine/low-feminine.

Most of the masculinity traits included in
the BSRI and the PAQ reflect skills that en-
able individuals to access resources and con-
trol situations and social interactions. Al-
though masculine individuals may forfeit some
emotional gratifications, they have survival
skills. On the other hand, most of the feminin-
ity traits do not reflect skills: Compare child-
like, shy, gullible, soft-spoken, to analytical,
assertive, independent, self-reliant, items on
the BSRI femininity and masculinity scales,
respectively.

Jones, Chernovetz, and Hansson (1978) 
conducted a series of studies designed to in- 
vestigate Bem’s (Note 1) hypothesis that psy- 
chologically androgynous individuals are more 
adjusted than sex-typed persons, in conse- 
quence of their greater behavioral flexibility 
or more extended behavioral repertoire. They
used a variety of dependent measures, includ- 
ing self-esteem, locus of control, problems with 
alcohol, helplessness, sexual maturity, and 
neurosis, as adjustment indices, as well as a 
series of attitudinal variables. They found 
that masculinity rather than androgyny gen- 
erally predicted greater adjustment and flexi- 
bility, for both sexes. They also found that 
both male and female feminine and androgy- 
nous subjects expressed a desire to become 
more masculine. These results do not vindi- 
cate neglect of the fact that behavior occurs 
in situations all too often governed by asym- 
metrically sex-differentiated norms and con- 
tingencies. More important, these observations 
suggest that the term psychological androg-
yny may be utterly arbitrary, inasmuch as
persons classified as androgynous are no less 
aware or concerned about the social implica- 
tions of their sex and its ramifications for their 
behavior than are sex-typed persons. 

Indeed, how could they be? It is impossible 
to escape the social consequences of one’s sex. 
Sex is an immediately perceptible feature of 
every person. Regardless of one’s own inclina- 
tions, one’s sex evokes and elicits sex stereo- 
types in others. Thus one is repeatedly as- 
saulted by the influence of sex-contingent ex- 
pectations and norms. Furthermore, as the 
studies by Megargee (1969), Ruble and Hig- 
gins (1976), Snyder and Skrypnek (Snyder, 
Note 4), and Zanna and Pack (1975) sug- 
gest, individuals are easily swayed by sex- 
contingent expectations, quickly adopting 
these norms as their own. Both the influence 
of one’s sex on others’ perceptions and ex- 
pectations on oneself and one’s actions are 
unavoidable. The negotiation of social experi-
ence involves participating in systems of be-
havioral rules that mark the sex of each and
every actor. Considering the role of sex in
the very architecture of experience and be- 
havior, the notion of psychological androgyny, 
with its implication of freedom from sex- 
related social effects on personality and be- 
havior, is arbitrary at best. 

The present article has raised a number of 
questions about the explicit and implicit 
claims of psychological androgyny research 
and theory. We have addressed issues concern- 
ing the interpretation of respondents’ scores 
on the BSRI and the PAQ, assumptions about 
the definitions of masculinity and femininity, 
and the general focus on adjustment and men- 
tal health as criterion variables. Throughout 
our questioning of psychological androgyny 
research and theory, we have repeatedly em- 
phasized that sex is a structural feature of 
situations and of ongoing organizations of life 
experience, in that settings and ongoing or- 
ganizations are characterized by sex-differ- 
entiated behavioral norms, cost/reward con- 
tingencies, and access to resources. In general 
a psychological theory of sex identity or sex 
differences should recognize this fact, because 
individuals apprehend the significance of their 
sex at least partly as a function of their ex-
perience with sex-differentiating social and in-
stitutional environments.

The diversity and complexity of the social 
and sociological functions of sex do not auto- 
matically exile the factor from the province 
of psychology, though. Psychological androg- 
yny research introduced the role of cognitive 
and motivational processes in sex differentia- 
tion of behavior, if only by deriving masculin- 
ity and femininity scales from general sex 
stereotypes. A potentially more fruitful ap- 
proach to the measurement of cognitive struc- 
tures linking sex, other person features, and 
behavior rules would begin with concepts of 
more immediate relevance for self-perception 
and self-direction than global sex stereotypes 
and would take advantage of recent develop- 
ments in cognitive psychology regarding for- 
mal and structural features of cognitive or- 
ganizations (e.g., Abelson, 1976; Cantor & 
Mischel, 1977; Minsky, 1975; Mischel, 1977; 
Neisser, 1976; Rosch, 1975). One possible 
conceptual form suggested earlier is the per-
son prototype. Psychological androgyny re-
searchers define masculinity and femininity in
personality-trait terms, as clusters of attri- 
butes fixed across time and across situations. 
Possibly a more adequate representation of 
person factors that tend to sex-differentiate 
behavior would focus on the cognitive media- 
tion of a sense of self by concepts connecting 
sex, other attributes, and their general valence 
and would examine the effects of those con- 
cepts on behavior. As noted earlier, Wittgen- 
stein’s (1958) notion of a family of concepts, 
with varying overlap or criteria of resem- 
blance, is probably preferable to the notion 
of a general sex stereotype, for purposes of 
investigating individually initiated, sex-differ- 
entiated behavior. In any case, it is important 
to recognize that existing measures of sex 
stereotypes carry implicit assumptions about 
the nature of cognitive representations of 
world knowledge involving gender. Those as- 
sumptions require explicit examination. 

Finally, the profound and extensive distinc- 
tion by sex still characteristic of our society 
is precisely what has been brought to our at- 
tention by the feminist movement. Individuals 
are willy-nilly caught in the process of con- 
tending with social as well as biological im- 
plications of their sex that have serious rami- 
fications for their self-regard and for major 
and minor life decisions. From this perspec- 
tive, there is no such thing as psychological 
androgyny. Individuals living within a cul- 
ture in which sex is deeply structurally em- 
bedded are inevitably subject to a complex 
nexus of social relations that locate them in 
social space partly as a function of their sex. 
Rather than constructing personality typol- 
ogies in which some types of people are ac- 
corded privileges over others, even so bland 
a privilege as the mental health of conformity 
to situational requisites, it would be far more 
beneficial for psychologists to investigate the 
means by which people in general participate 
in social processes that persistently rely on 
sex for differential allocation of resources, 
bases of power, social authority, and prestige. 


The Many Faces of Androgyny: 
A Reply to Locksley and Colten 


Janet T. Spence and Robert L. Helmreich 
University of Texas at Austin 


In their methodological and theoretical critique of the Bem Sex Role Inventory 
(BSRI) and the Personal Attributes Questionnaire (PAQ), Locksley and Colten 
assume that there is a singular androgyny theory to which the rationale and 
the psychometric properties of these instruments are inexorably tied and that 
each is intended to be a broad-gauged measure of masculinity and femininity 
or of global self-images of these concepts. The present authors, however, con- 
ceive of the PAQ as a specialized measure of socially desirable instrumental 
and expressive characteristics, objectively defined trait dimensions that distin- 
guish between the sexes to some degree and thus may be labeled “masculine” 
and “feminine.” Further, they hypothesize that these dimensions have complex 
and frequently weak relationships with other components of masculinity and 
femininity, although simultaneously having implications for significant areas of 
functioning. When the PAQ is viewed from this theoretical perspective, the 
authors suggest that Locksley and Colten’s criticisms directed at its rationale, 
its empirical properties, and its definition of androgyny become largely inappro- 
priate or irrelevant. 


In their article “Psychological Androgyny: 
A Case of Mistaken Identity?”, Locksley and 
Colten (1979) vigorously criticize the theory 
of psychological androgyny as well as both 
the rationale and the methodology of the ma- 
jor instruments used to measure androgyny— 
the Bem Sex Role Inventory (BSRI; Bem, 
1974) and the Personal Attributes Question- 
naire (PAQ; Spence & Helmreich, 1978; 
Spence, Helmreich, & Stapp, 1974, 1975). 
We find ourselves in agreement with a num- 
ber of their specific points. However, our over- 
all assessment of the research done with the 
PAQ in particular leads us to different con- 
clusions, in large part because we view our 
instrument and its relevance to androgyny 
theory, and to such broad concepts as mascu- 
linity and femininity and sex role identity, 
from a perspective that differs from that of 
Locksley and Colten and that of many other 
androgyny researchers. Within the context of 
our theoretical perspective, we will discuss 
some of Locksley and Colten’s major argu- 
ments, in an effort to clarify further concep- 
tual and methodological issues. 


What has come to be known as androgyny 
theory rests essentially on the proposition 
that the perpetuation of traditional sex role 
distinctions is dysfunctional; a society in 
which differences between the sexes were re- 
duced to the minimum dictated by anatomy 
would be not only more just but also healthier 
for its members. A companion proposition, 
articulated most clearly by Bem (1974), is 
that masculinity and femininity, so long re- 
garded as terms representing two ends of a 
single bipolar dimension, are independent di- 
mensions. Both masculine and feminine char- 
acteristics may coexist in the same individual, 
with those individuals exhibiting a relatively 
high degree of both being labeled “androgy- 
nous.”¹ In this theoretical context, masculin- 
ity and femininity are global, unidimensional 
concepts, referring to constellations of covary- 
ing attributes and behaviors that normatively 
distinguish between men and women, and/or 
to individuals’ conceptions of their own mas- 
culinity and femininity. 

Locksley and Colten (1979) express reser- 
vations about accepting the assumption that 
individuals high in both masculinity and 
femininity (in these global senses) are better 
adjusted than sex-typed individuals are, in 
both a psychological and a sociological sense. 
Other major criticisms concern the usefulness 
of these global notions of masculinity and 
femininity, and the conceptual and methodo- 
logical deficiencies of the BSRI and PAQ in 
measuring these constructs. It is to these 
latter issues that we shall largely address our- 
selves, confining our discussion to the PAQ, 
since both its theoretical rationale and its 
psychometric properties are not identical to 
those of other androgyny measures. 

Although acknowledging at several places 
that our views and Bem’s (e.g., 1975, 1977) 
are not identical, a pervasive theme under- 
lying Locksley and Colten’s critique is that 
there is a single androgyny theory to which 
the construction and content of the BSRI 
and the PAQ, as well as the research done 
with these instruments, is firmly tied. They 
also imply that the PAQ, as well as the BSRI, 
is—or is intended to be—a measure of such 
general notions as sex identity, sex stereotypes, 
or sex differences in personality. However, our 
conception of what the PAQ measures is quite 
narrow, and both our definition of androgyny 
and the theoretical suppositions in which our 
research with the PAQ has been embedded 
depart in several critical respects from an- 
drogyny theory as just described. It is not 
so much that we disagree with the tenets of 
that theory as the fact that we are not talking 
about quite the same things. 

To respond to the criticisms offered by 
Locksley and Colten (1979), it is crucial to 
present first the theoretical and empirical ra- 
tionale underlying the development of the 
PAQ. The 24-item version of the PAQ, used 
in all of our research following our initial in- 
vestigation (Spence et al., 1974, 1975), con- 
sists of certain clusters of socially desirable 
socioemotional trait descriptions reflecting 
what are typically labeled personality charac- 
teristics, to distinguish them from such other 
psychological attributes as cognitive abilities 
or styles, values, attitudes, and so forth. The 
instructions are general, making no reference 
to gender, and nothing in the content of the 
items makes the intent of the instrument ob- 
vious. Indeed, the items are ones that appear 
in identical or similar form on other conven- 
tional personality inventories, whose develop- 
ers and users have, at most, only incidental 
interest in sex differences. 

The development of the PAQ was guided 
simultaneously by theoretical and empirical 
considerations. The initial (55-item) version 
of the scale was drawn from a larger pool of 
items for which ratings of the typical man 
and the typical woman (personality stereo- 
types) and the ideal man and the ideal woman 
had been obtained. The 55 items chosen for 
the PAQ were selected from those judged by 
both sexes to distinguish between the typical 
man and the typical woman (both adults and 
individuals of college age). 

In using stereotype data to select items 
for the PAQ, we were in part expressing our 
conviction that traits about which there was 
consistent agreement would be a good source 
of items that would yield actual differences 
between the sexes and thus permit us to de- 
vise an instrument with which we could test 
a number of hypotheses, such as a dualistic 
versus a bipolar conception of masculine and 
feminine personality traits. The usefulness of 
this method of selection is demonstrated by 
the fact that sex differences in self-reports 
appeared in the expected direction on every 
item. 

1 Bem’s (1974) androgyny measure, the BSRI, was 
initially scored by finding the difference between 
scores on masculinity and femininity scales, and an- 
drogyny was defined as a relative balance between 
the two scores, regardless of their absolute level. We, 
along with others, have argued that, to correspond 
more closely to accepted usage, the term should be 
reserved for those relatively high in both dimensions, 
a usage that Bem (1977) now accepts and that was 
also adopted by Locksley and Colten. 

Items were assigned to subscales on the 
basis of these stereotype ratings as well as rat- 
ings of the ideal man and the ideal woman. 
Traits that were stereotypically more charac- 
teristic of males but socially desirable to some 
degree in both sexes were assigned to a Mascu- 
linity (M) scale, while traits that were more 
stereotypically characteristic of females but 
socially desirable in both sexes were assigned 
to a Femininity (F) scale. This method of as- 
signment resulted in two groups of items 
whose content by and large confirmed the 
speculations of a number of theorists, as well 
as popular thought, about the fundamental 
differences between the personalities of men 
and women. These differences have been char- 
acterized by Bakan (1966) as a distinction 
between masculine agency (a sense of self- 
assertion and self-protectiveness) and femi- 
nine communion (a sense of selflessness and 
need to identify with others) and by Parsons 
and Bales (1955) as a distinction between 
instrumental and expressive traits. Analyses 
of the data from several samples of college 
students indicated significant sex differences 
in M and F scale scores and also a slightly 
positive rather than strongly negative corre- 
lation between scales in each sex. The latter 
outcome refuted the implications of the bi- 
polar M-F model of these personality attri- 
butes, suggesting instead the usefulness of a 
dualistic approach. 

Use of the ideal ratings resulted in a third 
scale with more bipolar properties. This scale, 
labeled Masculine-Feminine (M-F), contains 
items for which ratings fell toward opposite 
poles for the ideal man and the ideal woman. 
The scale is mixed in content, containing items 
dealing with instrumental qualities, such as 
“aggressive” and “dominant,” which are seen 
as somewhat desirable for men but not for 
women, and other items, such as “excitable in 
a major crisis” and “cries easily,” which re- 
flect emotional vulnerability and are seen as 
somewhat desirable in women but not in men. 

Following our initial study, we shortened 
each scale to eight items, retaining those that 
showed the best psychometric properties and 
that also illustrated instrumental and expres- 
sive personality traits. (Thus, such items as 
“athletic” and “good at math and science” 
were dropped.) The current PAQ may there- 
fore be described as a quite conventional per- 
sonality test in the self-report mode, consist- 
ing of clusters of socially desirable instru- 
mental (masculine) and expressive (feminine) 
traits. The ultimate justification for the ap- 
pellations Masculinity and Femininity lies in 
the demonstration that the clusters of items 
discriminate between the sexes in their self- 
reports. 

The decision to limit the content of the 
scales to personality traits describing instru- 
mental and expressive qualities was a theo- 
retical one. Space limitations preclude an ex- 
plication of the considerations underlying this 
decision. They led us to conclude, however, 
that the essential core both of the socially 
sanctioned and expected differences in the per- 
sonalities of men and women and of the actual 
differences between them, as indicated in self- 
reports, lay within the realm of instrumental- 
ity and expressiveness. Implied in this state- 
ment is our expectation that the instrumental- 
expressive conceptualization has considerable 
transcultural and transgenerational validity 
in distinguishing between the sexes.² 

2 The PAQ does not contain all possible exemplars 
of instrumental and expressive traits. We do not an- 
ticipate that in any given group, actual or expected 
sex differences will necessarily be found for the en- 
tire universe of instrumental and expressive qualities 
or that exactly the same instrumental and expressive 
qualities will distinguish between men and women 
in all cultures and nationalities (see Spence & Helm- 
reich, 1978, for a further discussion of these issues). 
However, in several completed and ongoing cross- 
national investigations with the PAQ, limited as it 
is in content, we have found more similarities than 
differences with comparable United States samples. 
As we will discuss shortly, our PAQ results with 
United States samples are quite consistent across a 
broad range of chronological ages and socioeconomic 
groups. 

Purportedly biological sex differences in ex- 
pressive and instrumental personality charac- 
teristics have frequently been used to explain 
traditional instrumental versus expressive role 
divisions along sexual lines, and acceptance 
of the functional value of these divisions has 
led to justifications for socialization practices 
that attempt to assure the development of a 
high degree of expressiveness in girls and of 
instrumentality in boys. Development of a 
personality scale narrowly confined to “mas- 
culine” instrumental characteristics and “femi- 
nine” expressive characteristics allows ex- 
ploration of the validity of many suppositions 
about the relation between these personality 
dimensions and other types of gender-related 
phenomena. Our suspicions (which formal 
evidence is now verifying) were, on the one 
hand, that the relation between instrumental-expressive 
personality traits and other gender-related 
phenomena would turn out to be 
considerably weaker and/or more complex 
than had been presumed and, on the other 
hand, that these trait continua would have 
important implications for significant areas 
of functioning without regard to gender. 

Independent investigation of the effects of 
PAQ-defined masculine instrumentality and 
feminine expressiveness on criterion variables 
could proceed because the M and F scales 
were found to be essentially orthogonal. Once 
the separate relations of M and of F to cri- 
terion measures are discovered, their simulta- 
neous contribution to criterion variance be- 
comes important to determine. We have found 
that the median split method of jointly classi- 
fying subjects on M and F scores is a useful 
conceptual scheme for representing the com- 
bined influence of these scales on many de- 
pendent variables. For mnemonic purposes, 
the four resulting cells have been given verbal 
labels: Androgynous (above the median on 
both M and F), Masculine (above the me- 
dian on M, below on F), Feminine (below 
on M, above on F), and Undifferentiated (be- 
low on both). 
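
The fourfold classification just described can be sketched in a few lines of Python. This is only an illustration, not the authors' scoring procedure: the subject identifiers and scale scores are hypothetical, the medians are taken from the illustrative sample itself, and the treatment of scores exactly at the median (counted here as "below") is an assumption that the text does not specify.

```python
from statistics import median


def classify_median_split(scores):
    """Assign each subject to one of the four M/F median-split cells.

    scores maps a subject id to an (M, F) pair of scale scores.
    Assumption: a score equal to the median counts as "below".
    """
    m_median = median(m for m, _ in scores.values())
    f_median = median(f for _, f in scores.values())

    labels = {}
    for subject, (m, f) in scores.items():
        high_m = m > m_median
        high_f = f > f_median
        if high_m and high_f:
            labels[subject] = "Androgynous"
        elif high_m:
            labels[subject] = "Masculine"
        elif high_f:
            labels[subject] = "Feminine"
        else:
            labels[subject] = "Undifferentiated"
    return labels


# Hypothetical (M, F) scores for six subjects.
scores = {
    "s1": (28, 27),
    "s2": (29, 15),
    "s3": (14, 30),
    "s4": (12, 13),
    "s5": (20, 22),
    "s6": (25, 18),
}
labels = classify_median_split(scores)
for subject, label in labels.items():
    print(subject, label)
```

Passing in medians from a reference population instead of computing them from the sample at hand would be a small change to the function and corresponds to classifying one sample by another sample's norms.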

As should be obvious from this description, 
we introduced the term androgyny simply as 
a convenient label to identify individuals who 
score relatively high on both the M and the 
F scales of our particular instrument. Within 
this context, it is merely a label that corre- 
sponds to the definition of the term in stan- 
dard dictionaries (e.g., “1. having the charac- 
teristics or nature of both male and female,” 
Webster’s New Collegiate Dictionary, 1974). 
Locksley and Colten’s statement that with 
respect to its “ordinary sense . . . the term 
psychological androgyny clearly implies ‘of 
indeterminate sex, or not sex related’ ” (1979, 
p. 1023) is based on an extraordinary sense 
with which we are not familiar. In any event, 
androgyny, as we have used the term, has no 
particular theoretical import, being intended 
to indicate nothing more than a relatively 
high degree of both instrumental and expres- 
sive personality traits, as defined by the PAQ. 
The relation between androgyny in this per- 
sonality sense and other types of androgyny 
(e.g., interest in both “masculine” and “femi- 
nine” hobbies) must be determined empirically. 

3 Widespread confusion surrounds the use of this 
method. On the one hand, it has been accepted by 
some with uncritical enthusiasm as a kind of genuine 
typology, suitable for use on all occasions. However, 
the conceptual model that best represents the con- 
joint influence of M and F must be sought for each 
new type of data, not assumed a priori. The median- 
split method has proved to be broadly useful for 
this purpose, but there is no theoretical reason to 
anticipate that this will be universally true. On the 
other hand, the median-split method has come un- 
der attack primarily on the grounds of crudeness, 
loss of information, and inaccuracy of classification 
into types (e.g., Pedhazur & Tetenbaum, 1979). The 
frequently suggested solution to the problem is to 
apply more sophisticated techniques, such as multi- 
ple regression, to the data, a suggestion not further 
amplified. The median-split method is indeed crude 
and is not the method of choice if one is seeking 
to maximize predictive accuracy for individual sub- 
jects. Our intent, however, has been to provide a 
conceptual heuristic scheme that will reveal, in a 
simple and easily communicable form, the nature of 
the conjoint influence of M and F on the criterion 
variable. Assuming that the method is properly used, 
more complex analyses, including regression tech- 
niques, should provide no new conceptual informa- 
tion. One should seek other mathematical schemes, 
such as regression models, when the median-split 
method proves inadequate or when one is seeking 
to maximize variance accounted for by using the 
measures appropriately as continuous variables. In 
the case of such variables as our measure of self- 
esteem, the maximum R² is produced by an equation 
containing the partials of M and F and the inter- 
action term (M × F). However, without a priori 
predictions concerning the joint influence of M and 
F on criteria, the mechanical use of stepwise regres- 
sion can lead to conceptually misleading outcomes. 
See Spence and Helmreich (in press) for an extended 
discussion of these issues. 

Locksley and Colten express concern that 
“most psychological androgyny research has 
been conducted on college students and even 
on high school students,” and they suggest 
that “the variance in scores permitting the 
categorization of androgynous persons may 
indicate only the lack of press from the defini- 
tions and constraints of adult sex-segregated 
social roles” (1979, p. 1021). We have now 
collected data from more than 1,300 middle- 
aged adults (Spence & Helmreich, 1978, Note 
1). The most striking finding revealed by 
these data is their parallelism to the data ob- 
tained from college students in absolute scores 
on means and in variance of the scales, in 
the strength of sex differences, and in the per- 
centage classified into each of the four M-F 
categories we have defined—a parallelism that 
exists even when the middle-aged sample is 
classified using medians drawn from a college 
population. Significant sex differences have 
also been found in highly selective popula- 
tions, for example, between male and female 
PhD-holding scientists (Beane, 1976), and 
between male and female athletes (Helmreich, 
Spence, & Vacalis, Note 2). Further, our data 
(Spence & Helmreich, 1978) from “even high 
school students,” who were much more he- 
terogeneous in social origins than the samples 
described above, suggest that the pattern of 
sex differences, scale correlations, and distri- 
butions among PAQ categories is quite im- 
pervious to social class. We have recently de- 
veloped a child’s version of the PAQ (Simms, 
Davis, Foushee, Holahan, Spence, & Helm- 
reich, Note 3) and have found similar sex 
differences on each scale among 6- and 7-year- 
old children, as well as variability within each 
sex. 

In criticizing the BSRI, Locksley and Col- 
ten (1979) cite factor-analytic studies (e.g., 
Gaudreau, 1977; Moreland, Gulanick, Mon- 
tague, & Harren, 1978) that produced four 
factors, and they suggest that this further 
weakens the assumption of unidimensional 
constructs conforming to the original labels 
assigned to the inventory’s scales. Similarly, 
factor-analytic and discriminant-function data 
obtained from the BSRI by Pedhazur and 
Tetenbaum (1979) led these investigators to 
question the BSRI and the concept of an- 
drogyny on both theoretical and methodologi- 
cal grounds. Although there is a fair amount 
of overlap between the PAQ and the BSRI, 
and a moderate correlation between parallel 
scales, there are enough differences between 
the two instruments to suggest that the type 
of results found with the BSRI would not 
necessarily be found with the PAQ. The items 
contained on the BSRI are not confined to 
expressive and instrumental traits, and the 
methods of item-to-scale assignment were 
somewhat different (resulting in three gender- 
related scales on the PAQ vs. two on the 
BSRI). Finally, Gilbert and Strahan (1978) 
and Pedhazur and Tetenbaum (1979) have 
provided data suggesting that several of the 
BSRI F items are judged to be socially un- 
desirable for both sexes.⁴ 

We have undertaken additional psycho- 
metric investigations of the PAQ that sug- 
gest the unidimensionality of the M and F 
scales. These investigations included examina- 
tion of the structure of the item-correlation 
matrices obtained from high school, college, 
and adult respondents of each sex along with 
factor analyses. Using the techniques recom- 
mended by Dziuban and Shirkey (1974), the 
correlation matrices comprising the M and F 
scales of the PAQ were classified as highly 
appropriate for factor analysis in each sex 
in each sample.⁵ 

Both maximum likelihood and principal 
factor analyses of the M and F items—com- 
puted separately for each sex and each sam- 
ple—yielded similar results in each instance. 
Two large, oblique factors emerged, one re- 
producing the M scale, the other the F scale.⁶ 


4 The significance of inclusion of such items is sug- 
gested by results obtained with our recent exten- 
sion of the PAQ (EPAQ; Spence, Helmreich, & 
Holahan, in press) to include Masculinity and Femi- 
ninity scales containing socially undesirable traits. 
Very low correlations were found between parallel 
Masculinity scales and Femininity scales. 

5 The Kaiser-Meyer-Olkin measure of sampling 
adequacy and Bartlett’s test of sphericity were com- 
puted, and the off-diagonal elements of the anti- 
image covariance matrix were inspected for the 16 
items of the M and F scales and the full 24-item 
PAQ in six samples, 1,010 female and 757 male high 
school students, 670 female and 477 male college 
students, and 243 female and 208 male adults (aver- 
age age, 36 years). The results for each measure are 
reported in Helmreich, Spence, and Wilhelm (Note 4). 

6 Comrey (1978) discusses methodological problems 
in factor-analytic studies. Many of these are evi- 
dent in analyses of masculinity and femininity mea- 
sures (see, for example, the discussion by Pedhazur 
and Tetenbaum, 1979, of the BSRI). Our analyses 
(Helmreich, Spence, & Wilhelm, Note 4) involved 
use of several techniques, construction of varying 
numbers of factors, rotations varying the obliqueness 
of the solution, and replications in independent sam- 
ples. The convergence on a two-factor solution for 
the 16 items of the M and F scales thus seems 
methodologically justified. 

Parallel factor analyses of the full 24-item instru- 
ment (including items assigned to the M-F scale) 
were also computed. These also yielded two main 
factors, with the F factor remaining unchanged and 
all of the M-F items loading on the M factor. The 
resultant M factor combines instrumentality with a 
lack of emotional vulnerability (i.e., feelings not 
easily hurt, never cries, etc.). 

Consistent with the correlations obtained with 
the unit-weighted M and F scales, the two fac- 
tors had a low positive correlation (average 
r = .08 across the six samples). 

Of greatest interest are the results obtained 
from adult respondents. Our sample consisted 
of 243 females and 208 males, all of whom 
were married and the parents of elementary 
school children. These respondents were 
clearly familiar with the definitions and con- 
straints of adult, sex-segregated social roles. 
The fact that the same clusters of traits 
emerged in this sample provides reassuring 
evidence for the theoretically postulated trait 
dimensions. These data further suggest that 
the derivation of constituent masculine and 
feminine trait constellations from stereotypi- 
cally perceived aggregate sex differences is, 
contrary to Locksley and Colten (1979), a 
reasonable and valid method of operationally 
defining these constructs. The soundness of the 
method is shown in the stability of sex differ- 
ences across disparate groups as well as in the 
homogeneity of the trait clusters. (Indeed, 
the use of perceptions about aggregates to 
develop measures of individual differences rep- 
resents a traditional approach to personality 
test construction.) 

In sum, the PAQ is an instrument of re- 
stricted content, containing items describing 
personality traits of an expressive or instru- 
mental nature. Our theoretical conception of 
such traits is that they are internally located 
response predispositions or capacities that 
have considerable transituational significance 
for behavior but are neither conceptually 
equivalent to behavior nor its sole determi- 
nant. Trait dispositions interact both with 
situational factors and with other properties 
of the individual to determine the form and 
the intensity of his or her responses in any 
given instance. As Fishbein and Ajzen (1974) 
have demonstrated, general trait measures are 
capable of a high degree of predictive accuracy 
if behavior is aggregated over multiple acts 
and types of situations. Adequate prediction 
of singular acts or of behavior in limited set- 
tings typically requires additional information. 
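
The aggregation argument can be illustrated with a toy simulation. The sketch below is not a reanalysis of Fishbein and Ajzen's data. It assumes a hypothetical model in which each single act is a weak function of a stable trait plus large situation-specific noise, and every parameter value (sample size, number of acts, trait weight, random seed) is an arbitrary choice made for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_people=500, n_acts=20, trait_weight=0.3, seed=0):
    """Return (r for one act, r for the mean of n_acts acts).

    Assumed model: act = trait_weight * trait + unit-variance noise,
    with the noise standing in for situational influences.
    """
    rng = random.Random(seed)
    traits = [rng.gauss(0, 1) for _ in range(n_people)]
    acts = [
        [trait_weight * t + rng.gauss(0, 1) for _ in range(n_acts)]
        for t in traits
    ]
    single_act = [person_acts[0] for person_acts in acts]
    aggregate = [sum(person_acts) / n_acts for person_acts in acts]
    return pearson(traits, single_act), pearson(traits, aggregate)


r_single, r_aggregate = simulate()
print(f"trait vs. single act:    r = {r_single:.2f}")
print(f"trait vs. aggregate act: r = {r_aggregate:.2f}")
```

Under these assumptions the correlation with the aggregate is considerably larger than the correlation with any single act, because averaging cancels the situation-specific noise while leaving the trait contribution intact.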

Overt expression of the personality traits 
tapped by the PAQ is likely to be exaggerated 
when these response dispositions are congru- 
ent with situational demands (including role- 
related demands) and likely to be suppressed 
or modified when they are incongruent. An 
elegant example may be found in a study by 
Megargee (1969) in which high- and low- 
dominant individuals (as predetermined by a 
personality test) were paired and asked to 
choose a leader before beginning to work 
jointly on a laboratory task. In same-sex pairs 
and in mixed-sex pairs in which the male was 
high dominant (but particularly in the latter), 
the high-dominant member of the pair was 
chosen as the leader in the vast majority of 
instances. In mixed-sex pairs in which the 
female was high dominant, the low-dominant 
males ended up as the leaders about three 
quarters of the time. However, in most pairs, 
it was the dominant female who selected the 
leader, almost without exception designating 
the male. Thus, while the majority of these 
women conformed to implied sex role pres- 
sures, they actively utilized their disposition 
to dominate by manipulating the situation in 
a way that preserved the proprieties of male- 
female interaction. In this type of situation, 
dominant, assertive women, who neither 
valued conventional rules for the interactions 
between the sexes nor anticipated that role 
violation would result in negative conse- 
quences, would presumably have instead let 
nature (their own) take its course. 

Other studies, some of them cited by Locks- 
ley and Colten, also demonstrate the effects 
of role demands on behavior. The fact that 
behavior is jointly determined both by situa- 
tional variables and by relatively stable prop- 
erties of the individual cannot be used as a 
criticism of personality instruments such as 
the PAQ; it merely serves as a warning that 
one should not make predictions about be- 
havior as a function of PAQ scores without 
taking situational variables into account. 

Locksley and Colten take us to task for 
having neglected to point out (in an earlier 
version of our reply) that in the Megargee 
(1969) study, the “effect occurred regardless 
of whether the task was of a ‘masculine’ sort, 
involving industrial mechanical work, or of a 
‘feminine’ sort, involving clerical work and 
verbal talent” (p. 1027). However, Megargee 
selected the clerical-verbal task as an example 
of a sex-neutral task, the label of “feminine” 
being supplied by Locksley and Colten. Our 
vote goes with Megargee. It remains an open 
question as to how feminine a task must be 
before men, as a group, are willing to relin- 
quish leadership and women willing to assume 
it. Our reading of the attribution literature 
makes us suspect that when the task becomes 
one at which women are expected to be more 
competent than men, the leadership role will 
be shifted to women, with a concomitant 
downgrading of the importance of the task 
and the significance of being leader. 

Locksley and Colten go on to comment that 
an “investigator ought to be able to specify, 
in a theoretically plausible fashion, those situ- 
ations in which the personality trait should 
predict behavior and those situations in which 
the trait should not predict behavior. Con- 
gruence with situational demands is not a 
valid criterion, at least unless the contribution 
of situational demands can be distinguished 
from the contribution of the personality dis- 
position. After all, there is no a priori reason 
to assume that situational demands always 
overshadow the influence of personality dis- 
positions” (p. 1027). Apparently Locksley and 
Colten intend these remarks to be critical of 
our position, but we are hard put to determine 
what we are supposed to be disagreeing about. 
The importance of the Megargee study was 
in its demonstration that while certain forms 
of behavior were suppressed by high-dominant 
women, at the same time dominance behavior 
appeared in modified form. The study thus 
illustrated both the importance of situational 
factors, such as sex role demands, and that 
of trait dispositions such as measured by the 
PAQ. Criticizing the PAQ or our conceptions 
(or those of any investigator) because we can- 
not yet make detailed predictions about spe- 
cific situational variables fails to recognize 
the very primitive state of knowledge in this 
field. Many of the theoretical presuppositions 
underlying decades of masculinity-femininity 
research have been found to be wanting. 
Those of us in this general area of research 
are attempting to develop a conceptual frame- 
work that will serve as a guide to more fruit- 
ful empirical research that will, in turn, allow 
the development of a more valid and sophisti- 
cated theoretical model. 

One of the assumptions that has been re- 
jected is that masculine and feminine behav- 
iors or attributes are, of necessity, psychologi- 
cal opposites, two endpoints of a continuum. 
(We should note, however, that formal dem- 
onstration that masculine and feminine qual- 
ities are in fact essentially independent has 
been conducted only for personality instru- 
ments of limited content, such as the PAQ 
or the BSRI.) A presupposition continuing 
to dominate this area of research is that 
gender-related phenomena, whatever their na- 
ture or level of operation, tend to covary: 
preferences or adoptions of the many cate- 
gories of role-related behavior, sex differences 
in non-role-related attributes, sexual predilec- 
tions, and the like. This assumption has led 
to the use of such broad concepts as sex role 
identity or masculinity and femininity, with- 
out any systematic attempt to decompose 
them. It has also resulted in attempts to mea- 
sure these constructs by instruments of omni- 
bus content, surveying a diverse array of 
gender-related phenomena, or by specialized 
instruments such as the PAQ or the BSRI 
with limited content domains. We have been 
critical both of these broad concepts and of 
the methods used to measure them. Omnibus 
measures, as we have noted elsewhere, would 
be perfectly satisfactory if, as is frequently 
assumed in both lay and scientific circles, 
various classes of masculine and feminine at- 
tributes and behaviors were highly correlated. 
Similarly, if such correlations could be dem- 
onstrated, it would be legitimate to generalize 
from a measure of a narrow subclass of mas- 
culine and feminine attributes to other sub- 
classes or to the class as a whole. However, a 
number of sources of evidence suggest that 
gender-related phenomena are only loosely 
associated, and that the relationships among 
various kinds of masculine and feminine be- 
haviors tend to be weak or complexly deter- 
mined. Thus, global constructs of masculinity 
and femininity or of sex role identification, 
based on an additive composite of the spe- 
cific ways in which an individual resembles 
the typical member of his or her own sex, are 
likely to have little utility. 

A profitable way to proceed, we have sug- 
gested, is to identify the principal components 
or classes of psychological phenomena related 
to gender and to devise methods to measure 
them so that their interrelationships may be 
determined. Part of our current research pro- 
gram is devoted to the development of addi- 
tional measures which, like the PAQ or our 
Attitudes Toward Women Scale (AWS; 
Spence & Helmreich, 1972), tap a specific 
subset of sex-related phenomena, for example, 
overt behaviors in interactions with members 
of the opposite versus same sex, actual and 
preferred divisions of responsibilities in vari- 
ous areas within the homes of married couples, 
and so forth. Although the evidence supports 
our view that the intercorrelations among 
these measures and their interactions with 
situational variables are complex, it is pre- 
mature to speculate on the exact nature of 
these complexities. The outcome of these in- 
vestigations should suggest better articulated 
theoretical assumptions that should, in turn, 
suggest the directions in which future empiri- 
cal research might fruitfully go. (It may turn 
out, of course, that gender-related phenomena 
are more tightly bound together than we sus- 
pect. This, however, must be demonstrated 
empirically, not taken for granted.) 

Returning to the PAQ and the concept of 
androgyny, as defined by that instrument, our 
preceding discussion should make it obvious 
that we differ in several important respects 
from the position of Bem and her followers. 
First, we do not define the individual who is 
androgynous on the PAQ or BSRI (i.e., who 
has high scores on both the M and F scales) 
as a person who is “flexible” (i.e., who ex- 
hibits both masculine and feminine behaviors) 
across the general category of sex role behav- 
iors. Nor do we predict that these androgy- 
nous individuals are likely to differ markedly 
from others in sex role behaviors that do not 
directly involve expressive or instrumental 
skills. Bem, in contrast, implies that the BSRI 
can be used as a measure of general sex role 
flexibility and hence assumes that androg- 
ynes, as defined by their high scores on the 
M and F scales, will not be sex-typed in role 
behaviors. 

In her only experimental test of the sex role 
flexibility hypothesis, Bem (Bem & Lenney, 
1976) preselected groups of individuals who, 
on the BSRI, were balanced (“androgynous,” 
according to her initial definition), extremely 
sex-typed, or extremely cross-typed. In the 
experimental setting, subjects indicated their 
preference for, and comfort in, performing sex- 
appropriate versus sex-inappropriate tasks 
such as ironing a napkin or pounding a nail. 
Sex-typed individuals showed somewhat more 
avoidance of sex-inappropriate tasks than did 
the other groups, thus providing some sup- 
port for the flexibility hypothesis. In a re- 
cently completed conceptual replication of the 
Bem and Lenney study (Helmreich, Spence, 
& Holahan, in press) using unselected sub- 
jects, we have found only minimal correlations 
between M and F scores on the PAQ and 
preferences for stereotypically masculine and 
feminine tasks, and no particular elevation of 
this type of role flexibility among Androgy- 
nous (high M and F) individuals of either 
sex. (Stronger relations, however, were found 
with a sex role attitudes measure.) 

These findings lend credence to our sus- 
picion that the personality traits measured by 
the M and F scales of the PAQ and the BSRI 
are only minimally related to many sex role 
behaviors that do not quite directly require 
instrumental or expressive skills. Thus we are 
extremely hesitant to generalize the findings 
obtained with these scales to other gender- 
related phenomena. It should not be assumed 
without empirical demonstration, for example, 
that masculine and feminine attributes in do- 
mains other than socially desirable instrumen- 
tal and expressive personality traits form in- 
dependent, orthogonal dimensions. Nor can 
our data on the positive relation between self- 
esteem and the PAQ M and F scales (Spence 
& Helmreich, 1978; Spence et al., 1975) be 
used to conclude that individuals who are an- 
drogynous in a role sense are necessarily 
“healthier” or exhibit higher self-esteem than 
nonandrogynous individuals. 

We indicated earlier that on theoretical 
grounds, we were impressed that the empirical 
process by which the PAQ scales were gen- 
erated supported the conceptions of theorists 
such as Parsons and Bales (1955) concerning 
the distinction between masculine instrumen- 
tal traits and feminine expressive traits. 
Locksley and Colten, however, express discom- 
fort over this outcome. The instrumental-ex- 
pressive distinction, they state, is not a good 
indication of the masculine-feminine dichot- 
omy because “traditionally female activities, 
such as mothering, require instrumental be- 
havior in order to be carried out effectively” 
(1979, p. 1021). Once more we see the invidi- 
ous effects, endemic in the sex role literature, 
of the failure to distinguish between personal- 
ity traits and sex role behaviors and respon- 
sibilities. Ironically, Talcott Parsons, who 
originated the proposal that the division in 
role responsibilities between the sexes could 
be characterized by the terms instrumental 
and expressive, and who also postulated that 
effective role performance demanded instru- 
mental or expressive qualities of personality, 
did not confuse the conceptual distinction be- 
tween traits and roles. However, we concur 
with Locksley and Colten, and, indeed, have 
said elsewhere (e.g., Spence & Helmreich, 
1978), that it is fallacious to assume that the 
execution of expressive, feminine roles requires 
exclusively expressive personality traits; simi- 
larly, implementation of instrumental, mascu- 
line roles can benefit from expressive personal- 
ity characteristics (e.g., management roles 
involving interpersonal relations). But as long 
as the difference in the two uses of instru- 
mental and expressive is understood, each may 
legitimately be used to describe the differ- 
ences between masculine versus feminine roles, 
and between masculine versus feminine per- 
sonality traits. 

We also reject the notion that, on the level 
of personality, a high degree of instrumental- 
ity is necessarily required for the effective 
performance of all stereotypically masculine 
tasks or a high degree of expressiveness for 
the performance of all stereotypically feminine 
tasks. Nonetheless, task analyses suggest that 
there is some degree of conjunction and that 
Parsonian theory, while perhaps overdrawn, 
is not without foundation. For example, some 
female-dominated professions, such as elemen- 
tary education and social work, place a spe- 
cial premium on expressive or communal ca- 
pacities and motives. There is also evidence 
(Wertheim, Widom, & Wortzel, 1978) to 
suggest that individuals of both sexes who are 
attracted to such professions score higher on 
the PAQ F scale than do their same-sex peers 
entering male-dominated professions. The lat- 
ter, in turn, score higher on the M (instru- 
mental) scale. Similarly, we have found that 
academic scientists, particularly female sci- 
entists, are higher in M (but no different in 
F) than unselected samples of men and women. 

Data such as these suggest that instru- 
mental and expressive characteristics have im- 
portant implications for functioning in a 
number of significant areas. They further sug- 
gest that many activities that are normatively 
masculine (i.e., male dominated) demand in- 
strumental attributes for their execution 
(whether or not they also benefit from ex- 
pressivity) and that many activities that are 
normatively feminine (female dominated) de- 
mand expressive talents (whether or not they 
also demand instrumental qualities). Evidence 
also shows that differential socialization prac- 
tices employed with boys and with girls ac- 
tively encourage the development of instru- 
mental attributes in the boys and expressive 
attributes in the girls (e.g., Block, 1973), in 
the at least partially correct belief that these 
personal qualities will be required for the dis- 
charge of traditional sex roles. The wisdom 
and justice of perpetuating sex role distinc- 
tions aside, there is failure to recognize that 
functioning may be enhanced by both types 
of qualities. Even worse, children, boys in 
particular, may be actively discouraged from 
developing and manifesting cross-sex charac- 
teristics, in the mistaken belief that instru- 
mentality and expressivity preclude one an- 
other (Foushee, Helmreich, & Spence, 1979). 

Locksley and Colten (1979), however, de- 
nigrate the value of identifying the instru- 
mental-expressive personality distinction as 
one component of masculinity-femininity by 
attacking the use of stereotype data to de- 
rive the PAQ and BSRI scales. A variety of 
arguments is advanced to buttress this posi- 
tion, several of them following from the as- 
sumption that the intent of these instruments 
was to measure global self-concepts of mas- 
culinity and femininity and/or concepts of 
masculinity and femininity in prototypic males 
and females. We will not respond to this set 
of arguments because, as we have indicated, 
our theoretical conception of the PAQ is that 
it is a more specialized instrument. 

Other criticisms, however, are germane. 
They dispute the notion of a general stereo- 
type, even on the specialized level of per- 
sonality-trait attributions, citing as evidence 
a study by Clifton, McGrath, and Wick 
(1976). These investigators provided respon- 
dents with a list of adjectives, asking them to 
check those items that characterized the typi- 
cal housewife, bunny, club woman, career 
woman, and woman athlete. Little commonal- 
ity was found across roles in adjective choices. 
These results, while interesting for the light 
they shed on specific stereotypes, are quite 
to be expected. To use another example, one 
would not be surprised to find that, if asked 
to characterize men who are professional foot- 
ball players, people would agree that they are 
taller, stronger, and have fewer front teeth 
than nonathletically inclined males of the 
same age (all stereotypes that objective evi- 
dence would doubtless show are valid). This 
demonstration would not destroy the impor- 
tance of the known difference between men 
and women in height and muscular strength, 
nor would it tell us much about people’s be- 
liefs (or the objective facts) about the num- 
ber of missing front teeth in the typical mem- 
ber of each sex. We would also expect that 
female football players would be judged to 
be bigger and stronger than women in gen- 
eral—but not as big or strong as men. The 
existence of within-groups stereotypes does 
not demonstrate the nonexistence of between- 
groups stereotypes. 

Locksley and Colten (1979), however, ap- 
pear to argue from the Clifton et al. (1976) 
data that general stereotypes about the per- 
sonality traits of the typical or the ideal man 
and of the typical or the ideal woman do not 
exist, or that, if they do exist, they are to be 
found only among high school and college 
students and others who have not experienced 
adult role constraints and conceptions, and/ 
or that they represent stereotypes about the 
prototypic “housewife/mother and career 
man.” In developing the PAQ, we obtained 
two types of stereotypic ratings (from inde- 
pendent samples), one for the typical male 
and the female adult and the other for the 
typical male and the female college student 
(who has not yet married or launched upon 
a career). Only those items for which signifi- 
cant stereotyping occurred under both sets of 
rating instructions were chosen for our ques- 
tionnaire. Further, inspection of stereotype 
data obtained from older samples (e.g., 
Broverman, Vogel, Broverman, Clarkson, & 
Rosenkrantz, 1972; Rosenkrantz, Vogel, Bee, 
Broverman, & Broverman, 1968) does not 
confirm the contention that these stereotypes 
can be found only among younger inexperi- 
enced groups. Although cultural and subcul- 
tural differences in stereotypes about specific 
traits may well exist (Spence & Helmreich, 
1978), what is most striking about the extant 
findings is the amount of agreement, over a 
wide age span of raters and rating targets, 
about the types of characteristics that dis- 
tinguish the sexes. Finally, as we have noted, 
sex differences in self-reports on the PAQ M 
and F scales have also been found not merely 
in college students but also in diverse sam- 
ples varying widely in age, ethnicity, and so- 
cioeconomic background. 

Locksley and Colten claim further that 
“general sex stereotypes may be too global 
for interpreting and guiding behavior at the 
level of individual self-perception or self-di- 
rection” (1979, pp. 1021-1022). Two issues 
are involved here: the influence of sex stereo- 
types in guiding reactions toward others and 
in guiding reactions toward the self. With re- 
spect to the former, the danger of stereotypes 
is not only that they tend to exaggerate both 
the number of temperamental characteristics 
that differentiate the sexes and the magnitude 
of these differences, but also that they too 
often result in crucial policy decisions dis- 
criminating against women and girls, based 
on the actual or supposed characteristics of 
females as a class versus males as a class. 
Women should be excluded from positions of 
authority because they are too emotional; 
girls should not be allowed to play Little 
League baseball because they are too fragile; 
women should not be admitted to graduate 
school because they are not highly motivated 
and are likely to drop out to get married and 
have babies; and so forth ad nauseam. De- 
termination of the conditions under which in- 
formation supplied about an individual man 
or woman will reduce the role of sex stereo- 
types in judgments about that individual is 
an important area of investigation. However, 
instances in which sex stereotypes have been 
the primary determinant in reaching decisions 
are too numerous and too serious to dismiss 
as trivial the influence of general stereotypes 
on people’s lives. 

The influence of sex stereotypes about mas- 
culine versus feminine personality character- 
istics in guiding self-behavior is a complex 
question. We should point out first that we 
as investigators, rather than the respondents, 
have labeled the PAQ scales “masculine” and 
“feminine,” labels justified by the fact that 
sex differences in self-report are consistently 
found. The same appellation can be applied 
to other types of personal qualities and be- 
havior that objectively differentiate the sexes, 
whether or not there is general awareness 
about the existence of particular types of sex 
differences and whether or not they are posi- 
tively sanctioned. But, as we have written 
elsewhere: 


Masculinity and femininity can also be regarded as 
global aspects of the self-concepts that men and 
women directly identify in these or equivalent terms 
(such as being a “real man” or a “real woman”). 


We then went on to state: 


Individuals . . . have organized belief systems, how- 
ever poorly articulated, about the psychosocial mean- 
ing of being “a man” or “a woman” and can be 
expected to have incorporated these beliefs into their 
sense of self. These belief systems, our observations 
suggest, are compounded of many elements: assump- 
tions about appropriate sex-roles, characteristics of 
the self such as personality attributes and cognitive 
abilities, physique and physical appearance, styles of 
speech and body movement, sexual behavior, and so 
forth. One can expect that individuals belonging to 
the same culture or subculture will be reasonably 
similar in identifying the factors contributing to 
masculinity and femininity in themselves and others 
in the sense that they will be drawn from a common 
pool and will not be completely idiosyncratic. Indi- 
viduals’ definitions of masculinity and femininity, 
however, are likely to be based on complexly weighted 
sets of indicators that not only vary from one per- 
son to another but also change with age and may 
even differ when individuals are assessing themselves 
as opposed to others. (Spence & Helmreich, 1978, pp. 
115-116) 


How salient instrumental and expressive 
characteristics tend to be in the modal indi- 
vidual’s self-concept of masculinity or femi- 
ninity remains problematical. Hence the avail- 
ability of these characteristics as models for 
more or less self-conscious guidance of behav- 
ior in situations in which gender per se is 
salient is unknown. 

We have also suggested the importance of 
determining the nature of individuals’ self- 
concepts of their own masculinity or feminin- 
ity and their global conceptions of masculinity 
or femininity in others, as well as the impor- 
tance of outlining some of the elements that 
might enter into these conceptions. Locksley 
and Colten have made the valuable proposal 
that techniques from cognitive psychology 
could fruitfully be employed to discover the 
structure of these global concepts. We regard 
such efforts as complementing, rather than 
supplanting, the study of masculinity and 
femininity, objectively defined from the per- 
spective of the investigator. Whatever the 
place of instrumentality and expressivity in 
these global concepts, the fact remains that 
these personality dimensions have been shown 
to be significant determinants of behavior and 
that they do differentiate the sexes to some 
degree. 

In the final section of their critique, Locks- 
ley and Colten observe that research and the- 
ory on masculinity and femininity have al- 
ways emphasized the contribution of sex roles 
to adaptation or adjustment. In fact, they 
state, claims for the validity of the constructs 
have often been based on their ability to dis- 
criminate between the adjustment levels of 
criterion groups. In contemporary theory, su- 
perior adjustment is claimed for androgynous 
individuals, whereas previously, superior ad- 
justment was attributed to sex-typed men 
and women. The authors contend that this 
emphasis on adjustment by current androg- 
yny theorists has led to two major problems. 
The first, they suggest, is a confusion between 
predictive validity and construct validity. (As 
they define the terms, predictive validity re- 
fers to a confirmation of the hypotheses de- 
rived from a theory, whereas construct valid- 
ity refers to demonstrations that a measure 
taps what it purports to measure.) They also 
imply that efforts to establish the construct 
validity of the BSRI and the PAQ have been 
lacking, predictive validity having been mis- 
taken for construct validity. The second prob- 
lem, they state, reflects a confusion between 
psychological and sociological levels of analysis. 

We agree that there has been confusion be- 
tween construct and predictive validity, but 
our analysis differs from Locksley and Col- 
ten’s. With the exception of her study with 
Lenney (Bem & Lenney, 1976), Bem’s experi- 
mental studies have set up situations that 
were designed to elicit quite directly either 
instrumental or expressive behaviors. The re- 
sults of these studies (Bem, 1977) generally 
showed that in situations in which instru- 
mental behaviors were elicited, individuals 
with high M scores on the BSRI (Androgy- 
nous and Masculine individuals, according to 
the four-way categorical scheme) showed more 
of that type of behavior than did those low 
in M; similarly, in situations in which ex- 
pressive behavior was appropriate, individuals 
high in F (Androgynous and Feminine) 
showed more of that type of behavior than 
did those low in F. Alone among the categori- 
cal groups, Androgynous subjects were thus 
high in both types of behavior. These data 
simultaneously provide evidence for the con- 
struct validity of the BSRI M and F scales 
as well as a behavioral demonstration of what 
was found in the correlational analysis of M 
and F self-report scores: namely, that it is 
possible to possess a high degree of both in- 
strumental and expressive qualities. 

In the Bem and Lenney (1976) study, a 
relation was determined between BSRI scores 
and preference for simple gender-related tasks 
whose performance had no discernible connec- 
tion with instrumentality and expressiveness. 
Whether this study is relevant to the con- 
struct or predictive validity of the BSRI de- 
pends on one’s theory. Bem apparently re- 
gards her instrument as a general measure of 
sex role preferences and behaviors. According 
to this theoretical conception of the BSRI, 
demonstration that Androgynous individuals 
show less sex typing in reaction to gender- 
related tasks than do other types of individ- 
uals (or that individuals who are sex typed 
are more sex typed in their task reactions) 
would thus fall into the category of construct 
validity. Our conception of the BSRI, and 
even more particularly of the PAQ, is that 
these measures primarily tap instrumental and 
expressive trait dispositions. Our theory fur- 
ther leads us to anticipate, at best, only mini- 
mal relation between M and F scores and the 
type of tasks devised by Bem and Lenney 
(1976). The evidence seems to favor the weak 
relation postulated by our theory rather than 
the strong one postulated by Bem. While 
resolution of these rival views awaits addi- 
tional data, the methodological point is clear: 
according to our theory, this type of experi- 
ment is addressed to predictive, not construct, 
validity. Negative results (i.e., failure to find 
a relation to behavior as a function of PAQ 
or BSRI scores) would strengthen the pre- 
dictive validity of our theory but would not 
undermine the construct validity of these in- 
struments, as we conceptualize them. 

In commenting on our contention that the 
masculine and feminine characteristics tapped 
by the PAQ cannot necessarily be assumed to 
be highly related to other components of mas- 
culinity and femininity, Locksley and Colten 
note that it is “unsatisfactory to show that 
the . . . PAQ scales simply tap the traits they 
contain” (1979, p. 1028, Footnote 2). This 
is not the place to review a growing literature 
that shows the relation between the PAQ 
scales and a number of other types of self- 
report and behavioral measures (e.g., Helm- 
reich & Spence, 1978; Klein, 1978; Spence 
& Helmreich, 1978). We note only that these 
other measures sometimes show significant 
main effects for sex of subject and sometimes 
do not; when sex differences do occur, they 
often—but not universally—disappear when 
PAQ scores are taken into account. The in- 
strumental and expressive characteristics mea- 
sured by the PAQ show promise of contribut- 
ing to our understanding of significant real- 
life behaviors, regardless of the gender of the 
individual in which they occur, and also of 
helping us to unravel some of the mysteries 
of sex differences. 

One of our early findings, replicated numer- 
ous times with many types of samples, is that 
both M and F scores (but particularly the 
former) were positively correlated with a self- 
report measure of social competence and self- 
esteem. Androgynous individuals of both sexes 
—those relatively high in both M and F— 
scored highest, followed by Masculine, Femi- 
nine, and Undifferentiated. Similarly, we have 
found that clients of a student counseling 
center scored markedly lower on the M and 
F scales than unselected students did. We can 
find nothing that we have said about these 
data or about the rationale of the PAQ that 
would permit Locksley and Colten’s curious 
conclusion that we regard these findings as 
support for the construct validity of our ques- 
tionnaire. 

We indicated above that empirical data 
gathered with the PAQ suggest that both the 
M and F scales are positively related to self- 
esteem and negatively related to neuroticism, 
anxiety, and other problem behaviors. Since M 
is more strongly correlated with the indices 
of adjustment than F, men and women classi- 
fied as Masculine (high M, low F) receive 
scores that are only slightly different from 
those classified as Androgynous (high M, 
high F). Locksley and Colten, however, sug- 
gest that feminine characteristics are not 
merely more weakly related to adjustment 
than are masculine characteristics but that 
the relationship may be negative in direction, 
thus giving the edge to Masculine rather 
than to Androgynous individuals. In sup- 
port of this claim, they cite a recent study 
by Jones, Chernovetz, and Hansson (1978), 
employing the BSRI, in which “masculine” 
subjects were found to have greater self- 
esteem, fewer neurotic and alcohol prob- 
lems, and the like, than “androgynous” 
subjects had. The composition of these 
groups, however, was based on the now dis- 
carded subtractive method of scoring the 
BSRI, in which androgyny is defined as a 
balance between M and F, regardless of level, 
rather than as a relatively high degree of 
both M and F. However, Jones et al. (1978) 
went on to isolate groups of subjects who 
corresponded to our definitions of androgy- 
nous and undifferentiated subjects (above the 
median on both scales vs. below the median 
on both scales). Although not always signifi- 
cant, the means of indices of adjustment were 
uniformly higher in both sexes for the an- 
drogynous individuals than for the Undiffer- 
entiated individuals. What a comparison of 
Androgynous and Masculine groups would 
have shown is unknown. Even more trouble- 
some, from our perspective, is the use of the 
BSRI rather than the PAQ. The BSRI F 
scale contains a number of items (some of 
which are listed by Locksley and Colten) that 
are not clearly expressive, communal traits, 
as we conceive of them, and/or, as we noted 
earlier, are socially undesirable for women as 
well as men. In our recent work with an ex- 
panded version of the PAQ that includes 
scales tapping masculine and feminine traits 
that are socially undesirable, we have found 
correlations with measures of neuroticism and 
acting-out behaviors that are opposite in sign 
for socially desirable and undesirable charac- 
teristics (Spence et al., in press). 
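
The contrast drawn in this paragraph between the subtractive (balance) scoring method and the median-split method can be made concrete with a short sketch. The Python below is illustrative only: the function names, the toy scores, the tie rule (a score exactly at the median counts as low), and the width of the difference band are assumptions introduced here, and a simple M minus F difference stands in for whatever cutoff a particular study applied. It is not taken from the PAQ or BSRI scoring instructions.

```python
from statistics import median


def classify_median_split(m_scores, f_scores):
    """Four-way scheme: place each respondent relative to the sample
    medians of the M and F scales (ties count as low; an assumption)."""
    m_med, f_med = median(m_scores), median(f_scores)
    labels = []
    for m, f in zip(m_scores, f_scores):
        high_m, high_f = m > m_med, f > f_med
        if high_m and high_f:
            labels.append("Androgynous")
        elif high_m:
            labels.append("Masculine")
        elif high_f:
            labels.append("Feminine")
        else:
            labels.append("Undifferentiated")
    return labels


def classify_subtractive(m_scores, f_scores, band=1.0):
    """Balance scheme: 'Androgynous' means M and F are close to each
    other, regardless of level. The band width is hypothetical."""
    labels = []
    for m, f in zip(m_scores, f_scores):
        diff = m - f
        if abs(diff) <= band:
            labels.append("Androgynous")
        elif diff > 0:
            labels.append("Masculine")
        else:
            labels.append("Feminine")
    return labels


# Toy data: four hypothetical respondents (high-high, low-low, high-low, low-high).
m = [30, 10, 28, 12]
f = [29, 11, 12, 27]

print(classify_median_split(m, f))
print(classify_subtractive(m, f))
```

With these toy scores the second respondent, who is low on both scales, is labeled Undifferentiated by the median split but Androgynous by the balance method. That is the sense in which groups composed under the subtractive method are not comparable to the Androgynous category as defined by high scores on both M and F.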

It is also critical to note that we have not 
used the data indicating a positive relation 
between the socially desirable M and F scales 
and several measures of adjustment and self- 
esteem to predict that Androgyny (as de- 
fined by the PAQ) will always lead to the 
most “desirable” or “adaptive” behavior. Our 
reluctance is twofold. First, our empirical 
findings suggest that, whether considered 
separately or in combination, the M and F 
scales do not always relate to criterion vari- 
ables in exactly the same way with respect 
either to magnitude or to direction. There- 
fore, the order of the categorical groups in 
terms of criterion responses does not always 
put Androgynous subjects at one end of a 
continuum and Undifferentiated subjects at 
the other. Further, as Locksley and Colten 
(1979) note, terms such as “adaptive” or 
“effective” are often value laden and politi- 
cized. One example from our research will 
serve as an illustration. We have recently de- 
veloped a multifactor measure of achievement 
motivation (Helmreich & Spence, 1978) that 
has demonstrated considerable construct and 
predictive validity. Significant differences in 
achievement-scale scores as a function of 
PAQ M and F category have also been found. 
For one of the scales, Competitiveness, the 
order in each sex is Masculine, Androgynous, 
Undifferentiated, and Feminine. Dependent 
on political philosophy, this result could be 
viewed as evidence for the superiority of the 
Androgynous individual over the Masculine— 
or the reverse. We have for these reasons re- 
jected the denotations androgyny researchers 
or androgyny theorists as descriptions of our 
own efforts, writing: 


We are wary ... of the surplus meaning, both 
ideological and empirical, that has accrued to “an- 
drogyny” as a label for the general model. . . . 
Our predictive model is . . . an open, evolving, 
dualistic one and in this sense is not “androgynous.” 
(Spence & Helmreich, 1978, p. 109) 


In suggesting a confusion between psycho- 
logical and sociological levels of analysis, 
Locksley and Colten indicate that individuals 
who ignore sex-contingent expectations may 
do so at great cost and in this sense they 
may show behavior that is dysfunctional and 
maladaptive. Self-acknowledged androgyny 
theorists would be the first to admit, indeed 
insist, that this is true in contemporary so-
ciety. In the long view, they argue that a
society in which sex role distinctions were 
minimized would be healthier for all of its 
members. In the world of the here and now,
they deny that those who fly in the face of 
traditional expectations on what Locksley and 
Colten describe as the sociological level are, 
on the psychological level, neurotic misfits, 
uncertain in their gender identification, sex- 
ually abnormal, and so forth. To the contrary, 
such individuals may be “healthier” than 
those who conform to conventional norms. 

These speculations may well be valid. How- 
ever, individuals who are androgynous in a 
role sense are not necessarily the same indi-
viduals who are androgynous on the PAQ or
the BSRI. Data bearing on the hypotheses of
androgyny theorists, to say nothing of Locks-
ley and Colten’s counterpropositions, have yet
to be gathered.


Conclusions 


From our theoretical perspective, the search 
for global measures of masculinity and femi- 
ninity or sex role identity—measures that al- 
low individuals to be placed along a quantita- 
tive dimension—is a snare and a delusion. 
The classes of psychological attributes and 
behavioral patterns that distinguish between 
men and women at a given time and a given 
culture are not only multitudinous but also 
may have different roots and may vary rela- 
tively independently across individuals. Still 
different findings may emerge when individ- 
uals’ conceptions of their own masculinity or 
femininity are probed. There are many mas- 
culinities and femininities, and androgyny 
has many identities. The masculinity and 
femininity measured by the PAQ refers to 
a limited set of socially desirable instrumental 
and expressive personality traits, which are 
but one possible component of the global 
self-conceptions that Locksley and Colten 
discussed. Thus, from our theoretical vantage 
point, many of Locksley and Colten’s meth- 
odological criticisms are rendered irrelevant. 
Androgyny on the level of sex role behaviors 
(the area of inquiry to which their critique 
is primarily directed) is a worthy topic of in- 
vestigation that awaits the development of 
more appropriate measuring instruments than 
the BSRI or the PAQ.


and Locksley-Colten Critiques


Sandra Lipsitz Bem


The theory behind both the Bem Sex Role Inventory (BSRI) and Bem’s an-
drogyny research, as well as particular issues raised in the critiques of Pedhazur 
and Tetenbaum and of Locksley and Colten, are discussed. It is noted that 
(a) the BSRI is based on a theory about both the cognitive processing and the 
motivational dynamics of sex-typed and androgynous individuals; (b) the strat- 
egy of item selection for the BSRI followed directly from the theory and 
utilized established techniques for test construction; (c) a short BSRI has been 
developed in accordance with the results of various factor analyses; (d) current 
research is testing the hypothesis that sex-typed and androgynous individuals 
differ in the extent to which gender serves as a cognitive schema; and (e) the 
concept of androgyny contains an inner contradiction and hence a built-in
obsolescence.


The methodological critique of the Bem Sex 
Role Inventory (BSRI) by Pedhazur and 
Tetenbaum (1979) rests on a misunderstand- 
ing of both the purpose of the instrument and 
the theory underlying it. The more conceptual 
critique by Locksley and Colten (1979) re- 
duces in most of its particulars to empirical 
questions about the generality of the concept 
of androgyny and about the domain of ap- 
plicability of its associated measuring instru- 
ments. Accordingly, this response begins with 
an explication of the theory on which the 
BSRI is based, then proceeds to a discus- 
sion of some of the specifics of the Pedhazur— 
Tetenbaum critique, and ends with an over- 
view of research currently in progress, re- 
search that ought to answer some of the ques- 
tions raised by Locksley and Colten.


The Theoretical Rationale Behind
the BSRI 


The recent debate in personality—social psy- 
chology over the degree to which individuals 
show transsituational consistencies in their
behavior has led to the suggestion that we 
ought to reverse our usual assumption of con- 
sistency as given and inconsistency as prob- 
lematic and, instead, adopt the view that it 
is the phenomenon of consistency that re- 
quires explanation (D. J. Bem, 1972). Such 
an approach shifts the burden of proof and 
leads us to ask why a person’s behavior might 
display consistency rather than why it does 
not. Such an approach also suggests that be- 
havioral consistency might itself be an im- 
portant individual difference variable (e.g., 
Bem & Allen, 1974; Campus, 1974). 

In the special case of sex roles, this shift 
of emphasis brings two idealized groups of 
individuals into focus: those “sex-typed” in- 
dividuals who restrict their behavior in ac- 
cordance with cultural definitions of sex-ap- 
propriate behavior, and those “androgynous” 
individuals who do not. This leads us for the 
first time to view the situational adaptability 
(“inconsistency”) of the androgynous group
as the “unmarked” norm—the given, the base-
line—and to regard the sex-stereotyped con-
sistency of the sex-typed group as “marked” 
or problematic, as the phenomenon to be ex- 
plained. 

Because the BSRI was developed to “cap- 
ture” these particular groups of individuals, 
its construction was based on two specific 
theoretical assumptions: (a) Largely as a 
result of historical accident, the culture has 
clustered a quite heterogeneous collection of 
attributes into two mutually exclusive cate- 
gories, each category considered both more 
characteristic of and more desirable for one 
or the other of the two sexes. These cultural 
expectations and prescriptions are well known 
by virtually all members of the culture. (b) 
Individuals differ from one another in the 
extent to which they utilize these cultural 
definitions as idealized standards of feminin- 
ity and masculinity against which their own 
personality and behavior are to be evaluated. 
In particular, the sex-typed individual is 
highly attuned to these definitions and is 
motivated to keep her or his behavior con- 
sistent with them, a goal she or he presuma- 
bly accomplishes both by selecting behaviors 
and attributes that enhance the image and 
by avoiding behaviors and attributes that vio- 
late the image. In contrast, the androgynous 
individual is less attuned to these cultural 
definitions of femininity and masculinity and 
is less likely to regulate her or his behavior in 
accordance with them. The BSRI is thus 
based on a theory about both the cognitive 
processing and the motivational dynamics of 
sex-typed and androgynous individuals. More- 
over, empirical research on the behavioral 
correlates of sex typing and androgyny has 
so far confirmed that it is serving its intended 
conceptual purposes (cf. S. L. Bem, 1975; 
Bem & Lenney, 1976; Bem, Martyna, & Wat- 
son, 1976; Ickes & Barnes, 1978; Russell, 
1978). 


The Construction of the BSRI 


The strategy utilized in the construction 
of the BSRI follows directly from these theo- 
retical premises. Because the BSRI is founded 
on a conception of the sex-typed individual 
as someone who is highly attuned to cultural
definitions of sex-appropriate behavior and
who uses such definitions as the ideal stan- 
dard against which her or his own behavior 
is to be judged, items for the BSRI were se- 
lected not on the basis of sex differences in 
self-report, as most previous masculinity- 
femininity inventories have been compiled, 
but on the basis of judges’ ratings of the 
culturally defined desirability of various at- 
tributes for each of the two sexes. In con- 
trast to self-reports on such items, these cul- 
tural definitions were expected to be widely 
known and to be quite stable across time and 
from sample to sample. The BSRI is thus 
designed to assess the extent to which the
culture’s definitions of desirable female and 
male attributes are reflected in an individ- 
ual’s self-description. More specifically, the 
BSRI presents an individual with a hetero- 
geneous collection of attributes and assesses 
the extent to which the individual clusters 
this collection into the two categories desig- 
nated by the culture as more desirable for one 
or the other of the two sexes. 

In order to select items for the Femininity 
and Masculinity scales of the BSRI, under- 
graduate judges were asked to rate the desira- 
bility of approximately 200 personality attri- 
butes either “for a woman” or “for a man.” 
No judge was asked to rate both. A personal- 
ity characteristic was then defined as femi- 
nine or masculine (and hence eligible for the 
Femininity and Masculinity scales of the 
BSRI) if, and only if, it was judged to be 
significantly more desirable in American so- 
ciety for one sex than for the other by four 
independent samples of judges. The assump- 
tion that the culture’s definitions of desirable 
female and male personalities are widely 
known implies that virtually any sample of 
American adults would be qualified to serve 
as knowledgeable informants with respect to 
these cultural definitions. Moreover, our in-
structions to the judges emphasized that we
were interested not in the judges’ personal
opinions of how desirable these various attri-
butes were but in their judgment of how
American society would evaluate the various 
attributes. The consistency of the ratings 
across four independent samples of judges is 
strong evidence that the BSRI is tapping
widely known cultural definitions, as stipu-
lated by the theory.

As Pedhazur and Tetenbaum (1979) point 
out, there has recently been a discussion in 
the literature as to how best to score the 
BSRI. When the BSRI was first developed, 
I set forth a definition of androgyny based on 
the t ratio between an individual’s endorse-
ment of feminine and masculine attributes,
with small t ratios indicating androgyny and
large t ratios (significant differences) indicat-
ing sex typing (S. L. Bem, 1974). This
method of scoring the BSRI follows directly 
from the theory in that it distinguishes be- 
tween those individuals who cluster the at- 
tributes on the BSRI into the two categories 
designated by the culture as more desirable 
for one or the other of the two sexes, and 
those individuals who do not. The later emer- 
gence of behavioral differences between those 
who score high on both the Femininity and 
the Masculinity scales (the androgynous high- 
highs) and those who score low on both (the 
“undifferentiated” low-lows)—even though 
both groups achieve small t ratios—led to the
suggestion that the BSRI be scored on the 
basis of a median split on both dimensions, 
a scoring procedure yielding four rather than 
three distinct groups of individuals (S. L. 
Bem, 1977). For empirical purposes, this was
an entirely sensible proposal, but it did tend 
to obscure the original theoretical rationale 
behind the BSRI. 
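
The contrast between the two scoring procedures can be illustrated with a minimal sketch. Everything named here is hypothetical: the function names, the use of a pooled-variance two-sample t ratio, the cutoff of 2.025 (roughly the two-tailed .05 value for 38 degrees of freedom), and the handling of ties at the median are assumptions made for illustration, not the published scoring rules.

```python
from math import sqrt
from statistics import mean, variance


def t_ratio(fem, masc):
    """Pooled-variance t ratio between one person's feminine-item and
    masculine-item self-ratings (an assumed form of the t ratio)."""
    n_f, n_m = len(fem), len(masc)
    pooled = ((n_f - 1) * variance(fem) + (n_m - 1) * variance(masc)) / (n_f + n_m - 2)
    diff = mean(fem) - mean(masc)
    if pooled == 0:
        # No within-scale variability: treat equal means as t = 0.
        return 0.0 if diff == 0 else float("inf") * (1 if diff > 0 else -1)
    return diff / sqrt(pooled * (1 / n_f + 1 / n_m))


def classify_by_t(fem, masc, critical=2.025):
    """Three groups: a small |t| means androgynous, a large |t| means sex typed.
    The cutoff is an assumption for illustration."""
    t = t_ratio(fem, masc)
    if abs(t) < critical:
        return "androgynous"
    return "feminine" if t > 0 else "masculine"


def classify_by_median_split(f_score, m_score, f_median, m_median):
    """Four groups from a median split on both scales.
    Scores exactly at the median are treated as low (an assumption)."""
    high_f = f_score > f_median
    high_m = m_score > m_median
    if high_f and high_m:
        return "androgynous"
    if high_f:
        return "feminine"
    if high_m:
        return "masculine"
    return "undifferentiated"


# A high-high and a low-low respondent both yield a t ratio of zero,
# so the t-ratio method cannot tell them apart; the median split can.
high_high = ([6, 7, 6, 7, 6], [6, 7, 7, 6, 6])
low_low = ([2, 1, 2, 1, 2], [2, 2, 1, 1, 2])
print(classify_by_t(*high_high), classify_by_t(*low_low))
print(classify_by_median_split(6.4, 6.4, 4.5, 4.5),
      classify_by_median_split(1.6, 1.6, 4.5, 4.5))
```

In this sketch the two hypothetical respondents receive the same label under the t-ratio method and different labels under the median split, which is the empirical distinction that motivated the four-group procedure.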


The Pedhazur and Tetenbaum Critique 
The BSRI as Atheoretical 


The most pervasive criticism advanced by 
Pedhazur and Tetenbaum (1979) is that the 
BSRI is atheoretical: “Instead of defining 
the domains of masculinity and femininity 
and attempting to construct measures con- 
sistent with the definitions, Bem has chosen 
a strictly empirical approach” (p. 998), an 
approach that “was destined to fail” (p. 1012).
The above discussion, however, contradicts 
Pedhazur and Tetenbaum’s conclusion. The 
theory underlying the BSRI asserts that sex- 
typed individuals will conform to whatever 
definitions of femininity and masculinity the 
culture happens to provide. The theory de-
liberately does not specify the particular con-
tents of these definitions, however, because
these will vary from culture to culture. The 
theory is a theory of process, not content, and 
the use of judges as “native informants” about 
the particular contents of the culture’s pre- 
scriptions flows directly from the theory itself. 
Contrary to Pedhazur and Tetenbaum, this 
is not dust-bowl empiricism. 

Closely related to the criticism that the 
BSRI is atheoretical is the objection that the 
Femininity and Masculinity scales are not 
unidimensional, and that one should employ 
a methodology for constructing the scales that 
would guarantee their unidimensionality. But 
Pedhazur and Tetenbaum are putting the 
methodological cart before the theoretical 
horse. The culture has arbitrarily clustered 
together heterogeneous collections of attri- 
butes into the two categories prescribed as 
more desirable for one sex or the other. The 
very concept of androgyny is a positive as- 
sertion that these arbitrary clusters of apples 
and oranges need not—and for some individ- 
uals do not—“hang together.” If the culture 
groups a hodgepodge of attributes into a cate- 
gory it calls “femininity” or “masculinity,” 
then that hodgepodge is what sex-typed in- 
dividuals will take as the standard for their 
behavior. The purpose of the BSRI is to dis- 
criminate between those individuals for whom 
this hodgepodge does form a unitary cluster 
and those individuals for whom it does not. 


Multiple t Tests as Inappropriate for 
Item Selection 


Pedhazur and Tetenbaum (1979) criticize 
the use of item-by-item t tests as the basis
for item selection, presumably because of a 
concern that this strategy might capitalize 
on chance findings. They did not note, how- 
ever, that the initial list of 200 personality 
characteristics was rated by four independent 
groups of judges and that an item was de- 
fined as feminine or masculine if, and only 
if, it was consistently and reliably rated as 
significantly more desirable for one or the 
other of the two sexes by all four groups of 
judges. The probability of this occurring by 
chance for any individual item is 1/160,000; 
the number of items out of 200 expected by
chance to reach significance in all four groups
is .00125. Moreover, a recent replication at
the University of Washington cross-validated 
this pattern for 37 of the 40 items on the 
BSRI (Walkup & Abbott, 1978). The results 
for the three exceptions were in the predicted 
direction, but reached significance only for 
female judges. Accordingly, the BSRI would 
appear to tap relatively enduring definitions 
of sex-appropriate behavior, culturally de- 
fined standards of sex-appropriate attributes 
that have not given way even in the face of 
a strong feminist critique in the culture at 
large. (The BSRI also contains 20 items that 
were judged by the original four samples of 
judges to be no more desirable in American 
society for one sex than for the other. The 
Walkup and Abbott replication cross-validated 
this “neutral” pattern for only about half of 
these items, a finding of little practical or 
theoretical significance because these items 
serve as filler and do not enter into the as- 
sessment of an individual’s sex role.)¹
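
The chance figures cited above can be checked with a short calculation. This is a sketch under two assumptions that the text does not state explicitly: each group's test is run at the .05 level, and the four groups of judges are independent. The variable names are illustrative only.

```python
from fractions import Fraction

alpha = Fraction(1, 20)   # assumed .05 significance level per group of judges
n_groups = 4              # four independent groups of judges
n_items = 200             # size of the initial item pool

# Probability that a single item reaches significance in all four groups by chance.
p_all_four = alpha ** n_groups

# Expected number of items, out of 200, passing all four tests by chance alone.
expected_items = n_items * p_all_four

print(p_all_four)             # 1/160000
print(float(expected_items))  # 0.00125
```

Under these assumptions the calculation reproduces both figures in the text.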
Moreover, the use of item-by-item tests 
(whether t tests, D, or item-total correlations)
is established practice in test construction 
(Anastasi, 1968), and it is the use of stepwise 
discriminant analysis that seems quite curi- 
ous in this context. Because discriminant anal- 
ysis weights each item only with respect to 
the amount of incremental discriminating 
power it has over and above the discriminat- 
ing power of the items already entered into 
the equation, discriminant analysis is exceed- 
ingly sensitive to the composition of the par- 
ticular items in the item pool. In the present 
case, for example, discriminant analysis would 
have yielded dramatically different weightings 
if the two items “feminine” and “masculine” 
had been excluded from the initial item pool, 
but the content of the culture’s definitions 
would not have changed. Indeed, I suspect 
that even these two items would have been 
eliminated by a discriminant analysis if we 
had simply included the items “female” and 
“male” in the initial item pool. The goal of 
the social desirability ratings was to identify 
a comprehensive complement of items that the 
culture consistently and reliably designates as 
more appropriate for one or the other of the 
two sexes. Given this goal, it is simply not
meaningful to conclude—as a discriminant
analysis suggests we must—that only the two 
items “feminine” and “masculine” need to be 
considered. 

In fact, there is now a body of accumu- 
lated evidence that suggests that—far from 
being the best two items on the scale—“femi- 
nine” and “masculine” are actually the worst 
two items on the scale. It is true, as Pedhazur 
and Tetenbaum (1979) point out, that these 
two items are responsible in large part for 
sex differences in self-report on the BSRI, 
but the goal of the BSRI is to measure and 
to facilitate the investigation of within-sex 
differences, not between-sex differences. The 
difference between females and males on the 
scales of the BSRI has never been used either 
as a criterion for item selection or as an in- 
dicator of validity (unlike the sex difference 
on Spence and Helmreich’s Personal Attri- 
butes Questionnaire, 1978), but only as an 
empirical datum about a particular subject 
sample at a particular point in time. The the- 
ory behind the BSRI in no way commits it- 
self even to the existence of a sex difference 
in any particular sample of women and men. 
Once again, the focus is on individual differ- 
ences, not sex differences. Moreover, as fac- 
tor analyses by Pedhazur and Tetenbaum and 
others (e.g., Berzins, Welling, & Wetter, 1978; 
Feather, 1978; Gaudreau, 1977; Waters, 
Waters, & Pincus, 1977) have indicated, 
these two items do not load highly with other 
items on the BSRI; rather, they form a bi-
polar factor of their own, a factor correlated
in several analyses with gender. Finally, these
two items are not highly correlated with their
own total scale scores, but serve primarily as
gender markers.


¹ It should be noted that an earlier study, also
conducted at the University of Washington, had pre-
viously replicated this pattern for only 2 of the 40
attributes—“feminine” and “masculine”—a finding
that was interpreted by the authors as indicating
that the original social desirability ratings were not
stable (Edwards & Ashworth, 1977). However, that
study did not utilize our original instructions, but
asked judges instead to rate “how desirable or un-
desirable you judge [each attribute] to be in an
American male/female.” In contrast, our instructions
emphasized that we were only interested in each
judge’s assessment of how American society would
evaluate the various attributes. The two kinds of
ratings are simply not equivalent—in 1972 or 1977.
The Edwards and Ashworth study thus constitutes
a failure to replicate our methodology, not a failure
to replicate our results.

It is important in this context not to con- 
fuse the individual items “feminine” and 
“masculine” with the Femininity and Mascu- 
linity scales. The labels for the scales are a 
shorthand means of summarizing that all of 
the items were selected in the first place be- 
cause they were judged more desirable either 
for women or for men. 


Factor Analysis and the Short BSRI 


As Pedhazur and Tetenbaum (1979) sug- 
gest, factor analyses of the feminine and mas- 
culine items on the BSRI generally yield four 
factors that account for 75% or more of the 
common variance: (a) a single feminine fac- 
tor defined by such items as “warm,” “gen- 
tle,” and “eager to soothe hurt feelings”; (b) 
two masculine factors, one defined by such 
items as “dominant,” “aggressive,” and “as- 
sertive,” and the other defined by such items 
as “independent,” “self-reliant,” and “self- 
sufficient”; and (c) a factor correlated in 
several analyses with gender and defined by 
the items “masculine” and “feminine” (and 
occasionally “athletic”). According to Ped- 
hazur and Tetenbaum, these results are dev- 
astating to the BSRI. I disagree. Because the 
theory underlying the BSRI does not require 
that the domains of femininity and masculin- 
ity be unidimensional, it is only the existence 
of that small fourth factor that is unantici- 
pated by the theory. 

The results of these factor analyses do sug- 
gest ways in which the BSRI might be re- 
fined, however; and in fact, a short BSRI 
has recently been developed that contains 
exactly half of the original items. Two groups 
of feminine and masculine items in particular 
were eliminated during the development of 
the short BSRI: (a) the few items, including
“feminine” and “masculine,” that defined the 
factor correlated with gender, as noted above, 
and (b) a group of feminine items with rela- 
tively low social desirability, none of which 
correlated highly with the Femininity score 
or loaded on the feminine factor for either 
sex, and a few of which even had high load-
ings on the masculine factor (e.g., “yielding,”
“shy,” and “soft-spoken”). These feminine
items were included on the original BSRI in 
order to balance the overall social desirability 
of the feminine and masculine attributes, but 
as the concept of androgyny has evolved, it 
has seemed increasingly inappropriate to de- 
fine androgyny in terms of these relatively 
undesirable attributes. Accordingly, the Femi-
ninity and Masculinity scales of the short
BSRI consist of items that represent the most 
desirable personality characteristics for a 
given sex, and the variances of their social 
desirability ratings are quite comparable as 
well. Finally, the short BSRI also includes 
10 filler items that were judged to be no more 
desirable for one sex than for the other in all 
four of our samples of judges as well as in 
the Walkup and Abbott (1978) replication.²


The Locksley and Colten Critique 


In contrast to the critique by Pedhazur and 
Tetenbaum (1979), the Locksley and Colten 
(1979) critique raises strategic issues of a 
broader and more conceptual nature, issues 
that turn on the eventual heuristic payoff of 
differing approaches. For example, Locksley 
and Colten question the feasibility of basing 
the measurement of individual differences in 
femininity and masculinity on broadly based 
cultural stereotypes about women and men, 
and they believe that a more cognitive ap- 
proach to sex typing would be more fruitful. 
Again, only future research can decide such 
issues. They also question the concept of an- 
drogyny itself: “Sex is an immediately per- 
ceptible feature of every person. . . . Con- 
sidering the role of sex in the very architecture 
of experience and behavior, the notion of psy- 
chological androgyny, with its implication of 
freedom from sex-related social effects on per- 
sonality and behavior, is arbitrary at best” 
(pp. 1028-1029). 

But like most psychological concepts, the 
concepts of sex typing and androgyny are seen 
as matters of degree. I, too, would agree with
the rather unexceptional position that it is
not possible for an individual to be completely
free from sex-related social effects, but that
does not preclude the possibility that indi-
viduals may differ in the extent to which
gender serves as a cognitive schema for the
processing of information, a lens through
which they perceive and interpret social real-
ity. Moreover, my current research on the
cognitive processes mediating sex typing and
androgyny is addressed to precisely this hy-
pothesis.


² A detailed manual for both the original and the
short BSRI will soon be available from Consulting
Psychologists Press, 577 College Avenue, Palo Alto,
California 94306.


Gender as a Cognitive Schema 


The distinction between male and female 
clearly exists “out there” in the real world as 
a basic and fairly primitive dichotomy. More- 
over, it is a dichotomy that is important to 
almost all human cultures in a way that ex- 
tends well beyond basic biological differences 
in body build and reproductive function; and, 
precisely because it does loom so large, it is 
a dichotomy that cannot be overlooked or 
treated as psychologically irrelevant by any- 
one. Indeed, the distinction between male and 
female is known by children as young as 1 
or 2 years of age, and in most cultures, chil- 
dren are explicitly taught to treat the gender 
dichotomy with seriousness and respect from 
that time forward. Moreover, within most 
cultures, a variety of different sources can 
aid and abet an individual’s awareness of the 
gender dichotomy: for example, interacting 
exclusively with same-sex peers and inferring 
that the other sex must be fundamentally dif- 
ferent from you; observing mothers and 
fathers in sex-stereotyped roles as homemaker 
and breadwinner and inferring that the two 
sexes must be fundamentally different from
one another; observing that mixed-sex rela-
tionships always seem to have sexual over- 
tones and inferring that heterosexual attrac- 
tion constitutes the only experience shared in 
common by males and females; observing that 
different pronouns are used in reference to 
the two sexes and inferring that the distinc- 
tion between them must be even more im-
portant than, say, the distinction between
black and white or between young and old; 
and so forth. 


But despite the universal recognition of the
gender dichotomy, it is a basic assumption
underlying my current research that there
are wide individual differences in the func-
tional importance attached to it. In particu- 
lar, I hypothesize that sex-typed and androgy- 
nous individuals differ from one another in 
how much they believe the sexes to be basi- 
cally different from one another—a belief in 
“gender polarity”—and that the differences
between these sex-role groups both in self- 
description and in behavior are themselves a 
consequence of the differences in the content 
of their beliefs about the two sexes. More- 
over, it is suggested that not only do indi- 
viduals of different sex roles differ in the 
extent to which they hold different beliefs 
and expectations about what the two sexes 
are like but, furthermore, these beliefs medi- 
ate both how they, as individual males and 
females, behave and how they interpret the 
behavior of male and female others as well. 
At the more basic level, it is hypothesized 
that individuals of different sex roles differ 
not only in the content of their beliefs about 
gender differences (gender polarity) but in 
their cognitive structures for coding and proc- 
essing information as well, structures that 
have variously been called frames (Minsky, 
1975), scripts (Abelson, 1976), and schemata 
(Bartlett, 1932; Bobrow & Norman, 1975; 
Kelley, 1972; Markus, 1977; Stotland & 
Canon, 1972). The notion of a schema con- 
notes a highly articulated and dynamic con- 
cept that organizes and guides one’s process- 
ing of information by virtue of its rich 
associative network, its implications for cause- 
and-effect relationships, and its well-developed 
criteria for making discriminations along the 
relevant dimensions, implying that individuals 
of different sex roles should differ in the cog- 
nitive processing of gender-related informa- 
tion. In particular, to the extent that an in- 
dividual holds different beliefs about what 
the two sexes are like, information related to 
gender and to gender differences should be 
more perceptually salient and more cognitively 
available (Nisbett & Ross, in press; Ross, 
1977; Tversky & Kahneman, 1973, 1974). 
That is, gender differences should be more 
readily perceived or noted; they should be
more readily stored in and retrieved from
memory; they should be more readily used
as the basis of personality attributions and
predictive inferences; they should be more 
readily put forth as a causal candidate for 
a variety of behavioral outcomes. In short, 
the gender dimension should be more cog- 
nitively available for the processing of in-
formation. Accordingly, individuals of differ-
ent sex roles are seen as differing in how much 
they spontaneously process information about 
the self, about others, and about the nonsocial 
environment in general in gender-related terms. 
In sum, then, individuals of different sex
roles are not viewed here as differing primarily 
in terms of how much masculinity or feminin- 
ity they possess, but rather, they are viewed 
as differing more fundamentally (a) in the 
content of their beliefs about what the two 
sexes are like and (b) in their cognitive
schemata for processing gender-related in- 
formation, and hence in the perceptual sa- 
lience and cognitive availability of gender 
and gender-related concepts as dimensions for 
processing incoming information. 

It should be noted that two studies already 
in the literature provide preliminary support
for the hypothesis that sex-typed individuals 
differentiate along a gender-related dimension 
significantly more than androgynous individ-
uals do. In the first study, sex-typed and an-
drogynous subjects watched either a female 
or a male on videotape performing a series 
of neutral activities, and, using a hand-held 
recorder, they segmented the videotaped se- 
quence into self-defined units that seemed 
natural and meaningful to them. The results 
revealed a significant Sex Role × Sex of
Actor interaction, with sex-typed subjects 
differentiating between the female and male 
actor significantly more than androgynous 
subjects did (Deaux & Major, 1977). In the
second study, sex-typed and androgynous sub-
jects rated the similarity of handwritings on
masculinity-femininity, and they also rated
the handwritings on an absolute scale of mas-
culinity-femininity. The results indicated
both that sex-typed subjects differentiated
along the dimension of masculinity-femininity
significantly more than androgynous subjects
did and that they weighted it more heavily
in making similarity judgments (Lippa, 1977).


Androgyny as a Transitional Concept


Locksley and Colten (1979) are rightly 
concerned with the conceptual status of the 
androgyny concept. As I have noted elsewhere, 
I have a discomfort of my own:


If there is a moral to the concept of psychological 
androgyny, it is that behavior should have no gen- 
der. But there is an irony here, for the concept of 
androgyny contains an inner contradiction and hence 
the seeds of its own destruction. Thus, as the etymol- 
ogy of the word implies, the concept of androgyny 
necessarily presupposes that the concepts of femi- 
ninity and masculinity themselves have distinct and 
substantive content. But to the extent that the an- 
drogynous message is absorbed by the culture, the 
concepts of femininity and masculinity will cease to 
have such content and the distinctions to which they 
refer will blur into invisibility. Thus, when androg- 
yny becomes a reality, the concept of androgyny will 
have been transcended. (S. L. Bem, in press)


How Relevant is a Semantic Similarity Interpretation
of Personality Ratings?


Jack Block, Daniel S. Weiss, and Avril Thorne 
University of California, Berkeley 


In recent years, a conceptual similarity interpretation of personality and inter- 
personal assessment ratings has been advanced by D’Andrade and Shweder. 
According to their view, such ratings are primarily understandable as linguistic 
artifacts having little or no connection with the real world. This position con- 
travenes a central and widely used methodology of personality and behavior 
assessment. The present article critically evaluates the logic and data that have 
been used to support this conceptual similarity explanation and concludes that 
pertinent evidence is wanting. A primary finding of D’Andrade and Shweder, 
that judges can predict the intercorrelations among personality ratings, is 
shown in an experimental study to be an adventitious function of previously 
unrecognized patterns of redundancy in the personality variables that happen 
to have been used. Finally, some empirical relationships not assimilable to their 
position are noted together with some conceptual problems of their viewpoint. 


A decade ago, D’Andrade (1965, 1974) be- 
gan to advance a radical and sweeping cri- 
tique of two essential techniques of per- 
sonality assessment: personality ratings and 
personality questionnaires. Applying his own 
analytical logic to several sets of previously 
published data, D’Andrade concluded that 
“the correlations [among rating and inventory
scales] are primarily an artifact of the
rater’s or the questionnaire taker’s cognitive
structure, and not a reflection of the real 
world” (1974, p. 181). 

More recently, Shweder (1975, 1977a, 
1977b) has extended and elaborated on 
D’Andrade’s theme, asking the confronting 
question: “How relevant is an individual difference
theory of personality?” (1975, p.
455). By “an individual difference theory of
personality,” Shweder referred to generally
held conceptions of personality as epitomized,
for example, by Child’s (1968) definition:
“more or less stable internal factors that make
one person’s behavior consistent from one


This study was supported by National Institute of
Mental Health Grant 5RO1 MH 16080 to Jack Block. 
Requests for reprints should be sent to Jack Block, 
Department of Psychology, University of California, 
Berkeley, California 94720. 


Copyright 1979 by the American Psychological Association


time to another, and different from the be- 
havior other people would manifest in com- 
parable situations” (p. 83). Shweder answered 
his question negatively, boldly asserting that 
“an individual difference theory of personal- 
ity” has been “shown” by him to be “no more 
than statements about how respondents [and 
psychologists] classify things as alike in mean- 
ing” (1975, p. 482). As Shweder noted, much 
of the evidence for the existence of personality
parameters that shape behavior in coherent
ways has depended on observation-based
ratings and personality inventories. According
to Shweder, when rating and inventory
data are generated by psychologist-observers
or by inventory takers, respondents
(including psychologists) 

unwittingly substitute a theory of conceptual likeness 
for descriptions of behavioral co-occurrences. . . .
Items alike in concept are inferred to be behaviorally 
characteristic of the same person even when, as is 
typically the case, conceptual relationships among 
items do not correspond to the actual behavioral re- 
lationships among items. . . . [As a result] these 
conceptually biased judgments create an “illusion” 
of underlying behavioral consistency which, although 
not apparent in actual behavior, deceptively validates 
the “individual difference” conceptualization of “per- 
sonality.” (1975, pp. 455-456) 


This point of view, which strikes at the
very core of personality and behavioral assessment,
has recently achieved appreciable
currency, being approvingly cited in reviews 
by Schneider (1973) and Mischel (1968, 
1969, 1973, 1977), among others. The pres- 
ent article is an effort to evaluate the pre- 
mises, logic, and empiricism that underlie the 
strong conclusions of what we shall call the 
conceptual similarity position. We focus pri- 
marily on the “relevance” of personality rat- 
ings, a quite sufficient context in which to 
address the issues that have been raised. 


The Conceptual Similarity Argument 


Underlying the conceptual similarity posi- 
tion are several propositions and presump- 
tions about personality assessment that re- 
quire specification and scrutiny if the tenabil- 
ity of this conclusion is to be evaluated. 


The Conceptual Similarity View of Personality 
and Behavior Assessment Research: Emphasis 
on Trait Similarity Structures 


D’Andrade characterizes personality assess- 
ment as follows: 


First, one or more human observers are asked to 
judge one or more subjects on a number of traits 
of behavior; the judgments are expressed in ratings 
or rankings (scores) based on the observers’ long-term
(i.e., more than 10 minutes) memory of the
subjects’ behavior. Second, a single score is computed
for each subject for each trait, usually by
taking the mean of all the scores given to each sub- 
ject for each trait. Third, to find out how the traits 
are related to each other, some measure of associa- 
tion, such as the product-moment correlation, is 
computed from the subjects’ scores for all pairs of 
traits. Finally, the measures of association are ana- 
lyzed to determine how the traits are organized with 
respect to each other. . . . The results of such
analyses indicate which traits tend to go together, 
and the similarity structure of the trait measurements 
is taken as a representation of the structure or organization
of the subjects’ behavior. (1974, p. 161)
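The four steps D’Andrade lists can be sketched in a few lines of code. What follows is a minimal illustration only, assuming a hypothetical set of ratings (two observers, four subjects, three traits). The data, the function names `trait_scores` and `trait_correlations`, and the helper `pearson` are inventions of this sketch and are taken from none of the studies discussed.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def trait_scores(ratings):
    """Step 2: average over observers.

    ratings[observer][subject][trait] -> scores[subject][trait]
    """
    n_subjects = len(ratings[0])
    n_traits = len(ratings[0][0])
    return [
        [mean([obs[s][t] for obs in ratings]) for t in range(n_traits)]
        for s in range(n_subjects)
    ]


def trait_correlations(scores):
    """Step 3: correlate every pair of traits across subjects."""
    n_traits = len(scores[0])
    columns = [[row[t] for row in scores] for t in range(n_traits)]
    return [
        [pearson(columns[i], columns[j]) for j in range(n_traits)]
        for i in range(n_traits)
    ]


# Step 1 (hypothetical data): 2 observers x 4 subjects x 3 traits.
ratings = [
    [[5, 4, 1], [3, 3, 2], [2, 1, 5], [4, 5, 2]],
    [[4, 5, 2], [3, 2, 3], [1, 2, 4], [5, 4, 1]],
]

scores = trait_scores(ratings)
matrix = trait_correlations(scores)  # input to Step 4 (clustering, factoring)
```

The resulting `matrix` is what the quotation calls the similarity structure of the trait measurements. The computation itself is silent on how that structure should be interpreted.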


This impression of personality and behavioral
assessment research will appear foreign
to most assessment psychologists. Insofar as
ratings are used as one of the primary methods
of assessment, the first two steps listed
by D’Andrade often apply. The rating data 
thus generated, however, are then generally 
used in a multipronged effort (together with
other kinds of measures) to establish the construct
validity of a concept. The goal of such
effort is a dependable and coherent nomological
network of relationships on which to base
subsequent theorizing or prediction. The emphasis
of personality research is, centrally,
on the validity of the measures being used;
that is, Do individuals earning different scores
on an index also earn conceptually expectable
different scores on entirely separate and independently
formulated measures? There is a
long history to this kind of effort; recent reviews
of the status, accomplishments, and deficiencies
of personality research are to be
found in Block (1977a), Gough (1976),
Hogan, DeSoto, and Solano (1977), and 
Stagner (1977). 

The third and last steps listed in the
D’Andrade (1974) conceptualization of personality
research may, and often do, arise
when multiple ratings or measures have been
gathered and there is question or concern
about the extent of redundancy of coverage
of the personality or behavioral domain by
the variables used. Although a certain amount
of redundancy is required to ensure reliability,
redundancy beyond this certain amount
is wasteful. Various clustering methods, such
as factor analysis, can be extremely useful in
quickly and simply revealing the latent dimensionality
of the set of ratings employed
and as a guide to identifying rating variables
that are dispensable with little, or at least
acceptable, cost.

Although the results of factor analyses
(and related methods) of sets of ratings are
often of immediate and intrinsic interest, there
are few personality psychologists today who
would claim that obtained factor structures
represent inexorable verities (cf. Lykken,
1971). It is now well recognized that by
varying the mix of variables included in a
clustering analysis, one can fundamentally
alter both the number and nature of the summarizing
dimensions subsequently obtained.
Thus, D’Andrade’s claim that “the similarity
structure of the trait measurements is taken
[by psychologists] as a representation of the
structure or organization of the subjects’ behavior”
misunderstands both the analytical
methods used and the views of most personality
psychologists. Personality psychologists
seek or settle on useful sets of variables to
be employed for the specification of the per- 
sonal or behavioral qualities of the individ- 
uals to be studied. To the extent that the set 
of personality or behavioral variables has 
communal variance (i.e., is redundant), there 
will be a “similarity structure of the trait 
measurements.” But such similarity structures 
represent neither conceptual nor empirical fix- 
ities. We believe that the focus on arbitrary 
and often adventitious “similarity structures” 
derived from ratings addresses tangential or 
subsidiary issues. Instead, we suggest that the 
essential concern of personality assessment— 
a concern not addressed by D’Andrade and 
Shweder—is the convergent and discriminant 
validity of personality and behavioral ratings. 
Moreover, we note that the validity of a per- 
sonality measure is not necessarily tied to the 
interrelationships among a set of personality 
measures. We shall be returning to this point. 


The Primary Demonstration of the Conceptual 
Similarity Position: Correspondence Between 
Rated Behavior and Conceptual
Similarity Matrices


The personality or behavioral variables 
rated by observers or by peers to describe a 
set of subjects can be intercorrelated. We call 
the consequent matrix a Rated Behavior 
matrix. Separately, the variables that had 
been rated with respect to actual individuals 
can be scaled by judges for the degree of 
conceptual similarity or meaning equivalence 
each variable has with respect to every other 
variable. The resulting matrix of similarity 
indices we call a Conceptual Similarity ma- 
trix. The primary demonstration of the con- 
ceptual similarity position has been to show 
that for a number of previously published 
Rated Behavior matrices, the pattern of in- 
tercorrelations characterizing a Rated Behav- 
ior matrix could be well approximated by the 
pattern of intercorrelations characterizing the 
Conceptual Similarity matrix based on the 
same set of variables. Because of the fre- 
quently strong correspondence between these 
two matrices, in which the Rated Behavior 
matrix stems from observations of real indi- 
viduals, whereas the Conceptual Similarity 
matrix derives from purely semantic judgments
of the similarity of personality variables
made without reference to palpable people, 
D’Andrade concluded that “with respect to 
personality and behavior specifications in the 
field of psychology . . . it is possible to con- 
fuse propositions about the world with propo- 
sitions about language” (1965, p. 215). 
Shweder has been more vigorous in his as- 
sertions, claiming that 
Evidence generally thought to demonstrate the ap- 
plicability of the “individual difference” conceptuali- 
zation of “personality” [has been] shown to be 
equivocal by deriving the trait-factors discovered in 
interpersonal rating forms, questionnaire interviews 
and personality inventories from purely conceptual 
criteria without any reference to actual behavior. 
(1975, p. 459) 
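The comparison underlying this demonstration can be sketched as follows: the off-diagonal entries of the two matrices are paired, variable pair by variable pair, and then correlated. The two matrices below are hypothetical values chosen for illustration, and the function names are assumptions of this sketch. The index of correspondence actually used by D’Andrade or Shweder may differ.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lower_triangle(matrix):
    """Entries below the diagonal, one per distinct pair of variables."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def matrix_correspondence(a, b):
    """Correlate the corresponding off-diagonal entries of two matrices."""
    return pearson(lower_triangle(a), lower_triangle(b))


# Hypothetical Rated Behavior matrix (intercorrelations of four rated variables).
rated = [
    [1.0, 0.6, 0.1, -0.3],
    [0.6, 1.0, 0.2, -0.2],
    [0.1, 0.2, 1.0, 0.4],
    [-0.3, -0.2, 0.4, 1.0],
]

# Hypothetical Conceptual Similarity matrix for the same four variables.
conceptual = [
    [1.0, 0.8, 0.0, -0.5],
    [0.8, 1.0, 0.1, -0.4],
    [0.0, 0.1, 1.0, 0.5],
    [-0.5, -0.4, 0.5, 1.0],
]

correspondence = matrix_correspondence(rated, conceptual)
```

In this sketch the unit of analysis is the pair of variables: four variables yield six pairs, and no individual subject enters the computation at any point.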

We suggest that the correspondence between 
a Rated Behavior matrix and a Conceptual 
Similarity matrix, when it exists, does not 
have the logical implication with respect to 
individual differences that the conceptual simi- 
larity position seems to claim. The reader 
should pause for a moment to verify that in 
no way whatsoever is it possible to proceed 
from the specific semantic similarity judg- 
ments on which the Conceptual Similarity 
matrix is based to a specification of just which 
individuals are rated high or low on a particu- 
lar rating dimension. For example, to know 
that judges evaluate the personality rating 
variables “Introverted” and “Devaluates him- 
self” as conceptually highly similar (Shweder, 
1975, Table 1) is to be entirely uninformed 
about who is rated Introverted or as De- 
valuat(ing) himself or whether Introverted 
and Devaluates himself, as separate or con- 
joined variables, are important and useful 
characteristics in terms of which to classify 
people. We shall return later to this distinc- 
tion between classifying personality variables 
as opposed to classifying individuals in terms 
of personality variables. 


The Secondary Demonstration of the 
Conceptual Similarity Position: Absence 
of Correspondence Between Rated Behavior 
and Actual Behavior Matrices 

From the beginning, D’Andrade recognized
the confounded nature of the primary dem- 
onstration: 
It is possible that the so-called psychological traits
dealt with in this paper exist both as components in
terms used to describe the external world and in 
the external world as well. Such an isomorphism, if 
it exists, might be the result of the external world 
affecting first the discriminations made by speakers 
of a language, who then eventually develop a se- 
mantic structure within the language to encode these 
discriminations. (1965, pp. 227-228) 


Thus, the isomorphism hypothesis that “the 
semantic similarity of trait terms corresponds 
to the way these traits actually go together” 
(D’Andrade, 1974, p. 162) was confounded 
with the more dramatic systematic-distortion 
hypothesis “that traits the observer considers 
similar will be recalled as applying to the same 
person, even when this is not the case” 
(D’Andrade, 1974, p. 161). 

To speak to the issue of confounding, 
D’Andrade (and Shweder) sought an addi- 
tional demonstration that would distinguish 
between the two competing hypotheses. They 
reasoned that if the systematic distortion or 
conceptual similarity hypothesis was correct, 
a matrix of intercorrelations among what they 
called “actual behaviors” (which we will call 
the Actual Behavior matrix) should not cor- 
respond to the Rated Behavior or Conceptual 
Similarity matrices. 


If the observer’s memory-based ratings showed a
very different pattern of correlations from that found 
for the data based on the actual behavior of the 
subjects (but a pattern similar to judgments of se- 
mantic similarity), it would be reasonable to reject 
the isomorphic hypothesis and to consider the systematic-distortion
hypothesis supported. (D’Andrade,
1974, p. 162)


D’Andrade (1974) has reported on two 
studies in the existing psychological literature 
that he deemed to have Actual Behavior and 
Rated Behavior matrices suitable for uncon- 
founding purposes; Shweder (1975) has re- 
ported a third study relating Actual Behavior 
and Rated Behavior patterns of intercorrelations
to each other and to a Conceptual Similarity
pattern. In all three sets of analyses,
a strong conclusion is reached affirming the
systematic-distortion or conceptual similarity
interpretation:


Rated behavior is almost entirely under the influence
of pre-existing conceptual schemes and corresponds
to actual behavior relationships only to the extent
pre-existing conceptual schemes happen to partially
coincide with actual behavior. (Shweder, 1975, p.


Later in this article, we shall suggest a
number of reasons why the three studies 
located by D’Andrade and Shweder fail to 
meet their special hypothesis-testing require- 
ments. 


The Implications of the Conceptual 
Similarity Position 


The critical import of the conceptual simi- 
larity argument is powerful, blunt, and devas- 
tating to personality psychology and other 
behavioral science fields relying on some form 
of ratings. 


If the correlations on which these studies rest are
primarily an artifact of the rater’s or the questionnaire
taker’s cognitive structure, and not a reflection
of the real world, there is little or no evidence
that human behavior can be described by large
multibehavior units. What remains is a world in 
which human behavior is to be described in terms 
of specific behaviors occurring in specific situations, 
as Mischel and others have argued. (D'Andrade, 
1974, p. 181) 


D’Andrade went on to suggest that “it is 
not necessary to group behavior into clusters, 
or traits, or dimensions to be able to give 
an economical description of an individual's 
behaviors” (1974, p. 184). Instead, he 
claimed that “a remarkably predictive de- 
scription can be obtained” (p. 185) if one 
simply predicts the most frequent behavior 
previously observed in each of a number of 
different behavior settings. No data are of- 
fered with regard to the predictive efficacy of 
this behavior-specific approach. Whatever the 
predictive accuracy of this actuarial orienta- 
tion may prove to be, it is clear that it aban- 
dons the elegance and economy of a theory 
of behavior. Many psychologists will not be 
interested in a psychology reduced to, and 
aspiring to no more than, an endless graphing
of frequency distributions for behavior-category
systems. Were the logic and empiricism
surrounding the behavior-specific
approach compelling, this reductionistic but 
assumption-free approach to “good predic- 
tion” would prevail. It is our contention, how- 
ever, that the argument and evidence brought 
together to buttress the conceptual similarity 
interpretation of personality ratings as existing
solely “in the eyes of the beholder” are
deeply flawed, wrongly focused, and greatly
embarrassed by a host of empirical findings 
unassimilable to the systematic-distortion 
position. 

The remainder of this article will more 
closely consider the evidence and reasoning 
underlying the conceptual similarity inter- 
pretation of the processes underlying per- 
sonality assessment. We shall first evaluate 
the Actual Behavior-Rated Behavior matrix
analyses intended to unconfound the
isomorphic and systematic-distortion hypotheses.
Next, we go on to illustrate the
conditions that are sufficient and perhaps 
necessary to create or to prevent correspond- 
ence between Rated Behavior and Conceptual 
Similarity matrices. Finally, we list a variety
of connections between personality ratings 
and indisputably real world outcomes that 
cannot be understood in artifactual terms. 
In the course of our countering evaluation, 
we shall burden the reader with matters of 
analytic detail, psychometrics, and princi- 
ples of inference. We do not apologize for 
the closeness of our analysis; too often in 
psychology, crucial experimental points and 
analytical considerations go unnoticed. If the 
reader is to form well-based opinions on the 
important issues involved, he or she will 
have to become actively involved in the spe- 
cifics of the evaluative process. 


Unconfounding the Isomorphic and 
Systematic-Distortion Hypotheses: 
Some Problems in the Analyses 


To harden the case for a conceptual simi- 
larity interpretation of personality ratings, 
demonstrations were sought that 


Verbal data collecting procedures, such as interpersonal
ratings, self-report inventories, and retrospective
questionnaires, . . . cannot be systematically
supported by detailed and immediately recorded
observational data. (Shweder, 1975, pp. 459-460)


It was reasoned that 


Pattern comparison [of an Actual Behavior matrix 
with Rated Behavior matrices] is needed here because
it is not the validity of specific traits or categories
which is in question [emphasis added], but
the validity of the correlations between traits. 
Ideally, for recording the actual behavior of the
subjects, a mechanical device should be constructed
to count frequencies of different kinds of behavior. 
Unfortunately . . . a mechanical measuring instru- 
ment is not practical at present for any judgment 
more complex than is making noise versus is not 
making noise. The closest approximation to a me- 
chanical device appears to be a trained observer 
using a simple coding scheme to record a subject’s 
behavior as it occurs. The immediacy of the observer’s
assessment and the simplicity of the coding
decisions should, it is hoped, protect against sys- 
tematic distortion of the type thought to take place 
in long-term (i.e., more than ten minutes) memory. 
(D’Andrade, 1974, p. 162) 

The two studies reworked by D’Andrade 
for his Actual Behavior—Rated Behavior 
matrix comparisons are an investigation by 
Borgatta, Cottrell, and Mann (1958) and 
the doctoral dissertation of Mann (1959). 
Shweder has added a comparable analysis of 
a study by Newcomb (1929). It will be 
helpful to briefly characterize these studies. 

The Borgatta et al. (1958) research used 
47 graduates divided into five small groups. 
Each group met for ten 2-hour sessions to 
discuss “democratic leadership.” During the 
ninth and tenth sessions, a single observer 
using the Interaction Process Analysis (IPA) 
category system (Bales, 1951) coded the on- 
going interaction. Subsequently, for each 
member of the group, the rate of each Bales 
category was calculated. These rates served 
as the Actual Behavior scores for subsequent 
analysis. After the ninth session, each group 
member ranked all the members of his group 
with respect to 40 traits, 6 of which approxi- 
mated the Bales categories. Subsequently, 
for each member, the average ranking (ad- 
justed for the slight differences in group 
sizes) on each of the 6 traits approximating 
the Bales categories was calculated. These 
average rankings served as the Rated Be- 
havior scores for subsequent analysis. 

The Mann (1959) study used 100 fra- 
ternity undergraduates, each of whom was 
assigned to two different and nonoverlapping 
five-man groups. In one of these 50-minute 
groups, the group worked on a specific task; 
in the other kind of group, the group mem- 
bers attempted to formulate some fraternity 
policies. During each session, a single ob- 
server used a version of the Bales interaction 
categories to code the ongoing interaction. 
For each member of the group, the total
number of “acts” was calculated. For each of
seven Bales IPA categories, the subject’s act 
frequency was divided by his total number 
of acts to derive a percentage. These per- 
centages served as the Actual Behavior scores 
for subsequent analysis. After a group ses- 
sion, each member rated the other four on 
16 characteristics, 7 of which approximated 
the Bales-Mann interaction categories. Sub- 
sequently, for each subject, the sum of the 
ratings assigned by the other four members 
of his group on each of the characteristics 
was calculated. These pooled ratings served 
as one form of Rated Behavior scores for 
subsequent analysis. In addition, the inter- 
action-observing coder at the end of each 
session was asked to rate directly for each 
subject the percentage of the subject’s be- 
havior in each of the Bales-Mann categories. 
These estimated percentages served as a second
form of Rated Behavior scores in subsequent
analysis.

The Newcomb (1929) study evaluated 51 
problem boys (a group of 27 preadolescents 
and a group of 24 adolescents) attending a 
month-long summer camp. For each group, 
camp consisted of six tentfuls of boys, each 
tent supervised by a different counselor. 
Sometime during each day, for each boy, his 
tent counselor completed a behavior record 
with respect to 26 behavioral questions New- 
comb asked because of their presumed rele- 
vance to extraversion-introversion. For each 
question, the counselor selected one of four 
alternatives provided by Newcomb, precoded 
as “positive” (i.e., implying extraversion) or
“negative” (i.e., implying introversion). In
subsequent tabulation for each boy, for each 
of the behavior questions the number of days 
on which a positive or a negative alternative 
had been recorded was divided by the total 
number of camp days for which a report had 
been submitted. These 26 proportions, from 
single and uncalibrated observers, served as 
the Actual Behavior scores for subsequent 
analysis. At the end of the camp period, a 
rating form on approximately the same 26
behavior situations was completed for each
boy [...] and by five other
[...]. The six [...] for each boy were then
[...] into a consensus rating. The 26
pooled ratings for each boy served as the
Rated Behavior scores for subsequent anal- 
ysis. 

D’Andrade (1974) and Shweder (1975) 
suggest that the three studies just described 
permit them to “explicitly separate . . . the 
structure of ‘social impressions’ from the 
structure of behavioral instances” (Shweder, 
1975, p. 479). We suggest, alternatively, that 
the reasoning and evidence adduced by 
D’Andrade and Shweder as support for this 
“separation” lack power and pertinence. 
Some of our concerns follow. 

The categories or items or dimensions of 
behavior measured by the Actual Behavior 
indices are often different from those rated 
by observers. An analysis seeking to inter- 
pret the differences between the similarity 
structures derived from an Actual Behavior 
and a Rated Behavior matrix in conceptual 
similarity terms requires that the sets of 
variables being contrasted must, of course, 
be identical. The only data difference should 
be that in the Actual Behavior case, the vari- 
ables are scored from “actual behavior,” 
whereas in the Rated Behavior case, the vari- 
ables are scored on the basis of retrospective 
judgments and inferences. 

However, in none of the three studies used 
to demonstrate the difference between Actual 
Behavior and Rated Behavior matrices does 
the required equivalence obtain between the 
Actual Behavior and Rated Behavior sets of 
variables. Consider the following examples of 
failure of definitional comparability. 

In the Borgatta et al. (1958) study, the 
Actual Behavior variable, IPA Category 1, 
is defined as “Shows solidarity, raises others’
status, jokes, gives help, reward.” The corresponding
Rated Behavior variable is defined
as “Shows solidarity and friendliness.”
Such facets of interaction as raising others’
status, joking, and giving help and reward,
included in the Actual Behavior version of
this variable, are not included in the Rated
Behavior version. The Actual Behavior variable,
IPA Category 2, is defined as “Shows
tension release, shows satisfaction, laughs.”
The corresponding Rated Behavior variable
is defined as “Is responsive to laughter.”
Such facets of interaction as showing tension
release and showing satisfaction, included in
the Actual Behavior version, are omitted
from the Rated Behavior version. The Ac- 
tual Behavior variable, IPA Category 4, is 
defined as “Gives suggestion, direction, im- 
plying autonomy for other.” The correspond- 
ing Rated Behavior variable is defined as 
“Makes the most suggestions.” The impor- 
tant qualifier, “implying autonomy for 
other,” present in the Actual Behavior ver- 
sion, is absent from the Rated Behavior ver- 
sion. The remaining Actual Behavior - Rated 
Behavior correspondences pose similar prob- 
lems of comparability. 

In the Mann (1959) study, various Bales 
categories were merged. The first Actual Behavior
variable, Mann’s Category 1, subsumes
showing solidarity (excluding all joking
behavior) and showing agreement. The
Rated Behavior version of this variable inquired
“how much [the group member being
rated] tended to agree with what others had
said.” No mention of solidarity is contained
in the Rated Behavior variable definition, a
most important omission.

Shweder’s (1975) report of the Newcomb 
(1929) study claims that the same 26 items 
of behavior were evaluated by the daily 
records and by the end-of-camp ratings. In 
fact, 15 of the 26 items lack Actual Behavior-Rated
Behavior identity. For example,
the daily record item, “Was he fond
of swimming?” becomes the rating item,
“Did he spend his swimming periods
in the water, actively moving about?” The 
daily record item, “How much of the day 
did he spend doing things that required little 
or no action?” becomes the rating item, 
“Was he actively moving about most of the 
day?” Sometimes these differences seem 
slight but, as is well known, slight differences 
in phrasing often can have great effect. 

The absence of exact equivalence between 
the definitions of the Actual Behavior and 
the Rated Behavior variable sets is regret- 
table; this consideration alone renders prob- 
lematic the significance of subsequent Actual 
Behavior and Rated Behavior matrix com- 
parisons that D’Andrade and Shweder have 
interpreted as evidence for the essential in- 
validity of “rated behaviors.” 

Emphasis solely on the correlations between
traits ignores consideration of the
validity of specific traits. We believe that
most psychologists will indeed wish to know 
the specific validity of traits or categories— 
data deemed irrelevant by D’Andrade and 
Shweder. A moment of reflection will indi- 
cate to the reader that if the traits or cate- 
gories being evaluated are validly measured, 
then the correlations between the traits or 
categories must be valid. However, the con- 
verse of this relationship does not apply; 
correlations between traits can exist and 
even be valid even though each trait is entirely
invalid as a basis for ordering individuals.¹
Therefore, it is of interest to compare
how similarly subjects are ordered by
“rated behavior” scores and by “actual be- 
havior” scores posited as the criterion for 
validity. In the Borgatta et al. (1958) study, 
the mean correlation between corresponding 
Rated Behavior and Actual Behavior vari- 
ables is .25, with a range from .14 through 
.38. In the Mann (1959) study, the mean 
correlation of corresponding Rated Behavior 
and Actual Behavior variables is, for the
Rated Behavior variables based on group 
member ratings, .33, with a range from .10 
to .55. For the Rated Behavior variables 
based on ratings by the Actual Behavior 
coder, the mean Rated Behavior-Actual Behavior
correlation is .41, with a range from
.18 to .65. In the Newcomb (1929) study,
the mean Rated Behavior - Actual Behavior 
correlation was .49, with a range from .16 to 
.73, for one group of boys, and .43, with a 
range from .02 to .68, for the second group 
of boys. 
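The height-and-weight illustration given in Footnote 1 can be checked numerically. The sketch below uses hypothetical heights and weights for eight subjects and a seeded random generator; all values and function names are assumptions made for the illustration. Reassigning the intact height-weight pairs to subjects at random leaves the correlation between the two measures untouched, whereas the correlation between each subject's true height and the reassigned height (a stand-in for "validity") averages near zero over many reassignments.

```python
import random
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical true values for eight subjects (position = subject).
heights = [150, 158, 163, 168, 172, 177, 183, 190]  # cm
weights = [52, 60, 58, 66, 71, 69, 80, 88]          # kg


def reassign(rng):
    """Hand the intact (height, weight) pairs to subjects in random order."""
    pairs = list(zip(heights, weights))
    rng.shuffle(pairs)
    return [h for h, _ in pairs], [w for _, w in pairs]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

r_true = pearson(heights, weights)

# One reassignment: the trait intercorrelation is preserved.
h_new, w_new = reassign(rng)
r_reassigned = pearson(h_new, w_new)

# Many reassignments: validity (true vs. reassigned height) averages near zero.
validities = []
for _ in range(2000):
    h_shuffled, _ = reassign(rng)
    validities.append(pearson(heights, h_shuffled))
mean_validity = mean(validities)
```

Any single reassignment may by chance yield a nonzero validity coefficient; only the expectation over reassignments is zero, which is why the sketch averages over many shuffles.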

¹ As a simple physical illustration, consider a measure
of height and a measure of weight and the consequent
correlation between height and weight. If, for the sample
of subjects involved, the paired scores for height and
weight were to be randomly reassigned, the correlation
between height and weight would be unchanged, but the
validity of height and the validity of weight can be
expected to be zero.

All of these figures warrant correction for
attenuation due to the unreliability of the
Rated Behavior and Actual Behavior measures
(Block, 1963, 1964). However, unreliability
of measurement is not noted as a
factor operating to diminish the correspondence
between Rated Behavior scores and Actual
Behavior scores. Because the three studies
provide no (or technically unsound) reliability
coefficients, we cannot estimate what
the relationships between the Rated Be- 
havior and Actual Behavior variables would 
be if they were psychometrically improved. 
For many of the pairings, however, we can 
expect the correlation to increase appreciably.²
Nevertheless, even as they stand, and
recognizing the definitional discrepancies al- 
ready noted, many of the Rated Behavior — 
Actual Behavior correlations cannot be con- 
sidered low, especially when some of the 
additional attenuating factors to be intro- 
duced below are recognized. 
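
The correction for attenuation referred to above can be sketched with the classical formula, in which an observed correlation is divided by the square root of the product of the two reliabilities. Because the three studies report no usable reliability coefficients, the reliabilities in this sketch (.60 and .50) are purely hypothetical, and the function name `disattenuate` is ours; only the observed value of .25 comes from the text.

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Observed mean Rated Behavior - Actual Behavior correlation
# reported for the Borgatta et al. (1958) study.
observed = 0.25

# Hypothetical reliabilities, chosen only for illustration.
corrected = disattenuate(observed, rel_x=0.60, rel_y=0.50)
print(round(corrected, 2))
```

With perfectly reliable measures (both reliabilities equal to 1) the function returns the observed value unchanged; the lower the assumed reliabilities, the larger the corrected estimate.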

The use of frequency counts of behavior 
is not a sufficient means of operationalizing 
complex psychological concepts. Conceptual 
similarity advocates recommend, “for record- 
ing the actual behavior of the subjects, . . . 
count frequencies of different kinds of be- 
havior . . . using a simple coding scheme” 
(D’Andrade, 1974, p. 162). The several as- 
sumptions underlying this epistemological 
position are bothersome on several grounds:

1. D’Andrade and Shweder write as if 
“actual behavior” exists; it does not. Be- 
havior in the raw has an infinite variety of 
facets and pattern possibilities and therefore 
can only be studied by selection. The act of 
selection, however done, represents a con- 
structive and theoretical assertion about the 
world that wins its justification by the (ultimately
personal) nomological network with
which it subsequently can be surrounded. 
Kaplan (1964) has written tellingly regard- 
ing the philosophical issues in making the 
decision as to what to observe. More re- 
cently, Neisser (1976) has summarized vari- 
ous lines of evidence from cognitive psychol- 
ogy that indicate the human organism is an 
active, schematizing but also schemata-modi- 
fying being who only selectively connects 
with unencompassable “reality.” 

It should also be noted that the problems
surrounding the notion of “actual behavior”
cannot be escaped by shifting to a focus on
“objective behavior.” So-called “objective”
measures of behavior of course can only reflect
earlier decisions, sometimes not fully
considered, as to where attention should be
addressed. A conceptually inappropriate decision
as to what aspect of behavior to select
as an index of a psychological dimension
cannot be redeemed by the subsequent impartiality
of the coding of the irrelevant behavior.
The scientific merit of “objective behavior”
depends, finally, on the psychological
incisiveness of what is being figured from
the ground, not on the mechanicalness of recording.

What D’Andrade and Shweder appear to 
mean, then, by “actual behavior” are frequency
counts or proportions summarizing
arbitrary and narrow selections from the in- 
finitely broad stream of behavior, observed 
by an individual who need make no infer- 
ences and who records the singled-out be- 
haviors immediately. Such counts or propor- 
tions are viewed as an ultimate or at least 
sufficient criterion of truth—as being the 
benchmark against which all other data can 
be referenced and tested. However, we sug- 
gest that large problems assail even this more 
restricted view. 

2. It is assumed that the Actual Behavior 
data are simple and direct in two senses: No 
pertinent behaviors are omitted, and no in- 
ference is required of the recording observer. 
Both of these assumptions are questionable. 
Interaction Process Analysis, used to provide 
Actual Behavior data in both of the studies 
offered by D’Andrade (1965, 1974), has been 
found to have many problems and is no 
longer viewed with the enthusiasm that sur- 
rounded it when first introduced (see Longa- 
baugh, 1963; O'Dell, 1968; Waxler & Mish- 
ler, 1966). IPA coding is a taxing and inun- 
dating task: The solitary observer records 
“acts” continuously, typically at a rate of 
10 to 20 scores per minute (Bales, 1968), 
observing all the members of the group for 
50 minutes (in the Mann, 1959, study) or
for 120 minutes (in the Borgatta et al., 1958, 
study), coding the onrushing stream of 
“acts” of group members into 12 supposedly 
sufficient, mutually exclusive, and by no
means simple categories (7 categories in the
Mann study) according to the particular 


² For example, in the Mann (1959) study, a single
50-minute session is a doubtful basis on which to
predict reliable Actual Behavior or Rated Behavior
scores on five people on multiple characteristics.


“actor” involved. The task requires superhuman
information-processing capacities; it
is not surprising that flesh-and-blood IPA 
coders have not fared well when evaluated. 
Thus, whereas one IPA coder might record 
10 acts during a particular minute, another 
coder might record 20. Because of the num- 
ber of members in a group, the simultaneity 
of many “acts”, and the differential and os- 
cillating perceptions of observers, various 
group members receive slighted or unreliable 
recordings of their “acts.” Over the inter- 
action period, observers are often unable to 
sustain consistency in their codings. The IPA 
categories are widely criticized as being psy- 
chologically heterogeneous and requiring ap- 
preciable deliberation, awareness of context, 
and inference on the part of the observer. 
The reader should contemplate spending an 
hour or two watching 5 or 10 individuals, 
making decisions every 3 to 6 seconds, differ- 
entiating only between, for example, IPA 
Category 1 (shows solidarity, raises others’ 
status, jokes, gives help, rewards), Category 
2 (shows tension release, shows satisfaction, 
laughs), and Category 3 (agrees, shows pas- 
sive acceptance, understands, concurs, com- 
plies). Is an act involving humor a “show- 
ing of solidarity” or a “showing of tension 
release”? Is a remark a “reward” (Category 
1) or a “showing of satisfaction” (Category 
2)? Certainly, Interaction Process Analysis 
can provide useful data of a kind. But the 
technique is highly complicated and is sub- 
ject to many kinds of problems or vitiating 
influences often not recognized in the early 
IPA studies now used unquestioningly by 
D’Andrade. We suggest that few psycholo- 
gists will award IPA-based data deferential 
status as a criterion.

The Newcomb (1929) “behavior records” 
offered by Shweder (1975) as “actual be- 
havior” also cannot be viewed as involving 
a sufficiently inclusive and noninferential set 
of coding rules. Rather, the recording coun- 
selor was compelled to make delayed and 
highly complex judgments with respect only 
to the categories afforded by Newcomb; be- 
havioral observations not codable into New- 
comb’s categories could not be registered. 
By way of illustration, consider Behavior 
Record Situation 1: “Did [the boy] show 
confidence in his own abilities?” Newcomb
defined four “degrees of response to this situation”:
(a) “Boasted loudly of greater
abilities than he had”; (b) “Spoke confi- 
dently of abilities he really had”; (c) “Ex- 
pressed lack of confidence in own abilities”; 
and (d) “Hesitated even to try his ability” 
(p. 21). It is immediately obvious, when the 
full text of the categories available to the 
counselors is referenced, that the superordinate
questions (regarding “confidence,” “initiative,”
etc.) asked of them required appreciable
psychological inference before the “behavior
record” could be completed. Such
characteristics as “speaking confidently of 
abilities one really has” or being “loyal to 
the leader” are not so behaviorally denotable 
as to require little human judgment. Also, 
only one counselor completed the behavior 
record for each boy, and the extent to which 
another counselor would have completed that 
record in the same way is nowhere assessed. 

We intend no disparagement of the New- 
comb study of a half century ago, a study 
that for its time was in many ways exem- 
plary. We do suggest that because of the 
presence of appreciable psychological infer- 
ence in the behavior records, the sharp dis- 
tinction necessary between “actual behavior” 
and “rated behavior” is, instead, blurred. 

3. In the IPA studies cited by D’Andrade 
(1965, 1974) and in the Newcomb study re- 
used by Shweder (1975), “actual behavior” 
was finally indexed in one of three ways: 
rate of emission of certain categories of be- 
havior, act percentages, or the percentage of 
days on which a categorized behavior had 
been noted. All of these indices of “actual 
behavior” involve some form of averaging 
over behavioral instances. Behavior averag- 
ing is a very useful technique, but the in- 
dices so derived are insensitive to behavior 
context and the relative salience of any par- 
ticular instance of coded behavior. All occur- 
rences of a coded act are equally weighted 
in the “actual behavior” averaging. Thus, in 
Newcomb’s study, a boy who began the 
camp session as shyly noninteractive and 
ended up as comfortably gregarious could 
well earn the same percentage score on the 
interaction dimension as another boy who 
began camp as aggressively interactive and 
became a rejected isolate by the end. The
behavior percentage scores used to reference 
“actual behavior” are logically unable to 
index such change. On the other hand, the 
“rated behaviors” in the Newcomb study 
have the possibility of expressing such 
changes as may have occurred. Making their 
ratings at the end of the camp session, the 
raters had the opportunity to express their 
integration of all of their understandings of 
each boy, including such contextual or se- 
quential or salience effects as they believed 
to be pertinent. In short, useful though the 
averaging basis underlying the “actual be- 
havior” scores may generally be, such scores 
may not be able to capture important infor- 
mation and recognitions codable by raters 
who use long-term memory. Besides the pos- 
sibility of “systematic distortion” emphasized 
by D’Andrade (and Shweder) when raters 
invoke long-term memory, there is also the 
possibility that raters will be able to discern 
relationships, meanings, and trends that were
denied identifiability in the highly summarizing
“actual behavior” averages. Again, we
suggest that most personality psychologists
will not accept such averages uncritically as 
a criterion of “actual behavior.” 

Extraneous influences distorting correlation
coefficients and their patterning are not considered.
Many factors influence individual
correlation coefficients and, thus, the subsequently
organized matrix of correlation coefficients.
Sometimes these coefficient-influencing
factors are not of great importance
vis-à-vis the broad questions being asked of
the data. But when conclusions hinge on
the particular values of particular pattern-making
or pattern-breaking coefficients, these
considerations become crucial to evaluate and
to control. Yet, when D’Andrade and Shweder
contrast matrices, they ignore the many
psychometric and statistical considerations
that influence the observed differences in
correlational patterning.

1. The distributions of the Actual Behavior
scores and the Rated Behavior scores
are not considered. They are often likely to
be highly skewed or not unimodal, particularly
the IPA rates or percentages representing
“actual behavior.” Correlations between
variables are importantly influenced (and
always lowered) by differences in the shapes
of the distributions being related (Carroll,
1961).

2. The logic by which IPA “actual be- 
havior” scores are generated for each subject 
necessarily creates an appreciable degree of 
negative correlation among IPA “actual behavior”
scores.³ The degree of negativity increases
as the number of IPA categories decreases;
especially affected is the Mann
(1959) 7-category IPA system. Although the 
entailed negative relation among IPA “ac- 
tual behavior” scores does not uniformly 
depress the intercorrelations among such IPA 
scores, the average intercorrelation among 
these IPA scores will be moved toward nega- 
tivity by this influence. The interpretation 
by D’Andrade (1974) of the generally lower 
intercorrelations among his Actual Behavior 
matrices as compared to his Rated Behavior 
matrices does not recognize this lowering 
effect on the Actual Behavior coefficients. 
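
The built-in negativity described in this point can be sketched by simulation. As an illustrative assumption only, each hypothetical subject is given k independent, exponentially distributed activity levels, which are then converted to shares of that subject's total, as act percentages are. For such exchangeable shares the expected average intercorrelation is -1/(k - 1), that is, about -.17 for 7 categories and about -.09 for 12, even though the raw levels are uncorrelated by construction.

```python
import random


def pearson(x, y):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_intercorrelation(columns):
    """Average of all pairwise correlations among a list of score columns."""
    k = len(columns)
    rs = [pearson(columns[i], columns[j])
          for i in range(k) for j in range(i + 1, k)]
    return sum(rs) / len(rs)


rng = random.Random(1)
k = 7              # number of categories, as in the Mann (1959) system
n_subjects = 4000  # hypothetical sample size

# Independent raw activity levels (an assumption made for illustration).
raw = [[rng.expovariate(1.0) for _ in range(k)] for _ in range(n_subjects)]
# Shares of each subject's own total: the "slices of a pie."
shares = [[v / sum(row) for v in row] for row in raw]

raw_columns = [list(col) for col in zip(*raw)]
share_columns = [list(col) for col in zip(*shares)]

mean_r_raw = mean_intercorrelation(raw_columns)
mean_r_shares = mean_intercorrelation(share_columns)
expected = -1.0 / (k - 1)

print(round(mean_r_raw, 3), round(mean_r_shares, 3), round(expected, 3))
```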

3. The Actual Behavior data used were 
derived from only one observer-recorder; 
four of the five sets of Rated Behavior data 
were based on pooled judgments from a 
number of observers. Ceteris paribus, data 
from a single observer are less dependable 
than data derived by pooling the data from 
a number of observers. D’Andrade and 
Shweder did not consider that to an extent 
unassessable within the data they used, the 
discrepancies between their Actual Behavior 
and Rated Behavior scores and consequent 
matrices are a function of this source of unreliability
in the Actual Behavior data.
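
The advantage of pooled judgments over a single observer can be sketched with the Spearman-Brown formula, which gives the reliability of a composite of n parallel observers from the reliability of one. The single-observer reliability of .40 and the pool of five observers are hypothetical figures chosen for illustration; none of the studies discussed reports these values.

```python
def pooled_reliability(single_reliability, n_observers):
    """Spearman-Brown: reliability of the mean of n parallel observers."""
    r = single_reliability
    n = n_observers
    return n * r / (1 + (n - 1) * r)


# Hypothetical single-observer reliability.
single = 0.40
for n in (1, 2, 5, 10):
    print(n, round(pooled_reliability(single, n), 2))
```

Under these assumptions the composite of five observers is considerably more dependable than any one of them, which is the ceteris paribus point made in the text.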

4. D’Andrade’s IPA analyses involve six
or seven IPA categories or variables. Six or
seven variables generate, respectively, 15 or
21 coefficients of correlation. The correlation
of these Actual Behavior coefficients with
corresponding Rated Behavior coefficients
yields intermatrix correlations based on ns
of 15 or 21, extremely small “sample” sizes.

³ Because an individual’s IPA category scores are
calculated with reference to his total number of acts,
the resulting scores are like slices of a pie: as one
slice becomes large, the remaining slices are constrained
to be small. Guilford (1952) has warned of
the dangers of misinterpreting correlations among
such reciprocally dependent scores.

Furthermore, these 15 or 21 “observations”
are not independent of each other; the cor- 
relation between Categories X and Y and the 
correlation between Categories X and Z will 
constrain the correlation between Categories 
Y and Z. The distribution of nonindependent 
observations will tend to be strange and un- 
stable, reacting strongly to changes in the 
constitution of the set of six or seven cate- 
gories that happen to have been used. Cor- 
relations based on nonindependent observa- 
tions are not correlations in the usual sense 
of the term, since the interdependency of 
observations violates a fundamental assump- 
tion of the statistical logic underlying the 
correlation coefficient. One can no longer 
interpret such fluky coefficients in terms of 
amount of variance explained or significance 
level. The correlation between two correla- 
tion matrices has a certain descriptive value, 
but the intuitive appreciation of these co- 
efficients requires the further recognition 
that these coefficients are indeed of an un- 
specifiable but certainly bizarre metric. 
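
Two of the arithmetic facts used in this point can be sketched briefly: the count of distinct coefficients among k variables, k(k - 1)/2, and the way two correlations constrain a third. The bound used is the standard one that follows from requiring a 3 x 3 correlation matrix to be positive semidefinite. The function names and the example values of .80 are ours and are illustrative only.

```python
import math


def n_coefficients(k):
    """Number of distinct correlations among k variables."""
    return k * (k - 1) // 2


def yz_bounds(r_xy, r_xz):
    """Admissible range for r_yz given r_xy and r_xz."""
    center = r_xy * r_xz
    half_width = math.sqrt((1 - r_xy ** 2) * (1 - r_xz ** 2))
    return center - half_width, center + half_width


print(n_coefficients(6), n_coefficients(7))

# If X correlates .80 with both Y and Z, the Y-Z correlation
# is no longer free to take any value between -1 and +1.
low, high = yz_bounds(0.80, 0.80)
print(round(low, 2), round(high, 2))
```

The entries of a correlation matrix therefore cannot be treated as independent observations, which is why the intermatrix “correlation” cannot be interpreted in the usual way.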

5. The direction of scoring or rating a
psychological variable is usually arbitrary; 
one can algebraically reflect a variable with- 
out changing its conceptual import. Instead 
of assigning high numbers to indicate “ex- 
troversion,” we could as well assign high 
numbers to mean “absence of extroversion.” 
This long-established recognition has severe 
implications for the “correlation” between 
two matrices. 

Changes in the direction in which a vari- 
able is scored reverse the signs of the corre- 
lations of that variable with other variables 
and will fundamentally influence the “corre- 
lation” subsequently computed between these 
revised matrices (Tellegen, 1965). In the 
D’Andrade and Shweder analyses, bipolari- 
ties tended to characterize their redundancies 
of measurement. Had the variables defining 
one end of their dimensions been defined 
instead in reflected terms, the subsequent 
correlation matrix would have had fewer 
negative entries and the “correlation” be- 
tween correlation matrices would have been 
importantly smaller. Thus, in yet another 
way, the results of these analyses are shaped
by adventitious aspects of the data. 
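
The effect of reflecting variables can be sketched with two small, entirely hypothetical correlation matrices for four variables, two defining each pole of one bipolar dimension. The matrices share their pattern of signs but not their magnitudes. Reflecting the two variables at one pole removes the negative entries from both matrices, and the intermatrix “correlation” computed over the off-diagonal entries changes sharply. The values are invented for illustration and come from none of the studies discussed.

```python
def pearson(x, y):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def off_diagonal(matrix):
    """Entries above the diagonal, row by row."""
    k = len(matrix)
    return [matrix[i][j] for i in range(k) for j in range(i + 1, k)]


def reflect(matrix, reflected):
    """Reverse the scoring direction of the variables listed in `reflected`."""
    k = len(matrix)
    sign = [-1 if i in reflected else 1 for i in range(k)]
    return [[sign[i] * sign[j] * matrix[i][j] for j in range(k)]
            for i in range(k)]


# Hypothetical matrices; variables 0 and 1 mark one pole, 2 and 3 the other.
rated = [[1.00, 0.50, -0.40, -0.60],
         [0.50, 1.00, -0.55, -0.45],
         [-0.40, -0.55, 1.00, 0.35],
         [-0.60, -0.45, 0.35, 1.00]]
similarity = [[1.00, 0.30, -0.50, -0.35],
              [0.30, 1.00, -0.30, -0.55],
              [-0.50, -0.30, 1.00, 0.60],
              [-0.35, -0.55, 0.60, 1.00]]

r_before = pearson(off_diagonal(rated), off_diagonal(similarity))

# Reflect the two variables at the second pole in both matrices.
rated_reflected = reflect(rated, {2, 3})
similarity_reflected = reflect(similarity, {2, 3})
r_after = pearson(off_diagonal(rated_reflected),
                  off_diagonal(similarity_reflected))

print(round(r_before, 2), round(r_after, 2))
```

In this constructed case nearly all of the original correspondence is carried by the shared pattern of positive and negative entries rather than by any agreement in magnitudes.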

6. The preceding psychometric recitative
has listed seriatim a number of concerns
about the D’Andrade and Shweder numerical
analyses. Their influences, in the aggregate
and in interaction, are beyond ready evalua- 
tion, but can be expected to be appreciable. 

What can be concluded from the D’An- 
drade and Shweder reanalyses seeking to un- 
confound the conceptual similarity interpre- 
tation from the isomorphism view? D’An- 
drade reported a correlation of .34 between 
the Actual Behavior and Rated Behavior 
matrices in the Borgatta et al. (1958) study; 
for the Mann (1959) study, he reported 
Actual Behavior - Rated Behavior matrix in- 
tercorrelations averaging .02 and .24 for the 
pooled judgment and individual observer 
Rated Behavior matrices, respectively. Shwe- 
der (1975) reports Actual Behavior — Rated 
Behavior matrix intercorrelations of .51 and 
.38. Given the psychometric, definitional, 
conceptual, and methodological problems af- 
fecting the comparison of the Actual Be- 
havior and Rated Behavior matrices, we 
suggest that it is impossible to view these
“coefficients” as compelling of any interpre- 
tation at all. Instead of being impressed that 
these “coefficients” are so low, one can as 
well be surprised that they are so high.


Controlling the Correspondence of Rated 
Behavior and Conceptual Similarity 
Structures 


The primary demonstration of the concep- 
tual similarity position has been to show 
that a variety of previously published Rated 
Behavior matrices or analyses of Rated Be- 
havior matrices are in strong correspondence 
with Conceptual Similarity matrices or anal- 
yses of Conceptual Similarity matrices that 
are entirely separately generated. D’Andrade 
and Shweder view the observed correspond- 
ence between the Rated Behavior and Con- 
ceptual Similarity matrices they have ex- 
amined as an invariant and widely generaliz- 
able datum demanding recognition in their 
own preferred terms. In disagreement, we 
suggest that the correspondences noted are 
a previously unrecognized function of the
high redundancy and the structure of that 
redundancy that just happened to be present 
in the particular variable sets D’Andrade
and Shweder sampled. 

By varying the amount and structure of 
variable redundancy, thus influencing the ex- 
tent and pattern of intercorrelations among 
variables, we have been able to manipulate 
the subsequent degree of correspondence be- 
tween Rated Behavior and Conceptual Simi- 
larity matrices. Our logical demonstration, 
together with the recognition of the kinds of 
redundancy present in the variable sets eval- 
uated by D’Andrade and Shweder, diminishes 
the implicativeness of their findings. 


Method 


A Demonstration of the Workings 
of Redundancy 


For our manipulational purposes, a set of personality
variables was required from which subsets
could be selected that met specified redundancy
designs. The California Q-set (CQ-set; Block,
1961, 1971), a broadly ranging set of personality-descriptive
items, was used because of its extensive
use in personality assessment and because of its
multidimensional characteristics. The CQ-set had
been used as a basic personality description procedure
in a longitudinal study of personality development
(Block, 1971). In this study, the person-[...]
[...] men and 84 women were [...]
CQ-set at three time [...]
to th[...]
“quality controls” used [...]
[...] process of generating these [...]
[...] principal components solution [...]
[...] varimax criterion, yielding 15 [...]
resulting factor loading matrix provided the basis
on which selection of stimulus items was made.

Three sets of 12 items each were selected. The
first set of CQ items was chosen on the basis of
their high loadings, either positive or negative, on
the first factor. The 12 items, 6 positive and 6
negative, had a mean first factor loading, disregarding
sign, of .73 (all calculations averaging correlations
used Fisher’s r-to-z transformation). These
12 CQ items represent a unidimensional and bipolar
redundancy and will hereafter be called the Unidimensional,
Bipolar Redundancy variable set. The
66 intercorrelations among these 12 variables constitute
a Unidimensional, Bipolar Redundancy Rated
Behavior matrix. For each of the three time periods,
and for each of the sexes, the Unidimensional, Bipolar
Redundancy Rated Behavior matrices were
calculated.
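
The averaging of correlations by Fisher’s r-to-z transformation mentioned above can be sketched as follows: each coefficient is converted to z by the inverse hyperbolic tangent, the z values are averaged, and the mean is converted back. The function name and the example values are ours and are illustrative only; they are not the loadings from the present analysis.

```python
import math


def mean_correlation(rs):
    """Average correlations via Fisher's r-to-z transformation."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Illustrative values only.
example = [0.20, 0.90]
print(round(mean_correlation(example), 3))  # Fisher-averaged value
print(round(sum(example) / len(example), 3))  # simple arithmetic mean
```

The transformed average exceeds the arithmetic mean here because the z scale stretches large coefficients more than small ones.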


[...] will hereafter be called the Multidimensional, Bipolar
Redundancy variable set. The 66 intercorrelations
among these 12 variables constitute a Multidimensional,
Bipolar Redundancy Rated Behavior
matrix. The six Multidimensional, Bipolar Redundancy
Rated Behavior matrices (derived from the
data on three time periods and both sexes) were
calculated.

The third set of 12 CQ items was chosen so 
that each item represented 1 of 12 orthogonal fac- 
tors. Some of the later factors extracted are defined 
by only a few CQ items. In selecting an item to 
represent a factor, some attention was paid to the
item’s loadings on the other 11 factors so as to
preserve item orthogonality but, because of the
restrictions on possible item choices, the items 
finally selected to represent the 12 factors can be 
said to only approximate orthogonality. Each item 
had a factor loading on its respective factor of at 
least .62, the mean of these factor loadings being 
.78. The mean loading of the items representing
the last 11 factors with respect to the first factor
was .16. We shall call this third set of CQ items
the Approximately Orthogonal variable set. The 66 
intercorrelations among these 12 variables consti- 
tute an Approximately Orthogonal Rated Behavior 
matrix. The six Approximately Orthogonal Rated 
Behavior matrices were calculated.

The three sets of variables selected are presented
in Table 1.

Within each variable set, each CQ item was paired 
with every other item, the 66 pairings being typed 
onto index cards. Each deck of 66 cards was given
to conceptual similarity judges with the following
instructions: 


The purpose of this research is to obtain judg- 
Table 1
California Q-Set Items Constituting the Three Variable Sets


Items constituting the Unidimensional, Bipolar Redundancy set

4. Is a talkative individual.
17. Has a rapid personal tempo; behaves and acts quickly.
40. Is facially and/or gesturally expressive.
49. Behaves in an assertive fashion in interpersonal situations.
50. Tends toward undercontrol of needs and impulses; unable to delay gratification.
89. Is self-dramatizing; histrionic.

22. Tends toward overcontrol of needs and impulses; binds tensions excessively; delays gratification unnecessarily.
37. Is vulnerable to real or fancied threat, is generally fearful.
39. Reluctant to commit self to any definite course of action; tends to delay or avoid action. (Uncharacteristic end indicates quick to act.)
45. Aloof, keeps people at a distance; avoids close interpersonal relationships.
72. Tends to ruminate and have persistent, preoccupying thoughts (either pathological or creative).
87. Is emotionally bland; has flattened affect.


Items constituting the Multidimensional, Bipolar Redundancy set

4. Is a talkative individual.
17. Has a rapid personal tempo; behaves and acts quickly.

45. Aloof, keeps people at a distance; avoids close interpersonal relationships.
87. Is emotionally bland; has flattened affect.

24. Shows condescending behavior in relations with others.
33. Is subtly negativistic; tends to undermine and obstruct or sabotage.

5. Behaves in a giving way toward others.
15. Behaves in a sympathetic or considerate manner.

42. Has a brittle ego-defense system; has a small reserve of integration; would be disorganized and maladaptive when under stress or trauma.
52. Is self-defeating.

23. Is productive; gets things done.
67. Is consciously unaware of self-concern; feels satisfied with self.


Items constituting the Approximately Orthogonal set

7. Appears to have a high degree of intellectual capacity (whether actualized or not). (Originality is not necessarily assumed.)
12. Is thin-skinned; vulnerable to anything that can be construed as criticism or an interpersonal slight.
17. Has a rapid personal tempo; behaves and acts quickly.
24. Shows condescending behavior in relations with others.
[...] Engages in personal fantasy, daydreams, and speculations.
56. Is concerned with own body and the adequacy of its physiological functioning. (Body cathexis.)
62. Enjoys aesthetic impressions; is aesthetically reactive.
68. Has a clear-cut, internally consistent personality. (Amount of information available before sorting is not intended here.)
13. Interested in members of the opposite sex. (At opposite end, implies absence of such interest.)
14. Is physically attractive. (The cultural criterion is to be applied here.)
85. Tends to offer advice.
86. Values own independence and autonomy.


[...] similarity [...] descriptions. [...] To
familiarize yourself with the task, spread the cards
out and skim through them. Then, taking the
cards in any order you wish, read the two descriptions
on a card and decide how conceptually
alike are the two self-descriptions. Make each
judgment according to the following scale, placing
each card in the pile corresponding to your
judgment: 7 = extremely similar; 6 = moderately
similar; 5 = mildly similar; 4 = unrelated, neither
similar nor dissimilar; 3 = mildly dissimilar; 2 =
moderately dissimilar; 1 = extremely dissimilar.


These instructions were adapted from those previ- 
ously employed by D’Andrade and Shweder. 

For the Unidimensional, Bipolar Redundancy 
variable set and the Multidimensional, Bipolar Re- 
dundancy variable set, 10 judges each made simi- 
larity judgments. For the Approximately Orthogonal 
set, 20 judges were used. No judge rated more than 
one variable set. The judges were undergraduates 
in a course on the psychology of personality and 
were uninformed as to both the nature and the 
purpose of the study. Two of the 20 judges of the 
Approximately Orthogonal variable set did not cor- 
rectly complete the task, reducing this final total 
to 18.


Results 


The 66 judgments of conceptual similarity 
by a judge constitute his or her Conceptual Similarity
matrix. When the individual Conceptual
Similarity matrices of the Unidimensional, 
Bipolar Redundancy judges are correlated 
with the six Unidimensional, Bipolar Redun- 
dancy Rated Behavior matrices, the average 
Conceptual Similarity - Rated Behavior intermatrix
“correlation” is .63, ranging from
.18 to .93. When the individual Conceptual
Similarity matrices of the Multidimensional, 
Bipolar Redundancy judges are correlated 
with the six Multidimensional, Bipolar Re- 
dundancy Rated Behavior matrices, the aver- 
age Conceptual Similarity — Rated Behavior 
intermatrix “correlation” is .50, ranging from 
-.11 to .75. When the individual Conceptual
Similarity matrices of the Approximately 
Orthogonal judges are correlated with the 
six Approximately Orthogonal Rated Behavior
matrices, the average Conceptual Similarity -
Rated Behavior intermatrix “correlation”
is .21, ranging from -.13 to .46.⁴ As
expected, the mean Conceptual Similarity — 
Rated Behavior “correlations” of the Unidi- 
mensional, Bipolar Redundancy and Multi- 
dimensional, Bipolar Redundancy groups are 
not significantly different (t = .90, p < .38).

When the Unidimensional, Bipolar Redundancy
and Multidimensional, Bipolar Redundancy
judging groups are combined and contrasted
with the Approximately Orthogonal
judging group, the difference between their 
respective mean Conceptual Similarity — 
Rated Behavior intermatrix “correlations” is
highly significant (t = 5.17, p < .00001).
Thus these data reveal, strikingly, that Conceptual
Similarity - Rated Behavior intermatrix
correspondence is high when bipolar redundancy
is high and is low when bipolar
redundancy is low. Judges appear to be
rather faithful reflectors of the pattern of
redundancy empirically existing within the 
set of variables they are evaluating for con- 
ceptual similarity. 

As a related finding, the average interma- 
trix “correlation” among the Conceptual Sim- 
ilarity matrices contributed by the 10 Uni- 
dimensional, Bipolar Redundancy judges was 
.41. The average “intercorrelation” among
the Conceptual Similarity matrices of the
10 Multidimensional, Bipolar Redundancy
judges was .36. The average “intercorrelation”
among the Conceptual Similarity matrices
of the 18 Approximately Orthogonal
judges was .21. Thus, it appears that the 
ability of judges to agree among themselves 
in their judgments of conceptual similarity is 
a clear function of the pattern of redundancy 
present among the variables being evaluated. 


Redundancy in the Variable Sets 
Evaluated by D’Andrade and Shweder 


The pertinence of our demonstration of 
the influence of redundancy on the “correlation”
between Rated Behavior and Conceptual
Similarity matrices depends on the extent
to which the amount and pattern of
redundancy, previously unrecognized as operative,
characterizes the variable sets previously
found to be “retrievable” by Conceptual
Similarity judgments.

Inspecting the variable sets that have been 
used, it is clear, we suggest, that appreciable 
and highly patterned redundancy does in- 
deed characterize them. 

⁴ It should be noted that the generally small and
consistent loadings of the Approximately Orthogonal
variables on the first factor contribute to the average
Conceptual Similarity - Rated Behavior intermatrix
correlation of .21.

The 20 personality terms of Norman
(1963), further analyzed by D’Andrade
(1965), were a carefully refined and well-studied
variable set intended to provide redundant
specification of an orthogonal five-factor
structure deemed conceptually suitable
for encompassing peer-nomination data.
Our Multidimensional, Bipolar variable set 
was designed to emulate the redundancy de- 
sign built into the Norman terms. 

The six rated variables in the Borgatta et 
al. study were included in a factor analysis 
(1958, Table 3). Four of the variables 
loaded highly on their first interpersonal 
factor; the remaining two loaded highly on 
their second factor. Our Multidimensional, 
Bipolar variable set approximately represents 
the redundancy characteristics of the Borgatta
et al. set of rated variables.

The six (or seven) rated variables in the 
Mann (1959) study were not submitted to 
a factor analysis but, since they parallel 
closely the variables used by Borgatta et al., 
it is probably safe to assume that the pat- 
tern of redundancy characteristic of the 
earlier study also characterizes the later 
study. 

The 26 variables used by Newcomb 
(1929) were not and cannot now be studied 
empirically with regard to their redundancy 
characteristics. However, because Newcomb
selected his variables for their presumed rele- 
vance to the single broad dimension of extro- 
version-introversion, we suggest that appre- 
ciable redundancy of the unidimensional, 
bipolar type characterizes the Newcomb vari- 
able set. 

The three other variable sets whose ma- 
trices are reported by Shweder (1975, 1977a) 
to be “retrievable” via conceptual similarity
judgments—those of Bales (1970), of Sears,
Maccoby, and Levin (1957), and of Block
(1965)—all were factor analyzed, as re- 
ported in their initial publications, and all 
manifested appreciable redundancy of the 
multidimensional, bipolar, or unidimensional, 
bipolar type, according to the investigators’ 
measurement intentions.

Thus, bipolar redundancy within the vari- 
able sets appears to be a strong and constant 
feature of the D’Andrade and Shweder anal- 
yses. This bipolar redundancy is generally 
antonymic (and synonymic) in nature, thus 
offering a semantic basis to judges for esti-
mating, if only approximately, the nonzero
correlations among various pairs of variables 
in the set. To the extent that Rated Be- 
havior variables are measured validly but not 
with antonymic (and synonymic) redun- 
dancy, the possibility of correspondence 
(“retrievability”) between a Rated Behavior 
and a Conceptual Similarity matrix tends to 
disappear because the “similarity structure” 
of each matrix tends to disappear. The ex- 
ternal validity of the variables in the vari- 
able set, either singly or as subsets, is in no 
way contingent on the “retrievability” or 
the absence of “retrievability” of a Rated 
Behavior matrix by a Conceptual Similarity 
matrix. To this last concern—the predictive 
and concurrent validity of personality rat- 
ings—we now turn. 


Some Empirical Relationships Currently 
Beyond the Reach of the Conceptual 
Similarity Position 


For assessment psychologists, a frustrating 
aspect of the current controversy has been 
the narrowly delineated arena in which the 
argument has been waged. In this section we 
introduce some relationships of a kind that 
personality psychologists find reinforcing of 
their belief in reliable individual differences
with regard to personality parameters of con-
sequence in the real world. We believe the
kinds of across-time or across-domain rela- 
tionships to be reported are either unassimi- 
lable to the conceptual similarity interpreta- 
tion or else are encompassable only by a 
string of conjectures so complicated and ten- 
uous as to deprive the conceptual similarity 
view of interest until such time as the neces- 
sary empirical support becomes available. 
Our strategy for affirming the usefulness and
even the necessity of the conceptual-em-
pirical enterprise of personality requires
a move away from an em-
phasis on classifications of personality vari-
ables. We focus instead on the construct
validity of classifications of individuals with 
respect to personality variables. Space limi-
tations permit only brief citation of some of
the available data confronting the conceptual 
similarity position. 

The six Approximately Orthogonal Rated
Behavior matrices could not be well “re-
trieved” by the Approximately Orthogonal
Conceptual Similarity matrix. By the reason- 
ing of D’Andrade and Shweder, it follows 
therefore that “an illusion of . . . behavioral 
consistency” (Shweder, 1975, p. 456) was 
not created by the conceptual similarity 
among the personality variables involved 
simply because there was not appreciable 
“conceptual likeness” among the variables 
being evaluated. If there is behavioral con- 
sistency nevertheless, such consistency is not 
illusory. 

The “behavioral consistency view” or “in- 
dividual difference theory of personality,” 
as defined by Shweder (and as rejected by 
him), assumes that an individual’s behavior 
will tend to be “consistent from one time to 
another, and different from the behavior 
other people would manifest” (Child, 1968, 
p. 83). A direct test of behavioral consist- 
ency is the correlation of a Rated Behavior 
variable as rated at one time with the same 
Rated Behavior variable as independently 
rated at another time.

For the 12 Approximately Orthogonal 
Rated Behavior variables, across-time corre- 
lations are available, for the sexes separately, 
connecting the junior high school (JHS) and 
senior high school (SHS) ratings (a period 
of 3 years), connecting the SHS and Adult-
hood ratings (a period of about 20 years),
and connecting the JHS and Adulthood rat- 
ings (a period of about 23 years). It should 
be recalled again that these ratings were 
based on entirely independent sets of data 
and entirely independent sets of judges. 

For the male and female samples from 
JHS to SHS, the 12 Approximately Orthogo- 
nal rated variables correlate, on the average, 
.42 and .40, respectively. If these correlations
are adjusted for their attenuation due to un- 
reliability, the mean correlations rise to .68 
and .66. From SHS to Adulthood, the average 
correlations are .27 and .30 for the two sexes, 
respectively. Adjusted for attenuation, these
mean correlations become .41 and .49. From
JHS to Adulthood, the average correlations 
are .24 and .28 for the two sexes, respec-
tively. Adjusted for attenuation, these mean
correlations are estimated to be .36 and .46.
It should be noted that the 12 variables for
which across-time person-ordering consistency
is reported were not selected for considera-
tion because of the across-time consistency
they manifested. Many more such across-
time correlations exist, the full set being re-
ported by Block (1971). It should also be
recognized that these across-time correla-
tions are further and appreciably lowered by
such developmental changes and transforma-
tions as take place from early to late adoles-
cence [. . .] thirties.
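
The adjustment for attenuation used above can be illustrated with the classical (Spearman) correction, in which an observed correlation is divided by the square root of the product of the two reliabilities. The sketch below is only an illustration under stated assumptions: the function names are ours, and the reliability of .62 is a hypothetical value chosen for the example, not a coefficient reported in the text.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Classical correction for attenuation: r_xy / sqrt(r_xx * r_yy)."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


def implied_mean_reliability(r_observed, r_corrected):
    """Geometric-mean reliability implied by an observed/corrected pair."""
    if r_corrected == 0:
        raise ValueError("corrected correlation must be nonzero")
    return r_observed / r_corrected


# Hypothetical reliability of .62 at both occasions (an assumption, not a reported value).
corrected = disattenuate(0.42, 0.62, 0.62)

# Working backward from the reported male JHS-to-SHS figures (.42 observed, .68 adjusted).
implied = implied_mean_reliability(0.42, 0.68)

print(round(corrected, 2), round(implied, 2))
```

Read this way, an adjusted value of .68 for an observed .42 corresponds to rating reliabilities whose geometric mean is a little above .6; the actual reliabilities used are not given in this passage.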

All things considered, we suggest that [. . .]
ality.” A close and empirically supported ex-
planation of these and related data in con-
ceptual similarity or other attributional terms
has yet to appear.
In seeking to dispense with personality
ratings and personality inventories on the
basis of their conceptual similarity argu-
ments and analyses, D’Andrade and Shweder
appear to have presumed the essential in-
validity of personality assessment procedures.
However, there exists a large and diverse
personality assessment literature attesting to
construct-valid connections between assess-
ment measures and the real world that has
been overlooked by D’Andrade and Shweder
(e.g., Block, 1957; Block, 1971, pp. 159- 
168, 189-202, 228-238; Block, Jennings, 
Harvey, & Simpson, 1964; Clausen, 1968; 
Helson, 1971; Jones, 1968, 1971; Kety, 
1974; MacKinnon, 1962; Manheimer & Mel- 
linger, 1967; Peskin, 1973; Robins, 1966; 
Rosenthal, 1970). A priori, it would appear
difficult for a conceptual similarity viewpoint 
to explain the existence of relationships such 
as these, sampled from literally hundreds 
that could have been offered. 


The Need to Deepen, Clarify, and 
Substantiate the Meaning of 
Conceptual Similarity 




underlie personality ratings with a linguistic 
explanation. Leaving aside the empirical re- 
lationships that the conceptual similarity in- 
terpretation must attempt to encompass, we 
note that this proffered explanation does not 
resolve the problem of understanding; it 
merely transfers or translates the problem. 
We must still face up to the thorny question 
of the nature and basis of similarity judg- 
ments. The study of similarity is a pro- 
foundly complex area of conceptual and 
empirical inquiry (Tversky, 1977); the in- 
vocation of “conceptual similarity” as an 
explanation without immediate close consid- 
eration and articulation of the processes by 
which similarity judgments are made is not 
especially helpful. D’Andrade was circum- 
spect in his final claims, acknowledging that 
because of obscurities surrounding the way 
in which similarities are developed and rec- 
ognized, 

It is somewhat ironic that the problem of not 
knowing what is happening when people are used 
as measuring instruments, . . . given as the reason 
why [personality or interactional ratings] produce 
invalid results, returns to plague [his] formulation 
about exactly how these invalid results come about. 
(1974, p. 180) 


Shweder was less cautious in his view. He 
argued that judges make similarity judg- 
ments based on the degree to which contigu- 
ous attitudes go together in making up what 
he called a behavioral type, which he treated 
as a learned cultural construct. Shweder went 
on vigorously: 


What is disputed is that the categories (in our 
terms, personality characteristics) into which people 
sort themselves or others can be induced from ex-
perience. Cultural constructs are not empirical gen-
eralizations. (Shweder, 1977b, p. 938)


It would be useful to see the argument 
and empiricism in support of this strong but 
undetailed position. Our own view is that 
although cultural constructs need not be 
evolved from living in the world, they very 
often are. Individuals frequently indepen- 
dently generate convergent terms to express 
their recognition about the people they en- 
counter. To the extent that language is an 
efficient means of communicating calibrated 
understandings, individuals will or can come 
to agree in their usage of terms to describe
or integrate or summarize the implications of
the behavior of other people (cf. Norman & 
Goldberg, 1966). The developmental sequence 
by which children evolve understandings of 
other people has received some study (Peevers 
& Secord, 1973) and appears to express the 
pragmatic relevance that descriptions of 
others in terms of personality characteristics 
take on for individuals as they interact with 
the people in their world. Such individually 
evolved and yet consensual structurings of in- 
terpersonal experience provide the basis, we 
suggest, for many learned cultural constructs. 
Indeed, with respect to certain personality 
characteristics or constellations rooted in the 
common human biology, very different cul- 
tures may evolve essentially equivalent cul- 
tural constructs encompassing various forms 
of psychopathology. In a splendid essay, 
Murphy (1976) brought together evidence 
from many and diverse societies, primitive 
and advanced, testifying to a larger consen- 
suality in the cultural constructs of psycho- 
pathology that societies separately have ex- 
perientially educed and codified. 

Finally, we note that although the effects 
attributable to manipulation of the redun- 
dancy and structure of the variables being 
rated are large, the range of correlations be- 
tween an individual’s Conceptual Similarity 
matrix and the Rated Behavior matrix is 
interestingly wide. Moreover, the phenomenon 
previously noted (Block, 1977b) was again 
observed: A judge’s degree of consensuality 
in conceptual similarity judging, even for the 
Approximately Orthogonal set, was related to 
his ability to “retrieve” the Approximately 
Orthogonal Rated Behavior matrix (r = .46, 
p < .06).

Separately, we will be studying the possible 
implications of these individual differences. 
There are preliminary indications that these 
differences are reliable and can provide ad- 
ditional perspective on the processes involved 
in the effort to develop, conceptualize, inte-
grate, and apply one’s psychological under-
standing of people. Meanwhile, we suggest
that the inherent intricacy characterizing 
both human judgment and the ongoing stream 
of interpersonal behavior makes it unlikely 
that so simple a principle as conceptual simi-
larity will provide an appreciable explanation
of so complex a process as personality assess-
ment.


Postscript


1. If personality ratings demonstrate concep-
tually expected concurrent and predictive valid-
ity, the semantic “retrieval” of their intercorrela-
tions by Shweder and D’Andrade (1979) takes
on a very different implication and significance.
We cited a sampling of evidence, from a variety 
of sources, that ratings can generate replicable 
external relationships that support their empirical 
validity. Shweder and D’Andrade state that evi- 
dence of empirical validity is “beside the point” 
(p. 1079) in evaluating the question of the pres- 
ence or absence of conceptual factors in judges; 
we do not think so. 

2. Interested readers will simply have to judge 
for themselves, by referring to the original 
sources, whether the behavior categories worded 
differently from the rating dimensions “still main- 
tain equivalent meaning,” (p. 1078) as Shweder 
and D’Andrade declare. 

3. Because we are without the space for proper 
psychometric explanation, we can only reiterate 
that the studies reported by Shweder and 
D’Andrade “provide no (or technically unsound) 
reliability coefficients.” 

4. Shweder and D’Andrade remain impressed 
by the generality of their finding that appreciable 
intermatrix correspondence between what they 
called “actual behaviors” and “rated behaviors” is 
not observed. We suggest that the common effect 
of the many problems besetting their analyses 
is to increase unreliability and invalidity of mea- 
surement and thus to attenuate the possibility 
of finding strong congruence between Actual Be- 
havior and Rated Behavior matrices. We agree 
with Shweder and D’Andrade (1979) that “it is 
the consistency with which a given result is ob- 
tained across variations in method that ulti- 
mately supports or disconfirms a hypothesis” (p. 
1080). But “the given result” must be a positive 
result, not repeated failures to reject the null 
hypothesis. The motivated reader should look 
again at our catalogue of the problems surround- 
ing the comparison of Actual Behavior and Rated 
Behavior matrices before deciding whether the 
generally poor intermatrix correspondences ob- 
served by Shweder and D’Andrade are deserving 
of substantive interpretation or are more simply 
understandable as due to comparing the non- 
comparable. 

5. In our study, we argued that “retrieval” 
of the matrix of correlations among Rated Be- 
haviors by a matrix of relations among Concep- 
tual Similarity judgments was a function of the 
particular mix of personality variables employed, 
namely, their bipolar psychological redundancy. 
Shweder and D’Andrade (1979) suggest that our 
reduction of the variance of the distribution of 
the intervariable correlations creates an “arti- 
factual effect” (p. 1077). But, equally, the high 
intermatrix correlations previously reported by 
Shweder and D’Andrade can be viewed as an
“artifactual effect” of extended variances of the
distribution of intervariable correlations. Prop- 
erly, arguments should not arise about artifactual 
effects on intermatrix congruences; the interde- 
pendent intervariable correlations from which 
they are computed have no “natural” population 
to which they can be referred so that one can 
sensibly talk of artificiality or representativeness 
of their distribution. What is important is that 
we have been able to provide a psychological ex- 
planation, in terms of redundancy and the proper 
measurement goals of investigators, of how these 
different extremes of “artificiality” come about 
and have their effect. In particular, high redun-
dancy, both antonymic and synonymic, often has
been deliberately introduced by investigators for
good reasons (to balance direction of wording
effects, to achieve psychometric respectability for
subsequent composite scores, etc.). To make this
redundancy subtle or indirect would be [. . .]ing
to the psychologists using these rating scales.
It is therefore not surprising that conceptual
similarity judges are able to latch onto the struc-
ture underlying cleanly separated clusters of
variables when highly and simply organized vari-
able sets are employed. But such congruences
are essentially adventitious; they have no impli-
cations, in and of themselves, for the validity or
usefulness of well-based and well-encoded per-
sonality ratings.


The “systematic distortion” hypothesis, advocated by D’Andrade and Shweder,
states that correlational structures derived from memory-based personality rat- 
ings are primarily a product of the conceptual affiliations among rating cat- 
egories rather than a reflection of the empirical correlational structure of be- 
havior. This hypothesis is clarified with reference to the Block, Weiss, and 
Thorne critique. The “bipolar redundancy” effect, presented by Block, Weiss, 
and Thorne as an alternative explanation for correspondence between the cor- 
relations derived from memory-based ratings and judgments of conceptual affil-
iation, is found to be an artifact of reduced variance in matrices of uncorrelated
traits. Block, Weiss, and Thorne argue that D’Andrade and Shweder failed to 
find correspondences between correlations for memory-based ratings and imme- 
diate scorings because of methodological flaws in their studies. This argument 
is reviewed, and it is concluded that the systematic distortion hypothesis re- 
mains well supported and that the evidence for the existence of covarying, 
multibehavior personality traits established by the correlational analysis of 
memory-based ratings remains dubious. Finally, an attempt is made to clarify 
the different uses of the term trait and to identify the different types of con-
sistency in behavior that are relevant to the controversy.


Most classifications of individual differences 
in the personality literature are classifications 
of response patterns on personality rating 
forms, inventories, and questionnaire inter- 
views. The pattern of correlations among 
variables in most of these classifications can 
be reproduced or replicated by asking a small 
number of respondents to judge the conceptual 
similarity or dissimilarity of the variables on 
the rating form, inventory, or questionnaire. 
To date, results from the following analyses 
of rating data have been successfully repro- 
duced: 

1. Factor-analytic classification of personal-
ity adjectives, as given in Norman (1963).
(See D’Andrade, 1965; Mulaik, 1964.) 


Requests for reprints should be sent to Richard
A. Shweder, Committee on Human Development,
University of Chicago, 5730 South Woodlawn Ave-
nue, Chicago, Illinois 60637, or Roy G. D’Andrade,
Department of Anthropology, University of Califor-
nia at San Diego, La Jolla, California 92093.

2. Leary grid organization of interpersonal
behavior, as given in LaForge and Suczek
(1955). (See D’Andrade, 1965.)

3. Factor-analytic classification of personal- 
ity and interpersonal behavior, as given in 
Bales (1970). (See Shweder, 1972, 1975.) 

4. Factor-analytic classification of mater-
nal personality, as given in Sears, Maccoby,
and Levin (1957). (See Shweder, 1975.) 

5. Correlation matrices for Bales Interac-
tion Process Analysis categories used as rat-
ing scales, as given in Borgatta, Cottrell,
and Mann (1958) and in Mann (1959). (See 
D’Andrade, 1974.) 

6. Partial correlation matrix for observers’ 
ratings of extraversion-introversion in boys 
in summer camp, as given in Newcomb 
(1929). (See Shweder, 1975, 1977a.) 

7. The Alpha factor from the MMPI, as 
given in Block (1965). (See Shweder, 1977a, 
1977b.) 

8. The common factor structure of the
Murray Needs from five personality tests,
as given in Fiske (1973) and in Huba and 
Hamilton (1976). (See Ebbesen & Allen, 
Note 1.) 

9. Syndrome clusters from the Brief Psy- 
chiatric Rating Scales, as given in Overall, 
Hollister, and Pichot (1967). (See Shweder 
& D’Andrade, Note 2.) 

In these studies a variety of techniques 
for obtaining similarity judgments have been 
used, ranging from pile sorting to triads test- 
ing. The phrasing of the instructions for these 
judgments has also varied from an emphasis 
on “co-occurrence” to an emphasis on “like- 
ness in meaning.” Similarity judgments ap- 
pear to be robust, in that small differences 
in technique do not seem to affect the cor- 
respondence found between conceptual simi- 
larity judgments and memory-based personal- 
ity ratings. 

At least two hypotheses can be constructed 
to account for the fact that conceptual simi- 
larity judgments reproduce the correlational 
structure of variables in personality ratings. 
The first is the “accurate reflection” hypothe- 
sis, which asserts that ordinary folk learn or 
develop “implicit personality theories” that 
summarize and preserve the empirical covaria- 
tion of behavior across individual differences 
in conduct. According to this hypothesis,
people use empirically valid implicit personal- 
ity theories in making conceptual similarity 
judgments, thereby accurately reporting the 
intercorrelation of behaviors (Jackson, Chan, 
& Stricker, in press; Passini & Norman, 1966). 

The second hypothesis is the “systematic 
distortion” hypothesis, which asserts that the 
correspondence between preexisting ideas of 
what is like what and memory-based rating 
correlations occurs because, under difficult 
memory conditions, people infer what hap- 
pened from their general model of what the 
world is like, and because conceptually affili- 
ated memory items are easier to retrieve 
(Mandler, 1970). According to the systematic 
distortion hypothesis, the schemata held by 
most people tend to be inaccurate with re-
spect to how behaviors covary, confusing
“what is like what” with “what goes with
what.” Hence, memory for events contains a
systematic bias, in that things that are con-
ceptually similar are recalled as if they co-
varied. L. J. Chapman (1967) has called this
effect “illusory correlation.”

A test of the accurate reflection hypothesis 
versus the systematic distortion hypothesis 
can be made by obtaining both observational 
evidence and memory-based ratings for a set 
of behavioral variables. By observational evi- 
dence we mean reasonably objective records 
made at the time of observation. Such records 
are called here “immediate scorings.” (Block, 
Weiss, and Thorne, 1979, refer to this type 
of data as “Actual Behavior matrices.”) To 
the extent that the correlations among varia-
bles derived from memory-based ratings are
not like the correlations among variables de-
rived from immediate scorings, but are like
the pattern of conceptual similarity judg-
ments, the accurate reflection hypothesis is
disconfirmed, and the systematic distortion
hypothesis supported. (See Shweder, 1977b,
Note 3, for other ways to test the accurate
reflection and systematic distortion hypothe-
ses.)
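
The comparison described above amounts to correlating the off-diagonal entries of one matrix with the corresponding entries of another. The following is a minimal sketch of that computation, not the procedure actually used in the studies cited; the function names and the three small 4-variable matrices are hypothetical and were invented purely to show the mechanics.

```python
import math


def off_diagonal(matrix):
    """Return the entries above the diagonal of a square symmetric matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation; returns None when either variable has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def intermatrix_correlation(a, b):
    """Correlate corresponding off-diagonal cells of two matrices."""
    return pearson(off_diagonal(a), off_diagonal(b))


# Hypothetical matrices (invented numbers, not data from any study).
immediate = [[1, .1, .0, -.1], [.1, 1, .2, .0], [.0, .2, 1, .1], [-.1, .0, .1, 1]]
rated = [[1, .6, -.4, -.5], [.6, 1, -.3, -.4], [-.4, -.3, 1, .5], [-.5, -.4, .5, 1]]
similarity = [[7, 6, 2, 1], [6, 7, 2, 2], [2, 2, 7, 6], [1, 2, 6, 7]]

r_rated_similarity = intermatrix_correlation(rated, similarity)
r_immediate_rated = intermatrix_correlation(immediate, rated)
print(round(r_rated_similarity, 2), round(r_immediate_rated, 2))
```

In this invented example the memory-based matrix resembles the similarity judgments more closely than it resembles the immediate scorings, which is the pattern the systematic distortion hypothesis predicts; the opposite ordering would favor the accurate reflection hypothesis.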

In previous papers (D’Andrade, 1975,
1974; Shweder, 1975, 1977a) we examined
published sets of data in which immediate
scorings and memory-based ratings for the
same variables were available. In all cases
the pattern of intercorrelations for memory-
based ratings was like the pattern of concep-
tual similarity judgments but only weakly re-
lated to the pattern of intercorrelations for
the immediate scorings. The evidence sup-
ported the systematic distortion hypothesis,
and we concluded that personality classifica-
tions derived from memory-based data could
not be trusted.

Block, Weiss, and Thorne (1979) assert
that the systematic distortion hypothesis is
not adequately supported. Two general argu-
ments are advanced: the “bipolar redun-
dancy” argument and a series of methodo-
logical criticisms.


The Bipolar Redundancy Argument 


Block, Weiss, and Thorne (1979) claim that
the correspondence between conceptual simi-
larity matrices and correlation matrices for
memory-based ratings is “a previously unrec-
ognized function of the high redundancy and
structure of that redundancy that just
happened to be present in the particular vari-
able sets D’Andrade and Shweder sampled”
(pp. 1065-1066). They attempt to show this
by demonstrating that by varying the degree
of bipolar redundancy among variables, they
can affect the degree of correspondence be-
tween conceptual similarity matrices and cor-
relation matrices for memory-based behavior
ratings. Using the California Q-set variables
and a factor analysis based on ratings by
clinical psychologists of a sample of 160 sub-
jects, they selected three sets of 12 items each.
For the first set, 6 items with high positive
loadings and 6 items with high negative load-
ings from a single factor were selected; for
the second set, 2 items with high positive
loadings and 2 items with high negative load-
ings from three orthogonal factors were se-
lected; for the third set, single items with
high loadings from each of 12 approximately
orthogonal factors were selected.

For each of these three sets, conceptual
similarity judgments were made by a small
sample of respondents. For the first set, the
mean correlation between the individual con-
ceptual similarity matrices and the rated be-
havior matrix was found to be .63. For the
second set, the mean correlation was found
to be .50. But for the third set, which is made
up of items from orthogonal factors, the cor-
relation was found to be only .21.

Based on the fact that the mean intermatrix
correlation for the orthogonal items is signifi-
cantly less than the mean intermatrix corre-
lations for items selected from bipolar factors,
Block, Weiss, and Thorne (1979) conclude
that “Conceptual Similarity - Rated Behavior
intermatrix correspondence is high when bi-
polar redundancy is high and is low when
bipolar redundancy is low” (p. 1068) and
that they have therefore controlled the de-
gree of correspondence between conceptual
similarity and memory-based rating correla-
tions by varying the degree of bipolar re-
dundancy.

We believe that Block, Weiss, and Thorne
are incorrect in this claim, since the artifactual
effect of the reduced variance within the mat-
rix of orthogonal items has been overlooked.
Using Block, Weiss, and Thorne’s original
data, kindly supplied to us by the authors,
and comparing the three data sets, we found
standard deviations of .53 and .37 in the rated 
behavior correlation matrices for the first and 
second sets, in contrast to a standard devia- 
tion of only .15 for the third set. The varia- 
bility of the rated behavior correlations in 
the third set has been much reduced by the 
selection process, and this in turn has reduced 
the possible covariance between the matrices.
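
How reduced variability limits the attainable intermatrix correlation can be sketched with a simple additive-error model. The model is our illustrative assumption and not an analysis reported in the text: suppose each conceptual similarity value equals the corresponding rated behavior correlation plus independent error of fixed standard deviation (a hypothetical .30 here). The expected correlation is then sd / sqrt(sd^2 + error_sd^2), which shrinks as the standard deviation of the rated behavior correlations shrinks.

```python
import math


def expected_correlation(signal_sd, error_sd):
    """Correlation between x and x + e when x and e are independent."""
    return signal_sd / math.sqrt(signal_sd ** 2 + error_sd ** 2)


ERROR_SD = 0.30  # hypothetical error SD, chosen only for illustration

# Standard deviations reported in the text for the three variable sets.
reported_sds = {"first set": 0.53, "second set": 0.37, "third set": 0.15}

expected = {
    name: expected_correlation(sd, ERROR_SD) for name, sd in reported_sds.items()
}

for name, r in expected.items():
    print(f"{name}: SD = {reported_sds[name]:.2f}, expected r = {r:.2f}")
```

Under these assumptions the same judging accuracy yields a markedly lower correlation for the third set than for the first two. The particular values depend entirely on the assumed error SD and should not be read as estimates for the actual data.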

It is also crucial to note that contrary to 
Block, Weiss, and Thorne’s claim, respondent 
judgments about the conceptual similarity of 
the items in the third set do in fact corre- 
spond to the level of the correlations in the 
rated behavior matrix. Our reanalysis of 
Block, Weiss, and Thorne’s data reveals that 
the mean conceptual similarity score for items 
in the third set is 4.3 on a scale for which 1 
is defined as “extremely dissimilar,” 4 as 
“unrelated,” and 7 as “extremely similar.” 
The mean of the correlations in the rated be- 
havior matrix is .12 disregarding sign. Thus, 
in a set of almost orthogonal items, concep- 
tual similarity judgments indicate that the 
items are almost unrelated. 

The problem with using correlations to mea- 
sure the relation between matrices in this in- 
stance can be illustrated by considering the 
extreme case in which all the items are cor- 
related .00 with each other, and all the con- 
ceptual similarity judgments are at 4.0, indi- 
cating that the items are unrelated. Using 
correlations as the only measure of corre- 
spondence, one would assume that there was 
no relation between these matrices, even 
though the judgments and the correlations 
match perfectly. We conclude that the corre- 
spondence between conceptual similarity ma- 
trices and correlation matrices for memory- 
based ratings is not a function of adventitious 
redundancy. 
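
The extreme case just described can be made concrete. In the sketch below every interitem correlation is .00 and every similarity judgment is 4.0; the Pearson coefficient is undefined because neither set of values varies, yet a simple discrepancy index shows perfect agreement. The linear rescaling of the 1-to-7 scale onto the range -1 to +1 and the discrepancy index itself are hypothetical devices introduced only for this illustration.

```python
import math


def pearson_or_none(xs, ys):
    """Pearson correlation, or None when either variable has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def rescale_similarity(score):
    """Map a 1-7 similarity score onto [-1, 1], with 4 ('unrelated') at 0.

    This mapping is an assumption made for the example.
    """
    return (score - 4.0) / 3.0


def mean_abs_discrepancy(correlations, similarity_scores):
    """Average absolute gap between correlations and rescaled judgments."""
    gaps = [abs(r - rescale_similarity(s))
            for r, s in zip(correlations, similarity_scores)]
    return sum(gaps) / len(gaps)


n_pairs = 12 * 11 // 2          # 66 distinct pairs among 12 items
correlations = [0.0] * n_pairs  # all items correlated .00
judgments = [4.0] * n_pairs     # all pairs judged "unrelated"

r = pearson_or_none(correlations, judgments)
gap = mean_abs_discrepancy(correlations, judgments)
print(r, gap)
```

A level-sensitive index of this kind is one way to register agreement that a correlation coefficient cannot detect when variance is absent or very small.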

There is an important methodological prob- 
lem with the redundancy hypothesis, namely, 
that the independent variable is operationally 
the same as one of the dependent variables. 
By “redundancy” Block, Weiss, and Thorne 
mean the “communal variance” shared by a 
set of variables. But communal variance is a 
function of the intercorrelations between the 
variables. Therefore, in trying to control for 
redundancy, they are in effect controlling for
the intercorrelations between variables, which
is also one part of the relation they are trying 
to predict. As a result, in reducing redundancy 
they necessarily reduce the variance in the 
intercorrelations between the ratings and, 
thereby, the possible covariance between ma- 
trices. 

It is interesting that the redundancy hy- 
pothesis asserts part of what the systematic 
distortion hypothesis asserts: that there is a 
correspondence between conceptual similarity 
judgments and the intercorrelations among 
memory-based behavior ratings (since redun- 
dancy is a function of the intercorrelation 
among variables); however, it reverses the 
direction of causation and thus becomes a ver- 
sion of the accurate reflection hypothesis. 


The Methodological Critique 


Block, Weiss, and Thorne (1979) present a 
series of methodological criticisms concerning 
the analyses by D’Andrade (1974) and 
Shweder (1975) of data taken from studies 
by Newcomb (1929), Borgatta et al. (1958), 
and Mann (1959). What follows is a list of 
responses to these criticisms.

1. “The categories or items or dimensions 
of behavior measured by the Actual Behavior 
indices are often different from those rated 
by observers” (Block, Weiss, & Thorne, 1979, 
p. 1060). This is an important point, since
the data were analyzed as if they were from 
a memory experiment in which the task is to 
try to remember the relative frequencies of 
various behaviors across different actors. To 
the extent that the memory-based ratings were 
made on categories not equivalent in meaning 
to the categories used in the immediate scor- 
ings, lack of correspondence between the two 
sets of data becomes artifactual.

Since our analyses were carried out on al-
ready-published data, we had no control over
how the labeling was done. In our judgment, 
the categories that do not have identical word- 
ing still maintain equivalent meaning. How- 
ever, it is possible to make a more stringent 
test by using only categories that have identi-
cal wording. [. . .] corre-
lations between the immediately scored be-
havior matrices and memory-based rated behavior
matrices for the Mann data are .15, [. . .],
−.27, and .03, respectively (M = −.03), with
the nonidentical items removed, compared with
intermatrix correlations of .07, .20, −.03, and
.27, respectively (M = .13), for the data as
originally given. Thus removing the items that
were not identically phrased did not increase
(in fact decreased slightly) the correspon-
dence between the actual behavior and the
memory-based rated behavior matrices.

Since the issue about maintaining equivalent
definitions in both conditions is important in
evaluating the evidence for or against the
systematic distortion hypothesis, it should be
mentioned that relevant data exist in an un-
published study (Shweder & D’Andrade, Note
2). Block, Weiss, and Thorne did not have
the results of this study available to them at
the time they wrote their critique, nor do the
results of this new study diminish the perti-
nence of their concern about this issue. Briefly,
this study is similar in design to the studies
already reviewed, except that (a) the cate-
gories of behavior were taken from
language terms for social interaction
(“advises,” “criticizes,” etc.), (b) the immedi-
ate scoring categories and the memory-based
rating categories are identical, and (c) the
stimulus material consisted of videotapes of
natural family interaction, thereby permitting
immediate scoring by a number of observers.
The intermatrix correlations are quite similar
to the results found in the other cases. The
correlation between the immediately scored
behavior matrix and the memory-based be-
havior rating matrix is .23; the correlation
between the memory-based behavior rating
matrix and the conceptual similarity matrix
is .75; and the correlation between the im-
mediately scored behavior matrix and the con-
ceptual similarity matrix is .00.

In summary, on the basis of the recomputed
figures and the new study, we conclude that
the low degree of correspondence we have
consistently found between memory-based
rated behavior correlation matrices and im-
mediately scored behavior correlation matrices
is not a product of wording changes in be-
havioral categories.


2. “Emphasis solely on the correlations between
traits ignores consideration of the validity of
specific traits” (Block, Weiss, & Thorne, 1979,
p. 1061). This is true, but beside the point. The
reason for the emphasis on the correlations
between traits is that it is these correlations
that test the systematic distortion and accurate
reflection hypotheses. Of course, if the
correspondence between the memory-based data and
the immediately scored data were very high, there
would not be enough error variance for systematic
distortion to take place. However, the
correlations between specific immediate scorings
and memory-based behavior ratings are in the .3
range. Even rigorous correction for attenuation
would not put the level of validity so high that
there would be insufficient error variance to
permit the effect of systematic distortion to
occur.

Related to the issue of validity, Block, Weiss,
and Thorne (1979) raise questions about the
reliability of the immediate scorings. They assert
that “because the three studies provide no (or
technically unsound) reliability coefficients, we
cannot estimate what the relationships between the
Rated Behavior and the Actual Behavior variables
would be if they were psychometrically improved”
(p. 1062). For the Newcomb (1929) study, an
odd-even day reliability check on behavior
percentages yielded a mean reliability coefficient
of .78 over 26 variables. In the Borgatta et al.
(1958) study, reliability was based on comparing
the profiles of the three groups across two
sessions, with the results presented graphically.
From visual inspection there appears to be high
consistency in the results for each group across
sessions. In the Mann (1959) study, Mann did the
scoring alone. However, he had worked with R. F.
Bales for 3 years and had attained a scoring
reliability of approximately .90 on the majority
of the categories, with Bales as the criterion.
However, since the reliabilities are not given on
a category-by-category basis, Block, Weiss, and
Thorne are correct in pointing out that correction
for attenuation cannot be made. Nevertheless, the
figures indicate that the immediate scorings have
reasonable mean reliability in all three studies.
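To give a rough sense of scale for the attenuation argument, the classical (Spearman) correction divides an observed correlation by the square root of the product of the two reliabilities. The Python sketch below applies it to an observed correlation of .30. The reliability pairs are illustrative assumptions only: the .78 echoes the mean figure reported for the Newcomb study, the other values are hypothetical, and category-level reliabilities are not available for these studies.

```python
from math import sqrt


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Spearman's correction: estimated correlation between true scores."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / sqrt(rel_x * rel_y)


observed_r = 0.30  # the ".3 range" mentioned in the text

# (immediate-scoring reliability, memory-rating reliability): assumed values
assumed_reliabilities = [(0.78, 0.80), (0.78, 0.60), (0.60, 0.60)]

corrected = [
    correct_for_attenuation(observed_r, rel_a, rel_b)
    for rel_a, rel_b in assumed_reliabilities
]

for (rel_a, rel_b), r in zip(assumed_reliabilities, corrected):
    print(f"reliabilities {rel_a:.2f}, {rel_b:.2f} -> corrected r = {r:.2f}")
```

Under these assumed reliabilities the corrected correlation stays at roughly .50 or below, which would still leave most of the variance unaccounted for.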
3. “The use of frequency counts of behavior is not
a sufficient means of operationalizing complex
psychological concepts” (Block, Weiss, & Thorne,
1979, p. 1062). Again, this is true but beside the
point. This criticism would be appropriate were we
trying to develop a personality assessment
instrument. Our main concern in these studies is
not to develop personality assessment techniques
but to test hypotheses about systematic distortion
in memory. In trying to anchor one set of ratings
in what can be observed and counted, we have drawn
on studies that used relatively simple and direct
methods. If we did not have simple and direct
measures, how could we know whether memory
distortion had occurred?

This same general point holds for the re- 
lated criticism by Block, Weiss, and Thorne 
that the categories of Interaction Process 
Analysis (IPA) are not the most effective 
means of describing social interaction and (on 
the other side of the fence) the criticism that 
the Newcomb categories require considerable 
psychological inference. Difficulties in apply- 
ing particular category systems would be 
likely to lower the validity of the ratings, but 
would not introduce systematic distortion. 

Block, Weiss, and Thorne (1979) also question the
use of averages as a way of summarizing behavior.
They say, “Useful though the averaging basis
underlying the ‘actual behavior’ scores may
generally be, such scores may not be able to
capture important information and recognitions
codable by raters who use long-term memory”
(p. 1064). We do not
deny these criticisms of averages; but we do 
deny the relevance of these criticisms to our 
research. Repeating the point made above, 
such criticisms would apply if we were 
attempting to assess the personality of the 
subjects rated in these studies. Given that we 
are trying to test hypotheses about memory, 
we hold that these criticisms do not apply, 
since the methods used did not artifactually 
introduce systematic distortion. Were it the 
case that the memory-based raters captured 
important information not to be found in the 
immediate scorings, this would affect the cor- 
relations between specific categories across 
conditions but would not affect the relation 
between the correlation matrices.

4. “Extraneous influences distorting correlation
coefficients and their patterning are not
considered” (Block, Weiss, & Thorne, 1979,
p. 1064). First, Block, Weiss, and Thorne
make the point that the IPA rates are likely 
to be highly skewed or not unimodal, which 
will tend to lower correlations. Second, where 
the behavior counts are transformed into per- 
centages, some degree of negativity is intro- 
duced into the correlation matrix. Third, in 
four of the five sets of data, the memory-based 
behavior rating data were based on pooled 
judgments from a number of observers, but 
the immediately scored behavior data were 
obtained from a single observer, so that the 
discrepancies may be a function of the “vola- 
tility” of the immediate scorings. 
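The second point, that converting counts to percentages builds some negativity into the correlations, follows from the fact that each actor's percentages must sum to 100: across actors, the covariances of any one category with all the others sum to minus that category's variance. A minimal Python sketch with made-up counts illustrates the constraint. The numbers and the three-category layout are hypothetical and are not data from any of the studies.

```python
def to_percentages(counts):
    """Convert one actor's raw category counts to percentages of the total."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]


def covariance(x, y):
    """Sample covariance of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


# Hypothetical behavior counts: one row per actor, one column per category.
counts = [
    [12, 5, 3],
    [7, 9, 4],
    [10, 2, 8],
    [4, 6, 10],
    [9, 9, 2],
]

percentages = [to_percentages(row) for row in counts]
columns = [list(col) for col in zip(*percentages)]

closure_gaps = []
for k, col in enumerate(columns):
    variance = covariance(col, col)
    with_others = sum(
        covariance(col, other) for j, other in enumerate(columns) if j != k
    )
    # For percentage data this sum is zero up to rounding error.
    closure_gaps.append(with_others + variance)
    print(f"category {k}: variance = {variance:.2f}, "
          f"sum of covariances with others = {with_others:.2f}")
```

Because every category's covariances with the rest must sum to a negative number, at least some of the correlations among percentage scores are pushed below zero regardless of how the underlying behaviors relate.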

If we had merely presented one test of the 
systematic distortion hypothesis that had all 
of these problems, our results would be equiv- 
ocal. However, we have presented a number 
of different tests of the systematic distortion 
hypothesis on different types of data, collected 
by different investigators, using different 
methods. One advantage of using a number 
of different tests is that criticisms of the data 
or of the procedures in one case can be 
matched with a case in which the criticism 
does not apply. We can then see if the same
results obtain. Thus the Newcomb (1929)
data do not have distributional difficulties of 
the sort that might have affected the Mann 
(1959) data, and these data are also not af- 
fected by the negativity problem. Further, in 
the Mann study there are two sets of data— 
one in which there is no discrepancy between 
the number of subjects used in the memory- 
based behavior rating condition and the num- 
ber of subjects used in the immediately scored 
behavior condition, and one in which there is. 
No difference is observed in the results. Thus, 
to each of Block, Weiss, and Thorne’s three 
criticisms there are one or more tests to which 
the criticism does not apply, yet the same 
pattern of findings occurs. Almost every study 
has some methodological weak point. We be- 
lieve it is the consistency with which a given 
result is obtained across variations in method 
that ultimately supports or disconfirms a
hypothesis.


Also included under the heading of “extraneous
influences” by Block, Weiss, and Thorne are two
other criticisms. The first concerns the use of
correlations to measure the degree of relatedness
between two matrices of correlations. Block,
Weiss, and Thorne point out that because of the
lack of independence in the “observations,” the
normal interpretations of the correlation
coefficient with regard to amount of variance
explained [...] However, even under these
conditions, the correlation coefficient maintains
its descriptive value, yielding an interpretable
figure [...] indicates how well one can predict
from [...] for the relative degree of association
between various sets of measurements, we have
perforce relied on the product-moment [...]
Spearman’s rank-order correlation coefficients.

The second issue concerns the effect of reflecting
variables in a matrix of correlations, thereby
changing the degree of association between
matrices. Since some psychological variables
permit such reflections, it is possible that some
of the variables could have been reflected, which
would possibly have changed the relationship
between matrices. However, this criticism applies
only to some of the studies that attempt to
reproduce the structure found in various
personality assessment instruments. None of the
variables we have used in comparing immediately
scored behavior matrices with memory-based
behavior rating matrices were of the dimensional
or bipolar types that would permit reflection, and
so this particular problem does not affect our
attempts to test the systematic distortion
hypothesis.

Last on the list of methodological criticisms,
Block, Weiss, and Thorne (1979) raise the question
of whether the correlations [...] between the
immediately scored behavior matrices and
memory-based behavior rating matrices are so high
that it is impossible to [...] confound the
conceptual similarity interpretation from the
isomorphism view” (p. [...]). They say, “Instead
of being impressed that these ‘coefficients’ are
so low, one can as well be surprised that they are
so high” (p. 1065).
However, the test of the systematic distortion 
versus accurate reflection hypothesis requires 
the comparison of three correlations: (a) the 
correlation between the memory-based behav- 
ior rating matrix and the conceptual similarity 
matrix, (b) the correlation between the mem- 
ory-based behavior rating matrix and the im- 
mediately scored behavior matrix, and (c) the 
correlation between the immediately scored
behavior matrix and the conceptual similarity 
matrix. 

Across seven different tests we find the fol- 
lowing: (a) The mean correlation between 
memory-based behavior rating matrices and 
immediately scored behavior matrices is .25; 
(b) the mean correlation between memory- 
based behavior rating matrices and conceptual 
similarity matrices is .75; and (c) the mean 
correlation between the immediately scored 
behavior matrices and conceptual similarity 
matrices is .26. 

Comparing the three relationships, we ob- 
serve that the relation between the matrices 
for memory-based ratings and the matrices for 
immediate scorings is weaker than the rela- 
tion between the matrices for memory-based 
ratings and the conceptual similarity matrices. 
This result is not due to some effect of the
relation between the immediate scoring matrices
and the conceptual similarity matrices, since
these are also only weakly related. It is this
pattern of correlations that supports the
systematic distortion hypothesis, not our
impression of whether any one of these
relationships is surprisingly strong or weak.
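The three-way comparison described above rests on correlating the off-diagonal entries of one correlation matrix with the corresponding entries of another. A minimal Python sketch of that computation follows. The three 4 x 4 matrices are invented for illustration and are not taken from any of the studies discussed, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lower_triangle(matrix):
    """Entries below the diagonal, so each pair of categories counts once."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def intermatrix_r(matrix_a, matrix_b):
    """Correlation between two matrices over their off-diagonal cells."""
    return pearson(lower_triangle(matrix_a), lower_triangle(matrix_b))


# Invented matrices over the same four behavior categories.
conceptual = [[1, .8, .1, .1], [.8, 1, .2, .3], [.1, .2, 1, .7], [.1, .3, .7, 1]]
memory = [[1, .7, .2, .0], [.7, 1, .1, .3], [.2, .1, 1, .6], [.0, .3, .6, 1]]
immediate = [[1, .1, .3, .2], [.1, 1, .0, .1], [.3, .0, 1, .2], [.2, .1, .2, 1]]

r_memory_immediate = intermatrix_r(memory, immediate)
r_memory_conceptual = intermatrix_r(memory, conceptual)
r_immediate_conceptual = intermatrix_r(immediate, conceptual)

print(f"(a) memory vs. immediate:     {r_memory_immediate:.2f}")
print(f"(b) memory vs. conceptual:    {r_memory_conceptual:.2f}")
print(f"(c) immediate vs. conceptual: {r_immediate_conceptual:.2f}")
```

In this made-up example the memory-based matrix tracks the conceptual matrix closely, while neither is more than weakly related to the immediate-scoring matrix, which is the pattern the argument turns on.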




Theoretical Issues 


In the final section of their article, Block,
Weiss, and Thorne take up a series of general
theoretical issues concerning the conceptual
similarity position. One of the issues they raise
concerns the nature and the basis of similarity
judgments. “The study of similarity is a
profoundly complex area of conceptual and
empirical inquiry (Tversky, 1977); the invocation
of ‘conceptual similarity’ as an explanation
without immediate close consideration and
articulation of the processes by which similarity
judgments are made is not especially helpful”
(Block, Weiss, & Thorne, 1979, p. 1071). The study
of similarity judgments has been a neglected area.
A simple
model, informally presented, dealing with only 
the denotative aspects of meaning, is given in 
Romney and D’Andrade (1964). It is our cur- 
rent view that similarity judgments are based 
on the relative number of predications com- 
mon to the two terms being rated; such pred- 
ications we believe refer to a wide variety 
of things, including the sets to which both 
items belong, parts which are common to both 
items, common effects of both items, and so 
on. The study of similarity can be divided 
into two parts: The first is to determine how 
information is processed in making a judg- 
ment; the second is to determine what infor- 
mational content is being processed. Multi- 
dimensional scaling of similarity judgments 
has proven to be an effective way of investi- 
gating the second problem (Romney & D’An- 
drade, 1964; Shepard, 1974). It seems likely 
that with the stimulus of Tversky’s model 
there will soon be an increase in research on 
the first problem. 

Another general issue taken up by Block, 
Weiss, and Thorne (1979) is the degree to 
which cultural belief systems can be assumed 
to be veridical. They point out that although 
constructs need not be empirically accurate, 
they often are. Further, they argue that there 
is good reason to expect cultures and individ- 
uals to develop both reliable and valid ways 
of describing human personality (p. 1071). 

As anthropologists interested in the rela- 
tion between culture and cognition, we support 
the generalization that cultural belief sys- 
tems are sometimes empirically accurate. 
However, having some experience with the 
strong commitment people often have to be- 
liefs that are empirically inaccurate, we feel 
that the other side of the proposition also 
needs stating; that is, cultural belief systems 
can be quite inaccurate and yet quite com- 
pelling. In this instance, the question is 
whether people have personality traits consist- 
ing of covarying behaviors. The fact that this 
is believed to be true in our culture does not 
constitute evidence for either its truth or its 
falsity. 

From a more psychologically oriented perspective,
Block, Weiss, and Thorne (1979) assert: “In
seeking to dispense with personality ratings and
personality inventories on the basis of their
conceptual similarity arguments and analyses,
D’Andrade and Shweder appear to have presumed the
essential invalidity of personality assessment
procedures” (p. 1070). Arguing for the essential
validity of personality assessment procedures,
Block, Weiss, and Thorne introduce “some
relationships of a kind that personality
psychologists find reinforcing of their belief in
reliable individual differences with regard to
personality parameters of consequence in the real
world” (p. 1069). As examples of relationships of
this kind they point to across-time consistencies
in trait ratings and the ability of personality
assessment measures to discriminate between
criterion groups. Such relationships, they assert,
can be neither reproduced nor explained by the
conceptual similarity position.

It is true that such relationships, to the 
extent they did exist, could not be reproduced 
or explained by reference to conceptual simi- 
larity. This is because the conceptual similar- 
ity argument has no bearing on the issue of 
perceived longitudinal consistency or on the 
asserted ability of personality measures to 
discriminate between criterion groups. 

What becomes apparent in examining the final
section of Block, Weiss, and Thorne’s article is
that there are (at least) two uses of the term
trait and the related phrase individual
differences. The conceptual similarity position is
part of an attack on the hypothesis that people
have global traits and on the related hypothesis
that one can determine such traits through the
analysis of individual differences. The defense
given by Block, Weiss, and Thorne concerns the
predictive utility of traits—but in a different
sense of the word—and the possibility of
determining such traits through the analysis of
individual differences—in a different sense of the
phrase.

Originally, the notion of a trait appears to have
contained four distinct kinds of consistency:
consistency across similar behaviors (an honest
person does not lie, does not steal, and does not
cheat); consistency across situations (an honest
person does not cheat when taking a math test and
does not cheat when playing a ring toss game);
consistency across time (an honest person is
likely to have been honest yesterday and will be
honest tomorrow); and consistency within a
framework of psychological functioning (honesty is
theoretically and empirically related to
self-control and identification with parental
figures). (See McClelland, 1951, pp. 210-234.)

The kind of trait consistency that is relevant to
the conceptual similarity position is consistency
across similar behaviors. The classical approach
to the problem of determining consistency across
similar behaviors is correlational analysis, in
which the correlations are computed across
persons, as typified by the work of Eysenck and
Cattell. This approach assumes that individual
differences arise from differences between persons
in the amount of a trait, and so the co-variation
of behaviors across persons can be used to
determine which behaviors belong to which traits
(Cattell, 1946).

It has been our position that evidence for the
existence of this type of trait—this type of
consistency across similar behaviors—is rather
dubious. Most of the supporting data are from
memory-based ratings, questionnaires, and
inventories. (Correlational data from self-report
tests can be considered “objective evidence” in
one sense, but such data do not establish that the
behaviors reported on actually co-occur as rated.)
We believe that memory-based data are so strongly
affected by conceptually based systematic
distortion that they are primarily evidence for
the existence of a powerful cultural belief
system.

However, the definition of trait given above
does not correspond to the use of the term by
Block, Weiss, and Thorne (1979). They say:


Personality psychologists seek or settle on [...]
sets of variables to be employed for the
specification of the personal or behavioral
qualities of the individuals to be studied. To the
extent that the set of personality or behavioral
variables has common variance (i.e., is
redundant), there will be a similarity structure
of the trait measurements. [...] Similarity
structures represent neither conceptual nor
empirical fixities. (pp. 1056-1057)


Thus, for Block, Weiss, and Thorne, the emphasis
is on useful ratings, which usually involve
constructing a composite index across a number of
measures. The different measures
to be placed in a composite index do not have 
to be internally correlated or homogeneous; 
they may even lack face validity, as long as 
the composite index displays significant rela- 
tionships with an external criterion. 

With respect to current personality assess- 
ment work as described by Block, Weiss, and 
Thorne, it appears that two of the four types 
of consistency originally contained in the notion
of a trait have been dropped or suspended:
consistency across similar behaviors
and consistency across different situations. If 
true, this is an interesting shift. We have 
been arguing that one of these two suspended 
types of consistency has little empirical war- 
rant, and we are encouraged to believe that 
this position no longer appears heretical.


Exposure Effects May Not Depend on Stimulus Recognition


Richard L. Moreland 
University of Pittsburgh 


Robert B. Zajonc 
University of Michigan 


Birnbaum and Mellers have presented an alternative theoretical model for 
Moreland and Zajonc’s data on the role of stimulus recognition in the mere 
exposure phenomenon. They contend that the exposure effects observed in the 
authors’ research could have been produced by a single underlying factor called 
subjective recognition. Birnbaum and Mellers’ model is quite plausible, and
follows the traditional view that the improvement in attitudes associated with 
repeated stimulus exposure arises largely from increasing stimulus familiarity. 
In order to investigate how well their one-factor model, as well as the authors’ 
own two-factor model, actually fits the data, a series of linear structural equa- 
tion analyses was carried out. The results revealed that the two-factor model 
provided a significantly better degree of fit. Moreland and Zajonc’s original 
conclusions thus received additional support—stimulus recognition still does not
appear to be a necessary condition for the occurrence of exposure effects.


Recently, we reported on the results of two 
experiments that sought to determine whether 
stimulus recognition was a necessary condition 
for the occurrence of exposure effects (More- 
land & Zajonc, 1977). Our analyses of the 
data from those two experiments suggested 
that it was not. Birnbaum and Mellers 
(1979b), however, have since offered an
alternative model for our data. Their model
takes the following form,


                                          --> Liking
Frequency of        -->  Subjective
stimulus exposure        recognition
                                          --> Familiarity


in which familiarity and liking ratings are
viewed as imperfect measures (with uncorrelated
errors) of “subjective recognition”—a
hypothetical mediating factor that depends
on the frequency of stimulus exposure.

Birnbaum and Mellers evaluated their 
model by examining the pattern of correlations 
among the three major variables (frequency 
of stimulus exposure, liking, and subjective 
familiarity) in each of our experiments. In 
both cases, the correlations were consistent 
with a one-factor model of the data. Birnbaum 
and Mellers concluded, therefore, that stimu- 
lus recognition may have indeed mediated 
the observed exposure effects. 
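With only three observed variables, a standardized one-factor model of this kind is just identified: if exposure reaches the factor with path a, and the factor reaches liking and familiarity with paths b and c, the implied correlations are ab, ac, and bc, and the three paths can be solved directly. The Python sketch below illustrates one way such a consistency check might look. It is not Birnbaum and Mellers' own procedure, and for illustration it borrows the Experiment 2 values from Table 1 (taken here as .663, .579, and .533) rather than the correlations they analyzed.

```python
from math import sqrt


def one_factor_paths(r_exposure_liking, r_exposure_familiarity,
                     r_liking_familiarity):
    """Solve the three standardized paths of a one-factor model.

    Implied correlations: exposure-liking = a*b, exposure-familiarity = a*c,
    liking-familiarity = b*c. Assumes all three correlations are positive.
    """
    a = sqrt(r_exposure_liking * r_exposure_familiarity / r_liking_familiarity)
    b = sqrt(r_exposure_liking * r_liking_familiarity / r_exposure_familiarity)
    c = sqrt(r_exposure_familiarity * r_liking_familiarity / r_exposure_liking)
    return a, b, c


# Illustrative inputs (assumed): Experiment 2 correlations from Table 1.
a, b, c = one_factor_paths(0.663, 0.579, 0.533)

print(f"exposure -> factor:    {a:.3f}")
print(f"factor -> liking:      {b:.3f}")
print(f"factor -> familiarity: {c:.3f}")
```

All three solved paths fall between 0 and 1 for these inputs, which is what consistency with a one-factor model amounts to in the three-variable case; a solved path above 1 would signal a misfit. Because such a model leaves no degrees of freedom, a check of this kind cannot by itself favor one factor over two.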

This new theoretical model provides a 
plausible explanation for our results. There 
are at least two important issues that must be 
considered, however, before any final conclu- 
sions can be drawn about the role of stimulus 
recognition in producing exposure effects. The 
first of those issues involves the nature of the 
proposed mediating factor. Birnbaum and 
Mellers (1979b) showed that a one-factor 
model was consistent with our data, and in 
accordance with the null hypothesis being 
tested, they called that factor “subjective 
recognition.” It should be noted, however, that 
the true nature of such a factor remains an 
open issue. All that Birnbaum and Mellers 
really showed was that some single factor 
could have had an impact on our results. They
did not prove, nor did they claim to prove,
that those results were in fact influenced by 
stimulus recognition. It could be argued, for 
example, that the exposure effects observed 
in our experiments were actually mediated by 
feelings of “subjective affect,” as indicated in 
the model shown below: 


                                          --> Liking
Frequency of        -->  Subjective
stimulus exposure        affect
                                          --> Familiarity


This affective model, which we described 
briefly in our earlier article, does not require 
any conscious form of stimulus recognition 
for the occurrence of exposure effects. Never- 
theless, it is a model that Birnbaum and 
Mellers would have to accept as a possible 
explanation for our results, since it is in com- 
plete accordance with their own analyses of 
the data. A valid decision regarding the nature 
of their hypothetical mediating factor cannot 
be made on the basis of the available evidence. 
In particular, there is no good empirical reason
to believe that the exposure effects observed 
in our two experiments were mediated by 
stimulus recognition. 

A second important issue, quite distinct 
from the first, involves the ability of any
one-factor model to account for our results.
Birnbaum and Mellers showed that their own
model was at least consistent with our data. 
That does not mean, however, that their model 
must necessarily be accepted as the best ex- 
planation for our results, or that more com- 
plex theoretical models must now be rejected 
on any grounds other than parsimony. Clearly, 
it would be both interesting and informative 
to specify several potential models of exposure 
effects and then investigate the degree to
which each of them actually fits the observed
data.

In order to examine our data more closely, 
we decided to perform some linear structural 
equation analyses. Unlike sociologists, econo- 
metricians, and political scientists, relatively 
few experimental social psychologists have 
taken advantage of these powerful analytical 
techniques, which can be used to evaluate a 
variety of fairly complex causal models within 
a single data set (cf. Jöreskog & Sörbom,
1977; Long, 1976). We used the LISREL III
program (Jöreskog & Sörbom, 1976) to calcu-
late maximum likelihood estimates for sev- 
eral causal models of our data. This program 
distinguishes between latent variables (con- 
structs) and their observed indicators (mea- 
sures), and allows the researcher to specify 
particular causal models for the variables 
under investigation. By estimating any un- 
known coefficients in the system of linear 
structural equations for a particular model, 
the program reveals the pattern of relations 
among its latent variables, differentiating 
causal effects from unexplained variance in 
each case. An unstandardized maximum likeli- 
hood solution, as well as a standardized solu- 
tion, is generated for each causal model that 
is specified. 

Using the LISREL III program, we were 
able to evaluate the degree to which several 
different models fit our data. For the present 
purposes, we will only describe the results of
analyses performed on the data from our second
experiment.¹ The program’s input for those
analyses was the matrix of intercorrelations
among five variables: frequency of stimulus
exposure, liking, subjective familiarity,
recognition confidence, and recognition accuracy.
The actual correlations are shown in Table 1.

Among the many models we tested, two were
especially interesting. Our preferred model for
the data, illustrated below, involved two major
factors:


                     -->  Subjective   --> Liking
                          affect
Frequency of
stimulus exposure
                     -->  Subjective   --> Familiarity
                          recognition  --> Confidence
                                       --> Accuracy

(Subjective affect and subjective recognition each
also influence the other.)


1 Similar results were found in the data from both
experiments. The data from the second experiment
were more suitable for LISREL III analyses,
however, because (a) more subjects were tested,
(b) separate measures of the various recognition
variables were obtained, and (c) recognition and
liking measures were obtained from the same
individuals.


Table 1
Matrix of Intercorrelations for Experiment 2

Variable                               1      2      3      4      5
1. Frequency of stimulus exposure    1.00
2. Liking                            .663   1.00
3. Subjective familiarity            .579   .533   1.00
4. Recognition confidence            .340   .252   .291   1.00
5. Recognition accuracy              .413   .302   .555   [...]  1.00

Note. Frequency of stimulus exposure is the log plus one of the actual exposure frequency. All of these correlations were calculated using the residual data set (cf. Moreland & Zajonc, 1977, p. 195).


Briefly, we felt that repeated exposure to a
novel stimulus could produce feelings of both
recognition and affect, each of which might
in turn have some influence of its own on the
other. Liking ratings were assumed to be
primarily affective in nature, while responses on
the various recognition measures were thought
to share a more cognitive origin.

Applying this two-factor model to the data
from our second experiment, we obtained the
standardized solution shown in Figure 1. Latent
variables are shown there in ellipses, while
indicators of those variables are shown in boxes.
The path coefficients linking ellipses with boxes
represent the estimated validities with which the
latent variables were measured by their respective
predictors. Path coefficients linking latent
variables to each other represent the estimated
strength of particular causal relationships.
Unexplained variance in the latent variables
(V1 and V2) and measurement error in their
indicators (E1 to E5) are also estimated. Note
that a few of the model’s parameters (shown in
parentheses) were set equal to some a priori value
in the maximum likelihood solution, so that the
variance in all of the latent variables could be
identified.²

The results provided some interesting information
about our data. The chain of paths linking
stimulus exposure to subjective recognition, and
subjective recognition to subjective affect,
indicated that some of the exposure effects
observed in our research may have indeed been
mediated by subjective recognition. However, there
was also a relatively strong direct path linking
stimulus exposure to subjective affect, suggesting
that subjective recognition was not an essential
factor in producing our results. The overall
goodness of fit for our model was χ²(5) = 38.99.³


The standardized solution that we obtained 
when Birnbaum and Mellers’ (1979b) one- 
factor model was applied to the same data set 


2 Since stimulus exposure and subjective affect each 
had only one indicator, the program could not suc- 
cessfully identify their variance. For that reason, we 
specified (in the maximum likelihood solution) both 
the validity and the error coefficients for those two 
indicators. Frequency of stimulus exposure was as- 
sumed to have a validity of 1.00 (and an error of 
.00), as it represented a presumably perfect experi- 
mental manipulation of stimulus exposure. The va- 
lidity coefficient for liking was set equal to .87 (and 
its error coefficient to .24) on the basis of a reliability 
estimate obtained in our original research (cf. More- 
land & Zajonc, 1977, pp. 195 and 197). In order for 
the program to identify the variance in subjective 
recognition, we followed an accepted standard prac- 
tice, and set the validity coefficient for subjective 
familiarity to 1.00, again in the maximum likelihood 
solution. It should be noted that parameters whose 
values were specified in the maximum likelihood 
solution sometimes took on slightly different values 
in the standardized solution. These differences oc- 
curred whenever the estimated variability of a latent 
variable was less than 1.00, and were a natural
by-product of the standardization procedure (cf.
Jöreskog & Sörbom, 1976, p. 24).

We also specified that the two path coefficients 
between subjective affect and subjective recognition 
had to be equal to each other in the maximum likeli-
hood solution. This was equivalent to specifying the
value of a single parameter, and therefore added a 
single degree of freedom to the chi-square. Without 
this additional constraint, which is often adopted 
when testing models involving reciprocal causation 
(cf. Wiley, 1973; Wright, 1960), our model would 
have been underidentified. 

3 The degrees of freedom for this chi-square are
the number of cells in the correlation matrix (in- 
cluding the diagonals) minus the number of “free” 
parameters. The chi-square measuring goodness of 
fit was fairly large for all of the models that we 
tested. Jéreskog (1969) has suggested, however, that 
the chi-square statistic may not, in itself, provide the 
most useful information about how well a model 


Figure 1. Moreland and Zajonc’s two-factor model of exposure effects (standardized solution). 
(V = variable; E = error.) 


is illustrated in Figure 2. The same parameter 
constraints described earlier were again used. 
In general, the path coefficients for this model 
seemed reasonable, although it was somewhat 
surprising that liking measured subjective 
recognition as well as subjective familiarity 
did, and even better than recognition confi-
dence or recognition accuracy were able to
do.⁴ If our data were indeed influenced by
subjective recognition, as Birnbaum and 
Mellers’ model indicates, then variables meant 
to measure recognition processes should have 
been most closely associated with that factor. 
The overall goodness of fit for Birnbaum and 
Mellers’ model was χ²(6) = 83.58.

To compare the accuracy of these two 
models, we first took Jöreskog’s (1969) sug-
gestion and looked to see which one had the
lower ratio between its chi-square and degrees 
of freedom. According to that criterion, our 
own model of the data (38.99:5 = 7.80) was 
clearly better than the one proposed by Birn- 
baum and Mellers (83.58:6 = 13.93). Since 
their model was actually nested within ours, 


fits the data. Instead, he recommends looking at the
ratio between the chi-square and its degrees of
freedom as another means of evaluating the degree
of fit (cf. Jöreskog, 1969, p. 201).


however, a statistical test could be made by
subtracting the chi-square and degrees of
freedom for one model from those of the other
(cf. Long, 1976). The results showed that our
model provided a significantly better fit to
the data, χ²(1) = 44.59, p < .01.
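
The two comparisons just described (the ratio of chi-square to degrees of freedom, and the difference test for nested models) can be retraced with a few lines of Python. This is only a sketch: the fit statistics are the ones quoted in the text, while the function name is hypothetical and the tail probability relies on the standard identity for a chi-square variate with 1 degree of freedom.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# (chi-square, degrees of freedom) as reported in the text.
two_factor = (38.99, 5)
one_factor = (83.58, 6)

# Criterion 1: ratio of chi-square to its degrees of freedom (lower is better).
ratio_two = two_factor[0] / two_factor[1]
ratio_one = one_factor[0] / one_factor[1]

# Criterion 2: nested-model difference test.
diff_chi2 = one_factor[0] - two_factor[0]
diff_df = one_factor[1] - two_factor[1]
p_value = chi2_sf_df1(diff_chi2)

print(round(ratio_two, 2), round(ratio_one, 2), round(diff_chi2, 2), diff_df)
```

The difference test is meaningful only because one model is nested within the other; for non-nested models only the descriptive ratio would apply.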

In summary, we see no reason to alter our
earlier conclusions about the role of stimulus


4 In the accompanying article, Birnbaum and
Mellers (1979a) have argued that at least some of the
recognition measures (confidence and accuracy) were
related to stimulus exposure in a nonlinear fashion,
and thus could not be expected to load more heavily
on subjective recognition. In order to test this argument,
we repeated the two LISREL analyses
reported here, but allowed for the possibility of correlated
errors between recognition confidence and
recognition accuracy. This was accomplished by introducing
a new latent variable into each analysis,
one that was unrelated to any of the other
variables in the model and was measured only by the
confidence and accuracy of the subjects’ recognition
responses. Both of the revised models fit the data
better than their predecessors did, indicating that
shared nonlinearity among the recognition measures
was indeed a factor in our research. Nevertheless,
even after the possibility of correlated errors between
recognition confidence and recognition accuracy was
taken into account, our own model was
still more accurate than that proposed by Birnbaum
and Mellers.


recognition in producing exposure effects. 
Birnbaum and Mellers (1979b) proved that 
a one-factor model of our data cannot be re- 
jected. The nature of their hypothetical factor, 
however, remains a mystery—in particular, 
there is no evidence to show that it necessarily 
involved any form of stimulus recognition. 
Moreover, the fact that a one-factor model 
(of any sort) was consistent with our data 
says little about the relative ability of such a 
model to account for our results. In fact, when 
we reanalyzed our data using linear structural 
equation analyses, the results showed that our 
own two-factor model provided a significantly 
better fit. This finding, along with the results 
of other research (cf. Matlin, 1971; Wilson, 
1975, 1979), strongly suggests that stimulus 
recognition is not a necessary condition for the
occurrence of exposure effects. 
One-Mediator Model of Exposure Effects Is Still Viable


Michael H. Birnbaum and Barbara A. Mellers 
University of Illinois at Urbana-Champaign 


Birnbaum and Mellers criticized the use of partial correlation and multiple re- 
gression by Moreland and Zajonc to argue for two independent effects of stimu- 
lus exposure on liking. The null hypothesis that one variable mediates the effect 
of the independent variable on the dependent variables was not tested by their 
analyses. In response, Moreland and Zajonc reanalyzed their data, using struc- 
tural equations analysis, and replied that there is evidence to support their 
previous conclusions. However, the present article shows that the small residuals 
from the one-mediator model may be due to shared nonlinearity (correlated 
errors) in three of the dependent variables. This simpler interpretation achieves 
a better fit to the Moreland and Zajonc data than the two-mediator model they 
advocated. Since the null hypothesis of one mediator is still viable, the burden 
of proof rests on those who contend that there is more to the exposure effect
than stimulus recognition.


Moreland and Zajonc (1977) replicated 
the exposure effect, that is, the finding that 
stimuli that are presented with greater fre- 
quency will be rated more favorably. They 
also asked their subjects to rate their famili- 
arity with the stimuli, and they found a sig- 
nificant partial correlation between liking 
and exposure frequency when rated famili- 
arity was partialed out. This partial correla- 
tion (and related regression analyses) led 
them to conclude that there is an additional, 
“independent” effect of exposure frequency 
on liking that is not mediated by stimulus 
recognition. 

Birnbaum and Mellers (1979) questioned 
the use of partial correlation and regression 
in this argument. They pointed out that the 
null hypothesis that a single variable (e.g., 
recognition) mediates the effect of the inde- 
pendent variable on both dependent variables 
predicts that both partials should have the 
same sign as the original correlations. Only 
if the dependent variable measuring recogni- 
tion is assumed to be perfectly correlated 
with the mediator does the regression analy-
sis test the null hypothesis. Otherwise, lack
of perfect validity of the dependent variable
vitiates the analysis.

Birnbaum and Mellers (1979) presented
path models both for the null hypothesis of
one mediator and for the alternative hy-
pothesis that a second mediator is required.
Implications of the models and methods to
distinguish them were described. These meth-
ods were applied to the three major variables
in Moreland and Zajonc (1977) with the
result that no evidence was found to require
the hypothesis of two mediators over the
simpler hypothesis of one.

In response to these arguments, Moreland


of an extra path from stimulus frequency to
liking, in addition to the path via subjective
recognition. However, the present article will
show that the residuals from the one-media-
tor model, fit to all five variables, are small
in magnitude and are not of the form an-
ticipated by the Moreland and Zajonc model.
Instead, they can be explained by the simpler
hypothesis that some of the dependent vari-
ables are nonlinearly related to the single
mediator.


ONE-MEDIATOR MODEL IS STILL VIABLE 


One or Two Mediators? 


The diagram in Figure 1 is an extension 
of Birnbaum and Mellers (1979, Expression 
9) for all five variables reported by More- 
land and Zajonc (1977). 

The diagram indicates that the independent
variable, frequency of stimulus exposure, af-
fects a mediator, “subjective recognition,”
which in turn influences the four dependent
variables, rated affect (liking), rated familiar-
ity, rated recognition confidence, and recog-
nition accuracy. Moreland and Zajonc (1977,
1979) argue that there is an additional, inde-
pendent effect of stimulus exposure on liking,
presumably via a second mediator, labeled
“subjective affect” in Figure 1.

The one-mediator model is a special case
of Figure 1 with b = 0. The model favored
by Moreland and Zajonc (1979, Figure 1),
although written in more complex form, is
also a special case of Figure 1, with c1 = c2 =
c3 = 0.


One-Mediator Model 


Table 1 shows the residual correlations that
remain after fitting a linear one-mediator
model (b = 0, c1 = c2 = c3 = 0) to the cor-
relations for both experiments of Moreland
and Zajonc (1977; see Table 1 of Birnbaum
& Mellers, 1979). Models were fit to maximum
likelihood criterion by means of ACOVS and
LISREL III (Jöreskog, 1974; Jöreskog, Gru-
vaeus, & van Thillo, 1970; Jöreskog & Sör-
bom, 1976). In general, the residual correla-
tions in Table 1 are small in magnitude. In
particular, the largest deviation does not oc-
cur for the residual correlation between liking
(affect) and frequency (only .03 and .06 for
Experiments 1 and 2), which might be ex-
pected to be large if there is an extra causal
path from frequency to liking. Instead, the
largest residual for both experiments is for
the correlation between recognition accuracy
and recognition confidence (.08 and .18).

One can reasonably ask if the residual cor-
relations in Table 1, which (for Experiment
2) are based on 10 ratings from each of 40
subjects, are of sufficient magnitude to reject
the one-mediator model. Moreland and Zajonc
(1979) contend that the lack of fit for this
Figure 1. Path diagram representing theories of
stimulus exposure. (The coefficient a1 represents the
correlation between stimulus exposure and a media-
tor, labeled “subjective recognition,” which in turn
is correlated with the four dependent variables with
coefficients of a2, a3, a4, and a5. The errors, e1
through e5, represent residuals, analogous to unique-
nesses in factor analysis. The coefficient b represents
the effect of stimulus exposure on liking apart from
its effect via the first mediator. The correlations
c1, c2, and c3 allow for correlated errors, which
could be produced by shared nonlinearity of the
dependent variables measuring recognition. The
issue at hand is whether the data of Moreland and
Zajonc (1977) provide sound evidence to reject
the null hypothesis that b = 0.)


model is evidence of a second mediator. It 
will be shown below that this pattern of small 
residual correlations can be better fit by a 
one-mediator model, which allows the errors 
in the recognition measures to be correlated 
(i.e., the cs in Figure 1 are not fixed to zero). 
Shared nonlinearity of the recognition mea- 
sures would predict correlated errors. 


Nonlinearity in Moreland and 
Zajonc Data 


Figure 2A plots the means of the four de- 
pendent variables as a function of the inde- 
pendent variable, log stimulus frequency, as 
reported in Table 3 of Moreland and Zajonc 
(1977). The ordinate has been recalibrated 
linearly so that all four dependent variables 
can be shown simultaneously. The one-media- 
tor model and the two-mediator model both 
imply that the curves should all be linear. 
Should two variables share common non- 
linearity, the residual correlation in Table 1 
is expected to be positive. All three recogni- 
tion measures appear to be nonlinearly related 
to frequency of stimulus exposure. Further- 
more, recognition confidence and recognition 
accuracy appear to share a common cubic 
trend in relation to the other curves. A simi- 
lar pattern was evident for Experiment 1. 
Thus, the two variables that show the largest 
residual from the one-mediator model (Table 
Table 1
Residuals From One-Mediator Linear Model

               Frequency   Liking   Familiarity   Confidence   Accuracy
Frequency                    .03        .01           .02         -.07
Affect            .06                  -.01           .02         -.06
Familiarity      -.03       -.02                     -.06          .04
Confidence       -.01       -.06       -.02                        .08
Accuracy         -.05       — M1        .13           .18

Note. Each entry is the residual correlation after fitting a one-mediator model to the correlations obtained by
Moreland and Zajonc (1977). Values above the diagonal are for Experiment 1; values below the diagonal are
for Experiment 2. Liking and affect are corresponding measures for Experiments 1 and 2, respectively. The
question under consideration is whether correlations of this pattern and magnitude, based on 10 judgments
from each of 40 subjects (Experiment 2), warrant rejection of the one-mediator model. Note that the largest
residuals involve accuracy, which is nonlinearly related to frequency (Figure 2).


1) are the two that are most nonlinearly re- 
lated to the others (Figure 2A). 


Nonlinearity Can Produce 
Correlated Residuals 


To illustrate how shared nonlinearity can 
produce deviations from a one-mediator model, 
a hypothetical data set was constructed from 
the following one-mediator model: 


F = X + e_1,
Y_1 = F + e_2,
Y_2 = F + .33F^2 + e_3,
Y_3 = F + .33F^2 + e_4,
Y_4 = F + .33F^2 + e_5,        (1)


where X is the manipulated independent vari-
able (analogous to stimulus exposure in Fig-
ure 1); Y1 through Y4 are the measured, de-
pendent variables; F is the mediating factor
(analogous to subjective recognition); the
addition of .33F² produces nonlinear relation-
ships between Y2 through Y4 and F; and e1
through e5 are mutually uncorrelated error
terms. The value .33 for the coefficient of F²
was chosen so that the degree of nonlinearity
would be roughly comparable to that in the
data of Moreland and Zajonc (1977), as in
Figure 2A. To generate the 160 hypothetical
cases, 5 values of X (-2, -1, 0, 1, 2) were
factorially combined with 2 values of e1 (-1,
1) to produce 10 values of F. These values of
F were factorially combined with two levels
of each error (e2 = -1.1, 1.1; e3 = -4, 4;
e4 = -3, 3; e5 = -2, 2) to generate the val-
ues of the dependent variables, Y1, Y2, Y3,
and Y4, according to Equation 1. The mean
values for Y for each level of X are shown
in Figure 2B, plotted for comparison with
Figure 2A. For the hypothetical example,
variables Y2, Y3, and Y4 share a common
nonlinear (quadratic) relationship with the
other two variables. If the data were generated
without the F² terms, the curves would be
linear; therefore, the linear one-mediator
model (with b = c1 = c2 = c3 fixed to zero
in Figure 1) would fit perfectly.
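
The construction of the 160 hypothetical cases can be sketched in Python as follows. The error levels are taken as read from the text (±1 for the error on F; ±1.1, ±4, ±3, and ±2 for the errors on Y1 through Y4); the assignment of those levels to particular subscripts is an assumption, since the printed subscripts are partly illegible, and the helper name `pearson` is hypothetical.

```python
from itertools import product
from math import sqrt


def pearson(u, v):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    ss_u = sum((a - mean_u) ** 2 for a in u)
    ss_v = sum((b - mean_v) ** 2 for b in v)
    return cov / sqrt(ss_u * ss_v)


rows = []
# Full factorial crossing: 5 levels of X and 2 levels of each of five errors.
for x, e1, e2, e3, e4, e5 in product(
    (-2, -1, 0, 1, 2), (-1, 1), (-1.1, 1.1), (-4, 4), (-3, 3), (-2, 2)
):
    f = x + e1                      # mediator
    curved = f + 0.33 * f ** 2      # shared quadratic component
    rows.append((x, f + e2, curved + e3, curved + e4, curved + e5))

names = ("X", "Y1", "Y2", "Y3", "Y4")
columns = list(zip(*rows))
corr = {
    (names[i], names[j]): pearson(columns[i], columns[j])
    for i in range(5)
    for j in range(i + 1, 5)
}

for pair, r in corr.items():
    print(pair, round(r, 2))
```

Under these assumptions the sketch yields 160 cases and correlations that agree with the upper portion of Table 2 to within about .01, which lends some support to the reading of the error levels given above.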

The correlation matrix, generated from the 


[Figure 2, two panels. Panel A: Moreland & Zajonc (1977) Exp. 2; abscissa: Frequency. Panel B: Hypothetical Data; abscissa: X, Independent Variable. Ordinate in both panels: Mean Value of Dependent Variable.]

Figure 2. A. Mean values of affect, rated familiarity,
recognition confidence, and recognition accuracy as
a function of exposure frequency [log (f …)].
(From Moreland & Zajonc, 1977, Table 3. Ordinate is
separately calibrated for each variable.) B. Mean
values of dependent variables, Y1, Y2, Y3, and Y4,
as a function of independent variable. (Hypotheti-
cal data were generated from a one-mediator model
with Y2, Y3, and Y4 nonlinearly related to the
mediating variable, as shown in Equation 1.)




one-mediator model of Equation 1 (with some
dependent variables nonlinearly related to the
mediator), is shown in the upper portion of
Table 2. The correlation matrix was fit by
means of LISREL to a linear one-mediator
model (b, c1, c2, and c3 in Figure 1 were fixed
to zero). The residual correlations are shown
below the diagonal in Table 2. Note that the
largest discrepancies in Table 2 are among
variables Y2, Y3, and Y4, which have non-
linearity (i.e., quadratic term, F², in Equa-
tion 1) in common. The residual correlation
between X and Y1 is also positive, since the
best-fit single factor falls between F and F².
This pattern of residuals is similar to that in
Table 1.

A one-mediator model can be fit to the hy-
pothetical data perfectly if the residuals for
Y2, Y3, and Y4 are allowed to be correlated.
These correlations represent the shared non-
linearity of these variables. The model in Fig-
ure 1 thus fits the hypothetical data perfectly
with b set to zero, by allowing correlations
c1, c2, and c3 to be nonzero (the values of c
can be further constrained to satisfy Equation
10 of Birnbaum & Mellers, 1979).

This example illustrates that even when 
data are constructed from a single-mediator 
model, if some variables are nonlinearly re-
lated, deviations from the model can appear
unless the model allows correlations among the 
residuals for nonlinear variables. 


Correlated Residual Model Fits Moreland 
and Zajonc Data 


The model that fits the hypothetical data
(Figure 2B) also achieves a good fit to the
Moreland and Zajonc (1977) data (Figure
2A). It is a special case of a one-mediator
model with nonzero correlations among the
errors in the recognition variables. The cor-
relations among the residuals (c1, c2, and c3)
may be given a response-bias, or “method,”
interpretation, since this model does not pos-
tulate any additional paths from stimulus
frequency to any of the variables, besides the
one via the recognition mediator (b is set to
zero). The residual correlations from this one-
mediator model are quite small, the largest
deviation being .04 for both experiments. The
residual correlations between exposure fre-
quency and liking are -.02 and .00 for Ex-
periments 1 and 2, respectively.

Table 2
Hypothetical Example for One-Mediator
Nonlinear Model

         X      Y1     Y2     Y3     Y4
X               .69    .32    .39    .49
Y1      .02            .32    .40    .51
Y2     -.02   -.02            .26    OM
Y3     -.03   -.02    .04            .40
Y4     -.02   -.02    .06    .08

Note. X = independent variable; Y1 through Y4
= dependent variables. Correlations are above diag-
onal; residuals from one-mediator linear model are
below diagonal. Note similarity of residuals to those
in Table 1.

Thus, the residual correlations from the 
linear one-mediator model may be attributed 
to shared nonlinearity of the recognition mea- 
sures. This interpretation is consistent with 
the data shown in Figure 2A, with the pattern 
of residuals in Table 1 (in which the largest 
deviations involve recognition accuracy), and 
with the fact that the one-mediator model 
with correlated errors provides a good fit to 
the correlations. In sum, the one-mediator 
model remains a viable interpretation of the 
Moreland and Zajonc (1977) data if it is 
allowed that the recognition measures are non- 
linearly related to the other variables. 


Two-Mediator Model Not Required 


The two-mediator model favored by More-
land and Zajonc (1979, Figure 1) yields the
theoretical correlation matrix shown in the
lower triangle of Table 3. The model is equiva-
lent to Figure 1 of the present article with
c1 = c2 = c3 set to zero. Table 3 shows that
if b = 0, each correlation can be represented
as the product of correlations with the mediat-
ing variable, subjective recognition. However,
with b > 0, the correlations between Y1 (lik-
ing) and the others increase, especially the
correlation between X and Y1.

The hypothetical correlations generated 
from a two-mediator model in the upper por- 
tion of Table 3 show that in principle, it is 
possible to distinguish the two-mediator model 
from the one-mediator model. Note that the 
Table 3
Theoretical Values for Two-Mediator Model
and Hypothetical Example

        X           Y1               Y2      Y3      Y4
X                   .75              .40     .40     .40
Y1   a1a2 + b                        .60     .60     .60
Y2   a1a3       a2a3 + ba1a3                 .64     .64
Y3   a1a4       a2a4 + ba1a4       a3a4              .64
Y4   a1a5       a2a5 + ba1a5       a3a5    a4a5

Note. X = independent variable; Y1 through Y4
= dependent variables. Theoretical values are be-
low diagonal. Hypothetical values above diagonal
are based on a1 = a2 = .5, a3 = a4 = a5 = .8, and
b = .5.


hypothetical correlations of frequency, liking,
and familiarity violate Equation 10 of Birn-
baum and Mellers (1979), since .75/.40 =
1.88, which exceeds the limit set by 1/.60 =
1.67.¹
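
The hypothetical entries of Table 3 follow from the path coefficients by straightforward multiplication, which can be sketched in Python. The function name and data layout are hypothetical; the formulas are those listed below the diagonal of Table 3, and the two quantities compared at the end are the ones quoted in the text (Equation 10 itself appears in Birnbaum and Mellers, 1979, and is not reproduced here).

```python
def two_mediator_correlations(a, b):
    """Theoretical correlations for Figure 1 with c1 = c2 = c3 = 0.

    a = (a1, a2, a3, a4, a5); b is the extra path from exposure to liking (Y1).
    """
    a1, loadings = a[0], a[1:]
    names = ("Y1", "Y2", "Y3", "Y4")
    corr = {}
    for i, name in enumerate(names):
        # X reaches every Y through the mediator; only Y1 also receives b.
        corr[("X", name)] = a1 * loadings[i] + (b if name == "Y1" else 0.0)
    for i in range(4):
        for j in range(i + 1, 4):
            r = loadings[i] * loadings[j]
            if i == 0:
                # Y1 is additionally linked to Yj via b, X, and the mediator.
                r += b * a1 * loadings[j]
            corr[(names[i], names[j])] = r
    return corr


hypothetical = two_mediator_correlations((0.5, 0.5, 0.8, 0.8, 0.8), b=0.5)
ratio = hypothetical[("X", "Y1")] / hypothetical[("X", "Y2")]
limit = 1.0 / hypothetical[("Y1", "Y2")]
print(round(ratio, 2), round(limit, 2))
```

Setting b to zero in the same function gives the one-mediator case, in which every correlation is a simple product of two coefficients.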

Table 4 shows parameter estimates and an 
index of fit for the various models under con- 
sideration, applied to data for Experiment 2 
of Moreland and Zajonc (1977). To obtain 
the predicted correlation between two varia- 
bles, multiply the coefficients along each path 
and sum these products over all distinct paths 
connecting the two variables in Figure 1. 
For example, for the one-mediator model, the
predicted correlation between stimulus expo-
sure and rated affect is a1a2, or .61. For the
two-mediator model, the predicted value would
be a1a2 + b, or .66.
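
The path-tracing rule for this one pair of variables can be written out as a short Python sketch. The coefficient values are the Experiment 2 estimates as read from Table 4 (a1 = .82 and a2 = .74 for the one-mediator model; a1 = .69, a2 = .26, and b = .48 for the two-mediator model); the function name is hypothetical.

```python
def predicted_exposure_affect(a1: float, a2: float, b: float = 0.0) -> float:
    """Predicted correlation between stimulus exposure and rated affect.

    Two distinct paths connect the variables in Figure 1: the path through
    the recognition mediator (a1 * a2) and the extra path b.
    """
    return a1 * a2 + b


one_mediator = predicted_exposure_affect(a1=0.82, a2=0.74)
two_mediator = predicted_exposure_affect(a1=0.69, a2=0.26, b=0.48)
print(round(one_mediator, 2), round(two_mediator, 2))
```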

The two-mediator model shown in Figure
1 of Moreland and Zajonc (1979) is equiva-
lent to the model in Figure 1, with c1 = c2 =
c3 = 0. Because the models are mathemati-
cally identical, the index of fit in Table 4
(“χ²”) is the same as reported by Moreland
and Zajonc (1979).² The largest residual for
the two-mediator model for Experiment 2 is
.13, between recognition confidence and recog-
nition accuracy, a slight improvement over
the corresponding value of .18 for the one-
mediator model.³

The one-mediator model with correlated
errors provides a better fit to the data than
the two-mediator model, although it does not
postulate an additional path from exposure
to liking. The largest residual for this model
is only .04 for both experiments. The value
of “χ²” is less than one sixth as large as that
for the two-mediator model. In addition, per-
mitting b to be nonzero does not improve the
fit, once correlations are allowed among the
errors in the recognition measures (see last
column of Table 4).

In sum, the analyses provide no compelling
reason to reject the null hypothesis that b
in Figure 1 is zero. Therefore, the data do
not demonstrate the existence of another path
from stimulus exposure to liking apart from

1 The numerical example in Table 3 also illustrates
that the two-mediator model predicts a larger cor-
relation between X and Y1 than would be predicted
from the one-mediator model. According to the
one-mediator model,

\rho_{XY_1} = \frac{(a_1 a_3)(a_2 a_3)(a_4 a_5)}{(a_3 a_4)(a_3 a_5)} = \frac{\rho_{XY_2}\,\rho_{Y_1 Y_2}\,\rho_{Y_3 Y_4}}{\rho_{Y_2 Y_3}\,\rho_{Y_2 Y_4}},        (2)

which reduces to a1a2. In contrast, the two-media-
tor theory with b > 0 predicts that the observed
correlation should exceed the value of ρXY1 in
Equation 2. For the values used in the hypothetical
example of Table 3, Equation 2 (the one-mediator
model) yields ρXY1 = (.40)(.60)(.64)/[(.64)(.64)] =
.375, which is only half as large as the tabled (two-
mediator model) value of ρXY1 = .75. When Equa-
tion 2 is applied to the correlations of Moreland
and Zajonc (1977), the values of ρXY1 are .48 and
.81 for Experiments 1 and 2. The obtained values,
which by the foregoing should have been greater
than these values of ρXY1 according to the two-
mediator theory, were only .42 and .66.
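
The arithmetic of Equation 2 for the hypothetical example can be checked with a small Python sketch. The function and argument names are hypothetical; the input correlations are the above-diagonal entries of Table 3.

```python
def one_mediator_rho_xy1(r_xy2, r_y1y2, r_y3y4, r_y2y3, r_y2y4):
    """Correlation between X and Y1 implied by the one-mediator model (Equation 2)."""
    return (r_xy2 * r_y1y2 * r_y3y4) / (r_y2y3 * r_y2y4)


# Hypothetical correlations from the upper triangle of Table 3.
implied = one_mediator_rho_xy1(
    r_xy2=0.40, r_y1y2=0.60, r_y3y4=0.64, r_y2y3=0.64, r_y2y4=0.64
)
tabled = 0.75  # value generated by the two-mediator model with b = .5

print(round(implied, 3), round(implied / tabled, 2))
```

A tabled value that clearly exceeds the implied one is what the two-mediator theory predicts; the footnote reports that the obtained correlations fell below the implied values instead.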

2 The model comparisons presented by Moreland
and Zajonc (1979) are based on chi-square values
that may have been inflated by the assumption that
by partialing out the main effect of subjects, a
sample of 40 subjects produced 400 independent
observations. However, repeated observations from
the same subjects should not be treated as if they
were independent. Since the computed chi-square
is directly proportional to the assumed number of
independent observations in each correlation, the
“χ²” values reported should not be used for sta-
tistical inferences. The “χ²” values reported here
are calculated using the same value (400) for
sample size, only to provide an index that can be
compared with the accompanying article.
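
The proportionality noted in this footnote can be made concrete with a short Python sketch. It assumes, as the footnote states, that the index is directly proportional to the assumed number of independent observations (whether a given program multiplies by N or by N - 1 is not settled here), and the choice of 40 as an alternative sample size is purely illustrative, not a value proposed in the text.

```python
def rescale_fit_index(chi_square: float, n_assumed: int, n_alternative: int) -> float:
    """Rescale a fit index under the ASSUMPTION that it is proportional to N."""
    return chi_square * n_alternative / n_assumed


# Index reported for the linear one-mediator model with N assumed to be 400.
reported = 83.58

# Illustrative only: treat the 40 subjects, not the 400 ratings, as the sample.
rescaled = rescale_fit_index(reported, n_assumed=400, n_alternative=40)
print(round(rescaled, 2))
```

The point of the sketch is only that the same pattern of residuals yields a tenfold smaller index when the assumed sample size is a tenth as large, which is why the reported values serve as a descriptive index rather than as a basis for inference.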

3 Moreland and Zajonc (1979) fixed the coefficient
of familiarity in both of their analyses even though
it was actually estimated from their data (see their
Footnote 2). As a consequence of this procedure,
the correct df for both models are 1 less than they
report.




Table 4
Estimates of Parameters for Model Comparisons

                                  Model

                                          One mediator,       Two mediator,
Variable   One mediator   Two mediator   correlated errors   correlated errors
a1             .82            .69              .86                 .86
a2             .74            .26              7                   .72
a3             .74            .83              .68                 .68
a4             .42            .44              .37                 .38
a5             .57            .66              .45                 AS
b              (0)            .48              (0)                 .04
c1             (0)            (0)              .04                 .03
c2             (0)            (0)              .25                 .25
c3             (0)            (0)              .25                 .24
df              5              4                2                   1
“χ²”          83.58          38.99             6.22                6.22

Note. Based on data of Experiment 2 of Moreland and Zajonc (1977). Each entry represents estimated pa-
rameter as described in Figure 1, with restrictions imposed by the models. Values shown in parentheses are
fixed. Degrees of freedom (df) are 10 minus the number of estimated parameters. The values of e_i are redun-
dant; for example, for the one-mediator model, e_i can be computed from 1 - a_i². The “χ²” is inflated and
should not be used for statistical tests, but represents an index of fit that can be compared with values
reported in the accompanying article.


an effect that could be mediated by stimulus
recognition.⁴

It is always preferable to choose among 
theories on the basis of qualitative differences 
in predictions for a variety of experimental 
manipulations rather than on the basis of 
small differences in an index of fit. Perhaps 
evidence for a second mediator would occur 
if a new variable could be found that could 
potentially reverse the exposure effect, so that 
repeated exposure could either increase or de- 
crease liking (while increasing recognition), 
depending on the value of the new variable. 
A future study may provide sound evidence 
that stimulus recognition does not mediate 
the exposure effect. Until such evidence is 
presented, however, it is premature to assert
that there are two independent effects of 
stimulus exposure on liking. 


Conclusions 


These analyses do not disprove the possi-
bility of unconscious affect, learning
without awareness, or subception. They do
not refute the favored interpretation of
Moreland and Zajonc (1979), though the
one-mediator model with correlated errors fits
better than the two-mediator model without


correlated errors (Table 4). However, the 
analyses do show that the null hypothesis of 
one mediator remains a viable description of 
the data of Moreland and Zajonc (1977), who 
argued that there are two independent effects. 
Perhaps the null hypothesis should be favored 
by skeptics and the burden of proof be laid 
upon those who would refute it. 


4 Moreland and Zajonc (1979) argued that if a
single mediator explains their (1977) data, the 
mediator may be named “affect” rather than “rec- 
ognition.” The name of a mediator is a matter of 
definition that cannot be refuted by experiment. 
However, any discussion of the proper label for a 
single mediator is tangential to the original argu-
ment of Moreland and Zajonc (1977, p. 193). In
that article they stated that evidence of two inde- 
pendent effects of exposure on liking was required 
to demonstrate that stimulus recognition is not 
necessary to the exposure effect. 
The Stability of Behavior:
I. On Predicting Most of the People Much of the Time 


Seymour Epstein 


University of Massachusetts—Amherst


One of the classic debates in psychology concerns the stability of personality. 
With rare exception, studies that have correlated objective behavior on two 
occasions have obtained coefficients below .30. Not only has the direct measure- 
ment of objective behavior failed to provide evidence of stability, but self- 
report scales in attitude and personality inventories, as well as ratings of be- 
havioral samples by judges (although themselves stable), have produced low 
correlations with objective behavior. Does this indicate, as some have sug- 
gested, that stability of behavior lies primarily in the eye of the beholder? 
The issue can be resolved by recognizing that most single items of behavior 
have a high component of error of measurement and a narrow range of gen- 
erality. In four separate studies it was demonstrated that when measures of 
behavior are averaged over an increasing number of events, stability coefficients 
increase to high levels for all kinds of data, including objective behavior, self- 
ratings, and ratings by others, and that objective behavior can then be reliably 
related to self-report measures, including standard personality inventories. The 
observation that it is normally not possible to predict single instances of be-
havior, but that it is possible to predict behavior averaged over a sample of
situations and/or occasions, has important implications not only for the study
of personality but for psychological research in general.


A critical issue in personality theory is
whether stable behavioral dispositions, or
traits, exist. On the basis of everyday ob-
servation, it seems evident to most people
that they do. Yet the vast bulk of psycho-
logical research fails to provide confirmatory
evidence. It must be concluded that either
the lay view is right and our typical methods
of research are lacking, or the research findings
are correct and the lay view itself is a phe-
nomenon worthy of study. Not surprisingly,
psychologists of both persuasions can be found.
The debate on the stability of personality,


which is one of the classic debates in psy- 
chology, has recently been given new impetus 
by findings from statistical procedures that 
have suggested a resolution in the form of 
what has been called “modern interactionism.”
It will be demonstrated in this article that 
modern interactionism does not resolve the 
issue of stability of personality, no matter 
what other virtues it may have. A solution 
proposed in this article lies in an entirely 
different direction, one that is so obvious 
that, once pointed out, it reminds one of the 
fairy tale of The Emperor's New Clothing. 


Copyright 1979 by the American Psychological Association, Inc. 0022-3514/79/3707-1097$00.75 




Like the fairy tale, the implications of the 
solution go far beyond the immediate issue 
at hand.

This is not the place for a lengthy review 
of the debate. Excellent reviews and discus- 
sions are available in papers by Bowers 
(1973), Ekehammar (1974), Endler and Mag-
nusson (1976), and Magnusson and Endler
(1977). Allport (1937, 1961, 1966) can be 
referred to for a staunch theoretical defense 
of the trait position and Cattell (1965) and 
Eysenck (1969, 1970) for sophisticated, em- 
pirically based trait theories. The flavor of 
the attacks on the trait position can be 
represented by the following quotations span- 
ning the period of the debate. 


Training the mind means the development of thousands 
of particular, independent capacities, the formation of 
countless particular habits, for the working of any 
mental capacity depends upon the concrete data with 
which it works. Improvement in any one mental 
function or activity will improve others only in so 
far as they possess elements common to it also. The 
amount of identical elements in different mental 
functions and the amount of general influence from 
special training are much less than common opinion 
supposes. (Thorndike, 1906, p. 248) 


Over and over, a battery of tests designed to measure 
traits such as persistence, or aggressiveness, or honesty 
yields results so unreliable and undependable (when 
compared with other criteria) that one is led to ques- 
tion the actual existence of the general trait. In nu- 
merous instances the instruments are very loosely 
constructed and are clearly equivocal. A survey
of . . . a wide variety of frequently used tests suggests
that there is a fundamental limitation common to
most of them. Trait testers appear to assume that
whatever they name has objective reality; many
need not so much to improve their measures but to
improve or change their thought regarding traits.
(Lehman & Witty, 1934, p. 49)


The generality of these (personality) measures over 
method and situation was still not high enough to
justify perpetuating the traditional concepts of per-
sonality. The findings required abandonment of a line
of research to which I had devoted ten years of my
life as a psychologist. The results also required a
change in beliefs about the nature of personality.
This research, per se, did not say which way the
conceptual shift should go, but it suggested very
strongly that traditional conceptions of personality
as internal behavior dispositions were inadequate and
insufficient. (Peterson, 1968, p. 23) 


With the possible exception of intelligence, highly 
generalized behavioral consistencies have not been 
demonstrated, and the concept of personality traits 
as broad predispositions is thus untenable. (Mischel, 
1968, p. 146) 


As noted earlier, there was nothing silly about the 
initial assumption of personologists that everything 
was glued together until proved otherwise. But since 
it has now proved otherwise, it seems only fair to 
give a sporting chance to the counter-assumption
that nothing is glued together until proved otherwise.
Instead of assuming cross-situation correlations to be
+1.00, let us begin by supposing them to be 0.00
until we can explicitly construct them to be otherwise.
(Bem, 1972, p. 25) 


The charge that personality traits do not 
exist clearly strikes at the very heart of
personality theory. One could well argue that
if individuals do not have relatively stable
behavioral dispositions that differentiate them
from other individuals, then the concept of
personality itself can be dispensed with.
Shweder (1975), in fact, comes close to
arguing just that. 


The degree of relevance of the concept “personality”
constructed in this “individual difference” sense is
questioned. Some of the most ubiquitous personality
assessment procedures, specifically interpersonal check
lists, personality inventories, and questionnaire inter-
views are shown to be subject to a form of systematic
bias which creates the “illusion” that the diversity
of individual behavioral differences can be reduced
to a limited set of person-distinguishing underlying
forces (traits, dispositions, scales, factors or dimen-
sions). (p. 455)


As will be seen shortly, the claim of a
bias is one of the main arguments
against the trait position, as it can explain
why there are data from self-report inven-
tories and ratings by others in support of
stability in personality but not such data
from the direct, objective measurement of
behavior.


Arguments For and Against the Existence 
of Traits 


Not surprisingly, given attacks on the very 
foundation of personality theory such as 
those cited above, some lively debate has 
ensued. Others who have joined the fray
include Alker (1972), Argyle and Little
(1972), Averill (1973), Campus (1974), Cron-
bach (1957, 1975), Endler and Hunt (1968,
1969), Epstein (1977), Fiske (1961), Moos
(1969, 1970), Olweus (1977a), Pervin (1968),
Raush (1965), Sarason (1977), Schalling
(1977), Vale and Vale (1969), Vernon (1964),
Wachtel (1973), and Zuckerman (1977). It
is not our intention to review the position 
of each here. For present purposes, it will 
suffice to list the arguments and the kind of 
evidence cited for and against traits. 


Situationist Position


According to the situationist position, there
is little stability in personality, as behavior
is determined almost exclusively by situational
variables. The belief that there is little sta-
bility in personality rests on three major
sources of evidence. The most important is
that when behavior in one situation is cor-
related with behavior in another situation,
the correlations are so low, usually less than
.30, that they have been disparagingly re-
ferred to by Mischel (1968) as “personality
coefficients.” Mischel (1969) further notes,
“A correlation of .30 leaves us understanding
less than 10% of the relevant variance. And
even correlations of that magnitude are not
very common and have come to be considered
good in research on the consistency of any
noncognitive dimension of personality” (p.
1012). At the same time there is no dearth
of evidence that behavior varies markedly as
a function of stimulus or situational variables.
A second source of evidence consists of 
findings from the apportionment of variances 
in analysis of variance designs. This procedure 
was independently introduced into the assess-
ment of stability in personality by Raush 
and his colleagues (Raush, 1965; Raush, 
Dittman, & Taylor, 1959; Raush, Farbman, 
& Llewellyn, 1960) and by Endler, Hunt,
and their colleagues (Endler, 1966; Endler 
& Hunt, 1968, 1969; Endler, Hunt, & Rosen-
stein, 1962). They and others since (e.g.,
Argyle & Little, 1972; Ekehammar & Magnus- 
son, 1973; Magnusson, 1971; Moos, 1968, 
1969, 1970; Magnusson, Note 1) have demon- 
strated that the variance attributable to indi- 
vidual differences is usually much smaller 
than the variance attributable to situations 
and to the interaction of individuals and 
situations. 

A third source of evidence cited by situ- 
ationists is that when people rate others, 
they tend to attribute more stability to indi- 
viduals across situations than is objectively 
warranted (e.g., Bem & Allen, 1974; Jones 
& Nisbett, 1971; Mischel, 1968; Shweder, 
1975). This, of course, can explain how there 
can be a widespread belief in the stability 
of personality when, in fact, there is little
stability. The study of such bias in person 
perception has itself become a significant area 
of research, and has produced a quantity of 
explanations as to why the phenomenon of 
falsely perceiving stability in personality oc- 
curs. For example, it has been suggested 
that (a) it is emotionally satisfying to believe 
that behavior is predictable, particularly when 
it is someone else’s; (b) it is simpler to clas- 
sify behavior by people than by situations; 
(c) people have implicit personality theories 
that assume stability in personality, and their 
theories bias their perceptions; (d) a primacy 
effect operates to make new impressions con- 
form to old ones; (e) the observer is always 
present in the situations that he or she ob- 
serves in real life, thereby presenting the 
observer with a biased sample; (f) there is 
a tendency for observers to equate behavior 
that elicits the same emotional reactions in 
them; (g) there is a tendency for judges to 
generalize from a few attributes that are 
stable, such as intelligence, to others that are 
not; and (h) there are more terms for clas- 
sifying people than for classifying situations, 
which leads to a bias toward attributing 
behavior more often to characteristics of
people than to characteristics of situations. 


Trait Position 


The arguments that have been advanced 
in favor of traits amount primarily to con- 
jectures that if different procedures had been 
followed in the investigations of stability in 
personality, higher stability coefficients would 
have been found. There are also a few studies 
that can be cited that obtained respectable 
stability coefficients. 

The arguments can be summarized as fol- 
lows: (a) Many of the studies undertaken to 
assess stability in personality have been ex- 
perimental studies, which are better suited 
to demonstrate change than stability in per- 
sonality (Bowers, 1973). (b) The unit of 
analysis is a critical factor that has not been 
adequately taken into account. What appears 
to be instability at a phenotypic level of 
analysis may be stability at a genotypic level
(Alker, 1972; Bowers, 1973). To use one of
Bowers’ examples, a woman who is con-
tinuously changing her wardrobe may be
consistently fashionable. (c) The use of mod-
erator variables can considerably increase
reliability coefficients (Alker, 1972). (d) A fail-
ure to recognize that some people are more
variable than others results in reporting gen-
erally low stability coefficients, rather than
in noting that at least some individuals are 
highly stable (Alker, 1972; Bem & Allen, 
1974). There is at least one study (Bem & 
Allen, 1974) that has demonstrated this effect. 

(e) In everyday life, people determine their 
own environment, which, in turn, helps 
maintain the stability of their personality. 
The operation of this effect is not possible 
in laboratory studies in which individuals are 
arbitrarily assigned to conditions (Bowers, 
1973; Wachtel, 1973). (f) Stability in per-
sonality is mediated by an individual’s cogni-
tions and, accordingly, will only be found
when idiographic procedures are used that
take into account the subjective nature of 
perception (Alker, 1972; Bem & Allen, 1974; 
Bowers, 1973; Mischel, 1973). (g) Stability 
may be demonstrable to a greater extent in 
within-subject relationships than in between- 
subjects relationships (Alker, 1972). Carlson 
(1971) makes a related point in noting that
although the concept of personality implies
an organization of variables within an indi-
vidual, among a large number of studies she
reviewed, “not a single published study at-
tempted even minimal inquiry into the organiza-
tion of personality variables within the individ-
ual” (p. 209).

As to studies that have reported evidence 
for stability in personality, we will not con-
sider investigations relying on self-report in-
ventories, as such studies can simply demon- 
strate that people’s beliefs about their behavior 
are consistent, which is a far cry from de- 
monstrating that the behavior itself is con- 
sistent. Nor will an attempt be made to 
review all behavioral studies that have 
reported positive findings. Our purpose will 
be served adequately by a review of three 
series of carefully conducted studies. On the 
basis of the findings in these studies, it is 
possible to formulate an integrative hy- 
pothesis that can account for all the known 
results to date. This will be followed by a
presentation of four studies that were ex-
plicitly undertaken to test the hypothesis.

Block (1971, 1977) conducted a series of 
studies in which the stability of a variety 
of personality variables was examined during 
different periods between childhood and adult- 
hood. In one set of studies, records were 
obtained from the archives of the well-known 
longitudinal studies at Berkeley of the be- 
havior of students when they were in junior 
and senior high school. The subjects were 
then intensively interviewed when they were 
in their mid-thirties. Judges rated each sub- 
ject for the different periods on the items in
the California Q Set (Block, 1971). To ensure
that stability coefficients would not be arti-
ficially inflated by response sets, memory, or
other sources of rater bias, different judges
rated the subjects at the different time
periods. For the period from junior to senior
high school, 58% of 114 personality variables
produced stability coefficients that were at
least as high as .35, which is significant at
the .001 level. Some of the correlations were
as high as .70. For the period from senior
high school to age 30 and above, 29% of the
personality variables yielded stability coeffi-
cients of .35 or greater, with some as high
as .61.

In a study of younger children by Block 
and Block (Block, 1977), observations were 
made during the children’s 4th, 6th, and 8th 
years of life. Two or three judges, who were 
the children’s nursery school teachers, ob- 
served the children for 3 hours a day over 
a period of 5-9 months and then rated each 
child on a modification of the California 
Q Set. Rater bias was controlled by having 
different teachers rate the children at different
periods. The average stability coefficient for 
100 Q items was .48, with several over .60, 
and a few as high as .70. When the Q items 
were grouped into broader scales by factor 
analysis, the mean stability coefficients rose 
to .56. 

In a series of studies on aggression in
young boys, Olweus (1973, 1974, 1977a, 1977b) 
had classmates rate 201 boys in the 6th grade, 
and again 3 years later, on a number of 
variables pertaining to aggression, such as 
tendency to start fights with peers. Ratings 
for each boy were averaged over 3-10 raters. 
Memory effects and response bias were evalu-
ated and controlled by statistical procedures
and by examining subgroups in which there
was no overlap in raters. The mean stability
coefficient for the 3-year period was .66.
When correlations were corrected for at-
tenuation due to rater unreliability, the
stability coefficients rose to about .80. In a
second similar study in which 85 13-year-old
boys were the subjects, even higher stability
coefficients were obtained. In other studies
in the series, it was observed that peer ratings,
teacher ratings, and self-ratings on a specially
designed aggression inventory were all highly
intercorrelated, suggesting the existence of a
broad dispositional trait of aggression. 
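
The correction for attenuation mentioned above can be illustrated with a brief sketch. It assumes the classical Spearman formula, in which an observed correlation is divided by the square root of the product of the two reliabilities. The rater reliabilities are not reported here, so the value of .825 below is only the figure that would be implied if a coefficient of .66 were to rise to .80; the function names are illustrative and are not taken from the studies cited.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Spearman's correction: estimate the correlation between true scores."""
    return r_observed / math.sqrt(reliability_x * reliability_y)


def implied_reliability(r_observed, r_corrected):
    """Geometric-mean reliability that would carry r_observed up to r_corrected."""
    return r_observed / r_corrected


# Assumed illustration: what rater reliability would take .66 to about .80?
needed = implied_reliability(0.66, 0.80)
print(round(needed, 3))
print(round(disattenuate(0.66, needed, needed), 2))
```

The less reliable the ratings are assumed to be, the larger the corrected coefficient becomes, which is why the correction is only as trustworthy as the reliability estimates fed into it.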

The classic series of studies by Hartshorne
and May (1928, 1929) and Hartshorne, May,
and Shuttleworth (1930) on honesty, often
cited as evidence against the existence of
stability in personality, was an enormous
project that spanned half a dozen years,
employed a large team of researchers, and
tested a national sample of over 8,000 children.
Among the behavioral items that were as-
sessed were cheating in a classroom, cheating
on a take-home examination, cheating during
a game, stealing money, lying, and falsifying
records of athletic performance. The average 
intercorrelation of 23 subtests used as part 
of a total character score was found to be .23, 
which led Hartshorne and May to conclude 
that honesty in any single situation has low 
predictive value for honesty in any other 
single situation. This conclusion is invariably 
cited as evidence against the existence of 
stability in personality. What is generally not 
known is that when Hartshorne and May 
combined several tests of honesty into a single 
score, the reliability coefficient increased to .73, 
and they concluded, 


Just as one test is an insufficient and unreliable mea- 
sure in the case of intelligence, so one test of decep- 
tion is quite incapable of measuring a subject’s 
tendency to deceive. That is, we cannot predict from 
what a pupil does on one test what he will do on 
another. If we use ten tests of classroom deception, 
however, we can safely predict what a subject will 
do on the average whenever ten similar situations 
are presented. (Hartshorne & May, 1928, p. 135) 


Further support for the existence of a broad 
trait of honesty in the Hartshorne and May
data is provided by a factor analysis by
Burton (1963), who observed that a general
factor of honesty accounted for nearly 50%
of the total variance.


Interactionist Position 


According to the interactionist position, 
the question of which is more important, the 
situation or the person, is a meaningless one, 
as behavior is always a joint function of the 
person and the situation. In its applicability 
to the issue of stable individual differences, 
the interaction position can be viewed as a 
compromise between the trait position and 
the situationist position, for it acknowledges 
the existence of behavioral stability, but only 
within situational constraints.

The so-called modern interactionist position 
was derived from findings on apportioning 
variances in an analysis of variance design. 
As already noted, it was observed by a num-
ber of psychologists that the interaction of 
individuals and situations accounted for more 
variance than either source of variance by 
itself. It was consequently argued that an 
interactionist position should supplant both 
the trait and the situationist position (Bowers, 
1973; Ekehammar, 1974; Endler, 1966; Endler
& Hunt, 1968; Magnusson, Note 1). 

As for a more general interactionist view- 
point that goes beyond statistical interaction 
(see Magnusson & Endler, 1977, for a thorough 
review of this position), it is assumed that 
individuals and situations are interdependent. 
That is, the individual’s cognitions and per- 
ceptual processes, as much as the objective 
characteristics of the stimulus, determine the 
meaning of the stimulus. From this viewpoint, 
behavior can best be viewed as a transaction 
between the individual and the stimulus, each 
influencing the other. Since behavior never 
takes place in a vacuum but always occurs 
in a situational context, it is meaningless to 
talk about characteristics of an individual’s 
behavior without specifying the situation in 
which the behavior occurs. To understand 
and predict behavior, it is, accordingly, just 
as necessary to have a classification system 
for situations as for individuals and, most 
important, to know how individuals interpret 
different kinds of situations. 


Evaluation of Arguments For and Against
the Existence of Traits 


Evaluation of the Situationist Position 


The case of the situationists against traits 
rests mainly on empirical grounds. They note 
that the concept of traits is not unreasonable 
but add that it is also not supported by the 
facts. The strongest evidence they cite against
traits consists of the low stability coefficients 


obtained when data are derived from direct
behavioral [...] were on single items
of behavior. Such correlations have
little relevance for the existence of traits, as
no trait theorist believes that a trait can be
inferred from a single instance of behavior.
A trait is a generalized tendency for a person
to behave in a certain manner over a suffi-
cient sample of events [...]
that he or she will exhibit trait- [...]
thereby limiting the possibility of replication
and a high component of situational unique-
ness, thereby limiting the possibility of gen-
eralization. It is no more reasonable to assess
the stability of nonintellective behavior by 
correlating single observations than it is to
assess the stability of intellective behavior 
by correlating single items in an intelligence 
test. Thus, for statistical reasons alone, the 
low correlations cited as evidence against the 
existence of stable response dispositions can 
be discounted.

The view that there are traits consisting 
of relatively broad, stable, behavioral disposi- 
tions does not require the assumption that 
situations do not affect behavior. Behavior 
can vary significantly with situations, and 
there can still be an underlying consistent 
thread in behavior averaged over situations. 
Thus, demonstrations that behavior varies 
with situations or even that it varies more
with situations than with individuals cannot 
be taken as evidence against the existence 
of traits. 

Let us now consider the procedure of ap- 
portioning variances in an analysis of vari- 
ance design, on the basis of which it has been 
claimed that there is little stability in per-
sonality. It has been falsely argued that if
there were stability in personality, individual
differences would necessarily account for a
relatively large proportion of total variance.
This argument is fallacious for two reasons.
First, the proportion of variance attributable
to any one factor, such as individuals, is
always influenced by the range of variability
represented by the other factors. Thus, if
situations are selected over a wide range of
variability and individuals over a narrow
range, the proportion of variance for indi-
viduals will be smaller than that for situations.
It is evident that depending on how one
selects individuals and situations, and how
the two are selected in relationship to each
other, individuals, situations, or their inter-
action can be made relatively large or small.
Second, the analysis of variance has been
misused in obtaining estimates of stability,
for instead of using the appropriate error
term for obtaining estimates of stability co-
efficients, the variance for individual dif-
ferences has been compared to total variance
(see Epstein, 1977; Golding, 1975; Olweus,
1977; Magnusson, Note 1, for a further 
development of this point on the misuse of 
analysis of variance). It is possible with such 
a procedure for individuals to be perfectly 
reliable over time, as indicated by a reliability 
coefficient of +1.00, and yet have individual 
differences contribute a relatively small pro- 
portion of total variance. For example, con- 
sider a situation in which all individuals come 
out exactly the same in a 50-yard dash and 
in a 500-yard dash. As the mean difference 
between events will be many times greater 
than the differences among racers within 
events, it is evident that the variance due 
to individuals will be a small proportion of 
the variance due to situations, let alone of 
the total variance. 
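
The dash example can be made concrete with a small numerical sketch. The running times below are invented for illustration (five hypothetical racers who keep exactly the same relative standing in both events), and the function names are not taken from any of the studies cited. The sketch computes the correlation between the two events and the share of the total sum of squares attributable to individuals and to situations in a two-way layout with one observation per cell.

```python
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def variance_shares(table):
    """Split the total sum of squares of table[person][situation] into shares."""
    n, k = len(table), len(table[0])
    grand = mean(v for row in table for v in row)
    row_means = [mean(row) for row in table]
    col_means = [mean(table[i][j] for i in range(n)) for j in range(k)]
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_ind = k * sum((m - grand) ** 2 for m in row_means)
    ss_sit = n * sum((m - grand) ** 2 for m in col_means)
    return {
        "individuals": ss_ind / ss_total,
        "situations": ss_sit / ss_total,
        "residual": (ss_total - ss_ind - ss_sit) / ss_total,
    }


# Hypothetical times in seconds: same offsets among racers in both events.
offsets = [-0.4, -0.2, 0.0, 0.2, 0.4]
dash_50 = [6.0 + d for d in offsets]
dash_500 = [70.0 + d for d in offsets]

r = pearson(dash_50, dash_500)
shares = variance_shares([list(pair) for pair in zip(dash_50, dash_500)])
print(round(r, 3), {name: round(value, 5) for name, value in shares.items()})
```

With these invented figures the racers are perfectly consistent across events, yet individuals account for far less than 1% of the total sum of squares, because the difference between the two events dwarfs the differences among racers.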

The argument that judges sometimes at- 
tribute stability to people that is not there 
describes an interesting phenomenon, but 
cannot establish that there is no stability in 
behavior apart from such bias. In fact, the 
assumption that under most circumstances 
there is stability in personality can account 
for a bias to assume there is stability even 
when there is not. Further, the same factors 
that contribute to the perception of stability 
in personality could also contribute to its 
actual stability. If it is assumed that there 
is a fundamental need for people to establish 
orderliness and predictability in the world of 
their experience and that they accomplish 
this through their cognitions, including their 
habits of perception (see Epstein, 1973, Note 2; 
Mischel, 1973), then the same cognitions and 
perceptions could contribute not only to 
people’s ratings of stability but to the sta- 
bility of their actual behavior as well. 

Finally, it should be recognized that the 
null hypothesis cannot be proven by the 
failure of many studies to demonstrate sta- 
bility in personality. The possibility remains 
that with new understanding and new ap-
proaches, the conditions for demonstrating
stability in objectively measured behavior
will be established. 


Evaluation of the Trait Position 


The arguments in defense of traits are, for 
the most part, speculations that if things had 
been done differently, stability in personality
might have been demonstrated. Although the 
proposals may be of interest in suggesting 
new directions for personality research, they 
do not constitute an adequate defense of the 
position that there is stability in personality. 
All these proposals can do is indicate that 
the issue should not be closed until they have 
been tried. Unfortunately, though some of 
these proposals have been tested, they have 
failed to fulfill their promise (Bem, 1972).
In our own research, in which we compared 
mean within-subject correlations to between- 
subjects correlations for the same variables, 
we have generally found the former to yield 
smaller values. 

Finally, one must agree with Bem (1972) 
that the issue of stability in personality 
cannot be resolved by disputation but only 
by data: “And if these separate indices per-
mit Alker to predict behavior across situations 
better than +.30, Mischel will fold up his 
tent and steal away” (p. 18). 

How is one to evaluate the findings in a 
few studies that have demonstrated stability 
in behavior in comparison to the vast number 
of studies that have obtained negative find- 
ings? One possibility is that when enough 
studies are done, a few are bound to produce 
significant results by chance alone. An alter- 
native is that the few studies that produced 
positive findings were better conceived and 
conducted than the many that failed (see 
Block, 1977). A careful analysis of the studies 
that succeeded could uncover the critical 
conditions for demonstrating stability in be-
havior. It is noteworthy, in this respect, that
all three series of studies that succeeded 
examined relatively extensive samples of be- 
havior. In the series of studies reported by 
Block (1971, 1977) and by Olweus (1973, 
1974, 1977a, 1977b), judges rated behavior 
over a relatively long period. In the Hart- 
shorne and May studies (1928, 1929), reli- 
ability was found only when a sufficient 
number of behavioral items was combined 
into a single index. As previously noted, 
single items of behavior, like single items in 
a test, tend to have a high component of 
error of measurement. Thus, it may be that 
procedures for reducing error of measurement 
are critical for demonstrating stability in 
personality. Further evidence in support of
this conjecture is that in the studies by the 
Blocks and by Olweus, ratings from several 
judges were combined, which is one means 
of reducing error of measurement. In ad- 
dition, the Blocks found that when they 
combined a number of single measures into 
a broader one, stability coefficients markedly 
increased, a finding also reported by Hart- 
shorne and May. Thus, a possibility that 
must be considered is that the critical factor 
separating the studies that succeeded in 
establishing stability from those that failed 
to do so is the steps taken to reduce error 
of measurement by obtaining adequate sam- 
ples of behavior in the former studies. 


Evaluation of the Interactionist Position 


As already noted, there have been problems 
with the statistical and methodological pro- 
cedures employed by the interactionists. 
Moreover, the claim that the interaction of
individuals and situations accounts for more
variance than either individuals or situations
alone has not been uniformly upheld (see
Sarason, Smith, & Diener, 1975). In a review
of a large number of studies, Sarason and
his colleagues found that individuals, situ-
ations, and their interaction did not account
for very much of the total variance. Rather,
the predominant source of variance was error
variance. Thus, which of the first three
sources of variance is most important is a
moot point, as often none accounts for a
satisfactory amount of variance.

The more general interactionist position 
does not involve so much a new position as 
a reawakening of interest in an old position
(see Ekehammar, 1974). Murray (1938), in
his classic studies of personality, not only 
endorsed an interactionist view at a theoretical 
level, but developed a classification system 
for response tendencies and for subjectively 
and objectively defining stimulus situations. 
Even more to the point, his concept of thema 
identified a unit of interaction between a 
response disposition and a stimulus variable.
There is little question but that behavior is 
a joint function of the person and the situ- 
ation. However, this is irrelevant to the
question of whether a reasonable degree of
stability in individual behavior can be de-
monstrated when the behavior is averaged
over a sufficient sample of situations. It is
noteworthy, in this respect, that inter-
actionists have been no more successful than
others in breaking the .30 personality barrier.
If they are to succeed in doing so, they will
have to obtain a sufficient sample of people
in particular situations. That is, interactionism
does not replace the need to reduce error of
measurement by sampling, but simply de-
termines what it is that must be sampled,
namely people with certain attributes in
situations with certain attributes. It can be
anticipated, of course, that to the extent to
which situational as well as personality vari-
ables are taken into consideration, the amount 
of averaging over occasions that is required 
for a fixed level of temporal reliability will 
be reduced. 


Resolution of the Three Positions and the 
Proposal of an Integrative Hypothesis 


The trait position, the situationist position,
and the interactionist position are often pre-
sented as three different approaches to the
issue of stability in behavior, with the im-
plication that only one of them can be right.
Actually, all of them can be right, because
they identify not three different solutions to
the same problem but three different prob-
lems. The interactionist wishes to study the
behavior of people with certain attributes in
situations with certain attributes. The trait
theorist wishes to study consistent behavioral
tendencies in individuals over a sample of
situations. The situationist is concerned with
the general effects of situations over a sample
of individuals, for example, the educator who
seeks to determine the most effective method
for teaching most pupils to read and the
architect who seeks to determine the design
for a structure that will please the most
people. In other words, for certain purposes
it is important to predict behavior of people
with certain attributes in situations with
certain attributes; for other purposes it is
important to predict a person’s behavior over
a sample of situations; and for yet other
purposes it is important to predict the in-
fluence of a specific situation on a sample of
people. Underlying all approaches is the need
to consider that error of measurement is apt 
to be high and temporal reliability, or repli- 
cability, low when findings are derived from 
single observations. This is so because of the 
contribution of unassessable factors of “situ- 
ational uniqueness” that are apt to be present 
on any one occasion. (Situational uniqueness
will be further discussed in the second article 
in this series; see Epstein, Note 2.) By aver- 
aging over situations and/or occasions, error 
of measurement can be reduced and temporal
reliability increased. Without temporal re- 
liability, meaningful generalization cannot be 
established, and this applies equally to the 
demonstration of stable effects for individual 
differences, for situations, and for their 
interaction.

The main focus of this article is the sta-
bility of individual differences. If our as- 
sumption about reducing error of measure- 
ment through averaging over observations is 
correct, it should be possible to routinely 
break the presumed .30 personality barrier 
by averaging behavior over a sufficient number 
of events. It should be noted that this was 
done implicitly in the studies by the Blocks 
and by Olweus, as when a judge makes a 
single rating after observing a child on many 
occasions, the single rating can be viewed as 
an intuitive averaging.

The above analysis of what is necessary to
demonstrate stability in behavior, simple as
it is, is different from previous explanations
that have assumed that the crucial factor is
the kind of data obtained. Thus, Block (1977)
noted that there is impressive evidence for
the stability of personality in ratings by
judges of real-life behavior (R data) and in
self-reports (S data) and that there exists
impressive evidence for the coherence of the
two forms of data. He further observed,
however, that there is no evidence for sta-
bility in personality when the data consist
of the direct measurement of objective be-
havior (T data) and that such data are, at
best, tenuously related to the other two kinds
of data. Block concluded, “It is now in-
cumbent upon us to consider . . . what strate-
gies are likely to extend the realm of coherence
so as to include as well the domain of T-data”
(p. 63). Mischel (Note 3) also expressed con-
cern over the lack of relationship between
objective data and data derived from ratings 
by others and from self-report. He stated, 


An important test, although surely not the only 
one—of the utility of constructs about personality 
dispositions remains their ability to predict the individual’s
behavior in specific situations. Unless R and S
data predict T data appreciably, the links between 
trait impressions and specific behavior-in-situations 
remains tenuous. (p. 4) 


If our reasoning is correct, the following 
hypothesis and corollary should define the 
conditions for routinely establishing stability 
in all kinds of data, including ratings by 
others, self-ratings, and objective behavior, 
and for relating ratings by others and self- 
ratings to objective behavior. 


Hypothesis 


Stability can be demonstrated over a wide 
range of variables so long as the behavior in 
question is averaged over a sufficient number 
of occurrences. This applies equally to data 
derived from the direct measurement of ob- 
jective behavior, from self-reports, and from 
ratings by others. 


Corollary 


Reliable relationships can be demonstrated 
between ratings by others and self-ratings, 
including standard personality inventories on 
the one hand and objective behavior on the 
other, so long as the objective behavior is 
sampled over an appropriate level of gen- 
erality and averaged over a sufficient number 
of occurrences. 


Empirical Testing of the Hypothesis 
and Corollary 


A series of studies has been completed that 
examined the temporal reliability and in some 
cases the validity of data derived from self- 
observation, observation by others, and the 
direct measurement of objective behavior. In 
all cases, a similar procedure was followed, 
that is, behavior was observed on several 
occasions, and single observations were treated 
like single items on a test. More specifically, 
stability coefficients were first determined for 
a 1-day sample by correlating each subject’s 
scores on Day 1 with each subject’s scores 
on Day 2. Next, coefficients were determined 
for a 2-day sample by correlating the mean 
of a subject’s scores on Days 1 and 3 with 
the mean of the subject’s scores on Days 2 
and 4, and so on, until the mean of a sub- 
ject’s scores on all odd days was correlated 
with the mean of the subject’s scores on all 
even days. This made it possible to examine 
split-half stability coefficients for each vari- 
able as a function of the number of observa- 
tions that were averaged. Such a procedure 
is analogous to that which compares the 
reliability coefficients of tests of different 
lengths.¹ 
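
The odd-even procedure just described can be sketched in code. The following is a minimal illustration, not the analysis actually run in these studies: the function names, the simulated data, and the assumed model (a stable person component plus independent day-to-day error) are all hypothetical, and serve only to show how a split-half stability coefficient is computed for samples of increasing size.

```python
import numpy as np


def odd_even_stability(scores, k):
    """Correlate the mean of the first k odd days with the mean of the
    first k even days, across subjects.

    scores: array of shape (n_subjects, n_days); column 0 is Day 1.
    k: number of days averaged in each half (k = 1 compares Day 1 with Day 2).
    """
    scores = np.asarray(scores, dtype=float)
    odd_days = scores[:, 0::2]   # Days 1, 3, 5, ...
    even_days = scores[:, 1::2]  # Days 2, 4, 6, ...
    if k < 1 or k > min(odd_days.shape[1], even_days.shape[1]):
        raise ValueError("k exceeds the number of available odd or even days")
    odd_mean = odd_days[:, :k].mean(axis=1)
    even_mean = even_days[:, :k].mean(axis=1)
    return float(np.corrcoef(odd_mean, even_mean)[0, 1])


def simulate_scores(n_subjects, n_days, error_sd, seed=0):
    """Hypothetical data: a stable person component (SD = 1) plus
    independent daily error with the given SD (an assumed model)."""
    rng = np.random.default_rng(seed)
    person = rng.normal(size=(n_subjects, 1))
    error = rng.normal(scale=error_sd, size=(n_subjects, n_days))
    return person + error


if __name__ == "__main__":
    # 28 simulated subjects observed on 24 days (illustrative values only).
    data = simulate_scores(n_subjects=28, n_days=24, error_sd=2.0)
    for k in (1, 2, 4, 8, 12):
        print(k, round(odd_even_stability(data, k), 2))
```

With only 28 simulated subjects the coefficients fluctuate from sample to sample, so the printed values should be read as one possible outcome of the assumed model rather than as a reproduction of the reported results.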


Study 1: Stability of Self-Recorded Data 


Method 


Study 1 used data available from a previous in- 
vestigation of emotions in everyday life. The selection 
of variables in the study was influenced by a theory 
of the self-concept that assumes that a person’s sig- 
nificant postulates about self and world can be in- 
ferred from events that elicit emotions (Epstein, 1973, 
1976, in press). For 1 month 14 men and 14 women 
university students kept records, on specially devised 
forms, of their most pleasant and unpleasant ex- 
perience each day. At the end of each week, they 
met with interviewers who answered their questions 
and checked for errors. The number of recording days 
completed by subjects varied from 24 to 34. The 
first page of each form consisted of a blank page for 
describing the incident in narrative form. This was 
followed by a 90-item adjective checklist for recording 
emotions and a 66-item checklist for recording response 
tendencies and for noting whether they had been 
carried out. Emotions and impulses were scored on 
a 4-point scale for intensity, with a blank signifying 
zero intensity. Behavior carried out was scored on a 
2-point scale for carried out or not carried out. Items 
were collapsed into broader scales by a factor analysis 
of the data. These broader scales constituted the 
data that were analyzed in the present study. As an 
example of a scale of emotions, worthy included the 
following six adjectives: adequate, competent, ap- 
preciated, respected, pleased with self, and proud. As 
an example of a scale of response tendencies, stimulus 
seeking included the following three items: to try 
something new and adventurous; to take risks, to 
gamble, to seek stimulation; and to seek thrills and 
excitement. Scores were obtained by averaging the 
responses to the items in a scale. 

As an afterthought, it was decided to score the
narratives for the conditions, or situational
factors, responsible for eliciting the emotions. Despite
successive modifications of Murray’s (1938) system
for scoring Thematic Apperception Test protocols and
intensive training of judges, it was not possible to
obtain satisfactory interscorer reliability coefficients,
as information necessary to make decisions was
unavailable. Rather than drop the scoring of situational
factors, it was decided to retain it, with the
realization that whatever results were obtained on
the stability of situational factors would underestimate
the true stability but would nevertheless be of interest.

Results 


Between-subjects reliability coefficients: The
stability of individual differences. In Table 1,
stability coefficients for pleasant experiences
are presented for the mean of all odd days
compared to the mean of all even days, for
Day 1 compared to Day 2, and for the last
day compared to the next-to-last day. Also
included are the means and standard deviations
for all variables for the data averaged
over all days. Table 2 presents the same data
for unpleasant experiences. It can be seen in
both tables that for a 1-day sample, most
reliability coefficients are below .30, and this
is true for the last 2 days as well as for the
first 2 days, indicating that a bias toward
greater stability did not develop over the
course of the study. On the other hand, when
the mean of all odd days is correlated with
the mean of all even days, most coefficients
are above .70, and some are above .90.
Emotions, which were the central focus of the
study and on the basis of which events were
selected by the subjects, had the highest
stability coefficients, with a mean of .79 for
unpleasant emotions and of .88 for pleasant
emotions. It is not evident why positive
emotions should produce higher reliability
coefficients than negative emotions, but this
finding has held up in all further studies.
Perhaps subjects are more confused about
their negative than their positive feelings.


¹ A related procedure is reported by Patrick,
Zuckerman, and Masterson (1974; also see Zuckerman,
1976). They administered an adjective checklist to
a class on 13 occasions. They then examined the
correlations of responses over an increasing number of
days with the sum of the responses over all days.
Not surprisingly, as the number of days increased,
so did the correlations with total days, until a perfect
correlation was obtained when the data over all days
were correlated with themselves. Of greater importance
is the form of the curve, which was negatively
accelerated. Correlations increased rapidly over the first
3 days, after which increments became relatively
small. This indicates that for data such as those
examined in the study, aggregating over three occasions
can effectively predict the results for a
larger number of observations.


Table 1
Between-Subjects Reliability Coefficients, Means, and Standard Deviations for All
Pleasant Experience Variables in Study 1

                                                        Correlation
                                           All odd vs.     Day 1 vs.   Last vs. next-
Variable                        M     SD   all even days   Day 2       to-last day

Emotion
  Happy                       89.4   32.2      .92           −.03         .22
  Kindly                      69.1   40.7      .89           [?]          .29
  Calm                        75.3   [?]       .88            .43         .34
  Adequate                    52.6   34.7      .87            .37         [?]
  Unified                     25.3   35.4      .87           [?]          .44
  Energetic                   60.5   29.0      .86           [?]          [?]
  M                           62.0   35.0      .88            .36         .34
Situation
  Entertainment                6.1    8.4      [?]           −.10        −.05
  Love and affection          22.7   15.2      [?]            .10         .47
  Freedom                     10.9   10.2      .73            .10         .34
  Positive evaluation         20.7   14.4      .57            .18        −.10
  Affiliation                 29.0   17.1      .55           [?]          .08
  Adequacy                    20.8   15.0      .52           [?]          .21
  Pleasant physical stimulus   6.2    6.8      .45            .35        −.08
  Relief                      19.9   [?]       .35           −.19        −.03
  Aesthetic stimulus           5.9    6.8     −.07            .00         .00
  M                           15.8   [?]       .55            .00         .09
Impulse
  Pleasure seeking            50.1   29.1      .92            .08         .48
  Nurturance                  78.2   43.3      .87            .13         .38
  Exuberance                  62.7   41.0      .82            .38         .26
  Stimulus seeking            19.5   20.6      .77           −.18         .04
  Problem solving             12.6   10.8      .71            .48         .21
  Affiliation                 46.9   25.9      .68            .36         .40
  Achievement                 24.6   20.1      .58            .10         .09
  M                           42.1   27.3      .76            .19         .27
Behavior
  Nurturance                  23.6   17.7      .95            .06         .62
  Exuberance                  13.4   15.1      .91            .28        −.22
  Pleasure                    13.7   10.0      .89           −.28         .34
  Affiliation                 14.4    9.3      .77            .27         [?]
  Problem solving              4.0    3.6      .70           [?]          [?]
  Stimulus seeking             3.6    4.7      .57           [?]          [?]
  Achievement                  6.4    5.4      .40           [?]          [?]
  M                           11.3    9.4      .74            .06         .28

Note. Correlations of .37 and .48 are significant, respectively, at the .05 and .01 levels. N = 28.


As expected, the lowest reliability coeffi- 
cients were obtained for situations. Yet, even 
here, the mean correlation was well above 
the .30 barrier, more than half the correla- 
tions were significant at the .01 level, and 
several were above .70. The finding of stable 


1108 SEYMOUR EPSTEIN 


Table 2
Between-Subjects Reliability Coefficients, Means, and Standard Deviations for All
Unpleasant Experience Variables in Study 1

                                                        Correlation
                                           All odd vs.     Day 1 vs.   Last vs. next-
Variable                        M     SD   all even days   Day 2       to-last day

Emotion
  Blocked                     43.7   27.4      .91            .25         [?]
  Fragmented                  36.0   25.0      .89            .32         [?]
  Depressed                   60.9   33.5      .88            .22         [?]
  Tense                       34.6   23.3      .82            .52         [?]
  Angry                       70.7   32.7      .77            .26         [?]
  Frightened                  46.0   21.8      .72            .20         [?]
  Inadequate                  25.9   19.8      .72            .09         [?]
  Tired                       37.0   29.2      .62            .12         [?]
  M                           44.5   26.6      .79            .25         [?]
Situation
  Loss of love                12.8   12.4      .70           [?]          [?]
  Noxious stimulation         11.0   11.1      .54           −.01         [?]
  Frustration                 46.2   18.2      .53            .06         [?]
  Isolation                    4.6    6.0      .50           −.04         [?]
  Inconsideration             15.9   11.3      .44            .08         [?]
  Attack                      13.3   12.6      .42            .19         [?]
  Identification              11.5   11.5      .40           −.16         [?]
  Failure                     18.4   12.5      .29           −.06         [?]
  Immorality                   6.1    6.9      .28           −.04         [?]
  Accidental injury            4.4    6.2      .06            .00         [?]
  M                           14.4   10.9      .40           −.02         [?]
Impulse
  Problem solving             26.2   18.9      .88            .64         [?]
  Stimulus reduction          33.5   16.6      .85           −.04         [?]
  Affiliation                 31.4   22.4      .82            .32         [?]
  Counteraction               39.0   21.0      .82            .08         [?]
  Physical escape             29.9   25.0      [?]            .06         [?]
  Mental escape               49.1   24.0      .68           −.06         [?]
  Self-punishment             18.3   16.1      .68           −.06         [?]
  Tension discharge           18.1   17.6      .66            .48         [?]
  Achievement                 13.3   10.9      .64           −.05         [?]
  Aggression                  32.4   29.2      .64            .43         [?]
  Nurturance                  19.4   22.8      .51            .24         [?]
  Withdrawal                  32.7   19.4      .51           −.13         [?]
  M                           28.6   20.4      .70            .16         [?]
Behavior
  Affiliation                  6.5   [?]       [?]           [?]          [?]
  Problem solving              8.2   [?]       [?]           [?]          [?]
  Stimulus reduction           8.7    5.7      .82            .13         [?]
  Counteraction                8.6   [?]       .79            .18         [?]
  Mental escape               12.7    9.2      .73            .06         [?]
  Achievement                  3.1    3.7      .65           −.16         [?]
  Nurturance                   3.6    5.9      .60            .08         [?]
  Withdrawal                   5.2    5.5      .55           −.06         [?]
  Aggression                   1.4    2.4      .43            .56         [?]
  Self-punishment              1.8    2.3      .35           −.04         [?]
  Tension discharge           [?]     3.7      .22           [?]          [?]
  Physical escape              1.2    1.9     −.05            .00         [?]
  M                            5.3    5.2      .57           [?]          [?]

Note. Correlations of .37 and .48 are significant, respectively, at the .05 and .01 levels. N = 28.


individual differences for situational variables
is consonant with the view that an important
source of stability in personality is that
people create stable environments for themselves
(Bowers, 1973; Wachtel, 1973).

Figure 1 presents mean stability coefficients 
averaged over all variables in a category as 
a function of the number of days in the 
sample. In all cases, correlations are relatively 
low for a 1-day sample, rise sharply as the 
number of days in the sample is increased, 
and then approach an asymptote. Figure 1 
can be used to determine how many observa- 
tions are necessary to obtain different levels 
of reliability. It can be seen, for example, 
that for unpleasant emotions, a 4-day sample 
is required for a mean stability coefficient of 
about .50 and a 10-day sample for a mean 
coefficient of about .75. 

The most obvious explanation of the in- 
crease in reliability with averaging over in- 
creasing observations is that the averaging 
reduces error of measurement. This explana- 
tion can be tested by examining standard 
deviations as a function of the number of 
days averaged. In Figure 2 it can be seen 
that for all categories, there is an initial 
sharp decline in standard deviation as a function 
of increasing observations, followed by 
a gradual approach to an asymptote. The 
curves, to a large extent, are an upside-down 
reflection of the curves for the correlations 
in Figure 1. As the correlations get larger, 
the standard deviations get smaller. This 
indicates that the variance being eliminated 
is within-subject variance, which from the 
viewpoint of the present study (where con- 
cern is with establishing stable individual 
differences), can be regarded as error variance. 
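
The mirror-image relation between the two sets of curves is what one would expect under a simple model, stated here as an assumption rather than as something the study tested directly: if each daily score is a stable person component plus independent daily error, the error variance of a k-day mean shrinks in proportion to 1/k. The sketch below uses hypothetical variance values, not estimates from Study 1, and the function name is likewise illustrative.

```python
import math


def k_day_mean_summary(person_var, error_var, k):
    """Expected between-subjects SD of a k-day mean, and the expected
    correlation between two independent k-day means, under the assumed
    model: daily score = person component + independent daily error."""
    if k < 1:
        raise ValueError("k must be at least 1")
    total_var = person_var + error_var / k  # error variance shrinks as 1/k
    return math.sqrt(total_var), person_var / total_var


if __name__ == "__main__":
    # Hypothetical variances chosen only for illustration.
    for k in (1, 2, 4, 8, 12):
        sd, r = k_day_mean_summary(person_var=1.0, error_var=4.0, k=k)
        print(f"{k:2d} days: SD = {sd:.2f}, r = {r:.2f}")
```

Under these assumptions the standard deviation falls toward the square root of the person variance while the correlation rises toward 1.0, so each curve approaches an asymptote from the opposite direction.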

To what extent are the findings on stability 
coefficients limited because the data consist 
of self-report statements? Such a question 
misses the point, which is to demonstrate 
that in a condition where a high level of 
stability can be shown to exist by adequate 
sampling, it would not be found if only a 
single event had been observed. It follows 
that the vast majority of studies that have 
failed to find stability in personality when 
comparing behavior on single occasions used 
procedures incapable of demonstrating sta- 
bility. In any event, other studies, reported 
later, indicate that the results are not limited 
to self-report data. 


[Figure 1 panel: PLEASANT. Horizontal axis: NUMBER OF DAYS IN SAMPLE (2, 4, 6, 8, 10, 12, Max).]


Figure 1. Between-subjects reliability coefficients in Study 1 as a function of the number of days
in the odd-even samples. (Values plotted represent the mean of the correlations for the variables in
a category. [From “Traits are Alive and Well” by S. Epstein. In D. Magnusson & N. S. Endler
(Eds.), Personality at the crossroads: Current issues in interactional psychology. Copyright 1977 by
Erlbaum, Hillsdale, N.J. Reprinted by permission.])


Figure 2. Standard deviations as a function of the number of days in the sample in
Study 1. (Values plotted represent the mean standard deviation for the variables in a category.)


Within-subject reliability coefficients: The stability
of the organization of variables within an 
individual. No matter what else personality 
is, it is widely recognized that it involves an 
organization of variables within an individual. 
Yet, despite universal agreement on this point, 
Carlson (1971) found not a single study that 
addressed itself to the organization of vari- 
ables within the individual. The first step in 
such an inquiry might be to establish whether 
such organization can be demonstrated to 
have a reasonable degree of stability. If what 
has been observed about error of measurement 
among individuals applies to the organization 
of variables within individuals, then the sta- 
bility of such organization should be demon- 
strable when data are averaged over sufficient 
observations but not when they consist of 
single observations. 

For each of the 28 subjects, correlations 
were obtained across variables within a cate- 
gory. A high correlation for a subject indi- 
cated that his or her profile was stable over 
days. Pleasant and unpleasant experiences 
were treated separately, and the data across 
variables within a category were averaged for 
different numbers of days. It can be seen in 
Figure 3 that when profiles were derived from 

single observations, the average intrasubject
stability coefficient was less than .25 over all
categories. As the number of days from which
individual profiles were derived was increased,
the stability coefficients also increased. The
average correlation for profiles based on a
maximum number of odd days versus the
maximum number of even days varied between
.60 and .79, depending on the category.
Relatively high stability coefficients were
even obtained for profiles of situations.
Table 3 presents the ranges and means of
the correlations of the profiles within individuals
for the different categories, with the
data based on the maximum number of days
in the odd-even samples. It is apparent that
there are marked individual differences in the
stability of profiles, with some subjects exhibiting
almost perfect stability, and others
exhibiting a lack of stability. It may be concluded
that within-subject reliability coefficients
provide evidence for a relatively high
degree of stability of the organization of
variables within most individuals when the
data are derived from sufficient observations
but provide no such evidence when the data
are derived from single observations. It may
further be concluded that there are marked
individual differences in the degree to which
individuals exhibit stability in their personality
profiles.
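
The within-subject coefficient differs from the between-subjects coefficient only in what is correlated: for a single subject, the correlation runs across the variables in a category rather than across people. A minimal sketch follows, with a hypothetical function name and data layout; it is not the original analysis code.

```python
import numpy as np


def profile_stability(subject_scores, k):
    """Within-subject odd-even profile correlation for one subject.

    subject_scores: array of shape (n_days, n_variables) for a single
    subject within one category (for example, six emotion scales);
    row 0 is Day 1.
    k: number of odd days and of even days averaged into each profile.
    """
    x = np.asarray(subject_scores, dtype=float)
    odd_days, even_days = x[0::2], x[1::2]
    if k < 1 or k > min(len(odd_days), len(even_days)):
        raise ValueError("k exceeds the number of available odd or even days")
    odd_profile = odd_days[:k].mean(axis=0)    # one mean per variable
    even_profile = even_days[:k].mean(axis=0)
    # The correlation runs across variables, not across subjects.
    return float(np.corrcoef(odd_profile, even_profile)[0, 1])
```

Applying the function to each subject in turn and averaging the results over subjects would give one point on a curve of the kind described for Figure 3; the per-subject values themselves correspond to the ranges summarized in Table 3.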


STABILITY OF BEHAVIOR: I 


[Figure 3 panels: UNPLEASANT and PLEASANT. Curves: Situations, Emotions, Impulses, Behavior. Vertical axis: RELIABILITY COEFFICIENT. Horizontal axis: NUMBER OF DAYS IN SAMPLE (2, 4, 6, 8, 10, 12, Max).]


Figure 3. Within-subject reliability coefficients as a function of the number of days in the odd-even 
samples in Study 1. (Values plotted represent the mean of the correlations for the variables in a cate- 
gory. [From “Traits are Alive and Well” by S. Epstein. In D. Magnusson & N. S. Endler (Eds.), 
Personality at the crossroads: Current issues in interactional psychology. Copyright 1977 by Erlbaum, 
Hillsdale, N.J. Reprinted by permission. ]) 


Table 3
Means and Ranges of Within-Subject Odd-Even Reliability Coefficients for
Maximum Number of Days Sampled in Study 1

                              Least reliable       Most reliable
                              subject within       subject within      Mean reliability
                              category             category            for all
Category                n     Subject no.    r     Subject no.    r    subjects

Unpleasant experience
  Emotions              8         28        .24        16        .96       .68
  Situations           10         10       −.05        28        .95       .69
  Impulses             12         14        .18        20        .95       .66
  Behavior             12          2       −.17        25        .97       .60
Pleasant experience
  Emotions              6         12        .13         3        .95       [?]
  Situations            9          2       −.04         7        .97       .65
  Impulses              7          2        [?]        23        .99       [?]
  Behavior              7         10        .24        28       1.00       .79


Study 2: Stability of Behavior Observed
by Others


Method


The data for this study were obtained from an
investigation by Barry Leon (Note 4) of social and
impulsive behavior in everyday life. Thirty-two pairs
of women students at Smith College served as subjects
and observers in the study. For a couple to be
selected, the observer had to have ample opportunity
to observe the subject during and outside of class.
The observer kept records of the subject’s behavior
on eight items related to impulsivity and sociability.
Ratings were made on 3-point scales for frequency
of occurrence. The following is an example of one of
the items: “She actively sought out the company of
others: (a) never, (b) once or twice, or (c) three or
more times.”


Figure 4. Reliability coefficients for each of the variables in Study 2 as a function of the number
of days in the odd-even samples.


Results


As in the previous studies, odd-even correlations
were computed for the data averaged
over different numbers of days. It can be
seen in Figure 4 that for six out of eight
variables the pattern is the same, with relatively
low reliabilities for a 1-day sample
and with increasing reliabilities as the number
of observations is increased, until a relatively
high level of reliability is obtained. Variable 2,
which referred to the closeness of acquaintances
with whom a subject initiated conversations
and which was a poorly constructed item,
since it did not unambiguously define a quantitative
continuum, never achieved reliability,
whereas Variable 4, which referred
simply to the number of times the subject
initiated contacts with others, started with
a high reliability coefficient and maintained
it throughout.

To examine whether the establishment of
a rating set during the course of the study
influenced the results, correlations between
the 1st and 2nd day were compared with
correlations between the next-to-last and last 
day. The latter correlations were slightly 
lower, indicating that the establishment of 
a rating set during the study could not have 
contributed to the increase in correlations 
that occurred with increasing observations. 

A further possibility for contamination of 
the stability coefficients lies in the inferential 
processes of the observers. That is, a judge's 
view on stability could have influenced her 
ratings independent of the behavior of her 
subject. To the extent that this occurred, it 
would most clearly be revealed in the ratings 
of variables that required the most inference. 
To examine this possibility, variables were 
sorted into three groups according to the 
degree of inference required on the part of 
the observer. It was found that what dif- 
ferences occurred were in the direction of the 
items which required the least inference pro- 
ducing the highest stability coefficients. Items 
3 and 4, which required no inference, produced
reliability coefficients of .90 and .89,
respectively, which were among the highest
obtained (see Figure 4). As demonstrated
later, other findings also indicate that the
more objective the data, the higher the
stability coefficients rise as a function of
averaging over observations.


Study 3: Stability of Directly Measured 
Objective Behavior 


Method 


This study, in addition to replicating many aspects
of Study 1, examined discrete items of objectively
measured behavior. It was conducted in two separate
classes as an exercise in research.

Nineteen undergraduate seniors in a class in clinical
psychology and 15 1st-year graduate students in a
class in personality, at the beginning of each class
period during the second half of the semester, filled
out an adjective checklist on their current emotional
state. Following this, they recorded information from
daily behavior tallies that they had made since the last
meeting. In addition, unknown to the class, the instructor
kept records of selected items of behavior.

Current emotional state was rated by the students
on 5-point graphic scales anchored at one end with
“not at all” and at the other with “very.” The scales
were identified by clusters of three adjectives, such
as happy-cheerful-joyous, the clusters having been
determined by factor analysis of adjective checklists
used in previous, similar studies (e.g., Epstein, 1976).
Ten scales for positive emotions and 10 separate,
opposite scales for negative emotions were randomly
arranged. In addition to the usual emotions, scales
included feelings of self-esteem, arousal, and ability 
to assimilate information, as these were of special 
interest for a theory relating emotions to the self- 
concept (see Epstein, 1973, 1976, in press). After 
rating their current emotional states, the students 
transcribed from their daily tally sheets the frequency 
with which the following occurred since the last class 
meeting: social telephone calls made and received, 
social letters written and received, entertainment 
events attended, and number of headaches and stomach 
aches. Next, with the instructor keeping time for the 
class, subjects took three 30-sec samples of their 
pulse rate and recorded the mean and range. The 
above information was recorded on computer-scored 
standard answer sheets. Without the knowledge of the 
students, the instructor kept records of the following: 
borrowing of a Number 2 pencil routinely made 
available to students who had forgotten to bring 
their own, number of minutes late to class, number 
of errors and omissions in providing information on 
the answer sheet, and number of erasures in filling 
in the spaces for recording answers. In addition, the 
instructor kept records throughout the semester of 
number of absences, number of papers turned in late, 
and number of papers not turned in at all. These 
will be referred to as fixed variables, as they do not 
vary over days. 

Students in both classes were combined into a 
single group for the analysis of all variables except 
minutes late and pencils borrowed, which were recorded 
only for the undergraduate class. The data finally 
selected consisted of the records of 15 undergraduates 
and 11 graduates who had filled out forms properly 
on at least six Tuesdays and six Thursdays. For all 
split-half correlations, a balance was maintained be- 
tween Tuesdays and Thursdays to control for the 
unequal number of days between meetings. 


Results 


To evaluate the effects of bias in the self- 
ratings, it is important to consider that the 
data can be divided according to level of 
objectivity in two ways. First, they can be 
classified as subjective, in the sense that they 
are descriptions of inner states, and objective, 
in the sense that they can be observed by an 
external observer. Second, the latter category 
can be subdivided according to the degree to 
which there is the possibility for bias to enter. 
For example, when a subject records the 
number of telephone calls or letters he or she 
has received, the figure may be distorted 
according to the impression the subject wishes 
to create. The data can be arranged in three 
categories of decreasing opportunity for subjective
bias as follows: (a) self-recorded events
outside the classroom, such as number of 
telephone calls reported; (b) self-recorded 
physiological reactions in class, such as the 
count of pulse rate, with the instructor doing 
the timing; and (c) examiner-recorded events, 
such as minutes late to class and number of 
erasures on the answer sheets. It will be helpful 
to keep this scheme in mind when examining 
the influence of bias in self-report on the 
stability coefficients. 

In Table 4, stability coefficients are pre- 
sented for a 1-day and a 12-day sample for 
all variables in the study. Also included are 
the intercorrelations for a 12-day sample of 
the objective events with each other and with 
the negative emotions. The results for positive 
emotions are not included because they add 
little to the findings for the negative emo- 
tions, producing by and large equivalent op- 
posite results. Given a total sample of 12 days, 
split-half reliability coefficients could, of 
course, only be computed for subsamples of 
6 days. The reliability coefficients for the 
12-day sample are estimates obtained by the 
Spearman-Brown formula for determining the 
reliability of a total test from the correlation 
between its halves. To assess the accuracy 
of such estimates for the current data, 6-day 
reliability coefficients were estimated from 
3-day samples. In all cases, the estimates 
were almost identical to the values actually 
observed for a 6-day sample. The 12-day 
reliability estimates are presented in Table 4, 
since the intercorrelations of the different 
variables with each other are all based on 
12-day samples, and it is helpful in evaluating 
the validity coefficients to examine the re- 
liability coefficients for the same number of 
observations. 
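
The Spearman-Brown step-up used for the 12-day estimates can be written out briefly. The function below is a generic statement of the formula; the input value in the usage lines is hypothetical and is not taken from Table 4.

```python
def spearman_brown(r, length_factor=2.0):
    """Predicted reliability when a measure is lengthened by length_factor,
    given the reliability r of the shorter measure (Spearman-Brown formula)."""
    denominator = 1.0 + (length_factor - 1.0) * r
    if denominator <= 0:
        raise ValueError("r is too negative for this length factor")
    return length_factor * r / denominator


if __name__ == "__main__":
    half_correlation = 0.60  # hypothetical correlation between two 6-day halves
    print(round(spearman_brown(half_correlation), 2))  # estimated 12-day reliability
```

The same function covers the accuracy check described above: stepping up a 3-day split-half correlation with `length_factor=2.0` gives a 6-day estimate that can be compared with the 6-day coefficient actually observed.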

All items in Table 4 exhibit a marked 
increase in reliability from a 1-day to a 12-day 
sample. For the 12-day sample, 21 of the 23
items have a reliability coefficient of at
least .70, and 9 of these are at least .90.
Of these 9, only one involves the report of
an inner state. When the variables are arranged
in order of increasing objectivity, the
increase in reliability that occurs when data
are averaged over events is as follows: For the 10
unpleasant inner states, the mean
coefficients for a 1-day and a 12-day sample
are .37 and .77, respectively. For self-recorded
behavior that is externally observable, the
corresponding figures are .40 and .96. For
self-recorded physiological reactions, the corresponding
figures are .27 and .94. For
examiner-recorded variables, the corresponding
figures are .44 and .84. It is noteworthy that 
all three categories that include externally 
observable events yield higher reliability co- 
efficients than the category that refers to 
inner states. That the most objective cate- 
gory, examiner-recorded behavior, attained a 
somewhat lower level of reliability (.84) than 
the other two categories of externally ob- 
servable behavior is attributable to one vari- 
able, number of erasures, which had a rela- 
tively low frequency of occurrence. The other 
two variables in this category, lateness and 
pencils forgotten, obtained reliability coeffi- 
cients of .94 and .93, respectively. It may be 
concluded that the increase in reliability as 
a function of averaging over an increasing 
number of observations cannot be accounted 
for by subjective bias associated with self- 
report, since the same phenomenon was as 
well demonstrated for objective data. This 
same conclusion is supported by inspection 
of Figure 5, which presents the data grouped 
in a somewhat different fashion. It is apparent 
in Figure 5 that self-ratings of emotions tend 
to differ in the direction of producing lower 
reliability coefficients. 

Having established high levels of reliability, 
let us now examine the data with respect to 
validity. In Table 4, it is evident that the 
number of significant correlations of objective 
events with other objective events and with 
inner states is well beyond chance. Further, 
the correlations form coherent patterns. Calls 
made, calls received, letters written, and 
letters received are all variables that involve 
communication with others and are inter- 
correlated. Erasures, papers missing, and 
lateness to class, all suggestive of carelessness 
and lack of organization, are related to the 
feeling states of tension, powerlessness, and 
confusion. Heart rate mean, heart rate range, 
headaches, and stomachaches, suggesting 
heightened physiological arousal and psychosomatic
reactivity, are correlated with unhappiness,
confusion, and anger-in but, interestingly,
not with anger-out. Heart rate
range, but not heart rate mean, despite the 
former having a lower reliability coefficient, 
is significantly correlated with errors (.57) 
and with erasures (.32), suggesting a variable 
of physiological lability that is associated with 
behavioral lability. 

It may be concluded that high levels of 
stability can be demonstrated for objective 
as well as subjective data when the data are 
averaged over a sufficient number of events 
and that once reliability is established, evi- 
dence of validity may not be far behind. 


Study 4: Relationship of Personality 
Inventories to Behavior 


There are four ways of interpreting the 
evidence that personality inventories have in 
the past produced, at best, low correlations 
when evaluated against a criterion of objective 
behavior. One is that the two are measuring 
nonoverlapping aspects of behavior; another 
is that the inventories are inadequate; a third 
is that the objective criteria are inadequate; 
and a fourth is that both are inadequate. 
Let us consider the possibility that the cri- 
teria are inadequate, either because they are 
unreliable or because they are so narrow and 
limited in representativeness as to share little 
variance in common with the broader at- 
tributes sampled by personality inventories 
(see Davidson & Jaccard, 1975; Eagly, 1978; 
Fishbein & Ajzen, 1974; Jaccard, 1974; 
McGowan & Gormly, 1976; Weigel & New- 
man, 1976). To the extent that either of 
these limitations exists, an increase in the
sample of objective behavior comprising the 
criteria should produce an increase in the 
correlations between personality inventories 
and objective behavior. Study 4 was under- 
taken to examine this possibility. 


Method 


Forty-five undergraduates kept records (on forms
similar to those used in the other studies) of their 
feelings and behavior on 14 consecutive days, not 
including weekends. Each day, at the same time, 
a subject set aside 10 min to rate his or her current 
feelings and to obtain three 30-sec samples of pulse 
rate. Feelings were rated on bipolar scales with 10
subdivisions that were anchored at one end by a 
cluster of adjectives, such as happy-cheerful-joyous, 


and at the other end by an opposite cluster, such as 
unhappy-sad-depressed. A tally was kept of the
following, recorded at the end of each day: number
of social phone calls made and received, number of
social letters written and received, number of
contacts initiated with groups of three or more people,
number of headaches, and number of stomachaches.
On the morning of each day, hours of sleep
the previous night were recorded, and a rating was
made of soundness of sleep.
Before they began recording their daily records,
subjects took a battery of personality inventories,
including a specific and a general form of a specially
constructed inventory made to resemble as closely as
possible the forms used for daily recording. The
specific form contained precise descriptions of
situations and responses. The general form was nearer
in wording to most standard personality inventories
in that it was more vague or general in descriptions
of events and response options. An example


other with “very often.” The corresponding item in
the specific form was, “How many social phone calls
do you make, on the average, over a 5-day period,
not including weekends? Consider social calls to
include all calls other than business calls.” The
response options were as follows: less than 1, 1-3, 4-6,
7-9, 10 or more. The general form was always
administered before the specific form. Both forms
included all the items of objective behavior included
in the daily forms, with the exception of heart

*The results are quite different when data from
a 1-day sample are examined. A 1-day sample yields
a total of eight correlations that are significant at
the .05 level when the same variables as in Table 4
are related to each other, as compared to 57 significant
relationships when the variables are averaged
over 12 days (see Table 4). Moreover, of the eight
significant correlations for the 1-day sample, five were
produced with the correlation of a fixed variable,
papers missing, which is based on the entire semester.
The significant correlations for which both variables
were based on a 1-day sample were as follows: calls
received and calls made (.49), stomachaches and
headaches (.58), and unhappy and stomachaches |

To further illustrate the difference between the
results for a 1-day and a 12-day sample, consider
the interesting, unanticipated findings for heart rate
range for a 12-day sample and the absence of
significant relationships for heart rate range when
a 1-day sample was examined. In Table 4 it can be
seen that for a 12-day sample, heart rate range
produced significant correlations with the objective
variables of heart rate mean and errors and with self-rated
feelings of frustration, unworthiness, and confusion,
thereby suggesting an interesting psychosomatic
relationship between autonomic lability and behavioral
and cognitive lability, a relationship of which there
is not a trace in the 1-day sample.
Figure 5.
Study 3. (Values plotted represent the mean of t 


values for the 12-day samples were estimated from 


rate mean and heart rate range. Frequency of emotions, in
the form of an adjective checklist containing single
adjectives, was included only in the general form, as it
did not seem reasonable to ask subjects to estimate
precisely how many times they experienced a particular
emotion per day. Subjects responded with estimates of
the frequency they experienced the emotions by checking
“never,” “rarely,” “sometimes,” “frequently,” or
“nearly always.” Scores on the adjective checklists
were obtained by combining individual adjectives into
the same 6-item clusters as in the daily forms.
In addition to the above two specially designed
forms, the following more standard personality
inventories were administered: the Guilford-Zimmerman
Temperament Survey (1949), the Eysenck Personality
Inventory (Eysenck & Eysenck, 1968), the
Epstein-Fenz Manifest Anxiety scales and the Epstein-Fenz
Hostility scales (Fenz & Epstein, 1965), and the
Epstein-O’Brien Self-Esteem Scale (Epstein, 1976).
The latter three were specially designed for previous
research and have been demonstrated to have some
degree of construct validity (cf. Epstein, 1962, 1976;
Epstein & Fenz, 1970; Fenz & Epstein, 1965).


Results 


The findings on reliability replicate the 
results of the previous studies that demon- 
strated a marked rise in stability coefficients 
as a result of averaging over an increasing 
number of observations. In Table 5, the 
stability coefficients for ratings of inner states 
for a 1-day sample range from .22 for ex- 
ternal versus internal direction of attention 
to .59 for feeling attractive versus unattractive. 
The mean reliability coefficient for the 15 
inner states is .45, which is somewhat higher 
than in the previous study. The increase in 
reliability in the present study for a 1-day 
sample can be attributed to the use of bipolar 
scales that bring six, rather than three, ad- 
jectives to bear on each dimension and that 
by contrasting the opposites, elucidate the 
construct. For the entire 14-day sample, as 
estimated from the split-half 7-day coefficients
by the Spearman-Brown formula, the reliability
coefficients for inner states range

A for external versus internal direction 
of attention to .94 for feeling attractive versus
unattractive, with a mean of .86 for the
15 inner states.
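
The step from a 7-day split-half coefficient to a 14-day estimate described above relies on the Spearman-Brown formula. The following is a minimal Python sketch for illustration only; the function name and the input value are hypothetical and are not taken from Table 5.

```python
def spearman_brown(r_short: float, k: float = 2.0) -> float:
    """Predicted reliability when a measure is lengthened k-fold.

    r_short: reliability of the shorter measure, e.g. the correlation
             between the mean of odd days and the mean of even days.
    k:       lengthening factor (2 steps a split-half coefficient up
             to the full-length sample).
    """
    return k * r_short / (1.0 + (k - 1.0) * r_short)


# Hypothetical split-half (7-day vs. 7-day) coefficient, for illustration.
r_7day = 0.75
print(round(spearman_brown(r_7day), 3))  # 0.857
```

Under the formula's assumption that the two halves are parallel measures, the full-length coefficient always exceeds the split-half coefficient whenever the latter lies between 0 and 1.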

A similar increase in reliability as a function
of the increase in number of occasions
occurs for the more objective data. For a
1-day sample, the reliability coefficients
range from .09 for hours of sleep to .70 for
heart rate mean, with an average correlation
Yor the 12 objective variables. The
estimated reliability coefficients for a 14-day
sample from 7-day samples range from .74
for letters received and letters written to .97
for social contacts initiated, with an average
correlation of .88. It is noteworthy that for
the objective events, the two kinds of data
that were most objective in the sense that
they involved nothing more than a count
made under supervised conditions in the
classroom, that is, heart rate mean and heart
rate range, exhibited the same increase as
the other measures, attaining two of the
highest reliability coefficients, .94 and .93,
respectively. Moreover, although the former
had a relatively high stability coefficient for
a 1-day sample, the latter did not.
Let us now turn to a consideration of
the relationships of the
recorded


beginning with the inventories designed
to match the daily records on an
item-by-item basis. It can be seen in Table 5
that the general form of the inventory has


validity coefficient of .53 for objective
a validity coefficient


There was ap-
ine i validity as the
wording in the self-report inventory was made


ts are greater than .70, indicating a substantial
degree of validity for 40% of the
items. These items, with their validity coefficients,
are as follows: calls made (.71), calls
received (.72), social contacts initiated (.73),
and hours of study (.79). Although hours of
study requires a degree of inference, as it is
not clear exactly what constitutes study,
number of letters received and written simply
requires a tally of unambiguous events. Thus,
the results are not easily explained away by
the assumption that daily tallies of events
and self-report estimates over an extended
period can be viewed as common method
variance, in that they both require reports
by subjects. Moreover, if all that is involved
is common method variance, the correlations
of self-report estimates with daily recordings
should be as great for a 1-day sample of the
latter as for a 14-day sample, which inspection
of the data indicates is not the case,
and could have been predicted from the consideration
that reliability provides a limiting
condition for validity. It is noteworthy, in
this respect, that in 9 of the 10 variables
of objective events under consideration, the
validity coefficients for the specific form of
the inventory against the criterion of objective
events over a 14-day sample are higher than
the matching reliability coefficients for a 1-day
sample of objective events.
Validity can further be examined by noting
the relationships of the standard personality
inventories with the 14-day samples of the
daily records. It should not be expected, of
course, that the relationships will be very
high, given the different ranges of generalization
of the recording of specific
kinds of behavior and of the broad personality
dimensions measured by the personality inventories.
In Table 5 it can be seen that
there are far more significant correlations
than would be expected by chance and that
the relationships are, for the most part,
coherent and high for what is
usually obtained for validity coefficients
against objective criteria. There is a large
number of relationships that break the .30
barrier, and no correlation is in an opposite
direction from expectancy. All three of the
anxiety scales in the Epstein-Fenz inventory
(Fenz & Epstein, 1965) correlate significantly


the mean validity coefficient
was .19 as contrasted

a 14-day sample. For the general form
tive events, the mean validity
4-day samples were, respec-


3. For the specific form of the same
ing figures were .40 and .61.
in the expected direction with daily ratings
of threat and tension. Although it is true 
that the anxiety scales also correlate with 
daily ratings of other negative feelings, such 
positive correlations are to be expected, as 
there is a general tendency for negative 
emotions to cluster together (see factor anal- 
ysis of similar data reported in Epstein, 1976). 
One or more of the anxiety scales correlates 
significantly negative with soundness of sleep, 
stomachaches, and headaches. Moreover, the 
scale of muscle tension correlates more highly 
with headaches, often a symptom of muscle 
tension, than with stomachaches, often a symp- 
tom of autonomic arousal, whereas the reverse 
is the case for the scale of autonomic arousal. 
The scales of overcontrolled hostility and 
undercontrolled hostility are both directly 
correlated with daily ratings of threat and 
of tension, which is reasonable considering 
that hostility is often a reaction to threat. 
Yet, only the scale of overcontrolled hostility 
is significantly correlated with daily ratings 
of inhibition, inward direction of attention, 
feelings of helplessness, and lack of reactivity, 
all of which suggest restraint. 

In view of the observation in the previous 
study that heart rate range is associated with 
behavioral lability, it is noteworthy that heart 
rate range in this study is significantly cor- 
related with undercontrolled hostility but not 
with overcontrolled hostility, whereas heart 
rate mean is associated with neither. It is 
of further interest that heart rate range, but 
not heart rate mean, is significantly asso- 
ciated with the scale of disturbance over 
hostile feelings (.40) and with Eysenck and 
Eysenck’s (1968) scale of neuroticism (.37), 
which suggests that, despite its lower reliability
than heart rate mean, heart rate
range is a more interesting personality
variable.


In conformity with the theory of the self- 
concept (cf. Epstein, 1973, 1976) that guided 
the selection of the items in the daily forms, 
the Epstein-O’Brien Self-Esteem Scale (Epstein,
1976) correlates with daily feelings of
worthiness (.47), integration (.51), optimism
(.55), and alertness (.40). Among the Guilford-Zimmerman
scales (1949), the scale of restraint
is significantly negatively (-.37), and the
scale of sociability significantly positively (.40)
associated with number of social contacts
initiated. The scale of emotional stability is
positively associated with daily ratings of
integration (.44), kindliness (.41),
(.44), and soundness of sleep (.30) and
negatively associated with frequency |
aches (-.30). The Eysenck (Eysenck &
Eysenck, 1968) scale of extraversion is
strongly associated with daily ratings of spontaneity
(.45), with outgoing feelings
and with number of social contacts initiated
(.52), all of which are recognized elements of
extraversion according to Eysenck.4 The reader
may judge for himself or herself by
examination of Table 5 whether the
provide evidence of coherent relationships between
self-report inventories and records of
daily behavior and feelings.

In summary, the correlations of the daily
records with a specific form of an inventory
that closely matched the items in the daily
form were in the vicinity of .60, the correlations
with a more general form of the
same inventory were in the vicinity of .50,
and the correlations with relevant scales of
more standard personality inventories were
in the vicinity of .40.° It should be noted
that all of these correlations are highly significant,
that all are above the presumed
personality barrier, that none was corrected
for attenuation due to unreliability, and that
most would be about .10 higher if they were
thus corrected. It may be concluded that


4 Not surprisingly, as in the previous study, there
were far fewer significant correlations for a 1-day
sample, and among these the correlations were lower
and less coherent. For example, the Guilford-Zimmerman
scale of Sociability (1949) correlated significantly
with nine variables, including the variable of number
of social contacts initiated, when the data were based
on a 14-day sample, but with only one variable when
the data were based on a 1-day sample. That variable
was self-rated attractiveness, which happens to have a
relatively high reliability for a 1-day sample (see
Table 5).

* Similar results have been reported by Zuckerman
(1977) and his colleagues (Mellstrom, Cicala, &
Zuckerman, 1976; Mellstrom, Zuckerman, & Cicala,
1978; Zuckerman, Note 5). In an interesting series
of studies involving inventories, self-rating, and objective
measures of fear in a variety of settings, it
was observed that the more specifically a trait measure
corresponded to a state rating or to an Ov)
measure of behavior, the higher the correlation.
evidence for a respectable degree of validity
in self-report inventories can be demonstrated 
when the criterion consists of an adequate 
sample of behavior. This is not to imply that 
scores on personality inventories permit a 
high degree of accuracy in predicting behavior. 
The problem of accuracy in predicting be- 
havior will be discussed later. 
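
The remark above that most of the correlations would rise if corrected for attenuation refers to the classical Spearman correction, under which an observed correlation is divided by the square root of the product of the two reliabilities. A minimal Python sketch follows; the function names and the numerical inputs are hypothetical and chosen only to illustrate the arithmetic, not drawn from Table 5.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between true scores.

    r_xy: observed correlation between predictor and criterion.
    r_xx: reliability of the predictor (e.g., an inventory scale).
    r_yy: reliability of the criterion (e.g., a 14-day behavior average).
    """
    if not (0.0 < r_xx <= 1.0 and 0.0 < r_yy <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


def validity_ceiling(r_xx: float, r_yy: float) -> float:
    """Largest observed correlation obtainable given the two reliabilities."""
    return math.sqrt(r_xx * r_yy)


# Hypothetical values: observed validity .40, both reliabilities .80.
print(round(correct_for_attenuation(0.40, 0.80, 0.80), 2))  # 0.5
# A criterion with reliability .20 caps observed validity well below 1.
print(round(validity_ceiling(0.80, 0.20), 2))  # 0.4
```

The second function expresses the same point made in the Results: reliability sets a limiting condition for validity, so an unreliable single-occasion criterion caps the correlation any inventory can show.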


Discussion 
Stability of Behavior and Error of Measurement 


The classic debate on stability in person- 
ality can be resolved by noting that the 
problem lay in a failure to take into account 
error of measurement as it relates to temporal 
reliability. Single items of behavior, no matter
how carefully measured, like single items in
a test, normally have too high a component
of error of measurement to permit demonstration
of high degrees of stability. Once this
is recognized, the solution to two related
problems becomes apparent. First, the contradictory
findings between a few studies that
have reported stability in personality using
data derived from ratings and a much larger
number of studies that failed to find evidence
of stability using data derived from the direct
measurement of behavior can be accounted
for by the observation that the former studies
examined adequate samples of behavior,
whereas the latter examined single events.
Second, the failure to relate self-ratings,
ratings by others, and personality inventories
to objective behavioral criteria can be accounted
for by the unreliability of the behavioral
criteria, which almost always consisted
of single items of behavior, usually
measured in a laboratory setting.

After a review of the literature, the following
hypothesis and corollary were formulated:
(a) Stability can be demonstrated
over a wide range of variables, so long as
the behavior in question is averaged over a
sufficient number of occurrences. This applies
equally to data derived from the direct measurement
of objective behavior, from self-reports,
and from ratings by others. (b) Significant
relationships can be demonstrated
between ratings by others and self-ratings,
including standard personality inventories on
the one hand and objective behavior on the
other, so long as the objective behavior is 
sampled over an appropriate level of generality 
and averaged over a sufficient number of oc- 
currences. Four studies conducted to test the 
hypothesis and corollary provided unequivocal 
support for their validity. The studies de- 
monstrated that when single events are ex- 
amined, there is little evidence for stability, 
but that when averaging is done over a suf- 
ficient sample of events, there is strong 
evidence for stability, as well as convergence 
among the different kinds of data. The studies 
also demonstrated that once high levels of 
reliability are established, evidence of con- 
struct validity is apt to emerge in relation- 
ships among the different variables, including 
ones that do not share common method
variance. Thus, error of measurement appears
to be the crucial consideration in demonstrating
stability in personality and in relating
self-ratings and ratings by others to objective
data.6
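
The argument of this paragraph, that stability emerges once behavior is averaged over enough occasions, can be illustrated with a small simulation. The sketch below is not the procedure used in the four studies. It assumes, purely for illustration, that each daily score is a stable disposition plus independent daily error, and the function names, sample size, noise level, and seed are all hypothetical.

```python
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_stability(n_subjects=200, n_days=12, noise_sd=2.0, seed=0):
    """Return (1-day stability, odd-even stability of day means)."""
    rng = random.Random(seed)
    # Each subject has a fixed disposition; each day adds independent error.
    traits = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
    days = [[t + rng.gauss(0.0, noise_sd) for _ in range(n_days)]
            for t in traits]
    # Single occasions: day 1 correlated with day 2.
    one_day = pearson([d[0] for d in days], [d[1] for d in days])
    # Aggregates: mean of odd days correlated with mean of even days.
    odd = [sum(d[0::2]) / len(d[0::2]) for d in days]
    even = [sum(d[1::2]) / len(d[1::2]) for d in days]
    aggregated = pearson(odd, even)
    return one_day, aggregated


one_day_r, aggregated_r = simulate_stability()
print(f"1-day stability: {one_day_r:.2f}")
print(f"odd-even stability over 12 days: {aggregated_r:.2f}")
```

Under these assumptions the single-day coefficient is low even though every simulated subject has a perfectly stable disposition, which is the sense in which error of measurement, rather than instability of personality, limits the correlation.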

It is difficult to believe that with rare 
exception (e.g., Tryon, 1973), the concept
of error of measurement was overlooked 
throughout the long history of the debate 
on stability of personality. Perhaps there is 
a lesson to be learned from this. Can it be 
that overevaluation of the experimental 
method, as normally practiced, blinded re- 
searchers to the inherent limitations of 
studying behavior on single occasions? Given 
the awe in which laboratory experimental 
procedures have been held, who would have 
dared to think that they often fail to meet 
one of the most fundamental scientific tests 
of all, temporal reliability (replicability)? 7 


6 Other evidence consistent with this thesis is provided
by Fishbein and Ajzen (1974), Jaccard (1974),
Magnusson and Heffler (1969), McGowan and Gormly
(1976), Stagner (1977), Tryon (1973), and Weigel
and Newman (1976).

7 To recognize the limitations of the psychological 
experiment is not to deny its virtues, the primary 
one of which is to establish causal relationships. 
Moreover, there are remedies for the limitations. One 
remedy is to conduct experiments over sessions when 
this is feasible and appropriate for the phenomenon 
under consideration, thereby permitting temporal 
reliability to be assessed and to be increased by 
averaging over sessions. Another is to encourage 
replications in different laboratories and to not take 
Who would have dared to think that the
emperor was wearing no clothing? In both 
cases the solution is not only emotionally 
unacceptable but seems too simple to be true. 
Yet having arrived at the solution, one finds 
that the future is more hopeful, for delusions 
can neither warm an emperor nor advance 
the cause of science. (See also Brunswik, 
1947, 1956; Hammond, 1954, 1955; Magnus- 
son & Heffler, 1969; Pervin, 1977; and Tryon, 
1973, for other expressions of concern about 
inadequate sampling of situations.) The 
broader implications for psychological re- 
search of the low temporal reliability of single 
observations will be discussed in a second
article devoted to this topic (Epstein, Note 2). 


Issue of Cross-Situational Stability 


It has been argued that the real issue with 
respect to the existence of traits is the de- 
monstration of cross-situational stability. The 
demonstration of stability per se may, of 
course, establish nothing more than the 
existence of narrow habits, which never has 
been at issue. It is thus important to consider 
how the findings reported in this article 
relate to the issue of cross-situational stability. 

It will be recalled that a relatively high 
level of stability in behavior was demon- 
strated over a wide variety of behaviors, 
some narrowly conceived, such as making 
telephone calls, and others broadly conceived, 
such as feelings of kindliness and acts of 
nurturance, all of which could be elicited by 
a variety of situations. Inspection of the data 
indicated that the same responses were, in 
fact, elicited by a variety of situations. As 
the stability that was demonstrated occurred 
over the normal range of situational vari- 


any finding seriously until it has been replicated on
several occasions. Undoubtedly there are some experimental
situations that produce temporally reliable
results on a single occasion, whereas others require
averaging over occasions. The former very likely
include highly potent, ego-involving variables as well
as certain very simple effects. However, which situ-
Beng aly produce temporally reliable results 
occasions cannot simply be assumed 

must t be tablished by EO a seats ti Sct 

rel i i 
SAE eiea i rd of the psychological experi- 


of investigation. This matter
is further pursued in Epstein (Note 2).
ability in everyday life, it would
to state that a meaningful level of cross-situational
stability was demonstrated.
Expressed otherwise, it was demonstrated that

there is enough cross-situational stability in
everyday life so that useful statements about
individual behavior can be made


of course, is the way a trait is usually defined,
and the findings demonstrate the utility of
such a concept.


intercorrelations among the variables,
clusters of correlations occurred not only
among variables sharing the same method
variance, such as ratings of inner states,
but also among variables assessed by different
methods. Thus, a broadly defined variable in
a personality inventory, such as Eysenck's
extraversion scale (Eysenck & Eysenck, 1968),
produced highly reliable correlations with
daily records of feelings of inhibition and
desire for seclusion, as well as with a
of social contacts initiated, all of which are
clearly related to the construct of extraversion
as defined by Eysenck. Number of erasures
and papers missing, which were recorded by
the examiner without the subject’s awareness,
not only correlated reliably with each other
but correlated reliably with daily self-ratings
of feelings of tension, powerlessness, and
confusion.
The conclusion that there are t
broad, stable response dispositions,
does not conflict with the assumption that
situations often exert a strong influence on
behavior. People obviously do not
response dispositions independent of
ting. That is why it is usually necessary to
cancel out situational effects, including
background effects that often go unrecognized,
by averaging over occasions to demonstrate
stability in behavior. By the same token,
demonstrations that behavior varies with
situations cannot be taken as evidence against
traits. The fact that people read in a library
and swim in a swimming pool does not
establish that there is no generality, or
“cross-situational stability,” in either swimming
or in reading behavior. More to the
point is that some people are more likely to
swim than others when there is a reasonable
opportunity to do so, and this may include
swimming in pools, in lakes, and in oceans.
Further, one cannot test such a cross-situational
proclivity to swim by observing a
person once in the vicinity of a swimming
pool and once in the vicinity of a lake, as
there may be many reasons for that person
to forego swimming on a particular occasion.
Behavior is obviously determined by more 
than response dispositions. Given an adequate 
sample of occasions, however, response dis- 
positions will out. 


Issue of Predictability of Behavior 


How stable is stable? It has been demonstrated
that there is sufficient stability in
behavior so that over a sufficient sample of
events, threads of consistency become apparent.
As noted above, in any one instance
behavior is determined largely by the situation.
This, of course, allows one to predict
situational effects averaged over subjects but
not necessarily an individual’s rank order, as
represented by a correlation coefficient, from
one situation to the next.

It is important to recognize that no situation
for an individual can ever exactly reproduce
another, if only because the time lag in between
must have somewhat changed the
individual, including his or her momentary
motivational state. Moreover, always behaving
in the same way in the same situation
without regard to contextual changes would
be behaving in a pathologically rigid manner.
On the basis of such considerations alone, it
should not have been surprising that the
correlations between single items of behavior
on different occasions are usually below .30.

To predict individual behavior with reasonable
accuracy, correlations in the vicinity
of .80 or .90 are required. In the four studies
reported in this article, correlations of such
magnitude were obtained when the average
of a sample of 14 days was correlated with
the average of another sample of 14 days,
but rarely for smaller samples of behavior.
This indicates that one can predict average
behavior accurately from a similar sample of
average behavior. However, the prediction is
only actuarial and is a far cry from predicting
with confidence to individual instances of
behavior. How useful is such actuarial pre- 
diction? It is extremely useful, for it tells 
us that in the long run, we can depend on 
people behaving true to character. As I have 
noted elsewhere (Epstein, 1977), whether we 
are betting our finances on the outcome of 
material events or betting our happiness on 
the outcome of human relationships, it can 
be extremely rewarding to be right more often 
than allowed for by chance. From such a 
perspective, even correlations much lower 
than .80 or .90 can be useful.* With this in 
mind, let us examine the correlations of self- 
report statements with daily behavior.
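
The contrast drawn above between correlations of .80 or .90 and much lower ones can be made concrete with two standard quantities from linear regression: the proportion of criterion variance accounted for, r squared, and the residual spread of the criterion around its predicted value, the square root of 1 minus r squared, expressed in criterion standard deviations. The Python sketch below is offered only as an illustration; the function name is hypothetical, and the correlation values are round figures of the size discussed in this article rather than entries from its tables.

```python
import math


def standard_error_of_estimate(r: float) -> float:
    """Residual SD of a standardized criterion predicted from a correlate."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    return math.sqrt(1.0 - r * r)


for r in (0.30, 0.40, 0.60, 0.90):
    see = standard_error_of_estimate(r)
    print(f"r = {r:.2f}: variance explained = {r * r:.2f}, "
          f"residual SD = {see:.2f} criterion SDs")
```

Even at r = .60 the residual spread is four fifths of the criterion's standard deviation, which is consistent with the point that such correlations support better-than-chance, actuarial prediction rather than accurate prediction of single instances.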

When self-report items that were highly 
specific in wording were compared with cor- 
responding objective criteria, correlations in 
the vicinity of .60 were obtained. When the 
self-report items were worded in a more diffuse
manner, the correlations fell to about .50.
When standard personality inventories were 
used that defined much broader styles of 
functioning than the behavioral criteria to 
which they were compared, the correlations 
were, on the average, about .40, but several 
were above .50. Although none of these allows
great accuracy in prediction, all allow much 
better than chance prediction from information
that is readily obtained. Moreover, there
is reason to believe that these correlations 
could be improved by including equally broad 
categories of items in both the criterion 
sample and the self-report measure (see studies 
by the Blocks [in Block, 1977] and by 
Hartshorne & May, 1928, 1929, described in 
this paper; see also Fishbein & Ajzen, 1974). 
It remains to be determined how high such 
correlations can be brought when a broad 
sample of matching behavior for the two sets 
of variables is obtained, and this is combined 
with averaging the criterion over a sufficient 
number of occasions to bring it to as high 
a level of temporal reliability as is customarily 


* Any correlation that is significantly greater than 
zero can, of course, be useful for theoretical purposes. 
In fact, it would be inefficient to have to develop 
methods and measures that permitted high degrees 
of prediction to test theories, particularly in early 
stages of theory development in which it is often 
desirable to explore many variables, both dependent 
and independent, rather than pursue a few in depth. 




obtained for personality inventories. If both 
measures had sufficiently high stability co- 
efficients, then at least the unreliability of the 
criterion would not be a limiting factor for 
validity, as it customarily has been. 

Finally, it should be noted that not everyone 
is equally predictable. This was demonstrated 
in the present article by the finding that 
within-subject correlations varied over a range 
that suggested almost no stability in a few 
individuals and extremely high stability in 
others, with most individuals demonstrating 
a moderately high degree of stability. Bem 
and Allen (1974) observed that a sample of 
subjects who described themselves as highly 
consistent produced moderately high stability 
coefficients, whereas a sample of subjects who 
reported they were not consistent produced 
low coefficients. The authors concluded that 
stability in personality can be demonstrated 
only for specially selected samples of sub- 
jects, which is aptly reflected in their title, 
“On Predicting Some of the People Some of 
the Time.” According to our findings, it is 
not necessary to select particular classes of 
people to demonstrate stability unless one 
has failed to obtain a sufficient sample of 
behavior to begin with, in which case one 
needs individuals with unusually high sta- 
bility to compensate for a high degree of 
error of measurement. As demonstrated in 
the present article, given an adequate sample 
of behavior to begin with, it should be pos- 
sible “to predict most of the people much of 
the time.” 
Psychological Differentiation: Current Status
The status of the differentiation hypothesis, derived from differentiation theory,
is examined in light of the evidence that has accumulated since the hypothesis 
was proposed a decade and a half ago. Assuming that greater or less differen- 
tiation is an organismic attribute, the hypothesis postulates that behaviors re- 
flecting extent of differentiation are likely to be interrelated, resulting in self- 
consistency in individual functioning across domains. The newer evidence has, 
on the whole, confirmed the linkages among the behaviors examined in the 
earlier research that was done to test the differentiation hypothesis. The differ- 
entiation hypothesis has also proven useful in generating predictions about 
linkages to behaviors in new domains (cognitive restructuring, interpersonal 
competencies, and cerebral lateralization); these predictions have been tested 
and generally confirmed. The differentiation construct appears able to account 
for phenomena that cannot be accommodated by other lower order constructs, 
such as field dependence-independence. On all these grounds the differentiation 
construct continues to serve a useful function. Whereas the differentiation hy- 
pothesis has dealt only with the interrelatedness among components of a cluster 
of behaviors subsumed under differentiation, the newer evidence carries sug- 
gestions for a hierarchical ordering of these components and for the nature of 


causal connections among them. 


The differentiation construct was introduced
into the stream of research on field dependence-independence
in 1962 to accommodate new
findings and to guide further research on the
broad patterns of psychological functioning
associated with individual differences in
manner of establishing the upright in space
(Witkin, Dyk, Faterson, Goodenough, &
Karp, 1962/1974).

Briefly stated, differentiation is a major 
formal property of an organismic system. A 
less differentiated system is in a relatively 
homogeneous state; a more differentiated sys- 
tem is in a relatively heterogeneous state. A 
system that is more differentiated shows greater
self-nonself segregation, signifying definite
boundaries between an inner core of attributes, 
feelings, and needs identified as the self, and
the outer world, particularly other people. In a
less differentiated system, in contrast, there is 
greater connectedness between self and others. 
A system that is more differentiated is also 
characterized by greater segregation of psy- 
chological functions; that is, functions are 
more separate from each other and activities 
within each are more specialized.¹

To make the differentiation construct a use- 
ful guide for research, it was necessary to 
operationalize it by specifying the particular 
ways in which greater or less differentiation 


1 Although more differentiated systems are likely to 
be more complexly organized (i.e., relationships among 
system components and between the system and its 
environment are more elaborate), there is no inherent 
relation between differentiation and effectiveness of 
integration (i.e., the harmonious working together of 
system components with each other, and of the system 
as a whole with its environment). Supporting the con- 
cept that level of differentiation is unrelated to effec- 
tiveness of integration are the results of many studies 
that have shown that more differentiated and less 
differentiated people are not different in sheer presence 
or absence of psychopathology, in other words, in being 
well adjusted or poorly adjusted (Witkin, 1965).

Figure 1. 1962 model of differentiation. (Differentiation is the higher order construct, with four lower order constructs: articulated cognitive functioning; articulated body concept; sense of separate identity; and structured controls and specialized defenses.)


may show itself in different psychological 
domains, and by formulating testable hy- 
potheses. Assuming that the development of 
differentiation is an organismwide process, the 
main hypothesis we proposed was that greater 
or less differentiation is likely to be charac-
teristic of an individual’s activities in diverse
domains. According to this differentiation hy- 
pothesis, functional manifestations or indica- 
tors of greater or less differentiation should be 
related to some extent, resulting in self-con- 
sistency in functioning across domains. We 
held this expectation, though recognizing that 
special training and experience during growth 
may contribute to unequal progress among 
areas, and that dedifferentiation during aging 
or with psychological impairment may affect 
functioning irregularly from one area to 
another.
In the theoretical conception that guided our 
research, shown schematically in Figure 1, 
differentiation was considered a high-order in- 
dividual-differences construct, with four lower 
order constructs radiating from it. The first of 
these was articulated cognitive functioning. 
To experience parts of organized fields as 
discrete and to organize unstructured fields, 
which articulation implies, is to show differen-
tiation in the cognitive domain. The second was
an articulated body concept. This refers to an 
impression of the body as having definite 
limits or boundaries, and the parts within as 
discrete yet interrelated and joined into a 
definite structure. The third was sense of 
separate identity, that is, identification of at-
tributes, needs, and values that are recognized
as one’s own and as distinct from those of 
others. The availability of an internal core of
characteristics and standards was expected to 
enable the person to function with relatively 
little need for guidance and support from
others, and to maintain an internally
perspective in the face of contradict 
spectives of others. The fourth was a 
of structured controls over impulse e: 
and the use of specialized defenses for 
with potentially disturbing experiences, 
signify more differentiated functioning, i 
trast to the diffuse expression of imp 
the use of nonspecific defenses. By 196; 
was a good deal of evidence that measu 
these lower order constructs were interri 
as hypothesized. 
Since then, the differentiation hypothe 
been the stimulus for a large body of fi 
research,? A segment of that research, 
cerned with the field-dependence-indepem 
cognitive-style component of differenti 
has recently been reviewed elsewhere 
& Goodenough, in press; Witkin & 
enough, Note 5). Our main task here 
broader one of examining the different 
hypothesis in light of the newer evidenci 
consider that evidence under the headi 
the four lower order constructs of the 
model of differentiation. 


Articulated Versus Global Field Appro: 


The articulated verus global ¢ 
emerged from a line of research on indi 
differences in perceptual and intellectu 
tioning that began with studies of perc 
of the upright in the rod-and-frame test ( 
body-adjustment test (BAT), and roi 
room test (RRT). In each of these tests 
jects differed in the extent to which thi 
the external visual field or the body 
the main referent for locating the upri 
they were self-consistent across tests 
tendency to rely primarily on the one 
or the other. These contrasting tendeni 
designated field dependent and field indep 

Later, manner of locating the uprli 
shown to be related to relative ease of 
ing part of an organized field from 
as a whole, as in the embedded-figi 
(EFT), in which the task is to find 


2 For recent comprehensive bibliographies 
and Witkin (Note 1); Witkin, Cox, and 
(Note 2); Witkin, Cox, Friedman, Hrishike 
Siegel (Note 3); Witkin et al., (Note 4)- 


figure within a complex design (Witkin et al., 
1954/1972). Since the requirement of separat- 
ing the item from its embedding context in this 
task appeared similar to the task of keeping a 
rod or the body apart from the surrounding 
field in the RFT, BAT, and RRT, we redefined 
the field-dependence-independence concept as 
a disembedding ability in perception. Dis- 
embedding ability in perception was subse-
quently related to disembedding ability in
intellectual activities, and disembedding ability 
in both domains was related to structuring 
competence in both domains. To analyze and 
structure fields is to show articulated cognitive 
functioning as a characteristic approach to the 
field; to follow the field as given is to use a 
global approach. This greatly enlarged indi- 
vidual-differences dimension was conceived as 
an articulated versus global field approach di- 
mension, and was designated a cognitive style. 

Since 1962, a great deal of research has been 
carried out on the articulated-global field ap- 
proach conception. (See Witkin & Goodenough, 
in press, and Witkin & Goodenough, Note 5, 
for recent reviews of that research.) This re- 
search has confirmed the picture of self-con- 
sistency in cognitive functioning established 
earlier, and extended it by linking a wide array
of spatial-visual cognitive dimensions to field
dependence-independence in perception of the
upright. A common feature of situations in
which these dimensions are represented is that
the person has to restructure percepts or
symbolic representations to meet the require-
ments of the task. Field-independent people
are better able to accomplish such restructur-
ing, in contrast to field-dependent people who
are more likely to follow the prevailing or-
ganization of a stimulus array as given. Ex-
amples of restructuring acts on which people
who are field-independent in perception of the
upright do better are locating a simple figure
in a complex organized gestalt (as in the EFT;
e.g., Goodenough & Karp, 1961), achieving
alternative perspectives in spatial-visualization
tasks (e.g., Gardner, Jackson, & Messick,
1960; Gough & Olton, 1972), showing con-
servation in Piagetian situations (e.g., Pascual-
Leone, 1969), and breaking a set in Einstellung
problems (e.g., Busse, 1968; Dinius, 1975).

Most factor-analytic studies have suggested
that the restructuring competence related to
field independence in perception of the upright
may be limited to performance on spatial- 
visualization types of tasks, but the evidence 
on the issue is not conclusive. 

Recent research on the articulated-global 
field approach concept also suggests that 
manner of mixing visual and vestibular in- 
formation, rather than cognitive restructuring 
(Witkin & Goodenough, in press; Witkin & 
Goodenough, Note 5), may underlie individual 
differences in performance on tests of percep- 
tion of the upright. In a later section of this 
article, we consider alternatives to the dis- 
embedding hypothesis that may account for 
the relation that exists between manner of 
establishing the upright and performance in 
cognitive restructuring tasks such as the EFT.


Articulated Body Concept 


People’s conception of their bodies is the 
product of bringing into a coherent entity a 
wide array of experiences while growing up. 
These experiences have their primary sources 
in children’s apprehension of their bodies 
through vision; in the tactile and kinesthetic 
feedback arising from their manual exploration 
of their bodies and from their motoric and 
bodily functions; and in the handling of their 
bodies by other people. The bodies of others 
provide still another source of experience. The
typical progression in development of the body 
concept is from a relatively global view of the 
body, to awareness of its constituent parts and 
their interrelatedness, as well as awareness of 
the outer limits of the body. The outcome of 
this progression is an articulated body concept 
—that is, one in which components of the body 
are experienced as discrete and joined into a 
bounded whole. According to the differentia- 
tion hypothesis, individual differences in posi- 
tion on the articulated-global body-concept 
dimension should be related to differences in 
position on the field-dependence-independence 
cognitive-style dimension.³ This expectation


3 The differentiation hypothesis supposes that mea- 
sures of the postulated indicators of extent of differ- 
entiation would tend to be related to each other. In the 
studies described in our earlier report (Witkin et al., 
1962/1974), intercorrelations among measures of these 
indicators were examined and were generally found to be 
significant. Reflecting the early and continuing role of
was confirmed in our early studies, in which we
used an articulation of body concept scale, 
applied to human figure drawings, to assess 
that aspect of the body concept (Faterson & 
Witkin, 1970; Witkin et al., 1962/1974). This 
relationship has been further confirmed with 
a high degree of consistency in many subse-
quent studies (e.g., Adevai, Silverman, &
McGough, 1968; Pizzamiglio & Carli, 1973;
Schuck, Cross, & Mills, 1970; Winestine,
1969).

Some studies have examined articulation of 
body concept by means other than figure 
drawings. One set of studies used the Finger 
Apposition Test (Jacobson, 1944), which re- 
quires subjects to reproduce with their own 
hands the finger positions of pairs of hands 
shown in a series of photographs. The ability to 
manipulate the hands vicariously in imagina- 
tion, to achieve a particular relationship be- 
tween them, was conceived to require an 
articulated conception of one’s own body. A 
similar kind of reasoning lay behind the use of 
the Laterality Orientation Test (Culver, 1969) 
and the Hands Test (Thurstone, 1938) in 
another set of studies. In these tests the sub- 
ject is required to identify the sidedness of 
drawings of right or left body parts. Most 
studies found the expected relationship be-
tween scores on these tests and measures of
field dependence-independence (Adams, 1974; 
Adevai et al., 1968; Bachelis, 1965; Culver, 
Cohen, Silverman, & Shmavonian, 1964; 
Epstein, 1961; Gaughran, 1964). 

Still other studies considered only the body- 
boundary component of an articulated body 
concept. Some of these used the “barrier index”
(Fisher & Cleveland, 1958), an indirect in- 
dicator of body-concept boundaries based on 
frequency of bounded percepts in a subject’s 
Rorschach productions. Although a few studies 


perception of the upright and perceptual disembedding
in our research, as well as the availability of effective
tests of these functions, it has predominantly been re-
lations between performance on the RFT and/or EFT
and measures of each of the other indicators that have
been examined in the subsequent research on differen-
tiation, reviewed in this and the following sections of
this article. Although there is some additional evidence
on relations among measures of the other indicators in
the more recent literature, that evidence is limited and
is not considered here.

found a relation between boundary score
field independence (Jacobson, 1965; 
by Fisher of data from Young, 1959, ¢ 
Witkin et al., 1962/1974), most have not 
Bachelis, 1965; Carlson, Tucker, Harro 
Quinlan, 1971; Siegel, 1977). Other stud 
came closer to the body itself in their ass 
ment of body-boundary characteristics, 
examining discreteness of experiences oi 
surface of the body through such indicat 
the two-point threshold, traced letter id 
cation, and tactile localization (Adams, 
Adevai et al., 1968; Bachelis, 1965; Coh 
Silverman, & Shmavonian, 1962; Paquett 
1974; Silverman, Cohen, Shmavonian, 4 
Greenberg, 1961). In most instances, tl 
anticipated relationship was found be 
field independence and the measures oi 
creteness of experience on the surface of th 
body. i 
If the various measures derived from ta 
involving the body itself do indeed allo 
ferences about articulation of body com 
we have, in these studies, a suggestion thatt 
relationship between field dependence-ind 
pendence and that body-concept dimension 
not method specific, in the sense of 
limited to the figure-drawing, articulati 
body-concept assessment technique. 
tional studies using more experimental | 
proaches to examination of the body coni ep 
are clearly needed, however. 




Sense of Separate Identity 


The extensive evidence derived from $ 
carried out since 1962 under the aegis 0l 
Sense of separate identity concept, recem 
reviewed by Witkin and Goodenough (1! 
has essentially confirmed and extended 
findings. 3 

In one approach, sense of separate ide 
was assessed through its expression in 
object differentiation. Winestine (1969) 
strength of the twinning reaction, as an m a 
tion of standing on that dimension, for @ 
member of the pairs of male twins she studis 
The twinning reaction was conceived 
stronger the greater the degree of inter 
cation of the boy with his twin, as ass 
from an interview with the boy. Boys W 
stronger twinning reaction were signi® 


more field dependent. Olesker (1978), following 
the separation—individualization model of 
Mahler (Mahler, 1966; Mahler, Pine, &
Bergman, 1975), assessed degree of self-object
differentiation through extended observations 
of young children in a nursery school. Again, 
greater self-object differentiation was signifi- 
cantly related to greater field independence. 
araga (1978), also working within the Mahler 
framework, found that less attachment at age 
{was associated with greater field independ- 
ence at age 3, although the expected relations 
between field dependence-independence and
particular maternal characteristics and child
behaviors were not found. Finally, Paul (1975)
observed that children who showed difficulty 
in separating from parents after 1 week of 
nursery school experience were more field 
dependent than children who did not. 

Another body of evidence has demonstrated 
that people who show greater autonomy of the
external visual field in perception of the upright
also tend to function more autonomously of
others in social-interpersonal situations, par-
ticularly under conditions of ambiguity (e.g.,
Shulman, 1976; Solar, Davenport, & Bruehl,
1969; Weinberg, 1970). This finding broadens
the picture of self-consistency in functioning
across domains in ways consistent with the
differentiation hypothesis.

The evidence linking autonomy in percep-
tion of the upright to autonomy of external
referents in social behavior has affected our
usage of the cognitive-style and field-depend-
ence-independence concepts (Witkin & Good-
enough, in press; Witkin & Goodenough, Note
5). We now apply the designation cognitive style
to the contrasting tendencies to rely primarily
on external referents or on the self in psycho-
logical functioning.⁴ We have also transferred
the designation field dependence-independence
to this process dimension from its early use as a
label for disembedding ability in perception.

Another important product of the more
recent research on sense of separate identity is
its demonstration that field-dependent people
tend to have an interpersonal orientation and
field-independent people an impersonal orienta-
tion (Witkin & Goodenough, 1977). Compared
to field-independent people, field-dependent
ones favor social over solitary situations (e.g.,
Coates, Lord, & Jakabovics, 1975; Crandall & 


PSYCHOLOGICAL DIFFERENTIATION 




Sinkeldam, 1964; Nadeau, 1969); they prefer 
to be physically close to others in an interac- 
tion situation (Greene, 1976; Justice, 1970; 
Wineman, 1974); they are selectively attentive
to social sources of information (e.g., Fitz, 
1971; Fitzgibbons, Goldberger, & Eagle, 1965; 
Konstadt & Forman, 1965); they are open in 
expressing their feelings and thoughts (DeMers, 
1971; Greene, 1976; Sousa-Poza & Rohrberg, 
1976), an approach likely to elicit similar be- 
havior from others toward them. These at- 
tributes and behaviors seem likely to provide 
the person with information about what others 
may be feeling and thinking, as well as experi- 
ence in dealing with people. They may thereby 
facilitate getting along with others. Probably 
contributing to that same end are such char- 
acteristics ascribed to field-dependent people 
as considerate and attentive to others (e.g.,
Elliott, 1961). Indeed, evidence is beginning to 
emerge that field-dependent people may be 
more effective in social interactions than more 
field-independent people (e.g., Oltman, Good- 
enough, Witkin, Freedman, & Friedman, 
1975). The social attributes and behaviors 
useful in interpersonal relations, found among 
field-dependent people, and their apparently 
greater facility in getting along with others 
add up to a picture of interpersonal 
competencies. 

Research on the social skills associated with 
field dependence-independence is limited, re- 
flecting the relatively recent emergence of 
interest in the social domain in the history of 
work on psychological differentiation. More 
precise specification of the interpersonal com- 
petencies characteristic of relatively field- 
dependent people and of the social skills com- 


4 Other dimensions that on the surface may appear 
similar to field dependence-independence in fact are 
not. An example is the locus-of-control construct. This 
dimension is conceptually different from the field- 
dependence-independence dimension, and many studies 
have shown, almost invariably, that measures of the 
two dimensions are not related (e.g., Roodin,
Broughton, & Vaught, 1974; Shapson, 1973; Tobacyk, 
Broughton, & Vaught, 1975). Field dependence-inde- 
pendence is conceived as a process variable, representing 
degree of autonomy of external referents in processing 
information from field and self; locus of control is a 
belief or attitudinal variable, representing greater or 
less fatalism as an outlook on life. 




mon among more field-independent people,⁵
as well as a clearer conceptualization of these 
competencies and skills within the differentia- 
tion framework, are tasks that lie ahead. 

The involvement of cognitive restructuring 
skills and interpersonal competencies in field 
dependence-independence makes that cogni- 
tive-style dimension bipolar, in the sense of 
having no clear high and low ends: People at 
the field-independent pole have more developed 
cognitive restructuring skills, and people at 
the field-dependent pole are more likely to give 
evidence of interpersonal competencies. To the 
extent that at both poles we find characteris- 
tics adaptive in particular situations, the field- 
dependence-independence cognitive-style di- 
mension is not value biased. 


Segregation of Psychological Functions 


The main trademark of segregation of psy- 
chological functions, a major manifestation of 
differentiation, is specificity of activities and 
experiences. Signifying the development of 
specificity are the formation of structured con- 
trols for routing impulses and expending energy 
and the formation of specialized defenses for 
dealing with potentially disturbing experi- 
ences. The differentiation hypothesis leads us 
to expect that these indicators of greater 
differentiation would be related to field 
independence. 

With regard to controls, early in life, im- 
pulses are likely to find expression in diffuse 
systemwide reactions. As the child grows older, 
structured systems of control are developed
that make possible the specific channeling of 
impulse. This accomplishment diminishes the 
likelihood of a “spilling over” of the content 
of one domain into another, thereby avoiding 
the too easy mixing of the ideational, affective, 
motivational, and motoric.

With regard to defenses, some may be con-
sidered relatively unspecialized, in the sense
that they involve dealing with experience in a
global fashion. Examples are denial and repres-
sion, which are characterized by a turning away
from perception of the content of an immediate
experience, or the memory of a past experience,
often in its entirety. In the case of denial the
existence of a disturbing external reality as a
whole may not be acknowledged. Repression,
when used as a generalized defensive str
has been conceived in psychoanalytic 
to be the blocking from awareness of not o 
objectionable instinctual demands but the 
pulses, thoughts, and feelings associated 
them as well. In contrast to these nons 
ways of functioning, defenses such as iso 
intellectualization, and projection are m 
specialized. These defenses are chara 
tically directed at particular componen 
experience, which may be obliterated or 
more pallid, or they may act to separate 
ponents of experience that are actually 
nected, without the components neci 
being lost. In the case of isolation, the 
nections between ideas and feelings (or v 
or impulses) that are actually related are] 
from awareness, although the affect and i 
tion are consciously available to the pel 
Intellectualization is a defense in which 
is neglected or excluded, whereas the 
tional components of the experience remi 
force. Even though its affective coni 
washed out, the experience continues 
available to the person in its intell 
aspects as something that can be name 
defined. Finally, in projection, specific ch 
acteristics and purposes are attributed tot 
person who is its object; it is thus not 
criminate in its action. 
To the extent that structured co 
signify specialization of functions and 
tribute to specificity of behavior, we 
expect from the differentiation hypothesi 
field-independent people will provide 
evidence of the availability and use of 


5 Very little is yet known about the social skil 
found among field-independent people, beyond # 
that such people tend to be limited in interp 
competencies. Some of these people, who ha’ 
described as aloof, distant, and solitary, are 
stay away from interpersonal involvement as m 
possible. There are indications that when fi 
pendent people do invest in the social sphere, © 
social skills may in fact be expressions of their 
turing abilities, taking the form of ordering and 0 
ing social situations, perspectivism, and accu 
the more cognitive aspects of person perception 
from use of an analytical approach. These skills 
appear to contrast with those more common § 
field-dependent people that particularly involve! 
in getting along with others. The evidence om 
points is hardly conclusive, but it suggests ¢ 
for the research that is needed. 


tured controls, compared to field-dependent 
people. The characteristics of specialization 
and specificity of defenses, such as intellectuali- 
zation, isolation, and projection, would simi- 
larly lead us to expect these defenses to be 
more commonly used by field-independent 
people, and the use of more global defenses, 
such as repression and denial, to be more 
common among field-dependent people. 


Structured Controls 


In one set of studies, degree of structure of 
controls was inferred from the content of a 
battery of projective tests or from particular 
projective test responses reflecting directness 
of affective discharge, with controls judged 
to be more structured when the expression of 
affect was modulated or mediated (Freedman 
& Marks, 1965; Gardner, Holzman, Klein, 
Linton, & Spence, 1959; Guskin, 1955; Witkin 
et al., 1962/1974; Witkin et al., 1954/1972). 
Most of these clinical studies found the ex- 
pected tendency for field-independent subjects 
to show more structured controls, although the 
evidence is not entirely consistent. In line with 
these findings is the observation (Witkin, 
Lewis, & Weil, 1968) that field-dependent 
patients showed more diffuse anxiety in their 
verbal productions during psychotherapy when 
these productions were assessed according to 
the Gottschalk-Gleser method (Gottschalk & 
Gleser, 1969), although these two kinds of 
patients were not different in overall amount 
of anxiety expressed. 

Other studies have assessed structure of con-
trols from observations of behavior. Crutch-
field and Starkweather (Note 6) found that
judges described field-dependent adults they
had observed over a period of time as people
who “undercontrol impulses” and “act with
insufficient thinking and deliberateness.” Per-
formance on the mazes task has been observed
for the evidence it provides of impulsivity
versus control in carrying out a specific ac-
tivity. In most studies, field-dependent sub-
jects, both children and adults, were more
likely to provide evidence of impulsive be-
havior than were field-independent subjects
(Best, 1975; Gorman, 1968; Podell &
Phillips, 1959; Swan, 1974). A related set of
studies was concerned with motoric inhibition.
The method used required the subject to carry
out a well-practiced act (such as drawing a
line, walking a line, or writing one’s own
name) very slowly. Many studies with children
have shown capacity for motor inhibition to 
be greater among field-independent than field- 
dependent children (e.g., Eddy, 1974;
Maccoby, Dowley, Hagen, & Degerman, 1965; 
Mayer, 1967; Mumbauer & Miller, 1970). 
Other studies with children did not find a 
relation between field dependence-independ- 
ence and assessments of impulse control from 
general behavior manifestations (Reppucci, 
1970; Seitz, 1972) or from behavior manifesta- 
tions together with psychological test data 
(Gardner & Moriarty, 1968). On the other 
hand, with a high degree of consistency, hy- 
peractive children, in whom impulsivity is a 
prominent characteristic, have proved to be 
field dependent (e.g., Campbell, Douglas, & 
Morgenstern, 1971; Cohen, Weiss, & Minde, 
1972; Douglas, 1972; Halverson & Waldrop, 
1976).

Measures of impulsivity from self-report in-
ventories have generally proved not to be
related to field dependence-independence. This
outcome may be a function of particular char-
acteristics of these inventories.⁶


6 Self-report inventory measures of defenses gen-
erally do not relate to field dependence-independence
either. The possibility that the contrast between the 
results from self-report inventories and the results ob- 
tained with projective test responses and behavioral 
observations may be a function of the methodologies 
involved is supported by several observations made in 
other areas of research on psychological differentiation. 
As one illustration, mothers’ reports of their actual
behavior with their children tend to distinguish signifi- 
cantly between the child-rearing experiences of rela- 
tively field-dependent and field-independent children, 
but mothers’ expressed attitudes toward child rearing 
generally do not (Witkin & Goodenough, in press; 
Goodenough & Witkin, Note 7). As another example, 
studies of the position subjects assume in relation to 
those with whom they are interacting have shown that 
field-dependent people prefer to be closer to others than 
do field-independent people; on the other hand, sub- 
jects’ responses to hypothetical situations involving the 
use of interpersonal space have not shown such a differ- 
ence (Witkin & Goodenough, 1977). As still another 
example, whereas field-dependent people are more re- 
sponsive to group standards when the standard is 
derived from actual interaction with a group, no differ- 
ence has typically been found between field-dependent 
and field-independent people when the standard was 
given in the form of a statement attributed to either an 
authoritative source or a bogus group average (Witkin

By most signs, field-independent people thus
give evidence of greater regulation of affective 
discharge and of motor activities, suggesting 
the presence of more structured controls. 


Specialized Defenses 


The nature of the defenses predominantly 
used by field-dependent and field-independent 
people has been examined in many studies 
employing a variety of approaches. 

In one approach, clinical assessments of 
defenses were made. The clinical techniques 
used have included the Rorschach (e.g., 
Bertini, 1962; Levine & Spivak, 1964; Morrison 
& Centers, 1969; Schimek, 1968); an interview
(Witkin et al., 1954/1972) or therapy interac-
tions (Safer, 1975); a battery of psychological
tests and behavioral observations (Gardner & 
Moriarty, 1968); Blacky Pictures and the 
Sentence Completion Test (Lutzky & 
Schmeidler, 1963; Silverstone, 1966); and the 
Defense Mechanism Inventory, developed by 
Gleser and Ihilevich (1969) (e.g., Bogo, 
Winget, & Gleser, 1970; Donovan, Hague, & 
O'Leary, 1975; Erickson, Smyth, Donovan, & 
O'Leary, 1976; Williamson, 1977). In another 
kind of study, repressors and isolators were 
identified through performance on a meta- 
contrast situation (Almgren, 1971; Nilsson & 
Almgren, 1970). The weight of the evidence 
from these studies together is that field-inde- 
pendent people are prone to use isolation, intel- 
lectualization, and projection as characteristic 
defenses, whereas field-dependent people are 
more likely to use repression and denial. 

Evidence relevant to the issue of defenses 
has also come from studies of the effects of 
stress on recall and perception. These studies
have shown that field-dependent people, com- 
pared to field-independent ones, are more likely 


& Goodenough, 1977). It appears that the most con- 
sistent relations are found between field dependence- 
independence and a variety of personality and social 
variables when assessment of these variables is made

to forget previously learned stressful m
than neutral material (e.g, Duvall, 
Rosen, 1963; Schimek & Wachtel, 
Uhlmann & Saltz, 1965). Field-dep 
people have also been shown to be 
affected by stressful material in per 
(e.g., Almgren, 1971; Duvall, 1970; Lefe 
Gronnerud, & McDonald, 1973; Mina) 
Mooney, 1969). g 

Still another kind of evidence comes from the
literature on dream forgetting, a phenomenon
that has been attributed to repression. The
evidence, recently reviewed by Goodenough
(1976), suggests that people who characteristically
fail to recall their dreams, particularly
stressful ones, are likely to be field dependent.

The expectation from the differentiation
hypothesis that field-independent people are
more prone to use specialized defenses, such as
isolation, intellectualization, and projection,
and field-dependent people to use less specialized
defenses, such as repression and denial,
has received substantial confirmation from the
available evidence.⁸


Further Evidence on “Keeping Things Separate”


Greater discreteness of functions and experiences,
signifying segregation and specialization
of activities, has been observed among
field-independent people in psychological domains
other than controls and defenses.


7 The use of more specialized defenses has the [...]
consequence of separating the self from emotion,
reducing the potency of affect, giving ascendance to
more cognitive aspects of experience. There is an [...]
parallel between this emphasis on the cognitive [...]
psychic life among field-independent people and their
impersonal orientation, with its emphasis on ide[...]ple


8 The use of more specialized or less specialized
defenses carries no implication as to effectiveness of
adjustment. Thus, the use of intellectualization may be a
valuable aid in the development of the individual's
intellectual potential, by giving intellectual processes
freedom from intrusion of feelings, impulses, [...]
that may divert these processes and impair their
effectiveness. Extreme reliance on intellectualization,
however, may hamper the development of the individual's
emotional life. Similarly, repression may help the
individual to function unhampered by threatening [...]
and memories that if allowed entry into [...]
could be disruptive; extreme reliance on repression [...]
constrict development of salient parts of the pe[...]

One illustration comes from a study by
Brilhart (1970) that examined the influence of
affective features of a communication on subjects’
evaluations of its cognitive content. Two
messages were prepared on a given theme; one
was “good” in its cognitive structure and
content, the other “poor.” In addition, each
kind of message was delivered by a good or
poor speaker. Among field-dependent subjects,
but not among field-independent ones, estimates
of the logic of the message were influenced
by the kind of speaker who delivered it.
Thus, field-independent subjects, more than
field-dependent ones, kept cognitive features
of the message separate from the personal
context in which it was presented.

A quite different kind of evidence on the
maintenance of discreteness may be found in
the domain of stimulus generalization. A
recent review (Goodenough, 1976) indicates
that generalization along a stimulus continuum
is less among field-independent than among
field-dependent people. Thus, field-independent
people more quickly attach the critical response
to a particular stimulus in the array,
inhibiting expression of that response to other
stimuli.

The evidence reviewed here is, on the whole,
consistent with expectations from the differentiation
hypothesis, that greater field independence
is likely to be associated with
greater specialization of psychological
functions.


Neurophysiological Differentiation 


On the premise that differentiation is a broad
organismic construct, we may expect greater
or less differentiation to manifest itself in
the neurophysiological domain as well as in the
psychological domain. It is only recently that
the rapidly expanding knowledge about asymmetrical
specialization of the hemispheres of
the brain has made it possible to formulate
explicit hypotheses, within differentiation
theory, regarding linkages between psychological
and neurophysiological differentiation
(Oltman, Note 8).

A great deal of evidence now indicates that
among normal right-handed people, the left
and right hemispheres have characteristically
different modes of processing information (e.g.,
Dimond & Beaumont, 1974; Milner, 1975).
The left hemisphere, served most directly by 
the right visual field and right ear, seems par- 
ticularly suited to the processing of verbal- 
conceptual and executive motor functions. The 
mode of functioning of the right hemisphere, 
served most directly by the left visual field 
and left ear, has been characterized in ways 
that imply holistic processing of configurations 
or gestalten (e.g., Bogen, 1975). These ob- 
servations lend themselves to interpretation 
within the differentiation framework. Speciali- 
zation of the functions of the two hemispheres 
may be taken as an indicator of neurophysio- 
logical differentiation, much as specialization 
has served as an indicator of psychological 
differentiation. The differentiation hypothesis 
leads us to predict that greater specialization 
in the psychological domain will be linked 
to greater specialization in the neurophysio- 
logical domain. In particular, we may expect 
that field-independent people, compared to 
field-dependent ones, will show evidence of 
greater lateral specialization of the hemispheres;
this will show itself in greater specialization
of the left hemisphere for verbal and
motor control processing, and greater speciali- 
zation of the right hemisphere for configura- 
tional-gestalt processing. The emphasis, it 
should be noted, is on degree of lateralization 
of different types of processing in the respective 
sides, not on the dominance of one hemisphere 
over the other as a generalized hemisphericity 
tendency. 

Evidence of the linkages between psycho- 
logical and neurophysiological differentiation, 
as predicted by differentiation theory, is now 
available from a number of studies using a 
variety of approaches. 

It is a well-established observation that in 
right-handed individuals the perception of 
faces is processed primarily by the right
hemisphere (e.g., Hilliard, 1973; Rizzolatti, Umilta,
& Berlucchi, 1971). Using a composite
face-perception technique (Gilbert & Bakan,
1973), Oltman, Ehrlichman, and Cox (1977)
found a significant tendency for relatively 
field-independent subjects to show a stronger 
left visual field bias than did field-dependent 
subjects. Thus, when confronted with a task 
that is more suited to the processing mode of
the right hemisphere, the field-independent
observers gave particular weight to cues from
the left visual field, which project most directly
to the right hemisphere. Field-dependent subjects
showed little or no lateralization on this
task.

Complementing the Oltman et al. (1977) 
study of right-hemisphere functions are studies 
of left-hemisphere functions. One such study, 
by Sousa-Poza and Rohrberg (Note 9), took 
as its point of departure the observation by 
Kimura (1973) that right-handed speakers 
typically make most of their speech-related 
gestures with their right hands. Sousa-Poza 
and Rohrberg found that this relationship was 
moderated by field dependence-independence, 
however. Although relatively field-independent 
speakers gestured primarily with their right 
hands, conforming to the modal pattern de- 
scribed by Kimura, field-dependent speakers 
showed more bilateral speech-related gestures. 

The Kimura (1973) finding apparently re- 
flects left-hemisphere control, in general, of 
both speech and speech-related right-hand 
movements. On this basis, the relatively 
greater number of bilateral speech-related 
movements among field-dependent people may 
reflect a more bilateral representation of 
language. Results consistent with the idea of 
more definite left-hemisphere language representation
among field-independent people have
come from studies using the dichotic listening
technique. Those studies show that persons
with stronger right-ear superiority in the
processing of verbal material tend to be relatively
field independent (Dawson, 1977;
Pizzamiglio, 1974; Pizzamiglio & Cecchini,
1971; Waber, 1977b).

Assessments of extent of lateralization of
left-hemisphere and right-hemisphere processing,
as a function of field dependence-independence,
were recently combined in a single
study (Zoccolotti & Oltman, 1978). Rizzolatti
et al. (1971) had earlier found that overall
discrimination reaction times were faster to
letters when they were presented tachistoscopically
to the right visual field (which projects
directly to the left “verbal hemisphere”), and
were faster to faces shown in the left visual
field (which projects directly to the right
“configurational hemisphere”). Using the same
stimuli employed by Rizzolatti et al., Zoccolotti
and Oltman found this pattern of opposite
visual-field superiorities, as a function of the
type of processing required, among their
field-independent, but not among their
field-dependent, subjects. These two kinds of subjects
were not different in overall performance on
either task. A recent study by Rapaczynski
and Ehrlichman (Note 10), using a face recognition
task, found reversed, rather than smaller,
hemifield differences in their field-dependent
subjects, but the complexity of the task they
used and the less accurate overall performance
of their field-dependent subjects make
interpretation of these results difficult.⁹

In addition to leading to differences in
behavior of the kinds considered, differentiation
of function of the two hemispheres may also be
expected to show itself more directly in differences
in electrical activity of the hemispheres.
The more similar the activities in the two
hemispheres (or the less different their functions),
the more similar are the electroencephalographic
(EEG) recordings from the left and right
hemispheres likely to be. Oltman, Semple, and
Goldstein (in press) recently found that
fluctuations over time, in integrated EEG
amplitudes, showed significantly higher
correlations between left and right hemispheres
among field-dependent subjects.

Still another kind of evidence, of a more
indirect nature, comes from studies of subgroups
that differ in degree of lateralization as
well as in performance on tests of field
dependence-independence or on tests of
spatial-visualization ability, one of the cognitive
restructuring dimensions in which field
independence expresses itself. For example, several
studies have shown that women tend to be less
strongly lateralized than men (e.g., Kimura,
1969; McGlone & Davidson, 1973; Ray,
Morell, Frediani, & Tucker, 1976), and there
is considerable evidence that people who are
not fully right-handed are less lateralized than
those who are clearly right-handed (e.g.,
Gilbert & Bakan, 1973; Silverman, Adevai, &
McGough, 1966; Zangwill, 1960). As to the
cognitive style of these less lateralized groups,
women are relatively more field dependent than
men (e.g., Witkin et al., 1962/1974), and they
do less well on spatial tests (e.g., Maccoby &
Jacklin, 1974). Similarly, people who are less
than fully right-handed are relatively more
field dependent than clear right-handers (e.g.,
Dawson, 1977; Pizzamiglio, 1974; Silverman
et al., 1966), and they tend to be lower in spatial
ability (Levy, 1969; Miller, 1971), although the
evidence on this point is not entirely consistent
(e.g., Briggs, Nebes, & Kinsbourne,
1976). The indications that women and
non-right-handers are less lateralized, and relatively [...]


9 In the face-recognition data of Rapaczynski and
Ehrlichman (Note 10), the field-independent subjects
showed the usual left-visual-field superiority, but
field-dependent subjects showed a significant right-field
superiority. The task used required a verbal report of a
fixation digit in addition to face recognition, and
field-dependent subjects showed lower overall performance
on the face task. The combination of a verbal report and
face discrimination in the same series of trials
complicates inferences about extent of lateralization in the
processing of the face stimuli, particularly if
level of performance also differed between the groups.

In all, the evidence that greater
neurophysiological differentiation tends to be
associated with a more field-independent cognitive
style lends support to the differentiation
hypothesis.


We have hypothesized that the tendency to function
in more differentiated or less differentiated
fashion is likely to be evident, to some degree,
throughout an individual’s psychological and
neurophysiological activities. This expectation
of self-consistency has received considerable
support in the evidence that has been reviewed.
The manifestations of greater or less
differentiation found to be interrelated are quite
diverse, and together cover considerable
psychological and neurophysiological territory.
Beyond confirming the linkages among functions
and behaviors identified in the earlier
work, the newer research has shown that the
differentiation hypothesis meets the requirement
of any useful conception, that it be able
to generate testable predictions in new
domains. Now implicated in the differentiation
network, through that research, are such addi- 
tional domains as cerebral lateralization, 
cognitive restructuring, and interpersonal be- 
havior. The differentiation construct also seems 
able to account for findings and relationships 
that cannot be accounted for by other con- 
structs in our conceptual framework. Thus, 
the field-dependence-independence construct, 
now defined as extent of autonomous function- 
ing, is not able to accommodate such phe- 
nomena as nature of controls and defenses or 
extent of cerebral lateralization, whereas the 
broader differentiation construct can do so. 

The differentiation hypothesis is essentially 
a proposal to account for an association, among 
components of a network of functions and be- 
haviors, responsible for a particular kind of 
individual self-consistency. It does not con- 
cern itself with the issue of hierarchical order- 
ing of the components; nor does it concern 
itself with the issue of causal influences of the 
components on each other, either in the per- 
son’s current functioning or during develop- 
ment. The issues of hierarchical ordering and 
of causal interconnections have recently come 
to the fore, stimulated in part by the addition 
of new areas to the differentiation network. 

A hierarchical ordering primarily signifies 
an arrangement of constructs from more gen- 
eral to more specific; it may carry the implica- 
tion as well that higher order constructs 
influence lower order ones. In our 1962 model 
(Figure 1), differentiation was the most general 
construct, and the series of postulated indica- 
tors of extent of differentiation constituted an 
array of narrower, lower order constructs, all 
of equal status. From the more recent research, 
proposals have emerged for amalgamating some 
of the lower order constructs, as well as pro- 
posals for a larger number of delineated con- 
structs varying in degree of specificity. A 
plausible structure incorporating these pro- 
posals is shown in its main outlines in Figure 2 
and is discussed later along with some elabora- 
tions and possible alternatives. As in our earlier 
model, differentiation continues to be located 
at the apex. At the level immediately below 
the apex are the three major indicators of 
differentiation: self-nonself segregation,
segregation of psychological functions, and
segregation of neurophysiological functions.


Differentiation
    Self-nonself segregation (field independence)
        Restructuring skills
        Limited interpersonal competencies
    Segregation of psychological functions
        Structured controls
        Specialized defenses
    Segregation of neurophysiological functions
        Hemispheric lateralization

Figure 2. Proposed differentiation model.

Self-nonself segregation, in Figure 2, is 
where we locate the field-dependence-inde- 
pendence cognitive-style construct, a bipolar 
process variable conceived to reflect extent of 
autonomy of external referents. 

Limited self-nonself segregation, responsible 
for less autonomous functioning or a field- 
dependent cognitive style, signifies continued 
connectedness with others. Such connected- 
ness, it seems reasonable to postulate, fosters 
the development of an interpersonal orienta- 
tion, “fellow feeling,” a fund of knowledge 
about others, and facility in getting along with 
others—in other words, interpersonal com- 
petencies. Interpersonal competencies are, in 
this view, a derivative of greater connectedness 
with others, and hence a construct below the 
level of self-nonself segregation or field
dependence-independence.

It also seems possible that cognitive-restructuring
abilities may be a product of field
independence. Whether a person tends to rely
primarily on external referents or to be
self-reliant might be expected to influence the
development of cognitive-restructuring skills. A
person who functions less autonomously may
adhere to a percept or symbolic representation
as given when dealing with restructuring tasks.
A more autonomously functioning person may
go beyond the information given when that is
required by situational demands or inner needs.
If this view of the relationship between
autonomy and restructuring is correct, then
restructuring competence is likely to be a [...]
general dimension, located below the level of
field dependence-independence.

In the present state of the evidence, we are
inclined to consider an articulated body concept
to be a manifestation of restructuring where the
body itself, rather than a stimulus array in the
external field, is the source of experience, as in
the examples of restructuring considered
earlier. Articulation emphasizes the more
cognitive aspects of the body concept, and
particularly the person’s view of the spatial
representation of the body. Though formation of
the body concept is a process that takes place
over the course of development, it may be said
to entail the imposition of structure on a
stimulus array much as do some of the
problem-solving tasks we considered, which [...]
restructuring at the moment they are carried out.
In suggesting that an articulated body
concept be subsumed under cognitive restructuring,
we are changing the location of this
construct from the higher order status it had in
our earlier model in Figure 1.

Consistent with the concept that extent of
autonomy may exert a causal influence on the
development of interpersonal competencies and
restructuring skills is the evidence from a [...]
body of cross-cultural research (e.g., [...],
1978; Witkin & Berry, 1975; Witkin et al.,
1974). It appears that societies that encourage
autonomy from societal and parental authority
in young children, in comparison to societies
that stress conformity, tend to produce [...]
who show the associated characteristics of
field independence in perception of the upright,
competence in cognitive restructuring, and
self-reliance. The contrasting set of characteristics
is more common in societies that encourage
conformity. Extent of socialization for
autonomy is thus implicated as central to the
development of this cluster of characteristics.

We have proposed that the greater self- 
nonself polarity and more autonomous func- 
tioning of field-independent people may foster 
the development of cognitive restructuring 
skills among them, but it is not as likely to 
foster the development of interpersonal com- 
petencies. We have also proposed that the 
greater recourse of field-dependent people to 
external referents stimulates the development 
of interpersonal competencies, but may be 
responsible for these people’s lesser cognitive 
restructuring skills. Relatively field-dependent 
and field-independent people may thus be seen 
as making their main developmental invest- 
ment in different domains, with the result that 
their psychological development proceeds along 
different pathways. This conception views the 
development of differentiation as multilinear; 
genuine development takes place along both 
the pathways of limited differentiation and 
greater differentiation. This conception also 
implies that cognitive styles, which are process 
variables, influence the development of pat- 
terns of abilities—cognitive restructuring skills 
and interpersonal competencies—and so may 
be regarded as expressing themselves in these 
abilities.

The second major indicator of differentiation,
segregation of psychological functions, is
manifested in other ways, in structured controls
and specialized defenses. These manifestations
thus have the status of more specific constructs
below the level of segregation of psychological
functions.

The third major indicator of differentiation,
segregation of neurophysiological functions, is
located at the same level in our model as the
psychological indicators, segregation of psychological
functions and self-nonself segregation.
Below the level of neurophysiological
segregation we place the construct lateralization
of cerebral functions, in which neurophysiological
segregation manifests itself.

Within the differentiation framework, segregation
of neurophysiological functions may
provide causal pathways to the development
of the psychological manifestations of differ- 
entiation, in addition to the socialization and 
experiential pathways that have been sug- 
gested at various points. (To avoid undue 
complexity, these pathways are not shown in 
Figure 2.) Psychological functions are rooted 
in a substratum of neurophysiological activity;
therefore, specialization of functions at a 
neurophysiological level may be an important 
determinant of specialization of psychological 
functions. The possible influence of neuro- 
physiological specialization on the develop- 
ment of cognitive restructuring and of special- 
ized defenses may be examined as illustrative. 

With regard to cognitive restructuring, one 
might suppose that the processing modes that 
are characteristic of the left and right hemi- 
spheres are maximally effective if they are 
localized in separate hemispheres. Levy (1969, 
1974) has proposed that when language repre- 
sentation is bilateral, verbal functions will pre- 
dominate at the expense of configurational- 
gestalt processes. It may be, as Teuber (1975) 
has suggested, that speech is established earlier 
and is more resilient, thereby attaining a 
favored position wherever it is located. That 
is, extent of hemispheric specialization may 
have little effect on verbal competence, but it 
is likely to facilitate spatial-configurational 
processes ordinarily mediated by the right 
hemisphere, either alone or in complex inter- 
action with the left hemisphere’s propositional 
mode. Although the scope of the cognitive- 
restructuring dimension has not yet been con- 
clusively established, the weight of the evidence 
currently available suggests that the cognitive 
correlates of field independence in perception 
of the upright may be limited to spatial-visual 
restructuring abilities. If this suggestion is con- 
firmed by future research, Levy’s hypothesis 
may help to explain such a possible limitation, 
and individual differences in spatial-restruc- 
turing abilities may be understood as a con- 
sequence, in part, of degree of cerebral 
lateralization.¹⁰


10 Waber (1977a), working within the framework of 
Levy’s (1969) hypothesis, has recently proposed a 
model that implicates hormonal factors. There is evi- 
dence, although it is not entirely consistent, that in- 
dividuals of both sexes who reach puberty relatively 
Turning to the possible influence of cerebral
specialization on specialization of defenses, we 
have proposed that greater differentiation 
manifests itself in clear separation between 
affect and ideas or percepts. This separation 
may perhaps be related to the newly emerging 
evidence that the right hemisphere, in addition 
to mediating spatial-configurational processes, 
may also be implicated in the expression and 
control of affect (Carmon & Nachshon, 1973; 
Gainotti, 1972; Galin, 1974; Heilman, Scholes, 
& Watson, 1975; Schwartz, Davidson, & Maer, 
1975). If there is a special role for the right 
hemisphere in affective experience and expres- 
sion, we might then ask whether greater or less 
cerebral lateralization may enter into determin- 
ing how affect and ideation are interrelated. It 
is perhaps possible that one of the bases for the 
relative lack of separation of feelings and ideas 
among less differentiated people is a neural 
organization in which parts of the brain or- 
dinarily involved in their processing are not as 
specialized for these purposes. Such a possi- 
bility seems consistent with the differentiation 
framework and is capable of generating testable 
predictions. 

The possible involvement of cerebral specialization
in the psychological components of the
differentiation cluster suggests that causal
influences may cross the three main pathways
radiating from differentiation, in the model
sketched in Figure 2, making for multidetermination
of these components. In still broader
perspective, it may well be that the socialization
practices that encourage self-nonself
segregation and individual autonomy (see
Witkin & Goodenough, in press, and Goodenough
& Witkin, Note 7, for recent reviews)
have much in common with those that encourage
the development of such expressions
of segregation of psychological functions as
structured controls and specialized defenses
(Dyk, 1969; Dyk & Witkin, 1965; Witkin et
al., 1962/1974). Moreover, socialization practices
may well influence the development of
neurophysiological differentiation, which in
turn may have its own consequences for a
variety of specific psychological functions
implicated in the differentiation framework. That
socialization experiences may indeed affect the
development of lateralization is suggested by
the finding of Kimura (1967) and Geffner and
Hochberg (1971) that their lower socioeconomic
status (SES) children developed the usual
right-ear advantage in dichotic listening later
than did middle and upper SES children.
Although we do not yet know the precise nature
of the operative variables involved, or their
overlap with the socialization variables
implicated in the development of self-nonself
segregation and segregation of psychological
functions, it seems that experience may be one
important source of individual differences in
neurophysiological differentiation.


that the relationship between [...]
structuring ability is mediated by hormonal influences
on the development of hemi[...]

Complementing these various routes of
causal influence are possibilities for more
directly acting causal effects. Thus, training
has been shown to enhance specific restructuring
skills such as disembedding (Witkin &
Goodenough, in press; Goodenough & Witkin,
Note 7), and it may affect restructuring as a
general competence as well (Dolecki, 1976).
The development of interpersonal competence
is, of course, also subject to direct influence by
appropriate social experiences.

Much progress has been made over the years
in elaborating and making more specific the
propositions of differentiation theory and
enlarging its empirical underpinning. With this
expansion have come new questions that
require answers, an inevitable accompaniment
of the evolution of any theory. The new
directions for further work that have been
opened by recent research on differentiation
theory provide an encouraging sign of the
theory’s continuing usefulness.
Miller and Norman have distinguished between active observers (those on the
receiving end of an actor’s behavior) and passive observers (onlookers of an 
event involving an actor and an active observer). Following the concept of 
hedonic relevance, it was hypothesized that active observers will attribute the 
actor's behavior to personal dispositions of the actor even more strongly than 
passive observers will. In a series of hypothetical emotional events, subjects 
were depicted either as actors (“You like Ted”), active observers (“Ted likes 
you’), or passive observers (“Ted likes Paul”). They then rated the degree to 
which the actor, active observer, or some “other reason” had caused the given 
event. Although the actor-observer effect was obtained overall, an interaction 
between subject role and positivity of verb indicated that it occurred much 
more strongly in negative-verb events than in positive-verb events. That is, 
subjects—as either actors or active observers—tended to deny their
responsibility for negative events but did not claim praise for positive events.
Implications for the effects of egotism on attribution are discussed.


behaviorally influenced by the other participants 
Active observers, therefore, function as both actors 8 
observers. The term passive observer . . . refers ‘a 
individual who neither influences nor is influence 
the actor he is observing. (p. 503) 


The intriguingly simple hypothesis of Jones 
and Nisbett (1971) that “there is a pervasive 
tendency for actors to attribute their actions 
to situational requirements, whereas observers 
tend to attribute the same actions to personal 
dispositions” (p, 2) has sparked a profusion 


Clearly, this distinction is critical, since the 
of research in recent years. In any interpersonal 


aa ikel 
active observer’s attributions are more likely 


situation that involves both actors and ob- 
Servers, it is necessary to distinguish between 
two types of observers, active and Passive. 
Miller and Norman (1975) stated that 


the term active observer refers to any participant in a 
social interaction situation who, in addition to observing 
the behavior of the other participants, influences the 
behavior of the other Participants and is himself 
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to be swayed than the passive observer's m 
the hedonic relevance and personalism of 4 
actor’s behavior (Chaikin & Cooper, 19M 
Hill & Jones, 1973; Jones & Davis, 1965). 
Among all of the interpersonal studies to 
found in the actor-observer livers i 
have investigated this distinction exp ak 
(Miller & Norman, 1975; Wolfson & ee ; 
1977; Weber, 1975; Cunningham & a 
Note 1). Miller and Norman found, as a 
dicted, that active observers attributed i 
behavioral responsibility to the actor an a 
responsibility to the interaction sce 
Prisoner’s Dilemma game—than pe 
servers did. Supportive of Miller and bi : 
results, divergent attributions were Pr 


be 
e 
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SELF AS ACTOR, ACTIVE OBSERVER, AND PASSIVE OBSERVER 


[by actors and active observers in two other 

bargaining studies (Snyder, Stephan, & Rosen- 
f feld, 1976; Stephan, Rosenfield, & Stephan, 
1976). Yet Wolfson and Salancik’s (1977) 
“involved (i.e., active) observers,” who watched 
fan actor operate a model race car set while 
anticipating competing against either the 
actor or another subject, made more situational 
(external) and fewer dispositional (internal) 
attributions for the actor’s failing performance 
than uninvolved (passive) observers. To the 
extent that both involved and noncompetitive 
observers’ attributions paralleled those of
actors, these observers seemed to have adopted
an empathic set (Regan & Totten, 1975;
Stephan, 1975) rather than the motivationally
defensive posture of the bargaining game
players. The emergence of either empathic
or “egotistic” (Snyder et al., 1976) attribution
thus seems to be a complex function of the 
patterning of interaction (trial-by-trial bargain- 
ing vs. observer role anticipating actor role), 
outcome (success-failure vs. no explicit result), 
and sex of subjects. Unfortunately, the un- 
structured-conversation paradigm of Cunning- 
ham and Antill (Note 1) and Weber (1975)— 
the only apparent active-passive observer 
studies that did not use a competitive or
achievement setting—yielded no differences 
between the attributions of active and passive 
observers. 

The present study was conducted in the 
hope that a simplified paradigm would bring 
clarity to the issue of similarity or divergence 
in attributions made by actors, active ob-
servers, and passive observers. Jones and
Nisbett (1971) have argued that both infor- 
mational bias and information-processing bias 
contribute to differences in attributions made 
by actors and observers. The studies by Miller 
and Norman (1975) and by Wolfson and 
Salancik (1977) differed, among other ways, 
in the amount of information available to the 
active observer; in the latter study, the actor 
performed in full view of the active observer,
but in the former, he was never seen. Regan
and Totten (1975) showed that through
induction of an empathic set, information
processing alone can yield actor-observer
differences. In a questionnaire format, the
present study sought to minimize informational
complexity but permit information-processing
differences by locating the actor as the subject
of a sentence (“You like Paul”) and the 
active observer as the object of a sentence 
(“Ted likes you”). 

Using success and failure at hypothetical 
achievement events, Ruble (1973) manipulated 
actor-observer perspective by phrasing sen- 
tence subjects in the third person for the 
passive-observer role (e.g., “John Doe didn’t 
work well with others on a project”), as 
opposed to the second person for the actor 
role (e.g., “You worked well with others on a 
project”). Incorporating these actor and 
passive-observer roles, this procedure was 
adapted in the present study to include the 
active-observer role as well by phrasing 
certain sentence objects in the second person 
(e.g., “Ted likes you”). Emotional verbs were
used particularly to determine whether the 
predominance of causal attributions to sentence 
objects for subjective verbs in passive- 
observer events (Abelson & Kanouse, 1966; 
McArthur, 1972) would shift to sentence 
subjects in active-observer events, as the 
results of Miller and Norman (1975) would 
predict. 

A second goal of the study was to examine the
effect of extremity of event on ascriptions of
causality. More extreme or severe conse-
quences lead observers to assign greater
responsibility to actors (Cunningham & Kelley,
1975; Walster, 1966), whereas actors attribute
less responsibility and freedom to themselves
(Harvey, Harris, & Barnes, 1975). It is
predicted that actors and active observers
will place the locus of causality for extreme 
emotional reactions more in the other person 
than in themselves, whereas passive observers 
will attribute greater causality to sentence 
subjects as emotions become more extreme. 


Method 


Subjects 

Subjects were 24 male undergraduates at the
University of California, Los Angeles, who participated
in partial fulfillment of an introductory psychology
course requirement.


Materials 


A series of one-sentence interpersonal events depicting
subjective emotional states was constructed (e.g.,
“Ted likes Paul somewhat”). Sentence subjects and
objects were either common male first names or the
pronoun you. Actor-observer perspective was manipulated
by four combinations of subject-object pairings,
for example, “Ted . . . Paul,” “Paul . . . Ted,”
“You . . . Paul” (self as actor), and “Paul . . . you”
(self as active observer). Four positive-negative verb
pairs were used (like-dislike, admire-resent, love-hate,
and trust-distrust). All verbs were phrased in the
present tense. The extremity or intensity of the event
was varied by the inclusion of a moderate (“somewhat”)
or extreme (“deeply”) modifier. The complete
crossing of all possible combinations of these four 
variables—actor-observer role, verb, positivity, and 
extremity—yielded a total of 64 events. 
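
As an illustration only (it is not part of the original materials), the crossing just described can be enumerated in a few lines of Python. The condition labels, the dictionary layout, and the helper name `build_events` are assumptions made for this sketch; the names, verb pairs, and modifiers are those given in the text.

```python
from itertools import product

# Subject-object pairings for the four role conditions described in Materials.
# The condition labels are hypothetical; the pairings come from the text.
ROLE_PAIRINGS = {
    "passive observer (Ted-Paul)": ("Ted", "Paul"),
    "passive observer (Paul-Ted)": ("Paul", "Ted"),
    "actor": ("You", "Paul"),
    "active observer": ("Paul", "you"),
}
# Verb pairs as (positive, negative), in base form.
VERB_PAIRS = [("like", "dislike"), ("admire", "resent"),
              ("love", "hate"), ("trust", "distrust")]
MODIFIERS = {"moderate": "somewhat", "extreme": "deeply"}


def build_events():
    """Return one record per cell of the 4 x 4 x 2 x 2 design."""
    events = []
    for (role, (subj, obj)), pair, pos_idx, (extremity, modifier) in product(
            ROLE_PAIRINGS.items(), VERB_PAIRS, (0, 1), MODIFIERS.items()):
        verb = pair[pos_idx]
        # Present tense: third-person subjects take -s, "You" does not.
        if subj != "You":
            verb += "s"
        events.append({
            "role": role,
            "positivity": "positive" if pos_idx == 0 else "negative",
            "extremity": extremity,
            "sentence": f"{subj} {verb} {obj} {modifier}.",
        })
    return events


events = build_events()
print(len(events))             # number of cells in the full crossing
print(events[0]["sentence"])   # first generated event
```

Keeping the two passive-observer pairings as separate conditions mirrors the four subject-object combinations in the text; they can be pooled afterward if the two orders are treated as equivalent.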

Each event was followed by three one-sentence causes 
or explanations for the given event. The first cause 
referred to qualities of the sentence subject, and the 
second referred to qualities of the sentence object. The 
third attribution (“some other reason”) was left 
optional for experimental subjects to rate. Subjects 
were told to “rate each of these explanations on how 
likely it is that they caused the initial situation,” and 
to make their rating of each cause independent of their 
ratings of the other causes. A typical item follows: 


Ted likes Paul somewhat. 
How likely is this because: 


A. Ted is the kind of person that likes people.
1 2 3 4 5 6 7 8 9
B. Paul is the kind of person that people like.
1 2 3 4 5 6 7 8 9

C. Some other reason


1 2 3 4 5 6 7 8 9


The 9-point Likert scales were anchored at each end
by “not likely” or “definitely likely” and at the
midpoint by “moderately likely.”

All subjects made ratings . . . the . . .
was randomized, but each . . .
sequence of events. The order in which subjects received
one booklet or the other first was a between-subjects
variable.


Results 


As preliminary analyses indicated no
significant differences between ratings
of equivalent passive-observer statements
(e.g., “Ted likes Paul” and “Paul likes Ted”),
these data were pooled in subsequent analyses
so that the subject-role variable had three
levels: actor, active observer, and passive
observer.¹ All ratings were subjected to multi-
variate analysis of variance (Dixon, 1973;
Harris, 1975) with order of presentation of
booklet as the only between-subjects variable
and with subject role, positivity, extremity, and
verbs as repeated measures. The dependent
variables were the ratings of sentence actors,
active observers, and “other reasons” as
causes of the given event.


Order of Presentation Effects 


The multivariate effects involving order of
presentation of booklets are sporadic and
difficult to interpret. Also, no significant main
effect or interaction involving order emerged
when the three dependent variables were
analyzed separately. But when all three de-
pendent variables were analyzed together in
the multivariate analysis, subjects who re-
ceived the booklet containing passive-observer
events first rated causality for negative
events lower than positive events and lower
than subjects who received that booklet
second, Order × Positivity interaction, F(3,
196) = 32.7, p < .001. Those who received
the booklet containing passive-observer events
first gave lower ratings of causality for events
in which self “loves” or “hates” someone
else than all of the other combinations, Order
× Verb × Subject Role multivariate inter-
action, F(27, 573.06) = 3.19, p < .001. Those
who received the self-relevant booklet first
rated moderate negative events involving
self as actor or active observer highest,
whereas those receiving the other booklet
first rated extreme negative events involving
self as actor or active observer highest, Order
× Positivity × Extremity × Subject Role
multivariate interaction, F(9, . . .) = . . .,
p < .001. However, none of these order effects
changed the conclusions to be drawn from
the other results, nor did they correspond in
any apparent way to effects obtained when
the order of actor and observer roles was
manipulated (Sherrod & Farber, 1975, in
failure situations).


¹ Pooling of the equivalent passive-observer state-
ments resulted in twice as many observations in the
passive-observer condition as in either the actor or
active-observer conditions. Unequal cell sizes were
accommodated by the BMD computer program used
in the analyses (Dixon, 1973).
² Fractional degrees of freedom resulted because of
the algorithm employed by the program to deal with
unequal cell sizes (Dixon, 1973).


Actor-Observer Hypothesis and Hedonic
Relevance


A significant multivariate main effect for
subject role lends support to the actor-
observer hypothesis, as attributions to the
self either as actor or as active observer were
lowest of all means, F(9, 477.16) = 67.9,
p < .001. When this analysis was performed
on ratings of actor and active observer only
on events that involved the self (i.e., actor
role and active-observer role), the subject-role
main effect emerged even more strongly,
F(6, 392) = 100.6, p < .001. This confirms
Miller and Norman’s (1975) distinction be-
tween active and passive observers to the
extent that subjects attributed less causality
to themselves as active observers than when
others were portrayed as active observers.
However, the Jones-Nisbett (1971) effect
was ancillary to an interaction between subject
role and positivity that reflected the hedonic
relevance of the event. That is, subjects were
much more reluctant to attribute causality
to the self for negative verbs (e.g., hate and
distrust) than for positive verbs, multivariate
. . .
active observer against the other 16 means in
this interaction was highly significant, F(1, 23)
= 64.0, p < .001. Thus, contrary to Miller
and Norman’s findings, the active observer
did not attribute greater causality to the
actor than a passive observer did. Rather, the
active observer attributed, in negative events
predominantly, less causality to self, as
active observer, than a passive observer did.

Following the significant multivariate inter-
action between subject role and positivity,
univariate interactions appeared for causal
attributions to actor and active observer
separately, F(3, 42) = 5.30 and 8.96, ps < .005
and .001, respectively. As can be seen in
Table 1, subjects tended to deny responsibility
for negative events when they were cast
as actors or as active observers.


Table 1
Mean Causal Attributions to Actor and Active
Observer as a Function of Subject Role and
Verb Positivity

                            Subject role
Causal attribution to   Actor   Active observer   Passive observer

Actor
  Positive verb         4.67    5.33              5.29
  Negative verb         3.53    5.14              5.17
Active observer
  Positive verb         5.48    5.81              6.12
  Negative verb         4.97    3.81              5.33

Note. A sample sentence in the actor condition is
“You like Ted”; in the active observer condition is
“Ted likes you”; and in the passive observer condi-
tion is “Ted likes Paul.”


comparisons probing these interactions showed 
that negative verb ratings of self as actor 
were much lower than all other actor ratings, 
F(1,23) = 29.7, p < .001, and that negative 
verb ratings of self as active observer were 
much lower than all other active-observer 
ratings, F(1, 23) = 46.8, p < .001. On the 
other hand, positive verb ratings of self as 
actor were only somewhat lower than all other 
actor ratings, F(1, 23) = 4.98, p < .04, and
positive verb ratings of self as active observer
did not differ from all other active-observer
ratings (F < 1).


Verb Effects and Extremity Effects 


The interaction between actor-observer 
perspective and positivity was particularly 
pronounced on events involving the verb hate 
and nearly as pronounced on the verb dislike, 
multivariate interaction, F(27, 573.06) = 2.78,
p < .001. As a consequence, negative verbs,
such as dislike and hate, received the lowest
causality ratings of all verbs, and positive
verbs, especially trust, the highest, multivariate
interaction, F(9, 477.16) = 8.95, p
< .001. Positive verbs had higher ratings than 
negative verbs on actor and active-observer 
ratings, univariate F(1, 14) = 23.2 and 34.5, 
respectively, both ps < .001, but lower ratings 
on other reasons, F(1, 14) = 10.3, p < .01. As 
predicted, more causality was attributed to
actors for extremely negative events than for
moderately positive events, as seen in the
significant multivariate interaction between
extremity and positivity, F(3, 196) = 4.07,
p < .01. Overall, extreme events were rated
higher on causality ratings than moderate 
events, F(3, 196) = 10.8, p < .001; this was 
particularly true of other reasons ratings. 


Discussion 


The main conclusion to be drawn from this 
study is that the actor-observer hypothesis 
of Jones and Nisbett (1971) may hold true 
particularly for blameworthy or socially 
undesirable events, that is, events involving 
hatred, distrust, dislike, or resentment. When- 
ever portrayed as a participant in such an 
event, as either an actor or an active observer, 
the subject deflected blame from the self, 
directing it instead to the other person or to 
some unspecified other reason. This did not 
occur to the same extent when the subject was 
depicted as either an actor or active observer 
in a socially desirable event, that is, love, 
trust, liking, or admiration. 

Snyder et al. (1976) defined egotism as 
“the tendency to make attributions that put 
oneself in the best possible light” (p. 435). 
Accordingly, subjects in the present experi- 
ment displayed limited egotism. Although they 
strongly denied their role in undesirable 
events, they did not claim excessive credit for 
desirable events. Thus, persons may be more 
likely to avoid blame for blameworthy acts 
than they are to demand praise for praise- 
worthy acts. This finding is explicable in terms 
of the results of a study by de Charms, Carpen- 
ter, and Kuperman (1965), who found that 
more freedom is attributed to a person who 
acts in order to please a liked other than to 
one who acts to avoid the threat of punishment. 
If persons are held responsible for an act to the 
extent that they are perceived as having the 
freedom to do it or not (Harvey, 1976), then, 
presumably, they will be held more responsible 
for acts of self-aggrandizement (claiming
praise) than for acts of self-protection (blame 
avoidance). Others may regard excessive self- 
aggrandizement as boastful and, therefore, 
undesirable. Hence, a balance between de-
mands for a favorable public image and for a
robust level of self-esteem may be struck if
individuals avoid blame but do not always
claim praise for their actions. 

The muting of self-aggrandizement in this
study seems to contradict the findings of
Snyder et al. (1976), who found that egotism
was as potent a factor in a bargaining game
after winning as after losing. That is, their
subjects were as likely to flatter themselves on
success as they were to deny personal respon-
sibility for failure. However, the competitive
nature of the bargaining game may have led
the subjects to adopt an assertive “heads I win,
tails you lose” attitude toward the situation.
Similarly, when competing only against them-
selves, actors are more likely to take credit
for their successes than their failures (Harvey,
Arkin, Gleason, & Johnston, 1974; Johnson,
Feigenbaum, & Weiby, 1964). The context of
the present study, which involved neither
competition nor achievement, might have
encouraged a more even-handed set in
subjects. Unfortunately, attributional data
from the bargaining study of Miller and
Norman (1975) do not address this particular
question.

By supporting the original proposition of
Jones and Nisbett (1971) that actors attribute
less causality to themselves than passive
observers do, the present findings contradict
the results of Miller and Norman (1975).
Their bargaining subjects ascribed more
behavioral responsibility to themselves and
perceived more disposition in their behavior
than did observers. However, in the physical
absence of the bargaining opponent, the
opponent’s behavior probably did not “engulf
the field” (Heider, 1958), a presumed pre-
requisite for the actor-observer effect. Because
the bargaining session consisted of a number
of trials involving identical choices in the same
situation, the actor’s own behavior would
seem to be the most vivid feature of an other-
wise informationally impoverished setting.
Such vividness would direct attention to the
self, thus fostering attributions of responsibility
to the self (Duval & Wicklund, 1972). . . .
treated as a within-subjects factor, Miller
and Norman found no significant differences
on ratings of responsibility or personality, in
contrast to ratings of causality in the present
study.

It may be argued that the artificiality of the
present questionnaire format lacks impact 
and precludes generalization to behavioral 
situations (Ruble, 1973). However, actor- 
observer differences have been found in 
previous questionnaire studies (e.g., Nisbett, 
Caputo, Legant, & Maracek, 1973). To the 
extent that this technique has limited realism, 
it may simply underestimate the strength of 
actual behavioral effects. This technique also 
has several advantages. As each subject 
attributes causality for many events in a 
questionnaire, a wide range of events can be 
sampled. By contrast, the findings of Miller 
and Norman (1975), Snyder et al. (1976), and 
Stephan et al. (1976) may not generalize 
beyond the bargaining context that these 
authors used. Such situations undoubtedly
elicit rivalry and face-saving strategies that
may apply only to a circumscribed subset of 
natural settings. Within a limited period of 
time, it would be difficult indeed to stage, in 
factorial combinations, behavioral equivalents 
of each of many events in both passive- 
observer and self-relevant modes. As well, 
events need not be informationally or be- 
haviorally rich before people seek to ascribe 
causality for them. Often, attributions are 
based on hearsay or on minimal information, 
as in the present study. Further research 
would benefit from a systematic sampling of a 
variety of interpersonal events (as by Carson’s, 
1969, circumplex) to identify the situations in 
which active observers display varying degrees 
of empathic or egotistic attribution. The
obvious limitations of a questionnaire study
would be transcended by creating behavioral
situations that permit more direct extrapola-
tion to face-to-face encounters.


Reference Note 


1. Cunningham, J. D., & Antill, J. K. Actors, active
observers, and passive observers in a getting-acquainted
situation. Unpublished data, Macquarie University,


Self-Esteem and Education:
Sex and Cohort Comparisons Among High School Seniors 


Patrick M. O’Malley and Jerald G. Bachman 
University of Michigan (Ann Arbor) 


The present article reports on the self-esteem of 3,183 male and female seniors
in a nationwide sample of the high school class of 1977 and draws comparisons
with 1,715 males from the class of 1969. It thus provides a partial replication
and extension of an earlier study, which showed that educational accomplish-
ments undergo a reduction in centrality (become less important) for self-
esteem during the late teens and early twenties. One major finding is that the
self-esteem of seniors was correlated with educationally relevant measures
almost equally for males in the classes of 1969 and 1977. This finding appears
to rule out a secular trend interpretation of the earlier study’s results, thereby
providing further support for a developmental interpretation. A second finding
is that self-esteem correlated with educationally relevant measures about
equally for males and females in the class of 1977, suggesting that the impact
of educational factors is basically similar for the two sexes. A third finding is
that males and females were very similar in levels of self-esteem.


A recent analysis by Bachman and O'Malley 
(1977), based on a sample of about 1,600 
young men from the high school class of 1969, 

showed that self-esteem is positively correlated 
with educational success. The study also 
showed that educational accomplishments 
seem to have greater importance or centrality 
for self-esteem during the high school years 
than during the 5 years beyond high school. 
Although the data for that study were nation-
ally representative and covered an 8-year
longitudinal span, there remained at least two
important limitations on the generalizability
of the findings. First, and most obviously, the
study was limited to young men, thus leaving
it unclear whether the findings are applicable
to young women. Second, the study spanned
the interval from 1966 to 1974, a period in
which there were important social changes,
including changes in public views about the 
value of education. Therefore, the data left 
some uncertainty about whether the changing 
link between educational success and self- 
esteem reflected a genuine developmental 
process in late adolescence or a secular trend 
that affected the society as a whole. 

The present article, reporting new data 
from a nationwide cross section of male and 
female seniors from the high school class of 
1977, provides a partial replication and exten- 
sion of the study of the class of 1969. The 
new data permit us to overcome both of the 
limitations noted above. Unlike the previous 
study, the findings reported here are not 
longitudinal; therefore, we will not be con- 
cerned with relating self-esteem to later (post- 
high school) educational and occupational 
attainments. Instead, we will see how self- 
esteem among seniors relates to recent educa- 
tional success (high school grades), self-ratings 
of academic ability, college plans, and parental 
education. We will examine these relationships 
in parallel fashion for three distinct groups: 
senior males in the class of 1969, senior males 
in the class of 1977, and senior females in the 


class of 1977. 


One of the basic conclusions from the pre-
vious study was that


those things having to do with a self-concept of educa- 
tional success—things such as academic skills, past 
classroom performance, future aspirations, and the 
like—undergo some reduction in salience or “central- 
ity” for the overall self-esteem of young men as they 
move through the final years of high school and go on 
to other experiences. (Bachman & O'Malley, 1977, 
p. 377) 


The study recognized the possibility of 
different interpretations. 


The first interpretation, and the one which we have 
emphasized, is that this shift in centrality is a fairly 
typical part of the developmental sequence followed 
by young people in this society. During the late high 
school years and the period which follows, the young 
person in the process of becoming an adult increasingly 
anticipates and experiences situations in which self- 
evaluation depends on factors quite different from 
success in school; and this means that academic 
things become less dominant in shaping self-esteem. 
An alternative interpretation of our findings is that 
they reflect a particular secular trend or cultural 
change during the late sixties and early seventies—a 
general decline in the importance or value that society 
places upon education and educational success. Trust 
in government declined dramatically during this period, 
and some would argue that faith in education as the 
pathway to success has also suffered a setback. (Bach- 
man & O'Malley, 1977, pp. 378-379) 


The present analysis, by comparing data for 
males in the classes of 1969 and 1977, provides 
a means of testing whether one of the alter- 
native interpretations outlined above is more 
valid than the other. First, it must be under- 
stood that the earlier longitudinal study 
revealed a gradual and continuing change from 
1966 (start of 10th grade) through 1974 (5 
years postgraduation); across a sequence of 
five data collections, correlations between 
self-esteem and educational factors grew 
progressively smaller (Bachman & O'Malley, 
1977; Bachman, O'Malley, & Johnston, 1978). 
Now suppose that this gradual change re- 
flected a substantial secular trend or cultural 
change affecting the relationship between self- 
esteem and educational success; then, unless 
the trend dramatically reversed during the 
past several years, correlations between edu-
cational factors and self-esteem should be
substantially lower for seniors in 1977 com-
pared with seniors in 1969. But suppose, on
the other hand, that the earlier findings were
due solely to developmental changes occurring
during late adolescence; then, unless such
developmental patterns were somehow differ- 
ent a decade later, the correlations for seniors 
in 1977 should be similar to those found for 
seniors in 1969. 

A different point of interest in the present 
analysis is how males and females compare in 
the extent to which self-esteem is determined
by (correlated with) various educationally
relevant dimensions. Douvan and Gold (1966),
in reviewing the literature on self-esteem more
than a decade ago, concluded that the self-
esteem of boys and girls depended to some
extent on different components. Also, Rosen-
berg and Simmons (1975) found evidence that
indicated that adolescent boys emphasized the
importance of competence and achievement
more than did girls. If this is true, we may
expect that to the extent that boys’ needs for
competence and achievement depend on
educational experiences, correlations of educa-
tionally relevant variables with self-esteem
should be higher for boys than for girls.

A final point of interest is whether boys
have higher levels of self-esteem than girls do.
There have been reports that adolescent
females’ self-images are less favorable than
males’ (Rosenberg & Simmons, 1975), although
others have not found differences (Drummond,
McIntire, & Ryan, 1977; Rosenberg, 1963).
There are good reasons why females’ self-
esteem should be lower than that of males,
feminists would argue. The females’ dis-
advantageous position in society could easily
result in lower self-esteem. Simmons and
Rosenberg (1975) in their study of adolescent
girls found that “at least in 1968, girls appeared
to have a more unfavorable self-picture than
did boys” (p. 233). As implied in the quotation,
an interesting question is whether that is
still true in 1977. Perhaps the upheaval in
women’s rights and roles in the period from
1968 to 1977 resulted in an erasure of
differences.

In fact, a comparison has already been
made between the findings from the 1968
study (Simmons & Rosenberg, 1975) and
findings from a later, similar study conducted
from 1974 to 1975 (Bush, Simmons, Hutchin-
son, & Blyth, 1977-1978) “to see if the
feminist ideology and movement could have
had an effect on girls’ self-esteem” (p. . . .).


Table 1
Self-Esteem Items and Item-Index Correlations

                                          1977 high school seniors
                                         Males                      Females            t test
                                                                                       between
Item                               n      M     SD    rᵃ      n      M     SD    rᵃ    means

I take a positive attitude toward
  myself.ᵇ                         1,568  4.13   .93   .54    1,736  3.98   .99   .56   **
I feel I am a person of worth, on
  an equal plane with others.ᵇ     1,555  4.27   .85   .49    1,727  4.21   .90   .55
I am able to do things as well as
  most other people.ᵇ              1,560  4.41   .75   .39    1,733  4.30   .81   .50   **
On the whole, I am satisfied
  with myself.ᵇ                    1,539  4.06  1.05   .48    1,718  4.06  1.03   .57
I feel I do not have much to be
  proud of.ᶜ                       1,519  4.03  1.14   .54    1,707  4.05  1.15   .60
Sometimes I think that I am no
  good at all.ᶜ                    1,506  3.63  1.32   .51    1,701  3.47  1.39   .50   **
I feel that I can’t do anything
  right.ᶜ                          1,498  4.07  1.11   .41    1,695  4.11  1.08   .53
I feel that my life is not very
  useful.ᶜ                         1,501  4.18  1.05   .60    1,687  4.18  1.10   .62

Self-esteem index                  1,494  4.10   .66   .79ᵈ   1,689  4.04   .72   .83ᵈ  *

ᵃ Item-index correlation, corrected for part-whole.
ᵇ Response of “agree” coded 5 (high self-esteem).
ᶜ Response of “disagree” coded 5 (high self-esteem).
ᵈ Coefficient alpha.
* p < .05. ** p < .01.
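
The "t test between means" column can be approximated from the summary statistics printed in Table 1. The sketch below is an illustration under stated assumptions: the article does not say which form of the t test was used, so an unequal-variance (Welch) statistic is computed here, and the function name `welch_t` is hypothetical. No correction for the clustered sampling design is applied.

```python
from math import sqrt


def welch_t(n1, m1, sd1, n2, m2, sd2):
    """t statistic for two independent means from summary statistics.

    Uses the unequal-variance standard error; this choice is an
    assumption, not something the article specifies.
    """
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / standard_error


# Males vs. females, values as printed in Table 1.
t_item1 = welch_t(1568, 4.13, 0.93, 1736, 3.98, 0.99)  # "positive attitude" item
t_index = welch_t(1494, 4.10, 0.66, 1689, 4.04, 0.72)  # self-esteem index

print(round(t_item1, 2), round(t_index, 2))
```

With samples this large, an absolute t above roughly 1.96 corresponds to p < .05 and above roughly 2.58 to p < .01 (two-tailed).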


Like the 1968 study, this more recent study 
found that girls’ self-esteem in the sixth and 
seventh grades averaged slightly lower than 
boys’. The present study extends the time
period to 1977 and looks at seniors, rather than
sixth and seventh graders.

In sum, this article employs data from two
studies to address three theoretically im-
portant questions about self-esteem. First,
are the patterns of correlation between self-
esteem and educational factors essentially the
same for males in the high school classes of
1969 and 1977? Second, are these patterns of 
correlation similar for males and females (in 
1977)? Third, are there differences in levels 
of self-esteem between males and females 
(in 1977)? 


Method 


Sample 
One source of data is the Youth in Transition sample
of young men from the high school class of 1969,
described in detail by Bachman et al. (1978). Briefly,
this is a nationally representative sample of
sophomores in 87 schools in the 48 contiguous states.
Data collections were made in 1966 when they were
sophomores and again in 1968, 1969 (when they were
seniors), 1970, and 1974. In the 1969 data collection,
the one most relevant here, 1,799 participated (79%
of the original target sample).

The second set of data is the Monitoring the
Future project being conducted by the University of
Michigan’s Institute for Social Research. The study is
. . . Johnston (Note 1); . . . female seniors in . . .
follow-up surveys are . . . for 6 years following . . .
base-year data collection is . . .
schools selected to provide an accurate cross section
of high school seniors throughout the United States.
. . . survey of over 18,000 seniors
provided the data to be reported here. The measure of
primary interest, self-esteem, was included in only one
of the five forms used in the study, so analyses in this
article were based on an essentially random 20% of
the total. The number of respondents with complete
data on the self-esteem index was 3,183.


¹ The study began with the class of 1975; however,
1977 was the first year that included all of the eight
self-esteem items presented in this article.


Measures


Self-esteem measures in the two studies were similar 
but not identical. The Monitoring the Future measure 
was an eight-item index similar to that used by Rosen- 
berg (1965). Respondents were asked to indicate on a 
5-point scale the extent of their agreement or disagree- 
ment with the items. The response categories—disagree, 
mostly disagree, neither, mostly agree, agree—were 
coded from 1 to 5, with higher values assigned to 
responses indicating higher self-esteem. The first four 
items were positively worded, and the second four 
were negatively worded. Table 1 shows the eight items, 
with their means, standard deviations, and item-index 
correlations, separately for males and females. Coeffi- 
cient alphas were .79 for males and .83 for females. 
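
The coding rule and the index can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the index is assumed to be the mean of the eight coded items (which is consistent with the index means near 4 on a 1-to-5 scale in Table 1), and coefficient alpha is computed with the standard Cronbach formula on whatever coded data are supplied.

```python
from statistics import mean, variance

# Response categories in the order given in the text, coded 1 to 5.
CATEGORIES = ["disagree", "mostly disagree", "neither", "mostly agree", "agree"]
N_POSITIVE = 4  # items 1-4 are positively worded, items 5-8 negatively worded


def code_item(item_number, response):
    """Code one response so that 5 always means high self-esteem."""
    raw = CATEGORIES.index(response) + 1
    # Negatively worded items are reversed: "disagree" becomes 5.
    return raw if item_number <= N_POSITIVE else 6 - raw


def self_esteem_index(responses):
    """Mean of the coded items for one respondent (assumed scoring rule)."""
    return mean(code_item(i, r) for i, r in enumerate(responses, start=1))


def coefficient_alpha(rows):
    """Cronbach's alpha for a list of respondents' coded item scores."""
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# One hypothetical respondent, for illustration only.
responses = ["agree", "mostly agree", "agree", "neither",
             "disagree", "mostly disagree", "disagree", "neither"]
print(self_esteem_index(responses))
```

Reversing the negatively worded items before averaging is what allows all eight items to contribute to the index in the same direction.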

The measure of self-esteem used in the Youth in 
Transition study differed slightly from the Monitoring 
the Future measure. A different 5-point scale was 
used; respondents were asked to indicate how often 
each item was true for them, with responses ranging 
from almost always true to never true. There were 10 
items, 7 of which were virtually identical to those 
used in the 1977 Monitoring the Future study (Items 
1-3 and 5-8 in Table 1). (See Bachman & O'Malley, 
1977, for the Youth in Transition item wordings, 
means, standard deviations, and item-index correla- 
tions.) Coefficient alpha for males in the class of 1969 
was .79, the same as the value for males in the class 
of 1977. 

Because the two studies employed different response 
scales, and some different items, it is difficult to compare 
means and standard deviations directly. However, 
because the two measures are very similar in their 
basic content as well as index characteristics (coefficient 
alphas and item-index correlations), their correlations 
with other measures should be comparable. (The 
comparability can be heightened by using subscales 
based on the seven common items. The present article 
includes correlations based on these subscales. To 
maintain comparability with other reports using these 
data, however, we focused primarily on the complete 
10-item and 8-item measures.) 

Unless otherwise stated, the remaining measures
listed below are the same in the two studies. 

Father’s education and mother’s education were 
measured on a 6-point scale ranging from grade school 
or less to graduate or professional school. Parental 
education is a mean of father’s and mother’s education. 
Grades were self-reported on a 9-point scale. In the 
Youth in Transition sample, the grades for junior 
and senior years were averaged together; in the Monitoring
the Future sample, respondents were asked for
“your average grades so far in high school.”

College plans is a dichotomy indicating whether the
respondent planned to enter college or not. Self-concept
of school ability is a mean of two items that asked
respondents to compare themselves, on a 7-point scale,
with others of the same age on school ability and
intelligence. For the class of 1969, the measure was
obtained as of junior year; for the class of 1977, the
measure was made as of senior year.
Statistical Significance


In reporting statistical significance levels, we made
no adjustment for the clustered sampling design used
for both samples. One reason for not adjusting for
design effect was that we did not have good estimates
of the design effect for self-esteem and its correlations
with other variables. We are confident, however, that
these design effects are small because there is little
clustering of self-esteem by school. Since the nominal
probability levels are nevertheless likely to be slightly
inflated, we report statistical significance at both the
.01 and the .05 levels.


Results and Discussion 


Class of 1977: Male Versus Female Levels 
of Self-Esteem 


Table 1 shows that responses to the self-esteem
items for males and females in the class
of 1977 indicated a generally high level of
self-acceptance. The average response was just
above “mostly agree” for the positive items
and about “mostly disagree” for the negative
items. On six of the eight items, males were
higher in self-esteem than females (three of
these significantly so); for the total index,
males were significantly higher (p < .01).
Females were more internally consistent, however,
as indicated by the higher item-index
correlations for all but one item, and the
higher coefficient alpha.

The fact that self-esteem was slightly lower
for females than for males is consistent with
Simmons and Rosenberg (1975) and Bush et al.
(1977-1978). Apparently, the recent activities
aimed at upgrading women’s position in
society have not resulted in a total elimination
of the differences in self-esteem for high school
seniors. But perhaps more important is the
fact that the difference in Table 1 is small
indeed—less than .09 of a standard deviation.
This very small difference is statistically
significant given a large number of cases, but
studies that use smaller samples would be
unlikely to show statistically significant differences.
This may explain why Maccoby and
Jacklin (1974) reported that statistically
significant sex differences were seldom found.
They commented that “the similarity of the two
sexes in self-esteem is remarkably uniform
across age levels through college age” (p. …).
But their review was based on a number of
studies, most of which involved small samples.
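
The dependence of statistical significance on sample size can be made concrete with a short sketch. It assumes a simple two-group comparison with equal variances and ignores weighting and design effects; the function name is illustrative, the group sizes are the approximate 1977 male and female counts reported with the correlations, and the 100-per-group sample is hypothetical.

```python
import math


def z_for_standardized_difference(d, n1, n2):
    """Approximate test statistic for a mean difference of d standard deviations."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))


# A difference of .09 standard deviations in samples of the present size.
large_sample_z = z_for_standardized_difference(0.09, 1494, 1689)

# The same difference in a hypothetical study with 100 respondents per group.
small_sample_z = z_for_standardized_difference(0.09, 100, 100)

print(round(large_sample_z, 2), round(small_sample_z, 2))
```

Under these assumptions the same standardized difference exceeds the conventional 1.96 cutoff only in the large sample, which is the point made in the text about reviews that rely on small studies.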
Table 2
Correlations of Self-Esteem With Educational Measures for Seniors in 1969 and in 1977

                                 Youth in Transition    Monitoring the Future
                                 (Class of 1969)        (Class of 1977)

                                 Males                  Males                Females
Measure                          (n = 1,715)a           (n = 1,494)a         (n = 1,689)a

Father’s education               .09 (.10)              .08 (.09)            [illegible]
Mother’s education               .06 (.07)              [illegible] (.07)    .08 (.08)
Parental education               .08 (.10)              .08 (.09)            [illegible] (.11)
Grades                           .21 (.21)              .25 (.26)            .24 (.26)
College plans                    .18 (.19)              .18 (.20)            .14 (.16)
Self-concept of school ability   .27 (.26)              .28 (.29)            .29 (.30)

Note. All correlations are significantly different from zero at the .01 level, except that between self-esteem
and mother's education, for the 1977 senior males, which is significant at the .05 level. The correlations in
parentheses are based on the subscale that consists of the 7 items common to both studies (Items 1-3
and 5-8 in Table 1). In no case does a subscale correlation differ from the corresponding full scale correlation
by more than .02. The 7-item subscale correlates .95 with the 10-item Youth in Transition measure, .98 with
the 8-item Monitoring the Future measure for males, and .99 for females. Subscale coefficient alphas are
.76, .76, and .80 for the three samples, respectively.
a The number of cases varies slightly for each correlation, with up to 8% missing data.


There is another large-scale study of a
nationally representative sample of high school
seniors that included a measure of self-esteem
similar to those used in the present article,
namely, the “National Longitudinal Study of
the High School Class of 1972.” Using these
data, Conger, Peng, and Dunteman (Note 2)
reported that female seniors were statistically
significantly lower in self-esteem than males
by about .10 of a standard deviation.

A further note of consistency in the finding
of slightly lower scores on self-esteem for
females than for males comes from the Monitoring
the Future data collections from seniors
in 1975 and 1976. In those years, an abbreviated
four-item self-esteem scale was used. In
each year, the females averaged slightly lower
than the males.

One other point should be made. Our
discussion thus far has taken these self-reports
of self-esteem at face value and dealt only with
bivariate relationships. It is possible that a
more substantial difference between sexes in
self-esteem—or even a reversal of the difference
reported here—is being masked or suppressed
by other related variables that we have not
controlled (Rosenberg, 1973). There are
several likely candidates for such suppressor
variables. Maccoby and Jacklin (1974) noted
that boys score higher on lie or defensiveness
scales. Thus, boys might defensively overrate
themselves on the self-esteem measures. If so,
girls may actually have higher self-esteem 
than boys. A related possible suppressor is 
social desirability, the tendency to present 
oneself in as favorable a light as possible. 
Bush et al. (1977-1978) included such a 
measure in their study and found that girls 
were higher than boys in social desirability. 
Thus, their findings that females were slightly 
lower in self-esteem could not be due to a 
social desirability effect. If anything, the 
differences would be greater if social desira- 
bility were controlled. But the correlations 
among all the variables in their study were 
low, and controlling actually made little 
difference in relationships; the zero-order and 
partial correlations were similar. 

Finally, in the Monitoring the Future data, 
grades could be suppressing a sex effect on 
self-esteem. We demonstrate that grades and 
self-esteem are positively correlated; further,
females report higher grades than males (data 
not shown here). Therefore, one would expect 
females to show higher self-esteem. The fact 
that they showed lower self-esteem indicates 
that self-esteem differences would be greater 
if grades were controlled. But again the 
differences between zero-order and partial 
relationships were small. The zero-order cor- 
relation of .04 between sex and self-esteem 
becomes a standardized regression coefficient 
of only .08 when grades are controlled, and the
substantive significance of a regression coeffi- 
cient of .08 is dubious indeed. 
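
The suppressor logic can be sketched with the standard two-predictor formula for a standardized regression coefficient. In the sketch below, the correlation of .04 between sex and self-esteem and the correlation of roughly .25 between grades and self-esteem come from the text; the correlation between sex and grades is not reported here ("data not shown"), so the value of -.16 is purely an assumed, illustrative figure, negative because sex is taken to be coded with males high while females report higher grades.

```python
def standardized_beta(r_y1, r_y2, r_12):
    """Standardized coefficient for predictor 1 when predictor 2 is controlled."""
    return (r_y1 - r_y2 * r_12) / (1.0 - r_12 ** 2)


r_sex_esteem = 0.04      # zero-order correlation reported in the text
r_grades_esteem = 0.25   # approximate value from the correlations with grades
r_sex_grades = -0.16     # ASSUMED for illustration; not reported in the article

beta_sex = standardized_beta(r_sex_esteem, r_grades_esteem, r_sex_grades)
print(round(beta_sex, 2))
```

Under this assumed value the coefficient rises from .04 to about .08, reproducing the direction and rough size of the change described above; a different sex-grades correlation would give a different coefficient.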

In sum, the data on sex differences in self- 
esteem among adolescents showed some con- 
sistency across studies and across time. 
Females scored slightly lower than males, 
but so slight were the differences that relatively 
large samples were required to achieve sta- 
tistical significance. Such differences—on the 
order of .10 of a standard deviation or less—are 
of dubious substantive significance. As Mac- 
coby and Jacklin (1974) pointed out, it is the 
similarity, not the difference, that is remark- 
able. Although there remains the logical 
possibility of substantively important effects 
of sex on self-esteem, the various suggested 
controls—some of which serve to heighten 
self-esteem differences between the sexes and 
some of which serve to lessen differences—do 
not appear to be sufficiently strong to alter 
the conclusion that self-esteem scores are 
distributed similarly among males and females. 


Classes of 1969 and 1977: Educational 
Variables and Self-Esteem 


Now we turn to correlations between self- 
esteem and measures of parental education, 
grades, college plans, and self-concept of school 
ability. We compare these correlations, ob- 
tained separately for males and females, with 
those in the Youth in Transition sample of 
boys in the high school class of 1969.2 Table 2 
shows these correlations. The basic import of 
Table 2 is that the correlations of self-esteem 
with the various measures of background, 
achievement, and aspirations are all similar 
among the three groups. None of the pairwise 
comparisons between groups showed a statistically
significant difference. In other words,
(a) the correlations for male seniors were stable
across the 8 years from 1969 to 1977 and (b)
the correlations were similar for both males
and females in the class of 1977.
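
One common way to make such a pairwise comparison is Fisher's r-to-z transformation for two independent correlations. The article does not say which test was used, so the following is only a sketch of that standard approach, applied to the correlations of self-esteem with grades for the 1969 and 1977 males; like the significance levels in the article, it takes the nominal sample sizes at face value and ignores the design effect.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


def two_tailed_p(z):
    """Two-tailed probability under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Grades and self-esteem: class of 1969 males versus class of 1977 males.
z = fisher_z_difference(0.21, 1715, 0.25, 1494)
p = two_tailed_p(z)
print(round(z, 2), round(p, 2))
```

For this pair the statistic falls well short of the conventional 1.96 cutoff, in line with the statement that no pairwise comparison was significant.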

This second finding is rather important. 
As noted earlier, Douvan and Gold (1966) 
concluded that the self-esteem of boys and 
girls depended to some extent on different 
components. The data in Table 2 suggest that
the self-esteem of male and female seniors does
not depend differentially on the measures
listed there.

The first correlational finding—that the
self-esteem of male seniors correlates with
educationally relevant measures about equally
for the classes of 1969 and 1977—provides
an extension and replication of part of the
findings of the previous report. Although it
does not deal with the longitudinal aspects of
that report, by replicating the cross-sectional
aspects it provides some added support for
the conclusions reached there.

It appears that self-esteem in high school
does depend to some extent on factors of
background, achievement, and aspiration. The
correlations in Table 2 are positive, and some
are at least moderately high (i.e., grades and
self-concept of school ability). (All of the
correlations in Table 2 are significantly
different from zero.)

As discussed above, the earlier longitudinal
analysis found that “educational success and
its correlates seemed to grow less central to
the self-esteem of young men... as they
moved through high school and beyond,” and
this was interpreted as “a fairly typical part of
the developmental sequence followed by
young people in this society” (Bachman &
O'Malley, 1977, p. 378). But an alternative
explanation was that there was an overall
secular trend involving a lessening importance
of education for self-esteem—a pattern that
would be reflected in reduced correlations
between education and self-esteem in later
cohorts of students. If the data presented in
Table 2 had shown lower correlations for the
1977 seniors (male) than for those from the
class of 1969, the secular trend interpretation
would have been supported. In fact, however,
Table 2 shows a striking degree of similarity
in the relationships for the two cohorts;
certainly the correlations for the class of 1977
are not at all lower than those for the class of
1969.

Thus, by failing to confirm the secular trend
interpretation, the present replication and
extension provides additional, albeit indirect,
support for a basic conclusion from the earlier
study: Educational success becomes less
central to self-esteem during late high school
and the years that follow. Furthermore, given
the high degree of similarity between male
and female self-esteem data in the class of 1977,
it is likely that this process holds true for both
sexes.


2 Because the Monitoring the Future sample design
eliminated high school dropouts, the present computations
for the Youth in Transition sample also eliminate
dropouts. The number of cases is thus
slightly different from the earlier report that included
dropouts (Bachman & O'Malley, 1977).
Deindividuation, Self-Awareness, and Disinhibition


Ed Diener 
University of Illinois at Urbana-Champaign 


This study was designed to discover whether lack of self-awareness, group 
unity, lack of conscious planning, and disinhibited behavior occur together in 
deindividuating settings, as predicted by Diener’s theory of deindividuation. 
Another purpose of the study was to compare the characteristics of deindivid- 
uation in groups with the characteristics of lack of self-awareness induced in a 
nonsocial way. Three conditions were compared: deindividuated, non-self- 
aware, and self-aware. After the manipulations, participants could choose in- 
hibited versus disinhibited tasks in a supposed “creativity” session, followed 
by a variety of deindividuation measures. When the dependent variables were 
factor analyzed, key components of deindividuation loaded together on a single 
common factor. The results also revealed that the deindividuation group
surpassed the other two on the deindividuation factor (p < .001) and on most of
the individual measures. For some of the variables, the deindividuation and
non-self-aware groups differed significantly, suggesting that deindividuation may
not be identical in every respect to lack of self-awareness induced in a
nonsocial way.


Festinger, Pepitone, and Newcomb (1952) 
suggested that persons may not be perceived
as individuals in some groups. They argued 
that these “deindividuated” people are freed 
from restraints and are likely to perform be- 
haviors they would usually inhibit. Zimbardo 
(1970) further theorized that there are certain 
input variables that cause deindividuation, 
certain inferred internal changes such as mini- 
mal self-evaluation, and certain “output be- 
haviors” that characterize the state. Spurred 
by Zimbardo’s theory, a number of studies 
have been conducted that have explored de- 
individuation in the laboratory (e.g., Baron, 
1971; Diener, Westford, Dineen, & Fraser, 
1973; Dion, 1971; Maslach, 1974), in field 
settings (e.g., Diener, Fraser, Beaman, &
Kelem, 1976), and cross-culturally (Watson,
1973). Past research, which has been reviewed
in detail elsewhere (Diener, 1977; Dipboye,
1977), has focused almost exclusively on the
relationship between specific input variables
and the resultant behavior. Thus, the majority
of the findings can be interpreted in terms of
discriminative stimuli for disinhibited behavior
without recourse to the concept of
deindividuation.

Diener attempted to provide direct support
for deindividuation by measuring the internal
changes that Zimbardo’s (1970) theory suggested
should accompany the situational inputs
(Diener, 1976; Diener, Dineen, Endresen, Beaman,
& Fraser, 1975). He manipulated factors
such as anonymity and responsibility and
measured aggressive behavior and
cognitive changes as well. He assessed, for
example, memory and self-consciousness. The
findings did not appear to offer strong support
for the theory. The cognitive measures in most
cases did not correlate with the overt disinhibited
behavior, and no strong central factor
was found when the measures were factor
analyzed. However, when the paradigm employed
in these studies is considered, it is not
surprising that deindividuation did not occur.
After subjects were exposed to the manipulations,
they were observed while they “tested a
pacifist” by being physically aggressive to
him. In such a laboratory situation, subjects
were undoubtedly quite self-conscious. In conclusion,
past findings on the behavioral consequences
of deindividuating conditions have
offered some support for the theory. However,
no study has found that subjective experiences
such as lack of self-awareness or feelings of
group unity follow deindividuating conditions
and covary together.


Cognitive Factors Underlying Deindividuation 


The theory of self-awareness (Wicklund, 
1975, 1979) has much in common with the 
theory of deindividuation. Duval and Wick- 
lund (1972) hypothesized that self-aware 
persons are more likely than non-self-aware 
persons to behave in accord with personal and
social standards. Much empirical evidence has
been accumulated that supports this formulation
(e.g., Carver, 1974, 1975; Diener & Srull,
1979; Diener & Wallbom, 1976; Rule, Nesdale,
& Dyck, 1975; Scheier, Fenigstein, & Buss,
1974). Given the pivotal position of attention
to oneself as a separate entity in both the
theories of self-awareness and deindividuation,
several authors have suggested that a conceptual
integration is in order (Diener, 1977;
Duval & Wicklund, 1972), and Ickes, Layden,
and Barnes (1978) have found empirical support
for such an integration.

Recently Diener (1979) advanced a theory
of deindividuation that borrows from the
theory of self-awareness, from behavioral
theories of self-regulation (Bandura, 1976;
Kanfer, 1977), and from earlier models of
deindividuation. In Diener’s theory, monitoring
of one’s own behavior and awareness
of oneself as an individual are necessary for
self-regulation (cf. Carver, in press) and are
prevented by certain factors present in some
groups. Diener has lessened the emphasis on
anonymity as a cause of deindividuation and
stressed instead group cohesiveness and uniformity
(Ziller, 1964), group activity, and an
outward focus of attention. Each of these
seems to be a likely cause of deindividuation
because it can diminish or prevent self-awareness.
According to this theory, the deindividuated
person is blocked by environmental
factors from becoming self-aware and
is thus less likely to regulate his or her behavior
in reference to personal and social standards
and long-term planning considerations, and
more likely to react to immediate cues, motivations,
and emotions. Although antisocial behavior
does not necessarily result from deindividuation,
the person is more influenced
by situational factors and less concerned with
standards.


Purposes of the Study 


The present study was designed to assess 
whether deindividuation is a useful construct 
and actually represents a unitary experiential 
and behavioral phenomenon. A major purpose 
of the study was to implement seemingly de- 
individuating conditions and measure both 
the behavior and the other changes that are 
defining characteristics of the concept. When 
in deindividuating circumstances, according 
to Diener’s theory, persons should be less self- 
aware, plan ahead less, feel close to the group, 
and be more disinhibited than usual. There- 
fore, persons in deindividuating situations 
should have higher scores on the average on 
each of the central measures of deindividuation 
than persons who are not in such settings. In 
addition, these characteristics (low self-aware- 
ness, group unity, and disinhibition) should 
occur together within deindividuated persons 
and form a single factor when factor analyzed. 
In the present study, conditions thought to 
produce deindividuation were maximized, and 
multiple measures of disinhibited behaviors 
and postulated intervening cognitive variables 
were employed. A number of improvements 
over previous studies (Diener, 1976; Diener 
et al., 1975) were introduced that seemed 
likely to increase the likelihood of occurrence 
of deindividuation. 
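
The prediction that the central measures "form a single factor" can be illustrated with a rough sketch. The code below is not the factor analysis used in the study: it simulates hypothetical scores for 126 subjects on four measures driven by one latent variable, then extracts the first principal component of their correlation matrix by power iteration as a simple stand-in for a one-factor solution. All data, loadings, and function names are assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def first_component(matrix, iterations=500):
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    k = len(matrix)
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    product = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(v * p for v, p in zip(vec, product))
    return eigenvalue, vec


# Simulated (hypothetical) data: one latent "deindividuation" score per subject,
# and four observed measures that each reflect it plus independent noise.
rng = random.Random(0)
latent = [rng.gauss(0, 1) for _ in range(126)]
measures = [[0.8 * f + 0.6 * rng.gauss(0, 1) for f in latent] for _ in range(4)]

corr = [[pearson(a, b) for b in measures] for a in measures]
eigenvalue, vector = first_component(corr)
loadings = [math.sqrt(eigenvalue) * v for v in vector]
share = eigenvalue / len(corr)  # proportion of total variance on the component
print([round(l, 2) for l in loadings], round(share, 2))
```

When the measures really do share one underlying source, as in this simulation, all loadings have the same sign and the first component carries most of the variance; unrelated measures would instead spread the variance across several components.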

A second purpose of the present study was 
to compare the characteristics and effects of 
group-induced deindividuation with nonsocially 
induced non-self-awareness. Although Diener’s 
(1979) theory maintains that deindividuating 
conditions block self-awareness, it seems likely 
that nonsocial conditions, for example, view- 
ing an exciting movie, can also minimize self- 
awareness. However, group deindividuating 
conditions may be more certain to prevent 
self-awareness because they not only direct
attention away from the self, but also focus 
attention explicitly on the group. Thus, atten- 
tion may be less likely to drift back to the self. 

A last question addressed by the present 
study is, What are the characteristics of de- 
individuation beyond those mentioned earlier? 
Although group unity and lack of self-aware- 
ness are defining characteristics of deindivi- 
duation, many additional components have 
been suggested. For example, Festinger et al. 
(1952) suggested memory impairment re- 
garding the acts of individual persons. Zim- 
bardo (1970) suggested several additional 
characteristics such as amnesia, perceptual 
distortion, and emotionality. 

In summary, the major purpose of the 
present study was to create a situation that 
appeared to be deindividuating and to assess 
whether the characteristics, both overt and 
covert, predicted by the theory result from 
this situation. A factor analysis of the mea- 
sures was planned to reveal whether the hy- 
pothesized components of deindividuation 
occur together within individuals. Another 
purpose of the study was to compare the 
characteristics of deindividuation and lack of 
self-awareness resulting from nonsocial causes. 
Finally, the study was to ascertain the charac- 
teristics of deindividuation beyond those that 
define the state, for example, whether altered
state experiences accompany deindividuation.


Method 
Overview 


The experiment was a 2 × 3 (Sex × Condition) design
with the three conditions being deindividuation,
non-self-aware, and self-aware. Participants were 59
male and 67 female subjects with 42 persons in each of
the three conditions. After a half hour of group activities
that were designed to create the three levels of
group feeling and self-awareness, subjects were allowed
to participate in activities that might “enhance their
creativity for a later creativity test.” This subsequent
session, which was labeled the “individual” session for
subjects, was actually the session in which evidence of
the behavioral disinhibition was gathered. In the “individual”
session participants could choose from disinhibited
or inhibited tasks. Subjects were also asked to
make creative verbal statements (which could be inhibited
or disinhibited) at certain times during the
session. These two activities following
the group session served as the measures
of disinhibition. After these activities, participants
completed memory measures and an inventory assessing
self-reported self-consciousness, group unity, and so on.


Setting


A group of eight same-sex introductory psychology
students participated in each session, but unknown to
the two real subjects, six of the persons were confederates
who manipulated group factors in a preprogrammed
way. Although the experimenter was always
a male, he was assisted in running the experiment by
two persons who were the same sex as the confederates
and the subjects at that session. The eight participants
(six confederates and two subjects) were either all
males or all females at any one session. The experiment
was run in a large room during evening hours and was
said to be concerned with creativity. Subjects were
told that the first session was a series of group activities
designed to limber their minds up and get them
in touch with their feelings. This group session provided
the context for the manipulations. Subjects were
told they would next work in an “individual session” in
which they would all still be together, but in which
each participant could choose the tasks he or she
wanted to work at. Other sessions that would test
creativity were alluded to later in the period. In other
words, the first two sessions were presented as though
they were designed to enhance creativity. Following
these sessions, however, the two subjects actually completed
the memory measure and questionnaire.


Conditions 


A series of activities was developed for the three
conditions. The manipulations for each condition
during the group session lasted about 30 minutes.
Male subjects only were run on Monday and Wednesday
evenings and females on Tuesday and Thursday
evenings, with the conditions being systematically
rotated over evenings. The study occurred over an
8-week period. Subjects were phoned a day ahead of
their appointment and asked to wear old clothes to
the study.

Self-awareness condition. Subjects in the self-aware
condition were exposed to a group atmosphere and
activities designed to heighten feelings of individuality
and self-awareness. In the waiting room the confederates
acted fidgety and self-conscious. After the
initial instructions the group was divided into a larger
group (the confederates) and a smaller group (the real
subjects) and taken to separate rooms to work on
different tasks. It was stated that later all subjects
would rotate in pairs through the one task, and in the
meantime the others would complete some questionnaires.
Once in the experimental room, the two subjects
were asked to put on mechanics’ long-sleeved
coveralls to “protect their clothes” during the possibly
messy later activities.

The two subjects were given name tags, seated at a
large table, and then exposed to a series of manipulations
designed to heighten their self-awareness. These
were presented as ways of helping them get in touch
with feelings and their inner self. Subjects, addressed
by name by the experimenter, first listened to music
and answered questions about how it matched their
personality characteristics. Next they wrote an essay
about themselves, covering such topics as what makes
them unique. They also listed their hometown, hobbies,
personal motto, and other individuating information.
They next read all the personal information to the other
subject (being allowed to omit anything they felt was
too personal). The subjects then answered questions
about themselves compared to the other subject, for example,
“Who do you think is probably more creative?”

At this point the research assistant brought in the
six confederates who had ostensibly been working at a
different task in another room. The assistant said that
they had been told about the coveralls but that they
preferred not to wear them. Since the phone caller who
reminded them of their appointment had asked them
to wear old clothes, they were not concerned if their
clothes got dirty. Several confederates agreed that
they preferred not to wear the coveralls and the experimenter,
seemingly surprised, allowed them to
proceed without coveralls if they weren’t “worried if
their clothes get dirty.” Therefore, the two subjects
stood out from the rest of the group because of their
attire. The experiment next entered the “individual”
phase of the study. At this point, the participants in
the self-aware condition were dressed differently from
the six confederates, had been exposed to a series of
individuating tasks, and were part of a noncohesive
group.

Non-self-aware condition. The confederates in this
condition were pleasantly friendly but not outgoing.
Subjects were exposed to a series of manipulation activities
designed to focus their attention outward. The
confederates and subjects performed the activities as
a group throughout this condition and were all dressed
in the coveralls. For example, they listened to music
and rated its qualities and also rated jokes. They wrote
about how universities would be different in 100 years.
Subjects also wrote a rebuttal to a letter that criticized
university students. Next they worked on a series of
challenging and fun games and puzzles. The emphasis
in each activity was to relax and become more creative,
and it was stated that none of the activities would be
evaluated.

During the “individual” session these subjects were
asked to press a pedal rhythmically throughout the
session. Although this was said to be a trial activity to
help judge the effects of rhythm on creativity, it was
actually designed to be slightly distracting and thus
reduce self-awareness (Duval & Wicklund, 1972) as the
session continued. No names or name tags were used in
this condition. Thus, this condition was designed to
involve subjects in activities that would not accentuate
individuality but would direct their attention outward
away from themselves. No competition was created
within the group, but neither was a spirit of group
unity generated.

Deindividuation condition. In the waiting room the
confederates were friendly and attempted to create
group feeling in this condition. They spoke to the
entire group, not to single individuals. The experimenter
asked them to adopt a group name, and he then
addressed the group by this name. He said this name would
be used to label their data, which were to be “treated
as a group, not on an individual basis.” The group was
given the coveralls, and the confederates made putting
them on a unifying and fun event. The first activities
were designed specifically to draw the group together.
Group members sang together and then did “elevation”—getting
“psyched up” and together lifting an
assistant high in the air. The group next formed a
circle by interlocking arms and were told to prevent
one of the assistants from breaking into the circle. The
next activity, African dancing, was designed to continue
the group-cohesive activities, but was also intended
to produce the arousal, group coordination, and
kinesthetic sensory experience that seems to be a part
of many groups in which disinhibition occurs (Sargant,
1975). During the loud Burundi drum music, the group
clapped in unison, swayed in unison, and danced around
in a circle together. The lights in the room were dimmed.
Thus, subjects in this condition were exposed to a
variety of manipulations designed to create a unified
group. To carry on the group feeling during the individual
disinhibition session (in which subjects were
seated around a large table) the confederates made
occasional remarks to the group.


Measures 


There were four major sets of measures: (a) activity 
disinhibition during the individual task, (b) speech 
disinhibition during the individual task, (c) memory 
measures taken after the “individual” session, and (d) 
a postexperimental questionnaire measuring group
feeling, self-consciousness, altered state experiences, 
and so on. 

Disinhibition tasks. Immediately after the manipula- 
tion session in which varying levels of group feeling and 
self-consciousness were induced, subjects moved to the 
“individual” session. They made their choices of the 
tasks they wanted to do, and each took a seat around 
a large square table. In front of each participant was a
cue light, a microphone, and a list of the activity boxes
available for this part of the experiment.

It was desirable to use tasks that were different in 
their social appropriateness but roughly equivalent 
in interest and fun. Two sets of tasks were created, dis- 
inhibited tasks in colorful boxes and inhibited tasks 
in dark brown boxes. In a pilot study 12 male and 13 
female undergraduates had rated a large list of potential 
tasks on a scale for inhibition-disinhibition and a scale 
for boring/aversive versus interesting/fun. Twenty dis- 
inhibited tasks were selected and matched with 20 in- 
hibited tasks that averaged the same score on interest- 
ing/fun. Since there were sex differences in the ratings, 
the 40 tasks were selected so that these differences were 
minimized on both the interest and disinhibition di- 
mensions. On the 9-point disinhibition scale, the 20 
inhibited tasks had a mean of 3.15, and the 20 disin- 
hibited tasks averaged 6.45. However, the interesting/ 
fun value of the tasks was similar (5.40 vs. 5.25). Ex- 
amples of tasks and their ratings are shown in Table 1. 

Table 1 
Examples of Tasks and Ratings 

Task                                               Interesting/fun ratingᵃ   Disinhibition ratingᵇ 

Inhibited task 
  Crossword puzzles                                        4.34                    2.39 
  Reading about disarmament                                6.64                    2.00 
  Answering moral dilemma questions                        5.57                    4.01 
  Playing “Hi-Q”                                           3.83                    3.71 
  Answering comprehension questions about a story          6.41                    3.17 

Disinhibited task 
  Playing in mud                                           4.84                    7.18 
  “Finger painting” with your nose                         4.93                    7.71 
  Writing the faults of your friends                       6.24                    5.03 
  Sucking liquids from baby bottles                        6.34                    6.84 
  Writing down all the obscenities you can think of        5.80                    5.52 

ᵃ Low = fun. ᵇ Low = inhibited. 

The tasks were grouped together in sets of four, 
and these tasks all came together in a box. For example, 
an inhibited box labeled Intellectual challenges contained 


four tasks such as “Hi-Q,” pencil mazes, and so on. 
There were five types of boxes for the inhibited tasks 
and five types of boxes for the disinhibited ones. Each 
type of box had a label that broadly described the 
tasks inside (e.g., Unusual foods, Desecration, and 
Public issues). These box labels were also selected based 
on preratings so that they differed on rated disinhibi- 
tion (7.9 vs. 4.16) but not on interesting/fun (4.2 vs. 
4.38). 

Subjects were told that during the individual session 
(lasting 20 minutes) they could choose any box they 
liked. They could see the boxes and their labels but did 
not know the contents (tasks) unless they selected that 
box. They could work in a box for as long or as short 
a time as they liked and do any number of tasks within 
it. When they tired of that box, they were to select 
another. Initial selections were made on a sheet of 
paper that contained a list of all the inhibited boxes in 
one column and all the disinhibited tasks in the other. 
Subjects were asked to make three ranked choices ini- 
tially in case “we run out of a particular box.” These 
three initial choices were weighted by their order to 
produce a measure of disinhibition of the initial choices. 
The experimenter and an assistant brought the new 
boxes when they were desired. 
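
The order weighting of the three initial choices can be sketched as follows. This is only an illustration: the weights (3, 2, 1) are an assumption, since the weights actually used are not stated, and the example ratings simply reuse the mean label ratings reported above (7.9 for disinhibited and 4.16 for inhibited box labels).

```python
def weighted_choice_score(ratings, weights=(3, 2, 1)):
    """Weighted mean of the disinhibition ratings of the ranked choices.

    ratings: disinhibition ratings of the 1st, 2nd, and 3rd choice boxes.
    weights: hypothetical order weights (an assumption, not from the study).
    """
    if len(ratings) != len(weights):
        raise ValueError("one rating per ranked choice is required")
    total = sum(w * r for w, r in zip(weights, ratings))
    return total / sum(weights)


# Hypothetical subject: a disinhibited box first, then two inhibited boxes.
score = weighted_choice_score([7.9, 4.16, 4.16])
print(round(score, 2))
```

Under these assumed weights, a subject whose first choice is disinhibited scores higher than one who lists the same three boxes with the disinhibited box last.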

Subjects were told before the individual session that 
they should feel free to choose either type of box, since 
both “can probably enhance creativity.” They were told 
that in the dark boxes were intellective tasks that 
could stimulate their minds and get them thinking 
quickly. They were reminded that quick thinking is 
an essential part of creativity. The experimenter said 
that the tasks in the colorful boxes could also enhance 
creativity by helping them get in touch with feelings 
and sensations. It was repeated that they should feel 
free to switch boxes whenever they liked and that they 
could choose any box or combination of boxes they 
wanted. 

Two confederates seated at the far end of the table 
from the subjects recorded which boxes and tasks the 
subjects chose. Since these confederates were apparently 
working in boxes requiring writing, their writing did 


not seem strange and never attracted attention. The recording 
confederates also wrote down a description of 
the subject’s behavior. These descriptions were later 
rated for the degree of disinhibition exhibited by the 
subject by raters who were blind as to the subject’s 
condition. Each confederate chose the same boxes regardless 
of condition and did the same tasks in each 
condition. Four of the six confederates initially chose 
disinhibited boxes, and then they switched tasks in a 
preprogrammed regular way, the overall new choices 
being approximately half disinhibited boxes. 

Speech measure. Subjects were asked to make 
several “creative statements” into a microphone during 
the individual session. These statements were 
made whenever a cue light came on in front of the 
person. The order of lights was always the same and 
allowed each subject to make three statements and 
each confederate to make two. The confederates made 
the same standard statements in all conditions, 
some of their statements being rather unusual, creative, 
or disinhibited, and some being mundane and [illegible]. 
An example of a statement is, “This reminds me of 
when I was young, exploring new things and [illegible] 
crazy. I feel like a child again without a care in the 
world.” 

Subjects were told in the self-aware condition that 
their statements would be individually recorded. In 
the non-self-aware condition it was mentioned [illegible] 
that the statements would be recorded. In the deindividuation 
condition it was stated that all the group would 
be recorded together and labeled with the [illegible]. 
The recorded statements of the subjects were rated by 
two raters who scored the subject’s three statements for 
inhibition-disinhibition. The ratings for each subject 
were averaged across the three speeches. The 
raters’ scores were correlated across subjects, and since 
the interrater agreement was high (r = [illegible]), the 
average score for the two raters was assigned to each 
subject. 
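
The scoring of the speech measure can be sketched as below, assuming each rater supplies three ratings per subject. The function names and the example ratings are hypothetical and serve only to show the two steps: averaging within rater, checking agreement across subjects, and then averaging the two raters.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def speech_scores(rater_a, rater_b):
    """Return (interrater r, per-subject score averaged over both raters).

    rater_a[i] and rater_b[i] hold the three speech ratings for subject i.
    """
    mean_a = [sum(s) / len(s) for s in rater_a]
    mean_b = [sum(s) / len(s) for s in rater_b]
    agreement = pearson_r(mean_a, mean_b)
    combined = [(a + b) / 2 for a, b in zip(mean_a, mean_b)]
    return agreement, combined


# Hypothetical ratings for three subjects on the 9-point scale.
rater_a = [[2, 3, 4], [5, 6, 7], [8, 8, 8]]
rater_b = [[3, 3, 3], [6, 6, 6], [7, 8, 9]]
agreement, combined = speech_scores(rater_a, rater_b)
```

Averaging the raters is reasonable only when agreement is high, which is why the correlation is computed first.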

Postexperimental questionnaire. After the individual 
session, subjects were taken to a different room, where 
research assistants administered the postexperimental 
Table 2 
Factor Loadings for Two Orthogonal Factors 

                                                        Loading 
Variable                                         Factor 1ᵃ     Factor 2ᵇ 

Reported self-consciousness                      [illegible] 
Tasks completed disinhibition score              [illegible] 
Initial boxes choice (weighted)                     .58 
Spontaneity versus conscious planning               .54 
Time seemed to go quickly                           .53 
Lack of concern for what other subjects 
  would think of you                                .53 
Liking for the group                             [illegible] 
Feeling of group unity                           [illegible] 
Sum of unusual experiences reported                            [illegible] 
Feeling of group unity                                            .61 
Altered state self-report                                         .59 
Liking for the group                                              .58 
Subjective loss of individual identity                            .44 
Memory for confederate statementsᶜ                  .07          −.01 
Match statements to picturesᶜ                       .16           .07 
Recognition of own statementsᶜ                      .14           .03 
Subjective feeling of anonymityᶜ                    .03           .39 
Disinhibition of speechᶜ                         [illegible]      .04 

ᵃ Deindividuation. ᵇ Altered experiencing. ᶜ Did not load above .4 on either factor. 


questionnaire and memory measures to them. The 
questionnaire contained a series of items, most rated 
on 9-point bipolar scales: (a) subjective feeling of time 
passage; (b) self-reported self-consciousness; (c) 
memory for the tasks the person next to them did 
(essay format); (d) perceived similarity to other 
participants and feeling of group unity and liking; (e) 
enjoyment of the session; (f) reports of altered perceptions, 
thinking, and emotions; (g) reports of an altered 
state of consciousness (yes or no answer) and other 
unusual experiences such as hallucinating; (h) the 
extent to which their behavior was guided by conscious 
planning versus flowing with the environment and 
making spontaneous choices; (i) memory measures for 
peripheral cues in the room such as the content of 
posters on the walls; and (j) suspicion measures, including 
suspicion of the confederates and guessing the 
purpose of the experiment. 

Memory measures. Each subject was asked to select 
the photographs of persons who were in their 
group and of persons not in their group. There were 
pictures of all six confederates and six filler photos of 
other undergraduates. Subjects were required to sort 
all of the pictures into two piles. Thus, a subject could 
score from 0 to 12 pictures correct. The same procedure 
was carried out for statements that had been made 
during the session. Ten correct confederate statements 
(of the 12 made) written on cards were presented 
together with 10 filler statements that had not been 
said, and subjects were asked to sort them all into two 
piles. Next, the 6 veridical pictures and the 10 actual 
statements were given, and subjects were asked to 
match all the statements with the picture of the person 
who said them, even if they had to guess on some. 
Subjects could not keep track of how many statements 
each of the confederates made while they were involved 
in the task. Also, they did not necessarily receive two 
statements from each confederate. Thus, participants 
could match from zero up to several statements to a 
particular confederate. Therefore, the task was really 
one of matching each statement to the confederate 
who was most likely to have made it. Lastly, subjects 
were given six statements and asked to select the three 
they had made. 
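
The two-pile sorting score (0 to 12 for photographs) can be sketched as a count of items placed in the correct pile. The identifiers below are hypothetical labels for the six confederate and six filler photographs; the same function would apply to the 20 statement cards.

```python
def sorting_score(all_items, true_members, chosen_as_members):
    """Count items sorted into the correct pile.

    An item is correct if it is a true group member placed in the
    "in my group" pile, or a filler left out of that pile.
    """
    return sum(
        (item in true_members) == (item in chosen_as_members)
        for item in all_items
    )


confederates = {"c1", "c2", "c3", "c4", "c5", "c6"}
fillers = {"f1", "f2", "f3", "f4", "f5", "f6"}
photos = sorted(confederates | fillers)

# Hypothetical subject: misses one confederate and wrongly includes one filler.
chosen = {"c1", "c2", "c3", "c4", "c5", "f1"}
score = sorting_score(photos, confederates, chosen)
print(score)  # 10
```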


Ethical Precautions 


Since the study involved deception and some unusual 
activities, a series of ethical precautions were used, and 
these are briefly described here. The initial instructions 
emphasized subjects’ right to withdraw at any time 
and their right to refuse to do anything they did not 
wish to do in the experiment. Subjects were cautioned 
not to engage in any physical activities contraindicated 
by health problems. The confederates intervened if any 
subject’s behavior became too extreme during the 
individual session (e.g., eating mud). The debriefing 
was conducted carefully and took 15-20 minutes. 
Each subject was introduced to facts about the study 
in a graduated way as recommended by Mills (1976), 
was promised anonymity, and his or her feelings about 
the study were discussed. In general, the ethical guide- 
lines recommended by Diener and Crandall (1978) 
were followed. Many subjects afterward said it was the 
most interesting study in which they had participated. 

Results 
Factor Analysis 


Sixteen of the major variables were subjected 
to a principal axis factor analysis with mul- 
tiple correlations employed as the communality 
estimates. A solution with two factors was 
chosen based upon the scree of the eigenvalues 
(Cattell, 1966), the interpretability of the 
data, and the factor loadings. The solution 
was subjected to an orthogonal (varimax) 
rotation. For both factors the items with 
absolute weights above .4 are shown in Table 
2.¹ The factor loadings reflect simple structure 
except for the case of group liking and group 
unity, which loaded substantially on both 
factors. The five variables that did not load 
above .4 on either factor are also shown in 
Table 2. The major factor contains the key 
elements of deindividuation, including group 
unity and liking, low self-awareness, spon- 
taneity, and disinhibited behavior. The second 
factor also contained group unity and liking, 
but was otherwise composed of altered ex- 
perience items. Thus, across subjects the key 
variables formed two major independent 
factors, one that can be labeled Deindividua- 
tion and the other that can be labeled Altered 
Experiencing. It appears that the memory 
items were mainly those that did not load 
on either factor. The first-order correlation 
between conscious planning and group unity 
was r(124) = −.45, p < .001. The correlation 
between the disinhibition score of tasks chosen 
and self-awareness was r(124) = .38, p < .001. 
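
As a check on the reported significance levels, a correlation with df degrees of freedom can be converted to a t statistic with t = r·sqrt(df / (1 − r²)). The sketch below applies this standard conversion to the two correlations above; the comparison value of roughly 3.4 for a two-tailed test at p < .001 with about 120 degrees of freedom is taken from standard t tables, not from the article.

```python
from math import sqrt


def t_from_r(r, df):
    """t statistic for testing a Pearson correlation against zero."""
    return r * sqrt(df / (1.0 - r * r))


t_unity = t_from_r(-0.45, 124)  # conscious planning and group unity
t_aware = t_from_r(0.38, 124)   # task disinhibition score and self-awareness

print(round(t_unity, 2), round(t_aware, 2))
```

Both absolute values exceed the tabled criterion, which agrees with the p < .001 reported for each correlation.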

Effect of Conditions 


The two factor scores for each subject were 
standardized, and a multivariate analysis of 
variance (MANOVA) was performed on the two 
scores. These data were collapsed across sex 
because this factor produced negligible differences 
on most variables. Thus, the MANOVA 
was computed on the three conditions with 
the two dependent variables. Pillai’s Trace 
Criterion V transformed to a standard F revealed 
that the three conditions were significantly 
different, F(4, 246) = 8.79, p < .001. 
This method is one of several for calculating 
a MANOVA significance test, and is suitable for 
the present data (Olson, 1976). A multivariate 
extension of omega-square revealed that condition 
accounted for 23% of the variability in 
the dependent measures. The mean standardized 
factor scores and the univariate tests on 
the factors are shown in Table 3. Newman-Keuls 
comparison of means (significant differences 
designated by the subscripts in Table 3) 
reveals that the deindividuation group scored 
significantly higher on Factor 1 than the other 
two groups, who did not differ significantly 
from each other. On Factor 2 the deindividuation 
group was significantly higher than the 
self-aware group only. 
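
The degrees of freedom of the reported F can be reproduced from the usual F approximation for Pillai’s trace. The sketch below is an illustration under two assumptions: a total of 126 subjects, inferred from the r(124) correlations reported earlier, and a trace value V = 0.25, which is hypothetical because the article does not report V.

```python
def pillai_f(V, n_total, n_groups, n_dvs):
    """Standard F approximation for Pillai's trace V.

    Returns (F, df1, df2) for a one-way MANOVA with n_groups conditions
    and n_dvs dependent variables.
    """
    s = min(n_dvs, n_groups - 1)
    m = (abs(n_dvs - (n_groups - 1)) - 1) / 2
    n = (n_total - n_groups - n_dvs - 1) / 2
    df1 = s * (2 * m + s + 1)
    df2 = s * (2 * n + s + 1)
    F = (df2 / df1) * V / (s - V)
    return F, df1, df2


# n_total = 126 is inferred, V = 0.25 is a hypothetical illustration.
F, df1, df2 = pillai_f(V=0.25, n_total=126, n_groups=3, n_dvs=2)
print(df1, df2)  # 4.0 246.0
```

With three conditions and two factor scores, the approximation yields 4 and 246 degrees of freedom, matching the F(4, 246) reported above.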
The factors were also subjected to a discriminant 
analysis, and the discriminant functions 
were used to predict the probability that 
each subject was a member of each condition 
(Cooley & Lohnes, 1962). This procedure 
properly classified 74% of the deindividuated 
subjects (gave the largest probability that 
they were in the deindividuated group), [illegible] 
of the non-self-aware subjects, and 60% of the 
self-aware subjects. This procedure predicted 
that 33% of the non-self-aware subjects 
belonged to the deindividuated group and that 
45% belonged to the self-aware group. It 
appears that some subjects in the non-self-aware 
group reacted in a manner similar to 
the deindividuation subjects and some in a manner 
similar to the self-aware subjects. 
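
The classification rule described here, assigning each subject to the condition with the largest predicted probability and then tabulating actual against predicted condition, can be sketched as follows. The probabilities are hypothetical and the functions are illustrative, not the procedure of Cooley and Lohnes (1962) itself.

```python
CONDITIONS = ("deindividuated", "non-self-aware", "self-aware")


def classify(probabilities):
    """Return the condition with the largest membership probability."""
    return max(CONDITIONS, key=lambda c: probabilities[c])


def classification_table(actual, predicted):
    """Count subjects by actual condition (outer key) and predicted condition."""
    table = {a: {p: 0 for p in CONDITIONS} for a in CONDITIONS}
    for a, p in zip(actual, predicted):
        table[a][p] += 1
    return table


# Hypothetical membership probabilities for three subjects.
posteriors = [
    {"deindividuated": 0.70, "non-self-aware": 0.20, "self-aware": 0.10},
    {"deindividuated": 0.45, "non-self-aware": 0.35, "self-aware": 0.20},
    {"deindividuated": 0.15, "non-self-aware": 0.25, "self-aware": 0.60},
]
actual = ["deindividuated", "non-self-aware", "self-aware"]
predicted = [classify(p) for p in posteriors]
table = classification_table(actual, predicted)
```

In this toy example the non-self-aware subject is assigned to the deindividuated group, the kind of misclassification summarized by the percentages above.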

Univariate data and analyses of variance 
on the individual variables are also shown in 
Table 3. Although sex of subject did produce 
significant main effect differences on a few of 
the individual variables, it did not result in 
significant interactions with condition, and 
these results tended not to be theoretically 
interesting. For this reason the sex data and 
analyses are omitted from Table 3. Although 


¹ A stable factor-analytic solution depends on having 
a large number of subjects relative to the number of 
items to be factored. Therefore, a number of items 
were not included in the factor analysis, bringing the 
number of factored items to 16. The 10 items that were 
excluded either were redundant with other items or 
were of lesser importance to the theory of deindividuation. 
The excluded items were rated disinhibition of 
behavior; concern over what experimenter would think 
of you; similarity to the group; enjoyment of the 
group session; emotions, thinking, and perceptions 
altered from normal (3 items); memory for confederate 
photographs, for peripheral cues, and for 
tasks neighbor did (3 items). 
Table 3 
Mean Scores for Factors and Individual Variables 

                                                              Condition 
Variable                                      Deindividuated  Non-self-aware  Self-aware    Univariate F 

Factor 1 standardized scores 
  (Deindividuation)                            [illegible]     [illegible]    [illegible]   [illegible] 
Factor 2 standardized scores 
  (Altered experiencing)                         −.37            .02            .35          4.69* 

Rated disinhibition of behavior                [illegible]     [illegible]    [illegible]   [illegible] 
Tasks completed disinhibition score            [illegible]     [illegible]    [illegible]   [illegible] 
Initial boxes choice (weighted)                [illegible]       5.8            6.0         [illegible] 
Disinhibition of speech                          6.5             3.7x           2.6y        [illegible] 
Reported self-consciousness                      3.8             4.4          [illegible]   [illegible] 
Conscious planning not spontaneityᵃ              6.0             4.3          [illegible]   [illegible] 
Loss of individual identityᵃ                     24%           [illegible]    [illegible]   χ² = 7.00* 
Subjective feeling of anonymityᵃ                 4.0             4.5            4.8         [illegible] 
Concerned other subjects think of youᵃ           6.3             6.0            5.2          2.39 
Concerned experimenter thinks of youᵃ            7.2             6.6            6.8           .60 
Liking for the groupᵃ                          [illegible]       3.9y           4.9          9.36*** 
Feeling of group unityᵃ                          3.2             4.6x         [illegible]   14.08**** 
Similarity to groupᵃ                             4.4           [illegible]      5.8          3.56* 
Enjoyed group sessionᵃ                         [illegible]       4.5            4.3          6.42** 
Emotions altered from normal                   [illegible]       16%            14%         χ² = 5.87 
Time seemed to go quicklyᵃ                       3.2             3.3            3.9          2.00 
Thinking seemed altered from normal              6.1             5.5            6.1           .83 
Altered state self-report                        39%             28%            21%         χ² = 2.83 
Sum of unusual experiences reported              1.2             1.0          [illegible]    3.00 
Altered perceptions                              56%             28%y           34%y        χ² = 6.52 
Memory for confederate statements               13.9            14.8           15.0          3.38* 
Memory for photographs                           8.8             8.6            8.1y         3.20* 
Match statements to pictures                     3.9             4.2            4.2         [illegible] 
Memory peripheral cues                         [illegible]       1.4            1.8          3.92* 
Recognition of own statements                    5.9             5.9            5.9         [illegible] 
Memory for tasks neighbor did                  [illegible]     [illegible]    [illegible]    1.55 


Note. Multiple comparisons conducted only where main 
effect was significant. Means across the same row 
with the same subscripts in the same letter series are not 
significantly different (Newman-Keuls or χ²) at 
p < .01 (abc) or p < .05 (xyz). 
ᵃ Items scored so that a low score indicated a greater 
magnitude of the item as labeled. 
* p < .05. ** p < .01. *** p < .001. **** p < .0001. 


the MANOVA was highly significant, the univariate 
alpha levels should be treated cautiously, 
since there are a large number of nonindependent 
tests presented. In addition to the 
traditional F values shown in Table 3 for each 
variable, the results of Newman-Keuls multiple 
comparisons between the three means are 
shown in cases where the overall F was 
significant. 

The variables shown in Table 3 are grouped 
together into classes of phenomena. Clearly, 


the largest differences between the experi- 
mental conditions came on the group feeling 
items and on the disinhibition items. The 
measures of disinhibition showed differences 
between the three conditions that were highly 
significant, and the Newman-Keuls analyses 
indicated that for these variables, all three 
groups were significantly different from the 
others. For the group feeling items, each of 
the three conditions also tended to be signifi- 
cantly different from the other conditions. 
The self-consciousness and loss of identity 
items also showed differences between groups. 
The deindividuation group was significantly 
different from both of the other two groups on 
two of three key items. Although feelings of 
anonymity and concern over the evaluation of 
others did not reach an alpha level of .05, the 
deindividuation group did score in the pre- 
dicted direction on these items. The smallest 
differences between conditions were on the 
altered state and memory items. It can be seen 
in Table 3 that most of the altered state items 
were not significant, although the deindividua- 
tion group was most extreme on these varia- 
bles. For the memory items, the deindividua- 
tion group had the worst memory for the state- 
ments of the confederates and for cues in the 
room, but the best memory for the faces of the 
confederates, which is similar to the pattern 
found by Diener and Kasprzyk (Note 1). 
Finally, it can be noted that subjects’ recogni- 
tion of their own statements was almost perfect 
in each condition. 


Analyses for Artifacts 


There were several potential problems in the 
present study, and these should be analyzed. 
One question concerns the unit of statistical 
analysis used earlier. Individuals, not pairs of 
subjects, were employed in the analyses al- 
though the pairs of individuals might not be 
independent, since they could have influenced 
each other. This seems unlikely, since the 
majority of the group were confederates, and 
they set the tone of the group by acting 
similarly to each other. However, the possibility 
that subjects within the subject pairs 
influenced each other was checked by corre- 
lating within conditions each factor score for 
pairs of subjects run together. Thus, correla- 
tions for each of the conditions were derived 
that represented the similarity of the responses 
of pairs of subjects run together. These corre- 
lations were all near zero and averaged .06, 
revealing that the data for each subject were 
independent of the other subject’s scores. 

Another question concerns suspicion. When 
asked to guess the purpose of the study, fewer 
than 15% of the subjects had any idea beyond 
a study of creativity, and only a few subjects 
guessed anything related to the true purpose. 
Similarly, when asked about their suspicion of 
the confederates, the vast majority of subjects 
checked the choice saying they honestly never 
suspected any of the confederates. There were 
virtually no differences between conditions 
on the suspicion items. The two suspicion 
measures did not correlate with the number of 
weeks (rs = .079 and −.069), indicating that 
suspicion was not increasing during the 
semester as subjects were becoming more 
knowledgeable about psychology. This also 
suggests that leakage about the study into the 
subject pool was not occurring to any extent, 
which is consistent with past findings (Diener, 
Matthews, & Smith, 1972). Finally, the correlations 
between subjects’ factor scores and 
their suspicion scores were low, ranging from 
−.06 to .08, with a mean of .01. Thus, there 
appeared to be low levels of suspicion, and 
what little suspicion there was did not seem to 
influence the data. 


Discussion 


The present results support the importance 
of the deindividuation construct, which relates 
social situational factors to lack of self-awareness 
and in turn relates lowered self-awareness 
to disinhibited behavior. The factor 
analysis revealed that characteristics such as 
lack of self-awareness, lack of conscious 
planning, group unity, disinhibited behavior, 
and lack of concern about what others would 
think of the subject all formed a single central 
factor. A comparison of factor scores revealed 
that subjects exposed to the deindividuation 
manipulation scored significantly higher than 
the other two groups on this factor. Thus, the 
two types of analyses converged in suggesting 
the validity of the deindividuation construct. 

It should be noted that there was a significant 
first-order correlation between disinhibited 
behavior and lack of self-consciousness. This 
indicates that less self-aware participants 
tended to exhibit more disinhibited behavior. 
The question arises as to whether low self-awareness 
caused disinhibited behavior or 
the reverse. This question cannot be answered 
with certainty because of the correlational 
nature of the findings. However, the correlation 
of −.35 between self-awareness and disinhibition 
of the initial activity choice (before any 
activities were done) suggests that the deviant 
activity did not cause the lowered self-awareness. 
We know from the self-awareness literature 
that persons with lower self-awareness 
are less likely to adhere to norms, and therefore 
we can conjecture that the lower self-awareness 
exhibited by deindividuated subjects 
could certainly have had some causal role 
in disinhibiting behavior. 

The present findings clearly support Diener’s 
(1979) theory of deindividuation, which integrates 
the concepts of self-awareness and self-regulation 
with earlier conceptions of deindividuation. 
Lack of self-awareness and lack of 
conscious planning, group unity, and disinhibited 
behavior did occur together within 
individuals. Deindividuating conditions also 
produced these changes in subjects. The negative 
correlation between self-awareness and 
behavioral disinhibition also supports the 
theory of self-awareness (Wicklund, 1975, 
1979). According to the theories of self-awareness 
and deindividuation, lowered self-awareness 
leads to lessened self-regulation because 
the person is less aware of behavior-norm discrepancies 
and less likely to plan for the future. 
The present results are compatible with these 
formulations. 

The study provides a less clear answer to 
the second question addressed: whether deindividuation 
is similar to a lack of self-awareness 
occurring because of nonsocial causes. The 
deindividuation group was significantly higher 
on Factor 1 than the non-self-aware group, 
which did not differ significantly from the self-aware 
group. It can be noted that non-self-aware 
subjects, although somewhat less self-aware 
than self-awareness condition subjects, 
did not more frequently report a loss of identity 
or lower conscious planning. In addition, classification 
of subjects based on the discriminant 
function revealed that only a third of the non-self-aware 
subjects were grouped together with 
the deindividuation participants. Social causes 
of lowered self-awareness may be more likely to 
release disinhibited behavior because the person 
can perform novel or antinormative behaviors 
and not become self-aware, whereas the person 
whose attention is focused outward by nonsocial 
factors can easily focus back on the self 
when confronted by norm-related choices or 
a novel situation. It is also possible that a 
unique characteristic of deindividuation is 
that the person not only loses a sense of individual 
identity but also goes along with the 
direction of group behavior. 

The answer to whether deindividuation is 
accompanied by altered experiences is left uncertain 
by the present data. The deindividuation 
group differed significantly from the non- 
self-aware group on the second factor, which 
included several altered state items. About 
40% of deindividuated subjects answered “yes” 
to the statement, “Felt like I was in an altered 
state of consciousness in which my perceptions 
or thinking were distorted compared to their 
normal functioning.” It should be recalled, 
however, that a fairly large number of persons 
in the other groups also reported such ex- 
periences (about 25%) and that the differences 
between groups on these items were not signifi- 
cant in most cases. Thus, more research on this 
issue is necessary before firm conclusions can 
be reached. 

One additional characteristic that has been 
proposed as part of deindividuation is a 
memory decrement (Festinger et al., 1952; 
Zimbardo, 1970). In the present study the 
memory for what had been said by others was 
poorest for the deindividuation group. How- 
ever, subjects’ memory for what they had said 
was uniformly high across conditions, indi- 
cating that even when self-awareness is low, 
the person can often still remember later what 
he or she has done. In addition, the deindivi- 
duation group had higher memory scores for 
pictures of the other group members, perhaps 
because they were more involved with the 
group. The deindividuated subjects were some- 
what poorer at memory for peripheral cues. 
This pattern suggests that “memory” differ- 
ences actually result from attentional differ- 
ences in the various conditions and that 
deindividuation does not automatically lead 
to reduced memory or amnesia. 


Alternative Explanations 


There are a number of alternative explanations 
for the present results.² It is possible that 


² One other alternative explanation for the present 
findings is in terms of demand characteristics. The 
major reason to discount a demand characteristics 
explanation of the findings is that it does not explain 
subjects who conformed more to the disinhibited 
ited atmosphere in the deindividuated condi- 
tion simply liked the group more, and they 
therefore complied with the demands of the 
setting. It should be noted that the conformity 
explanation does not contradict the idea of 
deindividuation. Indeed, compliance or con- 
formity to group behavior in the initial stages 
of group activity may often set the stage for 
deindividuation and for the group to there- 
after move in increasingly disinhibited direc- 
tions. However, it does not appear that con- 
formity is the sole explanation for the present 
findings. First, deindividuation subjects chose 
more disinhibited boxes immediately after the 
group manipulation and before they knew 
what other subjects had chosen. And second, 
deindividuation subjects showed a variety of 
responses that are not explained simply by 
conformity, for example, more frequently re- 
porting a loss of identity, lower self-conscious- 
ness, and less conscious planning. 


why the deindividuation items formed a coherent 
factor, as predicted by the theory. Another reason to 
discount this possibility is that few subjects guessed 
the true purpose of the study, and those few who did 
responded no differently from those who did not. 


Reference Note 


1. Diener, E., & Kasprzyk, D. Causal factors in disinhibition 
by deindividuation. Unpublished manuscript, 
University of Illinois at Urbana-Champaign, 1978. 


References 


Bandura, A. Self-reinforcement: Theoretical and methodological 
considerations. Behaviorism, 1976, 4, 135-155. 

Baron, R. S. Anonymity, deindividuation and aggression 
(Doctoral dissertation, University of Minnesota, 
1970). Dissertation Abstracts International, 1971, 32, 
533A. (University Microfilms No. 71-18,681) 

Carver, C. S. Facilitation of physical aggression 
through objective self-awareness. Journal of Experimental 
Social Psychology, 1974, 10, 365-370. 

Carver, C. S. Physical aggression as a function of 
objective self-awareness and attitudes toward punishment. 
Journal of Experimental Social Psychology, 
1975, 11, 510-519. 

Carver, C. S. A cybernetic model of self-attention 
processes. Journal of Personality and Social Psychology, 
in press. 

Cattell, R. B. The scree test for the number of factors. 
Multivariate Behavioral Research, 1966, 1, 245-276. 

Cooley, W. W., & Lohnes, P. R. Multivariate procedures 
for the behavioral sciences. New York: Wiley, 1962. 

Diener, E. Effects of prior destructive behavior, 
anonymity, and group presence on deindividuation 
and aggression. Journal of Personality and Social 
Psychology, 1976, 33, 497-507. 

Diener, E. Deindividuation: Causes and consequences. 
Social Behavior and Personality, 1977, 5, 143-155. 

Diener, E. Deindividuation: The absence of self-awareness 
and self-regulation in group members. 
In P. Paulus (Ed.), The psychology of group influence. 
Hillsdale, N. J.: Erlbaum, 1979. 

Diener, E., & Crandall, R. Ethics in social and behavioral 
research. Chicago: University of Chicago 
Press, 1978. 

Diener, E., Dineen, J., Endresen, K., Beaman, A. L., 
& Fraser, S. C. Effects of altered responsibility, 
cognitive set, and modeling on physical aggression 
and deindividuation. Journal of Personality and 
Social Psychology, 1975, 31, 328-337. 

Diener, E., Fraser, S. C., Beaman, A. L., & Kelem, R. T. 
Effects of deindividuating variables on stealing among 
Halloween trick-or-treaters. Journal of Personality 
and Social Psychology, 1976, 33, 178-183. 

Diener, E., Matthews, R., & Smith, R. E. Leakage of 
experimental information to potential future subjects 
by debriefed subjects. Journal of Research in 
Personality, 1972, 6, 264-267. 

Diener, E., & Srull, T. K. Self-awareness, psychological 
perspective, and self-reinforcement in relation to 
personal and social standards. Journal of Personality 
and Social Psychology, 1979, 37, 413-423. 

Diener, E., & Wallbom, M. Effects of self-awareness on 
antinormative behavior. Journal of Research in 
Personality, 1976, 10, 107-111. 

Diener, E., Westford, K. L., Dineen, J., & Fraser, 
S. C. Beat the pacifist: The deindividuating effects 
of anonymity and group presence. Proceedings of the 
81st Annual Convention of the American Psychological 
Association, 1973, 8, 221-222. (Summary) 

Dion, K. L. Determinants of unprovoked aggression 
(Doctoral dissertation, University of Minnesota, 
1970). Dissertation Abstracts International, 1971, 
32, 534A. (University Microfilms No. 71-18,7[illegible]) 

Dipboye, R. L. Alternative approaches to deindividuation. 
Psychological Bulletin, 1977, 84, 1057-1075. 

Duval, S., & Wicklund, R. A. A theory of objective self-awareness. 
New York: Academic Press, 1972. 

Festinger, L., Pepitone, A., & Newcomb, T. Some consequences 
of deindividuation in a group. Journal of 
Abnormal and Social Psychology, 1952, 47, 382-389. 

Ickes, W., Layden, M. A., & Barnes, R. D. Objective 
self-awareness and individuation: An empirical link. 
Journal of Personality, 1978, 46, 146-161. 

Kanfer, F. Self-regulation and self-control. In H. [illegible] 
(Ed.), The psychology of the 20th century. [illegible]: 
Kindler Verlag, 1977. 

Maslach, C. Social and personal bases of individuation. 
Journal of Personality and Social Psychology, 1974, 
29, 411-425. 

Mills, J. A procedure for explaining experiments involving 
deception. Personality and Social Psychology 
Bulletin, 1976, 2, 3-13. 




Olson, C. L. On choosing a test statistic in multivariate 
analysis of variance. Psychological Bulletin, 1976, 83, 
579-586. 

Rule, B. G., Nesdale, A. R., & Dyck, R. Objective self- 
awareness and differing standards of aggression. 
Representative Research in Social Psychology, 1975, 
6, 82-88. 

Sargant, W. The mind possessed. New York: Penguin 
Books, 1975. 

Scheier, M. F., Fenigstein, A., & Buss, A. H. Self- 
awareness and physical aggression. Journal of Ex- 
perimental Social Psychology, 1974, 10, 264-273. 

Watson, R. I. Investigation into deindividuation using 
a cross-cultural survey technique. Journal of Per- 
sonality and Social Psychology, 1973, 25, 342-345. 

Wicklund, R. A. Objective self-awareness. In R. L. 
Berkowitz (Ed.), Advances in experimental social 
psychology (Vol. 8). New York: Academic Press, 1975. 

Wicklund, R. A. Group contact and self-focused atten- 
tion. In P. Paulus (Ed.), The psychology of group in- 
fluence. Hillsdale, N. J.: Erlbaum, 1979. 

Ziller, R. C. Individuation and socialization: A theory 
of assimilation in large organizations. Human Rela- 
tions, 1964, 17, 341-360. 

Zimbardo, P. G. The human choice: Individuation, 
reason and order versus deindividuation, impulse 
and chaos. In W. J. Arnold & D. Levine (Eds.), 
Nebraska Symposium on Motivation (Vol. 17). 
Lincoln: University of Nebraska Press, 1970. 


Received July 10, 1978 


Journal of Personality and Social Psychology 
1979, Vol. 37, No. 7, 1172-1178 


Effects of Difficulty and Diagnosticity on 
Choice Among Tasks in Relation to Achievement 
Motivation and Perceived Ability 


Ursula Buckert 
Oberhausen, West Germany 


Heinz-Dieter Schmalt 
Ruhr-Universität Bochum, West Germany 


Wulf-Uwe Meyer 
Universität Bielefeld, West Germany 


This study is a partial replication of a study conducted by Trope. It 
investigated the effects of two personal characteristics (achievement motive and 
perceived own ability) and two task characteristics (difficulty and diagnostic 
value about own ability) on choice among achievement tasks. In accordance 
with the results of Trope, it was found that high-diagnostic tasks were 
preferred to low-diagnostic tasks, independent of their difficulty. Trope’s finding 
that high resultant achievers choose high-diagnostic tasks over low-diagnostic 
tasks to a greater extent than low resultant achievers was not replicated. 
However, the perceived degree of own ability affected choice behavior: When easy 
and difficult tasks were both high in diagnosticity, subjects high in perceived 
ability preferred difficult over easy tasks, whereas subjects low in perceived 
ability preferred easy over difficult tasks. From this latter finding it is 
concluded that a self-informational conception of choice behavior has to include 
the subjective probability of success at tasks as a determinant of choice, in 
addition to objective difficulty and diagnostic value. 


According to Atkinson’s (1957, 1964) theory 
of achievement strivings, individuals high and 
low in resultant achievement motivation differ 
systematically in their choices of achievement 
tasks. High resultant achievers will select tasks 
that are intermediate in subjective probability 
of success, whereas low resultant achievers 
prefer to undertake tasks of high or low 
subjective probability (Ps). These predictions are 
based on a hedonic conception of 
achievement-related behavior. It is assumed that the choice 
of tasks intermediate in Ps maximizes positive 
affect (pride), prospectively connected with 
success, for individuals high in resultant 
achievement motivation, and that the selection 
of tasks high or low in Ps minimizes 
negative affect (shame), prospectively connected 
with failure, for persons low in resultant 
achievement motivation. In the 
empirical studies undertaken by Atkinson and 
his co-workers, Ps is equated with the difficulty 
of the experimental tasks. 


This research was conducted by the first author and 
was part of her diploma thesis. 

Requests for reprints should be sent to Wulf-Uwe 
Meyer, Universität Bielefeld, Abteilung Psychologie, 
Postfach 8640, 4800 Bielefeld, West Germany. 


Copyright 1979 by the American Psychological Association, Inc. 0022-3514/79/3707-1172$00.75 


An alternate conception of achievement-related 
choice was developed by Weiner et al. 
(1971). This approach, which is based on the 
principles of attribution theory, explains the 
assumed differential task selection of the two 
motive groups in terms of the expected 
informational feedback of an outcome. It is 
assumed that high resultant achievers are 
motivated to obtain information about their 
capabilities; they therefore select tasks of 
intermediate difficulty because success and 
failure at these tasks yield the most information 
about the person undertaking these tasks. 
On the other hand, low resultant achievers are 
assumed to be motivated to avoid this 
information; they therefore select very easy or 
very difficult tasks. This is assumed because 
success at easy tasks and failure at difficult 
tasks (the usual outcomes) yield relatively 
little information about one’s ability. This lack 
of self-informational feedback results from the 
fact that outcomes consistent with social 
norms lead to causal inferences about the task 
rather than about the person undertaking the 
task (Heider, 1958; Kelley, 1967, 1973). Thus, 
the same pattern of choices for the two motive 
groups is predicted from Atkinson’s hedonic 
and Weiner et al.’s informational approaches 
to achievement behavior. 

Two recent studies by Trope and Brickman 
(1975) and Trope (1975) unconfounded the 
factors that lead to these identical predictions 
that are derived from Atkinson’s achievement 
motivation theory and Weiner et al.’s 
attributional conception. These authors argued that 
the relationship between task difficulty and 
expected self-informational value, postulated 
by Weiner et al. (1971), is not universally 
valid. The informational value also depends on 
the diagnosticity of tasks, which is 
operationally defined as the difference in the percentage 
of success at a given task between individuals 
high versus low in ability. For example, suppose 
that 100 people, 50 high in ability and 50 low 
in ability, first perform Task A and then Task 
B, with the two possible outcomes, success or 
failure, at each. At each task 40 individuals 
succeed. The overall probability of success of 
both tasks then is .40. Suppose further that 
on Task A 40 from the group high in ability 
are successful and none from the group low in 
ability, whereas on Task B 22 out of the 
high-ability group have success and 18 from the 
low-ability group. It is clear that success at 
Task A distinguishes better between persons 
with different ability levels than success at 
Task B. Although the overall difficulty of both 
tasks is equal, Task A is more diagnostic of 
the ability level of a person than Task B. 
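
The Task A and Task B example can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function names are hypothetical and not part of the original studies, and diagnosticity is computed simply as the difference in success rate between the high- and low-ability groups, following the operational definition given above.

```python
def overall_success(high_successes, low_successes, group_size):
    """Proportion of all participants (both ability groups) who succeed."""
    return (high_successes + low_successes) / (2 * group_size)


def diagnosticity(high_successes, low_successes, group_size):
    """Difference in success rate between the high- and low-ability groups."""
    return high_successes / group_size - low_successes / group_size


GROUP_SIZE = 50  # 50 people high in ability and 50 low in ability

# Numbers of successful individuals, taken from the example in the text.
tasks = {
    "A": {"high": 40, "low": 0},
    "B": {"high": 22, "low": 18},
}

for name, counts in tasks.items():
    p = overall_success(counts["high"], counts["low"], GROUP_SIZE)
    d = diagnosticity(counts["high"], counts["low"], GROUP_SIZE)
    print(f"Task {name}: overall success = {p:.2f}, diagnosticity = {d:.2f}")
```

Both tasks have the same overall probability of success (.40), but the difference between the ability groups is .80 for Task A and only .08 for Task B.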

Trope and Brickman pointed out that 
intermediate difficulty tasks are generally, but not 
necessarily, more informative (diagnostic) 
than easy or difficult tasks because tasks of 
intermediate difficulty mathematically can be 
made maximally diagnostic and therefore are 
ordinarily more efficient at discriminating 
between people with high and low ability. The 
authors conducted an experiment in which the 
variables, difficulty and diagnosticity of tasks, 
were independently manipulated. Groups of 
subjects were given a choice among easy, 
moderate, and difficult tests. The tasks of 
each test were highly diagnostic for some 
groups and low in diagnosticity for other 
groups. The results showed that tasks of high 
expected diagnostic value were preferred over 
tasks of low expected diagnostic value, 
independent of their difficulty. This result is 
interpreted by the authors as incompatible with 
Atkinson’s theory because information seeking 
rather than maximizing positive affect 
(exhibited by a preference for intermediate 
difficulty tasks) seems to be the crucial 
determinant of choice behavior. 

In a replication of Trope and Brickman’s 
study by Trope (1975), resultant achievement 
motivation also was measured. The results 
again showed that high-diagnostic tasks were 
generally preferred to low-diagnostic tasks. 
Moreover, high resultant achievers preferred 
high over low-diagnostic tasks to a signifi- 
cantly greater extent than low resultant 
achievers. These results again are viewed by 
Trope as contradicting Atkinson’s theory and 
giving some support to the position of Weiner 
et al. (1971) that high resultant achievers 
strive for ability-relevant information and 
that low resultant achievers avoid such 
information. Trope (1975), however, pointed 
out “that replication studies that employ 
other measures of achievement motive are 
definitely needed in order to firmly establish 
this conclusion” (pp. 1010-1011). 

The present experiment is a partial replica- 
tion of the studies conducted by Trope and 
Brickman (1975) and Trope (1975). It in- 
cludes a well-validated German instrument 
for the assessment of resultant achievement 
motivation developed by Schmalt (1976). In 
addition to achievement motivation, the de- 
gree of ability that the subjects ascribe to 
themselves (perceived ability) is introduced 
here as a further independent variable; 
several authors have considered perceived 
ability as an important determinant of the 
achievement motive (Heckhausen, 1977; 
Meyer, 1976) or even equated differences in 
resultant achievement motivation with 
differences in perceived ability (Kukla, 1972, 
1978; Meyer, 1973). 


Method 


Subjects 


Subjects were 104 male students, 16 to 19 years old 
and attending Grades 11 and 12 of two German high 
schools (Gymnasium). 


Design 


The general features of the design and the procedure 
resembled those employed by Trope and Brickman 
(1975) and Trope (1975). The study included two 
experimental conditions with three tests varying in 
difficulty in each condition (easy = .70, moderate = .50, 
difficult = .30). In Condition 1, the test moderate in 
difficulty was high in diagnosticity, whereas the other 
two tests were low in diagnosticity. In Condition 2, 
the easy and the difficult tests were high in diagnosticity, 
whereas the moderately difficult test had low 
diagnosticity. There were 51 subjects assigned to Condition 1 
and 53 subjects to Condition 2. 

Two other independent variables included in the 
study were the subjects’ resultant achievement moti- 
vation and the degree of perceived ability that the sub- 
jects ascribed to themselves. Resultant achievement 
motivation was measured by the adult version of the 
Need Achievement (n Ach) Grid developed by Schmalt 
(1976). This instrument includes a series of depicted 
achievement situations with a fixed set of statements 
characteristic for hope of success and fear of failure 
orientation. The statements are taken from Thematic 
Apperception Test (TAT) scoring codes (Heckhausen, 
1967) and are the same for each depicted situation. 
The subject has to mark those statements that would 
be characteristic for himself in each situation. The 
construct validity of the n Ach Grid is documented by 
a number of findings (see Schmalt, 1976). 


Procedure 


The experiment was run in groups ranging from 11 
to 25 subjects, with each subject using a code to 
guarantee anonymity. First, the n Ach Grid was 
administered. The subjects then were told that the 
experimenter would administer a test called the Minnesota 
Multiphasic Integrative Orientation Test. To create 
the impression that the test really would be 
administered and that ability-relevant information could be 
gained by all subjects, the experimenter pointed out 
that she would visit the class three times. Subjects were 
told that in the first phase the test would be explained 
and that each subject would be given the opportunity 
to determine for himself the type of test items he 
would work on approximately 1 week later. In a second 
phase, they were told, the tasks selected by each 
individual would be administered. In a third phase, they 
were told, each subject would be given the opportunity 
to get information about the test result from the 
experimenter. 

In accordance with the instructions used by Trope 
and Brickman (1975) and by Trope (1975), the subjects 
were then given a test booklet indicating that 
integrative orientation is characterized by being able “to 
generate new information from previously known 
information, to generalize, to find with relative ease new 
solutions to problems.” It was emphasized that standard 
intelligence tests do not measure integrative orientation 
and that about half the students from high schools 
comparable in age were high in the ability and half 
were low. (For the rationale of this procedure, see 
Trope & Brickman, 1975.) Subjects were then told 
that the test consisted of three subtests, each consisting 
of 16 tasks. It was explained that subjects were to 
choose a total of 12 items and that they were free to 
decide for themselves how many items from each test 
they would like to work on. 

Instead of leaving the tasks unspecified as Trope and 
Brickman did, examples of the tasks, supposedly used 
in the tests, were presented; they were taken from the 
Intelligenz-Struktur-Test (Amthauer, 1973). Two rows, 
with five cubes in each, were shown, with each cube 
carrying different symbols on the three sides presented. 
The cubes in the two rows were the same, but in the 
lower row they were turned, tipped over, or both, and 
had changed their positions in relation to the upper row. 
It was explained that pairs of identical cubes had to be 
identified. Following this presentation, subjects had to 
estimate the degree of their integrative ability on a 
9-point scale. The scale was anchored at odd numbers 
with the following labels: very high (1), high (3), 
average (5), low (7), very low (9). Up to this point, 
the instructions were the same for all subjects. 

Manipulation of difficulty and diagnosticity. To 
create the experimental conditions, subjects received 
different booklets. In these booklets the information 
about difficulty and diagnosticity was given in the form 
of bar diagrams that showed the percentage of students 
high and low in integrative ability who succeeded on 
the three subtests. The diagrams were accompanied 
by an explanation: For the high-diagnosticity tests, the 
subjects were told that there was a relatively large 
difference between the successes of students high and 
low in integrative ability. Success or failure therefore 
would indicate the ability level. For low-diagnosticity 
tests, subjects were told that this difference was 
relatively small; success and failure therefore would not 
indicate the ability level. Furthermore, the overall 
percentage of success indicated whether the test was 
easy, moderate, or difficult. Table 1 gives the reported 
percentages of success for high- and low-ability students 
in the two experimental conditions. 

Measurement of dependent variables. After the above 
descriptions, subjects were given another booklet 
indicating that they were to choose a total of 12 items 
from the tests. It was stressed that the items selected 
were to be performed by the subject during the next 
test session. In the booklets again some short 
information about difficulty and diagnosticity of the three 
tests was presented in the form of bar diagrams. 
Subjects were then asked to indicate how many items 
from each test they would like to be given at the next 
test session, with the total number of items selected 
not exceeding 12. Furthermore, subjects had to 
indicate the degree of preference for each test in 
comparison with the other two tests on a 10-point bipolar 
scale ranging from very weak preference to very strong 
preference. 


Results 


Since the analyses of item selection and 
preference ratings yielded the same results, 
the latter are not presented. The mean number 
of items chosen from each test in Conditions 
1 (n = 51) and 2 (n = 53) is shown in Figure 1. 
Figure 1 reveals that in both conditions, more 
items are chosen from the high-diagnostic 
tests than from the low-diagnostic tests. The 
means for each test differ significantly between 
the experimental conditions: easy, t(102) = 
5.34; moderate, t(102) = 10.95; difficult, t(102) 
= 4.61; ps < .001.¹ In contrast to the results 
from the experiments of Trope and Brickman 
and of Trope, in Condition 2 there was no 
preference for easy tasks over difficult ones. 

To test the effect of the achievement motive 
on choice, subjects were divided at the median 
on the Net Hope (NH) scores in groups high 
(NH > 5, n = 52) and low (NH < 5, n = 52) 
in resultant achievement motivation. There 
were no significant differences in preference 
between the motive groups. Thus, Trope’s 
finding of a more pronounced preference of 
high-diagnostic over low-diagnostic tasks in 
the high than in the low motive group was 
not replicated. 

To examine the influence of perceived 
ability, which is unrelated to achievement 
motivation (r = -.12), on item selection, 
groups high (Scale Values 1-4, n = 56) and 

Table 1 
Percentages of Success Among High- and Low-Ability Students 
on Three Tests Reported in Two Experimental Conditions 

                            Test difficulty 
Student ability    Easy (.70)    Moderate (.50)    Difficult (.30) 

Condition 1 
  High                 13              15                33 
  Low                  66              28                27 
Condition 2 
  High                 90              55                51 
  Low                  50              48                10 

Note. Criterion values are in parentheses. 


Figure 1. Mean number of items chosen from tests 
varying in difficulty in two experimental conditions. 


low (Scale Values 5-9, n = 48) in perceived 
ability were separated, with 26 highs and 25 
lows in Condition 1 and 30 highs and 23 lows 
in Condition 2. The mean number of items 
selected is shown in Figure 2. In Condition 1 
there were no significant differences in the 
mean preferences of the two groups. In 
Condition 2, however, individuals low in perceived 
ability preferred easy test items to a 
significantly greater extent than individuals high in 
perceived ability, t(51) = 3.79, p < .01, 
whereas for the difficult test in this condition 
the reversed preference pattern is given, 
t(51) = 2.27, p < .05. The difference between 
the groups at the moderately difficult test 
failed to reach statistical significance, t(51) 
= 1.92. Within the groups high and low in 
perceived ability, there were significant 
differences between the experimental conditions at 
each level of test difficulty (t test, all ps at 
least < .05). 


¹ A comparison of the means for the tests within each 
experimental condition was not possible because the 
number of tasks selected from each test was not 
independent. 


Figure 2. Mean number of items chosen from tests varying 
in difficulty in two experimental conditions by subjects 
high and low in perceived ability. 


Discussion 


The results of this experiment partially 
replicated the findings reported by Trope and 
Brickman (1975) and Trope (1975): It was 
found that high-diagnostic tasks were 
preferred to low-diagnostic ones, independent of 
their difficulty. This supports the view that 
choice behavior is guided by a tendency to 
attain self-relevant information. 

But, in contrast to previous studies, a 
general preference for easy tasks over more 
difficult ones was not found. Furthermore, 
Trope’s finding that high resultant achievers 
choose high-diagnostic tasks over 
low-diagnostic tasks to a significantly greater extent 
than low resultant achievers was not replicated. 
This lack of any differential choice behavior 
between the motive groups is consistent with 
the results of Meyer, Folkes, and Weiner 
(1976), which demonstrated that both high 
and low achievers selected tasks in a pattern 
that is best described as normally distributed 
around the difficulty level of .50. 

The disparate results in the experiment 
reported here and in Trope’s (1975) can 
scarcely be attributed to the different 
instruments used for the assessment of the 
achievement motive in both studies. Meyer et al. 
(1976) and Trope used the Mehrabian 
scale, with Trope finding marked differences 
in choice between high and low achievers, 
which was not the case with Meyer et al. 
Furthermore, in a related study, Starke (Note 
1) employed, in accordance with the present 
experiment, the Grid technique. He found that high 
achievers select tasks that provide 
ability-relevant feedback, whereas low achievers 
select tasks providing ability-irrelevant 
feedback. Thus, altogether, it seems clear that 
individuals high in achievement need, 
independent of the disparate situational conditions 
in the experiments, strive for information 
about their capabilities. However, it remains 
unclear under what situational conditions 
individuals low in achievement need strive for or 
avoid such information. 

Another important finding of this study is 
the difference in choice behavior between 
subjects high and low in perceived ability. In 
Condition 2 the high-ability group preferred 
difficult over easy tasks, whereas the 
low-ability group preferred easy over difficult 
tasks, with the easy and difficult tasks being 
high in diagnosticity. In Condition 1 the two 
groups correspondingly preferred tasks from 
the intermediate difficulty test, which is the 
only test high in diagnosticity in this condition. 
This overall pattern of choice is neither 
completely consistent with Weiner et al.’s 
(1971) position that choice is a function of 
task difficulty, nor with Trope and Brickman’s 
(1975) position that choice is a function of the 
expected diagnostic value of tasks. Thus, 
there must be at least one other determinant 
of choice behavior in addition to these two 
variables. It appears that an informational 
conception of choice behavior has to include 
the subjective probability of success at tasks 
as a determinant of choice. Meyer (1973, 
1976) has contended that given tasks with 
which a person has had a history of 
involvement, information gain is maximal when Ps 
at a task equals .50 rather than when the 
objective difficulty of a task, as defined by the 
proportion of successful individuals, equals .50. 
Assume, for example, that Individuals A and 
B are confronted with a task intermediate in 
difficulty, both having some familiarity with 
the general type of tasks. A perceives own 
task-specific ability to be very high and 
estimates Ps to be 1.00, whereas B perceives own 
ability to be very low, estimating Ps at the 
task to be .00. It is clear that the expected 
success for A and the expected failure for B 
prospectively provide no new information 
concerning their abilities at this intermediate 
difficulty task, but only corroborate the 
established ability percepts. Individuals will expect 
most information about their ability from tasks 
in which own success and failure are maximally 
uncertain and whose Ps therefore is .50. 
Positive results at tasks with higher values of Ps 
and negative results at tasks with lower values 
of Ps have less self-informational value, 
inasmuch as these results either provide 
information about the difficulty of the task or confirm 
already established perceptions of own ability. 
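
The claim that outcome uncertainty peaks at Ps = .50 can be illustrated numerically. The sketch below uses the binary entropy of a success/failure outcome as a stand-in for "maximal uncertainty"; this particular measure is an assumption made here for illustration and is not a formula given in the article, and the function name is hypothetical.

```python
import math


def outcome_uncertainty(p_s):
    """Binary entropy (in bits) of a success/failure outcome.

    p_s is the subjective probability of success. The entropy measure is
    an illustrative assumption, not a quantity defined in the article.
    """
    if p_s in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -(p_s * math.log2(p_s) + (1 - p_s) * math.log2(1 - p_s))


for p_s in (0.0, 0.3, 0.5, 0.7, 1.0):
    print(f"Ps = {p_s:.2f}: uncertainty = {outcome_uncertainty(p_s):.3f} bits")
```

Under this assumption, Individuals A (Ps = 1.00) and B (Ps = .00) from the example face zero uncertainty, whereas a person with Ps = .50 faces the maximum.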

A number of studies correspondingly show 
that individuals high in perceived ability 
estimate the Ps of tasks at all difficulty levels as 
higher than individuals low in perceived 
ability (Meyer & Hallermann, 1977; Starke, 
Note 1; Buckert, Note 2; Schulz, Note 3). 
Thus, with increasing perceived ability, 
objectively more difficult tasks should be chosen 
to get self-relevant information at tasks for 
which Ps equals .50. Although Ps was not 
controlled here, it can be assumed, extrapolating 
from the studies just mentioned, that the 
perceived ability also influenced Ps: For the 
high-ability group, the objectively difficult test 
(.30) should be nearer to Ps = .50 than the 
easy one (.70), whereas for the low-ability 
group, the easy test should be nearer to 
Ps = .50. It therefore can be assumed that 
from tasks equally high in diagnosticity 
(Condition 2), those are preferred whose 
subjective probability is closest to .50. Thus, 
although individuals high and low in perceived 
ability phenotypically select different tasks, 
genotypically they are both striving to attain 
ability-relevant information. 
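
The reasoning for Condition 2 can be sketched as a simple selection rule: among equally diagnostic tests, prefer the one whose subjective probability of success is closest to .50. Since Ps was not measured in this experiment, the mapping from objective difficulty to Ps below (a fixed shift of plus or minus .20 depending on perceived ability) is purely hypothetical and serves only to make the argument concrete.

```python
# Objective success rates of the two high-diagnostic tests in Condition 2.
OBJECTIVE_DIFFICULTY = {"easy": 0.70, "difficult": 0.30}


def subjective_ps(objective, ability_shift):
    """Hypothetical mapping from objective difficulty to subjective Ps.

    ability_shift is positive for high perceived ability and negative for
    low perceived ability; the result is clamped to the range [0, 1].
    """
    return min(1.0, max(0.0, objective + ability_shift))


def preferred_test(ability_shift):
    """Return the test whose subjective Ps lies closest to .50."""
    return min(
        OBJECTIVE_DIFFICULTY,
        key=lambda name: abs(
            subjective_ps(OBJECTIVE_DIFFICULTY[name], ability_shift) - 0.5
        ),
    )


print("High perceived ability prefers:", preferred_test(+0.20))
print("Low perceived ability prefers:", preferred_test(-0.20))
```

With these assumed shifts, the high-ability group is predicted to prefer the difficult test and the low-ability group the easy test, which matches the direction of the preferences reported for Condition 2.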

In sum, it seems false to us to conceive the 
informational properties of choice in general as 
a function of task difficulty (Weiner et al., 
1971) or as a function of expected diagnosticity, 
which is usually dependent on task difficulty 
(Trope & Brickman, 1975). This may be true 
for situations in which the tasks to be performed 
are not known to the individual or in which 
the tasks are unspecified, as in the experiments 
by Trope and Brickman (1975), Trope (1975), 
and Meyer et al. (1976, Experiments I and II), 
or when the individual is completely 
unfamiliar with the tasks. Under these conditions, 
when the individual is unable to estimate own 
task-specific abilities, it might be the most 
efficient strategy to choose 
intermediate-difficulty (i.e., highly diagnostic) tasks to 
obtain information about one’s own ability. 
But when the tasks are known to the 
individual, as in the experiment reported here, 
and the individual already has an estimate of 
own ability, which is, however, subjectively 
not fully valid, then the perceived 
self-informational value of choice behavior at tasks that 
the individual perceives as diagnostic, with 
respect to own abilities, should be a function 
of Ps, with maximal expected information at 
tasks intermediate in Ps. 


Role of Foreseen, Foreseeable, and Unforeseeable 
Behavioral Consequences in 
the Arousal of Cognitive Dissonance 


Subjects delivered a counterattitudinal speech supporting an unwanted policy. 
Two groups were given information about the consequences of making the 
speech before agreeing to make it. One was explicitly informed of the 
possibility of an unwanted consequence occurring. The second group was given a 
general description of the consequences that was designed to make the specific 
unwanted consequence retrospectively foreseeable if made known later. In 
addition, there was a third group that was told nothing about the consequences 
of making the speech at the time of the decision to make it. After the speech 
was delivered, half of the subjects in each group were informed of a specific 
unwanted consequence of their act and half were given no further information. 
As predicted, self-justificatory attitude change was found only in the two 
conditions in which subjects were informed prior to making the speech of the 
specific unwanted consequence and in the condition in which subjects were 
given a general description of the consequences beforehand and specific 
information about the unwanted consequence after the speech. The results are 
discussed in terms of the relation of personal responsibility to cognitive 
dissonance arousal. 


A series of recent experiments has made it 
relatively clear that unwanted behavioral 
consequences that become known to actors only 
after they have performed a behavior can lead 
to dissonance-produced self-justificatory 
attitude change (Cooper & Worchel, 1970; 
Cooper, Zanna, & Goethals, 1974; Goethals & 
Cooper, 1972). There is also considerable 
evidence leading to the conclusion that such 
consequences produce dissonance only when their 
occurrence is foreseen at the time of the 
decision to engage in the behavior (Cooper, 
1971). 

The latter point, that unwanted 
consequences must be foreseen if they are to 
produce dissonance, is consistent with arguments 
by Cooper (1971), and also Hoyt, Henley, 
and Collins (1972), regarding the role of 
personal responsibility in cognitive dissonance. 
Cooper proposed that counterattitudinal 
behavior will produce dissonance if the actor 
feels personally responsible for the 
consequences of the behavior. Responsibility is 
accepted if (a) the person had free choice in 
performing the behavior and (b) the person 
was able to foresee its unwanted consequences. 
However, Cooper’s analysis does not tell us 
whether the unwanted consequences must be 
foreseen or only foreseeable in order for them 
to lead to the acceptance of responsibility. 
The distinction between foreseen and 
foreseeable consequences is an important one, both 
for psychological processes and legal decisions. 

Consider the following example: A man 
enters a gambling establishment. According to 
his already existing cognitions, he would find 
it undesirable to donate his money to the 
Mafia. In the first situation, the person knows 
full well that the establishment is run by the 
Mafia and that any loss of money will enter 
into mafioso hands. We can say that losing 
money in this situation will result in a foreseen 
consequence. At the other extreme, a person 
with the same attitudes enters a church to 
play bingo. He later learns that the “church” 
was merely a building borrowed from a movie 
set and that it was, in fact, a front for the 
Mafia. He, too, contributed money to the 
Mafia, but this time the consequence was 
completely unforeseeable. 

In the middle lies the situation in which the 
person does not have firm evidence as to who 
runs the gambling house. When he learns that 
it is the Mafia who pocketed his money, he 
may be tempted to say, “I didn’t know.” 
However, he may have had to walk past 
several bodyguards to get into the place; he 
may have had to pass the statue of Al 
Capone; he may have had to whisper a 
password to a gun-holding stranger to enter the 
locked room. It can be argued that he 
certainly could have known that the Mafia 
received his money, had he given it some 
reflection. We can say that the consequence in this 
situation was foreseeable. The difference 
between the last two situations is that in the 
former, there was no reasonable way in which 
the person could foresee the consequence of his 
behavior. In the latter, a reasonable person 
could have been expected to be aware of the 
possible occurrence of the consequence. 

A more formal set of definitions may be 
helpful. By foreseen consequences we mean 
those whose possible occurrence the actors are 
explicitly aware of at the time of decision. The 
occurrence of these consequences need only be 
possible, not definite, for them to be regarded
as foreseen. The critical point is that they be
in the actors’ consciousness as possible occur-
rences. By foreseeable consequences we refer
to those results of behavior that were not in 
the actors’ awareness at the time of decision 
but that they feel they—or any reasonable 
person—could have anticipated in light of the 
information they were explicitly given. That 
is, these are consequences the actors were not 
aware of, but feel in retrospect that they 
reasonably could be expected to have been 
aware of them. Finally, unforeseeable con-
sequences are those that the actors were not
aware of and, furthermore, feel that there was
no way that a reasonable person could have
anticipated.
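
The three definitions above reduce to two yes/no judgments, which can be sketched as a small decision rule. This is only an illustrative encoding, not part of the original study: the function name `classify_consequence` and its two boolean parameters are assumptions introduced here, and the second judgment is the actor's own retrospective feeling rather than an objective fact.

```python
from enum import Enum


class Foreseeability(Enum):
    FORESEEN = "foreseen"
    FORESEEABLE = "foreseeable"
    UNFORESEEABLE = "unforeseeable"


def classify_consequence(aware_at_decision: bool,
                         reasonably_anticipable: bool) -> Foreseeability:
    """Classify a consequence using the article's three definitions.

    aware_at_decision: the actor had the consequence in mind as a possible
        occurrence when deciding (hypothetical parameter name).
    reasonably_anticipable: in retrospect, the actor feels that a reasonable
        person could have anticipated it (hypothetical parameter name).
    """
    if aware_at_decision:
        # Possible, not definite, occurrence is enough to count as foreseen.
        return Foreseeability.FORESEEN
    if reasonably_anticipable:
        return Foreseeability.FORESEEABLE
    return Foreseeability.UNFORESEEABLE


# The gambling illustration: known Mafia establishment, unmarked gambling
# house with bodyguards and a password, and the sham church.
known_casino = classify_consequence(True, True)
guarded_house = classify_consequence(False, True)
sham_church = classify_consequence(False, False)
```

Note that awareness at the time of decision takes priority: once a consequence is foreseen, the retrospective judgment no longer matters for the classification.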

In the present article, we are testing the
proposition that both foreseen and foreseeable
consequences can lead to the arousal of cogni-
tive dissonance. Unforeseeable consequences
will not. Moreover, there will be a
difference between the foreseen and foreseeable
situations. Past research (e.g., Goethals &
Cooper, 1975) suggests that the explicit men-
tion of an unwanted event prior to a decision
to act will result in dissonance arousal regard-
less of whether subjects ever learn if the event
took place. Although this is true of a foreseen
consequence, we do not expect it to be true of
a foreseeable consequence. When the un-
wanted consequence is not explicitly in the
individual’s awareness, it can only lead to
dissonance arousal if its occurrence is made
explicit after the behavior. Only then will
subjects be able to think back to whether they
could have been reasonably expected to antici-
pate the occurrence of the consequence.

In the current study, subjects agreed to
participate in attitude-discrepant behavior.
The degree of foreseeability of an unwanted
behavioral consequence was separated into
foreseen, foreseeable, and unforeseeable varia-
tions. In the foreseen variation, the unwanted
consequence was explicitly mentioned to the
subjects as one of the events that could occur
as a result of their behavior. It was not a
definite outcome, but the subject was made
directly aware of its possible occurrence. In
the foreseeable variation, the events resulting
from the behavior were described in general
terms at the time of decision. The specific
unwanted consequence was not explicitly
mentioned, but it was an instance of the
class of consequences that was described. In
the unforeseeable variation, no mention was
made of anything bearing any relation to the
unwanted consequence that was to result from
the attitude-discrepant behavior.

In addition, the information that subjects
were given following their counterattitudinal
behavior was varied. In the not-informed condi-
tion, there was no further mention of con-
sequences after the subjects performed the
behavior. In the informed conditions, subjects
were told explicitly that the unwanted con-
sequence definitely was to occur.

It was predicted that self-justificatory atti-
tude change would occur in the foreseen con-
ditions, regardless of whether subsequent in-
formation about the occurrence of the con- 
sequence was ever given. On the other hand, 
subjects in the foreseeable variation were pre- 
dicted to manifest attitude change only in 
the informed condition. It was expected that 
when these subjects received specific post- 
behavioral information of the occurrence of the 
consequence, they would feel that they could 
have been aware of it. Finally, subjects in the 
unforeseeable conditions were not expected to 
change their attitudes. Regardless of any sub-
sequent information they might receive about
the consequence of their behavior, it was ex- 
pected that they would absolve themselves of 
the responsibility for those consequences, 
since they had no way of knowing about them 
at the time they agreed to their attitude- 
discrepant act. 
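
The predictions for the six cells of the design can be written out as a lookup rule. The sketch below is a restatement of the paragraph above and nothing more: the function name `predicts_attitude_change` and the string labels are assumptions chosen for illustration.

```python
def predicts_attitude_change(foreseeability: str, informed: bool) -> bool:
    """Return True if self-justificatory attitude change is predicted.

    foreseeability: "foreseen", "foreseeable", or "unforeseeable"
        (labels assumed here; they mirror the article's three variations).
    informed: whether subjects were later told the consequence would occur.
    """
    if foreseeability == "foreseen":
        # Predicted regardless of subsequent information.
        return True
    if foreseeability == "foreseeable":
        # Predicted only when the consequence is made explicit afterwards.
        return informed
    if foreseeability == "unforeseeable":
        # Subjects are expected to absolve themselves of responsibility.
        return False
    raise ValueError(f"unknown foreseeability level: {foreseeability!r}")


# Enumerate all six cells of the 3 x 2 design.
predictions = {
    (level, informed): predicts_attitude_change(level, informed)
    for level in ("foreseen", "foreseeable", "unforeseeable")
    for informed in (True, False)
}
```

Only the foreseeable row depends on the informed factor, which is why the hypotheses imply an interaction between the two variables rather than two independent main effects.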


Method 
Overview 


Under the guise of participating in a study on psycho- 
linguistics, subjects delivered a counterattitudinal 
speech supporting an unwanted policy on campus. One
group was explicitly informed of the possibility of an 
unwanted consequence following the attitude-dis- 
crepant act. A second group was given a general de- 
scription of the consequences of the behavior, designed 
to make a specific unwanted consequence retrospec- 
tively foreseeable. A third group was not told anything 
about the consequences of the act.

After the counterattitudinal speech was delivered, 
half of the subjects in each group were informed of a 
specific unwanted consequence of their act, and half
were given no further information. Finally, the subjects
gave a rating of their opinion about the issue at hand,
after which they were debriefed. 


Subjects 


Sixty Princeton undergraduates came in response to 
an advertisement for a psychology experiment. Sub- 
jects who asked for more information about the experi- 
ment were told that the study was on psycholinguistics. 


Procedure 


Each subject was run individually by two experi- 
menters. One experimenter conducted the first half of 
the procedure and the other experimenter the second
half. This technique was chosen to reduce the impres-
sion management concerns of the subjects. Both experi-
menters were female. One was a native of France and
the other a native of Iran. Both spoke fluent English 
with quite noticeable French and Persian accents, re- 
spectively. This factor played a part in the selection 
of the cover story and lent it considerable credence. 

When the subjects arrived the first experimenter 
greeted them and explained that the experiment, being 
conducted in conjunction with foreign language depart- 
ments, concerned psycholinguistics and was investigat- 
ing the different linguistic devices used in oral and 
written communication. It was explained that sub- 
jects would either write an essay or give a speech argu- 
ing for one side of a particular issue. The experimenter 
added that an issue was chosen that most students 
would be familiar with, namely a pending university 
decision to double the size of the freshman class and 
therefore, over time, the number of undergraduates at
the university. The experimenter explained that she 
knew how most undergraduates felt about the issue and 
that she had plenty of speeches and essays opposing the 
plan. Now, she said, she needed some favoring the 
proposed policy. 

At this point the procedure varied for subjects in 
different conditions. Those in the unforeseeable con- 
ditions were simply told that participation in the ex- 
periment was voluntary and that now that they had 
heard what the experiment was about, they could 
decide whether they wanted to continue. The experi- 
menter emphasized that the choice was theirs and that 
they could leave immediately and be paid. All subjects
agreed to stay.

For the subjects in the foreseen and foreseeable varia- 
tions, the experimenter explained that she should tell 
the subjects that several other groups on campus be- 
sides the psychology and language departments were 
interested in the speeches and essays collected in the 
study. She said that they were interested in things such 
as the linguistic devices used, the issues raised, and the 
tactics taken in composing the communications. In the 
foreseeable conditions the experimenter simply ex-
plained that the subject’s speech would be sent to one
of the interested groups on the basis of a random selec- 
tion procedure. Then she proceeded to the discussion 
of free choice. In the foreseen conditions, before men- 
tioning the random selection procedure the experi- 
menter elaborated on the interested “groups” by 
deliberately reading from a note pad the names and 
interests of the groups. She stated: 


The groups are first, a group of graduate students in a
linguistic strategies seminar (this group is interested
in the linguistic devices used in the speeches and es-
says collected in this study) and second, the debating
team. This group is interested in the fact that we
have chosen topics that have two sides to them, and
have asked for speeches and essays on both sides.
Thus the interest of this group lies in the tacks taken 
by different students in composing the communica- 
tion. And finally, the board of admissions of Princeton 
University is also interested in this study, for as I 
mentioned before, the University is considering an 
increase to 1,600 in the size of the freshman class.

When the subject had agreed to write the essay,
the experimenter led her to another room where the 
subject was introduced to the second experimenter who 
was “helping out” with the project and who would 
record the speech and ask some final questions. The 
first experimenter said goodbye and left. The remainder 
of the procedure was conducted by the second experi-
menter. She gave the subject some paper to make notes
on, told the subject to take 5 minutes to prepare a 
t-minute speech, and left the room. She returned in 5 
minutes and recorded the subject’s speech. She began 
recording by speaking the subject’s name and the 
topic and direction of the speech into the microphone. 
Then she handed the microphone to the subject who 
delivered the speech. 

After the speeches had been recorded, the experi- 
menter told the subjects that they seemed very per- 
suasive. Then in the foreseen-informed condition, she 
reminded the subjects that a random process was to 
be used to determine which of the three other interested 
groups would get the speech. She asked the subject to 
look at a letter on the back of the piece of scratch paper, 
saw that it was the letter “B,” and pointed to a chart 
showing that speeches recorded from pieces of scratch 
paper with the letter “B” would be sent to the board 
of admissions.

In the foreseeable-informed condition, the experi- 
menter reminded the subjects that other groups were 
interested in the speeches and told the subjects what 
the three groups were, exactly as the subjects in the 
foreseen group had been told prior to the speech. Then 
she explained the random assignment procedure and 
informed the subject that the speech was to be sent to 
the board of admissions.

In the unforeseeable-informed condition, the experi- 
menter explained to the subjects after the speeches 
were recorded that there were other interested groups, 
and so forth. Then the actual groups were identified,
the random assignment procedure was explained, and 
it was revealed to subjects that their speech would be 
sent to the admissions board.

None of these postspeech events occurred for the
subjects in the three not-informed conditions. All sub-
jects then were asked to answer a questionnaire that
contained a question asking the subjects to indicate,
on an 11-point scale, the extent to which they agreed
or disagreed with the idea of doubling the size of the
freshman class. The experimenter gave the
questionnaire to the subjects [illegible]
on an 11-point scale, whether their speech was to be
used for linguistic research only or by other groups as
well. After the subjects completed the
questionnaire, they were fully debriefed.

Table 1
Means on Question Regarding Uses of Speech

Condition       Foreseen   Foreseeable   Unforeseeable
Informed        10.7       10.2          [truncated]
Not informed    10.7       5.95          [truncated]

Note. Higher numbers indicate subjects believed
speech was to be used by “other groups”; lower
numbers that speech was to be used for “linguistic
research only.” Means not sharing a common sub-
script differ at the 5% level by the Newman-Keuls
test.


Results 
Effectiveness of the Manipulations 


Subjects were asked whether their speeches
were to be used for linguistic research only or
whether they would also be used by other
groups. The means on this measure are pre-
sented in Table 1. An analysis of variance
showed significant main effects for the in-
formed factor, F(1, 54) = 60.55, p < .001; the
foreseeability factor, F(2, 54) = 243, p <
.001; and a significant interaction, F(2, 54)
= 20.34, p < .001. Subjects in all three in-
formed conditions were well aware that their
speech would be used for other purposes. In
addition, subjects in the foreseen-not-informed
conditions knew that the speech would be used
for other purposes. Subjects in the foreseeable-
not-informed condition were, surprisingly, less
aware that the speech would be used for other
purposes than subjects in the aforementioned
conditions, but more aware than subjects in
the unforeseeable-not-informed condition, who
believed that the speech would be used for
research purposes only. The foreseeable-not-
informed condition subjects’ lesser awareness
of other uses of the speech makes sense in
terms of their being less fully informed about
the other groups interested in the speech.
Overall, the results suggest that the manipu-
lations had their intended impact. In addition,
all subjects indicated that they felt free to
decline participating in the speech-making
task.


Attitudes Toward Increases in Class Size


Subjects were asked whether they thought
that the freshman class size should be increased
to 1,600 students. They responded on a scale
labeled strongly agree (scored 1) and strongly
disagree (scored 11) at the end points. The
results are presented in Table 2. As expected, 
subjects in the unforeseeable-not-informed 
group were strongly opposed to an increase in 
the size of the freshman class. This group, 
which made a speech but never believed any 
unwanted consequence would occur, was ex- 
pected to reflect the prevalent attitude on 
campus (M = 10.1).

An analysis of variance performed on the 
data of Table 2 resulted in main effects for 
both the informed and foreseeability factors. 
Informed subjects evidenced more favorable 
attitudes toward the proposal than not-in-
formed subjects, F(1, 54) = 5.85, p < .02. In
addition, as the degree of prior awareness of 
the consequences increased from unforeseeable 
to foreseen, subjects indicated greater agree- 
ment with the counterattitudinal position, 
F(2, 54) = 12.26, p < .001. Most important,
our hypotheses led to the prediction of an 
interaction between the two variables. This 
was supported by the analysis of variance, 
which found the interaction to be reliable, 
F(2, 54) = 3.50, p < .05. A Newman-Keuls 
analysis summarized in Table 2 adds further 
support. When the consequences were fore- 
seen prior to the decision to act, subsequent 
information made no difference in the magni- 
tude of attitude change. Both the informed 
and not-informed variations evidenced sub- 
stantial changes in belief toward the proposal. 
Similarly, when the consequences of the be- 
havior were completely unforeseeable prior 
to the decision, the level of subsequent infor- 
mation again made no reliable difference in 
final opinions. However, when the conse-
quences were foreseeable, that is, those that one
might reasonably have been expected to antici-
pate but were not explicitly made known prior
to the act, the level of subsequent in-
formation made an important difference. If
not informed of the unwanted consequence,
foreseeable subjects reported anti-class-in-
crease attitudes (M = 10.2), but if they were
subsequently informed, they demonstrated
considerable moderation (M = 7.4). The dif-
ference between the two foreseeable variations
was significant (p < .05).
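
The degrees of freedom attached to the reported F ratios follow from the design itself. As a rough check, and assuming that all 60 subjects were retained in a fully crossed between-subjects 2 (informed) x 3 (foreseeability) analysis, the bookkeeping can be sketched as below; the function name `factorial_anova_df` is introduced here only for illustration.

```python
def factorial_anova_df(levels_a: int, levels_b: int, n_subjects: int) -> dict:
    """Degrees of freedom for a fully crossed, between-subjects two-way ANOVA.

    Assumes every subject contributes exactly one observation to one cell.
    """
    cells = levels_a * levels_b
    return {
        "a": levels_a - 1,
        "b": levels_b - 1,
        "interaction": (levels_a - 1) * (levels_b - 1),
        "error": n_subjects - cells,
    }


# Informed factor has 2 levels, foreseeability has 3, and 60 subjects took part.
df = factorial_anova_df(levels_a=2, levels_b=3, n_subjects=60)
```

Under these assumptions the informed factor is tested on 1 and 54 degrees of freedom, and both the foreseeability factor and the interaction on 2 and 54, which agrees with the F(1, 54) and F(2, 54) values reported for the attitude measure.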
Discussion


The results are supportive of the original 
predictions. First, when subjects perform an 
act under conditions of high choice that they 
realize has the potential to produce unwanted 
consequences, they show the effects of experi- 
encing dissonance. This is true whether or not 
the subjects are informed that the unwanted 
consequence did actually occur. That is, even
if subjects are not told the outcome of their 
behavior, they still show manifestations of 
dissonance. This latter finding is entirely con- 
sistent with previous research (Goethals & 
Cooper, 1975) suggesting that the mere possi- 
bility that one’s behavior will produce un- 
wanted consequences is sufficient to produce 
dissonance if subjects anticipate receiving 
no further information regarding the final 
outcome. 

Second, if one’s behavior produces unwanted 
consequences that are totally unforeseeable, 
such consequences produce no dissonance. 
If a subject performs what appears to be 
innocuous behavior that later turns out to 
have unwanted consequences that could not 
possibly have been foreseen, the subject will 
not accept responsibility for those conse- 
quences and will not manifest dissonance. 

Finally, and most importantly, the study has 
demonstrated that consequences made known 
to the subject after he or she has performed 
a behavior can and do produce dissonance if 
the subject perceives, retrospectively, that 
they were foreseeable; that is, that he or she 
could have foreseen that such an unwanted 
consequence could occur. 

One important aspect of these results is that 
they reaffirm the idea that personal responsi- 
bility for consequences is a necessary precon- 
dition for dissonance arousal. Furthermore, the 
study has helped specify under what conditions 
subjects will accept responsibility. They will 
accept responsibility if they feel the conse- 
quences were foreseeable, in the sense that they 
could have been foreseen by one who was being 
careful and thoughtful about his or her be- 
havior. Because the consequences could have 
been foreseen, subjects will feel that they 
should have foreseen them and are responsible
for the unwanted consequences even though
they did not, in fact, foresee them.

Table 2
Mean Attitudes Toward Increasing the Size
of the Freshman Class

Condition       Foreseen      Foreseeable   Unforeseeable
Informed        [illegible]   7.4           9.6
Not informed    [illegible]   10.2          10.1

Note. Higher numbers indicate less favorable atti-
tudes toward increasing class size. Means not
sharing a common subscript differ at the 5% level
by the Newman-Keuls test.

It seems as if the subjects in our experiment 
accepted responsibility for their actions in 
much the same way that the law assigns re- 
sponsibility to people whose behavior leads to 
undesirable consequences. Drivers who seri- 
ously harm pedestrians in auto accidents are 
held responsible by the law consistent with the 
principles of choice and foreseeability. Assum- 
ing that the drivers were driving as they were 
of their own volition, choice is clearly present. 
But their lawyers might argue that the out- 
come was not foreseen, or their clients clearly 
would have driven differently. However, the 
consequence was foreseeable. A reasonable 
man (a well-accepted legal fiction) would have 
foreseen that driving in this way could lead 
to a serious accident and injury. That is, 
drivers can be held negligent (i.e., responsible) 
even if they did not foresee what would occur, 
if they should have anticipated what might 
occur.

The demonstration that dissonance can be 
produced by behavior that was performed 
where no unwanted consequences were ex-
plicitly foreseen is important in terms of the
external validity of dissonance research. 
Most studies have shown that if subjects are 
coerced or induced to perform behavior that 
they know has unwanted consequences, they 
will experience dissonance. Yet one wonders 
how often people perform such behaviors in 
real life. Of course, the inducements and pres-
sures that dissonance experimenters use to
elicit compliance are not absent from the real 
world. Still, it seems that people more often 
perform behavior not realizing that it will 
produce negative consequences but subse-
quently find that it did. They then may feel
that they should have known. For example,
people may not buy a house if they realize
that the roof is going to leak. However, they
may find that it does leak and that they
neglected to check it before they bought the 
house. Here the consequence was not foreseen 
but probably would be regarded as foresee- 
able, and the people will feel responsible for 
the situation’s having occurred. They should 
reduce dissonance by believing more strongly 
in how nice their house is. These kinds of
events (people behaving in ways that pro-
duce unanticipated but foreseeable events)
are probably more common in arousing dis-
sonance than those in which people know-
ingly behave in ways that bring about un-
wanted consequences.

Finally, these data are important in relation 
to the overall understanding of the idea of 
responsibility and its relation to dissonance 
arousal. Wicklund and Brehm’s (1976) review 
of the development of the responsibility con- 
cept suggests that although choice and fore- 
seeability are necessary for responsibility and 
thus dissonance reduction, “unforeseen con-
sequences can arouse dissonance under special
conditions” (p. 71). The present study sug- 
gests one special condition for which the un- 
foreseen consequence is retrospectively fore- 
seeable. This special condition seems to apply 
to all of the studies reviewed by Wicklund and 
Brehm that appear to contradict the proposi-
tion that foreseeability is a necessary pre-
requisite for accepting responsibility for
behavioral outcomes. 

For example, in studies conducted by Pallak, 
Sogin, and Van Zante (1974) and by Sogin 
and Pallak (Note 1), subjects participated in
a task that involved selecting random num-
bers. They were told afterwards that their
numbers could not be used because of the way
they had performed the task. Although they
did not know beforehand that their numbers
might not be used on the basis of the experi-
menter’s criterion, it was probably foreseeable
that their performances could be found lack-
ing in some way. Subjects were simply told
that their numbers “should be usable.” This
seems to leave open the possibility that on
some basis or other, admittedly unknown,
they also might not be usable.




Wicklund and Brehm (1976) discuss a study
by Aronson, Chase, Helmreich, and Ruhnke 
(1974) in which subjects wrote counter- 
attitudinal essays and were told before or after 
writing them that their essays would be shown 
to a persuasible or nonpersuasible audience. 
There were two other conditions (lie condi- 
tions) in which subjects were promised that 
the essays would not be shown to other people. 
These subjects also discovered after writing 
their essays that they would be shown to a 
persuasible or nonpersuasible audience. Aron- 
son et al. found attitude change in the per-
suasible audience-after condition, in which
they argued that the unwanted consequence 
was unforeseeable. However, in all conditions 
of the experiment, subjects were told that 
their essays would be used “for an attitude 
change study,” and so an element of fore- 
seeability of the unwanted consequence may 
well have been present. Aronson et al. pointed 
out that there was greater attitude change in 
the persuasible audience-lie condition than 
expected. Perhaps this finding can be explained 
in terms of this degree of foreseeability.
Incidentally, Wicklund and Brehm also noted 
that this study may have involved fore- 
knowledge on the part of the subjects. 

We would like to be able to argue that not 
only the present study, but also the research 
reviewed by Wicklund and Brehm (1976), 
shows conclusively that self-justificatory atti- 
tude change does not occur when actors pro- 
duce unwanted consequences that are truly 
unforeseeable. We think that a plausible case 
can be made that retrospective foreseeability
of the kind created in the present experiment
did exist in all of the studies that were de-
signed to show dissonance effects without
foreseeability. However, it is difficult to make 
this point definitively. Unlike the present 
experiment, previous studies have been con-
cerned only with a foreseen versus unforeseen
distinction rather than a retrospectively fore- 
seeable versus unforeseeable distinction within 
an unforeseen variation. In examining the 
procedures of these studies, it may be reason-
able to argue that they resemble more clearly
our foreseeable variation than our unforesee-
able variation, but this argument cannot be
drawn conclusively. We believe that dissonance 
effects have not been demonstrated in condi- 
tions in which the unwanted consequences are 
truly unforeseeable, but further research would 
be required to support our interpretation of 
earlier studies. 


Reference Note 


1. Sogin, S. R., & Pallak, M. S. Responsibility, bad 
decisions, and attitude change: Volition, foreseeability,
and locus of causality for negative consequences. 
Unpublished manuscript, University of Iowa, 1974.


An attentional model of fear-based behavior is proposed and a study that tested
the model is reported. It was predicted that among subjects with moderate fear 
of snakes, heightened self-attention during an approach attempt would cause 
increased awareness of existing anxiety, followed by one of two courses of 
events: Subjects who believed that they could do the behavior in spite of their 
fear were expected to redirect their attention to the behavior—goal comparison 
and exhibit no behavioral deficit. Subjects who doubted their ability to do the 
behavior were expected to divert their attention from the behavior-goal com- 
parison and to withdraw behaviorally from the approach attempt. The results 
of the study support this reasoning. Discussion centers on relationships between 
the proposed model and previous theory. 


Recent approaches to understanding anxiety 
and anxiety-related behavior have emphasized 
the role of cognitive processes in such experi- 
ences (see, e.g., Sarason, 1972a, 1972b; Wine, 
1971). Some of this theorizing, for example, 
Wine’s summary and integration of the litera- 
ture of test anxiety, has treated attentional 
focus as an important variable in the fear 
response. Almost universally, however, direc- 
tion of attention has been considered only as 
a response—a reaction to fear-inducing situa- 
tions. Very little consideration has been given 
to the consequences of shifts in focus of 
attention. 

An exception to this general rule comes 
from research on fear-based withdrawal that 
was conducted by Carver and Blaney (1977a, 
1977b), investigating the effects of bogus 
arousal feedback on fearful subjects’ approach
behavior. In two of those studies (Carver &
Blaney, 1977a, Experiment 3; 1977b) subjects
were tested who had moderate fear of non-
poisonous snakes, but who varied in their
self-rated expectancy of being able to ap-
proach, pick up, and hold a snake. Confident
subjects were defined as those who believed
that they would be able to do those behaviors
even though doing so would cause them some
personal discomfort; doubtful subjects were
those who reported being not at all sure that
they could do the behaviors. Subjects later
attempted the behavioral sequence in question
while simultaneously receiving bogus heart-
beat feedback. For some subjects, the feedback
indicated autonomic quiescence; for others it
indicated gradually increasing arousal (the
heartbeat acceleration was coordinated with
the subject’s approach behavior). Doubtful
subjects (i.e., those with negative expectancies)
responded to the arousal feedback by attending
less to the comparison between their behavior
and their goal (according to self-reports in
Carver & Blaney, 1977b) than did subjects in
the constant heartbeat control group and by
withdrawing earlier in the approach sequence.
However, confident subjects (i.e., those with
positive expectancies) responded to the arousal
feedback by increasing the attention they paid
to the comparison between their behavior and 
their goal and by bringing the one into closer 
conformity with the other. 

Carver and Blaney (1977b) suggested the 
following interpretation for their findings, 
based on Duval and Wicklund’s (1972) self- 
awareness theory: An accelerating heartbeat 
might, as a sign of fear, lead to greater self- 
focus than a constant heartbeat. This would 
be consistent with theorizing that holds self- 
focus to be a response to fear-inducing situa-
tions (e.g., Wine, 1971). However, Carver and
Blaney further suggested that subsequent 
behavioral and focus-of-attention differences 
between confident and doubtful subjects might 
have resulted from two different modes of 
responding to that self-focus. That is, it has 
been demonstrated repeatedly in other be- 
havioral contexts that one consequence of 
self-directed attention is a heightened tendency 
to conform to salient behavioral standards
(e.g., Carver, 1974, 1975; Froming, in press;
Gibbons, 1978; Scheier, Fenigstein, & Buss,
1974; Wicklund & Duval, 1971). This is an
effect of self-attention that appears to occur
whenever such conformity is possible. This phe- 
nomenon seemed to Carver and Blaney to be 
implicit in their findings among confident 
subjects who had been presented with arousal 
feedback. That is, these subjects reported
greater attention to the behavior-goal com- 
parison than did those presented with constant 
feedback, and they failed to display any 
behavioral deficit. These subjects thus seemed 
to be attempting, successfully, to match their
behavior to the salient standard (i.e., picking
up and holding the snake).

On the other hand, at least one previous
study (Duval, Wicklund, & Fine, 1972) had
shown that the realization that one cannot
conform to the behavioral standard caused
self-focus to enhance behavioral withdrawal.
That previous demonstration seemed to Carver
and Blaney (1977b) to be consistent with their
findings among doubtful subjects. That is,
arousal feedback caused doubtful subjects to
withdraw earlier from the approach attempt
and to report reduced focus on the behavior-
goal comparison.

More recent data suggest, however, that
Carver and Blaney’s (1977b) interpretation
of their results was not entirely accurate.
Specifically, Fenigstein and Carver (1978) 
have shown that the presence of a sound 
identified to subjects as their heartbeats 
causes increased self-attention, as reflected by 
performance on a color-word test (cf. Geller 
& Shaver, 1976) and by increased self-attribu- 
tion of hypothetical outcomes (cf. Buss & 
Scheier, 1976; Duval & Wicklund, 1973). More 
importantly, Fenigstein and Carver’s results 
failed to indicate that an accelerating heartbeat 
sound leads to greater self-focus than does a 
constant heartbeat sound. Carver and Blaney’s 
(1977b) interpretation of their findings as 
reflecting responses to self-focus is thus called 
into question. 


A Sequential Model of Fear-Based Behavior 


There is, however, a way to reconcile 
certain aspects of Carver and Blaney’s reason- 
ing with the Fenigstein and Carver (1978) 
finding. A model of approach and withdrawal 
that does so, which represents a refinement 
of Carver and Blaney’s (1977b) theory, has 
been presented elsewhere (Carver, in press) 
as part of a more comprehensive analysis of 
behavior regulation. The model (see Figure 1) 
is described by the following assumptions: 

1. When a behavioral standard is salient 
(in this case, to approach and handle the 
feared stimulus), self-attention leads to in- 
creased attempts to conform behaviorally to 
the standard. If fear arousal cues never become 
salient during the approach, this process 
continues until the task is completed. This 
would seem to reflect what occurred among 
Carver and Blaney’s (1977a, 1977b) subjects 
who were presented with a constant heartbeat 
sound. Those subjects apparently accepted 
the nonarousal information contained in the 
feedback as veridical. Thus, although they 
may have been self-attentive (as is implied 
by Fenigstein & Carver's data, 1978), their 
behavior reflected only conformity to the 
standard. 

2. If arousal becomes strong enough to be 
salient, the approach attempt is momentarily 
interrupted. This interruption leads to an 
assessment of the likelihood of being able to 
continue the behavior. The combination of 
self-focus plus salient arousal information 
would seem to correspond to the subjective 
experiences of Carver and Blaney’s subjects 
who had been presented with an accelerating 
heartbeat. 

[Figure 1 diagram not reproduced. Recoverable labels: rising arousal, interrupt, outcome assessment, favorable expectancy.] 

Figure 1. Postulated sequence following self-attention in a fear-provoking behavioral context. 

3. If the outcome of the assessment is a 
positive expectancy (e.g., if the person judges 
that he or she can cope with the fear), the 
result is a return to the matching-to-standard 
attempt, with perhaps even increased concen- 
tration on that attempt (as reflected by self- 
report focus-of-attention data from Carver 
and Blaney’s, 1977b, subjects). If the outcome 
is a negative expectancy (if the person thinks 
that his or her limits have been reached), the 
consequence is behavioral withdrawal. The 
expectancy judgment thus is a kind of psycho- 
logical watershed. All behavioral responses 
following this judgment seem ultimately to 
fall into one of those two categories, that is, 
renewed efforts or withdrawal. 

To summarize, in this model the following 
variables are considered relevant to the 
understanding of approach behavior in a 
fear-eliciting situation: the degree of self-focus 
experienced during the task, the level of anxiety 
experienced during the task, and one’s degree 
of confidence in probable success at the task. 
Self-focus initially increases the salience of 
task completion as one’s goal; but if anxiety 
arises, self-focus also increases one’s awareness 
of the anxiety. The greater the subjective 
experience of anxiety, the greater the likelihood 
that one will pause to evaluate one’s chances 
of successful completion. If the resulting 
expectancy judgment is favorable, the person 
returns to the approach attempt; if the judgment 
is unfavorable, the person withdraws. 

Note that any renewed approach can be 
interrupted again, later in the sequence, if 
perceived anxiety increases again. 
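
The sequence just summarized can be sketched as a short simulation. This is only an illustration under stated assumptions: the numeric scales, the multiplicative salience rule, and the threshold value are hypothetical and do not come from the article, which specifies the model only verbally.

```python
from dataclasses import dataclass


@dataclass
class Person:
    self_focus: float  # hypothetical scale: 0.0 (low) to 1.0 (high)
    confident: bool    # True = favorable expectancy, False = unfavorable


def approach(person, arousal_by_level, threshold=0.5):
    """Return the highest approach level reached (0 if none).

    arousal_by_level[i] is the raw arousal felt at level i + 1.
    The salience rule and the threshold are illustrative assumptions.
    """
    reached = 0
    for level, arousal in enumerate(arousal_by_level, start=1):
        # Self-focus increases awareness of whatever arousal is present.
        perceived = arousal * (1.0 + person.self_focus)
        if perceived > threshold:
            # Salient arousal interrupts the attempt and cues an
            # assessment of the likelihood of continuing.
            if not person.confident:
                return reached  # unfavorable expectancy: withdraw
            # Favorable expectancy: return to the approach attempt.
        reached = level
    return reached
```

With rising arousal such as `[0.1, 0.2, 0.3, 0.4, 0.5]`, a doubtful person with high self-focus stops early in this sketch, whereas a doubtful person with low self-focus and a confident person with high self-focus both complete the sequence. Because the check is repeated at every level, a renewed approach can be interrupted again later.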
Present Research 


The study reported below was conducted 
as a test of this model. In this study, 
dispositional anxiety was held constant and both 
self-focus and dispositional expectancies were 
varied.¹ Specific predictions were derived as 
follows: Subjects with any substantial degree 
of chronic fearfulness with regard to some 
stimulus (e.g., a snake) presumably experience 
some veridical fear-related autonomic arousal 
at some point during an attempt to approach 
that stimulus. There is considerable evidence 
that in the absence of bogus feedback, 
experimentally heightened self-focus causes an 
increase in awareness of one’s internal affective 
states (Scheier & Carver, 1977). Thus we 
predicted that in a fear-provoking context, 
heightened self-focus should lead to enhanced 
perceptions of rising fear arousal among all 
fearful subjects, regardless of their chronic 
expectancies. The subjective perception of 
arousal, above some quasi-threshold, is 
assumed to cue the self-assessment process. 
Thus we predicted that the enhanced 
awareness of anxiety experienced by the 
self-attentive subjects would lead them to 
undertake self-assessment earlier in the approach 
sequence than the less self-attentive subjects, 
for whom the arousal would be less salient. 


The consequences of this self-assessment, however, 
should depend on subjects’ expectancies. That is, doubtful 
subjects (those with negative expectancies) 
should respond by withdrawing earlier in the 
approach sequence if self-focus is high than 
if it is low. Confident subjects (those with 
positive expectancies) should exhibit no such 
behavioral deficit; if anything, confident 
subjects should respond with renewed efforts. 
Thus we predicted an interactive influence of 
self-attention on subjects’ approach/withdrawal 
tendencies. A similar interactive effect 
was expected on subjects’ self-reports of what 
they had attended to during the approach 
attempt. 


Method 


Subjects 


Potential subjects, undergraduates at the University 
of Miami, completed a two-item pretest at the beginning 
of the semester. The items required respondents (a) 
to estimate the amount of anxiety aroused in them by 
nonpoisonous snakes (very little, slight, moderate, 
marked, or strong) and (b) to indicate what they 
thought their response would be to being asked to 
pick up a nonpoisonous snake (“I’d do it without any 
discomfort”; “I’d do it, but I’d feel a little queasy 
about it”; “I’m not at all sure that I’d do it”; or “I’d 
be too fearful even to try”). 

Our predictions depended on an interaction between 
self-attention and expectancies, with chronic fearful- 
ness held constant. To select subjects who differed 
appropriately with regard to expectancies, we chose 
persons who reported moderate fear of snakes. Confident 
subjects were defined as those who also indicated that 
they could pick up a snake even though feeling queasy 
about it. Doubtful subjects were those who indicated 
being not at all sure that they would do it. These were 
the same criteria for subject selection as were used by 
Carver and Blaney (1977b). 
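
The selection rule described above can be written out as a small function. This is a minimal sketch: the string labels used for the response options are shorthand invented here for illustration, not wording taken from the actual questionnaire.

```python
# Shorthand labels for the four options of pretest item (b); these
# labels are hypothetical abbreviations, not the questionnaire's wording.
NO_DISCOMFORT = "no discomfort"
QUEASY = "queasy but would do it"
NOT_SURE = "not at all sure"
TOO_FEARFUL = "too fearful to try"


def classify(anxiety, response):
    """Return 'confident', 'doubtful', or None (not selected).

    Only respondents reporting moderate anxiety are eligible, so that
    chronic fearfulness is held constant across the two groups.
    """
    if anxiety != "moderate":
        return None
    if response == QUEASY:
        return "confident"
    if response == NOT_SURE:
        return "doubtful"
    return None
```

For example, `classify("moderate", NOT_SURE)` yields `"doubtful"`, whereas a respondent reporting strong anxiety is excluded regardless of the second answer.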

Subjects completed the experimental procedures 
described below, in individual sessions several weeks 
after having completed the questionnaire. Subjects 
were randomly assigned to either the mirror present 
or the mirror absent condition by a second experi- 
menter, who had no contact with subjects. The primary 
experimenter had no knowledge of subjects’ pretest 
scores or of the hypotheses being tested until after the 
sessions had been completed. Assignment of subjects 
to conditions led to the following distribution: Each 
group of doubtful subjects was comprised of 10 males 
and 6 females, the confident-mirror present group had 
10 males and 5 females, and the confident-mirror 
absent group had 9 males and 4 females. 


Procedure 


When the subject arrived at the experimental site, 
he or she was given typed instructions explaining 
that the study concerned the physical aspects of 
anxiety. The subject was to attempt to approach, pick 
up, and hold a live nonpoisonous snake while his or her 
heartbeat ostensibly was monitored by the experimenter 
via a microphone. (This explanation and the following 
procedures were adapted from those of Carver & 
Blaney, 1977a, 1977b; the microphone was used to 
make the procedures as similar as possible to those used 
previously, but the present study did not use any 
bogus feedback.)² Each subject was encouraged to go 
as far as he or she could (thus establishing approach as 
the behavioral standard), but the subject was also 
told that he or she could stop at any point simply by 
indicating a desire to go no farther. 

After reading the instructions, the subject strapped 
a microphone to his or her chest at a position where he 
or she had located a strong heartbeat sound (using a 
stethoscope provided for this purpose). The subject 
then was given gloves and was escorted to the test 
corridor. At the far end of the corridor (which was 
slightly less than 1 m wide and approximately 4 m 
long) was a glass cage with a sliding top, standing on a 
pedestal about .5 m off the floor. Inside this cage was a 
boa constrictor slightly over 1 m in length. The 
experimenter reminded the subject that his or her task was 
to open the cover, reach in and pick up the snake, and 
hold it until the experimenter said to return it to the 
cage. After cautioning the subject not to begin until a 
verbal signal had been given, the experimenter returned 
to the observation room, ostensibly to turn on the 
heartbeat recorder. 

¹ What is critical in the model is the person’s 
situational expectancy. However, on the basis of a substantial 
amount of data from previous research (Carver & 
Blaney, 1977a, 1977b), we assumed that subjects’ 
dispositional expectancies would provide a good basis 
for inferring their situational expectancies. The results 
of the present study lend additional credence to that 
assumption. 

² It might be argued that strapping on a microphone 
would lead to heightened self-attention. However, data 
from Fenigstein and Carver (1978) indicate that such 
is not the case. 

Self-attention manipulation. Self-attention was 
varied in this experiment by the presence or absence of 
a mirror. Evidence that this stimulus does increase 
self-focus has been discussed at length elsewhere 
(Carver & Scheier, 1978). For subjects in the mirror 
absent condition, the corridor was generally featureless. 
For subjects in the mirror present condition, however, 
a .5 m × .75 m wall mirror had been hung behind the 
cage at approximately chest level. The angle of the 
mirror gave the subject a view of his or her face and 
shoulders; the snake, however, was not visible in the 
mirror at any point except the final stage of the 
approach task (discussed later). In the mirror present 
condition, before leaving the subject the experimenter 
pointed out the mirror’s presence, indicated that it was 
part of the experiment, and said that its purpose would 
be explained later. No subject ever expressed concern 
or suspicion about the mirror’s presence, either at that 
point or during postexperimental debriefing. 

It will be noted that this method of introducing a 
self-awareness manipulation differs from usual pro- 
cedures. Typically, the mirror is treated as belonging 
to a different experiment (e.g., Carver, 1974, 1975) or as 
being necessary for a subsequent task (e.g., Wicklund 
& Duval, 1971). We decided to point the mirror out to 
subjects in the present context, however, for the 
following reason: A live snake is an attention-attracting 
stimulus even for people who are not afraid of snakes. 
Inasmuch as all of the present subjects had reported 
being moderately fearful of snakes, it seemed likely 
that they would be very attentive to the snake’s 
presence while in the test corridor. This would tend 
to make the mirror less salient as a stimulus, thus 
minimizing its impact on subjects. (See Scheier et al., 
1974, for a further discussion of the role of stimulus 
salience in effectiveness of self-attention manipulations.) 
For this reason, it was decided to draw subjects’ 
attention explicitly to the mirror as a part of their 
behavioral context. 

Approach task. After the experimenter returned to 
the observation room, he called to the subject to begin 
approaching. Nine levels of approach had been pre- 
defined: (1) the subject's appearing in the experi- 
menter’s view, (2) coming to within 6 inches (15 cm) 
of the cage, (3) touching the cage, (4) putting a hand 
inside the cage, (5) touching the snake, (6) grasping 
the snake, (7) lifting the snake a few inches, (8) lifting 
it from the cage, and (9) holding it for 15 sec. Each 
subject’s approach score was defined as the highest 
level that he or she attained. 
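
The scoring rule can be made concrete with a brief sketch. The level descriptions are paraphrased from the list above; the score of 0 for a subject who completes no level is an assumption added here, since the text does not say how such a case was scored.

```python
# The nine predefined approach levels, paraphrased from the text.
APPROACH_LEVELS = {
    1: "appears in the experimenter's view",
    2: "comes within 6 inches (15 cm) of the cage",
    3: "touches the cage",
    4: "puts a hand inside the cage",
    5: "touches the snake",
    6: "grasps the snake",
    7: "lifts the snake a few inches",
    8: "lifts the snake from the cage",
    9: "holds the snake for 15 sec",
}


def approach_score(levels_attained):
    """Return the highest level attained (0 if none; an assumption)."""
    valid = [lvl for lvl in levels_attained if lvl in APPROACH_LEVELS]
    return max(valid, default=0)
```

A subject who touched the snake but went no further would receive `approach_score([1, 2, 3, 4, 5])`, that is, a score of 5.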

After the subject had completed the approach task 
or had indicated an unwillingness to continue, the 
experimenter told the subject to return to the other 
end of the corridor. The subject then was escorted to 
the original room, where he or she was given a post- 
experimental questionnaire, then debriefed and 
dismissed. 


Postexperimental Measures 


The postsession questionnaire contained several 
items intended to gain indirect evidence on 
hypothesized mediating processes. (With one exception, noted 
later, all of these questions were answered on 7-point 
scales anchored with “none at all” and “very much.”) 
The first item, bearing on the subject’s awareness of 
affect, was “How much anxiety did you experience in 
the presence of this snake, as indicated by your bodily 
arousal?” 

The presence of noticeable affect was expected to 
lead to a self-assessment process. This process should 
be reflected by a momentary sense of concern as to 
whether one could do the behavior in question. Thus 
a second question posed to subjects was “When you 
were approaching the snake, how much did you notice a 
momentary sense of inadequacy and fearfulness?” 

The following three items were included to try to 
determine what subjects’ attention had been focused 
on during the approach attempt: “When you were 
approaching the snake, how much attention were you 
paying to your ‘chronic’ level of fearfulness toward 
snakes?”; “When you were approaching the snake, 
how much attention were you devoting to assessing 
whether or not your bodily arousal was increasing?”; 
and “When you were approaching the snake, how 
much attention were you paying to your goal, that is, 
picking up and holding the snake, and how well you 
were doing compared to that goal?” 

Finally, subjects were asked to rate their general 
level of fearfulness toward nonpoisonous snakes; this 
item was identical to the first pretest item and had the 
same five response options as had the pretest item. 

It will be noted that all of these self-report data 
were collected after the behavioral task had been 
completed. Thus results on these items should be 
interpreted with some caution. 

³ It might be argued that this experimental setting 
has the particular demand characteristic of reminding 
subjects that they had previously made a statement 
about whether or not they could do the approach task. 
If this were the case, however, one would expect a 
behavioral difference between confident and doubtful 
subjects in both mirror present and mirror absent 
conditions. As will be detailed later, this did not occur. 
It might be argued, on the other hand, that heightened 
self-focus makes subjects more sensitive to preexisting 
demand characteristics. There is evidence from three 
separate studies, however (Gibbons, Carver, Scheier, 
& Hormuth, in press; Scheier, Carver, & Gibbons, in 
press), that self-focus can exert precisely the opposite 
influence. The nature of demand characteristics in this 
type of research is an interesting theoretical problem, 
but full treatment of the problem is beyond the scope 
of this article. 


Results 


Awareness of Affect 


An analysis of variance was conducted on 
self-reported degree of anxiety felt in the 
presence of the snake (see Table 1) to 
determine whether the manipulation of self-focus 
had increased subjects’ awareness of their 
bodily anxiety responses.⁴ This analysis, as 
predicted, yielded only a main effect for 
self-awareness condition, F(1, 52) = 9.62, p < .004, 
with greater anxiety being reported by subjects 
in the mirror present than in the mirror absent 
condition. Group comparisons using the error 
term from the overall analysis of variance 
(Winer, 1962, p. 65f.) indicated that this 
effect was most reliable among doubtful 
subjects, t(52) = 2.92, p < .01, the comparable 
comparison not attaining significance 
among the confident subjects (t = 1.41, ns). 


Table 1 
Self-Reported Anxiety Experienced in the 
Presence of the Snake, Self-Reported 
Momentary Sense of Inadequacy and 
Fearfulness During the Approach Task, and 
Actual Level of Approach Toward the Snake 

                                         Condition 
Group                           Mirror present   Mirror absent 

Self-reported anxietyᵃ 
  Confident                          4.33            3.46 
  Doubtful                           5.25            3.56 
Self-reported sense of inadequacyᵃ 
  Confident                          4.20            3.85 
  Doubtful                           5.38            3.63 
Approach behaviorᵇ 
  Confident                          8.47            [illegible] 
  Doubtful                           6.38            8.3 

ᵃ Larger numbers indicate greater anxiety and felt 
inadequacy. ᵇ Larger numbers indicate greater 
approach. 
It was expected that awareness of anxiety 
and consequent self-assessment would be 
reflected in self-reports of having experienced a 
momentary sense of “inadequacy and fearful- 
ness.” An analysis of variance of those data 
(see Table 1) revealed that again there was a 
significant main effect for self-awareness 
condition in the expected direction, F(1, 52) 
= 6.05, p < .02. As with the anxiety 
self-reports, this effect was most reliable among 
doubtful subjects, t(52) = 2.85, p < .01; among 
confident subjects, t < 1. No other component 
of this analysis was statistically significant. 


Approach Behavior 


It was predicted that heightened self-focus 
would interact with subjects’ expectancies of 
being able to do the approach task, such that 
heightened self-attention would cause earlier 
withdrawal among doubtful subjects but not 
among confident subjects. An analysis of 
variance of approach scores yielded a margin- 
ally significant interaction, F(1, 52) = 3.67, 
p < .06, of the expected form (see Table 1). 
Subsequent comparisons indicated that doubt- 
ful subjects withdrew reliably earlier in the 
approach sequence when the mirror was 
present than when it was absent, t(52) = 2.48, 
p < .03. The approach scores of the confident 
subjects were unaffected by the presence of the 
mirror, however. 

There was a considerable range of 
within-cell variances on the approach measure. 
Although this diversity was not sufficient to 
render an analysis of variance entirely 
inappropriate by Fmax, we judged it desirable 
to gain additional information on approach 
scores through a nonparametric test. 
Therefore, a comparison between approach scores 
of the two doubtful groups was conducted by 
means of a Kruskal-Wallis “analysis of 
variance by ranks” (see, e.g., Hays, 1963, p. 
637ff.). When adjusted for ties in rank, this 
test also indicated that the scores of these 
groups did differ as predicted, H′ = 10.04, 
p < .01. 
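
For readers unfamiliar with the tie adjustment, the following sketch computes the Kruskal-Wallis statistic with the standard correction for tied ranks. It is a generic textbook formulation, not a reconstruction of the authors' computation, and any scores passed to it here are made-up illustrations because the raw approach scores are not reported.

```python
from collections import Counter


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H, adjusted for ties in rank.

    Each argument is a sequence of scores for one group. Tied scores
    receive the mean of the ranks they span.
    """
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)

    # Assign each distinct value the average of its rank positions.
    ranks = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j

    rank_term = sum(sum(ranks[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    ties = Counter(pooled)
    correction = 1.0 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    return h / correction
```

Approach scores are bounded at 9, so many subjects share the same score and the correction can matter. The function is undefined (division by zero) if every score in the pooled sample is identical.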


⁴ All analyses reported here included sex of subject 
as a variable. Because there was only one significant 
effect involving sex in the results of the study, however, 
data presented in all tables have been combined across 
gender. 
Table 2 
Self-Reports of Attention Paid to Chronic Level 
of Fearfulness, Self-Assessment of Bodily 
Arousal, and Behavior-Goal Comparison 

                                         Condition 
Group                           Mirror present   Mirror absent 

Chronic level of fearfulness 
  Confident                          4.67            4.23 
  Doubtful                           4.81            3.56 
Assessing degree of bodily arousal 
  Confident                          3.60            4.69 
  Doubtful                           4.31            3.13 
Behavior-goal comparison 
  Confident                          5.93            5.62 
  Doubtful                           5.06            6.06 

Note. Larger numbers indicate greater attention. 


Focus of Attention 


Three postexperimental items had been 
included to assess subjects’ recollections about 
what they had focused their attention on 
while attempting the approach task. Analysis 
of subjects’ self-reports of attention paid to 
their chronic level of fearfulness toward 
snakes yielded no significant difference among 
subject groups (Table 2), with only the self- 
awareness main effect component approaching 
significance (p = .10). 

An analysis of reports of attention paid to 
assessing whether or not bodily arousal was 
increasing (see Table 2) yielded two inter- 
actions. Of greatest interest was the interaction 
between pretest group and self-awareness 
condition, F(1, 52) = 5.72, p < .02. This inter- 
action indicated that confident subjects tended 
to report devoting less attention to assessing 
their arousal when their approach had occurred 
in the presence of a mirror than if there had 
been no mirror and that the opposite tendency 
had occurred among doubtful subjects. Neither 
of these tendencies was significant by itself, 
however (ts = 1.56 and 1.81, respectively). 
In addition, there was an unanticipated 
Sex × Pretest interaction, F(1, 52) = 10.99, 
p < .003, of the following form: Confident 
males (overall) reported devoting greater 
attention to assessing arousal than did 
confident females, t(52) = 2.18, p < .05, whereas 
doubtful males (overall) reported devoting 
less attention to assessing arousal than did 
doubtful females, t(52) = 2.64, p < .05. We 
have no ready explanation for this gender 
interaction. 

Analysis of self-rated focus on the 
behavior-goal comparison (see Table 2) revealed an 
interaction between pretest group and 
self-awareness condition that approached 
significance, F(1, 52) = 3.12, p < .08. In line with 
previous findings (Carver & Blaney, 1977), 
confident subjects tended to report more 
attention to this comparison, and doubtful 
subjects reported less attention, when the 
mirror had been present than when it had been 
absent. The fact that this interaction took a 
form opposite to that of the Self-Awareness 
× Pretest group interaction on attention 
devoted to assessing arousal suggests that 
directing attention to the behavior-goal 
comparison and directing attention to assessing 
arousal might be complementary processes. 
When self-rated attention to the behavior-goal 
comparison was reanalyzed as a proportion of 
the total attentional focus reported by subjects, 
the Self-Awareness × Pretest group 
interaction did achieve an acceptable significance 
level, F(1, 52) = 4.62, p < .04. Subsequent 
comparisons indicated that doubtful subjects 
in the mirror present condition reported 
having paid proportionally less attention to 
the behavior-goal comparison than did 
doubtful subjects in the mirror absent condition, 
t(52) = 2.40, p < .04; the opposite tendency 
among confident subjects was not reliable 
(t < 1). However, the overall correlation 
between these two self-reports was quite low 
(r = −.10). This casts some doubt on the 
notion that the two represent mutually 
exclusive responses among our subjects. 

These analyses of variance all used 
self-awareness condition as a predictor of the 
objects of subjects’ attention. It is also 
informative to examine briefly the relationships 
between perceived anxiety and those objects 
of attention. Among confident subjects, 
awareness of anxiety did not predict how much 
attention was devoted to searching 
for arousal cues, r(28) = −.07; among 
doubtful subjects, in comparison, there was a 
positive relationship between these variables, 
r(32) = .38, p < .02. These correlations 
differed reliably from each other (z = 1.71, 
p < .05, one-tailed). Additionally, confident 
subjects displayed a weak but positive 
relationship between perceived anxiety and attention 
devoted to the behavior-goal comparison, 
r(28) = .19, p = .16, whereas these variables 
were inversely related among doubtful subjects, 
r(32) = −.28, p < .06. These two correlations 
also differed significantly from each other 
(z = 1.65, p < .05, one-tailed). This overall 
pattern of correlations is quite consistent with 
our reasoning and with the results of the 
above analyses of variance. 
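
A comparison of two independent correlations of this kind is commonly made with Fisher's r-to-z transformation. The sketch below assumes that this is the test behind the reported z values and that the parenthesized values 28 and 32 are the group sizes; neither point is stated in the text.

```python
import math


def compare_correlations(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations.

    Uses Fisher's transformation z' = atanh(r), whose sampling variance
    is approximately 1 / (n - 3).
    """
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z2 - z1) / standard_error
```

Under those assumptions, `compare_correlations(-0.07, 28, 0.38, 32)` gives a value of roughly 1.7, in the neighborhood of the figure reported for the first pair of correlations. Small discrepancies would be expected from rounding of the published coefficients.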


Postsession Fear 


An analysis of subjects’ postexperimental 
ratings of their general levels of fearfulness 
toward nonpoisonous snakes revealed only a 
marginally significant tendency for doubtful 
subjects to report more fearfulness (M = 3.34) 
than confident subjects (M = 2.82), F(1, 52) 
= 3.21, p < .08. These means represent 
deviations of .34 and −.18, respectively, from 
the pretest levels of self-report fearfulness. 
Correlational analysis provided some evidence 
that these slight shifts in general fear of snakes 
were mediated by subjects’ observations of 
their autonomic and physical behavior during 
the approach task. That is, postsession 
fearfulness was positively correlated with self-reports 
of perceived anxiety during the approach 
task, overall r(60) = .55, p < .001, and was 
negatively correlated with actual approach 
level, overall r(60) = −.63, p < .001. 


Discussion 


The results of this study were quite 
consistent with our hypotheses. According to 
subjects’ self-reports, heightened self-focus 
[...] On the other hand, behavior was not 
directly influenced by self-focus, but rather by 
self-focus in interaction with expectancies. 
That is, doubtful subjects withdrew earlier in 
the approach sequence when self-focus was 
heightened than when it was not, but no such 
effect was exhibited by confident subjects. 
Self-focus also exerted an interactive influence 
on the self-reported content of attention during the 
approach attempt. Confident subjects, though 
aware of fear, apparently responded to that 
awareness by attempting to divert their 
attention from internal fear indices to their 
behavioral goal. In effect, they showed 
evidence of increased attempts to match their 
behavior to the salient standard, even though 
the approach task quite obviously placed a 
ceiling on behavior.⁵ Doubtful subjects, in 
contrast, reported increased focus on assessing 
their bodily arousal and decreased focus on 
the behavior-goal comparison as functions of 
heightened self-awareness. These cognitive 
effects were consistent with the doubtful 
subjects’ early withdrawal from the approach 
attempt. In sum, the results fit the pattern 
that had been predicted. Moreover, all of these 
findings conceptually replicate previous results 
of Carver and Blaney (1977b). 

It is worthy of some additional note that 
subjects’ self-reports of their fearfulness did 
not allow prediction of their behavior in this 
task. Our subject groups did not differ from 
each other in their ratings of their chronic 
fearfulness of snakes. Yet they differed be- 
haviorally from each other. The ratings of 
chronic fearfulness were validated to some 
degree by the finding that reports of situation- 
ally experienced fear did not differ reliably 
between the pretest groups (Table 1). But 
self-reported fear, even this situational fear, 
did not allow accurate prediction of 
behavior. Nor did even the self-reported 
[...] difference between subject groups 
[...] when attention was self-directed. This pattern 
of findings is exactly as would be predicted by 
the sequential analysis presented in the 
introduction of this article. 


⁵ Enhanced behavioral conformity is somewhat 
difficult to observe in this context because the desired 
behavior is clearly prescribed and because there is such 
a ceiling on performance. However, as discussed in the 
introduction, heightened conformity to standards as a 
function of self-attention has been demonstrated 
previously many times in other contexts (e.g., Carver, 
1974, 1975; Scheier et al., 1974; Wicklund & Duval, 
1971). 
The present theory and research have a 
number of other theoretical implications that 
deserve mention. The fact that our model is 
framed in terms of expectancies invites a 
comparison between it and Bandura’s (1977) 
recent analysis of fear-based behavior. There 
are some similarities between these models, 
but there are differences as well. According 
to Bandura, expectancy judgments make use of 
several sources of information: the person’s 
prior performances in similar behavioral 
situations; vicarious experiences, such as the 
observation of the behavior of another person 
in the situation; concurrent verbal persuasions, 
such as externally provided suggestions or 
self-instructions; and present arousal per- 
ceptions. This position on the contribution of 
arousal perceptions to expectancies was quali- 
fied elsewhere (Bandura, Adams, & Beyer, 
1977) by noting that attributional considera- 
tions play a role in the perceived meaning of 
the arousal. However, there still appears to 
be a difference between Bandura’s position and 
ours with regard to the role assumed for 
arousal perceptions. Our model and the data 
from the present study suggest that arousal 
perceptions may cue an assessment process, 
but that the outcomes of that assessment 
(expectancy judgments and eventual behavior) 
depend largely on the other sources of 
information.⁷ 

It should be noted, however, that we dealt 
here with an approach task of only moderate 
difficulty, which in turn produced only 
moderate levels of anxiety. It is conceivable 
that at extremely high levels of arousal, 
perceptions of autonomic activity can even- 
tually have an impact on the behavior of even 
confident persons via changes in their expec- 
tancies. Indeed, a hint of this more complicated 
function of arousal perceptions may be 
contained in the present finding that post- 
session self-rated fear of snakes was influenced 
by the anxiety cues that had been perceived 
during the approach attempt as well as by the 
level of approach attained. Perhaps subsequent 
expectancies might also have been altered by 
one or the other of these sources of information. 

One final point should be addressed briefly. 
A mirror was used to heighten self-attention 
in this experiment because it has been widely 
accepted as the “purest” manipulation of 
self-focus, that is, the least contaminated by 
other influences. For example, the presence of 
a videotape camera may raise in a subject’s 
mind the implication that unknown others 
either are observing now or may observe 
later. It will be noted, however, that the use of 
a mirror to induce self-attention in a fear- 
arousing context leaves open the possibility 
that any consequent heightened awareness 
of fear may occur via facial cues rather than 
internal fear cues. Although this possibility 
cannot be entirely discounted in the present 
study, other research (Scheier & Carver, 1977) 
indicates that enhancement of emotional 
experience occurs among subjects high in the 
disposition to be self-attentive as well as 
among subjects confronted with a mirror 
manipulation. This convergence of evidence 
suggests that the primary influence of the 
mirror, even in an emotional context, is to 
remind the person of himself or herself. This 
reasoning also suggests, however, that an 
important direction for future research will be 
to test our model’s predictions by means of 
individual differences in self-attention. 


⁶ There are also some similarities between both of 
these models and Stotland’s (1969) analysis of “hope.” 

⁷ Other more subtle conceptual differences between 
our analysis and that of Bandura are discussed 
elsewhere (Carver, in press). 
Sharing Secrets: 
Disclosure and Discretion in Dyads and Triads 


Ralph B. Taylor 
Center for Metropolitan Planning and Research 
Johns Hopkins University 


Clinton B. De Soto and Robert Lieb 
Johns Hopkins University 


To develop a more comprehensive picture of the variables that influence 
disclosure patterns, the impact of group size on sharing secrets was explored. 
Given Derlega and Chaikin’s suggestion that the existence of a closed dyadic 
boundary is a prerequisite for intimate self-disclosure, it was hypothesized that 
subjects would be more willing to disclose intimate information in a dyad than 
in a triad. The results of Experiment 1, which used a role-playing methodology, 
confirmed the hypothesis. The main effect of group size was observed over a 
range of roles and items of information. In addition to the main effect, group 
size interaction effects also indicated that the difference between dyad and triad 
disclosure rates increased with more intimate items of information and with 
more intimate roles. These interaction effects suggested that the importance of 
a closed dyadic boundary depends in part on the expected confidentiality of the 
interchange. In Experiment 2 the conversations of groups of acquaintances 
were recorded and rated for intimacy. As predicted, the conversations of dyads 
were more intimate than those of triads. Suggestions for understanding the 
intimate quality of dyads are discussed. 


Self-disclosure is a topic that has recently 
generated an enormous amount of research 
(cf. Cozby, 1973), and substantial support 
for what Jourard (1971) called the “dyadic 
effect: disclosure begets disclosure” (p. 66) 
has been obtained. Two hypotheses, modeling 
and social exchange, have been offered to 
explain this effect. The latter hypothesis has 
received the lion’s share of empirical support 
(Certner, 1973; Davis, 1976; Davis & Skinner, 
1974; Derlega, Harris, & Chaikin, 1973; 
Ehrlich & Graeven, 1971; Jones & Archer, 
1976; Worthy, Gary, & Kahn, 1969), and 
reciprocity has become a dominant theme in 
self-disclosure research. 

However, several limitations preclude drawing
conclusions on the generality of reciprocal
disclosure. First, research has been conducted
largely with pairs of strangers, and these
interactions may not be prototypical of
interactions between acquaintances or friends (cf.
Derlega, Wilson, & Chaikin, 1976).¹ Second,
almost all disclosure research has been con- 
ducted with dyads, and little is known about 
disclosure patterns in larger groups. For 
example, when interacting with more than one 
other, a person may find it difficult to maintain 
reciprocal disclosure vis-à-vis each other 
person in the group. 

1 For a contrary viewpoint, see Rubin (1974, 1975).

The term dyadic effect seems to suggest that there is something special,
apart from reciprocity, about disclosure in dyads. For example, dyads may
be more intimate because each person has the undivided attention of the
other and can give undivided attention to the other. Although this
interpretation of the dyadic effect has intuitive appeal, the little work
that has been conducted does more to undermine than support this appeal.
Rubin (1976) found that the presence or absence of a third party had no
influence on subjects’ responses to disclosures by an experimenter.
Spinner (Note 1) in his laboratory experiment also found no main effect of
group size on depth (intimacy) of disclosure. Drag (1969) found that
two-person discussion groups self-disclosed more than eight-person groups
but not more than four-person groups. Group size interaction effects have
indicated that people may disclose more or less readily in groups larger
than dyads, depending on the composition of the audience (Chelune, 1976)
and the mode of communication (Spinner, Note 1).

In sum, although results are mixed, evidence 
suggests that disclosure patterns in dyads, as 
compared to other groups, are not unique. 
Nonetheless, before concluding that there are 
no effects of group size on depth of disclosure, 
group size as a source of social influence 
should be examined. 

Blake (1958, p. 229) has suggested that 
situational sources of social influence include 
both (a) the “central stimulus” or the “im- 
mediate focus of attention” and (b) “context” 
factors. Group size qualifies as a context 
factor. In a self-disclosure situation, the 
recipient of information is the central stimulus. 
Research has indicated that recipient charac- 
teristics such as physical attractiveness (Brun- 
dage, Derlega, & Cash, 1977) and a reflective 
or aggressive style (Ellison & Firestone, 1973) 
do influence disclosure patterns.

Context factors have received little attention in self-disclosure
research, although they may be influential. For example, Johnson and
Dabbs (1976) and Rubin and Shenker (1978) observed that proximity
influenced amount of disclosure on low- and medium-intimacy topics.

We wished to further [...] the relationship between group size and
self-disclosure, since (a) context factors in general may exert
widespread influence on social [...] (Blake, 1958) and (b) group size
[...] established as a [...] interaction (Bales [...]; Hackman
& Vidmar, 1970; O'Dell, 1968; Slater, 1958;
Thomas & Fink, 1963). Exploration of the 
relationship between group size and self- 
disclosure uncovers two relevant theoretical 
considerations. First, Derlega and Chaikin 
(1977) have suggested that self-disclosure 
may be viewed as an interpersonal boundary 
regulation process (Altman, 1975; Altman & 
Taylor, 1973) and that a precondition of 
intimate disclosure is the existence of a closed 
dyadic boundary, that is, “a boundary within 
which it is perfectly safe to disclose to the 
invited participant and across which the self- 
disclosure will not pass” (p. 104). The im- 
portance of a closed dyadic boundary has 
been alluded to before. Simmel (see Wolff, 
1950) suggested that dyads have a special 
quality of intimacy that is not present in larger 
groups, and thus, dyads could share secrets 
more often and with more security. 

The second theoretical consideration is the 
layperson’s perception of disclosing. De Soto 
(1960; De Soto & Kuethe, 1959) found that 
“confides in” was perceived as a pair-forming 
relationship in contrast to “likes,” which was 
perceived as a group-forming relationship. 
Confides in was viewed by subjects as sym- 
metric and nontransitive or “essentially a 
pair-wise interchange” (De Soto & Kuethe, 
1959, p. 193). This view of disclosing is 
congruent with the notion of a closed dyadic 
boundary. 

To test Derlega and Chaikin’s (1977) 
suggestion that a closed dyadic boundary is 
a prerequisite for intimate disclosure, self- 
disclosure between acquaintances was ex- 
amined in dyads and triads. Since a closed 
dyadic boundary does not exist in a triad, 
we hypothesized that subjects would be 
less willing to make intimate disclosures in a 
triad than in a dyad. Although our hypothesis 
may seem intuitively obvious to some, it is 
important to bear in mind that previous 
research has by and large failed to yield main 
effects of group size on intimacy of disclosure.²


2 Predictions contrary to ours might be drawn from a deindividuation
perspective (Diener, Fraser, Beaman, & Kelem, 1976; Zimbardo, 1969), which
suggests that the presence of a group serves as a releaser for certain
behaviors. Contrary predictions could also be drawn from a sensitivity
training or encounter group perspective, which is based on the working
assumption that people are willing to [...] reveal intimate information
[...]

We also expected, in light of Chelune (1976) and Spinner (Note 1), that
dyad/triad differences in disclosure rates would increase when more
confidential interchanges were expected.

Although intimate disclosures between strangers can be elicited, we
decided to focus on intimate disclosures between acquaintances. Given that
intimate interchanges occur more frequently between acquaintances than
strangers, this focus might enhance the ecological validity of the
results.


Experiment 1
Method


Subjects. A specially designed self-disclosure questionnaire, described
later, was administered to 21 undergraduate volunteers enrolled in
psychology courses at Johns Hopkins University and Towson State
University.

Procedure. To manipulate group composition and information content in a
controlled fashion, a role-playing technique was used. Willingness versus
reluctance to disclose was measured. Subjects were asked to predict how
they would behave in hypothetical situations, not how they had behaved in
past situations, to assure similarity of situations across subjects. Items
contained specific information rather than general topics.

Six items of information were presented on separate pages of the
questionnaire with individual roles or pairs of roles listed underneath.
The subject was asked to rate how quickly he/she would disclose the item
to occupants of various roles or pairs of roles. Each item was encountered
twice in the questionnaire, once to assess disclosure to single roles and
once to assess disclosure to pairs of roles. The instructions exhorted
subjects to think of particular individuals that they knew to fill each
role. Disclosure was assessed across several role pairs (sister and best
friend of the opposite sex, boy/girlfriend and best friend of the same
sex, two liked professors, and acquaintance and roommate) to assess the
generality of any effects of group size. Role pairs were chosen on the
basis of pilot subjects’ suggestions about what role pairs they would be
likely to encounter in everyday settings. Finally, disclosure was assessed
across six items of information (see Table 1).³

[...] the person [them] as soon as I see him or her [them]”) to 4 (“I will
probably never tell the person [them]”) was used to measure disclosure.
Questions concerning disclosure to pairs of individuals were presented
[...] within each type of question. At the end of the questionnaire,
subjects rated the intimacy of each item of information and the closeness
of their relationship with occupants of each of the specific roles.


Results and Discussion 


The analysis of variance used a 2 × 4 × 6
(Group Size × Role Pairs⁴ × Information)
within-subjects factorial design. Although the 
use of a repeated measures design may possibly 
have aroused subjects’ suspicions, in debriefing 
no subjects expressed any awareness concern- 
ing the group size hypothesis. 

The hypothesized main effect of group size 
was observed, F(1, 20) = 11.4, p < .005 (ω²
= .01). Subjects indicated that they would
be more reluctant to disclose in a triad than 
in a dyad (see Table 1). This main effect was 
qualified by two interactions. 

A modest though significant Group Size
× Information interaction occurred, F(5, 100)
= 2.5, p < .05 (ω² < .01). The difference in
dyad versus triad disclosure rate increased as 
the intimacy of the information increased (see 
Table 1). (The rank-order correlation between 
dyad versus triad disclosure rate and intimacy 
ratings of the items was .74.) This interaction 
further supports the hypothesis. As the 
information under consideration became more 
intimate, the triad, in comparison to the dyad, 
was perceived as a less appropriate disclosure 
setting. Furthermore, relative reluctance to 
disclose in a triad was a general phenomenon, 
since it occurred with five out of the six items 
of information. 

Also, a significant Group Size × Role Pairs
interaction was obtained, F(3, 60) = 4.4, p
< .01 (ω² = .01). The difference in dyad
versus triad disclosure rate increased as the 
intimacy rating of the role pairs increased. 
There was a perfect rank-order correlation 


3 Only negative items were used, since pilot subjects
quickly disclosed items of positive information, regard- 
less of group size or group composition. Although 
Jones and Wortman (1973) have pointed out that 
disclosure of positive information can pose problems 
for the discloser (e.g., being perceived as a bragger), 
in our samples this was not the case. 

4 To determine disclosure to a role pair when the
members of the pair were encountered individually, 
the mean of the disclosure rate to the two individuals 
was used. 
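
The baseline described in this footnote can be illustrated with a minimal sketch. The ratings below are hypothetical placeholders on the questionnaire's 4-point scale (higher = more reluctance), not data from the study, and the function name is introduced here only for illustration.

```python
from statistics import mean


def dyad_baseline(rating_to_first: float, rating_to_second: float) -> float:
    """Mean reluctance rating when each member of a role pair is met alone."""
    return mean([rating_to_first, rating_to_second])


# Hypothetical ratings for one subject and one item (4 = "probably never tell").
alone_with_roommate = 2.0
alone_with_acquaintance = 3.0
with_both_present = 3.5  # rating given for the pair together (triad condition)

baseline = dyad_baseline(alone_with_roommate, alone_with_acquaintance)
# A positive difference means more reluctance to disclose in the triad.
triad_minus_dyad = with_both_present - baseline
print(baseline, triad_minus_dyad)
```

Averaging the two individual ratings gives each role pair a single dyad score that can be compared directly with the rating given when both people are present.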
Table 1
Relationships Among Group Size, Disclosure Rates, and Information in Experiment 1

                                                             Disclosure rate
Information                               Intimacy rating    Dyad     Triad

You have lost a large sum of money
  from your wallet                             3.29          2.24     2.21
You have been rejected from the
  college or graduate school of
  your choice                                  5.14          2.13     2.21
During vacation you got into a car
  accident due to excessive speed
  and the person who was with you
  suffered a broken leg                        5.19          2.32     [...]
Your parents are getting divorced              6.66          2.40     [...]
Your brother has been committed
  to a mental hospital                         7.29          2.72     2.98
You have discovered that you have
  incurable leukemia                           7.29          2.89     3.24

Note. On intimacy ratings, a higher score indicates a more intimate item; on disclosure rates, a higher score
indicates a slower disclosure rate (i.e., more reluctance to disclose).


between these two measures. Although it was 
based only on four pairs, this correlation does 
strongly suggest that a triad, compared to a 
dyad, became a less appropriate setting for 
disclosure when the individual was interacting
with more intimate contacts.⁵

Consistent with prior self-disclosure research
(Derlega & Chaikin, 1975; Jourard & Lasakow,
1958), main effects for role pairs (information
was revealed more quickly to more intimate
role pairs) and information (more intimate
information was disclosed less readily) and
a Role Pairs × Information interaction (difference
in disclosure rate to acquaintance and
roommate and to two liked professors
decreased as the intimacy of the items
increased) were obtained (all ps < .001; ω²s
= .22, .07, and .01, respectively).

As predicted, subjects thought they would
be more reluctant to disclose intimate information
in a triad than in a dyad. This main
effect and the group size interactions lend
[...] Derlega and Chaikin’s (1977) [...]
presence of a closed dyadic [...]
more important as the [...]
of the interchange inc[...]


Experiment 2 


The results of Experiment 1 are limited in 
that they were based on a role-playing method- 
ology. Subjects were asked to describe how 
they thought they would act in various 
situations. Behavioral observations of actual 
disclosure patterns in dyads and triads would 
increase our confidence in these findings. We 
hypothesized that in a fairly unstructured 
leaderless discussion, conversations among 
dyads would be more intimate (i.e., contain 
greater depth of self-disclosure) than
conversations among triads.⁶


5 The results of this experiment were almost wholly
replicated with a second larger sample (n = 26), using
different items and role pairs. In the replication the
predicted main effect of group size was again obtained
(p < .005). A significant Group Size × Information
interaction (p < .05), similar in interpretation to the 
one obtained in the first study, was also obtained. 

6 In Experiment 2 an instructional variable (discuss
intimate vs. discuss nonintimate topics) could have 
been included to replicate the group size interaction 
effect in Experiment 1. Although such an instructional 
variable was considered, we decided against it because 
(a) such an instructional variable would make the 
conversations less natural and (b) we felt it was appro- 
priate to replicate the group size main effect before 
seeking to replicate any interaction effects. 
Method


Subjects and procedure. Freshmen in on-campus housing at Johns Hopkins
University were solicited as volunteers by an experimenter. He explained
that he was conducting a survey to find out how groups of freshmen felt
about the quality of life on campus. The freshmen were randomly asked to
contact either one or two of their acquaintances and arrange a time for a
group discussion. The experimenter returned to the volunteer's room with a
tape recorder at the time appointed for the discussion.

A total of 35 freshmen (19 males and 16 females) agreed to participate.
Subjects were grouped into seven same-sex dyads (2 male and 5 female) and
seven same-sex triads (2 female and 5 male). Each grouping was composed of
acquaintances (i.e., friends the original contact had brought with him/her
to the group discussion).⁷

When he arrived at the dorm room, the experimenter explained that the
purpose of the discussions was to find out how groups of subjects felt
about the quality of life on campus. Fully informed consent was ob[...]

After 9-12 minutes of conversation, the experimenter returned to the room
and terminated the discussion. [...] acquaintance, [...] disclosure across
the four role pairs was summed to establish a general rate of disclosure
of [...] information for each subject. After completing the questionnaire,
subjects were fully debriefed.
[...] first assessed with no variables held as covariates
and then with sex composition, mean
level of acquaintance, and mean disclosure
habits held as covariates.⁹

Intimacy of discussions was strongly influenced
by group size, F(1, 12) = 21.2, p
< .001 (ω² = .59). As predicted, the discussions
of triads were less intimate than the
discussions of dyads (see Table 2). The 


7 Some previous self-disclosure research has observed
that females disclose more readily than males (cf.
Cozby, 1973). Thus, one might argue that by
overrepresenting females in the dyad condition we are
biasing the sample in favor of our hypotheses. However,
in this experiment sex composition was not correlated
with any of th[...] or behav[...]


8 The conversations were transcribed. In the transcripts, there was no
indication of the number of people in a particular group. Using a 4-point
intimacy scale, each conversation was rated independently by one of the
experimenters and by an individual who [...]ed using the [...] 1962).
Reliability between the two sets of ratings was .96. The mean ratings
[...] The estimated reliability of these mean ratings was .98.
The intimacy scale used was as follows: 1 = little or no disclosure
(discussants shift focus away from selves), 2 = superficial or
conventional disclosure (reveal trite or peripheral aspects of selves),
3 = personal disclosure (reveal specific experiences concerning more
personal topics), 4 = intimate disclosure (specific experiences concerning
more personal topics are discussed, and subjects clearly discuss their
responses to these experiences).
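
The step from an interrater reliability of .96 to an estimated reliability of .98 for the mean ratings is what the Spearman-Brown formula yields for the average of two raters. The surviving text does not name the formula used, so the sketch below rests on that assumption; the function name is illustrative.

```python
def spearman_brown(r: float, k: int = 2) -> float:
    """Reliability of the mean of k parallel ratings, given single-rating reliability r."""
    return k * r / (1 + (k - 1) * r)


# Two raters with an interrater correlation of .96, as reported in the footnote.
estimated = spearman_brown(0.96, k=2)
print(round(estimated, 2))
```

Under this assumption, averaging the two raters' scores raises the reliability of the resulting intimacy score only slightly, because the single-rater agreement was already very high.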

9 The analysis of covariance (ANCOVA) used here represents, in the
terminology of Evans and Anastasio (1968), Usage 2 of ANCOVA:
“‘Adjustment’ of treatment means for differences between intact groups,
when the covariate is unrelated to the treatments” (p. 227). The
covariates used in this analysis (sex composition, level of acquaintance,
and disclosure habits) were all [...]

[...] These correlations are small. Evans and Anastasio note that “small
correlations such as might occur . . . in the context of Usage 1 or 2,
might not have serious consequences” (p. 233). Thus, in this analysis the
usage of ANCOVA was appropriate and was not seriously biased by the small
violations of the assumptions of independence.
Table 2
Behavioral and Perceived Effects of Group Size in Experiment 2

                       Dependent variable
             Intimacy of        Amount learned
Condition    discussion         about others

Dyad            2.79                3.36
Triad           1.43                2.24

Note. Intimacy of discussion means were derived
from ratings of the transcripts; a higher mean indicates
a more intimate discussion. Amount learned
means were obtained from subjects’ answers to the
question: “As a result solely of this discussion, how
much do you think you have learned about the other
people in the group?,” with a higher mean indicating
more learned.


influence of group size remained strongly 
significant with sex composition, acquaintance 
level, and disclosure habits held as covariates, 
F(1, 9) = 12.0, p < .01. 

The covariate of acquaintance influenced 
the intimacy of the discussions, although the 
effect was much smaller than the effect of 
group size. Better acquainted groups had 
more intimate conversations, F(1, 9) = 5.5, 
p < .05.

On the postdiscussion questionnaires, the amount subjects learned about
other group members was influenced by group size, F(1, 12) = 5.4, p < .05
(ω² = .30), with members of dyads learning more about others in the group
than members of triads (see Table 2). This result is consistent with the
finding that dyads had more intimate discussions than triads. The effect
of group size on amount learned persisted, with the three covariates held
constant, F(1, 9) = 7.2, p < .05. Furthermore, amount learned due to the
discussion was solely a function of group size. ([...] covariates,
Fs < 1.) Amount learned [...] the discussion was independent of [...]
dimensions assessed by the [...] highest correlation between amount [...]
and other questions about the interac[...]

[...] discussions as less awkward and more [...] than did members of
triads. These effects, although in the hypothesized direction, [...] not
attain the accepted .05 level of statistical significance.
On the questionnaire results, there were 
two significant effects due to covariates. First, 
groups composed of members with high 
disclosure habits perceived the group discussions
as less awkward, F(1, 9) = 14.4,
p < .01. Second, the perceived intimacy of
the topic discussed was influenced by the level 
of acquaintance of group members, with 
better acquainted groups perceiving their 
discussion topic as more intimate, F(1, 9)
= 14.0, p < .01.


General Discussion 


Two experiments using dissimilar method- 
ologies have supported the hypothesis that 
individuals, when interacting with acquain- 
tances, are more likely to disclose intimate 
information in a dyad than in a triad. These
results support Derlega and Chaikin’s (1977)
suggestion that intimate disclosure depends on
a closed dyadic boundary, that is, the revealer 
perceiving that his/her message is safe with 
the recipient. Disclosure was less intimate in 
triads in which the closed dyadic boundary 
did not exist. Furthermore, both experiments 
provided strong evidence that disclosing 
behavior closely corresponds to the perceived 
nontransitive nature of “confides in” (De Soto 
& Kuethe, 1959). Thus, sharing secrets is not
only perceived as a pairwise interchange 
dividing people into dyads but, behaviorally, 
confiding actually appears to operate in this 
fashion. (See also Rubin & Shenker, 1978.) 

Although there are other differences between 
dyads and triads in terms of role conflict 
(Brown, 1965), unanimity of mood (Heider, 
1958), and coalition formation (Mills, 1953), 
the pattern of findings in Experiment 1 (see 
also Footnote 5) strongly supports our interpretation
of the findings. With more intimate
items of information and more intimate role 
pairs, the difference between disclosure rates 
in dyads and triads increased. Although the
magnitude of these interaction effects was 
not large, the effects were congruent with 
prior research (Chelune, 1976; Spinner, Note 
1). Furthermore, these interaction effects help 
clarify the role of the closed dyadic boundary. 
It appears that the salience of the boundary 
increases as the confidentiality of the
interchange increases.

In a public setting in which strangers are 
interacting and disclosure is likely to be non- 
intimate, our theoretical rationale would lead 
us to expect no effect of group size on
disclosure. This was the result obtained by
Rubin (1976) in his field experiment. Although 
it is hazardous to make inferences from 
negative findings, Rubin’s failure to obtain a 
group size effect in a low disclosure setting is in 
line with the perspective presented here. 

One limitation of the present study is that 
the effects of group size were explored only 
over a narrow range. However, this limited 
range of group sizes was a legitimate focus, 
since the differences between dyads and triads 
are more dramatic than the differences between 
other pairs of group sizes (cf. Hackman & 
Vidmar, 1970; O’Dell, 1968). 

The results of the present study go some 
distance toward understanding the inherent 
quality of intimacy that Simmel (Wolff, 1950) 
attributes to dyads. However, the reasons for 
this quality are not as yet fully illuminated. 
Our expectation is that the intimacy of dyads 
is due to particular perceptual properties 
(e.g., perceived climate) and structural properties
(e.g., role complexity and intensity) that
covary with group size. 

Furthermore, the present study highlights 
a host of questions that should be attended 
to by researchers in self-disclosure. What is 
the relationship between reciprocity and group 
size? Although reciprocal disclosure is straight- 
forward in a dyad and involves only one input 
and one output channel, it is probably more 
complex in larger groups. In a triad each 
individual receives two sets of incoming 
messages and can send two different sets of 
messages, although everyone “overhears” 
everyone else’s messages. In a triad, people 
may avoid the difficulties of maintaining two 
separate channels by disclosing at a uniform 
low level of intimacy. The apparent robustness 
of Jourard’s dyadic effect and the reciprocity 
underlying it may in part be due to the past 
research focus on dyads. 
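
The channel counts in this paragraph follow a simple rule, sketched below. The function names are illustrative, and the extension to groups larger than a triad is an extrapolation of the paragraph's reasoning rather than something tested in the study.

```python
def channels_per_person(group_size: int) -> int:
    """Number of other members each person both sends to and receives from."""
    return group_size - 1


def directed_channels(group_size: int) -> int:
    """Total one-way sender-to-receiver channels in the whole group."""
    return group_size * (group_size - 1)


# Dyad: one input and one output channel per person; triad: two of each.
for n in (2, 3, 4):
    print(n, channels_per_person(n), directed_channels(n))
```

The total number of one-way channels triples when a third person joins (2 versus 6), which is one way to see why reciprocity becomes harder to maintain outside a dyad.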

Also, given that context factors such as 
group structure and central stimulus charac- 
teristics such as qualities of the target both 
influence disclosure patterns, what is the
relative impact of each of these on disclosure?
Consideration of this question should lead 
researchers to a more comprehensive under- 
standing of the determinants of self-disclosure. 


Reference Note 


1. Spinner, B. Privacy maintenance and self-disclosure. 
Paper presented at the 86th annual convention of 
the American Psychological Association, Toronto, 
Canada, August 1978. 

Expressive Tendencies
and Physiological Response to Stress 


Clifford I. Notarius 
Catholic University of America 


This study assessed the effects of natural expressive tendencies on physiological 
response to stress. Male undergraduates were unobtrusively observed while 
watching a stressor videotape. On the basis of the subjects’ facial responsive- 
ness to the film, a group of 23 natural expressers and 22 natural inhibitors 
were selected and exposed to a threat of shock situation during which heart 
rate, respiration rate, skin conductance, and facial expressions were monitored. 
In accord with the discharge model of emotion, natural inhibitors were less 
facially expressive and more physiologically reactive to the shock threat than 
were natural expressers. The results also demonstrated that overt expressivity 
is stable over time and situation. On personality measures, natural expressers 
scored significantly higher on Mehrabian’s empathic tendency scale, thus sup- 
porting the efficacy of this paper-and-pencil instrument as a measure of non- 
verbal responsiveness. The two groups did not differ on measures of self-esteem, 
introversion-extraversion, or locus of control. The results are discussed in 
terms of the discharge model as a descriptive metaphor and not a causal theory. 


Understanding the relationship between 
overt emotional expression and physiological 
response is necessary for a comprehensive 
theory of emotion as well as for an adequate 
conceptualization of behavior change following 
any of several expressive therapies (Nichols & 
Zax, 1977). Facial display is one component 
of emotional expression that has been shown 
to influence both physiological reaction pat- 
terns and the subjective experience of emotion 
(e.g., Izard, 1978; Laird, 1974). Although there
is agreement that facial display is an important 
component of emotional response, there is con- 
flicting evidence about whether facial display 
attenuates (Buck, Miller, & Caul, 1974; Buck,
Savin, Miller, & Caul, 1972; Lanzetta & Kleck,
1970) or augments (Lanzetta, Cartwright-Smith,
& Kleck, 1976) physiological response
to emotionally arousing situations.


This article is based on data collect

Lanzetta and Kleck (1970), Buck et al. (1972), and Buck et al. (1974) used
an encoding/decoding paradigm to study the relationship between facial
displays of emotion and physiological response to arousing situations. In
these studies, a sender subject was exposed to an emotionally arousing
event while an observer subject watched the subject’s face on a video
monitor and attempted to decode the sender’s expression. The sender’s skin
conductance or heart rate or both were monitored throughout the stimulus
presentation. The results of these studies indicated that the observer
subjects were most accurate at decoding the facial expressions of sender
subjects who were least physiologically reactive to the eliciting stimuli;
conversely, observer subjects were least accurate at decoding the facial
expressions of sender subjects who were most physiologically reactive. The
results of these studies, which imply that reduced decoding accuracy was
due to relatively fewer facial expressions, have been interpreted as
support for the discharge model of emotion according to which facial
expressions are associated with attenuated physiological response to
emotional stimuli.

Unfortunately, the encoding/decoding paradigm does not provide a strong
empirical test of the discharge model. In the studies that used this
paradigm, the assessment of overt expressivity was dependent on the
measure of decoding accuracy and the implication that this measure
reflects overt expressivity. A more direct test of the relationship
between facial expressiveness and physiological reactivity would be to
place subjects in an arousing situation with trained raters recording the
extent of their facial expressiveness. This methodology would enable an
objective assessment of a subject’s tendency to facially display a
response to an emotionally arousing situation. Comparison of the
objectively rated facial display data with recorded physiological response
data would allow a test of the discharge model in situations in which
natural response tendencies would most likely be evidenced.

An alternative approach is possible in which an attempt is made to
directly manipulate the extent of facial responsiveness. Lanzetta et al.
(1976) tested the discharge model in a well-designed study incorporating
experimentally manipulated facial displays and independent assessment of
expressivity during exposure to electric shock. Lanzetta et al. instructed
subjects to either pose an intense expressive reaction or to pose no
reaction to electric shock and found that subjects who were instructed to
pose no reaction were significantly less physiologically reactive to the
shock than subjects who had posed an intense reaction. These results are
opposite to predictions based on the discharge model. Lanzetta et al.
interpreted their results as support for proprioceptive feedback models of
emotion (Gellhorn, 1964; Izard,


The discrepancy between Lanzetta et al.’s 
(1976) findings and previous support for the 
discharge model (Block, 1957; Buck et al.,
1974; Buck et al., 1972; Jones, 1950; Lanzetta 
et al., 1970; Learmonth, Ackerly, & Kaplan, 
1959) may stem from differences across studies 
in the operationalization of emotional expres- 
sivity. With the exception of Lanzetta et al.’s 
(1976) study, all previous investigations that 
found support for the discharge model allowed
subjects access to their natural emotional re- 
sponses during exposure to an eliciting situa- 
tion. Thus, as Lanzetta et al. suggest, short- 
term, experimentally manipulated control of 
emotional expression may lead to a positive 
relationship between facial expression and 
physiological reactivity, whereas natural re- 
sponse patterns may be characterized by the 
discharge model. 

One possible mediator of this short-term, 
positive relationship may be patterns of general 
somatic activity that are activated by instruc- 
tions to “be responsive.” To the extent that 
subjects become more physically active, one 
would expect such activity to be accompanied 
by increases in heart rate and skin conductance. 
Conversely, if instructions to hide responses 
produce a decline in somatic activity and a 
more general state of relaxation, decreases in 
cardiac and electrodermal activity might 
follow. In any event, it seems likely that results 
derived from experiments that involve natural 
expressive tendencies will differ from the results 
obtained from direct manipulation of expres- 
sivity, insofar as the two methodologies reveal 
different aspects of the relationship be- 
tween facial expressivity and physiological 
responsiveness. 

The purpose of the present study was to 
determine the relationship between facial dis- 
plays of emotion and physiological reactivity 
to stress in subjects whose natural expressive 
styles were unconstrained. The experimental 
design incorporated a behavioral coding of 
facial expressivity (Mehrabian, 1972) and 
multiple measures of physiological reactivity 
(heart rate, respiration rate, and skin con- 
ductance). Based on the consistency of findings 
from studies in which subjects were allowed 
their natural emotional responses, it was pre- 
dicted that natural expressers would be less 
physiologically reactive to an emotionally 
arousing situation than natural inhibitors. 

A secondary purpose of the study was to 
explore personality correlates of expressive 
tendencies. Buck et al. (1974) reported that 
expressers were higher in self-esteem and were 
more extraverted than inhibitors. In the 
present study, subjects were assessed on self- 
esteem, introversion-extraversion, and em- 
pathic tendency. 
Subject Selection


To test the experimental hypotheses, it was necessary 
to preselect a group of natural inhibitors (persons who 
show little or no facial responsivity to an emotional 
situation) and a group of subjects who were natural 
expressers (people who show high levels of facial re- 
sponsivity to an emotional situation). Seventy-six male 
undergraduates enrolled in introductory psychology 
classes were recruited for the preselection phase of the 
experiment; they were fulfilling a course requirement. 

To establish natural expressive tendencies, subjects 
were shown an industrial accident film that had been 
demonstrated to be emotionally arousing (Lazarus, 
Opton, Nomikos & Rankin, 1965). The subjects were 
told that they would view a brief film and then be asked 
to complete several questionnaires. The questionnaires
were Mehrabian’s empathic tendency scale, Eysenck’s
Introversion-Extraversion Scale, Rotter’s Locus of
Control Scale, and Janis and Field’s Self-Esteem 
Inventory. 

Subjects viewed the industrial accident film in groups 
of four to six while three coders (two undergraduates
and the first author) observed through a one-way
mirror. Throughout the film, coders counted the number
of facial expressions occurring in 30-sec time periods and
entered this number on a coding sheet. Criteria for de-
termining the occurrence of a facial expression were
based on a procedure described by Mehrabian (1972).
Using this procedure, any changes from a neutral display
to a nonneutral display and back to a neutral display
constituted one facial expression. Gestural behaviors, 
such as slight movements of the eyebrows or touching 
the face with the hand, were not counted as facial 
expressions. Although more elaborate qualitative coding 
schemes exist for classifying facial displays in terms of 
discrete emotional states (Ekman & Friesen, 1975), it 
was decided to use a relatively simple quantitative 
coding system in this study to determine if a general 
relationship between natural facial expressiveness and 
physiological reactivity could be detected using a stress 
paradigm. 
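
The counting rule described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the coders in the study tallied expressions by hand, so the idea of a sampled sequence of display labels, the label names, and the function `count_expressions` are all hypothetical.

```python
from typing import Iterable

NEUTRAL = "neutral"        # hypothetical label for a neutral display
NONNEUTRAL = "nonneutral"  # hypothetical label for any nonneutral display


def count_expressions(states: Iterable[str]) -> int:
    """Count complete neutral -> nonneutral -> neutral cycles.

    A display that never returns to neutral, or that is already
    nonneutral when observation starts, is not counted.
    """
    count = 0
    seen_neutral = False   # True once a neutral display has been observed
    in_expression = False  # True while a nonneutral display is in progress
    for state in states:
        if state == NEUTRAL:
            if in_expression:
                count += 1  # the display has returned to neutral
                in_expression = False
            seen_neutral = True
        elif state == NONNEUTRAL and seen_neutral:
            in_expression = True
    return count


observed = ["neutral", "nonneutral", "nonneutral", "neutral",
            "neutral", "nonneutral", "neutral"]
print(count_expressions(observed))
```

Under this sketch a long nonneutral display still counts as a single expression; how duration is handled is a separate design choice.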

In addition to recording the number of facial expres-
sions, coders assigned a subjective rating on a 10-point 
scale to each subject after the film, with 0 indicating 


a subjective rating of 1 or less were classified as natural
inhibitors. On this basis, 23 natural inhibitors and 22
natural expressers were selected for continuation in the
experiment.


Procedure 


A single laboratory appointment was arranged for
each of the 45 subjects selected for continuation in the
experiment. As each subject arrived at the laboratory,
he was seated in a comfortable chair in front of a video
monitor and an unobtrusively placed video camera. The
subject was told that he would be viewing a film on the
monitor and that the experimenters were interested in
people's physiological reactions to the film. Physiologi-
cal sensing devices to detect heart rate, respiration rate,
and skin conductance were then attached to the subject,
and he was told that the experimenter (one of two re-
search assistants blind to the experimental hypotheses)
was going to set up the film and check to see that the
physiological recording equipment was working
properly.

Just prior to leaving the subject, the experimenter
informed the subject that he would soon see a “digital
voltmeter” displayed on the monitor via a closed-
circuit TV camera placed in the adjoining room. The
subject was told that the “voltmeter” indicated the
voltage flowing through his body as measured by a
sensor (actually a ground clip) attached to his ear. The
experimenter explained that the meter had an internal
circuit to sense rapid increases in voltage and that if the
internal circuit was activated, “9999” would begin to
flash as a warning that a dangerously high level of
voltage was present and that a strong shock could
result. The subject was then asked to monitor the
voltmeter and to signal the experimenter immediately
if 9999 began to flash by ringing a buzzer conveniently
placed to the subject’s right. The experimenter asked
if the subject had any questions and then left, telling
the subject that he had to check the recording
equipment.

In reality, the monitor displayed a videotape that
was identical for all subjects and on which was recorded
a small digital light display of four numbers. The
programmed display presented “0000” for 4 min, lead-
ing to small increases over the next 30 sec until 9999
began to flash on and off for 1 min. The flashing display
of 9999 constituted the threat of shock. As soon as the
subject signaled the onset of 9999, the experimenter
began to jiggle the wires in the adjoining room as if to
attempt to locate the “problem.” A loud thump on the
wall was produced to accompany the return of 0000 to
the video monitor; then 0000 remained on the monitor
for 4.5 min (the poststimulus period). The experimenter
entered the subject room and apologized for the flashing
9999 by explaining that this was the first time this had
happened, that the cause was located and corrected,
and that the subject was in no danger. The subject was
then shown a second stressor film and exposed to an
additional arousing situation before being debriefed.¹


1 Only data gathered in response to the threat of
shock are reported. Although all subjects watched the
film and were exposed to a third eliciting situation,


Monitoring physiological responses. Skin conduc-
tance was measured be-
tween two Beckman electrodes placed on the volar sur-
faces of the middle segments of two fingers of the right


stretching a mercury strain gauge 10 inches (25.4 cm)
in length above the subject’s waist. Physiological signals
were passed through a Grass Model 7 polygraph and
routed through the analog-to-digital converter of a
PDP-11 computer for on-line processing.

Monitoring facial expressivity. Facial expressivity
was coded by the same coders using a similar procedure
to that used in the preselection of subjects, with the
exception that only one coder rated each subject.² The
coder watched the subject’s face on a closed-circuit
video monitor located in a separate room and pressed
a button for the duration of each nonneutral facial dis-
play. In the case of expressions maintained over 5 sec,
the computer automatically counted an additional ex-
pression. Coders were blind as to which group the sub-
ject was assigned.
Videotape functions. The videotape served two func-
tions, (a) to present the threat of shock stimulus to the
subject and (b) to signal the end of the trial periods to
the computer. An inaudible signal (17,000 kHz) placed
on the videotape at spaced intervals (30 sec during
stimulus presentations and 90 sec during baseline
periods) momentarily closed a decoding switch, signal-
ing the computer to begin a new trial period. The video-
tape continued for the duration of the experiment to
ensure identical timing of the experiment for each
subject.
Dependent measures. The computer was used to
obtain the following dependent variables: (a) heart rate
(interbeat interval): time, in msec, between heart
beats; (b) respiration rate (intercycle interval): time,
in msec, between inspiration and expiration cycles; (c)
skin conductance, in micromhos;
and (d) facial expression: the number of facial expres-
sions coded.


Results and Discussion 

ty was assessed by 
occurring in a 
inute stimulus 


i tstimulus period was 
UREE od to de- 


Í responses OCC 
Í period. A 4.5- 
compared with 


measures were entered into an
analysis of variance with two groups (natural
inhibitors and natural expressers) and three


it was decided that the within-subject design used in this
study did not permit valid interpretation of physio-
logical data gathered following threat of shock.


EXPRESSIVE TENDENCIES 
trials (prestimulus, stimulus, and poststimu-
lus). Planned comparisons by t test were used
to test specific reaction patterns of natural 
expressers and natural inhibitors. 

Due to equipment malfunction, heart rate 
data were lost for one natural inhibitor across 
all stimulus periods, respiration rate data were 
lost for one natural expresser during the stimu- 
lus period, and skin conductance data were lost 
for one natural expresser across all stimulus 
periods and for one natural expresser during 
the prestimulus period. The remaining avail- 
able data from these subjects were included in 
the analyses. 

Stability of the inhibitor-expresser dimension. 
The preselection of subjects was intended to 
yield two groups of subjects who differed in the 
tendency to display facial expressions in re- 
sponse to emotionally arousing situations. In 
the absence of evidence concerning the stability 
of the tendency towards overt facial displays 
of emotion, it was hoped that a selection pro- 
cedure based on behavioral observations of 
expressions in response to a stressor situation 
would be most likely to reliably identify the 
two groups. Examination of the mean number 
of facial expressions displayed in response to 
the threat of shock stimulus indicated that 
natural expressers averaged significantly more 
facial expressions than did natural inhibitors, 
t(86) = 2.53, p < .025; the mean numbers of
expressions were 1.34 and .87, respectively. 
This result confirms the preselection classifica- 
tion of subjects and demonstrates stability of 
expressive tendencies across two independent 
situations (film and threat of shock) and across 
time (approximately 9 weeks separated the 
preselection film and the threat of shock 
situations). 

Assessment of the arousing situation. Since
the threat of shock was a novel stressor, it was
important to determine whether subjects found
the situation arousing. A significant trials effect


2 Even though several months had passed since the 
preselection phase of the experiment, every attempt was 
made to ensure that a given subject’s facial expressions 
were coded by a different rater in the preselection and 
laboratory phases of the research. Use of separate 
groups of coders in both phases of the study and having 
two coders per subject in the laboratory phase (to allow 
continuation of reliability monitoring) would be de- 
sirable in future work. 
Table 1
Physiological Reactivity Patterns for Inhibitors and Expressers

Subjects     Measure                        Prestimulus   Shock threat   Poststimulus

Inhibitors   Heart rateᵃ                    865           818*           882
             Respiration rateᵇ              3648          3298*          3621
             Skin conductance (micromhos)   12.7          16.6*          15.7*
Expressers   Heart rateᵃ                    823           820            838
             Respiration rateᵇ              3593          3530           3687
             Skin conductance (micromhos)   10.8          14.5*          12.9*

ᵃ Time in msec between heart beats.
ᵇ Time in msec between inspiration and expiration cycles.
* Significantly different from prestimulus level (p < .001).


was found on all dependent measures, indicat- 
ing that the threat of shock was a physiologi- 
cally arousing situation; for facial expression, 
F(2, 86) = 35.12, p < .001; for heart rate, 
F(2, 84) = 11.21, p< .001; for respiration 
rate, F(2, 85) = 10.61, p < .001; and for skin 
conductance, F(2, 83) = 49.49, p< .001. 
Anecdotal data gathered during the debriefing 
also indicated that the subjects found the 
situation arousing and believable. One subject, 
for example, related, “When the nines began 
to flash I really thought I was in for it.” 
Reactivity patterns in natural expressers and
natural inhibitors. On measures of heart rate
and respiration rate, natural expressers were
less reactive to the threat of shock stimulus
than were natural inhibitors. Natural inhibi-
tors showed a significant heart rate increase,
t(84) = 3.76, p < .001, and a significant
respiration rate increase, t(85) = 4.34, p
< .001, from the prestimulus to the stimulus
period, whereas natural expressers showed no
significant change from prestimulus levels in
heart rate, t(84) = .28, ns, or respiration rate,
t(85) = .78, ns. During the poststimulus period,
heart rate and respiration rate of both groups
returned to prestimulus levels. Heart rate and
respiration rate means are presented in Table 1.
These results are consistent with the discharge
model.
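
Because heart rate and respiration rate are reported in Table 1 as intervals in msec, a shorter interval corresponds to a faster rate. The following minimal sketch converts the Table 1 heart rate means to events per minute; the function name `interval_ms_to_per_minute` and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
def interval_ms_to_per_minute(interval_ms: float) -> float:
    """Convert a mean interval between events (msec) to events per minute."""
    return 60_000.0 / interval_ms


# Mean interbeat intervals (msec) taken from Table 1.
heart_rate_ibi = {
    "inhibitors": {"prestimulus": 865, "shock threat": 818, "poststimulus": 882},
    "expressers": {"prestimulus": 823, "shock threat": 820, "poststimulus": 838},
}

for group, periods in heart_rate_ibi.items():
    for period, ibi in periods.items():
        bpm = interval_ms_to_per_minute(ibi)
        print(f"{group:<11} {period:<13} {ibi} msec = {bpm:.1f} beats/min")
```

On this conversion, the inhibitors' drop from 865 to 818 msec is a rise of roughly 4 beats per minute, whereas the expressers' interval is essentially unchanged. The same function applies to the respiration intercycle intervals.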

Analysis of the skin conductance measure
indicated that both groups were similarly re-
active to the threat of shock; natural inhibitors
showed a significant increase in skin con-
ductance from prestimulus levels, t(83) = 7.11,
p < .001, as did natural expressers, t(83)
= 6.67, p < .001. During the poststimulus
period, each group maintained a significantly
higher skin conductance level relative to the
prestimulus period, t(83) = 5.49, p < .001, for
natural inhibitors; t(83) = 3.89, p < .001, for
natural expressers. Skin conductance means are
presented in Table 1. Skin conductance reac-
tivity patterns did not discriminate natural
expressers from natural inhibitors, as would be
predicted on the basis of the discharge model.

Resting levels of natural inhibitors and natural
expressers. Although reactivity patterns from
the prestimulus to the stimulus period are the
critical test of the discharge model, baseline
differences occurring in the prestimulus period
were also assessed. Natural expressers displayed
a faster heart rate during the prestimulus
period than did natural inhibitors, t(84)
= 3.35, p < .001, whereas natural inhibitors
displayed a significantly higher skin conduc-
tance level than did natural expressers,
t(83) = 3.49, p < .001. The two groups did
not display any prestimulus respiration rate
differences, t(85) = .68, ns.

These baseline differences, observed in sub-
jects who had just entered the laboratory and
who were told only to monitor a “voltmeter,”
were unexpected. Furthermore, the pattern of
the baseline differences indicated that one
group was not simply more aroused to the
general laboratory situation than the other
group, since heart rate was elevated for natural
expressers whereas skin conductance was
elevated for natural inhibitors. These findings,
together with the lack of correspondence be-
tween skin conductance reactivity and heart
rate and respiration rate reactivity, suggest 
that expressive tendencies may be associated 
with differences in individual physiological re- 
sponse stereotypy (Roessler & Engel, 1974) be- 
tween expressers and inhibitors. This inter- 
pretation is offered tentatively, pending replica- 
tion and extension of the results. 

Personality correlates of expressive tendencies. 
Natural expressers were shown to be signifi- 
cantly more empathic on Mehrabian’s em- 
pathic tendency questionnaire than were 
natural inhibitors, F(1, 43) = 5.37, p < .025. 
In a series of studies, Mehrabian (1972) pre- 
sented evidence that overt responsiveness is the 
primary characteristic of subjects who score 
high on this questionnaire. Thus, the empathic
tendency questionnaire appears to have dis- 
criminative validity. 

The two groups did not significantly differ 
on measures of introversion-extraversion, self- 
esteem, or locus of control. These results are 
contrary to the findings of Buck et al. (1974). 


Conclusion 


The results of the present study are con- 
sistent with the discharge model; expressive 
subjects were significantly less physiologically 
reactive to an emotional stressor than were non- 
expressive subjects. Furthermore, the stability 
of the inhibitor-expresser dimension over time 
and situation suggests an enduring attribute 
of affective style. However, it would be inap- 
propriate, on the basis of these data, to con- 
clude that facial expression functions as a 
causal agent in determining the parameters of 
physiological response to stress. More ap- 
propriate experimental designs for examining 
the causal relationship in which expressiveness 
is directly manipulated (e.g., Lanzetta et al., 
1976) have produced results that are discrepant 
with the discharge model. This discrepancy be- 
tween the results of studies in which subjects 
were allowed their natural response patterns 
and those in which subjects were asked to
control their emotional expression underscores
the fact that the discharge model is a descrip-
tive metaphor for clinical and empirical findings
that expressive individuals are less physiologi-
cally reactive to stress than are natural in-
hibitors. The fact that facial expressions are
not mediating the relationship between expres- 
sivity and physiological reactivity leads to 
speculation concerning a third factor or factors 
that are responsible for the observed 
relationship. 

Speculatively, the causal factors mediating 
the observed relationships between expres- 
sivity and physiological reactivity may involve 
the subject’s cognitive appraisal of the eliciting 
situations. Given personality differences be- 
tween expressive and inhibited subjects (Block, 
1957; Jones, 1950; Learmonth et al., 1959), it
would not be unexpected to find characteristic 
differences between the groups in the cognitive 
processes that function to shape an emotional 
response out of an environmental stressor. Ac- 
cording to Lazarus, Averill, and Opton (1970), 
the outcome of these cognitive, subjective 
processes is an appraisal of the eliciting situa- 
tion (along such dimensions as perceived 
danger, threat, or security) that determines a 
complex emotional response, which includes 
both overt expression and physiological re- 
sponse. Prestimulus differences in physiological 
levels observed in the present study may reflect 
the operation of differential appraisal from the 
moment the subjects entered the laboratory 
environment. 

Attempts to refine our understanding of the 
determinants of different physiological reac- 
tivity patterns between expressive and non- 
expressive individuals, whether or not cognitive 
appraisal is the primary mediator, must ac- 
knowledge the influence of several factors that 
have yet to be addressed. The first of these is 
the possible role of stimulus specificity. Al- 
though the facial responses in the present study 
were similar for the accident film and the threat 
of shock, it is conceivable that other stimuli 
might produce different patterns of facial and 
physiological response. In this respect, it would 
be important to examine stressors of a more 
interpersonal nature as well as situations in 
which positive affect is elicited. Second, con- 
tinued research must recognize that natural 
expressive tendencies cannot be equated with 
instructionally manipulated displays of emo- 
tion. Short-term control of expression does not 
seem to be the equivalent of an expressive re- 
sponse from an individual’s natural repertoire. 
Third, a more qualitative coding of facial ex-
pressiveness would be useful in determining
whether the discharge model is uncritical as to 
the nature of facial display or whether such 
dimensions as appropriateness of affect, dura- 
tion of display, or specific emotion displayed 
improve the predictive power of the model. 
Finally, further refinements in the scope of 
physiological response analysis may be im- 
portant. Although it is generally agreed that 
monitoring one physiological system (e.g., skin 
conductance) is insufficient for adequately 
assessing physiological response, there is less 
consensus as to how data from multiple 
physiological systems should be handled in 
light of such factors as stimulus specificity and 
individual response stereotypy, which may 
strongly influence the nature of interrelation- 
ships among facial and physiological indices of 
expressivity. 
The Cognition-Emotion Process in
Achievement-Related Contexts 


Bernard Weiner, Dan Russell, and David Lerman 
University of California, Los Angeles 


Two experiments were conducted to examine the linkages between cognitions 
and emotions. In the first investigation, subjects reported a “critical incident,” 
in which they succeeded or failed an exam for a particular reason (e.g., help 
from others, lack of long-term effort). They then recounted three affects that 
were experienced. The data revealed prevalent affects linked with success and 
failure regardless of the attribution for the outcome. But many emotions were 
identified that are associated with specific attributions (e.g., luck-surprise;
others-gratitude and others-anger). In addition, dimensions of causal attribu- 
tions, such as locus, also influence recollected feeling states, particularly esteem- 
related emotions. Hence, it was proposed that in achievement-related contexts 
there are three sources of affect elicited by disparate cognitions. In the second 
experiment, it was demonstrated that individuals can use emotional cues to 
infer why a success or a failure has occurred. The proposed cognition-emotion
and emotion-cognition couplings appear to be symmetrical.


The research reported in this article was 
guided by the simple presumption that a 
variety of cognitions, particularly causal at- 
tributions, influence emotional reactions in 
achievement-related contexts. This belief re-
cently found suggestive support in an in-
vestigation conducted by Weiner, Russell,
and Lerman (1978). The questions raised in
Weiner et al. provide the foundation for the 
research reported here. 

Weiner et al. compiled a dictionary list of 
approximately 250 potential affective reac-
tions to success and failure in an academic
context. The dominant causal attributions for 
achievement performance were also identified 
(Elig & Frieze, 1975; Frieze, 1976). Then a 
cause for success or failure was given within 
a brief story format, the success- or failure- 
related affects that had been identified were 
listed, and the subjects reported the intensity 
of the affective reactions that they thought 
would be experienced in this situation. Re-
sponses were made on simple rating scales.
A typical story follows.


Francis studied intensely for a test he took. It was 
very important for Francis to record a high score on 
this exam. Francis received an extremely high score 
on the test. Francis felt that he received this high 
score because he studied so intensely [his ability in 
this subject; he was lucky in which questions were 
selected; etc.]. How do you think Francis felt upon 
receiving this score? (Weiner et al., 1978, p. 70) 


A number of provocative findings emerged 
from this investigation. First, there was a
group of “outcome dependent–attribution
independent” affects that were rated as vividly
and equally experienced, regardless of the
perceived attribution or the “why” of success.
Examples of affects given success were plea-
sure, happiness, satisfaction, and goodness,
whereas for failure the outcome dependent–
attribution independent affects included un-
cheerfulness, displeasure, and being upset.

In addition, for both success and failure 
many affects were discriminably related to 
specific attributions. Table 1 shows a subset 
of the causal attributions for success and 
failure and a label that best describes their 
distinguishing linked emotion. 


Copyright 1979 by the American Psychological Association, Inc. 0022-3514/79/3707-1211$00.75 
Table 1
Attributions and Dominant Discriminating
Affects for Success and Failure

Attribution       Success            Failure

Ability           Competence         Incompetence
                  Confidence
Unstable effort   Activation         Guilt
                  Augmentation       Shame
Stable effort     Relaxation         Guilt, shame
Personality       Self-enhancement   Resignation
Others            Gratitude          Aggression
Luck              Surprise           Surprise

Note. Table 1 based on Weiner, Russell, & Lerman
(1978).


Table 1 reveals that for success, attribu- 
tions to ability were thought to give rise to 
feelings of confidence and competence; un- 
stable effort ascriptions produced heightened 
activation and high potency emotions (e.g., 
being uproarious and delirious), whereas stable 
effort attributions brought about relaxation;
personality attributions resulted in self-en-
hancement (e.g., conceit and pride); attribu-
tions to others were associated with gratitude;
and ascriptions to luck were linked with
surprise. Given failure, ability attributions
were perceived as giving rise to feelings of
incompetence; effort ascriptions generated
reports of shame and guilt; personality at-
tributions were linked with resignation; ascrip-
tions to others caused vindictiveness and
aggression; and bad luck produced surprise.


seem obvious—as 
truth is not new.” 
expected results unanticipated revela-
tions. For example, the data suggest that
whether success is followed by activation or 
calmness depends on a discrimination between 


the labeling of internal arousal precedes and 
defines one’s emotional experience. 


Table 1 reveals some interesting compari-
sons. For ability attributions,
when the outcomes are diametric, the reported
experiences for success and failure are in
direct opposition: respectively, competence in
contrast to incompetence. This antithesis is
also evident when attributions are to others
(the generated emotions for success and failure,
respectively, are gratitude vs. hostility). But
if luck is the causal attribution, then the
reactions to both success and failure include
surprise. And, given effort ascriptions, the
feelings associated with success and failure
(high activation or calmness versus shame and
guilt) are unrelated.

In spite of the systematic and significant 
findings reported by Weiner et al. (1978), 
there is reason to be skeptical about the data. 
These investigators cautioned that 


This procedure is fraught with danger, even as a 
starting point. First of all, we assumed that indi- 
viduals would project their own emotional experiences, 
or those observed in others, upon the characters in 
the stories. Second, we assumed that our labels reflect 
the “real” experiences of the subjects. Finally, we 
assumed that repression and suppression, memory
distortion, response sets, experimenter demands, and 
individual differences in affective labeling as well as 
in the subjective meaning of these labels would not 
render our results meaningless (see Davitz, 1969). 
In sum, there obviously are many limitations of this 
initial study. (Weiner et al., 1978, p. 70) 


Later it was warned that 


The reactive and respondent measure undoubtedly
encourages subjects to report affects that are not
experienced, or are experienced in a manner not truly
captured by the labels. (Weiner et al., 1978, p. 71)


A series of follow-up studies was therefore 
initiated with the purpose of replicating the 
prior findings with altered methodologies and 
extending the research in new directions. The 
two investigations reported here introduce the 
changes given below. 

(a) In Experiment 1, a “critical incident” 
technique (Davitz, 1969) is used in which 
subjects relate a personal experience of suc- 
cess or failure, given a designated causal 
attribution. Free responses, rather than rating 
scales of specific affects, are the dependent 
indicators of emotions. Thus, the procedure 
uses operant methods of affective assessment, 
whereas Weiner et al. (1978) used a respondent 
methodology (McClelland, 1971). In addition, 
the methodology is not simulational (although 
it is retrospective). 




THE COGNITION-EMOTION PROCESS 


Table 2
Percentage of Subjects Recounting a Critical Incident as a Function
of Outcome and Causal Attribution

                                 Attribution

Outcome   Ability   Unstable   Stable   Personality   Others   Luck
                    effort     effort

Success   94        87         68       54            75       66
Failure   73        81         33       42            56       56


(b) In Experiment 2, affects are used as 
the independent variables to examine whether 
they can be used as information to infer 
thought processes (causal attributions). 

In both experiments, a major empirical and 
theoretical goal is to determine whether the 
emotions linked with internal causal ascrip- 
tions cluster together and differ from the 
emotions associated with external ascriptions. 
In prior writings (e.g., Weiner et al., 1971) 
it was contended that the dimension of locus 
of causality (internal vs. external) mediates 
between success and failure and the emotional 
reactions to these outcomes, with affects 
heightened given feelings of internal respon-
sibility. Weiner et al. (1978) isolated two
sources of emotion (outcome and causal at-
tributions) but failed to identify the role of
locus of causality as a third determinant of
affect (see review in Weiner, 1977). 


Experiment 1 
Method 


The subjects were 79 male and female college stu- 
dents enrolled in introductory psychology at the Uni- 
versity of California, Los Angeles, and participating 
to satisfy a course requirement. They were tested in 
groups ranging in size from 5 to 16.

Each subject completed a questionnaire in which 
12 achievement conditions were represented. Each 
condition consisted of an outcome (doing well or
poorly on a test) determined by one of six causes 
(ability, unstable effort, stable effort, personality, 
other people, and luck). These causes were selected 
because they showed the clearest relationships to dif- 
ferent qualities of emotional experience in the Weiner 
et al. (1978) investigation. The subjects were asked, 
for example, to 


Think of a time when you did well on a test in a 
school subject that was very important to you. 
You felt that the reason you did well was that 
the teacher (or other people, such as friends) helped 
you a great deal. 


The order of the 12 conditions (two outcomes × six
causes) was randomly varied between subjects. 
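
The 2 × 6 design and its between-subject randomization can be sketched as follows. The article does not say how the order was randomized, so the seeded generator and the function name `condition_order` are assumptions made for illustration; the outcome and cause labels follow the text.

```python
import itertools
import random

OUTCOMES = ("success", "failure")
CAUSES = ("ability", "unstable effort", "stable effort",
          "personality", "other people", "luck")


def condition_order(rng: random.Random) -> list[tuple[str, str]]:
    """Return all 12 outcome-by-cause conditions in a random order."""
    conditions = list(itertools.product(OUTCOMES, CAUSES))
    rng.shuffle(conditions)  # shuffles in place using the supplied generator
    return conditions


# One independently shuffled questionnaire order per (hypothetical) subject.
for subject_id in range(3):
    order = condition_order(random.Random(subject_id))
    print(subject_id, order[:3])
```

Seeding a separate generator per subject simply makes each order reproducible; any procedure that samples the 12 conditions without replacement would serve the same purpose.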

The subjects were asked to try to recall a time 
when they had been in such a situation. If they could 
recall this type of personal experience, then they were 
asked for a brief description, including details such as 
the academic course, when the event occurred, and 
so on. It was hoped that this prompting would enhance 
the salience of the event. Finally, the subjects were 
asked how they felt in the situation, using three 
affective labels to characterize their emotional reac- 
tions. Short spaces for three words were provided on 
the questionnaire. 

Research has indicated that subjects often do not 
have words readily available to describe their ex- 
periences. As an aid to subjects in characterizing 
and communicating their affective reactions, examples 
of six affects known from the Weiner et al. (1978) 
study to be associated with success (calmness, com- 
petence, conceit, exhilaration, gratitude, and pride), 
six affects known to be associated with failure (de- 
pression, fear, guilt, incompetence, resignation, and 
vindictiveness), and one associated with both success 
and failure (surprise) were included with the instruc- 
tions. These were listed as “words that describe dif- 
ferent types of emotion.” Thus, the procedure was 
a mixture of free and guided recounting of emotional 
experience.

Subjects worked through the experimental condi- 
tions at their own pace. When the questionnaire was 
completed by all the subjects in a group, the experi- 
ment was discussed openly. 


Results and Discussion 


For all the reported analyses, no differences 
were found between the sexes; the data are 
collapsed across this variable. Table 2 shows 
the percentage of subjects recalling a critical 
incident in each of the 12 experimental con- 
ditions. Subjects reported more success ex- 
periences than failure experiences and out- 
comes associated with ability and effort 
attributions were also more likely to be 
recollected. An analysis of variance of these
dichotomous data (see Winer, 1971, pp. 303-305)
revealed a main effect of outcome,
F(1, 858) = 39.39, p < .01, attribution,
F(5, 858) = 21.44, p < .01, and a significant
Outcome × Attribution interaction, F(5, 858)
= 2.34, p < .05. Within the given attributions,
there was a greater likelihood of reporting
success than failure in the ability, stable
effort, and others conditions (for all conditions,
p < .01).

The reported differences could indicate the
operation of hedonic or defensive biases in
attribution or memory or both. However, in
the academic lives of these students, the data
reported in Table 2 could readily be interpreted
as reflecting the frequency of their
actual experiences.

Success. The data analyzed for both success
and failure outcomes are the percentage
of subjects (based on those recounting a
critical incident) who reported a specific
emotional experience in response to the various
achievement conditions. Only emotions listed
by more than 10% of the subjects in response
to any one of the causal attributions are
included in the final analysis.

The 79 subjects listed 156 different emotions
in response to the six success attributions.
Of these emotions, 12 met the criterion
of being given by more than 10% of the
subjects for any given attribution. (See
Table 3.) Four of these 12 emotions (competence,
gratitude, pride, and surprise) were
among the examples of emotions provided to
the subjects; there was probably lability or
influence on which labels subjects were willing
to give to their experiences; perhaps a different
set of exemplar affects would have
yielded other specific emotional reports.


Table 3
Percentage of Emotional Recollection as a Function of the Causal Attribution for Success

Affect          Ability   Unstable   Stable   Personality   Others   Luck
                          effort     effort

Competence        30*       12         20         19           5       2
Confidence        20        19         18         19          14       4
Contentment        4         4         12*         0           7       2
Excitement         3         9          8         11          16*      6
Gratitude          9         1          4          8          43*      4
Guilt              1         3          0          3           2      18*
Happiness         44        43         43         38          46      48
Pride             39*       28         39         43*         21       8
Relief             4        28*        16         11          13      26*
Satisfaction      19        24*        16         14           9       0
Surprise           7        16          4         14           4     [illegible]
Thankfulness       0         1          0          0          18*      4

*p < .01.
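The inclusion rule stated above (an emotion is retained if more than 10% of subjects reported it for at least one attribution) can be checked against the Table 3 percentages. The sketch below is illustrative: the names `TABLE_3` and `meets_criterion` are assumptions, the excitement/personality cell is read as 11 from a damaged scan, and the surprise/luck cell is illegible in the source and is stored as `None`.

```python
ATTRIBUTIONS = ["ability", "unstable effort", "stable effort",
                "personality", "others", "luck"]

# Percentages transcribed from Table 3 (success), in ATTRIBUTIONS order.
# None marks a cell that is illegible in the source.
TABLE_3 = {
    "competence":   [30, 12, 20, 19, 5, 2],
    "confidence":   [20, 19, 18, 19, 14, 4],
    "contentment":  [4, 4, 12, 0, 7, 2],
    "excitement":   [3, 9, 8, 11, 16, 6],
    "gratitude":    [9, 1, 4, 8, 43, 4],
    "guilt":        [1, 3, 0, 3, 2, 18],
    "happiness":    [44, 43, 43, 38, 46, 48],
    "pride":        [39, 28, 39, 43, 21, 8],
    "relief":       [4, 28, 16, 11, 13, 26],
    "satisfaction": [19, 24, 16, 14, 9, 0],
    "surprise":     [7, 16, 4, 14, 4, None],
    "thankfulness": [0, 1, 0, 0, 18, 4],
}

def meets_criterion(percentages, threshold=10):
    """True if any legible percentage exceeds the threshold."""
    return any(p is not None and p > threshold for p in percentages)

retained = [affect for affect, row in TABLE_3.items() if meets_criterion(row)]
print(len(retained), "of", len(TABLE_3), "tabled affects meet the criterion")
```

Every tabled affect should pass, since the table lists only emotions that met the criterion.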

The most commonly reported affect, happiness,
appeared in each condition with relatively
equal frequency (M = 44%). This
response pattern depicts what we have labeled
an outcome-dependent, attribution-independent
emotion and, again, suggests that one
source of achievement-related affect is linked 
with the perception of success and failure 
regardless of the reason for the outcome. 
Happiness was also one of the general re- 
actions to success reported by Weiner et al. 
(1978); as in that investigation, the outcome- 
linked affect was notably dominant in the 


Table 4
Discriminating Affects as a Function of
Outcome and Causal Attribution

Attribution        Success         Failure

Ability            Competence      Incompetence
                   Pride           Resignation
                                   Unhappiness
Unstable effort    Relief          Fear
                   Satisfaction
Stable effort      Contentment     Guilt
Personality        Pride
Others             Gratitude       Anger
                   Thankfulness
                   Excitement
Luck               Surprise        Surprise
                   Guilt           Sadness
                   Relief          Stupidity


Table 5
Percentage of Emotional Recollection as a Function of the Causal Attribution of Failure

Affect            Ability   Unstable   Stable   Personality   Others   Luck
                            effort     effort

Anger             [row garbled in source: 16 38 33 2 e]
Depression          22        20         21     [illegible]      6       3
Disappointment      12        17         17         13          20      12
Disgust              2        10          4          3           8       5
Fear                 7        25*        21         13          18      12
Frustration         22        18          4         13          20      19
Guilt                9        15         29*         9           3       5
Incompetence        14*        7          8          0          10       0
Mad                  7         3          0          3          10       7
Resignation         16*        0          4          9           2       0
Sadness              9        10          0          6           0      14*
Stupidity           14        10         13          9           0      19*
Surprise             2         3          4          0           7      14*
Unhappiness         14*        3          4          9           2      14
Upset                9        10          8          0           8      10

*p < .01.


present study. Confidence also was not discriminably
linked with any particular attribution,
although confidence was not a reported affect,
given the attribution of luck. 

Table 4 lists the discriminating affects for 
each causal attribution. A discriminating affect
is defined as one that is reported significantly
more (p < .01) for one attribution relative to a
composite of the other attributions. All comparisons
were done using arc sine transformations,
following the procedure suggested by
Langer and Abelson (1972).¹ The linkages corresponding
to those reported by Weiner et al.
(1978; see Table 1) are ability-competence and
ability-pride; stable effort - contentment;² personality-pride;
others-gratitude; and luck-surprise.
In addition, there were some associations
that did not appear in the prior research:
unstable effort - relief and satisfaction (rather
than activation); others-excitement; luck-guilt
and luck-relief. It is of interest to note that 
one of the reactions to success caused by luck 
is a negative affect. 
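The arc sine comparison of two percentages can be sketched as below. This is a textbook form of the test, offered as an assumption: the exact computations of Langer and Abelson (1972) are not reproduced in the text, the function names are illustrative, and the group sizes (70 and 300) are hypothetical because the number of subjects behind each percentage is not reported here. The two proportions, .30 and .11, are the ones given in Footnote 1. The sketch also treats the two percentages as independent, which the within-subject design does not guarantee.

```python
import math

def arcsine_transform(p):
    """Variance-stabilizing transform of a proportion: phi = 2 * asin(sqrt(p))."""
    return 2.0 * math.asin(math.sqrt(p))

def arcsine_z(p1, n1, p2, n2):
    """z statistic for the difference between two transformed proportions.

    The error term sqrt(1/n1 + 1/n2) depends only on the number of
    observations behind each proportion.
    """
    difference = arcsine_transform(p1) - arcsine_transform(p2)
    standard_error = math.sqrt(1.0 / n1 + 1.0 / n2)
    return difference / standard_error

def two_tailed_p(z):
    """Two-tailed normal probability for a z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# 30% "competent" given ability vs. 11% across the other five attributions.
# The group sizes are hypothetical placeholders.
z = arcsine_z(0.30, 70, 0.11, 300)
print(round(z, 2), two_tailed_p(z) < 0.01)
```

The transform is used because the variance of the transformed proportion depends only on n, not on the proportion itself.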

Failure. In response to the six failure 
situations, 166 different emotional reactions 
were listed. Fifteen of these emotions were 
given by more than 10% of the subjects in 
response to at least one of the causal at- 
tributions (see Table 5). Six of these emo- 
tions (depression, fear, guilt, incompetence, 
resignation, and surprise) were included among 


the examples that were provided for the 
subjects and, again, apparently indicate their 
readiness to appropriate given labels. Of these 
15 emotions, six (depression, M = 23%; 
frustration, M = 16%; disappointment, M 
= 15%; upset, M = 8%; disgust, M = 5%; 
and mad, M = 5%) were not discriminably 
associated with any single attribution. Weiner 
et al. (1978) also reported that frustration 
and upset were directly linked with failure 
outcomes. In the present data, three of these 
affects (depression, frustration, and upset) were 
reported to be significantly less experienced 
in response to a particular attribution—others, 
stable effort, and personality, respectively. 
Table 4 also includes the discriminating 
affects for each of the causal attributions 
given failure. These discriminating affects 


¹ For example, 30% of the subjects reported the
emotion of “competent” given an ability attribution
for success. In response to the five other attributions,
11% of the subjects reported competent as one of
their three affects. Arc sine transformations were
performed on these percentage figures and the difference
between the arc sines was evaluated relative
to an error term based on the number of subjects
underlying each percentage figure (Langer & Abelson,
1972).

² Although stable effort attributions yield the emotion
of pride to as great an extent as do ability attributions,
the linkage is not significant because the
percentage is based on a smaller n.


Table 6
Dimension-Linked Affects as a Function of Outcome and Locus

                                 Present study    Weiner et al.(a)

Success—internal attribution     pride            pride
                                 competence       competence
                                 confidence       confidence
                                 satisfaction     satisfaction
                                                  zest

Failure—internal attribution     guilt            guilt
                                 resignation      aimlessness
                                                  humbleness
                                                  regret

Success—external attribution     gratitude        gratitude
                                 thankfulness     thankfulness
                                 surprise         modesty
                                 guilt

Failure—external attribution     anger            anger(b)
                                 surprise         surprise(c)
                                                  others(d)

(a) Weiner, Russell, & Lerman, 1978.
(b) Includes 19 affects related to anger (e.g., hostile, infuriated).
(c) Includes 5 affects related to surprise (e.g., astonished, startled).
(d) Seventeen unclassifiable affects (e.g., concerned, alarmed, hysterical).


were determined using the same procedure 
as for the success condition. The parallel 
linkages with those reported by Weiner et al. 
(1978; see Table 1) are ability-incompetence;
stable effort - guilt; others-anger; and luck- 
surprise. The bonds between ability-resigna- 
tion and ability-unhappiness, and between 
luck-sad and luck-stupid, were not exhibited 
in the prior research. However, there was 
previous evidence in support of the unstable 
effort — fear association. 
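The means reported for the outcome-linked affects can be recomputed from the tabled percentages as a consistency check. In the sketch below the rows are transcribed from Table 5, plus the happiness row from Table 3; the names are illustrative. Depression is omitted because one of its cells is illegible in the source, so its reported mean cannot be checked this way.

```python
# Percentages by attribution: ability, unstable effort, stable effort,
# personality, others, luck.
ROWS = {
    "happiness (success)": [44, 43, 43, 38, 46, 48],
    "frustration":         [22, 18, 4, 13, 20, 19],
    "disappointment":      [12, 17, 17, 13, 20, 12],
    "upset":               [9, 10, 8, 0, 8, 10],
    "disgust":             [2, 10, 4, 3, 8, 5],
    "mad":                 [7, 3, 0, 3, 10, 7],
}

# Means as stated in the text (percent).
REPORTED = {
    "happiness (success)": 44,
    "frustration": 16,
    "disappointment": 15,
    "upset": 8,
    "disgust": 5,
    "mad": 5,
}

def rounded_mean(values):
    """Mean percentage across the six attributions, rounded half up."""
    return int(sum(values) / len(values) + 0.5)

for affect, row in ROWS.items():
    status = "matches" if rounded_mean(row) == REPORTED[affect] else "differs from"
    print(f"{affect}: computed {rounded_mean(row)}% {status} reported {REPORTED[affect]}%")
```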

Summary. The finding that certain emo- 
tions such as happiness and disappointment 
are independent of attributions but dependent 
on outcomes corroborates the results reported 
by Weiner et al. (1978). In addition, of the 
12 attribution conditions, the stated emotions 
in 9 clearly replicate the prior findings. 
Perhaps what are least clear are the emo- 
tional labels given success or failure attributed 
to the presence or absence of a short-term 
burst of effort. In addition, in the present 
study the attribution of failure to personal 
shortcomings had no discriminably linked 
affects. But ascriptions of success or failure 
to ability, stable effort, others, and luck have 
idiosyncratic affective associates (see Table 4). 

Causal dimensions. Further analyses were 
conducted to discover the relation between 
causal dimensions and affective responses. 


The emotional reactions to the four internal 
causes used in the present study (ability, 
unstable effort, stable effort, and personality) 
were compared with the reactions given the 
two external attributions (other people and 
luck). That is, the percentage figures from 
the four internal attributions were averaged 
and compared with the average of the emo- 
tional responses to the two external attribu- 
tions, using arc sine transformations of the 
data. Table 6 shows that for success, pride, 
competence, confidence, and satisfaction, all 
were more likely to be experienced given 
internal rather than external attributions. On 
the other hand, external attributions were 
more linked with gratitude, thankfulness, 
surprise, and guilt than were internal ascrip- 
tions. (For all conditions, p < .01.) 
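The averaging step described above (four internal attributions against two external ones) can be illustrated with the success percentages from Table 3. The sketch covers only the direction of each difference; the arc sine significance test is not reproduced, and the names are illustrative. Surprise is left out because its luck cell is illegible in the source.

```python
# Success percentages from Table 3, ordered as: ability, unstable effort,
# stable effort, personality (internal); others, luck (external).
SUCCESS_ROWS = {
    "pride":        [39, 28, 39, 43, 21, 8],
    "competence":   [30, 12, 20, 19, 5, 2],
    "confidence":   [20, 19, 18, 19, 14, 4],
    "satisfaction": [19, 24, 16, 14, 9, 0],
    "gratitude":    [9, 1, 4, 8, 43, 4],
    "thankfulness": [0, 1, 0, 0, 18, 4],
    "guilt":        [1, 3, 0, 3, 2, 18],
}

def internal_external(affect):
    """Return (mean over the four internal causes, mean over the two external causes)."""
    row = SUCCESS_ROWS[affect]
    internal, external = row[:4], row[4:]
    return sum(internal) / len(internal), sum(external) / len(external)

for affect in SUCCESS_ROWS:
    internal_mean, external_mean = internal_external(affect)
    locus = "internal" if internal_mean > external_mean else "external"
    print(f"{affect}: internal {internal_mean:.2f}, external {external_mean:.2f} -> {locus}")
```

On these figures, pride, competence, confidence, and satisfaction come out higher for internal causes, and gratitude, thankfulness, and guilt higher for external causes, in line with the pattern described in the text.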

The findings reported above are suspect 
because only a subset of the possible internal 
and external attributions were included among 
the experimental conditions. In Weiner et al. 
(1978), 10 causal attributions were examined 
(the present 6 plus mood, intrinsic motiva- 
tion, task difficulty, and motivation and 
personality of others). These 10 account for 
the majority of attributions in achievement- 
related contexts (Frieze, 1976) and thus 
permit a better test of whether affects are 
tied to locus of causality. Reanalyses of the
Weiner et al. data, not previously reported,
show a remarkably close correspondence with 
the present results (see Table 6). 

In sum, the affects for success that are 
most intimately related to personal esteem 
(competence, confidence, and pride) are re- 
ported to be associated with internal attribu- 
tions. On the other hand, external attributions 
are linked with affects, such as gratitude and 
thankfulness, that require an outside causal 
agent. 

Table 6 also shows the dimension-related 
responses to failure. Again, there are clear 
correspondences between the two research 
studies. Internal attributions exacerbate feel- 
ings of guilt and what might be considered 
a giving up of the goal (resigned and aimless), 
whereas external ascriptions magnify anger 
and surprise. Guilt, therefore, emerges as an 
affective reaction for both success and failure, 
but given external attributions for success 
and internal ascriptions for failure. These 
data reaffirm the intertwining of achievement 
and moral motivational systems (see Weiner 
& Peter, 1973). 


General Discussion 


The data from the present study and the 
Weiner et al. (1978) investigation are suf- 
ficiently similar to warrant speculation about 
the emotional process in achievement-related 
contexts. We contend that emotions are a 
product of three distinct cognitive phases. 
Prior to elaborating this idea, however, it is 
important that the reader understand that 
we are going “beyond the data given.” First 
of all, we have no immediate evidence con- 
cerning what people actually experience sub- 
sequent to success and failure. The experi- 
mental paradigm used by Weiner et al. (1978) 
might be called a simulated other-perception 
(“How do you think the person would feel?”), 
whereas in the present investigation the de- 
pendent variable is recollected emotions. We 
also cannot definitively state that emotions 
follow cognitions, because causal attributions 
for achievement events were not manipulated. 
In addition, it is possible that the reported 
cognition-emotion linkages are due to scripts— 
in this case, rules that guide which emotions 
should go with which cognitions, although
individuals actually do not progress through
cognition-emotion phases. A related approach 
to the data is that we are studying naive 
theories of emotion, or what a person thinks 
are appropriate cognition-emotion bonds. 

The lack of methodological sophistication 
in the field of emotion and the limitations 
of our studies do not allow us to dispel with 
certainty these alternative interpretations of 
the data. Nonetheless, there is an account 
that we find most compelling for the time 
being. It is suggested that on experiencing 
an achievement-related outcome, actors first 
appraise their performance, assigning it a 
value on a continuum ranging from subjective 
success to subjective failure. This evaluation 
is likely to be based on some internal standard 
and aspiration level, consensus information, 
and so on. The success-failure judgment 
produces a positive or negative feeling of a 
certain intensity. The intensity no doubt is 
a function of many factors, such as the im- 
portance of the outcome, ego involvement, 
and long-term implications. The positive or 
negative reactions to success or failure are 
given labels, such as happy or disappointed. 
This emotional response may be the most 
short-lived and intense of the affective ex- 
periences in achievement contexts. 

The actor then immediately and auto- 
matically or later and reflectively assigns a 
reason for the outcome. This could be an 
impulsive statement (“Was I lucky!”) or a 
more reasoned judgment (“I believe I could 
not have succeeded without the help I re- 
ceived.”). As a consequence of this ascription, 
a number of attribution-specific emotions fol- 
low, such as surprise or gratitude. Of course, 
there may be multiple perceived causality, 
in which case a number of distinctive emo- 
tions should be elicited. 

Finally, the attribution(s) for success and 
failure is classified into causal dimensions, 
such as locus of causality. Further affective 
experiences then follow as a consequence of 
the fact that some of the ascriptions have 
implications for how one views oneself. Other 
attributional dimensions, such as stability, 
also appear to have affective implications 
(hopelessness vs. optimism; see Weiner et al., 
1978; Weiner, 1979). We suspect that these
dimension-tied affects have greater longevity
than the outcome- or attribution-linked
emotions.

In achievement-related contexts, therefore,
the actor might progress through various
cognition-emotion scenarios, such as (a)
“I just received a D on the exam. That is
a very low grade.” (This generates intense
but relatively fleeting feelings of being frustrated
and upset.) “I received this grade
because I did not try sufficiently hard.”
(This is followed by feelings of guilt.) “There
really is something lacking in me.” (This is
followed by low self-esteem or lack of worth.)
“What I lack I probably always will lack.”
(This produces hopelessness.)

Alternatively, there is (b) “I just received
an A on the exam. That is a very high
grade.” (This generates happiness.) “I received
this grade because I worked very
hard during the entire school year.” (This
produces contentment and relaxation.) “I really
do have some positive qualities that will
persist in the future.” (This is followed by
high self-esteem, feelings of self-worth, and
optimism.)


Experiment 2


Since emotions are coupled with cognitions,
an emotional expression can act as a cue to
an observer, allowing an inference about the
cognitions of the actor. Hence, publicly
masking our emotions often serves the function
of hiding our thoughts, just as public
emotional expression reveals our thoughts. In
Experiment 2, we examined whether knowledge
about an actor’s emotions (here conveyed
with verbal labels) enables an observer
to infer an actor’s causal attributions for an
achievement performance.


Table 7
Mean Magnitude of Attributional Inference Elicited in Response
to the Indicated Emotional Description

                                   Success—emotional description

Attribution          confident   uproarious   calm       hopeful    surprised    appreciative
inference elicited   competent   delighted    relaxed    composed   astonished   grateful
                     pleased     good         secure     safe       thankful     modest
                     (ability)   (unstable    (stable    (task      (luck)       (others)
                                 effort)      effort)    ease)

Ability                8.06        6.25         6.21       6.10       3.10         4.56
Unstable effort        7.08        6.48         7.04       6.12       3.60         5.60
Stable effort          7.02        6.42         6.77       6.73       3.12         5.08
Task ease              4.67        5.75         5.25       5.08       4.65         5.54
Luck                   2.60        2.73         4.73       3.94       7.31         6.08
Others                 3.54        3.60         4.92       5.02       5.42         6.69


Method 


Six attributions used by Weiner et al. (1978) pro- 
vided the foundation for the present investigation: 
ability, unstable effort, stable effort, task difficulty, 
others, and luck. Task difficulty, therefore, has re- 
placed personality among the causal ascriptions varied 
in Experiment 1. 

Two criteria were used to select the relevant affects 
associated with each attribution. Highly discriminating 
affects not intensely experienced might be so sub- 
ordinate among the affects being expressed that they 
could not have a cue function. On the other hand, 
emotions strongly felt (and expressed) for all attribu- 
tions cannot possibly have signal value for any one 
ascription. Hence, selection of the affects for each
attribution was based on two criteria: the discriminating
value of the affect multiplied by its absolute
intensity rating (data from Weiner et al., 1978).
Three emotions were selected for each of the six at- 
tributions; the affects chosen for this study had the 
highest values using this procedure. (See Tables 7 
and 8.) 
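The selection rule (discriminating value multiplied by absolute intensity, top three affects per attribution) can be sketched as follows. All numbers below are hypothetical placeholders, not values from Weiner et al. (1978), and the text does not say how the discriminating value was scaled; the function name `select_affects` is likewise an assumption.

```python
def select_affects(candidates, k=3):
    """Return the k affects with the largest discriminating-value x intensity product.

    candidates: list of (affect, discriminating_value, intensity) tuples.
    """
    ranked = sorted(candidates, key=lambda c: c[1] * c[2], reverse=True)
    return [affect for affect, _, _ in ranked[:k]]

# Hypothetical candidate affects for a luck attribution given success.
luck_candidates = [
    ("surprised",  0.60, 7.0),
    ("astonished", 0.50, 6.5),
    ("thankful",   0.40, 6.0),
    ("awed",       0.45, 2.0),   # discriminating but weakly felt
    ("happy",      0.05, 8.5),   # strongly felt but not discriminating
]

print(select_affects(luck_candidates))
```

The product penalizes both cases the text rules out: a discriminating but faint affect and an intense affect that accompanies every attribution.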

The subjects were 48 male and female students at 
the University of California, Los Angeles, participating 
as part of a course requirement. Twelve scenarios 
(two levels of outcome X six emotional descriptions) 
were randomly presented to each subject. A typical
story scenario follows.


A person just received a test back in a course that
is very important to him or her. He or she has
done very well and feels extremely surprised, astonished,
and thankful. Why did this person believe
that he or she did so well?


This was followed by six scales anchored at the extremes
with “definitely” and “definitely not,” on
which the judgments for the six causal ascriptions
were made. The scales were divided into nine equal
intervals and were randomized both within and
between subjects.


Table 8
Mean Magnitude of Attribution Inference Elicited in Response
to the Indicated Emotional Description

                                   Failure—emotional description

Attribution          incompetent   sorry       humble     stunned       astonished    bitter
inference elicited   inadequate    ashamed     guilty     sad           overwhelmed   furious
                     panicked      scared      troubled   displeased    surprised     revengeful
                     (ability)     (unstable   (stable    (task         (luck)        (others)
                                   effort)     effort)    difficulty)

Ability                7.40          5.50        4.56       4.08          3.44          3.35
Unstable effort        4.92          7.48        7.40       3.54          2.73          3.67
Stable effort          5.46          7.56        7.81       3.94          2.90          3.81
Task difficulty        6.35          4.56        4.94       5.56          4.94          6.70
Luck                   3.25          3.10        2.85       4.96          6.52          4.85
Others                 3.48          3.60        3.58       4.79          3.85          6.75


Results and Discussion 


Tables 7 and 8 show the mean ratings for 
each emotional description on the six at- 
tribution scales in the success and failure 
conditions, respectively. Inspection of the 
columns of the tables indicates the extent to 
which a particular emotional description is 
associated with the attributional judgments. 
If the emotion-cognition linkages and the 
cognition-emotion linkages are symmetrical, 
then the diagonals in the tables should con- 
tain the highest ratings within the columns. 
For example, the emotions known to be 
elicited by luck attributions (surprised, as- 
tonished, and thankful) should, in turn, lead 
to an inference of luck as the causal de- 
terminant. In the fifth column of Tables 7 
and 8, the fifth row should (and does) contain
the highest rating. In 9 of the 12 comparisons
(p < .01) the a priori designated
attribution is most highly rated, whereas in 
two other instances (stable effort emotions 
for success and unstable effort emotions for 
failure) the ratings are second highest. The 
one disconfirmatory finding involves task at- 
tributions for success, which primarily elicit 
the inference that stable effort was the cause 
of the positive outcome. However, the data 


in the Weiner et al. (1978) investigation were 
not very systematic for task attributions; for 
that reason, the task ascription was excluded 
from Experiment 1. 
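The diagonal prediction can be checked directly against the tabled means. The sketch below transcribes Tables 7 and 8 (rows are inferred attributions, columns are emotional descriptions, both in the order ability, unstable effort, stable effort, task, luck, others) and ranks the a priori attribution within each column. The names are illustrative, and the check concerns rank order only, not the significance tests.

```python
TABLE_7_SUCCESS = [
    [8.06, 6.25, 6.21, 6.10, 3.10, 4.56],   # ability inferred
    [7.08, 6.48, 7.04, 6.12, 3.60, 5.60],   # unstable effort inferred
    [7.02, 6.42, 6.77, 6.73, 3.12, 5.08],   # stable effort inferred
    [4.67, 5.75, 5.25, 5.08, 4.65, 5.54],   # task ease inferred
    [2.60, 2.73, 4.73, 3.94, 7.31, 6.08],   # luck inferred
    [3.54, 3.60, 4.92, 5.02, 5.42, 6.69],   # others inferred
]
TABLE_8_FAILURE = [
    [7.40, 5.50, 4.56, 4.08, 3.44, 3.35],   # ability inferred
    [4.92, 7.48, 7.40, 3.54, 2.73, 3.67],   # unstable effort inferred
    [5.46, 7.56, 7.81, 3.94, 2.90, 3.81],   # stable effort inferred
    [6.35, 4.56, 4.94, 5.56, 4.94, 6.70],   # task difficulty inferred
    [3.25, 3.10, 2.85, 4.96, 6.52, 4.85],   # luck inferred
    [3.48, 3.60, 3.58, 4.79, 3.85, 6.75],   # others inferred
]

def diagonal_ranks(table):
    """For each column, the rank (1 = highest) of the a priori attribution's rating."""
    ranks = []
    for col in range(len(table)):
        column = [row[col] for row in table]
        target = column[col]
        ranks.append(1 + sum(1 for value in column if value > target))
    return ranks

all_ranks = diagonal_ranks(TABLE_7_SUCCESS) + diagonal_ranks(TABLE_8_FAILURE)
print("highest in column:", all_ranks.count(1), "of", len(all_ranks))
print("second highest:", all_ranks.count(2))
```

On the transcribed values the designated attribution ranks first in 9 columns and second in 2, and the task-ease description for success is the remaining case, which agrees with the counts given in the text.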

Statistical comparisons were performed con- 
trasting the a priori “correct” attributional 
inference (that is, the attribution that was 
found to be associated with the emotional 
description in the Weiner et al., 1978, in- 
vestigation) to the inferences given the five 
other attributions. Significant differences 
(p < .01) in the expected direction were
found for all the emotional descriptions, 
except in the case of the emotions associated 
with task ease. 


Causal Dimensions 


Three of the attributions used in this study 
are classified as internal in locus of causality 
(ability, unstable effort, and stable effort) 
and three are external (task difficulty, luck, 
and others). The strength with which internal 
versus external causal attributions were in- 
ferred for each emotional description was then 
analyzed. In 11 of 12 comparisons, emotional 
descriptions associated with a particular in- 
ternal attribution according to Weiner et al. 
(1978) elicited internal causal inferences sig- 
nificantly more than external causes, with the 
opposite being true of emotional descriptions 
associated with external attributions. (For all 
conditions, p < .01.) For example, the emo- 
tional description of confident, competent, and 
pleased, which was predicted to elicit ability
attributions, elicited ability, unstable effort,
and stable effort, which are the internal attribu- 
tions, to a greater extent than inferences that 
the external attributions of task ease, luck, and 
others were responsible for success. The lone 
exception to this pattern was again the 
emotional description associated with task 
explanations for success, in which internal 
attributions were more strongly inferred than 
external attributions. These data indicate 
that there is not only a strong inference of 
the a priori “correct” attribution given an 
emotional description, but one also infers 
attributions similar in locus of causality. 
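The locus comparison can be illustrated in the same way: for each emotional description, average the ratings of the three internal inferences and the three external inferences and see which is larger. The tables are transcribed from Tables 7 and 8; the names are illustrative, and only the direction of the mean difference is examined, not its significance.

```python
# Rows: inferred attribution; columns: emotional description. Both follow the
# order ability, unstable effort, stable effort (internal), task, luck, others
# (external).
TABLE_7_SUCCESS = [
    [8.06, 6.25, 6.21, 6.10, 3.10, 4.56],
    [7.08, 6.48, 7.04, 6.12, 3.60, 5.60],
    [7.02, 6.42, 6.77, 6.73, 3.12, 5.08],
    [4.67, 5.75, 5.25, 5.08, 4.65, 5.54],
    [2.60, 2.73, 4.73, 3.94, 7.31, 6.08],
    [3.54, 3.60, 4.92, 5.02, 5.42, 6.69],
]
TABLE_8_FAILURE = [
    [7.40, 5.50, 4.56, 4.08, 3.44, 3.35],
    [4.92, 7.48, 7.40, 3.54, 2.73, 3.67],
    [5.46, 7.56, 7.81, 3.94, 2.90, 3.81],
    [6.35, 4.56, 4.94, 5.56, 4.94, 6.70],
    [3.25, 3.10, 2.85, 4.96, 6.52, 4.85],
    [3.48, 3.60, 3.58, 4.79, 3.85, 6.75],
]
EXPECTED_LOCUS = ["internal"] * 3 + ["external"] * 3

def inferred_locus(table):
    """For each column, the locus (internal or external) with the higher mean rating."""
    loci = []
    for col in range(len(table)):
        internal = sum(table[row][col] for row in range(3)) / 3
        external = sum(table[row][col] for row in range(3, 6)) / 3
        loci.append("internal" if internal > external else "external")
    return loci

matches = sum(
    observed == expected
    for table in (TABLE_7_SUCCESS, TABLE_8_FAILURE)
    for observed, expected in zip(inferred_locus(table), EXPECTED_LOCUS)
)
print(matches, "of 12 emotional descriptions favor the expected locus")
```

The one mismatch on these values is the task-ease description for success, where internal inferences receive the higher mean.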

These data may not seem so surprising 
considering that two of the internal causal 
ascriptions involve effort. However, the ex- 
ternal attributions are disparate and the 
emotional descriptions are extremely varied. 
There is an intimation that observers might 
engage in a two-step inference process, in 
which first the emotions are linked to a par- 
ticular causal dimension and then there is 
a search for the particular cause. But at 
present it is not known whether specific 
causal judgments follow or precede dimensional
placement or whether there is no systematic
judgment sequence.


Instrumentality Effects in the Assessment
of Racial Differences in Self-Esteem 


Bernadette Gray-Little and Mark I. Appelbaum 
University of North Carolina at Chapel Hill 


Previous studies of racial differences in self-esteem have led to highly disparate 
and confusing results. This study shows that the effects of the measuring in- 
strument as well as those of preexisting individual differences in academic and 
demographic characteristics are very great. These effects may be sufficient, in 
many cases, to explain the disparate results of earlier studies. 


Attempts to measure racial differences in 
self-esteem, the evaluative component of the 
self-concept, have probably been more nu- 
merous than social comparisons of any other 
psychological variables except intelligence and 
school performance. Much of this research has 
proceeded on the hypothesis that blacks have 
degraded self-images and make reduced self- 
evaluations compared to whites. In many in- 
stances, this assumption is explicitly anchored 
in the self-theory of James (1890) or Mead 
(1934), both of whom linked self-concept to 
the evaluative feedback that one receives from 
important others in the social environment. 
Although there is substantial empirical support 
for the relationship between self-evaluations 
and appraisals from others (Wylie, 1968), 
such a relationship cannot be offered as con- 
clusive proof of lower self-esteem in blacks. To 
assume that blacks have low self-esteem, it is 
also necessary to assume that an individual’s 
self-concept is heavily dependent on society’s 
evaluation of the group with which a person 
is identified (Cartwright, 1950; McCandless, 
1970). Accordingly, since blacks as a group are
less positively evaluated than whites, black
individuals will have lower self-esteem than
whites. There has been some support for this
conclusion (Bridgette, 1970; Long & Henderson,
1968), but just as often no support (Gibby
& Gabler, 1969; Rosenberg, 1965) or directly
opposing results (Baughman & Dahlstrom, 
1968; McDonald & Gynther, 1965; Powell, 
1973; Rosenberg & Simmons, 1971; Wendland, 
1969). 

There are probably many reasons for the 
contradictory findings in this area of research.
First, the assumption of correlation
between one’s self-esteem and society’s evalua- 
tion of one’s racial or ethnic group, although 
compelling in terms of common sense, has 
little empirical support. Rosenberg (1965) 
called this proposition the stratification hy- 
pothesis and found no relationship (r = .04) 
between an ethnic group’s average self-esteem 
score and independent ratings of that group’s 
status in society. 

Second, the study of self-esteem has been 
plagued by a number of problems that make 
racial comparisons even more difficult. For ex- 
ample, self-esteem has been defined and mea- 
sured in different and, at times, noncomparable 
ways (Wylie, 1968). Furthermore, subject 
characteristics such as sex, age, and grade 
have typically varied from one study to the 
next, with little attention paid to the rele- 
vance of these factors to racial differences 
found (Christmas, 1973). Moreover, in many 
instances important mediating factors, such 
as socioeconomic status and educational 
achievement, which are probably relevant to
self-concept and which are certainly pertinent
to discussions of racial differences, have been 
neither controlled nor covaried. 

The problems attendant on the confounding 
of socioeconomic status with racial/ethnic 
membership are illustrated in studies by Long 
and Henderson (1968) and Soares and Soares 
(1969). Long and Henderson found economi- 
cally disadvantaged school beginners in a 
Southern community to have lower self- 
esteem scores than more advantaged children 
on the Self-Social Constructs Test. Soares and 
Soares asked fourth-grade through eighth-grade
children to rate themselves on a set of 20 bi- 
polar trait descriptors and found that dis- 
advantaged children had higher self-esteem 
scores than advantaged children. In the Long 
and Henderson study, all of the disadvantaged 
children were black and all of the advantaged 
children were white. In the Soares and Soares 
study, 3 of the disadvantaged children were 
black or Puerto Rican whereas 90% of the 
advantaged children were white. Neither study 
made clear whether socioeconomic or racial 
factors account for the differences in self- 
esteem scores. Studies that have looked at the 
relationship between self-esteem and academic 
achievement have been somewhat more con- 
sistent in showing a positive relationship be- 
tween educational achievement and self- 
concept scores (Freyberg & Shapiro, 1966; 
Caplin, 1969). There have, however, been in- 
vestigations that revealed no differences in 
the self-concept scores of high-achieving and 
low-achieving students (Curtis, 1967; Lou- 
renso, Greenberg, & Davidson, 1965). 

Two studies that are particularly relevant 
to the present one and that highlighted the 
problems encountered in racial comparisons of 
self-esteem were those of Wendland (1969) 
and Bridgette (1970). These studies involved 
children in the same geographical area and, in 
some instances, in the same school system. 
Wendland found blacks to have higher self- 
esteem scores, whereas Bridgette reported 
higher self-esteem scores for whites. One ob- 
vious difference between the studies was that 
Wendland’s subjects were attending segregated
schools whereas Bridgette’s subjects
were in a recently desegregated school in
which whites (who were the majority) had
vigorously opposed integration. Another potentially
relevant difference was the grade
level and hence the age of the subjects tested— 
sixth-grade students in Wendland’s study and 
eleventh-grade students in Bridgette’s research. 
Since there may be different age trends in 
self-esteem for blacks and whites, it is unclear 
whether or not the contradictory findings were 
related to age. Perhaps most important, how- 
ever, two different measures of self-esteem 
were used—Fitts’s Tennessee Self-Concept Scale
(TSCS) in Wendland’s research and Coopersmith’s
Self-Esteem Inventory (SEI) in Bridgette’s
work. One might reasonably have
expected the same results from these two in- 
struments, since they purport to measure 
the same aspect of self-concept—self-esteem. 
Coopersmith (1967) described the SEI as a 
measure of “the evaluation an individual 
makes and customarily maintains with regard 
to himself . . . a personal judgment of worthi- 
ness” (p. 4). Similarly, Fitts (1965) character- 
ized those who receive high scores on the 
Total Positive Scale, the most important sub- 
scale of the TSCS, as “persons who tend to like 
themselves, feel they are persons of value and 
worth” (p. 2). 

There are, then, several factors, both 
theoretical and methodological, that might 
be important in explaining why some studies 
have shown blacks to have higher self-esteem 
scores than whites, whereas others have found 
whites to have higher scores. Among the 
methodological issues, one should consider the 
possibility that seeming racial differences may 
be attributable to differences in the socio- 
economic status or academic achievement of 
the groups tested or to differences in the 
racial make-up, segregated or desegregated, 
of the school. Racial differences may also be 
artifactual. 

The primary purpose of this study is to 
examine the extent to which the discrepant 
findings with regard to race are due to the 
instruments used to measure self-esteem.
First, we focus on the race effects obtained on 
two instruments—the SEI and the TSCS—for 
junior high and high school students. Then, 
we examine the correlations of each instrument 
with important intervening socioeconomic and
academic variables and observe whether the 
pattern of racial differences found with these
measures persists when the correlated variables
are covaried. 

There were several reasons for selecting 
these instruments. Both are widely used to 
measure self-esteem in school-age children. 
Both are intended to measure the evaluative 
component of self-concept and both are rating 
scales. Norms and reliability data are available 
for both instruments. Finally, because we 
had access to subjects from the same school 
systems used by Bridgette (1970) and Wend- 
land (1969) in their investigations, we wished 
to use the same measures to enhance com- 
parability with their data. 


Method 


The school system in Central North Carolina, from 
which our 735 subjects were drawn, is now desegregated. 
Small farms and textile mills are among the major 
employers in this area; most of the children live in a 
small town and its surrounding rural area. Our sample 
was composed of 146 black males, 145 black females, 
238 white males, and 206 white females, with seventh- 
grade students accounting for roughly 54% of the 
sample, and the remainder being tenth-graders. The 
mean age of the seventh-grade students was 12.31 
(SD = .68) and of the tenth-grade students, 15.33 
(SD = .61). Half of the students at each grade level 
were tested with the SEI and the other half with the 
TSCS, yielding 372 SEI scores and 363 TSCS scores. 
Two female examiners, one black and one white, ad- 
ministered the tests to classroom groups. IQ (Form L 
of the Otis-Lennon Intelligence Test) and achievement 
scores (grade-appropriate forms of the Iowa Test of 
Basic Skills), as well as information regarding the 
number of children in the family and parents’ educa- 
tional and occupational levels, were obtained from the 
school records of the tested students. The achievement 
tests had been administered the previous semester. IQ
scores were typically not more than 1 year old. Hollings-
head’s Two-Factor Index of Social Position (Note 1)
was used to classify subjects’ socioeconomic status.¹ A
score of 1 indicates the highest social class; 5 indicates
the lowest.

Two years after the initial testing with the self- 
esteem instruments, the original seventh-grade students 
were retested as ninth graders. The purpose of this 
testing was to allow us to examine changes in self- 
esteem over time as a function of race and sex and to 
compare the stability of the SEI scores with those of the 
TSCS scores. Of the 256 students that were available 
at the time of the follow-up, 128 had originally been 
tested with the SEI and 128 with the TSCS. A com- 
parison of each of the two follow-up groups with the 
total group of subjects who had taken the same test 
showed them to be very similar. The follow-up SEI 
group had an average seventh-grade score of 65.44, 
whereas the total seventh-grade group had an average 
score of 64.99. Similarly, the follow-up TSCS group 
had an average seventh-grade score of 329.79, whereas
the total seventh-grade group had an average TSCS
score of 329.19. The scores for the original and follow-up
samples tested with the TSCS are very similar to those
found for junior and senior high school students in
several other studies (for a review, see Thompson,
1972). The scores obtained by the SEI standardization
sample (Coopersmith, 1967) were slightly higher than
those obtained by our original and follow-up groups
tested with the SEI.


Table 1
Analyses of Variance for Age, Race, and Sex
Differences in Self-Concept

Variable        SEIᵃ (1, 364)    TSCSᵇ (1, 355)
Age (A)              1.79             3.68*
Race (B)            11.81**           1.10
Sex (C)               .64              .40
A × B                 .23             1.54
A × C                 .16              .01
B × C                 .17             2.33
A × B × C             .02              .28

Note. Tests are those of effects eliminating other
effects at the same level (e.g., age eliminating race
and sex). See Appelbaum and Cramer (1974). Num-
bers in parentheses are degrees of freedom.
ᵃ SEI = Self-Esteem Inventory.
ᵇ TSCS = Tennessee Self-Concept Scale.
* p < .05.
** p < .001.


Results 


In the following section, we present analyses 
of the SEI and TSCS in terms of race and age 
differences as well as the relation of these tests 
to demographic and academic variables. Initial 
analyses were on a three-factor (race, age, and 
sex) design. As indicated in Table 1, however, 
there were no significant two-way or three-way 
interactions and no significant sex effects. 
Therefore, results are presented only for the 
main effects of race and age. For each test, we 
examine raw differences between groups as 
well as associated differences on demographic 
and academic variables. Given the intact 
groups used (racial and age groups) we sus-
pected that there would be substantial pre-
existing group differences on demographic
variables that might be logically related to
SEI and TSCS differences. Although we are
aware of the limitations of adjustment pro-
cedures with intact groups, we have presented
estimates of group differences (and tests of
significance) adjusted for covariates. Readers
are advised to regard these results with caution,
bearing in mind that it is very difficult to judge
the quality of the adjusted scores under these
circumstances.


¹ Social Position is based on a combination of weighted
educational and occupational levels. Father’s occupa-
tion and education were used to determine social posi-
tion. When for any reason this information was missing,
the mother’s occupation and education were used.


Table 2
Racial Differences on the Self-Esteem Inventory (SEI) and the Tennessee Self-Concept
Scale (TSCS): Demographic and Academic Variables

                              M
Variable               Black     White     F (df)
SEI data set
  SEI                  60.72     66.21      11.81 (1, 364)**
  Mother's education   10.57     11.28       6.41 (1, 317)*
  Father's education    9.16     10.80      19.14 (1, 288)**
  Social position       4.62      4.17      23.69 (1, 364)**
  No. children          5.89      3.65      65.35 (1, 345)**
  Achievement           5.71      7.34      93.37 (1, 285)**
  IQ                   84.04     98.79     112.26 (1, 333)**
TSCS data set
  TSCS                323.09    327.38       1.10 (1, 355)
  Mother's education   10.60     11.25       4.96 (1, 314)*
  Father's education    9.27     10.69      14.85 (1, 297)**
  Social position       4.50      4.09      22.02 (1, 355)
  No. children          5.45      3.71      [illegible]
  Achievement           6.12      7.53      98.44 (1, 320)**
  IQ                   86.77    101.19      98.98 (1, 345)**

Note. Varying degrees of freedom associated with the denominator reflect differences in the completeness of
data from school records.
* p < .05.
** p < .001.


Racial Differences 


SEI. Table 2 contains group means and 
F values for examining the main effects of race. 
The upper half of Table 2 contains raw SEI 
scores and academic and demographic informa- 
tion collected on the SEI groups. As can easily 
be seen, white subjects received higher self- 
esteem scores than black subjects. Indeed, 
black and white subjects differed significantly 
on all variables, with white students having 
higher parental education and social position, 
higher achievement and IQ scores, and fewer 
siblings. Intercorrelations of these variables
with SEI scores are given in Table 3. The
intercorrelations of these variables with the
TSCS are also presented in Table 3. All of
these variables, with the exception of father’s
occupation, correlate significantly with SEI
scores. Given the initial racial differences on 
these variables and their significant intercor- 
relations with the SEI, we attempted to obtain 
estimates of racial differences in the SEI ad-
justed for these variables through the use of
an analysis of covariance. The results of these
adjustment analyses are given in the upper
half of Table 4. Keeping our earlier proviso in
mind, we found that each of these adjustments
results in some attenuation of the racial
difference in SEI scores. White students, never-
theless, continue to have significantly higher
SEI scores when social status or the educational
level of either parent is covaried. However, the
black-white difference on the SEI is virtually
eliminated when either achievement or IQ
scores are covaried. Likewise, controlling for
the number of children in a student’s family
substantially reduces the racial difference.


Table 3
Pearson Product-Moment Correlations of
Self-Concept Measures With Demographic
and Academic Variables

Variable               SEI (r)        TSCS (r)
Mother's education      .199**        [illegible]
Father's education      .152*         [illegible]
Father's occupation     .072          −.02
No. children           −.176**        −.082
Achievement            [illegible]     .202**
IQ                      .208**         .191


Table 4
Racial Differences in Self-Esteem Inventory (SEI) and Tennessee Self-Concept
Scale (TSCS), Adjusted for Covariates

                           Difference
Item                     (black − white)    F (df)
SEI data set
  Raw SEI differences         −5.49         11.81 (1, 364)*
  Differences adjusted for
    Mother's education        −4.65          7.61 (1, 316)*
    Father's education        −4.41          5.85 (1, 287)*
    Social position           −4.87          8.86 (1, 363)*
    No. children              −3.01          3.01 (1, 344)
    Achievement                1.46           .95 (1, 284)
    IQ                         −.93           .40 (1, 332)
TSCS data set
  Raw TSCS differences        −4.29          1.10 (1, 355)
  Differences adjusted for
    Mother's education        −3.25           .44 (1, 313)
    Father's education         −.63           .03 (1, 296)
    Social position           −2.14           .23 (1, 354)
    No. children              −3.78           .68 (1, 343)
    Achievement                2.52           .56 (1, 319)
    IQ                         3.79           .91 (1, 344)

* p < .01.


TSCS. A somewhat different pattern of 
relationships emerges when one examines racial 
differences in self-concept as measured by the 
Total Positive Score from the TSCS. These 
results are presented in the lower half of Table 
2. Perhaps most important, there is not a 
significant difference in the TSCS scores ob- 
tained by black and white students, even 
though this group of white students also 
comes from significantly more advantaged 
families, has fewer siblings, and higher achieve- 
ment and IQ scores. A reinspection of Table 3 
reveals that both academic variables are signifi- 


cantly related to the TSCS; however, unlike 
the SEI, the TSCS is significantly correlated 
with only one of the demographic variables, 
father’s education. The results of the covariance 
analyses are presented in the lower half of 
Table 4. Covarying any of the demographic 
variables results in a decrease in the racial 
difference; covarying either of the academic 
variables results in a reversal of the difference 
so that the scores of black students are non- 
significantly higher than those of white 
students. 
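
The covariance adjustment described above can be illustrated with a minimal sketch. It assumes a single covariate and uses the standard one-covariate adjustment, in which the raw difference between group means is reduced by the pooled within-group slope times the difference between the groups' covariate means. The data, group labels, and function names are hypothetical and are not taken from the study; the sketch shows only the arithmetic of an adjusted difference, not the tests of significance reported in Table 4.

```python
from statistics import mean


def pooled_within_slope(groups):
    """Pooled within-group slope of the outcome (y) on the covariate (x).

    `groups` maps a group label to a list of (x, y) pairs.
    """
    sxy = 0.0
    sxx = 0.0
    for pairs in groups.values():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx


def adjusted_difference(groups, a, b):
    """Return (raw, adjusted) mean difference, group a minus group b."""
    slope = pooled_within_slope(groups)
    mean_x = {g: mean(x for x, _ in p) for g, p in groups.items()}
    mean_y = {g: mean(y for _, y in p) for g, p in groups.items()}
    raw = mean_y[a] - mean_y[b]
    adjusted = raw - slope * (mean_x[a] - mean_x[b])
    return raw, adjusted


# Synthetic (covariate, outcome) pairs, invented for illustration only.
toy = {
    "group_a": [(4.0, 58.0), (5.0, 60.0), (6.0, 62.0), (7.0, 64.0)],
    "group_b": [(6.0, 62.0), (7.0, 64.0), (8.0, 66.0), (9.0, 68.0)],
}
raw, adjusted = adjusted_difference(toy, "group_a", "group_b")
print(raw, adjusted)
```

In this toy case the two groups differ by 4 points on the outcome, but the whole difference is carried by the covariate, so the adjusted difference is zero. This is the pattern the text describes for the SEI when achievement or IQ is covaried, although with real intact groups the caution stated earlier about adjustment procedures still applies.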

These two measures, then, yield different 
results when administered to unmatched groups 
of black and white students. The SEI reveals 
racial differences; the TSCS does not. One of 
the reasons for this difference appears to be 
the fact that the SEI is correlated with a 
larger number of race-sensitive demographic 
variables than is the TSCS. Both measures are 
consistent, however, in showing the major 
impact that academic variables, such as 
achievement and IQ scores, have on self-esteem.
As suggested in Table 4, when black and white 
students are matched for either of these 
variables, their self-esteem scores do not differ. 
The results of the initial analysis with the
SEI are consistent with those of Bridgette
(1970), who found that white students have
significantly higher self-esteem scores than
black students. We did not, however, replicate
Wendland’s (1969) findings of higher TSCS
scores for black students. The present results
represent a 12-point decrease in TSCS scores
for black students and a 4-point increase for
white students when compared to Wendland’s
subjects. An explanation of our success in
replicating Bridgette’s findings but not Wend-
land’s may be that Wendland tested children
while schools were still segregated, whereas
Bridgette tested them after desegregation had
begun.


Table 5
Grade Differences in Self-Esteem Inventory (SEI) and Tennessee Self-Concept Scale
(TSCS): Demographic and Academic Variables

                              M
Variable              Grade 7   Grade 10    F (df)
SEI data set
  SEI                  64.99     62.84       1.79 (1, 364)
  Mother's education   11.04     10.97        .07 (1, 317)
  Father's education   10.25     10.17        .05 (1, 288)
  Social position       4.45      4.22       6.61 (1, 364)
  No. children          4.70      4.32       2.01 (1, 345)
  Achievement           5.69      8.01     228.08 (1, 285)**
  IQ                   91.04     95.47      10.99 (1, 333)**
TSCS data set
  TSCS                329.19    321.91       3.68 (1, 355)*
  Mother's education   11.17     10.82       1.63 (1, 314)
  Father's education   10.48      9.86       3.18 (1, 297)
  Social position       4.27      4.23        .19 (1, 355)
  No. children          4.69      4.11       5.83 (1, 344)
  Achievement           5.96      8.07     195.61 (1, 320)**
  IQ                   95.40     95.29        .01 (1, 345)

* p < .05.
** p < .001.


Grade Differences 


SEI. As indicated in the upper half of 
Table 5, the mean self-esteem score of seventh- 
grade subjects was slightly higher than that of 
the tenth-grade group, but the difference is 
not significant. The two age groups are largely 
comparable on demographic variables but, 
as one might expect, the tenth-grade group 
had a significantly higher achievement level. 
The tenth-grade group also had higher IQ 


scores. When the SEI scores are adjusted for 
each of the demographic variables, the small 
age difference in SEI remains basically un- 
changed. Adjusting SEI scores for achievement 
level, F(1, 284) = 12.84, p < .001, or for IQ,
F(1, 332) = 3.99, p < .05, does result in 
significantly higher SEI scores for the seventh- 
grade group. 

TSCS. The scores received by the seventh- 
grade subjects on the TSCS are significantly 
higher than those received by older students 
(see the lower half of Table 5). These two age 
groups are comparable on all demographic 
variables except for the number of children in 
the family, which is higher for the younger 
group. The achievement level, but not IQ, of 
the older subjects is higher. Adjusting TSCS 
scores for parental education or number of 
children in the family reduces the age difference 
so that it is no longer significant. The age
difference is increased when TSCS scores are
adjusted for social position, F(1, 354) = 3.92,
p < .05. And, as in the case of the SEI, adjust-
ing TSCS scores for achievement, F(1, 319)
= 12.77, p < .001, or IQ, F(1, 344) = 4.4[illegible],
p < .05, enhances the age effect.


RACIAL DIFFERENCES IN SELF-ESTEEM 1227


Table 6
Mean Difference Scores (Grade 9 − Grade 7 Scores)

                          SEI                       TSCS
                        M                         M
                 n   Difference   SD       n   Difference   SD
Black males      27     1.11     12.23     20    −5.90     21.84
Black females    31     5.29     14.50     28    −8.86     33.69
White males      42     −.09     16.82     41    −2.63     33.94
White females    28     1.78     13.70     39    −7.21     35.68



Changes in Self-Esteem


The primary purpose of the following
analyses was to assess patterns of change in
self-esteem as a function of race and sex.
Changes in self-esteem were examined in two
ways. First, the difference scores (Grade 9
scores − Grade 7 scores) were examined by
means of an analysis of variance in which race
and sex were the factors. Second, we per-
formed a similar analysis but used the original
seventh-grade score as a covariate to control
the possible impact that the level of the initial
score might have had on the amount of change.

The mean difference scores for both the SEI
and TSCS subjects are presented in Table 6.
In examining the scores, one should bear in
mind that the possible range of scores on the
SEI is relatively small (50 to 100), whereas
on the TSCS, scores can range from 100 to
500. Changes in SEI scores were small and
inconsistent from one race/sex group to
another. Changes in TSCS scores were also
small, but were consistent in showing a decline
from the seventh grade to the ninth grade.
The earlier comparison of these subjects as
seventh graders with tenth graders showed the
seventh-grade students to have significantly
higher scores. The present data also suggest
that one can expect a decline in TSCS scores
from junior to senior high school.

The analyses of variance revealed no main
effects for race or sex nor any interaction
effects on either the SEI or TSCS difference
scores. Furthermore, the covariance analysis
for each test was consistent in showing neither
race nor sex differences. Finally, the pooled
regression for the four race/sex groups was
highly significant (p < .001) for each of the
tests, indicating that the ninth-grade scores
were very predictable from seventh-grade
scores in all instances.
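
The difference scores summarized in Table 6 can be computed as in the following minimal sketch, which assumes each record holds a group label, a seventh-grade score, and a ninth-grade score. The records, group labels, and function name are hypothetical and invented for illustration; the sketch reproduces only the descriptive step (n, mean change, and SD of change per group), not the analyses of variance or covariance.

```python
from statistics import mean, stdev


def change_summary(records):
    """Summarize change scores (grade 9 minus grade 7) by group.

    `records` is an iterable of (group, grade7_score, grade9_score).
    Returns {group: (n, mean change, sample SD of change)}.
    """
    changes = {}
    for group, grade7, grade9 in records:
        changes.setdefault(group, []).append(grade9 - grade7)
    return {g: (len(d), mean(d), stdev(d)) for g, d in changes.items()}


# Synthetic scores, invented for illustration only.
toy_records = [
    ("group_1", 60, 62), ("group_1", 70, 68), ("group_1", 64, 67),
    ("group_2", 66, 60), ("group_2", 58, 58), ("group_2", 72, 69),
]
summary = change_summary(toy_records)
for group, (n, m, sd) in summary.items():
    print(group, n, m, round(sd, 2))
```

A negative mean change indicates a decline from the seventh to the ninth grade, the direction reported for every race/sex group on the TSCS.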


Discussion 


The major conclusion of this study is that 
it is possible to ensure or preclude findings of 
racial or age differences in self-esteem through 
test selection or control of relevant demo- 
graphic and academic variables. Although it is 
true that tests such as the SEI and the TSCS 
purport to measure the same phenomenon, it 
is also clear that they are differentially related 
to a number of factors that affect self-esteem. 
A casual inspection of these two tests dis- 
closes only that the SEI has the advantage 
of being shorter and the TSCS the advantage 
of being longer. Further examination of the 
tests reveals them to be similar in content with 
both tests sampling from a wide range of self- 
referent statements that are typically assumed 
to be important in describing the self-concept. 
The SEI is, however, more highly related to 
a number of demographic factors and reveals 
racial differences. The TSCS shows weaker 
relationships to these variables and does not 
reveal racial differences. The problem is to 
determine which self-esteem measure is to be 
preferred, especially when one is making racial 
comparisons. 

In considering the relative merits of these 
two instruments, a decision as to which is 
better cannot be based on a strict judgment of 
which has greater construct validity, since 
there is no uncontested, external criterion of 
self-esteem. Rather, the choice will have to be 
made on the basis of the network of relation- 
ships that exists between each of these tests 
and other demographic and experiential values.
One might easily argue that the test that
appears most culture-free, that has a weaker 
correlation with conventional indicators of 
status and social position, should be preferred. 
This argument would lead to the selection of 
the TSCS. One might also argue the inverse— 
that because the SEI is more highly correlated 
with other conventional measures of status 
and position, it will be more predictive of 
success and should, therefore, be preferred. 
The latter, culture-bound argument implies 
that one’s self-esteem should be, for example, 
a reflection of socioeconomic status. Although 
it is reasonable that factors such as parental 
status or school achievement might contribute 
to a child’s self-esteem, many self-concept 
theorists and researchers seem to work under 
the assumption that self-esteem is an important 
feature of personality and should, therefore, 
add to the information derived from demo- 
graphic or academic data (see Block, 1971). 
The many problems that plague this area of 
research point to the need to incorporate a 
number of methodological and conceptual re- 
finements in future studies of self-concept. 
We have already alluded to the necessity for 
care in test selection and control of subject 
characteristics; studies such as that of Long 
and Henderson (1968), which compare dis- 
advantaged black subjects with advantaged 
white subjects, contribute little to the under- 
standing of racial or social class differences in 
self-esteem. Furthermore, in reviewing studies 
relevant to the present one (that is, studies 
that make direct racial comparisons of self- 
esteem), it becomes clear that not only test 
and subject characteristics but also contextual 
variables need to be considered. Specifically, 
the racial make-up of the school may be an 
important determinant of the direction of 
racial differences in self-esteem. Most of the 
studies (including the present one) that report 
higher self-esteem scores for white students 
or no racial differences were completed in in- 
tegrated settings (Bridgette, 1970; Rosenberg, 
1965; Wylie & Hutchins, 1967). Studies that 
report more positive self-concept scores for 
black children seem to have been completed 
in either segregated schools (Baughman & 
Dahlstrom, 1968; McDonald & Gynther, 1965; 
Wendland, 1969) or desegregated schools in 
which blacks were a majority (Powell, 1973;
Rosenberg & Simmons, 1971; St. John, 1975;
Soares & Soares, 1969). 

There is also a need for greater clarity in 
the definition of self-esteem. This is a particu- 
larly difficult problem. The concept of self- 
esteem has great appeal and is evoked to 
explain a host of complex behavioral and 
personality characteristics. It is unlikely that 
any one of the instruments typically used to 
measure self-esteem will adequately encom- 
pass this complexity. What promise to be more 
feasible alternatives are greater articulation 
of and more specificity in the facet of self- 
concept being measured. Watkins (1978), for 
example, suggested an approach to self-esteem 
that attempted to take into account the value 
system of the person tested. Subjects were 
asked to rate themselves in various areas of 
life—for example, social, family, and general 
happiness—and the responses were weighted 
by the relative importance of these areas to 
the subject. Watkins suggested that this pro- 
cedure might easily be tailored to the par- 
ticular social or ethnic group under considera- 
tion. Wylie and Hutchins (1967), on the other 
hand, focused on a specific content area, 
academic self-concept, in testing groups of 
black and white adolescents whom they had 
attempted to equate on socioeconomic status 
and academic ability. 
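
The importance-weighting idea attributed to Watkins (1978) can be sketched as follows. The text states only that self-ratings in several life areas were weighted by the relative importance of each area to the subject; the rating scale, the weights, and the normalization by the sum of the weights are assumptions made here for illustration and should not be read as Watkins's actual scoring procedure.

```python
def weighted_self_rating(ratings, importance):
    """Importance-weighted mean of self-ratings.

    `ratings` and `importance` map the same life areas to numbers.
    Dividing by the total weight is an assumption of this sketch.
    """
    total_weight = sum(importance[area] for area in ratings)
    if total_weight == 0:
        raise ValueError("at least one area must have nonzero importance")
    weighted_sum = sum(ratings[area] * importance[area] for area in ratings)
    return weighted_sum / total_weight


# Hypothetical ratings (higher = more favorable) and importance weights.
ratings = {"social": 4, "family": 2, "general happiness": 3}
importance = {"social": 1, "family": 3, "general happiness": 1}

score = weighted_self_rating(ratings, importance)
unweighted = sum(ratings.values()) / len(ratings)
print(score, unweighted)
```

Here the subject rates family lowest but considers it most important, so the weighted score (2.6) falls below the unweighted mean (3.0). Two subjects with identical ratings can therefore receive different scores if their value systems differ.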

In addition to delimiting self-concept in the 
preceding and other ways, we also need to be 
clear as to whether our intention is to measure 
the individual’s self-concept or the individual's 
pride in his or her race. Although these two 
ideas may be interdependent, there is no need
to assume that they are equivalent either to 
one another or to the status one’s racial or 
ethnic group has in society (Rosenberg, 1965). 

Finally, the truism that intragroup varia-
tions are frequently greater than intergroup
differences bears periodic repetition for those
studying racial differences in self-esteem.
Although this does not mean that racial com- 
parisons should not be made, it does mean that 
global racial comparison will be of limited 
utility. If one is primarily interested in study- 
ing the plight of blacks and other minorities, 
then it is eminently more useful to investigate 
the causes and prevention of racism, poverty,
and discrimination. If, on the other hand, one 
is primarily interested in the antecedents of
self-esteem, then another approach is de- 
sirable—one that focuses on variations in self- 
esteem of subjects who are equated on other 
characteristics that are known to be relevant. 
Studies of racial differences in self-esteem are 
not instructive either with regard to the nature 
of self-concept or racial differences in person- 
ality when the groups initially differ in ways 
that are confounded with race. 


Reference Note 


1. Hollingshead, A. B. Two-factor index of Social 
Position. Unpublished manuscript, Yale University, 
1957.


Observer Bias in the Attitude Attribution Paradigm:
Effect of Time and Information Order 


Edward E. Jones, Janet Morgan Riggs, and George Quattrone 
Princeton University 


Subjects were exposed to an essay taking either a pro or anti stand on the use 
of minority quotas. They were informed either before or after reading the 
essay that the writer had or had not been permitted to choose which side of 
the issue to support. Subjects returned for a second session a week after the 
first, and their attributions of true attitude to the essayist were again mea- 
sured. The results replicated previous attitude attribution findings, but the 
tendency to attribute an essay-consistent attitude when the essay was written 
under high constraint was much greater when the constraint information ap- 
peared after exposure to the essay. There was no hint of a sleeper effect and, 
therefore, no evidence that constraint cues were discounted over time. 


The attitude attribution paradigm—diag- 
nosing “true attitudes” from opinions ex- 
pressed under varying situational pressures— 
has provided consistent evidence of a tendency 
to assume correspondence between the ex- 
pressed opinions and underlying attitudes of a 
target person acting under high constraint. The 
data are summarized and discussed in a recent 
article by Jones (1979). This robust tendency 
toward observer bias has been referred to by 
Ross (1977) as the “fundamental attribution 
error.” Ross emphasizes the ubiquity of over- 
attributing personal dispositions to account for 
behavior and reports the results of several in- 
genious experiments that carry the phenome- 
non beyond the realm of attitude attribution 
per se. Nevertheless, the attitude attribution 
paradigm stands out because of its historical 
importance as a basis for the actor-observer 
divergence proposition (Jones & Nisbett, 
1971), and because research within the para- 
digm has withstood intense scrutiny for arti- 
facts; that is, explanatory alternatives lacking 
in general theoretical interest. 

The present experiment was designed to 
explore the role of temporal delay in the atti-
tude attribution process, and to determine
whether attributions for constrained behavior 
are affected by the locus of constraint informa- 
tion in the experimental sequence. 

The role of temporal delay is of interest be- 
cause of a clear analogy between the attitude 
attribution paradigm and the traditional 
sleeper effect paradigm in attitude change. Ac- 
cording to the early formulations of Hovland 
and his colleagues (Hovland & Weiss, 1951; 
Kelman & Hovland, 1953), a sleeper effect may 
occur when a communication is paired with a 
negative or low credibility source. Over time, 
their reasoning goes, the discounting informa- 
tion about the source becomes dissociated from 
the communication, which, therefore, has a
delayed effect on attitude change. Gillig and 
Greenwald (1974) have argued that the evi- 
dence for the sleeper effect is suspect. Never- 
theless, it seems reasonable that under some 
conditions, competing pieces of information 
would have different decay functions in
memory. Since the behavior involved in an 
expressed opinion is presumably more salient 
than the contextual information about con- 
straining conditions, there is some reason to 
think that retention of the essay’s direction and 
extremity will be better than retention of the 
discounting information that it was prepared
under highly constraining conditions. The con-
straint information is viewed as analogous to
discounting information about the source in
sleeper effect studies, and degree of corre-
spondence is analogous to attitude change,
since it represents a departure from the neutral 
or typical attitude that presumably would have 
been attributed in the absence of any informa- 
tion about the target person. 

Order of information has been constant in 
all previous attitude attribution studies. In the 
natural world, we usually learn about or make 
surmises about the context of behavior before 
or during the action itself. Nevertheless, it 
might be argued that the tendency to over- 
attribute dispositions from constrained be- 
havior may be adequately explained as the 
effect of recency on recall, since information 
about the context in the typical attitude at- 
tribution study is temporally more remote than 
the opinion statement from the dependent at- 
tribution rating. In the present study, there- 
fore, statement-context and context-statement 
orders were directly compared. If the state- 
ment-context order were to eliminate the 
fundamental attribution error, this would 
place severe qualifications on the generality of 
the effect. 


Method 


Subjects 


Undergraduate volunteers (N = 79) participated for
money in a study of impression formation. Subjects
were run in groups of 12 or fewer, and each participated
in two experimental sessions that were 1 week apart. 


Procedure 


Session 1. When subjects arrived for the first 
experimental session, they were told that the experi- 
menter was interested in “how people form impressions 
of others depending on the information they receive 
and how this information is conveyed” (video, audio, 
written, etc.). Subjects were told to expect to receive 
information about another person through various 
means in the experimental sessions in which they would 
be participating.

A booklet containing the following materials was
then given to each subject: an essay supposedly written
by a target person (male) concerning the use of a
minorities quota system in determining college ad-
missions; instructions supposedly given to the target
person at the time the essay was written; and a ques-
tionnaire to be completed by the subject. The essay
was either in favor of or opposed to the use of a quota
system in college admission policy. Each essay was ap-
proximately 370 words long and contained four argu-
ments in support of the stand taken.

The instructions included in the booklet either gave
the target person the freedom to take whatever position 
he chose on the issue or assigned him a position, pro 
or anti, to support in his essay. The essay always en-
dorsed the position assigned. Instructions to the target
person were prefaced by the statement: “You might be 
interested in knowing that the essay was written under 
the following instructions.” The target person was then 
described as being asked to imagine that he was a 
member of a debating team and was instructed to 
prepare a series of persuasive essays or opening state- 
ments on various opinion issues. Included in these in- 
structions was either the no-choice manipulation, 
“Today I would like you to write a persuasive essay 
favoring [opposing] the quota system as a part of 
college admission procedures,” or the choice manipula- 
tion, “Today I would like you to write a persuasive 
essay either in favor of or opposed to the quota system 
as a part of college admission procedures.” Finally, order 
of presentation of the essay and instructions was varied; 
the instructions were either presented immediately 
before or immediately after the essay in the booklet. 

To summarize, the three independent variables 
manipulated in Session 1 were position taken in an 
essay toward the use of a quota system in college ad- 
missions (pro or anti), instructions under which the 
essay was prepared (choice or no choice), and order of 
presentation of essay and instructions in the booklet 
(essay first or essay last). Each subject was randomly
assigned to one of the eight experimental conditions. 
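
The 2 × 2 × 2 design summarized above can be sketched in a few lines of Python. This is only an illustration: the article states that assignment was random but does not describe the procedure, so the shuffled, nearly balanced allocation below, the function name `assign_conditions`, and the `seed` argument are all assumptions. Which cell ends up one subject short is arbitrary in this sketch.

```python
import itertools
import random
from collections import Counter

# The three manipulated variables, as named in the Method section.
POSITIONS = ("pro", "anti")
INSTRUCTIONS = ("choice", "no choice")
ORDERS = ("essay first", "essay last")


def assign_conditions(n_subjects, seed=0):
    """Return one (position, instructions, order) cell per subject.

    Hypothetical procedure: fill the cells as evenly as possible,
    then shuffle the resulting slots with a seeded generator.
    """
    cells = list(itertools.product(POSITIONS, INSTRUCTIONS, ORDERS))
    repeats = -(-n_subjects // len(cells))  # ceiling division
    slots = (cells * repeats)[:n_subjects]
    random.Random(seed).shuffle(slots)
    return slots


# With 79 subjects, seven cells receive 10 subjects and one receives 9.
cell_counts = Counter(assign_conditions(79))
for cell, count in sorted(cell_counts.items()):
    print(cell, count)
```

A balanced allocation of this kind would be consistent with the cell sizes reported in the note to Table 1, although other randomization schemes could produce the same sizes.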

After reading the material, subjects completed a 
questionnaire that included evaluation of the target 
person’s attitude toward the use of minority quotas in 
college admissions and other related issues. These in- 
cluded evaluation of the target person’s attitude toward 
the busing of school children to facilitate integration, 
attitude toward the use of minority quotas in the hiring 
of employees, willingness to support actively the im- 
plementation of the quota system in college admissions, 
frequency of discriminatory behavior toward others 
based on minority membership, and general liberalism- 
conservatism. The questionnaire continued with several 
manipulation checks (“How much freedom did the 
target person have to choose the position or side he 
took?” “direction and extremity of the target person’s 
essay...”); a final question asked for the subject’s own 
attitude and his or her estimate of the typical under- 
graduate attitude toward the use of a minorities quota 
system. Subjects were then asked to return for another 
experimental session 1 week later. They were directed 
to talk with no one concerning the experiment during 
that week. 

Session 2. When subjects returned for the second 
experimental session, they were first asked to recall the 
information they had received the week before about 
the target person and were then asked to complete a 
questionnaire containing several of the dependent mea- 
sures administered in the first session. It was explained 
that some subjects might have changed their feelings 
or forgotten the first week’s material over the time in- 
terval; therefore, they were urged to record their cur- 
rent impressions of the target person who wrote the 
essay. To provide suitable justifications for the second
session and for rating changes if the subject’s current
feelings differed from his or her recalled impression, a
5-minute tape of a supposed interview between the
experimenter and target person was then played to
provide “additional information.” The interview
avoided discussion of social issues and described a person
who seemed to be a typical college freshman.

Figure 1. Attitude attribution across sessions.

Following the tape, subjects were asked to evaluate 
the target person on a series of interview-relevant di- 
mensions and to reevaluate the target person on the 
same dependent measures administered previously. 
This third set of ratings did not differ markedly from 
the second, although subjects tended to converge 
toward a more neutral attributed attitude, probably 
because of the rather bland and noncommittal impres- 
sion conveyed by the tape. The ratings made prior to 
the tape in Session 2 are the focus in the subsequent 
presentation of Session 1-Session 2 differences. 

At the close of the second session, subjects were fully 
debriefed. 


Results 


The present experiment had two primary
goals: (a) to explore the possibility of a sleeper
effect or the differential retention of behavioral
versus contextual information over time and
(b) to investigate the role of information order
in leading observers into the “fundamental at-
tribution error.” We shall deal with data rele-
vant to both of these goals after evaluating the
general evidence on observer bias.


Correspondence and Observer Bias 


Previous findings (cf. Jones, 1979) have 
shown greater correspondence of attitude at- 
tribution under choice than no-choice condi- 
tions. This was again clearly demonstrated in 
the present experiment: When the target 
person was alleged to have a choice in produc- 
ing the essay read by subjects, he was seen to 
hold a more extreme attitude toward quota 
systems in education (in line with the direc- 
tion of the essay) than when the target person 
had no apparent choice. The statistical interaction
between essay direction and choice was
significant in both sessions: Session 1, F(1, 71)
= 20.32, p < .001; Session 2, F(1, 71) = 8.45,
p < .01.

Previous findings of significant correspon-
dence of attribution and essay direction under
no-choice conditions were also replicated in 
both sessions. In Session 1, the no-choice-pro 
essayist was seen as more in favor of quotas 
(M = 15.9) than the no-choice-anti essayist 
(M = 8.4), F(1, 71) = 56.85, p< .001. In 
Session 2, the tendency was reduced but still 
significant, F(1, 71) = 6.14, p < .05. In both 
sessions target persons with choice were, of 
course, attributed even more extreme attitudes. 


A Sleeper Effect? 


Since past evidence suggested that subjects 
pay more attention to behavior (figure) than 
its context (ground), it seemed quite possible 
that behavior in the no-choice conditions would 
have even more impact as remembered in 
Session 2 than in Session 1. In other words, 
there might be a sleeper effect in attitude at- 
tribution as a function of forgetting the less 
salient constraint information more rapidly 
than the more salient behavior. 

The results provide no support for the sleeper 
effect notion. A repeated-measures analysis of 
variance showed very little variance (F < 1.00) 
attributable to the crucial interaction between 
essay direction, choice, and session. In fact, as 
Figure 1 shows, the observer bias or funda- 
mental attribution error disappears by the 
second session if the context has preceded the 
essay in the first session. The bias is main- 
tained in the essay-first condition. Clearly, in 
any event, there is no hint of the kind of 
sleeper effect when the impact of behavior 
direction becomes greater over time. If any- 
thing, the implications of contextual con- 
straints themselves become stronger over time. 


Table 1
Attributed Attitude Toward Quota System in College Admissions (Session 1): Means

                   Choice                    No choice
Position    Essay first  Essay last    Essay first  Essay last
Pro            16.3        18.3ᵃ          17.6        14.1
Anti            4.4         3.8            7.0         9.7

Note. 1 (strongly opposed) to 21 (strongly in favor).
ᵃ n = 9 in this cell; n = 10 in all others.


Table 2
Attributed Attitude Toward Quota System in College Admissions (Session 1): Analysis of Variance

Source                   df        MS          F
Essay direction (A)       1     2,111.23    159.58**
Constraint (B)            1        38.16      2.88
Information order (C)     1        Ad
A × B                     1       160.29     12.12**
A × C                     1        15.68      1.19
B × C                     1         6.12
A × B × C                 1        95.90      7.25*
Within error             71        13.23

* p < .01. ** p < .001.


At least this appears to be true when the no- 
choice context is initially presented prior to 
the essay. 


Effects of Information Order 


It is the case that all previous research on 
attitude attribution has featured a standard- 
ized order of information about contextual 
constraints followed by an opinion statement. 
Although this probably parallels mundane 
reality most of the time, an obvious question 
is whether context is relatively ignored (and 
behavior relatively emphasized) because of a 
recency effect. Since the essay or speech is the 
last piece of information before the crucial 
rating in the standard paradigm, perhaps its 
differential influence is a function of its differ- 
ential salience in memory. 

The present results clearly speak against 
such a recency effect. In fact, whether one looks 
at the data from Session 1, Session 2, or com- 
bines the data from both in a repeated-mea- 
sures analysis, the essay has greater impact 
when it comes first. For convenience of exposi- 
tion we shall concentrate on the Session 1 data, 
though we have already seen (in Figure 1) that 
the differences of interest become greater in 
Session 2. 

Tables 1 and 2 present the means and 
analysis of variance summary, respectively, of 
ratings on a 21-point scale of attributed at-
titude toward “the quota system in college
admissions.” It can be seen that there is a 
significant triple interaction, F(1, 71) = 7.25, 
p< .01, that focuses on the greater spread 
between no-choice-pro and no-choice-anti

Table 3
Judged Essay Direction, Perceived Freedom, and Confidence: Means

                     Essay first                Essay last
                  Pro          Anti          Pro          Anti
Measure         C     NC     C     NC      C     NC     C     NC
Extremityᵃ    19.2  19.6    2.9   3.5    18.2  16.4    3.9   5.2
Freedomᵇ       3.4  16.2    4.2  14.6     4.6  20.3    2.9  20.5
Confidenceᶜ   17.3  14.7   16.3  15.6    14.0  10.3   18.5  12.2

Note. C = choice; NC = no choice.
ᵃ 1 (essay strongly opposed) to 21 (essay strongly favored).
ᵇ 1 (complete freedom) to 21 (no freedom).
ᶜ 1 (low confidence) to 21 (high confidence).


ratings when the essay is presented first rather 
than last. This is also apparent in Figure 1 (see 
no-choice, Session 1 means). The triple inter- 
action is larger than the simple Direction 
X Order interaction because order of informa- 
tion has no impact within the choice conditions. 
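
As an illustrative check that is not part of the original analysis, the pattern just described can be recomputed from the published summary figures. The sketch below takes the cell means as we read them from Table 1 (on the 1 to 21 scale) and the mean squares from Table 2; the variable names are our own. It computes the pro-minus-anti spread in each cell and recovers each F ratio as the effect mean square divided by the within-cell error mean square.

```python
# Cell means from Table 1 (Session 1), 1-21 attributed-attitude scale.
# Keys: (constraint, order) -> (pro mean, anti mean)
means = {
    ("choice", "essay first"): (16.3, 4.4),
    ("choice", "essay last"): (18.3, 3.8),
    ("no choice", "essay first"): (17.6, 7.0),
    ("no choice", "essay last"): (14.1, 9.7),
}

# Pro-minus-anti spread: how strongly attributions follow essay direction.
spread = {cell: round(pro - anti, 1) for cell, (pro, anti) in means.items()}

# Mean squares from Table 2; each F ratio is MS_effect / MS_within.
ms_within = 13.23
ms_effect = {"A": 2111.23, "B": 38.16, "A x C": 15.68, "A x B x C": 95.90}
f_ratio = {name: round(ms / ms_within, 2) for name, ms in ms_effect.items()}

for cell, value in spread.items():
    print(cell, value)
print(f_ratio)
```

On these figures the no-choice spread is 10.6 scale points when the essay comes first and 4.4 when it comes last, whereas the choice conditions show large spreads in both orders (11.9 and 14.5). That contrast between the choice and no-choice rows is what the triple interaction reflects.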

Related traits. The reader will recall that 
subjects also rated the target person on related 
opinion issues ranging from busing to general 
liberalism. These ratings were summed, and 
an analysis of variance revealed a pattern of 
F ratios almost identical to those on the most 
directly relevant admissions quota question. 
The triple interaction was again significant, 
F(1, 71) = 4.43, p < .05, as was the simple 
Order X Direction interaction, F(1, 71) 
= 5.70, p < .05. On both the direct trait rating 
and the related ratings composite, the simple 
effect of essay direction was significant re- 
gardless of when the context was mentioned, 
though the difference was significantly greater 
when the essay preceded the explanation of 
context. 

Supporting perceptions. In approaching the
question of why the attribution error is greater 
when the essay comes first, it is important to 
note that the essay-first subjects appear to see 
the behavior and the situation differently than 
the essay-last subjects. When asked to judge 
the extremity of the essay, compared to the 
essay-last subjects the essay-first subjects 
rated the pro essay as more in favor of and
the anti essay as more opposed to the quota
system; for the interaction between order and
direction, F(1, 71) = 10.60, p < .01. 


Not only was the essay judged more ex- 
treme when it preceded the context, but the 
essayist was judged to have greater freedom 
when the no-choice context was mentioned 
after exposure to the essay. The fact that this 
effect of order on perceived freedom differen- 
tiated the two orders only in the no-choice 
conditions is reflected in a significant Con- 
straint X Order interaction, F(1, 71) = 5.25, 
p < .05.

In line with these two findings, subjects in 
the essay-first condition were significantly more 
confident of their attitude attributions than 
were subjects in the essay-last condition. This 
difference was especially marked in the no- 


Table 4
Judged Essay Direction, Perceived Freedom, and Confidence: Analysis of Variance Summary

                           F
Source          Extremity   Freedom   Confidence
Eei 27 
direction (A) 733:7190* ` E: 
Constraint (B) 0 164.21*** 10.08 
Information 4.52 
order (C) 32 sot 4h 
A × B 2.45 .03
A × C 10.60** .07
B × C .56 5.25*
A × B × C 1.77 .95


* p < .05. ** p < .01. *** p < .001.


Figure 2. Own attitude by experimental conditions. (Numbers in parentheses refer to within-cell correlations.)


choice conditions, F(1, 71) = 6.94, for order 
within no choice. 

The means and analysis of variance sum- 
maries on these items are in Tables 3 and 4, 
respectively.

Own and perceived typical attitude. Subjects 
were finally asked to rate their own attitude 
toward the quota system in college admissions 
as well as that of the average undergraduate. 
In previous attitude attribution studies, the 
subjects’ own subsequently measured attitude 
has sometimes, but by no means always, been 
positively related to their attributions to the
target person (cf. Jones, 1979). In the present 
study a most distinctive interaction was dis- 
covered. This is portrayed in Figure 2. In all 
but those no-choice conditions in which the 
essay was presented first, subjects showed a 
contrast effect: Those exposed to a pro essay 
rate themselves as more against quotas than
those exposed to an anti essay. However, this 
trend was markedly reversed in the no-choice, 
essay-first conditions. The resulting triple
interaction (Essay Direction × Constraint
× Order) was highly significant, F(1, 70)
= 15.05, p < .001. The same trends are also
present in the data on perceived typical at-
titude. The triple interaction was again signifi-
cant, though not quite as strong: F(1, 70)
= 7.75, p < .01.

Figure 2 also indicates the within-cell cor- 
relations between attributed attitude and own 
attitude. None of these correlations was signifi- 
cant, and the pattern of the larger ones is un- 
interpretable and unrelated to the distribution 
of means as a function of the independent 
variables. Thus we are faced not with a general 
tendency for attributed attitude to follow own 
attitude or to contrast with own attitude, but 
a tendency for the various experimental condi- 
tions to have systematic effects on attributed 
attitude, own attitude, and perceived typical 
attitude. 

The most reasonable accounting for the 
errant findings in the no-choice, essay-first 
cells is attribution to chance: The significant
interactions on the own-attitude data may
represent a Type I error of accidental signifi- 
cance. To shed light on this possibility, 28 addi- 
tional subjects were run in the four no-choice 
conditions (i.e., retaining the variations in 
essay direction and information order). Al- 
though the distinctiveness of the essay-first 
conditions was not nearly as striking in this 
sample, the same general pattern was observed. 
In the new sample the simple interaction be- 
tween order and essay direction was not signifi- 
cant, but adding the two samples together pro- 
duced a highly significant interaction, F(1, 63) 
= 22.77, p < .001. This seems too robust to be 
explained away as a chance vagary of the 
random assignment of subjects. We can only 
conclude that some aspect of the experimental 
experience caused subjects to evaluate their 
own attitudes differently as a function of the 
condition to which they were assigned. 


Discussion 


The results of the present experiment pro- 
vide no support for any speculations con- 
cerning a sleeper effect in attitude attribution. 
Reasoning by analogy with Hovland and 
Weiss’s (1951) finding that in delayed recall 
subjects remember the content of a message 
better than discounting characteristics of the 
source, we speculated that the positions of an 
author might be better recalled after a week’s 
time than the circumstances under which the 
author prepared the statement. In the present 
experiment subjects returned a week after 
their exposure to a pro or anti essay on minority 
quotas and were asked to rate the true attitude 
of the writer without further exposure to his 
essay or the constraint context in which it was 
allegedly prepared. Subjects previously run in 
the high-choice condition showed little or no 
change in their unequivocal attitude attribu- 
tions. In contrast, subjects initially exposed to 
a highly constrained target person showed the 
customary observer bias in the first session, 
but this dissipated completely in the second 
session if the contextual information had pre- 
ceded the essay. When the essay was presented 
first, there was some regression to a neutral 
attribution by the second session, but much of 
the biased attribution remained. There was 
clearly no evidence for a sleeper effect in any
condition; on the contrary, when there is any
change over time (i.e., in the no-choice, essay-
last conditions), the attribution error is di-
minished rather than augmented.

Prior evidence for a sleeper effect in attitude 
change studies has already been severely chal- 
lenged by Gillig and Greenwald (1974), though 
their conclusion has in turn been questioned 
by Gruder et al. (1978). The latter authors 
suggest that absolute sleeper effects (actual 
increases in attitude change over time) are to 
be expected only when there is no immediate 
attitude change in a “discounting cue” group. 
They reason that immediate attitude changes 
can be expected to decay with time, and such 
decay would countervail against the sleeper 
effect. The results of their two experiments 
provide strong evidence in favor of their view. 
To translate these concerns into terms relevant 
for the attitude attribution paradigm, the no- 
choice context information may be viewed as 
the discounting cue and the attribution of an 
attitude that is more “pro” or more “anti” 
than the perceived typical attitude may be 
viewed as attitude change. Following the logic 
of Gruder et al., then, the initial existence of 
observer bias in the no-choice conditions de- 
creases the likelihood that an absolute sleeper 
effect would be observed: Such an effect would 
have to operate against the general tendency 
for attitude attributions to regress toward the 
neutral point or the perceived typical attitude. 

The present results are not even consistent 
with the possibility of a relative sleeper effect— 
when the effects of discounted material are 
relatively more persistent than the effects of 
undiscounted material. There was, in fact, no 
discernible decay of effect in the undiscounted 
(choice) conditions, whereas a clear decay was 
apparent in one of the discounted conditions 
(no-choice, essay-last). In contrast to the pos-
sibility that the essay, because of its initial
salience, would retain its impact better than
information about constraint, the direction of
the essay seemed to be ignored by the average
subject in the no-choice, essay-last condition.
This convergence over time (see Figure 1) may
reflect regression to the mean, but it should be
noted that there was no such convergence in the
essay-first conditions. It is still necessary, then,
to account for differential regression as a func-
tion of the position of the discounting cue
(no-choice instructions) during Session 1. 

It should also be noted that the present at- 
titude attribution paradigm differs from the 
typical sleeper effect paradigm in partially 
reinstating both the communicator and his 
“message.” Thus in the typical sleeper effect 
study, the delayed measure of attitude is taken 
in a separate experimental context, and no 
mention is made of the initial experimental 
session. In the present study, however, it was 
necessary to remind subjects of the prior 
experimental session and the target person 
whom they would again be asked to judge. 
This is not analogous to the reinstatement con- 
dition of Kelman and Hovland (1953), in which 
subjects were reexposed to the discounting cue 
itself (a description of the communicator that 
manipulated his credibility). Nevertheless, the 
inescapable partial reinstatement involved in 
bringing to mind the previous week’s session 
may have somehow muted the sleeper effect. 

The most striking results of the present ex- 
periment are those associated with the effects 
of information order. Although the funda- 
mental attribution error is demonstrable 
whether the essay follows or precedes the in- 
formation about constraint (no choice), it 
clearly is strengthened when the essay is read 
prior to the constraint information. In fact, 
there was no significant difference between 
attributions in no-choice and choice conditions 
when the essays came first (Essay direc- 
tion X Order interaction, F = 2.06, ns). Also, 
there was little or no decay in the attribution 
effects over a week’s time. It appears that the 
no-choice information is virtually ignored when 
the subject has earlier been exposed to a strong 
pro or anti essay. Why might this be the case? 


Our findings are reminiscent of the persever-
ance effect identified by Ross, Lepper, and
Hubbard (1975). These investigators found 
that both actors’ and observers’ impressions 
persisted when their basis was subsequently 
discounted. The perseverance effect has been 
explained in terms of a cognitive consolidation 
process wherein the initial impression is but- 
tressed through a reaching out for supportive 
data in remembered prior experiences, or
through an initial salience of confirmatory as- 
sociations. In the present experiment, subjects 
in the essay-first conditions presumably as-
sume at first that the essay is written in the
absence of clear-cut constraints and therefore 
represents the true viewpoint of the writer. 
Their impression of the writer’s attitude to- 
ward minority quotas in education is readily 
buttressed by associations to related attitudes 
and personality attributes that are automati- 
cally inferred. The readers will recall that in- 
ferences about related attributes (toward 
busing, etc.) were significantly more consistent 
with the essay when it preceded the context 
than when the essay was last. 

When the essay-first subjects then learn 
that the essay was written under highly con- 
straining conditions, they make an adjustment 
in their assessment of attitudinal extremity, but 
the buttressing cognitive work they have al- 
ready accomplished renders this adjustment 
insufficient. (Indeed, it is not even statistically 
significant.) In the essay-last conditions, on 
the other hand, the adjustment is more ade- 
quate and the modest (but still significant) 
tendency toward overattribution is accom- 
panied by a modest amount of cognitive but- 
tressing. The perseverance explanation, then, 
implies that the initially attributed attitude 
remains attributed (or is possibly reattributed) 
as a necessary implication of the associated at- 
titudes that have already been cognitively con- 
structed by the subject. This would not be the 
first time that the origins of an inference have 
been disregarded once the inference is made, 
and new or different origins have been assigned. 
Snyder and Frankel (1976), for example, showed
that emotional dispositions are inferred to ac- 
count for emotional behavior that was itself 
previously inferred from an arousing context. 

But what shall we make of the distinctive 
tendency for the subjects’ own attitudes, and 
their estimates of the perceived typical at- 
titude, to vary positively with the attitude 
attributed to the essay writer only in the no- 
choice, essay-first conditions? We know from 
the correlational patterning (and the total lack 
of any overall correlation between own and 
attributed attitude—overall r = —.099) that 
this distinctive relationship is clearly produced 
by the independent variables and their par- 
ticular order. The one explanation that occurs 
to us involves a mechanism of consistency- 
maintaining cognitive work not unlike the 
well-known strategies of dissonance reduction.
The subjects who read a pro-quota essay, for
example, who make a strong attribution and 
then learn that the writer had no choice, are in 
a situation of dissonance or inconsistency. They 
have been “taken in.” They have made an 
inference that is not warranted because of the 
information they belatedly learn about the 
context. This may be dissonant or inconsistent 
with the subjects’ desire to see themselves as 
sophisticated and capable of seeing when 
someone (in this case the essay writer) does 
not really mean what he says. Faced with this 
mildly unpleasant challenge to their self- 
concept, the subjects may attempt to rational- 
ize their premature commitment by shifting 
their conception of the typical attitude and of 
their own attitude. After all, if the essay is at 
or near the consensus position, it is perfectly 
reasonable to assume that the essay writer 
really believed what he said—even though 
there is a reason to discount the essay. In this 
way, subjects are able to maintain their at- 
tributional commitment for a reason that is not 
vulnerable to the new, discounting information. 

A dissonance account is also consistent with 
the fact that little attribution decay is ob- 
served from Session 1 to Session 2 in the no- 
choice, essay-first conditions (see Figure 1). If 
the attribution were buttressed by similar at- 
tributions about the self and typical other, then 
one would expect the attributions to be quite 
stable over time. 

Regardless of the eventual explanation for 
the strong order effects observed in the present 
experiment, it is clear that the “fundamental 
attribution error” is enhanced when discount-
ing information follows the formation of an
unequivocal impression. It goes without saying
that observer overattribution under conditions 
of constraint is more than an effect of the sub- 
ject’s most recent exposure to the target 
person’s behavior.


A model for the prediction of behavior from beliefs was tested in the context
of voting behavior in the presidential election of 1976. A two-wave panel de- 
sign was used in a field survey in which voting behavior was predicted from 
beliefs about candidates in accord with a recent subjective probability model. 
Results supported the model, yielding an average correlation between predicted 
and obtained voting behavior of .75. The average correlation between predicted
and obtained voting intention was .84. The ability of the model to predict be- 
havior and behavioral intentions was relatively unaffected by the educational 
level of the respondent and the type of beliefs under consideration.


A major concern of social psychology has 
been the prediction of social behavior. 
Researchers have used measures of attitude 
(e.g., Wicker, 1969), norms (e.g., Schwartz,
1973) and personality traits (e.g., Mischel, 
1968) in the attempt to achieve accurate 
behavioral prediction—all with mixed success. 
Dulany (1968) presented an approach to 
behavioral prediction based on behavioral 
intent. According to this perspective, a 
person’s behavior (at least most behaviors of 
concern to social psychologists) is largely a 
function of his or her intention to perform that 
behavior. Research by Dulany and others 
(e.g., Ajzen & Fishbein, 1973) has observed 
strong relations between behavioral intentions 
and overt behavior. Conceptual discussions 
of the relationship between these two variables 
have been presented by Dulany (1968) and 
Fishbein and Jaccard (1973). 

A number of theorists have attempted to 
extend this work by specifying factors that 
determine behavioral intentions. A number of 
models of the relationship between intentions 
and psychological variables have been pre- 
sented in the context of attitude research 
(Fishbein & Ajzen, 1975), verbal learning 
(Dulany, 1968), cross-cultural research (Tri-
andis, 1974), and consumer psychology (Sheth,
1973). Each of these approaches is based on 
the general linear model, in which a set of 
predictor variables (e.g., attitudes, norms) 
are weighted and summed to yield an overall 
prediction of behavioral intent. Generally 
speaking, the types of predictor variables 
used have been similar across the various 
theories. 

An alternative approach to understanding 
the relation between behavioral intentions and 
psychological variables has been presented by 
Jaccard and King (1977). These theorists 
were concerned with the relationship between 
beliefs and behavioral intentions. A belief was 
defined as a perceived relation between two 
“objects.” The term object was used in its 
most generic sense, so that it could refer to 
a person, a behavior, an attribute, a goal, or 
some concept. Belief strength referred to the 
subjective probability linking the two objects. 
For example, in the belief statement, “Smoking 
cigarettes causes cancer,” belief strength 
refers to the individual’s perceived probability 
that “smoking cigarettes causes cancer.” This 
probability may range from .00 to 1.00. A
behavioral intention was defined as a perceived 
relation between oneself and some behavior. 
Characterizing this relation is also a subjective 
probability. For example, the statement, “I 
intend to vote for Gerald Ford” consists of a
perceived relation between “myself” and the
behavior of voting for Gerald Ford. The
subjective probability linking these two objects
may vary from .00 (“I definitely intend not to
vote for Gerald Ford”) to 1.00 (“I definitely
intend to vote for Gerald Ford”).

A behavioral intention, according to Jaccard 
and King, does not differ in its basic structure 
from any other belief. However, it does 
possess certain characteristics that distinguish 
it from other beliefs: (a) it always links a 
person with some action, (b) it always refers 
to future behavior, and (c) it is usually 
correlated with overt behavior (Dulany, 1968; 
Fishbein & Ajzen, 1975). These characteristics 
suggest that it would be useful to more fully 
understand this “special” belief and its relation 
to other beliefs. 

Given that both beliefs and behavioral 
intentions can be conceptualized as subjective 
probabilities, Jaccard and King (1977) at- 
tempted to extend the work of Wyer (1974) 
using a model relating these subjective 
probabilities. Specifically, Wyer (Wyer & 
Goldberg, 1970; Wyer, 1970, 1974) tested and 
found support for the following equation as a 
model of relations among beliefs: 


$$P_A = P_B P_{A \mid B} + (1 - P_B) P_{A \mid \bar{B}}, \qquad (1)$$


in which $P_A$ = the subjective probability that
Proposition A is true, $P_B$ = the subjective
probability that Proposition B is true, $P_{A \mid B}$
= the subjective probability that Proposition
A is true given that Proposition B is true, and
$P_{A \mid \bar{B}}$ = the subjective probability that Prop-
osition A is true given that Proposition B is
not true. Jaccard and King (1977) attempted
to extend this model to the analysis of be- 
havioral intentions by rewriting Equation 1, 
substituting subjective probabilities character- 
izing a behavioral intention : 


$$P_I = P_B P_{I \mid B} + (1 - P_B) P_{I \mid \bar{B}}, \qquad (2)$$


in which $P_I$ = a person’s intention to perform
a behavior (e.g., vote for Gerald Ford),
$P_B$ = a person’s perceived probability that
Proposition B is true (e.g., Gerald Ford
favors reduced government spending), $P_{I \mid B}$
= a person’s perceived probability that he or
she would perform the behavior, given that
Proposition B is true, and $P_{I \mid \bar{B}}$ = a person’s
perceived probability that he or she would
perform the behavior, given that Proposition B
is not true. A discussion of the conceptual
relationship between this model and the 
models of Fishbein (1972) and others has $ 
been presented by Jaccard and King (1977); 
hence, it will not be considered here. The 
difference between the two conditional prob- 
abilities, Prj2 — Prim, was termed the psy- 
chological relevance of a belief for a behavioral 
intention. When this difference is large, B may 
be said to be psychologically relevant to the 
intention (e.g., a person would vote for Gerald 
Ford if he favored reduced government 
spending but would not vote for Gerald Ford
if he did not favor reduced government
spending). Conversely, when the difference 
between these two conditional probabilities 
is low, B may be said to be psychologically 
irrelevant to the intention (e.g., a person 
would vote for Gerald Ford regardless of 
whether or not he favored reduced government 
spending). In general, a change in a belief will
produce a change in a behavioral intention to
the extent that the difference between the
two conditional probabilities is large. 
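
As a rough illustration of Equation 2 and of psychological relevance, the following Python sketch computes a predicted intention from the three subjective probabilities. The function names and the numeric values are hypothetical and are not taken from the study; they only mirror the Gerald Ford example above.

```python
def predicted_intention(p_b, p_i_given_b, p_i_given_not_b):
    """Equation 2: P_I = P_B * P_I|B + (1 - P_B) * P_I|not-B."""
    for p in (p_b, p_i_given_b, p_i_given_not_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("each subjective probability must lie in [0, 1]")
    return p_b * p_i_given_b + (1.0 - p_b) * p_i_given_not_b


def psychological_relevance(p_i_given_b, p_i_given_not_b):
    """Difference between the two conditional probabilities."""
    return p_i_given_b - p_i_given_not_b


# Hypothetical voter who is fairly sure (.70) that Ford favors reduced spending.
# Relevant belief: the vote depends strongly on whether the belief is true.
relevant = predicted_intention(0.70, 0.90, 0.10)
# Irrelevant belief: the vote is equally likely either way.
irrelevant = predicted_intention(0.70, 0.80, 0.80)

print(round(relevant, 2), round(psychological_relevance(0.90, 0.10), 2))
print(round(irrelevant, 2), round(psychological_relevance(0.80, 0.80), 2))
```

In the second case the predicted intention equals the common conditional probability whatever value the belief strength takes, which is what is meant by a psychologically irrelevant belief.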

Jaccard and King (1977) presented data in
the context of two experiments that supported
the model in Equation 2. Wyer (1970, 1974,
1976) presented data that are consistent with
Equation 1. All of these tests, however, dealt
with a restricted range of subjects (viz.,
college students), and the utility of these
models has yet to be demonstrated with a
more heterogeneous population in a field 
setting. The purpose of the present investiga- 
tion is three-fold. (a) To extend previous 
research by Wyer in an attempt to specify 
the relationship between beliefs and behavior 
instead of just the relation of beliefs to beliefs. 
This involves the incorporation of the concept 
of behavioral intention into Wyer’s model. 
(b) To test the generality of Wyer’s model in 
the context of an application of the intention 
model to a noncollege population. Specifically,
the predictability of the model is evaluated as
a function of the education of respondents.
Several theorists (e.g., Bem, 1970) have
suggested that models of belief consistency 


1 This assumes that a change in $P_B$ does not affect
$P_{I|B}$ or $P_{I|\bar{B}}$. For a discussion of this issue and the
model’s implication for understanding changes in
behavioral intentions, see Jaccard and King (1977).

have greater validity for bright college
students and intellectuals than for the majority 
of the population. If such suggestions are 
correct, then the model should predict better 
for high versus low education respondents. 
(c) To investigate psychological factors (e.g., 
wishful thinking) that may affect the predict- 
ability of the intention model. 


Method 


Respondents 


Respondents were 119 males and females living in a 
Midwestern community of moderate size. Sampling 
procedures were designed to obtain a heterogeneous 
sample in terms of education with an approximately 
equal number of Democrats, Republicans, and Indepen- 
dents. Respondents were contacted by phone and 
administered a brief phone survey. On completion, 
they were asked if they would be willing to complete 
a longer survey in their home. The goal of the sampling 
procedures was not representativeness, but rather 
heterogeneity, so as to achieve a more stringent test 
of the generality of the model. The final sample con- 
sisted of 54 males and 65 females who ranged in age 
from 23 to 56 with a mean age of 34.6. Education level 
varied from less than a high school education to a 
professional degree, with 34 respondents having not 
completed high school and 85 respondents with a high 
school degree or more. A total of 36 Democrats, 48 
Republicans, and 34 Independents were interviewed 
(one individual did not complete the party identification 
question). Refusals to participate in the final survey 
were relatively low, being less than 10% of those 
initially contacted. 


Materials 


The home interviews took approximately 1 hour to 
complete. During the first stage of the interview, the 
interviewer presented an oral introduction to the study 
and a standardized set of instructions concerning the 
questionnaire. The respondent then filled out a practice 
section. This section contained examples of each type 
of scale and served to anchor the endpoints and 
eliminate any “warm-up” effects from the data; in 
addition, the interviewer could detect any misunder- 
standing of scales prior to actual data collection. The 
entire interview schedule was self-administered. This
probably minimized any effect of the interviewer on
the subject’s responses. The interviewer did, however,
remain in the house while the respondent was complet-
ing the questionnaire to answer any questions the re-
spondent might have.

On the basis of reviews of national magazines, local 
newspapers, and campaign materials distributed by the 
Democratic and Republican parties, a set of 17 issues 
was compiled (e.g., abortion and reduced government 
spending). A list of these issues is presented in the 
results section of this article. Data were collected to
test Equation 2 for the presidential election. Specifically,
for each candidate, respondents provided estimates of
$P_B$, $P_{I|B}$, and $P_{I|\bar{B}}$ for each of the 17 issues. Examples
of the statements used to measure these parameters
are as follows. $P_B$: Jimmy Carter favors reduced
government spending. $P_{I|B}$: Suppose Jimmy Carter did,
in fact, favor reduced government spending. How likely
is it you would vote for Jimmy Carter? $P_{I|\bar{B}}$: Suppose
Jimmy Carter did not favor reduced government spend-
ing. How likely is it you would vote for Jimmy Carter?
Each statement was rated on a 21-point unlikely-likely 
scale ranging from 0 to 100 with a midpoint of 50. 
Scores were converted to probability units by multiply- 
ing each score by .01. These three measures were 
obtained for each candidate for each of the 17 issues, 
yielding 102 questions. Behavioral intention ($P_I$) was
ascertained by asking each respondent the probability 
that he or she would vote for Jimmy Carter and 
Gerald Ford on a 21-point unlikely-likely scale. A 
behavioral intention measure was independently 
obtained for both candidates. 
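
The scoring described above can be sketched in a few lines of Python. The sketch assumes that the 21 scale points are equally spaced in steps of 5 between 0 and 100, which the text implies but does not state; the names used here are hypothetical.

```python
# Assumed layout of the 21-point unlikely-likely scale: 0, 5, 10, ..., 100.
VALID_SCORES = range(0, 101, 5)


def rating_to_probability(score):
    """Convert a scale score to probability units by multiplying by .01."""
    if score not in VALID_SCORES:
        raise ValueError("score must be one of the 21 scale points")
    return score * 0.01


# Three measures (P_B, P_I|B, P_I|not-B) for each of 2 candidates and 17 issues.
N_MEASURES, N_CANDIDATES, N_ISSUES = 3, 2, 17
n_questions = N_MEASURES * N_CANDIDATES * N_ISSUES

print(len(VALID_SCORES), rating_to_probability(50), n_questions)
```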

In addition to the above measure, evaluations of 
each of the 17 issues were obtained on a 7-point bipolar 
semantic differential scale, with good and bad as the 
endpoints. A number of demographic variables were 
also ascertained, including party identification (using 
the traditional measure of Berelson, Lazarsfeld, & 
McPhee, 1954), education, religion, self-reported 
liberal-conservativeness, age, and marital status. In 
addition, a number of general beliefs about the Dem- 
ocratic and Republican parties were measured. The 
order of all items was (a) evaluations of each of the 
17 issues, (b) beliefs about the Democratic Party, 
(c) beliefs about the Republican Party, (d) belief 
strength ($P_{B_i}$) that Jimmy Carter favors each of 17
issues, (e) belief strength that Gerald Ford favors each
of 17 issues, (f) demographics, (g) behavioral intentions
($P_I$), (h) additional attitudinal and background
variables, and (i) the conditional probabilities, $P_{I|B_i}$
and $P_{I|\bar{B}_i}$. Note that measures pertaining to each of the
parameters of Equation 2 were separated so as to 
reduce the possibility of respondents perceiving 
consistencies among them. 

Questionnaires were administered approximately 1 
week prior to the election. One day after the election, 
respondents were recontacted and asked if they had 
voted and, if so, for whom. These self-reports served as 
estimates of voting behavior.²


Results 


Reliability of Measures 


To ascertain test-retest reliabilities of the 
probability scales used in the present investiga- 
tion, a number of items were repeated at the 
end of the questionnaire. The average test- 


2 Research in political science has tended to indicate 
that such self-reports of voting behavior are veridical
(cf. Campbell, Converse, Miller, & Stokes, 1960;
Fishbein & Coombs, 1974).

Table 1
Correlations and Mean Absolute Discrepancies for Predicting Voting Behavior and Voting Intention
from the Subjective Probability Model: Total Sample

                             Carter                                                               Ford
Issue                        VB with $\hat{P}_I$   $P_I$ with $\hat{P}_I$   $|P_I - \hat{P}_I|$   VB with $\hat{P}_I$   $P_I$ with $\hat{P}_I$   $|P_I - \hat{P}_I|$
Amendment to ban abortion    [?]                   .85                      .12                   .68                   [?]                      .16
Reduce defense budget        .80                   .87                      .13                   .78                   .86                      .13
Busing                       .70                   .79                      .16                   .72                   .83                      [?]
Amnesty                      .76                   .86                      .12                   .76                   .87                      [?]
Registration of guns         [?]                   .86                      .13                   [?]                   .77                      .17
National health system       .76                   .85                      .14                   .77                   .80                      .16
Inflation by unemployment    [?]                   .87                      .14                   .74                   .83                      .16
Labor unions                 .75                   .84                      .14                   .72                   .81                      .16
Nuclear energy               .73                   .81                      .16                   .75                   .84                      .15
Guaranteed annual income     .74                   .83                      .15                   [?]                   .83                      .15
Local vs. federal control    .75                   .85                      .14                   .75                   .79                      [?]
Cooperation with Russia      [?]                   .84                      .14                   [?]                   [?]                      [?]
Closing tax loopholes        [?]                   .86                      .14                   [?]                   .86                      [?]
Honesty                      .70                   .81                      .16                   .78                   .84                      [?]
Experience                   .78                   .88                      .12                   .72                   .87                      [?]
Clarity on positions         .79                   .88                      .12                   [?]                   .85                      [?]
Reduced government spending  .79                   .87                      .12                   .78                   .86                      [?]
M                            .758                  .850                     -                     .744                  .831                     -


retest correlation of these items was .93, 
indicating that the scales were relatively 
reliable. In addition, a number of checks were 
made to ensure that subjects’ responses were 
not artificially correlated by means of method 
variance (Campbell & Fiske, 1959). For 
example, it was assumed a priori that belief 
strength in two statements, (a) Jimmy Carter 
is in favor of a constitutional amendment to 
ban abortion and (b) Jimmy Carter favors 
reducing the defense budget, should be 
theoretically uncorrelated (i.e., knowing an 
individual’s score on the first belief should not 
be predictive of his score on the second 
belief). Any correlation between these two 
beliefs would, therefore, reflect common 
method variance. The correlation between 
these beliefs was .04 (ns). Similar comparisons 
on a total of 10 belief statements (5 compar- 
isons per candidate), assumed to be uncor- 
related a priori, yielded an average absolute 
correlation of .08.³ Other data, which indicate
that responses are not artificial, are presented 
in a later part of the Results section. 


Prediction of Behavior from Behavioral Intention 


To test the relation between behavioral
intention ($P_I$) and voting behavior, a point-
biserial correlation was computed between
$P_I$ and behavioral score (1 = voted for the
candidate, 0 = did not vote for the candidate)
for each candidate separately. The correlation
between stated intention to vote for Carter
and observed voting behavior was .87 (p < .01).
The corresponding correlation for Ford was
.85 (p < .01). These correlations did not
differ across a number of individual difference
variables (e.g., education and religion) and
indicate a relatively strong relation between
intentions and behavior.
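
For readers unfamiliar with the statistic, the sketch below computes a point-biserial correlation in two equivalent ways: from the group means, and as an ordinary Pearson correlation with the 0/1 behavior score. The six data points are invented for illustration and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Ordinary product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def point_biserial(scores, voted):
    """(M1 - M0) / s * sqrt(p * q), with s the population SD of all scores."""
    n = len(scores)
    ones = [s for s, v in zip(scores, voted) if v == 1]
    zeros = [s for s, v in zip(scores, voted) if v == 0]
    p = len(ones) / n
    mean_all = sum(scores) / n
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd * math.sqrt(p * (1.0 - p))


# Hypothetical intentions (probability units) and voting behavior (1 = voted for).
intentions = [0.95, 0.80, 0.65, 0.30, 0.10, 0.05]
voted = [1, 1, 1, 0, 0, 0]

r_pb = point_biserial(intentions, voted)
print(round(r_pb, 3), round(pearson_r(intentions, voted), 3))
```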


Prediction of Behavioral Intention from Beliefs 


To test the model of behavioral intention
proposed by Jaccard and King (1977), a
predicted $P_I$ score was derived in accord with
Equation 2 for each subject and each of the
17 issues. Table 1 presents the correlation of
each of these predicted intentions ($\hat{P}_I$) with
behavior and with the observed intention to
vote for Carter and Ford. In terms of predicting


3 Obviously, such a test for method variance presupposes that
the underlying assumption of uncorrelated variables is
veridical. Details of the additional comparisons for
method variance can be obtained by writing the authors.

voting behavior (VB), the correlation between
$\hat{P}_I$ and VB ranged from .68 to .80, with a
mean (via r-to-Z transformation) of .75. All correla-
tions were significant (p < .01) and did not
vary appreciably across issues or candidates. 
In terms of predicting behavioral intention, 
the correlation between predicted and observed 
intention ranged from .76 to .88, with a mean 
of .84. Strictly speaking, if Equation 2 is 
valid, then no curve-fitting parameters should 
be required to predict observed intention from 
the predicted intention score. In other words, 
a regression analysis should yield a slope of 1.0 
and an intercept of 0. The appropriate slopes 
and regression weights were computed for each 
belief. In no case did the observed slope differ 
significantly from 1.0 nor did the intercept 
differ significantly from 0. Finally, in no case 
did the correlation with $P_I$ of any individual
parameter of the model (i.e., $P_B$, $P_{I|B}$, $P_{I|\bar{B}}$,
$P_{I|B} - P_{I|\bar{B}}$) exceed the correlation between
$P_I$ and $\hat{P}_I$. The correlation of $P_I$ with each
$P_B$ ranged from −.45 to .67, with an average
absolute correlation of .19; for $P_I$ with $P_{I|B}$,
the correlations ranged from .49 to .86, with
an average correlation of .79; for $P_I$ with
$P_{I|\bar{B}}$, the correlations ranged from .50 to .86,
with an average correlation of .75; for $P_I$
with $P_{I|B} - P_{I|\bar{B}}$, the correlations ranged from
−.24 to .45, with an average correlation of .16;
for $P_I$ with $P_B P_{I|B}$, the correlations ranged
from .29 to .81, with an average correlation
of .64; for $P_I$ with $(1 - P_B)P_{I|\bar{B}}$, the correla-
tions ranged from .05 to .70, with an average
correlation of .45.
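
The slope and intercept criterion described above can be illustrated with a least-squares fit of observed on predicted intention. This is only a sketch with invented values; it returns the point estimates and omits the significance tests against 1.0 and 0 that the analysis reports.

```python
def ols_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical predicted and observed intentions for five respondents.
predicted = [0.10, 0.30, 0.50, 0.70, 0.90]
observed = [0.15, 0.25, 0.50, 0.75, 0.85]

slope, intercept = ols_line(predicted, observed)
print(round(slope, 3), round(intercept, 3))
```

If Equation 2 held exactly and without measurement error, the fitted line would have a slope of 1.0 and an intercept of 0, so no curve-fitting parameters would be needed.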

In terms of prediction, the full model
presented in Equation 2 was a consistently
better predictor of $P_I$ than any component
of the equation considered in isolation. This
is reflected not only in the correlational 
analyses above but in other indices of goodness 
of fit. The true superiority of the full model is 
probably underestimated due to the fact 
that the increased number of parameters and 
combinatorial rules result in greater degrees 
of measurement error than the correlations 
based on single components. 

In addition to the correlational analyses 
above, the absolute difference between $P_I$ and
$\hat{P}_I$ was computed for each subject on each
belief. The mean discrepancy score for each
belief is presented in Table 1. Discrepancy
scores ranged from .12 to .18, indicating
little variability in predictability across beliefs. 
However, these discrepancy scores did differ 
significantly from perfect predictability (i.e., a 
hypothetical discrepancy mean of zero). This 
is not surprising, considering measurement 
and experimental error in the parameters of the 
model. 
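
A minimal sketch of the discrepancy index, using invented scores for five respondents on a single belief (the function name is hypothetical):

```python
def mean_absolute_discrepancy(observed, predicted):
    """Mean of |P_I - predicted P_I| across respondents for one belief."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("need two equally long, non-empty lists")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical observed and predicted intentions in probability units.
observed = [0.15, 0.25, 0.50, 0.75, 0.85]
predicted = [0.10, 0.30, 0.50, 0.70, 0.90]

print(round(mean_absolute_discrepancy(observed, predicted), 2))
```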

To test the generality of the model across 
individuals, the sample was divided into two 
groups: (a) high education (high school 
graduate or more) and (b) low education (did 
not complete high school). For the high 
education group, the correlation between $\hat{P}_I$
and VB ranged from .63 to .80, with a mean of
.74. Similarly, the range of correlations
between $P_I$ and $\hat{P}_I$ was .77 to .91, with a mean
of .85. In no case did the slope differ signif-
icantly from 1.0, and in only one case did the
intercept differ significantly from 0. For the
low education group, the results were compar-
able. The correlation between $\hat{P}_I$ and VB
ranged from .65 to .84, with a mean of .75.
The correlation between $P_I$ and $\hat{P}_I$ ranged
from .60 to .88, with a mean of .80. In no
case did the slope differ significantly from 1.0 
nor the intercept from 0. Although the differ- 
ence in correlations between the two groups 
is not large, there is a consistent tendency 
for the model to predict better for high versus 
low education subjects. Out of 34 correlations 
(17 for each candidate), the correlation 
coefficient between $P_I$ and $\hat{P}_I$ was higher
(albeit, nonsignificantly so) for the high 
education group on 27 of them. This was 
significant (p < .05) via a sign test. More 
important, the standard error of estimate was 
lower for the high education group on all 34 
relations. Thus, there was a tendency for the 
model to predict consistently better for high 
versus low education respondents. 
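
The sign test on the 27 of 34 comparisons can be checked with an exact binomial computation. The sketch below uses a two-sided test against a chance rate of .5; whether the original analysis was one- or two-sided is not stated, so that choice is an assumption.

```python
from math import comb


def sign_test_p(successes, n):
    """Exact two-sided sign test p value under a chance rate of .5."""
    k = max(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# 27 of the 34 correlations were higher in the high education group.
p_value = sign_test_p(27, 34)
print(p_value < 0.05)
```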

Table 2
Mean Scores of Selected Parameters of the Subjective Probability Model

                             Carter                                                     Ford
Issue                        $\hat{P}_I$  $P_B$  $P_{I|B}$  $P_{I|\bar{B}}$  Relevance  $\hat{P}_I$  $P_B$  $P_{I|B}$  $P_{I|\bar{B}}$  Relevance
Amendment to ban abortion    .44          [?]    .39        .44              −.05       [?]          [?]    .56        [?]              [?]
Reduce defense budget        .45          .77    .43        .44              −.01       .60          .33    .60        .59              .01
Busing                       .44          .56    .39        .48              −.09       .58          .48    .53        .62              −.09
Amnesty                      .44          .76    .44        [?]              .02        .57          .33    .56        .58              −.01
Registration of guns         .46          .66    .45        .43              .02        .58          .47    .58        .57              .01
National health system       [?]          [?]    [?]        .41              .06        [?]          .37    [?]        [?]              [?]
Inflation by unemployment    [?]          [?]    [?]        .49              −.18       [?]          [?]    .48        [?]              [?]
Labor unions                 [?]          [?]    .39        .45              −.06       [?]          .45    .53        .51              [?]
Nuclear energy               .46          .55    [?]        [?]              .06        [?]          [?]    [?]        [?]              .10
Guaranteed annual income     .41          .65    [?]        [?]              −.05       .59          .34    .49        .62              [?]
Local vs. federal control    .45          [?]    .47        .39              .08        .60          .59    .63        [?]              .09
Cooperation with Russia      .44          .57    .46        .41              .05        .60          .76    .61        .53              .08
Closing tax loopholes        [?]          [?]    [?]        .34              .18        .55          .48    .69        .47              .22
Honesty                      .42          [?]    .61        .20              .41        .61          [?]    [?]        .30              [?]
Experience                   [?]          [?]    [?]        [?]              .19        .64          .81    .68        .46              .22
Clarity on positions         [?]          [?]    .56        .32              .24        .59          .72    .65        .41              .24
Reduced government spending  [?]          [?]    [?]        .35              [?]        .61          .75    .67        .46              .21

Note. Observed $P_I$ Carter = .41; observed $P_I$ Ford = .61; N = 119.


In addition to the above correlation analyses,
an analysis of mean $|P_I - \hat{P}_I|$ scores was
performed. Specifically, a Hotelling $T^2$ test
was performed comparing high versus low
education respondents on the vector of mean
discrepancy scores for each belief. For the 17
beliefs concerning Carter, a $T^2$ of 13.95,
F(17, 101) = .71, ns, was observed, whereas
for the 17 beliefs concerning Ford, a $T^2$ of
27.54, F(17, 101) = 1.40, ns, was observed.
Thus, the analysis of mean absolute dis-
crepancy scores yielded no differences in
predictability as a function of education. In
general, mean discrepancy scores for each
separate educational group were similar to
those presented in Table 1 for the total sample.
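
The F ratios reported for the two Hotelling tests are consistent with the usual conversion of a two-sample $T^2$ to F, namely F = $T^2$(n1 + n2 − p − 1) / [p(n1 + n2 − 2)] with p and n1 + n2 − p − 1 degrees of freedom. It is an assumption that this is the conversion the analysis used; the group sizes (85 and 34) come from the Respondents section, and the function name is hypothetical.

```python
def hotelling_t2_to_f(t2, n1, n2, p):
    """Convert a two-sample Hotelling T^2 to F with (p, n1 + n2 - p - 1) df."""
    df1 = p
    df2 = n1 + n2 - p - 1
    f = t2 * df2 / (p * (n1 + n2 - 2))
    return f, df1, df2


# 85 high education and 34 low education respondents, 17 beliefs per candidate.
f_carter, df1, df2 = hotelling_t2_to_f(13.95, 85, 34, 17)
f_ford, _, _ = hotelling_t2_to_f(27.54, 85, 34, 17)

print(df1, df2, round(f_carter, 2), round(f_ford, 2))
```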
Table 2 presents mean scores for the 
components of the behavioral intention model 
as well as for $\hat{P}_I$. The observed mean $P_I$ for
Carter was .41, whereas for Ford it was .61.
It can be seen that the model also tended to
accurately predict mean behavioral intention
scores. The range of differences in mean $\hat{P}_I$
versus mean $P_I$ was −.09 to .07, with a mean
of —.01. In no case did the observed mean 
differ significantly from the predicted means. 
From an applied perspective, Table 2 has 
some interesting implications. First, one can 
examine the mean belief strength scores ($P_B$)
for each candidate. For each of the 17 issues, 
the candidates took specific positions during 
the course of their campaigns. It is interesting 
to note that in every case the direction of the 
difference in means is consistent with the 
stands taken by the candidates. For example, 
relative to Ford, Carter was more in favor of 
busing, amnesty, registration of guns, a 
national health system, labor unions, a 
guaranteed annual income, reducing the 


defense budget, and closing tax loopholes for 
big business. These differences were reflected 
in the mean $P_B$ scores. This suggests, contrary
to many perspectives (e.g., Campbell, Con-
verse, Miller, & Stokes, 1960; Sears, 1969),
that voters may have considerable information
about candidates and that they may ac-
curately perceive differences between them.

Examination of mean psychological rel-
evance scores (i.e., $P_{I|B} - P_{I|\bar{B}}$) indicates the
following beliefs as being the most important
for influencing behavioral intentions: (a)
reducing inflation by policies encouraging
unemployment, (b) closing tax loopholes for
big business, (c) being trustworthy, (d) being
experienced, (e) clearly stating one’s positions,
and (f) reduction of government spending.
However, it should be noted that in general
the difference between the conditional prob-
abilities for each of these beliefs was relatively
small (approximately .20). This probably
reflects the fact that change in any one belief
about a candidate would not have a large
impact on voting intention due to the impor-
tance of several other beliefs in the determina-
tion of voting intentions. However, changing
several beliefs may have a dramatic impact
on voting intention and behavior. For a
discussion of this issue, see Jaccard and
King (1977).


Additional Factors Affecting Model 
Predictability 


In addition to individual differences in 
model predictability, a set of analyses was 
undertaken to determine whether the type of 
belief under consideration influenced the 
ability of the subjective probability model to 
predict intentions. For each belief statement, 
a predictability index was defined as $|P_I - \hat{P}_I|$
for each individual. McGuire (1960a, 1960b,
1960c) has suggested that such variables as
(a) the favorability of the beliefs in question
(i.e., wishful thinking) and (b) how “affectively
charged” beliefs are may affect the logical
relations between beliefs. Similarly, these 
variables may affect the predictability of the 
intention model. 

To test the extent to which model predict- 
ability is related to how “desirable” the 
belief in question was, the predictability 
index was correlated with the evaluative 
ratings of each issue (on a 7-point semantic 
differential good-bad scale scored +3 to —3). 
In general, the correlations were low and 
nonsignificant. The range was —.19 to .11 with 
an average correlation of —.02. 

In addition, the correlation between the
predictability index and the absolute value of
the evaluative ratings was also computed for
each belief statement. This analysis would 
indicate the extent to which model predict- 
ability is influenced by how “affectively 
charged” the issue is, irrespective of the 
direction of that affect (positive or negative). 
Again, the correlations were typically low, 
with a range of —.20 to .15 and an average 
correlation of —.03. 

These correlations must be interpreted 
cautiously, since they involve the correlation 
of one variable with a difference score. It is a 
well-known fact that difference scores can be 
highly unreliable and would tend to mask a 
significant true correlation via measurement
error. However, “test” measures used in the
study generally indicated high reliability,
suggesting this may not be a problem. Never-
theless, no direct data are available to estimate
the reliability of the difference score.


Discussion


The present data support the model of 
behavior and behavioral intention based on 
Equation 2. The average correlation between 
voting behavior and voting intention was .86. 
The strength of this relationship did not vary 
as a function of numerous individual difference 
variables. In addition, the average correlation 
between predicted versus obtained voting 
intention was .84. In general, regression 
analyses indicated that ad hoc curve fitting 
procedures were unnecessary for prediction 
of voting intention (i.e., intercepts did not 
differ significantly from 0.0 and slopes did 
not differ significantly from 1.0). Furthermore, 
the strength of the relationship between 
predicted and observed voting intention did 
not vary appreciably as a function of the 
respondent’s education. However, there was a 
tendency for the model to predict consistently 
better for high versus low education respon- 
dents. Predictability of the model was found 
to be unaffected by the desirability of the 
belief on which it was based or how “affectively
charged” the belief was viewed. These data
indicate that Wyer’s (1970) probability model
can be applied to the understanding and
prediction of behavior as well as beliefs.

The present model of behavioral intention 
has a number of interesting implications for 
theoretical and applied research. First, it 
represents an interesting alternative to models 
of behavioral intention presented by Fishbein 
(1972), Triandis (1974), Sheth (1973), and 
others. As noted earlier, each of these ap- 
proaches is based on the general linear model 
in which a set of predictor variables (e.g., 
beliefs) are weighted and summed to yield 
an overall prediction of behavioral intent. 
Each of these models relies on a regression 
equation for its basic format. In contrast, the 
present model does not rely on such an 
approach and circumvents a number of 
problems raised in the context of correlational 
approaches (see Jaccard & King, 1977, for 
an elaboration). 

Second, the present model has a number of 
interesting implications for changing behav-
ioral intentions. If $P_{I|B}$ and $P_{I|\bar{B}}$ are held
constant, then the following relation should
hold:

$$\Delta P_I = \Delta P_B\,(P_{I|B} - P_{I|\bar{B}}), \tag{3}$$

in which $\Delta P_I$ = the change in the subjective
probability characterizing the behavioral inten-
tion, $\Delta P_B$ = the change in the subjective
probability characterizing the belief, and all
other terms are as previously defined. If a
persuasive message effectively changes $P_B$,
its corresponding influence on $P_I$ will be
proportional to the difference $P_{I|B} - P_{I|\bar{B}}$.
More important, Equation 3 suggests that the
logical inconsistency created by the change in
$P_B$ can be resolved in three ways: (a) by
changing the intention to perform the behavior
($P_I$), (b) by changing the psychological
relevance of the belief for the intention (i.e.,
by changing the difference $P_{I|B} - P_{I|\bar{B}}$), or
(c) by a combination of the above. Which 
mode of resolution individuals use and the 
factors that influence this choice remain to 
be investigated. 
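
The relation between a change in belief and a change in intention follows directly from Equation 2 when the two conditional probabilities are held constant. The short derivation below spells out that step; the primed symbol $P_I'$ (the intention after the belief has changed) and the numerical values in the last line are introduced here for illustration only.

```latex
\begin{align*}
P_I' &= (P_B + \Delta P_B)\,P_{I|B} + (1 - P_B - \Delta P_B)\,P_{I|\bar{B}} \\
\Delta P_I = P_I' - P_I
     &= \Delta P_B\,P_{I|B} - \Delta P_B\,P_{I|\bar{B}} \\
     &= \Delta P_B\,(P_{I|B} - P_{I|\bar{B}}) \\[4pt]
\text{e.g., } \Delta P_B = .20,\; P_{I|B} - P_{I|\bar{B}} = .20
     &\;\Rightarrow\; \Delta P_I = (.20)(.20) = .04
\end{align*}
```

With a relevance of about .20, which is roughly the size observed for the most relevant beliefs in Table 2, even a sizable shift in a single belief would move the predicted intention only slightly.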

Third, the model possesses obvious relevance 
to researchers in applied areas. On the basis 
of literature reviews or pretests, a set of beliefs 
may be generated for which measures in 
Equation 2 could be obtained. To understand 
variations in intentions, it would be possible 
to compare those who intend to perform the 
behavior with those who do not intend to on 
the strength of belief in each belief statement 
($P_{B_i}$). A second comparison could be made
between the two groups for differences in the
psychological relevance of the belief for the
intention (i.e., $P_{I|B_i} - P_{I|\bar{B}_i}$). It is entirely
possible that a given belief ($P_{B_i}$) will not
covary with intention (i.e., that intenders and
nonintenders will not differ on $P_{B_i}$), but
that the belief will still be psychologically
relevant to the intention (i.e., $P_{I|B_i} - P_{I|\bar{B}_i}$
will be nonzero), and furthermore, that the 
magnitude of this relevance may differ as a 
function of whether or not the individuals 
intend or do not intend to perform the behav- 
ior. The present model allows the researcher 
to examine these factors for any number of 
groups defined on the basis of any “external” 
variable (such as sex or religion). 

The present model of behavioral intention
is based closely on Wyer’s (1970) model of 
cognitive organization. It is not clear, however, 
whether research generated by Wyer and 
others in this area (e.g., McGuire, 1960a, 
1960b, 1960c) is directly applicable to the 
present model. For example, it is not clear
whether such phenomena as the Socratic effect
and cognitive inertia would be observed when 
the criterion in question is a behavioral 
intention rather than a belief. The present 
data suggest that wishful thinking, for example, 
may not affect the behavioral intention model, 
although it has been shown to influence the 
cognitive organization model. Future research 
should help to identify the influence of these 
phenomena on beliefs versus behavioral inten- 
tions. 

The present data do not allow a meaningful 
comparative test of the model of intention 
based on Equation 2 with other more prom-
inent models (e.g., Fishbein, 1972; Triandis,
1974). Due to differing numbers of parameters, 
differing combinatorial rules, and differing 
reliability of measures, a prediction strategy 
such as the present one is simply not conducive 
to such comparative tests. However, the 
results of the present investigation are en-
couraging and suggest that comparative tests
are worth pursuing. These would probably take
the form of manipulative or change studies
whereby differential predictions of the compet-
ing models are derived (e.g., one model
predicts that a change in a belief should lead
to a change in an intention, whereas another
model does not).

Some additional advantages of the present
model should be noted relative to other
strategies for understanding the relationship
between beliefs and behavior. One common
approach used in relating beliefs to behavior
is to obtain individuals’ agreement or disagree-
ment with a set of belief statements and to
correlate these ratings with the behavior
(in the present study, this would be analogous
to correlating each $P_B$ with the behavioral
measure). These beliefs may be entered into
a multiple regression equation and the resulting
multiple R as well as standardized regression
coefficients of each belief interpreted. Alterna-
tively, the belief ratings may be factor analyzed
and the resulting factor scores entered into a
regression equation. These correlational ap-
proaches suffer from two distinct shortcomings,
however: (a) a belief may be highly correlated
with a behavior yet still not have any causal
implications for the behavior (i.e., the correla-
tion may be spurious) and (b) a belief may be
uncorrelated with a behavior yet still have
causal implications for the behavior. Any form
of correlational analysis is ambiguous with 
respect to causality. In contrast, these prob- 
lems are circumvented in the context of the 
conditional probabilities of the present model. 
If a belief is spuriously correlated with an 
intention (and hence, behavior), this should 
be reflected in a small difference between the 
conditionals $P_{I|B} - P_{I|\bar{B}}$. Similarly, if a belief
is uncorrelated with an intention, yet has 
causal implications for it, this should be 
reflected in a large difference between the 
conditionals. Thus, the present model, if 
valid, circumvents some of the problems 
associated with these traditional correlational 
strategies. 

A major difference between the present 
model and alternative approaches is the fact 
that most of these approaches explicitly 
consider multiple beliefs as determinants of 
intentions. Multiple belief measures are ob- 
tained and then combined in some fashion 
and related to the corresponding intention. 
This view is not incompatible with the present 
approach in that multiple beliefs are also 
considered. However, Equation 2 is applicable 
to only one belief at a time (i.e., measures of 
$P_B$, $P_{I|B}$, and $P_{I|\bar{B}}$ are obtained for each
belief separately and analyzed accordingly). 
It would be possible to modify Equation 2 in 
accord with probability theory to incorporate 
multiple beliefs simultaneously. However, the 
equation would become cumbersome and 
would probably be of little practical utility 
when the number of beliefs is three or greater. 
The approach outlined in the present article 
is probably a more appropriate strategy. This 
approach should provide insights into the 
extent to which changes in any one belief 
will produce changes in behavior. It is possible 
that changes in one belief will not produce
changes in behavior but that changes in
multiple beliefs would. This state of affairs
would be suggested by a set of beliefs whose
psychological relevance ($P_{I|B} - P_{I|\bar{B}}$) was
uniformly small. In this situation, a model
of multiple beliefs should be applied as a 
supplement to the present one. None of the 
current theories consider this problem, and 
such a model is now being developed and 
tested. 
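
To indicate why a multiple-belief version of Equation 2 quickly becomes cumbersome, the sketch below extends the equation by the law of total probability over every combination of true and false beliefs. It is not the model under development mentioned above. It assumes, purely for illustration, that the beliefs are independent, so that the probability of each combination is a product; the function name and all numbers are hypothetical.

```python
from itertools import product


def predicted_intention_multi(p_beliefs, conditionals):
    """Sum over truth patterns of P(pattern) * P(I | pattern).

    p_beliefs: belief strengths P_B1, ..., P_Bk.
    conditionals: maps each tuple of True/False values to P(I | that pattern).
    Independence of the beliefs is an illustrative assumption.
    """
    total = 0.0
    for pattern in product((True, False), repeat=len(p_beliefs)):
        joint = 1.0
        for p, holds in zip(p_beliefs, pattern):
            joint *= p if holds else 1.0 - p
        total += joint * conditionals[pattern]
    return total


# With one belief the expression reduces to Equation 2.
one_belief = predicted_intention_multi([0.70], {(True,): 0.90, (False,): 0.10})

# With two beliefs, four conditional probabilities must already be measured.
two_beliefs = predicted_intention_multi(
    [0.70, 0.40],
    {(True, True): 0.90, (True, False): 0.60,
     (False, True): 0.50, (False, False): 0.10},
)

print(round(one_belief, 3), round(two_beliefs, 3))
```

The number of conditional probabilities a respondent would have to judge doubles with each added belief (2, 4, 8, ...), which is one way to see why the equation is of little practical utility with three or more beliefs.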


A Cybernetic Model of Self-Attention Processes
Charles S. Carver 


University of Miami 


An analysis of the experiential and behavioral consequences of self-directed 

attention is proposed, based on cybernetic or information-processing ideas. The 

the following assumptions: When attention is directed
to environmental stimuli, those stimuli are analyzed and categorized according 

to the person’s preexisting recognitory schemas. Self-directed 
leads to a similar analysis of self-information;
es an enhanced awareness of one’s salient self-aspects. In 
one’s context or of some self-element— 
hich constitutes a behavioral standard. If a prior 
h a behavioral standard, subsequent self-attention 


attention often 
experientially, such a state of 


ehavior is altered to conform more 
standard is construed as the occur- 


f a test-operate-test-exit unit or a nega- 
truct applicable to many different phe- 


ology. More specifically, self-focus 


ncy causes a return to the 


's theory of fear-based behavior, 


usually identified with the terms information-
processing, cybernetic, or control theory (e.g.,
Apter, 1970; Buckley, 1968; Powers, 1973;
Singh, 1966; Wiener, 1948). In the minds of
many theorists and researchers, information- 
processing ideas provide an ideal framework 
within which to confront a wide variety of 
problems. Indeed, it is a framework that, by 
virtue of its abstractness, many believe may 
provide the best basis yet attainable for a 
comprehensive model of behavior. This article
represents an attempt to apply such ideas to 
an area of interest to social psychologists that 
has not been treated before in these terms. 
That area is the experiential and behavioral 
consequences of self-directed attention. 

The article is divided into four major sec-
tions. The first section is intended to provide
some background to the theory that is pre- 
sented later. This background material in- 
cludes a brief summary of Duval and Wick- 
lund’s (1972) theory of objective self-aware- 
ness and a basic introduction to the uses of 
the terms cybernetic, control, and information 
processing. In the second section a theoretical 
model is presented, along with some evidence 
in support of its propositions. This section
represents the central portion of the article. 
Because of the unique relationship between 
this theory and Duval and Wicklund’s (1972) 
self-awareness theory, a third section is in- 
cluded, which addresses specific points of con- 
flict between the two models. The present 
theory also has some broader implications, 
however. These implications are considered 
in the final major section. That section in- 
cludes comparisons between the present model 
and each of the following: Bandura’s recent 
analysis of fear-based behavior (Bandura, 
1977), learned helplessness theory (e.g., 
Abramson, Seligman, & Teasdale, 1978), and 
social comparison theory (Festinger, 1950,
1954).


Background 
Self-Awareness Theory 


In 1972 Duval and Wicklund proposed a 
theory of self-awareness that consisted of two 
central assumptions. First, they suggested that
the objects of conscious attention could be
dichotomized: Attention can be directed out-
ward to the environment, or attention can be
directed inward to the self (self-awareness).¹
Self-awareness is increased by stimuli that
remind a person of himself or herself (e.g., a
camera or a mirror); self-awareness is de- 
creased by stimuli that distract attention from 
the self (e.g., perceptual-motor tasks, such as 
a pursuit rotor). 

Duval and Wicklund’s (1972) second as- 
sumption was that self-focus leads to a self- 
critical evaluation process such that the self- 
attentive person compares himself or herself 
with some standard on whatever behavioral 
dimension happens to be salient (see also 
Wicklund, 1975a).² Those authors further
argued that self-directed attention is there- 
fore usually aversive, because in most cases 
the person’s actual behavior or state is worse 


1 Duval and Wicklund (1972) originally termed 
self-focus “objective” self-awareness and environ- 
ment focus “subjective” self-awareness. In recent 
years this rather cumbersome convention has been 
widely abandoned in favor of such alternatives as
high versus low self-awareness or self-focus versus 
environment focus. 

2 It will be noted that many potential standards
of comparison exist at any given moment, even
with respect to a single behavioral dimension. How-
ever, in Duval and Wicklund’s theory it was assumed
that at any given time the following two conditions
obtain: One standard is more salient than others,
and the person adopts that standard as his or her
comparison value, at least temporarily. The same
assumption will be made throughout the present
discussion. Cases in which the individual’s attention
is drawn to a potential standard that is not adopted
as a comparison value (because it is seen as irrel-
evant to one’s behavior or for some other reason)
are explicitly excluded from the analysis. THe 
will be taken up in somewhat greater detail wi 

Perhaps the distinction between standards and
dimensions should also be clarified at this point. The
distinction is similar to that commonly made be-
tween the terms value and variable. A behavioral
dimension (or a variable) such as aggressiveness or
politeness is defined by some quality that may be
present in one’s behavior to a greater or lesser
degree. A standard is a point on that behavioral
dimension, for example, nonaggressiveness or mod-
erate aggressiveness. The existence of a behavioral
standard, of course, always presupposes the existence
of a behavioral dimension.
than the standard of comparison. This aver-
siveness presumably leads the person to at-
tempt to escape the self-aware state (Wick-
lund, 1975a, 1975b). If it is not possible to 
avoid self-awareness-inducing stimuli (and 
thus the aversive state), the self-focused per- 
son may attempt to alter his or her behavior 
so that it conforms more closely to the stan-
dard, as a way of reducing the aversiveness
of self-awareness. That is, the aversiveness of 
self-attention was seen as the motivator for 
behavioral alterations. Thus self-awareness 
theory, as proposed by Duval and Wicklund 
(1972), was one of a larger class of drive 
theories—one in which the presence or ab-
sence of drive was determined by the direction
of one’s focus of attention. 

This theory has been provocative, and it 
has led to a good deal of imaginative research. 
However, evidence has begun to accumulate 
that its assumptions are incorrect. The re- 
mainder of this article will be devoted to the 
assertion that the elements of self-awareness 
theory that have been supported by research 
findings can be most usefully construed as 
components of a more elaborate model of so-
cial cognition, and further, that such a model
is most usefully framed in information-pro-
cessing or cybernetic terms.


Information Processing, Cybernetics, 
and Control 


Definitions. Some amplification on the
uses of the terms information processing, cy-
bernetic, and control is in order at the outset.
The term information processing is used in
two related but distinguishable ways by psy-
chologists. At a micro level, the term refers to
analyses of the exact bases by which informa-
tion is encoded in and retrieved from memory.
At the macro level, information processing
applies to descriptions of control systems and
is more synonymous with the term cybernetic.
Cybernetics is the science of control and com-
munications systems (Apter, 1970; Singh,
1966; Wiener, 1948, 1954). The term control,
as used here, refers to the sequencing that is
implicit in a set of instructions, each of which
awaits and depends on the execution of the
[Figure 1 diagram: a test phase leads to EXIT on congruity and to OPERATE on incongruity; OPERATE returns to the test phase.]

Figure 1. The TOTE unit, as presented and discussed 
by Miller, Galanter, and Pribram (1960). 


previous instruction (as in a computer pro- 
gram). 

A specific illustration may clarify the con- 
cept of control. Miller, Galanter, and Pribram 
(1960) introduced to psychologists a hypo- 
thetical construct called a TOTE unit (Figure 
1); the word TOTE is an acronym for test- 
operate-test-exit. This construct is known 
more widely among cyberneticists and engi- 
neers as a negative feedback loop (negative 
because it reduces discrepancies between an 
existing state and a standard). 

The diagram of a TOTE unit (Figure 1) 
portrays a self-regulatory system. The arrows 
in the diagram convey three things simul- 
taneously. They represent energy, that is, the 
activation of a physical mechanism to do some 
sort of work; they represent information 
transfer, that is, the information that con- 
gruity between two values has or has not been 
attained; and they represent control, the fact 
that one process logically depends on an out- 
come of the previous process and then leads 
to a subsequent process. Any given process is 
said to control another when the execution 
of the first triggers the execution of the sec- 
ond. In effect, any decision-making flow chart 
describes a control sequence. 

In general, the present use of the term 
information processing is meant to convey 
the latter (macro-level) meaning of the term. 
Though there are differences in implication 
between the terms information-processing, 
cybernetic, and control theory, the differences 
are relatively subtle, and the terms will be 
used interchangeably here. 

Applications. Many people tend to think 
of the word cybernetic as applying solely to 
electronic computers. However, cybernetics as 
a science was seen from its infancy as being
applicable to both living and nonliving sys- 
tems. To most cyberneticists, the living—non- 
living dichotomy is not terribly useful (cf. 
Turing, 1950; Wiener, 1948). Because of the 
fact that the TOTE unit or feedback loop has 
some particularly important implications for 
the discussion to follow, let us consider the 
living-nonliving distinction as it applies to the 
TOTE unit (Figure 1). 

In the control system of the TOTE unit,
three mechanisms are activated in an orderly 
sequence. The first of these—test—is a mech- 
anism by which a comparison is made between 
some input and a standard. If the input is not 
congruous with the standard, the second 
mechanism—operate—is activated. Operate 
alters the existing state of affairs in some way. 
Following operate, test recurs. If the input 
now is congruous with the standard, the final 
mechanism—exit—occurs, deactivating the 
system or freeing it for other applications. If, 
on the other hand, the second test reveals that 
incongruity still exists, operate is reactivated 
and continues to sequence alternately with 
test until congruity between input and stan-
dard is attained.³
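
The sequence just described can be written out as a short control loop. The following is only an illustrative sketch, not part of Miller, Galanter, and Pribram's (1960) presentation: the function name `run_tote`, the `max_cycles` guard, and the numeric "gap" example are all assumptions introduced here for concreteness.

```python
from typing import Callable, Tuple, TypeVar

State = TypeVar("State")


def run_tote(
    state: State,
    test: Callable[[State], bool],
    operate: Callable[[State], State],
    max_cycles: int = 1000,
) -> Tuple[State, int]:
    """Alternate test and operate until congruity, then exit.

    Returns the final state and the number of times operate was run.
    """
    cycles = 0
    while not test(state):              # TEST: is the input congruous with the standard?
        if cycles >= max_cycles:        # guard added for this sketch only
            raise RuntimeError("congruity was not attained")
        state = operate(state)          # OPERATE: alter the existing state of affairs
        cycles += 1
    return state, cycles                # EXIT: congruity attained


# Hypothetical example: the "input" is a numeric gap and the standard is zero.
STANDARD = 0


def is_congruous(gap: int) -> bool:
    return gap == STANDARD


def reduce_gap(gap: int) -> int:
    return max(gap - 2, STANDARD)


final_state, operate_count = run_tote(5, is_congruous, reduce_gap)
```

Note that when the first test already shows congruity, operate never runs and the loop exits immediately, which corresponds to the direct path from test to exit in Figure 1.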

The most commonly used illustration of the 
operation of a TOTE unit is the behavior of 
a nonliving system: the room thermostat. The 
thermostat senses existing temperature, com- 
pares that existing temperature with a preset 
standard, and activates a furnace when the 
temperature falls discriminably below the 
standard or activates an air conditioner when 
the temperature rises above the standard. 
However, this control structure is certainly
not limited to electrical systems. Indeed, in 
the view of cyberneticists, the feedback loop 
may occur in virtually any type of physical 
system (cf. Buckley, 1968; Kuhn, 1974). The 
same control structure that is realized elec- 
trically in the thermostat is realized biolog- 
ically in innumerable forms: for example, the 
homeostatic mechanisms that maintain normal 
body temperature, maintain appropriate levels 
of oxygen and nutrients in the blood and tis- 
sues, and regulate hormonal balances (cf. 
Wiener, 1948, especially p. 114). Moreover,
the same control structure can be realized in
psychological systems. For example, it is im-
plicit in a child’s self-correcting attempts to
reach out to touch a nearby object, and in the
adult’s checking and adjusting his or her
clothing in a mirror.
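
The thermostat illustration given above can also be sketched in code. This is a hedged example under stated assumptions: the names `thermostat_action` and `regulate`, the tolerance band (standing in for a "discriminable" departure from the standard), and the one-degree step per cycle are all hypothetical and are not taken from any of the cited sources.

```python
from typing import List


def thermostat_action(temperature: float, standard: float, tolerance: float = 0.5) -> str:
    """TEST phase: compare sensed temperature with the preset standard."""
    if temperature < standard - tolerance:
        return "furnace"            # discriminably below the standard
    if temperature > standard + tolerance:
        return "air_conditioner"    # above the standard
    return "idle"                   # congruity: nothing to do


def regulate(
    temperature: float,
    standard: float,
    step: float = 1.0,
    tolerance: float = 0.5,
    max_cycles: int = 100,
) -> List[float]:
    """Alternate test and operate; return the sequence of sensed temperatures."""
    history = [temperature]
    for _ in range(max_cycles):
        action = thermostat_action(temperature, standard, tolerance)
        if action == "idle":        # EXIT
            break
        # OPERATE: the furnace warms the room, the air conditioner cools it.
        temperature += step if action == "furnace" else -step
        history.append(temperature)
    return history
```

In this sketch a step that is large relative to the tolerance band could overshoot the standard and oscillate until `max_cycles` is reached; real regulators avoid this, and the simplification is made here only to keep the feedback structure visible.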

Until recently, cybernetic constructs have
not been widely applied to subjects of interest
to social psychologists. The theorizing pre-
sented below is intended, in part, as a step
in that direction.⁴


The Model 


The control-theory model of self-attention
processes proposed here subsumes many find-
ings generated by researchers in self-aware-
ness theory. It does not, however, incorporate
all of the same assumptions as does that
theory. The difference in assumptions be-
tween the two allows the present model to
accommodate existing data with greater in-
ternal consistency and to make predictions
that seem not to be easily derivable from self-
awareness theory as framed by Duval and
Wicklund (1972). The model consists of the
following propositions, which will be pre-
sented along with some amplification and sup-
porting evidence. (Figures 2 and 3 provide
a flow-chart illustration of the interrelation-
ship among propositions.)

1. A dichotomy may be imposed on the po-
tential objects of conscious attention, such


3 This alternation may be quite rapid, of course,
depending on the characteristics of the physical sys-
tem in which it occurs. (Indeed, some theorists, e.g.,
Powers, 1973, hold that this information flow is
better conceptualized as being continuous at all
stages of the feedback loop.) Thus, although we can
easily separate the component elements logically, for
other purposes it is more convenient to think of the
entire feedback loop as the unit of analysis.
4 Control theorists seem to have been more aware
of the potential applications of such constructs to
social behavior than have social psychologists. For
example, MacKay (1963) has noted that “an ics 
capable of receiving and acting on information # ae 
the state of its own body can begin to parallel elle 
of the modes of activity we associate with n to 
consciousness” (p. 227). An important erop ai 
this general rule is the work of Powers (1973, ole 
whose background includes training in both psy 


been concerned predominantly with — 
Processes other than those involved in S 
havior. 


that attention may be said to be directed
either outward toward the environment or
inward toward the self.

This seemingly simple statement (which is
identical to the first assumption made by
Duval and Wicklund, 1972) requires several
kinds of qualification and elaboration. First,
it should be clear that the dichotomy between
self-focus and environment focus is an imper-
fect one (as in some respects is the distinction
between what is self and what is nonself).
When one’s attention is focused on the en-
vironment, one is attending to immediate per-
ceptual input from the distance receptors: the
eyes and ears. Self-focus, in comparison, in-
cludes a wider variety of possibilities. When
attention is self-directed, it sometimes takes
the form of focus on internal perceptual
events, that is, information from those sen-
sory receptors that react to changes in bodily
activity (for example, the autonomic and
proprioceptive activity that contributes to the
subjective experience of emotion). Self-focus
may also take the form of an enhanced aware-
ness of one’s present or past physical be-
havior, that is, a heightened cognizance of
what one is doing or what one is like. Alterna-
tively, self-attention can be an awareness of
the more or less permanently encoded bits of
information that comprise, for example, one’s
attitudes. It can even be an enhanced aware-
ness of temporarily encoded bits of informa-
tion that have been gleaned from previous
focus on the environment; subjectively, this
would be experienced as a recollection or im-
pression of that previous event.

It is sometimes difficult to maintain the
clarity of this dichotomy, as is illustrated by
the following example. To look at a part of
one’s own body is, initially, to focus on the
environment. The visual perception is a per-
ception of a thing out in the world, until that
percept is categorized. Because the visual
stimulus in question is a part of one’s body,
however, it will ultimately be categorized as
a component of self. At that point the
experience may become one of self-attention,
as one mentally considers that self-aspect. It is
this process by which experimental manipulations
such as mirrors presumably act to heighten
self-focus:
by inducing visual perceptions that remind
subjects of themselves in a very general sense,
that is, remind them of self-dimensions other
than their visual representations.

It should also be clear that the term height- 
ened self-awareness and its synonyms as they 
are used in this article do not necessarily 
connote a long, deep examination of the self. 
Rather, such terms imply an increase in the 
probability or the frequency with which atten- 
tion is momentarily directed to some aspect 
of the self. In research, such increases in self- 
focus are caused by stimuli that remind sub- 
jects of themselves. It is assumed, however, 
that attention normally shifts back and forth 
between self and environment to some de- 
gree, whether such reminders are present or 
not. 

2. When attention is directed to the en- 
vironment, incoming stimulus information is 
processed, leading to classification of that in- 
put. The process of classifying incoming stim- 
uli has been studied extensively by cognitive 
psychologists. Posner (1969; see also Posner 
& Rogers, 1978), Franks and Bransford 
(1971), Reitman and Bower (1973), and 
Neumann (1974) are among the theorists 
who have developed models to account for the 
classificatory process in various ways. Though 
these models vary in other respects, they have 
as a unifying idea the notion that, over a suc- 
cession of perceptual experiences, a single 
representation in memory is abstracted from a 
series of relatively similar percepts. This 
single representation, then, stands for a class 
of perceptual events. These abstracted repre- 
sentations—sometimes termed recognitory 
schemas or recognitory prototypes—are stored 
in memory and are subsequently used to clas- 
sify and interpret new inputs. 
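
The general idea of abstracting a single representation from a series of similar percepts, and then using it to classify new inputs, can be illustrated with a small sketch. The sketch makes assumptions that go beyond the text: percepts are represented as numeric feature vectors, the prototype is taken to be their arithmetic mean, and a new input is assigned to the nearest prototype. The cited models differ on exactly these details, and the function names `abstract_prototype` and `classify` are hypothetical.

```python
from statistics import fmean
from typing import Dict, List, Sequence


def abstract_prototype(exemplars: Sequence[Sequence[float]]) -> List[float]:
    """Abstract one representation from a series of similar percepts.

    Assumption of this sketch: the prototype is the feature-wise mean.
    """
    return [fmean(dimension) for dimension in zip(*exemplars)]


def classify(percept: Sequence[float], prototypes: Dict[str, List[float]]) -> str:
    """Assign a new input to the category whose prototype is nearest."""

    def squared_distance(label: str) -> float:
        return sum((a - b) ** 2 for a, b in zip(percept, prototypes[label]))

    return min(prototypes, key=squared_distance)


# Hypothetical training series for two categories, each varying on two dimensions.
prototypes = {
    "A": abstract_prototype([[1.0, 1.0], [3.0, 1.0]]),
    "B": abstract_prototype([[9.0, 9.0], [11.0, 9.0]]),
}
label = classify([2.5, 2.0], prototypes)
```

The point of the sketch is only that classification proceeds by comparing an input against stored summaries rather than against every earlier percept.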

Evidence for the existence and use of recog- 
nitory prototypes in classifying stimuli in the 
environment comes from a vast accumulation 
of research in cognitive psychology. In cogni- 
tive research, the prototypes are typically 
generated experimentally by presenting sub- 
jects with a series of discrete training stimuli 
that vary along several dimensions. After 
training, subjects are found to classify new 
stimuli according to the organization implicit 
in the training series. Presumably the categorizational

[Figure 2 diagram. Recoverable labels: ANALYSIS YIELDS NO CUE / ANALYSIS YIELDS CUE, IMPLICATION FOR BEHAVIOR; (TO FIGURE 3).]

Figure 2. Flow-chart description of the cybernetic model of the consequences of self-directed
attention. (Numbers in the figure refer to comparable propositions in the text.)

structures that we carry with us in
our day-to-day lives similarly reflect our ex- 
periential histories, but histories that are 
much longer and more complex. 

In the restricted settings of cognitive re- 
search, the subject categorizes only one stim- 
ulus at a time. However, in the broader con- 
text of social behavior one doubtless catego- 
rizes the nature of the interaction setting as a 
whole, as well as categorizing single stimuli. 
The kinds of information contributing to this 
overall identification undoubtedly include 
gestural, postural, and other nonverbal cues 
emitted by persons in the visual field; sig- 
nificant physical elements in the situation 
(e.g., presence of objects with identifiable 
uses); and verbally transmitted information, 
including inferences that the listener makes 
from the verbalization. Each cue that is en- 
coded presumably contributes something to 
the resultant categorization of the context.

The work of a number of environmental
psychologists (e.g., Barker, 1968; Frederick-
sen, 1974; Gump, 1971) would seem to fit
very easily into this line of thought. Those
researchers have emphasized that behavioral
settings vary along many discriminable di-
mensions, and they have worked toward the
development of taxonomies of the settings
in which behaviors take place. Presumably
their efforts in categorization are similar to
the processes that we all use continuously,
albeit less consciously. In a similar vein, Can-
tor and Mischel (1977) have recently presented
evidence that personality-trait terms function
much as recognitory prototypes in the classi-
fication of other persons.
The categorization of environmental input,
which is assumed in the present model to

5 The specific processes by which such schemas are
formed and altered, although obviously impor-
tant, are outside the scope of this article. The
experimental literature of that area is treated only
peripherally here. Inasmuch as this article attempts
to deal primarily with consequences of self-directed
attention, discussion will emphasize areas with
direct relevance to the propositions that concern
self-focus.
[Figure 3 diagram. Recoverable labels: (from Figure 2); SELF-FOCUS WITH BEHAVIORAL STANDARD SALIENT; STANDARD HAS NEGATIVE VALENCE (from this point, the possibilities parallel those illustrated to the right); STANDARD HAS POSITIVE VALENCE; BEGIN DISCREPANCY ... SEQUENCE; HIGH EFFICACY PERCEPTIONS; LOW EFFICACY PERCEPTIONS; AFFECT ASSOCIATED WITH ENVIRONMENT.]

Figure 3. Flow-chart description of the cybernetic model of the consequences of self-directed
attention. (Numbers in the figure refer to comparable propositions in the text.)

occur whenever attention is directed outward
to the environment, is presumed to be an
interactive process. That is, it involves a com-
parison of the incoming information with
preexisting recognitory schemas.⁶ Indeed,
many believe that the resulting percept is as
much a function of the previously encoded
information as it is a function of the input
information itself (cf. Neisser, 1976). The
fact that this categorization involves such an
interaction between input and schema sug-
gests another qualification that must be made
regarding the proposed dichotomy between
self-focus and environment focus. Specifically,
if processing and classification of even a very
simple sensory input occurs by reference to
preexisting recognitory prototypes, it would
appear to be an oversimplification to say that
attention is ever directed “at” the environ-
ment alone. The prototypes are, in a very
basic sense, an aspect of oneself. The process-
ing of environmental information thus seems
to involve the simultaneous utilization of an
aspect of the self.

At this level of analysis, however, the di-
chotomy still remains useful and meaningful.
When internally stored schemas are accessed 
for comparison with environmental input dur- 
ing normal perception, this accessing occurs 
preattentively. It is the resulting percept that 
is represented in consciousness, not the evok- 
ing of the schemas. With regard to conscious 
experience, then, it seems quite reasonable for 
us simply to ignore the role played by the 


6 The perceptive reader may have noted that this
process itself represents a TOTE unit at a lower
level of analysis than is being emphasized in this
paper.
schemas in generating the percept. Moreover,
it bears repeating that the function of the
sequence that leads to the percept is quite
obviously the categorization of the stimulus 
input—and the input is an event whose point 
of initiation is outside the self. Because this 
is the goal of the activity, it thus seems rea- 
sonable to refer to it with such terms as focus 
on the environment and processing of incom- 
ing information. In both of these respects, the 
proposed dichotomy can be viewed as a verbal
shorthand that is useful in classifying con- 
scious experience at a macro level (where 
most of our present concerns are located) 
even while recognizing that at a micro level 
much more complicated events are occurring. 

3. In some cases, response-prototype infor- 
mation either constitutes part of a recognitory 
prototype or is directly implied by the recogni- 
tory prototype; in other cases, this informa- 
tion is not contained or implied (see Figure 
2). Saying that a recognitory schema used to 
identify one’s behavioral context does specify 
response-prototype information is exactly
equivalent to saying that the context evokes 
in the person a behavioral standard. As an 
example, most people in our culture have been 
taught to dampen the intensity of their be- 
havior—that is, to speak and walk quietly— 
in places that are classified as religious in na- 
ture. When a person with this behavioral 
schema classifies a behavioral setting as being 


religious, quietness is immediately evoked as 
a behavioral standard. 
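
Proposition 3 can be restated as a simple data structure in which a recognitory schema either carries response-prototype information or does not. This is a minimal sketch under assumptions: the class name `RecognitorySchema`, its fields, and the second ("waiting room") schema are hypothetical illustrations, whereas the religious-setting example follows the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecognitorySchema:
    """A stored category, optionally carrying response-prototype information."""

    category: str
    behavioral_standard: Optional[str] = None  # None: no cue implication for behavior


def evoke_standard(schema: RecognitorySchema) -> Optional[str]:
    """Classifying a context with this schema evokes its standard, if it has one."""
    return schema.behavioral_standard


# Example from the text: a setting classified as religious evokes quietness.
religious_setting = RecognitorySchema("religious setting", behavioral_standard="quietness")

# Hypothetical schema that specifies no response-prototype information.
unfamiliar_setting = RecognitorySchema("unfamiliar waiting room")

evoked = evoke_standard(religious_setting)
not_evoked = evoke_standard(unfamiliar_setting)
```

Representing the standard as an optional field mirrors the two branches of Figure 2: analysis either yields a cue implication for behavior or it does not.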


Situations exist in 
dard of appropriate 
guidelines for action 
recognition of the 
Whether a situation 


tization that were Postulated above to occur 
are being made between information and re-
cognitory schemas. The difference is that the
information now being compared with the
schemas is self-information—information from 
within (e.g., recollections of prior behaviors 
or present bodily feedback activity). Experi- 
entially, in this type of circumstance, self-
attentive persons simply become increasingly
cognizant of themselves and their own salient 
characteristics, much as they might examine 
any external stimulus. 

In such cases, self-focus is not evaluative 
except in the minimal sense that preattentive 
analysis “evaluates” in order to categorize.
More specifically, this state of self-focus is not 
phenomenologically aversive. 

The notion that self-directed attention may
in some cases lead simply to heightened
awareness of salient self-elements rather than
to a self-critical examination is a major de-
parture from the assumptions on which Duval
and Wicklund (1972) based their theory. Yet
there is a good deal of evidence to support this
assertion. This evidence will be reviewed in
the following paragraphs.

Duval and Wicklund (1973) themselves
demonstrated that more self-attentive subjects
make greater self-attributions for hypothet-
ically experienced outcomes than do less self-
attentive subjects. This outcome has since
been replicated (Buss & Scheier, 1976), al-
though other research suggests some limita-
tions on the effect (Hull & Levy, 1979). It
may be that the mere heightened salience of
self as an entity led subjects in the Duval and
Wicklund (1973) and Buss and Scheier 
(1976) studies to weight the self more heavily 
in the judgments they made. Alternatively, 
perhaps the salient aspect of self in this at- 
tribution-making context is one’s potential as 
a causal agent (cf. Duval & Wicklund, 1973). 

The reasoning that salient self-aspects may 
be more fully represented in consciousness 
when self-focus is high than when it is low 
has also been extended to the internal experi-
ence of emotional states (Scheier & Carver,
1977). In that research, it was proposed that
when one’s attention is self-focused in a con- 
text in which there is affect, one may attend 
to that affect as being the salient component 
of self. Thus, in two of Scheier and Carver’s 
studies, subjects who had experienced positive 
and negative mood inductions subsequently 
reported feeling more elation and depression, 
respectively, when more self-attentive than 
when less so. In two additional studies, sub- 
jects reported experiencing greater attraction 
and repulsion toward pleasant and unpleasant 
slides, respectively, when self-attentive than
when not. Others (Borkovec & O’Brien, 1977)
have found similarly that directing a subject’s
attention to his or her bodily responses leads
to an increase in the self-reported intensity of
the emotion being experienced.

The notion that heightened self-focus leads
to enhanced awareness of one’s internal state
suggests another interesting possibility: that
self-focus can lead to heightened awareness
of the absence of anticipated internal activity.
This hypothesis was recently tested (Gibbons,
Carver, Scheier, & Hormuth, 1979), in a
study in which some subjects were led to
expect arousal symptoms from a pill, whereas
others were not. In all cases the pill was a
placebo. Among subjects who expected
arousal, fewer symptoms subsequently were
reported by self-focused subjects than by
those with less self-focus.

The Gibbons et al. (1979) finding appears 
to indicate that when led to expect a different 
internal state than is actually present, the 
self-attentive person has greater access to the 
veridical internal state than does the less self-
attentive person. Two additional studies
(Scheier, Carver, & Gibbons, in press-a) have
investigated further this heightened awareness
of internal experience among self-attentive
persons by assessing whether it can serve to
reduce other kinds of suggestibility phenom-
ena. In the first of these studies, subjects were
exposed to stimuli of moderate sexual attrac-
tiveness, which they were instructed to eval-
uate on the basis of their own bodily reactions.
Subjects were led, however, to anticipate
either highly arousing or very nonarousing
stimuli. Subjects with experimentally height-
ened self-attention were less misled by these
cues than were subjects in whom self-focus 
had not been increased. The second study ex- 
tended the reasoning to the experience of 
taste. Subjects were led to expect either an 
increase or a decrease in a flavor intensity 
(relative to a previous sample); they then 
received either a slightly stronger or a slightly 
weaker solution. The intensity judgments sub- 
sequently made by highly self-attentive sub- 
jects were less in line with manipulated antic- 
ipations and more in line with actual flavor 
intensities than were judgments made by less 
self-attentive subjects. Thus, again, self-focus 
minimized suggestibility, apparently by in- 
creasing awareness of an actual internal ex- 
perience. 

Nor is the increased accuracy of self-report 
that occurs under high self-focus limited to 
reports of internal perceptual experiences. 
Other studies (Pryor, Gibbons, Wicklund, 
Fazio, & Hood, 1977; Scheier, Buss, & Buss, 
1978) indicate that self-focus enhances peo- 
ple’s accuracy in reporting on their habitual 
behavioral tendencies. Thus, for example, self- 
reports of sociability that were made under 
conditions of heightened self-attention were 
found to be highly correlated with unobtru- 
sive measures of actual social behavior; re- 
ports made by less self-attentive subjects, on 
the other hand, poorly represented their actual 
behavior (Pryor et al., 1977). Presumably, 
self-attention provides more veridical or more 
thorough access to the memories that are 
relevant to such self-reports. Additional evi- 
dence that self-focus increases the activation 
of self-relevant memory areas has been pro-
vided by Geller and Shaver (1976).⁷


7 In addition to determining ease of access to pre- 
viously stored information, the direction of one’s 
attention probably also has implications for the 
access one has to schemas for use in encoding new 
stimuli (whether they be self-stimuli or environ- 
mental stimuli). Thus, for example, when one is 
self-focused, one may not have good access to the 
schemas that are typically used for encoding informa- 
tion about the environment (cf. Vallacher, 1978). 
Indeed, easy access to a schema may increase its use, 
even in situations where such use would not other- 
wise be expected. This may be the mechanism behind 
the “top of the head” phenomena discussed by Taylor 
and Fiske (1978), in which some particularly salient 

A final illustration of the fact that self-
attention can lead simply to greater cog-
nizance of the self comes from research con-
ducted by Carver and Scheier (1978) as
validation of self-awareness manipulations. 
Subjects in that research completed a self- 
focus sentence completion blank (Exner, 
1973) developed and validated earlier as a 
measure of egocentrism. This instrument was 
completed either in an empty room or in the 
presence of a self-awareness-inducing stim-
ulus. As was expected, proportionally more
self-focus sentence completions were emitted 
in conditions of heightened self-awareness 
than under control conditions. 

Salience. One final issue, implicit in much 
of the above discussion, probably should be 
addressed at this point. What determines 
which aspect of the self is “salient” when 
attention is self-directed and which aspect of 
the environment is salient when attention is 
directed to the environment? A full and com- 
plete answer to this question cannot yet be 
given. Doubtlessly the ultimate answer will be 
complex. However, one determinant of sali- 
ence probably derives directly from the inter- 
active nature of the categorization process de- 
scribed above. As was noted, classification of 
the environment takes place by comparison of 
input against recognitory prototypes. If the 
prototypes to which the environmental inputs 
correspond have cue implications for some 
specific self-information, the self-information 
is probably accessed (in a preliminary pre- 
attentive fashion) by the very process of 
classifying the environmental input. The 
aspect of self that is represented by that self- 
information presumably will be more salient 
than other self-aspects when attention is sub- 
sequently directed inward. Another way of 
saying this is that when we attend to the self, 
we are often examining an aspect of self that 
has been suggested by some cue in our en- 
vironmental context. For example, an environ- 
mental stimulus that is categorized as typ- 
ically emotion-inducing may cue a search for 


stimulus exerts an inordinate influence on subjects’ 
behavior. This notion might also have some relevance 
to the recent finding that imagining an event in- 
creases the degree to which a person expects the 
event to occur (Carroll, 1978). 

bodily arousal as a salient self-dimension)
when attention is self-directed. Indeed, such a
cue implication may serve in vivo as an im- 
petus to shift focus from the environment to 
the self. 

Similarly, the self-aspect that has just been 
under examination may be a major determi- 
nant of the aspect of the environment that is 
salient when attention is directed outward. 
For example, consider a case in which self-
attention has had as its object an existing
state of bodily arousal for which there is no 
recognized antecedent. In such a case, it seems 
likely that attention subsequently directed 
outward would be oriented specifically to 
searching for aspects of the environment that 
are categorizable as potentially arousing. Thus 
a self-aspect can have a cue implication for 
some aspect of the environment, leading to 
heightened salience of that aspect when one’s 
focus turns outward. More simply, when we 
attend to the environment, we are often focus-
ing on aspects of the environment that have
been suggested by some cue within ourselves.

5. Just as categorization of environmental 
stimuli sometimes evokes a behavioral proto- 
type as a standard, a categorization of a sali- 
ent self-aspect may in some cases evoke a be- 
havioral standard (see Figure 2). As an 
example, a salient emotion might evoke a
prototypic response to that emotion. The re-
search presented above regarding awareness
of affect was limited to reports of subjective 
emotional experiences. Other research has 
demonstrated, however, that heightened 
awareness of an affect may also lead to in-
creased behavioral responsivity to the affect.
For example, Scheier (1976) found that pro-
voked aggression was more intense among
more self-focused than among less self-focused
subjects. Other research (Scheier, Carver,


8 Scheier (1976) found that subjects’ personal
attitudes toward aggression did not correspond to
the aggression they actually emitted. However, an
attitude is only one type of standard that is poten-
tially applicable in that context. The issue of types
of standards will be addressed more fully below. The
position taken here with regard to the Scheier (1976)
data is that awareness of anger evoked a stereotyped
retaliatory response prototype and that the proto-
type was used as a comparison value for subsequent
behavior.

Schulz, Glass, & Katz, 1978) has demon-
strated that feelings of sympathy are ex-
pressed to a greater degree among more self-
attentive than among less self-attentive per-
sons. Finally, yet other studies (Scheier,
Carver, & Gibbons, in press-b) have shown 
that strong fear leads to greater avoidance 
among more self-attentive than among less 
self-attentive persons. 

6. A standard can have either a positive or 
a negative valence. A positive valence implies 
that the standard is taken as a “desired” goal 
state. A negative valence exists when a stan- 
dard of comparison is taken as an “undesired” 
goal state. Valences presumably are based on 
previous learning and generalization and on 
temporary encoding through imitation and 
verbal transmission. Multiple or component
valences can easily exist with respect to any 
specific standard, of course. Consider, for ex- 
ample, the teacher’s behavioral standard of 
going from the office to teach a class. On a
given day the teacher may feel that he or she 
should go to the class because teaching is his 
or her vocation (positive valence); the 
teacher may also feel like not going because 
the teacher does not enjoy the class (negative
valence). One important component valence,
of course, occurs as a function of an obligation
to do the behavior in question (cf. Vickers, 
1973). 

It seems likely that any given component 
valence can also vary in salience over time. 
Its salience is likely to depend on the salience 
of the environmental cues to which the val- 
ence is attached. For example, if bright, inter- 
ested students have just been in the teacher’s 
office, a positive valence may be salient. Pre- 
sumably the accumulation of these component 
valences (weighted by salience) determines 
the overall valence of the standard as dis- 
cussed here. 
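
The accumulation described above can be sketched numerically. The following is only an illustration under stated assumptions: the numeric scales for valence and salience, the use of a simple weighted sum, and the function name overall_valence are all hypothetical and are not part of the model as stated.

```python
def overall_valence(components):
    """Sum component valences, each weighted by its current salience.

    `components` is a list of (valence, salience) pairs. Valence is signed
    (positive = desired, negative = undesired); salience is a non-negative
    weight. Both scales are assumptions made for this sketch.
    """
    return sum(valence * salience for valence, salience in components)


# The teacher example: a vocation-based positive valence and a
# dislike-of-the-class negative valence. After bright, interested students
# have visited the office, the positive component is the more salient one.
after_student_visit = [(+1.0, 0.75), (-1.0, 0.25)]
on_an_ordinary_day = [(+1.0, 0.25), (-1.0, 0.75)]

print(overall_valence(after_student_visit))  # positive overall valence
print(overall_valence(on_an_ordinary_day))   # negative overall valence
```

The same standard thus takes on a different overall valence as the salience of its components shifts, which is the point of the paragraph above.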

7. Standards and behavioral dimensions can 
also vary in importance. There are at least 
two ways in which importance may be con- 
strued in informational terms. The first de- 
pends on association of positive or negative 
events with the stimulus class during the per- 
son’s prior experiences. A priority label may 
become part of the categorization as a func-
tion of the intensities of such events. High
intensity events (e.g., strong punishment for
poor behavior with respect to a standard)
would lead to a high priority tag or “impor-
tance”; low intensity events would lead to a
tag of less priority.

A second possibility concerns the centrality 
or interrelatedness of a behavioral dimension 
with respect to other dimensions. Importance 
in this case would be defined according to the 
number of cross-reference tags attached to a 
given behavioral dimension that refer to other 
behavioral dimensions. For example, the 
amount of work that one person does may be 
weighed in his or her mind as important, but 
its importance does not extend to other areas 
of the person’s life. The work, in effect, is left 
at the office. In contrast, another person may 
not only see his or her work output as im- 
portant but may also view it as having im- 
plications for other parts of his or her life, for 
example, interpersonal relationships. The 
work output dimension thus would be effec- 
tively more important for the second person 
than for the first, by virtue of its intercon-
nectedness with other dimensions.⁹
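
This second sense of importance can be illustrated with a small data structure. The sketch below assumes, purely for illustration, that cross-reference tags can be represented as a set of dimension names; the variable names and the function centrality are hypothetical.

```python
# Hypothetical cross-reference tags: each behavioral dimension maps to the
# set of other dimensions it is seen as having implications for.
first_person = {"work_output": set()}  # work is "left at the office"
second_person = {"work_output": {"interpersonal_relationships"}}


def centrality(tags, dimension):
    """Count the cross-reference tags attached to one behavioral dimension."""
    return len(tags.get(dimension, set()))


print(centrality(first_person, "work_output"))
print(centrality(second_person, "work_output"))
```

On this reading, work output is effectively more important for the second person simply because more tags lead from it to other dimensions.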

8. If a behavioral standard has been evoked 
either by environmental input or by a salient 
self-element, subsequent self-focus leads to a
comparison between self (one’s present be-
havior or characteristics) and the previously
evoked standard. Comparison between self 
and a positively valenced standard leads to 
discrepancy reduction; comparison between 
self and a negatively valenced standard leads
to discrepancy enlargement (see Figure 3). 
The comparison between self and standard is 
regarded as representing the test phase of a 
psychological TOTE unit. It thus leads to the 
engagement of one or the other of two possible 
adjustment and feedback sequences: either a 
negative feedback loop (discrepancy reduc- 
tion) or a positive feedback loop (discrepancy 
enlargement). 
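
The two feedback sequences can be sketched as a simple test-operate loop. This is a minimal illustration, not a statement of the model itself: behavior is reduced to a single number, and the step size, the cap on the number of tests, and the function name tote are assumptions introduced here.

```python
def tote(state, standard, valence, step=1.0, max_tests=10):
    """Alternate test and operate phases; return the sequence of states.

    valence > 0: negative feedback loop (discrepancy reduction).
    valence < 0: positive feedback loop (discrepancy enlargement).
    """
    history = [state]
    for _ in range(max_tests):
        discrepancy = standard - state  # test: compare self with standard
        if valence > 0 and discrepancy == 0:
            break  # exit: behavior matches the positively valenced standard
        toward = 1 if discrepancy > 0 else -1
        if valence > 0:
            # operate: move toward the standard without overshooting it
            state += toward * min(step, abs(discrepancy))
        else:
            # operate: move away from the standard (the direction chosen
            # when the discrepancy is exactly zero is arbitrary here)
            state -= toward * step
        history.append(state)
    return history


print(tote(2.0, 5.0, valence=+1))               # approaches 5.0, then exits
print(tote(2.0, 5.0, valence=-1, max_tests=3))  # moves steadily away from 5.0
```

Note that only the discrepancy-reducing loop has a natural exit point; the enlarging loop in this sketch stops only because the number of tests is capped.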

The notion that heightened self-focus leads 


9 Standards can also vary in their degree of ac- 
ceptedness, which is logically distinguishable from 
importance, even though the two in reality are often 
confounded. A standard that one accepts half- 
heartedly will be less reflected in behavior than one 
accepted enthusiastically. 

to increased conformity of one’s behavior to a
salient standard was, of course, a main tenet 
of self-awareness theory, although for quite a 
different reason than is assumed here. Self- 
awareness theory held that discrepancy reduc- 
tion was one of several potential responses to 
the postulated aversiveness of self-attention. 
The present model assumes that the discrep- 
ancy-reduction impulse is affect free and is the 
prepotent response to the recognition of a 
discrepancy. These distinctions are important 
ones and will be elaborated more fully in a 
later section of the paper. 

Discrepancy reduction. Whatever its basis, 
the prediction that self-directed attention 
leads to heightened conformity to the salient 
behavioral standard has received ample re- 
search support. For example, a series of 
studies of instrumental aggression showed that 
subjects matched their behavior to salient 
standards (which varied from study to study) 
to a greater degree when made more self- 
aware than when less self-aware. In the first 
of these studies (Scheier, Fenigstein, & Buss, 
1974), self-focus caused male subjects to con- 
form to a “chivalry” norm when shocking a 
female victim who had not provoked them. In 
a second study (Carver, 1974), female sub- 
jects shocked at higher levels when self-focus 
was increased, in line with an experimentally 
induced standard in which high shock was 
portrayed as desirable. In yet other studies 
(Carver, 1975), subjects’ aggressive behavior 
was more consistent with their previously ver- 
balized attitudes when the subjects were more 
self-attentive than when less so. 

Moreover, demonstrations of the fact that 
self-focus leads to increased conformity to 
standards are by no means limited to studies 
of aggression. For example, in an experiment 
in which speed of copying prose was made 
salient as a standard of comparison (Wick- 
lund & Duval, 1971), more prose was copied 
within a given time span by subjects in whom 
self-focus was heightened than by those in 
whom it was not. In addition, compared to 
less self-aware controls, self-aware subjects 
have been found to behave more consistently 
with their previously assessed stage of moral 
reasoning (Froming, in press) and more con-
sistently with their previously assessed levels
of sex guilt (Gibbons, 1978).

Discrepancy enlargement. The notion that 
standards can have negative valuation and 
that focus on those standards leads to discrep- 
ancy enlargement has received much less 
attention than has the matching-to-standard 
process. However, this notion is logically con-
sistent with an older idea in social psychology.
Newcomb (1958) wrote of the “negative ref- 
erence group,” a group with which one does 
not identify, but with which one actively com- 
pares oneself. Such a comparison allows one to 
emphasize, and indeed to further increase, the 
existing differences. Newcomb coined the term
negative reference group in the context of 
social comparison theory (Festinger, 1950, 
1954). Yet the nature of the process that 
Newcomb proposed is strikingly similar to the
process proposed above for cases in which
some negatively valenced behavioral standard
is salient. The difference is that in the present
model the comparison value need not be the
attitudes or behavior of a particular group of
people. It can be any negatively valenced
value that is taken up for purposes of com-
parison. (The relationship between the pres-
ent theorizing and social comparison theory
will be examined in more detail in a later sec-
tion of the article.)

Another class of behavior that can be 
viewed as discrepancy enlargement is a phe- 
nomenon that is usually analyzed in terms of 
psychological reactance (Brehm, 1966; Wick- 
lund, 1974). Researchers have demonstrated 
repeatedly that coercive attempts to persuade 
lead to resistance to persuasion or even to atti- 
tude change in a direction opposite to the 
position being advocated (e.g., Brehm &
Brehm, 1966; Snyder & Wicklund, 1976). 
Such effects may be interpreted as attempts 
to distance oneself from the standard of com- 
parison (i.e., the position that is advocated in 
the communication). In the present model
that standard would acquire a negative va-
lence in very much the same manner as was
postulated by Brehm (1966)—that is, by
virtue of the threatening elements contained
in the associated communication. Consistent
with the reasoning that distancing-from-stan-
dard might occur under these circumstances
in response to self-focus, Carver (1977) has
found that attitude reversal in response to a
coercive communication was greater among
self-focused subjects than among less self-
focused subjects. Similar results have occurred
in two additional studies (Carver & Scheier,
in press).

Standards. Standards for self-comparison
can vary widely, both in terms of their spe- 
cific nature and in terms of their source. One 
reasonable distinction in this regard would be 
between the type of a standard and its con- 
tent. For example, smoking in response to 
stress is different in content from eating, or
pacing in response to stress. Yet they are all
similar in one respect: They are of the type
called habit. A personal-preference attitude is
a different type of standard. A third type is
an attitude based on more complex reasoning, 
for example, moral reasoning. Similarly, an 
instruction from another person that is 
adopted and conformed to is different in type 
from all these, as is the temporarily assumed 
socially desirable behavior that others nearby 
are exhibiting. With these multiple possibil- 
ities of source, type, and content, what deter- 
mines which is evoked in a given situation and 
is thus taken as the comparison value when
one is self-focused? 

Specific applications of this general ques- 
tion abound in personality and social psychol- 
ogy. For example, the literature of moral 
development is replete with illustrations of 
the fact that moral reasoning is a judgmental
capability rather than a typical way of behav- 
ing. Under which conditions will a person 
respond to a moral choice by exerting his or 
her psychological capabilities to their fullest, 
and under which conditions will the person’s 
behavior be dictated by other considerations? 
The answer to this question is by no means
clear, for the following reason. Although
categorization processes and the use of catego-
rizational schemas to classify and interpret in-
formation have recently begun to receive at-
tention from social psychologists (e.g., Cantor
& Mischel, 1977; Kuiper & Rogers, 1979; 
Markus, 1977; Rogers, Kuiper, & Kirker, 
1977; Rogers, Rogers, & Kuiper, 1979), thus 
far virtually no work has been aimed ex-
plicitly at understanding the process of ex-
tracting behavioral prototypes in response to
environmental cues. Which standard becomes 
salient in a given experimental situation has 
usually been more a matter of good intuitions 
on the part of the researcher who is preparing 
the situation than a matter of precise analysis 
of the determinants of salience. Quite ob- 
viously, this entire question is an important 
area for future work. For the present, let us 
simply note that matching-to-standard effects 
appear to have been demonstrated with regard 
to all of the following: habits¹⁰ (Liebling,
Seiler, & Shaver, 1974), personal-preference
attitudes (Carver, 1975), moral reasoning
capabilities (Froming, in press), and experi-
mentally provided instructions (Carver,
1974).


9. Matching-to-standard is the normal con- 
sequence of self-attention when a behavioral 
standard is salient, until and unless the pro- 
cess is impeded in some way. (Although this 
and the following propositions will be framed 
in terms of matching-to-standard, it should be 
clear that the same logic is applicable to dis- 
tancing-from-standard phenomena; see also 
Figure 3). Throughout a discrepancy-reduc- 
tion sequence, self-focus represents a recur- 
rence of the test mechanism of the psycholog- 
ical TOTE unit. Test will continue to alter- 
nate with one or more operate mechanisms 
until behavior or present state is brought into 
line with the standard. As in any TOTE se- 


10 The fact that matching-to-standard apparently 
can occur with regard to habits as standards raises 
the additional question of how “conscious” the 
process is in such cases. Matching to a habit standard 
may feel very unconscious, phenomenologically, for 
the following reason. The degree to which execution
of a behavioral sequence seems conscious depends on
awareness of its component acts and their conse-
quences; this in turn depends on the degree to which
those acts are monitored while being carried out. In 
habitual activity, the behavioral components have 
previously been encoded as a well-connected chain 
(cf. Schank & Abelson, 1977, discussion of cognitive
“scripts”). To use TOTE terminology, a single op-
erate consists of a behavioral sequence rather than
a small bit of behavior change. Therefore, fewer tests 
are required over the course of the behavior than 
is the case for other, less well-known behavioral 
sequences. The relative paucity of attention devoted 
to monitoring this behavior suggests that it may be 
experienced less completely in consciousness.

quence, if successful completion of matching-
to-standard occurs, it is followed by disen-
gagement.

This proposition stands in direct contradic- 
tion to the assumption in recent statements of 
self-awareness theory (Wicklund, 1975a, 
1975b) that the prepotent response to self- 
focus in a standard-salient context is avoid- 
ance of self-awareness rather than discrepancy 
reduction. As was noted above, this contradic- 
tion has important ramifications that will be 
considered further in a later section of the 
paper. 

10. If something impedes the matching-to- 
standard process, behavior is interrupted and 
an assessment process is evoked (see Figure 
3). This assessment entails the further pro- 
cessing of relevant information, yielding an 
outcome expectancy: an estimate of the likeli- 
hood of being able to more closely approxi- 
mate the standard, based on the nature of the 
situation and on the behaviors available to 
the person. This assessment process can occur 
prior to the initiation of behavior if, for ex- 
ample, the person has foreknowledge that the 
behavior is going to be difficult to execute suc- 
cessfully. Alternatively, it can occur during 
the matching-to-standard sequence if the per- 
son encounters difficulty in either selecting 
or executing the appropriate operation to 
move toward the standard. 

The assessment of outcome expectancy may 
be an evaluation of the likelihood of attaining 
the overall goal, given the context and one’s 
resources. Alternatively, if the matching-to- 
standard sequence involves separable steps, it 
may be an assessment of the likelihood of at- 
taining the next step. In either case, the 
assessment process has two products: One is 
behavioral, the other affective. These will be
considered in turn. 

11(a). If the assessment process leads to a
favorable outcome expectancy, the behavioral 
response is a return to the matching-to-stan- 
dard sequence (Figure 3). 

11(b). If examination of the context and 
one’s resources reveals a low subjective prob-
ability of being able to alter behavior appro-
priately, the behavioral consequence is with-
drawal. The latter would appear to represent
a disengagement from the dimension in ques-
tion, a kind of going-out-of-the-field phenom-
enon (cf. Lewin, 1935). In practice, this
could take the form of a mental dissociation 
from the dimension (refusal to consider fur- 
ther) or could occur as a behavioral with- 
drawal from contextual elements that either 
heighten self-attention or make the standard 
salient. 
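
Propositions 10, 11(a), and 11(b) together describe a branch in the matching-to-standard sequence, which can be sketched as follows. This is an illustration only: the model speaks of favorable versus unfavorable expectancies, so the numeric expectancy and the 0.5 threshold are assumptions, as are the function name attempt and its parameters.

```python
def attempt(steps_needed, impeded_at, expectancy, threshold=0.5):
    """Simulate a matching-to-standard sequence with one possible impediment.

    impeded_at: the step at which an impediment is met (1 = before behavior
        begins, as when difficulty is foreseen; None = no impediment).
    expectancy: subjective probability of approximating the standard.
    Returns (outcome, log), where outcome is "matched" or "withdrew".
    """
    log = []
    for step in range(1, steps_needed + 1):
        if step == impeded_at:
            log.append("interrupt: assess outcome expectancy")
            if expectancy < threshold:
                log.append("withdraw")  # 11(b): unfavorable expectancy
                return "withdrew", log
            log.append("resume")  # 11(a): favorable expectancy
        log.append(f"operate {step}")
    log.append("exit: standard matched")
    return "matched", log


print(attempt(3, impeded_at=2, expectancy=0.8))
print(attempt(3, impeded_at=2, expectancy=0.2))
```

The sketch covers only the behavioral product of the assessment; the affective product is treated separately in the text.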

There are several types of events that can 
potentially prevent or disrupt the matching- 
to-standard process. One possibility is that the 
behavior in question is fixed in the past. If 
there is no opportunity available to repeat, 
reattempt, or undo the behavior, it is thus 
rendered unchangeable. In such a case, dis- 
crepancy reduction cannot even begin. An 
example of this state of affairs occurred in a 
study conducted by Duval, Wicklund, and 
Fine (1972). Some subjects in that study 
were informed that their scores on a pre- 
viously administered test of intelligence and 
creativity were very discrepant from the levels 
that they had anticipated and desired. More- 
over, it was implicit that the test was not 
going to be repeated, nor was any other op- 
portunity to be provided for reducing the 
discrepancy (thus all subjects would have 
had low outcome expectancies with regard to 
matching the behavior to standard). In this 
study, subjects who were to wait for another 
experimenter after receiving the negative 
feedback withdrew sooner if the room con- 
tained a self-attention-heightening stimulus 
than if it did not. 

Results of a more recent study (Steenbarger
& Aderman, in press) confirm that this with- 
drawal tendency depended on the fact that 
the behavioral discrepancy was portrayed as 
unalterable. This study made use of a para- 
digm derived from that of Duval et al. 
(1972), but one in which there was ostensibly 
to be a second task. Steenbarger and Aderman 
(in press) found that when subjects experi- 


11 This proposition and the two following proposi-
tions, though derived independently, have a good
deal in common with Stotland’s (1969) analysis of
hope as a psychological construct. Stotland used the
term hope to refer to the perception of a high prob-
ability of attaining a goal and the term anxiety to
refer to the perception of low probability of attain-
ing a goal.

enced an inflexible discrepancy—that is, when
the impression was created that subjects could
not expect to alter their performances ap-
preciably—self-awareness hastened with-
drawal. When the impression had been created
that improvement was a real possibility, on
the other hand, no increase in withdrawal
occurred as a function of self-attention.

Intrapersonal deficit. A second set of con-
ditions that may lead to behavioral interrup-
tion stems from the possibility that the execu-
tion of the behavior in question requires spe-
cific skills, abilities, or characteristics that the
person is not certain he or she possesses. The
implication that these skills are required may
cause the discrepancy-reduction attempt to be
suspended and cue the assessment process. An 
example of the importance of this possibility 
is provided by recent studies of fear-based 
behavior conducted by Carver and Blaney 
(1977a, Experiment 3; 1977b). In those 
studies, some subjects were chosen explicitly 
on the basis of their doubts about their ability 
to cope with fear, whereas others were chosen 
as believing that they could cope with their 
fear. All of these subjects, however, reported
having identical (moderate) degrees of fear. 
It seems reasonable that signs of fear while 
attempting to approach and pick up the
feared stimulus (in this case, a snake) should 
cue a self-assessment process in all such sub- 
jects. In the Carver and Blaney (1977b) 
study, the subjects who doubted their abil- 
ities to overcome their fear (i.e., who had 
chronically low outcome expectancies) re-
acted to the perception of rising fear, induced
via false heartbeat feedback, by avoiding at- 
tending to the behavior-goal comparison (ac- 
cording to self-reports) and by withdrawing 
behaviorally. The latter effect replicated a 
previous finding (Carver & Blaney, 1977a,
Experiment 3). In contrast, those subjects
who had expressed more confidence about
being able to go through with the behavior
(i.e., who had chronically higher outcome ex-
pectancies) responded to the perception of
anxiety by focusing more on the behavior-
goal comparison and attempting—successfully
—to match the one with the other.

The Carver and Blaney (1977a, 1977b)
studies investigated the behavioral conse-
quences of variations in expectancy, but those
studies did not explicitly incorporate experi- 
mental manipulations of self-focus. More re- 
cently, however, a similar study has been 
conducted using a conventional self-awareness 
manipulation (Carver, Blaney, & Scheier, 
1979). In that experiment, confident subjects 
were found to be undeterred by heightened 
self-awareness during the approach task, but 
doubtful subjects withdrew earlier in the 
approach sequence when self-focus was high 
than when it was lower. Postexperimental self- 
reports formed a pattern that was consistent 
with the present model, that is, that height- 
ened self-attention had led to increased aware- 
ness of fear among all subjects and thus to 
self-assessment during the approach sequence; 
that self-assessment, in turn, had led to con- 
tinued approach attempts among confident 
subjects and to early withdrawal among 
doubtful subjects. 

This result has recently been conceptually 
replicated (Carver, Blaney, & Scheier, in 
press) in quite a different behavioral domain: 
i.e., responses to failure. All subjects in that 
research were confronted with a failure ex- 
perience on an intellectual task, in order to 
create a large self-versus-standard discrep- 
ancy. Subjects then undertook a second task, 
ostensibly bearing on the same intellectual 
skill. Rather than assessing subjects’ chronic 
expectancies, however, as had been the case 
in the Carver et al. (1979) study, an experi- 
mental manipulation of outcome expectancy 
was introduced. Subjects were led to believe 
either that they could potentially do quite 
well on the second task, or that they would 
probably do quite poorly. In reality, the sec- 
ond task was a measure of persistence. As 
predicted, subjects in whom unfavorable ex- 
pectancies had been induced were less per- 
sistent when self-focus was high than when it 
was low. Also as predicted, subjects with 
favorable expectancies were more persistent 
when self-focus was high than when it was 
low. 

The tasks in all of the above research pro- 
vided subjects with escapable situations. That 
is, it was possible for subjects engaged in task 
attempts in those studies to withdraw phys-
ically from the behavioral context. This pos-
sibility does not always exist, however, when
a perceived self-deficiency threatens one’s be- 
havioral adequacy. For example, consider the 
person with severe test anxiety who must take 
an unavoidable test. While in the testing 
situation, such persons probably are very 
susceptible to being interrupted from their 
performance attempts by stimuli that cue out- 
come assessment. In many test situations, this 
assessment yields unfavorable expectancy 
judgments. Because they cannot withdraw 
behaviorally, however, these persons may be 
frozen in the self-assessment phase of the 
sequence, where they repeatedly reconfront 
the evidence of their own inadequacy. At- 
tempts at task-relevant activity are sporadic 
because outcome assessment indicates that 
such attempts will be unsuccessful, and the 
attempts are short-lived because the assess- 
ment process is likely to be re-evoked. Thus 
the test-anxious person is caught in a cogni- 
tive loop in which self-assessment and cogni- 
tive disengagement from the task predom- 
inate, rather than renewed effort. 

This portrayal is consistent with Wine’s 
(1971) characterization of the test-anxious 
person as being chronically self-focused in
situations where the adaptive response is to
be task focused. The present model seems 
more adequate than is Wine’s, however, in 
two respects. Though Wine’s analysis would 
not necessarily predict that self-focus can 
facilitate performance among persons low in 
test anxiety, the present model clearly does. 
Such performance facilitation has been dem- 
onstrated elsewhere (Wicklund & Duval, 
1971). The present model also implies that 
self-focus can facilitate performance even 
among highly test-anxious persons, if circum-
stances lead those persons to have positive
outcome expectancies. This possibility does 
not seem derivable from Wine’s (1971) 
theory. Recent research (Slapion & Carver, 
Note 1) has yielded evidence that this can 
occur as well. 

Environmental constraints. Yet a third set 
of conditions that would lead to behavioral 
interruption is the possibility that some fea-
ture of the environment may prevent the suc-
cessful execution of the appropriate behavior
(a circumstance that fits the usual opera-
tional definition of frustration). To use a
somewhat trivial example, if one goes shop-
ping at a time when stores are closed, one
typically withdraws from the attempt. This 
withdrawal does not occur, however, until 
one’s attempt has been interrupted and the 
realization has occurred that the desired out-
come is unlikely (i.e., low outcome probabil-
ity). The notion that environmental con-
straints lead either to withdrawal or to
attempts to circumvent them seems a very ob-
vious point. This case is discussed here largely
in order to point out that the psychological 
processes implicit in this class of events are 
the same as those that occur in the other event 
classes discussed above. This indicates the 
high degree of generality of the processes 
under consideration. 

Novelty. The analysis presented here of 
processes leading to withdrawal also bears 
some resemblance to Berlyne’s (1963) discus- 
sion of the somewhat perplexing fact that 
novelty sometimes leads to approach, some- 
times to withdrawal. He proposed that the 
presence of some degree of “collative” (i.e.,
comparison-requiring) properties in a stimulus 
(novelty, for example) engages exploration, 
movement toward the stimulus in order to 
assimilate it. Too much of the same proper- 
ties, however, leads to withdrawal. According 
to Berlyne, the reason for this withdrawal is
that the organism assesses the alterations re- 
quired to assimilate the stimulus as being be- 
yond its present capabilities. In effect, out- 
come expectancy with regard to assimilation 
is negative, and withdrawal is the conse- 
quence. 

12(a). The judgment that one cannot alter 
one’s behavior in the direction appropriate to
the standard may also lead to negative affect
in proportion to the importance of the be-
havioral dimension or the standard and the
perceived magnitude of the discrepancy.
This negative affect remains present as long as 
the person focuses on the judgment that he 


12 It seems worth reemphasizing that importance
here refers to importance as defined by oneself. A
standard that others view as important but that
seems irrelevant to oneself is for present purposes
not an important standard.

or she cannot reduce the discrepancy. Evi-
dence for this proposition comes from a study
(Steenbarger & Aderman, in press) that was 
discussed above in connection with the as- 
sumption that self-focus leads to withdrawal 
only when discrepancy reduction is prevented 
in some way. That same study also showed 
that subjects who had been led to believe that 
they could not reduce the discrepancy experi- 
enced more negative affect when self-focus 
was high than when it was lower. This effect 
did not occur among subjects who expected 
to be able to reduce the discrepancy. 

12(b). Engagement of the assessment pro- 
cess can also lead to positive affect, if assess- 
ment leads to the perception of a favorable 
outcome probability. This is comparable to 
Simonov’s (1970) argument that emotion be- 
comes positive when “there is a surplus of 
information available as compared with the 
information necessary to satisfy a sufficiently 
strong need” (p. 147). This is the elation ex- 
perienced, for example, when one knows that 
one knows how to arrive at the solution of a
problem, even though the problem has not
yet been solved.¹³

The assumption of two potential responses 
(behavioral and affective) to the assessment 
of outcome expectancy raises a question as to 
their relationship. The two may be parallel 
responses. For example, both the cessation of 
behavioral attempts and the experience of 
negative affect may be cued by a recognition 
that nothing can be done to improve one’s 
present state. This would be similar in some 
respects to Leventhal’s (1970) parallel re-
sponse model of reactions to fear communica-
tions. An alternative possibility is that be-
havioral withdrawal may occur as a response
to the existence of negative affect. It is not
possible to specify the relationship unequi-
vocally at this time. It seems clear, however,
that environmental constraints on potential 
behavior will have a large role in determining 
which of the two response modes predom- 
inates, particularly in cases of unfavorable 
outcome expectancy. That is, when the en- 
vironment permits withdrawal, withdrawal 
should predominate. When the environment 
does not permit withdrawal, negative affect 
should predominate. 

13. Affect that results from assessment of
outcome expectancy can take one of two 
forms, depending on the cognitions associated 
with the affect (see Figure 3). What cogni- 
tions are linked to negative affect will be de- 
termined by the person’s perception of the 
locus of the barrier to behavior alteration. 
If the person has a low efficacy expectancy—
that is, if he or she perceives that behavior is
being impeded by a self-deficit—the affect
will be associated with the self. It may be re-
flected in such ways as reduced self-esteem. If
the person has high efficacy expectancy and 
perceives that his or her behavior is impeded 
by some environmental constraint, the affect 
will be associated with those constraining ele- 
ments. It may be reflected by such things as 
expressions of resentment.¹⁴ Similarly, what
cognitions are associated with positive affect 
will be determined by the perceived locus of 
the cause of favorable outcome expectancy. 
The affect will be associated with the self if 
personal efficacy is seen as having caused the 
good outcome to be likely; it will be asso- 
ciated with the environment if the positive 
outcome expectancy is ascribed to a benign 
environment. Although the logic behind this 
proposition is relatively straightforward, these 
predictions remain to be tested. 
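
Propositions 12 and 13 can be read together as a small decision table. The following Python sketch is only an illustration of that reading. The names `Response`, `assess`, `cause_in_self`, and `can_withdraw` are hypothetical labels introduced here, not terms from the model, and the all-or-none booleans simplify what the text treats as matters of degree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    affect: str        # "positive" or "negative"
    locus: str         # what the affect is associated with: "self" or "environment"
    predominant: str   # response mode expected to predominate


def assess(outcome_favorable: bool, cause_in_self: bool,
           can_withdraw: bool = True) -> Response:
    """Illustrative mapping from outcome assessment to affect and behavior.

    cause_in_self: for an unfavorable expectancy, True stands for low
    efficacy expectancy (a perceived self-deficit); for a favorable one,
    True means personal efficacy is seen as the cause of the good outlook.
    """
    locus = "self" if cause_in_self else "environment"
    if outcome_favorable:
        # Proposition 12(b): favorable expectancy yields positive affect.
        return Response("positive", locus, "discrepancy reduction")
    # Proposition 12(a): unfavorable expectancy yields negative affect;
    # environmental constraints decide which response mode predominates.
    predominant = "withdrawal" if can_withdraw else "negative affect"
    return Response("negative", locus, predominant)
```

For instance, `assess(False, cause_in_self=False)` describes a person blocked by an external constraint who is free to leave: withdrawal predominates, and whatever negative affect occurs (e.g., resentment) attaches to the environment.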


Comparisons With Self-Awareness Theory 


At least three questions deserve further con- 
sideration with regard to differences between 
the present model and self-awareness theory 
as proposed by Duval and Wicklund (1972) 


13 This analysis may help to account for the ob- 
servation that the anticipation of a pleasant event 
often seems more pleasurable than the event itself, 
as if knowing that the outcome will occur is better
than its occurrence. Indeed, this would be consistent 
with the frequently expressed observation among 
psychologists that “the fun is over” once an idea has 
been conceived and a study to test it is designed. 

14 It may be this variable that distinguishes, for 
example, between depression attributed by a client 
to environmental contingencies and depression at-
tributed to inadequacies of the self. That is, insuf-
ficiently positive outcome expectancies lead to de-
pressed affect, but whether cognitions of self or
environment are associated with the negative affect
may vary with efficacy expectancies. 
and amended by Wicklund (1975a). They are
the following: (a) Under what conditions is 
self-attention aversive? (b) What instigates 
or energizes the matching-to-standard pro- 
cess? (c) Finally, what is the prepotent re- 
sponse to self-focus when a behavioral stan- 
dard is salient—discrepancy reduction or 
withdrawal? These issues will be discussed in 
the following three sections. 


Aversiveness of Self-Focus 


One important area of conflict between the 
present model and that of Duval and Wick-
lund concerns whether negative affect is an
inevitable consequence of self-attention, and 
if not, what other conditions are necessary 
for it to occur. Duval and Wicklund (1972) 
and Wicklund (1975a, 1975b) have argued
that any time a discrepancy exists such that 
one’s present state or behavior is worse than 
the standard of comparison, self-attention is 
aversive.¹⁵ “It is assumed that the objectively
self-aware person will not simply react to him- 
self impartially and in a neutral manner, but 
that he will come to evaluate himself as soon 
as the objective state occurs” (Duval & Wick- 
lund, 1972, p. 3). “Certainly objective self- 
awareness is postulated to be an aversive, 
motivational state” (Wicklund, 1975b, p. 80). 
Indeed, as was stated at the beginning of the 
present article, the postulated aversiveness of
self-attention was that theory’s crux. In con-
trast, it has been proposed here that self-
directed attention leads to negative affect
only when the person perceives that he or she
cannot alter his or her present state in the
direction of the standard. (This discussion
will be framed in terms of discrepancy reduc- 
tion, but it is of course applicable to discrep- 
ancy enlargement as well.) 

It is instructive to note that the studies
cited as evidence of the unpleasantness of 
self-focus were invariably cases in which a 
within-subjects discrepancy existed and in 
which subjects were prevented from altering 
their present state or behavior. For example,
in the first demonstration that subjects would
attempt to escape from self-focus (Duval et 
al., 1972), the subject’s behavior on the rele- 
vant dimension was anchored in the past and 
thus could not be altered in the direction of
the standard. Similarly, though self-esteem
has been found to be reduced when subjects
were made self-aware (Ickes, Wicklund, &
Ferris, 1973), it is unclear to what extent
that finding depended on the nature of the
study’s dependent measure. The dependent
measure was a comparison between subjects’
self-rated real selves and ideal selves on
several dimensions. Subjects in that research
had no way to alter their real selves, but
were simply reminded of the real-ideal dis-
crepancy to a greater degree when self-aware-
ness was heightened than when it was not.
Aside from these examples in which sub-
jects’ behavior or characteristics were un-
changeable, there appears to be no evidence
that self-attention itself is phenomenologically
aversive. To the contrary, Steenbarger and
Aderman (in press) found an increase in
negative affect only when self-focus was com-
bined with a nonreducible discrepancy; when
the discrepancy was flexible, the opposite
tendency occurred. Similarly, Carver and
Scheier (1978), in demonstrating that the
presence of either a mirror or an audience
increases self-focus, were unable to find any
evidence that self-attention led to negative
affect. A similar conclusion has also been
reached by other researchers (Davis & Brock,
1975; Hull & Levy, 1979).

Affect and valuation. There is only a
restricted sense in which the present model
implies the existence of something resembling
negative affect whenever self-attention is di-
rected to a behavior-standard comparison. It
may be useful in this regard to draw a dis-
tinction between the phenomenological ex-
perience of affect and the function of evalua-
tion (cf. Diggory, 1966, p. 70). The [event]
that initiates the matching-to-standard se-
quence according to the present model is the
fact that the test of the TOTE unit [ascertains]


15 Wicklund (1975a) has also pointed out that the
existence of a positive discrepancy (i.e., self better
than the standard) is a different case, leading in his
view to positive affect. The present discussion, how-
ever, will be limited to instances of negative dis-
crepancy, as has implicitly been the case throughout
the article.




the existence of a discrepancy between a pres- 
ent state and an accepted goal state. Inas- 
much as the function of the behavior change 
(operate) is to reduce the discrepancy, it is 
possible to construe the recognition of the dis- 
crepancy as being (on some level) a nega- 
tively evaluated event. However, there seems 
to be no reason to assume a concomitant 
phenomenological experience of negative emo- 
tion. As was noted in the preceding para- 
graphs, however, there is substantial reason to 
believe that negative affect occurs in cases 
when one is prevented from reducing those 
discrepancies (Duval et al., 1972; Steenbarger 
& Aderman, in press). Thus the data seem 
consistent with the assumption in the present 
model that self-focus has phenomenologically 
affective consequences only following outcome 
assessment.

To the degree that negative affect is experi- 
enced in contexts in which behavior change is 
(objectively) not difficult, it may be seen as a 
variation of the inability-to-alter experience. 
For example, a workman who repeatedly 
reaches to the wrong spot for tools may feel 
negative affect, not because of failure to reach 
the tool, but because the inability to correct 
the initial reaching impulse is beginning to 
alter his perceptions of self-efficacy with re- 
gard to tool handling. Similarly, a man at a 
formal dinner who suddenly realizes that he 
has been eating his salad with the wrong fork 
can easily alter his behavior. Any negative 
affect he experiences is caused by concern that 
his behavior may be regarded by others as 
uncultured and the possibility that he will not 
be able to alter the first impression that he 
has thus created. Finally, the perception of 
the need for a discrepancy reduction that will 
take an extraordinarily long time can lead to 
negative affect. However, this should occur 
only if there are salient doubts as to the ulti- 
mate outcome. If there is a positive outcome 
expectancy—hope, in Stotland’s (1969) terms 
—positive affect should ensue, even if many 
months of behavior remain between the person 
and his or her goal. 


Instigation of Matching-to-Standard 


There is a second important distinction be- 
tween the present model and Duval and 
Wicklund’s (1972) theory that is intimately
bound to the question of whether self-focus is 
aversive. This second distinction concerns the 
question of how behavior change is instigated 
and energized when self-focus occurs and a 
behavioral standard is salient. As was men- 
tioned above, Duval and Wicklund (1972) 
devised their model from a drive-theory per- 
spective. By postulating that self-focus is 
aversive any time a within-self discrepancy 
exists, those authors could assume that subse- 
quent behavior change was energized by a 
drive state and was instigated by that drive 
state’s aversiveness. 

In the present model, in contrast, the 
matching-to-standard sequence that is en- 
gaged when self-focus occurs in the presence 
of a salient behavioral standard is seen as the 
realization within a psychological system of a 
negative feedback loop. The fact that self- 
attention is required in order to engage the 
matching-to-standard sequence is interpreted 
in control-theory terms in the following way. 
Self-focus in a standard-salient context repre- 
sents the test phase of the TOTE unit: an 
assessment of whether a discrepancy exists. 
The behavioral response to self-focus (the 
alteration of behavior in the direction of the 
standard) is the operate phase. The ordering 
of the control sequence of the feedback loop 
dictates that operate cannot take place before 
test reveals a discrepancy. Thus, focus on a 
discrepancy is required to engage the TOTE 
sequence. In phenomenological terms, this 
awareness simply corresponds to the realiza- 
tion that one is in a context in which there is 
a standard to match with one’s behavior. A 
change in behavior depends on that realiza-
tion.
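
The test-operate-test-exit ordering described above can be sketched as a simple loop. This is a minimal illustration in Python, not a formalization offered by the model: the function names `tote` and `halve_gap`, the numeric representation of state and standard, and the `tolerance` and `max_cycles` parameters are all assumptions made for the example.

```python
from typing import Callable, Tuple


def tote(state: float, standard: float,
         operate: Callable[[float, float], float],
         tolerance: float = 0.0,
         max_cycles: int = 100) -> Tuple[float, int]:
    """Run a test-operate-test-exit sequence; return (final state, operate count)."""
    cycles = 0
    # Test: compare the present state with the standard.
    while abs(standard - state) > tolerance and cycles < max_cycles:
        # Operate: runs only after a test has revealed a discrepancy.
        state = operate(state, standard)
        cycles += 1
        # Control returns to the test at the top of the loop.
    # Exit: the test no longer reveals a discrepancy
    # (or the hypothetical cycle budget is exhausted).
    return state, cycles


def halve_gap(state: float, standard: float) -> float:
    """Hypothetical operate step: move halfway toward the standard."""
    return state + 0.5 * (standard - state)
```

Note that `operate` is never called when the first test finds no discrepancy, which mirrors the claim that focus on a discrepancy is required to engage the sequence. Nothing in the loop refers to a drive state; the comparison alone triggers the behavior change.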

Inasmuch as both models assume a com- 
parison process between self and standard, the 
critical difference between them in this con- 
text seems to be the presence or absence of a 
drive postulate. This issue introduces con- 
siderable theoretical complexity, stemming 
from two facts: first, that a great many vari- 
ants of drive as a construct have been pro- 
posed over the years, and second, that there 
is far from universal agreement on what mea- 
surable state, if any, corresponds to a theo- 
retical state of heightened drive. A completely 
adequate treatment of this complexity is well
beyond the scope of the present article. The 
interested reader is referred to Appley’s 
(1970) review article for a more detailed 
critique of drive and arousal constructs. Con-
sideration of the issues here will be much
more limited. 

Does self-attention heighten drive, or does 
it not? One might at first view this as an em- 
pirical question, although it will be argued 
below that this appearance may be illusory. 
However, let us consider the existing evidence 
on the relationship between self-focus and 
drive. Although agreement on the nature of 
drive is often difficult to obtain, as was noted 
above, many researchers have assumed that 
drive is either equivalent to, or reflected by, 
autonomic arousal.¹⁶ To the limited degree
that evidence on this point is available, self-
attention appears not to lead to an increase
in physiological arousal. If one were willing 
to equate such arousal increases with height- 
ened drive, it would thus appear that self- 
focus is not drive inducing. Evidence on this 
issue comes from two sources. 

Salience of nonarousal. One set of data 
comes from a study (Gibbons et al., 1979) 
that was discussed earlier as evidence that 
self-attention when no standard is salient 
merely makes one more conscious of one’s 
salient characteristics. Subjects in that study 
were led to anticipate that arousal symptoms 
would occur as a function of a pill (actually a 
placebo). Subjects later reported the extent of 
their symptoms, with self-attention heightened 
or not. If self-directed attention per se were 
arousal inducing, greater arousal should have 
been experienced, and thus reported, by sub- 
jects in the high self-awareness condition than 
by control subjects. Instead, the opposite oc-
curred. Self-attentive subjects apparently be-
came more conscious of the absence of arousal
(consistent with the present informational
analysis) and consequently reported fewer
symptoms.

One difficulty with applying this finding to 
the drive question, however, is the fact that in 
the early statement of their theory, Duval and 
Wicklund (1972) largely ignored the pos- 
sibility that a standard of comparison might 
not always be present. Wicklund (1975a) 
later concluded that self-focus was aversive
only if one’s present state was worse than a 
salient standard of comparison. However, he 
argued at the same time that one was almost 
always worse than some standard and that one 
would ultimately become conscious of that 
discrepancy. Nevertheless, if self-attention 
should be drive inducing only when a discrep- 
ancy exists between self and standard, and if 
it were possible to assume a case within Duval 
and Wicklund’s framework in which no stan- 
dard is salient, then those authors might thus 
anticipate that self-focus would not increase 
drive in the Gibbons et al. (1979) case. 

Physiological data. A second set of data on 
the arousal question does exist, however 
(Paulus, Annis, & Risner, 1978). Moreover, 
this finding is not liable to the caveat that was 
applied above to the Gibbons et al. finding.
Paulus et al. gave subjects a task (copying 
prose) with a clear standard (copying 
quickly). As subjects prepared to attempt the 
task, either before a self-focusing stimulus or 
with no such stimulus present, the experi- 
menter took a physiological measure called the 
palmar sweat index (Johnson & Dabbs, 1967).
Increased palmar sweat values have com- 
monly been believed to reflect increased 
arousal (cf. Geen & Gange, 1977; Martens, 
1969a, 1969b). However, in the Paulus et al. 
study, lower values on this index were found 
among more self-aware than among less self- 
aware subjects. The meaning of the palmar 
sweat index is open to considerable question
on other grounds (see the following para- 
graphs). But if that measure were acceptable 
as an index of drive, the finding of Paulus et 
al. would seem to contradict Duval and Wick- 
lund’s assumption that self-attention is drive 
increasing.¹⁷


16 Though it certainly oversimplifies the case to
imply that all theories utilizing a drive construct
contain this assumption, it is probably fair to say
that it is implicit in the large majority of drive
models in social psychology.

17 The data of Paulus et al. were derived in the
context of a social facilitation paradigm. The
present theoretical model has rather extensive im-
plications for social facilitation phenomena, but
consideration of those implications is beyond the
scope of this article. These implications are, how-

It was noted above that the arousal or drive
issue is not entirely an empirical one. There 
are also two important logical considerations 
that render the entire arousal controversy 
somewhat less meaningful. 

The meaning of physiological indices. 
First, contemporary researchers in physiolog- 
ical psychology tend to recoil at the mere 
suggestion that autonomic arousal is a unitary 
phenomenon. Such psychologists are inclined 
to look at autonomic indicants in terms of 
what specific functions they imply rather than 
whether the organism is aroused. For example, 
Sokolov (1963) has proposed that one pattern 
of physiological responses comprises an “ori-
enting” response, whereas another pattern
comprises a “defensive” response. The orient- 
ing response is held to facilitate greater 
knowledge of an external stimulus by making 
the organism more receptive to input informa-
tion from that stimulus. The defensive re-
sponse, in contrast, involves an inhibition of
sensitivity to input information (cf. also Hare 
& Blevings, 1975). This latter function pre-
sumably protects the organism from over-
stimulation.

This argument is easily taken one step far- 
ther, by assuming that such function-specific 
physiological response patterning is a normal 
concomitant of different kinds of consciously 
controlled behavioral involvement. Williams, 
Bittker, Buchsbaum, and Wynne (1975) have 
discussed this possibility in some detail (see 
also Lacey, 1967). They reasoned that experi- 
mental tasks vary in the types of attentional 
demands they make (as opposed to the degree 
of demand). Some tasks (e.g., a visual decod-
ing task) require sensory intake; others (e.g.,
mental arithmetic) require sensory rejection. 
Conceptually, the sensory intake pattern 
seems similar to Sokolov’s (1963) orienting 
response; the sensory rejection pattern, 
though not particularly “defensive,” seems to 
embody many functional components of Soko-
lov’s defensive reaction, in that there is a


ever, taken up elsewhere (Carver & Scheier, Note 2).
That paper also presents further data on the [role]
of drive in social facilitation and self-awareness
phenomena.

suppression of sensory input. In the Williams
et al. analysis, however, this pattern simply 
reflects the attempt to concentrate focus in- 
ward and prevent distraction. In a test of this 
reasoning, Williams et al. (1975) were able to 
show that subjects differed as predicted in car- 
diovascular response patterns as a function 
of the type of task being performed. 

It is instructive to note, in this regard, that 
initial theorizing about the palmar sweat in- 
dex (Dabbs, Johnson, & Leventhal, 1968) 
concerned attentional issues fully as much as 
it concerned arousal. Dabbs et al. (1968) 
argued that increased palmar sweat was asso- 
ciated with readiness to engage the environ- 
ment, and that decreased values reflected self- 
directed attention and an attempt to concen- 
trate. These descriptions seem a good deal like 
the sensory intake and sensory rejection pat- 
terns of Williams et al. (1975). Taken to- 
gether, these two descriptions converge on two 
notions, one of them general and the other 
more specific. The general one is that at least 
some physiological indices may be more use- 
fully construed as providing information 
about attentional and information-processing 
phenomena than as providing information 
about arousal. The more specific one is this: 
that the finding of decreased palmar sweat in 
the presence of a self-attention-inducing stim- 
ulus (Paulus et al., 1978) may most reason- 
ably be interpreted in attentional terms, as 
reflecting inward focus of attention and the 
simultaneous suppression of environmental in-
put.

Cybernetics and the energizing of behavior.
In addition to constraints imposed by the 
complexity of the meaning of autonomic re- 
sponses, another issue of logic should be ad- 
dressed. This issue concerns the nature of 
cybernetic theory. 

Among persons accustomed to think of be- 
havior as being “energized” and thus impelled 
by some kind of drive, the idea that a control- 
theory model of motivation is truly adequate 
to explain variations in the intensity of be- 
havior will seem difficult to accept. This is 
true even though a number of motivational 
psychologists (e.g., Guilford, 1965; Hunt,
1965; Taylor, 1960; Vickers, 1973) have for 
a number of years characterized the TOTE 
unit as the most promising model of motives
leading to intentional activity. 

The sticking point for many people seems 
to be the obvious fact that it takes energy to 
do work. Thus a drive model would seem to
be preferable to a control-theory or a cyber- 
netic one. But this view fails to recognize the 
fact that physiological changes that occur in 
the course of executing a behavior are in no 
way incompatible with cybernetic assump- 
tions. Cybernetic theory would simply relegate 
the observed physiological change to a sec- 
ondary causal role. As Guilford (1965) has 
noted, “In automated devices, as in living 
organisms, there is a clear distinction between 
two functions of energy. There is the energy 
that carries out the various operations, where 
work must be done, and there is the energy 
that has control functions, where triggering 
and release of much greater quantities of 
energy are involved” (p. 320). Cyberneticists 
would tend to emphasize the importance of 
the energy that has control functions, even 
though it is the work-doing energy that is 
most readily observed in terms of changes in 
physiological state. Thus Wiener argued long 
ago (1948) that the then-current focus on 
arousal was misguided, that the body is in 
fact very far from being a conservative system 
with limited energy available. He argued that 
the appropriate type of “bookkeeping” for 
events in the nervous system is not one that 
is energy based. Instead, the critical notions 
were those of message, quantity of informa- 
tion, coding technique, and so on. 

It should be repeated, however, that the 
fact that physical work systems are subsidiary 
components of the body’s control system has 
this important implication: Covariation of 
physiological change with behavioral change 
in no way constitutes a threat to a control- 
theory analysis of behavior regulation. In such 
an analysis, a change in physiological activity 
leads merely to the question of what specific 
functions are being performed, rather than 
the more misleading question of whether 
arousal exists. Indeed, for this reason it is 
arguable (see also Carver & Scheier, Note 2) 
that control theory provides a more subtle 
understanding of physiological events than 
does drive theory. 


Prepotent Response Tendency


There is a third distinction between the 
Duval and Wicklund (1972) model and the 
present model that should also be addressed.
It bears on the following question: In a con-
text in which a behavioral standard is salient,
which response is prepotent when self-focus 
occurs: matching-to-standard or withdrawal? 
In the model proposed here, the matching-to- 
standard sequence is prepotent, as part of the 
human being’s normal self-regulatory system.
Withdrawal occurs only after an outcome 
assessment and only if that assessment yields 
a negative expectancy. In contrast, although 
Duval and Wicklund (1972) originally felt 
that it was not possible to know which of 
these two modes would be preferred, Wick- 
lund later argued for an ordering opposite to 
that assumed in the present model. “Certainly 
a successful averting of self-focused attention 
would eliminate the negative affect, however 
temporarily; thus an individual’s immediate 
reaction to objective self-awareness should be 
an avoidance of self-focusing stimuli and/or 
efforts to find distractions” (Wicklund, 1975a, 
p. 236). Thus, according to Wicklund, dis- 
crepancy reduction occurs only when self- 
focus cannot be avoided. 

Results of two recent studies (Carver, 
Blaney, & Scheier, in press; McDonald, in 
press) suggest that the present analysis is the 
more accurate of the two. In McDonald’s re- 
search, subjects in whom a large negative 
within-self discrepancy had been made salient 
were more persistent on a subsequent task 
(for which persistence had been defined as 
appropriate) when self-focus was increased 
than when it was not. Moreover, there was 
also evidence that this effect was exaggerated 
by instructions that the second task was 
relevant to the dimension on which the dis- 
crepancy had been created. The possibility 
that subjects had simply been distracting 
themselves from self-focus by burying them-
selves in their task was considered and re-
jected by McDonald, because of the fact that
a much simpler and quicker means of avoiding
self-focus was readily available but was not
used, that is, announcing “I’m finished” and 
leaving the context. Thus McDonald con-
cluded that discrepancy reduction is prepotent
under self-focus when a means of reducing the 
discrepancy is available. 

The Carver et al. (in press) research was 
discussed earlier in this article, in the section 
on outcome expectancy. Some subjects in that 
research were led to have favorable expec- 
tancies of being able to reduce a large within- 
self discrepancy, others were led to have un- 
favorable expectancies. Subjects with favor- 
able expectancies were more persistent when 
self-attention was high than when it was low, 
consistent with the McDonald (in press) re-
sults. Carver et al. rejected the possibility
that this persistence represented attempted 
distraction for the same reason as had Mc- 
Donald, that is, that there was an easier way 
to avoid self-focus readily available. Indeed, 
this point is further emphasized in the Carver 
et al. research by the fact that one group of
subjects did utilize that way of avoiding self- 
focus—specifically, subjects with negative out- 
come expectancies. Thus those results appear 
to be unequivocal in supporting the present 
theory. 


Other Theoretical Comparisons 


The model proposed in this article also 
bears certain similarities to models in social 
psychology other than self-awareness theory. 
Three of the similarities are striking enough 
that they deserve some separate attention. 
The first of these concerns Bandura’s recent 
(1977) analysis of fear-based avoidance be- 
havior; the second concerns recent analyses 
of human helplessness (Abramson et al., 
1978; Wortman & Brehm, 1975); the third 
concerns social comparison theory (Festinger, 
1950, 1954). These similarities will be dis- 
cussed in the following sections. 


Self-Efficacy and Avoidance 


The fact that the approach-withdrawal de- 
cision process postulated here has been framed 
in terms of outcome and efficacy expectancies 
suggests that some commonality may exist be- 
tween the present model and Bandura’s recent 
analysis of the role of self-efficacy expec- 
tancies in avoidance behavior (Bandura, 
1977; Bandura, Adams, & Beyer, 1977). Ban-
dura’s analysis and the present one do share
some common ground, but they also differ in 
important respects. 

Terminology. The first difference is one of 
terminology (see Figure 4). Bandura used the 
term outcome expectancy to refer to “a per- 
son’s estimate that a given behavior will lead 
to certain outcomes” (Bandura, 1977, p. 
193). His primary concern was in distinguish- 
ing that belief from the expectancy that one 
can execute the behavior. This he labeled 
efficacy expectancy. Clinically, such a distinc- 
tion is quite an important one. In many cases 
people know what behavior a situation re- 
quires but do not believe themselves capable 
of doing the behavior. Because efficacy expec- 
tancy was most clinically relevant and thus 
most important to Bandura, he devoted rela- 
tively little attention to defining and discuss- 
ing outcome expectancy. Implicitly, however, 
outcome expectancy in Bandura’s terms is 
based only on the presence or absence of 
knowledge about the normal consequences of 
a behavior. In the present model, in contrast,
the term outcome expectancy is used to denote 
expectancy about an event’s likelihood of oc- 
currence. This usage quite explicitly includes 
a variety of inputs: knowledge of the normal 
consequences of task-appropriate behavior, 
constraints imposed by external forces or by 
the passage of time, and efficacy expectancy— 
one’s judgment about whether or not one can 
execute the requisite behavior (Figure 4). 

In the present model, therefore, outcome 
expectancy is the direct determinant of the 
person’s subsequent behavior and affect 
(Propositions 11 and 12). Efficacy expectancy 
is merely one input into outcome expectancy, 
but one determining what cognitions are asso- 
ciated with affect experienced as a function of 
that outcome expectancy. In Bandura’s model, 
in contrast, the person’s efficacy judgment 
is the direct determinant of his or her subse- 
quent behavior. Outcome expectancy can be 
seen as a partial—though often trivial—de- 
terminant of efficacy expectancy in Bandura’s 
model (Figure 4) in that one cannot feel 
capable of doing a behavior unless one knows 
the behavior and what its ordinary conse- 
quences are.


Figure 4. Relationships among outcome expectancy,
efficacy expectancy, and behavior, as conceptualized
by Bandura (1977) and Carver,


The models are in somewhat closer agree-
ment with regard to the nature and determi-
nants of efficacy expectancy. Bandura con-
ceived of efficacy expectancy as being a
product of four classes of information: prior
performance accomplishments, vicarious ex-
periences, verbal instructions, and emotional
arousal. An additional set of relevant infor-
mation would seem to be the degree to which
one perceives that one can separate a behavior
into components, each of which may be man-
ageable by itself. These various sources of in-
formation (involving both previously encoded
stimuli and presently experienced stimuli) are
processed together, yielding a judgment of
efficacy expectancy.

Role of emotional arousal. Even with re-
gard to the determinants of efficacy expec-
tancy, however, there is a point of conflict be-
tween the models: specifically, the role that is
assumed to be played by the perception of
emotional arousal. As was noted in the pre-
ceding paragraph, Bandura (1977) held that
fear arousal was one kind of input into effi-


tional contingencies (Bandura et al., 1977),
there still appears to be a difference between
Bandura’s position and the position taken in
the model proposed here. The present model
holds that the role of perceptions of fear 
arousal in an approach attempt (initially, at 
least) is to cue the self-assessment process 
(see Figure 5). This assessment results in an 
outcome expectancy judgment, which for fear- 
based behavior is largely determined by effi- 
cacy expectancy (which derives, in turn, from 
the other informational sources enumerated 
by Bandura, 1977). This expectancy deter- 
mines whether the person subsequently with- 
draws or returns to the approach attempt. 
This analysis has received empirical sup-
port in three studies. Carver and Blaney
(1977a, Experiment 3; 1977b) showed that
persons who differed only in their chronic
efficacy expectancies responded to false feed-
back of arousal in two quite different ways.
Confident subjects focused on their approach
attempt and were not debilitated behavior-
ally; doubtful subjects avoided focusing on
the approach attempt and withdrew more
quickly, compared to doubtful subjects given
feedback of nonarousal. This interactive influ-
ence of fear perceptions on focus of attention
and behavior (a set of findings that has sub-
sequently been conceptually replicated using
a self-awareness manipulation to heighten
awareness of veridical fear; Carver et al.,
1979) does not seem easy to predict from
Bandura’s assumptions. The findings are, how-
ever, clearly predicted from the present model
of self-attention processes.

Generality. A final useful comparison be- 
tween the theories concerns their relative 
generality. Bandura’s (1977) analysis was 
aimed quite explicitly at fear-related behavior
and the therapeutic change of such behavior. 
In applying the two models to situations in 
which the likelihood of a positive outcome is 
entirely dependent on intrapersonal factors, 
which is the case in fear-based behavior, the 
two are functionally equivalent. This is true 
because outcome and efficacy expectancies as 
defined in the present model are perfectly con- 
founded with each other in this particular 
class of situations. A consideration of broader
contexts reveals, however, that the present
model is more general in two respects than is 
Bandura’s. First, because outcome expectancy 
is not held to be determined solely by knowl- 
edge of the task-appropriate behavior, the 
present model can generate predictions for
two classes of behavioral interruptions not
considered by Bandura: incomplete or inade-
quate behavior that occurred in the past and
behavior constrained by environmental factors
(see Figure 3). Secondly, since efficacy ex-
pectancy is held to exert an influence on be-
havior that is independent of outcome expec-
tancy, the present model explicitly predicts
that different affective experiences will be
associated with similar behavioral events,
based on differences in efficacy expectancy.
That is, Proposition 13 above holds that if
there is negative outcome expectancy, the
presence of high efficacy perceptions predicts
that affect will occur as resentment toward the
environment; presence of low efficacy percep-
tions predicts negative affect associated with
the self. If outcome expectancy is favorable,
low efficacy perceptions predict that positive
affect will be associated with the environment;
high efficacy perceptions predict that the
affect will be associated with the self. These
predictions, which are now being tested, do
not seem to be derivable from Bandura’s
(1977) theory.

Figure 5. Information-processing sequence following
self-attention in a fear-provoking behavioral
context. (This sequence represents a specialized
application of the model of behavioral regulation
that was presented in Figure 3.)
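
As an illustration only, the four affect predictions that the text attributes to Proposition 13 can be laid out as a lookup table. The Python sketch below is not part of the original model. The names PROPOSITION_13 and predicted_affect are hypothetical, and the labels simply restate the wording of the paragraph on outcome and efficacy expectancy.

```python
from typing import Dict, Tuple

# Keys: (outcome expectancy is favorable?, efficacy perception).
# Values: (valence of affect, what the affect is associated with).
# The entries paraphrase the text; they are not measured results.
PROPOSITION_13: Dict[Tuple[bool, str], Tuple[str, str]] = {
    (False, "high"): ("negative", "environment"),  # resentment toward the environment
    (False, "low"): ("negative", "self"),
    (True, "low"): ("positive", "environment"),
    (True, "high"): ("positive", "self"),
}


def predicted_affect(outcome_favorable: bool, efficacy: str) -> Tuple[str, str]:
    """Return (valence, locus) for one combination of the two expectancies."""
    if efficacy not in ("high", "low"):
        raise ValueError("efficacy must be 'high' or 'low'")
    return PROPOSITION_13[(outcome_favorable, efficacy)]


if __name__ == "__main__":
    for key in sorted(PROPOSITION_13):
        print(key, "->", predicted_affect(*key))
```

Laid out this way, the table makes the claimed difference from Bandura's account easy to see: outcome expectancy fixes the valence of the affect, whereas efficacy expectancy only shifts whether it is associated with the self or with the environment.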


Helplessness Theory 


The present theoretical model also appears 
to have some implications for the area of re- 
search and theory known as “learned helpless- 
ness.” Helplessness is a performance deficit 
that reflects presumed motivational, cognitive, 
or learning deficits resulting from fairly ex- 
tended exposure to uncontrollable outcomes. 
There is at least one important similarity be- 
tween aspects of the present model and the 
kind of cognitive analysis developed in recent 
years to account for helplessness effects among 
humans (e.g., Abramson et al., 1978; Wort- 
man & Brehm, 1975). Abramson et al. 
(1978), for example, have proposed that the 
impact of uncontrollability on subsequent per- 
formance depends upon the development of an 
expectation of future noncontingency. Sim- 
ilarly, Wortman and Brehm (1975) have 
argued that helplessness occurs when the per- 
son has an expectancy of no control for a task 
being undertaken. Both of these characteriza- 
tions are quite similar to what has been 
termed in the present article an unfavorable 
outcome expectancy regarding the subsequent
task. 

Abramson et al. held that this expectancy is 
influenced largely by attributions concerning 
the reason for the initial failure. In fact, this
attributional aspect of their model has re- 
ceived by far the greatest amount of attention 
among researchers in that area. Despite this, 
however, it bears emphasizing that the direct 
determinant of performance in their theory, as 
in Wortman and Brehm’s, is the person’s ex- 
pectancy, not the attributions. If one has an 
unfavorable expectancy, regardless of the rea- 
son, helplessness should result. If one’s ex- 
pectancy is favorable, there should be no help- 
lessness. 

The central emphasis accorded to expec- 
tancy by these theorists is quite consistent 
with the model presented in this article. In-
deed, one study conducted to test the model
(Carver et al., in press, discussed in detail 
earlier) could be regarded quite easily as a
helplessness experiment. Subjects underwent
a failure pretreatment, and some of them later 
displayed reduced persistence, which is often 
taken as a sign of helplessness. In providing
support for the present model, that research 
thus also appears to offer support for other 
models of helplessness that rely on postulates 
about expectancies. 

However, that research also suggests two 
respects in which the present model adds to 
such theories of helplessness. The more ob-
vious contribution of this model stems from
the fact that no other analysis of helplessness 
effects includes any consideration of the role 
of self-directed attention in promoting such 
effects. As is demonstrated in the Carver et al. 
(in press) research, that role may be quite an
important one. 

The second contribution of the present 
theory is its suggestion that the impulse to 
withdraw is basic to a wide variety of help- 
lessness effects. The assumption of a with- 
drawal impulse is inherent in the present 
model. And it was overt withdrawal that was
displayed in the Carver et al. (in press) re-
search. In contrast, most studies in the help- 
lessness tradition do not explicitly allow sub- 
jects this behavioral option. As was suggested 
earlier in this article, however, when physical 
withdrawal is prevented, the result may be a
cognitive withdrawal—a mental dissociation 
from task attempts. This characterization 
seems consistent with typical helplessness ef- 
fects, which often reflect an apparent unwill-
ingness or inability to utilize task-relevant
cues. An interesting possibility that should be 
examined further is that all these effects may 
stem from a thwarted impulse to remove one-
self from the behavioral context.


Social Comparison Theory 


Earlier in this article it was noted that 
there is also considerable similarity between
some aspects of the present theory and social
comparison theory (Festinger, 1950, 1954).18
Social comparison theory can be viewed as
having three central assumptions. The first is
that we tend to define much of reality, espe-
cially our social reality, by comparing our
own opinions, reactions, characteristics, and
capabilities with those of other people, usually
termed reference groups. The second assump-
tion is that the tendency to do this is exag-
gerated by conditions of ambiguity. The third
assumption is that once consensus has been
reached and some normative value implicitly
established, then there is pressure toward
conformity to this value among the members
of the group.

18 Although this similarity exists for Duval and
Wicklund’s theory, as well as for the present model,
this discussion will focus on the present model.

Holding in abeyance the role of perceived 
ambiguity in the instigation of these pro- 
cesses, the elements of social comparison 
theory thus seem to fulfill two functions: first, 
to establish some value as a standard of com- 
parison (by means of an implicit social con- 
sensus), and second, to increase behavioral 
conformity to that standard. These two func- 
tions are, of course, precisely the same as 
those assumed in a control-theory model of 
behavioral self-regulation such as the one pro- 
posed here. That is, a control-theory approach 
to motivation assumes two kinds of informa- 
tion-processing systems. The first of these is 
the system that analyzes and categorizes per- 
ceptual input, yielding a behavioral standard. 
The second system—a TOTE unit—regulates 
behavior with regard to that standard. 
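
The regulating role of the second system can be illustrated with a short sketch. The Python code below is only an assumption-laden toy, not the author's formalization: it treats behavior and the standard as single numbers, and the names tote, halve_gap, tolerance, and max_steps are hypothetical. It shows the Test-Operate-Test-Exit cycle that gives the TOTE unit its name; the first system, which supplies the standard, is taken as given.

```python
def tote(current, standard, operate, tolerance=0.0, max_steps=100):
    """Run a Test-Operate-Test-Exit cycle and return (final state, steps used)."""
    steps = 0
    # Test: compare the present state with the standard.
    while abs(standard - current) > tolerance and steps < max_steps:
        # Operate: act so as to reduce the discrepancy, then test again.
        current = operate(current, standard)
        steps += 1
    # Exit: the discrepancy is within tolerance, or the step budget ran out.
    return current, steps


def halve_gap(current, standard):
    """Hypothetical operation: close half of the remaining discrepancy."""
    return current + 0.5 * (standard - current)


if __name__ == "__main__":
    final, steps = tote(current=0.0, standard=10.0, operate=halve_gap, tolerance=0.01)
    print(final, steps)
```

Because each pass reduces the discrepancy between behavior and standard, the loop is a negative feedback loop in the sense used in this article; the max_steps guard is an implementation convenience and has no counterpart in the theory.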

One way in which social comparison theory
is similar to the model proposed here concerns
the first stage of social comparison, in which
a standard is established by implicit social
consensus. This aspect of that model would
seem to represent one important way in which
a person can extract a behavioral standard
through categorization of his or her environ-
mental context (Propositions 2 and 3 of the
present model). This may indeed be one of the
most commonly used methods of choosing a
behavior among human adults. In principle,
however, it represents only a subset of a
larger class of potential ways to determine be-
havior. Thus, it would seem that this facet
of social comparison theory could be sub-
sumed under the more general process of
categorizing the nature of one’s context (a
process that certainly deserves much more
attention than it has received in the present
article).19

The second stage of the social comparison 
process, in which conformity pressure occurs, 
is also functionally similar to a major facet of 
the present model. The difference between the 
two is in the locus assumed for the impetus 
to conform. In the present model, the con- 
formity tendency (via the normal matching- 
to-standard sequence) is construed as inter- 
nally based. Discussions of social comparison 
theory, in contrast, often seem to assume the 
existence of external pressure to conform. 

Finally, it seems worthy of brief note that 
a subsidiary aspect of social comparison 
theory holds that if a person cannot conform 
to the group’s standard, there is a tendency 
for either the group or the person to withdraw 
from the other. This seems not unlike the 
present proposition that when one cannot 
match one’s behavior to the standard—that is, 
when outcome expectancy is unfavorable— 
there is an impulse to withdraw from the attempt.


Summary and Concluding Comment 


In the preceding pages, a cybernetic model 
of self-attention processes was presented, 
along with support for its propositions. Sum- 
marized briefly, self-focus in a context where 
no behavioral standard is salient leads to 
heightened cognizance of salient self-elements. 
Self-focus when a behavioral standard does 
exist leads to an automatic matching-to-stan- 
dard sequence. This sequence is regarded as 
the realization within a psychological system 
of a negative feedback loop. If matching-to- 
standard is interrupted, an outcome assess- 
ment ensues. A favorable outcome perception
leads to positive affect and/or to a return to
matching-to-standard. An unfavorable out-
come perception leads to negative affect and/or
behavioral withdrawal.

19 The assumption in social comparison theory that
ambiguity increases the tendency to undertake social
comparison suggests another possibility not pre-
viously addressed in this article. That is, it may be
a more general rule that ambiguity leads to increased
information search, even when the context is non-
social, in order to determine better what behavior is
appropriate. This possibility certainly should receive
more attention. If confirmed, it would further in-
crease the degree to which these two analyses over-
lap.

Following presentation of the model, three 
specific points of conflict between this theory 
and self-awareness theory were addressed. In 
each of those cases, evidence was reviewed 
that indicated the present analysis to be the 
more internally consistent and parsimonious of 
the two. Comparisons also were undertaken 
between the present model and three other 
theories: Bandura’s analysis of cognitive pro- 
cesses underlying fear-related behavior, help- 
lessness theory, and social comparison theory. 
Each of those comparisons revealed the pos- 
sibility for considerable integration between 
models.

There are many other potential links that 
could be developed between the present theory 
and information-processing ideas currently 
under investigation elsewhere in social and 
cognitive psychology. A fairly obvious ex- 
ample is the similarity between the proposi- 
tions of the present model that concern sa- 
lience of oneself versus the environment and 
research conducted on other kinds of salience 
phenomena (see, e.g., Taylor & Fiske, 1978). 
Although there simply is not the space to pur- 
sue those interrelationships here, the existence 
of such commonalities among theories suggests 
that the ideas on which the present model is
based have a good deal of integrative poten-
tial. Indeed, this seems apparent even if one
considers only the material discussed in this
paper. The model of self-attention processes
presented here has proven to be applicable to
a wide range of phenomena, topics as diverse
as taste and test anxiety, in a way that is in-
ternally consistent. It is to be hoped, more-
over, that this perspective will prove to be
useful in generating additional hypotheses
beyond those considered in the preceding
pages. Thus might the specific ideas advanced
here, and the concepts of control theory more
generally, come to be seen as important tools
for the analysis of a broad range of long-
standing problems in human behavior.


This experiment addressed issues in impression management. First, this study
investigated whether type of impression intended (accurate or fabricated) and
level of self-monitoring (high or low) affect the amount of information about
a target person that individuals would acquire, at some cost to themselves,
prior to interacting with that person. As predicted, high self-monitors planning
a fabrication purchased more information than high self-monitors planning an
accurate impression or low self-monitors planning either type of impression.
In addition, reactions of both participants to the actor’s performance were
analyzed. Impression type affected both actors’ and targets’ reactions, whereas
self-monitoring affected only the targets’ ratings.


Erving Goffman, in his seminal work on 
impression management (1959), has discussed 
how individuals present an impression of self 
that influences the definition of the situation 
that others come to formulate. It is in the 
various parties’ best interest to induce others 
to attribute qualities to them that are sup- 
portive of their own interaction goals. People 
can do this by expressing themselves in a 
manner that communicates impressions in- 
tended to lead others to support their plans
voluntarily. The current experiment focuses
on actors’ preparation for an upcoming en- 
counter and reactions to their performance.


Acquiring Information


Actors planning for an interaction can 
benefit from information about the other 
participants. They can use this information to 
help them tailor their impression to fit the 
particular characteristics of the encounter. In 
considering the problems of creating and sus- 
taining an impression, one should distinguish 
between impressions that the actor believes to
be accurate self-presentations and impressions 
that the actor knows are false (fabrications, 
in Goffman’s, 1959, terms). Information about 
others is helpful in conveying an accurate im-
pression (if only to ensure accuracy), but it is
even more important in conveying a fabrica- 
tion. Whereas an accurate impression can be 
generated from habit and previous experience, 
actors planning a fabrication must be wary of 
the possibility of being discredited. Behavior 
must be carefully monitored and, when neces- 
sary, accentuated or suppressed. Normal, 
habitual behavior that usually enhances an 
accurately descriptive performance could be-
tray deceiving actors. Fabricators must be
sensitive to elements of the encounter that
could threaten or enhance their performance.
In particular, information about the other 
participants in the interaction can indicate 
the kind of performance most likely to suc- 
ceed. 


DECEPTION, SELF-MONITORING, AND SELF-PRESENTATION 


There is evidence that individuals differ in
the importance they attach to information
about other participants. Snyder’s (1974,
1978, in press) work on self-monitoring ten-
dencies indicates that some individuals show
greater attentiveness to external cues (i.e.,
information from the environment in which
an interaction occurs) when judging the ap-
propriateness of an impression and the effec-
tiveness of a performance. Labeled high self-
monitors, they are concerned with the situa-
tional appropriateness of their behavior
during a social encounter. As a result, they
show greater sensitivity to the characteristics
of others involved in the interaction, whether
revealed by self-presentation or by inad-
vertence,


tation. 

It follows that high self-monitors should be
eager to acquire information about other par-
ticipants in the interaction. This should be
especially true if they plan to convey a fabri-
cation. The challenge of having to contradict
their own attitudes or beliefs without reveal-
ing the contradiction makes them more care-
ful than usual about their performance. This
greater care manifests itself in part in an
increased interest in information about other
participants. Actors can use relevant informa-
tion about others to plan their fabrication
more precisely and to increase the likelihood
of success.

On the other hand, individuals attentive to
internal cues (low self-monitors) should feel
little need for information about other par- 
ticipants, regardless of the type of impression 
to be conveyed. Their behavior is guided by 
their own beliefs about what is necessary to 
convey the intended impression, be it accurate
or fabricated. Although they may feel in-
creased concern in planning a fabrication, it is
manifested in a more thorough introspection 
for guides to action, not in a need for informa- 
tion about others. 

Prior research on the acquisition of infor-
mation about other participants in an upcom-
ing interaction is sparse. Snyder (1974) found 
that, compared to low self-monitoring sub- 
jects, high self-monitors spent more time look- 
ing at a sheet of majority responses to a per- 
sonality inventory that they were filling out 
while preparing for an upcoming discussion. 
This information was obtainable merely by 
turning one’s head to see the majority re- 
sponse sheet taped to the wall near the sub- 
jects. 

Note that this activity involved no real 
effort on the subjects’ part, because the in- 
formation was readily available. Information 
more critical to the actor may be more difficult 
to obtain and may involve some cost. Two 
studies (Eiser & Eiser, 1976; Eiser & Tajfel, 
1972) investigated subjects’ preferences for 
different kinds of information about an oppo- 
nent in upcoming bargaining sessions. Subjects 
could find out how much each potential out- 
come would cost the opponent in resources 
expended and how much the outcome was 
valued in terms of resources gained. Each
piece of information was available at a set 
price. Subjects purchased more information 
about the values the opponents placed on out- 
comes than on the costs; this value-cost dif- 
ference was larger in competitive than in non- 
competitive groups. 

These studies indicate that individuals who 
expect to interact with another do seek out 
information about them. Although the bar- 
gaining experiments do not explicitly involve 
impression management, self-presentational 
concerns could be one aspect determining sub- 
jects’ behavior. The present experiment in- 
vestigated the acquisition of information by 
individuals prior to conveying a specific im- 
pression of themselves to another person. It 
differs from Snyder (1974) in that the infor- 
mation was available only at some cost to the 
actors. Unlike the studies conducted by Eiser 
and associates (Eiser & Eiser, 1976; Eiser & 
Tajfel, 1972), the present study investigates 
information purchasing in an explicit context 
of impression management. 

The arguments presented above lead to the 
following hypotheses. First, because of their 
greater sensitivity to external cues for ap- 
propriateness of behavior in everyday en-
counters, high self-monitors will seek out more
information about another participant, even at 
some personal cost, than will low self-mon- 
itors. Second, the type of impression planned 
should influence the magnitude of the dis- 
crepancy between the information acquired by 
high and low self-monitors. A planned fabrica- 
tion increases one’s concern for performance 
during a self-presentation, which is reflected 
in a greater attentiveness to cues of situa- 
tional appropriateness. For low self-monitors, 
this means more attention is paid to internal 
states, such as norms or personality character- 
istics, than would occur when planning an 
accurate impression. High self-monitors, by 
contrast, would be led to a greater reliance on 
information from the environment than would 
be required for an accurate performance. A 
two-way interaction between self-monitoring 
and impression type is therefore predicted: 
Prior to an accurate performance, high self- 
monitors should acquire more information 
about another than will low self-monitors, but 
the difference in the amount acquired should 
be even greater when planning a fabrication. 


Reacting to the Encounter 


This study also investigated the partic- 
ipants’ reactions to the encounter. Actors 
could conceivably react to their behavior dur- 
ing the interaction on two distinct levels: 
their own personal feelings during the en- 
counter (were they comfortable, confident, 
self-conscious?) and their feelings about the 
performance they have just completed (were 
they successful, organized, persuasive?). Indi-
viduals may be able to separate critiques of 
their own performance quality from the feel- 
ings they had while performing, although the 
two are not necessarily independent. In addi- 
tion, other participants in the interaction 
should be able to make judgments about the 
actor on the same dimensions of performance 
quality and apparent feelings. 

In prior research, Ickes and Barnes (1977) 
found that when high self-monitoring subjects 
were paired with self-monitoring partners in 
an informal interaction, high self-monitors 
rated themselves and their partners as more 
self-conscious than did low self-monitors. Re-
garding others’ impressions of an actor’s per-
formance, Weiler and Weirstein (1972) dis-
covered that accurate impression managers
were judged more convincing than were fabri-
cators.


GREGORY C. ELLIOTT

Individuals conveying an accurate impres-
sion should feel little stress during the inter-
action. They have only to be themselves and
keep the interaction proceeding smoothly.
Fabricators, however, are under much more
stress. They must constantly be alert for sig-
nals that their performance is being dis-
credited. This leads to the hypothesis that
actors who have just conveyed a fabrication
will reveal more negative feelings about them-
selves and their performance than will those
who have conveyed an accurate impression.
Similarly, observers will evaluate the fabricat-
ing actors’ performance more negatively than
that of accurate impression managers, largely
because of nonverbal leakage (Ekman &
Friesen, 1969).

High self-monitors check their performance
during an interaction by judging the others’
reactions to them and tailoring their self-
presentation accordingly; low self-monitors do
not make much use of the reactions of others.
Because high self-monitors are especially
aware of the reactions of others, they are
more critical of their performance than are
low self-monitors. The hypothesis is that they
will react more negatively than will low self-
monitors. On the other hand, greater atten-
tiveness to reactions of others spurs the high
self-monitors to correct any problems in their
performance quickly. As a result, the hypoth-
esis is that observers will evaluate the per-
formance of high self-monitors more positively
than that of low self-monitors.


Method 


Overview 


Subjects were led to believe that they were par-
ticipating in a study of impression formation. They
were asked to help the experimenter discover how
people build impressions of others by communicating a
specific impression about themselves to another per-
son (ostensibly a naive subject, but actually a con-
federate). This impression was either in line
with their own attitude (accurate impression condi-
tion) or contrary to it (fabricated impression condi-
tion). Subjects were told that the experimenter would
be studying how the target person built an impression
of them and what elements of the impression were
important in the target person’s mind.

A successfully formed impression would net both 
participants a monetary bonus. Before beginning the 
encounter, subjects were given the opportunity to 
purchase information about the target person by giv- 
ing up some of the standard subject fee. After some 
time to prepare themselves, subjects spent 10 minutes 
in an informal face-to-face discussion with the con-
federate. Subjects and confederates then completed a
questionnaire on which they rated the subjects’ per- 
formance and feelings during the encounter. 


Subjects 


Sixty-six female students at a large Midwestern 
university served as subjects. All subjects had pre- 
viously filled out preexperimental questionnaires. Two
subjects were dropped from the analysis. One did not 
understand the instructions. A second refused in ad- 
vance to be paid for her participation, commenting 
that this meant the information made available to 
her would cost her nothing. Since her orientation to 
the experiment was affected by her refusal to be paid, 
her responses were not comparable to the others. 


Preexperimental Questionnaire


Six weeks before the experiment began, students 
from introductory sociology and education courses 
at the university filled out a preexperimental ques- 
tionnaire. The questionnaire contained Snyder's
(1974) Self-Monitoring Scale (labeled “personal re- 
action inventory”) and a “critical issues inventory” 
that measured attitudes toward controversial issues. 
Two items on legalizing marijuana served as the basis 
for determining the accuracy of the impression to be 
conveyed. The items were scored from —3 (extremely 
unfavorable to legalization) to 3 (extremely favorable 
to legalization). Subjects receiving a summed score
of 4 or more on these items were judged as definitely
in favor of legalization and were eligible to partic-
ipate in the experiment.
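
The eligibility rule just described can be written out as a small check. This is a minimal sketch under the scoring stated in the text (two items, each scored from -3 to 3, summed score of 4 or more); the function name is_eligible and its parameters are hypothetical and do not come from the original materials.

```python
def is_eligible(item_scores, threshold=4):
    """Return True if the summed score on the two legalization items meets the cutoff.

    item_scores: the two item scores, each an integer from -3 to 3.
    threshold: the minimum summed score (4 in the text).
    """
    if len(item_scores) != 2:
        raise ValueError("expected scores for exactly two items")
    if any(score < -3 or score > 3 for score in item_scores):
        raise ValueError("each item is scored from -3 to 3")
    return sum(item_scores) >= threshold


if __name__ == "__main__":
    print(is_eligible((3, 1)))  # summed score 4, meets the cutoff
    print(is_eligible((2, 1)))  # summed score 3, below the cutoff
```

With a maximum possible sum of 6, the cutoff admits only respondents who marked both items clearly on the favorable side, which is what allows the opposed-to-legalization role to count as a fabrication for every subject.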


Design 


The study utilized a 2 × 2 factorial design that
varied type of impression (accurate vs. fabricated)
and level of self-monitoring (high vs. low). The
impression was accurate if the subject was asked to
appear in favor of legalization; it was fabricated if
the subject was asked to appear opposed to legaliza-
tion. Subjects were high in self-monitoring if they
scored above the median (12 in a possible range from
0 to 25) on the self-monitoring scale; they were low
in self-monitoring if they scored below the median.
Subjects were randomly assigned to the impression-
type condition, with the constraint that in each con-
dition half the subjects must be high and half must
be low in self-monitoring. This yielded 16 subjects in
each cell of the design. Confederates were randomly
assigned to each condition.
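
The median split and the constrained random assignment described above can be sketched as follows. This is an illustrative reconstruction, not the procedure actually used: the function names, the seeded random generator, and the subject identifiers are assumptions, and because the text does not say how a score exactly at the median was handled, the sketch leaves such a score unclassified.

```python
import random
from collections import Counter

MEDIAN = 12  # median self-monitoring score reported in the text (range 0 to 25)


def self_monitoring_level(score):
    """Classify a Self-Monitoring Scale score by median split."""
    if score > MEDIAN:
        return "high"
    if score < MEDIAN:
        return "low"
    return None  # assumption: the text does not cover scores at the median


def assign_conditions(levels, rng):
    """Randomly assign impression type, balanced within each self-monitoring level.

    levels: dict mapping subject id to "high" or "low".
    rng: a random.Random instance, so the shuffle can be reproduced.
    """
    assignment = {}
    for level in ("high", "low"):
        ids = sorted(s for s, lvl in levels.items() if lvl == level)
        rng.shuffle(ids)
        half = len(ids) // 2
        for subject in ids[:half]:
            assignment[subject] = "accurate"
        for subject in ids[half:]:
            assignment[subject] = "fabricated"
    return assignment


def cell_counts(levels, assignment):
    """Count subjects in each (self-monitoring level, impression type) cell."""
    return Counter((levels[s], cond) for s, cond in assignment.items())


if __name__ == "__main__":
    # Hypothetical sample: 32 high and 32 low self-monitors (64 subjects in all).
    levels = {i: ("high" if i < 32 else "low") for i in range(64)}
    print(cell_counts(levels, assign_conditions(levels, random.Random(0))))
```

Shuffling within each self-monitoring level, rather than across the whole sample, is what guarantees the equal cells; with 32 subjects at each level it reproduces the 16 subjects per cell reported in the text.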


Experimental Procedure 


In a preparation room, the experimenter explained 
the (fictitious) nature of the study. The confederates 
were not present. The experimenter explained that
this experiment was an attempt to discover how 
people form an impression about another when they 
are face-to-face with that person, and that the sub- 
jects would serve as a discussion partner with another 
woman. They were also told that the experimenter 
would find out if the other woman had formed spe- 
cific impressions about the subject after the discussion 
had ended. Subjects were told they would be paid 
$2.50 for their participation. They then signed the 
participation consent form. 

Next, the experimenter handed two instruction 
sheets to the subjects. The first described their task 
in general. Reading from the instructions, the experi- 
menter explained that to ensure comparability across 
settings, the subjects’ task was to convey a specific 
impression about themselves to their partner during 
a 10-minute, face-to-face conversation period. The 
specific impression dealt with marijuana legalization. 
The experimenter explained that he was interested 
in determining whether appearing to be for or against 
an issue influences how others build up an impression 
of people. Therefore, some would be randomly chosen 
to convey the impression that they were in favor of
legalizing marijuana use, whereas others would be 
randomly selected to appear opposed to it. 

In addition, the experimenter told the subjects he 
was interested in whether the other person could 
build up an overall favorable picture of them at the 
same time she was learning whether they were for or 
against marijuana use. Subjects were asked to induce
the other woman to like them, regardless of which 
side of the issue they were assigned. The purpose of 
this task was to keep the subject from merely stating 
her assigned views perfunctorily and to promote a 
more realistic conversation. 

The experimenter explained that the other woman 
was the focus of the experiment. He said he would 
be analyzing exactly how she received the impression 
they would convey. The other woman (ostensibly) 
was with a (nonexistent) research assistant and had 
been told the study involved an analysis of what goes 
on in an informal, videotaped discussion of issues 
that affect college students’ lives. The rationale for 
the 10-minute discussion was that it was a practice 
session to give the participants a chance to get used 
to the discussion. 

Subjects were led to believe that the other woman 
had been given a list of possible discussion topics, 
including marijuana legalization. Since it was only 
one of several issues on the list, the subjects were 
told that they would probably have to bring up the 
topic themselves. They were told to do this in the 
manner most comfortable for them, as long as the 
partner had a fair chance to build an impression of
how they felt about marijuana. 

The experimenter stressed that it was important 
that the partner not realize that the study was really 
about impression formation until it was actually 
over; otherwise, she might try too hard and do things 
she wouldn’t do in normal, everyday life. 

The second instruction sheet informed the subjects 
of some special considerations regarding their task. 
First, the experimenter was interested in more than 
whether their partner came to understand how they 
felt about legalization. Subjects were asked to build 
up in the other person’s mind an organized, sensible 
set of reasons supporting their position. To help 
them, they would have available a list of some of the 
most common reasons given for the position they 
were assigned to convey.¹ The experimenter emphasized
that their use of this list was entirely a matter
for them to decide and that the quality of reasons 
used was at least as important as the quantity. 

Second, subjects did not have to persuade the other 
person to agree with them. The experimenter was 
only interested in whether and how the partner came 
to understand what the subjects’ position was. 

Third, the subjects were told that success in form- 
ing impressions often carries some kind of benefit 
for those involved in the get-together. To simulate 
the real-life possibility of rewards, the subject and 
her partner would each earn a 50¢ bonus if the other 
woman formed both a clear impression of the sub- 
ject’s attitude toward marijuana legalization and an 
overall favorable impression of her. Success would be 
determined by the target person’s responses to a 
questionnaire after the discussion.

Subjects were then taken to their separate inter- 
action rooms, where the actual discussions were to 
take place. At this point, half the subjects learned
that they were to convey the impression that they
favored legalizing marijuana (an accurate impression),
and the other half were to appear opposed to
legalization (a fabricated impression). They were also
given a list of 16 reasons supporting the assigned
position. To this point, the experimenter had not
known to which condition the subjects would be assigned.
Throughout the study, the experimenter was
blind to subjects’ self-monitoring scores.


Dependent Variable 


After considering their positions for a few minutes,
subjects were given an opportunity to purchase information
about their partner. Three kinds of information
were available: biographical (e.g., hometown,
religious preference), attitudinal (involving issues
other than marijuana, e.g., abortion, the priority of
conscience over law), and personality characteristics
(e.g., assertiveness, tolerance for others). This information
came from a self-descriptive questionnaire
ostensibly being filled out by the other woman at that
time. Actually, a standard response for each item was
provided by the experimenter. These responses were
neither extreme, in the case of the attitudinal and personality
items, nor unusual, in the case of the background
information.

For every 2 bits of information selected, 1¢ was de- 
ducted from the standard subject fee of $2.50. Ten 
bits of each type were available (a total of 30 bits), 
for a maximum possible cost of 15¢. Subjects were 
told that the cost was to simulate the real-life pos- 
sibility that one could often find out about another 
only at some cost to the self. The experimenter em- 
phasized that subjects were free not to purchase any 
information at all. They could have as little or as 
much as they wanted. Subjects were given a few 
minutes alone to decide, at which time the experimenter
returned with the questionnaire (presumably
filled out by the other person) and filled in any
requested responses. The amount and kind of information
bought by the subjects constituted the major
dependent variable.
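
The pricing rule can be made concrete with a small sketch. The names below are hypothetical, and pricing each bit at half a cent is an assumption made so that odd numbers of bits have a defined cost; the text gives only the rate of 1¢ per 2 bits.

```python
from fractions import Fraction

STANDARD_FEE_CENTS = 250          # the $2.50 subject fee
BITS_PER_TYPE = 10                # bits available for each information type
INFO_TYPES = ("biographical", "attitudinal", "personality")
CENTS_PER_BIT = Fraction(1, 2)    # assumption: 1 cent per 2 bits, prorated per bit


def purchase_cost_cents(purchases: dict) -> Fraction:
    """Cost, in cents, of the bits requested for each information type."""
    total_bits = 0
    for info_type, bits in purchases.items():
        if info_type not in INFO_TYPES:
            raise ValueError(f"unknown information type: {info_type}")
        if not 0 <= bits <= BITS_PER_TYPE:
            raise ValueError("between 0 and 10 bits are available per type")
        total_bits += bits
    return total_bits * CENTS_PER_BIT


def net_fee_cents(purchases: dict) -> Fraction:
    """Fee remaining after the cost of purchased information is deducted."""
    return STANDARD_FEE_CENTS - purchase_cost_cents(purchases)


everything = {info_type: BITS_PER_TYPE for info_type in INFO_TYPES}
print(purchase_cost_cents(everything), net_fee_cents(everything))
```

Buying all 30 bits gives the 15¢ maximum stated above, and buying nothing leaves the fee untouched.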


Performance Ratings 


After being given 5 minutes to prepare, subjects
engaged in an informal 10-minute discussion with
the partner, who was brought to the subjects’ room.
The other woman was actually a confederate, chosen
for her ability at improvisational drama.² She was
trained to be easygoing, personable, and ambivalent
regarding legalization. Although keeping up her end
of the conversation, she did not direct or dominate
the conversation in any way. The confederates were
blind to the experimental conditions and to
the experimental hypotheses. Subjects were randomly
assigned to a confederate.

Following the discussion, the subject and the other
woman were separated, and each filled out a
questionnaire consisting of 30 bipolar adjectives in
semantic differential form. Half of the items measured
their impressions of the performance that the subject
had just given (e.g., interesting, impressive); the rest
measured their impression of what the subject had
been feeling during the discussion (e.g., enthusiastic,
self-conscious). These items constituted the measures
of the second dependent variable.

At this juncture, the subjects were extensively debriefed.
This involved assessing their suspicion level,
discovering their perceptions about the experiment,
revealing the true nature of the experiment, and
ensuring that they felt comfortable about their participation.
The experimenter then paid the subjects
$2.50, cautioned them not to discuss the experiment,
and dismissed them.


1 The reasons were generated from two sources:
Kaplan (1970) and the National Commission on
Marijuana and Drug Abuse (1972).

2 I thank Joyce Bizub, Mary Erdman, [...]
Franklin, Jane Homburg, Laura Raffe, and [...]
Schwindt for serving as confederates. Their [...]
improvisation added greatly to the believability of
this experiment.
Results and Discussion


Internal Validity


During the debriefing, the degree of suspicion
the subjects expressed was coded. Complete
surprise upon learning the true nature
of the experiment was assigned a value of 0;
the vague feeling that something else was
going on was assigned 1; the correct inference
that the subjects themselves were at least part
of the focus of the study received a 2; guessing
in addition that the partner was a confederate
received a 3; correctly discerning
that the experiment investigated the subject’s
impression management behavior was scored
4.

Complete surprise was expressed by 60.97%
of the subjects, and 26.6% had only a vague
feeling that something else was going on. The
mean level of suspicion was .61, indicating
that most subjects believed the cover story
they were given. Analysis of variance in suspicion
using level of self-monitoring and impression
type as factors revealed that no condition
caused more suspicion than any other.
In addition, dichotomizing subjects into nonsuspicious
(those receiving a coded value of
0) and suspicious (those receiving any other
value) created a factor that showed no main
effects on the dependent variables, nor were
any interactions between suspicion and the
experimental factors in evidence.
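
The suspicion coding and the two-way split used in this check can be sketched as follows. The names are hypothetical, and two details are assumptions because the scan is damaged at those points: the highest category is taken to be 4, the next step in the sequence, and the nonsuspicious group is taken to be those coded 0.

```python
from statistics import mean

# Coding scheme as described in the text; the value 4 is an assumption.
SUSPICION_CODES = {
    0: "complete surprise",
    1: "vague feeling that something else was going on",
    2: "inferred that subjects were part of the focus of the study",
    3: "also guessed that the partner was a confederate",
    4: "discerned that impression management was under study",
}


def dichotomize(code: int) -> str:
    """Collapse a suspicion code into the two-level factor used in the check."""
    if code not in SUSPICION_CODES:
        raise ValueError(f"unknown suspicion code: {code}")
    # Assumption: only a code of 0 counts as nonsuspicious.
    return "nonsuspicious" if code == 0 else "suspicious"


def summarize(codes: list) -> dict:
    """Mean suspicion level and proportion of nonsuspicious subjects."""
    groups = [dichotomize(code) for code in codes]
    return {
        "mean": mean(codes),
        "proportion_nonsuspicious": groups.count("nonsuspicious") / len(groups),
    }


# Hypothetical codes for four subjects, for illustration only.
print(summarize([0, 0, 1, 3]))
```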


Information Purchasing


The major dependent variable in this study
was the amount of information purchased by
subjects in preparing for their self-presentation.
Recall that the available information was
of several types. To see whether the effects of
impression type and self-monitoring differed
at each type of information, an analysis of
variance was carried out, with the level of
self-monitoring and impression type as between-subjects
factors and information type
as a within-subjects factor with three levels:
biographical, attitudinal, and personality
characteristics. The results of the analysis
support the hypotheses. Analysis of variance
in the total amount of information purchased
revealed a significant main effect for self-monitoring,
F(1, 60) = 10.65, p < .002, a
marginal main effect for impression type, F(1,
60) = 3.57, p < .07, and a significant two-way
interaction between these factors, F(1,
60) = 4.25, p < .05. The marginal values of
Table 1 provide the means for each cell of the
design. High self-monitoring subjects purchased
more information than low self-monitoring
subjects. Fabricating subjects showed
a tendency to purchase more information than
those conveying an accurate impression. High
self-monitoring fabricators reliably bought
more information than all others.³

An additional interesting finding was a 
three-way interaction between level of self- 
monitoring, impression type, and information 
type, F(2, 120) = 3.213, p < .05. Inspection
of the means (provided in Table 1) revealed 
the pattern of this interaction. For low self- 
monitoring subjects, there were no differences 
in the amounts of information purchased. 
High self-monitoring subjects tended to buy 
the same amount of personality information, 
regardless of the truthfulness of the impression
they were conveying; they tended to buy
less biographical and attitudinal information
when planning an accurate impression than
when planning a fabrication.
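
The degrees of freedom reported for these tests follow from the design, as the sketch below shows. It assumes the standard partition for a mixed design with two between-subjects factors and one within-subjects factor; the function and key names are hypothetical.

```python
def mixed_anova_df(n_subjects: int, levels_a: int, levels_b: int, levels_within: int) -> dict:
    """Degrees of freedom for an A x B (between) x C (within) mixed design."""
    cells = levels_a * levels_b
    error_between = n_subjects - cells           # subjects within cells
    df_within = levels_within - 1
    error_within = df_within * error_between     # C x subjects within cells
    return {
        "A": (levels_a - 1, error_between),
        "B": (levels_b - 1, error_between),
        "A x B": ((levels_a - 1) * (levels_b - 1), error_between),
        "A x B x C": ((levels_a - 1) * (levels_b - 1) * df_within, error_within),
    }


# 16 subjects in each of 4 cells, and 3 types of information.
df = mixed_anova_df(n_subjects=64, levels_a=2, levels_b=2, levels_within=3)
print(df["A"], df["A x B x C"])
```

With 64 subjects this yields 1 and 60 degrees of freedom for the between-subjects effects and 2 and 120 for the three-way interaction, matching the values reported.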

It may be that, for the high self-monitors, 
what the other individual thinks she is like as 
a person is not differentially important in 
planning a performance. Personality char- 
acteristics are important elements of some 
baseline information about another, but they 
are probably not directly related to the issue 
at hand (at least not in this experiment). Per- 
sonality information serves mainly to guide 
actors in choosing the general manner of self- 
presentation (i.e., orientation to the other)


3 Inspection of the within-cell variances revealed
that the assumption of homogeneity of variances was
tenuous. The information purchased was transformed
by the square root function, resulting in more homogeneous
cell variances. The analysis of variance for this
transformed variable did not change the significance
levels of any of the effects. Since the effects are more
readily interpretable for the untransformed variable,
results are discussed in terms of it. An analysis of
variance was conducted using confederate as an additional
between-subjects factor. Results showed no
effects for confederate, alone or with any of the
other factors.
Table 1

Mean Number of Information Bits About the
Partner Purchased by Actors as a Function of
Impression Type, Level of Self-Monitoring,
and Information Type

                           Impression type
Information type         Accurate    Fabricated

High self-monitoring
  Attitudinal      M       .375        1.938
                   SD      .719        2.744
  Biographical     M       .625        1.875
                   SD     1.258        1.821
  Personality      M      1.125        1.188
                   SD     1.360        1.424
  Total            M      2.125        5.000
                   SD     2.872        4.502

Low self-monitoring
  Attitudinal      M       .438         .250
                   SD      .629         .683
  Biographical     M       .438         .438
                   SD      .629         .727
  Personality      M       .375         .438
                   SD      .619         .629
  Total            M      1.250        1.125
                   SD     1.612        1.668

Note. Means that do not share a subscript differ from
each other at the .05 level of significance by the
Newman-Keuls procedure.
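
One way to check the cell means in Table 1 is to confirm that the three information-type means in each cell add up to the cell total, within rounding. The sketch below does this; the values are entered as read from the scanned table (decimal points restored), and the names and tolerance are assumptions of the sketch.

```python
# Cell means as read from Table 1 (bits of information purchased).
TABLE_1_MEANS = {
    ("high", "accurate"): {"attitudinal": 0.375, "biographical": 0.625,
                           "personality": 1.125, "total": 2.125},
    ("high", "fabricated"): {"attitudinal": 1.938, "biographical": 1.875,
                             "personality": 1.188, "total": 5.000},
    ("low", "accurate"): {"attitudinal": 0.438, "biographical": 0.438,
                          "personality": 0.375, "total": 1.250},
    ("low", "fabricated"): {"attitudinal": 0.250, "biographical": 0.438,
                            "personality": 0.438, "total": 1.125},
}

# Each printed value is rounded to three decimals, so allow a small slack.
ROUNDING_TOLERANCE = 0.002


def total_discrepancy(cell_means: dict) -> float:
    """Absolute gap between the summed type means and the printed total."""
    parts = sum(value for key, value in cell_means.items() if key != "total")
    return abs(parts - cell_means["total"])


def table_is_consistent(table: dict, tolerance: float = ROUNDING_TOLERANCE) -> bool:
    """True if every cell's type means add up to its total within tolerance."""
    return all(total_discrepancy(cell) <= tolerance for cell in table.values())


print(table_is_consistent(TABLE_1_MEANS))
```

All four cells agree with their totals to within rounding, which supports the readings of the damaged entries.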


and not specific elements of the performance. 
Therefore, personality information is equally
important in conveying an accurate or a fab- 
ricated impression. 

Attitudinal and biographical information is 
more directly related to the issue at hand. 
These data give indications of how one might 
feel about the issue around which the impres- 
sion is being built. This would be especially 
important for a fabrication. High self-mon- 
itoring actors in a fabrication are feeling their 
way through unfamiliar territory and need 
relevant information about their partners.
With an accurate impression, high self-monitors
can rely on their abilities as performers.
The luxury of being themselves may lend high
self-monitors the confidence to do without
this information and deal with contingencies
as they arise. In short, a fabricated impression
is more of a challenge to high self-monitoring
actors than is an accurate impression.

One might ask if the difference between
high and low self-monitors was due to the fact
that low self-monitoring actors felt no heightened
concern when contemplating a fabrication.
Although no data from this experiment
directly addressed this question, some indirect
evidence suggests that this was not the case.
All subjects judged the accurate performance
as easier (M = 3.97) than the fabricated one
(M = 1.87), F(1, 60) = 12.53, p < .001, but
there were no such differences between high
and low self-monitors, F(1, 60) < 1. It is
probable that all subjects anticipated difficulty
with a fabrication. Because low self-monitors
rely on their own internal states for
resources, however, the challenge afforded by a
fabrication did not motivate low self-monitoring
actors to acquire more information about
their partner.


Reactions to the Performance 


The second set of hypotheses focused on the
aftermath of the self-presentation. Actor subjects
and observer confederates were given
two sets of 15 bipolar items in semantic differential
format to record their impressions following
the encounter. Responses were made on a
9-point semantic differential scale. For purposes
of analysis, a score of 9 was given to
the positive extreme and 1 to the negative extreme.

It should be noted that confederate ob-


4 This explanation is entirely post hoc and should
be taken with some caution. One factor that adds to
the tentativeness of this reasoning is that for
low self-monitors, a floor effect is operating: They
cannot buy fewer than zero pieces of information.
However, tests for the differences between accurate
and fabricated impression type were conducted at
each level of information type using only high self-monitors.
In a test for “simple effects” of impression
type (cf. Kirk, 1968, chap. 8), the results showed
significant differences (p < .05) in the amount of biographical
and attitudinal information purchased by
accurate and fabricating subjects; there was no difference
with respect to personality information.


servers’ ratings of the subjects’ performance
quality were their own personal judgments of
what they saw; their ratings of the subjects’
feelings during the interaction were what the
subjects were apparently feeling, as revealed
by their verbal and nonverbal behavior. Further,
subjects and confederates completed
these items while totally isolated from each
other, with no way of influencing each other’s
ratings. Subjects were not told that they
would be rating themselves until after the discussion
was over, nor did they know that the
confederate had rated them until the debriefing.
Confederates were not told that the subjects
had rated themselves until the last subject
had been run. Finally, it should be recalled
that the confederates were blind to all
experimental conditions, so that any differences
due to these factors must have been
communicated through the performance.

Because of the large number of evaluations
being made (30 each for subject and confederate),
subjects’ and confederates’ ratings
were factor analyzed separately using Rao’s
uniqueness rescaling procedure with varimax
rotation. This served to organize the evaluations,
reduce the number of analyses conducted,
and ensure that the variables to be
analyzed were relatively orthogonal.

Results for subjects’ self-reports revealed 
three factors for the performance quality rat- 
ings and four factors for the feeling states 
ratings. For the performance quality items, 
Factor 1, an acting-quality factor, consisted of 
the items unconstrained-constrained, satis- 
fied-dissatisfied, and graceful-awkward. The 
second factor reflected organization and in- 
cluded organized—disorganized, systematic- 
unsystematic, clear—unclear, impressive-unim- 
pressive. Factor 3 indicated communicative 
quality, with consistent-inconsistent, understood-misunderstood,
and persuasive-unpersuasive loading on this factor.

For the feeling states items, the first factor 
seemed to reflect satisfaction with the per- 
formance: positive-negative, capable-incapa- 
ble, satisfied—frustrated, commendable-repre- 
hensible, and enthusiastic—apathetic. The 
second factor was a security factor: calm— 
nervous, comfortable-uncomfortable, secure— 
insecure, and confident—unconfident. Factor 3 


DECEPTION, SELF-MONITORING, AND SELF-PRESENTATION 


1289 


was a friendliness factor and consisted of 
friendly-unfriendly and likable-unlikable.
Finally, Factor 4 seemed to reflect personal 
comportment: sensible—foolish, straightfor- 
ward—devious, and free-constrained. 

Turning to the confederates’ evaluations of 
the subjects, four factors emerged from the 
ratings of performance quality. Factor 1 indi- 
cated performance competence: interesting- 
uninteresting, impressive-unimpressive, successful-unsuccessful,
satisfied-dissatisfied, and
believable-unbelievable. The second factor 
involved aesthetic reactions to the perform- 
ance and consisted of organized—disorganized, 
graceful-awkward, flawless—flawed, and sys- 
tematic—unsystematic. Factor 3 involved com- 
municative clarity, with understood—misun- 
derstood and clear—unclear. Factor 4 reflected 
performance coherence: persuasive-unpersua- 
sive and consistent-inconsistent. 

For the feeling states items, five factors 
were discovered. Factor 1 consisted of items 
dealing with the actors’ perceived comfort in 
the role: calm-nervous, comfortable-uncom- 
fortable, and free-constrained. Items loading 
on Factor 2 suggested personal accomplish- 
ment: commendable-reprehensible and capa- 
ble-incapable. Factor 3 involved security, 
with secure-insecure, unself-conscious—self- 
conscious, confident-unconfident, and satis- 
fied-frustrated. The fourth factor indicated 
friendliness: likable-unlikable, friendly—un- 
friendly, and sensible-foolish. Finally, Factor 
5 suggests role commitment: enthusiastic- 
apathetic and positive-negative.⁵

A second-order factor analysis was run on 


5 As the reader may have noted, the factor struc- 
tures for the two rating sources (subject and con- 
federate) are somewhat different. The difference 
appears to reflect the perspective each brings to the 
performance. The factor structure for the actor subjects
indicates their concern for the technical qualities
of the performance. By contrast, the factor
structure for the confederates reveals their critical 
reactions to the aesthetics of the performance. Such 
a performer/audience difference in perspectives is 
consistent with the dramaturgical approach to social 
interaction. However, it should be noted that the 
experimental design may have contributed to this 
difference. The observers were confederates of the 
experimenter and as such may have felt more like an 
audience than a coparticipant in the interaction.
the entire set of factors separately for subjects
and confederates. Variables used in the sec- 
ond-order analysis were obtained by summing 
the scores for the items loading on the same 
factors. For the subjects, analysis of the seven 
first-order factors yielded three second-order 
factors. Factor 1 involved performance compe- 
tence, performance satisfaction, security, and 
personal comportment (performance quality 
first-order Factor 1 and feeling states first- 
order Factors 1, 2, and 4) and may be labeled 
a performance enactment factor. Factor 2 included
organization and communicative quality
(performance quality first-order Factors
2 and 3), which appears to deal with the
mechanics of the performance. Factor 3 was
the friendliness factor (feeling states first- 
order Factor 3). 

For the confederates’ ratings, analysis of 
the nine first-order factors also produced three 
second-order factors. Factor 1 involved per- 
formance competence, performance aesthetics, 
communicative clarity, performance coher- 
ence, and personal accomplishment (all the 
performance quality first-order factors plus 
feeling state first-order Factor 2) and indicates
a critical evaluation of the performance.
Factor 2 included comfort in the role, secur- 
ity, and role commitment (feeling states first- 
order Factors 1, 3, and 5), which appears to 
reflect a critical evaluation of the actors’ per- 
sonal fit to the role requirements. Finally, 
Factor 3 was the friendliness factor (feeling 
states first-order Factor 4). 
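
The summed factor variables can be illustrated with the confederates’ friendliness factor, whose three items are listed above. The sketch below is hypothetical in its names and example ratings; it only shows how summing 9-point items fixes the possible range of a factor score.

```python
SCALE_MIN, SCALE_MAX = 1, 9  # 9-point semantic differential scale

# Items loading on the confederates' friendliness factor, as listed in the text.
CONFEDERATE_FRIENDLINESS_ITEMS = ("likable", "friendly", "sensible")


def factor_score(ratings: dict, items: tuple) -> int:
    """Sum the ratings of the items that load on one factor."""
    for item in items:
        if not SCALE_MIN <= ratings[item] <= SCALE_MAX:
            raise ValueError(f"rating for {item} is outside the 1-9 scale")
    return sum(ratings[item] for item in items)


def score_range(items: tuple) -> tuple:
    """Smallest and largest possible summed score for a set of items."""
    return (len(items) * SCALE_MIN, len(items) * SCALE_MAX)


# Hypothetical ratings of one subject by one confederate.
example = {"likable": 8, "friendly": 9, "sensible": 7}
print(factor_score(example, CONFEDERATE_FRIENDLINESS_ITEMS),
      score_range(CONFEDERATE_FRIENDLINESS_ITEMS))
```

A three-item factor can range from 3 to 27, so friendliness means in the low twenties lie near the favorable end of the scale.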

Analysis of variance was conducted on the
second-order factors. Variables were created
by summing the scores for the items loading
on the same factor. Results provide moderate
support for the hypotheses. The subjects’ second-order
factors were relatively unaffected
by the experimental conditions. No effects for
[...]
conditions had greater [...] confederates’
second-order [...] Self-monitoring affected all
three factors. For the performance evaluation
factor (Factor 1), confederates rated high
self-monitors more favorably (M = 10[...])
than low self-monitors (M = 93.72), F(1,
60) = 6.82, p < .025. High self-monitors were
also seen to fit the role they enacted (Factor
2) better than low self-monitors (M = [...]
for high self-monitors vs. M = 59.53 for low
self-monitors), F(1, 60) = 3.97, p < [...].
Finally, high self-monitors were seen as
friendlier (Factor 3; M = 23.62) than low
self-monitors (M = 22.19), F(1, 60) = [...],
p < .05, a result also reported in [...]
(1976).

Impression type influenced the confederates’
ratings on two second-order factors. Accurate
impression managers were judged by
the confederates to fit the role (Factor 2)
better (M = 65.69) than fabricators (M =
59.44), F(1, 60) = 4.22, p < .05. In addition,
they tended to see accurate impression
managers as friendlier (Factor 3; M = 2[...])
than fabricators (M = 22.22), F(1, 60) =
3.71, p < .06.

Although the targets seemed sensitive to
many aspects of the actors’ performances, one
exception stands out. They did not seem to
realize when the actors were deceiving them.
Observers rated fabricators as no less straightforward
or believable than accurate performers
were. Believability was affected by
the level of self-monitoring. High self-monitors
were rated as more believable (M =
7.47) than low self-monitors (M = 6.[...]),
F(1, 60) = 5.95, p < .02.

Apparently, actors were able to look
convincing during the discussion. They did not


6 A second-order factor analysis was conducted
using all 16 factors extracted for both subjects and
confederates. The results yielded 4 second-order factors.
Factor 1 consisted of all the confederates’ first-order
factors. Factors 2, 3, and 4 were identical to
the 3 second-order factors reported in the analysis
involving only the subjects’ factors. Because there
was no mixing of first-order factors across rating
source in the second-order structure and because
separate analyses of the subjects’ and confederates’
factors yielded identical results in the subjects’ case
and more differentiated results in the confederates’
case, it was decided to use the second-order factors
from the separate analyses as dependent variables
in the analyses of variance.


fool themselves, however. Actors in the fabricating
condition reported themselves as less
straightforward (M = 5.94) than those conveying
an accurate impression (M = 7.34),
F(1, 60) = 5.48, p < .05, and slightly less believable
(M = 7.09 for fabrication vs. M =
7.81 for accurate impression), F(1, 60) =
3.26, p < .08.

Compare these results to those in Lippa 
(1976), in which subjects sequentially role 
played teachers as introverts, extraverts, and 
themselves. Although rating high self-monitors 
as more technically proficient actors, observers 
were better able to detect their true tendencies 
toward extraversion. The correlation between 
perceived and actual (as assessed by premea- 
sures) extraversion was greater for high than 
for low self-monitors. In other words, high
self-monitors were perceived more accurately 
than low self-monitors. Perhaps the difference 
is due to differences in the content of the self- 
presentation. In the present study, attitudes 
were being conveyed; in Lippa’s study, the 
content of the impression was a personality 
characteristic. Expressive control for the latter
might be much more difficult.


Conclusion 


The results of this study indicate that 
actors involved in planning a self-presentation 
exhibit differential interest in information 
about a prospective interaction partner. A 
person variable (self-monitoring) and a situa- 
tion variable (impression type) combined in- 
teractively to influence the amount of infor- 
mation acquired by impression managers. 
Specifically, only high self-monitors who 
planned a fabrication tended to purchase in- 
formation about their prospective partner. 

Further, whereas low self-monitors found all
types of information unnecessary, high self-monitoring
actors placed different values on
the kind of information available. High self- 
monitors acquired personality information 
about their partner regardless of impression 
type but purchased attitudinal and biographical
information only when planning a fabrication.

This study also investigated actors’ and 
observers’ reactions to the actors’ performance.
One striking result is the degree to
which subjects’ self-reports were unaffected by 
the experimental conditions. In particular, high 
self-monitors rated themselves no better or 
worse than low self-monitors did. But con- 
federates clearly distinguished between them, 
consistently evaluating high self-monitors
more positively. This suggests that although 
high self-monitors may have been as worried 
as their low-scoring counterparts, they man- 
aged to hide those worries from their partner. 

Several aspects of this experiment may limit 
the generalizability of the results. First, it 
may be that the results presented above apply 
only to those individuals whose true attitudes 
favor the issue involved in the performance.
It should be recalled that all subjects were 
selected for their strong support for legalizing 
marijuana, a popular stand on college cam- 
puses. Perhaps the behavior of fabricating 
actors was due to being seen as supporting an 
unpopular position as much as to giving a de- 
ceptive performance. 

Second, the impression to be conveyed was 
assigned by the experimenter and was not 
chosen by the actors themselves. It may be 
that the effects due to impression type were 
actually a function of a forced fabrication. 
Actors assigned a truly self-descriptive im- 
pression may behave no differently than they 
would if they had selected the impression 
themselves; actors assigned a fabrication may 
behave differently, more because they had no 
choice in the matter than because the impression
was not honest. There are situations in
which demands for a particular impression 
are strong enough so that very little choice 
exists. If the goals desired from the interaction 
are important, actors may be induced to enact 
the impression demanded by the situation, 
even if it involves a fabrication. Their plight 
would be not unlike that of fabricating subjects
in the present experiment.

In sum, the results of this study suggest 
that impression managers are differentially 
concerned with acquiring information about 
another participant in an upcoming interac- 
tion. High self-monitors planning a fabrication 
are the only actors who acquire a significant 
amount of information about the other. More 
important, they will endure some personal 
costs to obtain this information. They apparently
believe that the advantages of additional
information in preparing a self-presentation
outweigh the costs of obtaining it. In
the present study, individuals acquired infor- 
mation without the other’s knowledge. It 
would be interesting in future research to 
learn if high self-monitoring fabricators main- 
tain their interest in purchasing information 
when the other knows what is being acquired 
and at what cost. Further insights into the 
planning of a self-presentation will provide 
greater understanding of the process guiding 
impression management. 
Lakoff has suggested that men and women use different speech styles, with
women’s speech more polite but less assertive than men’s. The assumption that
three of Lakoff’s linguistic variables (tag questions, qualifiers, and compound
requests) affect person perception in these ways was tested. Sex of speaker was
also varied. In Experiment 1, undergraduates rated the assertiveness, warmth,
and politeness of two male and two female speakers who used or did not use
the three linguistic forms. All three “female” linguistic forms were rated less
assertive than corresponding “male” forms; qualified speech and compound requests
were rated warmer and compound requests more polite. Sex of speaker
was a significant factor in only one possible comparison. These results were
substantially replicated in Experiment 2, in which older and/or less educated
women acted as judges. These findings suggest that Lakoff’s intuitions concerning
effects of speech styles on person perception are largely correct and that
[...] men and women to affect how they
are perceived.


Our impressions of other people are derived
[...] what they
think of them, and so on.
What people say is surely an important source
of information, but how they say it may also
be of great importance. Thus, for example, a
rapid speech rate leads to higher ratings of
speaker competence (Smith, Brown, Strong,
& Rencher, 1975) and increases the persuasiveness
of a communication by enhancing
speaker credibility (Miller, Maruyama, Beaber,
& Valone, 1976). The present article
focuses on three linguistic dimensions that are
hypothesized by Lakoff (1975) to affect person
perception. According to Lakoff, the three
forms are used differentially by men and women,
thus contributing to the maintenance of
stereotypes concerning sex differences in personality.

Lakoff has suggested that men and women 
differ in their styles of speech in ways that 
both result from sex stereotypes and reinforce 
those same stereotypes. Specifically, sex dif- 
ferences in speech styles are said to contribute 
to maintaining images of men as assertive, 
self-confident, and definite, and images of 
women as vague and lacking in confidence. 
Women’s speech is also said to be more polite,
in the sense of being more formal and more
deferential than men’s. The differences in
speech style Lakoff discusses include the following
linguistic dimensions, among others:

1. Greater female use of tag questions at
the end of declarative sentences, for example,
“It’s really cold in here, isn’t it?” Tags request
listener confirmation of the truth of an
assertion and are thus said to indicate lack of
confidence. They are also characterized as
polite in that they allow the listener a graceful
“out” in case of disagreement.

2. Greater female use of hedges or qualifiers,
that is, words or phrases such as
“y'know,” “kinda,” “I guess,” or “maybe,”
which blunt the force or definitiveness of an
assertion but, again, are polite in that they
give the listeners options if they wish to disagree
with or ignore the statement.

3. Greater female use of compound requests
(e.g., “Won’t you close the door?”) rather
than simple requests (e.g., “Close the door”).
Such forms are said to soften the assertion of
power entailed in commanding or requesting
action, as well as being considered more polite.

While these are interesting hypotheses with 
a variety of social implications, Lakoff has 
presented no empirical evidence to support her 
claims. She makes at least three assumptions 
based on intuition. First is the assumption
that there are, in fact, differences in the frequency
of use of the above-mentioned words
and syntactic constructions as a function of
sex. Second, Lakoff assumes that these linguis- 
tic differences influence how people are per- 
ceived, in the ways she has suggested. Third, 
Lakoff seems to assume, although she is in- 
consistent on this point, not only that linguis- 
tic style affects person perception but that 
style is an important influence on person per- 
ception. 

The experiments to be reported in this 
article concern the second and third assump- 
tions. However, we will first review evidence 
concerning the first assumption, that there are 
sex differences in linguistic usage. While this 
is not critical for our purposes, since the 
effects of individual linguistic variations are 
of some interest in their own right, much of 
the current excitement concerning Lakoff’s 
proposals depends on the assumption of sex 
patterning. The data to date are somewhat
contradictory. Several studies have failed to
find sex differences, but there is growing evidence
that such differences do exist, although
perhaps only in certain contexts or for certain
populations.

While Dubois and Crouch (1975) found 33 
tag questions from males, but none from fe- 
males, on tapes of question periods following 
formal presentations at a professional meeting,
they did not provide information on number
of participants or total time talking for
each sex. Perhaps very few females were
present, or those present talked very little in
any style. Furthermore, an academic population
may have distinctive speech styles.
Lakoff (1975, 1977) has discussed at some
length her belief that academic men are exceptions
to her rules and use a speech style
generally identified as “female.” Many of the
same reservations can be expressed about an- 
other study reporting no sex differences in the 
use of tag questions (Baumann, 1976). 

Another study finding no sex differences was 
conducted by the present authors (Newcombe 
& Arnkoff, Note 1). Pairs of unacquainted 
undergraduates were taped while talking 
about three topics of general interest. The 
tapes were coded for instances of tag questions 
and qualifiers, as well as for other language 
differences discussed by Lakoff: rising intona- 
tion on declarative sentences, so and such as 
intensifiers, “cute” adjectives of admiration, 
and expletives and euphemisms. No differ- 
ences in speech style or trends to differences 
were observed due to sex of speaker, sex of 
listener, or the interaction, either for fre- 
quencies or frequencies divided by time speak- 
ing. 

There might, of course, be a variety of reasons
for the lack of sex differences in linguistic
form usage in this study, with perhaps the 
most obvious being the use of a college stu- 
dent population. However, sex differences in 
speech style were found in a very similar 
study by Crosby and Nyquist (1977, Study 1) 
that coded use of empty adjectives, tag ques- 
tions, hedges or qualifiers, and so. McMillan, 
Clifton, McGrath, and Gale (1977) also reported
differences between the speech of male
and female college students, with women using
more tags, more imperatives in question
form, more intensifiers such as so and such,
and more modals such as might have said (a
dimension discussed by Key, 1975). McMillan
et al. placed their subjects in a group
problem-solving situation. Either being in a group or
being in a problem-solving situation could
potentially account for the appearance of
sex-related linguistic differences. McMillan et al.
also found that sex differences were especially
marked in comparing speakers from mixed-sex
rather than same-sex groups.

Hartman (1976) found that in individual 
interviews, women aged 70 years or over used 
more tag questions and qualifiers compared 
to men of similar ages. Crosby and Nyquist 
(1977, Study 3) found that female clients 
requesting aid or information in a police sta- 
tion used “female register” (i.e., tag questions, 
hedges or qualifiers, and politeness expres- 
sions) more than male clients, and that female 
police personnel also used this style more than 
male police personnel. However, Crosby and
Nyquist (Study 2) did not find sex differences
in requesting assistance at an information
booth. Gleason and Weintraub (1978) reported
that mothers produced twice as many
tag questions as fathers in speech to preschoolers,
that fathers produced more simple
requests than mothers, and that male day-care
teachers produced more simple requests
than female teachers. Further work is clearly
needed to identify whether, and in what ways,
variations in context, age, and other considerations
determine the appearance of sex-related
differences in language; that they do exist in
at least some instances seems fairly well established.

There is relatively little direct evidence
bearing on the second of Lakoff’s assumptions, 
that differences in usage contribute to how 
people are perceived. While Bates (1976) has 
provided evidence that interrogative requests 
in Italian are judged to be more polite than 
simple imperatives, and Siegler and Siegler 
(1976) have found a nonsignificant trend for 
tag questions to be judged less intelligent than 
assertions, both studies involved judgments 
of a small sample of written sentences, with 
attention implicitly or explicitly drawn to the 
dimension of interest. We do not know from
these studies whether linguistic form affects 
person perception in a more realistic speech 
context when attention is not focused on the 
forms and dimensions of interest. 

Erickson, Lind, Johnson, and O’Barr
(1978) have compared reactions to simulated
court testimony delivered either in “powerful”
or “powerless” style. These styles contrasted
three of the dimensions discussed by Lakoff:
hedges or qualifiers, intensifiers, and question
intonation in normally declarative contexts. 
However, in addition, powerless speech con- 
tained many hesitation forms, such as pauses, 
stutters, and “uhs,” incorrect or informal 
pronunciation, use of sir in addressing the
attorney, and instances in which the witness 
asked the lawyer questions. Speakers using 
powerless speech were rated less credible and 
less attractive than those using powerful 
speech. While these findings indicate the gen- 
eral importance of speech style in impression 
formation, unfortunately they do not allow 
specific identification of which of the many 
contrasting speech dimensions were respon- 
sible for the effects observed. 

It should be noted that some of Lakoff’s 
claims regarding the effects of speech style in 
person perception are not obvious. For in- 
stance, tag questions, rather than conveying 
uncertainty, can be used in a condescending or 
overbearing manner to forestall opposition 
(Dubois & Crouch, 1975). If it is said with 
appropriate intonation, the statement “That 
production of Hamlet was dreadful, wasn’t 
it?” can in fact be intimidating rather than 
uncertain. 

With regard to Lakoff’s third assumption, it 
seems important to determine whether speech 
style is a reasonably large determinant of per- 
son perception compared to the effect of sex 
stereotypes per se. According to Valian 
(1977), Lakoff is inconsistent in her position 
on the relationship of linguistic change to social
change. She sometimes argues, as in her
discussion of the term Ms., that linguistic 
change can only reflect social change, not pro- 
duce it. However, at other points she implies 
that changes in speech style could lead to 
changes in person perception large enough to 
be socially important. Whether the latter is 
the case would seem to depend in part on 
whether sex stereotypes are so powerful that 
most men would be seen as more assertive 
than most women, with speech style a rela- 
tively minor source of individual variation 
within sex. 

The present studies were designed to ex- 
amine the second and third assumptions re- 
garding the effect of linguistic form on person 
perception. Ratings were obtained of the assertiveness,
politeness, and warmth of male
and female speakers who used (or did not 
use) tag questions, qualifiers, and compound 
requests. The existence and nature of the 
hypothesized influence of the linguistic vari- 
ables on person perception could thus be as- 
sessed, and the size of any effect could be 
evaluated relative to the effect of sex of 
speaker. Warmth was included in part be- 
cause Lakoff alludes to women’s speech as 
sharing greater sympathy, friendliness, and 
person orientation than men’s and also be- 
cause McMillan et al. (1977) suggest that 
Lakoff has overemphasized the negative connotations
of women’s speech and underemphasized
its value in conveying interpersonal
sensitivity. It would have been interesting to 
obtain ratings of femininity also, since Lakoff 
maintains that women who deviate from fe- 
male style pay the price of being perceived as 
unfeminine. However, inclusion of such a
scale would probably have sensitized subjects 
to the variable of sex of speaker and the gen- 
eral purpose of the study, thereby creating 
demand characteristics that would have invalidated
results.

Following the rating task, subjects were 
also asked to estimate the relative frequency 
with which male and female speakers used 
tags, qualifiers, and compound requests. Siegler
and Siegler (1976) have found that when
asked to guess about sex, raters attribute tag
questions to women and strong assertions to
men. One purpose of the present studies was
to determine whether sex stereotypes affect
people’s abilities to judge frequency differences.
As Thorne (1976) suggests, if sex
stereotypes override what people actually
hear, Lakoff may herself have been biased in
assuming that there are linguistic differences
between male and female speech.

Having subjects rate spoken material leaves
[...] sex of speaker will be due to differences in
intonation, phrasing, or other vocal factors,
rather than to subjects’ sex stereotypes. Two
speakers of each sex were used in an attempt
to reduce the likelihood that effects of sex of
speaker would be attributed to one individual’s
vocal characteristics, but clearly both
males might differ from both females in some
crucial respects. Written presentation would
avoid the problem but would be a far less
naturalistic technique. In any case, in these
initial experiments the concern was to evaluate
the effects, if any, of a variable over
which people have some control (speech style)
as compared to the effects of a set of variables
over many of which they have no control
(their sex) or relatively little control (e.g.,
pitch of voice can be controlled, but only
within certain limits).


Experiment 1 
Method 
Subjects 


A total of 138 undergraduates, 75 males and 63 
females, participated in order to gain extra credit for 
a course in introductory psychology. Data from 58
subjects were later discarded to create an orthogonal
design (see Results section for justification). Subjects
were randomly assigned to one of 13 mixed-sex
groups. Each group listened to one of eight versions
of the stimulus material, with more than 8 groups
necessary in order to obtain a minimum of 5 subjects
of each sex listening to each version. The groups
ranged in size from 6 to 15 people. 


Stimulus Material


There were three linguistic variables of interest: tag
questions, qualifiers, and compound requests. For
each category, 16 speech segments (speech excerpts
that could be part of an ongoing conversation) were
written. Each was one to three sentences long and
could plausibly either include or omit the words or
phrases of interest. Items were chosen from a pool
written by four people, after first discarding any
items judged by any one of the four to involve a
sex-typed topic. All tag-question items involved an
inanimate third-person subject, such as a car, a
movie, or a building, about which some aesthetic or
practical judgment was being made. A sample item
is: “My typewriter’s finally been fixed. It sure took
long enough (didn’t it?).” Qualifier items used one
of four words or phrases: maybe, sort of, probably,
or I guess. A sample item is: “That’s (sort of) a
crazy reason for quitting school.” Compound requests
involved the insertion of either will you or
would you before a simple imperative. Neither
included the word please. A sample item is: “([...]
you) order me a pizza; I'll be there soon.” 1


1 A complete list of the 48 sentences used in this
experiment is available from the first author.

Two male and two female speakers recorded eight
versions of the 48-item script (16 items for each of 
the three linguistic variables). Speaker identity was 
treated as a separate factor in analysis, nested within 
sex of speaker. The speakers were in their twenties
and were chosen on the basis of having no accent 
to the Northeastern ear, typical voice pitch for their 
sex, and some acting ability, although none were pro- 
fessional actors. 

Eight versions were recorded so that each segment 
would be available in each of the two possible 
linguistic versions and would be recorded by each of 
the four speakers. The order of topics spoken about 
was constant across versions. Each of the linguistic 
forms (tag questions or corresponding assertion, qual- 
ified assertion or corresponding unqualified assertion, 
and compound request or correspondingly simple 
request) appeared once in each block of 6 segments, 
so that linguistic forms would be evenly distributed 
over the tape and the salience of each would be re- 
duced. Speakers were assigned to segments so that no 
more than 2 successive segments were recorded by the 
same speaker and so that each of the four speakers 
recorded each of the six linguistic types once in the 
first half and once in the second half of the 48-segment
tape.

Each subject thus heard, in a random order, 16 
simple assertions (which could be considered buffer 
items), 8 assertions with tag questions, 8 assertions 
with qualifiers (one of four words), 8 simple im- 
peratives, and 8 imperatives expressed as questions 
(one of two ways). This linguistic variety, combined 
with the variety introduced by the use of four 
speakers, reduces the likelihood that subjects noticed 
the linguistic variations and recognized that the study 
concerned reactions to these variations. 


Procedure 


Subjects were recruited for a study of “effective 
telephone communication.” On arrival, they were told 
that the experimenters were interested in how people 
can communicate effectively when speaker and listener 
cannot make normal use of nonverbal cues such as 
facial expressions and gestures. Subjects were told 
that they would listen to a tape of 48 short segments 
of conversation recorded by people simulating tele- 
phone communication and were informed: “After 
listening to each segment, you will be asked to give 
your impression of the person talking. Specifically, 
you will be asked to rate the person’s assertiveness, 
politeness, and warmth on a scale running from 1 to 
10.” Assertiveness was explicitly defined as “different 
from aggression—an assertive person doesn’t have to 
sound belligerent or mad. But they do sound definite 
about what they want, rather than unsure or tenta- 
tive.” Subjects then signed an informed consent form, 
noted their sex and group number on an answer sheet, 
and proceeded to listen to the tape and make ratings. 
The experimenter stopped the tape recorder after 
each segment and allowed all subjects as much time 
as they desired to make the ratings. 
Following this, another answer sheet was circulated,
on which tag question, qualifier, and compound re- 
quest were defined. The subject was asked to indicate 
for each form whether it had been used more by 
males, more by females, or equally by both. Subjects 
were then told the true purpose of the study and 
were given extra-credit cards. 


Design 

The ratings part of the study consisted of three 
parallel “studies,” of tag questions, qualifiers, and 
compound requests. Each consisted of a Sex of Subject
(2) × Sex of Speaker (2) × Linguistic Type (2)
× Identity of Speaker (2) × Sentence Topic (i.e.,
stimulus sentence) (16) multivariate analysis of
variance design with three dependent variables. All
independent variables except sex of subject were
manipulated within subjects, and identity of speaker
was nested within sex of speaker. The three dependent
variables were the ratings of assertiveness,
warmth, and politeness on a 10-point scale, with
higher ratings indicating higher levels of each quality.


Results 
Rating Data 


A total of 138 subjects participated in the 
study, but there were not equal numbers of 
each sex listening to each of the eight versions. 
The number of subjects of each sex per ver- 
sion ranged from 5 to 14, with a median of 8. 
With such a highly unbalanced design, it is 
likely that the order in which the variables 
were considered would seriously affect the results.
Therefore an orthogonal design was
created by discarding some subjects. Fifty- 
eight of the 138 subjects were randomly elim- 
inated to leave 5 subjects of each sex for each 
of the eight versions. The following analyses 
are based on data from 80 subjects, 40 males 
and 40 females. No effects of version of the 
tape were observed in preliminary analyses, 
and this factor was thus dropped in the 
analyses reported below. 

The three rating scales were correlated with 
each other (for assertiveness and politeness,
Pearson r = .27; for assertiveness and warmth,
r = .18; for politeness and warmth, r = .81),
and thus multivariate analyses were per- 
formed on these data. The analyses were per- 
formed using the analysis of variance proce- 
dure of the Statistical Analysis System (SAS) 
statistical package (Barr, Goodnight, Sall, & 
Helwig, 1976). 

Tag questions. The multivariate analysis
showed a significant effect of linguistic form,
Hotelling T² = 31.77, F(3, 76) = 10.32, p <
.0001, and of sentence topic, Hotelling-Lawley
trace = 2.64, F approximation (45, 224) =
4.38, p < .0001. Univariate analyses showed
that tag questions were seen as less assertive
(M rating = 6.1) than corresponding nontags (M
rating = 7.2), F(1, 78) = 20.27, p < .0001.
Sentence topic had a significant effect in all
three analyses, of assertiveness, warmth, and
politeness ratings. However, it is important to
note that this heterogeneity in stimulus materials
did not interact with or qualify any of
the variables of interest (all Fs < 1). The
only other significant univariate effect was
that female subjects rated speakers as warmer
than did male subjects, F(1, 78) = 4.49, p <
.05, with ratings of 5.3 from females and 4.9
from males. However, sex of subject was not
significant in the multivariate analysis or in
the following two analyses of qualifiers or
compound requests.
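
The relation between the Hotelling T² values and the F ratios reported in this section can be checked with a short calculation. The sketch below is not part of the original analysis, which was run in SAS; it assumes the standard exact conversion F = T²(v - p + 1)/(vp) for p dependent variables and v error degrees of freedom, and it assumes v = 78, a value inferred from the univariate F(1, 78) tests. The function name is hypothetical.

```python
def hotelling_t2_to_f(t2, n_dvs, error_df):
    """Convert Hotelling's T^2 to its exact F equivalent.

    Assumes the standard conversion
        F = T^2 * (error_df - n_dvs + 1) / (error_df * n_dvs)
    with (n_dvs, error_df - n_dvs + 1) degrees of freedom.
    """
    df2 = error_df - n_dvs + 1
    f_value = t2 * df2 / (error_df * n_dvs)
    return f_value, (n_dvs, df2)


# Linguistic-form effect for tag questions: T^2 = 31.77,
# three rating scales, error df assumed to be 78.
f_value, dfs = hotelling_t2_to_f(31.77, n_dvs=3, error_df=78)
print(round(f_value, 2), dfs)  # 10.32 (3, 76)
```

Under these assumptions the conversion returns the reported F(3, 76) = 10.32, which suggests that the degrees of freedom printed in the text are internally consistent.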

Qualifiers. [...] (M rating = 6.9) than
[...] 6.3), F(1, 78) = 6.90, [...]
was not due to idiosyncratic
characteristics of the voices or [...]
of one of the speakers is suggested by the absence
of any significant main effects or interactions
involving identity of speaker. However,
the strength of association, omega-square
(Hays, 1963), of linguistic form with assertiveness
ratings was .12, while the strength
of association for sex of speaker with assertiveness
ratings was only .01. (The strength
of association for the other significant effect,
sentence topic, was .09.)

Compound requests. The multivariate
analysis showed significant effects of linguistic
form, T² = 154.78, F(3, 76) = 50.27, p <
.0001; sex of speaker, T² = 11.21, F(3, 76) =
3.64, p < .05; and identity of speaker,
Hotelling-Lawley trace = .47, F approximation (6,
150) = 5.88, p < .0001. There was no significant
effect of sentence topic; apparently
the requests were perceived as more homogeneous
than the tag question or qualifier items.
Univariate analyses showed effects of linguistic
form on ratings of assertiveness, F(1, 78)
= 40.14, p < .0001; politeness, F(1, 78) =
50.50, p < .0001; and warmth, F(1, 78) =
70.31, p < .0001. Compound requests were
rated as less assertive (M rating = 6.5 vs. 8.2
for simple requests) but more polite (6.5 vs.
4.6) and warmer (6.0 vs. 4.0). Female
speakers were considered more polite than
males, F(1, 78) = 5.13, p < .05, with ratings
of 5.9 and 5.2, respectively. Females were also
considered warmer, F(1, 78) = 7.84, p < .01,
with ratings of 5.3 and 4.7, respectively. These
results, however, unlike the ones for qualifiers,
did seem to be due to specific characteristics
of one of the two male speakers. Identity of
speaker was a significant factor for ratings of
both politeness and warmth, F(2, 78) = 7.27,
p < .001, and F(2, 78) = 15.00, p < .0001,
respectively. One male speaker was seen as
much less polite and less warm than the other
three speakers. This male received a politeness
rating of 4.5, as compared to 6.0 for the
second male and 5.8 and 6.0 for the two females.
He received a warmth rating of 3.7,
as compared to 5.6 for the second male and
5.3 and 5.4 for the two females. There was
also, for the univariate analysis of the warmth
ratings, a significant interaction of identity of
speaker with linguistic form, F(2, 78) = 3.54,
p < .05. The differences between the “cold”
male speaker and the other three speakers
were especially pronounced when speakers
were using compound requests. However, the
difference between the two linguistic forms
appeared for all four speakers.


Table 1
Judgments of Relative Frequencies of Usage of Three Linguistic Forms by
Male and Female Speakers in Experiment 1

                         Tag questions          Qualifiers        Compound requests
Judgment              Males Females Total   Males Females Total   Males Females Total
Males use more          11     15     26      13     11     24      23     19     42
Males and females
  use equally           23     21     44      37     29     66      33     28     61
Females use more        42     26     68      26     22     48      20     15     35
Total                   76     62    138      76     62    138      76     62    138


Frequency Estimation Data 


The frequencies of usage of tags, qualifiers, 
and compound requests were actually equal 
for male and female speakers on the tape. 
Judgments of frequencies by all 138 subjects 
are shown in Table 1. (The 58 subjects dis- 
carded in the previous analyses were included 
here, since there was no need for an orthogonal
design in this analysis.) The significance
of these data was evaluated relative to a null
hypothesis that subjects do not have sex
stereotypes and thus judge the frequencies
equal except as affected by random error due
to factors such as lack of attention. Thus the
null hypothesis stated that the probability of
choosing males more would be equal to the
probability of choosing females more. A binomial
test rejected the null hypothesis of
equal probability for tag questions (z = 4.23,
p < .001) and for qualifiers (z = 2.71, p <
.01) but not for compound requests.
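
The binomial comparison described above can be illustrated with a brief calculation. The following is a minimal sketch rather than the authors' procedure: it assumes the test was carried out with the normal approximation to the binomial and a continuity correction (the article does not say), and it uses only the "males use more" and "females use more" totals from Table 1. The function names are hypothetical.

```python
import math


def binomial_z(k, n, p=0.5):
    """Normal approximation to the binomial test, with continuity correction.

    k is the count in one category, n the number of subjects who chose
    either category, and p the null probability of that category.
    """
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return (abs(k - mean) - 0.5) / sd


def two_sided_p(z):
    """Two-sided p value for a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2))


# (females-use-more, males-use-more) totals from Table 1.
counts = {
    "tag questions": (68, 26),
    "qualifiers": (48, 24),
    "compound requests": (35, 42),
}

for form, (females_more, males_more) in counts.items():
    z = binomial_z(females_more, females_more + males_more)
    print(f"{form}: z = {z:.2f}, p = {two_sided_p(z):.4f}")
```

With these assumptions the sketch gives z = 4.23 for tag questions and z = 2.71 for qualifiers, matching the reported values, and a nonsignificant z for compound requests. Subjects who judged usage to be equal do not enter the test.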

The biases in frequency estimation can be
interpreted either as reflecting sex stereotypes
or as being accurate reports of how subjects
have heard men and women talk in general, 
although not of how speakers talked on the 
experimental tape. The issue will be discussed 
more fully below. 

Inspection of Table 1 suggests that male 
subjects may be more prone to bias than fe- 
male subjects in making frequency judgments 
of tag questions, although not more prone to
bias in the case of qualifiers or compound
requests. However, a chi-square test carried
out on the 2 × 3 table of responses for tag
questions was not significant, χ²(2) = 3.08.
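
The chi-square test on the tag-question judgments can be reproduced from the cell counts in Table 1. The following is a minimal sketch, assuming an ordinary Pearson chi-square test of independence with no continuity correction; the function name is hypothetical, and the critical value in the comment is the standard tabled value for 2 degrees of freedom at the .05 level.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and df for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Tag-question judgments from Table 1.
# Rows: male subjects, female subjects.
# Columns: males use more, equal use, females use more.
tag_judgments = [
    [11, 23, 42],
    [15, 21, 26],
]

stat, df = pearson_chi_square(tag_judgments)
print(f"chi-square({df}) = {stat:.2f}")  # chi-square(2) = 3.08
# The .05 critical value for df = 2 is 5.99, so the test is not significant.
```

The statistic agrees with the reported value, so the apparent difference between male and female subjects in Table 1 is within what chance would produce.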


Discussion 


Experiment 1 provides substantial support 
for several of Lakoff’s intuitions regarding the 
effect of speech styles on person perception.
Tag questions, qualifiers, and compound re- 
quests all decrease the perceived assertiveness 
of speech. Tags and qualifiers do not seem to 
increase perceived politeness, but use of qual- 
ifiers leads to higher ratings of warmth. Com- 
pound requests correspond most closely to 
Lakoff’s hypotheses, in that their use increases 
ratings of both politeness and warmth as well 
as decreasing ratings of assertiveness. 

Only one instance of an effect of sex of 
speaker on ratings was found in these data, 
that occurring for ratings of assertiveness of 
qualified and unqualified assertions. This effect
was small relative to the corresponding
effect of linguistic form: Presence or absence 
of a qualifier made more difference in ratings
of assertiveness than did the sex of the 
speaker. An implication is that, when men and 
women do differ in frequency of usage of these 
three linguistic forms, the contribution to the 
support of sex stereotypes could be substan- 
tial. Conversely, men and women could change 
the way in which they are perceived by chang- 
ing their speech style. However, several reser- 
vations concerning these conclusions spring to 
mind, perhaps most notably the fact that Ex- 
periment 1 used college students as subjects. 
Perhaps an older or less well educated sample
would show stronger sex stereotypes. For this 
reason, Experiment 2 was conducted as a rep- 
lication of Experiment 1, using a different 
subject population. 


Experiment 2 
Method 
Subjects 


A total of 39 female secretaries employed at the
university participated in return for a $2 reimbursement.
Subjects were recruited through friendship networks.
That is, we contacted secretaries known to us
or introduced to us by a faculty member known to us
and asked them [...]

[...] listening to each of eight tapes, were retained. These
women ranged in age from 18 to 58 years, with 7
under 20 years old, 12 aged 20-29 years, 8 aged 30-39
[...] 40 years or over. The number [...]


Stimulus Material and Design 


These were as in Experiment 1, except that the 
design did not involve the factor of sex of subject. 


Procedure 
Results
Rating Data 


As in Experiment 1, no effects of version of
the tape were observed in preliminary analyses.

Tag questions. Although the multivariate
analysis did not show a significant effect of
linguistic form, there was a trend in the univariate
analysis for tag questions to be rated
less assertive than nontags, F(1, 31) = 3.37,
p < .10. The mean assertiveness ratings were
6.6 for tags and 7.4 for nontags. There was
again a significant effect of sentence topic,
Hotelling-Lawley trace = 2.93, F approximation
(45, 83) = 1.80, p < .05. Sentence topic
was a significant factor in univariate analyses
of warmth and politeness, but no interactions
with other variables were significant. Sex of
speaker did not approach significance in any
analyses.

Qualifiers. The multivariate analysis
showed a significant effect of linguistic form,
T² = 11.99, F(3, 29) = 3.74, p < .05. Qualified
speech was seen as less assertive (M =
6.4) than nonqualified speech (M = 7.6),
F(1, 31) = 8.23, p < .01, although not as
warmer, as it had been in Experiment 1.
Neither sentence topic nor sex of speaker was
a significant effect.

Compound requests. The only significant
effect in the multivariate analysis was that of
linguistic form, T² = 35.92, F(3, 29) = 11.20,
p < .0001. Compound requests were rated
less assertive than simple requests (7.2 vs.
8.5) but also warmer (6.6 vs. 4.6) and more
polite (6.7 vs. 5.2). In the univariate analyses
corresponding to these comparisons, F(1,
31) = 14.93, p < .001; F(1, 31) = 16.28,
p < .001; and F(1, 31) = 8.14, p < .01. The
univariate analysis of the warmth ratings
showed a trend to an effect of sex of speaker,
F(1, 31) = 3.48, p < .10, but as in Experiment
1, this was qualified by a trend to an
identity of speaker effect, F(2, 31) = 2.93,
p < .10. Inspection of means again showed
that one male speaker was rated less warm
than the other male, who was comparable to
the two females.


Comparison with Experiment 1. A quantitative
assessment of the apparent similarity
between Experiments 1 and 2 was undertaken
by analyzing both studies together, including
the experiment number as an independent
variable with two levels and including all
interactions between the experiment variable
and the other independent variables. Male
subjects from Experiment 1 were not included
in the analysis, and 1 female subject was
randomly dropped from each version of Experiment
1, so that both experiments would be
equally weighted in the analysis (32 female
subjects in each experiment).

For tag questions and qualifiers, multi-
variate analyses of variance showed no sig-
nificant main effects of experiments. The anal-
ysis for compound requests did produce a
marginally significant main effect, T² = 8.27,
F(3, 61) = 2.67, p = .0545. Univariate analy-
ses indicated that the secretaries in Experi-
ment 2 gave speakers higher warmth ratings
than did the female students in Experiment 1,
F(1, 63) = 4.28, p < .05, and tended to give
higher assertiveness ratings, F(1, 63) = 3.39,
p < .10. However, no significant interactions
involving the experiment variable were found 
in any of the three multivariate analyses. 
Thus the findings do not depend on which 
subject population is being considered. 


Frequency Estimation Data 


Judgments of the frequencies with which 
male and female speakers used tag questions, 
qualifiers, and compound requests are shown 
in Table 2. There is, again, no bias in estima- 
tions for compound requests. There are ten- 
dencies to biases in estimations for tag ques- 
tions and qualifiers, although the tendency is 
not very marked for tag questions, and bi- 
nomial tests are not significant in either case. 
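
The binomial tests mentioned above are not spelled out in the text. The following is a minimal sketch of one way such a test could be run, assuming that each test compared the "males use more" and "females use more" judgments from Table 2 against an even split and set the "equally" judgments aside. How ties were actually handled is not stated, so that choice is an assumption, and the function name is hypothetical.

```python
from math import comb


def two_sided_binomial_p(k: int, n: int) -> float:
    """Exact two-sided binomial p-value for k successes in n trials at p = .5."""
    lower = sum(comb(n, i) for i in range(0, k + 1)) / 2 ** n  # P(X <= k)
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n  # P(X >= k)
    return min(1.0, 2 * min(lower, upper))


# Directional judgments from Table 2: (males use more, females use more).
# "Equally" judgments are left out (an assumption, see the lead-in).
table_2 = {
    "tag questions": (8, 11),
    "qualifiers": (5, 10),
    "compound requests": (8, 8),
}

for form, (males_more, females_more) in table_2.items():
    n = males_more + females_more
    p = two_sided_binomial_p(females_more, n)
    print(f"{form}: {females_more} of {n} chose 'females use more', p = {p:.3f}")
```

Under these assumptions none of the three p-values falls below .05, which agrees with the statement that the tests were not significant.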


Discussion 


Experiment 2 substantially replicated the 
effects of linguistic form obtained in Experi- 
ment 1, despite the reduction in power due to 
the reduction in sample size. No hint was ob-
tained that sex of speaker was a significant
factor for this subject population. Thus, the
idea that an older or less well educated sample
would have stronger sex stereotypes than the
undergraduates and would thus rate male and
female speakers very differently was not sup-
ported.


Table 2
Judgments of Relative Frequencies of Usage
of Three Linguistic Forms by Male and
Female Speakers in Experiment 2

                                   Tag                     Compound
Judgment                        questions    Qualifiers    requests

Males use more                      8             5            8
Males and females use equally      13            17           16
Females use more                   11            10            8

Total                              32            32           32


General Discussion 


The results of these two studies indicate 
that, as Lakoff has suggested, variations in the
use of tag questions, qualifiers, and compound 
request forms can affect how people are per- 
ceived. The sex of the speaker had small and 
largely nonsignificant effects on ratings. How- 
ever, before concluding that sex differences in 
speech style could contribute to the support of 
sex stereotypes and that changes in speech 
style might allow men and women to modify 
how they are perceived, we should consider 
the fact that the frequency estimation data 
from both studies can be interpreted as dem- 
onstrating the existence of sex stereotypes. 
Subjects are probably aware that they have 
been rating tag questions and qualified speech 
as less assertive than declaratives. They may 
therefore guess that the female speakers use 
these forms more frequently because they 
have a stereotype of women as less assertive 
than men. 

This way of accounting for the frequency 
estimation data implies a contrast between 
behavior in specific situations with specific 
individuals (i.e., rating the speech of a partic- 
ular speaker on a particular topic) and be- 
havior in more abstract situations involving 
thought about a class of individuals (i.e., the
reasoning outlined above for the frequency
estimation task). Judgments about concrete
instances of behavior may be less influenced 
by sex stereotypes than judgments regarding 
a general class of behaviors. A woman speak- 
ing in a direct and self-confident style might 
thus be seen as self-confident, but the same 
individuals who saw her this way might still 
maintain that women in general are not very 
confident and do not speak assertively. How- 
ever, the cumulative impact of judgments of 
the self-confidence of individual women might 
be to change general stereotypes. 

There is, however, an alternate interpreta- 
tion of the distortion in frequency estimation. 
In guessing about frequencies of use on the 
tape, subjects may unconsciously average 
speech they have heard in real-world contexts, 
rather than considering only the experimental 
tape. Of course, if this second idea is correct 
and frequency estimations reflect unconscious 
averaging of speech heard by subjects outside 
the experiment, the interpretation of the data 
from the experiment as a whole is more 
straightforward. Men and women do speak
differently, at least in certain contexts, in 
ways that affect how they are perceived. 
Changing one’s speech style could potentially 
change these perceptions. 

An important question in discussing sex- 
related variation in speech styles has been 
whether the important dimension governing 
choice of speech style may be status rather 
than sex. Thus, it has sometimes been sug-
gested that “female speech style” is really the
language of the powerless or the lower in
status rather than the language of women,
and that women high in status or power do
not use female speech style, although low-
status men do use this style (Dubois &
Crouch, 1975; Erickson et al., 1978; Key,
1972; O’Barr & Atkins, in press). Other [...]
work, for instance that of Ervin-Tripp (1976)
on imperatives, supports the idea that status
difference [...].
Lakoff (1977) disputes the “language of the
powerless” [...] women with
power still speak in female style, that some
men with power (i.e., academics and artistic-
literary men) also speak this way, and that
minorities (e.g., blacks) who lack power do
not speak this way. Crosby and Nyquist’s
(1977) finding that role and sex both condi-
tion the use of “female register” but do not
interact partially supports Lakoff’s argument.
Status (or role) does make a difference, but
within roles, females differ from males.
While empirical work is necessary to eval- 
uate these claims and resolve the issue, it may 
be helpful to clarify the question being asked 
by contrasting two views of the relationship 
of sex and status to linguistic variation. A
straightforward “status” hypothesis claims 
that, while sex and status are confounded in 
the real world, all speakers control all forms 
and can switch easily and flexibly as a func- 
tion of a particular status situation. Con- 
versely, listeners expect such context-based 
variation and are comfortable with it; the sex 
of the person holding a particular status posi-
tion is irrelevant. A contrasting view holds
that the real-world correlation of sex and 
status results in relatively stable speech pat- 
terns and listener expectations not so easily 
changed as a function of situation. These 
might result in part from early socialization, 
since mothers and fathers and male and fe-
male day-care teachers model sex-differenti-
ated language in speech to preschoolers (Glea-
son & Weintraub, 1978) and since work on
imperatives suggests that situation-based
variation in commanding and requesting is
well established at least by 4½ or 5 years of
age (James, Note 2) and perhaps as early as
2½ or 3 years (Bates, 1976; Ervin-Tripp,
1976, 1977; Newcombe & Zaslow, Note [?]).
How early one’s own sex is used as a basis for
choosing linguistic forms is not known, how-
ever. The ability to use linguistic cues to infer
sex of speaker undergoes considerable change
during the elementary school years (Edelsky,
1977). If speech patterns and listener expecta-
tions are relatively stable, high-status women
might well be in the double bind described by 
Lakoff (1975): branded unfeminine and un- 
likeable if they adopt male style and vague 
and frivolous if they adopt female style.


Effects of Resource Availability and Importance of Behavior
on the Experience of Crowding


Richard McCallum, Caryl E. Rusbult, and George K. Hong
University of North Carolina, Chapel Hill

Tedra A. Walden
University of Florida

John Schopler
University of North Carolina, Chapel Hill


According to the interference formulation, participants in a crowded setting will 
experience interference to the extent that behavioral goals conflict with envi- 
ronmental conditions. The importance of the behavioral goals directly affects 
not only the magnitude of the interference but also the mechanism by which 
people cope with interference. It was reasoned that important goals would in- 
duce a more active coping strategy in a crowded setting than in an uncrowded 
setting and would maintain task performance at the price of increasing crowd- 
ing stress. When the behavioral goal is unimportant, decrements in task per- 
formance preclude a rise in stress. A laboratory study manipulated group size, 
in order to vary the availability of resources, and the importance of the task 
behavior. The predictions were confirmed, and partial confirmation was ob- 
tained for predictions involving the effects of the internal-external personality 
dimension. The meaning of the results is discussed in terms of other findings
in the literature on crowding and the mediating role of the type of mechanism
used to cope with interference.


A number of variables have been said to 
mediate between conditions of high physical 
density and the experiences of subjective 
crowding and crowding stress. The variables 
include excessive stimulation from social 
sources (Desor, 1972), excessive and un- 
wanted social interactions (Valins & Baum, 
1973), restricted behavioral choice (Proshan- 
sky, Ittelson, & Rivlin, 1972), decreased in- 
teraction distance (Worchel & Teddlie, 1976), 
and breakdowns in privacy-regulating mech- 
anisms (Altman, 1975). In an attempt to sub- 
sume these mediating factors under a single 
generic variable, Schopler and Stockdale 
(1977) suggested that interference with goal- 
directed behavior sequences is the crucial 
mediator between density and the experience 
of crowding stress. 

The behavioral interference formulation is a
sequential model that views physical density
as a necessary but not sufficient condition for
the occurrence of crowding stress. When the
presence of other people in large numbers or
in close physical proximity leads to interfer-
ence with goal-directed behavior, stress is ex-
perienced. The degree of stress experienced
in a given setting is determined by several
factors that affect the impact of interference
produced in the setting. These include the
anticipated duration of the interference, the
importance of the vulnerable response se-
quences, and personality factors related to
anticipated control. This framework accounts
for interference in terms of the reward-cost
ratio associated with particular behaviors in a
given setting. The effect of density-induced
interference is to increase the costs of enacting
the desired behavioral sequences. This in-
creased cost may be manifested in various
forms, such as annoyance, embarrassment,
anxiety, or increased effort for making ap-
propriate responses. Implicit in this formula-
tion is the assumption that crowding stress is
jointly determined by the physical and social
characteristics of the setting and the nature
of the behavioral goals set by the actor. Inter-
ference is experienced to the extent that be-
havioral goals conflict with environmental
conditions.


Requests for reprints should be sent to John
Schopler, Psychology Department, University of
North Carolina, Chapel Hill, North Carolina 27514.

Schopler, McCallum, and Rusbult (Note 1) 
provided evidence for the relationship between 
behavioral interference and the subjective ex- 
perience of crowding. Subjects in this study 
worked on a group decision task under condi- 
tions of either high or low density. This factor 
was crossed with a manipulation that imposed 
structure on the interactions of half of the 
groups. The imposition of structure reduced 
the potential for interference in the high- 
density setting by providing for the coordina- 
tion and synchronization of responses among 
group members. Consistent with the interfer- 
ence model, subjects in the high-density condi- 
tions perceived less interference and reported 
less subjective crowding with the imposition of 
structure. Similar results have been reported
by Baum and Koman (1976). These investi- 
gators manipulated anticipated group size and 
found that subjects expecting to meet with a 
large group felt less crowded when anticipat- 
ing structured interactions than when no 
structure was expected. 

The present experiment was designed to test 
the interference model prediction regarding 
the relationship between crowding stress and 
the importance to the actor of vulnerable re-
sponse sequences. One direct way in which
high-density settings produce interference 
with goal-directed behaviors is to place limita- 
tions on access to available resources. The 
term resource refers to those features of the 
physical/social environment that are neces- 
sary for the attainment of behavioral goals. 
In natural settings, resources may include 
doorways, seating, clear lines of vision or 
movement, physical or verbal access to others, 
adequate materials for task completion, and so 
forth. The designation of a given environ- 
mental feature as a resource requires specifica- 
tion of the particular behaviors enacted in the 
setting. Consequently, the setting features 
required as resources will vary not only from
setting to setting but also across individuals 
within the same setting and across time for a 
given individual. Despite these potential
variations, it is evident that conditions of
resource scarcity should produce interference
and consequent stress.¹ The first hypothesis
of the present study was merely that indi- 
viduals provided with inadequate resources to 
complete required tasks would report greater 
subjective crowding and crowding stress and 
would perform less well than individuals pro-
vided with sufficient resources. This hypoth-
esis is consistent with many points of view,
including the interference formulation. Pre- 
dictions regarding the effects of resource avail- 
ability and importance upon actual task per- 
formance and crowding stress, however, do 
stem directly from the interference formula- 
tion. These effects depend upon the strategy 
the participants select to cope with interfer- 
ence and their success in maintaining task 
performance. It is useful to distinguish two 
general coping strategies for dealing with 
interference in high-density settings. One 
strategy involves active intervention of the 
type required for coordination and synchro- 
nization of responses, whereas another in- 
volves passive withdrawal from interaction, 
altering behavioral goals, and lowering expec- 
tations (Thibaut & Kelley, 1959). Either of 
these general strategies may be successful in 
reducing the degree of conflict between be- 
havioral goals and environmental conditions 
and may consequently reduce crowding stress. 
To the extent that the behavioral sequences 
to be enacted in a setting are important to the 
individual, however, passive withdrawal and 
lowered expectations should become less at- 
tractive as coping strategies. A second predic- 
tion was that given low importance, the 
scarce-resources condition would evidence a 
performance decrement relative to the ade-
quate-resources condition. The average effect
of resource availability upon task performance
predicted by Hypothesis 1 is therefore mod-
ified by the interaction contrasts described by
Hypothesis 2.


¹ The availability of resources may be manipulated
in an experimental paradigm by holding group size
constant and varying resources or by holding re-
sources constant and varying the number of people
who must compete for them. Group size was manip-
ulated in the present experiment. Because the func-
tional importance of group size, in our experimental
setting, was in relation to the availability of task
resources, we have labeled the manipulation resource
availability throughout the article, rather than group
size or density.

The interference formulation predicts that 
crowding stress increases in magnitude with 
increases in the importance of vulnerable re- 
sponse sequences. This effect of course as- 
sumes that the conditions expected by Hy- 
pothesis 2 will pertain, that is, that raising 
the importance of the goal will induce an ac- 
tive coping strategy even in the face of high 
interference. Under these restricted condi- 
tions, variations in resource availability 
should have an effect when behavioral goals 
are important but not when behavioral goals 
are unimportant. Specifically, among indi- 
viduals working for important goals, subjects 
given inadequate resources should experience 
more crowding stress than should subjects 
given adequate resources. In contrast, individ-
uals working for unimportant goals should be
less affected by variations in resource avail- 
ability. The average effect of resource scarcity 
upon crowding stress predicted by Hypothesis 
1 is therefore modified by the interaction con- 
trasts described by Hypothesis 3. 
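
The interaction contrast in Hypothesis 3 can be illustrated with simple arithmetic on cell means. The sketch below uses the four self-reported stress means given in the article's table of mean values; the function names are hypothetical, and the calculation is descriptive only. It is not the multivariate test the authors report.

```python
# Cell means for the self-reported "Stress" item, copied from the article's
# table of mean values. Keys: (importance of behavior, availability of resources).
stress_means = {
    ("high", "scarce"): 4.77,
    ("high", "adequate"): 3.13,
    ("low", "scarce"): 3.90,
    ("low", "adequate"): 3.13,
}


def scarcity_effect(means: dict, importance: str) -> float:
    """Simple effect of scarce versus adequate resources at one importance level."""
    return means[(importance, "scarce")] - means[(importance, "adequate")]


def interaction_contrast(means: dict) -> float:
    """Scarcity effect at high importance minus scarcity effect at low importance."""
    return scarcity_effect(means, "high") - scarcity_effect(means, "low")


print("high importance:", round(scarcity_effect(stress_means, "high"), 2))
print("low importance:", round(scarcity_effect(stress_means, "low"), 2))
print("interaction contrast:", round(interaction_contrast(stress_means), 2))
```

A positive contrast means that resource scarcity raised reported stress more when the goal was important, which is the direction Hypothesis 3 predicts.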

In addition to variables associated with the 
setting characteristics and the nature of the 
participants’ behavioral goals, the present ex- 
periment investigated the effect of a personal- 
ity variable that is conceptually related to the 
interference analysis. Stockdale (1978) has 
emphasized the importance of the perception 
of environmental control in determining the 
degree of actual or anticipated interference 
experienced in high-density settings. To the 
extent that the potential for exerting control 
over the social environment (through co- 
ordination and synchronization of behaviors) 
is salient to participants, the negative effects 
of interference on subjective crowding and 
crowding stress should be mitigated. Individ- 
uals who typically anticipate a high degree of 
personal control in social situations should 
therefore experience fewer negative effects of 
density-induced interference. A scale designed 
to assess individual expectancy of internal or 
external control over the social environment 
(Schopler, Langmeyer, Stokols, & Reisman, 
1973) was employed to locate subjects on this
personality dimension. The fourth hypothesis
was that within the scarce-resources (high
interference) condition, individuals identified
as externals should experience greater crowding
and crowding stress than should internals. No
internality-externality differences in subjec- 
tive crowding and crowding stress were pre- 
dicted within the adequate-resources condition 
because the existence of adequate resources 
should obviate the salience of control con- 
siderations. Such results would be consistent 
with the findings that externals report more 
subjective crowding under high-interference 
conditions and respond more favorably than 
do internals to manipulations designed to re-
duce interference experienced in high-density
situations (Schopler, McCallum, & Rusbult, 
Note 1). These results parallel the findings of 
research that directly manipulated perceived 
control. For example, Glass and Singer (1972) 
found fewer behavioral effects and aftereffects 
of another environmental stressor (noise) 
when subjects believed they controlled the
onset of the aversive stimulus, and Sherrod 
(1974) demonstrated that subjects who con- 
trolled when they could leave a crowded set- 
ting showed greater tolerance for frustration 
on a postcrowding task.

Because the interference formulation and
many other conceptions of crowding specu- 
late about the arousal and reduction of 
crowding stress, identification of an adequate 
measure of this kind of stress is of general 
importance. The obvious limitations of self- 
reports of stress, among other considerations, 
have led some investigators to seek physio- 
logical indices. Although it is by no means 
evident that a variable as complex as crowd- 
ing stress will have a reliable physiological 
counterpart, the literature contains some 
suggestions, such as the Palmar Sweat In- 
dex (PSI; Johnson & Dabbs, 1967) or blood 
pressure (D’Atri, 1975). We undertook to 
explore the adequacy of PSI as a measure of 
crowding stress in the present study. 


Method 


Subjects 


Seventy-two males and 72 females participated
in the experiment in partial fulfillment of the
requirements for an introductory psychology course.
Subjects were assigned to 16 same-sex six-person and
16 same-sex three-person groups. Four six-person
and four three-person groups of each sex were
randomly assigned to two importance conditions.


Procedure 


The study employed four independent variables 
in a complete 2 × 2 × 2 × 2 factorial design: avail-
ability of resources (scarce or adequate), importance
of behavior (high or low), locus of control (in- 
ternal or external subject orientation), and sex of 
subject. 

Upon arrival at the laboratory, subjects were 
escorted to a waiting room by one experimenter 
who was of the same sex as the subjects and were 
given numbered clipboards. (Two women and two 
men served as experimenters for the study.) The 
experimenter briefly described the procedure as a 
simulation of working conditions in a large busi- 
ness organization. The subjects were told that they 
would be asked to perform clerical work under con- 
trolled conditions so that factors affecting work 
quality and job satisfaction could be evaluated. 
Subjects wishing to terminate participation in the 
experiment without penalty were given the oppor-
tunity to do so at this point. The experimenter then
administered the North Carolina Internal-External 
Scale (Schopler et al., 1973). Instead of categor- 
izing individuals relative to the sample median, 
locus of control assignments were determined for 
each subject on the basis of the standardization 
mean (if his/her score was 72 or lower, the sub- 
ject was assigned internal status, and if his/her 
score was 73 or higher, the subject was assigned 
external status). 
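
The assignment rule just described is a fixed cutoff rather than a median split. A minimal sketch of it follows; the constant and function names are hypothetical, and the example scores are invented for illustration.

```python
# Cutoff taken from the text: 72 or lower is internal, 73 or higher is external.
INTERNAL_MAX_SCORE = 72


def locus_of_control(score: int) -> str:
    """Classify a North Carolina Internal-External Scale score by the fixed cutoff."""
    return "internal" if score <= INTERNAL_MAX_SCORE else "external"


# Hypothetical scores, not data from the study.
for score in (65, 72, 73, 80):
    print(score, locus_of_control(score))
```

Because the cutoff comes from the standardization mean, the two categories need not contain equal numbers of subjects in any given sample.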

In order to obtain a measure of physiological 
stress (the Palmar Sweat Index) without arousing
subject suspicion as to its purpose, the experimenter 
requested the subjects’ cooperation in helping an- 
other graduate student (the second experimenter 
present at the session) pretest for an experiment 
he/she was preparing to run. The second experi- 
menter stated that he/she wanted to pilot test a 
measure of concentration called the “PCI” and 
explained that in order to assess the stability of 
the measure across various activity levels, several 
measures would be obtained throughout the ex-
perimental session. The PSI was obtained by paint-
ing a special chemical on the fingertip and trans-
ferring the print to a slide with a piece of
transparent tape. After demonstrating the administra-
tion of the PSI on the primary experimenter, the
second experimenter obtained a baseline
sweat measure from each subject.

The primary experimenter then accompanied the
group to a mock office containing six (or three)
chairs placed in a semicircle facing three standard
four-drawer file cabinets. Subjects were instructed to
occupy the chair that corresponded to their respec-
tive clipboard numbers. Availability of resources
was manipulated by variations in the size of the 
group. The resources available for work on the 
task (such as physical space and equipment) re- 
mained constant across all conditions. The mock 
office itself measured 2.86 m by 4.11 m and con- 
tained the subjects’ chairs as well as the three file 
cabinets placed against one wall and spaced 60 cm 
apart. While these resources were sufficient for 
three-person groups, they were insufficient when 
divided among the members of larger groups. That 
is, while members of smaller groups seldom ex- 
perienced difficulty in obtaining access to the files, 
resources were insufficient and led to problems of 
interference and a greater need for coordination of 
behaviors in the larger groups. 

The experimenter distributed a different list of 
names to each subject and explained the experi-
mental task. The file cabinets contained a set of
alphabetically ordered folders, and the subjects’ task 
was to locate the file for each name on their lists 
and to record the home address listed for each of 
those names. Subjects were told that they would 
receive “credit” for names only if their records of 
each address corresponded exactly to that on the 
experimenter’s master list. Subjects were also asked 
to complete the names in the order in which they 
appeared on the lists.

Subjects in the low-importance condition were in- 
structed to work as quickly and as carefully as 
possible and were told that individuals were typi- 
cally able to complete from 20 to 60 names during 
the experiment. The other half of the subjects, those 
in the high-importance condition, were told that 
in order to simulate piecework schedules they would 
be paid 15 cents for each name they located and 
correctly identified. These subjects were informed 
that typical earnings in the experiment ranged from 
$2 to $8, the average being about $5. The offer of 
monetary incentives contingent on high quality and 
quantity performance was intended to increase the 
relative importance of task-related behaviors. 

After 5 minutes subjects’ task work was inter-
rupted to obtain a second PSI reading from each
subject. Subjects then resumed work for an addi-
tional 5 minutes, after which the primary experi-
menter administered the experimental questionnaire.
Subjects were then informed that the task work
was completed, were escorted back to the waiting
room, and were thoroughly debriefed. Subjects in
the high-importance condition were paid $5 each
for participation in the experiment (this amount
was greater than that actually earned by any sub-
ject).
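
The piecework arithmetic behind the incentive can be sketched as follows, using the 15-cent rate and the flat $5 payment stated above. The function names are hypothetical, and amounts are kept in whole cents to avoid rounding problems.

```python
RATE_CENTS = 15            # promised payment per correctly recorded name
FLAT_PAYMENT_CENTS = 500   # amount actually paid to each high-importance subject


def piecework_earnings_cents(correct_names: int) -> int:
    """Earnings under the promised piecework schedule, in cents."""
    return RATE_CENTS * correct_names


def max_names_below_flat_payment() -> int:
    """Largest number of correct names whose earnings stay under the flat payment."""
    return (FLAT_PAYMENT_CENTS - 1) // RATE_CENTS


print(piecework_earnings_cents(20))       # earnings for 20 correct names
print(max_names_below_flat_payment())     # upper bound implied by the $5 payment
```

Since the flat payment exceeded what any subject earned, the second function gives the upper bound on any individual's correct count that this statement implies.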


Dependent Variables 


The experimental dependent variables comprised 
four separate conceptual groupings. The first group 
was designed to measure crowding stress and in- 
cluded ratings of subjects’ reported stress, per-
sonal feelings, attitudes toward other group mem-
bers, and intentions concerning task performance. 
The second set consisted of measures of subjective 
crowding, whereas the third and fourth sets con- 
sisted of single measures—task performance and 
Palmar Sweat Index, respectively. Items were also 
included to evaluate the effectiveness of the ma- 
nipulations of availability of resources and im- 
portance of the behavior. 

Nine items were included in the crowding stress 
group. Affective reactions were assessed by requir- 
ing the subjects to rate both their “normal” feel- 
ings and their feelings during the experiment on 
four pairs of 9-point semantic differentials (pleasant- 
unpleasant, cooperative-uncooperative, nervous- 
calm, friendly-unfriendly). These ratings were then 
converted to difference scores (experimental rating 
less normal rating) to facilitate statistical analyses. 
Subjects also rated the degree to which the ex- 
perience of working in the experimental setting 
was stressful, the extent to which task work was 
distressing compared to the subject's experience 
during a typical day, the extent to which their 
expectations concerning task performance decreased 
during the task, the difficulty of coordina-
tion with other group members, and the extent to
which others interfered with task work. 
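
The difference scores described above (experimental rating less normal rating) can be sketched as below. The ratings are invented for illustration, the function name is hypothetical, and the direction of each 9-point scale is an assumption, since the text does not say which pole received the high score.

```python
SCALES = (
    "pleasant-unpleasant",
    "cooperative-uncooperative",
    "nervous-calm",
    "friendly-unfriendly",
)


def difference_scores(experimental: dict, normal: dict) -> dict:
    """Return experimental rating minus normal rating for each semantic differential."""
    return {scale: experimental[scale] - normal[scale] for scale in SCALES}


# Hypothetical 9-point ratings for one subject, not data from the study.
normal = {
    "pleasant-unpleasant": 3,
    "cooperative-uncooperative": 2,
    "nervous-calm": 6,
    "friendly-unfriendly": 2,
}
experimental = {
    "pleasant-unpleasant": 5,
    "cooperative-uncooperative": 3,
    "nervous-calm": 4,
    "friendly-unfriendly": 2,
}

print(difference_scores(experimental, normal))
```

A score of zero means the subject felt the same during the experiment as normally, so each subject serves as his or her own baseline.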

Subjective crowding was assessed by six 9-point
bipolar scales. These items asked: “Is there (too
much-too little) room for you to work on the
task?” “To what extent do you feel (extremely
crowded-extremely uncrowded) by the other group
members?” “Is there (too much-too little) room
for you to feel comfortable while working on the
experimental task?” “Do you feel (extremely un-
confined-extremely confined) physically while
working on the task?” “Do working conditions in
this room make task work (much harder-much
easier)?” “To what extent is the experimental room
(congested-uncongested)?”

Task performance was evaluated by simply count- 
ing the number of items correctly completed by each 
subject. A baseline and task-related PSI score was 
computed for each subject. Two raters ignorant of 
the experimental conditions scored the PSI mea- 
sure by counting the number of open sweat pores 
within a standard 4-mm square area for both the
baseline and the task-related measures. Scores of
the two raters were reliable and yielded a correla- 
tion of .97. 
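
The text does not say which correlation coefficient produced the value of .97. As a minimal sketch, assuming a Pearson product-moment correlation between the two raters' pore counts, the check could look like this. The counts are invented for illustration and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(x: list, y: list) -> float:
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(variation_x * variation_y)


# Hypothetical open-pore counts from two raters scoring the same six prints.
rater_a = [52, 61, 47, 58, 66, 49]
rater_b = [50, 63, 48, 57, 68, 47]

print(round(pearson_r(rater_a, rater_b), 2))
```

A high coefficient shows that the raters ordered the prints similarly; it does not by itself show that their absolute counts agreed.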

Two 9-point bipolar scales assessed the effective- 
ness of the availability-of-resources manipulation 
and required that subjects estimate the extent to 
which they would have performed better had they
been alone in the room and indicate whether there 
were too many people in the room. 

Four 9-point bipolar scales assessed the effective- 
ness of the manipulation of the importance of the 
behavior. These measured liking for the task, im-
portance of task performance, difficulty of the task,
and amount of effort expended on task work.


Results


Manipulation Checks


Except where otherwise indicated, the in-
dividual responses on each dependent mea-
sure were averaged and the group mean em-
ployed as the unit of analysis, avoiding
problems associated with possible response de- 
pendence within experimental groups. To as- 
sess the effectiveness of the availability-of- 
resources manipulation, a two-factor analysis 
of variance was performed on the measures 
associated with this factor. Subjects in the 
large groups were more likely than those in 
small groups to report that they would have 
performed better had they been alone in the 
room and that there were too many people 
in the room. The means on the ability-to-
perform measure were 8.07 for the scarce-
resources condition and 7.44 for the normal-
resources condition, and the respective means
for the number-of-people measure were 7.20 
and 6.23, multivariate F(2, 27) = 16.92, p 
< .001. Thus it appears that the manipula- 
tion of availability of resources was success- 
ful. 
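
Using the group mean as the unit of analysis can be sketched as follows. The responses are invented for illustration and the function names are hypothetical. The degrees-of-freedom check assumes 32 groups (144 subjects in six-person and three-person groups) in a 2 × 2 layout of resources by importance, which gives the error term of 28 that appears in the univariate F(1, 28) reported for task performance.

```python
from statistics import mean


def group_means(responses_by_group: dict) -> dict:
    """Collapse individual responses to one mean per group (the unit of analysis)."""
    return {group: mean(values) for group, values in responses_by_group.items()}


def error_df(n_groups: int, n_cells: int) -> int:
    """Within-cell error degrees of freedom when groups are the observations."""
    return n_groups - n_cells


# Hypothetical 9-point ratings from one six-person and one three-person group.
responses = {
    "six_person_group": [8, 7, 9, 8, 8, 7],
    "three_person_group": [7, 8, 7],
}

print(group_means(responses))
print(error_df(n_groups=32, n_cells=4))
```

Averaging first means that members of the same group, whose responses may influence one another, contribute a single observation each.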

The success of the importance-of-behavior
manipulation was assessed by four items de- 
signed to measure subjects’ reactions to the 
experimental task. Subjects in the high-im- 
portance condition reported greater liking 
for the task, stated that it was more im- 
portant for them to perform well on the task, 
perceived task work as more difficult, and 
estimated that they had expended greater 
effort on task work than did subjects in 
the low-importance condition. A two-factor 
analysis of variance revealed that the differ- 
ence between the two conditions on these 
measures was only marginally significant, 
multivariate F(4, 25) = 2.31, p < .086. Al- 
though the importance manipulation was 
judged satisfactory, the marginal significance 
of the mean differences should be recalled 
in evaluating the results of the experiment. 


Tests of the Major Hypotheses 


Initial analyses revealed that no significant
effects were obtained with the nervous a h
and friendly-unfriendly measures, and these

Table 1
Mean Values of All Dependent Measures as a Function of Importance of Behavior and Availability of Resources

                                               Importance of behavior
                                          High                    Low
                                          Scarce     Adequate     Scarce     Adequate
Item                                      resources  resources    resources  resources

Behavior
  Performance score                       15.40      16.92        14.40      16.83
Crowding stress
  Unpleasant                               1.71        .92         1.81       1.21
  Uncooperative                             .94        .21          .92        .42
  Difficulty of coordinating with others   5.35       4.42         4.85       4.67
  Stress                                   4.77       3.13         3.90       3.13
  Distress                                 4.90       4.17         5.31       3.96
  Others interfere                         5.92       5.33         5.88       4.63
  Lowered expectations                     5.27       4.75         5.10       4.71
Subjective crowding
  No room to work                          7.21       6.17         7.10       6.21
  Crowded                                  6.96       6.08         6.96       6.00
  No room to feel comfortable              6.90       5.92         6.77       5.92
  Confined                                 6.27       5.67         6.13       5.54
  Congested                                6.81       5.71         6.79       5.75
  Working conditions poor                  7.58       6.50         7.67       6.33
Physiological reactions
  Task-related PSI score                  58.39      53.33        51.52      55.46

Note. PSI = Palmar Sweat Index.


variables were not considered further. Al- 
though no experimental hypotheses were ad- 
vanced with respect to sex of subject, the 
influence of this factor was examined in order 
to assess the universality of the experimental 
results. A three-factor analysis of variance
was performed on each of the four sets of de- 
pendent variables. There were no significant 
multivariate main effects nor interactions as- 
sociated with sex of subject, and this factor 
was not included in the remaining analyses. 

The first experimental hypothesis predicted 
that scarce resources would produce greater 
subjective crowding, crowding stress, and 
task decrements than would adequate re-
sources. A two-factor analysis of variance
revealed a significant effect of resource avail- 
ability on participants’ subjective crowding, 
multivariate F(6, 23) = 9.90, p < .001;
crowding stress, multivariate F(7, 22) =
2.48, p < .05; and task performance, F(1,
28) = 8.66, p < .006. Participants in the
scarce-resources condition reported greater 


subjective crowding and greater crowding 
stress and performed less well on the experi- 
mental task than did participants in the ade- 
quate-resources condition. The resource-avail- 
ability manipulation did not significantly af- 
fect participants’ physiological reactions to 
the setting (as measured by the PSI score, 
covarying initial PSI levels). The significant 
effects, in essence, confirm that the resource- 
availability manipulation established the cir- 
cumstances required to test the hypotheses 
involving importance. Induction of scarce 
resources by an increase in group size did 
create the required crowding and interfer-
ence with actual task performance. The lack
of an effect for PSI scores precluded con- 
sideration of this measure. We will return to 
this regrettable fact in the discussion. 
Hypotheses 2 and 3, stated as planned 
comparisons, were tested by a series of simple 
effects analyses because standard analysis 
of variance does not provide the specific con-
trasts necessary to test the hypotheses. It is
necessary to examine the influence of varia-
tions in availability of resources within both
the high-importance and the low-importance 
conditions. It will be recalled that the second 
hypothesis expected high, but not low, im- 
portance to maintain task performance in 
the face of scarce resources. Consistent with 
the experimental hypothesis, scarcity of re- 
sources significantly affected performance 
among low-importance subjects, F(1, 28) = 
6.57, p < .02, but did not significantly influ- 
ence performance in the high-importance 
condition, F(1, 28) = 2.56, p < .12. Average 
performance scores are shown in Table 1. 
It is apparent that subjects in the high-im- 
portance condition produced a uniformly 
high level of performance (these subjects 
averaged 16.16 completed items per session), 
whereas subjects in the low-importance con- 
dition responded to variations in availability 
of resources. Within the low-importance con-
dition, scarce-resources subjects averaged
14.4 completed items, whereas adequate-re-
sources subjects averaged 16.8 items. When
motivated by the potential for monetary gain, 
subjects overcame the interference created 
by scarce resources and performed at a high 
level of productivity. When task behaviors 
were of relatively low importance, however, 
variations in availability of resources pro- 
duced commensurate decrements in task per- 
formance. 
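
As a check on the figures quoted in this paragraph, the performance means can be recomputed from the cell means in Table 1. The sketch below assumes equal numbers of groups per cell, so that the high-importance average is the simple mean of its two cells; that assumption and the variable names are illustrative and are not stated in the article.

```python
# Performance cell means from Table 1, keyed by (importance, resources).
performance = {
    ("high", "scarce"): 15.40,
    ("high", "adequate"): 16.92,
    ("low", "scarce"): 14.40,
    ("low", "adequate"): 16.83,
}


def importance_mean(level):
    """Unweighted mean of the two resource cells (assumes equal cell sizes)."""
    scarce = performance[(level, "scarce")]
    adequate = performance[(level, "adequate")]
    return (scarce + adequate) / 2


high_mean = round(importance_mean("high"), 2)

# One-decimal values as they are quoted in the text for the low-importance cells.
low_scarce = round(performance[("low", "scarce")], 1)
low_adequate = round(performance[("low", "adequate")], 1)
```

Under the equal-cell-size assumption, `high_mean` reproduces the 16.16 items per session reported for the high-importance condition.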

In accord with the interference formula- 
tion, Hypothesis 3 predicted that subjects’ 
crowding stress would be adversely affected 
by scarcity of resources when the task be-
havior was of high importance to the sub-
ject but not when it was of low importance.
In support of this prediction, the crowding
stress measures were influenced by variations 
in availability of resources within the high- 
importance condition, multivariate F(7, 22) 
= 2.40, p < .05, but not in the low-impor-
tance condition, multivariate F(7, 22) =
1.62, p < .18. Mean responses on these de-
pendent variables are presented in Table 1,
where it can be seen that for the seven stress 
items, only two (distress and others inter- 
fere) do not show the predicted pattern. 
With hindsight, these two items appear to
be deficient in tapping events that were not
strictly task related. 


Locus of Control 


It was predicted that among scarce-re-
sources subjects in the high- and low-im-
portance conditions, the stress responses,
subjective crowding, and physiological re-
actions of external subjects would be more
negative than those of internal subjects. This 
hypothesis was tested by simple effects anal- 
yses and was by necessity performed using 
the individual rather than the group as the 
unit of analysis. It should be noted that the
use of locus of control as an independent 
variable produced a nonorthogonal experi-
mental design. The reader is referred to
Appelbaum and Cramer (1974) for a de-
scription of the series of ignoring and elimi-
nating procedures that were employed to
produce accurate tests of significance.

Analysis of subjects’ stress responses in- 
dicated that externals reported more nega- 
tive affect than did internals in the high- 
importance scarce-resources condition, mul- 
tivariate F(7, 130) = 2.11, p< .05, but 
within the low-importance scarce-resources 
condition, internality-externality did not 
have the predicted effect, multivariate F(7, 
130) = 1.15, p < .34. As expected, internal-
ity-externality did not significantly influence
affective responses within the adequate-re-
sources condition.

A similar pattern of results was obtained 
for the measures of subjects’ subjective 
crowding. Among scarce-resources subjects, 
externals consistently responded more nega- 
tively than did internals, but the effect for 
internality-externality within the high-im-
portance scarce-resources condition was sig-
nificant, multivariate F(6, 131) = 2.33, p
< .04, whereas the internality-externality
effect within the low-importance scarce-re-
sources condition was not significant, multi-
variate F(6, 131) = 0.21, p < .98. Again,
internality-externality had no significant ef-
fect on subjects’ responses in the adequate-
resources condition.

There were no locus-of-control main effects 
nor interactions on the measures of perform-
ance or physiological stress.


Discussion 


The major focus of the present research 
was to evaluate the effects of varying the
importance of the behaviors vulnerable to
interference by others. We successfully in-
duced crowding and interference through
the creation of scarce resources by manipu- 
lating group size. Individuals working in the 
scarce-resources condition, compared to those 
working in the adequate-resources condition, 
reported more crowding and crowding stress 
and completed fewer items on the task. These 
effects are hardly remarkable, although they 
do underscore the reason why some experi- 
ments, for example, those of Freedman, Kle- 
vansky, & Ehrlich (1971), find no perfor- 
mance decrements with high density, whereas 
other experiments, such as ours or that of 
Heller, Groff, & Solomon (1977), do find 
performance decrements. The difference ap- 
pears to reside in whether or not the crowd- 
ing manipulation and the particular task 
used intersect to produce scarce task re- 
sources and interference. Freedman, Klevan- 
sky, and Ehrlich used problem-solving tasks 
and manipulated room size, a combination 
that was selected explicitly to minimize task 
interference by unhinging high density from 
scarce resources. In our experiment and that
of Heller, Groff, and Solomon, a performance 
task was combined with the heightening of 
density by increasing group size. This com- 
bination produces task interference and per- 
formance decrements. 

Based on the interference model (Schopler 
& Stockdale, 1977), it was predicted that the 
aversive impact of interference would be
greater when behaviors encountering inter-
ference are important to the actor. The sec-
ond and third hypotheses expected efforts to 
maintain task performance, when the goal 
was important and despite scarce resources,
by an active coping strategy that also pro-
duced increased stress. The data confirmed
these predictions for task performance and
for stress. Subjects responded more nega-

Figure 1. Mean number of items completed and reported stress as a function of resource availability and level of importance.


goal-directed behavior sequences experienced 
more psychological costs than did less mo- 
tivated subjects when these behaviors en- 
countered interference from the physical/ 
social environment. These results suggest 
that in understanding the impact of high- 
density settings, specifying the nature and 
importance of behaviors enacted in these 
settings is as necessary as specifying the 
nature of the settings themselves. 

Further evidence of the relationship be- 
tween reported stress and the interference 
produced by high-density settings is revealed 
in a comparison of the pattern of responses
to the stress item, in the stress group, with
the pattern of performance scores (see Fig- 
ure 1). When no external incentive was pro- 
vided, scarce resources produced a perform- 
ance decrement but did not result in greater 
reported stress. When the importance of task
performance was raised, however, scarce re- 
sources no longer resulted in a significant 
performance decrement but did produce sig- 
nificantly greater reported stress. It appears 
that subjects in the high-importance scarce- 
resources condition were able to maintain 
their relatively high level of performance 
only at the expense of experiencing greater 
subjective stress. These psychological costs
seem to be the price subjects paid for their 
attempts to maintain high performance un- 
der unfavorable environmental conditions. 
The observed relationship between perform- 
ance and reported stress in the scarce-re- 
sources condition may also indicate that re- 
duced task output functions as a coping 
mechanism. When behavioral goals and en- 
vironmental conditions conflict, modifying or 
eliminating those behaviors that encounter 
interference may be an effective strategy for 
low-incentive subjects in reducing the stress- 
fulness of the situation. The extrinsic rewards 
offered in the high-importance condition, 
however, encourage these subjects to main- 
tain those behaviors in spite of the interfer- 
ence. 
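
The trade-off described in this paragraph can be read directly off Table 1 by taking the scarce-minus-adequate difference for the performance score and for the single stress item within each importance condition. The sketch below uses only cell means printed in Table 1; the function and variable names are illustrative, and the raw differences say nothing about statistical significance, which rests on the simple effects tests reported in the Results.

```python
# Cell means from Table 1, keyed by (importance, resources).
table1 = {
    "performance": {
        ("high", "scarce"): 15.40, ("high", "adequate"): 16.92,
        ("low", "scarce"): 14.40, ("low", "adequate"): 16.83,
    },
    "stress": {
        ("high", "scarce"): 4.77, ("high", "adequate"): 3.13,
        ("low", "scarce"): 3.90, ("low", "adequate"): 3.13,
    },
}


def scarcity_effect(measure, importance):
    """Scarce-minus-adequate difference in cell means for one measure."""
    cells = table1[measure]
    diff = cells[(importance, "scarce")] - cells[(importance, "adequate")]
    return round(diff, 2)


effects = {
    (measure, importance): scarcity_effect(measure, importance)
    for measure in table1
    for importance in ("high", "low")
}
```

In these differences, the performance decrement under scarce resources is larger in the low-importance condition, whereas the increase in reported stress is larger in the high-importance condition, which is the pattern the paragraph interprets.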

The importance of the behavior and the 
scarcity of task resources may also be the 
critical features determining when crowding 
and sex of subject interact with respect to 
affective feelings. Increases in density have
raised positive feelings for women while low- 
ering them for men, both when the measures 
are taken in the crowded situation (Freed- 
man, Levy, Buchanan, & Price, 1972; Ross, 
Layton, Erickson, & Schopler, 1973) and 
when they are taken after the participants 
have left the crowded situation (Epstein & 
Karlin, 1975). Affective ratings are included
in our crowding stress group. It will be re- 
called that we found no interactions with 
sex of subject on these measures. We would
suggest that the sex interaction effect will
not occur when the mechanism used to cope 
with interference does not lower crowding
stress. The cohesion and cooperativeness that
characterize groups of women, according to
Epstein and Karlin, will probably not survive 
the combination of scarce resources and high 
importance. 

Although it was predicted that externals 
would respond more negatively to the scarce 
resources present in the large group under 
both high and low importance, this effect 
was observed only in the high-importance 
condition. A possible explanation of this
result is suggested by reference to our previ-
ous discussion regarding the most likely cop-
ing strategies in the high- and low-impor-
tance conditions. A passive withdrawal
strategy, resulting in reduced performance,
may seem equally attractive to internals and 
externals when incentive is low. The high- 
importance condition favors internals be- 
cause it requires a more active strategy aimed 
at coordinating and synchronizing responses 
to keep performance high. Thus, individual 
differences in expectancies for control over 
the social environment produce differing af-
fective responses when the desirable coping
strategy requires active intervention. 

No significant effects were obtained for 
the physiological measure of stress. One of 
the problems with attempting physiological 
measures in crowding studies is the disrupt- 
ing and distracting influence of the measure- 
ment process itself. The PSI was selected for 
use because it allowed subjects the necessary 
mobility and could be administered relatively 
quickly. To further reduce the disruptions 
inherent in taking the measure, the PSI was 
administered only twice. The data, which
reflect generally higher scores on the initial
administration, suggest that a longer baseline
period with several administrations would
have been desirable. The longer baseline
period would allow subjects an opportunity
to adapt to the measurement procedure it-
self and might have resulted in a greater
sensitivity to the experimental manipulations.
However, the nonsignificant results for the
PSI must also lead us to entertain the possi-
bility that physiological stress does not in
actuality adhere to the pattern obtained for
self-reported stress and negative affect. This
seems plausible in view of the relatively brief
duration of the experimental session. In any
case, we feel that the development of a sensi- 
tive and easily administered measure of 
physiological stress would add much to a 
literature that has relied heavily on self- 
report measures. 

The results of this study have been inter- 
preted in support of the interference model 
of crowding. This model locates the aversive 
impact of high-density settings in the in- 
creased costs of enacting behaviors when 
environmental conditions are in opposition to 
the actor’s behavioral goals. Aversive psy-
chological consequences were shown to be a
joint function of the degree of density-re-
lated interference and the importance to the
actor of his or her goal-directed behaviors.
The relationship between density and actual
task performance was also qualified by the
importance of the task-related behaviors.
When highly motivated individuals are suc-
cessful in maintaining high levels of per-
formance under high-density conditions, the
increased costs of enacting the necessary be-
haviors are reflected in their negative affective
responses to the situation.


Investigators have taken diverse theoreti-
cal approaches to


chologi 
within individuals: unresolved psychody- 
namic conflicts (e.g., Glad, 1973); the needs 
for achievement, affiliation, and power (Win- 
ter, 1973); cognitive maps of the political 
world (e.g. 1973; Axelrod, 1976; 
Holsti, 1976; Jervis, 1976); pressures toward 
cognitive consistency (e.g., Jervis, 1976); or 
the effects of stress on information processing
complexity (e.g., Hermann, 1972; Suedfeld
& Tetlock, 1977).
Political decisions are not, of course, made 
in a social vacuum. They are usually made
in organized group contexts in which implicit
and explicit norms regulate the conduct of
the decision maker. However, relatively few
psychologically oriented studies have con-
sidered how patterns of social interaction
among policy makers can influence decision 
making. One noteworthy exception to this 
generalization is the work of Janis (1972) on 
the groupthink phenomenon. Janis argued
that intense social pressures toward uni- 
formity and in-group loyalty within decision- 
making groups can build to the point where 
they seriously interfere with both cognitive 
efficiency and moral judgment. Groupthink
occurs when independent critical analysis of
the problem facing the group assumes second
place to group members’ motivation to main-
tain group solidarity and to avoid creating
disunity by expressing unpopular doubts or
opinions.
In several case studies of major foreign 
policy decisions by the American govern- 
ment, Janis attempted to trace the effects 
of social pressures toward groupthink on de- 
cision making. He selected for analysis cases 
in which he felt the signs of poor decision
making as a result of concurrence seeking
were “unmistakable” (Janis, 1972, p. 10).
These included the decisions to pursue the
Korean army beyond the
launch the Bay of Pigs inva-
sion of Cuba, and to escalate involvement

vered similar predisposing conditions to
invariably

bers. The policy-making groups were
relatively insulated from the judgments of 
qualified outsiders and lacked systematic 
procedures for evaluating and searching out 
new evidence relating to the problem. Dur- 
ing the deliberations on the particular issues, 
the group leaders tended to promote their 
preferred solutions rather than to encourage 
open-minded, careful analysis of policy al-
ternatives. Finally, all decisions were made
in highly stressful situations in which policy 
makers were not hopeful of finding an al- 
ternative superior to the one that the group 
currently favored. 

Under these conditions, certain distinctive 
attitudes and patterns of interaction emerged 
within the groups (see Janis, 1972, pp. 197-
198). Decision makers appeared to believe
that the group was invulnerable to outside 
attack—a belief that encouraged excessive 
members 


policies and the inherent immorality and in-
competence of the enemy.
often self-censored any
the group’s policies. The group
pressure to those few members who showed
signs of deviating from the consensus. Self-
appointed “mindguards” shielded the group
from external sources of dissonant or coun-
terattitudinal information.

so inhospitable to 


making procedures fell far short of the i 
“rational actor” standard (cf. 
According to Janis and 
cision making in groupthink situations fail 


makers generally did not
the full range of pol- 
(b) consider the full spec- 


account of information that contradicted 
prior beliefs and preferences; (f) reexamine
evaluations of all known alternatives, includ-
ing those previously regarded as unaccept-
able; (g) develop sufficiently detailed plans
for implementing the chosen policy, with
special reference to contingency plans in the
event known risks materialized. As Janis
(1972, p. 10) noted, the outcomes of group-
think decisions deserved to be fiascoes “be-
cause of the grossly inadequate way the
policy makers carried out their decision-
making tasks.”

Janis contrasted the groupthink decisions 
with two examples of well-worked-out for-
eign policy decisions: the development of the
Marshall Plan and the handling of the Cuban
Missile Crisis. In these instances, the de- 
cision-making groups and their leaders gave 
high priority to critical appraisal and open 
“Decision mak- 
ers had to undergo the unpleasant experience 
of hearing their pet ideas critically pulled to 
pieces” (Janis, 1972, p. 165).
ultimately developed within these groups 
were upon careful analysis of the 
likely consequences of large numbers of pol- 
icy options, with frequent attempts at pro- 
posing new solutions that maximized the 
advantages and minimized the disadvantages 
of options already analyzed. 


of foreign policy deliberations into the group- 

think and non-groupthink categories. 

For purposes of hypothesis construction--which is
the stage of inquiry with which this book is con-
cerned--we must be willing to make some infer-
ential leaps from whatever historical clues we can
pick up. (Janis, 1972, p. v)


Unfortunately, the most appropriate docu-
ments for testing the groupthink hypotheses
(verbatim records of formal and informal
meetings among decision makers) were not
available. The historical clues upon which
Janis relied (observer accounts of private
conversations and participant memoirs) were
susceptible to serious retrospective distortion
of a motivational or cognitive nature (cf.
Fischhoff & Beyth-Marom, 1976, on the “cer-
tainty-of-hindsight” bias). Moreover, Janis
did not specify the criteria that he used in 
deciding to include or exclude data. Thus, he 
may inadvertently have given too much 
weight to evidence that supported group-
think hypotheses and too little weight to con-
tradictory evidence. 

A wide range of behavioral research meth-
ods can be used to test Janis’s analysis. Two
possible approaches are (a) laboratory simula- 
tion studies that factorially manipulate hy- 
pothesized determinants of groupthink and 
then observe effects on social interaction and 
quality of decision making, and (b) content 
analysis studies of archival records that cast 
light on how decision makers think in actual 
crises. Each approach has compensating 
methodological advantages and disadvan- 
tages. 

Flowers’s (1977) research exemplifies the 
first approach. In a 2 x 2 laboratory experi- 
ment, Flowers found that “open-leadership 
style” led groups to suggest more solutions 
to a problem and to use more available in- 
formation than “closed-leadership style.” 
This finding was compatible with Janis’s 
(1972) analysis. Contrary to Janis, though,
“degree of group cohesiveness” did not affect 
quality of decision making. This latter find- 
ing is, however, difficult to interpret. On the 
one hand, Janis’s theory may need revision. 
On the other hand, Flowers’s operational defi- 
nition of cohesiveness (using groups of col- 
lege students) probably differs from the co-
hesiveness displayed within foreign policy-
making groups (who have typically known
each other for many years and are confronted 
by extremely ego-involving tasks). The in- 
terpretive ambiguity reflects a fundamental 
limitation of simulations: the impossibility 
of ensuring that a simulation fully captures 
the processes operating in the original situa- 
tion (cf. Bem, 1972). 


PHILIP E. TETLOCK 


The present study takes the second ap- 
proach. It applies standardized content anal- 
ysis procedures to statements of key decision 
makers involved in the groupthink and non-
groupthink crises that Janis examined. Rela-
tive to laboratory studies, the primary ad-
vantage of this approach is the high external 
validity achieved by dealing with real-life 
decisions by top policy makers; the primary
disadvantage is the relative lack of control 
over both independent and dependent vari- 
ables. 

With respect to independent variables, the 
laboratory researcher can often be confident 
that his or her experimental conditions differ 
in only a few theoretically relevant ways. 
Whether the same can be said for the hy-
pothesized groupthink and non-groupthink
foreign policy decisions is unclear. As is al-
most inevitable in comparing complex nat-
urally occurring phenomena, possible con-
founding variables exist. For instance, the
groupthink decisions all involved military in- 
terventions or escalations of questionable suc- 
cess; the non-groupthink decisions involved 
more restrained policies having desirable con-
sequences (from the standpoint of the deci-
sion makers). It is debatable, however,
whether these differences are independent of 
or attributable to groupthink. For example, 
Janis (1972) suggested that groupthink pre-
disposes decision makers to seek rapid, clear-
cut military solutions to complex problems.
Nonetheless, it is important to keep in mind
that the groupthink/non-groupthink contrast
in this study can be reconceptualized as a
contrast between unsuccessful military and 
successful nonmilitary national policies. 

With respect to dependent variables, the
laboratory researcher has relatively direct
access to subjects’ decision-making delibera-
tions. Such records are much less accessible
when one’s subjects are heads of state and
their advisors. Currently, the only standard-
ized records of how groupthink and non-
groupthink decision makers perceived policy
options consist of decision makers’ public
statements during the relevant crises. This
need not, however, be a serious liability. Al-
though public statements are undoubtedly
more influenced by efforts to manage polit-
ical impressions than are private statements,
there are good reasons to assume that mani- 
festations of groupthink will appear in both 
private and public statements. Political elites 
—for a variety of reasons—probably do not 
frequently endorse policy positions that are 
very dissimilar to those they endorse in pri- 
vate (cf. Graber, 1976). Political leaders are 
usually concerned with maintaining an image 
of trustworthiness that would be jeopardized 
by failure to maintain some consistency be- 
tween their words and deeds (cf. Graber, 
1976; Tedeschi, Schlenker, & Bonoma, 1971). 
To the extent that groupthink seriously inter- 
feres with the cognitive efficiency and moral 
judgment of decision makers, groupthink also 
probably affects how decision makers attempt 
to justify their policies to the world. 

The study reported here selected content 
analysis procedures for their appropriateness 
in identifying two major manifestations of 
groupthink: (a) the tendency to process 
policy-relevant information in simplistic and 
biased ways, and (b) the tendency to evaluate 
one’s own group highly positively and to eval- 
uate one’s domestic and international oppo- 
nents highly negatively. These variables were 
selected partly because they are more likely 
to be revealed at an overt or public level than 
others (e.g., self-censorship of doubts, direct
pressuring of deviants, or the emergence of 
mindguards) and partly because reliable and 
valid measurement techniques could be read- 
ily adapted for the purpose of assessing them. 

The integrative-complexity coding system 
was used to determine the degree to which 
“simpler” (i.e., less differentiated and less 
integrated) modes of information processing 
prevailed in groupthink situations. The cod- 
ing system was initially developed as a mea- 
sure of the integrative-complexity dimension 
of personality (Harvey, Hunt, & Schroder,
1961; Schroder, 1971; Schroder, Driver, &
Streufert, 1967). Recent research indicates
that the coding system is also sensitive to 
situationally induced shifts in complexity of 
information processing and can be usefully 
applied to such documents as letters, essays, 
speeches, and diplomatic communications 
(Suedfeld, 1978; Suedfeld & Rank, 1976; 
Suedfeld & Tetlock, 1977; Suedfeld, Tetlock,
& Ramirez, 1977). For instance, Suedfeld and
Rank (1976) have shown that the integrative 
complexity of successful revolutionary leaders 
who subsequently held onto power was higher 
after victory than before victory, whereas 
leaders who failed to maintain power after 
victory did not show such a change. They 
argued that this finding reflects the need for 
change from the simple cognitive strategies 
appropriate to a revolutionary struggle toward 
the complex approach required of govern- 
ments in power. Suedfeld et al. (1977) found 
that the complexity of Arab and Israeli 
speeches to the United Nations tended to de- 
cline prior to the outbreak of major wars. 
They argued that this finding probably reflects 
the disruptive effects of stress on complex 
information processing (see also Suedfeld & 
Tetlock, 1977). Such data strongly suggest 
the usefulness of the complexity coding sys- 
tem for unobtrusively assessing cognitive 
structural variables from archival data and, 
most relevant here, for testing Janis’s group- 
think analysis. 

Evaluative assertion analysis was used to 
test the hypotheses concerning the effects of 
groupthink on “in-group” and “out-group”
attitudes (see Osgood, 1959; Osgood, Saporta, 
& Nunnally, 1956). The technique has grown 
out of Osgood’s research on the dimensions of 
meaning and the congruity principle of atti- 
tude change. It has been used successfully in 
a wide variety of research projects, including 
the coding of psychotherapeutic interviews 
(Osgood, 1959), the examination of John 
Foster Dulles’s attitude toward the Soviet 
Union (Holsti, 1967), and the assessment of 
bias in the news coverage of presidential can- 
didates (Westley et al., 1963). The common 
purpose underlying the diverse applications of 
evaluative assertion analysis has been “to ex- 
tract from verbal communications the evalua- 
tions being made of significant concepts” 
(Osgood et al., 1956, p. 47). The significant
concepts in the present context are defined as 
groups with which the speaker identifies and 
domestic and foreign opponents. 

In summary, this study employed two inde- 
pendent techniques of content analysis to test 
hypotheses abstracted from Janis’s case 
studies of groupthink in American foreign
policy making. The integrative-complexity
coding system was used to assess the relative 
differentiation and integration of decision 
makers’ public statements in groupthink and 
non-groupthink crises. Evaluative assertion 
analysis was used to assess the direction and 
intensity of in-group and out-group attitudes 
in these situations.


Method 


Archival records of public statements by leading 
decision makers in five American foreign policy 
crises provided the data for the current study. State-
ments were drawn from primary sources that in-
cluded the United States Department of State Bul-
letin, the Congressional Record, Collected Papers of
the Presidents of the United States, and Vital 
Speeches. The decision makers for whom materials 
were scored included: the Marshall Plan—President 
Harry S. Truman, Secretary of State George C.
Marshall, Undersecretary of State Dean Acheson; 
the invasion of North Korea—President Harry S. 
Truman, Secretary of State Dean Acheson; the Bay 
of Pigs invasion—President John F. Kennedy, Secre- 
tary of State Dean Rusk; the Cuban Missile Crisis— 
President John F. Kennedy, Secretary of State Dean


1950- November 20, 1950; 
Invasion—January 20, 1961 — April 
; the Cuban Missile Crisis—October 16, 1962 


which groupthink in one topic are; 
other areas. The following guidelines were used: 

[...] 

3. The Bay of Pigs invasion. All statements 
pertained to the dangers presented by the Castro [...] 

4. The Cuban Missile Crisis. All statements 
pertained to the threat posed by the actions of the 
Soviet Union in Cuba. 

5. The Vietnam War escalation decisions. All 
statements pertained to the danger of the Communist 
threat in Southeast Asia and the need for American 
aid to defend the non-Communist governments in 
the region. 


The Bay of Pigs invasion [...] obtaining data. 
Since the [...], decision makers did not describe or 
justify the invasion policy in their public 
statements. In this sense, the Bay of Pigs case 
differs from the other four [...] performed both 
with [...] from the Bay of Pigs time period. 


Integrative-Complexity Coding 


All material was scored for integrative complexity 
on a 7-point scale (see Schroder, Driver, & Streufert, 
1967, Appendix 2, for detailed discussion of the 
coding rules). The scale defines complexity in terms 
of both differentiation and integration. 
Differentiation refers to the number of 
characteristics or dimensions of a problem situation 
that are recognized and taken into account in decision 
making. For instance, a decision maker may process 
information relating to policy options in an 
undifferentiated fashion by placing options into only 
one of two categories: the “good, patriotic” policies 
and the “bad, defeatist” policies. A more 
differentiated approach would recognize that policy 
options can have multiple, often contradictory effects 
that cannot be classified on a single evaluative 
dimension of judgment, for example, effects on 
different political constituencies, various sectors of 
the economy, military strength, and the strategies of 
one’s opponents. Integration refers to the development 
of complex connections among differentiated 
characteristics. (Differentiation is thus a 
prerequisite of integration.) The complexity of 
integration depends on whether the dimensions of 
judgment employed by the decision maker are perceived 
as operating in isolation (low integration), in 
hierarchical interaction (medium integration), or 
according to multiple, complex, and perhaps flexible 
patterns (high integration). Scores of 1 reflect low 
differentiation and low integration. Scores of 3 
reflect medium or high differentiation and low 
integration. Scores of 5 reflect medium or high 
differentiation and medium integration. Scores of 7 
reflect high differentiation and high integration. 
Scores of 2, 4, and 6 represent transition points 
between adjacent levels. 
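
The scale anchors described above can be written down as a small lookup. The following is only an illustrative sketch: the names `ANCHORS`, `describe_score`, and `mean_complexity` are hypothetical and do not come from the coding manual, and the sketch encodes nothing beyond the anchor definitions stated in the text.

```python
# Anchor points of the 7-point integrative complexity scale, as described
# in the text: (differentiation level, integration level).
# The data layout and function names are illustrative assumptions.
ANCHORS = {
    1: ("low", "low"),
    3: ("medium or high", "low"),
    5: ("medium or high", "medium"),
    7: ("high", "high"),
}


def describe_score(score: int) -> str:
    """Return a verbal description of an integrative complexity score."""
    if score not in range(1, 8):
        raise ValueError("integrative complexity scores range from 1 to 7")
    if score in ANCHORS:
        differentiation, integration = ANCHORS[score]
        return f"{differentiation} differentiation, {integration} integration"
    # Scores of 2, 4, and 6 are transition points between adjacent levels.
    return f"transition between levels {score - 1} and {score + 1}"


def mean_complexity(scores):
    """Mean score over the passages sampled for one decision maker."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for s in (1, 2, 5):
        print(s, "->", describe_score(s))
    print(mean_complexity([1, 3, 5]))
```

A mean over a decision maker's sampled passages, as in `mean_complexity`, corresponds to the kind of per-person average reported in the Results.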

Initial scoring for integrative complexity was 
performed by the author and Peter Suedfeld, who were, 
however, both aware of the hypotheses being tested 
and the sources of the material (interrater agreement, 
r = .91). The reliability of this scoring was checked 
by having material rescored by trained coders 
associated with a research group under Suedfeld at the 
University of British Columbia. These coders were 
blind to both the hypotheses being tested and the 
Table 1 
Attitude Objects Included Under General Categories of “Domestic and Foreign Opposition” 
and “Groups With Which Speaker Identifies” 

Marshall Plan 
  Opposition: Communism, USSR, East European Communist states, Communist parties of 
    Western Europe, totalitarian regimes, Communist victories, fall of Europe 
  Groups with which speaker identifies: American government, American people 

Invasion of North Korea 
  Opposition: Communism, USSR, North Korea, Communist China, forces of Communism, 
    Communist imperialism, new colonialism, Soviet leaders 
  Groups with which speaker identifies: American government, American people, United 
    Nations Command in Korea, South Korea, UN soldiers, we, American history to the 
    present, policy of increasing strength 

Bay of Pigs invasion 
  Opposition: Communism, USSR, Castro's Cuba, alien ideology, Communist agents 
  Groups with which speaker identifies: American government, American people, Cubans 
    opposed to Castro government, those concerned with maintaining peace, our 
    long-range aspirations, Kennedy, our (OAS) delegates, anti-Castro guerrilla 
    fighters, Cuban people, security of Latin America 

Cuban Missile Crisis 
  Opposition: Communism, USSR, Castro's Cuba, new Soviet moves, Khrushchev's decision, 
    extracontinental power 
  Groups with which speaker identifies: American government, American people, nations 
    of this hemisphere, unanimity of Western Hemisphere, free nations 

Vietnam War escalation decisions 
  Opposition: North Vietnamese, Viet Cong, Red China, USSR, aggressors, yielding to 
    aggression, policy of withdrawal 
  Groups with which speaker identifies: American government, American people, South 
    Vietnamese government, policy of standing firm 



hypotheses. Analysis of disagreements revealed no 
relationship to the groupthink hypothesis. 


Evaluative Assertion Analysis 


Evaluative assertion analysis was performed on 
the same materials. (Osgood et al., 1956, present a 
more detailed discussion of how to perform evaluative 
assertion analysis.) The analysis involved four basic 
stages and was performed by two coders, one of 
whom was the author; the other was an individual 
otherwise uninvolved in the study. The first stage 
of the evaluative assertion analysis was the 
identification and isolation of attitude objects. Two 
general classes of attitude objects were selected: 
political groups with which the speaker identifies and 
domestic and foreign opponents. Table 1 presents the 
variety of specific terms taken from public statements 
that were considered to fall into one of these two 
general categories. It was possible to obtain extremely 
high interrater agreement for identifying and 
classifying these terms. Disagreements were resolved 
by using the judgments of the coder unaware of the 
hypotheses. 

The second stage involved translating all statements 
in which these attitude objects appeared into one of 
two common sentence forms: 


Attitude Object 1 / Verbal connector / 
Common meaning term 
Attitude Object 1 / Verbal connector / 
Attitude Object 2. 


For example, the sentence “An aggressive North 
Korea threatens freedom-loving South Korea” would 
be translated to read: 


North Korea / is / aggressive 
North Korea / threatens / South Korea 
South Korea / is / freedom loving. 


Here, again, coding disagreements were rare and were 
resolved by using the judgment of the blind coder. 

In the third stage, the verbal connectors and 
predicates were rated for intensity and direction on 
7-point scales ranging from -3 to +3. A verbal 
connector received a negative score to the degree that 
it dissociated the subject from the predicate (“never 
is” would receive a score of -3) and a positive score 
to the degree that it associated the subject and the 
predicate (“always is” would receive a score of +3). A 
predicate received a negative score to the degree that 
it represented a negatively evaluated attribute or 
quality (e.g., evil, aggressive) within “the language 
community of the speaker” (Osgood et al., 1956). The 
predicate received a positive score to the extent that 
the attribute or quality cited was positively evaluated 
within the language community of the speaker (e.g., 
freedom, peace). Reliability checks were obtained to 
ensure that verbal connectors and common meaning 
terms were being assigned ratings that reflected 
common linguistic standards. The correlations of coder 
ratings of verbal connectors and common meaning 
terms were .93 and .94, respectively. 

The direction and intensity of evaluative sentiment 
directed toward each of the two categories of attitude 
objects were computed for each paragraph unit. 
Within each unit, references to each of the two classes 
of attitude objects were separated. The direction and 
intensity of evaluative sentiment directed to each class 
were computed by (a) multiplying the values 
assigned to the verbal connectors and predicates used 
to describe attitude objects falling within the class 
and (b) summing the products thus obtained. Two 
scores were obtained in this way for each paragraph 
unit from the evaluative assertion analysis: One 
specified the intensity and direction of evaluative 
feeling toward political groups with which the 
speaker identified, and the other specified evaluative 
feeling toward the speaker’s opponents. 
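
The multiply-and-sum step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' scoring procedure: the function name `class_score` and the example ratings are hypothetical, and each assertion is reduced to a (connector, predicate) pair rated from -3 to +3.

```python
def class_score(assertions):
    """Sum of connector x predicate products for one class of attitude objects.

    `assertions` is a list of (connector, predicate) rating pairs, each rated
    on the -3..+3 scales described in the text, for one paragraph unit.
    """
    total = 0
    for connector, predicate in assertions:
        for rating in (connector, predicate):
            if not -3 <= rating <= 3:
                raise ValueError("ratings must lie between -3 and +3")
        total += connector * predicate
    return total


if __name__ == "__main__":
    # Hypothetical ratings for a single paragraph unit.
    # "North Korea / is / aggressive": strong association, negative predicate.
    opponents = [(3, -2), (2, -3)]
    # "South Korea / is / freedom loving": strong association, positive predicate.
    in_group = [(3, 2)]
    print(class_score(opponents))  # evaluation of opponents
    print(class_score(in_group))   # evaluation of groups speaker identifies with
```

Because the products are summed rather than averaged, a paragraph with many strongly worded assertions yields a larger absolute score than one with few.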


Results 


Unweighted-means analyses of variance 
were used to test the groupthink hypotheses. 
The analyses of variance always involved 
three independent variables: type of crisis 
(groupthink vs. non-groupthink), decision 
makers within crises, and the random ordering 
of the 12 passages of material sampled from 
each decision maker’s statements. 

The analyses of variance were complicated 
by the fact that individual decision makers 
were unevenly represented across different 
crisis situations. In analysis of variance terms, 
decision makers were partly “crossed” with 
and partly “nested” within crises. For instance, 
Truman and Acheson both appeared in 
the 1947 (non-groupthink) and 1950 (groupthink) 
crises (that is, they were crossed with 
these two crises); similarly, Kennedy and 
Rusk both appeared in the 1961 (groupthink) 
and 1962 (non-groupthink) crises. Neither 
pair of decision makers appeared in crises 
involving the other pair. Marshall appeared only 
in the 1947 crisis, and Johnson appeared only 
in the 1964-1965 (groupthink) crisis. Finally, 
Rusk appeared most frequently: in the 1961, 
1962, and 1964-1965 crises. 

To resolve this complicated situation, it was 
assumed that different decision makers were 
involved in each crisis (i.e., decision makers 
have been treated as nested within crises). 
This assumption was conservative in the sense 
that it reduced the likelihood of detecting 
significant differences between groupthink and 
non-groupthink crises. The assumption 
sacrifices the opportunity to control statistically 
for the effects of personality variables. The 
groupthink effect must overcome personality 
effects in order to be detected. 

A second, conservative assumption was also 
made. In all analyses of variance, the groupthink 
and non-groupthink crises have been 
treated as random effects. The crises were 
viewed as a sample from the population of 
groupthink and non-groupthink decisions. For 
this reason, quasi-F ratios have been 
constructed to test the significance of crisis effects 
(cf. Clark, 1973; Winer, 1971). These quasi-F 
ratios are smaller than the ordinary F ratios 
that would have been appropriate if crises 
were viewed as a fixed variable. 
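
Because a quasi-F ratio is built from sums of mean squares, its degrees of freedom have to be approximated; the article's footnote on degrees of freedom names Satterthwaite's formula for this. The sketch below shows that approximation for a plain sum of mean squares. The function name and the numerical inputs are hypothetical, and the sketch does not claim to reproduce which mean squares entered the quasi-F ratios in this study.

```python
def satterthwaite_df(mean_squares, dfs):
    """Approximate degrees of freedom for a sum of independent mean squares.

    Satterthwaite's approximation for MS_1 + ... + MS_k, where MS_i has
    df_i degrees of freedom:

        df = (sum of MS_i)^2 / sum(MS_i^2 / df_i)
    """
    if len(mean_squares) != len(dfs) or not mean_squares:
        raise ValueError("need one df per mean square")
    numerator = sum(mean_squares) ** 2
    denominator = sum(ms ** 2 / df for ms, df in zip(mean_squares, dfs))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical mean squares and their degrees of freedom.
    print(satterthwaite_df([4.0, 6.0], [2, 3]))
```

The result need not be an integer, and for a numerator composed of two mean squares it can exceed the degrees of freedom of either component, which is how a planned contrast can end up with numerator degrees of freedom greater than one.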

Table 2 presents the mean integrative 
complexity scores of each major decision maker 
within each crisis. Consistent with the 
groupthink hypothesis, a planned orthogonal 
contrast of complexity scores revealed that 
decision makers in groupthink crises were 
significantly less complex than their counterparts 
in non-groupthink crises, quasi-F(1, 17) = 
18.29, p < .01. This effect remained 
significant when the data obtained from the Bay of 
Pigs period were deleted, planned contrast, 
quasi-F(1, 16) = 15.77, p < .01. No other 
effects were significant. 

Decision makers’ mean evaluations of political 
groups with which they identified are also 
presented in Table 2. As predicted, decision 
makers in groupthink crises evaluated political 
groups with which they identified more 
positively than did decision makers in 
non-groupthink crises, planned contrast, 
quasi-F(2, 17) = 17.37, p < .01. This effect 
remained significant when the data obtained 
from the Bay of Pigs period were deleted, 
planned contrast, quasi-F(1, 16) = 8.03, p < 
.05. No other effects were significant. 

Table 2 
Mean Ratings on the Three Dependent Variables 

Decision      Integrative    Evaluation of group with    Evaluation of 
maker         complexity     which speaker identifies    opponents 

Non-groupthink crises 
  1947 
    Truman       3.66            5.5                       -1.75 
    Acheson      4.5             666                       -1.0 
    Marshall     5.9             0                         -.166 
  1962 
    Kennedy      4.33            2.92                       0 
    Rusk         3.16            2.0                       -4.08 
Groupthink crises 
  1950 
    Truman       1.0            14.0                      -10.5 
    Acheson      1.83           13.33                      -1.92 
  1961 
    Kennedy      2.16           12.33                      -6.25 
    Rusk         2.58            9.833                     -4.0 
  1964-1965 
    Johnson      2.16            4.83                      -9.83 
    Rusk         2.9             3.33                      -2.92 

Table 2 also presents decision makers’ mean 
evaluations of domestic and foreign 
opponents. Contrary to theoretical expectations, 
there was no significant difference between 
groupthink and non-groupthink crises in 
negative evaluations of domestic and foreign 
opponents, planned contrast, quasi-F(1, 13) 
= 2.19 [...]. The effect remained 
nonsignificant when the data from the Bay of Pigs 
period were deleted, planned contrast, 
quasi-F([...], 11) = 2.34, p < .25. The other 
effects [...] 

[...] existed between [...] and the evaluative 
assertion [...]. More complex statements tended 
to include fewer positive [...] with which 
the speaker identified, r(130) = -.21, p < 
.001, and fewer negative evaluations of 
domestic and foreign opponents, r(130) = .33, 
p < .001. 


A major disadvantage of treating decision 
makers as nested within crises is that it 
prevents comparison of how individuals who 
appeared together in two crises (e.g., Truman 
and Acheson; Kennedy and Rusk) reacted in 
the different situations. For example, one 
decision maker may have been highly consistent 
in the integrative complexity and evaluative 
intensity of his public statements, whereas 
another may have been highly inconsistent. 
To examine this hypothesis, matched-pairs t 
tests were performed on the data obtained 
from Truman and Acheson in the crises in 
which they appeared (1947 and 1950) and on 
data obtained from Kennedy and Rusk in the 
crises in which they appeared (1961, 1962, 
and, in the case of Rusk, 1964-1965). The 
results indicated that public statements of 
individual decision makers did not always 
change in the directions predicted by the 
groupthink analysis. While Truman’s, 
Acheson’s, and Kennedy’s statements were less 
complex in groupthink than in non-groupthink 
situations, the complexity of Rusk’s 
statements remained relatively constant through 
the 1961, 1962, and 1964-1965 crises. 
Truman’s evaluations of opponents were 
significantly more negative during the 1950 crisis 
(relative to the 1947 crisis), but his 
evaluations of political groups with which he 
identified were not more positive. Acheson’s 
evaluations of political groups with which he 
identified were, however, more positive in the 1950 
crisis (relative to the 1947 crisis), but his 
evaluations of opponents were not 
significantly more negative. Kennedy’s public 
statements changed most in conformity with the 
groupthink model. Between the Bay of Pigs 
and the Cuban Missile Crisis, Kennedy 
showed significant or marginally significant 
changes in the predicted direction on 
integrative complexity, evaluations of opponents, and 
evaluations of political groups with which he 
identified. Rusk’s public statements were least 
in conformity with the groupthink model. He 
showed no significant shifts on any of the 
three dependent variables between the 1961 
and 1962 crises or the 1962 and 1964-1965 
crises. In sum, evidence exists for individual 
differences in decision makers’ reactions to 
groupthink and non-groupthink situations. 


1 Degrees of freedom for the quasi-F statistics 
were calculated using Satterthwaite’s formula (see 
Winer, 1971, pp. 315-318). With this formula, it 
is possible for numerator degrees of freedom of a 
planned contrast to be greater than one. 
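
For readers unfamiliar with the matched-pairs t test used in these individual comparisons, a minimal sketch follows. It assumes, hypothetically, that passages from one decision maker in two crises are paired by their position in the random ordering; the article does not spell out the pairing rule, and the scores below are invented for illustration.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Matched-pairs t statistic and degrees of freedom for two score lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


if __name__ == "__main__":
    # Hypothetical complexity scores for one decision maker in two crises.
    non_groupthink = [4, 5, 6]
    groupthink = [3, 3, 3]
    t, df = paired_t(non_groupthink, groupthink)
    print(round(t, 3), df)
```

The statistic is compared against a t distribution with n - 1 degrees of freedom, where n is the number of matched pairs.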

An important unresolved issue is how to 
determine whether groupthink is occurring in 
given situations. The present research 
suggests that relatively objective content analysis 
procedures applied to the public statements 
of decision makers may be useful for this 
purpose. As a first step toward developing clear 
criteria for distinguishing groupthink and 
non-groupthink decisions, discriminant 
analysis was used to evaluate the usefulness of the 
three dependent variables employed in this 
study in differentiating groupthink and 
non-groupthink decisions (cf. Cooley & Lohnes, 
19[...]). 

The discriminant function obtained was 
significant at the .001 level, χ²(3) = 72.7, and 
accounted for 45% of the total variation 
between groups. The standardized discriminant 
function coefficients for maximally 
distinguishing the two groups were (a) integrative 
complexity, .89; (b) evaluations of political 
groups with which speaker identifies, -.21; 
and (c) evaluations of domestic and foreign 
opponents, .08. This pattern indicates that 
public statements in groupthink crises can be 
best distinguished from public statements in 
non-groupthink crises on the basis of 
integrative complexity and, to a lesser extent, 
evaluations of political groups with which the 
decision makers identify. The “evaluations of 
opponents” variable appears to have no 
independent discriminatory power. The 
discriminant function correctly predicted the 
groupthink or non-groupthink origins of 78% of the 
public statements.² 
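
To make concrete how a standardized discriminant function is applied to a single statement, the sketch below combines the three reported coefficients with standardized (z) scores. Only the coefficients come from the text; the variable names, the cutoff of zero, and the assumption that higher function scores indicate non-groupthink statements are illustrative assumptions, not details reported in the article.

```python
# Standardized discriminant function coefficients reported in the text.
COEFFICIENTS = {
    "integrative_complexity": 0.89,
    "in_group_evaluation": -0.21,
    "opponent_evaluation": 0.08,
}


def discriminant_score(z_scores):
    """Weighted sum of standardized scores on the three dependent variables."""
    return sum(COEFFICIENTS[name] * z_scores[name] for name in COEFFICIENTS)


def classify(z_scores, cutoff=0.0):
    """Assign a statement to a group.

    Assumption: higher scores point toward non-groupthink statements, and
    the cutoff of 0.0 is a hypothetical choice made for this sketch.
    """
    if discriminant_score(z_scores) > cutoff:
        return "non-groupthink"
    return "groupthink"


if __name__ == "__main__":
    statement = {
        "integrative_complexity": 1.0,   # one SD above the mean
        "in_group_evaluation": 0.5,
        "opponent_evaluation": -0.2,
    }
    print(round(discriminant_score(statement), 3), classify(statement))
```

The dominance of the complexity coefficient means that, in this sketch, a statement's classification is driven almost entirely by its standardized complexity score.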


Discussion 


The results for two of the three dependent 
variables strongly supported the predictions 
derived from Janis’s groupthink analysis. 
Public statements of decision makers in 
groupthink crises were characterized by 
significantly lower levels of integrative complexity 
than the public statements of decision makers 
in non-groupthink crises. Decision makers in 
groupthink crises evaluated political groups 
with which they identified more positively 
than did decision makers in non-groupthink 
crises. However, contrary to expectation, 
groupthink and non-groupthink decision 
makers did not significantly differ in the 
intensity of their negative evaluations of 
domestic and international opponents. 

With the exception of the last dependent 
variable, the current findings converge 
impressively with the conclusions of Janis’s 
intensive case studies. The convergence is 
impressive primarily because the findings of this 
study and those of Janis (1972) are based 
upon markedly different types of data (public 
statements versus the retrospective accounts 
of observers and participants) that have been 
processed in very different ways (quantitative 
content analysis vs. intuitive reconstruction of 
historical episodes). The present study 
underscores how multiple methods of 
investigation, ranging from case studies to content 
analysis studies to laboratory experiments, 
can be brought together to test the validity of 
the groupthink construct. 

2 A separate discriminant analysis was performed 
without the Bay of Pigs data. The results did not 
differ substantially from those reported above. 

Nonetheless, it is appropriate to introduce 
a cautious note. The evidence reported here 
does not rule out all alternative explanations. 
The skeptic can point out that the groupthink 
and non-groupthink crises differed in ways 
that render interpretation of the results 
somewhat ambiguous. For instance, as noted 
earlier, the groupthink decisions all led to 
unsuccessful military interventions or 
escalations, whereas the non-groupthink decisions 
did not lead to significant military action. 
Differences in public statements prior to these 
decisions may reflect differences in 
propaganda strategies used to convince the 
American public to accept military or nonmilitary 
solutions to conflicts. Political leaders may 
justify militaristic decisions by emphasizing 
in simple terms their total opposition to the 
enemy and their total dedication to their own 
nation’s values. The disadvantage of this 
interpretation is that, unlike the groupthink 
model, it cannot explain most of the available 
historical evidence on the behavior of policy 
makers in the relevant crises (see De Rivera, 
1968; Graff, 1970; Janis, 1972; Neustadt, 
1964). The propaganda explanation accounts 
only for the results of the content analyses of 
public statements. 

A decisive test of the groupthink and 
propaganda explanations will probably elude us 
for a long time. At least two lines of 
research can, however, help to clarify the 
applicability of the competing positions in 
particular [...] statements of decision makers in the 
hypothesized groupthink and non-groupthink 
crises. If groupthink is operating, 
manifestations of the syndrome should appear in 
private and public statements. If the propaganda 
position is correct, there is no reason to expect 
the private statements of groupthink and 
non-groupthink [...] differ, even 
though significant differences would emerge in 
[...] to draw upon case studies in the historical and 
political science literatures to identify a much 
larger sample of probable groupthink and 
non-groupthink decisions, some of which have led 
to war and some of which have not. If the 
propaganda position is correct, symptoms of 
groupthink should appear in public statements 
prior to the decision to go to war, 
independently of how the decision was made. If the 
groupthink position is correct, manifestations 
of groupthink should appear only in the public 
statements of the groupthink decision makers. 

Regardless of whether the war propaganda 
or groupthink interpretation of this study is 
correct, the results reported here have 
intriguing practical implications. The propaganda 
interpretation suggests that it should be 
possible to predict in advance the occurrence of 
military interventions or escalations using 
content analysis indicators of integrative 
complexity and attitude polarization. The 
groupthink interpretation suggests that it should 
be possible to use the same content analysis 
indicators to monitor the quality of decision 
making of government [...]. Such 
monitoring activity may have the beneficial effect 
of sensitizing policy makers to the manner in 
which they make decisions and may, in the 
[...] improvements in the quality of their 
decision making. 

Finally, a few comments are in order 
concerning the lack of significant differences in 
negative evaluations of opponents between 
groupthink and non-groupthink crises. This 
result is puzzling for both the groupthink 
and war propaganda explanations. [...] 
a methodological reason [...] 
opponents in groupthink and non-groupthink 
crises. In this regard, the fact that the public 
[...] fewer 
negative than positive evaluative assertions is 
[...]. This finding is not surprising in 
view of the considerable social psychological 
evidence [...] 
more likely to offer positive than negative 
evaluations and that persons giving positive 
evaluations are themselves regarded as more 
attractive than those giving negative 
evaluations (see Folkes & Sears, 1977). It is 
probably effective impression management for 
politicians to “accentuate the positive” in their 
public statements. Diplomatic norms may add 
further pressure in this direction. An advocate 
of the groupthink position could simply argue 
that groupthink decision makers were 
reluctant to express fully their negative feelings 
toward opponents. 

The argument above points again to the 
need for research that analyzes verbatim 
records of actual group deliberations. A more 
conclusive test of the groupthink analysis 
awaits the declassification of such documents. 


References 

[...] 
Allison, G. T. Essence of decision: Explaining the 
Cuban Missile Crisis. Boston: Little, Brown, 1971. 
Axelrod, R. (Ed.). The structure of decision. 
Princeton, N.J.: Princeton University Press, 1976. 
Bem, D. Self-perception theory. In L. Berkowitz 
(Ed.), Advances in experimental social psychology 
(Vol. 6). New York: Academic Press, 1972. 
Clark, H. H. The language-as-fixed-effect fallacy: 
A critique of language statistics in psychological 
research. Journal of Verbal Learning and Verbal 
Behavior, 1973, [...] 
[...] 
Holsti, O. R. [...] enemy. In D. J. Finlay, O. R. Holsti, 
& R. R. Fagen (Eds.), Enemies in politics. Chicago: 
Rand McNally, 1967. 
Holsti, O. R. Foreign policy formation viewed 
cognitively. In R. Axelrod (Ed.), The structure of 
decision. Princeton, N.J.: Princeton University 
Press, 1976. 
Janis, I. L. Victims of groupthink. Boston: 
Houghton Mifflin, 1972. 
Janis, I. L., & Mann, L. Decision making. New 
York: Free Press, 1977. 
Jervis, R. Perception and misperception in 
international politics. Princeton, N.J.: Princeton 
University Press, 1976. 
Neustadt, R. E. Presidential power: The politics of 
leadership. New York: Signet, 1964. 
Osgood, C. E. The representational model and 
relevant research methods. In I. de S. Pool (Ed.), 
Trends in content analysis. Urbana: University of 
Illinois Press, 1959. 
Osgood, C. E., Saporta, S., & Nunnally, J. C. 
Evaluative assertion analysis. Litera, 1956, 3, 
47-102. 
Schroder, H. M. Conceptual complexity. In H. M. 
Schroder & P. Suedfeld (Eds.), Personality theory 
and information processing. New York: Ronald 
Press, 1971. 
Schroder, H. M., Driver, M. J., & Streufert, S. 
Human information processing. New York: Holt, 
Rinehart & Winston, 1967. 
Suedfeld, P. Measuring integrative complexity in 
archival materials. In H. Mandl & G. L[...] 


The Birth Order Puzzle 


R. B. Zajonc, Hazel Markus, and Gregory B. Markus 
University of Michigan (Ann Arbor) 


Studies relating intellectual performance to birth order report conflicting 
results, some finding intellectual scores to increase, others to decrease with 
birth order. In contrast, the relationship between intellectual performance and 
family size is stable and consistently replicable. Why do these two highly 
related variables generate such divergent results? This birth order puzzle is 
resolved by means of the confluence model that quantifies the influences upon 
intellectual growth arising within the family context. At the time of a new 
birth, two opposing influences act upon intellectual growth of the elder 
sibling: (a) his or her intellectual environment is “diluted” and (b) he or she 
loses the “last-born’s handicap” and begins serving as an intellectual resource 
to the younger sibling. Since these opposite effects are not equal in 
magnitude, the differences in intellectual performance among birth ranks are 
shown to be age dependent. While elder children may surpass their younger 
siblings in intellectual performance at some ages, they may be overtaken by 
them at others. Thus when age is taken into consideration, the birth order 
literature loses its chaotic character and an orderly pattern of results 
emerges. 


Reuben, you are my first born, My might, and the 
fruits of my strength, pre-eminent in pride and 
pre-eminent in power. (Genesis 49:3) 


In most societies some individuals come to 
control great wealth, to wield enormous power, 
and to acquire coveted honors, titles, and 
privileges just by virtue of their position in 
the family. Faith in superior endowment of 
the first offspring has fostered laws of 
succession, inheritance, and intestacy favoring the 
firstborn child, especially the male child, 
that still prevail in many countries. In fact, 
it was not until the turn of the century that 
this faith and the corresponding laws of 
primogeniture were challenged. At this time, 
geneticists, pathologists, and psychologists 
began to explore the question from a variety 
of perspectives, and indeed, the first studies 
appeared to provide empirical substance for 
the belief in the superiority of the eldest (cf. 
Ellis, 1904; Galton, 1874; Gini, 1915). For 
example, Cattell and Brimhall (1921) 
published their observations on the birth rank of 
scientists and concluded that the firstborn is 
to be found in greater frequency among 
scientists than the later born, and that this 
overrepresentation occurs for all family sizes. 

With continued attention to this question, 
however, contradictory results soon began to 
appear. Thurstone and Jenkins (1929) 
examined a large number of children and 
concluded quite explicitly: 


On the whole the later born siblings tend to be on the 
average brighter than the first-born. Not only does 
[...] be the case in the comparison of the 
[...] children, but the rise [...] 
the order of birth seems to con[...] 
eighth-born chi[...] 


In a subsequent study, Steckel (1930) 
substantiated Thurstone’s results. 

Other studies followed. Some reported 
increments with birth order, some decrements, 
and several failed to find any relationship 
whatever. Among studies that found 
decreasing intelligence or scholastic scores with birth 
order were those of Altus (1965), Bayley 
(1965), Breland (1974), Belmont and 
Marolla (1973), Lunneborg (1968, 1971), and 
Schachter (1963). Increases in intelligence 
scores with birth order were reported by 
Arthur (1926), Commins (1927), Hill 
(1936), Koch (1954), and Willis (1924). In 
a study by Hsiao (1931), some samples 
showed a positive relationship with birth order 
and others a negative relationship. And Bayer 
(1966) and McCall and Johnson (1972) 
failed to find any relationship whatever 
between birth order and intelligence. The latter 
authors suspected, in fact, that the 
correlations of IQ with birth order approach zero “in 
those studies where more careful attention is 
given to sample design and to subsequent 
controls” (p. 208). The research literature on 
birth order appeared so confused that one 
serious reviewer was prompted to declare 
birth order unworthy of further research 
efforts (Schooler, 1972). 

In contrast to the inconsistencies among the 
birth order data are the results on the in- 
tellectual consequences of family size (Ter- 
hune, 1976), Figure 1 shows the results of 
surveys of the intellectual levels of six large 
populations. In spite of differences in age, sex, 
nationality, and type of test given, they all 
show a strikingly similar pattern of decline in 
scores with family size. 

This, then, is the birth order puzzle, Why 
is the effect of family size such a consistent 
one, showing itself over and over again in the 


lleçtual perform. in S.D. units 
eos nes ye ge 


The Netherlands, 1965, France, 1973 
taven i 


1A. (Gihte) 
6 = 14 yr olds 


intellectu: 


6 19 yr, olds 


Family 


Figure 1. Family size (number of children) and intellectual performance in six large samples- 
N.MS.Q.T. = National Merit Scholarship Qualifying Test.) 


(S.D. = standard deviation; 


R. ZAJONC, H. MARKUS, AND G. MARKUS 


literature, while at the same time the effect 
of birth order—a closely related factor—p 
sents such a chaotic picture? The birth order 
problem is especially troublesome, for none of 
the variables examined in the literature cay 
organize these results into an orderly set of 
generalizations. For example, the suspicion) 
that birth order effects are found primarily on} 
verbal proficiency tests is seriously under.) 
mined by the strong effects found also with 
the Raven Progressive Matrices Test (Bel. 
mont & Marolla, 1973). The possibility that 
socioeconomic factors may mediate these ef- 
fects is dispelled by several studies (eg., In- 
stitut National d’Etudes Démographiques 
[INED], 1973; Claudy, Note 1) showing 
parallel effects for a variety of socioeconomic’ 
Status categories. Cohort effects, too, are soon 
ruled out as a possibility because Galton’s 
cohort of the 19th century showed effects no 
different from those of the much later cohort 
examined by Cattell and Brimhall (1921) nor 
from those of the most recent cohort of high 
school seniors examined by Claudy (Note 1). 
A recent analysis of intellectual develop-
ment may supply a solution to the birth order
puzzle. A model termed the confluence model
(Markus & Zajonc, 1977; Zajonc, 1976;
Zajonc & Markus, 1975) was developed to ex-
plain a large body of intelligence data pub-
lished in 1973 by Belmont and Marolla. These
data on birth order and family size exhibited
five important features: (a) intelligence scores
declined with family size; (b) within each 




family size they declined with birth order; 
(c) if the last child was ignored, the decline 
with birth order seemed to be decelerated; 
(d) the decelerating birth order trend was not 
followed by the last child, who showed a dis- 
continuous drop in intellectual performance, 
and (e) the only child, too, showed a discon- 
tinuity in that if the family factor were sys- 
tematically negative in influencing IQ, the 
only child should have had the highest average 
of all, which was not the case, as can be 
clearly seen in Figure 1. The confluence model 
was constructed to reflect these features of the 
Belmont-Marolla data, and it has been refined
since its original publication to accommodate 
new data (see Appendix). 




The Confluence Model 


The basic assumption of the confluence 
model, both in its earlier and its repara- 
metrized version presented here, is that family 
influences that contribute significantly to in- 
tellectual growth can be divided into two 
sources. Specifically, the rate of intellectual
growth is hypothesized to be a function of the
intellectual environment within the family, α_t,
and a factor, λ, associated with the special cir-
cumstances of last children.

The intellectual environment is defined to 
be a function of the absolute intellectual levels 
of all family members, including the indi- 
vidual whose intellectual development is being 

analyzed.¹ These intellectual levels are not to
be confused with IQ, which is a measure rela-
tive to age. Nor are th


monly set equal to chron 
they are absolute quantities th 
individual’s mental 
here, for example, of the indivi 
lary, which is accumulated not in 
increments but according to 4 sigmoi 
tion. 

As the family members mature, the intel- 
lectual environment the most 
pronounced changes 
new children are born into the
grown members leave. Since infants are in-
tellectually immature,
average intellectual level in 
over, the smaller the family, the greater is the
impact of a new birth (or a departure of an
adult) on the resultant intellectual environ-
ment. Therefore, the major family configura-
tion variables that affect α are family size and
length of intervals between the births of suc- 
cessive siblings. 
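
The dilution argument can be made concrete with a small sketch. It assumes, purely for illustration, that the intellectual environment is the simple mean of the members' absolute levels, that an adult sits at 100 arbitrary units, and that a newborn sits at 0. The units, the sample family, and the function names are hypothetical and do not come from the model's Appendix.

```python
from statistics import mean

ADULT_LEVEL = 100.0    # assumed arbitrary units, not taken from the article
NEWBORN_LEVEL = 0.0    # assumed: a newborn contributes essentially nothing


def environment(levels):
    """Illustrative family environment: mean absolute level of all members."""
    return mean(levels)


def drop_after_birth(levels):
    """How far the mean falls when a newborn joins the listed members."""
    return environment(levels) - environment(levels + [NEWBORN_LEVEL])


two_parents = [ADULT_LEVEL, ADULT_LEVEL]
larger_family = two_parents + [60.0, 40.0, 20.0]   # three hypothetical children

small_family_drop = drop_after_birth(two_parents)
large_family_drop = drop_after_birth(larger_family)

print(f"drop in a two-adult household:   {small_family_drop:.1f}")
print(f"drop in a five-member household: {large_family_drop:.1f}")
```

Under these assumptions the same birth lowers the average more in the smaller household, which is the sense in which the impact of a new birth shrinks as family size grows.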

The term λ reflects the two discontinuities
found in the Belmont-Marolla data for only 
and last children. These discontinuities exist, 
it is hypothesized, because of a particular dis- 
advantage suffered by all last children, includ- 
ing only children. The intellectual respon- 
sibilities of older and younger children differ 
within the family. Specifically, only and last 
children are not teachers in the broad sense 
of the word, and it is assumed that the rela- 
tively low performance scores often reported 
for them in the literature are associated with 
this lack of teaching opportunity.²

Since last-born children do not normally 
serve as intellectual resources to their siblings, 
a condition assumed to enhance intellectual 
growth (Zajonc & Markus, 1975), and since, 
except for twins, each child is at some age the
last child in the family, the rate of intellectual
growth during such periods is affected accord-
ingly. Specifically, λ_t will be equal to zero
throughout the period when the child is last
born and will take on a positive value once a
younger sibling is born. It is assumed, further-
more, that the beneficial effects of the teach-
with age too, and that they
of the individual


1 We shall use the term family to include also indi-
viduals who are not related to each other (such as
permanent guests, domestic help, etc.) but who live
in the home and can be considered to be members of
the immediate intellectual environment.

2 While there are no studies on the teaching func- 
some recent research has shown 
derived by tutors in 
(Devin-Sheehan, 


school cross-peet 
Morgan & Toy, 


Feldman, & Allen, 

1970; Richer, 1973; W most cases the 
benefits accruing to the tutor exceed those accruing 
to the tutee. In one study, for example, there was a
g-month gain on the Wide Range 
for tutors and only a 3-to-5-month gain for their 
charges (Morgan & Toy, 1970). 
do much tutoring at an age less than 2 years,
and little tutoring can be administered to an
infant. The term λ_t reflects these considera-
tions, in the reparametrized version of the
model, representing them as the product of 
the two increments—that pertaining to the 
individual whose mental maturity is being 
calculated and that pertaining to the next sib- 
ling. All members of the family, except the 
last born, serve as intellectual resources to one 
another. The benefits that derive from this 
function are therefore divided among the par- 
ents and all the children but the last one. 
Thus, λ_t has n − 1 in its denominator.

Birth order has no independent effect on 
intellectual development within the confluence 
model. Instead, effects previously thought to 
be associated with birth order are posited to 
be mediated substantially by family size and
the spacing of births. Short intervals force
later-born children to spend a larger portion 
of their period of growth in a family environ- 
ment “diluted” by the presence of young 
children. As intervals between successive chil- 
dren increase, new children enter relatively 
more advanced environments that enhance 
their growth. Hence the birth order effects 
may be mitigated or even reversed by varia- 
tions in birth intervals (Markus & Zajonc, 
1977; Zajonc, 1976; Zajonc & Markus, 1975). 
The reparametrized confluence model pro- 
vides a fairly exact fit (obtained by means of 
iterative procedures) for all of the large data 
sets thus far analyzed, responding well to the 
wide variety of patterns of empirical results, 
some of which show a negative effect of birth 
order on intelligence, others a U-function or 
no effect at all.


Dynamics of the Intellectual Environment 
and Individual Patterns of Growth 


According to the confluence model, the in- 
tellectual environment changes continually as 
family members mature and as the family ex- 
pands (or contracts). Hence, entries at differ- 
ent points into this dynamic environment 
must result in radically different patterns of 
growth. Some children may grow up in a 
family environment marked by abrupt shifts 
that are occasioned by many additions or de-
partures, whereas their siblings might enter a
different time segment of the same environ-
ment, where changes are much more gradual.
As a consequence, comparisons of intellectual 
levels at one point in time in the family his- 
tory or at a given age may yield different 
values from those obtained from comparisons 
made at another time. The age at which birth 
order effects are evaluated is thus of pivotal 
significance, and the major concern of the
remainder of this article is with these varia-
tions in birth order effects that occur as a
function of the age of testing. 

The dependence of birth order effects on age 
is one of the unanticipated discoveries of the 
confluence model. This dependence can be 
readily understood if we consider the growth 
patterns of only children and children who 
have just one sibling. Note that, other things 
constant, only children should develop undis- 
turbed until maturity. There are no events 
within the family during this period that 
would cause sharp discontinuities in their rate 
of growth. Yearly increments in growth are 
only those that are associated with the matur- 
ation of all the family members. For the same 
reason, the second of two children will also 
develop undisturbed until maturity (provided 
that his or her older sibling does not leave 
home and there are no other major changes in 
the family situation). In principle, the above 
statement holds for all last children, and it 
holds only for them. The growth rates for 
these children will show no discontinuities. In 
contrast, the older of two siblings should 
manifest a discontinuity in his or her growth
pattern, and such discontinuity should coin- 
cide with the acquisition of the younger sib- 
ling. From his or her own birth until the birth 
of a younger brother or sister, the first of two 
children grows at the same rate as the only 
child. The family conditions of their growth 
are identical until that age. But when the 
second child is born, two dramatic changes 
take place: (a) The average of the intellectual 
level within the family is decreased by the 
addition of the immature member and (b) the 
firstborn acquires a teaching function (i.e.,
the value of his or her λ_t changes from zero to
a positive number).

Dunn (Note 2) reports findings consistent
with these hypothesized changes for young
children (18 to 43 months) shortly after they
acquire a younger sibling. She observed sev-

play together the firstborn tends to play relatively
simple games; he approaches the level

The birth of a second child can have one
of several effects on the firstborn, depending
[...] made: (a) If the decrement occasioned by
the dilution of the environment due to the new
birth exactly equals the gain derived from the
teaching function, the growth rate of the
firstborn will not be [...] effect more pronounced
than the [...] change in the value of α, the
growth [...] compensated for by the change in λ,
however, we [...] an initial depression [...]
growth rate relative to that of the only [...]

The first two outcomes [...] that [...] possibility
seems fairly improbable. [...] possibility, which
presumes that [...] from teaching accrue more [...]
gradually, seems more likely. [...] w1, w2, and k
(see Appendix), which fit aggregate [...] those of
Belmont and Marolla (1973), it is the third effect
that is most consistent with [...]

[...] this third possibility were to obtain, then
according to the confluence model the [...] must
be age bound. [...] and their intellectual skills
increase, [...] environment within the family [...]
The effects of the teaching function, [...] cumulate
for the firstborn but not for the only child, a
condition that could allow the firstborn eventually
to surpass the only child. [...] prediction is
illustrated in Figure 2, in which the intellectual
levels of the only child and of both siblings in a
two-child family are plotted [...] seen in these
growth curves that the only child and the first of
two grow at the same rate until the latter acquires
a sibling at age 5 years. Subsequently, the rate of
growth of the firstborn drops below that of the only
child. Eventually, the firstborn will surpass the
only child, however, and will retain this superiority
until maturity, because his or her teaching function
continues to cumulate and eventually compensates for
the decrement in the α component, originally brought
about by the acquisition of a younger sibling.

Figure 2. Theoretical growth curves of intellectual
levels of only children (M11) and of firstborns (M12)
and last borns (M22) in families of two children.

3 The differences between the curves have been
exaggerated for illustrative purposes.

The comparison of the growth rates of the [...]
is also of interest. At birth the second born
comes into an intellectual environment that is
[...] than [...] encountered by the firstborn.
However, the rate of change in these environments
is quite different, and the second [...] the
firstborn's rate of growth. Consider, for example,
a family with two children whose births are
separated by 3 years. At age 8, the intellectual
environment of the oldest will reflect the
intellectual level of him-/herself (an 8-
Table 1


Test Results of Children Ranked by Birth Order 




Author and test 


Bryant & Davies, 1974 
Denver Developmental 
Screening Test 


Bayley, 1965 
Bayley Scales of Motor 


and Mental Development 


Abe, Tsuji, & Suzuki, 1963 
Draw-A-Person-Test 
Koch, 1954 


SRA Primary Abilities Test 


Arthur, 1926 
Kuhlmann-Binet


Hsiao, 1931 
Stanford-Binet 


National Intelligence Test 
Jensen, 19742 
Lorge-Thorndike 


Willis, 1924 
Stanford-Binet 


Schoonover 1959 
Stanford-Binet 


INED, 1973 
INED Test 


Tabah & Sutter, 1954 
Gille IQ Test 


Commins, 1927 
McCall Multi-Mental 


Population 


668 Cardiff infants 


1,260 infants examined in 
the 10 institutions of 
the Perinatal Research 
Project* 


302 Osaka children 
360 Chicago children 


92 pairs of siblings from 
Minnesota with Fin- 
nish, Russian, and 
South European names 


190 pairs of siblings from 
the same populations, 


all tested at the same 
age 


133 Berkeley elementary 
school sibling pairs 


258 unrelated subjects 
118 sibling pairs 


1,959 unrelated subjects 


219 sibling pairs 


59 sibling pairs 


33,339 elementary 
school children 


2,488 elementary
school children 


142 elementary school 
sibling pairs 


Age 




Results 
First- Second 
Only born born 


16-375 days 


1-15 months 


3 years 
6 years 


5-7 years 


7.5 years 


8 years 
10 years 


6-14 years 
Elementary 
school 
Elementary 
school 
6-14 years 


6-13 years 


Grades 3-8 


Firstborn shows a faster gross
and fine motor develop-
ment.


Firstborn significantly super-
ior on mental score in
Months 5, 7, 12, and 13, on
motor score in Months 6, 7,
12, 14, and 15. Second born
shows no significant super-
iority over firstborn in any
of the 15 months.


2.0ᵇ 2.5ᵇ
108.2 111.1
93.0 99.1
90.1 96.9
104.4ᶜ 105.2ᶜ
103.5ᶜ 108.1ᶜ
103.0 104.9
104.7 109.9 109.1
97.1 99.7
24.8ᵉ 25.7ᵉ
103.6 107.0 106.7 
100.2 100.5 


In 70% of the pairs, firstborn 
has lower scores 
Table 1 (continued)
Results 
First- Second 
Author and test Population Age Only born born 
Chittenden, Foan, & Zweil, 1968 
Iowa Basic Skills 62 sibling pairs in Grades 4-5 In 58% of the pairs, firstborn 
Massachusetts has higher scores; in 35%, 
lower scores.
67 sibling pairs in Grades 7-8 In 64% of the pairs, firstborn 
Massachusetts has higher scores; in 31%, 


lower scores.° 


Richardson, 1936 
Stanford-Binet 101 sibling pairs tested 11.5 years 95.5 94.5 
at the same age 


101 sibling pairs tested Firstborn, 94.3 102.1 
at different ages 12 years 
Second born 
8.5 years 
SCRE, 1949 
Scottish Council Verbal 23,403 Scottish children 11 years 42.0 41.9 41.5 
Test 
Eysenck & Cookson, 1970 
Verbal Reasoning 4,000 primary school 11 years 96.4 96.1 96.1 


children in England 


Cicirelli, 1967 
California Short Form Test 609 suburban Detroit Grade 6 


of Mental Maturity children 
Males 108.2 114.4 
Females 117.9 110.8 
Hill, 1936 
Otis, Form A New Jersey children, 12 years 83.6 87.6 
66 sibling pairs 
72 unrelated matched 12 years 84.2 87.5 
pairs 
Oberlander et al., 1970
SRA High School Place-
ment Test 153 students from a Grade 8 113.3 114.9
northern suburb of
Chicago
Svensson, 1972
Opposites, Number Series, 10,025 Swedish school 13 years 21.3 21.5 21.3
Spatial children (national
sample)
Hsiao, 1931
Terman Group Test 366 sibling pairs 13 years 109.1 106.3
569 unrelated subjects 13 years 108.8 108.1
Davis, Cahan, & Bashi, 1977
Quantitative Skills 82,689 Israeli children 14 years aT o 47
Test Z scores of Western origin
109,304 Israeli children 14 years −.09 .16 −.01
of Oriental origin
Table 1 (continued)


Results

First- Second
Author and test Population Age Only born born


Lunneborg, 1968
Washington Precollege High school seniors 17-18 years
Battery 2,878 males Firstborns scored higher than
later borns or only children
on 18 of 18 tests
2,523 females Firstborns scored higher than
later borns or only children
on 14 of 18 tests


Lunneborg, 1971
Washington Precollege 896 siblings of two-child 16 years Firstborns score higher on 15
Battery families of the 16 ability measures
Breland, 1974 
National Merit Scholarship 794,589 high school 17 years 103.7 106.2 104.4 
Qualification Test students from a na- 


tional competition 
Burton, 1968 


Project TALENT IQ 43,352 high school seniors 18 years 
Measure males 50.1 48.7
females 49.7 48.9 
Claudy, Note 1 
Project TALENT IQ 81,175 12th graders 17-18 years .08 30 07 
Measure in scores (national sample) 
Altus, 1965 
SAT 1,878 University of 17-18 years 538.9 518.8 


California students 
Schachter, 1963 
GPA 


628 Minneapolis high 14-18 years 2.31 2.35 2.23 
school students 
Rosenberg & Sutton-Smith, 1969 
ACE (total) Students at Bowling 19 years 
Green State University 
355 males 52.0 49.5
658 females 53.2 52.4 
Belmont & Marolla, 1973 
Raven 386,114 Dutch males 19 years 2,688 2.5756 2.678: 
born 1944-1946 


Note. SRA = Science Research Associates. INED = Institut National d'Études Démographiques. SCRE
= Scottish Council for Research on Education. SAT = Scholastic Aptitude Test. GPA = Grade Point
Average. ACE = American College Entrance Examination.

ᵃ Boston, Providence, New Haven, Buffalo, New York City, Philadelphia, Memphis, New Orleans, San
Francisco, and Portland, Ore.
ᵇ High scores indicate superior performance.
ᶜ Corrected for age.
ᵈ Arthur R. Jensen kindly made his data from the Berkeley Unified School District study available for the
purposes of this analysis.
ᵉ Months over average norms.
ᶠ Composite score.
ᵍ 1 = high score; 6 = low score.


year-old), the parents, and the sibling, aged 5.
When the last born is 8 years old, however,
the environment will consist of him-/herself,
the parents, and a sibling aged 11 years. Thus,
at a given age the second born's environment
may be superior to the environment that sur-
rounded the older sibling at the same age.⁴
The accelerated intellectual growth of the 
second born is temporary, however. Although 
at a given age the second born may enjoy a 
superior environment, he or she does not 
accrue the benefits of the teaching function. 
The firstborn will benefit by virtue of being a 
teacher, and the cumulation of this benefit 
may eventually lead the firstborn to surpass 
his or her sibling. 
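
The 3-year-gap example can be worked through numerically. The sketch below is illustrative only: it assumes a logistic growth curve with made-up constants, parents fixed at 100 arbitrary units, and an environment equal to the mean level of everyone in the home, including the child being analyzed. The function names and constants are hypothetical and are not the parametrization given in the Appendix.

```python
import math

PARENT_LEVEL = 100.0   # assumed adult level in arbitrary units


def level(age):
    """Hypothetical sigmoid growth curve (midpoint and slope are made up)."""
    return PARENT_LEVEL / (1.0 + math.exp(-(age - 9.0) / 2.5))


def env_firstborn(age, gap):
    """Mean level around the firstborn; the sibling counts only once born."""
    members = [PARENT_LEVEL, PARENT_LEVEL, level(age)]
    if age >= gap:
        members.append(level(age - gap))
    return sum(members) / len(members)


def env_second_born(age, gap):
    """Mean level around the second born, whose older sibling is always present."""
    members = [PARENT_LEVEL, PARENT_LEVEL, level(age), level(age + gap)]
    return sum(members) / len(members)


GAP = 3  # years between the two births, as in the example in the text
for age in (1, 2, 5, 8):
    first = env_firstborn(age, GAP)
    second = env_second_born(age, GAP)
    print(f"age {age}: firstborn env {first:.1f}, second-born env {second:.1f}")
```

With these assumed curves the firstborn's early environment is the richer one, while from the birth interval onward the second born's environment at a given age is the higher of the two. The sketch covers only the environment term; it leaves out the teaching function, which is what the text credits for the firstborn's eventual advantage.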

The differences in the rates of growth of the 
first and second children apply equally to all 
next-to-last and last children. In summary,
given the values of the parameters w1, w2, and
k such as we found for aggregate intelligence 
data, the confluence model leads to the follow- 
ing expectations: 

1. The rate of intellectual growth of the
only child and of the first of two is the same 
until the age when the firstborn acquires a 
younger sibling. 

2. After the birth of the younger sibling, 
there is a decline in the intellectual environ- 
ment, and relative to the only child, the rate 
of growth of the firstborn of two is tem- 
porarily depressed. 

3. A few years after the birth of his or her 
younger sibling, the firstborn acquires a 
teaching function, and thus his or her rate of 
growth accelerates and eventually surpasses 
the rate of growth of the only child. At ma- 
turity, the first of two will have a higher in- 
tellectual level than the only child. 

4. Immediately after birth, because of a 
diluted environment, the second born develops 
more slowly than did the firstborn at the same 
age. He or she continues at that lower rate of 
growth until an age equal to at least the 
length of the interval separating him/her 
from the older sibling. 

5. The second born can surpass the first- 
born at some age equal to at least the interval 
separating him or her from the older sibling. 

6. The advantage of the second born over 
the firstborn that exists at age levels greater 
than the birth interval is temporary, and at 
maturity the firstborn exceeds the second born 
in intellectual level. 

7. The changes in growth rates of first and 
second children given in 4, 5, and 6 occur for 
all next-to-last and last children. 
Research Evidence


Data that could best reveal the dependence 
of birth order effects on age predicted by the 
confluence model would entail the repeated 
observations of entire families. Data of this 
type should exhibit the expected discontinu- 
ities in the growth curves. Unfortunately, in- 
formation on entire families over time is not 
available. However, cross-sectional compari- 
sons of individuals of different ages may be 
made from the existing literature, and al- 
though they are indirect, they are quite in- 
formative. Table 1 summarizes the literature 
on birth order effects arranged according to 
the age of subjects. The studies selected were 
those in which there was information about 
sibships of Sizes 1 or 2 and in which birth 
order effects were not confounded with family 
size. Studies comparing firstborns and second 
borns averaged over families of various sizes 
were thus not included. 

Table 1 shows the pattern of results that 
we would indeed expect on the basis of the 
confluence model. For very young children 
the firstborn surpasses the second born. For 
children of older ages (from about age 3 to 
13 years) the birth order effect seems to be 
reversed. From age 3 years to about age 13 or 
14, with almost no exceptions, it is the second 
born who is higher in intellectual performance. 
The few exceptions occur at about age 11 (the 
age of the crossover) or in samples that span 
several ages. Homogeneous age groups, how- 
ever, show a consistent pattern. From age 14 
up, the superiority of the firstborn reappears 
and is maintained thereafter without excep-
tion. Studies that examine adults never show
the second born surpassing the firstborn. For 
example, Cattell and Brimhall (1921) found 
firstborns to be overrepresented in American 
Men of Science relative to second borns in 
two-child families. Gini (1915) reported an 
overrepresentation of firstborns among Italian 
university professors, as did Poole and Kuhn 
(1973) among British university graduates. 

Studies on birth order effects with children 


4 It should be clear that the second born’s environ-
ment will be necessarily superior only at ages greater 
than the length of the birth interval. 
of intermediate ages that were not included
in Table 1 because of the confounding of birth 
order with family size or because of insuf- 
ficient information, but in which inferences 
can be made about the likely form of this 
confounding, are in general agreement with 
the studies that were included (Kellaghan & 
MacNamara, 1972; Marjoribanks & Walberg, 
1975; McCall, 1973; McCall & Johnson, 
1972; Murray, 1971; Oberlander & Jenkins, 
1967; Orum, 1971; Steckel, 1930; Thurstone 
& Jenkins, 1929). Interestingly, many of these 
studies, especially those having populations of
young teenagers, show no birth order effects 


and often cite their results as counterexamples
to other studies that claim birth order to be
a significant factor in intellectual develop-
ment.

Since some studies show negative effects,
others positive, and still others no differences
at all among the various birth orders, it is
clear why birth order acquired such an un-
savory reputation as an explanatory variable.
But Table 1 reveals that the data on birth
order are far from chaotic and that these data
can be ordered fairly systematically according
to age. The predicted crossover effect is
clearly seen in the pattern of the reported


Figure 3. [...] Merit Scholarship Qualification Test Study [Breland, 1974]; [e] Survey of Dutch Recruits [Belmont & Marolla, 1973]. S.D. = standard deviation.)
results. There are two results for children
less than 3 years of age. For both, the first-
born shows superiority. There are 12 studies
with children aged 3 to 10 years. Of these,
10 show the second born as superior. The
remaining 2 results in which the difference in 
intellectual performance favors the firstborn 
come from samples that spill over to an older 
age category, that is, 6 to 14 years (INED, 
1973; Jensen, 1974). For the crossover inter- 
val 11 to 12 years, 4 results favor the firstborn 
and 5 favor the second born. Finally, for chil- 
dren over age 12 all results but 1 (18 out of 
19) favor the firstborn. 
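
The tallies in this paragraph can be set side by side as proportions. The sketch below simply restates the counts given in the text; the variable and function names are hypothetical, and results that do not favor the firstborn are not split further because the text does not always say which way they fall.

```python
# (age band, results favoring the firstborn, total results), as stated in the text
tallies = [
    ("under 3 years", 2, 2),
    ("3 to 10 years", 2, 12),
    ("11 to 12 years", 4, 9),
    ("over 12 years", 18, 19),
]


def share_favoring_firstborn(favoring, total):
    """Fraction of reported results in which the firstborn scores higher."""
    return favoring / total


shares = {band: share_favoring_firstborn(f, n) for band, f, n in tallies}

for band, share in shares.items():
    print(f"{band}: {share:.0%} of results favor the firstborn")
```

Read in age order, the share starts high, falls well below one half in middle childhood, sits near one half in the crossover interval, and is high again after age 12.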

The logic of the predictions indicating how 
the differential rates at which α and λ con-
tribute to intellectual growth applies to all last
and next-to-last children. In this regard, Fig- 
ure 3 shows the results of five separate surveys 
of intellectual performance in which ages of 
the subjects vary from 8.5 to 19 years. The 
figure shows the scores of the next-to-last and 
last children in various sibship sizes. It is clear 
that the pattern found for children in sibships 
of Size 2 in Table 1 is duplicated quite sys-
tematically on all last and next-to-last chil-
dren. Of the 12 observations of children 11
years of age or younger, the aggregate data 
favor the second born in 9 cases. Of the 8 
observations of older children, all aggregate 
data show the opposite: The firstborn has a 
higher score. 

Although the pattern of results in Table 1 
and Figure 3 is consistent, there remains some
ambiguity. Age differences in these results are 
confounded with a variety of other differences. 
A homogeneous population of children of dif- 
ferent ages tested with the same materials 
would be virtually free of this confounding. 
Data of this type were secured from the In- 
stitut National d'Etudes Démographiques, 
which a few years ago conducted extensive 
surveys of intellectual performance. The sec- 
ond survey carried out by this institution 
(INED, 1973), examined a sample of more 
than 125,000 French schoolchildren. It uti- 
lized an intelligence measure containing sub- 
tests on vocabulary, comprehension, analogies, 
differences, series completion, and proverbs. 

Figure 4 shows intellectual performance of 
only children and of first- and second borns 




in two-child families of eight successive age 
groups.⁶ It is clear from the age changes in
birth order effects that the predicted patterns 
are borne out. Among younger children aged 
6 to 8 years, the second born surpasses the 
firstborn. Among the older children this pat- 
tern is reversed. Moreover, as predicted, only 
children score higher than the firstborns 
among the youngest group, but from age 7 
they show a consistent disadvantage. Analyses 
of variance showed both interactions to be 
significant, F(7, 22881) = 2.59, p < .01;
F(7, 22187) = 1.97, p < .05. Figure 4 also 
shows comparisons for all next-to-last and 
last children. The pattern of these data is sim- 
ilar to those of two-child families, although 
the difference in favor of the next-to-last over 
the last child is not present among older chil-
dren. Possibly, the full reversal of the birth
order effect does not occur for these children 
until a later age. 

In the above data, the firstborns and the 
second borns are unrelated. Data on first- and 
second borns from the same family would be 
preferable because these data would not be 
confounded with extraneous environmental 
factors such as socioeconomic status. Tabah 
and Sutter (1954) published an analysis of 
two-child families based on an earlier French 
survey (Tabah & Sutter, 1954), and calcula- 
tions of the scores of firstborns and second 
borns of different ages can be made directly 
from their tables. For children who are 9 years 
old or younger, the second borns are higher 
in IQ than the firstborns (standard scores of 
−.175 vs. −.003 for firstborns and second 
borns, respectively). For children who are 
older than 9, the pattern is reversed (.048 vs. 
.036). This interaction of birth order with age 


5 Professor A. Girard, Chief of the Department of 
Social Psychology, Institut National d'Etudes Démographiques 
in Paris, was kind enough to make some 
of these data available for the present analysis. 

6 The rising trend of intellectual scores with age 
results from the manner in which the tests were con- 
structed. Four different forms of the test were pre- 
pared. The lowest school grades received only the two 
most difficult forms, whereas the intermediate grade 
received various combinations of the other forms. 
For the purposes of this analysis, however, this 
overall rising trend can be ignored. 

Figure 4. IQ of first- and second borns in families of two and of all next-to-last and last children 
as a function of age (calculated from data supplied by INED, 1973). S.D. = standard deviation. 


is again significant, F(1, 2476) = 6.65, p < .01. 


Conclusions 


The predicted 
article clarify the 
ports on the relationship between birth order 
and intellectual performance. According to 
the confluence model, one should not be sur- 


also age dependent. Table 1 shows patterns 
of results that are entirely consistent with the 


inferences made from the confluence model. 
For very young children (i.e., children whose 
age is less than the typical interval in two- 
child families) the firstborn surpasses the sec- 
ond born. This advantage is then reversed, 
and a positive birth order effect prevails from 
age 3 or 4 years until the early teens. Finally, 
during the middle teens there is a return to an 
effect that favors the firstborn and that per- 
sists until maturity, and possibly permanently. 

This complex pattern of results may be ex- 
plained by the dynamic interplay between the 
contributions of the intellectual environment 
and the child’s opportunity to serve as an in- 
tellectual resource. Certainly, other factors 
may contribute to the patterns examined in 
this article. The age dependence of birth order 
effects has been noted previously by Cicirelli 
(1967), who interpreted it as “some sort of 
trend where at an early age the later born 
child benefits from the stimulation of an older 
sibling, and at a later age (where the ab- 
stract verbal abilities come into play in the 
school situation) the firstborn child profits 
from his closer exposure to adults” (pp. 482- 
483). It is also possible that the decline in the 
growth rate of the firstborn following the birth 
of a younger sibling may be due to a sudden 
shift in parental attention and care (Dunn, 
Note 2). Yet these alternative explanations 
lead to a variety of implications that are not 
borne out by the data. The first conjecture 
would lead us to expect an eventual equaliza- 
tion of performance when the second born 
reaches the school level, and the second ex- 
planation suggests a permanent disadvantage 
for the older child. 

Among other explanations of family size and 
birth order effects are those couched in biolog- 
ical terms. For example, there is some evi- 
dence that perinatal asphyxia-anoxia may pro- 
duce brain damage in children (Apgar, Gird- 
any, McIntosh, & Taylor, 1955; Campbell, 
Cheeseman, & Kilpatrick, 1950) and that its 
incidence increases with parity. Nutritional 
deficit and hormonal and mineral depletion 
have also been singled out as factors asso- 
ciated with successive parities and especially 
with those following each other in close suc- 
cession (Mertz & Cornatzer, 1971). In gen- 
eral, biological explanations of birth order and 
family size effects are at present not capable 
of dealing with the crossovers of the growth 
rates of only children with first- and second 
borns, with the crossovers between the sib- 
lings of two-child families, and with the cross- 
overs between all last and next-to-last chil- 
dren. That birth order is strictly an environ- 
mental factor is strongly suggested by an 
interesting study of 845 subjects from 120 
biological and 104 adoptive families carried 
out by Scarr and Weinberg (1977). These 
authors found no differences in birth order 
effects on IQ when biological and adoptive 
birth orders were compared. In fact, in a 
number of regression analyses, the coefficients 
for biological and adoptive birth order were 
remarkably similar. 

Within the context of the confluence model, 
it is quite clear that the crossover patterns of 
growth found in the data cannot be accounted 
for without assuming some form of facilita- 
tion, such as the teaching function, that oc- 
curs following the birth of a younger sibling. 
Thus, both the theory presented here and the 
confirming data give additional support to the 
idea that older children do derive some benefit 
from the presence of a younger sibling. 

It must be pointed out that the effects 
found here are for the most part rather weak. 
The largest difference in intellectual perform- 
ance (found between extreme family sizes) is 
on the order of one standard deviation. Birth 
order differences are even smaller; differences 
between adjacent birth ranks are sometimes 
less than one tenth of a standard deviation. 
But if these differences had been large and 
systematic to begin with, the role of family 
configuration in intellectual development 
would have been understood long ago. 

In a recent article, Grotevant, Scarr, and 
Weinberg (1977) applied the early outline of 
the confluence model to individual data and 
claimed that only 2% of the variance in IQ 
was accounted for by that version of the 
model. They concluded that the model was 
inappropriate for individual data and is useful 
only for population trends. The 2% of vari- 
ance that Grotevant et al. found they could 
explain, however, is not an accurate reflection 
of what the model can do. Their figures were 
calculated on the basis of the nonparametrized 
version of the model and used IQ values 
rather than mental ages. The predictions, 
moreover, were made without allowing any 
free parameters whatever and without esti- 
mating them from the data. Additional pe- 
culiarities in the calculations of Grotevant et 
al. make their results less than useful in 
evaluating the performance of the confluence 
model. Berbaum and Moreland (Note 3) used 
the current parametrized version of the model 
(see Appendix) and mental ages as the basis 
for calculating predicted values in a more 
recent analysis of individual scores that ac- 
counted for nearly half of the variance in the 
mental ages of 257 children from 51 families. 

As we have seen, the present version of the 
model fits aggregate data quite well, and the 
previous simpler linear version (Markus & 
Zajonc, 1977) performed quite well, too. As 
much as 95% and no less than 88% of the 
variance can be accounted for with three free 
parameters when the model is tested on all 
available large data sets. These predictions 
were obtained for a variety of populations 
from different countries that differ in age, 
sibling spacing, and socioeconomic background 


measures of 
intellectual performance. With only two free 
parameters (k and w2), as much as 92% and 
no less than 78% of the variance in aggregate 
data can be accounted for by the predictions 
from the model. 


ing the amount of variance in IQ that the 
model should explain. If the confluence model 
were constructed for the particular purpose of 
accounting for all the differences found in IQ 
—those of genetic origin, those associated with 


performance. As much as 25% of the variance 
i be attributed to the 


variance in children’s IQ is “controlled” 
their parents’ 


aggregate data (that is, 
have no confounding with genetic 
lectual growth curves manifest the sor 
disturbance, coinciding with 
younger siblings, 
Data of this kind, including the nec 
family information and an adequate s: 
size, are not available. When such reg 
has been carried out, it will illuminat 
transmission of environmental effects 
emerge in the pattern of individual differ x 
in intelligence described by the extens 
literature on family configuration, g 


ô: = wsAf(t), encompassing all influences on intelle 


experience, or day-care facilities, for instance, ant 
circumstances strictly related to the home. 
Appendix 
The Confluence Model: Reparametrized Version 


The level of mental maturity, $M_{ij}(t)$, attained at age $t$ by the $i$th child of $j$ 
children in a family (see Footnote 1) of $n$ individuals, is expressed as a first-order 
difference equation 

$$M_{ij}(t) = M_{ij}(t-1) + \alpha_t + \lambda_t \tag{A1}$$

where $t = 1, 2, 3, \ldots$ years. The sum of the two components, $\alpha_t$ and $\lambda_t$, both 
measured at age $t$, represents the growth increment accumulating each year. 


Since intellectual growth is not linear with age, the environmental factors that 
influence it do not have the same effects at different ages of the child. Thus, the 
acquisition of a sibling at age 3 years has a more pronounced effect than at age 12, 
when the child is much nearer intellectual maturity. In the original article (Zajonc 
& Markus, 1975), we supposed that the effects of family configuration cumulated 
as a sigmoid function of age, $f(t) = 1 - e^{-k^2t^2}$. The two components $\alpha_t$ and $\lambda_t$ 
are therefore expressed as weighted yearly increments of this sigmoid function, 
$\Delta f(t) = (1 - e^{-k^2t^2}) - (1 - e^{-k^2(t-1)^2}) = e^{-k^2(t-1)^2} - e^{-k^2t^2}$: 

$$\alpha_t = w_1 \, \Delta f(t) \left[ \sum_{i=1}^{n} M_{i(t-1)}^{2} \Big/ \left( n_{(t-1)} + 1 \right) \right]^{1/2} \tag{A2}$$
and 
a, is waL Af HAST) 
ean nea 


In the above terms, $w_1$ and $w_2$ are weights associated with the two components, $k$ 
is a constant, $f(\tau) = 1 - e^{-k^2\tau^2}$ where $\tau$ is the age of the adjacent younger sibling, 
and $L_t$ is the last-child index, which is equal to 0 for children who at age $t$ are not 
followed by a younger sibling and is equal to 1 otherwise. The weights $w_1$ and $w_2$ 
represent metric contributions of the component sources of influence upon intel- 
lectual growth and need to be estimated from the data. The constant $k$ in $f(t)$ and 
$f(\tau)$ must also be estimated, and it reflects the rate at which intellectual effects 
cumulate. It will therefore vary with the population investigated, with the type 
of ability measured, or with the test employed for its measurement. 
In Equation A2 the intellectual environment within the home is represented as 
a root mean square of the intellectual levels of all family members. This formula- 
tion is used, as previously (Markus & Zajonc, 1977), to reflect the greater contribu- 
tion of the more mature members. Moreover, the denominator is n + 1 to allow 
the quality of the intellectual environment to rise with increasing number of 
family members. We would not want to consider an environment consisting of one 
adult to be of the same quality as one consisting of four adults. Thus, for constant 
values of M, the inequality nM/(n + 1) < (n + 1)M/(n + 2) is always satisfied. 
It can be readily seen that this formulation of the intellectual environment also 
reflects the fact that an addition of a new birth will reduce its quality, whereas 
the addition of an adult will increase that quality. 
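
The building blocks described in this Appendix can be sketched in a few lines of Python. This is an illustrative sketch only, not the estimation procedure used for the analyses reported here. It assumes that the sigmoid is f(t) = 1 - exp(-k^2 t^2), uses a hypothetical value of k, assigns each adult a level of 1.0 and a newborn a level of 0, and leaves out the weights and the last-child component.

```python
import math


def f(t, k):
    """Cumulative sigmoid of age, assumed here to be 1 - exp(-k^2 * t^2)."""
    return 1.0 - math.exp(-(k ** 2) * (t ** 2))


def delta_f(t, k):
    """Yearly increment of the sigmoid, f(t) - f(t - 1)."""
    return f(t, k) - f(t - 1, k)


def environment(levels):
    """Root mean square of family members' levels with an n + 1 denominator."""
    n = len(levels)
    return math.sqrt(sum(m * m for m in levels) / (n + 1))


K = 0.2  # hypothetical rate constant, chosen only for illustration

two_adults = [1.0, 1.0]                 # hypothetical adult levels
with_newborn = two_adults + [0.0]       # assumption: a newborn contributes zero
with_third_adult = two_adults + [1.0]

print(environment(with_newborn) < environment(two_adults))      # a birth lowers quality
print(environment(two_adults) < environment(with_third_adult))  # an adult raises it
print(delta_f(3, K) > delta_f(12, K))   # larger yearly increment at age 3 than at 12
```

With the n + 1 denominator, adding a member at the same level always raises the index, whereas adding a member at level zero lowers it, which is the property the Appendix describes.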
Motivation x Ability has been empir- 
and Ruble, 


equal-weight averaging rule was able to 
both group and single-subject analyses. Perhaps the integration rules underlying achievement judgments are culture-specific, 
and Indian college students average motivation and ability information in 
attribution of future scholastic performance. These results illustrate the po- 
tential power that information integration theory provides for the cross-cultural 


study of social perception and cognition. 


How do people integrate information about 
motivation and ability when they predict per- 
formance? Heider (1958) made the following 
suggestion: 


The personal constituents, namely, power (ability) 
and trying (effort) are related as a multiplicative 
combination, since the effective personal force (per- 
formance) is zero if either of them is zero. For in- 
stance, if a person has the ability but does not try at 
all he will make no progress toward the goal. (p. 83) 


If Heider’s proposal is correct, then attribu- 
tion of performance from motivation and abil- 
ity information would follow the multiplying 
rule: 


Performance = Motivation x Ability. (1) 


Anderson and Butzin (1974) tested this 
multiplying rule in two experiments. They 
presented information about the motivation 
and ability of target persons, applicants to 
graduate school and athletes trying out for 
college track, and asked subjects to predict 
how the target persons would perform. The 
factorial plot of the Motivation x Ability 
judgments was a diverging fan of straight 
lines. By the logic of functional measurement 
(Anderson, 1974a, 1976), this linear fan pat- 
tern implies the operation of a multiplying 
rule. 
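
The contrast between a diverging fan and parallel curves can be illustrated with a small Python sketch. The scale values below are hypothetical (the studies cited report no such values); the sketch only shows how the two rules differ in a factorial plot.

```python
def factorial_table(rule, motivation, ability):
    """Predicted judgment for every Motivation x Ability cell."""
    return [[rule(m, a) for a in ability] for m in motivation]


def curve_gaps(table):
    """Gap between the highest- and lowest-motivation curves at each ability level."""
    return [top - bottom for top, bottom in zip(table[-1], table[0])]


# Hypothetical subjective scale values, used only for illustration.
MOTIVATION = [1, 2, 3, 4]
ABILITY = [1, 2, 3, 4]

multiplying = factorial_table(lambda m, a: m * a, MOTIVATION, ABILITY)
adding = factorial_table(lambda m, a: m + a, MOTIVATION, ABILITY)

print(curve_gaps(multiplying))  # gaps widen with ability: a diverging fan
print(curve_gaps(adding))       # gaps stay constant: parallel curves
```

Under the multiplying rule the vertical distance between curves grows in proportion to ability, whereas under an adding rule it does not change.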

In a developmental study, Kun, Parsons, 
and Ruble (1974) also obtained evidence for 
a multiplying rule. Although the youngest 
children (5-6 years) integrated ability and 
trying information by an adding-type rule, 
the linear fan pattern was already present in 
second graders. The results of these two 
studies suggest that there is considerable gen- 
erality to the multiplying rule. 

However, an anomalous result appeared 
when Anderson and Butzin (1974) presented 
information about performance and ability 
and asked subjects to judge motivation. Equa- 
tion 1 implies a dividing rule, 


Motivation = Performance ÷ Ability. (2) 


However, the data from both experiments 
supported a subtracting rule, 


Motivation = Performance — Ability. (3) 


Anderson and Butzin concluded, therefore, 
that these judgments obeyed a simple cogni- 
tive algebra but that this cognitive algebra 
was not consistent mathematically. 

Perhaps there is an alternative interpreta- 
tion, one in which the subjects are mathemat- 
ically consistent. The linear fan pattern is not 
unique to the multiplying rule. This pattern 
can also be produced by a conjunctive averag- 
ing rule with differential weighting (Ander- 
son, 1971, p. 85). If lower values of motiva- 
tion and/or ability had greater weight, then 
the averaging model would produce an ap- 
proximate linear fan. This conjunctive averag- 
ing idea seems reasonable, for it reflects the 
idea that no amount of motivation can com- 
pensate for low ability, contrary to the im- 
plications of the multiplying model. This pos- 
sibility was noted by Anderson and Butzin 
(1974), who said, “Since averaging processes 
are pervasive in judgment, a conjunctive in- 
tegration rule deserves more consideration” 
(p. 602). However, this possibility has not 
been considered further. 
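
How a conjunctive averaging rule can mimic a fan can be seen in a short Python sketch. The weighting function (weight equal to the reciprocal of the scale value) and the scale values are hypothetical choices made only so that lower values carry more weight; they are not taken from any of the studies cited.

```python
def conjunctive_weight(scale_value):
    """Hypothetical weighting: lower scale values receive more weight."""
    return 1.0 / scale_value


def averaged_judgment(motivation, ability):
    """Differential-weight average of the two cues."""
    w_m = conjunctive_weight(motivation)
    w_a = conjunctive_weight(ability)
    return (w_m * motivation + w_a * ability) / (w_m + w_a)


LEVELS = [1, 2, 3, 4]  # hypothetical scale values for both factors

# Gap between the highest- and lowest-motivation curves at each ability level.
gaps = [averaged_judgment(LEVELS[-1], a) - averaged_judgment(LEVELS[0], a)
        for a in LEVELS]
print([round(g, 2) for g in gaps])
```

The gap between the curves widens as ability increases, so the factorial plot diverges even though no multiplication of the cues is involved. The curves are not exactly straight, which is why the resulting fan is only approximately linear.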

A test between the multiplying rule and 
the conjunctive averaging rule can be ob- 
tained by asking for judgments based only 
on ability information. The multiplying rule 
can only operate if the subject imputes some 
implicit level of motivation. With this implicit 
level of motivation, the ability-only curve will 
still form part of a linear fan. But if the 
averaging model holds, then the well-known 
crossover interaction will be observed (Ander- 
son, 1974a; Lampel & Anderson, 1968). No 
such test was used by Anderson and Butzin 
(1974) or by Kun et al. (1974). 
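
The logic of this test can be sketched in Python under simplifying assumptions: equal weights, no initial impression, hypothetical scale values, and a hypothetical implicit motivation level of 2.5 for the multiplying rule.

```python
ABILITY = [1, 2, 3, 4]  # hypothetical scale values


def averaging(ability, motivation=None):
    """Equal-weight average of whichever cues are actually given."""
    cues = [ability] if motivation is None else [motivation, ability]
    return sum(cues) / len(cues)


def multiplying(ability, motivation=None, implicit_motivation=2.5):
    """Product rule; a missing motivation cue is replaced by an implicit level."""
    m = implicit_motivation if motivation is None else motivation
    return m * ability


def crosses(rule, motivation):
    """True if the ability-only curve crosses the curve for this motivation level."""
    diffs = [rule(a) - rule(a, motivation) for a in ABILITY]
    return min(diffs) < 0 < max(diffs)


print(crosses(averaging, 2))                            # averaging: crossover
print(any(crosses(multiplying, m) for m in (1, 2, 3, 4)))  # multiplying: none
```

Under averaging the ability-only curve is steeper than the two-cue curves and therefore cuts across them; under multiplying it simply takes its place as one more ray of the fan.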


Experiment 1 


Experiment 1 had two purposes. The first 
was to replicate previous results on the multi- 
plying rule, Performance = Motivation x 
Ability. The second was to test the plausibil- 
ity of the alternative interpretation as a con- 
junctive averaging rule. 


Method 


Stimuli and design. Sixteen stimulus persons were 
prepared according to a 4 × 4 design comparable to 
that of Anderson and Butzin (1974). The two factors 
were interest in studies and IQ. The four levels of 
interest in studies were slight, less than most, more 
than most, and very high; the four levels of IQ 
were 100, 110, 120, and 130. In addition, 4 stimulus 
persons were described with respect to their IQ alone; 
their interest in studies was not specified. The infor- 
mation about each of these 20 stimulus persons was 
typed on a separate sheet of paper, and all 20 sheets 
were arranged randomly in a booklet. 
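
The composition of the booklet can be laid out in a few lines of Python. This is only a sketch of the design as described above; the dictionary layout and the fixed random seed are illustrative assumptions, not part of the original procedure.

```python
import itertools
import random

INTEREST = ["slight", "less than most", "more than most", "very high"]
IQ = [100, 110, 120, 130]

# 4 x 4 factorial: every interest level paired with every IQ level.
two_cue = [{"interest": i, "iq": q} for i, q in itertools.product(INTEREST, IQ)]

# Four additional stimulus persons described by IQ alone.
iq_only = [{"interest": None, "iq": q} for q in IQ]

booklet = two_cue + iq_only
random.Random(0).shuffle(booklet)  # hypothetical seed; any random order would do

print(len(two_cue), len(iq_only), len(booklet))
```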

Four practice examples were also constructed, using 
stimuli more extreme than the regular levels of the 
two factors. Levels of interest in studies were not at 
all interested and extremely interested; IQ levels were 
90 and 140. These practice examples were intended 
to serve as end anchors and to orient the subjects 
toward the use of the entire response scale (Ander- 
son, 1974b, Note 1). 

Subjects. Subjects were 32 male second-year en- 
gineering students enrolled in an introductory psy- 
chology course at the Indian Institute of Technology, 
Kanpur, India. Participation fulfilled a course require- 
ment. 

Procedure. Subjects, gathered in groups of three 
to five, received a typed sheet of instructions that 
described the nature of the experimental task and 
their role as subjects. The task was introduced as 
dealing with prediction of future academic perform- 
ance of first-year engineering students at the Indian 
Institute of Technology, Kanpur. It was emphasized 
that prediction about future performance would be 
based on interest in studies and intelligence or on 
intelligence alone. Subjects were instructed to base 
their prediction only on the information given about 
each stimulus student. 

After reading the instruction sheet twice, each sub- 
ject worked with the four practice examples. He read 
the information about each stimulus student and then 
indicated how that stimulus student would perform at 
the institute. Prediction of performance was made 
along a 21-point scale, with endpoints labeled —10 
(very bad performance) and +10 (very good per- 
formance). 

Immediately following the practice period, the main 
points of the instructions were summarized to the 
subjects by the experimenter. All queries about the 
task were answered. Finally, each subject received 
the experimental booklet and rated all 20 stimulus 
students twice. In each case, he wrote the code num- 
ber of the target person and his judgment of per- 
formance on the response sheet with the 21-point 
response scale typed at the top. The first replication 
was considered as additional practice, and only the 
data from the second replication were analyzed. 


Results 


Figure 1 plots mean judgments as a func- 
tion of interest in studies (curve parameter) 
and IQ. The IQ levels are spaced on the hori- 
zontal axis according to the marginal means 
of the factorial design. This spacing allows the 
linear fan pattern to appear. If the two pieces 
of information were integrated in accord with 
the multiplying rule, then the four solid 
curves would form a diverging fan of straight 
lines. 

It is clear that the pattern of the data is 
contrary to the multiplying model. There is 
not any evidence for divergence at all. In- 
stead, the four solid curves display parallelism. 
This parallelism indicates that subjects inte- 
grated the given information by an adding or 
averaging rule. 

The statistical analysis also supported the 
graphical interpretation. The linear fan pat- 
tern predicted by the multiplying model re- 
quires a statistically significant Interest in 
Studies × IQ interaction. However, this inter- 
action was statistically nonsignificant, F(9, 
279) = 1.44. This result argues against the 
multiplying model. On the other hand, it sup- 
ports the parallelism prediction of the adding 
and averaging models. 

A distinguishing test between the adding 
and averaging models may be obtained by 
considering the dashed curve in Figure 1. 
This curve represents judgments based on IQ 
information alone, with interest in studies not 
specified. The adding rule requires that the 
dashed curve be parallel to the solid curves; 
the averaging rule implies that the dashed 
curve should have a steeper slope than the solid 
curves (see Anderson, 1974b, Section 3). 
Visual inspection of Figure 1 thus argues 
against the adding rule and for the averaging 
rule. 

To obtain a statistical test that would dis- 
criminate between adding and averaging, the 
dashed curve data, which are based on IQ 
information alone, were considered as a fifth 
level of the interest-in-studies factor. The 
interaction term in this 5 × 4, Interest in 
Studies × IQ analysis of variance tests for 
nonparallelism in the set of all five curves in 
Figure 1. This interaction was highly signif- 
icant, F(12, 372) = 8.36, p < .01. Since the 
four solid curves were essentially parallel, as 
shown above, it appears that the dashed curve 
is reliably steeper than the solid curves. This 
test thus supports the averaging hypothesis 
and rejects the alternative adding hypothesis. 


Discussion 

The main result of Experiment 1 is that the 
information about motivation and ability ap- 
pears to be integrated by a simple, equal- 
weight averaging rule. There was no support 
for the linear fan pattern of the multiplying 
rule. This result is contrary to the prediction; 
it disagrees with results from American sub- 
jects. 

One hypothesis is that this difference in 
integration rules is a consequence of cultural 
differences in outlook on social motivation. 
Before this hypothesis can be taken seriously, 
however, it is necessary to consider method- 
ological differences between the present and 
previous experiments. 

Methodologically, this experiment was dif- 
ferent from those of Anderson and Butzin 
(1974) in two ways. First, they asked their 
subjects to describe in writing how they com- 
bined motivation and ability cues during the 
practice session. That was not done in the 
present work. This difference, however, does 
not seem to be very serious. Kun et al. (1974) 
obtained the linear fan pattern with children 
even without following the Anderson and 
Butzin procedure. Second, and perhaps more 
important, the motivation factor was defined 
here as interest in studies, and the subjects 
may not have interpreted that in terms of 
motivation and trying. This possibility was 
checked in the next experiment by using more 
explicitly motivational information. 


Experiment 2 


Experiment 2 was run chiefly as a reliability 
check on Experiment 1. There were, however, 
three notable changes from Experiment 1. 
First, information about laboriousness was 
used as a motivation cue to allow direct com- 
parison of findings with those obtained in the 
United States. Second, the levels of the two 
factors, laboriousness and IQ, were drawn 
from a wider scale range to increase the op- 
portunity for a linear fan pattern to appear. 
Finally, sufficient data were gathered from 
each subject to explore various integration 
rules on the individual level. 


Method 


Stimuli and design. A 4 × 4 (Laboriousness × IQ) 
factorial design was employed to construct 16 two- 
cue stimulus persons. The levels of laboriousness were 
not at all laborious, slightly laborious, fairly laborious, 
and very laborious; the levels of IQ were 90, 105, 
120, and 135. In addition, 8 single-cue stimulus per- 
sons were described by one level of either factor. 

The practice and filler stimulus persons were de- 
scribed by information more extreme than that used 
for the regular experimental stimulus persons. For 
example, laboriousness had not at all laborious and 
extremely laborious levels, and IQ had values of 85 
and 140. These extreme stimuli were used to orient 
the subjects toward the use of the entire response 
scale and also to guard against the operation of floor 
and ceiling effects as in Experiment 1. Each person 
description was typed on a separate index card. The 
eight practice examples included four two-cue and 
four single-cue descriptions. 

Subjects. Subjects were 12 male second-year engi- 
neering students fulfilling a course requirement in 
elementary psychology at the Indian Institute of 
Technology, Kanpur, India. Two subjects were run 
at a time in a session that lasted approximately 1.5 
hours. 

Procedure. The general procedure was the same as 
in Experiment 1 except that subjects read the entire 
set of descriptions before beginning their actual judg- 
ments. The complete set of 24 experimental and 2 
filler descriptions was presented three times in a differ- 
ent shuffled order for each subject. Data from all 
three replications were analyzed. 


Results 


Parallelism prediction. The upper left 
panel of Figure 2 plots mean performance as 
a function of laboriousness (curve parameter) 
and IQ (horizontal axis) of the stimulus per- 
sons. If the result obtained in Experiment 1 is 
reliable, then the four solid curves should plot 
as parallel lines. Inspection shows that these 
curves are essentially parallel. The Laboriousness × 
IQ interaction in the 4 × 4 design was, however, 
significant, F(9, 99) = 2.13, p < .05. This corresponds 
to a slight tendency for the four curves to converge 
to the right. There is no sign of the diverging linear 
fan pattern reported in the American studies. Despite 
the change in the nature of the motivation information, 
these results confirm those of the first experiment. 

Figure 2. Mean judgment of performance as a function of laboriousness and IQ, Experiment 2. 
(Levels of laboriousness are listed as curve parameters: NAL = not at all laborious; SL = slightly 
laborious; FL = fairly laborious; VL = very laborious; NS = laboriousness not specified. Classifica- 
tion of subjects into adding, constant-weight averaging [CWA], and differential-weight averaging 
[DWA] subgroups was based on results from single-subject analyses.) 

Averaging versus adding. The dashed 
curve in the upper left panel of Figure 2 repre- 
sents judgments based on IQ alone, with no 
information about laboriousness. This curve 
is nearly parallel to the solid curves. At face 
value, this seems to argue for the adding hy- 
pothesis and against the averaging hypothesis. 
The result raises some difficulty for both hy- 
potheses, however. 

The difficulty with the adding hypothesis is 
that the dashed curve has the same elevation 
as the solid curve for slightly laborious. Under 
the adding hypothesis, this equality in eleva- 
tion requires that information that the student 
is slightly laborious has the same effect as 
lack of information about laboriousness. If 
slightly laborious has a value of zero, then of 
course there is no problem. However, if 
slightly laborious has a value different from 
zero, then lack of information must have the 
same value. In this case, therefore, it would be 
necessary to assume that the subject infers 
an implicit level of laboriousness when no ex- 
plicit information is given and that he adds 
this value to the other information given. 

The difficulty with the averaging model is 
the failure of the dashed curve to cross over 
the solid curve. However, the averaging model 
can account for the near-parallelism if it is 
assumed that subjects infer an implicit value 
of laboriousness when no information is given 
and that they average that in with the other 
given information. If this inferred implicit 
value has the same weight as given laborious- 
ness information, then the corresponding 
dashed curve will be parallel to the solid 
curves. 
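
This argument can be checked with a brief Python sketch. The scale values are hypothetical, and the implicit laboriousness value is set equal to the value assumed for slightly laborious purely for illustration; nothing here is estimated from the data.

```python
IQ_LEVELS = [1, 2, 3, 4]  # hypothetical scale values for the four IQ levels


def two_cue(laboriousness, iq):
    """Equal-weight average of the two given cues."""
    return (laboriousness + iq) / 2


def iq_only(iq, implicit_laboriousness):
    """IQ-only judgment when an implicit laboriousness value is averaged in
    with the same weight as a given cue."""
    return (implicit_laboriousness + iq) / 2


def slope(curve):
    """Average change in judgment per step along the IQ axis."""
    return (curve[-1] - curve[0]) / (len(curve) - 1)


SLIGHTLY_LABORIOUS = 2          # hypothetical scale value
IMPLICIT = SLIGHTLY_LABORIOUS   # hypothetical: imputed value equals that level

solid = [two_cue(SLIGHTLY_LABORIOUS, q) for q in IQ_LEVELS]
dashed = [iq_only(q, IMPLICIT) for q in IQ_LEVELS]

print(slope(solid), slope(dashed))  # equal slopes: the curves are parallel
print(solid == dashed)              # same elevation when the values coincide
```

When the imputed value carries the same weight as a stated one, the IQ-only judgment is effectively a two-cue judgment, so neither a steeper slope nor a crossover is expected.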

The assumption that subjects infer an im- 
plicit value for missing information about 
laboriousness seems reasonable for the present 
task. Logically, a prediction of performance 
cannot be made on the basis of IQ information 
alone; some information pertaining to degree 
of laboriousness is necessary. This question 
cannot be decided on the basis of the above 
data, but some evidence is available from Ex- 
periment 3. 

Single-subject analyses. Separate analyses 
were made for each subject (Shanteau & 
Anderson, 1969, p. 315) to find out the vari- 
ous ways in which the experimental task was 
handled and also to explain the minor discrep- 
ancy from parallelism in the main analysis 
mentioned earlier. In these analyses, the sub- 
jects were broken down into three subgroups. 

Mean performance judgments for these 
three subgroups of subjects are shown in Fig- 
ure 2. The upper right panel displays results 
for eight subjects, all of whom showed non- 
significant interactions. Here parallelism is 
quite obvious. Perhaps these subjects followed 
an adding or an averaging rule. The lower left 
panel plots data from two subjects who ap- 
parently averaged, as shown by the crossover 
of the dashed and solid curves. The lower 
right panel shows two subjects who had a 
strong convergence tendency. However, the 
general shape of the curves is too irregular to 
be interpreted. 

The most important information from these 
individual analyses is that not a single subject 
showed the diverging fan pattern obtained in 
the American studies. That supports the sug- 
gestion of Experiment 1 that Indian subjects 
make achievement judgments in a rather dif- 
ferent way from their American counterparts. 


Discussion 


The parallelism pattern observed in Ex- 
periment 1 is replicated here. Although infor- 
mation about laboriousness, rather than inter- 
est in studies, served as the motivation factor, 
and although the laboriousness and IQ factors 
had levels from a wider range than in Experi- 
ment 1, the findings are basically identical. 
Most of the subjects obeyed the parallelism 
prediction. More importantly, not even one 
subject showed a diverging fan pattern pre- 
dicted by the multiplying-type strategy. These 
results support the hypothesis that the differ- 
ences between the American and Indian 
studies result from cultural differences. 

The findings of the first and second experi- 
ments converge to make a single point: In- 
formation about motivation and ability is 
added or averaged in making attributions of 
future scholastic performance. The evidence 
leans toward the averaging hypothesis, but the 
results of Experiment 2 require the supple- 
mentary assumption that the subject imputes 
an implicit value for motivation when no in- 
formation is given. This assumption is reason- 
able, since no performance is logically possible 
without some level of motivation. Neverthe- 
less, more direct evidence is required. 

Direct evidence can be obtained by using 
three stimulus cues. If the averaging model is 
correct, then the single-cue curves should be 
clearly steeper than the curves in the two-way 
factorial plots from the three-cue design. The 
reason for this is simple. When the single cue 
is given, any implicit inference about the ab- 
sent motivation information will yield an 
effective two-cue response. But the averaging 
model implies that the slope of a curve based 
on two cues will be greater than that based on 
three cues. Accordingly, the crossover test 
remains valid even if the subject does impute 
an implicit value to the missing dimension of 
information. The adding hypothesis, of course, 
still predicts parallelism in such a three-cue
design. 
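
The slope argument above can be illustrated with a small numerical sketch. It is not part of the original analysis: it assumes equal weights, no initial impression, and arbitrary scale values (LOW, HIGH, IMPUTED, and OTHER are hypothetical), and it only compares how steeply the response changes with one focal cue under the averaging and adding rules.

```python
def averaging_response(cue_values):
    """Equal-weight average of the scale values of all cues in play."""
    return sum(cue_values) / len(cue_values)


def adding_response(cue_values):
    """Simple sum of the scale values of all cues in play."""
    return sum(cue_values)


def slope(rule, low, high, other_cues):
    """Change in response per unit change in the focal cue,
    with the other cues (stated or imputed) held fixed."""
    return (rule([high] + other_cues) - rule([low] + other_cues)) / (high - low)


# Hypothetical scale values, chosen only for illustration.
LOW, HIGH = 1.0, 9.0   # lowest and highest level of the focal cue
IMPUTED = 5.0          # implicit value the subject supplies for a missing cue
OTHER = 5.0            # value of each explicitly stated companion cue

# Single stated cue plus one imputed cue: effectively a two-cue response.
avg_single_cue = slope(averaging_response, LOW, HIGH, [IMPUTED])
# Focal cue within the three-cue design: two stated companion cues.
avg_three_cue = slope(averaging_response, LOW, HIGH, [OTHER, OTHER])

add_single_cue = slope(adding_response, LOW, HIGH, [IMPUTED])
add_three_cue = slope(adding_response, LOW, HIGH, [OTHER, OTHER])

print("averaging:", avg_single_cue, avg_three_cue)
print("adding:   ", add_single_cue, add_three_cue)
```

Under these assumptions the averaging slope is larger for the single stated cue (even with an imputed companion value) than for the same cue inside the three-cue design, so a crossover is still expected, whereas the adding slope is the same in both cases and the curves stay parallel.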


Experiment 3 


Experiment 3 had two goals. The first was
to extend the findings of Experiment 2 by 
including a third informational cue about past 
performance. The second was to resolve the 
ambiguity in the test of the averaging model 
in Experiment 2. 

Method


Stimuli and design. There were two main designs. 
The first design was a 3 X 3 X 3 (Past Performance X 
Laboriousness X IQ) factorial. The three levels of 
past performance were Cumulative Performance In- 
dex (CPI) of 5, 7, and 9 during the first year of 
study at the Indian Institute of Technology, Kanpur. 
This information was from a 10-point scale. The 
three levels of laboriousness were not at all laborious, 
fairly laborious, and very laborious; the three levels 
of IQ were 90, 112, and 135. Orthogonal combina- 
tions of these three factors generated 27 three-cue 
stimulus persons. 

The second design was a 3 X 3 (Laboriousness X 
IQ) factorial. The levels of laboriousness and IQ fac- 
tors were the same as in the three-cue design. Pairing 
of the levels of these two factors produced nine two-cue
stimulus persons.

There were also 9 single-cue stimulus persons, based 
on one of the three levels of each factor. In addition, 
4 three-cue filler persons were constructed to serve 
as end anchors. All these 49 stimulus persons, includ- 
ing the 4 filler persons, were judged by each subject. 
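
The stimulus count can be checked with a short sketch that enumerates the designs described above. The factor levels are taken from the text; the variable names are illustrative, and the cue values of the four filler persons are not given in the text, so they are represented only by a count.

```python
from itertools import product

# Factor levels as stated in the Method section.
past_performance = [5, 7, 9]  # CPI on a 10-point scale
laboriousness = ["not at all laborious", "fairly laborious", "very laborious"]
iq = [90, 112, 135]

# Main design: 3 x 3 x 3 factorial of three-cue stimulus persons.
three_cue = list(product(past_performance, laboriousness, iq))

# Second design: 3 x 3 factorial of two-cue stimulus persons.
two_cue = list(product(laboriousness, iq))

# Single-cue persons: one level of one factor at a time (3 factors x 3 levels).
single_cue = (
    [("past_performance", level) for level in past_performance]
    + [("laboriousness", level) for level in laboriousness]
    + [("iq", level) for level in iq]
)

N_FILLERS = 4  # three-cue end anchors; their cue values are not specified

total = len(three_cue) + len(two_cue) + len(single_cue) + N_FILLERS
print(len(three_cue), len(two_cue), len(single_cue), total)
```

The four counts add up to the 49 stimulus persons judged by each subject.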

Procedure and subjects. The general procedure 
was the same as in Experiment 2. Each subject rated 
49 stimulus persons three times. As in Experiment 2,
a description of each stimulus person was typed on 
a separate index card. 

Before rating the experimental stimuli, each subject
worked with 10 practice examples prepared from extreme
levels of the informational factors. Of the 10
examples, 4 had three cues, two had two cues, and 
the remaining four had just one cue. 

There were 12 subjects from the same population 
as in Experiment 2. Two subjects were run at a time.
Each subject spent approximately 2.5 hours on the
task.


Results and Discussion 


Three-cue design. The main purpose of 
Experiment 3 was to study the integration 
rule for the three cues. On the basis of their 
results, Anderson and Butzin (1974) sug- 
gested the compound averaging-multiplying 
rule: 


Predicted Performance = Past Performance 
+ Motivation x Ability. (4) 


However, the results of the present Experi- 
ments 1 and 2 suggest the three-term averag- 
ing rule: 


Predicted Performance = Past Performance 
+ Laboriousness + Ability. (5) 
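
The contrast between Equations 4 and 5 can be made concrete with a short sketch. It is illustrative only: the scale values are hypothetical, Equation 4 is read literally as written above, and Equation 5 is read as an equal-weight average of its three terms, which is an assumption about weighting that the text does not spell out.

```python
# Hypothetical subjective scale values, used only for illustration.
MOTIVATION = [1.0, 2.0, 3.0]
ABILITY = [1.0, 2.0, 3.0]
PAST = 2.0  # past performance held constant


def compound_rule(past, motivation, ability):
    """Equation 4 read literally: past performance plus motivation times ability."""
    return past + motivation * ability


def three_term_rule(past, motivation, ability):
    """Equation 5 read as an equal-weight average (an assumption)."""
    return (past + motivation + ability) / 3.0


def motivation_gain_by_ability(rule):
    """Gain from the lowest to the highest motivation level,
    computed separately at each ability level."""
    return [
        rule(PAST, MOTIVATION[-1], a) - rule(PAST, MOTIVATION[0], a)
        for a in ABILITY
    ]


fan = motivation_gain_by_ability(compound_rule)
parallel = motivation_gain_by_ability(three_term_rule)

print("Equation 4 gains:", fan)
print("Equation 5 gains:", [round(g, 3) for g in parallel])
```

With these values the gain from motivation grows with ability under Equation 4 (a diverging fan), whereas it is the same at every ability level under Equation 5 (parallel curves).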

Figure 3. Two-way factorial plots of Past Performance X Laboriousness, Past Performance X Ability,
and Laboriousness X Ability effects from the main three-cue, Past Performance X Laboriousness
X Ability design. (The dashed curve of each panel is based on the single cue listed on the horizontal
axis. Digits 5, 7, and 9 represent levels of past performance; NAL, FL, and VL represent not at
all laborious, fairly laborious, and very laborious, respectively. NS means that the row information
was not specified.)

The results of Figure 3 support the three-term
averaging rule. Each panel shows one of
the two-cue factorial graphs from the main
design. In each panel, the solid curves are approximately
parallel. The right panel, which
corresponds to the Laboriousness X Ability
interaction, is marginally significant, F(4, 44)
= 2.46. However, the three solid curves show
a slight convergence, as in Experiments 1 and
2, not the divergence predicted by the multiplying
rule.

A test between the adding and averaging 
rules is obtained by comparing the dashed 
curve with the solid curves in each panel of 
Figure 3. The dashed curve represents judg- 
ments based on just the single cue listed on 
the horizontal axis. In each panel, the dashed 
curve has much steeper slope and crosses over 
the lowest solid curve convincingly. This 
crossover interaction is evidence against the 
adding rule and supports the averaging rule. 

Two-cue design. The two-cue design was
included to provide a further check on the
two-cue results of Experiments 1 and 2. These
data are shown in Figure 4. The solid curves
are approximately parallel, but with a visible
and barely significant tendency to converge
to the right, F(4, 44) = 2.85, p < .05. The
dashed curve is again nearly parallel to but
slightly steeper than the solid curves. These
results are essentially the same as those in
Experiment 2.

Figure 4. Mean judgment of performance as a
function of laboriousness and IQ, Experiment 3, two-cue
design. (NAL, FL, and VL represent not at all
laborious, fairly laborious, and very laborious,
respectively. NS means that the row information was
not specified.)

Table 1
F Ratios for Two-Way and Three-Way
Interactions in Single-Subject Analyses
(Experiment 3, Three-Cue Design)

              AB            AC            BC            ABC
Subject    F(4, 52)      F(4, 52)      F(4, 52)      F(8, 52)
   1        2.69*          .58           .94           .84
   2        2.23          3.33*         1.28          1.79
   3        1.64           .74          1.77           .98
   4         .84          1.19          3.73*         1.41
   5        1.72          1.14          1.22           .87
   6         .78          1.15           .77           .53
   7         .39           .59           .95           .93
   8     [illegible]      1.09          1.18          3.50*
   9         .42           .16           .91       [illegible]
  10     [illegible]       .23          1.29           .91
  11        1.36       [illegible]      1.73           .27
  12        1.65           .76          1.32           .68

Note. A = Past performance. B = Laboriousness.

These two-cue data support the assumption 
of implicit inference about absent motivation 
information discussed in Experiment 2. The 
dashed curve in the center and right panels of 
Figure 3 is the same as the dashed curve of 
Figure 4. The clear crossover of the dashed 
curve in Figure 3 supports the averaging rule. 
Hence, the failure to obtain a clear crossover 
in Figure 4 would seem to reflect the presence 
of an inferred, implicit value for the missing 
motivation cue. 

Single-subject analyses. As Shanteau and 
Anderson (1969) emphasized, individual anal- 
yses are important to check that the group 
averages are not hiding alternative integra- 
tion strategies by different subjects. Table 1 
presents the F ratios for the four interactions 
for each subject in the main three-cue design. 

Table 1 shows that only five scattered inter- 
actions are statistically significant. Inspection 
of individual data failed to disclose any mean- 
ingful pattern to these deviations from paral- 
lelism. Instead, these deviations seemed to 
reflect individual idiosyncrasies such as may
be seen in the lower right panel of Figure 2. 
Almost all individuals showed the crossover 
interaction for the single-cue curves. Thus,
the present results support the generality of 
the averaging rule at the level of the indi- 
vidual. 

In single-subject analyses of the two-cue,
Laboriousness X IQ data, 11 of the 12 subjects
showed parallelism. Only 1 subject had
a converging type of nonparallelism. It can
thus be said that the present two-cue results are
indeed similar to those of Experiments 1
and 2.


General Discussion 


The results of the present set of three ex- 
periments do not agree with the multiplying 
model suggested by Heider (1958) and tested 
by Anderson and Butzin (1974) and by Kun 
et al. (1974). In the present experiments, 
attribution of performance from motivation 
and ability cues appears to have been governed
by a process quite different from
multiplying. Strong evidence for the parallelism
pattern indicates that an equal-weight averaging
rule could very well account for the data.
It seems reasonable to suggest, therefore, that 
attribution of scholastic performance obeys 
an averaging rule in India. 

Collateral work by Singh (Note 2) helps 
solidify this cultural-difference interpretation 
of the present results. Singh presented infor- 
mation about generosity and income of vari- 
ous persons and asked subjects to predict how 
much these persons would contribute to a 
family whose house had burned down. This 
experiment parallels that of Graesser and 
Anderson (1974) on Gift Size = Generosity 
X Income, and quite similar results were ob- 
tained. Judgments of gift size showed a di- 
verging fan pattern, as if a multiplying model 
was operative, just as in the American studies. 
This shows that the present Indian subjects 
are able to use the same integration rule as 
the Americans. Accordingly, their present use 
of an equal-weight averaging rule would not 
seem to be attributable to the subjects’ in- 
appropriate understanding of the task, their 
attempt to simplify it (Anderson & Butzin,
1974), or their inability to use the same integration
rule as the Americans. It can thus
be said that the present discrepancy from the
American results reflects a difference in cultural
outlook.

Does the difference between the equal-weight
averaging rule and the multiplying rule
point to any important difference in cultural
outlook between India and America? For
attributions of achievement, given information
about motivation and ability, the following
argument suggests at least one important social
difference. According to the equal-weight
averaging model, effort or trying will be
equally effective with persons of low or high
ability. Within a cultural system of attributions
that obeys the averaging model, therefore,
it will appear that a person of lower ability
can gain as much by trying as a person of
higher ability.

The multiplying rule, in contrast, implies
that effort or trying will be more effective
with persons of higher ability. Within a cultural
system of attributions that obeys the
multiplying rule, therefore, it will appear that
persons of lower ability have less to gain by
trying.

This multiplying rule for attributions may
be closer to the actual behavior processes,
since just such a rule is found in most current
behavioral theories of motivation, such as
those of Hull and Tolman (see Anderson,
1974a, p. 29). Given the achievement orientation
of American culture, it is no surprise to
see a multiplying rule emerge in the attributional
judgments of American subjects, especially
college students.

From a social-cultural view, however, the
present equal-weight averaging model for attribution
of performance appears to portray
a more egalitarian outlook. Perhaps college
students in India believe that each person,
regardless of native ability, has equal opportunity
to improve his or her lot. This egalitarian
attitude is a healthy sign for progress
in India, for Indian people have been described
as high in dependency (Murphy,
1953; Winter, 1969).

Both the present data on the equal-weight
averaging rule and the previous American
results on the multiplying rule, however, rest
on a very narrow cultural sample, namely,
college students. If the cognitive algebra of
achievement attribution is to be useful in
cross-cultural research, then it is necessary
to study many other cultural strata, both in
India and in the United States.

To the authors, information integration 
theory is impressive because it provides a use- 
ful framework for cross-cultural studies of 
social perception and cognition. An important 
characteristic of the integration rules, it 
should be emphasized, is that they are con- 
cerned with patterns of responses, not with 
the numerical value of single responses. This 
characteristic is vital for cross-cultural com- 
parisons because it bypasses uncertain as- 
sumptions that specific stimuli have com- 
parable values across cultures. Indians and 
Americans undoubtedly have different value 
systems, but they can still be compared in 
terms of the pattern of their responses to a 
factorial set of stimuli. Indeed, the integra- 
tion rule can provide a base and frame for 
measurement of meaning—value even for the 
individual within his or her culture (Anderson, 
1976, 1977). The present results illustrate the 
potential power of this approach. 

Because of its direct concern with the prob- 
lems of multiple causation, integration theory 
is well suited for cross-cultural comparison. 
Of special interest and value is that the inte- 
gration-theoretical approach can help shift 
the direction of cross-cultural research from a 
purely correlational approach (Whiting, 1968) 
to a causative, experimental base (Singh, 
Bohra, & Dalal, in press; Singh, Sidana, & 
Saluja, 1978a, 1978b; Singh, Sidana, & 
Srivastava, 1978).


The Effects of Reward Contingency and
Performance Feedback on Intrinsic Motivation 


Judith M. Harackiewicz 
Harvard University 


In this study the effects of reward contingency and positive performance feed- 
back on subsequent intrinsic motivation for an enjoyable task were examined. 
It was hypothesized that rewards contingent only upon participation in the task 
would produce decrements in motivation (the overjustification effect) but that 
rewards contingent on performance quality would produce even greater decre- 
ments in intrinsic motivation relative to control conditions of no reward. It 
was also hypothesized that positive performance feedback would enhance in- 
trinsic motivation and that this effect would be independent of reward effects. 
High school students were offered performance or task-contingent rewards, or 
no reward, for doing hidden-figures puzzles. Subjects offered performance- 
contingent rewards all received positive feedback concerning performance and 
half the subjects in task-contingent and no-reward conditions received the same 
positive feedback. Performance-contingent rewards were found to undermine 
intrinsic motivation more than task-contingent ones, which produced decre- 
ments relative to control conditions of no reward, supporting Deci’s control 
model. Positive feedback enhanced intrinsic motivation and this effect was 
independent of reward effects. A recall measure indicated that subjects receiv- 
ing performance-contingent rewards remembered fewer performance-irrelevant 
details about the task, suggesting that rewards may affect the process of task 
involvement as well as its motivational outcomes. 


In a review of the literature concerned with 
the effects of extrinsic rewards on intrinsic 
motivation, Condry (1977) concludes that in 
certain contexts, subsequent interest in a task 
may be reduced by the imposition of task- 
extrinsic rewards. One aspect of context that 
apparently mediates this undermining effect is 
reward contingency. Rewards have been
promised to subjects contingent only upon
participation or for completion of the task
(task contingent),¹ or contingent upon some
level of performance of the task (performance
contingent).

¹ These rewards have frequently been referred to as
noncontingent rewards. Although not explicitly contingent
on some level of performance, they are
usually at least implicitly contingent on cooperation
with the experimenter or participation in the task
and are perhaps more appropriately considered task
contingent.

The undermining effects of task-contingent
rewards have been well documented (Anderson,
Manoogian, & Reznick, 1976; Lepper,
Greene, & Nisbett, 1973). Rewards promised
to subjects for engaging in an activity and implying
no performance demands have consistently
produced what Lepper and Greene
(1976) termed the “overjustification effect,”
or a decrease in subsequent intrinsic motivation.
Their most recent theoretical statement
specifies the context in which this effect should
occur:

The overjustification studies provide a demonstration 
of the negative effects on intrinsic motivation of mak- 
ing salient to a person the instrumentality of his be- 
havior when rewards are presented which do not 
result in the acquisition of new skills or convey 
salient information to him concerning his ability at a 
task. (Lepper & Greene, 1976, p. 33) 


Because the promise of a performance-con- 
tingent reward implies that information con- 
cerning task ability will be provided, specific 
predictions for the effects of performance- 
contingent rewards cannot be directly derived 
from the overjustification hypothesis. Other 
investigators, however, have attempted to ex- 
amine the effects of such rewards and to relate 
them to overjustification effects. Two predic- 
tions for performance-contingent rewards have 
been advanced. Karniol and Ross (1977) sug- 
gested that performance-contingent rewards 
may maintain or increase intrinsic motivation 
to the extent that they convey information 
about effective performance at a task, because
attainment of such rewards should provide
tangible evidence of personal effectiveness.
Deci (1975), on the other hand, predicted 
that a performance-contingent reward should 
decrease intrinsic motivation even more than 
a task-contingent one because a reward is per- 
ceived to be more controlling when it is con- 
tingent on some level of performance. 

Karniol and Ross and Deci agreed that rewards
have controlling and informational aspects,
and both suggested that the more
salient of the two will initiate either changes 
in perceptions of instrumentality of behavior 
(when controlling aspects of the reward are 
salient) or changes in feelings of competence 
and self-determination (when informational 
aspects are salient). Their predictions for the 
effects of performance-contingent rewards dif- 
fered because of differences in the ascribed 
salience of controlling versus informational 
aspects of these rewards. 


Previous Research 


Deci (1971, 1972a) consistently found that
subjects receiving performance-contingent rewards
demonstrate less subsequent intrinsic
motivation than unrewarded controls. Pritchard,
Campbell, and Campbell (1977) and
Pittman, Cooper, and Smith (1977) found 
that performance-contingent rewards under- 
mined intrinsic motivation relative to control 
conditions of no reward. Pinder (1976) found 
that performance-contingent rewards pro- 
duced greater decrements in intrinsic motiva- 
tion than did task-contingent ones. Although 
these results all seem to disconfirm the hy- 
pothesis of Karniol and Ross, the studies do 
not represent adequate tests of it, because per- 
formance-contingent rewards are predicted to 
maintain interest only when subjects perceive 
that they have done well at the task. It is not
clear that this criterion was met in these 
studies. 

Two studies have compared task-contingent 
and performance-contingent rewards with a 
control condition of no reward in the same de- 
sign. Greene and Lepper (1974) found that 
performance- and task-contingent rewards
produced equal decrements in intrinsic mo- 
tivation relative to control conditions. Karniol 
and Ross (1977) found that when children 
received positive feedback concerning their 
performance, task-contingent rewards de- 
creased subsequent interest relative to per- 
formance-contingent rewards and no rewards. 
Performance-contingent rewards did maintain 
interest, but they did not increase it above 
the control level as had been predicted. 
Karniol and Ross suggested that because the 
performance standard was established in the 
reward manipulation at the outset, children 
knew that they had performed well as soon as 
they finished the task. The information in- 
herent in the receipt of the reward was thus 
redundant with that provided in the instruc- 
tions. The researchers predicted that when the 
receipt of a reward provides additional rather 
than redundant information concerning com- 
petence, it should increase intrinsic motivation 
above the level of control subjects. 


Methodological Issues 


One aspect of reward contingency that has 
not been directly examined is the possible 
confounding of material reward with positive 
feedback about performance quality. The
promise of a performance-contingent reward 
implies that a subject’s performance will be 
evaluated and that feedback will be available. 
Deci (1972b) and Anderson et al. (1976) 
demonstrated that positive feedback, in the 
absence of any reward manipulation, increases 
subsequent intrinsic motivation. Although it is 
difficult to separate reward manipulation from 
feedback in the case of performance-contin- 
gent rewards, it is possible to examine the 
effects of reward and feedback independently 
in task-contingent and control conditions be- 
cause feedback is not implied by a task-con- 
tingent reward manipulation. 

Several measures of intrinsic motivation 
have been used in the overjustification litera- 
ture, ranging from behavioral measures (task 
persistence in the absence of rewards) to vol- 
unteering measures (willingness to return for 
another session) to self-report measures of 
task interest and enjoyment. Conflicting re- 
sults of previous studies may be due to differ- 
ent operational definitions of intrinsic motiva- 
tion. 

Condry (1977) draws a distinction between 
dependent measures that focus on the prod- 
ucts of the rewarded session, for example, sub- 
sequent interest, and measures that focus on 
the process of task involvement, and he sug- 
gests that the promise of extrinsic rewards 
may affect the manner in which subjects en- 
gage in a task, as well as subsequent motiva- 
tion. Products and processes may be differen- 
tially affected by rewards, however, and 
should be considered separately. 


Objectives 


The primary aim of the present study was 
to examine the effects of performance- and 
task-contingent rewards on intrinsic motiva- 
tion, compared to conditions of no reward. 
To accomplish this, two kinds of performance- 
contingent rewards were created. A perform- 
ance-contingent reward with performance 
norms supplied at the outset of the task 
would convey no additional information con- 
cerning competence beyond that inherent in 
the norms. A performance-contingent reward 
with no prior norms established would provide
nonredundant information concerning competence.
These two performance-contingent
rewards were compared with task-contingent 
rewards and control conditions of no reward.
In addition, the effects of positive perform- 
ance feedback, relative to no feedback, were 
examined in conditions of task-contingent and 
no reward in order that results might be com- 
parable to both those of Karniol and Ross and 
those of other studies that have demonstrated 
the undermining effects of task-contingent 
rewards in the absence of positive perform- 
ance feedback. 

To facilitate comparison with other studies, 
several behavioral, volunteering, and self- 
report product measures of intrinsic motiva- 
tion were utilized. Interest was measured at 
several points in time in order to examine the 
endurance of reward effects. The task in this 
study was constructed in such a way that 
attention to aspects of it not relevant to per- 
formance could be measured with an inciden- 
tal recall measure, intended to assess the 
process of intrinsic motivation. 

It was predicted that positive feedback 
would enhance intrinsic motivation, that task- 
contingent rewards would undermine it, and 
that these two effects would be independent. 
The effects of performance-contingent rewards 
were of particular interest because of conflicting
predictions: an enhancement of motivation
mediated by perceived self-efficacy
(Karniol & Ross, 1977) or a detrimental effect 
because of the perceived controlling nature of 
the reward (Deci, 1975). Controlling the 
informational value of the reward permits 
closer analysis of performance-contingent rewards.
Karniol and Ross predict that the more
informational performance-contingent reward
with no prior norms established should produce
the most enhancement of motivation.
Although Deci does not make specific predictions
regarding informational value, a speculative
control prediction was generated. Because
the two rewards are both contingent on performance
quality, equal decrements might be
predicted, but this negative effect could be
differentially mitigated by the two kinds of
positive feedback. When norms are supplied
at the outset, the feedback is received earlier
in the motivational process and might have a
greater positive effect than would feedback
received after completion of the task. According
to this analysis, the control hypothesis
would predict a larger decrement in interest
in the performance-contingent condition
where norms are not supplied. It was hypothesized
that the control prediction would be
supported in this study because of the greater
similarity of the subject population (high 
school students) to Deci’s college students 
than to the children used as subjects by 
Karniol and Ross. 


Method 


Overview 


In this study six groups of subjects received various
combinations of reward and positive feedback
with respect to a challenging but enjoyable task.
Interest in this task was assessed both before and
after the experimental manipulations. The six experimental
conditions are summarized in Table 1. The
no reward-no feedback (NRNF) and task-contingent
reward-no feedback (TCRNF) cells replicate the conditions
employed in overjustification studies, whereas
the no reward-positive feedback (NRPF), task-contingent
reward-positive feedback (TCRPF), and performance-contingent
reward (norms supplied)-positive
feedback (PCRNSPF) cells are conceptual replications
of the central features of the Karniol and Ross
(1977) design. Subjects in the PCRNSPF condition
were offered a reward contingent on performing
above average and were informed of the performance
standard before beginning the task, so that receipt
of the reward only confirmed that they had performed
above average. In the performance-contingent
reward (no norms)-positive feedback (PCRNNPF)
condition, however, the reward itself was the only
source of information available to the subject concerning
level of performance. Thus the reward received
in the PCRNNPF condition was more informational
than that received in the PCRNSPF condition,
although subjects in both conditions received the same
feedback concerning performance. The standard of
performance was set at a relatively low level so that
all subjects would be eligible for positive feedback.


Subjects 


The participants in this study were 64 male and 29 
female students in a suburban Massachusetts high 
school, with a mean age of 16.4 years. The subjects 
were tested during their English class. 


Procedure 


The study was conducted in three phases. The first 
and third sessions were group administered in the 
classroom, whereas subjects were tested individually 
in a separate room for the second session. This ex- 
perimental session began 1 month after the first 
session and continued for 2 weeks, and the third ses- 
sion took place 1 month after the end of the experi- 
mental sessions. 

The experimental activity in this study was a series
of hidden-figure puzzles: cartoon-style drawings by
Al Hirschfeld, in which the name Nina is hidden several
times. These drawings appear in the Arts and
Leisure section of the Sunday New York Times and
are usually caricatures of actors and scenes from current
films and plays. Each drawing has a title and a
caption describing the scene and naming the actors
depicted. The drawings used in this study contained
three to seven hidden Ninas. An example of a Nina
puzzle (in which there are seven hidden Ninas) is
shown in Figure 1.

Figure 1. A Nina puzzle (Hirschfeld’s drawing for the play Anna Christie, © courtesy of the
Margo Feiden Galleries, New York, New York).

Table 1
Experimental Conditions and Contrast Weights for Main Hypothesis

Condition                             Reward                        Feedback    Contrast weight
No reward-no feedback (NRNF)          None mentioned                None         1
No reward-positive feedback (NRPF)    None mentioned                Positive*    3
Task-contingent reward-no             Offered for doing puzzles     None        -1
  feedback (TCRNF)
Task-contingent reward-positive       Offered for doing puzzles     Positive*    1
  feedback (TCRPF)
Performance-contingent reward         Offered for meeting           Positive*   [illegible]
  (norms supplied)-positive           performance criterion,
  feedback (PCRNSPF)                  norms supplied before task
Performance-contingent reward         Offered for meeting           Positive*   [illegible]
  (no norms)-positive feedback        performance criterion,
  (PCRNNPF)                           no norms supplied

* Positive feedback was “We've found that the average student usually finds 4 Ninas, so you did
better than the average high school student on these puzzles.”

Pretest. The students were given booklets containing
three kinds of puzzles. The word puzzles consisted
of several anagrams; the face recognition
puzzle was the booklet form of the Profile of
Nonverbal Sensitivity (PONS; Rosenthal, Hall,
DiMatteo, Rogers, & Archer, 1979), which contains
pictures of a face or body and two situation
labels from which the subject chooses the more
appropriate description of the picture; there were also
three Nina puzzles. The order in which the students did
the three puzzle sets was [illegible]. After doing
each puzzle set, students completed a puzzle
questionnaire that consisted of 13 7-point bipolar
rating scales (e.g., good-bad, easy-hard, boring-exciting)
and eight questions concerning the subject's
involvement with the puzzle (e.g., “How much did
you enjoy the puzzle?”), also answered on 7-point
scales. After completing all three types of puzzles, the
students rank ordered them according to enjoyment.
Experimental session. The experimenter informed 
each subject that he or she would be doing more Nina 
‘puzzles and that the instructions were tape recorded. 
“After turning on the tape recorder she left the room 
‘in order to remain blind to the subject’s condition, 
which was randomly determined. All subjects were 
instructed to find as many Ninas as possible within 
the 2 minutes of inspection time allowed per drawing. 
Subjects in the two task-contingent reward conditions 
(tcRNF and TCRPF) were promised a reward for doing 
the puzzles, whereas subjects in the two performance- 
contingent reward conditions (PCRNSPF and PCRNNPF) 
were promised a reward for exceeding the perform- 
ance of the average high school student. Those in the 
performance-contingent reward (norms supplied) — 
positive feedback group were then informed of the 
exact performance standard (detecting four or more 
Ninas in the three puzzles combined). All subjects 
who had been promised a reward (TCRNF, TCRPF, 
PCRNSPF, and PCRNNPF) were told that it consisted of 
ave felt-tip pens and a notebook, in their choice of 
® colors, and were reminded of the conditions of the 
reward. Finally, the subjects completed an identification
form and answered a series of questions concerning
their understanding of the directions, the reward
contingencies, their current mood state, expected
puzzle difficulty and performance level, and attitude
toward the reward.
The experimenter returned to the room and ad- 
ministered the three Nina puzzles. Subjects in the 
NRPF, TCRPF, PCRNSPF, and PCRNNPF groups then
received positive performance feedback (“We've found
that the average high school student usually finds 
four Ninas, so you did better than the average stu- 
dent on these puzzles”), and those who were prom- 
ised a reward (TCRNF, TCRPF, PCRNSPF, and PCRNNPF) 
indicated their color choices on an order form that
reminded them of the conditions under which the
reward had been promised. The experimenter was
blind to this latter information. While the experimenter
filed away the puzzles and order forms, the
subject was permitted to read a current popular 
magazine, work on another Nina puzzle, or simply sit 
quietly. The experimenter unobtrusively observed the 
subject’s behavior and recorded the amount of time 
spent on the puzzle, if any.
After 2 minutes the subject completed several questionnaires,
including a Nina puzzle questionnaire identical
to the one administered at the pretest. A volunteering
questionnaire stated that although there were
no definite plans for future sessions, the experimenter
wanted a general idea of student willingness to participate.
It was composed of four 7-point rating scales asking
the subject how willing he or she would be to volunteer
for a session of Nina puzzles if
it were scheduled during their English class,
a study hall, after school, or on a weekend. The
extra puzzles request form informed subjects that 
copies of different Nina puzzles would be available 
after the conclusion of the research and permitted 
students to request up to four additional puzzles. The 
recall questionnaire was composed of a list of actors, 
in which were embedded the names of those depicted 
in the Nina puzzles from the experimental session, 
and a similar list of play and film titles; students 
were asked to circle any names that they remembered 
seeing in the puzzles. 

If the subject had not previously been offered a 
reward, he or she was informed at this time that 
there was a gift for having participated in psycholog- 
ical research. These subjects then indicated their 
color choice for the pens and notebook on an order 
form that reminded the subject of the rationale for
the gift.
Finally, all subjects were assured that the experiment
would be explained to them in detail when the
rewards/gifts were delivered at its conclusion. They
were also cautioned not to discuss the details of the 
experiment with other students. 

Posttest. One month after all students had par- 
ticipated in the experimental session, their teacher 
distributed a letter from the experimenter saying that 
she would be visiting their classroom soon with de- 
liveries and to answer questions about the research 
project. At this time they completed yet a third Nina 
puzzle questionnaire. 

The experimenter visited the classroom a few days 
after the students received the letter, distributed the 
rewards/gifts (and extra puzzles if requested), and 
debriefed the students. 


Dependent Measures 


In addition to various manipulation checks con- 
cerned with the informational effects of positive feed- 
back and postmanipulation motivation, several de- 
pendent variables were of interest in this study. 
Performance was indexed by the number of Ninas 
actually found on the three puzzles administered 
during the experimental session; all subjects found 
at least 4 Ninas, and no subject found all 15 con- 
tained in the drawings. Initial levels of intrinsic 
motivation were indexed by a pretest enjoyment 
scale derived from the pretest Nina puzzle question- 
naire in the manner described below. 

There were five outcome measures of intrinsic
motivation taken after the experimental manipulation:
(a) experimental enjoyment, a scale identical
to the pretest enjoyment scale, but administered after
the experimental task; (b) time, the amount of time
that the subject spent looking at the extra Nina
puzzle during the free-choice period; (c) volunteering,
the score on the volunteering questionnaire; (d) extra
puzzles, the number of puzzles requested by the subject
on the extra puzzles request form; and (e) posttest
enjoyment, a scale identical to the pretest and
experimental enjoyment scales, but administered during
the posttest.² A further measure was a process
variable intended to index the level of task orientation
during the experimental session: (f) incidental
recall, the number of names and titles correctly
recognized on the recall questionnaire, corrected for
overcircling by subtracting one half the number of
incorrect answers.
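
The overcircling correction described above is a simple penalty score. The following is a minimal sketch of that arithmetic; the function name `corrected_recall` and its argument names are hypothetical labels introduced here for illustration and do not come from the original study.

```python
def corrected_recall(n_correct: int, n_incorrect: int) -> float:
    """Incidental recall score: correct recognitions minus one half
    the number of incorrectly circled names or titles."""
    if n_correct < 0 or n_incorrect < 0:
        raise ValueError("counts must be non-negative")
    return n_correct - 0.5 * n_incorrect


# Hypothetical subject (not data from the study): 4 names correctly
# circled and 2 circled in error.
example_score = corrected_recall(4, 2)
print(example_score)
```

Under this scoring, a subject who circles names indiscriminately gains little, because every wrong circle costs half a point.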


Results 
Pretest Analyses 


A principal-components factor analysis with 
varimax rotations was performed on the 21 
items of the Nina puzzle questionnaire, and 
the first factor extracted accounted for 50.6% 
of the variance. Seven items had factor load- 
ings higher than .70 and were considered to 
contribute to the definition of the factor: En- 
joyable, Good, Interesting, Exciting, Enjoy- 
ment of Puzzle, Interest in Puzzle, and 
Worthwhile. This factor was interpreted inde- 
pendently by two judges to measure task en- 
joyment, and scores on each of the 7 items 
were combined using unit weighting to form 
an enjoyment scale. Pretest enjoyment was 
used as a covariate in subsequent analyses of 
covariance. 

Since the overjustification effect is predicted 
to occur only when intrinsic motivation for 
an activity is initially high (Lepper & Greene, 
1976) it is necessary to establish that the 
Nina puzzles were in fact intrinsically inter- 
esting. The mean score for pretest enjoyment 
was 33.98 (SD = 9.20); this was significantly 
higher than 25, the midpoint of the enjoyment
scale, t(92) = 9.41, p < .001, d = 1.95,
suggesting that subjects rated the Nina puzzles
more enjoyable than not. Examination of
the rank ordering of puzzles indicated that
50% of the subjects rated the Nina puzzles
most enjoyable of the three. This observed
frequency was significantly higher than the
chance expectation of 33.33%, χ²(1) = 10.88,
p < .002, φ = .34, offering additional evidence
for the high level of initial enjoyment
of the Nina puzzles. 
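
The two pretest statistics can be approximately recovered from the summary values reported in this paragraph. The sketch below rests on assumptions that the text does not state: that N = 93 (inferred from the 92 degrees of freedom), that "50%" corresponds to 46 of 93 subjects, and that φ was computed as the square root of χ²/N. All function and variable names are illustrative only.

```python
import math


def one_sample_t(mean: float, sd: float, n: int, mu0: float) -> float:
    """t statistic for a sample mean tested against a fixed value mu0."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


def chi_square_gof(observed, expected) -> float:
    """Pearson chi-square goodness-of-fit statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def phi_from_chi2(chi2: float, n: int) -> float:
    """Phi coefficient for a 1-df chi-square (assumed formula)."""
    return math.sqrt(chi2 / n)


N = 93  # assumption: degrees of freedom (92) plus one

# Pretest enjoyment tested against the stated scale midpoint of 25.
t_enjoyment = one_sample_t(mean=33.98, sd=9.20, n=N, mu0=25)

# Rank ordering: Nina puzzles ranked first vs. not ranked first.
ranked_first = 46  # assumption: roughly 50% of 93 subjects
observed = [ranked_first, N - ranked_first]
expected = [N / 3, 2 * N / 3]  # chance: one of three puzzle types
chi2_rank = chi_square_gof(observed, expected)
phi_rank = phi_from_chi2(chi2_rank, N)

print(round(t_enjoyment, 2), round(chi2_rank, 2), round(phi_rank, 2))
```

Under these assumptions the computed values agree with the reported t, χ², and φ to within rounding, which suggests the inferred N and frequency are at least plausible.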


Manipulation Checks 


The effectiveness of the positive feedback
manipulation was examined in two Treatment
× Sex (6 × 2) analyses of covariance³ covarying
pretest enjoyment on the perceived
performance item (“How well do you think
you did on these puzzles?”) from the Nina
puzzle questionnaire for the experimental session
and posttest. A planned contrast tested
whether subjects in the four conditions receiving
positive feedback (NRPF, TCRPF, PCRNNPF,
and PCRNSPF) were higher in perceived performance
than subjects in the two no-feedback
conditions (NRNF, TCRNF). In this contrast
the first four groups were assigned weights of
+1 and the latter two groups were given
weights of −2. Subjects who received positive
feedback showed higher levels of perceived
performance both immediately following the
experimental session, F(1, 86) = 7.08, p <
.01, η = .28, and at the posttest 1 month later,
F(1, 80) = 7.03, p < .01, η = .28.
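
The η values that accompany each F test in this article appear to follow the standard conversion η = √[F·df_effect / (F·df_effect + df_error)]. The text does not state the formula, so this is an assumption; the sketch below (with illustrative names) simply checks it against the two manipulation-check results.

```python
import math


def eta_from_f(f_value: float, df_effect: int, df_error: int) -> float:
    """Correlation-type effect size recovered from an F ratio.

    Assumed formula: eta = sqrt(F * df_effect / (F * df_effect + df_error)).
    """
    explained = f_value * df_effect
    return math.sqrt(explained / (explained + df_error))


# Perceived performance after the experimental session: F(1, 86) = 7.08
eta_session = eta_from_f(7.08, 1, 86)
# Perceived performance at the 1-month posttest: F(1, 80) = 7.03
eta_posttest = eta_from_f(7.03, 1, 80)

print(round(eta_session, 2), round(eta_posttest, 2))
```

Both values round to the reported .28, so the conversion is at least consistent with these two results.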

A Treatment × Sex (6 × 2) analysis of
variance was performed on the seven postma- 
nipulation scales. No significant effects were 
found for alert, comfortable, relaxed, moti- 
vated, excited, anticipated puzzle difficulty, 
or expected performance level, suggesting that 
there were no treatment differences in motiva- 
tion as measured by these scales after the stu- 
dents heard the instruction tape but before 
they actually did the puzzles. 

In order to examine possible differences be- 
tween the two groups that received perform- 
ance-contingent rewards, post hoc protected 
t tests were used to compare the PCRNSPF and 
PCRNNPF groups on the seven postmanipula- 
tion scales and on experimental and posttest 
perceived performance. PCRNSPF subjects were
significantly higher than PCRNNPF subjects on alert,
expected performance level (“How well do
you think you will do on these puzzles?”),
and posttest perceived performance
(PCRNSPF Ms = 5.13, 4.93, and 5.84 vs.
PCRNNPF Ms = 4.16, 3.69, and 4.72 on 7-point
scales), ts(87) = 2.36, 2.46, and 2.32,
respectively, all ps < .05. The two groups did
not differ significantly on comfortable, relaxed,
motivated, excited, anticipated puzzle difficulty,
or experimental perceived performance.
This pattern of significant differences
suggests that the groups differed primarily in
expectations for and perceptions of performance
level.

² Six subjects graduated from high school prior to
the administration of the posttest, so posttest enjoyment
scores were not obtained for them.

³ Sex of subject was included as a factor in the
analyses of variance, although it was not relevant to
the main hypotheses of this study. A significant main
effect for posttest perceived performance was found,
F(1, 74) = 4.33, p < .05, η = .24, indicating that
males thought they had performed better than
females. A significant sex effect for expected performance
indicated that males thought they would do
better than females on the puzzles, F(1, 80) = [illegible],
p < .05, η = .27. A significant main effect for incidental
recall, F(1, 80) = 6.62, p < .05, η = .28, indicated that
males were higher in amount of incidental recall than
females, suggesting that the females may have been
more task oriented. The sex differences found in this
study were unanticipated, and interpretation of them
would require replication.


Intercorrelation of the Dependent Measures 


Table 2 presents the correlation matrix for 
the six measures of intrinsic motivation (pre- 
test, experimental, and posttest enjoyment, 
time, volunteering, and extra puzzles), performance,
recall, and experimental and posttest
perceived performance. The average intercorrelation
of the six measures was .51 (p <
.001), and each individual correlation was significant
at the p < .001 level, indicating that
all of the operationalizations of intrinsic motivation
were positively interrelated.

Performance was positively related to pretest
and experimental enjoyment, r(91) = .20
and .22; p < .05, but was not significantly
related to any other measures of intrinsic
motivation. Performance (actual) did correlate
positively with experimental perceived
performance, r(91) = .33, p < .001. Recall
did not correlate significantly with any mea- 
sures of intrinsic motivation or performance. 

Because the four intrinsic motivation vari- 
ables measured during the experimental ses- 
sion were positively related, a principal-com- 
ponents factor analysis was performed on 
them in order to create one measure of mo- 
tivation for the experimental session. Volun- 
teering, extra puzzles, time, and experimental 
enjoyment had loadings of .57, .75, .60, and 
.73, respectively, on the first factor extracted, 
and standardized factor scores were computed 
for the resulting intrinsic motivation measure, 
which was used as the primary dependent 
variable in the main analyses. 


Effects of Reward and Feedback on 
Intrinsic Motivation 


A Treatment × Sex (6 × 2) analysis of
covariance with pretest enjoyment as a covariate
was performed on intrinsic motivation


Table 2 
Intercorrelations of Dependent Measures 
6 7 8 9 


Measure 1 2 3 
1. Pretest 

enjoyment 
2. Experimental 

enjoyment .69*** 
3. Posttest Pena 

j „5 A 

4, ena ET 44ee* 4o"** 
5. Volunteering gge .S0*** om 
6. Extra puzzles 46*** gases Al 
7. Performance .20* .22* 14 
8. Recall -05 .04 01 
9, Experimental 

perceived 

performance 04 Bal .00 
10. Posttest 

perceived 

performance -12 19 :28** 


4 5 


33"** 

‘s3°** A3°** 

10 .09 .03 

14 06 17 AL 

.02 .02 —.07 30%) —-13 

.29** —.02 .03 .337* r RGA biaia 


*p < .05. **p < .01. ***p < .001.
Table 3
Reward Condition Means for the Dependent Variables

                              Reward condition/contrast weight

                        NRPF/   NRNF/   TCRPF/   TCRNF/   PCRNSPF/   PCRNNPF/
Measure                  +3      +1      +1       −1       −1         −3

Intrinsic motivation    1.05    −.16     .39     −.64      .07       −.96
Posttest enjoyment     38.47   33.09   35.36    31.96    35.97      29.96
Recall                  2.56    2.44    3.13     2.30     1.49       1.22
Performance             8.87    7.81    8.56     8.27     9.00       8.62

Note. For the intrinsic motivation measure, standardized factor scores were created from volunteering, extra
puzzles, time, and experimental enjoyment variables. NRPF = no reward – positive feedback. NRNF = no
reward – no feedback. TCRPF = task-contingent reward – positive feedback. TCRNF = task-contingent reward
– no feedback. PCRNSPF = performance-contingent reward (norms supplied) – positive feedback. PCRNNPF
= performance-contingent reward (no norms) – positive feedback.


and posttest enjoyment. The effect of the pretest
enjoyment covariate was highly significant
for both intrinsic motivation, F(1, 86)
= 56.98, p < .001, η = .63, and posttest enjoyment,
F(1, 80) = 39.99, p < .001, η = .58,
indicating that, as expected, intrinsic interest 
following the experimental manipulation was 
linearly related to initial task interest. No 
effects involving sex were significant for either 
dependent measure. 

A planned contrast was used to test the
main effect of treatment. Weights for the
contrast were operationalized according to
the control hypothesis as follows: no reward –
no feedback, +1; no reward – positive
feedback, +3; task-contingent reward – no
feedback, −1; task-contingent reward – positive
feedback, +1; performance-contingent
reward (norms supplied) – positive feedback,
−1; and performance-contingent reward (no
norms) – positive feedback, −3. These equal
interval weights are based on the assumption
that the predicted effects are all equal in magnitude.
This contrast tests the prediction that
task-contingent rewards will produce negative
effects relative to control conditions of no
reward, that positive feedback will produce
positive effects relative to conditions of no
feedback, and that these two effects do not
interact. This contrast also tests the prediction
that performance-contingent rewards will
decrease intrinsic motivation more than task-contingent
ones and that performance-contingent
rewards with no norms supplied will
decrease intrinsic motivation more than
performance-contingent rewards with norms supplied.
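
The weighting scheme above can be made concrete by applying it to the six intrinsic motivation cell means reported in Table 3. The sketch below computes only the contrast estimate, the weighted sum of cell means. It does not reproduce the reported F test, which would also need the cell sizes and the error mean square, neither of which is given in the text. The dictionary and function names are illustrative.

```python
# Contrast weights as stated in the text (control hypothesis).
weights = {
    "NRPF": +3,     # no reward, positive feedback
    "NRNF": +1,     # no reward, no feedback
    "TCRPF": +1,    # task-contingent reward, positive feedback
    "TCRNF": -1,    # task-contingent reward, no feedback
    "PCRNSPF": -1,  # performance-contingent (norms supplied), positive feedback
    "PCRNNPF": -3,  # performance-contingent (no norms), positive feedback
}

# Intrinsic motivation cell means (standardized factor scores) from Table 3.
means = {
    "NRPF": 1.05,
    "NRNF": -0.16,
    "TCRPF": 0.39,
    "TCRNF": -0.64,
    "PCRNSPF": 0.07,
    "PCRNNPF": -0.96,
}


def contrast_estimate(weights: dict, means: dict) -> float:
    """Weighted sum of cell means for a planned contrast."""
    return sum(weights[cell] * means[cell] for cell in weights)


# A valid contrast requires the weights to sum to zero.
weight_sum = sum(weights.values())
estimate = contrast_estimate(weights, means)
print(weight_sum, round(estimate, 2))
```

A positive estimate indicates that the cell means are ordered in the direction the weights predict; whether it is reliably different from zero is what the F test in the text evaluates.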

Means for the two dependent measures of
intrinsic motivation are presented in Table 3.
As predicted, the treatment planned contrast
was highly significant for both measures: intrinsic
motivation, F(1, 86) = 12.72, p <
.001, η = .36, and posttest enjoyment, F(1,
80) = 9.58, p < .005, η = .33.⁴ Thus intrinsic
motivation was enhanced by positive feedback
and undermined by rewards, particularly
performance-contingent ones. This treatment
planned contrast tests the overall patterning
of the six cell means with respect to the predictions
specified by the weights. Because this
overall test incorporates several specific predictions,
and since particular cell comparisons
are relevant with respect to comparison with
other studies, several additional nonorthogonal
but theoretically relevant planned comparisons
were computed for the primary dependent
measure, intrinsic motivation.

⁴ This planned contrast was also significant for each
of the four dependent measures combined to form the
intrinsic motivation measure, in four separate analyses
of variance on time, extra puzzles, volunteering, and
experimental enjoyment.

In the four cells in which task-contingent
reward or no reward was crossed with positive
feedback or no feedback (TCRNF, TCRPF,
NRNF, and NRPF), task-contingent rewards
were found to reduce intrinsic motivation relative
to control conditions of no reward, replicating
the overjustification effect: TCRNF and
TCRPF versus NRNF and NRPF, t(87) = 1.78,
p < .05, one-tailed, d = .38; and positive
feedback increased interest relative to no feedback:
NRPF and TCRPF versus NRNF and
TCRNF, t(87) = 2.74, p < .05, d = .59. The
interaction between task-contingent reward
and feedback was not significant. Comparison
of the two effect sizes (ds) suggests that the
positive feedback effect was stronger than the
overjustification effect.
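
The effect sizes compared in this paragraph appear to follow the common conversion d = 2t/√df. That formula is an assumption, since the text reports the ds without defining them, but it reproduces both values. A minimal sketch with illustrative names:

```python
import math


def d_from_t(t_value: float, df: int) -> float:
    """Standardized mean difference recovered from a t statistic.

    Assumed conversion: d = 2t / sqrt(df).
    """
    return 2 * t_value / math.sqrt(df)


# Overjustification effect: TCRNF and TCRPF vs. NRNF and NRPF, t(87) = 1.78
d_overjustification = d_from_t(1.78, 87)
# Positive feedback effect: NRPF and TCRPF vs. NRNF and TCRNF, t(87) = 2.74
d_feedback = d_from_t(2.74, 87)

print(round(d_overjustification, 2), round(d_feedback, 2))
```

Because both contrasts share the same error degrees of freedom, the ratio of the two ds equals the ratio of the two t values, so the feedback effect is roughly one and a half times the size of the reward effect.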

Planned comparisons were also computed
for the four cells with positive feedback
(NRPF, TCRPF, PCRNSPF, and PCRNNPF), which
represent the conceptual replication and extension
of the Karniol and Ross (1977) design.
The TCRPF and PCRNSPF means were not
significantly different, but both were significantly
lower than the control group: TCRPF
and PCRNSPF versus NRPF, t(87) = 2.01, p <
.05, d = .43. Thus the Karniol and Ross results
were not replicated: Performance-contingent
rewards, as well as task-contingent
ones, reduced interest relative to control conditions.
Furthermore, the Karniol and Ross
prediction for the more informational
PCRNNPF group was not supported; the
PCRNNPF group was significantly lower than
the PCRNSPF group in interest, t(87) = 1.76,
p < .05,⁵ one-tailed, d = .37. Additionally,
although the PCRNSPF group was not significantly
lower than the TCRPF group, PCRNNPF
rewards were significantly more undermining
than TCRPF ones, t(87) = 2.34, p < .05, d =
.50, and the two performance-contingent rewards
considered together were significantly
more undermining than the task-contingent
one: PCRNNPF and PCRNSPF versus TCRPF,
t(87) = 1.67, p < .05, one-tailed, d = .36,
replicating the results of Deci (1972b).


Effects of Treatment on Performance 
and Recall 


Treatment x Sex (6 X 2) analyses of co- 
variance covarying pretest enjoyment were 
performed on incidental recall and perform- 
ance. There were no significant effects for per- 
formance. 

It was predicted that subjects offered
performance-contingent rewards would be more
task oriented and would recall fewer irrelevant
details than subjects offered no rewards or
task-contingent ones. A planned contrast comparing
the two performance-contingent conditions
(PCRNSPF and PCRNNPF: −2) with the
other four conditions (NRNF, NRPF, TCRNF,
and TCRPF: +1) was significant, F(1, 86) =
8.95, p < .005, η = .32, indicating that recall
was lower in the two performance-contingent 
conditions, as predicted. The effect of the pre- 
test enjoyment covariate was not significant 
(F < 1). Amount of recall, then, did not vary 
with the subjects’ initial enjoyment of the 
puzzles. No other effects, except those involv- 
ing sex, were significant for this measure. 


Discussion 


The control hypothesis received very strong 
support in this study. Performance-contingent 
rewards, particularly informational ones, were 
found to undermine subsequent intrinsic mo- 
tivation more than task-contingent ones, 
which produced decrements in intrinsic mo- 
tivation relative to control conditions of no 
reward. This pattern of results was found 
across five different operational definitions of 
intrinsic motivation, some of which were be- 
havioral and some of which were self-report, 
attitudinal measures. Most striking, perhaps, 
is the fact that this pattern of results was still 
found 1 month after the experimental manip- 
ulation had taken place, on the task enjoy- 
ment measure. 

The findings of this study lend support to 
the generalizability of the overjustification 
effect, which was replicated using a different 
subject population (high school students as 
opposed to college sophomores or pre- 
schoolers), a different experimental activity, 
and a different reward (notebook and pens, as 
opposed to the cash usually offered to adult 
subjects). The enhancing effect of positive 
feedback was also replicated, and this effect 
was found to be independent of the overjusti- 
fication effect. This finding has important im- 
plications for the comparison of previous 
studies and for the identification of appropri- 
ate control groups. 


⁵ All reported p levels are two-tailed unless otherwise
stated.

Since the receipt of a performance-contingent
reward means that a subject has received
both a reward and positive feedback concerning
performance, the appropriate control
group for examination of the effects of rewards
may be a group that receives the same
positive feedback but no reward. The comparison
of performance-contingent and task-contingent
rewards requires that the task-contingent
group receive the same positive feedback,
assuming that the feedback does not
interact with the offer of a reward. On the
other hand, it can be argued that the feedback 
inherent in a performance-contingent reward 
is an integral part of the reward itself, in 
which case it would not be unreasonable to 
compare performance-contingent rewards with 
task-contingent rewards or controls without 
positive feedback added. Choice of an appropriate
comparison group is therefore somewhat
arbitrary but of course affects the nature
of the conclusions drawn. 

Both kinds of performance-contingent re- 
wards were found to reduce intrinsic motiva- 
tion relative to control conditions of no re- 
ward. The informational reward (no norms) 
decreased intrinsic motivation more than the 
noninformational (norms supplied) reward, 
as predicted. Although the reward itself con- 
tained more information, subjects in both 
conditions received the same amount of information
about their performance. In the noninformational
condition, subjects knew how
many Ninas they had to find and knew that 
they were doing well as soon as they found 
four Ninas. Subjects in the informational con- 
dition, on the other hand, had no idea about 
their performance level until after they fin- 
ished the puzzles and received their reward. 
Intrinsic motivation may have been lower in 
the informational condition because the re- 
ward was more controlling (students had to 
perform as well as possible for the whole ses- 
sion, whereas those in the noninformational 
condition were less controlled because of the 
lesser demand of having to find four Ninas), 
or intrinsic motivation may have been higher 
in the noninformational condition because of 
the greater amount of feedback available to 
subjects during the actual process of doing the
puzzles. Positive feedback may be more effective
if it is supplied (or implied by preestablished
norms) during the process of task involvement
(so that subjects feel that they are
doing well) rather than after task completion.

The informational hypothesis of Karniol
and Ross (1977) received no support in this
study. The means for intrinsic motivation fell
in almost the exact opposite ordering of the 
pattern predicted by their hypothesis. It was 
originally predicted that their hypothesis 
would not be supported in this study because 
of differences in subject populations, but post- 
experimental conversations with students re- 
vealed that the nature of the experimental 
activity might be an important determinant of 
the relative salience of controlling versus in- 
formational aspects of performance-contingent 
rewards. Although students enjoyed doing the 
Nina puzzles, their enjoyment did not seem 
related to competence concerns. Enjoyment of 
the task was only slightly related to actual 
performance level, r(91) = .22, p < .05.
Neither performance nor perceived perform- 
ance was significantly related to any of the 
other measures of intrinsic motivation, sug- 
gesting that actual or perceived competence 
is not always an integral part of intrinsic mo- 
tivation. 

Intrinsically motivated activities may be
engaged in for their own sake (e.g., drawing
with magic markers and reading novels) or for
the feelings of competence they produce (e.g.,
games, puzzles, and sports). This is, of course,
an unrealistic dichotomy, because most activities
involve both kinds of intrinsic satisfaction,
but the extent to which an activity is
relatively more competence relevant or fun
relevant may determine whether informational
or controlling aspects of a performance-contingent
reward will be more salient to an individual.

The results of the analyses of recall differences
are particularly interesting. The fact
that recall was unrelated to the measures of
intrinsic motivation and unrelated to performance,
yet was affected by reward condition,
suggests that this measure tapped an important
dimension of task involvement that
has not previously been examined in overjustification
studies. The finding that contingently
rewarded subjects did not remember
as much about the puzzles as did other subjects,
but did not find any more Ninas either,
suggests that they may have been using a 
more answer-oriented strategy that did not 
prove to be effective in terms of performance 
on these particular puzzles. Exactly what sort 
of processing strategy is gauged by the recall 
measure employed in this study is not clear, 
but the results of the analyses lend support to 
Condry’s (1977) contention that more atten- 
tion needs to be directed toward an explora- 
tion and understanding of the process of task 
involvement. 
The following factors were hypothesized to moderate the attitude-behavior relation:
(a) the behavioral sequence that must be successfully completed prior
to the occurrence of the behavior, (b) the time interval between the measure- 
ment of attitudes and behavior, (c) attitude change, (d) the respondent’s ed- 
ucational level, and (e) the degree of correspondence between attitudinal and 
behavioral variables. The behaviors investigated were having a child and using 
oral contraceptives. A stratified random sample of 244 married women in a 
midwestern urban area was studied during a three-wave, 2-year longitudinal 
study. Selection of attitudinal and belief measures was guided by the Fishbein 
model of behavioral intentions. Consistent with the hypotheses, the relations 
between behavior and both intention and the model’s attitudinal and normative 
components were substantially attenuated by (a) events in the behavioral 
sequence not under the volitional control of the actor, (b) an increase in the 
time interval between the measurement of attitudes and behavior from 1 to 2 
years, and (c) changes in the model’s attitudinal and normative components 
during the first year. The respondent's educational level did not affect attitude- 
behavior consistency. Finally, the attitude-behavior correlation increased sig- 
nificantly as the degree of correspondence between the two variables increased. 


Wicker (1969) concluded his comprehen- 
sive review of the attitude-behavior relation 
with the suggestion, “It is considerably more 
likely that attitudes will be unrelated or only 
slightly related to overt behaviors than that 
attitudes will be closely related to actions” 
(p. 76). Rather than signaling a decrease in 
research on this topic, pessimistic reviews by 
Wicker and others (e.g., Deutscher, 1966, 
1969; Ehrlich, 1969; McGuire, 1969) appear 
to have prompted a renewal of interest in the 
relationship between attitude and action. One 
encouraging line of inquiry has focused on 
methodological refinements in the attitudinal 
and behavioral measures. By assessing both
variables at corresponding levels of specificity,
that is, measuring attitude toward the act for
the prediction of a specific behavior or measuring
a global attitude toward an object for
the prediction of a multiple-act behavioral
criterion, a reasonable degree of predictive
accuracy can be obtained (Ajzen & Fishbein,
1973, 1977; Fishbein & Ajzen, 1974; Heberlein
& Black, 1976; Weigel & Newman, 1976;
Weigel, Vernon, & Tognacci, 1974; Weinstein,
1972; Wicker & Pomazal, 1971).

A second line of inquiry, encouraged by
Kelman (1974), has attempted to specify the
personal and situational variables influencing
the extent to which people act in accord with
their stated attitudes and beliefs. Individual
difference variables that have been found to
moderate attitude-behavior correspondence
include the degree of affective-cognitive consistency
(Norman, 1975), the tendency to
ascribe responsibility to the self (Schwartz,
1973), and the degree of self-monitoring
(Snyder & Tanke, 1976). Similarly, a number
of situational variables appear to moderate
the attitude-behavior relation. For example,
Warner and DeFleur (1969) found that the 
public versus private nature of the behavior 
interacts with attitude in influencing attitude- 
behavior consistency, and Snyder and Swann 
(1976) demonstrated that situations that in- 
crease the relevance of salient attitudes as 
guides to actions increase the correspondence 
between attitude and action. A recent series 
of studies (Fazio & Zanna, 1978; Fazio, 
Zanna, & Cooper, 1978; Regan & Fazio, 
1977) provide support for the notion that 
attitudes formed through direct behavioral 
experience with the attitude object are more 
predictive of later behavior toward that ob- 
ject than are attitudes based upon indirect, 
nonbehavioral experience. In summary, sev- 
eral personal and situational factors have been 
found to affect the attitude-behavior relation. 
As Regan and Fazio (1977) commented, “The 
question facing researchers is, therefore, no 
longer whether an individual’s attitudes can 
be used to predict his overt behavior, but 
when” (p. 30, italics in the original). 

The present research continues the study of 
factors that moderate the degree of attitude- 
behavior consistency and investigates the fol- 
lowing variables: 

1. Sequence of prior events. As Fishbein 
and Jaccard (1973) have observed, the occur- 
rence of a behavior under investigation is fre- 
quently dependent upon the successful com- 
pletion of a sequence of prior events. For 
example, in order to have a child a woman 
must (a) have intercourse, (b) not use birth 
control, (c) conceive, and (d) not have a 
spontaneous or induced abortion. While some 
of the sequences are potentially under the 
control of the actor (e.g., not using birth control),
others are not (e.g., conception). If
performance of the final behavior requires the 
successful completion of sequences not under 
the volitional control of the actor, it is hy- 
pothesized that attitude toward the behavior 
will provide a better estimate of the likelihood 
that the actor will initiate the sequence than 
that the actor will complete the sequence and
perform the behavior in question.

2. Attitude change. In a recent review of
the attitude-behavior literature, Schuman and 
Johnson (1976) noted that little attention 
has been paid to the possibility that attitude 
change may occur between the measurement 
of attitude and behavior. They commented, 
“The extent to which A-B correlations are 
reduced because of real changes in underlying 
attitude needs to be identified as far as pos- 
sible” (p. 178). Such identification is impos- 
sible in most studies because attitude is as- 
sessed at only one time. In the present re- 
search, attitude was measured twice prior to 
the performance of behavior in order to detect 
attenuation attributable to attitude change. 
Respondents whose attitudes remain stable are 
hypothesized to exhibit greater attitude-behavior
consistency than respondents who experience
attitude change.

3. Time interval. A third variable hypoth- 
esized to influence the degree of relation be- 
tween attitude and behavior is the time inter- 
val between the measurement of attitude and 
the performance of behavior. As Fishbein and
Jaccard (1973) noted, it is probably the 
processes occurring during that time interval 
—most notably exposure to new information 
—and not the passage of time per se that mod- 
erate the relation. This hypothesis is concep- 
tually similar to the one above; the longer the 
time interval between the measurement of 
attitude and behavior, the higher the prob- 
ability of exposure to new information and, in 
turn, attitude change. 

Three previous investigations have reported 
data relevant to the influence of temporal in- 
stability on attitude-behavior consistency. 
Norman (1975) regressed behavior (volun- 
teering to be a subject) on attitudes toward 
acting as a subject, assessed both 3 and 6 
weeks prior to the measurement of behavior. 
For each of three attitudinal measures, the 
second administration was more highly cor- 
related with behavior than the first, but the 
differences were not statistically significant. 
It has been suggested (Schwartz, 1978) that 
for this behavior the 3-week interval between 
attitude assessments was too brief for an un- 
ambiguous effect to be detected. In studies of 
voting behavior, Kelley and Mirer (1974)
have reported that for some subgroups the
time interval between the measurement of
voting behavior and attitudes can moderate
the attitude-behavior correlation. Analyses of
data from the 28% of the subjects judged
most likely to change their minds found that
a substantial proportion of variance in errors
of prediction was accounted for by the number
of days that intervened between the interview
and the election. In the final and most
recent investigation, Schwartz (1978) re- 
gressed stated willingness to tutor blind chil- 
dren on perceived moral obligation to tutor 
blind children, measured both 3 and 6 months 
prior to the criterion measure. The correlation 
with the criterion was significantly higher over 
the shorter time interval. However, as 
Schwartz noted, it is uncertain whether the 
results obtained with perceived moral obliga- 
tion can be generalized to more traditional at- 
titude scales measuring evaluation. In sum, 
although specific aspects of each of these 
studies—interval between attitude assess- 
ments, nature of the subsample investigated, 
nature of the predictive variable—have pre- 
cluded a clear demonstration of the influence 
of time interval on the consistency between 
behavior and the evaluation of the behavior, 
these studies together provide a sound basis 
for positing the present hypothesis. 

4. Respondent’s education. Consistent with 
formulations proposed by Peak (1955), Ros- 
enberg (1956), and Fishbein (1963), attitude 
toward a behavior is viewed as the sum of 
one’s beliefs about the consequences of per- 
forming the behavior multiplied by the eval- 
uation of those consequences. Such formula- 
tions, with their emphasis on cognitive-affec- 
tive consistency, are frequently criticized as 
having greater validity for college students 
and intellectuals than for the majority of the 
population (cf. Bem, 1970). If such critiques 
are correct, education should interact with 
attitudes in the prediction of behavior, and 
degree of attitude-behavior consistency
should correlate positively with the respondent’s
educational level. The findings of Rokeach
(1973) also lead to the hypothesis that
greater consistency will be observed for
respondents with more education. In a national
sample of American adults, the instrumental
value logical, defined as consistent and rational,
was more positively valued by highly educated
respondents than by those with less education.

5. Correspondence between attitudinal and
behavioral variables. As noted earlier, if attitude
and behavior are both measured at a similar
level of specificity, a reasonable relation
is generally observed. In an extension of the
specificity hypothesis, Ajzen and Fishbein
(1977) maintain that attitudinal and behavioral
variables are defined by four elements:
an action, the target the action is directed
toward, the context in which the action is to
be performed, and the time at which the
action occurs. They argue that the correlation
between attitude and behavior is determined
in part by the degree of correspondence or
match between the elements comprising the
two variables. They classified past studies in
terms of the correspondence between attitude
and behavior on two of the four elements:
target and action. Consistently significant
attitude-behavior correlations were obtained
only if there was high correspondence between
target and action elements. Their investigation,
however, examined the moderating effects
of target and action correspondence by
comparing attitude-behavior correlations obtained
from different studies. In the present
research, the effects of target, action and, in
addition, time correspondence on attitude-behavior
consistency are tested within one
study. It is hypothesized that as the degree
of correspondence increases, the obtained
attitude-behavior correlation will increase.




Attitude Model 


The selection of attitudinal measures was
guided by the Fishbein (1967) model of behavioral
intention. The model is predicated on
the assumption that most human behavior of
concern to social scientists is to some degree
volitional in nature and hence guided by the
behavioral intent of the individual. Algebraically,
the model is expressed as follows:


$$
B \sim BI = \Bigl[\sum_{i=1}^{n} B_i E_i\Bigr] W_1 + \Bigl[\sum_{i=1}^{m} NB_i \, MC_i\Bigr] W_2,
$$

where B = overt behavior, BI = the behavioral
intention to perform the behavior, B_i =
the belief (perceived probability) that performing
the behavior will lead to consequence
X_i, E_i = the evaluation of X_i, NB_i = the
perceived expectation of Referent i, MC_i = the
motivation to comply with Referent i, n =
the number of salient consequences, m = the
number of salient normative beliefs, and W_1
and W_2 = empirically determined regression
weights.

As Peak (1955), Rosenberg (1956), and
Fishbein (1963) have theorized, the ΣB_iE_i
component is viewed as an index of the person’s
attitude toward performing the behavior
(Aact). Fishbein and Ajzen (1975) maintain
that the second predictive component
(ΣNB_iMC_i) assesses the influence of the social
environment or the general subjective norm
(SN) on behavior. Empirical support for the
equivalence of ΣB_iE_i and Aact and of
ΣNB_iMC_i and SN is discussed in Fishbein and
Ajzen (1975).

According to the model, any variable other
than the attitudinal and normative components
can influence BI only indirectly. Social,
demographic, and personality characteristics
of the respondent can affect intentions only if
they influence ΣB_iE_i or ΣNB_iMC_i or their
relative weights. Thus if this framework is correct,
such variables as education should not
have a direct effect on the magnitude of the
relation between the components of the model
and intention, and hence behavior.

Fishbein and Ajzen (1975) argue that the
relation between BI and B should be very
strong, provided that intention and behavior
are measured at correspondent levels of specificity
and that nothing intervenes to alter
intention. Whenever a strong intention-behavior
relation is observed, the behavior in question
should also be predictable from the attitudinal
and normative components. Assuming
a significant BI-B correlation, it is hypothesized
that behavior can be predicted from a
linear combination of the attitudinal and
normative components. In addition, it is
hypothesized that the variables moderating the
intention-behavior relation will also moderate
the relation between attitudes, subjective
norms, and behavior.


Behavioral Domain 


The behaviors chosen for study were child- 
bearing and contraceptive use. While social 
scientists have met with success in document- 
ing fertility differentials on the basis of social 
and demographic variables (cf. Rindfuss &
Sweet, 1977), there has been a marked lack
of success in explaining fertility behavior with 
psychological constructs. Two major fertility 
surveys, the Indianapolis study (Whelpton & 
Kiser, 1946-1958) and the Princeton study 
(Westoff, Potter, Sagi, & Mishler, 1961) 
failed to find any noteworthy relationships 
between psychological variables (primarily 
personality measures) and fertility and family
planning behaviors.

Recently, research findings have indicated 
that a few psychological models that can be 
categorized as expectancy models are of some 
utility in the prediction of fertility decision 
making (see, for example, Beach, Townes, 
Campbell, & Keating, 1976; Davidson & Jac- 
card, 1975; Werner, Middlestadt-Carter, & 
Crawford, 1975). Working with the Fishbein 
model, Davidson and Jaccard (1975) investi- 
gated attitudes concerning three family plan- 
ning behaviors in a sample of married women. 
In support of the model, ΣB_iE_i and ΣNB_iMC_i
explained an average of 60% of the variance 
in behavioral intentions. 

The data reported in Davidson and Jaccard 
(1975) were obtained during the first wave 
of a three-wave longitudinal survey. The 
sample was reinterviewed both 1 year and 2 
years after the initial interview. The present 
study examines the relation between inten- 
tions, at the first interview, of having a child 
during the next 2 years and using oral contra- 
ceptives during the next 2 years and the cor- 
responding self-reports of behavior during the 
2-year period. 


Method 


Sample 


Respondents in the initial survey were 270 white
married women, age 18-38 years, residing in a
midwestern urban area. The sample was stratified by
three levels of socioeconomic status and two levels of
religious affiliation (Catholic and Protestant). The
sampling procedure was designed to select randomly
45 women for each cell of the 2 × 3 design. For a
detailed description of the sampling strategy, see
Davidson and Jaccard (1975).


Loss to Follow-Up 


The respondents were reinterviewed both 1 and 2 
years after the initial survey. Of the original 270 
respondents, 244 completed all three interviews. 
Approximately one half of the respondents lost to 
follow-up refused to be reinterviewed and the other 
half could not be located. Loss to follow-up did not 
significantly vary among the six cells of the sampling 
design. 


Interviews 


Approximately 1 hour was required for the respondent
to complete each questionnaire. Although
the interviews were self-administered, the interviewer
remained in the house while the questionnaire was
being completed to answer any questions the
respondent might have. The respondent received $10
for each interview.


Measurement Procedures 


To insure that relevant beliefs and referents for the 
attitudinal and normative components of the model 
were included in the precoded questionnaire, initial 
elicitation interviews were conducted with an inde- 
pendent sample of 55 women. These women were 
selected from the same population as the women in
the longitudinal sample. The interviews identified 
four referents (e.g., husband, close friends) and nine 
beliefs (e.g., the effect of having a child on the hus- 
band-wife relationship, the woman’s freedom to pur- 
sue other activities, the family budget, etc.) salient 
to the population. A list of the referents and beliefs 
included in the questionnaire is presented in Davidson 
and Jaccard (1976). 

All predictive components were measured on 7-point
adjective scales. Examples of the measures are

1. Behavioral intention (BI) (I intend to have a
child within the next 2 years) was measured on a
likely-unlikely scale.

2. Belief about the act (B_i) (For me, having a
child within the next two years would make my marriage
stronger) was measured on a likely-unlikely
scale.

3. Evaluation of the consequence (E_i) (Making my
marriage stronger would be . . .) was measured on a
good-bad scale.

4. Attitude toward the act (Aact) (For me, having
a child within the next two years would be . . .) was
obtained by summing the responses to three evaluative
scales: good-bad, nice-awful, pleasant-unpleasant.

5. Normative belief (NB_i) (My parents think I
should have a child within the next two years) was
measured on a likely-unlikely scale.

6. Motivation to comply (MC_i) was measured on a
scale anchored by generally speaking, I want to do
what Referent i thinks I should do and generally
speaking, I do not want to do what Referent i thinks
I should do.

7. General subjective norm (SN) (People who are
important to me and whose opinions I value think I
should have a child within the next two years) was
measured on a likely-unlikely scale.

Scales assessing evaluations of consequences and
attitude toward the act were scored −3 (bad) to
+3 (good). All other scales were scored from 1 (unlikely;
not motivated to comply) to 7 (likely; motivated
to comply).² ΣB_iE_i and ΣNB_iMC_i were
obtained by multiplying the score on each belief
statement by the score on the corresponding evaluation
or motivation to comply and then summing
these products for all beliefs. In addition to these
items, the questionnaire assessed a number of
fertility-related attitudes and contained measures of
education, scored from 1 (sixth-grade education or
less) to 8 (PhD or other professional degree), and
other demographic variables.
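
The scoring rule just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' scoring program: the function name `sum_of_products` and the respondent's ratings are hypothetical, and the example uses three beliefs and two referents rather than the nine beliefs and four referents of the questionnaire.

```python
def sum_of_products(first, second):
    """Sum of pairwise products, e.g. each belief times its evaluation."""
    if len(first) != len(second):
        raise ValueError("each belief needs exactly one paired rating")
    return sum(a * b for a, b in zip(first, second))


# Hypothetical respondent (values invented for illustration).
beliefs = [6, 2, 5]            # B_i, scored 1 (unlikely) to 7 (likely)
evaluations = [3, -2, 1]       # E_i, scored -3 (bad) to +3 (good)
normative_beliefs = [7, 3]     # NB_i, scored 1 to 7
motivation_to_comply = [5, 2]  # MC_i, scored 1 to 7

attitudinal = sum_of_products(beliefs, evaluations)                   # 18 - 4 + 5
normative = sum_of_products(normative_beliefs, motivation_to_comply)  # 35 + 6
print(attitudinal, normative)
```

Because evaluations are bipolar, a belief that a disliked consequence is likely lowers the attitudinal score, whereas the unipolar normative products can only add to the normative score.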


each behavior. For childbearing, women reporting a
birth between the first and third interviews were
assigned a score of 1, and the remainder of respondents
were assigned a score of 0. In families reporting
a birth, the child was typically observed by the
interviewer. For oral contraceptive use, women
reporting that they used oral contraceptives at any
time during the 2-year interval were scored 1, and
women never using the pill were scored 0.

The ΣB_iE_i and ΣNB_iMC_i measures were obtained
for only one of the behaviors: having a child. Due
to time constraints, only the direct measures of the
attitudinal and normative components, Aact and SN
respectively, were obtained for the second behavior,
using oral contraceptives. In the first of the three
interviews, questions referred to performing each
behavior during the next 2 years. In the second
interview, the questionnaire was adjusted to query the
respondents about having a child and using oral
contraceptives during the next year.

¹ Due to missing data and failure to follow instructions,
the sample sizes in the following analyses range
from 242 to 244.

² In our previous studies all scales were scored −3
to +3. However, we now recognize that with the
exception of the evaluation measures, the remainder of the
scales are assessing variables that are theoretically
unipolar (e.g., subjective probability). Hence, the
unipolar scales are scored 1 to 7 in the present
analysis. A comparison of variables formed from both
the present and previous scoring rules indicated that
the two were substantially intercorrelated; for ΣB_iE_i,
r = .93, and for ΣNB_iMC_i, r = .76. For a discussion
of unipolar versus bipolar scoring of attitude and
belief measures, see Fishbein and Ajzen (1975, pp. [illegible]-86).


Results 


The first half of this section focuses on in- 
tention—behavior consistency and the factors 
hypothesized to influence the degree of con- 
sistency. The second half investigates the pre- 
diction of behavior from the model’s atti- 
tudinal and normative components and ex- 
amines the variables that moderate that rela- 
tion. 


Prediction of Behavior from Behavioral 
Intention 


The intention-behavior relation. For each 
behavior a point-biserial correlation was com- 
puted between behavioral intention, measured 
during the initial interview, and behavior dur- 
ing the 2-year period. The BI-B correlation 
was .526 (p < .01) for childbearing and .678 
(p < .01) for contraceptive use. The relative 
magnitude of these two correlations is prob- 
ably influenced by the extent to which each 
behavior is under volitional control. While 
contraceptive use-nonuse is to a large degree 
under the control of the actor, childbearing is 
determined by an interaction of biological and 
volitional events. 
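
A point-biserial correlation is simply a Pearson correlation in which one variable is dichotomous. The following is a minimal sketch of that computation; the function name `point_biserial` and the eight ratings are hypothetical and are not data from the survey.

```python
from math import sqrt


def point_biserial(scores, outcomes):
    """Pearson r between a continuous score and a 0/1 outcome."""
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(outcomes) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, outcomes))
    var_x = sum((x - mean_x) ** 2 for x in scores)
    var_y = sum((y - mean_y) ** 2 for y in outcomes)
    return cov / sqrt(var_x * var_y)


# Hypothetical intention ratings (1 = unlikely, 7 = likely) and behavior codes.
intention = [7, 6, 6, 2, 1, 3, 5, 1]
gave_birth = [1, 1, 0, 0, 0, 0, 1, 0]
r = point_biserial(intention, gave_birth)
print(round(r, 3))
```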

The sequence of prior events. As discussed 
above, in order to bear a child a woman must 
have intercourse without using contraception, 
conceive, and not have a spontaneous or in- 
duced abortion. It was hypothesized that in- 
tention would provide a better estimate of the 
probability that the person would initiate the 
sequence than of the likelihood that the per- 
son would complete the sequence and perform 
the behavior in question (bear a child). Wom- 
en were classified as initiating the sequence if, 
in the final survey, they responded that by 
having intercourse while not using contracep- 
tion they had been attempting for at least 9 
months to become pregnant or if they actually 
gave birth during the 2-year period. The 
point-biserial correlation of intention with
birth or attempted conception was .615 (p <
.01). As hypothesized, this correlation was
significantly larger than the correlation of intention
with birth, .615 versus .526, t(241) =
4.45, p < .01.
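
The text does not name the test behind t(241). One standard test for two correlations that share a variable and come from the same sample, with N − 3 degrees of freedom, is Hotelling's t; the sketch below assumes that test. It also needs the correlation between the two behavioral criteria, which is not reported, so the value `r23=0.80` is purely hypothetical and the resulting statistic is not expected to reproduce the published one.

```python
from math import sqrt


def hotelling_t(r12, r13, r23, n):
    """t statistic (df = n - 3) for H0: rho12 == rho13 in one sample.

    r12, r13: correlations of the shared predictor with each criterion.
    r23: correlation between the two criteria.
    """
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    return (r12 - r13) * sqrt((n - 3) * (1 + r23) / (2 * det))


# r12 and r13 are the reported values; r23 is an assumed placeholder.
t = hotelling_t(0.615, 0.526, r23=0.80, n=244)
print(round(t, 2))
```

The statistic grows as the two criteria become more highly intercorrelated, which is why a modest difference between correlations can still be significant when the criteria overlap heavily.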

The time interval between the assessment of 
intention and behavior. It was hypothesized
that as the time interval between the measurement
of intention and behavior increases, the
BI-B correlation decreases. This hypothesis
was tested by comparing the BI-B correlation
for the 2-year period (Year 1 to Year 3) with
the correlation for the 1-year period (Year 2
to Year 3). In support of the hypothesis, for
both behaviors, the correlation for the 1-year
period was significantly higher than for the
2-year period (oral contraceptive use: .853
vs. .678, z = 5.21, p < .01; birth or attempted
conception: .829 vs. .615, z = 4.95,
p < .01).³

Change in intention. The effect of time in- 
terval on intention—behavior consistency can 
be attributed to processes occurring during the 
time interval (e.g., exposure to new informa- 
tion and attitude change). To demonstrate 
directly the effect of change on the BI-B cor- 
relation it was necessary to identify respon- 
dents who changed intentions between the first 
two waves of the three-wave survey. However, 
as Kiesler (1977) has shown, a respondent 
might change her intention to make it con- 
sistent with her behavior during the prior 
year. For example, if a woman did not intend 
to have a child at the first interview but be- 
came pregnant during the first year of the 
study, she would probably indicate an inten- 
tion to have a child at the second interview. 
Such a case would more accurately be classi- 
fied as a missed prediction rather than as a 
change in intention. Hence, only respondents
not performing the behavior (becoming pregnant,
bearing a child, using oral contraceptives)
during the first year were eligible for
the classification of change in intention.

³ As these correlations are themselves intercorrelated,
it is incorrect to use the traditional Fisher’s z
transformation (Hayes, 1963, pp. 529-532) to test for
the significance of differences between them. Rather,
the correlations were appropriately compared using
a test initially suggested by Pearson and Filon and
reported in Peters and Van Voorhis (1940, p. 185).

Eligible respondents were classified into one 
of three groups based on their responses at the 
first and second interviews to the relevant in- 
tention items. They were classified as in- 
tenders if they indicated it was slightly, mod- 
erately, or very likely that they would per- 
form the behavior. Respondents indicating 
that performance of the behavior was slightly, 
moderately, or very unlikely were classified 
as nonintenders. Women checking the mid- 
point of the scale (neither likely nor unlikely) 
were classified as uncertain. Any eligible re- 
spondent who changed classification from 
Time 1 to Time 2 was identified as having 
changed her intention. 
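
The classification rule can be stated compactly in code. The sketch below assumes the 1 (unlikely) to 7 (likely) scoring described in the Method section, so that 4 is the scale midpoint; the function names are hypothetical.

```python
def classify_intention(rating):
    """Map a 1..7 likely-unlikely rating to a category (4 is the midpoint)."""
    if rating > 4:
        return "intender"
    if rating < 4:
        return "nonintender"
    return "uncertain"


def changed_intention(rating_t1, rating_t2, performed_in_year1):
    """True or False for eligible respondents; None for ineligible ones.

    Respondents who performed the behavior during the first year are
    excluded, because a shift in their stated intention may only
    reflect the behavior that has already occurred.
    """
    if performed_in_year1:
        return None
    return classify_intention(rating_t1) != classify_intention(rating_t2)


print(changed_intention(6, 2, performed_in_year1=False))
print(changed_intention(6, 5, performed_in_year1=False))
```

Note that under this rule a shift from 7 to 5 is not a change, whereas a shift from 5 to 4 is, because only movement between categories counts.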

Based on this procedure, 30 women were 
identified as having changed their childbear- 
ing intentions, and the remaining 214 women 
were classified as nonchangers. For oral con- 
traceptive intentions, 36 women were classi- 
fied as changers and 206 as nonchangers. Each 
BI-B correlation was then computed sep- 
arately for the change and nonchange respon- 
dents. For both childbearing and contracep- 
tive use, the intention—behavior correlation 
was substantially lower for women who 
changed intentions than for nonchangers 
(birth or attempted conception: −.015 vs.
.742, z = 4.65, p < .01; oral contraceptive
use: −.397 vs. .880, z = 9.57, p < .01). Thus,
in support of the hypothesis, change in intention
attenuated the intention-behavior relation
over the 2-year period.
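
Changers and nonchangers are separate groups of respondents, so their correlations can be compared with the ordinary Fisher r-to-z test for independent samples. The text does not say which test produced these z values; the sketch below assumes that one, and the function name is hypothetical.

```python
from math import atanh, sqrt


def fisher_z_independent(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (atanh(r1) - atanh(r2)) / standard_error


# Oral contraceptive use: 206 nonchangers (r = .880), 36 changers (r = -.397).
z = fisher_z_independent(0.880, 206, -0.397, 36)
print(round(z, 2))
```

With the reported group sizes this comes out very close to the published z of 9.57 for oral contraceptive use. The same calculation for childbearing (.742 with n = 214 vs. −.015 with n = 30) gives roughly 4.75 rather than the reported 4.65, so the authors' exact procedure or effective group sizes may have differed slightly.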

Respondent’s educational level. It was hypothesized
that the respondent’s educational
level would be positively related to the degree
of intention-behavior consistency. Hence
intention should interact with education in the
prediction of behavior. To test this hypothesis,
behavior was regressed hierarchically
(Allison, 1977; Cohen, 1978) against intention,
education, and the multiplicative interaction
term of Intention × Education. For
neither behavior did the inclusion of the interaction
term lead to a significant increment in
R²; both Fs < 1. Therefore the hypothesis
that educational level moderates the
intention-behavior relation was not supported.

In summary, there was a reasonably strong
correlation between intentions and behavior
during the subsequent 2 years. Consistent
with the hypotheses, however, this relation
was attenuated by (a) events in the behavioral
sequence not under the volitional control
of the actor, (b) changes in intention, and
(c) the time interval between the assessment
of intention and behavior. The respondent’s
education level did not moderate the
intention-behavior relation.

A demonstrated relation between what people
say they are going to do (intention) and
what they actually do (behavior) is probably
of greater applied than theoretical significance.
Once strong intention-behavior consistency
has been observed, however, the relation
between the attitudinal and normative
components of the model and behavior is of
considerable theoretical interest. The subsequent
analyses examine this relation.


Prediction of Behavior From the Model’s 
Attitudinal and Normative Components 


The regression of behavior on attitudes and
norms. For both contraceptive use and childbearing,
the dichotomous index of behavior
during the 2-year period was regressed on
attitudes and norms, measured during the initial
interview. As indicated in Rows 1 and 3
of Table 1, for each behavior both of the
model’s components received significant regression
weights in the prediction of behavior.
For the prediction of birth the multiple correlation
was .508 (p < .01), and R² adjusted
for shrinkage was .252. For the prediction of
oral contraceptive use, the multiple correlation
was .606 (p < .01) and R² adjusted for
shrinkage was .362.
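
The text does not give the shrinkage formula. The sketch below assumes the common adjustment, 1 − (1 − R²)(N − 1)/(N − k − 1), with k = 2 predictors; under that assumption the reported multiple correlations and sample sizes reproduce the adjusted values in Table 1. The function name is hypothetical.

```python
def adjusted_r2(r2, n, k):
    """R^2 adjusted for shrinkage with n cases and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# (criterion, multiple R, N) as reported; k = 2 predictors in each regression.
reported = [
    ("birth", 0.508, 244),
    ("birth or attempted conception", 0.595, 244),
    ("oral contraceptive use", 0.606, 242),
]
for label, multiple_r, n in reported:
    print(label, round(adjusted_r2(multiple_r**2, n, k=2), 3))
```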

The sequence of prior events. It was hypothesized
that events in the behavioral sequence
not under the volitional control of the
actor would attenuate the prediction of behavior
from the model’s components. The data
presented in Row 2 of Table 1 support this
hypothesis. The multiple correlation with
birth or attempted conception is significantly
higher than the correlation with birth, .595
versus .508, t(241) = 4.33, p < .01.

Table 1
Regression of Fertility and Contraceptive Behavior on Attitudes and Norms

                              Standardized regression
                                   coefficients                      R² adjusted
Behavior                       ΣB_iE_i     ΣNB_iMC_i     R      R²   for shrinkage

Birth (a, b)                     .187         .361      .508   .258      .252
Birth or attempted
  conception (a, c)              .203         .438      .595   .354      .349

                                 Aact          SN
Use of oral
  contraceptives (d, e)          .392         .268      .606   .367      .362

Note. All regression coefficients and multiple correlations are significant (p < .01). Abbreviations are as
follows: ΣB_iE_i = one’s beliefs about the consequences of performing the behavior multiplied by the
evaluation of those consequences; ΣNB_iMC_i = one’s normative beliefs about performing the behavior
multiplied by one’s motivation to comply with those perceived norms; Aact = one’s attitude toward
performing the behavior; SN = one’s subjective norm (generalized normative belief) about performing
the behavior.
a N = 244.
b Coded: birth during the 2-year period = 1; no birth = 0.
c Coded: birth or attempted conception during the 2-year period = 1; no birth or attempted conception = 0.
d N = 242.
e Coded: use of oral contraceptives during the 2-year period = 1; nonuse = 0.

Time interval between the assessment of
the model’s predictive components and behavior.
For both behaviors, the multiple correlation
with the model’s predictive components
was higher for the 1-year period (Year 2 to
Year 3) than for the 2-year period (Year 1 to
Year 3). However, for birth or attempted
conception the difference between the Rs
failed to achieve the traditional significance
level (oral contraceptive use: .704 vs. .606,
z = 2.48, p < .01; birth or attempted conception:
.649 vs. .595, z = 1.39, p < .09).

Attitudinal and normative change. As previously
noted, the effect of time interval on
attitude-behavior consistency is attributable,
in part, to attitude and belief change occurring
during the interval. To demonstrate the
effect of change on the correlation between
behavior and the model’s components, it was
necessary to identify respondents whose attitudes
or norms changed from Interview 1 to
Interview 2. The procedure used here is similar
to the one described above for assessing
change in BI. First, to differentiate norm and
attitude change from post hoc justification of
prior behavior, only respondents not performing
the behavior (becoming pregnant, bearing
a child, using oral contraceptives) during the
first year were eligible for the classification
of change in attitudes or norms. Next, separately
for each behavior and each of the
model’s two predictive components, eligible
respondents who changed from Time 1 to
Time 2 were identified as changers, and the
remaining respondents were classified as
nonchangers. The correlation between the relevant
component and behavior was then computed
separately in each group. Consistent
with the hypothesis, lower correlations were
expected in the change than in the nonchange
group.

Attitude toward the act of using oral con- 
traceptives (Aact) was assessed by summing 
the responses to three evaluative scales, each 
scored +3 to −3. A score of 0 was categorized
as a neutral attitude, a score greater than 0 
was a positive attitude, and less than 0 con- 
stituted a negative attitude. Sixty-three elig- 
ible respondents changed attitude categories 
between Interview 1 and Interview 2. As pre- 
dicted, the Aact-B correlation was substan- 
tially higher in the nonchange than in the 
change group (.733 vs. −.238; z = 7.88, p < .01).

Responses to the general subjective norm
concerning the use of oral contraceptives
(SN) were also divided into three categories:
uncertain (a score of 4, the midpoint of the
scale), positive norm (a score greater than 4),
and negative norm (a score less than 4).
Seventy-four eligible respondents changed
categories from Interview 1 to Interview 2. In
support of the hypothesis, the SN-B correlation
was significantly higher in the nonchange
than in the change group (.682 vs. −.039;
z = 6.14, p < .01).

For childbearing behavior, the midpoints of
the scales for the ΣB_iE_i and ΣNB_iMC_i components
had to be empirically identified. Using
data from the first interview, the mean scores
on ΣB_iE_i and ΣNB_iMC_i were determined
separately for those intending and not intending
to have a child. The score midway between
the means was designated the neutral
point on the scale. In the interval between
Interview 1 and Interview 2, 34 eligible respondents
moved across the neutral point of
the ΣB_iE_i measure, and 25 eligible respondents
moved across the neutral point of the
ΣNB_iMC_i measure. Consistent with the prediction,
the correlation of both ΣB_iE_i and ΣNB_iMC_i
with birth or attempted conception was higher
in the nonchange than in the change group
(ΣB_iE_i: .580 vs. .011, z = 3.38, p < .01;
ΣNB_iMC_i: .648 vs. −.061, z = 3.7[?]).
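
The empirical neutral point and the crossing rule can be sketched as follows. The function names and the four scores are hypothetical, and the treatment of a score that falls exactly on the neutral point (not counted as a crossing) is an assumption, since the text does not address that case.

```python
def neutral_point(scores, intends):
    """Midpoint between the mean scores of intenders and nonintenders."""
    yes = [s for s, flag in zip(scores, intends) if flag]
    no = [s for s, flag in zip(scores, intends) if not flag]
    return (sum(yes) / len(yes) + sum(no) / len(no)) / 2


def crossed(score_t1, score_t2, neutral):
    """True if the two scores fall on opposite sides of the neutral point."""
    return (score_t1 - neutral) * (score_t2 - neutral) < 0


# Hypothetical first-interview scores and intention flags.
scores = [10, 20, -4, -6]
intends = [True, True, False, False]
neutral = neutral_point(scores, intends)  # means 15 and -5, midpoint 5
print(neutral, crossed(8, 2, neutral), crossed(8, 6, neutral))
```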

Respondent’s educational level. It was hypothesized
that the respondent’s educational
level would interact with attitudes and norms
in the prediction of behavior. To test this hypothesis,
behavior was regressed hierarchically
against the model’s attitudinal and normative
components, education, and the multiplicative
interactions of Education × Each Component.
For neither behavior did the inclusion of the
interaction terms lead to a significant increment
in R²; both Fs < 1. Hence, the hypothesis
was not supported.
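
The logic of a hierarchical test for an interaction can be sketched as follows: fit the regression with the main effects only, refit it with the product term added, and test the increment in R² with an F ratio. The code is a minimal pure-Python illustration on invented data with a single component, not a reanalysis; all names and values are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of a least squares fit of y on the columns plus an intercept."""
    rows = [[1.0] + list(values) for values in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def f_increment(r2_reduced, r2_full, n, k_full, k_added):
    """F for the R^2 gain; df = (k_added, n - k_full - 1)."""
    return ((r2_full - r2_reduced) / k_added) / ((1 - r2_full) / (n - k_full - 1))


# Invented data: one attitudinal score, education (1..8), and 0/1 behavior.
attitude = [7, 6, 2, 1, 5, 3, 6, 2, 4, 7]
education = [3, 5, 4, 6, 2, 7, 8, 1, 5, 4]
behavior = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1]
interaction = [a * e for a, e in zip(attitude, education)]

r2_main = r_squared([attitude, education], behavior)
r2_full = r_squared([attitude, education, interaction], behavior)
f = f_increment(r2_main, r2_full, n=len(behavior), k_full=3, k_added=1)
print(round(r2_main, 3), round(r2_full, 3), round(f, 2))
```

An F below 1, as reported for both behaviors, means the product terms added less explained variance per degree of freedom than the residual variance, so there is no evidence of moderation by education.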

In summary, there was a reasonably strong
correlation between the model’s attitudinal
and normative components and behavior during
the subsequent 2 years. Similar to the results
obtained with behavioral intention, the
prediction of behavior from attitudes and
norms was attenuated by (a) events in the
behavioral sequence not under the volitional
control of the actor, (b) the time interval between
the measurement of the model’s predictive
components and behavior (for childbearing
there was only a trend in the hypothesized
direction), and (c) changes in attitudes and
norms. The respondent’s educational level did
not moderate the strength of the relation.

Table 2
Correlations Between Selected Attitudinal
Variables and Behavior

Attitudinal variable                                    r

Use of birth control pills during the
2-year period (a)
  Attitude toward birth control                       .083
  Attitude toward birth control pills                 .323*
  Attitude toward using birth control pills           .525*
  Attitude toward using birth control pills
    during the next 2 years                           .572*

Birth or attempted conception during the
2-year period (b)
  Attitude toward children                           −.007
  Attitude toward having children                     .187*
  Attitude toward having a child in the
    next 2 years                                      .535*

a N = 244.
b N = 242.
* p < .01.


Correspondence Between Attitudinal and 
Behavioral Variables 


It was hypothesized that the degree of cor-
respondence between the elements of the at-
titudinal and behavioral variables would be
positively related to attitude-behavior con-
sistency. The present analysis investigated the
effect of three elements on the attitude-be-
havior correlation: action, the target the
action is directed toward, and the time at
which the action occurs. As indicated in Table
2, four attitudinal variables, measured during
the first interview, were correlated with the
dichotomous behavioral measure of birth con-
trol pill use during the 2-year period. The at-
titudinal measures systematically varied with
regard to the number of elements they had in
correspondence with the behavioral measure
as follows: zero elements in correspondence
(attitude toward birth control, r = .083, ns);
one element in correspondence, target (atti-
tude toward birth control pills, r = .323, p <
.01); two elements in correspondence, target
and action (attitude toward using birth con-
trol pills, r = .525, p < .01); three elements
in correspondence, target, action, and time
(attitude toward using birth control pills dur-
ing the next 2 years, r = .572, p < .01). In
support of the hypothesis, for each successive
increase in correspondence between attitude
and behavior variables there was a significant
increase (ranging from p < .01 to p < .06) in
the magnitude of the attitude-behavior corre-
lation.
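
One rough way to read the four correlations with pill use is to square them, which gives the proportion of variance each attitude measure shares with the behavior. The sketch below uses only the coefficients reported in the text; the dictionary layout and the helper name `shared_variance` are illustrative assumptions and were not part of the original analysis.

```python
# Correlations with pill use reported in the text, keyed by the number of
# elements (target, action, time) in correspondence with the behavior.
correlations = {
    0: 0.083,  # attitude toward birth control
    1: 0.323,  # attitude toward birth control pills
    2: 0.525,  # attitude toward using birth control pills
    3: 0.572,  # attitude toward using the pills during the next 2 years
}


def shared_variance(r):
    """Return r squared, the proportion of variance two measures share."""
    return r * r


for n_elements, r in correlations.items():
    print(f"{n_elements} element(s): r = {r:.3f}, r^2 = {shared_variance(r):.3f}")
```

Read this way, shared variance rises from under 1% with no elements in correspondence to roughly a third with all three, although squaring a correlation with a dichotomous criterion is only an approximate guide.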

Following a similar pattern, the dichoto- 
mous behavioral measure of birth or at- 
tempted conception during the 2-year period 
correlated -.007 (ns) with attitude toward
children, .187 (p < .01) with attitude toward 
having children, and .535 (p < .01) with atti- 
tude toward having a child in the next 2 years. 
For each successive increase in correspondence 
there was a significant increase (all ps < .01) 
in the attitude—behavior correlation. 


Discussion 


The primary purpose of the present re- 
search was to investigate five factors hypoth- 
esized to influence the extent to which people 
act in accord with their stated attitudes. The
investigation of the first of these factors, the
sequence of prior events, directed attention
to the behavioral variable. The majority of
previous attitude-behavior research has 
focused exclusively on problems associated 
with attitudinal variables during attempts to 
account for low attitude-behavior correla- 
tions. However, the performance of a be- 
havior is often dependent upon the successful 
completion of a series of behavioral events. 
As the present research has demonstrated, if 
all of these steps are not under the volitional
control of the actor, intention will provide a
better estimate of the probability that the
actor will initiate the sequence than of the
likelihood that the actor will complete the
sequence and perform the behavior in ques-
tion. These results suggest that future studies
of attitude-behavior consistency should seek
to identify and quantify the behavioral se-
quences that must be completed prior to the
performance of the behavior of interest.
Analyses can then isolate the sequences at
which serious disruptions occur between atti-
tudes and actions. From an applied perspec-
tive, such knowledge would be of considerable
use in the selection of behavior-change strat- 
egies. This approach would lead to the iden- 
tification of situations in which the most ap- 
propriate strategy is not to change attitudes 
but rather to institute policy changes making 
it easier for people to act in accord with their 
existing attitudes. 

The second and third factors, attitude 
change and the time interval between the mea-
surement of attitude and behavior, focused on
instability in the attitude and belief variables.
As the time interval decreased from 2 years to 
1 year, there was a 58.3% increase in the 
amount of variance in oral contraceptive use 
accounted for by BI. The corresponding in-
crease for childbearing was 81.7%. Substan-
tial improvements in the prediction of be-
havior were also achieved when analyses were
limited to the sample of respondents who did 
not exhibit attitude change. The magnitude 
of these effects directs attention to the prob- 
lem of using static models in the analysis of 
a dynamic process. When stating attitudes and 
intentions about future behaviors, a respon- 
dent is probably assuming either that the en- 
vironment will not change or that it will 
change in an anticipated manner. Unexpected 
changes (e.g., loss of employment, exposure to 
new information) can lead to a change in atti- 
tude and, if such changes are not monitored, 
to an apparent lack of consistency between 
attitudes and behavior. Future research
should attempt to specify the personal and 
situational variables influencing attitudinal 
stability. From the studies reviewed above, 
some of the factors that might initially be 
hypothesized to contribute to stability include 
the extent to which the attitude was formed 
through direct behavioral experience with the 
attitude object (Fazio & Zanna, 1978; Regan 
& Fazio, 1977) and the individual difference 
variables of self-monitoring (Snyder & Tanke, 
1976) and tendency to ascribe responsibility 
to the self (Schwartz, 1973). 

The fourth factor hypothesized to moderate 
the attitude-behavior relation was the re-
spondent’s educational level. It was hypoth-
esized that as completed years of schooling
increased, so would the magnitude of the
attitude-behavior correlation. However, an
interaction between education and attitude-
behavior consistency was not observed. Hence,
these results lend no support to the contention 
that consistency is especially prevalent among 
the highly educated. It appears unlikely that 
nonsupport for this hypothesis is attributable 
to lack of variance on the education variable; 
years of education ranged from 8 to 16. In 
addition, attitudinal models similar to the 
present one have provided good prediction of 
behavioral intentions and behavior in cross- 
cultural samples with considerably lower ed- 
ucational levels than the present sample 
(Davidson, Jaccard, Triandis, Morales, & 
Diaz-Guerrero, 1976; Davidson & Thomson, 
1980). It should be noted that the present 
finding provides indirect support for the theo- 
retical framework of Fishbein. As noted 
above, if the framework is correct, social, 
demographic, and personality characteristics 
of the respondent should not have a direct 
effect on the magnitude of the relation be- 
tween the components of the model and in- 
tentions, and hence behavior. 

The fifth factor hypothesized to moderate 
attitude-behavior consistency is the degree of 
correspondence between the elements of the 
attitudinal and behavioral variables. As the 
attitudinal measures moved along a con- 
tinuum of correspondence in terms of the 
number of elements they had in agreement 
with the behavioral measure (from zero to 
three elements), their correlations with be- 
havior significantly increased. The present 
results concerning the impact of action and 
target correspondence are consistent with the 
conclusions of Ajzen and Fishbein (1977). In 
addition, the findings highlight the impact of a 
third element, not previously investigated, the 
time during which the action occurs. To ob- 
tain reasonable predictions of behavior from 
attitudinal variables, it appears important to 
ensure correspondence in target, action, and 
time elements.

The findings from the present study also 
provide support for the predictive validity of
the components of the model of Fishbein.
Even prior to the adjustments made for atti-
tude change and the sequence of events, both
intention and the attitudinal and normative
components measured at the first interview
provided reasonable correlations with behavior
during the subsequent 2 years (the lowest
validity coefficient was .508).

A weakness in the present research is that
the data on oral contraceptive use were ob-
tained solely through self-reports. Hence it is
not possible to determine if the respondents
misrepresented their behavior in order to
make it consistent with their previously stated
attitudes and intentions. It appears doubtful
that there was extensive misrepresentation,
because the findings concerning oral contra-
ceptive use were very similar to those for
childbearing in terms of both the percentage
of subjects correctly classified and the vari-
ables found to moderate the degree of atti-
tude-behavior consistency. There was also
reliance on self-report data concerning the
attempt to become pregnant. Respondents
might have falsely indicated that they had
been unsuccessfully trying for at least §
months to become pregnant in an attempt to
appear consistent with previously stated in-
tentions. However, this rival hypothesis re-
quires that the respondents admit something
that is socially undesirable about themselves
for the purpose of appearing consistent.

In conclusion, this research indicates that
the contraceptive and fertility behavior of
married women is quite predictable from be-
havioral intentions, attitudes, and normative
beliefs. However, the degree of the relation is
attenuated by events in the behavioral se-
quence not under the volitional control of the
actor, the time interval between the measure-
ment of behavior and the model’s predictive
components, changes in attitudes and beliefs,
and the degree of correspondence between the
attitudinal and behavioral variables.


Mind Over Matter:
Perceived Success at Psychokinesis


Victor A. Benassi, Paul D. Sweeney, and Gregg E. Drevno 
California State University, Long Beach 


Four experiments showed that college students’ estimates of success on a psy-
chokinetic (PK) task were affected independent of actual performance. In Ex-
periment 1, subjects given a positive introductory set or no set about PK
evidenced more illusory control than subjects given a negative set. In Experi-
ment 2, both degree of general belief in psychic phenomena and the number
of practice trials that subjects received influenced performance estimates, with
high believers who received 10 practice trials providing the highest estimates
and low believers who received 1 practice trial the lowest. In Experiment 3,
subjects actively involved with the PK task judged their performance more
positively than passively involved subjects. Experiment 4 showed that when
they were actively involved in the task, subjects with an internal locus of
control gave higher estimates of their success than subjects with an external
locus of control. When passively involved, internals and externals did not reli-
ably differ in their estimates, but their estimates were lower than those of
active/internals. These results support Langer’s illusion-of-control theory and
also highlight the importance of general psychic belief and locus-of-control
orientation in affecting perceived success at a psychic task.


How do people come to believe in the exis-
tence of psychic phenomena? Although re-
search on the determinants of psychic belief is
sparse, we can presume that such beliefs are
determined by a number of factors, including
media inputs, impactful personal experiences,
inherent biases in human reasoning, and so
forth (Sweeney, Benassi, & Drevno, Note 1).
The present studies focused on several factors
affecting estimates of personal performance on
a psychokinetic (PK) task. Our intention was
to provide a partial answer to the question of
how people come to believe in their personal
efficacy. We suggest that the same variables
affecting perceived control in general also in-
fluence judgments of perceived psychokinetic
control.

Using Langer’s (1975) illusion-of-control 
model, Ayeroff and Abelson (1976) reasoned 
that belief in success on a mental telepathy 
task could be influenced by the introduction 
of skill-related but irrelevant variables into 
the situation. They found that subjects per- 
mitted to choose the symbols to be trans- 
mitted in a standard mental telepathy task 
and/or to engage in a warm-up session 
thought they did much better on the task than 
subjects who were not provided with these 
experiences. Objectively, the subjects per- 
formed at chance levels. 

Layton and Turnbull (1975) found that 
attitude toward and evaluation of ESP were 
systematically related to performance on an 
ESP task. Two experiments were conducted, 
and an ESP effect was found in the first study, 
but not in the second. In both studies the 
authors found that subjects with a positive 
attitude toward ESP and a positive evaluation 
of it thought they did better on the ESP task
than their negative counterparts. In an early
study, James and Rotter (1958) found that 
verbalized expectancies for success on an ESP 
task during extinction depended on whether 
subjects were told that ESP was a skill or 
chance task and whether they were given 
100% or 50% reinforcement (positive feed- 
back) during training. 

In Experiment 1 we predicted that esti- 
mates of PK performance would vary with the 
instructional set. We hypothesized that sub- 
jects given a positive introductory statement 
about the existence of PK would rate their 
performance more positively than subjects 
given a negative set (cf. Layton & Turnbull, 
1975). Additionally, we predicted that sub- 
jects given no specific set about PK would 
give higher estimates of success than those 
given a negative set, but subjects given no 
specific set would not differ in perceived suc- 
cess from positive-set subjects. This latter 
prediction derives from the finding that col- 
lege students as a group are generally positive 
in their attitudes toward psychic phenomena 
(e.g., Jones, Russell, & Nickel, 1977; Polzella,
Popp, & Hinsman, 1975). We hypothesized 
that our positive set merely provides support 
for what subjects already tend to believe. 


Experiment 1: Instructional Set 
Method 


Subjects. Thirteen male and 24 female introduc- 
tory psychology students met a course requirement 
at California State University, Long Beach, by par- 
ticipating in the study. 

Apparatus. A device called a die funnel¹ was
constructed in such a way that subjects could not
see the outcome of their die tosses. A subject sat at
one end of the funnel and tossed the die into the 
chute. The experimenter then opened a box at the 
opposite end, recorded the outcome, and returned 
the die to the subject. 

Procedure. Individual subjects reported to the 
experimental room and were told that they were to 
be part of a study dealing with psychokinesis. Sub- 
jects were then randomly assigned to one of the 
following instructional set conditions: 


1. (Pro set). Research studies have indicated that
people have varying degrees of psychokinetic abil-
ities, for example, they are able to bend spoons
and repair watches without physically touching
them. The following study will attempt to verify
and replicate these findings. Specifically, this study
will concern itself with measuring the effect that
psychokinetic abilities have on the throw of
dice.


2. (Con set). Much research has been done con-
cerning people’s psychokinetic abilities, that is,
the alleged ability to move objects or alter their
shapes without physically touching them, for ex-
ample, bending spoons and repairing watches. This
research demonstrates little or no evidence that
people actually possess such abilities. Experi-
ments by both United States and Soviet
that have tested these purported abilities have to
date failed to establish evidence for PK. Our ex-
periment is another in a long series investigating
PK abilities.

3. (Neutral set). This will be an experiment in-
vestigating psychokinesis, that is, the ability to
move objects or alter their shapes without physi-
cally touching them, for example, bending spoons
or repairing watches.


After completing one of the above sets, Experi-
menter A left the room as Experimenter B entered.
Blind to the set provided by Experimenter A, Ex-
perimenter B described the procedure for conducting
the PK test as follows:


We have constructed this apparatus called a die
funnel. It acts as a funnel into which you will
throw this die. The die will be put into th
and then thrown into the funnel. It will
down to the end, whereupon I will record the
outcome of the toss. As you will notice, the die
has three sides painted green and three sides
painted red [subject is shown the die]. On each
of 20 trials we will run, you will choose a target color
and then I will ask you to concentrate your full
mental energies for a period of 10 seconds on af-
fecting the outcome of the toss. For example, you
might choose the target color red; next you will
concentrate for 10 seconds; then, when I say “Be-
gin,” you will immediately toss the die into the
funnel. I will then record the outcome of the
toss on the sheet I am holding. Further, I would
like you to indicate on this sheet [subject is shown
data sheet] whether you are confident or not con-
fident that the color you chose actually came up.


Upon hearing the above instructions, 
completed 20 trials, after which he or she 
a short questionnaire. In introducing the question-
naire, Experimenter B briefly described
formance (i.e., about 10 hits) on the task the 
had just completed. 


periments, subjects were debriefed thro 
feedback after all subjects had participated. 


¹ Photographs of this apparatus may be obtained
from the first author.


Table 1
Estimated Success as a Function of Instructional Set

                     Number of                    Estimates
                     confident                    on 20
                     trials        Self-          more
Group                (0-20)        influence      trials        Perceived

Pro (n = 13)
  M                  11.85***      5.00           12.20**       11.90*
  SD                  2.79         2.90            2.19          2.39
Neutral (n = 10)
  M                  12.60***      4.40           12.00**       11.50*
  SD                  1.27         1.43            2.40          1.51
Con (n = 14)
  M                   7.57         3.36            9.80          9.90
  SD                  3.18         2.47            1.89          1.83

Note. For each dependent variable, a Fisher’s least significant difference test was performed following the
analysis of variance. None of the differences between the means of the pro- and neutral-set groups exceeded
the critical values required at a p value of .05. However, for each measure, the p values adjacent to the
means of the pro- and neutral-set groups indicate the level at which they are significantly different from the
means of the con-set group.
* p < .05.
** p < .025.
*** p < .001.


Results 


We found no evidence of sex differences on 
any of the measures taken in this study or in 
any of the subsequent three studies. There- 
fore all data have been collapsed across sex. 

The mean success rate did not differ across 
groups (F < 1): pro = 11.6, neutral = 10.7, 
and con = 11.0. Binomial tests showed that 
no group performed significantly above or be- 
low chance levels. We also performed bi- 
nomial tests to determine whether people 
deviated from chance performance on those 
trials on which they were confident of being 
correct, and we found the tests for each group 
to be clearly nonsignificant.²

Table 1 shows that there were significant
differences on the confidence measure between
the treatment groups, F(2, 34) = 13.31, p <
.001, ω² = .280. While the means for the pro-
and neutral-set subjects did not systematically
differ, both were significantly different from
the mean for the con-set participants (p <
.001 with Fisher’s least significant difference
[LSD] test).

Table 1 also shows the mean scores on the
questions asked of subjects upon completion
of the PK task. In response to the Likert-for-
mat question “How much do you feel that
your concentration influenced what colors 
came up? That is, how much PK influence do 
you feel you have exerted?” we found non- 
significant mean differences among groups, 
F(2, 34) = 2.31, p > .05. The results of this 
measure were in the same directional pattern 
as those on the other dependent measures. To
the question “If you were given another 20 
trials, on how many do you think you would 
influence what colors would come up?” we 
found a significant effect, F(2, 34) = 5.04, 
p < .05, ω² = .175. We also found a signif-
icant difference on the question “On how
many of the 20 trials do you think you cor-
rectly predicted what colors came up?”:
F(2, 34) = 4.06, p < .02, ω² = .139. Post hoc
comparisons done on all of the significant
analyses revealed that although the pro- and
neutral-group means did not differ from each
other, they were both significantly higher
than the means for the con group (ps < .05
with Fisher’s LSD test).


² Combining the PK data from the three experi-
mental groups resulted in a mean hit rate of 55.5%,
which is nearly three binomial standard deviations
from the expected mean of 50%. There are, of course,
any number of factors that might have been re-
sponsible for this effect. The combined hit rate
cannot account, however, for the significant estimated
success differences among the experimental groups
on the dependent measures. We may recall that the
hit rates for groups did not separately deviate from
chance. The combined group PK effect, while per-
haps interesting, does not discount the result that
belief measures were affected by the instructional
set and not by actual performance.
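
The footnote's statement that the combined hit rate of 55.5% lies nearly three binomial standard deviations from 50% can be checked with a short calculation. The sketch below assumes that the 37 subjects each contributed 20 independent tosses (740 trials in all), as the Method section implies; the function name `binomial_z` is an illustrative choice and not something taken from the article.

```python
import math


def binomial_z(hit_rate, n_trials, p_chance=0.5):
    """Distance of an observed hit rate from chance, in binomial SD units."""
    # Standard deviation of a proportion under the chance hypothesis.
    sd = math.sqrt(p_chance * (1 - p_chance) / n_trials)
    return (hit_rate - p_chance) / sd


# Assumption: 37 subjects x 20 trials each = 740 tosses in total.
n_trials = 37 * 20
z = binomial_z(0.555, n_trials)
print(f"z = {z:.2f}")
```

Under these assumptions the result is about 2.99, in line with the footnote's "nearly three".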


Discussion 


The results support the notion that a varia- 
tion in set can lead subjects to make differing 
estimates of performance on a PK task, inde- 
pendent of actual performance (cf. Layton & 
Turnbull, 1975). Subjects in the pro- and 
neutral-set conditions judged their perform- 
ance to be more successful than subjects in 
the con-set condition. However, estimates of 
the pro and neutral groups did not reliably 
differ. Whether our manipulations actually in-
fluenced belief or only influenced verbal re-
port remains an open question. A strong be-
liever in psychic abilities, for example, may
have been reluctant to tell the experimenter 
that she or he performed well when the experi- 
menter had just stated that there is no evi- 
dence to support the existence of PK. In other 
words, the results could be an artifact of de- 
mand characteristics of the experimental situ- 
ation. The results of the subsequent three ex- 
periments cannot easily be attributed to de- 
mand characteristics. 

Subjects threw the die and chose the target 
color before each trial; the ratings of the neu- 
tral- and pro-set groups might have been lower 
if the experimenter had performed these func- 
tions. Additionally, the negative set seems to 
have overridden the effects of the subjects’ 
active involvement with the task. Had these 
subjects been passively involved, we might 
expect their ratings to have been even lower 
than they were. Because involvement was not 
manipulated in this study, we cannot say to 
what extent active involvement influenced the 
judgments of our subjects separate from the 
effects of the instructional set. This orthogonal 
variation of set and involvement might clarify
the contributions of these variables in deter-
mining perceived success.


Experiment 2: Practice and Belief


Based on the work of Langer (1975) we 
predicted that subjects given more prior ex- 
posure to a PK task would judge their per- 
formance during an actual test series more 
positively than subjects provided with less 
prior exposure. To the extent that skill-related
variables are introduced into a chance situa- 
tion, Langer found, people will judge their 
performance to be skillful. For example, prac- 
tice in flipping a coin does not enhance sub- 
sequent performance, but the subject may 
mistakenly form the impression that it does. 
We also predicted that subjects with relatively 
strong general beliefs in psychic phenomena 
would give higher ratings of their performance 
on the PK task than more skeptical subjects 
would. Jones et al. (1977) found that subjects
who scored higher on a paranormal belief scale 
gave higher estimates of their performance on 
a PK-type task than subjects who scored 
lower on the belief scale. 


Method 


Subjects. Fifty-five students, 30 females and 25
males, at California State University, Long Beach,
fulfilled an introductory psychology course require-
ment by participating.

Apparatus and materials. An apparatus called the
die tumbler (see Footnote 1) allowed for electromit
release of a die. The die, which sat on a platform,
was released 10 seconds after a timer switch had been
activated by the experimenter. During the experi-
ment subjects sat at one end of the tumbler, from
which they could see two lights: a red light that
stayed on for the duration of the 10-second con-
centration period and a white light that signified
the end of this period and the release of the die.
The subjects could not see the outcome of the
tosses.

A portion of the Belief in the Paranormal Scale
(BPS) was used to assess subjects’ general belief
(see Table 1 of Jones et al., 1977, for the complete
scale). The nine questions used were Numbers 4,
11, 15, 16, 19, 21, 23, 24, and 25.

Procedure. Same-sex subjects reported to the lab-
oratory in groups of three or four. Subjects were
given a definition of PK similar to the one in Ex-
periment 1. They were then asked to inspect the
die tumbler. The procedure was demonstrated, and
subjects were asked to sit at predetermined spots
located about 3 feet (1 m) from the tumbler. They
sat at a desk separated by partitions, which pre-
vented them from seeing each other’s responses but
allowed them to see the die tumbler. Subjects were
next told that when the red light on the die tumbler
came on, they were to “concentrate your full mental
energies on making the die land face up.” The die
used in Experiment 1 was then presented for all
subjects to inspect.

According to procedure, the experimenter called 
out a target color and then started the 10-second 
concentration period. After the die was released, 
subjects completed the same confident/not confident 
checklist described in Experiment 1. 

Subjects were assigned to a condition providing
1 or 10 practice trials in which they were given
experience with the procedure to be used in the
test series. After the practice trial(s), Experimenter
A left the room, and Experimenter B entered and
began the series of 20 trials. The sequence of target
colors during practice and testing sessions was con-
stant across conditions. Experimenter B was blind
to the practice-trial condition of the subjects. Upon
completion of the 20 trials, subjects filled out two
Likert-format questionnaires. The first contained
one question that asked, “How much do you feel
that your concentration influenced what colors came
up on the 20 trials?” The second questionnaire was
the BPS.


Results 


We found no evidence of PK influence. The 
mean hit rate was 9.2 for the group that per- 
formed 1 practice trial and 10.4 for the group 
that performed 10 practice trials (t < 1, ns).
Also, as determined by binomial tests, neither 
group performed above or below chance. As in 
Experiment 1, subjects did no better or worse 
on the trials on which they were confident. 

Within both trial groups a median split was
done on the BPS questionnaire results. Sub-
jects in the 1- and 10-trial conditions who
scored above the median values of 32 and 31,
respectively, were placed in high-belief groups
and those below the median in low-belief
groups. The above procedure yielded a 2 × 2
analysis of variance (ANOVA) design with
practice trials and belief in the paranormal as
independent variables.
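
The median split described above can be sketched in a few lines. The scores below are hypothetical and are not the study's data, the helper name `median_split` is illustrative, and the handling of scores that fall exactly on the median is an assumption, since the article does not say how ties were treated.

```python
from statistics import median


def median_split(scores):
    """Label each score 'high' if above the group median, else 'low'.

    Scores equal to the median are labeled 'low' here; the article does
    not say how ties were handled, so this is an assumption.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical BPS totals for the two practice conditions (not real data).
bps_scores = {
    "1 practice trial": [25, 28, 32, 35, 38, 40],
    "10 practice trials": [22, 27, 30, 32, 36, 41],
}

# Splitting within each condition yields the four cells of the 2 x 2 design.
for condition, scores in bps_scores.items():
    print(condition, "median =", median(scores), median_split(scores))
```

Splitting separately within each practice condition, as the authors did, keeps the high- and low-belief cells roughly equal in size even when the two conditions have different medians.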

Table 2
Estimated Success as a Function of Practice
Trials and Prior Belief

                        Number of      Self-
                        confident      influence
                        trials         estimates
Group                   (0-20)         (0-9)

1 practice trial
  Low belief
    M                    6.83          1.88
    SD                   3.74          ZAS:
  High belief
    M                   10.14          4.57
    SD                   3.01          2.21
10 practice trials
  Low belief
    M                   10.87          4.13
    SD                   3.62          2.53
  High belief
    M                   13.00          4.93
    SD                   2.04          2.34


Table 2 (column 1) shows that the number
of trials on which subjects were confident of
success varied with both the number of prac-
tice trials, F(1, 51) = 16.24, p < .001, ω² =
.194, and with degree of general psychic be-
lief, F(1, 51) = 10.1, p < .005, ω² = .116.
The interaction was nonsignificant (F < 1).

Table 2 (column 2) shows ratings on the
posttask question, “How much do you feel
that your concentration influenced what
colors came up on the 20 trials?” We found
main effects for the practice trial variable,
F(1, 51) = 4.33, p < .05, ω² = .05, and for
the general belief variable, F(1, 51) = 7.72,
p < .01, ω² = .101. The interaction was non-
significant, F(1, 51) = 2.29, p > .05.
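
The ω² values reported for the self-influence ratings can be approximately recovered from the F ratios alone. The sketch below assumes a balanced design in which the effect and error sums of squares add up to the total, so that ω² = df_effect(F − 1) / (Σ df·F + df_error + 1); the function name `omega_squared` and the dictionary keys are illustrative assumptions, not the authors' procedure.

```python
def omega_squared(f_values, dfs, df_error, effect):
    """Estimate omega squared for one effect from reported F ratios.

    Uses df_effect * (F_effect - 1) / (sum(df_i * F_i) + df_error + 1),
    which follows from the usual omega-squared formula when every sum of
    squares is expressed in units of the error mean square. It assumes the
    effects' sums of squares add up to the total (a balanced design).
    """
    numerator = dfs[effect] * (f_values[effect] - 1)
    denominator = sum(dfs[k] * f_values[k] for k in f_values) + df_error + 1
    return numerator / denominator


# F ratios reported for the self-influence question, each on 1 and 51 df.
f_values = {"practice": 4.33, "belief": 7.72, "interaction": 2.29}
dfs = {"practice": 1, "belief": 1, "interaction": 1}

for effect in ("practice", "belief"):
    estimate = omega_squared(f_values, dfs, df_error=51, effect=effect)
    print(f"{effect}: omega^2 = {estimate:.3f}")
```

With these inputs the estimates come to about .050 for practice and .101 for belief, matching the reported values; cell sizes in the study were not exactly equal, so agreement this close need not hold for every analysis.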


Discussion 


This experiment demonstrates the effects of 
practice and prior belief on judgments of suc- 
cess on a psychic task, without any apparent 
accompanying psychic ability. These results 
replicate Ayeroff and Abelson (1976) and 
support Langer’s (1975) illusion-of-control
model.


Experiment 3: Involvement 


We sought further confirmation of the hy- 
pothesis that people given a skill orientation 
would judge their performance on a PK task 
more positively than those not given this 
orientation. Subjects run in pairs were either 
actively or passively involved in the PK task. 
We predicted that active subjects would per- 
ceive more control over the task than passive 
subjects. We also attempted to replicate the
effect of general paranormal belief on esti-
mated PK performance.


Method


Subjects. Thirty female and 30 male introductory 
psychology students at California State University, 
Long Beach, fulfilled a course requirement by par- 
ticipating. 

Apparatus. The apparatus was the same as in 
Experiment 1. 

Procedure. Subjects came to the experiment room 
in same-sex pairs, were given a definition of PK 
similar to the one given in the previous two studies, 
and were told that the present investigators were 
beginning a research program on PK. Subjects were 
then asked to inspect the die funnel, after which 
they were told that the experiment called for both 
members of each pair to explore at the same time 
the possible effects that two persons have on the 
throw of a die. They were also told that although 
they both were to try to affect the outcome of each 
die toss, only one of them would actually toss the 
die. Experimenter A randomly chose one of the
subjects to throw the die and then asked both to
sit at predetermined spots at one end of the die
funnel. Subjects were separated by a partition so 
that they could not see each other’s responses on 
the confidence checklist. 

The painted die used in Experiments 1 and 2 was 
again produced for the subjects to inspect. During 
this process, subjects were read instructions similar 
to those given in Experiments 1 and 2, asking them 
to center and concentrate their full mental energies 
for a period of 10 seconds on affecting the color 
called. Also, Experimenter A instructed subjects on 
how to complete the confident/not confident check- 
lists. The sequence of target colors was constant 
across conditions. 

On each of 14 trials, both subjects concentrated on 
making the target color land face up, one subject 
then threw the die, and finally both responded to 
the confident/not confident checklist. Following the 
last trial, Experimenter A left the room after in- 
dicating that a coexperimenter would arrive shortly. 
Experimenter B entered, blind to the condition ex- 
perienced by the subjects, and administered two ques- 
tionnaires. Questionnaire 1 consisted of two ques- 
tions constructed on a Likert-format scale. The first 
asked, “How much do you feel that your concen- 
tration influenced what colors came up on the 14 
trials? That is, how much psychokinetic control do 
you feel you had?” The second question asked, “How 
much do you feel the other person’s concentration 
influenced what colors came up on the 14 trials? 
That is, how much psychokinetic control do you 
feel the other person had?” The order of presenta-
tion of the questions was reversed for different sub-
jects. Questionnaire 2 asked subjects to judge on a
Likert-format scale the extent to which they be- 
lieved in psychic phenomena: “Do you believe that 
Sid possess psychic abilities, such as ESP and 



V. BENASSI, P. SWEENEY, AND G. DREVNO 


Results 


The mean success rate for subjects was 7.1. 
The results of all analyses attempting to show 
PK effects may be summarized succinctly—no 
PK effects were found. 

As in Experiment 2, we generated high- and 
low-psychic-belief groups within the active 
and passive conditions. The median scores on 
the general belief question for the active and 
passive conditions were 6 and 7, respectively. 
Within each condition, subjects scoring above 
the median were placed in a high-believer 
group and those scoring below the median 
were placed in a low-believer group. 
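
The grouping rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the ratings are hypothetical, and dropping subjects who score exactly at the median is an assumption, since the text does not say how such cases were handled.

```python
from statistics import median

def median_split(scores):
    """Split subject ids into low/high groups around the median.

    Subjects scoring exactly at the median are dropped (an assumption).
    """
    cut = median(scores.values())
    high = sorted(s for s, v in scores.items() if v > cut)
    low = sorted(s for s, v in scores.items() if v < cut)
    return cut, low, high

# Hypothetical belief ratings (not data from the study)
belief = {"s1": 3, "s2": 6, "s3": 8, "s4": 5, "s5": 9, "s6": 6, "s7": 7}
cut, low, high = median_split(belief)
```

In the experiment the split was made separately within the active and the passive condition, so a function like this would be called once per condition.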

Judgments of success varied across the in- 
volvement conditions. Table 3 (column 1) 
shows that active subjects were confident on 
more trials than were passive subjects, and an 
ANOVA on these data shows this difference to 
be significant, F(1, 59) = 11.10, p < .005, 
ω² = .147. No other significant differences 
were found (Fs < 1). 
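
For readers who want to relate an F ratio to an effect-size estimate, the following sketch uses the one-way between-subjects identity ω² = df(F − 1) / [df(F − 1) + N]. It is an approximation under stated assumptions: the total sample size of 61 is inferred from the error degrees of freedom for a two-group comparison and is not reported in the text, and the value it returns need not match the reported .147, which depends on the full factorial ANOVA table.

```python
def omega_squared_approx(f_value, df_effect, n_total):
    """Approximate omega squared for a between-subjects effect from its F ratio."""
    numerator = df_effect * (f_value - 1.0)
    # Negative estimates (F < 1) are conventionally reported as zero
    return max(0.0, numerator / (numerator + n_total))

# F(1, 59) = 11.10; n_total = 61 is an assumed sample size (error df + 2 groups)
estimate = omega_squared_approx(11.10, 1, 61)
print(round(estimate, 3))
```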

An ANOVA was performed with high/low belief 
and active/passive involvement as between-subjects 
variables and the self/other perceived-influence 
questions as a within-subject variable, the 
dependent measure being the rating on the 
self/other questions. There was no significant 
main effect for belief (F < 1). There was a 
significant main effect for the self/other 
variable such that subjects rated their own 
control over the die to be greater than that of 
their partners, F(1, 56) = 11.21, p < .005, 
ω² = .09. Similarly, active subjects gave higher 
ratings than passive subjects, F(1, 56) = 4.23, 
p < .05, ω² = .04. The above main effects must 
be qualified in light of a significant 
Active/Passive × Self/Other interaction, 
F(1, 56) = 10.49, p < .005, ω² = [illegible]. 
Active subjects gave higher self PK influence 
ratings relative to their other PK influence 
ratings (see Figure 1). The self/other ratings 
for passive subjects did not differ, although 
they were lower than the self ratings of 
active subjects. No other significant 
interactions were found. 
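
The form of this interaction can be checked by hand from the cell means in Table 3. The sketch below is only descriptive arithmetic, not a reanalysis: it averages the two belief groups with equal weight, which is an assumption because group sizes are not given.

```python
# Cell means (self, other) from Table 3, by involvement and belief group
table3 = {
    "passive": {"low": (2.59, 2.53), "high": (3.58, 3.38)},
    "active": {"low": (4.73, 3.13), "high": (4.87, 2.93)},
}

def self_other_gap(cells):
    """Mean self-minus-other difference, weighting belief groups equally."""
    gaps = [self_m - other_m for self_m, other_m in cells.values()]
    return sum(gaps) / len(gaps)

gap_active = self_other_gap(table3["active"])    # about 1.77 scale points
gap_passive = self_other_gap(table3["passive"])  # about 0.13 scale points
interaction_contrast = gap_active - gap_passive  # about 1.64 scale points
```

The self-minus-other gap is large for active subjects and close to zero for passive subjects, which is the pattern the interaction term describes.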


Discussion 
Active subjects were confident on significantly 
more trials than passive subjects 
Table 3 
Estimated Success as a Function of Involvement and Prior Belief 

                        Number of      Self-        Other 
                        confident    influence    influence 
                          trials     estimates    estimates 
Group                     (0-14)       (0-9)        (0-9) 

Passive involvement 
  Low belief 
    M                      7.06        2.59         2.53 
    SD                     2.46        2.15      [illegible] 
  High belief 
    M                      7.62        3.58         3.38 
    SD                     1.19        2.07         1.56 
Active involvement 
  Low belief 
    M                      8.73        4.73         3.13 
    SD                     1.58        2.34         2.00 
  High belief 
    M                      8.93        4.87         2.93 
    SD                     1.16        1.46         2.12 


Golin, Terrell, & Johnson, 1977; Langer, 1975; 
Wortman, 1975; Benassi, Roher, & Reynolds, 
Note 2). The second finding was the signif- 
icant Self/Other X Active/Passive interac- 
tion. In general, when two people are jointly 
involved with a task, one may feel more re- 
sponsible for producing an outcome than the 
partner is only when one is the more active 
participant in the situation. The fact that our 
active subjects gave higher self than other 
ratings in a chance situation provides further 
support for Langer’s (1975) illusion-of-con- 
trol model. 


Experiment 4: Involvement and 
Locus of Control 


The major purpose of the present study was 
to determine how persons with an internal or 
external locus of control respond to a PK 
task. Rotter (1966) proposed that persons 
differ in the extent to which they feel control 
over events, with internals perceiving more 
control than do externals. Some research is 
consistent with the notion that internals per- 
ceive control over events only when the events 
are seen as potentially controllable (see Geen, 
1976, pp. 242-243). In general terms, in- 
ternals should perceive control in skill but not 
chance situations (cf. Miller & Seligman, 
1973; Rotter, 1966). Conversely, externals 
should perceive little control in chance or skill 
situations. We predicted that active/internals 
would perceive more control in a situation 
than active/externals, passive/internals, and 
passive/externals, who would not reliably dif- 
fer in their estimated control. In our experi- 
ment we induced the perception of skill by 
making one of a pair of subjects actively in- 
volved with the task; the perception of chance 
was induced by making the other partner a 
passive participant in the task. 


Method 


Subjects. Thirty male and 30 female introductory 
psychology students at California State University, 
Long Beach, completed a course requirement by 
participating. 

Apparatus and materials. The die funnel was used. 
Responses to each of the 46 statements from Rotter’s 
(1966) locus-of-control scale were given on a 
disagree/agree format (based on Collins, 1974). 
Scoring was accomplished by summing each subject’s 
ratings across statements, with higher sums 
reflecting externality. Scoring was reversed on 
external items, that is, a rating of 1 was changed 
to 5, and so forth. 
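
The scoring rule just described (sum the ratings after flipping reverse-keyed items so that 1 becomes 5, 2 becomes 4, and so forth) can be sketched as follows. The item numbers, the ratings, and the choice of which items are reverse-keyed are hypothetical; only the 1-to-5 flip and the summation come from the text.

```python
def locus_of_control_score(ratings, reversed_items):
    """Sum 1-5 ratings over all items, flipping reverse-keyed items (1 <-> 5, 2 <-> 4)."""
    total = 0
    for item, rating in ratings.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"rating out of range for item {item}: {rating}")
        total += 6 - rating if item in reversed_items else rating
    return total

# Hypothetical 4-item example; the scale used in the study had 46 statements
ratings = {1: 5, 2: 1, 3: 4, 4: 2}
reversed_items = {2, 4}
score = locus_of_control_score(ratings, reversed_items)
```

With 46 statements rated from 1 to 5, sums produced this way can range from 46 to 230.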

Procedure. Subjects reported to the experimental 
room in groups of five. The experimenter gave a 
general introduction similar to that given in Ex- 
periment 3. Subjects were told that the experimenter 
could only work with one subject at a time, and 
they were asked to complete an attitude questionnaire 
(the locus-of-control scale) while waiting for 
their turns. Before the questionnaire was completed, 
a “lottery” chose one subject to serve as the 
experimenter’s helper; the lottery was manipulated 
so that a confederate of the experimenter was 
always selected. 

Figure 1. Active/Passive × Self/Other interaction. 
(Axes: mean perceived psychokinetic influence, self 
and other, by involvement condition, active vs. 
passive.) 

The four real subjects were given time to com- 
plete the questionnaire. Next, they were individually 
put through the following experimental procedure, 
similar to that of Experiment 3. The subject was 
given a target color; he or she concentrated on pro- 
ducing the target for 10 seconds; the die was thrown; 
the subject verbally indicated on a 0-to-10 scale 
his or her degree of confidence that the target color 
had come up. The confederate recorded the con- 
fidence ratings and the outcome of each die toss. 
In one condition, the subject threw the die (active 
involvement); in the other condition, the experi- 
menter threw the die (passive involvement). Each 
subject received 20 trials with a constant sequence 
of target colors. Neither the experimenter nor the 
confederate was informed about the purpose of the 
study until all subjects had been run. 


Results 


The data of the 10 most internal and 10 
most external subjects from the active and 
passive conditions were analyzed. The ra- 
tionale for this decision was that we wished to 
create more extreme groups on the locus-of- 
control dimension than would probably have 
resulted if we had created groups by doing a 
median split on the locus-of-control scores. 
Table 4 (row 1) shows the mean locus-of-control 
scores for each of the Locus of Control × 
Involvement groups. 

The hit rates for the experimental groups 
did not reliably differ from each other (F < 
1). Also, no group reliably deviated from 
chance performance (Table 4, row 2). 
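
One way to examine a claim of this kind from summary statistics is a one-sample t statistic against the chance expectation. The sketch below is an illustration, not the authors' analysis: the hit probability of .5 per trial is an assumption about the painted die that this section does not state, and the group size of 10 follows from the extreme-groups selection described above.

```python
import math

def t_against_chance(mean_hits, sd_hits, n_subjects, n_trials, p_hit):
    """One-sample t statistic comparing a group's mean hits with the chance expectation."""
    expected = n_trials * p_hit
    standard_error = sd_hits / math.sqrt(n_subjects)
    return (mean_hits - expected) / standard_error

# Active/internal group, Table 4 (row 2): M = 9.80, SD = 2.80, n = 10.
# p_hit = 0.5 is an assumed chance probability, not a value given in the text.
t_value = t_against_chance(9.80, 2.80, 10, 20, 0.5)
```

Under these assumptions the statistic is far smaller in magnitude than the two-tailed critical value for 9 degrees of freedom (about 2.26), consistent with performance at chance.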

Each subject’s confidence rating was obtained 
by taking the mean rating for the 20 test 
trials. Table 4 (row 3) shows that active 
subjects were more confident than passive 
subjects, F(1, 36) = 4.02, p < .06, ω² = .06. 
Internals gave nonsignificantly higher ratings 
than externals, F(1, 36) = 2.11, p > .10. 
The predicted interaction did not reach 
significance, F(1, 36) = 2.53, p > .10. 

The ratings on the posttask measures provided 
consistent and moderate support for the 
hypothesized interaction. On the Likert-format 
question “How much control do you feel 
you had over what colors came up from the 
die tosses?” active subjects gave higher 
estimates than passive subjects, F(1, 36) = 4.23, 
p < .05, ω² = .07 (Table 4, row 4). Internals 
gave nonsignificantly higher ratings than 
externals, F(1, 36) = 2.75, p > .10. The 
interaction was marginally significant, F(1, 36) = 
3.70, p < .08, ω² = .06. To the Likert-format 
question “Do you feel you would get better 
on the task if you were given another 20 
trials?” active subjects gave higher ratings 
than passive subjects, F(1, 36) = 7.93, p < 
.01, ω² = .11 (Table 4, row 5). Internals were 
significantly more confident than externals, 
F(1, 36) = 8.73, p < .01, ω² = .13. The 
interaction was significant, F(1, 36) = 7.18, p < 
.02, ω² = .10. To the question “If you were 
given another 20 trials, how many do you 
think you would get?” active subjects gave 
nonsignificantly higher ratings than passive 
subjects, F(1, 36) = 3.14, p > .1 (Table 4, 
row 6). Internals gave higher ratings than 
externals, F(1, 36) = 5.99, p < .02, ω² = [illegible]. 
The interaction was significant, F(1, 36) = 
8.72, p < .01, ω² = .14. 


Discussion 


Active subjects gave higher estimates of 
their performance, replicating Experiment 3 as 
well as previously cited research. Internals 
gave higher ratings than externals on two of 
the four dependent measures. These results 
should be qualified, however, in light of the 


MIND OVER MATTER 


significant Involvement × Locus of Control 
interaction obtained on three of the four 
dependent measures. Comparison of the mean 
values in Table 4 shows that active/internals 
gave consistently higher estimates of their 
performance than subjects in the other three 
experimental groups, whose estimates did not 
reliably differ from one another. Internals 
seem to detect the presence or absence of skill 
cues and then attempt to exert control if skill 
is perceived as possible. In our experiment, 
skill was merely illusory. While an internal 
locus of control is viewed as a positive 
characteristic in many contexts, internals who 
threw the die manifested the highest degree of 
superstitious behavior among our subjects. 

In a study similar to ours, Miller and Seligman 
(1973) failed to find a main effect for 
locus of control or a Locus of Control × Task 
(chance vs. skill) interaction. Differences 
between the studies make it difficult to account 
for these discrepancies. Miller and Seligman, 
for example, assigned subjects to internal or 
external groups based on a median split done 
on responses to Rotter’s (1966) forced-choice 
scale. Also, they gave subjects bogus feedback 
after each trial of their skill and chance tasks. 
Lastly, they used different tasks for their skill 
and chance conditions. 


Table 4 
Summary of Data From Experiment 4 

                           Active                 Passive 
Measure              Internals  Externals   Internals  Externals 

Locus of control 
  M                   120.80     150.80      118.40     151.40 
  SD                   10.95       4.38       12.69       6.83 
Hit rate 
  M                     9.80      10.10       10.00       9.80 
  SD                    2.80       2.28        1.83       2.40 
Confidence (0-10) 
  M                     6.20       4.00        3.60       3.70 
  SD                    1.61       2.16        2.87       2.31 
Perceived control (0-10) 
  M                     4.90       2.20        1.90       2.10 
  SD                    2.86       2.25        2.28       2.07 
Improvement (0-10) 
  M                     6.10       2.00        2.10       1.90 
  SD                    2.72       1.86        2.76       1.60 
Another 20 trials 
  M                    13.10       9.90       10.30      10.60 
  SD                    2.51       1.79        1.63       1.34 

Note. For each dependent variable, a one-way ANOVA was done on the 
results of the active/external, passive/internal, and passive/external 
groups; all Fs were less than 1. Next, for each measure, the data from 
these groups were collapsed and compared with the data of the 
active/internal subjects by way of t tests for independent means. Each 
comparison showed a difference between means at the p < .001 level of 
significance for a two-tailed test. 


General Discussion 


In the present studies the situational variables 
of instructional set, practice, and involvement 
affected estimated success on a PK task 
independent of actual success. Similarly, 
general belief in paranormal phenomena and 
locus-of-control orientation were significantly 
related to estimated performance. The 
locus-of-control variable was found to interact with 
the involvement variable in Experiment 4. 
The results of Experiments 1–4 are consistent 
with Langer’s (1975) skill theory of nonveridical 
perceived control. The fact that the 
theory was supported in a psychic context 
using a variety of independent and dependent 
variables bolsters the theory’s generalizability. 

The consistent pattern of interaction found 
in Experiment 4 demonstrates the relevance 
of the locus-of-control construct for a theory 
of nonveridical perceived control. Externals 
actively involved with the task gave no higher 
estimates of their performance than either 
internals or externals passively involved with 
the task. Thus, being actively involved in a 
chance task may evince no greater estimate of 
performance than would be made after being 
passively involved with the task, unless the 
person has an internal locus of control. Future 
work might determine whether the effects of 
locus-of-control orientation could be overridden 
in contexts different from ours. 

Langer’s (1975) skill theory is a cogent 
account of how illusions of control can be in- 
duced. At the same time, the phenomenon of 
illusory control is sufficiently complex that it 
cannot as yet be accounted for by a single 
model. Future research might profitably focus 
on the following factors, which may be shown 
to be related to perceived illusory control: 
human biases in information processing 
(Kahneman & Tversky, 1973)³; coincidences, 
particularly dramatic (psychic?) ones; the 
human propensity to recall successes and to 
forget or play down failures (Ward & Jenkins, 
1965); personality variables such as locus of 
control, need for certainty, and so forth. We 
are inclined to speculate that the same vari- 
ables that affect one’s judgment of personal 
control also affect one’s judgment of another’s 
control. Also, we suspect that the emotional 
concomitants associated with perceived psy- 
chic control are noticeably greater than those 
correlated with more mundane perceived con- 
trol. For whatever reasons, the realm of the 
paranormal seems to conjure deep-seated emo- 
tional reactions that far exceed those war- 
ranted by the phenomena on which they are 
based. We plan to examine these speculations 
in our future work. 


³ In a manuscript in preparation, the first author 
found that in judging another’s success on a psychic 
task, respondents fell prey to the availability and 
representation fallacies described by Kahneman and 
Tversky (1973). 


Communicator Physical Attractiveness and Persuasion 


Shelly Chaiken 


University of Toronto, Toronto, Canada 


In a field setting, physically attractive or unattractive male and female com- 
municator-subjects delivered a persuasive message to target-subjects of each 
sex. Results indicated that attractive (vs. unattractive) communicators induced 
significantly greater persuasion on both a verbal and behavioral measure of 
target agreement. In addition, female targets indicated greater agreement than 
did male targets. Data gathered from communicator-subjects during an earlier 
laboratory session indicated that physically attractive and unattractive com- 
municators differed with respect to several communication skills and other 
attributes relevant to communicator persuasiveness, including grade point av- 
erage, Scholastic Aptitude Test scores, and several measures of self-evaluation. 
These findings suggest that attractive individuals may be more persuasive than 
unattractive persons partly because they possess characteristics that dispose 
them to be more effective communicators. 


Experimental evidence regarding the effect 
of communicator physical attractiveness on 
persuasion is equivocal. Although two studies 
have demonstrated that attractiveness can 
significantly enhance a male communicator’s 
persuasiveness with both male and female 
message recipients (Horai, Naccari, & Fatoul- 
lah, 1974; Snyder & Rothbart, 1971), the 
majority of published experiments have failed 
to obtain significant attractiveness effects or 
have obtained interactions between attractiveness 
and other variables (Chaiken, Eagly, 
Sejwacz, Gregory, & Christensen, 1978; Mills 
& Aronson, 1965; Blass, Alperstein, & Block, 
Note 1). For example, Mills and Aronson 
(1965), using a female communicator and 
male recipients, found no overall effect of 
communicator attractiveness on persuasion. 


The author is grateful to Alice H. Eagly, Jonathan 
[surname illegible], and an anonymous reviewer for their 
comments on an earlier draft of this manuscript, 
and to John W. Fee III, Herschl Forman, Gory 
[remainder of acknowledgment illegible]. 
A version of this article was presented at 
the meeting of the Eastern Psychological Association, 
Washington, D.C., 1978. 
Requests for reprints should be sent to Shelly 
Chaiken, Department of Psychology, University of 
Toronto, Toronto, Ontario, Canada M5S 1A1. 

However, on a marginally significant basis, 
they did find that the communicator’s expres- 
sion of a desire to influence recipients en- 
hanced her persuasiveness when she was at- 
tractive but not when she was unattractive. 
Blass, Alperstein, and Block (Note 1), who 
also studied a female communicating to male 
recipients, found that communicator attrac- 
tiveness and race interacted to affect opinions. 
A white communicator was more persuasive if 
attractive than if unattractive, whereas a 
black communicator was more persuasive if 
unattractive. Finally, one experiment reported 
by Chaiken, Eagly, Sejwacz, Gregory, and 
Christensen (1978) indicated that the persua- 
sive impact of attractive communicators de- 
pended both on the sexual composition of the 
communicator-recipient dyad and on whether 
recipients anticipated interacting with the 
communicator. Their second experiment, 
which employed the identical stimulus mate- 
rials but utilized a somewhat different cover 
story, yielded no significant persuasion findings 
involving the attractiveness variable, 
however. 

All of these experiments were conducted in 
laboratory settings and all employed experimental 
manipulations of communicator attractiveness. 
The majority of studies have 
used pictured communicators, varying appearance 
by means of photographs scaled for 
physical attractiveness (Chaiken et al., 1978; 
Horai et al., 1974; Snyder & Rothbart, 1971). 
Two experiments have utilized either live 
communicators (Mills & Aronson, 1965) or 
videotaped communicators (Blass et al., Note 1) 
whose appearance was varied by means of 
makeup, dress, and grooming. 

The present research addressed two general 
concerns raised by previous empirical work. 
The first deals with the importance of the 
attractiveness cue outside the laboratory. The 
lack of consistent findings across previous 
experiments (particularly the paucity of 
attractiveness main effects and the occurrence of 
unpredicted and often theoretically ambiguous 
interactions; Chaiken et al., 1978; Blass et 
al., Note 1) suggests that the persuasive 
impact of physical attractiveness may be far 
from robust. This concern is heightened by 
the possibility that the highly controlled set- 
tings of previous experiments made the at- 
tractiveness cue more salient than it would 
typically be in more naturalistic settings, and 
the context of these experiments may there- 
fore have inflated the importance of this 
variable as a determinant of persuasion. In 
this regard, it is interesting to note that both 
experiments demonstrating an attractiveness 
main effect on opinion change (Horai, Nac- 
cari, & Fatoullah, 1974; Snyder & Rothbart, 
1971) used photographic stimuli and employed 
relatively simple experimental procedures. 
In contrast, studies that have not 
demonstrated this main effect have used live 
or videotaped communicators (Mills & Aron- 
son, 1965; Blass et al., Note 1) or have em- 
ployed relatively more complex experimental 
procedures (Chaiken et al., 1978, Experi- 
ment 1). 

Conversely, it might be that previous em- 
pirical findings underestimate the importance 
of the attractiveness variable in persuasion. 
The implicit demands of the laboratory may 
encourage subjects to adopt a highly logical 
mode of cognitive functioning. At the same 
time, such demands may discourage subjects 
from utilizing information such as physical 
attractiveness, which popular wisdom would 
hold to be irrelevant. Thus, although attractiveness 
may well influence opinion change in 
naturalistic settings, the psychology labora- 
tory may provide a particularly unsuitable 
setting for demonstrating such an effect. 

In sum, one concern of the experiment dealt 
with the importance of communicator attractiveness 
as a determinant of persuasion. To 
address this concern, the experiment utilized 
live, physically attractive and unattractive 
communicators who delivered a persuasive 
message to targets in a field setting. It was 
predicted that attractive communicators 
would be more persuasive than unattractive 
ones. No hypotheses were formed with respect 
to whether attractiveness might interact with 
communicator or target sex to affect opinions, 
although some previous findings (Chaiken et 
al., 1978, Experiment 1) suggest this possibility. 

The second concern of the present research, 
one that presupposes that attractiveness does 
influence persuasion, deals with the explanation 
of attractiveness effects. In previous 
studies, the style in which the persuasive 
message is presented to subjects has remained 
constant across experimental conditions. In 
addition, excluding trait inferences that subjects 
make, the (typically hypothetical) communicators 
in these studies are equated on 
attributes other than attractiveness. Such 
standardization is of obvious value, since it 
allows one to conclude that observed differences 
in treatment are a product of the attractiveness 
cue rather than the result of some 
factor that is unintentionally confounded with 
the attractiveness manipulation. This type of 
design has been and should be used to evaluate 
a variety of explanations for the facilitating 
effect of attractiveness on persuasion, 
including psychological mechanisms such as 
credibility enhancement, identification (Kelman, 
1961), social reinforcement, cognitive 
balance, and classical conditioning. However, 
such designs preclude from consideration the 
possibility that attractive and unattractive 
individuals are differentially persuasive 
because attractiveness is correlated with other 
attributes that affect communicator 
persuasiveness. 


ATTRACTIVENESS AND PERSUASION 


While little is currently known about actual 
differences between physically attractive and 
unattractive persons, researchers (Berscheid 
& Walster, 1974) have argued that because of 
differing socialization experiences (e.g., Clif- 
ford & Walster, 1973; Dion, 1972) attractive- 
ness may be confounded with other attributes 
(e.g., intelligence, status, self-concept, 
personality). Consistent with this argument, 
Goldman and Lewis (1977) recently reported 
that attractive individuals may possess greater 
social skill than unattractive persons. The 
idea that attractiveness covaries naturally 
with other factors suggests that attractive 
individuals may be more persuasive than un- 
attractive persons partly because they possess 
communication skills or other attributes that 
dispose them to be particularly effective com- 
municators. To explore this possibility, the 
present study selected 17 different com- 
municator-subjects to represent each of four 
possible combinations of communicator attractiveness 
and sex. Videotapes of these communicators 
delivering the persuasive message 
were examined to determine possible individ- 
ual differences in communication skills. The 
skills measured in the experiment included 
those identified by previous research (speech 
rate, Miller, Maruyama, Beaber, & Valone, 
1976; nonfluencies, McCroskey & Mehrley, 
1969; vocal confidence, London, 1973; eye 
gaze, Mehrabian & Williams, 1969) or sug- 
gested by intuition (smiling) as correlates of 
communicator persuasiveness. To examine 
differences with respect to other attributes 
relevant to persuasiveness, communicator-subjects 
also completed the Rotter (1966) Internal-External 
Locus of Control Scale, rated 
themselves on a series of evaluative scales, 
and reported their grade point averages and 
Scholastic Aptitude Test (SAT) scores. 


Method 


Subjects and Overview of Design 


A total of 110 male and female University of 
Massachusetts undergraduate psychology students 
participated for extra course credit as communicator-subjects. 
In a laboratory session immediately preceding 
their field participation, these individuals 
were trained to deliver a persuasive message, and 
their final practice performances were videotaped. 
In addition, communicator-subjects completed a ques- 
tionnaire and were photographed. In the field, each 
communicator delivered the persuasive message to 
2 University of Massachusetts undergraduates of 
each sex. Subsequently, an independent group of 
judges (n = 56) rated communicator-subjects’ photo- 
graphs on physical attractiveness (15-point scale). 
Communicator-subjects were rank ordered accord- 
ing to their mean attractiveness ratings, and those 
in the top or bottom third of the distribution for 
their sex were selected for inclusion in the design. 
This procedure resulted in the analysis of data from 
68 communicators (17 per sex/attractiveness level) 
and 272 target-subjects (2 male and 2 female targets 
per communicator). Mean attractiveness scores for 
the four communicator groups were 8.80 (male/attractive), 
9.09 (female/attractive), 6.10 (male/unattractive), 
and 6.34 (female/unattractive). 
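
The selection rule described above (rank communicators of one sex by mean judged attractiveness, then keep the top and bottom thirds) can be sketched as follows. The identifiers and ratings are hypothetical, and the figure of 52 communicators per sex is an assumption: it follows only if the 104 communicators remaining after exclusions were split evenly by sex, which the text does not state.

```python
def select_extreme_thirds(mean_ratings):
    """Return (top_third, bottom_third) of ids ranked by mean attractiveness rating."""
    ranked = sorted(mean_ratings, key=mean_ratings.get, reverse=True)
    k = len(ranked) // 3
    if k == 0:
        return [], []
    return ranked[:k], ranked[-k:]

# Hypothetical ratings for 52 communicators of one sex (52 per sex is an assumption)
ratings = {f"c{i:02d}": 5.0 + 0.1 * i for i in range(52)}
attractive, unattractive = select_extreme_thirds(ratings)
```

With 52 communicators, integer division gives 17 per group, which agrees with the 17 communicators per sex/attractiveness cell reported above.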


Laboratory Procedure 


Communicator-subjects were recruited for an ex- 
periment entitled “social influence” and were sched- 
uled individually for their laboratory sessions. The 
(male) experimenter informed communicators that 
they would be trained to deliver a persuasive mes- 
sage and would then attempt to persuade students 
whom they would approach on campus. Communica- 
tors were told that “in order to explore correlates 
of communicator persuasiveness,” they would com- 
plete a questionnaire, be photographed, and be video- 
taped delivering the persuasive message. 

After this introduction subjects completed the 
first part of a questionnaire on which they indicated 
their sex, age, grade point average, SAT scores, and 
agreement (7-point scale) with the position advo- 
cated in the persuasive message. Subjects completed 
the remaining three sections of the questionnaire in 
one of six possible orders. One section consisted of 
the 23-item (plus fillers) Internal-External Locus 
of Control Scale (Rotter, 1966). In a second sec- 
tion, subjects described themselves on 7-point bipolar 
adjective scales. Positive poles of the 17 scales used 
were intelligent, interesting, assertive, confident, lik- 
able, knowledgeable, competent, sincere, physically 
attractive, modest, moral, persuasive, trustworthy, 
warm, attractive, friendly, and sensitive. Finally, 
a third section asked subjects to “speculate about 
their future” by responding on 7-point scales to the 
following items: to have an excellent (vs. poor) job, 
have a happy (vs. drab) family life, be regarded as 
a successful (vs. unsuccessful) person, be a contented 
(vs. discontented) person, be well-off financially (vs. 
in bad financial straits), be enjoying life (vs. de- 
pressed about things), be famous for something (vs. 
a very ordinary person), be a highly (vs. only 
moderately) educated person, be a confident (vs. a 
worried) person. 

Subjects next received a script containing the 
persuasive message as well as all procedural details 
pertaining to their field participation. To standardize 
communicator-subjects’ training as much as possible, 
the following procedure was employed. The experimenter 
twice went through the field procedure with 
each subject, once playing the role of communi- 
cator and once playing that of target-subject. Com- 
municator-subjects were then allotted 10 minutes to 
practice the script on their own. Afterwards, the 
experimenter again went through the script twice 
with the communicator-subject, both times play- 
ing the role of target-subject. 

Next, communicator-subjects were asked to de- 
liver the persuasive message “just as you would to 
a real target-subject.” This performance was de- 
livered to an experimental assistant (male) and was 
videotaped. Subsequently the videotapes were scored 
by two independent judges (blind to the experimental 
hypotheses). Exposed only to the audio component 
of the videotape, each judge rated the communicator’s 
vocal confidence (5-point scale). Using 
a digital stop-clock, judges recorded the time that 
the communicator gazed directly at his or her tar- 
get (experimental assistant) and, to assess speech 
rate (seconds per word), recorded the time each 
communicator took to deliver the 111-word persua- 
sive message. A smiling index (possible range 0 to 
5) was obtained by having judges record the pres- 
ence (1) or absence (0) of smiling in each of five 
segments of the message. Judges also counted the 
number of nonfluencies (vocal pauses within sen- 
tences, repetitions, “umms,” “ers,” stuttering, etc.) 
made during delivery of the message. Finally, judges 
rated the communicator’s physical attractiveness (5- 
point scale) while exposed only to the video com- 
ponent of the tape. For each of the above measures, 
the two judges’ ratings were averaged to obtain a 
score for each communicator-subject. Interrater 
reliability coefficients were speech rate, r = .99; vocal 
confidence, r = .79; gaze, r = .99; smiling, r = .83; 
nonfluencies, r = .90; physical attractiveness, r = .67. 
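
A minimal sketch of the two steps described above, correlating the two judges' ratings and averaging them into one score per communicator, is given below. The ratings are hypothetical, and the use of a Pearson correlation as the reliability coefficient is an assumption; the text reports r values without naming the coefficient.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical vocal-confidence ratings (5-point scale) from two judges
judge_a = [4, 3, 5, 2, 4, 1]
judge_b = [4, 2, 5, 3, 4, 2]
reliability = pearson_r(judge_a, judge_b)
# Each communicator's score is the mean of the two judges' ratings
combined = [(a + b) / 2 for a, b in zip(judge_a, judge_b)]
```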

Just prior to leaving the laboratory for the field, 
communicators were photographed in a standard 
pose. The 3½ × 5 inch black and white photographs 
showing a head and shoulder view of each 
communicator were subsequently scaled for physical 
attractiveness by an independent group of judges (see 
above). 


Field Procedure 


Each communicator was randomly assigned to one 
of five campus locations and was then taken to 
that location by the experimenter. To prevent com- 
municators from selecting their own target-subjects, 
each communicator was required to approach every 
passer-by until he or she had completed the entire 
field procedure with 2 university students of each 
sex. This rule was subject to the following 
constraints: (a) sex of person approached was required 
to alternate, (b) only persons walking alone were 
to be approached, and (c) an approach could not 
be initiated while the communicator was occupied 
with another target-subject. 

The communicator introduced himself or herself 
to the target-subject and requested that the target 
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complete an opinion survey, If the target agreed, 
the communicator stated that he or she was in a 
campus group favoring the proposition that “the 
University should stop serving meat at breakfast 
and lunch at all dining commons,” and the com- 
municator supported this position with two brief 
arguments. After restating the position, the com- 
municator gave the target a confidential question- 
naire to complete. On this questionnaire, targets 
indicated their agreement (7-point scale) with the
message’s overall position and rated the communicator’s
friendliness, knowledgeability, and attractiveness
(15-point scales). Targets also indicated their
sex and age and responded to some filler items designed
to make the questionnaire appear to be an
opinion survey. Afterwards, targets placed the completed
questionnaire in a box provided by the communicator.
The communicator next stated that he
or she was also circulating a petition demanding
that “the University stop serving meat at breakfast
and lunch at all dining commons,” and asked
the target to sign it. After the target had signed
(or refused to sign), he or she was thanked for
participating in the survey.

Throughout the entire field procedure, the experimenter
stood nearby and monitored all communicator-target
interactions. Data from 6 (of the
initial 110) communicators were excluded from consideration
because they deviated greatly from the
experimental script (4) or informed one or more
of their targets that they were participating in an
experiment (2). The experimenter also recorded the
total number of approaches required by each communicator
in order to complete the field procedure
successfully with two male and two female targets.


Results and Discussion 


Data provided by target-subjects were an- 
alyzed by four-way multivariate and univari- 
ate analyses of variance with 2 levels each of 
communicator sex, attractiveness, and target 
sex, and 17 levels of communicator (nested
within communicator sex and attractiveness).
Data from communicator-subjects were an- 
alyzed by two-way multivariate and univari- 
ate analyses of variance with 2 levels each of 
communicator sex and attractiveness. 


Check on Experimental Design 


The attractive communicator-subjects of
the present study were clearly perceived as
more physically attractive than their unattractive
counterparts. An analysis of attractiveness
ratings made by photograph judges revealed
only an attractiveness main effect
(p < .0001), as did an analysis of attractiveness
ratings made by videotape judges (p <
.001). Further, target-subjects perceived
physically attractive (vs. unattractive) communicators
as more attractive (p < .05). It
should also be noted that communicator-subjects
did not differ significantly as a function
of attractiveness or sex with respect to their
age or opinions on the persuasive message
topic.


Table 1
Mean Target Agreement and Proportion of Targets Signing Petition as a Function of
Communicator Attractiveness, Communicator Sex, and Target Sex

                     Attractive communicator                Unattractive communicator
                   Male comm.        Female comm.         Male comm.        Female comm.
                   Male    Female    Male    Female       Male    Female    Male    Female
Variable           target  target    target  target       target  target    target  target

Target agreement   4.50    3.32      4.24    3.53         4.91    4.00      4.53    4.03
Petition signing    .29     .53       .35     .47          .35     .38       .24     .29

Note. On the agreement measure, lower numbers indicate greater agreement with the position advocated in
the persuasive message. Cell n = 34.


Field Data 


The five target-subject variables (agreement,
petition signing, perceptions of communicator
friendliness, knowledgeability, and
attractiveness) were first submitted to a four-way
multivariate analysis of variance and
were then analyzed on a univariate basis.¹
The multivariate analysis yielded significant
effects due to communicator sex, F(5, 60) =
2.96, p < .02, and target sex, F(5, 60) =
3.16, p < .02, and a marginal effect due to
communicator attractiveness, F(5, 60) =
1.74, p < .14.² The univariate analyses revealed
that the communicator-sex, target-sex,
and communicator-attractiveness main effects
attained significance on one (attractiveness),
two (agreement, petition signing), and three
(agreement, petition signing, attractiveness)
of the five target variables, respectively. The
remainder of this section presents the results
of these more informative univariate analyses.

Persuasion measures. Analysis of targets’
agreement with the communicator’s overall
position (see Table 1) revealed main effects
due to communicator attractiveness and target
sex. Attractive communicators elicited greater 
agreement from targets than did unattractive 
communicators (M = 3.89 vs. M = 4.37), 
F(1, 64) = 4.23, p < .05, and female targets 
expressed greater agreement than male targets 
(M = 3.72 vs. M = 4.54), F(1, 64) = 11.22, 
p < .005. No other effects were significant on 
this measure. Dunnett’s test (cf. Myers,
1972) indicated that all experimental groups
(in the Communicator Sex X Communicator
Attractiveness X Target Sex design) differed
significantly from an opinion-only group of
pilot subjects³ in the direction of greater
agreement with the position advocated in the
message (ps < .05 or smaller).


1 Following Lunney (1970) and the advice of an
anonymous reviewer, the dichotomous petition-signing
measure (coded 0, 1) was included in the multivariate
analysis and was also treated by univariate
analysis of variance. The latter analysis yielded main
effects due to communicator attractiveness, F(1, 64)
= 3.95, p < .06, and target sex, F(1, 64) = 5.07,
p < .05. These univariate results parallel those presented
in the text, which are based on a more conventional
treatment of dichotomous data.

2 Since the nested communicator factor accounted
for no significant effects in the four-way analysis,
it was ignored in a subsequent three-way multivariate
analysis on the five field measures. The three-way
analysis, which employs an error term with
greater degrees of freedom than the error terms
utilized in the four-way hierarchical design analysis,
yielded comparable findings: significant effects due
to communicator sex, F(5, 260) = 3.80, p < .005, and
target sex, F(5, 260) = 2.93, p < .02, and a marginal
effect due to communicator attractiveness, F(5, 260)
= 1.83, p < .11.

3 Pilot subjects (n = 18) were approached by the
experimenter at various campus locations and were

The proportion of targets in each experi- 
mental condition who signed the communica- 
tor’s petition was computed (see Table 1), 
and z tests were performed on the arcsin 
transformations of these proportions (cf. 
Langer & Abelson, 1972). Paralleling the 
agreement results, this analysis revealed that 
a greater proportion of targets signed the peti- 
tion when the communicator was attractive 
(p = .41) rather than unattractive (p = .32),
z = 1.56, p < .06, one-tailed, and a greater
proportion of female (vs. male) targets signed
(p = .42 vs. p = .31), z = 1.91, p < .06, two-tailed.
No other effects were significant on
petition signing. 
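
The comparison of proportions described above can be sketched in a few lines. This is an illustration under stated assumptions, not a reconstruction of the original computation: it assumes the common form of the arcsine test, in which each proportion is transformed as 2·arcsin(√p) and the difference is divided by √(1/n₁ + 1/n₂), and it starts from the rounded cell proportions in Table 1. Values recomputed this way approximate, but need not exactly match, the reported z statistics.

```python
from math import asin, sqrt

CELL_N = 34  # cell size given in the note to Table 1

# Proportion of targets signing, keyed by
# (communicator attractiveness, communicator sex, target sex); values from Table 1.
SIGNING = {
    ("attractive", "male", "male"): 0.29,
    ("attractive", "male", "female"): 0.53,
    ("attractive", "female", "male"): 0.35,
    ("attractive", "female", "female"): 0.47,
    ("unattractive", "male", "male"): 0.35,
    ("unattractive", "male", "female"): 0.38,
    ("unattractive", "female", "male"): 0.24,
    ("unattractive", "female", "female"): 0.29,
}


def marginal(factor_index, level):
    """Mean cell proportion and pooled n at one level of a factor.

    An unweighted mean is appropriate here because all cells have equal n.
    """
    cells = [p for key, p in SIGNING.items() if key[factor_index] == level]
    return sum(cells) / len(cells), CELL_N * len(cells)


def arcsine_z(p1, n1, p2, n2):
    """z for the difference between two arcsine-transformed proportions."""
    phi1 = 2 * asin(sqrt(p1))
    phi2 = 2 * asin(sqrt(p2))
    return (phi1 - phi2) / sqrt(1 / n1 + 1 / n2)


# Communicator attractiveness: attractive vs. unattractive.
p_att, n_att = marginal(0, "attractive")
p_unatt, n_unatt = marginal(0, "unattractive")
z_attractiveness = arcsine_z(p_att, n_att, p_unatt, n_unatt)

# Target sex: female vs. male targets.
p_fem, n_fem = marginal(2, "female")
p_male, n_male = marginal(2, "male")
z_target_sex = arcsine_z(p_fem, n_fem, p_male, n_male)
```

The marginal proportions recovered from the cells (.41 vs. about .32 for attractiveness, about .42 vs. .31 for target sex) agree with those reported in the text.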

It should be noted that communicators ap- 
proached an average of 6.04 individuals in 
their efforts to deliver the persuasive message 
to the requisite number of targets. An analysis 
of the number of approaches that each com- 
municator made yielded no significant effects. 
Thus male and female and attractive and unattractive
communicators were equally successful
in gaining the audience of targets.⁴

Other measures. In addition to the signif- 
icant attractiveness effect on targets’ attrac- 
tiveness ratings, attractive communicators 
were perceived as somewhat friendlier than 
unattractive ones (p = .07). Finally, targets 
(unlike either videotape or photograph 
judges) rated female (vs. male) communica- 
tors as more attractive (p < .001). No other 
effects were significant on these measures, and 
no effects were obtained on targets’ ratings of 
communicator knowledgeability. 

asked to indicate their agreement (7-point scale) with
a variety of opinion statements and to judge whether
males or females were more expert with respect to
each opinion issue (7-point scale). The meat topic
was chosen because pilot subjects disagreed strongly
with the position advocated in the persuasive message
(M = 6.50) and considered males and females
equally expert on the issue (M = 4.08). Further,
male and female pilot subjects did not differ significantly
with respect to either their agreement or their
expertise ratings, ts(16) < 1.0, on this issue.

The field data generally substantiate the
physical attractiveness-persuasion relationship.
On both a verbal and behavioral measure,
target-subjects expressed significantly
greater agreement with attractive, rather than
unattractive, communicators. Although the
study was not designed to evaluate possible
psychological mechanisms underlying this
effect, the fact that targets’ perceptions
of communicator knowledgeability were unaffected
by attractiveness suggests that attractiveness
effects in persuasion are not typically
mediated by differential perceptions of credibility
(also see Norman, 1976; Snyder &
Rothbart, 1971). On the other hand, the finding
that targets’ perceptions of communicator
friendliness were marginally affected by attractiveness
is consistent with psychological
mechanisms such as identification (Kelman,
1961), social reinforcement, cognitive balance,
and classical conditioning.

Target sex also had a significant impact on
persuasion. Females expressed greater agreement
with the communicator’s message than
did males. While not anticipated, this finding
is consistent with two hypotheses recently
proposed by Eagly (1978). One hypothesis is
that heightened female influenceability may
be observed when the message topic is biased
against the expertise or interests of women.
In view of pilot-subjects’ ratings (see Footnote 3),
it seems unlikely that the sexes were
differentially expert on the meat topic. However,
while this study did not assess involvement
with the issue, recent findings obtained
by Eagly (Note 2) indicate that males may
be more interested in or involved with the
meat question, perhaps because they are
greater consumers of beef than are females.
Since greater involvement typically decreases
persuasion (e.g., Miller, 1965; Sherif & Hovland,
1961), the present findings are consistent
with the idea that male targets’ greater
involvement with the message topic made
them less willing than females to yield to the
persuasive message. Eagly’s second hypothesis
suggests that heightened female influenceability
may be observed in situations characterized
by the communicator’s physical presence
and surveillance, and may be a product of
females’ greater concern with maintaining
social harmony and insuring smooth interpersonal
relations. The fact that a sex difference
occurred in this experiment where communicators
presented their messages to targets during
face-to-face encounters is also compatible with
this hypothesis.

4 Of the 411 individuals approached by the study’s
68 communicators, 139 (34%) refused the communicators’
initial requests that they “complete an opinion
survey.”

As Eagly (1978) has noted, differences in 
interpersonal orientation as an explanation of 
sex differences in influenceability imply that 
the greater persuasibility of women may rep- 
resent surface agreement rather than genuine 
belief change. Yet, if this explanation is enter- 
tained, the meaning of the experiment’s per- 
suasion measures and of the observed attrac- 
tiveness effects is called into question. Did 
the greater “persuasion” induced by attrac- 
tive (vs. unattractive) communicators repre- 
sent genuine belief change or mere compli- 
ance? Although a compliance interpretation 
cannot be completely ruled out, at least two 
aspects of the field data favor the idea that 
communicator attractiveness did actually ef- 
fect genuine changes in targets’ opinions. 
First, unlike the petition-signing measure, 
communicators did not see targets’ verbal 
agreement responses. Targets were told that 
their opinions were confidential, and targets 
themselves inserted their completed question- 
naires in a ballot box provided by the com- 
municator. Yet this measure, which should 
have been less susceptible to compliance pres- 
sures, showed an attractiveness effect of the 
same (and slightly higher) magnitude as that 
shown on petition signing. Second, a compli- 
ance interpretation suggests that attractive 
(vs. unattractive) communicators should have 
been more successful in inducing compliance 
with their initial requests that potential targets
“complete an opinion survey.” However,
an analysis on the number of approaches made
by each communicator revealed no attractiveness
effect (p > .65).


Individual Differences Between 
Communicator-Subjects 


A second focus of the experiment was to
investigate whether the greater persuasiveness
of physically attractive communicators might
be understood in terms of differences between
attractive and unattractive individuals on 
dimensions relevant to communicator persua- 
siveness. The individual difference measures 
obtained during communicators’ laboratory 
participation were first submitted to a two- 
way (Attractiveness X Sex) multivariate 
analysis of variance and were then analyzed 
on a univariate basis. The multivariate anal- 
ysis yielded only an attractiveness main effect, 
F(36, 29) = 9.52, p < .001. However, the 
univariate analyses revealed that this effect 
attained significance or marginal significance 
on only a small proportion of variables. The 
remainder of this section presents the findings 
of these univariate tests. 

Communication skills. Analyses on the 
communication skills measured in this study 
revealed that compared to unattractive com- 
municators, attractive communicators were 
significantly more fluent speakers (p < .05) 
and had a marginally faster rate of speech 
(p= .07). Attractive and unattractive com- 
municators did not differ with respect to vocal 
confidence, gaze, or smiling. The only other 
effects on these variables were due to com- 
municator sex. Male (vs. female) communica- 
tors spoke faster (p < .05) and smiled less 
(p < .001) while delivering the message. 

Other attributes. Phares (1965) found 
that students who were more internal (vs. ex- 
ternal) in their locus-of-control orientation 
(Rotter, 1966) were more successful in influ- 
encing other students’ attitudes. And Miller 
(1970) has shown that attractive (vs. un- 
attractive) persons are perceived by others as 
being more internal. In the present study, an 
analysis of communicators’ locus-of-control 
scores yielded no significant effects. It should 
also be noted that locus of control was not 
significantly associated with an index of com- 
municator persuasiveness (sum of targets’ 
agreement scores), r = -.15, p < .20, although
the direction of the relationship is consistent
with Phares’ (1965) results.

On communicator-subjects’ self-reported 
SAT scores, both an attractiveness main ef- 
fect (p < .05) and a Communicator Attrac- 
tiveness X Communicator Sex interaction 
(p < .05) were obtained. Overall, attractive 
communicators reported higher SAT scores
than did unattractive ones. However, this
difference was most evident for female (vs. 
male) subjects. Attractive (vs. unattractive) 
communicators also reported marginally 
higher grade point averages (p = .11). 

Although attractive communicators tended 
to regard both themselves and their future 
more positively than unattractive communica- 
tors did on many (but not all) of the self- 
descriptive items, the attractiveness main 
effect reached significance or marginal sig- 
nificance on only a very few scales. Physically 
attractive (vs. unattractive) communicators 
tended to regard themselves as more persua- 
sive (p< .025), attractive (p< .10), and 
interesting (p< .10), and were somewhat 
more optimistic about obtaining an excellent 
job (p< .10). These analyses also revealed 
that female (vs. male) communicators viewed 
themselves as more moral (vs. immoral; p < 
.01) and indicated more optimism with respect 
to obtaining an excellent job (p< .05) and 
being a successful person (p < .05). Finally, 
a Communicator Sex X Communicator At- 
tractiveness interaction was obtained on com- 
municators’ speculations regarding their fu- 
ture contentedness (p< .05). Attractive 
females and unattractive males speculated 
that they would be more contented than dis- 
contented whereas unattractive females and 
attractive males speculated that they would 
be more discontented. 

These results indicate that attractive and 
unattractive individuals do differ on dimen- 
sions other than physical appearance. Attrac- 
tive communicators were more fluent speakers 
and faster speakers than their unattractive 
counterparts. Further, attractive communica- 
tors tended to report higher scores on two 
indices of educational accomplishment (grade 
point average, SAT scores) and described 
themselves somewhat more favorably along 
several dimensions (persuasiveness, attractive- 
ness, interestingness, optimism about getting 
an excellent job) that may tap aspects of self- 
concept. 

The remaining question is whether the dif- 
ferential persuasiveness of physically attrac- 
tive and unattractive communicators can be 
understood in terms of the observed differences
with respect to communication skills,
educational accomplishment, and components
of self-concept. Correlations between communicators’
scores on these measures and their
persuasive effectiveness were generally low
and nonsignificant (ps = .07 or higher). Although
the relationship between communicators’
attractiveness scores and persuasive effectiveness
was low to begin with (r = .20,
p = .05), it did attenuate somewhat when the
influence of these individual difference measures
was removed (r = .14, p = .14). These
results suggest that while the observed individual
differences do not provide a full explanation
for the observed effect of attractiveness
on persuasion, they may have contributed
at least partially to this effect. In any case,
it would seem premature to discount such individual
difference variables as correlates of
communicator persuasiveness or, by extension,
to discount the role of these variables in understanding
the attractiveness-persuasion relationship.
Research has demonstrated that
both fluent speech (McCroskey & Mehrley,
1969) and speech rate (Miller et al., 1976)
relate positively to persuasion. And although
little is known concerning the psychology of
the persuasive communicator (in contrast to
the psychology of the message recipient), it
seems reasonable that factors such as self-concept
and educational achievement, a frequently
used indicant of intelligence, should
contribute to one’s effectiveness as a social influence
agent.
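
The logic of removing the influence of a third variable from a correlation can be sketched with the first-order partial correlation formula. This is a simplified illustration: the analysis reported above partialled out several individual difference measures at once, whereas the sketch handles a single covariate, and the two covariate correlations used (.30 and .25) are hypothetical values chosen for the example. The sample size of 68 communicators is taken from Footnote 4; treating it as the n for these correlations is an assumption.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def t_for_r(r, n, n_covariates=0):
    """t statistic and degrees of freedom for a (partial) correlation."""
    df = n - 2 - n_covariates
    return r * sqrt(df) / sqrt(1 - r ** 2), df


# Zero-order attractiveness-persuasiveness correlation reported in the text.
r_attract_persuade = 0.20
t_zero_order, df_zero_order = t_for_r(r_attract_persuade, n=68)

# Hypothetical correlations of one covariate (say, speech fluency) with
# attractiveness and with persuasiveness; not values from the study.
r_attract_fluency = 0.30
r_persuade_fluency = 0.25

r_partial = partial_r(r_attract_persuade, r_attract_fluency, r_persuade_fluency)
t_partial, df_partial = t_for_r(r_partial, n=68, n_covariates=1)
```

When the covariate correlates positively with both variables, the partial correlation is smaller than the zero-order one, which is the pattern of attenuation described in the text.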



Conclusions and Implications


The present research indicates that physical
attractiveness can significantly enhance communicator
persuasiveness. A number of aspects
of the findings and the experimental
design are worth noting, since they suggest
that the attractiveness-persuasion relationship
may be fairly general. First, the attractiveness
effect on persuasion was obtained on both a
verbal and behavioral measure, and attractiveness
did not interact with either communicator
or target sex to affect agreement. Second,
the design featured multiple-stimulus persons
at every level of communicator attractiveness
and sex, and this nested communicator factor
accounted for no significant effects in
analyses of variance. Third, communicator-subjects
were not judged to be extreme in
physical attractiveness. Thus, the persuasion 
results were obtained using attractive and 
unattractive communicators who are probably 
quite representative of those whom individuals 
might encounter in their everyday lives. Last, 
the findings were obtained in a field setting 
where the salience of information about physical
attractiveness should approximate its
salience in genuine interpersonal situations
and where any implicit experimental demands 
on the subject to overutilize or underutilize 
attractiveness information should be minimal. 

Despite the study’s apparent success in 
demonstrating the attractiveness—persuasion 
relationship, a number of issues remain unre- 
solved and require further investigation. First, 
the interpersonal-orientation explanation of 
the observed sex differences in persuasibility 
raises the possibility that the enhanced persuasion
induced by attractive communicators
may have represented heightened compliance 
rather than genuine belief change. While the 
field data seem more compatible with a belief- 
change interpretation, compliance cannot be 
completely eliminated as an explanation. Re- 
search that manipulated communicator pres- 
ence and/or surveillance in conjunction with 
physical attractiveness would be quite useful 
in resolving this issue. A second issue concerns 
the psychological mechanisms underlying the 
attractiveness-persuasion relationship. The 
results of this experiment as well as those of 
some previous studies (Norman, 1976; Sny- 
der & Rothbart, 1971) suggest that credibility 
enhancement does not typically mediate the 
attractiveness—persuasion relationship. How- 
ever, mechanisms such as identification (Kel- 
man, 1961), social reinforcement, cognitive 
balance, classical conditioning, and perhaps 
others remain possible and require further 
evaluation. 

Finally, there is the question of why most
previous experiments have failed to demonstrate
the attractiveness effects on persuasion
observed here. One answer is suggested by the
fact that the attractiveness effects obtained
in this study, while statistically reliable, were
not very strong. If, as the magnitude of these
effects implies, the persuasive impact of attractiveness
is not particularly robust, the failure
of previous efforts to document such a rela- 
tionship becomes more understandable. One 
obvious implication of this explanation is that 
while physical attractiveness can be a signif- 
icant determinant of persuasion, it may not 
be a particularly important determinant. A 
second possible explanation stems from the 
laboratory nature of previous research. As 
suggested earlier, the implicit demands of the 
psychology laboratory may often lead subjects 
to adopt highly logical modes of cognitive 
functioning and, as a consequence, to under- 
utilize attractiveness information in forming 
their opinions. The ecological validity of the 
laboratory for studying the attractiveness— 
persuasion relationship requires further inves- 
tigation, perhaps through research that ex- 
plicitly incorporates social-influence setting 
(field vs. laboratory) as a feature of its ex- 
perimental design. 


Individual Differences Between Attractive 
and Unattractive Communicators 


The experiment also revealed differences be- 
tween attractive and unattractive communica- 
tor-subjects with respect to characteristics 
relevant to persuasive effectiveness (com- 
munication skills, educational accomplish- 
ment, components of self-concept). These re- 
sults provide preliminary evidence that in 
genuine interpersonal situations, attractive 
individuals may be more persuasive than un- 
attractive persons because they possess char- 
acteristics or skills that dispose them to be 
particularly effective communicators. In this 
regard, it should be noted that in a recent 
investigation of manipulative social influence 
among children, Dion and Stein (1978) 
found that attractive and unattractive chil- 
dren were both differentially successful and 
employed markedly different interaction styles 
in attempting to influence peers. 

Previous investigators have studied physical 
attractiveness in isolation from, rather than 
together with, its concomitant variables. The 
present findings and those of Dion and Stein 
(1978) underscore the utility of the latter
approach. Future research should address itself
to a more systematic investigation of individual
differences between the physically attractive
and unattractive with respect to
variables relevant to social influence. For ex- 
ample, research on the physical attractiveness 
stereotype indicates that attractive (vs. un- 
attractive) individuals are perceived by others 
as more likable, friendly, interesting, and 
poised (cf. Berscheid & Walster, 1974). Such 
attributes would seem of obvious advantage 
to someone wishing to influence others. Fur- 
ther research might attempt to assess whether 
physically attractive people actually behave 
in a more likable, friendly, interesting, and 
poised fashion with others—either because 
they possess these traits or, as Snyder, Tanke, 
and Berscheid (1977) have recently argued, 
because the stereotype-based expectations of 
others tend to elicit and maintain such be- 
haviors—and whether such behaviors can help 
to explain their greater effectiveness as agents 
of social influence. In addition to furthering 
our understanding of the relationship between 
attractiveness and social influence, such re- 
search would provide valuable information 
regarding the validity of the physical attrac- 
tiveness stereotype and would also increase 
our knowledge about communicator character- 
istics and behavioral styles and their role in 
social influence.


A Creative Personality Scale for the Adjective Check List


Harrison G. Gough 
Institute of Personality Assessment and Research 
University of California, Berkeley 


The Adjective Check List was administered to seven male and five female
samples comprising 1,701 subjects. Direct or inferred ratings of creativity were
available for all individuals. The samples covered a wide range of ages and
kinds of work; criteria of creativity were also varied, including ratings by
expert judges, faculty members, personality assessment staff observers, and life
history interviewers. The creativity scales of Domino and Schaefer were scored
on all protocols, as were Welsh’s A-1, A-2, A-3, and A-4 scales for different
combinations of “origence” and “intellectence.” From item analyses a new
30-item Creative Personality Scale was developed. It is positively and significantly
(p < .01) related to all six of the prior measures but surpasses them in
its correlations with the criterion evaluations.



Creativity is a valued commodity in every 
kind of human endeavor. Since the publica- 
tion of Guilford’s (1950) influential presiden- 
tial address to the American Psychological 
Association, an enormous amount of effort 
has been invested in the study of creativity 
and its determinants. One line of investigation 
within the larger domain of inquiry has been 
the search for methods of assessment that can 
identify creative talent and potential within 
the individual. Many of these studies have 
addressed cognitive issues and problem solv- 
ing. For example, Guilford and his colleagues 
(Guilford, Wilson, Christensen, & Lewis, 
1951) developed a series of tests stressing ingenuity,
the ability to overcome constraining
sets, and fluency in ideation. Mednick (1962)
proposed a method of assessment requiring 
the generation of remote associations for the 
solution of analogies. 

In regard to intellectual functioning, it 
should be noted that most studies have found 
intellectual ability as usually measured to be 
unrelated to criteria of originality. MacKinnon
and Hall (1972) obtained Wechsler Adult
Intelligence Scale (WAIS; Wechsler, 1958)
protocols from 88 architects, 37 research
scientists, 33 male mathematicians, and 27
female mathematicians who had also been
rated on creativity. Within each sample, subjects
were dichotomized into those with higher
or lower ratings. The higher rated subgroup,
taken from all four samples, had a mean full-scale
IQ of 133, and the lower rated subgroup
had a mean IQ of 131. The difference between
the means was not statistically significant,
nor were there any noteworthy differences in
range or dispersal of scores. In their study of
gifted students, Getzels and Jackson (1962)
found intelligence to play a smaller role than
personality in determining creativity, and
Taylor (1960) also found general intellectual
ability to be less important than special kinds
of thinking and motivational factors.

Requests for reprints should be sent to Harrison
G. Gough, Institute of Personality Assessment and
Research, University of California, Berkeley, California
94720.

Ordinary observations have long suggested 
that artistic temperament and aesthetic dispositions
are related to creative potential. A
landmark study of this hypothesis was that of
Barron and Welsh (1952) in which a nonverbal
figure-preference scale was introduced.
The original Barron-Welsh Art Scale and the
Revised Art Scale (Welsh, 1969, 1975), in
which like and dislike responses are balanced,
have repeatedly been shown to differentiate
between more and less creative persons in
various scientific, literary, and artistic fields
(Barron, 1972; Welsh, 1977). Another example
of measurement in the aesthetic domain
is the Hall Mosaic Construction Test (Hall,
1972).

Personal traits and dispositions have also
been examined in regard to creativity, using
both standard personality inventories and
specially developed scales and questionnaires.
Examples of this kind of work may be found
in the writings of Barron (1957, 1958),
Domino (1974), Gough (1956/1962, 1976),
Helson (1977), Kanner (1976), MacKinnon
(1962, 1965), and Stein and Heinze (1960).
An overall analysis of intellectual, aesthetic,
motivational, and other kinds of tests for
creativity has been published by Barron
(1965).

A particular topic within the realm of
studies of personality is that dealing with the
self-concept. The Adjective Check List (ACL;
Gough & Heilbrun, 1965) is an assessment
device intended for appraising views of the
self, and as might be expected it has frequently
been employed in investigations of
creativity (Cashdan & Welsh, 1966; MacKinnon,
1963; Schaefer, 1969). Several attempts,
in fact, have been made to develop creativity
scales for the ACL. Schaefer (Smith & Schaefer,
1969) identified 27 items that differentiated
between the responses of high school
boys rated as more and less creative. Only
one of these items (cooperative) was checked
more often by those with lower ratings; the
other 26 items were more often endorsed by
higher rated respondents. Follow-up studies
(Schaefer, 1972, 1973) indicated that the
scale retained its validity over time.

Domino (1970) asked faculty members of a
liberal arts college to identify all male freshmen
who had manifested creative ability.
[...] students were selected; these students
were then matched against 96 unnominated
controls on age, IQ, personal adjustment
as estimated from the Minnesota Multiphasic
Personality Inventory (MMPI), and
academic major. In the next year, faculty
members were asked to make a special effort
to observe the performance of all of these
students and to judge which ones had shown
evidence of creative ability; this procedure
was repeated in the third year. These steps
yielded a final sample of 59 creative males
and a control sample of 82. For each student,
at least three faculty raters completed descriptive
ACLs. Composites were formed for the
141 students by assigning a 1 to items checked
by two or more raters, and a 0 to items
checked by none or by only one observer. An
item analysis of the 300 adjectives in the list
identified 68 that differentiated between the
two subsamples at the .05 level of probability
or beyond. Fifty-nine of these were more
often used to describe the creative subsample
and 9 were used more often to describe the
controls. To simplify scoring and analysis,
Domino decided to base his scale solely on the
59 items associated with greater creativity.
Cross-validation of the scale on the self-report
protocols of new samples of males and females
produced correlations ranging from .24 to .45
with criterion classifications of creativity.
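
The composite rule described above (an item counts as 1 when two or more raters checked it, otherwise 0) can be sketched as follows. This is only an illustration under stated assumptions: the function name `composite_acl`, the `min_raters` parameter, and the four-adjective example are hypothetical and are not taken from Domino's materials.

```python
def composite_acl(rater_checklists, all_items, min_raters=2):
    """Return {item: 0 or 1}; 1 when at least `min_raters` raters checked the item."""
    counts = {item: 0 for item in all_items}
    for checked in rater_checklists:
        # Each rater contributes at most one check per adjective.
        for item in set(checked):
            if item in counts:
                counts[item] += 1
    return {item: int(n >= min_raters) for item, n in counts.items()}


# Hypothetical example: three raters describing one student on four adjectives.
raters = [
    {"original", "clever"},
    {"original", "cautious"},
    {"original", "clever", "sincere"},
]
items = ["cautious", "clever", "original", "sincere"]
composite = composite_acl(raters, items)
```

In this toy case only the adjectives checked by at least two of the three raters enter the composite with a value of 1.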

A third study in which item analyses were 
made of the ACL was that of Welsh (1975). 
Two dimensions were first defined, one deriv- 
ing from intellectual functioning and behavior 
and termed intellectence, and the other deriv- 
ing from originality and aesthetic sophistica- 
tion and termed origence. The interactive grid 
for these two dimensions permitted the spe- 
cification of four types of cognitive function- 
ing (Welsh, 1977). Type 1, high on origence 
but low on intellectence, was characterized by 
diffuse, global, and imprecise integration with 
little or no differentiation. Type 2, high on 
both origence and intellectence, was char- 
acterized by synthesis, organization, and the 
cathexis of metaphor. Type 3, low on both 
origence and intellectence, was characterized 
by fragmentation of elements and overatten- 
tion to details. Type 4, low on origence but 
high on intellectence, was characterized by 
analytic and logical preferences. The four 
types could be very briefly described as imag- 
inative, intuitive, conventional, and analytic 
in their cognitive styles. Adjective Check List 
scales were developed for each quadrant by 
contrasting the responses of individuals in 
that cell with those of individuals in the other 
three. The A-1 scale for respondents high on 
origence but low on intellectence contained 21 
items. The A-2 scale for respondents high on 
both axes contained 25 items. The A-3 and
A-4 scales contained 17 and 24 items, respectively.
Scoring on all four scales was by endorsement
only; no points were given for nonendorsement
of contraindicative adjectives.

During the late 1950s, the writer directed a 
study of creativity among research scientists 
(Gough & Woodworth, 1960). When the six 
ACL scales described above were applied to 
the 45 scientists in this study, rather disappointing
results were obtained. The correlations
with the criterion ratings of creativity
were .01 for Domino’s scale, .08 for Schaefer’s, 
and —.06, —.01, —.08, and .08 for the four 
measures developed by Welsh. Similar results 
were obtained when the six scales were scored 
on the self-report ACLs of the 57 male mathe- 
maticians studied by Helson and Crutchfield 
(1970). Correlations with criterion ratings of 
creativity were —.11, —.03, .05, —.08, .05, 
and —.08 for the six ACL scales, in the same 
sequence. Because of these inconclusive findings,
it was decided to undertake a new analysis
of the ACL, using larger samples and a
broader range of criteria, to see if a stronger 
measure could be developed. 


Method 
Samples 


Six male samples were available in which the ACL
[...] 1958), [...] a study of population psychology
(Go[...], 1973), and 25 males from a study of
environmental preferences (Craik, Note 1).

There were four samples of women for whom
ACL protocols and ratings of creativity were available.
The first was comprised of the 41 women
mathematicians studied by Helson (1971). The sec- 
ond included 335 graduate students in psychology 
at Berkeley, tested at the time of entry and rated 
between 3 and 4 years later by faculty members. 
The third was composed of the 51 college seniors 
reported on earlier by Helson (1967). The fourth 
was a composite sample of 126 women, including 
20 college sophomores from the study of career planning,
41 women from the study of population psychology,
25 women from the study of environmental
preferences, and 40 first-year students of law
(LaRussa, 1977).

In addition to these 1,631 subjects, there were 35 
males and 35 females for whom ACL protocols were 
available and who had been interviewed by two 
psychologists and described by them on the ACL 
and on Block’s (1961) 100-item California Q-set.
Although direct ratings of creativity were not obtained,
indirect estimates were derived from the
ACL and Q-sort portraits furnished by the interviewers,
in a way to be described below.


Tests and Criteria 


The ACL self-report protocols of all subjects were 
scored for the creativity scales of Domino and 
Schaefer and for Welsh’s four scales. Four kinds of 
criterion evaluations were utilized. For the archi- 
tects, male and female mathematicians, and research 
scientists, creativity was specified by ratings fur- 
nished by expert judges. The validity and reliability 
of these ratings are discussed in the papers already 
cited. Ratings by faculty members supplied the cri- 
terion for the engineering students, psychology grad- 
uate students, and college seniors. The psychology 
graduate students had entered over a 24-year period.
Faculty ratings were obtained every 3 or 4
years, covering students who had entered in the
preceding interval. Corrected interjudge reliabilities
for these ratings of creativity ranged from .73 to
.87, with a median of .77. Equivalent coefficients
were found for the engineers and college seniors.

The 256 males and 126 females seen at the institute
in intensive programs of assessment were rated
on creativity by panels of 10 or more observers.
Corrected interjudge reliabilities for these ratings 
were typically very high, ranging from .80 to .98. 

To develop a criterion for the 70 interviewed subjects,
the interviewers’ checks on five adjectives and
placement of five Q-sort items were summed. The
five adjectives were imaginative, insightful, intelligent,
original, and resourceful. There were four
Q-sort items given positive weighting: Has a wide
range of interests, Appears to have a high degree
of intellectual capacity, Thinks and associates to
ideas in unusual ways; has unconventional thought
processes, and Able to see to the heart of important
problems. One Q-sort item was assigned a negative
weight: Is uncomfortable with uncertainty and complexities.
The total score based on these 10 components
had a standardized item alpha reliability
coefficient of .79.
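
As a rough aid to reading the reliability figure above, standardized item alpha is conventionally defined from the number of components k and the mean inter-item correlation. The sketch below uses that standard formula to ask what mean correlation a 10-component composite with alpha = .79 would imply. The function names are hypothetical, and the source does not report the inter-item correlations, so the implied value is a back-calculation and not a reported result.

```python
def standardized_alpha(k, mean_r):
    """Standardized item alpha for k components with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def implied_mean_r(k, alpha):
    """Invert the formula: the mean inter-item correlation implied by a given alpha."""
    return alpha / (k - alpha * (k - 1))


# Ten components (five adjectives, five Q-sort items) and the reported alpha of .79.
mean_r = implied_mean_r(10, 0.79)
check = standardized_alpha(10, mean_r)  # recovers 0.79 up to floating-point error
```

Under these assumptions the 10 components would need an average intercorrelation of a little over .27 to produce the reported coefficient.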


Analyses 


For purposes of item analysis, four subgroups were
defined: (a) 558 males from the samples of architects,
mathematicians, scientists, engineers, and other
assessed males; (b) 530 male graduate students in
psychology; (c) 218 females from the samples of
mathematicians, college seniors, and other assessed
females; and (d) 335 psychology graduate students.
Ratings of creativity were converted to standard
scores by subsample. Point-biserial coefficients of
correlation were calculated between each of the 300
items in the ACL and the criterion standard scores.
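
The two analysis steps named above (standardizing ratings within a subsample, then correlating a checked/unchecked item with the standardized criterion) can be sketched as follows. This is a minimal illustration, assuming population standard deviations and toy data; the function names and the four-person example are hypothetical and do not reproduce the author's computations.

```python
from statistics import mean, pstdev


def standardize(ratings):
    """Convert raw ratings within one subsample to z scores."""
    m, s = mean(ratings), pstdev(ratings)
    return [(x - m) / s for x in ratings]


def point_biserial(checked, scores):
    """Pearson correlation between a 0/1 item and a continuous criterion."""
    n = len(checked)
    mx, my = mean(checked), mean(scores)
    cov = sum((x - mx) * (y - my) for x, y in zip(checked, scores)) / n
    return cov / (pstdev(checked) * pstdev(scores))


# Hypothetical subsample: whether each person checked one adjective,
# and that person's raw creativity rating.
checked = [1, 1, 0, 0]
ratings = [4, 3, 2, 1]
z = standardize(ratings)
r = point_biserial(checked, z)
```

Because the point-biserial coefficient is a Pearson correlation with one dichotomous variable, standardizing the ratings changes their scale but not the coefficient within a single subsample; standardization matters when subsamples with different rating scales are pooled.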


Results 


Thirty items were selected for inclusion in 
the Creative Personality Scale (CPS) on the 
basis of item analysis. Several examples may 
be given. The item egotistical yielded correlations
with the criterion ratings of creativity
of .17, .06, .08, and .08 in the four subgroups.
Although only the first of these coefficients is 
statistically significant (p < .01), all are posi- 
tive, and the item itself is consonant with 
prior conceptualizations of the creative per- 
sonality. The item also appears on Domino’s 
scale. The item original produced correlations 
of .13, .06, .17, and .11. Two of these coefficients
are statistically significant (p < .05),
and the word appears on both the
Schaefer and Domino scales. It is also quite 
clearly consonant with prior conceptualiza- 
tions of the creative personality. 

The item conservative had correlations with 
the criterion of —.16, —.08, —.13, and —.10. 
Two of these coefficients are significant at the 
.05 level of probability, and one is significant
at p = .06. Because of these findings and because
absence of the attribute appears to be
consonant with previous conceptualizations of
the creative personality, the item was retained
with a negative weighting. In this way, 18
positive and 12 negative items were selected
for the final scale. The positive items were
capable, clever, confident, egotistical, humorous,
individualistic, informal, insightful, intelligent,
interests wide, inventive, original,
reflective, resourceful, self-confident, sexy,
snobbish, and unconventional. The negatively
weighted items were affected, cautious, commonplace,
conservative, conventional, dissatisfied,
honest, interests narrow, mannerly, sincere,
submissive, and suspicious. In scoring a
protocol, 1 point is given each time one of the 
18 positive items is checked, and 1 point is 
subtracted each time one of the 12 negative 
items is checked. The theoretical range of 
scores is therefore from —12 to +18. 
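
The scoring rule just described can be written out as a short sketch. The item lists follow the 18 positive and 12 negative adjectives named above; the function name `cps_score` and the case and whitespace normalization are illustrative assumptions, not part of the published scoring instructions.

```python
POSITIVE = frozenset({
    "capable", "clever", "confident", "egotistical", "humorous",
    "individualistic", "informal", "insightful", "intelligent",
    "interests wide", "inventive", "original", "reflective",
    "resourceful", "self-confident", "sexy", "snobbish", "unconventional",
})

NEGATIVE = frozenset({
    "affected", "cautious", "commonplace", "conservative", "conventional",
    "dissatisfied", "honest", "interests narrow", "mannerly", "sincere",
    "submissive", "suspicious",
})


def cps_score(checked_adjectives):
    """Add 1 per positive item checked and subtract 1 per negative item checked."""
    checked = {a.strip().lower() for a in checked_adjectives}
    return len(checked & POSITIVE) - len(checked & NEGATIVE)


# Hypothetical protocol: two positive items and one negative item checked;
# "tall" is not a scale item and is ignored.
example = cps_score(["Original", "clever", "cautious", "tall"])
```

Adjectives outside the 30 scale items do not affect the score, so the same function can be applied to a full 300-item protocol.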

The 30 items were then compared with 
those in the previous six scales. Seven CPS 
items were found in Schaefer’s 27-item scale, 
all scored in the same direction. Fifteen CPS 
items were found in Domino’s 59-item scale; 
14 were scored in the same direction, the ex- 
ception being dissatisfied, which received a 
negative weight on CPS and a positive weight 
on Domino’s measure. Two CPS items were 
found to overlap with Welsh’s A-1 scale, 5 
with his A-2 scale, 1 with his A-3 scale, and 
2 with his A-4 measure. Except for dissatisfied 
on A-2, all of these common items were scored 
in the same direction. Because dissatisfied 
showed these two reversals, attention was re- 
directed to the item analytic data. On the four 
subgroups the item had correlations of —.02, 
—.07, —.10, and —.02. Additional analyses 
were then computed on the male mathema- 
ticians, the research scientists, and the college 
seniors, where the coefficients were —.17, 
—.36, and —.10. Because of these findings it 
was decided to retain the item with a negative 
scoring weight. 

The 30-item Creative Personality Scale 
(CPS) was then scored on all of the samples 
included in the present study. Alpha coeffi- 
cient reliabilities were computed on the four 
subgroups defined for the item analysis. The 
coefficients were .77 for the male composite 
group, .73 for the male graduate students, .81 
for the female composite group, and .73 for 
the female graduate students. Means and 
standard deviations for each sample are given 
in Table 1. Among the male samples, the high- 
est mean was attained by the research scien- 
tists, followed closely by the psychology grad- 
uate students. The lowest mean was that for 
the 35 males in the sample seen in interviews 
only. Among the females, the highest mean 
was that for the female graduate students, 
and the lowest was that for the women seen 
only in interviews. Where comparisons seem 
appropriate, as between the two samples of 
graduate students and the two samples of
interviewed subjects, CPS scores were significantly
(p < .05), albeit only slightly,
higher for males than for females. 

Table 2 gives the correlations among the 
seven ACL scales, computed on the samples of 
male and female graduate students in psy- 
chology. The median coefficients between 
CPS and each of the other measures were .68 
for Domino’s scale for creativity, .74 for 
Schaefer’s scale, .42 for Welsh’s A-1, .51 for
Welsh’s A-2, .32 for Welsh’s A-3, and .32 for
Welsh’s A-4. For the other measures, the high- 
est median correlations were those of .89 be- 
tween the Domino and Schaefer scales, .64 
between A-1 and the Domino and Schaefer 
scales, .82 between A-2 and the Domino and 
Schaefer scales, .64 between A-3 and A-4, and 
.64 between A-4 and the Domino and Schaefer
scales. The presence of the 12 negatively 
weighted items in CPS appears to be the rea- 
son why the other six scales correlate more 
highly among themselves than they do with 
CPS. 

Table 3 gives the correlations of the seven 
ACL scales with criterion ratings for creativity.
For all seven of the male samples, the
highest coefficient each time was that for CPS, 
and in six of the seven instances the coeffi- 
cients for CPS were significant at or beyond 
the .05 level of probability. The exception was 


Table 1

Means and Standard Deviations for the
Samples Indicated on the Adjective Check
List Creative Personality Scale

Sample                            n       M      SD
Male
  Architects                     124    5.28    3.86
  Mathematicians                  57    4.44    4.20
  Research scientists             45    5.98    3.71
  Psychology graduate students   530    5.96    3.86
  Engineering students            66    3.88    3.94
  Assessed males                 256    3.57    3.99
  Interviewed males               35    2.00    3.01
Female
  Mathematicians                  41    3.34    4.45
  Psychology graduate students   335    5.43    3.88
  College seniors                 51    5.10    4.24
  Assessed females               126    4.40    [?]
  Interviewed females             35    0.00    3.25

Table 2

Intercorrelations Among Seven Adjective Check
List Scales in Samples of 530 Male and 335
Female Graduate Students in Psychology

                                   Correlation
Scale                      2     3     4     5     6     7
1. Domino creativity
     Male                 .89   .64   .82   .51   .63   .69
     Female               .89   .64   .81   .54   .65   [?]
2. Schaefer creativity
     Male                 --    [?]   .76   .47   .55   [?]
     Female               --    [?]   .73   .46   .48   [?]
3. Welsh A-1
     Male                       --    .62   .51   .44   .45
     Female                     --    .62   .46   .35   .40
4. Welsh A-2
     Male                             --    .31   .43   .38
     Female                           --    .36   .42   .49
5. Welsh A-3
     Male                                   --    .69   .32
     Female                                 --    .61   .33
6. Welsh A-4
     Male                                         --    [?]
     Female                                       --    [?]
7. Creative Personality
     Male                                               --
     Female                                             --

Note. All coefficients are statistically significant
beyond the .01 level of probability. A-1 = Welsh
high origence, low intellectence; A-2 = Welsh high
origence, high intellectence; A-3 = Welsh low
origence, low intellectence; A-4 = Welsh low origence,
high intellectence.


the coefficient of .25 for the 45 research scien- 
tists, where p = .097 in a two-tailed test. The 
smallest coefficient for CPS was that of .15 
for the psychology graduate students, and the
largest was that of .42 for the 256 males rated 
by observers in the various assessments. On 


1 For purely normative purposes, Adjective Check
List protocols for 1,121 college females and 760 college
males from the author's research files were
scored for the new 30-item scale. The mean of 5.03
(SD = 4.01) for males was significantly higher (p < .01)
than that of 3.97 (SD = 4.34) for females. This
difference suggests that where raw scores on the new 
scale are used, analyses should be conducted sepa- 
rately for males and females. It should be noted 
that most studies with the ACL use scale scores that 
have been standardized by sex.

Table 3

Correlations of Selected Scales From the Adjective Check List With Criterion Ratings of Creativity

                                             Scale and correlation
Sample                        n      D       S      A-1     A-2     A-3     A-4     CPS
Males
  Architects^a               124    .35     .36     .16     [?]     [?]     [?]     [?]
  Mathematicians^a            57   −.11    −.03     .05    −.08     .05    −.08     [?]
  Research scientists^a       45    .01     .08    −.06    −.01    −.08     .08     .25
  Psychology graduate
    students^b               530    [?]     [?]     [?]     [?]     [?]     [?]     .15**
  Engineering students^b      66    .27*    .33*    .03     .26*    [?]     [?]     [?]
  Assessed males^c           256    [?]     [?]     [?]     [?]     [?]     [?]     .42**
  Interviewed males^d         35    .05     .13     [?]     [?]     [?]     [?]     .35*
Females
  Mathematicians^a            41   −.16    −.07     .12     .07    −.42**   [?]     [?]
  Psychology graduate
    students^b               335    .14**   .17**  −.02     .07     .02     [?]     [?]
  College seniors^b           51    [?]     [?]     [?]     [?]     [?]     [?]     [?]
  Assessed females^c         126    .29**   [?]     [?]     .22*    .06     .21*    .40**
  Interviewed females^d       35    .27     .34*    .05     .21     .07     .35*    .40*

Note. D = Domino creativity; S = Schaefer creativity; A-1 = Welsh high origence, low intellectence;
A-2 = Welsh high origence, high intellectence; A-3 = Welsh low origence, low intellectence; A-4 = Welsh
low origence, high intellectence; CPS = Creative Personality Scale.
a Criteria = ratings by expert judges.
b Criteria = ratings by faculty members.
c Criteria = ratings by assessment staff.
d Criteria = ratings by interviewers.
* p < .05.
** p < .01.

the first six male samples, higher coefficients
for CPS would be expected because these samples
were used in the item analyses. The 35
interviewed males, however, were not used in
these studies and therefore constitute a
cross-validating sample. In this sample
the only statistically significant coefficient was
that of .35 for CPS.

In the five samples of women, the coefficient
for CPS was the highest in four instances,
each statistically significant at the .05 level
[...]. The exception occurred in the sample
of women mathematicians. For these
subjects, Welsh’s A-3 scale had a statistically
significant (p < .01) coefficient of −.42, as
did his [...] scale, for which the coefficient was
[...]. The coefficient for CPS on this sample
was [...], p = .075 in a two-tailed test. The last
sample in Table 3, the 35 women
seen in life history interviews, constitutes
a true cross-validating sample. For these
subjects, there were three statistically significant
(p < .05) correlations between scales
and the derived criterion rating of creativity:
.34 for Schaefer’s scale, .35 for Welsh’s A-4
scale, and .40 for CPS.
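
For readers who want to check significance statements of this kind, a two-tailed p value for a correlation can be approximated from r and n alone. The sketch below uses the Fisher z transformation with a normal approximation; this is an assumption made for illustration, since the article does not say which test was used, and the function name is hypothetical. The approximation will differ slightly from an exact t test in small samples.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_two_tailed_p(r, n):
    """Approximate two-tailed p for H0: rho = 0 via the Fisher z transformation."""
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Coefficients and sample sizes quoted in the text for CPS.
p_scientists = approx_two_tailed_p(0.25, 45)      # 45 research scientists
p_interview_men = approx_two_tailed_p(0.35, 35)   # 35 interviewed males
p_interview_women = approx_two_tailed_p(0.40, 35) # 35 interviewed females
```

With these inputs the approximation places the coefficient for the research scientists between the .05 and .10 levels and the two cross-validation coefficients below .05, in line with the statements in the text.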

In the six male samples from which item- 
analytic data were drawn, the median coeffi- 
cient between CPS and the criterion ratings of 
creativity was .30, and in the small cross- 
validating sample of 35 males, the correlation 
between CPS and the inferred rating of cre- 
ativity was .35. A similar summary for the 
female samples yielded a median coefficient 
of .28 for the samples used in the item analy- 
ses and a coefficient of .40 for the small cross- 
validating sample. These samples covered a 
wide range of ages, kinds of work, and cir- 
cumstances of testing. They also involved the 
use of four vantage points in rating creativity:
expert judges, faculty members, personality-assessment
staff observers, and life-history
interviewers. If one-tailed tests of significance
are allowed, the new 30-item scale can claim a
statistically significant relationship with every
criterion in every sample. If the two-tailed 
test is used, 10 of the 12 validity coefficients 
for CPS were significant at the .05 level of 
probability or beyond. On the basis of these 
findings, it seems reasonable to conclude that 
the new scale is a reliable and moderately 
valid measure of creative potential and that it 
may properly be included among the scales to 
be scored on the Adjective Check List. 


Reference Note 


1. Craik, K. H. Impression of a place: Effects of 
media, context, and personality. In S. Saegert 
(Chair), Psychology of the Urban Environment. 
Symposium presented at the annual meeting of 
the American Psychological Association, Toronto, 
August 1978. (Copies available from the Institute
of Personality Assessment and Research, University
of California, Berkeley, California 94720.)


Conflict and Avoidance in the Helping Situation

S. Mark Pancer, Linda M. McMullen, Randal A. Kabatoff,
Kent G. Johnson, and Carole A. Pond


University of Saskatchewan, Saskatoon, Canada 


Two field studies examined avoidance behavior in individuals asked for a charit- 
able donation. Subjects were observed as they walked past a table placed in a 
tunnel leading to the university library. The first study showed that subjects 
tended to maintain a greater distance from the table when it was set up as a 
donation center than when it was not. In addition, subjects maintained greater 
distances when a female confederate sat at the table than when no one was at 
the table. Study 2 replicated the findings of Study 1. It found, in addition, that 
even greater distances were maintained from the table when a handicapped 
person sat at the table than when the person at the table was not handicapped. 
This avoidance behavior is discussed with reference to the kinds of conflict
inherent in helping situations.


The helping situation is fraught with conflict.
Usually there are strong normative pressures
to help those in need and to give aid to
those who have been innocently victimized
(see Harris & Meyer, 1973; Konečni, 1972). 
Coupled with these pressures, however, are 
equally strong forces that tend to inhibit 
helping. Research has shown, for example, 
that appeals for help often evoke reactance 
(Jones, 1970). In certain situations, helping 
may also expose the helper to danger or to the 
ridicule of others (see Latané & Darley, 
1976). Also, costs in time, effort, and money 
are often involved in giving help. 

One consequence of the conflict inherent in 
many helping situations may be what 
Schwartz (1970) refers to as a “redefinition” 
of the helping situation. The individual redefines
the situation to perceive himself/herself
as less responsible for helping or the situation
as less needful of help. This cognitive redefinition
of the helping environment serves to
reduce the press of social helping norms and 
thereby lessens the conflict intrinsic to such 
situations. Such redefinitions may often be 
difficult to accomplish, however, as demon- 
strated in the research of Darley and Latané 
(1968). They report anecdotal evidence of 
subjects under much emotional stress result- 
ing from not helping someone who was quite 
obviously in need of help. 

Another means of reducing helping-related 
conflict is simply to avoid such situations.
Canvassers for various charities (e.g., Wilsdon,
Note 1) have long noted a tendency in
potential donors to ignore the existence of the
charity worker or even to go some distance
out of their way to avoid the donation site.
Avoidance is psychologically perhaps the least
costly of conflict-reduction strategies, since it
spares the person from any cognitive distortions
that might have been necessary to redefine
the situation.

The purpose of the present study was to 
demonstrate the avoidance that can occur in 
helping situations and to determine some of
the factors that may influence the extent of
this avoidance. In the studies reported, subjects
were observed as they walked past a
table set up in a tunnel connecting the arts
and library buildings on a university campus.
It was hypothesized that subjects would show
greater avoidance of the area around the table
when it was set up as a donation center for a
well-known charity than when it was not set
up in this way.
Another variable examined was the presence
or absence of an experimental confederate at
the donation table. Research has shown that
personal (or even personalized) requests are
more effective in eliciting helping. This may
be true for a number of reasons. Anonymous
or impersonal requests (such as those represented
by placing an unattended donation
canister on a street corner) may not be perceived
to be as strong as personal requests.
Also, the person canvassing for donations may
serve as a helping model. It has often been
demonstrated that models serve to increase
helping, presumably by making helping norms
more salient (e.g., Bryan & Test, 1967). Although
personal requests often elicit more
helping, however, they may also produce
greater avoidance. They are even more likely
than impersonal requests to produce reactance
in the potential donor because of the strong
normative pressures implied. In addition, being
approached or asked for help by a strange
person may represent an invasion of one’s personal
space. Again, while this may produce
more helping (Baron, 1978; Baron & Bell,
1976; Konečni, Libuser, Morton, & Ebbesen,
1975), it is also likely to produce more avoidance.
The presence of a confederate at the
donation table, then, may serve to intensify
the conflicting demands present in the helping
situation. This may result in an increased level
of helping (and therefore a greater tendency
to approach the table) or greater avoidance
of the donation site.


Study 1 
Method 


Summary of design. The experiment was a 2 × 2
factorial. Subjects were monitored as they walked
past a table that did or did not contain a United
Way donation box (donation factor) and at which
there was or was not a female confederate seated
(person factor). The measure of avoidance was the
distance subjects maintained from the table as they
walked past.

Subjects. The subjects were 180 people (45 in 
each of the four experimental conditions) who were 
randomly selected as they walked through a tunnel 
that connected the main library to the arts building 
on the University of Saskatchewan campus. Only 
subjects walking alone from the library towards the 
arts building were monitored. 

Experimental site. Approximately halfway through 
the tunnel, a chair and a table had been positioned 
along one wall (see Figure 1). The tunnel floor from 
the outside edge of the table to the opposite wall 
was sectioned into six lanes, each lane measuring 76 
cm in width (approximately one body width). These 
lanes were marked by placing small pieces of black 
tape on a 2-cm expansion strip that ran the width 
of the tunnel, through the center of the table. Ex- 
perimental observers sat in a lounge area approxi- 
mately 6.5 m from the table, from which subjects 
could be clearly observed as they were both ap- 
proaching and passing the table. 

Procedure. The experiment was conducted during 
the afternoon of 1 day and the morning of the next. 
No subjects were observed within 10 minutes of 
breaks between classes, due to the large numbers 
of people using the tunnel at these times. This re- 
striction also prevented individuals who were in a 
particular hurry to go to (or away from) class from 
being used as subjects. Six volunteer undergraduate 
observers (three on each of the 2 days) sat un- 
obtrusively in the lounge area across from the table. 
As people approached the table from the library, 
each observer selected a subject who met the criterion
of being alone, made certain that none of the
other observers was monitoring that same subject,
and recorded the lane through which the subject
walked as he or she passed the table. As soon as a
recording had been made, the observer selected the
next subject walking alone and unmonitored by the
other observers. Each observer monitored the same
number of subjects under each of the four experimental
conditions. Prior testing had indicated greater
than 95% interrater agreement as to which lanes
subjects had used in passing the table.

On each of the 2 days, the conditions were ar- 
ranged in a different random order, and approxi- 
mately half the subjects in each condition were 
observed on each day of the study. In the two person-present
conditions, a female undergraduate confederate
sat on the chair behind the table. She made
no attempt to solicit donations but sat silently and
watched the activity in the tunnel, avoiding
eye contact with subjects who were being monitored.
In the no-person-present conditions, the chair
was empty.

In the two donation conditions, a donation
box (34 × 23 × 47 cm) was placed on the
table at the end closest to approaching subjects.
It was covered on all four sides [...] with
posters from the donation agency [...].
Each of these posters consisted of a white background
with 2 cm of black lettering on the top and
bottom and the agency’s logo in orange in the center.
The lettering on the top of each poster read “Thanks
to you it’s working,” and on the bottom it read
“United Way.” In addition, a scaled-down version
of the poster (17 × 25 cm) was placed behind the
box so that it was clearly visible to people walking
from the library. In the no-donation conditions,
the table was bare.

Figure 1. Experimental site.


Results and Discussion 


Mean distances maintained from the table
as subjects passed by are presented in Table
1.¹ Cochran’s test for homogeneity of variance
revealed significant differences among cell
variances (C = .366, p < .05). Distance
scores were therefore converted to reciprocals
before being submitted to analysis of variance.
Transformed scores appeared to be homogeneous
(C = .310, ns).
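
The variance check and transformation described above can be sketched as follows. Cochran's C is the largest cell variance divided by the sum of all cell variances; the reciprocal transform replaces each distance score by 1 divided by that score. The function names and the lane data below are hypothetical and serve only to show the computation; they are not the study's data, and the comparison of C against a tabled critical value is not reproduced here.

```python
from statistics import variance


def cochran_c(groups):
    """Cochran's C: largest sample variance divided by the sum of sample variances."""
    variances = [variance(g) for g in groups]
    return max(variances) / sum(variances)


def reciprocal(groups):
    """Reciprocal transform of every score (scores must be nonzero)."""
    return [[1.0 / x for x in g] for g in groups]


# Hypothetical lane numbers (1 = nearest the table) for four cells of a 2 x 2 design.
cells = [
    [1, 2, 2, 3, 4],   # donation, person present
    [1, 1, 2, 2, 3],   # donation, no person
    [1, 1, 2, 2, 2],   # no donation, person present
    [1, 1, 1, 2, 2],   # no donation, no person
]
c_raw = cochran_c(cells)
c_transformed = cochran_c(reciprocal(cells))
```

With four cells, C can range from .25 (all variances equal) to 1 (one cell carries all of the variance), which gives some context for the reported values of .366 and .310.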

Analysis of variance of the transformed 
data revealed a main effect for the donation 
factor, F(1, 176) = 6.03, p < .025. Subjects 
maintained a greater distance from the table 
when it displayed a poster and donation box 
than when no donation request was implied. 
A main effect was also obtained for the person 
factor, F(1, 176) = 13.46, p < .001, subjects 
maintaining a greater distance from the table
when the female confederate was present than
when no one was at the table. The Donation
× Person interaction was not significant, F(1,
176) = .40, ns.

Only three subjects made donations, two in
the person and one in the no-person condition.
Analysis of variance of the distance measure
excluding these subjects revealed the same
patterns of significant differences.

These results suggest that people do actively
avoid situations involving charitable
donations. The results further indicate that
personal appeals will produce more avoidance
than anonymous appeals. It is unlikely that
the presence of the confederate produced
more avoidance because of the normative pressures
implied or because the appeal was somehow
stronger; her presence produced more
avoidance in the nondonation conditions as
well as in the donation conditions. This suggests
that much of the avoidance may stem
from the invasion of personal space that can
occur when a personal appeal is made.


1 Subjects never used Lanes 5 or 6 in walking to
the arts building and only infrequently used Lane
4. This was most likely because the lanes with lower
numbers represented the shortest path to the
building.

Table 1
Distance Maintained from Table: Study 1

                              Person factor
Donation factor         Person present   No person present
Donation sought              1.71              1.33
No donation sought           1.42              1.20

Note. Distances are in terms of 76-cm units.


Study 2 


The second experiment was conducted to
replicate the findings of Study 1 and further
to examine differences between personal and
impersonal appeals in evoking avoidance behavior.

Study 1 did not control for the effects of sex 
of subject or the nature of the charity in- 
volved. It is possible that differences in avoid- 
ance behavior were the result of having differ- 
ent proportions of males and females in each 
of the cells. It may also be possible that subjects’
feelings about the particular charity involved
may have produced a different pattern
of avoidance than if another charity had been
used. For these reasons, sex of subject was
included as a factor in Study 2, and the appeal
was made on behalf of the United Nations
Children’s Fund (UNICEF), a different charity
from that used in Study 1.

It is often the case in appeals for donations
that members of the disadvantaged group act
as canvassers. While potential donors may be
more sympathetic to handicapped canvassers,
the research literature suggests that disabled
persons can also elicit greater avoidance.
For example, Acton, Bentley, Matheson, &
[...] (Note 2) found that subjects avoided
[...] at a disabled person more than at a
[...] person during the course of an
[...]. In a study by Kleck, Ono, and
Hastorf (1966), subjects interacting with a
handicapped individual terminated the interaction
sooner than did subjects interacting
with a nonhandicapped individual. In Study
2, subjects again walked by a table that did
or did not display a donation box and at which
a handicapped confederate, a nonhandicapped
confederate, or no one was seated. It was hypothesized
that greater avoidance would occur
when the handicapped individual sat at the 
table, less avoidance would occur when the 
nonhandicapped person was at the table, and 
least avoidance would result when no one was 
seated at the table. 


Method 


Summary of design. The experimental design was
a 2 X 2 X 3 factorial. Male and female subjects
(sex factor) were monitored as they walked past 
a table that was or was not set up as a center for 
donations (donation factor). Seated at the table 
was a handicapped person, a nonhandicapped per- 
son, or no one (person factor). Again, the dependent 
measure was the distance maintained by subjects as 
they walked past the table. 

Subjects. The subjects were 462 people (51 males 
and 26 females in each of six experimental condi- 
tions) who were randomly selected as they walked 
from the library to the arts building as described 
in Study 1. 

Experimental site. The site used was the same as 
in Study 1. The only change made was in the width 
of the lanes marked along the tunnel floor. The 
lanes had been reduced from 76 cm to 62 cm in 
width. This provided for finer distance discrimina- 
tions with no apparent loss in interrater reliability. 

Procedure. Subjects were observed during three sessions, in the morning and afternoon of one day and in the morning of the following day. Approximately the same number of subjects were observed in each of the three sessions. Two observers monitored subjects in the same manner as in Study 1. The experimental conditions included all six combinations of the donation factor and the person factor. These six conditions were arranged in a different random order for each of the three experimental sessions.

The three donation conditions were somewhat different from those in Study 1, since a different charity was being used. In these conditions a sign reading “Please donate to UNICEF” was hung on the end of the table facing oncoming traffic. A donation box (18 X 18 X 23 cm) was positioned on the table as in the first experiment. In the no-donation conditions, the table was bare.

In the handicapped-person conditions, a male confederate sat in a wheelchair near the end of the table that faced oncoming traffic. The wheelchair was backed against the wall and faced traffic at a 45° angle. Both the wheelchair and the donation sign were clearly visible to approaching subjects. In the nonhandicapped-person conditions, the confederate sat in an ordinary chair, positioned exactly as the wheelchair had been. In the no-person conditions, this chair was empty.


Table 2
Distance Maintained from Table: Study 2

                                  Person factor
Donation factor        Handicapped   Nonhandicapped   No person

Donation sought 2.08, 1,99, 1.308 

No donation sought 1.66, 1 Adee 1434 


Note. Distances are in terms of 62-cm units. Means
carrying different subscripts differ at the p < .05
level of significance.


Results 

Analysis of variance of the distance measure revealed no main effects for sex of subjects and no interactions involving this variable. Consequently, the data were collapsed across sex and reanalyzed using a 2 X 3 (Donation X Person) analysis of variance. Mean distances are presented in Table 2. Cochran’s test for homogeneity of variance again revealed significant differences among cell variances (C = .232, p < .05). Distance scores were therefore transformed as in Study 1 before being submitted to analysis of variance. Transforming the scores appeared to result in homogeneous cell variances (C = .212).

Analysis of variance of the transformed data revealed a main effect for the donation factor, F(1, 456) = 45.03, p < .0001. As in the first experiment, subjects maintained a greater distance from the table when it was set up as a donation station than when it was not. A main effect was also obtained for the person factor, F(2, 456) = 60.96, p < .0001. Planned comparisons showed that the greatest distances were maintained from the table at which the handicapped person was seated, intermediate distances were maintained from the table at which the nonhandicapped person was seated, and least distances were maintained from the table at which no one was seated.

Analysis of variance also yielded a Donation X Person interaction, F(2, 456) = 2.84, p < .06. Multiple comparisons of the cell means (indicated in Table 2) show that while the donation and no-donation means did not differ when there was no person at the table, they did differ significantly when either a handicapped or a nonhandicapped confederate sat at the table.
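The degrees of freedom in the F ratios above can be checked against the design: 462 subjects in a 2 X 3 (Donation X Person) layout after collapsing across sex. The sketch below is an illustrative bookkeeping aid, not the authors' procedure; the helper name `factorial_anova_df` is an assumption introduced here, and it handles only a fully crossed between-subjects design.

```python
import math


def factorial_anova_df(levels, n_total):
    """Degrees of freedom for a fully crossed between-subjects design.

    levels  -- mapping of factor name to its number of levels
    n_total -- total number of subjects
    Returns df for each main effect, for the highest-order interaction,
    and for the error term (subjects minus number of cells).
    """
    df = {name: k - 1 for name, k in levels.items()}
    df["interaction"] = math.prod(k - 1 for k in levels.values())
    df["error"] = n_total - math.prod(levels.values())
    return df


if __name__ == "__main__":
    # Study 2 after collapsing across sex: 2 donation levels, 3 person levels.
    print(factorial_anova_df({"donation": 2, "person": 3}, 462))
```

For these inputs the helper gives 1 and 2 df for the main effects, 2 df for the interaction, and 456 error df, which agrees with the F(1, 456) and F(2, 456) values reported in the text.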

Only a relatively small number of people donated money during the course of the experiment. Twelve people made donations in all: 5 when the handicapped confederate was present, 6 when the nonhandicapped confederate was present, and only 1 when no one was at the table. Naturally, these people were recorded as passing the table in Lane 1. An analysis that excluded these subjects yielded the same patterns of significant differences.


General Discussion 


These results provide support for the hypothesis that people will often avoid situations in which they are asked to give help. The main effects found in both studies for the donation factor suggest that the appeal itself, regardless of its impersonal or personal nature, elicits avoidance behavior. The main effects found for the person factor in both studies indicate that another element involved in producing avoidance on the part of potential donors may be the invasion of personal space that can occur when one is approached by an unfamiliar person seeking a donation. However, the Person X Donation interaction found in Study 2 suggests that there may be another important component operating in personal appeals for help. The fact that personal appeals (by a handicapped or nonhandicapped individual) yielded greater avoidance relative to the corresponding nondonation conditions than did nonpersonal appeals suggests that personal appeals represent more than an invasion of personal space. They may increase the strength of the appeal and thereby influence the donor’s perceived choice about whether or not to give. This would result in greater reactance and hence greater avoidance. Further support for this interpretation comes from the fact that the personal appeals, in addition to eliciting more avoidance, also produced more donations than the anonymous appeal.

Presumably, avoidance will be a function of the relative strength of the forces that motivate and inhibit helping. If those factors that motivate helping are much more powerful than those that inhibit helping, the individual will give help and suffer no conflict in doing so. If, on the other hand, the inhibiting factors are much stronger than the motivating factors, the individual will not help but, again, will suffer no conflict in doing so. Conflict (and hence avoidance) will be maximal when the forces that motivate helping and those that inhibit helping are equally strong.

These studies address an aspect of helping not often considered in helping research. Helping studies have typically been quite sparse in the kinds of dependent variables they examine. Often the amount of help given is the only dependent variable considered. The results of the current research indicate that in looking only at amount of helping, investigators may be ignoring many aspects of the helping situation. A knowledge of the amount and kind of avoidance produced, for example, would aid greatly in understanding why people are often so reluctant to give help. Such considerations might also lead to more effective appeals, in which helping entails less conflict and aversion for the potential helper.


The Importance of Consistency of Modeling Behavior Upon
Imitation: A Comparison of Single and Multiple Models


Peter A. Fehrenbach, David J. Miller, and Mark H. Thelen 


University of Missouri—Columbia 


The present study was designed to assess the differential effects of single versus 
multiple adult models on children’s expression of preferences when the modeled 
preferences are either consistent or divergent. Second- and third-grade children 
were exposed to either a single model who made two choices per trial or to 
two models (multiple models) who made one choice each per trial. The modeled 
choices were either identical (consistent) or different (inconsistent) from each 
other on every trial of two four-option preference tasks. A single-response con- 
trol condition was included, in which a single model made one choice per trial. 
As expected, children who observed multiple models whose choices were con- 
sistent (multiple-consistent) imitated more than any other group. Children who 
were exposed to two models whose choices differed from each other on every 
trial (multiple-inconsistent) showed little imitation of either model. An inter- 
mediate amount of imitation was observed for the single model conditions, with 
the single model who made two identical choices per trial (single-consistent)
resulting in more imitation than the single model whose choices were inconsistent
on each trial (single-inconsistent). The single-consistent group and the
single-response control group did not differ in imitation, but the two inconsistent
modeling groups showed significantly less imitation than the control group.


Research on the effects of modeling typ- 
ically has been conducted within a method- 
ological framework in which a subject observes 
the behavior of a single model. Yet in natural- 
istic settings, individuals often observe the 
responses of more than one person to a given
situation. This phenomenon has been called
multiple modeling and may involve similar or 
divergent responses on the part of the models 
(Bandura & Walters, 1963). The present 
study was designed to assess the differential 
effects of single versus multiple adult models 
on children’s imitation of preferences when 
the modeled preferences are either consistent 
or divergent. 

Previous research suggests that multiple models are more likely than a single model to facilitate maintenance of reductions in the fear of dogs in children (Bandura & Menlove, 1968). However, in that study the multiple modeling subjects viewed the models interacting with different dogs, whereas the single model interacted with only one dog. Thus, the number of observed models and the number of feared stimuli were confounded. With such a confound removed, greater reductions in snake avoidance (Kazdin, 1974) and larger increases in self-reported assertive social skills (Kazdin, 1975, 1976) have been reported when multiple covert modeling procedures rather than single covert modeling procedures were used.

Other research has examined the effects of live multiple modeling cues on children’s standards of self-reward, sharing, and resistance to temptation. In support of the notion that consistently deviant modeling cues operate additively, Allen and Liebert (1969) found that exposure to two deviant adult models weakened children’s adoption of previously learned standards of self-reward more than exposure to a single deviant model. Liebert and Fernandez (1970) found that nursery school boys were more likely to imitate specific norms for sharing when multiple models demonstrated the norm as compared with a single model.

In making a direct comparison of a single 
model with two models, Allen and Liebert 
(1969) and Liebert and Fernandez (1970) 
confounded the number of models with the 
number of exposures to the modeled behavior. 
In both studies the single model condition consisted of essentially one exposure to the deviant or generous model’s behavior, whereas in the multiple modeling condition subjects observed twice the number of deviant or generous responses. The greater behavioral change observed following multiple modeling may have been simply a function of increased exposure to the deviant or generous behavior, regardless of the number of models.

Other support for the hypothesis that modeling cues operate additively comes from McMains and Liebert (1968), who have shown that children exposed successively to two models, one of whom behaved in accordance with rules (strict model) and the other in a deviant fashion (lenient model), showed more rule violation than children exposed to two strict models and less rule violation than children who observed two lenient models. Wolf (Note 1) obtained similar effects of multiple models upon children’s resistance to temptation.

Two methodological considerations limit the generalizability of the findings of McMains and Liebert (1968) and Wolf (Note 1) with respect to the effects of discrepant multiple models upon imitation. First, in many situations in which multiple models demonstrate discrepant behaviors (e.g., in the expression of preferences or judgments), the observer has the option of not imitating either model in addition to the option of imitating one or both of the models. In the McMains and Liebert (1968) and Wolf (Note 1) studies, nonimitation was impossible to the extent that the subject’s violation of the rule could be viewed as indicating imitation of the lenient model and adherence to the rule as imitation of the strict model. Second, in both studies, the child’s behavior was measured after the modeling manipulations and when the child was alone. Under conditions where a child must respond in the presence of models, it is more likely that the observer will anticipate the social consequences of imitating or not imitating. Thus, when two models are divergent in their behaviors, the observers may perceive imitation of one model as offensive to the other, nonimitated model. Given the option of nonimitation, such perceptions could result in lower levels of imitation of either model than would be predicted by the hypothesis that modeling cues operate additively.

Different kinds of social factors seem to be 
operating when an individual responds in the 
presence of a single model. In this case the 
specific characteristics and behavior of the 
single model are more important in determin- 
ing imitation. For example, an adult model 
who demonstrates consistency in preferences 
is likely to be viewed as more certain and de- 
serving of emulation than an adult who is in- 
consistent or vacillating in his or her prefer- 
ences. 

The present investigation differed method- 
ologically from previous studies on multiple 
modeling in several ways. The experimental 
tasks involved the expression of preferences, 
rather than self-initiated standards of be- 
havior, and were performed in the presence of 
the model(s). The number of exposures to 
modeled behavior was controlled by having 
the single model make two responses, the same 
number of responses that occurred in the 
multiple modeling conditions where two 
models responded once each. Finally, a 2 X 2 
factorial design was employed with the num- 
ber of models (single vs. multiple) and type 
of modeling (consistent vs. inconsistent responses) as the main factors. A single-model, single-response control group was also included.

It was expected that the effects of multiple modeling on imitation of preferences would depend upon whether the models were consistent or divergent with each other in their choices. When both models expressed the same preference, high levels of imitation were expected. When the models expressed preferences different from each other, very little imitation of either model was expected. An intermediate amount of imitation was predicted for single model subjects, the inconsistent single modeling condition resulting in less imitation than a consistent model.


Method 
Subjects 


The subjects were 80 second- and third-grade chil- 
dren (40 female and 40 male) from a primarily 
middle-class public elementary school. The experi- 
menter was a senior undergraduate male, and the 
models were two female undergraduates. The sub- 
ject pool was counterbalanced across the four ex- 
perimental conditions and one control condition with 
respect to sex, grade, seating position of the model(s), 
the order in which the models responded in the 
multiple modeling conditions, and the model con- 
federate utilized in the single modeling conditions. 
Each of the five conditions contained 16 subjects. 


Tasks and Materials 


Two limited-option preference tasks were used in this study. The colors task involved ten 5 X 8 inch index cards with four 1-inch color squares attached horizontally across the cards. The color squares varied only slightly in shades of the same basic color on any given card.

The faces task consisted of 10 emotionally expressive faces, each mounted on a 5 X 8 inch index card. Below each face were typed four nonsense syllables, which were three-letter consonant-vowel-consonant trigrams with an associative value of less than 15% (Archer, 1960).


Procedure 


The experimenter escorted the subject individually to the experimental room, and the model arrived shortly after the subject. In the multiple modeling conditions, the second model arrived shortly after the first model. The models were introduced to each other and to the experimenter in order to decrease suspicion on the part of the subject that the models knew each other or the experimenter. The subject and the models sat across a table from the experimenter, the subject being seated between the two models in the multiple modeling conditions. The experimenter then explained that he was seeking the assistance of students and teacher aides in deciding which colors to paint some classrooms.

In both of the single model conditions, the model and subject were told that each would have a chance to pick a color. In order to determine who went first, the experimenter drew their names from a box. The model’s name was always selected first, and she was told that she could therefore go twice before the subject made a selection from the colors. Both model and subject were told that each choice they made could be the same or different from the previous choices.

The experimenter then administered 10 trials on 
the colors task. On each trial in the single-consistent 
condition, the model always pointed on her second 
chance to the same color square that she had se- 
lected on her first turn. In the single-inconsistent 
condition, the model always selected a different color 
square on her second turn. In all conditions the 
model’s choices were randomly determined. 

After 10 trials on the colors task, the experimenter 
told the subject and model that he was writing a 
children’s book and needed their help in picking 
“crazy names” to go with faces. They were asked 
to pick the name they thought went with the face.
Ten trials on the faces task were then administered. 
As on the colors task, the model in the single-con- 
sistent condition pointed to the same name twice. 
In the single-inconsistent condition the model picked 
two different names on each trial. 

In both multiple modeling conditions, the models and subject were told that each would have a chance to pick colors. Again, the experimenter drew names from the box to determine the order to be followed in making their choices. The models’ names were always drawn before the subject’s in a prearranged, counterbalanced order. The rest of the instructions for both tasks were essentially the same as in the single model conditions, except that both the models and subject were told that they could make one choice of colors (or names) per trial. On each trial in the multiple-consistent condition, the second model always expressed the same preference as the first model. In the multiple-inconsistent condition, the second model always selected a different color or name from the first model.

The order of responding in the single-response control condition was established in the same way as in the two single modeling groups. However, the model was instructed to go first and to make only one choice per trial.


Results


The number of matched responses on the colors task and the faces task and the sum of the matched responses on colors plus faces were used as the main measures of imitation. In general, the results of the analyses of the separate tasks are consistent with those obtained when the tasks were combined. Therefore, only the results of colors plus faces will be reported. Preliminary analyses indicated no significant effects on any of the dependent measures with respect to model confederates, seating position of the model(s), or the order in which the multiple models made their choices. Therefore, the data were pooled across these variables in subsequent analyses.

Table 1 presents the means and standard deviations of the number of matched responses on colors plus faces tasks for the five groups. Inspection of this table reveals that the predicted interaction of number of models and consistency of modeled responses was obtained. To test the reliability of this finding, a 2 X 2 X 2 X 2 analysis of variance was performed, the factors being number of models (single vs. multiple), type of modeling (consistent vs. inconsistent), grade of subject (second vs. third), and sex of subject. A significant main effect was obtained for both number of models and type of modeling, but the predicted interaction of number of models and type of modeling was also highly significant, F(1, 48) = 20.71, p < .001.
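The interaction of number of models and type of modeling can be read directly off the four experimental cell means in Table 1 as a difference of differences. The sketch below only illustrates that arithmetic; it is not the analysis of variance itself. The names `simple_effect` and `interaction_contrast` are assumptions introduced here, and the means are the tabled values as they appear in this copy of the article.

```python
# Mean matched responses (colors plus faces), keyed by
# (number of models, type of modeling), from Table 1.
MEANS = {
    ("multiple", "consistent"): 10.56,
    ("multiple", "inconsistent"): 2.06,
    ("single", "consistent"): 4.63,
    ("single", "inconsistent"): 2.56,
}


def simple_effect(means, n_models):
    """Consistent minus inconsistent mean for one level of number of models."""
    return means[(n_models, "consistent")] - means[(n_models, "inconsistent")]


def interaction_contrast(means):
    """Difference between the two simple effects of consistency."""
    return simple_effect(means, "multiple") - simple_effect(means, "single")


if __name__ == "__main__":
    print(f"multiple models: {simple_effect(MEANS, 'multiple'):.2f}")
    print(f"single model:    {simple_effect(MEANS, 'single'):.2f}")
    print(f"interaction:     {interaction_contrast(MEANS):.2f}")
```

The consistency effect is about 8.50 matched responses with multiple models and about 2.07 with a single model, so the contrast of about 6.43 is the pattern that the significant interaction term reflects.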
Two-tailed t tests indicated that, as predicted, more imitation occurred in the multiple-consistent group than in the single-consistent condition, t(48) = 5.92, p < .001. Also as predicted, the single-consistent condition resulted in more imitation than the single-inconsistent condition, t(48) = 2.01, p < .05. Finally, although the means are in the predicted direction, the mean for the single-inconsistent group was not significantly greater than the mean for the multiple-inconsistent group.


Table 1
Means and Standard Deviations of the Total
Number of Matched Responses on the Colors
Plus Faces Tasks

                          No. of matched responses
Condition                      M        SD
Multiple-consistent          10.56     4.93
Multiple-inconsistent         2.06     2.32
Single-consistent             4.63     2.42
Single-inconsistent           2.56     2.00


In summary, multiple models who were consistent with each other in the expression of preferences more strongly influenced the children to imitate than a single model who made two consistent choices. However, multiple models who showed divergent preferences were imitated very little. These predicted results were obtained despite the fact that the probability of obtaining matching responses by chance alone was higher in the inconsistent conditions, where two of the four options were modeled (p = .50), than in the consistent conditions, where chance imitation was only one out of four (p = .25). Imitation was significantly greater than chance in the multiple-consistent condition, t(15) = 3.92, p < .01, but significantly less imitation than that expected by chance alone was obtained in the multiple-inconsistent and single-inconsistent conditions, t(15) = -13.67, p < .001, and t(15) = -14.89, p < .001, respectively.
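The comparison with chance can be approximated from the summary statistics alone. The sketch below is an illustration under stated assumptions, not a reconstruction of the authors' computation: it assumes that the chance expectation is the per-trial matching probability multiplied by the 20 trials (10 colors plus 10 faces), that each group has 16 subjects, and that a one-sample t test was run against that expectation. The function name `t_against_chance` is introduced here for illustration.

```python
import math

N_TRIALS = 20      # 10 colors trials plus 10 faces trials
N_PER_GROUP = 16   # subjects per condition

# (mean, SD, assumed chance probability of a match on a single trial)
GROUPS = {
    "multiple-consistent": (10.56, 4.93, 0.25),
    "multiple-inconsistent": (2.06, 2.32, 0.50),
    "single-inconsistent": (2.56, 2.00, 0.50),
}


def t_against_chance(mean, sd, p_chance, n=N_PER_GROUP, trials=N_TRIALS):
    """One-sample t statistic comparing a group mean with chance matching."""
    expected = p_chance * trials          # matches expected by chance alone
    standard_error = sd / math.sqrt(n)    # standard error of the group mean
    return (mean - expected) / standard_error


if __name__ == "__main__":
    for name, (mean, sd, p_chance) in GROUPS.items():
        t = t_against_chance(mean, sd, p_chance)
        print(f"{name:>22}: t({N_PER_GROUP - 1}) = {t:.2f}")
```

Under these assumptions the two inconsistent conditions come out close to the reported values (about -13.69 and -14.88), whereas the multiple-consistent condition gives about 4.51 rather than the reported 3.92. That discrepancy may reflect rounding, a different computation, or errors in the tabled figures as they appear in this copy.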

An unexpected three-way interaction of number of models, type of modeling, and sex of subject was obtained, F(1, 48) = 4.02, p < .05. Separate comparisons of males and females in all four experimental groups revealed that the only significant sex differences occurred in the multiple-consistent condition, where the female subjects imitated more than males, t(48) = 4.50, p < .001.

Finally, nonorthogonal planned comparisons of the single-response control condition with the four experimental groups were performed. The multiple-consistent condition evidenced more imitation than the control group, t(75) = 6.28, p < .001 (all t tests are two-tailed). The single-consistent group did not differ in imitation from the single-response control group. However, the multiple-inconsistent group (M = 1.06) imitated less than the control group (M = 2.31) on faces, t(75) = -2.27, p < .05; there was only a trend in the same direction on colors plus faces, t(75) = -1.86, p < .10. Similarly, subjects in the single-inconsistent condition (M = 1.06) imitated less than control subjects (M = 2.31) on faces, t(75) = -2.02, p < .05; a trend in the same direction was obtained for colors plus faces, t(75) = -1.38, p < .20.

In summary, under the present conditions 
it appears that exposure to a single model 
making two consistent responses does not in- 
crease imitation over that obtained after ob- 
serving a single model responding once. The 
likelihood of imitation does increase, however, when multiple models respond consistently with each other. While it thus appears that multiple models are required to increase imitation, imitation may be decreased by inconsistent modeling, regardless of the number of models. This is evident by the finding of equivalent reductions in imitation in both inconsistent groups relative to the single-response control group and the single-consistent group.


Discussion 


The results provide support for the notion that observing multiple models can increase imitation of preferences in children significantly over that obtained after observing a single model. These results were obtained
when the number of exposures to modeled be- 
havior was equated in both the single and 
multiple model conditions. Previous studies 
comparing single and multiple modeling have 
failed to control for the number of exposures 
to modeled behavior. However, the results 
also strongly suggest that a greater degree of 
imitation with multiple models can only be 
expected when the multiple models present 
consistent behaviors to imitate. When multiple 
models appear to be in conflict or are other- 
wise divergent in their behavior, very little 
imitation of either model can be expected. 
The finding of a dramatic attenuation of 
imitation in the multiple-inconsistent condi- 
tion warrants some explanation. While much 
of the research on imitation has emphasized 
the unidirectional impact of modeling on the 
subsequent behavior of a naive subject, more 
recent studies have considered the reciprocal- 
influence process involved in the ongoing 
model—imitator interaction. It has been found, 
for example, that being imitated is reinforcing 
to the model and leads to reciprocal imitation 
and to increased attraction on the part of the 
model toward the imitator (Thelen, Dollinger, 
& Roberts, 1975). Given that the effects of 
being imitated are often positive, it is likely that children who observed two high-status adults (supposedly teacher’s aides) expressing different preferences were conflicted regarding which one to imitate. Children in the second and third grades may realize the possible negative social consequences (e.g., lowered attractiveness) of choosing to imitate or con[...]

[...] conditions were not significant. These results, considered in conjunction with the finding that both inconsistent conditions resulted in less imitation than the two consistent conditions, indicate that the crucial variable accounting for the low levels of imitation in the multiple-inconsistent condition may be the inconsistency of modeling cues per se, rather than social interaction variables associated with the number of models.

A third explanation for the low levels of imitation in the multiple-inconsistent condition is that the children were actually imitating the nonimitative (divergent) behavior of the second model. The second model was thus demonstrating oppositional behavior toward the first model and giving tacit acknowledgment that the child could similarly make a choice different from that of the first model. Brehm (1977) found that when one adult’s attempt to influence children’s choice preferences was followed by a second adult’s suggestion that they should choose whatever they wanted, first graders (but not fifth graders) acted in opposition to the first adult. Perhaps simply observing conflict in preferences among high-status adults disposes a child not to imitate either adult’s choices. The potential effects upon socialization resulting from a child’s receiving conflicting modeling cues from such significant others as parents remain to be determined in research conducted in more naturalistic settings than those employed in the present study.

With respect to the single model conditions, it appears that doubling the number of exposures to modeled choices in the single-consistent conditions did not result in more imitation than the single-model, single-response control condition. However, when the single model made two divergent choices, the children were less likely to imitate either choice than when the model made the same choice twice. The difference in imitation between these two single model groups may have been in part a function of the lowered esteem in which the inconsistent model was held relative to the consistent model. Or, as mentioned earlier, the inconsistency of modeled information (cues) may result in reduced imitation independent of social factors.

This study presents strong support for the hypothesis that multiple models are more effective in producing imitative behavior than a single model. This finding is only supported, however, when the models behave consistently with each other. Oppositional behavior or low levels of imitation or compliance are likely to result when two models conflict in their behavior or when a single model behaves inconsistently. Further research conducted in more naturalistic settings is necessary to determine whether inconsistency in modeling cues presented by either single or multiple models significantly affects socialization. For example, if parents in marital conflict provide divergent modeling cues, what effect might this have on their children’s behavior?


Reference Note 


1. Wolf, T. M. The effects of multiple modeling on resistance to deviation. Paper presented at the biennial meeting of the Society for Research in Child Development, Santa Monica, California, April 1969.


Self-Serving Attributional Biases:
Perceptual or Response Distortions?


Gifford Weary 
Ohio State University 


The present article examines Miller’s analysis of what constitutes a self-serving
attributional bias. It is argued that his delineation of different types of self-serving
attributions is not supported by the empirical evidence collected to date
and that what previous authors (e.g., Miller & Ross) have viewed as a perceptual
bias in the causal inference process may be better seen as a response bias
or as a strategic self-presentation designed to maximize public esteem.


In Miller’s (1978) recent reply to my re- 
examination of the notion of self-serving 
biases in the causal inference process (Brad- 
ley, 1978),* he discusses the merit of my 
broadened self-serving bias formulation and 
addresses the important issue of what consti- 
tutes a self-serving attribution. While an ex- 
amination of this latter issue is long overdue, 
I will argue in the present article that the 
analysis presented by Miller serves only to 
obfuscate issues in an already confusing area 
of inquiry. Moreover, I will contend that in 
his reply, Miller (a) incompletely summarized 
my formulation of self-serving attributional 
biases and (b) failed to address important 
criticisms raised in my examination of Miller 
and Ross’s (1975) information-processing
analysis of data that have been interpreted as 
reflecting motivational biases. 

In my review (Bradley, 1978) of the empirical evidence relevant to
the proposition that self-serving biases modify attributions of
causality, I presented a broadened formulation to account for
seemingly counterdefensive attributions (i.e., attributions
indicating greater acceptance of responsibility by the attributor
for negative than for positive outcomes attendant upon his/her
behavior). Specifically, I suggested that self-serving attributions
may be viewed as public self-presentations designed to maximize
public esteem and that under certain conditions such esteem needs
may be best served by accepting responsibility for negative
outcomes.


The author thanks John H. Harvey for his helpful
comments on an earlier draft of this paper.

* Some of the author’s previous publications have
appeared under the name “Gifford Weary Bradley.”

Requests for reprints should be sent to Gifford
Weary, Department of Psychology, Ohio State University,
164 West 19th Avenue, Columbus, Ohio 43210.

While Miller (1978) feels this broadened formulation “has the virtue
of parsimony” (p. 1221), he also states that it “seems to vitiate
the virtue of parsimony by being overinclusive and allowing any
asymmetrical attributions for success and failure to be interpreted
as being self-serving in nature” (p. 1221). It is important to note,
however, that certain major conditions under which one would expect
esteem motives to lead to defensive versus counterdefensive
attributions were specified in my earlier article. It was pointed
out that in all of the studies that have offered support for
counterdefensive attributional processes, there has existed the
opportunity to compare simultaneously the subject’s self-attribution
for his or her own task performance with his or her subsequent
behavior on a similar task or with another’s causal attribution for
the subject’s outcome. I suggested (1978, pp. 64-66) that the
probable simultaneous comparison of and potential inconsistency
between one’s description of causality and (a) one’s subsequent
behavior and/or (b) another’s description of causality for one’s
performance outcomes may prevent an individual from taking too much
credit for success or denying blame for failure. In view of this
reasoning, I contend that the broadened formulation is not
“overinclusive” and is not invulnerable to disconfirmation.

In an attempt to define a self-serving attribution, Miller (1978)
suggests that attributions for positive and negative outcomes can
represent either distortions in perception of causality or
distortions in descriptions of causality. That is, “they can serve
to protect or enhance how the person perceives himself or herself”
or “they can serve to protect or enhance how others perceive the
person” (p. 1221). Moreover, Miller argues that my analysis of
self-serving attributional phenomena has obscured two distinct
psychological processes, because I use the term self-serving to
refer to attributions that serve one’s public image, whereas
previous authors have used this term to refer to attributions that
serve one’s private image.

There are several problems associated with this statement. First, as
Miller (1978) himself admits, there is little evidence to support
the proposition that “we are prone to alter our perception of
causality so as to protect or enhance our self-esteem” (p. 1222). In
most of the studies reviewed by Miller and Ross (1975) and myself
(Bradley, 1978), subjects’ performances and causal interpretations
were public in nature; consequently, subjects’ causal ascriptions
for their positive and negative outcomes likely reflected
“distortion in descriptions or self-presentations, not perceptions”
(Miller, 1978, p. 1222).1 Second, the preceding statement should not
be taken to imply that private self-esteem needs are not influenced
by distortions in descriptions of causality. For example, if the
self-serving bias effect is mediated by a desire to maintain or gain
a positive public image, then positive evaluations from others may
serve to enhance the esteem one feels for oneself. Conversely,
negative evaluations from others may threaten the individual’s
positive, private self-image. Based upon the empirical evidence
collected to date, then, what previous authors (e.g., Miller & Ross,
1975) have viewed as a perceptual bias in the causal inference
process may be better viewed as a response bias or strategic
self-presentation designed to maximize public esteem (Bradley,
1978) and, secondarily, private self-esteem.2

In their 1975 review, Miller and Ross pro- 
posed that the results of many of the studies 
often cited as support for self-serving attribu- 
tional biases could readily be “interpreted in 
information-processing terms” (p. 224). Spe- 
cifically, they contended that the observed 
tendency for individuals to accept greater 
responsibility for positive than for negative 
outcomes may occur for any or all of several
reasons: (a) Individuals intend and expect 
success more than failure and are more likely 
to make self-ascriptions for expected than for 
unexpected outcomes, (b) perceived covaria- 
tion between response and outcome may be 
more apparent for individuals experiencing a 
pattern of increasing success than for individ- 
uals experiencing constant failure, and (c) 
people erroneously base their judgments of the 
contingency between response and outcome in 
terms of the occurrence of the desired outcome 
(i.e., success) rather than any actual degree of 
contingency. 

In my examination of the explanatory strength of these alternative
interpretations of findings offered as evidence for self-serving
attributional biases, I concluded that they suffer from conceptual
incompleteness and a lack of parsimony. It is indeed unfortunate
that Miller chose not to use his recent reply to my 1978 review as a
forum for addressing these criticisms; while many of Miller and
Ross’s (1975) points are meritorious and deserve direct
investigation, I suspect such experimentation will not be
forthcoming until the alternative interpretations proposed by these
authors receive considerable theoretical clarification.


1 The major issue underlying Miller’s differentiation of two types
of self-serving attributions involves the locus of distortion in
the causal inference process (i.e., a perceptual vs. a response
bias). However, I would argue that (a) there are probably multiple
points of bias in the attribution process, from input to output, and
(b) present evidence does not allow us to determine the precise
point at which bias or defensiveness may be occurring. At this point
in time, a more pertinent empirical question would seem to be how
and under which conditions self-serving motivations influence
attributions of causality.

2 Miller suggests that securing behavioral measures might help
future researchers determine whether individuals distort causal
ascriptions for their positive and negative outcomes in order to
serve their public or their private images. Since private
self-esteem needs might be influenced by public self-presentations,
it is not clear that behavioral measures would help determine
whether the self-serving bias effect represents distortions in
descriptions or perceptions. Moreover, it seems reasonable to argue
that written or verbal measures of causal responsibility are in fact
behavioral measures and that individuals may infer their private
self-attributions from their public, behavioral statements of
causality (cf. Bem, 1972).

In conclusion, it is contended that the 
validity and explanatory power of Miller and 
Ross’s (1975) information-processing analysis 
of data presumably reflecting motivational 
biases in the causal inference process remains 
undetermined. In addition, it is concluded 
that the empirical evidence related to the 
notion of self-serving biases in causal attributions
is most consistent with a self-presentation
formulation holding that individuals alter
their descriptions of causality so as to enhance 
or protect others’ perceptions of them. 
The Bias Phenomenon in Attitude Attribution:
Actor and Observer Perspectives 


Arthur G. Miller, Robert Baer, and Peter Schonberg 
Miami University (Ohio) 


Extensive research in attitude attribution has shown that observers infer cor- 
respondent attitudes from behavior that is externally constrained. The present 
study concerned (a) the degree to which actors anticipate observers’ manifesta- 
tion of this bias, (b) the effect of presenting constraint information directly 
from the actor as compared to the experimenter’s instructions, and (c) the re- 
lationship between constraint experienced by the actor and the persuasiveness 
of essays produced under different constraint levels. Essay writers clearly pre- 
dicted that observers would infer correspondent attitudes even when the posi- 
tion had been randomly assigned to the writer. This was true, although to a 
diminished extent, when the writer was under genuinely high constraint. When 
actors expected observers to have precise information regarding their actual 
constraint, however, they anticipated that observers would recognize the attri- 
butional implications of such information. Data from observers corroborated 
the actors’ predictions. When observers were not given this information, their 
attributions were based solely on essay content and indicated no recognition of 
the different freedom levels experienced by the essay writers. Essays written 
under different constraint levels were judged by observers to be generally of 
similar and fairly respectable persuasiveness. It is suggested that the bias
phenomenon may be a consequence of presenting observers with essays more
persuasive than they expect from a writer who in fact disagrees with the
assignment.


An individual’s prediction of the attribution that others will make
to him or her is an issue of practical as well as conceptual
interest. Accurate predictions should enhance effectiveness in
managing the impression one makes upon others. Conflicts could be
eased by anticipating how others will view the causes of one’s
actions. Research attention has focused recently on attributional
bias, in particular the tendency to overestimate the importance of
personal or dispositional determinants relative to situational
influence, what Ross (1977) has termed the “fundamental attribution
error,” which was a central concept in the article by Jones and
Nisbett (1971). The present study was primarily concerned with three
issues. The first was the degree to which actors anticipate or
predict that observers of their behavior will manifest this
particular bias. The second related to the source of constraint
information: the effect of presenting constraint information
directly from the actor and determining the impact of this kind of
stimulus upon observers’ attributions as well as upon actors’
predictions of observer judgments. A final concern was the
relationship between the amount of constraint experienced by the
actor and the quality of the behavior produced under different
constraint levels.


Portions of this research were reported at the annual meeting of
the American Psychological Association, Toronto, Canada, 1978.

Requests for reprints should be addressed to Arthur G. Miller,
Department of Psychology, Miami University, Oxford, Ohio 45056.

The procedure was a variation of a para- 
digm used by Jones and Harris (1967), in 
which subjects were told that a target person 
had written an essay under choice or no-choice 
instructions. Perhaps the most celebrated 
finding of that study was the observation that
subjects attributed attitudes in line with the 
essay content or direction (pro or anti) even 
when the position had been assigned to the 
writer. This finding has been reported in many 
experiments and is widely regarded as observer
bias in the direction of personal causality
(Jones, 1976, 1979; Jones & McGillis,
1976; Miller, 1974, 1976; Monson & Snyder, 
1977; Ross, 1977; Snyder & Jones, 1974). 

In the first part of the present experiment, 
actors were asked to predict the judgments of 
observers in the attitude attribution paradigm. 
Subjects were asked to write essays on affirma- 
tive action; they were either given a choice of 
position or were assigned to write in the anti 
or pro direction. After writing their essays, 
they were asked to predict the average atti- 
tude attribution that observers would make 
after reading the essays and after being in- 
formed of the instruction condition. Would 
actors predict attributions according to the 
model of correspondent inference theory? This 
would be shown by predicting correspondent 
inferences under the choice condition and pre- 
dicting a normative or base-rate attitude, un- 
related to essay directionality, in the assign 
conditions (Jones & McGillis, 1976). Actors 
should thus expect observers to feel confident 
that the essay represented the actor’s true 
opinion in the choice condition but should ex- 
pect observers to be in a very real dilemma in 
the no-choice or assign conditions and to make 
an attribution that would be the “best guess” 
under these circumstances. 

Actors have an advantage over observers in 
being explicitly aware of the randomness of 
their essay assignment, their precise constraint 
in the actual writing of the essay, and their 
personal opinion on the essay topic. With 
these insights, actors could be expected to pre- 
dict the discounting of their own behavior in 
the assign conditions. Conceptually, the neces- 
sary information (for example, randomness of 
assignment) is more salient to actors than to 
observers. Actors would therefore have at least 
the option of predicting the discounting 
strategy—a pattern dictated by attribution 
theory but one that observers do not appear 
inclined to endorse. Of particular interest were 
the predictions of actors who were asked to 
write essays contrary to their true opinions. Would these genuinely
constrained actors expect observers to attach little informational
value to their essay, or would they expect observers to base
attributions upon the content of their essay? Perhaps the more
likely outcome, given the available research data from observer
subjects, is that actors, constrained or not, would predict
correspondent inferences under the no-choice instruction. This
latter outcome would suggest that the bias effect in attitude
attribution not only results from the observer’s lack of precise
information regarding constraint but reflects a more constructive,
intuitive strategy that subjects bring to their task of estimating
attitudes from essays.

A second issue in the present study was the source of information
concerning the actor’s constraint. Jones and Harris (1967) and
others have demonstrated bias by showing that perceivers appear to
acknowledge neither the randomness of the writer’s essay assignment
in the no-choice conditions nor (a result of that randomness) the
low informational or diagnostic value of the essay itself. There is
a certain vagueness or abstractness to the no-choice instruction,
however, in that it constitutes the experimenter’s designation of
the actor’s situation but provides no explicit information
concerning the actor’s reaction to the essay assignment. In actual
social interaction, the actor’s personalized definition of the
situation might be available to the observer. One might imagine a
disgruntled writer muttering, “Oh, what luck, I would have to get
that position; can I switch?” Would observers discount this type of
information about constraint, coming directly from the actor? Would
actors expect them to discount it? This issue was examined by
comparing the impact of constraint information presented in the
traditional, externally designated manner with information based
upon the actor’s own definition of the situation. It was expected
that information concerning the actor’s actual constraint would have
a marked influence upon attributions of attitude, as well as
predicted attributions, based on the essay. Specifically,
indications of high constraint from the actor were expected to
result in a discounting of the behavior.

A final concern was the persuasiveness of
the essays produced in the assign conditions.
Two relationships were examined. The first 
concerned the correlation between observers’ 
attributions of attitude and their rating of 
the persuasiveness of the essay. The second 
concerned the degree to which essays com- 
posed under different constraint levels were 
judged as more or less persuasive. Data on 
these matters were considered as they might 
help explain the tendency for attributions to 
be associated with essay content in the no- 
choice conditions. 


Method 
Overview 


In the first phase of the experiment, subjects filled out an
attitude inventory, including an item that dealt with affirmative
action. They were next asked to write a short essay and were either
allowed to choose a position to defend or were assigned the pro or
the anti position on the issue of affirmative action. In one
condition, subjects were asked to rate their degree of freedom after
having written their essay and to note this judgment at the bottom
of the sheet. In another condition, subjects were not immediately
asked for this judgment. All subjects then estimated the average
attribution that would be made by a group of observers who would
read the essay and would judge the true attitude of the writer.
Subjects were told that the readers would be aware of their
condition (choice or assign). The subjects who rated their freedom
after writing their essay were informed that observers would also
have access to this rating. In a second phase of the research,
subjects (observers) were given sets of essays written under varying
constraint levels in the other part of the experiment. These
subjects, who either were or were not shown the writer’s freedom
rating, were asked to estimate the writer’s personal attitude on the
issue of affirmative action and to indicate their opinion of the
essay’s persuasiveness. Subjects were then debriefed.


Subjects 


In the essay-writing part of the experiment, the subjects were 276
undergraduates participating in research as a course requirement.
Ninety subjects participated in the observer phase of the
experiment. Subjects were seated at individual desks. Instructions
were in written form, and subjects were seen in groups (10 to 25).
Subjects with procedural questions were aided individually by the
experimenter.
Procedure


Actor predictions of observer attributions. The introduction to the
experiment explained that it dealt with attitudes and the
perception of beliefs. Subjects were informed that after filling out
a short attitude inventory, they would be asked to write a short
essay on one of the items and to fill out a questionnaire dealing
with the essay that they had written.
The key item on the 10-item attitude survey read 
as follows: “Women and other minorities should 
be given preference in job markets and other places 
where they are under-represented, assuming that they 
are qualified.” Responses were made on a 10-point 
scale (1 = extremely disagree, 10 = extremely agree). 
In the second part of the experiment, described as 
a debate simulation, subjects were asked to con- 
sider the statement on affirmative action, as worded 
above, and to write a short, convincing essay in 
approximately 20 minutes. Subjects were randomly 
assigned to one of three categories listed at the top 
of the sheet upon which the essay was to be written.
For each subject one category was checked in red: 
(a) choice of position, (b) assign pro, or (c) assign 
anti. 

After writing their essays, subjects were randomly 
placed in one of two conditions. In one condition, 
to be termed observer informed, subjects were asked 
to rate the degree of choice or freedom they had 
while writing their essay. The definition of the low 
end of the scale (1 = no choice or freedom) was “as if you had
copied the essay verbatim from another source.” The high end
(10 = complete choice and freedom) was defined as “indicating the
completely free expression of your beliefs.” This rating was
located near the bottom of the 8 × 14 inch sheet upon which the
essay had been written. In the other condition, to be termed
observer not informed, subjects were not asked to make this freedom
rating. All essays were collected. At this point,
subjects were not aware of the specific questions 
that would follow. 

Subjects then received questionnaires. The first item asked them to
estimate the average attribution of a group of observers asked to
read their essay and to guess their true opinion on affirmative
action (1 = strongly against, 10 = strongly in favor). Subjects were
informed that these observers would be shown the essay sheet and
would be aware of the essay instructions. Subjects in the
observer-informed condition, who had rated their freedom upon
completion of their essay, were also told that observers would have
access to this rating. After estimating the observers’ attributions
of attitude, subjects filled out several filler items on the
two-page questionnaire. For subjects in the observer-not-informed
condition, the freedom rating was included as the final item on the
second page. The timing of the freedom rating and its mention (or
omission) when predictions of observers’ attributions were
requested thus differentiated the observer-informed and
observer-not-informed conditions. The design of this part of the
experiment was a 3 × 2 factorial (Choice, Assign Pro, Assign Anti ×
Observer Informed/Not Informed).


Figure 1. Mean estimate of observers’ attributions as a function of writer’s attitude, essay
condition, and instruction regarding informing observers of writer’s freedom. (Obs. = observer.)


Observer data. Ninety essays produced in the assign-pro (n = 45)
and assign-anti (n = 45) conditions of observer informed were
selected as stimuli for this phase of the experiment. The fact that
the writer’s freedom rating was located below the essay in this
condition facilitated the presentation or omission of this
information as the independent variable manipulation.1 Ten copies
of each of the 90 essays were made, with the writer’s freedom rating
masked on half. The essays were divided into nine sets, each set
containing five assign-pro and five assign-anti essays written by
actors who varied in their personal belief on affirmative action.
Ten subjects were assigned to each set of essays, five subjects
being given copies that included the freedom rating and five being
given copies (of the same essays) that masked out the freedom
rating. Subjects were asked to read each essay and make two
judgments: an estimation of the writer’s true attitude on
affirmative action (1-10 scale) and an evaluation of the essay’s
persuasiveness (1 = very low, 10 = very high). Mean attitude
attributions and persuasiveness ratings were then computed for each
essay based on the judgments of subjects who were (n = 5) and were
not (n = 5) shown the writer’s freedom rating.
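
The allocation of essays to observers described above can be made concrete with a small sketch. The Python below is only an illustration of the layout as stated in the text (nine sets, five assign-pro and five assign-anti essays per set, ten observers per set, half of whom see the freedom rating). The names `build_observer_design`, `N_SETS`, and the record fields are hypothetical and are not part of the original procedure.

```python
# Illustrative sketch only: constant and field names are assumptions,
# not taken from the original study materials.
N_SETS = 9                  # essay sets described in the text
ESSAYS_PER_DIRECTION = 5    # assign-pro and assign-anti essays per set
OBSERVERS_PER_VERSION = 5   # observers per set for each copy type


def build_observer_design():
    """Return one record per (observer, essay) judgment occasion."""
    records = []
    observer_id = 0
    for set_id in range(N_SETS):
        essays = [
            (set_id, direction, k)
            for direction in ("assign_pro", "assign_anti")
            for k in range(ESSAYS_PER_DIRECTION)
        ]
        # Half of the observers get copies showing the freedom rating,
        # half get copies of the same essays with the rating masked.
        for rating_shown in (True, False):
            for _ in range(OBSERVERS_PER_VERSION):
                for essay in essays:
                    records.append(
                        {
                            "observer": observer_id,
                            "essay": essay,
                            "rating_shown": rating_shown,
                        }
                    )
                observer_id += 1
    return records


design = build_observer_design()
n_observers = len({r["observer"] for r in design})
n_essays = len({r["essay"] for r in design})
print(n_observers, n_essays, len(design))
```

Under these assumptions the layout gives 90 observers and 90 essays, with each essay judged by five observers in each rating condition, which agrees with the n = 5 per cell reported in the text.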


1 The use of essays from the observer-informed condition was based
strictly on the convenience of having the freedom rating located
below the essay. There are no indications that the essays produced
in this condition differed systematically from those in the
observer-not-informed condition. The mean freedom ratings in the
observer-informed conditions (assign pro = 5.17, assign anti = 5.84)
did not differ significantly from those in the observer-not-informed
conditions (assign pro = 4.89, assign anti = ou). In all conditions,
subjects who were assigned a position counter to their personal
belief reported less freedom than those assigned a position
congruent with their belief. Correlations between own attitude and
rated freedom for subjects in the observer-informed conditions were:
assign pro, r = .66; assign anti, r = −.52; for subjects in the
observer-not-informed conditions: assign pro, r = .49; assign anti,
r = −.41.
Results
Actors’ Estimates of Observers’ Attributions 


The subjects (actors) were first divided into two groups based on
their attitude toward affirmative action, obtained in the original
attitude inventory. Subjects who checked Responses 1-5 were
categorized as anti; those checking 6-10 were categorized as pro.
Subjects’ predictions of observers’ attributions
were submitted to a 3 × 2 × 2 analysis of
variance, the independent variables consisting 
of the writer’s essay condition (choice, assign 
pro, or assign anti), the writer’s true opinion 
(anti or pro), and whether the writer did or 
did not expect the observer to have access to 
the writer’s freedom rating (observer not in- 
formed or observer informed). Mean predic- 
tons of observer attributions are shown in 
Figure 1. 

Two main effects achieved significance. Subjects assigned to write a
pro essay predicted more pro affirmative action attributions than
subjects in the choice condition, with subjects assigned to write an
anti essay predicting the most anti attitude attributions
(Mpro = 6.39, Mchoice = 5.17, Manti = 3.68), F(2, 264) = 35.61,
p < .01. Subjects who were in fact pro affirmative action predicted
more pro attributions than did subjects who were anti affirmative
action (Mpro = 6.04, Manti = 4.18), F(1, 264) = 43.14, p < .01. In
terms of the central questions of this research, however, the data
are best understood in terms of the Observer Not Informed/Observer
Informed × Writer’s Attitude interaction, F(1, 264) = 15.05,
p < .01. Consider the left panel for the observer-not-informed
results. Subjects writing essays in the choice conditions predicted
that observers would make correspondent attributions. Indeed,
subjects under choice conditions invariably do choose to write in
defense of their own attitude. However, subjects in the assign-pro
and assign-anti conditions anticipated that observers would base
their attributions solely on the essay itself. The subject’s own
attitude bears no relationship to the attribution the subject
expected observers to make. This is a clear instance of the
prediction of the “fundamental attribution error.”

The right-hand panel shows the results for 
the observer-informed conditions. The result 
for the choice conditions is very similar to the 
observer-not-informed case, with subjects pre- 
dicting correspondent inference. However, 
subjects in the two assign conditions clearly 
anticipate that observers will use the freedom 
rating information. The subjects’ own atti- 
tudes thus bear a very strong relationship to 
the attributions they expect informed ob- 
servers to make. There is still an effect of the 
particular assignment, however, in the ob- 
server-informed condition. Within the writer- 
anti and writer-pro conditions, the difference 
between the assign-pro and assign-anti means 
is significant at the .05 level. However, neither 
mean is significantly different from the choice- 
condition mean. There is, in summary, a 
strong demonstration of actors’ predicting 
that observers will attribute attitudes on the 
basis of essay direction. If they expect ob- 
servers to be informed of their actual con- 
straint level, however, there is a marked mod- 
ification in actors’ predicted attributions, with 
writers expecting their true opinions to “come 
through” no matter what their particular as- 
signment. 

Also of interest is the relationship between 
the actor’s self-rated freedom in the essay task 
and his or her prediction of observer attribu- 
tions. For this analysis, only the responses 
from subjects in the two assign conditions 
were considered. Subjects were divided into 
three freedom categories based on their re- 
sponses to the 10-point scale (1-3 = low; 
4-7 = medium; 8-10 = high). Predictions of 
observer attributions were submitted to a 3 ×
2 × 2 analysis of variance (Freedom Level ×
Assign Pro/Anti × Observer Informed/Not
Informed). Mean estimates of observer
attributions are shown in Figure 2.
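
As a hedged illustration of the grouping just described, the following Python sketch maps a 10-point freedom rating onto the three categories and enumerates the cells of the resulting 3 × 2 × 2 layout. The function name `freedom_category` and the label strings are assumptions introduced here for illustration; the analysis of variance itself is not reproduced.

```python
from itertools import product


def freedom_category(rating: int) -> str:
    """Map a 1-10 self-rated freedom score to low/medium/high."""
    if not 1 <= rating <= 10:
        raise ValueError("freedom rating must be between 1 and 10")
    if rating <= 3:
        return "low"
    if rating <= 7:
        return "medium"
    return "high"


# Cells of the Freedom Level x Assign Pro/Anti x Observer
# Informed/Not Informed design (labels are hypothetical).
cells = list(
    product(
        ("low", "medium", "high"),
        ("assign_pro", "assign_anti"),
        ("observer_informed", "observer_not_informed"),
    )
)

print([freedom_category(r) for r in (1, 3, 4, 7, 8, 10)])
print(len(cells))
```

The cut points follow the text (1-3 = low, 4-7 = medium, 8-10 = high), and the crossing yields twelve cells.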

Subjects assigned to write pro essays estimated that observers’
attributions would, in general, be more pro affirmative action than
did subjects assigned to write anti essays, F(1, 165) = 110.70,
p < .01. Within the pro and anti conditions, as self-rated freedom
increased, subjects indicated that they expected observers’
attributions to be more extreme in the direction of their essays,
for the Freedom × Condition interaction, F(2, 165) = 34.42,
p < .01. Central to the issue addressed in this research, however,
is the three-way Observer × Freedom × Condition interaction,
F(2, 165) = 7.44, p < .01. When essay writers presumed freedom
knowledge on the part of the observers (see the right panel of
Figure 2), they clearly expected observers to use this information
in an appropriate way. Thus, for writers who were under genuine
constraint (low freedom), predicted attributions did not reflect
their essay assignment condition but were in fact in the direction
opposite to that condition. When essay writers did not expect
observers to know their freedom ratings, they expected observers to
make attributions in line with essay content (see the left panel of
Figure 2). There is a clear suggestion, even in the
observer-not-informed condition, that subjects experiencing less
freedom expected observers to make less extreme attributions.
However, even in the low-freedom condition, the assign-pro (6.16)
and assign-anti (4.07) means are significantly different
(p < .05).2 From both views of the data (Figures 1 and 2), actors
expect observers to base their attributions substantially on the
content of the essay, provided that observers are not informed of
the actor’s actual freedom in the essay task. If actors do expect
observers to be given this information, their predicted
attributions assume a much different pattern, one that is closely
and logically related to the writer’s actual attitude and
constraint level.


Figure 2. Mean estimate of observers’ attitude attributions as a function of writer’s freedom,
essay condition, and instruction regarding informing observers of writer’s freedom (choice
condition omitted). (Affirm. = affirmative.)


2 To assess the reliability of this result, a replication of the
observer-not-informed condition was conducted using a different
attitude issue (allowing


Observer Data 


The mean attribution and persuasiveness judgments, which had been computed for each of 90 essays, were first divided into three categories based on the actual freedom ratings given by the writers of the essays (1-3 = low; 4-7 = medium; 8-10 = high). These means were then submitted to a 2 × 3 × 2 (Assign Pro/Anti × Freedom Category × Observer Informed/Not Informed) analysis of variance; means for these analyses are shown in Figure 3.³
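The categorization and factorial layout just described can be sketched in a few lines of Python. This is only an illustration and not the authors' procedure: the function and variable names are hypothetical, and the medium band is assumed to cover ratings 4 through 7, the values left between the stated low (1-3) and high (8-10) bands.

```python
from itertools import product


def freedom_category(rating: int) -> str:
    """Map a writer's 1-10 self-rated freedom score to a category."""
    if not 1 <= rating <= 10:
        raise ValueError("freedom ratings run from 1 to 10")
    if rating <= 3:
        return "low"
    if rating <= 7:  # assumed medium band (4-7)
        return "medium"
    return "high"


# The 2 x 3 x 2 design: every combination of the three factors.
DESIGN_CELLS = list(product(
    ("assign pro", "assign anti"),
    ("low", "medium", "high"),
    ("observer informed", "observer not informed"),
))
```

Under these assumptions a rating of 2 falls in the low category, and `DESIGN_CELLS` holds the 12 cells whose means enter the analysis of variance.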

Ratings of essay persuasiveness were generally similar, with an indication of a somewhat lower quality essay in the low-freedom, assign-anti condition. A Direction × Freedom Category interaction, F(2, 168) = 4.33, p < .05, indicated that persuasiveness ratings were similar in the three categories of assign pro (low = 5.85, medium = 5.81, high = 5.52), but were somewhat lower for assign-anti essays written with less freedom (low = 4.71, medium = 5.93, high = 6.03).

The attribution data, shown in the lower [panels] of Figure 3, indicate significant effects of essay direction, F(1, 168) = 137.98, p < [.01], and the Direction × Freedom Category interaction, F(2, 168) = 16.58, p < .01. Attitude attributions thus were more positive in the assign-pro than in the assign-anti conditions and were increasingly correspondent


[under]graduates to own and drive cars on campus). Essay directionality was, as expected, a strong determinant of predicted attributions, F(1, 77) = 25.85, [p value illegible]. A Freedom Level × Essay Direction interaction characterized the data, F [value illegible], p < .05, with writers under low freedom predicting less correspondent attitudes (assign pro = [illegible], assign anti = 4.68) than writers under high freedom (assign pro = 7.06, assign anti = 3.84).
with essay directionality under conditions of
higher freedom. The latter result, however, 
was unique to the observer-informed condi- 
tion. The significant Direction x Freedom 
Category X Observer Informed/Not Informed 
interaction, F(2, 168) = 9.73, p < .01, indi- 
cates that freedom category had no influence 
on attitude attributions in the observer-not- 
informed conditions, whereas the impact of 
this information was very clear in the observer-informed
condition. Data in the left
panel indicate that attitude attributions were 
based on the content of the essays and reveal 
no recognition on the part of observers of the 
wide array of freedom or constraint levels 
actually experienced by the writers of these 
essays. The fact that the attributions are sim- 
ilar at each freedom category also suggests 
that the essays were perceived as similar in 
persuasiveness across freedom categories.⁴
Data in the right panel, however, suggest that 
observers utilized the freedom ratings as a dis- 
tinctive clue to the writer’s personal attitude, 
with essay content being totally discounted at 
the low-freedom level. 

Correlations among various judgment items 
are shown in Table 1. Without the freedom 
information, observers’ attributions of attitude 
did not relate meaningfully to judgments from 
the writers (i.e., their real attitude, rated
freedom, and predicted attributions). Subjects 
in the observer-informed conditions, however, 
attributed attitudes that related substantially 
and logically to writer judgments. Correlations 


3 Note that the same essays constituted the stimuli for the judgments in both the left and right panels of Figure 3. Judgments in the right-hand [panels] were obtained from subjects having access to the writer’s freedom rating; data in the left-hand panels were obtained from subjects not given this information.

4 If essays written under less freedom were lower in persuasiveness, attributions could reflect this diminution in strength. It would then not be possible to interpret less correspondent attributions as a pure reflection of the observer's knowledge of low freedom ratings per se, in that similarly less [extreme] attributions would be expected to accompany [weaker] essays. However, the persuasiveness ratings, as well as the attributions of attitude in the observer-not-informed conditions, strongly suggest that observers were not responding to systematic differences in essay persuasiveness at the various freedom levels.
Figure 3. Mean persuasiveness ratings and attitude attributions for essays written at three freedom
levels by observers who are or are not informed of the writer’s freedom rating.


involving persuasiveness judgments corroborate the analysis of variance results for the data in Figure 3. In the assign-pro conditions, persuasiveness ratings did not correlate significantly with the writer’s attitude or freedom ratings, whereas these correlations were somewhat higher in the assign-anti conditions. This suggests, as noted above, that writers in the assign-anti condition who were in favor of affirmative action wrote somewhat less persuasive essays. Finally, the correlations involving the two judgments made by observers (attitude attributions and persuasiveness ratings) were highly significant. The extremity or polarity of attitude attributions, that is, their degree of correspondent inference, related strongly to the perceived persuasiveness of the essays.
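Table 1 marks observer-informed correlations that differ significantly from their observer-not-informed counterparts, but the text does not say which test was used. One common choice is Fisher's r-to-z transformation, sketched below. The function name is hypothetical, n = 45 per condition is taken from the table note, and the test treats the two correlations as independent, which is only an approximation here because the same essays were judged in both observer conditions.

```python
import math


def compare_correlations(r1: float, n1: int, r2: float, n2: int):
    """Fisher r-to-z test for two independent correlations.

    Returns (z, two-sided p).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p


# Assign-anti correlations between observers' attributions and writers'
# predictions, as read from Table 1: .63 (informed) vs. .29 (not informed).
z, p = compare_correlations(0.63, 45, 0.29, 45)
```

With these inputs the difference comes out just beyond the .05 level, in line with the subscript attached to the .63 entry; a different test would be needed to respect the dependence between conditions.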


Discussion 


Actors, in the role of essay writers, predicted that observers would attribute attitudes using the content of the essays, a pattern resembling that of observer data found in many previous experiments and in the present one as well. Even genuinely constrained actors estimated that observers would attach informational value to their essays, although these predictions were less extreme in the direction of essay assignment. Perhaps subjects under high constraint wrote, or felt that they wrote, less persuasive essays. The persuasiveness ratings do not suggest that this was a major factor, although the possibility exists, to a degree, for the anti affirmative action essays. Writers assigned to write against their own beliefs may have expected observers to be



Table 1
Correlations in Experimental Conditions

                                           Observer not informed      Observer informed
Correlation                                Assign anti  Assign pro    Assign anti  Assign pro

Observers’ attribution and writer's
  attitude on essay topic                  .12, remaining values illegible
Observers’ attribution and writer's
  freedom rating                           .20, −.03, −.59a** recovered; one value illegible
Observers’ attribution and writer's
  prediction of observers’ attribution        .29         −.10          .63a**       .60a**
Observers’ persuasiveness rating and
  writer's attitude                        −.30*, −.23, −.36* recovered; one value illegible
Observers’ persuasiveness rating and
  writer's freedom rating                     .23         −.27          .39**        .11
Observers’ persuasiveness rating and
  observers’ attribution of attitude         −.60**        .69**        −.61**       .5[?]**

Note. Correlations in observer-informed conditions with subscript a differ significantly from corresponding
correlations in observer-not-informed conditions at or beyond p < .05. n = 45. For rows with illegible entries, recovered values are listed in source order and their column positions are uncertain.

* p < .05.
** p < .01.


insightful regarding their constraint. However, this expectation of empathy or accuracy on the part of observers was unwarranted, because observers were completely unresponsive to the constraint variations that actually existed in the assign conditions unless they were specifically informed of the writer’s self-rating of constraint.

The present results provide further evidence that [attitude] attributions (or, in this instance, actors’ predictions about them) do not [confirm] the hypotheses derived from correspondent inference theory, a theory that [stipulates] an estimation of the average or modal [attitude] in the no-choice conditions (Jones & [name and year illegible], p. 405). The phrase observer bias has been used to describe these [results because] the perceiver does not appear [to appreciate] the full implications of randomly [assigning] the essay position to the writer, [namely,] the inherent lack of a systematic [link] between the writer’s personal view and [the position] assigned to him. The constrained [actor is] in a particularly ideal [position to] predict the “correct” (i.e., noncorrespondent) attribution because, relative to [the observer], the actor is informationally enriched. The results supported this line of reasoning to a degree, in that actors under low freedom did predict less correspondent attributions. Even these subjects, however, predicted attributions that related to the essay content. It would seem, moreover, equally “unjustified” for subjects in the high-freedom category to predict such extremely correspondent attributions of attitude. There was no logical reason for these actors under high freedom (in the observer-not-informed conditions) to expect that observers would know that they were, in fact, under such minimal constraint. What might account for this persistent tendency to base attributions on essay content and to expect others also to produce such judgments?

Research in a different but related area of 
attribution theory may be helpful in under- 
standing the basis of the present findings. A 
number of experiments have addressed the 
issue of the use of consensus information in 
the context of Kelley’s theory (1967). At 
issue, specifically, is the tendency for subjects 
to fail to use such information in a theoretically proper way. The reasons for this particular kind of attribution error are themselves complex. Some have suggested that consensus (i.e., base-rate or normative) information is pallid and abstract vis-a-vis the concrete realism of the behavior per se (Borgida & Nisbett, 1977). Others have reasoned that many research
paradigms unwittingly capitalize upon
subjects’ lack of an intuitive grasp of the 
meaning of random assignment (Wells & 
Harvey, 1977). The attitude attribution para- 
digm would seem vulnerable to the same lines 
of reasoning. The essay itself, in terms of 
occupying the perceivers’ time and attention, 
is vivid in comparison to the brief, relatively 
subtle impact of the no-choice instructions. 
Furthermore, discounting of the behavior un- 
der no-choice instructions requires that sub- 
jects understand explicitly the statistical or 
probabilistic implications of the essay position 
being randomly assigned to the writer. In this 
context, Wells and Harvey (1977) obtained 
judgments that were more in line with theo- 
retical predictions from a group of subjects 
who were indoctrinated as to the meaning of 
random assignment. It would seem that, at the 
least, a manipulation of this magnitude would 
be required to produce similar means in the 
assign-pro and assign-anti conditions of the 
attitude attribution paradigm. To designate 
what subjects produce in these experiments as 
“error” or “bias” may, in this view, be ques- 
tionable. The paradigm requires that the sub- 
ject attend to an essay and make a difficult 
attributional response. A model ultimately
stipulating that the essay should be totally 
discounted may itself be committing an “error 
of implausibility.” The fact that attitude 
attributions are often significantly less corre- 
spondent in the assign conditions than in the 
choice conditions may be more of a judg- 
mental triumph or accomplishment than has 
generally been acknowledged. 

Another interpretation of these findings 
focuses not on what subjects err in doing or 
fail to accomplish but instead on the active, 
cognitive strategies that they bring to their 
task in this paradigm. A clue to one such 
strategy may be found in the matter of essay 
persuasiveness. Mean ratings were generally in
the 5-6 range, that is, better than average, 
although they were based on essays assigned 
to writers. Actors may have been able to write 
better essays than they expected to be able to 
write, particularly when the assignment was 
not in line with their own attitude. Having
written such essays, they may have suspected
that observers would also consider the essays
to be better than one might expect from some- 
one assigned to defend a position uncongenial 
to his or her personal belief. 

In the essay task, therefore, people may 
give themselves less than due credit for the 
ability to produce a noncorrespondent essay. 
They are better able to comply with the ex- 
perimenter’s assignment than they expect. 
The essays are then written in a modal range 
of “average to good.” The attributor, however, 
is still operating on the premise that people
assigned a position will do a poor job unless 
they happen to agree with the assignment. 
Attributions of attitude in the form of essay- 
correspondent judgments—what Ross (1977) 
and Jones (1979) term the “fundamental at- 
tribution error”—are, in this view, the logical 
judgment to make, but for the wrong (a 
priori) reasons. That is, given the strategy of 
basing the extremity of one’s attribution on 
perceived persuasiveness, it is logical to at- 
tribute attitudes that are, to a degree, in line
with the essays—many of the essays are, in 
fact, reasonably persuasive. The error lies in 
not recognizing that many writers are capable 
of producing respectable essays even if they 
personally disagree with what they are writ- 
ing. There is limited but suggestive support 
for this line of reasoning. Jones, Worchel, 
Goethals, and Grumet (1971) included the 
important variation of presenting strong or
weak essays to perceivers. Subjects presented
with weak essays written under no-choice instructions
actually attributed attitudes in opposition
to the essay assignment. That is, the
pattern of attributions correspondent with
essay direction was not only not present but
was reversed. In terms of the present line of
reasoning, the persuasiveness of the weak
(ambivalent) essays in the Jones et al. study
may have been less than that expected by
perceivers. They then inferred that the writer
did not personally endorse the assigned position
but in fact believed in the opposite position.

If we consider the many replications of the observer bias phenomenon in attitude attribution (Jones, 1979), the hypothesis that perceivers may consistently be observing essays “better than expected” may be shown to have considerable explanatory power. An important research question involves documenting the expectancy that perceivers entertain regarding the relationship between a person’s attitude and the quality of essay that the person is likely to produce. Observers in the present study produced attributions of attitude and ratings of persuasiveness that were highly correlated (Table 1). The causal direction is unclear, although on the basis of the experimental manipulations in the Jones et al. study, it could be assumed that the persuasiveness of the essay was assessed first and provided a rationale for the extremity or correspondence of the attitude attributed to the writer. Thus, the attitude-persuasiveness link appears to be an important element in the perceiver’s implicit theory. A demonstrably weak or ambivalent essay would in this view be a much sounder reason for discounting the essay than the probabilistic or statistical implications of no-choice instructions.

Regarding the writer’s self-rated freedom, actors, in their predictions, and observers both appeared to use this information to its fullest significance. While previous research in attitude attribution has not used this method of conveying constraint (that is, from the actor’s definition of the situation), it is unclear that such a mode of transmitting the constraint or situational information is ill-conceived or beyond the province of theory. This information could, in principle, be underplayed or discounted, just as is the experimenter’s instruction of no-choice to the essay writer. One need only remember that jurors found Patricia [Hearst] guilty despite her low degree of “self-rated freedom.” Many professors have looked skeptically at students who mentioned situational causes for missing exams. In the present study, however, the actor’s designation of his [own] degree of constraint was information of considerable utility. Ratings of low freedom gave perceivers a salient and plausible reason for discounting the observed behavior as indicative of the writer’s personal attitude. Actors expected observers to use the information in this way (Figures 1 and 2), and observers did in fact use it (Figure 3).
Inducing Biased Scanning in a Group Setting to Change
Attitudes Toward Bilingualism and Capital Punishment 


Patrick O’Neill and Diane E. Levings 
Acadia University, Wolfville, Canada 


Triads of male and female Canadian high school students were assigned to 
debate either capital punishment or bilingualism. Teams were given 40 minutes 
to prepare their arguments. Half were told in advance which side of the argu- 
ment they would be debating, and half were told they would be assigned at the 
end of their discussion. Scoring of the taped discussions confirmed that this 
manipulation produced biased scanning of arguments in the predetermined con- 
dition and unbiased scanning in the later-determined condition. A postmeasure 
of attitudes was administered either before or after the actual debate. Biased 
scanning led to significant attitude change in the predicted direction, and this 
effect was not influenced by time of presentation of the posttest. The study 
provides support for the conflict theory of attitude change. 


According to conflict theory as articulated 
by Janis and Mann (1968), attitude change 
is facilitated by inducing an individual to 
think of arguments in favor of only one side 
of a controversial topic. This “biased scan- 
ning” makes a set of cognitions salient and 
tends to shift the subject’s attitude toward the 
position for which he or she has been develop- 
ing arguments. 

Girodo and Strickland (1974) suggested that the conflict theory explanation of attitude change would hold when there was high interest in the issue, when there was opportunity for the subject to think up arguments (i.e., to engage in biased scanning), and when there was incentive for the subject to do so. In their research, the largest change was produced under such conditions. It is assumed that when subjects have high interest in an issue they will have given it some thought in the past, and ideas about the issue will be stored in memory and available for scanning.


This study was supported by a grant from Canada Council. The authors [acknowledge] the assistance of Marsha Eisner, Janice Hilchey, Mary McCann, Nancy Gaudet, and Anne Toland, and the help of Susan Scanlon.

Requests for reprints should be sent to Patrick O'Neill, Psychology Department, Acadia University, Wolfville, Nova Scotia, Canada B0P 1X0.


The present study attempted to test predic- 
tions from conflict theory with two issues that 
were expected to engage the interest of Cana- 
dian high school students: bilingualism and 
capital punishment. In Canada, as in a num-
ber of other countries, there is strong senti- 
ment on the issue of the death penalty (All- 
mand, 1976; Fattah, 1976). Each new sensa- 
tional murder revives the arguments for and 
against executions. The fact that Canadian 
legislators recently abolished capital punish- 
ment at a time when voter sentiment was 
strongly in favor of retention (Fattah, 1976) 
helps keep the debate alive.

Bilingualism is as controversial as capital 
punishment and more personally involves most 
Canadians. In response to pressures dividing 
the country, the federal government has ac- 
tively promoted a policy of bilingualism. A 
much-disputed aspect of this policy is the at- 
tempt to expand the use of the French lan- 
guage in traditionally English-speaking areas. 
This has often produced a backlash from Anglophones: an outbreak of strong ethnic prejudice against Francophone Canadians. Such a situation occurred in the 1970s in the Annapolis Valley of Nova Scotia, where the Anglophone majority actively opposed the federal government’s plan to move a Francophone unit to the local military base. Nova Scotia [news]papers reported that such an event would [cause] “social deterioration” and warned that [it was] unsafe to bring Francophones into the [area]. A Conservative member of Parliament suggested that emotionalism was running so [high] that it might get out of hand. The [French-]speaking minority in the area, descendants of the early Acadian settlers in North America, noted with alarm that “the [word illegible] and bigotry, for so long dormant” among Anglophones, was now apparent. The present study was done in the two Anglophone [high] schools closest to the site of the proposed Francophone “invasion,” where it was anticipated that students’ feeling of involvement in the issue would be very high.

Laboratory studies of attitudes, including [tests of] the conflict theory of attitude change, are usually done with individuals rather than groups. However, the use of groups has several advantages. First, in our society many decisions are made, issues are considered, and biased scanning of information often occurs in a group context. Therefore, it is of some interest to know whether the results of studies with individuals generalize to group situations. Second, if a group is induced to discuss an issue, the discussion can be recorded so that biased or unbiased scanning of arguments can be measured directly rather than being merely inferred.

Several studies have used techniques in which subjects generate arguments on one side of an issue because they must defend that position in a debate (Scott, 1957, 1959; [Hilc]hey & O’Neill, Note 1). In the present study, triads of high school students discussed either capital punishment or bilingualism in preparation for a debate on the topic with another “team.” The involvement value of these issues has already been noted. Students were also offered incentives in the form of [cash] prizes for winning teams and further awards as grand prizes. The purpose of the study was to compare biased versus unbiased scanning of arguments directly to de[termine] [text missing] which side of the issue they would debate, [and] half were not assigned until after their preparation session. It was anticipated that the former condition would lead to relatively more biased scanning of arguments and consequently to more attitude change.

It might be argued that while biased scan- 
ning produces more attitude change initially, 
such persuasive effects fade away when ex- 
posed to counterarguments. This might be the 
case, because if biased scanning is employed, 
the subject is not inoculated against argu- 
ments on the other side of the issue (McGuire 
& Papageorgis, 1971). To check this possibil- 
ity, for half of the subjects the postmeasure 
was given following the preparation session 
and before the debate. For the other subjects 
the postmeasure was given following the de- 
bate, when those in the biased scanning condi- 
tion would have been faced with counter- 
arguments that they had not considered. 

Because no pretest was used in the present 
study to minimize reactivity (Campbell, 
1969; Campbell & Stanley, 1966), it may be
useful to comment on the initial positions of 
the subjects that can be deduced without a 
premeasure. As indicated, ethnic prejudice 
against French Canadians was running high 
in the target area at the time of the study. 
On capital punishment, Hilchey (Note 2) re- 
viewed evidence to indicate that Canadians in 
general and Nova Scotians in particular tend 
to be in favor of the death penalty. She found, 
in a university near the site of the present re- 
search, that students were overwhelmingly in 
favor of capital punishment. It might be as- 
sumed that this was also the general sentiment 
of the high school population from which our 
sample was drawn. 


Pretest of the Instrument 


Before running the present study, a post- 
measure of attitude change was designed spe- 
cifically to focus on the issues of bilingualism 
and capital punishment. A disguised technique 
similar to Cook’s measure of plausibility 
(Brigham & Cook, 1970; Waly & Cook, 1965) 
was used. Subjects were asked to “rate how
convincing these arguments are; circle the 
appropriate number after each argument.” 
The numbers ranged from 1 (not convincing) 
to 7 (very convincing) with no other labels. 
There were 13 arguments. Only 8 were scored
in this study, 4 on the bilingualism issue and 
4 on capital punishment; the other arguments 
were on women’s liberation and were “fillers.” 
On each issue, two arguments supported one 
side and two supported the other. The follow- 
ing are examples of arguments about capital 
punishment:


If a person is executed for murder, we can be ab- 
solutely certain that he will not kill again. Any 
other sentence always leaves open the possibility that 
an appeal board will mistakenly free a dangerous 
killer, or that he may escape from prison. In cases 
where guilt is beyond doubt, dangerous murderers 
should be given the death penalty. 


Many jurors are extremely reluctant to convict 
when the death penalty may be the result. There 
would certainly be more convictions if juries knew 
that life imprisonment was the most serious penalty 
that could be applied. It is better for law and order 
to have two murderers convicted and sent to prison 
than to have one convicted and executed and the 
other set free. Therefore, capital punishment should
be abolished.


The following are 2 examples of arguments on 
the bilingualism issue: 


People of many races, cultures and languages have
played their part in the development of Canada.
This is not just a country of English and French.
It makes as much sense to say that we should all
speak Cree and Inuit, Greek and Italian, as to say
that we should all speak French and English. To
expect that all Canadians will be fluent in more than
their own language is simply impractical.

Canadians must deal with the fact that there are
two major languages in the country. Each language
is spoken by so many people that this can never


This measure¹ was pretested on university
students in introductory psychology classes.
They were given the questionnaire while being
told that the experimenter planned to do research
on debating and needed to know how
“convincing” these arguments seemed. Scores
on items were added to get separate totals on 
the capital punishment and bilingualism is- 
sues. Two months later, a different experi- 
menter telephoned students who had scored 
highest or lowest on each of the issues. There 
Table 1
Distributions of Validation Subjects on the Questionnaire and Side Chosen in Writing an Essay

                                Questionnaire
Essay                        In favor    Against

Capital punishment issue
  In favor                       8           2
  Against                        0          10
Bilingualism issue
  In favor                       9           1
  Against                        0          10


were 10 high and 10 low on each issue. Students
came to the laboratory individually.
They were told that for some future research 
we needed to develop a number of arguments 
on current affairs topics. Topics assigned were 
feminism and either capital punishment or 
bilingualism. For the two target issues, the 
statements were (a) “We need to bring back
capital punishment to lower the crime rate in 
Canada” and (b) “All Canadians should learn 
to speak both French and English to make the 
country fully bilingual.” 

The students were told that they should 
take one side of the issue and write an essay 
containing as many arguments as they wished. 
They were told, “It doesn’t matter to us 
which side of the argument you pick; it’s 
probably going to be easier if you pick the 
one you agree with.” Of course, the side 
picked was exactly what interested us. Stu- 
dents in this validation study could be classi- 
fied as in favor of or against capital punish- 
ment or bilingualism on the basis of scores on 
the target instrument and again on the basis 
of the side they chose when writing an essay 
2 months later. The results are presented in 
Table 1. They are in the predicted direction 
for both issues, significant beyond the .005
level using Fisher’s exact test. The reliability
of the measure was estimated using the
Spearman-Brown formula. On the pretest, the 
reliability was .48 for the bilingualism questions and .71 for capital punishment. In the main study the reliability was .40 for bilingualism and .64 for capital punishment.

1 Copies of the instrument are available on request from the first author.
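The two statistics used in this validation can be illustrated with a short, standard-library Python sketch. It is not the authors' computation: the function names are hypothetical, the Fisher test is assumed to be one-sided (the report does not say), and because the split-half correlations are not reported, the Spearman-Brown step is shown only by inverting the published reliabilities.

```python
from math import comb


def fisher_exact_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(top-left cell >= a) for the 2x2 table [[a, b], [c, d]], margins fixed."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)
    hits = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return hits / total


def spearman_brown(r_half: float, k: float = 2.0) -> float:
    """Reliability of a test lengthened k-fold, given the shorter test's r."""
    return k * r_half / (1.0 + (k - 1.0) * r_half)


def implied_half_correlation(r_full: float) -> float:
    """Split-half correlation implied by a stepped-up (k = 2) reliability."""
    return r_full / (2.0 - r_full)


# Cell counts as printed in the validation table.
p_capital = fisher_exact_one_sided(8, 2, 0, 10)
p_bilingual = fisher_exact_one_sided(9, 1, 0, 10)
```

Both probabilities fall well below .005 under these assumptions. A stepped-up reliability of .71 corresponds to a split-half correlation of about .55, and .48 to about .32.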


Method 
Subjects 


The study involved 48 students, 24 males and 24 
females, from two high schools, all from Grades 11 
and 12. 


Procedure 


First contact with subjects was made in their classrooms, where they were told that they would be invited to take part in research on debating, that no previous experience in debating was necessary, that participation was voluntary, and that cash prizes of $5 would be given to each person on the winning side in a debate, with an additional $25 given to the three best debaters in each school. The night before each session, six males or six females were chosen at random from class lists and were telephoned, invited to take part, and given a time and place. If a student was not available, same-[sex] replacements were taken randomly from class lists.
Sessions were held immediately after school hours
in two classrooms. When the six students arrived, 
they were met by two experimenters. Subjects were 
divided into triads alphabetically according to the 
first letter of the last name. Each experimenter then 
took three subjects to each of the rooms. The fol- 
lowing instructions were given to the assignment- 
before-discussion group: 


As you know, we are doing research on debating. 
The procedure is not very complicated. You will 
be assigned a current affairs topic (and you will 
be told which side of the issue to debate). You will 
have 40 minutes to discuss the topic and consider 
what arguments you might present. As you know, 
another group is meeting in another room to dis- 
cuss the same question (and they will be assigned 
the other side of the issue). At the end of the 40- 
minute discussion we will bring the two groups 
together for the actual debate. 


The modifications in instructions made for the assignment-after-discussion condition were the omission of the phrases in parentheses and the insertion of the sentence “At the end of the discussion I will tell you which side of the issue you must present in the debate; of course the other group will be assigned to the other side of the issue.” They were told that the debates would be taped and would later be evaluated by two judges at the university to determine the cash awards. Subjects were then told:


We will also be taping your discussion now, before 
the debate—but that’s just for our own records. 
The judges will not hear it, and of course it will 
have no bearing on whether you win or lose. 
In the assignment-before condition, subjects were
informed:


I’ll give you a slip of paper telling you what topic
you are to debate and which side you are on. This
has been determined randomly. Here is your topic.


The experimenter then opened a sealed envelope and 
handed a slip of paper to a member of the group. 
This slip provided both the statement of the issue 
and the position (in favor or against). The procedure 
was similar for the assignment-after condition, except 
that only the issue—and not the side to be debated— 
was given before the discussion. The issues stated 
were (a) “We need to bring back capital punishment 
to lower the crime rate in Canada” and (b) “All 
Canadians should learn to speak both French and 
English to make the country fully bilingual.” The 
discussions were taped on cassette tape recorders 
using small omnidirectional lapel microphones that 
were hung from the ceiling. 

At the end of the discussion period, the posttest 
was administered to half of the triads. In each room, 
the experimenters told their groups: 


While we’re waiting, I’d like to get you to do a 
small task for me. This will help us with another 
research project. We have a number of different 
current affairs topics, and we’re not sure how con- 
vincing certain arguments are about those topics. 
I’d like you to help by reading over these argu- 
ments, one at a time, and rating them to indicate 
how convincing they are. We’ll use the arguments 
that are judged most convincing and discard the 
others. 


After this the two triads were brought together, and 
the debate took place. Each debater was given 5 
minutes for initial presentation of argument, with 
speakers from each side alternating. Then each side 
had another 5 minutes for rebuttal. For half of the 
groups the posttest was administered following the 
debate, using instructions similar to those above. 


Design 


To determine whether time of assignment was suc- 
cessful in producing biased versus unbiased scanning, 
each triad was treated as a subject in a 2 × 2 fac- 
torial design, with time of assignment as one factor (before 
or after the predebate discussion) and issue as the 


change effects, 


debate? 


It will be noted that in the attitude change por- 
tion of the study, “issue” was not studied as a factor 
but was counterbalanced within conditions. In the 
Results 


There were three major questions of interest 
in the present study. 

1. Were we successful in producing biased 
versus unbiased scanning in the assignment 
conditions? 

2. If so, did biased scanning lead to more 
attitude change? 

3. If so, was this attitude change affected 
by the time of presentation of the posttest 
(i.e., before or after the debate)? 


Biased Versus Unbiased Scanning 


Each 40-minute tape was scored by one of 
four experienced raters who had been trained 
in a previous study that used the same scoring 
procedure (Hilchey & O'Neill, Note 1). Each 
time an argument was raised in group discus- 
sion, the rater briefly noted the content of the 
argument and indicated whether it was for or 
against the issue. In cases where there was 


(1969), was “a reason advanced for or 


of statements not counted 
as arguments included 
concept of an argument was interpreted lib- 
erally, with any reason given for or against a 
position being scored. Each time an argument 
was advanced, it was counted. Thus, for ex- 
ample, a group member might say “bilingual- 
ism will unite our country” several times dur- 
ing the discussion, and this would be noted 
as a probilingualism argument each time it 
was raised. The quality of the arguments was 
not an issue, merely the extent to which the 
group considered one side or both sides of the 
question. When the tapes were scored, a ratio 
of pro and anti arguments was computed, 
varying between 0 and 1, with 1 indicating a 
perfect balance and lower numbers indicating 
a tendency to focus on only one side of the 
issue. Mean scores for the assignment-before- 
discussion condition were bilingualism issue = 
.19, capital punishment issue = .11; for the 
assignment-after condition, mean scores were 
bilingualism issue = .55, capital punishment 
issue = .87. In a 2 × 2 analysis of variance, 
looking at time of assignment and issue, there 
was a main effect for time of assignment, F(1, 
12) = 124.0, p < .001, with the assignment- 
before condition producing more biased scan- 
ning; there was a main effect for issue, F(1, 
12) = 6.0, p < .05, but this effect can best be 
explained in terms of the significant interac- 
tion between issue and time of assignment, 
F(1, 12) = 17.0, p < .005. When subjects 
did not know which side of the issue they 
would be required to argue, they were much 
more unbiased in their consideration of capital 
punishment than they were in their considera- 
tion of bilingualism. This effect was substan- 
tiated by the Newman-Keuls procedure, which 
showed a significantly higher score (more un- 
biased discussion) for capital punishment than 
for bilingualism within the assignment-after 
condition. 
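
The balance score described above can be sketched in a few lines. The text says only that the ratio of pro and anti arguments runs from 0 to 1, with 1 indicating perfect balance; the exact formula is not given. The sketch below therefore assumes, purely for illustration, that the score is the smaller count divided by the larger count. The function name `balance_ratio` and the tallies in the usage lines are hypothetical and are not taken from the study.

```python
def balance_ratio(pro_count: int, anti_count: int) -> float:
    """Return an assumed balance score: smaller count / larger count.

    1.0 means equal numbers of pro and anti arguments; values near 0
    mean the discussion focused on one side. This formula is an
    assumption; the article does not state how the ratio was computed.
    """
    if pro_count < 0 or anti_count < 0:
        raise ValueError("argument counts must be non-negative")
    larger = max(pro_count, anti_count)
    if larger == 0:
        # No arguments raised at all; scored as 0.0 here by assumption.
        return 0.0
    return min(pro_count, anti_count) / larger


# Hypothetical tallies for two discussion tapes (not data from the study).
print(balance_ratio(20, 4))   # mostly one-sided discussion
print(balance_ratio(12, 12))  # perfectly balanced discussion
```

Because every repetition of an argument was counted, the tallies passed to such a function would be counts of utterances rather than of distinct arguments.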


Attitude Change 


Because we were successful in producing 
biased scanning, it was expected that there 
would be more attitude change in the biased 
scanning (assignment-before) condition. A 
possible confounding variable was the actual 


Table 2 
Mean Scores on the Posttest for Assignment to Discussion and Time of Attitude Measurement 

                      Assignment before discussion    Assignment after discussion 
                      (biased scanning)               (unbiased scanning) 
                      Attitude measure                Attitude measure 
Issue                 Before debate   After debate    Before debate   After debate 
Bilingualism          5.25            4.79            4.29            3.92 
Capital punishment    4.79            5.25            3.98            3.70 
M                     5.02            5.02            4.14            3.81 

Note. Attitude posttest was scored in the direction of assignment. 


direction of assignment, which was expected to 
have an effect itself. This was controlled by 
scoring the posttest in the direction of assign- 
ment. Thus, the mean score on the 7-point 
scale indicated amount of agreement with 
either the “in favor” position or the “against” 
position, whichever was assigned. Mean scores 
on the postmeasure are presented in Table 2. 
Each cell included an equal number of males 
and females, an equal number debating in 
favor and against, and an equal number de- 
bating bilingualism or capital punishment. 
There was a significant main effect for time 
of assignment, F(1, 40) = 21.01, p < .001, 
with biased scanning producing more attitude 
change. Neither the main effect of time of 
attitude measurement nor the interaction was 
significant, indicating that the persuasive 
effects of biased scanning were not affected by 
exposure to opposing arguments during the 
debate. The analysis of group discussion indi- 
cated that in the assignment-after-discussion 
condition, discussion was somewhat more un- 
biased on the capital punishment issue than on 
bilingualism, and in view of this fact similar 
differences in attitude change might be ex- 
pected. Table 2 indicates that there was a 
trend in this direction, although it was non- 
significant. 
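
Scoring the posttest “in the direction of assignment” can be illustrated with a short sketch. It assumes, for illustration only, that each raw rating is agreement with the “in favor” statement on a 1-to-7 scale and that ratings from subjects assigned to the “against” side are reverse-scored. The article does not describe the mechanics, so the function name `score_in_direction` and the reversal rule are hypothetical.

```python
SCALE_MAX = 7  # the posttest used a 7-point scale


def score_in_direction(agreement: int, assigned_side: str) -> int:
    """Re-express a raw rating as agreement with the assigned position.

    `agreement` is assumed to be agreement with the "in favor" statement
    (1 = strongly disagree, 7 = strongly agree). For subjects assigned
    to argue "against", the rating is reversed (assumed rule).
    """
    if not 1 <= agreement <= SCALE_MAX:
        raise ValueError("rating must lie between 1 and 7")
    if assigned_side == "favor":
        return agreement
    if assigned_side == "against":
        return SCALE_MAX + 1 - agreement
    raise ValueError("assigned_side must be 'favor' or 'against'")


# Hypothetical ratings: both subjects lean toward their assigned side.
print(score_in_direction(6, "favor"))
print(score_in_direction(2, "against"))
```

Under this scheme, higher scores always mean more agreement with the side a subject was assigned, which is what allows cells that mix “in favor” and “against” debaters to be averaged together.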


Discussion 


Conflict theory suggests that when a person 
begins to think of arguments for only one 
side of an issue, those arguments become 
salient and influence attitudes. The theory 
previously received support in studies with 
subjects run individually. The present re- 
search indicates that the effect can be pro- 
duced in a small group setting. 

An advantage of using groups and recording 
group discussion is that biased or unbiased 
scanning of arguments is made public; re- 
searchers can hear and score arguments and 
quantify the degree of bias toward one side or 
the other. In studies with individuals, the ex- 
perimenter is typically left in the position of 
inferring that biased scanning has taken place, 
reasoning backward from the results (e.g., 
Girodo & Strickland, 1974). An obvious dis- 
advantage in the use of groups is the lack of 
control over all the factors that influence atti- 
tude change. We cannot be sure to what ex- 
tent a subject is changing his or her attitude 
as a function of self-generated arguments or 
because he or she hears persuasive arguments 
from others in the group. Indeed, these factors 
may interact so that potential debaters are 
more receptive to one-sided communications 
from others (see Greenwald, 1969, 1970). 
While such questions cannot be answered in 
an experiment such as the present one, these 
factors were not the focus of the study and 
would have no systematic influence on the re- 
sults. 
Half of the subjects were not given the 
postmeasure until after they had debated the 
issue and had heard counterarguments. It is 
known that subjects can be “immunized” 
against counterarguments by hearing them in 
mild form before such arguments are pre- 
sented with persuasive intent (McGuire & 
Papageorgis, 1961). In this study, however, 
the attitude change created by biased scan- 
ning was not affected by the time of the post- 
test. Still, caution should be used in generaliz- 
ing this result beyond the present experi- 
mental procedure. There may be some aspects 
of debating per se, including the competitive 
aspects involved and the probable arousal of 
reactance (Brehm, 1966), that make counter- 
arguments presented during a debate ineffec- 
tive in persuading the opposing side. 

An unexpected finding was the interaction 
between issue and time-of-assignment in pro- 
ducing biased scanning. For both issues, groups 
that had to prepare for the debate without 
knowing their eventual assignment were much 
more unbiased in their consideration of argu- 
ments than were groups in the assignment- 
before condition. But within the unbiased 
condition, subjects were significantly more 
balanced in their treatment of capital punish- 
ment than bilingualism. This study was done 
in a geographic area where bilingualism is a 
matter of considerable debate and concern. 
Presumably capital punishment, however in- 
teresting it might be to the subjects, touched 
few of them directly. This provides a footnote 
to the point made by Girodo and Strickland 
(1974) that “high interest” in an issue pro- 
motes biased scanning and facilitates attitude 
change. It seems that high interest in the 
sense of personal involvement with an issue 
also makes it more difficult to create unbiased 
scanning of arguments, even when there is 
high incentive to look at both sides of the 
question. 


Reference Notes 
On Problems With the Cause-Reason Distinction 
in Attribution Theory 


John H. Harvey and Jalie A. Tucker 
Vanderbilt University 


Buss’s statement on causal and reason explanations of behavior involves a con- 
tribution regarding a necessary distinction in attribution theory. In this paper, 
we argue that despite his ambitious reformulation of previous analyses of lay 
attribution, Buss fails to provide substantive guidance for empirically testing 
the major assumptions of his hypothesized cause-reason distinction. Most im- 
portant, Buss’s analysis does not yield unambiguous theoretical criteria for 
operationally distinguishing intentional actions and unintentional occurrences 
and for judging the content of actors’ and observers’ explanations as causal 
or reason in nature. Without such specifications, we conclude that Buss’s cen- 
tral ideas may be empirically untestable. 


We are now in a position to . . . steer a truer course 
that will lead us out of the conceptual fog with 
regard to causes and reasons in the attribution process. 
(Buss, 1978, pp. 1319-1320) 


When a theorist makes such an assertion, 
however metaphorical, about integral concepts 
in a domain of work, that theorist’s ideas in- 
vite and deserve careful scrutiny. In this 
paper, we seek to provide such scrutiny of the 
major aspects of Buss’s (1978) reconceptual- 
ization of previous attributional analyses of 
lay explanation of behavior. 


We are indebted to Gary Wells for comments on 
an earlier version of this paper. 

Requests for reprints should be sent to John H. 
Harvey, Department of Psychology, Vanderbilt Uni- 
versity, Nashville, Tennessee 37240. 



To summarize Buss’s position, he argued 
that attribution theorists (e.g., Heider, 1958; 
Jones & Nisbett, 1972) have failed to dis- 
tinguish both conceptually and empirically 
between two logically distinct types of lay 
explanation (i.e., causal and reason explana- 
tions) and that this lack of distinction has 
greatly impeded an accurate understanding of 
this major area of social psychological inquiry. 
As an alternative to previous attributional 
analyses of lay explanation (especially the 
Jones & Nisbett, 1972, actor-observer hypoth- 
esis), which he considered exclusively causal 
in nature, Buss proposed (a) that an actor’s 
behavior may be dichotomized according to 
whether it was intentionally or unintentionally 
initiated (termed an “action” and an “occur- 
rence,” respectively) and (b) that actors and 
observers differ in the types of explanations 
they may give for actions and occurrences. 
More specifically, Buss hypothesized that oc- 
currences were explained exclusively with 
causes by both actors and observers and that 
actions were explained solely with reasons by 
actors, whereas observers could give reason or 
causal explanations of actions. 
We readily acknowledge the potentially im- 
portant contribution of Buss’s statement in 
making this conceptual distinction between 
causal and reason explanations of actions and 
occurrences. However, although Buss is cor- 
rect in concluding that this distinction has not 
been formally addressed in relevant attribu- 
tion writings, his assertion that theorists have 
uniformly neglected reason explanations in 
favor of a causal analysis represents an overly 
generalized, polemical criticism. For example, 
see Heider (1958, p. 100) and Jones and 
Davis (1965, pp. 222, 263) for prior discus- 
sions of the role of reason explanations and 
actor intentionality in lay attributions. More 
importantly, we find Buss’s statement most 
problematic in terms of its cogency and clarity 
as a reformulation of actor-observer phenom- 
ena. It will be argued below that while Buss’s 
position may eventually contribute to a better 
understanding of lay explanations of behavior, 
ambiguities and oversights inherent in his 
analysis, especially in relation to the empirical 
testability of his ideas, greatly reduce the 
potential importance of his conceptualization. 


Issues Involved in an Empirical Separation 
of Actions and Occurrences 


Fundamental to Buss’s conceptual distinc- 
tion between causal and reason explanations 
is an assumption that an actor’s behavior can 
be unambiguously dichotomized as either in- 
tentional action or unintentional occurrence. 
Thus, an adequate empirical test of Buss’s 
hypothesized cause-reason attributional dis- 
tinction rests on the operational separation of 
actions and occurrences, which he perceives as 
having generally been confounded in earlier 
research on actor-observer differences. As de- 
tailed below, we maintain that Buss’s state- 
ment does not provide either a clear logical 
or empirical basis for operationalizing actions 
and occurrences. 

In an attempt to clarify the nature of this 
error in previous research, and to illustrate 
his definitions of actions as consciously con- 
trolled intentional behaviors and occurrences 
as unwilled behaviors that the actor “suffers,” 
Buss cited a number of examples of actions 
and occurrences from relevant experimental 
work. Examples of actions were intelligence 
test performance, expression of attitudes and 
opinions, translation of sentences, decision 
behavior, and carrying parcels (p. 1318); ex- 
amples of occurrences were headache, blush- 
ing, perspiring, emotional arousal, galvanic 
skin response, friendliness, talkativeness, ner- 
vousness, dominance, and looking happy (pp. 
1317-1318). Unfortunately, for an investi- 
gator seeking operational definitions of actions 
and occurrences, the criteria implied by these 
examples for distinguishing the two types of 
behavior present substantial difficulties. First, 
Buss’s examples imply that physiological re- 
sponses and overt displays of emotion are 
occurrences that operate completely outside 
the actor’s conscious control, while actions 
encompass all deliberate behaviors, including 
those dependent on innate abilities such as 
intelligence. On both logical and empirical 
grounds, one must seriously question the im- 
plicit assumptions (a) that physiological re- 
sponses and overt manifestations of emotion- 
ality cannot be at least partially controlled 
consciously (cf. Miller, 1969) and (b) that 
behaviors dependent on innate abilities can 
be completely controlled and deliberately 
manipulated (cf. Kruglanski, 1975). 

Similarly, Buss’s classification of the ex- 
pression of attitudes and opinions as inten- 
tional actions is questionable on the grounds 
that attitude expressions are often affected by 
various experimental manipulations and/or 
self-perception processes (Bem, 1972) that do 
not appear to be readily controllable by the 
actor; thus in these instances it may be argued 
that the expression of an attitude represents 
an occurrence, not an action, in Buss’s scheme. 
When one further considers that Buss classi- 
fied expression of opinions as an action but 
regarded “looking happy” as an occurrence, 
one is left with the curious conclusion that if 
we say we like someone (an opinion), it is an 
intentional action; but if we smile in the 
presence of this same person (“look happy”), 
it is an unwilled occurrence. Although these 
issues raised as a result of Buss’s examples are 
probably not fundamentally damaging to his 
basic action-occurrence distinction, they 
nevertheless demonstrate the difficulty in 
classifying behaviors according to the actor’s 
capacity to control various behaviors inten- 
tionally and consciously. 

A second, highly related issue in determin- 
ing whether an actor’s behavior should be 
classified as an action or an occurrence con- 
cerns specification of criteria for imputing 
intention to an actor’s behavior in ambiguous 
situations where the behavior in question may 
or may not have been controlled by the actor 
(e.g., as in the case of overt displays of emo- 
tion). Despite Buss’s contention that inten- 
tionality is a crucial differentiator of actions 
and occurrences, he did not address potential 
problems in imputing intentionality, thereby 
implying that it is usually possible to establish 
clearly whether or not an actor’s behavior is 
intentional. One can envision ambiguous cir- 
cumstances, however, where the actor, ob- 
server, and investigator fail to agree on the 
presence or absence of the actor’s intentional- 
ity and/or conscious control over a given be- 
havioral sequence. To use Buss’s example (pp. 
1317-1318) of such a situation, the experi- 
ments by Storms (1973), Regan and Totten 
(1975), and Taylor and Fiske (1975) in- 
volved actors and observers giving explana- 
tions of the actor’s degree of friendliness, 
talkativeness, nervousness, and dominance in 
a “getting acquainted” situation. On the one 
hand, Buss classified these behavioral dimen- 
sions as occurrences that should evoke causal 
explanations from both actors and observers. 
Buss also allowed, however, that if the actor 
consciously controls his/her self-presentation 
along these dimensions, the actor would pro- 
vide a reason, rather than a causal, explana- 
tion of behavior. Alternatively, the actor might 
h willful manipula- 


observer and investigator 
might perceive the ac 
intentionally controlling his/her behavior. 

In light of potential disagreements between 
the actor, observer, and investigator concern- 
ing the presence or absence of intention on the 
actor’s part, the investigator would seemingly 
be forced either (a) to rely on the actor’s self- 
report of his/her behavioral intentionality or 
lack thereof or (b) to judge the content of the 
actor’s explanation to determine whether it 
comprises a reason or causal explanation of an 
intended action or unintended occurrence, 
respectively. The first possibility depends on 
the somewhat questionable validity of the 
actor’s self-report, especially in relation to 
willingness and ability to admit intention, 
whereas the second depends on the existence 
of independent criteria that can be used to 
classify the content of explanations as either 
primarily causal or primarily reason in nature. 
As will be discussed in the following section, 
Buss’s analysis does not readily yield such 
criteria. 

Overall, we must conclude that Buss’s anal- 
ysis is weak in clarity of definitions or em- 
pirical implications for concepts such as in- 
tention and conscious control of behavior, 
which are central to his action-occurrence dis- 
tinction. Presumably, these concepts refer to 
qualities of actions, not occurrences, but is it 
necessary for both to exist for an action to be 
perceived? For instance, would it not be pos- 
sible for a behavior to be consciously and de- 
liberately taken, but not intentionally initi- 
ated? Considerations such as these are totally 
lacking in Buss’s analysis and require further 
explication if his statement is to be of much 
value to researchers. 


Empirical Implications of the Cause-Reason 
Distinction 


Buss’s second major assumption, that actors 
and observers differ in the types of explana- 
tions they give for actions and occurrences, 
also entails major difficulties in its empirical 
implications. In line with his criticism that 
existing actor-observer research has typically 
confounded action and occurrence behaviors, 
Buss seems to suggest that actor-observer dif- 
ferences should be investigated by having 
actors and observers give explanations of be- 
haviors that can unambiguously be classified 
as actions and occurrences. One thus can 
formulate a basic experimental design to test 
the validity of Buss’s cause-reason distinction; 
this design would involve comparisons of the 
explanations offered by actors and observers 
of both actions and occurrences. 

Under these experimental conditions, Buss’s 
model predicts that no differences should be 
expected in the content of actors’ and ob- 
servers’ (causal) explanations of occurrences, 
but that differences may be expected in their 
explanations of actions. That is, actors may 
give only reason explanations for actions and 
observers may give reason, causal, or reason 
plus causal (mixed) explanations. Buss fur- 
ther contended that the types of explanations 
elicited from observers to explain actions may 
be manipulated by instructing the observer 
(a) to explain the action using reasons as he/ 
she perceives the actor would, (b) to give a 
reason explanation based on the observer's 
personal interpretation of the action, or (c) to 
give a strictly causal 
Conceivably, 


however, ap- 
parently because his model precludes the pos- 
sibility that actors might give causal explana- 
tions of actions by adopting the perspective 
both actors and 
observers give explanations of the same log- 
ical form (i.e., when actors and observers 


this assertion about the logi 
of explanations under these conditions, Buss 
also maintained the seemingly inconsistent 
position that actors and observers “are en- 
gaged in fundamentally (i.e., logically) dif- 
ferent situations to explain the same action” 
(p. 1316). Overall, it would seem that an ade- 
quate empirical test of Buss’s cause-reason 
distinction would involve having both actors 
and observers explain actions and occurrences 
from the perspective of both an actor and an 
observer. 


reason distinction, his statement is 
regarding meaningful a pri 
tinguishing and cla 


explanations that would be produced. For in- 
stance, Buss merely states that causal and 
reason explanations may be distinguished by 
the greater degree of justification “vis-à-vis 
society’s norms for ‘proper’ conduct” (p. 
1315) in reason than causal explanations, and 
by the greater degree of “such ‘connotative 
attributes’ as lawfulness, determinism, ante- 
cedent-consequent relationship, predictability, 
and replicability” (p. 1314) in causal than 
reason explanations. He fails, however, to 
elaborate on how these rather vague distinc- 
tions might be measured in explanations de- 
rived in an experimental situation. This dif- 
ficulty in empirically classifying explanations 
as causal or reason is compounded further by 
Buss’s contention that actors give reason ex- 
planations of actions that are couched in 
causal terminology (pp. 1315, 1318), and that 
observers may make justifications for the 
actor (p. 1318). 

It may prove possible, of course, to estab- 
lish empirically those situations in which con- 
tent differences in actors’ and observers’ ex- 
planations of actions are either maximized 
or minimized, in a manner similar to findings 
of divergence and convergence in actor-ob- 
server explanations conducted within the 
actor-observer research paradigm (e.g., Wells, 
Petty, Harkins, Kagehiro, & Harvey, 1977). 
Notwithstanding this possibility, however, 
Buss’s failure to articulate well-developed a 
priori criteria for establishing the causal or 
reason nature of explanations must be con- 
sidered a fundamental shortcoming of his 
theoretical conceptualization. Without better 
a priori criteria, one is generally left in the 
scientifically indefensible position of defining, 
in a circular manner, reason explanations as 
those given by actors for actions and causal 
explanations as those given by actors and ob- 
servers for occurrences. 

A final issue, which indirectly relates to 
problems in investigating Buss’s conceptual- 
ization, concerns his implicit characterization 
of the information sources and perceptive 
capacities of actors and observers. Specifically, 
in comparison to Jones and Nisbett’s hypoth- 
esis, Buss implicitly appears to accord the ob- 
server more and the actor fewer information 
sources and perceptive powers for deriving 
attributions. Whereas Jones and Nisbett em- 
phasized that the actor often possesses more 
relevant information regarding his/her be- 
havior than the observer, who often merely 
glimpses short segments of the actor’s ongoing 
behavior, Buss instead presented the observer 
as possessing the capacity to perceive motives 
and forces influencing the actor’s behavior 
that operate outside the actor’s awareness (p. 
1316). The observer and also the investigator, 
who qualifies as an observer, may then use 
this information to give a causal explanation 
of the actor’s behavior. In contrast, Buss de- 
picted the actor’s perceptual and attributional 
capacities as substantially limited by his/her 
need to provide society with a justification for 
personal behavior.¹ 

Hence, in Buss’s conception, the observer is 
essentially characterized as capable of omni- 
sciently perceiving unconscious forces influ- 
encing the actor’s behavior, which form the 
basis of the observer’s truly causal explanation 
of the actor’s behavior. This implied view is 
surprisingly consistent with the disinterested- 
scientist efficient-causality model that Buss 
seems to reject. Buss’s implicit characteriza- 
tion of the perceptual and attributional capac- 
ities of actors and observers does not neces- 
sarily invalidate the basic cause-reason dis- 
tinction advanced in his paper. Nonetheless, 
its questionable foundation, which involves 
little consideration of extant evidence and 
analysis regarding actor-observer inference 
processes, does lessen the intuitive appeal of 
Buss’s conceptualization. 


Conclusions 


The foregoing discussion highlights some of 
the issues and problems involved in attempt- 
ing to incorporate Buss’s ideas in empirical 
investigations of lay explanations of behavior. 
We acknowledge that his criticism regarding 
the previous failure of attribution investiga- 
tors to distinguish actions and occurrences em- 
pirically is well taken. However, we must 
conclude that Buss’s analysis offers little sub- 
stantive guidance in establishing theoretical 
criteria for distinguishing between these two 
types of behavior or in testing his hypothesis 
regarding logical differences in the types of ex- 
planations given by actors and observers. 
Without better criteria for distinguishing ac- 
tions and occurrences and for judging the con- 
tent of explanations as causal or reason in 
nature, the impact of Buss’s contribution ap- 
proaches that of an untestable armchair anal- 
ysis. 


1 Although Buss may reasonably take issue with 
Jones and Nisbett’s assumptions about possible medi- 
ators of actor-observer differences, he fails to elab- 
orate the psychological processes that may mediate 
actors’ presumed preoccupation with justifying, rea- 
son explanations. If, as Buss implies, motivations 
associated with self-esteem maintenance or enhance- 
ment, or with self-presentation concerns, underlie 
actors’ reason attributions, we view Buss’s analysis as 
inadequate in its failure to address several works 
relevant to the role of self-serving bias mediators of 
attributions, such as Miller and Ross’s (1975) and 
Bradley’s (1978) influential discussions of information 
processing alternatives to a self-esteem position or 
Monson and Snyder’s (1977) important reconceptual- 
ization of the psychological processes involved in 
actor-observer attributional phenomena. 
The cause-reason distinction in attribution theory is examined. The author 
notes conceptual difficulties associated with previous interpretations of cause 
as opposed to reason and offers a reformulation that shows how reason may 
refer to a specific type of explanation (notably, teleological) and cause may 
refer (a) to the general case of explanation (the inclusive sense) or (b) to the 
general case of nonteleological explanation (the exclusive sense). From this 
perspective he considers Buss’s recent comments regarding teleological and 
nonteleological (causal) modes of naive explanation. It is noted that teleologi- 
cal (reason-type) explanation does indeed fall within the purview of attribu- 
tion theory but is merely one among a vast number of possible explanatory 
types that all properly belong within the attributional domain. Hence, the 
reason concept is of no unique significance for the theory of attribution. The 
foregoing argument culminates in the conclusion that a theory of naive episte- 
mology might well rid itself of concern with lay concepts (such as cause and 
reason) that have figured frequently in attributional formulations. Such con- 
cepts may be thought of as the contents of knowledge, of which there may be 
an infinite variety. Instead, an epistemic theory should restrict itself to the 
process of knowledge acquisition assumed to be invariant across the diverse 
epistemic contents of possible interest to the layperson. 


My object in the following pages is to con- 
sider the cause-reason distinction in attribu- 
tion theory and to comment on Buss’s (1978) 
recent discussion of this topic. Despite my 
disagreement with some of its contents, I find 
Buss’s article of considerable interest, par- 
ticularly as it calls attention to the relevance 
of British philosophy of the ordinary language 
(e.g., Anscombe, 1957; Austin, 1962; or Ryle, 
1949) for contemporary research on attribu- 
tion (Harvey, Ickes, & Kidd, 1976, 1978; 
Jones, Kanouse, Kelley, Nisbett, Valins, & 
Weiner, 1972; Kelley, 1967). 

The main task that the ordinary-language 
philosophy set for itself was the explication 
of concepts in everyday discourse. Buss 
(1978) takes two such concepts, cause and 
reason, and proclaims them of central import- 
ance to the theory of attribution. By contrast, 


I have felt for some time (cf. Kruglanski, 
1977; Kruglanski, Hamel, Maides, & 
Schwartz, 1978; Kruglanski & Bar-Tal, Note 
1) that the preoccupation with lay concepts 
threatens to ensnare attribution theory in end- 
less particularities that might ultimately di- 
vest it of scientific value. In the remainder of 
this article I present my arguments for this 
position, with special reference to the cause- 
reason issue. 

The present essay is composed of four 
major parts. In the first, some difficulties are 
noted with common conceptions of cause and 
reason, and a reinterpretation is offered clear of 
those difficulties. In the second part, teleological
and nonteleological modes of naive explanation
are examined and are contrasted
with Buss’s views on these matters. The third 
part deals with Buss’s claim that conceptual 
confusion in attribution research may have 
resulted from a failure to draw the distinction 
between cause and reason. In the fourth part
the cause-reason issue is cast into broader
perspective and an evaluation is made of the
place that any lay concepts may be hoped to 
assume within the theory of attribution. 


The Cause-Reason Distinction 
Reconceptualized 


Some Problems With Current Conceptions 
of Cause Versus Reason


Let us see how Buss characterizes cause, 
reason, and the differences that set them apart. 
As he expressed it (Buss, 1978, p. 1311), 
“causes are that which brings about a 
change,” whereas “reasons are that for which 
a change is brought about (e.g., goals, pur- 
poses, etc.).” In Aristotle’s classification, 
Buss’s cause is efficient cause, reason is final 
cause, and they are both presumed to differ 
from material cause, or “that in which the 
change comes about,” and formal cause, or 
“the pattern or shape of that which is 
changed” (p. 1312). We are further informed 
that causal explanation is associated with 
“such connotative ‘attributes’ as lawfulness, 
determinism, antecedent-consequent relationship,
predictability and replicability” (p. 1314).

But the foregoing clarifications notwith- 
standing, the distinction between cause and 
reason still seems possessed of a “will-o’-the-wisp”
quality. For instance, characterization
of cause as “that which brings about change” 
would appear equally applicable to reason. 
Surely an actor has a reason for an action be- 
fore it occurs (a person may feel thirsty and 
then proceed to drink), and action would not 
have occurred without the reason, so the rea- 
son may well be said to bring about the 
change (i.e., the action). This conclusion is 
hardly altered if reason is conceptualized as 
the end state that the actor strives to attain 
(e.g., the quenching of one’s thirst). Anticipation
of the end state must antedate the action
and, in a sense, bring it about. Needless to
say, the actual end state (vs. mere anticipation
thereof) does not bring about the action,
and many a behavior may run its course without
eventual attainment of the underlying
purpose.

The above considerations suggest that reason
may bring about action (change) and that
the antecedent—consequent relation that Buss 
(p. 1314) reserved for causal explanation is 
shared by teleological explanation. Further- 
more, it appears that lawfulness, predictabil- 
ity, replicability, and even a certain sense of 
determinism may well be predicated on teleo- 
logical explanations, so they may not differ- 
entiate it from causal explanation. To explain 
a person’s action in terms of a reason means 
that, all else being constant, given the reason, 
the action would follow. This of course means
that reason and action are lawfully (hence 
replicably) related and that action may be 
predicted from reason. As has just been noted, 
actual success in predicting an action from a 
reason requires the constancy of all remaining 
conditions including possible alternative goals 
and beliefs that the actor might have as to the 
best means for accomplishing his/her particu- 
lar end. That such psychological constancy 
across occasions might sometimes be unlikely 
in practice does not in the least detract from 
the principle of predictability and replicability 
intended by teleological explanation. 

To illustrate the essential lawfulness con- 
veyed by a teleological explanation, consider 
a brigand accosting a victim with a “money 
or life” dilemma. Most people would agree 
that the brigand is attempting to elicit in the 
victim the “life-saving” end as well as the 
belief that yielding the money would be the 
surest way of reaching this end. Note that the 
brigand’s expectation of being obeyed (or this 
person’s prediction that he/she would be) be- 
trays the assumption that purposes sys- 
tematically lead to actions (that may best 
serve the purposes involved).

Furthermore, should the brigand’s prediction
fail to materialize (should the victim
refuse to comply), it seems unlikely that this
would be explained by an unlawfulness of the
purpose-action relation. Rather, the actor’s
unexpected behavior would probably be taken
to reflect a shift in circumstances, that is, a
violation of the constancy assumption, that
may render the situation irrelevant to the
original purpose-action hypothesis. For example,
one might postulate that a different
purpose dominated the victim’s conduct (a
sudden penchant for heroics) or that the actor
deemed a different action to serve the original
purpose best (having become a black belt in
karate, the victim could feel that a side kick
to the assailant’s chin might best rid him/her
of the threat involved).

Thus, contrary to implying that the relation
of purpose to action is helter-skelter, social
interaction seems fundamentally to depend
on a belief that the latter may be
predicted from the former, that is, to depend
on the essential lawfulness intended by teleological
explanation (for a similar view see,
e.g., Schlenker, 1974). The preceding discussion
makes plain that the demarcation of
teleological explanation from causal explanation
is a largely unaccomplished task.


An Asymmetrical View of Cause 
Versus Reason 


I believe that the cause-reason question 
may be neatly resolved (or dissolved) when 
reformulated asymmetrically rather than 
symmetrically, as it has been formulated by 
several contemporary commentators (e.g.,
Buss, 1978). By a symmetrical interpretation
of cause versus reason I mean their treatment
as equally definitive concepts of unique significance
for the problem of explanation. However,
while most people seem to have little
difficulty with the concept of reason (usually
defined in such familiar terms as purpose or
goal), explication of the notion of cause has
required considerable analytic efforts. For example,
Buss approvingly cites the view that
“causal explanation of human action would be
in terms of bodily movements” (Buss, 1978,
p. 1313). In addition, Buss equates causality
with “that which makes things happen, or
brings about change” (p. 1312). I admit to
finding such clarifications not entirely satisfactory
and to being left with a gnawing sense
of uncertainty regarding the intended meaning
of cause.


According to the present asymmetrical
view, reason is interpreted as a specific type
of explanation (notably, teleological), whereas
cause is interpreted as explanation¹ in the
generic sense. More precisely, the cause notion
has apparently been employed in two broad
senses: one sense (henceforth labeled inclusive)
has denoted any explanation, including
the teleological. The second sense (henceforth
exclusive) has denoted any explanation excluding
the teleological. The inclusive sense seems
to underlie Aristotle’s classification of causes 
into efficient, material, formal, and final, for 
example. This line of reasoning seems now to 
reflect simply four types of explanation, of 
which teleological explanation (based on final 
causes) may be one. Similarly, in attribution 
theory, frequent references to causal explana- 
tion connote (somewhat redundantly) the 
theory’s concern with naive explanation of 
whichever type rather than with special kinds 
of naive explanation. Finally, the pervasive 
use of the because preposition with all types 
of explanation is readily comprehensible if we 
assume that cause simply means explanation, 
no more, no less. 

The exclusive interpretation of cause (as 
any nonteleological explanation) seems prom- 
inent in those treatments where it is explicitly 
opposed to reason, as in Buss’s (1978) recent 
analysis. In those instances, cause seems de- 
fined negatively as a “nonreason” but is not 
further characterized more positively. In the 
remainder of this article, too, cause will be 
contrasted with reason and will exclusively 
denote any nonreason-type explanatory cat- 
egory. Regardless of whether the inclusive or 
the exclusive view of cause is adopted, the 
asymmetrical interpretation of the cause-reason
issue has this fundamental implication:
Reason is merely one type of explanation, 
whereas cause may include an infinite variety 
of explanatory types. Thus Aristotle’s fourfold 
classification of possible causes is considered 
a wholly arbitrary taxonomy. Besides efficient, 
material, formal, and final explanations, one 
can have mathematical, mechanical, geographic,
demographic, or social explanations,
and so on. Similarly, cause and reason may
not now be regarded as two equivalent explanatory
types of special significance. Rather,
reason is seen to represent a particular type
of explanation, whereas cause is seen to represent
explanation in general or nonreason-type
explanation in general. Thus, we seem to have
an infinity of explanatory types of which
teleological explanation is but one. In the
following section this type of explanation is
considered in further detail.

¹ Consistent with common usage, explanation is
used here to mean a statement constructed around the
connective because (as in “Bill stayed home because
it rained”). It is generally accepted that such statements
presume a universal conditional (if-then)
proposition (“if it rains Bill stays home”) and an
instantiating clause affirming the antecedent of the
above conditional (“it did rain”; see, e.g., Popper,
1961, p. 19; Turner, 1965, p. 272).


Teleological and Nonteleological Modes 
of Naive Explanation 


Two major questions with which Buss 
(1978) deals at length are (a) whether non- 
teleological (in his terms, causal) explanations 
of actions and/or teleological explanations of 
occurrences may be possible and (b) who
(actors, observers, or both?) may be capable 
of nonteleological versus teleological accounts 
of actions or occurrences. In his answers Buss 
implies that nonteleological accounts of ac- 
tions are in principle possible, although they 
may be given by observers only, not by ac- 
tors: “action . . . is explained by the actor 
with reasons. The observer may use causes 
and/or reasons in explaining action” (p.
1311). By contrast, occurrences (i.e., events
devoid of intentionality) are assumed to be
explicable only nonteleologically: “occurrence
. . . is explained by both actors and observers
with causes” (p. 1311). In the forthcoming
section I closely examine Issues a and b above
and comment on Buss’s proposed answers.


Are Nonteleological Accounts of Actions 
and/or Teleological Accounts of 
Occurrences Possible? 


The answers to these questions depend on 
how one chooses to define teleological and 
nonteleological explanations. If by nonteleological
explanation is meant explanation that
denies the behavior’s purposive character, a
nonteleological explanation of actions is impossible
by definition. I suspect, however, that
for most people a teleological explanation is
merely one that identifies the actor’s reason
for a behavior. For example, one might (teleologically)
account for Mark’s application to
Harvard Law School by mentioning his hope
of joining the family law firm, thus explicitly
identifying the behavior’s underlying reason.
Similarly, one might (teleologically) explain
Sara’s departure for a skiing trip by labeling
it an end in itself, which again implicates the
action’s instigating reason (Sara’s desire to
ski; see, e.g., Kruglanski, 1975, Derivation 9,
p. 302).

Granting the above definition of teleological 
explanation, a nonteleological explanation 
would be one that is mute with respect to the 
actor’s specific reason for a behavior. Further, 
because actions alone may have underlying 
reasons (purposes, etc.), if a teleological ex- 
planation is given, it follows logically that the 
event explained was an action. The obverse 
does not follow, however: Given that an action
was explained, the explanation need not
necessarily have been teleological. For example,
one might nonteleologically account for
John’s application to Harvard in terms of his 
upbringing, which does not clarify John’s spe- 
cific reason for the action. Similarly, one 
might explain Paul’s decision to undertake 
some job in terms of the situation (most peo- 
ple in it would have done the same) or in 
terms of Paul’s personality (only someone like 
him would have done so), again without 
illuminating Paul’s specific reason for the de- 
cision. Assuming the interpretation of cause as 
nonreason (the exclusive interpretation given 
earlier), Buss seems correct in implying that 
nonteleological accounts of actions are in principle
possible.²

Finally, if the essence of teleological ex- 
planation consists of identifying the actor’s 
reason, such explanation may not logically be 
possible with reference to nonintentional behaviors.
One cannot identify something that
is not there, that is, a purpose in a nonintentional
behavior. Assuming the exclusive interpretation
of the cause concept, Buss’s assertion
that occurrences must be explained
causally by all actors as well as observers
simply means that they may not be explained
teleologically. This follows tautologically from
the definition of occurrences (as nonintentional
behaviors devoid of reasons) and must
be considered quite indisputable. Let us consider
now an issue on which Buss’s claims may
not seem as cogent: the hypothesis of actor-observer
differences in the rendition of teleological
and nonteleological explanations.

² This is contrary to Postulate 1 of my endogenous-exogenous
article (Kruglanski, 1975, p. 389), which
implies that nonteleological explanation of actions
may not be possible. Assuming the present definition
of teleological and nonteleological explanation, the
above implication of Postulate 1 must now be retracted.


Are Observers Alone Capable of 
Nonteleological Explanations for Actions? 


Buss provides two separate arguments for 
the contention that observers alone, and not 
actors, can explain action nonteleologically. 
According to one argument, 


The actor's “because” statements consist of giving his 
or her reasons for an action, that is, matters that 
weighed in on his or her deliberation. By definition, 
in giving one’s reasons for an action, only matters 
that are consciously available to the actor can occur 
in such explanations. In contrast, the observer who 
is asked to explain “why” in regard to an actor’s 
action can include causal attributions that operate 
out of consciousness of the actor. Thus, while an ob- 
server can make causal attributions in explaining the 
action of an actor . . . , the actor’s explanation could 
not include such kinds of attributions. . . . (Buss, 
1978, pp. 1315-1316) 


According to the second argument: “In
asking an actor to explain his or her action,
the ‘why’ is a request to justify, or to make
rational or intelligible, his or her action vis-à-vis
society’s norms for ‘proper’ conduct . . .”
(p. 1315). I now examine the two foregoing 
arguments in turn. 

The differential-consciousness hypothesis. 
Turning to the former argument, it may be 
readily conceded that only matters available 
in one’s consciousness can figure in one’s ex- 
planations and that the contents of actors’ 
consciousness may systematically differ from 


those of observers’ consciousness. Such differences
have usually been interpreted as an
informational advantage of actor over observer,
as the former person more than the
latter may be cognizant of his or her own
bodily reactions or of the details of his or her
own personal history. But it is difficult to see
how an excess of information may prevent the
actor from formulating explanations of which
an observer may be capable. 

More generally, granting that a nonteleo- 
logical explanation is merely one that stays 
incommunicative with respect to the actor’s 
reason for a behavior, it still remains to 
specify what in the actor’s consciousness 
might prevent this person from arriving at 
explanations in this class. In fact, nonteleological
explanations of a person’s own actions
seem quite commonplace; criminal activity is 
often explained with reference to the adverse 
social conditions during the criminal’s child- 
hood, and a hostile act may be accounted for 
in terms of the perpetrator’s tension or irrita- 
bility. 

The justification hypothesis. Buss’s second 
argument for the actor’s presumptive inability 
to formulate nonteleological accounts of his 
or her own behavior assumes that (a) the 
actor’s accounts of personal actions are neces- 
sarily justificatory (Buss, 1978, p. 1315) and 
(b) all justificatory accounts of actions are 
necessarily teleological. On close inspection, 
both of the foregoing premises appear ques- 
tionable. The first appears unwarranted if 
justificatory is taken in the sense of norma- 
tively excusable (this seems implied in Buss’s 
reference on p. 1315 to society’s norms and 
rules for “proper” social behavior). Surely an 
actor may occasionally admit to socially un- 
wholesome motives (greed, cowardice, venge- 
ance) or to reasons that are normatively neu- 
tral, as when one explains a trip to the grocery 
store by reference to the empty cupboard at 
home. 

If by justificatory is meant, more broadly, 
intelligible or plausible accounts (excusable or 
not; cf. Buss, p. 1315), then Premise b
would still seem doubtful. Many intelligible 
accounts of one’s own behavior may be non- 
teleological, as in the earlier example of the 
criminal who may account for deviant actions 
in terms of the deleterious social circum- 
stances in his or her neighborhood. Finally, if 
by justificatory is meant rational (Buss, p. 
1315) and by rational is meant teleological, 
then accounting for the actor’s presumptive 
need to give teleological explanations by invoking
the need to give justificatory explanations
would be circular, hence devoid of
clarifying value. The preceding discussion
shows that Buss’s hypotheses of actor-observer
differences, relying on teleological (vs.
nonteleological) explanations, are largely unsupported.


The Cause-Reason Distinction in 
Attribution Research 


According to Buss, the failure of attribution 
theorists to draw the distinction between 
causes and reasons has resulted in much con- 
ceptual confusion. This is illustrated in refer- 
ence to two research topics: (a) Jones and 
Nisbett’s (1971) hypothesis of actor-observer 
differences in attribution and (b) Kruglanski’s 
(1975) partition between endogenous and 
exogenous attributions. Let us examine Buss’s 
specific comments on these issues.


Jones and Nisbett’s (1971) Actor—Observer 
Hypothesis 


In a classic attributional essay, Jones and
Nisbett (1971) speculated, “actors attribute 
cause to situations while observers attribute 
cause to dispositions . . .” (p. 82). Buss 
(1978, p. 1316) regarded the above formula- 
tion as objectionable, for it contradicts his 
aforementioned assumption that actors (un- 
like observers) are fundamentally incapable 
of causal (i.e., nonteleological) accounts of
their own behaviors. Furthermore, Buss
viewed as categorically mistaken statements
(occasionally made by researchers in the
actor-observer paradigm) conjoining the
actor term with nonintentional behaviors like
nervousness and physiological reactions to
music or emotional responses. 

But from the present perspective the prob- 
lems identified by Buss are not as serious as 
has been alleged. The statement “actor attributes
cause for own behavior” may be
taken in the inclusive sense of cause to mean
simply that the actor explains his or her own
behavior (indeed, I strongly suspect this to
have been Jones and Nisbett’s intent in the
first place). But even if cause is interpreted
exclusively as a nonteleological explanation,
it follows from the preceding section that
actors should be capable of attributing cause
(for their own behavior) if observers are
capable of such attributions. 

As for the linkage of the actor term with 
nonintentional behaviors (occurrences), the 
contradiction identified by Buss exists only if 
actor is narrowly interpreted as producer of 
intentional behaviors. If actor is taken in the 
broader (if somewhat Pickwickian) sense of 
target person whose responses (voluntary or 
not) are being explained (an interpretation 
that I again believe better captures Jones and 
Nisbett’s intent), however, the apparent con- 
tradiction exists no longer. Generally, the 
hypothesis of Jones and Nisbett (1971) impli- 
cates the differential preference of behaving 
individuals (actors, target persons) versus the 
observers of such persons, for situational 
versus dispositional explanations of the be- 
haviors at issue. In present terms, both situa- 
tional and dispositional explanations are non- 
teleological: They do not name specific rea- 
sons, if such reasons are indeed involved in 
the behaviors observed. As we have seen,
actions as well as occurrences may be ex- 
plained nonteleologically, and actors (target 
persons) as well as observers seem capable of 
nonteleological explanations. Hence, contrary 
to Buss’s reasoning, Jones and Nisbett’s hy- 
pothesis seems conceptually coherent, and its 
case would have to rest ultimately with em- 
pirical research. 

Finally, to the extent that neither the per- 
son nor the situation categories are classifica- 
tions of reasons, the cause-reason distinction 
is not particularly relevant to the paradigm 
of Jones and Nisbett. The distinction’s rel- 
evance hinges entirely upon relevance of the 
reason concept to a given domain of phenom- 
ena, the residual concept of cause signifying 
merely a nonreason, that is, containing no
unique meaning independent of the reason
notion.


The Endogenous—Exogenous Partition 


Buss’s second example of confusion due to 
failure to differentiate between cause and reason
recalls the theory of endogenous (vs.
exogenous) attributions (Kruglanski, 1975)
concerned with the case in which action is
assumed to be an end in itself (endogenous
attribution) rather than a means to a further
end (exogenous attribution). As Buss (1978,
p. 1318) correctly points out, Postulate 1 of
the theory implies a distinction between causal
and teleological explanation in stating that 
“the lay explanation of actions is teleological 
rather than causal” (Kruglanski, 1975, p. 
389). But subsequent portions of the same 
article contradict the distinction in speaking 
about “causally attributing actions . . . ” and 
in implying the interchangeability of causes 
and reasons (p. 389). 

The foregoing criticisms by Buss seem quite
valuable: (a) They demonstrate a clear inconsistency
in my usages of the causality notion
throughout the 1975 article, and (b)
they highlight an authentic shift of views be- 
tween my 1975 work and my current reason- 
ing. While Postulate 1 of the former article 
denies the possibility of explaining actions 
nonteleologically, such a possibility has been 
explicitly affirmed throughout the present 
essay. 

Finally, note that the endogenous and 
exogenous categories refer to teleological ex- 
planations, so the notion of reason would seem 
quite pertinent: Implications drawn from an 
endogenous attribution (Kruglanski, 1975, 
pp. 391-392) follow without exception from 
its being a classification of reasons. In agree- 
ment with Buss (p. 1320), we may therefore 
conclude that the reason concept is of con- 
siderable significance to the endogenous- 
exogenous paradigm. 

In sum, the concept of reason (and the de- 
rivative explanation type called teleological) 
could be of relevance to some domains of attributional
research (like that of endogenous
vs. exogenous attributions) while being irrelevant
to alternative attributional domains
(like that of actor-observer differences).
What then can be safely concluded about the
concept’s significance for attribution theory
at large? This issue is specifically addressed
in the following section.


The Reason Notion and the Content Problem
in Lay Epistemology


The main thrust of Buss’s analysis has been
for the inclusion of teleological (reason-type)
explanations within the purview of attribution
theory. I readily concur with the suggestion: 
The recognized province of attribution theory 
is lay explanation; a teleological account is 
one type of lay explanation, so it should 
properly fall within that province. 

But it is one thing to concede the applicability
of attribution theory to reason-type explanations
and quite another to pronounce
the reason-cause distinction as of key im- 
portance (see Buss, 1978, p. 1311) within the 
theory. Such centrality would certainly be 
implied in a scheme that regards the reason 
and cause categories as the unique classifica- 
tory scheme for explanations. But according 
to the present view there is nothing unique 
about classifying explanations into causes and 
reasons. As noted earlier, the concept of cause 
as distinct from reason seems best interpret- 
able as a nonreason, that is, as representing 
an open class encompassing a potential in- 
finity of explanation types. In other words, 
reason-type or teleological explanation is 
merely one of an infinite number of possible 
explanations. Furthermore, viewed now as it 
separates reason from nonreason, the reason- 
cause partition may be compared to the pos- 
sible distinctions between (any of the infinite) 
explanatory categories and their comple- 
mentary classes, for example, mechanical ver- 
sus nonmechanical explanations, biological 
versus nonbiological explanations, and so 
forth. In sum, attribution theory deals with
the general case of explanation, so any distinc- 
tion (like that between reason and cause) 
based on a specific type of explanation (out 
of the infinity that are possible) must be of a 
vanishing significance within the theory.

But matters will now become even more 
extreme. We have already seen that the class 
of teleological statements (reason-type explanations)
is a subset of the class of all explanatory
statements and that the latter class
defines the recognized purview of attribution 
theory. But obviously the class of explanatory 
statements (those constructed around the be- 
cause preposition) is a subset of the class of 
all statements, and I shall now argue that it is 
the latter class that defines the domain of a 
general epistemological theory, of which attribution
theory is merely a special case.
Within this general epistemic theory any
classification of statements (as explanatory
or as teleological) turns out to be of vanishing
importance. Therefore, the attribution framework,
constrained as it may be to explanatory
statements, deals with an extremely small seg- 
ment of the domain to which general epis- 
temological laws are assumed to apply. 
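
The nesting of domains argued for above may be summarized in set notation. This is only a compact restatement of the argument; the symbols T, E, and S are labels introduced here for illustration and are not part of the original formulation. T stands for the class of teleological (reason-type) statements, E for the class of explanatory statements (the purview of attribution theory), and S for the class of all statements (the domain of the general epistemic theory).

```latex
\[
  T \subset E \subset S
\]
```

Read this way, a distinction drawn inside E (such as reason vs. nonreason) singles out one subclass among indefinitely many, and E itself is only one subclass of S.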

The epistemic theory presently referred to 
(and elaborated in Kruglanski et al., 1978 and 
in Kruglanski & Bar-Tal, Note 1) rests on the 
partition between the contents of knowledge 
(epistemic contents) and the process of 
knowledge acquisition (epistemic process). 
Epistemic contents are all propositions whose 
validity a person might, on occasion, wish to 
ascertain (in principle, all propositions). The
epistemic process is the sequence of cognitive
operations performed by an individual on the
way to a given bit of knowledge. The terms 
process and content figured prominently in the 
recent debate about transhistorical laws in so- 
cial psychology (see, e.g., Gergen, 1973, 1976; 
Hendrick, 1976; Manis, 1975, 1976;
[illegible], 1976; Thorngate, 1975), although
their meaning in that context differed
from the present one. To clarify matters, the 
distinction between the two usages is now 
considered at some length. As presently employed,
process refers to systematic change
and content to the object of such change. In
this sense, the distinction between epistemic
process and content parallels the distinction
between the metabolic process and the biological
system and between the evolutionary
process and the evolving species, to name only
two examples.

The foregoing interpretation must be
sharply distinguished from that in which
process denotes the abstract or the general
and content the concrete or the specific.
This is the meaning of the process-content
partition that has characterized the
social-psychology-as-history debate just mentioned.
For example, in denying the feasibility
of process invariance in social psychology,
Thorngate (1975, p. 487) referred to “local
organization” (meaning contents) and “gen[. . .]ation”
(meaning process, italics [. . .]
underlying social behavior we are dealing with
a hypothetical structure suitable for encapsulating
a series of selected observations . . . .”
(i.e., a series of social contents). In the same
vein, Manis (1976) proposed that “the development
of a scientific theory . . . involves
the construction of some hypothetical mechanism
or process that lends coherence to an
otherwise chaotic set of particulars” (p. 431).
Thus, reinforcement is cited as a theoretical
term representing process, and a specific classroom
event, as a particular instance of reinforcement,
that is, content.

It is stressed that the two foregoing inter- 
pretations of process versus content are quite 
orthogonal. Both change and the object of 
change may be conceptualized on a variety of 
abstraction levels, and neither may be exclusively
identified with the abstract or the
specific level. For example, one might particularistically
describe a sequence of physiological
changes transforming a given child into
an adult at a specific time and place, or one
might refer to the very same event sequence
more abstractly as an instance of the matura- 
tion process. Similarly, the child (the object 
of change) might be particularistically identi- 
fied as Johnny Smith or might more abstractly 
be referred to as child, mammal, or physical 
body. 

The last example serves to underscore the 
relativity of abstraction levels; mammal is an
abstract class of which human child is a par-
ticular instance, but mammal is itself a par-
ticular instance of the organism class, which
in turn is a particular instance of the physical
object category, and so forth. Interpreting
process as the abstract and content as the spe-
cific, it is indeed correct to assert that “one
man’s process is another’s content” (cf. Hend-
rick, 1976, p. 395). But the relativity hardly
extends to the alternative interpretation of
process and content mentioned earlier. To
confuse change and the object of change would 
represent a gross mistaking of category (Ryle, 
1949), of which few persons would be guilty. 

While in the present usage the generic
categories of process and content are not pre-
sumed to differ in level of abstraction, a par-
ticular content could still be the object of
numerous particular processes, and a par-
ticular process could still apply invariably
across multiple particular contents. Thus the
(same) human body may be the object of
multiple different processes: growth, matura-
tion, aging, disease. Conversely, the process
of evolution may invariably apply to all bio-
logical species, and the process of metabolism
to all living organisms. In precisely this last
sense it is presently assumed that the same
epistemic process discussed below underlies
the acquisition of all epistemic contents.³ The
sequence of cognitive operations that a knower 
may perform is assumed to be invariant re- 
gardless of the kind of knowledge that this 
knower may seek. 

In light of the foregoing assumption it is 
now possible to appreciate the sense in which 
any classification of statements may be of a 
vanishing significance for the general epistemic 
theory: (a) Statements may be classed in a 
vast number of ways, based, for example, on 
their inclusion of any of the vast number of 
concepts that man may be competent to con-
struct. Thus, the designation of statements as
explanatory derives from their inclusion of the
concept of the conditional (if-then) relation
(see Footnote 1), but it is possible to devise
numerous alternative classifications based on
alternative included relations (e.g., the vari-
ous geometric, social, or biological relations).
(b) In the present framework any classifica-
tion of statements is a classification of epis-
temic contents to which the same epistemic
process is assumed to apply; for example, the 
process of explanation (the acquisition of so- 
called causal knowledge) would be no different 
from that involved in the acquisition of any 
knowledge, including nonexplanatory (non- 
causal) knowledge. It therefore follows that 
any classification of statements (any catego-
rization of epistemic contents) may be of ex-
tremely limited import for the general theory
of knowledge and that the theory might best
focus instead on the epistemic process.

A detailed discussion of epistemic process is
well outside the scope of the present essay (the
interested reader is referred to Kruglanski,
1977; Kruglanski et al., 1978; or Kruglanski
& Bar-Tal, Note 1), but passing consideration
of its major attributes is necessary. Briefly, it
is hypothesized that (a) the epistemic process
is purposive, that is, it is initiated by the indi-
vidual’s purpose to acquire a given bit of
knowledge. Furthermore, the process involves
(b) the formulation of an epistemic problem 
the solution of which may advance the 
knower’s purpose, (c) the generation of con- 
jectures regarding the identity of such a solu- 
tion, and (d) the deductive assessment of such 
conjectures by checking their implications for 
(logical) consistency with available evidence. 

The epistemic problem just referred to is 
conceptualized as a set of mutually exclusive 
propositions on some topic, among which a 
knower might wish to choose. Only some epis- 
temic problems may be explanatory, that is, 
revolve about because-type propositions; other 
epistemic problems may not, in this sense, 
be explanatory. For an example of an ex- 
planatory problem, consider someone choosing 
among the propositions “John laughed at the 
comedian (only) because of himself,” “John
laughed at the comedian (only) because of the
comedian,” and “John laughed at the come-
dian (only) because of the circumstances.”⁴
For an example of a nonexplanatory problem,
consider someone choosing among the proposi-
tions “The way to the market lies to the left,”
“to the right,” or “directly ahead.”

The same epistemic process is assumed to be 
involved in both examples and in all other 
cases of epistemic behavior. For instance, an
individual would not usually wonder about
John’s laughter or about the way to the market
unless he/she had a purpose for wanting to
know those things. Furthermore, in tackling
any epistemic problem the lay inquirer would
be likely to form conjectures about the correct
resolution. For example, one might conjecture
that John laughed at the comedian because of 


3 As with any scientific proposition, the assumption 
of process invariance in this case is not intended as 
an a priori truth but as a hypothesis open, in prin- 
ciple, to empirical test. In this connection it is of 
interest to note that at least some suggestions regard- 
ing the presumptive diversity of epistemic processes 
(e.g., Kelley’s, 1971, 1972, proposals for separate
attributional principles and schemata) are inter- 
pretable as instances of the same epistemic process 
applied to different contents of knowledge (cf. Krug- 
lanski & Bar-Tal, Note 1). 

4 Roughly modeled after problems employed in
McArthur’s (1972) research.

himself or that the way to the market lies to
the left. Such conjectures would then be tested 
via a deduction of some of their implications 
and the assessment of these against relevant 
evidence. Thus, the hypothesis that John 
laughed because of himself implies that he 
would also laugh on other similar occasions, 
that other people would not laugh similarly, 
and so forth. Data consistent with such im- 
plications would augment the knower’s con- 
fidence regarding the correctness of the con- 
jecture involved (as has been demonstrated 
empirically by McArthur, 1972). Similarly, 
with a nonexplanatory problem, the hypoth- 
esis that the way lies to the left implies among 
other things that if one pursued the path to 
the left one would eventually wind up at the 
market. Evidence consistent with such an im- 
plication would again bolster the knower’s 
confidence regarding the validity of the hy- 
pothesis at stake. 

The above discussion suggests that the same 
epistemic process may be involved in the 
acquisition of diverse epistemic contents. 
From a standpoint of scientific generalization 
it would thus seem that we should have a 
unitary theory of process applicable alike to 
divergent contents of knowledge. Still, an in-
dividual’s epistemic behavior in a specific in-
stance should vary considerably as a function
of the contents investigated. Someone wishing
to account for John’s laughter would thus care
very little about the consequence of turning
left at the crossroads; similarly, someone inter-
ested in reaching the market would be indif-
ferent, for example, to John’s behavior toward
various comedians. If so, the question arises
whether the theory of epistemic process may
not be profitably supplemented by various
theories of content, that is, by the specification
of various content categories that the lay
person may occasionally employ in the course
of epistemic behavior. Indeed, extant attribu-
tional formulations have frequently included
various such categories, for example, those of 
Dry accommo (et Kaley 
of endogenous versus exogenous causal-
ity (cf. Kruglanski, 1975), of ability, luck,
effort, and task difficulty (cf. Weiner, 1974).
In this sense Buss’s proposal to include the
reason category within the attribution frame-
work jibes well with an already popular trend
in attributional theorizing. 

Still, an immense difficulty with any pro- 
gram of constructing content-bound epistemic 
models (those constructed around specific 
concepts or statements in the layman’s reper- 
tory) is their potentially infinite variety. For 
exactly the same reason, the program of the 
ordinary-language philosophers, to explicate 
the terms in everyday discourse, can hardly be 
expected to get off the ground. The everyday 
discourse may contain an infinity of terms, so
the task of their explication may only begin 
but never end. 

Furthermore, people across cultures and 
historical periods may have vastly different 
conceptual repertories and so must deal in 
epistemic contents that may vastly differ. 
Hence, an epistemological theory constrained 
to specific elements of content (e.g., to ex-
planatory propositions or to special kinds of
explanatory propositions) seems condemned 
to the throes of cultural and historic relativ- 
ism (see, e.g., Gergen, 1973, 1976). Finally, to 
the extent that a normal person may pose di- 
vergent epistemic problems across varying 
occasions, an epistemic (or attributional) 
framework restricted to a narrow set of con- 
tents is perforce situationally specific and 
particularistic. Thus, we may not have con- 
tent elements in an epistemological theory for 
the very same reason that we must have a 
unitary theory of epistemic process: the fun- 
damental scientific commitment to generaliza- 
tion. The contextually shifting contents of 
relevant knowledge, while undoubtedly essen-
tial to the full understanding and prediction
of human conduct, must be divined spe-
cifically for each new situation rather than
derived from an invariant set of principles.


On the Relationship Between Causes and Reasons


Allan R. Buss 
University of Calgary, Calgary, Canada 


Criticisms (Harvey & Tucker, Kruglanski) of my recent conceptual critique of
attribution theory are criticized. Further analysis of the distinction between
cause and reason is undertaken, where their relationship is set forth in terms
of the growth of knowledge (at both the individual and the collective level).
The model involves converting causes into reasons and makes use of the psycho-
analytic therapy session as an analogue for purposes of explicating the process.


Reply to Harvey and Tucker 


Although Harvey and Tucker (1979) 
would seem to agree that the cause-reason 
distinction is valid and important for attribu- 
tion theory, they are somewhat concerned 
about what they regard as certain logical and 
empirical problems inherent in my analysis 
(Buss, 1978).

In regard to logical matters, Harvey and
Tucker believe that the distinction between
“action” and “occurrence” is not an easy one
to make. Thus they point out that physiolog-
ical responses are often susceptible to con-
scious and deliberate control, thereby render-
ing them actions rather than occurrences. I
quite agree, and am glad to see that my
earlier example of just such a situation (Buss,
1978, p. 1318, Footnote 6) did not go un-
noticed.


A second example in their attempt to dem-
onstrate some logical
an intention to act. The limits and/or con-
straints surrounding the action do not make
it less an action. Even if the test giver holds 
a gun to the test taker’s head and orders him/ 
her to complete the latest IQ test, such be- 
havior would constitute action, since the test 
taker still does have the option to refuse. 
Thus, the distinction between action and oc- 
currence is conceptually and logically rela- 
tively straightforward, given sufficient anal- 
ysis. 

When we move into the realm of empirical 
matters, defined here as experimental social 
psychology, then I quite agree with Harvey 
and Tucker that there are problems. Why 
shouldn’t there be, considering that the dis-
tinctions that I have stressed are conceptual
rather than empirical? How could one opera- 
tionally define a reason or a cause, an action 
or an occurrence? I cannot think of any sim- 
ple way. I do think that an experimenter 
sensitive to the conceptual issues I have 
raised can arrange the task demands such 
that there is an avoidance of category errors.
Reason and logic will guide the experimenter
here; I cannot offer a pat formula. My whole
point was to alert attribution theorists to the 
conceptual confusion in their area. If they 
themselves cannot remedy such confusion in 
their concrete researches, as Harvey and 
Tucker would seem to imply, then they should 
abandon this type of research. My challenge 
cannot be thrown back to me, since it rests 
with those who consider themselves attribu-
tion theorists and who believe that their ex-
perimental program has some potential to
enlighten us about lay explanation. 


Reply to Kruglanski 


Kruglanski (1979) has argued that causal 
explanation (efficient cause) is a generic form 
of explanation and that reason explanation 
(final cause) as well as Aristotle’s other two 
types (formal cause and material cause) are 
“merely” special instances of efficient causal- 
ity. In other words, Kruglanski is a monist, a 
scientific monist in the grand tradition of 
Hempel (1966), who attempted to reduce 
Dray’s (1957) use of reasons in historical
explanation to a special case of efficient cau- 
sality. The point, however, is that reasons 
and causes are truly distinct logically. Perhaps 
the best explication of this distinction avail- 
able to psychologists is Harré and Secord’s 
(1973), especially chapters 1, 2, and 8: 
human action reasons appear in a 
justificatory context, and logically imply the pro- 
priety of a happening, as opposed to its existence, 
which would follow from the description of its 
cause. (p. 148)


Thus the connection of reasons to an event is 
logical, whereas that between causes and an 
event is physical, that is, space/time. This 
analysis is, I think, the best thus far, and it 
avoids the necessity’ to to 
explanation à la Hemple i 
Kruglanski) or reason explanatio: 

In the same article, Kruglanski states that 
there are an infinite number of kinds of ex- 
planation. He cites mathematical, mechanical, 
geographic, demographic, social, and other 
types. Now, any of these types of explanations 
can be classified or reduced to one or more 
of Aristotle’s four basic types. Kruglanski 
has confused form and content. 

Kruglanski does make the valid point that 
actors are not necessarily required to give a 
reason account of their action and in fact may 
offer a causal explanation. I think my initial 
position, claiming that actors can only give 
a reason account of their action, was too ex-
treme. It would seem that both causes and
reasons could figure in an actor’s self-explana- 
tion. However, I think that Harré and Secord 
(1973) make an important point in this con-
nection when they state, “For ‘propaganda’
purposes some justificatory reasons are passed
off as causes, as e.g., in the substitution in
common speech of ‘need’ for ‘want’” (p.
149). The example they give concerns the 
alcoholic who states that he/she “needed” a 
drink (implying uncontrollable causal fac- 
tors) rather than that he/she “wanted” a 
drink (implying reasons that fail to justify 
the action). To the extent that taking a drink 
is difficult to justify, the actor will try to 
offer a causal explanation the validity of
which is suspect.


On the Relationship Between Causes 
and Reasons 


Let me now attempt a more explicit anal- 
ysis of the distinction between cause and rea- 
son by elaborating upon their relationship. 
In maintaining a logical distinction between 
causes and reasons, I would not claim that 
there is no relationship between them—in 
fact, quite the contrary. The first point to 
note in this regard is that a cause may be 
cited as a reason, that is, it may figure in one’s 
justificatory explanation. However, not all 
reasons are causes, since not every reason 
would have the space/time or antecedent-con-
sequent connection to an event. This view is
an extension of my previous position rather 
than being inconsistent with it. 

The second matter to consider in the rela- 
tionship between causes and reasons is more 
complex and involves the process of actually 
converting causes into reasons. A model that 
illustrates just such a relationship between 
causes and reasons has been developed and 
described by Apel (discussed in Radnitzky, 
1970), Habermas (1971), and Ricoeur 
(1970), as well as Taylor (1973), where the 
context is arriving at increased knowledge of 
self. Essentially, the model involves the dia- 
logue between patient (actor) and therapist 
(observer) in the psychoanalytic encounter. 
As long as the actions of the patient can be 
understood by his/her offering reasons (there- 
by arriving at valid intersubjective meanings, 
i.e., the point of the action is intelligible to 
both patient and therapist), then there is no 
need for the therapist (observer) to introduce
hypostatized causal forces. When there is a
breakdown in a reason account of action, how- 
ever, it is then necessary to resort to causal 
analysis, where the therapist (observer) at- 
tempts to formulate a causal explanation in- 
volving unconscious or hidden motives. To 
the extent that these causal factors are 
brought into the patient’s (actor’s) conscious- 
ness (through the technique of appropriately 
timed interpretations—thereby facilitating 
“insight”), then, according to the above-men-
tioned individuals, they cease functioning as
causes and become matters for deliberation, 
that is, reasons. Thus the road toward greater 
self- (reason) understanding and emancipa- 
tion from hypostatized causal forces lies in 
converting causes into reasons. In this way, 
causal analysis mediates, or is in the service 
of, greater (reason) understanding. 

The psychoanalytic encounter and the 
growth of self-knowledge operate at the indi- 
vidual level, and Habermas especially has 
generalized this basic paradigm to the collec- 
tive enterprise of the growth of knowledge in 
the social or human sciences. Revealing the
underlying social forces or causes of nonlib- 
eration requires criticism grounded in the 
emancipatory interest. At the social level,
bringing into consciousness previously un-
known hypostatized forces—causal forces that
hinder greater self- (reason) understanding
at a social level—will, of necessity, take the
form of a critique of ideologies. When pre-
viously hidden social forces are brought into
consciousness, they become matters of de-
liberation, that is, are converted into reasons,
and are thereby made not to function caus-
ally.

We can be somewhat critical of the bold-
ness of the claims of Apel, Habermas, Ricoeur,
and Taylor regarding the conversion of causes
into reasons and the supposed neutralization
of the causal efficacy of the cause in question
as a result. If only matters were that simple
fortunately, as crit- 


even after sufficient insight has been demon- 
strated by the patient, that is, after pre-
viously unconscious causes are brought into
consciousness. According to what we might
aptly dub the “conversion model,” the causes 
should cease to function once they enter con- 
sciousness and become matters for delibera- 
tion (reasons). Since this is often not the case 
in the psychoanalytic encounter and most cer- 
tainly does not apply to becoming conscious 
of previously hidden social forces, it is neces- 
sary to add a very important qualification to 
the conversion model.

The necessary qualification of the conver- 
sion model is that the bringing of previously 
hidden causes into consciousness—and the 
institution or creation of new reasons by this 
means—does not ipso facto neutralize or de- 
stroy the original cause. Rather, when a cause 
is converted into a reason, the cause may still 
operate, but now the awareness of that cause 
may potentially lead to eventual elimination 
of the cause qua cause. Thus, the conversion 
of causes into reasons may (but not neces- 
sarily) provide for deflating previously un- 
conscious causes of their causal power. 

The modified conversion model proposed 
here permits going beyond the implications 
of attribution theory for psychotherapy that 
have thus far been considered. Storms (1973), 
Storms and Nisbett (1970), and Valins and 
Nisbett (1972) have all discussed both the 
desirability of and the means for having the 
patient (actor) adopt more of an observer 
perspective with respect to his/her action. 
In this way, the patient (actor) would tend 
to give more dispositional or self-attributions 
in explaining his/her inappropriate or mal- 
adaptive action and would thereby assume 
greater responsibility for it (and thus, pre- 
sumably, greater responsibility for changing 
it). The modified conversion model permits 
us to be a little more sophisticated in under- 
standing how, in fact, a greater frequency 
of internal attributions made by the patient 
(actor) can work toward the goal of greater 
self-understanding and emancipation. That is 
to say, as a person becomes more self-con- 
scious of previously unconscious internal 
causes vis-à-vis maladaptive behavior, such 
causes may (but not necessarily) be made
causally impotent through conversion into
reasons.

However, it should be noted that a higher
frequency of internal causal attributions on
the part of the patient (actor) is not in and 
of itself desirable if the attributions are not 
valid. Storms (1973; see also Galper, 1976) 
did recognize that increasing the number of 
self-attributions may actually lead to an 
exacerbation of the pathological behavior if 
the causes of such behavior are external. The 
modified conversion model also goes beyond 
abstract individual liberation by incorporat-
ing unconscious social causes of nonliberation
within its framework. Thus one must guard 
against the reification of abstract internal 
causes (dispositions), as well as abstract ex- 
ternal causes (situations), and adopt more of 
a dialectical view of subjective (internal) 
and objective (external) forces. Viewed in 
this way, the successful conversion of causes 
into reasons, and thus the achievement of 
greater human emancipation, is two dimen- 
sional rather than one dimensional. 


Conclusion 


Causes and reasons are both (a) logically 
distinct and (b) intimately connected to each 
other. In order to understand truly the nature
of lay explanation, it is necessary to take 
cognizance of both of these conclusions. If 
attribution theorists will not or cannot do this 
(because of the inherent limitations of their 
experimental methodology and philosophy of 
science), then they should give up any pre- 
tense that they are better able to explain lay 
explanation than are the lay explainers.


Competence and the Overjustification Effect:
A Developmental Study


Ann K. Boggiano and Diane N. Ruble 
Princeton University 


This study examined the conditions under which information regarding com-
petence would mitigate the negative side effects of rewards on the intrinsic
interest of preschool and middle elementary school children. Children engaging
in a task of high initial interest anticipated a reward made contingent either
upon meeting a standard based on absolute performance level or upon task
engagement alone, or they were not rewarded. In addition, they were provided 
with direct information concerning competence presented in terms of social 
comparison. The pattern of results indicated that the preschool children were 
primarily affected by information about meeting the absolute standard but not 
the social comparison information. That is, the overjustification effect did not 
occur when attaining a reward was made contingent on meeting an absolute 
standard of performance. In contrast, social comparison information superseded 
the effect of the contingency of the reward on subsequent interest in the target 
task for the older children. These findings suggest the importance of research 
from a developmental perspective for attempts to delineate the conditions under 
which rewards may avert the undermining of intrinsic interest.


A number of researchers have demon- 
strated that offering a reward to a person for 
engaging in an otherwise enjoyable activity 
will undermine that person’s subsequent in-
terest in that activity when the reward is dis-
continued—a phenomenon typically explained 
by means of an attributional analysis (Bem, 
1972; Nisbett & Valins, 1971) and termed 
the overjustification effect (Lepper, Greene, 
& Nisbett, 1973). That is, if the justification 
for having undertaken an interesting task is 
perceived to be an overly sufficient reason for 
task engagement (e.g., an extrinsic incen- 
tive), the person would then infer that be- 
havior was motivated by the reward itself 
rather than by enjoyment, thereby decreasing
interest in the target task when the reward is
no longer available. A substantial number of
studies have shown the reliability of this effect
across both types of reward and population
(cf. Condry, 1977; Deci, 1975; Lepper &
Greene, 1978; Ross, 1976). 

The overjustification effect, however, is not 
assumed to be an inevitable result of reward 
attainment. Based on theories postulating that 
intrinsic interest stems from perceptions of 
competence and self-determination (Deci, 
1975; White, 1959), several investigators 
have hypothesized that rewards conveying in- 
formation regarding competence should sus- 
tain rather than undermine subsequent in- 
trinsic interest (Arkes, 1978; Lepper & 
Greene, 1978). Specifically, it has been argued 
that level of intrinsic interest should vary 
directly with information regarding compe- 
tence or incompetence that is conveyed by 
means of reward systems. 

A few lines of research have provided some 
support for the competency hypothesis. Re-
cent research has demonstrated that a reward
made contingent on meeting a performance
standard produces more interest than a re-
ward made contingent on task engagement
alone, as shown in both behavioral (Karniol 
& Ross, 1977) and attitudinal measures 
(Enzle & Ross, 1978). These results were in- 
terpreted as suggesting that rewards made con- 
tingent on successful performance provide cue 
value regarding competency at the task, 
thereby maintaining intrinsic interest. Sim- 
ilarly, research using a social reward paradigm 
has shown that task-related praise sustains 
or even enhances intrinsic interest (Ander- 
son, Manoogian, & Reznick, 1976; Deci, 
1975; Pittman, Davey, Alafat, Wetherill, & 
Wirsul, in press; Swann & Pittman, 1977). 
This body of research, then, has provided in- 
direct support for the competency hypothesis 
by demonstrating that intrinsic interest is not 
undermined when reward is conveyed by 
means of verbal feedback or is made con-
tingent on successful performance.

The primary purpose of the present study
was to test the competency hypothesis by 
manipulating the availability of direct infor- 
mation about competence. That is, although 
intrinsic interest should be maintained when 
reward attainment is made contingent on 
an absolute performance standard, the com- 
petency hypothesis would predict that if more 
direct information about competence or in- 
competence is additionally provided (in this 
case, performance level relative to others), 
subsequent level of interest should vary di-
rectly with the type of information regarding
competence that is presented. 

The competency hypothesis was also as- 
sessed by means of a developmental analysis 
of the effect of different kinds of competency 
information on intrinsic interest. According 
to Veroff’s (1969) developmental theory of 
achievement motivation, it is not until 7-8
years of age that children derive feelings of 
competence from comparative standards of 
excellence; however, even very young children 
(e.g., preschoolers) are assumed to make 
evaluative judgments regarding their level of 
competence based on absolute performance 
standards. Evidence supporting these propo- 
sitions is provided in several recent studies 
(e.g., Ruble, Parsons, & Ross, 1976; Ruble,
Boggiano, Feldman, & Loebl, in press). These
developmental findings suggest an alternative
explanation for the results of one recent study
where rewards made contingent on a com-
parative standard were not found to sustain
subsequent interest in nursery school children 
(Greene & Lepper, 1974). Specifically, the 
findings described above would suggest that 
this failure to support the competency hy- 
pothesis may have been because young chil- 
dren do not use comparative information in 
making self-evaluations of competence. Thus, 
if the findings of the present study supported 
the predictions that competency information 
conveyed in relative terms mitigates the 
undermining effect of reward for older but 
not for younger children, additional support 
for the competency hypothesis would then be 
obtained. 

In the present study, children at two age 
levels engaged in a task that pilot testing re- 
vealed to be highly interesting for both 
groups. Upon completion of the target task, 
children were provided with competence in- 
formation in terms of relative and/or absolute 
level of performance, which was predicted to 
avert the undermining effect of reward on 
intrinsic interest. However, the two types 
of information concerning competence (ab- 
solute vs. relative) were expected to affect the 
children differentially at the two age levels in 
the manner described above. 

To test these hypotheses, children in the 
experimental conditions anticipated a reward 
made contingent either upon task engage- 
ment alone or upon meeting a performance 
standard (i.e., competence information based
on absolute performance level). In addition,
the children were either provided with infor-
mation regarding their performance relative
to peers (i.e., competence/incompetence in-
formation conveyed directly in terms of so-
cial comparison) or were not provided with
this information. Finally, a no-information-
regarding-competence/no-reward group was
included as a baseline control. 


Method 


Subjects 


Subjects were 147 children of predominantly mid-
dle-class backgrounds drawn from two private nur-
sery schools and two public elementary schools in
central New Jersey. Three children were excluded
from the analyses for reasons explained below. The 
nursery school children ranged in age from 3-5 to 
5-8, with a mean age of 4-6, whereas the mean age 
of the elementary school children from grades 3 
through 5 was 9-11. An attempt was made to dis- 
tribute subjects equally by sex across each cell. Since 
preliminary analyses revealed no interaction of sex 
or experimenter with experimental conditions on any 
measure, the data were collapsed across these vari- 
ables for purposes of analyses. 


Procedure 


The children were escorted individually into a 
mobile trailer parked near their school. After one of 
four female experimenters (three of whom were 
blind to the hypothesis) demonstrated to the chil- 
dren the different activities that she used with chil- 
dren from different schools at various times, she 
then told the children that they would have the 
opportunity to play the “find the hidden pictures” 
game. 

The hidden picture game consisted of a stack of 
eight embedded figure tasks with five to eight em- 
bedded figures on each of the eight papers. The 
children were told that the object of the game was 
to find and circle the one particular thing on each
of the eight papers which they would be instructed 
to circle after they turned on a tape recorder. They 
were also told that they should work as quickly as 
possible since they would have to proceed to the 
next picture when the person on the recorder in- 
structed them to turn the page. Subjects were told, 
however, that they could find and circle other 
hidden pictures on each of the pages if they wanted 
to and had sufficient time. The children were given 
two practice trials on an additional embedded-figure 
task to ensure comprehension of the task and were 
also given the opportunity to practice using the 
recorder to facilitate recognition of the different 
switches (the “on” button was painted green and the 
“off” button was painted red). The colors and game 
context of the task (i.e. finding and circling one hid- 
den picture on each page) were all designed to en- 
hance the interest value of the task. Pilot testing 
did, in fact, reveal that this task was highly inter- 
esting for the two age groups. The interest value is 
also indicated by the high percentage of time the
subjects in the baseline control condition played with
the task.

After the practice session, subjects were shown the 
scoreboard that displayed numbers ranging from 0 to 
8. Children in the performance-contingent reward 
conditions were told that they could earn two 
Hershey’s Kisses only if they obtained a score of at 
least 3 out of a possible 8. Stars were painted above 
the numbers 3 through 8 to ensure comprehension 
of the absolute performance criterion, which they 
were told meant that they did “okay” at the game.
Children in the task-contingent reward conditions
were given the same information as children in the
performance-contingent reward conditions; however,
they were instructed that they would be given two
Hershey’s Kisses for simply playing the game,
regardless of their score.

The scoreboard for children of both age groups 
in the social-comparison information conditions dis- 
played the scores of same-age others: children in the 
relative incompetency condition were shown a score- 
board indicating that most of the other children had 
obtained a score of 7 out of the maximum 8, whereas 
some of the others had scored 6 or 8; children in the 
relative competency condition were shown a score- 
board indicating that most of the others had ob- 
tained a score of 1 out of the maximum 8, while 
some of the others had scored 0 or 2. Children in the
control condition were simply told that when they 
completed the task, they could put their score under 
the number on the board indicating on how many 
pages they found the particular figure they had been 
instructed to circle. 

After the experimenter was confident that the chil- 
dren understood all aspects of the task and after the 
children had been given the choice of whether or not 
to play the game, they turned on the recorder and 
began the task. (One fourth-grade boy in the control 
condition stated that he preferred not to play the 
target game, and his data were excluded from the 
analyses.) To make performance level comparable 
across age groups, the actual difficulty of the task 
increased with grade level, so that all children ob- 
tained a score of 4 (excluding one third-grade and 
one fourth-grade boy, whose data were excluded from 
the analyses). After tabulating their scores and
indicating to the children the hidden pictures on each of
the pages, all children except those in the control
group received the reward and put their scores on
the appropriate scoreboard.

The experimenter then told the children that she
had another task for them to do but needed some
time to prepare the task. The children were told that
they could play with any of the games that were on
the table during the 5-10 minute waiting period.
The alternative games, which had been pretested for
their interest value, were: marbles, crayons with a
game that involved following the lines to complete
a picture, two mazes, and an additional set of hidden
pictures. Although the same tasks were used for the
two age groups, the difficulty level of the tasks was
matched to age level. After the experimenter placed
the tape in the recorder that corresponded to the
additional set of embedded figures tasks, she stressed
that the children should feel free to play with any
of the toys on the table and made the children aware
that they would not be given any further candy. The
experimenter then went to the back room,
unobtrusively watched through a one-way mirror, and
recorded the amount of time during a 6-minute
interval that each of the children spent with the target
activity.


Results


Following the procedure of previous research
(Greene & Lepper, 1974; Lepper et al., 1973),
a log transformation, Y′ = log (Y + 1), on the
proportion of time subjects spent with the
target task was performed to produce
homogeneity of variance (Winer, 1971,
p. 400). The major hypotheses of the study
were examined by a 2 (Nursery/Elementary
School Children) × 2 (Performance-Contingent
Reward/Task-Contingent Reward) × 3
(Relative Competence/Relative Incompetence/No
Information Regarding Competence)
analysis of variance on the data. These
means are presented in Table 1.


Table 1
Transformed Percent of Time Spent Playing With Target Task

                                  Social comparison condition

                          Others better   Others worse
                          (indicating     (indicating     No
Reward group              incompetence)   competence)     information

Younger children
  Performance contingent  .33 (.44)       .29 (.41)       [illegible] (.62)
  Task contingent         .17 (.23)       .20 (.27)       .19 ([illegible])
  Control                                                 .52 (.73)
Older children
  Performance contingent  .21 (.28)       .49 (.67)       .39 (.55)
  Task contingent         .25 (.32)       .50 (.68)       .21 (.29)
  Control                                                 .49 (.68)

Note. Raw percentiles appear in parentheses.

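The transformation described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names and the sample proportions are hypothetical, and the use of the natural logarithm is an assumption, since the text does not state the base. (The natural log is at least consistent with Table 1, where each transformed mean lies slightly below the natural log of one plus the raw mean, as would be expected when scores are transformed before averaging.)

```python
import math


def log_transform(proportion):
    """Apply log(Y + 1) to a proportion of time; natural log is assumed."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.log(proportion + 1.0)


def mean(values):
    return sum(values) / len(values)


# Hypothetical per-child proportions of the 6-minute interval spent on the
# target task; these numbers are invented for illustration only.
hypothetical_proportions = [0.10, 0.55, 0.95, 1.00]

transformed = [log_transform(p) for p in hypothetical_proportions]
mean_of_transformed = mean(transformed)
transform_of_mean = log_transform(mean(hypothetical_proportions))

print(round(mean_of_transformed, 3), round(transform_of_mean, 3))
```

Because the logarithm is concave, the mean of the transformed scores can never exceed the transform of the mean raw score, so the two columns of a table like Table 1 should not be expected to convert into one another exactly.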
The results concerning the undermining
effect of reward were to be revealed
in two comparisons. First, it was predicted
that subsequent interest in the task would be
greater when the reward was made contingent
on meeting a performance standard than when
reward was made contingent on mere task
engagement. As expected, the main effect of
contingency was significant, F(1, 111) =
[illegible], p < .05. However, this effect was
primarily due to the data of the younger
children, as described below in the
within-grade analyses and as shown in Table 1.
Additional analyses [illegible] between the
combined control groups and the contingency
conditions revealed that interest was
undermined [illegible] conditions, as predicted,
[test statistic illegible]; t(130) = 1.16, ns,
for task-contingent and performance-contingent
groups, respectively.

The second major prediction [illegible]
competency information. As expected, a
significant Grade × Social Comparison
interaction, F(2, 111) = 3.28, p < .05, showed
that relative competency information affected
the subsequent interest of the older but not
the younger children. Within-grade analyses
were performed to examine further the
pattern of data obtained. The analyses indicated
that for the younger age group, reward made
contingent on task engagement produced less
interest in the task than performance-contingent
reward, as expected, F(1, 53) = 4.45,
p < .05. Contrasts between the control and
the combined contingency conditions
indicated that only task-contingent reward
undermined interest for the nursery-school-aged
children, t(130) = 2.67, p < .01; t(130) =
1.31, ns, task- and performance-contingent
reward groups, respectively. Moreover,
comparative information had no effect on their
later interest in the task, as expected,
F(2, 53) < 1.
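The 130 degrees of freedom attached to the t contrasts can be checked against the sample description. The arithmetic below is a reader's reconstruction, not something the article states: it assumes the contrasts used an error term pooled over all cells of the full design (two age levels, seven conditions each) and that the three excluded children were removed from the 147 recruited. The constant and function names are hypothetical.

```python
# Figures taken from the Method section.
TOTAL_RECRUITED = 147
EXCLUDED = 3

# Assumed layout: 2 age levels, each with 6 experimental cells plus 1 control.
AGE_LEVELS = 2
CONDITIONS_PER_AGE_LEVEL = 7


def pooled_error_df(n_subjects, n_cells):
    """Error degrees of freedom when variance is pooled over all cells."""
    return n_subjects - n_cells


n_analyzed = TOTAL_RECRUITED - EXCLUDED
n_cells = AGE_LEVELS * CONDITIONS_PER_AGE_LEVEL
error_df = pooled_error_df(n_analyzed, n_cells)

print(n_analyzed, n_cells, error_df)
```

Under these assumptions the result is 144 - 14 = 130, which agrees with the t(130) values reported in the text.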

The data pattern for the older children
differed sharply from that of the younger age
group. For this group, intrinsic interest
differed as a function of the comparative
information in the predicted direction, F(2, 57) =
5.61, p < .01. Children given information
about their comparatively excellent
performance showed more interest in the target task
than those who thought they performed worse
than others, t(57) = 3.33, p < .01, or than
children who had been given no information
regarding performance relative to others,
t(57) = 2.33, p < .05. However, contingency
of reward did not independently affect level
of interest or interact with the comparative
information, both Fs < 1. A specific test of
the overjustification hypothesis by comparing
task-contingent reward with the control did
produce the expected undermining effect of
reward on interest for this age group only
when no information regarding competency
was provided, t(130) = 2.17, p < .05.
Furthermore, information regarding relative
incompetence was found to decrease interest
relative to the control, as predicted, t(130) =
2.13, p < .05. Intrinsic interest was sustained,
as can be seen in Table 1, only when subjects
were given information that they were
relatively competent or that they had met an
absolute standard of competency when no
relative information was provided.

Since advocates of competing response
theory have argued that performance
differences resulting from distraction in
anticipation of reward can explain overjustification
studies (e.g., Reiss & Sushinsky, 1975, 1976),
analyses were performed to test the viability
of this alternative explanation for the present
data. A 2 (Grade) × 7 (Condition) analysis
of variance on the children’s overall level of
performance did not show a main effect of
condition or grade nor a significant
interaction, all Fs < 1. The means for the
performance level of children as a function of grade
and condition can be seen in Table 2.
Within-cell correlations between performance level
and time spent at the target task were not
significant. Thus, the competing response
hypothesis appears unable to provide a
satisfactory account of the present data, since no
differences in level of performance were found.


Table 2
Performance as a Function of Age Level and Condition

                                  Social comparison condition

                          Others better   Others worse
                          (indicating     (indicating     No
Reward group              incompetence)   competence)     information

Younger children
  Performance contingent  10.9            [illegible]     [illegible]
  Task contingent         12.2            [illegible]     [illegible]
  Control                                                 [illegible]
Older children
  Performance contingent  12.0            11.6            [illegible]
  Task contingent         10.9            8.6             [illegible]
  Control                                                 [illegible]


Discussion 


The results of the present study provide 
support for the hypothesized effect of different 
types of competency information on intrinsic 
interest. The current experiment demonstrated 
that reward based on successfully meeting an 
absolute performance standard did not
undermine interest, thereby replicating previous
research (Karniol & Ross, 1977). However,
when more direct information regarding
competency was presented by means of social
comparison, the pattern of results obtained
depended on developmental level.

Consistent with previous developmental 
findings (e.g., Veroff, 1969; Ruble et al., 
in press) and with present predictions,
competence information sustained the intrinsic
interest of the preschool children when it was
presented in terms of an absolute but not a
relative standard. In contrast, although the
older children were not impervious to the
absolute standard indicating competence, the
differential social comparison information
superseded the reward contingency in its
effect on subsequent interest for this age
group. Specifically, task-contingent reward
decreased interest relative to the control only
when no information about relative
competency was provided or when the comparative
information indicated relative incompetence.
Task-contingent reward did not produce
diminished interest in the older children, as it
had in the younger children, when
information could be gleaned about their comparative
excellence.

Thus, these results provide strong support
for the competency hypothesis by showing
that social comparison, providing highly
direct and unambiguous information about
level of competence, superseded reward
contingency in its effect on intrinsic interest in
older children. Furthermore, when social
comparison was not provided or was not used
(i.e., by the younger children) the
contingency of the reward upon absolute
performance level (a different kind of competence
information) had the predicted effect upon
later interest.

The developmental findings in the present
study are relevant to recent hypotheses about 
the effect of rewards on intrinsic interest in 
children. First, they do not support Harter’s 
(1978) hypothesis that the presentation of 
extrinsic incentives undermines intrinsic inter- 
est less at earlier developmental stages than 
at later stages. Second, the present data sug- 
gest important qualifications to Arkes’ (1978) 
argument that young children’s use of the 
competence but not attribution principles
(e.g., discounting) renders an attributional
explanation of overjustification less plausible
and makes the former principle the key factor
in overjustification research. While the
importance of the competence principle is
obviously not disputed here, our findings
demonstrate that young children are not affected
by certain types of competency information,
in contrast to Arkes’ suggestions. Moreover,
the present findings suggest that the type of
information that will most effectively sustain
interest in children depends on the
developmental level of the child.

The present results are also relevant to 
interpreting the findings of previous research 
examining the relationship between informa- 
tion regarding competence and the effects of 
rewards on intrinsic interest. First, it appears 
that Greene and Lepper’s (1974) failure to 
support the competency hypothesis was
probably due to their attempt to manipulate
preschoolers’ perceptions of competence by means
of social comparison information. Second, it
appears that some important aspects of the
competency hypothesis were obscured in the
study by Karniol and Ross (1977), since they
grouped children varying in age from 4 to 9
years and conveyed information about
competence by varying absolute and relative
standards simultaneously. Specifically, although
the older children’s intrinsic interest may have
been influenced by both the absolute and
relative information provided, the younger
children may have focused on the absolute
standard and may have been unaffected by the
relative information.

The mediating factor assumed to account 
for the present findings is differential percep- 
tions of competence resulting from informa- 
tion in the various competency conditions. 
Perceptions of competence were not assessed 
here, however, because they would have been 
contaminated by the behavioral measure or,
if taken before the behavior, would have
potentially confounded this measure.
Nevertheless, several lines of previous research
provide support for the mediating variable
hypothesized here. First, a number of studies
have demonstrated that information
regarding absolute and relative level of performance
affects children’s perceptions of competence
in the manner suggested here (Ruble et al.,
1976, in press). Second, a considerable
amount of research in the achievement
literature has demonstrated a relationship between
perceived competence and various
task-approach indices, including persistence (cf.
Ruble & Boggiano, in press; Weiner, 1974,
in press). Thus, both of the mediating links
necessary to support the competency
hypothesis (i.e., information → self-reported
competence, and competence → persistence) are well
demonstrated in previous related studies.

In summary, the results of the present 
study provide the strongest support to date 
for the hypothesis that information about 
competence can mitigate the undermining effect
of reward on intrinsic interest. The availability
of direct information regarding competence
presented by means of social comparison
appeared to override completely the effect of
reward on behavior. This finding raises the
intriguing issue of the link between
overjustification phenomena and the standard
achievement motivation literature, as noted by
several investigators (e.g., Deci & Porac, 1978;
Lepper & Greene, 1978). Research is presently
underway to explore the nature of this link.


The Role of Social Comparison in Choice Shifts


George R. Goethals 
Williams College 


Mark P. Zanna
University of Waterloo, Waterloo, Canada 


Research was conducted to test social comparison predictions regarding influ- 
ence processes related to risk taking in groups. It was shown that shifts toward 
risk were as likely to occur in groups where subjects exchanged information 
about their positions on the Kogan and Wallach Choice Dilemma Questionnaire
(CDQ) and information about their self-ratings of ability as in ordinary group
discussions of the CDQ items. Subjects in groups where only information about
CDQ positions was exchanged showed far fewer shifts to risk. These findings
are discussed in terms of a social comparison analysis of the social influence
processes involved in risky shifts, which assumes that comparison processes 
can be engaged fully only when comparability is established by knowledge of 
other group members’ standing on traits thought to be related to risk taking. 


In a comprehensive review of the risky shift 
literature, Dion, Baron, and Miller (1970) 
suggested that the existing research seemed to 
lend more support to Brown’s (1965) “risk as 
a value” explanation of the risky shift than 
to accounts emphasizing diffusion of
responsibility, leadership, or familiarization.
However, Dion et al. noted that the existing data
did not seem to offer strong support for the 
social comparison mechanism that was im- 
plicit in Brown’s argument. Subsequently, 
Burnstein and Vinokur (cf. Burnstein &
Vinokur, 1975, 1977; Burnstein, Vinokur, &
Trope, 1973) proposed an alternative
“persuasive arguments” version of the risk as a value
idea. This account holds that subjects
discussing the items in the Kogan and Wallach
(1964) Choice Dilemma Questionnaire
(CDQ) have available a preponderance of
arguments favoring risk, presumably because
risk is a cultural value. As a result of this
preponderance of arguments, in any
discussion about the CDQ items, subjects are likely
to encounter more good arguments favoring
risk that they had not considered before
than arguments favoring caution. These
arguments are likely to persuade subjects
and produce genuine attitude change in the
direction of increased risk.

Burnstein and Vinokur (1975) did 
acknowledge a “few embarrassing studies” 
that produced an effect constituting a “puz- 
zle for the persuasive-arguments explana- 
tion” of the risky shift. These were the studies 
showing that a shift was sometimes produced 
when subjects were simply given information
about others’ positions (e.g., Teger & Pruitt,
1967). The explanation for these studies
seemed to be social comparison, that is, that
subjects shifted when they discovered the
positions of other people, because they learned
that they themselves were not as risky as
they had imagined relative to others, and they
wanted to correct that state of affairs by
shifting. Although this effect has not been
consistently obtained,¹ it did suggest that a shift
could sometimes be produced without
discussion and therefore without opportunity for
argumentation. However, Burnstein and
Vinokur suggested that persuasion could account
for even those shifts obtained without
discussion. Their rationale was that knowledge of
other people’s positions inspires a
self-generation of reasons for those positions. It is these
reasons, not comparison pressures, that
produce the shift. Burnstein and Vinokur argued,
then, that persuasion was necessary and
sufficient to account for all risky shifts. (This
position has been disputed in reviews of the
literature that are not necessarily sympathetic
to social comparison theory, cf. Myers &
Lamm, 1976. Also, Baron & Roper, 1976,
have questioned whether Burnstein and
Vinokur’s 1975 data unequivocally support the
self-generation of arguments idea.)

¹ Pruitt (1971a, 1971b) has cited several articles that
have included one or more information exchange
conditions. It is important to note that of the 13
replications only 4 actually resulted in significant risky
shifts.

Recently, however, two new lines of re- 
search have been reported that revitalize the 
social comparison account of the risky shift. 
Blascovitch and his colleagues (Blascovitch 
& Ginsburg, 1973; Blascovitch, Ginsburg, & 
Veach, 1975) have conducted several studies 
showing shifts toward risk in groups placing 
bets in blackjack games in gambling
establishments in the state of Nevada. They
argued for a social comparison explanation
for their findings. Baron and his colleagues
(Baron & Roper, 1976; Sanders & Baron,
1977) have also proposed an account of risky
shifts involving social comparison. This view
holds that subjects want to be as risky as 
other group members or slightly more so be- 
cause risk is a value, and that there is a tend- 
ency for the group as a whole to shift toward 
risk as a result of this desire. Countervailing 
values for being sensible or not reckless pre- 
sumably place some kind of limit on this com- 
petition to be relatively risky. The most re- 
cent exposition of this approach, that of 
Sanders and Baron (1977), argues that social 
comparison plays a large but not exclusive 
role in choice shifts. They believe that both 
social comparison and persuasion operate to 
produce risky shifts and that neither is
capable of accounting for all the data. They
suggest that the two processes are not
mutually exclusive and may even be
complementary. For example, it is possible that
subjects are receptive to arguments favoring risk
that emerge in group discussion because such
arguments justify a shift toward risk that
they want to make because of social
comparison concerns. The arguments can provide a
justification for opinion changes that subjects
want to make anyway.

The purpose of the present paper is to dis- 
cuss several studies designed to explore more 
thoroughly the role of social comparison 
processes in choice shifts. The starting point 
for these investigations was Goethals and 
Darley’s (1977) recent restatement of Fest- 
inger’s (1954) social comparison theory 
within an attribution theory framework. 
Their formulation suggests that people will 
compare their opinions with others who 
should have similar opinions, given their 
standing on attributes related to and predic- 
tive of those opinions, and that people will 
feel that they have selected appropriate 
opinion positions if people who are similar
on these attributes agree. A recent experiment
by Zanna, Goethals, & Hill (1975) supports
this analysis by showing that subjects
do compare with others who are similar on 
related attributes. 

What kind of attributes do people feel are 
related to risk taking? A study by Jellison 
and Riskind (1970) suggested that ability is 
perceived as related to risk. They contended, 
in fact, that risk is a cultural value, as Brown
(1965) and others had suggested, because of
its link with ability. Risk and ability are
linked in that the probability that a person 
will succeed if he engages in a risky behavior 
is directly dependent on his ability—the more 
ability, the more probable success. Given the 
risk-ability link, social comparison theory 
implies that people will feel it is appropriate 
to take as much risk as others of equal ability 
but less risk than those who possess greater 
ability. 

In order to test this reasoning, one initial
and two follow-up experiments were
conducted in which subjects were given
information indicating that others of either extremely
high or moderate ability had recommended a
high degree of risk on CDQ items. It was
predicted that subjects would feel comparable
to others of moderate ability and would want
to move toward risk themselves when exposed
to their risky recommendations, but would
cease comparison with the superior-stimulus
persons and would not be influenced by their
risky recommendations. The first experiment
was only partially successful. In neither
experimental condition did subjects’
recommendations differ significantly from those in
a control, nor did they differ from each other.
However, manipulation check data showed
that many subjects in the high-ability
conditions perceived themselves to be equal in
ability to the stimulus persons. An internal
analysis was performed where subjects were
assigned to one of three groups depending on
whether they had indicated on the
manipulation check that they saw the stimulus persons
as having more, less, or equal ability
compared to themselves. Subjects in the group
perceiving themselves as equal to the
stimulus persons were more risky than subjects
in the control condition and more risky than
those who perceived the stimulus persons to
have more ability. This finding suggests that
being influenced toward risk in a group is
dependent on a perception of similarity and
comparability on the dimension of ability and
on social comparison tendencies being fully
engaged as a consequence.

This conclusion, however, is clearly an 
extrapolation from the data and is only
supported by an internal analysis. Consequently,
two replications were conducted in which
great emphasis was placed on embellishing
the high-ability manipulations so that it
would be impossible for the subjects to
perceive themselves as being even remotely as
talented as the stimulus persons.
Unfortunately, the results for both replications were
very similar to the results of the initial
experiment. As in the initial study, many
subjects perceived the stimulus persons in the
high-ability conditions to be equal in ability
to themselves, and in each case an internal
analysis was performed. For both follow-up
studies this analysis indicated, as before, that
subjects shifted to risk more when they
perceived the stimulus persons to be equal in
ability than when they perceived the stimulus
persons to be more able, and that only in the
equal ability group was there a significant
shift to risk.

Given the limited success of these
experiments, a new approach was devised to test
more directly the hypothesis that the extent 
to which groups will shift to risk is dependent 
on the extent to which social comparison 
tendencies are fully engaged in those groups. 
This approach grew out of the previously 
discussed finding that experimental proce- 
dures using information exchange, where sub- 
jects are given information about other group 
members’ CDQ positions but not their rea- 
sons for adopting those positions, show shifts 
of lesser magnitude than those found follow- 
ing full discussions of CDQ items. These 
data suggest that social comparison is not 
sufficient to account for large choice shifts 
following group discussion and that persua- 
sive arguments are needed fully to influence 
group members toward risk. How can these 
studies be reconciled with a social comparison 
analysis of choice shifts? The key would seem 
to be a careful consideration of what is and 
what is not exchanged in information
exchange experiments.

The three experiments reported above sug- 
gest that ability information interacts with 
information about the stimulus person’s posi- 
tions on CDQ items and that for social com- 
parison processes to operate fully in a group 
discussing risk, the group members need to 
know more than the other members’ recom- 
mendations regarding risk; they need to have 
information about others’ abilities as well. It 
seems reasonable to argue, then, that the 
traditional information exchange studies may 
not fully engage social comparison because 
they fail to give subjects ability information 
that is critical in making any comparison of 
recommendations regarding risk meaningful. 
That is, because subjects have no informa- 
tion about the group members’ abilities, their 
recommendations regarding risk have
ambiguous implications for the level of risk that
is appropriate for the subject to recommend.
Hence, we should not expect consistent shifts
in traditional information exchange condi- 
tions (see Footnote 1). 

Suppose subjects were placed in an infor- 
mation exchange situation where information 
about both position and ability is being
conveyed. If each person in the group indicates
that he believes himself to be somewhat more
able than the average person, as one might 
expect from Jellison and Riskind’s (1970) 
results, then each person in the group should 
come to feel that other group members are 
comparable to themselves on the relevant 
background characteristic, ability (i.e., they 
are all “above average”). Social comparison 
should be fully engaged, and we should ex- 
pect risky shifts of greater magnitude to oc- 
cur. 

But will shifts in the more complete infor- 
mation exchange condition be as large as those 
that result from group discussion? If the
result of group discussion, in addition to
exchange of information concerning position, is
primarily exchange of information concern- 
ing ability, then our answer would be yes. If 
relevant arguments make a difference over 
and above the reassessment due to social
comparison, however, then our answer would be
no. At any rate, the purpose of our
experiment is to give social comparison theory a
fair test; the prediction is that a more com- 
plete information exchange condition will 
produce a shift toward risk of greater magni- 
tude than a “traditional” information ex- 
change condition. Whether the shift in this 
new information exchange condition will be as 
large as that following a group discussion is, 
for us, an empirical question. 


Method


Subjects


Two replications of the experiment were conducted. 
In one replication, conducted at Rider College (Tren- 
ton, New Jersey), 38 male and 26 female summer 
students served as subjects; in the other replication, 
conducted at the University of Waterloo, 73 female
students participated. In each case, care was taken
to assure that the subjects participating in a given
group were not close acquaintances. In the Rider
replication each subject was paid for his or her par- 
ticipation; in the Waterloo replication, subjects were 
drawn from a voluntary subject pool associated with 
their introductory psychology course. 


Procedure 


Subjects were run in groups of four, except for
nine groups in the Waterloo replication, which contained
five subjects. Each group was composed of
subjects of the same sex who did not know each
other well. Four groups of subjects were run in each
experimental condition in each replication.2 The
experimenter began by explaining that the study was
concerned with decision making. He or she then 
described the nature of the choice dilemma problems, 
giving subjects a sample problem (the football cap- 
tain). Next he or she asked subjects to read, think 
about, and answer the same four choice dilemmas as 
employed in the initial experiment (items dealing 
with the prospective chemistry graduate student, the 
prospective concert pianist, the businessman con- 
sidering running for political office, and the research 
physicist). After subjects had given their initial 
responses (and the booklets had been collected), 
they were asked to fill out an anonymous background 
questionnaire that asked, “Compared to the average 
person of your age and sex, rate yourself in terms 
of your overall talent, creativity, and ability” (an- 
swered on a 7-point scale with points labeled “a 
great deal above [below] the average,” scored 7 and 
1, respectively, “moderately above [below] the aver- 
age,” “slightly above [below] the average,” and “just 
at the average,” scored 4). 

Group discussion condition. At this point the 
procedures for the three experimental conditions di- 
verged. Calibrating the time parameters employed by 
Teger and Pruitt (1967), subjects in the group dis- 
cussion condition were given 30 minutes to discuss 
the set of four choice dilemmas. New booklets con- 
taining the problems were handed out to facilitate 
these discussions. 

Information exchange of position condition. This 
condition closely resembled that conducted by Teger 
and Pruitt (1967). In addition to the new booklets 
of choice dilemmas, subjects were given a set of 
cards indicating the various odds depicted in the 
choice dilemma problems and were asked merely to 
indicate their current positions by holding up the 
appropriate card. In this condition each subject in 
turn indicated his or her current position three times 
for each problem. 

Information exchange of position and ability con- 
dition. This condition resembled the previous in- 
formation exchange condition except that subjects 
were also handed a set of cards that matched the 
alternative answers to the question for which they 
rated their own ability. Just prior to indicating their 
positions for the first time on the first choice dilemma,
subjects were asked, again in turn, to indicate
their answers to the ability question. In the Waterloo
replication subjects were asked to indicate their ability
positions again after the second of the four
choice dilemmas. No talking was allowed in either 
information exchange condition. 

Control condition. Subjects in the control condition
were given only 10 minutes to reconsider their
recommendation. No information whatsoever about
others’ responses was provided.

2 Actually, in the Rider replication, subjects in the
control condition were run in groups of size 2, 3, or
4. For purposes of analysis subjects were randomly
assigned to four groups of size 4.

In the Rider replication, subjects in all conditions 
were asked to make their final private recommenda- 
tions after the entire discussion, information ex- 
change, or reconsideration phase; in the Waterloo 
replication, final judgments were made after each 
item was processed. Finally, subjects were debriefed 
and, in the Rider replication, were paid for their 
participation. 


Results 


Preliminary analyses indicated no signif- 
icant effect of sex of subject or interaction of 
sex with experimental conditions within the 
Rider replication; hence, data were collapsed 
across this dimension within this replication. 
In addition, preliminary analyses indicated no 
significant effect of replication or interaction 
of replication with experimental conditions; 
hence, data were also collapsed across this
dimension.


Ratings of Ability 


As expected, subjects did rate themselves
as more able than “the average person of 
(their) age and sex.” The overall mean on 
the scale, which ranged from “a great deal 
below the average,” scored 1, to “a great 
deal above the average,” scored 7, was 5.36. 
Twelve subjects (or 9%) felt they were “a 
great deal above,” 51 (or 37%) felt they 
were “moderately above,” 50 (or 36%) felt 
they were “slightly above,” and only 23 (or 
17%) felt they were “just at the average.”
One subject in the sample (or 1%) felt he
was “moderately below average.” This distribution
was similar for each condition.
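
The overall mean can be recovered from the frequencies just reported. The sketch below is illustrative only: the counts and the 1-to-7 scoring are taken from the text, whereas the variable names are assumptions introduced for the example.

```python
# Self-rated ability: scale score -> number of subjects (counts from the text).
# Scale points that no subject chose (3 and 1) are entered with a count of zero.
counts = {7: 12, 6: 51, 5: 50, 4: 23, 3: 0, 2: 1, 1: 0}

n_subjects = sum(counts.values())
mean_rating = sum(score * n for score, n in counts.items()) / n_subjects

# Share of subjects at each scale point, rounded to whole percentages.
percentages = {score: round(100 * n / n_subjects) for score, n in counts.items()}

print(n_subjects)             # 137
print(round(mean_rating, 2))  # 5.36
print(percentages[6])         # 37
```

The total of 137 agrees with the 64 Rider and 73 Waterloo subjects described under Method.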


Postmanipulation Shifts 


A preliminary analysis indicated that the
variation in the magnitude of shifts was considerably
greater in the group discussion condition
whether the unit of analysis was the
group (Fmax = [illegible]) or the individual
subject (Fmax = 9.82, [illegible]).3
To best describe the data, the primary
analyses are nonparametric. [Table 1 presents]
the number of groups in
which more subjects shifted in the risky than
in the cautious direction and the number of
groups in which an equal or greater number 
shifted in the cautious direction. The overall 
chi-square analysis indicates that the treatment
effect is highly reliable, χ²(3) = 10.42,
p < .02. As can be seen in Table 1, this over- 
all effect is primarily due to the fact that a 
greater proportion of groups in the group dis- 
cussion and information exchange of position 
and ability conditions (81% combined) had 
more subjects shifting to risk than to caution; 
this percentage contrasts with that for the in- 
formation exchange of position and control 
conditions (25% combined), χ²(1) = 8.03,
p < .01; Yates’ correction for continuity was 
applied to this statistic. This difference ac- 
counts for 97.5% of the overall differences 
among the four conditions. Individual comparisons
indicated that (a) the proportion of
groups with more shifts to risk than caution 
was not reliably greater in the group discus- 
sion condition than in either the information 
exchange of position condition or the control 
condition (p = .13, in each case, by Fisher’s 
exact test) but (b) the proportion of groups 
with more shifts to risk than caution was sig- 
nificantly greater in the information exchange 
of position and ability condition than in 
either the information exchange of position 
condition or the control condition (p < .05, 
in each case, by Fisher’s exact test). Finally, 
there was no difference between the group 
discussion and information exchange of position
and ability conditions.4
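
The group-level tests reported above can be reproduced from the counts in Table 1. The following sketch is illustrative only, assuming a Pearson chi-square (with Yates' correction for the 2 x 2 comparison) and a two-sided Fisher exact test that sums all tables no more probable than the observed one. The function names `chi_square` and `fisher_two_sided` are hypothetical helpers, not part of the original analysis.

```python
from math import comb


def chi_square(table, yates=False):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)  # continuity correction (2 x 2 only)
            stat += diff ** 2 / expected
    return stat


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2 x 2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    total = comb(r1 + r2, c1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(r1, x) * comb(r2, c1 - x) / total

    p_observed = prob(a)
    lowest, highest = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: more to risk than caution / the same or more to caution.
# Columns: group discussion, position and ability, position only, control.
table_1 = [[6, 7, 2, 2], [2, 1, 6, 6]]
# Discussion + position-and-ability versus position-only + control.
combined = [[13, 4], [3, 12]]

print(round(chi_square(table_1), 2))               # 10.42
print(round(chi_square(combined, yates=True), 2))  # 8.03
print(round(fisher_two_sided(6, 2, 2, 6), 2))      # 0.13
print(fisher_two_sided(7, 1, 2, 6) < 0.05)         # True
```

The first Fisher call compares group discussion with either low-shift condition (6 of 8 groups versus 2 of 8); the second compares the position and ability condition with either low-shift condition (7 of 8 versus 2 of 8).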


3 The direct comparisons of variances between the 
group discussion and the information exchange of 
position and ability conditions were highly reliable 
in both cases, F(7, 7) = 15.28, p < .001, and F(34, 
34) = 6.82, p<.001, for the by-group and by-indi- 
vidual subject analyses, respectively. 

4 There are other possible nonparametric analyses. 
One could consider the total number of subjects who 
shifted to risk or caution or who did not change. 
As in the analysis above, a virtually identical pro- 
portion of subjects shifted to risk in the group dis- 
cussion and exchange of position and ability condi- 
tions (59% combined), and a virtually identical pro- 
portion shifted to risk in the exchange of position 
and control conditions (36% combined). These proportions
differ significantly, χ²(1) = 6.22, p < .02.
Another approach is to consider the proportion of
groups shifting to risk, using as the criterion whether
the final mean score of the subjects in a group is
more risky or more cautious than the mean of their
initial scores. The proportion of groups shifting to
risk is 75% in both the group discussion and exchange
of position and ability conditions. The proportion
of groups shifting to risk is 50% in both
the exchange of position and control conditions.
With the small number of groups, these proportions
do not differ significantly. Overall, however, the
findings from these analyses are similar to those
from the analysis above.


Table 1
Number of Groups in Which More Subjects Did or Did Not Shift to Risk Than to Caution

                                            Experimental condition

                                            Information
                                            exchange       Information
Direction in which             Group        of position    exchange
subjects in groups shifted     discussion   and ability    of position    Control

More to risk than caution          6             7              2            2
The same or more to caution        2             1              6            6


Discussion


The central hypothesis of the present study
was that there would be more shifts to risk
resulting from exchanging information on
CDQ items if subjects also exchanged information
that could establish that they were
similar on attributes related to risk taking.
This hypothesis was confirmed. It is interesting
to recall (see Footnote 1) that information
exchange experiments have yielded highly inconsistent
data. Sometimes they produce a
risky shift; sometimes not. It could be that
whether or not a shift occurs depends on the
degree to which subjects can infer important
unknowns about the other person’s position
on related attributes. If they know absolutely
nothing about the other persons, then little
comparability can be assumed and no choice
shift should follow information exchange. If,
on the other hand, subjects are familiar with
each other and similar levels of ability can
be assumed, then social comparison will be
more strongly engaged and a choice shift is
more likely to follow.

The results support the general proposition
that social comparison on a particular attribute
is only meaningful when something is
known about how you and the others should 
compare on that attribute, given your relative 
standings on related attributes. This point
can easily be seen in many instances of opin- 
ion formation and change. Whenever a person 
expresses an opinion on any issue, the mean- 
ing of that expression must be interpreted in 
terms of that person’s identity and his stand- 
ing on various related attributes. An Amer- 
ican congressman appraises the president’s 
performance. We would like to know to what
political party he belongs. A Canadian citizen 
signs a petition opposing confederation. We 
would like to know whether the person is a 
Francophone or Anglophone and in what 
province he or she lives. The point is equally 
true with ability comparison. A person has 
just run the hundred-yard dash in 11 seconds. 
How old was the person? Male or female? 
What kind of coaching, equipment, and so 
forth, was made available to the person? 
What were the track and weather conditions? 
The meaning of another’s opinion or perform- 
ance can only be calibrated by knowing the 
person’s standing on related attributes. Knowl- 
edge of environmental factors might also be 
necessary in many instances. 

How, then, do present findings relate to the 
controversy in the literature regarding the 
various explanations for the risky shift and 
other choice shifts? The present results show 
that when information about one’s self-esti- 
mate of ability and one’s recommendations 
about risk are exchanged, groups shift as 
often toward risk as they do following group 
discussion. The mechanism producing the 
shift in the information exchange of position
and ability condition cannot be persuasion.
There is no argumentation in this condition.
Even if one accepted a self-generation-of- 
arguments explanation of those studies where 
information exchange of position has pro- 
duced a risky shift, it is difficult to see how 
such an explanation could account for the 
clear difference between the two information 
exchange conditions reported here. 
While it seems clear that social comparison 
produces the shift in the information ex- 
change of position and ability condition, there 
is considerable doubt as to what is producing 
the risky shift in the group discussion condi- 
tion. This uncertainty is not new. There has 
for many years been doubt about the cause of 
the shift in that condition. It seems that the 
answer could well be social comparison. The
same kind of social comparison that is en- 
gaged in our exchange of position and ability 
condition could occur in the group discussion 
condition. Through discussion, participants
could quickly see that the others were socially 
comparable. In order to explore the possibility 
that subjects would come to view their fellow 
group members as more able (and thus more 
similar to themselves in ability) following 
group discussion, subjects in the experimental 
conditions were asked at the very end of the 
experiment to indicate how they felt about 
each person in their group “in terms of his 
(or her) overall talent, creativity, and abil- 
ity” (using same 7-point scale on which they 
had previously indicated their own ability). 
In the information exchange of position condition,
subjects in 6 (of the 8) groups, on
the average, believed they were at least a
third of a unit more able than the “average”
group member; in contrast, this was the case
in [only] 3 groups in the information exchange
of position and ability condition and in only 1
group in the group discussion condition, χ²(2)
= 6.51, p < .05. The difference between the
group discussion and the information exchange
of position conditions is reliable
at the 5% level by Fisher’s exact test.
This [difference] apparently
indicates that group discussion does
lead subjects to think [more highly of their]
fellow group members. Moreover, there was a
relationship between the perceived [ability of]
the “average” group member and risk [taking].
Of the 10 groups feeling that fellow group
members were at least one-third of a unit less 
able than they themselves, only 5 had more 
subjects shifting in the risky than in the 
cautious direction, 1 had an equal number of 
risky and cautious shifters, and 4 had more 
subjects shifting in the cautious than in the 
risky direction; of the remaining 14 groups, 
who essentially saw others as similar in abil- 
ity, 10 had more subjects shifting in the risky 
than in the cautious direction, 4 had equal 
numbers of risky and cautious shifters, and 
no group had more subjects shifting in the 
cautious direction, χ²(2) = 6.97, p < .05. (It
might be noted that the one group in the 
group discussion condition, which for some 
reason did not come to believe that fellow 
group members were basically similar in abil- 
ity, was the only group in that condition that 
had more cautious than risky shifters.) 

In arguing favorably concerning a social 
comparison analysis of the risky shift, we do 
not mean to imply that persuasion does not 
occur or that it does not occur in groups dis- 
cussing risk. The processes of persuasion and 
comparison, again, are not mutually exclusive.
Many things happen in group discussions.
That both comparison and persuasion
are occurring at the same time, in different 
amounts for different individuals, seems per- 
fectly plausible. The goal of the present re- 
search has not been to rule out persuasion. It 
has simply been to show that influence pro- 
duced by social comparison can be substantial 
but that social comparison is not fully en- 
gaged unless particular kinds of important 
information are made available to subjects, 
making the fact of comparability evident to 
them. Once comparability has been appropri- 
ately established, comparison can produce 
opinion shifts of the magnitude found in 
studies of the risky shift. In real life, discus- 
sion may be necessary to establish com- 
parability. The present study shows, however, 
that even if comparability is established by 
other means, similar opinion shifts occur.


Records collected during childhood and coded prior to knowledge of adult behavior
provided information about the childhood homes of 201 men. Thirty
years later, information about criminal behavior was collected from court 
records. Multiple regression and discriminant function analyses indicate that 
six variables describing family atmosphere during childhood—mother’s self- 
confidence, father’s deviance, parental aggressiveness, maternal affection, paren- 
tal conflict, and supervision—have an important impact on subsequent behavior. 


Despite a massive literature emphasizing 
the importance of child rearing, conscientious 
critics (e.g., Clarke & Clarke, 1976; Yarrow,
Campbell, & Burton, 1968) have raised legiti- 
mate doubts regarding the impact of parental 
behavior on personality development. Many 
of the studies that link parental behavior 
with personality development rely upon a 
single source of information for both sets of 
variables; systematic reporting biases could 
thus cause obtained relationships. Most of the 
remaining studies have depended upon con- 
current measurements, leaving doubt as to the 
direction of influence between parents’ be- 
havior and characteristics of the child. Questions
about interpreting the results of both
types of studies serve to highlight the importance
of longitudinal research.

A few researchers have gathered information
through longitudinal studies, using independent
sources for measuring child rearing
and for measuring personality. Robins (1966) 
analyzed information from clinic records 
gathered during childhood and related that 
information to data gathered when the sub- 
jects were adults. Robins pioneered assess- 
ment of long-term effects of child rearing, and 
her study raises doubts about the validity of 
retrospective reports on family socialization. 
Nevertheless, since predictor models combined 
variables describing child rearing with other 
types of variables (empirically linked with 
outcome), the research fails to provide con- 
vincing evidence that child-rearing differences 
affected adult behavior. 

Block (1971) evaluated character develop- 
ment among subjects in the Berkeley longi- 
tudinal studies. Dividing 63 subjects into five 
types and checking differences in their back- 
grounds, Block reached the conclusion: “What 
comes through, for both sexes and without ex- 
ception in viewing the various types, is an 
unequivocal relationship between the family 
atmosphere in which a child grew up and his 
later character structure” (p. 258). Although 
Block reports many statistically reliable dif- 
ferences, his analyses do not permit the reader 
to evaluate the strength of relationships be- 
tween family atmosphere and character struc- 
ture. 

In 1973-1974, Werner and Smith (1977) 
retraced 88% of the children born on Kauai 
Island in 1955. Interviews with the mothers 
provided evidence about the family environment
of subjects when they were newborn
infants, age 2, and age 10. Although com- 
bined measures tended to account for a rela- 
tively high proportion of variance in several 
problem areas, the authors did not assess spe- 
cific child-rearing models as predictors of out- 
come behavior. 

Lefkowitz, Eron, Walder, and Huesmann 
(1977) used a main effects model in stepwise 
multiple regression for their longitudinal study 
of aggression. Only two of the six variables 
that together accounted for about a quarter 
of the variance in male aggressiveness at age 
18 were related to child rearing at age 8. 
Since the model included both redundant mea- 
sures of child rearing and heterogeneous vari- 
ables (e.g., child’s preference for girls’ games, 
parents’ religiosity, and ethnicity of family), 
effects of differences in child rearing may have 
been masked by collinearity (Blalock, 1963; 
Gordon, 1968; Mosteller & Tukey, 1977). 

The paucity of evidence to support a view 
that child rearing affects personality has led 
some authors (e.g., Clinard, 1974; Jessor & 
Jessor, 1977) to the conclusion that home 
atmosphere during childhood has a negligible 
effect upon personality development. Such 
authors present the challenge to which the 
present research is addressed: if parental be- 
havior has an important impact upon per- 
sonality development, differences in child rear- 
ing ought to contribute to variations in
subsequent behavior.


Method 


Subjects for this study were selected from a treat- 
ment program designed to prevent delinquency. The 
youths ranged in age, at the time of their introduc- 
tion to the program, from 5 to 13 (M = 10.5, SD =
1.6). [...]

[...] each visit,
the counselor recorded observations [...] the family
as well as the child.1

Case records from the [...]
friends, neighbors, and teachers as [...] the
boys. Counselor turnover (a potential [...]
a treatment perspective) produced a [...]
research: most of the families were visited by more
than one counselor.

To justify treatment of family backgrounds as 
independent units for analyses, only one subject from 
a family was included. Boys not reared by their 
natural mothers were also excluded. After eliminating 
brothers (n = 21) and those not reared by their
natural mothers (n = 36), 201 cases remained for
analysis.2

In 1957, coders read each case thoroughly in order 
to form judgments about the home and family inter- 
action. These coders had no access to information 
about the subjects other than that contained in treat- 
ment records. A 10% random sample of the records 
was read independently by a second coder to yield an 
estimate of the reliability of the coding.3 Variables
from the coded case records were used in the present 
study. 

The mother's attitude toward her son had been 
classified as actively affectionate (if there had been 
considerable interaction, without continual criticism, 
between mother and child, n = 95), passively affectionate
(if there had been little interaction between
mother and son, though the mother had shown concern
for her child's welfare, n = 51), ambivalent or
passively rejecting (if there had been marked alternations
in the mother's attitude toward her son so that
she had seemed sometimes to be actively affectionate
and sometimes rejecting, or if the mother had seemed
unconcerned about the child's welfare, n = 43), or
actively rejecting (if the mother had appeared to be
constantly critical of the boy, n = 11). Independent
reading of 25 cases resulted in the same ratings for
[...].

Two ratings from the original codes were combined 
to evaluate effects of supervision. One described
whether the child's activities outside of school were
governed by an adult. This scale was divided to indicate
whether supervision was generally present, occasionally
present, or absent. Independent coding
yielded identical ratings for 84% of the 25 randomly
[selected] cases. The second rating described parental
expectations regarding the boys’ activities. Coders 
were instructed to rate expectations as “high” if the
child was given responsibility for care of his younger
siblings, for preparation of meals, for contributing to
the financial support of the family, or for doing
“extremely well” in school. Independent ratings
yielded agreement for 76% of 25 cases. The scales
from the 1957 codes were combined to classify subjects
into one of four categories: supervision generally
present and high expectations for the child to
accept responsibilities (n = 40), supervision generally
present without evidence that high expectations were
placed on the child (n = 78), occasional supervision
(n = 60), and supervision absent (n = 23).4

1 The project included a matched control group.
Since records on family life, for the control group,
were limited to information gathered during the
intake interviews supplemented by information from
[...] program.)

2 The criteria are not mutually exclusive. Five men
were eliminated through both of the selection criteria.

3 See McCord and McCord (1960) for a complete
description of the coding.

A rating of parental conflict was based on coun- 
selors’ reports of disagreements between the parents. 
Raters were instructed to look for conflicts about the 
child and conflicts about values, about money, about 
alcohol, and about religion. Parental conflict was 
coded into one of four categories: no indication, 
apparently none, some, or considerable. For the 
present research, cases were divided into those whose 
parents evidenced considerable conflict (n= 68) and 
those coded in alternative categories. Independent 
readers agreed, for this division, on 80% of the cases 
checked for reliability. 

Three measures from the 1957 codes were com- 
bined to identify aggressive parents. Coders classified 
the aggressiveness of each parent by looking for 
evidence that the parent “used little restraint” when 
angry. Case records included reports on parents who 
threw things (e.g., one father threw a refrigerator
down the stairs in the midst of an argument with his 
wife), hit people, broke windows, and shouted 
abuses. Independent coders agreed on 84% of the 
fathers and 92% of the mothers in classifying parents 
as aggressive. The coders described paternal disci- 
pline; the category “consistently punitive” identified 
fathers who regularly used physical force (e.g., beating
a child) or very harsh verbal abuse. Independent
coders agreed on 92% of the cases for ratings regard- 
ing this classification. If a parent was coded as 
aggressive or the father was coded as consistently 
punitive, the child was classified as having an aggressive
parent (n = 75).5

The 1957 codes included a measure of the mother’s
self-confidence. A rating as self-confident was assigned
if that mother showed signs of believing in [her
own] abilities (n = 55). Other possibilities for this
rating were “no indication of general attitude; evidence
that mother saw herself as a victim or [pawn]
in a world about which she could do nothing; [or]
neutral, that is, generally seemed merely [to accept]
things as they came.” For this variable, independent
raters agreed in classifying 84% of the 25 cases
[...].

[...] as alcoholic if the [...]
that he had lost jobs because of [...]
[mar]ital problems were attributed
primarily to his excessive drinking, [...]
had repeatedly pointed to the [...]
grounds for family problems, [...]
received treatment specifically for alcoholism. Independent
coders agreed for 96% of the ratings on this
variable. In 1948, after termination of the treatment
program, criminal records on the family members of 
subjects were collected; these records were locked in 
a file separate from the case histories. In 1975, after 
names had been replaced by numerical identifiers, an 
assistant unfamiliar with the case records coded these 
criminal records. For the present study, a father was 
considered “deviant” if the case record indicated that 
he was an alcoholic, if the criminal record showed 
that he had been convicted at least three times for 
drunkenness, or if his criminal record showed that he 
had been convicted for a serious crime (i.e., theft,
burglary, assault, rape, attempted murder, or murder).
These criteria led to identification of 86 fathers
as deviant.6

Case records included information about family 
structure. A father was considered “absent” if his 
residence was not with the subject’s mother. Inde- 
pendent coding of 25 cases yielded agreement on 96% 
regarding whether or not the boy was living with 
both natural parents.7 The 71 boys having absent
fathers ranged in age, at the time when the loss occurred,
from birth to 16 (M = 7.01, SD = 5.03). The
father-absent subjects were subclassified to identify 
those whose natural fathers had been present during 
their first 5 years (n= 48) and those for whom the 
absence had occurred prior to the age of 5 (n = 23). 

These seven variables (mother’s affection, super- 
vision, parental conflict, parental aggression, mother’s 
self-confidence, father’s deviance, and paternal absence)
were used to depict the home atmosphere of
subjects during childhood. The first three are regarded 
as directly related to child rearing. Relationships 
among these measures are shown in Table 1. 

Subjects had been selected from congested urban 
neighborhoods. Nevertheless, differences in social
status could contribute to subsequent differences in 
behavior. Two measures of social status were avail- 
able. The case records supplied information about the 
father’s occupation. Coders classified these occupa- 
tions as white-collar (9.6%), skilled tradesmen 
(32.8%), or unskilled workers (57.6%). The reliabil- 
ity check yielded agreement on 96% of the ratings. 
A second measure of social status was provided by a 
rating of the neighborhoods in which the boys were 
raised. These ratings had been made, in 1938 and
1939, as part of the selection procedures. The ratings 
took into account delinquency rates, availability of 
recreational facilities, and proximity to bars, rail- 
roads, and junkyards. These ratings were coded on 
a 4-point scale from better to worst neighborhoods.
The two measures tended to covary, Cramer's V(6)
= .218, p = .0044.
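
Cramer's V rescales a chi-square statistic to lie between 0 and 1. As a minimal sketch, assuming the usual definition V = sqrt(chi-square / (n(k - 1))), with k the smaller of the number of rows and columns, the statistic can be computed as follows. The cell counts are hypothetical (the cross-tabulation of occupation by neighborhood is not reported here), and `cramers_v` is an illustrative helper name. The 6 in V(6) is consistent with the degrees of freedom of a 3 x 4 table, (3 - 1)(4 - 1) = 6.

```python
from math import sqrt


def cramers_v(table):
    """Cramer's V for an r x c table of counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))
    return sqrt(chi2 / (n * (k - 1)))


# HYPOTHETICAL counts, not the study's data:
# rows = father's occupation (white-collar, skilled, unskilled),
# columns = neighborhood rating on the 4-point scale.
hypothetical = [
    [8, 6, 3, 2],
    [20, 18, 15, 12],
    [20, 28, 32, 35],
]
degrees_of_freedom = (len(hypothetical) - 1) * (len(hypothetical[0]) - 1)

print(degrees_of_freedom)  # 6
print(round(cramers_v(hypothetical), 3))
```

A table with perfect association, such as [[10, 0], [0, 10]], gives V = 1, and a table whose rows are proportional gives V = 0.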


4 Only nine boys exposed to high expectations had
not been rated as generally supervised.

5 Fifty of the subjects were classified as having
aggressive parents by the direct description of
parental aggression.

6 Forty-nine had been convicted for serious crimes.

7 Of the 86 deviant fathers, 40 were also absent
fathers.


Table 1
Relationships Among Variables Describing Home Atmosphere (Cramer's V)

                                       Parent     Parent       Mother's          Father's      Father's
Variable                  Supervision  conflict   aggression   self-confidence   deviance      absence

Mother's affection          .241***     .209*      .184*        .199*             .209*         [illegible]
Supervision                             .267**     .106         .308***           .230*         .187
Parent conflict                                    .188**       .109              .381***       [illegible]
Parent aggression                                               .289***           [illegible]   [illegible]
Mother's self-confidence                                                          [illegible]   [illegible]
Father's deviance                                                                               [illegible]

* p < .05. ** p < .01. *** p < .001.

Between 1975 and 1978, the subjects were retraced.
Among the 201 men included in the study, 153
(76%) were alive and in Massachusetts at least until
the age of 40;⁸ 16 (8%) had died prior to their 40th
birthdays; 29 (14%) had migrated from Massachusetts;
and 3 (1%) remained to be found.

During 1975 and 1976, the names (and pseudonyms)
of all the men who had been in the program
were checked through court records in Massachusetts.⁹
These criminal records were traced and coded
by different people from those who coded other
records. Coders of the criminal records (and those
who traced them) had no access to other information
about the subjects. The court records showed the
dates of court appearances and the crimes for which
the subjects had been convicted. They were coded
to show the type of crime and the age of the person
when he was convicted. Convictions for serious
property crimes (larceny, auto theft, breaking and
entering, arson) and serious personal crimes (assault,
attempted rape, rape, attempted murder, kidnapping,
and murder) were used as dependent measures for
this study.

Among the 201 men, 71 had been convicted for at
least one serious crime; 53 had been convicted for
property crimes and 34 for personal crimes (including
15 convicted for both types). Their ages when
first convicted ranged from 8 to 38, with a mean of
18.7 (SD = 8.7) and a median of 20. Those convicted
prior to their 18th birthdays were classified as
juvenile delinquents (n = 43); those convicted after
reaching the age of 18 were classified as adult
criminals (n = 48, including 20 who had been juvenile
delinquents).
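
The juvenile and adult groups overlap, so their sizes reconcile with the 71 convicted men only through inclusion-exclusion. The short sketch below checks that arithmetic with the counts reported above; the function and variable names are illustrative and do not come from the study.

```python
# Counts taken from the text; names are illustrative only.
juvenile = 43   # first convicted before the 18th birthday
adult = 48      # convicted at age 18 or later
both = 20       # adult criminals who had also been juvenile delinquents


def union_size(a: int, b: int, overlap: int) -> int:
    """Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|."""
    return a + b - overlap


ever_convicted = union_size(juvenile, adult, both)
print(ever_convicted)  # 71
```

The result matches the 71 men reported as convicted for at least one serious crime.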


[…] analyses (General Linear […]
Goodnight, Sall, & Helwig, […] were used to ascertain
the contribution of child rearing to the variance
in number of serious property and personal crimes.
To test the degree to […] subsequent
behavior, the six central variables describing
home atmosphere were used in discriminant function
analyses to identify criminals.

As a more stringent test of the contribution of
home atmosphere to subsequent crime, the discriminant
function analyses were also used to predict
criminals among the subsample whose criminal records
provided the most complete histories of convictions:
those men living in Massachusetts at least
until the age of 40. If this function identified
criminality more accurately for the total group than for
those living in Massachusetts, there would be grounds
for suspecting an interaction effect between home
background and unmeasured variables. If this function
identified criminality at least as accurately for
those alive in Massachusetts at the age of 40, there
would be additional support for a conclusion that
home atmosphere during childhood contributes to
criminality.


Results 


As a first step toward learning whether
parental behavior contributes to subsequent
differences in criminality, the seven scales de-
scribing home atmosphere and the two scales
describing social status were individually
analyzed for their contributions to the vari-
ance in number of serious crimes against
property and persons. (See Table 2.)

With the exception of father’s absence, each
of the scales describing home atmosphere ac-
counted for a statistically significant (p <
.05) proportion of the variance in number of
crimes against property, persons, or both.¹⁰


8 Among them, 147 were in Massachusetts through
their 45th birthdays.

9 These records were supplemented by court rec-
ords from the states of New York, Maine, Michigan,
Nebraska, and Florida, where some of the men had
resided.

10 The Duncan multiple range test, modified for
unequal groups (Kramer, 1956), indicated that boys
without supervision, reliably (p < .05) more than
boys in the other three categories, were convicted

Table 2
Relationships Between Variables Describing Home Background and Crimes

                                     Property crimes           Personal crimes
                           DF        R²     F      Pr > F      R²     F      Pr > F
Mother's affection         3, 196    .092    6.60  .0003       .029   1.92   .1261
Supervision                3, 197    .152   11.77  .0001       .071   5.04   .0023
Parent conflict            1, 199    .008    1.76  .1866       .035   7.14   .0081
Parent aggression          1, 199    .012    2.38  .1242       .036   7.36   .0073
Mother's self-confidence   1, 199    .022    4.50  .0350       .024   4.98   .0268
Father's deviance          1, 199    .024    4.81  .0295       .000   0.00   .9569
Father's absence           2, 198    .005    0.47  .6286       .026   2.67   .0715
Neighborhood               3, 197    .028    1.87  .1335       .016   1.10   .3513
Father's occupation        2, 195    .008    0.74  .4798       .012   1.15   .3181


Neither of the measures of social status was 
significantly related to these types of crimes. 

Supervision and mother’s self-confidence
were related to both crimes against property
and crimes against persons; mother’s affection
and father’s deviance were related to property
crimes (though not to personal crimes); conflict
and parental aggression were related to
personal crimes (though not to property
crimes). The boys who lacked maternal affection,
who lacked supervision, whose mothers
lacked self-confidence, and whose fathers were
deviant were more often subsequently convicted
for property crimes. The boys who
lacked supervision, whose mothers lacked
self-confidence, and who had been exposed to
parental conflict and to aggression were
subsequently more often convicted for personal
crimes.

The relationships to criminality of individual
variables describing home atmosphere,
though statistically significant, each accounted
for a relatively small proportion of the variance.
More important, since “criminogenic”
conditions tended to be related to one another,
these relationships could not be taken
as evidence that the differences in home
background resulted in differences in
subsequent behavior.


[…] property and persons. […] that boys […]
were most likely to be convicted for property
crimes and boys who had affectionate mothers
least likely.


To evaluate the contribution of parental 
behavior to subsequent behavior, the six cen- 
tral variables describing home atmosphere 
were divided into two sets. The first set in- 
cluded those variables that described char- 
acteristics of the parents, characteristics that 
might be viewed as antecedent to child-rear- 
ing practices: parental aggressiveness, pater- 
nal deviance, and mother’s self-confidence. 
The second set included the three variables 
that described interpersonal behavior: paren- 
tal conflict, supervision, and mother’s affec- 
tion; these were considered to be direct mea- 
sures of child rearing. The effect of this divi- 
sion was to classify families in two ways. The 
first classification took account of relation- 
ships among the variables describing the par- 
ents; the second took account of relationships 
among the variables describing child rearing. 

Sequential multiple regression models were 
used. They introduced the measure of social 
status (the interaction of father’s occupation 
and neighborhood) as the first variable. The 
regression procedure next evaluated the se- 
quential contribution to explained variance 
of parental characteristics (the interaction of 
paternal deviance, maternal self-confidence, 
and parental aggression). After controlling 
effects of both social status and parental char- 
acteristics, the procedure evaluated effects of 
child rearing (i.e., the interaction of super- 
vision, parental conflict, and the mother’s 
affection). 

Child rearing, as measured in this longi-
tudinal study, clearly accounts for a signif-

Table 3
Home Environment and Subsequent Criminality


JOAN McCORD 


Sequential contribution      DF    R²       F      Pr > F

Predicting property crimesᵃ
Social status                11    .0688    1.55   .1185
Parental characteristics      7    .0610    2.16   .0404
Child rearing                27    .2612    2.40   .0005

Predicting personal crimesᵇ
Social status                11    .0541
Parental characteristics      7    .0588
Child rearing                27    .2444    2.13   .0023

Predicting total crimesᶜ
Social status                11    .0620    1.40   .0620
Parental characteristics      7    .0691    2.45   .0209
Child rearing                27    .2601    2.39   .0005

ᵃ Model: R² = .391, F(45, 151) = 2.15, p = .0003.
ᵇ Model: R² = .3573, F(45, 151) = 1.87, p = .0028.
ᶜ Model: R² = .3912, F(45, 151) = 2.16, p = .0003.


icant proportion of the variance in subsequent 
criminality. Table 3 describes the decomposi-
tion of the regression models.

As predictors of property crimes, the model
accounts for 39.1% of the variance, F(45,
151) = 2.15, p = .0003. Parental aggression,
paternal deviance, and maternal self-confidence
account for 6.1% of the variance after
controlling social status, F(7, 151) = 2.16,
p = .0404. Parental conflict, supervision, and
maternal affection contribute significantly to
the variance after effects of social status and
parental characteristics have been controlled,
R² = .261, F(27, 151) = 2.40, p = .0003.
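
The sequential F ratios for property crimes can be recovered from the R² increments alone. The sketch below assumes that each step is tested against the residual of the full 45-df model, which is consistent with the F(·, 151) values reported here; the function name and data layout are illustrative, not taken from the original analysis.

```python
def incremental_f(delta_r2: float, df_step: int,
                  r2_full: float, df_resid: int) -> float:
    """F ratio for the variance added by one step of a sequential regression.

    Assumes the error term is the residual of the full model.
    """
    return (delta_r2 / df_step) / ((1.0 - r2_full) / df_resid)


# (label, R-squared increment, df) for property crimes, from Table 3.
steps = [
    ("Social status", 0.0688, 11),
    ("Parental characteristics", 0.0610, 7),
    ("Child rearing", 0.2612, 27),
]
r2_full = sum(delta for _, delta, _ in steps)  # .3910
df_resid = 151

for label, delta, df in steps:
    print(label, round(incremental_f(delta, df, r2_full, df_resid), 2))
```

Under that assumption the three ratios come out at 1.55, 2.16, and 2.40, in agreement with the tabled values.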

As predictors of personal crimes, the model
accounts for 35.7% of the variance, F(45,
151) = 1.87, p = .0028. The three more direct
measures of child rearing contribute significantly
to the variance after effects of social
status and parental characteristics have been
removed, R² = .244, F(27, 151) = 2.13, p =
.0023.

As predictors of the total number of serious
crimes for which the men had been convicted,
the model accounts for 39.1% of the variance,
F(45, 151) = 2.16, p = .0003. Parental
characteristics account for 6.9% of the variance
after effects of social status have been controlled,
F(7, 151) = 2.45, p = .0209. The
child-rearing variables account for 26.0% of
the variance, F(27, 151) = 2.39, p = .0005,
after removing effects of both social status and
parent characteristics.

Adding information about whether or not a
man had been reared in a home marked by
paternal absence did not reliably increase the
accuracy of any of the predictions.¹¹

Within the (relatively restricted) range of
social class represented in the study, the
contribution of social status to the variance in
crimes was not statistically reliable. On the
other hand, both parental characteristics and
child-rearing practices were reliably related
to the number of crimes for which the subjects
had been convicted.¹²

Approximately a third of the 200 men coded 
on all six variables describing home atmo- 
sphere had been convicted for at least one 
serious crime. A discriminant function based 
on the variables describing home atmosphere 
for these 200 men correctly identified 147 


11 R² was increased […] .007 for personal crimes.

12 Without controlling social status, parental
characteristics and child-rearing variables accounted
for 36.7% of the variance in property crimes, F(34,
162) = 2.76, p < .0001, 30.8% of the variance in
personal crimes, F(34, 162) […], and […]
of the variance […],
F(34, 162) = 2.71, p < .0001.


Table 4
Results of Discriminant Function Analyses

                             Correct as     Correct as      Overall        % improve-
Dependent and                criminals      noncriminals    accuracy       ment over
independent variables        n     %        n     %         n     %        chance       p

All subjects
  Ever criminal
    Home atmosphere          71    67.6     129   76.7      200   73.5     19.3         .0001
  Adult criminal
    Home atmosphere          48    56.3     152   87.5      200   80.0     16.5         .0001

Men living in Massachusetts through the age of 40
  Ever criminal
    Home atmosphere          60    81.7      92   70.7      152   75.0     22.8         .0001
  Adult criminal
    Home atmosphere          42    71.4     110   84.6      152   80.9     20.9         .0001
    Juvenile delinquency
      record                 42    45.2     110   83.6      152   73.0     13.0         .0011

(73.5%) as criminals or noncriminals; random
predictions based on prior probabilities
would be expected to identify only 54.2%,
z = 5.48, p < .0001. The function
based on parental aggression, maternal
self-confidence, paternal deviance, supervision,
maternal affection, and parental conflict
correctly identified 76.7% of the noncriminals
and 67.6% of the criminals. (See Table 4.)

[…] men had […] 133 (87.5%) of the 152
[…] convictions as adults. Predictions
based on descriptions of home atmosphere
provided a 16.5% improvement over
random predictions based on prior
probabilities, z = 4.85, p < […].

[…] who had died before […]
migrated from Massachusetts, […]
records […] variables describing home
atmosphere remained for discriminant function
analyses. Among these men, the discriminant
function correctly identified 75.0%, a
slight improvement over the rate of correct
identification among the total group of men
and a 22.8% improvement over random procedures
based on prior probabilities, z = 5.63,
p < .0001. (See Table 4.) This discriminant
function correctly identified 81.7% of the 60
criminals and 70.7% of the 92 noncriminals.
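
The chance baselines quoted in this section follow from a model in which predictions of "criminal" are made in proportion to the actual base rate. The sketch below reproduces the 54.2% baseline and a z of about 5.48 for the full sample. The z formula (a one-sample test of the observed hit rate against the chance rate) is an assumption that happens to match the reported figures, not a procedure the article spells out, and the function names are illustrative.

```python
from math import sqrt


def chance_accuracy(n_criminal: int, n_total: int) -> float:
    """Expected hit rate when guesses follow the base rate p: p^2 + (1 - p)^2."""
    p = n_criminal / n_total
    return p * p + (1.0 - p) * (1.0 - p)


def z_vs_chance(observed: float, expected: float, n_total: int) -> float:
    """Assumed test: observed proportion correct against the chance proportion."""
    standard_error = sqrt(expected * (1.0 - expected) / n_total)
    return (observed - expected) / standard_error


# Full sample: 71 of 200 men convicted, 147 of 200 correctly classified.
baseline = chance_accuracy(71, 200)
z = z_vs_chance(147 / 200, baseline, 200)
print(round(baseline * 100, 1))  # 54.2
print(round(z, 2))               # 5.48
```

Applied to the Massachusetts subsample (60 criminals among 152 men), the same baseline function gives roughly 52.2%, which is the figure implied by a 22.8% improvement from 75.0%.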

A breakdown of the results shows that the
discriminant function had correctly classified
as criminal 78.1% of those convicted only for
property crimes, 78.6% of those convicted
only for personal crimes, and 92.9% of those
convicted for both property and personal
crimes.

13 The model used to estimate predictions based on
chance assumes that the number of predictions as
criminal would be proportional to the actual
distribution of criminals among subjects. Alternative
models that might be considered range from assuming
that each individual is as likely to be convicted as
[…] predictions among half the noncriminals and half the
criminals) to assuming that all or no individuals
would be convicted. Although a “rational bet” would
maximize correct predictions by predicting that all
individuals would fall into the larger class, this model
is inappropriate when the interest is in correct
identification of those in […]. An equiprob[…]
for the discriminant function analysis
based on family atmosphere resulted in correct sorting
of 68% of the men (62% of the noncriminals and
79% of the criminals) in terms of whether or not
they had been convicted for serious crimes.

Among the 60 men convicted for serious
crimes and still living in Massachusetts at the
age of 40, 18 (30.0%) had been convicted
only as juveniles, 23 (38.3%) had first been
convicted after the age of 18, and 19 (31.7%)
had been convicted both as juveniles and as
adults. Were one to predict that only and all
juvenile delinquents would be convicted as
adults, the prediction would be correct for
73.0%, an improvement over an expectation
of 60.0% from random procedures based on
prior probabilities, z = 3.27, p = .0011. This
prediction would be correct for 45.2% of the
adult criminals and for 83.6% of the men not
convicted as adults. Predictions based on
juvenile records would, of course, be right for
none of the men first convicted as adults
(54.8% of the adult criminals) and for only
51.4% of the juvenile delinquents.
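
The percentages in the preceding paragraph all derive from one 2 × 2 classification of the 152 men, in which the rule is "predict an adult conviction if and only if the man was a juvenile delinquent." The sketch below rebuilds that table from the counts given in the text; the variable names are illustrative, and the count of 92 never-convicted men is obtained as 152 minus the 60 convicted.

```python
# Counts from the text for men living in Massachusetts through age 40.
juvenile_only = 18    # convicted only as juveniles
adult_only = 23       # first convicted after age 18
both = 19             # convicted as juveniles and as adults
never = 152 - 60      # 92 men with no serious convictions

# Rule: predict "adult criminal" exactly for the juvenile delinquents.
true_pos = both             # delinquents who were convicted as adults
false_pos = juvenile_only   # delinquents with no adult conviction
false_neg = adult_only      # adult criminals with no juvenile record
true_neg = never            # correctly predicted noncriminal adults

total = true_pos + false_pos + false_neg + true_neg
accuracy = (true_pos + true_neg) / total
hit_rate_adult_criminals = true_pos / (true_pos + false_neg)
hit_rate_noncriminals = true_neg / (true_neg + false_pos)
hit_rate_delinquents = true_pos / (true_pos + false_pos)

print(round(accuracy * 100, 1))                  # 73.0
print(round(hit_rate_adult_criminals * 100, 1))  # 45.2
print(round(hit_rate_noncriminals * 100, 1))     # 83.6
print(round(hit_rate_delinquents * 100, 1))      # 51.4
```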

Among men living in Massachusetts at the 
age of 40, the discriminant function analysis 
based on home atmosphere during childhood 
correctly identified 80.9% as criminal or non- 
criminal after the age of 18. (See Table 4.) 
Use of the variables describing home atmo- 
sphere during childhood resulted in a 20.9% 
improvement over chance identification, z = 
5.26, p < .0001, and a 7.9% improvement 
over predictions based on the subjects’ juve-
nile criminal histories, z = 2.19, p = .0282.
This function correctly identified 71.4% of 
the adult criminals and 84.6% of the non- 
criminals. The discriminant function based on 
home atmosphere correctly identified as crim- 
inals 65.2% of the men who had first been 
convicted as adults. In terms of their subse- 
quent criminal records, this discriminant func- 
tion correctly sorted 78.4% of the juvenile 
delinquents and 81.7% of those who had not
been juvenile delinquents.¹⁴


Summary and Discussion 


Recent criticism of the assumption that 
child-rearing practices have an important im- 
pact on personality development posed the 
issue addressed in this research. In order to
evaluate the assumption, records describing
home atmosphere during childhood, recorded
during childhood, were linked with records
of subsequent criminality, gathered when the 
subjects were middle-aged. The two sources 
of information were independent: data collec- 
tion had been separated by several decades, 
the data had been coded by different people, 
and the coders had no access to information 
other than that which they were coding. 
Therefore, measures of home atmosphere were 
uncontaminated by retrospective biases and 
measures of subsequent behavior were uncon- 
taminated by knowledge of home background. 

Records describing home atmosphere had 
been written between 1939 and 1945. These 
records were case reports of counselors’ re- 
peated home visits to the 201 boys included 
in this study. The case records were coded,
in 1957, to provide descriptions of home atmo- 
sphere. 

Information about criminal behavior was 
gathered from court records, 30 years after 
termination of the program from which de- 
scriptions of home atmosphere had been col- 
lected. Subjects were considered criminals if 
they had been convicted for serious crimes 
(those indexed by the Federal Bureau of In- 
vestigation). 

In preliminary analyses, six of seven vari- 
ables describing home atmosphere were reli- 
ably related to criminal behavior. Only 
father’s absence failed to distinguish criminals 
from noncriminals. Considering the emphasis 
given to broken homes as a source of subse- 
quent criminality (e.g., Bacon, Child, & 
Barry, 1963; Glueck & Glueck, 1951; Wads- 
worth, 1979; Willie, 1967), this finding is 
worthy of note.

14 The “rational bet” that men not convicted as
juveniles would not be convicted as adults would be

Multiple regression analyses indicated that
six variables describing home atmosphere in
childhood account for a significant proportion
of the variance in number of convictions for
serious crimes. After controlling effects of
differences in social status, parental
characteristics and child-rearing variables
accounted for 32.2% of the variance in number
of convictions for property crimes and 30.3%
of the variance in number of convictions for
personal crimes. The three most direct mea- 
sures of child rearing (supervision, mother’s 
affection, and parental conflict) accounted for 
approximately a quarter of the variance in 
number of convictions for serious crimes— 
after effects of both social status and parental 
characteristics had been removed. 

Discriminant function analyses based on
the six variables describing home atmosphere
correctly identified 73.5% of the men as either
subsequently criminal or noncriminal; further,
these six variables provided a function that for
80% of the men correctly discriminated between
those convicted and those not convicted
for serious crimes as adults.

As compared with analyses for the total
sample, the discriminant function analyses
were (slightly) more accurate when used to
predict behavior among the men whose criminal
records provided the most complete histories
of convictions. Among men living in
Massachusetts at least to the age of 40, these
functions correctly identified 75% as ever
criminal or as noncriminal and 80.9% as
criminal or noncriminal after the age of 18.
Limiting analyses to men living in Massachusetts
controlled any differences contributing to
migration or early death; therefore, the accuracy
of discriminant functions among this
subsample is interpreted as supporting the
view that home atmosphere during childhood
contributes to criminality.

When used to identify men convicted as
adults, the discriminant function identified as
criminals almost two-thirds of the men first
convicted after the age of 18. This […]
also correctly sorted more […]
the juvenile delinquents, […]
those who were and those who were not
adult criminals.

Although the discriminant function based
on home atmosphere was […] successful
in identifying men who […]
criminals, it would […]
that the longitudinal design […]
has led to […] of the causes of […].
This research is limited not only by its subjects
(all of whom were reared in congested
urban areas during the thirties and early
forties) but also by the hypotheses considered.
In this research, parental aggression, paternal 
deviance, maternal self-confidence, super- 
vision, mother’s affection, and parental con- 
flict indexed home atmosphere; unconsidered 
variables might better describe the features 
in the child’s home that affect his behavior. 
In this research, the possibly confounding 
variable of social status was considered; other 
conditions might account for the apparent link 
between home atmosphere and crime. Never- 
theless, the evidence from this study suggests 
that parental behavior does have an important 
impact on subsequent behavior: predictions 
of adult criminality based on knowledge of 
home atmosphere were not only markedly 
more accurate than chance—they were also 
more accurate than predictions based on the 
individuals’ juvenile criminal records.


Are Causal Attributions Causal? A Path Analysis
of the Cognitive Model of Achievement Motivation 


Martin V. Covington and Carol L. Omelich 
University of California, Berkeley 


The causal assumptions of Weiner’s cognitive reinterpretation of the traditional
theory of achievement motivation are tested. This research asks the question,
“Do achievement motivation groups differ in their affective reactions, expect-
ancy of future success, and subsequent test performance as a consequence of
attributions made for a previous test outcome?” Subjects (N = 206) were
college students varying in resultant achievement motivation who experienced
feelings of failure in a course test and chose to take the exam a second time
under a mastery learning system. After feedback on first test performance,
students made attributions for their initial failure, indicated degree of shame,
and rated their expectancy for success on the second test opportunity. Treat-
ment of this system of nonmanipulated variables by path analysis techniques
provides little support for the contention that variations in expectancy and
retest performance depend on attributions made for a previous failure. Affect
depends in part on internal attributions, but in a direction opposite to predic-
tions. An alternative interpretation of the role of cognitive attributions in the
achievement process is explored.


The authors wish to thank Richard S. Lazarus,
Joy Stapp, Alan Zonderman, the action editor, and
[…] suggestions on an earlier draft of this article.
[…] Covington, Department of Psychology, University of […]

Copyright 1979 by the Am[…]

The most widely known theory of achievement
motivation (Atkinson, 1957, 1964) holds
that achievement behavior is the result of an
emotional conflict between hope of success
and fear of failure. Recently Weiner and his
colleagues (Weiner, 1972, 1974, 1977;
Weiner, Frieze, Kukla, Reed, Rest, & Rosenbaum,
1971) have suggested a cognitive
reinterpretation of this traditional theory. It is
proposed that need for achievement (nAch)
is mediated by perceptions of causality that
in turn influence affective reactions to success
and failure, expectancy of future success, and
subsequent performance. Thus achievement
motivation, originally conceived solely in
terms of differential emotional anticipations,
becomes heavily imbued with cognitive elements.
In this new schema nAch is assigned
a subordinate status as one of many antecedents
to achievement, with cognitive attributions
becoming the major causal determinants
of achievement behavior.

Following Heider (1958), Weiner proposes
ability, effort, luck, and task difficulty as
among the major perceived causes of achievement
performance (Weiner, 1972, 1977,
1979). There is general consensus that high
and low nAch individuals harbor differential
explanations about the causes of their successes
and failures (Weiner et al., 1971;
Weiner, Heckhausen, Meyer, & Cook, 1972;
Weiner & Kukla, 1970; Weiner & Potepan,
1970), and it is precisely these differences
that are taken as the essence of individual
differences in achievement motivation. Generally
speaking, persons motivated to approach
success (high nAch) attribute failure
to lack of effort and success to their ability,
whereas failure-avoiding (low nAch) persons
tend to ascribe failure to lack of ability and
success to external factors such as luck. The
cognitive model specifies several sequential
steps in the achievement process resembling
an information-processing system.

1. First, following a given achievement out- 
come, success or failure is interpreted retro- 
spectively in terms of the four major attribu- 
tional elements and according to the predis- 
posing individual differences in nAch. 

2. Second, these differential perceptions of
causality, in turn, mediate affective reactions
to performance (shame in failure and pride in
success) as well as the individual’s expectations
about the probability of future success.
For example, in failure, self-ascriptions to low
effort, characteristic of high nAch individuals,
intensify shame and act to maintain an
expectation for future success. By contrast, low
ability ascriptions lead to less shame but
lowered future expectations, a pattern typical
of low nAch persons.

3. As a final step, achievement performance
is determined by causal ascriptions acting
indirectly through both pathways of affective
reaction and expectations of future success.
In effect, high nAch persons who have reacted
to failure with chagrin yet have maintained
an undiminished optimism for the future (due
to their attributional biases) are likely to
perform far better on the next occasion than are
low nAch individuals. The entire process from
Step 1 through Step 3 is repeated from one
performance to the next.

This ingenious cognitive analysis has stimulated
a large volume of research (for reviews
[…] 1972, 1974, 1977). Yet despite
the impressive numbers, several shortcomings
[…] time, to the exclusion of other antecedent and
[…] a few studies […]
variables (e.g.,
Meyer, 1970 [fully described in Weiner et al.,
1972]; Weiner & Sierad, 1975), none have
[…] model as a dynamic
system in a real-life setting. As a result, the
central assumption of the primary causal role
of attributions lacks sufficient empirical support
to warrant its unqualified acceptance.


Thus we ask: “Do achievement motive groups 
differ in affective reactions, expectancy of 
future success and subsequent performance 
as a consequence of attributions made for a 
previous test outcome?” At stake here is the 
basic assumption that attributions are in 
themselves a sufficient explanation for 
achievement behavior. According to the 
model, once nAch acts to trigger differential 
attributional biases in response to success or 
failure feedback (Step 1), it plays no further 
role as a major, direct determinant of the 
more distal psychological events that influence 
retest performance (i.e., expectancy and af- 
fect). Whether or not we are justified in de- 
leting the concept of need achievement from 
theories of achievement behavior has yet to be 
empirically demonstrated. 

The purpose of the present study is to trace 
the sequential cause and effect network of 
relationships assumed by the cognitive model 
in order (a) to explore both the separate and 
overlapping causal roles played by nAch and 
cognitive attributions in determining varia- 
tions in achievement behaviors and (b) to 
determine whether or not the proposed at- 
tributional mediators act in accord with theo- 
retical predictions. To this end we have 
gathered data on all the crucial elements spe- 
cified by the model in an achievement context 
of test-taking in the classroom and in the 
chronological order that conforms to the 
theory. This analysis is confined to subjects 
who experienced subjective failure in the first 
of two test-taking opportunities.

The primary method for analyzing this system
of nonmanipulated variables is path
analysis (Anderson & Evans, 1974; Duncan,
1966, 1975; Heise, 1975; Kerlinger & Pedhazur,
1973; Werts & Linn, 1970). Path
analysis allows for all determining factors as
specified by a causal model to be incorporated
into an overall predictive analysis, thereby
permitting an estimation of the relative
contribution (both direct and indirect) of each
determinant to variations in the dependent
variable(s). In this case, the dependent variables
of interest are various facets of achievement
behavior.

Figure 1. Causal pathways of influence postulated by the cognitive model of achievement motivation
in the task reevaluation stage following failure.


For a proper interpretation of the analyses
to follow, it is important to bear in mind that
path analysis is not a procedure for demonstrating
causality. Rather it is a method for
tracing out the implications of a set of causal
assumptions that the theoretician is willing to
impose on a system of relationships. In essence,
we attempt to answer the question “Does the
cognitive model fit the data adequately?” by
comparing the observed relationships among
the variables with the expected relationships.
Since the presumed relationships and directionality
among the elements of the cognitive
model have been so well specified, the application
of recursive path analytic techniques does
not allow for the testing of an alternate model.
For instance, nAch is seen as an antecedent
that predisposes individuals to particular
attributional patterns. This causal ordering
precludes the testing within the path analysis
system of any alternate hypothesis that concerns
the effect of attributions on nAch.

It will prove helpful to indicate in advance
the kinds of hypotheses and specific causal
assumptions we will examine, the order of the
data presentation, and how path-analytic
techniques will be employed. This presentation
will be aided by reference to Figure 1, which
schematizes the cognitive events described
earlier. The unidirectional arrows and
accompanying algebraic signs indicate the main
pathways of causal influence to be tested.


Step 1: nAch → Attributions


Cognitive theory predicts a positive rela- 
tionship between effort ascriptions and nAch 
in failure, that is, as nAch level increases, so 
do ascriptions to low effort. Similarly, we ex- 
pect a negative relationship between nAch 
and ability ascriptions such that as nAch level 
increases, attributions to low ability decrease. 
The nAch pathways to luck and task ascrip- 
tions carry question marks, since the model 
makes no specific predictions regarding mo- 
tive group predispositions. However, the 
stable (task) and unstable (luck) qualities of 
these two external elements suggest the direc-
tion of causality indicated in parentheses.
Testing these predictions involves regressing 
each of the four attributional sources sep- 
arately on nAch. Since the first step involves 
relationships between only the first two ad- 
jacent sets of variables in the model (nAch 
as antecedent and attributions), no indirect 
sources of influence on attributions are pos- 
sible, and the resulting path coefficients are in 
actuality zero-order correlation coefficients. 
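
The last point, that a path coefficient with a single antecedent is simply the zero-order correlation, can be checked numerically. The sketch below uses simulated scores (206 cases to mirror the sample size, but the values are invented and carry no empirical meaning); the variable names `nach` and `effort` and the generating slope of 0.3 are illustrative assumptions.

```python
import random
from math import sqrt


def standardize(values):
    """Convert raw scores to z scores (mean 0, standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pearson_r(xs, ys):
    """Zero-order correlation as the mean product of z scores."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def path_coefficient(xs, ys):
    """Least-squares slope of standardized y on standardized x."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)


random.seed(0)
nach = [random.gauss(0, 1) for _ in range(206)]         # simulated antecedent
effort = [0.3 * x + random.gauss(0, 1) for x in nach]   # simulated attribution

print(abs(path_coefficient(nach, effort) - pearson_r(nach, effort)) < 1e-12)  # True
```

With more than one antecedent the two quantities diverge, which is why the later steps of the model require partial (standardized regression) coefficients.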


Step 2: nAch → Attributions → Affect/Expectancy


Shame. While nAch dispositions can ex- 
press their influence on shame in several ways, 
cognitive theory clearly specifies which path- 
ways are primary. 

1. A direct effect can, in principle, be ex- 
erted by nAch membership. However, as im- 
plied by its absence from Figure 1, this direct 
path is bypassed in the cognitive model. 
Rather, the theory assumes that the causal 
burden of the nAch—shame relationship is 
transmitted indirectly via attributions, with 
nAch serving as no more than an instigator 
of attributional biases. Path analysis allows 
for a comparison of the relative importance 
of these two nAch pathways. 

2. The cognitive position also assumes that 
shame is influenced directly along an internal- 
external dimension of causal ascriptions, with 
internal factors being the more important 
determinants. While this dimensional distinc- 
tion has been deemphasized in recent treat- 
ments and additional affective reactions iden- 
tified (Weiner, 1979; Weiner, Russell, & Ler- 
man, 1978, Note 1), it is still assumed that 
shame is a consequence of ability and effort 
attributions. Thus shame is said to be max- 
imized when failure is perceived as occurring 
through lack of effort, especially among highly 
able individuals, and minimized when the 
individual, particularly if he is of low ability, 
has tried hard. This leads to the prediction 
of a positive relationship between effort 
ascriptions and shame, that is, as attributions 
to low effort increase, so does shame. Correspondingly,
we expect a negative relationship
between ability ascriptions and shame such
that as low ability attributions increase shame
is reduced. This is not to say that low ability
attributions produce less shame than do ex- 
ternal explanations, but only that low ability 
mitigates shame. 

The preceding implies that high nAch indi- 
viduals will experience the greatest shame due 
to their disposition to maintain a sense of 
power (ability) and to blame failure on insufficient
effort. However, because of the traditional,
affective interpretation of failure-avoiding
persons as possessing a “capacity for
experiencing shame and humiliation” (Atkinson,
1957), we might expect low nAch persons
to be the most distressed. These conflicting 
predictions will be evaluated. 


Expectancy for future success. 


1. As in the case of shame, we will also 
address the basic question concerning the 
magnitude of causal influence exerted by nAch 
directly on expectancy (nAch → expectancy)
as contrasted to the cognitive assumption that
the only notable pathways of significance
come indirectly from nAch mediated through
attributions.

2. Expectancy is predicted to vary directly 
along the stable/unstable attributional axis. 
Specifically, expectancy is said to be higher 
when failure is attributed to unstable elements 
(effort, luck) and lower when ascribed to 
stable ones (ability, task). Accordingly, we
ought to find significant, positive path coef- 
ficients between expectancy and ascriptions to 
low effort and bad luck; and conversely, nega- 
tive path coefficients between expectancy and 
attributions to low ability and high task dif- 
ficulty. 

It follows from the foregoing that high 
nAch persons will maintain greater expecta- 
tions for future success after failure than will 
low nAch persons, owing to the tendency of
the former group to attribute failure to un- 
stable elements, especially lack of effort. 
While several studies confirm that low nAch 
persons harbor lower expectations for success 
(Atkinson, Bastian, Earl, & Litwin, 1960; 
Kukla, 1972a, 1972b; Moulton, 1974), there
is little evidence that this condition is influenced
by attributional mediators. Path analysis
allows us to evaluate the relative importance
of this presumed indirect causal
pathway.


Step 3: nAch → Attributions → Affect/
Expectancy—Retest 


Cognitive theory holds that attributions in- 
fluence performance largely through their de- 
termination of expectancy and affect. 

1. To clarify the causal role of nAch, we 
will assess the degree to which nAch exerts a 
direct impact on performance in contrast to 
its indirect influence through cognitive mediators.

2. The assumption that affect and expec- 
tancy influence performance largely because 
of antecedent variations in attributions leads 
us to expect a relatively large indirect effect 
of attributions on performance as compared 
to the magnitude of the direct influence ex- 
erted by expectancy and shame. Correspond- 
ingly, the direct influence of attributions on 
performance should be negligible. This latter
prediction takes on some importance in light 
of recent work suggesting direct attributional 
influence (e.g., Dweck, 1977; Latta, 1976). 

As we move through the various steps of the 
cognitive model, the number of indirect paths 
of potential influence increases progressively. 
In principle, nAch can express its influence on 
performance indirectly through some 14 dif- 
ferent indirect pathways. Although most of 
these paths contribute only negligibly to 
variations in performance, and few of the 
paths are of more than passing interest, we 
must nonetheless deal with the unwanted 
complexity that results. Consequently, at each
of the three major steps in our analysis we
will also employ a stepwise regression technique
that allows for an incremental F test as
a summary statistic. While not [illegible]
in interpretation, it provides a valuable adjunct
perspective indicating whether or not
the addition of each successive [illegible]
variable (or set of variables) increases significantly
the total explained variance in the particular
dependent variable under consideration.


Method 


Subjects and Procedure 


Subjects were drawn from among 439 students enrolled
in an introductory psychology course that featured
a modified mastery structure (Block, 1977,
Note 2). At the end of each instructional unit two 
equivalent multiple-choice tests were administered 
two days apart. If students wished the chance to im- 
prove, they had the option of taking the second test 
after further study. Two hundred and six students 
(100 females; 106 males) who considered their first 
performance unsuccessful and chose to take the sec- 
ond test served as subjects. The present data were 
collected at the end of the second instructional unit.


Measures 


Both tests were scored, and the results were known
to students before they left the examination room.
After receiving grade-equivalent feedback on the first
test, students completed a brief questionnaire.

Causal ascriptions for failure. Students indicated
the causal importance of each of Weiner’s attributional
sources using a format adapted from Feather
(1969):

To what extent do you consider that your unsuccessful
performance was due to each of the following factors?

The examination was difficult.
I did not study very much.
I had bad luck (e.g., test stressed different materials).
I lack the skill and ability.

Each attribution was rated separately on a 7-point
Likert-type scale from 1 (not a cause) to 7 (very
much a cause).

Affective reactions. Students also estimated degree
of shame as a result of their unsatisfactory performances.
Again, ratings were made on a 7-point scale
from 1 (not at all) to 7 (very much).

Grade expectancy. Students indicated the grade
that they expected to get on retesting. Letter grades
were converted to standard grade-point equivalents.
While some researchers stress measurement of expectancy
change (McMahan, 1973; Valle & Frieze,
1976), our use of subsequent expectancy level follows
the practice of the majority of investigators (e.g.,
Weiner, Nierenberg, & Goldstein, 1976).

Retest performance. Number correct on the second
test served as the performance measure.

Resultant achievement motivation. Following the
procedures of Weiner and Potepan (1970), an index 
of resultant achievement motivation was derived for 
each subject by taking the difference between z-score 
transformations on the Mehrabian Achievement Risk 
Preference Scale (MARPS) (Mehrabian, 1968, 1969) 
and the debilitating anxiety subscale of the Achieve- 
ment Anxiety Test (AAT) (Alpert & Haber, 1960). 
These two indices were administered during preenrollment
for the course, thereby occurring in the
antecedent position dictated by the model. Selection 
of the present failure sample did not result in a 
biased subset of the nAch distribution. The mean 
nAch of these subjects (M = -.081) does not differ
from that of the entire class (M = .068), t(560) =
1.21, p < .20. Moreover, the resultant scores of failure

Table 1
Means, Standard Deviations, and Product-Moment Correlations for Variables in the
Cognitive Model of Achievement Behavior (N = 206)


Variable   nAch   Effort   Ability   Task difficulty   Luck   Shame   Expectancy   Performance
nAch        -     .048    -.342*    -.066            -.237*  -.194*   .208*        .145*
tort j —.168* —.284° —.076 —.147° —.113° — 067 
ee ae ae a aai — 086 
ilii 141 F À -. =, 
Ability (069) (233) a fs 
ifficult: 237° .00' —. 25 
Task difficulty (.204*) e 6 
077 001 012 
oe —.047 091 
(—.020) 
Expectancy .357* 
M —.081 3.74 3.01 5.28 3.80 4.02 3.33 21.11 
SD 1.39 1.83 1.62 1.39 1.71 1.81 .52 3.1


a Coefficient of residual covariation.
*p < .05. 


subjects classified as high (M = 1.08) and low (M =
-1.14) on a median split do not differ from those in
the class at large (M_H = 1.21, p < .20; M_L = -1.007,
p < .30), indicating equivalent heterogeneity.
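
The resultant achievement motivation index described above (a difference between z-score transformations, followed by a median split) can be sketched as follows. This is a minimal illustration rather than the authors' scoring procedure: the scores are invented, the function names are hypothetical, and the use of population standard deviations and a strict "above the median" rule for the high group are assumptions.

```python
from statistics import mean, median, pstdev


def z_scores(values):
    """Standardize raw scale scores (population standard deviation assumed)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def resultant_nach(marps, aat_debilitating):
    """Resultant index: z(MARPS) minus z(debilitating anxiety) for each subject."""
    return [zm - za for zm, za in zip(z_scores(marps), z_scores(aat_debilitating))]


def median_split(index):
    """Label each subject high or low relative to the sample median."""
    cut = median(index)
    return ["high" if value > cut else "low" for value in index]


# Hypothetical scale scores for six subjects, for illustration only.
marps_scores = [12, 30, -5, 18, 25, 3]
aat_scores = [35, 20, 40, 28, 22, 31]

index = resultant_nach(marps_scores, aat_scores)
groups = median_split(index)
for value, group in zip(index, groups):
    print(f"{value:6.2f}  {group}")
```

Because both components are standardized before subtraction, the index has a mean of zero in the sample on which it is computed.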


Results and Discussion 


The means, standard deviations, and intercorrelations
among the variables of the cognitive
model are presented in Table 1. The
cognitive model assumes no causal reciprocity
between same-level variables (i.e., among attributions
or between expectancy and shame).
In path-analytic terms, any observed residual
covariation (e.g., -.23 between task and effort
ascriptions) need not be zero, but may be
assumed to originate from a common depen- 
dency on nAch or on variables outside the 
model. Thus these same-level covariations do 
not influence the causal structure and are 
therefore not dealt with further. 

Figure 2 presents the results of the total 
path analysis. The values associated with the
unidirectional arrows are path (standardized
regression) coefficients (P). Epsilon [$E = (1 - R^2)^{1/2}$]
represents the path coefficient
value of all sources of variation not specified 
by the model. Preliminary path analyses by 
sex produced virtually identical results, hence 
male and female data are combined in this 
present analysis. These null results are interesting
in light of previous evidence of sex
differences in achievement attributions and
their consequences (e.g., Dweck, 1977;
Feather, 1968, 1969; House & Perney, 1974; 
Nicholls, 1975). The present findings prob- 
ably point to the relatively modest role 
played by sex when all aspects of the model 
are allowed to vary naturally. 
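
The residual path coefficient defined above is the square root of the unexplained proportion of variance. A minimal sketch, using the total R² of .078 reported for shame in the incremental F analysis (the function name is hypothetical):

```python
import math


def residual_path(r_squared):
    """Path coefficient for all sources of variation outside the model."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R squared must lie between 0 and 1")
    return math.sqrt(1.0 - r_squared)


# Total R squared for shame as reported in the text.
e_shame = residual_path(0.078)
print(f"E for shame = {e_shame:.3f}")
```

A value this close to 1 indicates that most of the variation in the dependent variable arises from sources the model does not specify.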


Step 1: nAch → Attributions




Path coefficient values derived from regress- 
ing each of the four attribution elements 
singly on nAch are presented in the left-hand 
portion of Figure 2. The results indicate that 
variations in ability and luck ascriptions have 
a significant dependency on nAch (P =
-.341 and -.237, p < .05, respectively),
but that effort (P = .048) and task ascriptions
(P = -.066) do not. Consistent with
previous results (Kukla, 1972a; Weiner et al.,
1971, 1972; Weiner & Potepan, 1970), failure
experiences are found to elicit greater ascriptions
to low ability among low nAch persons.
However, the finding that low nAch persons
ascribe their failures to bad luck was unexpected
and may have been precipitated by
the highly ego-involving nature of the present
task. Those students who expect to fail may
be acting defensively to disavow personal
responsibility (Covington & Omelich, Note 3).

Figure 2. Path diagram of effects of nAch, causal attributions, expectancy, and shame on subsequent
performance following failure.

Whatever the explanation, inspection of
Figure 2 indicates that luck as a causal 
attribution exerts no further influence on any 
of the distal components of the model. Similarly,
task difficulty also drops out as a causal
factor, having established neither a dependency
on nAch nor a clear causal influence forward
in the presumed chain of psychological
events. Yet we cannot conclude that perceptions
of task difficulty exert no influence on
achievement behavior; we know they do.
Variations in task complexity have been
shown to influence expectations for future success
(Atkinson & Feather, 1966) as well as
reactions to a given performance (Parsons &
Ruble, 1972) and to moderate performance
(Marso, 1969; Woodson, Note 4). The fact
that the task demands (test) were identical
for all students probably led to a restricted
range of task attributions, causing an underestimation
of the true causal role of task
variables.


The prediction that high nAch persons dif- 
ferently ascribe failure to insufficient effort 
was not upheld by the present data (P =
.048, p < .50). Other attempts to substantiate
this hypothesis have also met with disappointment
(e.g., Weiner et al., 1971; Weiner &
Kukla, 1970; Weiner & Potepan, 1970). Several
explanations for this general failure to
confirm the hypothesis appear tenable. First,
according to the proposed distinction between
stable and temporary effort (Rest, Nierenberg,
Weiner, & Heckhausen, 1973; Weiner,
1974), cognitive predictions should be upheld
if students are responding predominantly to
low effort as a temporary state of affairs. But
to the extent that some subjects, notably low
nAch individuals, are reflecting a more traitlike
disposition never to try (perhaps from a
resignation born of a history of past failures),
the predicted nAch → effort attribution linkage
will be canceled out. Second, a self-serving,
defensive interpretation is also viable. Inferences
to low ability evoke shame (Covington
& Omelich, 1979), and because a combination
of high effort/failure signifies low ability 
(Kun & Weiner, 1973), some students may be 
acting to deceive themselves and others with 
false effort ascriptions. A third possibility is 
that the real difference between nAch motive 
groups lies not so much in the amount of 
study (the typical focus of attributional ques- 
tionnaires) as in its quality. The number of 
hours spent studying, either for the first test 
or for the retest, was unrelated to nAch mem- 
bership. The reason for this may be found in 
the work of Goldman, Hudson, and Daharsh 
(1973), who report that many low-achieving 
college students persist longer in their studies 
than do high achievers, a superiority offset by 
an opposing tendency among yet other low 
achievers to study little if at all. Additional 
effort may not pay off for low achievers be- 
cause nagging doubts about their ability inter- 
fere with their study (Wine, Note 5). 

All three lines of speculation point to the 
existence of important qualitative differences 
between nAch groups in the matter of effort 
expenditure that have yet to be fully explored. 
Meantime, the predicted nAch → effort attribution
relationship remains unconvincing. To
the extent that nAch groups may differ in
affective reactions to failure, future expectations,
and retest performance itself, these
variations do not appear to depend on their 
postdictive effort ascriptions. Naturally, this 
does not preclude effort attributions from 
exerting a direct influence on the more distal 
components of the model or from indirectly 
influencing performance through shame or 
expectancy, pathways we will examine shortly. 


Step 2(a): nAch → Attributions → Affect


Table 2 presents the results of decomposing 
the relationship between nAch and shame into 
direct and indirect effects. Indirect effects in- 
dicate the degree to which a change in an 
independent variable (nAch) influences an 
intervening variable (cognition), which in 
turn results in variations in the dependent 
variable of interest (shame). They are estimated
by summing the products of the path
coefficients of the one or more paths that 
intervene. At present no significance tests are 
available for indirect effects except by eval- 
uating the overall goodness of fit to the 
model. 
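
The estimation rule just described (multiply the path coefficients along each mediating route, then sum across routes) can be sketched with the coefficients reported in the text for the nAch, ability, effort, and shame pathways. The function and constant names are hypothetical, and because the published coefficients are rounded, the products should be expected to match the reported indirect effects only approximately.

```python
def indirect_effect(routes):
    """Sum, over mediating routes, of the product of the path coefficients on each route."""
    total = 0.0
    for route in routes:
        product = 1.0
        for coefficient in route:
            product *= coefficient
        total += product
    return total


# Path coefficients as reported in the text for Figure 2.
NACH_TO_ABILITY = -0.341
NACH_TO_EFFORT = 0.048
ABILITY_TO_SHAME = 0.148
EFFORT_TO_SHAME = -0.133

via_ability = indirect_effect([[NACH_TO_ABILITY, ABILITY_TO_SHAME]])
via_effort = indirect_effect([[NACH_TO_EFFORT, EFFORT_TO_SHAME]])
via_internal = indirect_effect([
    [NACH_TO_ABILITY, ABILITY_TO_SHAME],
    [NACH_TO_EFFORT, EFFORT_TO_SHAME],
])

print(f"via ability:  {via_ability:.3f}")
print(f"via effort:   {via_effort:.3f}")
print(f"via internal: {via_internal:.3f}")
```

The products fall close to the values of -.051 (ability) and -.007 (effort) given in the text; the small differences are consistent with rounding in the published coefficients.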

Table 2
Decomposition of Zero-Order Correlations of Achievement Motivation With Shame,
Expectancy, and Performance

Relationship                                  Shame    Expectancy    Performance
Zero-order correlation                        -.194*     .209*         .145*
Direct effect (P)a                            -.141      .202*         .087
Total indirect effect                         -.053      .007          .058
Effort (E)
Ability (A)
Task difficulty (TD)
Luck (L)
Internal (A, E)
External (TD, L)
Stable (TD, A)
Unstable (E, L)
Internal attributions and shame
External attributions and shame
Stable attributions and expectancy
Unstable attributions and expectancy
Shame
Expectancy

[Remaining cell values are not recoverable from the source; the three rows shown are taken from the text.]

a Standardized regression coefficients.
*p < .05.

Inspection of the first column in Table 2
indicates a significant zero-order correlation
between nAch and shame of -.194 (p < .05)
that decomposes into a direct effect from
nAch of -.141 (p = .054), which falls just
short of the conventional standard of significance,
and into a total indirect influence on
shame from nAch mediated through attributions
of -.053. Reading down the remainder
of column 1, we note that only ability emerges
as a substantial transmitter of nAch influence
(-.051), accounting for virtually all the
total indirect influence on shame. The contribution
of effort is negligible (-.007).
Other attempts to substantiate attributions as 
mediators of affect among nAch groups have 
fared little better. While Meyer (1970) dem- 
onstrated that high and low nAch subjects 
differed in ability ascriptions following failure, 
no relationship was found between this dif- 
ferential ascriptive pattern and degree of 
affective reaction to failure. Similarly, Weiner 
and Potepan (1970) found nAch membership
predictive of anxiety over an upcoming school
test, but again affect was unrelated to accom- 
panying attributions. 

These data provide little support for the 
view that the principal influence of nAch on 
shame is mediated indirectly via cognitive 
mechanisms. Contrary to what would be ex- 
pected if nAch acted only to initiate differen- 
tial ascriptions, there is a tendency for nAch 
to exert a direct influence on shame in its own 


“right, an effect certainly greater than that 


transmitted through attributions. 

Now consider the direct influence of attributions
on shame, irrespective of nAch.
Figure 2 indicates two noteworthy direct
pathways: one from effort ascriptions to
shame (P = -.133, p < .06) and the other
between ability ascriptions and shame (P =
.148, p < .06). While this pattern of findings
is consistent with Weiner’s original contention
that shame is determined primarily by variations
in the internal elements of effort and
ability, the direction of both effects is
opposite to that predicted. First, ascriptions
to low effort tend to decrease rather than increase
shame; and second, ascriptions to low
ability act to increase, not reduce, shame.
These trends are consistent with recent findings
(Covington & Omelich, 1979, [illegible];
Nicholls, 1975, 1976; Sohn, 1977; Covington,
Spratt, & Omelich, Note 6) indicating that
students take pride in high self-ability attribu- 
tions and experience shame at low ability, a 
dynamic intensified in repeated failure (Cov- 
ington & Omelich, Note 7). 

One consequence of this ability valuing is 
that high nAch subjects tend to experience 
less shame than do low nAch subjects, not 
only because of the maintenance of high abil- 
ity ascriptions despite failure but also as a 
direct function of their motive group membership.
This finding favors the traditional interpretation
of failure-avoiding persons as “possessing
a capacity for experiencing shame”
(Atkinson, 1957) and lends weight to
Nicholls’s conclusion (1976) that the key difference
between high and low resultant
achievement motivation students lies in their
differential self-concepts of ability.

Originally Weiner assumed that students 
would internalize teacher reinforcement pat- 
terns that favor achievement through effort 
and would view low ability as a mitigating 
circumstance in failure. However, in light of 
subsequent evidence (cited above), Weiner 
has recently acknowledged (Weiner et al., 
1978) that ability and effort attributions can 
magnify affective reactions, depending on 
whether the individual is a teacher or student. 
The present data underscore the importance 
of maintaining a distinction between rewards 
and punishment from others and personal re- 
actions to performance outcomes. 

The left-hand portion of Table 3 displays 
the results of incremental F tests providing an 
overall summary of the path analysis for 
shame. As the first predictor variable entered,
nAch accounts for 48.5% (.038) of the explained
variation in shame through all paths
of influence, direct and indirect, F(1, 200) =
8.16, p < .01. Next, the four attributional
sources combined add .040 to R² and account
for the remaining 51.5% of the explained
variance (R² = .078), F(4, 200) = 2.16, p <
.08. Only the internal elements of ability, F(1,
200) = 3.81, and effort, F(1, 200) = 4.13, 
contribute significantly to this total (p < 
.05), with the contribution of effort coming 
entirely from its direct influence on shame. 

Table 3
Incremental F Tests: Effects of Achievement Motivation and Attributions on
Shame and Expectancy

                                   Shame                              Expectancy
Antecedent                     Increment           % of            Increment           % of
variable           df    R²     to R²      F       R²       R²      to R²      F       R²
nAch                1   .038      -       8.16*   .485     .043       -       9.38*   .597
Attributions        4   .078    .040      2.16    .515     .072     .029      1.58    .403
Effort              1   .057    .019      4.13*   .253     .058     .015      3.28    .209
Ability             1   .056    .018      3.81*   .227     .048     .005      1.20    .075
Task difficulty     1   .041    .003       .36    .036     .050     .007      1.46    .093
Luck                1   .038    .000  [illegible] .007     .045     .002       .42    .027
Stable              2   .058    .020      2.21    .263     .055     .012      1.31    .168
Unstable            2   .058    .020      2.12    .260     .060     .017      1.85    .235
Internal            2   .075    .037      3.97*   .479     .064     .021      2.23    .283
External            2   .041    .003       .36    .083     .052     .009       .94    .120
R² total                .078                               .072

Note. df residual = 200.
*p < .05.

However, it is reasonable to wonder if the
introduction of nAch as the first variable in
our path analysis acted to attenuate the impact
of attributions on shame, given the considerable
variance shared by nAch and some
of the attributional elements (see Table 1).
Accordingly, an analysis with nAch deleted 
yields a significant effect of attributions on 
shame, F(4, 201) = 3.22, p < .05. While the 
internal attributions of ability (p < .01) and 
effort (p < .07) now account for virtually all 
the explained variance in shame (74% and 
26%, respectively), there is a decrease in 
R² from .078 (with nAch included) to .060.
However, this decrement fails to reach sig- 
nificance at the conventional level, F(1, 200) 
= 3.68, p < .06. 
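
The incremental F tests summarized here compare the gain in R² from a newly entered variable (or set of variables) with the residual variance of the full model. The sketch below uses the standard hierarchical-regression formula; that the authors computed their statistics in exactly this way is an assumption, and the function name is hypothetical. The inputs are the rounded values reported for the four attributions combined.

```python
def incremental_f(delta_r2, r2_full, df_added, df_residual):
    """F ratio for the increment to R squared contributed by the added predictors."""
    explained_per_df = delta_r2 / df_added
    residual_per_df = (1.0 - r2_full) / df_residual
    return explained_per_df / residual_per_df


# Four attributions entered after nAch, shame as the dependent variable:
# increment to R squared = .040, total R squared = .078, df = 4 and 200.
f_attributions = incremental_f(0.040, 0.078, 4, 200)
print(f"F(4, 200) = {f_attributions:.2f}")
```

The result is close to the reported F(4, 200) = 2.16; exact agreement should not be expected because the published R² values are rounded to three places.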

These results establish the importance of 
the direct influence of internal attributions on 
shame, even though their role as transmitters 
of motive dispositions is less than expected 
and in a direction opposite that predicted by 
the cognitive model. It could be argued that 
these data support the contention that nAch is 
“nothing more than one manifestation of dif- 
ferential attributions” (Weiner, Note 8), 
especially since deletion of nAch results in a 
nonsignificant decrement in explained vari- 
ance. However, the present nAch data pattern 
in conjunction with previously cited findings
(Meyer, 1970; Weiner & Potepan, 1970)
suggests that nAch does make a separate contribution
to shame, independent of the predictive
power it shares with ability attributions.
Indeed, one further analysis deleting the four
attributions from the prediction equation indicates
a significant contribution from nAch, in
the absence of any intervening influence of
causal cognitions (R² = .038), F(1, 204) =
7.98, p < .05. Despite their commonality, we
conclude that both motive dispositions and
internal cognitive ascriptions exercise a direct
and independent influence on shame.



Step 2(b): nAch → Attributions → Expectancy


The second column of Table 2 indicates a 
significant zero-order correlation between 
nAch and expectancy for future success of 
.209 (p < .05) that decomposes into a significant
direct effect of .202 (p < .05) and a
negligible indirect effect from nAch mediated
through all four attributional sources of .007.
We conclude that variations in expectancy
have a substantial direct dependency on motive
group membership and that contrary to
cognitive predictions, attributions do not act 
as mediators for this covariation. The weight 
of previous evidence on failure does little to 
alter this conclusion. While Meyer (1970) 
reported that high and low nAch groups dif- 
fered in their attributions to stable elements, 
these ascriptions led to expectancy shifts only 
among low nAch students. Moreover, Weiner
and Kukla (1970) found that although high
nAch subjects reported greater ability levels,
they did not as a consequence hold higher expectancies
for future success. Thus it appears
that whatever differential attributions high 
and low nAch persons make in response to 
failure, they play little part in determining 
expectancies for future success. 

Now consider the evidence for a direct in- 
fluence of attributions on future expectancy, 
irrespective of nAch membership. Whenever
the direct attribution → expectancy linkage has
been the sole focus of research, the evidence 
has overwhelmingly favored cognitive predic- 
tions (Fontaine, 1974; McMahan, 1973; 
Meyer, 1970; Rosenbaum, 1972; Valle, 1974; 
Weiner et al., 1976; the work of Rosenbaum, 
1972, and Valle, 1974, is fully described in 
Weiner et al., 1976). These studies clearly 
demonstrate that expectations of future suc- 
cess remain higher when previous failures are 
attributed to unstable elements than to stable 
ones. In view of such agreement the present
findings may appear puzzling. Figure 2 indi- 
cates that three of the four attributional ele- 
ments (ability, luck, and task difficulty) exert 
no significant direct causal influence on variations
in expectancy. This is also contrary to
recently developed theoretical alternatives 
centering on learned helplessness (Abramson, 
Seligman, & Teasdale, 1978; Dweck, 1975, 
1977; Covington & Omelich, Note 7), suggesting
that self-ascriptions to low ability
should produce expectancy decrements. However,
it must be recalled that path analysis
assesses the relative influence of each of many
factors on a single dependent variable simultaneously.
The role of attributions as direct
causal mediators is probably overshadowed by
the influential presence of nAch in the full
model. When nAch is excluded from the path
analysis and attributional sources are considered
in isolation, the causal link between
ability ascriptions and expectancy now approaches
significance (P = -.143, p < [illegible]),
suggesting that this stable element acts to depress
future expectancies of success. The insignificant
role played by task ascriptions remains
unchanged by these [illegible]
(P = -.084). This may reflect the possibility
that, although it is a stable element, failure
ascribed to a difficult task does not necessarily
imply personal inadequacy and hence holds 
no particular implication for the future 
(Klein, Fencil-Morse, & Seligman, 1976; Ten- 
nen & Eller, 1977). Thus our data lend partial 
support to the hypothesized role of stable 
elements as mediators of future expectancy, 
while pointing up their relatively modest 
causal role when compared to other variables 
in the model.

We note in Figure 2 a significant direct
causal path from effort ascriptions to expectancy
(P = -.154, p < .05), but in a direction
opposite to that predicted. Rather than
raising hopes for future success, low effort 
ascriptions for a previous failure appear to 
dash them. Although having studied little for 
a first test may in principle sustain optimism 
about future test outcomes, this is likely only 
if time permits sufficient additional study be- 
fore the next test. When the time interval 
between tests is relatively short, as in the pres- 
ent study, students who did not prepare ade- 
quately the first time are probably less san- 
guine about a second chance. Weiner’s notion 
of stable effort may also apply (Rest et al., 
1973; Weiner, 1974, 1979). Consistent inac- 
tion in situations that clearly demand con- 
centrated effort suggests an attitude of de- 
moralization that carries with it negative 
expectations for future success. For some stu- 
dents a combination of low effort and pes- 
simism for the future may simply reflect an 
underlying negative attitude regarding their 
sense of personal causality (deCharms, 1968, 
1972). At the same time, these postdictive 
effort ascriptions may be partly defensive. 
The more effort expended in a failing cause, 
the more individuals are likely to underesti- 
mate actual time spent studying (Covington 
& Beery, 1976), a self-serving tactic that acts 
to confound cognitive predictions.

The right-hand portion of Table 3 provides 
a summary of the findings for expectancy. An 
incremental F test indicates that nAch ac- 
counts for some 60% of the explained vari- 
ance, F(1, 200) = 9.38, p < .05. Of all the 
attributional elements taken singly, only low 
effort contributes to any degree (although not 
significantly), accounting for half the remaining
40% of the explained variance in expectancy,
F(1, 200) = 3.28, p < .08. Again
to avoid underestimating the role of attribu- 
tions as causal agents because of the signif- 
icant covariation between nAch and attribu- 
tions, the analyses in Table 3 were recalcu- 
lated with nAch deleted. A significant decrease 
in R² is the result, from .072 (with nAch included)
to .037, F(1, 200) = 7.54, p < .01.
Thus nAch exerts a substantial influence on 
expectancy in the absence of any discernible 
impact of cognitions. 


Step 3: nAch → Attributions → Affect/
Expectancy—Retest 


Inspection of column 3 in Table 2 reveals 
a significant zero-order correlation between 
nAch and retest performance of .145 (p < 
.05) that decomposes into a direct effect on 
performance of .087 and into a total indirect 
effect of .058. Five pathways make up the 
total indirect influence of nAch on perform- 
ance. Three involve attributions as mediating 
links and should therefore make substantial 
contributions. The two remaining pathways 
involve only expectancy and affect, respec- 
tively, as the sole mediating links between 
nAch and performance and should in theory 
contribute relatively little to the total indirect 
effect. To simplify, we have squared and 
summed the individual contributions of each 
of these five pathways to permit a percentage 
interpretation. 

1. nAch → attributions → performance. The
value (.014) associated with the four brack- 
eted entries in column 3 of Table 2 indicates 
the degree of nAch influence on performance 
mediated through variations in the four attri- 
butional sources combined. The absolute value 
of this effect (disregarding algebraic signs) 
represents roughly 15% of the total indirect 
effect of motive group on performance. Ability
is the only attributional mediator of importance,
accounting for most of the 15% contribution
(13%). Previous research on this
nAch → attribution → performance link has
been inconclusive. Weiner and Sierad (1975) 
and Meyer (1970) report the relationship 
holding only for low nAch subjects, whereas 
Kukla (1972a) found it only among high 
nAch subjects. In light of the present data
this inconsistency may be interpreted as due
to the modest contribution of the nAch →
attribution linkage to performance.

2. nAch → attributions → shame → performance.
The internal dimensions of ability and
effort have a negligible influence (-.008) on
nAch group performance as mediated through 
shame. As expected, external elements play no 
part in this dynamic (.001). These combined 
sources account for only 1% of the total in- 
direct effect. 


3. nAch→attributions→expectancy→performance.
Similarly, ascriptions to stable elements
(.011) and to unstable elements
(—.008) contribute little to the performance 
outcomes of nAch groups as mediated through 
expectancy. These combined sources account 
for 3% of the total indirect effect of nAch in- 
fluence on performance. 

4. nAch→shame→performance. Variations
in shame produce a stronger indirect pathway
of nAch influence on performance (−.018),
accounting for 6% of the total indirect effect. 

5. nAch→expectancy→performance. The
most powerful source of indirect influences for 
nAch membership on performance is expressed 
through variations in expectancy (.066), ac- 
counting for 75% of the total indirect causal 
impact of motive group membership on per- 
formance. 
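
The decomposition described above can be checked with a few lines of arithmetic. The sketch below only re-adds the signed values quoted in the text (the variable names are illustrative, not the authors'). The percentage figures rest on squared contributions of individual table entries that the text does not list, so the sketch does not try to reproduce them.

```python
# Values quoted in the text for nAch and retest performance (Table 2, column 3).
zero_order_r = 0.145
direct_effect = 0.087

# Signed contributions of the five indirect pathways, as reported in items 1-5.
indirect_paths = {
    "attributions": 0.014,
    "attributions -> shame": -0.008 + 0.001,       # internal + external elements
    "attributions -> expectancy": 0.011 - 0.008,   # stable + unstable elements
    "shame": -0.018,
    "expectancy": 0.066,
}

total_indirect = sum(indirect_paths.values())
print(round(total_indirect, 3))                  # 0.058
print(round(direct_effect + total_indirect, 3))  # 0.145
```

The signed pathway values sum to the reported total indirect effect, and adding the direct effect recovers the zero-order correlation.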


In summary, the central proposition of cognitive
theory, that performance outcomes
among motive groups depend largely on differ- 
ential attributions made for previous failures, 
finds little support. The main mediators of 
differential performance prove to be expect- 
ancy and, to a lesser degree, ability and 
shame, which together account for some 94% 
of the total indirect effect of motive group 
membership on performance. 

Next, consider the direct effect of expectancy,
shame, and the four elements of causal
ascription on performance. Figure 2 indicates 
that the single most powerful source of direct 
influence is expectancy (P = .333, p < .05), 
followed by shame, which makes a modest, 
although nonsignificant, contribution (P =
.131, p < .06). The direct path coefficient for
each attribution is also nonsignificant (effort, 
P = —.028; ability, P = —.076; task dif- 
ficulty, P = —.012; and luck, P = .048). 

Table 4
Incremental F Tests: Effects of Achievement Motivation, Attributions, Expectancy,
and Shame on Performance

Antecedent variable     df    R²    Increment to R²     F       % of R²
nAch                     1   .021         -            4.89*     .137
Effort                   1   .027       .006           1.31      .037
Ability                  1   .027       .006           1.32      .037
Task difficulty          1   .023       .002            .54      .015
Luck                     1   .022       .001            .43      .012
Stable                   2   .029       .008            .93      .052
Unstable                 2   .038       .007            .87      [illegible]
Internal                 2   .032       .011           1.32      .074
External                 2   .025       .004            .49      [illegible]
Combined attributions    4   .036       .015            .90      .101
Shame                    1   .037       .016           3.71      .104
Expectancy               1   .122       .101          23.61*     .660
R² total                     .153


Note. df residual = 198. 
* p <.05. 


These four path coefficients were recalculated 
excluding nAch. However, the new values differ
little (−.027; −.098; −.012; and .036,
respectively), and in no cases were changes 
sufficient to achieve statistical significance. 
Traditionally, theoretical interest has focused 
on the indirect mediating role of attributions, 
although the weight of evidence (Dweck,
1975, 1977; Kukla, 1972a; Weiner et al.,
1972; Weiner & Potepan, 1970; Weiner &
Sierad, 1975) suggests the existence of direct
attributional links to performance, at least in
cases of failure. However, the present findings
cause us to wonder about the importance
of such linkages relative to other direct
and indirect pathways in the model.

As the last step, we assess the indirect influence
of attributions mediated through expectancy
and shame (attributions→shame/
expectancy→performance), regardless of
nAch. These data indicate that the total indirect
influence from all attributional sources
(disregarding algebraic signs) via expectancy
is .136 and .045 via shame. While no statistical
comparisons are possible between direct
and indirect pathways, it is clear that expectancy
(P = .333) and, to a lesser extent,
shame (P = .131) exert a considerable direct
influence on performance, independent of attributions,
whereas, contrary to cognitive
theory, variations in expectancy and shame
that do depend on attributional mediators
have a relatively modest influence on performance.

With the completion of the path analysis of 
performance, we come full circle, having in- 
vestigated the entire presumed causal network 
among all elements of the cognitive model.
Whether or not the magnitude of the various
attributional influences, both direct and in- 
direct, is sufficient to justify the cognitive 
theory of achievement motivation in its pres- 
ent form is a major question raised by this 
study. The results of incremental F tests pre- 
sented in Table 4 help put this question in 
perspective. Need achievement accounts for 
some 14% of the explained variation in per- 
formance, F(1, 198) = 4.89, p < .05, through 
both its direct and indirect influences. The
combined attributional sources account for
an additional 10%, F(4, 198) = .90, p < .50.
Expectancy, F(1, 198) = 23.61, p < .05, and
shame, F(1, 198) = 3.71, p < .06, account for
the remaining explained variance in performance,
66% and 10%, respectively, and in conjunction
with nAch explain 90% of the total
R². Once again, the data were reanalyzed excluding
nAch. The removal of nAch does not
appreciably decrease the total R² for performance,
F(1, 198) = 1.41, p < .25. But neither
is there a corresponding increase in the predictive
power of attributions, F(1, 199) = 1.01,
p < .50. Cognitions, which now account for
14% of the total R², as compared to the
original 10%, continue to exert only a negligible
impact on performance through both
direct and indirect pathways, F(4, 199) =
1.20, p < .50. It is expectancy that gains as a
determinant of retest performance, from 66%
to 77%, when nAch is removed. This is due
to the considerable predictive power shared
by nAch and expectancy in the full model.
Another approach for judging the impor- 
tance of attributions is to assess the predictive 
power of the traditional theory of achievement 
motivation in the absence of mediating cognitions.
In its simplest form Atkinson’s model
(1957, 1964) postulates that the performance
of any individual depends on a multiplicative
combination of achievement motivation
(nAch), the incentive value (I_s) of the task
(transliterated in Weiner’s system as shame),
and expectancy of future success (P_s). The
final entry in Table 4 indicates that when
these basic elements are combined along with
attributions, a total of 15.3% of the variance
in retest performance is explained. When the
identical path analysis is repeated, but this
time with all attributional sources removed,
14.8% of the variance remains accounted for,
a decrease in explained variance that does not
reach statistical significance, F(4, 198) =
1.17, p < .50. It appears that knowledge of
causal ascriptions in failure tells us little more
about the quality of subsequent performance
than is already known from the elements of
the traditional model.
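
The incremental F tests reported in Table 4 can be approximated from the printed increments. The sketch below assumes the standard hierarchical-regression formula, F = (ΔR²/df) / ((1 − R² total)/df residual); the article does not print the formula it used, and the function and variable names are illustrative.

```python
def incremental_f(delta_r2, df_num, r2_full, df_resid):
    """F for the increment in R^2 contributed by a block of predictors."""
    return (delta_r2 / df_num) / ((1.0 - r2_full) / df_resid)

R2_TOTAL = 0.153   # full-model R^2 from Table 4
DF_RESID = 198     # residual df from the table note

# (increment to R^2, numerator df) as printed in Table 4
increments = {
    "nAch": (0.021, 1),
    "Combined attributions": (0.015, 4),
    "Shame": (0.016, 1),
    "Expectancy": (0.101, 1),
}

for name, (delta, df) in increments.items():
    f = incremental_f(delta, df, R2_TOTAL, DF_RESID)
    share = delta / R2_TOTAL   # proportion of the explained variance
    print(f"{name:22s} F = {f:5.2f}  share of R^2 = {share:.3f}")
```

For expectancy this gives F of about 23.61 and a share of .660, matching the table. The other rows come out close to, but not identical with, the printed values, presumably because the printed increments are rounded to three decimals.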


General Discussion 


The overall results of this study offer little 
support for the cognitive model of achieve- 
ment behavior as presently formulated. First, 
achievement motive groups do not differ in 
test performance primarily as a consequence 
of differential attributions made for a previous 
failure. Second, although some attributions 
(notably effort and ability) do contribute to 
negative affect and future expectancy, these 
causal relationships are not always in the pre- 
scribed direction. Moreover, the effects of at- 
tributions on performance as mediated 
through expectancy and affect are disappointingly
small. Third, nAch emerges as an important
influence on achievement behaviors in
its own right. In the case of shame, nAch contributes
both independent of and in common
with mediating cognitions, primarily ability. 
Moreover, nAch exerts a direct impact on ex- 
pectancy in the absence of a significant con- 
tribution from attributions. And finally, nAch 
affects subsequent performance through a 
combination of direct influence and indirect 
influence mediated largely by expectancy, a 
network basically unaffected by cognitions. 
We agree with Weiner’s recent reassessment 
that “the presumed causal biases ascribed to 
individual differences in . . . achievement 
needs remain somewhat tenuous” (Weiner et 
al., 1978). 

What part, then, if any, do attributions 
play in the achievement dynamic? In our 
view, attributions, expressed postdictively, are 
reactions to rather than causes of perform- 
ance. As postdictive reactions to failure, at- 
tributions are prone to reflect defensive, self- 
serving biases (Miller, 1976; Snyder, Stephan, 
& Rosenfield, 1976; Stephan, Rosenfield, & 
Stephan, 1976; Covington & Omelich, Note 3) 
and thereby obscure their causal role as in- 
formation-processing determinants of subse- 
quent achievement behavior. We do not deny 
the importance of rational considerations in 
achievement behavior, nor do we argue that 
self-perceptions of ability and effort (“power” 
and “try” in Heider’s terms) are unimportant. 
We only suggest that such factors presently 
represented in the cognitive model as post- 
dictive explanations of success and failure 
should be relocated as exogenous, antecedent 
determinants of achievement behavior where 
they are less likely to be confounded by self- 
serving tendencies. 

The present data strengthen the case for 
differential self-perceptions of ability as the 
primary ingredient in achievement motivation 
(Kukla, 1972a, 1972b; Moulton, 1974; 
Nicholls, 1976). First, there is a substantial 
correlation between the Brookover Self-con- 
cept of Ability Scale (Brookover, LePere, 
Hanachek, Thomas, & Erickson, 1965) and 
the present index of resultant achievement 
motivation (r = .44, p < .01). Second, inspection
of Figure 2 indicates that despite the
likely presence of self-serving bias, ability
nonetheless emerges as the only attribution 
that links the entire model from nAch to retest 
performance (via shame). This finding 
strengthens the conceptual basis for recent 
work on learned helplessness that focuses on 
the ability—performance pathway (Abram- 
son, Seligman, & Teasdale, 1978; Dweck, 
1977; Covington & Omelich, Note 7). 

The central importance of perceived ability 
to all achievement behavior, including affect 
and self-confidence, is underscored in a re- 
cently proposed self-worth theory of achieve- 
ment motivation (Beery, 1975; Covington & 
Beery, 1976) holding that students attempt 
to maintain and promote self-ascriptions to 
high ability owing to society’s tendency to 
equate worth with the ability to achieve com- 
petitively. In this self-serving process, attribu- 
tions may not always reflect rational con- 
siderations. For example, Covington and 
Omelich (Note 3) found that males with low 
self-concept of ability devalued the ability
of others compared to their own under iden- 
tical circumstances of failure, whereas females 
with low self-concept denigrated their abil- 
ities beyond what was rationally indicated by 
the information at hand. The presence of
such bias can only act to obscure, not clarify,
the predicted relationships between postdictive
attributions and shame, expectancy, and
outcome. The study by Weiner and Sierad
(1975) illustrates how the interjection of defensive
attributions (excuses) can completely
alter expected performance patterns. A drug
(placebo) alleged to interfere with hand-eye
coordination was administered to college subjects
of high and low nAch prior to a learning
task. Armed with an explanation for potentially
poor performance other than inability,
low nAch subjects were expected to perform
better than usual, while high nAch subjects
were expected to perform poorer than usual
owing to the presumed interference of the
drug with ability. Both predictions were borne
out. The authors correctly conclude that the
experiment demonstrates the causal influence
of cognitions on performance. Yet it illustrates
something more. The presence of self-serving
excuses, a factor overlooked by both traditional
and cognitive theories, can act to confound
the usual attributional linkages.

This reconsideration of the role of attribu- 
tions underscores the need, in Weiner’s words, 
“. . . to specify the conditions under which 
individuals avoid feedback and information as 
a self-protecting, coping strategy” (1977). 
Identifying the circumstances of failure that 
elicit defensiveness would appear a useful 
starting point, with special attention to 
Heider’s dictum (1958) that in order for self- 
serving biases to be effective, they must fit 
the constraints of reason. In this connection, 
Covington and Omelich (Note 3) found that 
individuals aggrandize their own ability at the 
expense of others only when such self-protec- 
tive judgments are plausible, either when 
their failure is accompanied by low effort or 
when excuses are available to explain why
effort did not pay off. 

A reinterpretation of the present data from 
a self-worth/ability perspective, corroborated 
by evidence from related sources, provides a 
glimpse of the complex interaction between 
affective, social, and cognitive factors in the 
achievement context. Consider first the dynamics
of low nAch students as reflected in
Figure 2. Owing to perennial self-doubts
about their ability, failure elicits cognitions of
incompetency that, in turn, cause shame (low
nAch→low ability→shame), a dynamic confirmed
independently by laboratory studies
(Covington & Omelich, 1979; Sohn, 1977).
Shame is further intensified directly through
nAch dispositions (low nAch→shame), the
linkage originally postulated by Atkinson
(1957). This suggests that negative reactions
to failure have long been conditioned by a
history of failure, irrespective of any implication
of low ability. A third source of shame
depends on effort level (high effort→shame):
effort intensifies shame because high effort/
failure is evidence of low ability (Kun &
Weiner, 1973) and inferences of inability
evoke shame (Covington & Omelich, 1979).
While this source of distress may be mitigated 
temporarily by not studying, as in the case 
of procrastination (Birney, Burdick, & Teevan,
1969), such tactics are maladaptive in
the long run. Students who are perceived by
teachers as having studied little are rewarded
less in success and punished more in failure
than those students who do try (Rest et al.,
1973; Weiner, 1972, 1974). Moreover, according
to the present data configuration, low
effort may also act to depress retest performance
in two interlocking ways. First, low
effort contributes to reduced shame (Covington
& Omelich, 1979), which in turn tends to
decrease performance, presumably to the extent
that chagrin might otherwise mobilize
effort (Weiner, 1974; Weiner et al., 1971).
Second, low effort acts indirectly to suppress
performance through lowered expectations of
future success (low effort→low expectancy→low
performance). Unfortunately, the low
nAch student who works hard is unlikely to
fare much better even with the teacher approval
that comes from trying. Studying
typically occurs in the face of poor prospects
for success (low nAch→low expectations),
and the likelihood of failure despite high effort
simply intensifies the threat to self-esteem as
further evidence of low ability (Covington &
Beery, 1976). Little wonder that the study
habits of low nAch persons have been characterized
as conflicted, tense, and ineffectual
(Wine, Note 5).

Figure 2 explicates an entirely different dy- 
namic in the case of high nAch individuals. 
These students tend to experience less distress 
in failure both because, as predicted by At- 
kinson (1957), they are success-oriented 
(high nAch→low shame), and because they
are secure in their sense of competency (high
nAch→high ability→low shame) (Moulton,
1974; Nicholls, 1976). Although the resulting
lower level of shame tends to dampen subsequent
performance (high nAch→low shame→low
performance), the relatively small magnitude
of this negative suppression effect is
more than offset by the tendency to maintain
a high expectation of success that manifests
itself in increased retest performance (high
nAch→high expectancy→high performance).
From this it can be seen why an occasional
failure for the high nAch student gives little
cause for self-doubting of ability or for lowered
expectations, nor is it likely to precipitate
a vicious cycle of self-defeat so common
among low nAch pupils.


Conclusion


Progress toward a fuller understanding of
achievement behavior will be made only as 
long as theories are tested in the naturally 
occurring contexts of achievement. We con- 
clude from the evidence of the present field 
study that the achievement process is not 
solely, or even primarily, a manifestation of 
cognitive attributions. Rather it is most productively
viewed as the operation of personality
and socially conditioned dispositions
emerging from the individual’s efforts to pro- 
tect and enhance a sense of competency. To 
be sure, these efforts are moderated by ra- 
tional information-processing principles of the 
kind originally proposed by Heider and elab- 
orated by Weiner. Indeed, it is this dynamic 
interaction of cognition and affect that pre- 
sents the greatest challenge to the researcher 
and theoretician. 


Uncertainty-Reducing Properties of Achievement Tasks

Two experiments were conducted to test the idea that choice among achievement
tasks depends on the extent to which performance outcomes are expected
to reduce uncertainty about one’s ability. Two determinants of expected uncertainty
reduction were investigated, the diagnosticity of the task at each zone
on the ability scale and the person’s uncertainty regarding the ability levels in
each zone. As in previous research, the preference for a task increased with its
overall diagnosticity, that is, its diagnosticity over the entire ability scale. Furthermore,
among tasks of equal overall diagnosticity, subjects preferred tasks
that were the most diagnostic at the zone where most of their uncertainty was
concentrated, irrespective of the location of this zone on the ability scale. It is
concluded that subjects selected tasks that would maximize expected reduction
of uncertainty about their standing on the ability scale.


A theme running through a considerable
amount of social psychological theory and
research is the idea that people characteristically
strive to gain self-knowledge. Students of
social comparison and attribution processes
have contended that a wide range of social
phenomena, from emotional reactions and
attitude change to affiliation and interpersonal
communication, are affected by people’s underlying
need to assess their own attributes
accurately (Festinger, 1954; Nisbett & Valins,
1972; Schachter, 1959; Suls & Miller, 1977).
This article is concerned with the manifestations
of this need in the domain of achievement
behavior. More specifically, it focuses
on the effects of (a) initial knowledge about
one’s own ability and (b) ability-relevant information
contained in performance outcomes
on choice among achievement tasks.

In the last few years, it has been argued
that people choose among achievement tasks
according to the amount of ability-relevant
information they expect to gain from performance
outcomes (Trope & Brickman,
1975; Weiner, Frieze, Kukla, Reed, Rest, &
Rosenbaum, 1972). In their paper on the informational
properties of achievement tasks,
Trope and Brickman proposed that the well-documented
preference for intermediate-difficulty
tasks reflects a strategy of maximizing
information about one’s ability. This self-assessment
interpretation was offered as an
alternative to Atkinson’s (1957) widely 
known theory of achievement motivation, 
which postulates that a choice of tasks where 
probability of success equals .50 maximizes 
positive achievement affect. Specifically, 
Trope and Brickman argued that the amount 
of ability-relevant information contained in 
performance outcomes depends on the diag- 
nosticity of the task, that is, the extent to 
which performance outcomes are expected to 
vary as a function of ability level. The greater 
the differences in performance among people 
of different ability levels, the greater the 
diagnosticity of the task. Ordinarily, the dif- 
ferences between ability levels tend to be 
greater at intermediate difficulty tasks. How- 
ever, Trope and Brickman showed that it is 
possible to construct tasks in which difficulty 
and diagnosticity vary independently. When 


0022-3514/79/3709-1505$00.75 


presented with such tasks, Trope and Brick- 
man’s subjects preferred high-diagnosticity to 
low-diagnosticity tasks, irrespective of 
whether the tasks were of intermediate dif- 
ficulty. The effect of diagnosticity was repli- 
cated and extended in subsequent studies 
(Trope, 1975; Zuckerman, Brown, Fischler, 
Fox, Lathin, & Minasian, in press; Buckert, 
Meyer, & Schmalt, Note 1). 
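
Atkinson's prediction that tasks with a success probability of .50 are the most attractive can be illustrated with a short sketch. It assumes the usual form of his model, in which the incentive value of success is taken to be 1 − P_s; that identity comes from Atkinson's theory rather than from the text above, and the function name is hypothetical.

```python
def approach_tendency(motive, p_success):
    """Atkinson-style tendency: motive x probability x incentive, with incentive = 1 - p."""
    incentive = 1.0 - p_success
    return motive * p_success * incentive

probabilities = [i / 10 for i in range(1, 10)]   # 0.1 ... 0.9
best = max(probabilities, key=lambda p: approach_tendency(1.0, p))
print(best)  # 0.5
```

In this account the peak at .50 follows from difficulty alone, which is why tasks that vary diagnosticity independently of difficulty can separate it from the self-assessment interpretation.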

In addition to task characteristics (such as 
diagnosticity), the amount of prior knowledge 
about one’s ability should also affect informa- 
tion gain. A person who has little prior uncer- 
tainty about his or her ability level cannot 
expect to gain much new information from 
performing tasks low or high in diagnosticity. 
As uncertainty about one’s ability increases,
the amount of information the individual can
gain should also increase. The notion that 
uncertainty instigates information seeking is 
not new to psychology. Extensive research has 
demonstrated that in various domains, prior 
to choosing from among several alternatives, 
the amount of information seeking is directly 
proportional to the amount of uncertainty 
regarding the “correctness” of the alternatives 
(Lanzetta, 1963; Lanzetta & Driscoll, 1968). 
Lanzetta’s research was guided by Berlyne’s 
theory of curiosity (1960), which assumes 
that uncertainty has drive properties and 
postulates information seeking as the primary 
instrumental means for reducing the drive.
Now, if achievement behavior is an instance
of information seeking in general, we would
expect that the more similar the subjective
prior probabilities of possessing the various
ability levels (i.e., the probabilities held by
the person prior to task performance), the
greater the uncertainty, and therefore the
greater the tendency to perform a relevant
task.
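
One way to make "the more similar the prior probabilities, the greater the uncertainty" concrete is Shannon entropy. The sketch below is an illustration only: the text does not commit to a particular uncertainty measure, and the two probability distributions are hypothetical.

```python
import math

def entropy_bits(probs):
    """Shannon entropy (in bits) of a discrete distribution over ability levels."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

near_certain = [0.90, 0.05, 0.05]   # hypothetical: one ability level strongly favored
uniform = [1 / 3, 1 / 3, 1 / 3]     # hypothetical: all levels equally likely

print(entropy_bits(near_certain) < entropy_bits(uniform))  # True
```

Under this measure, a person whose prior probabilities are nearly equal has the most to learn from a relevant task.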

A recent study by Trope (Note 2) tested 
these propositions. Subjects first performed
two tasks, each pertaining to a different abil- 
ity. In one of the experimental conditions, the 
results were predetermined so as to induce 
very little uncertainty about their standing on 
one ability (low-uncertainty ability) and a 
great deal of uncertainty about their standing 
on the other ability (high-uncertainty abil- 
ity). Subjects could then decide whether to 
continue working on tasks that were relevant 
to the low-uncertainty ability or on tasks that
were relevant to the high-uncertainty ability.
As expected, subjects preferred the latter.
These tasks were equally attractive in the
conditions in which prior results induced the
same amount of uncertainty, low or high,
about the two abilities.

In the preceding discussion of uncertainty
and diagnosticity, the ability dimension was
considered as a whole. However, uncertainty
and diagnosticity may vary from one zone to
another on the ability dimension. Thus, the
prior probabilities of the ability levels within
one zone may be higher and more nearly equal
than the prior probabilities of the ability
levels within another zone. To illustrate, suppose
that two zones are considered, the lower
half and the upper half of the ability dimension,
and imagine a person who is almost certain
that he or she is not in the lower zone of
the ability dimension but is highly uncertain
about his or her location within the upper
zone. In probabilistic terms, compared to the
ability levels within the lower zone, the ability
levels within the upper zone have higher and more
equal prior probabilities. The opposite
would obtain for an individual who is quite
sure of not belonging to the upper zone but
is unsure as to his or her standing within the
lower zone of the ability dimension. That is,
the prior probabilities within the lower zone
are high and equally distributed among the
ability levels. Clearly, these two persons have
the same amount of overall uncertainty, but
their uncertainty is distributed differently between
the two zones: Most of one person’s
uncertainty is concentrated within the lower
zone, whereas most of the other’s uncertainty
is concentrated within the upper zone.

The amount of information these hypothetical
persons can gain from task performance
depends on the accuracy with which the task
can diagnose (or discriminate between) ability
levels within each of the zones. Tasks of
equal overall diagnosticity (i.e., equal
amount of variation in performance across
ability levels) may differ in the way this
diagnosticity is distributed among the zones.
The task may be (a) more diagnostic in the
lower zone, producing greater differences in
performance among ability levels within the
lower zone than among ability levels within
the upper zone (e.g., a task in which performance
is a monotone negatively accelerated
function of ability); (b) it may be equally
diagnostic in the two zones (e.g., a task in
which performance is a linear function of
ability); or (c) it may be more diagnostic in
the upper zone (e.g., a task in which performance
is a monotone positively accelerated
function of ability).

One can maximize information gain or un- 
certainty reduction by choosing a task that 
is most diagnostic at the zone where one’s 
uncertainty is concentrated. That is, informa- 
tion gain depends on the similarity between 
the distributions of uncertainty and diagnos- 
ticity across the various zones of the ability 
dimension. Hence, the self-assessment notion
would predict that a person whose uncertainty
is concentrated in the lower zone will
be more interested in a task that is more diag- 
nostic in the lower zone, less interested in a 
task that is equally diagnostic in the two 
zones, and least interested in a task that is 
more diagnostic in the upper zone. A person
whose uncertainty is concentrated in the up- 
per zone will display the reverse pattern of 
preferences. 
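
The predicted ordering can be illustrated with a small calculation. The sketch borrows the descending-uncertainty feedback proportions and the mean percentages correct that Experiment 1 reports in Table 1. Treating those figures as prior and success probabilities, scoring a single pass/fail item, and using Shannon mutual information as the measure of expected uncertainty reduction are all assumptions of this sketch, not computations taken from the article.

```python
import math

def binary_entropy(p):
    """Entropy (bits) of a pass/fail outcome with success probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def expected_information_gain(prior, p_correct):
    """Mutual information between ability level and a single pass/fail item."""
    p_overall = sum(w * p for w, p in zip(prior, p_correct))
    conditional = sum(w * binary_entropy(p) for w, p in zip(prior, p_correct))
    return binary_entropy(p_overall) - conditional

# Assumed prior over (low, intermediate, high): the descending-uncertainty feedback.
prior = [0.45, 0.45, 0.10]

# Proportions correct by ability level, descending-uncertainty columns of Table 1
# (the high-ability ascending value, 84.2, follows from the 40-point low-to-high spread).
tasks = {
    "descending": [0.298, 0.658, 0.698],
    "constant": [0.37, 0.57, 0.77],
    "ascending": [0.442, 0.482, 0.842],
}

gains = {name: expected_information_gain(prior, p) for name, p in tasks.items()}
ranking = sorted(gains, key=gains.get, reverse=True)
print(ranking)  # ['descending', 'constant', 'ascending']
```

With uncertainty concentrated in the lower zone, the task that is most diagnostic in the lower zone yields the largest expected gain, even though all three tasks have the same overall success probability of about .50 under this prior.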

The foregoing analysis was tested in two
experiments. Experiment 1 manipulated two
variables. The first variable was the subjects’
distribution of uncertainty, that is, subjects’
uncertainty was concentrated either in the
lower zone or upper zone. The second variable
was the distribution of task diagnosticity,
that is, tasks were more diagnostic within the
lower or upper zone or were equally diagnostic
within the lower and upper zones. The
tasks were equated on overall diagnosticity and
difficulty. The second experiment manipulated
these last two variables as well as the distribution
of diagnosticity. In this experiment, the
distribution of uncertainty was [...]
subjects’ own beliefs about their ability rather
than being experimentally manipulated.

Experiment 1 
Method 

Subjects 


The subjects were 60 male candidates for an officers
training school in the [...].
They participated in the experiment during their stay
in an officer selection unit located in an army base.
All of the subjects were in regular army service, had 
completed high school education, and were 18-21 
years old. 


Procedure and Manipulations 


Subjects participated individually in two experimental
sessions. In the first, they performed a cognitive
task and received predetermined feedback that
was designed to manipulate the uncertainty distribution
over the relevant ability dimension. In the second
session, subjects were presented with three additional
tasks varying in diagnosticity distribution
over the ability dimension. Subjects then indicated
their preferences among these tasks and answered
several questions that served as manipulation checks.

At the beginning of the first session, subjects were
told that they were taking part in a survey on conceptual
ability carried out by the Department of
Psychology at the Hebrew University. The experimenter
(a civilian) emphasized that subjects’ results
were anonymous and would not be given to any army
authority. Subjects were then informed that their
first task would be to perform a test of conceptual
ability called “common features test.” The test consisted
of 20 items in each of which subjects had to
name a feature or concept that most precisely fitted
a group of four words (e.g., fog, snow storm, electricity
blackout, military coup).

After completing the test, subjects were handed a
page containing information about conceptual ability
and the scoring of the common features test.
Subjects read:


Conceptual ability is manifested in the ability to
discover and generate conceptions and rules underlying
sets of items of information. Extensive research
has shown that young adults with high
school education can be classified into three levels
of conceptual ability: a relatively low level, a relatively
high level, and an intermediate level of conceptual
ability which lies halfway between the low
and the high levels.


To minimize differences among subjects in prior be- 
liefs about their ability, the statement further noted, 
“Conceptual ability is not measured adequately by 
standard intelligence tests. However, previous research 
has shown that approximately one third of the pop- 
ulation of young adults with high school education 
has each of the three levels of conceptual ability.” 
Subjects were told that the tester examined each an- 
swer and determined according to several well-estab- 
lished criteria whether it reflected a low level, an 
intermediate level, or a high level of conceptual abil- 
ity. 

Manipulation of uncertainty distribution. While
the subject was reading this information, the experi- 
menter pretended to score his test, ostensibly using 
an impressive table of norms and categories. The 
experimenter presented the results on a results sheet. 

YAACOV TROPE


Table 1
Experiment 1: Reported Mean Percentages of Correct Items as a Function of Ability Level,
Uncertainty Distribution, and Diagnosticity Distribution

                              Uncertainty distribution
                Descending                          Ascending
                Diagnosticity distribution          Diagnosticity distribution
Ability level   Descending  Constant  Ascending     Descending  Constant  Ascending
Low             29.8        37        44.2          15.8        23        30.2
Intermediate    65.8        57        48.2          51.8        43        34.2
High            69.8        77        84.2          55.8        63        70.2


In one condition, the results sheet indicated that out 
of the 20 answers, 9 (45%) reflected low ability, 9 
(45%) intermediate ability, and 2 (10%) high ability. 
These results were designed to induce high probabil- 
ities of the low and the intermediate levels and a low 
probability of the high level. In the other condition, 
the results sheet indicated that 2 answers (10%) 
reflected low ability, 9 (45%) intermediate ability, 
and 9 (45%) high ability. These results were designed 
to induce a low probability of the low level and high 
probabilities of the intermediate and high levels. The 
former condition is referred to as the descending
uncertainty condition, meaning that most of the
uncertainty was concentrated in the lower half of the
ability scale; the latter condition is referred to as the
ascending uncertainty condition, meaning that most of
the uncertainty was concentrated in the upper half of
the ability scale. Thirty subjects were randomly
assigned to each condition.

After receiving the results, subjects were escorted 
to an adjacent room where another experimenter 
conducted the second session. This experimenter told 
subjects that he would like them to work on a test 
of conceptual ability called “classification test” and 
that they would receive their results. Subjects were
told that each item in the test presented a list of
objects or events that had to be assigned into four 


the descending and ascending uncertainty conditions. 


All the intervals were 30% wide and symmetric 
around the means. 


The following constraints were employed in deriv- 
ing these means. In order to hold overall diagnosticity 
constant, the difference between the means of the 
lowest and highest ability groups was set at 40% in 
all of the subtests. Another set of constraints was 
designed to manipulate diagnosticity distribution. In 
subtests that were more diagnostic at the lower half 
of the ability scale, referred to as descending diag- 
nosticity subtests, the difference between the low and 
intermediate levels was set nine times as great as the 
difference between the intermediate and high levels. 
In the subtests that were more diagnostic at the 
upper half of the ability scale—ascending diagnos- 
ticity subtests—the magnitude of these two differ- 
ences was reversed. In subtests that were equally 
diagnostic at the lower and upper halves of the 
scale—constant diagnosticity subtests—these two dif- 
ferences were of the same magnitude. The third con- 
straint was on expected performance at the subtests 
and was designed to hold perceived difficulty con- 
stant. Expected performance on a test was defined as 
the overall mean of the performances of the three 
ability levels, each weighted by the probability of 
having the corresponding ability level. Obviously, 
these probabilities should be different in the two 
uncertainty distribution conditions. In fact, pretesting
indicated that the probabilities assessed by subjects
were quite close to the proportions of answers
in the common features test that purportedly
reflected each ability level. Hence, the prior
probabilities of the low, intermediate, and high
ability levels were assumed to be .45, .45, and .10,
respectively, in the descending uncertainty condition
and .10, .45, and .45, respectively, in the ascending
uncertainty condition. Using these probabilities,
expected performance at the three subtests was set at
50% in the two uncertainty distribution conditions.
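
The three constraints above pin down the three means of each subtest, and the arithmetic can be checked numerically. The following is a minimal sketch rather than the author's own procedure: the function name `subtest_means`, its parameters, and the constant names are hypothetical, and the `ratio` argument encodes the descending (9), constant (1), and ascending (1/9) diagnosticity distributions described in the text.

```python
def subtest_means(priors, ratio, spread=40.0, expected=50.0):
    """Return (low, intermediate, high) mean percentages correct.

    priors   -- assumed probabilities of the low, intermediate, and
                high ability levels (taken to sum to 1)
    ratio    -- (intermediate - low) gap divided by the
                (high - intermediate) gap
    spread   -- high minus low mean, i.e. overall diagnosticity
    expected -- prior-weighted mean performance (perceived difficulty)
    """
    p_low, p_mid, p_high = priors
    gap_lower = spread * ratio / (ratio + 1.0)
    # expected = low + p_mid * gap_lower + p_high * spread, solved for low
    low = expected - p_mid * gap_lower - p_high * spread
    return (low, low + gap_lower, low + spread)


# Prior probabilities assumed in the two uncertainty conditions.
DESCENDING_UNCERTAINTY = (0.45, 0.45, 0.10)
ASCENDING_UNCERTAINTY = (0.10, 0.45, 0.45)
RATIOS = {"descending": 9.0, "constant": 1.0, "ascending": 1.0 / 9.0}

if __name__ == "__main__":
    conditions = (("descending uncertainty", DESCENDING_UNCERTAINTY),
                  ("ascending uncertainty", ASCENDING_UNCERTAINTY))
    for label, priors in conditions:
        for name, ratio in RATIOS.items():
            means = subtest_means(priors, ratio)
            print(label, name, [round(m, 1) for m in means])
```

Under these assumptions the sketch gives 29.8, 65.8, and 69.8 for the descending diagnosticity subtest in the descending uncertainty condition, in line with the corresponding entries of Table 1.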

The order of presentation of the diagrams for the 
three subtests was varied across subjects according to 
a 3 × 3 Latin square design.

Measures. Subjects were informed that they were 
to choose a total of 28 items to work on and that 
they could decide how many items would come from 
each subtest. After indicating their selection of items, 
subjects were asked to fill out an evaluation
questionnaire while the experimenter was ostensibly
preparing the test material. The first set of questions in
this questionnaire pertained to the three subtests of 
the classification test. Subjects rated five questions 
on 7-point scales: (a) their willingness to work on 
each subtest, (b) their interest in finding out how 
they would perform at each subtest, (c) the accuracy 
with which each subtest could discriminate between 
the low and the intermediate levels of ability, (d) 
the accuracy with which each subtest could discrim- 
inate between the intermediate and the high levels of 
ability, and (e) the difficulty of each subtest. Ques- 
tions a and b, together with the measure of item 
selection, served as measures of task preference. Ques- 
tions c and d served as checks on the manipulation 
of diagnosticity distribution. Question e was designed 
to check the assumption that the subtests were 
equated on perceived difficulty. As a check on the 
manipulation of uncertainty distribution, subjects 
were asked to rate on three 11-point scales (ranging 
from 0 to 100) the chances of their having each of 
the three ability levels. After subjects had completed 
the questionnaire, the experiment was terminated. The 
experimenter thoroughly debriefed the subjects, 
checking for suspicion and explaining the actual pur- 
pose of the study and the reason for the deceptions. 


Results and Discussion 


Manipulation Checks 


The manipulation of uncertainty distribution
(i.e., the feedback from the common features
test) had a strong effect on subjects’
assessments of the probabilities of having each
ability level. In the descending uncertainty
condition, the means of the probabilities of
the low, intermediate, and high levels were
.36, .43, and .21, respectively, and in the
ascending uncertainty condition, they were
.10, .43, and .47. An analysis of variance
(Uncertainty Distribution × Ability Level)
yielded a highly significant interaction, F(2,
116) = 65.07, p < .001, indicating that in the
descending uncertainty condition, the difference
between the low and intermediate levels
was small and insignificant, whereas the difference
between the intermediate and high levels was large
and significant (p < [illegible]); in the
ascending uncertainty condition, the former
difference was large and significant (p <
.001), and the latter difference was not.
Hence, as intended, in the descending uncertainty
condition most of the uncertainty concerned
the low and intermediate levels (i.e.,
the lower half of the ability scale), whereas
in the ascending uncertainty condition most
of the uncertainty concerned the intermediate
and high levels (i.e., the upper half of the 
ability scale). It should be noted, however, 
that although the two conditions differed in 
the expected direction, the uncertainty dis- 
tribution in the ascending uncertainty condi- 
tion was more in line with the feedback from 
the common features test than it was in the 
descending uncertainty condition. In the latter 
condition, therefore, the uncertainty was more 
evenly distributed over the ability scale. 

The first two rows in Table 2 indicate that
the manipulation of diagnosticity distribution
was successful. As expected, descending
diagnosticity tests were perceived as the most
accurate and the ascending diagnosticity tests
as the least accurate in discriminating between
low and intermediate ability. On accuracy
in discriminating between intermediate
and high ability, the ordering of the subtests
was reversed. Analyses of variance (Uncertainty
Distribution × Diagnosticity Distribution)
supported these observations, yielding
highly significant effects of diagnosticity
distribution on ratings regarding accuracy of
discrimination between low and intermediate
ability, F(2, 116) = 68.27, p < .001, and
between intermediate and high ability, F(2,
116) = 52.08, p < .001. No other source of
variation was significant in these analyses.
Finally, as intended, the perceived difficulty
of the subtests (Row 3, Table 2) did not
significantly vary as a function of the
experimental variables. It seems, then, that the fact
that expected performance was kept constant¹
made the subtests appear equally difficult.


Choice and Preference Among Subtests 


The number of items selected from each
subtest served as the main dependent variable.
The pertinent means in Figure 1 indicate
that the manipulation of uncertainty distribution
produced two opposite combinations
of items from the three subtests. As predicted,
in the descending uncertainty condition
the greatest number of items came from
the descending diagnosticity subtest and the
smallest from the ascending diagnosticity subtest,
whereas in the ascending uncertainty
condition the results were reversed. This
dependency of choice among subtests on
uncertainty distribution was reflected in a
strong Uncertainty Distribution × Diagnosticity
Distribution interaction effect, F(2,
116) = 13.56, p < .001, which was the only
significant source of variation.


1 As noted above, the subjective probabilities of the
ability levels in the descending uncertainty condition
somewhat deviated from the reported proportions of
items that purportedly indicated the various ability
levels. Even if these probabilities are used to calculate
expected performance, however, the differences
among the subtests remain extremely small.


Table 2
Experiment 1: Mean Ratings of Tasks as a Function of Uncertainty Distribution and
Diagnosticity Distribution

                                      Descending uncertainty               Ascending uncertainty
                                    Diagnosticity distribution           Diagnosticity distribution
Measure                           Descending  Constant  Ascending      Descending  Constant  Ascending
Accuracy in discriminating
  between low and
  intermediate ability               5.87       4.50      2.50            5.57       4.20      ?.50
Accuracy in discriminating
  between intermediate
  and high ability                   2.90       4.80      5.70            2.90       4.20      6.27
Difficulty                           3.45       3.38      3.99            3.37       3.07      3.80
Willingness to work on
  subtest                            5.23       4.50      3.53            3.63       3.93      5.?7
Interest in finding out results      5.93       5.33      ?.97            4.37       4.80      6.27


The same picture emerges from the ratings
of desire to work on the subtests (Table 2,
row 4) and the ratings of interest in finding
out the results of performance at the subtests²
(Table 2, row 5). Analyses of variance of
these ratings yielded highly significant
Uncertainty Distribution × Diagnosticity
Distribution interaction effects (p < .001), with
no other effect being significant. It is
particularly interesting that the main effects of
uncertainty distribution were insignificant
(F < 1), meaning that subjects in the two
uncertainty distribution conditions had the
same overall interest in working on the
subtests and in finding out how they would
perform.

Finally, it seems that the slopes relating 
item selection and the preference ratings to 
diagnosticity distribution are steeper in the 


ascending uncertainty condition than in the 
descending uncertainty condition. The F ratios
corresponding to these comparisons were
insignificant, but it is worth noting that this 
difference in slopes may be due to the fact, 
indicated earlier, that in the descending un- 
certainty condition subjects’ uncertainty was 
more evenly distributed between the two 
halves of the ability scale. 

In conclusion, subjects’ choice behavior sug- 
gests that they systematically employed a 
strategy of maximizing information about 
their ability. The descending uncertainty and 
the ascending uncertainty groups did not dif- 
fer in this respect. First, the two groups ex- 
pressed the same overall amount of interest 
in the ability-relevant tasks. Second, both 
groups were interested in working on a task 
to the extent that its diagnosticity distribu- 
tion fitted their uncertainty distribution. 
Thus, subjects were most interested in a task
that was highly diagnostic at the ability zone 
where their uncertainty was concentrated, 
irrespective of the location of this zone on 
the ability dimension. 


2 The correlations between the three dependent
variables (the two ratings and item selection) within
conditions ranged from .42 to .79, with a mean of
[illegible].


Experiment 2


This experiment manipulated three vari- 
ables: overall diagnosticity, diagnosticity dis- 
tribution, and difficulty. The first and last of 
these variables were held constant in Experi- 
ment 1. Distribution of uncertainty, which 
was experimentally manipulated in Experi- 
ment 1, was inferred in this experiment from 
subjects’ assessments of their own ability 
level. It was assumed that the higher the per- 
ceived ability, the higher the zones on the 
ability dimension within which the uncer- 
tainty was concentrated. 


Method 
Subjects 


The subjects were 54 male soldiers who belonged 
to the same population from which the subjects of 
Experiment 1 were drawn. 


Procedure and Manipulations 


Figure 1. Experiment 1: Mean number of items
selected from each test as a function of uncertainty
distribution (descending uncertainty vs. ascending
uncertainty) and diagnosticity distribution (descending
vs. constant vs. ascending). (Vertical axis: number of
items; horizontal axis: diagnosticity distribution;
separate lines for the descending and ascending
uncertainty conditions.)


Subjects participated in the experiments in groups
ranging in size from three to five. The experimenter
stated that the purpose of the study was to investigate
opinions and feelings about ability tests. As in
Experiment 1, subjects were assured that their
answers were anonymous. The experimenter then
handed the subjects a booklet titled Survey of Opin- 
ions about Tests of Mental Concentration. In the 
booklet subjects read that they would receive infor- 
mation about several tests of mental concentration 
in all of which the testee had to solve a series of 
items within a certain time limit. It was indicated 
that the tests varied in the type of item they included 
—numerical, verbal, or figural—and in emphasis on 
perception, memory, or reasoning. Subjects were then 
informed that extensive research had shown that 
young adults with high school education could be 
classified into seven levels of mental concentration, 
with approximately 14% of this population belonging
to each of the levels.

At this point, subjects were asked to assess their 
own level of mental concentration. Subjects were
given a seven-interval scale with labels from Level 1 
(lowest ability) to Level 7 (highest ability) and 
were asked to indicate their level of ability. 

The booklet then indicated that previous research 
had obtained a large amount of data on the results 
that are typically achieved by people of different 
levels of mental concentration. An example of such
data for one of the tests was then presented. Each 
ability level was represented by a bar whose height 
indicated the average percentage of correctly solved 
items. The bars, from left to right, indicated the 
lowest to highest ability levels, and were labeled be- 
neath from 1 to 7, respectively. In this example, the 
average of Level 1 was 35%, which increased linearly 
with ability level up to 65% in Level 7. Subjects 
were told that in this test as well as in the other 
tests, the performance of 90% of each ability level 
lay between 4% below and 4% above the
average.

Manipulation of test characteristics. After complet- 
ing the example, each subject considered six tests 
varying in overall diagnosticity and distribution of 
diagnosticity. Each test was represented by a bar 
graph, as in the example. In tests of high overall 
diagnosticity the difference in performance between 
Level 1 and Level 7 (51%) was three times as great 
as the corresponding difference in tests of low overall 
diagnosticity (17%). The three tests within each level 
of overall diagnosticity varied in distribution of 
diagnosticity. Thus, the difference between Levels 1 
and 7 (or overall diagnosticity) was divided such that 
the difference between Levels 1 and 4 (or diagnos- 
ticity at the lower half of the ability scale) was 
either greater, smaller, or equal to the difference be- 
tween Levels 4 and 7 (or diagnosticity at the upper 
half of the ability scale). In the descending diagnos- 
ticity test the difference between Levels 1 and 4 was 
7.5 times greater than the difference between Levels
4 and 7. In the ascending diagnosticity test the
magnitude of these differences was reversed. In the
constant diagnosticity test these differences were of the
same magnitude. Within each half of the ability scale,
performance was linearly related to ability level. 
Difficulty was manipulated by varying expected
performance on the tests, that is, the overall mean of
performance across the seven ability levels. There
were three levels of expected performance: 40%,
50%, and 60%. The percentages presented to subjects
in the bar graphs (see Table 3) were derived by
using these constraints.


Table 3
Experiment 2: Reported Mean Percentages of Correct Items as a Function of Ability Level,
Expected Performance, Overall Diagnosticity, and Diagnosticity Distribution

                             Low overall diagnosticity            High overall diagnosticity
Expected                     Diagnosticity distribution           Diagnosticity distribution
performance  Ability
(%)          level        Descending  Constant  Ascending      Descending  Constant  Ascending
40             1            28.72      31.50     34.30            6.14      14.50     22.86
               2            33.72      34.33     34.97           21.14      23.00     24.86
               3            38.72      37.16     35.64           36.14      31.50     26.86
               4            43.72      40.00     36.30           51.14      40.00     28.86
               5            44.38      42.38     41.30           53.14      48.50     43.86
               6            45.05      45.66     46.30           55.14      57.00     58.86
               7            45.72      48.50     51.30           57.14      65.00     73.86
50             1            38.72      41.50     44.30           16.14      24.50     32.86
               2            43.72      44.33     44.97           31.14      33.00     34.86
               3            48.72      47.16     45.64           46.14      41.50     36.86
               4            53.72      50.00     46.30           61.14      50.00     38.86
               5            54.38      52.83     51.30           63.14      58.50     53.86
               6            55.05      55.66     56.30           65.14      67.00     68.86
               7            55.72      58.50     61.30           67.14      75.50     83.86
60             1            48.72      51.50     54.30           26.14      34.50     42.86
               2            53.72      54.33     54.97           41.14      43.00     44.86
               3            58.72      57.16     55.64           56.14      51.50     46.86
               4            63.72      60.00     56.30           71.14      60.00     48.86
               5            64.38      62.83     61.30           73.14      68.50     63.86
               6            65.05      65.66     66.30           75.14      77.00     78.86
               7            65.72      68.50     71.30           77.14      85.50     93.86

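
The entries of Table 3 can be reproduced from the constraints described for Experiment 2. What follows is a minimal sketch, assuming that the seven ability levels are equally likely, so that expected performance is the unweighted mean across levels; the function name `level_means` and its parameters are hypothetical and are not taken from the article.

```python
def level_means(spread, ratio, expected):
    """Mean percentages correct for ability Levels 1 through 7.

    spread   -- Level 7 minus Level 1 (overall diagnosticity: 17 or 51)
    ratio    -- (Level 4 - Level 1) divided by (Level 7 - Level 4):
                7.5 descending, 1 constant, 1/7.5 ascending
    expected -- unweighted mean across the seven levels (40, 50, or 60)
    """
    gap_lower = spread * ratio / (ratio + 1.0)
    gap_upper = spread - gap_lower
    # Offsets from Level 1, linear within each half of the ability scale.
    offsets = [gap_lower * k / 3.0 for k in range(4)]
    offsets += [gap_lower + gap_upper * k / 3.0 for k in range(1, 4)]
    level_1 = expected - sum(offsets) / 7.0
    return [level_1 + offset for offset in offsets]


if __name__ == "__main__":
    # Low overall diagnosticity, descending distribution, 50% expected.
    print([round(m, 2) for m in level_means(17.0, 7.5, 50.0)])
```

With a spread of 17, a ratio of 7.5, and expected performance of 50, the sketch gives a Level 1 mean of about 38.71 and a Level 7 mean of about 55.71, within rounding of the corresponding Table 3 entries.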

Expected performance was manipulated between 
subjects. Twenty subjects were randomly assigned to 
each of the three expected performance conditions. 
Each group considered six tests (varying in overall 
diagnosticity and diagnosticity distribution) with 
order of presentation varying across subjects according
to a 6 × 6 Latin square design.

Measures. Subjects rated seven questions on 11- 
point scales: (a) their willingness to work on each 
test; (b) the attractiveness of each test; (c) their 
interest in finding out how they would perform at 
each test; (d) the accuracy with which each test 
could discriminate between Levels 1, 2, 3, and 4 of 
mental concentration; (e) the accuracy with which 
each test could discriminate between Levels 4, 5, 6, 
and 7 of mental concentration; (f) the accuracy with 
which each test could reveal their level of mental 
concentration; and (g) the difficulty of each test. 
Questions a to c served as measures of task preference.
Questions d to f served as checks on the manipulations
of overall diagnosticity and diagnosticity distribution,
and Question g served as a check on the
manipulation of difficulty.


Results and Discussion 


The results for the first two measures in
Table 4 clearly show that the manipulations
of overall diagnosticity and diagnosticity
distribution were successful. Analyses of variance
(Overall Diagnosticity × Diagnosticity
Distribution × Expected Performance) indicated
that subjects perceived tests of high overall
diagnosticity as more accurate in discriminating
both between Levels 1 to 4, F(1, 51) =
69.63, p < .001, and between Levels 4 to 7,
F(1, 51) = 152.73, p < .001. As intended,
descending diagnosticity tests were perceived
as the most accurate and the ascending
diagnosticity tests as the least accurate in
discriminating between Levels 1 to 4, F(2, 102) =
155.95, p < .001, whereas on accuracy in
discriminating between Levels 4 to 7, the ordering
of the tests was reversed, F(2, 102) =
112.07, p < .001. Furthermore, since the
manipulated differences among tests varying
in diagnosticity distribution were greater
when their overall diagnosticity was high than
when it was low (see Table 3), the two
variables were expected and found to have
interactive effects on perceived accuracy in
discriminating between Levels 1 to 4, F(2, 102)
= 18.17, p < .001, and between Levels 4 to 7,
F(2, 102) = 13.06, p < .001.

While expected performance did not have
any effect on the ratings of perceived
diagnosticity, it had a marginally significant effect on
perceived difficulty (see fourth measure,
Table 4), indicating that the higher the
expected performance at the test, the easier the
test, F(2, 51) = 2.87, p < .07. The only other
significant effect on perceived difficulty was a
Diagnosticity Distribution × Expected
Performance interaction, F = [illegible], p <
.05, which was unexpected and hard to
interpret.


Table 4
Experiment 2: Mean Ratings of Tests as a Function of Overall Diagnosticity, Diagnosticity
Distribution, and Expected Performance

                                         Low overall diagnosticity           High overall diagnosticity
                          % expected     Diagnosticity distribution          Diagnosticity distribution
Measure                   performance  Descending  Constant  Ascending     Descending  Constant  Ascending
Accuracy in discrim-          40          5.67       4.44      2.56           7.72       7.??      ?
  inating between             50          ?          ?         ?              ?          ?         ?
  Levels 1, 2, 3, and 4       60          5.33       4.61      2.06           8.28       7.78      2.50
                           Combined       5.39       4.41      1.96           8.15       7.72      2.83
Accuracy in discrim-          40          2.89       4.83      5.67           4.28       7.44      8.28
  inating between             50           .94       4.44      5.89           1.78       8.06      8.89
  Levels 4, 5, 6, and 7       60          2.33       4.50      5.50           3.50       7.89      8.67
                           Combined       2.06       4.59      5.69           3.19       7.80      8.61
Accuracy in determin-         40          3.83       5.06      6.00           4.28       7.11      8.00
  ing one's own               50          1.78       3.50      4.78           2.83       7.72      8.28
  ability level               60          ?          5.17      5.67           4.83       8.17      8.17
                           Combined       3.39       4.57      5.48           3.98       7.67      8.15
Difficulty                    40          4.67       4.11      5.50           4.78       4.28      4.50
                              50          2.78       3.50      3.11           4.06       5.22      3.83
                              60          2.89       2.78      3.11           2.39       3.33      3.72
                           Combined       3.44       3.46      3.96           3.74       4.28      4.02
Index of test preference      40          5.52       5.76      6.59           5.87       7.17      7.91
                              50          4.37       4.80      5.81           4.74       7.26      8.19
                              60          4.94       5.94      6.81           6.09       7.87      8.67
                           Combined       4.94       5.50      6.41           ?          7.43      ?


Prior to discussing the other measures,
subjects’ reports about their level of mental
concentration should be considered. Quite
surprisingly, the range of reported ability levels
was extremely narrow. No subjects reported 
an ability below Level 4; only 2 subjects re- 
ported Level 4, 25 reported Level 5, 21 re- 
ported Level 6, and only 6 reported Level 7. 
These data suggest that the uncertainty of 
most of the subjects was confined to the high 
ability levels. It follows that subjects should 
rate the accuracy with which a test can deter- 
mine their own ability according to the degree 
to which it can discriminate between the high 
levels, that is, the ascending diagnosticity tests 
should be rated highest and the descending 
diagnosticity tests lowest. The pertinent data 
in Table 4 (third measure) fully corroborated 
this predicted effect of diagnosticity distribution,
F(2, 102) = 49.26, p < .001. Furthermore,
as intended, ratings of the test’s accuracy
in determining one’s own ability increased
with overall diagnosticity, F(1, 51)
= 76.70, p < .001, and the effect of diagnosticity
distribution on ratings of tests of high
overall diagnosticity was stronger than its 
effect on ratings of tests of low overall diag- 
nosticity, F(2, 102) = 20.39, p < .001. 

Turning to subjects’ preferences among the 
tests, we should note that the three measures 
—willingness to work on the test, attractive- 
ness of the test, and interest in finding out the 
results—correlated highly across subjects for 
each of the six tests. The mean of the 18 cor- 
relations (3 intercorrelations for each of the 
six tests), using Fisher’s r-to-z transforma- 
tions, was .77. The three measures therefore 
were averaged (see Table 4, last measure). 
The self-assessment notion would predict that 
preferences among tests should be closely 
related to judgments about their accuracy in 
determining one’s own ability. The means of 
the index of test preference substantiated this 
prediction. First, tests of high overall diag- 
nosticity were preferred to tests of low overall 
diagnosticity, F(1, 51) = 40.55, p< .001. 
Second, there was a strong effect of diagnosticity
distribution, F(2, 102) = 26.99, p < .001,
indicating that tests discriminating
mainly between high ability levels (ascend- 
ing diagnosticity tests) were preferred to tests 
that discriminated equally between all levels 
(constant diagnosticity tests), which in turn 
were preferred to tests that discriminated 
mainly between low ability levels (descending 
diagnosticity tests). Third, this effect of diag- 
nosticity distribution was more pronounced 
when overall diagnosticity was high than when 
it was low, F(2, 102) = 7.44, p < .001. In 
none of these analyses did expected perform- 
ance (or difficulty) have a significant effect 
on ratings. Finally, the close relationship be- 
tween test preference and perceived accuracy 
in determining one’s ability was reflected in 
the correlations between the two measures 
across subjects for each of the tests. All these
correlations were highly significant, ranging
from .415 to .674, with a mean of .55 (using
Fisher’s r-to-z transformations).
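
Averaging correlations through Fisher's r-to-z transformation means converting each r to z with the inverse hyperbolic tangent, averaging the z values, and converting the mean back to r. A minimal sketch of that arithmetic follows; the function name `fisher_mean` and the input values are hypothetical illustrations and are not the study's correlations.

```python
import math


def fisher_mean(correlations):
    """Average correlations via Fisher's r-to-z transformation."""
    # z = atanh(r); each |r| must be strictly below 1.
    z_values = [math.atanh(r) for r in correlations]
    mean_z = sum(z_values) / len(z_values)
    # Back-transform the mean z to the correlation scale.
    return math.tanh(mean_z)


if __name__ == "__main__":
    # Hypothetical correlations, for illustration only.
    print(round(fisher_mean([0.70, 0.75, 0.80]), 3))
```

Because the transformation stretches values near 1, the back-transformed mean of unequal positive correlations lies slightly above their arithmetic mean.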

The effects of uncertainty distribution on 
task preference could not be fully manifested 
in this experiment because, as noted above, 
the ability levels that the subjects reported 
implied that their uncertainty was concentrated
in the upper region of the ability dimension.
However, this may be less true for
subjects who reported the more moderate 
levels (i.e., Levels 4 and 5). It seems plausible 
that these subjects assigned higher probabil- 
ities to the low ability levels than did subjects 
who reported extremely high levels (i.e.,
Levels 6 and 7). In that case, the former 
group’s uncertainty was more evenly dis- 
tributed over the ability scale. Such uncer- 
tainty distribution should act to increase their 
interest in tests that can discriminate between 
low ability levels, namely, descending and 
constant diagnosticity tests. More precisely, 
the self-assessment notion would predict that 
the preference for ascending diagnosticity 
tests over constant and descending diagnos- 
ticity tests would be more pronounced among 
extremely high ability subjects than it would 
be among moderately high ability subjects. 
Figure 2 presents means of the index of test 
preferences for subjects whose perceived abil- 
ity was below the median (Levels 4 and 5) 
and for subjects whose perceived ability was 
above the median (Levels 6 and 7). It can be 
seen that the preferences of the extremely 
high ability group increased from the descend- 
ing to the ascending diagnosticity test much 
more sharply than did the preferences of the 
moderately high ability group, F(2, 104) = 
14.62, p < .001, for the Perceived Ability X 
Diagnosticity Distribution interaction. More- 
over, this difference between the two per- 
ceived ability groups was greater with respect 
to tests of high overall diagnosticity than with 
respect to tests of low overall diagnosticity, 
F(2, 104) = 3.03, p < .06, for the Overall 
Diagnosticity x Perceived Ability x Diagnos- 
ticity Distribution interaction. Figure 2 also 
suggests an interaction between perceived
ability and overall diagnosticity, F(1, 52) =
8.23, p < .025, which, unlike the other
interactions involving perceived ability, was not
derived from the self-assessment notion. This
interaction suggests that among extremely
high ability subjects the preference for high
diagnosticity tests over low diagnosticity tests
was more pronounced than among moderately
high ability subjects. It should be noted,
however, that perceived ability did not have a
significant main effect, that is, it did not
affect overall interest in the tests.


3 The same significant effects were obtained in
separate analyses of each of the three measures of
preference.


Figure 2. Experiment 2: Means of the index of test
preference as a function of perceived ability
(moderately high ability vs. extremely high ability),
overall diagnosticity (low vs. high), and
diagnosticity distribution (descending vs. constant vs.
ascending).


General Discussion


The present experiments tested the idea
that people choose among tasks according to
the amount of ability-relevant information
they expect to gain. Two general determinants
of information gain were considered: the
person’s initial knowledge about his or her ability
and the informational properties of the task.
The former determinant was represented by
distribution of uncertainty and the latter by
overall diagnosticity and its distribution. On
the whole, the effects of these variables on
task preference supported the hypotheses and
can be summarized as follows. First, as in
previous research, tasks of high overall
diagnosticity were preferred to those of low overall
diagnosticity (Trope, 1975; Trope & Brickman,
1975; Zuckerman et al., in press; Buckert
et al., Note 1). Second, with overall
diagnosticity held constant, preferences were
determined jointly by uncertainty distribution
and diagnosticity distribution, that is, subjects
preferred tasks that could most accurately
diagnose within the zone where their
uncertainty was concentrated. Third, the joint
effect of uncertainty and diagnosticity
distributions was more pronounced when overall
diagnosticity was high than when it was low,
as overall diagnosticity was manipulated as a
multiplier of diagnosticity in each zone.

These results seem to reflect a preference 
for tasks that can add to previously acquired 
knowledge about one’s own ability. Thus, 
when subjects could infer from previous per- 
formance that they were most likely to have, 
for example, one of the relatively high ability 
levels, they wanted to work on a task that 
could determine with precision which of these 
levels they actually had. These subjects were 
least interested in a task that made precise 
discriminations only between the low ability 
levels, which they were unlikely to have. Such 
a task would only reaffirm that they possessed 
one of the high ability levels. 

Moreover, the observed task preferences 
conform to what a normative model of uncer- 
tainty reduction would prescribe. In such a 
model, each outcome revises prior probabil- 
ities of the ability levels into posterior prob- 
abilities (see Trope & Brickman, 1975). From 
each of these posterior probability distributions,
it is possible to derive, as in information
theory (Attneave, 1959), the amount of
uncertainty conditional upon each outcome. One
can obtain an expected value of uncertainty
by summing these conditional uncertainties
across all outcomes, with probabilities of the
respective outcomes serving as weights.
Expected uncertainty reduction can be defined,
then, as the difference between the
uncertainty in the prior probability distribution
and expected uncertainty.
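
In symbols, these definitions can be written as follows. The notation is introduced here for illustration and is not the author's: a indexes ability levels, o indexes task outcomes, uncertainty is measured in bits, and the revision from prior to posterior probabilities is assumed to follow Bayes' rule.

```latex
\begin{align}
p(o) &= \sum_{a} p(o \mid a)\, p(a)
  && \text{probability of outcome } o \\
p(a \mid o) &= \frac{p(o \mid a)\, p(a)}{p(o)}
  && \text{posterior probability of ability level } a \\
U(o) &= -\sum_{a} p(a \mid o) \log_2 p(a \mid o)
  && \text{uncertainty conditional upon } o \\
\bar{U} &= \sum_{o} p(o)\, U(o)
  && \text{expected uncertainty} \\
R &= -\sum_{a} p(a) \log_2 p(a) - \bar{U}
  && \text{expected uncertainty reduction}
\end{align}
```

Written this way, R is the average amount by which observing an outcome is expected to lower the uncertainty of the prior distribution over ability levels.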

The effects of the experimental variables on 
choice can be traced to their effects on ex- 
pected uncertainty reduction. First, the higher 
the diagnosticity of the outcomes, the greater 
the amount of revision from prior to posterior 
probabilities, and therefore the lower the 
expected uncertainty. Hence, tasks of high 
overall diagnosticity reduce more uncertainty 
than tasks of low overall diagnosticity. Sec- 
ond, the distribution of diagnosticity over the 
zones of ability determines the distribution of 
diagnosticity over performance outcomes. As- 
suming that performance is a positive mono- 
tone function of ability, it follows that diag- 
nosticity is related to ability and performance 
outcomes in a similar manner. Thus, if the 
task is more diagnostic of low ability levels, 
then relatively low outcomes are more diag- 
nostic, whereas if the task is more diagnostic 
of high ability levels, then relatively high 
outcomes are more diagnostic. Third, the dis- 
tribution of uncertainty affects the probabil- 
ities of the outcomes: The higher the zone of 
ability within which uncertainty is concen- 
trated, the higher the probability of relatively 
high outcomes and the lower the probability 
of relatively low outcomes. Now, since the 
uncertainty reduced by each outcome is 
weighted by the probability of the outcome, 
expected uncertainty reduction is high to the 
extent that outcomes that reduce much uncer- 
tainty have high probability and outcomes 
that reduce little uncertainty have low prob- 
ability. Clearly, this requirement is fulfilled
by tasks that are highly diagnostic in the zone
where uncertainty is concentrated. The fact
that subjects preferred such tasks suggests
that they were interested in maximizing ex-
pected uncertainty reduction.
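
The argument of this paragraph can be checked numerically with a small sketch. It assumes Shannon entropy as the uncertainty measure and, for brevity, only two outcomes (success and failure). The four ability levels, the two prior distributions, and the two tasks below are hypothetical values chosen for illustration, not figures from the experiments.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_uncertainty_reduction(prior, p_success):
    """Prior entropy minus the outcome-weighted entropy of the posteriors.

    prior[i]     -- prior probability of ability level i
    p_success[i] -- probability of success given ability level i
    """
    p_failure = [1 - p for p in p_success]
    expected = 0.0
    for likelihood in (p_success, p_failure):
        joint = [pa * lk for pa, lk in zip(prior, likelihood)]
        p_outcome = sum(joint)
        if p_outcome > 0:
            posterior = [j / p_outcome for j in joint]
            expected += p_outcome * entropy(posterior)
    return entropy(prior) - expected


# Hypothetical values: four ability levels, ordered from lowest to highest.
PRIOR_HIGH = [0.05, 0.05, 0.45, 0.45]  # uncertainty concentrated in the high zone
PRIOR_LOW = [0.45, 0.45, 0.05, 0.05]   # uncertainty concentrated in the low zone
TASK_HIGH = [0.05, 0.05, 0.20, 0.80]   # discriminates between the high levels
TASK_LOW = [0.20, 0.80, 0.95, 0.95]    # discriminates between the low levels

if __name__ == "__main__":
    for prior_name, prior in (("high", PRIOR_HIGH), ("low", PRIOR_LOW)):
        for task_name, task in (("high", TASK_HIGH), ("low", TASK_LOW)):
            value = expected_uncertainty_reduction(prior, task)
            print(f"prior {prior_name}, task diagnostic of {task_name} levels: {value:.3f} bits")
```

With these assumed numbers, the task that is diagnostic in the zone where the prior is concentrated yields the larger expected reduction in both cases.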

In and of itself, expected performance or
task difficulty does not necessarily affect ex-
pected uncertainty reduction, nor did subjects
perceive it as having any effect on the accu-
racy with which the tasks could diagnose
their ability. Hence, in Experiment 1 subjects
had clear preferences among the tasks despite
the fact that expected performance at these
tasks was held constant, and in Experiment 2
the manipulation of expected uncertainty did
not affect task preference. However, in the
absence of direct information regarding over-
all diagnosticity and distribution of diagnos-
ticity, people may infer them from task dif-
ficulty. Under these circumstances difficulty
may have an indirect effect on choice. Thus,
Trope and Brickman (1975) argued that 
tasks with intermediate success probabilities 
are perceived as more diagnostic and there- 
fore more attractive than tasks with more ex- 
treme probabilities (see also Meyer, Folkes, 
& Weiner, 1976). Furthermore, it seems prob- 
able that easy tasks are perceived as more 
diagnostic of low ability levels (i.e., poor
performance is more diagnostic than good 
performance) and difficult tasks as more diag- 
nostic of high ability levels (i.e., good per- 
formance is more diagnostic than poor per- 
formance). Hence, to maximize information 
gain or uncertainty reduction, people with 
low perceived ability (whose uncertainty is 
concentrated in the low ability levels) should 
prefer relatively easy tasks, whereas people 
with high perceived ability (whose uncer- 
tainty is concentrated in the high ability 
levels) should prefer more difficult ones. The
same logic applies to the case of two out-
comes, success and failure, and two levels of
ability, low and high. As difficulty increases,
the perceived diagnostic value of success be-
comes greater than that of failure, and since
the subjective probability of success increases
with perceived ability, people with high per-
ceived ability should prefer more difficult
tasks than people with low perceived ability
(see Zuckerman et al. for corroborating re-
sults). Poor performance shifts the uncer-
tainty to lower ability levels and good per-
formance shifts it to higher ability levels.
Hence, in the former case, the person will at-
tempt increasingly easier tasks, whereas in
the latter case, he or she will attempt increas-
ingly harder tasks. These well-documented
shifts in “level of aspiration” (Lewin, Dembo,
Festinger, & Sears, 1944) are interpreted 
here as attempts to keep the amount of new 
ability-relevant information at a maximum. 
The present findings are relevant to the 
view that self-enhancement and self-protec- 
tive needs (Bradley, 1978; Jones, 1973) de-
termine task choice. Kukla (1978) has re-
cently suggested an attributional theory of
choice based on the assumption that the tend- 
ency to work on a task increases with the 
level of ability that performance is expected 
to imply. A task will be attractive if it can 
demonstrate one’s high ability, but it will be 
avoided if it may betray one’s low ability. 
With regard to the present study, this ap- 
proach would predict that subjects who sus- 
pect that they have one of the low ability 
levels will be less interested in working on the 
ability-relevant tasks than will subjects who 
believe that they have one of the high ability 
levels. However, the data did not show any 
trace of such a main effect of perceived ability 
(or uncertainty distribution). Furthermore, 
self-serving biases should produce a preference 
for a task that is more diagnostic of high
ability levels over a task that is more diagnos- 
tic of low ability levels. More specifically, 
persons with high perceived ability should be 
attracted to the former task because it can 
demonstrate unequivocally that they have 
high ability, and persons with low perceived 
ability should avoid the latter task because it 
may disclose that they have low ability. 
Again, there was very little evidence for such 
asymmetry. Subjects chose the task that was
diagnostic at the zone of ability within which 
their uncertainty was concentrated, even when 
performance of this task could give very un- 
flattering results. Finally, it is worth noting 
that perceived ability did not produce any 
self-serving distortions of the informational 
properties of the tasks. Specifically, subjects
with high perceived ability did not overesti-
mate the accuracy with which performance
could reveal high ability; nor did subjects
with low perceived ability underestimate the
accuracy with which performance could re-
veal poor ability. The judgments of both
groups were appropriately related to the
manipulated values of the informational char-
acteristics of the tasks.

There was one result, a Perceived Ability x 
Overall Diagnosticity interaction, which was 
consistent with the self-enhancement view. 
This interaction indicated that the preference 
for the high overall diagnosticity tasks was 
more pronounced among subjects with ex- 
tremely high perceived ability than among 
subjects with moderately high perceived abil- 
ity. This result suggests that subjects’ choice 
behavior was not entirely free from self-serv- 
ing biases. It should be remembered, however, 
that in Experiment 2, where this interaction 
was obtained, perceived ability was varied by 
subject selection, which leaves open the pos- 
sibility of a correlated variable. Achievement 
motive, which was found to be correlated with 
perceived ability (Kukla, 1978), may be such 
a variable; previous research by Trope 
(1975) has shown that high-achievement-mo- 
tive people are more interested in ability-rel- 
evant information than low-achievement-mo- 
tive subjects. In addition, while Zuckerman et 
al. obtained a similar effect of perceived abil-
ity, Buckert et al. (Note 1) did not. Clearly,
more research on this problem is needed be- 
fore definite conclusions can be reached. 

Even if self-serving biases were entirely 
absent in the data, it would not have meant 
that our subjects were affectively indifferent 
to the ability-relevant feedback they received 
or expected to receive. Extrapolating from
previous research on reactions to self-evalua- 
tion (see reviews by Bradley, 1978; Jones, 
1973; Shrauger, 1975) it seems plausible that 
the subjects who learned that they had one of 
the low ability levels were distressed by the 
feedback and anticipated even greater distress 
should the next task diagnose a low level of 
ability. However, the present results cast 
doubt on the assumption that such affective 
reactions shape choice strategy. The desire to 
assess one’s ability accurately seems a more 
powerful determinant of choice, at least under 
the circumstances investigated in this study. 
It remains for future research to determine 
the generalizability of this conclusion to dif- 
ferent circumstances. One limiting condition 
is suggested by the fact that subjects in the 
present study were assured that their future
performance would remain anonymous. It is
possible that when performance is expected to 
be public, the enhancement of self-esteem as 
well as social esteem may assume greater im- 
portance than accurate self-assessment. In 
such a situation, people may actually choose 
to work on tasks that are expected to diagnose 
their strengths but not their weaknesses.


The Role of Facial Response in the Experience of Emotion


Roger Tourangeau and Phoebe C. Ellsworth 
Yale University 


Facial expression and emotional stimuli were varied orthogonally in a 3 × 4
factorial design in order to test whether facial expression is necessary or suffi-
cient to influence emotional experience. Subjects watched a film eliciting fear,
sadness, or no emotion, while holding their facial muscles in the position char-
acteristic of fear or sadness, or in an effortful but nonemotional grimace; those
in a fourth group received no facial instructions. The subjects believed that the
study concerned subliminal perception and that the facial positions were neces-
sary to prevent physiological recording artifacts. The films had powerful effects
on reported emotions, the facial expressions none. Correlations between facial
expression and reported emotion were zero. Sad and fearful subjects showed
distinctive patterns of physiological arousal. Facial expression also tended to
affect physiological responses in a manner consistent with an effort hypothesis.


Half a century ago, Cannon’s decisive cri-
tique of the James-Lange theory ended scien-
tific consideration of the hypothesis that
peripheral responses provide the basis for
qualitative distinctions among emotions. Non-
specific arousal theories have dominated the
study of emotion ever since (Duffy, 1934,
1962; Lindsley, 1951). The James-Lange
theory (James, 1890; Lange, 1885/1922)
proposed that emotional stimuli elicit physio-
logical responses specific to each emotion; the
experience of an emotion, according to their
view, is the perception of the corresponding
physiological pattern.¹ By contrast, the non-
specific arousal theorists argue that physio-
logical patterns do not correspond to specific
emotions but only to the intensity of general 
emotional arousal and perhaps (Duffy, 1962) 
to a global, primitive approach-avoidance 
tendency. These theories have little to say 
about qualitative distinctions among the emo- 
tions. They tend to share, implicitly or ex- 
plicitly, the assumption that such distinctions 
are the product of learning. The theory of 
Schachter and Singer (1962), for example, 
asserts that undifferentiated arousal is classi- 
fied according to situational cues to determine 
the emotional experience. Qualitative distinc- 
tions derive from the classification; the classi- 
fication presumably derives from social learn- 
ing (see also Duffy, 1962). 

¹ The theories proposed by James and by Lange
differ in that Lange’s theory was restricted to auto-
nomic feedback (heart rate, stomach contractions,
blushes, etc.), whereas that of James also included
muscular feedback (such as changes in tonus, posture
and, presumably, facial expression). Most subsequent
writers, including the major critics, attributed the
visceral version to both authors indiscriminately; in
using the term “James-Lange theory” we will follow
in this tradition, while recognizing that the muscular
components of the James theory were not fully dis-
credited by Cannon’s research, and have much in
common with the later facial feedback hypotheses.

The learning position of the arousal theo-
rists is cast into some doubt by recent evi-
dence for the widespread cross-cultural gen-
erality of a small set of basic emotion
categories, reliably used in labeling facial ex- 
pressions of emotion (Ekman, Sorenson, & 
Friesen, 1969; Ekman & Friesen, 1971; Izard, 
1971). If emotion categories are learned, cul- 
tural phenomena, why should all cultures 
studied so far share the same small set? In 
line with the evidence on the cross-cultural 
generality of the recognition of facial expres- 
sions, some theorists (Tomkins, 1962; Izard, 
1971, 1977) have proposed that feedback 
from the facial muscles is important in the 
subjective experience of emotion. In the
strongest version of these facial feedback 
theories, facial responses play the same crit- 
ical role as more general visceral and muscular 
changes play in the James-Lange theory: the 
proprioception of the facial response is the 
experience of emotion. This shift in emphasis 
from visceral to facial feedback neutralizes 
most of Cannon’s criticisms of the James-
Lange theory. Cannon (1927), for example,
argued that visceral responses were too slow 
and too undifferentiated to be the basis of 
the subjective experience; facial expressions 
are sufficiently immediate and sufficiently 
various; similarly, Cannon’s demonstrations 
of “emotional” behavior in animals whose 
viscera were separated from their central 
nervous systems are irrelevant to the facial 
theories. 

Although they differ on the causal priority
assigned to each, these various positions pre-
dict that in general self-report, facial, and
physiological measures of emotion should be 
positively correlated. In contrast, still another 
position predicts a negative correlation among 
these measures. This cathartic-hydraulic view
was proposed first by James (1890); its chief 
exponent, however, is Freud (1946/1921). 
According to the hydraulic view, verbal, fa- 
cial, and physiological responses are alterna- 
tive channels for releasing the emotional en- 
ergy evoked by a stimulus; if one channel is 
blocked, the response through the others 
should increase in intensity. 

What is the evidence for the various views? 
The general arousal models receive a certain 
amount of indirect support from the numerous 
failures to find clear patterns corresponding 
to different emotions (Lacey, Kagan, Lacey,
& Moss, 1963; Lindsley, 1951). In addition,
Hohmann (1966) has found, contrary to Can- 
non’s (1927) contention, that separation of 
the viscera from the central nervous system 
(due to spinal lesions) in humans is associated 
with reduced emotional responding. Finally, 
there is Schachter and Singer’s (1962) dem- 
onstration that induced autonomic arousal can 
lead to various types of emotional (and un- 
emotional) response (see also Schachter & 
Wheeler, 1962; Zillman & Bryant, 1974).²
On the other hand, some of the better studies 
of emotional arousal have found evidence that
different emotions are associated with different 
autonomic patterns (Ax, 1953; Funkenstein, 
1955; Wolf & Wolff, 1947). Hohmann’s find- 
ings on patients with spinal lesions invite 
numerous interpretations, Schachter and 
Singer’s results do not always replicate (see 
Marshall & Zimbardo, 1979; Maslach, 1979), 
and even in their original experiment the 
differences between subjects given situa- 
tional cues for euphoria and those given cues 
for anger were negligible. Finally, Cannon’s 
arguments on the long latencies of visceral 
responses still pose difficulties for any theory 
that makes the sensation of autonomic arousal 
a necessary condition for emotional experi- 
ence. 

² Since the physiological measures in Schachter
and Singer were extremely crude, we cannot be en-
tirely confident that the physiological patterns were
identical. It is quite possible that the situational cues
modified the physiological response.

The strongest evidence for the facial feed-
back view comes from studies by Laird
(1974) and Lanzetta, Cartwright-Smith, and
Kleck (1976). Laird showed effects for manip-
ulated facial expression on felt aggressive-
ness, and Lanzetta et al. showed similar ef-
fects on pain. The facial feedback theorists,
however, cannot explain the results of
Schachter and Singer’s experiment (but see
comments by Izard, 1977, chapter 2). In
addition, the Laird study did not contain the
control group necessary to determine whether
the appropriate expression increases the re-
sponse to a stimulus or the inappropriate ex-
pression inhibits the response (or both). Laird
also used self-report measures in a within-
subjects design, leaving open the possibility
that demand characteristics were responsible 
for the results. Lanzetta et al. (1976) also 
used a within-subjects design, but their inclu- 
sion of a galvanic skin response (GSR) mea- 
sure makes an account in terms of demand 
characteristics less plausible. However, since 
pain is not typically included in theories of 
emotion, their findings may not generalize to 
the feeling states that are. 

The best evidence for the hydraulic view 
comes from studies showing a negative corre- 
lation between facial expressiveness and mea- 
sures of physiological arousal (Buck, Miller, 
& Caul, 1974; Buck, Savin, Miller, & Caul, 
1972; Lanzetta & Kleck, 1970). This evidence 
tends to disconfirm the other theories. How- 
ever, few of the studies cited in support for 
this view actually show the negative correla- 
tion expected to hold within a given individ- 
ual. Instead, they find evidence that the most 
expressive people are not the most physiolog- 
ically aroused; they do not find that the same 
individual is more expressive when he is less 
aroused. In addition, several studies (includ-
ing Lanzetta et al., 1976) have found signif-
icant positive correlations.
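
The difference between a correlation across people and a correlation within a person can be made concrete with a short sketch. The scores below are invented purely for illustration and are not data from any of the studies cited; the names `PEOPLE`, `between_person_r`, and `within_person_rs` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented scores for three people observed on three occasions each.
# Each entry: (expressiveness ratings, physiological arousal readings).
PEOPLE = {
    "A": ([1, 2, 3], [7, 8, 9]),
    "B": ([4, 5, 6], [4, 5, 6]),
    "C": ([7, 8, 9], [1, 2, 3]),
}


def between_person_r(people):
    """Correlate each person's mean expressiveness with that person's mean arousal."""
    mean_expr = [sum(e) / len(e) for e, _ in people.values()]
    mean_arousal = [sum(a) / len(a) for _, a in people.values()]
    return pearson(mean_expr, mean_arousal)


def within_person_rs(people):
    """Correlate expressiveness with arousal across occasions, separately per person."""
    return {name: pearson(e, a) for name, (e, a) in people.items()}


if __name__ == "__main__":
    print("between persons:", between_person_r(PEOPLE))
    print("within persons:", within_person_rs(PEOPLE))
```

In this constructed case the most expressive person is the least aroused on average, yet each individual is more aroused on the occasions when he or she is more expressive. A negative correlation across people therefore does not by itself establish the within-individual relation that the hydraulic view requires.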

Thus none of the evidence is decisive, and 
most of the questions about the roles and rela- 
tive importance of the various components of 
emotional responding remain open. This study 
attempts to answer some of these questions by 
testing several predictions from the facial 
feedback hypothesis. What we are calling the 
“facial feedback hypothesis” obviously de- 
rives from the theories of Tomkins (1962), 
who was the first to claim that the emotions 
are primarily facial behaviors, and later Izard, 
who maintains that “awareness of facial activ- 
ity or facial feedback is actually our awareness 
of the subjective experience of a specific emo-
tion” (1977, p. 60). Nonetheless the general
hypothesis we are testing here is not the same
as that of either of these theorists. In the first
place, both theories are comprehensive state-
ments containing numerous propositions
about the relationship of emotions to person-
ality, motivation, communication, and each
other. The facial feedback hypothesis involves
just one of these propositions, albeit a central
one. In the second place, neither author is en-
tirely static in defining the implications of his
general statements about the importance of 
the face. Our experiment was designed to test 
(a) whether the appropriate facial expression 
is necessary for the subjective experience of 
the emotion, and (b) whether the voluntary 
assumption of an expression is sufficient to 
produce the experience. If facial expression is 
necessary for emotional experience, there 
should be no emotion unless the face responds. 
Even in the presence of emotional stimuli, 
without the appropriate emotional expression, 
no emotion should be felt. If the facial ex-
pression is sufficient for emotional experience,
when the face responds, the emotion should
follow. Even in the absence of emotional stim-
uli, an emotional facial expression should pro-
duce an emotional feeling. Weaker forms of
the hypotheses predict that feedback from the
face should have a significant main effect on
the emotional experience, attenuating or in- 
tensifying it. These hypotheses are derived 
from some of the more strongly worded state- 
ments of Tomkins (1962) and Izard (1977), 
statements that are qualified in other parts of 
their work. Both theorists argue that adults
may have learned to duplicate the effects of 
proprioceptive feedback by means of a re- 
afferent loop from the subcortical centers di- 
rectly to the cortex, rather than from the sub- 
cortical centers to the face to the cortex, so 
that actual movement of the face is not al-
ways necessary. Similarly, they have argued
that voluntary movement of the facial mus- 
cles may not be sufficient to produce the cor- 
responding emotion, because it does not create 
exactly the same proprioceptive cues as the 
involuntary movement created by a “real” 
emotional stimulus. 

Although the theories allow for the pos- 
sibility that neither the necessity nor the 
sufficiency hypothesis is true, we believe that 
these hypotheses are worth testing. In the first 
place, the practical, therapeutic implications 
of the theory are much greater if the facial 
muscles are actually involved in the experi- 
ence. If the face were necessary for emotional 
experience, victims of accidents, disease, or 
surgery resulting in facial paralysis or sensory 
impairment would be expected to show corre-
sponding affective deficiencies, and additional
therapeutic attention would be indicated. If 
the face were sufficient to influence the felt 
emotion, it would be useful to teach patients 
suffering from affective disorders how to con- 
trol their facial expressions. In the second 
place, the qualifications render the theory 
much less testable. If the only influential 
facial expression is one that results from an 
involuntary natural response and if the facial 
muscles can be bypassed intracranially, the 
causal role of the face becomes inaccessible to 
any sort of definitive empirical test. 

A final hypothesis tested by our research is 
one that does follow directly even from the 
weaker statements of Tomkins (1962) and 
Izard (1977), as well as from the stronger 
facial feedback hypothesis. It is simply the 
prediction that, in general, the relationship 
between the facial expression and the emo- 
tional experience should be monotonic and 
positive. 

The three hypotheses of necessity, suf- 
ficiency, and monotonicity were tested in this 
study by a design in which facial expression 
was manipulated independently of emotional 
stimulation. The basic design included a rep- 
lication across two emotions—fear and sad- 
ness; we observed facial, physiological, and 
self-report responses. We chose fear and sad- 
ness because we felt that it was important to 
demonstrate a distinction between negative 
emotions. The comparison of a single positive 
with a single negative emotion does not pro- 
vide a very stringent test of the qualitative 
distinctions among emotions, and is particu- 
larly prone to demand characteristics and 
level-of-arousal artifacts. Because it in- 
cluded measures of all three sets of responses, 
the study can also address additional ques- 
tions about their interrelationships and thus 
can provide a basis for comparing all the 
theoretical positions.


Method 


Overview 


Believing that the experiment was a study of 
physiological responses to subliminal stimuli, subjects 
watched a sad, fear-arousing, or emotionally neutral 
film. Their heart rate, galvanic skin response, and
respiration rate were recorded, and the placement of
additional fake electrodes on their faces provided a
rationale for asking them to hold their facial muscles
in a constant position during the film. These positions
corresponded to a fearful expression, a sad expression,
or a grimace unrelated to any emotion. A final group
of subjects watched one of the three films but re-
ceived no facial instructions. During the film the
subjects’ faces were videotaped, and as soon as it was
over the subjects rated their emotional experience.
The design was thus a 3 × 4 factorial with three
films (fear, sad, and neutral) and four facial expres-
sions (fear, sad, neutral, no instructions).
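
The crossing of the two factors can be laid out explicitly. The sketch below simply enumerates the twelve cells of the 3 × 4 design using the level labels given in the overview; the function names and the notion of flagging "matched" cells are illustrative conveniences, not part of the authors' procedure.

```python
from itertools import product

FILMS = ("fear", "sad", "neutral")
FACES = ("fear", "sad", "neutral", "no instructions")


def design_cells(films=FILMS, faces=FACES):
    """Return every (film, facial expression) combination of the factorial design."""
    return list(product(films, faces))


def matched_cells(cells):
    """Cells in which an emotional film is paired with the same emotional expression."""
    return [(film, face) for film, face in cells
            if film == face and film in ("fear", "sad")]


if __name__ == "__main__":
    cells = design_cells()
    for film, face in cells:
        print(f"film: {film:8s} face: {face}")
    print(len(cells), "cells;", len(matched_cells(cells)), "matched")
```

Because every facial condition occurs with every film, effects of the face can be separated from effects of the film, which is what the tests of necessity and sufficiency require.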


Subjects


The subjects were 128 undergraduates, 85 of whom
received both credit (in partial fulfillment of intro-
ductory course requirements) and $2.00. The remain-
ing 43 subjects received $3.00. Five subjects were not
included in the analyses: 2 left the experiment when
the content of the film was described to them, 1
stopped the film in the middle, and 2 others were
lost because of equipment problems.


Procedure 


There were two experimenters in the study. The 
first was blind to the subject’s facial instruction con-
dition. This experimenter (E1) told the subject the
basic cover story: the experiment concerned physio- 
logical indices of subliminal perception—heart rate, 
skin conductance, respiration rate, “the orienting re-
flex,” and “subvocal speech”; all of these responses
would be recorded on a polygraph, and the two that 
involved small movements of the eyes and lips would 
also be recorded on videotape. The subliminal stimuli 
would be single frames spliced into a film. E1 then
explained that certain parts of the procedure might 
cause some discomfort: (a) to prevent the subject 
from concentrating too hard on finding the sub- 
liminal images, the film was intended to be distract- 
ing, and might be upsetting; (b) since normal muscle 
movements could distort some of the physiological 
measures, the subject might be required to hold cer-
tain muscles in a somewhat uncomfortable position
[...] speech response, absolutely complete anonymity could
not be guaranteed. After explaining how the subject
would be “hooked up” to the polygraph, E1 described
the content of the film and obtained the subject’s
formal consent.


³ We considered a comparison between fear and
anger, in an attempt to include a replication of [...]
(1955). We had to abandon this plan for the inter-
esting reason that we could find no film that con-
sistently elicited anger more than other emotions in
all subjects.

E1 then left and sent the second experimenter (E2)
into the room. E2 had been out of earshot and so was
unaware of the subject’s film condition. E2
placed the electrodes on the subject (electro-
cardiograph [EKG] electrodes on the wrists, a res-
piration thermistor⁴ on the nostril, GSR electrodes
on the middle finger, bogus electromyograph [EMG]
electrodes on the face), explaining how they worked
and reinforcing the rationale of the cover story. After
the electrodes had been placed, E2 told the subject to
keep his or her arms still, and gave the instructions
for holding the face in the position that would facil-
itate recording the facial responses. When E2 was
satisfied with the subject’s facial pose, he left the
subject alone for a baseline period with instructions
to relax.

The baseline period lasted until the subject’s phys-
iological responses had appeared stable for at least 1
minute. Then E2 began videotaping the subject’s face
and repeated the facial instructions (in a shorter
form). When the facial expression was approximately
right, E2 told the subject to hold the position during
the film; he then turned the projector on and im-
mediately left the room. As soon as the film was over
(E2 watched through a one-way mirror for the end
of the film), E2 returned and administered a ques-
tionnaire containing the self-report emotion items,
along with filler items consistent with the cover story.
E1 debriefed the subjects. Although several sub-
jects expressed confusion about the complex cover
story, no one guessed that facial expression was the
variable of interest, nor that his or her expression had
been an emotional one.

Two male undergraduates took the E1 role; one
male graduate student took the E2 role.


Independent Variables 


Film. We pretested eight films on a group of 50
undergraduates, and selected three because they
elicited high agreement on a single dominant emotion,
with [...] emotions. Subjects rated each of five [...]
0-8) for each [...] (M = 2.4). It concerns [...] shop.
The sad film traces [...] to his brief stay in an
orphanage while [...] in the hospital. The sad film
reliably elicited [...] (M = 3.36) and [...] (M = 3.4)
from pretest subjects, and low fear (M = 0.15). The
neutral film depicts a flower [...] in the botanical
gardens of Golden Gate Park. It was seen as slightly
[...] (M = 2.8) and interesting (M = 2.1) and not at all
sad (M = 0.20) or frightening (M = 0.02). Each film
lasted for 2 minutes.

Facial instructions. The facial instructions were
derived from the work of Tomkins (1962), Izard
(1971), Ekman and Friesen (1975), and the instruc-
tions used by the second author in developing proto-
type photographs for the Facial Affect Scoring Tech-
nique (Ekman, Friesen, & Tomkins, 1971). In prac-
tice, the instructions varied somewhat depending on
how easily the subject adopted the desired position.
For the subjects in the fear face condition, the in-
structions typically ran:


There are three sets of muscles around the eyes 
that can distort the measurement of the orienting 
reflex. We want you to contract all three of them.
The first is the muscle between the eyebrows, the 
corrugator. Contract that muscle by pulling the two 
eyebrows together, toward each other in the mid- 
dle. The second is the muscle in the forehead, the 
frontalis. Contract the frontalis by raising your 
eyebrows. The last muscle is the one which controls 
the eyelids. Contract that one by opening your 
eyes up wide. There is only one set of muscles 
you'll have to contract in your mouth. First, part
your lips slightly; it will be easier. The muscles
here [points below corners of mouth] are the tri- 
angularis muscles. Contract them by pulling the 
corners of your mouth down and back. If you're 
doing it right, you should feel your neck get tense. 


For the sad face subjects, the instructions were sim- 
ilar, although the muscles differed: the corrugator was 
contracted (brows drawn together), frontalis con- 
tracted (inner corner of the brows raised), eyelids 
relaxed, mentalis contracted (lower lip pushed up and 
out slightly), and quadratus muscles contracted (cor- 
ners of lips pulled down). The nonemotional face 
subjects were instructed to close one eye, purse their 
lips, and puff out their cheeks. Unmanipulated face 
subjects were told to ignore the facial electrodes and 
to act “naturally,” as we were “interested in deter-
mining whether the orienting reflex and subvocal
speech responses can be detected against a background 
of normal facial movements.” For all subjects, elec- 
trodes were placed on the chin and below one eye 
(for subjects in the nonemotional face condition, this 
bogus electrode was placed below the eye they were
supposed to close).


Dependent Variables 


Self-reported emotion. Fear was measured on two
9-point scales labeled “scared” and “afraid”; the
scales ranged from “not at all” (0) to “very strongly”
(8). The two scales were highly related (r = .87).
Sadness was measured on two 9-point scales;
these were labeled “sad” and “unhappy” (r = [...]).
As in the pretests, the overall emotion score is the
sum of the two scales.

⁴ Respiration rate was recorded as a control for
certain heart rate artifacts. These corrections proved
unnecessary. The results for respiration rate were not
analyzed.

Facial expressions. Two trained raters blind to the
subject’s conditions scored videotapes on the emo-
tional content of the subject’s facial expressions. For
each subject, the raters judged how sad, unhappy, 
scared, and afraid the subject looked. (The scales 
for these judgments were identical to those used by 
the subjects in judging their own emotions; they 
were, thus, 9-point scales ranging from “not at all” 
to “very strongly.”) The facial sadness measure is the 
sum of the two raters’ judgments for “sad” and 
“unhappy”; similarly, the facial fear rating is the 
sum of the two ratings of “scared” and “afraid.” 

Physiological indices. Physiological responses were 
monitored by a Narco-Bio Desk Model Physiograph. 
EKG electrodes were attached to the subjects’ wrists 
with a ground electrode on the ankle. A cardiotach- 
ometer averaged the beat-to-beat interval over five 
beats and recorded the heart rate in beats per min- 
ute. The skin resistance was monitored by passing a 
small direct current through two plate electrodes at- 
tached to the top and bottom of the subject's middle 
finger. 
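
The cardiotachometer reading described above amounts to averaging five beat-to-beat intervals and expressing the result in beats per minute. A minimal sketch, assuming the intervals are available in seconds (the function name and the sample intervals are hypothetical, not taken from the study):

```python
def heart_rate_bpm(intervals_s):
    """Convert five beat-to-beat intervals (in seconds) to beats per minute."""
    if len(intervals_s) != 5:
        raise ValueError("expected exactly five beat-to-beat intervals")
    mean_interval = sum(intervals_s) / len(intervals_s)
    # One beat every mean_interval seconds corresponds to 60 / mean_interval bpm.
    return 60.0 / mean_interval


# Hypothetical intervals averaging 0.8 seconds per beat.
print(heart_rate_bpm([0.80, 0.82, 0.78, 0.81, 0.79]))
```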

Artificiality. Our efforts to keep the experimenters 
blind to the subject’s condition, to prevent the sub- 
jects from guessing the hypotheses, and to collect 
three different kinds of data, although successful, 
resulted in a situation that was complicated and 
unusual and may recall the futile efforts of early re- 
searchers to obtain photographs of “true emotions” 
in the laboratory (e.g., Landis, 1924). Although emo- 
tional experience may have been attenuated in our 
setting, there is no evidence that this was the case. 
First, the self-reported emotional reactions to the 
films were at least as high as those obtained in pre- 
testing; second, several subjects spontaneously com- 
mented on their emotional arousal, and one stopped 
the projector because the film was too upsetting; 
third, there were no false alarms in the reporting of 
subliminal stimuli, as might be expected if the task 
demands distracted subjects from the arousing prop- 
erties of the films; and finally, unlike the early re- 
search, our emotional stimuli produced significant 
differences on the dependent variables. 


Results 
Facial Expression 


The judgments of facial expression appear 
to be reliable across raters: for the two-item 
sadness index, the two raters’ judgments cor- 
related .76 (p< .001); for the fear index, 
the two raters correlated at .81. For the re- 
maining analyses, ratings were summed over 
the two judges. 

The facial instructions had the expected 
large effect on the ratings of facial expression. 
On the average, subjects given the sad facial 
instructions were rated as more than 15 points 
sadder (on a 32-point scale) than the subjects 
given any of the other three facial instruc- 
tions, F(3, 3) = 153.1, p< .01. Subjects 
given the fear face instructions averaged 13 
points higher on rated facial fear, F(3, 3)= 
327.1, p < .01. The facial instructions thus 
seemed to have succeeded. 

There is also some indication that the film 
affected ratings of facial expression. Subjects 
watching the fear film looked more fearful 
(M = 8.2) than those watching the sad (4.4) 
or neutral (5.4) films, F(2, 2) = 116.7, p < 
.01. There was no corresponding effect for 
rated facial sadness. This effect provides some 
evidence for the validity of the judges’ rat- 
ings (since the film might have been expected 
to affect facial expression in the obtained di- 
rection) and of the success of the films in 
creating real emotions but also indicates that 
the facial instructions manipulation was not 
wholly successful. Apparently subjects could 
not consistently maintain their instructed ex- 
pression when confronted with the strong 
stimuli of the film. Though reliable, the effect 
of the film is small, particularly when com- 
pared with the effect for facial instructions. 

There were no interactions nor main effects 
for experimenter on rated facial expression. 


5 The cell sizes in this study are unequal. While the 
differences are small, they may in part reflect the 
experimental variables of interest; the film, for ex- 
ample, may have affected subject attrition slightly 
(two fear film subjects, one sad film subject, and no 
neutral film subjects refused to participate after 
learning about the film they would see). Under these 
circumstances, weighted means analysis seemed most 
advisable (Winer, 1962, p. 222). Partly because the 
inequalities are so small, unweighted analyses yield 
identical conclusions. 

The film and face variables were treated as fixed 
factors and experimenter as a random factor. The 
main effects for the film and facial instructions are 
tested against their interaction with the experimenter 
factor. Error terms based on pooling these inter- 
action terms with the within-group sum of squares 
(Winer, 1962, pp. 202-207) do not alter the conclu- 
sions except where reported. The pooling procedure 
is used whenever permissible for planned and a pos- 
teriori comparisons, because of the handicap of 1 and 
1 degrees of freedom. Small variations in the degrees 
of freedom in these analyses reflect missing data and 
the specific terms that could be included in the pooled 
error term. 


Self-Reported Emotion 


As Tables 1 and 2 show, the film had sub- 
stantial effects on the subjects’ ratings of their 
own emotions: for fear, F(2, 2) = 29.5, p < 
.05; for sadness, F(2, 2) = 42.1, p < .05. 
A priori contrasts indicated that subjects who 
watched the fear film were more frightened 
than subjects who watched the other films, 
F(1, 106) = 17.9, p < .01; subjects who saw 
the sad film were sadder, F(1, 103) = NG 
p < .01. Each contrast accounts for more than 
90% of the variance among the means for the 
three film conditions. (See Footnote 5 for an 
explanation of the degrees of freedom). 
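
The share of between-condition variation captured by a single contrast can be illustrated with a short calculation. The sketch below is not the authors' analysis: it assumes equal cell sizes (the study used weighted means), takes the three film means for self-reported fear from the M column of Table 1, and uses hypothetical contrast weights of 2, -1, -1 to compare the fear film with the other two. The function name is likewise an assumption.

```python
def contrast_share(means, weights):
    """Proportion of variation among group means captured by one contrast.

    Assumes equal group sizes, so the common n cancels out of the ratio.
    """
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must sum to zero")
    grand_mean = sum(means) / len(means)
    ss_between = sum((m - grand_mean) ** 2 for m in means)
    estimate = sum(w * m for w, m in zip(weights, means))
    ss_contrast = estimate ** 2 / sum(w ** 2 for w in weights)
    return ss_contrast / ss_between


# Film means for self-reported fear (fear, sad, neutral) from Table 1.
film_means = [6.5, 3.6, 2.5]
share = contrast_share(film_means, [2, -1, -1])
print(round(share, 3))
```

Under these assumptions the contrast captures roughly 93% of the variation among the three film means, which is in line with the figure of more than 90% reported above.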

An examination of the difference between 
fear (or sadness) and the mean of all other 
emotion ratings confirms this analysis. Com- 
pared to subjects who watched the sad and 
neutral films, fear film subjects felt predom- 
inantly fear. For subjects who watched the 
fear film, the self-rated fear was 2.65 points 
higher than the average self-rating of all the 
other emotions; for sad film subjects self- 
rated fear was 1.06 points lower than the 
average across the other emotions, and for 
neutral film subjects it was about the same 
(0.28 points lower), F(2, 2) = 113.8, p < .01. 
The overall analysis also shows an uninter- 
pretable Face X E interaction. No other ef- 
fects are significant. Compared to subjects 
who watched the fear and neutral films, sad 
film subjects felt predominantly sadness. 
Their self-rated sadness was 5.20 points higher 
than the average of all the other emotions; 
for fear film subjects self-rated sadness was 
about the same as the average of all other 
emotions (.06 points lower) and for neutral 

Table 1 
Mean Self-Reported Fear 

                   Facial instructions 
Film       Fear    Sad    Nonemotional    Unmanipulated    M 
Fear       6.0     7.5    7.8             4.6              6.5 
Sad        3.3     3.2    3.9             3.9              3.6 
Neutral    3.3     2.1    3               2.2              2.5 

Note. Scale ranges from 0 to 16, with higher numbers 
indicating greater fear. 
Table 2 
Mean Self-Reported Sadness 

                   Facial instructions 
Film       Fear    Sad    Nonemotional    Unmanipulated    M 
Fear       4.1     4.4    4.7             3.5              4.4 
Sad        8.9     8.8    9.7             8.8              9.1 
Neutral    0.9     2.6    1.9             1.4              1.6 

Note. Scale ranges from 0 to 16, with higher numbers 
indicating greater sadness. 


film subjects 1.42 points lower, F(2, 2) = 
20.5, p < .05. No other effects were signif- 
icant. 

Both the necessity hypothesis (that facial 
responses are necessary for the felt emotion) 
and the sufficiency hypothesis (that facial re- 
sponses are sufficient for felt emotion) of the 
facial feedback theory predict effects for facial 
instructions. There are, however, no signif- 
icant main effects for facial expression instruc- 
tions on either emotion, nor does facial in- 
struction interact significantly with film. 

One possible interpretation of the lack of 
effects for facial instructions is that the sub- 
jects were simply unable to maintain their 
facial expressions during the film. Part of the 
film effect, then, may reflect covariation of the 
facial expression with the film condition. Par- 
tialing out variation in felt emotion due to 
facial expression through analysis of covari- 
ance, however, fails to alter significantly the 
very substantial film effects. The effects of the 
film on self-reported emotion are thus not at- 
tributable to differences in facial expression 
across film conditions. 

A weakened version of the sufficiency hy- 
pothesis might predict that, in the absence of 
strong situational cues, the facial expression 
may be sufficient to determine felt emotion. 
The neutral film subjects, by this argument, 
should show an effect for facial expression. 
This prediction receives very slight support: 
among those subjects who watched the neutral 
film, the sad face subjects are, on the average, 
slightly sadder than subjects in the other three 
facial instructions conditions; similarly, of 
the neutral film subjects, the fear face sub- 
jects report the most fear. Neither of these 
1526 ROGER TOURANGEAU AND PHOEBE C. ELLSWORTH 
Table 3 
Means on Physiological Indices by Film and Facial Instructions Condition 

Condition            Fall in HR    Rise in HR    Fall in GSR    No. GSRs 
Film 
  Fear               7.4           21.3          71.0           14.4 
  Sad                7.7           16.4          67.1           11.0 
  Neutral            10.8          14.6          52.2           11.7 
Facial instructions 
  Fear               8.4           17.0          43.8           12.6 
  Sad                8.8           17.5          60.2           10.2 
  Nonemotional       7.4           19.6          63.8           15.8 
  Unmanipulated      9.9           15.7          86.2           11.0 

Note. HR = heart rate (change computed in beats per minute). GSR = galvanic skin response (change 
computed in thousands of ohms). 


trends for the neutral film subjects, however, 
is statistically significant, either by a priori 
comparisons within the context of the analysis 
of variance or by Kruskal-Wallis one-way 
analyses of variance for ranked data. 

Overall, the relationship between facial ex- 
pression and reported emotion is slight. The 
correlation between the ratings of facial fear 
and self-reported fear is .01; between facial 
and self-reported sadness, it is .02. The 
strongly positive relationship between facial 
expression and self-reported emotion predicted 
by all the facial feedback theories does not 
seem to obtain. 


Physiological Measures 


Two coders scored the polygraph recordings 
of the physiological variables. Their agree- 
ment was substantial, correlations between 
them ranging from .96 to 1.00 (median r = 
.98). Indices were based on six variables: 
baseline heart rate, the average of the sub- 
ject’s heart rate 10 and 5 seconds prior to the 
end of the baseline period (each heart rate 
reading is itself an average based on five beat- 
to-beat intervals); baseline skin resistance, 
also an average of readings 10 and 5 seconds 
prior to the end of the baseline period; max- 
imum heart rate during the film period; min- 
imum heart rate during the film; number of 
skin responses (any fall exceeding 1,000 ohms 
prior to leveling off was scored as a response) ; 
and lowest skin resistance during the film. 


Both skin resistance measures are reported in 
thousands of ohms. The heart rate (HR) 
measures are in beats per minute. The rel- 
evant means for the physiological variables 
are given in Table 3. 

Rise in heart rate. The largest rise in HR 
for each subject was calculated as the differ- 
ence between maximum and baseline HR. The 
film had a significant effect on this index, F(2, 
2) = 36.4, p<.05. A posteriori contrasts 
(Scheffé criterion, Winer, 1962, p. 88) indi- 
cate that subjects who watched the fear film 
showed the largest rises in heart rate; neutral 
and sad film subjects showed smaller rises and 
were similar on this index, F(1, 105) = 11.00, 
p < .05; the contrast accounts for 967% of the 
variation among the means of the film groups. 
None of the other main effects or interactions 
are statistically significant. 

Fall in heart rate. By subtracting the 
lowest heart rate from the index of baseline 
heart rate, we can find the largest fall in 
heart rate. There is a nearly significant effect 
for the film on this variable, F(2, 2) = 17.1, 
.10 > p > .05. An a posteriori contrast is 
similarly marginal: neutral film subjects show 
the largest drop in heart rate; fear and sad film 
subjects show similarly smaller drops, F(1, 
105) = 4.78, .10 > p > .05; the contrast ac- 
counts for 99% of the variation. There is also 
an effect for the facial instructions variable, 
F(3, 3) = 37.2, p < .05. Subjects in the non- 
emotional face condition show the smallest 
drop in heart rate, subjects in the unmanipu- 
lated face condition the largest. The interac- 
tion term used to test this effect for facial 
instruction is quite small (F < 1); with a 
pooled error term, the effect is no longer sig- 
nificant: F(3, 105) < 1, ns. None of the other 
effects is significant. 
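
The dependence of this result on the error term can be made concrete with a small numerical sketch. All sums of squares below are hypothetical and chosen only to reproduce the qualitative pattern (a large F against a very small interaction term, an F below 1 against the pooled term); the function names are assumptions, and pooling here simply adds the interaction and within-group sums of squares and their degrees of freedom.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F ratio of an effect mean square to an error mean square."""
    return (ss_effect / df_effect) / (ss_error / df_error)


def pooled_error(ss_interaction, df_interaction, ss_within, df_within):
    """Pool an interaction term with the within-group term."""
    return ss_interaction + ss_within, df_interaction + df_within


# Hypothetical values: facial instructions (3 df), Face X E interaction (3 df),
# and within-group variation (102 df).
ss_face, df_face = 30.0, 3
ss_face_x_e, df_face_x_e = 0.9, 3
ss_within, df_within = 1530.0, 102

f_vs_interaction = f_ratio(ss_face, df_face, ss_face_x_e, df_face_x_e)
ss_pooled, df_pooled = pooled_error(ss_face_x_e, df_face_x_e, ss_within, df_within)
f_vs_pooled = f_ratio(ss_face, df_face, ss_pooled, df_pooled)

print(round(f_vs_interaction, 2), round(f_vs_pooled, 2), df_pooled)
```

With an interaction mean square this small, the same effect mean square looks large against the interaction and negligible against the pooled error, which is the situation the text describes.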

Fall in GSR. Subtracting the lowest skin 
resistance from the baseline index yields a 
measure of overall change in skin resistance. 
Facial instruction condition has an effect on 
this variable: Subjects in the unmanipulated 
facial condition showed the largest drop in 
skin resistance, the fear instructions subjects 
the smallest. As with the facial instructions 
effect on fall in heart rate, the statistical sig- 
nificance of this is probably overestimated by 
the use of the Face X E interaction term as 
the error term: F(3,3) =4e490 = Ol; a 
pooled error yields F(3, 97) = 1.7, ns. The 
other effects do not reach statistical signif- 
icance. 

Number of galvanic skin responses. Each 
time skin resistance fell by 1,000 ohms, it was 
scored as a response. The film variable had a 
significant effect on the number of GSRs: F(2, 
2) = 57.1, p < .01. An a posteriori contrast 
is marginally significant: Subjects watching 
the fear film had more GSRs on the average 
than the sad and neutral film subjects, F(1, 
98) = 5.63, .10 > p > .05; the contrast ac- 
counts for 95% of the variation. 
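
The four physiological indices defined above can be sketched in code. This is one possible reading of the scoring rules, not the coders' actual procedure: the function names and the sample readings are hypothetical, resistance is assumed to be in thousands of ohms, and a skin response is counted whenever a continuous fall exceeds 1 (that is, 1,000 ohms) before the trace levels off or rises.

```python
def baseline(reading_10s_before, reading_5s_before):
    """Average of the readings 10 and 5 seconds before the baseline period ends."""
    return (reading_10s_before + reading_5s_before) / 2


def count_skin_responses(resistance_kohm, threshold=1.0):
    """Count continuous falls in skin resistance that exceed the threshold."""
    count = 0
    peak = prev = resistance_kohm[0]
    falling = False
    for value in resistance_kohm[1:]:
        if value < prev:
            falling = True  # resistance is still dropping
        else:
            # The fall has leveled off; score it if it was large enough.
            if falling and peak - prev > threshold:
                count += 1
            falling = False
            peak = value
        prev = value
    if falling and peak - prev > threshold:
        count += 1
    return count


def physiological_indices(base_hr, film_hr, base_sr, film_sr):
    """Rise and fall in heart rate, fall in skin resistance, number of responses."""
    return {
        "rise_in_hr": max(film_hr) - base_hr,
        "fall_in_hr": base_hr - min(film_hr),
        "fall_in_gsr": base_sr - min(film_sr),
        "n_gsr": count_skin_responses(film_sr),
    }


# Hypothetical readings for one subject.
indices = physiological_indices(
    base_hr=baseline(72, 74),
    film_hr=[70, 85, 78, 66],
    base_sr=baseline(100, 100),
    film_sr=[100, 99.5, 98.5, 98.5, 99, 97.5, 97.6],
)
print(indices)
```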

Summary of physiological effects. Three of 
the four physiological variables showed signif- 
icant film effects. For heart rate fall, sad and 
fear film subjects were similar, both being 
lower than neutral film subjects. For rise in 
heart rate and number of GSRs, fear film 
subjects were higher than sad and neutral 
film subjects. There was no significant effect for the 
film on largest fall in skin resistance, although 
the pattern was similar to that for heart rate 
(sad and fear subjects were similar and both 
were different from neutral film subjects). 
This pattern of different physiological signs 
for different emotional stimuli also appears in 
Table 4. The pattern of correlations between 
physiological variables and self-reported emo- 
tion differs for sadness and fear. 

Both fall in heart rate and fall in GSR 
show some evidence for facial instruction 
effects, although the statistical significance of 
Table 4 
Correlations of Physiological Variables with 
Facial and Self-Report Measures 

                  Self-report        Facial expression 
Item              Fear     Sad       Fear     Sad 
Fall in HR        -.05     -.10      .02      [?] 
Rise in HR        .16*     -.05      [?]      [?] 
Fall in GSR       -.05     .03       .08      -.08 
No. of GSRs       -.05     -.18*     -.02     -.14 
Maximum HR        .19*     -.08      .03      .02 
Minimum HR        .16*     .00       .01      -.01 
Lowest GSR        .06      .19*      -.07     -.06 

Note. HR = heart rate (change measured in beats 
per minute). GSR = galvanic skin response (change 
measured in thousands of ohms). 

* p < .05. 


the effects depends on the choice of the error 
term. For fall in HR, the nonemotional facial 
instructions subjects seem the most “aroused” 
(showing the smallest drop). This is in line 
with the hydraulic-cathartic view. It is the 
unmanipulated subjects, however, rather than 
the fear or sad face subjects, who show the 
least “arousal” (largest drop). Similarly, the 
fear face subjects show the smallest drop in 
skin resistance, again in line with the hy- 
draulic-cathartic view; however, this time the 
unmanipulated rather than the nonemotional 
subjects show most arousal. Table 4 offers 
further evidence of the traditionally weak and 
somewhat inconsistent relationship between 
facial expression and physiological response. 

Discussion 


The Facial Feedback Hypothesis 


On the assumption that the facial instruc- 
tions had their intended effect, the facial feed- 
back hypothesis receives three major setbacks 
from the evidence of this study. First, adopt- 
ing an emotional facial expression does not 
appear to be sufficient to produce the emotion. 
Even when there were no competing emotional 
stimuli from the film (i.e., for the neutral film 
subjects), manipulated facial expression did 
not produce significant differences in emo- 
tional responding. The trend for these sub- 
jects, insofar as there was a trend, was in the 
direction predicted by the facial feedback 
hypothesis. This trend appears to replicate 
Laird’s (1974) results: the differences he 
found were of about the same magnitude as 
those in our neutral film condition, and it is 
probably a safe assumption that his stimuli 
(still photographs) were less arousing than 
our fear and sadness films. 

Second, adopting a nonemotional expression 
does not prevent emotional responding; thus, 
emotional expression does not seem necessary 
for emotional feelings. Similarly, correcting 
for changes in facial expression statistically by 
analysis of covariance does not remove the 
effect of the film. Although this technique may 
be biased in the direction of undercorrection, 
it is hard to see how any statistical method 
based on the correlation between facial ex- 
pression and reported emotion could alter the 
film effect—the correlation is zero. 

Finally, this lack of correlation constitutes 
especially damaging evidence against the 
theory. Examination of the scatter plots of 
facial and self-reported fear and of facial and 
self-reported sadness does not provide obvious 
support for any monotonic relationship be- 
tween self-report and facial expression of 
emotion, let alone the linear relation mea- 
sured by the correlation coefficient. Thus, even 
a threshold version of the facial feedback 
hypothesis seems untenable. Lanzetta, Cart- 
wright-Smith, and Kleck’s (1976) finding of 
an effect of facial expression on feelings of 
pain does not seem to extend to feelings of 
fear or sadness. 

Even if the facial manipulation were un- 
successful, the facial feedback hypothesis 
would be difficult to maintain. The absence of 
any correlation between rated expression and 
reported emotion might buttress an argument 
based on failure of the facial manipulation. 
The convergence of the facial instructions 
with raters’ judgments could be explained 
away: raters might have recognized the in- 
tended expression even though subjects’ faces 
were poor reflections of the canonical fear or 
sad expression. Granting both of these argu- 
ments, we still must explain the absence of 
inhibiting effects for facial expression; even 
if the facial manipulation of fear, for example, 
were woefully inadequate to produce the 
canonical fear expression, it is difficult to be- 
lieve that it didn’t greatly interfere with the 
emergence of the sad expression, thus reduc- 
ing felt sadness. Finally, the existence of the 
film effect, its independence of the facial ex- 


pression, and its magnitude compared with 
that of any facial effect present difficulties 
for the facial feedback position. As to these 
relative magnitudes, our results are in com- 
plete agreement with those presented by Laird 
(1974). 


The lack of any correlation between facial 
expression and reported emotion is damaging 
not only to the rather strong and unqualified 
version of the facial feedback hypothesis 
tested in this experiment but also to the more 
elaborated, qualified theories proposed by 
Tomkins (1962) and by Izard (1971, 1977). 
Even if there are reafferent loops and even 
if the proprioceptive feedback along voluntary 
and involuntary pathways is recognizably dif- 
ferent, the theories ought to predict a gen- 
erally positive correlation. The unmanipulated 
face condition is especially relevant here, since 
there were no instructions to introduce poten- 
tially confusing voluntary feedback. In these 
conditions, all facial expression was sponta- 
neous, and the correlations between expression 
and reported emotion were still infinitesimal 
(r = -.01 for fear; r = .07 for sadness).⁶ 


Physiological Results 


It is possible that self-reported emotion 
may reflect the subject’s perception of the ex- 
pected effect of the film. Such demand char- 
acteristics are also relevant to Laird’s (1974) 
study and may account for the effects of the 
stimuli on self-reported emotion in both his 
study and ours. The effects of the film on 
physiological variables cannot be so easily 
accounted for by demand characteristics: 


6 In general the subjects in the unmanipulated face 
condition showed little overt facial response. It is 
possible that covert facial expressions, unobserved by 
our raters, did correlate with self-report of emotion 
(cf. Schwartz, Fair, Salt, Mandel, & Klerman, 1976). 
Thus it is still possible that covert involuntary muscle 
activity has some causal influence, although the dif- 
ficulties of separating this influence from the effect of 
the eliciting emotional stimuli are enormous. 
Table 5 
Mean Physiological Arousal for Nonemotional and Unmanipulated Face Subjects 
in Neutral Film Condition 

Item              Nonemotional    Unmanipulated    t(18)    p 
Number of GSRs    16.4            10.[?]           [?]      [?] 
Fall in GSR       68.6            38.5             [?]      [?] 
Rise in HR        18.0            13.4             1.53     [?] 
Fall in HR        6.8             12.6             -1.74    .10 

Note. n = 9 in the nonemotional face condition; n = 11 in the unmanipulated face condition. GSR = gal- 
vanic skin response (change measured in thousands of ohms). HR = heart rate (change measured in beats 
per minute). 


Subjects who watched the fear film showed 
generally greater “arousal” than subjects who 
watched the sad or neutral film. The pattern 
of rises and falls for subjects who watched 
the fear film is quite similar to the pattern 
reported by Ax (1953) for fearful subjects. 
For the heart rate variables, sad film subjects 
were intermediate between fear and neutral 
film subjects. However, sad film subjects 
showed even fewer GSRs than subjects who 
saw the neutral film. These data support the 
notion of different physiological patterns for 
different emotions. This position receives fur- 
ther support from correlations between phys- 
iological variables and self-reported emotion; 
again, sadness seems related to lower levels of 
arousal, fear to higher levels. These findings 
tend to render the demand characteristics 
account relatively less plausible. [...] 
subjects seem to show the physiological pat- 
tern for fear. These findings [...] 
render suspect the theoretical utility of the 
nonspecific arousal concept. (Of course, the 
mere existence of physiological [...] pat- 
terns does not guarantee that people [...], 
as the James-Lange theory asserts, 
[...] to their emotional state.) 


Facial and Physiological Variables 


The Freudian hydraulic model suggests that 
there are different channels for emotional ex- 
pression; as one channel is used more, the 
others are used less in releasing emotional 
energy. We might expect, according to [...] 


effect on physiological responding, although 
the statistical significance of the findings is 
dubious. For the two heart rate variables and 
the number of GSRs, nonemotional face sub- 
jects did, in line with the hydraulic model, 
show more physiological “arousal” than sub- 
jects who received the other facial instruc- 
tions. The results for unmanipulated subjects, 
whose facial emotion was more than that of 
the nonemotional face subjects but less than 
that of subjects posed with a fearful or sad 
expression, create difficulties for the hydraulic 
view. In general, the unmanipulated face sub- 
jects showed the least arousal.⁷ 
On the whole, these results are in line with 
an effort or concentration hypothesis. The 
nonemotional facial position, which required 
subjects to close one eye and puff out their 
cheeks, was probably the most difficult posi- 
tion to maintain and required a great deal of 
concentration. The unmanipulated face, of 
course, required no special effort at all. That 
concentration on a task can produce physio- 
logical changes has been amply demonstrated 
950; Lacey, Kagan, Lacey, & 
This mechanism can also ac- 


1970; Buck, Savin, Miller, & Caul, 
Buck, Miller, & Caul, 1 


(ABE: 
1 exception is fall in skin resistance. Unmanip- 
ute subjects had the highest skin resistance during 
the baseline od. They showed the largest drop 
i + despite this, their lowest skin resist- 
ed higher than that 


in the other groups. Some of their large 
ressi 
hibitors concentrate on inhibiting their facial 
expressions; this concentration produces 
changes in GSR. The effects of facial instruc- 
tions on physiological arousal are not entirely 
in line with the hydraulic model and are per- 
haps better explained by an effort or concen- 
tration mechanism. 

This hypothesis is, however, both tentative 
and post hoc. Although some of the facial 
effects are significant, others are not. Com- 
paring nonemotional and unmanipulated face 
subjects across all film conditions a posteriori, 
the only physiological measure that is signif- 
icantly higher in the nonemotional facial con- 
ditions is the number of GSRs, F(1, 98) = 
9.97, p < .05. It might be argued that the 
purest test of the effort hypothesis is in the 
neutral film condition, where effort is the 
major source of physiological arousal. Table 
5 shows the means for the four physiological 
variables for the nonemotional and unmanipu- 
lated face subjects who watched the neutral 
film. Testing all four physiological variables 
together, these two groups did not differ sig- 
nificantly, Hotelling’s T²(4, 15) = 9.9, p < 
.25. Although all the means differ in the direc- 
tion consistent with an effort hypothesis, they 
do not reach conventional levels of signif- 
icance. 
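
For readers more used to F tests, the multivariate statistic above can be converted with the standard two-sample relation between Hotelling's T² and F. The sketch below assumes that relation applies here, with the group sizes of 9 and 11 given in the note to Table 5 and the four physiological variables; the function name is hypothetical.

```python
def hotelling_t2_to_f(t_squared, n1, n2, p):
    """Convert a two-sample Hotelling T-squared to an F statistic.

    Returns the F value with its numerator and denominator degrees of freedom.
    """
    df1 = p
    df2 = n1 + n2 - p - 1
    if df2 <= 0:
        raise ValueError("need n1 + n2 > p + 1")
    f_value = t_squared * df2 / (p * (n1 + n2 - 2))
    return f_value, df1, df2


f_value, df1, df2 = hotelling_t2_to_f(9.9, n1=9, n2=11, p=4)
print(round(f_value, 4), df1, df2)
```

The degrees of freedom come out as 4 and 15, matching those reported, and the equivalent F is a little above 2.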


Summary 


In an area where counterintuitive theories 
and puzzling results seem the rule, our results 
seem to support a common sense theory. Emo- 
tional stimuli such as our films affect subjec- 
tive experience, facial expressions, and physio- 
logical processes associated with an emotion. 
The effect of the stimuli does not, as the facial 
feedback hypothesis predicts, depend on the 
facial response. Nor, as the nonspecific arousal 
theorists claim, are the physiological effects 
the same for all emotions (although people 
may not pay any attention to the differences). 
Finally, covering up an emotion facially may 
increase physiological responding, but this in- 
crease does not appear to result from the emo- 
tion’s having to “come out somewhere else” 
(as the hydraulic view would have). Instead, 
it seems plausible that the concentration re- 
quired to repress the outward expression of an 
emotion has an effect on physiological re- 
sponse. Besides supporting the common sense 
view, these results also support a general self- 
perception hypothesis. A variety of cues— 
facial, physiological, situational—may enter 
into the subjective experience of an emotion. 
Our results suggest that the situational cues 
receive the most weight. 
Deindividuation has been shown to relate to increases in antisocial behavior. 
Typical manipulations, however, have confounded deindividuation with the 
presence of negatively valenced cues, such cues being inherent in the costumes 
or situations used to produce deindividuation. The present study manipulated 
deindividuation and valence of costume cues in a 2 X 2 factorial design. Zim- 
bardo’s theory of deindividuation suggests that deindividuation should disinhibit 
antisocial behavior, independent of cue valence, and should reduce any influ- 
ence due to cues. Gergen, however, suggests that cues may have increasing 
influence, given deindividuation, and that deindividuation may increase prosocial 
behavior, given positive cues, and increase antisocial behavior, given negative 
cues. Results supported Gergen’s position. Given options to increase or decrease 
shock level received by a stranger, no main effect was found for deindividua- 
tion. There was a main effect for costume cues, and an interaction of cues with 
deindividuation, with deindividuation facilitating a significant increase in pro- 
social responses in the presence of positive cues and a nonsignificant increase 
in antisocial responses in the presence of negative cues. Also, cues interacted 
with trial blocks, prosocial behavior increasing with positive cues and antisocial 
behavior increasing with negative cues over trial blocks. 


The construct deindividuation was first sys- 
tematically investigated by Festinger, Pepi- 
tone, and Newcomb (1952). In their view, one 
consequence of an individual’s involvement 
and identification with a group is a reduction 
of individual responsibility for individual be- 
havior. They believed that deindividuation 
was a phenomenon on which groups would 
differ and that degree of deindividuation in a 
group could be indexed by the rate of failure 
of individuals to identify correctly the group 
members who had contributed different be- 
haviors. It was expected that previously in- 
hibited behaviors (i.e., saying negative things 
about one’s parents) would be disinhibited as 
a result of deindividuation. As predicted, 
groups scoring highest on deindividuation did 
say more negative things about their parents. 
This was, however, a correlational study in 
which the direction of causality between de- 
individuation and disinhibition is unclear. 
Later studies (Singer, Brush, & Lublin, 1965; 
Zimbardo, 1970) using lab coats and hoods to 
obscure the identifiability of individuals, have 
shown that such a manipulation increases dis- 
inhibition of socially undesirable behaviors 
(i.e., speaking obscene words and administer- 
ing electrical shock to another person). 


Theoretical Issues 


Exactly which behaviors will be disinhibited 
in a given situation is not yet clear. Nor is it 
clear how situational cues affect the type or 
amount of disinhibition that occurs. Two 
major positions on these issues have been ad- 
vanced. 


Copyright 1979 by the American Psychological Association, Inc, 0022-3514/79/3709-1532$00.75 
DEINDIVIDUATION AND VALENCE OF CUES 


Zimbardo’s view. In Zimbardo’s theory of 
deindividuation (1970), anonymity, along 
with other input variables, produces a state of 
the organism, deindividuation, that in turn 
produces a general disinhibition of previously 
inhibited behavior. Negative comments about 
parents, college women’s use of obscene words, 
and subjecting others to pain are all behaviors 
inhibited by prior experience, possibly by ex- 
pectation of punishment. Deindividuation dis- 
inhibits such behaviors by means of a general 
weakening of inhibitory mechanisms. Which 
behaviors will be disinhibited depends upon 
which ones have been inhibited. Inhibited be- 
haviors, once initiated, will tend to increase in 
frequency and intensity because they are in- 
trinsically reinforcing. According to this 
theory, because the real source of the be- 
havior is its intrinsically self-rewarding na- 
ture, deindividuation should lead to decreasing 
influence of external cues. 

These hypotheses were subjected to an ex- 
perimental test using what is perhaps the best- 
known deindividuation paradigm (Zimbardo, 
1970). Groups of subjects were individuated
(with identifying name tags) or deindivid- 
uated (with lab coats and hoods obscuring 
their identity) and were then given a sanc- 
tioned opportunity to administer electrical 
shocks to another person. Zimbardo has inter- 
preted the increased duration of shocks given 
by deindividuated subjects to be a result of 
the deindividuating experience disinhibiting 
aggressive behavior. 

As predicted, individuated subjects de- 
creased shock over trials for a “nice” target 
person and increased shock for an “ob- 
noxious” target, whereas deindividuated sub- 
jects increased shock for both targets. Also, 
shock duration and negativity of ratings of the 
target person were significantly correlated for 
individuated subjects, r = .67, but not for
deindividuated subjects, r = .10. Contrary to
prediction, however, the deindividuated sub- 
jects’ average shock duration was higher for 
the “obnoxious” than for the “nice” target, 
whereas during the first half of the trials, in- 
dividuated subjects actually shocked the 
“nice” target more than the “obnoxious” one. 
Deindividuated subjects thus seemed at least 
initially more “appropriately” responsive to
target characteristics than were individuated 
subjects. Thus evidence for Zimbardo’s pre- 
diction of reduced influence of situational 
cues, given deindividuation, is at best equi- 
vocal. 

Gergen’s view. The prediction that anti- 
social behavior is most likely to be disin- 
hibited by deindividuation was questioned by 
Gergen, Gergen, and Barton (1973). In their 
study, darkness- and anonymity-induced de- 
individuation led not to increased aggression 
but to increases in touching, caressing, and 
other affectionate behaviors. These workers 
advanced the notion that either prosocial or 
antisocial behavior could be enhanced by de- 
individuation, depending upon valence of 
situational cues. In this view, the darkness 
manipulation may have been more suggestive 
of intimacy than of aggression, hence an- 
onymity-induced deindividuation increased 
the frequency of intimacy behaviors. 


Experimental Confounds 


A more important question is whether the 
disinhibition observed in prior research was 
in fact due to deindividuation and not to some 
unintentionally manipulated variable. One
such variable relates to the cue value of the 
deindividuating costumes, which may be rem- 
iniscent of Ku Klux Klan outfits or perhaps 
of some Halloween ghouls, either of which 
might be considered cues eliciting aggression 
(Berkowitz, 1974). It is noteworthy that even 
without differential identifiability, Berkowitz 
and his colleagues have demonstrated in- 
creased duration of shocks to a target person 
due to the presence of cues previously associ- 
ated with aggression (Berkowitz, 1974). A 
similar line of reasoning could serve as an 
alternative explanation for differential levels 
of the use of profanity in the Singer et al. 
(1965) study. Old clothes (deindividuated 
condition) may have provided cues for low- 
ered restraints against obscene language, 
whereas dressy clothes (individuated condi- 
tion) may have provided cues to a number of 
learned associations of verbal restraint and 
propriety. Finally, the dark chamber used by 
Gergen et al. (1973) to deindividuate subjects 
may have provided cues (e.g., darkness, the
awareness of the presence of others through 
the sense of smell, the sounds of breathing, 
and the sense of touch, even if at first acci- 
dental) suggestive of intimacy and thus facil- 
itating intimate behavior. In this study, how- 
ever, a control condition indicated that dark- 
ness cues did not disinhibit intimacy if 
subjects were not anonymous. 

The major question addressed by the pres- 
ent study is whether the direction of behavior 
change induced by nonidentifiability is influ- 
enced by the valence of situational cues. Given 
an influence of cues, Zimbardo’s (1970) 
theory would predict that nonidentifiability 
will disinhibit aggressive behavior and that 
any effect of cues should be less than the effect 
for identifiable subjects. On the other hand, 
the reasoning of Gergen et al. (1973) predicts 
that nonidentifiability will lead to increases 
in antisocial behavior if antisocial cues are 
present and to increases in prosocial behavior 
if prosocial cues are present. A third possibil- 
ity is that anonymity per se has no effect, but 
that prior demonstrations of anonymity effects 
actually resulted from confounded differences 
in situational cues. 
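
The three possibilities can be restated as directional patterns over the four cells of a cues by identifiability design. The following Python sketch is illustrative only: the function name `classify_pattern`, the tolerance `tol`, and the simplified decision rules are assumptions introduced here, not part of the original study, and real data would call for inferential tests rather than raw comparisons of means.

```python
def classify_pattern(pro_ind, pro_deind, anti_ind, anti_deind, tol=0.0):
    """Label a 2 x 2 table of mean aggression scores (higher = more antisocial).

    Arguments are cell means for prosocial/antisocial cues crossed with
    identifiable (ind) and nonidentifiable (deind) conditions. The rules
    are a deliberately simplified reading of the three positions.
    """
    # Effect of nonidentifiability within each cue condition.
    shift_pro = pro_deind - pro_ind
    shift_anti = anti_deind - anti_ind
    # Effect of cues within each identifiability condition.
    cue_ind = anti_ind - pro_ind
    cue_deind = anti_deind - pro_deind

    # Third possibility: anonymity per se changes nothing.
    if abs(shift_pro) <= tol and abs(shift_anti) <= tol:
        return "cues only"
    # Zimbardo: more aggression everywhere, with a weaker cue effect
    # for nonidentifiable subjects.
    if shift_pro > tol and shift_anti > tol and cue_deind < cue_ind:
        return "Zimbardo"
    # Gergen et al.: nonidentifiability pushes behavior toward the cue.
    if shift_pro < -tol and shift_anti > tol:
        return "Gergen"
    return "unclassified"
```

The numbers passed to the function in any usage would be hypothetical cell means; the point is only that the three accounts make distinguishable predictions about the sign of each simple effect.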


Method 
Subjects 


Sixty female subjects were recruited from intro- 
ductory psychology, sociology, and child develop- 
ment courses to participate in a study described as 
concerned with changes in group evaluations of a 
stranger. Most subjects were given extra credit in 
their course as an incentive to participate. 


Procedure 


Subjects were randomly assigned, 15 to each of 
four conditions, in a 2 × 2 factorial design that
manipulated individuation (identifiable) versus de- 
individuation (nonidentifiable) and prosocial versus 
antisocial cues. Each subject was informed by phone, 
at the time she was recruited, that the experiment 
was to be run in groups of four, and that it was 
therefore very important for her to appear on time.
Upon arrival, subjects were informed that each stu- 
dent had been asked to report to a different room in 
order to preclude interaction with others prior to 
the experiment. It was also explained that Polaroid 
pictures of the other group members would be used 
to establish the essential feeling of being part of a 
group and that disguises would be worn in the pictures
to obscure individual difference characteristics
that might be influential. In fact, although as many 
as four subjects were run in each session, each sub- 
ject was treated completely independently of the 
others, and all subjects at any one time were in 
different treatment conditions. No subject had any 
contact with any other subject during the experi- 
mental session. 
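
Random assignment of 60 subjects to four cells of 15 can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `assign_conditions` and the `seed` parameter are hypothetical, and the sketch ignores the additional constraint that subjects run in the same session were in different conditions.

```python
import random

def assign_conditions(subject_ids, per_cell=15, seed=None):
    """Randomly assign subjects to a 2 x 2 design with equal cell sizes."""
    cells = [
        (identifiability, cue)
        for identifiability in ("individuated", "deindividuated")
        for cue in ("prosocial", "antisocial")
    ]
    if len(subject_ids) != per_cell * len(cells):
        raise ValueError("number of subjects must equal per_cell * 4")
    # One slot per subject, per_cell slots for each condition.
    slots = [cell for cell in cells for _ in range(per_cell)]
    rng = random.Random(seed)
    rng.shuffle(slots)
    return dict(zip(subject_ids, slots))
```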

It was explained to each subject that the paid 
nonstudent male volunteer whom they would be 
evaluating was taking part in a verbal learning experiment
and that it was important for them to become
involved with the stranger. Involvement was
to be established by their participation in selecting 
the level of shock that this person received for 
failure to respond correctly in the learning task. 
Subjects were instructed that it was an exploratory 
study on the effect of arousal on learning and that 
the experimenters did not know what effect different 
shock levels would have. Following each error the 
learner would be given a shock the base level of 
which could be increased or decreased by the re- 
sponses of the subjects. The actual shock level re- 
ceived following an error would be the base level 
adjusted by the average adjustment selected by the 
four group members. Each subject would select 
high, moderate, or slight increases (+3, +2, +1) or 
decreases (−3, −2, −1) in the shock to be administered.
It was explained that following each trial
the subject would see on her console the shock selec- 
tions of the other three subjects plus her own, that 
these would be averaged to determine the level of 
shock increase or decrease to be administered on that 
trial, and that no record would be kept of individual 
responses but only of the group average for each 
trial. Thus all subjects’ responses, regardless of con- 
dition, were to be nonidentifiable to the experimenter 
and to any others who might see the data. The 
“learning task” ended when the subject had made 
15 errors. The number of correct responses made 
prior to the 15th error was the measure of successful 
learning that subjects were to try to facilitate by 
their shock selections. 
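
The shock rule described to subjects (a base level adjusted by the mean of four selections) can be written out as a short Python sketch. The function names and the numeric base level are hypothetical; the text does not report the units or the value of the base level, and no shock was actually delivered according to this rule.

```python
ALLOWED_SELECTIONS = {-3, -2, -1, 1, 2, 3}

def group_adjustment(selections):
    """Mean adjustment chosen by the four group members on one trial."""
    if len(selections) != 4:
        raise ValueError("expected one selection from each of four members")
    if any(s not in ALLOWED_SELECTIONS for s in selections):
        raise ValueError("selections must be -3, -2, -1, +1, +2, or +3")
    return sum(selections) / len(selections)

def shock_level(base_level, selections):
    """Base level plus the group's average adjustment (as told to subjects)."""
    return base_level + group_adjustment(selections)
```

For example, selections of +3, +1, −2, and +2 average to +1, so a hypothetical base level of 5 would become 6 on that trial.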

Cue manipulation. The costume manipulation of 
cues was produced by having each subject wear either 
a robe resembling those of the Ku Klux Klan or a
nurse’s uniform. The ostensible purpose of the costumes
was to obscure individual differences. The
nature of the specific costume given to each subject
was presented as an accident of convenience (i.e.,
“I’m not much of a seamstress; this thing came out
looking kind of Ku Klux Klannish,” or “I was fortunate
the hospital recovery room let me borrow
these nurses’ gowns to use in the study”). A Polaroid
picture was taken of each subject in her costume, and
pictures of others in the group, in similar costumes,
were attached to the subjects’ consoles. Each subject
was told that copies of her picture had been placed
on the consoles of others in the group.

Deindividuation manipulation. In the individuation
condition, consoles were labeled so that each subject
could identify the shock level set by each person
in the group and the person by whom it was set.
Also, large name tags were attached to the costumes
of the individuals in the Polaroid pictures. In the
deindividuation condition, pictures of others in costume
were attached to subject consoles, but no name
tags were worn, and subjects were provided with no
means of identifying the person who made any given
response.

To facilitate subjects’ involvement with the target, 
they were allowed to see and hear the “learner” 
being interviewed. His behavior in the interview was 
designed to be obnoxious and distasteful to the sub- 
jects. A negative target was used in order to minimize 
the inhibitions against shocking another person, thus 
allowing for variability on this response dimension. 
In fact, subjects proved not to be too inhibited to 
use the shock increase option, yet the target was 
not so negative as to preclude use of the shock decrease
option.

Following the interview, subjects filled out a pre- 
liminary evaluation of the confederate. Learning 
trials were then begun, errors being indicated by the 
lights on the subjects’ consoles, signaling to them to 
select +3, +2, or +1 levels of increase or −1, −2,
or −3 levels of decrease in intensity of the shock the
confederate was to receive. After all subjects in the
group had responded, feedback of the choices of
others was displayed and shock was supposedly administered.
Feedback was preprogrammed to average
0 across each of three blocks of trials. Following the
learning task, subjects were asked to fill out a postexperimental
questionnaire, after which they were
probed for suspicion, debriefed, asked not to divulge
the deceptions to others, and dismissed. 


Results 


Evaluation of the Stranger 


Five bipolar scales for evaluating the 
stranger were administered immediately fol- 
lowing the interview, prior to the learning 
task. The experimental conditions did not sig- 
nificantly differ from each other, the means 
for each condition being on the insincere, dis- 
honest, cold, phony, and unkind side of the 
midpoint. It appears that the confederate was 
perceived to be obnoxious, as planned, in all
conditions. 


Manipulation Checks


Perception of cues. Four bipolar scales on
the postexperimental questionnaire were used
to assess subjects’ perception of costume cues.
As intended, the Ku Klux Klan costumes were
rated as significantly more tough, harmful,
unkind, and cold than were the nurses’ costumes,
p < .01, for each scale. Composite
totals of responses to these four scales indicated
that the Ku Klux Klan costumes were
perceived as significantly more negative than
the nurses’ costumes, F(1, 56) = 24.95, p <
.01. This difference reflects the negative ratings
of the Ku Klux Klan costumes, M =
5.59, versus the slightly positive ratings of the
nurses’ costumes, M = 3.71 on a 1 (extremely
compassionate) to 7 (extremely aggressive)
scale.


Table 1
Mean Shock Selection as a Function of Cue and
Deindividuation Conditions

                       Condition
Cue           Individuated   Deindividuated
Prosocial        −.35_a         −1.47_b
Antisocial       +.76_c          +.95_c

Note. Possible range of scores was from −3 (the
prosocial choice of maximally reducing shock level)
to +3 (the antisocial choice of maximally increasing
shock level). Means without a common subscript
were significantly different from each other (p
< .01).

Perceived sense of deindividuation. As expected,
deindividuated subjects, compared to
individuated subjects, indicated on the post- 
experimental questionnaire that it would be 
more difficult to identify shock selections of 
other individuals in their group, F(1, 56) = 
15.84, p < .01, and that it would be more 
difficult to distinguish members of their 
group from nongroup members following the 
experiment, F(1, 56) = 19.55, p < .01. These 
measures suggest that in fact deindividuation 
was manipulated as intended. 


Shock Selections 


The primary dependent variable, shock 
selection, was analyzed by a 2 × 2 × 3 analysis
of variance, with individuation versus deindividuation
and prosocial versus antisocial cues as
between-subjects factors and trial blocks as a
within-subjects factor. This analysis revealed a significant main
effect for cues, F(1, 56) = 46.28, p < .01,
with shock decrease the mean response for 
prosocial cues and shock increase the mean 
response for antisocial cues. The main effect 
for individuation versus deindividuation was 
not significant, F(1, 56) = 2.92, but there
was a significant Cues × Deindividuation interaction,
F(1, 56) = 8.21, p < .01. The interaction
(see Table 1) is a result of a
larger effect of cues on behavior in deindividuated
than in individuated conditions.
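
The form of this interaction can be made concrete by computing the cue effect within each identifiability condition from the cell means in Table 1. The Python sketch below is illustrative only: the names `TABLE_1_MEANS` and `cue_effects` are introduced here, the individuated antisocial mean of .76 is an assumed reading that should be checked against the published table, and the contrasts are descriptive differences of means, not the analysis of variance reported in the text.

```python
# Cell means keyed by (cue, identifiability); the .76 entry is an assumed reading.
TABLE_1_MEANS = {
    ("prosocial", "individuated"): -0.35,
    ("prosocial", "deindividuated"): -1.47,
    ("antisocial", "individuated"): 0.76,
    ("antisocial", "deindividuated"): 0.95,
}

def cue_effects(means):
    """Return the cue effect in each condition and their difference."""
    cue_ind = (means[("antisocial", "individuated")]
               - means[("prosocial", "individuated")])
    cue_deind = (means[("antisocial", "deindividuated")]
                 - means[("prosocial", "deindividuated")])
    # The interaction contrast is the difference between the two cue effects.
    interaction = cue_deind - cue_ind
    return round(cue_ind, 2), round(cue_deind, 2), round(interaction, 2)
```

Under these assumed values the antisocial minus prosocial difference is larger for deindividuated than for individuated subjects, which is the pattern the interaction describes.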

A Newman-Keuls test revealed that each 
condition was significantly different from each 
other condition, each at the p < .01 level, 
except the simple effect of deindividuation 
within antisocial cues. This comparison is 
essentially a conceptual replication of Zim- 
bardo (1970). It was directionally consistent 
with that effect but only achieved p < .15
even by a one-tailed simple t test.

No significant main effect occurred for trial 
blocks, F(2, 112) = 1.31, or for the inter- 
action of trial blocks with deindividuation, 
F(2, 112) = 1.85. There was, however, a significant
Trial Blocks × Cues interaction, F(2,
112) = 19.78, p < .01, resulting from the
tendency for shock levels to increase in the 
antisocial cues condition and to decrease in 
the prosocial cues condition, from the first to 
the second of three blocks. 


Discussion 


The experimental manipulations appear to 
have been effective. The prosocial costumes, 
though closer to neutral than had been in- 
tended, were rated as significantly less nega- 
tive than the antisocial costumes. The manip- 
ulation of deindividuation was assessed by 
self-reports of ability to identify behaviors of 
specific group members (cf. Festinger et al., 
1952) and perceived anonymity of group 
members (cf. Zimbardo, 1970). Both mea- 
sures demonstrated that deindividuation was 
manipulated as intended. 

Of primary interest in this research were 
the effects of cues and of deindividuation on 
shock selection. In the presence of Ku Klux 
Klan costume cues, subjects were likely to in- 
crease shock levels, whereas in the presence 
of prosocial cues, subjects were likely to de- 
crease shock levels. This finding by itself sug- 
gests alternative interpretations of the Singer 
et al. (1965) study and the Zimbardo (1970) 
study presented in this paper’s introduction. 
In those studies costume cues were completely
confounded with the manipulations of dein- 
dividuation. It is entirely possible that the in- 
creased antisocial behavior observed in those 
studies (i.e., frequency of obscene words and 
duration of electrical shocks) was a function 
of the costume cue manipulation alone and de- 
pended less on anonymity than has been 
widely believed. 

In contrast to the previous research, the 
present study allows us to look at the effects 
of deindividuation independent of costume 
cues. According to Zimbardo, (a) deindividua- 
tion should have resulted in more aggressive, 
antisocial behavior for both costume varia- 
tions, and (b) any effect of cues should have 
been attenuated as a function of deindividua- 
tion. According to Gergen et al. (1973), how- 
ever, cues should interact with deindividua- 
tion, deindividuation leading to more anti- 
social behavior in the presence of antisocial 
cues and to more prosocial behavior in the 
presence of prosocial cues. The interaction 
obtained in the present study (see Table 1) is 
consistent with Gergen’s position. 


Theoretical Implications 


Antecedents of deindividuation. Concep- 
tually, most theorists (cf. Zimbardo, 1970) 
have viewed deindividuation as a state of the 
organism that can be induced by a variety of 
input variables, including but not limited to 
anonymity. Experimentally, however, deindi- 
viduation has nearly always been manipulated 
by varying some aspect of identifiability. This 
is true of the research discussed in the intro- 
duction and is true of the present research. 
More recently (cf. Diener, in press), deindividuation
has been manipulated by means of
complexes of input variables including a sense
of group unity, group cohesiveness, group responsibility,
and even kinesthetic feedback
from physical activity. These inputs may in
fact induce a sense of anonymity, but very
likely they have additional influences beyond 
those related to identifiability. 

The state of deindividuation. Confirmation
of the relationship between anonymity
and disinhibition is not sufficient to support
the construct of deindividuation as a mediating
state of the organism, and attempts to
validate the construct have been largely unsupportive
(Diener et al., 1975; Diener, in
press). Anonymity-induced disinhibition may
not require a reduction in the subjective sense 
of individuation but in many instances could 
reflect a simple reduction in perceived nega- 
tive sanctions, hence a disinhibition of be- 
havior previously suppressed by such sanc- 
tions. Possibly environmental cues, like de- 
mand characteristics, influence the subjects’ 
interpretation of the types of prior sanction 
that have been lifted. 

Individuation and objective self-awareness. 
Conceptually, a strong parallel seems to exist 
between Zimbardo’s (1970) constructs of in- 
dividuation versus deindividuation and the 
Duval and Wicklund (1972) constructs of
objective versus subjective self-awareness. For 
example, both individuation and objective 
self-awareness imply focusing of attention in- 
ward, toward the self. Recent research has 
shown that a typical manipulation of objec- 
tive self-awareness does in fact lead to an in- 
creased sense of individuation (Ickes, Layden, 
& Barnes, 1978). Also, Diener (in press) has 
demonstrated a reduction in self-awareness 
resulting from a broad-based group involve- 
ment manipulation of deindividuation but has 
found this to be not equivalent to lack of self- 
awareness induced in a nonsocial way. This 
does not imply that anonymity will not yield 
a lack of self-awareness comparable to that of 
typical objective self-awareness manipulations, 
for Diener’s manipulations of deindividuation 
included much more than anonymity. 

Objective self-awareness and situational 
cues. The state of objective self-awareness 
has been shown to increase the influence of 
internalized standards for correct or appropri- 
ate behavior (Carver, 1975; Scheier, Fenig- 
stein, & Buss, 1974). Conversely, one might 
expect objective self-awareness to reduce the 
influence of external, situational cues. If an- 
onymity-induced deindividuation is compar- 
able to reduced objective self-awareness, the 
anonymity should increase the relative influence
of external cues. Basically, we are suggesting
that reduced objective self-awareness,
anonymity-induced deindividuation, and prob- 
ably group involvement-induced deindividua- 
tion all lead to an outwardly focused attention 
with increased salience of and influence due 
to concurrent situational factors such as the 
cue manipulation in the present study. Also, 
individuation and objective self-awareness 
should focus attention inward, increasing the 
salience of and influence due to internalized 
standards for behavior (cf. Carver, 1975). 

Under some circumstances, internalized 
standards may themselves be altered by con- 
current stimulus information (cf. Carver, 
1974), in which case the influence of such 
external stimuli may be increased by objective 
self-awareness. Consequently, statements as to 
whether anonymity will lead to increased or 
decreased influence of external cues may need 
to be qualified to take into account how or 
whether those cues affect a change in internal- 
ized standards for behavior. 
Social and Emotional Messages of Smiling:
An Ethological Approach 


Robert E. Kraut and Robert E. Johnston 


Cornell University 


Did smiling evolve as an expression of happiness, friendliness, or both? Nat- 
uralistic observation at a bowling alley (N = 1,793 balls) shows that bowlers 
often smile when socially engaged, looking at and talking to others, but not 
necessarily after scoring a spare or a strike. In a second study, bowlers (N = 166 
balls) rarely smiled while facing the pins but often smiled when facing their 
friends. At a hockey game, fans (N = 3,726 faces) smiled both when they
were socially involved and after events favorable to their team. Pedestrians 
(N = 663) were much more likely to smile when talking but only slightly 
more likely to smile in response to nice weather than to unpleasant weather. 
These four studies suggest a strong and robust association of smiling with a 
social motivation and an erratic association with emotional experience. 


Everyday experience suggests that smiling 
is one of the most common nonverbal signals 
used in communication among humans. De- 
spite this, and despite more than 100 years 
of research on facial expressions, we still know 
relatively little about the causation of smiling 
and its social functions. In this article we attempt
to provide evidence about the causation
of smiling in social settings and to raise some 
neglected questions about the analysis of fa- 
cial expressions in general. 

Research and thought on the facial expres- 
sion of emotion has had a checkered history 
since the publication of Darwin’s The Expres- 
sion of the Emotions in Man and Animals in 
1872, as has been documented by a number 
of excellent reviews (Ekman, 1973; Ekman,
Friesen, & Ellsworth, 1972; Izard, 1971), but 
recently several of the perennial questions in 
this field have been settled. In a variety of 
studies researchers have shown that people 
can consensually and accurately recognize at 
least six emotional expressions from pictures 
of faces (e.g., Ekman, 1972; Ekman et al., 
1972; Tomkins & McCarter, 1964) and that 
these abilities seem to be universal among 
humans (Eibl-Eibesfeldt, 1972 and 1973; Ek- 
man, 1973; Izard, 1971; Vinacke, 1949; 
Vinacke & Fong, 1955). Ekman (1972) has 
proposed a general model for the factors in- 
fluencing facial expression that we take as 
representative of this “emotional expression 
approach” to facial expressions, and we ex- 
plicate this model in more detail below for 
the particular case of smiling. Tomkins’ 
(1962) and Izard’s (1971) positions emphasize
the influence of facial expressions on emotional
experience more than Ekman’s approach
does but are otherwise similar.


The Emotional Expression Approach: 
Smiling as the Expression of Happiness 


According to the emotional expression view, 
a smile is the major component of a facial dis- 
play associated with and caused by feelings 
of happiness or joy. Anything that makes a
person feel good or happy should produce 
smiling unless the individual wants to mask 
or inhibit this display. Laughing is considered 
to be the expression of either more intense 
happiness (Darwin, 1872/1965) or a par- 
ticular type of happiness (Ekman & Friesen, 
1975). Cultural and individual differences in- 
fluence smiling both by determining the inter- 
pretation of events, which affects the cause of 
happiness, and by shaping display rules, which 
determine when it is socially appropriate to 
smile. But such differences do not alter peo- 
ple’s innate and universal tendency to smile 
when they are happy. Thus, when a smile does 
occur, the message is usually happiness (Ek- 
man & Friesen, 1975), although this may be 
a false message if the sender is masking an- 
other emotion with a smile or if the sender is 
simulating happiness for some other reason. 

It is important to note that although 
workers in this tradition have emphasized the 
importance of facial expression in communica- 
tion and social behavior, they have rarely 
studied such communication in natural social 
settings by studying the causes and conse- 
quences of smiling; rather, they have focused 
on the recognition and verbal labeling of emo- 
tions in facial expressions, generally in still 
photographs. 


Ethological Studies: 
Smiling Indicates Friendliness 


A different paradigm has of necessity been 
used by ethologists studying nonhuman pri- 
mates. These workers have used naturalistic 
observation as a research tool and have drawn 
many of their hypotheses about humans from 
comparisons of humans with other primates; 
they have concentrated on the proximate 
causes of smiling, its consequences for the 
immediate interaction, and its evolutionary 
functions.

Many nonhuman primates have a submissive
facial display, called a grimace, a grin,
or a silent bared-teeth face. The display resembles
the human smile, and in all species in
which it occurs, it seems to have the function 
of deflecting hostile behavior of more dom- 
inant animals (Hooff, 1962). In a detailed 
study of chimpanzees, Hooff (1972, 1973)
showed that in addition to averting attacks, 
variations of this display were used to main- 
tain or increase affiliative behaviors between 
individuals. In some circumstances dominant 
animals used one variety of the bared-teeth 
face to reassure subordinates of their nonhos- 
tile, affiliative intent. Hooff hypothesized that 
the human smile is evolutionarily related to the 
chimpanzee’s bared-teeth displays and serves 
the same functions of deflecting hostility and 
maintaining friendly contact. On the other 
hand, according to Hooff, laughter evolved 
independently and is related to the primate 
“play face.” 

If human smiling is a behavioral homologue 
of chimpanzee bared-teeth displays, one 
would expect smiling to occur most in face-to- 
face interaction, especially where friendly in- 
tent is problematic or where social bonds are 
being established or renewed. The smiler’s 
motivation may be genuine friendliness or an 
intent to establish friendly relations. Re- 
searchers in this ethological tradition have not 
been concerned with the emotions or feelings 
experienced by those doing the smiling. 

The two approaches outlined above are not 
necessarily incompatible, but they have talked 
past each other by using different method- 
ologies on different species to ask different 
questions. The present article tries to compare 
predictions based on smiling as the expression 
of happiness with those based on smiling as 
an indication of friendliness, since the most 
straightforward extrapolations from each position
do lead to different predictions about the
causes of smiling in social settings. We asked 
about the motivational state of the smiler and 
the conditions under which smiling is produced,
and we used naturalistic observation
as our methodology. We chose public settings
in which we could observe people’s faces, in
which the two theoretical orientations would
predict that smiling would occur frequently,
and in which we could distinguish the two
theories. Thus, for a setting to be relevant to
the emotional orientation, emotionally arousing
and happiness-producing events had to be
frequent. To be relevant to the social orientation,
social interaction had to be frequent.


Study 1:
Naturalistic Observations of Bowling


Watching bowlers is an excellent way to
distinguish the social and emotional hypotheses
about the causation of smiling. Observations
of the progress of the game and the actions
of friends and teammates provide evidence
about external events that might cause
smiles, and observations of the bowlers’ behaviors
as they are smiling provide evidence
about their motivational states.

According to the emotional hypothesis, 
bowlers should smile whenever they feel 
happy, for example, immediately following a 
spare or strike. But according to the social 
hypothesis, smiling should occur during social 
interaction, and the score obtained should be 
irrelevant. 

To some extent, the entire game of bowling 
is played in a social context. Bowlers gen- 
erally play with several friends or teammates 
with whom they talk and drink between turns 
and who shout encouragement, taunts, and 
insults after the play (see White, 1955, for a 
lively description). Yet during the game, the 
extent to which bowlers are engaged in social 
interaction varies greatly from moment to 
moment. When bowlers are facing the pins, 
preparing to roll the ball and watching the 
outcome, social interaction is minimal. It increases
when they turn to face friends and is
greatest when they are engaged in face-to-face
interactions in close proximity to them. The
social hypothesis predicts that smiling should
occur most during these bouts of more intense
social involvement. 


Method 


The bowling alley in which we made observations 
had 36 lanes. The lanes were set up in pairs with a 
scoring table centered about 4.5 meters behind the
foul line and a semicircular bench defining a pit for 
bowlers waiting their turn. Behind the pit and a 
guardrail was a gallery with small tables at which 
spectators and bowlers might sit, watch the game,
and drink. 

Although there were many variations, most bowlers 
followed a predictable sequence. They would arise 
from a seat in the pit, select a ball from the ball re- 
turn located between the two lanes, approach and 
release the ball at the foul line, stand or back up
while facing the pins to see the outcome of the roll,
and return to their seats or leave the pit,
occasionally talking to people on the way.

We made observations of bowlers who appeared
to be at least 18 years old. Lanes to observe were
arbitrarily chosen from among those with an open
spectator table located about 7 meters behind the
foul line. If no spectator tables were available, observers
arbitrarily selected a lane and observed while
leaning against a wall. An observer recorded behavior
from all bowlers using a selected lane for a maximum
of 20 rolls of the ball. At the end
of 20 rolls, the observer moved to another arbitrarily
chosen lane. Most bowlers were unaware that they
were being observed by us. Those bowlers who asked
were told that we were watching their technique and
their score.

The observer recorded verbal and nonverbal data 
about bowlers from the time they turned to face the 
pit after bowling until they turned away to pick up 
the ball after the first roll or returned to their seat, 
left the pit, or turned away from the observer after 
the second roll. The mean length of an observation 
period for a sample of 120 rolls was 5.5 seconds. A 
pattern of body and facial behavior believed by the 
observer to have occurred simultaneously was re- 
corded as a unit. If an element in the display changed 
or if the same pattern was held for what seemed a 
very long time to the observer, the observer recorded 
another unit. 

Our observational study of bowlers was replicated 
three times, using slightly different lists of behaviors, 
different observers, and different recording techniques.
Over all replications we observed 1,793 rolls of the
ball, based on approximately 350 different bowlers.
Observers were trained by watching and recording
from videotapes of people bowling. In the first
replication, observers were five students in a seminar on
human ethology and the two authors. We were aware
of the hypotheses of the study and recorded be- 
haviors using a brief notational scheme with pencil 
and paper. In the second replication, two observers 
naive to the hypotheses of the study watched 550 
rolls of the ball and recorded behaviors by speaking 
code names of the behaviors into portable tape 
recorders, pausing between behaviors in order to 
indicate unit boundaries. Using this technique,
observers didn’t have to remember behavioral sequences
as long and didn’t have to look away from the
research subjects to record data; they were therefore
able to record more behaviors and units per ob- 
servational period. 

In the third replication we videotaped 155 rolls of 
the ball and made detailed analyses of these tapes.
Recordings were taken with bowlers’ awareness and
permission. We set up the lights and camera at the
spectator table and pretended to film for 5 minutes
prior to the actual filming to acclimate subjects to
the equipment. Two observers recorded data from
these tapes. As soon as both eyes of the subject could
be clearly seen, the observers stopped the tape, re- 
corded behaviors, hand-turned the tape one quarter- 
turn, recorded behaviors, and repeated this sequence 
until the subject moved out of focus or turned so 
that only one eye could be seen. Using this method, 
data were recorded at approximately two-thirds-of- 
a-second intervals. After this stop motion analysis, 
the tapes were played at least once at normal speed 
to validate the behaviors recorded. 

We did no reliability checks on the live recording. 
On the analysis of the videotaped sequences the two 
observers agreed on 97% of the behaviors that either 
recorded. 
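
The agreement figure can be read as the share of all behaviors recorded by either observer that both observers recorded. A minimal Python sketch of that reading follows; the function name and the example codes are hypothetical, and the text does not state the exact computation that was used.

```python
def percent_agreement(observer_a, observer_b):
    """Share of behaviors recorded by either observer that both recorded."""
    a, b = set(observer_a), set(observer_b)
    either = a | b
    if not either:
        raise ValueError("neither observer recorded any behavior")
    return len(a & b) / len(either)


# Hypothetical codes: (frame index, behavior) pairs noted by each observer.
obs_1 = {(1, "look"), (1, "closed smile"), (2, "talk"), (3, "groom")}
obs_2 = {(1, "look"), (1, "closed smile"), (2, "talk"), (3, "tight lips")}

# Three codes are shared out of five recorded by either observer.
print(percent_agreement(obs_1, obs_2))
```

Under this reading, a behavior noted by only one observer lowers agreement, whereas behaviors that both observers missed do not enter the figure at all.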

We recorded the following behaviors because prior 
research suggested that they might be communicative 
and because pretesting showed that they had occurred 
with sufficient frequency to allow meaningful statis- 
tical analyses: 

Neutral face: blank expression; mouth relaxed; 
head straight forward; absence of all other coded 
behaviors except talked to, groom, and good score. 

Closed smile: corners of the mouth turned up; lips 
together; teeth together. 

Open smile: corners of mouth turned up; lips are 
parted to show teeth. In Replication 1 only, smiling 
was coded without distinguishing between closed- and 
open-mouth smiles. 

Laugh: mouth open, corners of mouth sometimes 
turned up; laughter accompanying open mouth. 

Tight lips: lips compressed tightly; mouth in 
straight line; teeth probably clenched. 

Look: gaze fixed on another person. 

Look down: gaze directed at floor. 

Look away: head turned off body axis, gaze not
directed at others in group. 

Talk: vocalization by subject to another person. 

Talked to: vocalization directed to the subject. 

Groom: subject preening, scratching, or rubbing 
any part of his body or face. 

Face cover: one or both hands covering facial
features for several seconds. This was not recorded in
Replication 2.

Head shake: a continuous horizontal movement of 
the head, usually repeated several times. 

Ham: a nonspontaneous, exaggerated facial or body 
expression apparently intended to communicate; for 
example, “funny” faces, sticking out tongue, wrinkling 
nose, jumping up and down, dancing a “jig.” 

Positive exclamations: spontaneous and often exag- 
gerated actions or words that the observer believed 
corresponded to a pleasant experience or one during 
which the individual felt pleasurably excited. For 
example, a leap and squeal of joy after a strike. 

Negative exclamations: spontaneous and often
exaggerated actions or words that the observer
believed corresponded to an unpleasant experience. For
example, swinging the fist across one’s body, stamping
one’s foot, or swearing after missing the pins.
Exclamations were more spontaneous, less exaggerated,
and shorter, and appeared to be less intentionally
communicative than hamming, with which they might be
confused.

Good score: a strike or spare.


Sequential Analysis 


In order to understand the causes of smiling,
it is necessary to examine the events with
which it is temporally associated. If smiling
often followed an external event in the
smiler’s environment, it is possible and even
likely that the event caused the smile. If
smilers often performed other behaviors while
smiling, it is likely that the motivation
underlying the other behaviors is also underlying
the smiling.

Whether we consider two events or behaviors
temporally associated depends on the
time unit we select; the meaning of simultaneous
depends on temporal resolution. In
Replication 1 we have considered two behaviors
as temporally associated or co-occurring
if the original observer thought that they
had occurred simultaneously, that is, placed
them in the same behavioral unit, or if the
observer placed them in adjacent behavioral
units. Replication 2, using tape recorded data
collection, and Replication 3, using analysis
of videotapes, had finer temporal resolution.
Therefore, to make all data analyses comparable,
we considered two behaviors as co-occurring
if they occurred within 3 behavioral
units of each other in Replication 2 and
within 8 behavioral units of each other in
Replication 3. This means that we considered
two behaviors as temporally associated if they
occurred within about 4 seconds of each other.

Because the frequency of co-occurrence is
biased by the frequency of each behavior,
Yule’s Q or gamma is the appropriate measure
of association to use. We use this statistic
rather than the more familiar phi correlation
because we considered a positive or negative
association between two behaviors as perfect
if the more frequent behavior always or never
occurred with the less frequent behavior
(Blalock, 1972, pp. 298-299).¹
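
The computation described in Footnote 1 can be sketched in Python as follows. The function and argument names are illustrative assumptions, as are the example counts; only the cell definitions (A, B = Ant − A, C = Sub − A, D = T − A − B − C) and the ratio itself come from the footnote.

```python
def yules_q(a, ant, sub, total):
    """Yule's Q (gamma) from the counts defined in Footnote 1.

    a     -- frequency of co-occurrence (A)
    ant   -- frequency of the antecedent behavior (Ant)
    sub   -- groupings containing the subsequent behavior (Sub)
    total -- total behavioral units (T)
    """
    b = ant - a
    c = sub - a
    d = total - a - b - c
    denominator = a * d + b * c
    if denominator == 0:
        raise ValueError("gamma is undefined when A*D + B*C is zero")
    return (a * d - b * c) / denominator


def mean_gamma(x_then_y, y_then_x):
    """Average the gammas for the two orderings of a pair of behaviors."""
    return (yules_q(*x_then_y) + yules_q(*y_then_x)) / 2


# Hypothetical counts, not taken from the study.
print(yules_q(a=30, ant=40, sub=60, total=200))

# The less frequent behavior always accompanies the more frequent one,
# so B = 0 and the association is perfect (Q = 1).
print(yules_q(a=40, ant=40, sub=60, total=200))
```

The second call illustrates why the statistic suits this design: Q reaches 1 whenever one off-diagonal cell is empty, even if the two behaviors differ greatly in frequency, whereas phi would not.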


¹ The formula for gamma in this research is as
follows. Let A = the frequency of co-occurrence.
T = total behavioral units; this is the total number of
opportunities for co-occurrence. Ant = the frequency

Results


The data were stable, and the co-occurrence
of smiling with other behaviors was similar
across the three replications. The mean Pearson
correlation between columns in Table 1
is .80 (p < .001).

The social hypothesis leads us to expect 
that bowlers will smile when they are socially 
motivated independently of emotional experi- 
ence. Table 1 provides evidence of this. 
Bowlers were more likely to smile when they
were engaged in social contact, for example,
when looking at their friends (M γ = .71),
and less likely to smile when they were
temporarily avoiding social contact for some
reason and were looking at the ground or away
from their friends (M γ = −.56). In addition,
bowlers showed a tendency to smile more
when they were talking to friends or being
talked to by them (M γ = .20).

Smiling, especially open-mouth smiling, also
occurred with playful behaviors: laughter (M
γ = .23 over all smiles; M γ = .80 for
open-mouth smiles), hamming (M γ = .33 for all
smiles; M γ = .47 for open-mouth smiles),
and face-covering (M γ = .34 for all smiles;
M γ = .93 for open-mouth smiling). At the
bowling alleys, open-mouth smiling occurred 
when bowlers were being socially playful. 
Since nonsocial playfulness rarely occurred 
in this setting, we cannot tell if the social or 
the playful motivation was a more important 
determinant of open-mouth smiling. 

The happiness hypothesis leads us to expect 
that bowlers would smile more after playing 
well and bowling a good score, but this was 
not the case. The association between smiling 


of the antecedent behavior; this is the maximum
number of times it could have co-occurred with a
subsequent behavior. Sub = number of behavioral
groupings (i.e., two behavioral units in Replication 1,
three units in Replication 2, and eight units in
Replication 3) in which a subsequent behavior occurred;
this is the maximum number of times it could have
co-occurred with an antecedent behavior. Then B =
Ant − A; C = Sub − A; D = T − A − B − C; γ =
[(A × D) − (B × C)]/[(A × D) + (B × C)]. Since
the gamma based on Behavior X preceding Behavior
Y is not necessarily the same as Y preceding X, we
used the mean of these two measures in all our
analyses.

and scoring a spare or strike was weak (M
γ = .13). Of the 1,793 rolls for which we
collected data, 448 were spares or strikes.
Bowlers smiled 30% of the time after good 
scores and 23% otherwise. Other support for 
the happiness hypothesis is mixed. In general, 
smiling showed weak associations with the
behaviors that we had identified as subtle
indicators of negative affect and stronger
associations with larger scale emotional displays.
Thus, smiling had little association with the
subtle cues of grooming and head shaking.
Smiling had a substantial negative association
with the tight-lip display, which our
observations had led us to believe was an indicator
of anger, frustration, and perhaps determination.
It is a component in the traditional anger
display (Ekman & Friesen, 1975). However,
the negative association could have been 
partly caused by the physical difficulty a 
bowler would have smiling and lip pressing 
simultaneously (and the difficulty coders 
would have distinguishing two behaviors in 
the mouth region) as well as by the emo- 
tional incompatibility of smiling and lip 
pressing. 

On the other hand, the spontaneous emo- 
tional displays we have termed positive and 
negative exclamations showed substantial 
associations with smiling. When bowlers were 
communicating surprise and glee, generally 
after getting a good score, they smiled (M
γ = .49); when they swore and showed other
signs of disappointment and anger, generally
at not getting a good score, they failed to
smile (M γ = −.49). Since smiling showed
no association with score, which is the likely 
cause of bowlers’ emotional experience, these 
data suggest that when bowlers were attempt- 
ing to communicate their happiness through 
positive exclamations, they used smiles as 
part of the communication; when they were 
experiencing positive emotions but were not 
attempting to communicate them, however, 
smiling did not covary with other subtle be- 
havioral indicators of their emotional state. 

Several problems limit confidence in the
conclusions one can draw from these co-occurrence
data. As we have suggested above, the
degree of temporal co-occurrence between behaviors
may reflect as much their physical
compatibility as their similarity in underlying
motivation or external causation. For example,
since smiling, laughing, talking, and
compressing one’s lips are all behaviors done
by the mouth, their mutual co-occurrence is
limited. Similarly, the neutral display, by
definition, cannot co-occur with other behaviors.
A second problem could be a result
of observer errors. Faced with a rich and complex
event, observers’ errors tended to be
omissions; our training procedures showed
that observers were more likely to ignore behaviors
that did occur than to record behaviors
that did not occur. What might be
termed a climax error may be characteristic of
event sequences in which observers underreport
low intensity forms of a behavior that
gradually change to a climax form of the same
or another behavior. For example, it is possible
that smiling might be underreported in
the sequences leading to laughter, hamming,
and positive exclamations, while tight lips and
grooming might be underreported in the
sequences leading to negative exclamations.

Table 1
Temporal Associations (Gamma) of Smiling and Other Behaviors

                      Replication 1  Replication 2    Replication 3
                           All       Closed   Open    Closed   Open    M all
Behavior                  smiles     smile    smile   smile    smile   smiles    t*

Neutral face               −11        −52      −88     −70      [?]     [?]    10.5
Closed smile               [?]        [?]      [?]     [?]      [?]     [?]     [?]
Laugh                      −48         07       91     −06       70      23     [?]
Ham                         20         08       44      43       50      33    4.09
Look at                     70         63       88      53       83      71   11.14
Look down                  −59        −48      −82     −40      −66     −59    8.10
Look away                  −59        [?]      [?]     [?]      [?]     −52   13.95
Talked to                   42         05       44      37      −26      20    1.50
Talk                        37         08       41      07      −02      18    2.09
Groom                      −47         01      −26     −14       04     −16    1.75
Tight lip                  −79        −52      −68     −33      −63     −59    7.55
Face cover                  28          -        -     −18       93      34    1.07
Head shake                 −18         17      −07      12       15      04     [?]
Positive exclamation        24         23       71      72       56      49     [?]
Negative exclamation       −65        −59      −40     −16      −64     −49     [?]
Good score                  07         37       29      01      −09      13     [?]

Note. Decimal points have been omitted.
* In this table significance tests have not been performed on the individual gammas, since the units on
which they are based were not independent of each other. They are based on 5,527 overlapping 4-second
periods in which two behaviors could co-occur, spread over 1,793 rolls of the ball and over approximately
350 different bowlers. The t tests performed on the mean gammas are based on five observations and test
whether smiling is associated with the other behaviors reliably over the different replications and types
of smiles.

A third problem is more conceptual. Our 
research strategy has been to infer the mes- 
sages of smiling by examining other behaviors 
with which it occurs, that is, marker be- 
haviors. Our beliefs about the significance of 
these other behaviors were based on the prior 
literature on nonverbal communication, on 
informal observation in our research setting, 
and on intuition. More systematic analysis of 
the patterns of co-occurrences among marker 
behaviors could provide direct evidence on the 
significance of marker behaviors and might
provide deeper insight into the messages of
smiling.


Principal Components Analysis 


One partial solution to these problems is to
analyze the similarities in co-occurrence that
pairs of behaviors had with other behaviors
rather than analyzing only the temporal
association between pairs. For example, if lip
tightening and closed-mouth smiling were
equivalent but alternative behaviors, an
analysis of temporal association would show that
they never co-occurred, but an analysis of
their similarities in co-occurrence would show
that they occurred in exactly the same
contexts. Following Hooff’s logic (1973), we
started with the assumption that to the extent 
that a pair of behaviors occurred in the same 
contexts, that is, to the extent that they had 
similar co-occurrences with other behaviors, 
they also shared underlying motivations or 
external causes. The Pearson correlation be- 
tween rows in a matrix of gamma scores (our 
measure of co-occurrence) is one measure of
the extent to which the pair of behaviors
represented by the rows had similar co-occurrences
with each other behavior, including
co-occurrences with themselves. Because the
three replications of our basic study resulted
in three estimates of the co-occurrences
between each pair of behaviors, we took the
median of these three as our best estimate of
the co-occurrence. The Pearson correlation
between rows was conducted on this matrix.²
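
The difference between temporal association and similarity in co-occurrence can be illustrated with a small Python sketch. The gamma values and behavior labels below are invented for illustration, the helper name is hypothetical, and the rows are simplified (they omit each behavior's association with itself); the sketch covers only the row-correlation step, not the factor analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented gammas: each row holds one behavior's association with the
# four context behaviors named in `columns`.
columns = ["context 1", "context 2", "context 3", "context 4"]
gammas = {
    "behavior X": [0.60, 0.10, -0.50, 0.00],
    "behavior Y": [0.55, 0.05, -0.45, 0.05],
    "behavior Z": [-0.40, 0.30, 0.50, -0.20],
}

# Correlate every pair of rows; a high r means the two behaviors
# occurred in similar contexts, whether or not they co-occurred.
behaviors = list(gammas)
for i, first in enumerate(behaviors):
    for second in behaviors[i + 1:]:
        r = pearson(gammas[first], gammas[second])
        print(f"{first} vs {second}: r = {r:.2f}")
```

In this toy matrix, behaviors X and Y have nearly identical rows and therefore correlate highly, which is the pattern the row correlation is meant to detect even for behaviors that never occur together.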

One can factor analyze this correlation
matrix. Table 2 is a varimax rotation of the
principal components analysis of this matrix.
The five factors that emerged prior to an 
eigenvalue falling below 1.0 represent 86% 
of the variance in the original matrix. An ex- 
amination of the factor structure provides 
evidence both about the pattern of co-occur- 
rences and the underlying motivations repre- 
sented by the marker behaviors and about the 
association of closed- and open-mouth smiling 
with these underlying motivations. 

On what factors did smiling load highly? 
Factor 3 seems to represent a social motiva- 
tion. Talking, being talked to, looking at an- 
other, and not looking down or away all load 
highly on this factor. Closed-mouth smiling
has its highest loading on this factor, and in 
addition, open-mouth smiling and laughing 
both loaded highly here. This result supports 
the previous analysis and clearly suggests 
the social motivation underlying both closed- 
and open-mouth smiling and laughing.

Table 2
Component Analysis of Similarities in Co-occurrence

                                    Factor
Behavior                  1      2      3      4      5

Tight lip                [?]    [?]    [?]    [?]    [?]
Head shake               [?]    [?]    [?]    [?]    [?]
Laugh                     84    −00     31     18    −30
Face cover                75     27     05     08    [?]
Open smile                60     32    [?]    [?]    [?]
Good score                18     92    [?]    [?]    −19
Negative exclamation     −23    [?]     09    −19    [?]
Positive exclamation     −08     70     30     54     06
Talk                      13    −10     92    [?]    [?]
Talked to                 12     11     87    −19     08
Closed smile              10     44     64    −16     29
Look at                   31     41     63    [?]     47
Look down                −57    −14    −61    [?]    −05
Ham                       08     08     27    [?]     15
Groom                    −34    −05    −08     75    −14
Neutral face             −25     24    −07     17    [?]
Look away                −09    −22    −47     16     67

Note. Decimal points have been omitted. Loadings
greater than or equal to 60 are shown in italics.


Smiling shows a complex pattern of relationships
with the emotional displays. Factor
2 represents the explosive emotional displays
associated with extreme scores: good scores
and positive exclamation loaded highly, and
negative exclamations loaded negatively. The
association of closed-mouth smiles with this
set of behaviors may reflect the role of
closed-mouth smiling in the communication of
emotion. Factor 1 seems to represent the contrast
between affect and playfulness. The tight lip,
head-shake, and looking-down displays cluster
at one pole, and laughter, face covering, and
open-mouth smiling cluster at the other. This
factor may reflect the incompatibility of
genuine playfulness with feeling bad.

² In Replication 1, closed-mouth smiles were not
distinguished from open-mouth smiles and looking
down was not distinguished from looking away. In
computing the matrix of median gammas, the
undifferentiated categories of Replication 1 were
generally averaged with each differentiated category of
Replications 2 and 3. For example, the association
between smiling and grooming in Replication 1
contributed in the final matrix to both the association
of closed-mouth smiling and grooming and the
association of open-mouth smiling and grooming.
However, in a few cases closed and open smiles and
looking down and away had substantially different
associations with another behavior in both Replications
2 and 3 (mean difference in gamma > .30). In these
cases, Replication 1 was ignored in computing the
final measure of association. Specifically, Replication
1 was ignored in computing the median relationship
of closed- and open-mouth smiling to laughter and
to face cover.

Factors 4 and 5 are especially difficult to 
interpret. Factor 4 primarily represents the 
contrast between hamming and positive ex- 
clamations, on the one hand, and grooming, 
on the other. Factor 5 may indicate expres- 
siveness. The neutral and looking-away dis- 
plays are at one extreme, and open-mouth 
smiling and face covering are at the other.

In summary, both the temporal co-occurrence
of smiling with other behaviors and the
factor analysis of similarities of co-occurrence
suggest that bowlers smiled when they were
being social, when they were being playful, or
when they were otherwise communicating an
emotional statement to an audience. Both
closed-mouth and open-mouth smiling shared
a nonemotional, social motivation. In addition,
open-mouth smiling was similar to laughter in
adding a playful motivation; to the extent
that one cannot be playful and at the same
time distressed, angry, or disappointed,
open-mouth smiling was incompatible with negative
affect.


Study 2: Bowlers Facing the Pins or 
Their Companions 


One might dismiss our failure to find a 
strong association between smiling and emo- 
tional experience in Study 1 with the claim 
that bowlers were masking their emotional ex- 
periences; they hid the joy or disappointment 
they felt in order to appear modest or sports- 
manlike. To meet this objection we made ob- 
servations of people who were bowling alone 
and were therefore under no pressure to mask 
their emotions. Lone bowlers rarely showed
any facial displays or other gestures but
instead maintained a generally neutral face. The
most common expressions seen were relatively
antisocial or negative: looking down, tight
lips, and negative exclamations; they rarely
smiled. However, one may object to these
data, since people who bowl alone may be the
type of person who is unexpressive in all
circumstances.

Table 3
Percentage of Bowlers Smiling According to
Bowling Score and Social Focus

                          Score
Social focus      Good    Not good    Total

Yes                42        28         31
No                  4         3          3
Total              46ᵃ       31ᵇ         -

Note. N = 116.
ᵃ n = 26.
ᵇ n = 90.

Therefore, to further examine the social and 
emotional messages of smiling, we looked at 
bowlers when they were facing the bowling 
pins and reacting to their rolls, and at the 
same bowlers when they turned to face their 
friends. The social hypothesis would lead us 
to expect very little smiling as bowlers faced 
the pins, since this is a relatively nonsocial 
setting. The emotional hypothesis would lead 
us to expect that bowlers would smile after 
rolling a good score and would engage in be- 
haviors revealing negative affect after bad 
scores, regardless of their social orientations.
When bowlers roll a good score and remain 
facing the pins, they should feel happy and 
should not need to mask or hide the expression 
of this emotion, since they believe that they 
are not being observed. 


Method 


An observer knelt on a platform among the pin 
setting equipment at the end of the bowling alley 
behind the bowling pins and watched bowlers 
through binoculars as they finished their roll. The 
observer was 19.2 meters from the bowlers and ob- 
served through a narrow slit in the facade of the pin 
setting equipment. The observer was invisible to 
bowlers. As bowlers finished their roll and stepped 
back while watching its outcome, the observer re- 
corded characteristics of the bowler and the behaviors 
listed above. Data were recorded from the moment 
the bowler stepped into view until he turned toward
the pit. Simultaneously, a second observer positioned
as in Study 1 recorded behaviors in the standard way
after the bowler turned to face friends in the pit.
One hundred and sixteen rolls from 34 bowlers were
observed from both positions (behind the pins and
facing the pit).


Results 


As shown in Table 3, bowlers were gen- 
erally unexpressive while facing the pins, in 
comparison with their behavior when they 
faced their friends in the pit. Smiles were
particularly rare. In 116 observations, bowlers
smiled 36 times when facing friends but only
4 times while facing the pins, t(115) = 6.25,
p < .001. Smiling was unrelated to how well
they bowled. Only one of the pin-facing smiles 
was after a good score, although bowlers in 
this sample rolled 26 strikes or spares. These 
data clearly support the social hypothesis. 
People rarely smiled in nonsocial settings, re- 
gardless of emotional experience. 


Study 3: Smiling by Fans at a Hockey Game


Hockey at Cornell University is probably 
the most important school sport. Students, 
faculty, and townspeople line up overnight to 
get season tickets. In 1977, Cornell’s team had 
finished its regular season first in the Ivy 
League and was in the playoffs for the Eastern 
College Athletic Conference Championships. 
On March 8, 1977 the team faced Rensselaer 
Polytechnic Institute in the quarter-final at 
Cornell. As might be expected, the game was 
sold out, mainly to enthusiastic Cornell fans. 
Given this situation, the emotional hypothesis 
about the causation of smiling would predict 
that Cornell fans would smile more when the 
hockey match was going well for Cornell than 
when it was going well for the opposing team. 
On the other hand, the social hypothesis 
would predict that fans would smile more 
when socially interacting with other fans than
when watching the game.


Method 


A photographer sat in the stands during the hockey
game and, using a telephoto lens, took pictures of
the spectators across the ice from him (about 30
meters away) immediately following events favorable,
neutral, or unfavorable to Cornell’s chances of
winning the hockey game. About 220 people were
photographed at each exposure (M = 223; SD =
21.8). The photographer used a prefocused camera
on a tripod and took pictures while watching the
hockey game in order to be uninfluenced by the
behavior of his subjects as he was photographing them.
Photographs were taken after (a) goals for Cornell
or the opposing team, (b) penalties called on Cornell
or the opposing team, (c) face-offs, before the puck
came into play, and (d) time-killing passes of the
puck at center ice. Cornell goals and opposing-team
penalties were considered events favorable to Cornell,
the opponent’s goals and Cornell penalties were
considered unfavorable, and face-offs and center-ice
passes were considered neutral. The section of the
stands that was repeatedly photographed mainly
contained season ticket holders and was filled with
Cornell fans. There were few opposing fans in the
entire arena, and they were segregated in another
area.

Table 4
Percentage of Spectators Smiling According to
Valence of Hockey and Social Involvement

                     Valence of play for home fans
Social               Good          Not good          Total
involvement        %      n        %      n        %      n

Yes               27      59      22     165      23     224
No                12   1,258       2   2,244       6   3,502
Total             13   1,317       3   2,409       7   3,726

Each transparency was coded by a person naive to 
both the hypothesis of the study and the events pre- 
ceding the photograph. Five randomly selected trans- 
parencies were then recoded by a second person to 
check reliabilities. The transparencies were first
scanned for social units, which were defined as a 
group of two or more spectators, at least one of 
whom was turned towards the other or others in the 
group. The two independent coders agreed on 73% 
of the social units that either identified. Each member 
of a social unit was defined to be socially involved 
with others in that unit, and all other spectators 
were defined to be socially uninvolved. The trans- 
parencies were next coded for smiling. The two coders 
agreed on 73% of the smiles that either identified,
69% for smiles within social units, and 74% for
other smiles.


Results 


Because the results involving neutral and 
bad events didn’t differ substantially from 
each other, they have been combined in the 
analyses that follow. Table 4 shows the percentage
of fans smiling who were or were not
part of a social unit following events that
were good or not good for the Cornell hockey
team.

We coded a total of 3,726 faces in 16 
photographs. Over all photographs, the prob- 
ability of any one of these faces smiling at 
the moment that the picture was taken was 
.069. Some data supported the emotional
hypothesis. Spectators were more likely to smile
following events favorable to the home team
(13%) than following neutral or bad events
(4%; γ = .60; N = 3,726; p < .001). Some
data supported the social hypothesis. Regardless
of the valence of the events that preceded
the photograph, spectators were more likely to
smile if they were members of social units
(23%) than if they were not (6%; γ = .66;
N = 3,726; p < .001). The effects of being
socially involved on smiling were stronger
following bad and neutral events (γ = .85)
than following good ones (γ = .45, p < .01),
perhaps because the base rate of smiling was 
lower following bad events. 


Study 4: Smiling, Social Interactions, 
and the Weather 


People feel happier on days when the 
weather is nice. Given this assumption, the 
emotional orientation would lead us to expect 
that people walking outdoors would smile 
more in pleasant than unpleasant weather. As
they walk, they can either be socially in- 
volved with someone or not. The social orien- 
tation would lead us to expect that people 
would smile more if they are socially involved 
but that the weather alone should not influ- 
ence the frequency of smiling. 


Method 


A single observer made observations of pedestrians
twice at each of four public walkways in Ithaca,
New York, in September and October 1977. Each site
was observed once during pleasant weather, when
the temperature was between 50 and 70° F and the
sky was sunny or partly sunny, and once during
unpleasant rainy weather. Pairs of observations were
made at approximately the same time of day and
were made within 3 weeks of each other. A total of
663 subjects were observed in the eight observation
periods.

Table 5
Percentage of Pedestrians Smiling According to
the Weather and Social Interaction

                            Weather
Social               Good           Bad           Total
interaction        %      n       %      n       %      n

Yes               61      61     57      60     59     121
No                12     264      5     288      8     552
Total             21     325     14     348      -       -


At the site, the observer selected two reference 
marks about 10 meters apart on the sidewalk to 
indicate the limits of observation. The next pedestrian 
approaching the starting mark was selected as a sub- 
ject. If pedestrians were walking in pairs or larger 
groups, the person on the extreme left or the person 
on the extreme right was alternately selected as the 
subject for that group, before the group passed the
starting mark. Subjects were observed for the 8 to 
12 seconds it took them to walk between the two 
reference marks. If they talked to anyone, greeted 
anyone, or were talked to by anyone at any time 
during the observation period, they were classified as 
socially interacting during that period. If they 
smiled at any time during the observation period, 
they were classified as smiling during the observation 
period. Reliability of the smile classification, based on 
a second observer's judgments of 100 pedestrians, 
was high (φ = .86).


Results 


Table 5 shows the cross classification of 
social interaction and smiling among pedes- 
trians during days with good and bad 
weather. These results show that pedestrians
were no more likely to engage in social
interaction on nice days than on bad ones (19% vs.
17%, φ = .02). They were slightly more
likely to smile on nice days than on bad ones
(21% vs. 14%, φ = .10, p < .01) and were
very much more likely to smile if they were
conversing with or greeting someone than if
they were not (59% vs. 8%, φ = .54, p <
.001). The effect of social interaction on smiling
was slightly stronger on bad weather days
than on good days (φ = .58 vs. .47, z =
−1.41, p < .10), but this interaction was
minor compared to the main effect of social 
interaction and may not replicate. 
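
The association between social interaction and smiling can be checked roughly against Table 5. The following is a minimal sketch, not the authors' computation: the cell counts are reconstructed here from the rounded percentages and group sizes in the table (an assumption), and the function name `phi_coefficient` is illustrative only.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table [[a, b], [c, d]].

    Rows are the two groups; columns are smiling / not smiling.
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denom

# Counts reconstructed from Table 5 (assumption: percentages were rounded
# to whole numbers, so these are approximations of the original cells).
interacting_n, alone_n = 121, 552
interacting_smiling = round(0.59 * interacting_n)  # about 71 pedestrians
alone_smiling = round(0.08 * alone_n)              # about 44 pedestrians

phi = phi_coefficient(
    interacting_smiling, interacting_n - interacting_smiling,
    alone_smiling, alone_n - alone_smiling,
)
print(f"phi (social interaction x smiling) ~ {phi:.2f}")
```

With these reconstructed counts the coefficient comes out at roughly .5, in the neighborhood of the reported value; the remaining gap is what one would expect from rounding in the table.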

We conducted an extensive pretest in which
1,489 subjects were observed at six sites on
the Cornell campus in the summer of 1977.
Good weather was defined as between 65° F
and 75° F with low humidity, whereas bad
weather was over 85° F with high humidity.
Subjects were observed as long as they
remained in sight, about 10-20 seconds, and
observation times were not standardized
across sites. In this pretest, subjects were
again slightly more likely to smile on nice
days than on unpleasant ones (φ = .13, p <
.01) and were much more likely to smile
while socially engaged than when not (φ =
.56, p < .001). Here, however, social interaction
had similar effects on good weather and
bad weather days (φ = .58 vs. .54, z = .89,
p > .10).

In both comparisons, social interaction was 
thus a much more powerful predictor of smil- 
ing than was the weather. In the main study, 
variations in social interaction accounted for 
29% of the variance in smiling, whereas 
variations in the weather and the positive and 
negative emotions they may have produced 
accounted for about 1% of the variance in 
smiling. 
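
The variance figures follow from squaring the association coefficients reported for the main study (.54 for social interaction, .10 for weather). A minimal sketch of that arithmetic, with `variance_explained` as an illustrative name rather than anything from the original analysis:

```python
def variance_explained(phi):
    """Share of variance in one dichotomy accounted for by the other (phi squared)."""
    return phi ** 2

for label, phi in [("social interaction", 0.54), ("weather", 0.10)]:
    print(f"{label}: {variance_explained(phi):.0%} of the variance in smiling")
# social interaction: 29% of the variance in smiling
# weather: 1% of the variance in smiling
```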


Discussion 


The Social Hypothesis 


Both the present and earlier research pro- 
vide strong evidence that social involvement 
is a major cause of smiling, independent of 
the smiler’s emotional state. In each of the 
four studies described above, smiling was 
strongly associated with social interaction: 
talking to and looking at others in Study 1,
facing fellow bowlers in Study 2, orienting 
toward other fans in Study 3, and talking to 
another person in Study 4. Other researchers 
have also found that smiling often occurs in 
a social context. Almost from its first appear- 
ance, smiling is socially produced and has 
social consequences. Human infants, from the
age of 1 to 5 months, smile most in response 
to the human voice and the human face or 
abstractions of it (Sroufe and Waters, 1976). 
These smiles seem to be a major determinant 
of the bond between an infant and its care- 
taker (Fraiberg, 1977; Spitz & Wolf, 1946). 
Among nursery school children, smiles are
likely to occur in the context of other social
behaviors such as pointing, giving, receiving,
and talking (Blurton-Jones, 1972). When 
nursery school children approach a stranger, 
they often smile, and they are more likely to 
approach when the stranger smiles and talks 
to them (Connolly & Smith, 1972). In addi- 
tion, the smile appears to be a universal com- 
ponent of greetings (Kendon & Ferber, 1973; 
Eibl-Eibesfeldt, 1972). Even when people 
smile in response to humorous or other non- 
social stimuli, they smile more in the presence 
of other people than when they are alone 
(Mackey, 1976; Leventhal & Mace, 1970).

It could be argued that positing a separate 
social cause for smiling is a theoretical extrav- 
agance. According to this view, the presence 
of others is just one of the many events that 
make people happy, and apparent social 
smiling is mediated by the pleasant emotions 
the smiler feels in the presence of others. This 
is undoubtedly true on occasion.

However, there are several reasons to be- 
lieve that many smiles have purely social 
causes independent of happiness. The first is 
parsimony. We need not assume that hap- 
piness is a cause when we have evidence of a 
strong relationship between social stimuli and 
smiling, but no evidence of a mediating emo- 
tion. A theory of friendliness displays, based 
on independent comparative data, can ac- 
count for the empirical relationship. The com- 
ponent analysis of Study 1 showed a social, 
nonemotional motivation behind some smiling 
(Factor 3 in Table 2): both closed- and open- 
mouth smiling occurred in the same contexts 
as each of the social behaviors we recorded at 
a bowling alley. These in turn were inde- 
pendent of both gross and subtle emotional 
displays of happiness, disappointment, and 
anger. This evidence of purely social smiling 
exists even though smiling has an emotional 
component under some circumstances (Fac- 
tors 1 and 2 in Table 2). 

The existence of smiles in uncomfortable 
social settings is further evidence against the 
hypothesis that social smiles are mediated by 
happiness. Repeated viewings of some of our 
bowling videotapes convinced us that some 
smiling was done to apologize for an especially
clumsy performance or for poor bowling,
such as dropping the ball before bowling
or bowling a gutter ball immediately after
bowling a strike. To examine this possibility
more systematically, we have started looking
at facial expressions in uncomfortable social
settings. If smiling evolved from primate
appeasement displays, we would expect that
in humans some smiling should occur when a 
person is trying to placate or appease an- 
other, for example when he or she has made 
a mistake or has violated a social norm and 
is apologizing for it. An exploratory field ex- 
periment suggests that subjects smile more 
in an appeasement than in a control condi- 
tion. In the appeasement condition, customers 
in a store were made to think that they had 
made a mistake when they interrupted a clerk 
busy with paperwork who told them “I’m not 
working here. She (another clerk) will help.” 
In the control condition, customers were made 
to think that no mistake had been made 
(“Fine, I'll get it for you, and she will ring it 
up”). Customers apologized more in the ap- 
peasement (11%) than the control (0%) 
condition. More to the point, customers 
smiled more in the appeasement condition 
(28%), when they were presumably trying 
to rectify a mistake, than in the control
condition (5%, γ = .77, n = 99, p < .001).³
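
The size of the appeasement effect can be approximated from the two smiling rates alone. This is a sketch under a stated assumption: the reported statistic is read here as Goodman and Kruskal's gamma, which for a 2 x 2 table reduces to Yule's Q; the source does not spell out the formula, and `yules_q` is an illustrative name.

```python
def yules_q(p_first, p_second):
    """Yule's Q (gamma for a 2 x 2 table) from two group proportions.

    Q = (ad - bc) / (ad + bc), where a, b are the smiling / not-smiling
    shares in the first group and c, d the shares in the second.
    Group sizes cancel out of the ratio, so proportions suffice.
    """
    concordant = p_first * (1 - p_second)
    discordant = (1 - p_first) * p_second
    if concordant + discordant == 0:
        raise ValueError("Q is undefined when both products are zero")
    return (concordant - discordant) / (concordant + discordant)

# Smiling rates quoted in the text: 28% (appeasement) vs. 5% (control).
q = yules_q(0.28, 0.05)
print(f"Q ~ {q:.2f}")
```

Because the percentages are rounded, the result should be expected to land close to, but not exactly on, the published value.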

Ekman (1972) also notes that people often 
smile when experiencing an unpleasant emo- 
tion in the presence of others, although he 
interprets the smile as a mask for a socially 
inappropriate facial expression that the emo- 
tion would cause rather than as an appease- 
ment display. A question for future research 
is to determine whether an appeasement or a 
masking hypothesis can better account for the 
occurrence of smiling in tense or uncomfortable
social situations.

The detailed analysis of patterns of behavior 
in which smiling is embedded provides addi- 
tional evidence that social smiling need not 
be mediated by happiness. For example, Ken- 
don and Ferber (1973) have carefully de- 
scribed the behavior in a greeting. They di- 
vided the greeting into three stages, a 
distance salutation in which the participants 
establish their readiness and willingness for 
interaction, an approach phase in which one 
walks toward the other (or both do), and a 
close salutation that is a prelude to conversation
or other social interaction. During the
distance salutation, the greeters orient their
bodies toward each other, look at each other, 
and show a greeting gesture such as a head 
toss, an eyebrow flash, a wave, or a verbal 
greeting. During the approach phase they 
decrease the amount they look at each other, 
until immediately prior to the close saluta- 
tion, when all participants looked at and then 
talked to each other. Clearly the greeters 
were regulating their social contact during 
the greeting, first agreeing to it, then post- 
poning it, and finally engaging in it when 
their physical separation made it convenient.

Significantly for the present argument, 
smiling almost invariably occurred in both the 
distant and close salutation, when greeters 
were showing a willingness to greet each other 
and establish or reaffirm their relationship. 
Smiling occurred much less frequently and 
intensely during the approach phase, when 
the greeters were showing a temporary with- 
drawal from social contact by looking away 
from each other. Smiling thus varied with the 
intensity of social contact. Since emotions are
often regarded as diffuse with a gradual 
decay, it is difficult to account for the rapid 
and asymmetrical shifts in smiling during a 
10-20 second greeting by referring to shifts 
in happiness. 

Through this literature review, we have 
tried to establish that social involvement is a 
major cause of smiling and that happiness 
does not seem to be a necessary mediator. 
Why do people smile in the presence of 
others? Drawing on comparisons of humans 
with other primates, Hooff (1972) argues that 
most human smiling is affinitive, used in 
the expression of sympathy, reassurance, or 
appeasement, that is, that the smiler’s motiva- 
tion is to insure the establishment and main- 
tenance of friendly interaction. The message 
might be paraphrased “I am friendly” or “I 
would like us to be friendly for a while.” This 
may occur when friendliness is highly prob- 
able, as when two old friends greet each other 
after an absence, or when friendliness is prob- 
lematic, as when a client interrupts a con- 
versation between two professionals to ask 
one of them a question. The smile is an
evolutionarily designed signal to smooth interaction
among members of a species who must
cooperate in group living.


The Emotional Hypothesis 


In our studies smiling had a weaker and 
more erratic association with happiness than 
it had with social interaction. Although people
smiled when shouting, jumping, and gestur- 
ing after bowling a spare or a strike or when 
cheering their hockey team’s goals, they did 
not smile much when they had gotten a good 
bowling score and were alone or had not yet 
turned to face their friends, or when they 
were walking down the street alone on a nice 
day. Given the very strong link between smiling
and the experience of pleasant emotions
posited by both the popular culture and the 
100-year-old Darwinian tradition, our failure 
to document this link convincingly demands 
explanation. We will discuss three attempts to 
reconcile our results with the happiness hy- 
pothesis. 

First, we may have found more smiling 
when people were social than when they were 
happy because our independent variables 
were not equated for strength. If bowling,
hockey games, and walks on fine days do not 
produce strong pleasant emotions, they could 
not be expected to produce much smiling 
either. We believe that in general, smiling as a 
display of happiness is relatively infrequent 
in daily life, partly because the strong emo- 
tions that may be needed to elicit it are also 
rare. However, this is not a plausible explanation
for the present results. We picked settings
and activities that were likely to produce
variations in positive and negative emotions.
Our observations suggest that strong
positive and negative emotions were produced 
in these settings, but that they were expressed 
without a consistent relationship to smiling. 
The positive and negative exclamations during
bowling were one indicator of the strength
of the emotions produced. Not getting a spare 
or a strike led to fewer positive exclamations 
and more negative exclamations, head shakes, 
and tight-lipped displays. Thus variations in 
score produced variations in emotional expressions.
Although we did not systematically
collect data on emotional displays among
hockey fans or among pedestrians, our casual
observations suggest that hockey fans ex- 
pressed their excitement, joy, and approval by 
jumping up and down, clapping, and scream- 
ing, whereas pedestrians celebrated spurts of 
pleasant weather, after days of Ithaca’s 
drizzly gloom or oppressive heat and humid- 
ity, by walking with a lilt, whistling, and 
humming, not by smiling.

A second possibility is that although smil- 
ing did indeed have a strong association with 
happiness, our subjects masked their emo- 
tional expressions according to cultural dis- 
play rules, to confuse their audiences, or to 
comply with felt normative pressures (Ek- 
man, 1972). They refused to smile when they 
felt good and used smiling as a mask for nega- 
tive emotional expressions when they felt bad. 
However, this too is an unlikely explanation 
for our results. In both the bowling and 
hockey settings, subjects were very expres- 
sive, probably much more so than in other 
settings of daily life. For example, the fre- 
quency of positive and negative exclamations 
among bowlers and the strong association of 
positive exclamations with good scores attest 
to the freedom of this setting from constraints 
on emotional expression. In addition, as men- 
tioned earlier, bowlers smiled least when they 
were alone or were facing away from fellow 
bowlers and were thus under the least pressure
to use display rules.

A third reason why our results linking smil- 
ing and happiness were not as strong as we 
had expected is that we may have been mis- 
led by the phrase emotional expression. Per- 
haps because they have underemphasized the 
functions of emotional displays, earlier com- 
mentators, including Darwin (1872/1965) 
and Ekman (1972), have used the phrase 
expression to mean the outward manifestation 
of an internal state. While this definition is 
correct, it is incomplete. To the extent that 
smiling is linked with happiness, it is an 
evolutionarily adapted signal that informs 
other members of the species about the 
sender’s emotional state in order to influence 
their behavior. Thus, we should expect smiling,
like other primate emotional displays
such as fear (appeasement) or anger
(threat), to be shown to a recipient and to be
less frequently seen in the absence of an audience.
This interpretation is consistent with
our data. Smiling had its strongest associa- 
tion with emotion-producing conditions when 
subjects were communicating emotions in the 
presence of others through additional displays 
like positive and negative exclamations, but it 
was only weakly associated with emotion- 
producing conditions when subjects were so- 
cially uninvolved. These tentative results sug- 
gest the hypothesis for further research that 
emotional displays, in general, should be 
more frequent and more intense in the pres- 
ence of others, although this trend could be 
modified by display rules (Ekman, 1972). 


A Comparison of the Social and 
Emotional Hypotheses 


The social and emotional hypotheses about 
the causation of smiling are not incom- 
patible with each other. As we have sug- 
gested, social contact may sometimes produce 
happiness, which in turn may lead to smiling. 
In addition, people experiencing happiness 
may show it more in the presence of others. 

It is also possible that emotional and social 
motivations both independently produce fa- 
cial displays involving what we have termed 
a smile, but the morphology of the displays 
may differ. One possibility is that the mouth 
region has a different shape when expressing 
happiness, friendliness, or appeasement. Bran- 
nigan and Humphries (1972) and Grant 
(1969) have described several different 
smiles. Our own attempts to distinguish
closed- from open-mouth smiles suggest that
they have different underlying motivations,
with the closed-mouth smile more purely social
and the open-mouth smile more playful
and possibly more emotional. Moreover, even
if the mouth is the same, other components 
of the face may differ. The whole face, rather 
than just the smile per se, carries information 
about emotion (Ekman et al., 1972; Ekman,
Friesen, & Tomkins, 1971). Although the
smiling mouth may be the most salient com- 
ponent of a happiness display, other com- 
ponents may often co-occur and differentiate 
happiness, friendliness, or appeasement 
smiles. Further research on facial displays 
needs to move to a finer level of description 
and categorization in an attempt to differentiate
the varieties of smiles and facial displays.
Ekman and Friesen’s recent work
(1978) is a step in this direction.


A Comparison of the Ethological and the
Expressive Traditions 


The ethological approach to nonverbal communication,
from which the social hypothesis
derives, differs from other approaches in several
ways. Most important, it has remained
firmly wedded to evolutionary theory and, 
as a result, has stayed concerned with the 
functions of nonverbal displays and their 
social consequences. This concern with func- 
tions and consequences was clearly present 
in Darwin’s original work (1872/1965; see
his discussion of the principle of antithesis), 
but was lost as the study of emotional expres- 
sion passed through experimental psychology. 
The expressive approach to nonverbal be- 
havior, from which the emotional hypothesis 
about smiling derives, has focused on the cor- 
respondence between individuals’ internal 
states and their facial and other expression.
As a result it has often embedded the study 
of nonverbal behavior in individualistic psy- 
chology by treating individuals as socially 
encapsulated. It is true that the expressive 
approach has studied communication in the 
limited sense of establishing that information 
about emotions can be transmitted through 
facial expression. With its almost exclusive 
reliance on the recognition experiment, how- 
ever, this approach has not shown that people 
use information from facial expressions in 
daily life.

The effects of this neglect, while perhaps
unintentional, are large. Ekman et al.
(1972), after reviewing more than 100
studies on the facial expression of emotion
published in the 100 years since Darwin’s
original work (1872/1965), found only a few 
that investigated causation or production of 
emotional displays and none on the effects of 
emotional expression on subsequent social 
interaction. 

In contrast to the expressive approach, the
ethological approach to human nonverbal behavior
treats the individual as part of a social
network and examines the interactions between
people and the effects of nonverbal behavior on
others. By studying usage, this approach guides
investigators toward a careful, descriptive
analysis of the situations in which smiling
occurs. The types of situation we studied were
chosen to compare the emotional and social
hypotheses about the causes of smiling, but we,
along with others, have not looked at its effects.
Indeed, we know of no research on human nonverbal
communication from any tradition that has
simultaneously studied the social and motivational
causes, the morphology, and the social
consequences of a human display, as Hooff has
done with chimpanzee bared-teeth displays.
This holistic approach is the direction, we
think, that future work on human nonverbal
communication should take.


References 


Blalock, H. M. Social statistics. New York:
McGraw-Hill, 1972.

Blurton-Jones, N. G. Categories of child-child
interaction. In N. G. Blurton-Jones (Ed.), Ethological
studies of child behavior. Cambridge, England:
Cambridge University Press, 1972.

Brannigan, C. R., & Humphries, D. A. Human non-verbal
behavior, a means of communication. In
N. G. Blurton-Jones (Ed.), Ethological studies of
child behavior. Cambridge, England: Cambridge
University Press, 1972.

Connolly, K., & Smith, P. K. Reactions of pre-school
children to a strange observer. In N. G. Blurton-Jones
(Ed.), Ethological studies of child behavior.
Cambridge, England: Cambridge University Press,
1972.

Darwin, C. The expression of the emotions in man
and animals. Chicago: University of Chicago Press,
1965. (Originally published, 1872.)

Eibl-Eibesfeldt, I. Similarities and differences between
cultures in expressive movements. In R. A. Hinde
(Ed.), Non-verbal communication. Cambridge,
England: Cambridge University Press, 1972.

Eibl-Eibesfeldt, I. The expressive behavior of the
deaf- and blind-born. In M. Von Cranach and I. Vine
(Eds.), Social communication and movement.
New York: Academic Press, 1973.

Ekman, P. Universal and cultural differences in
facial expressions of emotion. In J. K. Cole (Ed.),
Nebraska Symposium on Motivation (Vol. 19).
Lincoln: University of Nebraska Press, 1972.

Ekman, P. Cross cultural studies of facial expression.
In P. Ekman (Ed.), Darwin and facial expression.
New York: Academic Press, 1973.

Ekman, P., & Friesen, W. V. Unmasking the face.
Englewood Cliffs, N.J.: Prentice-Hall, 1975.

Ekman, P., & Friesen, W. V. Facial action coding
system. Palo Alto, Calif.: Consulting Psychologists’
Press, 1978.

Ekman, P., Friesen, W. V., & Ellsworth, P. Emotion
in the human face. New York: Pergamon Press,
1972.

Ekman, P., Friesen, W. V., & Tomkins, S. S. Facial 
affect scoring technique: A first validity study. 
Semiotica, 1971, 3, 37-38. 

Fraiberg, S. Insights from the blind. New York: 
Basic Books, 1977.

Grant, E. C. Human facial expression. Man, 1969, 4, 
525-536. 

Hooff, J. A. R. A. M. van. Facial expressions in 
higher primates. Symposia of the Zoological So- 
ciety of London, 1962, 8, 97-125. 

Hooff, J. A. R. A. M. van. A comparative approach
to the phylogeny of laughter and smiling. In R. A. 
Hinde (Ed.), Non-verbal communication. Cam- 
bridge, England: Cambridge University Press, 1972. 

Hooff, J. A. R. A. M. van. A structural analysis of 
the social behavior of a semi-captive group of 
chimpanzees. In M. Von Cranach and I. Vine 
(Eds.), Social communication and movement. New 
York: Academic Press, 1973. 

Izard, C. The face of emotion. New York: Appleton- 
Century-Crofts, 1971. 

Kendon, A., & Ferber, A. A description of some 
human greetings. In R. P. Michael and J. H. Crook 
(Eds.), Comparative ecology and behavior of 
primates. New York: Academic Press, 1973. 

Leventhal, H., & Mace, W. The effects of laughter on 
evaluation of a slapstick movie. Journal of Per- 
sonality, 1970, 38, 16-30. 

Mackey, W. C. Parameters of the smile as a social 
signal. Journal of Genetic Psychology, 1976, 129, 
125-130. 

Spitz, R. A., & Wolf, K. The smiling response: A 
contribution to the ontogenesis of social relations. 
Genetic Psychology Monographs, 1946, 34, 57-125. 

Sroufe, L. A., & Waters, E. The ontogenesis of smil- 
ing and laughter: A perspective on the organization 
of development in infancy. Psychological Review, 
1976, 83, 173-189. 

Tomkins, S. S. Affect, imagery and consciousness: 
Vol. 1. The positive affects. New York: Springer, 
1962. 

Tomkins, S. S., and McCarter, R. What and where 
are the primary affects? Perceptual and Motor 
Skills, 1964, 18, 119-158. 

Vinacke, W. E. The judgment of facial expressions by 
three national-racial groups in Hawaii: I, Cau- 
casian faces, Journal of Personality, 1949, 17, 407- 
429. 

Vinacke, W. E., & Fong, R. W. The judgment of 
facial expressions by three national-racial groups in 
Hawaii: II. Oriental faces. Journal of Social Psy- 
chology, 1955, 41, 184-195. 

Whyte, W. F. Street corner society (2nd ed.). Chi-
cago: University of Chicago Press, 1955. 


Received September 14, 1978


Journal of Personality and Social Psychology
1979, Vol. 37, No. 9, 1554-1560


The Effects of Mediator Rewards and 
Suggestions Upon Negotiations 


James A. Wall, Jr. 
School of Business and Public Administration 
University of Missouri—Columbia 




Over the years, mediators have developed
and employed a myriad of mediation tech- 
niques to facilitate negotiation effectiveness. 
Such methods include establishing effective 
lines of communication, becoming a spokes- 
man for the weaker side, creating a formal or 
relaxed negotiation atmosphere, pointing out 
ts of no agreement, 


Suggesting and at- 
ernative solutions, 
way. Even though 
variety, it suffers
suggestions;


technique is more potent in the absence 


ens, 1963). Since a prerequisite for such an 
analysis is a delineation of the mediation 
process, the following succinct explication is 
proffered. Subsequently, it will serve as the
base for the generation of the hypotheses.


The Mediation Process 


The mediation paradigm consists of at least
three individuals (mediator, negotiator, and
opposing negotiator) and three relationships
(mediator-negotiator, mediator-opposing
negotiator, and negotiator-opposing negotiator).
Consequently, the nature and outcomes
of the mediation process can be considered
as resulting from the main and interactive
influences of (a) the personal characteristics
of the three participants, (b) the nature of
the three interpersonal relationships, and (c)
situational factors.

The negotiation literature documents well
the effects of the two negotiators’ personal
characteristics and their interpersonal
relationships (Chertkoff & Esser, 1976; Rubin &
Brown, 1975; Yukl, Note 1), and it seems
logical to conclude that these effects would
retain potency in the mediator’s presence.
Similarly, the effects of situational factors
upon negotiator-opposing negotiator bargaining
are thoroughly limned (Druckman,
1971), and one can logically assume that most
of these effects hold strong in a mediation
paradigm.

On the other hand, the effects of the
mediator-negotiator and the mediator-opposing
negotiator relationships, as well as the mediator’s
personal characteristics, have not as yet
been investigated. Similarly, the interactions
of these factors with the situational
factors, the negotiator’s traits, and his interpersonal
relationships have not been studied.
The present study focuses upon the
mediator-negotiator relationship.


The Mediator-Negotiator Relationship 


The relationship between the mediator and 
each of the negotiators can be considered one 
of exchange in which both the mediator and 
negotiator receive outcomes (or rewards) and 
make inputs (or incur costs). Usually the 
mediator’s primary goal or outcome is agree- 
ment between the two negotiators (Pruitt, 
1971; Stevens, 1963; Warren, 1954). In seeking
this goal, the mediator can concentrate
on the settlements and leave the means up to
the negotiators. Such techniques include the
following: suggesting better alternatives to
the negotiators, selecting or defending one
party’s proposal, overestimating the costs of
not settling or the benefits of settling, and 
making certain viable positions salient. On 
the other hand, the mediator can adopt a 
more indirect approach and concentrate upon 
modifying the negotiator’s behavior, expecta- 
tions, attitudes, perceptions, and so forth so 
as to conduce or facilitate agreement. Ex- 
amples of these techniques include controlling 
the amount of internegotiator communication, 
advising one party as to the other’s inten- 
tions, converting overt bargaining to tacit 
bargaining, suggesting concessions to one 
negotiator, and persuading one party to undo 
its commitments. As for the mediator’s inputs 
or costs, they are manifold. The mediator op- 
erates under a strict set of norms; he must 
contribute expertise, objectivity, social ap- 
proval (or disapproval), humor, and assur- 
ances. And occasionally he pays dearly in 
terms of time, sleep, energy, and relaxation.

Like the mediator, the negotiator experiences
rewards and costs in the relationship.
The rewards obtained from the mediator in- 
clude the mediator’s suggestions, approvals, 
payments, insights, and expertise as well as 
information about the opponent’s intentions 
and forthcoming behavior. As for the negotia- 
tor’s inputs or costs, they can include his 
promises, information as to how he will act, 
honesty and observation of norms vis-à-vis 
the mediator, and concessions to the opponent. 

Because the negotiator wishes to obtain 
high net outcomes from his relationship with 
the mediator, his behavior is most likely to be 
altered by mediation techniques that raise 
his rewards and lower his costs. Two media- 
tion techniques that do so are direct mediator 
rewards and mediator suggestions. 


Mediator Rewards 


A mediation technique that should prove 
effective in begetting negotiator concessions 
to an opponent entails rewarding the negotia- 
tor for his concessions—a technique success- 
fully utilized by President Carter in his Mid- 
east mediations (Butler, 1978). The pre- 
dicted effectiveness of this strategy (which 
involves giving the negotiator a large reward 
when he makes a large concession to his op- 
ponent, a small reward when he makes a 
small concession, and no reward when he 
makes a negative concession or no concession) 
is based upon the reinforcement principle that 
rewarded behavior is more likely to occur at 
a later time than is unrewarded behavior. 
It is also underpinned by the finding (Wall, 
1977) that an opposing negotiator, by using 
his own concessions as rewards, can increase 
the negotiator’s concessions. In view of the 
effectiveness of rewards when they are used 
by the opposing negotiator, who is usually 
perceived as an adversary, it seems that re- 
wards should be highly effective when applied 
by a mediator, who is usually perceived as a 
neutral party. 
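
The contingent reward strategy described above (large reward for a large concession, small reward for a small one, nothing for a negative or absent concession) can be sketched as a simple rule. This is an illustration only: the cutoff separating small from large concessions and the reward sizes are hypothetical placeholders, not values taken from the study.

```python
def mediator_reward(concession, large_cutoff=10.0, small_reward=1.0, large_reward=3.0):
    """Reward the mediator gives for one concession.

    concession: size of the negotiator's move toward the opponent
                (zero or negative means no concession or a retraction).
    The cutoff and reward sizes are hypothetical placeholders.
    """
    if concession <= 0:
        return 0.0           # no reward for no concession or a negative one
    if concession >= large_cutoff:
        return large_reward  # large concession, large reward
    return small_reward      # small concession, small reward

for move in (-5.0, 0.0, 4.0, 12.0):
    print(move, mediator_reward(move))
```

Because the reward is computed only after a concession is observed, a rule of this shape cannot influence the first concession in a negotiation, only later ones.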

In sum, the mediator’s rewards are ex- 
pected to increase the negotiator’s total con- 
cessions and consequently to result in more 
negotiator-opponent agreements and higher 
joint payoffs. Since the rewards follow the
negotiator’s concessions, however, they are
not expected to affect his initial concession. 


Mediator Suggestions 


A mediation technique that should more
expeditiously lead to negotiator concessions
is the mediator’s suggestion of concessions,
which provides the negotiator with a face-saving
device whereby he can make a concession
and move toward agreement without a costly
loss of face (Pruitt & Johnson, 1970). If the
mediator’s suggestion precedes the negotiator’s
initial concession, it is expected to
increase that concession. It is hypothesized,
however, that the effect upon the negotiator’s 
subsequent bargaining and its consequences 
is moderated by the presence or absence of 
mediator rewards: Whenever the mediator 
does not reward the negotiator’s concessions, 
the mediator’s suggestions retain potency 
throughout the negotiation because the nego- 
tiator retains his need of a face-saving device. 
That is, he continually requires an excuse or 
rationalization for his concession making, and 
the suggestions continually provide these. 
Therefore, when the mediator gives no rewards,
the negotiators receiving mediator
suggestions continue to make large concessions
to the opponent. Consequently, they
make larger total concessions, reach more
agreements, and share in larger joint payoffs
than do the negotiators not receiving the
suggestions.

On the other hand, whenever the mediator 
does reward the negotiator’s concessions, the 
suggestions (vs. no suggestions) have an
insignificant effect upon the negotiator’s con- 
cession making because they are of little value 
to the negotiator. He does not need to save 
face in this situation, because the mediator’s 
rewards provide a rationalization for his con- 
cessions; he makes concessions because he re- 
ceives rewards for doing so. 


Hypotheses 
In summary, it is predicted that mediator
rewards and suggestions have the following
effects:

1. Negotiators receiving mediator rewards
for their concessions (a) make larger total 
concessions to their opponent, (b) reach more 
agreements with their opponent, and (c) 
share in larger joint negotiator-opponent pay- 
offs than do negotiators not receiving such 
rewards.

2. Negotiators receiving suggested conces- 
sions from a mediator make larger initial con- 
cessions to their opponent than do negotiators 
not receiving such suggestions. 

3. An interaction between the mediator’s 
rewards and suggestions is manifested in the 
negotiator’s subsequent bargaining and its 
outcomes. In the no reward condition, the 
negotiators receiving mediator suggestions 
(a) make larger total concessions to their 
opponent, (b) reach more agreements with
their opponent, and (c) share in larger joint 
negotiator-opponent payoffs than do negotia- 
tors not receiving such suggestions. The dif- 
ferences in the reward condition are smaller 
than those in the no reward condition. 


Method 
Subjects 


The 170 subjects who participated in the experi- 
ment were undergraduate male volunteers enrolled 
in behavioral science courses at Indiana University. 
For serving as a subject each received 2 hours’ credit 
for experimental participation. 


Procedure 


Two subjects and a confederate were used in each 
session. Upon arrival, they were randomly seated in 
three chairs at a round table that was partitioned 
into three equal segments by partitions 2 feet high
connected at the middle of the table. Each segment
was labeled union representative, mediator, or man- 
agement representative; the subjects were placed in 
the two representative positions and the confederate 
at that of the mediator. In the instructions the sub- 
jects were informed that they were going to act as a 
union or management representative in four negotia- 
tions and that the mediator was an MBA student 
majoring in labor relations who could communicate 
with them any time he wished. Attached to the 
union and management representative's instructions 
was his payoff matrix (and his alone) for the first 
negotiation. (The management and union payoff 
matrices for each of the negotiations were con- 
structed so that a gain to one side resulted in an 
equal loss to the other. The positive payoff areas of 
the two matrices overlapped, however, so that in
each negotiation it was possible for the management 
and union representative to agree on a mutually 
profitable wage. As an example, for the negotiation 
in which the current wage was $6.00, a settlement 
at $6.30 yielded a $.20 positive payoff for the man- 
agement and an $.18 loss for the union. For every 
$.01 that the wage increased, the union gained $.01 
and the management lost $.01. Therefore, if the 
agreement were $6.48, the management would make 
$.02 and the union $.00; at $6.49 both would make
$.01; and at $6.50 the management would make 
$.00 and the union $.02.) The instructions also told 
the representatives that they each had $.50 and that 
in the first negotiation the management representa- 
tive would initiate the bargaining. He was to make 
the first offer on his bargaining sheet and slide it 
through a slot on his left to the mediator. The 
mediator, in all conditions, was to look at the offer, 
make sure that no messages were written on the 
sheet, and pass the sheet through a slot on his left 
to the union representative. The union representative, 
in turn, was to respond with a counteroffer and pass 
it back to the mediator, who was to return it to the 
management representative. The management repre- 
sentative then would make his second offer, and so 
forth. Both representatives were informed that they 
and their opponent would each be allowed to make 
four offers in the negotiation and that failure to 
agree would result in a fine of $.10 to each side. As 
soon as an agreement was reached or the eight offers
were tendered, the mediator raised his hand and the 
subjects were given their instructions, payoff sheets, 
and a bargaining sheet for the second negotiation.

The second, third, and fourth negotiations were 
similar to the first. The management representative 
made the first offer and both sides were allowed to 
make four offers. At the conclusion of the fourth 
negotiation, the subjects were asked to complete brief 
questionnaires. As soon as they had done so, they 
were paid and debriefed. 
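
The payoff arithmetic in the $6.00 example of the Procedure can be sketched as follows. This is only an illustration: the two break-even wages ($6.50 for management, $6.48 for the union) are inferred from the figures quoted in the example, and the function and constant names are hypothetical rather than taken from the study materials.

```python
from decimal import Decimal

# Break-even wages inferred from the $6.00 example (an assumption; the
# article reports payoffs at particular wages, not these constants).
MANAGEMENT_BREAK_EVEN = Decimal("6.50")
UNION_BREAK_EVEN = Decimal("6.48")


def payoffs(settlement):
    """Return (management, union) payoffs in dollars for a settlement wage."""
    wage = Decimal(settlement)
    management = MANAGEMENT_BREAK_EVEN - wage  # loses $.01 per $.01 of wage
    union = wage - UNION_BREAK_EVEN            # gains $.01 per $.01 of wage
    return management, union


if __name__ == "__main__":
    for wage in ("6.30", "6.48", "6.49", "6.50"):
        management, union = payoffs(wage)
        print(f"wage ${wage}: management {management}, union {union}")
```

Under these assumptions the two payoffs always sum to $.02, which is why only settlements between $6.48 and $6.50 leave neither side with a loss.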


Independent Variables 


Previous mediation studies (e.g., Bartunek, Benton, 
& Keys, 1975; Johnson & Pruitt, 1972; Johnson & 
Tullar, 1972) have applied the mediation technique 
to both the negotiator and his opponent. Unfortu- 
nately, such a methodology provides confounded, 
equivocal data. The reported effects of their media- 
tion techniques could have resulted from the negotia- 
tor’s response to the technique, from the opponent's 
response, or from an interactive negotiator-opponent 
response. To provide more unequivocal data concern- 
ing the effects of the mediation techniques, this study 
applied each technique only to the negotiator (man- 
agement representative). In all conditions, the media- 
tor passed notes to the opponent (union representa- 
tive) on which he had checked the statement, “I 
have no suggestions for this round.” 

Mediator rewards. In the reward condition, the
mediator gave the negotiator (management
representative) 50% of each concession he made to the
opponent (i.e., the mediator gave the negotiator $.01
for every $.02 that he conceded to the opponent) and
accompanied this reward with a handwritten message
that read, for example, “Here is ____ for your
last offer.” (The blank was filled with the appropriate
monetary amount, and the specific wording of this
message was varied over the different rounds.) On
the sheet passed to the negotiator, the mediator also
checked the message, “I have no suggestions for this
round.”

In the no reward condition the mediator after
each round passed the negotiator a note on which
he had checked the aforementioned message without
adding a handwritten message.

Mediator suggestions. In the suggestion condition,
the mediator on his note suggested a predetermined
offer (one that exceeded the mean offer made by
negotiators in a pilot study) to the negotiator prior
to each round. An example of the statements used is
“I suggest that you offer about $5.44 this time.”

Under the no suggestion condition the mediator
in each round checked the no suggestion statement.


Dependent Variables 


There were four primary dependent variables under
study. The negotiator’s initial and total
concessions were expressed as the difference between the
current wage ($4.00, $5.00, $6.00, or $7.00, with the
sequence randomly selected for each negotiator) and
his initial and final offer, respectively. The number
of agreements in the four negotiations had a possible
range of 0 to 4. And the joint negotiator-opponent
payoff was the sum of the two bargainers’ initial 
credits ($.50 each) and their payoffs for agreement 
minus their penalties for deadlocks ($.10 each). It 
did not include the mediator’s rewards to the nego- 
tiator. 
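
The definitions of the concession, the mediator's reward, and the joint payoff can be sketched in code. This is a rough illustration under stated assumptions: the function names are hypothetical, the wage and offer in the usage note are invented for illustration, and the joint payoff is assumed to be accumulated over whatever set of negotiations is passed in, which the article does not spell out.

```python
from decimal import Decimal

REWARD_RATE = Decimal("0.5")        # mediator returns 50% of each concession
INITIAL_CREDIT = Decimal("0.50")    # starting credit of each bargainer
DEADLOCK_PENALTY = Decimal("0.10")  # fine to each side for a failure to agree


def concession(current_wage, offer):
    """Management's concession: how far the offer exceeds the current wage."""
    return Decimal(offer) - Decimal(current_wage)


def mediator_reward(conceded):
    """Reward paid to the negotiator for a conceded amount ($.01 per $.02)."""
    return REWARD_RATE * Decimal(conceded)


def joint_payoff(agreement_payoffs, deadlocks):
    """Initial credits plus agreement payoffs minus deadlock penalties.

    agreement_payoffs holds one (negotiator, opponent) pair per agreement.
    Mediator rewards are deliberately excluded, as in the article.
    """
    total = 2 * INITIAL_CREDIT
    total += sum((Decimal(a) + Decimal(b) for a, b in agreement_payoffs),
                 Decimal("0"))
    total -= 2 * DEADLOCK_PENALTY * deadlocks
    return total
```

For instance, with a hypothetical current wage of $5.00, `concession("5.00", "5.44")` gives a concession of $.44, and `mediator_reward("0.02")` reproduces the $.01 reward quoted in the reward condition.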


Design and Analysis 


A 2 × 2 × 4 repeated measures design with two
levels of mediator reward (reward and no reward),
two levels of mediator suggestions (suggestion and
no suggestion), and four trials (negotiations) was
used to study the negotiator concessions. The number
of agreements and the joint payoff were analyzed
via a 2 × 2 design (two levels of mediator reward
and two levels of mediator suggestion).


Results 
Main Effects 


The mediator rewards affected all facets of
the negotiators’ bargaining and outcomes as
predicted in Hypothesis 1. Table 1 reveals
that rewarded negotiators made larger total
concessions to their opponent than did those
not receiving rewards, F(1, 81) = 3.91, p <
.051. Similarly, the rewarded negotiators
reached more agreements with their opponent
than did those who were not rewarded (Table
2), F(1, 81) = 5.98, p < .02, and received
larger joint payoffs, F(1, 81) = 8.52, p <
.005.

Table 1
Negotiators’ Concessions

                               Negotiation
Condition                    1     2     3     4
Reward
  Suggestion
    Initial concession      .24   .22   .22   .25
    Total concession        .45   .45   .45   .45
  No suggestion
    Initial concession      .13   .19   .25   .23
    Total concession        [illegible]
No reward
  Suggestion
    Initial concession      .22   .24   .23   .26
    Total concession        [illegible]
  No suggestion
    Initial concession      .15   .16   .17   .22
    Total concession        .41   .41   .44   .40

As for the effect of mediator suggestions, 
Hypothesis 2 was supported. Negotiators re- 
ceiving suggestions made larger initial con- 
cessions than did those receiving no sugges- 
tions (Table 1), F(1, 81) = 8.01, p < .006. 


Interactions 


The mediator rewards and suggestions in- 
teracted as predicted (Hypothesis 3a) to 
affect the negotiators’ total concessions (Table
1), F(1, 81) = 5.92, p < .02. However, the
data reveal a phenomenon in addition to the 
one predicted. They show that as predicted, 
the mediator suggestions are slightly more 
effective in conducing negotiator concessions 
in the no reward condition. But they also 
indicate that the mediator rewards are more 
effective in the no suggestion condition. Thus 
the two techniques have analogous effects: 
each is more potent in the absence of the 
other. Corroborative but nonsignificant evi- 
dence for this conclusion is supplied by Table 
2. Here it can be seen that the mediator sug-
gestions were effective in facilitating agree- 
ments and enhancing joint payoffs only in the 
no reward condition. Similarly, the mediator 
rewards were more effective in the absence of 
mediator suggestions. 


Within-Subject Effects 


As can be noted in Table 1, the negotiators’
initial concessions increased over the four 
negotiations, F(3, 82) = 5.45, p< .002; 
however, their total concessions did not ex- 
hibit this trend effect. In addition, the anal- 
ysis of the negotiator’s initial and total con- 
cessions revealed no significant Trend × Reward,
Trend × Suggestion, or Trend × Reward ×
Suggestion effects.


Discussion 


The principal conclusions to be drawn from 
this study are threefold. First, mediator rewards
are effective in facilitating negotiator conces- 
sions, negotiator-opponent agreements, and 
large joint payoffs. Second, these rewards are 
more effective in the absence of mediator 
suggestions. And finally, mediator suggestions 
lead to large initial concessions in both the
reward and no reward conditions; however, 
their effectiveness abates in the reward condi- 
tion so that their effects upon the total con- 
cessions are nonsignificant. 

Table 2
Negotiator-Opponent Agreements and Joint
Payoffs

                      Number of      Joint
Condition             agreements     payoff
Reward
  Suggestion             2.7          .81
  No suggestion          2.7          .81
No reward
  Suggestion             2.3          [illegible]
  No suggestion          1.8          .57

The interaction between the mediator’s
rewards and suggestions raises an interesting
question: Why are the mediator’s rewards
more potent in the absence of his suggestions
and his suggestions more effective in the absence
of rewards? It appears that mediator
suggestions limit the effectiveness of his rewards
because the suggestions place a ceiling
upon the negotiator’s concessions. That is,
negotiators seldom will concede more than the
mediator suggests. On the other hand, when
no suggestions are proffered, such a ceiling is
not created. The current data support this
conclusion. Only 6% of the negotiators’
concessions in the reward-suggestion
condition were larger than the mediator’s
suggestions. However, 25% of the negotiators’
concessions in the reward-no suggestion
condition were larger than the mediator’s
suggestion on the corresponding round of the
reward-suggestion condition. This difference
is significant, t(41) = 2.73, p < .01.

With regard to the effects of the mediator’s 
rewards upon the potency of his suggestions, 
the results support the argument that sugges- 
tions, in the absence of rewards, provide the 
negotiators with a face-saving device, allow- 
ing them to make concessions without suffer- 
ing loss of face. However, when the mediator 
rewards the negotiators’ concessions, the sug- 
gestions do not provide the negotiators with 
face because the rewards provide a rational- 
ization for their concession making. 

This latter proposition engenders a more 
thorough consideration of the role played by 
the mediator’s rewards. The reward effects, 
previously reported, could be solely attrib- 
utable to the change that the rewards made 
in the negotiator’s net payoffs. Or they could 
be attributable to a combination of payoff 
modification and other influences (e.g., face-saving
or salience of reward). A 2 × 2 factorial
design with two levels of negotiator
reward and two levels of payoff administration
would indicate if the payoff modification
was solely responsible. As in the present experiment,
the reward treatments would consist
of one condition (reward) in which rewards
are given for concessions and a second
(no reward) in which they are not. With re- 
gard to the payoff administration factor, in 
one condition (mediator administered) the 
mediator, as in the present study, would ad- 
minister the reward or no reward, and in the 
other (matrix administered), the payoffs 
would be dictated entirely by the negotiator’s
payoff matrices. That is, under the reward-matrix
administered condition, the rewards
for conceding would be built into the negotiator’s
payoff matrices, whereas under the no
reward-matrix administered condition, the
matrices would contain no rewards. If the
rewards (vs. no rewards) were found to have
the same effect upon the negotiator’s concessions
in both payoff-administration conditions,
one could conclude that the reward
effects are attributable only to the change 
that the rewards make in the negotiator’s net 
payoffs. If a larger effect resulted under the 
mediator-administered condition, however, 
one could conclude that other influences also 
underpin the mediator reward effects. 

Since the present study applied the media- 
tion techniques only to the negotiator, it pro- 
vided an opportunity to investigate the effect 
that the techniques had, via the negotiator’s 
concessions, upon those of the opposing nego- 
tiator. If the increased negotiator concessions, 
resulting from mediator rewards and sug- 
gestions, had generated reduced opponent 
concessions, then the effectiveness of the 
mediation would have been reduced. But if 
increased negotiator concessions had resulted 
in large reciprocal concessions or had pro- 
duced no effect, their effectiveness would 
have been enhanced or unaffected. The anal- 
ysis of the opposing negotiator’s concessions 
revealed that neither the mediator rewards 
nor the suggestions altered the opponent’s 
concession making (none of the Fs for the 
differences approached significance). That is,
the opponents did not reciprocate the negotia- 
tor’s concessions, nor did they exploit them. 

Finally, the reported results attest to the 
potency of both mediator rewards and sug- 
gestions when applied to only one negotiator. 
Since neither technique produced indirect 
effects in the other negotiator’s bargaining, 
the effects of each technique would probably 
be enhanced by its concomitant application to 
the negotiator and his or her opponent. How- 
ever, further research is required to determine 
if this is the case or if joint application would 
produce an interactive effect. 
Reference Note


1. Yukl, G. A. A review of laboratory research on 
two-party conflict negotiation. Unpublished manu- 
script, Baruch College, 1975.

The Effects of Behavioral Intentions and Consequences on
Judgments of the Actor and Other: An S-V-O Analysis 


David A. Kravitz and Robert S. Wyer, Jr. 
University of Illinois at Urbana-Champaign 


An extension of Gollob’s subject-verb-object model of social inference was 
used to investigate the effects of information about behavioral intentions and 
consequences on judgments of both an actor and the person toward whom the 
behavior is directed. Participants received one or more pieces of information 
about some or all of the following factors: an attribute of the actor, the actor’s 
intentions to help or hinder the other, the actual consequences of this action 
(whether the other is helped or hindered), and an attribute of the other. Judg- 
ments of actors’ admirableness increased with the favorableness of the adjectives 
describing them, the favorableness of both their intentions and the consequences 
of their actions, the justness of their intentions and of the consequences of 
their actions, and their ability to produce the consequences they intended. 
Judgments of the other's admirableness depended only on the adjectives de- 
scribing the other and, when this adjective was not presented, on the conse- 
quences of the actor’s behavior for the other. Behavioral consequences appeared 
to affect judgments of both the actor and the other independently of the actor's 
intentions. A second experiment demonstrated that the effects of information 
on judgments of the actor depend on the dimension of judgment in predictable 
ways and suggested that judgments of admirableness may be mediated by
perceptions of both virtuousness and competence.


When a person’s behavior has consequences 
for another, subsequent evaluations of both 
the actor and the other will often vary with
the favorableness of these consequences (cf.
Lerner, Miller, & Holmes, 1976; Lerner & 
Simmons, 1966; Walster, 1966). These eval- 
uations are presumably based in part on the 
judge’s assumptions about the implications 
of the consequences for characteristics of the 
persons being judged. Unfortunately, the na- 
ture of these assumptions and their possible 
effects have not always been identified in re- 
search performed to date. The assumptions 
may depend on whether the person being 

Wyer, Jr., Department of Psychology, University of
Illinois, Champaign, Illinois 61820.


judged is the actor or the target of the actor’s 
behavior and also on other available informa- 
tion about these persons that influences the 
interpretation of the behavioral consequences. 
Particularly important in this regard may be 
information about the actor’s intentions and 
about more general personality traits of the 
parties involved. 

The research reported in this paper inves- 
tigated the nature of these contingencies and 
the possible assumptions underlying them. 
Both the actor and the target of his actions 
were judged on the basis of identical sets of 
information, and the separate contributions 
of different informational cues to these judg- 
ments were isolated. The theoretical and em- 
pirical issues surrounding these matters are 
discussed below, along with the methodology 
used to investigate them. 


Judgments of the Actor 


To the extent that the judge perceives an 
actor as responsible for the consequences of 
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his/her action, the judge’s evaluations of the 
actor may vary with the desirability of these 
consequences. However, as Fishbein and 
Ajzen (1975, pp. 207-213) point out, there 
are several different criteria for assigning 
responsibility. One criterion is simply whether 
the actors are instrumental in producing the 
consequences that occur, regardless of 
whether they could have foreseen these con- 
sequences. A second criterion is whether the 
actors intended the consequences of their 
actions. According to a third criterion, re- 
sponsibility may be assigned only if the con- 
sequences of the actors’ behavior are unjus- 
tified in terms of social norms or standards of 
appropriateness. Thus, for example, an actor
who intentionally harms a person about to 
commit a crime would be assigned respon- 
sibility according to the first two criteria de- 
scribed above but not necessarily according 
to the third, since harming a criminal might 
be considered justified. Fishbein and Ajzen 
argue that the inconsistent effects of behav- 
ioral consequences on evaluations of the actor 
(cf. Shaver, 1970; Shaw & Skolnick, 1971; 
Walster, 1966, 1967) may be partially due to 
a failure to control for the criteria used by 
judges for assigning responsibility in the ex- 
perimental conditions employed to investigate 
these effects. 

The above comments suggest that an ade- 
quate investigation of the effects of behavioral 
consequences on judgments of the actor may 
require a consideration not only of the actor’s 
intentions to produce these outcomes but also 
of the characteristics of the recipient of these 
outcomes. That is, if responsibility is assigned 
simply on the basis of whether the actor 
caused particular consequences, the conse- 
quence information should affect judgments 
of the actor independently of the actor’s in- 
tentions, as Walster (1966) reports. However,
to the extent that responsibility is assigned 
on the basis of intentionality, evaluations of 
the actor may be based on the perceptions of 
the actor’s intentions, independently of the 
consequences that actually occur. Finally, if
judges take into account the justifiability for 
the actor’s behavior, their judgments may 
also depend on attributes of the recipient that 
influence their perceptions that the actor’s
intentions or the consequences of his/her be- 
havior are justified. (Thus, for example, they 
may evaluate someone who either helps good 
people or harms bad ones more favorably 
than someone who either helps bad people or 
harms good ones.) Note that responsibility 
could be assigned on the basis of all three 
criteria simultaneously. To this extent, all
three factors described above could contribute
independently to judgments of the actor. 

In investigating these possibilities, two 
points must be carefully considered. First, 
when information is presented about inten- 
tions, consequences, and recipient character- 
istics in combination, configural features other 
than those described above may emerge and 
influence judgments of the actor. For ex-
ample, suppose that people who appear capa- 
ble of producing the consequences they intend 
are seen as more competent than those who 
appear incapable of doing so and that the 
former are therefore more admired. To this 
extent, the consistency of the actor’s inten- 
tions with the consequences of his/her actions 
may affect judgments in a way that could not 
be predicted from a consideration of either 
intentions or consequences in isolation. 

Second, the above analysis points out that
a given piece of information may contribute to 
judgments in several different ways, depending 
on the information accompanying it. Thus in- 
formation that an actor intends to help another
may contribute positively to judgments
of the actor based on the assumption that 
people who intend to help others are admir- 
able. In combination with information that 
the actor actually harmed the other, however, 
these helpful intentions might be interpreted 
as evidence that the actor is incompetent. In 
combination with information that the other
person is a criminal, these intentions may be
interpreted as unjustified. In the latter two 
cases, the actor’s intentions would contribute 
negatively to judgments of him/her. More- 
over, these various interpretations of inten- 
tion information are not incompatible and 
thus could contribute simultaneously to judg- 
ments of the actor. A complete understanding 
of the influence of behavioral intentions and 
consequences on judgments of the actor requires
that the relative contributions of these
factors be isolated and that the manner in
which their contributions combine be deter- 
mined. 


Judgments of the Recipient of Behavioral 
Consequences 


At least two theoretical perspectives suggest
that the consequences of an actor’s behavior
may influence judgments of the person
who experiences these consequences. Lerner 
and Simmons (1966) argue that people are 
motivated to maintain a belief that the world 
is just and therefore that people deserve the 
outcomes they receive. As a result, people who 
receive bad outcomes are assumed to be bad, 
whereas people who receive good outcomes 
are considered good. Similar predictions can 
be derived from the social equity formulation 
outlined by Walster, Berscheid, and Walster 
(1973). These considerations suggest that 
people may be evaluated on the basis of the 
consequences that befall them, regardless of
why these consequences occur.

Judgments may also be affected by the 
actor’s intentions toward the recipient. That 
is, judges may assume that people whom
others intend to help have more favorable
attributes than people whom others intend 
to harm. Moreover, attributes of the actor 
may also play a role in these judgments. Ac- 
cording to balance theory (Heider, 1958),
people should be evaluated more favorably
(+) if either a highly regarded (+) person
responds positively (+) to them or a disliked 
(−) person responds negatively (−) to them
than if a highly regarded person responds 
negatively or a disliked person responds posi- 
tively. Thus, the effect of an actor’s intentions 
toward another person on evaluations of the
other may be contingent on the favorableness 
of attributes of the actor as well as on the 
favorableness of the actor’s intentions per se. 
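
The balance-theory prediction just described amounts to a sign rule, which the following minimal sketch makes explicit. The function name and the +1/−1 coding are illustrative assumptions, and collapsing "more favorably" versus "less favorably" into a single sign is a simplification of the relative prediction in the text.

```python
def balanced_evaluation(actor_regard, response):
    """Predicted sign of the evaluation of the person responded to.

    actor_regard: +1 for a highly regarded actor, -1 for a disliked actor.
    response:     +1 for a positive response,     -1 for a negative response.
    Returns +1 (relatively favorable) or -1 (relatively unfavorable).
    """
    if actor_regard not in (1, -1) or response not in (1, -1):
        raise ValueError("arguments must be +1 or -1")
    # Same signs (liked actor responds positively, or disliked actor
    # responds negatively) yield a favorable evaluation.
    return actor_regard * response


if __name__ == "__main__":
    for regard in (1, -1):
        for resp in (1, -1):
            print(regard, resp, balanced_evaluation(regard, resp))
```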
The various factors described above may 
contribute to judgments simultaneously with 
other aspects of the information present 
(e.g., general personality attributes of the
person being judged). On the other hand, it 
is conceivable that judges do not attach equal 
weight to these factors, but rather base their 
judgments only on those features of the pre-
sented information that they consider to be 
most reliable. For example, when direct in- 
formation about the attributes of a person is 
available, judges may base their evaluations 
of this person primarily on these attributes 
while disregarding the less direct implications 
of others’ behavior and intentions toward the 
person. In this case, these latter implications 
may have an impact only when direct infor- 
mation about the person’s attributes is un- 
available. The present study explored the 
possibilities discussed above. 


Description of the Approach 


The approach taken to investigate the is- 
sues outlined above was stimulated by the 
subject-verb-object (S-V-O) formulation of 
social cognition developed by Gollob (1974b). 
This formulation, which has been applied suc- 
cessfully to a variety of problems in impres- 
sion formation and attribution (Gollob, 
1974a, 1974b; Wyer, 1974, 1975; Wyer, Hen- 
ninger, & Wolfson, 1975; Wyer & Hinkle, 
1976), is described in detail elsewhere (Gol-
lob, 1974b, 1979; Wyer & Carlston, 1979)
and so will not be elaborated here. According
to this approach, the information in a social 
interaction situation is analyzed by the judge 
into a series of informational cues, each cue 
defined in terms of values along one or more 
dimensions. The use of a given cue is hypo- 
thetically based on a specifiable assumption 
that the judge makes about its implications. 
Judgments can theoretically be described as a 
function of the independent contributions of 
these cues (and thus of the assumptions 
underlying their use), and the magnitudes of 
these contributions can be measured and 
tested statistically. The approach was orig- 
inally applied to the effects of information 
along only three dimensions, pertaining to a 
trait of an actor, the actor’s behavior toward 
another, and a trait of the other. However, it 
can easily be extended to any number and 
type of dimensions (cf. Wyer, Henninger, & 
Wolfson, 1975; Wyer & Hinkle, 1976). Such 
an extension was made in the present research.

To apply the formulation to the specific
conditions of concern, consider a prototypic 
situation in which an actor engages in be- 
havior that either helps or hinders another 
person. In such a situation, information rel- 
evant to evaluations of the actor and the 
other might vary along at least four dimen- 
sions, pertaining to (a) an attribute of the 
actor, (b) the actor’s intentions to help or 
hinder the other, (c) the actual consequences 
of the actor’s behavior (the other is helped or 
hindered), and (d) an attribute of the other. 
These dimensions will be denoted S, I, C, and 
O, respectively. If the values along each of the 
four dimensions are dichotomous (positive vs.
negative), 16 configurations of stimulus in- 
formation can be constructed, each represent- 
ing a different combination of values along 
these dimensions. 
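
The count of 16 configurations follows from crossing four two-valued dimensions, as the short sketch below illustrates. The +1/−1 coding of positive and negative values and the variable names are assumptions made for illustration only.

```python
from itertools import product

# S: attribute of the actor, I: intentions, C: consequences,
# O: attribute of the other. Each takes a positive (+1) or negative (-1) value.
DIMENSIONS = ("S", "I", "C", "O")

configurations = [
    dict(zip(DIMENSIONS, values))
    for values in product((+1, -1), repeat=len(DIMENSIONS))
]

if __name__ == "__main__":
    print(len(configurations))  # 2 ** 4 combinations
    for configuration in configurations:
        print(configuration)
```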

In making a judgment on the basis of in- 
formation such as that described above, 
judges may consider the implications of a 
number of different, potentially overlapping 
subsets of values along the dimensions in- 
volved. Which subsets they consider, and the 
effects of these subsets on their judgments, 
depend on the assumptions they make about
their implications and the importance they 
attach to these implications. Each subset of 
dimension values, if considered as a basis for 
making a judgment, functions as a single unit 
of information, or informational cue. It is 
important to note that not all possible config-
urations of dimension values are meaningful 
cues: Only those configurations that have im- 
plications for the judgment when interpreted 
as a single unit function in this manner. Al- 
though it is often difficult to predict which of 
several possible cues a judge will actually use, 
the set of potentially useful cues can typically 
be circumscribed a priori. 
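
One way to picture a subset of dimension values acting as a single cue is to code each value as +1 or −1 and take the product over the subset. This product coding is an assumption adopted here for illustration, not a procedure quoted from the article, and the names `cue_value` and `config` are hypothetical.

```python
from math import prod


def cue_value(configuration, cue):
    """Value (+1 or -1) of a cue defined by a subset of dimensions.

    configuration: mapping from dimension letter (S, I, C, O) to +1 or -1.
    cue: string of dimension letters, e.g. "I" or "IO".
    """
    return prod(configuration[dimension] for dimension in cue)


# Hypothetical case: a favorable actor intends to help an unfavorable other
# but actually hinders that person.
config = {"S": 1, "I": 1, "C": -1, "O": -1}

if __name__ == "__main__":
    for cue in ("S", "I", "C", "O", "IO", "CO", "IC"):
        print(cue, cue_value(config, cue))
```

Under this coding a two-dimension cue is positive whenever its two values match, so pairing intentions with the other's attribute is positive for helping a good person or hindering a bad one, and negative otherwise.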

Judgments of the actor. Based on the the- 
oretical considerations raised above, previous 
research findings, and intuition, seven infor- 
mational cues (referred to by Gollob as 
biases) were expected to influence judgments 
of the actor along a dimension of admirable- 
ness. (Admirableness rather than likableness 
was initially used because it seemed more 
directly relevant to considerations of the role 
of justice. However, possible differences be-
tween admirableness and other evaluative 
judgments are considered later in this paper.) 

1. S-bias. Judges may assume that people 
with favorable attributes are more admirable 
than people with unfavorable ones. If this is 
so, actors should be judged as more admirable 
when the adjective describing them is favor- 
able than when it is unfavorable.

2. I-bias. Judges may assume that people
with good intentions (i.e., intentions to help)
are more admirable than people with bad in-
tentions (i.e., intentions to hinder). If so,
their evaluations of the actor should be more 
positive when the actor’s intentions are good 
than when they are not, independently of 
other considerations.

3. C-bias. Judges may assume that people 
who actually benefit others, regardless of their 
intentions to do so, are more admirable than 
those who hinder others. If judges make this 
assumption, their evaluations of the actor 
should increase with the favorableness of the 
consequences of the actor’s behavior. 

4. O-bias. Judges may assume that “birds 
of a feather flock together” and therefore 
may infer that actors are more admirable 
when their associates (i.e., the other) have
favorable attributes than when these asso-
ciates have unfavorable attributes. Although
this cue is not relevant to the main concerns 
of this paper, it has been found to make a 
small but reliable contribution to judgments 
in previous studies (Gollob, 1968; Wyer, 
1975; Wyer & Hinkle, 1976). 

5. IO-bias. Judges may assume that people
whose intentions are just are more deserving 
of admiration than people whose intentions 
are unjust. To this extent, judges should rate 
actors who intend either to help a good person 
or to hinder a bad person as more admirable 
than actors who intend to help a bad person 
or to hinder a good one. 

6. CO-bias. People who produce outcomes 
that confirm the judge’s perceptions of a just
world may be considered more admirable than
people who produce outcomes that threaten
this perception. To this extent, judgments of
actors who actually help a good person or
hinder a bad person should be more favorable
than judgments of actors who actually help a
bad person or hinder a good one.

7. IC-bias. Judges may assume that people 
who are capable of producing the outcomes 
they intend are more competent, and thus 
more admirable, than people who are incapa- 
ble of doing so. To this extent, actors who 
actually help someone they intend to help or 
hinder someone they intend to hinder should 
be more admired than actors who help some-
one they intend to hinder or hinder someone
they intend to help. 

Note that the contributions of I-bias, C- 
bias, IO-bias, and CO-bias reflect the hypo- 
thetical effects of the different criteria for 
assigning responsibility noted earlier, whereas 
the contribution of IC-bias reflects the effect 
of an emergent cue that results when inten-
tion and consequence information is presented
in combination. S-V-O methodology permits 
these contributions to be isolated and inde- 
pendently assessed, along with the contribu- 
tions of more general personality character- 
istics of the actor and other (S-bias and O- 
bias). 

Judgments of the other. Five informa- 
tional cues seemed likely on a priori grounds 
to affect judgments of the person toward
whom the actor’s behavior was directed. The
use of some of these cues is presumably based
on assumptions similar to those postulated
to underlie judgments of the actor, while the
use of other cues is based on quite different 
assumptions. 

1 and 2. O-bias and S-bias. As suggested 
above, people with favorable attributes may 
be seen as more admirable than those with 
unfavorable attributes, and people who asso- 
ciate with desirable others may be seen as 
more admirable than those who associate with
undesirable others. To this extent, judgments
of the other’s admirableness should increase 
with the favorableness of both the adjective 
describing the other and the adjective describ- 
ing the actor. 

3. I-bias. Judges may assume that people 
generally try to benefit those who are ad- 
mirable but try to harm those who are not 
admirable. To this extent, their judgments 
of the other’s admirableness should be greater 
when the actor intends to help the other than
when the actor intends to hinder him/her. 

4. C-bias. People who believe that the 
world is just may judge someone who receives 
good outcomes more favorably than someone 
who is a victim of misfortune (Lerner & Sim- 
mons, 1966). If so, the other should be 
judged as more admirable if she/he is helped 
than if she/he is hindered, independently of 
other considerations. 

5. SI-bias. Judges may assume that people 
are more likely to be admirable if good per- 
sons intend to help them or bad persons in- 
tend to hinder them than if good persons 
intend to hinder them or bad persons intend 
to help them. If this assumption is made, 
judgments of the other’s admirableness should 
be greater in the first two cases than in the 
second two. 

Three things may be noted from the above 
analysis. First, the contribution of each infor- 
mational cue described above is reflected by 
the magnitude of one of the 15 orthogonal 
contrasts (main effects and interactions) in a 
standard analysis of variance on judgments as 
a function of the four informational variables. 
(For example, the contribution of S-bias is 
inferred from the main effect of the informa- 
tion given about the actor. Similarly, the con- 
tribution of IC-bias is inferred from the mag- 
nitude of the contrast corresponding to the 
interaction of intentions and consequences.) 
The procedures for calculating these con- 
trasts will be described presently. Second, a 
single piece of information (for example, the 
actor’s intentions) may be involved in sev- 
eral different informational cues, and thus 
may contribute to judgments in several dif- 
ferent ways. Third, the use of a cue in judging 
an actor may be based on different assump- 
tions than is the use of the same cue in judg- 
ing the other. 

The studies reported in this paper bear on 
these possibilities. The first experiment inves- 
tigated the contributions of informational 
cues to judgments of the admirableness of 
both an actor and the other, and the extent to 
which the contributions of these cues de- 
pended upon the availability of other infor- 
mation. The second experiment investigated 
the effects of similar information on judg-
ments of the actor with respect to two other
characteristics, competence and virtuousness 
(goodness), that were assumed to mediate 
judgments of admirableness. These data pro- 
vided further clarification of the cues defined 
above and the assumptions underlying their 
use. 


Experiment 1 
Method 
Overview 


Forty-eight introductory psychology students 
judged the admirableness of two persons on the basis 
of information that varied along four dimensions 
concerned with (a) an attribute of the actor, (b) 
an attribute of the other, (c) the actor’s intentions 
to help or hinder the other, and (d) the actual con- 
sequences of the actor’s behavior for the other 
(helped vs. hindered). Each type of information was 
varied independently in a 2^4 design with two differ-
ent stimulus replications. In addition to the 16 stim-
uli comprising the complete design, subjects made
judgments of stimuli composing all possible subsets 
of the information dimensions given above. Data 
were analyzed using S-V-O methodology in a manner 
described below.


Subjects 


Fifty-one undergraduates participated in the study 
to fulfill a course requirement in introductory psy-
chology. Of these, 3 who failed to provide complete
data were subsequently dropped from the analyses.
Participants were run in groups of from 8 to 12 in 
sessions of approximately 1 hour. Subjects were ver- 
bally debriefed upon completion of the questionnaire. 


Construction of the Questionnaire 


[...] information along four dimensions: an at-
tribute of the actor (S), the actor’s intentions (I),
the actual consequences (C) of the actor’s behavior
for the other, and an attribute of the other (O).
There were two levels of each factor.¹

To ensure that the results obtained were not pecu-
liar to a given set of stimulus materials, these mate-
rials were constructed at two levels of abstractness,
[...] describe the actor and the other in the two sets. Each [...]
were described as either tolerant or intolerant, and
the actor’s intentions (or consequences) dealt with 
having the other transferred to another department
of the corporation for which they both worked—a 
transfer that the other either did or did not desire. 
Thus, for example, the stimulus for the condition in 
which values along S and I were both positive and 
values along C and O were both negative was of the 
form: “A is a tolerant person. A intends to have B 
transferred to another department, where B wants to 
work. As a result of A’s actions, B is transferred to 
a department where B does not want to work. B is an
intolerant person.”

The abstract materials contained information 
about a less clearly specified situation in which the 
actor’s intentions and the consequences of his/her 
behavior were described simply in terms of helping 
and hindering. Moreover, the attributes describing 
the actor and the other were either honest or dis- 
honest rather than tolerant or intolerant as in the 
concrete materials. Thus, the stimulus representing 
the same condition as in the preceding example was 
of the form: “A is an honest person. A intends to
help B. B is hindered by A. B is a dishonest person.” 

In all, 80 sets of information were presented in 
each stimulus replication. Of these stimuli, 16 con-
tained information concerning all possible combina-
tions of values along the four dimensions. These
stimuli comprised the complete (SICO) stimulus 
design. The remaining 64 stimuli formed all possible 
subdesigns, each pertaining to a different subset of 
information dimensions. Specifically, subjects were
presented with the eight stimuli forming each of the 
SIC, SIO, SCO, and ICO subdesigns, the four stimuli 
forming each of the SI, SC, SO, IC, IO, and CO 
subdesigns, and the two stimuli comprising each of
the S, I, C, and O subdesigns, where the letter(s) 
symbolizing each design pertain to the dimension(s) 
along which information was varied. 
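
The stimulus counts given above follow directly from the structure of the design: each nonempty subset of the four dimensions defines one design, and a design that varies k dimensions contains 2^k stimuli. The following Python sketch is an illustration added for clarity, not part of the original procedure; the function name is hypothetical. It enumerates the 15 designs and checks that they total 80 stimuli per replication.

```python
from itertools import combinations

def subdesign_sizes(factors="SICO"):
    """Map each design label (e.g. "SIC") to its number of stimuli.

    A design that varies k two-level dimensions has 2**k stimuli.
    """
    return {
        "".join(combo): 2 ** len(combo)
        for k in range(1, len(factors) + 1)
        for combo in combinations(factors, k)
    }

sizes = subdesign_sizes()
total = sum(sizes.values())          # complete design plus all subdesigns
subdesign_total = total - sizes["SICO"]

print(len(sizes), "designs;", total, "stimuli in all;",
      subdesign_total, "in the subdesigns")
```

The count can also be obtained in closed form as 3^4 - 1, since each dimension is either absent from a stimulus or present at one of its two levels, and the stimulus with no information at all is excluded.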

Stimuli were presented in three subsections of the 
questionnaire. One subsection contained those stimuli 
with information about both I and C (i.e., those
comprising the SICO, SIC, ICO, and IC designs);
a second subsection contained stimuli with informa-
tion about C but not I (those comprising the SCO,
SC, CO, and C designs); and the third subsection 
contained stimuli with information about I but not 


1 The information presented [...]
of “help” and “hinder”) was chosen on the basis of
a pretest. Twenty-three subjects were presented with
a number of personality traits, intentions, and conse-
quences, and were asked to rate “how good each
individual trait, intention or consequence is.” The


The intention and 
the abstract stim- 
ulus materials (help and hinder) was chosen to relate 
to previous work in the area (cf. Gollob, 1974; 
C (those comprising the SIO, SI, IO, and I designs).
The remaining stimuli, comprising the SO, S, and O
designs, were randomly distributed within each of 
the three subsections. Stimuli were ordered randomly 
within each subsection except that the stimuli ex- 
pected to produce the most extreme judgments were 
presented first in an attempt to anchor the response 
scale. These subsections were presented in all six pos- 
sible orders to control for any demand character- 
istics that might be engendered by the presence or 
absence of specific information.²

After receiving each stimulus, subjects estimated
how much they admired both the actor and the 
other along 11-point scales that ranged from 0 (not 
at all) to 10 (extremely much). The order of these 
estimates was counterbalanced across subjects. 

Thus 84 stimuli were contained in the main section 
of the questionnaire (80 unique stimuli and 4 repeti- 
tions; see Footnote 2), with each stimulus requiring 
two responses. 


Results


Judgments of the Actor 


Data pertaining to the complete design and 
to each of the 14 subdesigns were analyzed
separately. In each case, analysis of variance 
was first performed on judgments, collapsed 
over order of stimulus presentation and order 
of response, as a function of the four infor-
mation dimensions presented (S, I, C, and O)
and the type of stimulus materials (abstract
vs. concrete). The main effect of stimulus 
materials was not significant in any of the 
15 analyses, and few effects of informational 
variables were significantly contingent on 
scenario. The only exception to this concerned
the interaction of stimulus materials with 
information about the actor (i.e., the informa- 
tion along S), which was significant in all 8 
designs that included this information. These
interactions indicated that the effect of the 
trait assigned to the actor in the abstract 
materials (honest vs. dishonest) was greater
than the effect of the trait assigned to him/ 
her in the concrete materials (tolerant vs. 
intolerant). This contingency simply reflects 
the difference in the implications of these 
traits for admiration and has no theoretical 
relevance to the issues of central concern in 
this paper. The few other interactions involv- 
ing stimulus materials had no obvious psycho- 
logical meaning and were interpreted as 
spurious.³ Subsequent discussion is therefore
restricted to data pooled over the two stim-
ulus conditions.

An analysis of judgments as a function of 
the four informational variables involved in
the complete (SICO) design yields 15 pos-
sible main effects and interactions. Of these,
7 (the 4 main effects and the two-way inter- 
actions between intentions, consequences, and 
the attribute describing the other) correspond 
to the information cues expected on a priori 
grounds to contribute significantly to judg- 
ments of the actor (S-bias, I-bias, C-bias, 
O-bias, IC-bias, IO-bias, and CO-bias). To 
calculate the orthogonal contrasts assumed to 
indicate the contributions of these cues, the 
favorable pole of each informational dimen- 
sion was assigned a positive (p) valence and 
the unfavorable pole was assigned a negative 
(n) valence. The contrast corresponding to a
particular effect was then computed by sub- 
tracting (a), the mean judgment of stimuli 
in which the product of the relevant valences 
was negative, from (b), the mean judgment 
of stimuli in which the product of the relevant 
valences was positive. (Thus, for example, the 
contrast corresponding to the interaction of I 
and O, which indicates the contribution of 
IO-bias, was inferred from the difference be-
tween the mean judgment based on stimuli in
which the valences of I and O were both p or 


2 To investigate the possible effects of presenta-
tion order, four additional stimuli comprising one
of the two-factor designs that appeared in the first 
subsection of the questionnaire were repeated in the 
third subsection. The specific two-factor designs used
were varied over subjects. Comparisons of these two 
sets of judgments revealed no significant differences, 
suggesting that order was not an important factor 
in this experiment. 

3 In only three designs did stimulus materials 
interact with any factor other than S. In these 
cases, the two sets of stimulus materials were an- 
alyzed separately. In only one instance did these 
separate analyses result in a significant effect in the 
opposite direction from that of the combined data; 
specifically, the effect of O was negative in the ICO 
design when the abstract stimulus materials were 
used. In all other cases, the effect obtained using 
abstract materials was greater than that obtained 
using the concrete stimulus materials, but both effects 
were in the same direction. These interactions in- 
cluded CO in the CO and ICO designs and IO in the 
IO design.

Table 1


Contributions of Predicted Informational Cues to Judgments of the Actor 


Item S-bias I-bias C-bias O-bias IC-bias IO-bias CO-bias
Complete design

seo abe 2.66"* 1:739. 1.10°* 4 38* 34° 35% 
Subdesigns 

SIC ci 2.78°* 1,96** 1.26** 49" t 

SIO 2,99%* 2,69"* 0S .38** 

SCO 3.61** 2.02°* ll 24" 

ICO 2.76** .86** —.26 1,18°* 09 51 

SI 3.02** 3.00** 

SC 3.49** 2117 

SO 5.06** 29 

Ic 3.11% 1.24** a“ 

10 2,78%* —.26 1,20°° 

co 2.67°* —AS 1,19° 

S 4,.92** 

I 4.06** 

c 3.04** 

o .02 


Informational cue 


Note. S = an attribute of the actor. I = the actor's intentions to help or hinder the other. C = the actual 
consequences of the actor's behavior. O = an attribute of the other. 
*F(1, 42) > 4.08, p < .05. **F(1, 42) > 7.31, p < .01.


n and the mean judgment based on stimuli in 
which one of these valences was p and the 
other was n.) These contrasts are shown in
Table 1 for the complete design. Similar con- 
trasts, based upon analyses of each subdesign, 
are also presented. (Statistical significance in 
each case is based on a separate analysis of 
data for each subdesign.) 
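
The contrast rule described above (mean judgment over stimuli whose relevant valences multiply to a positive value, minus the mean over stimuli whose valences multiply to a negative value) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the cell means are hypothetical numbers constructed for the example, not data from the study, and the function and variable names are assumptions.

```python
from itertools import product

FACTORS = "SICO"  # order of the valences in each stimulus tuple

def contrast(cell_means, effect):
    """Contrast for one effect, following the rule in the text.

    cell_means maps a tuple of +1 (p) / -1 (n) valences, ordered as
    FACTORS, to a mean judgment. effect is a label such as "S" or "IO".
    """
    positions = [FACTORS.index(f) for f in effect]
    positive, negative = [], []
    for valences, mean in cell_means.items():
        sign = 1
        for pos in positions:
            sign *= valences[pos]
        (positive if sign > 0 else negative).append(mean)
    return sum(positive) / len(positive) - sum(negative) / len(negative)

# Hypothetical cell means: baseline of 5, an S effect worth 2 scale
# points in total, and an I x O (justice of intentions) effect worth 1.
cells = {
    v: 5 + 1.0 * v[0] + 0.5 * v[1] * v[3]
    for v in product((1, -1), repeat=4)
}

s_bias = contrast(cells, "S")
io_bias = contrast(cells, "IO")
c_bias = contrast(cells, "C")
print(s_bias, io_bias, c_bias)
```

Because the design is balanced, each contrast recovers only the effect built into the hypothetical means: the S and IO contrasts return the constructed differences, and the C contrast is zero.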

According to the assumptions hypothesized 
to underlie the use of each cue and the nota- 
tional system used to compute its contribu- 
tion, the contrast associated with each cue 
should theoretically be positive. In fact, all 
contrasts but that corresponding to O-bias 
were significant and positive in the complete 
design. Moreover, the six cues that contrib- 
uted significantly to judgments in the com- 
plete design contributed significantly to judg- 
ments in the subdesigns in which they were 
presented in 29 of 30 cases. The nonsignif-
icant contribution of O-bias in both the com-
plete design and subdesigns is not too sur-
prising, since the magnitude of this cue was
quite small in previous studies (Gollob, 1968; 
Wyer & Hinkle, 1976). Overall, the 7 con- 
trasts described in Table 1 and interpretable 
on a priori grounds as reflecting the contribu-
tions of informational cues accounted for
97% of the item set variance in the complete
design, whereas the 8 interactions that were
not predicted to contribute to judgments ac-
counted for only 3% of the variance. More-
over, these latter interactions were signif-
icant in only 4 of the 13 relevant subdesigns
(31%), as opposed to 29 of 37 instances 
(78%) in the case of the seven predicted 
effects. None of the significant unpredicted 
effects qualifies our conclusions concerning the 
predicted effects of informational cues; in all 
cases, the contributions of these cues re- 
mained in the expected direction, varying only 
in magnitude. 

Nevertheless, the nature of the few unex-
pected effects may be worth noting briefly.
The only unexpected interaction that ap-
proached significance in the complete design
involved S, I, and C; F(1, 47) = 8.63, p <
.01. This is less than the Bonferroni critical
value of 9.73 for 15 comparisons and the .05 
significance level (Myers, 1972). To the ex- 
tent that this interaction is taken seriously, it 
may best be interpreted as an indication that 
the contribution of IC-bias is greater when 
the actor is described by a favorable attribute
(M contrast = .64) than when the actor is
described by an unfavorable attribute (M
contrast = .12). The use of IC-bias pre-
sumably reflects the assumption that people
are more admirable if they produce outcomes 
they intend than if they do not. The possibil- 
ity that this assumption is made only when 
the actor has other favorable qualities has
some intuitive appeal; if the actor is de-
scribed by an unfavorable attribute, this may
be sufficient to conclude that she/he is not ad-
mirable, and judges may thus attend less to
other cues. However, since this interaction
did not even approach significance in the
SIC subdesign, the above interpretation 
should be treated with caution pending rep- 
lication of the effect. 

Three of the four unpredicted significant 
effects in the subdesigns pertained to the 
interaction of S and I. The interpretation of 
this interaction, which was not significant in 
the complete design, is unclear. It could re- 
flect a tendency for judges to assume that 
people are more admirable if their intentions 
are evaluatively consistent with their general 
personality attributes (both favorable or both 
unfavorable) than if they are evaluatively 
inconsistent. Alternatively, it may also indi-
cate that judges pay more attention to an
actor’s intentions when the actor is described 
by a favorable adjective than when he is de- 
scribed by an unfavorable one, and so I-bias 
has more effect in the former condition than 
in the latter. (This interpretation is con- 
sistent with our tentative interpretation of 
the SIC interaction.) Finally, the interaction 
could be due to a floor effect produced by the
unavailability of sufficiently extreme re-
sponses (on the response scale) when both
the adjective describing the actor and his in-
tentions were unfavorable. If this is the case,
however, it occurred despite our attempt to 
stabilize the response scale by presenting the 
stimuli calling for the most extreme responses 
first. 

Summary. The results described above 
provide support for several general hypoth- 
eses raised in the introduction. First, they 
suggest that responsibility for behavioral con-
sequences may be assigned according to sev-
eral different criteria, and that each con-
tributes to evaluations of the actor. That is,
responsibility may be assigned simply on the 
basis of the actor’s having committed the act 
that produced these consequences (leading 
C-bias to contribute to judgments), on the 
basis of the actor’s intentions (leading I-bias 
to contribute), and on the basis of the justi- 
fiability of these intentions and consequences 
(leading IO-bias and CO-bias to influence the 
judgments). In addition, judgments of the 
actor may be based on other factors, includ- 
ing more general dispositional characteristics 
(as reflected in the contribution of S-bias) 
and the ability to produce the outcomes she/he 
intends (as evidenced by the contribution of 
IC-bias). It should be noted, however, that
while the contributions of these cues may be 
isolated within any given design in which they 
occur, this does not imply that the contribu- 
tions of some cues are unaffected by the avail- 
ability of others. We shall return to this pos- 
sibility presently. 


Judgments of the Other 


Judgments of the other were analyzed in 
the same manner as were judgments of the 
actor. Once again, no main effects of stimulus 
materials were significant, allowing us to 
pool data over this factor.⁴

Five main effects and interactions, corre- 
sponding to the informational cues denoted 
S-bias, I-bias, C-bias, O-bias, and SI-bias, 
were postulated on a priori grounds to be sig- 
nificant. The contrasts corresponding to these 
effects, calculated in the manner described 
previously, are shown in Table 2 for both the 
complete design and the subdesigns to which 
they are relevant. While all contrasts were 
positive as predicted, only that correspond- 
ing to O-bias was significant in the complete 
design. This suggests that when direct infor- 
mation about attributes of the other is avail- 


4 In only three designs did stimulus materials inter- 
act with any factor other than O. In these cases, the 
two sets of stimulus materials were analyzed sep- 
arately. In only one instance did these separate 
analyses yield an effect that was significantly differ- 
ent in direction from those reported in Table 2: The 
effect of S was in reverse direction in the SIC design 
with the concrete stimulus materials. 
Table 2
Contributions of Predicted Informational Cues 
to Judgments of the Other 




Informational cue 
S- I- C- O- SI-


Item bias bias bias bias bias 
Complete 

design 

SICO .12 06 1 -3.78** 108 
Subdesigns 

SIC 06 —.03  43¥* 07 
SIO —.03'2 TS 3.77** 19 
SCO 08 18 3.70%% 
ICO AS .32** 3.80%% 

SI .29  .29* .21 
SC -00 27 

SO „35 4.96** 

IC -43** gore 

10 18 3.99%" 
co 36 3.91% 

S -06 

I 38 

C AS 

0 5.19** 


Note. S = an attribute of the actor. I = the actor's
intentions to help or hinder the other. C = the 
actual consequences of the actor's behavior. O = 
an attribute of the other. 


*F(1, 42) > 4.08, p < .05. **F(1, 42) > 7.31,
p < .01.


able, only this information is used as a basis 
for judging the person’s admirableness. In this
case information about other people’s inten-
tions and reactions to the person has rela-
tively little influence.

When trait information about the other is
not available, however, the actor’s intentions 
and reactions toward him/her appear to have 


[...] designs (M contrast = 0.27) than in the four
designs in which information along O was
available (M contrast = 0.13), F(1, 47) =
1.74, p > .10.⁵ Similarly, the contribution of
C-bias, presumably based on the assumption
that good things occur to good people and
bad things occur to bad people, contributed
significantly to judgments in three subdesigns, 
two of which were also those in which trait 
information about the other was not pre- 
sented. The mean contribution of C-bias when 
information along O was not available was 
nonsignificantly greater (M contrast = 0.41) 
than its contribution when the information 
along O was also available (M contrast =
0.25), F(1, [...] Al-
though these differences are not reliable, their
consistency is sufficient to suggest that judges 
are most likely to infer the admirableness of 


Although it was statistically significant in
only three of eight cases, the contribution of
C-bias to judgments of the other was con-
sistently positive in all eight designs in which
information along C was presented.


consequences information has an
effect over and above that of intentions. We
shall return to this matter presently.
plete design. Moreover, the proportion of item
set variance accounted for by these 10 inter-
actions was only 2%, as compared to 98%
for the 5 effects that were predicted. Finally, 
the unpredicted interactions were significant 
in only 1 of 19 instances in which the relevant 
factors were manipulated in the various sub- 
designs. None of the 3 unexpected effects had 
any obvious psychological meaning, and none 
represented a significant reversal in the direc-
tion of the contributions of any informational
cue described in Table 2. 


Combinatorial Processes 


Judgments of the actor, unlike judgments 
of the recipient, appeared to be based upon 
several different informational cues. A ques-
tion therefore arises concerning the manner in
which the implications of these cues are com-
bined to influence the judgments. There are at
least three obvious possibilities. First, as
noted previously, judges may use only a lim- 
ited number of informational cues as a basis 
for their responses (cf. Wyer, 1974; Wyer 
et al., 1975). If this is so, the likelihood that
any given cue is used may decrease with the 
total number of cues available. The other
possibilities are suggested by the research on
algebraic models of information integration 
(cf. Anderson, 1971, 1974). That is, judges 
may first construe the implications of each 
informational cue separately and then either
average or sum these implications to arrive at
an overall judgment. If the implications of a
given informational cue remain invariant over 
the stimulus configurations in which the cue 
is contained, an averaging model also implies 
that the contribution of this cue will decrease 
with the number of cues available (since
the relative weight given to it will be less).
On the other hand, a summative model im-
plies that the contribution of each cue will not
be affected by the number of other cues pres- 
ent. 
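
The difference between the averaging and summative predictions can be made concrete with a small numerical sketch. The Python code below is a deliberately simplified illustration, not the algebraic models as formally stated by Anderson: it assumes equal weights for all cues, no initial impression, and arbitrary scale values of +1 and -1 for the favorable and unfavorable poles of the target cue. All names are hypothetical.

```python
def judgment(scale_values, model):
    """Combine the implications of the cues under one of two rules."""
    if model == "sum":
        return sum(scale_values)
    if model == "average":
        return sum(scale_values) / len(scale_values)
    raise ValueError(f"unknown model: {model}")

def cue_contribution(n_cues, model, favorable=1.0, unfavorable=-1.0):
    """Judgment difference produced by switching one cue between poles.

    The remaining n_cues - 1 cues are held at a neutral value of 0, so
    the difference isolates the contribution of the target cue.
    """
    others = [0.0] * (n_cues - 1)
    high = judgment([favorable] + others, model)
    low = judgment([unfavorable] + others, model)
    return high - low

for n in (1, 2, 3, 4):
    print(n,
          cue_contribution(n, "sum"),
          round(cue_contribution(n, "average"), 3))
```

Under these assumptions the summative contribution stays constant as cues are added, whereas the equal-weight averaging contribution shrinks in proportion to 1/n. The sketch does not represent the first possibility, in which judges simply use fewer of the available cues.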

To explore these possibilities, the mean 
contributions of the informational cues that 
contributed significantly to judgments of the 
actor in the complete design were plotted as 
a function of the number of factors compris-
ing the design.

Figure 1. Mean contributions of (a) S-bias, (b) I-
bias, (c) C-bias, (d) IC-bias, (e) IO-bias, and (f)
CO-bias to judgments of the actor as a function of
the number of factors in the design.

(While the number of dimen-
sions along which information is presented is
not a perfect indicator of the number of in- 
formational cues available, this criterion was 
used in order to eliminate the need for a 
priori assumptions concerning the nature of 
these cues and whether they would contribute 
significantly.) These data are shown in Figure 
1. In each case, the contribution of the cue
decreased in magnitude with the number of 
informational dimensions comprising the de- 
sign. This conclusion is supported statistically 
by the results of linear trend analyses of the 
data in each panel of Figure 1.⁶ These trends
were significant in analyses of S, F(1, 141) =
70.64, p < .01; I, F(1, 141) = 89.27, p <
.01; C, F(1, 141) = 54.23, p < .01; IO, F(1,
94) = 9.92, p < .01; and CO, F(1, 94) =
15.35, p < .01. The linear trend for IC was
nearly significant, F(1, 95) = 3.15, .05 < p 
< .10. These data rule out a summative pro-
cess of information integration. However, the
other two alternative possibilities raised above 


6 These analyses were performed on the mean con- 
tribution of the given cue in all of the designs of the 
same size (1, 2, 3, or 4 factors), with subjects being 
the unit of analysis. Thus, for example, the value 
for the contributions of S were based on a single 
design for Design Sizes 1 (S) and 4 (SICO) and on 
the mean of three designs for Design Sizes 2 (SI, 
SC, SO) and 3 (SIC, SIO, SCO). 
cannot be distinguished on the basis of these
data.⁷
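
The averaging over designs of the same size that underlies Figure 1 can be illustrated with the S-bias contrasts. The Python sketch below uses the S-bias values as they can be read from Table 1 (the scanned table is imperfect, so these readings should be treated as approximate), and it reproduces only the descriptive means by design size, not the trend analyses, which were performed with subjects as the unit of analysis. The function name is hypothetical.

```python
# S-bias contrasts as read from Table 1, keyed by design label.
S_BIAS = {
    "S": 4.92,
    "SI": 3.02, "SC": 3.49, "SO": 5.06,
    "SIC": 2.78, "SIO": 2.99, "SCO": 3.61,
    "SICO": 2.66,
}

def mean_by_design_size(contrasts):
    """Average a cue's contrast over all designs with the same number of factors."""
    by_size = {}
    for design, value in contrasts.items():
        by_size.setdefault(len(design), []).append(value)
    return {size: sum(vals) / len(vals) for size, vals in sorted(by_size.items())}

means = mean_by_design_size(S_BIAS)
for size, mean in means.items():
    print(size, round(mean, 2))
```

With these readings, the mean contribution of S-bias falls steadily as the number of factors in the design increases from one to four, which is the pattern the linear trend analysis tests.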


Experiment 2 


As we noted earlier, the assumptions under- 
lying the use of informational cues, and there- 
fore the contributions of these cues, may de- 
pend substantially on the characteristic being 
judged. The second experiment was conducted 
to identify some of these differences. In doing 
so, it also provided indirect support for cer-
tain interpretations of the significant con-
trasts obtained in the first experiment.

Admiration for people may be based on two 
general types of criteria, both of which were 
reflected in the informational cues that con- 
tributed to judgments in Experiment 1. On 
the one hand, a person may be admired be- 
cause she/he is virtuous, that is, his/her at- 
tributes and behavior coincide with social 
standards of morality and justice. Thus, the 
contributions of I-bias and IO-bias to judg- 
ments of the actor presumably reflected as- 
sumptions that people who intend to help 
others are more admirable than people who 
intend to hinder others and that people whose 
intentions are just are more admirable than 
people whose intentions are unjust. On the 
other hand, a person may be admired because 
she/he is particularly competent or has ex- 
ceptional artistic or athletic skills, quite apart 
from moral considerations. This would ac- 
count for the contribution of IC-bias to judg- 
ments of the actor in the first experiment, 
which was presumably based on the assump- 
tion that persons who are capable of produc- 
ing the outcomes they intend are more com- 
petent, and therefore more admirable, than 
are those who are incapable of generating 
intended outcomes. If this reasoning is cor- 
rect, it has implications for the contributions 
of these informational cues to judgments of 
the actor along dimensions other than admir- 
ableness. Specifically, all three cues men- 
tioned above should affect judgments of the 
actor’s admirableness, as was found in Experi- 
ment 1. However, suppose contributions of 
I-bias and IO-bias to admirableness judg-
ments are mediated by their influence on per- 
ceptions of the actor’s virtuousness, whereas 
the contribution of IC-bias is mediated by its
influence on perceptions of the actor’s com- 
petence. In this case, I-bias and IO-bias but 
not IC-bias should contribute to judgments of 
the actor along a good-bad dimension (as- 
sumed to be an index of virtuousness), 
whereas IC-bias but not IO-bias or I-bias 
should contribute to judgments of the actor’s 
competence.


Method 


Two questionnaires were prepared, one pertaining 
to competence judgments and the other to general 
evaluations of the actor along a good-bad dimension. 
Each form was constructed using the same basic set 
of stimulus materials, which represented all eight 
combinations of bipolar values along three informa- 
tional dimensions pertaining to an actor's intentions 
(to help vs. to hinder), behavioral consequences 
(helps vs. hinders), and a general description of the 
other (good vs. bad). Thus the information pre- 
sented was comparable to that involved in the ICO 
subdesign of Experiment 1. This information was 
presented in items of the form: “S intends to help
(hinder) O. O is helped (hindered) by S. O is a good 
(bad) person.” Then, in the form pertaining to com- 
petence judgments, two items were constructed with 
each set of stimulus information. In one item sub- 
jects were asked to estimate how likely it was that S 
was competent, and in the second item they were 
asked to estimate how likely it was that S was in- 
competent. In the form pertaining to good-bad judg- 
ments, corresponding items asked whether S was
likely to be a “good person” or a “bad person.”⁸


7 There is, in addition, a statistical problem with
these data. They violate the homogeneity-of-treatment-
difference-variance assumption (Huynh &
Feldt, 1970). Although this does not affect the analyses
previously reported, it does force us to further
qualify the modeling results. Specifically, when this
assumption is not satisfied, the contrasts are not
uncorrelated in the population and thus there is no
guarantee that the sample contrasts are the appropriate
bias weights. The homogeneity-of-treatment-
difference-variance test was run on the covariance
matrix of the 15 contrasts of the SICO design. The
contrasts pertained to judgments of the actor, and
the data were pooled over both stimulus replications.
For these data, f = 104, d = .8967, W = .001, and
−(df)(d) ln W = 290 is compared to chi-square with
104 degrees of freedom. The correlations among the
contrasts ranged from −.499 to +.421, and the mean
was −.002.

8 Subjects were asked about both poles of the di- 
mension to provide a test of the “relevant bias hy- 
pothesis” proposed by Gollob (1974b). This test is 
beyond the scope of this report; for details, see
Kravitz (1978). 
In all cases, judgments were made along an 11-point
scale, ranging from 0 (not at all likely) to 10 (ex-
tremely likely). The 16 items in each form were
arranged randomly, except that those items expected 
to result in the most extreme responses were placed 
first in an attempt to anchor the response scale. 
Fifty-one subjects who did not take part in the 
first experiment participated in this study to fulfill 
a course requirement in introductory psychology. Of 
these, 27 completed the form pertaining to compe- 
tence judgments, and the other 24 completed the 
form pertaining to good-bad judgments. 
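
The construction of the two forms can be sketched in a few lines of Python. This is only an illustration of the design described above (eight stimulus sets, each judged with respect to both poles of the response dimension); the question wording and all identifiers are assumptions introduced here and are not taken from the original questionnaires.

```python
from itertools import product

# The three bipolar informational dimensions named in the Method section.
INTENTIONS = ("help", "hinder")        # I: what S intends to do to O
CONSEQUENCES = ("helped", "hindered")  # C: what actually happens to O
OTHER = ("good", "bad")                # O: general description of the other


def build_form(poles):
    """Return (stimulus, question) pairs: 8 stimulus sets x 2 response poles."""
    items = []
    for intent, outcome, other in product(INTENTIONS, CONSEQUENCES, OTHER):
        stimulus = (
            f"S intends to {intent} O. O is {outcome} by S. "
            f"O is a {other} person."
        )
        for pole in poles:
            # Hypothetical wording; the source reports only that likelihood
            # was rated on a scale from 0 to 10.
            items.append((stimulus, f"How likely is it that S is {pole}?"))
    return items


competence_form = build_form(("competent", "incompetent"))
good_bad_form = build_form(("a good person", "a bad person"))
print(len(competence_form), len(good_bad_form))
```

Each form built this way contains 16 items, which agrees with the item count reported above. The item ordering used in the study (random, with extreme items placed first) is not reproduced here.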


Results 


For each set of stimulus information, each 
subject’s judgment of the actor with respect 
to the negative pole of the relevant response 
dimension (i.e., either “incompetent” or
“bad”) was subtracted from his/her judg-
ment of the actor with respect to the positive
pole (“competent” or “good”). Each differ-
ence score was then treated as a single index
of perceived competence or goodness.⁹ The
two sets of difference scores were then an- 
alyzed separately as a function of the three 
informational dimensions. 
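
The difference-score index described above amounts to a simple subtraction for each stimulus set. The sketch below assumes one subject's ratings are stored as two parallel lists, one per pole; the ratings shown are invented for illustration and are not data from the study.

```python
def difference_scores(positive_pole, negative_pole):
    """Positive-pole rating minus negative-pole rating, per stimulus set."""
    if len(positive_pole) != len(negative_pole):
        raise ValueError("each stimulus set needs a rating for both poles")
    return [pos - neg for pos, neg in zip(positive_pole, negative_pole)]


# Hypothetical ratings (0-10 likelihood scale) for the eight stimulus sets.
competent = [8, 7, 3, 2, 9, 6, 4, 1]
incompetent = [2, 3, 6, 8, 1, 5, 6, 9]

scores = difference_scores(competent, incompetent)
print(scores)
```

Because each rating lies between 0 and 10, the resulting index ranges from -10 to +10, with positive values indicating that the actor is judged more likely to be competent (or good) than incompetent (or bad).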

The five main effects and interactions that 
were significant in the first experiment when 
admirableness judgments were made, corre- 
sponding to the contributions of I-bias, C-bias,
IC-bias, IO-bias, and CO-bias, were of
particular interest. The magnitudes of the 
contrasts corresponding to these effects, along 


Table 3
Contrast Magnitudes for Judgments of the
Actor (S) in Experiment 2

                 Dimension of judgment
Contrast   Competent-incompetent   Good-bad
I                   .22             2.28**
C                   .21             1.27**
IC                 2.54**            .32
IO                 −.01              .97*
CO                 −.04              .74*

Note. S = an attribute of the actor. I = the actor's
intentions to help or hinder the other. C = the
actual consequences of the actor’s behavior. O = an
attribute of the other.
*p < .05, F(1, 23) > 4.30 for competent-incompetent
and F(1, 26) > 4.23 for good-bad. **p < .01,
F(1, 23) > 7.95 for competent-incompetent and
F(1, 26) > 7.72 for good-bad.
with the other contrasts involved in the analy-
ses, are shown in Table 3 for both sets of
judgments. (In each case, contrasts were cal-
culated in the manner described in Experi- 
ment 1.) These data show the pattern ex- 
pected on the basis of the considerations 
noted above. That is, of the five informational 
cues found in Experiment 1 to affect admir- 
ableness judgments, only the cue expected to 
be particularly relevant to competence, IC- 
bias, contributed significantly to these judg- 
ments. However, this cue did not contribute 
to judgments of the actor along a good-bad
dimension. In contrast, I-bias, C-bias, IO- 
bias, and CO-bias each contributed signif- 
icantly to judgments of the actor along the 
good-bad dimension, but not to judgments of
the actor’s competence. Although the effects 
of C-bias and CO-bias were not explicitly 
predicted, their contributions are not incom- 
patible with these predictions. They suggest 
that people whose actions have either de- 
sirable or just consequences are assumed to be 
better people, regardless of their intentions, 
than are those whose actions have undesirable 
or unjust consequences. 

In summary, the results of this experiment 
suggest that the effects of the informational 
cues on judgments of admirableness found in 
Experiment 1 were mediated by their implica- 
tions for two different attributes, competence 
and virtuousness. 


Discussion 


In combination, the two experiments re- 
ported here provide substantial insight into 
the assumptions and factors that underlie the 
effects of behavioral consequences and inten- 
tions on judgments of the interacting parties. 
Moreover, they point out important differ- 
ences between the factors that underlie the 


9 This approach seemed justified on a priori
grounds for two reasons. First, it was expected that 
judgments pertaining to one pole of the dimension 
would correlate negatively with judgments pertaining 
to the other pole. Second, the use of both poles 
would seem to represent the dimension better than 
the use of a single pole. Supplementary analyses of 
judgments pertaining to each pole separately do not 
qualify the pattern of results reported (Table 3). 
effects of behavioral consequences on judg-
ments of the actor and the factors that under- 
lie their effects on judgments of the person 
toward whom the actor’s behavior is directed. 
In the first case, we speculated that judg- 
ments of the actor may be mediated by at- 
tributions of responsibility at several different 
levels, ranging from simply the commission of 
the act that produced the consequences to the 
actor’s intentions to produce the consequences 
and the justifiability of these outcomes (as 
defined in terms of social standards of ap- 
propriateness). In fact, attributions of re- 
sponsibility at all three levels appear to con- 
tribute to judgments of the actor. In addition, 
evaluations of the actor appear to be based 
upon perceptions of the actor’s ability to pro-
duce the outcomes she/he intends, as well as
more general dispositional characteristics im-
plied by the adjectives used to describe him/
her. In this regard, results of the second ex- 
periment suggest that judgments of the actor’s 
admirableness may be mediated by percep- 
tions of both virtuousness (goodness) and 
competence, and the contributions of these 
cues to admirableness ratings may be a result 
of their more fundamental influence on these 

mediating perceptions.

[...] actors were based [...] their intentions and the consequences of [...]
by the actor’s intentions and behavioral consequences only [...]
direct information abo[ut ...] indications of his/her personality are pro- [...]
enter into inferenc[es ...] ways. For example, actors as well as oth[ers]
were admired more [...] when their behavior had
good consequences than when it had bad con-
sequences, independently of other considera- 
tions. In addition, actors were more admired 
when their behavior had just consequences 
(i.e., when they helped good people or hin-
dered bad people) than when it had unjust 
consequences. This is not to say that be- 
havioral intentions are ignored; the intentions 
to produce both good outcomes (I-bias) and 
“just” outcomes (IO-bias) also contributed 
significantly to judgments of the actor. How-
ever, the independent contribution of conse- 
quences information is noteworthy and sug- 
gests an extension of the role of just world 
considerations beyond that implied by the 
original formulation. That is, one’s belief in a 
just world may not only lead one to evaluate 
others in accordance with the outcomes they 
receive (thus making one’s own perceptions 
consistent with this belief) but may also lead 
one to evaluate people more favorably if their 
behavior has consequences that reinforce this 
belief. 

The effects of consequences and intentions 
noted above are relevant to a more general 
issue raised earlier. That is, a given piece of
information can contribute to judgments in 
several ways, depending on the interpretation
of its implications. Moreover, these contribu-
tions may occur simultaneously. In the pres-
ent case, for example, the actor’s help or 
hindrance of the other may be considered 
independently of additional information pre- 
sented and therefore may contribute to judg- 
ments in its own right. However, depending 
on the additional information available, it 
may also be interpreted as an indication that 
the actor behaved justly or unjustly or that 
she/he is more or less competent. Each of 
these interpretations has implications for 
judgments of the actor. 

The fact that information can affect judg- 
ments in different ways emphasizes the value 
of S-V-O theory and methodology in con- 
ceptualizing these various effects. In this
regard, it is worth noting that of the statis- 
tical contrasts postulated on a priori grounds 
to reflect the contributions of psychologically 
meaningful informational cues, six of seven 
contrasts contributed to judgments of the 
actor and two of five contrasts contributed to
judgments of the recipient. On the other
hand, very few of the contrasts that were not
clearly interpretable as informational cues 
contributed significantly to judgments of 
either the actor or the other, and these con- 
tributions were typically not consistent over 
the informational configurations (full design 
and subdesigns) in which the relevant infor- 
mation was presented. Thus, it should be ap- 
parent that an S-V-O formulation can serve 
as a heuristic framework for circumscribing 
the set of information factors that may influ- 
ence judgments in a given situation, and for 
conceptualizing the possible assumptions un- 
derlying the effects of these factors. 
Two studies tested the hypothesis that self-directed attention would cause in-
creased awareness of internal states and would thus reduce suggestibility effects.
Experiment 1 applied this reasoning to the experience of an emotion. Males 
viewed moderately arousing slides of female nudes after being led to expect the 
slides to be either highly arousing or nonarousing. As predicted, ratings of the 
slides corresponded less with these experimentally-manipulated anticipations 
when self-focus was heightened by the presence of a mirror than when it was 
not. Experiment 2 examined a different internal experience: the perception of 
taste. Some subjects were led to expect a strong flavor as part of a test series, 
and other subjects were led to expect a weak flavor. Subjects high in private
self-consciousness were less affected by this expectancy manipulation and more 
accurate in reporting their actual internal state than were subjects low in 
private self-consciousness. Discussion centers on the theoretical implications of
the findings.


Casual observation suggests that many per- 
sons do not know themselves very well (cf. 
Nisbett & Wilson, 1977). When asked for a 
self-description, they provide their ques- 
tioner with a picture fraught with distortions, 
ambiguities, and inaccuracies. Moreover, 
these vagaries in self-report are often un- 
motivated. That is, the inaccuracies exist 
simply because people are ignorant of their 
characteristics, not because of any attempt to 
deceive. As we all know, these impressions 
from everyday experience are also substan- 
tiated in the empirical literature (cf. Mischel, 
1968). The correlations between self-report of 
behavior and actual behavior, for example, 
typically hover in the mid-twenties; rarely 
do they even reach into the thirties.


The authors express their appreciation to the fol- 
lowing people, who assisted with data collection: 
Randall Heller (Experiment 1); Gaye Delevin, Russ 
Militello, and Steve Faloon (Experiment 2). 

Requests for reprints should be directed to Michael 
F. Scheier, Department of Psychology, Carnegie- 
Mellon University, Pittsburgh, Pennsylvania 15213. 


Recent research stemming from self-aware- 
ness theory (Duval & Wicklund, 1972; Wick- 
lund, 1975), however, appears to indicate that 
this picture of people’s self-knowledge need 
not always be so bleak. In particular, this
theory suggests that self-focused attention 
may be an important determinant of the ac- 
curacy of self-reports. Presumably, a person 
whose attention is directed inward is more 
cognizant of his or her self-aspects than is a 
person whose attention is directed elsewhere. 
The heightened salience of self that occurs as 
a function of self-focus should be reflected in 
more accurate self-descriptions. 

This seems indeed to be the case. In one 
study (Pryor, Gibbons, Wicklund, Fazio, & 
Hood, 1977), undergraduate men were asked 
for a self-report of their level of sociability;
several weeks later their sociable behavior was 
objectively measured. The correlation be- 
tween self-report and behavior for this group 
was .17. In contrast, when the procedure was 
repeated with a different group who made 
their self-reports in the presence of a mirror, 
the correlation approached .70. Related re-
search has shown conceptually similar effects
for persons who are dispositionally high in
private self-consciousness (Scheier, Buss, & 
Buss, 1978; see also Carver, 1975; Turner, 
1978). In brief, self-focused attention appears 
to be an important moderator variable that 
affects the accuracy of self-reports. 
The primary purpose of the present re- 
search was to explore further the relationship 
between self-focused attention and accuracy
of self-reports. In doing so, however, an at- 
tempt was made to extend the analysis to a 
different domain of self-knowledge. Thus, 
rather than study again the link between self- 
report and behavior, we chose instead to ex- 
amine the effect of self-directed attention on 
knowledge of and reports about a variety of 
different bodily states. 


Similar considerations should apply to
awareness of one’s bodily experiences as apply
to knowledge about one’s behavioral tenden-
cies. That is, focusing attention on the self
should lead to increased awareness of any ex- 
periential dimension that is salient at the 
time. When attention is focused on one’s prior 
behavior, the accuracy of the self-report of 
that behavior is enhanced (Pryor et al., 1977; 
Scheier et al., 1978). If attention is focused 
on one’s bodily state, on the other hand, then 
the person’s self-report about the state should 
similarly become more accurate. Testing this 
reasoning was the major objective of the
present research.


Suggestibility 


The present s[tudy ...] [con]front another issue as w[ell ...]
previous research [... in]formation are no[t ...]
self-reports of intern[al ... con]trary, many people seem to make extensive
use of information from their [...] in
defining for themselves what their experiences
are, even when those experiences are essen-
tially internal. For example, people will
make use of externally provided information
in determining how intense their [...]
(e.g., Barefoot & Straub, 1971; Valins, [...]).

To the extent that a person utilizes external
rather than internal information when mak-
ing judgments about his internal states, he is
open to a wide variety of suggestibility phe- 
nomena. Indeed, a person who judged his
bodily states totally on the basis of his in- 
ternal sensations would be completely im- 
pervious to the influence attempts of others. 
Such suggestibility influences on judgments of 
bodily states can occur in many forms. For 
example, phenomena that would fit into this 
category include “demand” effects (Orne, 
1962)—that is, the influence that an experi- 
menter’s verbalizations and demeanor can 
have on subjects’ subsequent reports—and 
“placebo” effects—that is, the influence that 
misleading information about the actions of 
an inert substance can have on subjects’ sub- 
sequent reports. What both these effects have 
in common is this: The subject accepts some 
externally provided information indicating 
the nature or intensity of the subject’s recent, 
present, or impending experience as being 
accurate when it is not. 

According to the reasoning presented above, 
self-directed attention should minimize sug- 
gestibility phenomena in cases where the ex- 
ternal suggestion contradicts one’s internal 
experience. That is, increased self-focus
should increase the salience of veridical in- 
ternal information, thus increasing the degree 
to which the latter is used preferentially in 
determining the subsequent self-report. As 
the misleading external information is disre- 
garded in favor of the internal information, 
the suggestibility effect should diminish. 

This reasoning has received one empirical 
test, in a study on placebo effects (Gibbons, 
Carver, Scheier, & Hormuth, 1979). Subjects
in that study were given a placebo and were 
either misinformed or correctly informed 
about the effects of the drug. Misinformed 
subjects were told that the drug would pro- 
duce a slight increase in heart rate, sweaty 
palms, and a tightness in the chest. All sub- 
jects were subsequently asked to report the 
degree of arousal and the specific arousing 
symptoms that they were experiencing. These 
reports were obtained either in front of a 
mirror or not. Neither mirror nor no-mirror 
subjects reported experiencing the target 
symptoms when they had been correctly in- 
formed about the nature of the drug. Among 
misinformed subjects, however, those not ex-
posed to a mirror apparently were taken in by 
the experimental cues and reported consider- 
able arousal. In contrast, even though the 
specific target symptoms used in the study 
could not be directly observed by looking in 
the mirror, misinformed subjects exposed to a 
mirror reported minimal arousal. Indeed, the
level of drug-produced arousal reported by 
self-aware subjects in the misinformed condi- 
tion was no different from the level of drug- 
produced arousal reported by subjects in the 
correct information condition. Thus self- 
attention significantly reduced the placebo 
effect. 

This finding has important theoretical im- 
plications, which will be considered more fully 
below under General Discussion. The data
did come from a relatively circumscribed con-
text, however: research on reactions to
placebos. Because of that finding’s potential
importance, it seems highly desirable to as-
sure its generality. Gaining additional infor-
mation on this question was a second aim of
the present research. Therefore, in pursuing
our primary objective, which was to study the 
effects of self-focus on accuracy of self-report 
of internal states, we did not simply correlate 
subjects’ internal states with their self- 
reports. Instead, we approached the accuracy 
issue by examining the role of self-attention 
in two kinds of suggestibility phenomena. 
Accuracy, then, is defined as resistance to an 
attempt to misdirect by experimental sugges- 
tion. The first study made use of an emotion 
as an internal experience. The second study 
examined perceptions of taste. In both cases
experimental manipulations were introduced 
to influence subjects’ anticipations about their 
upcoming experiences. In both cases, we pre- 
dicted that heightened self-focus would lead 
to increased awareness of actual internal 
experience and would thus lead to enhanced 
resistance to suggestibility.


Experiment 1 


In Experiment 1 we sought to investigate
suggestibility in the experience of an emotion.
The design of the study was somewhat more
complex than that utilized by Gibbons et al.
(1979), in the following respects. First, the
manipulation used by Gibbons et al. was uni-
directional. Experimental subjects were al-
ways led to believe that they would be ex- 
periencing more of the relevant symptoms 
than they actually did experience. In contrast, 
the suggestibility manipulation in the present 
experiment was bidirectional. All subjects in 
the present study were exposed to stimuli
(slides of nudes) that have been shown to
induce a moderate degree of affect, namely sexual
attraction (Scheier & Carver, 1977). Some of
the subjects were led to believe that the stim- 
uli would be nonarousing, and others were 
told that the stimuli would be highly arous- 
ing. Thus, an attempt was made to drive sub- 
jects’ opinions in both directions from a base-
line.

In addition, the present research assessed 
the effects of self-focused attention in two 
very different ways—a strategy that has been 
effectively employed previously (e.g., Carver 
& Scheier, in press; Scheier & Carver, 1977; 
Scheier, Carver, & Gibbons, in press). Like 
Gibbons et al. (1979), we included a manip- 
ulation of self-directed attention in Experi- 
ment 1. Thus some subjects were exposed to 
their own images in a mirror and others were 
not. Validation evidence concerning the self-
focusing properties of mirror presence has
been presented at length elsewhere (Carver &
Scheier, 1978). Unlike the subjects of Gib- 
bons et al., however, subjects in the present 
experiment were also selected on the basis of 
their chronic tendencies to be self-attentive. 

The disposition to focus attention inward 
has been termed self-consciousness and is 
measured by the self-consciousness scale
(Fenigstein, Scheier, & Buss, 1975). The scale
has two major components, private and pub-
lic, that are only weakly correlated (cf. Fenig-
stein et al., 1975). Private self-consciousness
involves a focus on the more personal and
covert aspects of the self. The person high in
private self-consciousness is more cognizant
of his beliefs, moods, and feelings. Two sam-
ple items from the private self-consciousness
subscale are “I’m generally attentive to my
inner feelings” and “I’m always trying to 
figure myself out.” Public self-consciousness, 
in contrast, reflects an awareness of the self 
as a social object. Persons high in public self-
consciousness are aware of their social ap-
pearance and the impressions they make on
others. Two sample items from this subscale
are “I’m concerned about my style of doing 
things” and “I’m usually aware of my appear- 
ance.” It is private self-consciousness that is 
of greatest relevance to present concerns. 
That is, our prediction was based on the rea- 
soning that self-attentive persons would be 
especially aware of their veridical emotional 
experiences. Private self-consciousness in-
volves precisely these types of process: 
awareness of one’s thoughts, feelings, and 
other internal aspects of self. Consequently, 
subjects were selected on the basis of private 
self-consciousness rather than public. 
Both construct validity (Carver & Scheier,
1978) and discriminant validity (Carver &
Glass, 1976; Turner, Scheier, Carver, &
Ickes, 1978) have been established for the
private self-consciousness subscale. Moreover,
private self-consciousness as a disposition was 
intended to be similar conceptually to the 
manipulated state of self-awareness. The fact 
that on a number of occasions the two have 
led to empirically similar results indicates 
that they do converge on a single psycholog- 
ical entity (see, e.g., Buss & Scheier, 1976;
Carver & Scheier, 1978; Scheier, 1976;
Scheier et al., 1978; Scheier & Carver, 1977). 
In brief, the parallel between manipulated 
self-awareness and private self-consciousness 
now seems well documented, and we expected
similar results for the two variables in the
present research as well [...]

[...] belief that the experimenter had been trying to influ-
ence their perceptions of the stimuli. Participation 
in the research was limited to males because the 
study used stimuli that had already proven to be 
somewhat arousing to males under relatively similar 
circumstances (Scheier & Carver, 1977), whereas we 
had no comparable knowledge of females’ responses 
to the stimuli. Subjects, tested individually, were
randomly assigned to one of the four experimental 
treatments described below. 

Subjects were also divided into groups high and 
low in private self-consciousness based on their re- 
sponses to items of the self-consciousness scale 
(Fenigstein et al., 1975), which had been admin- 
istered in large group sessions several weeks prior to 
the study. Subjects were classified as being high in 
private self-consciousness if their scores fell above 
the median (Mdn = 24.5) of the pretest distribution 
and as low in private self-consciousness if their 
scores fell below the median.
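
The median-split classification just described can be sketched as follows. The subject identifiers and pretest scores are hypothetical and serve only to show the rule (above the median is high, below the median is low); the source does not say how a score exactly at the median would have been handled, so this sketch simply leaves such a case unclassified.

```python
from statistics import median


def median_split(pretest_scores):
    """Classify subjects as high or low relative to the sample median."""
    cut = median(pretest_scores.values())
    high = sorted(s for s, score in pretest_scores.items() if score > cut)
    low = sorted(s for s, score in pretest_scores.items() if score < cut)
    return cut, high, low


# Hypothetical private self-consciousness scores for six subjects.
pretest = {"s01": 19, "s02": 31, "s03": 24, "s04": 25, "s05": 28, "s06": 22}

cut, high_group, low_group = median_split(pretest)
print(cut, high_group, low_group)
```

With an even number of scores the median falls between two observed values, so every subject lands in one group or the other, as happens with a half-point median such as the one reported above.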


Procedure 


The procedure used in this study was adapted 
from that used by Scheier and Carver (1977). The
experiment was portrayed as part of a larger project 
studying reactions to sexually oriented stimuli. Fu- 
ture subjects ostensibly would be wired with physio- 
logical recording devices to assess their bodily re- 
sponses to the stimuli (photographic slides of female 
nudes). Before beginning that phase of the research,
however, it would be necessary to evaluate and 
classify the stimuli that would be shown to those 
subjects. In particular, the experimenter continued,
it was necessary to determine which slides were very 
arousing, which ones were minimally arousing, and so 
on; this was the ostensible purpose of the present 
sessions. The experimenter went on to explain that
each participant would see only a few of the slides, 
however, in order to prevent overexposure to the 
stimuli on the part of any of the raters. 

The subject was told that he would be viewing 
each slide twice. The first time he was simply to 
view it carefully and thoroughly; the second time he 
was to rate each slide on a separate 15-point scale, 
the endpoints of which had been labeled “extremely 
arousing” and “totally nonarousing.” The subject
was cautioned not to make his ratings as a function 
of relative photographic or artistic qualities. Because 
the research to be done later concerned physiological 
responses, it was emphasized that all of the present 
judgments should be made only according to how 
much of a bodily reaction the slide seemed to pro- 
duce in the subject, in terms of flushing, muscle ten- 
sion, slight changes in heart rate, and so forth.

Suggestibility manipulation. After ascertaining 
that the subject had understood the instructions, the 
experimenter stood up to start the slide projector. 
As he did so, he delivered one of the following two 
statements in an offhanded manner. These statements 
constituted the instructional set or suggestibility 
manipulation. To subjects in the nonarousal condi-
tion the experimenter said, “By the way, you might
as well know before you even start that the guys
we’ve run so far with this particular set of slides
have been reporting that they really aren’t arousing 
at all—I mean, not at all. I’ve noticed that they've 
generally been circling points way over there on the 
left side of the page.” To subjects in the high arousal 
condition the experimenter said, “Well, it looks like
you're one of the lucky ones. The guys we've run so 
far with this set of slides have been reporting that 
they’re really pretty arousing. I’ve noticed that 
they’ve generally been circling points way over here 
on the right side of the page.” 

Despite these comments on the part of the experi- 
menter, all subjects subsequently viewed precisely 
the same set of 7 slides. These slides had been 
selected from a larger group on the basis of pilot 
data indicating that these slides were all moderately 
arousing, i.e., that unmanipulated ratings of the
slides fell approximately in the middle of the scales 
used in the main study. Thus both instructional sets 
constituted manipulations away from what should 
have been the slides’ true effects on the subjects. 

After delivering his remark, the experimenter 
started the slide projector, which automatically dis-
played each stimulus for 15 seconds, separated by
15-second interstimulus intervals in which no light
was projected. After a 30-second interval, the cycle
was repeated. During the projection period the ex-
perimenter looked away from the subject and said
nothing except to inform the subject that he should 
make his ratings now, as the second cycle was about 
to begin. 

Self-awareness manipulation. The slides were rear-
projected onto a freestanding device that stood ap-
[...] (cf. Neumann, Carver, & Scheier, 1977) [...]
being projected, the [...] but during the inter- [...]

Results and Discussion 


Table 1
Mean Arousal Totals as a Function of
Experimental Treatment (Combined Across
Levels of Self-Consciousness), Experiment 1

Instruction      No mirror    Mirror
High arousal      78.60       70.00ᵃ
Nonarousal        39.85ᵃ      45.63ᵇ

Note. The higher the number, the higher the arousal
ratings.
ᵃ n = 13.
ᵇ n = 16.


The dependent measure was the subject’s
summed ratings of the arousal produced by
the seven stimulus slides (see Table 1). An
analysis of variance of these data revealed,
as one might expect, a large main effect for
instructional set, F(1, 47) = 79.83, p < .001.
Subjects who expected the slides to be non- 
arousing reported being less aroused by them 
(M = 42.74) than subjects who expected the 


slides to be arousing (M = 74.35). Thus the “ 


expectancy manipulation seemed to have been 
effective.” Of greater interest, however, was 
the significant interaction between this sug- 
gestibility manipulation and mirror-manipu- 
lated self-awareness, F(l, 47) = 4.33, p< 
05, of the expected form. Within both set 
conditions, the presence of a mirror led the 
slides to be rated more moderately. In short, 
mirror condition subjects were misled to a 


lesser degree by the experimenter’s sugges- 


tions than were subjects in the no mirror 
condition. As indicated in Table 1, the 
strength of the mirror’s influence on subjects’ 
ratings was relatively equivalent across the 
two instructional set conditions, Thus, al- 
though the interaction term of the analysis of 
variance was significant, neither simple effect 
of the mirror was significant by itself. 
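
The pattern in Table 1 can be checked with a little arithmetic. The sketch below is illustrative only: it takes the four cell means from Table 1, averages them without weighting by cell size (an assumption, since the cell ns differ), and computes the direction and size of the mirror's effect within each instruction condition. The names `CELL_MEANS`, `instruction_mean`, and `mirror_shift` are hypothetical.

```python
# Cell means from Table 1 (summed arousal ratings over seven slides),
# keyed by (instruction, mirror condition).
CELL_MEANS = {
    ("high_arousal", "no_mirror"): 78.60,
    ("high_arousal", "mirror"): 70.00,
    ("nonarousal", "no_mirror"): 39.85,
    ("nonarousal", "mirror"): 45.63,
}


def instruction_mean(instruction):
    """Unweighted average of the two mirror conditions for one instruction.

    Assumption: equal weighting of cells; the published marginal means may
    have been computed differently.
    """
    return (CELL_MEANS[(instruction, "no_mirror")]
            + CELL_MEANS[(instruction, "mirror")]) / 2


def mirror_shift(instruction):
    """Mirror mean minus no-mirror mean within one instruction condition."""
    return (CELL_MEANS[(instruction, "mirror")]
            - CELL_MEANS[(instruction, "no_mirror")])


for instruction in ("high_arousal", "nonarousal"):
    print(instruction,
          round(instruction_mean(instruction), 2),
          round(mirror_shift(instruction), 2))
```

Under these assumptions the mirror shift is negative in the high-arousal condition and positive in the nonarousal condition, so both shifts point toward the middle of the scale, which is the form of the interaction described above.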

The effects of private self-consciousness 
were similar in form to the effects of the 


¹ Practical considerations made it impossible to 
prevent the experimenter from knowing which screen 
was in front of the subject. However, the experi- 
menter did not know the purpose served by the 
different screens or the predictions being tested in 
the study. 

² We had extensively pretested the expectancy 
manipulation in a pilot experiment, and the results 
of this pilot experiment showed that the manipula- 
tion was quite effective. It was for this reason that 
no expectancy manipulation check was obtained in 
either of the main experiments reported above. 
mirror, but weaker. Thus, although the pri- 
vate self-consciousness group means were 
ordered in the expected direction in three of 
the four cases (in the fourth case the means 
were the same), the differences were smaller 
than those produced by the mirror, and no 
effect involving private self-consciousness 
reached an acceptable level of statistical sig- 
nificance. 

In part because of the weakness of the self- 
consciousness effect in Experiment 1, we 
sought to reexamine the effect of private self- 
consciousness on suggestibility, using a 
slightly different experimental setting and a 
more sensitive dependent measure. In Experi- 
ment 2, subjects’ expectancies about a flavor 
intensity were experimentally manipulated 
prior to tasting a solution. Some subjects ex- 
pected the solution to be strong, and others 
expected it to be weak. In contrast to Experi- 
ment 1, however, each subject in Experiment 
2 served as his or her own control. This was 
done by individually adjusting each subject’s 
rating of flavor intensity from his or her 
previous baseline. It was pre- 
dicted that subjects high in private self-con- 
sciousness would be less susceptible to sug- 
gestion than would subjects low in private 
self-consciousness, and would thus be less 
swayed by the information provided by the 
experimenter. 
There was a second, more important, rea- 
son for conducting a second experiment. We 
have argued that self-focused attention makes 
a person more aware of his internal 
state and that the heightened salience of the 
person’s internal state makes him less suscep- 
tible to influence. Although the findings from 
Experiment 1 were consistent with this rea- 
soning, they provide only indirect evidence 
for it. That is, no independent 
evidence was gathered that conclusively 
showed that self-focused subjects were more 
aware of their internal states [...] 
reduced susceptibility to [...] informa- 
tion provided by the experimenter [...] 
conducted by Gibbons et al. (1979). 
Thus, in addition to manipulating subjects’ 
expectancies about the flavor intensity they 
would be tasting, we also varied the actual 
intensity of the solution. Some subjects tasted 
a relatively strong solution, and others tasted 
a solution that was relatively weak. If self- 
focused attention reduces susceptibility to 
suggestion because it makes persons more 
aware of their actual internal states, then 
ratings of the actual intensity of the solution 
should be more veridical among persons 
whose attention is self-focused. Thus a second 
prediction of this study was that persons high 
in private self-consciousness would provide 
more accurate ratings of the actual intensity 
of the solution than persons lower in private 
self-consciousness. 


Experiment 2 
Method 


Subjects 


Three months prior to the experiment, several 
hundred undergraduate men and women at Carnegie- 
Mellon University completed the self-consciousness 
scale (Fenigstein et al., 1975). The private self- 
consciousness subscale was used to select subjects. 
Subjects were divided at the pretest median (Mdn = 
26.1) into those high and those low in private self- 
consciousness. 
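
The median split described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the subject IDs and scores are made up, the function name `split_at_median` is hypothetical, and only the cut point of 26.1 comes from the text.

```python
PRETEST_MEDIAN = 26.1  # pretest median reported in the text


def split_at_median(scores, median=PRETEST_MEDIAN):
    """Map each subject ID to 'high' or 'low' private self-consciousness.

    Scores above the median are labeled 'high'; all others 'low'.
    """
    return {subject: ("high" if score > median else "low")
            for subject, score in scores.items()}


# Hypothetical subscale scores, for illustration only.
example_scores = {"s01": 31, "s02": 22, "s03": 27, "s04": 26}
print(split_at_median(example_scores))
```

Because the cut point is not a whole number, whole-number subscale totals (an assumption about the scoring) could never fall exactly on it.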

A total of 43 men and 31 women participated in 
the study. Two men were dropped from the final 
data analysis: 1 because of suspicion and 1 because 
he failed to follow instructions. Within levels of self- 
consciousness, subjects were randomly assigned indi- 
vidually to one of the four treatment conditions de- 
scribed below, with the restriction that there be at 
least 8 subjects per group (cell ns varied from 8 to 


³ In order to determine whether public self-con- 
sciousness affected the subjects’ ratings, subjects were 
redivided into eight new groups, based on a median 
split of the public self-consciousness subscale. A 2 
(High vs. Low Public) × 2 (Arousal vs. Nonarousal 
Instructions) × 2 (Mirror vs. No Mirror) analysis 
of variance was performed on the data from these 
reconstituted groups. The results of this analysis re- 
vealed the same significant effects reported above. In 
addition, the main effect for public self-consciousness 
was also significant, F(1, 47) = 4.36, p < .05, such 
that subjects high in public self-consciousness re- 
ported being more aroused by the slides than did 
subjects low in public self-consciousness. No other 
significant effects were obtained. 
10). As in Experiment 1, the experimenter remained 
blind with respect to the subject’s self-consciousness 
score until after all data had been obtained. 


Procedure 


The experimenter, who was attired in a white 
laboratory coat, described the study as part of a 
larger physiological research project investigating 
an assortment of variables that influence bodily sen- 
sations. The present study ostensibly examined the 
effect of several of these variables on the experience 
of taste. After these brief introductory remarks, the 
subject was told that he/she would be tasting a 
number of different flavored solutions during the 
experimental session. 

At this point, the experimental procedure was ex- 
plained. The subject was asked to stand in front of 
a table, which was located next to a sink. On the 
table was a tray containing 10 small paper cups with 
a green colored solution in each. The cups were 
arranged in two rows. The first row was marked 
sample and contained four cups. The second row 
contained six cups and was labeled test. The subject 
was told that his/her primary task was to rate the 
intensity of the six flavoring agents in the test row. 
In order to acquaint the subject with the experi- 
mental procedure, however, he/she was being asked 
to rate the intensity of the four sample solutions 
first. In fact, only the first two sample solutions 
were ever rated. Intensity ratings were made along 
an 11-point scale, with the end points labeled “ex- 
tremely weak” and “extremely strong.” Inasmuch as 
the subject needed to have his/her hands free to 
handle the solutions, the intensity rating scale was 
posted on the wall in front of the subject so that 
he/she could make his/her rating verbally. Each 
rating was recorded by the experimenter. 

The experimenter went on to explain that it was 
important that each person hold the flavoring agents 
in his/her mouth for the same length of time. Con- 
sequently, the experimenter would tell the subject 
when to put the solution in his/her mouth, how long 
to hold it there, and when to make the rating. The 
subjects held the solution in their mouths for 10 
seconds before spitting it into the sink. They then 
waited 10 more seconds before making the intensity 
rating. The waiting period was to be used by the 
subject to think about the sensation produced by the 
solution. Once the rating was made, the subjects 
rinsed their mouths with water, waited for 60 sec- 
onds, and then began the next trial. 

The first solution, which consisted of 1.4 parts 
peppermint extract mixed with 100 parts water, was 
the same for each subject. The second solution tasted 
by the subject also consisted of a mixture of pepper- 
mint extract and water but, depending upon treat- 
ment condition, was either stronger or weaker than 
the first solution. In the strong-solution condition, 
the subject tasted a solution consisting of 1.9 parts 
peppermint extract to 100 parts water. In the weak 
condition, the solution consisted of .9 parts pepper- 
mint extract to 100 parts water. The remaining eight 
solutions contained only water. Green food coloring 
had been added to all 10 solutions in order to make 
them similar in appearance. 

In addition to manipulating the actual intensity of 
the second solution, the subject’s expectancy about 
the intensity of the second solution was also varied. 
As the subject reached for the second solution, the 
experimenter casually remarked, “the second solu- 
tion should be a little stronger (weaker) than the 
first.” Thus, depending upon experimental condition, 
the subject expected the second solution to be either 
stronger or weaker than the first one he/she tasted. 
After the second solution was rated, the subject was 
probed for suspicion and debriefed. 

The procedure required that each subject taste and 
rate the same first solution. This procedure was in- 
stituted so that the subject’s rating of the first solu- 
tion might be used as an anchor to which his/her 
second rating might be compared. Thus, an intensity 
change score was derived for each subject by sub- 
tracting his/her rating of the first solution from his/ 
her rating of the second. This intensity change score 
constituted the primary dependent measure. A posi- 
tive intensity score indicates that the second solution 
was rated as stronger than the first; a negative score 
indicates the opposite. 
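
The change score defined above is a simple difference. The following sketch is illustrative only; the function names are hypothetical, and the example ratings are invented (the text does not say how the 11 scale points were numbered).

```python
def intensity_change(first_rating, second_rating):
    """Second-solution rating minus first-solution (anchor) rating."""
    return second_rating - first_rating


def describe_change(score):
    """Translate the sign of a change score into words."""
    if score > 0:
        return "second solution rated stronger"
    if score < 0:
        return "second solution rated weaker"
    return "no change reported"


# Invented ratings, for illustration only.
score = intensity_change(first_rating=6, second_rating=8)
print(score, describe_change(score))
```

Using each subject's first rating as the anchor removes differences between subjects in how they use the scale, which is the sense in which each subject serves as his or her own control.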


Results and Discussion 
Private Self-Consciousness 


A 2 (Strong vs. Weak Solution) × 2 
(Strong vs. Weak Taste Expectancy) × 2 
(High Private vs. Low Private Self-conscious) 
analysis of variance of the change scores re- 
vealed a significant main effect for solution, 
F(1, 64) = 33.11, p < .0001.⁴ As might be 
expected, subjects who received a strong sec- 
ond solution rated it higher in intensity (M = 
1.09) than subjects who received a weak 
second solution (M = −.93). There was 
also a significant main effect for the ex- 
pectancy variable, F(1, 64) = 11.75, p < 
.002. Subjects who expected a strong second 
solution rated the second solution as being 
more intense (M = .68) than subjects who 
expected a weak second solution (M = 
−.52). 
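
The error degrees of freedom in these F ratios can be checked against the sample described in the Subjects section. The sketch below assumes a fully between-subjects factorial design with one score per subject, so that the error df equal the number of subjects minus the number of cells; the function name `error_df` is hypothetical.

```python
def error_df(n_subjects, factor_levels):
    """Error degrees of freedom for a between-subjects factorial design.

    Assumes one score per subject and a full factorial layout, so
    df_error = N - (number of cells).
    """
    n_cells = 1
    for levels in factor_levels:
        n_cells *= levels
    return n_subjects - n_cells


# 43 men + 31 women took part; 2 men were dropped before analysis.
n_analyzed = 43 + 31 - 2
print(error_df(n_analyzed, (2, 2, 2)))
```

With 72 subjects in 8 cells this gives 64, which agrees with the F(1, 64) values reported for Experiment 2.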


⁴ Preliminary analyses of the intensity change 
scores revealed only negligible effects for the gender 
of the subject. The main effect for gender was non- 
significant, as were all the interactions in which 
gender was involved. Consequently, gender was 
dropped as a variable in the design, and the data 
from men and women were combined for all analyses 
reported above. 
Of greater importance, however, was the 
significant Expectancy × Private Self-Con- 
sciousness interaction, F(1, 64) = 4.15, p < 
.05. Consistent with the findings of Experi- 
ment 1 and as predicted, subjects low in pri- 
vate self-consciousness were more affected by 
the expectancy manipulation than were sub- 
jects high in private self-consciousness (see 
Figure 1). The former group rated the second 
solution as being more intense when a strong 
solution was expected, and the second solution 
as less intense when a weak solution was ex- 
pected, than did subjects higher in private 
self-consciousness. 

There was also a marginally significant 
Solution × Self-Consciousness interaction, 
F(1, 64) = p < .16. As can be seen in Table 
2, the form of this interaction was exactly 
opposite to that of the Expectancy × Self- 
Consciousness interaction. Whereas subjects 
high in private self-consciousness were less 
sensitive to the expectancy manipulation, they 
tended to be more sensitive to the actual 
intensity of the solution. As predicted, they 
tended to rate the strong solution as being 
stronger and the weak solution as being 
weaker than did subjects low in private self- 
consciousness. 
Table 2 
Intensity Change Scores as a Function of Private 
Self-Consciousness and Solution Intensity, Experiment 2 

Self-consciousness    Weak solution    Strong solution 
High private          −1.25            1.26 
Low private           −.60             .91 

We expected that subjects high in private 
self-consciousness would be more accurate in 
rating the actual intensity of the solution than 
subjects low in private self-consciousness. 
The form of the Solution × Self-Conscious- 
ness interaction was consistent with this hy- 
pothesis, but the interaction failed to reach an 
acceptable level of statistical significance. It 
is important to note, however, that the Solu- 
tion × Self-Consciousness interaction may not 
have been the most appropriate test of the 
accuracy hypothesis. For half the subjects in 
the study, no conflict existed between their 
expectancies and the actual intensity of the 
solution they tasted. In the absence of con- 
flict, subjects high and subjects low in private 
self-consciousness should have been equally 
accurate in their ratings of flavor intensity. 
Only when the subject’s expectancy conflicted 
with his or her actual experience should pri- 
vate self-consciousness have made a difference 
in the subject’s ratings. The high group, 
whose judgments were presumably based on 
their actual internal states, should have been 
very accurate in their assessment of flavor 
intensity under conflict conditions, whereas 
the low group, whose judgments were based 
less on their internal states, should have been 
misled more by the experimenter’s sugges- 
tions. 

Given these considerations, a slightly dif- 
ferent analysis of the intensity change scores 
was performed. Specifically, subjects were 
first separated into those who experienced 
conflict and those who did not experience 
conflict between expectancy and solution in- 
tensity. High and low private self-conscious 
subjects were then classified into those who 
were accurate in their ratings of flavor in- 
tensity and those who were not. A subject was 
classified as accurate if the sign of his or her 
Table 3 
Number of Persons Making Accurate Flavor Intensity Ratings as a Function of Private 
Self-Consciousness and Presence or Absence of Conflict Between Expectancy and Actual 
Solution Intensity, Experiment 2 

                      Conflict absent          Conflict present 
Self-consciousness    Accurate   Inaccurate    Accurate   Inaccurate 
High private          12         4             12         5 
Low private           17         2             5          15 


intensity change score was in the same direc- 
tion as the actual intensity of the second 
solution tasted. Thus a subject was classified 
as accurate if he or she rated the second 
solution as being weaker than the first and it 
was in fact weaker, or if he or she rated the 
second solution as being stronger than the 
first and it was in fact stronger. A subject was 
classified as inaccurate either if his or her 
change score was zero (i.e., if the subject 
failed to report a change in flavor intensity 
when one in fact occurred) or if the sign of 
his or her change score was opposite in direc- 
tion to the actual intensity of the second solu- 
tion tasted. 
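
The classification rule just described can be written out explicitly. The sketch below is one possible encoding, with hypothetical function names and string labels ("strong", "weak") chosen for illustration.

```python
def is_accurate(change_score, solution):
    """Apply the accuracy rule to one subject.

    A rating is accurate only if the sign of the change score matches the
    actual second solution: positive for 'strong', negative for 'weak'.
    A change score of zero is always inaccurate.
    """
    if solution == "strong":
        return change_score > 0
    if solution == "weak":
        return change_score < 0
    raise ValueError("solution must be 'strong' or 'weak'")


def in_conflict(expectancy, solution):
    """True when the suggested intensity and the actual intensity disagree."""
    return expectancy != solution


# A subject told to expect a stronger solution who actually tasted the weak one.
print(in_conflict("strong", "weak"), is_accurate(-2, "weak"))
```

Treating a zero change score as inaccurate follows the text: every subject's second solution really did differ from the first, so reporting no change is a miss.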

The results of this analysis are presented 
in Table 3. As can be seen from the left half 
of this table, when there was no conflict be- 
tween expectancy and solution intensity, the 
accuracy of flavor ratings was high for both 
subjects high and subjects low in private self- 
consciousness. The difference between the two 
groups was trivial and nonsignificant, χ² < 
1.0. A very different pattern of results 
emerged, however, when accuracy was ex- 
amined among only those subjects for whom 
a conflict existed between expectancy and 
solution intensity. As can be seen from the 
right half of Table 3, subjects high in private 
self-consciousness were still quite veridical in 
their assessment of flavor intensity. Of the 17 
subjects high in private self-consciousness for 
whom conflict existed, 12 rated the intensity 
of the second solution accurately, and 5 rated 
it inaccurately. In contrast, of the 20 subjects 
low in private self-consciousness who were 
in conflict, only 5 rated the flavor intensity 
accurately. This difference in the accuracy 
of the intensity ratings made by subjects high 
and low in private self-consciousness who 
were in conflict was highly significant, χ²(1) 
= 5.96, p < .02.⁵ These findings strongly 
suggest that high private self-conscious per- 
sons are in fact more cognizant of their actual 
internal states than low private self-conscious 
persons, as predicted. Moreover, these find- 
ings also lend credence to the contention that 
it is precisely such differences in awareness 
of internal states between people who are 
high and people who are low in private self- 
consciousness that account for the differences 
between them in susceptibility to suggestion. 
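
The chi-square values reported for Table 3 can be reproduced from the cell counts. The text does not say whether a continuity correction was used; the sketch below assumes the Yates correction because that formula yields the reported value of 5.96 for the conflict-present half of the table, whereas the uncorrected formula yields a larger value. The function name `chi_square_yates` is hypothetical.

```python
def chi_square_yates(a, b, c, d):
    """Yates-corrected chi-square for a 2 x 2 table of counts.

    Layout: row 1 = (a, b), row 2 = (c, d).
    chi2 = N * (|ad - bc| - N/2)^2 / (row1 * row2 * col1 * col2)
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * corrected ** 2 / denominator


# Rows: high private, low private. Columns: accurate, inaccurate.
conflict_present = chi_square_yates(12, 5, 5, 15)
conflict_absent = chi_square_yates(12, 4, 17, 2)
print(round(conflict_present, 2), round(conflict_absent, 2))
```

Under this assumption the conflict-present value rounds to 5.96 and the conflict-absent value falls below 1.0, matching both statistics given in the text.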

It should be noted that our purpose in 
conducting the present research was to docu- 
ment the fact that self-attentive persons are 
less susceptible to suggestion than persons 
whose attention is not self-focused, and the 
findings from both experiments offer strong 
evidence that this is indeed the case. It is 


⁵ The above analyses tested the accuracy hypoth- 
esis by using nonparametric statistics. However, two 
2 (High vs. Low Private) × 2 (Strong vs. Weak Solu- 
tion) analyses of variance were also performed on 
the raw intensity change scores, one analysis for 
each conflict condition. The results of these analyses 
paralleled those obtained for the nonparametric 
analyses reported above. Thus, the analysis of vari- 
ance of the data for no conflict subjects revealed 
only a significant main effect for solution intensity, 
F(1, 31) = 46.09, p < .001. Subjects rated the strong 
solution as being stronger than the weak solution, 
regardless of whether they were high, F(1, 31) = 
18.37, p < .01, or low, F(1, 31) = 28.22, p < .01, in 
private self-consciousness. In contrast, the analysis 
of variance performed on the data for subjects who 
were in conflict revealed a significant interaction 
between self-consciousness and solution intensity, 
F(1, 33) = 5.61, p < .03. When in conflict, subjects 
high in private self-consciousness still rated the 
strong solution as being stronger than the weak solu- 
tion, F(1, 33) = 6.77, p < .02, but those low in pri- 
vate self-consciousness did not, F < 1.0. 
also true, however, that self-attention did not 
entirely eliminate suggestibility effects. That 
is, even self-aware subjects in Experiment 1 
differed in their ratings of the stimulus slides 
as a function of the arousal instructions (see 
Table 1), and even high private self-conscious 
subjects in Experiment 2 differed in their rat- 
ings of solution intensity as a function of the 
expectancy manipulation (see Figure 1). 
What is important, though, is that self-atten- 
tive subjects were significantly less susceptible 
to both of these manipulations than were sub- 
jects who were less self-attentive. 


Subsidiary Analysis 


In order to determine the effect of public 
self-consciousness on intensity ratings, sub- 
jects were redivided into eight new groups, 
based on a median split of the public self- 
consciousness subscale. A 2 (High vs. Low 
Public) × 2 (Strong vs. Weak Solution) × 2 
(Strong vs. Weak Taste Expectancy) analysis 
of variance was then performed on the 
intensity change scores. The previously 
mentioned main effect 
for solution was still significant, F(1, 64) = 
29.81, p < .001, as was the main effect for the 
expectancy variable, F(1, 64) = 13.38, p < 
.001. Neither the main effect for public self- 
consciousness, F(1, 64) = 2.46, p > .05, nor 
any interactions involving public self-con- 
sciousness were significant. Thus, the effects 
of public self-consciousness in Experiment 2 
were negligible. 


General Discussion 


Experimental Findings 
[...] research takes this notion a step farther, how- 
ever. The results of both studies appear to 
indicate that self-attention can lead to height- 
ened awareness of internal experience even 
when the experience directly contradicts ex- 
perimentally induced expectancies. 

Moreover, the greater accuracy of self- 
report that comes with increased attention 
is not limited to reports of bodily states. That 
is, previous research has shown that self- 
reports of behavioral tendencies are most 
accurate when obtained under conditions of 
experimentally heightened self-awareness 
(Pryor et al., 1977). Similarly, persons high 
in private self-consciousness have been shown 
to provide more veridical self-reports of their 
behavioral tendencies than do persons lower 
in private self-consciousness (Scheier et al., 
1978). In brief, self-directed attention ap- 
pears to be an important determinant of the 
extent to which a person is aware of his or 
her bodily states, attitudes, and dispositions. 
As such, it is also an important determinant 
of the accuracy with which those states, atti- 
tudes, and dispositions are reported. 


Related Suggestibility Findings 


We have shown in the present research that 
self-focused attention can heighten a person’s 
cognizance of his or her internal state and 
thereby make that person more resistant to 
the suggestions and influence of others. In 
contrast, previous research has shown that 
heightened self-focus can also lead a person 
to be less resistant to the influence of others 
and more likely to conform (Carver, 1974; 
Duval, 1976; Wicklund & Duval, 1971). For 
example, Wicklund and Duval (1971) found 
that subjects who listened to their own voices 
on a tape recorder conformed more with the 
modal opinion of a positive reference group 
than did subjects who listened to the tape- 
recorded voice of another person. 

There are at least two ways, however, in 
which these previous studies differ from the 
present research, and these differences be- 
tween the two sets of studies may account 
for the different sets of findings. The first 
difference involves the specific self-aspect to- 
ward which the influence attempt was di- 
rected. Our own research and the earlier re- 
search by Gibbons et al. (1979) attempted 
to influence the person’s belief about his or 
her internal bodily state. This research uni- 
formly showed greater resistance to sugges- 
tion as a function of increased self-focus. In 
contrast, the earlier research showing de- 
creased resistance to suggestion either at- 
tempted to influence the person’s opinions, 
beliefs, or standing on some attitude topic 
(Wicklund & Duval, 1971, Experiment 1) or 
attempted to influence his or her assessment 
of an ambiguous external stimulus (Duval, 
1976). 

Thus, in each of these earlier studies, an 
attempt was made to influence the person’s 
opinion about some aspect of self (attitudes 
or beliefs) or some aspect of the external 
environment (an ambiguous stimulus) for 
which there was no clear right or wrong posi- 
tion. That is, in each case the correctness of 
the person’s position required extensive social 
validation (cf. Festinger, 1954). In contrast, 
the correctness of one’s perceptions of one’s 
own bodily states may require less social 
validation. Persons have direct access to in- 
formation that enables them to determine for 
themselves whether they are aroused or not, 
whether something tastes strong or weak, and 
so on.⁶ In brief, we suggest that important 
differences exist in the extent to which a per- 
son’s judgments about himself or herself re- 
quire social validation. For some self-aspects, 
like bodily states, little social validation may 
be necessary, and it may be for only those 
self-aspects, like bodily states, that self- 
focused attention will increase resistance to 
suggestion. 

A second possible difference between the 
two sets of studies rests on the distinction 
that has been drawn between private and 
public self-consciousness. Although the pri- 
vate-public distinction has so far been applied 
solely to individual differences in self-con- 
sciousness, only a small extrapolation is 
needed to apply the distinction to manipula- 
tions of self-awareness as well. Indeed, Buss 
(in press) has already suggested that self- 
awareness manipulations might be usefully 
categorized in terms of whether or not they 
are likely to induce awareness of the private 
or public aspects of self. He argues, for ex- 
ample, that exposure to a mirror is likely to 
induce private self-awareness, and the paral- 
lels that have been obtained between the 
mirror and private self-consciousness would 
seem to substantiate this claim (see, e.g., 
Carver & Scheier, 1978; Scheier, 1976; 
Scheier & Carver, 1977). On the other hand, 
a variety of manipulations might induce pub- 
lic self-awareness and make the person more 
cognizant of himself as a social object. Ex- 
amples of manipulations that might fall into 
this category are exposure to an audience, ex- 
posure to a TV camera, and hearing one’s own 
tape-recorded voice. 

Although the effect of public self-conscious- 
ness in the present research was negligible, it 
should be remembered that we attempted to 
influence the person’s perception of his or her 
bodily state. The fact that public self-con- 
sciousness had little effect in this circum- 
stance is not necessarily unexpected. That is, 
perhaps public self-consciousness is relevant 
to suggestibility only when the influence at- 
tempt involves self-aspects that require more 
extensive social validation. In this regard, it 
is interesting to note that virtually all of the 
earlier research in which self-focused atten- 
tion increased suggestibility used manipula- 
tions that might have heightened public self- 
awareness, for example, the person’s tape- 
recorded voice (Wicklund & Duval, 1971) 
and a TV camera (Duval, 1976). In addition, 
the influence attempts in those studies were 
aimed at self-dimensions that typically re- 
quired validation from others: an attitude 
(Wicklund & Duval, 1971) and an ambiguous 
judgment (Duval, 1976). Thus, the manipula- 
tions used in those studies may have increased 


⁶ It should be noted that these comments about 
social validation and ambiguity are meant only to 
apply to the person’s ability to identify in himself 
the presence or absence of a particular physiological 
state (e.g., identifying the presence or absence of 
arousal). They are not meant to apply to the person’s 
ability to provide a name for his state (e.g., labeling 
the arousal as fear; cf. Schachter & Singer, 1962). 
Nor are they meant to apply to the person’s ability 
to identify the source or cause of his state (e.g., 
knowing that the arousal was produced by running; 
cf. Zillman, Johnson, & Day, 1974). 


the person’s awareness of self as a social ob- 
ject, with a concomitant increase in concern 
over the appraisal and evaluations of others, 
in contexts where others’ opinions are needed 
to evaluate one’s own position. If so, the fact 
that these earlier studies found increased sug- 
gestibility due to heightened self-focus be- 
comes less surprising. 


Theoretical Implications 


As was noted earlier in this article, the 
finding that self-directed attention can weaken 
suggestibility phenomena has important theo- 
retical implications. These implications con- 
cern two alternative interpretations that 
might be invoked to explain findings of pre- 
vious research in which mirrors were used as 
manipulators of self-focus. One of those inter- 
pretations is based on drive theory, the other 
on the notion of experimental demand. 

In its simplest form, a drive-theory inter- 
pretation of mirror effects holds that the 
presence of a mirror leads to heightened drive 
or arousal, which in turn causes dominant re- 
sponses to be emitted with increased fre- 
quency. Thus subjects’ behavior is more in 
line with salient behavioral standards when a 
mirror is present than when it is absent (cf. 
Liebling, Seiler, & Shaver, 1974). The second 
potential explanation of the effect of a mirror 
is that it makes subjects more sensitive to 
demand characteristics (Orne, 1962) that are 
inherent in the experimental situation. Ac- 
cording to this argument, subjects in the pres- 
ence of a mirror do what they think the ex- 
perimenter wants them to do to a greater 
degree than do subjects with no mirror. 

Although either of these two explanations 
might provide an adequate account for some 
of the mirror-induced effects (e.g., Carver, 
1974, 1975; Gibbons, 1978; Scheier, Fenig- 
stein, & Buss, 1974), there are a number of 
other findings that neither approach can ex- 
plain (e.g., Duval & Wicklund, 1973; Gib- 
bons et al., 1979; Carver & Scheier, Note 1). 
Nor can they account for the present findings. 
If a mirror increases arousal, all of the sub- 
jects in the present Experiment 1 who re- 
sponded in the presence of a mirror should 
have experienced, and thus reported, more 
arousal than those with no mirror. Instead, 
the mirror manipulation interacted with the 
bogus information provided the subjects. Al- 
ternatively, if a mirror increases the emission 
of dominant responses, the interaction should 
have taken a form opposite to that which did 
occur, as subjects emitted the responses that 
had been rendered dominant by the experi- 
menter’s suggestions. Finally, if a mirror in- 
creases responsivity to experimenter demand, 
then self-aware subjects in the present re- 
search should have been more influenced by 
the experimenter’s comments. The fact that 
they were less swayed by the suggestions of 
the experimenter than were less self-attentive 
subjects renders a demand-based interpreta- 
tion of the present and previous findings 
much less tenable. 

It is also worth noting that any attempt to 
apply drive or demand interpretations to mir- 
ror effects seemingly implies that such inter- 
pretations should be applicable to self-con- 
sciousness effects as well. The self-conscious- 
ness scale, however, has been found to be 
unrelated to measures of test anxiety, social 
anxiety, achievement motivation, general 
emotionality, impulsivity, intelligence, and 
social desirability (Carver & Glass, 1976; 
Turner et al., 1978). Drive or demand inter- 
pretations of self-consciousness effects would 
seem to require that some of these correla- 
tions be positive. In brief, the most viable 
interpretation of the accumulated research 
involving both manipulated self-awareness 
and dispositional self-consciousness appears 
to be based on attention. It is the only ex- 
planation that can simultaneously account for 
the effects of each of these variables, and it is 
the only explanation that can account for the 
entire set of findings. 


Reference Note 


1. Carver, C. S., & Scheier, M. F. The self-attention- 
induced feedback loop and human motivation: A 
control-systems analysis of social facilitation. 
Manuscript submitted for publication, 1979. 

tive support for age-related changes regarding 
some of the levels (a Levels × Populations 
interaction), although their results are open to 
alternative interpretations. 

Perhaps the most fundamental criticism re- 
lates to the adequacy of the stories used. 
First, there is doubt as to whether at least 
some of the stories reflect the criteria that 
they are intended to represent, especially at 
the foreseeability level (Sedlak, 1978). Sec- 
ond, factors such as effort that have been 
shown to affect AR (Raymond & Dillehay, 
Note 1) are confounded with the attribution 
criteria. Third, each level is presented in 
different situational contexts, a variable that 
affects AR according to Shaw and Sulzer’s 
(1964, p. 45) qualitative observations. The 
inclusion of only two stories at each level may 
be insufficient to prevent the confounding of 
level and story context. Finally, the use of 
the same story character is likely to increase 
subject reactivity. These kinds of factors pos- 
sibly produced anomalous findings such as a 
decrease in children’s AR from foreseeable 
(but accidental) to intentional negative 
outcomes. Later research in this tradition (e.g., 
Shaw & Reitan, 1969; Shaw & Schneider, 
1969; Sulzer & Burglass, 1968) is similarly 
plagued by basic problems of interpretation, 
and even though a new more satisfactory set 
of Perry stories has been devised, they have 
to date only been used with children in a 
cross-cultural study (Shaw & Iwawaki, 1972). 

In an attempt to meet the above criticisms, 
Harris (1977) used a single situation to pre- 
sent Heider’s levels to five different age groups 
(7-, 9-, 12-, and 13.5-year-olds and college 
students). Utilizing a between-subjects design, 
he obtained partial support for Heider’s theory. 
Although an Age × Stimulus Level interaction 
was found for the perception of causality, 
this interaction only approached significance 
(p < .09) for AR. Furthermore, no 
distinction was found between causality and 
foreseeability despite a successful check on 
the manipulation of this variable. Finally, 
levels of the stimulus event did not affect 9- 
year-olds’ perceptions of naughtiness but did 
have an effect on adjacent age groups. No 
explanation is offered for this result. Perhaps 
the strongest support for the age-related nature 
of Heider’s levels stems from a recent 
study by Sedlak (1979). Despite an 
exceptionally small sample, she showed via 
multidimensional scaling that 8-year-olds, 11- 
year-olds, and adults utilized Heiderian criteria 
as a dimension in their cognitive representations 
of stimulus stories. Moreover, the 
salience of this dimension (relative weighting) 
was directly related to age, supporting the 
notion of increasing AR differentiation with 
development. However, even adults did not 
distinguish Heider’s last two levels of intentionality 
and justifiability, thus replicating 
Shaw and Sulzer’s (1964) earlier results. It 
thus appears that although previous research 
is adequate for testing aspects of Heider’s 
(1958) ideas, the relevant attribution criteria 
have not been adequately operationalized, nor 
have developmental differences in AR been 
fully investigated. For example, the implicit 
assumption that AR is cumulative because 
higher levels imply lower levels of attribution 
(e.g., perceived intentionality implies perceived 
foreseeability, causality, etc.) has never 
been evaluated. Consequently, the status of 
Heider’s last two levels (intentionality and 
justifiability) is unclear, and it is not known 
whether age differences in AR actually reflect 
developmental stages. 

One aspect of Heider’s theory, the develop- 
ment of AR based on intention, appears to 
have been established by Piagetian research. 
Both early and recent methodologically innovative 
studies show that the degree to which 
intentional perpetrators are held responsible 
for outcomes increases with [...] 
Gutkin, 1972; Hebble, 1971). As Karniol 
(1978) notes, however, these studies tend to 
vary intentionality in the context of accidentally 
produced outcomes and hence [...] 
valid tests of Heider’s (1958) two [...] In 
contrast, some investigations have not found 
the traditional age shift in evaluations of 
intentional versus accidental [...] 
1971; Farnill, 1974; [...] 

[...] tions of AR occur via the child’s evaluation 
of another’s actions. However, Piaget (1932) 
clearly stated that children’s moral judgments 
of themselves and others would differ. In re- 
spect to the self, children succeed much earlier 
(at age 3 or 4 years) “in differentiating in- 
tentional faults from involuntary breaches of 
the moral code” (p. 180). A second limitation 
relates to the almost exclusive focus on a single 
attribution measure. With the exception of 
Harris (1977), all studies assess only moral 
attribution. But as Shaver (1975, p. 102) 
points out, attribution of causality and re- 
sponsibility are not necessarily synonymous, 
although they are often assumed to be. Fi- 
nally, attempts to obviate subject reactivity 
have often resulted in the use of between- 
subjects designs, whereas a within-subjects 
design is necessary for investigating the cumu- 
lative nature of Heider’s AR levels. 


Experimental Overview 


The present study attempts to clarify cer- 
tain conceptual issues and meet several of the 
methodological problems outlined above. Con- 
ceptually, it was considered that Heider’s 
(1958) analysis of AR was incomplete at the 
highest levels. Although a facilitative cause is 
held to diminish responsibility (justifiability), 
no mention is made of a converse negative en- 
vironmental force. Theoretically, such a force 
should, in accordance with Kelley’s (1971) 
augmentation principle, increase responsibil- 
ity, and hence a sixth level was hypothesized. 
In addition, it seems plausible that a more 
accurate developmental picture may be 
yielded by investigating AR in relation to 
the self. Consequently, negative-outcome 
stimulus stories representing six attribution 
criteria were administered to five developmental 
groups (6-7.5, 8-9.5, 10-11.5, 12-13.5 
years, and adults) under two stimulus 
conditions; the actor was either a hypothetical 
other or the self. 

Methodologically, this study differed from 
previous experiments in two major respects. 
First, a Latin square design was used to con- 
trol for story context. Hence levels were 
varied within a single story theme. To retain 
the advantages of a within-subjects design 
(cf. Sulzer, Note 3) and in order to 
evaluate the scalability of Heider’s levels 
without increasing subject reactivity, six 
story contexts were generated yielding 36 
stimulus items. Subjects were thus presented 
with a different level from each of the story 
themes. Second, two different attribution 
measures were used. One required an ascrip- 
tion of causality, whereas the other asked for 
an assessment of blame. 

It was hypothesized that the attribution 
measures would evoke two distinct attribu- 
tion styles leading to different response pat- 
terns. More specifically, an interaction be- 
tween age and story character (self versus 
other) was predicted for the moral judg- 
ment measure on the basis of Piaget’s (1932) 
theory. Younger children were expected to 
evidence a difference in these two conditions, 
whereas older ones were not. No such inter- 
action was predicted in regard to perceived 
causality. In terms of Heider’s (1958) theory, 
it was anticipated that both attribution 
measures would evidence an effect for stimulus 
level, the subject’s age, and the interaction 
of the two factors. Generally, younger 
children were expected to show a high and 
more global level of attribution across the 
stimulus levels in contrast to older subjects’ 
differentiated attributions, which should show 
greater differences across the levels. 

However, it was anticipated that the na- 
ture of the interaction between age and stim- 
ulus level would be different for the two 
attribution measures. In regard to blame, a 
convergence of age groups was hypothesized 
to occur only at the higher levels, including 
intentionality, whereas attribution of causal- 
ity might coincide at the levels of causality 
or foreseeability. This prediction was based 
on the belief that judgment of causality is 
not primarily an evaluative event and may 
therefore result in an earlier consensus. Also, 
attribution of blame at the adult level should 
only be maximal at the highest stimulus 
levels, whereas the less differentiated re- 
sponses of children may show such blame at 
lower levels. 

Finally, it was held that Heider’s AR 
model should display the characteristics of a 
Guttman scalogram as suggested by Fishbein 
and Ajzen (1973). However, previous evidence 
for the developmental nature of these 
AR criteria is based solely on quantitative 
group scores and is thus open to question. 
No attempt has been made to look at in- 
dividual response patterns in a developmental 
context. 


Method 
Subjects 


Two hundred and forty persons participated in 
the study. Children were drawn from two Oxford 
schools and fell into one of four age groups deter- 
mined by whether it was their second (for age, M 
= 82.75 months, SD = 4.83), fourth (M = 105.55 
months, SD = 5.05), sixth (M = 132.47 months, SD = 
3.43) or eighth (M = 150 months, SD = 4.79) year of 
formal education. To control for socioeconomic 
class, the schools selected were sited on the same 
council housing estate (predominantly lower middle 
class) within 600 m of each other. The adult group 
comprised both graduates and undergraduates at 
the University of Oxford as well as students from 
Oxford Polytechnic. 

Equal numbers of male and female subjects were 
tested at each age level. The sample was almost 
exclusively white; only one or two black children 
were present in each age group. 


Stimulus Stories 


Stimuli comprised stories reflecting six familiar 
contexts (e.g., riding a bicycle) within which the 
attribution levels were varied. The construction of 
the stories followed three stages. Initially, 10 grad- 
uate students in psychology rated the stimuli on 
each of Heider’s extended attribution criteria. On 
the basis of these responses, a second modified set of 
stories was constructed and again rated. Informal 
discussion following this procedure led to further but 
less extensive modifications. The full set of 36 stories 
was then presented to 6 senior PhD students in social 
psychology who had not participated in the 
previous phase. The students were asked to classify 
each story according to level after they had read 
Heider’s (1958, pp. 113-114) description of the levels 
and the rationale for including a further stage. Be- 
cause unanimity was obtained, no further structural 
changes were made. Finally, the stimuli were in- 
formally discussed with two teachers who were asked 
to comment on the linguistic complexity of the stim- 
uli and on the children’s familiarity with the words 
used. A few minor alterations consisting mainly of 
word changes were made at this stage. 

Because the precise manner in which Heider’s 
levels of responsibility are operationalized is of some 
importance, the structural features of each are presented 
below. It should be noted that an attempt 
was made to follow the exact general examples given 
by Heider: 

Level 1 (association). At this level the actor was 
associated with the outcome in two ways: by spatial 
and temporal contiguity and by ownership. Typically, 
a victim is accidentally hurt in the actor’s presence 
and by an object belonging to the actor. 

Level 2 (causality). Here the actor merely causes 
the outcome. In a situation where competence can 
reasonably be assumed, the actor accidentally hurts 
a victim of whose presence he or she is not even 
aware (e.g., the actor slips while running). 

Level 3 (foreseeability). Although the harm is still 
caused accidentally, foreseeability is achieved by a 
slight change of conditions. The actor experiences 
a lack of competence (e.g., the actor continually slips 
while running along a wet surface) and is aware of 
being a potential harmdoer (the actor sees the victim 
come and run too close to him or her). 

Level 4 (justifiability). Justifiability was repre- 
sented by an “eye for an eye” situation. Hence when 
the actor saw someone prepare to attack him or her 
in some way (e.g., run past and knock the actor 
over while running), the actor responded to the 
provocation by doing to the other person exactly 
what would have been done to the actor as the 
victim. 

Level 5 (intentionality). The actor behaves in the 
same manner as at the previous level, but there is no 
provocation. The actor thus appears as a free, un- 
coerced agent. 

Level 6 (“supererogation”). This level is exactly 
the same as the previous one, except that the actor 
contravenes a request by the victim’s mother to be 
careful “as my boy (girl) gets hurt very easily.” 
(Notice that the order of Heider’s, 1958, Levels 4 
and 5 has been reversed to create a ranking of increasing 
internality of causation.)¹ 

A characteristic common to all the stories was that 
they represented a realistic description of an action 
sequence; they only contained information that 
would have been available to an observer of the 
episode. Hence attributional criteria (e.g., intentionality) 
were not explicitly mentioned in any of 
the stories. However, an attempt was made to keep 
the demands made on social inference abilities at a 
bare minimum to avoid results that merely reflected 
[...] on this variable. A final feature 
[...] outcomes. In three of the story contexts, the consequence 
of the actor’s behavior was a bleeding nose, 
whereas in the other three he/she produced a bleeding 
lip. These outcomes were chosen because they 
were judged to be of medium intensity. 

¹ Attribution of responsibility is a function of 
both the observers’ response level and the stimulus 
level. Heider orders his criteria according to the 
former, whereas the reversal is made on the latter. 
The present attempt to order the levels according 
to increasing attribution was done to allow the 
emergence of a Guttman scale response [...] 

Stimulus stories were combined to produce six sets 
of stimuli that contained one attribution level from 
each story context. The procedure followed is detailed 
in Winer (1971, p. 689). Since a Greco-Latin square 
design, which would have allowed assessment of any 
order effect in addition to the context variable, was 
impracticable in terms of subject number, a fixed 
order of presentation was followed. The exact order 
was randomly determined. It should be noted that 
any order effect should have worked against the 
hypotheses of the study. 
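
The counterbalancing described above can be illustrated with a minimal sketch. It assumes a simple cyclic Latin square; the actual procedure followed Winer (1971, p. 689) and is not reproduced here. The function name `build_stimulus_sets` and the numeric coding of story contexts and attribution levels are hypothetical.

```python
def build_stimulus_sets(n=6):
    """Return n stimulus sets, each a list of (context, level) pairs.

    Cyclic Latin square (an illustrative assumption, not the
    source's documented construction): set s pairs story context c
    with attribution level ((c + s) mod n) + 1.
    """
    return [
        [(c + 1, (c + s) % n + 1) for c in range(n)]
        for s in range(n)
    ]


sets = build_stimulus_sets()
for number, stimulus_set in enumerate(sets, start=1):
    # Each set contains every context once and every level once.
    print(f"Set {number}: {stimulus_set}")
```

Under this construction every subject sees each story context and each attribution level exactly once, and across the six sets all 36 context-by-level stories are used.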


Procedure 


All subjects were administered one of the six stim- 
ulus story sets. To facilitate younger children’s 
understanding of the task requirements, 6- and 8- 
year-olds were read the stories in small groups of 4 
so that individual attention could be given to each 
subject. Owing to practical difficulties, all other sub- 
jects read the stories in normal class (lecture) pe- 
riods (25-30 persons). An attempt was made to 
avoid comprehension difficulties among the older 
children by specifically excluding subjects known to 
have reading difficulties or learning problems of any 
sort. In addition, three adults were freely available 
to assist whenever any difficulty was encountered by 
a subject. Their help was, however, rarely required, 
as almost all the children found the task require- 
ments fairly simple. A further precaution was taken 
in constructing the test booklets. Each story was 
presented on a separate page followed by the appro- 
priate rating scale alongside which appeared a clear 
line drawing of the central object in the story. Any 
comprehension difficulties should have led to random 
responding and would therefore not have supported 
the experimental hypotheses. 

Before proceeding with the task, each group was 
told that sincere and honest individual responses were 
important even though the study was concerned with 
group differences. Subjects were assured of complete 
anonymity to allay any further anxiety, and the im- 
portance of the results was emphasized. In the self- 
actor condition, subjects were told that they would 
be presented with imaginary stories about themselves 
and were asked to respond to the situation as if it 
had actually occurred. The other-actor condition 
stories were described as situations that had occurred, 
and subjects were asked to give their evaluations. 
Some time was spent explaining the use of the rating 
scales and ensuring that the concepts of blame and 
cause were understood. Where necessary, examples 
were given. It should be emphasized that both the 
language used and extent of elaboration given were 
varied as a function of the subjects’ age and apparent 
comprehension. The latter was considered more im- 
portant than adherence to a rigidly worded set of 
instructions. Finally, subjects were given two very 
explicit practice items and their correct use of the 
rating scale was checked by supplying appropriate 
feedback. 

Each subject’s responsibility attribution was assessed 
by asking, “How much should ACTOR be 
blamed for the boy (girl) getting hurt?” (in the 
experimental situation the appropriate name or pro- 
noun was used). Subjects indicated their answers on 
a 7-point rating scale comprising a single empty 
rectangle and six Xs of increasing height. The rec- 
tangle was labeled “none,” the midpoint, “medium,” 
and the last X, “totally.” The first and fifth Xs were 
labeled “very little” and “very much,” respectively. 
The rectangle was used to emphasize the qualitative 
distinction between attributing blame and no blame, 
in contrast to the quantitative differences in the 
amount of blame assigned, represented by increasing 
Xs. 

Subjects were then asked, “How much was ACTOR 
the cause of (reason for) the boy (girl) getting 
hurt?” Again, subjects indicated their answers by 
ticking an appropriate scale point. Darkened rec- 
tangles replaced the Xs in an attempt to ensure that 
the two questions that followed each story were 
answered on the appropriate scale. The order of the 
questions was not varied, since it was judged that 
the use of only two questions was unlikely to result 
in any fatigue or similarly distracting effects. 


Results 


An initial 2 (condition) × 5 (age) × 6 
(level) × 2 (attribution measure) analysis of 
variance was conducted to determine whether 
separate analyses for the two dependent mea- 
sures were justified. As hypothesized, a sig- 
nificant main effect for attribution measure, 
F(1, 230) = 6.52, p < .05, was found. Hence 
perceived blame and causality served as de- 
pendent measures in separate univariate 
analyses. 

Because the self-versus-other as actor was 
hypothesized as an interaction effect, analyses 
for sex and story context differences were 
conducted on the data for each condition 
separately. 


Sex Differences in Attribution 


To test for sex differences, a 5 (age) × 6 
(level) × 2 (sex) analysis of variance was 
carried out. In respect to both perceived 
blame and causality, sex produced neither a 
significant main effect nor an interaction in 
either actor condition (p > .10). Consequently, 
data for both sexes were combined 
in subsequent analyses. 

Figure 1. Mean moral attribution as a function of stimulus level for other as actor. 


Story Context Effects 


In evaluating possible story context effects, 
age was not included as a separate factor because 
stable differences were considered unlikely 
to emerge with only four subjects per 
cell. A 6 (story) × 6 (level) analysis of variance 
showed that story context did not 
emerge either as a main effect or in interaction 
across both dependent measures and 
administration conditions (p > .10). All further 
analyses therefore also ignored story 
context. 


Measure of Perceived Blame 


It was hypothesized that subjects’ blame 
judgments would be affected by the subjects’ 
age, the stimulus level, and the interaction 
of these two factors, whereas administration 
condition (self vs. other) was held to be 
important only at a younger age. Subjects’ 
mean scores in both conditions are displayed 
in Figures 1 and 2. These data curves suggest 
a configuration consistent with the predicted 
results. An analysis of variance performed 
on the data revealed that the effects 
of subject’s age, F(1, 230) = 13.92, p < 
.001; stimulus level, F(5, 1150) = 396.79, 
p < .001; the interaction of these two factors, 
F(20, 1150) = 4.92, p < .01; and the 
interaction of story character with age, F(4, 
230) = 2.73, p < .05, were all significant. No 
other significant results were found. 

Figure 2. Mean moral attribution as a function of stimulus level for self as actor. 

To further investigate the relationship between 
subjects’ age, the stimulus level, and their 
interaction, simple main effects analyses of 
variance were computed following the [...] 
(1968, pp. 284-294) recommended procedure. 
The results were consistent with the prediction 
that younger subjects 
would attribute more blame at lower 
levels of the stimulus factor than adults. A 
simple main effect for age was found in all 
but the last two levels (in these and subse- 
quent analyses the .01 level was adopted un- 
less otherwise stated). 

A second predicted feature of the age- 
stimulus level relationship related to possible 
differential effects of the stimulus level at 
various ages. The simple main effects analysis 
showed a significant stimulus level effect 
at all ages. To probe these effects in greater 
detail, comparisons between adjacent levels 
of the stimulus factor were made. The pattern 
of results supports the notion of increased 
differentiation of AR criteria with 
age. While 6-year-olds significantly distinguished 
only accidental from intentional outcomes 
(Levels 3 and 4), 8-year-olds in addition 
differentiated Levels 4 and 5. Ten-year- 
olds drew a further distinction in clearly 
separating Levels 2 and 3, an additional dis- 
tinction not made by the 12-year-old group. 
Finally, adults clearly distinguished all levels 
except the first and last pairs. 

A similar main effects analysis of variance 
was conducted to determine the exact nature 
of the Story Character × Age interaction. It 
was found that only the youngest age group 
was affected by actor type in their attribu- 
tion of blame. Further investigation showed 
that this difference was manifest at the as- 
sociation, causality, and justifiability levels. 
However, the direction of the difference was 
opposite to that expected; greater blame 
was attributed in the self- than in the other 
condition. 


Measure of Perceived Causality 


The mean scores for attribution of causality 
appear in Figures 3 and 4. As displayed 
in these figures, the pattern of perceived 
causality confirms the predicted effects for 
subjects’ age, stimulus levels, and their interaction. 
A 5 (age) × 6 (levels) × 2 (actor 
type) analysis of variance yielded a significant 
main effect for age, F(4, 230) = 5.03, 
p < .01; stimulus level, F(5, 1150) = 184.51, 
p < .001; and the interaction of these two 
factors, F(20, 1150) = 3.25, p < .01. Actor 
condition had no effect in this analysis. 
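
The degrees of freedom in the analysis above follow from the design, as the sketch below shows. It assumes a standard split-plot layout (age and actor condition as between-subjects factors, stimulus level as the within-subjects factor) and the 240 subjects described under Method; the function name `mixed_design_df` is hypothetical.

```python
def mixed_design_df(n_subjects, n_age, n_actor, n_level):
    """Degrees of freedom for an age x actor (between) x level (within) design."""
    # Error term for between-subjects effects: subjects within groups.
    between_error = n_subjects - n_age * n_actor
    # Error term for within-subjects effects: level x subjects within groups.
    within_error = between_error * (n_level - 1)
    return {
        "age": (n_age - 1, between_error),
        "actor": (n_actor - 1, between_error),
        "age x actor": ((n_age - 1) * (n_actor - 1), between_error),
        "level": (n_level - 1, within_error),
        "age x level": ((n_age - 1) * (n_level - 1), within_error),
    }


for effect, (df_effect, df_error) in mixed_design_df(240, 5, 2, 6).items():
    print(f"{effect}: F({df_effect}, {df_error})")
```

With 240 subjects in 10 between-subjects cells, the between-subjects error term has 230 degrees of freedom and the within-subjects error term has 5 × 230 = 1150.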

Again, simple main effects analyses of vari- 
ance for subjects’ causal attributions were 
performed to investigate in more detail the 
relationship between subjects’ age and the 
stimulus level factors. One predicted feature 
of this relationship was that subjects’ age 
would affect attributions at lower but not 
higher levels of the stimulus factor. Supporting 
this prediction, a simple main effect for 
age was found at Level 2 and Level 3 but 
not at Levels 4, 5, and 6. 

It was predicted that a second character- 
istic of subjects’ causal attributions would be 
a different effect of the stimulus level at each 
age. The simple main effects analysis showed 
that within each age group the stimulus level 
had a significant effect. Further analysis of 
the effect was made by comparing the means 
of different stimulus levels within each age 
group. The only significant differences between 
adjacent means in the youngest group 
involved Levels 1 and 2 as well as Levels 
3 and 4. The latter, in addition to Levels 4 
and 5, were differentiated by the 8-year-old 
group. Both 10- and 12-year-olds made the 
same two distinctions, although they also 
differentiated Levels 1 and 2. For the adult 
group, significant differences were obtained 
for Levels 1 and 2 and Levels 4 and 5. In 
sum, distinctions were mainly made between 
Level 1-2, 3-4, and 4-5 pairs, while no 
group distinguished Levels 2-3 and 5-6. 

Figure 4. Mean causal attribution as a function of stimulus level for self as actor. 


Developmental Stages in Attribution 
of Responsibility 


A Guttman scalogram analysis was per- 
formed on the complete set of AR data 
gathered in the study. Although an Actor 
Condition × Age interaction was found, two 
separate analyses were not conducted for two 
reasons. First, the interaction obtained did 
not affect the order relationship between 
blame attributions and the various stimulus 
levels. Second, deterministic models for categorical 
data scaling tend to capitalize on 
chance variation, making the use of samples 
as small as 120 subjects a “dangerous extravagance” 
(Torgerson, 1958, p. 324). In 
addition, only Heider’s original five levels 
were included in the analysis, since no age 
group distinguished Level 5 and Level 6, and 
only one subject attributed blame in the 
latter case but not the former. 

An initial scalogram analysis showed that 
Heider’s levels displayed the expected order- 
ing but also revealed that the distribution of 
responses at Level 5 was skewed beyond an 
acceptable level. Because this is one of the 
auxiliary criteria that can lead to a spuri- 
ously high coefficient of reproducibility, this 
statistic was recalculated on the basis of 
Levels 1 to 4 data only (Level 5 was, how- 
ever, still retained in order to distinguish 
the 13.3% of scale types who attributed 
blame at this level but not at the previous 
stimulus levels). The coefficients of both 
reproducibility (.92) and scalability (.73) 
exceeded the criteria commonly used in evaluating 
Guttman scales. Inspection of the 
correlation coefficients between items suggested 
a unidimensional latent structure (cf. 
Torgerson, 1958, chap. 13), since adjacent 
items correlated positively and more highly 
than nonadjacent items. 

Table 1 
Percentage of Subjects Corresponding to Pure 
Scale Types in Each Age Group for 
Perceived Blame 

Scale            Age (years) 
type     6ᵃ     8ᵇ     10ᶜ    12ᵈ    Adultᵉ 
5        [...]  [...]  [...]  8      3 
4        [...]  [...]  [...]  [...]  [...] 
3        [...]  [...]  [...]  [...]  [...] 
2        24     25     38     42     47 
1        [...]  [...]  [...]  [...]  [...] 
0        3      3      0      0      0 

ᵃ n = 38. ᵇ n = 40. ᶜ n = 40. ᵈ n = 36. ᵉ n = 38. 
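
The reproducibility and scalability coefficients reported above can be illustrated with a minimal sketch. The source does not state its formulas, so the sketch assumes one common convention: errors are counted as deviations from the ideal pattern implied by each subject's total score, and scalability is the improvement of reproducibility over minimum marginal reproducibility, divided by the largest possible improvement. The function name `guttman_coefficients` and the toy data are hypothetical; items are assumed to be dichotomized (1 = blame attributed) and ordered from Level 1 to the highest level analyzed.

```python
def guttman_coefficients(responses):
    """Return (reproducibility, minimum marginal reproducibility, scalability).

    responses: list of rows of 0/1 values, columns ordered so that a
    perfect scale type is a run of 0s followed by a run of 1s.
    """
    n_subjects = len(responses)
    n_items = len(responses[0])
    total = n_subjects * n_items

    # Count deviations from the ideal pattern implied by each total score.
    errors = 0
    for row in responses:
        score = sum(row)
        ideal = [0] * (n_items - score) + [1] * score
        errors += sum(a != b for a, b in zip(row, ideal))
    cr = 1 - errors / total

    # Reproducibility achievable from the item marginals alone.
    item_totals = [sum(row[j] for row in responses) for j in range(n_items)]
    mmr = sum(max(t, n_subjects - t) for t in item_totals) / total

    cs = (cr - mmr) / (1 - mmr) if mmr < 1 else float("nan")
    return cr, mmr, cs


# Toy data: four perfect scale types and one deviant pattern.
toy = [[0, 0, 0, 1], [0, 0, 1, 1], [0, 1, 1, 1], [1, 1, 1, 1], [1, 0, 0, 1]]
cr, mmr, cs = guttman_coefficients(toy)
print(f"CR = {cr:.2f}, MMR = {mmr:.2f}, CS = {cs:.2f}")
```

Because highly skewed item marginals raise the minimum marginal reproducibility, a high reproducibility coefficient alone can be misleading; the scalability coefficient corrects for this.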

Further analyses were conducted to deter- 
mine whether the response patterns were age 
related. The scale scores, based on all five 
Heiderian levels, differed in the expected 
manner; they were inversely related to age, 
F(4, 235) = 6.32, p < .01. However, multiple-means 
comparisons showed that the 
four oldest age groups did not significantly 
differ from each other, although they all 
differed from the youngest group. Table 1 shows 
the percentage of responses conforming to perfect scale 
types. From the data displayed in this table 
it can be seen that the age effect may possibly [...] 
actor manipulation, and hence scale scores 
for the other-as-[...] 
F(4, 115) = 1.58, p > .10. Moreover, 
inspection of the age-by-scale-type distribution 
for this [...] 
discriminable age-related [...] 
(a) some younger children were undifferentiated 
in their AR, whereas every adult made 
some distinction between the levels; and (b) 
a greater proportion of adults than of any 
other age group was found at Scale Types 2 
and 3. 

A similar scalogram analysis was carried 
out on the data for perceived causality. No 
evidence was found to support the existence 
of a Guttman scale (coefficient of reproduci- 
bility < .90; coefficient of scalability < .60). 


This result is perhaps not surprising, since 
81% of the subjects attributed causality by 
Level 2, resulting in a highly skewed spread 
of item marginals. It should be noted that 
the use of a single cutoff point for building 
these scales is a particularly stringent test 
of Heider’s theory, and if multiple criteria 
had been adopted, a scalogram might also 
have been constructed for the causality 
measure. 


Multidimensional Unfolding 


To determine whether the self/other dif- 
ference found for blame but not for cause 
reflected an underlying dissimilarity in the 
perception of the stimulus items, a multidimensional 
unfolding analysis was attempted. 
Degenerate solutions were repeatedly obtained 
owing to a lack of constraint in the 
data. To increase response variability, adult 
and child data were combined, but again, un- 
stable solutions emerged. Inspection of the 
raw data showed that there was much con- 
sensus in the response patterns across the 
various ages (e.g., no subject showed a sys- 
tematic decrease in AR with increasing level), 
and hence this analysis was abandoned. 


Attribution to Intentional Versus 
Unintentional Behavior 


In terms of Piaget’s (1932) theory, the 
major moral development in middle childhood 
involves a shift from outcome-based to 
[...] measures and within each age group. Significant 
differences were found on both variables 
for all the groups (p < .05), indicating 
an emergence of this change earlier than is 
traditionally expected. 


Discussion 


In general the results provided strong support 
for the experimental hypotheses. For 
both moral and causal attributions a strong 
interaction between subjects’ age and the 
characteristics of the stimulus event was 
found. As predicted, the data reflected significant 
age differences in mean scores at 
lower levels of the stimulus factor but not 
at higher levels (for the moral measure the 
convergence was at Level 5; for perceived 
causality, it was at Level 4). 
The form of the interaction between age 
and stimulus level was, however, only partly 
consistent with a developmental interpreta- 
tion of Heider’s (1958) theory. Although 
increasing age was associated with more sig- 
nificant distinctions between adjacent stimu- 
lus levels, this result requires some qualifi- 
cation. Almost without exception all age 
groups attributed increasing blame as actor 
behavior became more internally directed. 
< Hence even the youngest children appeared 
on the average to show distinctions in the 
hypothesized direction between Heider’s cri- 
teria for AR. With development these dis-
tinctions became more marked. Sedlak’s
(1979) finding with respect to the
age-related importance of Heider’s criteria 
in cognitive representations directly supports 
the present results. 
In addition to the above, it was found 
that even the 8-year-old group distinguished 
between intentionality and justifiability, the 
second significant differentiation to emerge
with age. This contrasts with Shaw and Sul-
zer’s (1964) study, where only adults made
a slight distinction between these two cri- 
teria, leading to the suggestion that they 
may only be fully appreciated by “mature” 
senior citizens. The early emergence of this 
sophisticated attribution process is perhaps 
not surprising for three reasons. First, the 
justifiability level was operationalized in 
terms of a familiar, commonly occurring sit-
uation for children (self-defense), whereas
parents often model AR judgments on justi- 
fication. Second, outcome valence and in- 
tensity were held constant within the con- 
text of a single story presentation. Third, 
development of AR may be fairly rapid in 
such familiar interpersonal situations with 
undesirable outcomes; casual observation 
suggests that explanations are more often 
sought after something bad has happened. 
Indeed there is recent, though admittedly 
limited, support to show that even kinder- 
garten children can take mitigating circum- 
stances into account when evaluating an 
aggressor (cf. Karniol, 1978, p. 84), whereas 
Darley, Klosson, and Zanna (1978) found
no age differences between 6-, 9-, and 24- 
year-olds regarding the use of contextual in- 
formation in evaluating intentional behavior. 
In any event, significant distinctions be- 
tween these two criteria did not emerge in 
a sequence consistent with a developmental 
interpretation of Heider’s (1958) levels. 
Moreover, not even adults made a signifi- 
cant distinction between the theoretically 
easiest level pair (Levels 1 and 2), a result 
also found by Harris (1977). It would thus 
seem that the difference between actor- 
caused accidental events (Level 2) and acci- 
dents not produced by actors (Level 1) may 
not be as important in AR as other Heiderian 
criteria. 

Finally, a scalogram analysis based not 
on the intensity of responsibility attributed 
but rather on its absolute ascription showed 
that Heider’s levels formed a Guttman scale. 
Within the age range tested, only limited 
and rather ambiguous support was found 
for age-related response patterns. Under con- 
ditions that did not prejudice the youngest 
group’s attributions, no age differences were 
found in scale scores. However, inspection of 
the data conforming to scale types did show 
that undifferentiated responses were exclusive 
to the child sample and were especially evi- 
dent in younger children. In addition, adults 
tended to dominate in the groups that based 
AR on intentional but justifiable acts and on 
foreseeable actor-produced accidents. Since 
these results reflect individual response pat-
terns rather than between-subjects mean
scores, they may prove to be a more appro- 
priate test of the developmental support for 
Heider’s theory. In the absence of previous 
research utilizing such an approach, this re- 
sult requires replication before the develop- 
mental status of Heiderian criteria for at- 
tributing responsibility can be fully evalu- 
ated. 

It should be noted that although Heider’s 
attribution criteria were strongly confirmed, 
no evidence was obtained in this study to 
support the additional level hypothesized. 
Nonetheless, abandoning this criterion is felt 
to be premature for two reasons. First, it 
seems to be an extension logically implied 
by Heider’s model. Second, it may not have 
been appropriately tested in the present cir-
cumstances. Because of a medium-intensity
negative outcome, a maximal environmental 
force against the action may already have 
been broken at the intentionality level lead- 
ing to the ceiling effect observed. This could 
be the case for all negative-outcome events, 
and hence the distinction may only emerge 
under neutral- or positive-outcome condi-
tions.

To test this hypothesis the same within- 
subjects Latin square design was used to ad- 
minister stories representing both positive 
and negative outcomes at Levels 4, 5, and 
6 to 45 adults. Although no substantial evi-
dence was obtained to support Level 6 (dif-
ferences obtained were, however, in the pre-
dicted direction), the interaction between
level and outcome was mainly owing to the 
extreme evaluations of negative outcomes at 
Levels 5 and 6. In fact subjects estimated
that very few persons would have acted as 
the story character did at these levels (8% 
and 9%, respectively), supporting the idea 
of a ceiling effect for internal causation. In 
contrast, it was held that the behavior of 
large percentages of people would be similar 
to that of the story character for Positive 


Level 6 = 42%). 


Level 6 was not adequately operationalized
to produce maximal internal causality in this
condition. It appears that while an extension
of Heider’s scheme for negative outcomes 
may be questionable, its utility for positive
events remains to be adequately evaluated. 
A third aim of the present study was to 
investigate attribution differences relating to 
the other and the self as actor. As predicted, 
this variable interacted with age, but the 
direction of the difference was contrary to 
that expected. Inspection of the raw data
showed that almost half of the children at- 
tributed maximum blame to themselves re- 
gardless of level, a phenomenon not found in 
the other-as-actor condition. In view of this 
anomaly, the study was repeated on an in- 
dependent sample of 40 6-year-olds. Since 
it was suspected that the previous results 
might reflect a conservative simplifying 
strategy adopted in the presence of any un- 
certainty regarding the stimulus story con- 
ditions, children were individually tested, 
while finger puppets were used to further 
clarify the story presentation. Although the 
general pattern of increasing AR with level 
was found, no self/other differences emerged.
This result is consistent with Keasey’s 
(1977) finding that children in kindergarten 
but not in first grade show greater use of 
intention information in relation to the self 
than in relation to a hypothetical other. It
thus appears that the difference initially
found is an artifact that, as mentioned, may
merely indicate a simplifying strategy. 
illustrated. While adult distinctions regard-
ing blame were achieved by 10 years of age,
no child group showed the adult response
pattern for causality attribution. Children
tended to perceive differences in causality 
where adults did not, suggesting that causal- 
ity and responsibility were not as clearly dis-
tinguished. Furthermore, increasing age was
associated with the emergence of a clear 
pattern in the use of the two attribution 
measures. Whereas younger children some-
times ascribed greater blame than causality,
adults rarely did so (6 years = 28% of total 
responses, 8 years = 24%, 10 years = 20%, 
12 years = 17%, and adults = 10%). Any 
response order effect due to regression would 
not alter the above result, which is con- 
sistent with the age-related changes in the 
data curves for both dependent measures 
and suggests a similar increasing cognitive 
differentiation between perceived blame and 
causality with age. 

The results of this study also support an 
emerging belief that even young children 
differentiate intentional from unintentional 
behavior and are capable of more subtle dis- 
tinctions in their attributions than are tra- 
ditionally ascribed to them. That children 
as young as 6 years clearly distinguished
accidental and intentional outcomes is con-
sistent with recent research (e.g., Farnill,
1974) showing that children do use inten- 
tionality information when tested under ap- 
propriate conditions that do not overburden 
their cognitive processing capacities. 

In sum, the present study provides some 
support for a developmental interpretation of 
Heider’s (1958) levels of AR. When quanti- 
tative group scores were considered, all ages 
made theoretically consistent distinctions be-
tween the levels, although a greater number
of significant differences was associated with
increasing age. Similarly, certain response
patterns dominated at particular ages, but 
no evidence was found for developmental 
stages. Insofar as developmental differences 
do exist, a potentially fruitful question for 
future research to answer is whether they 
represent age-related variations in evalua- 
tive criteria or social perception. 
Multiple Discovery and Invention: Zeitgeist, Genius, or Chance?


Dean Keith Simonton 


University of California, Davis 


The occurrence of independent contributions by two or more scientists can be 
interpreted in terms of zeitgeist, genius, or chance. The relative adequacy of 
these three theories was examined by hypothesizing four critical empirical tests. 
These tests focus on (a) the general and intradisciplinary probability distribu- 
tion of multiples and (b) the relationship of individual eminence with multiple 
production and priority. An analysis of 579 multiples and of 789 scientists and 
inventors lends the most support to the chance theory, followed by the zeitgeist 
theory. The results are integrated into a single probabilistic perspective that 
incorporates some of the major features of all three theories. 


What mathematician does not know that 
the calculus was independently devised by 
both Newton and Leibniz? Who among bi- 
ologists has not heard that Darwin’s labori- 
ous documentation of evolutionary theory 
was almost forestalled by an independent 
abstract by Wallace? And what psychologist 
has not observed to his or her students that 
the James-Lange theory of emotion was ac-
tually advanced independently by James and
Lange? Certainly the occurrence of multiple 
discoveries constitutes one of the most fas- 
cinating phenomena in the history of science 
and technology. Such multiples become par- 
ticularly dramatic when they provoke fierce 
priority disputes such as the debates that 
surrounded the independent prediction of the 
planet Neptune by Adams and Leverrier. 
But why do multiple discoveries and inven- 
tions happen at all? 

The traditional explanation of multiples 
is founded on what can be called the zeitgeist 
theory of creativity. According to this social 
deterministic view, the individual creator is 
largely irrelevant or epiphenomenal to the 
cultural progress represented by the inevi- 
table accumulation of scientific knowledge 
and technological expertise. Rather, it is the
sociocultural system as a whole, embodied as 
the spirit of the times, which is ultimately 
responsible for any given technoscientific ad- 
vance. Thus, if neither Adams nor Leverrier 
had predicted the existence of Neptune, 
someone else would certainly have done so. 
The failure of gravitational astronomy to 
explain the anomalies in the orbit of Uranus 
became so obvious and irksome that Nep- 
tune’s forecast was in the air. Because the 
times were ripe for the prediction of the 
eighth planet, it was really immaterial who 
actually picked the fruit; any competent 
astronomer might have done the same, had 
Adams and Leverrier both died in the crib. 
Not surprisingly, the zeitgeist account of 
multiples is frequently defended by those 
anthropologists and sociologists who desire 
to minimize the role of the individual as a 
causal agent of sociocultural events. For ex- 
ample, in an article entitled “Are Inventions 
Inevitable?” Ogburn and Thomas (1922) 
list nearly 150 multiples to prove that any 
given discovery or invention has a probabil- 
ity near unity of appearing, the presence or 
absence of any specific genius notwithstand- 
ing. And Merton (1961) takes this interpre- 
tation so much as given that he once as- 
serted, “It is the singletons—discoveries 
made only once in the history of science— 
that are the residual cases, requiring spe- 
cial explanation” (p. 477). But psycholo-
gists are not always ill-disposed to the zeit-
geist interpretation of multiples, as we can
readily witness in Boring’s (1963) citation of
multiples as evidence that the “genius” is
merely an agent of the zeitgeist, an effect or
a symptom rather than the cause of the
times.

Copyright 1979 by the American Psychological Association, Inc. 0022-3514/79/3709-1603$00.75

Still, there is no doubt that not all psy- 
chologists would completely endorse the zeit- 
geist position. The assumption implicit in the 
research on creativity in the 1950s and 1960s 
was that substantial and sustained individual 
differences in creativity determine whether 
any given person can make a contribution to 
society (e.g., Taylor & Barron, 1963). In 
short, many psychologists would feel much 
more comfortable with what can be styled 
the genius theory of creativity. According to 
this perspective, scientific discoveries and 
technological inventions are produced by 
great scientists and inventors who possess 
abilities, personalities, and backgrounds that 
set them apart from their colleagues. On 
first blush, it would seem that the genius 
theory could not adequately explain the ap- 
pearance of independent contributions, but 
this supposed inadequacy is only partly true. 
In fact, genius theory can explain one aspect
of multiples that must prove an embarrass-
ment to zeitgeist theory: the frequent oc-
currence of rediscoveries. Not all multiples
are produced simultaneously. The various
independent contributions are sometimes
separated by long time spans during which
the first invention or discovery passes into
temporary oblivion. Merton (1961) has re-
ported that even though 20% of the mul-
tiples he studied occurred within a 1-year
span, some 34% entailed a time lapse of 10
or more years. Classic instances of rediscover-
ies include Mendel’s formulation of the laws
of heredity, some of Cavendish’s unpublished
electrical discoveries, much of Gauss’s un-
published mathematical work, and the pro-
digious notebooks of one of the greatest an-
ticipators of all, Leonardo da Vinci. How
can someone like Mendel come to be 35 years 
ahead of his time? Genius theory has less 
difficulty on this point: A major genius is 
defined as one who transcends the limitations 
of the zeitgeist and one who may even antici- 
pate movements far into the future. Long 
delayed rediscoveries can be adopted, there-
fore, as evidence for genius theory. For ex-
ample, even if Stevinus merely rediscovered
hydrostatic principles already known to Ar- 
chimedes, the fact would remain that it took 
almost 2,000 years for someone to be born 
who could replicate Archimedes’ achieve-
ment.

Besides the genius and zeitgeist theories, 
a third position is a credible contender for 
understanding multiples: chance theory. 
Since most behavioral scientists seem to go 
out of their way to disprove the null hy- 
pothesis predicated on the random model 
(Greenwald, 1975), it is perhaps unsurpris- 
ing that this theory has not always received 
the study it deserves. Nonetheless, chance
or “luck” definitely plays some part in crea- 
tivity, as the phenomenon of serendipity well 
illustrates (Cannon, 1940). In addition, 
probabilistic or stochastic models have been 
so successfully applied to other facets of 
creativity that there is no reason not to 
approach multiples in the same manner. To 
illustrate, probability models have explained 
individual differences in creative productivity 
(Price, 1976; Simon, 1955), the correlation 
between productivity and eminence (Dennis, 
1954), and the correlation between produc- 
tivity and age (Dennis, 1966; Simonton, 
1977a). More specifically, there have been 
several investigators who have proposed 
some kind of probability model for multiples 
(e.g., Price, 1963, chap. 3; Schmookler, 1966, 
chap. 10; Simonton, 1978). As an ex 
many adherents of 
nating any research that might head in the
same direction. 

Each of the foregoing three theories can 
be offered as partial interpretations of mul- 
tiples in science and technology. But which 
theory best explains the data? In this article 
I propose to examine each of the three prin- 
cipal theories so as to derive a set of dis- 
tinctive empirical predictions that can serve 
as critical tests. In deriving these distin- 
guishing repercussions of the three theories, 
I will treat each position as a pure or ideal 
type. In other words, I will try to make 
each theory as distinctive as possible so that 
the contrasting empirical implications can 
best be highlighted. Only at the close of this 
article, after the results of the tests are 
known, will I attempt to synthesize the three 
theories into a single view. With this quali- 
fication in mind, we can advance the fol- 
lowing four tests. 


Test 1 


If the zeitgeist theory holds, then mul- 
tiples should be very common relative to 
singletons, and the distribution of the vari-
ous grades of multiples (i.e., doublets, trip-
lets, quadruplets, quintuplets, etc.) should
be approximated by a binomial distribution
with a mean appreciably greater than 1. If
the genius theory holds, then multiples 
should be virtually nonexistent, yielding a 
degenerate binomial with a mean of 1 and a 
variance of 0. If the chance theory holds,
then multiples should be rare relative to 
singletons, and the distribution of the vari- 
ous grades of multiples should be approxi- 
mated by a Poisson distribution with a mean 
less than 1. 

Let us suppose that for any given inven-
tion or discovery there are n individuals
capable of producing it, and each has a
probability p of success, where n ≥ 1 and
0 ≤ p ≤ 1. Then, as we know from any
introductory statistics text (e.g., Snedecor &
Cochran, 1967, chap. 8), we can employ the
binomial distribution to obtain the propor-
tion of the various numbers of successes.
Now suppose we apply this simple proba-
bility model to the three theories. If the zeit-
geist theory is valid, then the number of
creators should be very large and the proba-
bility of success somewhat small, so that
genius per se is irrelevant. Assume for the
sake of argument that p = .5 and n = 10,
that is, that for any potential contribution
there are 10 people capable of making it,
although each enjoys only a 50-50 chance
of success. Then we should obtain a bi-
nomial distribution with a mean of np = 5
and with the shape shown in Figure 1. Under
this distribution almost 99% of all contri-
butions to science and technology will involve
two or more independent creators, and the
modal multiple will be of Grade 5 (i.e., quin-
tuplets). Hence, inventions and discoveries
must be virtually inevitable.

Figure 1. Predicted probability distributions ob-
tained by interpreting the three alternative theories
in terms of the binomial distribution with different
values for n and p. (Panels: Zeitgeist, p = .5, n = 10;
Genius, p = 1.0, n = 1. Vertical axis: proportion;
horizontal axis: number of successes, i.e., grade of
multiple.)

If the genius theory is correct, there should 
be only a small number of people capable of 
making each advance, yet each individual
should have nearly a 100% chance of suc-
cess. For argument’s sake let us posit the
parameters of n=1 and p= 1.0, yielding 
the degenerate binomial distribution in Fig- 
ure 1. Notice that the mean and mode are 
now 1, with no multiples whatsoever occur- 
ring. Hence the genius theory does imply 
that multiples should be quite rare. 
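
The proportions quoted for the zeitgeist and genius cases can be checked directly from the binomial formula. The following is a minimal sketch in Python, not part of the original analysis; the function names are illustrative, and the parameter values are simply those posited above for the sake of argument.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes among n potential creators."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def grade_distribution(n: int, p: float) -> list[float]:
    """Proportions for 0, 1, ..., n independent successes."""
    return [binomial_pmf(k, n, p) for k in range(n + 1)]


def share_of_multiples(dist: list[float]) -> float:
    """Proportion of cases with two or more independent successes."""
    return sum(dist[2:])


# Parameter values taken from the illustrative cases in the text.
zeitgeist = grade_distribution(n=10, p=0.5)
genius = grade_distribution(n=1, p=1.0)

print("zeitgeist share of multiples:", round(share_of_multiples(zeitgeist), 4))
print("zeitgeist modal grade:", zeitgeist.index(max(zeitgeist)))
print("genius distribution (0 and 1 successes):", genius)
```

Under these assumptions the zeitgeist case puts nearly all of its mass on two or more successes, whereas the genius case places all of it on a single success.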

Yet another picture emerges when we 
operationalize the binomial parameters to be 
consistent with the chance theory. We have 
the case where n is quite large but p is quite 
small, so that “many are called but few are 
chosen.” For illustrative purposes, let n = 
100 and p = .01, again rendering the mean,
np, unity. The binomial expansion produces
the distribution shown in Figure 1. Observe
that multiples are much more rare under
this model: Only about 26% of all attempted
contributions will involve multiples, whereas
37% of all attempts will produce singletons.
Even more critically, according to this model,
about 37% of all attempts will result in
total failures. Thus, quite unlike the zeit-
geist theory, the chance theory implies that
a large percentage of potential contributions
will never be made and thereby inserts much
indeterminacy into sociocultural history (cf.
Simonton, 1978). Furthermore, if the chance
theory is true, then we may go one step
further and assert that the frequency dis-
tribution of multiples of various grades will
be closely approximated by a Poisson distri-
bution. The latter is the exponential limit of
the binomial distribution when n is very
large and p is very small (Haight, 1967, p.
15). For example, the correlation between
proportions derived from a Poisson distribu-
tion with a mean of 1 and those from a binomial
distribution with n = 100 and p = .01 de-
parts from unity only after the fourth deci-
mal place. Also, since the mean and variance
are equal in the Poisson case, the Poisson
distribution is easier to estimate from lim-
ited data, a valuable asset, given that we do
not know how many total failures there have
been in the history of science and technology
(i.e., how many “nulltons”). For this first
test the Poisson distribution will therefore
be adopted as the hypothesis to be disproven.
Only if we are empirically compelled to
abandon the Poisson model with a mean less
than 1 will we feel obliged to examine the
two alternative theories.¹
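
The closeness of the binomial and Poisson distributions in the chance case can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the procedure used in the study: the helper names are hypothetical, and the comparison is restricted to grades 0 through 10, a range chosen here only for convenience.

```python
from math import comb, exp, factorial, sqrt


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, mu):
    """Probability of exactly k events under a Poisson with mean mu."""
    return exp(-mu) * mu**k / factorial(k)


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


N, P, MU = 100, 0.01, 1.0   # chance-theory parameters from the text
grades = range(0, 11)       # assumption: compare grades 0 through 10

binom = [binomial_pmf(k, N, P) for k in grades]
poisson = [poisson_pmf(k, MU) for k in grades]

print("failures:", round(binom[0], 2))
print("singletons:", round(binom[1], 2))
print("multiples:", round(1 - binom[0] - binom[1], 2))
print("binomial-Poisson correlation:", pearson_r(binom, poisson))
```

Because the Poisson form depends on a single parameter, it can be specified from the mean alone, which is the practical advantage noted above.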


Test 2 


According to the zeitgeist theory, the prob-
ability of multiples occurring should be high-
est in the mathematical and physical sciences
and lowest in the biological sciences. If the
genius theory is correct, the reverse should
be true. But by the chance theory, the prob-
ability that multiples will occur should be
constant across scientific disciplines.

Some scientific endeavors can be said to
engage more zeitgeist than others. As Kuhn
(1970) pointed out, the physical sciences
often have well-developed paradigms that
logically generate the puzzle-solving research
of “normal science.” Thus the problems of
classical physics were largely defined in terms
of the repercussions of Newtonian mechanics
and Maxwellian electromagnetic theory.
Other, “softer” sciences do not feature such
clean theoretical foundations, and conse-
quently the zeitgeist is more diffuse. Hence
we would expect the more codified and
mathematically constructed disciplines to
generate a disproportionate number of mul-
tiples. The prediction of genius theory goes
almost in the opposite direction. Genius the-
ory maintains that the major figures of hu-
man history tend to transcend the spirit of
their times. Indeed, a study of more than
2,000 thinkers in Western civilization has
shown that there is a negative relationship be-
tween the fame of a thinker and the ex-
tent to which he or she fits the prevailing
beliefs of the time (Simonton, 1976d). Ac-
cordingly, those disciplines that attract the
most geniuses should have a smaller propor-
tion of multiples. But how do we know which
scientific enterprises attract the greatest
minds? Roe (1952) found in her study of
64 eminent scientists that physicists tend to
score higher on standard intelligence tests
than do biologists. Similarly, Harmon’s
(1961) survey of doctorates discovered that
physicists and mathematicians had higher
intelligence quotients than biologists, with
chemists and engineers falling in between.
Therefore, if we are willing to grant two
crucial assumptions, we would predict in ac-
cordance with Test 2. The first assumption
is that the distribution of intelligence in con-
temporary science can be projected a few
centuries into the past. Partial support for
this extrapolation can be gleaned from Cox’s
(1926) IQ estimates for select scientists born
as far back as the fifteenth century: Math-


tists 178, medical researchers 172,
72, biologists 171, and astronomers 170. The
second assumption is that psychometric IQ
has something to do with cultural eminence.
While classic studies of contemporary pop-
ulations by Terman and Oden (1947) and of
historical populations by Cox (1926) have
attempted to demonstrate this relationship,
the evidence is far from secure. For instance,
the correlation of .26 that Cox found between
eminence and IQ has been shown to be a
methodological artifact (Simonton, 1976a).
Nevertheless, without this assumption we
cannot derive a unique prediction for the
genius theory. Finally, turning to the chance
theory, we see that the random model is the
simplest, for it merely affirms that the prob-
ability of the coincidences labeled as multi-
ples is uniform across all disciplines. A recent
reanalysis of data published by Ogburn and
Thomas (1922) seems to endorse this latter
prediction (Simonton, 1978). Nonetheless, the
sample size may have been too small to per-
mit rejection of the null hypothesis.

¹ Actually, we coul
able Probability distrib


Test 3


The zeitgeist theory implies that there 
should be a positive relationship between the 
eminence of a given scientist or inventor and 
the number of multiple discoveries or inven- 
tions he or she participates in. In contrast, the 
genius theory implies that the relationship 
between eminence and multiples participation 
should be negative. The chance theory pre- 
dicts that this correlation should be zero once 
variations in productivity are controlled. 

Boring (1963) believed that certain his- 
torical figures are singled out as “eponyms” 
because they epitomize the times in which 
they live. If multiples are also interpreted as 
zeitgeist-inspired events, then it follows that 
the eponyms of history should engage in more 
multiples than those outcasts from the main-
stream of history. Thus Merton (1961) has
maintained that persons of “great scientific
genius will have been repeatedly involved in 
multiples” (p. 484). The genius theory would 
advocate the exact opposite. To employ 
Kuhn’s (1970) terminology again, the lesser 
figures may be wrapped up in the puzzle- 
solving activities dictated by the predominant 
paradigms of normal science, but the principal 
figures are generating influential scientific 
revolutions that will change the zeitgeist. On 
the other hand, chance theory takes a stance 
somewhere between these extremes. As Mer- 
ton (1961) noted, the more eminent scientists 
produce more multiples in part “because the 
genius will have made many scientific dis- 
coveries altogether” (p. 484). So just by 
chance the more productive scientist will 
create more multiples. Yet creative produc-
tivity has been repeatedly shown to be cor-
related with eminence (e.g., Dennis, 1954;
Simonton, 1976b). Hence, there may very
well be a positive correlation between emi-
nence and participation in multiples, but that
correlation may be the spurious result of a 
common cause, creative productivity. There- 
fore, if we partial out the source of spurious- 
ness, the relationship should vanish, at least 
according to chance theory. 
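
The logic of partialing out a common cause can be sketched with the standard first-order partial correlation formula. The example below is illustrative only: the three correlations are hypothetical values chosen to show a spurious relationship vanishing, not estimates from any data set, and the function name is an assumption.

```python
from math import sqrt


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Hypothetical values (not from the study):
# x = eminence, y = participation in multiples, z = creative productivity.
r_eminence_multiples = 0.30
r_eminence_productivity = 0.50
r_multiples_productivity = 0.60

residual = partial_correlation(
    r_eminence_multiples,
    r_eminence_productivity,
    r_multiples_productivity,
)
print("eminence-multiples correlation, productivity controlled:", residual)
```

With these made-up inputs the zero-order correlation of .30 is entirely accounted for by productivity, which is the pattern chance theory predicts; a partial correlation that stays clearly positive or negative would instead favor the zeitgeist or genius account, respectively.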
Test 4


If the genius theory is justified, there 
should be a positive relationship between in- 
dividual eminence and temporal priority in a 
multiple. If either the zeitgeist or the chance 
theory holds, there should be no relationship, 
positive or negative.

As already noted, the genius not only is 
independent of the zeitgeist but may also be 
a precursor of the future zeitgeist. Hence, the 
more eminent scientist produces inventions or 
discoveries far in advance of his or her time, 
contributions that lesser minds may take 
decades if not centuries to rediscover. Un- 
fortunately for this prediction, past research 
is not in complete accord with the precursive 
conception of genius. The study of more than
2,000 thinkers mentioned earlier found that 
the more eminent thinkers were actually be- 
hind their times, as if they were more inter- 
ested in consolidating the past than in in- 
timating the future (Simonton, 1976d). If 
this result applies to science and technology 
too, then the major figures may be those who 
rediscover the work of lesser known predeces- 
sors. In any case, we would not expect any 
consistent relation between priority and emi- 
nence, should the zeitgeist or chance theories 
prove valid. 

Quite clearly the above four tests do not 
all have the same secure foundation. The con-
ceptual basis for Tests 1 and 3 is perhaps
more firm than for Tests 2 and 4. Even so,
something is definitely gained by stipulating 
both strong and weak criteria for the three
theories. Although the three approaches do
not always imply three separate predictions
for each test, each of the theories is uniquely 
characterized when all four tests are taken 
together. If both weak and strong tests point 
in the same direction, then the preferred 
theoretical explanation can be better identi- 
fied. In addition, the outcome of all four tests 
may provide the basis for a synthesis of the 
three theories into a single, more compre-
hensive view.


Empirical Tests


How Probable Are Multiples?


Clearly our first task is to obtain an exten- 
sive list of multiples. To do this I began with 
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the previously published lists of Ogburn and 
Thomas (1922), Kroeber (1963), and Ster 
(1927, chap. 3).* This preliminary list wa 
then augmented by consulting a large number 
of additional sources, especially Darmstaedter 
(1908), Taton (1964, 1965, 1966), Asimov 
(1972), and Williams (1974). I also at 
tempted to verify all purported multiples t 
make sure they actually involved independen 
efforts of two or more researchers (using, e.g. 
Debus, 1968; Encyclopedia Britannica, 
1974; Gillispie, 1970; Singer, Holmyard, 
Hall, & Williams, 1958). A critical theoretical 
issue arose at this verification stage: The case 
for the occurrence of multiples is not as 
strong as one is led to believe by zeitgeist 
enthusiasts. One problem is that many puta- 
tive instances rest on vague generic terms tha 

may encompass far too much (Schmookler 
1966, chap. 10). For instance, Ogburn andj 
Thomas (1922) credit both Boerhaave and 
Hales with the independent contribution 
the “beginnings of modern organic chemistry” 
(p. 98), an extremely general attribution, t 

be sure. Another difficulty is that even more 
specific terms, such as the “microphone,” may 
Cover inventions or discoveries that are ofte 
comparable only regarding function, each con 
tribution employing quite different means t 
attain roughly the same end. Indeed, it may. 
even be argued that true multiples are vir- 
tually nonexistent: The calculus of Newton 
was not the same as that of Leibniz, nor was 
Priestley’s oxygen generated the same way as 
Scheele’s. Even the eighth planet differed from 
Adams to Leverrier, since each predicted dif- 
ferent orbits for Neptune (both of which just 
happened to coincide fairly well in 1846). 
Yet even granting some abstraction in cat-
egorization, many multiples still prove to be
mirages upon closer inspection. Often what is
described as independent is not inde-
pendent at all. Thus Ogburn and Thomas
(1922) should not list both Henry and Morse
as inventors of the telegraph, for it was
Henry’s freely given advice that had allowed
a painter to succeed. Similarly, Morton
learned of the anesthetic properties of ether
from Jackson, and Fulton well knew Jouf-
froy’s work on the steamboat and actually
attended the launch of Symington’s steam-
boat (cf. Ogburn & Thomas, 1922). In short,
many multiples do not involve independent
efforts but rather consecutive developments
in the evolution of science and technology
(cf. Constant, 1978). A final problem is that
a multiple mentioned in one source might not
be mentioned in any other source and hence
would stand unconfirmed.

² I attempted to obtain Merton’s (1961) sample as
well, but his data are no longer available (Merton,
Note 1).

ZEITGEIST, GENIUS, OR CHANCE?

To deal with the foregoing problems, I 
chose to collect two separate samples of multi- 
ples (excluding cases prior to 1500) from the 
general fields of mathematics, astronomy, 
physics, chemistry, biology, medicine, and 
technology. The first, “exclusive” sample in-
cluded only those multiples mentioned in two
separate sources, without contradiction by any
other source. The second, “inclusive” sample
consisted of all multiples mentioned in at
least one source as long as no contradictory
evidence could be found in any other source.
The exclusive sample contained 199 multiples,
more than the 148 collected by Ogburn and
Thomas (1922) but fewer than the 264
gathered by Merton (1961). The more liberal
definition added another 380 multiples to
produce the 579 multiples of the inclusive
sample. All principal hypotheses were then
tested on both samples. Fortunately, the sub- 
stantive conclusions do not materially differ 
for the two samples, and I will therefore focus 
on the inclusive sample (since a larger sam- 
ple size permits parameter estimates with 
smaller standard errors). 

General probability (Test 1). One secure 
finding is that multiples are extremely rare
events. It may sound impressive to assert the
existence of 579 potential multiples, but the 
number of so-called singletons is immensely 
larger. To illustrate this point, we may note 
that Darmstaedter’s (1908) chronology, prob- 
ably the most comprehensive in any Western 
language, lists 12,300 contributions in the 
seven fields represented in our sample (Sorok-
in, 1957, Table 5). So at best multiples con-
stitute a mere 5% of all contributions. I say
“at best” because the Darmstaedter chronol-
ogy is evidently more selective than the un-
defined sample from which the multiples were
apparently derived; most of the purported
multiples were not sufficiently important to
be cited in that source. Consequently, the
zeitgeist theory seems disproven: As we can 
conclude from Figure 1, singletons are sup- 
posed to be far less common than multiples 
if the zeitgeist theory holds. In fact, the re- 
sults thus far appear to partly bolster the 
genius theory. 

An advocate of the zeitgeist position, how- 
ever, may claim that it is most unfair to com- 
pare singletons to multiples. Merton (1961) 
presented 10 reasons why he believes that 
multiples are considerably underestimated 
and that many singletons may actually be 
multiples incognito. On first glance, this ac-
cusation, if true, would seem to invalidate any 
empirical investigation and would thereby 
transport the whole issue to the realm of arm- 
chair social philosophy. But inquiry can be 
salvaged if we ignore the singletons altogether 
and instead concentrate on the probability 
distribution of multiples of various grades 
(Simonton, 1978). We have only to assume 
that the relative proportions of doublets, 
triplets, quadruplets, and so on remain con- 
stant after being sifted through the selection 
process. If there are twice as many doublets 
as triplets in the real world (i.e., the statis- 
tical population), there will presumably be 
twice as many doublets as triplets in any 
sample, once we allow for sampling error. 
Given this standard assumption of statistical 
sampling, we can then try to fit a truncated 
Poisson distribution using the two-moments 
method of estimating the mean, μ, and the
standard error, SE (Haight, 1967, p. 89; 
Patil, 1962). The result is displayed in Table 
1, where I give the observed frequency of 
multiples of Grades 2 through 8 along with 
the fitted values according to a Poisson model 
(using the tables in Molina, 1942). 
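
The fitted values in Table 1 can be approximated with a short calculation. The sketch below is not the tabular procedure based on Molina (1942); it assumes the rounded mean of .8 quoted in the text and further assumes that the fitted counts were scaled so that Grades 2 and above sum to the 579 observed multiples. The names `MU`, `OBSERVED_MULTIPLES`, and `predicted` are illustrative only.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """Probability of exactly k independent contributions under a Poisson model."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


MU = 0.8                   # assumed: the "mean of about .8" used in the text
OBSERVED_MULTIPLES = 579   # inclusive sample, Grades 2 through 8

# Share of the Poisson mass that falls at Grade 2 or higher.
p_multiple = 1.0 - poisson_pmf(0, MU) - poisson_pmf(1, MU)

# Implied number of potential contributions, including those never made
# (Grade 0) and those made only once (Grade 1).
n_potential = OBSERVED_MULTIPLES / p_multiple

# Predicted frequency for each grade from 0 through 8.
predicted = {k: n_potential * poisson_pmf(k, MU) for k in range(9)}

for grade, freq in predicted.items():
    print(f"grade {grade}: {freq:7.1f}")
```

Under these assumptions the rounded predictions for Grades 0, 2, 3, 4, and 5 agree with the tabled values, which suggests (but does not prove) that the table was built from the rounded mean.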

Table 1
Observed Frequencies of 579 Multiples and
Frequencies Predicted by a Poisson Model
With μ = .82

Multiple    Observed     Predicted
grade       frequency    frequency
0           —            1361
1           —            1088
2           449          435
3           104          116
4           18           23
5           7            4
6           0            0
7           0            0
8           1            0

Note. Poisson parameters were estimated using the
two-moments method. The standard error of the
estimate is .12.

The fit is fairly close. A chi-square test
gives χ²(3) = 6.07, p > .1, that any discrep-
ancies are due to mere chance. Although this
test does not strictly prove the chance theory,
we can be very confident that both the zeit-
geist and the genius theories are disconfirmed.
This conclusion is reinforced when we inspect
the estimated mean, which is less than 1.
This mean implies that contributions are
made less than once, on the average. Ac-
cordingly, a large proportion of potential
contributions never appeared at all. Given a
mean of about .8 we would expect, by the
Poisson model, that some 45% of all poten-
tial contributions would become nondiscov-
eries or noninventions, about 36% becoming
singletons, and the remaining 19% multiples.
Given this outcome, scientific or technological
advance is not very inevitable, to say the
least.
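
The three percentages follow directly from the Poisson probabilities of zero, one, and two or more events. As a minimal sketch, the hypothetical helper `poisson_shares` below returns those three shares for any mean, so the effect of a larger or smaller mean can be checked as well.

```python
import math


def poisson_shares(mu: float) -> tuple[float, float, float]:
    """Return (never made, made once, made two or more times) under Poisson(mu)."""
    p_none = math.exp(-mu)          # P(X = 0): nondiscoveries
    p_single = mu * math.exp(-mu)   # P(X = 1): singletons
    p_multiple = 1.0 - p_none - p_single
    return p_none, p_single, p_multiple


# mu = 0.8 is the rounded mean used in the text; 0.5 and 1.0 are
# illustrative comparison values only.
for mu in (0.5, 0.8, 1.0):
    none, single, multiple = poisson_shares(mu)
    print(f"mu={mu}: none={none:.1%}, singletons={single:.1%}, multiples={multiple:.1%}")
```

With a mean of .8 the shares round to 45%, 36%, and 19%. Total misses exceed one half only when the mean drops below ln 2, or roughly .69.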

I hasten to point out that it would be very
difficult for zeitgeist theory to save the day 
by simply finding more multiples. It is my 
experience after two years of collecting multi- 
ples that the more multiples one teases out 
of history, the worse the picture becomes for 
any social deterministic outlook. What is re- 
quired is more high-grade multiples—more 
triplets on up—in order to give the distribu- 
tion a larger mean, less right skewness, and an 
approximately inverted U shape instead of the 
monotonically decreasing form. But that is 
not what happens. The quest for more multi- 
ples tends to dig up more doublets, rendering 
the case for the chance model stronger rather 
than weaker. Thus if we examine just the
exclusive sample of 199 multiples, we find
that there are 144 doublets, 42 triplets, 9
quadruplets, 3 quintuplets, and 1 octuplet.
The two-moments estimate of the mean is
1.06 with a standard error of 1.33, χ²(3) =
3.37, p > .25. Even though this mean falls
within the interval estimate for the inclusive
sample, it still is larger. Hence the addition
of more multiples actually lowered the mean,
making the distribution more skewed right by
the disproportionate addition of doublets.

I should also observe that the Poisson dis- 
tribution proves to be an acceptable fit only if 
the singletons are excluded. Otherwise, I know 
of no probability model that will fit the data. 
We therefore have two choices. On the one 
hand, we can accept the observed relative 
frequencies of singletons and multiples as 
being realistic, in which case the genius theory 
is favored. On the other hand, we can ignore 
the singletons and look only at the probability 
distribution of multiples, in which case the 
chance theory is favored. The zeitgeist theory 
could not explain these results without adding 
some post hoc interpretation. 

Probabilities within disciplines (Test 2). 
If we subdivide the multiples into the seven 
fields, we discover that there are 53 multiples 
in mathematics, 28 in astronomy, 115 in 
physics, 78 in chemistry, 33 in biology, 184 
in medicine, and 88 in technology. By com- 
parison, Darmstaedter (1908) lists 329 con- 
tributions in mathematics, 478 in astronomy, 
1,511 in physics, 2,469 in chemistry, 1,415 in 
biology, 1,268 in medicine, and 4,830 in tech- 
nology (Sorokin, 1957, Table 5). Hence, as 
an approximation we may assert that 16% of 
mathematics, 6% of astronomy, 8% of phys-
ics, 3% of chemistry, 2% of biology, 14% of
medicine, and 2% of technology involves 
multiples. Except for the high proportion in 
medicine, these percentages seem to support 
the zeitgeist position in Test 2: The more 
codified and mathematical disciplines do ap- 
pear to have a larger proportion of multiples. 
But again, this conjecture hinges on the as-
sumption that the comparison of multiples
and singletons is justified. Do we obtain cor-
roboration when the probability distributions
are studied?
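
The field-by-field proportions can be recomputed from the counts quoted in this paragraph. The sketch below uses only those counts; the dictionary names are illustrative. The ratios are crude upper bounds in the same "at best" sense as the overall 5% figure.

```python
# Multiples per field in the inclusive sample (from the text).
multiples = {
    "mathematics": 53, "astronomy": 28, "physics": 115, "chemistry": 78,
    "biology": 33, "medicine": 184, "technology": 88,
}

# Contributions per field in Darmstaedter (1908), as tabulated by Sorokin (1957).
darmstaedter = {
    "mathematics": 329, "astronomy": 478, "physics": 1511, "chemistry": 2469,
    "biology": 1415, "medicine": 1268, "technology": 4830,
}

# Percentage of listed contributions in each field that are multiples.
shares = {field: 100.0 * multiples[field] / darmstaedter[field] for field in multiples}

for field, pct in shares.items():
    print(f"{field:12s} {pct:5.1f}%")

overall = 100.0 * sum(multiples.values()) / sum(darmstaedter.values())
print(f"{'all fields':12s} {overall:5.1f}%")
```

On these counts astronomy comes out near 6% and medicine near 14.5%, and the field totals sum to the 579 multiples and 12,300 contributions cited earlier.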


Table 2 shows the [...] distributions as well
[...] and goodness-of-fit [...] consis-
tent with either the zeitgeist or the genius
theories. In fact, the standard errors of the
estimates are so large that we cannot reject
the hypothesis that all the sample means
come from the same population mean.³ Fur-
ther reason for doubting the substantive sig-
nificance of any mean differences comes from
the fact that the magnitudes and ordering of
the means in Table 2 are not very consistent
with those found in a secondary analysis of
the Ogburn and Thomas (1922) data (Simon-
ton, 1978). Hence, this part of Test 2 seems
to endorse the chance theory. If we reject the
proportion of multiples to singletons as mean-
ingless, then the probability that a multiple
will appear is roughly the same across all
disciplines.

Table 2
Multiple Frequencies and Poisson Statistics Within Disciplines

                                      Seven fields
Item            Mathematics  Astronomy  Physics  Chemistry  Biology  Medicine  Technology
Multiple grade
  2                  41          21        94        65        29       130        69
  3                   9           6        21        10         2        41        15
  4                   2           0         0         3         2         8         3
  5                   1           1         0         0         0         5         0
  6                   0           0         0         0         0         0         0
  7                   0           0         0         0         0         0         0
  8                   0           0         0         0         0         0         1
Statistics
  μ                 .84         .89       .46       .57       .56      1.02      1.61
  SE                .42         .61       .18       .25       .38       .27       .36
  χ²                .40         .26      3.57      1.24       .94      4.63      1.61
  df                  2           2         2         2         1         3         2
  p                >.75        >.75      >.10      >.50      >.25      >.10      >.25


Who Participates in Multiples? 


Eminence, multiples, and creative produc-
tivity (Test 3). To scrutinize Test 3 em-
pirically, we must first obtain a sample of
creators. This end was accomplished by taking
all those scientists and inventors with entries
in either Asimov (1972) or Williams (1974)
who were born on or after 1500 and who died
on or before 1909. (The latter date was
selected to maximize the number and quality
of sources available for operationalizing crea-
tive productivity.) The result was a sample
of 789 scientists and inventors. The eminence
of each was determined by using the Encyclo-
pedia Britannica (1974), a source shown by
previous studies to yield reliable estimates of
individual eminence (e.g., Simonton, 1976a,
1976d, 1977b). A scientist or inventor was
given 3 points if included in the selective
Macropædia, 2 points if given a “major entry”
in the more inclusive Micropædia, 1 point if
only given the terse “subsidiary entry” in the
Micropædia, and 0 points if not included at
all. Creative productivity was measured by
counting the number of contributions listed in
Darmstaedter (1908), a source shown to be
very reliable for the period covered (Simon-
ton, 1975, 1976b, 1976c). Finally, for each
scientist or inventor I tabulated the number
of multiples in which he or she participated.
When we calculate the zero-order Pearson
product-moment correlation coefficient be-
tween eminence and the number of multiples
produced by an individual we find the statis-
tically significant value of .25 (p < .001).
So far, the zeitgeist theory is supported.
Nonetheless, creative productivity correlates
.32 (p < .001) with eminence and .42 (p <
.001) with the number of multiples. Because
the more eminent creators are more produc-
tive regarding both singletons and multiples,
the correlation between eminence and multi-
ples may be spurious, as chance theory pre-
dicts. Calculation of the first-order partial
correlation between eminence and multiple pro-
ductivity controlling for creative productivity
yields a value of .14 (p < .001). We are now
in something of an interpretative quandary.
On the one hand, this partial correlation re-
mains highly significant statistically, and
consequently the zeitgeist theory seems justi-
fied. The more eminent creators are more
likely to participate in multiples even after
controlling for individual differences in the
number of notable contributions. On the other
hand, the substantive significance of the par-
tial correlation is small, given that only 2%
of the variance in eminence and multiple par-
ticipation is shared after controlling for crea-
tive productivity. So in a substantive sense,
the effect of zeitgeist is so small that we might
adopt the chance theory as the main explana-
tion.

³ Even though we did not prove the Poisson
model to be the best possible fit, the data are defi-
nitely far from being normally distributed. There-
fore, it seems far more reasonable to base statistical
tests on the standard error of estimate for each
Poisson μ.

Since we are forced to interpret such small 
effects, I decided to attempt a replication us- 
ing different operationalizations of eminence 
and creative productivity. This time eminence 
was defined using the earlier edition of 
Asimov (1964), which makes finer distinc- 
tions regarding relative fame. A creator was 
given 3 points if he or she had a “major
entry” in Asimov, 2 points for a “subsidiary
entry,” 1 point for only a listing in the index,
and 0 points for no listing at all. This measure
of eminence correlated .45 with the one pre-
viously derived from the Encyclopedia Bri-
tannica (1974). This interitem reliability co-
efficient compares quite favorably with inter-
test reliabilities for creativity and intelligence
measures on selective populations (see, e.g.,
Wallach & Kogan, 1965). The new
measure of creative productivity was not
taken from a single source, but rather con-
sisted of all contributions listed in a large
number of separate chronologies (viz., Auer-
bach, 1923; Boyer, 1968; Dannemann, 1928;
Daumas, 1957; Feldhaus, 1904; Gabriel &
Fogel, 1955; Garrison, 1929; Grun, 1975;
Hilditch, 1911; Langer, 1972; Mayer, 1949;
Mumford, 1934). This second measure corre-
lated .79 with that derived solely from Darm-
staedter (1908). When we calculate the cor-
relation between the two new measures of
eminence and multiples we obtain .26 (p <
.001), a value hardly differing from that ob-
tained earlier. Moreover, the new measure of
creative productivity correlates .37 (p <
.001) with the new fame measure and .48
(p < .001) with the number of multiples,
values only modestly larger than those cal- 
culated formerly. Therefore, it should come 
as no surprise that the partial correlation be- 
tween eminence and multiple participation 
controlling for creative productivity is re- 
duced only slightly to a still statistically sig- 
nificant .10 (p < .05). So the results we have 
obtained are extremely robust. In fact, no 
matter which combination of fame and crea- 
tive productivity measures we employ, the 
outcome is essentially unaltered. Further, the 
same basic results emerge even if we employ 
only the exclusive sample of multiples. 
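
Both partial correlations can be checked against the standard first-order formula, r_xy.z = (r_xy − r_xz·r_yz) / √[(1 − r_xz²)(1 − r_yz²)]. The sketch below applies it to the rounded zero-order correlations reported in the text, so the results can differ in the second decimal from values computed on the raw data. The function name `partial_corr` is illustrative.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# x = eminence, y = number of multiples, z = creative productivity.
# Original measures (Encyclopedia Britannica, Darmstaedter).
first = partial_corr(r_xy=0.25, r_xz=0.32, r_yz=0.42)
# Replication measures (Asimov, pooled chronologies).
second = partial_corr(r_xy=0.26, r_xz=0.37, r_yz=0.48)

print(f"original measures:    r = {first:.3f}, shared variance = {first ** 2:.1%}")
print(f"replication measures: r = {second:.3f}, shared variance = {second ** 2:.1%}")
```

From the rounded inputs the first value falls between .13 and .14 and the second is about .10. In both cases the squared partial correlation is on the order of 1% to 2%.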

It is manifest that the genius theory cannot 
explain the results. In addition, since the cor- 
relation between eminence and multiple pro- 
duction remains even after controlling for 
creative productivity, the zeitgeist theory has 
some credence. Nevertheless, partialing out 
creative productivity reduces the correlation 
to such a small value that the chance theory 
is not altogether ruled out as long as a modest 
zeitgeist influence is permitted. 

Eminence and priority (Test 4). Are the
more famous creators more likely to get there
first, leaving their lesser colleagues the frus-
tration of rediscovering the already known?
To answer this question we first must be able
to distinguish the relative eminence of each
participant in a multiple. [...] shown how Asimov
[...] Encyclopedia Britannica [...] this purpose,
[...] extremely compr[...] Debus (1968) [...]
composite score, generating an index ranging
from 0 to 7.

Since the possible rank orderings vary de- 
pending on the grade of the multiple, it makes 
more sense to apply this eminence measure 
to each grade separately. An examination of 
doublets reveals that of the 449 cases, the data
were insufficient in 63 to estimate the rela-
tionship between eminence and priority
(usually because the two contributors were
equally obscure). Of the remaining cases, the
contributions were simultaneous in 197, the
more eminent creator had priority in 79, and
the less eminent creator had priority in 110.
The preponderance of simultaneous contribu- 
tions may be taken as evidence against the 
genius theory. I strongly suspect, however, 
that this proportion would be reduced if dis- 
sertation-sized research could be devoted to 
each case (e.g., the proportion of simultaneous 
multiples is much smaller for the exclusive 
sample that involves better documented 
cases). Much more curious is the fact that the 
priority assignments, when not simultaneous, 
are definitely contrary to the genius theory, 
at least as presumed in Test 4. The more emi-
nent creators are more likely to be the redis-
coverers, whereas the less eminent creator is
more likely to have priority, χ²(1) = 5.08,
p < .05. Perhaps the eminent scientist, like
the eminent philosopher, is behind the times
(cf. Simonton, 1976d). In the process of con-
solidating past advances, the major scientist
may inadvertently rediscover much of the
work of various predecessors. Therefore, this
outcome might be taken as support for the
genius theory if we change our perspective
on the genius from precursor to consolidator.
Nonetheless, even if we grant this post hoc
amendment, the impact of genius must be
small, given the prevalence of simultaneous
contributions.
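
The chi-square statistic for the nonsimultaneous doublets can be reproduced as a one-degree-of-freedom goodness-of-fit test. The sketch below assumes the null hypothesis of an even split between the more and the less eminent creator; the article does not spell out the null, so that is an assumption. It uses the identity that the upper tail of a chi-square variate with 1 df equals erfc(√(x/2)).

```python
import math

# Doublets in the inclusive sample (from the text).
insufficient, simultaneous = 63, 197
more_eminent_first, less_eminent_first = 79, 110
assert insufficient + simultaneous + more_eminent_first + less_eminent_first == 449

# Assumed null hypothesis: priority is equally likely to go either way.
observed = (more_eminent_first, less_eminent_first)
expected = sum(observed) / 2.0

chi2 = sum((o - expected) ** 2 / expected for o in observed)

# Upper-tail probability of a chi-square variate with 1 df.
p_value = math.erfc(math.sqrt(chi2 / 2.0))

print(f"chi-square(1) = {chi2:.2f}, p = {p_value:.3f}")
```

Under that assumption the statistic matches the reported 5.08, and the tail probability is between .02 and .03.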

Because the higher grade multiples are so 
few in number, inferences are necessarily less 
secure. Among triplets, the contributions are
simultaneous in 36, the most eminent creator
first in 15, second in 13, and third in 17 (the
remaining 23 being unclassified for reasons
given above). For quadruplets, 4 are simul-
taneous, and the most eminent creator is first
5 times, second twice, third twice, and last
once (with 4 cases unclassified). Among quin-
tuplets, there are no simultaneous multiples,
the most eminent creator is first in 3 in- 
stances, and the third, fourth, and fifth places 
have 1 case each (and one unclassified). 
Finally, the single octuplet can only be con- 
sidered a simultaneous multiple due to the 
plus or minus on the dating. Hence the whole 
picture suggests some tendency for the more 
eminent creators to have priority for the 
higher grade multiples, but none of these 
comparisons involve enough cases to permit 
statistical confidence in any conclusions. 


Conclusion 


The position that best meets all critical 
tests is the chance theory. Only the chance 
theory predicts the characteristic probability 
distribution of multiples and the absence of 
any probability differences from discipline to 
discipline. Although chance theory had to 
yield some ground to zeitgeist theory in Test 
3, control for creative productivity consider- 
ably reduced the common variance between 
eminence and multiple participation. So on 
the whole, chance theory may be the best 
general explanation of the phenomenon. De- 
spite the apparent explanatory superiority of 
the chance theory, however, the other two 
theories are not completely overturned. I 
think it is significant that the more eminent 
scientist is more likely to rediscover inde- 
pendently the work of his or her less eminent 
colleagues. Like the great philosopher, the 
eminent scientist appears to be a consolidator, 
synthesizing past achievements (cf. Simonton, 
1976d). This consolidative process may some- 
times entail some sociocultural redundancy in 
the form of rediscovery or reinvention. None- 
theless, the fact remains that there exist indi- 
vidual differences in priority, differences that 
cannot be explained by either chance or zeit- 
geist theories. 

When we turn to zeitgeist theory, we can- 
not ignore its explanatory value either. We 
must obviously acknowledge that the zeit- 
geist is probably a necessary even if not a 
sufficient determinant of discovery or inven-
tion. There can be no denying that some con-
tributions are prerequisites to other contribu-
tions. Certainly the discovery of bacteria pre-
supposes a microscope, and the discovery of
Jupiter’s moons presumes the existence of a 
telescope. Nevertheless, the invention of these 
instruments does not automatically cause the 
corresponding discoveries for which they are 
prerequisite. The discoveries may or may not 
occur after the preconditions have been ful- 
filled: Galileo’s discovery of Jupiter’s satel- 
lites took place only 1 year after the telescope 
was invented, whereas Leeuwenhoek’s dis- 
covery of bacteria happened about 200 years 
after the simple microscope appeared. So
there remains some degree of indeterminacy
in the influence of the zeitgeist, an indeter-
minacy consistent with the chance model.
Allow me to expand the last point: Not 
only can the zeitgeist and chance theories be 
seen to provide complementary accounts of 
multiples, but in addition, all three theories 
can be made compatible to some extent. Ad-
mittedly, I have sometimes oversimplified
one or another theoretical outlook in the in-
terest of drawing distinctive empirical predic-
tions. Although advocates of the pure forms
for each theory definitely exist (e.g., Kroeber, 
1917), many researchers opt for some more 
moderate combination (e.g., Merton, 1961). 
By adjusting all three theories in the light of 
the current data, a more unified perspective 
on multiples can emerge. This possibility can 
be readily seen by more closely examining 
what I consider to be the two principal em- 
pirical results of this study. 
The first main finding concerns the inter-
relations among individual eminence, crea-
tive productivity, and participation in multi-
ples. It is definitely consistent with genius
theory to notice a high correlation between
the fame of an individual scientist or inventor
and the number of contributions he or she
makes (Albert, 1975). It is also in line with
zeitgeist theory to observe that the more
eminent contributor is more likely to produce
more multiples. Yet in accord with the chance
theory, a large part of the multiple produc-
tion of more eminent creators is a probabilis-
tic consequence of their greater creative pro-
ductivity. Hence, all three theories tend to
converge to explain the same phenomenon at
least so long as each undergoes some accom-
modations. The genius theory must permit
the eminent scientist to be a major proponent
of the current trends in his or her field, and
not a dissenter or rebel. Similarly, the zeit-
geist and chance theories must each yield
some ground in explicating the relation be-
tween eminence and participation in multi-
ples. Eminent scientists engage in more multi-
ple production both because they are more
productive and because they are more repre-
sentative of the spirit of their times.

The potential compatibility of the three 
theories becomes even more evident when we 
scrutinize the second principal finding, 
namely, that the frequency of various multi-
ple grades can be described by a Poisson dis-
tribution. To simplify the argument, let us
suppose the mean of the distribution is some-
what close to unity. Then it may be argued
that for any given potential contribution
about 25 people are capable of the achieve-
ment, each having 1 chance out of 25 of suc-
ceeding (i.e., np = 25 × 1/25 = 1). These
parameters can be made consistent with the 
genius model if we define the genius as a rare 
individual who has a probability greater than 
0 of making notable contributions. It would 
be no slight achievement to be one out of a 
mere 25 persons in all history who are able
to create the calculus or evolutionary theory 
or any other contribution. And even though 
chance plays a larger role than a pure genius 
theory might suggest, this interpretation still 
states that only a select few have any oppor- 
tunity whatsoever. So serendipity may be ever 
present, but it is an elitist serendipity when 
so few have the capacity to be so lucky. More- 
over, even if the probability that a creator 
would produce any given contribution were 
only around 1/25, the probability of that 
same creator producing some notable con- 
tribution would be high. On the average, after 
a little more than two dozen trials the odds
would favor the creator “chancing” upon
some major discovery or invention. 

A [...] model with the parameters of n =
25 and p = 1/25 could also be compatible
[...] with some amend-
ments [...]. If the spirit of the times [...]
influential, then certainly we would
expect that the course of scientific
progress would not depend on just one indi-
vidual. And the existence of more than two
dozen potential contributors for any given dis-
covery or invention obviously would provide
the needed reserves. On the other hand, zeit-
geist theory must relinquish the idea that dis-
coveries or inventions are inevitable. Under
the current parameters more than one-third of
all potential contributions will be missed by
all. Nonetheless, matters could be a lot worse:
The mean could be much smaller than unity,
so that the proportion of total failures could
exceed half or more. As it stands, with a mean
approaching 1, contributions are made once
on the average. By permitting multiples to
compensate for complete misses, discovery
and invention can be said to have at least a
probabilistic variety of inevitability.
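
The claim that more than one-third of potential contributions would be missed by all can be checked directly. As a minimal sketch, the code below treats the number of successful contributors as a binomial variate with the illustrative parameters n = 25 and p = 1/25 from the text, and compares the result with the Poisson limit for a mean of 1. The constant and function names are hypothetical.

```python
from math import comb, exp

N_CANDIDATES = 25        # people capable of a given contribution (from the text)
P_SUCCESS = 1.0 / 25.0   # chance that any one of them succeeds (from the text)


def binom_pmf(k: int, n: int = N_CANDIDATES, p: float = P_SUCCESS) -> float:
    """Probability that exactly k of the n candidates make the contribution."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


p_missed = binom_pmf(0)                   # nobody makes the contribution
p_single = binom_pmf(1)                   # exactly one person does (a singleton)
p_multi = 1.0 - p_missed - p_single       # two or more do (a multiple)

print(f"expected contributors: {N_CANDIDATES * P_SUCCESS:.2f}")
print(f"missed by all: {p_missed:.3f} (Poisson limit {exp(-1.0):.3f})")
print(f"singleton:     {p_single:.3f}")
print(f"multiple:      {p_multi:.3f}")
```

With these parameters about 36% of potential contributions are missed entirely, close to the Poisson value of roughly 37% for a mean of exactly 1.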

In conclusion, after various mutual con- 
cessions all three perspectives can be said to 
enhance our overall understanding of multi- 
ples. A small group of highly productive indi- 
viduals are most likely to participate in multi- 
ples, including independent rediscoveries. 
These same geniuses, as it were, are also un- 
usually intimate with the technoscientific 
zeitgeist and are perhaps equally gifted with 
an inordinate amount of good luck. 


Reference Note 


1. R. K. Merton. Personal communication, April 19,
1976.

Value Correlates of Conservatism


N. T. Feather
The Flinders University of South Australia, Bedford Park, Australia


Stepwise multiple regression analyses [...]
[...]tance, age, sex, education, and income as the independent variables were con-
ducted to discover the major predictors of conservatism. On the basis of past
discussions of the ideal conservative, [...]
people would emphasize values concerned with attachment to rules and au-
thority and ego defense (e.g., security, [...]
salvation) and downgrade other values concerned with equality, freedom, love,
and pleasure, as well as open-minded, intellectual, and imaginative modes of
thought. This hypothesis was confirmed by the results from two independent
surveys involving a sample of families in metropolitan Adelaide (Sample 1,
1972) and the families of a sample of students at Flinders University (Sample
2, 1976/1977). Respondents in both samples answered the Rokeach Value Sur-
vey [...] back-
ground and demographic information. In addition to the values, age and sex
were significant predictors (especially age), with older respondents tending to 
be more conservative than younger ones and females more conservative than 
males. Education and income of the heads (Sample 1) and fathers (Sample 2) 
of families played a minor role in prediction. Results are interpreted as sup- 
porting both the cognitive learning and psychodynamic explanations of value/ 
attitude relationships. The need for developmental studies in this area using 
new sophisticated methodologies along with causal models is emphasized. 


Does the person who has conservative views
over a wide range of issues have a different
pattern of values from the person whose at-
titudes are more liberal? The present article
reports research that was designed to answer
this question. For present purposes conserva-
tism will be defined by reference to the anal-
ysis by Wilson and his colleagues (Wilson,
1973). Wilson (1973) claims that conserva-
tism is “a general factor underlying the en-
tire field of social attitudes much the same as
intelligence is conceived as a general factor
which partly determines abilities in different
areas” (p. 3). The “ideal” conservative is
seen as a person who tends to have a funda-
mentalist religious orientation and whose po-
litical leanings are likely to be proestablish-
ment and supportive of the status quo; who
tends to insist upon strict rules and punish-
ments; who is likely to favor militarism; who
tends to be ethnocentric and intolerant of
minority groups; who tends to prefer what is
conventional, traditional, and familiar; who
is likely to be antihedonistic in outlook and
to favor restriction of sexual behavior; who
tends to oppose scientific progress and “new-
fangled” ideas; and who is likely to be fatal-
istic and superstitious.

This research was funded by grants from the
Australian Research Grants Committee and from
Flinders University.
Requests for reprints should be addressed to N. T.
Feather, Discipline of Psychology, School of Social
Sciences, The Flinders University of South Australia,
Bedford Park, South Australia 5042.

Wilson and his colleagues have used a ques- 
tionnaire, the Conservatism Scale (C Scale), 
to measure general conservatism. This scale 
involves items that were selected to relate to 
the main characteristics of the ideal con- 
servative that have just been mentioned. Each 
item consists of a word or phrase represent- 
ing a familiar or controversial issue (e.g., 
legalized abortion, death penalty, Bible
truth), and the respondent registers approval,
disapproval, or uncertainty. This simple item
format is assumed to overcome many of the
problems inherent in more traditional item
formats employing attitude statements.

Wilson (1973, chaps. 4 & 5) draws upon 
factor analytic evidence in support of his 
claim that the C Scale measures a general 
factor called conservatism. One should note,
however, that his claim has not, as far as the 
C Scale is concerned, gone undisputed 
(Boshier, 1972; Robertson & Cochrane, 1973 
—but see also Kirton, 1978; Wilson, 1972). 
The issue of the structure of conservatism 
is controversial and complex and is unlikely 
to be resolved on the basis of factor analytic 
findings alone (Feather, 1975a). It is clearly 
necessary to develop theories about the nature 
of general conservatism and its behavioral 
correlates and about how conservative atti- 
tudes develop. Then one can proceed to test 
these theories by using a wide range of em- 
pirical procedures. 

Wilson’s own theoretical stance vis-à-vis the 
conservative attitude syndrome is that it is 
intimately related to genetic and environ- 
mental factors that determine feelings of in- 
security and inferiority. The common basis 
for all of the various components of the syn- 
drome is assumed to be “a generalized sus- 
ceptibility to experiencing threat or anxiety 
in the face of uncertainty” (Wilson, 1973, 
chap. 17). The conservative individual tends 
to avoid both stimulus and response uncer- 
tainty [...] 

[...] serve a defensive function [...] 
rendering more secure, [...] 
by subjugating them to rigid and simplistic 
codes of conduct (rules, [...] morals, duties, 
obligations, etc.), thus [...] (pp. 261-264) 


These ideas are not dissimilar to those in- 
volved in the earlier influential work on 
the authoritarian personality (Adorno, Fren- 
kel-Brunswik, Levinson, & Sanford, 1950; 
Kirscht & Dillehay, 1967; Ray, 1973; San- 
ford, 1973)—for example, in the emphasis 
upon the defensive functions of conservative 
attitudes—although Wilson’s analysis of con- 
servatism stresses order and control as a 
means of reducing anxieties relating to feel- 
ings of insecurity and inferiority, whereas the 
theoretical account of authoritarianism draws 
upon Freudian concepts and deals more with 
the control of impulses that may be person- 
ally and socially unacceptable, such as sex 
and aggression. Sanford’s (1973) recent over- 
view of research into the authoritarian per- 
sonality impresses with its subtle awareness 
of the complex personality processes that may 
underlie authoritarian attitudes and the im- 
portant role played by the family and the 
wider social context. It is noteworthy, how- 
ever, that the main personality dispositions 
of the authoritarian as described by Sanford 
(1973, pp. 143-146) overlap with the char- 
acteristics of the ideal conservative that were 
previously listed. Descriptively, therefore, 
the concept of conservatism as employed by 
Wilson has much in common with the earlier 
concept of authoritarianism. And at the level 
of explanation, there is some agreement also 
that these attitude syndromes may serve the 
function of ego defense. 

In the present article we make no attempt 
to develop a theory about the structure and 
dynamics of conservatism. Rather, our main 
interest is to discover whether conservative 
attitudes are related in predictable ways to 
more general values. Very little is known 
about these relationships, but it is reason- 
able to expect that attitude-value linkages 
will occur and that their emergence will de- 
pend upon the individual’s developing ability 
to form abstract concepts that capture the 
regularities or consistencies in experience, an 
ability that improves with increasing matur- 
ity. It is likely that specific attitudes come 
first and that general values begin only later 
to emerge as abstractions from personal ex- 
perience, especially experience of one’s own 
attitudes and actions, their relationship to 
the attitudes and behaviors of significant 
others, and the judgments of significant 
others about what is and what is not desir- 
able. In time these values become organized 
into value systems according to their rela- 
tive importance. Once formed, values and 
value systems serve as frames of reference 
or criteria that may be used to guide thought 
and action in many different contexts. They 
represent a personal working over of experi- 
ence in terms of the individual’s own internal 
cognitive capacities and personality dynam- 
ics. They may be modified and refined in 
light of new and discrepant experience, but 
throughout life they remain as important 
components of the self-concept. These points 
have been discussed and elaborated in pre- 
vious publications (Feather, 1975b, 1978, in 
press; Rokeach, 1973). 

What are some of the values that one 
would expect to be associated with the con- 
servative attitude syndrome? The previous 
discussion suggests that values that are com- 
monly held to involve ego defense would be 
involved. Rokeach (1973, p. 16) singles out 
a number of them. Research on the authori- 
tarian personality and on religion, he argues, 
suggests that an overemphasis “on such 
modes of conduct as cleanliness and polite- 
ness and on such end-states as family and 
national security may be especially helpful 
to ego defense” (p. 16) and that “religious 
values more often than not serve ego-de- 
fensive functions” (p. 16). At the same time, 
Rokeach (1973) recognizes that the motiva- 
tional base of values is complex and that in 
the final analysis “all of a person’s values 
are conceived to maintain and enhance the 
master sentiment of self-regard—by helping 
a person adjust to reality, defend his ego 
against threat, and test reality” (p. 15). 

The values mentioned by Rokeach are con- 
tained in his Value Survey, a test that lists 18 
terminal values (concerned with goals or 
end states of existence) and 18 instrumental 
values (concerned with means or modes of 
conduct). There are other values from these 
lists, however, in addition to those related 
to ego defense, that one would expect to be 
associated with general conservatism. Thus, 
a conservative tendency to overglorify au- 
thority, rules, and punishment would prob- 
ably be associated with a more extreme 
emphasis upon such instrumental values as 
being obedient and polite and with a down- 
grading of terminal values such as equality 
and freedom. Strict adherence to the familiar, 
the traditional, and the conventional, as 
demonstrated, for example, in anti-intellec- 
tual and antiscientific biases, would probably 
be associated with diminished importance 
attached to such instrumental values as be- 
ing broadminded, imaginative, intellectual, 
and logical. It is likely that the antihedon- 
istic attitudes assumed to be characteristic 
of the ideal conservative would be associated 
with downgrading such terminal values as 
mature love and pleasure. Finally, the eth- 
nocentric bias of the ideal conservative might 
appear in an upgrading of cleanliness values 
and a downgrading of equalitarian values. 

In summary, it was predicted that general 
conservatism as measured by the C Scale 
would be associated with a more extreme 
emphasis upon values concerning salvation, 
security, cleanliness, and obedience, and a 
less extreme emphasis upon values concern- 
ing equality, freedom, pleasure, love, and 
modes of conduct that involve acting and 
thinking in broadminded, imaginative, intel- 
lectual, and logical ways. These predictions 
are all consistent with the descriptive and 
theoretical accounts of conservatism and 
authoritarianism that were mentioned previ- 
ously and that have been extensively discussed 
by others (e.g., Kirscht & Dillehay, 1967; 
Sanford, 1973; Wilson, 1973). 

In addition to providing evidence relating 
to these predicted relationships, the present 
study also included background and demo- 
graphic variables (age, sex, education, in- 
come) in the analysis with a view to inves- 
tigating their relationships to conservatism 
as well. It is already known that conserva- 
tism as measured by the C Scale tends to 
increase with age and to be higher for females 
(Feather, 1975b, 1977b; Wilson, 1973). The 
relationship of conservatism to socioeconomic 
status is, however, less clear. In a recent 
study of the structure of social attitudes, Ey- 
senck (1975) presented evidence for two con- 
servatism factors—general conservative ver- 
sus radical ideology and socioeconomic con- 
servatism versus socialism—in addition to 
the tough-mindedness versus tender-minded- 
ness dimension that appeared in his earlier 
work (Eysenck, 1954). He found that 
middle-class people tended to be more radical 
in their general attitudes but more conserva- 
tive in their economic attitudes when com- 
pared with working-class people (see also 
Lipset, 1960). These results suggest that C 
Scale scores should tend to decrease with 
increased level of education, given the fact 
that the C Scale appears to be measuring the 
type of general conservatism factor to which 
Eysenck refers.² Thus, it was further pre- 
dicted that general conservatism will not 
only increase with age and be higher for 
females than for males but will also be lower 
for higher levels of education. 

A final aim of the present study was to 
discover the most important value predictors 
of general conservatism by using stepwise 
multiple regression procedures and to com- 
pare the level of prediction achieved by these 
values with that achieved by the values, the 
background, and demographic variables taken 
together, and with that achieved by the back- 
ground and demographic variables alone. 
Such a comparative analysis does not involve 
prior specification of a causal model, but it 
does throw light on the question as to 
whether information about a person’s value 
priorities provides better prediction of gen- 
eral conservatism than information about 
[...] socioeconomic status 
or whether these relationships are more com- 
plexly determined and involve not only the 
molding effect of experience in particular sit- 
uations but also the operation of internal per- 
sonality dynamics that affect the way en- 
vironmental input is coded and transformed 
(see Sanford, 1973, for a good discussion of 
this point). 

A distinctive feature of the present inves- 
tigation is that it involved two separate data 
sets collected at different times with different 
samples. Hence it was possible to discover 
how far results with one sample were repli- 
cated by the results of the other sample. 


Method 
Subjects and Procedure 


Sample 1 (Adelaide metropolitan). The data for 
this sample were obtained from an extensive family 
survey conducted in metropolitan Adelaide in 1972. 
Details about subjects and procedure have been pre- 
sented previously (Feather, 1975b, pp. 121-125; 
1978). The survey involved dwellings that included 
children 14 years of age or more. The sample was 
selected by using a multistage cluster sampling 
frame of the Adelaide Statistical Division main- 
tained by the School of Social Sciences at Flinders 
University. An attempt was made to survey all 
adult members in each dwelling, including children 
14 years of age or older. A trained interviewer con- 
ducted the survey. Respondents were assured that 
their answers would be confidential, and names 
were not required on the questionnaire. They were 
requested to complete the questionnaire indepen- 
dently, without discussing it with other members 
of the family. Out of a total of 659 possible re- 
spondents from the dwellings sampled, there were 
72 refusals, leaving 587 respondents who answered 
some or all of the complete survey questionnaire 
(an 89% response rate). These respondents were 
made up of 147 who described themselves as heads 
of household (117 male, 30 female), 145 wives, 152 
sons, 126 daughters, and 17 from other members 
of the dwellings. 

Sample 2 (Flinders University). This sample in- 
volved students who were enrolled in a third year 
(senior) course on social motivation offered at 
Flinders University in 1976 and again in 1977, to- 
gether with members of their families when they 
were available. These students completed a ques- 
tionnaire containing the relevant measures. The 
majority of them then took the questionnaire home 
and administered it to their parents and to sib- 
lings who were 14 years of age or older living in 
the dwelling. A small number of older students 
administered it to their wives (or husbands) and 
to those of their children who were 14 years of 
age or older and were living in the dwelling. It was 
emphasized that the questionnaires should be an- 
swered independently by family members without 
any consultation with each other and that answers 
were confidential. Names were not required on the 
questionnaire. Data were obtained from 358 re- 
spondents consisting of 74 fathers, 83 mothers, 92 
sons, and 109 daughters. The response rate from 
available family members was very high, exceed- 
ing 90%. 


2 The factor analytic studies discussed by Wilson 
(1973) do not provide evidence for a factor of 
socioeconomic conservatism. However, this may sim- 
ply reflect an absence of relevant items in the C 
Scale. 

Questionnaire. The first part of the question- 
naire was designed to obtain background and demo- 
graphic information (e.g., status in household, age, 
sex, birthplace, income, religion, etc.). This was 
followed by Form D of the Rokeach Value Survey, 
containing the two sets of terminal and instrumen- 
tal values. In this version each value with its short 
definition is printed on a removable gummed label, 
and the values in each set are presented in alpha- 
betical order. One set of 18 terminal values con- 
cerns general goals or end states of existence (e.g., 
freedom, happiness, salvation, wisdom). The other 
set of 18 instrumental values concerns modes of 
conduct (e.g., being broadminded, honest, loving, 
responsible). Respondents ranked the 18 terminal 
and then the 18 instrumental values in their “order 
of importance to you, as guiding principles in your 
life,” using the standard instructions (Rokeach, 
1973, pp. 357-361). Respondents arrived at their 
final rank orders by rearranging the gummed labels 
within each value set so that the final rank orders 
indicated “how you really feel.” 

As in other studies in the Flinders program 
(Feather, 1975b, pp. 23-24) the ranks from 1 (most 
important) to 18 (least important) were transformed 
to z scores corresponding to a division into 18 
equal areas under the normal curve. This trans- 
formation was made on the assumption that it 
would be easier for respondents to discriminate be- 
tween the relative importance of values at the 
extremes of the scale than between those ranked 
in the middle of the scale (see Cohen & Cohen, 
1975, p. 265, and Hays, 1967, pp. 35-39, for justi- 
fication of this procedure). These transformed ranks 
for the 18 terminal and the 18 instrumental values 
constituted the measures of value importance for 
each respondent, and they could range from +1.91 
(most important) to -1.91 (least important). 
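
The rank-to-z transformation described above can be illustrated with a short sketch. It assumes that each of the 18 ranks is mapped to the normal deviate at the midpoint of its equal-area slice, an assumption that reproduces the stated range of +1.91 to -1.91. The function name rank_to_z is illustrative only and is not taken from the original analysis.

```python
from statistics import NormalDist


def rank_to_z(rank: int, n_values: int = 18) -> float:
    """Map a rank (1 = most important) to a z score.

    Assumption: the normal curve is cut into n_values equal areas and
    each rank receives the deviate at the midpoint (in area) of its slice.
    """
    if not 1 <= rank <= n_values:
        raise ValueError("rank must lie between 1 and n_values")
    # Rank 1 occupies the top slice, so its cumulative area is highest.
    cumulative_area = 1.0 - (rank - 0.5) / n_values
    return NormalDist().inv_cdf(cumulative_area)


# Transformed scores for ranks 1..18, rounded to two decimals.
z_scores = [round(rank_to_z(r), 2) for r in range(1, 19)]
```

Under this assumption adjacent ranks are spaced more widely at the extremes than in the middle of the scale, which matches the rationale given for the transformation.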

Respondents also completed the Conservatism 
Scale devised by Wilson and Patterson (1968) and 
discussed in Wilson (1973, 1975). The C Scale con- 
sists of 50 items, and as mentioned previously, each 
item relates to some familiar or controversial issue 
or concept (e.g., death penalty, evolution theory, 
chastity, church authority, colored immigration, dis- 
armament, strict rules, and so forth). Respondents 
were asked, “Which of the following do you favor 
or believe in?” and they answered each item by 
circling yes, ?, or no. The C Scale is balanced for 
acquiescence response set, since yes answers to odd- 
numbered items are scored to reflect a conservative 
response and no answers a liberal or nonconserva- 
tive response, whereas yes answers to even-num- 
bered items are scored to reflect a liberal response 
and no answers a conservative response. Responses 
to each item were scored in the usual manner 
(liberal response = 0, ambiguous response = 1, con- 
servative response = 2), so that total conservatism 
scores could range from 0 to 100 in the direction 
of increasing conservatism. There were 575 re- 
spondents in Sample 1 and 358 respondents in 
Sample 2 who completed the C Scale. 
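
The scoring rule just described can be sketched as follows. The function name score_c_scale and the string codes "yes", "?", and "no" are illustrative assumptions; only the keying of odd and even items and the 0/1/2 weights come from the description above.

```python
def score_c_scale(responses):
    """Return total conservatism (0-100) for 50 C Scale responses.

    responses: sequence of 50 strings, each "yes", "?", or "no",
    in item order (item 1 first).
    """
    if len(responses) != 50:
        raise ValueError("the C Scale has 50 items")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if answer == "?":
            total += 1  # ambiguous response
            continue
        if answer not in ("yes", "no"):
            raise ValueError(f"unexpected answer {answer!r}")
        # Odd items are keyed so that "yes" is conservative;
        # even items are keyed so that "no" is conservative.
        conservative_answer = "yes" if item_number % 2 == 1 else "no"
        total += 2 if answer == conservative_answer else 0
    return total
```

Because the keying alternates, a respondent who circles yes for every item scores 50, the same as one who circles ? throughout, which is the sense in which the scale is balanced for acquiescence.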


Form of Analysis 


Product-moment correlation coefficients were 
computed between the transformed rankings for 
each value and general conservatism as measured 
by the total score on the C Scale, and between 
each of the demographic and background variables 
and this total score. Subsequently, stepwise mul- 
tiple regression analyses were conducted using the 
Statistical Package for the Social Sciences (Nie, 
Hull, Jenkins, Steinbrenner, & Bent, 1975, chap. 20). 
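
As an illustration of the first stage of this analysis, the sketch below computes a product-moment correlation with pairwise deletion of missing data (missing values are represented here by None, an assumption of this example) and the associated t statistic. The analyses themselves were run in SPSS; the function names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation using only complete pairs.

    Returns (r, n), where n is the number of pairs retained.
    """
    pairs = [(x, y) for x, y in zip(xs, ys)
             if x is not None and y is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with zero variance has no correlation")
    return sxy / sqrt(sxx * syy), n


def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * sqrt((n - 2) / (1.0 - r * r))
```

With samples of several hundred respondents, even modest correlations yield large t values, which is why small coefficients can reach conventional significance levels.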

In these analyses sex was coded 1 (male) and 2 
(female). Education and income were both coded 
in relation to the nominated head of each house- 
hold for Sample 1 (the 1972 metropolitan Adelaide 
family sample) and in relation to the father of 
the family in Sample 2 (the 1976/1977 Flinders 
University family sample). The scores obtained 
were used for all members of that household so 
as to provide general measures of the socioeco- 
nomic status of each family. In Sample 1 education 
was coded from 1 to 4 using the following cate- 
gories: primary or elementary school education (1); 
secondary school education without matriculation 
(2); matriculated at secondary school and/or ob- 
tained trade qualifications (3); tertiary education 
(4). Income was coded from 1 to 5 using the 
following categories for reported annual income: 
less than $2,000 (1); $2,000-$3,999 (2); $4,000- 
$5,999 (3); $6,000-$7,999 (4); above $8,000 (5). 
Both education and income levels were higher in 
Sample 2 than in Sample 1, as might be expected 
from the nature of the sample and from the fact 


3 Medians of the ranks for each of the terminal 
and instrumental values and means for the total con- 
servatism scores for Sample 1 are presented for dif- 
ferent groups in Feather (1975b). Correlations be- 
tween value importance and total conservatism with 
age partialed out are presented in Feather (1977b). 
Item means for the 50 items from the C Scale for 
Sample 1 can be found in Feather (1975a, 1977a). 
Corresponding information for Sample 2 has not yet 
been published. 


Table 1 
Product-Moment Correlations (Simple rs) Relating Independent Variables to Total 
Conservatism for Samples 1 and 2 

                                      Total conservatism 
Independent variable                  Sample 1 r    Sample 2 r 

Value importance, terminal values 
  A comfortable life                   .04           -.06 
  An exciting life                    -.27***        -.31*** 
  A sense of accomplishment           -.04           -.03 
  A world at peace                    -.04            .03 
  A world of beauty                   -.16**         -.15** 
  Equality                            [...]          [...] 
  Family security                      .30***        [...] 
  Freedom                             -.28***        -.22*** 
  Happiness                            .02           -.03 
  Inner harmony                        .01           -.06 
  Mature love                         -.20***        -.18** 
  National security                   [...]          [...] 
  Pleasure                            -.16***        -.23*** 
  Salvation                            .41***         .56*** 
  Self-respect                         .13**         -.05 
  Social recognition                  -.01           -.07 
  True friendship                     [...]          -.23*** 
  Wisdom                               .08            .07 

Value importance, instrumental values 
  Ambitious                           [...]          [...] 
  Broad-minded                        [...]          [...] 
  Capable                             [...]          [...] 
  Cheerful                            [...]          [...] 
  Clean                               [...]          [...] 
  Courageous                          [...]          [...] 
  Forgiving                           [...]          [...] 
  Helpful                             [...]          [...] 
  Honest                              [...]          [...] 
  Imaginative                         [...]          [...] 
  Independent                         [...]          [...] 
  Intellectual                        [...]          [...] 
  Logical                             [...]          [...] 
  Loving                              [...]          [...] 
  Obedient                            [...]          [...] 
  Polite                              [...]          [...] 
  Responsible                         [...]          [...] 
  Self-controlled                     [...]          [...] 

Other variables 
  Age                                  .48***        [...] 
  Sex                                 [...]          [...] 
  Education                           -.11*           .01 
  Income                              -.13**          .02 

Note. Because of missing data, the Ns for the correlations in this table for Sample 1 
(metropolitan Adelaide) ranged from 558 to 575 except for education and income, where 
the Ns were 457 and 430, respectively. For Sample 2 (Flinders University), the Ns ranged 
from 357 to 358 except for education and income, where N was [...] 
* p < .05. ** p < .01. *** p < .001. 


that data were collected [...] Sample 1 had been su[...] 
fathers in Sample 2 [...] education, [...] 
for Sample 2 [...] (1), above [...] 
where it was possible to distinguish more categories 
and where the sample was more representative of 
the population at large. 
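
The Sample 1 coding scheme set out under Form of Analysis can be written compactly as below. The dictionary keys and the function name are hypothetical labels chosen for this sketch. The published categories leave an income of exactly $8,000 unassigned ("$6,000-$7,999" versus "above $8,000"), so assigning it to category 5 is an assumption of this example.

```python
# Sex and education codes as described for Sample 1 (labels are illustrative).
SEX_CODES = {"male": 1, "female": 2}
EDUCATION_CODES = {
    "primary": 1,                     # primary or elementary school
    "secondary_no_matriculation": 2,  # secondary school without matriculation
    "matriculated_or_trade": 3,       # matriculated and/or trade qualifications
    "tertiary": 4,                    # tertiary education
}

# Lower bounds of the five reported annual income categories.
INCOME_LOWER_BOUNDS = [0, 2000, 4000, 6000, 8000]


def code_income(annual_income):
    """Return the 1-5 income code for a reported annual income in dollars."""
    if annual_income < 0:
        raise ValueError("income cannot be negative")
    # The code equals the number of lower bounds the income reaches.
    return sum(1 for bound in INCOME_LOWER_BOUNDS if annual_income >= bound)
```

Since the head of household's codes were applied to every member of the dwelling, these values act as family-level rather than individual-level indicators of socioeconomic status.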


Results 


Simple Correlations of Variables 
With Conservatism 


Table 1 presents the product-moment cor- 
relations relating each of the independent 
variables to total conservatism for Sample 1 
and Sample 2. 

In both samples there were statistically 
significant positive relationships between 
total conservatism and the relative impor- 
tance assigned by respondents to the follow- 
ing values: family security, national security, 
salvation, being clean, being honest, being 
obedient, and being polite. Also, in both 
samples there were significant negative rela- 
tionships between total conservatism and the 
relative importance assigned by respondents 
to the following values: an exciting life, a 
world of beauty, equality, freedom, mature 
love, pleasure, true friendship, being broad- 
minded, being imaginative, being indepen- 
dent, and being intellectual. The significant 
relationships were remarkably consistent 
from sample to sample not only in their 
direction but also in their magnitude, and 
the majority of them were predicted in ad- 
vance.⁴ 

The correlations for both samples also 
showed that total conservatism tended to in- 
crease with age and to be higher for females 
than for males. These relationships were also 
consistent with predictions. 

Total conservatism scores were negatively 
related to the level of education and income 
of the head of household in Sample 1, sup- 
porting predictions. These relationships were 
not strong, however, even though they were 
statistically significant. Nor did they occur 
in Sample 2, probably because this sample 
was more highly selected (families of Flin- 
ders University students) and, as noted be- 
fore, less differentiated in the categories of 
education and income that were employed in 
the analysis. 

A small number of significant value/con- 
servatism correlations were not replicated 
from sample to sample. Some of these may 
involve the effects of sample differences and 
some may be chance events. They will not 
be discussed further. 


Table 2 
Multiple Correlations (Rs) and Proportion of 
Variance Explained (R²) From Multiple 
Regression Analyses Predicting Total 
Conservatism From Age, Sex, Education, 
and Income 

                                    Sample 1        Sample 2 
Independent variable                R      R²       R      R² 
Age + sex                          .52    .27      .48    .23 
Education + income                 .14    .02      .02    .00 
Age + sex + education + income     .53    .28      .48    .23 


Multiple Regression Analyses 


The relationships presented in Table 1 are 
further clarified by the results of the multiple 
regression analyses presented in Tables 2, 3, 
4, and 5.⁵ 

Table 2 shows that the combination of 
age, sex, education, and income was able 
to account for 28% of the variance in con- 
servatism scores in Sample 1 and 23% in 
Sample 2, with age and sex as the most im- 
portant predictors (especially age—see Table 
1). 
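
The proportions of variance quoted here are squared multiple correlations. A brief sketch of the usual adjustment for the number of predictors follows; the sample size of 558 used in the example is the lower bound of the Sample 1 Ns and is an assumption, since the exact N for each regression is not reported.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for k predictors and n cases.

    Uses the standard correction 1 - (1 - R^2)(n - 1)/(n - k - 1).
    """
    if n - k - 1 <= 0:
        raise ValueError("need more cases than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Example: R^2 = .445 with 10 predictors and an assumed N of 558.
example = adjusted_r2(0.445, 558, 10)
```

With several hundred cases and at most a few dozen predictors, the correction changes R² by roughly one point in the second decimal place, so adjusted and unadjusted values lead to the same conclusions.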

At the 10th step of the stepwise solution, 
the set of terminal values alone accounted 
for 45% of the variance in conservatism 
scores in Sample 1 and 53% in Sample 2; 
the instrumental values alone accounted for 
31% of this variance in Sample 1 and 44% 
in Sample 2 at the 10th step (see Table 3).⁶ 
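
The logic of a stepwise solution can be sketched from a correlation matrix alone. The version below is a simplified forward selection that adds, at each step, the predictor giving the largest R²; it is not the SPSS routine used in the study, which also applies entry and removal criteria. All names and the small correlation matrix in the usage note are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def r_squared(rxx, rxy, subset):
    """R^2 for predicting the criterion from the predictors in subset.

    rxx: predictor intercorrelations; rxy: predictor-criterion correlations.
    """
    sub = [[rxx[i][j] for j in subset] for i in subset]
    rhs = [rxy[i] for i in subset]
    betas = solve(sub, rhs)  # standardized regression coefficients
    return sum(beta * r for beta, r in zip(betas, rhs))


def forward_stepwise(rxx, rxy, max_steps):
    """Return [(predictor index, cumulative R^2), ...] for each step."""
    chosen, history = [], []
    remaining = list(range(len(rxy)))
    for _ in range(min(max_steps, len(remaining))):
        best = max(remaining,
                   key=lambda j: r_squared(rxx, rxy, chosen + [j]))
        chosen.append(best)
        remaining.remove(best)
        history.append((best, r_squared(rxx, rxy, chosen)))
    return history
```

For example, with hypothetical intercorrelations rxx = [[1, .2, .1], [.2, 1, .3], [.1, .3, 1]] and criterion correlations rxy = [.5, .4, .1], the first predictor enters with R² = .25, the square of its simple r, and the second raises R² to about .34. The same property explains why the multiple R at step 1 in the tables equals the absolute simple correlation of the first variable entered.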

The combination of terminal values with 
age, sex, education, and income accounted 
for 50% of the variance in conservatism 
scores in Sample 1 and 59% in Sample 2 


4 The Sample 1 correlations between value im- 
portance and total conservatism and between value 
importance and age presented in Tables 1 and 6 re- 
spectively are marginally different from those pre- 
sented in a related report (Feather, 1977b) because 
of very slight differences in the Ns between both 
analyses due to the fact that the present study used 
pairwise deletion of missing data. 

5 In the absence of specific hypotheses about curvi- 
linear trends and interaction effects, it was decided 
to use a linear regression equation without inter- 
action terms. Moreover, given the large number of 
independent variables for final inclusion in the equa- 
tion, this seemed to be a reasonable decision. One 
exception was the inclusion of Age X Sex as an 
interaction term in an exploratory analysis involving 
age, sex, and Age X Sex as predictors. Inclusion of 
this interaction term made virtually no difference to 
the multiple Rs for Sample 1 and Sample 2 when 
compared with the Rs obtained by using the equa- 
tion involving age and sex alone. 

6 Results in Tables 3, 4, and 5 are reported for the 
first 10 steps for ease of comparison. The step beyond 
which there was no further significant (p < .05) 
step-by-step increase in R² is noted, either in the 
table or in the text. In all of the analyses reported 
in Tables 2, 3, 4, and 5 the “adjusted” R² values 
were virtually identical to the actual R² values (Nie, 
Hull, Jenkins, Steinbrenner, & Bent, 1975, p. 358). 


Table 3 
Multiple Correlations (Rs), Proportion of Variance Explained (R²), and Standardized 
Regression Coefficients (βs) From Stepwise Multiple Regression Analyses Predicting Total 
Conservatism From Terminal and Instrumental Values 

Terminal values, Sample 1 
Step  Value                        R      R²      β 
 1    Salvation                   .41    .172    [...] 
 2    Family security             .53    .283    [...] 
 3    National security           .58    .341    [...] 
 4    Equality                    .61    .375    -.13*** 
 5    [...]                       .64    .406    -.15*** 
 6    Freedom                     .65    .418    -.08* 
 7    Self-respect                [...]  [...]   [...] 
 8    A comfortable life          .66    .432    [...] 
 9    Wisdom                      .66    .439     .10** 
10    Happiness                   .67    .445     .08* 

Terminal values, Sample 2 
Step  Value                        R      R²      β 
 1    Salvation                   .56    .315    [...] 
 2    Family security             .65    .419    [...] 
 3    National security           [...]  [...]   [...] 
 4    Equality                    .70    .497    [...] 
 5    Social recognition          .71    .509    [...] 
 6    Mature love                 .72    .516    [...] 
 7    An exciting life            .72    .521    [...] 
 8    A comfortable life          .72    .525    [...] 
 9    A sense of accomplishment   .73    .528    [...] 
10    Freedom                     .73    .530    [...] 

Instrumental values, Sample 1 
Step  Value                        R      R²      β 
 1    Broad-minded                .39    .149    -.28*** 
 2    Imaginative                 .46    .212    -.18*** 
 3    Clean                       .49    .245    [...] 
 4    Obedient                    .52    .268    [...] 
 5    Independent                 .53    .280    -.16*** 
 6    Cheerful                    .54    .293    -.11** 
 7    Loving                      .55    .297    -.08* 
 8    Logical                     .55    .301    -.07 
 9    Helpful                     .55    .304    -.05 
10    Responsible                 .55    .306     .05 

Instrumental values, Sample 2 
Step  Value                        R      R²      β 
 1    Obedient                    .45    .206    [...] 
 2    Broad-minded                .55    .301    [...] 
 3    Imaginative                 .61    .368    [...] 
 4    Independent                 .63    .397    [...] 
 5    Clean                       .64    .407    [...] 
 6    Cheerful                    .64    .416    [...] 
 7    Responsible                 .65    .424    [...] 
 8    Loving                      .66    .430    [...] 
 9    Helpful                     .66    .432    [...] 
10    Forgiving                   .66    .435    [...] 

Note. The standardized regression coefficients (βs) are those obtained at the end of 10 
steps in the analysis. An underlined R² indicates that there were no statistically 
significant (p < .05) step-by-step increases in R² after that step in the analysis. 
* p < .05. ** p < .01. *** p < .001. 


at the 10th step; the set of instrumental 
values combined with these four background 
and demographic variables accounted for 
47% of this variance in Sample 1 and 56% 
in Sample 2 at the 10th step (see Table 4). 

The combination of terminal values and 
instrumental values accounted for 48% of 
the variance in conservatism scores in Sample 
1 and 60% of this variance in Sample 2 at 
the 10th step (see Table 5). In both samples 
step-by-step changes in R² continued to be 
statistically significant (p < .05) up to the 
[...] = .71 and R² = [...] 
freedom to the regression equation, in that 
order. For Sample 2, R = [...] and R² = [...] 
at the 14th step following the addition of 
capable, responsible, a sense of accomplish- 
ment, and polite to the regression equation, 
in that order. 

Finally, the combination of all 36 values 
(terminal plus instrumental) with age, sex, 
education, and income accounted for 55% 
of the variance in conservatism scores in 
Sample 1 and 67% of this variance in Sample 
2 at the 10th step (see Table 5). In Sample 
1, step-by-step changes in R² continued to 
be statistically significant (p < .05) up to 
the 14th step, where R = .76 and R² = .570, 
following the addition of a world of beauty, 
responsible, independent, and education to 
the regression equation, in that order. In 
Sample 2, statistically significant step-by- 
step increases in R² stopped at the 9th step 
(see Table 5). 

The predictor variables that emerged in 
the early steps of the analyses reported in 
Tables 3, 4, and 5 were very similar across 
the two samples, and they tended to occur in 
much the same order in both samples. For 
example, salvation, family security, national 
security, and equality emerged in that order 
for both Sample 1 and Sample 2 in the first 
four steps of the solution for the terminal 
values presented in Table 3; age, broad- 
minded, and obedient emerged in the first 
three steps of the second set of analyses pre- 
sented in Table 4 for both Sample 1 and 
Sample 2; and age, salvation, broad-minded, 
and obedient occupied the first four steps of 
the second set of analyses reported in Table 
5 for both samples, The solutions presented 
in Tables 3, 4, and 5 therefore appear quite 


Table 4 
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robust—there was a high level of replica- 
tion from sample to sample despite the fact 
that different populations were involved and 
the samples were taken 5 years apart. 

Finally, the terminal values were slightly 
better predictors of conservatism than the in- 
strumental values (see Table 3) and predic- 
tion throughout was slightly better for 
Sample 2 than for Sample 1 (see Tables 3, 
4, and 5). 


Value Importance and Age 


In view of the fact that age emerged from 
the multiple regression analyses as an im- 
portant predictor of conservatism, it is also 
of interest to inquire about the degree to 
which value importance was a function of 


Multiple Correlations (Rs), Proportion of Variance Explained (R*), and Standardized Regression 
Coefficients (Bs) From Stepwise Multiple Regression Analyses Predicting Total Conservatism 


From Values, Age, Sex, Education, and Income 


Sample 1 Sample 2 
Step Variable R IR 8 Step Variable ROR 6 
Terminal values + age + sex + education + income 

1 Age .48 .230 n. 1 Salvation -56 .315 E Ns 
2 Salvation 60" IST E RET 2 Age :70 .490 Ki baad 
3 National security .63 .401 Th sgad 3 National security .73  .527 18h 
4 Equality 66 .429 —.18*** 4 Equality 74 544 ENE 

5 Mature love 67 453 —.17*** 5 Mature love 75.565 —.09' 

6 Family security 69 .473 4 Ung 6 Family security .76 .574 SE ni 
7 Sex -70 486 =. 12*** 7 Social recognition .76 .580 10* 

8 Inner harmony -70 .492 —.09* 8 Happiness _ 76 .584 05 

9 A world of beauty .71 .497 —.08* 9 A comfortable life .77 .586  .08 
10 Freedom .71 .502 —.08* 10 Sex Er y baleen 5 We CRUEL 

Instrumental values + age + sex + education + income 

1 Age 48.230 .40*** 1 Age BG A218") Lees 
2 Broad-minded 59.343. —.21*** 2 Obedient 65 .423 a 
3 Obedient 63 402. .21*** 3 Broad-minded .70 .489 278 
4 Sex 65: AID aae 4 Independent TL SIL —.19%** 
5 Ambitious 66 .440 .14*** 5 Imaginative .73 051 ~ 17 
6 Clean .67 451 Ab bod 6 Income 73.539 ane 

7 Independent .68 .458 —.09* 7 Sex f BH ue 

8 Education 68 464 —.09* 8 Loving jae 38 w 

2 R ibl 68 .469  .09* 9 Cheerful T4. -. 

10 Come 47307 10 Responsible 15, 587 —.06 


Courageous 69 
MUN igi ibis bs. Vater E EE A S 


Note. The standardized regression coefficients (βs) are those obtained at the end of 10 steps in the analysis.
An underlined R² indicates that there were no statistically significant (p < .05) step-by-step increases in R²
after that step in the analysis.

* p < .05. ** p < .01. *** p < .001.


Table 5
Multiple Correlations (Rs), Proportion of Variance Explained (R²), and Standardized Regression
Coefficients (βs) for Stepwise Regression Analyses Predicting Total Conservatism
From All 36 Values and From All 36 Values, Age, Sex, Education, and Income


                     Sample 1                                              Sample 2
Step  Variable           R            R²           β               Step  Variable           R            R²           β

Terminal values + instrumental values

 1  Salvation            .41          .172         [illegible]       1  Salvation            .56          .315         [illegible]
 2  [illegible]          [illegible]  [illegible]  [illegible]       2  Family security      .65          .419         .13
 3  National security    .58          .341         [illegible]       3  Obedient             .69          [illegible]  [illegible]
 4  Broad-minded         .62          .384         -.19***           4  National security    .72          .517         [illegible]
 5  Equality             .64          .413         -.17***           5  Equality             .73          .540         [illegible]
 6  Mature love          .66          .442         -.16***           6  Broad-minded         .75          .556         [illegible]
 7  Clean                [illegible]  [illegible]  [illegible]       7  Mature love          .76          .571         [illegible]
 8  Self-respect         .68          .467         .09**             8  Imaginative          .76          .581         [illegible]
 9  Independent          .69          .476         -.11**            9  True friendship      .77          .588         [illegible]
10  A world of beauty    .70          .484         -.10**           10  Independent          .77          [illegible]  [illegible]

Terminal values + instrumental values + age + sex + income + education

 1  Age                  .48          .230         .35***            1  Salvation            .56          .315         [illegible]
 2  Salvation            [illegible]  [illegible]  [illegible]       2  Age                  .70          .490         [illegible]
 3  Broad-minded         .65          .419         -.15***           3  Obedient             .76          .574         [illegible]
 4  Obedient             .67          [illegible]  [illegible]       4  Broad-minded         .77          .600         [illegible]
 5  National security    .69          .479         [illegible]       5  Mature love          .79          .625         [illegible]
 6  Equality             .71          .502         -.16***           6  Equality             .80          .644         [illegible]
 7  Mature love          .72          .518         -.11**            7  National security    .81          .654         [illegible]
 8  Sex                  .73          .529         [illegible]       8  Independent          .81          .661         [illegible]
 9  Ambitious            .74          .541         [illegible]       9  Imaginative          .82          .666         [illegible]
10  Clean                .74          .550         .10**            10  Helpful              .82          .670         [illegible]


Note. The standardized regression coefficients (βs) are those obtained at the end of 10 steps in the analysis.
An underlined R² indicates that there were no statistically significant (p < .05) increases in R² after that
step in the analysis.
* p < .05. ** p < .01. *** p < .001.


age in both samples, Table 6 reports the
product-moment correlations of value im-
portance with age for each of the terminal
and instrumental values. (Correlations show-
ing the positive relationships between age
and conservatism are contained in Table 1.)

Table 6 indicates that in both samples
there were statistically significant positive
relationships between age and the relative
importance assigned [...] following values: [...]
[...]ity, freedom, social recognition, true friend-
ship, being broad-minded, and being imagina-
tive. These values tended to decrease in im-
portance with increasing age.

Thus, Table 6 shows that there were sev-
eral value/age relationships that were con-
sistent from sample to sample. These changes
in value importance with age are of interest 
in their own right but they will not be dis- 
cussed in detail—see Feather (1975b, chap. 
6) and Feather (in press) for relevant dis-
cussion of age differences in values.

There were a small number of significant 
value/age correlations that were not repli- 
cated from sample to sample. Most of these 
significant relationships occurred in Sample
1, and they may reflect the fact that this
sample came from a wider and more hetero-
geneous population than Sample 2. Some of
the nonreplicated effects may also be chance
events.


Discussion 


The results of the present study are clear
and consistent. As predicted, conservatism
was positively related to the relative impor-
tance of values that could be taken to in-
volve ego defense and attachment to author-
ity: values that were concerned with salva-
tion, security, cleanliness, and rule following.
The higher the conservatism score, the more
likely it was that these values would be
emphasized. Also as predicted, conservatism
was negatively related to the relative impor-
tance of values that referred to equality,
freedom, love, and pleasure, as well as open-
minded, intellectual, and imaginative ways of
thinking. The higher the conservatism score,
the less likely it was that these values would
be emphasized. Conservatism was also asso-
ciated with age and sex in the direction pre-
dicted. Thus, older people tended to be more
conservative than younger people, and fe-
males were slightly more conservative than
males. Finally, the relative importance of
some of the values varied systematically with
differences in age. All of these results were
replicated from sample to sample and they
therefore stand as reliable findings. More-
over, the replication indicated that correla-
tions were not only in the same direction in
both samples but were also similar in their
orders of magnitude.⁷

The results also showed that conservatism
was negatively related to the education and
income levels of heads of households in
Sample 1, though the correlations were rela-
tively small. Higher conservatism tended to
be associated with lower levels of education
and income. However, conservatism was not
significantly related to these socioeconomic
measures in Sample 2, probably for the rea-
sons presented earlier.


7 Those with high conservatism scores were also
more likely to downgrade values concerned with
beauty, excitement, and independence, perhaps again
reflecting a preference for convention, tradition,
structure, and stability as opposed to novelty, inde-
pendence, complexity, and change.


Table 6
Product-Moment Correlations (Simple rs) Relating Value Importance to Age
for Samples 1 and 2

                                    Age                                                Age
                             Sample 1     Sample 2                             Sample 1     Sample 2
Value importance                r            r          Value importance          r            r

Terminal values                                         Instrumental values

A comfortable life            .05          .05          Ambitious               -.09*        [illegible]
An exciting life             -.29***      -.29***       Broad-minded            -.11*        [illegible]
A sense of accomplishment     .07          .09          Capable                  .09*         .12
A world at peace             -.08          .01          Cheerful                [illegible]  [illegible]
A world of beauty            -.10*        -.10          Clean                   [illegible]  [illegible]
Equality                     -.20***      -.16**        Courageous              [illegible]   .16
Family security              [illegible]  [illegible]   Forgiving               -.01         -.02
Freedom                      -.23***      -.13*         Helpful                  .07         -.00
Happiness                     .07         -.10          Honest                  [illegible]  [illegible]
Inner harmony                 .19***      -.03          Imaginative             [illegible]  -.27
Mature love                  -.07          .03          Independent             -.10         [illegible]
National security            [illegible]   .31***       Intellectual            -.04         -.08
Pleasure                     [illegible]   .10          Logical                 [illegible]   .02
Salvation                    [illegible]   .09          Loving                  -.10*        [illegible]
Self-respect                  .23***       .19**        Obedient                [illegible]  [illegible]
Social recognition           -.12**       -.14*         Polite                  [illegible]  [illegible]
True friendship              -.18***      -.34***       Responsible              .13         [illegible]
Wisdom                        .09*        -.01          Self-controlled         [illegible]  [illegible]

Note. Because of missing data, the Ns for the correlations in this table for Sample 1 (metropolitan Adelaide)
ranged from 562 to 574. For Sample 2 (Flinders University), the Ns ranged from 357 to 358.

* p < .05. ** p < .01. *** p < .001.

While the present study is essentially de- 
scriptive rather than based upon a causal 
model, the results do advance our knowledge 
about general conservatism. In the first place, 
they indicate the values that relate to this 
attitude syndrome and thereby provide a 
reliable set of findings to be explained. One 
can agree with Rokeach (1973) that there is 
a compelling need to assemble descriptive 
information about attitude-value relation- 
ships while at the same time seeking explana- 
tions about how these relationships came 
about. These explanations are likely to be 
complex rather than simple, having regard 
not only to the impact of different social in- 
stitutions and environments but also to the 
active, constructive way in which information 
is processed by individuals according to the 
structure and dynamics of personality. 

Second, the results of the regression anal-
yses do have some implications about likely
causal mediators. The fact that certain values
emerged as strong predictors of conservatism
even when the background and demographic
variables were included in the regression
equation (see Tables 4 and 5) suggests that
the obtained relationships between value pri-
orities and conservatism were mediated by
more than social learning and probably also
involved the operation of internal personal-
ity dynamics that helped shape a system of
relatively consistent attitudes and values.
Thus, these results can be understood in
terms of both the cognitive learning and
the psychodynamic explanations of attitude
syndromes such as authoritarianism and con-
servatism ([...] Elms, 1976; Greenstein, [...]).
These explanations [...]
and the operation of complex ego mecha-
nisms that are in the service of the self.

That both kinds of explanation are neces- 
sary is especially suggested by the results 
presented in Tables 4 and 5, where the best 
set of predictors of general conservatism in- 
volved a combination of values and age, with 
sex and socioeconomic background entering 
much further down the list. Age is a variable 
that acquires meaning only in terms of its 
correlates. Among these correlates are the 
experiences—some unique, some shared with 
others—that would determine changes in 
both value priorities and conservatism over 
time. Even with age included as a variable
in the regression equation, however, certain 
values remained as important predictors of 
conservatism, and these values, involving a 
preference for authority, rules, safety, and 
religion and a downgrading of equality, love, 
and open-minded modes of thought, are con- 
sistent with past discussions of conservative 
and authoritarian attitudes and ideologies 
(e.g., Adorno et al., 1950; Elms, 1976;
Greenstein, 1969; Kirscht & Dillehay, 1967; 
Rokeach, 1960; Sanford, 1973; Tomkins, 
1963; Wilson, 1973). In view of the fact that 
these values emerged as important predictors 
after age and the other background variables 
were controlled, it is reasonable to assume 
that personality dynamics relating to impulse 
satisfaction and control and their effects on 
cognition also played an important role in 
determining the nature of the value/con-
servatism relationships.

Given the relationships between values and 
conservatism described in the present study, 
longitudinal investigations that enable one 
to separate out cause and effect are needed 
in the future. It is nearly 30 years since The
Authoritarian Personality was published 
(Adorno et al., 1950). One can now improve 
upon the procedures used in that seminal 
research in many ways, not least by employ- 
ing methodologies that have emerged from 
life span developmental psychology and go 
beyond simple cross-sectional and single- 
cohort longitudinal studies in an attempt to 
discover the complex influences involved 
(Feather, 1975b, pp. 143-146). These new 
procedures are costly and time consuming,
but they are essential if one is to disentangle
the factors determining change and continu- 
ity in values and attitudes over time. Using 
these improved methodologies, one could de- 
sign developmental studies that explore the 
basis for relationships between values and 
conservative attitudes across the life span 
and enable the formulation of more sophisti-
cated theoretical analyses.⁸

Not enough is known about the conditions 
under which general values emerge and how 
they come to be related to specific attitudes. 
It is likely that the sequence of events is 
complex and that attitudes tied to specific 
objects and issues develop first, followed by 
the emergence of general values as abstrac- 
tions from the welter of a person’s experience 
as it reflects the attitudes and actions both of 
self and of significant others (see Adelson, 
1975; Feather, 1978, in press). Once formed, 
the dominant values may then inform a per- 
son regarding the correct attitudes toward 
new objects and issues. The hierarchy of 
values would itself change over the life span, 
some values increasing in relative impor- 
tance and others decreasing. And new be- 
liefs and attitudes would also be acquired, 
concomitant with these general changes and 
based upon similar causes. This new learning, 
along with the modifications in the value 
hierarchy, would not only reflect the in- 
evitable changes that occur across the life 
span as people grow older, take on new roles 
and responsibilities, and develop new needs 
and priorities as they cope with a changing 
biology and the altered circumstances of their 
lives; they would also be influenced by gen- 
eral social trends and unique historical events 
(Feather, 1975b; 1977a; in press). With 
change in the hierarchy of a person’s values 
and the formation of new beliefs and atti- 
tudes, one would also expect to find altera- 
tion in some of the more specific orientations 
laid down in the past, so that the total set 
of beliefs, attitudes, and values forms a rela-
tively consistent and interrelated system in
general harmony with a person’s overall self-
concept. All of these matters are of obvious
relevance to clarifying the basis of the rela-
tionships between values and conservatism
and deserve detailed investigation in the
future.

The results of the present study indicate 
which values such studies might focus upon. 
One can agree with Rokeach (1973) that 
freedom and equality are two terminal values 
of special significance in the analysis of dif-
ferences in political conservatism. Indeed,
for a number of reasons, Rokeach (1973, pp. 
165-170) doubts the usefulness of employ- 
ing a single continuum of liberalism-con- 
servatism, preferring instead to refer to vari- 
ations in political ideologies that can be 
defined by differences in the relative impor- 
tance assigned to the two terminal values 
freedom and equality. In terms of his two- 
value model of political ideology (Rokeach, 
1973, p. 170), high scorers on the C Scale 
could be seen as more likely to demonstrate 
a “profascist” ideology (freedom low, equal- 
ity low) and low scorers a “prosocialist” 
ideology (freedom high, equality high), given 
the fact that total conservatism scores were 
negatively correlated with the relative im- 
portance assigned to both freedom and equal- 
ity on the Value Survey (see Table 1). The 
present results show, however, that several 
other values were associated with general 
conservatism as defined by the C Scale, and 
these deserve future investigation also, along 
with further clarification of the concept of 
conservatism and the extent to which it is 
unitary or multidimensional. 
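
The two cells of the two-value model that are named in the paragraph above can be written out as a small lookup. The sketch below is illustrative only: the function name and the boolean encoding of "high" versus "low" relative importance are assumptions introduced here, and the two mixed cells are left unlabeled because the passage does not name them.

```python
def two_value_ideology(freedom_high: bool, equality_high: bool) -> str:
    """Return the ideology label for one cell of the two-value model.

    Only the two cells named in the text are labeled; the encoding of
    "high" and "low" as booleans is an assumption made for illustration.
    """
    if freedom_high and equality_high:
        return "prosocialist"   # freedom high, equality high
    if not freedom_high and not equality_high:
        return "profascist"     # freedom low, equality low
    return "not labeled in the text"


# Conservatism correlated negatively with both values, so high C Scale
# scorers lean toward the low/low cell and low scorers toward high/high.
print(two_value_ideology(False, False))
print(two_value_ideology(True, True))
```

The lookup restates the argument rather than testing it: because total conservatism was negatively correlated with both freedom and equality, the low/low and high/high cells are the ones the C Scale separates.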


8 For an excellent recent example in which these
new methodologies have been applied to the study
of political orientations, see Jennings and Niemi
(1975).


In a conceptual replication and extension of a study by Bem and Lenney, male
(n = 90) and female (n = 118) college students rated their comfort in and
preference for performing several series of masculine, feminine, and neutral
activities. Correlations between these ratings and scores on the masculinity
(instrumentality) and femininity (expressiveness) scales of the Personal At-
tributes Questionnaire (PAQ) of Spence and Helmreich tended to be theoret-
ically reasonable in sign but in each sex were low in magnitude and only
occasionally significant. Classification of subjects into four PAQ groups (an-
drogynous, masculine, feminine, and undifferentiated) on their joint masculinity
and femininity scores revealed that androgynous and masculine subjects of
both sexes had higher comfort ratings, independent of type of task, than did
feminine and undifferentiated subjects, suggesting the importance of instrumen-
tality and expressiveness per se. For forced-choice preference ratings, significant
differences were found only in males, masculine subjects having a stronger
[...]asks than those in other categorical groups. The PAQ
[...] of the variance. The
data support the Spence-Helmreich hypothesis [...] similar
instruments are largely measures of [...]
traits rather than sex roles and [...]
minimally related to many sex role behaviors.


Interest in the concept of psychological 
androgyny has generated an increasing vol- 
ume of empirical research, much of it using 
the Bem Sex Role Inventory (BSRI: Bem, 
1974, 1977) or the Personal Attributes Ques- 
tionnaire (PAQ: Spence & Helmreich, 1978; 
Spence, Helmreich, & Stapp, 1974, 1975) as 
the operational measure of the construct. 

Each of these instruments contains separate
masculinity and femininity scales that are
essentially orthogonal. The androgynous in- 
dividual, according to current usage, is defined 
as one who scores relatively high on both the 
masculinity (M) and femininity (F) scales 
of these self-report instruments. 

While the developers of the BSRI and the 
PAQ share certain theoretical assumptions 
that have guided both their empirical inves-
tigations and their interpretation of the re-
sults, their views differ in other critical
respects. These differences, in turn, lead to
predictions about the behavior of “androgy- 
nous” individuals that do not always coincide. 
The present study, using the PAQ, involved 
conditions in which the implications of the 
two theories led to different expectations. In
order to explain its rationale, it is first neces- 
sary to describe the empirical properties of 
both the PAQ and the BSRI. 

The PAQ M scale consists of a series of 
descriptions of traits, each stereotypically 
more characteristic of men than women but 
socially desirable to some degree in both 
sexes. Similarly, the F scale consists of de-
scriptions of socially desirable traits more
characteristic of women. In content, the trait 
clusters reflect what Parsons and Bales 
(1955) have described as instrumental (mas- 
culine) and expressive (feminine) character- 
istics. Factor analyses of the PAQ reproduce 
the two empirically derived scales, thus add- 
ing further justification to their separation 
(Spence & Helmreich, 1979; Helmreich, 
Spence, & Wilhelm, in press). The scales have 
been demonstrated to discriminate between 
the sexes in diverse populations, varying 
widely in age, ethnicity, and social class (e.g.,
Spence & Helmreich, 1978), thus justifying
use of the labels masculinity and femininity.

In sum, the PAQ is a quite conventional 
self-report measure of socioaffective instru-
mental and expressive traits, the items making
[...]
These trait dimensions, according to our
theoretical conception, are internally located
response predispositions that combine with
situational variables and other person vari-
ables to determine behavior but are not them-
selves identical to behavior.

Of the evidence that has been collected 
to support the construct validity of the PAQ 
scales as measures of instrumentality and ex- 
pressiveness, a recent experiment by Klein 
(Note 1) is particularly instructive. Klein 
observed the dominance behavior of female 
college students in three-member problem- 
solving groups, two members of which were 
the experimenter’s confederates. Subjects’ 
scores on the PAQ M scale were positively 
correlated with degree of dominance behavior, 
whereas their scores on the F scale showed a 
slight negative correlation with this behavior. 
As might be expected from these correlations, 
masculine women (high in M, low in F) were 
most dominant, followed in order by women 
who were androgynous (high M, high F), 
feminine (low M, high F), and undifferen- 
tiated (low M, low F). This order was found 
both when the two confederates were female 
and when they were male. In all PAQ groups,
however, level of dominance was markedly 
(and equally) reduced when the two con- 
federates were male, a condition containing 
implicit role-related constraints on women’s 
undisguised expression of this type of be- 
havior. The results thus show that situational 
factors (presence or absence of sex role ex-
pectations) and personality traits (M and F)
independently contributed to dominance be- 
havior. 
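
The four PAQ categories used in this paragraph (masculine, androgynous, feminine, undifferentiated) follow from crossing high versus low M with high versus low F. The sketch below shows one way to make that assignment. It assumes a median split on each scale, which the passage itself does not specify, and it treats scores tied with the median as "low"; the function names and the example scores are hypothetical.

```python
from statistics import median


def classify_paq(m_score, f_score, m_median, f_median):
    """Place one subject in a PAQ category from M and F scale scores."""
    high_m = m_score > m_median   # assumption: ties with the median count as "low"
    high_f = f_score > f_median
    if high_m and high_f:
        return "androgynous"       # high M, high F
    if high_m:
        return "masculine"         # high M, low F
    if high_f:
        return "feminine"          # low M, high F
    return "undifferentiated"      # low M, low F


def classify_sample(scores):
    """Classify every (M, F) pair against the sample's own medians."""
    m_median = median(m for m, _ in scores)
    f_median = median(f for _, f in scores)
    return [classify_paq(m, f, m_median, f_median) for m, f in scores]


# Hypothetical scores, not data from the study.
example = [(28, 27), (27, 15), (14, 26), (12, 13)]
print(classify_sample(example))
```

Because the cut points are taken from the sample being classified, the same pair of scores can fall in different categories in different samples. That is a property of any split of this kind, not of the PAQ specifically.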

Like the PAQ, the BSRI consists of general 
trait descriptions, many of the M items being 
socially desirable instrumental traits and
many of the F items being socially desirable 
expressive traits that are similar or identical 
to PAQ items on the parallel scales. However, 
not all items on the BSRI can be described 
as instrumental or expressive (e.g., “mascu- 
line” and “feminine”) or (on the F scale) as 
socially desirable, and unlike the PAQ, recent
factor analytic studies (e.g., Pedhazur &
Tetenbaum, 1979) indicate a four- rather
than a two-factor structure. Nonetheless, em- 
pirical evidence supporting the construct 
validity of the BSRI as (primarily) a measure 
of instrumentality and expressiveness has 
been found in behavioral studies by Bem and
her colleagues (Bem, 1975, 1977; Bem,
Martyna, & Watson, 1976).

The somewhat mixed content of the BSRI, 
the identification of the instrument as a sex 
role inventory, and Bem’s theoretical discus- 
sions combine to suggest that she posits the 
existence of global constructs of masculinity 
and femininity that are manifested in a 
variety of gender-related behaviors, role atti- 
tudes, and personal qualities collectively iden- 
tified as sex roles. She further appears to as- 
sume that, despite their limited empirical 
content, the BSRI and PAQ M and F scales 
are (in principle) valid measures of these 
global constructs. Thus, according to her con-
ceptualization (e.g., Bem, 1977; Bem & Len-
ney, 1976), androgynous individuals, high in
both M and F scores, are “behaviorally flex- 
ible” with respect to all manner of gender- 
related phenomena. As such, they are willing 
or able to exhibit masculine behaviors, femin- 
ine behaviors, or both, as situationally ap- 
propriate. Individuals with sex-typed per- 
sonalities, on the other hand, will tend to 
avoid or exhibit lower levels of cross-sex-typed 
behaviors. 

This theoretical position rests on the sup- 
position that the empirically diverse indi- 
cators of masculinity and of femininity are 
all highly correlated, so that an individual 
exhibiting one set of attributes or behaviors 
in the class can reasonably be assumed to 
exhibit approximately the same degree of all 
other attributes and behaviors. If the subsets 
of masculine and feminine behaviors, atti- 
tudes, and personality characteristics that dif- 
ferentiate the sexes do covary within indi- 
viduals, then it follows that it is useful to 
postulate global unidimensional concepts, and
further, that it is justifiable to use instru- 
ments such as the PAQ or BSRI that tap a 
limited range of gender-related traits to make 
inferences about other kinds of masculine and 
feminine qualities. 

Spence and Helmreich (1978) have not 
denied the existence of associations between 
various classes of masculine and feminine at-
tributes and behaviors but have proposed that
they often are weak and/or complexly deter-
mined. More particularly, they hypothesize
that socially desirable masculine and feminine
(i.e., instrumental and expressive) personality
traits are not likely to be strongly related to 
sex role preferences or behaviors unless these 
behaviors call quite directly for instrumental 
or expressive skills. They thus suggest that 
personality instruments such as the PAQ and 
the BSRI cannot legitimately be substituted 
for more face valid measures of sex role atti- 
tudes and preferences, and that findings ob- 
tained with these personality instruments will 
not necessarily parallel those with sex role 
measures. For example, theoretically reason- 
able but very weak relationships have been 
found between the PAQ and BSRI M and F 
scales and the Attitudes Toward Women Scale 
(AWS: Spence & Helmreich, 1973), a mea- 
sure of beliefs regarding appropriate roles of 
women vis-a-vis men (Orlofsky, Astin, & 
Ginsburg, 1977; Spence & Helmreich, in 
press). The values, however, are low and 
often nonsignificant even in very large sam- 
ples. Similarly, the study by Klein (Note 1) 
cited earlier suggested that sex-typed (fem- 
inine) women, as measured by the PAQ, were 
no more responsive to the introduction of sex 
role demands into group interactions than 
were women showing other personality pat- 
terns. 

In an experiment that went beyond the 
investigation of instrumental and expressive 
behaviors, Bem and Lenney (1976) tested 
the behavioral flexibility hypothesis by exam- 
ining the relationship between masculine and 
feminine traits, as measured by the BSRI, 
and reactions to gender-related activities. 
Male and female subjects were asked to dem- 
onstrate a series of simple tasks (e.g., pound- 
ing a nail in a board) that were stereotyp- 
ically masculine or feminine and then to 
indicate their degree of comfort in perform- 
ing them. Prior to this portion of the study, 
they were given a description of a series of 
pairs of tasks, each pair consisting of either 
two neutral tasks or a masculine and a fem- 
inine task, and were asked to indicate which 
they preferred to demonstrate. Three groups 
of male and female subjects, designated as 
masculine, feminine, and androgynous, 
were preselected to serve in the experiment, 
assignment of subject to group being based 
on Bem’s (1974) original subtractive (F
minus M) method of scoring the BSRI. Sub-
jects were chosen from the center and the two
extremes of the difference score distribution. 
The results partially supported the behavioral 
flexibility hypothesis, heavily sex-typed indi- 
viduals (masculine men, feminine women) 
tending to show more avoidance of cross-
typed tasks than did the other groups.
Androgynous (balanced) subjects, how- 
ever, did not differ significantly from cross- 
typed subjects. 
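
The subtractive (F minus M) scoring described above can be sketched as follows. This is a minimal illustration under stated assumptions: the cutoffs `extreme` and `center` are hypothetical values chosen for the example, not the thresholds Bem and Lenney used, and the function name is introduced here.

```python
def subtractive_group(m_score, f_score, extreme=1.0, center=0.5):
    """Assign a group from the F minus M difference score.

    `extreme` and `center` are hypothetical cutoffs used only for illustration.
    Returns None for subjects who fall between the center and the extremes,
    mirroring selection from the center and the two tails of the distribution.
    """
    diff = f_score - m_score
    if diff >= extreme:
        return "feminine"
    if diff <= -extreme:
        return "masculine"
    if abs(diff) <= center:
        return "androgynous"   # "balanced": M and F roughly equal
    return None


# Hypothetical scale means, not data from either study.
print(subtractive_group(5.0, 5.1))   # high M, high F
print(subtractive_group(2.0, 2.1))   # low M, low F
```

Both example subjects are labeled androgynous, although one is high on both scales and the other low on both. A difference score carries no information about absolute level, only about the balance between M and F.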
The present study is a conceptual replica-
tion and extension of the Bem and Lenney
study, using the PAQ rather than the BSRI. 
Male and female students, pretested on the 
PAQ and later given the Attitudes Toward 
Women Scale (Spence & Helmreich, 1973), 
were told that they would be asked to demon-
strate several activities from a larger group
and could indicate their preference before- 
hand. They were given two lists, each con- 
taining stereotypically masculine, feminine, 
and sex-neutral activities. The first list con- 
sisted of simple, everyday tasks similar to 
those used by Bem and Lenney (1976). The 
second list contained descriptions of TV com-
mercials or public service announcements, half
[...] neutral and half [...]

In deriving hypotheses about the r[esults]
from Bem’s work, several aspects of her
theory and the Bem and Lenney study should
first be noted. First, in response to data
(Spence et al., 1975) indicating that a scor-
ing scheme taking into account absolute level
of M and F scores has greater utility than 
does a subtractive method, Bem (1977) sub- 
sequently revised her definition of androgyny 
to refer only to those relatively high in M and
F. She also adopted the Spence et al. designa- 
tion of undifferentiated to refer to those rela- 
tively low in M and F. Second, in discussing 
evidence bearing on the hypothesis that 
androgynous (high M, high F) individuals 
are more “behaviorally flexible” than others, 
Bem and others have seemed to treat the
type of sex role behavior studied by Bem and
Lenney as having equal standing with be-
haviors demanding the expression of instru-
mental or expressive qualities. That is, re-
sponses to the types of gender-related tasks
employed by Bem and Lenney are implicitly 
regarded as being in the same class as instru- 
mental or expressive behaviors and are there- 
fore implied to be related to the BSRI in the 
same manner and with the same strength as
instrumental and expressive behaviors. There
is both an internal contradiction and incom- 
pleteness in her theory, however. Although 
the experimental evidence (e.g., Bem et al., 
1976, as reanalyzed in Bem, 1977) indicates 
that androgynous and undifferentiated indi- 
viduals are not identical in behaviors designed 
to reflect instrumental or expressive capacities, 
Bem (1977) has stated that she would expect 
no differences between those groups on the 
types of tasks employed by Bem and Lenney. 
(Presumably undifferentiated individuals, be-
ing neither strongly masculine nor strongly
feminine in their “sex-role identification,” 
would show little differential reaction to sex- 
linked tasks.) Thus, on this type of task, 
undifferentiated and androgynous groups are 
both expected to be more “behaviorally flex-
ible” than sex-typed individuals. More spe-
cifically, her theory predicts that androgynous
and undifferentiated groups, male and female,
will not differ from each other but that both
categorical groups will show less marked pref-
erence for same-sex tasks and will express
more comfort about performing opposite-sex
tasks than will sex-typed individuals (mascu-
line men, feminine women). Bem is unclear
about what to expect in cross-typed individ-
uals (feminine men, masculine women). It
would seem to follow from an extension of her
logic, however, that these groups would also
be less flexible than undifferentiated and
androgynous groups, showing the mirror im-
age in their preferences and comfort ratings
of same- and opposite-sex tasks as masculine
men and feminine women. The implications
of these hypotheses for both comfort and
preference ratings are shown graphically at
the left of Figure 1.


[Figure 1. Panels: Bem’s Theoretical Predictions (left) and Obtained Results. Recoverable labels: Preference for Sex-typed Tasks; Cross-typed Tasks; Comfort (Tasks Combined); PAQ Category.]

Figure 1. Predicted mean sex-typed preferences and comfort ratings (Bem’s hypotheses) in undif-
ferentiated, androgynous, sex-typed, and cross-typed subjects and obtained means for males and
females in the groups. (U = undifferentiated; A = androgynous; ST = sex-typed [masculine male,
feminine female]; CT = cross-typed [feminine male, masculine female]. The predicted preference
and comfort ratings show the expected order of means. They do not reflect predicted magnitudes
of effects. PAQ = Personal Attributes Questionnaire. Masc. = masculine. Fem. = feminine.)


These hypotheses, which also imply fairly
substantial relationships between the PAQ
scales and preferences for and comfort in per-
forming sex-typed tasks, are to be contrasted
with the expectations of the present investiga- 
tors. We anticipated that while subjects’ 
gender would be a major determinant of their 
reactions to sex-typed tasks (a point on which 
Bem is silent), the relationships within each 
sex between M and F and the criterion mea- 
sures would be minimal and/or of a different 
form than predicted by Bem. More spe- 
cifically, we expected that M scores, and to a 
lesser degree F scores, would be weakly but 
positively related to subjects’ ratings of their 


1636 


anticipated comfort in performing the tasks 
independent of subjects’ sex and the category 
of the task. We further expected that M and 
F would combine additively to determine com- 
fort ratings so that on all three types of tasks 
—neutral, masculine, and feminine—androgy- 
nous individuals would tend to have the 
highest comfort ratings, followed by mascu- 
line, feminine, and undifferentiated individ- 
uals. We were led to these expectations by 
the results of prior research indicating that 
in both sexes instrumentality (M scores) has 
a substantial relationship and expressiveness 
(F scores) a somewhat weaker relationship 
with such variables as self-esteem, social 
competence and absence of neurotic tendencies 
(Spence & Helmreich, 1978; Spence, Helm- 
reich, & Holahan, 1979). Differences among 
the categorical groups in their ratings of 
anticipated comfort in performing publicly 
would come about, we reasoned, because of 
their differences in self-esteem and self-con- 
fidence. 
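
The predicted ordering of the four groups follows directly from the additive assumption stated above. The sketch below makes the arithmetic explicit. The weights are hypothetical; the argument requires only that the M weight exceed the F weight and that both be positive, and the names used here are introduced for illustration.

```python
# Illustrative weights only: the text implies w_m > w_f > 0, not these values.
W_M, W_F = 2.0, 1.0

# Each group is coded as (high on M, high on F).
GROUPS = {
    "androgynous": (1, 1),
    "masculine": (1, 0),
    "feminine": (0, 1),
    "undifferentiated": (0, 0),
}


def expected_comfort(high_m, high_f, w_m=W_M, w_f=W_F):
    """Additive model: comfort rises with M and, more weakly, with F."""
    return w_m * high_m + w_f * high_f


def comfort_ranking(w_m=W_M, w_f=W_F):
    """Order the four groups from highest to lowest expected comfort."""
    return sorted(
        GROUPS,
        key=lambda g: expected_comfort(*GROUPS[g], w_m, w_f),
        reverse=True,
    )


print(comfort_ranking())
```

Any pair of weights with the M weight larger than the F weight and both positive yields the same order: androgynous, masculine, feminine, undifferentiated. The model contains no term for the sex-typing of the task, which is why the same ordering is expected on neutral, masculine, and feminine tasks alike.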

In the case of the preference measure, the 
difference between our expectations and Bem’s 
is less clear-cut. As was stated earlier, our 
general theoretical stance has been that the 
masculine and feminine personality traits 
measured by the PAQ and to a large extent 
the BSRI tend, at best, to be only weakly 
related to masculine and feminine behaviors 
and characteristics that do not directly im-
plicate instrumentality and expressiveness. On
a forced-choice preference measure, it seemed
possible that some correlation might be found
between M and F scores and preference
scores, M being related positively and F nega-
tively to preference for stereotypically mascu-
line over stereotypically feminine tasks. These
relationships were expected to be somewhat 
stronger in males than in females because of 
the greater pressures males experience for sex 
role conformity. Even in the case of males,
however, we anticipated that the relationships
would be weak, PAQ scores accounting for a
small portion of the variance within each sex.
We expected a stronger relationship, however,
between preference rating and scores on our
sex role attitudes measure (AWS), again
especially in males.

Method


Subjects 


The subjects were 90 male and 118 female students 
from introductory psychology courses at the Uni- 
versity of Texas at Austin who participated to fulfill 
a course requirement. 


Assessment Measures 


Personal Attributes Questionnaire. At the begin- 
ning of the semester, classes of which the subjects 
were members completed the short form of the Per- 
sonal Attributes Questionnaire (Spence & Helmreich, 
1978), which is divided into three eight-item scales: 
masculinity (M), femininity (F), and masculinity- 
femininity (M-F). (Since the results from the M-F 
scale provided no additional information, only data 
from the M and F scales will be reported.) Each 
item is scored from 0 to 4, and total scores for each 
scale are obtained by summing the item scores. Re- 
sponses to the M items are scored in a “masculine” 
direction and responses to the F items in a “fem- 
inine” direction. The means on the M and F scales 
are highly comparable to the normative data for 
college students presented in Spence and Helmreich 
(1978), with significant (p < .01) sex differences on 
each scale. The means for M for males and females 
were 21.6 and 20.0. For F, the respective male and 
female means were 21.9 and 24.4. 
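
The scoring rule described above (eight items per scale, each scored 0 to 4, summed to a scale total) can be sketched as follows. This is only an illustration: the function name and the example responses are hypothetical, and the responses are assumed to have been keyed in the scale direction already.

```python
def score_paq_scale(item_responses):
    """Sum eight item responses (each scored 0-4) into one scale score (0-32).

    Assumption: responses are already keyed in the scale direction
    ("masculine" for M items, "feminine" for F items).
    """
    if len(item_responses) != 8:
        raise ValueError("each short-form PAQ scale has eight items")
    if any(r not in range(5) for r in item_responses):
        raise ValueError("each item is scored from 0 to 4")
    return sum(item_responses)


# Hypothetical responses of one subject to the eight M items.
m_score = score_paq_scale([3, 2, 4, 3, 2, 3, 1, 3])
print(m_score)
```

With eight items scored 0 to 4, possible scale totals run from 0 to 32, which puts the reported means of roughly 20 to 24 somewhat above the scale midpoint.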

One of the analyses of the joint contribution of M 
and F scores to the dependent variables involved 
classification of the subjects into one of four cat-
egorical groups, based on a median split of M and F
scores:


for females were highly similar to those found in
other college samples (Spence & Helmreich, 1978).
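
A median-split classification into the four categorical groups named in the introduction (androgynous, masculine, feminine, undifferentiated) might be sketched as below. The function name and the sample scores are hypothetical. Two choices are assumptions rather than details given in the text: scores exactly at the median are treated as low, and the medians are computed on whatever sample is passed in.

```python
from statistics import median


def classify_by_median_split(scores):
    """Assign each (M, F) score pair to one of four categorical groups.

    Assumptions (not stated in the text): a score equal to the median
    counts as low, and medians are taken over the sample passed in.
    """
    m_median = median(m for m, _ in scores)
    f_median = median(f for _, f in scores)
    labels = []
    for m, f in scores:
        high_m = m > m_median
        high_f = f > f_median
        if high_m and high_f:
            labels.append("androgynous")
        elif high_m:
            labels.append("masculine")
        elif high_f:
            labels.append("feminine")
        else:
            labels.append("undifferentiated")
    return labels


# Hypothetical (M, F) scale totals for five subjects.
sample = [(25, 26), (25, 18), (17, 27), (16, 17), (21, 22)]
print(classify_by_median_split(sample))
```

Handling of ties at the median changes group sizes, so any reanalysis would need to state the rule it uses.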

Activities questionnaire. An activities question- 
naire was constructed to measure subjects’ reactions 
to performing various masculine, feminine, and neu- 
tral activities. Two lists of activities were included. 
The first, labeled everyday behaviors, described a 
series of simple tasks comparable to those used by 
Bem and Lenney (1976). These included, in random 
order, 4 masculine behaviors, 4 feminine behaviors, 
and 4 neutral behaviors. The 12 tasks were rated by 
10 male and 10 female student judges on 5-point 
scales labeled 1 = very feminine, 3 = non-sex-typed, 
5 = very masculine. The results confirmed the ade-
quacy of the classification. Items and mean ratings
are shown in Table 1. 

Subjects were asked to imagine themselves per- 
forming each activity and to indicate on a 7-point 
scale how comfortable they would be. The scale 
ranged from very uncomfortable to very comfortable, 
with high scores indicating greater comfort. 

The second list, labeled announcements, was com-
posed of 24 items referring to a TV announcement
of public interest or a description of some product.
Subjects were asked to imagine themselves reading 
a prepared message on each of the topics and to rate 
how comfortable they would feel, again on a 7-point 
scale. In content, the items were equally divided 
among masculine, feminine, and sex-neutral topics. 
Four of the announcements in each of the three 
categories were affectively neutral (e.g., feminine:
“Point out the positive features of a particular style
of purse while holding up the purse”). The other
four of the activities in each category were somewhat
embarrassing or negative in nature (e.g., masculine:
“Point out on a model the positive features of a pair
of jockey shorts”). The ratings of undergraduate
judges also confirmed the assignment of the an-
nouncements to the three sex-typed categories. (As-
signment to affectively negative and neutral cat-
egories was confirmed post hoc by the comfort rat-
ings of the experimental subjects themselves.)

After making their comfort ratings, subjects were
asked to indicate their preferences for performing
the previously rated activities. The activities were
presented in three separate lists. The first contained
the 12 everyday behaviors, the second the 12 affec-
tively neutral announcements, and the third the 12
negative announcements. For each list separately,
subjects were asked to indicate the 4 activities they
would most prefer to perform and the 4 activities
they would least prefer to perform. A score of 3 was
assigned to items marked as most preferred, a score
of 2 to the unmarked items, and a score of 1 to the
items marked as least preferred. For each list, each
subject’s scores for the 4 masculine, feminine, and
neutral items were summed to yield preference
scores.

Embarrassment index. After demonstrating two
tasks from the everyday behaviors list, one masculine
and one feminine, subjects were asked to indicate
on a ...-point scale how embarrassed they felt while per-
forming ... (1976) ...

Table 1
Everyday Tasks by Category (Masculine, Neutral, Feminine) and Mean Ratings by
Judges on Masculinity-Femininity

Category/item                               Order of presentation   Mean rating
Masculine
  Oil a squeaky hinge on a metal box                  2                4.30
  Nail two boards together                            4                4.60
  Tighten a screw                                     5                4.50
  Attach fishing tackle to a line                    12                4.65
Neutral
  Play with a yo-yo                                   3                3.10
  Sharpen a pencil                                    6                2.85
  Peel an orange                                      7                2.95
  Put a jigsaw puzzle together                        9                3.00
Feminine
  Iron a cloth napkin                                 1                1.25
  Set a table                                         8                1.50
  Measure out a cup of flour                         10                1.35
  Sew a button to a piece of fabric                  11                1.20


Note. Raters were 10 female and 10 male under- 
graduates. 


factory results as postdemonstration ratings, this por- 
tion of the study was abbreviated and was conducted 
primarily to protect the cover story about the pur- 
pose of obtaining the initial comfort and preference 
ratings. Further, it was felt that the subjects’ relief 
in not having to give any of the negative announce- 
ments might obscure any differential reactions to the 
masculine and feminine tasks that might otherwise 
have occurred. In support of this suspicion, an anal- 
ysis of the embarrassment ratings revealed no sig- 
nificant differences in either sex. For this reason, no 
further reference will be made to these data. 


Procedure 


Subjects were tested in same-sex groups ranging in 
size from two to eight. Since Bem and Lenney 
(1976) reported stronger results for subjects tested 
by an opposite-sex experimenter, a pair of female 
undergraduate experimenters tested male subjects and 
a pair of male undergraduate experimenters tested 
female subjects. 

Both experimenters were present at the beginning 
of the experimental session, one of them giving the 
initial instructions. Subjects were informed that they 
were participating in the initial phase of a study on 
impression formation designed to investigate whether 
observers’ judgments about stimulus persons’ person- 
alities are influenced by the particular activity the 
latter are performing or how comfortable they are
performing it. Subjects would therefore be asked to
rate a number of activities, several of which they
would later carry out while being videotaped. Their 
data would be used to select activities for the later 
impression formation study. After a general descrip- 
tion of the several types of activities had been given, 
subjects were informed that they would be given 
instructions about what to do or a statement to read 
while they were being videotaped. It was emphasized 
that there was no interest in how well they were 
doing the activity but that they were to try to be- 
come involved in what they were doing or saying, 
so that they would look believable. 

The experimenter then told the subjects that they 
would be shown the whole list of tasks before the 
videotaping and asked for their reactions. The reason
given for this procedure was to provide additional
information about the tasks and to give the subjects
some choice in what they would later be doing. The
activities questionnaire was then administered. As 
each subject completed the activities questionnaire, 
the second experimenter took the subject into an ad- 
joining room for videotaping. 

Each subject was asked to perform a masculine 
and a feminine task from the list of everyday be- 
haviors. One task was randomly selected from those 
the subject marked as most preferred and one was
randomly selected from those the subject marked as 
least preferred. After the videotaping, each subject 
returned to the first room and completed the em- 
barrassment questionnaire and the AWS. 


Results 
Everyday Behaviors 


Comfort ratings. Summed comfort ratings
were found for each subject on the masculine,
feminine, and neutral tasks. Highly signif-
icant (p < .001) differences were found in
each sex among the mean ratings for the three
types of tasks, subjects rating sex-congruent
tasks highest, neutral tasks intermediate, and
... gender, then, ...

This pattern of positive correlations pro-
vides preliminary support for our hypothesis
that within each type of task, those high in
instrumentality and expressiveness would tend
to give higher comfort ratings than others. A 
more direct confrontation between our hy- 
pothesis and Bem’s is afforded by classifying 
subjects into categorical groups, based on
their joint M and F scores. As illustrated in 
Figure 1, Bem’s theory implies that on gen- 
der-incongruent tasks (masculine tasks for 
females, feminine tasks for males), individuals 
who are sex-typed in personality will show the 
lowest comfort ratings, whereas cross-typed 
individuals will presumably show the highest 
ratings. On gender-congruent tasks, however, 
all groups except the cross-typed are expected 
to show equally high ratings. Finally, no dif-
ferences are predicted for the neutral tasks.
In contrast, our expectations, drawn from 
self-esteem data, are that similar relationships 
occur in all three task categories in both sexes, 
androgynous subjects tending to have the 
highest means, followed by masculine and 
then feminine and undifferentiated individ- 
uals. Inspection of the means of the four cat- 
egorical groups in both sexes on all three tasks 
showed good correspondence with the latter 
prediction, androgynous or masculine subjects
having the highest ratings in all six compari-
sons and undifferentiated or feminine subjects
the lowest.

The data from the masculine and feminine
tasks were each analyzed by a 2 (Sex) × 4
(PAQ Category) analysis of covariance, using
the neutral task ratings as the covariate. For
both masculine and feminine tasks a highly
significant (p < .0001) main effect was found
for sex. For masculine tasks, the main effect
of PAQ category was also significant (p <
.01), but for feminine tasks, it was not (p <
.25). Of particular importance, the Sex ×
Category interaction was nonsignificant (F <
1) in both analyses.

The pattern of the comfort data clearly 
does not support Bem’s predictions of an in- 
teraction between sex, type of task, and PAQ 
category but corresponds quite closely to our 
expectations. Still further support for our 
hypotheses came from the finding that in both 
sexes, substantial positive correlations were 
found between the comfort measures on the 
three types of tasks (average r = .67). 


The correlation between measures, in con-
junction with the analyses showing similar
relationships with M and F in all task cat-
egories, led us to combine ratings on the three
types of tasks to yield a single comfort score.
Subsequent analyses, undertaken to explicate
further the relationships among comfort rat-
ings, PAQ scores, and sex role attitude (AWS)
scores, were based on the combined comfort
measure. Presentation of the results of these
analyses will be delayed until after the pref-
erence measure has been described.

Preference ratings. It will be recalled that
each of the four tasks rated as most preferred
was given a score of 3, those rated as least
preferred a score of 1, and those unrated a
score of 2. The sum of the ratings on the
masculine, feminine, and neutral tasks was
then obtained for each subject. Subjects ex-
hibiting perfect sex-related preferences would
have scores of 12 for congruent sex-typed
tasks (masculine tasks for males; feminine
tasks for females) and of 4 for noncongruent
sex-typed tasks; parallel scores for those
showing perfect cross-typed preferences would
be 4 and 12. In initial analyses, means on the
three types of tasks were obtained for males
and females, the results indicating that in
both sexes, the sex-congruent tasks were most
highly rated, followed by the neutral tasks, and
then the cross-typed tasks. Within each sex,
these differences were highly significant (p <
.001).


Table 2
Correlations Among Independent and Dependent Measures

Measure                       1  2  3  4  5
Males
  1. Masculinity
  2. Femininity               .08
  3. AWS                      −.17  .13
  4. Everyday preference      .28**  −.08  −.49***
  5. Everyday comfort         A Yid 4 -03 10
Females
  1. Masculinity              16 00 ER 25***
  2. Femininity               −.08 −.04 14
  3. AWS                      Bii ‘17
  4. Everyday preference      −.18**
  5. Everyday comfort

Note. High preference scores indicate greater preference for gender-congruent
tasks. AWS = Attitudes Toward Women Scale.
Sl. **p < .05. ***p < .01.


A single score, called preference, was next 
obtained for each subject by subtracting the 
summed preference ratings for cross-typed 
tasks from the summed ratings for sex-congru-
ent tasks. Possible scores thus ranged from 8
(12 − 4) for those with perfect sex-congru-
ent preferences to −8 (4 − 12) for those with
perfect cross-typed preferences. These differ- 
ence (preference) scores were used as the 
dependent variable in subsequent analyses. 
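
The preference scoring just described can be sketched in a few lines. The names below are hypothetical and the input format (a mark of "most", "least", or None for each of the four tasks in a category) is an assumption made for illustration. The point values (3, 2, 1) and the congruent-minus-cross-typed difference are taken from the text.

```python
# Points per mark, as described in the text: most preferred = 3,
# unmarked = 2, least preferred = 1.
POINTS = {"most": 3, None: 2, "least": 1}


def category_sums(marks):
    """Sum the points for the four tasks in each category.

    marks: dict mapping "masculine", "feminine", "neutral" to a list of
    four marks, each "most", "least", or None (unmarked). This input
    layout is a hypothetical convenience, not the study's data format.
    """
    return {cat: sum(POINTS[m] for m in ms) for cat, ms in marks.items()}


def preference_score(marks, sex):
    """Sex-congruent sum minus cross-typed sum; ranges from -8 to 8."""
    sums = category_sums(marks)
    congruent = "masculine" if sex == "male" else "feminine"
    cross = "feminine" if sex == "male" else "masculine"
    return sums[congruent] - sums[cross]


# A subject who marks all masculine tasks "most" and all feminine "least".
marks = {
    "masculine": ["most"] * 4,
    "feminine": ["least"] * 4,
    "neutral": [None] * 4,
}
print(preference_score(marks, "male"), preference_score(marks, "female"))
```

Because exactly four tasks are marked most preferred and four least preferred on each list, the three category sums always total 24, so the difference score carries the information about sex-typed preference while the neutral sum is determined by the other two.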

Relationships among variables. Correla- 
tions among the independent and dependent 
variables are shown in Table 2. As found in 
previous studies, M and F show low positive 
relationships in each sex. The relationships of 
M and F with the AWS also confirm previous 
findings, the signs of the correlations being 
theoretically coherent but the magnitudes low 
(and in this instance, nonsignificant).

Correlations for the combined comfort 
scores parallel previous reports for the ratings 
of each type of task: both M and F are posi- 
tively related to the comfort measure in both 
sexes. As expected, M (instrumentality) is 
more strongly related to comfort ratings than 
F (expressiveness). 

To examine the joint influence of trait and 
role attitude variables on the comfort mea-
sure, separate stepwise multiple regressions
were performed for males and females.¹ The
interaction term (M X F) was also included 
with the constraint recommended by Cohen 
(1978) that it be entered after the main effect 
terms. The regressions are summarized at the 
bottom of Table 3. Instrumentality (M) was
most strongly related to comfort ratings, fol- 
lowed by expressiveness (F), the effects being 
larger in females than males. The AWS made 
no significant contribution. (Separate regres- 
sions on comfort ratings for the three types 
of tasks produced identical results.) 

As another approach to the joint contribu- 
tion of M and F, the mean comfort ratings 
have been plotted for the four PAQ groups 
of each sex in Figure 1. Analysis of variance 
of these data shows a significant main effect 
for sex, F(1, 200) = 15.11, p < .001, with 
females reporting greater comfort. The main 
effect for PAQ category is also significant, 
F(3, 200) = 4.01, p < .01, whereas the inter- 
action F is less than 1. In both sexes, an- 
drogynous subjects gave the highest comfort 
ratings, followed by the masculine and then 
the undifferentiated and feminine subjects. 
The contrast of androgynous with all others 
is significant (p < .001), and androgynous
and masculine versus undifferentiated and 
feminine approaches significance (p < .06). 

Turning to the preference data, inspection
of the correlations in Table 2 shows that
in males the signs of the correlations were
in theoretically coherent directions, stronger
preferences for sex-congruent (masculine)
tasks being positively associated with M
(r = .28, p < .05). Also as predicted, the rela-
tionship between AWS and preference
was stronger than for the PAQ scales.
Turning to females, even smaller correlations
with M and F are found, neither significant.
The AWS shows the largest correlation
(−.14) with preference and is in the predic-
ted direction, but even this relationship is
nonsignificant.

Stepwise multiple regressions were also
performed on the preference data and are
shown at the top of Table 3. As we predicted,
attitude scores provided the best prediction
of sex-typed preference in males, followed
by M. No significant effects were found in
females.

Finally, the mean preference scores were 
examined for those in each of the four PAQ 
categorical groups. These means are plotted 
at the top right of Figure 1, while the results 
predicted by Bem are shown at the left. 
The most striking finding to be observed is 
the markedly greater preference of males 
than females for sex-typed tasks in all cate- 
gorical groups except the undifferentiated. 
Analysis of these data, using a 2 (Sex) × 4
(PAQ Category) analysis of variance, re-
vealed that the main effect for sex is highly
significant, F(1, 200) = 20.60, p < .0001,
accounting for 74% of the between-groups
variance. The main effect for PAQ category
is significant, F(3, 200) = 2.18, p < .01, as
is the interaction between sex and PAQ cate-
gory, F(3, 200) = 4.47, p < .01, reflecting
a different ordering of means in the two
sexes.

Simple analyses of variance by PAQ cate- 
gory within each sex were then undertaken. 
For females, the F was less than 1, but for 
males it was highly significant (p < .001). 
Contrasts of the male data indicated that
the masculine subjects were significantly (p <
.01) more sex-typed in their preferences than
the androgynous and undifferentiated sub-
jects but did not differ significantly from
cross-typed, feminine subjects. Since Bem
predicts no difference between androgynous
and undifferentiated individuals, these groups
were also compared. A difference of border-
line significance, t(45) = 1.7, p = .15, was
found, reflecting the higher preference of an-
drogynous males for sex-congruent tasks.²


¹ If regressions are co... predictor, highly significant
... this variable reflecting the strong sex effect.
... the signs of correlations with criteria
... the two sexes, the separate regressions give
a clearer picture of relationships.

² Bem and Lenney classified their subjects as mas-
culine, feminine, or androgynous using Bem’s (1974)
technique based on the concept of balance
between masculine and feminine attributes.

Table 3
Regressions of Masculinity, Femininity, and AWS on Sex-Typed
Preference and on Comfort Ratings

Step  Variable   Entry F    p      Multiple R   R² change   Overall F    p
Sex-typed preference
  Males
   1    AWS        28.9     .00       .49          –          28.9      .000
   2    M           4.7    <.03       .53         .04         17.4      .000
   3    F           <1      ns        .53         .00         11.5      .000
   4    M × F       9.8     .002      .60         .07         12.0      .000
  Females
   1    AWS         2.1     .15       .13          –           2.1      .15
   2    M           1.9     .16       .18         .02          2.0      .14
   3    F           <1      ns        .19         .01          1.4      .26
   4    M × Fᵃ      –       –         –            –           –        –
Comfort
  Males
   1    M           2.8     .09       .17          –           2.8      .09
   2    F           1.5     .22       .22         .02          2.1      .12
   3    AWS         <1      ns        .22         .00          1.5      .22
   4    M × F       <1      ns        .22         .00          1.1      .36
  Females
   1    M           7.7     .01       .25          –           7.7      .006
   2    F           3.7     .05       .31         .03          5.8      .004
   3    AWS         1.6     .20       .32         .01          4.4      .006
   4    M × F       <1      ns        .33         .00          3.3      .010


Note. M = masculine. F = feminine. AWS = Attitudes Toward Women Scale.
ᵃ Data showed tolerance insufficient for inclusion.


An analysis of covariance was also per- 
formed on the male preference data, using 
AWS scores as the covariate. The term for 
AWS was highly significant (F = 31.0, p <
.001), accounting for 65% of the explained
variance and 23% of the total variance. The 
term for PAQ category was also significant 
(F = 5.18, p< .01), this variable account- 
ing for 35% of the explained variance and 
13% of the total variance. 

To maximize comparability of analyses, subjects were also
classified into three groups by applying Bem’s pro-
cedure to PAQ M and F scores. The preference and
comfort scores were then analyzed in 2 (Sex) × 3
(Masculine, Feminine, Androgynous) analyses of
variance. The pattern of means for preference paral-
leled Bem and Lenney’s results, whereas those for
comfort did not. In both analyses of variance, how-
ever, only the sex main effect was significant.


Announcements. The data on announce-
ments were collected primarily to determine
whether the affective quality of sex-linked 
activities interacted with the PAQ measures 
in determining preference and comfort rat- 
ings. Although significant main effects were 
found for the positive and negative tasks 
within the masculine, feminine, and neutral 
categories in both preference and comfort 
ratings, no interactions involving category 
and the PAQ measures occurred. Accord- 
ingly, results for positive and negative tasks 
were combined, and analyses paralleling those 
performed with the everyday task data were 
conducted. Sex differences related to type 
of task were highly significant. The contribu- 
tion of the PAQ scales to the criterion mea- 
sures was somewhat weaker than was found 
with the everyday tasks data, but the over- 
all pattern of the results for the comfort 
and preference ratings was similar. For this 
reason, they will not be discussed further. 

Discussion


Mounting evidence suggests that the PAQ 
and the BSRI have both construct and pre- 
dictive validity as measures of instrumental 
(masculine) and expressive (feminine) attri- 
butes. The question to which the present 
study is fundamentally addressed is whether 
the masculine and feminine attributes tapped 
by these psychometric instruments are both 
strongly and directly related to other mascu- 
line and feminine behaviors and attributes, 
as implied by Bem’s theory, or whether 
these personality attributes tend to be only 
weakly and indirectly related to sex-related 
phenomena that do not quite immediately 
engage instrumental and expressive capaci-
ties, as suggested by the present investigators.

Overall, the data from the present study 
support the latter point of view. Analyses of 
the ratings of anticipated comfort in task 
performance showed, as we predicted, weak 
but positive correlations with M and F in 
both sexes on all three types of tasks, mascu-
line, feminine, and neutral. Assessment of the
joint contribution of M and F scores, con-
ducted by assigning subjects to categorical
group, indicated that androgynous individuals
of both sexes expressed the greatest com-
fort (on all tasks combined), followed by
masculine, undifferentiated, and feminine sub-
jects. Except for the inversion of the un-
differentiated and feminine groups (whose means
were, in any event, very similar), this order-
ing ...

The preference results for females were essentially nega-
tive, neither the PAQ nor the attitude (AWS) variables be-
ing significantly related to preference for
gender-congruent tasks. Positive results, how-
ever, were found in males, who as a group,
it is interesting to note, showed strikingly 
higher preference ratings than women. Mas- 
culine men exhibited significantly higher 
preference ratings than did androgynous and 
undifferentiated men but, curiously enough, 
nonsignificantly higher ratings than did cross- 
typed feminine men. One other aspect of 
these findings from males should also be 
mentioned. As we predicted, our measure of 
general sex role attitudes (AWS) was more 
strongly related to the preference measure 
than were PAQ scores and, in analyses by 
PAQ category, accounted for almost twice 
as much of the variance as the latter. (Had
a general measure of personal sex role choices 
been included, even stronger relationships 
with preference ratings might have been ob- 
tained, not only in males but in females.) 

The preference data from the male sub- 
jects replicate the findings of Bem and Len- 
ney that sex-typed subjects (in this instance, 
masculine men) show the greatest preference 
for gender-congruent tasks on a forced-choice 
measure. This finding supports Bem’s theory, 
but the relative performances of the remain- 
ing groups did not provide confirmation of 
other predictions derived from her theory.
Finally, the PAQ variable, even in males,
contributed only weakly to the preference
ratings. When these data are viewed in the
context of all the available evidence, they
appear to us to be more supportive of our
contention that the trait dimensions mea-
sured by the scales are not only conceptually
distinguishable from various classes of sex
role behaviors and attitudes but are often
only minimally associated with these other 
sex-related phenomena, if at all. 

Finally, even the male preference data do
not support the broad hypothesis that an-
drogynous individuals, high in instrumental
and expressive ... behaviorally flexible ...
are those ... evidence ...
flexibility notion to include sex-related be-
havior in general is unwarranted.

A question that remains to be answered is 
why the present results were seemingly 
weaker than those reported by Bem and 
Lenney (1976). One explanation may be that 
the population effects were weak but a 
greater number of significant findings oc- 
curred in that study because subjects were 
selected from extremes of the distribution of 
combined M and F scores. The possibility 
cannot be discounted that procedural dis-
crepancies may have brought about differ-
ences in the magnitude of effects in the two 
studies. However, it should be acknowledged 
that if method factors are responsible, the 
generalized sex role rigidity ex-
hibited by individuals who are sex-typed in
personality characteristics is an extremely 
fragile phenomenon rather than the robust 
one suggested by Bem’s theory. 

Finally, differences in the BSRI and the 
PAQ may be cited. The PAQ M and F scales, 
as we have demonstrated, form unidimen- 
sional clusters of socially desirable instru- 
mental and expressive traits. Many of the 
BSRI M and F scale items can be similarly 
characterized. Others cannot, however, and 
the inventory is factorially more complex 
than the PAQ. These “extraneous” elements 
may bring about a stronger relationship with 
sex role behaviors. A specific hypothesis can 
be derived from a recent analysis of the 
psychometric properties of the BSRI by
Pedhazur and Tetenbaum (1979). These in- 
vestigators point out that the inclusion of the 
terms “masculine” and “feminine” on the M 
and F scales, respectively, greatly influences 
the discriminant ability of the instrument 
and also its factor structure. For example, in 
a discriminant function analysis between 
male and female respondents, 96.7% of the 
sample were correctly classified as to gender 
by the full function (all 40 items), but 
93.5% were correctly classified by using the 
function based on the adjectives masculine 
and feminine alone. Factor analyses in each
sex yielded four interpretable factors, “mas-
culine” and “feminine” forming a separate, bi-
polar factor, with “feminine” having a posi-
tive loading and “masculine” a negative one.
Given the demonstrated import of these two 
adjectives, it may be that their inclusion to
some degree evokes schemata (Markus, 1977) 
of appropriate masculine and feminine roles 
and that these schemata may influence en- 
dorsement of the trait adjectives that follow 
their appearance. Admittedly, this is putting 
a heavy explanatory burden on two items. 
More generally, interpretation of data ob- 
tained with the BSRI will be facilitated if 
there is better understanding of the influence 
that the presence of the subset of items that 
do not tap socially desirable instrumental 
and expressive traits has, both on responses 
to the items that do describe such traits and 
on the relationships of each subset of items 
to other behavioral and self-report measures.³


3 Bem (1979) has recently developed a Short BSRI 
dropping, notably, the items masculine and feminine 
as well as those with low social desirability. It seems 
reasonable to hypothesize that results from subjects 
classified by either the median split or difference 
method on the Short BSRI would be closer to those 
obtained in the present study. 


Reference Note 


1. Klein, H. M. Psychological masculinity and 
femininity, self-consciousness, and typical and 
maximal dominance expression in women. Un- 
published doctoral dissertation, University of 
Texas at Austin, 1978. 
Nonverbal Trait Inference


Sampo V. Paunonen and Douglas N. Jackson 
The University of Western Ontario, London, Canada 


The purpose of this study was to evaluate empirically the view that personality 
trait inferences merely reflect consistency in linguistic associations among verbal 
trait descriptors and have little correspondence with the actual organization of 
behavior. Trait inferences were obtained in both verbal and nonverbal rating 
contexts to evaluate their relative reliability, accuracy, and structure. Eighty 
male and 80 female undergraduates were supplied with either a verbal or non-
verbal (cartoonlike) description of one of two target persons and were required
to estimate the probability that the target would endorse both verbal and non-
verbal self-descriptive personality items. Results indicated very similar reliabil-
ities, accuracies, and structures for the different conditions. The results are
interpreted as not supporting the view that explicit semantic cues are necessary
in accounting for observed consistencies in ratings of personality. Alternative
hypotheses of desirability and the mediation of nonverbal inferences by some
putative covert verbal process are evaluated and are found not to be supported
by the data.


One of the dogmas in some current the-
ories about personality and the perception
of personality is that there are few consist-
encies, at least of the sort that can be at-
tributed to broad behavioral dimensions or
dispositions. Evidence from a variety of
sources regarding consistencies in behavior
has not been lacking, but its interpretation
as being due to broad personality disposi-
tions has been disputed. For example, rat-
ings made about the personality of others
in a variety of situations have consistently
yielded a stable structure (Wiggins, 1973,
pp. 328-349). But the fact that such studies
have typically employed semantic trait names
has raised doubts regarding the extent to
which these consistencies are attributable to
underlying behavioral consistencies in the
person being rated. D’Andrade (1965) ob- 
tained a similar factor structure for 20 traits 
judged for “similarity of meaning” that had 
been identified previously by Norman (1963) 
from data based on ratings of peers. From 
these data D’Andrade (1965) attempted to 
draw a general conclusion: “Some of the 
classifications used by psychologists can be 
derived solely from similarities in the mean- 
ings of words without considering any sample 
of actual behavior” (p. 215). Similarly, Mu- 
laik (1964) suggested that parallel findings 
derived from people ratings and word rat- 
ings “serve to focus attention upon the role 
of conventional linguistic usage in determin- 
ing the correlations between trait rating 
scales” (p. 507). 

The findings of D’Andrade and of Mulaik, 
advancing the hypothesis that observed con- 
sistencies in rating others might be due to 
linguistic conventions, have been highlighted 
by others as evidence against the existence 
of traits or other broadly based regularities 
in behavior. For example, Mischel (1968) 
indicates that “so-called traits at least in part 
exist as components of the verbal terms used 
to describe the external world; they do not 
necessarily mirror the external world itself 
(D’Andrade, 1965)” (p. 46). As time has 
passed, earlier tentative statements of this 
“semantic similarity” hypothesis have tended 
to be treated as established law. Schneider 
(1973), for example, states, “The claim is 
made and indeed rarely questioned that vari- 
ous kinds of linguistic structures produce the 
implicit personality theory” (p. 302). Ban- 
dura (1969), D’Andrade (1974), Mirels 
(1976), Mischel (1976), and Shweder 
(1975), among others, have argued force- 
fully for the primacy of “linguistic charac- 
teristics,” “concept meanings,” “preexisting 
conceptual schemes,” or some similar term in 
influencing the perception of generalized re- 
sponse dispositions. But rarely are these 
terms clearly and explicitly defined. 

The concept of “similarity of meaning,” 
which is central to the issue, since judgments 
of the similarity of word meanings have been 
used to assess the “cognitive structure” of 
raters, is somewhat obscure in that it has 
been the subject of a variety of interpreta- 
tions, not all of which have been made ex- 
plicit by researchers and not all of which are 
mutually consistent (D’Andrade, 1974, pp. 
177-180). D’Andrade (1974) recognizes this 
as a “critical problem” and states that, in his 
treatise, “measures of conceptual similarity 
have been phrased primarily in terms of 
semantic similarity” (p. 177). Moreover, 
he discusses several varieties of semantic 
problem of not knowing what 
is happening when people are used as mea- 
suring instruments, raised 
of this chapter and given 
certain procedures 


ing the nature of 
Validating ratings 
ry to state clearly 
ntic similarity hy- 
tion in the present 


Most studies involving ratings of the per- 
sonalities of others have used verbal trait 
labels as reference points for rating scales. 
It has been conjectured that the “similarity 
in meaning” of these verbal trait descriptors 
rather than “real world” similarity may ac- 
count for the observed consistencies in these 
rating studies. To our knowledge, this hy- 
pothesis has not been sufficiently delineated 
to allow for predictions of potential effects 
due to varying degrees of verbal and non- 
verbal target information, target-relevant 
versus target-irrelevant information, type of 
rating scale used, or other parameters con- 
ceivably biasing trait inferences. D’Andrade 
(1965), however, underscores the fact that 
the classification of traits by rating scales 
has been confounded with the use of verbal 
trait descriptors. Consequently, he questions 
the significance of trait attribution research, 
since it is difficult to ascertain whether “stud- 
ies that have reported in various forms cor- 
relations between two or more linguistic 
labels have been reporting about relations be- 
tween real world events, or about how similar 
in meaning the labels are, or some complex 
interaction of meanings and external events” 
(p. 228). It is our intention in this study 
to unconfound the use of verbal materials in 
designating traits from the trait descriptions 
by employing nonverbal as well as verbal 
stimulus and response materials in a trait 
inference paradigm. 

If the observed consistencies in the ratings 
of personality are simply a product of 
the words used by experimenters, it is possible 
that these ratings, as merely a reflection 
of the perceivers’ “cognitive structures,” 
will not correspond to the actual organization 
of behavior as measured more 
objectively. An alternative position is the 
isomorphism hypothesis, that relations between 
semantic cues are highly associated 
with the actual probable co-occurrence of the 
behaviors reflected by the verbal cues. Cer- 
tain authors (D’Andrade, 1974; Shweder, 
1975) entertain this possibility, but favor the 
interpretation that linguistic associations 
somehow determine the observed, invariant 
... these structures have 
... the actual organization 
of behavior. Of course, if the isomorphism 
hypothesis were confirmed, if semantic 
meaning were shown to mirror behavior 
validly, such a finding 
would serve to reinforce confidence in 
the validity of observed behavioral consist- 
encies or traits. Results apparently at vari- 
ance with the isomorphism hypothesis (D’An- 
drade, 1974; Shweder, 1975) have been chal- 
lenged by Block, Weiss, and Thorne (1979). 
The issue is an important one theoretically 
and one about which there is conflicting evi- 
dence. Lay and Jackson (1969), for example, 
found that the structure observed for multi- 
dimensional scaling of trait names was simi- 
lar to that obtained when actual Personality 
Research Form (PRF) (Jackson, 1974) 
items were used, which referred to more con- 
crete behavior than that implied by ab- 
stract trait names. Furthermore, these struc- 
tures were quite similar to the factor struc- 
ture derived from factor analytic findings 
based on responses to the entire PRF by 
different subjects. Chan and Jackson (1979) 
have obtained similar results in the domain 
of psychopathology. 

Recently, Mirels (1976) reported a lack 
of any statistically significant relationship 
between judgments of conditional probability 
of responses to pairs of the Lay and Jackson 
PRF items and observed conditional proba- 
bilities of respondents, claiming that such 
presumed relationships were “illusory.” But 
Jackson, Chan, and Stricker (1979) reana- 
lyzed Mirels’ data and reported new data 
indicating that when the more appropriate 
product-moment correlation is used as an 
index of behavioral co-occurrence rather than 
conditional probability as used by Mirels, 
the association between judged and actual 
co-occurrence of behavior is highly statis- 
tically significant and substantial. The evi- 
dence seems to suggest that “subjects make 
judgments of item coendorsement on some 
basis other than conditional probability” 
(Jackson et al., 1979, p. 9). Reed and Jack- 
son (Note 1) report a number of studies 
indicating that when target persons are 
judged on the basis of broad descriptive 
paragraphs, the behavior of actual persons 
like the targets can be predicted with a high 
degree of reliability and accuracy. 
ane, and desirability were also taken in 
ice’ predictive validities reached as high 


Further evidence of the relevance of trait 
ratings comes from a study by Norman and 
Goldberg (1966). These authors demon- 
strated greater interrater reliability and cri- 
terion validity for peer ratings made on 
close as opposed to distant acquaintances. 
Kusyszyn (1968) and Jackson, Neill, and 
Bevan (1973) also reported similar findings. 
This observation suggests that raters’ per- 
ceptions are veridical when they have rele- 
vant information upon which to base their 
ratings. In addition, since Norman and Gold- 
berg analyzed ratings made on the same vari- 
ables used by D’Andrade (1965), their re- 
sults lend credence to the notion of an iso- 
morphism between meaning factors and per- 
son factors. From the present vantage point, 
however, all of these studies confounded the 
use of verbal rating scales and the question 
of veridicality. 

The question arises: Is the frequently ob- 
served organization of behavior in ratings of 
others solely a function of explicit verbal in- 
formational cues and verbal rating scales, or 
might this same structure arise in the ab- 
sence of such semantic cues? In this study 
we propose the use of a method not depend- 
ing upon explicit linguistic elements in the 
forming or reporting of inferences about the 
behavior of others. Judges will be asked to 
make inferences about the behavior of a 
target person who will be characterized with 
pictorial representations of some concrete 
behaviors in which linguistic elements are 
lacking. Similar pictorial information, again 
based on the figural representation of con- 
crete behaviors in a diversity of situations, 
will be used for obtaining ratings of the 
probability that a particular target individ- 
ual will engage in particular behaviors. Also, 
we propose to include more traditional verbal 
personality statements or items for compari- 
son purposes, both for providing information 
about a target person and for obtaining judg- 
ments regarding the behavior of particular 
target individuals. The proposed design sepa- 
rates verbal and nonverbal cues, both in 
terms of the information used to form an 
impression of the individual and in terms of 
the vehicle used in communicating that im- 
pression. The study is thus designed to in- 
vestigate (a) the degree to which consist- 
encies in rated behavior depend upon ex- 
plicit linguistic or semantic characteristics, 
(b) the degree to which verbal and non- 
verbal information yields inferences about 
behavior that conform to personality types 
observed empirically, and (c) the structure 
of perceptions of personality elicited by ver- 
bal and nonverbal materials. We hypothesize 
that there will be a high degree of consensus 
among judges regardless of whether verbal or 
figural cue information or rating scales are 
used and that, on the whole, judges’ charac- 
terizations of the behavior of target persons 
will conform closely to the structuring of the 
behavior as observed empirically. We are not 
hypothesizing that individuals necessarily 
avoid all use of implicit semantic reasoning 
processes in drawing inferences about per- 
sonality, but rather that explicit verbal cues, 
that is, those supplied by the experimenter 
in transmitting and receiving information, are 
not a necessary prerequisite for the identifi- 
cation of stable predictions regarding the be- 
havior of others. 


Method 
Experimental Design 


Psychology undergraduates were used in this study 
as raters who were instructed to make estimates 
concerning the probability that a target would en- 
dorse certain self-descriptive personality items. Each 
rater was presented with a limited amount of in- 
formation describing a target 
depicted as having distinct behavioral characteris- 


individual eng; 
of the likelihood of behavior in 


on verbal personality it 


dictions of the behavior depict 
p, ed 
drawings, designated Order Si Sape Renee 


Construction of the Target Descriptions 


Subjects were instructed to make behavioral pre- 
dictions for one of two target characters, who were 

Figure 1. Criterion Profile 1, based on 796 Person- 
ality Research Form E male respondents (basis for 
Ed Nolan target). 


described either verbally or nonverbally. The attributes 
of the targets were based on distinct personality 
types derived empirically from a multivariate 
clustering procedure (Jackson & Williams, 
1975; Skinner, 1977; Skinner, Jackson, & Hoffmann, 
1974). Given a matrix of profiles representing the 
scores of a number of persons on a number of 
personality scales, this procedure identifies dimensions 
of persons based on profile shape using an 
Eckart-Young (1936) decomposition of the data 
matrix. Persons having similar profiles occupy similar 
positions in the orthogonal n-dimensional space. 
Each cluster of individuals is represented by a set 
of standardized component scores typifying the 
profile of persons receiving high scores on the 
dimension. 

Figures 1 and 2 show two of five bipolar “modal” 
profiles extracted from a sample of 796 PRF respondents. 
These two orthogonal configurations were 
used as the basis for forming the target descriptions 
for the present study and were used as the 
criterion profiles in the assessment of judgmental 
accuracy. Because the scale means are removed as 
a first step in the derivation of the modal profiles 
from the respondent by scale matrix, predictions 
of these standardized scores cannot be based on 
stereotype accuracy, as defined by Cronbach (1955; 
Wiggins, 1973). Judgments seeking to predict these 
profiles of scores cannot be accurate as a function 
of the raters’ “knowledge of the relative frequency 
... of the possible responses” (Cronbach, 
1955, p. 179) since the “average item self-description 
summed over all others” (Wiggins, 1973, p. 
127) represents a null vector and as such contributes 
no variance to the individual target profiles. Hence, 
judgments based on “accuracy in predicting the 
generalized other” (Cronbach, 1955, p. 179) would, 
in general, not contribute valid variance to the 
predictions of these criterion profiles. 
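
The role of mean removal in this argument can be illustrated with a minimal sketch. It is not the clustering procedure cited above; it shows only the first step (subtracting each scale mean) and why the "generalized other" then becomes a null profile. The function name and the score values are hypothetical.

```python
from statistics import mean


def center_by_scale(scores):
    """Subtract each scale's mean (column mean) from every respondent's score."""
    n_scales = len(scores[0])
    scale_means = [mean(row[j] for row in scores) for j in range(n_scales)]
    return [[row[j] - scale_means[j] for j in range(n_scales)] for row in scores]


# Hypothetical raw scores: 4 respondents x 3 scales (illustrative values only).
raw = [
    [12.0, 5.0, 9.0],
    [10.0, 7.0, 9.0],
    [14.0, 6.0, 3.0],
    [8.0, 6.0, 7.0],
]

centered = center_by_scale(raw)

# The "generalized other" is the average respondent. After centering, that
# average is zero on every scale, so it carries no profile shape to predict.
average_profile = [mean(row[j] for row in centered) for j in range(len(raw[0]))]
print(average_profile)
```

A judge who knew only the population averages would therefore have nothing to correlate with a criterion profile derived from the centered matrix.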
An alternative procedure for identifying criterion 
personality trait profiles associated with a certain 
type would have been simply to obtain the per- 
sonality scores of one or more persons highly asso- 
ciated with one of the typal dimensions (Reed & 
Jackson, Note 1). Although such a procedure has 
the advantage of simplicity and concreteness, and 
indeed would yield very similar results, scores so 
obtained would not be completely free of the stereo- 
type component. But in either case we are dealing 
with a set of scores derived from the responses of 
real people, scores that have been shown to possess 
a degree of convergent and discriminant validity 
(Jackson, 1974). 

Four of the most characteristic traits of Criterion 
Profile 1 (Figure 1) were used to describe the first 
target (Ed Nolan). This target was depicted as 
high on PRF scales for affiliation, exhibition, nur- 
turance, and play. The second target (John New- 
man) was based on Criterion Profile 2 (Figure 2), 
denoted by the attributes of aggression, autonomy, 
dominance, and thrill-seeking (low harmavoidance). 

Verbal target descriptions consisted of a name 
and five sentences written in third-person narrative 
form and presented as a personality sketch. For ex- 
ample, Ed Nolan was described to raters as follows: 


Ed Nolan is a sociable person who enjoys being 
with friends and people in general; in fact, he 
makes an effort to win friendships and maintain 
close acquaintances. Ed spends a lot of time par- 
ticipating in games, sports, and various social 
activities and often likes to play practical jokes 
on his fellow workers. His friends describe him 
as being very entertaining and delighting in be- 
haviors that attract the attention of an audience. 
Ed also readily performs favors for others and 
gives sympathy and comfort to those in need. 
When people have a problem they often go to 
Ed for encouragement and consolation. 


Figure 2. Criterion Profile 2, based on 796 Person- 
ality Research Form E male respondents (basis for 
John Newman target). 

Figure 3. Sample nonverbal items depicting thrill- 
seeking behavior (A) and nurturant behavior (B). 


The items for the nonverbal target descriptions were 
prepared by commissioning an artist to illustrate 
a series of scenarios supplied by the authors. Ideas 
for these scenarios originated from a careful perusal 
of PRF scale definitions and items. Each illustra- 
tion consisted of a stick drawing of a central figure 
performing a behavior in a specific context. An 
attempt was made to convey pictorially a behavioral 
manifestation that would be characteristic of a per- 
son defining an extreme position on the substan- 
tive dimension of interest. For example, the upper 
panel of Figure 3 shows a nonverbal stimulus de- 
signed to reflect high thrill-seeking (low harmavoid- 
ance) in the central character, while the lower panel 
reflects high nurturance. Four items were concep- 
tualized and illustrated for each of the four be- 
havioral dimensions typifying each target, with an 
emphasis on nonredundancy of item content. In 
composing the nonverbal trait descriptions, an at- 
tempt was made to imply behavioral consistency and 
to convey an impression parallel to the verbal 
counterparts. The 16 items describing a character 
were randomly ordered in a four-page booklet bear- 
ing a picture and the name of the central or target 
figure on a cover page. 


Construction of Verbal and Nonverbal 
Materials for Behavioral Prediction 


Judges were instructed to make behavioral pre- 
dictions separately on the basis of both verbal and 
nonverbal rating materials. The verbal items were 
extracted from Jackson’s (1974) Personality Re- 
search Form (PRF), Form E. The first five items 
for each of 17 content scales and two validity scales 
(infrequency and social desirability) were selected 
from the PRF test booklet. Each scale had either 
two or three items keyed in one direction, with 
the remainder being keyed in the reverse direction. 

Items for the nonverbal rating scales were com- 
posed in a manner similar to that of the non- 
verbal target description scenarios. A total of five 
rationally constructed nonverbal items for each of 
17 content scales was created with unidirectional 
keying. In addition, a five-item validity scale was 
compiled, reflecting undesirable behaviors but het- 
erogeneous with regard to content or the possible 
personality scale on which they might be keyed. 
Three of the content scales of the PRF were not 
depicted because they represent particularly difficult 
constructs for nonverbal definition: cognitive 
structure, change, and defendence. These constructs 
were also absent in the verbal rating scales. 

Each of the 90 nonverbal items was rated for desirability 
by 12 psychology graduate students using 
a 9-point rating scale ranging from extremely 
undesirable to extremely desirable. 


Subjects and Procedure 


ted in 10 males 
tion. 


| common to all; 
ting situation; (c) 
to one cell of the 
nonverbal rating 
common prelim- 
read aloud by the experi- 
menter while the subjects simultaneously read their 
text. The specific instructions were read silently by 
each participant; these instructions were equivalent 
across raters in form and wording except for neces- 
sary modifications such as interchanging the words 
verbal description and nonverbal description. 

The directions for completing the verbal rating 
scales instructed the subjects to make a judgment, 
using a 9-point scale, on the probability that the 
described target would endorse each of the verbal 
statements. Instructions for the nonverbal rating 
scales stated that judges were to estimate the 
likelihood that the described target would engage 
in the behaviors depicted by the central figure in 
each item. Judgments were again recorded on 9-point 
scales. 


Results 
Overview of Analyses 


The analyses may be divided into three 
stages corresponding to the three major questions 
addressed. First, interjudge reliabilities 
were computed to evaluate the degree to 
which raters share a consensus regarding 
their interpretation of nonverbal and verbal 
items or behavioral exemplars. Second, an 
assessment was undertaken to appraise the 
degree to which judgments based upon behavioral 
exemplars are valid in the sense that 
they show a pattern of inferences consistent 
with the structure of behaviors observed empirically. 
Third, the structures of inferences 
based upon nonverbal and upon verbal stimulus 
materials were compared. 

Before analyses were undertaken, judgments 
for both nonverbal and verbal items 
were scored in conformity with the scale on 
which the item was keyed. Each judge thus 
produced two profiles, one based on ratings 
of verbal items and one based on pictorial 
materials. This resulted in 160 separate profiles 
for verbal judgment scales and 160 for 
nonverbal judgment scales, one of each type 
for each of the 20 judges in each condition. 
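
Scoring "in conformity with the scale on which the item was keyed" can be sketched as follows. The text does not give the scoring rule, so two assumptions are made here and should be read as such: ratings on reverse-keyed items are reflected around the midpoint of the 9-point scale (a rating r becomes 10 - r), and a scale score is the sum over its items. The function name and example values are hypothetical.

```python
SCALE_MIN, SCALE_MAX = 1, 9  # 9-point probability rating scale


def score_scale(ratings, keys):
    """Score one scale from item ratings.

    ratings: one judgment (1..9) per item
    keys:    +1 for an item keyed in the scale direction, -1 for a reversed item
    Assumption: reversed items are reflected (r -> 10 - r) and items are summed.
    """
    total = 0
    for rating, key in zip(ratings, keys):
        if key == +1:
            total += rating
        else:
            total += SCALE_MIN + SCALE_MAX - rating
    return total


# Hypothetical five-item verbal scale with three true-keyed and two reversed items.
example_score = score_scale([9, 8, 2, 1, 7], [+1, +1, -1, -1, +1])
print(example_score)  # 9 + 8 + 8 + 9 + 7 = 41
```

Repeating this for each scale gives one profile per judge per rating format. The nonverbal scales were keyed in one direction only, so under these assumptions every key would be +1 for them.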


Interjudge Reliabilities 


The generalizability of judgments across 
different judges within each experimental condition 
was appraised by randomly dividing 
the 20 raters in each of the eight conditions 
into two equal groups of 10, each having the 
same number of males and females. This was 
done separately for nonverbally and verbally 
based profiles. The 10 profiles were then averaged 
separately for each of the two randomly 
divided groups. The two mean profiles, consisting 
of 18 scales each (17 content scales 
and social desirability), were correlated. 
Table 1 reports interjudge reliabilities corrected 
by the Spearman-Brown formula to 
yield an estimate of within-cell group reliability 
(Gulliksen, 1950, p. 66). These values 
range from .93 to .99 and reveal a consistently 
high degree of consensus across treatment 
groups, targets, verbal and nonverbal descriptions, 
and verbal and nonverbal judgmental 
items. Of particular interest is the absence of 
a significant difference in group reliabilities 
between verbal and nonverbal rating scales. 
Average values of .96 for nonverbal and .97 
for verbal conditions indicate that high levels 
of judgmental consistencies are not predicated 
solely on the presence of an explicit semantic 
structure in rating scales. 
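
The split-half procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' program: the function names are hypothetical, and the random assignment of judges to halves (balanced by sex) is assumed to have been done already.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Product-moment correlation between two equal-length profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_profile(profiles):
    """Average a list of judge profiles scale by scale."""
    n_scales = len(profiles[0])
    return [mean(p[j] for p in profiles) for j in range(n_scales)]


def spearman_brown(r, k):
    """Reliability of a composite k times as long as one with reliability r."""
    return k * r / (1 + (k - 1) * r)


def split_half_group_reliability(half_a, half_b):
    """Correlate the two half-group mean profiles, then step up to the full group."""
    r_half = pearson(mean_profile(half_a), mean_profile(half_b))
    return spearman_brown(r_half, 2)
```

With two halves of 10 judges each and 18 scales per profile, `split_half_group_reliability` would return the kind of within-cell estimate reported in Table 1.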

The Spearman-Brown formula was also 
used to derive estimates of single-subject reli- 
abilities for each of the conditions. These 
range from .40 to .83, with a mean of .61 over 
all 16 conditions. More properly, these co- 
efficients reflect the generalizabilities of judg- 
mental sensitivities over judges expressed in 
terms of single judges. Were we seeking to 
develop a reliable test of judgmental sensi- 
tivity for single judges over targets, many 
more targets than one per judge, used here, 
would be appropriate, because targets are 
analogous to items on a test. 
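
The single-subject values follow from running the Spearman-Brown formula in reverse. The sketch below assumes the group reliabilities refer to composites of k = 20 judges; the function name is hypothetical.

```python
def single_judge_reliability(r_group, k):
    """Invert Spearman-Brown: reliability of one judge given a k-judge composite."""
    return r_group / (k - (k - 1) * r_group)


# Endpoints of the group reliabilities reported in the text (.93 to .99), k = 20.
low = single_judge_reliability(0.93, 20)
high = single_judge_reliability(0.99, 20)
print(round(low, 2), round(high, 2))  # 0.4 0.83
```

Under that assumption the endpoints come out at about .40 and .83, which agrees with the single-subject range given above.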


Evaluation of Judgments in Terms of 
Modal Types 


In the present study, experimentally derived 
profiles based on the raters’ judgments 
were expected to emulate empirically obtained 
criterion personality profiles (Reed & Jack- 
son, Note 1). A close correspondence between 
the two is an indication of the accuracy of the 
inferential judgments observed in the present 
study. Such evidence would support the interpretation 
that the inferred relationships between 
personality characteristics accurately 
reflect these relationships as revealed in the 
covariation of responses to structured personality 
items. 

Table 1 
Interjudge Reliabilities for Eight Experimental Conditions for Nonverbal 
and Verbal Ratings 

                           Rating scale 
Target description      Nonverbal   Verbal 

Ed Nolan 
  Order 1 
    Nonverbal              .96        .98 
    Verbal                 .93        .94 
  Order 2 
    Nonverbal              .98        .96 
    Verbal                 .98        .97 

John Newman 
  Order 1 
    Nonverbal              .94        .97 
    Verbal                 .96        .97 
  Order 2 
    Nonverbal              .97        .99 
    Verbal                 .98        .98 

Note. Order 1 = predictions based on nonverbal 
drawings first, followed by predictions of responses 
to verbal personality items. Order 2 = predictions 
based on verbal personality items first, followed by 
predictions of the behavior depicted by the non- 
verbal drawings. 


Individual accuracies or sensitivities (Jack- 
son, 1972; Reed & Jackson, 1975), reflecting 
the degree to which a judge manifested judgments 
congruent with observed behavioral 
consistencies, were computed by simply correlating 
both of his/her judgmental profiles 
of 17 content scales with the relevant criterion 
profile. Group sensitivities were obtained by 
first averaging the 20 individual profiles in a 
treatment condition and then correlating this 
group mean profile with the relevant criterion 
profile. These sensitivity indices were computed 
separately for both verbal and nonverbal 
rating scales. This computational technique 
(i.e., the product-moment correlation 
between two profiles) measures the degree of 
similarity between each judge's rating of the 
target profile shape and the profile shape as 
identified empirically in groups of respondents. 
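
The two sensitivity indices can be sketched as follows. The function names and the small example profiles are hypothetical and use 4 scales for brevity, where the study used 17 content scales; the sketch shows only that an individual index correlates one judge's profile with the criterion, whereas the group index correlates the mean of the judges' profiles with it.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Product-moment correlation between two equal-length profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def individual_sensitivities(judge_profiles, criterion):
    """One correlation per judge: judged profile versus criterion profile."""
    return [pearson(p, criterion) for p in judge_profiles]


def group_sensitivity(judge_profiles, criterion):
    """Correlate the mean judged profile with the criterion profile."""
    n_scales = len(criterion)
    group_mean = [mean(p[j] for p in judge_profiles) for j in range(n_scales)]
    return pearson(group_mean, criterion)


# Hypothetical criterion (standardized scores) and three judges' scale scores.
criterion = [1.0, -1.0, 0.5, -0.5]
judges = [[7, 3, 6, 4], [6, 2, 5, 5], [8, 4, 4, 3]]

print(individual_sensitivities(judges, criterion))
print(group_sensitivity(judges, criterion))
```

Because the correlation is unaffected by the elevation and scatter of a profile, both indices reflect agreement in profile shape only.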

Individual sensitivities ranged from -.40 to 
.93. Sensitivities falling below the .30 level 
(of which there were only about 13%), like 
high PRF infrequency scores, can be attributed 
not only to a subject’s lack of awareness 
of trait relationships, but possibly to some 
combination of lack of motivation, misconstrued 
instructions, random responding, or 
deliberate noncompliance. Figure 4 shows the 
means for the individual sensitivities for each 
experimental group. The most salient feature 
of Figure 4 is the substantial level of average 
sensitivity to trait covariation across targets 
and experimental verbal and nonverbal conditions. 
One notes also in Figure 4 a modest 
decrease in mean sensitivities for nonverbal 
ratings as opposed to verbal ratings, in each 
condition. However, verbal and nonverbal informational 
cues appear to result in similar 
levels of group sensitivity. To test these 
effects for significance and to examine any 
other effects on sensitivity due to the factorial 
manipulations, a four-way fixed-effects 
analysis of variance was run using the nonverbal 
and verbal rating scales as two levels 
of the fourth factor. Subjects were nested in 
the other three factors (description type, target 
person, order of dependent variables), 
and the data used as the dependent variable 
was the individual sensitivity index (this is 
referred to by Kirk, 1968, Section 8.13, as a 
split-plot factorial 222.2 design). This analysis 
revealed a significant main effect of rating 
scale type; sensitivities were attenuated 
when judgments were made on the nonverbal 
scales as opposed to the verbal items, F(1, 
152) = 51.58, p < .001. Only one other effect 
was observed, and that was a Rating Scale × 
Target interaction, F(1, 152) = 7.47, p < 
.01; the drop in sensitivities as one examined 
the nonverbal judgmental profiles compared 
to the verbal profiles was more noticeable with 
the target John Newman (based on Criterion 
Profile 2). It should be emphasized that although 
they were significant, these effects 
account for a relatively small proportion of 
the variance. It should also be emphasized 
that the type of information used to form an 
impression of the person (verbal or nonverbal) 
and the order factor had no effects 
on rater accuracy. A separate analysis investigating 
whether or not gender of rater affected 
ratings also revealed no significant 
differences. 

Figure 4. Mean sensitivities for nonverbally and verbally based 
judgments of behavior, by condition. 

Table 2 shows the sensitivity of the group 
as a whole for both the verbal and nonverbal 
personality rating scales for each of eight 
conditions. The accuracy of the group consensus 
in estimating the personality characteristics 
of the targets ranges from values 
of .58 to .95. This accuracy seems to occur 
regardless of the target being rated, the nature 
of the target description, the order of 
dependent measures, or the nature of these 
rating scales. 

The mean correlation of the group mean 
profiles based on the Ed Nolan target with 
the John Newman criterion profile is -.22. 
When the group mean profiles based on John 
Newman are correlated with the Ed Nolan 
criterion profile, the average correlation is .18. 


Evaluating the Role of Desirability 


The argument that the group sensitivities 
were simply based on the connotative social 
desirability components common to target 
descriptions and rating scale items was put to 
test by partialing this portion of the variance 
out of the ratings and recomputing the group 
sensitivities using standard partial regression 
techniques to control statistically the effects 
of desirability in both the predictor and the 
criterion. 

Because the predictor profiles were based 
on either verbal or nonverbal items, two dif- 
ferent desirability vectors were used for this 
analysis; the social desirability values of the 
scales as given in the PRF manual (Jackson, 
1974, p. 12) were partialed from the verbally 
based profiles and the criterion profiles. The 
corresponding values for the nonverbal scales 
were obtained from the social desirability 
ratings of the nonverbal items. This vector 
was partialed from the judgmental profiles 
based on those items and the criterion pro- 
files. The correlation between the two social 
desirability vectors is .82. Group sensitivities 
corrected for desirability are presented as 
parenthesized values in Table 2. A comparison 
with the uncorrected values reveals that there 
is little if any evidence that any of these 
values decline as a function of eliminating de- 
sirability. In fact, there is some evidence that 
desirability is acting as a suppressor variable 
(Conger & Jackson, 1972); when its effects 
are removed, accuracy in predicting the cri- 
terion profile is slightly increased. 
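
Removing one covariate from both predictor and criterion amounts to a first-order partial correlation. The sketch below uses that standard formula as a stand-in for the "standard partial regression techniques" mentioned above; it is an illustration, not the authors' computation, and the input correlations are hypothetical.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with z partialed from both.

    x: judged (predictor) profile, y: criterion profile, z: desirability vector.
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values: desirability correlates with the judged profile (.40)
# but not with the criterion (.00), the classic suppressor pattern.
uncorrected = 0.73
corrected = partial_correlation(uncorrected, 0.40, 0.00)
print(round(corrected, 2))
```

When desirability is related to the judged profile but not to the criterion, partialing it out raises the correlation, which is the suppressor pattern described in the text.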

Table 2 
Correlations Between Group Mean Profiles 
and Criterion Profiles, by Condition 

                           Rating scale 
Target description      Nonverbal     Verbal 

Ed Nolan 
  Order 1 
    Nonverbal           .73 (.80)    .93 (.94) 
    Verbal              .73 (.79)    .84 (.85) 
  Order 2 
    Nonverbal           .85 (.91)    .95 (.95) 
    Verbal              .76 (.77)    .83 (.83) 

John Newman 
  Order 1 
    Nonverbal           .77 (.78)    .84 (.84) 
    Verbal              .58 (.61)    .81 (.81) 
  Order 2 
    Nonverbal           .75 (.77)    .84 (.84) 
    Verbal              .63 (.67)    .83 (.82) 

Note. Correlations with social desirability partialed 
out are in parentheses. 


After the effects of desirability had been 
removed from the correlations between the 
predictor and criterion profiles, estimated 
population base rates were partialed from 
these correlations. The vector of PRF Form E 
mean scale scores for males (Jackson, 1974, p. 
42) was partialed from the correlations with 
the verbal rating scales. The mean change in 
group sensitivity is .01. A vector of mean 
scale scores derived from the self-reports of 
128 male respondents to a set of 202 nonverbal 
items, of which the nonverbal rating 
scale items used in this study formed a subset, 
was partialed from the correlations with 
the nonverbal rating scales. A mean change 
in group sensitivity of .01 is also observed in 
this analysis, indicating that the prediction 
of population endorsement frequencies was 
not a large component of the subjects’ judgments 
of either target. 


Evaluating the Structure of Verbal 
and Nonverbal Cues 


A multivariate analysis of variance 
(manova) was used to evaluate the effect of 
the factorial manipulations on the 18 component 
profiles of trait ratings. The analysis 
(a multivariate split-plot factorial with rating 
scale type as the within-block treatment and 
description type and order of ratings as the 
between-block treatments) was run separately 
for the ratings of the target John Newman 
and the target Ed Nolan. The mean profiles 
for the various conditions are presented in 
Table 3. 


Unrotated Principal Components Factor Analysis Solutions of Matrices 
of Group Mean Profile Intercorrelations 

                                          Target: Ed Nolan      Target: John Newman 
                                               Factor                  Factor 
    Target 
    description  Order  Rating scale      1      2      3        1      2      3 

1   nonverbal    1      nonverbal        .94   -.33    .03      .94   -.23    .24 
2   verbal       1      nonverbal        .95   -.30   -.01      .91   -.37   -.15 
3   nonverbal    2      nonverbal        .96   -.17    .16      .95   -.24    .15 
4   verbal       2      nonverbal        .95   -.24   -.12      .93   -.32   -.15 
5   nonverbal    1      verbal           .94    .28    .17      .95    .26    .10 
6   verbal       1      verbal           .96    .16   -.14      .96    .23   -.14 
7   nonverbal    2      verbal           .?1    .34    .19      .93    .35    .08 
8   verbal       2      verbal           .?1    .28   -.30      .94    .31   -.12 

% of variance                           88.1    7.3    2.7     88.1    8.6    2.2 

The MANOVA revealed the following effects
for the ratings of John Newman (using Wilks’
lambda criterion). Description type and rat-
ing scale type were both significant beyond
the .001 level, approximate F(18, 59) was
3.25 and 29.9, respectively, and there was a
significant Description Type × Rating Scale
interaction, F(18, 59) = 2.7, p < .003. The
same effects were replicated with the Ed
Nolan target, F(18, 59) = 2.7, p < .003;
F(18, 59) = 19.0, p < .001; F(18, 59) = 4.8,
p < .001, respectively, in addition to a sig-
nificant order effect, F(18, 59) = 1.9, p <
.04.

Despite the above differences among the 
mean vectors, an examination of the intercor- 
relations of these mean profiles of Table 3 
shows a substantial degree of consensus for 
all experimental groups judging the same 
target. The within-target values all exceed
.70, with the mean for the target John New-
man being .87 and that for the Ed Nolan
target being .86. The mean heterotarget cor-
relation, however, was only -.02.

A principal components factor analysis was 
undertaken to seek to identify the relative 
contributions of the descriptive attributes of 
the targets and the experimental manipula- 
tions on the judgments. Components were ex- 
tracted from the intercorrelation matrix of 
group mean profiles for each target. In this 
analysis, the experimental conditions were 


treated as variables and the trait dimensions 
as replications. For both targets, the largest 
components in Table 4 show a pattern of high 
positive loadings, accounting for more than 
88% of the total variance. Since the primary 
common element running through all eight
sets of ratings was the common target, this 
factor may be interpreted as reflecting a con- 
sistent conception of the attributes of the 
target person, cutting across the different 
combinations of verbal and nonverbal sets of 
cues and response formats. The second factor, 
accounting for somewhat less than 9% of the 
variance for each target, is a bipolar dimen- 
sion, clearly separating ratings made on the 
nonverbal personality rating scales from rat- 
ings made on the parallel verbal form. The 
third factor, although very small, consistently 
separates ratings based on verbal and non- 
verbal target descriptions. 
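The percentages of variance cited above follow directly from the loadings: for an unrotated principal components solution, each component's share is the sum of its squared loadings divided by the number of variables (here, the eight experimental conditions). The sketch below checks this for the John Newman solution. The loadings are a reading of Table 4 in which a few garbled entries were restored, so they should be treated as an assumption rather than as verified values.

```python
# Loadings for the John Newman target (rows = conditions 1-8,
# columns = Factors 1-3), as read from Table 4; some entries
# were restored from a damaged scan and are assumed.
NEWMAN_LOADINGS = [
    (0.94, -0.23, 0.24),
    (0.91, -0.37, -0.15),
    (0.95, -0.24, 0.15),
    (0.93, -0.32, -0.15),
    (0.95, 0.26, 0.10),
    (0.96, 0.23, -0.14),
    (0.93, 0.35, 0.08),
    (0.94, 0.31, -0.12),
]


def percent_variance(loadings):
    """Percentage of total variance per component: 100 * sum(loading^2) / n_variables."""
    n_variables = len(loadings)
    n_factors = len(loadings[0])
    return [
        100 * sum(row[k] ** 2 for row in loadings) / n_variables
        for k in range(n_factors)
    ]


print([round(p, 1) for p in percent_variance(NEWMAN_LOADINGS)])
# [88.1, 8.6, 2.2]
```

The computed values agree with the bottom row of the table for this target, which supports the restored loadings.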


Discussion 


The authors recognize certain limitations 
of the present study. Only one type each of 
verbal and nonverbal material was used, mak- 
ing it difficult to substantiate claims of gen-
eralizability to other rating situations. Any
differences in the judgments of verbal and 
nonverbal rating scale items may have been 
the consequence of slightly different instruc- 
tional sets for completing these ratings. Also, 
rating validities might have been different 
had a more objective criterion measure of
actual target behavior been available rather
than that based on self-report data. In spite 
of these limitations, the results clearly demon- 
strated that (a) judges showed extremely high 
degrees of agreement when inferring the be- 
havior of a target from limited information; 
(b) although profile differences were found 
across verbal and nonverbal conditions, the 
profile similarities were more interesting; (c) 
highly similar factor structures were attained 
regarding the perceived patterning of target 
traits across verbal and nonverbal informa- 
tional and rating conditions; (d) the ratings 
of two targets were approximately orthogonal, 
corresponding to the orthogonal patterning 
of target characteristics as measured em- 
pirically; (e) although there was somewhat 
less accuracy regarding the prediction of be- 
havior using pictorial items as compared with 
verbal items, both sets of rating formats 
yielded substantial levels of accuracy. Al-
though it was not assessed in this investiga-
tion, the conclusion follows from Norman and
Goldberg’s (1966) study that even greater
levels of interrater reliability and criterion
validity would have been expected had more
target-relevant information been available to
the raters.

[What implications do these] results have for
[the semantic overlap] hypothesis? The an-
swer to this question is unfortunately com-
plicated by the rather vague manner in which
this hypothesis has been stated. Perhaps the
[...] is its ex[...]
[...] provides no basis for
[...] identical patterning
[...] made with both
[...] target information or
[...] predictions similar to
[...] verbal rating conditions
are made in the total absence of explicit ver-
bal information.

There are alternative weaker forms of the
hypothesis that merit some consideration. 
These may be summarized as follows: (a) 
linguistic factors do not account for all of the 
expected covariation of behavior, but con- 
tribute substantially to such findings; (b) 
connotative attributes of trait descriptions, 
like desirability, account for a major portion 
of the variance in observations of personality; 
or (c) even nonverbal cues elicit common 
linguistic associates in judges that are seman- 
tically related in some form of verbal net- 
work. Ratings of personality made in a con- 
text devoid of any explicit verbal materials 
may be mediated by the implicit use of cog- 
nitively generated verbal labels. 

With regard to the first alternative form of 
the semantic overlap hypothesis, that seman- 
tic processes account for substantial inde- 
pendent variance incrementing that obtain- 
able from nonverbal processes, the data are 
generally nonsupportive. There were no sig- 
nificant differences in mean levels of correla- 
tion with the criterion between verbal and 
nonverbal target descriptions. The differ-
ences in accuracy between verbal and non-
verbal rating scales, though statistically sig-
nificant, were not substantial, a finding
particularly noteworthy in view of the fact
that the comparison was between unselected
nonverbal items and highly selected verbal
items. It is reasonable to assume that the
psychometric characteristics of the PRF items
are superior to those of the nonverbal illus-
trations. These scenarios were amassed with-
out the consideration of important item
properties such as content saturation, fre-
quency of endorsement, and components of
desirability, whereas the verbal items used
had undergone stringent item selection pro-
cedures. In addition, one might expect lan-
guage to provide a richer and more precise
[...] the behavior to be rated. This
[...] is important in appraising the
[...] of trait-oriented rating
[...]. [...] earlier rating scales usually
[...] names or general descriptive phrases,
the PRF items judged in the verbal [...]
conditions were largely de[...] behaviors.
No one has [...] the manner in which
semantic [...] linguistic associative rules
might conceivably apply to ratings of this type.

The second weaker semantic overlap hy-
pothesis is that personality ratings are due to
common connotative meanings such as desir-
ability (evaluation), activity, and potency 
(Mischel, 1968). Two points might be raised 
regarding this interpretation. First, although 
these connotative components of meaning 
have often been studied in the context of 
verbal meaning, the Osgoodian formulation 
(Osgood, 1952) is general and does not de- 
pend on verbal meaning per se. In our study, 
it would be a simple matter for judges im-
plicitly to rate nonverbal stimuli for desir-
ability. Whether verbal or nonverbal, no
formulation is available for linking these con- 
notative item attributes with the type of pro- 
files comprising the empirical criterion for 
accuracy in this study. Second, the connota- 
tive attribute most prominently linked to rat- 
ings of personality, that is, desirability, when 
held constant statistically in our data, demon- 
strated no incremental effect on accuracy. 
Judgmental accuracy could not be attributed 
to a desirability stereotype, nor could it be 
accounted for as the prediction of normative 
item endorsement frequencies, since (a) the 
removal of base rates of endorsement did not 
alter the degree of association between the 
predictions and their respective criteria, and 
(b) the mean ratings for the two targets were 
approximately orthogonal. 

To what extent are the results derived from 
nonverbal cues and nonverbal rating scales 
due to verbal mediation? This is a difficult 
question to answer, simply because such an 
hypothesis refers to covert, unobserved pro- 
cesses in accounting for empirical results. 
There are three relevant points here. First, 
this study addressed itself initially to seman- 
tic overlap between explicit verbal cues in 
ratings of personality. To our knowledge, no
one has attempted to explicate the manner in
which such supposed covert complex processes
might operate. Second, if one might assume
that these verbally mediated processes did
occur, for example, between the processing of
nonverbal personality descriptions and non-
verbal ratings, one would expect some loss in
fidelity in comparison with the situation in
which verbal descriptions of personality
would be rated nonverbally. (Verbal media-
tion would assume that the nonverbal cues
would have to be coded verbally and entered
into a semantic space, with a verbal person-
ality inference made and this information 
transformed into a nonverbal rating. From 
this perspective, going from verbal cues to 
nonverbal ratings would be less subject to 
error, since access to the semantic space 
would be more direct.) Contrary to expecta- 
tions from a verbal mediation hypothesis, 
accuracy is not attenuated going from non- 
verbal cues to nonverbal ratings. Similarly, 
going from nonverbal cues to verbal ratings 
is not less accurate than going from verbal 
cues to verbal ratings, contrary to expecta- 
tions based on a verbal mediation hypothesis. 

The third point to consider with regard to 
this question of verbal mediation concerns 
the lack of an effect for the order manipula- 
tion. It might be argued that if verbal media- 
tion were to occur, the intervention of verbal 
material subsequent to the study of the non- 
verbal target description but prior to making 
the nonverbal judgments would expedite this 
cognitive process, thereby intensifying accu- 
racy and consistency in the predictions of 
behavior. There are two reasons to expect 
greater fidelity on the nonverbal rating scales 
after, rather than before, the verbal rating 
scales have been filled out: (a) the verbal 
items would elicit covert verbal labels not 
evoked by the preceding nonverbal target 
description nor evident in the subsequent 
nonverbal illustrations; and (b) the process 
above would happen to a certain degree in all 
the subjects receiving common verbal mate- 
rial prior to their nonverbal judgments on 
the nonverbal target description (ie. sub- 
jects in the second order conditions). Since 
the explicit labels and terms contained in the 
verbal rating scales would be constant for all 
subjects (whereas cognitively generated labels 
might not be), this would mean that the 
intervention of verbal material would serve 
to unify, across raters, the covert labels upon 
which the trait attributions may be predi- 
cated. In short, one would expect the intro- 
duction of explicit verbal material in a 
nonverbal rating context to have an incre- 
mental effect on inferential sensitivity under 
the verbal mediation hypothesis. However,
there was no effect of order of ratings on
rating accuracy or consistency.

Although it is clear that a verbal mediation 
hypothesis cannot readily account for our
major findings, we do not wish to imply that
verbal processes are nonexistent in judgments
of personality. Most natural languages, in-
cluding English, are rich with trait descrip-
tive terms. It would be silly to suppose that
people did not sometimes use them. Clearly,
research is needed on the conditions under
which verbal processes contribute to judg-
ments of others. Although such research may
well demonstrate their presence, the present 
findings support the view that verbal in- 
formation is neither necessary nor sufficient 
to account for major findings regarding the 
structure of perceived personality relation- 
ships. 

In reconciling the data from this study 
with previous findings of the structure of 
personality ratings, one might extend the 
notion of a parallelism between verbal and 
nonverbal judgmental processes, which was 
originally developed to account for findings 
in the area of cognitive psychology (Paivio, 
1975), to the area of person perception.
Such a notion, which obviates the concept of
verbal mediation, requires the acceptance of
two networks of associatively linked entities,
one being a network of verbal descriptors,
and the other, a network of imaginal en-
tities. Presumably, personality inferences
would be based on either or both cognitive
systems depending on the input modality,
contextual cues, and requirements of the
task. This dual coding approach “distin-
guishes between imaginal and verbal cognitive
[...] ... [which are] independent but
[...] systems for encoding,
[...] organization, transformation (i.e.,
[...]) and retrieval of stimulus in[formation]
[...] demonstrated that nonverbal, or
[...] can be characterized by [...]


Conclusions 


From an experimental comparison of the
use of nonverbal and verbal cues in a per-
sonality judgment task, and a reinterpreta-
tion of extant data, we may conclude that

1. Substantially reliable and accurate be-
havioral inferences can be obtained in a
judgmental context devoid of explicit verbal
materials.

2. The structure of judgments based on
nonverbal cues corresponds closely to the
structure of ratings obtained in a situation
where verbal cues preponderate.

3. The data were not consistent with the
interpretation that a process of verbal media-
tion accounted for the similarities between
nonverbal and verbal personality inference.

4. Neither could the major findings of this
study be attributed to the use of a common
desirability stereotype or “generalized other”
in the use of nonverbal and verbal cues. Sta-
tistical control of desirability and response
base rates showed that they had no incre-
mental effect on accuracy.

5. The data support the view that linguis-
tic informational cues, relationships, and rat-
ing scales are neither necessary nor sufficient
to account for observed consistencies in per-
sonality.


SAMPO V. PAUNONEN AND DOUGLAS N. JACKSON


Reference Note 


1. Reed, P. L., & Jackson, D. N. Personality measure-
ment and inferential accuracy (Research Bulletin
No. 419). Unpublished manuscript, The University
of Western Ontario, London, Canada, 1977.


The Role of Category Accessibility in the Interpretation of
Information About Persons: Some Determinants 
and Implications 


Thomas K. Srull and Robert S. Wyer, Jr. 
University of Illinois at Urbana-Champaign 


Many personality trait terms can be thought of as summary labels for broad 
conceptual categories that are used to encode information about an individual's 
behavior into memory. The likelihood that a behavior is encoded in terms of 
a particular trait category is postulated to be a function of the relative accessi- 
bility of that category in memory. In addition, the trait category used to encode 
a particular behavior is thought to affect subsequent judgments of the person 
along dimensions to which it is directly or indirectly related. To test these 
hypotheses, subjects first performed a sentence construction task that activated 
concepts associated with either hostility (Experiment 1) or kindness (Experi-
ment 2). As part of an ostensibly unrelated impression formation experiment, 
subjects later read a description of behaviors that were ambiguous with respect 
to hostility (kindness) and then rated the target person along a variety of trait 
dimensions. Ratings of the target along these dimensions increased with the 
number of times that the test concept had previously been activated in the 
sentence construction task and decreased with the time interval between these
prior activations and presentation of the stimulus information to be encoded.
Results suggest that category accessibility is a major determinant of the way
information is encoded into memory and subsequently used to
make judgments. Implications of this for future research and theory develop-
ment are discussed.


When individuals are asked to judge them-
selves or another person, they are unlikely to
perform an exhaustive search of memory for
all cognitions that have implications for this
judgment. Rather, they are likely to base
their judgment on some subset of these cogni-
tions that is most readily accessible (Tversky
& Kahneman, 1973, 1974). Using quite differ-
ent research methodologies, Carlston (1977)
and Lingle and Ostrom (in press) have both
shown that once a judgment of a stimulus
person has been made on the basis of new
information, this judgment is subsequently
used as a basis for later inferences about the
person independent of the information upon
which the judgment was originally based.
Similarly, Ross and his colleagues (Ross,
Lepper, & Hubbard, 1975; Ross, Lepper, 
Strack, & Steinmetz, 1977) have found that 
once a person has constructed an explanation 
of an event involving himself or another per- 
son, this construction, rather than the infor- 
mation that stimulated it, is used to predict 
the likelihood of future events. Each body of 
research therefore suggests that the most 
easily accessible cognitions about an object or 
event (i.e., those that have been acquired and 
used most recently) have a major influence on
future judgments.

Similar considerations arise when a person
is asked to interpret new information about a
social stimulus. In many instances, the in- 
formation one receives is ambiguous; that is, 
it can be interpreted in more than one way. 
For example, information that someone told
his girl friend that her new hair style is un-
attractive could be interpreted, or encoded,
as either “honest” or “unkind.” Which encod-
ing is actually made may depend upon which
of the two relevant concepts (honest or un-
kind) is most easily accessible at the time
the information is received (cf. Bruner, 1957).
Once the behavior is encoded as an instance
of one of these trait concepts, the implications
of this encoding, rather than those of the
original behavioral information, may be used
as a basis for subsequent judgments about the
person (Carlston, 1977; Higgins, Rholes, &
Jones, 1977). If this is true, judgments of a
person may often be affected substantially by
rather fortuitous events that lead one or 
another concept to be more accessible to the 
judge at the time information about the per- 
son is initially received. 

This paper reports two in a series of studies
designed to investigate the possibility raised
above and to explore its implications. In so
doing, they supplement and extend an earlier
study by Higgins, Rholes, and Jones (1977).
These authors reasoned that if subjects were
required to use trait terms in the course of
performing one task, these terms would be-
come more accessible and therefore more
likely to be used to encode subsequent be-
havioral information about a person in an
unrelated context. To test this hypothesis,
subjects first performed a color-naming task
that required them to remember four trait
terms. In experimental conditions, the traits
were all potentially applicable for encoding a
normatively based set of behavioral descrip-
tions (e.g., thinking about crossing the At-
lantic in a sailboat), whereas in control con-
ditions they were all inapplicable. Moreover,
in some cases the four terms all had positive
evaluative implications (e.g., “adventurous”),
while in other cases they all had negative
evaluative implications (e.g., “reckless”). As
part of a second, ostensibly unrelated experi-
ment, subjects read a story about a stimulus
person that contained these behavioral de-
scriptions. Subjects then both wrote a de-
scription of the target person in their own 
words and estimated how much they liked 
this person. Experimental subjects tended to 
describe the target person with trait terms 
that were the same as or synonymous with 
those to which they had been exposed in the 
color-naming task. In addition, their subse- 
quent ratings of the target person were biased 
in the direction of the evaluative implications 
of these terms. In contrast, control subjects 
who were exposed to trait terms that were 
inapplicable for encoding the target’s behavior 
did not vary systematically in either the 
evaluative implications of their free descrip- 
tions of the target or their ratings of him. 
The findings of Higgins et al. provide in- 
triguing support for the general hypothesis 
that once a concept is activated or “primed” 
as a result of its use for one purpose, its rela- 
tive accessibility is enhanced, and its likeli- 
hood of being used to encode subsequent in- 
formation increases. Moreover, the implica- 
tions of the encoding, rather than those of the 
original stimulus material, are used as a basis 
for later judgments. Several additional ques- 
tions are raised by these results, however. 
First, in the Higgins et al. study, subjects 
were primed with specific trait labels, each of 
which presumably represented a particular 
cognitive category or concept. To the extent 
that trait and behavioral concepts are inter- 
related in memory, however, the accessibility 
of trait concepts may also be increased by 
priming specific behaviors that exemplify 
these trait concepts. In this regard, it may be 
useful to conceptualize the representation of 
person information in memory in terms of 
schemata, or configurations of interconnected 
traits and behaviors at different levels of gen- 
erality (cf. Cantor & Mischel, 1977). Such 
schemata may be hierarchical, with trait 
terms centrally located and behavioral in- 
stances of the traits more peripheral. Once a 
schema is activated, it may then be used to 
interpret and organize subsequent information 
(Bransford & Johnson, 1972; Lingle & Ostrom,
in press). Thus, exposure to behavioral in-
stances of a trait in one context may activate 
a schema associated with this trait, and the 
schema may then serve as a basis for inter-
preting subsequent behavioral information
that is received in other contexts (Wyer & 
Srull, in press). 

This reasoning implies that it is not neces- 
sary to prime the name of a trait in order to 
increase the likelihood that the schema asso- 
ciated with it will be used to interpret subse- 
quent information. Rather, one needs only 
prime behavioral instances of the trait that 
are represented in the schema. It is conceiv- 
able, however, that several such instances 
must be primed in order to activate the 
schema itself. One reason for this is that be- 
haviors are often ambiguous; that is, they 
can be considered instances of several differ- 
ent traits. To this extent, they are less likely 
to be interpreted as representative of any 
given trait when considered in isolation. Thus,
it is reasonable to expect that the likelihood 
of activating a trait schema will increase with 
the number of behavioral instances used to 
prime it. 

Once a trait concept or schema is activated 
as a result of exposure to representative be- 
haviors, its accessibility, and thus its effect 
on the interpretation of subsequent informa-
tion, is likely to decrease over time. This pre-
diction is both intuitively reasonable and
formally derived on the basis of several exist-
ing theoretical formulations. For example,
Wyer and Srull (in press) have d[...]
model of social in[...] [in]
which the accessibili[ty ...]
Wyer and Carlston (1979) have formulated
a model of person memory that also predicts
this decrease. According to this [...],
which in many respects is similar to the
semantic memory model of Collins and Loftus
(1975), the residual “excitation” that remains
at the location of a previously activated trait
concept decreases over time once the concept
is no longer used. As a result, increasingly
greater amounts of excitation are required to
reactivate the concept as time goes on, and
the trait category is less likely to be invoked.*
The two models therefore differ in that the 
first attributes the decrease in accessibility 
to be a function of interference effects (the 
likelihood of which increases over time), while 
the second attributes the decrease to be a 
function of time per se, independent of any 
interference produced by other concepts. 

To summarize, then, the above reasoning 
implies that the accessibility of a trait schema, 
and therefore the likelihood that it will be used
to interpret new information, will increase 
with the number of schema-related behavioral 
concepts that have been activated prior to 
the receipt of this information but will de- 
crease with the length of time between the 
activation of these concepts and acquisition of 
the new information to be interpreted. The
present set of experiments tests these hy-
potheses.

The experiments reported extend upon the
findings of Higgins et al. in one other impor-
tant way. Specifically, Higgins et al. found
that the accessibility of trait terms that were 
inapplicable for encoding the behavioral in- 
formation did not affect subsequent evalua- 
tions of the target person. This suggests that 
priming does not have a direct influence on
subjects’ judgments of the target but affects 
these judgments only through its mediating 
influence on the interpretation of the target’s 
behavior. Once the behavior has been encoded 
in terms of a given trait (e.g., “hostile”), 
however, and an overall impression of the 
target has been formed on the basis of this 
encoding, the target may then be ascribed 
other, evaluatively similar traits (e.g., “un-
intelligent”) that are related only on the basis
of subjects’ implicit personality theories
(Rosenberg & Sedlak, 1972). This possibility
was also explored in the present set of experi-
ments.


* The effects of priming appear to dissipate very
rapidly with the semantic tasks that are typically
used to test the Collins and [Loftus model. Wyer and]
Carlston’s extension of [...] the model does not postulate
[so] rapid a decay of excitation, however. More im-
portant, the stimulus domain and judgment tasks to
which the Wyer and Carlston model is theoretically
applied are quite different from those considered by
Collins and Loftus.


Method 


Overview 


Two experiments were run. The first investigated
the effects of priming concepts related to hostility.
The second was a conceptual replication in which
concepts related to kindness were primed. The proce-
dure used was the same in each experiment. Subjects
first performed a “word comprehension” task in
which they constructed sentences from sets of words.
These sets were constructed so that each sentence
completed would describe a behavior either related
or unrelated to hostility (kindness). The total num-
ber of questionnaire items and the proportion of
items related to the concept being primed were both
systematically varied. Then, as part of a separate
experiment on impression formation, subjects read
a paragraph about a hypothetical target person who
manifested a series of behaviors that were ambiguous
with respect to the trait being primed. Judgments of
both the target person and of individual behaviors
were then analyzed as a function of the length of the
priming questionnaire (30 or 60 items), the propor-
tion of hostile (kind) priming items in the ques-
tionnaire (20% or 80%), and the time interval be-
tween completion of the priming task and the pre-
sentation of the target information (no delay, 1 hour,
or 24 hours).

Ninety-six introductory psychology students (8 in
each experimental condition) participated in Experi-
ment 1 for course credit, and a different group of
96 students participated in Experiment 2.
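The design described above is a 2 (questionnaire length) × 2 (proportion of priming items) × 3 (delay) factorial. As a quick check on the arithmetic, the following Python sketch enumerates the cells and the number of priming items each implies. The variable names are illustrative, and the per-cell counts of priming items are derived here from the stated lengths and proportions rather than quoted from the text.

```python
from itertools import product

QUESTIONNAIRE_LENGTHS = (30, 60)           # total items
PRIMING_PERCENTAGES = (20, 80)             # percent of items related to the trait
DELAYS = ("no delay", "1 hour", "24 hours")
SUBJECTS_PER_CELL = 8

# One record per experimental condition.
cells = [
    {
        "length": length,
        "percent_priming": percent,
        "delay": delay,
        "n_priming_items": length * percent // 100,
    }
    for length, percent, delay in product(
        QUESTIONNAIRE_LENGTHS, PRIMING_PERCENTAGES, DELAYS
    )
]

total_subjects = len(cells) * SUBJECTS_PER_CELL
print(len(cells), "conditions;", total_subjects, "subjects per experiment")
print(sorted({cell["n_priming_items"] for cell in cells}), "priming items per questionnaire")
```

Twelve cells with 8 subjects each give the 96 subjects reported for each experiment, and the four length-by-proportion combinations imply 6, 12, 24, or 48 priming items.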


Selection of Behavioral Descriptions 


Experiment 1. To select behavioral descriptions
that varied both in terms of the hostility they con-
veyed and in terms of the ambiguity of their implica-
tions, 43 subjects who did not participate in the
main experiment were asked to rate a large pool of
individual behaviors along a scale from 0 (“not at
all hostile”) to 10 (“extremely hostile”). From this
pool were selected 5 behaviors that were judged to
convey high hostility (M = 8.08) and 5 that were
judged to convey low hostility (M = .58). In addi-
tion, 10 “ambiguous” behaviors were selected on the
basis of two criteria: first, the mean hostility rating
of each ambiguous behavior (M = 3.99) was lower
than the mean rating of any behavior identified as
hostile and higher than that of any behavior identi-
fied as nonhostile; second, the standard deviation
of ratings for each ambiguous item was greater than
the largest standard deviation of any item in either
of the other two groups (SD = 2.76). The 10 am-
biguous items were randomly divided into two
groups of 5, and each group was then used to con-
struct a vignette describing a hypothetical target
person. In addition, all 20 behaviors were used as
test items to be rated individually in a manner to 
be described. 
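The two criteria for an "ambiguous" behavior can be stated as a simple filter over item statistics. The sketch below is one way to express them in Python; the function name, item labels, and all means and standard deviations are hypothetical and serve only to show how the criteria operate.

```python
def select_ambiguous(candidates, hostile, nonhostile):
    """Return candidate items meeting both ambiguity criteria.

    Each argument maps an item label to a (mean rating, standard deviation) pair.
    Criterion 1: the mean lies below every hostile item's mean and above
                 every nonhostile item's mean.
    Criterion 2: the SD exceeds the largest SD among the hostile and
                 nonhostile items.
    """
    lowest_hostile_mean = min(m for m, _ in hostile.values())
    highest_nonhostile_mean = max(m for m, _ in nonhostile.values())
    largest_other_sd = max(
        sd for _, sd in [*hostile.values(), *nonhostile.values()]
    )
    return [
        label
        for label, (m, sd) in candidates.items()
        if highest_nonhostile_mean < m < lowest_hostile_mean
        and sd > largest_other_sd
    ]


# Hypothetical item statistics on the 0-10 hostility scale.
hostile_items = {"H1": (8.4, 1.6), "H2": (7.9, 2.1)}
nonhostile_items = {"N1": (0.4, 0.9), "N2": (0.8, 1.3)}
candidate_items = {
    "A1": (4.1, 3.0),   # intermediate mean, high disagreement: kept
    "A2": (3.8, 1.8),   # intermediate mean, but raters agree: dropped
    "A3": (7.95, 3.2),  # mean above the least hostile "hostile" item: dropped
    "A4": (3.5, 2.9),   # kept
}

print(select_ambiguous(candidate_items, hostile_items, nonhostile_items))
# ['A1', 'A4']
```

Requiring a large standard deviation as well as an intermediate mean separates behaviors that raters disagree about from behaviors that all raters see as mildly hostile.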

Experiment 2. A procedure similar to that de- 
scribed above was used to select behavioral descrip- 
tions related to kindness. In this case, the mean 
ratings (n = 40) of the items selected as kind, am-
biguous, and unkind were 8.71, 4.24, and .81, respec-
tively. Again, the standard deviation of each am-
biguous item was greater than the largest standard
deviation of any item in the other two groups (SD =
[...]92).


Procedure 


Except where noted, the procedures described were 
identical in both experiments. 

Administration of the priming task. Subjects, who 
were run in groups of four to eight, were greeted 
by a male experimenter. The experimenter introduced
himself as a graduate student and stated that sub- 
jects had not actually been assigned to him but that 
the “real” experimenter had agreed to let him pretest 
a word comprehension test he was trying to develop. 
The exercise was described as a test of how people 
perceive word relationships based on their first 
immediate impressions. The task consisted of a num- 
ber of items adapted from materials developed by 
Costin (1969, 1975). Each item consisted of a set of 
four words, and the subject's task was to underline 
three of the words that would make a complete sen- 
tence. The subject was told to complete each item 
as quickly as possible. Each item listed the words in 
random order and was constructed in such a way 
that the subject could form at least two possible sen- 
tences. In Experiment 1, however, each possible sen- 
tence formed from the hostile priming items (e.g. 
“leg break arm his”) necessarily conveyed hostility 
while each item formed from other (filler) items 
(eg., “her found knew I”) did not. In Experiment 2; 
the filler items were identical, but each possible sen- 
tence formed from the kind priming items (€g. “the 
hug boy kiss”) conveyed kindness. 

Although the effects of priming were expected to
increase with the number of times an instance of the
trait concept was primed, a more diagnostic test of
these effects was constructed by varying both the
total number of items in the questionnaire and the
proportion of those items that were priming items.
Priming effects would then be indicated by significant
effects for both of these manipulations. Moreover, the
proportion variable should have a greater effect when
the total number of questionnaire items is large than
when it is small.

To avoid suspicion, all subjects within any given
experimental session received a questionnaire of the
same length, but the proportion of priming items
was varied. In addition, the particular priming items
used in constructing the questionnaire were varied,
so that pooled over subjects within each condition,
each item occurred the same number of times.

Presentation of stimulus materials. After
completing the priming task, the graduate student thanked
the subjects for helping him. They were then turned
over immediately to the “real” experimenter, told to
return 1 hour later, or told to return 24 hours later.
(In the latter two cases, subjects had previously
been notified through the mail that they were
scheduled for those times as well. The early dismissal from
the first session was attributed to a small mix-up
that prevented one of the planned experiments from
being ready on time.)

The second experimenter (a female) then led
subjects to believe that the scheduled experiment
consisted of three separate and unrelated tasks. The
first task (the only one relevant to the present
article) was described as a task of impression
formation. Subjects were asked to read a short vignette
about a stimulus person (Donald) that described a
series of events occurring during the course of one
afternoon. Two vignettes, serving as stimulus
replications, were constructed for use in each experiment.
Each vignette contained a different set of five
behaviors that were ambiguous with respect to their
implications for the primed trait. These behaviors
were embedded within other information that was
irrelevant to the trait. For example, one of the
vignettes used in Experiment 1, which described
behaviors that were ambiguous with respect to
hostility, was the following.


I ran into my old acquaintance Donald the other
day, and I decided to go over and visit him, since
by coincidence we took our vacations at the same
time. Soon after I arrived, [...] the door, but
Donald refused to let him enter. He also told [...]
rent until [...] We talked for a while, had lunch,
and then went out for a ride. We used my car, since
Donald’s [...] looking for, so we left and walked
a few blocks to another store. The Red Cross had
set up a stand by the door and asked us to donate
blood. Donald [...] diabetes and therefore could
not give blood. It's funny that I hadn't noticed it
before, but when we got to the store, [...] that it
had gone out of business. It was getting kind of
late, so I took Donald to pick up his car and we
agreed to meet again as soon as possible.


Similar vignettes that were ambiguous with respect
to kindness were used in Experiment 2.

In each case, subjects were asked after reading 
the vignettes to form an impression of the person 
described and then rate him along a series of trait 
dimensions, six of which (hostile, unfriendly, dis- 
likable, kind, considerate, and thoughtful) were as- 
sumed to imply either a high or a low degree of 
hostility/kindness, and the others of which (boring, 
selfish, narrow-minded, dependable, interesting, and 
intelligent) were expected to be evaluatively loaded 
but descriptively unrelated to either hostility or kind- 
ness. These ratings were made along a scale from 0 
(“not at all”) to 10 (“extremely”) and in each case 
included reverse scoring on half the items. 

Ratings of individual behaviors. After rating the
target person, subjects rated the hostility (in
Experiment 1) or kindness (in Experiment 2) conveyed by
each of the 20 individual behaviors selected on the
basis of the pretest data described earlier. These
ratings were made along a scale from 0 (“not at all
hostile/kind”) to 10 (“extremely hostile/kind”).

Ratings of trait co-occurrence. Finally, subjects 
were asked to estimate the co-occurrence of hostility 
(in Experiment 1) and kindness (in Experiment 2) 
with each of the other 11 traits. Items were of the 
form “If a person is hostile [kind], how likely is it 
that he is ____?” and were rated along a scale
from 0 (“not at all”) to 10 (“extremely”). 

Postexperimental data. To check on the extent to
which subjects might have had insight into the
objectives of the experiment despite the several
precautions taken to dissociate the priming questionnaire
from the impression-formation task, subjects in
Experiment 2 were asked to indicate which of the four
tasks they performed during the experimental
session(s) were most likely to be related to the same
hypothesis. Following this question, they were also
asked to indicate any other tasks they thought might
be related.


Collection of Normative Data 


To facilitate the interpretation of the expected
priming effects, normative data were collected on the
20 behaviors used in each experiment. To avoid
possible context effects associated with the large pool of
behaviors originally tested, different groups of
subjects rated the amount of hostility conveyed by each
behavior used in Experiment 1 (n = 28) and the
amount of kindness conveyed by each behavior used
in Experiment 2 (n = 34) on scales ranging from 0
(“not at all hostile/kind”) to 10 (“extremely
hostile/kind”). These ratings were all made under neutral
testing conditions and were later used for
comparative purposes.


Results 
Experiment 1 
Figure 1. Mean ratings of target person along traits (A) descriptively and (B) evaluatively
related to hostility as a function of questionnaire length, proportion of hostile priming items, and
delay.

Preliminary analysis. The traits hostile,
unfriendly, dislikable, kind, considerate, and
thoughtful were assumed to be denotatively
related to the trait hostile. Ratings of the
estimated co-occurrence of these traits with
hostility, which ranged (after reverse scoring on
the last three) from 8.79 to 8.98, indicate that
they were all thought to covary with hostility
to a substantial degree. Thus the six traits
were summed across to produce a single index
of the perceived hostility of the target.
Similarly, ratings of the remaining six traits were
summed after appropriate reverse scoring to
provide a single index of the unfavorableness
of the target along dimensions that are
evaluatively loaded but not descriptively related
to hostility, as evidenced by co-occurrence
ratings ranging from 5.52 to 6.45.
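
As a rough sketch of how such an index could be formed, the snippet below reverse scores the three favorable traits on the 0 to 10 scale and sums the six ratings. The function name and the example ratings are hypothetical. The per-item mean is included only because the condition means reported below fall on the 0 to 10 scale; whether the authors rescaled the sum in this way is an assumption, not something the text states.

```python
SCALE_MAX = 10  # ratings were made on a scale from 0 to 10

# Traits rated in the hostile direction, and traits that must be reversed.
DIRECT = ("hostile", "unfriendly", "dislikable")
REVERSED = ("kind", "considerate", "thoughtful")


def hostility_index(ratings):
    """Sum six trait ratings after reverse scoring the favorable traits.

    Returns (sum, per-item mean). The per-item mean is an assumed
    convenience for comparing against a 0-10 scale.
    """
    total = sum(ratings[t] for t in DIRECT)
    total += sum(SCALE_MAX - ratings[t] for t in REVERSED)
    return total, total / (len(DIRECT) + len(REVERSED))


# Hypothetical ratings from one subject.
example = {
    "hostile": 8, "unfriendly": 7, "dislikable": 6,
    "kind": 2, "considerate": 3, "thoughtful": 1,
}
print(hostility_index(example))  # (45, 7.5)
```

The same pattern, with the roles of the two trait sets exchanged, would give a kindness index for Experiment 2.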

Ratings of target person. Mean ratings of
the target along both hostility-related
dimensions and evaluative dimensions not directly
related to hostility are shown in Figure 1 as
a function of the length of the priming
questionnaire, the proportion of hostile priming
items, and the delay between completion of
the priming task and presentation of the
stimulus materials. Analyses as a function of these
variables and the stimulus replication are
relevant to several hypotheses. First, ratings
of the target along both sets of dimensions
were expected to increase with the number of
times hostility-related concepts had previously
been activated. Support for this can be seen
in Figure 1, which shows that ratings of the
target increased monotonically with the
number of hostility-related items contained in the
questionnaire. The hypothesis is supported
statistically by significant main effects of both
questionnaire length, F(1, 72) = 123.74,
p < .001, and the proportion of hostile priming
items, F(1, 72) = 590.67, p < .001. If the
effect of priming is a linear function of the
number of times hostility was previously
primed, the effect of proportion should be
greater when the questionnaire is long than
when it is short. While the interaction of
proportion and questionnaire length was not
significant (F < 1), the pattern of results is
consistent with predictions; specifically, when
collapsed over delay conditions, the difference
in the mean trait ratings of the target between
low and high proportion lists was greater for
the long questionnaire (2.70) than for the
short questionnaire (2.51).

Figure 2. Mean ratings of (A) hostile, (B) ambiguous, and (C) nonhostile behaviors as a function
of questionnaire length, proportion of hostile priming items, and delay.

Second, the effect of priming was expected
to decrease with the time interval between the
priming task and presentation of the stimulus
information to be encoded. The data shown in
Figure 1 clearly support this hypothesis,
which is tested statistically by the main effect
of delay interval, F(2, 72) = 133.32,
p < .001.* Figure 1 also suggests that the
magnitude of the decrease in priming effects over
time is a positive function of the number of
times the category was initially activated
(i.e., the number of relevant priming items in
the questionnaire). Only the interaction of
delay and proportion of priming items was
statistically significant, F(2, 72) = 8.30,
p < .001, indicating that the effect of delay
increased with the proportion of priming items
in the initial questionnaire. Since the number
of relevant items presented differed more as a
function of proportion than as a function of
overall questionnaire length, the greater
contingency of delay on the former variable is
not surprising.
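
The arithmetic behind this last point can be made explicit with a short sketch. The questionnaire lengths of 30 and 60 items are an inference from the statement elsewhere in the text that 20% of the items corresponds to 6 or 12 priming items; the variable and function names are hypothetical.

```python
from itertools import product

LENGTHS = (30, 60)        # assumed: inferred from "20% (6 or 12)"
PROPORTIONS = (0.2, 0.8)  # low and high proportion of priming items


def priming_counts():
    """Number of priming items in each length x proportion cell."""
    return {(n, p): round(n * p) for n, p in product(LENGTHS, PROPORTIONS)}


counts = priming_counts()

# Change in priming items due to proportion, at each questionnaire length.
by_proportion = {n: counts[(n, 0.8)] - counts[(n, 0.2)] for n in LENGTHS}

# Change in priming items due to length, at each proportion.
by_length = {p: counts[(60, p)] - counts[(30, p)] for p in PROPORTIONS}

print(counts)         # {(30, 0.2): 6, (30, 0.8): 24, (60, 0.2): 12, (60, 0.8): 48}
print(by_proportion)  # {30: 18, 60: 36}
print(by_length)      # {0.2: 6, 0.8: 24}
```

Under these assumed lengths, moving from the low to the high proportion adds 18 or 36 priming items, whereas doubling the questionnaire adds only 6 or 24, which is consistent with delay interacting more strongly with proportion than with length. The same counts show why a strictly linear effect of the number of primes would predict a larger proportion effect for the long questionnaire (36 additional items) than for the short one (18).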

Finally, priming effects on judgments of
hostility were expected to generalize to ratings
along dimensions that are evaluative but
related to hostility only indirectly through
subjects’ implicit personality theories. Data in
the right panel of Figure 1 clearly support
this hypothesis. However, these [...]
supported by [...] type with both
delay, F(2, 72) = 5.13, p < .01, and the
proportion of hostile priming items, F(1, 72) =
5.70, p < .05. Specifically, the effect of delay
interval on judgments of the target along
descriptively related dimensions was greater
(M = 7.65, 6.94, and 5.28 in the immediate,
1-hour, and 24-hour conditions, respectively)
than its effect on judgments along evaluative
but not descriptively related dimensions
(M = 5.77, 5.00, and 3.90, respectively).
However, the proportion of hostile priming
items had less effect on hostility-related
judgments (M = 5.41 and 7.83 under 20% and
80% conditions) than on evaluative
judgments along other dimensions (M = 3.49 and
6.29, respectively).

Ratings of individual behaviors. The mean
ratings of hostile, ambiguous, and nonhostile
behaviors are plotted in Figure 2 as a function
of experimental variables. Analyses of these
data yielded an obviously significant effect of
behavior type, along with the predicted main
effects of delay, proportion of hostile priming
items, and length of the priming
questionnaire (in each case, p < .001). By far the
greatest effects occurred on ratings of
ambiguous behaviors, as evidenced by significant
interactions of behavior type with
questionnaire length, F(2, 144) = 65.64, p < .001;
proportion of hostile priming items, F(2, 144)
= 352.95, p < .001; and delay, F(4, 144) =
223.69, p < .001. The 4-way interaction
among these variables was also significant,
F(4, 144) = 3.38, p < .02. This reflects the
fact that the interaction among delay, number
of priming items, and proportion of critical
priming items was greater for the ambiguous
behaviors than for the other two behavior
types, F(2, 72) = 4.79, p < .02. In fact, this
component accounts for 73.8% of the total
interaction sums of squares.

* It should be noted that all of the results reported
in this paper are based on analyses of variance that
assume homogeneity of treatment-difference
variances. The small positive bias that results when this
assumption is not completely satisfied (see, e.g.,
Huynh & Feldt, 1970) would appear insignificant in
relation to the general strength of the results.

The differences in ratings under the
24-hour delay conditions are sufficient to justify
the conclusion that the priming task
influenced judgments even after a fairly long time
interval had elapsed. However, they do not
indicate in an absolute sense whether the
delayed ratings were positively affected by
priming under all conditions. Evidence bearing on
this question is provided by a comparison of
these ratings with normative ratings of the
behaviors made by subjects who were not
exposed to the priming task. These normative
ratings are also presented in Figure 2.
Unfortunately, the comparisons are not easy to
interpret. Subjects who completed priming
questionnaires in which only 20% (6 or 12)
of the items were hostility-related made less
hostile ratings of both the hostile and
ambiguous behaviors than subjects who received
no priming at all. Taken at face value, this
suggests that priming under these conditions
had a negative effect after a delay of 24 hours.
However, since comparable results did not
obtain in Experiment 2 (see below) and such
negative effects are difficult to account for
theoretically, such a conclusion must be
treated very cautiously pending replication.
(Indeed, it seems more reasonable to attribute
the finding to spuriously high normative
ratings of the behaviors than to negative effects
of priming.)


Experiment 2 


Preliminary analysis. The traits consider- 
ate, thoughtful, hostile, unfriendly, and dis- 
likable were assumed to be descriptively
related to the trait kind. Mean estimates of the
co-occurrence of these traits with kindness 
(after reverse scoring on the last three) 
ranged from 8.54 to 8.85, indicating that they 
were all thought to covary with kindness to a 
substantial degree. Ratings of the six traits 
were therefore summed across to produce a 
single index of the perceived kindness of the 
target. Ratings of the remaining six traits 
were also summed after appropriate reverse 
scoring to provide a single index of the per- 
ceived favorableness of the target along di- 
mensions that are evaluatively but not de- 
scriptively related to kindness. Ratings of the 
estimated co-occurrence of these traits with 
kindness ranged from 5.69 to 6.24. 

Ratings of target person. Mean ratings of 
the target person along dimensions both de- 
scriptively and evaluatively related to kind- 
ness are plotted in Figure 3 as a function of 
experimental variables. These effects are sim- 
ilar in most respects to those obtained in Ex- 
periment 1. The hypothesis that ratings would 
increase monotonically with the number of 
times concepts related to kindness were pre- 
viously activated is supported by main effects 
for both questionnaire length, F(1, 72) = 
35.62, p < .001, and the proportion of kind 
priming items in the questionnaire, F(1, 72) 
= 158.67, p < .001. Moreover, the effect of 
proportion was significantly greater when the 
questionnaire was long (M = 4.54 and 6.05 
under 20% and 80% conditions) than when 
it was short (M = 4.24 and 5.18,
respectively), F(1, 72) = 8.85, p < .01.

The hypothesis that priming effects would 
decrease over the time interval between the 
priming task and stimulus presentations was 
again strongly supported, F(2, 72) = 47.79, 
p < .001. Moreover, the magnitude of this 
decrease was greater when the proportion 
of kind priming items in the questionnaire 
was high than when it was low, F(2, 72) = 
19.66, p < .01, and greater when the
questionnaire was long than when it was short,
F(2, 72) = 5.98, p < .01. These findings
indicate that the effect of delay is a positive
function of the number of times that kind- 
ness was initially primed. 

Finally, the priming manipulations had 
very similar effects on ratings of both di- 
mensions that are descriptively related to 
kindness and those that were evaluative but
descriptively unrelated. This again supports 
the hypothesis that once the target’s behavior 
is encoded in terms of a trait, it will also be 
assigned other characteristics that are evalu- 
atively associated with this trait. However, 
the effect of delay on ratings of descriptively 
related dimensions (M = 5.94, 5.15, and 4.54 
in the immediate, 1-hour, and 24-hour condi- 
tions, respectively) was greater than for 
ratings of evaluatively related dimensions 
(M = 5.23, 4.88, and 4.29, respectively), 
F(2, 72) = 3.61, p < .05. Moreover, the ef- 
fect of proportion was greater for descrip- 
tively related judgments (M = 4.55 and 5.87 
under 20% and 80% conditions) than for 
evaluative judgments along other dimensions 
(M = 4.23 and 5.36, respectively). However, 
this difference was not reliable, F(1, 72) = 
1.54, ns. 

There are two related differences between
these data and those obtained in the first
experiment. First, the delay interval in
Experiment 1 had an appreciable effect at all
combinations of questionnaire length and
proportion of critical priming items, whereas
the effect of delay in the present experiment
was negligible when the proportion of
priming items was low, F(2, 72) = 1.68, ns.
Second, the effect of the two priming variables
after a 24-hour delay was pronounced in the
first experiment, but was much less so in
this study. In fact, neither questionnaire
length (F < 1) nor proportion of kind
priming items, F(1, 72) = 3.12, ns, had reliable
effects after such a delay. In sum, these data
suggest that a greater number of priming
items was necessary to increase the
accessibility of a schema associated with kindness
than was required to increase the
accessibility of a schema associated with hostility.
Moreover, the accessibility of kindness-related
concepts appears to decrease more
rapidly over time than does the accessibility
of concepts related to hostility.

Figure 3. [...] proportion of kind priming items, and delay.

Figure 4. Mean ratings of (A) kind, (B) ambiguous, and (C) unkind behaviors as a function of
questionnaire length, proportion of kind priming items, and delay.

Ratings of individual behaviors. Mean
ratings of individual behaviors designated
as kind, ambiguous, and unkind on the basis
of normative data are shown in Figure 4 as
a function of experimental variables. The
effects of these variables are generally similar
to their effects on ratings of the target
person; that is, the estimated kindness of all
three types of behaviors increased with both
questionnaire length and the proportion of
kind priming items contained in the
questionnaire, each F(1, 72) > 43.60, p < .001,
while decreasing as a function of the time
interval between the priming task and
making these estimates, F(2, 72) = 51.73,
p < .001. Moreover, the effects of the time delay
increased with both questionnaire length,
F(2, 72) = 6.75, p < .01, and the
proportion of kind priming items, F(2, 72) = 25.75,
p < .001. As in the case of target ratings,
priming had virtually no effect when only
6 or 12 kindness-related items were involved,
and the effect of priming after a 24-hour
delay was negligible.

In addition to the above effects, there was
an interaction between behavior type and
the proportion of kind priming items,
F(2, 144) = 10.53, p < .001. This interaction
reflects the fact that the proportion variable
had a greater effect on ratings of ambiguous
items than on unambiguous ones, F(1, 72) =
14.42, p < .001. This component accounts for
74.5% of the total interaction sums of
squares. Finally, there was also an
interaction among behavior type, proportion of
kind priming items, and delay, F(4, 144) =
2.44, p < .05, which is again attributable to
the greater effect of experimental variables
on ratings of ambiguous behaviors than on
ratings of unambiguously kind or unkind
behaviors, F(2, 72) = 3.31, p < .05. This
component accounts for 73.9% of the total
interaction sums of squares.

The mean normative ratings of each type
of behavior under neutral conditions are also
shown in Figure 4 for comparison. These
ratings, unlike the corresponding ones in
Experiment 1 (see Figure 2), are invariably
below subjects’ ratings in any one of the
priming conditions. Thus, each level of
priming increased the perceived kindness of all
three types of behaviors. However, this effect
was much less pronounced after a 24-hour
delay than after either of the shorter delays.

Supplementary analysis. Despite the elab- 
orate precautions taken to separate the prim- 
ing and experimental tasks, a postexperi- 
mental questionnaire was administered to 
determine whether subjects had insight into 
their actual relatedness. Two findings sug- 
gest that subjects were not complying with 
any implicit demand characteristics. First, 
only 5 of 96 subjects thought the first (prim- 
ing) task was related to any of the other 
three tasks performed during the course of 
the experimental session(s). Moreover, only
1 of these 5 connected the priming task to 
the subsequent impression-formation experi- 
ment. Thus, subjects were just as likely to 
relate the priming task to an objectively ir- 
relevant experiment as to the ratings of ac- 
tual concern. Since each of the four tasks was 
highly dissimilar, subjects appeared to be 
guessing randomly and to have no insight 
at all into the relationship of the two tasks. 


Discussion 


The two experiments reported in this paper
are consistent in their implications for the
processing of information about persons.
Specifically, once a trait concept or schema is
made more accessible by previous cognitive
activity, the likelihood that the same schema
will be used to encode new information is
increased. The accessibility of these concepts,
and therefore the likelihood that they are
subsequently used, increases with the
number of times that instances of them have been
activated in the past. Moreover, although
these effects decrease with the time interval
between their activation and the acquisition
of information to be interpreted or encoded,
they are sometimes detectable even after 24
hours. In addition, the effect of category
accessibility on the encoding of behavioral
information is much more pronounced when
the implications of this information are
relatively ambiguous.

Finally, once behavioral information is
encoded, these encodings affect judgments of
the person who manifested the behavior with
respect to both the trait originally primed
and other traits that are related to it only
indirectly through subjects’ implicit
personality theories. The fact that these effects were
typically less on judgments of the latter than
of the former traits suggests that this
generalizability is not simply due to a halo effect
produced by exposure to “good” or “bad”
concepts on the priming task. Moreover, this
generalization [...] information about [...] of the
person being judged. [...] Higgins et al. (1977)
found [...] coding was not obtained [...].
Such evidence is [...]
who found that only the priming of trait
terms that were potentially applicable for
describing the target’s behavior affected
subjects’ later characterizations of him. Thus,
primed traits that were evaluatively [...]
but inapplicable for describing the target’s
behavior had no effect. While similar data
were not collected in the present study, the
paradigm used is sufficiently similar to
suggest tentatively that similar processes were
operating.

At least two other approaches could be
used to examine directly the mediating role
of category accessibility and the conditions
under which it affects encoding. First,
reaction time procedures may be useful. For
example, once a particular trait concept such
as hostile is primed and is thus theoretically
made more accessible, subjects should encode
information with hostile implications more
quickly, as well as be capable of making
judgments based upon it more rapidly.
Second, our analysis has assumed that encoding
effects occur at the time the stimulus
information is received. If this is true, priming
trait concepts after presenting behavioral
information about a target (i.e., after the
information has already been encoded) should
not have any effect on subsequent judgments
of the person. Evidence that priming
produces no effect under these conditions would
provide clear evidence that priming effects
are in fact mediated by the encoding of
behavioral information.

The difference between the ultimate
duration of priming effects obtained in this type
of situation and those typically obtained
with semantic and lexical [...]
tasks is striking. For example, the effects of
priming a familiar noun category (e.g.,
“bread”) on [...]
to other concepts that are closely associated
with it (e.g., [...]
in a matter of [...]
see Schvaneveldt & Meyer, 1973). This
suggests [...] experiments reported here. It also
suggests that an “interference” [...]
postulated by Wyer and Srull (in press) may
be more appropriate for describing the
processing of complex social stimulus
information than the spread [...]
and other situational conditions provided a
variety of relatively novel contextual cues
that were rich enough to “reprime” the trait
categories originally made salient. This
possibility should also be explored in future
research.

The generalizability of the main findings
over experiments suggests that these effects
are not unique to particular trait concepts
or to traits at a given level of favorableness.
The amount of priming required to activate
[...] however. In the first experiment, as few as six
instances of hostile behavior were sufficient
to activate the schema related to this trait
and thus to affect subsequent judgments.
However, many more instances of kindness
were apparently required to increase the
accessibility of a kindness-related schema. It
is possible that this difference is due to
unique characteristics of the two traits or
the particular set of priming behaviors used.
However, to the extent that these materials
are representative of favorable and
unfavorable traits, it would appear that favorable
trait concepts are generally more difficult to
activate using the present procedures.
Impression formation research (e.g., Birnbaum,
1974; Wyer & Hinkle, 1976) has consistently
shown that favorable information has less
influence on judgments than unfavorable
information does. One reason is that favorable
information is typically more ambiguous (for
direct evidence of this, see Wyer, 1974). In
the present case, since favorable behaviors
are socially desirable, instances of these
behaviors may be considered less indicative of
traits to which they correspond (cf. Jones
& Davis, 1965). Therefore, when considered
in isolation, these behaviors are less likely to
activate a particular trait schema. This
possibility may deserve further investigation
using a broader sample of trait concepts than
those considered in the present studies.

While the effects of manipulations of
category accessibility decrease over the interval
between the priming task and presentation
of the information to be encoded, they
appear to increase with the time interval
between the presentation of this information
and subsequent judgments (Higgins et al.,
1977). These two effects should be
distinguished. The likelihood that a trait category
is accessed and used to interpret information
should decrease over the time period since
it was most recently activated, for reasons
noted earlier. Different considerations arise
in explaining the effect of a delay between
the encoding of information and judgments.
Immediately after exposure to the stimulus
information, both the raw information and
the encoded representation of it are easily
accessible. To the extent that the encoding
does not capture all the implications of the
original information for the judgment to be
made, the judgment may be based on a
composite of both. However, the accessibility of
the original input material may decrease more
rapidly over time than the encoded
representation of it, producing the increased effect
of encoding reported by Higgins et al.
Additional evidence of increased effects of
encoding over time has been reported by
Carlston (1977), and an investigation of
comparable effects within the present
paradigm may also be worthwhile.

A final issue to be considered concerns the
way in which various target persons will be
differentially affected by prior activations of
a particular trait schema. The interpretation
of information about persons is obviously
an overdetermined process, and the
accessibility of a particular trait schema is likely
to be only one of several determinants. Wyer
and Srull (in press) have proposed that
schematic representations of specific individuals
are built up on the basis of repeated
experiences. In this regard, the effects of
increased accessibility of a particular trait
schema on the encoding of new information
may be an inverse function of the amount
of information already known about the
target person. Thus priming effects may be most
pronounced when the target person is
previously unknown.
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‘Negative 


and Positive Components of Psychological Masculinity 


and Femininity and Their Relationships to Self-Reports of 
Neurotic and Acting Out Behaviors 


Janet T. Spence, Robert L. Helmreich, and Carole K. Holahan 
The University of Texas at Austin 


Negatively valued masculinity (M-) and femininity (F-) personality scales
were developed to supplement the positively valued Masculinity (M+) and
Femininity (F+) scales of the Personal Attributes Questionnaire (PAQ; Spence
& Helmreich). M- consisted of traits that had been judged to be (a) more
typical of males than females, (b) undesirable in both sexes, and (c) agentic
or instrumental in content. Two F- scales were developed, both containing
stereotypically feminine, undesirable traits, one set of traits referring to com-
munionlike characteristics (Fc-) and the other to verbal passive-aggressive
qualities (Fva-). Significant sex differences in the predicted direction were found
on all scales. In both sexes, low and typically nonsignificant correlations were
[...] masculinity and femininity. Scores on a self-esteem measure were positively
correlated with M+ and F+, uncorrelated with M-, and negatively correlated with the
F- scales. Different patterns of scores were associated with two types of prob-
lem behaviors. In both sexes, neuroticism was most highly correlated (in a
negative direction) with M+, and acting out behavior was most strongly cor-
related (in a positive direction) with M-. The next highest correlation in both
instances was with Fva-.


Studies of trait stereotypes (e.g., Rosen-
krantz, Vogel, Bee, Broverman, & Brover-
man, 1968; Spence, Helmreich, & Stapp,
1974, 1975) have consistently demonstrated
that the typical male and female are per-
ceived as differing in a number of person-
ality attributes. Males are reported to be
higher than females in a cluster of charac-
teristics reflecting personal competencies and
goal orientation, whereas women are reported
to be higher in a cluster of characteristics
reflecting social-emotional sensitivity and an
interpersonal orientation. The same differ-
ences in the personalities of men and women
are frequently mentioned in the discussions 
of social-psychological theorists. Parsons and 
Bales (1955), for example, have distin- 
guished between the extradomestic, instru- 
mental role responsibilities assigned to men 
in most societies and the expressive, domestic 
role responsibilities assigned to women and 
have proposed that these differential role as- 
signments are paralleled by underlying differ- 
ences in the relative strengths of instrumental 
and expressive characteristics in the two 
sexes. Similarly, Bakan (1966) has identified 
two fundamental properties that character- 
ize living organisms: a sense of agency, mani- 
fested in such characteristics as self-asser- 
tion, self-protectiveness, and self-aggrandize- 
ment, and a sense of communion, manifested 
in selflessness and a desire to be at one with 
others. Bakan has further identified agency 
as the “male principle,” stronger in males 
than in females, and communion as the “fe-
male principle,” stronger in females than in
males. 

It has frequently been assumed by psycho- 
logical theorists that along with other attri- 
butes that differentiate the sexes and thus 
define “masculinity” and “femininity,” these 
clusters of instrumental, agentic traits and 
expressive, communal traits are essentially 
incompatible and may thus be treated as
endpoints of a single masculinity-femininity
continuum. This proposition is joined with
still a further assumption—that appropriate 
sex-typing has beneficial consequences for the 
individual, the masculine male and the femi- 
nine female exhibiting a higher degree of 
social adjustment and psychological health 
than those who deviate from the patterns 
of behaviors and psychological characteristics 
expected of their sex. More recently, these 
views have been challenged by a number of 
investigators (e.g., Bem, 1974; Block, 1973; 
Carlson, 1971; Constantinople, 1973; Spence 
& Helmreich, 1978; Spence, Helmreich, & 
Stapp, 1975) who have proposed that “mas-
culine” instrumental characteristics and
“feminine” expressive characteristics form
separate dimensions that not only vary inde-
pendently but also contribute positively to
functioning in members of both
sexes.


Data supporting the implications of these
revisionist views have [...] is more mixed in
[...] items on the M and F [...]
larly described.

Results obtained with these self-report in-
struments have consistently demonstrated sex
differences on the M and F scales in the
predicted direction, thus lending support to
the common belief that the sexes differ in
relative degree of agentic and communal
characteristics. Within each sex, however, the
correlation between scales has been found
to be close to zero, and a number of in- 
dividuals of both sexes have been found who 
score relatively high on both scales (“an- 
drogynous” individuals) or relatively low on 
both scales (“undifferentiated” individuals).
The empirical data (e.g., Bem, 1977; Spence
et al., 1975; Spence & Helmreich, 1978) also
suggest that M scores, and to a lesser extent
F scores, are positively associated with a 
number of indices of adjustment and social 
competence, irrespective of the sex of the 
individual. Thus “androgynous” individuals 
tend to be more socially effective than those 
who are sex typed. 

It is critical to note that theoretical dis-
cussions of personality differences between
the sexes have focused on socially desirable
characteristics that purportedly contribute
to execution of socially approved sex roles;
similarly, studies of trait stereotypes have
concentrated on identifying perceived differ-
ences between men and women in positively
sanctioned attributes. In this same mode,
the M and F scales of the PAQ and, to a large
extent, of the BSRI,¹ are confined to the
measurement of socially desirable traits.

Both observation and psychological theory
suggest, however, that even within the cate-
gory of agentic and communal traits, there
are a number of “masculine” and “feminine”
characteristics that are socially undesirable
and have consequences for their possessors or
those about them that may be deleterious.
Bakan’s (1966) theorizing is particularly rel-
evant to this point. He proposes that a
strong sense of agency, unmitigated by a
sense of communion, is destructive to the
individual and to society. Similarly, com-
munion must be mitigated by agency if the
individual is to function effectively. The de-
velopmental task of males is thus to learn to
balance masculine agency with some degree
of communion, and of females, to balance
feminine communion with some degree of
agency.


¹ Ratings of the ideal man and woman on each
of the BSRI items obtained by Gilbert, Strahan,
and Deutsch [...] gullible, childlike, and soft-[...]
fell toward the nonfeminine [...]
thus suggesting their social [...]
[...]ition, mean ratings of the [...]
the opposite pole on 6 of
the 20 F items. As will be [...]

In the present article, we will describe an 
extension of the PAQ to include scales tap- 
ping some of the socially undesirable com- 
ponents of psychological masculinity and 
femininity and will present data that explore 
the implications of these scales for self- 
esteem and two types of problem behaviors. 
Prior studies (Kelly, Caudill, Hathorn, & 
O’Brien, 1977; Heilbrun, Note 2) confirm 
the observation that negatively valued as 
well as positively valued attributes distin- 
guish the sexes. A number of undesirable 
traits identified in these studies as being 
more characteristic of males than of females 
refer to the absence or opposite of desirable 
expressive, communal characteristics, whereas 
those identified as being more characteristic 
of females often refer to the absence or oppo- 
site of desirable instrumental, agentic char-
acteristics. Setting up scales of undesirable
attributes containing such items would be 
likely to provide no conceptual or empirical 
information not already provided by the 
existing PAQ M and F scales. Still other 
attributes identified in these studies are not 
easily classified into agentic versus communal 
categories and are of unknown theoretical 
significance. In expanding our instrument 
to include the negatively toned masculine 
and feminine attributes, it was our intent to 
develop scales that conceptually parallel the 
M and F scales, containing negatively valued 
masculine characteristics that reflect Bakan’s
“unmitigated agency” and negatively valued 
feminine characteristics that reflect “unmiti- 
gated communion.” The procedures used to 
develop these scales and their content will 
be described in a later section. 

Correlations between the original (socially 
desirable) M and F scales, it will be recalled, 
are close to zero and tend to be positive in
sign. While similar relationships might occur
between the negatively valued M and F
scales, Bakan’s (1966) theory suggests that
stronger relationships that are negative in
sign might be found between the original F
scale (positive F or F+) and the new, socially
undesirable M scale (negative M or M-).
That is, these M- characteristics (represent-
ing unmitigated agency) would tend to ap-
pear in those relatively lacking in socially
desirable communal attributes. Similarly, a
negative relationship might be expected be-
tween the positive M (M+) and the negative
F (F-) scales, traits reflecting unmitigated
communion tending not to appear in those
with a healthy sense of self. Predictions about
the relationships between parallel positive
and negative scales (M+ and M-, F+ and F-)
are more problematic. One might argue that 
those exhibiting one cluster of masculine or 
feminine traits would be more likely to ex- 
hibit a second cluster of sex-linked traits, or 
conversely, that those exhibiting a particular 
cluster of socially desirable characteristics 
would in general be less likely to exhibit 
socially undesirable characteristics. Perhaps, 
within a group of individuals, these opposing 
tendencies could be expected to cancel each 
other, leading to a minimal relationship be- 
tween positive and negative scales belonging 
to the same sex-typed category. 

In addition to obtaining normative and 
correlational data on the expanded PAQ to 
test these possibilities, we administered to 
the subjects the Texas Social Behavior In- 
ventory (TSBI; Helmreich & Stapp, 1974), 
a measure of self-esteem and social compe- 
tence, and a biographical questionnaire that 
inquired into the incidence of two types of 
problem behaviors: emotional distress of a 
neurotic nature, and sociopathic, acting out 
behaviors. In prior research (e.g., Spence &
Helmreich, 1978), the TSBI has been found
to be strongly correlated in both sexes with
M+ scores and moderately correlated with
F+ scores. Bakan’s theorizing about the self-
destructive implications of unmitigated
agency and communion led to the expecta- 
tion that considerably weaker if not negative 
relationships would be found between our 
measure of self-esteem and social competence 
and the negative EPAQ scales. A negative 
correlation with F- (which reflects a lack of 
strong sense of self) seemed particularly 
likely. 

In including the biographical question-
naire, we were exploring the possibility that
different patterns of positive and negative
M and F scale scores would be related to
neurotic and acting out behaviors. Our ex-
pectations were clearest for the M scales. We
anticipated that of all the scales, M+ would
be most strongly associated (in a negative
direction) with emotional problems of a neu-
rotic nature, that is, men and women ex-
hibiting these difficulties would tend to be
low in socially desirable agentic qualities.
In conformity with prior studies, male stu-
dents were expected to report more acting out
behaviors than females, even with personality
variables controlled. Within each sex, how-
ever, we predicted that M- would show the
strongest relationship with these behaviors
in a positive direction, that is, men and
women high in acting out would also tend
to be high in the undesirable aspects of
agency.


Method

Development of Masculinity and 
Femininity Scales 


[...] (Spence & Helmreich,
1978). Briefly, the instrument in its present form
[...] drawn from those [...]
with items that were bi[...]
[...] respect to social desirability, that [...]
fell toward the masculine pole and th[...]
toward the feminine pole, [...]
to a separate scale labeled masculinity-femininity
(M-F+). The M-F+ [...]
[...] ideal woman [...]
[...] need for security).

Similar procedures were used in the development
of the negative masculinity (M-) and femininity
(F-) scales. Based primarily on pilot work by one
of our students, Kirk Heilbrun (Note 2), with an 
item pool drawn largely from the Adjective Check 
List, a number of traits were identified whose pres- 
ence was (a) judged to be socially undesirable for 
members of both sexes, (b) attributed more fre- 
quently to males than to females, and (c) agentic 
in content. The first two criteria were verified by 
submitting the items to additional groups of male 
and female college students, with instructions to 
rate the typical or the ideal member of each sex. 
In all instances, significant differences in the mean 
ratings of the typical male and female were ob- 
tained, whereas mean ratings of the ideal member 
of each sex fell toward the pole that indicated a 
relative absence of the stereotypically masculine 
trait. Agentic content was determined by ratings of 
the investigators and their graduate students. Eight
items meeting these criteria were selected for the
negatively valued masculinity (M-) scale. (Eight
items were chosen to equal the number of each of
the socially desirable scales on the 24-item PAQ.)
These items are: arrogant; boastful; egotistical;
greedy; dictatorial; cynical; looks out only for
self; and hostile. Each item is set up on a 5-point
scale, with one extreme indicating a high degree
of the trait (e.g., very arrogant) and the other a
low degree (e.g., not at all arrogant).

The same procedures were used to identify trait
items that are attributed more frequently to women
than to men but at the same time are considered
to be undesirable in both sexes. However, we were
less successful in finding stereotypically feminine
characteristics that appeared to reflect the excessive
selfishness implied by negative communion. We
abandoned our original goal as impossible to achieve
fully and chose instead to investigate two sets of
undesirable, feminine items (four items per set)
because of the possibility that they would yield
interesting data. The first cluster of four items
comes close to unmitigated communion and is iden-
tified as Fc-. The second cluster of four items de-
scribes a type of verbal passive-aggressiveness and
is identified as Fva-. The items on the Fc- scale
are: [...] servile, gullible, and subordinates
[...] on the Fva- scale are: whiny,
[...]y, and nagging. Items are set up on
a 5-point scale, the extremes indicating a high and
low degree of the trait.


Extended Personal Attributes Questionnaire


The negative and positive scales are combined to
form a 40-item extended PAQ (EPAQ), each item
being accompanied by a 5-point scale
and scored from 0 to 4.² The M+, M-, and M-F+
items are scored in a masculine direction and the
F+, Fva-, and Fc- scales in a feminine direction.
Item scores are summed to obtain a total score for
each scale.


² [...] of the scales and additional information
about their psychometric properties can be obtained
from the authors. The results of factor analyses are
reported in Helmreich, Spence, & Wilhelm (in press).
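
The scoring rule just described (items scored 0 to 4 and summed within a scale) can be sketched as follows. This is only an illustration: the dictionary and function names are hypothetical, the item counts are those implied by the text (three 8-item PAQ scales, an 8-item M- scale, and two 4-item F- scales), and each item is assumed to be already keyed in the direction of its scale.

```python
# Item counts per scale as implied by the text; names are illustrative only.
SCALE_SIZES = {"M+": 8, "F+": 8, "M-F+": 8, "M-": 8, "Fva-": 4, "Fc-": 4}
ITEM_MIN, ITEM_MAX = 0, 4  # each item is scored from 0 to 4


def max_score(scale):
    """Highest possible total for a scale (number of items times 4)."""
    return SCALE_SIZES[scale] * ITEM_MAX


def score_scale(scale, item_scores):
    """Sum the item scores of one scale after basic validation.

    Assumes every item has already been keyed in the scale's direction.
    """
    if len(item_scores) != SCALE_SIZES[scale]:
        raise ValueError(f"{scale} expects {SCALE_SIZES[scale]} items")
    if any(not ITEM_MIN <= s <= ITEM_MAX for s in item_scores):
        raise ValueError("item scores must lie between 0 and 4")
    return sum(item_scores)
```

Under these assumptions the maximum total is 32 for an 8-item scale and 16 for a 4-item scale, and the six scales together account for 40 items.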


Other Instruments 


Texas Social Behavior Inventory. The 16-item 
short form of the TSBI (Helmreich & Stapp, 1974) 
was used as a self-report measure of self-esteem and 
social competence. Items are scored 0-4 and are
summed to yield a total score. 

Biographical questionnaire. The biographical 
questionnaire contained 39 items that asked sub- 
jects to indicate the frequency of certain kinds of 
behaviors or events at one or more periods of 
their lives (college, high school, grade school) and 
their current attitudes and feelings about certain 
aspects of their lives. After factor and other anal-
yses, certain items were dropped, either because
they had no predictive value (frequency of illness, 
level of school performance) or because their oc- 
currence was too infrequent to provide useful data 
(appearance before juvenile court, involuntary psy- 
chological counseling). The remaining items were 
assigned to 10 scales, 8 of which described acting 
out or sociopathic behavior and 2 of which de- 
scribed emotional distress of a more neurotic na- 
ture. The acting out scales contained items re- 
ferring to frequency of occurrence of (a) current 
use of alcohol and other drugs; (b) misdemeanors
such as property destruction, shoplifting, and other 
minor thefts at three age periods (grade school, 
high school, and college); (c) lying at three age 
periods; (d) verbal and physical fights in grade 
school; (e) verbal fights in high school and col- 
lege; (f) physical fights in high school and col- 
lege; (g) school misbehavior in high school; (h) 
school misbehavior in grade school. These scales
were positively correlated with each other in both
sexes and entered into similar relationships with
the EPAQ scales. They were therefore combined
for purposes of data reduction into a single acting
out scale. 

The neurotic scales inquired primarily about cur-
rent feelings. The first of these scales contained
items referring to depression, certainty of life goals,
satisfaction with social life, general life satisfaction,
and voluntary seeking of professional help for psy-
chological problems. The second neurotic scale con-
tained items referring to frequency of feeling ner-
vous, tense, fearful, and anxious. These scales were
also significantly positively correlated and were
combined to form a single neuroticism index.


Subjects and Procedures 


The test battery, which included, in order, the
EPAQ, the TSBI, and the biographical question-
naire, was administered to 220 male and 363 female
students in introductory psychology courses at the
University of Texas at Austin as part of a course
requirement. The battery was given in mixed sex
groups and was administered by a male and a
female experimenter.
Results

EPAQ Scales 


A series of t tests comparing the means of
the male and female subjects indicated that
the sexes differed significantly (ps < .001)
in the predicted direction on all six scales.
Except for the M-F+ scale, the means in
both sexes fell toward the socially desirable
pole, which in the case of the negative scales
indicated a relatively low degree of the indi-
cated trait. On the M-F+ scale, the mean for
males fell slightly toward the masculine pole,
and for the females, slightly toward the
feminine pole. Since their values were highly
similar to those previously reported (Spence
& Helmreich, 1978), the means on the posi-
tive PAQ scales will not be given here. For
the negative scales, the means for males and
females, respectively, were 13.63 and 12.21
for M-, 6.39 and 7.28 for Fc-, and 5.28 and
6.48 for Fva-. (The maximum possible score
is 32 for M- and 16 for Fc- and Fva-.)

Correlations between the EPAQ measures
are shown for each sex in Table 1. The rs
between the three original PAQ scales repli-
cate those of previous studies (Spence &
Helmreich, 1978), M+ and F+ having a slight
positive correlation and M-F+ having a sub-
stantial positive correlation with M+ and a
somewhat lower negative correlation with F+.
Turning to the negative scales, it will be 
noted that as with the positive scales, the 
values of the coefficients are similar for the 
two sexes. It will also be observed that the 
correlations between the two F- scales are 
positive but low, suggesting the desirability 
of scoring these two scales separately. 

Particularly informative are the correla-
tions between the parallel positive and nega-
tive scales. For M+ versus M-, and F+ versus
Fc-, the rs are positive in both sexes, and
for F+ versus Fva-, the rs are both negative.
However, even when the values are signifi-
cant, they are small in magnitude, the high-
est being only .15. (The bipolar M-F+ scale,
scored in a masculine direction, has a some-
what higher positive correlation with M- than
does M+, .20 for males and .17 for females,
p < .05.) For all practical purposes, then,
the comparable positive and negative scales
may be treated as independent and may thus
enter into very different relationships with
other variables. These data also provide fur-
ther reassuring evidence of a lack of con-
taminating social desirability response bias
in the EPAQ scales; were such a bias sys-
tematically distorting subjects’ responses, a
substantial negative correlation would be ex-
pected between counterpart positive and neg-
ative scales.

As expected, more substantial corre-
lations, all significant and negative in di-
rection, were found between the cross-typed
positive and negative scales. Those high in
M+ thus tended to be low on both F- scales.
These same negative relationships were also
found for the M-F+ scale, the correlation be-
tween M-F+ and Fc- in both sexes being
particularly marked. Similarly, those high in
F+ tended to be low in M-. Finally, M-
showed a substantial positive correlation with
Fva-, and a negative but small correlation
with Fc-.

In previous studies we have found it heu-
ristically useful to classify subjects jointly
on their M+ and F+ scores by a median split
method to reveal how M+ and F+ combine to
influence some dependent variable. The pres-
ent subjects were categorized into four groups
by this method, using college student norms
(Spence & Helmreich, 1978): Androgynous
(above the combined male and female me- 


Table 1 


Correlations of EPAQ Measures in 
Males and Females 


eo 


Scale 1 2 3 4 


Saw 
OE Mites Nias 4) ie eee 
F 09 —~ i30 33—09 ‘07 
3.. MEP (145) 220 gongs cas 
4. M- SORE 47 — “49 _199 
5. Pyar )e-.20 TA ang MSE ag 
6. Fo y :301 Sis ee ean E 


Note. EPAQ = Extended Personal Attributes Ques-
tionnaire. Males are above the diagonal; females
are below. For males, df = 227, r.05 = .13; for fe-
males, df = 361, r.05 = .10. M+ = positively valued
Masculinity Scale. F+ = positively valued Femi-
ninity Scale. M-F+ = positively valued Mascu-
linity-Femininity Scale. M- = negatively valued
Masculinity Scale. Fva- = Feminine Verbal Ag-
gressive Scale. Fc- = Feminine Communion Scale.
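
The critical values of r quoted in the note to Table 1 can be approximated with a short calculation. The sketch below uses the Fisher z approximation with a two-tailed test, which is an assumption made here for illustration; the source does not say how its critical values were obtained, and a t-based table would give almost the same figures at these sample sizes. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def critical_r(df, alpha=0.05):
    """Approximate two-tailed critical value of Pearson r.

    Uses the Fisher z approximation: z(r) is roughly normal with
    standard error 1 / sqrt(n - 3), where n = df + 2.
    """
    n = df + 2
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.tanh(z_crit / math.sqrt(n - 3))
```

Rounded to two decimals, this approximation agrees with the values in the table note for df = 227 and df = 361.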
Table 2

Means on the Negative Scales for Males and
Females Classified by Category on M+ and F+

                      PAQ category

Negative   Undifferen-
scale      tiated       Feminine  Masculine  Androgynous

Males
M-         14.00        10.39     15.01      12.87
Fva-        6.87         4.75      4.53       5.1[?]
Fc-         7.15         7.32      5.58       5.22

Females
M-         13.93        10.31     16.59      11.74
Fva-        7.13         6.89      7.08       5.58
Fc-         7.51         7.89      5.54       7.09

Note. PAQ = Personal Attributes Questionnaire.
M+ = positively valued Masculinity Scale. F+ =
positively valued Femininity Scale. M-F+ = posi-
tively valued Masculinity-Femininity Scale. M- =
negatively valued Masculinity Scale. Fva- = Femi-
nine Verbal Aggressive Scale. Fc- = Feminine Com-
munion Scale.


dians on both M and F), Masculine (above 
on M, below on F), Feminine (below on M, 
above on F), and Undifferentiated (below on 
both). Means were next determined for males 
and females in each PAQ category on the 
three negative scales. Simple analyses of vari- 
ance indicated that for each sex and scale, 
significant differences (p < .001) occurred 
across categorical groups. Inspection of the 
means, shown in Table 2, suggests that in 
both sexes, M+ and F+ combine additively
to determine M-. That is, with M+ having
a slight positive correlation and F+ a more
substantial negative correlation with M-, one
would anticipate from an additive combina-
tion that the Masculine group would have
the highest mean on M-, followed by the
Undifferentiated, Androgynous, and Feminine
groups. This order obtained in both sexes.
The data from the F- scales also produced
the ordering of PAQ groups predicted by
an additive combination of M+ and F+, the
one exception being several slight inversions
for females on Fva-. In view of magnitudes
of the underlying correlations, the latter re-
sult probably reflects sampling error. What
should be noted about these orders is that 
Table 3

Correlations Between EPAQ Scales and Self-Esteem, Neuroticism, and
Acting Out Measures for Males and Females



Self-esteem Neuroticism Acting out 
Scale Males Females Males Females Males Females 
M* .66°** —.53%** —.41*** —.09 - 
Ft 23°? —.06 —.06 = AT - o 
M-F* P aaas —.43°** —.278%* 03 —.02 
M 02 13 „194+ .26*** a8 
Fya —.16°* Rh bead x) bd SS TiN 09 
Fo —.40°"* DATS 04 .13* -08 
Note. EPAQ = Extended Personal Attributes Questionnaire. M+ = positively valued Masculinity Scale.
F+ = positively valued Femininity Scale. M-F+ = positively valued Masculinity-Femininity Scale.
M- = negatively valued Masculinity Scale. Fva- = Feminine Verbal Aggressive Scale. Fc- = Feminine
Communion Scale.
* p < .05. ** p < .01. *** p < .001.


they differ for each negative scale, so that no 
categorical group is uniformly highest or 
lowest. For each scale, however, the Un- 
differentiated group exhibits the highest or 
second highest amount of the undesirable 
masculine or feminine qualities, and the An-
drogynous group the lowest or second lowest
amount. 
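
The median split classification and the ordering of groups on M- can be illustrated with a brief sketch. The function names are hypothetical, the treatment of scores exactly at the median (counted as "below") is an assumption, and the medians passed in would come from the college student norms, which are not reproduced in this article. The M- means are those given in Table 2.

```python
def classify_paq(m_plus, f_plus, m_median, f_median):
    """Assign one subject to a PAQ category by a median split.

    Assumption: a score equal to the median counts as "below".
    """
    high_m = m_plus > m_median
    high_f = f_plus > f_median
    if high_m and high_f:
        return "Androgynous"
    if high_m:
        return "Masculine"
    if high_f:
        return "Feminine"
    return "Undifferentiated"


# Mean M- scores by PAQ category, taken from Table 2.
M_MINUS_MEANS = {
    "males": {"Undifferentiated": 14.00, "Feminine": 10.39,
              "Masculine": 15.01, "Androgynous": 12.87},
    "females": {"Undifferentiated": 13.93, "Feminine": 10.31,
                "Masculine": 16.59, "Androgynous": 11.74},
}


def rank_categories(means):
    """Return category names ordered from highest to lowest mean."""
    return sorted(means, key=means.get, reverse=True)
```

Applied to the Table 2 means, the ranking reproduces the order described in the text for M- in both sexes: Masculine, Undifferentiated, Androgynous, Feminine.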




EPAQ and Self-Esteem 


In previous studies with the PAQ (Spence
et al., 1975; Spence & Helmreich, 1978), it
has been demonstrated that in both sexes all
three scales are positively related to scores on
the TSBI, used as a measure of self-esteem
and social competence. Substantial correla-
tions are found with M+, and lower but highly
significant correlations with M-F+ and F+.
Inspection of the correlations obtained in the
present samples of males and females, shown
in Table 3, reveals values highly similar to
those of prior investigations. As had been
anticipated, different relationships were found
with the negative scales. In both sexes, the
correlations with the self-esteem measure are
close to zero for M- and significantly negative
for the F- scales. Particularly marked is the
relationship with Fc- in males (r = -.40).
To examine the joint contributions of the
PAQ scales to self-esteem, males and females
were each divided into three self-esteem
groups, those falling into the upper quar-
ter, middle half, and lower quarter of
their respective score distributions. Multi-
variate analyses of variance on the scales
were then computed for each sex and were
found to be highly significant in each sex
(ps < .0001). Inspection of the pattern of
means reveals that in both sexes, systematic
trends occurred on all scales except M-.
Scores increased from low to high self-esteem
groups on all three positive scales and de-
creased on the two F- scales. Univariate Fs
were significant (ps < .01) except in the case
of males and Fva-.
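
The division of a score distribution into lower quarter, middle half, and upper quarter might be carried out as in the following sketch. The function name, the group labels, and the handling of scores that fall exactly on a quartile boundary are assumptions; the article does not state how ties were resolved, so actual group sizes could differ slightly.

```python
from statistics import quantiles


def three_groups(scores):
    """Label each score as "low", "middle", or "high".

    "low" is at or below the first quartile and "high" is at or above
    the third quartile (boundary handling is an assumption here).
    """
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    labels = []
    for s in scores:
        if s <= q1:
            labels.append("low")
        elif s >= q3:
            labels.append("high")
        else:
            labels.append("middle")
    return labels
```

In the procedure described above, such a split would be computed separately for males and females, since each sex was divided on its own score distribution.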


EPAQ and Neuroticism 


Scores on our measure of neuroticism and 
of self-esteem and social competence turned 
out to be negatively related (rs of -.55 and
-.36 for males and females, respectively,
ps < .001). Although it was not surprising to 
find that those who perceived themselves as 
being socially effective tended to report ex- 
periencing less anxiety, depression, and dis- 
satisfaction than others did, the contents of 
the two scales (and the underlying psycholog- 
ical phenomena that they were designed to 
reflect) are not mirror images of each other. 
It was therefore anticipated that the relation- 
ships between the EPAQ scales and the 
TSBI would not necessarily be highly similar 
to those between the former and the neuroti- 
cism measure. 

The correlations between neuroticism and
EPAQ scores, reported in Table 3, showed
that this is indeed the case. As expected, the
strongest relationship is with M+ (rs of -.53
and -.41 in males and females). Smaller but
significantly negative correlations are also
found with M-F+ (rs of -.43 and -.27).
However, the correlations with F+ are nonsig-
nificant. Thus, expressiveness and interper-
sonal skills, as measured by F+, appear to
contribute beneficially to social competence
and self-esteem but appear to be unrelated to
the kind of emotional disturbances (anxiety,
depression, and so forth) tapped by the
neuroticism scale.

The three negative scales correlate posi-
tively with neuroticism in both sexes, with the
largest relationship being found with Fva-
(rs of .33 and .31). In females, M- had a sig-
nificant (but modest) correlation, whereas Fc-
did not. In males the opposite pattern was
found: Fc- had a significant negative correla-
tion, whereas M- was nonsignificant.

Next, the males and females were each di-
vided into three neuroticism groups, those
falling into the upper quarter, middle half,
and lower quarter of their respective score
distributions, in order to determine the con-
joint contributions of the EPAQ scales. Multi-
variate analyses of variance on the scales
were highly significant in each sex (ps <
.0001). Inspection of the pattern of means
and the univariate Fs reveals that the high
neurotic males are significantly lowest in M+
and M-F+ and highest in Fva- and Fc-, with
a nonsignificant elevation in M-. High neu-
rotic females are also significantly lowest in
M+ and M-F+ (and nonsignificantly on F+)
and highest on the three negative scales.
For women, however, the effects of M- are
highly significant, whereas those for Fc- are
not. It is possible that the presence of nega-
tive cross-sex-typed characteristics is more
consistent with emotional disturbance than is
the manifestation of sex-congruent negative
attributes.

A final analysis was undertaken in which
subjects were classified as above or below the
median on M+ and Fva-, the two scales show-
ing the strongest correlations in each sex.
(For this purpose the rounded mean of the
medians of males and females on each scale
was used as the cutting point.) The percent-
age of subjects in each neuroticism group
showing the most negative constellation (low
M+, high Fva-) was computed. For males, the
low M+, high Fva- pattern was found in 4%
of the low neuroticism group, 17% of the
medium, and 47% of the high. The com-
parable figures for females were 20% low,
35% medium, and 61% high.
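
The percentages reported in this analysis amount to a tabulation of how often the low M+, high Fva- pattern occurs within each neuroticism group. A minimal sketch of that tabulation follows; the function name, the input layout, and the rule that a score equal to the cutting point counts as "below" are all assumptions, and the cutting points themselves are not reported in the article.

```python
from collections import Counter


def pattern_rates(subjects, m_cut, fva_cut):
    """Percentage of each group showing low M+ together with high Fva-.

    subjects: iterable of (group, m_plus, fva_minus) tuples.
    Assumption: a score equal to a cutting point counts as "below".
    """
    totals, hits = Counter(), Counter()
    for group, m_plus, fva_minus in subjects:
        totals[group] += 1
        if m_plus <= m_cut and fva_minus > fva_cut:
            hits[group] += 1
    return {g: round(100 * hits[g] / totals[g]) for g in totals}
```

The same tabulation, with M- substituted for M+ and "high" on both scales as the target pattern, would apply to the comparable analysis of acting out groups.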


EPAQ and Acting Out Behaviors 


Inspection of the acting out scores indicated
that in both sexes, the distributions were mod- 
erately skewed; few of the subjects (partic- 
ularly females) admitted to having manifested 
these behaviors to any marked degree. The 
scores were therefore subjected to a z trans- 
formation, and statistical analyses performed 
on the transformed scores. 
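
The article does not give a formula for the z transformation. Assuming it refers to ordinary standardization (each score minus the sample mean, divided by the sample standard deviation), the step could be sketched as below; the function name is hypothetical, and standardization of this kind rescales the scores without altering the shape of their distribution.

```python
from statistics import mean, stdev


def z_transform(scores):
    """Standardize scores to mean 0 and sample standard deviation 1.

    Assumption: "z transformation" means ordinary standardization.
    """
    m = mean(scores)
    s = stdev(scores)
    return [(x - m) / s for x in scores]
```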

Pearson correlations between acting out 
scores and scores on the self-esteem and 
neuroticism measures were not significant in 
either sex. Correlations between acting out 
scores and scores on each of the EPAQ scales 
for each sex are shown in Table 3. As an-
ticipated, the highest correlations are between
acting out and M-. In both sexes, the values
are both positive and highly significant, the
relationship being more marked in males than
in females. (The mean acting out score for
males is also significantly higher than that for
females.) For females, no other correlation
reached significance, although the signs are
in reasonable directions. For males, a sig-
nificant negative correlation is found with F+
and a significant positive correlation with
Fva- and Fc-.


[...] high acting out groups [...]
elevated significantly on M- and Fva- in each
sex. A similar linear pattern is also found on
Fc- but is significant only in females.
The subjects were also classified as being
above or below the median on M- and Fva-,
the scales most strongly associated with act-
ing out. Those exhibiting the most unfavor-
able combination, above the median on both
M- and Fva-, were then identified. In males,
63% of the high acting out group showed this
high M- and Fva- constellation, whereas only
30% of the moderate and 27% of the low
groups were so classified. In females, the
figures were 55% for high, 45% for medium,
and 35% for low.


Discussion 


On both the M- and the two F- scales, the 
mean self-report scores of men and women 
differed significantly in the direction predicted 
by the stereotype ratings. Despite the sex 
differences on the negative as well as the 
positive scales, the correlations between coun- 
terpart positive and negative scales were 
small and, typically, nonsignificant. Thus in 
neither sex did socially desirable and unde- 
sirable masculine or feminine traits tend to 
covary. Masculinity and femininity have typically
been treated as if they were unidimensional
constructs, those exhibiting a given
cluster of masculine or feminine behaviors 
or attributes being assumed to possess other 
types of masculine or feminine characteristics 
to the same degree. The present results join a 
mounting body of evidence suggesting instead 
that the correlations among the empirically 
diverse categories of attributes and behaviors 
distinguishing the sexes are often low. When 
significant correlations are found between 
two clusters of masculine or feminine char- 
acteristics, the relationship may be attrib- 
utable more to the particular characteristics 
involved than to the mere fact that they 
“belong” to the same sex-typed category. 
The multifaceted nature of masculinity and
femininity is further demonstrated by the
pattern of correlations between cross-typed
scales. The relationship between M+ and F+,
as has repeatedly been found in prior studies,
is low and positive in both sexes. More substantial
negative correlations occur between
cross-typed positive and negative scales, as
was anticipated. Those high in undesirable
agentic qualities, as reflected on the M- scale,
tended to exhibit a relatively low degree of
desirable, interpersonally oriented qualities,
as reflected on the F+ scale. Similarly, those
high in the socially desirable agentic qualities
on the M+ scale tended not to exhibit the
verbal passive-aggressive qualities reflected
in the FVA- scale or the self-subordinating
qualities of the FC- scale.

Both positive and negative scales con- 
tribute to measures of self-esteem and social 
competence, neuroticism, and acting out be- 
havior, although the patterns of relationships 
differ in the three types of measures. For self-esteem,
all three positive scales, but particularly
M+, were positively related, and
both F- scales were negatively related to the
criterion measure in both sexes. The correlation
with M-, however, was essentially zero.
Although neuroticism tended to be associated
with low self-esteem, F+ was not significantly
related to this type of emotional disturbance,
although, as anticipated, substantial negative
correlations were found between neuroticism
and M+ and M-F+. All three negative scales
were positively related to neuroticism, the
highest correlation in both sexes being with
FVA-.

The associations between acting out be- 
haviors and the EPAQ scales tended to be 
somewhat weaker but of sufficient predictive 
value to be of potential practical as well as 
theoretical value. In both sexes, a high degree 
of acting out was most strongly associated 
with a high degree of M- and, secondarily, 
with a high degree of FVA-. In both sexes,
other scales also made significant contribu- 
tions, but the pattern was not the same. 
Whether this discrepancy was due to chance 
or to genuine sex differences cannot be deter- 
mined without testing additional samples. 

A comment should be made about the rela- 
tively high percentage of men and women 
exhibiting the high M- and FVA- constellation
even in the low acting out groups (27% of
the males and 35% of the females). This finding
was to be anticipated. Manifestation of
these undesirable masculine and feminine characteristics
can take many forms less obviously
antisocial than the behaviors tapped by the
acting out measure. Additional information
about the individual and his or her life circumstances
is required if their avenues of
expression are to be predicted. Further, these
negatively sanctioned behaviors are relatively
infrequent. While our data suggest that individuals
who exhibit sociopathic, acting out
behaviors have a substantial probability of
being high in M- and FVA-, the probability
that those who are high in M- and FVA- also
exhibit acting out behaviors should differ
little from those with other personality constellations.

Although the results obtained with the neuroticism
and acting out measures are promising,
the samples were drawn from a nonclinical
population, few of whose members
exhibited any significant degree of problem behavior.
Acting out was particularly mild, especially
in women, and some of the behaviors
(e.g., [...] drugs) [...] emotional distress rather
[...]tion of hostile, antisocial tendencies. We are
[...] populations might be expected to yield
more marked relationships [...] scales.
Motives for Social Comparison:
The Construction-Validation Distinction


Russell H. Fazio 


Indiana University 


It is suggested that informational social comparison can be motivated by two 
different concerns: (a) an interest in obtaining from others any information 
about the entity that they have at their disposal and that the individual lacks 
or (b) an interest in obtaining information about the validity of one’s judg- 
ment by determining if that judgment derives from one’s personal biases or 
from the qualities of the entity. A comparison act motivated by the former 
concern is termed construction and by the latter is termed validation. The results
of Experiment 1 suggest that a construction process occurs only when
one’s perceived level of information is low and others might possess additional 
information about the object. Two additional experiments examined the impact 
of agreement and disagreement from sources similar or dissimilar on a relevant 
attribute. Experiment 2 examined whether the judgments of each of these 
others are attributed largely to the other’s similarity/dissimilarity or to the 
amount of information about the entity that the other is thought to possess.
Experiment 3 examined the implications of the attributional patterns obtained
in Experiment 2 for choice of comparison other under motives to construct vs.
validate. The two motives were found to lead to differing preferences regarding
choice of comparison other.


As individuals attempting to understand, 
predict, and exercise some control over a 
rather complex social environment, we are 
often interested in how our judgment of some 
object compares with others’ judgments of
that object. Such a proposition suggests that
social comparison is desired and useful when
one is in need of information about the adequacy
of one’s judgment (cf. Festinger, 1954;
Goethals & Darley, 1977; Jones & Regan, 
1974). The more important the judgment, the 
more likely that such informational social 
comparison will occur. 

The present research focuses upon informa- 
tional social comparison and investigates the 
proposal that such comparison can be motivated
by two quite different concerns.

1. Sometimes we are interested in obtaining 
from others any information about the object 
that they have at their disposal and we lack.
The goal of this process, which shall be 
termed construction, is simply the collection 
of information about the object—information 
that may be useful in constructing a judgment 
of the object. 

2. A second concern that can motivate in- 
formational social comparison is an interest in 
testing the validity of one’s judgment—what 
shall be termed a validation process. The individual
wants to know whether she/he has processed
and integrated the available information
properly. Would others, given the same
information, reach the same judgment? Or 
has she/he been unduly influenced by some 
personal idiosyncrasy? Naturally, the indi- 
vidual hopes that the answers to these ques- 
tions lead to the conclusion that she/he is 
correct, that is, she/he hopes to achieve confirmation.

The construction-validation distinction is
based upon the sort of information that the 
individual is seeking. Construction involves 
information about the object; validation con- 
cerns information about one’s judgment of 
that object. (The distinction is similar to one 
drawn by Mettee & Smith, 1977, who note a 
difference between the discovery and con- 
struction of reality, i.e., construction, and the 
confirmation of already existing reality concepts,
i.e., validation.) An analogy to two
types of research may serve to clarify the 
distinction further. Often, research is con- 
ducted for exploratory purposes and in order 
to generate hypotheses. A few intuitions may 
guide such research, but the main purpose is 
not to test those intuitions but to collect data 
that can help one to form hypotheses. This 
type of investigation is analogous to the construction
process. The validation process corresponds
to research in which an empirical
test of a hypothesis is conducted. The researcher
seeks to test the validity of his or her
hypothesis.

Just as a researcher might first conduct an 
exploratory investigation and then an em- 
pirical test of a hypothesis, an individual 
might first be concerned with construction 
and then with validation. That is, the two processes
are likely to occur in sequence. A person
may first desire information about the
entity so that he or she can construct a tentative
judgment of the entity. He or she may
then seek to test the newly formed hypothesis
empirically via a validation process. If validation
is achieved, then the informational social
comparison process [...]. If validation is not achieved, the
individual is likely to decide that she/he is in
need of more information about the entity.
The person may return to a construction
process in order to discover what “facts” he
or she has yet to consider [...] this newly acquired
information [...] construct a more accurate opinion [...]
to validate that [...].

It is our contention that the
construction-validation distinction is critical to an understanding
of informational social comparison.
It shall be argued that construction and validation
processes are very different, and three
experiments suggesting this will be reported.
Not only do concerns with construction versus 
validation diverge as a function of relevant 
variables, but choice of comparison other 
differs, given motives to construct vs. validate. 


Experiment 1 


Experiment 1 examined what variables 
might determine the extent to which an indi- 
vidual is concerned with construction versus 
validation. It was hypothesized that an indi- 
vidual would be most interested in construc- 
tion and least interested in validation when 
the individual believed that she/he lacked in- 
formation and believed that the other people 
available for comparison might possess a 
greater amount of information. 

In order to examine social comparison pro- 
cesses, an experimental task of sufficient dif- 
ficulty to prompt a need for information was 
required. Furthermore, the task needed to be 
one that permitted a simple and plausible 
manipulation of perceived level of informa- 
tion. For these reasons, the autokinetic effect 
was chosen as the experimental task. From 
the work of Sherif and his associates (e.g.,
MacNeil & Sherif, 1976; Sherif, 1935), it is 
clear that estimating the distance that the dot 
of light “moves” is a difficult and ambiguous 
task. In addition, perceived level of informa- 
tion could be easily manipulated by leading 
subjects to believe that they had been as- 
signed to a condition in which they would or 
would not be provided with additional infor- 
mation about the distance that the light 
moves. Half of the subjects were provided
with bogus information about the light’s 
movement, whereas the other subjects were 
not. 

Although subjects participated in the ex- 
periment individually, they were each led to 
believe that three other subjects were also 
present. The amount of information about the
light that these supposed others might possess
was manipulated as the second factor in the
design. Half of the subjects were told that 
they and each of the supposed other subjects 
would be individually assigned to an informa- 
tion condition. The other half were told that 
each group would be assigned to a condition. 


If the possibility that the available others
possess more information does exist, as in the 
individual assignment conditions, then a con- 
struction motive may become operable. 
The lower a person’s perceived level of infor- 
mation, the more that person may attempt to 
gather information from the available others 
(i.e., to construct). A subject in the group
assignment conditions was led to believe that
everyone possessed the same level of information
that he or she had. This equivalence is
predicted to undermine the construction mo- 
tive, since it is not possible to collect addi- 
tional information about the light. However, 
assignment to this condition should lead to a 
corresponding increase in a desire to validate 
one’s judgment. Given whatever information 
upon which the subject has based his/her 
judgment, can she/he find confirmation of that 
judgment? Thus, desires for construction 
versus validation are predicted to be jointly 
dependent upon perceived level of information 
and individual versus group assignment. Sub- 
jects in the no information — individual assign- 
ment condition should exhibit construction 
versus validation interests that differ from the 
interests displayed in each of the other con- 
ditions. 

Once the two independent variables, per- 
ceived level of information and assignment, 
had been manipulated, observations of the 
autokinetic effect began. During one par- 
ticular series of trials, the subject was in- 
formed of the judgments of each of the sup- 
posed others. These bogus responses were 
constructed so as to manipulate agreement. 
One of the supposed others was shown to con- 
sistently agree with the actual subject, 
whereas the other two consistently disagreed. 
For another series, subjects were ostensibly to 
be formed into pairs, and the members of a 
pair were to discuss their judgments prior to 
reaching a decision about the distance that 
the light moved. Ratings of the extent to 
which each of the supposed others was desired 
as a partner served as the main dependent 
variable. 

As argued earlier, subjects in the group 
assignment conditions have no opportunity to 
gain information for the construction process. 
However, they may well have a desire to validate
their judgments. Validation implies
choice of the agreeing other as a partner; she/ 
he provides confirmation of one’s own judg- 
ment. Thus, within the two group assignment 
conditions, regardless of the subject’s per- 
ceived level of information, the agreer should 
be strongly preferred as a partner. 

The perceived level of information variable 
should, however, have an impact upon partner 
selection within the individual assignment 
conditions. We have argued that construction 
tendencies should be most apparent for sub- 
jects with no information. Hence, these sub- 
jects will presumably desire to have as a part- 
ner someone who has more information about 
the light’s movement. The agreeing other may 
be agreeing simply because she/he, too, is suf- 
fering from a lack of information. The disa- 
greers, however, may be disagreeing because 
they have more information at their disposal 
than the subject has. As a result of these possi- 
bilities, preference for the agreer as a partner 
should be relatively low within this no in- 
formation — individual assignment condition. 
For subjects with additional information, the 
need to gather more information (i.e., the 
desire to construct) should be weaker, and the 
desire to validate stronger, than for those sub- 
jects with no information. Hence, preference 
for the agreer as a partner should be stronger 
among subjects with, than among subjects 
without, additional information. 


Method 


Subjects 

Forty Princeton University freshmen, naive to the
true nature of the autokinetic effect, participated
individually in the experiment for a payment of
$1.50. Subjects were randomly assigned to one of the
four conditions. The data from 2 subjects (1 from
each of the group assignment conditions) were discarded.
In one case the subject questioned the veridicality
of the supposed other subjects’ judgments. In
the other, the subject did not perceive sufficient
movement to permit the creation of one of the disagreeing
responses. The final sample consisted of 10
subjects in each of the individual assignment conditions
and 9 in each of the group assignment conditions.

Procedure 


After making a few introductory comments, the
experimenter informed the subject that a total of
four individuals would be participating in the experimental
session. (This statement was the first of
a number of steps that were taken to establish the 
plausibility of the presence of the fictitious group 
members.) Depending upon the punctuality of the 
subject’s own arrival, the experimenter also com- 
mented upon how many of the (fictitious) others 
had already arrived. The subject was then ushered
into an enclosed booth and instructed to read the 
materials that had been placed there earlier. While 
the subject read, the experimenter opened and 
closed doors at the appropriate times so as to lead 
the subject to believe that all three of the other 
subjects had arrived. 

The materials that had been placed in the sub- 
ject’s booth described the study’s supposed purpose 
and experimental design. The study purported to 
concern “the perceptual judgment of motion and 
distance” and was said to consist of three series of 
trials, “simulating three different ways in which 
judgments are often made.” During Series 1, each 
individual was to make independent judgments about 
the distance the light moved. During Series 2, indi- 
vidual judgments were again to be made, but at the 
end of each trial the subject was to be informed of 
each of the other subject’s responses. The third series 
was labeled a “coalition series” during which pairs 
of subjects would discuss their judgments. This con- 
versation was supposedly to occur via a set of ear- 
phones and a microphone located in the booth. Sub- 
jects were told, “as much as possible, we will allow 
you to choose your partner.” 

Subjects also read that in order to investigate 
“how varying levels of information affect the accu- 
racy of people’s judgments,” each subject would be 
assigned to one of three information levels for the 
duration of the experiment. On each trial, the experi- 
menter would purportedly select a point within a 
distribution of distances specially prepared for that 
trial and would set a dial that would dictate the 
distance that the light moved. The information levels 
were said to relate to the distributions that the ex- 
perimenter was ostensibly employing. The three con- 
ditions were described as: (a) a no information 
condition, in which subjects were to be provided 
with no additional information about the light’s 
movement; (b) a partial information condition, in 
which subjects were to be provided with the interquartile
range of the distribution of distances across
all trials in the experiment; and (c) a complete information
condition, in which subjects were to be
given the interquartile range of the distributions for
each and every trial of the experiment. Interquartile
range was defined for the subject.

Assignment manipulation. [...] followed the [...]tions.
Half the subjects, those who had been randomly
assigned to the individual assignment conditions,
read the following paragraph:


Each of you will be placed in one of these three 
conditions on a random basis. Thus, you may personally
have at your disposal more or less information
than the other people in the experiment.


For subjects in the group assignment conditions, the 
paragraph read: 


Each group of subjects in the experiment is ran- 
domly placed in one of these three conditions. 
Thus, whereas each of you in this experimental 
group will have the same amount of information, 
you and the other subjects in this group may have 
at your disposal more or less information than 
other groups in the experiment. 


Once sufficient time had elapsed for the subject to 
read the materials and once the experimenter had 
feigned the arrival of each of the supposed other 
subjects, the experimenter began a series of private 
interviews with the actual subject and each of the 
supposed others. These interviews were conducted in
the order of the subjects’ purported arrival. To main- 
tain and bolster the impression that three others 
were present, the experimenter opened the booth door 
for a supposed other. He then loudly requested that 
the person follow him, noisily moved the chair in 
the booth about, walked to the nearby interview 
room, and slammed the door shut behind himself. 
After allowing about a minute to pass, the experi- 
menter noisily returned the supposed subject to the 
booth and closed the booth door. This procedure was 
repeated until all the supposed subjects had been 
“interviewed.” 

Perceived level of information manipulation. During
the interview of the actual subject, the experimenter
answered any questions that the subject had.
He also made certain that the subject understood 
the descriptions of the three series and the three 
information conditions. The remainder of the interview
concerned the information manipulation. The
experimenter handed the subject a sheet of paper 
describing the condition to which the subject had 
been assigned. For half the subjects, this sheet simply 
read, “You have been randomly assigned to the NO 
INFORMATION condition.” For the remaining subjects, 
the sheet stated, “You have been randomly assigned 
to the PARTIAL INFORMATION condition” and informed 
the subjects that the interquartile range of the dis- 
tances the light would move was 2 to 6 inches. From 
pretesting, these figures had been determined to be 
within the range of distances subjects would per- 
ceive in the experimental setting. No subject was 
assigned to a complete information condition. Sub- 
jects were only led to believe that such a condition 
existed to ensure that the partial information sub- 
jects did not feel so informed that they would have 
no need to engage in social comparison. 

Once the interviews had been completed, observations
of the autokinetic effect began. The far wall of
each booth contained a one-way mirror through
which the subject observed the light in the next
room. The light was located approximately 10 feet
from the actual subject's mirror. For each trial the
light was illuminated for a period of 10 seconds.
After each of the three trials in Series 1, the subject
recorded a judgment of the distance to the nearest
half-inch.1 After each trial of Series 2, the subject
also recorded a judgment. In addition, the subject
wrote the judgment in large numerals on a blank
sheet of paper and then held the sheet up to the
mirror so that the experimenter could record each
response. The experimenter walked from mirror to
mirror and supposedly recorded the judgments of
each of the subjects. Purportedly, he then made four
copies and distributed one copy to each subject.
Again, in order to maintain and bolster the deception
that three others were involved, the experimenter
made it appear that he was collecting a
judgment from the occupant of each booth and that
he was handing each subject a copy of the responses.
This same procedure was followed for each of the
three trials of Series 2.

Agreement manipulation. The supposed responses
of the other subjects were fabricated in such a man- 
ner that one other subject generally appeared to 
agree with the subject and the two others generally 
disagreed. One disagreer (the positive disagreer) con- 
sistently perceived more movement than the subject; 
the other (the negative disagreer) reported observing 
less movement. On the first trial, the agreer was 
reported to have perceived 1/2 inch more movement 
than the subject, and the disagreers 2 inches more or 
less movement. On Trial 2, the agreer’s response was 
1/2 inch less than the subject’s and the disagreers’ 
responses were 1 inch more or less than the subject's. 
For Trial 3, the agreer’s judgment was 1/2 inch more 
and the disagreers’ 1 and 1/2 inches more or less 
than the subject’s own response. All subjects were 
identified only by the booth identification letter (A, 
B, C, or D). The actual subject was always Subject 
B. For each subject, the location of the agreer, posi- 
tive disagreer, and negative disagreer was randomly 
determined. 
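
The feedback schedule just described can be sketched in code. This is only an illustration, not the experimenter's actual procedure: the function name `fabricate_feedback`, the offset table, and the rule that a response may not fall below zero are assumptions introduced here. Only the per-trial offsets are taken from the text.

```python
# Offsets (in inches) relative to the subject's own Series 2 judgment,
# as stated in the text: (agreer offset, disagreer magnitude) per trial.
TRIAL_OFFSETS = [
    (+0.5, 2.0),  # Trial 1
    (-0.5, 1.0),  # Trial 2
    (+0.5, 1.5),  # Trial 3
]


def fabricate_feedback(subject_judgments):
    """Return one dict of bogus responses per trial (hypothetical helper).

    Raises ValueError when a judgment is too small for the negative
    disagreer to report a nonnegative distance. That rule is an assumption,
    suggested by the subject who was discarded for perceiving too little
    movement to permit a disagreeing response.
    """
    if len(subject_judgments) != len(TRIAL_OFFSETS):
        raise ValueError("expected one judgment per Series 2 trial")
    feedback = []
    for own, (agree_delta, disagree_delta) in zip(subject_judgments, TRIAL_OFFSETS):
        negative = own - disagree_delta
        if negative < 0:
            raise ValueError("judgment too small to build a negative disagreer")
        feedback.append({
            "subject": own,
            "agreer": own + agree_delta,
            "positive_disagreer": own + disagree_delta,
            "negative_disagreer": negative,
        })
    return feedback
```

For a subject who reported 4, 3.5, and 5 inches, the sketch would give the agreer 4.5, 3, and 5.5 inches, with the two disagreers spaced symmetrically above and below the subject on each trial.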

Dependent variables. After receiving the feed- 
back for Trial 3 of Series 2, subjects were asked to 
complete a “Partner Selection Form” for Series 3. 
As a forced-choice measure, subjects were requested 
to select one other subject as a partner. In addition, 
in order to “accommodate all four of you as well 
as possible” and “in order to settle any conflict in 
partner selection that might occur,” subjects were 
asked to rate the degree to which they would like 
to have each of the other subjects as a partner. 
These ratings were made on an 11-point scale where 
0 indicated “definitely do not” prefer to have a 
particular subject as a partner and 10 meant “very
much” prefer to have that person as a partner. 

While the experimenter ostensibly reviewed the 
partner selection forms, subjects were asked to
complete a short questionnaire. Two scales involved
assessments of the degree to which construction and
validation concerns were related to the subject's
choice of a partner.2 The first item asked subjects, 
“To what extent did you choose the partner you 
chose because he/she is likely to provide you with 
helpful information about the distance the light
moves?” The second item read, “To what extent 
did you choose the partner you chose because he/ 
she is likely to confirm and support your own judg- 
ments about the distance the light moves?” Each 
of these ratings was made on a 0-10 scale where 0 
was labeled “not at all a consideration in my choice” 
and 10 “very much a consideration.” 

The last page of the questionnaire contained a 
check on the agreement manipulation. Subjects were 
asked to rate “how similar the responses of each 
of the other subjects were to your own responses 
during Series 2?” on a 0-10 scale where 0 was
labeled “not at all similar” and 10 was “very similar.”
It should be noted that these agreement
assessments were made while the subjects still had 
at their disposal the sheets that displayed each of 
the supposed subjects’ judgments. Thus, this ma- 
nipulation check measures only whether subjects 
could understand and correctly interpret the feed- 
back when directly asked to do so. It in no way 
served as a memory check. 

When the subject had completed the question- 
naire, the experiment was terminated. The subject 
was carefully probed for suspicion with the aid 
of a postexperimental questionnaire. All deceptions 
were revealed, and the purpose of the study was 
explained to the subject. 


Results 


Manipulation Check 


Responses3 to the check on the manipulation
of agreement suggest that the manipu-


1 After each trial, the subject also indicated how 
certain he or she was of his or her judgment of the dis- 
tance that the light moved. For each subject, the 
mean of the subject’s three ratings during Series 1 
(i.e., prior to any feedback) was computed. On a
scale where 10 indicated extreme certainty and 0 
reflected being not at all certain, the overall mean 
across all subjects was 4.25, suggesting that subjects 
were sufficiently uncertain to desire social compari- 
son. An analysis of variance on these data revealed 
no significant differences between the conditions, how- 
ever. Possibly, a floor effect was encountered; no 
information subjects may have been reluctant to 
admit any greater uncertainty. 

2 These scales were preceded by an open-ended 
measure in which subjects were asked to explain the 
reasons for their partner choices. The data from two 
judges who rated the extent to which each response 
indicated a concern with construction versus valida- 
tion display the same pattern as the direct scale 
assessments. The same is true in Experiment 3. Details
can be found in Fazio (1978).

3 Two subjects, one from each of the individual 
assignment conditions, neglected to complete the 
manipulation check items. 
Table 1
Mean Partner Preferences in Experiment 1

Perceived level of
information and                     Disagreer
assignment           Agreer    Positive   Negative

No
  Individual          5.70       6.30       4.40
  Group               8.11       4.22       4.22
Partial
  Individual          7.80       5.70       4.70
  Group               8.11       6.56       2.67


Note. The higher the mean, the more that supposed 
other was desired as a partner. 


lation was accomplished successfully. A 2
(No vs. Partial Information) × 2 (Group
vs. Individual Assignment) × 3 (Agreer vs.
Positive Disagreer vs. Negative Disagreer)
analysis of variance, with the last factor as a
repeated measure, revealed a strong effect
of the agreement variable, F(2, 64) = 159.12,
p < .001. Furthermore, a contrast of the
mean rating of the agreer (M = 6.64) with
the average of the ratings of the two disagreers
(M = 4.21) was highly significant,
t(64) = 17.72, p < .001.


Partner Selection 


Table 1 presents the mean preferences for
each of the three others as a partner. A 2 ×
2 × 3 analysis of variance revealed a main
effect of the agreement variable (agreer M =
7.43, positive disagreer M = 5.69, negative
disagreer M = 4.00), F(2, 68) = 28.54, p <
.001. This general preference for the agreer
is more characteristic, however, of the subjects
in the group assignment condition than
of those in the individual assignment conditions
[...] assignment condition. In accordance with the conceptual
framework proposed earlier, when
others possibly had more relevant informa- 
tion and one’s own perceived level of in- 
formation was low, the agreeing other was 
not as strongly desired as a partner. Further 
confirming the hypotheses is the fact that 
multiple comparisons (by means of the least 
significant difference test) reveal that the 
agreer was preferred significantly less in the 
no information-individual assignment condition
than in each of the other three conditions.
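
As a quick arithmetic check, the marginal means for the agreement variable can be recomputed from the cell means in Table 1. The sketch below assumes simple unweighted averages across the four conditions; the cell sizes were unequal (10 and 9), so an n-weighted average would differ slightly. The names `TABLE_1` and `marginal_means` are illustrative only.

```python
# Cell means from Table 1: (agreer, positive disagreer, negative disagreer).
TABLE_1 = {
    ("no", "individual"): (5.70, 6.30, 4.40),
    ("no", "group"): (8.11, 4.22, 4.22),
    ("partial", "individual"): (7.80, 5.70, 4.70),
    ("partial", "group"): (8.11, 6.56, 2.67),
}


def marginal_means(cells):
    """Unweighted mean of each column across all conditions (an assumption)."""
    columns = zip(*cells.values())
    return tuple(sum(col) / len(col) for col in columns)


AGREER, POSITIVE, NEGATIVE = marginal_means(TABLE_1)
```

Under this assumption the three averages come out near 7.43, 5.69, and 4.00 for the agreer, positive disagreer, and negative disagreer, respectively.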

The forced-choice data display much the
same pattern as the rating data. The proportion
of subjects in each condition who
chose the agreer as a partner (no information/individual
= .50, partial information/
individual = .90, no information/group =
1.00, partial information/group = .89) was
noted, and these proportions were analyzed 
via an arcsin transformation (cf. Langer & 
Abelson, 1972). The analysis revealed a sig- 
nificant interaction between the perceived 
level of information and assignment vari- 
ables, z = 2.47, p < .02. Subjects in the no 
information — individual assignment condition 
chose the agreer as a partner relatively less 
frequently than did the other subjects. 
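
The interaction test on these proportions can be sketched as follows. This is a minimal illustration rather than the author's actual computation. It assumes the common arcsine transformation, phi = 2 arcsin(sqrt(p)), whose sampling variance is approximately 1/n, and it assumes cell sizes of 10 (individual assignment) and 9 (group assignment); those cell sizes are inferred from the reported degrees of freedom and cell means, not stated in this section. The function names are hypothetical.

```python
import math


def arcsine_transform(p):
    """Variance-stabilizing transform for a proportion: 2 * asin(sqrt(p))."""
    return 2.0 * math.asin(math.sqrt(p))


def interaction_z(cells):
    """z test of a 2 x 2 interaction contrast on transformed proportions.

    cells maps (information, assignment) -> (proportion, n).
    Each transformed proportion has approximate variance 1/n.
    """
    weights = {
        ("no", "individual"): 1,
        ("partial", "individual"): -1,
        ("no", "group"): -1,
        ("partial", "group"): 1,
    }
    contrast = sum(w * arcsine_transform(cells[k][0]) for k, w in weights.items())
    std_error = math.sqrt(sum(1.0 / cells[k][1] for k in weights))
    return contrast / std_error


# Proportions come from the text; the cell sizes are assumptions.
cells = {
    ("no", "individual"): (0.50, 10),
    ("partial", "individual"): (0.90, 10),
    ("no", "group"): (1.00, 9),
    ("partial", "group"): (8 / 9, 9),  # reported as .89
}
z = interaction_z(cells)
print(round(abs(z), 2))
```

Under these assumptions the magnitude of z is close to the reported value of 2.47; the sign depends only on how the contrast weights are ordered.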


Motive for Comparison


Table 2 presents the mean ratings of desire for construction and
validation in each condition. The data were analyzed as a 2


Table 2
Mean Construction and Validation Scores in Experiment 1

Perceived level of information
and assignment                    Construction    Validation
No
  Individual                          7.00           3.80
  Group                               3.89           8.33
Partial
  Individual                          5.20           6.00
  Group                               5.22           6.11

Note. The higher the mean, the more construction or validation was
desired.

(Perceived Level of Information) × 2 (Assignment) × 2
(Construction/Validation) analysis of variance, with the last factor as
a repeated measure. The analysis revealed an interaction between the
assignment and construction/validation factors, F(1, 34) = 6.82,
p < .02. More importantly, the predicted three-way interaction was also
statistically significant, F(1, 34) = 6.41, p < .02, indicating that the
reported motive for comparison was jointly dependent upon one’s
perceived level of information and the possibility that others possessed
more information.

It should be noted that comparisons between construction and validation
means within any given condition do not achieve statistical
significance, with the exception of the no information/group assignment
condition. Given that it is impossible to determine in advance whether
the no information situation is sufficiently lacking in information
about the light’s movement to produce a “pure” desire for construction,
the present framework can only predict that the perceived level of
information variable will affect desires to construct versus validate in
a relative sense. The predictions call for differential construction
versus validation interests when the no information/individual
assignment cell is compared to each of the other cells. A series of
interaction contrasts suggests that this is the case: for the comparison
with the no information/group assignment condition, t(34) = 3.65,
p < .001; with the partial information/individual assignment condition,
t(34) = 1.96, p < .10; with the partial information/group assignment
condition, t(34) = 1.95, p < .10.⁴ Given that these measures of motive
for comparison occurred after partner selection, these subsidiary data
should be interpreted cautiously, however.
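
The numerators of these interaction contrasts can be recovered from the Table 2 cell means. The sketch below is illustrative only: it computes, for each cell, the construction-minus-validation difference and then subtracts it from the same difference in the no information/individual cell. The function names are hypothetical, and the t ratios themselves cannot be reproduced here because the error term is not reported in this section.

```python
# Cell means from Table 2: (construction, validation).
table2 = {
    ("no", "individual"): (7.00, 3.80),
    ("no", "group"): (3.89, 8.33),
    ("partial", "individual"): (5.20, 6.00),
    ("partial", "group"): (5.22, 6.11),
}


def motive_difference(cell):
    """Construction mean minus validation mean for one cell."""
    construction, validation = table2[cell]
    return construction - validation


def interaction_contrast(reference, other):
    """Difference between two cells in their construction-validation gap."""
    return motive_difference(reference) - motive_difference(other)


reference = ("no", "individual")
contrasts = {
    cell: round(interaction_contrast(reference, cell), 2)
    for cell in table2
    if cell != reference
}
for cell, value in contrasts.items():
    print(cell, value)
```

The contrast with the no information/group cell is the largest of the three, which is consistent with it being the only one reported at p < .001.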


Discussion 


The results of the experiment provide some support for the conceptual
framework that was outlined earlier. Only when perceived level of
information was low and the available others might possess more
information were construction concerns relatively more apparent than
validation concerns. Under these conditions, it appears that it may be
advantageous to discover why another person disagrees. The disagreeing
other may possess some useful information about the object. In contrast,
individual assignment subjects whose perceived level of information was
relatively high expressed little or no interest in discovering the
reasons for another’s disagreement. Rather, these subjects reported a
relatively strong desire to confirm their own judgments and displayed a
greater preference for the agreer than did no information subjects.

As the numerous interaction effects between the assignment and
information variables show, the effects of perceived level of
information differ under group versus individual assignment conditions.
When the available others definitely possessed no more or less
information than the subjects did, construction motives were thwarted.
As a result, the interactive effect that was observed within the
individual assignment conditions (no information subjects desiring
construction more and validation less than did partial information
subjects) is not apparent within the group assignment conditions. If
anything, the data in the group assignment conditions tend in the
reverse direction, with no information subjects desiring validation more
and construction less than did partial information subjects,
t(34) = 1.65, p < .11.

Regardless of the reported motives, subjects in both group assignment
conditions displayed a strong preference for the agreer as a partner.
One possible reason for this preference is a relatively greater concern
for validation. Choosing an agreer, however, could reflect many possible
concerns, including either validation or self-esteem enhancement. In
fact, a number of researchers have drawn a distinction between
informational and self-esteem motives for social comparison (e.g.,
Gruder, 1977; Hakmiller, 1966; Latané, 1966; Singer, 1966; Thornton &
Arrowood, 1966). Experiments 2 and 3 more


⁴ Each of these interaction contrasts reached a conventional level of
statistical significance on the data from the open-ended measures of
construction and validation.
closely address this issue of validation and
its distinction from self-esteem processes.


Experiment 2 


Goethals & Darley (1977) have suggested 
that two variables—agreement and similar- 
ity—may play a critical role in the selection 
of a comparison other. They invoked various 
attributional principles (Kelley, 1967, 1973) 
to note that a person seeking to determine 
whether his or her belief is correct wishes to 
determine whether his or her opinion is en- 
tity based (whether it accurately describes 
the entity) or whether it is person based 
(whether it derives from his own biases). 
Someone similar on attributes affecting the 
belief may agree simply because she/he is 
similar and is subject to the same biases 
that the individual is prone to. Correspondingly,
disagreement from a dissimilar other
can be attributed to the person’s dissimilar- 
ity on the crucially relevant attribute. 

Disagreement from a similar source and agreement from a dissimilar
source appear to be more meaningful. If a similar other disagrees, the
person’s confidence that his opinion is entity based must decrease.
Since the other is similar on the relevant factors, it would be
difficult to attribute the discrepancy to the internal characteristics
of that other. A dissimilar other who agrees, on the other hand,
provides strong confirmation that one’s belief is an accurate evaluation
of the entity (cf. Goethals, 1972; Goethals & Nelson, 1973). Here is a
person with wholly different biases who arrives at the same conclusion.
The judgment which receives confirmation from a dissimilar other must,
then, be due to the compelling qualities of the entity in question.

Experiment 2 examined whether attributions of this sort do actually
occur. Subjects were asked to make causal attributions as to why each of
four others (a similar agreer, a dissimilar disagreer, a dissimilar
agreer and a similar disagreer) agreed or disagreed with them. The
predictions were: (a) the similarity dimension would be assigned a
greater causal role with respect to the judgments of the similar agreer
and the dissimilar disagreer than with respect to the judgments of the
similar disagreer and the dissimilar agreer and (b) the amount of
information possessed would be assigned a greater causal role with
respect to the judgments of the similar disagreer and the dissimilar
agreer than with respect to the judgments of the similar agreer and the
dissimilar disagreer.


Method 
Subjects 
Eighteen Princeton freshmen participated for a payment of $3.00. In the
experiment, subjects were randomly assigned to the no or partial
information conditions. One subject’s data were discarded because the
subject failed to understand how to complete the scales, leaving 9
subjects in the no information condition and 8 in the partial
information condition.


Procedure 


In a first session, all subjects were administered 
two paper and pencil tests. The first three-page, 
18-item test was presented as a measure of one’s 
depth perception ability. The test actually consisted 
of gestalt completion problems. Each item pre- 
sented an incompletely and vaguely drawn picture. 
The subject’s task was to identify the picture and to 
label it. The second test, presented as a measure 
of lateral perception, was also three pages and 18 
items long. Each item presented an assortment of 
geometric figures. Subjects had to select which of 
five possible answers accurately displayed the smaller 
target figures in an arrangement that formed one 
larger figure. 

Similarity manipulation. In order to provide
subjects with feedback that was likely to confirm
their own perceptions as to how well they had 
performed on the two tests, the number of items 
that the subject completed in the allotted time 
was counted for each test. These two numbers were 
placed on each of four graphs of page size, where
two points of the abscissa were labeled depth and 
lateral perception, with the ordinate (8 inches in 
length) simply labeled score. A line connecting the 
two points represented the subject’s profile. In 
addition, each graph presented the supposed scores 
of one other person. This other was identified by 
a letter that was to correspond to that person's 
booth identification letter during the second session. 
One similar other was shown to have a profile that 
was approximately 4 inch higher than the subject’s 
on the depth perception test and 4 inch lower on 
the lateral perception test. Another profile showed 
the remaining similar other to be } inch lower than 
the subject on depth perception and 4 inch higher 
on lateral perception. The dissimilar others were
portrayed by profiles with points 2 inches higher
or lower than the subject’s. One dissimilar other
was shown to have scored much better than the 
subject on depth perception and much lower on 
lateral perception, whereas the other was portrayed 
as having the reverse pattern. In this way, all the 
subjects were displayed as having approximately 
the same overall ability and differing on the specific
components.

The procedure employed in the second session 
was virtually identical to that in the individual 
assignment conditions⁵ of Experiment 1 and need
not be repeated in detail. After reading the materials
in his or her booth, the subject was ushered into
the next room for the interview with the experimenter. 
During this time, the subject’s questions were an- 
swered and the subject was notified of his or her 
information condition. The experimenter then handed 
the subject the test profiles and described how the 
graphs could be read. The experimenter further 
remarked that, whereas the perception of motion 
and distance involved both lateral and depth per- 
ception cues, people approach the perception of mo- 
tion and distance with different styles. In this way, 
the profiles were presented as reflecting a dimension 
that was related to the task at hand (cf. Goethals & 
Darley, 1977; Zanna, Goethals, & Hill, 1975). 

Following “interviews” with each of the supposed 
subjects, the autokinetic effect trials began. After 
each trial of Series 2, the subject received the bogus 
judgments of each of the supposed other subjects. 
One similar and one dissimilar other were portrayed 
as generally agreeing with the subject and the re- 
maining similar and dissimilar others as generally 
disagreeing. These bogus judgments were presented 
via the same scheme used in Experiment 1. For 
half the subjects, the similar and the dissimilar 
others who had scored higher than the subject on 
the depth perception task were shown to agree with 
the subject, and the two others who had scored 
lower, to disagree. For the other half of the sub- 
jects, this pattern was reversed. Furthermore, for 
half the subjects, the similar agreer and the dis- 
similar disagreer were displayed as having seen more 
movement than the subject whereas the similar dis- 
agreer and the dissimilar agreer perceived less. For 
the other half of the subjects, the pattern was re- 
versed such that the latter two individuals per- 
ceived more movement, and the former two less. To- 
gether, then, there were four possible arrangements 
of the others’ responses. Approximately one-quarter
of the subjects in each condition received each of 
the possibilities. This “arrangement variable” pro- 
duced no main or interactive effects on any of the 
variables assessed and accordingly receives no fur- 
ther consideration in this report. 

Following Series 2, subjects were informed that 
they would be completing a questionnaire in which 

they would assess the reasons for the other sub- 
jects’ judgments during Series 2. Subjects were asked 
to assess the importance of each of three dimensions 
in determining other subjects’ judgments. The di- 
mensions were the other’s perceptual style, the 
amount of information the other possessed, and 
circumstances. Following Stevens and Jones (1976), the assessments were
made on two sets of scales. The first set explicitly forced upon the
subject the interdependence that is presumed to exist among the three
dimensions. The subject was instructed to allocate a total of 100 points
among the three possible reasons for the other subject’s agreement or
disagreement with his or her own judgments. A dimension was to be given
a point value that reflected its importance in determining the other
subject’s judgments. Under the first listed dimension, “the subject’s
perceptual style,” two options appeared: (a) “the similarity of the
subject’s perceptual style to my own” and (b) “the dissimilarity of the
subject’s perceptual style to my own.” The subject was to choose (a) or
(b), whichever he or she considered the most appropriate, and then
assign the dimension a point value. Under the second listed dimension,
“the amount of information the subject had about the light’s movement,”
two options were also listed: (a) “the subject’s lack of information”
and (b) “the subject’s possession of information.” Again, the subject
was to select the more appropriate option and assign the factor a point
value. The third dimension listed was “the particular circumstances of
the subject’s observations” (e.g., the angle at which the subject viewed
the light, his or her mood and attentiveness, etc.).
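
Because the three dimensions share a fixed pool of 100 points, any one value is determined by the other two, and the same holds for cell means. The sketch below illustrates this constraint using the perceptual style and information means reported for the no information condition in Table 3; the function name is hypothetical, and the code is only an arithmetic illustration of the interdependent scale.

```python
TOTAL_POINTS = 100.0


def remaining_points(perceptual_style, information, total=TOTAL_POINTS):
    """Points left for the circumstances dimension on the 100-point scale."""
    rest = total - perceptual_style - information
    if rest < 0:
        raise ValueError("allocation exceeds the point total")
    return round(rest, 2)


# (perceptual style, information) means, no information condition, Table 3.
no_information_means = {
    "similar agreer": (67.78, 15.00),
    "similar disagreer": (31.67, 35.00),
    "dissimilar agreer": (31.11, 47.22),
    "dissimilar disagreer": (61.11, 18.33),
}

circumstances = {
    other: remaining_points(style, info)
    for other, (style, info) in no_information_means.items()
}
for other, points in circumstances.items():
    print(other, points)
```

This interdependence is why only two of the three dimensions carry independent information in any analysis of the allocated points.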

The second set of scales did not force any inter- 
dependence upon the subject. Instead, the subject 
rated the importance of the particular dimension on 
an independent scale. As the two sets of scales 
revealed the same effects, only the data from the 
interdependent scales shall be presented. The inter- 
ested reader is referred to Fazio (1978) for a de- 
tailed presentation of the data from the set of in- 
dependent scales. 

Following completion of the scales, subjects were carefully probed for
suspicion and were thoroughly debriefed.


Results 


Subjects allocated a total of 100 points to 
the three attribution dimensions of percep- 
tual style, information, and circumstances. 
Each dimension was assigned a point value 
that reflected its importance in determining 
the other person’s judgments. Table 3 pre- 
sents the means for each dimension. Because 
the value of one rating is perfectly deter- 
mined by the sum of the other two ratings, 
only the perceptual style and information 


⁵ It was decided to run only the individual assignment conditions
because perceived level of information was found to exert an influence
in Experiment 1 only within the individual assignment cells.
Table 3
Means on Each of the Attribution Scales in Experiment 2

Attributional dimension
and perceived level            Similar               Dissimilar
of information             Agreer  Disagreer     Agreer  Disagreer
Perceptual style
  No                        67.78    31.67        31.11    61.11
  Partial                   61.25    43.13        43.75    55.63
Information
  No                        15.00    35.00        47.22    18.33
  Partial                   21.25    37.50        38.75    23.13
Circumstance
  No                        17.22    33.33        21.67    20.56
  Partial                   17.50    20.63        17.50    21.25

Note. The higher the mean, the more importance
assigned to that dimension.


dimensions were analyzed. A 2 (No vs. Partial Information) × 2 (Similar
vs. Dissimilar) × 2 (Agreer vs. Disagreer) multivariate analysis of
variance revealed a strong Similarity × Agreement interaction,
F(2, 14) = 8.16, p < .005. Perceptual style was viewed as more important
in determining the judgments of those others whose responses were
congruent with their perceptual style (the similar agreer and the
dissimilar disagreer) than in determining the judgments of the
incongruent others (the similar disagreer and the dissimilar agreer),
univariate interaction F(1, 15) = 13.60, p < .005. Least significant
difference tests revealed that similar agreers (M = 64.52) were
perceived as having been affected by their perceptual style
significantly more than were similar disagreers (M = 37.40). Likewise,
dissimilar disagreers (M = 58.37) were thought to have been affected by
perceptual style significantly more than were dissimilar agreers
(M = 37.43). A univariate analysis of the information data revealed a
Similarity × Agreement interaction such that the amount of information
possessed was assigned a greater causal role in determining the
judgments of the incongruent others than in determining the judgments of
the congruent others, F(1, 15) = 16.99, p < .001. Similar disagreers
(M = 37.25) were thought to have been affected by the information
dimension significantly more than were similar agreers (M = 18.13), and
dissimilar agreers (M = 42.99) were perceived as having been affected
significantly more than were dissimilar disagreers (M = 20.73). No
multivariate or univariate interactions with perceived level of
information approached statistical significance.⁶
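
The perceptual style means quoted above can be recovered from Table 3 by averaging over the two information conditions. The sketch below is illustrative: it assumes a simple unweighted average of the no and partial information cell means (an assumption that reproduces the reported values to within rounding), and the variable and function names are hypothetical.

```python
# Perceptual style means from Table 3: (no information, partial information).
perceptual_style = {
    ("similar", "agreer"): (67.78, 61.25),
    ("similar", "disagreer"): (31.67, 43.13),
    ("dissimilar", "agreer"): (31.11, 43.75),
    ("dissimilar", "disagreer"): (61.11, 55.63),
}


def unweighted_mean(values):
    """Average of cell means, ignoring any difference in cell sizes."""
    return sum(values) / len(values)


marginal = {other: unweighted_mean(cells) for other, cells in perceptual_style.items()}

# Congruent others: response matches what perceptual style would predict.
congruent = unweighted_mean(
    [marginal[("similar", "agreer")], marginal[("dissimilar", "disagreer")]]
)
incongruent = unweighted_mean(
    [marginal[("similar", "disagreer")], marginal[("dissimilar", "agreer")]]
)

for other, mean in marginal.items():
    print(other, round(mean, 2))
print("congruent minus incongruent:", round(congruent - incongruent, 2))
```

The positive difference between congruent and incongruent others is the pattern that the Similarity × Agreement interaction on perceptual style describes.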

The results confirm Goethals & Darley’s (1977) attributional analysis.
Relative to similar disagreers and dissimilar agreers, similar agreers
and dissimilar disagreers were perceived as having been influenced by
perceptual style. The judgments of similar disagreers and dissimilar
agreers, on the other hand, are thought to emanate more from their
possession of information than are judgments of the two remaining
others. Furthermore, at least in the present setting, these
attributional patterns appear to hold regardless of one’s perceived
level of information and one’s motive for social comparison.


Experiment 3 


Experiment 3 examined the implications of the attributional patterns
found in Experiment 2 for actual choice of comparison other. From an
attributional perspective, only a similar disagreer and a dissimilar
agreer are potentially meaningful and informative comparison choices. It
was predicted that these two sources differ in the degree to which they
appeal to a person concerned with construction versus a person concerned
with validation.

The person seeking construction perceives 
himself to suffer from a lack of information 
with which to “construct reality.” Experi- 
ment 1 suggested that those seeking construc- 


⁶ These data were also analyzed in terms of scores coded for the
direction of the various attributions. The analysis showed that the
judgments of similar agreers and dissimilar disagreers were attributed
largely to similar and dissimilar perceptual styles, respectively. The
disagreement of a similar other and the agreement of a dissimilar other
were attributed largely to the others’ possession of information. No
significant effects were observed on the circumstances scale. The
analysis is reported in detail in Fazio (1978).



tion are more interested in discovering why 
some other person disagrees with them than are those people seeking
validation. These differential preferences should be particularly
evident when it is clear that this disagreeing other is similar on
relevant attributes. Given this similarity, this other person should be
agreeing, unless he is aware of some information that the subject
himself lacks. To the extent that the partner preferences of no
information subjects in the present experimental situation are dictated
solely by construction concerns, the similar disagreer should be more
strongly preferred than any of the available others. At a minimum, no
information subjects should display a greater preference for the similar
disagreer than partial information subjects, since the former group
should be relatively more interested in construction and less interested
in validation than the latter group.

Analogously, it can be hypothesized that those seeking validation should
display a greater preference for the dissimilar agreer. Such a person,
having what he considers sufficient information to “construct reality,”
is interested in determining the validity of his judgment. The
dissimilar agreer is particularly useful in this regard. Agreement from
a dissimilar source informs the individual that he has based his
judgment upon the “facts” and not upon any personal biases. Thus, to the
extent that partial information subjects are concerned solely with
validation, they should display a stronger preference for the dissimilar
agreer than for any of the three others. Given their relatively greater
interest in validation over construction, partial information subjects
should, at a minimum, prefer the dissimilar agreer more than no
information subjects.


Method


Subjects 


Twenty-four Princeton freshmen participated for 
a payment of $3. Subjects were randomly assigned to 
no or partial information conditions. The data from 
2 subjects were excluded from the analysis because 
the subjects were skeptical of either the agreement 
or the similarity feedback. The final sample consisted
of 11 subjects in each of the two conditions.
Procedure


With the exception of the dependent variables, the 
procedure was identical to that employed in Experi- 
ment 2. After the manipulations had been accom- 
plished, subjects in Experiment 3 completed the same 
dependent measures as in Experiment 1. Partner 
selection ratings were made for Series 3. Measures 
concerning motivations behind partner selection were
also completed.

These subjects then completed checks on the 
manipulations of similarity and agreement. On a 0 
(very dissimilar) to 10 (very similar) scale, subjects 
rated “how similar the depth and lateral perception 
test results of the other subjects were to your own 
test results.” Also, on a 0 (strongly disagreed) to 10
(strongly agreed) scale, subjects rated “the degree to 
which the responses of each of the other subjects 
during Series 2 agreed or disagreed with your own 
responses.” It should again be noted that subjects 
were not expected to have committed all the similar- 
ity and agreement information to memory. The pro- 
files and judgments of each of the others were still 
available to the subject. These measures merely 
served to check whether subjects could correctly in- 
terpret the information provided about the others. 


Results 


Manipulation Checks 


Similarity and agreement. Subjects in Experiment 3 completed two
measures designed to check that the similarity and agreement information
was understandable to both these subjects and those in Experiment 2. A 2
(No vs. Partial Information) × 2 (Similar vs. Dissimilar) × 2 (Agreer
vs. Disagreer) analysis of variance, with the last two factors as
repeated measures, was conducted on each of the manipulation checks. A
strong main effect of the similarity variable was found on the
similarity ratings (similar M = 8.41, dissimilar M = 2.23),
F(1, 20) = 491.25, p < .001. Correspondingly, the agreement manipulation
strongly affected the ratings of the degree to which the others’
judgments agreed with the subject’s own judgments (agreer M = 8.21,
disagreer M = 3.41), F(1, 20) = 141.42, p < .001. Thus, subjects were
capable of drawing the desired distinctions among the supposed four
others present.

Certainty. Perceived level of information 
has been presumed in this report to relate to 
certainty. The more information an individual 
believes himself or herself to possess about the 
object in question, the more certain that per- 
Table 4
Mean on Each Certainty Measure in Experiments 2 and 3

                                  Perceived level of information
Measure             Experiment         No          Partial
Difference ratio        2              .84           .68
                        3              .84           .72
Symmetry                2              .57           .21
                        3              .38           .26

Note. On both measures, the higher the mean, the
greater the uncertainty.


son is apt to feel about his/her judgment of 
that object. Because of subjects’ reticence to 
admit of extreme uncertainty, direct measures 
of certainty tend to be relatively unpowerful. 
In Experiment 1, no effects were found with a 
direct measure of certainty (see Footnote 1). 
Thus, Experiments 2 and 3 involved an in- 
direct measure of certainty. In addition to 
their best estimates of the distance that the 
light moved, subjects were asked to indicate 
the maximum and minimum possible distances 
that the light moved. 

From this information, a two-component index of certainty was derived:⁷
(a) the mean (across the three trials of Series 1) of the ratios of the
difference between the maximum and minimum estimates to the best
estimate and (b) the proportion of the three trials on which the subject
placed his or her best estimate symmetrically between the minimum and
maximum judgments. For each component, higher scores reflect less
certainty.
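
The two components can be sketched for a single subject as follows. This is an illustration under stated assumptions, not the scoring procedure actually used: the three trials are hypothetical data, the function names are hypothetical, and "symmetrically" is taken to mean that the best estimate equals the midpoint of the minimum and maximum within a small numerical tolerance.

```python
def difference_ratio(minimum, best, maximum):
    """Width of the min-max range relative to the best estimate."""
    return (maximum - minimum) / best


def is_symmetric(minimum, best, maximum, tolerance=1e-9):
    """True if the best estimate sits at the midpoint of the range."""
    return abs(best - (minimum + maximum) / 2.0) <= tolerance


def certainty_index(trials):
    """Return (mean difference ratio, proportion of symmetric trials).

    Higher values on either component indicate less certainty.
    """
    ratios = [difference_ratio(*trial) for trial in trials]
    symmetric = [is_symmetric(*trial) for trial in trials]
    return sum(ratios) / len(ratios), sum(symmetric) / len(symmetric)


# Hypothetical (minimum, best, maximum) estimates in inches for Series 1.
trials = [(1.0, 3.0, 5.0), (2.0, 3.0, 6.0), (1.0, 2.0, 4.0)]
mean_ratio, proportion_symmetric = certainty_index(trials)
print(round(mean_ratio, 2), round(proportion_symmetric, 2))
```

In this hypothetical case only the first trial places the best estimate exactly at the midpoint, so the symmetry component is one third.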

Multivariate analysis of the data was performed in order to ask the
question “Is there any linear combination of the difference ratio and
the symmetry measure which discriminates between the conditions?” A 2
(Experiment 2 vs. 3) × 2 (No vs. Partial Information) multivariate
analysis of variance revealed a main effect of the perceived level of
information variable, F(2, 34) = 4.66, p < .02, with no main effect of
experiment (F < 1) nor an interaction effect, F(2, 34) = 1.04. Table 4
presents the univariate means on each of the certainty measures.
Although the univariate analysis of variance of the difference ratio
measure revealed no significant effects, the univariate analysis of the
symmetry measure revealed a main effect of the information factor,
F(1, 35) = 7.01, p < .02, no main effect of experiment (F < 1), and no
interaction effect, F(1, 35) = 1.99, p > .15.

Although the two certainty measures ap- 
pear to be reasonable, numerous other plausi- 
ble measures can probably also be derived. 
Nevertheless, the present measures and the 
above multivariate analysis do provide some 
evidence that subjects in the no information 
condition may have been less certain of their 
judgments than were the partial information 
subjects.


Partner Selection 


Table 5 presents the mean ratings of the 
extent to which each of the supposed others 


⁷ These two components were derived from the notion that the
experimental task of indicating a best estimate and minimum and maximum
estimates can be approached in two very different manners. A subject may
make his or her best estimate of the distance the light moves and then
decide upon a minimum and a maximum, in effect placing a confidence band
around his or her best estimate (e.g., “I think it moved 3 inches, but
it could be anywhere from 1 to 5 inches”). If this approach were taken,
then the difference between the minimum and maximum estimates can be
considered to reflect a confidence interval. Since the magnitude of the
difference between maximum and minimum estimates is dependent upon the
distance perceived, the ratio of the difference to the best estimate of
the distance the light moved was computed. The mean ratio across the
three trials of Series 1 forms the first component.

The task may have been approached in another manner, however. A subject
may first establish a minimum and maximum and then set his or her best
estimate somewhere within this range. If this approach were taken, the
difference between maximum and minimum estimates could not be considered
to reflect a confidence interval. With this approach, the logical
recourse when one is not sure where within the range to locate the best
estimate is to simply average the minimum and maximum estimates (e.g.,
“Let’s see, it moved a minimum of 1 inch and a maximum of 5 inches, I’ll
say it moved 3 inches”). Thus, a subject might be said to be uncertain
of his or her response if he/she simply chooses as the best estimate the
average of the earlier established minimum and maximum distances, that
is, if he/she simply sets a judgment symmetrically between the minimum
and maximum. This approach is reflected in the second component that was
computed.

was desired as a partner. A 2 (No vs. Partial Information) × 2 (Similar
vs. Dissimilar) × 2 (Agreer vs. Disagreer) analysis of variance, with
the last two factors as repeated measures, revealed a main effect of
each of the two within variables. In general, similar others (M = 6.07)
were preferred over dissimilar others (M = 5.27), F(1, 20) = 5.37,
p < .05, and the agreers (M = 7.27) were preferred over disagreers
(M = 4.07), F(1, 20) = 20.54, p < .001. Each of these within-subjects
factors interacted with perceived level of information. The general
preference for a similar other over a dissimilar other was apparent in
the no information condition (similar M = 6.59, dissimilar M = 4.77) but
not in the partial information condition (similar M = 5.55, dissimilar
M = 5.77), F(1, 20) = 8.88, p < .01. Correspondingly, the preference for
an agreer over a disagreer was stronger in the partial information
condition (agreer M = 8.18, disagreer M = 3.14) than in the no
information condition (agreer M = 6.36, disagreer M = 5.00),
F(1, 20) = 8.76, p < .01. However, the three-way interaction, which was
to be expected if no information subjects had been concerned solely with
construction and partial information subjects solely with validation,
did not occur, F < 1.

Some confirmation of the hypotheses is provided by a priori t ratio
comparisons of the no vs. partial information means. These comparisons
rely on the notion that despite any concerns extraneous to informational
social comparison that may have arisen in the experimental situation,
the no information and partial information subjects should have
divergent concerns with construction versus validation and hence should
display differential preferences for the similar disagreer and the
dissimilar agreer. As predicted, no information subjects expressed a
greater preference for the similar disagreer than did partial
information subjects, t(20) = 3.57, p < .005, and partial information
subjects preferred the dissimilar agreer significantly more than did no
information subjects, t(20) = 3.23, p < .005. The two conditions did not
differ in their ratings of the similar agreer, t(20) = 1.08, or the
dissimilar disagreer, t(20) < 1.

Table 5
Mean Partner Ratings in Experiment 3

Perceived level          Similar                Dissimilar
of information      Agreer   Disagreer      Agreer   Disagreer

No                   7.27      5.91          5.46      4.09
                     (6)       (3)           (1)       (1)
Partial              8.18      2.91          8.18      3.36
                     (7)       (0)           (4)       (0)

Note. The higher the mean, the more that person
was preferred as a partner. Numbers in parentheses
are the number of subjects who chose that person as
a partner on the forced-choice measure.

Inspection of Table 5 reveals that the data
were not as was to be expected if subjects had
been concerned only with informational social 
comparison. In each condition, the comparison 
other who was predicted to be most preferred, 
given that preferences were to be dictated 
only by informational motives, was not in fact 
most highly preferred. Although no informa- 
tion subjects did significantly (p < .05, by a 
least significant difference test) prefer the 
similar disagreer to the dissimilar disagreer, 
their ratings of the similar disagreer and the 
dissimilar agreer did not differ, and they 
tended (p < .20), if anything, to prefer the 
similar agreer to the similar disagreer. Partial 
information subjects did significantly prefer 
the dissimilar agreer to both the similar dis- 
agreer and the dissimilar disagreer, but they 
displayed equal preference for the two agreers. 
Thus, although the data were supportive in 
the sense that they suggest that desires to 
compare with a similar disagreer and desires
to compare with a dissimilar agreer each vary
as a function of perceived level of informa-
tion, the similar disagreer was not the most 
preferred partner among the no information 
subjects, and the dissimilar agreer was not the 
most preferred other among the partial infor- 
mation subjects. 

The forced-choice data are also presented in 
Table 5. The data display the same general 
pattern as the ratings data. However, none 
of the No versus Partial Information com- 
parisons are statistically significant, largely 
because of the overwhelming preference
among both groups of subjects for the similar
agreer. 

Motive for comparison. A 2 (No vs. Par- 
tial Information) X 2 (Construction vs. Val- 
idation) analysis of variance, with the last 
factor as a repeated measure, was performed 
on the ratings of desire for construction and 
validation. As in Experiment 1, the predicted 
interaction achieved statistical significance, 
F(1, 20) = 4.93, p < .05. In the no informa- 
tion condition, the construction and validation 
means were 7.64 and 5.27, respectively; in the 
partial information condition, these means 
were 7.00 and 8.09, respectively. Within-con-
dition comparisons of construction means to 
validation means did not reach statistical sig- 
nificance. 
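
The crossover pattern behind the interaction can be read from the four reported means as a difference of differences. The short Python sketch below only rearranges those means; the variable and function names are illustrative assumptions, and it does not reproduce the analysis of variance itself.

```python
# Cell means reported in the text (ratings of desire), keyed by
# information condition and comparison motive.
means = {
    ("no", "construction"): 7.64,
    ("no", "validation"): 5.27,
    ("partial", "construction"): 7.00,
    ("partial", "validation"): 8.09,
}


def construction_minus_validation(condition):
    """Within-condition difference: construction mean minus validation mean."""
    return means[(condition, "construction")] - means[(condition, "validation")]


no_diff = construction_minus_validation("no")
partial_diff = construction_minus_validation("partial")

# The interaction contrast is the difference between the two differences.
interaction_contrast = no_diff - partial_diff

print(round(no_diff, 2), round(partial_diff, 2), round(interaction_contrast, 2))
```

A positive difference in the no information condition and a negative one in the partial information condition is the crossover the hypothesis predicts, even though neither within-condition difference was significant on its own.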


Discussion 


The value of similar disagreers versus dis- 
similar agreers appears to depend upon one’s 
perceived level of information. While these two
sources are each potentially meaningful from 
an attributional perspective, they differ in the 
degree to which they appeal to an individual 
interested in construction vs. validation. As
perceived level of information increases (and, 
correspondingly, as interest in construction 
tends to decrease and interest in validation 
tends to increase), preference for a similar 
disagreer decreases and preference for a dis- 
similar agreer increases. The similar disagreer
is relatively more appealing to the person 
interested in construction than to the person 
interested in validation. If one is in a hypoth- 
esis-generating stage, it appears to be impor- 
tant to discover why a similar other, who 
should be agreeing, actually disagrees. The 
dissimilar agreer, on the other hand, is pre- 
ferred relatively more by a person interested 
in validation than by a person interested in 
construction. Possessing what she/he believes
to be a sufficient level of information, the per-
son interested in validation may find the dis-
similar agreer a useful comparison other. His
or her agreement can only be attributed to
their common sharing of the “facts” and in-
forms the individual that his or her own judg-
ment is not based upon some personal idiosyn-
crasy.

The difference in preference for a dissim-
ilar agreer that was observed between no and 
partial information conditions in Experiment 
3 also serves to clarify an interpretational 
difficulty encountered with respect to the data 
in Experiment 1. As pointed out earlier, selec- 
tion of the agreer as a partner in Experiment 
1 could have been motivated by many factors 
—only one of which is validation. For ex- 
ample, the relatively stronger preference for 
an agreer in the partial information conditions 
could have been due to self-esteem motives.
This same difficulty is not encountered with 
respect to Experiment 3 because of the addi- 
tion of the similarity manipulation. A dis- 
similar agreer appears to be the relevant 
choice for determining whether one’s judg- 
ment is accurate. The fact that this person is 
more preferred by partial than by no informa- 
tion subjects provides evidence of an in- 
creased concern for validation in the partial 
as opposed to the no information conditions. 
The similar agreer would appear to be the 
relevant partner choice if one were concerned 
with self-esteem maintenance or enhancement. 
Since the two conditions did not differ with 
respect to their preferences for the similar 
agreer, it appears that self-esteem motives do 
not vary as a function of perceived level of 
information. 

It should be recalled that the subjects in 
Experiment 3 did not display absolute prefer- 
ences for the predicted comparison other. 
That is, no information subjects did not most
strongly prefer the similar disagreer. Nor did
partial information subjects most strongly
prefer the dissimilar agreer. These absolute
preferences were to be expected to the extent
that partner choice was dictated solely by in-
formational concerns. Apparently, the experi-
mental situation was not successful in creat-
ing such “pure” motives. Whether “pure”
motives and absolute preferences occur in any
given situation is likely to depend upon a
number of additional parameters. In partic-
ular, it is likely to depend upon the im-
portance of the task to the individual (cf.
Jones & Regan, 1974). There are many rea-
sons to choose a specific other as a partner.
The less important the given task and infor-
mational social comparison reasons are in a
given situation, the more likely it is that these
other factors will affect absolute preferences. 
In the present case, the noncritical nature of 
the experimental task and the anticipated ease 
of discussing a judgment with a similar agreer 
clearly affected absolute preferences. Only 
when the data are considered in relative terms 
are informational social comparison motives 
and their dependence upon perceived level of 
information apparent. 

Why, when the data are considered in ab-
solute terms, did so many subjects in each of
the conditions prefer to have the similar 
agreer as a partner? One possibility is that 
subjects who chose the similar agreer may 
have done so because the task threatened 
their self-esteem. That is, rather than being 
primarily concerned with informational social 
comparison, these subjects may have been 
largely concerned with their self-esteem. A 
further analysis of the data provides some 
interesting insight. An internal analysis was 
performed in order to examine whether sub- 
jects who chose the similar agreer felt more 
uncertain of their judgments than did subjects 
who made the predicted partner choice. The 
discriminant weights from the multivariate 
analysis of variance on the difference ratio 
and symmetry measures of certainty were em- 
ployed in order to construct a weighted sum 
that could serve as an index of certainty. 
Within the no information condition, those 
who chose the similar agreer tended to be less 
certain of their judgments than those who 
chose the similar disagreer (p < .10, by a
two-tailed Mann-Whitney test). Among the
partial information subjects, a similar pattern
was evident. Those who chose the similar
agreer tended to be less certain than those
who selected the dissimilar agreer (p < .10).
Collapsed across the two conditions, there was 
a significant difference between the subjects 
who chose the similar agreer and those who 
chose either the similar disagreer or the dis- 
similar agreer (p < .01). The results of this 
internal analysis suggest that an individual
will not engage in either process of informa-
tional social comparison if he/she is very un-
certain of his or her judgment. Rather than
reveal the context of his or her uncertainty,
the individual may simply ignore the need for
informational social comparison. Concerns
with self-esteem maintenance may become 
primary. The similar agreer is likely to have 
been perceived as the least threatening poten- 
tial partner. Discussion with someone who is
similar in perceptual style and who has agreed 
on past trials may have been viewed as a rela- 
tively safe and easy alternative. In addition, 
such comparison may provide a sufficient 
bolstering of one’s certainty that one may 
subsequently be willing to test one’s judgment 
in a more objective manner via informational 
social comparison. 

Although the above internal analysis is
quite revealing, the finding is a bit troubling 
in that it runs counter to the basic premise 
that informational social comparison is to be 
expected under conditions of uncertainty. Ap-
parently, extreme uncertainty leads, not to a
desire for information, but to a feeling of threat
and a desire to protect one’s self-esteem. (See 
Conolley, Gerard, & Kline, 1978, for a similar 
discussion of the relation between uncertainty 
and informational vs. self-esteem motives 
within the context of ability comparison.) The 
current findings suggest that once a threshold 
of certainty is achieved such that the indi- 
vidual no longer feels threatened, the person 
is willing to “risk” informational social com- 
parison. Whether construction or validation 
processes then occur will depend upon the 
person’s level of uncertainty. 

Future research will need to address the 
issue of informational versus self-esteem com- 
parison further. Presumably, however, con- 
cerns for self-esteem among individuals who 
feel extremely uncertain would be overridden 
in situations where the individuals are making 
important and consequential decisions and 
need information to do so. That is, the thresh-
old certainty level that must be achieved for
a person to “risk” informational social com- 
parison may vary as a function of the im- 
portance of the judgment in question. Fur- 
thermore, in such important situations, one 
of the comparison others whose value tends 
to vary as a function of perceived level of 
information (i.e., the similar disagreer or the 
dissimilar agreer) may become the most 
strongly preferred comparison other. That is, 
the relative effects observed in Experiment 3
when no and partial information conditions
were compared may also become evident in 
absolute terms when important judgments are 
being made.

The hypothesis that sex role development depends in part upon children’s tend-
encies to imitate same-sex individuals more than opposite-sex models is central
to most theories of sex typing. Yet Maccoby and Jacklin, in an influential re-
cent review of the literature, conclude that the hypothesis has been discon-
firmed. In the present article, it is argued that the research on which Maccoby
and Jacklin based their conclusion is weak both methodologically and concep-
tually. This article presents a modified social learning theory account of the
contribution of imitation to sex role development. It is suggested that children
learn which behaviors are appropriate to each sex by observing differences in
the frequencies with which male and female models as groups perform various
responses in given situations. Furthermore, children employ these abstractions
of what constitutes male-appropriate and female-appropriate behavior as models
for their imitative performance. A first experiment confirmed that children en-
gage in these processes. A second experiment extended the validity of the form-
ulation by showing how it accounts for children’s imitation of individual adult
models. Specifically, it was shown that a child’s imitation of an adult is strongly
influenced by the degree to which the child believes that the adult usually dis-
plays behaviors that are appropriate to the child’s sex (that is, behaviors that
occur with greater frequency among individuals of the child’s sex than among
opposite-sex persons). In sum, the present research reinstates same-sex imita-
tion as a viable mechanism of sex role development.


Sex role development, or sex typing, is the
process by which children come to adopt the
attitudes, feelings, behaviors, and motives
that are culturally defined as appropriate¹
for their sex (Hetherington, 1967; Mischel,
1970). Virtually every leading theory of sex
typing assigns a prominent place to imitation
in the process. Psychoanalytic theory (Freud,
1949), cognitive-developmental theory (Kohl-
berg, 1969), cognitive consistency theory
(Kagan, 1964), and social learning theory
(Bandura & Walters, 1963; Mischel, 1966,
1970) all suggest that psychological differ-
ences between the sexes are at least in part
perpetuated by the fact that boys and girls
as groups are more inclined to imitate re-
sponses displayed by same-sex models than
they are to imitate opposite-sex models. Fur-
thermore, within each sex, individual differ-
ences in children’s masculinity and femininity
(or the degree to which a child displays re-
sponses appropriate, respectively, to the male
or female sex) are believed to stem in part
from the strength of children’s tendencies to
prefer imitation of one sex model over the
other.

¹ The question of whether a given response is more
“appropriate” to one or the other sex can be deter-
mined in any of a number of ways. These include
consensus by a panel of judges, empirical determina-
tion of the relative frequency with which the re-
sponse is performed by the two sexes, and differences
in the consequences boys and girls receive for per-
forming a given response. Regardless of the manner
in which one chooses to operationalize sex-appro-
priateness of behavior, however, the processes through
which sex role learning is effected are likely to be
similar, and it is with process that the present
article is concerned.

Recently, the validity of the hypothesis 
that psychological differences between the 
sexes are due in any significant degree to dif- 
ferential imitation of same-sex models has 
come under serious attack, notably by Mac- 
coby and Jacklin (1974). These authors argue
forcefully in their influential volume that the 
research to date fails to support the idea that 
differential imitation of like-sex models is 
probably a mechanism through which sex 
differences eventuate. 

In arriving at their conclusion, Maccoby 
and Jacklin (1974) reviewed more than 20 
imitation studies in which both sex of subject 
and sex of model had been included as factors 
in the research design, and they discovered 
that the vast majority of such studies failed 
to find a significant interaction of sex of ob- 
server and sex of model on children’s imita- 
tion. A more recent review (Barkley, Ullman, 
Otto, & Brecht, 1977) included more than 80 
studies and reached essentially the same con-
clusion. Furthermore, the same null result
holds regardless of whether the modeled be- 
havior is traditionally sex-typed or not and 
of whether the models are strange adults or 
familiar persons such as the child’s own par- 
ents. Maccoby and Jacklin therefore con- 
cluded that it is necessary to search for fac- 
tors other than same-sex imitation (e.g., bio- 
logical determinants or direct elicitation and 
reinforcement of sex-appropriate behavior) to 
account for sex typing. If accepted, the Mac- 
coby and Jacklin conclusion would constitute 
a devastating blow to most psychologists’ 
understanding of the sex-typing process since,
as has been pointed out, most theories assign
a prominent role to the imitation of same-sex 
models. 

We believe that it is premature to abandon
the same-sex imitation hypothesis altogether.
The results of these literature reviews suggest,
however, that if the same-sex imitation hy-
pothesis is to be retained, then a reconceptual-
ization of the way in which same-sex imitation
contributes to sex role development may be
in order.

It is our contention that the typical investi- 
gation designed to explore same-sex imitation 
has employed an experimental paradigm that 
is conceptually remote from how imitation 
actually contributes to sex role development 
and therefore that the null results of much 
of the previous research on same-sex imitation 
constitute an inappropriate base for rejecting 
the same-sex imitation hypothesis. In the 
typical investigation, children are exposed to
a single (often unfamiliar) male and/or fe-
male performing (often novel) responses and
are then tested for imitation of the models’
responses. The same-sex imitation hypothesis,
in its usual form, predicts that children will
view the actions of the single same-sex model
as a better guide for their own behavior than
the actions of the opposite-sex model and will
thus imitate the former model more. This is
actually quite unlike the way in which chil-
dren use the responses of same- and cross-sex
models as guides to their own behavior in
naturalistic settings. In the real course of de-
velopment, children discern what behaviors
are appropriate for each sex by watching the
behavior of many male and female models
and by noticing differences between the sexes
in the frequency with which certain behaviors
are performed in certain situations. They
then use these abstractions of sex-appropriate
behavior as “models” for their own imitative
performance. This is quite a different view
from assuming that children develop such
widely generalized tendencies to imitate same-
sex models that they readily take the behavior
of a single same-sex model as a more reliable
guide to their own behavior than the behavior
of an opposite-sex model, even when the
same-sex model is unfamiliar or displays novel
responses, or when the child has no idea about
whether the model’s behavior is likely to be
consensually validated by others of the mod-
el’s sex.

When conceptualizing the contribution of
imitation to sex-role development, we find it
useful to draw on the distinction made by so-
cial learning theorists between the observa-
tional learning and performance of modeled
behavior (Bandura, 1965, 1977; Mischel,
1970). According to social learning theory,
through the observation of the behavior of
models of both sexes, children cognitively
acquire the potential to perform a vast reper-
toire of social behaviors that includes not only
behaviors appropriate to the child’s own sex
but also (at least most of) the behaviors ap-
propriate to the opposite sex. This is accom-
plished through such observational learning
strategies as attending to the models’ actions,
coding the models’ responses into covert
imaginal or symbolic representations, and
mentally rehearsing the models’ actions. We
propose that an additional and crucial (for
sex typing) aspect of the observational learn-
ing process involves children’s coding, or or-
ganizing in memory, of various responses as
male-appropriate or female-appropriate on 
the basis of their having witnessed different 
proportions of available male and female 
models performing the responses. For ex- 
ample, if children observe that 80% of avail- 
able male models perform a particular re- 
sponse in a given situation but only 5% of 
female models perform it, they are likely to 
code the response as male-appropriate or 
masculine. 
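
The proposed coding step can be illustrated with a small Python sketch. It is only a toy rendering of the idea that responses are labeled from the proportions of male and female models seen performing them. The function name, the labels, and the `min_gap` threshold are assumptions made for illustration; the article specifies no numerical criterion.

```python
def code_response(male_proportion, female_proportion, min_gap=0.25):
    """Label a response from the proportions of models observed performing it.

    min_gap is a hypothetical threshold (not taken from the article) for how
    far apart the two proportions must be before the response is sex-coded.
    """
    gap = male_proportion - female_proportion
    if gap >= min_gap:
        return "male-appropriate"
    if gap <= -min_gap:
        return "female-appropriate"
    return "neutral"


# The example from the text: 80% of male models and 5% of female models.
print(code_response(0.80, 0.05))
```

Under these assumed settings, the 80% versus 5% example is coded as male-appropriate, whereas a response performed equally often by both sexes stays neutral.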

Not everything learned is performed, how-
ever. Thus, although most boys know how to
put on a dress, apply makeup, and primp in
front of a mirror, few boys actually choose to
perform these behaviors. Children are most
likely to perform behaviors they have coded
as appropriate for their own sex. According
to social learning theory, which responses
children select to perform from their reper-
toires depends primarily on the response con-
sequences the children anticipate. Children
should prefer to perform behaviors exhibited
by same-sex models, because they have more
often been rewarded and less often criticized
for imitating same-sex models and because
they gradually learn that the social environ-
ment holds similar expectations and imposes
similar reinforcement contingencies on them
and others of their sex (Bandura, 1969; Bus-
sey & Perry, 1976; Mischel, 1970). In sum,
although children acquire, via observational
learning, behaviors that are appropriate to
both sexes and label certain responses as
same-sex-appropriate and others as opposite-
sex-appropriate, they prefer spontaneously to 
imitate actions coded as same-sex-appropriate. 
This is because they infer that what is ex- 
pected or permissible behavior for others of 
their sex is also likely to be appropriate be- 
havior for them. 

Our first experiment constitutes a labora- 
tory analog study of how we hypothesize 
same-sex imitation to operate in naturalistic 
settings. In real life, male and female models 
as groups frequently diverge in the responses 
they display in a given situation. When con-
fronted with a similar situation and the option 
to perform one of the responses they have 
seen male and female models exhibit, children 
should be inclined to choose an option they 
recognize as same-sex appropriate. Consider, 
for example, a 10-year-old boy who has re-
cently transferred to a new school. In his first
days, he observes that during recess the 
majority of girls play hopscotch, whereas most 
of the boys play baseball; that during library 
periods, most boys check out a Hardy Boys 
mystery, whereas most of the girls read Nancy 
Drew; that when given a choice of technical 
electives, boys choose woodwork, whereas 
girls choose sewing. Even if the boy has not 
previously observed sex differences in the per- 
formance of these various responses, he is now 
likely to code the responses as masculine or 
feminine and to use these codings as guides to 
his own response selection. Experiment 1 was
designed to determine whether these sugges-
tions are tenable.


Experiment 1 


Method 


Overview. Boys and girls were assigned to one of
three modeling conditions or to a no-model control
condition. In all the modeling conditions, children
viewed 8 adult models (4 male, 4 female) individ-
ually and indicated their preferences on a series of
two-choice preference tasks. In a first experimental
condition, on every trial all the male models chose
one of the items and all of the women chose the other
item. In a second condition, on every trial 3 of the
men and 1 of the women chose one item and the re-
maining man and women chose the other item. In a
third condition, on every trial 2 of the men and 2 of
the women chose one item and the other 2 men and 2
women chose the other item. Subjects in the no-
model condition skipped this phase. Subsequently,
children in all groups were asked to indicate their 
own preferences for each item pair. The hypothesis 
was that children’s endorsement of an item would be 
a function of the difference in the frequency with 
which same- and opposite-sex models had displayed 
a similar preference—the greater the difference, the 
more likely same-sex imitation was hypothesized to 
be. 

Participants. Subjects were 96 8- and 9-year-old
children, half boys and half girls, who attended a
public school in a middle-class suburb of Brisbane,
Australia. Models were four men and four women, all
in their twenties. The experimenter was a female in
her early twenties.

Stimuli. The preference task involved 16 pairs of
objects. Within each pair the items belonged to the
same class (e.g., a banana and an apple; a small
plastic cow and a small plastic horse). Models were 
shown (on a black and white video monitor) mak- 
ing their choices from among the actual items, which 
were presented to them by an adult female (not the 
experimenter). The preference task was presented to 
subjects by showing them 16 color slides, each de- 
picting one item pair. We deliberately selected re- 
sponses as modeling stimuli that did not have clearly 
established histories of being more appropriate for 
one sex than the other. This was because our major 
goal was to demonstrate in the laboratory a major 
process through which previously sex-neutral re- 
sponses become sex-linked. Furthermore, the use of 
already sex-typed responses could introduce poten- 
tially troublesome and irrelevant factors (e.g., the 
relative interest value of masculine and feminine 
behaviors) that might render unambiguous interpre- 
tations of the results in terms of cause and effect 
relations quite difficult. 

Procedure. Each child was brought individually 
by the experimenter to a spare room on the school 
premises and, if assigned to a modeling condition, 
was seated before the video monitor. To modeling
subjects, the experimenter explained that they would
view some people telling which things they “like
better.” In all modeling groups, the children saw, on
the monitor, the experimenter place the first
pair on a table before the eight models, indi-
vidually solicit a preference from each model, and
then continue to the next item pair until all 16
trials were completed. The models, who were asked 
for their preferences in a different order on every 
trial, always indicated their preference by pointing 
to the item of their choice and saying, “I like the 
(item) better.” On each trial, four models always 
chose one item and the other four the remaining 
item. This equated for all groups the number of 
models endorsing a given item. Twelve boys and 12 
girls were randomly assigned to each of three model-
ing conditions or to a no-model control condition
which were run concurrently in a randomized blocks 
procedure. 
In the first modeling condition, or high (100%) 
within-sex consensus modeling condition, for every
item pair all the men chose one item and all the
women chose the other item. Two modeling tapes
were made for this condition, with half the subjects
seeing one tape and the remainder seeing the other.
The tapes were designed to counterbalance for the 
sex of model displaying a given item preference. For
the first tape, whether the men or the women
chose a given item was determined ran-
domly for each trial. On the second tape, the sex of
the models choosing a given item was opposite to the 
sex of the models who had chosen that item on the 
first tape. 

In the second modeling condition, or moderate 
(75%) within-sex consensus modeling condition, on 
every trial three men and one woman chose one 
item and the remaining three women and one man 
chose the other item. The identity of the man who 
chose as did the majority of the women and of the 
woman who chose as did the majority of the men 
changed from trial to trial, with each model serving 
four times as the “odd person out” for his or her 
sex. Half of the subjects watched a tape in which
the items preferred by the three male models were 
identical to those preferred by all the men in the 
first tape used in the high within-sex consensus con- 
dition; the remaining subjects saw a tape in which 
the items preferred by the three male models were 
those that had been preferred by all the men in the 
second tape used in the high within-sex consensus 
condition. 

In the third modeling condition, or low (50%) 
within-sex consensus modeling condition, on each trial 
two of the men and two of the women chose one 
item, whereas the remaining models chose the other 
item. Again, two tapes were used, with half the sub- 
jects viewing each. In the first tape, two men and 
two women were randomly drawn, for each trial, to 
make the same item choices as those made by all the 
men for that trial on the first tape used in the high 
within-sex consensus condition. The remaining models 
chose the other item. In the second tape, the two men 
and the two women who had been drawn for a given 
trial made the choice for that trial that had been 
made by all the male models in the second tape of 
the high within-sex consensus condition. 
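
The three tape designs just described can be summarized in a short Python sketch. It is a hypothetical reconstruction for illustration only: the model labels, the 0/1 item coding, the function name `build_tape`, and the rule that the "odd person out" rotates in a fixed order (and that the odd man and odd woman are paired on a trial) are assumptions, not details reported in the article.

```python
import random

MEN = ["M1", "M2", "M3", "M4"]
WOMEN = ["W1", "W2", "W3", "W4"]
N_TRIALS = 16


def build_tape(consensus, target_items, rng):
    """Return one dict per trial mapping each model to the item chosen (0 or 1).

    target_items[t] is the item chosen by all men on trial t of the reference
    high-consensus tape. consensus is "high", "moderate", or "low".
    """
    tape = []
    for t, target in enumerate(target_items):
        other = 1 - target
        if consensus == "high":
            # All four men choose the target item; all four women the other.
            men_on_target, women_on_target = list(MEN), []
        elif consensus == "moderate":
            # Assumed rotation: each model is the odd person out four times.
            odd_man, odd_woman = MEN[t % 4], WOMEN[t % 4]
            men_on_target = [m for m in MEN if m != odd_man]
            women_on_target = [odd_woman]
        elif consensus == "low":
            # Two men and two women are drawn at random for each trial.
            men_on_target = rng.sample(MEN, 2)
            women_on_target = rng.sample(WOMEN, 2)
        else:
            raise ValueError(f"unknown consensus level: {consensus}")
        on_target = set(men_on_target) | set(women_on_target)
        tape.append({m: (target if m in on_target else other) for m in MEN + WOMEN})
    return tape


rng = random.Random(0)
# First tape: the men's item is decided randomly on each trial.
tape1_targets = [rng.randint(0, 1) for _ in range(N_TRIALS)]
# Second tape: the sexes' choices are reversed to counterbalance.
tape2_targets = [1 - item for item in tape1_targets]

moderate_tape = build_tape("moderate", tape1_targets, rng)
```

In every condition four models endorse each item on every trial, so only the split by sex (4/0, 3/1, or 2/2) differs between conditions.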

The remaining 12 boys and 12 girls were assigned 
to a no-model condition and proceeded directly to 
the imitation test after arrival at the experimental 
room. Children in the modeling groups proceeded to 
the imitation test immediately after the models had 
made their choices for the 16th item pair and the 
video equipment had been shut off. 

In the test for imitative performance, children were 
shown the 16 slides depicting the item pairs and were 
told to indicate for each slide (verbally as well as by 
using a pointer) which item they preferred. 

Subjects in the three modeling conditions were then
given a recall test. The experimenter once again dis-
played the 16 slides, pointed to one of the two items
for each slide, and instructed the child to tell if more
of the men had liked that item, if more of the
women had liked it, or if an equal number of men
and women had liked it. Children were promised a
token (exchangeable for small prizes) for each correct
answer. 

Children in all groups were told they had per- 
formed very well and were awarded little prizes that 
were distributed when testing was completed. 


Results 


Imitative performance. An imitation score 
was derived for every child in the study by 
counting the number of the child’s choices 
that matched the choices that had been 
unanimously displayed by the male models in 
the high within-sex consensus modeling condi-
tion. These scores of course could vary from
0 to 16 for a given child. The reasoning be-
hind choosing imitation of the responses dis-
played by the male models in the high within-
sex consensus condition as the datum to be
compared across treatments is the following.


According to the hypothesis of the research, 
boys’ imitation of these responses should be 
highest in the high within-sex consensus con- 
dition and girls’ imitation should be lowest in 
this condition. This follows from the predic- 
tion that children’s endorsement of an item 
will be a function of the percentage of same- 
sex models who had displayed the preference. 
In this condition, 100% of the male and 0% 
of the female models endorsed these items. To 
see how changes in the percentage of same- 
and opposite-sex models endorsing an item 
alter imitation, it is necessary to examine
imitation of the same responses but under 
conditions where the percentages of same- and 
opposite-sex models endorsing them have 
changed. This is most easily done by main- 
taining imitation of the male models’ re- 
sponses from the high within-sex consensus 
condition as the dependent variable across 
treatment conditions. As we move from ex- 
amining children’s imitation of these re- 
sponses in the high within-sex consensus con- 
dition to the moderate and then the low 
within-sex consensus conditions, boys’ scores 
should steadily decrease, and girls’ should 
steadily increase. This is because the percent- 
age of male models endorsing the preferences 
exhibited by the male models in the high 
within-sex consensus condition decreases as 
one moves to the moderate and then the low 
conditions, whereas the percentage of female
models endorsing these responses increases as 
we move in this direction. 

Recall that within each of the three model- 
ing conditions, two modeling tapes were used 
(in order to counterbalance the models’ 
choices), with half the children in each condi- 
tion viewing one tape and the other half 
seeing the other tape. This affected computa- 
tion of children’s imitation scores in two ways. 
First, imitation scores for subjects in each 
modeling condition who viewed the first 
modeling tape for their condition (described 
in the method section) were computed by 
counting the number of the children’s choices 
that matched the choices unanimously made 
by the male models in the first modeling tape 
of the high within-sex consensus modeling 
condition; imitation scores for subjects in 
each condition who viewed the second tape 
for their condition were computed by count- 
ing the children’s choices that matched the 
choices unanimously made by the male models 
in the second modeling tape of the high 
within-sex consensus modeling condition. Sec- 
ond, imitation scores for subjects in the no- 
model group were calculated as follows: For 
a random half of the subjects, scores were 
taken as the number of choices that matched 
those of the male models in the first modeling tape
of the high within-sex consensus modeling 
condition; for the remaining no-model sub- 
jects, scores were taken as the number of 
choices that matched those of the male models 
in the second modeling tape of the high 
within-sex consensus modeling condition. 
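
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "A"/"B" coding of the two items in each pair, the particular choice sequences, and the function names are hypothetical, and the second tape is assumed to show exactly the reversed choices of the first (the text says only that the two tapes counterbalanced the models' choices).

```python
from typing import Sequence

N_ITEM_PAIRS = 16  # number of two-choice item pairs shown to each child


def imitation_score(child_choices: Sequence[str],
                    male_model_choices: Sequence[str]) -> int:
    """Count the child's choices that match the choices made unanimously by
    the male models in the high within-sex consensus tape."""
    if len(child_choices) != len(male_model_choices):
        raise ValueError("expected one choice per item pair")
    return sum(c == m for c, m in zip(child_choices, male_model_choices))


def female_matching_score(score: int, n_pairs: int = N_ITEM_PAIRS) -> int:
    """In a two-item forced choice, every non-match with the male models'
    unanimous choice is a match with the female models' unanimous choice."""
    return n_pairs - score


# Hypothetical coding: each choice is "A" or "B" (not from the article).
tape1_male_choices = ["A", "B"] * 8
# Assumption: the counterbalanced second tape reverses every choice.
tape2_male_choices = ["B" if c == "A" else "A" for c in tape1_male_choices]

# A child is scored against the reference tape that matches the tape viewed.
child_choices = ["A"] * N_ITEM_PAIRS
score_if_tape1 = imitation_score(child_choices, tape1_male_choices)
score_if_tape2 = imitation_score(child_choices, tape2_male_choices)
```

The complement function reflects the arithmetic used later in the text, where a mean of 2.8 matches with the male models corresponds to 16 - 2.8 = 13.2 matches with the female models.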

The mean imitation scores are presented 
separately for boys and girls for each treat- 
ment condition in Table 1. It may be seen 
that the data lend strong support to the 
hypothesis that children’s imitation of a response
increases as the percentage of same-sex
models displaying the response increases.
It is interesting to note that essentially iden- 
tical results were achieved for girls. The fact 
that girls imitated, on the average, only 2.8 
of the males’ responses in the high within-sex 
consensus condition of course means that they 
were imitating 13.2 of the choices that 100% 
of the female models had made. As the percentage
of female models endorsing these responses
declined, so too did the girls’ imitation
of them. It can be seen, then, that the
imitative performance of both boys and girls
is strongly affected by changes in the frequency
with which same- and opposite-sex
models display the behavior.

Table 1
Imitative Performance Means of Boys and Girls in Each Treatment Condition (Experiment 1)

                     Treatment condition

         High (100%)   Moderate (75%)   Low (50%)
         within-sex    within-sex       within-sex    No-model
Child    consensus     consensus        consensus     control

Boys     13.9          11.6             [illegible]   [illegible]
Girls    2.8           5.8              8.5           [illegible]

[One further mean, 8.7, is legible in the source, but its cell cannot be determined; the significance subscripts are not legible.]

Note. For all treatment conditions, a subject's imitation score was the number of his or her preferences that
matched those made by the male models in the high within-sex consensus condition. A subject's score could
range from 0 to 16. Each mean is based on scores of 12 subjects. Cells not sharing a common subscript differ
significantly from each other (p < .05) by a two-tailed t test.

An analysis of variance was performed on 
the means in Table 1, with sex of child and 
treatment condition as factors. There was a 
strong main effect of sex of child, F(1, 88) =
162.87, p < .001, as well as a significant inter- 
action of sex of child and treatment condition, 
F(3, 88) = 58.50, p < .001. The sex main 
effect is, of course, due to the fact that imita- 
tion scores were determined by children’s 
imitation of preferences given by the male 
models in the high within-sex consensus con- 
dition. If imitation of the preferences given 
by the female models in this condition had 
arbitrarily been chosen as the dependent 
variable, then of course a highly significant 
main effect of sex in the opposite direction
would have been obtained. The significant
interaction effect and its pattern clearly
substantiate the conclusions reached above.
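
As a quick consistency check on the degrees of freedom reported for this analysis, the following sketch computes them for a fully between-subjects factorial design with equal cell sizes. The function name and the returned dictionary layout are illustrative choices, not part of the original report; the design sizes (2 sexes, 4 treatment conditions, 12 children per cell) are taken from the text and the note to Table 1.

```python
from math import prod


def factorial_anova_df(levels, n_per_cell):
    """Degrees of freedom for a between-subjects factorial ANOVA with equal
    cell sizes. `levels` gives the number of levels of each factor."""
    cells = prod(levels)
    n_total = cells * n_per_cell
    return {
        "n_total": n_total,
        "main_effects": [k - 1 for k in levels],
        "highest_order_interaction": prod(k - 1 for k in levels),
        "error": n_total - cells,
    }


# Experiment 1 imitation analysis: Sex of Child (2) x Treatment Condition (4),
# with 12 children per cell.
imitation_df = factorial_anova_df(levels=(2, 4), n_per_cell=12)

# Experiment 1 recall analysis: Sex of Subject (2) x Modeling Condition (3),
# with 12 children per cell (the no-model group took no recall test).
recall_df = factorial_anova_df(levels=(2, 3), n_per_cell=12)
```

Under these assumptions the error term has 88 degrees of freedom and the interaction has 3, which agrees with the reported F(1, 88) and F(3, 88); the recall design gives 66 error degrees of freedom, in line with the F(2, 66) reported for recall.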

A final comment on the performance results
is in order. When children skipped the
modeling phase and therefore had no information
about the sex-appropriateness of the
responses, their imitation of the responses
was not markedly different from when they
observed equal numbers of male and female
models display them (as in the low within-sex
consensus condition). In fact, within each
of the two rows of Table 1, all pairwise comparisons
between means are significant (by
two-tailed t tests) at p < .05, except for the
comparison between the mean of the low
within-sex condition and that of the no-model
control condition.

Recall. Children in each modeling condition
were asked, for each item choice displayed
by the male models in the high within-sex
consensus condition, whether more of the
men, more of the women, or equal numbers of
men and women had chosen it. The number of
correct answers the children gave was analyzed
in a 2 × 3 analysis of variance with
sex of subject and modeling treatment condi- 
tion as factors. The only significant effect was 
the main effect of treatment condition, F(2, 
66) = 7.78, p < .002. Mean correct responses 
for the high, moderate, and low within-sex 
consensus conditions, respectively, were 14.4, 
12.6, and 11.9. Clearly, the more consensus in 
choices there is among models of a given sex, 
the better able children are to remember 
whether and how the behavior is sex-typed. 
Boys and girls do not seem to differ in their 
propensities to do this. 


Discussion 


The research clearly supports our revised 
version of the same-sex imitation hypothesis. 
It is quite clear that when children observe
same-sex models as a group exhibiting a response
that diverges from the response made
collectively by opposite-sex models under the
same circumstances, the children are far more
likely to imitate the response made by same-sex
models. Obviously, the research restores
same-sex imitation as a plausible mechanism
of sex role development and reverses the Maccoby
and Jacklin (1974) decision. However,
the study suggests that the same-sex model 
for a child’s imitation is the child’s abstracted 
definition of appropriate behavior for the 
child’s sex based on the child’s observation of 
multiple examples of each sex behaving in a 
particular situation. 


Experiment 2 


Previous research on the same-sex imitation 
hypothesis has been guided by the assumption 
that children develop strongly generalized
tendencies to imitate same-sex models. Recent
reformulations of social learning theory (Bandura,
1977; Bandura & Barab, 1971; Mischel,
1968, 1973), however, stress the discriminating
nature of children’s selection of responses
and models from among the vast array available
to them. The negative results of research
on the same-sex imitation hypothesis suggest
that children do not automatically assume
that a strange same-sex adult is a more appropriate
role model for them than an opposite-sex
model. Children probably learn
quite early that individual differences exist
among male and female models in terms of
their masculinity and femininity and thus
their suitability as models. Children should
imitate a model when they believe the model’s 
behavior is appropriate for them, too, but in 
order to conclude that a same-sex model is an 
especially reliable guide for their own be- 
havior, the children must have some fairly 
unambiguous indication that the model is be- 
having typically for his or her sex (and is 
therefore an appropriate model). Experiment 
2 tested this possibility. 


Method 


Overview. In an initial phase, children observed 
8 adult models, 4 male and 4 female, indicating their 
preferences on a series of two-choice preference tasks.
On each trial, 3 of the men and 1 woman chose one
item, whereas the other man and the remaining
women chose the other item. Across the trials, it was
always the same man who chose as the 3 women did,
and it was always the same woman who chose as
the 3 men did. Thus, it was expected that over the
trials children would learn that 3 of the men consistently
concurred in their choices and thus were
behaving sex-appropriately, whereas 1 of the men
consistently chose with the majority of the women
and thus was behaving sex-inappropriately. Similarly,
children were expected to learn that the behavior
of 3 of the women was sex-appropriate but
that that of the fourth was sex-inappropriate. Some 
children (assigned to a no-premodeling control 
group) skipped this first phase and proceeded straight 
to the second phase. 

In a second phase, children viewed 1 of the 8 
models making choices alone on a new series of 
preference trials. For half of the children who had 
participated in the first phase, this model was one 
of those who had behaved appropriately for his or 
her sex during the first phase; for the other half of 
the children who had participated in the first phase, 
the model was one who had behaved sex-inappropri- 
ately in the first phase. Children who skipped the 
first phase (no-premodeling control subjects) also 
saw one of the models from the first phase, but of 
course they had no idea of the degree to which the 
model’s behavior had previously agreed or disagreed 
with that of other members of his or her sex. Half 
of the children in each of these three groups saw a 
male model and the other half a female model. 

In a third phase, children were tested for imita- 
tion of the preferences the single model had dis- 
played in the second phase. It was predicted, first, 
that same-sex imitation effects would not occur for 
children assigned to the no-premodeling control group 
because these children (as in most previous studies) 
would not know whether the model was behaving 
appropriately or inappropriately for his or her sex. 
Second, it was predicted that same-sex imitation 
effects would be extremely powerful among children 
exposed to a same-sex model whose behavior the 
children knew had a history of concurrence with 
other same-sex models. 

Participants and design. Subjects were 84 8- 
year-olds, half boys and half girls, enrolled at a 
school in a middle-class suburb of Brisbane. Six
boys and six girls were assigned to cells in a 2 × 3 × 2
design involving Sex of Child × Sex-Appropriateness
of the Single Model Seen in Phase 2
(appropriate, inappropriate, or no-premodeling control)
× Sex of Model. An additional six boys and
six girls were assigned to a no-model control condition.
These children skipped the first two phases
(that is, both modeling phases) but were asked for 
their own preferences during the third phase just 
as the other children were. Children had an equal 
chance of being assigned to any of the experimen- 
tal design groups or to the no-model group, which 
were run concurrently in a randomized blocks pro- 
cedure. The models and the female experimenter 
were the same as in the first experiment. 
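
The allocation just described can be illustrated with a short sketch. The article states only that groups were run concurrently in a randomized blocks procedure; the specific scheme below (blocks of seven same-sex children, one per condition, shuffled within each block) is an assumption made for illustration, as are the names and the fixed random seed.

```python
import random
from collections import Counter

# Six modeling cells per sex (3 sex-appropriateness levels x 2 model sexes)
# plus the no-model control group.
CONDITIONS = [
    (appropriateness, model_sex)
    for appropriateness in ("appropriate", "inappropriate", "no-premodeling")
    for model_sex in ("male", "female")
] + [("no-model", None)]


def randomized_blocks(n_blocks, conditions, rng):
    """Assign children block by block: each block contains every condition
    exactly once, in a freshly shuffled order (assumed procedure)."""
    assignments = []
    for _ in range(n_blocks):
        block = list(conditions)
        rng.shuffle(block)
        assignments.extend(block)
    return assignments


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
plan = {sex: randomized_blocks(6, CONDITIONS, rng) for sex in ("boys", "girls")}

children_per_sex = {sex: len(order) for sex, order in plan.items()}
total_children = sum(children_per_sex.values())
cell_sizes = {sex: Counter(order) for sex, order in plan.items()}
```

With six blocks per sex, every condition receives six boys and six girls, giving 72 children in the 2 × 3 × 2 design plus 12 no-model controls, which matches the 84 participants reported.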

Stimuli. Stimuli used to solicit models’ prefer- 
ences during the first, multiple-modeling phase were 
different from but similar to those used in the 
first experiment. Again, there were 16 item pairs. 
The same set of 16 item pairs was used during the 
second, single-modeling phase as had been used in 
Experiment 1. Models in both phases were shown 
making choices from the actual items on a black- 
and-white video monitor. Children were asked for 
their own preferences in the third phase and were
shown the 16 color slides used in Experiment 1. 

Procedure. Subjects were tested individually in 
a spare room at the school. Subjects participating 
in the first phase (multiple modeling) viewed the 
eight models making their choices, as in Experi- 
ment 1, on the video monitor. Three of the men 
always chose one item and three of the women 
always chose the other item; the remaining man 
and woman (inappropriate models) always chose as 
the majority of the opposite-sex models did. It 
should be clear that this treatment was identical 
to that received by subjects in the moderate within-sex
consensus condition of Experiment 1, with the
exception that in the present experiment the identity 
of the odd model out for his or her sex remained 
constant across trials. No-premodeling control chil- 
dren and no-model control children skipped this 
phase. 

Children who had participated in the first phase 
or who were assigned to the no-premodeling con- 
trol group participated in a second phase (single 
modeling). Children saw one of the male or one 
of the female models from the multiple modeling 
video tape indicate his or her preferences on a series 
of 16 new item choices. This modeling sequence was 
also presented on videotape. For children who had 
participated in the first, multiple-modeling phase, 
this model was one whom they had previously seen 
behave either appropriately or inappropriately for 
his or her sex. Children in the no-premodeling control
condition observed one of the same models that
children who had participated in the multiple- 
modeling phase saw, but as noted, they had no 
basis for knowing whether the model had a history 
of behaving appropriately or inappropriately for 
his or her sex. 

To minimize contamination of the results by
[...] attributes of the particular model that
children viewed during this second, single-modeling
phase, different models served as appropriate and
inappropriate models for different groups of children.
To have eight models a [...] number of model- [...]
the following proce- [...] -modeling tapes were [...]
three appropriate models in the tape they had seen).
For children assigned to the no-premodeling control
condition, half viewed one of the two male models 
who had served as either an appropriate or an in- 
appropriate model for children in the other con- 
ditions, and the other half viewed one of the two 
female models who had served in one of these 
roles for the other children. These procedures 
equated the frequency across conditions with which 
children viewed a particular model in the second 
phase as well as permitted more than one of the
male and one of the female models to play the sex- 
inappropriate role. 

In this study all the models made the same choices 
in the second phase. Extra tapes depicting the mod- 
els making opposite choices were not used because 
examination of the data from Experiment 1 indicated
that similar results were obtained regardless
of the models’ actual choices. The model's choices
in the present experiment were the same as the 
choices seen by the children who saw the first tape 
in the high within-sex consensus modeling condition 
of the last experiment. 

All children, including those assigned to a no- 
model control condition, were then shown the 16 
slides depicting the item pairs used in the second 
phase and were told to give their own preferences, 
as in Experiment 1.

Children in the modeling conditions were asked
to recall the choices the model had made in the
second phase and were promised rewards for doing
so. Finally, children were asked to respond on 7-point
scales to the questions, “How much do you
like (the model)?”, “How much are you like (the
model)?”, “Do you think that most people would
like (the model)?”, and “How good is (the model)
at making choices?” Children received small prizes
for participating and were returned to the class- 
room. 


Results 


Imitative performance. A 2 × 3 × 2 analysis
of variance with sex of child, sex-appropriateness
of the model, and sex of model as
factors was performed on imitation scores of
all the children except those assigned to the
no-model control condition. Significant effects
included sex of subject, F(1, 60) = 9.32,
p < .005; sex-appropriateness of model, F(2, 60)
= 19.55, p < .001; the interaction of sex of
subject and sex of model, F(1, 60) = 9.92,
p < .005; and sex of model with sex-appropriateness
of model, F(2, 60) = 3.27,
p < .05. However, the three-way interaction
among all the variables was also highly reliable,
F(2, 60) = 41.15, p < .001. The
means for this analysis are given in Table 2.
The least significant difference required for
any two means in this table to be reliably
different from one another at p < .05 by a
two-tailed t test is 2.62.

Table 2
Imitative Performance Means for Interaction of Sex of Subject, Sex-Appropriateness
of Model, and Sex of Model (Experiment 2)

                             Sex-appropriateness of model

Subject and model    Appropriate    Inappropriate    No-premodeling control

Boys
  Male model         14.7_f         4.3_a            13.2_def
  Female model       3.5_a          9.3_bc           12.0_de
Girls
  Male model         8.2_b          11.2_cd          12.7_def
  Female model       14.3_ef        8.3_b            12.8_def

Note. Each mean is based on six subjects’ scores. Cells not sharing the same subscript are significantly
different from one another (p < .05) by a two-tailed t test.


The results confirm the hypotheses. First, 
examine the data for children assigned to 
the no-premodeling control condition by 
glancing down the third column in Table 2. 
None of the means in the column is signifi- 
cantly different from any other mean in the 
column. This substantiates the prediction
that when children do not know how the 
behavior of a model compares with that of 
other members of the model’s sex, the same- 
sex imitation hypothesis is not confirmed. 
Also note that the means in this column are 
all relatively high. This suggests the possi- 
bility of a ceiling effect. When children see 
only one model displaying item preferences 
in an experimental setting and they do not 
know whether the model is a “good” or
“bad” example for his or her sex, they may 
assume that the model—regardless of sex— 
is likely to be a relatively safe example to 
follow in that setting. That the high scores 
in the no-premodeling groups are in fact due 
to imitation is apparent from the fact that 
children assigned to the no-model control 
condition made choices that matched the 
models’ at a substantially and significantly 
(p < .05) lower rate (M for boys in no- 
model group = 8.0; M for girls = 8.2). 
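
The column-wise comparisons discussed here can be reproduced from the cell means and the stated least significant difference. The sketch below is illustrative only: the means are as read from Table 2 in this copy of the article, the data layout and function names are hypothetical, and the criterion is applied only to the twelve cells that entered the analysis of variance (not to the no-model control means, for which the text reports separate tests).

```python
from itertools import combinations

LSD = 2.62  # least significant difference stated in the text (p < .05, two-tailed)

# Cell means keyed by (child sex, model sex), one dict per column of Table 2.
TABLE_2 = {
    "appropriate": {
        ("boys", "male"): 14.7, ("boys", "female"): 3.5,
        ("girls", "male"): 8.2, ("girls", "female"): 14.3,
    },
    "inappropriate": {
        ("boys", "male"): 4.3, ("boys", "female"): 9.3,
        ("girls", "male"): 11.2, ("girls", "female"): 8.3,
    },
    "no-premodeling": {
        ("boys", "male"): 13.2, ("boys", "female"): 12.0,
        ("girls", "male"): 12.7, ("girls", "female"): 12.8,
    },
}


def differs(mean_a, mean_b, lsd=LSD):
    """True when two cell means differ by more than the LSD."""
    return abs(mean_a - mean_b) > lsd


def significant_pairs(cells):
    """All pairs of cells within one column whose means differ reliably."""
    return [(a, b) for a, b in combinations(cells, 2)
            if differs(cells[a], cells[b])]


no_premodeling_pairs = significant_pairs(TABLE_2["no-premodeling"])
appropriate_pairs = significant_pairs(TABLE_2["appropriate"])
```

On these values no pair in the no-premodeling column exceeds the criterion, whereas in the appropriate column every pair does except the two same-sex pairings (boys with a male model, 14.7, and girls with a female model, 14.3).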

The second hypothesis was that same-sex 
imitation effects would occur when the chil- 
dren were exposed to a model whose be- 
havior, they had learned, was typical of 
others of the model’s sex. It is evident, by
glancing down the first column of Table 2,
that this hypothesis received strong support, 
especially for boys. Note how boys who saw 
a sex-appropriate female model actually 
chose fewer of the model’s responses than 
they did when they had not seen any model 
perform at all (p< .05). 

Two other features of the pattern of means 
are noteworthy. First, glance down the 
middle column of means in Table 2, where 
data are presented for children’s imitation of 
a model whom they had previously seen be- 
have sex-inappropriately. Apparently, when 
children know an opposite-sex model is a 
more reliable indicator of behavior appro- 
priate to the child’s own sex than a same-sex 
model, they are more likely to imitate the 
opposite-sex model. Interestingly, cross-sex 
imitation is really evident only among girls, 
because the imitation scores displayed by 
girls exposed to an inappropriate male model 
(M=11.2) were significantly (p< .05) 
higher than the imitation scores of no-model 
control girls (M = 8.2). In contrast, boys’ 
imitation scores were not significantly higher 
with an inappropriate female (M = 9.3) than 
in the no-model control condition (M = 8.0). 
Boys may ignore rather than imitate females 
who behave masculinely. 

A second point is that boys more actively 
inhibited imitation of sex-inappropriate be- 
havior than girls. Imitation scores of boys 
who saw either an inappropriate male model 
(M = 4.3) or an appropriate female model 
(M = 3.5) were lower (p < .05 in each case)
than those of no-model control boys. In contrast,
girls did not inhibit imitation of male-appropriate
behavior, when it was displayed
by either a male (M = 8.2) or a female (M 
= 8.3) model. 

Recall. Correctly recalled choices of the 
model were subjected to a similar 2 × 3 × 2
analysis of variance. No effect attained sig- 
nificance (Grand M = 15.1). 

Questionnaire data. Children’s responses 
to the four postexperimental questions were 
subjected to similar three-way analyses of 
variance. No significant effects were obtained 
on the first question, which asked children 
to indicate their attraction to the model. 
For the second question, which asked chil- 
dren to rate their similarity to the model, 
several effects were significant, but they were 
all subsumed by a significant highest order 
interaction, F(2, 60) = 14.03, p < .001. The 
means indicated that when children saw a 
sex-appropriate model, there was stronger
perceived similarity to a same-sex model for 
both boys and girls; when children saw a 
sex-inappropriate model, boys actually ex- 
pressed greater similarity to the female model 
than the male model (no parallel result oc- 
curred for girls); among children in the no- 
premodeling conditions, there were no differences
in perceived similarity to the model.
It is, of course, important to remember that
these data were collected after the subjects’
imitation test and may thus have been influenced
by their imitative performance. On
the third question, which asked children to
tell whether most people would like the
model, there were no significant effects. On
the final question, which asked children to
indicate how good they thought the model
was at making choices, there was a significant
three-way interaction, F(2, 60) = 3.97,
p < .05. It appeared that the only major contributor
to this interaction was the fact that
boys in the no-premodeling control condition
perceived the female model as more competent
than the male at making choices. Why
this result emerged is unknown. However, in
sum it appears that neither perceptions of the
model’s competence nor [...]


General Discussion


The major contribution of the present re- 
search is to reinstate same-sex imitation as 
a viable mechanism of sex role development. 
Maccoby and Jacklin’s (1974) conclusion 
that differential imitation of same-sex models 
contributes negligibly to sex role develop- 
ment was based on the results of studies that 
lacked an adequately developed conceptual 
framework of how imitation contributes to 
the social learning of sex differences. The
present article demonstrates quite clearly 
that children discern behaviors appropriate 
to the two sexes by observing differences be- 
tween the sexes in the frequency with which 
they perform various responses in a given 
situation and, furthermore, that children use 
these abstractions concerning sex-appropri- 
ate behaviors as guides to their own per- 
formance in similar situations. 

The research also demonstrates that a 
child’s imitation of a single model is in part 
governed by the degree to which the child 
has come to believe that the model’s be- 
havior is ordinarily appropriate or inappro- 
priate to the child’s sex. In other words, chil- 
dren are most likely to imitate persons whom 
they perceive to be good examples of their 
sex role. It is of interest that some previous 
research has shown that girls are especially
likely to imitate a warm or nurturant model, 
whereas boys are particularly likely to imitate 
a dominant model (Hetherington, 1967). 
This may be in part because children usually
perceive warmth as indicative of femininity
and dominance as reflective of masculinity
(Best, Williams, Cloud, Davis, Robertson,
Edwards, Giles, & Fowles, 1977; Rothbaum,
1977; Williams, Bennett, & Best, 1975).
Clearly, then, children judge the likely appropriateness
of an individual model’s behavior
by recalling how well that person’s
attitudes and behaviors usually match up 
with others of his or her sex. 

According to social learning theory, chil- 
dren acquire (via observational learning) 
responses appropriate to both sexes but pre- 
fer to perform responses exhibited by same- 
sex models. In other words, model sex affects 
performance more than observational learning.
Results of the second study are quite in
line with this proposition. In this experiment,
there were no significant effects of the experimental
variables upon children’s recall
of the model’s behavior, although the same
factors strongly affected children’s imitative
performance. These results mesh with those
of Williams et al. (1975), who found that
boys and girls do not differ in their knowledge
of cultural stereotypes for male and
female behavior which, we could argue, are
primarily learned via observational learning. 
We do not wish to overstate the case, how- 
ever. Several studies do show that children 
sometimes attend more closely to, as well as 
learn more about, the behavior of models con- 
sistent with their own sex role (Grusec & 
Brinker, 1972; Maccoby & Wilson, 1957; 
Maccoby, Wilson, & Burton, 1958; Perry & 
Perry, 1975; Slaby & Frey, 1975). Nadel- 
man (1974) also found that children were 
able to identify correctly the sex-appropri- 
ateness of items appropriate for their own 
sex better than they were able to identify 
the sex-appropriateness of items traditionally 
associated with the opposite sex. Thus, al- 
though the preference for performing same- 
sex behaviors may well be stronger than the 
preference for learning same-sex behavior, it 
is not as though the latter phenomenon is 
nonexistent. 

Our analysis of same-sex imitation in sex 
role development suggests that children must 
master certain cognitive achievements before 
their sex typing can to any significant degree 
be influenced by imitation of like-sex models. 
Before children will encode responses they 
see performed more often by members of a 
given sex as male- or female-appropriate, 
they must realize that human beings are 
divided into males and females. Before they 
will prefer to perform responses coded as 
same-sex-appropriate, they must realize that 
not only do they belong to one of these 
sexes, but they are subject to a similar set
of expectations and reinforcement contingencies
as others of their sex. Children may
learn that other people are divided into males 
and females before they establish their own 
gender identities (Thompson, 1975). Thus 
children may proceed through an initial stage 
in which they acquire some knowledge of 
male- and female-appropriate behavior (via 
observational learning) but, not realizing
they belong to one of these sexes, may fail 
to display a preference for imitating same- 
sex-appropriate behavior. Such a possibility 
should be tested. Once children attain gender 
constancy they do attend more to same-sex 
models (Slaby & Frey, 1975). However, de- 
velopmental research that tracks the emer- 
gence of children’s gender identities and their 
beliefs that they are expected to behave simi- 
larly to members of a particular sex and re- 
lates these variables to children’s imitation 
of same- and cross-sex models is needed. 
Because the present formulation empha- 
sizes that children imitate responses that they 
abstract as appropriate for their sex as a 
group, it clearly assigns less importance to 
children’s imitation of particular individuals 
(e.g., the like-sex parent) than do more tra- 
ditional theories, especially Freudian theory 
(1949) or various other views that hold that 
children achieve sex typing via imitation of 
a warm and/or socially powerful parental 
figure (Hetherington, 1967; Mussen & Dist- 
ler, 1959; Parsons, 1955; Sears, Rau, & 
Alpert, 1965). Although this is true, we do 
not mean to imply that children’s sex typing 
is never influenced by their especially strong 
imitation of “significant others,” for it un- 
doubtedly is. However, we would expect that 
children who initially adopt responses by 
imitating a highly nurturant and/or domi- 
nant same-sex parent would ultimately drop 
the responses from their active repertoires 
if they eventually realize that no other same- 
sex individuals perform the responses but 
that many opposite-sex persons do. 
Although in the first experiment boys and 
girls did not differ in their propensities to 
imitate abstracted same-sex-appropriate be- 
havior, there was clear evidence in the second 
experiment that boys were more concerned 
than girls with matching their behavior to a 
model known to exemplify sex-typical be- 
havior. This concurs with previous findings 
that a variety of socializing agents, including
parents and peers, enforce stronger sanctions
on boys than on girls for conforming
to culturally defined sex-role prescriptions 
(Fagot, 1977; Fling & Manosevitz, 1972) 
and that boys are more likely to conform to 
peer pressure than are girls (Maccoby & 
Jacklin, 1974). It may indeed be the case,
as Lynn (1969) suggests, that boys become 
sex-typed through “masculine role identifi- 
cation” by acquiring and imitating cultural 
stereotypes of male-appropriate behavior 
learned by observing a variety of male figures 
both within and beyond the home. Girls, on
the other hand, are less concerned with conforming
to abstract standards.

Although this research indicates that imi- 
tation of abstracted sex-appropriate behavior 
is a plausible mechanism of sex role develop- 
ment, it is not implied that other factors 
(e.g., genetics, direct elicitation, and shaping 
of sex-typed behavior) are unimportant. The 
present formulation is probably a weak or 
inadequate explanation of early sex differ- 
ences, say, those occurring in the first year 
or two of life. As early as age 1, there are
sex differences in children’s toy preferences, 
though the exact qualities of the toys (e.g., 
softness, faceness, number of moving parts, 
manipulability) responsible for the differ- 
ences have yet to be isolated (Jacklin, Mac- 
coby, & Dick, 1973). It seems unlikely that 
such early differences are due to abstracted 
differences about sex-appropriate behavior
learned from observation of multiple models.
The cognitive capacities required to notice 
that the sexes differ in their frequency of 
performing certain behaviors, to register the 
information in memory as male- or female- 
appropriate, and to learn that one is ex- 
pected to conform to sex-appropriate role 
behavior take time to develop. Obviously 
other factors must be involved in early sex
differences.

One might wish to level the criticism 
against the present research that results are 
attributable to “demand characteristics” of 
the experimental procedures. In particular,
it might be argued that using multiple models
... presence of an ... the models had ... demands on the ...
imitation than ... position to view a performer’s behavior ... just
as aware as the performer of the sex-appro-
priateness of various response options in the
situation. Consequently, it is entirely appro- 
priate to examine imitation of sex-appropri- 
ate and -inappropriate behavior in the lab- 
oratory under circumstances similar to those 
that exist outside of it (i.e., in the presence 
of an observer—here, the experimenter). 
Moreover, the results of a third experiment 
conducted by the authors indicate that same- 
sex imitation does occur even in the absence 
of the experimenter.* 

In the procedures used in the present 
study, endorsement of a choice preferred by 
the majority of one sex of model was con- 
founded with rejection of the choice pre- 
ferred by the majority of opposite-sex mod-
els. To illustrate, consider the models’ choices
in the high within-sex consensus condition 
of the first experiment. The endorsement by 
100% of the male models of a particular item 
was, of course, coupled with implicit re- 
jection of the item unanimously chosen by 
the female models. In any two-item forced- 
choice situation, endorsement of one item 
implies rejection of the other. In the present 
research, it is not possible to determine, for 
example, if a boy’s imitation of responses 
endorsed by male models was a function of 
an active desire to emulate the male models 
or a wish not to display a choice indicated 


*To examine the generality of some of the find- 
ings from the first two experiments reported, the 
authors conducted a third experiment. The design 
was identical to that of Experiment 2 of the present 
report. However, the procedure was altered in three 
major ways. First, peers instead of adults were used
as models. Second, although the first, multiple-model-
ing phase of the study was conducted exactly as in
Experiment 2 (with models indicating their prefer-
ences from among item pairs), the second modeling
phase involved the modeling of novel action and
verbal sequences rather than the modeling of addi-
tional item preferences. Third, the test for imitation
involved leaving the child alone with the objects the
model had used in displaying the action and verbal
responses in the second phase, with instructions to
“see if you can do some interesting things with them”
while the experimenter was gone. Children’s imita-
tion of responses displayed by the model in the
second phase was assessed by a time-sampling tech-
nique while the child was observed through a one-
way vision screen for a 4-minute period. Results


were virtually identical to those obtained in Experi-
ment 2.
by the female models. In real-life settings,
of course, many choice situations do exem-
plify the two-pronged situation facing chil-
dren in our study where one choice was mas-
culine and the other feminine. For theoreti-
cal reasons however, it would be interesting

Telling Lies


Bella M. DePaulo and Robert Rosenthal 
Harvard University 


Men and women (20 each) were videotaped while describing someone they
liked, someone they disliked, someone they were ambivalent about, someone
they were indifferent about, someone they liked as though they disliked him
or her, and someone they disliked as though they liked him or her. Accuracy
at detecting that some deception had occurred was far greater than accuracy at
detecting the true underlying affect, and people who were good at detecting
that deception was occurring were not particularly skilled at reading the speak-
ers’ underlying affects. However, people whose deception attempts were more
easily detected by others also had their underlying affects read more easily.
Speakers whose lies were seen more readily by men also had their lies seen
more readily by women, and observers better able to see the underlying affects
of women were better able to see the underlying affects of men. Skill at lying
successfully was unrelated to skill at catching others in their lies. A histrionic
strategy (hamming) was very effective in deceiving others, and this strategy
was employed more by more Machiavellian people, who also tended to get
caught less often in their lies. Methodological considerations and systematic
programs for future research are discussed.


Studies of skill at detecting lies from 
verbal and/or nonverbal cues usually focus 
on observers’ ability to distinguish truthful 
responses from deceptive ones (e.g., Ekman
& Friesen, 1974; Fay & Middleton, 1941; 
Harrison, Hwalek, Raney, & Fritz, 1978; 
Kraut, 1978; Littlepage & Pineault, 1978; 
Maier & Janzen, 1967; Maier & Thurber, 
1968; Zuckerman, DeFrank, Hall, Larrance, 
& Rosenthal, in press; Krauss, Geller, & Ol- 
son, Note 1). Most published studies of 
human lie detection have found that people
are substantially more accurate than chance 
at distinguishing veracity from mendacity 
(Fay & Middleton, 1941; Feldman, 1976; 
Harrison et al., 1978; Kraut, 1978; Little- 
page & Pineault, 1978; Maier, 1966; Maier 
& Janzen, 1967; Maier & Thurber, 1968; 
Matarazzo, Wiens, Jackson, & Manaugh,
1970; Zuckerman et al., in press; see also 
DePaulo, Zuckerman, & Rosenthal, in press). 

Discerning the degree of deceptiveness of 
a given response, however, is only one of 
several senses in which one might be said 
to be skilled at detecting a lie. In the case 
of a lie that involves the cloaking of a felt 
emotion with a feigned one, for example, 
skill at detecting the lie might be conceptual- 
ized as the ability to see through to the un- 
derlying affect that the deceiver is trying to 
hide. Using Ekman and Friesen’s (1969) 
terms, this latter skill (identifying the con- 
cealed information) will be called leakage 
accuracy, whereas the former (recognizing 
that deception is or is not occurring) will 
be labeled deception accuracy. Although it 
might seem quite plausible that people who 
are especially good at detecting deceptiveness 
will also be particularly skilled at reading
leakage, this empirical question has not yet 
been addressed. 

Two-tiered lies, in which an underlying 
or “true” affect is covered up by a dissimu- 
lated or “sent” affect, come in many varie-
ties. For instance, they may involve the
covering up of a positive emotion with a 
negative one or the covering up of a negative 
emotion with a positive one. In the few 
studies that have examined these kinds of 
lies (Ekman & Friesen, 1969, 1974; Feldman, 
1976), the liars were always attempting to 
send positive cues. In the Ekman and Friesen
(1974) study, for instance, nurses watching 
a very disconcerting film pretended to be 
watching a very pleasant one; in Feldman’s 
study, teachers either liked or disliked a 
student who had performed either well or 
poorly, but in all conditions teachers were 
instructed to give only positive feedback to 
the student. We cannot know from these 
studies whether skill at detecting these sugar- 
coated lies (in which the feigned affect is 
positive) is related to skill at detecting lies 
in which the dissimulated affect is negative. 

Paralleling our lack of knowledge about 
the generality of skills at detecting different 
kinds of lies is a lack of knowledge about 
the generality of skills at decoding different 
kinds of liars. For instance, are people who
are especially skilled at identifying deceptive-
ness in women also particularly successful in
recognizing deceptiveness in men?

The same kinds of questions that are 
asked about people trying to detect lies can 
also be asked about people who are attempt-
ing to spread lies. That is, are those persons
who are especially adept at hiding their at- 
tempts at deception also especially success- 
ful at concealing their true feelings? Are 
people who are especially good at hiding an 
underlying positive affect also particularly 
skilled at hiding an underlying negative af-
fect? Are people who are particularly trans-
parent to women also especially transparent
to men? Finally, are people who are good
liars also good lie-detectors: that is, does it
take one to know one?

To study these and other questions, we
asked subjects to describe several persons
whom they knew: someone they liked, some-
one they disliked, someone they felt ambiva-
lent about, and someone they felt indifferent
about. To elicit deception, we asked them to 
describe the liked person while pretending 
that they really disliked him or her, and we 
also asked them to describe the disliked 
person while pretending that they really liked 
him or her. Thus, the design included under- 
lying positive affect covered by feigned nega- 
tive affect and underlying negative affect 
covered by feigned positive affect. It also 
included “pure” positive affect (liking), 
“pure” negative affect (disliking), and simul-
taneously occurring positive and negative
affect (ambivalence). We asked the same
subjects to return to judge these various 
descriptions. We believe that this paradigm, 
in which the observer is called upon to dis- 
criminate among a wide variety of affects, 
is more valid ecologically than is the scenario 
characteristic of most deception studies, in 
which the subject is asked merely to decide 
whether the speaker is lying or telling the 
truth. 

This design also permitted us to study a 
particular style of lying—one that we call 
hamming. We think that a reasonable opera- 
tional definition of a ham is someone who,
when pretending to like someone he or she
actually dislikes, expresses more liking than 
when describing a person he or she really 
does like. Another kind of ham is one who 
exaggerates dissimulated disliking—that is, 
one who expresses more disliking when pre-
tending to dislike someone he or she actually
likes than when describing someone he or she 
really does dislike. 

We wanted to know several things about 
this style of lying: 

1. How common is this hamming strategy? 

2. Is there really just one kind of ham, 
or might there be at least two? That is, are 
the people who exaggerate liking the same 
people who exaggerate disliking?

3. Does hamming work? That is, do people 
who ham it up get caught in their lies less 
frequently than those who use a less exag- 
gerated style? Conceivably, hamming could 
backfire if an observer got the impression 
that the speaker “doth protest too much.” 

Because Machiavellianism is commonly
thought to be linked to success at lying,
and in one study (Exline, Thibaut, Hickey,
& Gumpert, 1970) has in fact been shown
to be so related, we administered the Mach
Scale (Christie & Geis, 1970) to all of our
subjects.


Method 
Subjects 


Subjects were 40 Harvard summer school stu- 
dents (20 males and 20 females) recruited for a study 
of “person descriptions” and paid for their par-
ticipation.


Procedure 


Subjects were asked to take 1 minute to describe 
each of the following persons: someone they liked, 
someone they disliked, someone they felt ambivalent 
about, and someone they felt indifferent about. 
(Ambivalence was defined as strong feelings of 
both liking and disliking; indifference was defined 
as no strong feelings of liking or disliking.) To 
elicit deception, subjects were also asked to describe 
the person they liked as if they really disliked him 
or her (like-as-though-dislike or LD condition) and 
to describe the person they disliked as if they 
really liked him or her (dislike-as-though-like or 
DL condition). These six descriptions were given in
one of nine different orders (randomly assigned). 
Half of the subjects (10 males and 10 females, ran- 
domly assigned) described males; the others de-
scribed females.

During the sessions, the experimenter remained 
behind a one-way mirror and videotaped the de- 
scriptions. Subjects were aware of this, and also 
knew that the experimenter was not informed of 
the sequence of their descriptions. They were urged 
to try to be very convincing in all of their de-
scriptions.

From these descriptions, two hour-long video- 
tapes were made, composed of the middle 20 sec 
of each description, plus rating pauses. The order 
of appearance of the senders on these tapes was 
randomized. (Only the middle 20 sec were used to 
keep the rating task—described below—to a more 
manageable length.) All subjects returned to judge 
one of these videotapes. Subjects always judged a 
videotape on which they did not appear. Subjects 
saw the videotape they were judging twice: the first 
time, they rated the descriptions on 9-point scales 
of liking, ambivalence, and deception; the second 
time, they rated the descriptions on 9-point scales 
of disliking, discrepancy, and tension.¹


Accuracy Scores 


An accurate decoder of deception would rate the 
deceptive descriptions as very deceptive and the 
pure like and dislike descriptions as not very de-
ceptive at all. Thus, deception accuracy was defined
as a subject’s mean deception rating of the like-as-
though-dislike and dislike-as-though-like descrip-
tions minus their mean deception rating of the pure
like (L) and dislike (D) descriptions. Separate sub-
scores were computed analogously for the LD and
DL descriptions and for the decoding of male and
female speakers.


$$\text{Overall deception accuracy} = \frac{\text{deception ratings of LD} + \text{DL}}{2} - \frac{\text{deception ratings of L} + \text{D}}{2};$$

$$\text{Deception accuracy for LD} = \text{deception ratings of LD} - \text{deception ratings of L};$$

$$\text{Deception accuracy for DL} = \text{deception ratings of DL} - \text{deception ratings of D}.$$
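
As an illustrative sketch only, the decoding scores defined above could be computed for a single judge as shown below. The function name, the dictionary layout, and the sample ratings are assumptions made for the example and are not part of the original procedure.

```python
from statistics import mean


def deception_accuracy(ratings):
    """Compute one judge's deception accuracy scores.

    `ratings` (a hypothetical layout) maps each condition code to the
    judge's 9-point deception ratings for descriptions in that condition:
    "LD" and "DL" are the deceptive descriptions, "L" and "D" the pure ones.
    """
    ld = mean(ratings["LD"])
    dl = mean(ratings["DL"])
    pure_like = mean(ratings["L"])
    pure_dislike = mean(ratings["D"])
    return {
        # Mean rating of the lies minus mean rating of the truthful descriptions.
        "overall": (ld + dl) / 2 - (pure_like + pure_dislike) / 2,
        "LD": ld - pure_like,
        "DL": dl - pure_dislike,
    }


# Made-up ratings for one judge, for illustration only.
example = {"LD": [6, 7, 5], "DL": [5, 6, 7], "L": [2, 3, 4], "D": [3, 3, 3]}
scores = deception_accuracy(example)
```

With these made-up ratings the deceptive descriptions average 6 and the truthful ones average 3, so all three scores equal 3. A score of zero would correspond to no accuracy.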


Accuracy of encoding deception was defined anal- 
ogously. A speaker whose lies were easily detected 
would be rated as very deceptive in the LD and 
DL conditions but not very deceptive at all in 
the liking or disliking conditions. Thus the encod- 
ing deception score was defined as the mean de- 
ception rating (by all judges who rated the speaker) 
of the LD and DL descriptions minus the mean 
deception rating of the pure like and dislike de-
scriptions. Separate subscores were computed analo-
gously for LD and DL and for judgments made by
male and female observers. 

$$\text{Overall deception accuracy} = \frac{\text{rated deception on LD} + \text{DL}}{2} - \frac{\text{rated deception on L} + \text{D}}{2};$$

$$\text{Deception accuracy for LD} = \text{rated deception on LD} - \text{rated deception on L};$$

$$\text{Deception accuracy for DL} = \text{rated deception on DL} - \text{rated deception on D}.$$


A judge more skilled at seeing underlying affects 
(decoding leakage) would rate a speaker pretending 
to like someone she/he actually disliked as disliking 


¹Scale endpoints were labeled as follows: speaker
does not like (dislike) the person at all (1), speaker 
likes (dislikes) the person very much (9); speaker 
is not very ambivalent (deceptive, discrepant, tense) 
(1), speaker is very ambivalent (deceptive, dis- 
crepant, tense) (9). Discrepancy was defined as the 
simultaneous communication of several different 
emotions. 
the person relatively more than liking him or her.
Similarly, a more accurate decoder of leakage would
see relatively more liking than disliking in the LD
descriptions. Thus, LD accuracy was defined as a
subject’s mean liking rating of the LD descriptions
minus his or her mean disliking rating of those 
descriptions; for DL, liking was subtracted from 
disliking. These scores were also computed sepa- 
rately for judgments of male and female speakers. 


$$\text{Leakage accuracy for LD} = \text{liking ratings of LD} - \text{disliking ratings of LD};$$

$$\text{Leakage accuracy for DL} = \text{disliking ratings of DL} - \text{liking ratings of DL}.$$
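
The leakage scores can be sketched in the same way. The helper below is a hypothetical illustration (its name, arguments, and sample ratings are assumptions): it subtracts the mean rating of the feigned affect from the mean rating of the true affect, so a positive score means the judge saw through to the underlying feeling.

```python
from statistics import mean


def leakage_accuracy(liking, disliking, condition):
    """Compute one judge's leakage accuracy for one kind of lie.

    `liking` and `disliking` are the judge's 9-point ratings of the same
    set of deceptive descriptions; `condition` is "LD" (liked person
    described as though disliked) or "DL" (the reverse).
    """
    difference = mean(liking) - mean(disliking)
    if condition == "LD":
        # The true affect is liking, so liking minus disliking.
        return difference
    if condition == "DL":
        # The true affect is disliking, so disliking minus liking.
        return -difference
    raise ValueError("condition must be 'LD' or 'DL'")


# Made-up ratings in which the judge mostly reads the feigned dislike.
ld_score = leakage_accuracy(liking=[3, 4], disliking=[6, 7], condition="LD")
```

Here `ld_score` is negative (3.5 minus 6.5), which indicates that the judge read the feigned affect more strongly than the true one.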


Encoding leakage (or betraying one’s own true 
feelings) was defined as follows: for LD, the mean 
of judges’ disliking ratings was subtracted from 
the mean of their liking ratings; for DL, liking 
was subtracted from disliking. Subscores were com-
puted for judgments made by male and female
judges. 


$$\text{Leakage accuracy for LD} = \text{rated liking on LD} - \text{rated disliking on LD};$$

$$\text{Leakage accuracy for DL} = \text{rated disliking on DL} - \text{rated liking on DL}.$$


Hamming 


A person who hams in the DL condition ex-
presses greater liking of the truly disliked person
(whom he or she is pretending to like) than of the
truly liked person. Thus, ham like was defined as
judges’ mean liking rating of a speaker’s DL de-
scription minus their mean liking rating of that
speaker’s true or pure like description. Analogously,
“ham dislike” was defined as judges’ mean dislike
rating of a speaker’s LD description minus their
mean dislike rating of that speaker’s true or pure
dislike description.


$$\text{Ham like} = \text{rated liking on DL} - \text{rated liking on L};$$

$$\text{Ham dislike} = \text{rated disliking on LD} - \text{rated disliking on D}.$$
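
A minimal sketch of the two hamming scores follows, assuming each argument holds the ratings that all judges gave to one speaker's description in the named condition. The function and argument names are hypothetical, and the sample ratings are invented.

```python
from statistics import mean


def ham_scores(liking_dl, liking_l, disliking_ld, disliking_d):
    """Compute ham like and ham dislike for one speaker.

    Each argument is a list of judges' 9-point ratings of that speaker:
    liking ratings of the DL and pure like descriptions, and disliking
    ratings of the LD and pure dislike descriptions.
    """
    return {
        # Positive when feigned liking is rated above genuine liking.
        "ham_like": mean(liking_dl) - mean(liking_l),
        # Positive when feigned disliking is rated above genuine disliking.
        "ham_dislike": mean(disliking_ld) - mean(disliking_d),
    }


# Invented ratings: this speaker exaggerates feigned liking only.
speaker = ham_scores(
    liking_dl=[8, 9], liking_l=[6, 7], disliking_ld=[4, 5], disliking_d=[6, 7]
)
```

In this invented case ham like is 2.0 and ham dislike is -2.0, so the speaker would count as a ham only when feigning liking.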


Results 
Accuracy 


To examine the relative accuracy of com-
munication of different kinds of lies (leakage,
deception, LD, DL), a 2 (Leakage/Decep-
tion) × 2 (LD/DL) analysis of variance
(ANOVA) was computed, with repeated mea-
sures on both factors and accuracy scores
as the dependent variable.

The main effect of leakage versus deception, 
F(1, 36) = 132.04, p < .001; d = 3.32,
showed that accuracy at recognizing when 
deception was occurring (deception accuracy) 
was substantially greater than accuracy at 
reading a speaker’s underlying affect (leakage 
accuracy). Or, from the point of view of the 
speaker, the tendency to have one’s attempts 
at deception accurately diagnosed was sig- 
nificantly greater than the tendency to have 
one’s true underlying affect correctly identi- 
fied. The main effect for LD versus DL and 
the interaction of LD/DL × Leakage/Decep-
tion were both very small, showing that accu-
racy at communicating LD did not differ sig-
nificantly from accuracy at communicating
DL, and that the magnitude of the accuracy
difference between leakage and deception did
not vary significantly according to the par-
ticular affect that was feigned or concealed.

The overall level of deception accuracy dif- 
fered significantly from zero, which is the 
value that would be expected under the null 
hypothesis of no accuracy, t(38) = 5.06, p <
.001; d = 1.64. Thus, observers were signif-
icantly more accurate than chance at detect- 
ing when deception was occurring. Accuracy 
at reading the underlying affect was signif- 
icantly worse than chance, however, t(38) = 
6.04, p < .001; d = 1.96. This means that on 
the whole, observers tended to read the affect 
that speakers intended to communicate rather 
than the affect that speakers really felt. 
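
The article does not state how d was computed. One widely used conversion is d = 2t/√df, with t = √F when the F ratio has 1 numerator degree of freedom. The sketch below assumes that conversion (the function names are hypothetical); it is offered only as a plausible reading, although it does reproduce the d values reported for the two t tests above.

```python
from math import sqrt


def d_from_t(t, df):
    """Estimate the effect size d from a t statistic, assuming d = 2t / sqrt(df)."""
    return 2 * t / sqrt(df)


def d_from_f(f, df_error):
    """Estimate d from an F ratio with 1 numerator df, using t = sqrt(F)."""
    return d_from_t(sqrt(f), df_error)


# The two t tests reported above, each with 38 degrees of freedom.
d_deception = round(d_from_t(5.06, 38), 2)
d_leakage = round(d_from_t(6.04, 38), 2)
```

Under this assumed conversion, `d_deception` and `d_leakage` come out at 1.64 and 1.96, matching the reported values.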


Telling Different Kinds of Lies 


The correlation between skill at detecting 
deception and skill at reading leakage was 
.05 for LD and −.11 for DL.³ Both of these
correlations were far from significant. Thus 
people who are especially good at knowing 




²The index d is an estimate of the size of the
effect expressed in standard deviation units (Cohen,
1977). As a rule of thumb, Cohen suggests regard-
ing ds of .20, .50, and .80 as small, medium, and
large effects, respectively.

³ns varied from 38 to 40 because there was one
missing leakage subscore and one missing deception
subscore.


when deception is occurring are not neces- 
sarily particularly skilled at knowing the 
speaker’s true affective state. 

In terms of encoding, however, the relation- 
ship between leakage and deception is strongly 
positive (for LD, Pearson’s r = .50, p < .01; 
for DL, r = .54, p < .01). Thus people whose 
deception attempts are readily detected tend
also to leak their true underlying feelings.
The ability to recognize deception when the
speaker is hiding positive affect (LD) is not
significantly related to the ability to recognize
deception when the speaker is hiding negative
affect (DL; r = −.18). However, the ability
to read a leaked positive cue is very strongly
related to the ability to spot a leaked nega-
tive cue (r = .75, p < .001). For encoding,
the tendency to get caught telling an LD lie is 
correlated only .13 (ns) with the tendency
to get caught telling a DL lie. The analogous 
leakage correlation is similarly small and non- 
significant (.19), showing that people who 
leak their underlying positive feelings don’t 
necessarily leak their underlying negative 
feelings. 


Different Kinds of Liars and Lie Detectors 


Speakers who get caught lying by women 
also tend to get caught lying by men (for LD, 
r = .62, p < .001; for DL, r = .54, p <
.001), and speakers whose leaked cues are
picked up by women also have their leaked 
cues picked up by men (for LD, r = .76, p <
.001; for DL, r = .77, p < .001). Similarly,
observers who are skilled at reading the leak-
ages of women also tend to be skilled at read-
ing the leakages of men (for LD, r = 12, p <
.001; for DL, r = .78, p < .001). However,
skill at knowing when women are lying is not 
significantly correlated with skill at knowing 
when men are lying (for LD, r = .06; for DL, 
r= nye 


Skill at Telling Lies and Skill at Telling Lies:
Are They Related?


The four different kinds of ability to tell 
(detect) lies (detecting leaked liking and 
leaked disliking; detecting LD and DL lies)
were correlated with the four different kinds
of ability to tell (get away with) lies. The 16
correlations ranged from .20 to —.24, and all 
were statistically insignificant. The median 
correlation was exactly zero. Thus, the ability 
to get away with one’s own lies seems to be 
completely unrelated to the ability to catch 
other people in their lies. 
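
The 4 × 4 set of correlations described here can be sketched as follows. This is an assumed reconstruction of the bookkeeping, not the authors' analysis code: the dictionary layout and function names are hypothetical, and each list is taken to hold one score per subject in the same subject order.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cross_correlations(decoding, encoding):
    """Correlate every decoding score with every encoding score.

    `decoding` and `encoding` map a score name (for example "deception_LD")
    to a list of per-subject scores. With four names in each dictionary the
    result holds the 16 correlations described in the text.
    """
    return {
        (d_name, e_name): pearson_r(d_scores, e_scores)
        for d_name, d_scores in decoding.items()
        for e_name, e_scores in encoding.items()
    }


def median_correlation(correlations):
    """Median of the correlation values, as summarized in the text."""
    return median(correlations.values())
```

Given four decoding and four encoding scores per subject, `median_correlation(cross_correlations(decoding, encoding))` would yield the kind of summary value reported above.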


Hamming 


Only 13 of the 40 subjects earned ham like 
scores greater than zero; only 10 earned ham 
dislike scores greater than zero. Thus, the 
tendency to express more liking (or disliking) 
when lying than when really liking (or dislik- 
ing) was not particularly common in this 
sample. Further, ham like scores were not sig- 
nificantly correlated with ham dislike scores 
(r = .24), suggesting that there are probably 
at least two different kinds of hams. Finally, 
hams who exaggerate feigned liking and hams 
who exaggerate feigned disliking are both tre- 
mendously successful at their deception at- 
tempts (between ham like and encoding de- 
ception, or getting caught in one’s attempts to 
lie, r = −.60, p < .001; the r for ham dis-
like was identical).


Machiavellianism 


A median split was calculated for Machia- 
vellianism scores, and a Machiavellianism 
(High/Low) x Type of Affect (LD/DL) x 
Type of Lie (Leakage/Deception) ANOVA 
was computed with encoding scores as the 
dependent variable. A main effect for Machia- 
vellianism showed that high Machs were 
slightly more successful at getting away with 
their lies than were low Machs, F(1, 36) = 
3.04, p= .09; d= .58. The interaction of 
Machiavellianism with type of affect showed 
that high Machs were especially successful at 
deceiving when pretending to dislike someone 
they really liked, F(1, 38) = 5.96, p < .05; 
d= .79. The interaction of Machiavellianism 
with type of lie was not significant. 

An analogous 2 X 2 X 2 ANOVA was com- 
puted using decoding scores as the dependent 
variable. The main effect for Machiavellianism 
was very small, and the interactions of type of 
lie and type of affect with Machiavellianism
were also nonsignificant. 

In this sample, Machs also tended to be 
hams—significantly so for ham like (r = .31, 
p = .055), though only weakly for ham dis-
like (r = .14, ns).


Discussion 


Consistent with at least ten previous studies 
of human lie detection (Fay & Middleton, 
1941; Feldman, 1976; Harrison et al., 1978; 
Kraut, 1978; Littlepage & Pineault, 1978; 
Maier, 1966; Maier & Janzen, 1967; Maier & 
Thurber, 1968; Matarazzo et al., 1970; Zuc- 
kerman et al., 1979), we found that ob- 
servers were able to identify accurately the 
occurrence of deception. However, we also 
found that observers were markedly unable to 
see through to a speaker’s underlying affect 
when that affect was covered over by some 
other dissimulated display. Feldman (1976), 
too, in a study that examined both leakage 
and deception skill, found that observers 
could pick up the deception cues but not the 
underlying affect. Perhaps a convergent find-
ing is the near-zero correlation between iden- 
tifying deception and reading leakage. Ap- 
parently, an observer can have a keen sense 
of when “something fishy is going on,” with- 
out necessarily being able to differentiate the 
feigned feeling from the true affect.

Except for the very small relationship be- 
tween skill at hiding liking (LD) and skill at 
hiding disliking (DL) (M r = .16), the abil-
ity to lie successfully appears to be a fairly 
general one. People who effectively mask their 
deception attempts also effectively conceal 
their true affective states (M r = .52). Fur-
ther, people who are especially adept at fool- 
ing women (whether in terms of leakage or 
deception) also tend to be particularly skilled 
at deceiving men (M r = .67).

There was also a considerable degree of
consistency in the ability to detect different
kinds of leakages. People who were especially
skilled at noticing an underlying negative
affect were also particularly skilled at notic-
ing an underlying positive affect. Similarly,
people who were especially likely to read the
leakages of women were also especially likely
to read the leakages of men (M r = .78).
The tendency to read leaked cues, however,
is not significantly related to the ability to
recognize the occurrence of deception. Fur-
ther, the ability to detect deception appears
to be a much more disparate set of skills than
is the ability to perceive leaked affects. Peo-
ple who can tell that a lie is occurring when 
liking is being concealed cannot necessarily 
detect a lie when disliking is hidden. Further, 
observers who know when women are lying 
do not necessarily recognize dissimulation by 
men. Using a “To Tell the Truth” format,
Littlepage and Pineault (1978) also found 
a near-zero correlation between the ability to 
point to the impostrous male panelists and 
the ability to identify the impostrous women. 
Perhaps observers operate on the assumption 
that there is a single set of clues that tip 
people off to deception, when in fact women’s 
deceit is revealed in ways different from 
men’s. There is already some evidence suggest-
ing that people’s lay theories about clues to 
deception are not particularly accurate (Ek- 
man & Friesen, 1974; Kraut, 1978; Maier & 
Janzen, 1967; Krauss et al., Note 1). The 
second part of the hypothesis, about sex dif-
ferences in clues to deception, is still essen-
tially an unanswered question. With the ex- 
ception of a study by McClintock and Hunt 
(1975) and one by Mehrabian (1971, Experi- 
ment 2), most studies of deception cues tend 
to use subjects of only one sex or to exclude 
sex as a factor in the analyses (Ekman &
Friesen, 1972; Ekman, Friesen, & Scherer,
1976; Harrison et al., 1978; Knapp, Hart, & 
Dennis, 1974; Kraut, 1978; Streeter, Krauss, 
Geller, Olson, & Apple, 1977). Both Mehra- 
bian (1971) and McClintock and Hunt 
(1975) present evidence suggesting that there 
may in fact be sex differences in at least some 
of the behaviors emitted during deception. 

The relative homogeneity of encoding skills 
compared to decoding skills is consistent with 
Kraut’s conclusion that people are consist- 
ently successful or unsuccessful as liars but 
are not so consistent as lie detectors. The ex- 
ception in our data to the general consistency 
of skills in encoding lies was the very weak 
relationship between lying successfully about 
underlying positive feelings (LD) and lying 
successfully about underlying negative feel-
ings (DL). The importance of the particular
kind of affect that is being hidden or dissim-
ulated was also apparent in (a) the weak
correlation between detecting LD lies and
detecting DL lies; (b) the nonsignificant cor-
relation between exaggerating (hamming)
feigned liking and exaggerating feigned dis-
liking; and (c) the interaction between Mach-
iavellianism and the type of affect being
encoded. This suggests that conclusions drawn 
from studies involving just one variety of 
hidden affect or one kind of “sent” cue may 
in some cases be of limited generality. 

The ability to lie effectively to others was
unrelated to the ability to detect the
lies of others. Zuckerman et al. (in press)
reported a similarly low and insignificant
correlation (.20) between encoding and de-
coding deception. These results are consistent 
with the results of 17 studies of pure (non- 
deceptive) nonverbal communications: for 
those studies, the median correlation between 
encoding skill and decoding skill is .16 (De- 
Paulo & Rosenthal, in press). The ability to 
express one’s emotions accurately appears to 
be quite distinct from the ability to interpret 
the emotions of others accurately, whether 
those emotions are real or dissimulated. 

The Machiavellian subjects in our study, 
like those in the Krauss et al. (Note 1) study, 
were not especially skilled at detecting de-
ception; like the Machs in the study by Ex-
line et al. (1970), however, they were fairly
successful as liars. We also found that Machs
seem to favor an especially theatrical style of
deceiving, particularly when feigning positive
regard: they, more than their less Machia-
vellian counterparts, tend to be hams when
they lie. Moreover, the hamming strategy ap-
pears to work quite well: people who tend to
exaggerate sentiments of liking that they do 
not really feel, as well as people who exag- 
gerate feigned feelings of disliking, are much 
less likely to get caught in their lies than are 
the less histrionic sorts. The future of the 
study of hamming looks promising. Already
there is evidence suggesting both sex differ- 
ences (Rosenthal & DePaulo, in press) and 
developmental differences (Feldman, Jenkins,
& Popoola, in press) in this style of lying.
Methodological Considerations


The present study points to the advantages 
of distinguishing deception from leakage, of 
sampling a wide range of encodings, and of
using designs that completely cross the affects 
that are actually experienced with those that 
are dissimulated or “sent.” These design fea- 
tures should help to separate deceptive mes- 
sages from discrepant but nondeceptive ones 
and to distinguish deception from stress or 
from unpleasant affects. 

In our study, the lie detectors were the 
same persons who had previously served as 
liars; hence, they might have been predisposed 
to use particular kinds of cues or strategies. 
Moreover, in our study as in earlier investiga- 
tions (e.g., Streeter et al., 1977), judges knew 
exactly what proportion of the time the 
senders would be lying. Research paradigms 
in which the judges are not given this infor- 
mation—and, furthermore, are not specifically 
informed about the content of the lie—might 
be more realistic. Finally, comparisons of 
posed deceit with spontaneous lies will shed 
light on the generalizability of the kinds of 
deceptions we most often stage in our labora- 
tories, as will a more extensive sampling of 
deceptions in naturalistic settings. 


Future Research in Deception: Toward a 
More Programmatic Approach 


Studies of the abilities to detect lies and to 
deceive successfully comprise just one area in 
the psychological literature on deception. (We
concern ourselves here only with lie detection 
that is unaided by physiological measure- 
ments or detection devices; for a more physio- 
logically oriented review, see Lykken, 1974, 
1979, Orne, Thackray, & Paskewitz, 1972, or 
Podlesny & Raskin, 1977.) A second line of 
inquiry looks at accuracy of lie detection as a
function of differential access to different 
channels or modalities such as the face, the 
body, the tone of voice, the words, and vari- 
ous combinations of these cues (e.g., Ekman 
et al., 1976; Ekman & Friesen, 1969, 1974; 
Feldman, 1976; Littlepage & Pineault, 1978; 
Streeter et al., 1977; Zuckerman et al., in 
press; Krauss et al., Note 1). The appeal of 
these studies probably derives in part from
the suggestion that certain channels, such as
the face, that under ordinary circumstances
are extremely informative, can be especially
misleading under conditions of deception.
These studies demonstrate the kinds of infor-
mation that can be gleaned from a particular
channel when access is restricted to that chan-
nel; a further step would be to demonstrate
whether this same information is in fact ex-
tracted when information from other chan-
nels is available, too (cf. Streeter et al., 1977).
If it can be established that particular chan- 
nels are especially misleading under conditions 
of deception, it might then be asked whether 
people adjust their attentional strategies in 
ways that seem to acknowledge these modal- 
ity effects (DePaulo, Rosenthal, Eisenstat, 
Rogers, & Finkelstein, 1978), whether par- 
ticularly skilled lie detectors are especially 
likely to make these adjustments, and whether 
attentional instructions or training procedures 
can increase accuracy of detection. 

A third approach has concerned itself with 
identifying particular cues, such as smiling, 
speech errors, and pitch, which distinguish 
deceptive from nondeceptive responses (e.g., 
Ekman et al., 1976; Ekman & Friesen, 1972; 
Harrison et al., 1978; Knapp et al., 1974; 
Kraut, 1978; Luria, 1932; McClintock & 
Hunt, 1975; Mehrabian, 1971; Streeter et al., 
1977; Alker, Note 2). This line of research
might be advanced by a consideration of the 
relationships among (a) the cues that actually 
do distinguish truth from deception; (b) the 
cues that people say they use to distinguish 
truth from deception; and (c) the cues that 
people actually do utilize in their judgments 
(see Baskett & Freedle, 1974; Kraut, 1978). 

Studies of veridical clues to deception, when 
they do examine comparable clues, sometimes 
yield disappointingly inconsistent results. The
question “What are the clues to deception?” 
is probably too broad; instead, we may need 
to ask, “What are the clues to which kinds
of deceit by which kinds of deceivers to which
kinds of receivers in which kinds of situa-
tions?”


A Matrix for Generating and Organizing 
Research on Deception 


We can start, then, by sketching a tax-
onomy of lies, liars, lie detectors, and social
settings. Then, by locating specific studies of
clues to deception within this matrix of lie 
types, setting types, deceiver types, and de- 
tector types, perhaps higher order consist- 
encies will emerge. This four-dimensional ap- 
proach should be equally relevant to the 
other two lines of inquiry. Thus, for example, 
we might want to know whether certain chan- 
nels are especially misleading when communi- 
cating certain kinds of lies, or whether people 
are especially effective liars or lie detectors in 
same-sex or same-status dyads.

Among the many kinds of lies are ordinary
white lies, proffered in the service of smooth-
ing social interaction, lies of self-aggrandize-
ment for purposes of impression management
or personal gain, and lies in which the de-
ceived party is the intended beneficiary, as
in physicians’ careful cloaking of grim prog-
noses or parents’ imaginative fabrications
about sex and death.

Lies may involve facts that may or may
not be verifiable by external criteria (as in
police investigations), or they may involve
attitudes or opinions or feelings about oneself
or others. Lies vary also in the degree of guilt
or stress that they cause for the deceiver, the
degree of involvement that they engender in
the topic of the lie, the consequences of
getting caught in the lie, and whether the lies
were planned and rehearsed or unpremed-
itated.

An important dimension of social settings
is the degree of normative or situational sup-
port that they offer to the deceiver. Those
who lie to parole officers (especially in
“tough” neighborhoods) or to captors in a
military setting probably find their deceit
more readily justifiable than those who lie to
priests, doctors, and close friends. Structural
characteristics of contexts, such as the degree
of formality or informality or of democracy
or autocracy, may also importantly affect the
cues used to detect deceit, the modalities used
to convey deceit, and the overall ease with
which lies are discovered or perpetuated. Still
another important aspect of social settings is
the people in those settings: the number of
participants; their structural relationship to
the deceiver and the deceived (e.g., whether
they are outside observers or are directly in-
volved in the interaction); their affective
relationship (whether they are more closely
allied with the deceiver or with the deceived);
and their expectations about the probability
of occurrence of deception.

Relevant characteristics of both the de- 
ceiver and the detector include demographic 
characteristics (e.g., age, sex, race, socio- 
economic status); personality characteristics 
(e.g., adjustment, anxiety, Machiavellian- 
ism); social competence (e.g., nonverbal sen- 
sitivity); and cognitive competence (e.g., in- 
telligence). The liar’s and the lie detector’s 
appraisal of success in deception (i.e., whether 
they view effective liars as clever and well 
adjusted or as immoral and psychopathic) 
might also predict the outcomes of deceptive 
transactions. These effects are probably best 
studied as two-way interactions in designs 
that cross characteristics of the deceiver with 
analogous characteristics of the detector. 

Trait versus state aspects of lying might 
also mediate deception effects—that is, the 
deceiver may be an inveterate liar or an 
occasional liar who at the present moment is 
or is not telling the truth; similarly, the lie 
detector may be a chronic paranoid or a naive 
and unsuspecting soul who at the present mo- 
ment is or is not suspicious of the deceiver’s 
intent. The relationship of the deceiver to the
deceived, whether spatial, temporal (is this a
one-time meeting or will they see each other
again in the future?), or personal (are
they strangers, friends, intimates, or enemies?
what is their relative status and power?),
should also be considered in future research
efforts.


Attributional Information Processing: A Response Time Model
of Causal Subtraction


Response time measures have been used occasionally in social psychology, but
rarely as direct probes of information processing. A study collecting response 
time data in a near-exact replication of McArthur’s classic attribution study 
sheds light on the information processing involved in subjects’ responses. The 
process is analyzed into two stages: (a) encoding or comprehension of the 
stimulus sentence and the consensus, distinctiveness, and consistency informa- 
tion and (b) attributional processing per se. In the second stage, response time 
analyses suggest that perceivers operate by subtracting causes from an initial 
set to arrive at a response, rather than by adding causal components (person, 
stimulus, and circumstances) until an adequate cause is obtained. Subtraction 
is theoretically related to the salience model of attribution. Response time 
measures promise to expand greatly the ability of social psychologists to build 
process models of causal attribution and other kinds of social perception and
cognition.


Researchers on causal attribution have ex- 
pressed increased interest recently in the 
formulation of models of attributional pro- 
cessing, as distinct from the prediction of at- 
tributional judgments or outcomes (Pryor & 
Kriss, 1977). A model of attributional infor- 
mation processing permits the prediction of 
social cognition and behavior in a range of 
new situations (assuming that the same pro- 
cess will continue to operate), whereas studies 
that just measure attributional outcomes are
hampered by the fact that any particular out-
come could be generated by a large number of
different processes, with different implications 
for new situations (Anderson, 1976, p. 11). 
In addition, information-processing models 
of attribution may find a place within the 
general models of human cognitive function-
ing that are currently being developed in cog-
nitive psychology (e.g., Anderson, 1976;
Schank, 1975), leading us closer to a fruitful
union of cognitive and social psychology 
(Simon, 1976; S. Taylor, 1976). 

Response time (RT) measures, a fre- 
quently used tool for the development of cog- 
nitive process models, are just beginning to be 
used in social psychology. Pryor and Kriss 
(1977) used response times to validate a 
manipulation of salience or availability: they 
demonstrated that the first noun presented in 
a sentence (whether subject or object) was 
more quickly recognized in a response time 
test than the second noun presented. They 
concluded that the first noun was more avail- 
able. Markus (1977) used response times to 
demonstrate that people with “self-schemata” 
relevant to a given adjective identified those 
adjectives more quickly than did subjects 
without such self-schemata. However, social 
psychologists have not yet used response time 
methods to probe processing directly (cf. 
Posner, 1978), as cognitive psychologists 
have used them (e.g., Carpenter & Just, 1975; 
Schneider & Shiffrin, 1977; Shiffrin & Schnei-
der, 1977).

Two of the ways in which response time
measures can probe processing will be con-
sidered here (for a review, see D. Taylor,
1976). First, the time for a response may 
vary depending upon experimentally manip- 
ulated factors. The studies by Carpenter and 
Just (1975) on verification of affirmative and 
negative sentences illustrate this possibility. 
Negative sentences take longer to verify, and 
this can be interpreted to indicate that they 
are transformed to affirmative before the veri- 
fication process begins. The second possibility 
is that a relationship may exist between the 
content of a response and the time needed for 
the response. For example, the time for “not 
match” judgments about the identity of com- 
plex stimuli is greater than for “match” judg- 
ments in the data of Trabasso, Rollins, and 
Shaughnessy (1971). This is used as evidence 
for a serial feature-matching process, in which 
the features of the stimuli are compared for 
identity one at a time, with a mismatch of 
features leading to several time-consuming 
processes such as preparing a “not match”
response and rechecking features. 

The current study uses these two types of 
analysis to examine the processes involved in 
causal attributions in a replication of the 
now-classic McArthur (1972) attribution 
study. McArthur presented her subjects with
sentences like “Sue is afraid of the dog,” fol- 
lowed by three types of additional informa- 
tion. High (low) consensus is information 
about the actions of other people: “Almost 
everybody (nobody) else is afraid of the 
dog.” High (low) distinctiveness concerns 
this person’s responses to other objects: “Sue 
is not afraid of almost any other dog (is 
afraid of almost every other dog).” High 
(low) consistency describes the past history 
of the response: “In the past Sue has almost 
always (never) been afraid of the dog.” 
The three types of information are manipu-
lated in a 2 × 2 × 2 design, and subjects’
responses as to what caused the event to 
occur are obtained. McArthur’s dependent 
measure offers subjects a choice of four 
causal possibilities: 

1. Something about the person (e.g., Sue)
probably caused her to make response X
(e.g., be afraid) to stimulus X (e.g., the
dog).

2. Something about stimulus X probably
caused the person to make response X to it.

3. Something about the particular circum-
stances probably caused the person to make
response X to stimulus X.

4. Some combination of 1, 2, and 3 above
probably caused the person to make response
X to stimulus X. (The subject is asked to
check either 1 + 2, 1 + 3, 2 + 3, or 1 + 2
+ 3; McArthur, 1972, p. 175.)

McArthur’s study was originally designed
to test Kelley’s (1967) analysis of variance
(ANOVA) model of attribution and has been
extremely influential in the attribution
research tradition.
The addition of a response time measure
to this study will allow two types of analysis.
The first purpose of this study is to investi- 
gate the influence of consensus, distinctive- 
ness, and consistency information on the 
time taken for attributional processing. It 
may be hypothesized that certain patterns 
of information will be easier to encode and 
process than others. For example, one pre- 
diction is that certain “standard” patterns 
of information (Orvis, Cunningham, & Kel- 
ley, 1975, suggest the three patterns high/ 
high/high consensus/distinctiveness/consist- 
ency, low/low/high, and low/high/low) are 
consistent with perceivers’ expectations and 
should be more quickly processed than other 
patterns. In general, relationships between 
the manipulated information factors and RTs 
give insight into subjects’ encoding and pro- 
cessing of the information. 

The second aim of this study is to examine 
the relations between the content of causal 
judgments and the time taken to arrive at 
them in order to clarify the procedure the 
attributor uses. Several procedures are pos- 
sible. One is a “look-up” process, in which 
the perceiver refers to memory for a par- 
ticular attributional response that is stored 
with each possible configuration of informa- 
tion (or at least with some of the possible 
patterns), much as one looks up a name in a
phone book to locate a number (the re-
sponse). That is, one may have stored the
fact that low/low/high information points to
the person as cause; that high/high/high
implies stimulus; that low/high/low implies
circumstances, and so forth. The process of
looking up each information configuration
should take about the same amount of time,
yielding a prediction of no relation between
mean RT and the content of the response.
Alternatively, one could predict (following 
Orvis et al., 1975, p. 606) that perceivers 
may have stored only these three “standard 
data patterns,” so that responses of person, 
stimulus, or circumstance attributions to 
those patterns are quick, whereas other re- 
sponses are slower, since the perceiver must 
engage in further processes (perhaps com- 
paring the observed information pattern to 
more than one of the standard patterns to 
arrive at a compromise response). 
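
As a minimal sketch of the look-up idea (in its Orvis et al. variant), the three "standard" patterns can be stored as a table keyed by consensus, distinctiveness, and consistency. The names `STANDARD_PATTERNS` and `look_up` are hypothetical; the sketch shows only the logic of the model, not a claim about how perceivers actually store such patterns.

```python
# Keys are (consensus, distinctiveness, consistency) levels; values are the
# attribution stored with each "standard" pattern named in the text.
STANDARD_PATTERNS = {
    ("low", "low", "high"): "person",
    ("high", "high", "high"): "stimulus",
    ("low", "high", "low"): "circumstances",
}


def look_up(consensus, distinctiveness, consistency):
    """Return the stored attribution, or None if the pattern is not standard.

    A None result stands for the case in which the perceiver must engage in
    further (slower) processing to reach a compromise response.
    """
    return STANDARD_PATTERNS.get((consensus, distinctiveness, consistency))


if __name__ == "__main__":
    print(look_up("low", "low", "high"))    # stored pattern
    print(look_up("high", "low", "high"))   # nonstandard pattern
```

Because a table look-up costs about the same for every stored key, this variant predicts quick single-component responses to the standard patterns and slower responses elsewhere.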

A second type of process is additive. Here
the perceiver begins by considering a single
causal factor (most likely the person) and
adds other factors to it (a process that
takes time) when and if the available in-
formation indicates that they are required.
That is, if a quick check shows that the cur- 
rent response is adequate, it is given. On the 
other hand, if a mismatch appears between 
the tentative causal response and the given 
information, another component must be 
added to the tentative response (e.g., chang- 
ing from person to person plus circum- 
stances), and a response change has been 
found to require processing time (cf. Newell, 
1973, p. 490; Trabasso et al., 1971). It may
also be true, as Trabasso et al. (1971) found,
that the detection of a mismatch between 
response and information requires more time 
than detection of a match. The overall pre- 
diction is thus that single-component re- 
sponses (or at least the person response) 
should be quick and multi-component re- 
sponses should be slower; ideally, this model 
would give a positive, linear relationship be- 
tween the number of components in the re- 
sponse and the response time, if detecting a 
mismatch and adding any component to the 
response takes the same increment of time.
This sort of model is suggested by the way
in which psychologists tend to think of attri-
butional responses. Examples are the treat-
ment by Orvis et al. (1975) of information
patterns for the three single-component re-
sponses as “standard,” and Kassin and Hoch-
reich’s (1977) labeling of single-factor re-
sponses as “simple” and multi-factor re-
sponses as “complex.”

A third suggested process is the inverse 
of the second. Instead of beginning with a
single component as a tentative response and
engaging in processing that may add compo- 
nents to the response, the perceiver might 
begin with a three-component tentative re- 
sponse and engage in processing that may 
delete or subtract components. As in the 
second model, the subject is presumed to 
check the tentative response against the given 
information; again, a match is presumed to 
lead quickly to the emission of that already 
prepared response. A mismatch leads to 
processing, the effect of which is to delete a 
causal component from the response. As in
the second model, the changing of the tenta- 
tive response is presumed to be time-consum- 
ing. The overall prediction is that the three- 
component response of person plus stimulus 
plus circumstances should be the quickest 
and single-component responses the slowest, 
as the latter involve the most deletions 
(two). Ideally, one would obtain a negative, 
linear relationship between the number of 
causal components in the response and the 
response time, indicating that the mismatch 
detection and subtraction of any component 
from the response takes about the same 
amount of time. The predictions of these
three models are summarized in Table 1. 

Thus this study will investigate the rela- 
tionship between the reported attribution and 
the time taken to report that attribution to 
enable a choice among the three processes 
suggested above. From McArthur’s (1972) 
study and many other investigations we 
know the content of the causal attributions 
that subjects produce for each information 
combination. However, each of the three 
processes outlined above could generate that 
same set of outcomes (cf. Anderson, 1976, 
p. 11), and thus the entire body of prior 
research gives inconclusive evidence as to 
the nature of the process, the issue con- 
sidered here. 
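
The contrasting predictions of the three models can be written out as a small sketch. The function below is illustrative only: `predicted_rt`, `base`, and `step` are hypothetical names and values in arbitrary time units, chosen to show the direction of each prediction rather than any fitted quantity. The look-up case shown is the pure version, with the same time for every response.

```python
def predicted_rt(model, n_components, base=10.0, step=1.0):
    """Predicted response time for a response with 1, 2, or 3 causal components.

    base: hypothetical time to emit the initial tentative response.
    step: hypothetical time to detect a mismatch and add or delete one component.
    """
    if n_components not in (1, 2, 3):
        raise ValueError("n_components must be 1, 2, or 3")
    if model == "look_up":
        # Every configuration is retrieved in about the same time.
        return base
    if model == "additive":
        # Start from one component; each added component costs one step.
        return base + step * (n_components - 1)
    if model == "subtractive":
        # Start from three components; each deleted component costs one step.
        return base + step * (3 - n_components)
    raise ValueError("unknown model: " + model)


if __name__ == "__main__":
    for model in ("look_up", "additive", "subtractive"):
        print(model, [predicted_rt(model, n) for n in (1, 2, 3)])
```

Under these assumptions the additive model yields a positive linear trend across one, two, and three components, the subtractive model a negative one, and the look-up model a flat line.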


Method 


The procedure of this study generally followed
that of McArthur (1972), which should be con-
sulted for further detail. To permit the recording
of RTs, the study was conducted with the aid of
a computer. The stimulus materials (sentences)
were displayed on a televisionlike cathode-ray tube
screen and subjects reported their attributions by
pressing response keys. Timing started when the
sentence appeared and ended when a key was
pressed; the times thus include reading and com-
prehension time as well as attributional processing.
The computer controlled the generation and pre-
sentation of stimuli, recorded the responses and RTs,
and output the data for analysis.


Table 1
Response Time Predictions for Three Models

                                         Response
Model                      1-component   2-component   3-component
Look up                    Same time for any response
  (Orvis et al.ᵃ variant)  Short         Moderate to long
Additive                   Short         Moderate      Long
Subtractive                Long          Moderate      Short

ᵃ Orvis, Cunningham, & Kelley, 1975.


Materials 


The stimulus materials are described above and
in McArthur (1972). The 32 sentences used fall
into four categories based on the type of verb:
action, accomplishment, opinion, and emotion. Like
McArthur, we will be [...] the distinction between
[...] versus that between the third and fourth
(manifest vs. latent verbs).


(a) 
> her Procedure was 


the responses indi- 
replication of cArthur’: 
= achieved. = ne 

e question to which subjects ri 
that used by McArthur; “Your ES ti Aid 
tion am, what prob- 

occur”: (] 
McArthur’s response format was ae 


and circumstances causal factors, atimu- 
were permitted to respond with just eae 


is response format Posed no dj x 
adopted for the data nn e culty, 


$ 
each trial with three fingers of the preferred hand
resting on the three keys. Timing stopped when
the computer sensed the first key press; if any
further key presses took place within one-half sec,
they were recorded as part of a multikey com-
bination response (the keys of a combination re-
sponse were nearly always pressed simultaneously
by subjects). The assignment of the labels person,
stimulus, and circumstances to the three keys was
counterbalanced between subjects.
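
The scoring rule for key presses can be sketched as follows. This is an assumption-laden illustration rather than the original program: the function name `score_response`, the tuple layout of `presses`, and the treatment of a press at exactly one-half second as part of the combination are all hypothetical choices.

```python
def score_response(presses, window=0.5):
    """Score one trial from its key presses.

    presses: list of (time_in_seconds, key_label) tuples, with time measured
             from stimulus onset.
    window:  presses within this many seconds of the first press are counted
             as part of a multikey combination response.

    Returns (response_time, keys), where response_time is the time of the
    first press and keys is the set of labels in the combination.
    """
    if not presses:
        raise ValueError("no key press recorded for this trial")
    ordered = sorted(presses)
    first_time = ordered[0][0]
    keys = frozenset(label for time, label in ordered
                     if time - first_time <= window)
    return first_time, keys


if __name__ == "__main__":
    trial = [(12.30, "person"), (12.34, "stimulus"), (13.50, "circumstances")]
    print(score_response(trial))
```

In this sketch the response time is always that of the first key press, so a combination response is not penalized for the small lag between its keys.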


Procedure 


[...] in the study, the actual trials began. On each trial,
the subjects read the sentence and additional con-
sensus, distinctiveness, and consistency information;
responded with an attribution by pressing a key
or keys; and then signaled when they were ready
for the next trial. The intertrial interval was not
timed. Trials were administered in two blocks of 1[...]
each, with a 3-5 min rest between blocks. The 3[...]


[...] (using the same experimental materials, e.g., sen-
tences) but with a new sample of subjects and a
new sample of materials as well. The routine use
of simple F statistics with only subject as an error
term provides no evidence for the replicability of [...]


[...] the technical aspects of this analysis more
[...], including the procedure for assigning infor-
mation conditions to sentences for each subject and
recovering them in the analysis.

Analyses were performed for several dependent
variables, including (a) response times; (b) the
response categories McArthur used, which were
person, stimulus, circumstances, person plus stimu-
lus, and “other” attributions (all other types com-
bined); and (c) the person, stimulus, and circum-
stances components of the response. The compo-
nents (i.e., the responses on the three keys treated
as three separate dependent variables) have the ad-
vantage that they are statistically independent,
which is not the case with McArthur’s categories,
[...] to fall into exactly one [...] analysis of
components may more accurately represent the
content of subjects’ responses. For example, a
response of person plus circumstances will be
counted here as (both) a response involving person
and a response involving circumstances, whereas
the original category system would treat such a
response with “others” as unrelated to either
person or circumstances.

[...] because a slope of nearly 1 related the raw
RT cell means to the cell standard deviations.
Winer (1970, [...] a log transformation in [...]
practically succeeded in eliminating the cell
mean-standard deviation relationship.

Results


Response Times 


The analyses of variance by information
condition using log response time as the
dependent variable yielded only one signifi-
cant effect, a distinctiveness × consistency in-
teraction. Under high distinctiveness/high
consistency and low distinctiveness/low con-
sistency, responses were significantly slower
(M = 13.44 sec) than under the other two
combinations, M = 12.32 sec, quasi-F(1, 7)
= 18.9, p < .01. The verb-type factor had
no significant main effects or interactions af-
fecting response time; thus manifest and la-
tent verbs do not seem to differ in the
amount of time required for attribution.


Relations Between Response and 
Response Time 


Collapsing across information conditions,
one can analyze the relationship between the
response given by the subject and the RT
for that response. This analysis should indi-
cate the attributional procedure followed by
the subject, as discussed above. Ignoring
conditions in this analysis is appropriate,
since subjects receive conditions in a random
order and so do not know until the stimulus
appears which condition it represents. Thus
subjects are unable to apply different attri-
butional procedures for different conditions.
(In addition, if subjects did use different
procedures for different conditions, one would
expect to find only confused relationships
between responses and RTs, as a result of
averaging across different procedures.) The
results of this analysis are shown in Tables
2 and 3. The data show a very clear nega-
tive, linear trend in log response time de-
pending upon the number of components in
the response. Subjects take the shortest time,
approximately 11.07 sec (7.009 in log units),
to make the “complex” person-plus-stimulus-
plus-circumstances response, and the longest
time, approximately 13.64 sec, for the “sim-
ple” single-factor responses. Each component
that is subtracted adds approximately .097
in log units to the RT. Deviations from this
linear trend are nonsignificant, so one would
not gain by assuming that different compo-
nents might take different amounts of time
to subtract.


Table 2
Response Time by Response Analysis,
Using Logs

                                       M response time
Response                               Log        Sec
Person                                 7.223      13.71
Stimulus                               7.209      13.52
Circumstances                          7.222      13.69
Person + stimulus                      7.142      12.64
Person + circumstances                 7.171      13.01
Stimulus + circumstances               7.116      12.32
Person + stimulus + circumstances      7.009      11.07


Table 3
Analysis of Variance of Logs

Effect                                  Sum of squares    df     MS       F       p
Between responses                            3.570          6    .595     3.57    .002
Linear trend in number of components         3.327          1   3.327    19.92    .001
Deviations from linear trend                  .243          5    .049    < 1      ns
Within groups                              126.786        760    .167

Note. Regression summary of linear trend: Log RT = 7.315 − (number of components) × .097.


Attributions

The results with McArthur’s (1972) de-
pendent measures paralleled those of the
earlier study nearly exactly, except that the
analyses in this study have less power be-
cause of their reliance on quasi-F ratios
rather than simple Fs. Across the 28 ANOVA
effects in the two studies (4 Dependent Mea-
sure Categories × 7 Main Effects and Inter-
actions of the informational variables), 9
effects were significant in both studies; 10
effects were significant in McArthur’s study
and not here, and none was significant here
and not in McArthur’s investigation. All
effects that were significant in both studies
had mean differences in the same direction.¹

Analyses of the person, stimulus, and cir-
cumstances components produced somewhat
simpler results, as predicted. In contrast to
the analysis of the McArthur dependent
measures, no interaction effects were found
with the components. Person attribution was
increased by low consensus, low distinctive-
ness, and high consistency. Stimulus attribu-
tion was increased by high consensus. Cir-
cumstances attribution was increased by low
consistency. Means and quasi-F tests are
summarized in Tables 4 and 5.


Table 4
Means for Significant Effects of
Informational Factors on Causal Components

                       Proportion of responses including:
Information level      Person     Stimulus     Circumstances
Consensus
  High                  .552       .685         [...]
  Low                   .724       [...]        [...]
Distinctiveness
  High                  [...]      [...]        [...]
  Low                   [...]      [...]        [...]
Consistency
  High                  .714       [...]        [...]
  Low                   [...]      [...]        [...]


¹ The results of these analyses of McArthur’s de-
pendent variables are not presented in full; they are
relied upon simply to establish that [...]
replication of McArthur’s results has occurred. The
analyses reported in the next paragraph and Table
3 represent a simpler and more direct presentation
of the attributional results than do the analyses of
McArthur’s dependent variables.


Discussion

The near-exact replication of McArthur’s
results (when her measures are analyzed) in
this study indicates that the changes in de-
sign (a different set of sentences and a new
response format) did not materially affect the
results of subjects’ information processing.
However, the addition of the response time
dependent measure yields valuable informa-
tion about the processes that underlie the
attributional judgments made by subjects.
The results will be discussed in the context
of a proposed model of information process-
ing containing two stages, the times for
which are added to produce the total RT
(cf. D. Taylor, 1976). The stages are en-
coding the presented information (stimulus
sentence plus the three types of information)
and attribution processing. The logical neces-
sity for these two stages in a model for the
current task is clear. The subject must begin
by reading and understanding the stimulus
sentences, transforming them into some in-
ternal representation in memory. This en-
coding stage must take time. Second, though
some inferences about causal relations may
take place during encoding (cf. Kintsch,
1974; Schank, 1975), the encoded informa-
tion must still be used in some fashion to
generate and check an appropriate response
to the attributional question; three alterna-
tive processes for this stage were outlined
above. Of course, other processing may take
place as well (for example, the generation of
physical movements involved in making the
response), but if their time is roughly con-
stant across conditions, the conclusions drawn
for the two-stage model are not affected
(only the grand M RT will be changed).
The current data cannot indicate whether
the two stages are purely sequential. At least
some encoding must precede attributional
processing, but the two stages might overlap
to some extent; our conclusions are un-
affected by this variable. (For a comparable
two-stage model for a concept verification
task, comprehension and processing, see
Trabasso et al., 1971.)


Encoding 


The effects of the informational variables 
on RT can be interpreted as representing 
encoding or comprehension processes. In par- 
ticular, since distinctiveness and consistency 
interact with this particular task, they must 
be presumed to influence the same stage 
(Sternberg, 1969; D. Taylor, 1976). The 
reasoning behind this conclusion is based 
on the fundamental assumption of RT mod- 
els, that of additive stages. If two factors 
affected the times of different stages, their 
effects on the total RT would be additive 
(and thus noninteractive). Hence, factors
that interact must influence the time of a
single stage. Consensus does not interact
with the other two information factors, but
this need not cause its assignment to a different
stage (D. Taylor, 1976, p. 190).
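
The additive-stage reasoning above can be illustrated with a small numerical sketch. All stage durations below are hypothetical values chosen for illustration (none come from the study), and the function names are assumptions of this sketch. When two factors act on different stages, the 2 x 2 interaction contrast on total RT is zero; when both act on one stage and combine nonadditively within it, the contrast is nonzero.

```python
# Hypothetical stage durations in ms; these numbers are illustrative only.
def total_rt(encode_ms, attribute_ms, other_ms=300):
    """Total RT under the additive-stage assumption: stage times simply sum."""
    return encode_ms + attribute_ms + other_ms


def interaction_contrast(cell):
    """(A1B1 - A1B0) - (A0B1 - A0B0) for a 2 x 2 table of mean RTs."""
    return (cell[(1, 1)] - cell[(1, 0)]) - (cell[(0, 1)] - cell[(0, 0)])


# Case 1: factor A affects only encoding, factor B only attribution.
separate = {
    (a, b): total_rt(500 + 100 * a, 700 + 150 * b)
    for a in (0, 1)
    for b in (0, 1)
}

# Case 2: A and B both affect encoding and combine nonadditively there.
same_stage = {
    (a, b): total_rt(500 + 100 * a + 150 * b - 120 * a * b, 700)
    for a in (0, 1)
    for b in (0, 1)
}

print(interaction_contrast(separate))    # 0: effects on total RT are additive
print(interaction_contrast(same_stage))  # -120: the factors interact
```

The converse inference drawn in the text (interaction implies a shared stage) follows from the first case: factors on separate stages cannot produce a nonzero contrast as long as stage times add.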

The distinctiveness-consistency information combinations high-low
and low-high take less time to encode than do the other two
combinations. Why is this? Subjects’ expectations may provide an
answer. In two studies Orvis et al. (1975) presented subjects with
single information items (consensus, distinctiveness, or consistency
alone) and asked the subjects to predict the levels of the other
information factors. Subjects given high consistency predicted that
distinctiveness would be low, in both of the studies by Orvis et al.;
subjects given low consistency predicted that distinctiveness would
be high in both studies. Similarly, high distinctiveness led to
predictions of low consistency (in only one of the two studies,
however), and low distinctiveness led to predictions of high
consistency (in both studies). Thus, in seven of these eight
comparisons the high-low and low-high combinations seem to go
together or to be consistent with subjects’ expectations. They are
therefore partially redundant (carry less information) and so should
be encoded more quickly. The other two combinations, high-high and
low-low, do not have this redundancy; they tend to contradict
subjects’ expectations and should take more processing time to
encode. The distinctiveness-consistency interaction effect on
overall RT is thus explicable as a function of encoding processing.


Table 5
Quasi-F Tests for Significant Effects of
Informational Factors on Causal Components

Effect                            df       F      p <
Consensus on person              1, 13   17.36   .001
Consensus on stimulus            1, 11   23.03   .001
Distinctiveness on person        1, 6    81.52   .001
Consistency on person            1, 9    20.64   .001
Consistency on circumstances     1, 13   47.68   .001

Note. All effects not shown are not significant at the .05 level.
Denominator dfs for a quasi-F test depend on the data and are not
simply the df for a denominator MS in the analysis of variance table.


Attribution 

The second stage is the inference of a
cause using the encoded forms of the presented
information. The negative relationship
obtained between the number of components
in the response and the response
time suggests that the process is subtractive.
The subject appears to start by considering
person plus stimulus plus circumstances as
a cause. If that is found to be the appropriate
response, it is given very quickly;
otherwise, components are successively subtracted
from that set as they are found to be [...].
Each detection of a mismatch between the [...]
and the information, followed [...] component,
takes [...] of time (.097 in the lo[...]). [...]
component causes all t[...] than the [...] time
to give [...] response, and single-component
causes take longest of all.
The pattern of results weighs against the
additive model of attributional processing, 
which predicts that single-component re- 
sponses should be the quickest rather than 
the slowest. The information look-up model 
is also inconsistent with the results; this 
model postulates that the perceiver looks up 
the causal response appropriate for each pos- 
sible configuration of information. This model 
predicts that RT should be not at all dependent
upon the response given or else (in
the variant of Orvis, Cunningham, & Kelley)
that single-component responses should be
quickest.
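
The contrast between the subtractive and additive models can be sketched as follows. This is a minimal illustration, not the authors' analysis: the base time and per-step cost are arbitrary assumed values, and the function names are hypothetical. Only the predicted ordering of response times matters.

```python
COMPONENTS = ("person", "stimulus", "circumstances")


def subtractive_steps(response):
    """Steps needed when starting from all three components and removing mismatches."""
    return len(COMPONENTS) - len(response)


def additive_steps(response):
    """Steps needed when starting from nothing and adding one component at a time."""
    return len(response)


def predicted_rt(steps, base=1.0, step_cost=0.1):
    # base and step_cost are arbitrary illustrative values, not estimates from the data
    return base + step_cost * steps


responses = {
    3: ("person", "stimulus", "circumstances"),
    2: ("person", "stimulus"),
    1: ("person",),
}

for size, response in responses.items():
    sub = predicted_rt(subtractive_steps(response))
    add = predicted_rt(additive_steps(response))
    print(size, round(sub, 2), round(add, 2))
```

Under these assumptions the subtractive model predicts that three-component responses are fastest and single-component responses slowest, which is the ordering reported in the text; the additive model predicts the reverse.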


Subtraction and Salience 


The subtraction model of attributional 
processing faces a problem that has not been 
discussed yet, namely the origin of the spe- 
cific causal elements included in the initial 
tentative response. Other models, for ex- 
ample the additive model, do not face this 
problem: for them, a causal component (e.g.,
person or circumstances) to be added to the
tentative response is suggested by a consideration
of available information (e.g., consensus
or consistency) about the event under
scrutiny. But the subtractive model holds
that people start with a list of causal elements
prior to examining the relevant information.
Where do these causal possibilities
come from? The reason this issue has
not been discussed to this point is that its 
solution in the context of the McArthur ex- 
periment is obvious: subjects use the causal 
possibilities suggested by the dependent mea- 
sure. If the subtractive model is to describe 
attributional processing outside this context 
though, some other mechanism must be 
involved.

In fact, the literature contains evidence 
for several possible mechanisms. One is sali- 
ence. A model of attribution proposed by 
Pryor and Kriss (1977) and Taylor and 
Fiske (1978) holds that perceivers seize 
upon a single sufficient cause that is salient 
subtraction model thus predicts the salience
phenomena that have been empirically obtained
by the above-mentioned researchers
and others, since salient causes will be considered
as candidate causes and will often be
given as an attribution. The proposal advanced
here, however, goes beyond prior proposals
to put the role of salience into a
broader context of attributional information
processing.

A second possible mechanism is scripts. If 
the action to be explained is part of an 
expected sequence of events, then the script 
(Schank & Abelson, 1977) of which it is 
part may suggest one or more plausible can- 
didate causes. For example, if the waiter 
comes to one’s table in a restaurant, one can 
tentatively guess that he may be bringing 
a menu, preparing to take one’s order, and 
so forth. Again, these causes should then be
subtractively processed. Thus, scripted causes
are also candidates for the initial list. There
are other possible sources of causes as well;
for example, a person may simply (for idiosyncratic
reasons) enter a situation with
some specific expectation as to the nature
of the causal factors operating there.

In any case, the subtraction model’s difficulty
seems solvable. Even outside the experimental
context, sources of candidate
causal factors (salience, scripts, or expectations)
that enable the person to put together
an initial list of one or more tentative causes
are frequently available. The model holds
that this initial list is then processed by
attempting to rule out causes. When this
has been done to the perceiver’s satisfaction
(the criterion may be influenced by demands
to be careful, anxiety, time pressure, etc.),
the perceiver reports a cause. It is not yet
clear what will happen if the perceiver succeeds
in ruling out all candidate causes from
the list. Presumably a new list will be put
together and subjected to the same process of
subtraction, and this should be time-consuming.




Conclusions 


The data reported here are interpreted by
proposing a two-stage model of the cognitive
processing elicited by the experimental task:
encoding or comprehending the stimulus text
and attributional processing itself. Examination
of the relationship between RT and the
content of the response led to the proposal
that perceivers assign a cause by subtracting
causal components after beginning by considering
as candidates the full set of three
components. The subtractive model was
shown to be consistent with salience effects
on attributions, since perceptual salience can
suggest candidate causes for consideration.

The methods applied in this paper have 
been much used in cognitive psychology but 
have as yet found little application in social 
psychology. The work of Kintsch (1974; 
Kintsch & van Dijk, 1978) and others on 
text comprehension should be referred to 
for clues as to how social psychologists might 
increase their knowledge of the first stage, 
the comprehending or encoding of events for 
attribution. Response time methods can be
extended to cover attributional processing in 
other types of situation and with different 
materials and response formats, allowing tests 
of the idea that the basic subtractive na- 
ture of the attribution process may be a 
stable, general feature. 


Reference Note 


1. Kenny, D. A., & Smith, E. R. A note on the
analysis of designs in which subjects receive each
stimulus only once. Unpublished manuscript,
University of Connecticut, 1979.
The Effects of Self-Esteem, Success-Failure, and
Self-Consciousness on Task Performance


Joel Brockner 
State University of New York College at Brockport 


Previous research has demonstrated that the task performance of low
self-esteem individuals (low SEs) suffers in the presence of
self-focusing stimuli (e.g., a mirror). The present study was designed
to determine if such stimuli must inevitably have adverse effects on
low SEs. It was reasoned that if low SEs were provided with success
feedback from a previous task, then the nature of their
self-consciousness would be altered on a subsequent task. Specifically,
low SEs should attend more to positive and less anxiety-provoking
aspects of themselves than would low SEs who received failure feedback
from the previous task. Under the former condition, the low SEs’
subsequent task performance was expected to improve. For high
self-esteem individuals (high SEs), who typically perform well,
previous success-failure feedback was expected to have little effect
on subsequent performance. In a three-factor design, subjects high and
low in chronic self-esteem received false success or failure feedback
from a previous task and completed a concept formation task in either
the presence or the absence of a mirror. Whereas high SEs performed
equally well following success or failure, low SEs in the success
condition performed significantly better than low SEs in the failure
condition (and just as well as high self-esteem-success participants).
This Self-Esteem × Prior Feedback interaction was significant in the
presence of the mirror, but not in its absence. In the absence of the
mirror, however, this interaction was observed for subjects who were
high in dispositional self-consciousness, but not for those who were
low. Practical and theoretical implications of these findings are
discussed.


Clinical practitioners and researchers have 
long been concerned with ways to enhance 
the self-evaluations of individuals suffering 
from low self-esteem (SE). Unfortunately, 
this task has proven quite difficult, as low 
self-esteem people (low SEs) seem to be 
trapped in a “vicious cycle” of negativity. 
For example, many investigators (Hamachek, 
1971; Shrauger, 1972) have reported that 
low SEs do significantly worse in achievement
settings (i.e., situations that have evaluative
implications for one’s SE) than do
high self-esteem subjects (high SEs). Poor
performance, in turn, would seem to set
the stage for continued self-criticism and low
SE.

What are the precise causes of the low
SEs’ often poor task performances? A recent
series of experiments (Brockner, 1979;
Brockner & Hulton, 1978) suggested that one
important factor was the low SEs’ focus of
attention during task performance. In several
studies the performance of the low SEs
(but not of the high SEs) was quite malleable
across conditions, depending on the manipulation
of their attentional focus. The
low SEs did quite poorly in the presence of
stimuli designed to increase self-focused attention
(i.e., an audience, a mirror, and a
video camera). However, the low SEs’ performance
was also enhanced considerably by
a set of instructions intended to increase
task-focused and decrease self-focused attention.
These latter data have implications for
breaking the vicious cycle of low SE. It
seems entirely possible that if low SEs repeatedly
improve their task performance,
their self-opinions may begin to become more
positive.


Low SE, Focus of Attention, 
Anxiety, and Performance 


Brockner (1979) suggested that the attentional
focus manipulations produced differential
levels of anxiety for low SEs that in
turn could have mediated task performance.
That is, the self-focusing stimuli appeared
to make the low SEs more anxious, whereas
the task-focusing instructions, by reducing
self-consciousness, appeared to make
them less anxious. Brockner also presented
evidence consistent with the notion that anxiety
and performance were inversely related,
which is not surprising given the fairly complex
nature of the task (see Spence, Farber, &
McFann, 1956).

If Brockner’s (1979) analysis is correct
(i.e., if the attentional manipulations influenced
the low SEs’ anxiety, and if their anxiety
mediated performance), then several
questions confront researchers who wish to
discover methods to improve the task performance
of low SEs. Perhaps most important,
does self-focused attention always covary
with anxiety for low SEs? Stated
another way, must low SEs’ anxiety level
always increase in the presence of self-focusing
stimuli? If self-focused attention and
anxiety always covary for low SEs, then
self-focusing stimuli should inevitably produce
decrements for low SEs on tasks in
which anxiety is known to impair performance.
However, if low SEs’ self-focused attention
and perceived anxiety need not covary,
then it may be possible for low SEs
to perform well on such tasks, even in the
presence of self-focusing stimuli. In fact,
Brockner (1979) observed that the low SEs’
performance does not always have to suffer
in the presence of self-focusing stimuli. In
one study low SEs were placed in front of a
mirror and were also instructed to concentrate
on the task. It was found that these
subjects made significantly fewer errors than
low SEs who performed the task in front of
a mirror but were not told to focus on the
task. It was suggested that by concentrating
on the task, the former group’s level of
self-focus, and therefore anxiety, was reduced.
To reduce the anxiety of low SEs, is it 
always necessary to reduce their level of 
self-focused attention? Or is it also possible 
to minimize their anxiety (and thereby enhance
performance) by changing the self-aspect
that is focused on when attention is
self-directed? Both Duval and Wicklund 
(1972) and Wicklund (1975) have suggested 
that the affective nature of self-focused 
attention largely depends upon what hap- 
pened to the individual prior to his being 
made self-aware. Therefore, under certain 
conditions, self-focused attention can be af- 
fectively pleasant; for example, if the in- 
dividual received positive feedback on some 
personality dimension, then self-focused at- 
tention actually enhanced positive affect 
(Ickes, Wicklund, & Ferris, 1973). More- 
over, recent research by Scheier and Carver 
(1977) convincingly demonstrated that self- 
focused attention can intensify any emotional 
experience, regardless of its affective nature. 


Present Study 


In the present study, subjects high and 
low in chronic self-esteem first received either 
success or failure feedback on a social insight
test and then completed a concept
formation task in either the presence or absence
of a self-focusing stimulus. Results
obtained by Ickes et al. (1973) and Scheier 
and Carver (1977) suggest that low SEs who 
are provided with success feedback and then 
made self-aware should experience self-focus 
characterized by more positive and/or fewer 
anxiety-provoking thoughts. Under these cir- 
cumstances low SEs should perform well on 
the concept formation task. Alternatively, if
they are provided with failure feedback, low 
SEs will probably experience negative affect. 
The self-focusing stimulus accompanying the 
concept formation task should then cause 
an increase in anxiety awareness. Accord- 
ingly, low SEs would be expected to do quite 
poorly on the task in this condition. 
On the basis of previous research, it was
expected that the affective state of the high 
SEs would be less influenced by the success- 
failure feedback than the affective state of 
the lows. Given that high SEs are generally 
already low in anxiety (Crandall, 1973), the 
success feedback should not have such pro- 
nounced effects. Moreover, research by 
Shrauger and Rosenberg (1970) strongly 
suggests that the high SEs’ greater ego 
strength would allow them to withstand the 
potentially anxiety-provoking effects of the 
failure feedback. High SEs, in addition to 
being less affectively influenced by success— 
failure feedback, also report lower levels of 
self-focused attention in the presence of self- 
focusing stimuli (Brockner, 1979; Brockner 
& Hulton, 1978). For these two reasons the 
self-focusing stimulus should not have the 
amplifying effects on affect and performance 
that it is expected to have on the low SEs. 
Rather, high SEs should perform well in the 
self-focus condition after either success or 
failure. 

Because the low SEs’ affective states were 
expected to be less polarized in the absence 
of self-focusing, the condition differences in 
performance in the no self-focus conditions, 
although similar to those expected in the 
self-focus conditions, may not attain statis- 
tical significance. 

In sum, the present study consisted of a
three-factor (SE × Prior Feedback × Attentional
Focus) design. An SE × Prior Feedback
interaction was predicted, such that (a)
low SEs would perform worse in the failure
than in the success condition, with no corresponding
difference for high SEs, and (b)
low SEs would perform worse than high SEs
in the failure condition, but just as well as
high SEs in the success condition. Moreover,
this interaction effect was only clearly
expected to be significant in the self-focus
condition.

Subjects also completed a measure of dispositional
self-consciousness (the private
self-consciousness scale of Fenigstein, Scheier,
& Buss, 1975) prior to performing the concept
formation task. Previous research has
shown that this individual difference variable,
like situational manipulations of self-focused
attention, is directly related to awareness
of one’s emotional state (Scheier, 1976;
Scheier & Carver, 1977). Furthermore, it
has been suggested that personality variables
are most likely to be related to behavior
when relevant situational stimuli are absent
or ambiguous (Mischel, 1973). Therefore,
it was expected that in the no self-focus condition,
(a) the performance of subjects high
in private self-consciousness would parallel
the behavior already predicted for subjects
in the self-focus condition, and (b) subjects
low in private self-consciousness would be
less apt to demonstrate the basic SE × Prior
Feedback interaction effect hypothesized
above.

Method 
Participants 


The 100 undergraduate participants were volunteers
recruited on the campus of the SUNY College
at Brockport. The data of 7 subjects who were
suspicious about the experimental procedure and
3 who did not understand the instructions were
not included.¹ The final sample consisted of 60
females and 30 males, who were randomly assigned
to conditions. Subjects’ SE and self-consciousness
were assessed by scales completed immediately prior
to the experiment.

Self-esteem. The self-esteem scale was identical
to the one used by Brockner (1979) and Shrauger
(1972). The scale measures the subjects’ perceived
competence across 16 various situations (academic,
athletic, and social). Subjects were asked to indicate
the percentage of time a particular positive behavior
or outcome applied to them. The average score over
all 16 items was computed for each subject, such
that higher scores represented higher SE. Subjects
were classified as high or low SEs on the basis of a
median split.

Self-consciousness. Subjects also indicated on [...]-point
scales how much each of the private self-consciousness
statements of Fenigstein et al. (1975)
was characteristic of them. The scores for each subject
were summed over the 10 items, with higher
scores representing greater self-consciousness. A median
split was employed to classify subjects as high
or low in private self-consciousness. SE and private
self-consciousness were not related, r(88) = .06.
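
The median-split classification described above can be sketched as follows. This is only an illustration: the subject identifiers and scores are hypothetical, and the rule that scores exactly at the median are labeled "low" is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def median_split(scores):
    """Label each subject 'high' or 'low' relative to the sample median.

    Scores exactly at the median are labeled 'low'; this tie rule is an
    assumption of the sketch, not something the article specifies.
    """
    cut = median(scores.values())
    return {sid: ("high" if score > cut else "low") for sid, score in scores.items()}


# Hypothetical mean self-esteem scores (percentage of time), one per subject.
se_scores = {"s1": 62.5, "s2": 48.0, "s3": 71.0, "s4": 55.5}

print(median_split(se_scores))
```

The same function could be applied to the summed private self-consciousness scores to obtain the second classification.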


Procedure 


All participants were run one at a time. Upon
entering the laboratory they were greeted by a
female experimenter, led to a cubicle, and told that
this “problem-solving” experiment would consist of
several parts. First, subjects were asked to complete
the SE and self-consciousness inventories. Afterward
they responded to the Chapin Social Insight Test
(Gough, 1968).

Social insight test. For each of these 10 problems,
subjects read a short paragraph describing a social
situation. They then had to choose one of four
alternatives that they thought was the best explanation
of the behavior of the individuals described in
the paragraph. Subjects were told that the test measures
the ability to “see into” social situations and,
to heighten their psychological involvement, that
social insight is “correlated with a person’s capacity
to show empathy for others, to predict others’ responses
in social situations, and it relates to a person’s
general understanding of human nature. After
completing the test you will be informed of how
well you did on it.” The test took 10-15 minutes to
complete.

When subjects finished the test, the experimenter
excused herself while she went to “score” their performance.
Returning about 3 minutes later, she administered
the false feedback. In the success condition
subjects were told that they scored in the “90th
percentile of a group of their peers.” In the failure
condition subjects were told that they had scored in
the bottom 20% of their peer group. After leaving the
subjects for 1 minute so that they would absorb the
feedback, the experimenter returned and asked them
to complete a manipulation check questionnaire.

¹ Interestingly, all of the discounted subjects for
whom there were data (N = 7) were in the failure
condition. More will be said about their performance
on the concept formation task in the Results
section.

Subjects were then led across the hall to “complete 
the last part of the experiment.” In this second room, 
a male experimenter blind to the subjects’ feedback 
condition and SE and self-consciousness scores seated 
the subjects at a desk where they found the concept 
formation task (described immediately below) and 
the accompanying instructions. Subjects read the in- 
structions to themselves while the experimenter 
simultaneously read them aloud. Subjects were also 
given a practice trial to familiarize them further 
with the task. After the practice trial was completed,
subjects estimated on a 41-point scale how
well they expected to perform at the task (endpoints
were “very poorly” [1] and “very well” [41]). These
data were collected for exploratory purposes but will 
become relevant in the discussion of the results. 

Concept formation task. The task was identical to
the one employed by Brockner (1979; Brockner &
Hulton, 1978).
The subject was told that the concept consisted of
one, two, or three characteristics. The experimenter
selected a square that was an example of the correct
concept. The subjects’ task was to determine the concept
by lifting up other squares, one at a time. If the
square was an example of the correct concept, the
word YES was written on a piece of oak tag below
the board. If the square was not an example of the
concept the word NO was written below. When subjects
believed that they had ascertained the correct
concept, they wrote it on a slip of paper provided to
them. At no time were subjects ever informed of the
accuracy of their responses. Subjects were told
that although “we are interested in the speed and
accuracy with which you determine the concepts, it
is more important to be accurate.”

Attentional focus. The manipulation of attentional 
focus was introduced just before subjects began the 
concept formation task. In the self-focus condition 
subjects were told: 


One more point. Since we are interested in how 
people go about forming concepts, the experimenter 
will be watching you as you are working on the 
task. Also, a mirror will be placed in front of you, 
making it easier for the experimenter to observe 
your behavior more closely. 


With that, the experimenter placed a large mirror 
(24 × 36 inches; 61 × 91 cm), which had hitherto
been unexposed, directly in front of the subject. A 
large mirror was employed in order to strengthen its 
self-focusing effect. That is, subjects could see not 
only themselves but also the experimenter observing 
them as they worked on the task. In the no self-focus 
condition the mirror manipulation was never intro- 
duced. 

In all conditions subjects then began to work on 
the task, which was comprised of seven trials. The 
experimenter, seated adjacent to the subjects, re- 
corded the amount of time taken and the number 
of squares looked at for each trial. When the seventh
trial was completed, subjects answered a postexperi- 
mental questionnaire. In addition to indicating the 
nature of their attentional focus and the anxiety they 
felt while completing the task, subjects wrote their 
hypotheses and suspicions about the study. Finally, 
because of the potentially sensitive nature of the 
subject matter under study, all participants were 
carefully debriefed. 


Results 


Performance²


Errors of both commission and omission
were summed over the seven trials to determine
task performance. A preliminary t test
revealed no effect for sex of subject, t < 1.
Therefore, the data were analyzed with a
three-factor unweighted-means analysis of
variance. The only effect to attain significance
was the SE × Prior Feedback interaction,
F(1, 82) = 4.22, p < .05. Simple effects elucidated
the nature of this effect. Specifically,
low SEs performed worse than high SEs in
the failure condition, F(1, 82) = 7.39, p <
.01, but not in the success condition (F < 1).³
Moreover, low SEs made significantly more
errors in the failure than in the success condition,
F(1, 82) = 6.81, p < .02, whereas high
SEs were unaffected by the feedback manipulation
(F < 1).

² Only the error data will be [...] qualitative
performance measures included the time
taken to complete the trials and the number of
squares looked at before subjects made their responses.
Consistent with previous research (Brockner,
1979; Shrauger, 1972), there were no condition
differences on the time measure. The only effect to
emerge on the other measure was the Feedback ×
Attentional Focus interaction (p < .05). This effect
is not [...] this study, [...] found condition
differences on this measure.

While the triple interaction did not achieve
significance, the means in Table 1 suggest that
the significant effects reported above were
largely attributable to the results obtained in
the self-focus condition. Indeed, planned comparisons
(i.e., simple interaction effect analyses)
demonstrated that the SE × Feedback
interaction was significant in the self-focus,
F(1, 82) = 4.55, p < .05, but not in the no
self-focus condition (F < 1).⁴ Additional
simple effect analyses revealed that the only
instance in which low SEs made significantly
more errors than high SEs was in the failure-self-focus
condition, F(1, 82) = 10.25, p <
.005, whereas the only time significantly more
errors were made in the failure than in the
success condition was in the low SE-self-focus
condition, F(1, 82) = 4.02, p < .05.
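
The denominator degrees of freedom reported for these tests can be checked with a short sketch. It assumes a fully between-subjects design with two levels per factor and the final sample of 90 subjects, so that the error df equals the number of subjects minus the number of cells; the function name is a hypothetical label for this illustration.

```python
def error_df(n_subjects, levels_per_factor):
    """Within-cell error df for a between-subjects factorial design: N minus number of cells."""
    cells = 1
    for levels in levels_per_factor:
        cells *= levels
    return n_subjects - cells


N = 90  # final sample: 60 females and 30 males

print(error_df(N, [2, 2, 2]))     # three-factor analysis
print(error_df(N, [2, 2, 2, 2]))  # four-factor analysis
print(error_df(N, [2, 2]))        # two-factor analysis
print(N - 2)                      # df for a correlation across all subjects
```

Under these assumptions the three-factor analysis has 82 error df, matching the F(1, 82) values above, and the same rule gives 74 and 86 for the four-factor and two-factor analyses and 88 for the correlations, which agree with the dfs reported elsewhere in the article.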


Table 1
Task Performance by Condition

                                 Prior feedback
                         Success               Failure
Attentional focus/
self-esteem           n     M     SD        n     M      SD

Self-focus
  High               11   3.64   2.34      14   [...]   [...]
  Low                10   3.90   3.54      11   6.73    [...]
No self-focus
  High               14   3.36   2.13      10   4.20    4.[...]
  Low                11   2.73   3.35       9   5.11    [...]
Table 2
Task Performance by Condition, Including
Private Self-Consciousness

                                 Prior feedback
                         Success               Failure
Private self-
consciousness/
self-esteem           n     M     SD        n     M     SD

Self-focus
  High/high           5   4.40   2.07      10   1.70   1.57
  High/low            5   6.40   3.13       5   5.20   4.32
  Low/high            6   3.00   2.53       4   3.50   3.70
  Low/low             5   1.40   1.67       6   8.00   5.33
No self-focus
  High/high           6   5.00   1.55       4   3.25   2.36
  High/low            6   2.17   3.06       4   7.50   3.42
  Low/high            8   2.13   1.64       6   4.83   5.23
  Low/low             5   3.40   3.19       5   3.20   1.92

Note. Performance was measured by errors. Scores
ranged from 0 to 18. The higher the score, the poorer
the performance.
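
As a consistency check, the cell means in Table 1 can be recovered from Table 2 by pooling the high and low private self-consciousness subgroups with an n-weighted mean. The sketch below assumes that the Table 2 row labels read private self-consciousness/self-esteem and that entries such as "440" in the scan stand for 4.40; both are assumptions about the scanned table, and the variable names are hypothetical.

```python
# (n, M) pairs from Table 2, keyed by (attentional focus, self-esteem, feedback).
# Each list holds the high and then the low private self-consciousness subgroup.
TABLE2 = {
    ("self-focus", "high", "success"): [(5, 4.40), (6, 3.00)],
    ("self-focus", "high", "failure"): [(10, 1.70), (4, 3.50)],
    ("self-focus", "low", "success"): [(5, 6.40), (5, 1.40)],
    ("self-focus", "low", "failure"): [(5, 5.20), (6, 8.00)],
    ("no self-focus", "high", "success"): [(6, 5.00), (8, 2.13)],
    ("no self-focus", "high", "failure"): [(4, 3.25), (6, 4.83)],
    ("no self-focus", "low", "success"): [(6, 2.17), (5, 3.40)],
    ("no self-focus", "low", "failure"): [(4, 7.50), (5, 3.20)],
}


def pool(subgroups):
    """Return the combined n and the n-weighted mean (rounded to 2 places)."""
    n = sum(k for k, _ in subgroups)
    mean = sum(k * m for k, m in subgroups) / n
    return n, round(mean, 2)


pooled = {cell: pool(groups) for cell, groups in TABLE2.items()}
for cell, (n, mean) in pooled.items():
    print(cell, n, mean)
```

Under these assumptions the pooled values reproduce every n and M that is legible in Table 1 (for example, n = 11 and M = 3.64 for high SEs in the self-focus success cell), the pooled ns sum to the 90 subjects in the final sample, and the high SE, self-focus, failure cell comes out at n = 14 and M = 2.21.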


Self-consciousness analysis. To test for the
main and interaction effects involving the dispositional
self-consciousness variable, a four-factor
(Self-Consciousness × SE × Feedback
× Attentional Focus) unweighted-means
analysis of variance was performed. The only
significant effect to emerge, in addition to the
already reported SE × Feedback interaction,
was the four-way interaction, F(1, 74) =
7.25, p < .01. By inspecting Table 2, one
finds that there were nonsignificant performance
differences between SE groups in the
success conditions. In the failure conditions
high SEs performed far better than low SEs,
except in the absence of a source of self-focused
attention (i.e., the no self-focus-low
self-consciousness condition). Although the
means in the self-focus condition suggest that
the SE × Prior Feedback interaction was
stronger for subjects low, rather than high, in
dispositional self-consciousness, the simple
triple interaction effect (comparing the SE ×
Feedback interaction for high vs. low self-conscious
subjects) revealed that this difference
was not significant, F(1, 74) = 1.44. In
the no self-focus condition, high self-conscious
subjects exhibited an SE × Feedback
interaction similar to that shown by all subjects
in the self-focus condition, whereas low
self-conscious subjects did not. Indeed, the simple
triple interaction effect did attain significance in the
no self-focus condition, F(1, 74) = 6.79, p <
.02. In short, in the no self-focus condition,
subjects high in self-consciousness responded
(a) like subjects in the self-focus condition
and (b) unlike subjects low in self-consciousness.
Stating the latter effect more generally,
the personality variable only had an effect in
the absence of the mirror, that is, the relevant
situational stimulus (Mischel, 1973).⁵

³ The data of seven subjects in the failure condition
were discarded because of their suspicions
about the validity of the social insight test. Perhaps
because the impact of the failure feedback was
lessened for these subjects, they tended to make
very few errors on the task. Nevertheless, the four
subjects who would have been low SEs still made
more errors (M = 2.50) than the three would-be
high SEs (M = 1.33).

⁴ Previous researchers (Brockner, 1979; Brockner
& Hulton, 1978; Shrauger, 1972) have found SE
differences on this task in the presence, but not in
the absence, of self-focusing stimuli. Furthermore,
the a priori hypothesis of the present study, on
theoretical grounds, was that the SE × Feedback
interaction would be significant in the self-focus
but not necessarily in the no self-focus condition.
Winer (1971) clearly states that it is legitimate to
compare specific means, if the comparisons are
planned prior to the inspection of the data. The
appropriate planned comparisons, therefore, were
the simple interaction effect analyses in the self-focus
and no self-focus conditions.


Manipulation Checks 


Prior feedback. After receiving their feed- 
back but before starting to work on the con- 
cept formation task, subjects completed two 
41-point scales measuring their affective states
(Question 1: At this point, how socially insightful
do you think you are? Anchor points:
“not at all” [1] and “very” [41]; Question
2: How would you describe your current mood
state? Anchor points: “very bad mood” [1]
and “very good mood” [41]). Since the two
items correlated significantly, r(88) = .44,
p < .01, an affect index was computed for
each subject by summing the scores. Because
these measures were collected before the attentional
focus variable had been manipulated,
a two-factor unweighted-means analysis of
variance was performed on the index. The
analysis revealed a significant interaction 
effect, F(1, 86) = 7.10, p < .01. Low SEs ex- 
pressed more positive affect in the success 
than in the failure condition, F(1, 86) = 
24.27, p < .001, whereas high SEs did not,
F(1, 86) = 1.34. Moreover, high SEs were 
only more positive than low SEs in the failure 
condition, F(1, 86) = 12.10, p < .005. 

Attentional focus. Two 41-point self-report 
items were employed to measure the amount 
of positive and negative thoughts subjects 
were having about themselves while working 
on the task. Question 1 measured positive 
(e.g., feeling confident) and Question 2 nega- 
tive (e.g., thinking I was doing poorly) self- 
evaluative thoughts. Three-factor unweighted- 
means analyses revealed a SE X Feedback 
interaction on the first question, F(1, 82) = 
3.98, p < .05. Specifically, it was only in the 
failure condition that high SEs were having 
more positive thoughts than low SEs. On the
second question the means were in the op- 
posite direction, but all effects were nonsig- 
nificant. Thus, it was only in the failure con- 
dition that low SEs tended to have more nega- 
tive thoughts than high SEs. 

Anxiety. Subjects were also asked, “While
you were completing the task, (a) how anxious
did you feel, and (b) how much did the
experimenter’s presence bother you?” Since
the two items were well correlated, r(87)⁶ =
.42, p < .01, an anxiety index was formed by
summing each subject’s scores. The analysis
of variance revealed a significant SE main
effect, F(1, 81) = 4.85, p < .05, with low
SEs more anxious than high SEs, and a marginal
SE x Feedback interaction, F(1, 81) =
2.76, p < .10. Interestingly, the mean condition
differences on the index were remarkably
similar to the performance data reported in
Table 1. Although the triple interaction was
not significant, simple interaction effect analyses
showed that the SE x Prior Feedback interaction
effect was significant in the self-
focus condition, F(1, 81) = 4.90, p < .05,
but not in the no self-focus condition (F <
1). Low SEs were only significantly more anxious
than high SEs in the failure-self-focus
condition, F(1, 81) = 8.64, p < .01, and the
only instance in which failure subjects were
more anxious than success subjects was the
low SE-self-focus condition, F(1, 81) =
4.90, p < .05.


⁵ Subjects were also measured on the other two
subscales of the self-consciousness scale of Fenigstein
et al. (1975), that is, public self-consciousness
and social anxiety. The four-factor analysis involving
the public self-consciousness variable yielded a
marginally significant four-way interaction (p <
.10), the nature of which was similar to that observed
on the private self-consciousness analysis.
The social anxiety analysis was not performed because
an unbalanced distribution of that variable
caused one of the conditions to contain only one
subject.

⁶ One subject inadvertently failed to complete
the first anxiety measure. Thus, the degrees of freedom
are 87 rather than 88.


Correlational Data 


The pattern of the SE x Feedback inter- 
action was similar on measures of self-evalua- 
tive thoughts, anxiety, and task performance. 
Further suggestive evidence that task per- 
formance was mediated by subjects’ self- 
evaluative thoughts and anxiety levels stems 
from a series of correlations. Specifically, subjects’
positive thoughts were inversely related,
r(88) = -.42, p < .01, and their negative
thoughts were directly related, r(88) = .36,
p < .01, to the number of errors they made.
There was also a positive correlation between
subjects’ anxiety index scores and the number
of errors they committed, r(87) = .33, p <
.01.


Discussion 


In sum, the major hypotheses of the study
were supported. It was predicted that the
feedback would cause low (but not high) SEs
to attend to different aspects of themselves,
such that the self-evaluative statements
by low SEs would be more anxiety-provoking
following failure than following success. Self-
focused attention was expected to intensify
anxiety awareness, which in turn was thought
to mediate task performance. Indeed, the task
performance results showed that low SEs
made more errors than high SEs following
failure, as long as there was some source of
self-focused attention (i.e., the situationally
produced self-awareness or subjects’ own tendencies
towards self-consciousness). Following
success, low SEs performed just as well as 
high SEs in all conditions. 

The fact that low SEs performed better 
following success has implications for break- 
ing the vicious cycle of low self-esteem. That 
is, previous investigators (Brockner, 1979; 
Brockner & Hulton, 1978; Shrauger, 1972) 
have demonstrated that low SEs’ performance 
deteriorates in the presence of self-focusing 
stimuli. The present results suggest, however, 
that by reducing the low SEs’ anxiety before 
making them self-aware, one can block the 
negative effects of the self-focusing stimulus 
on their performance. Of course, in the ab- 
sence of a no feedback —self-focus condition, 
it is not known whether the success feedback 
reduced the low SEs’ anxiety and enhanced 
their performance, whether the failure further 
heightened anxiety and impaired performance, 
or whether both occurred. However, in what
would amount to a relevant no feedback-self-
focus condition, Brockner (1979) found that
high SEs made 2.50 errors, whereas the mean
score for low SEs was 7.22. These data are
very close to the means observed in the fail- 
ure —self-focus condition in the present study. 
Thus, for low SEs in the self-focus conditions, 
it would appear that the success feedback 
reduced anxiety and enhanced performance, 
whereas the failure had only slight effects. 

Alternative interpretation. It could be 
argued that the mirror produced its effects by 
serving as a source of general arousal. More
specifically, consider the following logical sequence.
Feedback may have manipulated the
direction of attentional focus experienced by
low SEs, so that they were more task-focused
and less self-focused following success than
failure. Accordingly, their performance should
be better following success. The mirror may
then have acted as an arousal source, enhancing
performance differences by further restricting
the range of cues to which subjects attended
(Easterbrook, 1959). That is, restriction of
cues among task-focused subjects would have
increased task focus; cue restriction among
self-focused subjects would have further decreased
task focus. How might the mirror’s
presence have enhanced arousal? Given the
instructions that accompanied the introduction
of the mirror (“so that the experimenter
can watch you more closely”) and the fact
that subjects could see the experimenter observing
them in the mirror, the perceived importance
and/or evaluative nature of the task
may have been greater in the self-focus than 
in the no self-focus condition. 

Although this alternative explanation war- 
rants consideration, there is evidence against 
it. For example, if the mirror was a source of 
general arousal because it caused subjects to 
attribute more significance to the task, then 
they should have assigned greater importance 
to the task in self-focus than no self-focus 
conditions. Although no measures of perceived 
task importance were collected in this study, 
Brockner (1979) did measure “the importance 
of doing well at this task” as a function of 
SE and the presence of a mirror and video 
camera (using the same justification to intro- 
duce the self-focusing stimuli that was em- 
ployed in the present study). There was no 
tendency for subjects (including low SEs) to 
attribute more importance to the task in the 
presence of the self-focusing stimuli. 

In addition, it has been demonstrated that 
private self-consciousness does not correlate 
with individual difference measures of arousal/emotionality
(e.g., general emotionality,
test anxiety, social anxiety; Scheier & Carver,
1977). In examining the performance data of
high private self-conscious subjects in the no 
self-focus condition (see Table 2), one finds 
that the SE x Feedback interaction was 
parallel to the one observed for all subjects 
who were exposed to the mirror. In attempting 
to account for such parallel effects of manip- 
ulated and dispositional self-attention, Scheier 
and Carver have suggested that “an interpre- 
tation based on arousal may provide a viable 
explanation for the effects of a mirror, but it 
cannot account for the effects of private self- 
consciousness. An interpretation based on 
self-focused attention can explain both the 
effects of a mirror and the effects of private 
self-consciousness. Thus, although different 
interpretations can be used to explain the 
effects of a mirror and private self-consciousness
separately, at this time the self-awareness
interpretation appears more reasonable, comprehensive,
and parsimonious” (Scheier &
Carver, 1977, p. 634). 

A similar line of reasoning may be used to 
suggest that the mirror did not cause in- 
creased evaluation apprehension. That is, if 
the mirror enhanced evaluation apprehension, 
then subjects in the no self-focus condition 
who are more sensitive to the evaluation pro- 
cess (those high in social anxiety; Turner, 
1977) should have performed as did those in 
the self-focus condition. As stated previously, 
the social anxiety analysis was not conducted 
because one of the cells (in the no self-focus — 
low social anxiety condition) contained only 
one subject. However, there were enough sub- 
jects in the no self-focus —high social anxiety 
condition to test this hypothesis. Interestingly, 
the results did not at all resemble the SE x 
Feedback interaction obtained in the self- 
focus condition. Rather, there was a slight 
tendency for both high and low SEs to make 
more errors following failure. 


Self-Attention and Behavioral Self-Regulation 


The present data are generally consistent 
with Carver’s (1979) analysis of the conse- 
quences of self-directed attention. In essence, 
Carver suggests that when subjects are self- 
aware they will be more motivated to conform 
to the salient behavioral standard in a situa- 
tion (e.g., to solve the concept formation task 
successfully). If arousal cues are not intro- 
duced, the subject should complete the task 
successfully. If fear or anxiety arousal cues 
become salient, however, the subject’s task 
performance will be interrupted. At this point 
the individual will assess his/her likelihood of 
being able to complete the task. If the sub- 
ject’s expectations are positive, he/she will 
vigorously reattempt to match behavior to the 
standard. Subjects who feel unable to com- 
plete the task will respond with passivity and 
withdrawal. 

To test this model, Carver, Blaney, and 
Scheier (1979a, 1979b) have presented sub- 
jects with fear or anxiety-provoking situations 
and have measured task persistence/performance.
The results of these studies have generally
supported the model: positive expectancy
subjects exhibit greater persistence/
performance than negative expectancy participants,
but only when self-focused attention is
high.

One of the experiments of Carver et al. 
(1979b) is conceptually quite similar to part of 
the present design. In that study, all subjects 
were provided with failure feedback. Persist- 
ence at a subsequent task was measured as a 
joint function of self-awareness and expect- 
ancy for success at that second task. In the 
present study, some subjects were given fail- 
ure feedback and then had to perform the 
concept formation task with no manipulation 
of expectancy. High SEs performed better 
than low SEs in the present research when 
attention was self-focused, but not when self- 
focus was lower. These results parallel those 
of Carver et al. if it can be assumed that (a) 
task persistence is related to task perform- 
ance and (b) SE is positively related to ex- 
pectancy for success at the concept formation 
task. The first assumption seems straightfor- 
ward, and the second one was empirically sup- 
ported in the present study. High SEs did 
expect to perform better than low SEs, F(1,
86) = 6.77, p < .025, particularly in the failure
condition, F(1, 86) = 5.80, p < .025.

These performance data not only confirm 
those obtained by Carver et al. (1979a, 
1979b), but they provide converging support 
for Carver’s (1979) theoretical model. That
is, to demonstrate that self-focusing is a 
necessary component of the model, Carver et 
al. employed situational manipulations of self- 
focused attention. In the present study, the 
fact that only subjects high in dispositional 
self-consciousness in the no self-focus condi- 
tion performed as did those in the self-focus 


condition further attests to the importance of 
self-focused attention.

[...] expectancies, no SE differences in performance
would be predicted, as high and low SEs did 
not reliably differ in their expectations for 
performance in the success condition, F(1, 
86) = 1.62. 


Conclusions 


The task performance findings in this study 
are a logical extension of those observed in 
previous research (Brockner, 1979). In the 
earlier study, to enhance their task perform- 
ance in the presence of a self-focusing stim- 
ulus, low SEs were told to concentrate dili- 
gently on the task. Thus, the previous study 
adopted a quantitative approach to combating 
self-focused attention for low SEs, in that the 
task-focus instructions were designed to re- 
duce the degree of the low SEs’ self-focusing. 
The present study entailed a more qualitative 
strategy to offset the negative effects on performance
of the low SEs’ self-focused attention.
Rather than reduce the degree of their
self-focusing, an attempt was made to change 
the nature of the low SEs’ self-focused atten- 
tion by providing them with success feedback 
from a previous task. This strategy also 
proved to be effective, as the low SEs’ task 
performance was no longer impaired in the 
presence of the self-focusing stimulus. These
performance data are consistent with and
serve to extend recent theorizing about the
consequences of self-directed attention. 
Experimental social psychology has dealt primarily with situations that are not
true social interactions; in a typical study, a subject responds to a fixed, arti- 
ficial social stimulus such as a photograph, written description, or performance 
by a confederate. Although these artificial social stimuli provide experimental 
control over independent variables and can be analyzed using the types of 
statistical models originally developed for nonsocial experimental research, they 
provide little or no information about the interactive aspects of social behav- 
ior—the reciprocity or mutual contingency of the behavior of interaction part- 
ners. This paper describes a nonexperimental design specifically tailored to 
social interaction data that provides more information about individual differ- 
ences and social influence in social interactions: a round robin design in which 
each person interacts with every other person. After a brief review of available 
models, a new and more general model for the analysis of social interaction 
data is presented, with an empirical demonstration using vocal activity data. 


Experimental social psychology has often 
been limited to the study of artificial, one-sided 
social situations in which the subject responds 
to some fixed social stimulus created by the 
experimenter; this has been the case even in 
research on person perception and interpersonal 
attraction, where it is especially apparent 
that the behaviors and attitudes of persons 
are mutually contingent (i.e., A’s liking for
B affects B’s liking for A, and vice versa). 
In research on subject reactions to fixed 
stimuli (rather than the reactions to other 
persons in the context of naturally occurring 
social interactions), information about the 
reciprocity of social behaviors is lost. More- 
over, the use of standardized, artificial stimuli 
often precludes having stimuli that are 
representative of typical interactions. Thus, 
the generalizability of such studies may be
rather limited. 

A round robin design is one in which all 
possible pairs of subjects from some set of 
subjects interact (like the pairings formed in a 
tennis tournament). This term was introduced 
by Gleason and Halperin (1975). For each per- 
son paired with every other person, an observa- 
tion is made of some social behavior (speech 
pattern, rating, or degree of attraction). Thus 
the “treatments” to which each subject re- 
acts are the behaviors of other subjects. 
This design provides two kinds of information 
about social behavior—first, information about 
individual differences among subjects (in
speech patterns, ratings, or other behaviors);
second, information about the mutual influence
that interaction partners have on each other’s
behaviors (for instance, the tendency to
reciprocate positive feelings or to match the
durations of certain kinds of pauses in speech).

Before introducing a new round robin
analysis of variance, two simpler designs for
social interaction data will be briefly reviewed.
Each of these designs provides partial information
about social behavior, and the round
robin design can be understood as a model
that integrates and elaborates on these two
simpler designs.


ROUND ROBIN ANOVA


Intraclass Correlation 


A common nonexperimental design for 
social interaction research is one in which a 
number of different dyads are formed and 
each subject’s behavior toward one randomly 
assigned partner is observed. An example is 
a study reported by Welkowitz, Cariffe, and 
Feldstein (1976) on the congruence of vocal 
activity (i.e., the tendency for conversation
partners to match the mean durations of their
switching pauses, the pauses that occur after
one person has stopped speaking and before
the other person begins to speak). For each
subject there is one observation, the mean
switching pause duration in seconds. The
observations are arranged by dyads, as 
illustrated by the hypothetical data displayed 
in Table 1. 

To determine the strength of the tendency 
for partners to match pause durations, it is 
necessary to look at the correlation between 
the two columns of data, but an ordinary 
correlation coefficient is inappropriate here 
because the data do not consist of ordered 
pairs, that is, the assignment of any particular 
observation to the first or second column is 
arbitrary. This type of layout also occurs in 
examining twin data, as in research on IQ;
here also there is no basis for labeling one
twin X and the other twin X’. For this type
of data layout an intraclass correlation is
required (Snedecor & Cochran, 1967, pp. 294-296).
The r intraclass is calculated slightly differently
from the ordinary product-moment r; each


Table 1
Hypothetical Data Layout for Intraclass
Correlation of Switching Pause Durations

Dyad    X              X’
1       [illegible]    .9
2       1.2            1.1
3       1.5            .8

Note. Each entry is a hypothetical mean switching
pause duration (in seconds). The sequence can
of course be extended for Dyads 4, 5, 6, and so
forth. X = switching pause duration for one of the
speakers; X’ = switching pause duration for the
other speaker.
Table 2
Hypothetical Data Layout for an Analysis of
Variance Design: Mean Proportion of Time
Spent Speaking by Doctors

                     Patients
Doctor    Quiet, depressed    Talkative, anxious
1         23.4                15.7
2         36.6                24.1
3         45.9                26.3

Note. Design used by Goldman-Eisler (1952).


pair of observations is counted twice when
computing the covariance term in the numerator:
once as X, X’ and again as X’, X.
Instead of calculating the variances of the X 
and X’ groups separately, the denominator 
consists of the variance of all the observations. 
The r intraclass is just the ratio of these two 
terms. The intraclass correlation provides 
information about the matching or reciprocity 
of behaviors between partners, but it ignores 
individual differences among persons. 
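
The double-entry calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of Snedecor and Cochran as printed: the function name `intraclass_r` is hypothetical, the variance uses the population (divide by 2N) form, and the three dyads are made-up values chosen only to show the call.

```python
def intraclass_r(dyads):
    """Double-entry intraclass correlation for a list of (x, x_prime) pairs.

    Assumption: population-style (divide by count) covariance and variance.
    """
    # Pool all 2N observations; column labels are arbitrary, so one grand
    # mean and one variance are computed over every observation.
    pooled = [value for pair in dyads for value in pair]
    grand_mean = sum(pooled) / len(pooled)
    variance = sum((v - grand_mean) ** 2 for v in pooled) / len(pooled)

    # Enter each pair twice: once as (X, X') and once as (X', X).
    double_entry = list(dyads) + [(b, a) for a, b in dyads]
    covariance = sum(
        (a - grand_mean) * (b - grand_mean) for a, b in double_entry
    ) / len(double_entry)

    # The intraclass r is the ratio of the two terms.
    return covariance / variance


# Made-up switching pause durations (seconds) for three dyads.
example_dyads = [(0.7, 0.9), (1.2, 1.1), (1.5, 0.8)]
r_intraclass = intraclass_r(example_dyads)
```

Because every pair is entered in both orders, swapping the two members of any dyad leaves the result unchanged, which is the property an ordinary product-moment r lacks for unordered pairs.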


Ordinary Two-Way Analysis of Variance 


A second approach to the study of social 
interaction involves the partitioning of var- 
iances to look for individual subject differences. 
Individual persons (or groups of persons) are 
treated as levels of subject factors in an 
ordinary analysis of variance. When there is 
some asymmetry of roles (such as doctor/ 
patient or interviewer/interviewee), the data 
are easily translated into a two-way analysis 
of variance. An example of this design is a 
study by Goldman-Eisler (1952); the hypo- 
thetical data in Table 2 represent the mean 
proportion of time each doctor spent talking 
when paired with individual patients of two 
different types. This design can be treated as 
a two-way analysis of variance, and comparisons
can be made of the following: individual
differences among doctors in talkativeness;
differences in the activity levels elicited from 
doctors by the two different types of patients; 
and interactions between doctor and patient 
type. 
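
As a rough illustration of this partitioning, the sketch below splits the Table 2 values into doctor (row) and patient-type (column) sums of squares. The function name `two_way_partition` is hypothetical. Note the assumption that follows from having a single entry per cell: the doctor by patient-type interaction cannot be separated from error here, so both are reported together as a residual term; separating them would require repeated observations per cell.

```python
# Table 2 values: rows are doctors, columns are patient types
# (quiet/depressed, talkative/anxious).
table2 = [
    [23.4, 15.7],
    [36.6, 24.1],
    [45.9, 26.3],
]


def two_way_partition(cells):
    """Partition a rows x columns table (one entry per cell) into sums of squares."""
    n_rows, n_cols = len(cells), len(cells[0])
    grand = sum(sum(row) for row in cells) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in cells]
    col_means = [sum(row[j] for row in cells) / n_rows for j in range(n_cols)]

    ss_rows = n_cols * sum((m - grand) ** 2 for m in row_means)  # doctors
    ss_cols = n_rows * sum((m - grand) ** 2 for m in col_means)  # patient types
    ss_total = sum((x - grand) ** 2 for row in cells for x in row)
    ss_resid = ss_total - ss_rows - ss_cols  # interaction plus error, confounded

    return {
        "row_means": row_means,
        "col_means": col_means,
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_resid": ss_resid,
        "ss_total": ss_total,
    }


partition = two_way_partition(table2)
```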

This design does deal with individual 
differences (both in the types of social behavior 
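Table 3
Round Robin Data Layout

                               Subject
Subject   1            2            3            4
1         -            $X_{121}$    $X_{131}$    $X_{141}$
          -            $X_{122}$    $X_{132}$    $X_{142}$
          -            $X_{123}$    $X_{133}$    $X_{143}$
2         $X_{211}$    -            $X_{231}$    $X_{241}$
          $X_{212}$    -            $X_{232}$    $X_{242}$
          $X_{213}$    -            $X_{233}$    $X_{243}$
3         $X_{311}$    $X_{321}$    -            $X_{341}$
          $X_{312}$    $X_{322}$    -            $X_{342}$
          $X_{313}$    $X_{323}$    -            $X_{343}$
4         $X_{411}$    $X_{421}$    $X_{431}$    -
          $X_{412}$    $X_{422}$    $X_{432}$    -
          $X_{413}$    $X_{423}$    $X_{433}$    -


Note. $X_{ijk}$ = a behavior of Person i toward Person
j on Day k.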


that individuals exhibit and in the types of 
social behavior that groups of patients elicit) ; 
but it is less well suited to the study of social 
interactions among peers (i.e., social interac- 
tions that do not have a role asymmetry 
that makes it easy to assign some individuals 
to the row factor and others to the column 
factor); furthermore, it provides no information
about mutual influence of social behaviors.


Round Robin Analysis of Variance 
Model Specification 


Round robin designs make it possible to 
obtain both kinds of information about social 
interactions: individual differences and mutual
contingency. Several round robin-type
designs have been developed (Bechtel, 1971;
Gleason & Halperin, 1975; Lev & Kinder,
1957); all of these designs are applicable to
data layouts in which each subject is paired
with every other subject, but they differ with
respect to the assumptions that they have made
about the kinds of nonindependence among
observations. Since these models have developed
as a special application of the paired-
comparisons design in psychophysical research,
some of the assumptions of these models
are rather restrictive or in other respects not
ideally suited for some of the special difficulties
that arise in social psychological research.
The reason for developing a new model is to
tailor the round robin design more specifically 
to the problems of social interaction and to 
make available a model with less restrictive 
assumptions about the type of dependence 
among observations. Depending on the research 
problem, however, one of these previously 
suggested models may be more appropriate; 
accordingly, the specifications of these other 
models will be described briefly later in this 
paper to assist readers in choosing the best 
model. 

Round robin designs arise frequently in 
small-groups research, where records are kept 
of the frequency or duration of various types 
of acts that each group member addresses to 
each other group member; in studies of person 
perception, where each person rates every 
other person in a group; and such data may 
be generated by creating all possible pairs from 
a small subject pool and studying their 
behavior in isolation or by collecting data in 
group settings.

Any social psychological dependent variable 
can be used: summary information on speech, 
gaze, or body movements; frequency counts 
of the number of acts (aggressive, altruistic, 
or other); ratings, perceptions, or self-report 
measures of utility—the amount of some 
resource (money, time, materials, or whatever) 
that each person gives to each other person. 

A round robin data layout is illustrated in
Table 3, where $X_{ijk}$ is an observation of some
social behavior of Person i toward Person j
on Day or Time k.
The diagonal cells are empty, since ordinarily
a person cannot be paired with himself or
herself, and the matrix is asymmetric, since
the behavior of Person i toward j is ordinarily
different from that of j to i, even though they
may be highly correlated. With some modifications,
this will be treated as an $n \times n \times r$
analysis of variance, where n is the number of
subjects and r is the number of observations
made on each dyad. This is a random effects
model, that is, the set of n subjects is considered
to be a very small random sample from
a very large population of subjects; subjects
were not selected to represent levels of some
dimension (such as dominance). By treating
this as a random effects model, we achieve
greater generality. Formally stated, the model
is as follows:


$$X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$$


1 Note two limitations on appropriate data for
round robin analysis of variance: there should not be
many cells with zero entries (apart from the diagonal
cells), and neither the row nor column sums should be
constant, as would occur if each subject [...]
limited resources among other subjects (e.g.,
rank-order preferences or limited amounts of [...]).


The $\alpha_i$ term represents the contribution of
Person i as an actor, a source of behaviors;
for instance, in a study of the proportion of
time spent speaking in conversations, $\alpha_i$
represents Person i’s talkativeness. The $\beta_j$
term represents Person j’s effect as a partner
(i.e., the amount of talk that Person j tends to
elicit from people when he/she is their partner).
Note that these effects can be logically
distinguished, although in the case of vocal
activity there is a negative correlation between
the proportion of time a person tends to talk
and the proportion of time he/she allows other
persons to talk when he/she is their partner.
For some other social variables, there might
conceivably be a positive or zero correlation
between the subject-as-actor and subject-as-
partner effects. The $\gamma_{ij}$ term is an interaction
effect, representing the special adjustment
that Person i makes in level of talkativeness
when paired with Person j. As usual, $\epsilon_{ijk}$
represents the error term, which picks up
variability in behavior at different times.
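
To make the roles of the four terms concrete, the following sketch generates a small round robin layout from the additive structure of the model (grand mean, actor effect, partner effect, dyad-specific adjustment, and occasion-specific error). It is a deliberately simplified illustration: the function name `simulate_round_robin` and the standard deviations are arbitrary assumptions, and all effects are drawn independently, so the actor-partner and cross-diagonal covariances that the full model allows are fixed at zero here.

```python
import random


def simulate_round_robin(n, r, mu=0.0, seed=0):
    """Return {(i, j): [r observations]} for all ordered pairs with i != j.

    Simplifying assumption: every effect is drawn independently, so none of
    the covariances permitted by the full model are represented.
    """
    rng = random.Random(seed)
    alpha = [rng.gauss(0.0, 1.0) for _ in range(n)]  # actor effects
    beta = [rng.gauss(0.0, 1.0) for _ in range(n)]   # partner effects

    layout = {}
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # diagonal cells are empty: no one is paired with self
            gamma_ij = rng.gauss(0.0, 0.5)  # i's special adjustment to j
            layout[(i, j)] = [
                mu + alpha[i] + beta[j] + gamma_ij + rng.gauss(0.0, 0.25)
                for _ in range(r)  # error varies across the r occasions
            ]
    return layout


layout = simulate_round_robin(n=4, r=3)
```

With n = 4 and r = 3 this produces the same shape as Table 3: twelve off-diagonal cells, each holding three observations.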

Looking at the layout in Table 3 may clarify 
the distinction made between actor and partner 
effect. The ith row mean represents the average 
behavior of Person i toward n - 1 different
partners; thus the row variance indicates
whether there are clear-cut and consistent
individual differences among persons when
their behavior is observed with a number of
different partners. The jth column mean
represents the average of behaviors of persons
who have Person j as a partner; thus the
column variance indicates whether there are
significant differences among individuals as
partners (that is, as social stimuli to which
other persons react). These two main effects,
row (actor) and column (partner), are both
subject variables, and they are in fact based
on the same set of subjects; in developing our
new round robin model, we will distinguish 
the actor and partner factors and allow these 
two factors to be either positively or negatively 
correlated. Among the other round robin 
designs, different approaches are used: Bechtel 
(1971) treats actor and partner factors 
separately and allows them to be positively 
correlated; Lev and Kinder (1957) treat 
actor and partner as separate factors and 
require that they be independent; Bechtel 
(1967) and Gleason and Halperin (1975) treat 
actor and partner as equivalent and combine 
them into a single subject factor. 
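
The row (actor) and column (partner) means described above can be computed directly from a round robin layout once the empty diagonal is skipped. The sketch below is illustrative only: the function name `actor_partner_means`, the dictionary representation of the layout, and the three-person data are all assumptions made for the example.

```python
def actor_partner_means(cells, n):
    """cells maps (i, j) -> list of observations for i != j; returns row and column means."""

    def cell_mean(i, j):
        observations = cells[(i, j)]
        return sum(observations) / len(observations)

    # Row mean: average behavior of Person i toward the n - 1 partners.
    row_means = [
        sum(cell_mean(i, j) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    # Column mean: average behavior of the n - 1 others toward Person j.
    col_means = [
        sum(cell_mean(i, j) for i in range(n) if i != j) / (n - 1)
        for j in range(n)
    ]
    return row_means, col_means


# Made-up layout for three persons with one observation per cell.
example_cells = {
    (0, 1): [4.0], (0, 2): [6.0],
    (1, 0): [1.0], (1, 2): [3.0],
    (2, 0): [2.0], (2, 1): [2.0],
}
row_means, col_means = actor_partner_means(example_cells, n=3)
```

Both sets of means are built from the same persons, which is why the two factors cannot simply be treated as independent.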

This nonindependence between the row and 
column factor is just one of the special consider- 
ations that necessitate the development of a 
special analysis of variance model; another 
special problem is the nonindependence of 
partner behaviors that comes about because 
social behaviors are mutually contingent. 
This mutual contingency means that the 
behavior of j to i is correlated with the 
behavior of i to j; or, to put it another way, 
in the layout in Table 3, the cells across the 
diagonal from each other (the i,j and j,i 
pairs of cells) are correlated, either positively 
or negatively. Since this violates the ordinary 
analysis of variance assumptions, it requires 
special adaptations in the specification of the 
model and the derivation of expected mean 
squares. 
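
A quick descriptive check of this cross-diagonal dependence is to correlate each (i, j) cell mean with its (j, i) counterpart, entering each dyad in both orders because neither member is naturally "first". The sketch below is only that descriptive check, with a hypothetical function name `cross_diagonal_r`; it is not the estimator developed in this paper, since a raw correlation of this kind mixes actor and partner effects with the dyad-level covariances that the model separates.

```python
from itertools import combinations


def cross_diagonal_r(cell_means, n):
    """Double-entry correlation between the (i, j) and (j, i) cell means."""
    pairs = [
        (cell_means[(i, j)], cell_means[(j, i)])
        for i, j in combinations(range(n), 2)
    ]
    # Enter every dyad twice, once in each order.
    pairs = pairs + [(b, a) for a, b in pairs]

    mean = sum(a for a, _ in pairs) / len(pairs)
    variance = sum((a - mean) ** 2 for a, _ in pairs) / len(pairs)
    covariance = sum((a - mean) * (b - mean) for a, b in pairs) / len(pairs)
    return covariance / variance


# Made-up cell means in which partners match each other exactly.
matched = {
    (0, 1): 1.0, (1, 0): 1.0,
    (0, 2): 2.0, (2, 0): 2.0,
    (1, 2): 3.0, (2, 1): 3.0,
}
r_matched = cross_diagonal_r(matched, n=3)
```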

Many of the basic assumptions about 
expectations are unchanged: The expectations 
of row, column, and interaction effects and 
the expectation of the error are still zero. 
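Formally, $E(\alpha_i) = 0$, $E(\beta_j) = 0$, $E(\gamma_{ij}) = 0$,
and $E(\epsilon_{ijk}) = 0$, for all i, j, and k ($j \neq i$),
where E is the expectation operator. The
usual variance specifications are also made:


$$E(\alpha_i^2) = \sigma_\alpha^2; \quad E(\beta_j^2) = \sigma_\beta^2; \quad E(\gamma_{ij}^2) = \sigma_\gamma^2;$$
and
$$E(\epsilon_{ijk}^2) = \sigma_\epsilon^2, \quad \text{for all } i, j, \text{ and } k \ (j \neq i).$$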


The sources of nonindependence outlined 
earlier (the correlation between row and 
column factors and between pairs of cross- 
diagonal cells) must also be incorporated into 
the model specifications. The covariance 
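between Person i’s effect as an actor ($\alpha_i$) and
Person i’s effect as a partner ($\beta_i$) is formally
stated as $E(\alpha_i, \beta_i) = \rho_1\sigma_\alpha\sigma_\beta$, hereafter referred
to as the row and column covariance.


Table 4
Summary of Round Robin Model Specifications

$E(\alpha_i) = 0$, $E(\alpha_i^2) = \sigma_\alpha^2$ for all i
$E(\beta_j) = 0$, $E(\beta_j^2) = \sigma_\beta^2$ for all j
$E(\gamma_{ij}) = 0$ for all i and j, $j \neq i$
$E(\epsilon_{ijk}) = 0$ for all i, j, and k

$$E(\alpha_i, \beta_j) = \begin{cases} \rho_1\sigma_\alpha\sigma_\beta & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

$$E(\gamma_{ij}, \gamma_{i'j'}) = \begin{cases} \sigma_\gamma^2 & \text{if } i = i',\ j = j' \\ \rho_2\sigma_\gamma^2 & \text{if } i = j',\ j = i' \\ 0 & \text{otherwise} \end{cases}$$

$$E(\epsilon_{ijk}, \epsilon_{i'j'k'}) = \begin{cases} \sigma_\epsilon^2 & \text{if } i = i',\ j = j',\ k = k' \\ \rho_3\sigma_\epsilon^2 & \text{if } i = j',\ j = i',\ k = k' \\ 0 & \text{otherwise} \end{cases}$$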

The covariance between pairs of cross- 
diagonal cells is more complex, since it consists 
of two parts. First, it is reasonable to assume 
that there is some correlation between $\gamma_{ij}$
and $\gamma_{ji}$ (that is, between i’s special adjustment
to j and j’s special adjustment to i). This
nonzero covariance between the interaction
effects within dyads is represented as $\rho_2\sigma_\gamma^2$;
formally, the nonindependence assumption
requires that we specify $E(\gamma_{ij}, \gamma_{ji}) = \rho_2\sigma_\gamma^2$.
This term will be referred to as the cross- 
diagonal interaction covariance, and it will 
be interpreted as evidence for some kind of 
lasting reciprocity of behaviors within dyads. 
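Furthermore, it is reasonable to expect that the errors
for the behavior of social interactants
on a particular day or time will also be correlated:
$E(\epsilon_{ijk}, \epsilon_{jik}) = \rho_3\sigma_\epsilon^2$; this will
be referred to as the cross-diagonal error covariance,
and it will be interpreted as evidence for a situation-
specific reciprocity in partner behaviors. All
other covariances are assumed to be zero; the specifications
outlined in this section are summarized in Table 4.
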
Before outlining the calculations and param-
eter estimation procedures, we should note
the alternative model specifications briefly 
described in Table 5; depending on the nature 
of the data, one of the previously developed 
models may provide more powerful means of 
analyzing some data. This new model has the 
least restrictive assumptions of any of these 
designs. 


Calculation of Means and Mean Squares 


The strategy adopted for the purpose of 
developing the new model was as follows:
First, we set up formulae for the sample means 
and mean squares; next, taking into account 
the empty diagonal cells and the nonindepen- 
dence assumptions included in the model 
specifications, we worked out the expected 
mean squares; finally, we used the expected 
mean square equations to solve for the variance 
components of the model (e.g., $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_\gamma^2$)
in terms of the sample mean squares. Signif- 
icance testing on these variance components 
was done by jackknifing, since ordinary F 
ratios are not easily set up. After these pro- 
cedures have been outlined in detail, the model 
will be demonstrated using vocal activity data. 

The computation of sample means is
straightforward; the only special consideration
is the presence of empty cells in the diagonal.

Grand mean:
$$M_{...} = \frac{1}{rn(n-1)} \sum_i \sum_{j \neq i} \sum_k X_{ijk}.$$
Row mean:
$$M_{i..} = \frac{1}{r(n-1)} \sum_{j \neq i} \sum_k X_{ijk}.$$
Column mean:
$$M_{.j.} = \frac{1}{r(n-1)} \sum_{i \neq j} \sum_k X_{ijk}.$$
Cell mean:
$$M_{ij.} = \frac{1}{r} \sum_k X_{ijk}.$$

As usual, the estimate of the grand mean is
$M_{...}$. Because of the empty diagonal cells,
$M_{i..} - M_{...}$ is not an unbiased estimate of
the row effect ($\alpha_i$); the following argument
will explain why the empty diagonal cells
produce bias in the row means of this layout.
When you sum across row i, for instance, you
get an average of Person i’s behavior with
n − 1 partners, not including self. This means
that the estimate of Person i’s talkativeness
relative to the talkativeness of the other
persons in the sample is biased, since Person i
is not observed with the same set of partners
that all the other persons had. Since we can
get an estimate of Person i’s effect as a partner
($\beta_i$), however, it is possible to correct for this
“missing partner” bias. For instance, in the
data on proportion of time spent speaking in
conversations, any particular observation ($X_{ij}$)
depends upon the talkativeness of i, the
person being observed, and the amount of
talk elicited by j, the partner. Typically, the
amount of talk elicited by j depends on j’s
talkativeness. If Person i is the most talkative
person in the sample, then the fact that i lacks
himself or herself as a partner means the
estimate of his/her talkativeness is slightly
inflated, since he/she had the least talkative
set of partners of any individual in the sample.
An analogous argument holds for bias in the
column means. The estimates for the row
effects and the column effects are as follows:


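$$\hat\alpha_i = \frac{(n-1)^2}{n(n-2)} M_{i..} + \frac{n-1}{n(n-2)} M_{.i.} - \frac{n-1}{n-2} M_{...},$$

$$\hat\beta_i = \frac{(n-1)^2}{n(n-2)} M_{.i.} + \frac{n-1}{n(n-2)} M_{i..} - \frac{n-1}{n-2} M_{...}.$$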


These equations make sense in light of the
previous discussion; to get an estimate of
Person i’s true row effect, it is necessary to
correct for the “missing partner” bias by
adjusting by some fraction of Person i’s
column mean. As usual, once the row and
column effects are known, estimates of the
interaction effects can be obtained by subtrac-
tion:

$$\hat\gamma_{ij} = M_{ij.} - M_{...} - \hat\alpha_i - \hat\beta_j.$$
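
The estimation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the O-Bird program: the function name `round_robin_effects`, the array layout (actor by partner by occasion), and the demonstration values are assumptions introduced here. The sketch computes the means over off-diagonal cells only and then applies the "missing partner" correction to obtain the actor, partner, and interaction estimates.

```python
import numpy as np

def round_robin_effects(x):
    """Estimate grand mean, actor, partner, and interaction effects.

    x has shape (n, n, r): x[i, j, k] is actor i observed with partner j
    on occasion k. The diagonal x[i, i, :] is empty and is ignored.
    """
    n = x.shape[0]
    off = ~np.eye(n, dtype=bool)                # True for off-diagonal cells
    cell = np.where(off, x.mean(axis=2), 0.0)   # cell means M_ij., diagonal zeroed
    row = cell.sum(axis=1) / (n - 1)            # row means M_i..
    col = cell.sum(axis=0) / (n - 1)            # column means M_.j.
    grand = cell.sum() / (n * (n - 1))          # grand mean M_...
    w1 = (n - 1) ** 2 / (n * (n - 2))
    w2 = (n - 1) / (n * (n - 2))
    w3 = (n - 1) / (n - 2)
    alpha = w1 * row + w2 * col - w3 * grand    # corrected actor effects
    beta = w1 * col + w2 * row - w3 * grand     # corrected partner effects
    gamma = np.where(off, cell - grand - alpha[:, None] - beta[None, :], np.nan)
    return grand, alpha, beta, gamma

# Hypothetical noise-free additive data: the estimates should recover the inputs.
true_alpha = np.array([3.0, -1.0, -2.0, 0.5, -0.5])
true_beta = np.array([-4.0, 1.0, 2.0, 0.0, 1.0])
demo = 50.0 + true_alpha[:, None, None] + true_beta[None, :, None] + np.zeros((5, 5, 2))
grand, alpha, beta, gamma = round_robin_effects(demo)
```

With purely additive data whose effects sum to zero, the uncorrected deviations $M_{i..} - M_{...}$ would be contaminated by each person's own partner effect, whereas the corrected estimates return the generating values.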


The calculations for mean square row and
mean square column are straightforward, but
the coefficients are slightly different; note
that the total number of observations in the
design is rn(n − 1) and the number of observa-
tions in a row or column is r(n − 1).


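$$MS_{row} = \frac{r(n-1)}{n-1} \sum_i (M_{i..} - M_{...})^2 = r \sum_i (M_{i..} - M_{...})^2,$$

$$MS_{column} = \frac{r(n-1)}{n-1} \sum_j (M_{.j.} - M_{...})^2 = r \sum_j (M_{.j.} - M_{...})^2.$$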


Before going further, notice how the empty 
cells and the special nonindependence assump- 
tions affect the variance components that 
are estimated by the row mean square. For 
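$M_{i..}$ and $M_{...}$, we have:

$$M_{i..} = \mu + \alpha_i - \frac{\beta_i}{n-1} + \frac{n}{n-1}\bar\beta_{.} + \bar\gamma_{i.} + \bar\epsilon_{i..},$$

$$M_{...} = \mu + \bar\alpha_{.} + \bar\beta_{.} + \bar\gamma_{..} + \bar\epsilon_{...},$$

where a bar denotes an average over the dotted subscripts (excluding the empty diagonal cells).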


When these expressions are used to derive 
the expected mean square for row effect, the 
special assumptions incorporated into the 
model specifications mean that many cross- 
multiplication terms whose expected values 
would be zero in an ordinary analysis of 
variance now have nonzero expectations; for 
instance, since $\alpha_i$ is correlated with $\beta_i$, there
will be a nonzero covariance term picked up
when the $\alpha_i$ term in the equation for $M_{i..}$ is
multiplied by the $\bar\beta_{.}$ term in the equation for
$M_{...}$ and their expectation is taken. The
expected mean square for row in the round 
robin model includes the following terms. 


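$$EMS_{row} = r(n-1)\sigma_\alpha^2 + \frac{r}{n-1}\sigma_\beta^2 + r\sigma_\gamma^2 - 2r\rho_1\sigma_\alpha\sigma_\beta - \frac{r}{n-1}\rho_2\sigma_\gamma^2 + \sigma_\epsilon^2 - \frac{1}{n-1}\rho_3\sigma_\epsilon^2,$$

where $\rho_1\sigma_\alpha\sigma_\beta$ is the Row × Column covariance,
$\rho_2\sigma_\gamma^2$ is the cross-diagonal interaction covariance,
and $\rho_3\sigma_\epsilon^2$ is the cross-diagonal error
covariance.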

The complete derivation for the expected 
mean squares will not be presented here since 
it is rather lengthy; a copy of the derivation 
is available on request (Stoto, Kenny, & 
Warner, Note 1). 

The number of untraditional variance 
components included in this expected mean 
square suggests that it will not be easy to find 
an appropriate error term that will make it 
Table 5
Summary of Round Robin Design Specifications

                                        Relationship      Relationship of
                    Actor/partner       of cross-         cells within same   Significance
Study               factors             diagonal cells    row or column       testing          Special features

Lev & Kinder        Separate and        Independent       Independent         F
(1957)              independent

Bechtel (1971)      Separate and        Positively        Independent         F                Tests for identity of row and
                    positively          correlated                                             column factors, and symmetry
                    correlated                                                                 of interaction effects

Gleason &           Combined into       Positively        Positively          Pseudo F         Split plot, with all subjects
Halperin (1975)     single factor       correlated        correlated                           within each condition paired
                                                                                               round robin style; main
                                                                                               emphasis on treatment rather
                                                                                               than subject variables

Warner, Kenny,      Separate, either    Positively or     Independent         t (jackknifing)
& Stoto (1979)      positively or       negatively
                    negatively          correlated
                    correlated

possible to set up an F ratio to evaluate the
significance of the row variance. Instead, we
decided to solve directly for estimates of the
variance components themselves; this requires
seven equations in seven unknowns. Some
additional sample mean squares are therefore
needed, and the nature of these mean squares
can be guessed from the set of unknowns
appearing in $EMS_{row}$. The new mean squares
are a mean square for row and column covariance
(denoted $MS_{row \times column}$); a mean square
for the covariance of cross-diagonal interaction
effects (denoted $MS_{cell \times cell}$); and a mean
square for the covariance of cross-diagonal
errors (denoted $MS_{error \times error}$). Also, in setting
up the sample mean squares, we have chosen
to calculate a mean square for cells (denoted $MS_{cell}$):


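$$MS_{row} = r \sum_i (M_{i..} - M_{...})^2,$$

$$MS_{column} = r \sum_j (M_{.j.} - M_{...})^2,$$

$$MS_{cell} = \frac{r}{n(n-1)-1} \sum_i \sum_{j \neq i} (M_{ij.} - M_{...})^2,$$

$$MS_{row \times column} = r \sum_i (M_{i..} - M_{...})(M_{.i.} - M_{...}),$$

$$MS_{cell \times cell} = \frac{r}{n(n-1)-1} \sum_i \sum_{j \neq i} (M_{ij.} - M_{...})(M_{ji.} - M_{...}),$$

$$MS_{error} = \frac{1}{n(n-1)(r-1)} \sum_i \sum_{j \neq i} \sum_k (X_{ijk} - M_{ij.})^2,$$

and

$$MS_{error \times error} = \frac{1}{n(n-1)(r-1)} \sum_i \sum_{j \neq i} \sum_k (X_{ijk} - M_{ij.})(X_{jik} - M_{ji.}).$$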


The coefficients for the variance components 
in the seven expected mean square equations 
corresponding to these sample mean squares 
are summarized in Table 6. 
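
As a rough illustration of the seven sample mean squares just defined, the following Python sketch computes them from a round robin data array. The function name `round_robin_mean_squares`, the dictionary keys, and the array layout (actor by partner by occasion, with at least two occasions per dyad) are assumptions made for this example and are not part of the original program.

```python
import numpy as np

def round_robin_mean_squares(x):
    """Return the seven sample mean squares for x of shape (n, n, r), r >= 2.

    x[i, j, k] is actor i with partner j on occasion k; diagonal cells are ignored.
    """
    n, _, r = x.shape
    off = ~np.eye(n, dtype=bool)
    cell = np.where(off, x.mean(axis=2), 0.0)       # cell means, diagonal zeroed
    row = cell.sum(axis=1) / (n - 1)
    col = cell.sum(axis=0) / (n - 1)
    grand = cell.sum() / (n * (n - 1))
    d_row, d_col = row - grand, col - grand
    d_cell = np.where(off, cell - grand, 0.0)       # cell deviations from grand mean
    resid = np.where(off[:, :, None], x - cell[:, :, None], 0.0)  # within-cell residuals
    df_cell = n * (n - 1) - 1
    df_error = n * (n - 1) * (r - 1)
    return {
        "row": r * np.sum(d_row ** 2),
        "column": r * np.sum(d_col ** 2),
        "cell": r * np.sum(d_cell ** 2) / df_cell,
        "row_x_column": r * np.sum(d_row * d_col),
        # pair each cell (i, j) with its cross-diagonal partner (j, i)
        "cell_x_cell": r * np.sum(d_cell * d_cell.T) / df_cell,
        "error": np.sum(resid ** 2) / df_error,
        "error_x_error": np.sum(resid * resid.transpose(1, 0, 2)) / df_error,
    }
```

A quick sanity check on the design: if the two members of every dyad have exactly opposite deviations on every occasion (a strictly zero-sum variable), the cross-diagonal products are the negatives of the squares, so `error_x_error` equals minus `error` and `cell_x_cell` equals minus `cell`.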

We now have the information necessary to
solve for the estimates of the seven variance
components. Let M be the (7 × 1) vector of
sample mean squares: ($MS_{row}$, $MS_{column}$, $MS_{cell}$,
$MS_{row \times column}$, $MS_{cell \times cell}$, $MS_{error}$, $MS_{error \times error}$).
Let V be the (7 × 1) vector of variance
components to be estimated:

$$(\sigma_\alpha^2,\ \sigma_\beta^2,\ \sigma_\gamma^2,\ \rho_1\sigma_\alpha\sigma_\beta,\ \rho_2\sigma_\gamma^2,\ \sigma_\epsilon^2,\ \rho_3\sigma_\epsilon^2).$$

Let C be the (7 × 7) matrix of coefficients
from the expected mean squares equations,
as in Table 6.

By definition, E(M) = CV. Therefore, equating the sample
mean squares to their expectations, we can
solve for estimates of the variance components
by multiplying both sides by the inverse of
the coefficient matrix, to get $\hat V = C^{-1}M$.
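
The solution step can be sketched numerically as follows. The coefficient matrix in this example was rederived from the model specifications (zero means, the three variances, and the three covariances) rather than copied from a legible table, so it should be treated as an assumption to be checked against Table 6; the function name `ems_coefficients` is likewise hypothetical. The mean squares are those reported in Table 8 for the vocal activity data (n = 8, r = 3).

```python
import numpy as np

def ems_coefficients(n, r):
    """Coefficient matrix C of the expected mean squares (rederived; an assumption).

    Rows:    MS_row, MS_column, MS_cell, MS_row_x_column, MS_cell_x_cell,
             MS_error, MS_error_x_error.
    Columns: var_actor, var_partner, var_interaction, cov_actor_partner,
             cov_interaction, var_error, cov_error.
    """
    m = n - 1
    d = n * m - 1
    return np.array([
        [r * m,          r / m,          r,      -2 * r,               -r / m, 1.0,    -1 / m],
        [r / m,          r * m,          r,      -2 * r,               -r / m, 1.0,    -1 / m],
        [r * m**2 / d,   r * m**2 / d,   r,      -2 * r * m / d,       -r / d, 1.0,    -1 / d],
        [-r,             -r,             -r / m, r * (m**2 + 1) / m,   r,      -1 / m, 1.0],
        [-r * m / d,     -r * m / d,     -r / d, 2 * r * m**2 / d,     r,      -1 / d, 1.0],
        [0.0,            0.0,            0.0,    0.0,                  0.0,    1.0,    0.0],
        [0.0,            0.0,            0.0,    0.0,                  0.0,    0.0,    1.0],
    ])

# Sample mean squares from Table 8, in the row order documented above.
ms = np.array([2438.3167, 1388.1423, 622.8201, -1380.7346,
               -354.1238, 146.0462, -95.5282])
v = np.linalg.solve(ems_coefficients(8, 3), ms)  # estimated components, V = C^{-1} M
print(np.round(v, 4))
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly. With these inputs the solution agrees closely with the variance and covariance estimates listed in Table 8, which gives some reassurance that the rederived coefficients match the ones used in the original analysis.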

One can of course consider various special
cases of the model, such as $\sigma_\alpha^2 = \sigma_\beta^2$ and $\rho_1 = 1$;
$\rho_1 = 0$; $\rho_2 = 1$; $\rho_2 = -1$; $\rho_3 = 1$; or $\rho_3 = -1$.
For any of the above assumptions one can
simply drop the relevant mean square and
change the coefficient matrix accordingly.
In the case of $\sigma_\alpha^2 = \sigma_\beta^2$ and $\rho_1 = 1$, a sensible
strategy would be to combine $MS_{row}$ and
$MS_{column}$ and drop $MS_{row \times column}$.

In order to ensure that all the parameters 
in the model are identified, it is necessary to 
have a minimum of four subjects. The model 
may be applied where r = 1 (each dyad is 
observed only once), but in this case it will 
not be possible to distinguish the interaction
variance ($\sigma_\gamma^2$) from the error variance ($\sigma_\epsilon^2$),
or the cross-diagonal interaction covariance
($\rho_2\sigma_\gamma^2$) from the cross-diagonal error covariance
($\rho_3\sigma_\epsilon^2$).


Significance Testing 


Once estimates of the variance components 
have been obtained, it is necessary to have 
some means of evaluating whether they differ
significantly from zero.² A useful strategy is
jackknifing (Mosteller & Tukey, 1977). The
idea is to estimate the variance of a parameter
by examining the empirical distribution of
estimates of that parameter (such as $\hat\sigma_\alpha^2$),
since there is no simple way to determine the
variance of its theoretical distribution. For
$\sigma_\alpha^2$, for instance, this can be done as follows:
First, create n different subsets of the data,
each subset consisting of the full data matrix,
omitting the data for one subject (i.e., one
row and the corresponding column). For each
of these leave-out-one subsets, we can estimate
a value for $\sigma_\alpha^2$. Now, these n different estimates
of $\sigma_\alpha^2$ are not independent, since the subsets
overlap in membership; but Mosteller and
Tukey have devised a way around this
problem. A set of pseudoestimates is generated
from the leave-out-one estimates, and the
pseudoestimates can be treated as if they were
independent. Let $Y_{(all)}$ be the value of $\hat\sigma_\alpha^2$ for
the whole group of subjects; let $Y_{(j)}$ be the
value of $\hat\sigma_\alpha^2$ for the subset of n − 1 subjects
created by omitting person j. Then the
pseudoestimates are given by $Y_{*j} = nY_{(all)} - (n-1)Y_{(j)}$
for j = 1, 2, ..., n. The $Y_{*j}$
pseudoestimate is a “fake” estimate of $\sigma_\alpha^2$,
as if we had been able to estimate $\sigma_\alpha^2$ based
on only subject j; using this set of pseudo-
estimates, we can estimate the standard error


² To see why F ratios are difficult to set up, consider
the expected mean squares associated with the ratio
of mean square rows and mean square error. It is clear
that this ratio would not be an unbiased estimate of
the row effect variance.


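Table 6
Expected Mean Square Coefficients

$$
\begin{array}{llccccccc}
MS & df & \sigma_\alpha^2 & \sigma_\beta^2 & \sigma_\gamma^2 & \rho_1\sigma_\alpha\sigma_\beta & \rho_2\sigma_\gamma^2 & \sigma_\epsilon^2 & \rho_3\sigma_\epsilon^2 \\
\hline
\text{Row} & n-1 & r(n-1) & \frac{r}{n-1} & r & -2r & -\frac{r}{n-1} & 1 & -\frac{1}{n-1} \\
\text{Column} & n-1 & \frac{r}{n-1} & r(n-1) & r & -2r & -\frac{r}{n-1} & 1 & -\frac{1}{n-1} \\
\text{Cell} & d & \frac{r(n-1)^2}{d} & \frac{r(n-1)^2}{d} & r & -\frac{2r(n-1)}{d} & -\frac{r}{d} & 1 & -\frac{1}{d} \\
\text{Row} \times \text{Column} & n-1 & -r & -r & -\frac{r}{n-1} & \frac{r[(n-1)^2+1]}{n-1} & r & -\frac{1}{n-1} & 1 \\
\text{Cell} \times \text{Cell} & d & -\frac{r(n-1)}{d} & -\frac{r(n-1)}{d} & -\frac{r}{d} & \frac{2r(n-1)^2}{d} & r & -\frac{1}{d} & 1 \\
\text{Error} & n(n-1)(r-1) & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
\text{Error} \times \text{Error} & n(n-1)(r-1) & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}
$$

Note. $d = n(n-1) - 1$.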
of $\hat\sigma_\alpha^2$ by the variance of the $Y_{*j}$ values and
use this to set up a t test (with n − 1 degrees
of freedom) to evaluate whether $\sigma_\alpha^2 = 0$.
Let $\bar y_*$ be the mean of the $Y_{*j}$ pseudovalues,
and let $s^2$ be the variance of the $Y_{*j}$ pseudo-
values. Then $(n^{1/2}\bar y_*)/s$ is distributed as t
with n − 1 degrees of freedom if $\sigma_\alpha^2 = 0$.
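
The jackknife recipe described above can be sketched generically in Python. The function `jackknife_t` and its `estimator` argument are hypothetical names introduced for this illustration; in the round robin setting the estimator would be whatever routine returns the component of interest (for example, the actor variance) from a data array, and omitting a subject removes both that subject's row and column.

```python
import numpy as np
from math import sqrt

def jackknife_t(x, estimator):
    """Leave-one-subject-out jackknife for a scalar estimator.

    x is indexed by subject on its first two axes (row = actor, column = partner);
    estimator maps such an array to a single number.
    Returns the t statistic, its degrees of freedom, and the pseudoestimates.
    """
    n = x.shape[0]
    y_all = estimator(x)                          # estimate from the whole group
    pseudo = np.empty(n)
    for j in range(n):
        keep = np.delete(np.arange(n), j)         # drop subject j
        y_j = estimator(x[np.ix_(keep, keep)])    # omit row j and column j
        pseudo[j] = n * y_all - (n - 1) * y_j     # pseudoestimate for subject j
    y_bar = pseudo.mean()
    s = pseudo.std(ddof=1)                        # standard deviation of pseudoestimates
    return sqrt(n) * y_bar / s, n - 1, pseudo
```

For an estimator that is a simple average of per-subject quantities, each pseudoestimate reduces to that subject's own value, which is the sense in which a pseudoestimate acts as an estimate "based on only subject j". For nonlinear estimators such as variance components this is only approximately true, which is why the resulting t test is best regarded as approximate.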

This jackknifing procedure can be used to 
test the significance of each of the variance 
and covariance components estimated for the 
model. For justification of the procedure and
further examples of its applications, see
Mosteller and Tukey (1977, chap. 8).


Interpretation of the Variance and Covariance 
Components 


Since the components of this model differ 
from those of an ordinary analysis of variance, 
some comments on the meaning that the
variance components have for various types 
of social interaction data may be helpful. 
Notice first that three variance components 
(actor, partner, and interaction) correspond 
to the components of an ordinary two-way 
analysis of variance and that three covariance 
components (row and column, cross-diagonal 
interaction, and cross-diagonal error) provide 
information somewhat similar to that of an 
intraclass correlation design. 

Actor effect. This factor is related to individ-
ual differences among persons as sources of
social behaviors; depending on the dependent
variable, this factor can be renamed speaker,
rater, perceiver, and so forth. Since each
individual’s behavior is observed over a number
of different partners, the round robin analysis
of variance provides a fairly strict test for
stability; the actor effect cannot be large
unless individuals are fairly consistent in
their behavior, regardless of the person with
whom they are paired.

Partner effect. This factor indicates whether
there is a strong tendency for individuals to
elicit particular types of behavior from other
persons, for instance, high or low trait ratings,
high or low level of attraction, high or low
amounts of speech activity, and so forth.

Notice that it is possible to have a strong
actor effect and a weak partner effect,
or a strong partner effect and a weak actor
effect (as we have seen in preliminary work 
with person perception data, where the 
subjects are being trained to apply an objective 
rating system; there are small differences 
among ratings given by different raters and 
large differences in the ratings received by 
different individuals). Thus, although they 
are often highly correlated, it is important 
to stress again the logical distinction between 
these factors, and in some situations, partic- 
ularly in person perception research, it is 
interesting to compare the magnitude of 
these two main effects. 

Interaction effect. This component relates 
to the particular adjustment that each person 
makes to each particular partner. In vocal 
activity research, for instance, interaction 
effect indicates whether Person i paired with 
Person j consistently talks a little more (or a 
little less) than would be expected on the 
basis of i’s talkativeness and j’s tendency to 
elicit talk. This interaction effect also has 
meaning in person perception and attraction 
research—does Person i tend to like Person j 
more than would be expected on the basis of 
i's average liking and j’s average tendency 
to inspire liking? The interaction effect picks 
up any tendency to make adjustments that 
are unique to each partner and hold up across 
repeated pairings with that partner. In a 
sense the interaction term measures what is 
unique to the interaction between partners. 

Row and column covariance. If this covariance
is very large relative to the row and
column variances, and if it is positive, it
suggests that the row and column factors are
nearly indistinguishable, in which case a
single factor design (one that combines actor
and partner into a single subject factor) may
be more appropriate. In some situations, the
row and column covariance may have a
substantive interpretation; if the dependent
variable is related to utility (for instance, the
favorableness of a rating or the amount of
some resource given by one person to another),
then the relationship between the row factor
(amount Person i gives out, on the average)
and the column factor (amount Person i
receives, on the average) may be viewed as a
kind of “equity” factor. Another theoretical
construct that comes close to describing this
factor in person perception is projection; a
high row and column covariance could be 
taken to mean that the ratings a person gives 
to others are closely tied to his or her own 
traits (as perceived by others). For the vocal 
activity data, there is no simple interpretation 
of this covariance. 

Cross-diagonal interaction covariance. This 
is one of the two dyadic reciprocity factors in 
the round robin design; it should be distin- 
guished from the cross-diagonal error covar- 
iance, which picks up a different type of 
reciprocity. A large cross-diagonal interaction 
covariance means that for each of the possible 
(i, j) dyads, the systematic and enduring 
adjustment that i makes to j is correlated 
with the enduring adjustment that j makes 
to i. For interpersonal attraction data, it
seems likely that this could be a strong
relationship; that is, if i always tends to like
j more than expected, it seems probable that
in return j always tends to like i more than
expected. The reciprocity picked up by this
covariance is an enduring reciprocity, that is, 
a relationship between partner behaviors that 
holds up across different times or situations. 

Cross-diagonal error covariance. This is a 
situation-specific reciprocity effect. If this 
covariance is large, then the behaviors of the 
partners are highly correlated in any particular 
situation, although they may not necessarily 
be related in the same way across different 
situations. For instance, in the vocal activity 
data, we will find a large cross-diagonal 
error covariance—since the proportion of 
time available for speaking in any particular 
conversation is approximately zero sum, the 
vocal activity of partners in specific conversa- 
tions is highly negatively correlated. However, 
the reciprocity turns out not to be enduring— 
that is, although the sum of activity is approx- 
imately 1 in any particular conversation, the 
allocation of time between partners may be 
worked out differently in different conversa- 
tions—in one, Person i may do more than 
his or her share of the talking; in the next, 
Person j may do more than his or her share. 
A fanciful example illustrating the indepen-
dence of the situation-specific and the enduring
reciprocity factors might be found in a soap
opera, where the attractions and antipathies
among the characters may be highly correlated
and mutual in any one episode, but the pattern
of attractions may shift from one week to the
next, indicating strong situation-specific reci-
procity and a lack of enduring reciprocity.


Summary 


Clearly, the interpretation of the variance 
and covariance components in the round robin 
model depends upon the nature of the depen- 
dent variable and the type of substantive or 
theoretical questions that arise about the 
reciprocity of behaviors. The goal of this 
section is to show the flexibility of the new 
round robin model; there are some readily 
understandable parallels between well-known 
theoretical constructs and the factors in the 
round robin design. The following section will 
demonstrate the application of the round 
robin design to vocal activity data. 


Empirical Illustration 


Various aspects of speech activity in con- 
versations and interviews have been singled 
out for study, for example, mean duration 
of vocalizations or pauses, probability of 
initiating or maintaining speech, and propor- 
tion of time spent speaking; these vocal 
activity parameters are interrelated, and 
certain aspects of vocal activity are highly 
consistent for individual speakers (Jaffe & 
Feldstein, 1970). Of these parameters, the 
simplest one to study is the proportion of time 
spent speaking, or speaker activity level. Past 
research on individual speaker consistency has 
generally involved at most two or three 
different partners; the round robin design 
provides a natural method of examining 
individual differences across many different 
partners in order to evaluate parametrically 
whether the reliable individual speaker differ- 
ences claimed on the basis of this earlier 
research actually exist. 

A round robin study of proportion of time 
spent speaking was conducted by Warner 
(Note 2). Eight participants (four male, four 
female) were enlisted for a study of conversa- 
tion. Each of the 28 possible pairs conversed 
privately on 3 separate days for about 12 to 
15 minutes each time. Each speaker’s voice 
was recorded onto a separate channel of a 
Table 7
Proportion of Time Spent Speaking in
Conversations


Subject
1 2 3 4 5 6 7 8 M


1 55 65 59 62 87 69 81 67


2 22 21 34 25 35 39 58 38 
50 42 27 38 35 36 53 
36 35 38 34 49 53 28 


3 31 56 34 31 24 34 36 35 
45 45 40 33 25 48 16 
12 54 24 37 42 37 30 


4 26 62 68 46 31 58 72 55
30 62 55 25 89 43 83 
45 60 81 65 33 59 65 


5 66 56 70 67 36 58 733 61 
70 56 89 69 54 53 56 


52 58 64 44 3 1 n 
6 27 30 74 45 49 70 40 48 
49 S3 Iti 35 78 69 


28 41 47 48 37 46 41 
7 52 40 77 32 52 49 67 50 


45 66 54 40 48 22 65 
67 34 60 48 23 49 61 
8 39 47 75 47 37 45 37 55 


59 80 90 47 51 46 67 
45 70 73 49 37 51 51 


Column 
M 43 54 65 43 43 47 54 58 51 


Note. $X_{ijk}$ = the proportion of time spent speaking
by Person i to Person j on Day k (in percent).


stereo tape. The participants wore headsets
with noise-canceling microphones close to
the mouth; this type of microphone arrange-
ment, which was somewhat intrusive, was
necessary because ordinary microphones allow
too much spillage of voices between channels.
The proportion of time spent speaking by each
person was determined by using a computer
voice-operated relay to detect the presence
or absence of speech in each of the two channels
of the tape-recorded conversations. The voice-
operated relay was calibrated by means of an
indicator light that showed when the system
was detecting speech activity; thus a human
listener could manipulate the threshold and
other settings on the voice-operated relay
until the pattern shown by the light matched
the perceived on-off pattern of speech activity.
The on-off vocal activity judgment was
made twice per second, and this information
was used to determine the proportion of time
spent speaking by each person over the time
of the whole conversation.

Subjects were instructed that the study was 
about the process of becoming acquainted 
and were told that they could talk about 
whatever they liked. Pilot tests had indicated 
that this was probably much more natural 
than imposing some kind of task; most dyads 
had little difficulty in carrying on a conversa- 
tion, and a number of conversations were 
quite animated and contained information of 
a rather personal nature. Subjects spent their
free time between recording sessions in a
lounge where they were free to talk.

An effort was made to promote a feeling 
of ease in the situation, at the cost of experi- 
mental control; for example, several persons 
in the study were previously acquainted, some 
of the conversations that took place for the 
tape recorder were continuations of conversa- 
tions that had begun in the lounge, and many 
conversations continued after the tape recorder 
was turned off. Partly because of the marathon 
nature of the scheduling (the eight subjects 
spent three 8-hour days participating in the 
study), a certain amount of group cohesiveness 
developed. For all these reasons, it seems 
likely that the conversations reported in this 
study are more natural than those that have
been elicited from persons who come into the 
laboratory “cold” for just one session or 
who converse via intercom without visual 
contact. 

The proportion of time spent speaking was
tabulated for all 84 conversations, and the
results are displayed in the 8 × 8 × 3 data
matrix shown in Table 7.³

The O-Bird round robin analysis of variance
program, which carries out all the calculations,
parameter estimation, and significance testing
as outlined in this paper, was run on these
data.⁴ The results are displayed in a modified
source table, with t tests in place of the usual
F ratios. Since the true variance components
(row, column, interaction) must be nonnega-
tive, a one-tailed t was used for these three
tests; the covariances (row and column,
cross-diagonal interaction, and cross-diagonal
error) can be either positive or negative, and
so a two-tailed t is needed for these three tests.⁵

³ No transformation for proportional data (such as
arcsine) was used, since none of the proportions was
very near zero or one.

⁴ The O-Bird program (in Fortran) is available from
the second author.


Discussion 


There were two significant effects: (a)
clear-cut individual speaker differences in
activity level, as shown by the significant t
value for row effect and (b) strong situational
reciprocity between the speech of partners in
particular conversations, as indicated by the
significant t value for the cross-diagonal error
covariance. No other effects were statistically
significant (although the sizes and signs of all
other effects were generally consistent with
our expectations, which lends some additional
plausibility to the model). Our results confirm
earlier findings of individual differences; we
also have information about the cross-diagonal
error covariance, which can be interpreted
as evidence of a situation-specific reciprocity
of speech activity level. This was anticipated,
since it is a common sense observation that the
time allocation within any particular conversa-
tion is approximately zero sum, that is, the
more time taken up by one speaker, the less
time available to the other speaker. Thus, the
proportions of time spent speaking by the
two persons in a particular conversation
($X_{ijk}$, $X_{jik}$) are negatively correlated.

The partner effect was smaller than the
speaker effect, which would suggest that the
impact of partner activity level on an individ-
ual’s speech production is not as great as the
effect of his or her own “preferred” activity
level. As anticipated, there was a negative
covariance between the activity level of a
person and that person’s effect on other
people’s activity level when he or she was
their partner; however, this covariance was
not significant, so it seems appropriate to
conclude that the relationship between an
individual’s activity level and his or her
tendency to elicit a high or low activity level
from others is not so strong that these two
factors should be considered equivalent.

There was a relatively modest and non-
significant interaction effect. This can be
rationalized as follows: although it is true that
in any particular conversation between persons
i and j they adjust their activity levels to
each other to achieve a total proportion of
time active of about 1.0, this adjustment can
be made in a number of different ways, and the
adjustment that they make in any one partic-
ular conversation is unique to that conversa-
tion and does not carry over to later conversa-
tions (at least, not in our study, which con-
siders only three conversations per dyad).
That is, in one conversation, i may talk a
little more than would be expected (based on
i’s talkativeness and j’s tendency to elicit
talk); in another conversation, i may talk
less than would be expected. The small
interaction variance indicates that there is
not a fixed adjustment for each dyad such
that, when i talks to j, he or she always talks
a little more (or less) than usual.

The cross-diagonal interaction covariance 
for these data is virtually zero. This is reason- 
able; if i’s adjustment to j is not stable or 
fixed across different situations, it is not 
reasonable to expect a strong relationship 
between i’s average adjustment to j and
j’s average adjustment to i. Furthermore,
even if these partner adjustments were 
consistent, as would be indicated if there were 
a large interaction effect, the adjustments in 
activity level might not necessarily be corre- 
lated for partners. Any combination is possible 
—depending upon the “preferred” activity 
levels of i and j, both partners could talk 
more, both partners could talk less, i could 
talk more and j could talk less, or j could 
talk more and i could talk less; any of these 
types of adjustment could conceivably be 
needed in order to work out the time allocation 
between partners. Since all these combinations 
can and do occur, on the average the covariance 
between i’s adjustment to j and j’s adjustment 


⁵ When testing whether a variance component is
zero in the population, a one-tailed t is appropriate,
since the true variance component in the population
cannot be negative. Since the pseudovalues can be
negative (and will be negative about half the time if the
true variance component being estimated is zero),
there is no reason why these pseudoestimates could
not be t distributed. Of course, the covariance terms
can be either positive or negative, so two-tailed ts
should be used for the covariance terms.
Table 8
Source Table for Round Robin Analysis of
Variance

                                    Variance or
                                    covariance
Source                   MS         estimate       t(7)
Row (speaker)        2438.3167       91.9680      2.0144*
Column (partner)     1388.1423       40.9178      1.2669
Interaction           622.8201       29.7534      1.0262
Row and column      −1380.7346      −40.3749     −1.1547
Cell × Cell          −354.1238        4.0811      0.0997
Error × Error         −95.5282      −95.5282     −2.4500**
Error                 146.0462      146.0462

*p < .05, one-tailed t(7). **p < .05, two-tailed
t(7).

to i will be nearly zero. The small (virtually 
zero) covariance obtained for this cross- 
diagonal interaction covariance indicates a 
lack of enduring reciprocity in speech activity 
level—there is no adjustment that i always 
makes to j that is correlated with an adjust- 
ment that j always makes to i. It seems likely 
that certain other vocal activity parameters 
(such as mean switching pause duration) and 
other variables such as attraction, might show 
enduring reciprocity; but there seems to be 
no such effect for activity level of speakers. 
The significant individual speaker differ-
ences, taken together with the nonsignificant
interaction effect, suggest that the appropriate
unit of analysis for proportion of time speaking
is the individual person rather than the dyad.
The strong situational reciprocity, together
with the lack of enduring reciprocity, suggests
that the reciprocity of speech activity levels
depends heavily on situational factors (perhaps
topic of conversation, time of day,
moods of the participants, and so forth)
rather than on some lasting “agreement”
between partners as to their relative dominance
or right to claim speaking time. At present
the model is not set up to handle order effects,
but it could be extended to account for predict-
able shifts in the time allocation between
partners over time (if these occur).
; Although the main focus of this discussion is 
interpretation of the significant effects, there 
are additional aspects of the results in Table 8 
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that bear further examination. It is possible 
to estimate correlations from the 


mean 
squares: 
prise M Snow xcotuma sm = 17504) 
(MSrow*MScotumn)! 
and 
pa = MBoaxcu _ _ s6g6, 
sf MS con 
The $r_1$ is the correlation estimate for the
relationship between the row and column
factors that would be obtained directly from
the raw data. It is a biased estimate of the
true relationship between row and column
factors; this can be seen by inspecting the
set of variance components belonging to
$E(MS_{\text{row} \times \text{column}})$, $E(MS_{\text{row}})$, and $E(MS_{\text{column}})$. The
ratio of these raw mean squares clearly does
not provide an estimate of the true correlation
$\rho_1$. The same problem applies to the estimate
for $r_2$, which is also biased; however, no such
problem arises in estimating $\rho_3$, which is just
$MS_{\text{error} \times \text{error}} / MS_{\text{error}}$.

An alternative means of estimating the 
correlations is to take ratios of the variance 
and covariance components derived from the 
round robin model, as follows: 


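\[ \hat{\rho}_1 = \frac{\hat{\sigma}_{\alpha\beta}}{(\hat{\sigma}^2_{\alpha} \, \hat{\sigma}^2_{\beta})^{1/2}} = -.6582 \]

\[ \hat{\rho}_2 = \frac{\hat{\sigma}_{\gamma\gamma'}}{\hat{\sigma}^2_{\gamma}} = .1372. \]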


Although the components of these new correla- 
tion estimates are unbiased, our initial 
experiences with them indicate that they may 
be unreliable. Also, just as estimates for the 
variance components of the round robin 
model can be less than zero due to sampling 
error, correlation estimates derived from the 
variance and covariance components may 
fall outside the range of plus or minus one,
especially if the denominators are small.
Thus some caution is in order when using or
interpreting these correlation estimates. It
should be kept in mind that correlations
based on the raw data in a round robin-type
layout do not estimate what they seem to be
estimating. Such raw correlations are commonly
reported in person perception literature;
their interpretation is highly problematic. 
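
The two sets of correlation estimates discussed above can be recomputed from the entries printed in Table 8. The following is a minimal sketch rather than the authors' program; the dictionary keys and the helper name are illustrative assumptions, and the numbers are simply those printed in the table.

```python
import math

# Mean squares and component estimates copied from Table 8.
# The key names are labels chosen for this sketch only.
ms = {"row": 2438.3167, "column": 1388.1423, "interaction": 622.8201,
      "row_x_column": -1380.7346, "cell_x_cell": -354.1238,
      "error_x_error": -95.5282, "error": 146.0462}
comp = {"row": 91.9680, "column": 40.9178, "interaction": 29.7534,
        "row_x_column": -40.3749, "cell_x_cell": 4.0811}


def covariance_over_geometric_mean(cov, var_a, var_b):
    """Divide a covariance-type term by the geometric mean of two variance-type terms."""
    return cov / math.sqrt(var_a * var_b)


# Correlations taken directly from the raw mean squares.
r1 = covariance_over_geometric_mean(ms["row_x_column"], ms["row"], ms["column"])
r2 = ms["cell_x_cell"] / ms["interaction"]
rho3 = ms["error_x_error"] / ms["error"]

# Correlations taken from the variance and covariance component estimates.
rho1_hat = covariance_over_geometric_mean(
    comp["row_x_column"], comp["row"], comp["column"])
rho2_hat = comp["cell_x_cell"] / comp["interaction"]

print(f"r1 = {r1:.4f}, r2 = {r2:.4f}, rho3 = {rho3:.4f}")
print(f"rho1_hat = {rho1_hat:.4f}, rho2_hat = {rho2_hat:.4f}")
```

Because the component estimates can be negative or near zero, a ratio of this kind can exceed one in absolute value, which is the caution raised in the text.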




Another useful strategy is comparing the 
magnitude of various effects. A simple instance 
where this may be substantively interesting 
is in person perception research, where it 
may be useful to know whether there is 
greater variability among the ratings that 
people give (rater or actor effect) or the 
ratings that people receive (partner effect);
that is, is the trait in question “in the eye of
the beholder” (influenced mainly by the
person doing the rating)? Or is it a trait that
different observers can agree upon, influenced
mainly by characteristics of the person being
rated? This comparison can be made by 
looking at the ratio (or the difference) between 
the row and column variance components, 
and the significance of the ratio or difference 
can be evaluated by jackknifing. This compar- 
ison of actor and partner effects was not 
significant for the vocal activity data, t(7) =
.635. That is, even though the row variance 
was significantly greater than zero and the 
column variance was not significantly greater 
than zero, the row variance was not signif- 
icantly greater than the column variance. 
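
The jackknifed comparison described above can be sketched generically. This is an illustration under stated assumptions, not the authors' procedure: the per-person values are hypothetical, and the statistic below is a stand-in, since in a real round robin analysis each leave-out-one subset would require re-estimating the variance components from the reduced data matrix.

```python
import math
from statistics import mean, stdev, variance


def jackknife_t(units, statistic):
    """Jackknife a statistic over leave-one-out subsets of `units`.

    Returns (estimate, standard_error, t). The t value is referred to a
    t distribution with len(units) - 1 degrees of freedom.
    """
    n = len(units)
    full = statistic(units)
    pseudovalues = []
    for i in range(n):
        subset = units[:i] + units[i + 1:]  # drop one person
        pseudovalues.append(n * full - (n - 1) * statistic(subset))
    estimate = mean(pseudovalues)
    standard_error = stdev(pseudovalues) / math.sqrt(n)
    return estimate, standard_error, estimate / standard_error


# Hypothetical (actor effect, partner effect) pairs for eight persons.
persons = [(12.0, 3.0), (-8.0, -2.0), (5.0, 6.0), (-11.0, 1.0),
           (9.0, -5.0), (-4.0, 4.0), (2.0, -6.0), (-5.0, -1.0)]


def actor_minus_partner_variance(sample):
    """Stand-in statistic: variance of actor effects minus variance of partner effects."""
    actors = [a for a, _ in sample]
    partners = [p for _, p in sample]
    return variance(actors) - variance(partners)


estimate, se, t = jackknife_t(persons, actor_minus_partner_variance)
print(f"difference = {estimate:.2f}, SE = {se:.2f}, t({len(persons) - 1}) = {t:.2f}")
```

With eight persons the test has seven degrees of freedom, matching the t(7) values reported in the text.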

More elaborate comparisons among the 
variances can also be constructed. For example, 
for some applications it might be instructive 
to ask to what extent the variability of social 
behavior is accounted for by characteristics 
of the individuals (actor and partner effects) 
as opposed to characteristics of the dyad or 
the situation (interaction and error). To the
extent that actor and partner factors predominate,
social behavior can be viewed as additive; social
interaction can be conceived of as a linear system,
that is, a system with behavior that can be
predicted from the behavior of its components.
Such a test of additivity might take the
following form:

\[ \frac{\sigma^2_{\alpha} + \sigma^2_{\beta}}{\sigma^2_{\gamma} + \sigma^2_{\epsilon}} \]

For our data, this ratio was significantly
greater than zero, t(7) = 2.44, p < .025
one-tailed, based on the jackknifing of this ratio
over different subsets of the data; that is to
say, some nonnegligible amount of the variability
in vocal activity level is accounted for by
the components of the conversational system,
the speakers themselves. Depending upon the
nature of the research problem, other comparisons
among variances can be set up.
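
As a rough illustration of the additivity comparison, the point value of the ratio of individual variance (actor plus partner) to dyadic or situational variance (interaction plus error) can be computed from the component estimates printed in Table 8. This is a sketch under the assumption that those printed estimates are the relevant inputs; the function and variable names are hypothetical, and the significance test reported in the text would additionally require jackknifing the ratio.

```python
# Variance component estimates as printed in Table 8.
var_actor = 91.9680        # row (speaker)
var_partner = 40.9178      # column (partner)
var_interaction = 29.7534
var_error = 142.0462


def additivity_ratio(actor, partner, interaction, error):
    """Individual-level variance divided by dyad- or situation-level variance."""
    return (actor + partner) / (interaction + error)


ratio = additivity_ratio(var_actor, var_partner, var_interaction, var_error)
print(f"additivity ratio = {ratio:.4f}")
```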


Special Applications


As with any statistical model, the decision 
to use the round robin model depends on two 
considerations—first, whether the assumptions 
of the model are satisfied, and second, whether 
the model provides the desired information. 
It is possible to use the round robin model to 
analyze data on unconstrained social interac- 
tions in natural settings or to make group 
comparisons; however, it is important to 
realize that these data may violate certain 
assumptions of the round robin model. 

Consider the type of study in which social 
interactions are observed in natural settings 
such as classrooms. One might count the 
number of aggressive or attention-getting acts 
by each child. Since such data violate the 
independence assumptions of ordinary chi- 
square and analysis of variance, one should 
not use these tests. Clearly each act within a 
given classroom is dependent to some extent 
on the other acts that occur within that 
classroom. This might suggest that the 
round robin model is suitable for these data; 
in addition to reciprocity effects, however, 
there are likely to be other forms of social 
influence such as modeling or shared attitudes 
toward particular class members. In postulat- 
ing the round robin model, we assume that 
(except for the reciprocity covariances) the 
observations are independent; for example, h’s
behavior toward Person i is assumed to be
uncorrelated with j’s behavior toward i.
Clearly if Person j imitates h’s behavior or
shares h’s attitudes toward other individuals,
the behaviors of h and j will be correlated.
This means that the independence assumptions 
of the round robin model are violated. It may 
be possible to minimize these other forms of 
social influence by preventing the participants 
from observing the interactions of other dyads 
and preventing participants from discussing 
their perceptions and attitudes. In many 
natural social environments, however, this 
kind of control is not feasible, and the use of 
such stringent controls may sacrifice too much 
external validity. 

We know of no statistical test of the covar- 
iance structure of the data to evaluate whether 
the independence assumptions of the round 
robin model are satisfied. Such a test could 
conceivably be developed along the lines of
work done by Huynh and Feldt (1970). In
the absence of statistical criteria, the investiga- 
tor must realize that modeling and shared 
expectations will bias the estimates of the 
round robin model parameters. 

Another difficulty in the analysis of interac- 
tion data from unconstrained social situations 
is that participants self-select their partners 
and may not interact with everyone else in 
the group. This results in large numbers of 
missing observations or zero frequencies. At 
present the analysis we propose cannot handle 
missing data. The presence or absence of ties 
between persons reveals network structure, 
however, so block models or network analysis 
are viable alternatives (White & Breiger, 
1975). 

Another possible application involves group 
comparisons: either comparisons of two or 
more round robin layouts or comparisons of 
subsets of subjects within a round robin layout. 
For instance, the vocal activity study reported 
earlier included four male and four female 
participants. One might wish to examine 
differences in talkativeness between males 
and females or between same sex and cross-sex 
dyads. This might be done by taking the mean 
difference in the row effect estimates ($\alpha_i$)
for males versus females in each of the
leave-out-one subsets of data and jackknifing to
get a significance test. One might also wish
to correlate the actor effect ($\alpha_i$), talkativeness,
with a personality scale such as dominance. 

Another consideration in applications of 
the round robin model is sample size. The 
round robin analysis requires a minimum of 
four persons, and the jackknife procedure
raises this to five. In fact, we believe that the
analysis will provide unstable estimates for
sample sizes less than eight. Recall that if
there are only seven members of the group,
there are only six degrees of freedom in the
t test of the actor and partner main effects.
To increase the efficiency in estimation, one
[...] could then be pooled across the
different groups, resulting in more stable
estimates.


Summary


The new round robin analysis of variance
provides a tool for dealing with social
interaction data that allows assessment of both
individual differences and reciprocity in social 
behaviors. It readily lends itself to the sub- 
stantive problems encountered in social psy- 
chological research on person perception, 
attraction, vocal activity and other mutually 
contingent behaviors. This means that instead 
of regarding the mutual contingency of social 
behaviors as a problematic deviation from 
the independence assumptions of statistical 
models traditionally used in the nonsocial 
sciences, it is possible to explicitly incorporate 
mutual contingency into the design and to
treat it as an interesting effect in its own right.
By tailoring statistical models to the questions
that arise in social psychological research, we
can leave behind some of the limitations that 
are inherent in the use of statistical models 
that were created to study nonsocial phenom- 
ena. 

The round robin data layout is not new; 
this type of design has been employed in a 
number of classic social psychological studies 
(Campbell, Miller, Lubetsky, & O’Connell, 
1964; Cronbach, 1955; Newcomb, 1961). It 
continues to be used in small group and person 
perception research (e.g., Bales, 1970). What 
is new is the statistical model specifically 
tailored to this design, which makes it possible 
to test the significance of individual differences, 
interaction effects, and various types of 
reciprocity. This new round robin model is
ideally suited to the kind of research Tagiuri
called for in his 1969 article on person perception
in the Handbook of Social Psychology. He
recommended more study of person perception
in the context of ordinary transactions that
occur in the natural environment, where the
persons are interacting with each other and
each person is simultaneously judge and
object.

The round robin design deserves serious
consideration as an alternate research strategy
that provides information complementary to
that from traditional experimental designs.
We hope that the availability of models
that facilitate analysis of data from naturally
occurring social interactions will stimulate
interest in the interactive, mutually contingent
aspects of social behavior and the relationship
between individual differences and social
interaction.


expected mean squares for the round robin analysis
of variance. Unpublished manuscript, Harvard


The Independence of Evaluative and Item Information:
Impression and Recall Order Effects in
Behavior-Based Impression Formation


Elizabeth K. Dreben, Susan T. Fiske, and Reid Hastie
Harvard University


Impression and recall order effects were examined in the context of Anderson's
information integration model of impression formation. Subjects read four
items of information about each stimulus person and participated in one of five
conditions: (a) rated the stimulus person after the presentation
of each item and, after the fourth rating, recalled the four items;
(b) did the same, except that a trait-rating filler task was interpolated between
each rating and the next presentation; (c) did the same, except that the filler
task was math problems; (d) made only one rating, after the presentation of
the fourth item, did no filler task, and did recall the four items; (e) did the
same, without recall. In the first three conditions, each impression rating was
affected most by the most recently presented item, whereas in the two final
response conditions, there was a slight primacy effect. In contrast, task
conditions had little effect on recall; serial position curves for all four recall
conditions were U-shaped, exhibiting both primacy and recency, as commonly
found in free recall research. Low and nonsignificant correlations between
impression weights and recall, as well as the markedly different serial position
curves, were discussed as evidence for distinct processes of recall and
impression formation.


Recent theoretical analyses of human cog- 
nition have depended heavily on the concept 
of abstraction processes to explain perform- 
ance in perception, comprehension, memory, 
and judgment tasks. For example, psycholog- 
ical models for visual perception (Minsky, 
1975; Posner, 1969), sentence comprehension 
(Bransford & Franks, 1972; Schank & Abel- 
son, 1977), verbal memory (Bartlett, 1932; 
Mandler & Johnson, 1977), visual memory 
(Mandler & Parker, 1976), and decision making
(Kahneman & Tversky, 1972) have emphasized
the abstraction of prototypes, data structures,
semantic products, deep structures, schemata,
and themes as the central component of cognitive
information processing. An important empirical
question concerns the relationship between mental
representations of the concrete surface structure
of the stimulus and the schematic structure that is
abstracted from the original stimulus (Sachs, 1967).
The present research is concerned with the
relationship between memory for factual
information about a person’s behavior [...]
based on [...] concerned with the relationship
between the importance or impact of an item of
[...]


Impression Formation Processes 


In 1946 Asch introduced an impression for- 
mation task in which experimental subjects 
are shown a sequence of trait adjectives at- 
tributed to a single fictional person and are 
asked to judge the person’s personality. This
task has dominated research on social judg- 
ment during the past 30 years. Anderson’s 
information integration theory (e.g., 1974) 
averaging model provides the most popular 
and most thorough account for the impression 
abstraction process in the Asch task. Ander- 
son’s model prescribes that information from 
stimulus elements or cues will be integrated 
according to an averaging rule to produce a 
response. Quantitatively: 


\[ R = w_1 s_1 + w_2 s_2 + \cdots + w_n s_n = \sum_{i=1}^{n} w_i s_i \]


where $w_i$ = weight and $s_i$ = scale value.¹ The
subscripts refer to the serial positions of the
piece of information in the sequence (i.e., $w_1 s_1$
denotes the weight and scale value of the first
trait in an Asch task). The restriction that
the weights must sum to one makes the model
an averaging model.
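
A minimal sketch of the averaging rule may help make the role of the weights concrete. The scale values and raw weights below are hypothetical, and the function name is an illustrative assumption; the only constraint taken from the text is that the relative weights sum to one.

```python
def averaged_response(weights, scale_values):
    """Weighted average of scale values, with weights normalized to sum to one."""
    if len(weights) != len(scale_values):
        raise ValueError("one weight is needed per scale value")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    relative = [w / total for w in weights]
    return sum(w * s for w, s in zip(relative, scale_values))


# Hypothetical scale values for four items presented in order.
scale_values = [150.0, 50.0, 150.0, 50.0]

# Larger early weights pull the response toward the first items (primacy);
# larger late weights pull it toward the last items (recency).
primacy_response = averaged_response([4, 3, 2, 1], scale_values)
recency_response = averaged_response([1, 2, 3, 4], scale_values)
print(primacy_response, recency_response)
```

With identical items, the two weighting patterns give different responses, which is how the model expresses order effects.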

One major finding obtained in the Asch 
paradigm concerns order effects. In general, 
early information seems to have a greater 
effect on the response than does subsequent 
information, creating a primacy effect. How- 
ever, given certain task conditions, such as 
continuous responding, final recall, and others, 
recency effects occur. 

According to Anderson’s averaging model, 
primacy occurs when the relative weights of 
the first items are larger than the relative 
weights of the last items. Recency occurs in 
the reverse situation, when the last items are 
weighted more heavily than the first items. 
The averaging model is used to estimate the 
effects of each serial position on the final 
rating. (These serial position effects are
proportional to the weights in the averaging
model.)
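
The proportionality between serial position effects and weights can be checked with a small simulation. This sketch assumes a full factorial of high (H) and low (L) items over four positions, hypothetical scale values, and hypothetical true weights; none of these numbers come from the study.

```python
from itertools import product

S_HIGH, S_LOW = 150.0, 50.0           # hypothetical scale values
TRUE_WEIGHTS = [0.1, 0.2, 0.3, 0.4]   # hypothetical weights summing to one


def rating(sequence):
    """Averaging-model response to a sequence of 'H' and 'L' items."""
    return sum(w * (S_HIGH if item == "H" else S_LOW)
               for w, item in zip(TRUE_WEIGHTS, sequence))


def position_effects(sequences, ratings):
    """Mean rating with H minus mean rating with L, at each serial position."""
    effects = []
    for pos in range(len(sequences[0])):
        high = [r for seq, r in zip(sequences, ratings) if seq[pos] == "H"]
        low = [r for seq, r in zip(sequences, ratings) if seq[pos] == "L"]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


sequences = list(product("HL", repeat=4))   # all 16 H/L sequences
ratings = [rating(seq) for seq in sequences]
effects = position_effects(sequences, ratings)

# Rescaling the effects to sum to one recovers the relative weights.
recovered = [e / sum(effects) for e in effects]
print([round(w, 3) for w in recovered])
```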

Anderson suggests both an inconsistency 
discounting explanation and an attention
decrement explanation to account for the
changing weights that cause the primacy effect.
According to the discounting explanation,
later adjectives that are inconsistent with 
earlier adjectives are discounted and weighted 
less. The attention decrement hypothesis 
posits that the earlier information counts 
more because the subjects attend less to each 
subsequent piece of information. 

Several experiments have been designed to 
test between these explanations by varying 
certain requirements of the standard procedure.
Stewart (1965) found the primacy
effect when the subjects made only one rat- 
ing after all the adjectives were presented, 
but he found a recency effect when subjects 
made a rating after each adjective. Anderson 
(1968) obtained recency when subjects pro- 
nounced aloud each adjective after it was 
presented. This recency occurred when final, 
continuous, and intermittent (rating once in 
the middle as well as at the end) response 
conditions were used. Hendrick and Constan- 
tini (1970) varied the degree of inconsistency 
between the traits and found that intertrait 
inconsistency did not change the primacy 
effect. They observed a recency effect when 
pronunciation of each adjective was required, 
and primacy without the pronunciation re- 
quirement. 

The results we have just reviewed are all 
supportive of the attentional account for serial 
position effects in impression formation. Only 
one result in the impression formation litera- 
ture appears to contradict such an account. 
Hendrick, Constantini, McGarry, and Mc- 
Bride (1973) found that variations in the 
time interval between trait presentations (1 
to 5.8 sec) did not affect primacy. However, 
even this result is not unequivocally contrary 
to attentional accounts. Thus, we have emphasized
attentional mechanisms in our theoretical
analysis of the present experimental task.

To look at order effects in greater detail, 
Anderson and Farkas (1973) used an experi- 
mental design that permitted estimation of 
serial position weights. They used a continuous
response task and found recency in the last serial
position (for each rating), with the early portions
of the curve being flat. Their explanation, seen as
a specification of the attention decrement
hypothesis, postulates a surface component that
refers to the short-term recency effect, reflecting
the immediate reaction to the stimulus just
presented, and a basal component that represents
the stable component of the impression reflected
in the flat portion of the curves. Fluctuations in
the content of the surface component correspond
to variations in attention.

1 For the purposes of this discussion the scale value
and weight of the initial impression (designated $w_0 s_0$)
are ignored.

One experiment in this tradition was concerned
with the impression-memory relationship
that is central in the present investigation.
In 1963 Anderson and Hubert conducted
an impression formation experiment that
replicated the standard primacy effect obtained
in earlier research. When subjects were told
they would be asked to recall the adjectives,
however, they found reduced primacy and, in
one condition, a slight recency effect. Anderson
and Hubert also found a strong recency
effect for recall in all conditions, even when
there was a primacy effect on the impression.
They cited the impression results as evidence
for their attention decrement hypothesis. The
reduced primacy was interpreted as evidence
that subjects anticipating recall equalized
attention to all items as they were presented.
More importantly, Anderson and Hubert concluded
that the difference between impression
and recall serial position effects implied “that
the formation of the impression involves a
memory process which is distinct from, and
not dependent on, the immediate verbal memory
for the adjectives just heard” (Anderson
& Hubert, 1963, p. 386).


Free-Recall Memory Processes


The free-recall paradigm in memory research
is analogous to the Asch impression
formation task. In free recall the serial position
effect is usually in the form of a U-shaped
curve (Klatzky, 1975). There is a primacy
effect, followed by an asymptote in the middle
section and then a recency effect. Variables
such as rate of presentation, word frequency
of occurrence in English, and list length
variations affect primacy and asymptote
portions of the curve (Glanzer, 1973).
Variations in delay between study and recall
affect the recency portion. These results
support the two-component memory hypothesis,
that is, that there is a limited short-term store 
and a very large long-term store. 

It is important to note that the simple two- 
store model is not universally popular. For 
example, one result, apparently discrepant 
from the two-component theory, is that of 
Bjork and Whitten (1974). They used
distractor-delay tasks to interfere with the
short-term component and thereby eliminate
recency effects. These variations did not affect
recency, however, and so Bjork and Whitten 
concluded that long-term storage must under- 
lie the recency effect. 


Overview 


The present experiment used the Anderson
and Farkas (1973) design with continuous
responding and included the Anderson and
Hubert (1963) free recall measure. In addition,
the Bjork and Whitten (1974) interpolated
distractor task, or delay between the
serial positions, was also employed. The logic
of our examination of the impression-recall
relationship is an extension of the Anderson
and Hubert (1963) reasoning. The focus is
on the correlation between the forms of
impression weight and recall probability serial
position curves. However, we carry the analysis
forward in two respects: first, by examining
complete position-by-position serial curves,
and second, by varying interpolated response
tasks that previous research has shown will
affect impression and memory processes. Two
interpolated response tasks were chosen, math
problems and trait ratings, because we predicted
that they would have differential effects.
Since traits are relatively similar to
behavior sentences, we hypothesized that the
trait task would interfere more on both the
impression and recall tasks than would the
math task.
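
The comparison of curve shapes described above amounts to correlating two four-point serial position curves. The following sketch uses hypothetical values for both curves (a recency-shaped set of impression weights and a U-shaped set of recall probabilities); the function name and the numbers are illustrative assumptions, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical serial position curves for positions 1 through 4.
impression_weights = [0.15, 0.20, 0.25, 0.40]   # recency-shaped
recall_probability = [0.60, 0.40, 0.40, 0.60]   # U-shaped

r = pearson_r(impression_weights, recall_probability)
print(f"r = {r:.3f}")
```

A low correlation between the two curves is the kind of pattern that would be read as consistent with separate impression and recall processes; with only four positions, any single correlation of this sort is necessarily imprecise.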

Two recent papers (Riskey, 1979; Rywick
& Schaye, 1974) report related extensions of
the Anderson and Hubert approach. For the
present, we will only note that these investigations
reached contradictory conclusions on
the issue of the impression-memory relationship.
Riskey concurred with Anderson and
Hubert that abstract impression and item-specific
representations are independent, while
Rywick and Schaye concluded that the impression
was dependent on specific item information
in long-term memory. We will discuss
these results more thoroughly after we
have described our own empirical investigation.


Method 


The subjects completed ratings of eight sequences, 
each of which was composed of four sentence pairs 
describing a person. We used sentences describing 
behaviors, instead of traits or paragraphs, because 
people often use sentences involving behaviors in 
person descriptions (e.g., Fiske & Cox, 1979). There
were five experimental conditions: three continuous 
response groups who made a cumulative likability 
rating after each sentence pair and were asked to 
recall as much as possible after the fourth or final 
rating for each stimulus person, and two final re- 
sponse groups who made only one rating after the 
fourth sentence pair for each stimulus person. 


Subjects 


Eighty Harvard-Radcliffe undergraduates recruited 
as paid subjects were tested in groups of 1 to 5 
people, with each subject in a separate cubicle. Sub- 
jects tested at the same time were always in the same 
experimental condition. The continuous response 
groups were tested as one phase of the experiment, 
and the final response groups were tested 5 months
later. The five experimental conditions can be
compared, as experimental procedures were uniform for
all groups. 


Procedure 


There were five cover sheets, three for continuous 
and two for final responding. In the first continuous 
responding group (no-task condition), the subjects 
were instructed to read two sentences about a person 
and then to make a rating below the sentences on an 
unmarked 200-mm scale with endpoints labeled most 
likable and least likable. They were to turn the page 
when told and then were to read another two
sentences. They then made a rating based on all the
previous information. They were told that after four
ratings they would be asked to recall as much as
possible about the stimulus person.

The two other continuous responding groups
included the same tasks described above, but added a
task in between each page of sentence pairs. The
second task was either a page of multiplication
problems (math condition) or a page of trait ratings
(trait condition). Instructions were to read and rate
a person and, upon signal, to turn the page and work 
on a separate task. The alternating tasks would be 
repeated four times (after each sentence pair) and 
would be followed by the recall task. 

In the fourth condition (final-only condition), 
subjects were instructed to read two sentences about 
a person, and on signal they were to turn the page 
and read another two sentences. After four sentence 
pairs a final likability rating was made. In the fifth 
condition (final with recall condition), instructions 
were the same as those in the fourth condition, ex- 
cept that the likability rating was followed by the 
recall task. 

During the experiment, subjects were given 10 sec 
to read and rate each sentence pair, 30 sec for the 
second task (in the alternating conditions), and 
90 sec for the free-recall measure. 


Stimuli 


The stimuli were pairs of sentences, with each pair 
composed of either a positive sentence and a neutral 
sentence or a negative sentence and a neutral sen- 
tence. An example of a positive-neutral pair is “Alan 
bought groceries for an elderly lady next door who 
was ill” and “Alan pressed the button and waited for 
the elevator to come.” For generalizability of stimuli, 
and so that each sentence appeared only once for 
each subject, there were 64 sentences (8 stimulus 
people each composed of 8 sentences): 16 positive, 
16 negative, and 32 neutral. To ensure that the
sentences within each category (i.e., positive, negative,
or neutral) would be considered roughly equivalent 
in terms of likability, the sentences were pilot tested 
before the experiment was conducted. The mean and 
spread of ratings for a larger group of sentences was 
examined, and 64 sentences were chosen (with the 
lowest positive mean at 139.40, the highest negative 
mean at 49.75, and the neutral means ranging be- 
tween 93.05 and 105.15 on a 200 mm scale). 


Design 


The design is a 5 × (4 × 2) factorial design, with
one between-subjects variable and two within-sub- 
jects variables. The five-level variable is the subject's 
task during the experiment session that was intro- 
duced above. The four-level repeated measure is 
serial position (1 through 4) and the two-level 
repeated measure is (high or low) valence. 

In order to reduce the number of observations per
subject, the research used a half-replication of the $2^4$
design² (e.g., Cochran & Cox, 1957) with 8 sequences
(each containing 4 sentence pairs) instead of the full
16. Each half-replication was counterbalanced in an
8 × 8 Greco-Latin square so that both stimulus people
and sentence pairs occur equally often in all positions.
This ensured that order of presentation was
not confounded with the order effects under study.
A complete replication of this design requires 8 subjects
in each condition. Two replications of the design
were run with 16 subjects for each between-subjects
condition, making 80 subjects in all.

2 The half-replication of the $2^4$ design was analyzed
as a 4 × 2 within-subjects design. The four-level
factor corresponds to input serial position and the
two-level factor corresponds to item valence at each
position (evaluatively positive or negative).

Figure 1. Relative impact of information presented at first through fourth
positions on impression ratings made after each position. (Note that each
subject in Panels A-C made four ratings, and the impact of the most recently
presented item was always the greatest. Conditions in which subjects made
only one rating, Panels D and E, show no recency and some primacy. The five
panels represent the between-subjects task conditions: [a] four impression
ratings, one after information presented at each serial position, and a recall
task after fourth rating [no-task condition]; [b] four ratings, final recall,
and irrelevant trait-rating filler after each rating [trait condition]; [c]
four interpolated ratings, final recall, and irrelevant math filler after each
rating [math condition]; [d] final rating, final recall, and no filler [final
with recall]; [e] final rating only, no recall, and no filler [final only
condition]. Horizontal axis: position in presentation sequence.)


Results


Impression Order Effects


The main analysis was done on the final
impression rating, Rating 4. There was one
between-subjects variable, task condition
(math/trait/no task/final with recall/final
only) and two within-subjects variables,
position (four levels) and valence (high/low).
[...] conditions did not. The serial position curves
(serial position weights) calculated for each
of the four ratings for no task, trait, and
math conditions are displayed in Figure 1
and clearly show a recency effect. In contrast,
subjects in the final response conditions
showed a primacy effect. The effects of
serial position are illustrated by the
main effect for position, F(3, 225) = [...],
p < .001. The serial position weights in Table
1 show that Position 4 has the greatest impact
on the final rating for the continuous response
conditions but not for the final response
conditions.

The most sensitive tests of the effects of 
the experimental tasks are the planned con- 
trasts differentiating among the shapes of 
the five task condition serial position curves. 
These contrasts partition the variance from 
the significant Task × Serial Position interaction
effect, F(12, 225) = 4.36, p < .001.
The task main effect was nonsignificant (p > 
.5). The striking effect in this analysis was 
the comparison between the final and con- 
tinuous responding conditions on the linear 
effect, F(1, 225) = 21.29, p < .001. 
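
The linear comparison between continuous and final responding can be illustrated with the usual linear trend coefficients for four equally spaced positions, (-3, -1, 1, 3). The sketch below computes only contrast values, not the F test, and the serial position weights in it are hypothetical placeholders rather than the values in Table 1.

```python
LINEAR = (-3, -1, 1, 3)  # linear trend coefficients for four positions


def linear_trend(weights):
    """Linear contrast over positions: positive for recency, negative for primacy."""
    return sum(c * w for c, w in zip(LINEAR, weights))


def average(values):
    return sum(values) / len(values)


# Hypothetical serial position weights (positions 1 through 4) per condition.
curves = {
    "no_task": [20.0, 20.0, 25.0, 35.0],
    "trait": [22.0, 22.0, 24.0, 32.0],
    "math": [18.0, 22.0, 24.0, 36.0],
    "final_with_recall": [30.0, 26.0, 24.0, 20.0],
    "final_only": [32.0, 26.0, 22.0, 20.0],
}
trends = {name: linear_trend(curve) for name, curve in curves.items()}

continuous = ["no_task", "trait", "math"]
final = ["final_with_recall", "final_only"]
difference = (average([trends[name] for name in continuous])
              - average([trends[name] for name in final]))
print(trends)
print(f"continuous minus final linear trend = {difference:.1f}")
```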

As in previous research using similar tasks 
and the serial position counterbalancing plan 
(Anderson & Farkas, 1973), several higher 
order interactions, concentrated in the Serial 
Position × Task × Replication effects, were
significant. 


Recall Order Effects 


The dependent measure for the recall task 
was the number of valenced (H or L) sen- 
tences recalled from each of the four serial 
positions. The amount of recall at each posi- 
tion was analyzed, using analysis of variance 
for the four-level within-subjects variable of 
serial position, the two-level within-subjects 
variable valence (H or L), and the between- 
subjects variable of task (trait/math/no task/ 
final with recall). The recall task showed the
standard U-shaped curve (both primacy and
recency) for all the conditions, although there
were differences in the sharpness of the
curves depending on the condition. Figure 2
shows the serial position effects in which the
no-task condition exhibits the largest primacy
and recency effects. The math and final with
recall conditions show intermediate curvature
across serial positions, whereas the trait con-
dition exhibits the flattest curve. These results
are reflected by the significant Task × Serial
Position interaction, F(9, 180) = 2.10, p <
.05, which shows the differential position
effects as a consequence of experimental con-
dition.

There was a main effect for serial position,
F(3, 180) = 19.23, p < .001, which indicates
that the location of a valenced sentence influ-
enced how likely the sentence was to be re-
called. The task main effect was nonsignif-
icant. There was also a main effect for valence
(H or L), F(1, 160) = 30.84, p < .001, such
that negative information was recalled more
frequently than was positive information
across all serial positions.

The number of errors in recall (either in-
trusions of previous stimuli presented or
recall made faulty by the inclusion of stimuli)
were analyzed, and there were no patterns for
either serial position or task effects. (The
error rates were very low; for intrusions, M =
.15, and for faulty recall, M = .07.)

A separate analysis of the recall of neutral
sentences revealed only a significant main
effect for serial position, F(3, 180) = 7.33,
p < .001. Neutral sentence recall exhibited
the typical U-shaped serial position curves
but no other systematic patterns.


Table 1
Relative Influence of Each Serial Position on Final Impression Rating

                                        Task condition

            Four ratings,  Four ratings,  Four ratings,  Final rating,  Final rating,
Serial      math filler,   trait filler,  no filler,     no filler,     no filler,
position    with recall    with recall    with recall    with recall    without recall   Row M

First           17.64          32.31          33.34          34.99          35.02         30.66
Second          22.89          36.06          35.56          41.78          45.92         36.44
Third           25.11          24.34          25.78          29.70          30.83         27.15
Fourth          71.30          65.50          46.31          30.82          27.23         48.23

Column M        34.23          39.56          35.25          34.32          34.75         35.62

Note. Table entries are proportional to information integration weights (for derivation, see Anderson &
Farkas, 1973).


[Figure 2 plot; horizontal axis: serial position, 1-4]
Figure 2. Recall serial position curves for the four
between-subjects conditions in which recall was
tested: no task (labeled no filler), math interpolated
response task, trait rating interpolated response task,
and final rating with recall task.


Relationships Between Impression and Recall 


Overall, recall curves differ markedly from 
the impression serial position curves. In par- 
ticular, there is a clear primacy effect for 
recall in all task conditions—although there 
is no such effect for impressions in continuous 
responding task conditions. It seems that
earlier information is not forgotten, but it is 
given less weight in impressions. 

At a finer level of analysis, correlations 
were calculated to measure the degree to 
which total valenced recall at each position 
corresponded to differential serial position 
weights. For the correlations, the impression 
difference scores (positive minus negative in- 
formation for each position) were calculated 
on the final (fourth) likability rating.¹ Ex-
amination of the mean correlations pooled
across the four serial positions for each of the
four relevant between-subjects conditions
(trait/math/no task/and final with recall)
shows that none of these four pooled mean
correlations is significantly different from
zero. But the no-task and math conditions
(r = .28 and r = .38, respectively) show a
higher correlation between recall and differen-
tial impression rating than do the trait and
final with recall conditions (r = −.15 and r =


degree of correlation between recall and dif-
ferential likability ratings.


Discussion 


The focus of the present experiment is on
the relationship between an abstract personal-
ity impression and the specific items of in-
formation on which that impression is based.
Anderson and Hubert (1963) and Riskey
(1979) have conducted research that uses a
method similar to the present experimental
paradigm and have concluded that the impres-
sion, once formed, is independent from the
representation of specific item information in
memory. Our conclusion is in agreement with
these investigators but is based on research
with rich behavioral stimulus materials, in-
volves an extensive variation in subject
processing tasks, and includes a more detailed
analysis of impression and recall results than


¹ For the Rating 4 analysis only, because half-
replications were used in this design, the higher order
interaction effects are confounded with replication.
The difference score at Position 4 is equal to the
four-way interaction of Replication × Difference
Score at Position 1 × Difference Score at Position 2 ×
Difference Score at Position 3. The pattern of the
first three positions in any sequence is identical in
the two replications (i.e., HHH). It is the fourth
sentence pair (H or L) that differs for the two rep-
lications.

The valence of the information at Position 4 is
determined by the levels of Positions 1-3 and replica-
tion (1 or 2). Thus, the alias for Position 4 is the
interaction of Position 1 × Position 2 × Position 3 ×
Replication (Cochran & Cox, 1957, p. 276). Since the
other three ratings were not affected by the informa-
tion at Position 4, that factor could be dropped from
the design.




earlier experimental work. In this discussion 
we will briefly summarize our analyses of im- 
pression and recall results and then end with 
a theoretical note on the significance of these 
results in the context of the conclusions of 
Anderson and Hubert and of Riskey. 


Impression Order Effects 


First, the present results using sentences as 
stimuli reproduce the impression order effects 
obtained in previous research using traits, 
paragraphs, or photographs (Anderson & 
Farkas, 1973; Anderson & Lampel, 1965; 
Asch, 1946). This suggests that information 
embodied in the sentence descriptions of be- 
haviors is used in the same way that informa- 
tion contained in the other stimulus materials 
is used. Thus, order effects do not seem sen- 
sitive to the stimuli used (traits, sentences, or 
paragraphs). This insensitivity is in marked 
contrast to the dependency of order effects 
on the method of presentation of the stimuli. 
As mentioned earlier, continuous responding 
(these data; Anderson & Farkas, 1973; 
Stewart, 1965), pronouncing aloud (Ander- 
son, 1968), and warned recall (Anderson & 
Hubert, 1963) all serve to change the primacy 
effect into a recency effect. Thus, although the 
order effects are fragile in the sense that they 
can be changed when task requirements 
change, these same order effects are quite 
stable across variations in the stimuli em-
ployed.

Second, the final versus continuous respond- 
ing variation in impression rating task re- 
quirements clearly affected the shape of the 
serial position curve, as shown by recency
effects with continuous responding and no
recency with final responding (Panels A, B,
and C vs. Panels D and E in Figure 1). The
distractor conditions (math and trait tasks)
showed more recency than did the no-task
condition (Panels B and C vs. Panel A in
Figure 1). The impression-irrelevant math
problems and trait ratings may interfere with
the process of impression formation, with the
interruptions leading subjects to weight im-
mediate information the most in their ratings.

Third, the current no-task condition (Panel
A in Figure 1) replicated the Anderson and
Farkas (1973) recency effect, and our addi-
tion of a warned recall task did not change
the impression serial position curves. This 
differs from the Anderson and Hubert (1963) 
finding that warned recall decreased the pri- 
macy effect and sometimes produced recency. 
Finally, the results for all conditions sug- 
gest that negative information has more 
weight than positive, which accords with the 
negativity effects in the impression formation 
literature (Kanouse & Hanson, 1972). 


Recall Order Effects 


The recall curves for the no-task, math, and 
final with recall conditions (see Figure 2) are 
the usual U-shaped curves obtained in the free- 
recall task. The trait condition curve is flat- 
test but also shows slight primacy and recency 
effects. Although the time elapsed between the
start of the no-task and final with recall con- 
ditions and the recall task was 2 minutes 
shorter than in the distractor conditions, the 
math recall curve is quite similar to the 
curves from the nondistraction conditions. 
The reduced recency in the math and trait 
conditions could be due to either the four dis- 
tractor tasks themselves or to the final dis- 
tractor task, which delayed the recall task. 

The trait recall curve, considered in isola-
tion, is interesting because the Bjork and
Whitten (1974) finding of recency in the
presence of a distractor task was not repli-
cated. In one experiment, word pairs were
presented, with simple multiplication prob- 
lems between each pair and further multi- 
plication before the recall task. They ob- 
tained a U-shaped curve and suggested that 
this was evidence for long-term store effects 
on recency. In another experiment, they found 
similar recency effects, whether or not there 
was a 30-sec final delay before recall, as long 
as the interpolated task between each word 
pair was presented. This is further evidence,
according to Bjork and Whitten, that recency
results from a long-term memory process. 

Bjork and Whitten suggested that the dis- 
criminability between items to be remembered 
caused the recency effect and is produced by 
using distractor tasks. These interruptions 
temporally separate the stimuli to make each 
piece of information distinct. The current re-
sults cast doubt on this explanation. Although
the math condition, which is similar to their 
interpolated task, replicated Bjork and Whit- 
ten’s recency effect, the trait condition did 
not. It seems likely that the nature of the 
distractor task is important and that these 
tasks do not simply temporally distinguish 
between the stimuli. The distractor task may 
make the items more distinct by means of a 
figure-ground relationship. Since the math is 
clearly different from the verbal stimuli, it 
may separate each item of information more 
than do traits, which are more similar to sen- 
tences. 


Relationships Between Impression and Recall 


Under continuous rating conditions, sub- 
jects clearly weight the earliest information 
the least in their final impression rating, but 
this is not because they no longer remember 
it. All of the relevant recall serial position 
curves show clear primacy and recency effects. 
A more complete summary of the relation- 
ship between mean recall and mean impres- 
sion weight would be obtained by plotting 
the mean recall at each position (points 
plotted in Figure 2) against the impression 
serial position weights on Rating 4 (solid 
points in Figure 1). Such a summary reveals 
that there is no simple relationship between 
recall and impressions, at least as indicated 
by their respective mean serial position 
curves. The nonsignificant correlations be-
tween recall and differential impression rat-
ings based on the data from each individual
character were reported above as further 
evidence that no simple relationship between 
memory and the impression exists.

The final responding conditions provide 
a replication of the Anderson and Hubert 
(1963) and the Riskey (1979) findings. Im- 
pression serial position curves show slight 
primacy and no recency, whereas recall serial 
position curves show dramatic recency and 
some primacy. Our interpretation of these 
results is in complete agreement with the An-
derson and Hubert and Riskey accounts.
Attention decreases across the sequence of
stimulus items, producing the primacy effect
in the impression serial position curve. Mem-
ory for individual stimulus items is not rele-
vant to the impression, once their evaluative
content has been integrated into the current 
impression. Thus, the U-shaped memory 
serial position curve is unrelated to the 
monotonically decreasing impression serial 
position curve. 

A somewhat different picture emerges from 
an examination of results from continuous 
responding conditions. Here a large recency 
effect is consistently obtained for the most
recent item on each of the four likability
ratings. Anderson and Farkas (1973) have
proposed a two-component model for a simi- 
lar task, distinguishing between surface and 
basal components of an impression. Our in- 
terpretation is that each impression rating is 
dependent on information immediately avail- 
able in short-term memory (surface compo- 
nent) as well as on the enduring impression 
that has accumulated across all previous stim-
ulus events (basal impression). Thus, under
continuous responding conditions the impres-
sion rating is dependent on immediately
available item information as well as on the 
more abstract, item-independent, cumulative 
impression. The increase in recency effect 
magnitude with increasingly demanding tasks 
interpolated between ratings (no task to
trait task to math task) adds support to this
interpretation. (Note that this ordering of
tasks by difficulty or cognitive demand is
a post hoc interpretation.)
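
The surface/basal distinction can be made concrete with a toy weighted-average sketch in Python. Everything in it is an assumption made for illustration: the function name, the single surface-weight parameter, the use of an equal-weight running mean as the basal impression, and the scale values. It is not the model fitted by Anderson and Farkas (1973), nor the analysis used in the present study.

```python
def toy_rating(item_values, surface_weight):
    """Blend the most recent item (surface) with the mean of earlier items (basal).

    item_values: scale values of the items seen so far, in presentation order.
    surface_weight: hypothetical share of the rating carried by the latest item (0-1).
    """
    if not item_values:
        raise ValueError("at least one item is required")
    if not 0.0 <= surface_weight <= 1.0:
        raise ValueError("surface_weight must lie between 0 and 1")
    current = item_values[-1]
    earlier = item_values[:-1]
    if not earlier:
        # With a single item there is no accumulated (basal) impression yet.
        return float(current)
    basal = sum(earlier) / len(earlier)
    return surface_weight * current + (1.0 - surface_weight) * basal


if __name__ == "__main__":
    # Three favorable items followed by one unfavorable item (hypothetical values).
    sequence = [10, 10, 10, 0]
    for w in (0.25, 0.50, 0.75):
        print(f"surface weight {w:.2f}: rating = {toy_rating(sequence, w):.2f}")
```

In this sketch a larger surface weight pulls the rating toward the most recent item, which is one way to picture a recency effect that grows as the interpolated task becomes more demanding.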

The present results from both final re- 
sponding and continuous responding task 
conditions are quite consistent with an ac-
count based on attentional mechanisms (An-
derson & Hubert, 1963; Riskey, 1979) and
on a surface-basal component distinction
(Anderson & Farkas, 1973). The emphasis
on attentional mechanisms is strengthened
even further by a detailed comparison of
response requirements in continuous and final
responding rating tasks. Under continuous
responding conditions, the impression rating
scale was printed on each page containing
behavior descriptions, located below those
descriptions. These are the conditions under
which large recency effects were obtained.
Thus, the information that has a large im-
pact on the rating is not only recently avail-
able to comprehension and short-term mem-
ory mechanisms, it is directly available in
the visual (attentional) field. On the other 
hand, under final responding conditions, the 
final (single) rating scale was printed on the 
page following the last page of behavior de-
scription sentences. Thus, under final re-
sponding conditions where no recency and
slight primacy effects were obtained, imme- 
diate attention was not dominated by the 
final most recent information. 

We would like to close this report by 
noting that our conclusion is an argument 
for the relative independence of a general 
abstract impression and the specific informa- 
tion on which the abstraction was based. 
Furthermore, we have cited a simple atten- 
tional mechanism as an explanatory prin- 
ciple to account for the differential impact of 
recent events on the evaluative impression 
under continuous and final responding con- 
ditions. This explanation presents a picture 
of the subject as a person whose behavior 
is stimulus driven and whose thoughts are 
dominated by the immediate environment 
(Kahneman & Tversky, 1972; Taylor & 
Fiske, 1978). Although we believe our sub- 
jects’ judgments were dominated by the given 
stimulus information, we also attribute consid- 
erable cognitive complexity to comprehension, 
valuation, and integration operations that are 
assumed to underlie the impression formation 
process. 
Self-Focus, Felt Responsibility, and Helping Behavior


Shelley Duval, Virginia Hensley Duval, and Robert Neely 
Two experiments were designed to determine if juxtaposing self-focus and
salient distressed others would (a) increase self-attribution of responsibility for 
those needy others and (b) increase willingness to help those others. In Ex- 
periment 1, these hypotheses were tested by exposing subjects to their images 
on a TV screen 4 minutes before, immediately before, immediately after, or 
4 minutes after seeing a videotape on victims of the venereal disease epidemic. 
As predicted, results indicated that subjects who saw their images immediately 
before or after seeing the videotape on victims of venereal disease felt more 
responsibility for and were more willing to help that group than did subjects 
in other conditions. In Experiment 2, subjects filled out a biographical ques- 
tionnaire either 4 minutes before or immediately before seeing a videotape on 
poverty-stricken Latin Americans. Results from this experiment also confirmed 
predictions. Subjects who filled out a biographical questionnaire immediately 
before seeing a videotape on poverty in Latin America felt more responsibility 
for and were more willing to help the distressed group than did subjects who 
completed the questionnaire 4 minutes prior to exposure to the tape on Latin 
American poverty. Additional evidence indicated that this effect is probably 
not mediated by the sole operation of the self-evaluative mechanism posited 
by Duval and Wicklund or by any change in attitudes or norms regarding the 
distress of and/or necessity of helping distressed others. Implications of the 
self-focus/felt responsibility/helping behavior relationship for the general study 
of helping behavior are discussed. 


Research clearly demonstrates that help-
ing behavior is related to feelings of respon-
sibility for the welfare of others in distress
(Geer & Jarmecky, 1973; Schwartz, 1973;
Schwartz & Clausen, 1970; Tilker, 1970).
This relationship between feelings of respon-
sibility for the welfare of others and helping
behavior can be stated within the theoreti-
cal framework of attribution theory. Factors
that increase self-attribution of responsibility
for a distressed person will increase helping
behavior. Factors that decrease self-attribu-
tion of responsibility for a distressed person
will decrease helping behavior.

In retrospect, research based on an attri-
butional approach to helping behavior (Ickes,
Kidd, & Berkowitz, 1976; Piliavin, Rodin, &
Piliavin, 1969; Schopler & Matthews, 1965)
appears (with one exception, Wegner &
Schaefer, 1978) to have been guided by one
general question: What factors affect a per- 
son’s attribution of responsibility, given that 
the attributional processes are governed by 
rational rules or schema such as discounting? 
Recent developments in attribution theory 
(Duval & Duval, in press; Duval & Hensley, 
1976; Jones, 1976) indicate that attribution 
does not always conform to the logical pat- 
terns dictated by reason or common sense. 
This suggests that an attributional approach 
to the study of helping behavior might also 
attempt to determine what conditions affect a 
person’s attribution of responsibility, given 
that the attribution processes are not strictly 
governed by rational rules. The present re- 
search is addressed to this question. Specifi- 
cally, two studies were designed to determine 
whether conditions that affect the focus of 
attention also affect self-attribution of re-
sponsibility for the welfare of distressed per-
sons and thus helping behavior.

Heider (1958) suggests that attribution of 
responsibility reflects the processes of unit 
formation. To the extent that conditions pro- 
duce a unit relationship between some agent 
and some effect, that is, some event or situa- 
tion, responsibility for the effect will be at- 
tributed to the agent. Duval and Duval (in 
press) and Duval and Hensley (1976) pro- 
pose that the focus of attention is one vari- 
able that affects this process. To the extent 
that attention focuses on an agent, B, rather 
than on A, C, or others, the tendency to 
link B and an effect, X, in a unit relation- 
ship and thus to attribute responsibility for 
X to B should increase. Data from several 
studies support this hypothesis (Arkin & 
Duval, 1975; Duval & Wicklund, 1973; 
Pryor & Kriss, 1977; Storms, 1973; Taylor 
& Fiske, 1975).

Applied to a situation in which the dis- 
tress of other persons is or becomes salient, 
the focus of attention-attribution hypothesis 
suggests that increasing self-focus will in-
crease self-attribution of responsibility for
the welfare of these distressed persons. Given
the relationship between felt responsibility
and helping behavior, increasing attribution
of responsibility for distressed persons to self
should in turn increase willingness to help
those distressed persons. In order to test
these two hypotheses, subjects were exposed 
to a stimulus designed to increase self-focus 
for 1 minute either 4 minutes before, imme- 
diately before, immediately after, or 4 min- 
utes after they saw a documentary videotape 
designed to increase awareness of the plight 
of victims of the venereal disease (VD) epi- 
demic. Attribution of responsibility for the 
welfare of VD victims and willingness to help 
VD victims were then measured. It was pre- 
dicted that subjects in the immediately be- 
fore and after conditions would feel more 
responsible for and would be more willing 
to help victims of VD than would subjects 
in the 4-minute before or after conditions.

Before turning to the experimental proce-
dure, we might briefly discuss the basis of
the predictions regarding the temporal in-
terval between focusing on self, presentation
of the distressed group, and attribution of
responsibility to self, as well as the rationale
for choosing victims of VD as the potential
recipients of help.

The prediction regarding time interval,
focus of attention and self-attribution is de-
rived from the unit formation principle of
temporal proximity. This principle states
that decreasing the time interval that sepa-
rates the presence or occurrence of an agent
and an event increases the tendency to at-
tribute responsibility to the agent and vice
versa (Heider, 1958). Data from Michotte
(1963) clearly support this prediction. We
propose that the principle of temporal prox-
imity applies not only to the time interval
that actually separates the presence or occur-
rence of an agent and an event but also to
the time interval that separates periods of
increased awareness of the agent and the
event. From this perspective, decreasing the
time interval that separates the periods of
increased focal awareness of self and in-
creased awareness of a situation, such as
the distress of VD victims, should increase
the tendency to link self and the situation
in a unit formation. According to Heider
(1958), this increased tendency should lead
to greater attribution of responsibility for
the welfare of VD victims to self. Thus, sub-
jects who experience increased self-focus im-
mediately before seeing the videotape about
VD victims should attribute greater respon-
sibility for the welfare of that group to self
and should be more willing to help VD vic-
tims than should subjects who experience in-
creased self-focus 4 minutes before seeing
the videotape. In addition, data from Mi-
chotte (1963) suggest that decreasing the
time interval between the periods of increased
self-focus and increased awareness of the
distress of VD victims will increase self-at-
tribution regardless of the order in which
these periods of increased awareness occur.

With regard to the choice of VD victims,
we should first note that studies on felt re-
sponsibility and helping behavior vary widely
in terms of the number of help recipients,
the proximity of the subjects to those people,
and the extent to which situational cues
provide justification for feeling personal re-
sponsibility. In the present experiment, we
are particularly interested in determining
whether the attribution processes associated
with the focus of attention affect felt respon- 
sibility and thus helping behavior when the 
subject is essentially removed from a fairly 
large group of persons in distress and has 
few situational cues that justify attribution 
of responsibility for that group to self. Se- 
lection of victims of VD as the target popu- 
lation represents a major part of the attempt 
to create these conditions. 


Experiment 1 
Method 


Subjects. Subjects were 55 female undergraduates 
who volunteered for the study in partial fulfill-
ment of a course requirement. Subjects were ran-
domly assigned to four experimental conditions and
one control condition and were run individually. 

Procedure. Subjects were greeted by a female 
undergraduate experimenter and were seated in 
front of a television set. They were told that they 
were participating in an experiment designed to sur- 
vey students’ attitudes concerning certain social 
issues. The experimenter further explained that a
short documentary videotape had been prepared 
to familiarize them with some of the facts con- 
cerning the particular social issue being investigated. 
Subjects were told that they would be asked to 
indicate their attitudes concerning the social issue 
by filling out a short questionnaire following pre- 
sentation of the videotape. At this point, the ex- 
perimenter explained that the study was also con- 
cerned with determining whether a simple audio 
or audio and visual presentation of factual mate-
rial was more effective. Supposedly, some subjects
would only hear the audio portion of the video-
tape, whereas others would be exposed to both the
audio and the visual portions of the videotape. The
rationale for interest in the effectiveness of various
types of presentations was that previous experiments
had failed to produce conclusive evidence on this
point. All subjects were told that they were in the
audio and visual condition.

After explaining the purpose of the experiment, 
the experimenter warned subjects not to be sur-
prised if a live picture of themselves appeared on
the TV screen at some time during the experiment.
She indicated that the experiment would be sub-
mitted to the American Science Foundation for a
grant and that the foundation wanted all grant
proposals to include videotape recordings of all sub-
jects and their exact experimental situations. If
there were no questions, the experimenter told sub-
jects she had to warm up the videotape machine
for approximately 5 minutes, after which the docu-
mentary videotape would be shown and the ques-
tionnaire administered. The experimenter then
moved behind a partition in the room in order
to avoid contact with subjects prior to and after
presentation of the videotape. After approximately
5 minutes had elapsed, a 2-minute videotape was
shown. The videotape showed an adult male talk-
ing about victims of VD. He first presented statistics
concerning the epidemic spread of venereal disease 
to a significant percentage of the population. He 
then discussed the various harmful physical and 
mental consequences that victims of venereal disease 
typically suffer. His presentation did not discuss the 
causes of VD or responsibility for the welfare of 
victims of VD, nor did it request any form of help 
for VD victims.

After viewing the videotape, subjects remained 
seated for an additional 5 minutes while the ex- 
perimenter supposedly rewound the videotape and 
shut down the equipment. After 5 minutes had 
elapsed, the experimenter reentered the subject's 
portion of the cubicle and administered the de-
pendent measures.

Time interval and before-after manipulations. 
In all conditions the experimenter first went be- 
hind the partition and activated some equipment 
that made noises audible to the subject. In the 4 
minutes before condition, the experimenter then 
activated a TV camera positioned so that the sub- 
ject’s face appeared on the TV monitor. The sub- 
ject was exposed to her image for 1 minute. After 
turning the camera off, the experimenter waited 4 
minutes and then played the videotape. After show- 
ing the videotape, the experimenter activated other 
equipment that made noises audible to the subject, 
waited behind the partition for 5 minutes, and then 
administered the dependent measures. In the brief 
time interval before condition, the experimenter 
activated the equipment, waited for approximately 
4 minutes, and then activated the TV camera fo- 
cused on the subject. The subject was exposed to 
her image on the TV monitor for 1 minute. After
the 1-minute exposure, the experimenter turned 
the camera off, delayed for 1 to 2 sec, and then 
presented the videotape. After exposure to the 
videotape, the experimenter waited 5 minutes and 
then administered the dependent measures. In the
brief time interval after condition, the experimenter 
activated the equipment, waited for approximately 
5 minutes, and then presented the videotape. The 
end of the videotape was followed by a 1- to 2-
second delay. The experimenter then turned on
the live TV camera, exposed the subject to her 
image on the TV monitor for 1 minute, waited 
for approximately 4 minutes, and administered the 
dependent measures. In the 4 minutes-after condi- 
tion, the experimenter turned on the equipment, 
waited for 5 minutes, played the videotape, waited 
for 4 minutes, and then exposed the subject to her 
image on the TV monitor for 1 minute. The de- 
pendent measures were then administered. A control 
condition using similar instructions was also run. 
Subjects waited 5 minutes before and after seeing 
the videotape. However, no TV camera was visible 
to the subjects, and the subjects were never ex-
posed to a live image of their faces on the TV
monitor.
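
The five timelines described above can be laid out as data to confirm that the conditions differ in the gap between the self-focus period and the videotape. The Python sketch below is an illustration only: the event names and the helper function are hypothetical, the durations are the approximate values given in the procedure, and the 1- to 2-second delay is represented as 1.5 seconds by assumption.

```python
# Durations in minutes. Assumption: the "1 to 2 sec" delay is taken as 1.5 s.
BRIEF_DELAY = 1.5 / 60

# Each schedule lists (event, duration) pairs in order, ending when the
# dependent measures are administered.
SCHEDULES = {
    "4 minutes before": [("self_focus", 1), ("wait", 4), ("videotape", 2), ("wait", 5)],
    "immediately before": [("wait", 4), ("self_focus", 1), ("wait", BRIEF_DELAY),
                           ("videotape", 2), ("wait", 5)],
    "immediately after": [("wait", 5), ("videotape", 2), ("wait", BRIEF_DELAY),
                          ("self_focus", 1), ("wait", 4)],
    "4 minutes after": [("wait", 5), ("videotape", 2), ("wait", 4), ("self_focus", 1)],
    "control": [("wait", 5), ("videotape", 2), ("wait", 5)],
}


def gap_between(schedule, first="self_focus", second="videotape"):
    """Minutes between the end of the earlier event and the start of the later one.

    Works for either order of the two events; returns None if one is absent.
    """
    clock = 0.0
    spans = {}
    for name, duration in schedule:
        if name != "wait":
            spans[name] = (clock, clock + duration)
        clock += duration
    if first not in spans or second not in spans:
        return None
    (a_start, a_end), (b_start, b_end) = spans[first], spans[second]
    return b_start - a_end if a_start < b_start else a_start - b_end


if __name__ == "__main__":
    for condition, schedule in SCHEDULES.items():
        print(condition, gap_between(schedule))
```

Laid out this way, the two immediate conditions separate self-focus and the videotape by seconds and the two 4-minute conditions by about 4 minutes, while the order of the two events varies independently of the gap.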
Table 1
Mean Attribution of Responsibility for
Victims of VD to Self

Condition    Immediate    4-minute interval

Before          9.0             6.9
After          10.27            7.9
Control         6.5

Note. The larger the number, the greater the attri-
bution of responsibility to self. For each condition,
n = 11. VD = venereal disease.


Dependent measures. Five minutes after the 
videotape ended, subjects were given a question- 
naire to fill out. The questionnaire included items 
concerning the content and effectiveness of the 
videotaped presentation. It also included questions 
designed to assess subjects’ feelings of responsibility 
for the welfare of VD victims and willingness to 
help victims of VD. These questions were the 
following: 

1. To what extent would you say that the failure 
to control the current VD epidemic is due to the 
public’s, including your own, lack of concern rather 
than to the VD victim himself? 

2. To what extent would you be willing to take
a 1-hour course dealing with the prevention and
treatment of VD and then conduct two half-hour 
educational sessions with groups of college fresh- 
men? 

3. A VD clinic exists in the college community 
but is underfinanced and greatly understaffed. How 
much time per week would you be willing to spend 
doing volunteer work at the clinic? 

4. How large a financial contribution would you
be willing to make to the National Center for the 
Prevention of VD? 

All questions were accompanied by 15-point scales 
anchored at both ends with appropriate positive and 
negative labels. After subjects had completed the 
questionnaire, the experimenter checked for sus- 
picion and then revealed the true nature of the ex- 
periment and experimental procedures. 


Results 


An analysis of data obtained from ques-
tions concerning the content and effectiveness 
of the videotaped presentation was carried 
out to determine if the experimental manipu- 
lations affected the subject’s memory of the 
information presented on the tape or her 
impression of the effectiveness of the pre- 
sentation. No significant differences between 
conditions were found for content or for ef-
fectiveness (F < 1 in all cases). In addition,
no differences between the control condition
and any of the experimental conditions were
found on these measures (F < 1 in all cases).

The first experimental hypothesis suggests 
that a person should tend to see self as re- 
sponsible for the welfare of people in distress 
to the extent that a brief time interval sepa- 
rates the periods of increased self-focus and 
increased awareness of distressed persons. In
addition, this effect should occur regardless 
of the order in which increased self-focus 
and increased awareness of the distressed 
others occur. An analysis of the attribution 
of responsibility to self data (Table 1) is 
consistent with this prediction. Subjects in 
the brief time interval conditions felt more 
personal responsibility for the welfare of VD 
victims than did subjects in the 4-minute 
time interval conditions, F(1, 41) = 5.01, 
p < .05. There was no effect for before-after
or any interaction between before-after and
time interval (F < 1 in both cases). A com-
parison between the brief time interval con-
ditions and the control condition evidenced
the presence of a reliable difference, F(1, 50)
= 7.01, p < .05. A similar comparison be-
tween the 4-minute time interval conditions
and the control did not reveal the presence 
of a significant difference (F < 1). 

The second experimental hypothesis sug- 
gests that focusing on self immediately be- 
fore or after awareness of the distress of 
others is increased should result in increased
willingness to help those persons. The mean
responses to each of the questions designed
to measure willingness to help victims of
VD are presented in Table 2. Responses to
these questions were highly correlated (rs
ranging from .59 to .76) and were summed
for purposes of analysis. The results of this
analysis indicated that subjects in the brief
time interval conditions expressed greater
overall willingness to help victims of VD
than did subjects in the 4-minute time inter-
val conditions, F(1, 41) = 6.41, p < .05. No
other significant main effects or interactions
were found. Additional comparisons indi-
cated that subjects in the brief time interval
conditions were more willing to help than
were subjects in the control condition, F(1,
50) = 4.20, p < .05, whereas subjects in the
4-minute interval conditions were not
(F < 1).

Table 2
Mean Willingness to Help Victims of VD

                            No interval   No interval   4-minute interval   4-minute interval
Item                        before        after         before              after               Control
Teach class                    11.2          11.2             8.5                 7.9              7.0
Volunteer work at clinic        7.6           8.9             7.6                 6.0              8.2
Personal contribution           6.9           7.7             3.3                 6.6              5.2
Combined                       25.7          27.8            19.4                20.5             20.4

Note. Numbers represent means for each condition. The higher the number, the greater the willingness to
help.
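
The arithmetic behind the Combined row of Table 2 can be checked directly. The following is a minimal sketch, not part of the original analysis: because the three willingness items were summed for analysis, each Combined mean should equal the sum of the item means in its column. The dictionary keys and the function name are illustrative, and the entry that appears as 77 in the source is read here as 7.7 on the 15-point scale, an assumption that the check itself supports.

```python
# Item means from Table 2, keyed by condition (labels are illustrative).
# Assumption: the garbled "77" for Personal contribution is read as 7.7.
table2 = {
    "no_interval_before": {"teach": 11.2, "volunteer": 7.6, "contribution": 6.9, "combined": 25.7},
    "no_interval_after": {"teach": 11.2, "volunteer": 8.9, "contribution": 7.7, "combined": 27.8},
    "four_minute_before": {"teach": 8.5, "volunteer": 7.6, "contribution": 3.3, "combined": 19.4},
    "four_minute_after": {"teach": 7.9, "volunteer": 6.0, "contribution": 6.6, "combined": 20.5},
    "control": {"teach": 7.0, "volunteer": 8.2, "contribution": 5.2, "combined": 20.4},
}


def combined_mismatches(table, tol=0.05):
    """Return the conditions whose item means do not sum to the Combined mean."""
    bad = []
    for condition, row in table.items():
        total = row["teach"] + row["volunteer"] + row["contribution"]
        # Allow for rounding to one decimal place in the published table.
        if abs(total - row["combined"]) > tol:
            bad.append(condition)
    return bad


print(combined_mismatches(table2))  # an empty list means every column is consistent
```

Under the 7.7 reading, every column sums to its Combined entry, so the list is empty.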


Discussion 


The results from Experiment 1 clearly 
support both experimental hypotheses. Sub- 
jects who focused on self immediately before 
or after the plight of VD victims was made 
salient attributed greater responsibility for 
the welfare of that group to self and indi- 
cated greater willingness to expend time, 
effort, and money to help VD victims than 
did subjects who focused on self 4 minutes 
before or after seeing the videotape on VD 
victims. Comparisons between the experi- 
mental conditions and control condition in- 
dicate that these main effects were due to 
increases in attribution of responsibility to 
self and willingness to help in the brief time 
interval conditions rather than to any de- 
crease in the magnitude of those responses 
in the 4-minute time interval conditions. 

Alternate explanations. It might be ar- 
gued that putting the subject’s image on 
the screen immediately before or after the 
videotape had been shown caused subjects 
to suspect that the experimenter wanted 
them to feel personally responsible for vic- 
tims of the VD epidemic. This argument is 
challenged by the results in the 4-minute 
time interval/after condition. Subjects in this 
condition saw their image immediately before 
they filled out the dependent measures. If 
subjects were susceptible to the belief, “The 
experimenter is trying to get me to feel per- 
sonally responsible for the welfare of victims 
of VD by putting my face on TV,” it seems 
reasonable that introducing the subjects’ 
images immediately before the dependent 
measures were administered would be just
as likely to make them suspicious about the
motives of the experimenter as would intro-
ducing their images before or after the video-
tape. Since subjects in the 4-minute time in-
terval/after condition did not differ from
subjects in the control condition, it would 
be difficult to argue that the experimental 
results were due to experimental demand. 

Aside from an experimental demand ex- 
planation, it might be argued that increas- 
ing subjects’ self-focus in the brief time in- 
terval conditions affected helping behavior 
by increasing their efforts to conform to a 
generalized norm or standard of helping 
others (Wicklund, 1975). However, if in- 
creasing self-focus affected helping behavior 
via a direct self-evaluative mechanism (Du- 
val & Wicklund, 1972), it seems reasonable 
to expect increased willingness to help when 
self-focus was increased immediately before 
measures of helping were taken, that is, in 
the 4-minute time interval/after condition. 
Since there is no evidence of increased will- 
ingness to help in this condition, we must 
conclude that the self-attribution interpre- 
tation of the obtained effects, at least under 
these experimental conditions, must be given 
substantial consideration. 


Experiment 2 


The data from Experiment 1 are con- 
sistent with the two experimental hypotheses 
but do raise several issues. Since the self- 
attribution measure separated VD victims 
and subjects as members of the general pub- 
lic into two discrete groups that might be 
responsible for victims of the VD epidemic, 
it is conceivable that subjects actually con- 
sidered themselves as separate from the gen- 
eral public when responding to this question. 
To resolve this interpretational ambiguity,
the attribution-of-responsibility-to-self mea-
sure in Experiment 2 was more direct and
specifically asked subjects to determine the 
extent to which they were responsible for 
the people in question. 

Second, to provide evidence concerning the 
generality of the proposed relationship be- 
tween self-focus, felt responsibility, and help- 
ing behavior and to demonstrate the con- 
ceptual replicability of the results obtained in 
Experiment 1, Experiment 2 used a different 
technique to increase self-focus and pre- 
sented a different population as potential 
help recipients. To increase self-focus, sub- 
jects were asked to fill out a biographical 
questionnaire, a procedure that should in- 
crease self-focus (Duval & Wicklund, 1972) 
as effectively as a TV camera. The target
population consisted of poverty-stricken Latin 
Americans rather than VD victims. 

Third, we have assumed that increasing 
self-focus affects felt responsibility for a spe- 
cific group only if awareness of that group’s 
distress is increased immediately before or 
after self-focus is increased. This may not 
be the case. Increasing self-focus a few 
minutes before measuring felt responsibility 
for a particular group of distressed persons
may increase felt responsibility for that 
group even when awareness of the group’s 
distress has not been increased. To test this 
possibility, Experiment 2 included a control 
condition in which subjects filled out the 
biographical questionnaire but were not ex- 
posed to any material designed to increase 
awareness of poverty-stricken Latin Ameri- 
cans prior to administration of the dependent 
measures. 

Finally, the inclusion of the 4-minute time
interval/after condition in Experiment 1
renders an experimental demand or a direct
self-evaluation interpretation of the obtained
results unlikely. On the other hand, increas-
ing self-focus typically causes attitudes and
norms to change in the direction of any rele-
vant and positive reference group’s majority
position (Duval, 1976; Duval & Wicklund,
1972; Wicklund & Duval, 1971). The exist-
ence of this effect suggests that juxtaposing
the subject’s image and material related to
victims of VD may have caused the sub-
ject’s attitudes and/or norms concerning vic-
tims of VD to change in a direction dictat-
ing more helping behavior. From this point
of view, the increased willingness to help ob- 
served in the brief time interval conditions 
of Experiment 1 might have been due to 
changes in attitudes and/or norms rather 
than to the differences in felt responsibility.
We view this as an unlikely possibility, since
any mention of other people’s attitudes and
norms regarding victims of VD was care-
fully avoided. Nevertheless, it is a plausible
alternative explanation for the results in Ex-
periment 1 that cannot be ruled out. In
fact, no study on felt responsibility and help-
ing behavior has determined if the manipula-
tion of felt responsibility also affects atti-
tudes and norms. Thus, in Experiment 2, the
dependent measures included two items de- 
signed to measure subjects’ attitudes and 
norms toward the target population, poverty- 
stricken Latin Americans. 

The hypotheses of Experiment 2 were es-
sentially the same as those of Experiment 1.
To test these hypotheses, subjects filled out a
biographical questionnaire either 4 minutes 
before or immediately before seeing a video- 
tape on poverty-stricken Latin Americans. 
It was predicted that subjects in the brief 
time interval condition would feel more re- 
sponsibility for and would be more willing 
to help poverty-stricken Latin Americans 
than would subjects in the 4-minute time 
interval condition. 


Method 


Subjects. Subjects were 55 female undergradu-
ates who volunteered for the study in partial ful-
fillment of a course requirement. Subjects were
randomly assigned to two experimental and three
control conditions and were run individually.

Procedure. Subjects were greeted by a male ex-
perimenter and were seated in front of a tele-
vision set. From this point on, the experimenter’s
explanations of the purpose of the study and the
subject’s role in the study were similar to the
explanations given to subjects in Experiment 1.
After the purpose of the experiment had been
explained, subjects were told that they would also
be asked to fill out a short biographical question-
naire. Subjects were told that this information was
needed in order to determine whether participants in
the survey were all relatively homogeneous with
regard to personality and background. After these
instructions, the experimenter’s actions were essen-
tially the same as those in Experiment 1.
The 10-minute documentary videotape was com-
posed of segments taken from a film called High-
land Indians of Peru. These segments consisted of
fairly impactful scenes depicting the urban and 
rural poor of Peru and were accompanied by a 
spoken commentary. However, the videotape and 
commentary did not suggest that the film referred 
to a particular people or country and treated the 
scenes as though they were characteristic of large 
portions of the Latin American population. In addi- 
tion, the videotape did not discuss the causes of 
poverty in Latin America or responsibility for this 
situation, nor did it request any form of help for 
the Latin American poor.

After viewing the videotape, subjects remained
seated for an additional 5 minutes while the ex-
perimenter supposedly rewound the videotape and
shut down the equipment. After the 5-minute wait-
ing period had elapsed, the experimenter reentered
the subject’s portion of the cubicle and adminis-
tered the dependent measures.

Time interval manipulation. In the 4-minute 
time interval condition, the experimenter went be- 
hind the partition, reentered the cubicle, and handed 
the subject a biographical questionnaire saying:
“Please take about a minute to fill this out, and
let me know when you’ve finished.” After the
subject had finished the questionnaire and had
returned it to the experimenter, the experimenter
waited for 4 minutes. He then played the videotape
about poverty-stricken Latin Americans on the
subject’s TV monitor. In the brief time interval
condition, the experimenter turned on the equip-
ment, waited for 4 minutes, and then asked the
subject to fill out the biographical questionnaire.
After the subject had finished the questionnaire
and had returned it, the experimenter played the
documentary videotape on the subject’s TV monitor.

Biographical questionnaire. The biographical ques-
tionnaire was designed to require subjects to think
about themselves as objects but purposely avoided
asking any questions that might sensitize them to
responsibility for the poor in Latin America.
The items included on the questionnaire were the
following: (a) age; (b) sex; (c) scholastic major;
(d) intended area of employment after completing
college; (e) hobbies; (f) briefly describe your
physical characteristics (size, weight, looks, etc.);
(g) briefly describe your most unique character-
istic. The item concerning uniqueness was consid-
ered the critical question in terms of increasing self-
focus, since data (Duval, 1976; Duval & Duval,
Note 1; Duval & Siegel, Note 2) indicate that in-
creasing the salience of a person’s novelty as an
object in the world increases the tendency for
attention to focus on self.

Control conditions. In the no-biographical-ques-
tionnaire control condition, the experimenter fol-
lowed the same procedures used in the experimental
conditions, except that the biographical question-
naire was not mentioned and the subject did not
fill out such a questionnaire. The video-
tape on poverty in Latin America was shown
approximately 5 minutes after the experimenter’s
initial instructions were completed. In the no- 
videotape control condition, the subject filled out a 
biographical questionnaire but did not see the 
videotape concerning poverty-stricken Latin Ameri- 
cans. The biographical questionnaire was adminis- 
tered approximately 4 minutes after the experi- 
menter had completed his initial instructions. In the 
no-biographical-questionnaire/no-videotape control 
condition, the subjects neither filled out the bio- 
graphical questionnaire nor saw the videotape on 
poverty in Latin America. In all control conditions, 
the time interval between the end of the experi- 
menter’s initial instructions and administration of 
the dependent measures was approximately the same 
as the corresponding time interval in the experi- 
mental conditions. 

Dependent measures. The dependent measures
included the following items, listed in order of 
presentation: 

1. The level of poverty suffered by many Latin 
Americans is totally undesirable. 

2. The level of poverty suffered by many Latin 
Americans ought to be substantially reduced. 

3. To what extent do you feel any personal re- 
sponsibility for the poverty-stricken people of Latin 
America? 

4. To what extent would you be willing to take 
a 1-hour course dealing with poverty in Latin
America and then conduct two half-hour educa-
tional sessions with groups of college freshmen?

5. Several agencies dealing with poverty in Latin 
America exist in Los Angeles but are understaffed.
How much time per week would you be willing to
spend doing volunteer work at one of these
agencies?

6. How large a personal financial contribution 
would you be willing to make to the Agency for 
Latin American Development? 

All questions were accompanied by 15-point scales 
anchored with the appropriate positive and negative
labels. After subjects had completed the question-
naire, the experimenter checked for suspicion and
informed the subject of the actual purpose of the 
experiment and experimental procedures. 


Results 


Data from Experiment 1 indicated that
subjects in the brief time interval conditions 
attributed more responsibility for victims of 
VD to the general public including self. This 
was interpreted as reflecting greater attribu- 
tion of responsibility to self. As was men- 
tioned, however, this measure could have re- 
flected the subject’s attribution of responsi- 
bility to the general public but not to self. 
Experiment 2 attempted to resolve this ques- 
tion by using a more directly phrased ques- 
tion concerning the subject’s attribution of
responsibility to self. As predicted (see
Table 3), subjects who completed the bio-
graphical questionnaire immediately before
seeing the documentary videotape on poverty-
stricken Latin Americans attributed more re-
sponsibility to self than did subjects who
completed the questionnaire 4 minutes be-
fore viewing the videotape, t(20) = 2.57, p
< .02. An additional analysis using the New-
man-Keuls procedure indicated greater attri-
buted self-responsibility in the brief time in-
terval condition than in any other condition
(p < .05).

Table 3
Mean Attributed Responsibility and Willingness to Help

                            Brief time   4-minute        No bio    No video   No bio control/
Item                        interval     time interval   control   control    no video control
Responsibility                 5.24          3.06          2.81      2.54          3.19
Teach class                    8.31          5.10          4.40      3.46          3.77
Volunteer work at clinic       7.19          3.57          4.32      4.42          3.58
Personal contribution          6.24          3.29          4.00      3.59          4.20
Combined                      22.34         11.96         12.72     11.45         11.65

Note. For each condition, n = 11. Numbers represent means for each condition. The higher the number, the
greater the responsibility attributed to self and willingness to help. Means with similar subscripts do not
differ.

Data from Experiment 2 also replicate the
findings in Experiment 1 with regard to will-
ingness to help the target population (Table
3). Subjects in the brief time interval con-
dition exhibited greater willingness to ex-
pend time, effort, and money (rs ranged
between .76 and .82 and were summed for
analysis) to help the poor of Latin America
than did subjects in the 4-minute time inter-
val condition, t(20) = 2.66, p < .02. An ad-
ditional analysis using the Newman-Keuls
procedure indicated greater willingness to
help in the brief time interval condition than
in any other condition (p < .05).
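
The degrees of freedom attached to the two Experiment 2 comparisons can be checked against the design. The following is a minimal sketch under stated assumptions: the tests are taken to be independent-samples t tests with a pooled variance estimate, comparing the two experimental conditions with 11 subjects each (the cell size given in the note to Table 3). Under those assumptions the expected value is n1 + n2 - 2 = 20, which agrees with the 20 degrees of freedom reported. The function name is illustrative.

```python
def pooled_t_df(n1: int, n2: int) -> int:
    """Degrees of freedom for an independent-samples t test with pooled variance."""
    if n1 < 1 or n2 < 1 or n1 + n2 <= 2:
        raise ValueError("need at least three observations across two non-empty groups")
    return n1 + n2 - 2


# Experiment 2: 55 subjects in 5 conditions (2 experimental, 3 control),
# assuming equal cell sizes as stated in the note to Table 3.
n_per_condition = 55 // 5
print(n_per_condition)                                 # 11
print(pooled_t_df(n_per_condition, n_per_condition))   # 20
```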


Experiment 2 was also designed to deter-
mine whether the experimental manipulation
of self-focus affected subjects’ attitudes
or norms concerning poverty-stricken Latin
Americans. As is evident in Table 4, ex-
pressed attitudes and norms did not differ
between conditions (p > .20).

Table 4
Mean Attitude and Norm Regarding Poverty in Latin America

                            Immediately   4 minutes   No bio    No video   No bio control/
Level of poverty            before        before      control   control    no video control
Totally undesirable           13.34
Ought to be substantially
  reduced                     12.85         13.22      14.67     12.21         13.74

Note. Numbers represent means for each condition. The higher the number, the stronger the attitude or
norm.


Conclusions


A number of conclusions regarding the
relationships between self-focus, attribution
of responsibility to self, and helping behavior
can be drawn from the results obtained in
Experiments 1 and 2. First, it appears that
increasing self-focus increases self-attribution
for a situation or event. This rela-
tionship obtains only to the extent that the
periods of increased self-focus and awareness
of the situation or event are separated by a
relatively short time interval. These results
suggest that the unit formation principle of
temporal proximity does apply to the time
interval that separates periods of increased
self-focus and increased awareness of situa-
tions and events, as well as to the time
interval that separates the actual presence
or occurrence of agents, situations, and
events. In and of itself, this conclusion has
obvious implications for dealing with both
the theoretical and empirical relationship be-
tween self-focus and self-attribution.

Second, juxtaposing self-focus and a salient
group of distressed persons increases willing-
ness to expend time, effort, and money to
help that group. The failure to find increased
self-attribution and willingness to help in
the 4-minute interval/after condition of Ex-
periment 1 suggests that an experimental
demand or a direct self-evaluation interpre-
tation (Duval & Wicklund, 1972) of this re-
lationship is not highly plausible. In Experi-
ment 2, increasing self-focus produced differ-
ences in felt responsibility for and willingness
to help the poor in Latin America without
differentially affecting subjects’ attitudes and
norms toward the target population and their
difficulties. These data suggest that the effect
of self-focus on helping behavior is not
caused by change in attitudes and/or norms.
Having ruled out these alternative explana-
tions, we conclude that juxtaposing self-focus
and a salient group of distressed persons
affects helping behavior by increasing felt
responsibility for the welfare of that group.
However, this conclusion must be tempered
by the results obtained by Federoff and Har-
vey (1976), which suggest that some as yet
unidentified variable may affect the relation-
ship between self-focus and internal versus
external attribution for a negative event.

The obtained relationship between self-
focus and helping behavior has other impli-
cations for the study of helping behavior.
Specifically, any conditions that affect self-
focus when distressed others are salient should
affect felt responsibility and thus helping be-
havior. Since the focus of attention is gen-
erally attracted to “important” or “novel”
stimuli, conditions that increase or decrease
the relative situational importance or novelty
of an individual when distressed others are
salient should increase or decrease that per-
son’s level of self-focus, felt responsibility,
and tendency to help the needy others. The
relevance of this proposition for the general
study of helping behavior has been demon-
strated in a recent experiment. Using the
notion that the novelty of two interacting
groups is affected by the number of
people in each group (Duval, 1976; Duval
& Wicklund, 1972), Wegner and Schaefer
(1978)


of situational


These hypotheses were confirmed. Although
measured, sub-
tended to focus
on self and to help a needy person less than
did subjects run singly; individual subjects
confronted with three victims focused on
self and helped those victims more than did
individual subjects faced with a single victim.

The novelty/self-focus/felt responsibility/
helping behavior relationship may also apply
to the relationship between distinctive suita-
bility (Schwartz, 1970), dependency (Berko-
witz, 1972), and helping. A person who is
one of a “limited and uniquely qualified”
group of potential helpers (high distinctive
suitability) would be more novel in that
situation than a person who is merely one
of many people who could help (low dis-
tinctive suitability; Schwartz, 1970). To the
extent that a person is depended on by
others, that person is defined as uniquely im-
portant with respect to helping others. Thus,
increasing distinctive suitability or depend- 
ency may increase a person’s tendency to 
help others in distress because those manipu- 
lations affect the individual’s level of situa- 
tional novelty or importance, self-focus, felt 
responsibility and, thus, helping behavior. 

In summary, research has found a strong
empirical relationship between felt responsi-
bility and helping behavior. The present re-
search demonstrates that juxtaposing self-
focus with a salient group of needy others
increases felt responsibility for and willing-
ness to help those others. We have suggested
that the effects of distinctive suitability, de-
pendency, and relative number of potential
helpers and victims on helping behavior may
be mediated by self-focus and attribution of
responsibility to self. Thus, the self-focus/
felt responsibility relationship may have sub-
stantial significance for the general study of
helping behavior.


Sex Role Identity and Its Relationships to Sex Role Attributes
and Sex Role Stereotypes 


Michael D. Storms 


University of Kansas 


Kagan’s and Kohlberg’s theories of sex role identity are examined in this paper. 
Both theories premise relationships among an individual’s sex role identity, 


variables. Beyond that, the results generally supported Kohlberg’s theory over 
Kagan’s. Significant correlations were found where Kohlberg’s theory would 


This research was supported in part by University
of Kansas General Research Fund Grants 3800. The
author is indebted to Robert Helmreich, Kevin
McCaul, and Shelley Taylor for their comments on
an earlier draft and to Scott Lambers for his
invaluable research assistance.
Requests for reprints should be addressed to
Michael D. Storms, Department of Psychology,
Fraser Hall, University of Kansas, Lawrence, Kansas
66045.

Copyright 1979 by the American Psychological Association, Inc. 0022-3514/79/3710-1779$00.75


Over the past decade, researchers have
amassed considerable data on sex role stereo-
types (e.g., Rosenkrantz et al., 1968) and
masculine and feminine sex role attributes
(e.g., Bem, 1974; Spence & Helmreich, 1978).
Relatively less attention has been paid to self-
concepts of masculinity and femininity—that
is, to sex role identity. Although sex role iden-
tity has existed as a theoretical construct for
some time (Kagan, 1964; Kohlberg, 1966),
there has been no widely accepted measure of
sex role identity and little research on its
relationships to sex role attributes and sex
role stereotypes.

Kagan (1964) and Kohlberg (1966) have
both defined sex role identity as an acquired
self-concept of being masculine or feminine,
and they have proposed that sex role identity
is closely related to sex role stereotypes and
sex role attributes. The two theorists differ
sharply, however, about the causal structure
of these relationships. Kagan postulates that
sex role identity is the product of differences
between individuals’ sex role attributes and
their perceptions of sex role stereotypes;
Kohlberg believes that sex role identity is the
cause of those differences. Figure 1 illustrates
the causal structure of these two theories.

Figure 1. Kagan’s (1964) and Kohlberg’s (1966)
models of the relationships among sex role identity,
attributes, and stereotypes. (Diagram: boxes labeled
Sex-Role Identity, Sex-Role Attributes, and Sex-Role
Stereotypes, with one arrow style for Kagan’s model
and another for Kohlberg’s model.)

Specifically, Kagan (1964) theorizes that
individuals compare their own attributes
against sex role stereotypes to arrive at a rela-
tivistic self-concept of their sex role identity.
In Kagan’s (1964) words:

Sex-role identity represents the degree to which an
individual regards himself as masculine or feminine.
. . . The degree of match or mismatch between the
sex-role standards of the culture and an individual’s
assessment of his own overt and covert attributes
provides him with a partial answer to the question,
“How masculine (feminine) am I?” (p. 144)


In short, Kagan views sex role identity as the 
product of sex role stereotypes and sex role 
attributes. 

In contrast, Kohlberg (1966) asserts that 
sex role identity is firmly established early in 
life and thereafter serves to mediate the influ- 
ence of sex role stereotypes on the individual’s 
development of sex role attributes. Sex role 
identity guides the individual’s attachment to, 
evaluation of, and desire to emulate sex role 
prototypes (specific adult role models) and 
sex role stereotypes (abstract role models). 
Sex role attributes, according to Kohlberg’s 
theory, are the product of the individual’s sex 
role identity and perception of sex role stereo- 
types. 

The theories described above lead to many 
similar hypotheses about the relationships 
among sex role identity, sex role stereotypes, 
and sex role attributes. Both theories predict 
a multiple correlation among the three sex 
role variables—knowing any two of the vari- 
ables should enhance prediction of the third. 
Both theories also predict a correlation be- 
tween sex role attributes and sex role identity, 
although for opposite reasons—either because 
attributes influence identity (Kagan) or be-
cause identity influences attributes (Kohl-
berg).

The two theories differ most noticeably in
their hypotheses about the relationships be-
tween sex role stereotypes and the other two
variables. According to Kagan’s theory, sex 
role stereotypes and sex role attributes inter- 
act to determine sex role identity. Thus, 
Kagan’s theory predicts a correlation between 
stereotypes and identity but not (necessarily) 
a correlation between stereotypes and at- 
tributes. Conversely, according to Kohlberg’s 
theory, sex role stereotypes and sex role iden- 
tity interact to determine sex role attributes. 
Kohlberg’s theory predicts a correlation be- 
tween stereotypes and attributes but not 
(necessarily) between stereotypes and iden- 
tity. 

The relationship between sex role stereo- 
types and sex role attributes has already been 
investigated in a study by Spence, Helmreich, 
and Stapp (1975). Briefly, they found no 
correlation between these two variables—a 
finding that would support Kagan’s theory 
and contradict Kohlberg’s. Unfortunately, 
there are several methodological problems 
with the Spence et al. study that cast some 
doubt on the relevance of their data to the 
present discussion. Examination of these 
problems will require a detailed review of the 
procedures used. $ 

The main purpose of Spence, Helmreich, 
and Stapp’s (1975) study was to introduce & 
new measure of sex role attributes, the Per- 
sonal Attributes Questionnaire (PAQ). Spence 
et al. created the PAQ to measure stable, 
underlying personality traits associated Wi 
masculinity and femininity, but they were 
also aware of a possible alternative interpreta- 
tion of their scale. Spence et al. noted that 
items on masculinity-femininity scales (in- 
cluding the PAQ) are usually adopted from 
lists of sex role stereotypes. Thus, an individual’s
endorsement of a particular item could
merely indicate his/her knowledge of and
identification with the corresponding stereotype
and not an accurate reflection of his/her
actual traits. 

Spence, Helmreich, and Stapp (1975) proposed
a test of the competing stereotype interpretation
of the PAQ. They reasoned that,
according to the stereotype hypothesis, people’s
self-ratings on masculinity and femininity
scales should be correlated with their perceptions
of sex role stereotypes. For example,
people who believe that a particular trait is 
very characteristic of their sex in general 
should rate themselves higher on that trait 
than will people who think the trait is less 
characteristic of their sex. In contrast, ac- 
cording to the trait hypothesis, self-ratings 
reflect genuine attributes of individuals inde- 
pendent of their beliefs about sex roles, and 
the two should not be correlated. People who 
have more of a trait should rate themselves 
higher on that trait independently of their 
beliefs about how much of the trait others 
of their sex have.

Spence, Helmreich, and Stapp (1975) 
selected 55 traits that pretest subjects had 
rated as more typical of one sex than the 
other. These items were sorted into three
scales—two independent masculinity (M) 
and femininity (F) scales and a third, bi- 
polar masculinity-femininity (M-F) scale. 
Male and female college students first rated 
themselves on each of the 55 items, then 
rated each item for the typical male or female.
Eighteen correlations were generated by com- 
paring male and female subjects’ self-ratings 
on each scale to their stereotype ratings on all 
three scales. Finding only 5 of these 18 corre- 
lations statistically significant, Spence et al. 
rejected the stereotype hypothesis and con- 
cluded that the PAQ measures genuine traits. 
In the present context, these results could also 
be interpreted as evidence against Kohlberg’s 
theory of sex role identity, which also pre- 
dicts a correlation between sex role stereo- 
types and sex role attributes. 

Unfortunately, there are at least three
methodological problems with the Spence,
Helmreich, and Stapp (1975) study that cast
doubt on their conclusions. First, Spence et al.
used different item formats for subjects’
stereotype ratings than for their self-ratings.
For self-ratings, subjects were presented with
bipolar items (e.g., very passive to very active),
whereas for stereotype ratings subjects
were presented with single labels (e.g., active).
These two formats could have created
different psychological dimensions for the subjects;
a subject who saw only the word active
may have assumed that the intended opposite
was inactive rather than passive. Thus, subjects
may have rated stereotypes along dimensions
somewhat different from those they used
in rating themselves. That would have greatly
reduced any possibility of obtaining correlations
between the attribute and stereotype
ratings.

A second problem was created by the par- 
ticular type of rating scale that followed each 
stereotype item. Each single label was fol- 
lowed by a bipolar, unidimensional scale rang- 
ing from much more characteristic of the male 
to much more characteristic of the female. 
The problem with using unidimensional scales 
to measure stereotypes is that ratings in the 
middle of such scales are ambiguous. A middle 
rating could mean that the trait was seen as 
either highly characteristic of both sexes or 
not characteristic of either sex. That am- 
biguity in turn could have reduced correla- 
tions between attribute and stereotype rat- 
ings. According to the stereotype hypothesis, 
subjects who thought a particular trait highly 
characteristic of both sexes would have rated 
the trait in the middle of the stereotype scale, 
but they would have rated themselves high on 
the trait. On the other hand, subjects who saw 
the trait as uncharacteristic of either sex 
would also have rated the trait in the middle 
of the stereotype scale, but they would have 
rated themselves low on the trait. These two 
possibilities would cancel each other out and 
artifactually lower the correlation between 
attribute and stereotype ratings. 
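
The attenuation argument can be illustrated with a small sketch. The six belief pairs below are hypothetical (they are not data from either study), and the sketch assumes the stereotype hypothesis in its purest form, in which a man's self-rating simply mirrors his belief about the typical male. Under that assumption, a separate rating of the typical male tracks self-ratings perfectly, whereas a single relative scale, which records only the male-minus-female difference, does not.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical beliefs of six male raters about one trait, on a 0-4 scale:
# (how characteristic of the typical male, how characteristic of the typical female).
beliefs = [(4, 4), (0, 0), (4, 0), (2, 2), (3, 1), (1, 1)]

# Assumption: pure stereotype hypothesis, so each self-rating equals the
# rater's belief about the typical male.
self_ratings = [male for male, _ in beliefs]

# Separate-target format: the typical male is rated on his own scale.
separate_format = [male for male, _ in beliefs]

# Single relative scale: only the male-minus-female difference is recorded,
# so the (4, 4) and (0, 0) raters both land on the same midpoint value of 0.
relative_format = [male - female for male, female in beliefs]

r_separate = pearson_r(self_ratings, separate_format)
r_relative = pearson_r(self_ratings, relative_format)
print(f"separate targets: r = {r_separate:.2f}")
print(f"relative scale:   r = {r_relative:.2f}")
```

In this toy case the raters who see the trait as characteristic of both sexes and those who see it as characteristic of neither share one midpoint rating, which is what pulls the second correlation down.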

A final problem with Spence, Helmreich, 
and Stapp’s (1975) study concerns the 18 
correlations they presented as tests of the 
stereotype hypothesis. The 18 correlations 
were generated by comparing attribute ratings 
on each of three different scales to stereotype 
ratings on all three scales separately for each 
sex. But two of the three attribute scales (the 
M scale and the F scale) were theoretically 
designed and empirically shown to be uncor- 
related with each other. If attribute ratings 
alone were not expected to correlate between 
scales, it was inappropriate to expect attribute 
ratings and stereotype ratings to correlate be- 
tween scales. Eliminating these cross-scale 
comparisons, only 6 of the 18 possible correla- 
tions actually provide a valid test of the 
stereotype hypothesis, namely, self-ratings
on each scale compared to stereotype ratings 
only on the same scale, calculated separately 
for each sex on each scale. Of these 6 correla- 
tions, 3 were significant and in the predicted 
direction. Viewed thus, Spence, Helmreich, 
and Stapp’s (1975) results provide highly 
equivocal evidence about the stereotype hy- 
pothesis and not definitive evidence against it. 
It would also be unfair to rule against Kohl- 
berg’s theory of sex role identity on the basis 
of those results. 
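
The count of 18 comparisons versus 6 valid ones can be checked by enumeration. The sketch below only reproduces the combinatorics described above (two sexes, three self-rating scales, three stereotype scales); the variable names are illustrative and no correlations are computed.

```python
from itertools import product

SEXES = ("men", "women")
SCALES = ("M", "F", "M-F")

# Every pairing of a self-rating scale with a stereotype scale, for each sex.
all_comparisons = list(product(SEXES, SCALES, SCALES))

# Only pairings in which both ratings come from the same scale.
same_scale = [
    (sex, self_scale, stereotype_scale)
    for sex, self_scale, stereotype_scale in all_comparisons
    if self_scale == stereotype_scale
]

print(len(all_comparisons), "comparisons in total")
print(len(same_scale), "same-scale comparisons")
```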

The present study was designed to investi- 
gate Kagan’s (1964) and Kohlberg’s (1966) 
models of sex role identity, using a paradigm 
similar to Spence, Helmreich and Stapp’s 
(1975) but with several methodological im- 
provements. First, subjects were asked to 
make attribute ratings and stereotype ratings 
on identical sets of bipolar items. Second, subjects’
stereotypes were measured by taking
separate ratings of the typical male and the
typical female on all items. Third, attribute
ratings and stereotype ratings were compared
only within each scale and not between scales.
Finally, a measure of sex role identity was
included.


Method 


Subjects and Procedure 


One hundred four male and 110 female students
in introductory social psychology at a midwestern
state university completed a battery of four instruments


pleted during one class 
the semester, Subjects’ 


course debriefed 
a lecture based on the results 


Instruments 


Because no widely
identity presently exists, 


(feminine) is your personality?” “How masculine
(feminine) do you act, appear, and come across to
others?” and “In general, how masculine (feminine)
do you think you are?” Each item was followed by
a 31-point scale with the endpoints labeled not at all
masculine (feminine) (scored as 0) to extremely
masculine (feminine) (scored as 30).

Subjects’ sex role attributes were measured by
Spence and Helmreich's (1978) Personal Attributes
Questionnaire, a shortened version of the Spence,
Helmreich, and Stapp (1975) measure described
earlier. It contains 24 of the original 55 items, still
categorized into three subscales: an M scale, an
F scale, and a unidimensional M-F scale. Each scale
contains 8 items; each item is scored from 0 to 4,
and total scores on each scale can range from 0 to 32.
Higher values indicate greater endorsement of M
items, F items, or the masculine pole of M-F items.

Subjects’ stereotypes were measured by two modified
versions of the PAQ. One questionnaire asked
subjects to rate the “typical male,” and one asked
subjects to rate the “typical female.” Thus, stereotype
ratings for each sex were gathered separately,
using exactly the same scales as the self-ratings. The
same scoring system was used for each stereotype
scale as was used for the corresponding self-rating
scale.

The order of the four questionnaires in the battery
was randomized across subjects, except that the two
stereotype questionnaires always appeared together.


Results


The Sex Role Identity Scale


Since a new measure of sex role identity 
was designed for this study, its psychometric 
properties were examined first. The scale 
showed strong internal consistency: the three
masculine identity items intercorrelated positively
better than .66 for men and .68 for
women (all ps < .001), and the three feminine
identity items intercorrelated positively better
than .80 for men and .70 for women (all ps <
.001). More important, despite all the evidence
of multidimensionality in people’s sex
role attributes (Bem, 1974; Constantinople, 


1 Two researchers have designed instruments that
may appear related to sex role identity. First, Stein
(1971) has measured sex role preference, which may
share some conceptual meaning with sex role identity.
The two concepts are not identical, however. Logically,
one can prefer to have sex role characteristics
that differ from one’s present sex role self-concept.
Second, Heilbrun (1976) has published a measure
called “sex role identification”; however, the composition
and use of that instrument suggest that it
measures specific sex role attributes rather than a
global sex role self-concept.
Table 1
Means, Standard Deviations, and p Values for Male-Female Differences on All Measures

                                    Men               Women
Scale                            M       SD        M       SD        p
Sex role identity              24.43    4.25      7.86    5.26     .001
Attribute self-ratings
  M                            22.58    4.78     20.15    4.30     .001
  F                            21.56    4.41     23.92    3.38     .001
  M-F                          16.77    4.54     12.87    4.07     .001
Stereotype ratings of men
  M                            24.68    3.96     24.08    3.80      ns
  F                            17.10    5.00     17.22    4.76      ns
  M-F                          20.53    4.32     19.83    4.17      ns
Stereotype ratings of women
  M                            15.75    4.13     16.94    4.99      ns
  F                            24.70    4.10     24.18    3.69      ns
  M-F                           9.09    3.79     10.84    3.95     .001


1973; Spence & Helmreich, 1978), subjects 
in this study conceptualized their sex role 
identities along a single dimension. Masculine
identity items correlated negatively with
feminine identity items better than -.64 for
men and -.74 for women (both ps < .001).
Therefore, the six items were combined by 
scoring the three feminine items in the oppo- 
site direction from the three masculine items, 
summing, and dividing by six—thus yielding 
a 31-point, bipolar sex role identity score 
ranging from 30 (most masculine/least feminine)
to 0 (least masculine/most feminine).
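
The scoring rule just described can be written out as a short sketch. The function name and the example responses are hypothetical; the only details taken from the text are the 0 to 30 item range, the reversal of the three feminine items, and the division of the sum by six.

```python
SCALE_MAX = 30  # each item is answered on a 31-point scale scored 0 to 30


def sex_role_identity_score(masculine_items, feminine_items):
    """Combine three masculine and three feminine item responses (0-30 each)
    into one bipolar score: 30 = most masculine, 0 = most feminine."""
    if len(masculine_items) != 3 or len(feminine_items) != 3:
        raise ValueError("expected three masculine and three feminine items")
    # Score the feminine items in the opposite direction.
    reversed_feminine = [SCALE_MAX - value for value in feminine_items]
    # Sum all six items and divide by six.
    return (sum(masculine_items) + sum(reversed_feminine)) / 6


# Hypothetical respondent: high masculine ratings, low feminine ratings.
example = sex_role_identity_score([24, 26, 25], [5, 6, 4])
print(example)
```

Because the masculine and feminine items correlated strongly and negatively, reversing the feminine items before averaging places all six on one dimension.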


Sex Differences on All Measures 


All of the hypotheses in the present study 
presume the existence of sex-of-subject differences
in sex role identity and attributes and
sex-of-target differences in sex role stereotypes.
Table 1 illustrates these differences.
Male and female subjects reported not only
significant but extremely large differences on
the measure of sex role identity, with means 
of 24.4 and 7.9, respectively, p < .001. Males 
and females also reported significant differ- 
ences on all three subscales of the sex role 
attributes measure, the PAQ. Men reported 
higher self-ratings on the M scale and the
bipolar M-F scale, whereas women reported
higher self-ratings on the F scale.

Note (Table 1). Higher values indicate greater endorsement of masculine items on M scales, of feminine items on F
scales, and of the masculine pole of items on the M-F scales and the Sex Role Identity Scale. For men,
n = 104. For women, n = 110. M = masculine. F = feminine.

Although men and women differed in their 
attribute and identity self-ratings, they agreed 
in their stereotypes about the typical man 
and the typical woman. Men and women dif- 
fered in their stereotype ratings in only one 
instance: Women rated the typical woman 
higher on the M-F scale. Men and women 
also agreed on the differences between the 
typical man and the typical woman, as can 
be seen by making vertical comparisons be- 
tween the two stereotype scales in Table 1. 
Both sexes rated the typical man higher on 
the M scale and the M-F scale and rated the 
typical woman higher on the F scale (all ps < 
.001). In short, all of the instruments pro- 
duced results consistent with past research 
and consistent with the assumptions under- 
lying this study. 


Correlations Among Measures 


Correlations among identity, attribute, and 
stereotype ratings are presented in Table 2. 
Before discussing these correlations, I should 
explain the structure of Table 2. As argued 
earlier, it is appropriate to consider only those 
correlations occurring within each subscale of 
Table 2
Correlations Among Sex Role Identity, Sex Role Stereotypes, and Sex Role
Attributes for Men and Women

Attribute/stereotype scales 


Women 


Men One: 
- M F M-F 
Correlation M F M-F ug 
Multiple ne att Gives a 
aS 7 Sare .19* agree 
aa AI i ipia .26°* 21° . 
Eep ord and partial AN We Recs D am 
A 30*  —.08 sate 13 a 
X 19* .19* 06 —.19° : ae 
Sta ‘04 19% = 02 —.22" -10, saa 
SA joes 7 21° ‘28e* ‘20 mt 
Ser 34ee 19* 20° ‘30*** M4 $ 
Note. Variables are abbreviated: A = attributes, S = stereotypes, and I = identity. In the multiple correlations
the first variable is the dependent variable. In the rest of the table, two variable initials indicate a zero-order
correlation and three variable initials indicate a partial correlation between the first two variables controlling
for the third variable. For men, n = 104. For women, n = 110. An equals sign indicates multiple
correlations, whereas a dash indicates partial correlations. M = masculine. F = feminine.
* p < .05. ** p < .01. *** p < .001.

the PAQ and not between scales. Since every
correlation in this study involved one or both 
of the subscaled instruments (i.e., the self- 
rating version of the PAQ and/or the stereo- 
type-rating version), each correlation was cal- 
culated six times, once for each subscale for 
each sex—thus yielding the six columns in 
Table 2. The hypotheses were tested by ex- 
amining the three multiple correlations among 
the three variables, the three zero-order cor- 
relations between each pair of variables, and 
the three partial correlations between each 
pair of variables controlling for the third 
variable—thus yielding the nine rows in 
Table 2.2 
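
For readers who want to see how the rows of such a table relate to one another, the sketch below computes a first-order partial correlation and a two-predictor multiple correlation from three zero-order correlations, using the standard textbook formulas. The article does not state how its coefficients were computed, so this is an illustration only, and the input values are hypothetical rather than taken from Table 2.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def multiple_r(r_yx, r_yz, r_xz):
    """Multiple correlation of y with the two predictors x and z."""
    r_squared = (r_yx ** 2 + r_yz ** 2 - 2 * r_yx * r_yz * r_xz) / (1 - r_xz ** 2)
    return sqrt(r_squared)


# Hypothetical zero-order correlations among attributes (A), identity (I),
# and stereotypes (S); these are NOT values from the study.
r_ai, r_si, r_sa = 0.50, 0.50, 0.50

print(round(partial_r(r_ai, r_sa, r_si), 3))   # A with I, controlling for S
print(round(multiple_r(r_ai, r_si, r_sa), 3))  # I as the dependent variable
```

Swapping which variable is treated as dependent in `multiple_r` gives the other two multiple correlations, and rotating the arguments of `partial_r` gives the other two partial correlations.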

The multiple correlations among sex role
identity, sex role attributes, and sex role
stereotypes are presented first in Table 2.
Both sex role identity theories presume that
sex role identity is involved in a causal structure
with sex role attributes and sex role
stereotypes. For Kagan, sex role identity is 
the product of this structure; for Kohlberg, 
sex role attributes are the product. Kagan’s 
theory thus leads one to examine the multiple 
correlation with sex role identity as the dependent
variable. Kohlberg’s theory is more
appropriately tested with sex role attributes
as the dependent variable. In fact, both of
those multiple correlations, and even the
correlation with stereotypes as the dependent
variable, were significant for both sexes on
each of the three attribute/stereotype subscales,
as seen in the top three rows of Table
2. One might notice that Kohlberg’s correlations
(second row) accounted for [...]
more variance than Kagan’s correlations (first
row) in four of the six comparisons: [...]
more in two instances (men on the M scale
and women on the M-F scale) and [...]
5% more in two instances (men on the
M-F scale and women on the M scale). This
is far from a crucial test between the two
theories, however; both theories are essentially
supported by these results.


2 Many other analyses were performed on these
data, including all possible part correlations, correlations
of the differences between two variables and
the third, and correlations of the differences between
male and female stereotypes and the other variables.
All of these auxiliary analyses produced results entirely
consistent with the main analyses presented
in Table 2 and so are not reported here.


SEX ROLE IDENTITY 


The two theories also agree that significant 
correlations should obtain between sex role 
attributes and sex role identity (although 
they disagree about the causal mechanisms 
that should produce these correlations). 
Given the direction in which the various 
scales are coded, these correlations should be 
positive for both sexes on the M and M-F 
subscales and negative on the F subscale. The 
correlations presented in the fourth and fifth 
rows of Table 2 strongly supported both 
theories in four of the six comparisons. For 
both the zero-order and the partial correla- 
tions, men reported a significant correspon- 
dence between their sex role identity and their 
sex role attributes on the M and M-F scales 
of the PAQ. Women reported a similar cor- 
relation between their sex role identity and 
sex role attributes on the F scale and the 
M-F scale. Thus it appears that sex role 
identity was related to sex role attributes, but 
only on those scales that were most appropri- 
ate for the subject’s own sex—that is, the 
masculinity scale for men, the femininity 
scale for women, and the masculinity-fem- 
ininity scale for both sexes. 

A relationship between sex role identity and 
sex role attributes was also evident in an 
analysis of subjects classified according to 
sex role type. Table 3 presents the distribu- 
tion of subjects across the four categories of 
traditional masculine, traditional feminine, 
androgynous, and undifferentiated. The per- 
centage of subjects classified into each cat- 
egory corresponded closely to the normative 
distribution of college men and women re- 
ported by Spence and Helmreich (1978). 
More important, one-way analyses of variance 
(least squares method) indicated strong sex 
role identity differences among the groups, 
F(3, 100) = 6.59, p < .001 for men and F(3, 
106) = 6.55, p < .001 for women. 

Table 3
Number, Percentage, and Sex Role Identity
Means of Men and Women by PAQ Category

                           Men                     Women
PAQ category          n     %      M          n     %      M
Androgynous          32    31    24.82a      36    33     6.80a
Masculine            41    39    26.02a      13    11    12.46b
Feminine             11    10    22.14b      38    34     6.28a
Undifferentiated     20    19    21.78b      23    21     9.50b

Note. Higher mean values indicate a more masculine
sex role identity. Within sex, means not sharing the
same superscript are significantly different at
p < .05 by weighted contrasts. PAQ = Personal
Attributes Questionnaire.

Further analysis of sex-typed groups confirmed
that sex role identity was related only
to masculine attributes for men and only to
feminine attributes for women. Men with
higher M scores (androgynous and masculine
men) reported more masculine sex role identities
than men with lower M scores (feminine
and undifferentiated men). In contrast, scores
on the F scale were irrelevant. Although androgynous
men have higher F scores than
masculine men, they did not report a significantly
more feminine sex role identity. Nor
did feminine men report a more feminine identity
than undifferentiated men. Similarly,
women who scored higher on the F scale
(androgynous and feminine women) reported
more feminine sex role identities than women
who scored lower on the F scale (masculine
and undifferentiated women). But women who
scored higher on the M scale (androgynous
vs. feminine women and masculine vs. undifferentiated
women) did not report more
masculine sex role identities.

The analyses presented so far can be inter- 
preted as supporting both Kagan’s (1964) 
and Kohlberg’s (1966) models of sex role 
identity, with the exception of subjects’ at- 
tribute ratings on cross-sex scales, as de- 
scribed above. There are two other correla- 
tional analyses, however, on which Kagan’s 
and Kohlberg’s theories make distinctly dif- 
ferent predictions. First, Kagan’s theory pre- 
dicts a causal link, and therefore a correlation, 
between sex role stereotypes and sex role 
identity. Furthermore, these correlations 
should have the opposite sign of the correla- 
tions between attributes and identity (row 4 
of Table 2). To illustrate this logic by ex- 
ample: The more masculine attributes a man 
has, the more likely he is to compare “favorably”
with male stereotypes and infer a highly
masculine identity for himself, but the more 
masculine he perceives the stereotypic male to
be, the more likely he is to compare unfavorably
and infer a less masculine identity. Kohlberg’s
theory does not predict any of these
correlations because it views sex role stereo- 
types and sex role identity as two separate, 
independent variables. 

The correlations between sex role stereo- 
types and sex role identity are presented in 
the sixth and seventh rows of Table 2. Ex- 
amining the zero-order correlations first, the 
predictions derived from Kagan’s theory were 
confirmed in only two of the six comparisons 
(men on the F scale and women on the M 
scale). In fact, two of the six correlations 
were significant (p< .05) in the opposite 
direction from that predicted by Kagan’s 
theory. In case these correlations were con- 
taminated by the influence of sex role at- 
tributes, the partial correlations between 
stereotypes and identity were also calculated. 
Both of the significantly reversed correlations 
then disappeared, but no new correlations 
reached significance in the predicted direction.

Kagan’s and Kohlberg’s models also lead 
to opposing predictions about the relationship 
between sex role attributes and sex role 
stereotypes. Kohlberg’s theory posits a causal 
link, and therefore a correlation, between sex 
role stereotypes and sex role attributes. Fur- 
thermore, all of these correlations should be 
positive across the three attribute subscales 
and for both sexes. The more one sees one’s 
own stereotypic sex as having any attribute,
the more one should develop that attribute. 
Kagan’s theory predicts no correlations be- 
tween sex role attributes and sex role stereo- 
types because it views these as separate, inde- 
pendent variables. The correlations between 
sex role stereotypes and sex role attributes are 
presented in the last two rows of Table 2. All 
six zero-order correlations and five of the six 
partial correlations reached significance in the 
direction predicted from Kohlberg’s theory. 


Discussion 


The Nature of Sex Role Identity 


This study examined a relatively over- 
looked concept in current sex role research— 
sex role identity. A very simple six-item mea- 
sure of sex role identity was introduced and 
was shown to have solid internal reliability.
Although the instrument contained separate 
masculine and feminine identity items, thus 
allowing for multidimensional responses, sub- 
jects conceptualized their sex role identity 
along a single, bipolar dimension. This is not 
surprising, given that until very recently 
everyone, including psychologists, viewed
masculinity and femininity as diametric. Despite
research evidence that masculine and
feminine sex role attributes form orthogonal 
dimensions, and despite the increased cur- 
rency of the term androgyny, people are likely 
to persist in thinking of masculinity and
femininity along a single continuum. 

This measure of sex role identity was then 
compared to measures of sex role stereotypes 
and sex role attributes taken from Spence and 
Helmreich’s (1978) instrument, the PAQ. 
Two theories of sex role identity were used 
to derive hypotheses about the relationships 
among these three sex role variables. Kagan’s 
(1964) theory posits that individuals com- 
pare self-perceptions of their own sex role 
attributes with their perceptions of sex role 
stereotypes and then infer their relative sex 
role identity. Kohlberg’s (1966) theory posits
that people already have firmly established 
and relatively unchanging sex role identities 
that guide the development of sex role attributes
by means of the modeling or emulating
of sex role stereotypes. Both theories predict
some shared correlational patterns,
namely, that the three sex role variables will
form a significant multivariate relationship
and, more specifically, that sex role identity
and sex role attributes will correlate. On the
whole, these predictions were strongly confirmed
in the present study, with one exception
to be discussed later.

The two theories differ, however, in their
predictions about the relationship between
sex role stereotypes and the other two variables;
Kagan’s theory predicts a relationship
between stereotypes and identity, whereas
Kohlberg’s theory predicts a relationship between
stereotypes and attributes. Virtually no
support was found for the predictions derived
from Kagan’s theory. Extremely strong support
was found for the predictions derived
from Kohlberg’s theory.




Obviously, the results of a single study may 
be open to a number of refutations and 
alternative explanations. Proponents of Kag- 
an’s theory could argue, for example, that the 
instruments used in this study are invalid or 
measure the wrong sex role attributes and 
stereotypes and that people base their sex 
role identities on some other set of traits and 
stereotypes not tapped by the PAQ. That 
argument seems implausible, however, for the 
following reasons. First, there is a wealth of 
evidence supporting the PAQ as a valid mea- 
sure (Spence & Helmreich, 1978, 1979). More 
important, the results of this study demon- 
strated that sex role identity, attributes, and 
stereotypes formed a meaningful, coherent, 
and consistent pattern in nearly every analysis
in this study except those analyses derived
from Kagan’s theory.

Kohlberg’s theory of sex role identity was 
overwhelmingly supported by the present re- 
sults, with one qualifying detail. It seems that 
sex role identity influences the development 
of same-sex-typed attributes (masculine at- 
tributes for men and feminine attributes for 
women) but has little influence on the de- 
velopment of opposite-sex-typed attributes. 
In retrospect, this makes logical sense. Given 
that sex role identity is a unidimensional construct,
if it were to influence the development
of both same- and opposite-sex-typed attributes,
those attributes would also be unidimensional.
Thus, the orthogonality of masculine
and feminine attributes and the development
of androgynous attributes in some
individuals are possible only because the influence
of sex role identity is limited to same-sex-typed
attributes.

Kohlberg’s theory has many important
ramifications for future research in the area
of sex roles. Conceptually, it suggests that sex
role identity may be one of the most powerful
and central variables in guiding the development
of sex role attributes, in filtering the impact
of sex role stereotypes on the individual,
and in moderating the influence of situational
variables on sex role behavior. Practically, it
suggests that sex role identity is a fundamental
variable to assess in research on any
sex role topic.


The Nature of Sex Role Attributes


Researchers have long debated the mean- 
ing of masculinity and femininity or, more 
precisely, the psychological constructs mea- 
sured by masculine and feminine attribute 
scales (Constantinople, 1973; Locksley & 
Colten, 1979; Pedhazur & Tetenbaum, 1979; 
Rosenkrantz et al., 1968). Do such scales re- 
flect true masculine and feminine traits (or 
self-perceptions of traits)? Or do sex role 
attribute scales merely reflect what respon- 
dents think are the socially acceptable, stereo- 
typic attributes for their sex? Spence, Helm- 
reich, and Stapp (1975) argued against the 
stereotype interpretation of their instrument, 
the PAQ. They reasoned that the stereotype 
interpretation would predict a correlation between
people’s perceptions of stereotypes and
their PAQ scores. Finding no such correlation
in their data, Spence, Helmreich, and Stapp 
dismissed the stereotype explanation. 

The present study included important 
methodological improvements on Spence, 
Helmreich, and Stapp’s (1975) design and 
found a significant relationship between sex 
role stereotypes and sex role attributes. As 
discussed above, these results support Kohl- 
berg’s model of sex role identity. Do these re- 
sults also present a serious challenge to 
Spence, Helmreich, and Stapp’s trait inter- 
pretation of the PAQ? We would argue that 
these results do not, necessarily, discount the 
PAQ. In the first place, the absolute size of 
the correlations between attributes and stereo- 
types in this study was small. A great deal of 
variance remained in PAQ attribute scores 
that could not be accounted for by stereotype 
scores. Second, a correlation between sex role 
attributes and stereotypes is not really incon- 
sistent with a trait interpretation of the PAQ. 
According to Kohlberg’s model, stereotypes 
should correlate with attributes precisely be- 
cause those stereotypes influence the early 
development of sex role traits. In short, 
Spence, Helmreich and Stapp’s trait inter- 
pretation of the PAQ is not damaged by the 
present results, and their concern about a 
correlation between the PAQ and sex role 
stereotypes may be unfounded in light of 
Kohlberg’s theory. 

We are cautious, however, about certain 
surplus meanings that may accompany the
trait view of sex role attributes. At issue is the 
notion that traits are immutable and intran- 
sient aspects of the personality, to be assessed 
but never changed. An important implication 
of Kohlberg’s theory is that people can and 
will change, modify, and develop new sex role 
attributes as their stereotypes change. This
notion suggests a line of future research in 
which subjects’ perceptions of stereotypes are 
manipulated and instruments like the PAQ 
are used as dependent variables. Such re- 
search is not likely to be undertaken, how- 
ever, if investigators view sex role attributes 
as fixed and immalleable personality traits. 

The trait validity of Bem’s (1974) sex role 
attribute scale, the Bem Sex Role Inventory 
(BSRI), has recently faced more serious 
challenges than Spence and Helmreich’s PAQ. 
Pedhazur and Tetenbaum (1979) subjected 
the BSRI to factor analysis and discovered 
that the two items on the instrument—the 
items Masculine and Feminine—accounted 
for nearly 80% of the variance. Furthermore, 
despite Bem’s contention that masculinity and 
femininity are independent trait dimensions, 
females who rated themselves high on the 
Feminine item tended to rate themselves low 
on the Masculine item, and vice versa for 
males.

Given the high loadings on the Masculine 
and Feminine items in the BSRI and the in- 
verse relationship between those items, it 
seems possible to us that the BSRI actually 
measures sex role identity rather than sex 
role attributes. This possibility is given fur- 
ther credence by the results of one of Bem’s 
own studies. Bem and Lenney (1976) gave 
subjects an opportunity to engage in a variety 
of sex-typed masculine behaviors (oiling the 
squeaky lid of a tool box, baiting a fishing 
hook, etc.) or feminine behaviors (ironing 
napkins, filling a baby bottle, etc.). They 
found that masculine and feminine subjects 
(as determined by BSRI scores) chose sex 
role appropriate tasks and avoided opposite- 
sex-typed tasks, even if offered more money 
to perform the latter. Androgynous subjects, 
in contrast, freely engaged in whichever sex- 
typed behavior paid more. 

Although Bem and Lenney predicted the
above result, it actually provides negative
evidence regarding the trait validity of the
BSRI. All of the masculine and feminine sex-typed
behaviors used in Bem and Lenney’s
study were in fact highly agentic/instrumental
tasks. Baiting a fishhook and ironing a
napkin obviously differ in their sex role appropriateness,
but they both require agentic
manipulation of the physical environment.
Thus, in terms of the traits that the BSRI
supposedly reflects, masculine subjects should
have freely engaged in any of the experimental
tasks, whereas feminine subjects
should have actively avoided all the experimental
tasks. If the BSRI measures sex role
identity, however, Bem and Lenney’s results
make perfect sense. Kohlberg’s theory postulates
that sex role identity guides the individual’s
modeling of sex role stereotypes.
Thus, if the BSRI in fact classifies people by
sex role identity, Bem and Lenney’s subjects
chose exactly the stereotypic tasks one might
expect.3


The Nature of Sex Role Stereotypes 


There has been recent controversy over the 
apparently tenuous nature of sex role stereo- 
types (Locksley & Colten, 1979; Pedhazur & 
Tetenbaum, 1979). Stereotypes of men and
women vary markedly as a function of mea-
surement techniques (Ehrlich & Rinehart,
1965) and the inclusion or exclusion of role-
specific information about the target (Clifton,
McGrath, & Wick, 1976). Some writers have
argued that research should focus on sex role
schemata rather than on sex role stereotypes
(Locksley & Colten, 1979; Spence & Helm-
reich, 1979). Schemata are more personalized,
situation-specific expectations held by indi-
viduals about their own and others’ sex role
attributes and behaviors, and thus may have
more predictive validity than abstract stereo-
types.


3 Helmreich, Spence, and Holahan (1979)
attempted to replicate Bem and Lenney’s re-
sults using the PAQ instead of the BSRI. The PAQ
failed to predict subjects’ choice of sex role tasks.
Ironically, that failure to replicate can be inter-
preted as positive evidence that the PAQ measures
sex role attributes and not sex role identity.


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SEX ROLE IDENTITY 


It is interesting to note that Kohlberg fre-
quently used the term schema in place of the
term stereotype in his 1966 theoretical paper. 
According to him, individuals are exposed to 
both abstract stereotypes and specific 
schemata from peers, parents, and other 
adults. The relevance of sex role information 
to the individual, or the extent to which the 
individual models after the stereotypes and 
schemata provided by others, is guided by his 
or her sex role identity. From that interaction 
each individual builds a catalog of very spe- 
cific personal schemata, intentions, prefer- 
ences, and behavioral propensities. In short, 
Kohlberg’s theory, with its emphasis on the
development of specific, individualistic sex 
role schemata, presaged many of the current 
trends in sex role research and could provide 
an integrative model for future research.


Acknowledgment of Handicap as a Tactic in Social Interaction


Albert H. Hastorf, Jeffrey Wildfogel, and Ted Cassman 


Stanford University 


Nonhandicapped people often report discomfort and uncertainty when interact-
ing with handicapped individuals. The three studies reported here investigated
a possible tactic that handicapped people could use to reduce a fellow inter- 
actant’s discomfort and uncertainty. Nonhandicapped subjects watched two 
videotapes of handicapped individuals being interviewed. Each subject then 
chose the handicapped person with whom he would prefer to work on a co- 
operative task. Results of all three studies supported the hypothesis that a 
handicapped person acknowledging his handicap will be preferred to a handi- 
capped person who does not acknowledge his handicap. In Study 1, subjects 
significantly preferred a handicapped person who acknowledged his handicap to 
a handicapped person who did not disclose anything personal. In Study 2, sub- 
jects significantly preferred an acknowledging person over one who made a 
personal disclosure other than about his handicap. In Study 3, subjects pre- 
ferred the individual acknowledging a handicap over one who disclosed some- 
thing else personal even when the acknowledging individual was clearly nervous 
about doing so. These results suggest that acknowledging the handicap may be
a promising tactic.


Experimental evidence has shown that 
physical handicaps can negatively affect per- 
sonal encounters. Nonhandicapped individuals 
often report discomfort and uncertainty when 
interacting with handicapped individuals 
(Davis, 1961; Kleck, Ono, & Hastorf, 1966), 
and these subjective reports have been cor- 
roborated by an objective measure of emo- 
tional arousal (Kleck et al., 1966). Further, 
when interacting with the handicapped, non- 
handicapped individuals exhibit less variabil- 
ity in their behavior, express opinions less 
representative of their actual beliefs, gesture 
less, and even end the interaction sooner than 
they do when interacting with a nonhandi-
capped individual (Kleck, 1968; Kleck et al.,
1966). Finally, it has been shown that this 
tension and discomfort is felt by the handi- 
capped individual on the “other side” of the 
interaction as well (Comer & Piliavin, 1972). 
There are several plausible explanations 
for the discomforting effect physically handi- 
capped individuals have upon the nonhandi- 
capped. First, nonhandicapped individuals 
may feel discomfort because the mere pres- 
ence of a handicapped person forces upon 
them the realization that they, too, are vul- 
nerable to similar disabilities (Novak & 
Lerner, 1968). There may also be a fear that
the stigma accorded to handicapped individ- 
uals is contagious, that being seen with the 
stigmatized is “discrediting” by association 
(Goffman, 1963).
Yet another source of the discomforting 
effect handicapped individuals have on the 
nonhandicapped may stem from the non-
handicapped individual’s uncertainty as to
what kind of behavior is expected and ap-
propriate. There are strong societal norms to
treat the handicapped kindly and carefully,
but there are equally strong norms to treat
them just like anyone else (Kleck et al.,
1966) so as not to appear condescending.
Similarly, nonhandicapped individuals may
find themselves in conflict over a desire to ex-
plore the handicap because it constitutes a
novel stimulus and a duty to adhere to strong
norms against staring at another person
(Langer, Fiske, Taylor, & Chanowitz, 1976).
Finally, feelings of discomfort may arise in
the presence of a handicapped individual be-
cause a deformity is unsightly. Closely related
to this is the notion that discomfort arises
because the deformity violates our expecta-
tions of what a “whole” person should look
like (Richardson, 1976).

Whatever the cause of the discomfort felt 
by nonhandicapped individuals in the pres- 
ence of the handicapped, it is clear that it 
makes the physically handicapped the socially 
handicapped as well. Since the handicapped 
do not receive accurate feedback concerning 
the appropriateness of their own behavior or 
experience the normal behaviors of others
(Hastorf, Northcraft, & Picciotto, 1979), it is
likely that their learning of rules about social 
interactions and their development of sensitiv- 
ity to others is impaired. In addition, Kleck 
(1969) has found that although the norms to 
be kind to the handicapped may result in 
positive impressions of them in initial en- 
counters, this positive first impression usually 
attenuates with further interaction. This find- 
ing is hardly surprising in view of the im- 
paired social skills development of the handi- 
capped mentioned above. Thus, because of the 
way the nonhandicapped react to a physical 
handicap, the handicapped individual may be 
led to the attribution: “I am the type of per- 
son who causes others to feel uncomfortable, 
and am therefore avoided. When people do 
get to know me, they appear to like me less 
the better they get to know me.” The nega- 
tive implications for self-concept are not dif- 
ficult to imagine. 

Thus, where the handicapped can be suc-
cessful in overcoming physical and legal bar-
riers, psychosocial barriers confronting them
may prevent their entry into the mainstream
of society. The present series of investigations
sought to determine if there are tactics the
handicapped might employ to overcome these
psychosocial barriers.

The discomfort of the nonhandicapped in
interactions with the handicapped seems to be 
a major cause of the psychosocial barriers 
faced by the handicapped. It therefore seems 
reasonable that any tactic that reduces this 
discomfort would facilitate interactions be- 
tween the handicapped and nonhandicapped. 
One such tactic may be acknowledging the 
handicap. Just as one may find it disarming 
when a relatively short individual introduces 
himself with the phrase “You may have 
noticed I’m a little tall for a leprechaun,” so 
might it serve to reduce interactional tension 
when a handicapped individual conversation- 
ally acknowledges his or her handicap. This 
would seem especially true if the acknowledg- 
ment serves the purpose of signaling that the 
handicap is not an interactional obstacle, men- 
tion of which or involvement with which must 
be carefully circumnavigated. 

The only study that has investigated the 
tactic of acknowledging the handicap found 
that it had no impact on the behavior of non- 
handicapped individuals or their ratings of 
handicapped people (Farina, Sherman, & 
Allen, 1968). There are, however, circum- 
stances that cast doubts on the validity of this 
conclusion. The study used male subjects, 
and its findings are therefore equally con- 
sistent with the hypothesis of no relationship 
between disclosure and liking for nonhandi- 
capped males. Furthermore, the handicapped 
acknowledgment in the Farina et al. study 
may not have been effective because it did not 
reduce the significance of the handicap as a 
topic of suppressed concern. For acknowledg- 
ment of the handicap to be an effective tactic, 
it may be necessary for the handicapped per- 
son to mention it in a manner that conveys 
that he or she is not overly sensitive about it, 
that it is all right if the handicap comes up in 
conversation. The confederate in the Farina
et al. study merely mentioned that his handi- 
cap caused inconveniences and was the result 
of a car accident. Such a statement, rather 
than implying that the handicap is an ac- 
ceptable topic of conversation, may indicate 
that the topic has been opened and closed. 

The present series of investigations ex-
plored the tactic of acknowledging the handi-
cap in interaction between handicapped and
nonhandicapped individuals. Three studies
were conducted to test the hypothesis that
nonhandicapped people would prefer to in- 
teract with handicapped individuals who 
acknowledged their handicap in a manner 
that showed that they were not overly sensi- 
tive about it. In each study, nonhandicapped 
subjects watched two videotapes, each of a 
handicapped person in a wheelchair being 
interviewed. All subjects were told they would 
interact later with one of the handicapped 
persons they had viewed. After each inter- 
view, subjects gave their impressions of the 
person they had just seen, and after seeing 
both interviews, subjects chose the handi- 
capped person with whom they wanted to 
interact. 


Experiment 1 


This first study investigated whether non- 
handicapped subjects prefer a handicapped 
confederate who acknowledges his handicap 
over another who does not. It was hypoth- 
esized that nonhandicapped subjects would 
feel more at ease with the acknowledging 
confederate and would therefore have more 
favorable impressions of him and choose to 
interact with him. 


Method 


Subjects. Fifty-three male and female students at 
Stanford University served as subjects. Most were 
first-year students enrolled in the introductory psy- 
chology course. Subjects in the course participated 
for course credit; others participated for money. 
Only the data of 48 subjects were included in the 
analysis. Two subjects declined to state a preference 
between confederates, 2 subjects suspected that the 
interviews were staged, and 1 subject had met one 
of the confederates beforehand. The data of these 5 
subjects were omitted from the sample.

Procedure. Subjects were told that the experiment 
was investigating the effects of seeing and hearing a 
paraplegic individual (as compared to only reading 
about him) on future interactions with him. Subjects 
were informed that the two paraplegics working on 
the project were extremely busy, and that their 
schedules were unpredictable. For these reasons, it 
was explained, they would be shown videotapes of 
the two paraplegics during this first session and 
would then return at a later date to perform a simple 
task with one of them. This cover story provided an 
opportunity to ask subjects their impressions of the 
confederates and preferences for a partner without
arousing suspicion.

Two interviews of approximately 34 minutes’ dura-
tion each were taped for four apparently paraplegic
confederates. The four tapes of two confederates
(Mathew and Peter) were selected for this
experiment because pretests showed that they were
most closely matched on attractiveness and were
liked substantially more than the other con-
federates. One of the confederates (Mathew) was
actually a paraplegic. His tapes were nonetheless
used because pretests confirmed that subjects could
not detect which of the four confederates was the
true paraplegic.

The interviews were carefully rehearsed to differ
only in whether the interviewee acknowledged his
handicap. In all interviews, the confederate discussed
his happy childhood, good relationship with his par-
ents, part-time work, and plans to take a year off
from school. Acknowledgment and nonacknowledg-
ment interviews differed in how the confederate re-
sponded to the question, “How do you feel about
yourself and your college experience up to this
point?” In the acknowledgment interview, the con-
federate spoke of the problems of being in a wheel-
chair but said that he had learned to accept the in-
conveniences. He mentioned that he realized people
were afraid to talk about his handicap, but he en-
couraged them to ask questions, anyway; only in
this way could his handicap be gotten out of the
way so that people could really get to know him. In
the nonacknowledgment interview, the confederate
responded that college life was agreeable, that school-
work kept him rather busy, and that he had
begun to play clarinet in the school orchestra. [...]
of confederate viewed, sex of subject, and order of
presentation of the confederates and interview condi-
tions were counterbalanced.
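The counterbalancing described above can be illustrated with a short sketch. It rests on assumptions the Method section does not spell out: that each subject saw one acknowledgment tape and one nonacknowledgment tape from different confederates, and that subjects of each sex were rotated evenly through the resulting viewing schedules. The function names and the subject-ID scheme are hypothetical.

```python
from itertools import product

CONFEDERATES = ("Mathew", "Peter")
CONDITIONS = ("acknowledgment", "nonacknowledgment")


def tape_schedules():
    """Return every viewing schedule as (first_tape, second_tape).

    Each tape is a (confederate, condition) pair. A schedule is fixed by
    which confederate acknowledges and which condition is shown first.
    """
    schedules = []
    for acknowledger, first_condition in product(CONFEDERATES, CONDITIONS):
        other = CONFEDERATES[1] if acknowledger == CONFEDERATES[0] else CONFEDERATES[0]
        ack_tape = (acknowledger, "acknowledgment")
        non_tape = (other, "nonacknowledgment")
        if first_condition == "acknowledgment":
            schedules.append((ack_tape, non_tape))
        else:
            schedules.append((non_tape, ack_tape))
    return schedules


def assign(subject_ids_by_sex):
    """Rotate subjects of each sex through the schedules so cells fill evenly.

    Assumption: equal cell sizes; the article only says the factors
    were counterbalanced.
    """
    schedules = tape_schedules()
    plan = {}
    for sex, ids in subject_ids_by_sex.items():
        for i, subject_id in enumerate(ids):
            plan[subject_id] = (sex, schedules[i % len(schedules)])
    return plan
```

With two confederates and two interview conditions this yields four schedules, so crossing them with sex of subject gives eight cells. The split of subjects by sex is not reported, so any list of IDs passed to `assign` is illustrative only.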

After watching the first interview, subjects rated
the handicapped interviewee on an impression scale
and answered four questions about him. The impres-
sion scale consisted of nine polar adjective pairs
separated by a 7-point scale. The adjective pairs
were pleasant-unpleasant, positive-negative,
excitable-calm, active-passive, strong-weak, tough-fragile,
likable-unlikable, well adjusted - poorly adjusted, and
hardworking-lazy. Two of the four questions asked
how the subject thought the handicapped interviewee
would act in certain situations, and two asked how
the subject would act in certain situations with the
handicapped interviewee. After the second inter-
view, subjects rated the second confederate in the
same manner.

Following the second rating, subjects also chose
the handicapped person with whom they would pre-
fer to work in the second session. Subjects were told
that an effort would be made to give them the part-
ner they chose. Although the experimental situation
of choosing between two handicapped people may
at first glance seem a strange one, subjects did not
seem to consider the situation odd. Now that handi-
capped people are more in the news, this type of
situation is apparently viewed as quite plausible.
When the second questionnaire was completed, the
experimenter asked each subject to explain the rea-
sons for his or her choice. The experimenter then
explained the true purpose of the experiment and 
answered any questions the subject had. 


Results and Discussion 


Significantly more subjects (71%) chose 
to work with the confederate who acknowl-
edged his handicap than with the nonacknowl-
edging confederate, χ²(1) = 7.52, p < .01.
Analyses of the effects of confederate viewed, 
sex of subject, and order of presentation of 
the confederates and interview conditions did 
not reveal any significant differences in which 
confederate was preferred. Furthermore, 79% 
of the subjects who chose the acknowledging 
confederate reported that the acknowledg-
ment was an important factor in their deci-
sion. Clearly the acknowledgment exerted a
major influence on working partner prefer-
ence.¹

Sign tests revealed that of the four ques- 
tions asked each subject about the confeder- 
ates, only the question “Do you think you 
would feel uncomfortable during a conversa- 
tion with the interviewee and your friends, if 
someone began talking about a touch football 
game?” successfully distinguished between 
acknowledging and nonacknowledging confed- 
erates, p < .05. This suggests that subjects 
chose to work with the acknowledging con- 
federate because they anticipated feeling more 
comfortable in his presence. 
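For readers unfamiliar with the sign test used here, the following is a minimal sketch of a two-sided exact version. The article does not say which variant was run, nor does it report the counts behind the test, so the function name is hypothetical and no data from the study appear in the code.

```python
from math import comb


def sign_test_p(n_favoring_a, n_favoring_b):
    """Two-sided exact sign test p value under a 50/50 null.

    Each argument counts the subjects whose paired answers favored one
    confederate over the other; tied pairs are dropped before calling.
    """
    n = n_favoring_a + n_favoring_b
    k = min(n_favoring_a, n_favoring_b)
    # Probability of a split at least as lopsided as the observed one, in one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

A call such as `sign_test_p(10, 2)` would test whether a purely hypothetical split of 10 subjects against 2 departs from chance; the smaller the returned value, the less plausible an even split.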

Analysis of the polar adjectives revealed 
that the acknowledging confederate was rated 
more favorably on the evaluative and potency 
factors and was perceived as more likable and 
better adjusted, smallest t(47) = 2.02, p <
.05. Subject ratings of the handicapped con-
federates thus also support the hypothesis
that acknowledging the handicap is an effec-
tive tactic for increasing working-partner
preference. Moreover, the subjects’ ratings
suggest that the acknowledging tactic may be
successful because it introduces the handi-
capped person as someone who is not overly
sensitive about his handicap.


Experiment 2 


The results of Experiment 1 suggest that 
acknowledging the handicap is an effective
tactic in handicapped/nonhandicapped inter-
action because acknowledgment reduces dis-
comfort of the nonhandicapped interactant.
However, the findings of the first study do not 
rule out the possibility that the effectiveness 
of acknowledgment arises because the ac- 
knowledgment reveals personal information, 
thereby increasing intimacy and liking (Jour- 
ard, 1959; Worthy, Gary, & Kahn, 1969). A 
second study therefore compared the prefer- 
ences of nonhandicapped subjects who had 
watched a handicapped confederate who 
acknowledged his handicap and another who 
disclosed something else personal but did not 
mention his handicap. Since it was hypoth- 
esized that it is acknowledgment of the handi- 
cap that helps nonhandicapped individuals 
feel less uncomfortable and influences their 
preferences, it was expected that subjects 
would still have more favorable impressions 
of, and choose to interact with, the confeder- 
ate who acknowledged his handicap. 


Method 


Subjects. Fifty-five male and female students from 
the same subject population as in Experiment 1 par- 
ticipated in the second study. The data of 48 sub- 
jects were included in the analysis. Three subjects 
were suspicious that the interviews were staged, and 
4 subjects had met or heard about one of the 
confederates. The data of these 7 subjects were 
omitted from the analysis. 

Procedure. The procedure was identical to that of 
Experiment 1, except that the two interviews that 
subjects watched featured a confederate who ac- 
knowledged his handicap and one who made a per- 
sonal disclosure unrelated to his handicap. This per- 
sonal disclosure interview replaced the nonacknowl- 
edgment interview of the first experiment. 

In the personal disclosure interview, the confeder- 
ate stated that school was going well and that he had 
several good friends but that recently he and his 
girlfriend of 8 months had been having problems. 
He then described their uncertainty about next year 
and the “weird religious trip” his girlfriend’s mother 
was “laying on her.” He finished the interview by 
saying that he hoped it all worked out. Although 
there are other types of disclosures that might have
been used, this type of disclosure had the advantage
of being very revealing. Whether other types of dis-
closures might have different effects on preferences
for the handicapped is a question that merits further
exploration.

1 A similar finding has since been reported. An
unpublished paper by Bazakas (Bazakas, Note 1)
reports that acknowledgment of the handicap results
in more favorable reactions toward a handicapped
confederate only when he presents himself as a
“coping” person.

The dependent measures were essentially the same 
as those used in Experiment 1. One of the adjective 
pairs for the activity factor (excitable-calm) was 
replaced by fast-slow, and three more polar adjective 
pairs were added (warm-cold, considerate-inconsider- 
ate, and friendly-unfriendly). Fast-slow replaced 
excitable-calm because the two adjective pairs com- 
prising the activity factor were insignificantly cor- 
related in the first experiment (r = .04). 


Results and Discussion 


As predicted, subjects preferred the con- 
federate who acknowledged his handicap over 
the one who made a personal disclosure un-
related to his handicap, χ²(1) = 13.02, p <
.001. Fully 77% of subjects chose to work
with the confederate acknowledging a handi- 
cap. 

As in Experiment 1 there were no prefer- 
ence effects due to confederate viewed, sex 
of subject, or order of presentation of the 
confederates and interview conditions. 

As a manipulation check of personal dis- 
closure, the personal disclosure interviews 
from Experiment 2 and the nondisclosure 
interviews from Experiment 1 were shown 
to 32 additional subjects who rated the 
interviews on 17 polar adjective pairs. Two 
of these adjective pairs were combined into 
a “disclosing” measure. The scales were
open-closed and secretive-revealing. A t test
of these ratings revealed that there was a
tendency to see the personal disclosure con-
federate as more disclosing, t(31) = 1.67, p <
.11. Although the personal disclosure inter-
view may have differed from the nondis-
closure interview on other dimensions, this
notion cannot be properly tested. There is no 
way to distinguish whether other differences 
would be due to the disclosure itself or to the 
specific type of disclosure used. 

As in Experiment 1, belief in the impor- 
tance of the handicap acknowledgment in 
determining subjects’ preferences for partner 
is substantiated by the reports of subjects 
who chose to work with the confederate who 
acknowledged his handicap. Eighty-two per-
cent of the subjects choosing the acknowledg-
ing confederate stated that handicap acknowl-
edgment was an important factor in their de-
cision.

A sign test on the question “Would you
be more comfortable” found that subjects would be more
comfortable with the acknowledging confed-
erate, p < .05. Again, this suggests that
choice of partner may be related to how
comfortable acknowledgment of the handi-
cap makes the nonhandicapped individual
feel. Analyses of the polar adjectives revealed
that the confederate acknowledging the han-
dicap was rated more favorably on the eval-
uative factor and was perceived as both better
adjusted and more friendly, smallest t(47) =
2.19, p < .05.

The results of Experiment 2 again sup-
port the hypothesis that acknowledging
the handicap (and not just the volunteering
of something personal) is an effective tactic
in increasing partner preference. Further, the
findings of this second study are consistent
with the hypothesis that the acknowledgment
tactic is effective because it helps the handi-
capped individual appear less sensitive about
his handicap, thereby making those who
interact with him more comfortable.


Experiment 3 


It may be difficult, at least for some handi-
capped individuals, to encourage the non-
handicapped to ask questions about their
handicap. Handicapped individuals are, in
fact, often sensitive about their handicaps
and will probably not be able to use the
tactic of acknowledging their handicap with-
out showing nervousness and apprehension.
Before recommending a tactic, it is important
to know something of its limits. This third
study explored the consequences for a handi-
capped individual of showing signs of tension
and anxiety while acknowledging the handi-
cap.

It was hypothesized that nonverbally com-
municated nervousness would cast a different
interpretation upon the confederate’s ac-
knowledgment of the handicap. Specifically,
it was hypothesized that this nervousness
would lead the subject to feel that it was not
acceptable to talk about the handicap. The
characteristic response to discrepant infor-
mation is to rely more on the nonverbal than
the verbal information (Mehrabian, 1971). 
Consequently, subjects might feel even more 
uncomfortable around a handicapped per- 
son who (nonverbally) showed discomfort 
about being handicapped while verbally in- 
sisting otherwise. It was therefore predicted 
that the findings of the first two experiments 
would be reversed: that subjects would have 
more favorable impressions of, and choose to 
interact with, a confederate who disclosed 
something personal but neither mentioned the 
handicap nor showed signs of nervousness
while being personal.


Method


Subjects. Fifty-two male and female students
from the same subject population as in Experi-
ments 1 and 2 participated in the study. The data
of 48 subjects were included in the analysis. Three 
subjects had met or heard about one of the con- 
federates, and 1 subject declined to state a prefer- 
ence between confederates. The data from these 4 
subjects were omitted from the sample. 

Procedure. The procedure was virtually identical 
to that of Experiment 2, except that a “nervous” 
acknowledgment of handicap replaced the interview 
acknowledging the handicap of the second experi- 
ment. The verbal text of the interviews remained 
unchanged. However, during most of the handicap 
acknowledgment in the nervous condition, the con- 
federate attempted to display tension by avoiding 
eye contact, running his hand through his hair, and 
clasping his hands tightly together. The other inter- 
view, as in Experiment 2, was of a confederate 
who made a personal disclosure. In this interview, 
the confederate neither mentioned his handicap 
nor showed signs of nervousness. The dependent 
measures were the same as in Experiment 2. 


Results and Discussion 


Contrary to predictions, subjects tended to 
prefer the confederate who acknowledged his 
handicap, χ²(1) = 3.52, p < .07, even when
he was nervous about doing so. Sixty-five per-
cent of subjects chose to work with this con-
federate.

An analysis of the effects of confederate 
viewed, sex of subject, and order of presenta- 
tion of the confederates and interview condi- 
tions revealed that the order of presentation 
of the interview condition significantly af-
fected confederate preferences, χ²(1) = 5.83,
p < .02. That is, the confederate who was
nervous while acknowledging his handicap 
was preferred significantly more often when 
he was seen first than when he was seen 
second. Since order of presentation of the 
interview condition did not affect confeder- 
ate preferences in the first two studies, this 
finding is difficult to explain. Nevertheless, 
it remains striking that although the con- 
federate who acknowledged his handicap was 
clearly nervous about doing so, he was still 
preferred. 

As a manipulation check on perceived ner-
vousness, the nervous acknowledgment of
handicap and personal disclosure interviews
were shown to 32 additional subjects who
rated each interviewee on 17 polar adjectives.
Three of the adjective pairs were combined
into a “nervousness” measure. The scales
were at ease-nervous, relaxed-tense, and
comfortable-uncomfortable. A t test of these
ratings revealed that the nervous handicap
acknowledgment interview was seen as sig-
nificantly more nervous than the personal
disclosure interview, t(31) = 2.32, p < .05.
There were no differences in perceived ner-
vousness of the confederate who acknowl-
edged a handicap as a function of whether
he was seen first or second.

Even though the confederate was perceived 
as nervous while acknowledging his handi- 
cap, 74% of subjects choosing the acknowl- 
edging confederate identified the acknowl- 
edgment of handicap as an important factor 
in their decision, and 19% of subjects who 
chose the nervous confederate did so despite 
acknowledging that the confederate clearly 
was not comfortable with his own handicap. 
Apparently, then, some subjects may have 
chosen the nervous confederate because he 
was trying to cope with his handicap. 

Subjects realized that they would be just 
as uncomfortable around the confederate 
who acknowledged a handicap as around the 
confederate who only made a personal dis- 
closure; more than half of the subjects re- 
ported that they would be uncomfortable 
around the confederate who acknowledged 
his handicap. Finally, nervousness while ac- 
knowledging did have a negative effect, as 
76% of subjects who rejected the nervous, 
acknowledging confederate stated that they
rejected him because he did not seem ad-
justed to his handicap.

The nervous confederate was perceived to 
be more passive, t(47) = 2.12, p < .05. Com-
parison of no other polar adjective pair rat-
ings attained significance. However, the vari-
ance of the ratings on 10 of the 12 adjective
pairs was larger when subjects were rating 
the nervous, acknowledging confederate. A 
check on the variance of subject ratings in 
the first two experiments revealed that this 
pattern of variance in the third experiment 
was a significant reversal of the pattern in 
the first two studies, χ²(2) = 38.6, p < .001.
One interpretation of these results is that in 
the first two experiments, subjects were more 
certain how to act with the confederate ac- 
knowledging a handicap because his commu- 
nication in acknowledging made it clear what 
behavior was appropriate. Impressions of sub- 
jects were therefore relatively uniform. In 
contrast, the message of the acknowledgment 
of handicap in the third experiment was am- 
biguous. While verbally communicating an 
openness about his handicap, the nervous
confederate was clearly betraying an anxious- 
ness about the topic nonverbally. As a re- 
sult, impressions of subjects were uncertain 
and thereby varied. 

This interpretation is consistent with the 
ambivalence hypothesis suggested by Katz 
and his colleagues (Katz, Glass, & Cohen, 
1973; Katz, Glass, Lucido, & Farber, 1977). 
They point out that ambivalence creates a 
tendency toward behavioral instability. As a 
result, any positive information amplifies fa- 
vorable attitudes, and similarly, any negative 
information amplifies unfavorable attitudes.
Our subjects, like most nonhandicapped in- 
dividuals, probably have ambivalent (ap- 
proach/avoidance conflicted) attitudes to- 
ward the handicapped. Acknowledgment 
of the handicap in the first two experi- 
ments was positive information because it 
reduced discomfort. Therefore, it amplified 
favorable impressions of the acknowledging 
confederates. Acknowledgment of the handicap
in the third experiment contained both
positive information (verbally) and negative
information (nonverbally). Therefore, it
amplified either favorable or unfavorable
impressions, depending on which source of
communication was more salient for each
subject. This possibility of amplification of
impressions in either direction accounts for the
increased variance found in ratings of the
acknowledging confederate in Experiment 3;
the probable high salience of his nervousness
accounts for the reduced effectiveness of the
acknowledging tactic.


General Discussion 


The findings of these three experiments 
have important implications for social policy; 
the tactic of acknowledging a handicap seems 
very promising. In all three studies, the con- 
federate who acknowledged his handicap was 
preferred to the other handicapped confed- 
erate. This finding was true even if the other 
confederate made a personal disclosure. 

In contrast to the results of investigations 
on personal disclosure (Ehrlich & Graeven, 
1971; Jourard & Landsman, 1960), acknowl- 
edgment of the handicap was effective even 
though males were doing the acknowledging. 
Perhaps the macho stereotype of males, 
which views disclosure as weak and effemi- 
nate, is not a barrier to the acknowledging 
tactic.

It is noteworthy that the acknowledging 
confederate was still slightly more preferred 
even when he was clearly nervous about the 
acknowledgment. This suggests that handi- 
capped people may use the tactic effectively 
even before they are well adjusted to their 
handicap. By acknowledging their handicap, 
handicapped individuals can reduce the dis- 
comfort and uncertainty of the nonhandi- 
capped and thereby increase their opportuni- 
ties for social interaction and its accompany- 
ing benefits. This increased opportunity for 
social interaction can then itself help pro- 
mote smoother adjustment to the handicap. 

Further research is needed, however, to
explore the impact of the acknowledgment
tactic. It would be useful to know, for
instance, if the acknowledgment tactic
attenuates the biased, inhibited behavior that the
nonhandicapped usually demonstrate in the
presence of handicapped individuals. The
behavior of nonhandicapped individuals
should be measured in interactions with
handicapped individuals employing the
acknowledgment tactic. Studies of this kind will
ultimately determine the utility of
acknowledging the handicap.

Depression and Rape

Two types of self-blame—behavioral and characterological—are distinguished.
Behavioral self-blame is control related, involves attributions to a modifiable 
source (one’s behavior), and is associated with a belief in the future avoid- 
ability of a negative outcome. Characterological self-blame is esteem related, 
involves attributions to a relatively nonmodifiable source (one’s character), and 
is associated with a belief in personal deservingness for past negative outcomes. 
Two studies are reported that bear on this self-blame distinction. In the first 
study, it was found that depressed female college students engaged in more 
characterological self-blame than nondepressed female college students, whereas 
behavioral self-blame did not differ between the two groups; the depressed 
population was also characterized by greater attributions to chance and de- 
creased beliefs in personal control. Characterological self-blame is proposed as 
a possible solution to the “paradox in depression.” In a second study, rape 
crisis centers were surveyed. Behavioral self-blame, and not characterological 
self-blame, emerged as the most common response of rape victims to their 
victimization, suggesting the victim’s desire to maintain a belief in control, 
particularly the belief in the future avoidability of rape. Implications of this 
self-blame distinction and potential directions for future research are discussed. 


In a study by Bulman and Wortman 
(1977) on the relationship between blame 
attributions and coping, self-blame emerged 
as a predictor of good coping among para- 
lyzed victims of freak accidents. A conclusion 
that is consistent with these results—that 
self-blame is a positive psychological
mechanism—derives primarily from the
implications of this attribution for a belief in
personal control over one’s outcomes. The
advantages of perceived control have been
repeatedly demonstrated in social psychological
experiments (see, e.g., Bowers, 1968; Glass & 
Singer, 1972; Langer & Rodin, 1976; Schulz, 
1976); Walster’s (1966) formulation of ob- 
servers’ reactions to victims and Kelley’s 
(1971) view of attributional processes “as a
means of encouraging and maintaining [his]
effective exercise of control in the world”
(p. 22) are also based upon a recognition of
the significance of perceived control. The
tenuous link between control and self-blame
becomes comprehensible as one realizes that
in order to maximize a belief in control when
attributing blame to particular factors, one’s
choice is influenced by the perceived
modifiability of the potential factor(s). As Medea
and Thompson (1974) write in the case of
rape, “If the woman can believe that somehow
she got herself into the situation, if she
can make herself responsible for it, then she’s
established some sort of control over rape.
It wasn’t someone arbitrarily smashing into
her life and wreaking havoc” (p. 105).
Unfortunately, this adaptive, control-oriented
view of self-blame too easily ignores
the more popular conception of the phenomenon,
by which self-blame is regarded as
maladaptive, a correlate of depression, and a
reflection of psychological problems. Thus
Beck (1967), writing about depressed patients,
states, “Another symptom, self-blame,
expresses the patient’s notion of causality.
He is prone to hold himself responsible for
any difficulties or problems that he encounters”
(p. 21). Self-blame as a maladaptive
psychological mechanism is generally related
to harsh self-criticism and low evaluations of
one’s worth.


Two Types of Self-Blame 


Recognizing that self-blame may be both 
adaptive and maladaptive is a first step to- 
wards the conclusion that there are two 
different types of self-blame, one representing 
an adaptive, control-oriented response, the 
other a maladaptive, self-deprecating re- 
sponse. The primary distinction between 
these two self-attributions is the nature of 
the focus of blame, for it is proposed that 
the control related self-blame focuses on one’s 
own behavior, whereas the esteem related 
self-blame focuses on one’s character, an 
overall view of the kind of people individuals 
perceive themselves to be. In other words, 
individuals can blame themselves for having 
engaged in (or having failed to engage in) a 
particular activity, thereby attributing blame 
to past behaviors; or individuals can blame 
themselves for the kind of people they are, 
thereby faulting their character. To facilitate 
discussion of these two self-attributional
strategies, the esteem related blame will be 
labeled “characterological” self-blame and 
the control related type, “behavioral” self- 
blame. In the case of rape, for example, a 
woman can blame herself for having walked 
down a street alone at night or for having 
let a particular man into her apartment (be- 
havioral blame), or she can blame herself for 
being “too trusting and unable to say no” 
or a “careless person who is unable to stay 
out of trouble.” This behavioral-characterological
distinction parallels findings in the
area of the just world theory. In their recent
review, Lerner and Miller (1978) state
that innocent victims who cannot be
characterologically blamed (i.e., derogated) by
virtue of their reputedly good character are
instead blamed for some behavior in which
they engaged (i.e., behavioral blame).

While this distinction between characterological
and behavioral self-blame appears
related to the state-trait distinction in clini- 
cal psychology (see, e.g., Spielberger, 1972), 
it more specifically corresponds to the dis- 
tinctions drawn by Weiner and his colleagues 
(Weiner et al., 1971) in their scheme of 
attributions in the area of achievement. In 
attributing failure to oneself (internal attri- 
bution), one can point to his/her own lack 
of ability or effort, attributions that have 
very different implications for perceived
control. Individuals who make an attribution
to poor ability believe that there is little 
they can do to control the situation and suc- 
ceed, for ability is stable and relatively un- 
changeable. Effort attributions, on the other 
hand, will lead one to believe that as long 
as he/she tries harder, he/she will be able to 
control outcomes in a positive manner (see 
Dweck, 1975). Similarly, characterological 
self-blame corresponds to an ability attribu- 
tion, and behavioral self-blame corresponds 
to an effort attribution, having very different 
implications for perceived personal control.
While the dimension used by Weiner and his 
colleagues to distinguish between ability and 
effort is that of stability (stable-unstable), 
the differences between the attributions may 
also be captured through the use of a con- 
trollability dimension (cf. Elig & Frieze, 
1975; Weiner, 1974). The primary distinc- 
tion to be drawn between behavioral and 
characterological self-blame is the perceived 
controllability (i.e., modifiability through
one’s own efforts) of the factor(s) blamed. 
In a recent reformulation of learned help- 
lessness, Abramson, Seligman, and Teasdale 
(1978) have posited a third dimension of at- 
tributions—global-specific—that is important 
to specify in determining subsequent per- 
ceived helplessness. While this global-specific 
dimension characterizes one of the differences 
to be noted between characterological and 
behavioral self-blame, it is proposed that the 
dimension of significance distinguishing these 
two types of self-blame is perceived
controllability, and the global-specific and
stable-unstable dimensions are important
because of their contribution to perceived
control. Abramson, Seligman, and Teasdale
(1978), however, note that the dimension of
“controllability is logically orthogonal to the
Internal × Global × Stable dimensions . . .”
(p. 62). The position presented here is
consistent with a comment by Wortman and
Dintzer (1978) in their recent reaction to 
the learned helplessness reformulation. They 
state, “We feel that assessments of the con- 
trollability of the causal factor may be of 
the utmost importance in predicting the na- 
ture and magnitude of subsequent deficits” 
(p. 82). 

In a discussion of self-blame by Abramson, 
Seligman, and Teasdale (1978) these authors 
state that self-blame in helplessness and de- 
pression follows from the “attribution of 
failure to factors that are controllable” (p. 
62). The self-blame they are dealing with is 
that which is consistent with “self-esteem 
deficits” and “self-criticism” and thus paral- 
lels characterological self-blame. The authors 
do not recognize a second type of self-blame, 
behavioral self-blame, which according to 
the present analysis is the type of self-blame 
following from attributions to controllable 
factors. Contrary to the assertions of Abram- 
son, Seligman, and Teasdale, it is proposed 
that characterological self-blame follows from 
attributions to uncontrollable factors. 

A further distinction between behavioral 
and characterological self-blame lies in the 
time orientation of the attributor. It is pro- 
posed that in blaming one’s behavior, an 
individual is concerned with the future, par- 
ticularly the future avoidability of the nega- 
tive outcome. This concern for future avoid- 
ability is consistent with the control-moti- 
vated basis for behavioral self-blame. The 
future-oriented concerns of behavioral self- 
blamers need not focus exclusively on the 
future avoidability of the negative outcome 
for which the attributor is blaming him/her- 
self; rather, behavioral self-blame may pro- 
mote a general belief in one’s ability to avoid 
negative outcomes and to effect positive out- 
comes in the future. Thus, the paralyzed 
victims in the Bulman and Wortman (1977) 
study were apt to be better copers if they 
blamed themselves, but self-blame was more 
likely to be in the service of a general belief 
in future control (e.g., I'll be able to im- 
prove my physical condition through physi- 
cal therapy), rather than a more specific 
belief in the future avoidability of their own 


RONNIE JANOFF-BULMAN 


paralysis, which was medically regarded as
irreversible in all cases.

In blaming himself or herself charactero- 
logically, the individual is not concerned 
with control in the future, but rather with 
the past, particularly deservingness for past 
outcomes. Individuals who engage in behavioral
self-blame are apt to have an eye
towards the future and what they can do
to avoid a recurrence of the negative outcome
(or the occurrence of negative outcomes
in general). Individuals who engage
in characterological self-blame are apt to
focus more on the past and what it was about
them that rendered them deserving of the
negative outcome for which they are blaming
themselves.¹ Perceived avoidability and
behavioral self-blame are thus assumed to be
part of the same blame cluster, whereas
characterological self-blame and feelings of
deservingness are representative of another
blame cluster.


Self-Blame and Depression: 
Toward the Resolution of a Paradox 


Distinguishing between characterological
and behavioral self-blame may be a first step
toward resolving the “paradox in depression”
recently recognized and discussed by Abramson
and Sackeim (1977). According to these
authors, there are two symptom clusters of
depression, one represented by hopelessness,
powerlessness, and futility, the other by
self-blame, self-deprecation, and guilt. Abramson
and Sackeim discuss two prominent theories
of depression that are based on cognitions
of hopelessness and self-blame. Seligman’s
(1975) theory of learned helplessness
suggests that depression results from a belief
in the uncontrollability of outcomes.
According to Beck’s (1967) theory of depression,
the depressed individual blames him/herself
for negative outcomes, particularly personal
failures. It is the conjunction of these two
models that accounts for the paradox. That
is, how can individuals blame themselves
for outcomes over which they feel they have
had no control? How can an individual feel
both helpless and self-blaming? Abramson
and Sackeim discuss several possible
resolutions to this paradox but remain dissatisfied
with the alternatives presented to date.
However, a recognition of self-blame, not as a
unitary phenomenon, but rather as a label
for two very different self-attributional
strategies, may inform and resolve the apparent
paradox in depression.

¹ These distinctions are consistent with a recent
analysis of responsibility by Harvey and Rule
(1978), in which causal responsibility and
[...]ness are regarded as conceptually distinct
[...] of responsibility.

One reason why a resolution to the de- 
pression paradox has not been forthcoming 
is suggested by Abramson, Seligman, and 
Teasdale’s (1978) assertion (presented 
above) that self-blame follows from
attributions to controllable factors. In assuming that
self-blame naturally involves blaming
controllable factors, the possibility that
individuals can simultaneously feel they do not have
control and blame themselves is foreclosed. 
Instead, if we recognize the distinction be- 
tween behavioral and characterological self- 
blame, then the paradox begins to disappear. 
Essential to an understanding of this asser- 
tion is the proposition that in blaming him- 
self or herself for the kind of person he/she is, 
the individual is not necessarily placing 
blame for an event regarded as personally 
controllable. A person can believe that he/she 
deserves what happened and is therefore “re- 
sponsible” for it (see Harvey & Rule, 1978), 
without believing that he/she is capable of 
altering the outcome in the past, present, or 
future.

In the case of personal failures, the
characterological blamers will point to deficits in
themselves that are believed to account for
these failures. The deficits are likely to lie
in the realm of characteristics that generally
define them, characteristics that are
relatively nonmodifiable, stable, and global.
Thus, in achievement tasks, an ability
attribution would represent a characterological
self-blaming strategy. In the case of
self-blame for failures that are further removed
from the individual, represented by the
“delusions of depressives” who blame themselves
for the “violence and suffering in the world”
(see Beck, 1967), the individuals appear to
regard themselves as being punished for who
or what they are. In this case, rather than
perceive themselves as responding, active or- 
ganisms, depressed individuals seem to per- 
ceive themselves as passive stimuli. They do 
not believe they actively bring about out- 
comes that remain under their control. 
Rather, negative outcomes occur in reaction 
to them by other people and the world at 
large. In sum, self-blame by depressives is 
proposed as characterological in nature. Since 
characterological self-blame and feelings of 
helplessness are not logically inconsistent, 
their conjunction in depressed individuals 
should not be regarded as paradoxical. 


Self-Blame Among Rape Victims 


The association between self-blame and 
depression is probably well recognized and 
accepted within this culture. While the asso- 
ciation between self-blame and rape is prob- 
ably not as strong, the more or less popular 
image of the self-blaming rape victim may 
be more accurate than many would like to 
believe. The pervasiveness of self-blame has 
been well documented in literature on rape 
(see, e.g., Burgess & Holmstrom, 1974a, 
1974b, 1976; Griffin, 1971; Hursch, 1977; 
Weis & Weis, 1975; Bryant & Cirel, Note 1). 
Although fear (of injury, death, and the 
rapist) is the primary reaction to rape, self- 
blame may be second only to fear in fre- 
quency of occurrence; perhaps surprisingly, 
it is far more common than anger. 

In considering the few existing facts on 
victim precipitation in the crime of rape, 
however, it becomes obvious that the
victims’ self-attributional strategies (i.e.,
self-blame) do not reflect an accurate appraisal
of the woman’s causal role in the assault. 
The National Commission on the Causes and 
Prevention of Violence (1969) concluded that 
only 4.4% of all rapes are precipitated by 
the victim. Although a higher figure, 19%, 
has been proposed by Amir (1971), he used a 
considerably broader definition in establish- 
ing his criteria for victim precipitation. Thus, 
criteria such as “risky situations marred with 
sexuality” were used, affording the interpreter
of data considerable discretion. In light of 
these percentages, the pervasiveness of
self-blame becomes a puzzling phenomenon.
An attempt to account for the pervasiveness
of such feelings has involved the
proposition that women have been socialized to
accept blame for their own victimization. As 
Brownmiller (1975) suggests, women are 
conditioned to a female victim mentality.
Brownmiller discusses the psychologies of 
Deutsch and Horney and concludes that 
masochism is a female trait, one that has 
been socialized by men. Similarly, Burgess
and Holmstrom (1974a) contend that women 
are socialized to the attitude of “blaming 
the victim,” a perspective shared by Bryant 
and Cirel (Note 1). While there is no doubt 
much truth to this socialization hypothesis, 
it may paint a very incomplete picture of 
the factor(s) responsible for self-blame in 
women and the rape victim in particular. 
It fits nicely with a portrait of women as 
helpless and masochistic and may unwit- 
tingly perpetuate a view of women too con- 
sistent with the role of rape victim. In par- 
ticular, this view entirely overlooks the possi- 
bility that self-blame by victims of rape may 
represent an adaptive response, an attempt 
to reestablish control following the trauma 
of rape. 

A common reaction to rape is the feeling 
of a loss of control over one’s life (Bard & 
Ellison, 1974; Bryant & Cirel, Note 1). The 
woman does not feel sure of herself and ques- 
tions her self-determination. She needs to 
feel a sense of control (Hilberman, 1976), 
for she feels extremely vulnerable and par- 
ticularly fears the rapist and a recurrence of 
rape. In blaming herself, perhaps the rape 
victim is engaging in a type of self-blame 
that maximizes a belief in control; that is, 
perhaps rape victims engage in behavioral 
self-blame rather than characterological self- 
blame. Whereas the latter type of blame 
would provide some support for a view of 
women as helpless and masochistic, the for- 
mer would foster a different image of the 
rape victim and her reactions, that of an in- 
dividual reacting in an adaptive manner 
to her recent loss of control. 

If the rape victim engages in behavioral 
self-blame and attributes her victimization 
to a modifiable behavior (e.g., I should not 
have walked alone, I should have locked the 
windows), she is likely to maintain a belief
in the future avoidability of a similar
misfortune, while simultaneously maintaining a
belief in personal control over important life 
outcomes. If, on the other hand, the rape 
victim blames herself characterologically, at- 
tributing the victimization to more or less 
unchangeable factors (e.g., I’m a weak person
and can’t say no, I’m the type of person who
attracts rapists), she will presumably be con- 
siderably less likely to believe that she is 
capable of alleviating her vulnerability in the 
future and may begin to perceive herself as 
a chronic victim. 

Two studies were conducted in order to
test the usefulness of the distinction
between behavioral and characterological
self-blame in the areas of depression and rape.
Study 1 was designed to determine whether 
characterological self-blame is a distinguish- 
ing characteristic of depressed individuals 
and whether it co-occurs with decreased be- 
liefs in personal control among female col- 
lege students. Study 2 involved surveying 
rape crisis centers across the country in order 
to determine which type of self-blame—be- 
havioral or characterological—more accu- 
rately characterizes the reactions of rape vic- 
tims served by these centers. 


Study 1: Depression 
Method 


Subjects. Subjects were 129 undergraduate women
at a large state university who were volunteers
drawn from a number of undergraduate psychology
courses. Each received one experiment
credit for her participation. Responses from nine of
the subjects lacked much data, and these were
eliminated from the analyses, leaving the responses of
120 subjects.

Procedure. Data² were collected during [...]
sessions that generally ranged from 10 to [...]
students. Subjects were told that we were interested in
the relationship between personality variables and
artistic taste, and that there would be three parts
to the study: completing personality scales, viewing
and rating a series of art slides, and reacting
to several “real-life” types of situations. The
subjects were asked to complete three personality scales.
The Zung Self-Rating Depression Scale (Zung, 1965)
was used to measure depression.³ In addition,
subjects completed the Janis-Field Feelings of
Inadequacy Scale (Eagly, 1967), a self-esteem measure,
and the Rotter Internal-External Locus of Control
Scale (Rotter, 1966). Having completed these, the
subjects were then asked to rate seven art slides on
“aesthetic appeal.” An overhead projector was used
to show the slides, and ratings were made on
5-point scales. These artistic ratings not only
provided a “cover” for the study but, more important,
served as a distraction between the first
(personality scales) and third (self-blame measures) parts
of the experiment.

² These data were collected by Laurie Gunsolley
for her senior honors thesis, which was [...]
and completed under the direction of [...].
While Gunsolley was particularly interested in
self-esteem, the data have been reanalyzed for [...]
presentation, using the responses to the Zung
Self-Rating Depression Scale (1965) as [...]
distinguishing between the two groups of [...].

In the third phase of the study, subjects were 
asked to read four scenarios and to imagine that 
the various situations described had actually been 
experienced by them; that is, they were told to 
react to the scenarios given that they were the 
target people presented. In each situation the
outcome was negative and the role of the target
person was intentionally ambiguous. The scenarios
involved the following situations: (a) a car driven
by the target person is in an accident on a snowy
winter day; (b) a social invitation by the target
person is rejected (on the basis of false excuses)
by an individual she recently met and regarded as
a friend; (c) an urgent call for a roommate results
in the target person’s taking down the wrong num- 
ber; the roommate is subsequently unable to return 
the call successfully; (d) an intense love relation- 
ship is ended when the target person’s boyfriend 
leaves her and immediately gets involved with an- 
other woman. 

Subjects were asked to respond to five questions
following each scenario; responses were made on
6-point scales with endpoints not at all and
completely. Subjects were asked to indicate how much
they blamed themselves, other people, the
environment (i.e., impersonal world), and chance, for the
situations described. The question that tapped
characterological self-blame asked, “Given what
happened, how much do you blame yourself for
the kind of person you are (e.g., the kind of person
who is in an accident [Scenario A], the kind of
person who has invitations turned down [Scenario
B], the kind of person who causes inconveniences
for others [Scenario C], the kind of person who is
rejected in relationships [Scenario D])?” The
appropriate “kind of person” was included separately
for each scenario, so that for Scenario D only “the
kind of person who is rejected in relationships”
was included. Question 3 sought to tap behavioral
self-blame and asked, “Given what happened, how
much do you blame yourself for what you did
(e.g., your driving behavior [Scenario A], how you
acted when you first met the person [Scenario B],
how you acted when taking down the telephone
number [Scenario C], how you acted with your
boyfriend [Scenario D])?” Question 4 asked, “How
much do you think you deserved what happened?”
and Question 5 following each scenario was, “If the
same situation arose in the future, to what extent
do you believe that you could avoid what
happened in this case?” All subjects were thoroughly
debriefed upon completion of the session.


Results 


Using a median split, subjects were divided 
into nondepressed (responses ranged from 
6 to 21 on the Zung scale) and depressed 
(22 to 45 on the Zung scale) groups.⁴ On the
Janis-Field Feelings of Inadequacy Scale, 
the depressed group scored lower (i.e., had 
lower self-esteem) than the nondepressed 
group (64.05 vs. 74.97), F(1, 118) = 36.72, 
p < .001, and the depressed group was found 
to be more external than the nondepressed 
group on Rotter’s Internal-External Locus
of Control Scale (13.36 vs. 10.57), F(1, 118) 
= 12.32, p < .001. 
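
The grouping and comparison steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `median_split` and `one_way_f` are hypothetical, the tie rule (scores strictly above the median count as the upper group) is an assumption because the text does not say how ties were handled, and the scores are made up rather than taken from the study.

```python
from statistics import mean, median


def median_split(scores):
    """Label each score True (upper group) or False (lower group).

    Assumption: scores strictly above the sample median form the upper
    group; the text does not say how ties at the median were assigned.
    """
    cutoff = median(scores)
    return [score > cutoff for score in scores]


def one_way_f(group_a, group_b):
    """Return (F, (df_between, df_within)) for a two-group one-way ANOVA."""
    m_a, m_b = mean(group_a), mean(group_b)
    grand_mean = mean(list(group_a) + list(group_b))
    ss_between = (len(group_a) * (m_a - grand_mean) ** 2
                  + len(group_b) * (m_b - grand_mean) ** 2)
    ss_within = (sum((x - m_a) ** 2 for x in group_a)
                 + sum((x - m_b) ** 2 for x in group_b))
    df_within = len(group_a) + len(group_b) - 2
    return ss_between / (ss_within / df_within), (1, df_within)


# Hypothetical scores for six subjects (not data from the study).
zung = [8, 15, 21, 22, 30, 41]
esteem = [80, 76, 71, 66, 63, 60]

is_depressed = median_split(zung)
depressed = [e for e, d in zip(esteem, is_depressed) if d]
nondepressed = [e for e, d in zip(esteem, is_depressed) if not d]
f_ratio, df = one_way_f(depressed, nondepressed)
print(f"F{df} = {f_ratio:.2f}")
```

With two groups totaling 120 subjects, the within-groups degrees of freedom in this sketch come to 118, which matches the F(1, 118) values reported in the text.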

Parallel attributional and self-blame mea- 
sures were summed across the four scenarios; 
for example, a score for characterological 
blame was derived by adding the individual 
responses to each of the four questions (one 
following each scenario) that asked about 
characterological self-blame. In order to 
justify adding the four scales, alpha relia- 
bility coefficients were calculated for each of 
the eight summed scores. Despite the fact 
that each was composed of four scores, only 
the perceived avoidability measure failed to 
reach a reliability of .50. The general self 
and other people attributions were less than 
.60, and the other five measures had alpha 
reliabilities between .62 and .74.⁵ Each
summed score could range from a total of
0 to 24.

³ In accordance with work on depression by Bonnie
Strickland, a clinical psychologist in the
Department of Psychology, University of Massachusetts,
Amherst, a response category labeled “none
of the time” was added to the Zung Self-Rating
Depression Scale (1965). According to Strickland
(personal communication), this renders the scale
particularly sensitive to depression in a college
population.

⁴ Male and female college students completed the
same depression scale in a study by Haley and
Strickland (Note 2); their data had a median of 23.

⁵ Nunnally (1967) writes that in early work on
“hypothesized measures of a construct,” reliabilities
of .50 or .60 are sufficient standards (see p. 226).
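
The summing and reliability steps can be illustrated with a short sketch. It assumes that the "alpha reliability coefficients" are Cronbach's coefficient alpha, which is the usual meaning but is not spelled out in the text; the function names and the response values are hypothetical.

```python
from statistics import variance


def summed_scores(responses):
    """Sum each subject's ratings across scenarios.

    `responses` holds one row per subject and one rating per scenario.
    """
    return [sum(row) for row in responses]


def cronbach_alpha(responses):
    """Coefficient alpha for k items: k/(k-1) * (1 - sum(item var) / total var)."""
    k = len(responses[0])
    items = list(zip(*responses))  # transpose: one tuple of ratings per scenario
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance(summed_scores(responses))
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical characterological self-blame ratings, four scenarios per subject.
ratings = [
    [2, 3, 1, 4],
    [5, 4, 4, 6],
    [1, 2, 2, 1],
    [3, 3, 5, 4],
    [4, 5, 3, 5],
]
print(summed_scores(ratings))
print(round(cronbach_alpha(ratings), 2))
```

Alpha rises when the four scenario ratings vary together across subjects, which is the justification the text gives for adding them into a single score.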

The depressed and nondepressed groups 
did not differ in the amount of blame they 
attributed to themselves in general, nor did 
they differ in the amount of behavioral self- 
blame reported, F(1, 118) = 2.47, ns. How- 
ever, the groups did differ significantly in 
the amount of characterological self-blame 
reported, with more characterological self- 
blame reported by the depressed than the 
nondepressed group (11.59 vs. 10.03), F(1, 
118) = 4.33, p < .05. Other differences on 
the total scores were found for attributions 
to chance; consistent with their greater ex- 
ternality on Rotter’s scale, the depressed 
group blamed chance more than the nonde- 
pressed group (11.38 vs. 9.73), F(1, 118) = 
4.54, p < .05. Further, there was a margin- 
ally significant difference between the two 
groups on the question of how much they 
felt they deserved what happened, with the 
depressed group reporting greater deserving- 
ness than the nondepressed group (8.31 vs. 
7.24), F(1, 118) = 3.68, p = .057.⁶

A stepwise discriminant analysis was con- 
ducted in order to determine the variables 
that differentiated best between the two 
groups. All blame attribution measures were 
entered, with the exception of general self- 
blame, since characterological and behavioral 
self-blame were assumed to be finer distinc- 
tions of the general measure. Attributions to 
chance emerged as the best discriminator, 
F(5, 114) = 4.54, Wilks’s Λ = .963, and
characterological self-blame emerged as the next
strongest differentiator, F(5, 114) = 3.29,
Wilks’s Λ = .937. These were followed,
respectively, by attributions to other people,
environment, and behavioral self-blame.
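
As a rough check on how the two statistics relate: for a single discriminating variable, Wilks's lambda is the ratio of within-groups to total sum of squares, so it can be recovered from an F ratio and its degrees of freedom. The sketch below assumes that the first step corresponds to the univariate comparison on chance attributions, with 1 and 118 degrees of freedom; that pairing is an assumption made here for illustration, and the function name is hypothetical.

```python
def wilks_lambda_from_f(f_ratio, df_between, df_within):
    """Wilks's lambda for one variable: SS_within / SS_total.

    Since F = (SS_between / df_between) / (SS_within / df_within),
    SS_between / SS_within = F * df_between / df_within.
    """
    return 1.0 / (1.0 + f_ratio * df_between / df_within)


# Assumed pairing: the univariate F for chance attributions, 1 and 118 df.
print(round(wilks_lambda_from_f(4.54, 1, 118), 3))
```

Under that assumption the result rounds to .963, the lambda reported for attributions to chance.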

The correlations between deservingness, 
perceived avoidability, and the two types of 
self-blame were all strong. As an exploratory 
tool, an analysis of variance was conducted in 
order to further investigate the relationship 
between the variables. It should be noted that 
the low reliability of the avoidability measure 
calls for caution in interpreting the results of 
this analysis. Median splits were performed 
on the behavioral self-blame and character- 
ological self-blame totals, and deservingness 
and avoidability totals were each analyzed 
by behavioral (high-low) and characterological 
(high-low) self-blame. A main effect for 
characterological self-blame emerged from the 
analysis of deservingness, with less deservingness 
reported by those who engaged in low 
characterological self-blame as compared with 
those who engaged in high characterological 
self-blame (6.20 vs. 9.26), F(1, 119) = 29.78, 
p < .001. On the other hand, a main effect 
for behavioral self-blame emerged from the 
analysis of perceived future avoidability, with 
less perceived future avoidability reported by 
those who engaged in low behavioral self- 
blame as compared with those who engaged in 
high behavioral self-blame (11.89 vs. 13.88), 
F(1, 118) = 6.85, p < .01. 
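
The median-split grouping used in this exploratory analysis can be illustrated with a minimal sketch. The scores below are invented purely for illustration and are not the study's data; treating values strictly above the median as "high" is also an assumption, since the text does not say how ties were handled.

```python
from statistics import mean, median


def median_split(values):
    """Label each value 'high' if it exceeds the sample median, else 'low'."""
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]


# Hypothetical scores (assumption: not taken from the article).
characterological = [4, 9, 12, 15, 7, 14]
deservingness = [5, 6, 9, 10, 6, 9]

groups = median_split(characterological)

# Mean deservingness within each half of the split.
cell_means = {
    level: mean(d for d, g in zip(deservingness, groups) if g == level)
    for level in ("low", "high")
}
print(groups)
print(cell_means)
```

In the study the same split was applied to both self-blame totals, giving a 2 x 2 layout; the sketch shows only one factor to keep the grouping step visible.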


Discussion 


When self-blame was treated as a single 
entity (i.e., “self” as one of several possible 
factors tapped for blame attributions), no 
differences were found between the depressed 
and nondepressed students on this variable. 
However, when self-blame was divided into 
two types of self-attributions, behavioral and 
characterological, differences between the 
groups emerged. While the depressed and 
nondepressed students did not differ in terms 
of behavioral self-blame, they did differ sig- 
nificantly in terms of characterological self- 
blame; that is, the depressed students blamed 
themselves more characterologically than did 
the nondepressed students. 

The suggestion that characterological self-blame 
follows from attributions to uncontrollable 
factors received strong support. 
Those who were depressed were more likely to 
attribute negative outcomes to chance, a 
variable that differentiated best between the 
depressed and nondepressed groups. Further, 
the depressed subjects were more external in 
locus of control orientation. Low self-esteem 
and somewhat increased feelings of deservingness 
characterized the depressed population, 
suggesting that characterological self-blame is 
esteem related, not control related. 


6 When analyses were rerun using the conservative 
Scheffé procedure to correct for error rate 
inflation, significant differences between the depressed 
and nondepressed groups were again found for self-esteem, 
internal-external control, characterological 
self-blame, and chance. 

The results would have been considerably 
more compelling if it were found that those 
students who were not depressed engaged in 
more behavioral self-blame than depressed 
students, yet this was not the case. However, 
it can perhaps be argued that the behavioral 
self-blame reported by the depressed and 
nondepressed populations differed in an im- 
portant way; for the depressed group the be- 
havioral self-blame co-occurred with char- 
acterological self-blame, and blaming one’s 
behavior was thus an extension of blaming 
one’s character. It may be difficult to blame 
one’s character without blaming one’s be- 
havior, yet it may be very possible to blame 
one’s behavior without blaming one’s char- 
acter. In the former instance the behavior 
may be regarded as uncontrollable in that it 
is a direct and unalterable extension of one’s 
character (i.e., controlled by one’s character). 
In the latter case the behavioral self-blame 
does not reflect decreased self-esteem, but 
rather the belief that one’s behavior is mod- 
ifiable. Perhaps behavioral self-blame, when 
displayed in conjunction with characterolog- 
ical self-blame, is simply a further reflection 
of characterological self-blame. However, 
when it occurs alone it is likely to represent 
an adaptive response, stemming from a desire 
to maintain a belief in personal control follow- 
ing a negative outcome. 


Study 2: Rape⁷ 
Method 


Respondents. Respondents were rape crisis centers 
located throughout the United States. Center names 
were derived primarily from a list located in a federal 
report on rape and its victims (Brodyaga et al., 
Note 3); this list was supplemented by names of 
rape crisis centers found in an informal directory at 
a local women’s center. Services that were hotlines 
only or were task forces without counseling services 
were excluded from the final list. Questionnaires were 
mailed to 120 centers representing 37 states and the 
District of Columbia. Thirty of the questionnaires 
were returned “addressee unknown.” Of the remaining 
90 crisis centers, 48 responded (53% return 
rate; including those returned “addressee unknown,” 
the return rate was 40%). 
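
The two return rates quoted above follow directly from the reported counts. A brief sketch of the arithmetic, using only the figures given in this paragraph (the variable names are illustrative):

```python
# Counts reported in the Respondents paragraph.
mailed = 120         # questionnaires sent out
undeliverable = 30   # returned "addressee unknown"
responded = 48       # centers that replied

delivered = mailed - undeliverable       # centers actually reached
rate_delivered = responded / delivered   # return rate among reachable centers
rate_all = responded / mailed            # return rate counting undeliverables

print(f"{rate_delivered:.0%} of delivered, {rate_all:.0%} of all mailed")
```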

Questionnaire. In a cover letter I identified myself 
as a social psychologist interested in the nature of 
self-blame among victims of rape; letter recipients 
were asked to base their questionnaire responses on 
their experiences as counselors of rape victims. The 
questionnaire items dealt primarily with the issue of 
self-blame. Crisis centers were asked to indicate 
approximately how many rape victims they see yearly 
and of those they see, the percentage who blame 
themselves, at least in part, for the rape. The be- 
havioral self-blame question asked, “Of the rape 
victims you see, what percentage blame themselves 
for the rape because of some behavior (act or omis- 
sion) they engaged in at the time of or immediately 
prior to the rape (e.g., ‘I should not have walked 
alone,’ ‘I should not have hitchhiked,’ ‘I should have 
locked my windows’)?” The rape crisis centers were 
then asked to provide specific examples of behavioral 
self-blame related by the women they have coun- 
seled. The characterological self-blame question asked, 
“Of the rape victims you see, what percentage blame 
themselves for the rape because of some character 
trait or personality flaw they believe they have (e.g., 
‘I am so stupid, I deserved to be raped,’ ‘I’m the 
kind of woman who attracts rapists,’ ‘I am a weak 
person and can’t say no’)?” Specific examples of this 
type of blame were then requested as well. The 
centers were also asked to indicate on two 7-point 
scales, with endpoints almost not at all and com- 
pletely, how much self-blaming characterized the 
women who engaged in behavioral and characterolog- 
ical self-blame, respectively; this was included in 
order to ascertain whether behavioral and character- 
ological self-blamers differ in terms of the amount of 
self-blame they attribute to themselves for the rape. 


Results 


Of the 48 rape crisis centers that responded, 
38 completed the questionnaire, 6 wrote letters 
providing general comments, and 4 wrote that 
they did not provide direct counseling services 
and were therefore unable to complete the 
items. Results were therefore based on the 
completed questionnaires of 38 centers. The 
rape crisis centers differed markedly in the 
scope of their operation, with the 3 smallest 
serving 12, 30, and 40 rape victims yearly, 
and the 3 largest serving 1,200, 1,250, and 
1,500; the mean number of rape victims seen 
across the centers was 335. 

7 The results of this study were reported by the 
author at the symposium “New Directions in Control 
Research” at the convention of the American 
Psychological Association, Toronto, 1978. The author 
thanks Chris Eagan for her invaluable help on the 
project. 

In general, self-blame was reported as quite 
common; the reported mean percentage of 
women who blamed themselves at least in part 
for the rape was 74%. Of those who blamed 
themselves, behavioral self-blame was reported 
as considerably more common than characterological 
self-blame, and the difference between 
the reported incidence of the two blaming 
strategies was significant, F(1, 32) = 
140.90, p < .001; an average of 69% of the 
women were reported as blaming themselves 
behaviorally, whereas an average of 19% were 
reported as blaming themselves characterolog- 
ically. Further, examples of the two types of 
self-blame provided by the rape crisis centers 
confirmed the fact that they were readily able 
to distinguish between the two. Frequently 
mentioned examples of behavioral self-blame 
included the following: I shouldn’t have let 
someone I didn’t know into the house, I 
shouldn’t have been out that late, I should not 
have walked alone (in that neighborhood), 
I should not have hitchhiked, I should not 
have gone to his apartment, I shouldn’t have 
left my window open, I should have locked 
my car. Examples of characterological self-blame 
that were frequently reported included: 
I’m too trusting, I’m a weak person, I’m too 
naive and gullible, I’m the kind of person who 
attracts trouble, I’m not a very aware person, 
I’m not at all assertive—I can’t say no, I’m 
immature and can’t take care of myself, I’m 
not a good judge of character, I’m basically a 
bad person. It is perhaps worth noting that 
examples of behavioral blame were, without 
exception, reported in the past tense (i.e., I 
should have/should not have), whereas ex- 
amples of characterological self-blame were 
presented in the present tense (I am/am not), 
perhaps implicitly indicating the presumed 
modifiability/nonmodifiability of factors associated 
with behavioral and characterological 
self-blame, respectively (cf. Elig & Frieze, 
1975). However, the examples of the two 
types of self-blame provided in the question- 
naire were consistent with the different tenses 
reported in the examples of the crisis centers, 
and this alone could have accounted for the 
findings. The author did not realize that she 
had made these distinctions on the question- 
naire until the results clearly differentiated 
between the tenses used for the two types of 
self-blame. Finally, in responding to how 
[…] the centers reported that characterological 
self-blamers blamed themselves s[…] 
more for the rape than did behavioral 
self-blamers (3.92 vs. 3.23), F(1, 25) = […], 
p < .002. 


Discussion 


The rape crisis center counselors reported 
that the majority of rape victims blame 
themselves, at least in part, following the 
rape. However, the focus of this attribution 
is a behavioral act or omission engaged in 
at the time of (or immediately preceding) the 
rape. Fewer than one-fifth of the women 
served by the centers blamed themselves in a 
characterological way, evidence that the “popular” 
view of the masochistic rape victim who 
perceives herself as worthless is largely unfounded. 
Rather, the self-blame in which 
rape victims engage may represent a control 
maintenance strategy, a functional response 
to a traumatic event. Given the large discrepancies 
between those who blame themselves 
behaviorally and characterologically, it follows 
that most women clearly blame themselves 
in a behavioral manner only and do not 
combine this response with characterological 
self-blame, as may be the case with depressives 
(see above). In suggesting that behavioral 
self-blame reflects a positive […] 
rape victims, there is no intention of implying 
that the rape was the woman’s fault. It is 
even likely that the woman who engages in 
behavioral self-blame does not do so to the 
exclusion of blaming the rapist, […] 
other factors. These blame attributions, instead, 
would stem from different motives, 
control maintenance being the motive 
behind self-blame. 

Two potentially serious objections to this 
study require a response. First, many of the 
women who volunteer or work in rape crisis 
centers may be ardent feminists who would 
be more likely to indicate that women blame 
themselves behaviorally rather than characterologically, 
for the latter suggests that 
women see themselves as worthless. In response, 
if the crisis center […] 
wanted to present women in a positive light, 
they would have indicated quite simply that 
women infrequently blame themselves. Fur- 
ther, there was nothing in the questionnaire 
or cover letter to indicate that one type of 
self-blame was “healthier” than another, and 
several counselors commented that they had 
never before distinguished between types of 
self-blame but that it appeared interesting to 
them. Comments by the counselors indicated 
that these women were concerned about the 
health of the women they served and that pre- 
serving a positive image of womanhood in 
general was clearly not central to their activ- 
ities as rape counselors. The second criticism 
that could be raised is potentially more seri- 
ous. It is that women who go to rape crisis 
centers are most likely to be individuals who 
do not blame themselves characterologically 
and do not feel they deserved to be raped. 
Thus, there is a self-selected population of 
behavioral self-blamers served by rape crisis 
centers. It is difficult to counter this claim, for 
there is probably much truth to it. One must 
realize, however, that the literature written 
on rape is almost entirely derived from 
women who seek help after rape and not from 
women who quietly keep the trauma to them- 
selves, ashamed to talk about it or admit it. 
The pervasiveness of self-blame documented 
in the rape literature is drawn primarily from 
observations of women at rape crisis centers, 
from women’s centers, or from women who 
agree to be interviewed by researchers, also a 
population likely to be self-selected. Thus, the 
negative image of the rape victim engaging in 
masochistic, maladaptive self-blame derives 
from a rape victim population likely to be 
very similar to that served by the rape crisis 
centers surveyed. It might also be mentioned 
that those women who have least difficulty 
coping with the rape and who are apt to be 
behavioral self-blamers are probably also 
missing from the rape center population, for 
they may not require help (outside their own 
circle of family and friends) following the 
rape. Perhaps it is sufficient to point out that 
within the population of women served by 
rape crisis centers, self-blame has been improperly 
understood as self-derogating, reflecting 
the woman’s belief in her own worthlessness, 
rather than as a response that 
reflects a positive attempt to reestablish personal 
control. 


General Conclusions and Implications 


Self-blame appears to be a label for two 
very different self-attributions, characterological 
self-blame being esteem related, and behavioral 
self-blame being control related. Self-blame 
as a predictor of good coping and self-blame 
as a concomitant of depression are no 
longer inconsistent in light of the two types 
of self-blame. Further, the paradox in depres- 
sion—that individuals are simultaneously 
helpless and self-blaming—can be resolved if 
characterological self-blame characterizes de- 
pressives and differentiates them from non- 
depressed individuals. The division of self- 
blame into two different phenomena even has 
political or cultural implications, for self- 
blame by a victimized group such as rape vic- 
tims can now be understood in such a manner 
as to preclude the perpetuation of a negative 
image of the group in question. It is perhaps 
unfortunate that one term has been used as 
a label for these two different self-attributions, 
for the singular term self-blame blurs important 
distinctions between adaptive and 
maladaptive responses to failures and vic- 
timizations. Since popularly the term has 
negative connotations, it would perhaps be 
desirable to provide a more neutral label for 
behavioral self-blame. Particularly in the case 
of rape, this would render more politically 
palatable the proposition that behavioral 
self-blame is of functional value for victims 
of rape. 

The recognition of two types of self-blame 
may have therapeutic implications. Seligman’s 
(1975) control-oriented strategies continue to 
seem appropriate for depressives, whose self- 
blaming does not imply high perceived con- 
trol, but rather lack of control. Further, a 
cognitive therapy that entails reattributing 
the focus of one’s attributions (e.g., from 
character to behavior) might be of value in 
treating depressives. In general, leading peo- 
ple to focus on behaviors that are alterable, 
rather than on their relatively nonmodifiable, 
more global character, may increase perceived 
future avoidability of negative events and 
perceived control in general, outcomes that 
would presumably be of positive value. 
Dweck’s (1975) successful reattribution train- 
ing with helpless students, involving reattrib- 
uting their ability attributions for failure to 
effort attributions, suggests the potential of 
such cognitive strategies using self-blame. 

In the case of rape, the control concerns 
that may be implicit in the rape victim’s self- 
blame often seem to be ignored in counseling, 
not because they are regarded as unimportant, 
but because they may go unrecognized. One 
counseling technique for rape victims includes 
repeatedly telling a woman that there is noth- 
ing she could have done to avoid the rape, 
that it was entirely the rapist’s doing and out- 
side of her control. Although meant to be 
reassuring, these statements could conceiv- 
ably be not at all helpful, in light of the 
proposition that the women are seeking to re- 
establish a sense of control. Rather, counselors 
should perhaps recognize the functional value 
of behavioral self-blame and concentrate on 
enabling the victim to reestablish a belief in 
her relative control over life outcomes (e.g., 
discussing possible ways of minimizing the 
likelihood of a future rape). Too often, be- 
havioral self-blame is regarded as detrimental 
to mental health. Rather, it may serve as an 
indicator of the victim’s psychological needs 
at the time. 

Behavioral and characterological self-blame 
appear to be distinct reactions yet are far 
from fully understood. Ideas raised in this 
paper have been tested only with female sub- 
jects and thus may not generalize to other 
populations; this issue of generalizability par- 
ticularly calls for research with male subjects. 
Further, the relationship between the two 
types of self-blame would appear to be a 
fruitful area for future study. Does behavioral 
self-blame that occurs with characterological 
self-blame, for example, lose its adaptive 
value, or is it similar to behavioral blame that 
occurs without characterological self-blame? 
Is characterological self-blame that occurs 
without behavioral self-blame more or less 
maladaptive than characterological self-blame 
that occurs with behavioral self-blame? In 
addition, longitudinal studies designed to tap 
the coping implications of these two types of 
self-blame would be important contributions 
to our understanding of the relationship between 
coping and attributional strategies. 
Another possible direction lies in the area of 
blaming strategies by help-givers. Brickman 
and his colleagues (Brickman et al., Note 4) 
have presented a compelling case for the psychological 
tensions that exist between conditions 
that render helping appropriate (i.e., 
regarding the recipient of help as not responsible) 
and conditions that render helping 
effective (i.e., attributing responsibility to 
the recipient of help). That is, one is apt to 
help an individual who is not to blame for a 
misfortune, yet this attribution minimizes the 
belief that one’s help will be effective. Perhaps 
training both help-givers and recipients 
of help to hold behavioral blame orientations 
(as opposed to characterological blame orientations) 
would help resolve the existing tensions. 
Last, the therapeutic implications of 
the two types of self-blame, behavioral and 
characterological, remain an area ripe for 
future study. 
A nonmotivational, expectancy-confirmation model of asymmetrical achievement 
attributions is compared with a theory of attributional egotism. Data permitting 
tests of the theories and comparisons between them were collected over an 
entire semester from students taking examinations in a large undergraduate 
course. Students provided expectancies, bases for expectancies, and attributions 
for three examinations. It was found that students’ expectancies were unrealis- 
tically high at the beginning of the course but became more accurate over time. 
Path analysis indicated that expected scores were based on prior performance 
and, increasingly throughout the semester, on expected effort. Analysis of variance 
further revealed that subjects were generally egotistical in their outcome 
attributions, stressing internal factors following success and external ones following 
failure. Attributions were also influenced by the degree to which expectancies 
were confirmed if such expectancies were clearly based on the attribution 
factor in question. 


Are attributions for achievement biased by 
a motive to protect and enhance self-esteem? 
Bradley was more satisfied in 1978 that the 
existence of the motive had been demon- 
strated than Miller and Ross were in 1975. 
Both reviews sensibly concluded, however, 
that probably motivational and cognitive fac- 
tors together determine attributions. “The 
challenge remains,” said Miller and Ross 
(1975), “for future researchers to assess the 
relative explanatory values of the motiva- 
tional and nonmotivational interpretations of 
asymmetrical attributions.” The present investigation, 
which focuses on the attributions 
students make for their performance on exam- 
inations, attempts such a comparison of cog- 
nitive and motivational theories. Specifically, 
a cognitive model maintaining that attributions 
for exam performance are a function of 
expectancy confirmation is compared with an 
egotism model holding that good outcomes 
should be attributed internally and bad outcomes 
externally, regardless of expectancy 
confirmation. 

The expectancy-confirmation model, first 
proposed by Miller and Ross (1975), is derived 
from Heider’s (1958) common sense 
theory of action. According to Heider, an outcome 
that disconfirms an expectancy calls the 
basis of that expectancy into question. Miller 
and Ross assume that in achievement situations 
people generally expect and intend to 
succeed. If these high expectations are based 
on the individuals’ beliefs that they possess […] 
effort, then success will be attributed to the 
internal factors of ability and effort. Failure, 
on the other hand, disconfirms high expectations, 
weakens the individuals’ faith in the 
internal bases for those expectations, and 
leads to external attributions such as luck 
and task difficulty. 


EXPLAINING ATTRIBUTIONS FOR ACHIEVEMENT 


A comparative analysis of expectancy con- 
firmation and egotism involves two types of 
inquiry. We must not only decide how well 
the models predict attributions in situations 
where we know their preconditions are satis- 
fied (cf. Stephan, Bernstein, Stephan, & 
Davis, 1979, Experiment I), but we must also 
consider whether the models’ preconditions 
themselves are likely to obtain in real achievement 
contexts. Egotism is a simple theory, applicable 
in most any situation where performance 
is relevant to self-esteem (Snyder, 
Stephan, & Rosenfield, 1978). Expectancy 
confirmation, on the other hand, is built upon 
a more complex set of interlocking assump- 
tions. 
The present study analyzed students’ attributions 
for their own performance on 
academic examinations. We think it safe to 
assume that test performance is relevant to 
students’ self-esteem and hence that egotism 
can make meaningful predictions of attribu- 
tions in the examination context. Whether 
students’ expectations for exam performance 
tend to be high and based on internal factors 
seems less certain. Our analysis will therefore 
focus first on verifying the assumptions of the 
expectancy-confirmation theory. 

Two previous studies, both conducted in 
university exam settings, have measured expectations 
for success and bases of expectations. 
Simon and Feather (1973) asked students 
to estimate their chance of passing an 
upcoming exam and to rate the anticipated 
contributions of ability, effort (study), test 
difficulty, and luck on their future performance. 
Although high expectations of passing 
the exam did tend to be based on internal 
factors (ability and effort), the expectations 
did not appear to be especially high. The 
mean rating of confidence of passing the exam 
was “moderate,” and 80% of all students did 
actually pass. Davis and Stephan (in press) 
examined expectancies and their bases in two 
separate classrooms. While the expectancies 
of students in both classes were overoptimistic 
(combined mean expected score 88%; actual 
score 80%), expectations in one class were 
strongly based on ability alone and expectations 
in the other were weakly related only to 
effort. 
The results of past research do not provide 
clear answers to our questions about the 
assumptions of the expectancy-confirmation 
model. The present study, which examined 
students’ expectations, their bases, and at- 
tributions over the course of an entire semes- 
ter, was designed to exploit more fully the 
potential of the university field setting. Stu- 
dents from a large introductory psychology 
course completed questionnaires before and 
after each of three examinations given over 
the course of a semester. For each exam stu- 
dents reported their expected scores and the 
bases for these expectancies; then, after the 
exam, they made attributions for their actual 
performance. The acquisition of expectancies, 
their bases, actual performance, and attribu- 
tions for three successive exams allows us to 
chronicle potential changes in these variables 
stemming from increasing exam experience 
and feedback. Path analysis (Duncan, 1966; 
Wright, 1921) will be used to plot the rela- 
tions between the variables across the three 
tests. 

The decision to use path analysis was 
based largely on two factors. First, the path 
approach adds graphic, descriptive conti- 
nuity to the flow of related events. Sec- 
ond, path analysis makes it possible to ex- 
amine a number of potentially significant 
determinants of expectations that have been 
ignored in previous research. Simon and 
Feather (1973) and Davis and Stephan (in 
press) examined only the four attribution fac- 
tors of Weiner, Frieze, Kukla, Reed, Rest, and 
Rosenbaum (1971). It seems likely, however, 
that previous performance, previous expecta- 
tions, and attributions for previous perform- 
ance would also affect subsequent expecta- 
tions. 

After the presentation of the path model, 
the need for refining the expectancy-confirma- 
tion predictions of posttest attributions will be 
discussed. Additional analyses designed to 
assess the relative merits of the refined cogni- 
tive model and egotism will then be presented. 


Path Analysis Method 


Subjects 


The subjects were 469 undergraduates enrolled in 
an introductory psychology class at the University 
of Texas at Austin. Grades in the course were based 
on scores obtained on three out of four multiple 
choice tests administered during the semester. Pretest 
expectations and posttest attributions were collected 
for the first three tests given. 


Pretest Questionnaire 


Immediately before each test was administered, a 
short questionnaire was given to each student. Sub- 
jects were asked what score (0% to 100%) they 
expected to obtain on the test. They were also asked 
to judge the anticipated effect of ability, effort 
(study), test difficulty, and luck on their upcoming 
performance in the following manner. First they in- 
dicated whether a factor would help or hinder per- 
formance. They then estimated the extent to which 
the factor would influence performance on a 10-point 
scale running from will slightly influence to will 
greatly influence. If the student marked that a factor 
would hinder performance, the amount the factor 
was judged to influence performance was given a 
negative value. If a factor was expected to help per- 
formance, a positive value was assigned. 
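
The signed scoring rule described above can be sketched in a few lines. This is only an illustration: coding the 10-point scale as the integers 1 to 10 is an assumption (the text gives the scale's endpoints but not its numeric coding), and the function name `signed_rating` is hypothetical.

```python
def signed_rating(direction: str, magnitude: int) -> int:
    """Return +magnitude for a helping factor and -magnitude for a hindering one."""
    if direction not in ("help", "hinder"):
        raise ValueError("direction must be 'help' or 'hinder'")
    if not 1 <= magnitude <= 10:
        # Assumed coding: 1 = will slightly influence, 10 = will greatly influence.
        raise ValueError("magnitude must lie on the 10-point scale (1 to 10)")
    return magnitude if direction == "help" else -magnitude


# A student who expects luck to hinder performance moderately,
# and effort (study) to help a great deal.
luck = signed_rating("hinder", 4)
effort = signed_rating("help", 9)
print(luck, effort)
```

Under this coding each factor yields a single score between -10 and +10, which is what allows the same variable to enter the later analyses for both helping and hindering judgments.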


Posttest Questionnaire 


During the first class period following the exam, 
each student was provided with a second question- 
naire that included his or her actual percentage score 
on the exam as well as the predicted score. Students 
were asked to make attributions for their actual per- 
formance, using the same four causal factors that 
appeared on the pretest questionnaire. Again, helping 
factors were scored positively and hindering factors 
negatively. 


Model Construction 


The intercorrelations of bases of expectation, ex- 
pected score, actual score, and attributions were 
studied by means of a path model. In path analysis it 
is assumed that there is a weak causal ordering 
among the variables. In the present study, the tem- 
poral sequence in which the questionnaires were ad- 
ministered (which followed the temporal sequence of 
real events) was used as the basis for causal order- 
ing of the variables. Only one ordering decision could 
not be made by appealing to the questionnaire order. 
This involved the precedence of expected score and 
basis of expected score on the pretest questionnaires. 
The assumption was made that bases of expectancy 
precede expectations. The ordering of variables in the 
model then is: bases of expected score for test one, 
expected score for test one, score for test one, at- 
tributions for test one outcome, bases of expected 
score for test two, expected score for test two, score 
for test two, and so on. Given the large number of 
variables considered, the original model from which 
Figure 1 was derived did not include estimates of 
every possible path. The criteria used to construct the 
original model were as follows: 

1. No paths were calculated between pretest bases 
of expectations or between posttest attributions (e.g., 
no path was estimated between effort expectations 
and ability expectations, or between effort attributions 
and ability attributions). 

2. The only paths estimated from a given basis of 
expectancy were to (a) the immediately following 
expected score, (b) the immediately following actual 
score, (c) the immediately following attribution, and 
(d) the next basis of expectancy (e.g., paths were 
considered from XEF2 to XSCOR2, SCOR2, ATEF2, and 
XEF3).¹ 

3. The only paths estimated from a given expected 
score were to (a) the actual score for which the expectancy 
was made, and (b) the immediately following 
expected score (e.g., paths estimated from XSCOR1 
were to SCOR1 and XSCOR2 only). 

4. The only paths estimated from an actual score 
were to (a) the following scores, (b) the immediately 
following expected score, (c) the immediately following 
attributions, (d) the bases of the next expected 
score (e.g., paths were estimated from SCOR1 to SCOR2, 
SCOR3, XSCOR2, ATEF1, ATAB1, ATTD1, ATLK1, XEF2, 
XAB2, XTD2, XLK2). 

5. The only paths estimated from a given attribu- 
tion were to (a) the immediately following score, 
(b) the immediately following expected score, (c) the 
immediately following basis of expectancy, and (d) 
the next attribution (e.g., paths were estimated from 
ATEF1 to SCOR2, XSCOR2, XEF2, ATEF2). 
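
Criteria 2 through 5 can be read as a rule that maps each variable to the set of variables it may send a path to. The sketch below is one possible encoding, not the authors' procedure: the function name `candidate_paths` is hypothetical, the variable labels follow the examples given in the criteria, and the handling of Test 3 (no later variables to point to) is an assumption. Criterion 1 is respected implicitly, because no rule ever links two bases or two attributions belonging to the same test.

```python
FACTORS = ("EF", "AB", "TD", "LK")  # effort, ability, test difficulty, luck
TESTS = (1, 2, 3)                   # the three tests with questionnaire data


def candidate_paths(source: str) -> list[str]:
    """Return the variables that `source` may send a path to under criteria 2-5."""
    stem, t = source[:-1], int(source[-1])
    has_next = (t + 1) in TESTS
    if stem == "XSCOR":  # criterion 3: expected score
        targets = [f"SCOR{t}"]
        if has_next:
            targets.append(f"XSCOR{t + 1}")
    elif stem == "SCOR":  # criterion 4: actual score
        targets = [f"SCOR{u}" for u in TESTS if u > t]
        if has_next:
            targets.append(f"XSCOR{t + 1}")
        targets += [f"AT{f}{t}" for f in FACTORS]
        if has_next:
            targets += [f"X{f}{t + 1}" for f in FACTORS]
    elif stem.startswith("AT") and stem[2:] in FACTORS:  # criterion 5: attribution
        f = stem[2:]
        targets = []
        if has_next:
            targets = [f"SCOR{t + 1}", f"XSCOR{t + 1}", f"X{f}{t + 1}", f"AT{f}{t + 1}"]
    elif stem.startswith("X") and stem[1:] in FACTORS:  # criterion 2: basis of expectancy
        f = stem[1:]
        targets = [f"XSCOR{t}", f"SCOR{t}", f"AT{f}{t}"]
        if has_next:
            targets.append(f"X{f}{t + 1}")
    else:
        raise ValueError(f"unrecognized variable name: {source}")
    return targets


print(candidate_paths("XEF2"))
```

Calling the function on the sources named in the criteria (XEF2, XSCOR1, SCOR1, ATEF1) reproduces the example target lists given there, which is a quick check that the encoding matches the written rules.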

Paths were trimmed from the model constructed by 
the above criteria if the probability level of their 
associated F ratios was greater than .05 (Kerlinger & 
Pedhazur, 1973, p. 318). If a path was deleted from 
the original model, we assumed for the sake of par- 
simony that the contribution of the nonsignificant 
predictors was zero and performed new regressions 
with only the significant predictors included to ob-
tain the path coefficients. A complete correlation
matrix for all the variables in the model appears in
Table 1. Means and standard deviations for the
variables are in Table 2. The path analysis was based
on data collected from the 179 students from the en-
tire class who responded to all six questionnaires.
Generalization from the restricted population seems
justified, since the distribution of all scores and
expected scores for both the restricted group and the
entire population is substantially the same.
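
A minimal sketch of the trimming step described above, assuming that path coefficients are standardized regression weights and that each predictor's F ratio is the square of its t ratio. The function names, the synthetic data, and the critical value of roughly 3.9 (the .05 level for 1 and about 175 degrees of freedom) are illustrative assumptions, not details taken from the original analysis.

```python
import numpy as np


def standardize(a):
    """Convert columns to z scores (mean 0, sample SD 1)."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)


def path_coefficients(X, y):
    """Standardized regression weights and their F ratios (F = t squared)."""
    Z, z = standardize(X), standardize(y)
    n, k = Z.shape
    beta, *_ = np.linalg.lstsq(Z, z, rcond=None)
    resid = z - Z @ beta
    mse = resid @ resid / (n - k - 1)  # one df is lost to centering
    se2 = mse * np.diag(np.linalg.inv(Z.T @ Z))
    return beta, beta**2 / se2


def trim_and_refit(X, y, f_crit):
    """Drop predictors whose F ratio is below f_crit, then re-estimate.

    Trimmed paths are fixed at zero, mirroring the parsimony assumption.
    """
    X = np.asarray(X, dtype=float)
    _, f_ratios = path_coefficients(X, y)
    kept = np.flatnonzero(f_ratios > f_crit)
    beta = np.zeros(X.shape[1])
    if kept.size:
        beta[kept], _ = path_coefficients(X[:, kept], y)
    return beta, kept


# Synthetic illustration: only the first predictor truly affects y.
rng = np.random.default_rng(0)
X = rng.normal(size=(179, 3))
y = 0.8 * X[:, 0] + rng.normal(scale=0.5, size=179)
beta, kept = trim_and_refit(X, y, f_crit=3.9)
```

The refit matters because a coefficient estimated alongside nonsignificant predictors generally differs from the one obtained once those predictors are removed.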


1 [...]ed to refer to the variables in [...]
is straightforward. The prefix X stands for [...]
in conjunction with both [...] expected effects of [...]
relevant. Hence, XSCOR1 would be the [...]
Test 1; XEF2 would be the expected effect of
study on Test 2 performance. [...] per-
formance are indicated by the prefix AT. For ex-
ample, attributions to the task difficulty for
Test 3 are symbolized ATTD3. Attributions to ability,
luck, and effort for Test 2 performance would be
indicated by ATAB2, ATLK2, and ATEF2, respectively.
Actual scores are indicated [...]: Test 1
score is SCOR1; Test 2 is SCOR2; Test 3 is SCOR3.

[Figure 1: path diagram]


Figure 1. Path Model. (n = 179. XSCOR1, XSCOR2, and XSCOR3 are students' expected scores for
Tests 1, 2, and 3, respectively. SCOR1, SCOR2, and SCOR3 are the actual scores achieved. XEF1, XEF2,
and XEF3 are the expected effects of effort [study] on Test 1, 2, and 3 performance, respectively.
XAB1, XAB2, and XAB3 are the expected effects of ability on Test 1, 2, and 3 performance, respec-
tively. XTD1, XTD2, and XTD3 are the expected effects of test difficulty on Test 1, 2, and 3 perform-
ance. XLK1, XLK2, and XLK3 are the expected effects of luck on Test 1, 2, and 3. ATEF1, ATEF2, and
ATEF3 are effort attributions for the three tests. ATAB1, ATAB2, and ATAB3 are ability attributions.
ATTD1, ATTD2, and ATTD3 are test difficulty attributions. ATLK1, ATLK2, and ATLK3 are luck attribu-
tions.)


Examination of Expectations 


Before presenting the path analysis results, 
we can examine the assumption made by 
Miller and Ross that expectancies in achieve- 
ment situations tend to be relatively high. An 
inspection of the discrepancies between the 
students’ expected and actual scores indicates 
that whereas they initially overestimated their 
actual scores by a considerable margin, they 
became increasingly accurate across the three 
tests (Table 2). Test 1 predictions showed an 
overestimation of 9 points. By Test 2, students
were still overestimating their scores but this
time by only 5 points, significantly more ac-
curate than on Test 1, t(178) = 4.09, p <
.001. By Test 3, expected-actual discrepancy
was less than 1 point, and the error consisted
of an underestimation of performance. Test 3
expectations were significantly more accurate
than those for Test 2, t(178) = 5.01, p < .001.
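
The size of each overestimate can be checked directly against the means reported in Table 2. The short sketch below simply subtracts the mean actual score from the mean expected score for each test; the variable names are illustrative. Because it works from rounded means, the Test 3 figure differs by .01 from the mean of the individual discrepancies given in the table.

```python
# Mean expected (XSCOR) and actual (SCOR) scores from Table 2, keyed by test.
expected = {1: 81.89, 2: 75.78, 3: 74.02}
actual = {1: 72.72, 2: 70.82, 3: 74.78}

# Positive values are overestimates; a negative value is an underestimate.
overestimate = {t: round(expected[t] - actual[t], 2) for t in expected}

for test, points in overestimate.items():
    print(f"Test {test}: expected minus actual = {points:+.2f}")
```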


Path Analysis Results²


Bases of Expectations 


Across the three tests there is an increased
reliance on perceived study (and to a smaller
degree on perceived ability) as a basis of ex-
pectation (Figure 1). Whereas the overopti-
mistic Test 1 expectations were primarily based
on luck and to a lesser extent on study, the
more realistic expectations for Tests 2 and 3


2 Interpretation of changes in path coefficients is
more meaningful when we can safely assume that the
structure of the causal system regulating the vari-
ables in a model remains constant over time. Struc-
tural stability between explicitly considered variables
is, however, typically accompanied by stability of
latent, unanalyzed sources of variability. [...]
latent variables create correlations among the resid-
uals of explicit variables, [...]
factor structure over time practically [...]
violation of path model assumptions concerning dis-
turbance terms (see Heise, 1970).


EXPLAINING ATTRIBUTIONS FOR ACHIEVEMENT 


Table 2
Means and Standard Deviations for the Variables in the Path Model

Variable          M      SD     Variable          M      SD     Variable          M      SD
XEF1             5.91    5.42   XEF2             3.53    7.00   XEF3            [...]   [...]
XAB1             6.21    3.59   XAB2             4.23    4.82   XAB3             3.07    5.37
XTD1             2.28    7.30   XTD2            −1.28    7.78   XTD3            −0.26    7.44
XLK1             2.70    4.33   XLK2             1.93    5.30   XLK3             2.45    4.86
XSCOR1          81.89    7.30   XSCOR2          75.78   10.15   XSCOR3          74.02   12.55
SCOR1           72.72   11.11   SCOR2           70.82   11.01   SCOR3           74.78   11.73
SCOR1−XSCOR1    −9.17   10.92   SCOR2−XSCOR2    −4.96   11.56   SCOR3−XSCOR3     0.77   13.07
ATEF1            1.88    7.18   ATEF2            1.38    7.07   ATEF3            1.97    6.88
ATAB1            1.44    6.48   ATAB2            2.42    5.46   ATAB3            3.47    5.33
ATTD1           −4.37    6.24   ATTD2           −3.94    6.11   ATTD3           −1.70    6.56
ATLK1           −1.73    4.16   ATLK2           −0.51    4.72   ATLK3            0.91    4.78

Note. n = 179. (XSCOR1, XSCOR2, and XSCOR3 are students' expected scores for Tests 1, 2, and 3, respectively.
SCOR1, SCOR2, and SCOR3 are the actual scores achieved. XEF1, XEF2, and XEF3 are the expected effects of
effort [study] on Test 1, 2, and 3 performance, respectively. XAB1, XAB2, and XAB3 are the expected effects
of ability on Test 1, 2, and 3 performance, respectively. XTD1, XTD2, and XTD3 are the expected effects of test
difficulty on Test 1, 2, and 3 performance. XLK1, XLK2, and XLK3 are the expected effects of luck on Tests 1,
2, and 3. ATEF1, ATEF2, and ATEF3 are effort attributions for the three tests. ATAB1, ATAB2, and ATAB3 are
ability attributions. ATTD1, ATTD2, and ATTD3 are test difficulty attributions. ATLK1, ATLK2, and ATLK3 are
luck attributions.)


were directly and strongly related to perceived 
study. Previous expectations also played a 
role in determining subsequent expectations, 
although their influence diminished over the 
semester (from .46 to .27), as expectations be-
came more accurate. No paths were found
between anticipated ease or difficulty of any
test (XTD) and expected score, but task dif-
ficulty attributions for Test 1 performance
(ATTD1) were positively related to Test 2 ex-
pectancies. That is, students who said the ease


Two types of analyses performed on the present
data indicate that while factor constancy does exist,
the accompanying violation of disturbance assump-
tions is minimal. Factor analyses performed on the
10 variables represented at each of the three time
periods all yielded highly similar three-factor solu-
tions. Accordingly, analysis of residuals revealed sig-
nificant (though generally small) intercorrelations
among disturbance terms within each time period.
On the other hand, correlations between the residuals
of the same dependent variable measured at separate
time points (e.g., ATEF1, ATEF2, and ATEF3) were neg-
ligible. Note the lack of correlation between the error
terms, E = (1 − R²), associated with expected and
actual scores. These results indicate that an acceptable
compromise between the assumptions of structural
[...] and uncorrelated errors exists in these
[...]. Thus, while the usual caveats relevant to in-
ferring causality from correlational data of course
apply here, violations of the particular assumptions
of time series analysis seem minimal.


of Test 1 aided Test 1 performance tended 
to predict higher scores for Test 2. 

The basis-of-expectation results differ from 
those of Simon and Feather (1973) and Davis 
and Stephan (in press) in that effort emerged 
as a much stronger determinant than ability. 
There is at least one important difference in 
procedure that may account for the discrep- 
ancy between those studies and the present 
study. In the earlier studies, previous expecta- 
tion and previous performance were not en- 
tered into the regression equations. When the 
path coefficients for xscor2 are calculated 
omitting xscor1 and scor1 as predictors, the 
path from effort to xscoR2 remains un- 
changed, whereas the ability path shows a siz- 
able increase (from .09 to .21). These findings 
suggest that had previous studies investigated 
the role of past expectancy and performance 
on subsequent expectations, the role assigned 
to perceived ability as a basis of expectations 
would have been smaller. 
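
The argument about omitted predictors can be illustrated with a small simulation. The sketch below is a hypothetical example on synthetic data, not a reanalysis of the study: it assumes that perceived ability is correlated with prior performance, that effort is not, and that expectations depend mainly on effort and prior performance. All variable names and generating weights are invented for the illustration.

```python
import numpy as np


def standardized_betas(X, y):
    """Standardized regression weights (path coefficients) of y on the columns of X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    z = (y - y.mean()) / y.std(ddof=1)
    coef, *_ = np.linalg.lstsq(Z, z, rcond=None)
    return coef


rng = np.random.default_rng(1)
n = 2000  # large synthetic sample so the contrast is stable
ability = rng.normal(size=n)
effort = rng.normal(size=n)
prior_score = 0.7 * ability + rng.normal(scale=0.7, size=n)
expectation = (0.5 * effort + 0.5 * prior_score + 0.1 * ability
               + rng.normal(scale=0.5, size=n))

# Columns: effort, ability, prior score.
full = standardized_betas(np.column_stack([effort, ability, prior_score]), expectation)
# Same regression with prior performance omitted.
reduced = standardized_betas(np.column_stack([effort, ability]), expectation)

print("effort path:  full %.2f, reduced %.2f" % (full[0], reduced[0]))
print("ability path: full %.2f, reduced %.2f" % (full[1], reduced[1]))
```

In this setup the effort path barely moves when prior performance is dropped, whereas the ability path absorbs the variance that prior performance would otherwise explain, which is the same direction of change reported above for XSCOR2.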

Despite the significantly larger paths from 
effort to expectancy than from ability to ex- 
pectancy, the mean perception (Table 2) of 
ability’s anticipated effect on performance was 
somewhat larger than that of effort for every 
test. That is, although most students said 
their ability would help them perform well on
the tests, only those who thought that they
had studied hard actually predicted high 
scores. 


Determinants of Performance 


The two factors that were consistently re- 
lated to actual outcome for Tests 2 and 3 were 
the expected scores for the tests and perform- 
ance on the previous test(s). For Test 1 ex- 
pected score was the only significant anteced- 
ent of performance, and the size of the ex- 
pectancy—performance relation is larger than 
that for subsequent tests. This makes sense if 
it is assumed that students’ expectations for 
the initial test include information about per-
formance on similar tasks in the past. XSCOR1
as a predictor of SCOR1 may then be thought
of as confounding the two elements that are
discretely represented by SCOR1 and XSCOR2
as predictors of SCOR2, and SCOR1/SCOR2 and
XSCOR3 as predictors of SCOR3.

Other predictors of actual performance
were ability attributions (ATAB1) and antic-
ipated effect of test difficulty (XTD2) for
SCOR2, and the anticipated effect of study
(XEF3) for SCOR3. Test 1 ability attributions
and the expected role of test difficulty were 
both negatively related to performance on 
Test 2. Taken together, these two findings 
may be indicative of complacency among 
students who attributed high Test 1 scores to 
ability and those who expected Test 2 to be 
easy. Both types of inference could conceiv- 
ably have led to poor performance on Test 2. 
The positive relation between the anticipated 
beneficial influence of study (xEF3) on Test 3 
performance and actual performance seems 
straightforward. 


Determinants of Attributions 


The results clearly show that all four at- 
tributions were related to actual performance 
in a logical manner, that is, they were per- 
ceived to have facilitated high scores and 
hindered low scores. In addition, for attribu- 
tions to internal factors it was found that the 
expected contribution of these factors was 
related to subsequent attributions to them.
Thus, if the subjects expected effort (and to
a lesser extent, ability) to help them, they
subsequently attributed their outcome to that
factor. This was not the case for luck on any
of the three tests. The expectancy-attribu-
tion relation was obtained for task difficulty,
the other external factor, only for Test 3 (the
path between XTD3 and ATTD3).

The existence of direct paths from internal 
bases of expectancy to attributions (unmedi- 
ated by outcome) suggests that to some de- 
gree students do not alter their pretest opin- 
ions of the expected contributions of their 
ability and effort to test performance, even 
after performance feedback. Since the in- 
ternal factors are knowable to a greater ex- 
tent before the test than the external factors 
are, one might expect more pretest-posttest
consistency for internal than external factors.


Attribution, Future Expectations, and 
Future Performance 


Attributions for performance had few ef-
fects on future expectations, the bases of these
expectations, or future performance. We ex-
pected attributions at least to affect bases of
expectations (e.g., ATEF1 was expected to in-
fluence XEF2, ATAB1 was expected to influence
XAB2, etc.). Only one such relationship was
found, the path between ATTD2 and XTD3, and
it was relatively small. Where attributions did
affect expected scores and actual performance,
their influence was not mediated by the bases
of expectancy variables (e.g., ATTD1 did not
affect XTD2, which in turn affected XSCOR2;
rather, ATTD1 affected XSCOR2 directly). Three
such direct relationships were found, and they
comprised the entire influence of attributions
on future expectancy and actual performance.
The paths were between ATAB1 and SCOR2
(−.19), ATEF2 and XSCOR3 (−.19), and
ATTD1 and XSCOR2 (.20).

The small effect of attributions on expec-
tancies is surprising given the importance
assigned to them in both Weiner's [...]
model and the recent learned helplessness re-
formulation (Abramson, Seligman, & Teas-
dale, 1978). More detailed analysis of the
role of attributions in expectancy [...]
seems warranted by these results.




Reformulation and Test of the 
Expectancy Model 


The path model and the associated table of
means suggest that a revision of the dual
assumptions of the expectancy-confirmation
approach to attribution is needed. Initially,
people tend to believe that they will be lucky
and are overoptimistic regarding their chances
for success. Later, the more experience stu-
dents gain with a task and the more accurate
their expectations become, the more expec-
tancies are based on internal factors, especially
effort.

Given what we have learned about students'
expectations and their bases, we can now
make more specific predictions concerning the
role of confirmation of expectations on at-
tributions.

1. Confirmation will exert an effect on 
effort attributions for Tests 2 and 3 perform- 
ance, since expectations were largely based 
on perceived study for these two tests. Con- 
firmed expectancies should be attributed to 
effort more than disconfirmed expectancies. 

2. Ability and task attributions should be 
minimally affected by confirmation since ex- 
pectations were not based on either of these 
factors. 

3. Previous research has consistently shown
that unexpected outcomes are attributed more
to luck than expected outcomes (McMahon,
1973; Simon & Feather, 1973; Stephan et al.,
1979). We expect to find a similar result in
this study for all three tests, regardless of
whether or not luck was a significant predic-
tor of expected score (as for Test 1).

The prediction of differential effects of the
confirmation factor on effort (disconfirmation
leading to less attribution) and luck (discon-
firmation leading to more attribution) is
logical. If one knows he/she has worked hard,
that knowledge is not easily changed by re-
ceipt of a low grade. Rather, the perceived
effect of study merely becomes discounted as
a cause of poor performance relative to its
role in explaining good performance. Judg-
ments of luck's influence on performance,
however, are directly related to the element
of surprise and should not be affected by
whether or not luck formed the basis of ex-
pectation. Disconfirmation of a positive or
negative nature seems to automatically imply
the operation of a greater amount of capri-
ciousness and luck in the production of out-
comes than does confirmation.

The predictions of the egotism model are, 
of course, not altered by the path model in- 
formation, since they do not depend upon 
expectations or bases of expectations. The
motivational model simply predicts that the 
internal factors of ability and effort will be 
seen as causing good performance and the ex- 
ternal factors of task difficulty and luck will 
be seen as the causes of poor performance on 
all three tests. 


Analysis of Variance Method 


In order to test the above predictions and to con- 
trast the effects of expectancy confirmation and 
egotism on attributions, we separately analyzed the 
attributions following each test according to a simple 
2 (Level of Expectancy) × 2 (Confirmation of Expec-
tancy) design. Students were informed at the begin-
ning of the semester and on our pretest questionnaire
that scores of 85%-100% constituted an A, 75-84 a 
B, 60-74 a C, 50-59 a D, and 49 and below an F. 
Students whose expected score for a given test was 
75% or above made up the high expectancy group 
for that test. Those whose expected score was 74% 
or below made up the low expectancy group. Stu- 
dents with high expectancies whose scores were 75 or 
above and those with low expectancies whose scores 
were 74 or below made up the confirmation of ex- 
pectancy groups. Students with low expectancies who 
scored 75 or above and those with high expectancies 
who scored 74 or less made up the disconfirmation 
groups. 
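
The assignment rule just described can be written out as a short sketch. The function name is hypothetical, and treating any score of 75 or more as "high" is an assumption about how scores falling between 74 and 75 were handled, which the text does not state.

```python
CUTOFF = 75  # lowest percentage score earning a B


def classify(expected_score, actual_score, cutoff=CUTOFF):
    """Return (expectancy level, confirmation status) for one student on one test."""
    high_expectancy = expected_score >= cutoff
    high_outcome = actual_score >= cutoff
    level = "high" if high_expectancy else "low"
    # Expectancy is confirmed when expectation and outcome fall on the same side of the cutoff.
    status = "confirmed" if high_expectancy == high_outcome else "disconfirmed"
    return level, status


# Illustration using the Test 1 cell means reported in Table 3.
cells = [(85.7, 82.7), (82.6, 63.5), (66.4, 61.2), (66.6, 78.9)]
labels = [classify(expected, actual) for expected, actual in cells]
```

Note that the design crosses expectancy with confirmation rather than with outcome, so success corresponds to the high expectancy, confirmed and low expectancy, disconfirmed cells.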

A different method for scoring the attribution 
questions was used for this analysis than was em-
ployed for the path model (see Path Method). In
order to study the interaction between expectancy
and performance (i.e., the confirmation factor), the
scoring of the attributions was modified to take into 
account the relationship between the perceived effect 
of a factor (did it contribute to the outcome or not) 
and the valence of the outcome (success or failure). 
Accordingly, when the rating indicated that the sub- 
ject felt his outcome, high performance or low per- 
formance, could be attributed to a given factor, its 
rating was scored positively. If a factor was per- 
ceived to work in the direction opposite from the 
subject’s ultimate outcome, it was scored negatively 
(Stephan et al., 1979; Stephan, Rosenfield, & Stephan, 
1976). Thus, when the subject succeeded, all factors 
the subject perceived as having helped him were 
scored positively, and all hindering factors were 
scored negatively. When he failed, hindering factors 
were scored positively, and helping factors nega-
tively. The reason for reversing the directionality

Table 3
Expected Scores, Actual Scores, and n of Subjects

                          Confirmed                             Disconfirmed
Expectancy   Expected score  Actual score    n     Expected score  Actual score    n
Test 1
  High            85.7           82.7       158         82.6           63.5       157
  Low             66.4           61.2        41         66.6           78.9        10
Test 2
  High            83.1           83.1       108         80.8           63.6       119
  Low             64.2           60.1       109         66.4           79.5        28
Test 3
  High            83.1           84.3       123         80.5           66.7        68
  Low             61.9           62.5        85         66.0           81.0        54


of the scoring for subjects who failed is derived 
from Kelley’s (1971) analysis of facilitatory and 
inhibitory causes. Facilitative causes are those 
that make the outcome more likely, whereas in- 
hibitory causes tend to prevent the occurrence of 
certain outcomes. For example, high ability facilitates 
success but inhibits failure. Similarly, low ability 
facilitates failure and inhibits success. The scoring 
system, then, scores facilitative causes positively and 
inhibitory causes negatively. 
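
The rescoring rule can be summarized in a few lines. The sketch assumes, as in the path analysis scoring, that a positive rating means the student saw the factor as having helped performance and a negative rating means it hindered; the function name is hypothetical.

```python
def facilitation_score(rating, succeeded):
    """Rescore a helped/hindered rating relative to the outcome obtained.

    rating > 0: the factor was seen as helping performance.
    rating < 0: the factor was seen as hindering performance.
    After success, helping factors are facilitative and keep their sign.
    After failure, hindering factors are facilitative, so the sign is reversed.
    """
    return rating if succeeded else -rating


# A factor rated +4 (helped) explains a success but works against a failure.
after_success = facilitation_score(4, succeeded=True)
after_failure = facilitation_score(4, succeeded=False)
# A factor rated -3 (hindered) is a facilitative cause of a failure.
hindrance_after_failure = facilitation_score(-3, succeeded=False)
```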


Analysis of Variance Results³


Table 3 shows the results of the group 
assignment procedure for the three examina-
tions. Three separate Expectancy × Confir-
mation analyses of variance (ANOVAs) show
that the mean expected score of the high ex-
pectancy groups is significantly greater than
that of the low expectancy groups, as planned:
Test 1: F(1, 362) = 321.43, p < .001; Test
2: F(1, 360) = 424.53, p < .001; Test 3:
F(1, 326) = 590.41, p < .001. In addition,
high expectancy–confirmation and low ex-
pectancy–disconfirmation students on Tests
2 and 3 have higher expected scores; Test 2:
F(1, 360) = 7.57, p < .01; Test 3: F(1, 326)
= 21.25, p < .001.

The confirmation group assignment was 
equally effective. Subjects in the confirmation 
cells had significantly smaller absolute dis- 
crepancies between their expected and actual 
scores for all three tests than did students in 
the disconfirmation conditions. Test 1: F(1, 
362) = 33.65, p < .001; Test 2: F(1, 360) = 
91.51, p < .001; Test 3: F(1, 326) = 111.22,
p < .001. There were three other significant
effects observed for discrepancy scores. On
Tests 1 and 2, high expectancy–confirmation
and low expectancy–disconfirmation students
tended to be more accurate with their predic-
tions than were students in the other condi-
tions; Test 1: F(1, 362) = 16.08, p < .001;
Test 2: F(1, 360) = 14.11, p < .001. On Test
3, low expectancy subjects tended to be more
accurate than were high expectancy subjects,
F(1, 326) = 4.69, p < .05.

Attribution measures. Egotism would pre-
dict a greater emphasis on internal factors
given success (high expectancy–confirmation
and low expectancy–disconfirmation) than
would occur given failure (low expectancy–
confirmation and high expectancy–disconfir-
mation). The opposite pattern of [...]
would be predicted for external factors. At-
tributions in accordance with these predic-
tions would result in a Level of Expectancy ×
Confirmation interaction. As predicted, attri-


3 All analyses used the Statistical Package for the
Social Sciences (SPSS) ANOVA [...]
given the unequal cell ns. Also note that [...]
analyses included any subjects who [...]
and took the test relevant to the specific [...].
Hence, the ns vary for each analysis. [...]
that the population assessed in these [...]
extra subjects whose data were not used [...]
the path model. Similar ANOVAs carried out on the
restricted population used to construct the path
model yielded substantially the same pattern of
results as that obtained from the total population,
although significance levels varied somewhat.


butions for Test 1 performance (Table 4) ex-
hibit egotism effects for all four factors. The
Expectancy × Confirmation interactions for all
four factors and for an internal minus external
composite (Stephan et al., 1976) were all
highly significant: effort F(1, 349) = 5.17,
p < .05; ability F(1, 356) = 13.16, p < .001;
task F(1, 351) = 30.15, p < .001; luck F(1,
356) = 4.43, p < .05; internal-external F(1,
334) = 58.93, p < .001. Ability and effort,
the internal factors, were emphasized as the
causes of good outcomes, whereas task dif-
ficulty and luck, the external factors, were
perceived to be the cause of poor perform-
ance.

The only other significant effect for Test 1
was the predicted emphasis on luck by stu-
dents whose expectations were disconfirmed,
F(1, 354) = 4.98, p < .05.
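
The internal minus external composite is named but not written out in the text. The sketch below assumes the simplest reading, the sum of the two internal attributions minus the sum of the two external ones, and checks it against one set of Table 4 cell means (Test 3, low expectancy, disconfirmed). The function name is hypothetical, and exact agreement with the tabled composite is not expected because the composite was computed on a slightly smaller n.

```python
def internal_minus_external(effort, ability, task, luck):
    """Assumed composite: (effort + ability) - (task difficulty + luck)."""
    return (effort + ability) - (task + luck)


# Table 4, Test 3, low expectancy, disconfirmed: effort 3.8, ability 4.0,
# task 1.3, luck 3.7; the tabled composite for this cell is 2.7.
composite = round(internal_minus_external(3.8, 4.0, 1.3, 3.7), 1)
print(composite)
```

Under this assumption a positive composite indicates that internal factors were stressed more than external ones, and a negative composite indicates the reverse.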

Attributions for Test 2 also generally con-
firmed predictions. The egotism interaction
again emerged for three of the four factors
and the composite: effort F(1, 353) = 19.04,
p < .001; ability F(1, 351) = 68.58, p <
.001; task F(1, 353) = 69.18, p < .001; in-
ternal-external F(1, 335) = 116.68, p < .001.
Luck attributions, however, did not show a
motivational effect. Effort attributions were
influenced more by the confirmation factor, as
predicted. Confirmed expectancies were at-
tributed more to effort, the basis of expecta-
tions, than were disconfirmed expectancies,
F(1, 349) = 4.47, p < .05. Luck attributions
were again influenced by confirmation, with
disconfirmation subjects stressing luck more
than confirmation subjects, F(1, 354) =
13.65, p < .001. The composite measure was
also influenced by the confirmation variable.
Disconfirmed expectancies led to a greater
reliance on external than on internal factors,
F(1, 335) = 12.19, p < .001.

Attributions for Test 3 showed egotism ef-
fects on all five dependent variables: effort
F(1, 321) = 36.69, p < .001; ability F(1,
320) = 106.69, p < .001; task F(1, 314) =
17.52, p < .001; luck F(1, 317) = 5.01, p <
.05; internal-external F(1, 303) = 136.75,
p < .001. Confirmation again resulted in a
greater reliance on effort than did disconfir-
mation, as predicted, F(1, 312) = 5.43, p <


Table 4
Attributions to Effort, Ability, Task Difficulty, Luck, and Internal-External Factors

                      High expectancy (A or B)      Low expectancy (C or less)
Attribution           Confirmed    Disconfirmed     Confirmed    Disconfirmed
Effort
  Test 1                 5.1            .9             1.8           3.2
  Test 2                 5.1           −.2             1.3           3.2
  Test 3                 5.9            .2             1.2           3.8
Ability
  Test 1                 4.5           1.4             −.9           3.7
  Test 2                 5.1           −.1             −.2           5.4
  Test 3                 5.2          −1.4            −1.7           4.0
Task
  Test 1                −2.3           6.0             4.1           −.8
  Test 2                 [?]           6.0             5.5            .9
  Test 3                 1.0           5.4             2.8           1.3
Luck
  Test 1                 −.2           2.9             2.4           2.5
  Test 2                 1.0           2.7             1.5           3.8
  Test 3                 2.1           5.4             2.8           3.7
Internal-external
  Test 1                11.9          −6.9            −7.0           5.9
  Test 2                10.0          −8.8            −5.9           3.7
  Test 3                 8.2         −12.4            −6.3           2.7

Note. Expectancy is rated according to test grade
anticipated (A, B, C, or less).


.05. Again, luck attributions (and, surpris-
ingly, task attributions) were emphasized by
subjects whose expectations were disconfirmed
following Test 3: luck F(1, 317) = 15.72,
p < .001; task F(1, 314) = 4.51, p < .05.
As on Test 2, summed external factors were
stressed more than internal factors by discon-
firmation subjects, F(1, 303) = 20.92, p <
.001.


Discussion 


These results help to define the boundaries 
of two major theories concerned with the 
determination of attributions for performance 
in achievement settings. Students in the pres- 
ent study were egotistical in their attributions 
for three examination scores. They attributed
A and B scores to ability and effort, and C, D,
and F performance to test difficulty and bad
luck. However, attributions were also influ-
enced by less motivated concerns. The path
analysis indicated that the most important 
factor in determining expectations for Tests 
2 and 3 was perceived amount of study. Given 
this clear basis of expectancy, the students’ 
subsequent attributions to effort were influ- 
enced by their expectations in a manner con- 
sistent with the expectancy confirmation ap- 
proach outlined by Miller and Ross (1975). 
Students whose expectations were confirmed 
on Tests 2 and 3 did make more attribution to 
effort, the basis of their expectations, than did 
students who received unexpected scores.

The results of this study highlight the im-
portant role of effort in the attribution pro-
cess. While a number of previous studies have
shown that attributions are affected by out- 
comes and expectancy confirmation (e.g., 
Frieze & Weiner, 1971; McMahon, 1973), the 
present study shows that over a series of 
trials, effort emerges as the single most im- 
portant attribution factor. On Tests 2 and 3 
effort attributions were influenced by both 
motivational and nonmotivational attribution 
processes. Effort is distinguished from the 
other attribution factors by the fact that it 
alone is under the actor’s control. It may be 
easier for the actor to perceive the covariance 
between effort and outcome than to perceive 
the relationship between outcome and the 
other attribution factors (Kelley, 1967). For 
instance, an actor may attempt to assess the 
difficulty of the test while he/she is taking it, 
but an accurate assessment of its difficulty is 
probably dependent on social comparison in- 
formation provided by subsequent informal 
discussion and feedback on class performance.

Ability and task difficulty attributions, in
contrast to effort ascriptions, were relatively 
uninfluenced by the confirmation factor. Con- 
cern with self-protection and enhancement 
were the most important determinants of abil- 
ity and task attributions in the present study. 
The path analysis suggested that unlike effort, 
neither ability nor task difficulty judgments 
were important in the formation of expect- 
ancies. According to our interpretation of the 
expectancy-confirmation model, lack of a di-
rect relation between the expected effect of a
factor and the expected level of performance
makes it impossible for that model to make
meaningful predictions of posttest attribu-
tions. This is because achieving or not achiev-
ing the score one expects has no clear im-
plications for attributions to a factor un-
related to the production of the expectancy.

An exception to the last statement seems
warranted only in the case of the luck factor.
Attributions to luck, like those to effort, were
generally influenced by both egotism and ex-
pectancy confirmation. But, unlike effort, luck
was stressed as a causal factor by students
whose expectancies were disconfirmed rather
than those whose expectancies were con-
firmed. In further contrast to effort, the effect
of confirmation on luck attributions did not
depend upon whether luck was (as for Test
1) or was not (as for Tests 2 and 3) a basis
for expectancy. This is because luck bears a
more direct relation to the confirmation factor
than do the other three factors. Estimates of
luck's role are in a sense equivalent to judg-
ments of surprise and hence should be great-
est given disconfirmation of expectancy.

The results for the internal minus external
composite showed that on all three tests suc-
cessful students stressed internal factors to
explain their performance and less successful
students used external causes. For Tests 2
and 3, in addition to the egotism effect, it was
shown that disconfirmed expectancies were
associated with external attributions. In the
present context, the internal-external com-
posite might also be seen as a stable-unstable
index. In Weiner's scheme, ability and task
difficulty are regarded as stable factors and
effort and luck as unstable factors. It is [...],
however, that in a university examination
setting test difficulty may vary throughout the
semester. Additionally, we have demonstrated
that effort formed the strongest basis of ex-
pectancy for Tests 2 and 3 and that [...]
attributions for these tests were consistent
with the idea that judgments of one's amount
of prior study remain stable across [...].
It therefore seems reasonable to [...]
the emphasis on external factors by discon-
firmation subjects on Tests 2 and 3 as an
emphasis on unstable factors (see Frieze &
Weiner, 1971).

A third and perhaps more parsimonious
way to conceptualize these results is to view
the composite as a comparison between at-
tributions to factors that were important in
forming expectancies versus those that were
not. Given disconfirmation of Tests 2 and 3
expectancies, task difficulty and luck, factors
unrelated to Tests 2 and 3 expectancies, are
emphasized relative to effort and ability.

Deciding upon a way to conceptualize these
results depends upon the particular expect-
ancy attribution theory one adopts. Some
theorists maintain that the stability of at-
tribution factors is the dimension most rel-
evant for explaining the effects of confirma-
tion and disconfirmation on causal attribu-
tions (Frieze & Weiner, 1971; McMahon,
1973; Valle & Frieze, 1976). Others stress the
importance of the internal-external dimen-
sion (Feather, 1969; Miller & Ross, 1975).
Still others have deemphasized the role of ex-
pectancies and have stressed the motive to
make self-serving attributions regardless of
the degree of expectancy confirmation (Brad-
ley, 1978; Stephan et al., 1979). The present
study suggests that closer examination of the
antecedents of expectancies may be a useful
way to disentangle, compare, and reintegrate
these competing hypotheses.


Emergencies: What Are They and Do They Influence
Bystanders to Intervene? 


R. Lance Shotland and Ted L. Huston 


Pennsylvania State University 


Social psychological research on helping has, in part, been concerned with the 
intervention of bystanders into emergencies. Pertinent empirical literature does 
not seem to be available on what factors bystanders use to define an emergency 
nor the effect of such a decision on the rate of helping. A series of four studies 
was conducted to answer these questions. We found that (a) emergencies are
a subclass of problem situations that usually result from accidents; (b) there
is a high degree of agreement concerning what problem situations are definitely 
an emergency; (c) emergency situations are differentiated from other problem 
situations by threat of harm or actual harm worsening with time, unavailability 
of an easy solution to the problem, and necessity of obtaining outside help to 
solve the problem; (d) disagreement on whether a problem situation is an 
emergency or not results from differing perceptions of the degree to which 
threat of harm or actual harm worsens with time; (e) bystanders are more 
likely to help in emergency than in nonemergency problem situations. The 
results were interpreted as indicating that the need of the victim is a salient
feature used by bystanders in determining whether or not to help.


Social psychological research on helping 
has, in part, been concerned with the inter- 
vention of bystanders into emergencies. Sur- 
prisingly, the question of whether onlookers 
are more apt to help in emergencies than in 
nonemergencies has not been investigated, nor 
does there seem to be any firm theoretical 
basis for predicting the effect of whether or 
not a given situation is seen as an emergency 
on the rate of helping. Moreover, no evidence 
is available concerning the considerations by- 
standers use in labeling a situation as an 
emergency. These are the problems we wish 
to investigate. 
Latané and Darley (1970), in their mono-
graph on the unresponsive bystander, come
to an equivocal position on whether a person
is more or less likely to respond to an emer-
gency situation. In laying out the reasons
why bystanders may fail to help in emer-
gencies, these writers make the following ob-
servation about emergencies: “The picture is
a grim one. Faced with a situation in which
he (the bystander) can gain no benefit, un-
able to rely on past experience, on the experi-
ence of others, or on forethought and plan-
ning, denied the opportunity to carefully con-
sider his course of action, the bystander to
an emergency is in an unenviable position. It
is perhaps surprising that anyone should in-
tervene at all” (p. 31).

Latané and Darley, in proposing their
decision-making model, however, suggest
that perceiving an event as an emergency
is a critical determinant of whether a by-
stander will provide help. “If an individual is
to intervene in an emergency,” according to
Latané and Darley (1970), “he must make
not just one, but a series of decisions. Only
one particular set of choices will lead him to
take action in the situation” (p. 31). In order
to intervene the onlooker must (a) notice 
that something is happening, (b) interpret 
the event as an emergency, (c) assume re- 
sponsibility for taking action, (d) decide on 
the appropriate form of assistance, and (e) 
decide how to implement the assistance. 
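
The sequential character of the five decisions listed above can be made concrete with a minimal Python sketch. The step labels paraphrase the text; the function names and the yes/no coding of each decision are illustrative assumptions, not part of Latané and Darley's model.

```python
from typing import List, Optional

# The five decisions, in the order given in the text (labels paraphrased).
STEPS = [
    "notice that something is happening",
    "interpret the event as an emergency",
    "assume responsibility for taking action",
    "decide on the appropriate form of assistance",
    "decide how to implement the assistance",
]


def first_failed_step(decisions: List[bool]) -> Optional[str]:
    """Return the first decision answered 'no', or None if all five are 'yes'."""
    if len(decisions) != len(STEPS):
        raise ValueError("expected one yes/no answer per decision")
    for step, passed in zip(STEPS, decisions):
        if not passed:
            return step
    return None


def intervenes(decisions: List[bool]) -> bool:
    """Only one path (every answer 'yes') ends in intervention."""
    return first_failed_step(decisions) is None


# Hypothetical onlooker who notices the event but does not read it as an emergency.
print(first_failed_step([True, False, True, True, True]))
```

In this sketch a single "no" anywhere in the chain is enough to block helping, which is why labeling an event as an emergency (the second step) matters for everything that follows.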

Piliavin, Piliavin, and Rodin (1975) take a
somewhat different stance concerning the im-
pact of emergencies on helping. According to
their model, emergencies produce arousal that
becomes more unpleasant as it increases. In
order to reduce the unpleasant arousal, by-
standers take action, the form of which de-
pends on the line of action they perceive to
be most likely to reduce their arousal quickly
and effectively. Emergencies ought to in-
tensify the actions of bystanders, but whether
the onlookers run toward or away from the
scene is predictable from elements in the
situation other than its emergency character.

Since we are interested in determining
whether emergencies stimulate helping, our
first task was to identify such events and to
differentiate them from other situations where
help is not a matter of urgency or necessity.
The emergency intervention literature has
confronted onlookers with a variety of inci-
dents, including a collapsed subway rider
(Piliavin, Rodin, & Piliavin, 1969), a physical
assault (Shotland & Straw, 1976), and a vic-
tim of stomach cramps (Staub, 1974). No
systematic attempt has been mounted to iden-
tify the range of events perceived to be of an
emergency nature.

This article reports data from four studies. 
The first three studies were concerned with 
identifying emergencies and defining their 
properties. The fourth study examined
whether the emergency character of an event
influences helping.


Study 1 


The first study had two phases. In the first
phase (Part A) participants were asked to
identify the kinds of events that come to
mind when they think of an emergency. The
events identified by these respondents, as well
as a sample of the kinds of events simulated
by researchers studying bystander interven-
tion, were rated as to their degree of emer-
gency by a new set of respondents during the
second phase (Part B) of the study.


Part A 
Method 


Subjects. The subjects were 39 men and 16 women 
enrolled in a social psychology course, winter term 
1976, at The Pennsylvania State University. 

Procedure. Subjects were asked to list events they 
considered an emergency that they had witnessed, 
imagined, or knew about. The term emergency was 
left undefined, given our interest in determining the 
subject’s own definition. 


Results 


In order of frequency, the following events 
were identified: auto accident, fire, heart at- 
tack, drowning, flood, poisoning, airplane 
crash, hurricane/tornado, earthquake, rape, 
and war. The events can be classified into the 
following categories, listed in order of the 
number of events identified: (a) accidents; 
(b) illness, disease, and death due to natural 
causes; (c) natural disasters; (d) acts of 
violence and crime; (e) psychological and 
emotional problems (e.g., attempted suicide, 
nervous breakdown); (f) political problems 
and crises (e.g., overpopulation, oil shortage);
(g) everyday problems and crises (e.g., late
for an appointment, studying all night for an
exam); and (h) miscellaneous (e.g., electrical
power failure, financial emergency).


Part B 
Method 


The emergency situations generated by the subjects 
in Part A were augmented by a list of emergencies 
simulated in previous research (Huston & Korte, 
1976), as well as by situations generated by the in- 
vestigators. 

Subjects. The subjects were 69 females and 21 
males enrolled in a social psychology course, spring 
term 1976, at the Pennsylvania State University. 

Procedure. The final pool of items consisted of 96 
descriptions of events. Half of the items were put 
into Form A, the other half into Form B. This was 
done to prevent respondents from becoming fatigued. 
Participants rated each event in terms of whether it 
is an emergency on the following 5-point scale: 
(1) definitely an emergency, (2) probably an 
emergency, (3) not sure, (4) probably not an emer-
gency, (5) definitely not an emergency.


Table 1
Classification of Events

Event                                                          M     SD    Category
Definite emergency
  Cut artery, profuse bleeding                                 1.00   .00  Accident
  House ablaze, people screaming for help                      1.00   .00  Accident
  Child poisoned                                               1.00   .00  Accident
  Child swallowed razor blade                                  1.00   .00  Accident
  Person in shock, lying on ground                             1.07   .25  Accident
  Heart attack                                                 1.02   .15  Illness, disease
  Airplane crash, some passengers still alive                  1.02   .15  Accident
  Rape in progress                                             1.09   .36  Act of violence
  Car accident, driver motionless on ground                    1.09   .29  Accident
  Drug overdose                                                1.09   .46  Accident
Not sure (a)
  Terminal cancer, 3 mos. to live                              2.60  1.35  Illness, disease
  Lost person in woods shouting for help                       2.64  1.03  Miscellaneous
  Car broken down on side of road                              2.72   .98  Everyday problem
  Emergency landing by airplane, passengers shaken but unhurt  2.78  1.08  Accident
  Burning leaves, no one attending, windy day                  2.81   .95  Accident
  Rock stars in danger of being trampled by fans               2.84  1.18  Miscellaneous
  Mildly intoxicated friend wants to drive home                2.84  1.18  Everyday problem
  Blackout in neighborhood                                     2.98  1.03  Miscellaneous
  Man on first date with woman he likes, no money              3.07  1.08  Everyday problem
  Friend states he is miserable and depressed                  3.18   .92  Psychological and emotional problem
Probably not an emergency
  Car door chips other car's paint                             3.75  1.06  Everyday problem
  Telethon wants to get $20 million for muscular dystrophy     3.75  1.17  Miscellaneous
  Car parked by fire hydrant                                   3.98   .98  Everyday problem
  Scraped knee                                                 4.05   .84  Everyday problem
  Tramp panhandling                                            4.05   .99  Everyday problem
  Tray of hors d'oeuvres drops at party                        4.09   .91  Everyday problem
  Child holds breath in protest                                4.18   .97  Everyday problem
  Person loses dime in pay phone                               4.23   .94  Everyday problem
  Cat is stuck in a tree                                       4.23   .73  Everyday problem
  Someone has cigar, no match                                  4.87   .54  Everyday problem

(a) Ms closest to 3.


Results 


The most striking finding is that our par- 
ticipants agree considerably concerning what 
events are emergencies. In fact, four items 
were seen as definitely an emergency by all 
respondents; several other events were seen as 
definite emergencies by all but a few of the 
respondents. As can be seen in Table 1, the
events seen as emergencies were primarily
accidents. At the other end, problems seen as
probably not an emergency were categorized,
for the most part, as everyday problems.
Situations that elicited uncertainty concerning
their emergency nature fit a number of cat-
egories, including everyday problems, psycho-
logical and emotional problems, and
miscellaneous occurrences. From Table 1 it is
apparent that emergencies appear to be a spe-
cial class of problem situations.
As noted earlier, a sampling of events sim-
ulated in the laboratory in previous research
was rated by our respondents in terms of
their degree of emergency. These events
tended to be rated as probably an emergency.
Representative events are as follows: a col-
lapsed subway rider (M = 1.50; SD = .73),
a person lying on the floor having fallen from
a ladder (M = 1.61; SD = .64), an individ-
ual suffering from an asthmatic attack (M =
2.00; SD = 1.10), a person groaning and
lying in a doorway (M = 2.02; SD = .17),
and an individual complaining of severe
stomach cramps (M = 2.38; SD = .86).


Study 2 


The next step is to identify the defining 
qualities of emergencies that distinguish them 
from other problem situations. Latané and
Darley (1970) have drawn a portrait of 
emergencies. We shall draw heavily from their 
observations, elaborating them in terms of 
insights gleaned from Table 1. First, as can 
be seen in Table 1, events seen as definite 
emergencies all happen suddenly and unex-
pectedly. Victims and bystanders alike find
themselves unprepared. Eight of the 10 events
most consistently seen as definite emergencies
were accidents; the remaining 2—heart attack 
and rape—also catch their victims by sur- 
prise. The fact that something happens with- 
out warning, however, does not necessarily 
indicate an emergency. Dropping a tray of 
hors d’oeuvres at a party may be embarrass- 
ing, but according to our respondents, it is 
probably not an emergency. 

A second important feature of emergencies 
is that they involve threat of harm or actual 
harm to the victim. Events seen by our re-
spondents as definite emergencies all threaten
the life of the victim. A cut artery, poisoning, 
and drug overdose are unambiguous with 
regard to their life-threatening potential. In
each case, in addition, immediate, urgent ac- 
tion is required if the victim is to be saved. 
Too much loss of blood or the failure to find 
an antidote quickly to counteract the poison 
or the drug would be fatal. In contrast, events 
that were seen as nonemergency problems or 
that our respondents were unsure how to
classify pose little or no threat of harm to the
victim. A child holding his breath in protest
may put a scare into his parents, but our 
respondents did not think of it as much of an 
emergency. Similarly, a person who scrapes a 
knee or loses a dime in a pay phone may feel 
distressed, but others are unlikely to see this 
situation as an emergency. 

Another important feature of an emergency 
is that outside help is necessary. Without the
intervention of an onlooker, the victim is un- 
able to keep the situation from worsening. A 
helpless victim is characteristic of all situa- 
tions classified as definite emergencies in 
Table 1. The fact that a victim is helpless, 
however, does not necessarily indicate that an 
emergency is at hand. Victims of terminal 
cancer are helpless, but our respondents were 
uncertain whether to classify their circum- 
stances as an emergency. This suggests, per- 
haps, that in order for an event to be seen as 
an emergency, it must be possible for on- 
lookers to do something to help the victim. 

The above analysis suggests that emer- 
gencies involve several essential features: 

1. They occur suddenly and unexpectedly. 

2. They involve threat of harm or actual
harm to the victim.

3. The threat increases with time.

4. The victim is helpless.

5. Potentially effective intervention must 
be possible. 

According to the data in Study 1, it appears 
that events that involve one of the above fea- 
tures but not others are unlikely to be seen 
as definite emergencies. We expect to see 
“threat of harm” (2) and “threat of harm 
increasing with time” (3) to be moderately 
to highly correlated. A moderate correlation 
is predicted between “something can be done
to correct the situation” (4) and “outside 
help is necessary” (5). 


Method 


Brief scales comprised of both positively and nega- 
tively worded Likert-type items were constructed to 
measure each of the factors that were hypothesized 
to define an event as an emergency. The scale de- 
signed to measure the degree to which an event 
occurred suddenly and unexpectedly consists of 10 
items such as “the incident occurred without warn- 
ing” and “the situation could have been anticipated.” 
Guttman's lambda-3 index of reliability was measured
as .88 and .80 on two data sets using different clus-
ters of situations. The scale measuring threat of harm
or actual harm consisted of 13 items yielding a Gutt-
man lambda-3 index of reliability of .83 and .88. A
representative item from this scale is “individuals or 
property involved will only benefit from this situa- 
tion.” The scale indexing threat of harm or actual 
harm increasing with time had reliabilities of .79 
and .79 with a total of 8 items. An example from 
this scale is “the probability of threat or harm to the 
person or property increases with time.” The scale 
designed to tap whether something can be done to 
correct the situation consisted of 9 items yielding a 
Guttman lambda-3 of .83 and .79. An example of an 
item from this scale is: “help will be effective in 
solving the problem.” The scale designed to ascertain 
whether outside help is necessary contained 10 items 
and had Guttman reliabilities of .75 and .79. “Others 
are needed to help change the situation” is an item 
taken from this scale. 
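
Guttman's lambda-3 is algebraically the same quantity as coefficient alpha, so the reliabilities reported above can be illustrated with a short Python sketch. The response data, the function names, and the 1 to 5 scoring range are hypothetical; the sketch also assumes that negatively worded items have already been reverse-scored.

```python
from statistics import variance


def reverse_score(value, low=1, high=5):
    """Flip a negatively worded Likert item so that all items point the same way."""
    return low + high - value


def lambda3(scores):
    """Guttman's lambda-3 (coefficient alpha).

    scores: one list per respondent, each holding k item scores.
    """
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of four subjects to a three-item scale.
responses = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 1]]
print(round(lambda3(responses), 2))
```

The index rises toward 1 as the items covary more strongly relative to their individual variances, which is why it is sensitive to forgetting to reverse-score negatively worded items.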

The items from the five scales were randomized 
along with an additional 5-point Likert-type item 
designed to measure subjects’ perceptions of a prob- 
lem situation as an emergency. The anchor points of 
the latter measure were (1) definitely an emergency 
and (5) definitely not an emergency, with (3) being 
not sure. 

To increase the generalizability of the results, we 
decided to investigate the defining features of three 
problem situations. Two situations that were rated as 
“probably an emergency” by subjects in Study 1 
were randomly chosen. We used situations rated as 
“probably an emergency” for two reasons. First,
situations rated as “definitely an emergency” had 
either no or very little variance. Second, as we 
pointed out previously, this is the category that most 
of the situations used by previous researchers in- 
vestigating emergency intervention phenomena had 
used in their research. The first emergency situation 
was “you see a driver weaving from one side of the 
road to the other. He is apparently drunk” (M = 
2.07; SD = 1.04). The second situation was: “you 
see the classmate next to you have an asthmatic 
attack” (M = 2.00, SD = 1.10). We also wished to
investigate crimes as emergencies, as one might ex-
pect that crimes might be emergencies with different
defining characteristics. However, the crimes on our
list were all rated as “definitely an emergency” and
had little or no variance. So a crime was chosen from
Rossi, Waite, Bose, and Berk’s (1974) list of crimes
ordered in terms of seriousness. We chose a crime
that was not seen as extremely serious. This problem
situation was: “you see a young man grab the purse
from an elderly woman and run away.”


Table 2
Attenuated and Unattenuated Pearson Correlation Matrix Indicating the
Interrelationships Between the Variables Hypothesized to be Components of an
Emergency Across the Three Emergency Situations

1. Sudden and unexpected
2. Threat of harm              -.23
3. Harm increases with time    -.09
4. Something can be done        .10
5. Outside help necessary       7
6. Emergency                   -.06

Note. Below the diagonal are correlations corrected for reliability; above it
are the attenuated correlations. To test for significance use the attenuated
matrix.

Subjects. The subjects were 279 undergraduates 
from an introductory psychology course at The 
Pennsylvania State University during the spring term 
of 1977. Subjects were randomly assigned a given 
emergency situation: 57 males and 43 females were 
assigned to the drunk driver situation; 50 males and
38 females were assigned to rate the asthma
attack; and 45 males and 45 females, plus one
respondent whose sex was not ascertained, rated the
purse-snatching incident. 

Procedure. The subjects were given the question- 
naire described above. The questionnaire contained a
front page that listed one of the three problem situa-
tions and provided instructions to the subject to rate
the situation on each of the items listed on the fol-
lowing pages. A sample of the Likert scale format
was included on the front page of the questionnaire 
as well as on each succeeding page. 


Results 


As Table 2 illustrates, if we view across the
three emergency situations and look at the
half of the matrix corrected for attenuation,
the scale measuring the degree to which an
event is seen as sudden and unexpected bears
little relationship to the other scales. The
remaining four scales are moderately related
with one another and share between 15.2% of
the variance (threat of harm with outside
help necessary) and 38.4% of the variance
(harm increases with time with outside help
necessary).


Table 3
Regression Analyses of the Variables That Lead to a Decision That a Problem
Situation Is an Emergency (Corrected for Attenuation)

Variable in order of insertion
into regression equation         Multiple R    R²       β        F

Problem Situation 1: Asthmatic attack
  Harm increases with time          .613      .376    1.060    41.17*
  Something can be done             .677      .458    -.568    13.22*
  Sudden and unexpected             .695      .484    -.158     3.54
  Threat of harm                    .704      .496     .140     2.13
  Outside help necessary            .705      .496    -.040      .41
Problem Situation 2: Drunk driver
  Harm increases with time          .717      .514     .984    38.75*
  Outside help necessary            .743      .552    -.283     4.94
  Sudden and unexpected             .756      .572     .141     3.44
  Something can be done             .758      .574     .093      .85
  Threat of harm                    .759      .576    -.080      .34
Problem Situation 3: Purse snatching
  Harm increases with time          .360      .130     .344     1.24*
  Something can be done             .369      .136     .148     1.25
  Threat of harm                    .382      .146    -.150     1.20
  Sudden and unexpected             .392      .154    -.103     1.00
  Outside help necessary            .395      .156     .070      .25

*p < .01.
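
The shared-variance percentages quoted for Table 2 are squared correlations. As a minimal Python sketch (the function names are illustrative assumptions), the two percentages given in the text can be converted back to the size of the underlying correlations; squaring discards the sign, so only the magnitude is recovered.

```python
import math


def shared_variance(r):
    """Proportion of variance two scales share, given their correlation r."""
    return r * r


def correlation_magnitude(shared):
    """Absolute size of the correlation implied by a shared-variance proportion."""
    return math.sqrt(shared)


# Percentages quoted in the text for the corrected matrix.
for label, shared in [
    ("threat of harm with outside help necessary", 0.152),
    ("harm increases with time with outside help necessary", 0.384),
]:
    print(f"{label}: |r| = {correlation_magnitude(shared):.2f}")
```

Read this way, the "moderately related" scales correspond to corrected correlations of roughly .39 to .62 in absolute size.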
Regression was employed to determine the
impact of various perceptual elements in our
subjects’ definitions of events as emergencies.
In Table 3, R and R² are ordered as they
were produced by a forward stepwise regres-
sion. The beta weights and F ratios are the
product of the summary analysis in which the
final regression equation, which had all the
variables entered, was used. Thus, the F
ratios and beta weights reflect the contribu-
tions of each of the variables by holding con-
stant all of the other variables. Since differ-
ences in the reliabilities of the scales can
cause disparities due to measurement error
rather than to the variables themselves, we
decided to conduct the analyses on matrices
corrected for attenuation due to imperfect
reliability. However, one must be cautious in
interpreting tests of significance based upon
data corrected for attenuation, in that the test
statistic, as well as other coefficients, is in-
flated as a result of increased sampling error


(Cohen & Cohen, 1975). Therefore, we have 
set alpha at the .01 level, and we will report 
differences in conclusions regarding each 
study that are reached by doing the regres- 
sion analysis on attenuated data using the 
standard confidence interval. Since no sex 
differences were found for either the emer- 
gency question or with regard to any of the 
scales, the samples were combined for the 
regression analysis. The results are shown in 
Table 3. 
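
The correction for attenuation referred to above is conventionally the classical formula, in which an observed correlation is divided by the square root of the product of the two scales' reliabilities. The Python sketch below assumes that this standard formula is the one applied; the observed correlation of .30 is hypothetical, and the reliabilities of .79 and .83 are merely of the size reported for the scales in the Method section.

```python
import math


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Classical disattenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical observed correlation between two scales.
observed = 0.30
corrected = correct_for_attenuation(observed, rel_x=0.79, rel_y=0.83)
print(round(corrected, 2))
```

Because the divisor is below 1 whenever either scale is less than perfectly reliable, the corrected coefficient is always at least as large in absolute value as the observed one, which is the inflation the text warns about when significance tests are run on corrected data.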

The asthmatic attack was generally seen as 
probably an emergency (M=2.10, SD= 
.87). There were two statistically significant 
predictors of the degree to which the asth- 
matic attack was viewed as an emergency. As 
shown in Table 3, the most important pre- 
dictor was harm increasing with time, which 
was entered first and accounted for approxi- 
mately 38% of the variance. Whether the 
incident was seen as one in which something 
could be done to help the victim was the 
second statistically significant variable and 
accounted for another 6.4% of the variance. 
Table 4
Regression Analysis Across 96 Problem Situations of the Variables That Lead to a Decision
That a Problem Situation Is an Emergency (Corrected for Attenuation)

Variable in order of insertion
into regression equation         Multiple R    R²       β        F

  Harm increases with time          .856      .733     .654    11.78*
  Something can be done             .908      .824    -.255    13.54*
  Outside help is necessary         .916      .838     .222     7.34*
  Threat of harm                    .917      .840     .130      .84
  Sudden and unexpected             .917      .842     .039      .80

Note. Reliabilities were calculated across emergencies rather than subjects and
these reliabilities were used to correct for attenuation.
*p < .01.


The drunk driver situation was again seen 
by most of our respondents as “probably an 
emergency” (M = 2.05, SD = .97). Again, 
harm increasing with time was entered into 
the equation first. It accounted for approxi- 
mately 51.4% of the variance. No other scale 
was a significant predictor of perceiving the 
event as an emergency. 

The purse-snatching situation was rated as 
probably an emergency by most of our sub- 
jects (M=1.96, SD=.91). As Table 3 
shows, harm increasing with time was the only 
scale that was significantly tied to the per- 
ception of the event as an emergency account- 
ing for 13% of the variance in the emergency 
ratings. 


Discussion 


Only one of the variables, threat of harm 
increasing with time, was a significant pre- 
dictor of perceiving the event as an emergency 
across all three situations. No other variable 
was significant across any two of the situa- 
tions. The same conclusion is reached if the 
attenuated matrix is used. We find these re- 
sults surprising. Perhaps the discrepancy be- 
tween our expectations and the results reflects 
differences in our conceptualization and our 
test of the model. In formulating the model
we were attempting to identify the critical 
dimensions that differentiate emergency situa- 
tions from other problem situations. The test 
provided in this study asked the question: 
What are the considerations that cause differ- 
ent people to see the same situation differ- 


ently? This is an important question, par-
ticularly if individual differences in perceiv-
ing an event as an emergency relate to the
likelihood that onlookers respond. We would
now like to focus our attention on answering 
the question: What features do people use to 
differentiate emergency situations from other 
nonemergency problems? 


Study 3 
Method 


The same questionnaire used in Study 2 was used
in Study 3. Instead of selecting 3 problem situations,
all of which were about equal in the degrees to which
they were likely to be seen as an emergency, all 96
problem situations used in Study 1 were again em-
ployed. Each situation was rated by a minimum of
four subjects (M = 5.44), who were randomly as-
signed to the emergency situations. The responses of
subjects who rated a given emergency were averaged,
thereby producing average scale scores and an aver-
age rating of the situation as an emergency. The
correlational analysis, then, is across 96 problem
situations rather than subjects. The variance gener-
ated with this procedure is basically due to differ-
ences in situations rather than to individual differ-
ences in the perceptions of the subjects.
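
The aggregation step described above, in which the situation rather than the subject becomes the unit of analysis, can be sketched in Python as follows. The ratings and the function name are hypothetical; only the averaging logic follows the text.

```python
from collections import defaultdict
from statistics import mean


def average_by_situation(ratings):
    """Collapse (situation, rating) pairs into one mean rating per situation."""
    grouped = defaultdict(list)
    for situation, rating in ratings:
        grouped[situation].append(rating)
    return {situation: mean(values) for situation, values in grouped.items()}


# Hypothetical emergency ratings (1 = definitely an emergency,
# 5 = definitely not an emergency), four raters per situation.
ratings = [
    ("cut artery", 1), ("cut artery", 1), ("cut artery", 1), ("cut artery", 1),
    ("scraped knee", 4), ("scraped knee", 4), ("scraped knee", 5), ("scraped knee", 3),
]
print(average_by_situation(ratings))
```

Each scale score would be averaged in the same way, so that the later correlations and regressions run over 96 situation-level rows. With a mean of 5.44 raters per situation, 96 situations imply roughly 522 subjects.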

Subjects. The subjects were 522 male and female
undergraduates at The Pennsylvania State University.
The students participated during the fall term of
1977 and winter term of 1978 and earned supple-
mental credit for their introductory psychology
course.


Results

The regression procedures used in Study 2
were also employed in Study 3. As Table 4
indicates, and as found in Study 2, “harm
increases with time” was an important pre-
dictor accounting for 73.3% of the variance.
The second statistically significant variable
was “something can be done,” which ac-
counted for another 9.1% of the variance and
yielded a negative beta weight. The last sta-
tistically significant variable was “outside
help is necessary,” accounting for another
1.4% of the variance.
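
The increments in explained variance attributed to each predictor are differences between successive cumulative R² values in the stepwise regression. A minimal Python sketch, using the cumulative R² column as read from Table 4 (the function name is an illustrative assumption):

```python
def r2_increments(cumulative_r2):
    """Variance added at each step of a forward stepwise regression."""
    increments = []
    previous = 0.0
    for value in cumulative_r2:
        increments.append(round(value - previous, 3))
        previous = value
    return increments


# Cumulative R² after each variable enters, in order of insertion (Table 4).
table4_r2 = [0.733, 0.824, 0.838, 0.840, 0.842]
print(r2_increments(table4_r2))
```

The first three differences are the contributions of the three statistically significant predictors; the last two steps add almost nothing.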

The zero-order (Pearson) correlation mat-
rix illustrating the interrelationships among
the variables shown in Table 5 provides addi-
tional information. This correlation matrix,
although similar to the one presented in Table
2 in some respects, differs in others.

Again, it appears that the sudden and un- 
expected variable bears a small relationship 
to the other variables. In general, however, 
the relationships among the other variables 
are much higher. Of particular note is the cor- 
rected correlation between “harm increases 
with time” and “threat of harm.” The correla- 
tion between those two scales was .89, indicat- 
ing that “threat of harm” and “harm in- 
creases with time” are basically the same 
variable. The correlation between “harm in- 
creases with time” and “outside help is neces- 
sary” (r = .80) indicates that these two vari- 
ables also show a great deal of common vari-
ance.

The negative beta weight of “something
can be done” along with its zero-order correla-
tion of .05 with the emergency question and
its zero-order correlation of .38 with “harm
increases with time” and .22 with “outside
help is necessary” indicates that “something
can be done” is working as a suppressor vari-
able in the multiple regression equation (see
Nunnally, 1967). This is consistent with the
negative beta weight associated with the
“something can be done” variable found in
the asthmatic attack situation in Study 2.
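
How a predictor that is nearly uncorrelated with the criterion can still receive a sizable negative beta weight is easiest to see in a two-predictor case. The Python sketch below uses the standard formulas for standardized regression weights with two predictors. It is a simplification of the five-variable equation reported here, and it takes the correlation between “harm increases with time” and the emergency rating to be about .86, in line with the first-step multiple R of .856 in Table 4.

```python
def two_predictor_betas(r_y1, r_y2, r_12):
    """Standardized beta weights for a criterion y regressed on predictors 1 and 2."""
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# Predictor 1: "harm increases with time"; predictor 2: "something can be done".
beta_harm, beta_done = two_predictor_betas(r_y1=0.86, r_y2=0.05, r_12=0.38)
print(round(beta_harm, 2), round(beta_done, 2))
```

In this simplified case “something can be done” receives a negative weight even though its zero-order correlation with the emergency rating is slightly positive: it shares variance with “harm increases with time” that is unrelated to the emergency rating, and removing that variance sharpens the other predictor.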

The regression analysis based upon the atten-
uated correlation matrix leads to the same
general conclusion. In this analysis, threat
of harm proved to be a significant predictor of
perceiving events as emergencies.


Table 5
Attenuated and Unattenuated Correlation Matrix Indicating the
Interrelationships Between the Variables Hypothesized to be Components of an
Emergency Across the 96 Emergency Situations

Variable                         1      2      3      4      5      6
1. Sudden and unexpected         --    .16    .13    .04    .20    .21
2. Threat of harm               .19     --    .74    .03    .53    .78
3. Harm increases with time     .15    .89     --    .30    .65    .78
4. Something can be done       -.05    .04    .38     --    .17    .04
5. Outside help necessary       .24    .65    .80    .22     --    .70
6. Emergency                    .23    .85    .86    .05    .78     --

Note. To test for significance use the attenuated
matrix (i.e., the correlations above the diagonal).
Correlations equal to or above .17 are significant at
the .10 level, correlations equal to or above .20 are
significant at the .05 level.


Discussion


It appears that a major determinant of
whether or not a problem situation is seen as
an emergency is the presence of harm or
threat of harm. If there is a sufficient amount 
of harm, then almost by definition it becomes 
worse with time. In addition, if the amount of 
harm is sufficiently grave, then outside help 
is necessary. The negative beta weight asso- 
ciated with “something can be done” indicates 
that if there is an easy solution to the problem, 
it is less likely to be viewed as an emergency.
For example, Table 1 illustrated that a
scraped knee was generally not seen as an
emergency (M = 4.05). Harm is minimal,
providing that infection doesn’t set in. It does 
not get worse with time, and no outside help 
is required in that the victim generally can 
take care of the problem of cleaning the 
abrasion and putting a Band-Aid on it. On 
the other hand, Table 1 indicates that all of 
our subjects saw a cut artery with profuse 
bleeding as a definite emergency (M = 1.00). 
In this incident, there is a greater degree of 
harm than with a scraped knee—at the least 
the cut is deeper. The situation is worsening 
with time, in that if nothing is done the per- 
son may well bleed to death. With this type 
of injury an outside source of help with med- 
ical expertise is usually required to solve the 
problem directly. One might expect that a
person who knows that “something can be
done” and precisely what to do, for example, 
a doctor with a cut artery and proper equip- 
ment, may regard the incident as much less of 
an emergency. 

Of the three considerations used by re- 
spondents in Study 3 to discriminate between 
emergency situations and nonemergency prob- 
lems, only one differentiated respondents in 
Study 2 in terms of the degree to which they 
saw events as emergencies. This factor was 
the degree to which they saw harm to the 
victim increasing with time. 


Study 4 


Our original intention was to unravel the
concept of an emergency and, once we under-
stood its important features, to design a fac- 
torial experiment assessing the relative im- 
portance of the elements in accounting for 
helping. The findings of the third study, how- 
ever, clearly indicate that the defining in- 
gredients of an emergency are tightly inter- 
twined (see Table 5), thereby making it im- 
possible to separate them factorially. Study 3 
demonstrated that the perceptions of threat 
of harm or actual harm, harm increasing with 
time, and outside help as necessary were 
highly intercorrelated. Furthermore, the per- 
ception that “something can be done” acted 
as a suppressor variable, meaning that taken 
alone it did not correlate with people’s ratings 
of events as emergencies. Although we now 
know what people mean by an emergency, we 
could not experimentally isolate its compo-
nent parts. The key question remains, how-
ever, as to how people behave in emergency 
versus nonemergency situations, and that is 
what the next study was designed to answer. 


Method 


Given the high intercorrelations between two of 
the three factors used to predict an emergency and 
the indirect effect of the third factor, it did not seem 
reasonable to use a factorial design to try to decipher 
the relative importance of the three factors in pro- 
viding help. We decided simply to examine emer- 
gencies versus nonemergencies. The major difficulty 
is to construct problem situations of an emergency 
and nonemergency nature that require the same kinds 
of help, so that the two types of situations are com-
parable. We decided that the most reasonable proce-
dure was for the persons who needed help always to
ask for a ride to a given location following their
explanations of why the ride was necessary.

The setting was the student parking lot at The
Pennsylvania State University. Four male and four
female undergraduate and graduate students served
as the help solicitors. The solicitors varied in age
(from 19 to 30 years) and in physical characteristics,
although none was extreme as to size, weight, or
attractiveness. Males varied on length of hair and
amount of facial hair. The solicitors were stationed
at various points in a parking lot capable of holding
1,340 automobiles. When a subject approached an
automobile in the solicitor’s area, or when a car was
being parked in the solicitor’s area, the solicitor
would approach the potential subject and explain his
or her predicament following a script and keeping
to a standardized, moderate level of affect. The solici-
tor would then ask for a ride to his or her home
in a small town approximately 5 miles away. Seven
emergency and seven nonemergency situations were used.
Some examples of emergency and nonemergency sit-
uations are:

1. “Excuse me. I need some help. I’m a diabetic
and I forgot to take my insulin this morning. I’m
long overdue. Can you give me a ride home to Boals-
burg?” (emergency).

2. “Excuse me. I need some help. I forgot to take
my allergy medicine this morning and it’s starting
to bother me. Can you give me a ride home to Boals-
burg?” (nonemergency).

3. “Excuse me. I need some help. My roommate
called and she took an overdose of sleeping pills. Can
you give me a ride home to Boalsburg?” (emer-
gency).

4. “Excuse me. I need some help. My roommate
called and sounded very depressed. Can you give me
a ride home to Boalsburg?” (nonemergency).

Once the subject’s response was ascertained he or
she was immediately debriefed. A response was
counted as helping if the subject offered to take the
solicitor to the appointed destination.

Subjects. The subjects were 286 male and female
students (208 males and 78 females) who were
either entering or exiting from an auto-
mobile in a student parking lot on The Pennsylvania
State University campus during spring term.


Hypotheses 


The key hypothesis concerns whether peo-
ple are more inclined to help in emergencies than
in nonemergencies. As we indicated in the
introduction to the paper, there is no theoret-
ical basis for making a prediction. Nonethe-
less, we expect people to help more frequently
in emergencies than in nonemergencies
because bystanders should perceive a greater
need for their assistance. In order to general-
ize, we employed a number of different emer-
gency and nonemergency situations and used
both male and female solicitors. Given that 
the subjects were of both sexes and were 
either arriving at or leaving the lot, we re- 
corded this information as well. Since we col- 
lected this information, the following addi- 
tional hypotheses were formulated. 

1. Based upon previous research (e.g., 
Borofsky, Stollak, & Messé, 1971; Piliavin & 
Piliavin, 1972), we hypothesized that males 
will help more than females. 

2. Latané and Dabbs’ (1975) research sug- 
gests that females should receive more help 
than males. 

3. We expect subjects who are arriving at 
the parking lot to be less helpful than sub- 
jects who are leaving the lot. Commuters who 
are arriving should be, on average, going to 
class, whereas those who are leaving should 
have greater time flexibility. Darley and Bat- 
son (1973) found that subjects in a hurry 
help less frequently than subjects with more 
time. 


Results 


Manipulation check. The first step is to 
determine whether our emergency events are 
seen as emergencies in comparison to our non-
emergency events. To accomplish this, 168
subjects rated the situations. Each of our 14
situations was described to separate groups of 
6 males and 6 females earning extra credit in 
an introductory psychology class. They were 
asked to imagine themselves walking near 
their car in a parking lot and having a person 
approach them who then made one of the 14 
statements used in the study. The subjects 
were then asked to fill out a short question- 
naire concerning the situation. The emergency 
question used in each of the earlier studies 
was again used here. The results showed that
our emergency situations were indeed seen
as emergencies (M = 1.94) in comparison to
our nonemergency situations (M = 3.14), F(1,
164) = 74.35, p < .001. The degree to which
an event was seen as an emergency was
strongly correlated (.75) with judgments con-
cerning the severity of the situation. There
was no difference between males (M = 2.64)
and females (M = 2.44), F(1, 164) = 2.11, 
ns, nor were differences found due to the spe- 
cific incidents or to interactions between and 
among the nature of the event, the sex of the 
respondent, and the specific events. 

Tests of hypotheses. The data from the 
factorial study were analyzed by means of 
Goodman’s techniques for the assessment of 
interaction effects in multivariate contingency 
tables (Bishop, Fienberg, & Holland, 1975). 
Using analysis of variance terminology, the 
model of the four-way interaction was signif-
icant, χ²(1) = 8.04, p < .005. In nonemer-
gency situations when female subjects were 
arriving and a male solicitor asked for a ride, 
none of the 12 subjects in that cell helped in 
comparison to the average helping rate of 
55% across all cells. This is precisely the cell 
one would predict the lowest rate of help 
from. The models of the three-way and two- 
way interactions were not significant. In fact, 
if one lumps all four-, three-, and two-way 
interactions together and drops them from the 
model, leaving main effects, one loses a non-
significant chi-square, χ²(11) = 15.47. Since
the four variables only interact if all the 
variables are present and do not interact in 
any form with a subset of these variables, it 
is not useful to consider this model except in 
the unlikely possibility that another re- 
searcher is going to combine those four vari- 
ables. 
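
The 11 degrees of freedom reported for the pooled interactions can be checked by counting interaction terms. The sketch below is only an illustration and assumes that each of the four variables (type of event, sex of solicitor, sex of subject, arriving versus leaving) has two levels, so that every interaction term carries one degree of freedom. The factor labels are hypothetical names, not the authors' coding.

```python
from itertools import combinations

# Four dichotomous factors in the Study 4 design (labels are illustrative).
factors = ["event_type", "solicitor_sex", "subject_sex", "arrival_status"]


def interaction_df(factor_names, order):
    """Count the interaction terms of a given order.

    Each term contributes 1 df because every factor is assumed
    to have exactly two levels.
    """
    return len(list(combinations(factor_names, order)))


# Degrees of freedom for the two-, three-, and four-way interactions.
df_by_order = {order: interaction_df(factors, order) for order in (2, 3, 4)}
total_df = sum(df_by_order.values())

print(df_by_order, total_df)
```

Under these assumptions the six two-way, four three-way, and one four-way terms sum to the 11 degrees of freedom given in the text.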

Three of the main effects were also statis- 
tically significant. More help was produced in 
emergency conditions (64%) than in non-
emergency conditions (45%), χ²(1) = 11.17,
p < .001. Female solicitors received more help
(64%) than male solicitors (47%), χ²(1) =
9.04, p < .01. Males helped (61%) more than
females (40%), χ²(1) = 11.48, p < .01. In
addition, subjects who were arriving on cam-
pus (50%) were marginally less likely to help
than subjects who were leaving (63%), χ²(1)
= 3.84, p < .10. If we drop the data from the
cell that caused the four-way interaction
where no one helped, and use a chi-square
procedure to recalculate the main effects to
determine if the main effect is independent of
the interaction, we find that the main effect
due to the presence or absence of an emer-
gency is independent of the four-way inter-
action, χ²(1) = 5.76, p < .02. The sex of the
solicitor is marginally significant, χ²(1) =
3.74, p < .10, as is the sex of the subject,
χ²(1) = 3.76, p < .10. The main effect due to
whether the subject is coming or going, how-
ever, is dependent on the interaction, χ²(1) =
2.37, ns.
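
The significance levels attached to these one-degree-of-freedom chi-square values can be checked with a short calculation. The sketch below is not the authors' procedure. It assumes only the standard identity that a chi-square variate with 1 df is the square of a standard normal deviate, so its upper-tail probability is erfc(sqrt(x / 2)). The dictionary labels are illustrative.

```python
import math


def chi2_p_df1(stat):
    """Upper-tail p value for a chi-square statistic with 1 df.

    With 1 df, chi-square is the square of a standard normal deviate,
    so P(X > stat) = erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# Chi-square values as reported in the text (all with 1 df).
reported = {
    "emergency vs. nonemergency": 11.17,
    "sex of solicitor": 9.04,
    "sex of subject": 11.48,
    "arriving vs. leaving": 3.84,
    "emergency, zero cell dropped": 5.76,
    "arriving vs. leaving, zero cell dropped": 2.37,
}

for label, stat in reported.items():
    print(f"{label}: chi2(1) = {stat:.2f}, p = {chi2_p_df1(stat):.4f}")
```

The computed probabilities fall under the thresholds quoted in the text, and the value of 2.37 remains above .10, in line with its being reported as nonsignificant.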


Discussion 


The data from Study 4, when considered to- 
gether with the findings of Study 3, indicate 
that a prime concern motivating potential 
helpers is the degree to which the victim is in 
need of help. From Study 3 it was determined 
that our subjects defined an emergency pri- 
marily in terms of the presence of a sufficient 
amount of harm or threat of harm such that 
the situation was both worsening with time 
and that outside help was necessary. 

The results of Study 4 indicate that people
are more inclined to help in an emergency than
in a nonemergency situation. This
research is not the only line of research that 
indicates that the bystander’s perceptions of 
the need of the victim are an important deter- 
minant of whether or not help is offered. 
Weiss, Boyer, Lombardo, and Stitch (1973) 
found that the more pain a person is in, the 
more rapidly people will provide help. In an- 
other study, Staub and Baer (1974) investi- 
gated the severity of an incident by having a 
confederate feign either a heart attack or a 
twisted knee. Although the results were not 
entirely consistent, they concluded that the 
greater the severity, the more the helping. 
West and Brown (1975) used blood as a sign 
of severity and found more help when blood 
was used; in contrast, Piliavin and Piliavin 
(1972) found less help with the use of blood. 
A recent study by Clark (1975) found that 
the less a person is able to help himself, the 
more inclined people are to provide help. His 
data showed that a man on crutches who had 
dropped a book was much more likely to be 
helped than a man walking normally. Finally, 
Shotland and Johnson (1978) varied the 
severity of a fall and found that onlookers 
were more apt to help the victim after a bad 
fall than after a mild one. As a group, these 


R. LANCE SHOTLAND AND TED L. HUSTON 


studies document the impor 
of need as a determinant of 
vention. 


General Discussion


The data reported in the four 
on both Latané and Darley’s 
sion-making model and the aro 
model of bystander intervention 
the Piliavins and their colleagues (1 
al., 1969; Piliavin et al., 1975; 1 
Piliavin, 1972; Piliavin & Piliavin, 
It will be recalled that Latané 
(1970) suggested that the decisio1 
vene is based on a series of deci! 
notice that something is happen 
pret the event as an emergency, 
responsibility for taking action, (d) 
the appropriate form of assistance, í 
decide how to implement the assis 
data suggest that the decisions é 
ceptually distinct. Latané 
(1970) second decision in the seri 
event is an emergency—is affected 
clusions onlookers come to regarding 
and fourth decisions—that is, the 
which onlookers believe they are 
to help and take responsibility for 
help in one form or another. If th 
able to extricate himself without 
bystander, or if no form of assistan 
to help correct the situation, then 
action to take and therefore no 
responsibility for the situation. 
pretation that a problem situation 
gency is not only based on harm | 
harm to the victim increasing wi 
also on the degree to which oi 
both necessary and feasible. Th 
sion that an event is an emergen 
by the judgments onlookers maki 
the importance and feasibility of 
to correct the situation. 

The work of the Piliavins has 
the idea that onlookers confro1 
emergency choose a course of 
their assessment of the rewards 
they believe flow from various Jin 
ior. Their research (Piliavin 
Piliavin et al., 1975) has focuse 
on the effects on bystander response of varia-
tions in the costs to the bystander of direct 
intervention. The data reported herein, when 
combined with the material summarized 
earlier, suggest that another important factor 
influencing the response of onlookers is their 
perception of the degree to which the victim
is in need of help. Indeed, such a perception
lies at the cornerstone of Schwartz’s (1977) 
recent model of the decision processes that 
precede taking action in a distress situation. 
According to Schwartz (1977), the following 
sequential steps are involved in arriving at a 
decision regarding whether or not to take one 
or another form of action: (a) awareness of a 
person in a state of need, (b) perception that 
there are actions that could relieve the need, 
(c) recognition of own ability to provide re- 
lief, (d) apprehension of some responsibility 
to become involved, (e) activation of pre- 
existing or situationally constructed personal 
norms of moral obligation, and (f) assessment 
of costs and evaluation of probable outcomes 
of taking action. Depending on the assessment 
of costs, onlookers are expected either to take 
action or not, or to reassess and redefine the 
situation by doing such things as denying the 
victim’s need or by denying their own re- 
sponsibility to respond. The fourth study in 
our series varied the potential helper’s per- 
ception of the degree to which the victim was 
in need and assessed the degree to which the 
perception of need influenced the frequency
of helping behavior. Models of bystander
intervention have emphasized the hedonistic 
view of onlooker behavior and underplayed 
the altruistic features of the act. Although a 
plausible case can be made for assuming that 
reducing another’s distress is rewarding, a 
strong argument can be set forth that by-
stander behavior can be understood more
readily by focusing on objective costs to the
bystander weighed against the benefits to the
victim. If this is, in fact, what bystanders do,
then it would appear that a model positing
“limited altruism” might be more accurate
than the hedonistic model. 


Reference Note 
1. Piliavin, J. A., & Piliavin, I. M. The good samari-
tan: Why does he help? Unpublished manuscript,
University of Wisconsin, 1976.


Self-Awareness and Transgression in Children:
Two Field Studies 

Two field studies explored the relationship between self-awareness and trans-
gressive behavior. In the first study, 363 Halloween trick-or-treaters were in- 
structed to only take one candy. Self-awareness induced by the presence of a 
mirror placed behind the candy bowl decreased transgression rates for children 
who had been individuated by asking them their name and address, but did not 
affect the behavior of children left anonymous. Self-awareness influenced older 
but not younger children. Naturally occurring standards instituted by the be- 
havior of the first child to approach the candy bowl in each group were shown 
to interact with the experimenter’s verbally stated standard. The behavior of 
349 subjects in the second study replicated the findings in the first study. 
Additionally, when no standard was stated by the experimenter, children took 
more candy when not self-aware than when self-aware.


The theory of objective self-awareness 
(Duval & Wicklund, 1972; Wicklund, 1975) 
proposes that a person’s attention may be 
focused inward toward the self (self-aware) 
or outward toward the environment. Accord-
ing to the theory, when self-aware, a person is
likely to focus on the standards of behavior
appropriate to the setting and upon the con-
sistency of his/her behavior with these stan-
dards. Self-aware individuals are more likely
to self-regulate (Kanfer, 1970) their behav- 
iors to be consistent with normative standards 
when such standards are salient. Several lab- 
oratory studies have demonstrated such ef- 
fects. Diener and Wallbom (1976) found 
college students cheated less on an “intelli- 
gence test” if they were made self-aware by 
observing their reflection in a mirror and 
listening to their own tape-recorded voice. It 
has also been shown that self-awareness may 
either increase or decrease aggression in a 
laboratory situation, depending on the direc-
tion of the experimenter-produced norm (Car- 
ver, 1975; Scheier, Fenigstein, & Buss, 1974). 


Transgressing a Standard 


The present research extended the theory 
to a new population and further examined 
the relationship between self-awareness and 
self-regulation. A standard of behavior was 
made salient to Halloween trick-or-treaters by 
instructing them to take only one candy. They 
were then presented with an opportunity to 
“transgress” by taking extra candy. (It 
should be remembered that self-awareness 
theory itself is not a moralistic theory but one 
of consistency between behaviors and any 
standard of interest.) By varying the presence 
or absence of a mirror, it was possible to 
examine the effect of self-awareness on the 
rate of transgression (taking extra candy). 
The field study would reduce demands that 
might occur in laboratory studies and offer 
generalization to a noncollege population. It 
was hypothesized that the presence of the 
mirror would decrease the incidence of trans- 
gression. 

We noted, however, that the children in
this research would be in costume. If the 
children focused on their costumes, which are 
aspects of the external environment, mirror 
presence might not elicit self-awareness. Thus 
half of the children were asked their names 
and addresses to individuate them and make 
their identities salient just prior to entering 
the room where the candy and mirror were. 
It was expected that the mirror might have its 
predicted effect only in the individuated con- 
dition. 


Age 

By collecting age data on the children, the 
present study also provided some evidence 
concerning the developmental course of self- 
awareness as it affects self-regulation. It is not 
known whether children, either young or old, 
experience self-aware states or whether these 
states influence their behavior. The ability to 
be self-aware and to focus on behavior/stan- 
dards discrepancies in order to regulate be- 
havior probably develops over a period of 
time. Duval and Wicklund (1972) suggested 
that children must learn the objectlike nature 
of the self before they can become self-aware. 
Although not speculating about the age at 
which this occurs, Duval and Wicklund argue 
that self-awareness arises as the child repeat- 
edly encounters “situations that cause him 
to examine dimensions of himself . . . to build 
up a unified conception of himself, a concep- 
tion that will constitute a ‘comprehensive 
causal agent self. . .” (p. 52). It is also pos- 
sible that children have not developed in- 
ternalized standards at young ages. Therefore 
we hypothesized that the self-awareness ef- 
fects would vary as a function of age, with the 
older children being relatively more influenced 
by the manipulation. 


Naturally Occurring Standards 


The present design also allowed an exam- 
ination of the influence of self-awareness on 
naturally occurring modeling. The behavior of 
the first child to approach the candy bowl in 
each group might provide the subsequent 
children with a standard of behavior. This
naturally occurring standard might support or 
might conflict with the admonition of the 
experimenter to take only one candy, depend-
ing on whether the first child did or did not
transgress. The distinction between verbally
stated standards and the standards expressed
implicitly in the behavior of peers could prove
theoretically valuable. It is possible, for ex-
ample, that self-awareness may focus persons’
attention differentially on one or the other 
standard. The present study allowed a post 
hoc examination of the influence of self- 
awareness on unstated norms as well as on 
those that were verbally specified. 


Deindividuation Theory 


A paradigm similar to the present one has
been used previously (Diener, Fraser, Bea-
man, & Kelem, 1976; Diener, Westford,
Diener, & Beaman, 1973) to investigate de-
individuation theory (Zimbardo, 1970). Since
only half of the trick-or-treaters in the cur-
rent study were individuated (asked their
name and address), the behavior of the re-
maining children could offer additional infor-
mation on the effects of anonymity, which is a
deindividuation theory input variable.

A deindividuation theory perspective would
argue that transgressions increase as anonym-
ity increases, although data on this point have
been somewhat inconsistent (see Diener,
1977). This prediction is quite similar in some
respects to the self-awareness theory perspec-
tive. Both imply that people are least likely
to match their behavior to norms when they
are not aware of themselves as individuals.
Indeed, several authors (e.g., Diener, 1977)
have suggested the possibility of merging self-
awareness and deindividuation theories.

Because the present study included two
variables traditionally considered to be ante-
cedents of deindividuation (anonymity and
group presence), predictions of deindividua-
tion theory could also be tested. This might
provide the basis for further integration of
the two theories.


Experiment 1 
Method 


Participants and Setting


The participants were 363 children who arrived to
trick-or-treat between 5:00 and 8:00 p.m. on Hal-
loween, 1976, at any of 18 selected homes. Children
were used only if they were not with a parent. Thus
all were old enough to be able to state their name
and some description of where they lived. Further-
more, only children who came disguised by a costume,
mask, or painted face were used. When too many
children appeared at one time (more than 7), or if a
second group of children arrived before the experi-
menter could leave the room, these children were
not included as subjects.

The 18 homes were arranged in the same basic pat-
tern. Inside a room near the front door was a low
table approximately 5 feet (1.5 m) long. On the
table was a large bowl full of wrapped, bite-sized 
candy bars that was replenished frequently during 
the evening. A large mirror was used in the self- 
awareness condition. The mirror rose above the table 
to a height of at least 5 feet (1.5 m) from the floor. 
The mirror was placed directly behind the table at a 
90° angle such that the children would always see 
themselves while reaching for the candy. Within full 
view of the bowl was a decorative backdrop with a 
peephole that camouflaged an unobtrusive observer. 


Procedure 


A female experimenter greeted all children who 
came trick-or-treating, and an assistant served as the 
unobtrusive observer who recorded the data. When- 
ever a child or children arrived, the experimenter 
would greet the child(ren) amicably and would com- 
ment on their costumes. The experimenter then told 
each child, “You (or each of you) may take one of 
the candies. I have to go back to my work in another 
room.” The purpose of this, as in past studies, was 
to make the standard of taking only one candy 
clearly salient. If any child had any questions about 
what was supposed to be done, the experimenter 
repeated the instructions to take one candy. She 
then exited and made sure she was out of sight of 
the children.

The observer recorded the number of children who 
entered the house, the number of candies taken by 
each child, and the estimated age and sex of each of 
the children. In past research and pilot studies, esti-
mates of number of candies taken have been found 
to be highly reliable. With regard to age estimations, 
any inaccuracies would tend to obscure real differ- 
ences between the age groups. If observers tend con- 
sistently to overestimate or underestimate ages, then 
the age data would just be shifted consistently up or 
down and would be less specific but would not ob-
scure the trends present.

Since the experimenters in each house knew
whether or not a mirror was present and at least the
greeter knew whether or not children were indi-
viduated (see below), experimental bias was possible.
However, a number of controls minimized this con-
cern. Both mirror and no mirror conditions were run
at each home, and a large number of homes and
experimenters were used. Although all experimenters
were carefully trained and rehearsed in the proce-
dures, most were not on the research team involved
in designing the study and hence were blind to the 
predictions made. Also, due to the relatively large 
number of conditions, systematic inaccurate recording 
of the data was less likely. Additionally, since the 
mean number of subjects per house was 26, trends in 
the various conditions should not have been apparent 
to any group of experimenters at a particular home.

Self-awareness manipulation. In this condition a 
large mirror was placed behind the table. Half of the 
homes were randomly assigned to the self-awareness 
condition from 5:00 to 6:30 p.m. At 6:30 p.m. mir- 
rors were removed in these homes, whereas in the 
remaining homes, mirrors were placed behind the 
tables and the self-awareness conditions were run 
from 6:30 to 8:00 p.m.

Individuation manipulation. There was a concern 
prior to the experiment that the mirror manipulation 
of self-awareness might be ineffective for costumed 
(anonymous) children. The mirror might focus their 
attention on their costumes,¹ not upon themselves as
individuals. Thus, it might be more difficult to make
anonymous children self-aware. It was therefore con- 
sidered important to shift the focus of attention from 
examining and appreciating one’s costume to reflec- 
tion upon oneself. To do this, after greeting the chil- 
dren and commenting on their costumes, the experi- 
menter explicitly asked each child in the individuated 
condition what his or her name was and where he or 
she lived. The experimenter carefully repeated each 
child’s name and address to make it salient that she 
knew this information about each of them. She then
continued with the rest of the basic procedure by 
telling each child to take one candy and excusing 
herself to work in another room while the child(ren) 
entered the room where the candy was. These proce- 
dures have been used several times before and require 
relatively little time. Because the children are asked 
their names at many other homes on Halloween, they 
do not seem to find the request strange. No attempt 
was made to identify any of the costumed children 
in the anonymous condition. The children were as- 
signed to the anonymous and individuated conditions 
alternately by groups in each home.


Results 


Of the 363 children, 70 (19.3%) trans- 
gressed by taking more than one candy.² Chil-


1 The importance of these thoughts received anec- 
dotal support by the statement of one parent to the 
children: “Oh, look, they have a mirror so you can 
all see your costumes. How nice.” 

2 It should be noted that there are a number of
possible ways to analyze the data. The overall rate 
of transgression can be computed for each relevant 
condition and contrasted with other conditions. Two 
other possibilities readily come to mind. One might 
just examine the behavior of the first child in each 
group, reasoning that those individuals will be less 


Table 1
Percentage of Children Transgressing

                          Condition
Group                No mirror    Mirror

Individuated
  %                    37.7         8.9
  No. of children      69           90
Anonymous
  %                    19.1         19.6
  No. of children      68           97

dren arriving alone (n = 39) transgressed
somewhat less frequently than groups of chil-
dren did, 10.3% versus 20.4%, z = 1.51, p <
.07. Since the distribution among all the vari-
ous conditions of the few children who arrived
alone resulted in sample sizes much too small
for statistical analysis, and since the children
who arrived alone seemed to transgress less
often, the subsequent analyses of self-aware-
ness effects are based upon the 324 children
who arrived in groups.³
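
The comparison of children arriving alone with children arriving in groups can be approximated with a pooled two-proportion z test. This is a sketch under stated assumptions, not necessarily the test the authors ran. The counts (4 of 39 alone, 66 of 324 in groups) are reconstructed here from the reported percentages and the total of 70 transgressors, and the function names are illustrative.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def upper_tail_p(z):
    """One-tailed (upper) p value for a standard normal deviate."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Assumed counts: 66 of 324 children in groups and 4 of 39 children alone.
z = two_proportion_z(66, 324, 4, 39)
p_one_tailed = upper_tail_p(z)

print(f"z = {z:.2f}, one-tailed p = {p_one_tailed:.3f}")
```

With these reconstructed counts the statistic comes out close to the reported z = 1.51, and the reported p < .07 appears consistent with a one-tailed probability.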


Self-Awareness and Deindividuation


Because of the apparently strong interac- 
tion between self-awareness and anonymity, 
an examination of the interaction by a z test 
on the arc sine transformation of the propor- 
tion differences (Langer & Abelson, 1972) was 


influenced by the group than the following children 
would be, and hence a more conservative analysis 
will emerge. One could instead assign a value to each 
group (such as the proportion of the group trans- 
gressing) and make groups, not individuals, the unit 
of observation. Since the group members certainly 
could influence each other, a group-as-unit analysis 
could be preferred. However, three important con- 
siderations argue for the use of the data from all 
individuals. First, past research employing the Hal- 
loween paradigm has found group and individual 
data to provide identical statistical outcomes. Second, 
since groups are comprised of mixed ages and sexes, 
hypotheses made concerning these variables could 
not be examined using groups as units. Third, the 
potential modeling effects (which work against inde- 
pendence of group members) require investigating 
each member’s behavior, and this was one intent of 
the present design. Nevertheless, the analyses were 
performed when possible using the first child and 
also the proportion-per-group data. In almost every 
case the conclusions remained the same. These re- 
sults will be noted in footnotes. 
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completed. The highly signifi 
(p <.001) revealed that there v 
strong interaction between the 
and so comparisons of individual 
were computed. Children in the 
condition transgressed significantly m 
those in the mirror condition, 28.. 
14.4%, z = 3.10, p < .01.⁴ As
Table 1, the self-awareness effect si 
to occur in the individuation cond 
each child’s identity was made sali 
individuated, no mirror condition 
the children transgressed, compared 
in the individuated mirror condition, : 
p < .0001.° However, in the ano: 
mirror condition, 19.1% transgre 
pared to 19.6% in the anonym 
condition (ns).
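
The interaction between the mirror and individuation manipulations can be illustrated with a z test on arcsine-transformed proportions. The sketch below uses one common form of that test (the transform 2·arcsin√p, whose sampling variance is approximately 1/n) and may differ in detail from the Langer and Abelson (1972) procedure the authors followed. The cell counts are reconstructed from the percentages and cell sizes in Table 1 and are assumptions, as are all names.

```python
import math


def phi(x, n):
    """Arcsine (angular) transform of the proportion x / n."""
    return 2.0 * math.asin(math.sqrt(x / n))


# (transgressors, cell size), reconstructed from Table 1 percentages.
cells = {
    ("individuated", "no_mirror"): (26, 69),
    ("individuated", "mirror"): (8, 90),
    ("anonymous", "no_mirror"): (13, 68),
    ("anonymous", "mirror"): (19, 97),
}


def interaction_z(table):
    """z for the difference between the two mirror effects."""
    def mirror_effect(group):
        return phi(*table[(group, "no_mirror")]) - phi(*table[(group, "mirror")])

    contrast = mirror_effect("individuated") - mirror_effect("anonymous")
    # Each transformed proportion has variance of roughly 1 / n.
    se = math.sqrt(sum(1.0 / n for _, n in table.values()))
    return contrast / se


z_interaction = interaction_z(cells)
p_one_tailed = 0.5 * math.erfc(z_interaction / math.sqrt(2))

print(f"interaction z = {z_interaction:.2f}, one-tailed p = {p_one_tailed:.4f}")
```

Under these assumptions the mirror effect is large for individuated children and essentially absent for anonymous children, which is the pattern the text describes.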

Deindividuation theory would noi 
dict lower stealing rates for ch 
arrive alone (as was found) but 
predict that groups of children who 
ymous would steal more than gro 
ing of individuated children, Ho 
data indicated that overall the 
groups actually produced a tran 
19.4%, about equal to that for 
uated conditions, 21.4%. In th 
condition, the anonymous children 
nificantly less often, 19.1% ver 
z = 2.42, p < .02, than individua! 
did. However, in the mirror co 
individuated groups’ norm violatic 
duced below those of the anonymi 


randomly with respect to sex, age, 
is possible that the children did nol 
random pattern, Thus it would be po 
of these variables to be disproporti 
sented in some cells, Fortunately, such 
case. The correlation of group size by 
096 (ns). The correlation for age by 
.15 (ns). And the chi-square for the 
interaction was .3 (ns).

4 In the analysis using only first chil 
parison reached only marginal si 
versus 12.9%, z = 1.28, p < .10.
data with groups as units for each
significant results, t(103) = 1.69, p <

5 Data from first children yi
35.3% versus 9.7%, z = 2.16, p < .02
occurred with the proportion data
units analysis, t(50) = 2.95, p <


Percentage of Individuated Children Transgressing as a Function of Age and Self-Awareness

                   Overall       No mirror      Mirror
Age (years)        %      n      %      n       %      n      p

1-4                0      11     0      5       0      6      ns
5-8                12.2   49     13.3   15      11.8   34     ns
9-12               27.0   63     46.4   28      11.4   35     .01
13 and above       40.0   20     72.7   11      0      9      .001


8.9% versus 19.6%, z = 2.09, p < .05. As 
will be discussed later, the unexpectedly low 
transgression rates among anonymous sub- 
jects have also occurred in several laboratory 
studies. 

The results for the amount of candy taken 
per child replicated the earlier research of 
Diener, Fraser, Beaman, and Kelem (1976) 
in obtaining a mean of roughly three candies 
(two extra) for each transgressing child. They
tend to take the amount their hands can hold 
easily. However, children in the condition 
with the reduced stealing rate, the individ- 
uated mirror condition, did take fewer candies 
compared to those in the individuated no 
mirror condition, 2.68 versus 2.0, t(32) =
1.96, p < .05, one-tailed. It appears that self- 
awareness reduced both the rates and the 
amount of candy taken when children were 
individuated. 


Age Data 


The median estimated age of the children 
was 8. Based upon the median, the age data 
were divided into four categories, 1-4, 5-8,
9-12, 13 and older, prior to examining the 
transgression data. These age categories were 
examined to investigate developmental trends. 
Analysis of the four categories indicated a 
significant age effect, χ²(3) = 18.9, p < .001,
with the percentage of children stealing increasing
over the four age categories, 6.5%,
9.7%, 23.6%, 41.9%, respectively. The find- 
ing of an increase of transgression with age 
should be interpreted cautiously, because it
may be specific to the type of transgression 
involved, and there may be greater self-selec- 
tion of those children who trick-or-treat in the 
older age groups.


More interesting, however, is the interaction
between self-awareness and age. Self-awareness
produced its largest effect on the
transgressions of older children. Whereas
self-awareness decreased the transgression rate by
55% among teenagers, z = 3.06, p < .001, it
decreased transgression by only 15% for
preadolescents, z = 2.05, p < .02, and not at all
for younger children. Based on the overall
analyses, these effects should be due to the
behaviors of the individuated children, since
apparently only these children were self-aware
in the presence of the mirror. Table 2 presents
the data for the individuated children. Again,
the developmental trends were apparent.
Self-awareness exerted its strongest influence on
children aged 9 and above, and not at all on
the younger children. Similar analyses done
on the anonymous children (who we suspected
were not made self-aware by mirror presence)
yielded no significant difference at any age
grouping.


Naturally Occurring Standards 


The present design allowed an opportunity 
to examine the effect of self-awareness both 
on experimenter-imposed verbal standards 
and behaviorally set standards of peers (i.e., 
the behavior of the first child to approach the 
bowl could serve as a modeled standard for 
the rest of the children). All children received 
the experimenters’ standard instructing them 
to take only one candy, and the results above 
reveal that self-awareness enhanced the im- 
pact of this norm. Peer-induced standards via 
the model effects were also apparent. What- 
ever the first child did in each group (trans- 
gress or not), a large percentage of subse- 
quent children did also. Whereas children ex- 
Table 3
Percentage of Children Not Transgressing as a
Function of Modeled Behavior and Condition


                      Modeled behavior
                  Not transgress     Transgress
Condition         %       n          %       n
Individuated
  No mirror       76.9    26         50.0    12
  Mirror          96.4    55         25.0    4
Anonymous
  No mirror       88.0    33         33.3    6
  Mirror          89.0    54         36.4    11


posed to a first child who took more than one 
candy transgressed frequently (60.6%), those 
exposed to a nontransgressing model rarely 
did so, 10.7%, χ²(1) = 41.6, p < .001. In the
anonymous condition only this modeling effect 
is apparent (see Table 3), once again reveal- 
ing that the self-awareness manipulation had 
no effect for anonymous children. However, 
in the individuated condition where the mirror 
did affect behavior, the interaction with 
modeling is more interesting. The interaction 
was again analyzed by a z test on the differ- 
ences in arc sine transformed proportions, and 
a significant value was found, 1.83, p < .05. 
Because of the use of a one-tailed test and the 
small sample in two cells, the Modeling ×
Self-Awareness interaction should be cau- 
tiously interpreted. It should be noted that 
the data in Table 3 for the mirror condition 
indicate that both self-awareness and model- 
ing exerted effects on the self-aware children. 
That is, independent of the overall reduction 
in transgression caused by mirror presence, 
mirror presence also enhanced the tendency 
of subjects to conform to the behavior ex- 
hibited by the model. 
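
The interaction test described above can be sketched numerically. The following is a minimal illustration, not the authors' own computation: it assumes the cell counts implied by the percentages and n values in Table 3 for the individuated conditions (20 of 26, 6 of 12, 53 of 55, 1 of 4), and it assumes the usual arc sine transform, 2 × arcsin(√p), whose sampling variance is approximately 1/n. All names in the code are hypothetical.

```python
import math


def arcsine(p):
    """Variance-stabilizing transform for a proportion: 2 * asin(sqrt(p))."""
    return 2.0 * math.asin(math.sqrt(p))


# (non-transgressing followers, followers) in the individuated conditions.
# Counts are reconstructed from Table 3 percentages and n's (an assumption).
cells = {
    ("no mirror", "honest model"): (20, 26),
    ("no mirror", "dishonest model"): (6, 12),
    ("mirror", "honest model"): (53, 55),
    ("mirror", "dishonest model"): (1, 4),
}


def interaction_z(cells):
    """z for the Modeling x Mirror interaction on transformed proportions."""
    phi = {key: arcsine(x / n) for key, (x, n) in cells.items()}
    # Modeling effect with the mirror minus modeling effect without it.
    contrast = (
        (phi[("mirror", "honest model")] - phi[("mirror", "dishonest model")])
        - (phi[("no mirror", "honest model")] - phi[("no mirror", "dishonest model")])
    )
    # Each transformed proportion has variance of roughly 1/n.
    standard_error = math.sqrt(sum(1.0 / n for _, n in cells.values()))
    return contrast / standard_error


z = interaction_z(cells)
p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2.0))
print(f"z = {z:.2f}, one-tailed p = {p_one_tailed:.3f}")
```

Under these assumptions the statistic comes out at about 1.83, in line with the value reported in the text. The cell with only four followers contributes most of the standard error, which is consistent with the caution urged about the small samples.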


Sex 


The estimation of each child’s sex was ob- 
tained for 326 of the children (including chil- 
dren who arrived alone, where age data were 
available). Due to their costumes, the sex 
data for the remaining 37 children could not 
be reliably determined. There were 190 boys 
and 136 girls. It was found that boys tran
gressed significantly more than girls, 24.00 
versus 10.3%, z = 3.16, p < .OOL. The effeg 
of the mirror were also examined for each se 
Boys’ stealing rate decreased from 35.8% (r 
mirror) to 15.6% (mirror), z = 3.2, p < 00 
The stealing rate for girls only dropped fro 
13.2% (no mirror) to 8.4% (mirror), a: 
87, p < .20. 

One last consideration pertains to a group
size analysis. There were no differences in the
transgression rates of the groups of various
sizes (two to seven children), except for a
high rate of 44% observed in groups of size
five. Removing these groups from the analysis
did not alter the significance levels of the
analyses. It appeared that the high rate
occurred because these few groups of five
happened to occur only in no mirror conditions;
hence self-awareness was not present to
reduce the transgressions. It should be
emphasized again that since there was no other
relationship between group size and transgression
rates, the analyses are not influenced by the
distribution of groups to various conditions.
In a similar vein, the distribution of children
of various ages to conditions was investigated
and found to have no effect on the data
analyses.


Discussion 


As the theory of objective siran A 
predicts (Duval & Wicklund, 1972; 
lund, 1975), standard-consistent behavior y 
increased in a naturalistic setting by focus 
children’s attention upon themselves. i 
occurred only when children had first ; 
individuated by being asked their name A 
address. This reduction in transgression is 
to reflect an increase in compliance A 
salient behavioral standard. The sat f 
stituted in the present study was the E 
tion given by the experimenter to “ta 4 
candy only.” The children in ae ' 
study altered their behavior to be a a 
sistent with the standard of con nt 8 
they were self-aware. Thus the Pit 
ings add external validity to the “ih 
search showing greater adherence be f 
appropriate behaviors (Carver, fee 
Diener & Wallbom, 1976; Scheier et al.,
1974).


Anonymity


In cases where the children remained anonymous,
the presence of the mirror had no impact
on behavior. We interpret this as an
indication that the child’s focus of attention
was directed to aspects of his/her costume
rather than to the self. Hence, no state of
self-awareness seemed to develop. Although
these findings are consistent with the contention
that anonymity may inhibit self-awareness,
this may occur only in the presence of a
disguise, not as a general function of anonymity.
Future research in which self-awareness
is induced and anonymity is experimentally
manipulated in several different ways should
help clarify the way in which anonymity
affects self-awareness. It may be that when
one is disguised and a group member, anonymity
inhibits feelings of self-awareness. If
one were alone in a costume on a downtown
street at noon, the disguise might enhance
self-awareness. Both the manner in which one
is made anonymous and the situational context
in which the anonymity occurs probably
influence the effect this variable has on
self-awareness.

Data from the anonymous conditions were
not what one would predict from deindividuation
theory (Zimbardo, 1970). It was hoped
that the present study might, in addition to
assessing various self-awareness effects, shed
light on the interaction of the two theories.
Earlier research (e.g., Diener et al., 1976)
has shown that group presence, and also
anonymity, increases transgressions. Although
some weak support for increasing transgressions
among groups compared to children
arriving alone (p < .07) was found in the
present study, no overall effects were found
for anonymity. In fact, anonymous children
actually transgressed significantly less in the
no mirror condition than individuated children
did (a similar result was also reported
by Zimbardo, 1970, p. 277). One possible
explanation for this is that in the present
study the individuated no mirror condition
may have had high rates of transgression due
to some artifact introduced by the procedures
used.⁶

A close inspection of the individuation lit- 
erature reveals that previous results for ano- 
nymity effects have been quite mixed, with 
anonymity manipulations sometimes increas- 
ing and sometimes decreasing antinormative 
behavior (Diener, 1977). Thus our results are 
actually not inconsistent with the previous 
research findings. However, our findings do 
not illuminate why anonymity produced unexpected
effects. The puzzle is heightened by
the fact that our manipulation was identical 
to that used in an earlier study in which 
anonymity produced heightened transgression.
Future research on anonymity should explore 
the interaction of other variables with ano- 
nymity in order to shed light on the past con- 
tradictory findings. One possible explanation 
of the problematical data in the anonymity 
versus individuated no mirror conditions is 
that asking children their name actually made 
salient to them the fact that they could not be 
easily identified by the experimenter, espe- 
cially since she was rarely known by the 
children. Another possibility is that asking 
children their names heightened the children’s 
awareness of the identity of others in their 
group, thereby enhancing cohesiveness, and 
thus possibly strengthened a transgressive 
norm if it was held by members of the group. 


Development of Self-Awareness 


One of the major benefits of this naturalis- 
tic study was the opportunity to gather data 


6 However, there are a number of considerations 
that rule against this possibility. First, the same indi- 
viduation manipulation has been used by some of 
the authors in earlier research (e.g., Diener et al.,
1976) which produced expected results. Thus we 
doubt that in this study the same manipulation 
would produce a radically different psychological 
state. Also, as mentioned earlier, asking children their
names is not an unusual practice on Halloween. One 
might argue that sampling error led to the self- 
awareness effects. In order to provide an even more 
conservative test of the hypothesis, the anonymous 
no mirror transgression rate was used as a baseline 
to assess the decrease noted in the individuated 
mirror condition. Again, the mirror condition was 
shown to be significantly lower, 19.1% versus 8.9%, 
z = 1.88, p < .04. It thus appears unlikely that
self-awareness results occurred due to some unknown
artifact present in one of the individuated groups. 


1842 A. BEAMAN, B. KLENTZ, E. 
from children of various ages, thus allowing 
an examination of self-awareness from a de- 
velopmental viewpoint. As was seen in Table 
2, individuated children 8 years of age or 
younger were apparently unaffected by their 
self-reflections in the mirror. The mirror ef- 
fects appeared to begin in our study some- 
where around the age of 9, and then the effects 
were substantial. However, since stealing rates 
were lower among the younger children, it is 
possible that a floor effect was responsible 
for the absence of larger reductions in the 
lower age groups. Moreover, one must view 
the specific ages in our age data cautiously, 
as they are based on estimates. Further, our 
data do not indicate whether children under 
a given age do not enter the self-aware state 
or whether they have not yet gotten to the 
point where the presence of their reflection in 
a mirror cues other self-relevant thoughts. It 
seems relatively certain, though, that by ages 
9-12, self-awareness has an effect in reducing 
transgressions, and by 13 and above self- 
awareness substantially reduces antinormative 
behavior. 


Behavioral Standards 


In the present study self-awareness could 
focus one’s attention on a standard set by the 
experimenter and/or one set by the behavior 
of other children. The modeling effects were 
examined by analyzing the transgression rates 
for the children who approached the candy 
bowl after the first child (model) did in each 
group. If the first child did not transgress 
(honest model), 89.3% of the subsequent chil- 
dren were also honest. If the first child in a 
given group did transgress (dishonest model), 
the honesty rate was reduced to 39.4% for 
the followers. These effects replicate an earlier 
field study (Diener et al., 1976). However, 
these data are potentially confounded by the 
possibility that “honest” and “dishonest” 
children may tend to trick-or-treat together. 
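
The pooled follower rates quoted in this paragraph can be recovered from the cells of Table 3. The sketch below is illustrative only: the counts are an assumption, obtained by rounding each Table 3 percentage times its n to the nearest whole child (the two anonymous honest-model cells agree with the printed percentages only to the nearest whole percent), and the names are hypothetical.

```python
# (non-transgressing followers, followers) per cell.
# Counts are reconstructed from Table 3 percentages and n's (an assumption).
TABLE3 = {
    ("individuated", "no mirror"): {"honest": (20, 26), "dishonest": (6, 12)},
    ("individuated", "mirror"): {"honest": (53, 55), "dishonest": (1, 4)},
    ("anonymous", "no mirror"): {"honest": (29, 33), "dishonest": (2, 6)},
    ("anonymous", "mirror"): {"honest": (48, 54), "dishonest": (4, 11)},
}


def pooled_honesty_rate(model):
    """Percentage of followers not transgressing, pooled over all conditions."""
    honest = sum(cell[model][0] for cell in TABLE3.values())
    total = sum(cell[model][1] for cell in TABLE3.values())
    return 100.0 * honest / total


honest_model_rate = pooled_honesty_rate("honest")
dishonest_model_rate = pooled_honesty_rate("dishonest")
print(f"{honest_model_rate:.1f} {dishonest_model_rate:.1f}")  # prints "89.3 39.4"
```

Pooling weights each cell by its number of followers, so the large mirror cells dominate the honest-model rate while the dishonest-model rate rests on only 33 children in total.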

Self-awareness in the present study magnified
the modeling effects among individuated
subjects. This Modeling × Mirror interaction
should not be due to cohort effects, since the 
mirror was randomly assigned and cohorts 
should not have been any more similar in the 
mirror than the no mirror conditic
over, this finding suggests that 
behavior provided an additional 
time the followers approached the 
Carver (1974) has shown that ; 
which might normally be cons 
normative, can be facilitated by sı 
in the laboratory if the salient sta 
havior favors aggressing. Our 
field support Carver’s in sugges’ 
gressing can be enhanced by self- 
the model makes salient a transg 
dard). The present results sugg 
awareness can heighten the salienci 
implicit in the behavior of others, 
these norms are not verbally stai 
because of the small sample size 
gressive model groups, this i 
should be accepted cautiously until 
Further research might vary or 
types of standards in order to 
ing further. 


Summary 


Overall, the findings form a 
tern of support relating antinoi 
havior to the theory of objective 
ness, An alternative explanation fo 
tion of stealing in the mirror cor 
be that the children were more WO 
getting in trouble for taking 
when the mirror was present than 
not. If this were the case, howev! 
expect the fear of getting 
more salient in the individ 
than in the anonymous conditi 
was clear to the children that | 
addresses were known by the 
the individuated condition. 1 
would predict a lower transgr 
individuated no mirror conditio 
to the anonymous conditions. 
a result was not found, Ol 
analysis offers a more comp! 
for the findings than does an 
on fear of reprisal. 


Experiment 2


To add confidence to our 
we designed a second study | 
self-awareness findings. Thus, conditions were
included where the standard of taking only
one candy was made salient and self-focus
was varied. In this second study, children 
were sent into the candy room one at a time, 
and all children were individuated by asking 
them their names and ages. The request-for- 
age information was used instead of asking 
their addresses in order to provide actual age 
data instead of estimates (as in Experiment 
1). Finally, conditions were included in which 
no standard was made salient, to explore what 
standards, if any, may be present to guide 
children’s behavior when trick-or-treating,
without the experimenter’s suggesting a spe- 
cific behavior as appropriate. 

It was hypothesized, as before, that
mirror-induced self-awareness would decrease the
percentage of children transgressing. The age
trends revealed in Experiment 1 were expected
to occur again with the actual ages
given by the children. In the no standard
conditions, the experimenter did not make any
statement about the appropriate number of
candies to take. We expected that children
would be inclined to think that more than one
candy was appropriate (since the candies
were smaller than in Experiment 1, see below).
Therefore, overall a larger percentage
of children should take more than one candy
in these conditions than in the standard
conditions.

Whatever standard (if any) is common 
among trick-or-treaters should be most appar- 
ent in the no standard—mirror condition. 
These children should be self-aware and there- 
fore more likely to behave consistently with 
the standards that they brought with them 
into the situation. One possibility is that chil- 
dren trick-or-treating on Halloween might 
think it is a night when they can “get all the 
free candy they want” or that “it is okay to 
steal.” If such reasoning is correct and this
sort of standard is salient, then more children
in the no standard condition should take more
than one candy in the mirror condition than
would do so in the no mirror condition, since
such behavior would be consistent with their
standards. However, we believed it reasonable
to suggest that children on Halloween have
the cognitive set that they are to be “given
free candy.” Further, children realize that
what they will receive at each house is governed
by the people there. In addition, most
children have in general an internal standard
that it is inappropriate to take more than
their own share or to steal. Thus we expected
a decrease for the mirror condition in the 
number of candies taken. Which of these two 
possibilities is correct should be revealed by 
the directional difference between the no 
mirror and mirror conditions when no ex- 
perimental standard was made salient. 


Method 
Participants and Setting 


The participants were 349 children who arrived 
to trick-or-treat between 5:00 and 9:00 p.m. on
Halloween, 1978, at any of 13 selected homes. 
Again, children were not used as subjects if they 
arrived with a parent or in groups too large to 
process or if a second group arrived before the 
manipulations could be completed on the group 
already present. The candy rooms were arranged as 
in Experiment 1. 


Procedure 


The greeting of children was performed as in 
Experiment 1. The children were sent into the
candy room one at a time. The candy room was
arranged such that the bowl of candy and child
approaching it would not be visible to either the
greeter or the other children waiting to enter the
candy room. The hidden rater again recorded the
number of candies taken. Instead of using age esti- 
mates, each child was individually asked his or her 
name and age and the greeter repeated this to the 
child prior to sending the child into the candy 
room. Thus all children were individuated. Based 
upon the results of Experiment 1, this appeared to be 
necessary as a precondition for self-awareness when 
the child was later in front of the mirror. Each 
child was alone at the candy bowl (thus each child's 
behavior was statistically independent of other chil- 
dren’s). 

Independent variables. Self-awareness was again
manipulated by the presence or absence of a mirror.
Mirrors were physically placed as in Experiment 1,
and the same randomization procedures
were used. To make a standard salient, children
were instructed to take one candy only, as before. 
However, additional no standard conditions (for 
mirror and no mirror) were added. In these cases 
each child was told “the candies are in the living 
room, go help yourself.” For these children the 
standard for behavior would be whatever norm(s) 
children generally have when trick-or-treating. The 
Table 4
Percentage of Children Taking More Than
One Candy and Mean Candies Taken


                          Self-awareness
Verbal manipulation       No mirror    Mirror    p
Standard
  %                       34.2         11.7      .001
  M                       2.41         1.18      .001
  No. of children         76           94
No standard
  %                       75.0         58.9      .03
  M                       4.56         2.17      .001
  No. of children         84           95


standard and no standard conditions were alternated 
at each house, so all four conditions were in opera- 
tion at each house at various times throughout the 
evening. 

Dependent variables. Again the percentage of 
children taking more than one candy was computed. 
Also the exact number of candies taken by each 
child was recorded. In Experiment 1, small bite- 
sized candy bars were used. However, the trans- 
gression rate (more than one candy) was relatively 
low (about 10% for individuated children who 
arrived alone). Thus it was decided to use smaller 
candies for this study, to produce a higher trans- 
gression base rate, so that decreases caused by self- 
awareness could more easily be discerned. Several 
types of candies were chosen that varied in size 
between that of an individually-wrapped caramel 
and the candy size used before. This choice should 
also allow for a more sensitive measure, since the 
size of a child's hand would not limit as much the 
number of candies taken. 


Results 
Mirror Effects 


Of the 349 children, 156 (44.7%) took
more than one candy. The effectiveness of the
verbally stated standard to take only one
candy is shown by the highly significant
reduction in the standard conditions when
compared to the no standard conditions, 21.8%
versus 66.5%, χ²(1) = 68.7, p < .001.
Similarly, over all conditions the mirror produced
a strong reduction, inhibiting children from
taking more than one candy, 55.6% versus
35.4%, χ²(1) = 37.5, p < .001. However, the
comparisons of most interest were those
within each of the types of standards.
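
The standard versus no standard comparison can be sketched from the cells of Table 4. This is a hedged illustration rather than the authors' procedure: the counts of children taking more than one candy (26 of 76, 11 of 94, 63 of 84, 56 of 95) are an assumption reconstructed from the table's percentages and n values, and the text does not say whether a continuity correction was applied. The Yates-corrected statistic is used here because, under these assumed counts, it agrees with the reported value for this comparison. Function and variable names are hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Continuity-corrected chi-square (1 df) for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2.0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# (children taking more than one candy, children) per cell of Table 4.
# Counts are reconstructed from the table's percentages (an assumption).
standard = {"no mirror": (26, 76), "mirror": (11, 94)}
no_standard = {"no mirror": (63, 84), "mirror": (56, 95)}


def pooled(cells):
    """Sum transgressors and children over the mirror and no mirror cells."""
    took_more = sum(x for x, _ in cells.values())
    children = sum(n for _, n in cells.values())
    return took_more, children


std_yes, std_n = pooled(standard)
nostd_yes, nostd_n = pooled(no_standard)

std_rate = 100.0 * std_yes / std_n
nostd_rate = 100.0 * nostd_yes / nostd_n
chi2 = yates_chi_square(std_yes, std_n - std_yes, nostd_yes, nostd_n - nostd_yes)

print(f"standard {std_rate:.1f}%, no standard {nostd_rate:.1f}%, chi-square {chi2:.1f}")
```

With these counts the pooled rates round to 21.8% and 66.5%, and the corrected statistic is approximately 68.7, the figures given in the text.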
The percentages of children taking more
than one candy and the mean number taken 
are shown in Table 4. Transgressions wee 
greatly reduced by the mirror, y2(1) =I) 
p < .001, when the standard was made salien 
by the experimenter. The reduction in num 
bers of candies taken was also significant 
t(168) = 3.29, p < .001. Even though chil 
dren were not given specific instructions in th 
no standard condition, fewer children took 
more than one candy when a mirror 
present than when it was not, x7(1) = 4: 
p < .03, and children took fewer candies 
the average, t(177) = 3.89, p < .001.


Age Data 


As is shown in Table 5, self-awareness al 
fected the older children when a standard 
been made salient, but not the younger ones 
These results compare favorably with Exper: 
ment 1, where age estimates were used insteal 
of actual age, as in the present study.⁷ Sim
ilarly, the presence of the mirror reduce 
transgressions for the two oldest age oy 
egories when no standard was made salient 
These decreases did not reach statistical sigi 
nificance, 


Sex Data 


Overall, there was only a marginal ten enc 
for males to take two or more candies ay 
frequently than females, 49.0% versus 39.1 
χ²(1) = 2.59, p < .11. However, if be
amine only the standard conditions, W! a 
such behavior may more clearly be lab ie 
transgression, we find that males did oa 
gress more, 28% versus hie on 

X] 
p < .03, as was the case in r “ond 
reduced the proportion of male trane ; 
from 46.8% to 11.3%, x (1) = 13 
.001, whereas the reduction was 1 
among females (14.3% to 12.2%). 
7 The accuracy of these estimates is not of ma 


concern with respect to the pres 
less, an estimate of their reliability ee E; 
for future research. A correlation coe 


Table 5
Percentage of Children in the Standard
Conditions Taking More Than One Candy as a
Function of Age and Self-Awareness


                      Condition
                  No mirror      Mirror
Age (years)       %       n      %       n      p
1-4               [illegible]                   ns
5-8               40      20     25      32     ns
9-12              24      25     6.1     33     .04
13 and above      50      24     0       16     .001

Discussion 


Since all findings in the second experiment 
replicated the earlier findings, only the extensions
and additions will be discussed here, in
order to avoid redundancies. Overall, self- 
awareness again exerted a strong influence on 
children aged 9 and older. The self-awareness 
results in Experiment 2 are especially reassur- 
ing in that they were clearly unconfounded by 
modeling effects, since children entered the 
room individually. The use of actual age data
supported the former estimated age findings. 
The mirror-induced self-awareness effect sim- 
ply was not present among young children. 
Why this was so should be an important
subject for future research.

The addition of no standard conditions,
where children were not given an instruction
as to how many candies they might take,
yielded two important conclusions. First, the
saliency of the simple verbal statement to
take just one candy was demonstrated by the
large reduction in the standard condition
compared to the no standard condition. Second,
the first evidence was provided concerning the
cognitive standards by which children may
generally guide their behavior in this particular
setting. Apparently Halloween is a night
when a child receives free candy, but that fact
does not remove his/her knowledge that it is
inappropriate to overindulge or to steal.

It should be reiterated at this point that 
although the term “transgression” has a 
moralistic connotation, the theory of self- 
awareness is a neutral one. It may be in our
study that self-awareness inhibits transgressions
or increases conformity to standards.
Children who transgress may not be so much 
actively breaking rules as just failing to self- 
regulate. Self-awareness may be one of many 
variables that aid in producing self-monitor- 
ing of one’s behavior. The theory has been 
primarily developed in terms of producing a 
state where one evaluates his or her behavior 
in terms of how consistent it is with salient 
social or personal standards. In such states 
one should act in ways that are more con- 
sistent with whatever standards are salient 
and are seen as applying in that situation. 

In sum, the present two large-scale field 
studies offer strong support for the theory of 
self-awareness as applied to transgressive be- 
havior among children. The first study also 
provided some leads concerning modeling that 
should be followed up in future laboratory 
studies. Finally, the age effects should also 
encourage researchers to study the develop- 
mental course of self-awareness. 
Variation in Behavior Perception and Ability Attribution


Darren Newtson and Richard J. Rindner 
University of Virginia 


Three studies were conducted. In the first, 16 subjects segmented a videotape 
of an actor solving 30 problems in an ascending success pattern under instruc- 
tions to analyze the first half of the sequence into fine or large units. Estimates 
of future performance were lower when the initial part of the series was ana- 
lyzed more finely. In a second study, 72 subjects viewed either an ascending 
or a descending success pattern; the first half of the series was presented at 
fast motion, normal speed, or slow motion. Slow-motion presentation induced 
finer units of analysis than normal speed and enhanced primacy effects; fast 
motion induced larger units of analysis and attenuated primacy effects. A third 
study compared fine-unit and large-unit analysis induced by instruction and 
large-unit analysis induced by fast-motion presentation of the entire ascending 
and descending series. Results were comparable to those of the first two studies. 
It was concluded that primacy effects result from a cessation of processing 
when a point of subjectively sufficient information is reached and that varia- 


tions in level of analysis set limits on information gain in observation.


Results of several recent studies of behavior
perception have supported the view that actions
are discriminated by the selection of
successive points of definition in the behavior
stream. These points of definition, termed
breakpoints, are points judged by perceivers
as the points at which actions have occurred
(Newtson, Engquist, & Bois, 1977; Newtson,
Rindner, Miller, & LaCross, 1978). Breakpoints
appear to be selected in such a way
that they provide a summary of the information
gained from observation of the sequence
of events (Newtson & Engquist, 1976).

Newtson (1973) hypothesized that perceivers
actively control information gain from
behavior by selecting a greater or lesser number
of breakpoints from the behavior stream.
That is, observers selecting more breakpoints
from a given sequence (i.e., analyzing the
behavior into finer units of action) should gain
more information than observers analyzing
the behavior into larger units. Newtson
(1973) in fact did find that subjects in- 
structed to analyze a 5-minute action se- 
quence into fine units had more confident and 
differentiated impressions of the stimulus per- 
son than those subjects instructed to analyze 
that behavior into large units. In a second 
study, it was found that the occurrence of an 
unexpected event prompted observers to an- 
alyze subsequent behavior more finely than 
observers who had not viewed the unexpected 
action. This result is also consistent with the 
view that observers actively control informa- 
tion gain from ongoing behavior. In the con- 
trol condition of that study, where the be- 
havior remained predictable throughout, a 
marked decline in number of breakpoints was 
observed from the beginning to the end of the 
sequence. Wilder (1978a) reported a similar 
decline in rate of segmentation over time. 
Newtson (1973) noted that one conse- 
quence of this decline over time could be that 
the perceiver’s sample of information is 
biased, in that it would be largely drawn from 
the first part of a sequence. Thus one might
predict primacy effects in attributions based 
on the observation of behavior. Newtson 
(1973) raised this possibility as one alterna- 
tive explanation for the primacy effects in 
ability attribution reported by Jones, Rock, 
Shaver, Goethals, and Ward (1968). 

The present article reports a series of 
studies exploring such biased sampling phe- 
nomena. Biased sampling was induced in two 
different ways: by instruction as to level of 
perceptual analysis, as in Newtson (1973), 
and by variations in film speed. Filmmakers 
have long defined a quantity termed event 
density in terms of the number of events hap- 
pening within a certain time unit. Slow- and 
accelerated-motion techniques are explicitly 
defined as manipulations more of event den- 
sity than of motion and may be realized equi- 
valently with both cutting techniques and 
actual manipulation of film speed (cf. Zettl, 
1973, pp. 280-283). Variations in level of 
analysis may be thought of as subjective 
variations in event density that are under the 
perceiver’s control. Thus film speed manipula- 
tions may provide a useful converging opera- 
tion for the manipulation of level of per- 
ceptual analysis, although their precise effects 
on interpretation of events remain to be 
evaluated. 


Experiment 1 


In the first experiment, subjects viewed an 
actor attempting to solve 30 problems, suc- 
ceeding on 5 of the first 15 and on 10 of the 
last 15. In one condition, subjects were in- 
structed to analyze the series into fine units 
for the first half of the series and then into 
whatever units seemed natural and meaning- 
ful to them for the second half of the series. 
In a second condition, subjects analyzed the 
series into large units for the first half and 
then into natural units for the second half of 
the series. Subjects were next asked to make 
estimates of the actor’s performance on a 
subsequent series of problems. Thus, it was 
reasoned, the subjects’ information sample 
from the initial part of the series should be 
increased or decreased. The result should be 
that the impact of the initial part of the 
series on estimates of the actor’s future per- 
formance should be greater when it is analyzed 
finely and less when it is analyzed into large 
units. 


Method 


Subjects. Subjects were 16 undergraduates (12 
males, 4 females) recruited from an introductory 
psychology course at the University of Virginia. 

Stimulus. A videotape was prepared showing a 
male actor ostensibly taking a computer-scored in- 
telligence test. The actor read each question from a 
3 × 5 inch (7.62 × 12.70 cm) card and typed a re- 
sponse into a teletype. Feedback was presented by 
means of two lights above prominent signs labeled 
[...] items. The specific feedback pattern was as follows: 
1 “right,” 4 “wrong,” 1 “right,” 3 “wrong,” 2 “right,” 
2 “wrong,” 1 “right,” 2 “wrong,” 3 “right,” 5 
“wrong,” 3 “right,” 1 “wrong,” and 4 “right.” 
Procedure. Subjects were assembled in groups of 
two or three persons facing a 23-inch (58 cm) 
monitor. Each subject was given a hand-held button 
connected to an event recorder in another room. Sub- 
jects were informed that they would view a video- 
tape of another undergraduate taking a computer- 
scored intelligence test. All subjects were then in- 
structed as follows: “What I am interested in [...] 
are the ways in which people organize or break 
another person’s behavior into its component [...]. 
By that I mean that people may break up another 
person’s behavior in different ways. For example, 
[...] might turn, walk over, push a [...] 
walk back, and you might see [...] 
as a separate, meaningful action. [...] 
them as just one action, such as closing [...].” 

In one condition, subjects were [...] 
“What I would like you to do is to mark [...] 
the largest actions in the sequence [...] 
see, as you see them. That is, [...] 
this button firmly whenever, in your judgment, [...] 
a meaningful action occurs. [...] 
wrong ways to do this; I simply want to know how 
you do it.” 

In the second condition, subjects were given the 
same instructions, except that the word smallest was 
substituted for largest. 

At the midpoint of the [...] 
halted, and all [...] 
you.” 


[...] 

Dependent measures. [...] number of breakpoints marked [...] 
tape and responses to the [...] question- 
naire items. The first item asked [...] 
the percent correct [...] 
obtain on a second series [...]. 



A second question asked subjects to report the number of items 
the stimulus person had answered in the videotape. 
A third item asked subjects to recall the percentage 
of those items the stimulus person got right. 
An additional comparison of the unit data was 
conducted to see if the predicted differences in level 
of analysis varied over a hierarchical structure. Fre- 
quencies of marking were tabulated for each 1-sec 
interval of the sequence. If fine-unit subjects are 
simply breaking down into smaller components the 
same actions that subjects using larger units are dis- 
criminating, then it follows that where the large-unit 
condition subjects indicate a breakpoint in the se- 
quence, fine-unit subjects should agree, also indicating 
a breakpoint (note that the converse need not neces- 
sarily be true). Accordingly, consensus breakpoints 
in the large-unit condition were identified, and the 
frequency of identification of these same points as 
breakpoints in the fine-unit condition was compared 
to chance expectation by chi-squares (see Newtson, 
1973). Statistical confirmation of the hypothesis 
would indicate a significant tendency toward a hier- 
archical relationship between the two segmentations. 


Results 


Subjects viewing the first part of the se- 
quence under fine-unit instructions marked an 
average of 50.13 units, which was significantly 
more than the quantity subjects marked under 
large-unit instructions, M = 18.88, t(14) = 
5.83, p < .001. Number of units did not differ 
significantly for the second half of the tape; 
means were 43.38 and 43.50 for the fine-unit/ 
natural-unit and large-unit/natural-unit con- 
ditions, respectively, t(14) = .023. 

Subjects in the fine-unit/natural-unit condi- 
tion estimated future performance at 35.75%, 
significantly worse than subjects in the large- 
unit/natural-unit condition, M = 61.88, t(14) 
= 3.34, p < .01. Results on recall of past per- 
formance were in the same direction, although 
this difference did not attain significance; 
means were 34.50 and 43.75, t(14) = 1.69, 
p < .20, two-tailed, for the fine-unit/natural- 
unit and large-unit/natural-unit conditions, 
respectively. 

Comparison of segmentation patterns con- 
firmed the hypothesis that fine-unit and large- 
unit segmentations would be more hierarchi- 
cally related than chance, χ²(1) = 9.46, p < 
.01. That is, fine-unit subjects tended to break 
up the actions discriminated in the large- 
unit conditions into their component parts. 

Discussion 


These results confirmed that variation in 
level of analysis systematically altered the 
impact of the behavior on observer judgments. 
When the initial part of the series was an- 
alyzed finely, subjects predicted significantly 
poorer subsequent performance than when the 
first part of the series was analyzed in larger 
units. These results raise the possibility that 
biased information sampling could underlie 
primacy effects in ability attribution in the 
Jones et al. (1968) paradigm. 

Evaluation of this hypothesis requires a 
comparison of the segmentations between the 
first and second halves of both an ascending 
and a descending success series. If the biased 
sampling interpretation is correct, then more 
breakpoints should be discriminated in the 
first halves of the sequences than in the sec- 
ond halves, and a typical primacy effect 
should result. That is, subjects viewing a 
series in which rate of success is low at the 
outset and then improves should estimate fu- 
ture performance as poorer than subjects 
viewing a series in which rate of success be- 
gins at a high level and then declines. 

In addition, film speed was varied for the 
first halves of the sequences. The first half of 
each sequence was presented in slow motion 
or in fast motion. Our ability to comprehend 
variable speed films is an interesting phenom- 
enon. Rate of movement in fast-motion and 
slow-motion films is typically altered by a 
factor of 7 to 10 times (cf. Miller & Strenge, 
1969). This ability has simply not been ex- 
plained. A careful search of the literature has 
failed to yield a single study systematically 
investigating the effects of film speed on the 
perception or interpretation of action. If, as 
suggested earlier, manipulations of film speed 
are similar in important ways to normal varia- 
tions in level of perceptual analysis, variable 
speed film techniques may provide a useful 
convergent operation for the manipulation of 
level of perceptual analysis that is free of at 
least some of the possible demand character- 
istics associated with direct instructions. It 
would seem reasonable to expect that slow- 
motion film presentation would induce fine- 
unit analysis of a behavior sequence, whereas 
fast-motion presentation would induce large- 
unit analysis. 


Experiment 2 


In this experiment subjects viewed one of 
two sequences showing an actor attempting to 
solve 30 problems. One sequence was the 
ascending success series employed in Experi- 
ment 1. A second sequence showing a descend- 
ing success series was included in this experi- 
ment. The first half of each sequence was 
shown to subjects at slow motion, normal 
speed, or fast motion. In all conditions, the 
second halves of the sequences were shown at 
normal speed. It was anticipated that slow- 
motion presentation should induce finer unit 
analysis than normal speed presentation and 
that fast-motion presentation should induce 
larger unit analysis. 

Results from the previous study suggest 
that the number of units marked for the first 
half of the series determines the impact of 
that part of the series in subsequent judg- 
ments. It would follow, therefore, that slow- 
motion presentation should enhance primacy 
effects as compared to normal-speed presenta- 
tion, whereas fast-motion presentation should 
attenuate primacy effects. Alternatively, the 
differences observed in Experiment 1 could 
have been unidirectional, such that only one 
of the instructional conditions affected judg- 
ments as compared to normal observation. 


Method 


Subjects. Seventy-two male undergraduates served 
as subjects in the experiment. 

Stimuli. The study employed the ascending success 
sequence from the previous experiment and a second, 
descending success sequence. In this sequence, the 
actor received “right” feedback on 10 of the first 15 
problems, and on 5 of the last 15. The precise pattern 
of feedback for this sequence was simply the reverse 
of the pattern for the ascending success series. 

Three versions of each series comprised the stimuli 
for the experiment. One version simply consisted of 
the entire sequence of 30 problems at normal speed 
(normal-speed conditions). In a second version, the 
actor's performance on the first 15 items was copied 
at seven times normal speed, and the last 15 responses 
were recorded at normal speed (fast-motion condi- 
tions). In a third version, first-half performance was 
copied at one-seventh speed, whereas the second half 
was recorded at normal speed (slow-motion condi- 
tions). 

This resulted in sequences of the following lengths: 
normal speed, 10.1 min.; slow motion, [...]; 
fast motion, 5.9 min. 

Selection of these film speeds was [...] 
available videotape machinery. Video[...] 
continuously variable speed capacity [...] 
roll”: each video “frame” is presented [...] 
with a black bar separating them. [...] 
for this experiment (a Javelin) is capable of pre- 
senting videotapes at 7:1 and [...] 
frame roll but is limited to these 
speeds. 

Procedure. Subjects were [...] 
randomly varying size (1 to 15 [...]) [...] 23- 
inch (58 cm) video monitor. Each [...] 
in another room. Subjects were informed that they 
would view a film of another undergraduate taking 
a computer-scored intelligence test. All subjects were 
then given the same segmentation instructions as in 
Experiment 1, except that subjects were [...] 
to discriminate the “naturally [...] 
the sequence. 

Finally subjects were informed [...] 
would be presented at varying speeds [...] 
the speed of the tape that they [...] 
This was done in order to prevent [...] 
on marking and to prepare subjects ([...] 
fast-motion condition) for [...] 

Following the viewing of the [...] subjects com- 
pleted a questionnaire and were [...] 

Dependent measures. The same [...] 
questionnaire was used as in the previous [...] 
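Design and analysis. Number of breakpoints 
marked for the sequences were analyzed in a 2 × 3 
× 2 mixed-effects analysis of variance. There 
were two between-subjects factors, success pattern 
(ascending vs. descending) and speed of presentation 
(slow, normal, and fast motion) [...] 
the sequences, and a repeated measures factor, film 
half (first vs. second). 

Responses to postexperimental questionnaire items 
were analyzed in a 2 × 3 analysis of variance 
in which factors were success pattern (ascending vs. 
descending) and speed of presentation (slow, normal, 
and fast motion). 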
Segmentation patterns for [...] 
films were analyzed as in Experiment 1. [...] 
is rather more important in [...] 
bears on the question of whether [...] 
speed manipulations simply [...] 
the sequences. Frequencies of marking were tabulated 
for each 1-sec interval, as [...] 
breakpoints were identified [...] 
condition. Since presentation speed in the fast-motion 
condition was seven times that of normal-speed pre- 
sentation, each 1-sec interval of the fast-motion 
record corresponded to seven 1-sec intervals in the 
normal-speed condition. [...] 
of breakpoints were identified [...] fol- 
lowing criterion: if an interval [...] 


Table 1 
Mean Number of Breakpoints by Success Pattern, 
Speed of Presentation, and Film Half 

                              Success pattern 

              Ascending              Descending           Combined average 
           First     Second       First     Second       First     Second 
Speed      half      half         half      half         half      half 

Slow       73.82     65.36        79.42     61.00        76.74a    63.09 
Normal     38.46     36.00        55.08     48.42        47.13     42.48 
Fast       21.17     42.42        26.83     43.58        24.00     43.00 

Note. Speed of presentation varied for the first film half only. 
Means with different subscripts differ by t test, p < .001. 


breakpoint in the fast-motion condition, then the 
occurrence of a breakpoint in the normal-speed con- 
dition in any of the corresponding seven 1-sec inter- 
vals was counted as one, and only one, co-occurrence. 
Comparison of segmentations between the normal- 
speed and slow-motion conditions was performed in a 
completely analogous manner. Statistical confirma- 
tion of these hypotheses would indicate a significant 
tendency toward a hierarchical structure in the seg- 
mentation of the behavior. 
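
The co-occurrence criterion just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used in the study: the function name `count_cooccurrences`, the 0-based interval indexing, and the example breakpoint lists are all hypothetical.

```python
def count_cooccurrences(fast_breakpoints, normal_breakpoints, ratio=7):
    """Count fast-motion breakpoints that have a normal-speed counterpart.

    fast_breakpoints: 0-based indices of 1-sec intervals marked as
        breakpoints in the fast-motion record (hypothetical input format).
    normal_breakpoints: 0-based indices of 1-sec intervals marked as
        breakpoints in the normal-speed record.
    ratio: number of normal-speed intervals covered by one fast-motion
        interval (seven, given the 7:1 speed ratio).
    """
    normal = set(normal_breakpoints)
    matches = 0
    for i in sorted(set(fast_breakpoints)):
        # Fast-motion interval i spans normal-speed intervals
        # i*ratio .. i*ratio + ratio - 1.
        window = range(i * ratio, (i + 1) * ratio)
        # Any number of normal-speed breakpoints inside the window
        # counts as one, and only one, co-occurrence.
        if any(j in normal for j in window):
            matches += 1
    return matches


# Made-up example: three fast-motion breakpoints, four normal-speed ones.
fast = [0, 2, 5]
normal = [3, 5, 20, 50]
print(count_cooccurrences(fast, normal))  # prints 2
```

In the made-up example, the first fast-motion interval contains two normal-speed breakpoints but contributes a single co-occurrence, which is the point of the "one, and only one" rule.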


Results 


Unit data. Analysis of variance of unit 
data indicated a significant main effect for 
speed of presentation, F(2, 64) = 23.46, p < 
.001, and a significant Speed of Presentation 
× Film Half interaction, F(2, 64) = 27.15, 
p < .001. The main effect for success pattern, 
F(1, 64) = 1.93, p < .20, and the interaction 
between this factor and film half, F(1, 64) = 
[...]81, p < .10, approached significance. No 
other effects or interactions approached signif- 
icance. Means are reported in Table 1. Inter- 
action means are reported so as to facilitate 
comparison with Experiment 1, which em- 
ployed the ascending success series only. The 
biased sampling hypothesis predicts that at 
normal speed more units should be marked for 
the first halves than the second halves. This 
hypothesis was not supported. When the se- 
quence was presented at normal speed, unit- 
ization of the two film halves did not differ 
significantly, t(64) = 1.45, p < .20. 
Under slow motion, more units were marked 
for the first halves of the sequences than for 
the second halves, which were at normal 
speed, t(64) = 4.24, p < .001. When the first 
halves of the sequences were presented at fast 
motion, significantly fewer units were marked 
for the first halves of the sequences, t(64) = 
5.91, p < .001. It was predicted that speed of 
presentation would alter level of unitization 
for the first halves of the sequences. Comparison 
of these means using t tests confirmed that 
more units were marked under slow motion 
than normal speed, t(64) = 9.20, p < .001, 
and more units were marked at normal speed 
than at fast motion, t(64) = 7.91, p < .001. 

An unexpected result was obtained for the 
second half of the films in the slow-motion 
condition. Apparently because they had been 
induced to analyze the behavior finely in the 
first half by the slow-motion presentation, 
subjects persisted in finer unit analysis than 
that performed in the other condition (see 
Table 1). 

Comparisons of segmentation patterns for 
the first halves of each sequence across film 
speeds indicated that the film speeds simply 
altered segmentations over a hierarchical 
structure. For the ascending series, 90% of 
the breakpoints marked under fast motion 
were also marked at normal speed, χ²(1) = 
153.31, p < .001, whereas 45% of the break- 
points marked at normal speed were also 
marked at slow motion, χ²(1) = 35.78, p < 
.001. In the descending series, 60% of the 
breakpoints identified at fast motion were also 
breakpoints at normal speed, χ²(1) = 63.68, 
p < .001, whereas 64% of the breakpoints 
marked at normal speed were also break- 
points at slow motion, χ²(1) = 56.43, p < 
.001. This level of consensus is comparable to 
that observed for different levels of segmenta- 
tion in previous studies (cf. Newtson, Eng- 
quist, & Bois, 1976). These results confirm 
that the film speed manipulations were not so 
Table 2 
Mean Estimates of Future Performance, 
Number of Problems Attempted, and Percent 
Correct by Speed of Presentation and 
Success Series 

                                  Measure 

Success pattern      Future 
and speed of         performance     %          No. at- 
presentation         (% correct)     correct    tempted 

Ascending 
  Slow               35.97           32.27      23.82 
  Natural            46.36           39.09      27.91 
  Fast               67.08           60.00      27.33 
Descending 
  Slow               71.83           69.33      24.67 
  Natural            64.58           60.00      24.42 
  Fast               45.42           43.75      27.08 

Note. Speed of presentation varied for the first film 
half only. 


extreme as to disrupt perception of the se- 
quences but that they did, as hypothesized, 
alter level of perceptual organization in a law- 
ful manner. 

Estimates of future performance. Analysis 
of variance of estimates of future performance 
indicated a significant main effect for success 
pattern, F(1, 64) = 13.78, p < .001, and a 
significant interaction between this factor and 
speed of presentation, F(2, 64) = 34.11, p < 
.001. The main effect of speed of presentation 
was not significant (F < 1). Means for this 
measure and two others are presented in 
Table 2. Comparison of the means in the 
natural-speed conditions across the ascending 
and descending success patterns indicated a 
strong primacy effect. Subjects viewing the 
ascending success series predicted significantly 
poorer subsequent performance than subjects 
viewing the descending success series, t(64) = 
3.61, p < .01. This result replicates the find- 
ings of Jones et al. (1968). 

It was predicted that slow-motion presenta- 
tion of the first halves of the series would 
accentuate primacy effects. In the ascending 
series condition, slow-motion presentation re- 
sulted in significantly lower estimates of fu- 
ture performance than natural-speed presenta- 
tion, t(64) = 2.07, p < .05, as expected. In 
the descending series condition, however, 
while slow-motion presentation tended to pro- 
duce higher estimates of future performance, 
this difference was only of marginal signif- 
icance, t(64) = 1.44, p < .10, one-tailed. 

The second prediction, that fast motion 
should reduce primacy effects, was unequi- 
vocally supported. In the ascending series 
condition, fast-motion presentation resulted in 
significantly higher estimates of performance 
than natural-speed presentation, t(64) = 
4.10, p < .001. In the descending series condi- 
tion, fast-motion presentation resulted in sig- 
nificantly lower estimates of future perform- 
ance, t(64) = 3.79, p < .001. 

Recall of percent correct. A second ques- 
tionnaire item asked subjects to recall the 
percentage of problems that the stimulus per- 
son had gotten right. Results on this measure 
generally paralleled those of the first but were 
slightly weaker statistically. A significant 
primacy effect was again obtained in the nat- 
ural-speed conditions on this measure (see 
Table 2), with subjects recalling better per- 
formance in the descending series than in the 
ascending series conditions, t(64) = 3.92, p < 
.001. 

As on the previous measure, differences be- 
tween slow-motion presentation and natural 
speed were consistently in the predicted direc- 
tion. In the ascending condition, recall of 
number correct did not differ significantly 
between the natural and slow-motion condi- 
tions, t(64) = 1.28, p < [...], one-tailed, 
although this difference did attain 
significance in the descending condition, t(64) 
= 1.75, p < .05, one-tailed. 

Comparison of the fast-motion and natural- 
speed presentation conditions confirmed the 
predicted effect on this measure. Subjects in 
the natural-speed conditions recalled signif- 
icantly lower performance than fast-motion 
condition subjects from the ascending series, 
t(64) = 3.92, p < .001, and recalled signif- 
icantly higher performance at natural speed 
from the descending series than at fast mo- 
tion, t(64) = 3.04, p < [...]. 

A third item requested subjects to recall 
the number of items attempted. Analysis of 
variance of these responses indicated no sig- 
nificant main effects or interactions. Means 
are reported in Table 2. Apparently, var- 
iations in film speed did not alter perception 
of problems attempted. 


Discussion 


Results did not support a simple biased 
sampling interpretation of primacy effects in 
ability attribution. Differences in number of 
units marked between sequence halves were 
small, despite a strong replication of the Jones 
et al. (1968) primacy effect on the cognitive 
measures. This failure to confirm the hypothe- 
sis is especially intriguing in view of the sig- 
nificant impact of the film speed manipula- 
tions on primacy effects in the second experi- 
ment, and the effects of level of analysis in 
Experiment 1. 

Comparisons between the two experiments 
are also interesting in view of the tremendous 
differences in processing time between the two 
experiments. Subjects viewing the first half of 
the ascending success series under fine-unit 
instructions gave estimates of future perform- 
ance (M = 35.75) comparable to those of sub- 
jects viewing that part of the sequence under 
slow motion (M = 35.91). Similarly, estimates 
of future performance resulting from large-unit 
instructions (M = 61.88) closely paralleled those 
obtained when the initial part of the sequence 
was presented in fast motion (M = 67.08). 
As the sequences were about 10 minutes long, 
the first half took 35 minutes to view at slow 
motion and about 43 seconds at fast motion. 

Despite the large differences in presentation 
time, analysis of segmentation patterns con- 
firmed that the fast-motion and slow-motion 
[...] 
sequences were divided in half for presentation 
and simply doubled their estimates from the 
second halves of the films. More simply, how- 
ever, inspection of the films themselves re- 
vealed no difficulty in following the events. 
These film speeds might be considered ex- 
treme. As indicated by their choice as a fixed 
[...] by the manufacturer of the video re- 
corder, however, these speeds are fairly typ- 
ical for fast-motion and slow-motion film pre- 
sentations. 
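
The first-half viewing times quoted above follow from simple arithmetic, sketched here in Python. The sketch assumes that the two halves of the 10.1-min normal-speed sequence are of equal duration, which the text does not state; the helper name `viewing_time_min` is hypothetical.

```python
def viewing_time_min(normal_min, speed_factor):
    """Viewing time in minutes for footage lasting normal_min minutes at
    normal speed when played at speed_factor times normal speed."""
    return normal_min / speed_factor


# Assumption: each half of the 10.1-min sequence lasts 10.1 / 2 min.
first_half_min = 10.1 / 2

slow_min = viewing_time_min(first_half_min, 1 / 7)      # one-seventh speed
fast_sec = viewing_time_min(first_half_min, 7) * 60     # seven times speed

# Rounded to whole units: about 35 minutes and about 43 seconds.
print(round(slow_min), "min at slow motion;", round(fast_sec), "s at fast motion")
```

Under that equal-halves assumption the slow-motion first half runs to roughly 35 minutes and the fast-motion first half to roughly 43 seconds, in line with the figures given in the text.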
The comparability of the instructional var- 
iations and film speed variations on the per- 
ceptual and cognitive measures does not indi- 
cate that there was some fortuitous “match” 
between the film speeds used and these in- 
structional extremes. Rather, a hierarchy of 
structures exists in behavior, in this instance 
at a step-by-step level, at a trial-by-trial level 
and, as suggested by the natural-speed condi- 
tions, at at least one intermediate level. The 
variations in film speed apparently caused 
observers to select the level of analysis that 
was least difficult to employ under the cir- 
cumstances. 

With respect to the mechanisms of informa- 
tion integration underlying the cognitive data, 
the present findings are suggestive, if not con- 
clusive. One possibility could be a simple kind 
of attention decrement or enhancement effect. 
That is, if fine units provide more information 
to the observer, subjects in the fine-unit/ 
natural-unit and slow-motion/normal-speed 
conditions would have, in comparison to sub- 
jects in the normal-speed conditions, a sample 
of information from the behavior that is 
biased toward the first part of the sequence. 
For this to occur, however, it would also be 
necessary for perceivers to weight outcomes 
according to the number of actions associated 
with them. That is, one would have to assume 
that the experience of following someone 
through a step-by-step problem-solving at- 
tempt and then seeing him or her fail causes 
the outcome (i.e., the failure) to have more 
impact on the perceiver’s judgment than if the 
perceiver viewed the event in large units of 
action (e.g., the actor tried and failed). The 
smaller magnitude of differences between the 
slow-motion/normal-speed conditions and the 
normal-speed conditions could thus be seen 
to follow from the fact that fine-unit analysis 
tended to persist in the slow-motion/normal- 
speed conditions. This interpretation could 
not, however, account for the occurrence of 
primacy effects in the normal-speed condi- 
tions, since segmentation did not differ across 
film halves in these conditions. Thus such an 
interpretation would have to assume that 
these effects occur in addition to primacy 
effects due to other mechanisms. 


A second, perhaps more likely interpreta- 
tion is that differences in level of analysis 
affect the point in the series at which a sub- 
jectively sufficient quantity of information is 
reached. That is, observers in these studies 
are simply told to view the series of events. 
If we assume, following Jones and Davis 
(1965), that perceivers take in information 
about others only until they feel they have 
“reason enough” to predict or explain the 
other’s behavior, then it could be that fine- 
unit perceivers obtain sufficient information 
(subjectively) about the actor sooner in the 
series than large-unit perceivers. This could 
include other information about the actor 
than just ability information as reflected in 
outcomes of trials. Since the distribution of 
outcome information in the series is biased, 
the sooner the perceiver stops incorporating 
new information, the poorer (in an ascending 
success series) the impression of the actor’s 
ability would be. This interpretation would 
have the advantage of being a more parsi- 
monious one and might also account for the 
parallel results on both estimates of future 
performance and on recall of number of prob- 
lems solved, in the absence of differences in 
recall of number of problems attempted. That 
is, although subjects continued to monitor the 
problem-solving behavior throughout the 
series, it would appear that they ceased to 
integrate outcome information into their im- 
pression of the actor’s ability beyond a certain 
point. 
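
The stopping-point interpretation can be illustrated with a small Python sketch. It is not a model proposed in the article: it simply assumes that the impression of ability equals the percent correct among the trials integrated before the perceiver stops. The function names are hypothetical, and the order of outcomes within each half is arbitrary, so only stopping points at the half boundary (15) and at the end (30) are meaningful here.

```python
def make_series(first_half_correct, second_half_correct, half_len=15):
    """Build a list of outcomes (1 = right, 0 = wrong) with the given number
    of successes in each half. Within-half order is arbitrary (an assumption)."""
    first = [1] * first_half_correct + [0] * (half_len - first_half_correct)
    second = [1] * second_half_correct + [0] * (half_len - second_half_correct)
    return first + second


def impression_percent(outcomes, stop_after):
    """Percent correct among the first stop_after trials, standing in for the
    impression formed if outcome integration ceases at that point."""
    sample = outcomes[:stop_after]
    return 100.0 * sum(sample) / len(sample)


ascending = make_series(5, 10)    # 5 of the first 15 right, 10 of the last 15
descending = make_series(10, 5)   # 10 of the first 15 right, 5 of the last 15

for name, series in (("ascending", ascending), ("descending", descending)):
    early = impression_percent(series, 15)   # stops integrating at midpoint
    full = impression_percent(series, 30)    # integrates the whole series
    print(f"{name}: stop at 15 -> {early:.1f}%, stop at 30 -> {full:.1f}%")
```

On this simplification, a perceiver who stops at the midpoint holds an impression of about 33% for the ascending series and about 67% for the descending series, whereas one who integrates all 30 trials arrives at 50% for both, which is the direction of the primacy effect discussed above.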

It might be argued, however, that the re- 
sults on the cognitive measures in both experi- 
ments are artifactual, in that there was a con- 
found between the shift in content between 
[...] (the changing success rate) and 
the shift in instructions (Experiment 1) or in 
film speed (Experiment 2). That is, perhaps 
the changes in instruction or film speed led 
subjects to conclude that the second half of 
the sequences was more or less important. A 
third experiment was conducted to rule 
out this interpretation. 


Experiment 3 


In this experiment subjects viewed either 
the ascending or descending success series. 
Subjects were instructed to analyze normal- 
speed films into fine units or large units or to 
segment fast-motion films naturally. The same 
segmentation instructions and film speed were 
maintained throughout the series. If the re- 
sults on the cognitive measures in the first two 
experiments were due to an artifact of changes 
in film speeds or instructions, differences on 
these measures as a function of these manip- 
ulations should not be observed. Failure to 
find accentuation or attenuation of primacy 
effects as a result of these manipulations, how- 
ever, could also be consistent with the inter- 
pretation of the previous results as due to 
enhancement of or decrements in attention. 
That is, if level of analysis simply affects 
weighting of outcomes throughout the series, 
then differences in levels of analysis, if main- 
tained throughout, should leave the overall 
judgment unchanged. 

If, however, differences in level of analysis 
affect the point in the series where informa- 
tion integration ceases, fine-unit analysis 
should produce strong primacy effects, whereas 
large-unit analysis should attenuate them. 

The present study did not include slow- 
motion conditions. Slow-motion presentation 
of these sequences would take more than 4[...] 
minutes. Our experience with the previous 
study suggested that slow-motion presentation 
of only the first halves of the sequences tested 
the limits of human endurance. Since the fast- 
motion conditions are adequate to test the 
alternative interpretations noted above, slow- 
motion conditions were excluded. 


Method 


Subjects. Subjects were 53 undergraduates (20 
males, 33 females). 

Stimuli. The same sequences were used as were 
employed in the previous study. Two versions of 
each sequence were used: (a) the normal-speed ver- 
sion of the ascending and descending series (length 
was 10.1 min.), and (b) a fast-motion version in 
which the entire sequence was presented at seven 
times normal speed (length was 1.45 min.). 

Procedure. Subjects in the fine-unit and large- 
unit conditions were instructed as in Experiment 1; 
subjects in the fast-motion conditions were instructed 
as in Experiment 2. One alteration in the procedure 
was made from the previous two studies. Subjects 
were asked to record units by making marks on a 
piece of paper, rather than by pressing a button, as 
in the previous two studies. This modification of the 
recording procedure has been used by Wilder 
Table 3 
Mean Number of Breakpoints, Estimates of Future Performance, Recall of Percent Correct, and 
Recall of Number of Problems Attempted by Condition and Success Series 

                                          Measure 

Success pattern     Number of       Future 
and condition       breakpoints     performance     % correct     No. attempted 

Ascending 
  Fine unit         104.70          35.50           33.50         27.30 
  Large unit         62.70          56.50           51.30         28.30 
  Fast motion        80.40          60.75           53.88         24.13 
Descending 
  Fine unit         147.22          82.56           68.89         25.00 
  Large unit         64.00          49.29           55.71         28.86 
  Fast motion        60.44          39.22           35.89         29.11 

(1978b) with good results and is more efficient if one 
is primarily concerned with the number, rather than 
the patterning, of breakpoints. 

Dependent measures. The same postexperimental 
questionnaire was used as in the previous study. 

Design and analysis. A one-way analysis of vari- 
ance was conducted between the six groups (fine-unit, 
large-unit, and fast-motion conditions for the ascend- 
ing and descending series) to obtain an estimate of 
variance, and t tests were conducted between groups 
on each of the variables. 


Results 


Results of the one-way analyses of variance 
indicated significant differences between 
groups in number of breakpoints marked, 
F(5, 47) = 13.11, p < .001; estimates of fu- 
ture performance, F(5, 47) = 21.94, p < 
.001; and recall of percent correct, F(5, 47) = 
9.05, p < .001. As in the previous studies, no 
significant differences were obtained between 
groups on recall of number of problems at- 
tempted, F(5, 47) = .40. 

Means for the six groups on all measures 
are reported in Table 3. Tests of mean differ- 
ences indicated that for the ascending success 
series, significantly more breakpoints were 
marked in the fine-unit condition than in the 
large-unit condition, t(47) = 3.33, p < .005, 
or in the fast-motion condition, t(47) = 1.85, 
p < .05, one-tailed. Number of breakpoints 
marked did not differ between the large-unit 
and fast-motion conditions for the ascending 
series, t(47) = 1.29, p < .20. 

Similarly, in the descending success series 
conditions, significantly more breakpoints 
were marked in the fine-unit condition than 
in the large-unit condition, t(47) = 5.86, p < 
.001, or in the fast-motion condition, t(47) = 
6.53, p < .001. The latter two conditions did 
not differ in number of breakpoints marked, 
t(47) = .25, p < .80. 

Results on estimates of future performance 
paralleled those obtained in the previous two 
studies, ruling out interpretation of those 
results as due to artifacts or attentional varia- 
tions. In the ascending series, subjects in the 
fine-unit condition predicted significantly 
lower performance than subjects in the large- 
unit, t(47) = 4.26, p < .001, or in the fast- 
motion, t(47) = 4.83, p < .001, conditions. 
Fast-motion and large-unit condition means 
did not differ, t(47) = .81, p < .40. 

Results on this measure in the descending 
series conditions indicated significantly higher 
estimates of future performance in the fine-unit
condition than in the large-unit, t(47) =
5.99, p < .001, or in the fast-motion, t(47) =
8.35, p < .001, conditions. The difference between
the large-unit and fast-motion conditions
did not quite attain significance, t(47)
= 1.81, p < .10.

The measure of recall of past performance 
produced results similar to those on the previ- 
ous measure. In the ascending series condi- 
tions, mean recall of past performance was 
significantly lower in the fine-unit condition 
than in the large-unit condition, t(47) = 2.98,
p < .005, or in the fast-motion condition,
t(47) = 3.22, p < .005. Recall of past performance
did not differ between the large-unit
and fast-motion conditions, t(47) = .41, p <
.70.

In the descending series conditions, recall of
past performance was significantly higher in
the fine-unit condition than in the large-unit,
t(47) = 1.96, p < .05, or in the fast-motion,
t(47) = 5.25, p < .001, conditions. The difference
between the large-unit and fast-motion
conditions, which was marginally significant
on estimates of future performance,
attained significance on this measure, t(47) =
2.95, p < .005.


Discussion 


Since the same film speed and instructional 
conditions were maintained throughout the 
series in this experiment, subjects could not 
have responded to a demand that they weight 
the performance in the series unequally. This 
rules out a “confound” explanation of results 
on the cognitive measures in the first two 
experiments. Furthermore, results on number 
of breakpoints marked and estimates of future 
performance were comparable to those obtained
in the first two experiments.

One problem in these data, however, is that 
variance in number of units marked was 
much larger than in the previous two experi- 
ments. This difference may reflect the differ- 
ence between the button-press procedure used 
in Experiments 1 and 2 and Wilder’s (1978a)
pencil-and-paper procedure in this experiment. 

These data also support the interpretation
that level of analysis alters the cessation of 
information integration. Subjects analyzing 
the ascending series into fine units predicted 
future performance to be poorer than subjects 
analyzing the series into larger units, and sub- 
jects analyzing the descending series into fine 
units predicted future performance to be 
better than subjects analyzing that series into 
large units. The precise relationship between 
estimates of future performance and number 
of units, it should be noted, will depend upon 
a number of factors and is not a simple linear 
function of number of units. These factors
include the underlying pattern of performance,
level of analysis, and time of termination of
processing. In the present studies, for example,
performance did not steadily increase
or decrease; the pattern of performance was
more on the order of a step function, with
very good performance followed by very poor
performance in the “descending” series and
the reverse pattern in the “ascending” series.
Nevertheless, one might expect a correlation
between number of breakpoints and estimates
of future performance in the fine-unit and
slow-motion conditions, since the point of
sufficiency should occur closer to the midpoint
of the series when level of analysis was very
fine. These correlations were computed for
each sequence for the fine-unit and slow-motion
conditions combined over the three
experiments. For the ascending series, number
of breakpoints correlated -.27 (df = 37, p <
.05) with estimates of future performance,
whereas this value was .64 (df = 30, p <
.001) for the descending series. That is, when
performance was poor at the outset, finer unit
analysis led to lower predictions of future
performance; when performance was good at the
outset, finer unit analysis led to higher
predictions of future performance. Similar
correlations over the larger unit conditions were
not significant.
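
As a rough check on the significance levels reported for these correlations, a correlation r with a given df can be converted to a t statistic with the standard formula t = r * sqrt(df / (1 - r^2)). The sketch below assumes that the reported values are Pearson correlations tested against zero, which the text does not state explicitly; the function name t_from_r is illustrative only.

```python
import math


def t_from_r(r, df):
    """Convert a correlation r with df degrees of freedom to a t statistic.

    Assumes a Pearson correlation tested against a null value of zero.
    """
    return r * math.sqrt(df / (1.0 - r * r))


# Values reported in the text for the two series.
t_ascending = t_from_r(-0.27, 37)
t_descending = t_from_r(0.64, 30)

print(f"ascending:  t(37) = {t_ascending:.2f}")   # -1.71
print(f"descending: t(30) = {t_descending:.2f}")  # 4.56
```

With 37 df, the one-tailed .05 critical value of t is roughly 1.69, so the ascending-series correlation appears to reach p < .05 only on a one-tailed test, whereas the descending-series value is well beyond conventional criteria.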

There was one result in Experiment 3 that
was not consistent with the previous experiment.
Despite the comparable number of units
marked for the descending series in large-unit
and fast-motion conditions, estimates of future
performance and recall of percent correct
tended to be significantly lower in the fast-motion
conditions. It is possible that the
novelty of the fast-motion presentation altered
the criterion of sufficiency in this condition,
although it is not clear why this was not
equally the case for the ascending series.


General Discussion 


Jones and Goethals (1971) identified three
types of processes that can produce primacy
effects: (a) discounting processes, whereby
later information is rejected if it is inconsistent
with earlier information; (b) assimilation
processes, whereby the meaning
of later information is altered in light of previous
information; and (c) attention decrements,
whereby the perceiver’s sample of information
is biased in that it is largely drawn
from the first part of the sequence.

The explanation of primacy effects we
have proposed shares important features with
each of these explanations. That is, once a
point of sufficient information is reached, 
later information is rejected, consistent with 
a discounting interpretation, although this 
rejection is not seen as a result of its inconsistency
with previous information. Jones and
Goethals (1971) reject the discounting inter- 
pretation, noting that “it is difficult to see 
how the discounting hypothesis might apply 
in performance evaluation when there is no 
evidence for motivation decrement or varia- 
tions in problem difficulty” (p. 38). Our sug- 
gestion would be that later evidence in a series 
may be “discounted,” but only in the sense 
that it is not used, rather than being rejected. 

Similarly, once a point of subjectively suf- 
ficient information is reached, both the “suf- 
ficiency” and assimilation interpretations 
make the same prediction. The assimilation 
hypothesis assumes that later information is 
distorted; the sufficiency hypothesis assumes 
it is disregarded. Thus the sufficiency explana- 
tion is proposed as an alternative view, but 
the present data cannot discriminate between 
them. 

Finally, the sufficiency explanation incorporates
some features of the attention decrement
interpretation in that it postulates that
such effects are due to a decrease in processing
effort over time. The results of Experiment 3,
as argued, rule out a simple attentional
weighting interpretation of these effects.

One question raised by the present data is 
the reason that a “recency effect” was obtained
in the fast-motion and large-unit conditions.
That is, if subjects were maintaining an
accurate count of successes, one might anticipate
convergence of the large-unit conditions
to an accurate estimate of a 50% success
rate. This prediction, however, would require
a number of assumptions as to the nature of
processing that are, so far at least, unverified.

One assumption would be that if successes
were randomly arranged, estimates of success
would be accurate. Jones et al. (1968) used
randomly arranged sequences and found that
estimates of future performance ranged as
high as 61% in some series. A second assumption
is that subjects do indeed keep running
counts of total number of successes rather
than more impressionistic estimates of success
rates based on the last few observed outcomes.
If the latter were the case, it should
be noted, the judgment of success rate would
simply reflect the estimate that was current
when the point of sufficiency was reached.

Another question raised by these results 
concerns the relationship between level of 
analysis and information gain. Newtson 
(1973) argued that level of perceptual anal- 
ysis determines potential information gain 
from observation. In the large-unit and fast- 
motion conditions of the present studies, more 
units were marked than there were outcomes 
in each of the conditions, so that all subjects 
should have had complete information on out- 
comes. The sufficiency explanation proposed 
here postulates that subjects reached their 
point of sufficiency because they picked up 
additional information as well. 

Newtson (1973), in the initial study of 
level of analysis and information gain, found 
that observers analyzing a sequence into fine 
units of action had more confident and differ- 
entiated impressions of the actor. Subsequent 
investigations (cf. Deaux & Major, 1977; 
Newtson, 1976; Wilder, 1978b; Ebbesen, 
Cohen, & Lane, Note 1) have indicated that 
this effect may often be obtained for par- 
ticular sequences, but sometimes is not found. 
No studies, to our knowledge, have reversed 
the effect such that greater information is 
gained under large-unit than under fine-unit 
analysis. 

Newtson (1973) based his argument on an 
analysis of perceived actions in terms of in- 
formation theory. Strictly defined in those 
terms, the notion that more units should pro- 
vide greater potential information is a valid 
one. As Garner (1962) points out, however, 
there is a difference between the quantity of 
information that is potentially available in a 
series of instances and the information that is 
transmitted by those instances. That is, 
humans are able to use information only when 
it is available in particular forms or struc- 
tures. Newtson’s (1973) prediction that finer 
unit analysis should produce more informa- 
tion gain than large-unit analysis is like the 
prediction that concept attainment is a simple 
function of the number of instances presented.
Such a generalization in concept attainment
paradigms is a highly limited one (cf. Bourne, 
1966; Bruner, Goodnow, & Austin, 1956; 
Garner, 1962), restricted to sets of instances 
providing usable information in small incre- 
ments. Newtson’s (1973) prediction thus de- 
pends upon the further assumption that the 
actions discriminated provide information 
about the actor in a usable form. The present 
data further qualify this relationship. Level 
of analysis may be seen as setting the limits 
of information gain, but further processing of 
that information may depend upon other fac- 
tors. 


Reference Note 


1. Ebbesen, E. B., Cohen, C. E., & Lane, J. L. En- 
coding and construction processes in person per- 
ception. Paper presented at the meeting of the 
American Psychological Association, Chicago, 1975.


Reassertion and Giving Up: The Interactive Role of
Self-Directed Attention and Outcome Expectancy 


Charles S. Carver and Paul H. Blaney 
University of Miami 


Michael F. Scheier 
Carnegie-Mellon University 


Two experiments tested a theoretical model of behavioral self-regulation, which 
makes predictions about the effects of failure on a person's subsequent efforts. 
This model holds that degree of effort will be a product of two things: expect- 
ancy of being able to redress the failure and degree of self-attention. In the 
experiments, a failure pretreatment was used to create large within-self discrep- 
ancies among female subjects. It was predicted (a) that negative outcome ex- 
pectancies regarding a subsequent task would lead to decreased persistence on 
that task, (b) that positive outcome expectancies for the subsequent task would 
lead to increased persistence on that task, and (c) that both of these tendencies 
would be mediated by self-directed attention. The results of the two studies 
supported these predictions. Discussion centers on implications for research and
theory in the areas of self-awareness, self-efficacy, and helplessness.


A good deal of attention has been devoted 
in recent years to the consequences that an 
initial failure can have for subsequent task 
performances. Sometimes the inability to 
achieve an expected goal at one task has led 
to performance increments at a second task 
(e.g., Hanusa & Schulz, 1977; Roth & Kubal, 
1975). This typically has been interpreted as 
reflecting heightened efforts to compensate 
for the prior inadequacy. In other circum- 
stances, however, failure at one task has led 
to performance decrements at a second task 
(e.g., Hiroto & Seligman, 1975; Seligman,
1975). This “giving up” phenomenon is
widely believed to reflect a belief that effort 
on the second task will not lead to a positive 
outcome. 


The authors are grateful to Denise Brandom,
Marcia Hamilton, Ileana Rodriguez, and particularly
Jody Kreitchman, for their efforts in the
collection of the data. The results of this research
were presented in an abbreviated form at the 1979
meeting of the American Psychological Association,
New York City.

Requests for reprints should be sent to Charles
S. Carver, Department of Psychology, P.O. Box
[...], University of Miami, Coral Gables, Florida.

A theoretical model of behavioral self-regulation
has recently been proposed (Carver,
1979; Carver, Blaney, & Scheier, 1979) that 
has important implications for our under- 
standing of performance increments and dec- 
rements as products of failure. This theoretical 
model was developed in a somewhat different 
context, however. Specifically, it was designed 
to describe the behavioral consequences of 
self-directed attention. 


Self-Attention and Behavioral Regulation 


Self-focused attention has been shown to 
have several predictable effects, two of which 
are directly relevant to present concerns. 
These two effects both occur when some be- 
havioral standard has been made salient to 
the person prior to the heightening of self- 
focus. In such a case, self-directed attention 
is believed to cause the person to become 
more aware of his or her present state or be- 
havior and how it compares to the standard. 
One frequently demonstrated consequence of 
this awareness is an enhanced tendency for 
self-attentive persons to alter their behavior 
so that it conforms more closely to the standard
of comparison (e.g., Beaman, Klentz,
Diener, & Svanum, 1979; Carver, 1974,
1975; Froming, in press; Scheier, Fenigstein, 
& Buss, 1974; Vallacher & Solodky, 1979). 
On the other hand, in at least one study 
(Duval, Wicklund, & Fine, 1972), persons in 
whom a large self versus standard discrep- 
ancy had been created simply withdrew more 
quickly from the experimental context when 
self-focus was high than when it was lower. 
These two divergent effects of self-attention 
—enhanced conformity to behavioral stan- 
dards, and withdrawal—are reconciled in the 
following way (Carver, 1979). When a dis- 
crepancy exists between a present state or 
behavior and a salient standard of compari- 
son, self-focus leads initially to an impetus 
to match one’s behavior to the standard. This 
is presumed to be the case until or unless 
something interrupts this impetus. Interrup- 
tion can occur prior to initiating the attempts 
to alter one’s behavior if, for example, the 
person knows ahead of time that reducing 
the discrepancy between self and standard 
will be difficult. Alternatively, interruption 
can occur during the discrepancy-reduction 
attempt itself. Any interruption of the dis- 
crepancy-reduction impetus leads to an as- 
sessment of outcome expectancy, that is, the 
perceived likelihood that the discrepancy can 
be reduced. This expectancy judgment represents
a kind of psychological watershed. Subsequent
behavior can be seen as falling into
one of two classes, depending on the outcome 
of this judgment. If the expectancy is fa- 
vorable, the result is reassertion: further ef- 
fort to reduce the discrepancy, to match one’s 
behavior or state to the standard. This ten- 
dency should be exaggerated by further self- 
awareness. If the expectancy is unfavorable, 
the result is an impetus to withdraw. The 
execution of the withdrawal impetus should 
also be enhanced by further self-focus. 
Carver et al. (1979) have applied this rea- 
soning to analyzing the approach-withdrawal 
decision that may occur among fearful per- 
sons in response to their recognition of rising 
fear. Carver et al. tested subjects who were
equivalent in their degree of self-rated fear of 
a specific stimulus but who differed from each 
other in their chronic expectancies of being 
able to approach and handle the feared stimulus.
Experimentally enhanced self-attention
led the doubtful subjects to withdraw at an
earlier stage of the approach sequence than
was true with less self-attention. On the other
hand, self-attention tended to enhance approach
among the more confident subjects,
although not significantly so. These results
thus provided substantial support for the
theoretical model, at least as it applies to
fear-based behavior.


Failure, Expectancy, and Self-Attention


This theoretical model has a great many
potential applications, however, of which the
fear-response decision is only one. It also
makes predictions, for example, about responses
to failure. Failure produces a
discrepancy between one’s present state and
one’s desired state (i.e., a discrepancy between
oneself and the standard of comparison).
According to our model, one’s response
to failure should be dictated in part by one’s
expectancy about being able to reduce the
discrepancy. A positive expectancy (the perception
that the gap can be closed successfully)
should lead to reassertion attempts.
This should be reflected by increased efforts
to do well on a subsequent task. A negative
expectancy (the perception that the discrepancy
cannot be altered) should lead to an
impetus to withdraw. This should be reflected
by reduced efforts at a subsequent
task. Moreover, both the reassertion and
withdrawal tendencies should be exaggerated
by increased self-awareness.

Although there is some indirect support for
this reasoning (McDonald, in press; [...]barger
& Aderman, in press), the model has
not yet been tested directly in this
behavioral context. To do so was the purpose
of the present research. All participants in
this research experienced a failure that
created a substantial discrepancy between
their present states and their desired states.
We then manipulated subjects’ expectancies
about whether or not they would be able to
do well at a subsequent task. The [...]
task was chosen to provide us [...]

[...] it was predicted that subjects in whom negative
expectancies had been created would
display less persistence with high than with
low self-attention.


Experiment 1 
Method 


Subjects and Experimenters 


Subjects were 74 female undergraduates from the
University of Miami subject pool. We used only
females in this experiment because our experiment- 
ers were female and we wished to avoid introducing 
the potential psychological complexity associated 
with mixed-sex dyads. Eight other subjects were 
deleted prior to data analysis, for the following 
reasons: 1 reported having succeeded on the initial 
(failure manipulation) task, 3 spontaneously re- 
ported the belief that neither task was possible, and 
4 either failed to understand the instructions for 
the second task or else failed to follow those in- 
structions, which resulted in their claiming to have 
solved an insoluble puzzle. Each subject, tested in 
an individual session, was randomly assigned to one 
of two expectancy conditions and to one of two 
self-attention conditions (described more fully be- 
low). 

There were two experimenters; each interacted 
alone with the subject at a different stage of the 
experiment. The first experimenter was responsible
for manipulating the subject’s outcome expectancy 
but remained blind to the subject’s self-awareness- 
condition assignment. The second experimenter was 
responsible for manipulating self-focus but remained
blind to each subject’s expectancy-condition
assignment. Both experimenters were ignorant
of the specific hypotheses being tested. 


Procedure 


The first experimenter escorted the subject to 
the first testing room. There she explained that
the research team was evaluating a set of percep- 
tual motor tasks for eventual use as an instrument 
for measuring abstract reasoning skills. It was 
stressed that the specific skills that these items 
assessed were very important in a number of pro- 
fessional and interpersonal domains. It was further 
explained that a wide variety of different test-item 
groups were being developed by the research team, 
in order to make possible measurement of these abili- 
ties from several quite different perspectives. The 
subject was led to believe that 12 different sets of
items were being tested, although she herself would
only be asked to work on 2 of the sets. Each participant
in the project ostensibly was being assigned
2 randomly chosen item sets, in order to
control for fatigue and order effects. In reality, all
subjects were asked to attempt the same 2 tasks in
the same order.

The experimenter continued by saying that some
of the problem sets had been tested in previous 
semesters. In such cases, any information obtained 
from the previous research would be printed along 
with the instructions for that task. The experimenter
concluded her preliminary remarks by saying
that the subject would be asked occasionally
to fill out some brief self-report scales. The subject 
was then given a questionnaire, which included an 
expectancy rating for the first task: “How well do 
you expect to do on your first task?” answered on 
a 9-point scale labeled extremely poorly at one end 
and extremely well at the other end. 

Anagram task. Each subject then was given a
sheet labeled “anagrams task.” The sheet included
printed instructions explaining that the task was to
rearrange the letters of each item to form a word.
Work space was provided, but the subject was 
cautioned not to guess randomly in the answer 
blanks, because a proportion of her incorrect an- 
swers would be subtracted from her total number 
of correct answers. 

This sheet also included a paragraph indicating 
that the anagrams had been partially evaluated 
during three previous semesters. Scores on the ana- 
grams task ostensibly had been found to predict 
scores on the “design problems” task and the 
“digit-symbol” task but not scores on the “mazes” 
task. This information was included to provide 
some validation in the subject’s eyes for the ex- 
pectancy manipulation that took place later (see 
below). This paragraph also indicated that Uni- 
versity of Miami students over the previous three 
semesters had ostensibly averaged 7.21 correct solutions,
with last semester’s average being 7.34.
This information was included in order to ensure 
that all subjects would view their own performances 
as failures. In actuality, the anagrams, chosen from
Feather and Simon (1971) and Tresselt and Mayz- 
ner (1966), were quite difficult and in some cases 
impossible (overall, subjects’ scores averaged .77, 
SD = .93).

The experimenter told the subject that she would 
have 5 minutes to work on the anagrams, “although
I don't suppose you'll need all that time
for just 9 items.” The experimenter said she would
be on the other side of the room working if the
subject had any questions. After 5 minutes had
elapsed, the experimenter indicated that the time
was up and returned to where the subject was seated.
She looked at the subject's performance with mild
surprise and said, “That’s not very good at all.
Almost everyone gets at least five or so on this
task.” The experimenter then delivered the statement
that constituted the manipulation of outcome
expectancy.

Expectancy manipulation. In the positive expectancy
condition, the experimenter continued as
follows:


What did I say was going to be your other task, 
the design problems? [After checking her list] Yes. 
Well, that’s a good break for you. You see, the 
anagrams and design problems have both been
studied quite a bit already, so we know a lot
about how they go together. The evidence so far
is that the two are kind of opposites of each 
other. They both measure the same abilities, but 
in such different ways that people who do poorly 
on this one often seem to do much better on 
that one, and vice versa. Probably you'll be able 
to do better on the design problems. 


In the negative expectancy condition, the experi- 
menter said instead: 


What did I say was going to be your other task,
the design problems? [After checking her list]
Yes. Well, that’s unfortunate for you. You see,
the anagrams and the design problems have both 
been studied quite a bit already, so we know a 
lot about how they go together. The evidence so 
far suggests that the two correlate very very 
highly with each other, which means a person 
who does well on one will also do well on the 
other—if a person does badly on one, that person 
will do badly on the other one, too. So I’d expect 
you're not going to do very well on the design 
problems. 


The experimenter went on to say that she was 
working only with the anagram task and that 
another person was handling the design problems 
in a different room. The experimenter said that the 
person supervising the design problems would deal 
with the subject from that point on, which would 
include seeing that the subject received credit for 
participation. The implication of the latter infor- 
mation was that the subject would have no further 
contact with the first experimenter. 

The first experimenter then escorted the subject 
to a second experimental room, knocked on the 
door, introduced the subject to the second experi- 
menter, and said that the subject was to work on 
the design problems next. The second experimenter 
guided the subject to a seat that faced the experi- 
menter’s desk and closed the door. She recorded on 
a form that the subject was undertaking the design 
problems as her second task, asked the subject what 
her first task had been, and recorded the subject's 
response. She then handed the subject a printed 
form that was labeled “instructions for design 
problems,” and asked the subject to read these in- 
structions carefully. 

Design problem. The design problem used in the
study was patterned after those developed by
Feather (1961), which have been used a number of
times subsequently to measure persistence (e.g.,
Glass & Singer, 1972). The goal of the task is to 
trace over all the lines of a geometric figure, accord- 
ing to two rules: first, the line must be continuous 
—the pen cannot be lifted from the page; second, 
whereas it is permissible to cross a line traced 
previously, each line segment must be traced once 
and only once. The instruction sheet included a 
simplified sample problem with illustrations of a 
correct solution and solution attempts that violated
each of the two rules.

The instruction sheet continued by saying that
the subject would be given four separate design
problems. She was to work on the first one either
until she solved it or until she decided to go on to
the next one. Once she went ahead, she could not
go back to prior puzzles. The subject was allowed
to attempt to trace the design as many times as
she wished (on separate pieces of tracing paper),
provided that a previous attempt was discarded
before a new attempt was begun (a wastebasket
was provided for this purpose). It was implicit
that neither the time taken to reach a solution nor
the number of attempts required to reach a solution 
mattered in the scoring—only whether the problem 
was solved or not. 
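
The two tracing rules described above amount to asking whether the figure, treated as a graph whose edges are the line segments, has what graph theory calls an Euler trail. A connected figure can be traced in this way only if it has zero or exactly two junctions at which an odd number of segments meet, which is one way a design can be made insoluble. The sketch below is illustrative only: the function name and both example figures are hypothetical and are not the designs used in the study.

```python
from collections import defaultdict


def can_trace_in_one_stroke(segments):
    """Return True if every segment can be traced exactly once
    in one continuous line (i.e., the figure has an Euler trail)."""
    if not segments:
        return True

    degree = defaultdict(int)      # number of segment ends at each junction
    neighbors = defaultdict(set)   # adjacency, used for the connectivity check
    for a, b in segments:
        degree[a] += 1
        degree[b] += 1
        neighbors[a].add(b)
        neighbors[b].add(a)

    # All segments must belong to one connected figure.
    start = next(iter(neighbors))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in neighbors[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    if len(seen) != len(neighbors):
        return False

    # Euler's condition: zero or two junctions of odd degree.
    odd_junctions = sum(1 for d in degree.values() if d % 2 == 1)
    return odd_junctions in (0, 2)


# Hypothetical figures: a plain square, and a square whose corners are
# also joined by both diagonals (the crossing point is not a junction).
square = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
square_with_diagonals = square + [("A", "C"), ("B", "D")]

print(can_trace_in_one_stroke(square))                 # True
print(can_trace_in_one_stroke(square_with_diagonals))  # False
```

In the second figure all four corners join three segments, so no continuous line can cover every segment exactly once, however many attempts are made.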

After briefly recapitulating the instructions, the
second experimenter asked the subject to seat
herself at a nearby table, on which were a felt-tipped
pen and a stack of tracing paper. Also on the table
were several pieces of nondescript equipment. The
experimenter indicated that she herself was just
going to keep working at her own desk while the
subject worked at the design problems. The experimenter
gestured at the clutter on the subject's desk,
and said, “Don’t pay any attention to this junk over
here; it’s part of another experiment.” She then
asked the subject to fill out a second self-report
blank, which included a rating of how well the
subject had done on her first task and a rating of
how well the subject expected to do on the design
problems (both on 9-point scales). The experimenter
then placed the first design problem in front of
the subject, saying, “Here’s the first puzzle. By the
way, this first one is the hardest one. Hardly anyone
gets it, so don’t be too surprised if you don’t.
You can go ahead and start; just let me know when
you want the next one.”¹

The experimenter then turned and walked back
to her own desk. As she did so, she unobtrusively
looked at a clock on her desk, which had been
shielded from the subject’s view. Using that clock,
she began timing the subject’s attempt on the first
design problem. In reality, the subject was given
only one problem, for which there was no solution.
The amount of time that the subject spent attempting
that problem was the study’s dependent variable.
Self-attention manipulation. Self-directed attention
was varied during the second task by the
presence or absence of a wall mirror [...]
on the table at which the subject was working.
(Evidence of the self-focusing properties of
mirror presence is discussed in detail by [...]
Scheier, 1978.) In the mirror present condition,
the mirror was situated directly in front of the
subject’s chair, leaning against the wall at an angle
at which the subject could view her own image
but not that of the experimenter. In the mirror
absent condition, the mirror’s reflective surface was
turned toward the wall. In both conditions, the mirror
was implicitly included in the experimenter’s
reference to the “junk” that was part of another
experiment. The instructions for the task had been
given before seating the subject at the table so
that there would be relatively little time for her to
habituate to the mirror’s presence before beginning
the experimental task.

1 This statement was included because pilot testing
had indicated that most subjects were [...] persistent
on the design problem in the absence of
such a statement. This comment had the effect of
reducing subjects’ persistence slightly over [...]

When the subject asked the experimenter for the 
second design problem, or after 10 minutes (which- 
ever came sooner), the experimenter recorded the 
elapsed time, gave the subject a final brief question- 
naire, and began to probe for signs of suspicion. 
She then took the subject to a third experimental 
room, where the general purpose of the study was 
explained. After being assured that her poor performances
on the two tasks had been manipulated
and were not her own fault, the subject was given
credit for her participation and was dismissed.


Results 


Manipulation Checks


It was important for an adequate test of 
our predictions that all subjects perceive a 
within-self discrepancy and that the positive 
and negative expectancy groups differ from 
each other in their expectancies of being able 
to reduce the discrepancy. To verify that this 
was the case, each subject was asked to com- 
plete two self-report items that served as 
checks on the failure and expectancy manipu- 
lations, respectively, just before beginning 
the design problems task. 

The first item asked the subject to rate 
her performance on the first task (anagrams) 
by circling a number on a 9-point scale with 
endpoints labeled very successful and very 
unsuccessful. All subjects included in the data 
analysis circled numbers on the “unsuccess- 
ful” side of the neutral point (overall M = 
1.84, SD = 1.19). Thus every subject viewed 
her performance on the first task as a failure. 
Just as important, a 2 X 2 factorial analysis 
of variance of subjects’ responses to this item 
revealed that the experimental groups did not
differ from each other in the degree to which 
they perceived the outcome of the first task as 
having been unsuccessful (all Fs < 1). Nor, 
in fact, had the groups differed reliably in 
actual performances on the anagrams 
(Fs < 1.5) or in the expectations for success 
that they had expressed with regard to the 
anagrams task (Fs < 1). 

On the other hand, subjects’ ratings of 
how well they expected to do on the second
task—the design problems—did differ reliably 
between expectancy conditions, F(1,70) = 
5.45, p < .03. Moreover, no effect other than
this predicted main effect approached significance
(other Fs < 1). Although the expectancy
conditions differed from each other
as had been anticipated, closer examination 
of the means revealed that the two expectancy 
manipulations had apparently not been equally 
effective. That is, the average rating among 
subjects in the negative expectancy condition 
was on the appropriate side of the neutral 
point of the scale (M = 4.16, SD = 1.53), 
but the average among subjects in the positive 
expectancy condition was almost exactly at 
the scale’s neutral point (M = 4.92, SD = 
1.23). We will return to this point below. 


Persistence 


The dependent measure of greatest interest 
was the amount of time that subjects had 
spent attempting to solve the design problem. 
An analysis of variance of these values (see 
Table 1) revealed a significant interaction 
between expectancy condition and self-aware- 
ness condition, F(1, 70) = 4.87, p < .03, as 
well as main effects approaching significance 
for both variables (ps < .10). Individual
group comparisons revealed that all of these 
statistical effects depended primarily on the 
low persistence values among subjects who 
had been provided with poor outcome ex- 
pectancies and for whom self-attention had 
been enhanced. These people withdrew from 
their attempts to solve the design problem 
more quickly than did either subjects with 
negative expectancies who were less self-at- 
tentive or subjects in whom self-attention had 
been enhanced but who had more positive 
expectancies, ps < .05 by Scheffé test. Con- 
trary to our hypothesis, self-focus in the posi- 
tive expectancy condition did not increase 
subjects’ persistence. Nor, somewhat sur- 
prisingly, did negative outcome expectancy 
reliably reduce persistence in the absence of 
self-focus.²


2 A separate analysis was also performed on the
number of times the subject attempted to solve the 
problem, though prediction on that measure was 
Table 1
Persistence (in minutes) at an Insoluble Task, as a Function of Outcome Expectancy
and Self-Attention, Experiment 1

              Mirror absent               Mirror present
Expectancy    Persistence   SD     n      Persistence
Positive      8.15          2.38   17     8.43
Negative      8.51          1.88   17     6.55


Discussion


Experiment 1 provided clear evidence that 
self-focus enhances the tendency of subjects 
to withdraw from an experimental task when 
their outcome expectancies are negative. Self- 
directed attention did not cause withdrawal 
among subjects with more positive expectan- 
cies; however, neither did self-focus cause 
these subjects to display enhanced persistence 
on the second task as had been predicted. 


There are several possible reasons for the
failure to confirm this reassertion prediction.
The most obvious possibility is that exag- 
gerated persistence was prevented by our ar- 
bitrary imposition of a 10-minute time limit 
on the second task. The persistence means of 
three of the four groups were, in fact, rela- 
tively near this artificial “ceiling” on effort. 
Another possibility is suggested by the find- 
ing that subjects in the positive expectancy 
condition did not really have particularly 
positive expectancies. Although we can do 
little more at this stage than speculate as to 
where the expectancy “watershed” point 
actually is, it may be that these persons’ out- 
come perceptions were not sufficiently posi- 
tive to cause a discernible increase in reas- 
sertion under self-focus. A third possibility, 
of course, is that self-attention does not lead 
to reassertion under conditions of positive 
expectancy. 

Because the reassertion prediction is an 
important part of our theoretical model, and 
because the results of at least one previous 


more ambiguous. That is, increased effort could
result in more frequent attempts; alternatively, it
could lead only to more concentration prior to
making an attempt. Analysis of this measure
yielded only a nonsignificant tendency toward more
attempts in the mirror absent than the mirror
present condition.


conducted a second experiment. 
study an attempt was made to 
more positive expectancies in 
means of slight changes in th 
manipulation. In addition, we minit 
effects of a potential ceiling on su 
sistence by increasing the time 
periment 2, each subject was a 
20 minutes to work on what she 
the first of four design problems. 


Experiment 2 
Method 


Subjects were 34 female undergré 
the University of Miami subject Pi 
tional subjects were deleted prior to 
One claimed to have solved the 
and the other voiced suspicions &l 
mental tasks.

All procedures in this study ext 
those of the positive expectancy con 


“They both measure the same abi 
different ways that people who do 
anagrams often seem to do much t 
design problems, and vice versa. It
almost always when people do 
they seem to do really well on thi 
past experience I expect you'll do 
problems.” The second procedul 
that, as noted above, subjects V 
20 minutes to work on the n 
As in Experiment 1, subjects 
signed to either the mirror present
absent condition. 
each other in first-task expectancies, number
of anagrams solved (M = .32, SD = .47),
ratings of their anagram performances (M = 
1.68, SD = 1.27), or their expectancies with 
regard to the second task (all Fs < 1). (It 
may be noted that subjects’ expectancies for 
the second task were slightly more positive 
overall than those of the comparable subjects 
in the first experiment, M = 5.06, SD = 
1.54.) 

Analysis of subjects’ persistence on the 
second task revealed greater persistence among 
subjects in the mirror present condition (M = 
12.47, SD = 5.58) than among subjects in 
the mirror absent condition (M = 8.15, SD = 
4.63), F(1, 32) = 6.04, p < .02.³ This difference
proved to be even more reliable (p <
.008) when a comparison between groups was 
made that included data from only those sub- 
jects who reported either neutral or positive 
second-task expectancies. Thus our hypothesis 
was confirmed. When one’s outcome expect- 
ancy is positive, self-focus leads to enhanced 
efforts. 
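
The reported F ratio can be checked against the reported means and standard deviations. What follows is only a minimal sketch, assuming that the 34 subjects were split evenly between conditions (17 per cell); the degrees of freedom (1, 32) are compatible with that split, but the text does not state it. The function name is illustrative.

```python
def f_from_summary(m1, sd1, n1, m2, sd2, n2):
    """One-way ANOVA F ratio for two independent groups, from summary statistics."""
    grand_mean = (n1 * m1 + n2 * m2) / (n1 + n2)
    # Between-groups sum of squares (1 degree of freedom for two groups).
    ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
    # Within-groups sum of squares, recovered from the group SDs.
    ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    df_within = n1 + n2 - 2
    return ss_between / (ss_within / df_within)


# Mirror present vs. mirror absent, Experiment 2; n = 17 per group is an assumption.
f_ratio = f_from_summary(12.47, 5.58, 17, 8.15, 4.63, 17)
print(round(f_ratio, 2))
```

Under the equal-split assumption the value comes out at about 6.03, which agrees with the reported F(1, 32) = 6.04 to within the rounding of the published means and standard deviations.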

One final point should be made about these 
data. Subjects’ persistence in the mirror ab- 
sent condition of Experiment 2 was virtually 
identical to that displayed by the comparable 
subjects (positive expectancy — mirror ab- 
sent) in Experiment 1. More to the point, 
neither of these values differed from the per- 
sistence displayed by subjects with negative 
expectancies in the mirror absent condition of 
Experiment 1. This absence of an expectancy
effect in conditions of low self-focus is con- 
sistent with previous findings (Carver & 
Blaney, 1977a, Experiment 3; 1977b; Carver 
et al., 1979). Our explanation for this finding 
is straightforward. In the absence of an experimental
manipulation, self-focus among our
subjects may have been quite low. Our theo-
retical model describes effects that occur un- 
der conditions of relatively high self-atten- 
tion. If self-focus were low enough, the 
postulated reassertion/withdrawal dichotomy 
would not have been evoked. Thus the absence
of a main effect for expectancy does not
appear to pose any problem for our theoreti- 
cal analysis. Indeed, our data simply suggest 
that expectancy effects require some self-at- 
tention before they occur at all. 
General Discussion


The research presented here conceptually 
replicates a previous finding (Carver et al., 
1979), extending that finding in three important
respects. First, Carver et al. measured
outcome expectancy rather than manipulating
it. Thus, although their results were
consistent with the expectancy-based analysis, 
the findings were susceptible to the criticism 
that it could have been another variable, cor- 
related with expectancy, that had really 
caused the effect. In the present research, in 
contrast, we experimentally manipulated out- 
come expectancy, thus minimizing that po- 
tential criticism. 

A second way in which these findings add 
strength to those of Carver et al. (1979) con- 
cerns the degree to which reassertion can be 
said to have occurred when a positive ex- 
pectancy was combined with self-directed 
attention. The previous study had found only 
directional support for that hypothesis (as 
was also the case in the present Experiment 
1). Presumably this was because of a ceiling 
on subjects’ efforts that was imposed by the 
nature of the task being attempted. In contrast,
the present Experiment 2 clearly demonstrated
such a reassertion response.

A third aspect of this replication may be 
most important of all. Specifically, we have 
demonstrated here that the same principles 
that allowed us in the past to predict fear- 
based behavior can be successfully applied to 
a fundamentally different behavioral domain. 
That is, the present predictions were based 
on precisely the same theoretical model and 
used exactly the same logic that led to the 
previous predictions of Carver et al. However, 
the logic in this case was applied not to the 
attempt to approach a feared stimulus, but 
rather to the persistence with which a person 
tries to reduce a within-self discrepancy 
caused by a prior failure. The fact that the
theoretical analysis is so easily applied to 
these divergent areas of behavior suggests 
that the model may be very general indeed. 


3 Analysis of the numbers of times subjects at- 
tempted the problem indicated that there were 
marginally more attempts (p < .08) among mirror 
present than among mirror absent subjects. 
There is one potential concern about these
findings, however, which stems from the fact 
that a mirror was used to vary self-focus both 
in the present studies and in our previous 
one (Carver et al., 1979). Though we have 
argued elsewhere (Carver & Scheier, 1978; 
Carver & Scheier, Note 1) that the mirror 
is in many ways the “purest” way to increase 
self-attention experimentally, it does have one 
important side effect: It provides subjects 
with visual access to their facial expressions. 
With regard to the present research, this 
raises the possibility that subjects seated 
before a mirror became especially aware of 
their “hopefulness” (in the positive expect- 
ancy condition) or their “hopelessness” (in 
the negative expectancy condition) by observ- 
ing their facial expression, inferred their in- 
ternal states through this external mediation, 
and then behaved in a manner consistent with 
that inference (see, for example, Bem, 1972). 

There are two sources of evidence against 
this line of reasoning: First, evidence con- 
verging on the validity of the self-attention 
construct has been gathered in a good deal of 
previous research (e.g., Carver & Scheier, 
1978; Carver & Scheier, in press; Scheier & 
Carver, 1977; Scheier, Carver, & Gibbons, 
1979; Scheier, Carver, & Gibbons, in press). 
The strategy in that research was to replicate 
mirror-induced effects using individual dif- 
ferences in the disposition to be self-attentive, 
which is termed self-consciousness (Fenig- 
stein, Scheier, & Buss, 1975). Dispositional 
self-consciousness is a particularly useful tool 
for this purpose, in that its use appears to 
rule out possibilities such as external media- 
tion of any obtained effects (see also Carver 
& Scheier, 1978; Scheier et al., 1979). That 
is, being a dispositional variable, it does not 
require the presence of any external stimulus 
to mediate its effects. By implication, this evi- 
dence suggests that mirror effects are also 
mediated by inward focus of attention. 

The second source of evidence is a very 
recent study by Brockner (1979), examining 
a phenomenon that is conceptually similar to 
that studied here. Brockner was interested in 
the effects of self-focus on persons high and 
low in self-esteem, following success and fail- 
ure experiences. He examined these effects by 
measuring dispositional self-consciousness as 
well as by manipulating mirror presence.
Brockner found that in the absence of the 
mirror, dispositional self-consciousness led to 
precisely the same effects as had mirror pres- 
ence. That is, among highly self-conscious 
subjects, failure led to a subsequent perform- 
ance decrement for those with low self-esteem 
but not for those with high self-esteem. In
contrast, persons who differed from each other 
in self-esteem, but who were low in self- 
consciousness, did not differ from each other 
behaviorally following failure. Brockner’s re- 
sults thus show an important convergence 
with the present data, which suggests in turn
that the present effects were internally medi- 
ated. 

Two differences between Brockner’s research
and our own should be noted briefly,
both of which appear to add further general- 
ity to our theoretical stance. First, Brockner’s 
research investigated the phenomenon of low 
self-esteem, rather than the giving-up response 
per se. The fact that his findings are predicta- 
ble from our reasoning (see Brockner, 1979, 
for greater detail) provides one more indication
that phenomena as diverse as anxiety,
helplessness, and low self-esteem may have a
common core of conceptually similar processes.
A second point is that Brockner measured
subjects’ task performances, rather than
persistence, on the test-phase task. Carver has
argued elsewhere (1979) that when the situation
does not allow behavioral withdrawal, a
kind of cognitive withdrawal may occur, resulting
in performance decrements. Brockner’s
data lend important support to that assertion.


Theoretical Implications 


Three additional issues are raised by the
nature of the variables from which we predicted
subjects’ behavior and by the nature
of the behavior that we examined. These issues
will be briefly addressed in the following
paragraphs.

The present findings
obviously have some implications for theorizing
about the behavioral consequences of self-directed
attention. There is some diversity of
opinion as to what these consequences are.
Wicklund (1975a) has speculated that the prepotent
response to self-focus is to attempt to
avoid it, either by withdrawing from the situation
or by attempting to immerse one’s
attention in the environment (see also Wick- 
lund, 1975b). In part, this is because Wicklund
has viewed self-attention as an inher-
ently aversive state, particularly when there 
is a large within-self discrepancy. To Wick- 
lund, discrepancy reduction occurs only when 
self-focus cannot be avoided. 

We, on the other hand, see self-focus as a 
necessary component process in the normal 
sequence of behavioral self-regulation (cf. 
Diener, 1979). From this perspective, we have 
argued that the prepotent response to self- 
directed attention is to attempt to reduce 
within-self discrepancies and that withdrawal 
occurs only if that attempt is interrupted, 
and even then only if self-assessment leads to 
a negative outcome expectancy (Carver, 1979; 
Carver et al., 1979). In our view, even when 
the discrepancy is very large, the reassertion 
impulse will hold sway if one’s expectancy of 
being able to reduce the discrepancy is positive.

The results of the present research seem 
unequivocal in supporting our position over 
that of Wicklund. In order to provide a strong 
test of our reasoning, we designed the pre- 
treatment phase of the present study so that 
large within-self discrepancies were produced 
among all subjects by a failure on the first 
task. Yet only when outcome expectancies 
were negative did self-focus lead to withdrawal.
Subjects with positive expectancies
exerted greater efforts in their attempts to do 
well on the second task when self-focus was 
high than when it was lower. It might be 
argued that our positive-expectancy subjects 
were simply immersing themselves in the sec- 
ond task to avoid self-focus (see Duval & 
Wicklund, 1972). This seems very unlikely,
however, because an easier way of avoiding
self-focus was readily available: asking for
the subsequent puzzles, finishing them quickly,
and leaving. (This was also the case in a
study by McDonald, in press, cited earlier as
providing indirect support for our model.)
Moreover, if these subjects had been burying
themselves in the experimental task in order
to avoid self-focus, one must ask why subjects
with negative expectancies did not similarly 
bury themselves in the task. In short, our
explanation for the dichotomy of responses to 
self-focus is the most parsimonious presently 
available. 

Self-efficacy. The present findings also 
appear to have implications for recent theoriz- 
ing about “self-efficacy” and behavior regula- 
tion. There is a good deal of similarity be- 
tween some of Bandura’s (1977) ideas in that 
regard and aspects of our own reasoning, even 
though our analysis emphasizes moment-to- 
moment behavior regulation, whereas Ban- 
dura’s emphasizes long-term differences as a 
function of therapy. It seems useful, therefore, 
to consider the implications of the present 
findings for the two models as they apply to 
persistence and giving up. 

We have framed our discussion to this point 
in terms of “outcome” expectancies, following 
Carver’s (1979) definition of outcome expect- 
ancy as the degree to which the person views 
a desired outcome as attainable. In our 
studies, however, we manipulated outcome 
expectancy largely by manipulating perceptions
of efficacy expectancy.⁴ Efficacy expectancy
is viewed by Carver (1979) as one’s
self-perceived level of effectiveness in execut- 
ing the needed responses, a perspective that 
is quite comparable to Bandura’s, that is, “the 
conviction that one can successfully execute 
the behavior required to produce the out- 
comes” (1977, p. 193). 

Although viewed in this light there are clear 
similarities between the models, the present 
data join those of Carver et al. (1979) in indi- 
cating the inadequacy of Bandura’s theory in 
at least one important respect. Bandura’s 
model fails to take into account the role of 
self-focus in the self-regulatory process. As is
apparent from the present data, this is a role 


4 In Carver’s (1979) terminology, outcome expectancy
is held to depend on several variables, of
which efficacy expectancy is only one. Others are 
such things as time constraints, environmental con- 
straints, and knowledge of relevant response-out- 
come contingencies. Carver drew this distinction 
between these two types of expectancies because he 
believed that outcome expectancies are central in 
determining behavior (assertion vs. withdrawal), 
whereas efficacy expectancies determine the nature 
of concomitant affective experiences. Those predictions
remain to be tested.
that should not be disregarded. That is, if we
had not heightened self-focus among some of 
the subjects in the present research, we might 
have concluded that efficacy expectancies were 
not important in this behavioral context. Instead,
we have shown that they are clearly
important, though only in the presence of self- 
directed attention. 

Helplessness. The present findings also 
seem to have some relevance for issues raised 
by researchers working in the “learned help- 
lessness” tradition. All of our subjects re- 
ceived what might be construed as a helpless- 
ness pretreatment, and some of them dis- 
played reduced persistence, which is often 
viewed as a sign of helplessness. For this reason,
it is interesting to note some important
similarities between our theory (and our find- 
ings) and the kind of cognitive analysis of 
helplessness in humans that recent theorists 
in that area have favored (e.g., Abramson, 
Seligman, & Teasdale, 1978; Hanusa & 
Schulz, 1977; Wortman & Brehm, 1975). We 
will then consider two differences between our 
model and previous theories.

Abramson et al. (1978), for example, have 
proposed a reformulation of helplessness 
theory in which the impact of a pretreatment 
on subsequent performance depends on one’s 
beliefs about the relevance of one’s pretreat- 
ment performance for performance on a subsequent
task. In simpler terms, the effect of
the pretreatment is mediated by its effect on 
the person’s “expectation of future noncon- 
tingency.” Expectation of future noncontin- 
gency is similar to what Wortman and Brehm 
(1975) called “expectancy of no control.” 
Both, of course, are similar to what we have 
termed unfavorable outcome expectancy. 

In the Abramson et al. model, this expect- 
ancy is influenced largely by the subject’s 
attributions regarding the reasons for the
initial failure.⁵ Indeed, it is this attributional
aspect of their model that has received by far 
the greatest amount of attention thus far. It 
is worth emphasizing, however, that despite 
this focus on attributions, it is the person’s 
expectancy that ultimately influences per- 
formance, not the attributions themselves. 

The person who finds himself or herself expecting
to do poorly, for whatever reason,
should display “helplessness” effects. Persons
who expect to do well should not show these
effects.
The central importance of outcome expectancy
in the Abramson et al. model is quite
consistent with our own reasoning (Carver,
1979; Carver et al., 1979). However, our
theory adds to theirs in two important respects.
First, our model assumes that giving-up
phenomena reflect an impulse to withdraw
from the behavioral context. Indeed, both of
the studies that have supported our analysis
of giving up (Carver et al., 1979; the present
Experiment 1) have used dependent measures
embodying behavioral withdrawal as a response
option. In contrast, in most studies in
the helplessness tradition, such an option has
not been explicitly available. As was mentioned
above, Carver (1979) has suggested
that when behavioral withdrawal is prevented,
a cognitive withdrawal may occur, manifested
either as a focus on one’s inability to do the
required behavior or as a mental dissociation
from the task, in which further attempts are
sporadic or halfhearted. Thus, in our view,
the kinds of deficits shown in helplessness
studies may actually depend on an impulse to
remove oneself, physically or cognitively, from
the behavioral context.

It is interesting to note, in this regard, that
there is indirect support for this notion from
at least one previous research project in the
helplessness area (Diener & Dweck, 1978).
Diener and Dweck studied the verbalizations
and behaviors of children, classified either as


5 Previous research has shown that exposure to 
mirror can increase the amount of cous 
tributed to the self for hypothetical ae 
(Duval & Wicklund, 1973; Buss & Scheier, ia 
Inasmuch as the internal-external attribui 
dimension plays an important role in the mo 
Abramson et al. (1978), it might appear ha 
model thus is capable of explaining the persis i 
differences between self-awareness condon 
tained in the present research. This does nol aA 
ever, appear to be the case. The Abran nid 
model seems to suggest that attributional a 
will affect only the person’s level of at 
following failure, not the extent to which 
tence occurs on subsequent tasks. Thus, any P 
nation involving attributional internality Wi 
seem to be inadequate to account for the Pi 


findings. 


“helpless” or “mastery-oriented,” who were
confronted with a series of task failures. The
response style of mastery-oriented children in- 
cluded the following: monitoring of their own 
behaviors; verbalization of task-oriented self- 
instructions; attempts to renew their concen- 
tration; statements of positive affect (“I love 
a challenge”); and statements reflecting a 
positive prognosis (“I’ve almost got it now”). 
Helpless children did not make statements 
reflecting monitoring of their behavior or self- 
instructions but were much more likely to 
make statements of negative affect (“This
isn’t fun anymore”) and statements that were
irrelevant to task solution (“There is a talent
show this weekend, and I’m going to be
Shirley Temple”). Diener and Dweck characterized
the latter category of statements as
reflecting psychological withdrawal from the
situation, which implicitly was not physically
escapable. This characterization is quite consistent
with our reasoning.
The second inadequacy of the Abramson
et al. model is, of course, the same as the one
discussed above regarding Bandura’s (1977)
analysis of fear-based behavior. Specifically,
it has no mechanism for taking into account
the effect of self-attention. Accordingly, it is
unable to account for the present pattern of
results. In supporting our theoretical model
(Carver, 1979; Carver et al., 1979), the
present findings thus suggest that the deficits
obtained in helplessness research may be due
in part to a process not yet addressed by helplessness
theorists: the focusing of attention on
oneself.


Reference Note 


1. Carver, C. S., & Scheier, M. F. The self-attention-
induced feedback loop and human motivation: A
Consistency and Bias in the Attribution of Attitudes


Icek Ajzen, Carol Ann Dalto, and Daniel P. Blyth 
University of Massachusetts—Amherst 


This article is based on the proposition that people attribute a disposition to 
an actor by evaluating its consistency with other information about the actor 
or the situation. This strategy is assumed to be accompanied by a cognitive set 
or bias to view ambiguous information as consistent with the hypothesized dis- 
position. Respondents were told that a student had chosen or had been assigned 
to write a proabortion or antiabortion paper, and they were or were not given 
an ambiguous description of the author’s personality. In support of predictions,
under choice conditions attitudes were always attributed in accordance with the
paper’s position, but under assignment conditions such attributions occurred
only when respondents received the ambiguous personality description.


A central focus of attribution theory is the
extent to which an actor’s behavior provides
information about his or her stable underlying
dispositions (cf. Ajzen, 1971; Jones & Davis,
1965; Steiner, 1970). Most analyses of the
attribution process (e.g., Heider, 1958; Jones
& Davis, 1965; Kelley, 1967) have likened
observers to naive scientists who, except for a
few motivational biases, make systematic use
of the information available to them. It has
recently been suggested, however, that instead
of using sophisticated information-processing
strategies, people rely on rather simple intuitive
heuristics (Tversky & Kahneman,
1974) or cognitive scripts (Abelson, 1976) in
making their judgments (see Slovic, Fischhoff,
& Lichtenstein, 1977, for a review). Although
these intuitive strategies often produce reasonable
judgments, they are said to be capable
of leading to systematic biases and errors
(Ajzen, 1977; Kahneman & Tversky, 1973;
Nisbett & Borgida, 1975).

This article deals with what Ross (1977)
has termed the fundamental attribution error,
the assumed tendency for observers to overestimate
the importance of dispositional fac-


The authors thank Ronnie Bulman for her comments
on an earlier draft of this paper.

Requests for reprints should be sent to Icek Ajzen,
Department of Psychology, University of Massachusetts,
Amherst, Massachusetts 01003.


Copyright 1979 by the American Psychol 


tors relative to environmental factors. We will
try to identify a cognitive strategy that may 
be in part responsible for any such bias and 
to specify some of the conditions under which 
this bias is likely to materialize. 

One area in which the fundamental attribu- 
tion error is assumed to operate is the attribu- 
tion of attitudes on the basis of observed (or 
reported) behavior. Jones and Harris (1967) 
reported that observers attribute attitudes to 
the author of an essay in accordance with the 
position advocated in the essay. Although 
weaker, this tendency also emerged under 
conditions in which situational constraints, in 
the form of an assigned position on the issue, 
appeared sufficient to explain the actor’s be- 
havior. This and other studies along the same 
lines (e.g., Jones, Worchel, Goethals, &
Grumet, 1971; Miller, 1976; Miller, Mayer- 
son, Pogue, & Whitehouse, 1977; Snyder & 
Jones, 1974) are usually taken as evidence for 
a pervasive bias on the part of observers to 
attribute an actor’s behavior to internal dis- 
positions rather than to environmental fac- 
tors. 

Note, however, that even under conditions 
of little choice, attribution of behavior to 
internal factors does not necessarily imply a 
bias in favor of dispositional explanations. 
The presence of a viable situational cause 
(such as assignment by a course instructor) 
should produce some discounting of the dispositional
explanation (Kelley, 1973), but it
need not completely discredit it. We might 
thus reasonably expect that under situational 
constraints, the position expressed in an essay 
would have a significant, though weaker, 
effect on the author’s perceived attitude. 

Before considering potential biases, we must 
explore the cognitive strategies people employ 
when asked to judge the likelihood that the 
author of an essay actually holds an attitude 
in accordance with the expressed position. 
This article is based on the proposition that in 
attributing an attitude or any other disposi- 
tion to an actor, observers evaluate the extent 
to which the disposition in question would be 
consistent with other available information. 
A similar strategy has been suggested by 
Ajzen and Fishbein (1975) in their applica- 
tion of Bayes’ theorem to causal attribution, 
and by Kruglanski, Hamel, Maides, and 
Schwartz (1978) in their analysis of lay 
epistemology. 

Evaluation of consistency in the validation 
of specific hypotheses concerning an actor’s 
dispositions can be described as follows. When 
judging the validity of a given dispositional 
attribution, the observer examines the extent 
to which the stipulated disposition is con- 
sistent with other information about the actor 
and about the context in which the behavior 
occurred. The greater the number of informational
items consistent with the disposition,
the more confident the observer will be that 
the actor in fact has the disposition in ques- 
tion. Information inconsistent with the stip- 
ulated disposition should lower the observer’s 
confidence.¹
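
The consistency strategy described above can be illustrated with a small Bayesian sketch, in the spirit of the Ajzen and Fishbein (1975) application of Bayes' theorem cited earlier. This is a hypothetical illustration, not the authors' model: the function name, the prior, and the likelihood ratios are assumptions chosen for demonstration, and the items are assumed to be conditionally independent.

```python
def posterior_probability(prior, likelihood_ratios):
    """Return the observer's confidence that the actor holds the disposition.

    prior: prior probability of the disposition (strictly between 0 and 1).
    likelihood_ratios: one value per informational item, defined as
        P(item | disposition) / P(item | no disposition).
        Values above 1 are consistent items, values below 1 are
        inconsistent items, and 1.0 is a nondiagnostic (ambiguous) item.
    """
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio  # each item shifts the odds by its diagnosticity
    return odds / (1.0 + odds)


# Hypothetical example: two consistent items raise confidence more than one,
# and a truly ambiguous item (ratio 1.0) leaves confidence unchanged.
one_item = posterior_probability(0.5, [2.0])
two_items = posterior_probability(0.5, [2.0, 2.0])
ambiguous_only = posterior_probability(0.5, [1.0])
```

In this sketch, the bias proposed in the next paragraph would correspond to an observer treating an item whose true ratio is 1.0 as if its ratio were above 1.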

The interpretation of information as con- 
sistent or inconsistent with a given disposi- 
tion is, however, not made in a vacuum. It is 
here proposed that these judgments are influ- 
enced, among other things, by prior beliefs 
concerning the validity of the hypothesized 
disposition. Once a person has formed a given 
dispositional hypothesis, he or she will tend to 
interpret new information as consistent with 
the disposition in question. A prior hypothesis 
is thus assumed to provide a “cognitive set” 
for the processing of new information. As is 
true of a perceptual set, this tendency to view 
new information as consistent with a prior
hypothesis should be particularly pronounced
in the case of ambiguous information that in
and of itself has little bearing on the validity
of the hypothesis under consideration. The
end result of such a tendency would be an
overestimation of the amount of evidence sup-
porting the hypothesis, in this instance the
disposition that is being attributed.

A bias toward perceiving ambiguous infor-
mation as consistent with prior hypotheses
could help explain attribution of attitudes
under situational constraints. In previous
studies, subjects were given a considerable
amount of information that may have been
interpreted as consistent with the hypoth-
esized attitude. For example, in their first ex-
periment, Jones and Harris (1967) described
the author of the essay as “a student at the
University of North Carolina, a resident of
the state, and the son of an automobile
[...]man.” Furthermore, the respondents read a
200-word essay defending or criticizing Cas-
tro’s Cuba. The direction of the essay pro-
vided the initial dispositional hypothesis. The
description of the author and the style and
organization of the material in the essay may
have been interpreted as consistent with the
inference that the author’s attitude corresponded
to his expressed position in favor of or against
Castro’s Cuba. Providing ambiguous back-
ground information may thus have attenuated
any tendency to discount the dispositional
hypothesis under conditions of situational
constraints.
One implication of this analysis is that if
the ambiguous background information were
removed, apparent support for the hypoth-
esized disposition would be greatly reduced or
eliminated, thus resulting in little disposi-
tional attribution under no choice conditions.
The study reported below was a[...]
part to test this prediction. In addition, we at-
tempted to provide a more direct test of the
operation of the cognitive [...]
accompany use of the consistency [...].
Information known to be ambiguous with re-
spect to the attribution in question was pro-
vided for some subjects. This information, in
the form of a general personality description,
was assumed to permit emergence of the con-
sistency bias and hence to increase disposi-
tional attributions in the presence of situa-
tional constraints.


1 The person is also assumed to take into account
each item’s diagnosticity or inform[...]
the disposition under consideration.


Method 
Subjects


A total of 256 male and female² undergraduate
psychology students received experimental credit for 
their participation in the study. They were assigned 
at random and in equal numbers to the 12 experi- 
mental and 4 control conditions (16 per cell). 


Procedure


The participants were scheduled in small groups 
and were handed a self-contained questionnaire that 
described the experiment as a study of the perception 
and interpretation of social events. They then read
about a student who had written a paper in favor of 
or against abortion. The stand taken on the issue 
either had been assigned to the student or had been 
a matter for his or her choice. For one-third of the 
participants, the freedom-of-choice information was 
followed by an ambiguous description of the student’s 
personality. The order of presentation was reversed
for another third of the participants. Here, the per-
sonality description appeared prior to the choice
information. The remaining participants received no
information about the student’s personality. Spe- 
cifically, the two conditions containing personality 
descriptions provided the following information. 


Peter G. is a student at Hampshire College. Last
semester he wrote a paper for a social science class 
in which he took a stand favoring [opposing] 
abortion on demand. The course instructor had
given the following instructions for the paper. 

Free choice condition. Based on the past week's 
readings and lectures, write a short essay either 
defending or criticizing abortion on demand. 

No choice condition. Based on the past week’s
readings and lectures, write a short essay defending 
[criticizing] abortion on demand. 


[The personality description appeared either prior
to or after this choice manipulation.]


The following paragraph gives you a short de-
scription of Peter’s personality. It was prepared by
a psychologist on the basis of a personality inven- 
tory administered to all students in the class. 


Peter G. prides himself as being an independent
thinker and he does not accept others’ opinions
without satisfactory proof. He has a great deal of 
unused capacity that he has not turned to his 
advantage. Peter has a tendency to be critical of 
himself. He has a strong need for other people to 
like and admire him. At times he has serious 
doubts as to whether he has made the right deci- 
sion or done the right thing. Disciplined and con- 
trolled on the outside, he tends to be worrisome 
and insecure on the inside. Some of his aspirations 
tend to be quite unrealistic. Peter has found it
unwise to be too frank in revealing himself to 
others. He prefers a certain amount of change and 
variety and becomes dissatisfied when hemmed in 
by restrictions and limitations. While he has some
personality weaknesses, he is generally able to com- 
pensate for them. 


This description contains 10 of the 12 statements 
used previously by Ulrich, Stachnik, and Stainton 
(1963), who showed that students readily accept such 
general statements as accurate descriptions of their 
personalities (see also Carrier, 1963; Forer, 1949). 

At this point (or immediately after the choice in- 
formation in the no personality conditions), attribu- 
tion of attitude was assessed. “What do you think is 
Peter G.’s true position on the question of abortion 
on demand? The probability that Peter G. favors 
(opposes) abortion on demand is ____%.”

To test our assumption that the personality de- 
scription was ambiguous or neutral with respect to 
this judgment, control subjects were given no infor- 
mation about the stand Peter G. had taken on the 
issue of abortion or about his freedom of choice. 
They were simply told that he had written a paper 
for a social science class concerning the issue of abor- 
tion on demand, and one-half of the control subjects 
were given the description of Peter G.’s personality.
On the basis of this information, they were asked to 
judge the probability that Peter G. favored (or op- 
posed) abortion on demand. 


Results 
Ambiguity of Personality Description 


A 2 × 2 analysis of variance was performed
on the control data. The factors in this anal- 
ysis were personality description (yes or no) 
and question format (in favor of or opposed 
to abortion). Responses were scored, with 
high numbers indicating a proabortion stand. 
That is, when subjects judged the probability 
that Peter G. was opposed to abortion, their 
responses were subtracted from 100. Neither 


2 Since there were no significant differences be-
tween males and females, the participants’ sex will 
not be further considered. 


Table 1 
Mean Estimates of Proabortion Attitudes Under 
Different Experimental Conditions 


                  No personality          Personality
                   description            description
Direction                   No                     No
of essay        Choice    choice       Choice    choice

Proabortion      80.44     57.19        68.50     63.59
Antiabortion     20.94     50.00        35.68     44.47


Note. n = 16 per cell in the no personality conditions
and n = 32 per cell in the personality conditions.


the main effects nor the interaction were 
statistically significant (F < 1.0 in each case).
Disregarding question format, the mean prob- 
ability of a proabortion attitude was judged 
to be 57% when the personality description 
was provided and 59.5% without the per-
sonality description. Clearly, then, in and of
itself the personality description had no ap- 
preciable effect on attitude attributions. 
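
The scoring rule used for the control data (and again for Table 1) can be written out explicitly. The sketch below is illustrative only; the function name and the format labels "favors" and "opposes" are assumptions, not terms taken from the original coding scheme.

```python
def proabortion_score(response, question_format):
    """Convert a probability judgment (0-100) to a proabortion score.

    response: the percentage the subject wrote down.
    question_format: "favors" if the subject judged the probability that
        Peter G. favors abortion on demand, "opposes" if the subject
        judged the probability that he opposes it.
    """
    if not 0 <= response <= 100:
        raise ValueError("response must lie between 0 and 100")
    if question_format == "favors":
        return response
    if question_format == "opposes":
        # Reverse-score so that high numbers always mean proabortion.
        return 100 - response
    raise ValueError("unknown question format")
```

Under this rule a judgment of 30% on the "opposes" format and a judgment of 70% on the "favors" format receive the same score.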


Attribution of Attitudes 


The data of the experimental conditions 
were submitted to a 2 × 2 × 3 analysis of
variance, with direction of essay (pro vs.
con), freedom of choice (choice vs. no
choice), and personality description (before 
choice information vs. after choice informa- 
tion vs. no personality description) as design 
factors. Since there were no significant differ- 
ences due to the positioning of the personality 
description (before or after the choice infor- 
mation), the data for these two conditions 
were pooled, resulting in a 2 × 2 × 2 design.
For ease of presentation, the data are dis- 
cussed only in terms of this latter design. 

Table 1 presents mean probability judg- 
ments, scored with high numbers again indi- 
cating attributions of proabortion attitudes. 
Table 2 shows the results of the analysis of 
variance. The significant main effect for direc- 
tion of essay indicates that, predictably, a stu- 
dent who wrote a proabortion essay was 
viewed as more likely to be in favor of abor- 
tion than a student who wrote an antiabor- 
tion essay. More interesting, the significant
interaction between essay direction and free-
dom of choice shows that this effect was
stronger in the choice condition than in the
no choice condition.

Most important, however, is the significant
three-way interaction. It indicates that the
presence or absence of an ambiguous per-
sonality description influenced the extent to
which the effect of essay direction was con-
tingent upon freedom of choice. The pattern
of means shown in Table 1 supports our hy-
potheses. Consider first the results obtained
when no personality description was provided.
It can be seen that as in previous studies,
essay direction had a strong effect on attrib-
uted attitudes under free choice conditions.
Contrary to previous findings, however, there
was virtually no difference due to direction of
the essays under no choice conditions. Post
hoc comparisons revealed that the effect due
to direction of essay was significantly greater
for choice than for no choice subjects, F(1,
184) = 20.97, p < .001. Moreover, the dif-
ference between 57.19 and 50.00 in the no
choice condition was not statistically signif-
icant (F = 1.21).

Very different results emerged when the
participants were given the ambiguous de-
scription of Peter G.’s personality. Here, sub-
jects attributed attitudes in accordance with
the position expressed in the essay even in the
presence of situational constraints. Essay di-
rection had a significant effect on attributions
under choice conditions, F(1, 184) = [...],
p < .001, by post hoc comparison, as well as
under no choice conditions, F(1, 184) = [...],
p < .005. Further, the effects due to di-
rection of the essays (proabortion or anti-
abortion) did not differ significantly under
the two freedom-of-choice conditions (F =
2.20).


Table 2
Analysis of Variance


Direction of essay (A)
Freedom of choice (B)
Personality description (C)
A × B
A × C
B × C
A × B × C
Error

Post hoc comparisons also revealed that the 
ambiguous personality description reduced the 
effect of essay direction in the free choice 
situation. The difference between proabortion
and antiabortion essays, given a personality
description (M = 32.82), was significantly
smaller, F(1, 184) = 8.34, p < .005, than the
difference between these essays when no per- 
sonality description was provided (M = 
59.50). 
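
The differences quoted in these comparisons follow directly from the cell means in Table 1. The sketch below recomputes them as a check on the arithmetic; the dictionary layout and function names are illustrative assumptions, and the code does not reproduce the F tests, which require the error term from Table 2.

```python
# Cell means from Table 1 (judged probability of a proabortion attitude).
# Keys: (personality description, freedom of choice, essay direction).
MEANS = {
    ("none", "choice", "pro"): 80.44,
    ("none", "choice", "anti"): 20.94,
    ("none", "no choice", "pro"): 57.19,
    ("none", "no choice", "anti"): 50.00,
    ("description", "choice", "pro"): 68.50,
    ("description", "choice", "anti"): 35.68,
    ("description", "no choice", "pro"): 63.59,
    ("description", "no choice", "anti"): 44.47,
}


def essay_effect(description, choice):
    """Pro minus anti essay mean within one description-by-choice cell."""
    pro = MEANS[(description, choice, "pro")]
    anti = MEANS[(description, choice, "anti")]
    return round(pro - anti, 2)


def choice_moderation(description):
    """How much larger the essay effect is under choice than no choice."""
    return round(essay_effect(description, "choice")
                 - essay_effect(description, "no choice"), 2)


def three_way_contrast():
    """Difference in choice moderation without vs. with the description."""
    return round(choice_moderation("none")
                 - choice_moderation("description"), 2)
```

Here `essay_effect("none", "choice")` gives the 59.50 and `essay_effect("description", "choice")` the 32.82 reported in the text, while the no choice cells give 7.19 and 19.12, respectively.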


Discussion 


The findings reported in this article are con- 
sistent with our analysis of dispositional at- 
tribution. When no personality description of 
the essay’s author was made available, re-
spondents’ judgments were strongly influenced
by the presence or absence of situational con-
straints. Attitudes in accordance with the
position advocated in the essay were at-
tributed to authors who were free to support
or oppose abortion on demand, while respon-
dents were uncertain as to the attitude of an
author who had written his essay under no
choice conditions. In fact, given situational
constraints and no personality description,
dispositional attributions were very close to
those made by control subjects who were
given no information about the position ad-
vocated in the essay. These results provide
little support for the argument that observers
have a pervasive tendency to attribute an
actor’s behavior to internal dispositions. When
situational constraints are salient and no other
information is available, there seems to be no
inclination to make dispositional attributions.

However, the availability of other informa-
tion in the form of an ambiguous description
of the actor’s personality had a profound
effect on dispositional attributions. Even
though by itself the description was found to
be completely uninformative with respect to
attitudes toward abortion, it increased dis-
positional attributions in the no choice condi-
tions. At the same time it was found to
weaken such attributions in the choice condi-
tions. Only the former effect was predicted,
but both effects may in retrospect be viewed 
as reflecting the same underlying process. 
When no personality description was pro- 
vided, all information available to the respon- 
dent was either consistent with the hypoth- 
esized disposition (in the choice conditions) 
or largely discredited by situational con- 
straints (in the no choice conditions). As a 
result, freedom of choice had a strong effect 
on attribution of attitudes. 

In contrast, even given the operation of a 
cognitive set to interpret ambiguous informa- 
tion as consistent with the hypothesized atti- 
tude, some of the information contained in 
the personality description is likely to have 
been viewed as contradicting the dispositional 
attribution. Taken as a whole, the available
information would still favor attribution of an 
attitude corresponding to the behavior, but 
the few items of contradictory evidence would 
tend to lower confidence in the attribution. 
The situational constraints in the no choice 
conditions would again act to discredit the 
behavioral information and thus to some ex- 
tent further reduce the strength of the disposi-
tional attributions. Consequently, freedom of
choice was found to have an effect on attribu- 
tion of attitudes, but its effect was not signif- 
icant and was much less pronounced than in 
the conditions where no personality descrip- 
tion was provided. 

This finding appears to contradict previous 
research (e.g., Jones & Harris, 1967), which 
has reported significant interactions between 
essay direction and freedom of choice. Note, 
however, that although previous research also 
provided some ambiguous background infor- 
mation, the personality conditions of the 
present study contained a much more elab- 
orate set of such information explicitly de- 
signed to be interpretable as consistent with 
a proabortion or antiabortion attitude. As a 
result, only a little discounting of the disposi- 
tional hypothesis was found to occur under no 
choice conditions when the personality de- 
scription was provided. 

The tendency revealed in the present study 
to view ambiguous information as consistent 
with a prior hypothesis could theoretically 
function in different ways. One possibility is
that a dispositional attribution in accordance
with the essay’s position becomes anchored in 
the personality description, and as a result 
subsequent refutational evidence (in the form 
of information about situational constraints) 
is rendered relatively ineffective (cf. Ross, 
1977). This account was not supported by the 
present study, since providing the personality 
description had the same effect whether it ap- 
peared prior to or after the freedom-of-choice 
information. Instead, information about the 
essay appears to have led respondents to hy- 
pothesize a corresponding disposition, and 
evidence consistent with this hypothesis was 
then sought in other available information. 
The cognitive set created by such an orienta- 
tion seems to produce a tendency to interpret 
ambiguous information as consistent with the 
hypothesis in question.


On the Requirements of Proof: The Timing of Judicial
Instruction and Mock Juror Verdicts


Saul M. Kassin and Lawrence S. Wrightsman 
University of Kansas 


At the close of a trial, the judge instructs the jury that the defendant is pre- 
sumed innocent, that the burden of proof is on the prosecution, and that guilt 
must be established beyond a reasonable doubt. In view of criticisms that the 
judge’s charge has no effect on jury decisions, the present study examined 
whether the timing of judicial instruction mediates its efficacy. One hundred 
seven mock jurors watched a 1-hour videotape of a trial and were instructed 
by the judge either before the evidence, after the evidence, or not at all. Re- 
sults for posttrial measures indicated that although the timing manipulation 
had no significant effect on the standards of reasonable doubt adopted by sub- 
jects, those who were instructed before the evidence viewed the defendant as 
less likely to have committed the crime and demonstrated a lower conviction 
rate than subjects in the instructions-after and no-instructions groups. Results 
for a series of mid-trial judgments of 54 subjects further indicated that pre- 
instructed subjects were less likely to convict throughout the trial. These find-
ings are discussed, and their implications for procedural reform in the court-
room are noted.


In trials by jury, the judge is obligated to 
instruct jurors in both general and case-spe- 
cific matters of law. These instructions serve 
a number of functions: The judge orients 
jurors in their task, outlines the undisputed 
facts and issues of the case, explains the rele- 
vant law, and informs jurors about pro- 
cedural matters (McBride, 1969). From the 
defendant’s perspective, perhaps the most 
crucial of the mandatory instructions is that 
concerning the “requirements of proof” (La- 
Buy, 1963). Specifically, the accused is en- 
titled to the instruction that he or she is pre- 
sumed innocent, that the burden of proof is 
on the prosecution, and that all elements of
the crime must be proven to a constitutional
standard of “beyond a reasonable doubt.” 
Although the potential importance of the 
judge’s charge is widely recognized (McCart, 
1964), its actual effectiveness is a subject of 
controversy. On the one hand, the courts as- 
sume that jurors understand their instructions, 
use them in making decisions, and are acutely 
sensitive to even minor variations in wording. 
Thus, attorneys often request specific in- 
structions, appellate courts have occasionally 
reversed verdicts on the basis of an improp- 
erly worded instruction, and many states cur- 
rently favor the practice of having judges 
recite a preapproved pattern instruction in 
order to ensure standardization. Experimental
support for the effect of variations in instruc- 
tional content has recently been obtained. 
Kerr et al. (1976) presented mock jurors 
with one of three definitions of reasonable 
doubt: one that described a lax criterion of
reasonableness (i.e., “you need not be abso- 
lutely sure that the defendant is guilty to find 
him guilty,” p. 286), one that described an 
extremely stringent criterion (i.e., “if you 
are not sure and certain of his guilt, you must
find him not guilty,” p. 286), and one in
which reasonable doubt was not defined. As 
it turned out, these varying definitions sig- 
nificantly influenced both individual and 
group verdicts—a lax criterion resulted in a 
high conviction rate, whereas a stringent one 
produced a low rate of conviction. Subjects 
for whom reasonable doubt was not expli- 
cated fell between these extremes. This study 
thus demonstrated that verdicts are indeed 
influenced by the reasonable doubt element 
of the requirements-of-proof instruction. 
On the other hand, it has been suggested 
by legal scholars (Frank, 1949) and research- 
ers (Sealy & Cornish, 1973) that these in- 
structions have little effect on jurors’ verdicts. 
One common criticism is that because they 
are written in statutory language, the instruc- 
tions are often confusing to laypersons un- 
trained in the law (Elwork, Sales, & Alfini, 
1977). In fact, one study revealed that 40% of 
375 sampled jurors reported that they did not 
understand their judge’s instruction (Hervey, 
1947). A second criticism is aimed at the tim-
ing of the judge’s charge. Although the pro-
cedure is not fixed by law, the jury is typically 
instructed at the close of the trial presenta- 
tion, that is, after the evidence has been pre- 
sented. Although it is possible that this se- 
quence increases the salience of the instruc- 
tion and its availability for recall during 
deliberation, a number of sources (e.g., Mc- 
Bride, 1969) have questioned the utility of 
an instruction that is given at a stage where 
jurors might have already decided on a ver- 
dict. Kalven and Zeisel (1966) noted that 
jurors often form very definite opinions about
a defendant’s guilt or innocence before the 
close of the trial. Accordingly, Judge E. Bar- 
rett Prettyman (1960) argued the following: 


It makes no sense to have a juror listen to days of 
testimony only then to be told that he and his con- 
ferees are the sole judges of the facts, that the 
accused is presumed to be innocent, that the govern- 
ment must prove guilt beyond a reasonable doubt, 
etc. What manner of mind can go back over a 
stream of conflicting statements and alleged facts, 
recall the intonations, the demeanor, or even the 
existence of the witnesses, and retrospectively fit 
all these recollections into a pattern of evaluation 
and judgment given him for the first time after the 
events; the human mind cannot do so. . . . Why 
should not the judge, when the jury is sworn, then 
and there tell them the rules of the game. (p. 1066)

In view of the flexibility in courtroom pro-
cedure and the potential importance of the
judge’s charge to the jury, the absence of
research on temporal factors associated with
the instructions is surprising (Elwork, Sales,
& Alfini, 1977, is an exception). After all,
order effects in the perception of unfolding
behavior sequences have repeatedly been
noted in both person perception (Jones et al.,
1968) and jury contexts (Walker, Thibaut,
& Andreoli, 1972). In the latter [...],
Walker et al. (1972) varied the order in
which the prosecution and defense presented
their cases but did not address the issue of
judicial instruction.

The present study was designed to investi-
gate the relationship between the timing of a
judge’s instructions and mock juror verdicts
for a criminal case. In particular, the aims of
the research were twofold. First, we sought
to determine whether or not a prototypical,
ecologically valid instruction on the require-
ments of proof influences jurors’ decisions
and whether the timing of that instruction
mediates its efficacy. In order to achieve these
goals, requirements-of-proof instructions were
gleaned from those currently employed (De-
Vitt & Blackmar, 1977; LaBuy, 1963; Mc-
Bride, 1969) and were introduced to subjects
before testimony, after testimony, or not at
all. At the conclusion of the trial presenta-
tion, subjects rendered their individual ver-
dicts and answered a number of other case-
related questions. Since the requirements of
proof are defendant oriented, their effective-
ness should be manifested in a lowered rate
of conviction. Based on jurors’ tendency to
make early decisions about guilt or inno-
cence and Judge Prettyman’s (1960) reason-
ing that instructions can have their
effects only when they are delivered before
jurors have made up their minds, a
primacy or inoculation effect was predicted.
That is, the present instructions should pro-
duce fewer guilty verdicts when presented be-
fore the evidence than when presented after.

The second general aim of the present
study was to explore the [...] cog-
nitive process that underlies the proposed
order effect. Requirements-of-proof instruc-
tions may operate either by decreasing the
belief that the defendant committed the crime
(i.e., a lowered probability-of-commission
estimate) or by increasing the standard or
threshold to which that likelihood is compared
(i.e., a stringent interpretation of reasonable
doubt). Consequently, both were assessed. A
collateral issue addressed here was whether
or not subjects who are instructed prior to
the presentation of evidence evaluate eviden-
tiary information differently as it unfolds
from those who do not receive prior instruc-
tion. Accordingly, half the subjects in each
instruction group made judgments of guilt-
innocence at various points during the trial.
It was hypothesized that instructed subjects
would set a higher standard by which to
evaluate the case against the defendant and
would therefore be less influenced by the
prosecutor’s case throughout the trial.


1 Jones and Goethals (1971) have [...]
“recall readiness” hypothesis for [...]


Method 
Subjects and Design 


A total of 107 introductory psychology students 
(47 male, 60 female) participated in the study. The 
experiment was conducted in 25 small groups rang- 
ing in size from 3 to 7 and took 1 hour and 20 
minutes to complete. Each group was randomly
assigned to one of six cells produced by the 3
(Judge’s Instructions Before Evidence, Instructions
After Evidence, No Instructions) × 2 (Multiple
Judgments vs. Single Judgment Only) factorial
design.


Procedure 


Upon entering, subjects were told that they would 
observe a videotaped trial simulation, after which
they would be asked to render a verdict. They
were further instructed that as jurors they should
be attentive but should not take notes and should
not converse while the trial presentation was in
progress.

All subjects were then told that in order to get
the entire trial presentation on one reel of tape, a
few pauses and meaningless exchanges had been
deleted, but that all the testimony remained intact.
Those groups who were to receive the judge’s in-
struction, however, were informed that this instruc-
tion had inadvertently been deleted but would be
read to them at the appropriate time from the
[...] transcript. Finally, subjects in the multiple-
judgment cells were informed of the fact that at
certain points the tape would be stopped and their
judgments would be assessed. The trial was then
presented.


The Trial 


The stimulus trial was one that had previously 
been employed (Juhnke et al., 1979). Stylistically,
the simulation was presented on a 1-hour (black 
and white) videotape in which a number of law 
students retried a case in a realistic courtroom 
setting. It was filmed from a juror’s perspective: 
the judge, attorneys, and witness stand were all in 
view.

Substantively, the trial was based on an actual 
criminal case in which the defendant, Ronald Oli- 
ver, was charged with stealing a car and with 
transporting it across state lines.² Though continu-
ously presented, the trial consisted of the following
three distinct phases: (a) opening statements by 
prosecutor and defense; (b) direct examination, 
cross-examination, and redirect examination of two 
prosecution witnesses (the salesman from whom 
the vehicle was stolen and the arresting officer) and 
one defense witness (the defendant); and (c) clos- 
ing arguments of counsel (prosecution, defense, 
prosecution). The judge’s instructions on the re- 
quirements of proof represented a fourth phase 
whose presence-absence and timing were varied.

In one condition, these instructions appeared prior
to the introduction of evidence (i.e., between the
first and second phases). In a second condition, they
appeared after the closing arguments (i.e., after the
third phase). In a third condition, no instruction 
was given. The specific instruction employed was 
neither strong nor weak. Rather, it was patterned 
after the approved instructions designed to convey 
each element of the requirements of proof: pre- 
sumption of innocence, burden of proof, and rea- 
sonable doubt (see DeVitt & Blackmar, 1977; Mc- 
Bride, 1969). The instruction read as follows:


Ladies and gentlemen of the jury—at this point 
I want to emphasize that the law presumes the 
defendant, Ronald Oliver, to be innocent unless 
proven otherwise. A defendant begins the trial
with a “clean slate” with no evidence against 
him. 


This presumption places the burden not upon the 
defendant to prove his innocence but, on the con- 
trary, the burden is on the prosecution to con- 
vince you beyond any reasonable doubt that the 
defendant, Ronald Oliver, committed the crime. 
That burden never shifts at any stage of the 
proceeding to the defendant. Ronald Oliver has
no obligations of any kind to go forward and 
prove that he is innocent. 


You have now heard the term “reasonable doubt.”
What is it? It is a doubt based upon reason and
common sense—the kind of doubt that would
make a reasonable person hesitate to act in im-
portant matters. To summarize, the defendant is
presumed innocent, so the prosecution must prove
to your satisfaction beyond any reasonable doubt
that Ronald Oliver is guilty. If two conclusions
can reasonably be drawn from the evidence—
one of innocence and one of guilt—the jury
should adopt the one of innocence.


2 A transcript taken verbatim from the videotape
and a summary of the trial are available upon
request.


Table 1
Pattern of Final Verdicts in Each of
the Six Cells


                    Multiple         Single
                   judgments        judgments
Instruction
and verdict        No.     %        No.     %

Before
  Guilty            7     39         6     35
  Not guilty       11     61        11     65
After
  Guilty           12     63        10     56
  Not guilty        7     37         8     44
None
  Guilty           12     71        10     56
  Not guilty        5     29         8     44


Dependent Measures 


At the close of the trial, all subjects responded 
individually and without deliberation to a two- 
page questionnaire in which they first rendered a 
dichotomous judgment (guilty-not guilty) and indi- 
cated their confidence in that verdict on a 0-8- 
point scale. They then rated the strength of the 
evidence as well as their interest and involvement 
in the case (all on 0-8-point scales). 

Since verdicts are a function of the perceived 
probability that the defendant committed the crime 
and of the standard of proof deemed necessary for 
conviction, both of these variables were also as- 
sessed. All subjects were thus asked, “What is the 
likelihood that the defendant committed the crime?” 
to which they responded by circling a number from 
0 to 100 (in multiples of 5), and “A defendant
should be found guilty if there is at least a
____% chance that he committed the crime.”
Finally, subjects answered 16 short-answer recall 
questions that pertained to the major facts of the 
case (e.g., “On what highway was Ron Oliver
stopped?”). The total number of correctly recalled 
items served as a measure of fact recall. 
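
The logic behind the two measures can be sketched as a simple threshold rule. This is an illustrative reading of the statement that verdicts depend on both the perceived probability of commission and the standard of proof; it is not a model reported by the authors, and the function name and example values are hypothetical.

```python
def predicted_verdict(probability_of_commission, standard_of_proof):
    """Predict a verdict from the two posttrial percentage estimates.

    probability_of_commission: judged likelihood (0-100) that the
        defendant committed the crime.
    standard_of_proof: minimum percentage chance (0-100) the subject
        says is required before a defendant should be found guilty.
    """
    if probability_of_commission >= standard_of_proof:
        return "guilty"
    return "not guilty"


# Hypothetical juror: 80% sure of commission, requires at least 75%.
baseline = predicted_verdict(80, 75)
# An instruction could change the verdict through either route:
lower_belief = predicted_verdict(70, 75)     # lowered probability estimate
higher_standard = predicted_verdict(80, 90)  # more stringent threshold
```

Measuring both quantities separately is what allows the two routes to be told apart, since either one alone would produce the same shift in verdicts.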

In addition to providing these outcome data, 
half the subjects indicated their judgments (guilty- 
not guilty), confidence values, and probability-of- 
commission estimates at six distinct points during 
the trial—after both the direct examination and 
cross-examination of each witness. Specifically, they
were asked, “If the trial ended now, would you
[...] that the defendant is guilty or not guilty?”
“How confident are you in this judgment?” (0-8),
and “What is the likelihood that the defendant
committed the crime?” (0-100). It was thus possible
to trace beliefs as the trial unfolded and to examine
whether subjects who were already instructed on the
requirements of proof (instructions before) evaluated
the evidence differently from those who had not
yet been instructed (instructions after and no
instructions). Previous research employing these
multiple judgments (e.g., Weld & Danzig, 1940) has
been criticized on the ground that such a procedure
may bias final verdicts (Davis, Bray, &
Holt, 1977). For that reason, only half the subjects
in the present study made these [...]
responses. A comparison of their posttrial responses
with those of single-judgment subjects thus provided
a test for the (non)reactivity of the procedure.


Results 


Outcome Measures 


In order to test for all main and interaction
effects and to determine which model
best fits these categorical data, these dichotomous
judgments were analyzed with a likelihood
ratio (goodness of fit) chi-square (Fienberg,
1977). Results indicated that the
simplest model that described the observed
frequencies was the two-way interaction between
instructional set and verdicts, χ²([df illegible]) =
1.21, p > .97. None of the more complex
models (i.e., those involving the multiple vs.
single judgments factor) contributed any
explanatory power to this Instructional Set ×
Verdict model. Put another way, the pattern
of verdicts was accounted for by a main
effect for timing of instruction. Table 1 shows
that the instructions-before condition produced
37% guilty verdicts, compared with [...]
in the instructions-after and 63% in the no-
instructions conditions.
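
For readers unfamiliar with the statistic, the likelihood ratio chi-square compares observed cell frequencies with those expected under a candidate model; a small value with a large p means the model reproduces the data well. The Python sketch below shows only the general formula, G² = 2 Σ O ln(O/E). The function name and the counts are hypothetical, and the sketch does not reproduce the model-fitting procedure used in the study.

```python
import math


def likelihood_ratio_chi_square(observed, expected):
    """Compute G^2 = 2 * sum(O * ln(O / E)) over all cells.

    Cells with an observed count of zero contribute nothing to the sum.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        if e <= 0:
            raise ValueError("expected frequencies must be positive")
        if o > 0:
            total += o * math.log(o / e)
    return 2.0 * total


# Hypothetical cell counts: a model that reproduces the data exactly gives 0.
print(likelihood_ratio_chi_square([12, 8], [12.0, 8.0]))  # 0.0
```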

A scalar variable was defined by combining
subjects’ verdicts with their confidence ratings
(confidence itself was unaffected by the
independent variables). Specifically, positive
confidence values were assigned to guilty verdicts
and negative values to verdicts of not
guilty. Scores could thus range from −8
(maximum confidence in not-guilty verdict)
to 8 (maximum confidence in guilty verdict).
A 2 × 3 analysis of variance on this measure
revealed one marginally significant effect for
timing of instruction, F(2, 101) = 2.90, p <
.06. Duncan’s multiple-range test further indicated
that subjects in the instructions-before
condition were less likely (p < .05) to convict
the defendant than were noninstructed
subjects (means of −.94 and 2.35, respectively).
Instructions-after (M = 1.79) and
noninstructed conditions did not so differ. An
analysis of variance on the probability-of- 
commission estimates yielded results that 
closely paralleled those for the verdict-confi- 
dence measure—the only significant effect 
was for timing of the judge’s instruction, 
F(2, 101) = 2.92, p < .06. Only subjects who 
received instructions before the evidence 
viewed the defendant as less likely to have 
committed the crime (p < .05) than did the 
noninstructed subjects (mean percentage estimates
of 64% and 77.6%, respectively);
those who were instructed after the reception 
of evidence (M = 73.1%) did not lower 
their probability-of-commission estimates. 
Contrary to predictions, no significant dif- 
ferences were obtained on evaluations of evi- 
dence strength, interpretations of reasonable 
doubt, or self-ratings of interest and involvement.
Interestingly, a close look at the reasonable-doubt
data indicates that the overall
estimate of reasonable doubt (M = 86.07)
was almost identical to that previously reported
for college students (Simon & Mahan,
1971, obtained a general estimate of 87%
and an estimate of 85% when the crime was
auto theft). Subjects thus believed that there
should be at least an 86% chance that the
defendant committed the crime in order to
vote for conviction.
Finally, subjects demonstrated a moderately
high level of recall, averaging 12.15 correctly
recalled items from a total of 16.³ Further
analysis indicated that a main effect was
obtained for timing of instruction on the
number of case-related facts recalled, F(2,
101) = 3.26, p < .05. Surprisingly, subjects
in the instructions-after condition recalled
fewer items (M = 11.30) than those in either
the instructions-before (M = 12.51) or noninstructed
(M = 12.69) conditions. The
presence of the instruction between the evidence
and response assessment apparently
inhibited recall.

Table 2

Pattern of Mid-Trial Verdicts for the Three
Instruction Groups at Each of the Six
Decision Points

                      Decision point
Instruction
and verdict      1    2    3    4    5    6

Before
  Guilty        [values illegible in source]
  Not guilty    [values illegible in source]
After
  Guilty        16   16   17   16   13   15
  Not guilty     3    3    2    3    6    4
None
  Guilty        15   14   15   14   14   12
  Not guilty     2    3    2    3    3    5

Note. Decision points 1-6 immediately follow the direct
examination and cross-examination of the prosecution’s
two witnesses and the defendant.

In sum, the mere timing of instructions 
had an impact on jurors’ final judgments.
Results for the verdict-confidence measure
tended to support the major hypothesis that
instructions on the requirements of proof
would reduce subjects’ tendencies to convict
(i.e., relative to no instructions) only when
delivered prior to the introduction of testimony.
Although the timing variable had no
significant effect on interpretations of reasonable
doubt, preinstructed subjects actually
viewed the defendant as less likely to have
committed the crime than either the instructions-after
or no-instructions subjects. Those
who received instructions at the end of the 
trial did not respond differently on the ver- 
dict and probability-of-commission measures 
from those who were never instructed. In fact, 
they even recalled fewer of the case-related 


³ Only the major facts in the case were tested.
Moreover, these facts were typically repeated
throughout the trial proceedings (e.g., the highway
on which Ron Oliver was stopped was first mentioned
during the prosecutor’s opening statement
and was reiterated during the examination of both
the arresting officer and the defendant and during
the closing arguments). As a result, the facts tested
could not be classified for further analysis as
proprosecution or prodefense, nor could they be
located at a single point in the trial.
Figure 1. The pattern of verdict-confidence scores for the three groups at each of
six decision points. (Higher scores indicate confidence in guilt.) [Plot of
verdict-confidence scores against decision points.]


facts than all other subjects did. Finally, 
whether or not subjects made mid-trial judg- 
ments did not affect any of the above varia- 
bles. The nonreactive nature of this multiple- 
judgment procedure was thus confirmed. 
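
The verdict-confidence measure used in these analyses can be sketched in Python as follows. This is a minimal illustration, assuming that the 0-8 confidence rating keeps a positive sign for a guilty verdict and takes a negative sign for a not-guilty verdict, so that scores run from −8 to 8. The function name and input format are hypothetical.

```python
def verdict_confidence(verdict: str, confidence: int) -> int:
    """Combine a dichotomous verdict with a 0-8 confidence rating.

    Guilty verdicts keep a positive sign and not-guilty verdicts take a
    negative sign, so scores range from -8 to 8 (assumed scoring).
    """
    if verdict not in ("guilty", "not guilty"):
        raise ValueError("verdict must be 'guilty' or 'not guilty'")
    if not 0 <= confidence <= 8:
        raise ValueError("confidence must be between 0 and 8")
    return confidence if verdict == "guilty" else -confidence


# Hypothetical responses from three mock jurors.
responses = [("guilty", 6), ("not guilty", 8), ("not guilty", 2)]
scores = [verdict_confidence(v, c) for v, c in responses]
print(scores)  # [6, -8, -2]
```

A group mean below zero on this scale, such as the −.94 reported for the instructions-before condition, therefore indicates a net leaning toward acquittal.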


Process (Mid-Trial) Measures 


Overall, 68% of the mid-trial responses 
were guilty verdicts. For the 54 subjects who 
made these judgments, the pattern of verdicts 
is presented in Table 2. As with the outcome 
data, these verdicts were analyzed with a 
likelihood ratio goodness-of-fit chi-square. 
Again, the simplest model that explained the 
observed frequencies was the two-way interac- 
tion between instructional set and verdicts, 
χ²(30) = 6.50, p = 1.0. Although a higher
order model comprising both Instruction ×
Verdict and Decision Point × Verdict interactions
also fit the data, χ²(20) = 3.13, p =
1.0, it did not contribute significantly to the
predictions made by the simpler model. Mid-trial
verdicts were thus accounted for by a main
effect for instructional set. Table 2 shows that
across all decision points the instructions-
before group yielded 41% guilty verdicts,
compared to 82% in both the instructions-
after and no-instructions groups.

A verdict-confidence score was again
created and was this time submitted to a
3 (Instructions) × 6 (Decision Points) analysis
of variance. Figure 1 illustrates the
main effect for the instruction factor, F(2,
51) = 10.15, p < .001, on verdict-confidence
scores. As before, this effect indicated
that subjects who were instructed before
the evidence (M = [...]) were consistently
less likely to convict (p <
.001) than were either instructions-
after or noninstructed subjects (Ms = [...]
and 3.65, respectively). The latter two groups
did not differ. An additional main effect for
the decision point factor, F([...]) = [...],
p < .001, indicated, as expected, that
scores fluctuated widely as the trial progressed.
As might be anticipated, the conviction
rate was highest at the third point,
right after the prosecutor examined his second
witness (p < .01), and lowest at the
fifth point, right after the direct examination
of the defendant (p < .01). The interaction
between instructional set and decision
point did not approach significance (F < .1).

On the question of whether jurors’ early
impressions were predictive of their final
predeliberation verdicts, Table 3 presents the
correlations between subjects’ mid-trial and
final verdict-confidence scores. It can be seen
that over all multiple-judgment groups the
average correlation between mid-trial and
posttrial verdict scores was .44 (p < .01). In
fact, the average correlation between the verdict
scores rendered at even the first decision
point and those given at the end of the trial
was .28 (p < .05). To some extent, then,
subjects’ decisions were substantially formed
very early in the trial presentation.

Table 3
Correlation Coefficients of Mid-Trial and Final Verdict-Confidence Scores
in Each Group and Across All Groups

                    Verdict-confidence score

Group        1       2       3       4       5       6
Before      .14     .42     [?]**   .46**   .42     .54*
After       .22     .22     .18     [?]    −.03     [?]
None        .31     .30     .37     .42     .45     .80***
Overall     .28*    [?]     [?]     .44**   [?]     [?]

Note. Correlations are based on ns of 18, 19, and 17 (N = 54).
Entries marked [?] are illegible in the source.
* p < .05. ** p < .01. *** p < .001.

Figure 2. The pattern of probability-of-commission estimates for the three groups at each of six
decision points. [Plot of probability-of-commission estimates against decision points.]

Results for the repeated probability-of-
commission measures followed a similar pattern
(see Figure 2). A main effect for timing
of instruction, F(2, 51) = 6.58, p < .005, revealed
that during the trial, the instructions-
before group (M = 52.5%) viewed the de- 
fendant as less likely to have committed the 
crime (p < .01) than either the instructions- 
after or no-instructions groups (68.16% and 
71.67%, respectively), who again did not 
differ from each other. Moreover, a main ef- 
fect for the decision point factor, F(5, 255) = 
3.08, p < .01, indicated that the perceived 
probability of commission was lowest at the 
fifth point—right after the defendant testi- 
fied in his own behalf. 

In sum, subjects’ mid-trial judgments fluc- 
tuated in the predicted directions (i.e., guilty 
judgments after the prosecutor’s examination 
and not-guilty judgments after the defense’s 
examination). More important, these judg- 
ments were influenced largely by whether or 
not subjects had been instructed on the re- 
quirements of proof. That is, although in- 
structional set and decision point did not 
interact (i.e., their curves were almost per- 
fectly parallel, thereby indicating that pre- 
instructed subjects were neither less influenced 
by the prosecutor’s testimony nor more in- 
fluenced by the defendant’s testimony), sub- 
jects who had been instructed before the 
evidence immediately and continually viewed 
the defendant as less likely to have committed 
the crime. 


Discussion 


The present study demonstrated what 
might be described as a primacy effect. A 
judge’s instruction to the jury was effective 
when delivered prior to but not after the 
presentation of evidence. That is, mock jurors 
who were instructed on the presumption of 
innocence, burden of proof, and reasonable 
doubt before observing the testimony were 
ultimately less likely to vote for conviction. 
It was noted earlier that verdicts are a
function of the perceived likelihood that the 
defendant committed the crime and of the 
standard or threshold to which that likeli- 
hood is compared. In the present experiment, 
variations in the timing of the instruction 
affected estimates of the probability of com- 
mission but not interpretations of reasonable 
doubt. Subjects who received the instructions
before the evidence thus demonstrated a lower
rate of conviction because they actually
viewed the defendant as less likely to have
committed the crime.
Although the present experiment was not
theoretically guided, it now appears that an
information integration model of juror judgments
(Kaplan & Kemmerick, 1974) might
well describe the results. Briefly, each piece
of evidentiary and nonevidentiary information
possesses some scale value along a dimension
of guilt-innocence and a weight that determines
the importance of that information. A
trial judgment is then formed on the basis of
a weighted-average combination of stimulus
components. In the context of the present
study, subjects received two global categories
of information: the trial presentation and
judicial instruction. From the probability-of-
commission responses of noninstructed subjects,
the scale value of the trial information
may be estimated at .78 (i.e., .78 is the marginal
mean for noninstructed subjects). Although
the scale value of the requirements-of-
proof instruction was not assessed in the
present study, it should, by legal ideal, have
implied a scale value or initial probability of
commission of 0 (Ostrom, Werner, & Saks,
1978). On the assumption that the scale values
of the instruction and trial presentation
remained constant (these assumptions are supported
indirectly by the lack of a timing effect
on interpretations of reasonable doubt and
ratings of evidence strength, respectively), to
what do we attribute the effects for the timing
manipulation? Within an impression formation
paradigm, Anderson (1965) found
that later adjectives keep a fixed scale value
but decrease in their weight. This model
describes the present results: the prodefendant
instruction received a greater weight when
delivered before than when delivered after
the evidence.⁴
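
The weighted-average account can be made concrete with a short Python sketch. It assumes only two components, the instruction (scale value 0) and the trial presentation (scale value .78), with relative weights that sum to 1. The function names are hypothetical, and the back-calculated weight is an illustration based on the 64% posttrial mean for the instructions-before group, not a value reported in the study.

```python
def integrated_judgment(scale_values, weights):
    """Weighted average of scale values, as in an averaging model."""
    if len(scale_values) != len(weights):
        raise ValueError("scale_values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(s * w for s, w in zip(scale_values, weights))
    return weighted_sum / total_weight


def implied_instruction_weight(judgment, s_instruction, s_trial):
    """Relative instruction weight that reproduces a judgment.

    Assumes judgment = w * s_instruction + (1 - w) * s_trial.
    """
    if s_instruction == s_trial:
        raise ValueError("scale values must differ to identify the weight")
    return (s_trial - judgment) / (s_trial - s_instruction)


# Illustrative back-calculation for the instructions-before group.
w = implied_instruction_weight(0.64, s_instruction=0.0, s_trial=0.78)
print(round(w, 2))  # 0.18
print(round(integrated_judgment([0.0, 0.78], [w, 1 - w]), 2))  # 0.64
```

On these assumptions an instruction delivered after the evidence would carry a relative weight near zero, since the instructions-after means stayed close to the noninstructed means.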
What process underlies these findings? At
least two plausible mechanisms are worth
evaluating. It was hypothesized that preinstructed
subjects would demand a greater
burden of proof when evaluating the strength


⁴ Note that although the present analysis [...]
judges’ instructions as information [...]
of the prosecutor’s case as it unfolds during
the trial. This critical, evaluative “schema”
was expected to manifest itself in different
fluctuating patterns in the responses of the
instructed (instructions-before) and noninstructed
(instructions-after and no-instructions)
mid-trial judgment groups. However,
the absence of an Instruction × Decision
Point interaction on verdicts and probability-
of-commission ratings ruled out this hypothesis.
In fact, the three instruction groups demonstrated
remarkably similar mid-trial shifts
in judgment (see Figure 1). Those who had 
been instructed were neither more influenced 
by the defense nor less influenced by the 
prosecutor than the others were. Instructed 
subjects thus did not reject or distort the 
prosecutor’s evidence to fit an initial impres- 
sion or presumption of innocence. And why 
should they? Luchins (1957) has demon- 
strated that by forewarning observers of the 
imminence of additional information, the 
proactive effects of early information on sub- 
sequent information are suppressed. Subjects 
in the present study and jurors in general 
expect to be confronted with inconsistent 
information. Early information in this setting 
(i.e., judicial instruction) is thus unlikely to
produce the strong bias that would stimulate 
the processes of discounting or assimilation. 

The process that did appear to operate was 
considerably less complex. Subjects who were 
instructed before the evidence were more 
likely than the others were to vote for ac- 
quittal and indicated a lower probability of 
commission even from the first decision point. 
Although they responded similarly to testimony
that followed, their initial leanings
resulted in fewer guilty verdicts at the conclusion
of the trial. The impact of this initial
reaction is reflected in the significant correla- 
tion between first and final verdict scores. 


value and weight, Kaplan and his colleagues are
quick to point out that, strictly speaking, because
instructions do not pertain to the specific defendant
or crime, they do not relate to information
scale value (S₁). Instead, instructions should affect
the initial impression of the defendant that exists
prior to jurors’ receiving information about him or
her (S₀), that is, the impression of defendants in
general (see Kaplan & Miller, 1978).
Simply put, most preinstructed subjects
“presumed innocent,” whereas the others 
“presumed guilty.” 

The practical implications of the present 
study are straightforward. Mandatory code 
provisions do not forbid or discourage judges 
from using their discretion in instructing the 
jury before the trial. A number of states 
(e.g., Indiana, 1966), individual judges (e.g., 
Prettyman, 1960), and legal scholars (e.g., 
McBride, 1969) have thus adopted or advo- 
cated delivery of preliminary instructions 
prior to the presentation of evidence. In 
Missouri (1964), for example, the Supreme 
Court Committee on Jury Instructions rec- 
ommended that “the jury be instructed before 
the trial begins. . . . The committee believes 
that it is better to draw the jury’s attention 
to these matters before the trial rather than 
waiting until after the jurors may have 
reached a decision” (quoted in McBride, 
1969, p. 62). This latter example is unfortu- 
nately an exception rather than the rule, 
which is that preliminary instructions are 
vastly underutilized (DeVitt & Blackmar, 
1977). Viewed in this context, the present 
study provides firm support for proponents 
of procedural reform in the courtroom. The 
Anglo-American system of justice has tradi- 
tionally favored the accused on the philosophy 
that acquitting a truly guilty person is better 
than convicting a truly innocent one (Plos- 
cowe, 1935). Yet the present results suggest 
that the accused may not benefit from this 
protective instruction unless it is delivered to 
jurors before the trial. In this regard, perhaps 
the requirements of proof might well be in- 
cluded in the currently popular “juror hand- 
books” that are distributed to jurors as part 
of a pretrial orientation (National Institute 
of Law Enforcement and Criminal Justice, 
1975) and might perhaps be presented at 
both the beginning and the end of the pro- 
ceedings (Elwork et al., 1977). 

From a methodological standpoint, the 
multiple-judgment procedure initially em- 
ployed by Weld and Danzig (1940) merits 
increased consideration in future research.
The present study confirmed that it does not 
bias jurors’ ultimate verdicts or any other 
posttrial measures (also see Pyszczynski, Note 
1). Moreover, it appears to be valuable for
examining jurors’ reactions to the trial as it
unfolds. Questions concerning the relative
impact of different trial phases (e.g., opening
statements and closing arguments, direct examination
vs. cross-examination) and different
kinds of evidence (e.g., tangible exhibits, expert
or eyewitness testimony) may be fruitfully
investigated through the mid-trial assessment
of verdicts.

Finally, some of the limitations of the
present results deserve mention. As with other
nonevidentiary factors, the timing of an instruction
cannot be expected to influence the
outcome of a one-sided case (i.e., an extremely
strong or weak case against the
defendant). Rather, the effects should be
considered limited to close, ambiguous cases.
Note also that the present experiment assessed
the verdicts of nondeliberating jurors.
Whether or not deliberation provides enough
of a corrective strategy to erase the timing
effect in juries remains an open question.
Finally, the trial used in the present study
was much shorter than the average criminal
case. Whether or not the timing manipulation
would affect jurors’ responses in realistically
longer trials remains to be seen.

Fienberg, S. E. The analysis of cross-classified
categorical data. Cambridge, Mass.: MIT Press, 1977.
Frank, J. Courts on trial: Myth and reality in
American justice. Princeton, N.J.: Princeton
University Press, 1949.
Hervey, J. C. Jurors look at our judges. Oklahoma
Bar Association Journal, 1947, 25, 1508-[...].
Indiana pattern jury instructions. Indianapolis:
Bobbs-Merrill, 1966.
Jones, E. E., & Goethals, G. R. Order effects in
impression formation: Attribution context and the
nature of the entity. Morristown, N.J.: General
Learning Press, 1971.
Jones, E. E., Rock, L., Shaver, K. G., Goethals, G.
R., & Ward, L. M. Pattern of performance and
ability attribution: An unexpected primacy effect.
Journal of Personality and Social Psychology,
1968, 10, 317-340.
Juhnke, R., Vought, C., Pyszczynski, T. A., Dane,
F. C., Losure, B. D., & Wrightsman, L. S. Effects
of presentation mode upon mock jurors’ reactions
to a trial. Personality and Social Psychology
Bulletin, 1979, 5, 36-39.
Kalven, H., & Zeisel, H. The American jury. Boston:
Little, Brown, 1966.
Kaplan, M. F., & Kemmerick, G. D. Juror judgment
as information integration: Combining evidential
and nonevidential information. Journal of
Personality and Social Psychology, 1974, 30, 493-499.
Kaplan, M. F., & Miller, L. E. Reducing the effects
of juror bias. Journal of Personality and Social
Psychology, 1978, 36, 1443-1455.
Kerr, N. L., Atkin, R. S., Stasser, G., Meek, D.,
Holt, R. W., & Davis, J. H. Guilt beyond a
reasonable doubt: Effects of concept definition
and assigned decision rule on the judgments [...]
Expectancy Shifts and the Expectancy Confidence Hypothesis


Richard W. Wollert 
Portland State University 


Explanations of differences in the size of expectancy changes following task 
outcomes were considered. The control perception hypothesis, the most fre- 
quently proposed explanation, is that small expectancy shifts occur when task 
outcomes are perceived to be externally (i.e., chance) controlled. An alternative 
explanation, the expectancy confidence hypothesis, is that small shifts occur 
when subjects are relatively confident of the accuracy of their expectations. 
Two experiments examined these positions. Experiment 1 partially replicated 
a study often cited as supporting the control perception hypothesis. Expectancy 
confidence was assessed, and as predicted by the expectancy confidence hypoth- 
esis, expectancy shifts were found to be related negatively to expectancy con- 
fidence. Skill perceptions and levels of expectancy confidence were manipulated 
in Experiment 2, and their impact was assessed by several expectancy shift 
measures. Expectancy confidence was found to influence expectancy shifts as
predicted for four of five measures, whereas skill perceptions did not signifi- 
cantly affect expectancy shifts on any measure. Expectancy confidence thus 
exerts a substantial impact upon expectancy shifts. The relevance of the find- 
ings for a third explanation of expectancy shifts, the causal stability hypothesis, 
is discussed, as are the implications of the expectancy confidence hypothesis 
for theories of personality and depression. 
For the last 25 years, expectancy for rein-
forcement has remained a fundamental con- 
cept of Rotter’s social learning theory of per- 
sonality (Rotter, 1954; Rotter, Chance, & 
Phares, 1972). A corollary of Rotter’s belief 
that expectancies influence personality pro- 
cesses is that changes in these expectancies 
are important psychological phenomena. In 
Rotter’s analysis, expectancy changes are considered
to reflect learning about behavior-
outcome relationships and to be precursors of
behavior change.
Because of their significance for learning
and behavior change, many attempts have
been made to outline the conditions influencing
changes in expectancies for success. Two
general findings serve to put these attempts in
broad perspective. First, subjects typically
raise their expectations after successfully performing
a task and lower them after unsuccessful
performance (cf. Rotter, Liverant, &
Crowne, 1961). Second, large differences in
expectancy changes exist between subjects
even where initial expectancy levels have been
equated and the pattern of task outcomes has
been controlled (cf. Miller & Seligman, 1973).
Taken together, such findings indicate that
recent and direct task experience exerts a
powerful influence on expectancy changes but
is not sufficient to account for all between-
subjects differences.
Rotter has proposed that several cognitive
variables, termed “generalized expectancies,”
account for these individual differences. The
most well known of these is internal versus 
external locus of control of reinforcement. 
According to Rotter (Rotter et al., 1972), 
small changes in verbalized expectancies occur 
where subjects perceive reinforcements to be 
externally rather than internally controlled. 
This view, which will hereafter be referred to 
as the control perception hypothesis, has been 
supported by several findings (Miller & Selig- 
man, 1973; Phares, 1957; Rotter et al., 1961) 
that subjects faced with chance tasks (i.e.,
where task outcomes were perceived to be externally
controlled) changed their expectancies
of success less than subjects faced with
skill tasks (i.e., where task outcomes were
perceived to be internally controlled).


This hypothesis has had a significant im- 
pact on many areas of psychology. It has, as 
already noted, been used in social and personality
psychology research to explain why
subjects learn more about behavior-outcome
relationships on some tasks than on others
(Rotter et al., 1961). It has also been applied
to the field of psychotherapy, where increases
in a client’s internal locus of control are often
seen to precede both expectancy shifts and
positive behavior changes (Rotter et al.,
1972). Still a third application has come in
the area of psychopathology. Seligman and his
associates have found that depressives change
their expectancies for success on skill tasks
less than nondepressives (Klein & Seligman,
1976; Miller & Seligman, 1973). Relying on
the control perception hypothesis, they have
interpreted this finding to mean that depressives
perceive reinforcement to be under
greater external control than nondepressives
do. This interpretation is consistent with
Seligman’s learned helplessness theory of depression
(Seligman, 1975).
It should be recognized that these conclusions
are justified only as long as the control
perception viewpoint provides the most adequate
explanation of expectancy shifts. This
paper reviews and critiques some of the research
supporting the control perception hypothesis
and presents an alternative theory of
expectancy shifts called the expectancy confidence
hypothesis. The results of several
studies supporting this alternative are then
reported. Finally, the implications of the expectancy
confidence hypothesis are discussed
for theory and research on personality and
abnormal psychology.


Review and Critique of the Control 
Perception Hypothesis 


The two studies most frequently cited in 
support of the control perception hypothesis 
were conducted by Phares (1957) and Rotter, 
Liverant, and Crowne (1961). In the first 
study, Phares manipulated perceptions of con- 
trol of reinforcement by giving subjects dif- 
ferent pretask inductions. Phares discovered 
that smaller expectancy shifts were reported 
when a task was preceded by a chance induc- 
tion rather than a skill induction. 

In the second study, Rotter et al. assigned 
subjects to either a prediction or motor task. 
It was assumed that, “on the basis of previous 
cultural experiences” (Lefcourt, 1966, p.
208), subjects would perceive outcomes of the 
prediction task to be chance dependent. Out- 
comes were controlled so that success occurred 
on the first trial of each task. The prediction 
of Rotter et al. that chance task subjects 
would change their expectancies less than the 
other subjects was subsequently confirmed by 
several expectancy shift measures: (a) from 
the first to the second trial; (b) over all trials 
in which shifts in the appropriate direction 
occurred (i.e., upward or downward shifts in 
expectancies following success or failure, re- 
spectively, on the immediately preceding 
trial); (c) the final expectancy, recorded 
after the eighth trial. On the basis of these
and earlier results, Rotter et al. concluded 
that the control perception hypothesis had 
been strongly supported. 

The control perception interpretation of 
these two studies may be questioned on sev- 
eral grounds. A prime objection to Phares’ 
approach is that neither initial expectancy 
differences nor expectations for improvement 
were assessed. However, the inductions placed 
such different emphases on task difficulty that 
they may have differentially affected initial 
expectations. Skill inductions, for example,
pointed out that subjects performed at vari- 
ous levels, whereas chance inductions indi- 
cated that the task was “so difficult that it 
did not differentiate between people . . .”
(Phares, 1957, p. 340). Failure to assess in- 
itial expectancies and expectations for im- 
provement is a serious weakness, given these 
inductions, since they conceivably created 
expectancy differences that later affected the 
size of expectancy changes. 

Significant objections may also be raised to 
the methodology employed by Rotter et al. 
(1961). No manipulation check was con- 
ducted, for example, to determine if subjects 
perceived the tasks as skill or chance. It is 
consequently unclear whether the tasks were 
actually perceived as determined by skill or 
by chance. Equally important, the skill task
differed greatly from the chance task. The 
possibility thus exists that task factors other 
than skill or chance perceptions may have led 
to the reported expectancy shift differences. 

To these methodological drawbacks may be 
added the criticism that the control percep- 
tion hypothesis has not received consistent 
empirical support. James and Rotter (1958)
had subjects perform identical prediction 
tasks but manipulated perceptions of control 
by skill and chance inductions. They found
that “the acquisition (i.e., raw expectancy) 
curves of the 100% (reinforcement) groups 
are fairly similar, as are the 50% groups” (p. 
400). If raw expectancy curves are similar, 
there is no possibility that expectancy shifts 
may differ. This follows because differences in 
expectancy shifts may be defined as differ- 
ences in the slopes of two expectancy curves 
at the same point. The findings of James and 
Rotter thus conflict with the findings of Phares 
(1957). 
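
The slope argument can be illustrated with a short Python sketch. It treats an expectancy curve as a list of stated expectancies on successive trials and defines the shift at each point as the trial-to-trial difference. The function name and the numbers are hypothetical and are not data from James and Rotter (1958).

```python
def expectancy_shifts(curve):
    """Trial-to-trial changes in stated expectancy (first differences)."""
    return [later - earlier for earlier, later in zip(curve, curve[1:])]


# Hypothetical expectancy curves under skill and chance inductions.
skill_curve = [50, 60, 68, 74]
chance_curve = [50, 60, 68, 74]

# Identical curves necessarily produce identical shifts at every trial.
print(expectancy_shifts(skill_curve))   # [10, 8, 6]
print(expectancy_shifts(chance_curve))  # [10, 8, 6]
```

Because shifts are differences along the curve, two groups whose raw curves coincide cannot differ in shift size at any trial.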

This lack of empirical support is apparent 
in correlational as well as experimental 
studies. Miller and Seligman (1973) found
no relationship between expectancy shifts and 
personality measures that assess generalized 
perceptions of internal and external control 
of reinforcement. Consistent with this result, 
other authors have concluded that perception 
of control, when treated as a personality 
dimension, rarely has been found to be related 
to expectancy shifts (Weiner, Nierenberg, & 
Goldstein, 1976). 

A final reservation applies not only to the 
research of Phares (1957) and Rotter et al. 
(1961) but to almost all studies of expectations 
for reinforcement. As defined by Rotter et al., 
expectancy for reinforcement is the “probability 
held by the individual that a particular 
reinforcement will occur as a function of a 
specific behavior on his part in a specific 
situation or situations” (Rotter et al., 
p. 12). In spite of this definition, however, 
subjects in expectations studies have consistently 
been asked to report how sure or how 
certain they were that they would succeed on 
a task. This question is not the same as asking 
subjects to report their chances of success. In 
fact, it may be argued that this method of 
assessment did not focus upon expectancies of 
success, but on how confident subjects were 
that an expectancy of success having a 
probability of 1 was accurate. Previous 
studies, therefore, may have confounded 
subjective expectancies of success with confidence 
in the accuracy of these expectations. 

To summarize, in some cases adequate control 
has not been exercised over the effects of 
different tasks on expectancy shifts. When 
tasks have been controlled, pretask inductions 
may have created initial differences in 
expectations for success and improvement. In 
addition, the results of different studies have 
conflicted. Finally, the assessment of subjective 
expectancies seems to have been confounded 
consistently with assessing confidence 
in the accuracy of these expectancies. Given 
these difficulties, it may be argued that the 
cognitive factors most importantly involved in 
expectancy shifts have not yet been 
isolated. 


The Expectancy Confidence Hypothesis 


A possible alternative explanation of the 
findings of Phares (1957) and Rotter et al. 
(1961) stems from the expectancy confidence 
[...] beliefs about their probabilities of success or 
reinforcement but also hold beliefs about the 
accuracy of these expectations. Expectancy 
confidence may be [...] certainty or, more 
formally, as [...] held by an individual that his 
or her expectancy is correct. For example, when 
subjects hold that only expectations in a narrow 
range are plausibly accurate estimates of 
their chances of success, they possess high 
levels of expectancy confidence. Alternatively, 
when subjects will accept many expectations 
as plausible estimates, they have low levels 
of confidence. 
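
The idea that confidence corresponds to the width of the range of plausible estimates can be sketched as follows. This is only an illustration under the assumption that confidence is indexed by the span between the lowest and highest estimate a subject accepts; the function names and the two example subjects are hypothetical.

```python
def plausible_range(plausible_estimates):
    """Width of the set of success estimates a subject accepts as plausible."""
    return max(plausible_estimates) - min(plausible_estimates)

def more_confident(estimates_a, estimates_b):
    """True if A has the narrower plausible range, i.e., higher expectancy confidence."""
    return plausible_range(estimates_a) < plausible_range(estimates_b)

# Hypothetical subjects checking plausible chances of success (in percent).
subject_a = [40, 50, 60]          # narrow range: high confidence
subject_b = [10, 30, 50, 70, 90]  # wide range: low confidence
print(plausible_range(subject_a), plausible_range(subject_b))
print(more_confident(subject_a, subject_b))
```

Both hypothetical subjects share the same best estimate (50%) yet differ in confidence, which is why the two constructs can be treated separately.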

If we contrast locus of control with ex- 
pectancy confidence, a fundamental concep- 
tual difference between approaches becomes 
apparent. While locus of control emphasizes 
specific beliefs, expectancy confidence empha- 
sizes the relationships between cognitions in 
terms of certainty as defined by the range of 
plausible alternatives. Adopting Levy’s 
(1970) terminology, locus of control may be 
considered to deal with the content of cogni- 
tions, whereas expectancy confidence deals 
more with the structure of cognitions. 

The central proposition of the expectancy 
confidence hypothesis reflects this structural 
emphasis. That proposition is that the degree 
of change in expectancies that occurs over 
time or events is a function of expectancy 
confidence. When expectancy confidence is 
low, expectancies are likely to change; when 
expectancy confidence is high, expectancies 
are likely to remain stable. Applied to the 
findings of Phares (1957) and Rotter et al. 
(1961), the expectancy confidence hypothesis 
suggests that chance condition subjects may 
have been more confident of their expectancies 
than other subjects were. An analysis 
of some of the experimental procedures fol- 
lowed in these studies underscores the plausi- 
bility of this suggestion. For example, Rotter 
et al. (1961) informed chance condition sub- 
jects that their objective was to predict cor- 
rectly whether slides marked with an X or an 
O would be projected upon a screen. In all 
likelihood, subjects had had some previous 
experience with gambling games or other 
situations similar to this task (e.g., predicting 
the toss of a coin). In the skill condition, 
the goal was to raise a steel ball resting on a 
movable platform above a certain height (Sky, 
1950). Subjects in this condition probably 
had almost no prior experience with similar 
games or situations. Greater task experience 
presumably induced a relatively high level of 
confidence on the part of chance condition 
subjects that their initial expectations of success 
were correct. Since they were more confident 
of their expectations, subjects in the 
chance condition were less likely to change 
these expectations than were subjects in the 
skill condition. 

In the Phares (1957) study, chance induc- 
tions stressed both task difficulty and lack of 
variation in task performance. It seems that 
this would lead chance condition subjects to 
expect to perform poorly and also to believe 
that their performance would not improve. 
In other words, chance subjects may have 
held more negative expectations than skill 
subjects held and may also have believed that 
these expectations were accurate estimates of 
their objective probability of success. With re- 
spect to this analysis, it should be pointed out 
that no mention of task difficulty or perform- 
ance variation was included in the chance in- 
duction used by James and Rotter (1958). 
Levels of expectancy confidence may conse- 
quently have been somewhat similar for their 
skill and chance subjects. This could account 
for the lack of expectancy shift differences 
they found. 

Data from several pilot studies conducted 
by the present author support the logic of 
this analysis. In one, the apparatus and descriptions 
for the tasks used by Rotter et al. 
(1961) were presented in a group setting to 
49 undergraduates. Since a slide projector 
was unavailable, the prediction task was 
slightly modified so that sample letters were 
presented on 4 × 6 in. (10 × 15 cm) file cards, 
and subjects were told that their task was to 
guess correctly the letters on sets of file cards 
(rather than on slides). All subjects picked 
one of the tasks to complete this sentence: 
“I have more general experience with tasks 
similar to the [platform lifting or prediction] 
task.” As predicted, a greater proportion of 
students (.76) reported more experience with 
the prediction task (p < .001). 

In a second study, the color and line tasks 
and inductions used by Phares were pre- 
sented to a group of 40 undergraduates. The 
color task was paired with the chance induc- 
tion, and the line task with the skill induction. 
Subjects were asked to circle their best estimate 
of their chances of success on an 11-point 
scale ranging from 0 to 100%. They 
also checked all the alternative estimates they 
considered plausible. This made it possible 
to determine their highest and lowest plausi- 
ble chances of success as well as their best 
estimate. As predicted, paired t tests showed 
that all three expectancy variables were less 
for the chance induction task than for the 
skill induction task (all p’s < .001). When we 
paired the color task with the skill induction 
and the line task with the chance induction 
in a later administration to 50 undergraduates, 
this finding was replicated. 

In the following two experiments, the ex- 
pectancy confidence hypothesis has been ex- 
amined in a more precise fashion and under 
more carefully controlled conditions. 


Experiment 1 


In an effort to determine whether the ex- 
pectancy confidence hypothesis provides a 
plausible explanation of the findings of Rotter 
et al. (1961), a partial replication of this 
study was conducted. The following predic- 
tions, based upon the expectancy confidence 
model, were tested: 

1. Subjects presented with the chance task 
would report that they were more confident 
of their initial expectations than would sub- 
jects assigned to the skill condition. This fol- 
lows from the expectancy confidence position 
that subjects who report small expectancy 
shifts should also be confident that their in- 
itial expectations are correct. 

2. Expectancy shifts from Trial 1 to Trial 2 
would be correlated negatively with initial 
measures of expectancy confidence within 
conditions. In other words, the negative rela- 
tionship between expectancy confidence and 
expectancy shifts was expected to hold within 
conditions as well as between conditions. 

3. Expectancy shifts from Trial n to Trial 
n + 1 would be correlated negatively with 
expectancy confidence on Trial n for the sub- 
jects within each condition; that is, the rela- 
tionship between expectancy confidence and 
expectancy shifts was expected to hold within 
subjects as well as within conditions. 

4. Trend analysis of expectancy confidence 
over all trials would show an increase in confidence 
within the skill condition. This hypothesis 
was based on the presumption that 
subjects are usually unfamiliar with the task 
used in this condition and that expectancy 
confidence will increase as experience increases. 
A similar hypothesis was not made 
for subjects in the chance condition, since it was 
assumed that they had had more prior experience 
with the chance task. 


Method 
Subjects 


Subjects were 51 female college students enrolled 
in introductory psychology courses. Female subjects 
were used because Rotter et al. used females and 
because this increased the homogeneity of the 
samples. Eleven of these subjects were eliminated 
from the data analysis: 4 because of apparatus 
failure, 5 because they suspected the skill task was 
unfair, 1 because she suspected the chance task was 
unfair, and 1 because she did not perform the skill 
task as directed. 


Apparatus 


Chance task. Slides were projected onto a screen 
8 feet (2.46 m) in front of the subject. Subjects were 
informed that some of these slides were imprinted 
with an X and that others were imprinted with an 
O. Before a slide was projected, subjects had to 
guess whether an X or an O would appear. To be 
successful on a “trial,” a subject had to predict 
correctly four out of five presentations. 

Skill task. The skill task, hereafter referred to as 
the “Sky” task, involved an apparatus similar to that 
originally developed by Sky (1950) and used 
by Rotter et al. (1961). Each subject was required 
to raise a 4-inch (10 cm) square platform upon 
which a ½-inch (13 mm) steel ball bearing had been 
placed. The platform was to be raised by 
pulling on a string connected to the platform and 
threaded through a pulley at the top of the 
apparatus. To be successful, it was necessary to raise 
the platform 9 cm without the ball rolling off one 
of the edges. The ball, positioned by the experimenter, 
was held in place by an electromagnet 
planted on the underside of the platform. Current to 
this device could be interrupted by depressing a 
concealed, silent solenoid switch. Since the platform 
of the apparatus sloped forward very slightly, the 
ball would roll off the platform when such an 
interruption occurred. Outcomes could therefore be 
surreptitiously controlled by the experimenter. 

Assessment. To report expectancies of success, 
subjects answered the following question on an 
11-point scale running from 0 to 100% (i.e., 0%, 
10%, 20%, etc.): “What are the chances that you 
will succeed on this task? In other words, if you 
[performed this] task over and over, what percent 
of the time would you be successful?” 

To report expectancy confidence, subjects answered 
the following question on a 5-point scale ranging 
from “very good idea” to “almost no idea”: “How 
good an idea do you really have of your actual 
chances of success?” 

Subjects reported perceptions of control of rein- 
forcement by responding to the following question 
on an 11-point scale ranging from 0 (success depends 
on chance) to 10 (success depends on skill): “How 
important is skill or chance in producing success on 
this task?” 


Procedure 


Twenty subjects were randomly assigned to each 
condition. Data were collected for each subject indi- 
vidually. 

Subjects were seated and a prepared script was 
read by the experimenter. The first section of the 
script described whichever task the subject was to 
receive. Subjects were then given printed forms of the 
questions assessing expectancies and confidence and 
were provided with examples of how they might 
answer these if they were to report high or low 
ratings of expectancies of success or expectancy 
confidence. After the experimenter answered any 
remaining questions, subjects recorded their answers to the 
questions. The question assessing perceptions of con- 
trol of reinforcement was then explained, and a rat- 
ing was obtained from the subject. 

After completing the third question, subjects were 
given eight trials on either the chance or skill task. 
Success was controlled at a rate of 50%,² and all 
subjects were presented with an identical pattern of 
outcomes (success, failure, failure, success, success, 
failure, failure, success). In the chance task, the ex- 
perimenter exerted control by either advancing or 
reversing the slide tray in response to the subject’s 
predictions. In the skill task, outcomes were 
controlled by depression of the silent solenoid switch. 
The outcome of every trial (i.e., success or failure) 
was announced to subjects immediately after it was 
completed. Each subject then reported her expectancy 
of success and expectancy confidence for the next 
trial. After the last trial, a final rating of perception 
of control of reinforcement was obtained in addition 
to the other ratings. 

Following the data collection period, subjects were 
asked what they thought was the hypothesis of the 
experiment, whether they fully understood the 
questions, and whether they believed that the task 
was unfair. No subjects correctly guessed any of 
the experimental hypotheses or indicated that they 
did not understand the questions. Data for subjects 
who believed the task was unfair (e.g., some thought 
that the platform was not level or believed that the 
experimenter had some means of controlling the 
outcomes) were eliminated from the final analysis. 

Results 


Before data analysis, indices of expectations 
of success and expectancy confidence were 
produced, as follows: 


Expectations of success for Trial n = raw 
expectations score × 10; 
expectancy confidence for Trial n = (raw 
expectancy confidence score / 4) × 10. 


These transformations simply allow all re- 
sponses to be presented as if they had come 
from 11-point scales, where 0 represented 
minimal ratings of skill perception, expect- 
ancy confidence, or expectations of success. 
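
The two transformations can be sketched as below. The sketch assumes that the raw expectation is recorded as a proportion (0.0 to 1.0) and that the 5-point confidence rating is scored 0 (“almost no idea”) to 4 (“very good idea”); neither scoring convention is stated explicitly in the text, and the function names are hypothetical.

```python
def expectation_index(raw_expectation):
    """Raw expectation (assumed to be a proportion, 0.0-1.0) rescaled to 0-10."""
    return raw_expectation * 10

def confidence_index(raw_confidence):
    """Raw confidence (assumed to be scored 0-4) rescaled to 0-10."""
    return raw_confidence / 4 * 10

# A subject reporting a 50% chance of success and the second-highest confidence rating.
print(expectation_index(0.5))
print(confidence_index(3))
```

Under these assumptions both indices share the 0 to 10 range of the skill perception scale, so the three measures can be compared directly.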

Preliminary data analysis indicated that 
the three dependent measures were not inter- 
correlated significantly before Trial 1 within 
either condition (product-moment r values 
between .00 and .23). Consequently, pretask 
expectancies of success, levels of expectancy 
confidence, and perceptions of skill require- 
ments are assumed to represent relatively sep- 
arate dimensions. 

Further data analysis, summarized in Table 
1, pointed to the conclusion that the results 
of Rotter et al. (1961) were replicated in the 
current study. Using t tests for independent 
means, we found that subjects assigned to the 
two groups did not differ in their initial 
expectations of success, t(38) = 1.77, p > .05. 
However, skill condition subjects displayed 
significantly greater expectancy shifts than did 
chance task subjects from Trial 1 to Trial 2, 
t(38) = 4.66, p < .001. In addition, subjects 
did perceive the Sky task as requiring greater 
skill than the prediction task, both before the 
first trial, t(38) = 5.58, p < .001, and 
following the last trial, t(38) = 4.25, p < .001. 
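
As an illustration of how such a statistic follows from group summaries, the sketch below recomputes the t value for expectancy shifts from Trial 1 to Trial 2 from the means and standard deviations given in Table 1. It assumes a pooled-variance t test with n = 20 per group; the function name is hypothetical.

```python
import math

def independent_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df

# Expectancy shifts, Trials 1 to 2 (Table 1): skill group vs. chance group.
t_value, df = independent_t(2.55, 1.43, 20, 0.75, 0.97, 20)
print(round(t_value, 2), df)
```

With these inputs the sketch yields t(38) = 4.66, in agreement with the value reported above.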

More important for the expectancy confidence 
position, Hypothesis 1 was confirmed. 
That is, subjects in the chance condition were 


1 The scripts used in both experiments have not 
been included because of space limitations. They are 
available from the author. 

2 When expectancy shifts under four different rein- 
forcement conditions (25, 50, 75, and 100%) were 
observed, Rotter et al. (1961) discovered that similar 
shift discrepancies took place under all conditions. 
Therefore, to focus upon only a single condition of 
success was sufficient for the present study. 

3 All tests of significance were two-tailed. 
Table 1 
Means and Standard Deviations for Skill and Chance Condition Subjects in Experiment 1 

                                        Chance            Skill 
Measure                                M      SD        M      SD 
Expectations of success               3.45   1.28      4.30   1[...] 
Expectancy shifts, Trials 1 to 2       .75    .97      2.55   1.43 
Skill perceptions, pretask            1.90   1.33      4.85   1[...] 
Skill perceptions, posttask           3.30   2.20      5.90   1[...] 
Expectancy confidence                 3.88   2.50      1.62   2.03 

Note. All t tests for independent means were two-tailed. n = 20 in each group. 
*p < .01. **p < .001. 



found to be significantly more confident of 
their expectancies than were subjects in the 
skill condition, t(38) = 3.13, p < .01. Hypothesis 
2 was partially confirmed, since a 
significant negative product-moment correla- 
tion was found between expectancy confidence 
on Trial 1 and expectancy shifts from Trial 1 
to Trial 2 within the skill condition, r(18) = 
—.50, p = .02, although not within the chance 
condition, r(18) = .16, p > .05. Subsequent 
analysis revealed that the variance of initial 
shifts for the chance group was significantly 
less than that for the skill group, F(19, 19) = 
2.68, p < .05, which may account for the low 
and nonsignificant correlation found for the 
chance group. No significant relationships be- 
tween initial skill perceptions and initial ex- 


Table 2 
Means and Standard Deviations for Chance and Skill Condition Subjects in Experiment 1 
for All Experimental Trials 

                        Expectancy confidence 
                       Chance            Skill 
Trial   Outcome       M      SD        M      SD 
1**     Success      3.88   2.50      1.62   2.03 
2       Failure      4.50   2.08      5.12   1.51 
3       Failure      5.00   2.29      5.00   1.40 
4       Success      4.25   2.82      5.00   1.81 
5       Success      5.00   3.03      5.75   1.43 
6*      Failure      5.38   2.84      7.00   1.54 
7       Failure      5.25   3.13      6.62   1.86 
8*      Success      4.50   3.10      6.25   2.07 
9*      —            4.88   3.09      6.88   1.79 

Note. All t tests for independent means were two-tailed. n = 20 in each group. 
*p < .05. **p < .01. 


pectancy shifts were found within either the 
skill condition, r(18) = -.21, p > .05, or the 
chance condition, r(18) = .35, p > .05. 


Correlations between expectancy confidence on 
Trial n and expectancy shifts from Trial n to 
Trial n + 1 were calculated for each subject 
over all eight trials. Correlations were then 
converted to z values by the r to z transformation, 
and the z values were averaged within 
each condition (McNemar, 1962). By testing 
for the significance of each z_av, the significance 
levels of the corresponding r_av values 
were determined. Hypothesis 3 was partially 
confirmed, since a significant negative average 
correlation was found within skill condition 
subjects, r_av = -.49, p = .007. A similar 
relationship was not found for chance condition 
subjects, r_av = .16, p > .05. Three subjects in 
the chance condition did not display any variance 
in expectancy shifts. Since the correlation 
is undefined in such cases, the data for 
these subjects did not contribute to the average 
correlations for the chance condition. 
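
The averaging of within-subject correlations can be sketched as follows, assuming the standard Fisher r to z transformation (the inverse hyperbolic tangent) and an unweighted mean of the z values. The function name and the three correlations are hypothetical and are not data from the study.

```python
import math

def average_correlation(correlations):
    """Average correlations via Fisher's r-to-z transformation, then back-transform."""
    z_values = [math.atanh(r) for r in correlations]  # r to z
    z_mean = sum(z_values) / len(z_values)            # average in z units
    return math.tanh(z_mean)                          # z back to r

# Hypothetical within-subject correlations between confidence and subsequent shifts.
subject_rs = [-0.60, -0.45, -0.40]
r_av = average_correlation(subject_rs)
print(round(r_av, 2))
```

A subject whose shifts show no variance has no defined correlation and would have to be dropped before averaging, as was done for three chance condition subjects.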

A test for linear trend revealed that expectancy 
confidence increased over the course 
of eight trials for subjects assigned to the skill 
condition, F(1, 171) = 79.52, p < .001. No 
such changes were noted for chance condition 
subjects, F(1, 171) = 1.38, p > .05. The 
fourth hypothesis, therefore, was also confirmed. 
Mean levels of expectancy confidence 
for each trial are shown in Table 2. 

In order to analyze expectancy shifts more 
fully, it would be important to consider the 
fluctuations in expectancy shifts that occurred 
over the course of the experiment. An analysis 
of variance of a two-factor mixed design with 
repeated measures on one factor revealed significant 
main effects for conditions, F(1, 38) 
= 13.90, p < .001, and trials, F(7, 266) = 
13.5, p < .001, and a significant Conditions × 
Trials interaction, F(7, 266) = 6.2, p < .001. 
Inspection of the mean trial by trial expect- 
ancy shifts revealed that greater expectancy 
shifts occurred for skill task subjects than for 
chance task subjects over the first three or 
four trials and from Trial 1 to Trial 2 in par- 
ticular. Differences in expectancy shifts de- 
creased after the first few trials, so that they 
were virtually of the same magnitude for both 
groups by the end of eight trials. 

In sum, the results of Experiment 1 con- 
firmed, or partially confirmed, each hypothesis 
of the study. 


Discussion 


An analysis of expectancy shift research 
suggested that high expectancy confidence 
leads to small expectancy changes. This anal- 
ysis also suggested that in an important ex- 
pectancy shift study (Rotter et al., 1961), 
chance task subjects were more confident than 
skill task subjects that their initial expect- 
ancies were correct. 

The first experiment evaluated these con- 
tentions. In accordance with the expectancy 
confidence hypothesis, subjects performing 
the prediction task in Experiment 1 were 
found to possess higher levels of expectancy 
confidence than subjects performing the platform 
task. Product-moment correlations 
also indicated that differences in expectancy 
confidence were related to expectancy shifts 
for subjects in the Sky task. Although similar 
differences were not found for prediction task 
subjects, subsequent analysis revealed that 
the variance of initial expectancy shifts was 
significantly less for subjects in this condition. 
McNemar (1962) has pointed out that the 
magnitude of a correlation coefficient is lessened 
when the range of either variable is curtailed. 
Therefore, although the relationship 
between expectancy confidence and expectancy 
shifts under chance conditions warrants 
examination in future studies, this explanation 
plausibly accounts for the correlational 
differences between skill and chance conditions 
in Experiment 1. 

The first experiment also determined if subjects 
performing the Sky and prediction tasks 
perceived outcomes to be differentially de- 
pendent upon skill or chance factors as as- 
serted by Rotter et al. (1961). Subjects did 
perceive the Sky task as requiring more skill 
than the prediction task. Skill perceptions 
were not, however, correlated with expectancy 
changes. In addition, the convergence of ex- 
pectancy shifts reflected by the significant 
Conditions × Trials interaction is at odds 
with the control perception hypothesis. Since 
subjects rated the Sky task as requiring more 
skill than the prediction task, both before and 
after completing the tasks, the control percep- 
tion hypothesis would seem to suggest that 
some differences in the size of expectancy 
shifts should have been observed throughout 
the experiment. 


Experiment 2 


Experiment 2 was an attempt to make 
comparative tests of predictions about ex- 
pectancy shifts based on the expectancy con- 
fidence and control perception hypotheses. All 
subjects performed on the skill task used in 
Experiment 1. Expectations of success were 
recorded over several trials, thereby allowing 
for the calculation of expectancy shift mea- 
sures similar to those frequently used by other 
researchers (cf. Klein & Seligman, 1976; 
Miller & Seligman, 1973; Rotter et al., 1961). 
Skill perceptions were manipulated by pretask 
inductions so that one group of subjects per- 
ceived outcomes to be skill dependent (high 
skill perception), whereas another group per- 
ceived outcomes to be chance controlled (low 
skill perception). Levels of expectancy con- 
fidence were also manipulated by pretask in- 
ductions so that one group of subjects was 
relatively certain that their expectancies of 
success were correct (high expectancy con- 
fidence), whereas another group was less cer- 
tain of this (low expectancy confidence). 
These locus of control and confidence manip- 
ulations were combined factorially. The fol- 
lowing hypotheses were tested: 

1. Smaller expectancy shifts will be re- 
ported by subjects assigned to the low skill 
conditions than by those in the high skill con- 
ditions (the control perception prediction). 




2. Smaller expectancy shifts will be re- 
ported by subjects assigned to the high ex- 
pectancy confidence conditions than by those 
in the low expectancy confidence conditions 
(the expectancy confidence prediction). 


Method 


Subjects 


Subjects were 97 female college students enrolled 
in introductory psychology courses. Seventeen sub- 
jects were eliminated from the data analysis: 1 be- 
cause of apparatus failure, 2 because they did not 
operate the apparatus according to instructions, 7 
because they suspected the task was not fair, and 5 
because they reported that they either did not 
answer the questions properly (e.g., estimated 
expectancies on a trial by trial rather than on a 
long-term basis) or did not understand the 
objective of the task. 


Instructions 


All of the pretask instructions given to subjects 
performing on the skill task in Experiment 1 were 
presented to subjects participating in Experiment 2. 
In addition, instructions designed to induce par- 
ticular perceptions of control of reinforcement and 
levels of expectancy confidence were delivered. These 
inductions had been found in pilot studies to be 
sufficiently powerful to create significant differences 
between groups with respect to skill perceptions and 
levels of expectancy confidence. The high skill 
subjects were informed that how they performed was 
determined by many factors under their control. The 
low skill subjects were told that how they performed 
was determined by many factors outside their control. 

The high confidence subjects were told that, of the 
college students who had performed the Sky task 
many times, 50% were successful more than half the 
time, and the other 50% were successful less than 
half the time. The low confidence subjects were told 
that 25% of their peers were successful on the task 
almost all the time, 25% were successful over half the 
time, 25% were successful less than half the time, and 
25% were almost never successful. To clarify this 
information, these subjects were also presented with 
a figure indicating the rectangular distribution of 
performance scores. These [...] subjects given [...] 
categories. In order to deliver the instructions 
in as uniform a manner as possible, [...] 
videotapes was presented to each [...]. 

The questions and scales used in Experiment 1 to 
assess perceptions of control of reinforcement, 
expectancy confidence, and expectations of success were 
also used in Experiment 2. 


Procedures 


Twenty subjects were assigned randomly to each 
of the four induction conditions. Data were collected 
for each subject individually. Since the data collec- 
tion period spanned two quarters, a randomized block 
design (Winer, 1971) was adopted to control for 
fluctuations in subject characteristics. 

After being seated at a table upon which the Sky 
apparatus had been placed, each subject was pre- 
sented with videotaped instructions. After the video- 
taped instructions were presented, subjects performed 
the Sky task. The object of each trial was the same 
as that described in Experiment 1. Each subject 
participated in eight trials, with the pattern of 
outcomes being surreptitiously controlled by the 
experimenter, as in Experiment 1. Expectancies of 
success and expectancy confidence were reported 
before each trial and after the final trial. Skill 
perceptions were reported before the first trial 
and after the last trial. 

Subjects were asked the debriefing questions used 
in Experiment 1. No subjects correctly guessed any of 
the hypotheses of the experiment, although several 
indicated that they believed the outcomes were con- 
trolled. Data for these latter subjects were eliminated 
from the final analysis. 


Results 


The analysis of variance applied to the data 
was a randomized block design with replications 
as an additional factor (Winer, 1971). 
Therefore, wherever analyses of variance were 
undertaken in Experiment 2, the following 
three factors were included: skill perception 
(two levels), expectancy confidence (two 
levels), and replications (four levels). Before 
data analysis, raw scores were transformed as 
in Experiment 1. 
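
The error degrees of freedom implied by this design can be checked with a short sketch. It assumes a fully crossed between-subjects model with equal cell sizes, so that the error term has N minus the number of cells degrees of freedom; the helper name is hypothetical.

```python
def error_df(n_subjects, *factor_levels):
    """Within-cell error df for a fully crossed between-subjects design."""
    n_cells = 1
    for levels in factor_levels:
        n_cells *= levels
    return n_subjects - n_cells

# Experiment 2: 80 subjects; skill (2) x confidence (2) x replications (4).
print(error_df(80, 2, 2, 4))  # 80 - 16 = 64
```

The result, 64, matches the denominator degrees of freedom of the F ratios reported for Experiment 2.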

To make meaningful comparisons between 
groups, it was necessary first to determine 
whether the instructions were successful in 
producing the intended differences and 
whether there were differences in initial 
expectancies that might account for later 
differences. When skill perceptions were analyzed, 
high skill subjects indicated that a greater 
level of skill was required for successful 
performance than that indicated by low skill 
subjects, F(1, 64) = 8.20, p = .006. On measures 
of expectancy confidence, high confidence 
subjects were more confident than low 
confidence subjects that their initial expectancies 
were correct, F(1, 64) = 11.36, p = 
.002. All other main effects and interactions in 
these two analyses had F values less than [...]. 

Table 3 
Means and Standard Deviations of Dependent Measures for Experiment 2 

                                      Experimental condition 
                         Low skill-        Low skill-        High skill-       High skill- 
Measure                  low confidence    high confidence   low confidence    high confidence 
Change, Trials 1 to 2 
  M                      2.20              1.45              2.85              1.90 
  SD                     1.24              1.15              1.53              1.45 
Total change 
  M                      5.70              3.70              6.70              4.50 
  SD                     3.10              2.89              3.08              2.38 
Total increase 
  M                      3.55              2.30              4.05              2.80 
  SD                     1.96              1.75              2.21              1.99 
Total decrease 
  M                      2.15              1.40              2.65              1.70 
  SD                     1.39              1.31              1.63              1.26 
Final expectancy 
  M                      5.80              5.45              5.15              5.60 
  SD                     1.28              1.28              1.09              1.88 

Note. n = 20 for all conditions. 


Initial expectations of success were not influenced 
significantly by either the skill, F(1, 
64) = .42, p > .05, or confidence, F(1, 64) = 
1.18, p > .05, inductions, and a significant 
interaction was not found between inductions, 
F(1, 64) = .13, p > .05. Taken together, the 
preliminary analyses suggest that the expectancy 
shifts observed in Experiment 2 may be 
attributed to the manipulations rather than to 
some uncontrolled factor. 

Measures employed to evaluate expectancy 
shifts included the following: (a) change in 
expectancies between Trials 1 and 2; (b) 
total increase in expectancies, defined as the 
sum of the changes that occurred immediately 
following successful outcomes; (c) 
total decrease in expectancies, defined as the 
sum of the changes that immediately 
followed failure experiences; (d) total amount 
of expectancy change, defined as the sum of 
the increases and decreases in expectancies; 
and (e) the final expectancy. 
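
The five measures can be sketched for a single subject as follows. The sketch assumes that changes following failures are sign-reversed so that decreases are counted as positive quantities, which is consistent with the positive means in Table 3 but is not stated explicitly; the function name and the ratings are hypothetical.

```python
def shift_measures(expectancies, outcomes):
    """Compute the five expectancy shift measures for one subject.

    expectancies: ratings reported before each trial plus one after the last trial.
    outcomes: True for success, False for failure, one entry per trial.
    """
    changes = [b - a for a, b in zip(expectancies, expectancies[1:])]
    # Changes that immediately follow successes.
    total_increase = sum(c for c, won in zip(changes, outcomes) if won)
    # Assumption: changes after failures are sign-reversed so decreases are positive.
    total_decrease = sum(-c for c, won in zip(changes, outcomes) if not won)
    return {
        "change_1_to_2": changes[0],
        "total_increase": total_increase,
        "total_decrease": total_decrease,
        "total_change": total_increase + total_decrease,
        "final_expectancy": expectancies[-1],
    }

# Outcome pattern used in both experiments: S, F, F, S, S, F, F, S.
pattern = [True, False, False, True, True, False, False, True]
# Hypothetical ratings on the 0-10 scale (nine reports for eight trials).
ratings = [5, 6, 5, 4, 5, 6, 5, 4, 5]
measures = shift_measures(ratings, pattern)
print(measures)
```

Because total change is the sum of the increase and decrease measures, the same identity holds for the condition means.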

Table 3 contains the means and standard 
deviations of the dependent measures. Significant 
main effects were found for the confidence 
induction in the direction predicted 
by the expectancy confidence hypothesis for 
changes in expectancies from Trial 1 to Trial 
2, F(1, 64) = 8.91, p = .004; total amount of 
expectancy change, F(1, 64) = 8.95, p = 
.004; total increase in expectancies, F(1, 64) 
= 7.36, p = .008; and total decrease in 
expectancies, F(1, 64) = 7.05, p = .01. The 
only measure not reflecting such differences 
was the final expectancy reported by subjects, 
F(1, 64) = .02, p > .05. A main effect for the 
skill induction approached the .05 level for 
changes in expectancies from Trial 1 to Trial 
2, F(1, 64) = 3.33, p = .07. However, no 
significant main effects for the skill induction 
were found for any of the other dependent 
measures (p > .20 in all cases). In addition, 
no significant interactions were found (p > 
.20 in all cases). 


Discussion 


In Experiment 2, high and low levels of ex- 
pectancy confidence were factorially combined 
with high and low skill perceptions. Compara- 
tive predictions based on the expectancy con- 
fidence and control perception hypotheses 
were tested. The prediction based upon the 
expectancy confidence hypothesis was sup- 
ported by the finding that subjects with high 
levels of confidence changed their expectancies 
less than subjects with low levels. Although 
high skill subjects reported that the Sky task 
required more skill than the low skill subjects 
said it did, skill perception differences did not 
significantly affect expectancy shifts. The pre- 
diction based upon the control perception 
hypothesis was therefore not supported. 


General Discussion 


Considering the contributions of each ex- 
periment separately, Experiment 1 pointed up 
naturally occurring relationships between 
variations in expectancy confidence and var- 
iations in expectancy shifts. Experiment 2 
showed similar effects for manipulated levels 
of expectancy confidence. Taken together, the 
results provide strong support for the ex- 
pectancy confidence hypothesis but not for 
the control perception hypothesis. 


Generality of Applicability 


The clear predominance of support given 
the expectancy confidence hypothesis raises 
the question whether all expectancy shift dif- 
ferences previously explained by the control 
perception hypothesis could now be accounted 
for by the expectancy confidence hypothesis. 
For example, could the findings of Rotter et 
al. (1961) be considered, on the basis of the 
results of Experiment 2, to be an artifact 
produced by dissimilar tasks? This probably 
is too strong a statement in view of the fact
that skill perception differences were larger in
Experiment 1, the replication of Rotter et al.,
than in Experiment 2. This observation suggests
that the size of expectancy shifts could
be affected differentially where large skill perception
differences exist.


Another consideration that argues against 


use of five expectancy shift 


measures greatly
increased the chances that one of the Sad
measures would have a low p val
obvious possible explanation is 
certain circumstances, perceptions 
may influence expectancy shifts ind 
of expectancy confidence. Confirmal 
possibility would limit the genera 
expectancy confidence hypothesis. Thi 
not, however, negate the conclusion
concept of expectancy confidence has 
“range of convenience” (Kelly, 19 
comprehension and prediction of so 
haviors. 

In evaluating the generality of thi 
ancy confidence hypothesis, some 
should be given to other exp 
theories. Weiner’s attributional theory 
ticularly relevant in this regard, since 
been explicitly offered to account for 
sults obtained by Rotter and his œ 
(Weiner, Frieze, Kukla, Reed, Rest, & Rosenbaum,
1971; Weiner, Heckhausen, Meyer, &
Cook, 1972; Weiner, Nierenberg, & Goldstein,
1976). Weiner (Weiner et
reviewed several empirical studies of 
tionship between expectancy shifts 
attributions that were stable and” 
(skill), stable and external (task 
unstable and internal (effort), and u 
and external (luck). He concluded
greatest shifts occurred for subjects 
tributed their outcomes to stab 
rather than to internal factors. Cal 
ity, as opposed to control perception; 
considered by Weiner to be the majo! 
sion underlying expectancy shifts. 

While powerful tests of the exp’ 
fidence and causal stability hypothes 
await future research efforts, it is f 
make some comparative statemem 
these theories on the basis of the p! 
ings. Weiner has argued that the 
chance tasks used in Experiment 
jects to make stable and unstable au 
respectively (cf. Weiner et al, 
would presumably advance a simi 
tation of the effects of the skill 
inductions used in Experiment 2. 
tions would suggest many of the 
eses as those based on the coni 
hypothesis: 

1. Skill perceptions in Experim 
be positively correlated with the size of expectancy
shifts.

2. Differences in expectancy shifts should 
be observed for skill and chance subjects in 
Experiment 1 on later as well as earlier trials. 

3. A significant main effect for skill percep- 
tions should be present in Experiment 2. 

Since these hypotheses were not confirmed, 
whereas those derived from the expectancy 
confidence hypothesis were confirmed, the re- 
sults of Experiments 1 and 2 support the ex- 
pectancy confidence hypothesis but not the 
causal stability hypothesis. 

A final theoretical dimension that should be 
related to the generality of the expectancy con- 
fidence hypothesis is that of commitment. 
Mischel (1958) and Watt (1965) have as- 
serted that verbalization and discussion of 
initial expectancies increase the level of com- 
mitment to the specific values of these ex- 
pectancies. Expectancy changes may therefore 
be small where commitment is high. Applying 
this formulation to the present research, it 
could be argued that reporting high levels of 
confidence induced high levels of expectancy 
commitment and that this led to small expect- 
ancy changes. 

Although this explanation could account for 
the results of Experiment 2 and the correla- 
tional findings of Experiment 1, it should be 
pointed out that the relationship between 
commitment and expressions of confidence 
was not empirically determined in the present
studies. Furthermore, by itself, a commitment 
hypothesis does not provide a clear explana- 
tion of the shift differences that have been 
found when confidence was not assessed and 
when commitment was presumably equalized
as a result (cf. Miller & Seligman, 1973;
Phares, 1957; Rotter, Liverant, & Crowne, 1961).

It is clear from this discussion that many
variables may be involved in the production
of expectancy shifts. It is also apparent that
some post hoc alternative explanations may
be offered to account for the present results.
In view of these circumstances, an advisable
future research course would include
developing several methods of manipulating
these variables, devising unobtrusive
measures for assessing these manipulations,
and then completing additional comparative
studies.


Implications 


Methodological. Applications of the ex- 
pectancy confidence hypothesis may be made 
to several areas. Perhaps the most straight- 
forward of these falls on a methodological 
level. The nonsignificant correlations between 
initial expectancies and expectancy confidence 
in both experiments and the lack of initial 
expectancy differences between confidence 
conditions in Experiment 2 indicate that ex- 
pectancies were independent of expectancy 
confidence. Unfortunately, the measurement 
of expectancies has consistently been confounded
with expectancy confidence in previous
research. This makes it difficult to interpret
clearly many previously reported findings.
Furthermore, from the standpoint of
future research, it would seem extremely im- 
portant to assess expectancy confidence sep- 
arately whenever expectancies are assessed. 

Theoretical. The expectancy confidence 
hypothesis also holds implications for per- 
sonality and psychotherapy theories using 
expectancy constructs (e.g., Bandura, 1977;
Rotter, 1954), in that it provides a new ex- 
planation of changes in expectancies and be- 
havior. Rotter, for example, has proposed that 
behavior is a function of expectancies for 
reinforcement and reinforcement value (Rot- 
ter, 1954; Rotter et al., 1972). Where rein- 
forcement value remains constant, changes in 
expectancies for reinforcement have typically 
been considered necessary preconditions for 
behavior change. A formulation of behavior 
change that takes into account the importance 
of expectancy confidence suggests that low 
levels of expectancy confidence are anteced- 
ents of changes in expectancies and behavior. 

Another application of the expectancy con- 
fidence hypothesis lies in the reinterpretations 
it provides of previous research based on the 
control perception hypothesis. As pointed out 
in an earlier section, it not only accounts for 
the findings of Phares (1957) and Rotter et 
al. (1961) but reconciles these findings with 
those of James and Rotter (1958). It also 
holds implications for research on the learned-
helplessness model of depression. Seligman’s 
(1975) interpretation of the low expectancy 
shifts of depressed undergraduates as evi- 
dence for this model rests not only on the ex- 
plicit assumption that expectancy shifts are 
greater for skill tasks than chance tasks but 
also on the implicit assumption that there are 
no other sources of differences in the size of 
expectancy shifts. The present results contra- 
dict these assumptions by indicating that ex- 
pectancy confidence is an important source of 
differences that has not previously been taken 
into consideration. The present results thus
call into question evidence that has been 
thought of as providing strong support for the 
learned-helplessness model. 

An alternative explanation of the associa- 
tion between depression and small expectancy 
shifts is that depressives shift their expecta- 
tions less than nondepressives because they 
are more confident of their expectancies. At 
first, this might seem illogical. However, it 
appears more reasonable when the nature of 
the certainty associated with depression is 
considered. The findings of Friedman (1964) 
and Loeb, Beck, and Diggory (1971) indi- 
cate that one of the hallmarks of clinical de- 
pression is extreme negativity regarding self- 
evaluation and expectations of successful 
performance. Beck (1967) has also observed 
that depressives cling to such beliefs with 
remarkable tenacity. Negative expectations of 
success and negative evaluations of perform- 
ance have also been reported for populations 
of depressed undergraduates (Wollert & 
Buchwald, 1979). On the basis of these reports
and the results of the present investigation,
it is proposed that depressives shift their
expectations less than nondepressives because 
they are relatively certain that they will 
neither succeed nor improve when confronted 
with a skill task. 

The expectancy confidence hypothesis pro- 
vides another explanation of expectancy shift 
differences and points to several important 
reinterpretations of previous results. It also 
holds implications for theories of personality, 
behavior change, and depression. This hypoth- 
esis emphasizes the structure of cognitions and 
seemingly provides additional breadth to personality
theories based on expectancy constructs.
To realize this potential, however, it
will be necessary to integrate the construct of
expectancy confidence with existing theories.
Consideration of those factors possibly affecting
expectancy confidence would also be required.
Such factors include experience, task
perceptions, the degree of internal consistency
within an expectancy network, the number of
plausible alternative expectancies, the availability
of meaningful information, and the operation
of interpersonal influence factors. In
view of the numerous applications of the expectancy
confidence hypothesis, further research
along these lines would certainly appear
warranted.

Components of Aggression in Chickens and Conceptualizations
of Aggression in General


D. W. Rajecki, David R. Nerenz, Terry G. Freedenberg, and 
Patricia J. McCarthy 


University of Wisconsin—Madison 


A refined analysis of the peck order in chickens was offered as a test of the 
notion that for this species, different responses such as leaping and various 
types of pecking need not be interchangeable indexes of aggression. Indeed, 
tests showed that particular response types of the birds were differentially 
mediated by organismic or environmental factors. In large cages pecking at the 
body was most frequent by birds that had a home-cage advantage. Contrarily, 
rates of aggressive leaping were independent of this environmental influence, 
with males having an advantage over females. Males showed more head peck- 
ing than females, but the profile for this sex difference did not resemble the 
profile for leaping. Correlational analyses revealed that whereas head pecking 
between testmates was not matched in frequency, leaping was positively related. 
Finally, the behavior of birds tested in small cages differed from that of the 
large-cage subjects. Although there was more head pecking in the small cages, 
males did not have an edge, and leaping was infrequent. Such results indicate 
that these responses cannot be viewed as interchangeable indicators of
aggression in fowl.


Our thesis is that certain indexes of pre- 
sumably aggressive behavior cannot be used 
interchangeably in an uncritical manner. 
Where they exist, separate influences on dif- 
ferent measures will require identification. 
Indeed, there is controversy over procedures 
and standards for classifying particular re- 
sponses as aggressive in both the human (cf. 
Baron, 1977; Tedeschi, Smith, & Brown, 
1974) and nonhuman (cf. Deag, 1977; 
Rowell, 1974; Syme, 1974) research domains. 
In terms of the latter literature, evidence is 
beginning to accumulate that in analyzing the 
social psychology of animals certain “components,”
“forms,” or “types” of aggression
(labels used by different writers) should not
be used interchangeably.
To take the literature on rodents as an
example, in intact rats there was no relationship
between success in intraspecies fighting
(in dominance encounters) and mouse killing
(Baenninger & Baenninger, 1970), and although
spontaneous dominance orders and
limited access (competitive) dominance orders
were reliable within themselves, they had a
very low interorder correlation (Baenninger,
1970). Further, rats that would not normally
engage in fighting (females and males familiar
with one another) could be induced to
fight in competition for food (Zook & Adams,
1975). Sensory alterations and other E
preparations have also revealed atea 
components of rat aggression. Whereas a 
rats fought at control levels in a shot 
tion, anosmic and devibrissaed pau 
less, and olfactory bulbectomized 
showed an increase in mouse killing G 
bee & Eichelman, 1972; Ghiselli & 
1975; Thor, 1976). Moreover, it 
demonstrated that shock-induced ag 
is separable from territorial aggression in rats
in terms of the differential influence of 
hypothalamic lesions on these two response 
types (Adams, 1971) and the differing behav- 
ioral profiles of intact animals across the two 
situations (Blanchard, Blanchard, & Taka- 
hashi, 1978). 

Studies with mice parallel the rat literature 
(Fredericson, 1952; Rowe & Edwards, 1971), 
and a set of researches indicates that the com- 
ponents or patterns of aggressive displays in 
certain fishes are not uniformly influenced by 
experiential and environmental factors (Clay- 
ton & Hinde, 1968; Davis, 1975; Shapiro & 
Schuckman, 1971). Further, success in peck 
exchanges between young chicks does not correlate
with their success in limited access
tests (Rajecki, Nerenz, Barnes, Ivins, & Rein, 
1977). Taken together, this growing body of 
evidence suggests that in the study of any 
species it may be necessary to forego the 
assumption that there is some unitary factor 
in all modes of aggression. This consideration 
brought us to an assessment of components 
of aggression in young domestic chickens. 

In fact, a detailed analysis of aggression in 
chickens is quite warranted. Although the 
term peck order is widely used by both scien- 
tist and layman, there is surprisingly little 
agreement as to just what sort of pecking or 
other form of behavior might constitute an 
aggressive response in young precocial birds. 
Some investigators have regarded any intersubject
pecking as aggressive (ducklings and
quail chicks: Eiserer, Emerling, Scardina, &
Hoffman, 1976); others have restricted attention
to pecks that cause the withdrawal of
the opponent (ducklings: Hoffman, Boskoff,
Eiserer, & Klein, 1975), whereas still others
have concentrated on the locus of pecking
(chicks: Rajecki, Ivins, & Rein, 1976).

However, it may be premature to categorize
any activity on the part of developing birds as
aggressive without a better knowledge of the
influence of organismic and environmental factors
on such behavior or a fuller understanding of
the mutual influence between birds engaged
in particular response types. Further, although
seminal sources have suggested that certain
facets of agonistic behavior emerge at different
points in the development of fowl (e.g.,
…, 1958; Kruijt, 1964), detailed information
concerning the ontogeny of such responses
is necessary before a taxonomy of
components can be established. Therefore, the
research reported here was undertaken to 
identify certain biological, maturational, and 
experiential influences on the capacity for 
aggressive social behavior among broodmates. 
The aim was to compare and contrast par- 
ticular response types as possible components 
of aggressive behavior in young fowl. 


Method 
Design 


The design of this longitudinal study permitted 
an assessment of the influence of particular factors 
on discrete components of chicks’ social behavior. 
Earlier reports indicated that sex is a strong de- 
terminant of differential status between birds (Wood- 
Gush, 1971), so pairs consisting of a female and a 
male were tested. In supplemental tests, same-sex 
pairs were observed for aggression. Other studies of 
young chickens indicated that in encounters in- 
volving social pecking, there is a kind of “prior 
resident” or “home-cage” advantage. That is, birds 
tested in the home cage (or in a situation containing 
some feature of the home cage) generally have an 
edge over birds introduced to that home cage (Ra- 
jecki, Grams, Stursa, & Nerenz, 1978; Rajecki, Lamb, 
& Suomi, 1978; Rajecki, Nerenz, & Rein, 1978).
Therefore, a “home-away” distinction was included
in the current design. Further, it is clear that the
physical dimensions of the test area have an influence
on the likelihood of aggression in fowl, with
ence on the likelihood of aggression in fowl, with 
less aggression seen in more cramped places (Hughes 
& Wood-Gush, 1977). Accordingly, chicks were 
reared in relatively large or small cages. 

Finally, a recent report shows that intersubject 
pecking between pairmates can be promoted by 
repeatedly separating and reuniting cagemates for 
day-long periods (Rajecki, Lamb, & Suomi, 1978),
so the current subjects were tested in at least 10
different reunions (following a period of physical
separation) over the first 3 weeks after hatching.
In sum, the basic design was a completely crossed
2 X 2 X 2 (X 10) factorial representing sex, the
home-away distinction, cage size, and a number of 
repeated measures. Records were kept of a variety 
of responses. 

In addition to observing the influence of the 
experimental conditions on social responses of de- 
veloping chicks, the covariance of given components 
was assessed to provide further information on the 
similarities and differences between components. 


Subjects 


Subjects were 84 cockerels and 84 pullets of the 
White Leghorn strain, received on the day of 
hatching from the Sunnyside Hatchery in Oregon, 
Wisconsin. The sex of the chicks had been determined
at the hatchery by professional poultrymen
using a method of cloacal extrusion, and sex was 
reestimated in the laboratory when the animals 
were 3 to 4 weeks of age by an examination of 
comb size and color. The two methods of determin- 
ing sex agreed in 92% of the cases of 3-week-old 
birds and in 97% of the cases of 4-week-old birds, 
so the original method was used to classify sub- 
jects. Birds were housed in mixed-sex pairs and were 
marked with a felt pen for identification. 

Large cages. Thirty-six pairs of chicks were 
assigned to 46 X 46 X 61 cm metal cages, each con- 
tinuously illuminated and heated to 32-35 °C by a 
shaded 150-W lamp installed outside the wire mesh 
door. After several days the original bulbs were 
replaced by 40-W lamps that produced a midcage 
temperature of 24-26 °C. Each cage was also 
equipped with a plastic or metal trough for food 
(Purina chick starter) and a watering station con- 
sisting of an inverted quart jar and a plastic basin. 
A 30 X 30 cm sheet of absorbent paper (Cagesorb) 
was placed on the floor of the cage near the lamp 
and was liberally sprinkled with food prior to the 
arrival of the residents. The purpose of the paper 
was to insure that chicks would begin to peck at 
food particles, and to provide a warm substrate for 
the very young birds. Regardless of its physical 
condition, the original sheet of paper remained in 
the large cages for the duration of the experiment. 
These cages were arranged in racks of six in several 
laboratory rooms.

Small cages. Forty-eight pairs of chicks were 
assigned to 22.5 X 17.5 X 17.5 cm metal cages which 
contained two 5-cm petri dishes covered with wire 
netting, one for water and one for chick starter. 
Each cage was continuously illuminated and was 
heated to 34 °C by a 40-W bulb outside the mesh 
wall. After several days the original lamps were 
replaced with 15-W bulbs that reduced the midcage 
temperature to 24 °C. The small cages were ar- 
ranged in racks of 12 in a single laboratory room. 


Procedure 


Experimental conditions. The chicks were deliv- 
ered to the laboratory in large groups in separate 
cartons of cockerels and pullets. They were assigned 
to cages in pairs comprised of a male and a female.¹
Because the birds were to be repeatedly separated 
and reunited over the course of the experiment, one 
chick was designated the home bird, the other the 
away bird, and each was marked accordingly. The
home bird remained in the home cage during sepa- 
rations, whereas the away bird was installed in a 
similar cage in an adjacent rack. Each away bird 
had its own away cage. The same bird was always 
the away bird, and birds were always reunited with 
their cagemate in the home cage. Across pairs an 
equal number of males and females were designated 
as away birds in both the large- and small-cage 
conditions. 


Test sequence. The subjects were tested in two 
large batches during late spring a
For a given test of a pair, the anim 
rated for 24 hours and were then 
the act of returning the away bird 
dling the animal, the home bird w 
Both birds were removed from their 
at the same moment, and were sii 
placed in the home cage. Observatii 
mediately upon reunion, and continu 
utes (after Rajecki, Lamb, & Suomi, 
ing the observation period the p: 
gether for 24 hours and then unden 
such separation-reunion episodes until 
of testing. 

Since a given pair could be tes 
other day because of the separation 
venient schedule was adopted wher 
sample was tested each and every d 
the day each batch arrived, half 
placed in pairs in home cages, wherea 
half of the sample was immediatel 
with the home bird placed in the h 
the away bird placed in the away 
rated chicks were reunited and ol 
later, at which point in time the birds 
originally paired were separated for 2: 
after these two subsets of the samj 
nately separated and reunited such 
birds were observed in reunions on 
age (1, 3, 5, 7, and so on for tho 
separated) and half were observed ol 
age (2, 4, 6, 8, and so on for those in 
All observations were made between 
1:00 p.m. local time. Testing cont 
fashion for about 3 weeks in the cas 
jects in the small cages and for about 
the large-cage subjects. This proced 
a within-subjects series of 10 and 1 
per pair in the small and large cages; 
Although we wished to maximize t 
tests in order to detect effects for 
under scrutiny, the testing in the sn 
terminated when the birds outgrew 
large-cage testing ceased when thi 
needed for a new batch of chicks 
scheduled for delivery on a certain @ 


Dependent Measures 


Responses during a test were Ti 
observers, each responsible for a § 


1It should be noted that the phi 
of the birds to conditions was donei 
who had no other connection with the } 
the record of those assignments was 
any observer until the completion 
a given batch. Accordingly, at least 
phase of this longitudinal study, 
blind to the sex of his or her sub 
servers were obviously aware of 
and of whether their ke Be 
bird). By about two weeks of a 
show A of sexual dimorphism. 
dependent measures were employed. Four of these
measures were based on the locus of subjects’ pecks 
and included pecks at the head, body, or feet of 
the testmate, or at any feature of the test unit. 
Discrete pecks were recorded by pressing corre- 
sponding buttons on hand-held panels that were 
connected to a multichannel event recorder. 

A fifth response was aggressive leaping which 
consisted of a hop or a very rapid charging motion 
toward the testmate, with the leaping bird’s wings 
flapping (cf. Kruijt, 1964, Figures 1 and 7; and 
the descriptions of sparring by Dawson & Siegel, 
1967; and Ratner, 1965). These responses very often 
ended with the subject colliding with the testmate 
or landing with one or both feet on some part of 
the testmate. Each discrete leap was recorded with 
the press of a button. 

Two other responses were recorded to measure 
states of disturbance on the part of the subjects 
(after Rajecki, Suomi, Scott, & Campbell, 1977). The 
first of these was the distress call, which is a long, 
loud note that is accompanied by the opening of 
the mandibles. Discrete calls were recorded. The
other disturbance reaction was labeled cage pushing, 
in which a bird appeared to be motivated to escape 
the test situation by pushing with its beak at the 
walls of the cage (as distinct from pecking) or by 
trying to wriggle through the gaps in the mesh 
front. The button representing this response was 
pressed for the duration of an episode of cage pushing.
The recording chart on which the responses
were recorded was calibrated in 2-sec intervals;
thus cage pushing will be discussed in terms of a
time sample of 2-sec intervals per test in which the
response was observed. On this index, maximum
score per test is 150.
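
The time-sample index can be illustrated with a short Python sketch. It is an assumption-laden reading, not the recording apparatus itself: the function name and the episode format (start and end times in seconds) are hypothetical, and the 300-sec test length is inferred only from the stated maximum of 150 intervals of 2 sec each.

```python
def interval_score(episodes, interval=2.0, test_length=300.0):
    """Count the 2-sec chart intervals in which a response was observed.

    episodes: list of (start, end) times in seconds, with start < end.
    The default test_length of 300 sec is an inference from the stated
    maximum score of 150 intervals of 2 sec each.
    """
    n_intervals = int(test_length // interval)
    score = 0
    for k in range(n_intervals):
        lo = k * interval
        hi = lo + interval
        # An interval is scored once if any episode overlaps it.
        if any(start < hi and end > lo for start, end in episodes):
            score += 1
    return score
```

Under these assumptions, a bird that pushed at the cage from 0.5 to 3.0 sec and again from 10.0 to 10.5 sec would score 3, and a bird that pushed throughout the test would reach the maximum of 150.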

An eighth response was termed immobility, in 
which a bird rested on its breast feathers and held 
its eyes closed for at least several seconds. For this 
response, as for cage pushing, a time sample of 
2-second intervals was obtained. 


Supplemental Tests in the Large Cages 


As testing proceeded, casual observations indi- 
cated a decline in head pecking over days, and an 
inverted-U pattern for aggressive leaping (and see 
Results). These changes could be interpreted in a
number of ways, so an additional test was provided
that might help evaluate competing accounts
for the time-dependent shifts in responding. Supplemental
tests of 14 pairs of birds³ housed in the
large cages took place after the final routine reunion.
These birds were tested with strangers to
see if they would react in a xenophobic
manner (cf. Rajecki, Kidd, & Ivins, 1976; Rajecki,
Ivins, & Kidd, 1977). Subsequent to the
regular reunion on Days 27-28, normal procedures
were followed whereby cagemates were allowed to
remain together for 24 hours and were then separated
for 24 hours. A 15th test of these birds was
conducted at the end of this final day-long separation,
but birds were not reunited with their cagemates.
Rather, they were confronted with a same-sex
bird from some other cage. A male bird that was 
an away bird was placed in the home cage of a 
home male, and the same procedure was followed 
for females. Seven pairs of male strangers and 7
pairs of female strangers were observed. Test pro- 
cedures were identical in every way to those em- 
ployed routinely. 


Results 
Subject Losses 


Thirty-six pairs of chicks were assigned to 
the large cages, but 3 pairs did not survive 
the experiment. Of the remaining 33 pairs, 17 
had males as the home bird and the rest had 
females as the home bird. Of the 48 pairs of 
birds assigned to the small cages, 2 pairs did 
not survive, thus in 24 of the remaining 
small-cage pairs the female was the home 
bird. 

The small cages provided suboptimal con- 
ditions for the locomotor development of 
chicks. By the end of the third week, 16 of 
the 46 cages contained lame birds. Of these, 
11 were male and 5 were female. The analy- 
ses of variance reported below were computed 
twice, once with the lame birds’ scores in- 
cluded and a second time with these pairs 


2 Interobserver agreement using push-button pan- 
els is not problematic for experienced observers (see 
Rajecki, Ivins, & Rein, 1976), but new personnel 
were involved in this project, so reliability was 
checked. Observations seemed more difficult in the
smaller cages, so all reliability estimates were made 
with extraexperimental birds in the small units. On 
several occasions pairs of observers watched the 
same bird and recorded its responses covertly. 
Average agreement on these occasions generally 
proved satisfactory for head pecks (M = .95, range
= .88 to 1.00), body pecks (M = .84, range = .67
to 1.00), cage pecks (M = .95, range = .88 to 1.00),
distress calls (M = .91, range = .79 to 1.00), and
aggressive leaps (M = .86, range = .78 to .93). Cage
pushing and foot pecking never occurred during 
reliability checks, but we have no reason to believe 
that observers were any less reliable on these mea- 
sures than on the others. As a final step to insure 
accuracy during tests, observers instructed one 
another to record responses on occasions when a 
particular observer had his or her view of a subject 
blocked by obstructions such as the large water 

unt in the larger cages. 

3 The decision to conduct the supplemental tests
was made at a time when the 14 pairs noted were 
the only birds remaining in the laboratory. 
Figure 1. Head pecks per minute as a function of experimental conditions.


excluded. The two versions of each analysis 
were identical with respect to the direction 
and significance levels of results, so in this 
section the data from the complete sample 
are reported. 


Analyses of Variance 


Each response type was analyzed separately
in terms of the 2 X 2 X 2 (X 10)
design outlined in the method section. The 
birds in the large cages were tested on several 
more occasions than were the small-cage birds, 
but for purposes of comparison, only the 
first 10 test days of large-cage birds were 
contrasted with the 10 tests of birds in small 
cages. However, for clarity, all 14 tests of the 
former subjects are plotted in the figures to 
follow, where it can be seen that there were 
no substantial changes in trends established 
by the tenth test. 
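
As a rough consistency check, the error degrees of freedom of this design can be worked out from the subject counts given under Subject Losses. The Python sketch below assumes the standard formulas for a mixed design with the individual bird as the unit of analysis; the function name is hypothetical and the calculation is not taken from the authors.

```python
def anova_error_df(n_subjects, between_levels, n_repeated):
    """Error df for a mixed factorial design (standard textbook formulas).

    n_subjects:     number of individual subjects analyzed.
    between_levels: number of levels of each between-subjects factor.
    n_repeated:     number of repeated measures per subject.
    """
    cells = 1
    for levels in between_levels:
        cells *= levels
    between_error = n_subjects - cells            # subjects within cells
    within_error = (n_repeated - 1) * between_error
    return between_error, within_error


# 36 - 3 surviving large-cage pairs plus 48 - 2 surviving small-cage pairs.
surviving_pairs = (36 - 3) + (48 - 2)
birds = 2 * surviving_pairs
print(anova_error_df(birds, (2, 2, 2), 10))  # (150, 1350)
```

With 158 surviving birds in 8 between-subjects cells and 10 test occasions, the sketch yields 150 and 1350, which agree with the degrees of freedom of the F ratios reported in this section.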

In these (and subsequent) analyses the 
scores for birds tested on Day 1 and Day 2 
were combined, since these days represented 
the first test for both subsets of subjects. 
This combination procedure was carried out


respectively), Day 5 and Day 6 see 
test, respectively), and so on for t 
ing test days. The average score 
combined test days are plotted im 
entries for Day 1-2, Day 3-4, am 
the figures below. 

Finally, the score of the i 
was considered the unit of a 
was the only way to preserve 
between the sexes, and the ho 
tinction. However, mutual infl 
pairs is of interest and will be ¢ 
the section on correlational ana 

Head pecking. The averag 
head pecks per minute are shown} 
Overall, there were more head pé 
small cages than in the large ca 
= 17.55, p < .01, but this difi 
ated later in testing as indicated 
Size x Days interaction, F(9, 1 
p < .01.

Another clear finding is 
gave more head pecks than a’ 
150) = 10.94, p < .01. This h 
fect was not influenced by cage $ 




cated by an insignificant interaction between 
these variables, F(1, 150) = .92, ns, and sex
had little impact on the home-away distinc- 
tions, as revealed by an insignificant Sex X 
Home-Away interaction, F(1, 150) = 1.02, 
ns. 

On head pecking the sexes were not clearly 
different overall, F(1, 150) = 2.87, p > .05.
There was, however, an interesting Sex X
Cage Size interaction, F(1, 150) = 4.02, p <
.05, indicating that males enjoyed more of an
advantage in the large cages than in the small
cages. Further, holding everything else equal,
a Sex X Days interaction indicated that males
had an advantage early in the test sequence,
but less so later on, F(9, 1350) = 2.37, p <
.05.

Finally, there was a general decline in head
pecking over days, F(9, 1350) = 79.02, p <
.01.

Body pecking. Body pecks per minute are 
shown in Figure 2. Neither cage size, F(1, 
150) = .51, ns, nor sex, F(1, 150) = .14, ns,
had an influence on this response type. However,
as in the case of head pecking, there
was a clear influence for the home-away distinction,
with home birds of either sex having
an advantage, F(1, 150) = 8.39, p < .01.
There were no interactions between any of 
the between-subjects factors of cage size, sex, 
and home-away status. 

There were marked changes in the rate of 
body pecking over days, F(9, 1350) = 21.46, 
p < .01, and a Cage Size X Days interaction 
indicated that this response type peaked 
sooner in the large cages than in the small 
ones, F(9, 1350) = 2.11, p < .05. Finally, a 
Home-Away X Days interaction supports the 
impression that there was a decline in the 
home-cage advantage over the days of test- 
ing, F(9, 1350) = 2.47, p < .05. 

Foot pecking. Foot pecks were rare, rela- 
tive to pecks at other places. The grand per- 
minute means for head, body, and cage pecks 
were 2.61, 1.85, and 8.88, respectively, com- 
pared to a grand mean of .35 for foot pecks. 

Figure 2. Body pecks per minute as a function of experimental conditions.

Figure 3. Cage pecks per minute as a function of experimental conditions.


The pattern of foot pecks generally paral- 
leled that for body pecks. Three significant 
effects emerged. There was more foot pecking 
by home birds (M = .41) than by away birds 
(M = .29), F(1, 150) = 5.70, p < .05; pecking
was more frequent early in the test sequence
than later, F(9, 1350) = 20.74, p <
.01; and the home-away distinction diminished
over days, F(9, 1350) = 3.18, p < .05.
Cage pecking. The patterns of cage peck- 
ing are presented in Figure 3. There was less 
pecking by home birds (M = 7.88) than by 
away birds (M = 9.88), F(1, 150) = 4.33,
p < .05; a general increase in cage pecking 
over the test sequence, F(9, 1350) = 17.35, 
p < .01; and a Cage Size X Days interaction 
indicated that whereas small-cage birds 
pecked less at the beginning of the sequence, 
later on they equalled the rates of the large- 
cage birds, F(9, 1350) = 2.84, p < .01. 
Aggressive leaping. The mean numbers of 
aggressive leaps per minute are plotted in 
Figure 4. Several between-subjects effects 


emerged from this measure. First, there were
far more leaps in the larger cages than in the
smaller cages, F(1, 150) = 83.34, p < .01.
Second, males showed a higher overall rate of
this response than did females, F(1, 150) =
15.70, p < .01. Finally, males in the large
cages had a greater advantage over females
than did the males in the small cages, F(1,
150) = 9.32, p < .01. Also interesting is the
finding that there was no effect for the home-away
distinction on this particular measure
(means of .32 and .29, respectively), F(1,
150) = .50, ns, and that the home-away factor
entered into no significant interactions
with any other variable.

As was the case for other measures, days
of age had a strong influence on
leaping, F(9, 1350) = 29.73, p < .01. Further,
there were interactions between cage
size and days, F(9, 1350) = 18.60, p < .01,
sex and days, F(9, 1350) = 6.94, p < .01,
and a three-way interaction between cage size,
sex, and days, F(9, 1350) = 3.85.
These interactions mean that the differences
between the sexes were more evident later in 
the test sequence than in early tests and that 
large-cage males had a longer lasting advan- 
tage over females than did small-cage males. 

Other response types. The remaining re- 
sponses were rarely observed. Scores on the 
immobility measure could range up to 150 per
test, but near-zero rates emerged, with a
grand mean for immobility of .12. The cage
push reaction was also quite infrequent, with 
a grand mean of 1.35 out of a possible 150 
per test. Therefore, whereas some significant 
effects emerged from these two measures, they 
do not bring much clarification to the issues 
at hand, and we hesitate to place strong in- 
terpretation on these contrasts because of 
their basis in such small numbers. 

Distress vocalizations were also relatively
infrequent, with a grand mean of .18. That is,
during any given minute of testing, an ob- 
server was 76 times more likely to see some 
kind of peck than to hear a distress call. Although
we have reservations about strong
interpretations based on this overall low 
rate, it is perhaps worth noting that there was 
a general decline in distress calls per minute 
from the first day (M = .50) to the last day 
of testing (M = .02), F(9, 1350) = 2.64, p
< .01.
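
The factor of 76 quoted above can be checked against the grand means given in the text. The sketch below assumes the figure was obtained by summing the four per-minute peck means and dividing by the distress-call mean; the text does not state the calculation, and the function name is only illustrative.

```python
# Grand per-minute means as reported in the text.
peck_means = {"head": 2.61, "body": 1.85, "cage": 8.88, "foot": 0.35}
distress_mean = 0.18


def peck_to_distress_ratio(pecks, distress):
    """Return total pecks per minute divided by distress calls per minute."""
    return sum(pecks.values()) / distress


# Assumed calculation: (2.61 + 1.85 + 8.88 + .35) / .18
ratio = peck_to_distress_ratio(peck_means, distress_mean)
print(round(ratio))  # prints 76
```

Under that assumption the reported ratio is reproduced, which also confirms that the four peck means and the distress mean are mutually consistent.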


Correlational Analyses 


Correlations between pairmates were com- 
puted for selected measures on each day of 
testing. Large and small cage data were 
treated separately. The results of these analyses
are shown in Figure 5, where conventional
(.05) significance levels are also depicted.
These significance levels differ over the
two cage sizes because more pairs were tested
in the small cages (n = 46) than in the large
cages (n = 33).

Figure 4. Aggressive leaps per minute as a function of experimental conditions.

Figure 5. Correlation coefficients showing the relationship between pairs of birds on response types.
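
To illustrate why the .05 criterion differs between the two cage sizes, the following sketch approximates the smallest correlation that reaches significance for a given number of pairs. It uses the Fisher z approximation with a two-tailed test, both of which are assumptions made here; the article does not say how its criterion lines were computed, and exact t-based values differ slightly.

```python
import math


def approx_critical_r(n_pairs, z_crit=1.96):
    """Approximate two-tailed critical Pearson r via the Fisher z transform.

    Under the null hypothesis, atanh(r) is roughly normal with standard
    error 1 / sqrt(n - 3), so the threshold is tanh(z_crit / sqrt(n - 3)).
    """
    if n_pairs <= 3:
        raise ValueError("need more than 3 pairs")
    return math.tanh(z_crit / math.sqrt(n_pairs - 3))


small_cage_r = approx_critical_r(46)  # pairs tested in the small cages
large_cage_r = approx_critical_r(33)  # pairs tested in the large cages
print(f"{small_cage_r:.2f} {large_cage_r:.2f}")  # prints 0.29 0.34
```

With fewer pairs, a larger coefficient is needed to reach the same significance level, so the criterion is stricter for the large cages.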

There is a fair amount of consistency in 
the patterns of coefficients across the two test 
units. Correlations for cage pecking were 
mostly positive and stable, and were signifi- 
cantly greater than zero on many test days. 
This pattern is in line with an earlier report 
on the matching of cage pecking during this 
part of chicks’ developmental span (cf. Ra- 
jecki, Lamb, & Suomi, 1978, Table 2). We 
interpret these positive correlations to mean 
that the cage pecking by one bird elicited or 
facilitated the cage pecking of its testmate 
(cf. Rajecki, Ivins, & Kidd, 1977). 

On the other hand, the coefficients for head 
pecking showed a weak negative relationship 
for this response. The interpretation placed 
on this pattern is that a head peck by one 
bird at another did not necessarily elicit or 
facilitate return head pecks, and on some 
occasions (indicated by significant negative 
coefficients) a high rate of pecks to the head 
precluded return pecks to that locus. 

In further contrast to both the patterns for 
cage and head pecking, the coefficients for 
aggressive leaping suggest that initially (or 
relatively early in testing in the small cages) 
such leaps elicited return leaps. However, as 
reference to Figure 4 shows, as leaps per test
increase, the positive relationship
leap rates of testmates deterior; 
and below. This means that w 
leaping of given males was 
particular female test mates we 
responding to the same degree. 


Supplemental Tests in Large Cages


The purpose of the supplen 
strangers was to provide some 
about the decline in certain res 
least in terms of whether the 
sessed some capacity for aggr 
ing. The tests with strangers i 
last regular tests with cagemat 
responses were at their lowes 
Therefore, it will be useful to” 
cagemate test as a baseline 
gauge xenophobic reactions. I 
plemental tests produced 
shifts in response rates. Altho 
away distinction was included 
cal design that evaluated the 
case was there a significant 
this comparison. Accordingly, 
restricted to comparisons for se 
of test (cagemate versus str 
effects for these variables are 
ak 
There was a marked ele 
at strangers compared to such 
mates during the last regular 
= 9.00, p < .01. The sexes did not differ on
this measure, F(1, 24) = 2.18, ns, nor was
there an interaction between sex and type of 
test, F(1, 24) = 1.54, ns. 

Effects for body or foot pecking did not 
emerge, but there was a strong effect for tests 
on the cage peck measure. Table 1 shows that 
cage pecking was much less frequent among 
strangers than among cagemates, F(1, 24) = 
12.15, p < .01, with males and females af- 
fected across tests about equally, F(1, 24) = 
.17, ns.

On the other hand, sex had a very strong 
influence on aggressive leaping, with males 
engaging in this behavior more than females 
overall, F(1, 24) = 28.19, p < .01. As was
the case with head pecking, there was more
leaping during tests of strangers than during
tests of cagemates, F(1, 24) = 11.19, p <
.01, but interestingly enough, it can be seen
in Table 1 that the cagemate-stranger difference
is due entirely to the behavior of the
males, F(1, 24) = 10.56, p < .01. In fact,
comparisons between Table 1 and Figure 4 
show that males’ average rate of leaping at 
unfamiliar males was about twice as high as 
the highest average rate of leaping at female 
cagemates. This indicates that leaping was in- 
deed an aggressive response and not strictly 
a sexual response directed at the opposite sex. 

Table 1

Average Responses per Minute in the Last
Regular Tests With Cagemates (Mixed-Sex
Pairs) and the Supplemental Tests With
Strangers (Same-Sex Pairs)

                              Test
Sex                   Cagemate    Stranger

Head pecking
  Male                3t          1.07
  Female              .10         .41
Cage pecking
  Male                17.86       84
  Female              17.46       7.29
Aggressive leaping
  Male                .56         2.54
  Female              .14         .17

The distress call and immobility measures
revealed no significant effects, but there was
an interesting interaction between sex and 
test on the cage push measure. When tested 
with their familiar female cagemates, males 
showed 0 scoring units (see Method) on this 
index, but their scores rose to an average of 
1.14 when tested with unfamiliar males. To 
the contrary, females showed an average 
score of 4.21 when tested with familiar males 
and only .43 such units when confronted with 
unfamiliar females, F(1, 24) = 5.03, p < .05. 


Discussion 


Note should be made of the fact that the 
behavioral profiles just reported emerged 
under conditions of multiple separations and 
reunions of female and male pairmates. There 
is little doubt that birds reared under other 
circumstances, sexual combinations, or num- 
bers might exhibit very different patterns. 
Previous evidence indicated that, compared 
to animals never separated, the multiple sepa- 
ration procedure enhances social behavior over 
this age range (see Rajecki, Lamb, & Suomi, 
1978), and this was precisely why the tech- 
nique was used here. The separation technique 
proved quite useful for demonstrating the 
capacity and forms of social behavior in devel- 
oping fowl, which was the aim of the study. 

In some cases, patterns of behavior across 
the large and small cages were vastly dif- 
ferent. In order to bring organization to 
these and other complicated results, we will 
begin with the findings from the large cages, 
and then take up the small-cage data. 


Behavior in Large Cages 


The behavior of the birds in the large cages 
clearly indicates that in studying the develop- 
ment of aggressiveness in such species, dif- 
ferent forms of physical contact between in- 
dividuals cannot be viewed as equivalent. 
That is, response types such as head pecking, 
body pecking, foot pecking, and leaping are 
not interchangeable indexes of “aggression”
because they seem governed by separable 
influences or factors. By this we do not mean 
merely that certain response types are more 
or less likely at given points in maturation. 
Rather, we mean that when they emerge or 
reach peak rates, given responses seem differentially
mediated by internal or external factors.

At one extreme, aggressive leaping in the 
large cages appears to have been independent 
of the environmental factor represented by 
the home-away distinction. Regardless of 
whether males were home or away, they began 
leaping (on average) at the same point in 
the developmental span and reached the same 
peak rate on the same test day. Also interest- 
ing in this respect are the leaping responses 
of the females. On average, both home and 
away females initially matched the male rates 
but then peaked at the same level on the 
same day. 

At the other extreme, sex had little effect 
on body pecking compared with its influence 
on leaping. When rates of body pecking 
peaked, it was the home bird that showed the 
highest rate, regardless of sex (and a similar 
pattern emerged for foot pecking). 

Further, it seems possible to draw distinc- 
tions between leaping and head pecking. 
Males had a general advantage over females 
in head pecking at the point of peak response 
rate, but this sex difference did not conform 
to the profile of the sex difference for leap- 
ing. In leaping, males established an advan- 
tage and then kept it (so to speak), whereas 
in head pecking the males had an advantage, 
then lost it. Finally, head pecking is different 
from body pecking, since at the point of 
maximum head peck rates in the large cages, 
the biological factor of sex outweighed the 
environmental factor that was based on the 
home-away distinction. 

The components of social behavior are also 
distinguished from one another by the corre- 
lations within response types. The correla- 
tional analyses presented in Figure 5 demon- 
strate pronounced differences across response 
categories. These analyses indicate that cage 
pecks were matched between birds, but that 
head pecks were not. Interestingly, leaps were 
matched early in the test sequence but not 
later. The interpretation we place on these
different patterns is that different motiva- 
tional factors mediate such responses. A peck 
to the head may or may not reflect an ag- 
gressive disposition on the part of the pecking 
bird, but it does not seem to serve as an occasion
for retaliatory pecks on the part of
target (cf. Rajecki, Ivins, & Rein, 1976 
the other hand, an aggressive leap was 
to elicit a leap in return,‘ up to a 
testing. We suspect that that point 
reached when the females became submi 
This suspicion seems borne out by the re 
of males to chickens that were not submi 
that is, their opponents in the male- 
plemental tests. These supplementa 
produced the highest rates of leaping s€ 
the study and revealed a very close ma 
leap rates between such willing birds, 
= .98, p < .01.

This allusion to the supplemental 
brings us to another distinction between 
ing and head pecking. The tests of 
strangers revealed very little leaping, a 
suggest (as noted earlier) that this W 
because they had learned, on the ba 
their experience with males, to inhi 
in the presence of a testmate. Howe 
females were subdued with respect to M 
they were much less so with respect t 
pecking. Like their male counterparts, 
showed a fourfold increase in head 
tests of strangers, as compared with ti 
cagemates. 


Behavior in Small Cages 


A number of reports (cf. Craig & 
1974; Craig, Biswas, & Guhl, 1969; 1 
& Wood-Gush, 1977) indicate that fo 
in close confinement are less likely t 
aggressive behavior than are birds r€ 
more spacious units. Hughes and 
(1977) account for this difference 
pothesizing that aggression 1s trigg! 
the event of birds coming into 
rather than by the condition of bi j 
tinuous proximity. Indeed, the data 4 


small cages are in strong support oi 


41t is worth reporting that head pea 
correlated with leaps. Correlation © ; 
computed to determine the relatio: i 
male head pecks and female leaps 
female head pecks and male leaps. f 


yielded 48 coefficients from the a ae 


from zero. 
pothesis that close confinement dampens or
otherwise influences aggressive responding. 
The first indication of this is that males in 
small cages had no advantage over females 
on head pecking; in these units the home 
bird had the edge regardless of sex. There was 
also a home-cage advantage in body pecking, 
as was seen in the larger units. 

However, the most dramatic difference 
across the two cage sizes was on the aggressive 
leaping index, with leaps in the small cages 
being far less frequent (see Figure 4). There 
are two ways to account for this difference. 
First, the relative absence of leaping in the 
small units may be in line with the Hughes 
and Wood-Gush (1977) notion that birds 
continuously in proximity are not in the 
proper condition to be stimulated to aggress. 
On the other hand, at present we cannot rule 
out the argument that the very dimensions of 
the small cages may have reduced leaping, 
whatever the disposition of the inhabitants. 

The distinction between leap and head 
peck rates in the small cages is, however, 
noteworthy with respect to the motivational 
bases of such responses. While leaping was 
somehow limited by confinement, head peck- 
ing under these conditions was initially higher 
than in the large cages. Of course, the peak 
rates for these responses occurred at different
points in development, but the differential 
effect of the small units on pecking and leap- 
ing further points to distinctions between 
components of agonistic behavior in fowl. 


Issue Involvement Can Increase or Decrease Persuasion
by Enhancing Message-Relevant Cognitive Responses 


Richard E. Petty 
Two experiments were conducted to test the hypothesis that high issue involvement
enhances thinking about the content of a persuasive communication. Experiment 1
varied involvement and the direction of a message (proattitudinal
or counterattitudinal). Increasing involvement enhanced persuasion for the
proattitudinal but reduced persuasion for the counterattitudinal advocacy.
Experiment 2 again varied involvement, but both messages took a counterattitudinal
position. One message employed compelling arguments and elicited primarily
favorable thoughts, whereas the other employed weak arguments and elicited
primarily counterarguments. Increasing involvement enhanced persuasion for
the strong message but reduced persuasion for the weak one. Together the
experiments provide support for the view that high involvement with an issue
enhances message processing and therefore can result in either increased or
decreased acceptance.


Persuasion researchers have recognized for
some time that it is easier to demonstrate
attitude change in the laboratory than in the
field. One prominently mentioned explanation
for this observation is that the advocacies
employed in laboratory investigations are of
considerably lower “involvement” than the
advocacies encountered in the real world
(Hovland, 1959). Greater involvement with an
issue is presumably related to greater resistance
to persuasion (cf. Triandis, 1971). The
major goal of the present paper is to present
and test a “cognitive response” (Greenwald,
1968; Petty, Ostrom, & Brock, in press)
interpretation of involvement effects holding
that increasing involvement with an issue
increases one’s motivation to process information
relevant to the issue and can lead to
either increased or decreased persuasion.


a We are grateful to Chuck LaJuenesse, Colleen Holt, 
hoe Bistline, and Michael Bryant for help in col- 
Before presenting the model in more detail,
it is necessary to note that attitude re- 
searchers have distinguished between two dif- 
ferent types of involvement that can affect 
susceptibility to influence. One kind of in- 
volvement concerns the extent to which the 
attitudinal issue under consideration is of per- 
sonal importance, whereas a second concerns 
the extent to which the particular attitudinal 
response adopted is of personal importance to 
the individual. The first type of importance, 
which is the type under investigation here, has 
been called “issue involvement” (Kiesler, Collins,
& Miller, 1969), “ego-involvement”
(Rhine & Severance, 1970; Sherif, Sherif, & 
Nebergall, 1965), and “personal involvement” 
(Apsler & Sears, 1968; Sherif, Kelly, Rodgers, 
Sarup, & Tittler, 1973). Also, Halverson & 
Pallak (1978) and Madsen (1978) have 
argued that typical manipulations of “com- 
mitment” (cf. Kiesler, 1971) elevate involve- 
ment with an issue. 

The second type of involvement is often 
referred to as “response involvement” (Zim- 
bardo, 1960) or “task involvement” (Sherif 
& Hovland, 1961). In this second kind of involvement,
the attitudinal issue is not particularly
important to the person, but adopting
a position that will maximize the immediate
situational rewards is. Thus in some
cases response involvement will lead to in- 
creased influence (e.g., Zimbardo, 1960) and 
in some cases decreased influence (e.g., Freed- 
man, 1964), depending upon which position 
the rewards favor. 

The focus of this paper is on the mechanism 
mediating the effects of high issue involve- 
ment. High issue involvement occurs when an 
issue has “intrinsic importance” (Sherif & 
Hovland, 1961, p. 197), or “personal meaning” 
(Sherif et al., 1973, p. 311), when people ex- 
pect the issue “to have significant conse- 
quences for their own lives” (Apsler & Sears, 
1968, p. 162), and when concerns about im- 
mediate situational rewards are “dwarfed by 
outcomes connected with the topic itself” 
(Cialdini, Levy, Herman, Kozlowski, & Petty, 
1976, p. 664). Most of the early research in- 
dicated that increased issue involvement was 
associated with increased resistance to per- 
suasion (e.g., Miller, 1965; Sherif & Hovland, 
1961). The most prominently mentioned ex- 
planation for this finding was derived from 
social judgment theory (Sherif et al., 1965). 
The notion was that on any given issue, highly 
involved persons should exhibit more negative 
evaluations of a communication because high 
involvement is associated with an extended 
“latitude of rejection” (the attitudinal posi- 
tions that a person finds unacceptable). Thus, 
incoming messages on high involvement issues 
would have an enhanced probability of being 
rejected because they were more likely to fall 
within the unacceptable range of a person’s 
implicit attitude continuum (cf. Eagly & 
Manis, 1966). 

Contrary to social judgment theory, which 
seems always to predict greater resistance 
with increased issue involvement, some inves- 
tigators have found increased involvement to 
be associated with greater influence. For ex- 
ample, Eagly (1967) presented subjects with 
information about either themselves (high 
involvement) or another person (low involve- 
ment) that was discrepant in either a favor- 
able or an unfavorable direction from their 
initial attitudes. She found that although high 
involvement subjects changed less than low 
involvement subjects when unfavorable information
was provided, the former
more change when favorable infor 
provided. Similarly, Pallak, Muell T 
& Pallak (1972) presented subjects 
either publicly committed (high in 
or privately committed (low involve 
their initial attitudes with informa 


probability of rejecting or contrasti 
terattitudinal information but incr 
probability of accepting or assimi 
attitudinal information. 
These same data may be expla 
alternative view holding that invol 
creases the amount of thought in wi 
jects engage about the stimulus ini 
When the stimulus information is incon 
with subjects’ original attitudes, it 1S 
that subjects are motivated and abl 
erate counterarguments to the material 
sented. To the extent that incre: 
ment is associated with more 
creased counterargumentation and 
to influence would be a likely result 
other hand, when the stimulus info 
consistent with subjects’ original ai 
is likely that they are initially bi 
generating favorable cognitions. To tl 
that increased involvement is asst 
more thinking here, more favorabl 
might be generated, and increased 1 
would result. There is already a larg 
of literature supporting the vi 
idiosyncratic cognitive responses 
communication are an important 4 
of the direction and amount Of | 
change produced (e.g., Brock, 1967; © 
& Petty, 1979; Cook, 1969; Insko, % 
& Yandell, 1974; Petty, Wel 
1976; Tesser, 1978; etc.). io 
i i nt to note k 
It is also importa; ie 


already some suggestions in the 
increased involvement with a stimulus is associated
with more extensive information processing.
For example, Rogers, Kuiper, and
Kirker (1977) had subjects rate personality 
adjectives in terms of their self-relevance 
(high involvement) or semantic, phonemic, 
and structural properties (low involvement). 
Adjectives processed under high involvement 
conditions were subsequently recalled best in 
an incidental recall test. According to Craik 
and Lockhart’s (1972) “depth of processing” 
framework, enhanced recall is thought to re- 
flect more extensive processing of the stim- 
ulus. More relevant to persuasion situations, 
Cialdini et al. (1976) found that subjects 
who expected to engage in a discussion with 
an opponent generated more supportive 
thoughts in anticipation of the discussion 
when the attitude issue was of high rather 
than low personal relevance. Finally Chaiken 
(Note 1) and Petty & Cacioppo (1979) have 
reported that subjects’ message-relevant 
thoughts show higher correlations with mes- 
sage acceptance under high than under low 
issue involvement conditions. 

Two experiments were conducted to test 
the cognitive response view of the effects of 
issue involvement. The primary goal of Ex- 
periment 1 was to replicate conceptually the 
Eagly (1967) and Pallak et al. (1972) atti- 
tudinal findings; measures of subject-gen- 
erated cognitive responses were included also 
to assess the viability of the proposed reformulation
of issue-involvement effects. The second
experiment was designed to evaluate the
two competing explanations of issue-involvement
effects, namely, thought enhancement
versus the revised social judgment theory
formulation. Thus, an experiment was devised
for which the two formulations made competing
predictions.


Experiment 1 


Most of the early work on issue involvement
was conducted by finding existing groups
that differed in the extent to which an issue
was important, and thus this work was correlational
in nature (see Kiesler et al., 1969,
for the interpretational problems with this
approach). More recent investigators have
chosen to manipulate involvement by varying
the issue between subjects (e.g., Dean, Austin, 
& Watts, 1971; Rhine & Severance, 1970). 
In other words, some subjects would receive a 
highly involving issue (e.g., increasing tui- 
tion), whereas others received an issue of low 
involvement (e.g., increasing park acreage in 
a distant city). A preferable procedure that 
keeps the communication constant across sub- 
jects was introduced by Apsler and Sears 
(1968) and is the method employed here. In 
this procedure, subjects in both high and low 
involvement groups receive the same com- 
munication, but high involvement subjects are 
led to believe that the advocated change will 
affect them, whereas low involvement subjects 
do not believe the change will have personally 
relevant effects. In the present study, this was 
accomplished by telling the college student 
subjects that a proposed change in university 
regulations regarding mixed-sex visitation 
hours was being made at either their uni- 
versity or a distant university. Pilot testing 
was conducted to develop proattitudinal and 
counterattitudinal communications. Not sur- 
prisingly, advocating more lenient regulations 
regarding visitation hours was found to be 
highly proattitudinal, and advocating stricter 
regulations was highly counterattitudinal. In
addition, pilot testing revealed that subjects 
generated predominantly favorable thoughts 
to the proattitudinal communication and 
counterarguments to the counterattitudinal 
appeal. In Experiment 1 subjects received 
either a proattitudinal or a counterattitudinal 
advocacy under conditions of either high or 
low issue involvement. We predicted that increased
involvement would enhance processing
of the message contents. As a result of
this, we expected both cognitive and affective 
responses to the two messages to be more 
extreme under high than under low involve- 
ment conditions. More specifically, increased 
involvement should decrease agreement with 
the counterattitudinal message but increase 
agreement with the proattitudinal communication.


Method


Procedure 


Twenty-four male undergraduates at the Uni- 
versity of Notre Dame participated in order to earn 
extra credit in an introductory psychology course.
The design was a 2 (High or Low Issue Involvement)
× 2 (Proattitudinal or Counterattitudinal Message
Direction) factorial. Subjects were tested individ- 
ually. Upon arrival, subjects were informed that stu- 
dents in a sound engineering course had prepared 
the communications that were to be employed in the 
investigation and that in return for the use of the 
tapes in other research, the investigators had agreed 
to provide evaluations of the sound quality of the 
tapes. The subjects were asked to aid in this en- 
deavor. 

Subjects were informed of the topic and position 
of the communication they were about to hear and 
60 sec later were exposed to one of two professionally 
taped messages. Following exposure to a 2-minute 
message, subjects completed the dependent variable 
booklets and were debriefed, thanked, and dismissed. 


Independent Variables 


Message direction. All subjects were exposed to a 
communication concerning coed visitation hours. The 
proattitudinal advocacy contended that colleges 
should be more lenient in allowing mixed-sex visitation.
Elaborations of the following arguments were
employed: (a) More lenient hours would not interfere
with education, since the “Educational Examination
Foundation” found no correlation between the
length of visitation hours and graduate school entrance
examination scores; (b) they also found no
connection between visitation hours and grade point
average; (c) enforcement of the morality inherent
in visitation hours fails because morality is not tied
to time of day; (d) since college is a period of
responsible maturation, the imposition of visitation
hours can be counterproductive; and (e) students
are able to judge for themselves at what time a party
should end. The counterattitudinal advocacy con-
should be more strict in their 


: ies. Elaborati A 
arguments were employed: ol ea 


For ji i ear 

volvement conditions, the aes vets ae i 
change in visitation rules go into effect at thay d 
university (Notre Dame). Thus all subjects Sed be 
affected personally by the proposal. In the low in- 
volvement conditions, the speaker advocated that the 
change in visitation rules go into effect at another 
university (Juanita Junior College). Thus, none of
the subjects would be affected personally by the proposal.
The messages in both involvement conditions
were identical except for the words Notre Dame and
Juanita Junior College.


Dependent Variables 


Attitude measures. On the first page of the dependent
variable booklet, subjects read: “Because
your own opinion about the position advocated on
the tape may influence the way you rate the quality
of the tape, we would like to obtain a measure of
how you feel about the views proposed by the
speaker on each scale below.” Two measures of
opinion about visitation hours were included. First,
subjects rated the advocated position on four 9-point 
semantic differential scales (harmful-beneficial, wise- 
foolish, good-bad, favorable-unfavorable) that were 
summed to obtain a general measure of evaluation. 
Next, subjects responded to an 11-point Likert-type
rating scale regarding their agreement with the
speaker's position. On the scale, 1 indicated that the
subject “did not agree at all,” and 11 indicated
“agree completely.” The subjects’ responses to the
two attitude measures were converted to standard 
scores and were averaged for an index of communica- 
tion acceptance. 
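
The construction of the acceptance index described above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' scoring procedure: the ratings are invented, the function names are hypothetical, the four semantic differential items are assumed to be keyed so that higher values are more favorable, and the sample standard deviation is assumed for standardizing.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize scores to mean 0 and SD 1 (sample SD is an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def acceptance_index(semantic_items, agreement):
    """Average the standardized evaluation sum and the standardized agreement rating."""
    # Sum the four 9-point semantic differential ratings for each subject.
    evaluation = [sum(items) for items in semantic_items]
    z_eval = z_scores(evaluation)
    z_agree = z_scores(agreement)
    return [(e + a) / 2 for e, a in zip(z_eval, z_agree)]


# Invented ratings for four hypothetical subjects.
semantic = [[8, 7, 9, 8], [2, 3, 1, 2], [5, 5, 6, 4], [7, 6, 7, 8]]
agree = [10, 2, 6, 9]
index = acceptance_index(semantic, agree)
```

Because each component is standardized before averaging, the two measures contribute equally to the index even though one runs from 4 to 36 and the other from 1 to 11.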

Cognitive response measures. After completion of
the attitude scales, subjects were given 2½ minutes
to list the thoughts they had while listening to the
tape (cf. Petty & Cacioppo, 1977). Twelve 8-inch
(20.32 cm) horizontal lines, each about 1 inch (2.54
cm) from the one above, created the boxes in which
subjects were to write their ideas, one per box. After
recording their thoughts, subjects were instructed to
rate their ideas as either + (in favor of the advocated
position), - (opposed to the advocated position),
or 0 (neutral or irrelevant). No independent
judges were used to score subjects’ thoughts, since
previous research has indicated that subjects’ and
judges’ ratings correlate highly (cf. Petty et al.,
1976). After rating their thoughts, subjects completed
a page of ancillary items regarding the quality of the
taped communication and how involving they felt
it was.


Results and Discussion 
Manipulation Checks 


Subjects rated the extent to which they
found the communication “involving” on an
11-point scale where 11 indicated “extremely
involving.” Subjects in the high involvement
conditions rated the message as significantly
more involving (M = 6.67) than subjects in
the low involvement conditions (M = 4[...]),
F(1, 20) = 5.09, p < .03, providing support
for the effectiveness of the involvement manipulation.


Table 1
Attitudes and Cognitive Responses in Relation to Involvement and
Type of Message (Pro or Counter)

                      Counterattitudinal message    Proattitudinal message
                      Involvement                   Involvement
Item                  Low         High              Low         High
Attitude              -.39        -1.79             [...]       [...]
Counterarguments      3.00        4.00              [...]       [...]
Favorable thoughts    1.33        .50               [...]       [...]

Note. Means in any given row without a common subscript are significantly different at the .05 level by the
Newman-Keuls procedure.


The mean score on the standardized attitude
index for subjects hearing the counterattitudinal
advocacy was -1.10, which differed
significantly from the mean score of
1.10 for subjects hearing the proattitudinal
message, F(1, 20) = 35.33, p < .001. An
average of 3.5 counterarguments was generated
to the counterattitudinal message,
whereas an average of only 1.3 counterarguments
was generated to the proattitudinal
message, F(1, 20) = 15.09, p < .001. An
average of 2.2 favorable thoughts was gen- 
erated to the proattitudinal message, whereas 
only .9 favorable thoughts were generated to 
the counterattitudinal message, F(1, 20) = 
3.5, p < .07. The attitude and cognitive re- 
sponse data provide support for the notion 
that the two communications differed in the 
agreeableness of their positions. 


Tests of Hypotheses 


The attitude and cognitive response data 
are presented in Table 1. It was expected that 
involvement would decrease the effectiveness 
of the counterattitudinal message but increase
the effectiveness of the proattitudinal message.
An Involvement X Message interaction
on the attitude index provided support
for this hypothesis, F(1, 20) = 10.14, p < [...].
A Newman-Keuls analysis revealed that
high issue involvement increased agreement
with the proattitudinal message but decreased
agreement with the counterattitudinal
advocacy (see Table 1).

Involvement X Message interactions
on the cognitive response measures provided
support for the information-processing
interpretation of involvement effects: coun- 
terarguments F(1, 20) = 4.36, p < .05; fa- 
vorable thoughts F(1, 20) = 4.42, p < .05. 
Newman-Keuls analyses of these interactions 
(Table 1) revealed that under high involve- 
ment subjects generated more favorable 
thoughts and fewer counterarguments to the 
proattitudinal advocacy than to the counterattitudinal
one; under low involvement, however,
neither the number of favorable thoughts
nor the number of counterarguments was 
affected by message direction. Surprisingly, 
there was a tendency for subjects hearing the 
proattitudinal message under low involvement 
to generate more counterarguments than did 
subjects hearing the same message under high 
involvement. Postexperimental interviews with 
subjects suggested that this effect may have 
resulted from subjects’ jealousy over having 
such a desirable effect (leniency in visitation 
hours) occur at an institution other than their 
own. 

The analyses of the several ancillary mea- 
sures on tape quality revealed one significant 
effect. Subjects rated the speaker’s voice qual- 
ity as higher in the high involvement condi- 
tions than in the low, F(1, 20) = 6.68, p < 
.05. This effect was obtained regardless of 
whether the speaker advocated a proattitu- 
dinal or a counterattitudinal position and is 
puzzling, since both high and low involvement 
subjects heard identical tapes (except for the 
spliced insertion of the name of the appro- 
priate university). A possible explanation for 
this finding may be that subjects wanted to
give the tapes higher quality ratings when
they were prepared by students at their own 
institution than when they were prepared at 
another school. 


Experiment 2 


Although the cognitive response data from 
Experiment 1 provide support for the notion 
that increasing involvement increases the 
thinking that subjects engage in about an 
attitudinal issue, the attitude data could just 
as easily be explained by the revised social 
judgment formulation. Thus, subjects may
have assimilated the proattitudinal informa- 
tion, producing acceptance, and contrasted 
the counterattitudinal information, producing 
resistance. According to the Pallak et al. 
(1972) formulation, the key determinant of 
whether involvement will facilitate or hinder 
persuasion is the extent to which the information
provided appears to contradict the subject’s
initial position. When the information
advocates a position opposite to that of the
subject, involvement will decrease persuasion,
but involvement “facilitates change toward a 
more extreme attitude in response to appeals 
which do not explicitly reject one’s own posi- 
tion” (Pallak et al., 1972, p. 434). 

According to the cognitive response view 
espoused here, the position advocated in the 
communication is not as important as the 
nature of the thoughts elicited by the message. 
Two counterattitudinal communications were
constructed specifically for Experiment 2.
Both messages argued that seniors be required
to pass a comprehensive exam in their declared
major before being granted a degree.
Previous work has indicated that this position
is strongly counterattitudinal for most college
students (Petty & Cacioppo, 1977). The messages
differed, however, in their presentation
of eight key arguments. One message was designed
to contain points that were logically
sound, defensible, and compelling. The arguments
in this message were selected from a
pool that elicited predominantly favorable
thoughts in a pretest. A second message was
designed to be more open to refutation and
skepticism. The arguments in this message
were selected from a pool that elicited predominantly
counterarguments in a pretest.
Given that the messages advocated an identical
counterattitudinal position, but differed
in the quality of the arguments used to sup- 
port that position, it becomes possible to 
evaluate the two explanations of the effects 
of issue involvement. The modified social 
judgment formulation, as outlined by Pallak 
et al. (1972), predicts that increased issue 
involvement will produce decreased persua- 
sion for both messages, since each message 
adopts an identical position in opposition to 
the subjects. 

The cognitive response hypothesis, on the 
other hand, predicts that increased issue in- 
volvement will be associated with decreased 
persuasion only for the message containing 
“counterarguable” arguments. For the message
containing the compelling and difficult-to-counterargue
arguments, increased involvement
should be associated with increased persuasion.
In other words, increased involvement
motivates subjects to process the information
contained in the communications more carefully.
Thus, although high involvement may
initially increase a subject’s motivation to
reject a counterattitudinal advocacy, subjects
should ultimately better recognize the flaws
in the weak communication and the virtues of
the strong one.

In sum, the following predictions were made
for Experiment 2. More counterarguments
would be generated to the weak than to the
strong communication, but more favorable
thoughts would be generated to the strong
message. There would be significant Arguments
X Involvement interactions on the attitude
and cognitive response measures. As in
Experiment 1, it was predicted that both
cognitive and affective responses to the two
messages would be more polarized under high
than under low involvement conditions.


1 Alternatively, it might be predicted that involvement
would increase resistance more for the message
containing the compelling rather than the weak arguments,
since this message might be viewed as taking
a stronger stand against one’s own position
([...]hesselink & Edwards, 1975). In any case, the modified
social judgment formulation expects increased involvement
to be associated with decreased persuasion
for both messages.


Method


Procedure


Seventy-two male and female undergraduates at 
the University of Missouri participated in order to 
earn extra credit in an introductory psychology 
course. The design was a 2 (High or Low Issue Involvement)
X 2 (Strong or Weak Argument Quality)
factorial. Subjects were run in groups of 4 to 12 in
cubicles constructed so that no subject could have
visual or verbal contact with any other subject. During
any one session, all four experimental conditions
were run. Upon arrival at the laboratory, subjects
were told that each year the psychology department
assists the school of journalism in evaluating radio
editorials that are sent in by colleges and universities
throughout the country; their task would be to provide
ratings of the quality of the editorials. Following
these instructions, subjects heard one of the taped
communications over headphones. After listening to
the appropriate 4-minute communication and completing
the dependent variable booklets, subjects were
debriefed, thanked, and dismissed.


Independent Variables 


Argument quality. As noted previously, all sub- 
jects heard a communication advocating that seniors 
be required to pass a comprehensive exam in their 
declared major prior to graduation. In brief, the
strong version of the message provided evidence
(statistics, relevant studies, etc.) in support of the
following arguments: (a) Prestigious universities
have comprehensives to maintain academic excellence,
(b) institution of the exams has led to a reversal in
the declining scores on standardized achievement
tests, (c) graduate and professional schools show a
preference for undergraduates who have passed a
comprehensive exam, (d) average starting salaries
are higher for graduates of schools with the exams,
(e) schools with the exams attract larger and more
well-known corporations to recruit students for jobs,
(f) the quality of undergraduate teaching has improved
at schools with the exams, (g) state legislatures
would increase financial support if exams were
instituted, allowing a tuition increase to be avoided,
and (h) the (fictitious) National Accrediting Board
of Higher Education would give the university its
highest rating if the exams were instituted.
The weak version of the message also contained 8
arguments but relied more on quotations and opinions
than on statistics and data to support the following
arguments: (a) Adopting the exams would allow
the university to be at the forefront of a national
trend, (b) graduate students have complained that
since they have to take comprehensives, undergraduates
should take them also, (c) by not administering
the exams, a tradition dating back to the ancient
Greeks was being violated, (d) parents had written
to administrators in support of the plan, (e) the
exams would increase student fear and anxiety
enough to promote more studying, (f) the exams
would help to cut costs by eliminating the necessity
for other tests that varied with instructor, (g) the
exams would allow students to compare their performance
with students at other schools, and (h)
job prospects might be improved.

Issue involvement. Involvement was manipulated 
in a manner analogous to that employed in Experi- 
ment 1. For subjects in the high involvement condi- 
tions, the speaker advocated that the comprehensive 
exams be instituted at the University of Missouri 
(the subjects’ institution), whereas for low involvement
subjects, the speaker advocated that the exams
be instituted at North Carolina State University. 


Dependent Variables 


The same measures were available as in Experiment 
1. Subjects rated the concept “comprehensive exams” 
on four 9-point semantic differential scales and then 
responded to an 11-point Likert-type scale regarding 
their agreement with the speaker’s position. As in 
Experiment 1, responses to the two attitude measures 
were converted to standard scores and were averaged 
to form the measure of communication acceptance. 

Next, subjects were given 2½ minutes to list the
thoughts they had while listening to the tape. Sub- 
jects also rated their thoughts with a +, —, or 0, as 
described for Experiment 1. In addition, some ancil- 
lary questions about the quality of the tape were 
completed, and subjects were asked to rate the 
amount of thought they engaged in about the issue. 
Finally, subjects were given 3 minutes to attempt to 
list as many message arguments as they could remem- 
ber. Each booklet was rated by two judges (r = .88) 
who were blind to the involvement manipulation.
An argument had to correctly summarize one of the 
arguments that appeared in the appropriate message 
to be counted. Repetitions of the same argument 
were not counted. Disagreements between judges 
were resolved through discussion. 
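
The agreement between the two judges' recall counts is summarized above as a correlation. A minimal Python sketch of that kind of check follows; the counts are invented, the function name is hypothetical, and a Pearson product-moment correlation is assumed because the text reports only "r".

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations around each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented counts of correctly recalled arguments, one pair per booklet.
judge_a = [3, 4, 2, 5, 3, 1]
judge_b = [3, 5, 2, 4, 3, 1]
r = pearson_r(judge_a, judge_b)
```

A high correlation shows that the judges rank booklets similarly; it does not by itself show that they assign identical counts, which is why remaining disagreements still had to be resolved.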


Results 
Manipulation Checks 


In order to determine whether our manip- 
ulation of involvement affected the amount of 
message processing subjects engaged in, sub- 
jects were asked to rate on an 11-point scale 
(1 indicated not very much and 11 indicated 
very much) how much thought they put into 
evaluating what the speaker had to say. Sub- 
jects in the high involvement conditions re- 
ported doing more thinking about the mes- 
sages (M = 8.6) than subjects in the low in- 
volvement cells reported (M = 7.75), F(1, 
68) = 2.75, p < .05, one-tailed. These data
support the view that the involvement manipulation
affected the hypothesized mediating
variable. In other words, increasing the per- 
sonal relevance of an advocacy affected the 
perceived amount of thought subjects engaged 
in about that advocacy. 

Analyses of the attitude and cognitive re- 
sponse measures indicated that the manipula- 
tion of message quality was also effective. The 
mean score on the standardized attitude index 
for subjects hearing the strong arguments was 
.45, which differed significantly from the mean
score of —.45 for subjects hearing the weak 
arguments, F(1, 68) = 21.65, p< .001. An 
average of 2.19 favorable thoughts was gen- 
erated to the strong arguments, whereas only 
.99 were generated to the weak arguments, 
F(1, 68) = 15.19, p < .001. An average of 
2.69 counterarguments was generated to the
weak message, whereas 1.69 were generated to 
the strong message, F(1, 68) = 8.36, p< 
.005. The attitude and cognitive response data 
provide support for the notion that the two 
communications differed in the strength of 
their arguments.


Tests of Hypotheses


The attitude and cognitive response data 
are presented in Table 2. A significant Argu- 
ments X Involvement interaction on the atti- 
tude index, F(1, 68) = 5.62, p < .02, and a 
Newman-Keuls analysis (Table 2) provided 
support for the hypothesis that involvement 
would increase the persuasiveness of the 
strong arguments but would decrease the persuasiveness
of the weak arguments.
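
The interaction test in a 2 X 2 between-subjects design can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis: equal cell sizes are assumed, the scores are invented, and the function name is hypothetical. With 18 subjects per cell, the same formula gives the 1 and 68 degrees of freedom reported for Experiment 2.

```python
def interaction_f(cells):
    """Interaction F for a balanced two-way design.

    cells maps (a_level, b_level) -> list of scores, with equal n per cell.
    Returns (F, df_interaction, df_within).
    """
    a_levels = sorted({key[0] for key in cells})
    b_levels = sorted({key[1] for key in cells})
    n = len(next(iter(cells.values())))

    cell_mean = {key: sum(vals) / n for key, vals in cells.items()}
    grand = sum(cell_mean.values()) / len(cells)
    a_mean = {a: sum(cell_mean[(a, b)] for b in b_levels) / len(b_levels)
              for a in a_levels}
    b_mean = {b: sum(cell_mean[(a, b)] for a in a_levels) / len(a_levels)
              for b in b_levels}

    # Interaction: what remains of each cell mean after removing both main effects.
    ss_inter = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels for b in b_levels
    )
    # Error: variation of scores around their own cell mean.
    ss_within = sum(
        (x - cell_mean[key]) ** 2 for key, vals in cells.items() for x in vals
    )
    df_inter = (len(a_levels) - 1) * (len(b_levels) - 1)
    df_within = len(cells) * (n - 1)
    f = (ss_inter / df_inter) / (ss_within / df_within)
    return f, df_inter, df_within


# Invented attitude scores, three hypothetical subjects per cell.
scores = {
    ("low", "weak"): [0, 1, 2],
    ("low", "strong"): [1, 2, 3],
    ("high", "weak"): [-1, 0, 1],
    ("high", "strong"): [3, 4, 5],
}
f_value, df1, df2 = interaction_f(scores)
```

In the invented data the strong-minus-weak difference is larger under high involvement than under low involvement, which is the crossover pattern the hypothesis predicts.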

Involvement X Arguments interactions on
the cognitive response measures also provided
support for the information-processing interpretation
of involvement effects: counterarguments
F(1, 68) = 7.46, p < .008; favorable
thoughts F(1, 68) = 4.35, p < .04. A Newman-Keuls
analysis on each of these measures
(Table 2) revealed that under high involvement,
subjects generated more favorable
thoughts and fewer counterarguments to the
strong than to the weak arguments; under low
involvement, neither favorable thoughts nor
counterarguments were affected by the argument
quality manipulation. The Newman-Keuls
analyses further demonstrated that
higher involvement increased the production
of counterarguments to the weak arguments
and increased the production of favorable
thoughts to the strong arguments.


Table 2

Analyses of the ancillary measures of tape 
quality produced no significant differences. 
No significant effects were obtained on the
number of arguments that subjects could recall
either, although, consistent with the
“depth of processing” notion, there was a
slight tendency for high involvement subjects
to recall more arguments (M = 3.75) than
did low involvement subjects (M = 3.2), F(1,
68) = 2.07, p < .15. The interaction on this
measure did not approach significance (p >
.25).


Correlational Analyses 


Table 3 presents the correlations among the
attitude and cognitive response measures separately
for high and low involvement subjects.
The pattern of correlations indicates that message-relevant
cognitive responses are better
predictors of attitude change under high than
under low involvement conditions. This replicates
the findings of Chaiken (Note 1) and
Petty & Cacioppo (1979) and is consistent
with the view that high involvement increased
the importance of message processing in producing
persuasion. The ability to recall message
arguments did not allow any reliable prediction.2


Table 3
Correlations Among Attitude and Cognitive Responses for High and Low Involvement Conditions

                      Low involvement                      High involvement
                      Counter-    Favorable                Counter-    Favorable
Item                  arguments   thoughts    Recall       arguments   thoughts    Recall
Attitude              -.22        .35*        .01          -.713**     .64**       .19
Counterarguments      --          -.26        .09          --          -.51**      .02
Favorable thoughts    --          --          .06          --          --          .10

*p < .05. **p < .001.
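
Whether a correlation is reliably larger in one group than in another can be examined with Fisher's r-to-z transformation. The sketch below is not an analysis reported in the article; the function name is hypothetical, the two correlations are example values, and 36 subjects per involvement condition is an assumption (72 subjects divided evenly).

```python
from math import atanh, sqrt


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    z1, z2 = atanh(r1), atanh(r2)              # Fisher r-to-z transformation
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))     # standard error of z1 - z2
    return (z1 - z2) / se


# Example: r = .64 in one group versus r = .35 in the other, 36 subjects each (assumed).
z_stat = fisher_z_difference(0.64, 36, 0.35, 36)
```

The resulting statistic is referred to the standard normal distribution, so with groups of this size only fairly large differences between correlations would reach conventional significance.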


Discussion 


The pattern of data in both experiments
provides strong support for the cognitive response
view of the effects of increased issue
involvement on persuasion. The results of
both experiments contradicted the original
social judgment theory formulation (Sherif et
al., 1965), which proposed that increased involvement
would invariably reduce persuasion.
In addition, the results of Experiment 2
extended the Pallak et al. (1972) formulation,
which limited the persuasion-facilitating
effect of increased issue involvement to
ex[...] consonant messages. Experiment 2 demonstrated
that increased involvement could
lead to increased persuasion for a counterattitudinal
advocacy if the arguments were
[...] compelling. This suggests that it is
not the direction of the advocacy (proattitudinal
or counterattitudinal) that is important,
but rather the nature of the cognitive responses
elicited.

Of course it is likely that proattitudinal advocacies
will generally elicit favorable
thoughts, whereas counterattitudinal advocacies
will elicit primarily counterarguments,
but this will not always be the case. Recent
research has demonstrated the feasibility of
constructing communications that take proattitudinal
positions but employ arguments
that elicit primarily counterarguments, and 
messages that take counterattitudinal posi- 
tions but elicit primarily favorable thoughts 
(cf. Petty et al., 1976). The latter was also 
accomplished in the present investigation and 
allowed a strong empirical test of the hypoth- 
esis of interest. 

It is also important to note that the atti- 
tudinal effects of issue involvement do not 
appear to be mediated by enhancing recall of 
the message arguments. Although high in- 
volvement did tend to increase argument 
recall for both the strong and weak messages 
in Experiment 2, high involvement increased 
persuasion for the strong message but de- 
creased persuasion for the weak one. Also, 
consistent with previous research (e.g., Cacioppo
& Petty, 1979; Insko, Lind, & LaTour,
1976; etc.), the within-cell correlations failed 
to substantiate a relationship between learn- 
ing and persuasion. 

The most compelling explanation for the
present data appears to be that increased involvement
enhances the importance of message
content in producing persuasion. If the
message content elicits primarily counter- 
arguments, then increased involvement will 
tend to enhance the production of thoughts 
unfavorable to the advocacy and will result 
in decreased agreement; but if the message
content elicits primarily favorable cognitions,
then involvement will tend to enhance these
positive thoughts and will result in increased
agreement. The correlational data from Experiment
2 further corroborated the view that
increased involvement enhances the importance
of message-based cognitions in producing
persuasion. The correlations between
attitudes and cognitive responses were substantially
greater under high than under low
involvement conditions.


2 Similar within-cell correlations among attitudes
and cognitive responses were available for Experiment
1. The only significant correlation (based on an
N of only 12) was between favorable thoughts and
attitudes within the high involvement groups (r =
.58, p < .05). Message recall was not assessed in the
first study.

The finding that increased involvement en- 
hances the importance of message factors in 
producing persuasion may explain the previ- 
ously confusing finding that sources of high 
and low credibility produce differential per- 
suasion under low but not under high involve- 
ment conditions (e.g., Johnson & Scileppi, 
1969; Rhine & Severance, 1970). The present 
analysis suggests that nonmessage cues such 
as the expertise or attractiveness of a source 
should have maximal impact when persuasion 
is not tied to an extensive processing of the 
message content, as when a message is on a 
topic of low involvement. On the other hand, 
characteristics of the message content should 
have maximal impact under high involvement 
conditions. This analysis suggests that low 
involvement persuasion situations may be 
governed by what cognitive psychologists have 
called “automatic processing,” whereas high
involvement persuasion situations may be 
governed more by “controlled processing” 
(Atkinson & Shiffrin, 1968; LaBerge, 1975; 
Schneider & Shiffrin, 1977; Shiffrin & Schnei- 
der, 1977). Under the former situations, mes- 
sage acceptance would be determined more by 
a well-practiced script such as “Experts are 
to be believed” (cf. Abelson, 1976; Langer, 
1978), whereas under the latter situations 
message acceptance would be determined more 
by a subject’s attention to and processing of 
the message content. In support of this con- 
jecture, Chaiken (Note 1) reported that sub- 
jects were more affected by the number of 
arguments employed in a message under high 
than under low involvement conditions but 
were more affected by source attractiveness 
under low than under high involvement condi- 
tions. 

At least two important questions about involvement
remain unaddressed by the present
research. One concerns why increased involvement
should facilitate information processing.
A possible explanation may reside in recent 
research indicating that information with self- 
relevance is processed more quickly than non- 
self-relevant information. For example, 
Markus (1977) argued that subjects have a 
more extensively developed “schema” or cog- 
nitive structure for self-relevant information 
and that this schema facilitates processing. 

Another important question concerns the 
limitations on increased issue involvement fa- 
cilitating processing. We suspect that there 
are circumstances where involvement may be 
so high, as when an issue is intimately associ- 
ated with certain central values (cf. Ostrom 
& Brock, 1968), that processing will termi- 
nate in the interest of self-protection. This 
level of involvement was not reached in the
present investigation, even though the message
advocated a change in university policy
that, if implemented, might have prevented
many students from obtaining their degrees.
Thus, within a normal range of involvement,
increased issue importance appears to be associated
with increased message processing. This
enhanced processing will most likely lead to
reduced persuasion when a message presents
weak arguments (i.e., arguments that are
open to refutation and counterargumentation),
and to enhanced persuasion when a
message presents particularly good arguments
for which subjects have no readily available 
counterarguments (and thus favorable 
thoughts will predominate). 


Reference Note 


1. Chaiken, S. Use of source and message cues in
persuasion. Paper presented at the meeting of the
American Psychological Association, Toronto, Canada [...]


This article deals with two metatheoretic issues: (a) the problem of making
scientific inferences and (b) the problem of the conceptual framework. Inability
to overcome inductive uncertainty lies at the core of the debates concerning the
interpretation of factors and how they are organized. Failure to embed factors
in a viable, process-oriented conceptual framework has also obstructed progress.
Theoretical structures based on the information-processing paradigm and invariant
factors are the most promising extant approaches.


This article is focused on conceptual issues
rather than data. In particular, I am concerned
with how we can best evolve a viable
theory of individual differences (Royce,
1973a, 1973b, 1978; Royce & Powell, Notes 1, 2).
Theoretical analysis and synthesis inevitably
involve metatheoretic issues as well as substantive
issues (Royce, 1976). This interplay
between metatheory and substantive theory is,
of course, well recognized in advanced scientific
domains and in philosophy of science.
The message from these quarters is clear:
progress in developing substantive theory
frequently requires clarification of underlying
metatheoretic issues. Two such issues are central
to this article: (a) the problem of scientific
inference and (b) the problem of the conceptual
framework.


Problem of Scientific Inference


The argument is that factor analysis is as
subject to the vagaries of inference as any
other empiricoinductive scientific method.
(It is logically impossible to be
certain about the interpretation of a factor;
scientific inference involves inductive
leaps.)1 Thus, a major point that emerges is
that the problem of scientific inference is not
unique to factor methodology; it poses difficulties
for all scientists, and for the philosophers
of science as well (Hanson, 1961;
Morgan, 1973; Popper, 1959, 1963; Roze- 
boom, 1961). However, it does not follow 
that factor methodology has a fatal disease 
and must therefore be buried. Rather, what is 
called for is that we spell out the implications 
of this and other metatheoretic issues in the 
service of advancing substantive theory. 
While it has long been recognized that differential
psychology might provide the key to
our understanding of personality, the Achilles’ 
heel of this approach has been the problem of 
identifying the fundamental dimensions of 
individuality. And although factor analysis as 
the best available scientific method for iden- 
tifying the dimensions of organized complex- 
ity has reduced the number of required per- 
sonality constructs from thousands to hun- 
dreds, the current state of the art is such that 
there is no mechanical algorithm available 
for deciding which dimensions are invariant. 
This means it is necessary to make qualitative 
judgments concerning this issue. It is my view 
that the problem of factor invariance underlies
the substantive conflicts and confusions
concerning factors and their organization.


1 A more complete elaboration of the philosophical
problem of induction and the logical basis for making
scientific inferences via the methodology of factor
analysis is provided in my article for the Nebraska
Symposium (Royce, 1976, see especially pp. 13-20).

Contrary to common belief, there are two 
categories of factor invariance rather than 
one. These have been labeled internal and
external invariance (Royce, 1976). Invariance 
of the first kind involves the replication of 
factors via several analyses despite differences 
in subjects, variables, and occasions. This
constitutes the standard attack on the prob- 
lem, which although essentially qualitative in 
nature, is nevertheless an effective and plau- 
sible procedure. For example, in high intensity 
domains of investigation, such as pilot apti- 
tude (e.g., see the research of the U.S. Air 
Force Aviation Psychology Program, both the 
19 volumes from World War II and the post- 
war reports) or rodent emotionality (Royce, 
1977b), “marker variables” are so well-known 
that they constitute reliable and valid indices 
for specifiable factors. 

External invariance refers to the empirical 
and/or laboratory manipulation of previously 
identified factors. The most extensive research
programs devoted to experimental manipula- 
tion of factors include Fleishman’s (1967) 
analyses of the learning dynamics of psycho- 
motor dimensions, Eysenck’s (1967) massive 
laboratory investigations of his three third- 
order constructs, and Royce’s (1966, 1973a; 
Aftanas & Royce, 1969; Mos, Lukaweski, & 
Royce, 1977; Royce, Holmes, & Poley, 1975; 
Royce & Poley, 1973; Royce, Poley, & Yeu- 
dall, 1973; Royce, Yeudall, & Bock, 1976) 
identification of the genetic and brain site cor- 
relates of affective and cognitive factors. Al- 
though the call for combining multivariate 
and experimental method and thought was 
sounded some time ago (e.g., Cattell, 1966a, 
1966b; Cronbach, 1957; Royce, 1950), it is 
unfortunately true that the volume of output 
that combines the two modes of attack is 
relatively small. The best single exemplar of 
this research paradigm is still embodied in the 
Cattell (1966a) handbook. But the Society of
Multivariate Experimental Psychology jour- 
nal Multivariate Behavioral Research has not 
been very effective in this bridging role (e.g., 
see Royce, 1977a), despite the hopes of the 
founding fathers (Cattell, 1966b). Multivariate experimental research is
very demanding, both in terms of orchestrating the relevant skills and in
funding, so it will probably be some time before it hits full stride. Let me
briefly cite two examples in order to make this point plainer. First example:
There are no longitudinal data available as an empirical basis for a
multivariate developmental psychology. The available theoretical models
(e.g., Kearsley, Buss, & Royce, 1977) are based on extrapolations to the life
span from cross-sectional data. Large-scale testing of an appropriate sample
of subjects and factors throughout the life span can only be accomplished by
establishing a massive research institute that can outlive its investigators.
Second example: There is no large-scale research program devoted to
identifying the brain correlates of factors. There have been sporadic
factorial studies of humans with brain damage, but there has been no
sustained experimental attack on the problem of the brain correlates of
factors. Such a program would require experimental lesioning, microelectrode
implantations, and other brain intervention techniques on appropriate animal
populations (e.g., subhuman primates in the case of cognitive factors), in
addition to a sophisticated and sustained factor strategy. Proper
implementation of such a research program would require an interdisciplinary
team that would range across factor analysis, psychometrics, animal behavior,
and neuropsychology. It would also require access to a large number of humans
with a broad range of brain damage and extensive laboratory and animal
facilities as well, so that a variety of biological manipulations of
psychological factors could be carried out. These examples bring out the
[...] factor on the one hand and the potential value of such investigations
on the other hand. I cite these examples in order to emphasize the point that
we need both internal and external invariance strategies in order to
establish the “boundary conditions” (e.g., Factor A is functional at the
early stages of learning a complex motor task, but nonfunctional during the
later stages) or range of effectiveness for each factorially identified
unknown.
Although the intent of efforts toward a mathematical solution to the
invariance problem is to generate invariant factors algorithmically, this may
not be possible in principle (for example, because of an insufficiency of
information concerning the factors under comparison). This means that the
present state of the art is such that subjective judgments regarding the
interpretation and invariance of factors cannot be avoided. Thus, the
hypothesis-generating character of factor theory must be taken seriously;
that is, the interpretation that Factor x is concerned with x and not x’ or y
is a tentative hypothesis. However, the principle of empirical tenability
says that as factor investigators increase their knowledge of the “boundary
conditions” of factors, these empirical constraints will increase the
probability that the mth interpretation of Factor x is a more plausible
hypothesis (Royce, 1976).

Problems of factor invariance are compounded by methodological difficulties
and a relative paucity of data when we move on to higher order factors. For
example, there are ambiguities concerning causal influences between levels,
there are problems concerning the partitioning of the common factor variance
between levels, and there are ambiguities concerning the conceptual status of
higher strata factors (e.g., see Cattell, 1965a, for the best available
treatment of these issues). But despite these methodological difficulties,
all factor analysts share the same goal, namely, to identify invariant
factors, to specify how they are organized, and to specify how factors
function as parts of complex behaviors. Thus we must transcend methodological
and other shortcomings in pursuit of these common goals (e.g., via such
requirements as factor invariance) if we are to overcome the historical
conflicts (exemplified once again in the exchange between Guilford, 1975,
1977, and Eysenck, 1977) that have dominated this complex and important area
of investigation.


The Problem of the Conceptual Framework 


A critical implication of this problem is that
the inventorying of invariant factors [...]
advanced forms of science because explanatory theory
requires viable conceptual frameworks on which to hang
empirical observables (Royce, 1976, 1978). The
point is that factor analysis per se is incapable 
of providing the prerequisite conceptual 
framework for its substantive findings. What 
it can do is identify potentially unifying com- 
ponents of such a framework (Royce, 1963), 
but the subsuming model (of which factors 
are critical parts) must come from elsewhere. 
This point has not been adequately perceived 
by either the factor analysts or their critics. 
The major shortcoming of the factor analytic 
proponents is their belief that a theoretical 
structure could somehow be automatically
generated as a natural consequence of inven- 
torying invariant factors. The major implica- 
tion of this view is the belief that the factor 
model constitutes a sufficient framework on 
which to hang its substantive observables. 
Thus its major proponents, such as Guilford 
(1967) and Cattell (1965b, 1973), have 
tended to develop theoretical structures based 
primarily on the factor model. A major con- 
sequence of this conceptual commitment is 
that personality theory too heavily based on 
factor analysis has not been able to go beyond 
structure to dynamics. 

While there have been occasional noises 
from within the factor subcommunity concerning
the need to get at process (e.g., see
Messick, 1973), it is the current interaction
between the two sciences (i.e., multivariate
and experimental psychology) that is forcing 
the issue of conceptual framework into such 
sharp focus. For it is the experimental tradition
that has kept its eye on process. But it is
the importation of a new ingredient, the infor- 
mation-processing paradigm, that is of partic- 
ular importance in this context. For this new 
paradigm is the conceptual framework that is 
bringing multivariate and experimental psy- 
chology together in an effort to understand 
complex psychological phenomena. The most 
important empirical work in this vein is being 
conducted in the cognitive domain (Carroll, 
1976, 1978; Hunt, 1974; Hunt, Frost, & Lun- 
neborg, 1973; Snow, 1976a,b; and Sternberg, 
1977). And the most explicit and comprehen- 
sive theoretical structure to combine the in- 
formation-processing paradigm with the two 
psychologies is Royce’s multifactor system 
theory of individual differences (see Royce & 
Buss, 1976; Royce, Kearsley, & Klare, 1978;
Royce & Powell, Note 1 and Note 2, for an 
overview). This theory defines personality as 
the total psychological unit, a suprasystem 
that is composed of six interacting systems. 
The cognitive (Diamond & Royce, in press; 
Powell & Royce, Note 3) and affective 
(Royce & McDermott, 1977) systems are 
conceptualized as processing units that function
as information transformers. The style
(Wardell & Royce, 1978) and value (Schop- 
flocher, Royce, & Meehan, Note 4) systems 
are also central processing units, but they 
function more as personality integrators. Finally,
the sensory (Kearsley & Royce, 1977)
and motor (Powell, Katzko, & Royce, 1978) 
systems are more peripheral processing units 
that function as input/output transducers and 
encoders. These six systems are organized as a 
multilevel, hierarchical suprasystem (Mesaro- 
vic, Macko, & Takahara, 1970) in which 
there is a controlled-process stratum (sensory, 
motor), a learning—coping stratum (cognitive, 
affective), and an integrative stratum (style, 
value). In turn, each of the six systems is also 
conceptualized as a multilevel, hierarchical 
system, where the elements of the hierarchies 
are identified via factor analysis. 

The long-range goal of factor research is to
understand variations in psychological functioning.
But this kind of understanding implies
the availability of a viable theory, which
in turn implies the availability of an appropriate
conceptual framework. The most germane
candidate is the information-processing paradigm.
This paradigm has the potential of generating
a viable, dynamic theory of individual
differences.
This study tested the predictions of three models of coalition behavior in four
different games, each including a veto player. Participants were master’s
students who played each of the four games, rotating among the five player
positions between games. The games were played under one of three time
pressure/default conditions: (a) no time pressure; (b) a condition such that
the constant payoff to coalitions was lost if an agreement was not reached in
three attempts; and (c) a condition such that the payoff for no agreement was
fixed at 60 points for the veto player and 10 for the other players. The veto
players’ payoffs varied over games and tended to increase as play continued,
at times approaching the entire payoff. Thus, the weighted probability and
Roth-Shapley models were not supported; the core received some support. The
default conditions had little effect. The discussion focuses on the likelihood
of socially beneficial behavior in competitively motivating situations.


Gamson (1961) defined full-fledged coalition situations as those where more
than two actors must make a decision, where the actors’ preferences differ,
and where no actor is a dictator (i.e., one who can impose a decision with no
support from other parties) or veto player (i.e., one who must be included in
every coalition). Most coalition research (Murnighan, 1978a) has focused on
interactions that meet these requirements. However, a number
of “real world” coalition situations include 
veto players: The oil producing countries have formed a cartel (the
Organization of Petroleum Exporting Countries, or OPEC) that must be included
in almost all major oil exchanges. Company presidents often must be included
in all significant organizational decisions. And the members of the Security
Council of the United Nations all hold veto power over the council’s
resolutions. Recently, research has also investigated coalition situations
that include a veto player (e.g., Murnighan & Roth, 1977, 1978). The present
study investigated four different veto games in conjunction with three time
pressure/aspiration level conditions, to test both game-theoretic and social
psychological models of coalition behavior and to establish a more general
data base for veto games.

Table 1

The Games, the Minimum Winning Coalitions, and the Models’ Predictions for the
Players’ Mean Payoffs (Over All Trials)

                                                  Models’ predictions
                                                  ------------------------------------------
                                                  Weighted
Game              Minimum winning coalitions      probability    Core       Roth-Shapley
20(17-6-5-4-3)    AB, AC, AD, AE                  80-5^a         100-0^b    80-5^a
38(27-13-12-6-5)  AB, AC, ADE                     71-11-11-3-3   100-0^b    70-12-12-3-3
30(21-10-7-5-4)   AB, ACD, ACE, ADE               62-11-9-9-9    100-0^b    65-15-7-7-7
28(17-8-7-6-5)    ABC, ABD, ABE, ACD, ACE, ADE    50-12.5^a      100-0^b    60-10^a

^a Payoff predictions that are not listed with an entry for each of the five
players indicate that players B, C, D, and E are predicted to have equal
payoffs.

^b The figures listed for the core’s prediction are the limits that payoffs
are predicted to approach. In sequential games like those in this study, less
extreme payoffs that continuously approach this limit can be taken as support
for the model.

One impetus for this study is the research 
by Michener, Fleishman, and Vaske (1976) 
that investigated eight veto and eight non- 
veto games. The eight veto games in the 
study included the same set of minimum win- 
ning coalitions (i.e., those that will no longer 
be winning if one player is left out of the 
coalition). For instance, in the 14(9-6-3-2) 
game, where 14 refers to the number of votes 
needed to form a winning coalition and the 
numbers within the parentheses refer to the 
votes assigned to players designated A, B, C, 
and D, respectively, five different coalitions 
(AB, ABC, ABD, ACD, and ABCD) are 
winning. Only AB and ACD, however, are 
minimal winning. Each of the eight veto games 
in the Michener et al. study used this set of 
winning and minimum winning coalitions, 
with different resource distributions. 
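
The distinction between winning and minimum winning coalitions drawn above can be checked by enumeration. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes that a coalition wins whenever its members' votes total at least the quota, which is how the game notation is described in the text.

```python
from itertools import combinations


def winning_coalitions(quota, votes):
    """List every coalition whose combined votes reach the quota."""
    players = sorted(votes)
    return [
        "".join(members)
        for size in range(1, len(players) + 1)
        for members in combinations(players, size)
        if sum(votes[p] for p in members) >= quota
    ]


def minimum_winning_coalitions(quota, votes):
    """Keep only winning coalitions that lose if any one member leaves."""
    return [
        c
        for c in winning_coalitions(quota, votes)
        if all(sum(votes[p] for p in c) - votes[q] < quota for q in c)
    ]


# The 14(9-6-3-2) game used as the example in the text.
example_votes = {"A": 9, "B": 6, "C": 3, "D": 2}
print(winning_coalitions(14, example_votes))
print(minimum_winning_coalitions(14, example_votes))
```

Applied to the 14(9-6-3-2) game, the sketch reproduces the five winning coalitions and the two minimum winning coalitions (AB and ACD) named in the text.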

In an attempt to extend these results, the 
present study investigated four different five- 
person games that varied not only the play- 
ers’ resources (votes) but also the games’ 
underlying structures (i.e., the sets of mini- 
mum winning coalitions; see Table 1). Notice 
that in the 20(17-6-5-4-3) game the veto 
player can form a winning coalition with any one of the other four players.
In the 28(17-8-7-6-5) game, the opposite extreme in this set, the veto player
needs two other players to form a winning coalition. The other two games fall
between these two, allowing the veto player to form either one or two
two-player winning coalitions. Three models of coalition behavior, namely the
weighted probability model (Komorita, 1974), the core (Luce & Raiffa, 1957),
and the Roth-Shapley value (Roth, 1977a,b; Shapley, 1953), all make different
predictions for these four games. This study, then, allows for theoretical
tests of three models by using four veto games that vary in their underlying
structure.

Prior to discussing the other major independent variable, the time
pressure/aspiration level conditions, each of the three models to be tested
here will be briefly summarized using the 30(21-10-7-5-4) game as an example.


The Weighted Probability Model 


The basic assumption underlying the weighted probability model (Komorita,
1974) is that because of the logistic problems of communicating offers and
counteroffers, players will attempt to form small rather than large
coalitions. As the number of potential coalition members increases, the
severity of the problem of achieving both reciprocity and unanimous agreement
on the terms of the offer also increases. In addition, the number of potential
defectors from the coalition increases with its size; hence, a large coalition
is not only less desirable but may also be more difficult to maintain.

1 Other models, including Gamson’s minimum resource and Komorita and
Chertkoff’s bargaining theories might also be tested in this study. These two
models, however, either limit themselves to nonveto games or are not
particularly amenable to veto games (Murnighan, 1978a) and therefore would not
be appropriately tested here.


The weighted probability model assumes that an individual’s share of the prize
should be a function of the number and size of alternative minimum winning
coalitions available to him or her. In the 30(21-10-7-5-4) game, Player A has
four alternatives (AB, ACD, ACE, and ADE), including one two-person coalition.
Player B has only one alternative (AB). Players C, D, and E each have two
alternatives, both being three-person coalitions. The present form of the
model assigns twice as much of an “advantage” to two-person opportunities as
it does to three-person opportunities and predicts that the player’s outcomes
should be proportional to this “advantage.” This “advantage” or weight is
determined by the equation

$$P(C_j) = \frac{1/(n_j - 1)}{\sum_j 1/(n_j - 1)},$$

where $P(C_j)$ = the probability of coalition $j$, $n_j$ = the number of
members of coalition $j$, and the summation is over all minimum winning
coalitions. Thus, the AB coalition is predicted to form 40% of the time, and
each of the three-player coalitions is predicted to form 20% of the time. The
probability of inclusion for a particular player depends on the probabilities
of the coalitions in which he or she is a member: Player $i$’s probability of
inclusion is equal to the sum of the probabilities of each of the coalitions
he or she is a member of, $P_i = \sum_j P(C_j),\ i \in C_j$. In the
30(21-10-7-5-4) game, Player A will be included in all the winning coalitions
($P_A = 1.0$); the other players will each be included 40% of the time
($P_B = P_C = P_D = P_E = .40$). Player $i$’s predicted payoff is proportional
to his or her probability of inclusion:

$$R_{ij} = \frac{P_i}{\sum_{k \in C_j} P_k},$$

where $R_{ij}$ = Player $i$’s payoff (reward) in coalition $j$, $P_i$ = Player
$i$’s probability of being included in a minimum winning coalition, and the
summation is over all members of $C_j$. Payoffs in the AB coalition, then,
should be 72-28; in the three-player coalitions, payoffs should be 55-22-22.

Although the original presentation of the weighted probability model
(Komorita, 1974) made no predictions about the players’ overall payoffs in a
game (termed “success in bargaining” by Komorita and Moore, 1976), payoff
predictions can be derived without additional assumptions. For any player i,
the proportion of the total payoffs should equal the sum of the products of
the player’s predicted payoff in each coalition and the probability of that
coalition. In the 30(21-10-7-5-4) game, Player A is included in every
coalition and is predicted to receive a payoff of 72 on 40% of the trials and
a payoff of 55 on 60% of the trials. Player A’s overall mean payoff, then,
should be 61.8. Player B, who should receive a payoff of 28 on 40% of the
trials, should receive an overall mean payoff of 11.2. Similarly, for Players
C, D, and E, the predicted mean payoffs are 8.8 each.
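
The chain of calculations described in this section (coalition probabilities, inclusion probabilities, within-coalition shares, and overall mean payoffs) can be traced in a short Python sketch. It is an illustration under stated assumptions, not the author's procedure: the function name `weighted_probability` is hypothetical, coalitions are written as strings of player letters, and exact fractions are used, so the results differ slightly from the rounded figures in the text (for example, about 61.9 rather than 61.8 for Player A).

```python
from fractions import Fraction


def weighted_probability(coalitions):
    """Apply the weighted probability formulas to a list of minimum
    winning coalitions, each given as a string such as "ACD"."""
    # Weight of coalition j is 1 / (n_j - 1), normalized over all coalitions.
    weights = {c: Fraction(1, len(c) - 1) for c in coalitions}
    total = sum(weights.values())
    p_coalition = {c: w / total for c, w in weights.items()}

    # A player's inclusion probability sums over the coalitions containing him or her.
    players = sorted(set("".join(coalitions)))
    p_inclusion = {
        i: sum(p for c, p in p_coalition.items() if i in c) for i in players
    }

    # Within a coalition, shares are proportional to inclusion probabilities.
    shares = {
        c: {i: p_inclusion[i] / sum(p_inclusion[k] for k in c) for i in c}
        for c in coalitions
    }

    # Overall mean share: coalition probability times share, summed over coalitions.
    mean_share = {
        i: sum(p_coalition[c] * shares[c].get(i, 0) for c in coalitions)
        for i in players
    }
    return p_coalition, p_inclusion, shares, mean_share


p_coalition, p_inclusion, shares, mean_share = weighted_probability(
    ["AB", "ACD", "ACE", "ADE"]  # the 30(21-10-7-5-4) game
)
for player, share in mean_share.items():
    print(player, round(100 * float(share), 1))
```

Rounded to whole points, the mean shares for this game come out as 62-11-9-9-9, in line with the Table 1 entry.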


The Roth-Shapley Model 


Unlike the weighted probability model, 
the Roth-Shapley model (Roth, 1977a, b; 
Shapley, 1953) does not predict the proba- 
bilities that different coalitions will form or 
the payoffs that members of the winning 
coalitions will receive. Rather, it specifies the 
proportion of the total payoff a player can 
expect from the play of a game. 

The Roth-Shapley model is purely pre- 
scriptive: Its derivation depends only on a 
set of underlying assumptions concerning the 
format of the game. In games with constant 
payoffs for the winning coalition, this expecta- 
tion is determined by dividing the number of 
times a player's resources are “pivotal” (in 
the sense that a losing coalition is converted 
to a winning coalition by the addition of this 
player) by the total number of permutations 
of the players. For instance, consider one 
permutation, DCEBA, in the 30(21-10-7-5- 
4) game. Because Player A holds veto power, 
he or she is the pivotal player in this permu- 
tation: The DCEB “coalition” can obtain no 
payoff without Player A. Indeed, in every 
permutation that places A third, fourth, or 
fifth, Player A is pivotal. Player A is also 
pivotal when Player B is first and A is second. 
Player A is pivotal, therefore, in 78 of the
120 permutations that are possible. (In any
five-person game, 5! permutations exist.) Because
Player B can form a winning two-player coalition
with A, and the other players
cannot, B will be pivotal more often than 
Players C, D, and E. In this game, B is piv- 
otal 18 times, and C, D, and E are each 
pivotal 8 times. Dividing by the total number 
of permutations (120) yields expected pro- 
portions of the payoffs of (.65, .15, .07, .07, 
and .07) for Players A, B, C, D, and E, re- 
spectively. 
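
The pivotal-player count described above can be reproduced by brute force over all orderings of the players. The sketch below is illustrative only: the function name `pivot_counts` is hypothetical, and it assumes that the pivotal player in an ordering is the first one whose votes bring the running total up to the quota.

```python
from itertools import permutations


def pivot_counts(quota, votes):
    """Count, for each player, the orderings in which that player is pivotal."""
    counts = {player: 0 for player in votes}
    for order in permutations(votes):
        running_total = 0
        for player in order:
            running_total += votes[player]
            if running_total >= quota:
                # This player turns a losing coalition into a winning one.
                counts[player] += 1
                break
    return counts


votes = {"A": 21, "B": 10, "C": 7, "D": 5, "E": 4}  # the 30(21-10-7-5-4) game
counts = pivot_counts(30, votes)
n_orderings = sum(counts.values())
for player, count in counts.items():
    print(player, count, round(count / n_orderings, 2))
```

Dividing each count by the 120 orderings gives the proportions .65, .15, .07, .07, and .07 reported in the text.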


The Core 


As with most game-theoretic solution con- 
cepts (Murnighan, 1978a), the core (when it 
exists) identifies a stable payoff for each po- 
tential coalition, without predicting the rela- 
tive frequencies of the different coalitions. 
The core is based on the concept of domina- 
tion. One outcome dominates another if the 
members of a “new” coalition receive greater 
payoffs than they did before, and they have 
the power to enforce the new outcome. For 
example, in the 30(21-10-7-5-4) game, a
coalition between A and B may form, giving 
A 80 and B 20. This outcome is dominated 
by an ADE coalition with a payoff of 90-5-5. 
In the ADE coalition, Players A, D, and E 
receive payoffs that are better than the pay- 
offs they received in the AB coalition, and 
the ADE coalition can obtain the 100-point 
payoff. But just as the ADE (90-5-5) out- 
come dominates AB (80-20), every outcome 
can be dominated by one where Player A re- 
ceives more and more points by forming a 
smaller coalition or one with previously ex- 
cluded players. The only outcome that is 
undominated and therefore in the core of this 
game gives Player A the entire payoff. Al- 
though Players B, C, D, and E may not be 
disposed to accept this payoff, it remains the 
only undominated outcome and is the only 
outcome in the core. Like the Roth-Shapley
model, the core is a purely prescriptive model; 
behavioral assumptions (e.g., about the play- 
ers’ willingness to obtain particular outcomes) 
are not incorporated into the model. Thus, 
this extreme prediction might not be expected 
to occur in an actual game. Nevertheless, the 


core implies that Player A’s payoffs should 
increase as play in the 30(21-10-7-5-4) game 
continues. Indeed, whenever a game gives one 
player veto power, the core is the same, with 
the veto player receiving the entire payoff.
The core is included in, and for veto games 
subsumes, other game-theoretic models. In 
particular, the various forms of the bargain- 
ing set (Aumann & Maschler, 1964; Mur- 
nighan, 1978a) are all identical with the core 
for this game, and their predictions are also 
tested with, although not differentiated from, 
the core. Other game-theoretic literature (Aumann,
1964; Debreu & Scarf, 1963) also suggests
that the core may result from very competitive
play. Thus, support for the core
would also suggest that the games used in 
this study were played very competitively. 
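
The domination argument for the 30(21-10-7-5-4) game can be illustrated with a small Python sketch. The names used here are hypothetical, and the sketch rests on two stated assumptions: an outcome is treated as dominated whenever the members of some minimum winning coalition currently hold less than the 100-point prize between them (so they could re-divide the prize and all gain), and payoffs are restricted to multiples of 5 points to keep the search small.

```python
from itertools import product

PLAYERS = "ABCDE"
# Minimum winning coalitions of the 30(21-10-7-5-4) game (see Table 1).
MWCS = ["AB", "ACD", "ACE", "ADE"]


def dominating_coalitions(payoffs, coalitions=MWCS, prize=100):
    """Return the coalitions whose members jointly receive less than the prize."""
    return [c for c in coalitions if sum(payoffs[p] for p in c) < prize]


def undominated_outcomes(step=5, prize=100):
    """Search all divisions of the prize (in multiples of `step`) for
    outcomes that no minimum winning coalition can improve upon."""
    units = prize // step
    found = []
    for others in product(range(units + 1), repeat=len(PLAYERS) - 1):
        if sum(others) > units:
            continue
        amounts = [(units - sum(others)) * step] + [u * step for u in others]
        outcome = dict(zip(PLAYERS, amounts))
        if not dominating_coalitions(outcome):
            found.append(outcome)
    return found


print(dominating_coalitions({"A": 80, "B": 20, "C": 0, "D": 0, "E": 0}))
print(undominated_outcomes())
```

Under these assumptions, the 80-20 division in the AB coalition is dominated by each of the three-player coalitions, and the only undominated division on the grid gives Player A all 100 points.
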
Table 1 lists the predictions of each of the models for each of the games.
Clearly the core makes more extreme predictions than either of the other two
models. Differentiations between the weighted probability and the Roth-Shapley
models are more difficult, as they have been in previous studies (Murnighan,
1978b; Murnighan & Roth, 1978). Only in the 28(17-8-7-6-5) game are the
predictions very different, and then only for Player A. Thus, the data from
the 28(17-8-7-6-5) game provide the best test of these two models. An
additional test of the weighted probability model’s predictions for the
frequencies of each coalition can also be conducted. The model predicts equal
frequencies for the minimum winning coalitions in the 20(17-6-5-4-3) and
28(17-8-7-6-5) games. For the 38(27-13-12-6-5) and 30(21-10-7-5-4) games, the
two-person coalitions should form twice as often as the three-person
coalitions.

For the games in the Michener et al. (1976) study, the core prediction was the
same as for this study; the weighted probability model predicted payoffs of
60-40 in AB coalitions, which should have formed twice as often as ACD
coalitions (with corre[...] payoffs); and the Roth-Shapley prediction for
these games is (58, 25, 8, 8), for overall payoffs. The data in that study
show that Player A averaged well over 60 points in these games. For the
Michener et al. (1976) study, then, both the weighted probability and
Roth-Shapley models underestimated the veto players’ outcomes. Because each
group played each of the games only once, it is impossible to determine
whether the veto players’ payoffs increased over trials. Nevertheless, their
outcomes were a considerable distance from the core.

The present study affords a finer test of 
the models and also assesses the effects of 
“default” conditions, a manipulation of time 
pressures and aspirations, that are not in- 
corporated into any of the models. 


Time Pressure and Aspirations 


Research on two-person bargaining has often manipulated the time bargainers
are given to reach an agreement (e.g., Yukl, 1974). When both parties lose all
or part of a payoff if they do not agree by a certain time, agreements tend to
be more likely and the players’ aspirations and demands tend to drop (Rubin &
Brown, 1975). Time pressure has not, however, been systematically studied in
coalition situations. In the present study, if no coalition formed after three
attempts, one of three “default” conditions resulted: (a) All five players
received a payoff of zero; (b) the veto player received 60 points, whereas
each of the others received 10; or (c) bargaining continued until an agreement
was reached (i.e., there were no defaults). These conditions might be expected
to affect the aspirations of the veto and nonveto players differentially. For
the veto player, either of the default conditions should lead to reduced
aspirations, particularly as the third attempt to form a coalition approaches.
In addition, the 60-10 default condition might be expected to lead to higher
aspirations for the veto players than the zero default condition would (cf.
Rubin & Brown, 1975). On the other hand, should the veto players’ payoffs
approach the core, as they did in at least one prior study (Murnighan & Roth,
1978), they may not perceive the two default conditions differently: Both may
be regarded as serious setbacks.

For the nonveto players, the zero default condition might reduce each
individual’s own aspirations, compared to the no default condition. But there
is always the possibility that “hurting” the veto player has positive value.
Indeed, in previous research where the strongest of the players always
received the letter “A” (e.g., Murnighan, 1978b), considerable anti-A feelings
surfaced quickly. In addition, invoking the zero default may make Player A so
cautious as to increase the nonveto players’ payoffs in subsequent coalitions
(cf. Murnighan & Roth, 1978). In the 60-10 condition, on the other hand, the
nonveto players’ aspirations should clearly increase. Only a small minority of
the theoretical predictions (see Table 1) indicate that each of the nonveto
players will be receiving 10 points on the average. Defaults, therefore, are
advantageous to the nonveto players in this condition and should result in
“tougher” bargaining (Bartos, 1970) even when defaults do not occur.

Considering the interactions of the default conditions with other aspects of
the games, however, reduces the clarity of their potential effects. For
instance, in the 60-10 default condition, the mixed-motive nature of coalition
games removes restrictions on the potential payoffs that might result.
Although pushing for a default as often as possible and refusing to take a
payoff that is less than 10 points seems like a desirable strategy, the
possibility of exclusion from the winning coalition, with loss of any payoff
for that trial, may limit such a strategy’s usefulness. Indeed, players who
are willing to accept smaller payoffs typically are included more frequently
in the winning coalition and also, at least in one study, accumulated larger
overall payoffs (Murnighan & Roth, 1977). Thus the nonveto players are faced
with a dilemma: Although pushing for a default payoff in either default
condition may result in greater joint or long-term gain, a willingness to take
a smaller payoff may potentially increase individual, short-term gain (cf.
Thibaut & Kelley, 1959). Indeed, using strict definitions for rationality
(Simon, 1976), individual nonveto players who reject payoffs that are greater
than the default payoff on the third attempt are “irrational,” at least in the
short run. Add to this the possibility that giving deference to the powerful
player may result in repeated inclusion in winning coalitions, and it is clear
that the nonveto players may be torn in several directions.

Although several pressures are inherent in 
these coalition situations, the effect of aspira- 
tions is predicted to lead to higher payoffs 
for the veto player in the no default condition 
compared to the zero and 60-10 conditions. 
A priori predictions contrasting the 60-10 
and zero default conditions, however, are dif- 
ficult to make. 


Method 
Subjects 


Subjects were 120 master’s-level students enrolled 
in a behavioral science course in a commerce depart- 
ment. For the most part, these students had little or 
no previous background in behavioral science. To 
complete the project requirements in the course, stu- 
dents had a choice between writing a library research 
paper or participating in the bargaining exercises and 
completing a paper analyzing their own strategies in 
each of the games. The performance in either option 
accounted for 15% of the grade in the course. Stu- 
dents who participated in the bargaining exercises 
(presented as “exercises in strategy selection”) were 
informed that their performance in the games would 
be compared to the performance of other students 
in the same positions (i.e., with the same vote total) 
as themselves. Students could obtain an “A” grade 
for the project in any of three ways, by: (a) scoring 
better than the average total points for players in the 
same position as themselves, (b) scoring above the 
mean for similarly positioned players in the other 
groups in three of the four games, (c) doing an ex- 
cellent job of analyzing their strategies in their paper. 
All students, regardless of their game performance, 
were required to complete the paper. Only with a 
relatively poor paper and below average performance 
in the games would a student earn a “B” for the 
project.* 

Graduate students were used in this study in an 
attempt to parallel “real world” coalition bargainers 
more closely. Although none of the theories tested 
here speak to the expertise of the populations they 
address, it seems that most individuals involved in 
coalition bargaining, such as the examples mentioned 
earlier (Middle Eastern oil producers, corporation 
executives, and members of the United Nations 
Security Council), are both more intelligent and 
more experienced in bargaining situations than the 
general population. Including graduate students who 
have some stake in the outcomes of the bargaining 
and who are given repeated experience with coalition 
bargaining should more closely mirror such a popula- 
tion (cf. Murnighan, Note 1). 

There were 24 5-person groups in all: 6 composed 
of 5 males; 7 with 1 female and 4 males; 8 with 2 
females and 3 males; 2 with 3 females and 2 males;
and 1 with 4 females and 1 male. Most of the
female participants were commerce majors.


Design and Analysis 


A number of factors were manipulated in this
study. Each group played each of the four games for
a total of 12 trials in one of the three default
conditions. Thus, games and trials were within-group
variables, and default conditions was a between-group
factor. Each game was played by five players, one in
each of the A, B, C, D, and E positions; players
were assigned different positions in the different
games. Thus, player positions was a within-group
variable. Finally, the games were played in one of
eight different orders. Two Latin square designs were
used to assign groups to the eight game-order
combinations. Thus, each game was played an equal
number of times in each order. The variable called
“order,” then, merely indicates whether a game was


with each other, their behavior in the games may not
have been strictly independent, violating one of the
assumptions of analysis of variance. To attempt to
correct for this deficiency, the conservative test
proposed by Box (1954), discussed by Winer,
was used in all analyses for the repeated variables.
Although the value of the F ratios remains the same
in each case, the degrees of freedom are considerably
reduced. For instance, for the main effect of games
on the veto player’s payoffs, the normal degrees of
freedom are 3 and 12; for the conservative test, the
degrees of freedom are reduced to 1 and 4.
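
The adjustment in this example can be sketched in a few lines. This is only an illustration, assuming that the conservative test divides both the numerator and the denominator degrees of freedom by the degrees of freedom of the repeated factor, which reproduces the reduction from 3 and 12 to 1 and 4 given in the text. The function name `conservative_df` is hypothetical.

```python
def conservative_df(df_effect: int, df_error: int, df_repeated: int) -> tuple[float, float]:
    """Conservative (lower-bound) degrees of freedom for a repeated-measures F test.

    Assumption: both degrees of freedom are divided by the degrees of freedom
    of the repeated factor; the F ratio itself is left unchanged.
    """
    if df_repeated <= 0:
        raise ValueError("df_repeated must be positive")
    return df_effect / df_repeated, df_error / df_repeated


# Example from the text: normal degrees of freedom of 3 and 12.
print(conservative_df(3, 12, 3))
```

Because the F ratio is unchanged and only the reference distribution changes, an effect has to be larger to reach significance under the reduced degrees of freedom.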


Procedure 

The participants were given instructions about the
coalition games in the third class, prior to the
first session. Several examples of the use
of the procedure (in games not used later) were
discussed. Theories on bargaining were not included in
this discussion and were only discussed after
completion of the exercises. The players were told that
there would be 12 trials in each game, and that on each
trial the winning coalition would divide a prize of
100 points among its members. They were encouraged
to do as well as they could (i.e.,


2 Participation by students from an instructor’s
own class can raise ethical issues concerning
potential coercion. A copy of the safeguards used by
the authors to guard the rights of the participants
in this study is available on request.
The Assignment of the 24 Groups to the Condition-Order-Game Combinations


                                              Game
Condition and order   20(17-6-5-4-3)   38(27-13-12-6-5)   30(21-10-7-5-4)   28(17-8-7-6-5)
No default
  1                   4, 7             3, 8               2, 6              1, 5
  2                   3, 6             2, 7               1, 5              4, 8
  3                   2, 5             1, 6               4, 8              3, 7
  4                   1, 8             4, 5               3, 7              2, 6
Zero default
  1                   12, 15           11, 16             10, 14            9, 13
  2                   11, 14           10, 15             9, 13             12, 16
  3                   10, 13           9, 14              12, 16            11, 15
  4                   9, 16            12, 13             11, 15            10, 14
60-10 default
  1                   20, 23           19, 24             18, 22            17, 21
  2                   19, 22           18, 23             17, 21            20, 24
  3                   18, 21           17, 22             20, 24            19, 23
  4                   17, 24           20, 21             19, 23            18, 22


points) because their performance would (and did) 
determine part of their course grade. 
Students were randomly assigned to groups, with
the constraint that members be unacquainted. Each
group met for their sessions at the same hour each
week for four weeks. They were told which game
they would be playing each week, but not which
position they would hold throughout each game.
Participants were told that they would not be in a
given position more than once. They were allowed
to discuss the games with other members of their
class, excluding members of their own group. In
fact, the players were encouraged to formulate
strategies for each position prior to each game.
Each group played each of the games for a total of
12 trials, where a trial was defined as the formation
of a winning coalition. For each game, player
positions were designated A, B, C, D, and E, with Player
A having the most resources, Player B having the
second most resources, and so forth. Assignment of
players to positions was predetermined by randomly
assigning members of each group to a particular
slot. The slot was unknown to the players and
indicated what position each player would play in
each of the four games. For instance, the first slot
might indicate that an individual would be assigned
to Position A in the 30(21-10-7-5-4) game, B in
the 20(17-6-5-4-3) game, and so forth. The
assignments for each slot were determined randomly,
with the constraint that no player would be assigned
to the same position more than once and that the player
who was not assigned to the A position in any game
was given the B position in the 30(21-10-7-5-4)
game. These constraints were rough attempts to
establish equity among the assignments.
During the games, the players were seated around


Note. Each of the groups included Players A-E; they played each game for 12 trials.


a set of opaque partitions that shielded them from 
view of each other and the experimenter. At each 
group’s first session, the experimenter read specific 
instructions about the procedures for the games. The 
players made offers on each trial by means of written 
offer slips, which required indicating to whom one 
wished to send offers and also a proposal regarding 
the division of rewards for the prospective coalition 
members. For example, if Player X wished to form 
an XY coalition, he or she addressed an offer to 
Player Y and specified a division of the rewards 
(e.g., 60 for X and 40 for Y) on the offer slip. A
player was required to send an offer slip to each 
player included in the proposed coalition. Thus, if a 
player proposed a three-person coalition, two iden- 
tical offer slips were sent, one to each of the proposed 
coalition partners. Players were also told that they 
could not send an offer to one person to form one 
coalition and a second different offer to another per- 
son to form another coalition. This procedure, orig- 
inally used by Komorita and Meek (Note 2), allowed 
two-, three-, and four-person coalitions to form in a 
single step. Thus, although large coalitions may be 
more difficult to form, the only difficulty inherent in 
this procedure is the need to fill out additional offer 
slips. After the players had completed the offers, the 
experimenter collected, examined, and distributed 
them to the proper persons.

After receiving an offer, each person could accept 
or reject it by marking “Accept” or “Reject” at the 
bottom of each offer slip. A person receiving more 
than one offer could accept at most one, unless the 
offers proposed the identical payoff division for the 
same coalition. Hence each person could only accept 
offers to form a single coalition on each trial. Fur- 
thermore, in determining a winning coalition, any 
player’s proposal, if accepted, had priority over any
offer he or she might accept, thus committing the 
player to his or her own offer. After the offers had 
been accepted or rejected, the experimenter collected 
the offer slips and announced the winning coalition, 
if one had formed, and the payoff division. A coali- 
tion was declared winning if all the proposed coali- 
tion partners accepted the offer. If no coalition 
formed because at least one person rejected each of 
the proposed coalitions, the procedure was repeated. 
This procedure allowed for acceptance within the 
group of two proposals on the same trial. For in- 
stance, if A sent an offer to C, D sent an offer to A, 
and both offers were accepted, AC would be declared 
the winning coalition because A was committed to 
his or her offer (invalidating his or her acceptance). 
While the offer D sent to A did not result in a
coalition, it indicated the exact nature (i.e., how much D
was willing to offer) of D’s interest. When two
coalitions were accepted in this manner, such that each
was invalidated by the other (e.g., ACD and ACE),
the players were informed of the situation and the
trial was rerun.

A practice trial was conducted before the start of
the first session. Immediately after the practice trial,
the players were assigned to their positions for that
game. Lists of the resources (i.e., votes) for each
position and the winning coalitions were also
provided. No verbal communication was permitted
thereafter; hence, the players could not identify each
others’ positions.
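
The lists of winning coalitions given to the players follow directly from each game's vote distribution. The sketch below is illustrative only: it assumes that a label such as 20(17-6-5-4-3) gives the number of votes needed to win, followed by the votes held by Players A through E, and that a coalition wins when its votes meet or exceed that number. The function names are hypothetical.

```python
from itertools import combinations

# Assumed reading of the game labels: quota, then votes for Players A-E.
GAMES = {
    "20(17-6-5-4-3)": (20, [17, 6, 5, 4, 3]),
    "38(27-13-12-6-5)": (38, [27, 13, 12, 6, 5]),
    "30(21-10-7-5-4)": (30, [21, 10, 7, 5, 4]),
    "28(17-8-7-6-5)": (28, [17, 8, 7, 6, 5]),
}


def winning_coalitions(quota, votes):
    """Return every coalition (as a set of player letters) that meets the quota."""
    players = "ABCDE"[: len(votes)]
    weight = dict(zip(players, votes))
    return [
        set(members)
        for size in range(1, len(players) + 1)
        for members in combinations(players, size)
        if sum(weight[p] for p in members) >= quota
    ]


def minimal_winning_coalitions(quota, votes):
    """Winning coalitions that stop winning if any single member leaves."""
    winning = winning_coalitions(quota, votes)
    minimal = [c for c in winning if not any(other < c for other in winning)]
    return sorted("".join(sorted(c)) for c in minimal)


def is_veto_player(player, quota, votes):
    """A veto player belongs to every winning coalition."""
    return all(player in c for c in winning_coalitions(quota, votes))


for label, (quota, votes) in GAMES.items():
    print(label, minimal_winning_coalitions(quota, votes),
          "A has veto:", is_veto_player("A", quota, votes))
```

Under these assumptions Player A belongs to every winning coalition in all four games, and the number of two-person coalitions available to A falls from four in the 20(17-6-5-4-3) game to none in the 28(17-8-7-6-5) game.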

The instructions were summarized for the players 
at the start of their second, third, and fourth ses- 
sions. Practice trials were not run, but assignment 
to one of the five positions and information about 
the resources for each of the positions and the win- 
ning coalitions were distributed after the players were 
seated behind the partitions. Thus, for each game, 
the players were not informed of the identity of the 
players in the other positions.

Lengthy discussions of power, coalition, and
bargaining theories were conducted during several classes
at the end of the semester. The central focus in each
of these discussions was the students’ behaviors
during the exercises and their analyses of strategies in
their papers. Individuals were not identified in these
discussions, and all other questions were answered.


Results 
Sex Effects 


Vinacke (1959) and Komorita and Moore
(1976) have reported that males tend to be,
in Vinacke’s terms, “exploitative” and that
females tend to be “accommodative” in
coalition situations. There appeared to be no
effects due to sex in this study: Males tended
to obtain overall payoffs as veto players only
marginally higher than females did (89.8 vs.
87.2); similarly, females obtained only
marginally higher payoffs as nonveto players (2.6
vs. 2.3). However, the random assignment of
players to slots resulted in a disproportionate
number of male veto players in the
20(17-6-5-4-3) game. Thus it is difficult to draw
firm conclusions.*


Coalition Frequencies 


Because each game resulted in a different
set of minimum winning coalitions, the
frequencies of the different coalition types were
analyzed in four analyses of variance, one for
each game. Thus, four Default (3) × Order
(4) × Trial Block (3 blocks of 4 trials each)
× Coalition Type (which differed over games;
see Table 1) analyses of variance were run,
with the frequencies of each coalition type
converted to arcsin values * as the dependent
variable in each case. Trials where nonminimum
winning coalitions formed (not predicted
by any of the models, and occurring
only 2% of the time) or where defaults
resulted were not included in these analyses.
The data are shown in Table 3.

In the 28(17-8-7-6-5) game, significant
effects were found for default conditions, F(2,
12) = 3.92, p < .05, and the default by
coalition type interaction, F(10, 60) = 2.44, p <
.05. The default main effect merely reflects
the fact that due to a total of 15 defaults


Type interaction appears to be primarily a
result of an unusually high frequency of one type of
coalition in the no default condition and
relatively equal frequencies for the other
coalitions in each default condition.
In the 30(21-10-7-5-4) game, the only
significant effect was for coalition type, F(3,

3 Indeed, because there were no female veto
players in the no default 20(17-6-5-4-3) condition,
reliable statistical analyses were not possible.

4 The transformation used was


x′ = 2 arcsin √x,

where x is expressed as a proportion of the
maximum possible frequency for any coalition in a
trial block. Also, for x = 0, 1/(4n) was
used; for x = 1, 1 − 1/(4n) was used (Bartlett,
1947).
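
A minimal sketch of this transformation follows. It assumes that n is the maximum possible frequency for a coalition in a trial block (the denominator of the proportion x); the function name `arcsin_transform` is hypothetical.

```python
import math


def arcsin_transform(frequency: int, n: int) -> float:
    """Return x' = 2 * arcsin(sqrt(x)) for the proportion x = frequency / n.

    Assumption: n is the maximum possible frequency in a trial block.
    Proportions of exactly 0 and 1 are replaced by 1/(4n) and 1 - 1/(4n)
    before transforming, as described in the footnote.
    """
    if n <= 0 or not 0 <= frequency <= n:
        raise ValueError("frequency must lie between 0 and n, with n > 0")
    x = frequency / n
    if frequency == 0:
        x = 1 / (4 * n)
    elif frequency == n:
        x = 1 - 1 / (4 * n)
    return 2 * math.asin(math.sqrt(x))


# A coalition that formed on 2 of 4 possible occasions.
print(arcsin_transform(2, 4))
```

The transformation stretches proportions near 0 and 1, which makes the variance of the transformed values less dependent on the proportion itself.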


Coalition Frequencies and the Predictions of the Weighted Probability Model (in Parentheses)


                                                            Coalition
Game               AB          AC          AD        AE        ABC       ABD       ABE       ACD       ACE       ADE       Default
20(17-6-5-4-3)     80 (72)     62 (72)     50 (72)   80 (72)   1         0         0         0         1         2         0
38(27-13-12-6-5)   88 (112)    111 (112)   -         -         1         0         0         0         1         81 (56)   6
30(21-10-7-5-4)    134 (112)   -           -         -         1         1         0         40 (56)   45 (56)   59 (56)   5
28(17-8-7-6-5)     -           -           -         -         41 (48)   48 (48)   45 (48)   29 (48)   51 (48)   57 (48)   15


Note. Totals do not always add to 288 because of the exclusion of infrequent coalitions such as ABDE, ACDE, etc. Coalitions
that were not predicted to form have no parenthetical entry.


36) = 13.91, p < .0001. Post hoc tests
indicated that the AB coalition (n = 134)
formed significantly more than either ACD,
ACE, or ADE (ns = 40, 45, 59, respectively).
In the 38(27-13-12-6-5) and 20(17-6-5-4-3)
games, no significant effects were found.
These data show some support for the
weighted probability model’s frequency
predictions (see Table 3). In the 28(17-8-7-6-5)
and the 20(17-6-5-4-3) games, only one
effect was significant for coalition types (a
Default × Coalition Type interaction),
primarily resulting from a high frequency for
one coalition type in one particular condition.
No main effects were found for coalition
types; overall, then, the weighted probability
model’s predictions in these two games were
supported. In the 30(21-10-7-5-4) game, the
AB coalition was significantly more frequent
than the three-person coalitions, as predicted.
However, it was considerably more frequent
than even the model predicted. Finally, in the
38(27-13-12-6-5) game, there were no
significant differences among the coalitions,
although the model predicts that AB and AC
should each form twice as frequently as ADE.

Thus the model receives only marginal
support, in the two games where equal likelihood
predictions are made.


The Veto Players’ Payoffs 


The veto players’ payoffs were analyzed in
two Default (3) × Game (4) × Trial (12) ×
Order (4) analyses of variance, one including
the default payoffs, the other excluding them.
Since the results of the two analyses were
almost identical, the data excluding the
default payoffs will only be presented when the

outcomes of the analyses differed. In addition, 
because the distributions of player A’s payoffs 
were negatively skewed, both in the entire 
distribution and within almost every cell of 
the design, medians and means for significant 
effects are reported. Although the analysis of 
variance is robust with respect to skewed dis- 
tributions, reporting the medians may be a 
better representation of the central tendency 
within each cell. 

Significant main effects were found for
games, F(1, 4) = 19.83, p < .05, for trials,
F(1, 12) = 9.93, p < .01, and for order, F(1,
12) = 6.85, p < .05, and were marginally
significant for the default conditions, F(2,
12) = 3.12, p < .10. Post hoc tests using the
Newman-Keuls procedure indicated that the
veto players received significantly more in the
20(17-6-5-4-3) game than in the
38(27-13-12-6-5) or 30(21-10-7-5-4) games and
received the lowest payoffs in the
28(17-8-7-6-5) game (see Table 4). The veto players’
payoffs consistently increased over trials.
Player A received significantly less in the first
game played (M = 84.4, Mdn = 90), and no
differences, again using the Newman-Keuls
technique, were noted among the other orders
(91.1 < Ms < 92.5, 95 < Mdns < 97). The
marginal default effect suggests that Player A
received the highest payoffs in the no default
condition. This effect disappeared, however,
when the default trials were removed, F(2,
12) = 2.17, ns.


5 Tests of homogeneity, using Huynh and Feldt’s
also run for each of
Table 4


Means and Medians of the Veto Players’ Payoffs in Each Game Over the 12 Trials


                                                     Trial
Game               1      2      3      4      5      6      7      8      9      10     11     12     M      Mdn
20(17-6-5-4-3)     90.4   90.4   93.6   95.1   95.7   94.5   95.9   95.7   96.5   96.0   96.8   95.9   94.7   98
38(27-13-12-6-5)   81.4   87.4   87.6   88.4   91.1   92.1   92.8   94.0   93.6   94.4   94.6   92.2   90.8   94
30(21-10-7-5-4)    81.8   85.9   87.5   89.5   91.8   91.0   91.0   90.9   90.6   93.4   93.8   92.9   90.0   95
28(17-8-7-6-5)     78.7   79.7   84.2   77.0   77.5   86.9   84.4   92.6   95.5   99.4   90.5   91.4   83.9   9)

M                  83.1   85.9   88.2   87.5   89.0   91.1   91.0   90.8   91.5   93.1   93.9   93.1   -      -
Mdn 85590) 90> ze SE SE 95S 96 Gis 97 - - 
Nonveto Players’ Payoffs


The analysis of the nonveto players’
payoffs was similar to that for Player A but
included an additional factor, the players’
positions (i.e., B, C, D, or E). The effects that do
not include the players factor reproduce the
results for the veto players’ payoffs and are
not reported. The significant effects that did
include the players factor were the players
main effect, F(1, 12) = 15.10, p < .01, and
the Games × Players interaction, F(3, 9) =
3.98, p < .05. Table 5 displays the means for
the Games × Players interaction and the
associated main effects. Post hoc tests indicate
that Player B received significantly higher
payoffs than the other players in the
30(21-10-7-5-4) game did and that Players B and
C received significantly higher payoffs than
Players D and E in the 38(27-13-12-6-5)
game. In both cases, then, nonveto players
who held positions that allowed for the
formation of a two-person coalition with Player
A received significantly higher payoffs than
nonveto players did in the same game who
were required to form three-person
agreements.

In addition, combining the mean payoffs for
players who were required to form
three-person coalitions compares very favorably to
the mean payoffs for players who could form
two-person coalitions. For instance, in the
30(21-10-7-5-4) game, combinations of
Players C and D’s payoffs, C and E’s payoffs,
or D and E’s payoffs range from 3.88 to 4.10;
Player B’s mean payoffs of 4.00 are almost
identical. Thus, it appears that, across
coalition types, the payoffs of the veto player’s
partnership sets were fairly stable within a
game. This result is predicted by several game
theoretic models (e.g., Aumann & Maschler’s,
1964, bargaining set).
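
The comparison for the 30(21-10-7-5-4) game is simple arithmetic on the Table 5 means, sketched below. The function name `partner_set_totals` is hypothetical; the four means are taken from Table 5.

```python
from itertools import combinations

# Mean payoffs for nonveto players in the 30(21-10-7-5-4) game (Table 5).
means_30 = {"B": 4.00, "C": 1.88, "D": 2.00, "E": 2.10}


def partner_set_totals(means, two_person_partners, three_person_partners):
    """Total mean payoff going to each set of partners Player A could recruit.

    Two-person partners are counted alone; players who need a three-person
    coalition are combined in pairs, because A must recruit two of them.
    """
    totals = {p: means[p] for p in two_person_partners}
    for first, second in combinations(three_person_partners, 2):
        totals[first + second] = round(means[first] + means[second], 2)
    return totals


print(partner_set_totals(means_30, ["B"], ["C", "D", "E"]))
```

The pair totals fall between 3.88 and 4.10, bracketing Player B's 4.00, which is the stability across partner sets noted in the text.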

One way to explore the determinants of the
veto players’ payoffs is to investigate the
correlations between their payoffs and the offers
they received, along with their own demands.
It has been suggested that players who are


Table 5
Mean Payoffs for the Nonveto Players in Each Game, Including Default Payoffs

                                  Player
Game               B       C       D       E       M
20(17-6-5-4-3)     1.76    .98     1.07    1.49    1.32
38(27-13-12-6-5)   2.87    2.88    1.42    1.33    2.13
30(21-10-7-5-4)    4.00    1.88    2.00    2.10    2.50
28(17-8-7-6-5)     3.82    3.14    3.15    3.58    3.42
M                  3.11    2.22    1.91    2.12


Note. Means with common subscripts within a main effect or the interaction are not significantly different
from one another at the .05 level using the Newman-Keuls procedure.
given veto power can obtain almost any
payoff they desire, at least when there are no
default conditions (Murnighan & Roth,
1978). In this study, Player A’s demands (his
or her offers to self) were more highly
correlated (significantly) with A’s payoffs than
were either the mean of the nonveto players’
offers or the highest of the nonveto players’
offers. Player A’s payoffs and demands showed
a correlation of .94 (n = 1126) when default
payoffs were excluded and correlations
ranging from .88 to .98 in the four games. The
highest of the nonveto players’ offers revealed
an overall correlation with A’s payoffs of .16,
with correlations ranging from .66 to .83 in
each of the games. Needless to say, each of
these correlations is significant. More
important, in each game the correlations
between A’s payoffs and A’s demands were
significantly higher (using two-tailed t tests,
Edwards, 1967, pp. 250-251) than the
corresponding correlations between A’s payoffs
and the nonveto players’ offers. It appears,
then, that veto players controlled their payoffs
more than the nonveto players, replicating
previous findings (Murnighan & Roth, 1978),
even when defaults were possible.


Defaults 


The frequency of defaults in the game/
default conditions is shown in Table 6. Given
the total of only 26 defaults, reliable
statistical analyses cannot be used on these data.
In addition to the data in the table, which
suggest that the 60-10 default condition and
the 28(17-8-7-6-5) games led to more
frequent defaults, defaults were more frequent
early in each session, in the first set of four

Table 6 


trials (n = 15), rather than later, in the sec- 
ond (6) or third set (5). 

The frequency of trials that reached at least 
three attempts is also shown in Table 6. The 
data suggest that reaching the third attempt 
was most likely to lead to a default in 
the 28(17-8-7-6-5) and 38(27-13-12-6-5) 
games in the 60-10 default condition. 

The uneven distribution of trials that
reached three attempts necessitated a least
squares analysis of variance of the offers to
the veto player over the three attempt trials.
Using default conditions (3), games (4),
players (5), and attempts (first, second, and
third) as independent variables, the results
yielded a suggestive interaction between
default conditions and attempts, F(1, 50) =
3.84, p < .07. The means in the interaction
suggest relatively stable, slightly increasing
offers to A over the three attempts in the no
default and zero default conditions and
decreasing offers to A in the 60-10 condition.
Even in this condition, however, the mean
offers to A dropped only slightly, from 83.3
to 80.6.


Discussion 


The results clearly do not support the
weighted probability and Roth-Shapley
models. Just as they did in the Michener et al.
(1976) study, the veto players’ payoffs far
exceeded the predictions of either model. Al-
though the weighted probability model re- 
ceived some support from the data on coali- 
tion frequencies, it consistently underesti- 
mated the veto players’ payoffs. The Roth- 
Shapley model is similarly inadequate, even 
though its predictions are somewhat closer to 


Frequency of Defaults in Each Game/Default Condition


                                Game
Default        28         30         38         20         Total
No default     - (18)     - (10)     - (2)      - (4)      - (34)
Zero           7 (16)     0 (13)     2 (6)      0 (2)      9 (37)
60-10          8 (15)     5 (14)     4 (8)      0 (4)      17 (41)
Total          15 (49)    5 (37)     6 (16)     0 (10)     26 (112)


Note. The frequency of trials that reached three attempts is shown in parentheses.
the data. The models received some very
marginal support from the players’ payoffs for the
significant differences that were found for
nonveto players who could form two-player
coalitions in the 38(27-13-12-6-5) and the
30(21-10-7-5-4) games: Even though these
players’ payoffs were smaller than predicted,
they were significantly larger than those of
players in the same games who were required
to form at least three-person coalitions with
Player A, as predicted. Also, in rank ordering
the players’ payoffs, both the models did
fairly well (rank correlation coefficients
between observed and predicted payoffs of .93),
suggesting a correct depiction of the games’
structures, if not the specific outcomes.

The core, on the other hand, was clearly not 
reached. However, the payoffs of the veto 
players consistently moved toward the core 
in each of the four games studied. The core 
is also not supported by the significant differ- 
ence found among the four games. In addition, 
the payoffs for the veto players in the 28(17- 
8-7-6-5) game were considerably below those 
in the core. Even in this game, however, the 
payoffs consistently increased over trials. In
addition, the consistently strong correlations,
across games and the other conditions,
between the veto players’ demands and payoffs
strongly suggest that they had control of
their payoffs. Just as Murnighan and Roth
(1978) found that when the veto players de- 
manded 99% or more of the payoff, over 90% 
of the time they were receiving it, so too the 
present data suggest near-complete control for 
the veto player. Whether veto players could 
push their payoffs closer to the entire payoff 
if they were tougher in their demands remains 
to be tested in future research. This study, 
along with other recent research (e.g., Fiorina 
& Plott, 1978; Murnighan & Roth, 1977, 
1978), shows promise for the predictive capa- 
bilities of the core. 

The findings, then, clearly do not offer 
strong support for the weighted probability 
model, the Roth-Shapley value, or the core. 
Each received marginal support, however: the 
core by the increasing payoffs received by the 
veto players and the other models by their 
ordering of the magnitudes of the players’ 
payoffs across games. The effects for games 
and for trials stand out and suggest that a
combination of the properties of the two kinds
of models may be necessary to predict the
outcomes of coalition bargaining games that
include a veto player. Komorita and Chertkoff’s
(1973) bargaining theory, although not
testable in veto games, suggests that payoffs will
change over trials, increasing for the most
powerful player, as they have here. The
difficulty in directly adapting such an insight is
in determining where a veto player’s payoffs
will begin and where they will end. In the
present study, the final median payoffs of the
veto player were adequately, but not perfectly,
represented by the core. The starting points
were quite close to values midway between the
core and either the weighted probability or
Roth-Shapley models (i.e., 90, 85, 81-82,
75-80 in the 20, 38, 30, and 28 games,
respectively). A model that incorporates trials as a
factor, with a veto player’s payoffs increasing
from such a core-weighted probability-Roth-Shapley
midpoint on the early trials to the
core on later trials, dependent on the game
being played, would explain this study’s data
fairly well. Difficulties arise, however, in
conceptually justifying such predictions in a
logical way. In addition, the different conditions
that surround the bargaining situation,
including, for example, the physical separation of
individuals, as was the case in this study, may
also determine the magnitude of the payoffs
obtained by the bargainers. Given the fact
that few studies have been conducted on the
dynamics of games that include veto players,
further research, even of an exploratory
nature, might contribute sufficient information
for the construction of theoretical propositions
that are viable both conceptually and
empirically. Given the current lack of clear-cut
results, it appears that not enough is
understood about veto games to present such a
theory.

The effects for the other variables studied
here, order and the default conditions, were
somewhat unexpected. The players had
considerable preparation before the games;
nevertheless, most expressed the belief that
the first session was needed to establish the
firsthand understanding necessary to bargain
effectively in the games. Indeed, players in
the A position tended to increase their payoffs,
especially in the 28(17-8-7-6-5) game, after
their first game. These results suggest that
experience may be a key factor in coalition
research, if not in “real world” situations.
Whereas most research on coalitions asks
participants to play in only a single session, this
study highlights the effects of experience,
suggesting that stability is only achieved after
a rather involved first experience.

The lack of effects found for the default
conditions was particularly surprising. It
would seem, at least, that the default
conditions should have constrained Player A’s
ability to obtain such high payoffs. The fact that
one player has veto power in each of the
games appears to have overwhelmed any
effects of this manipulation. Combined with the
results from Murnighan and Roth (1978), the
conclusion that veto players can and will
ruthlessly pursue their own rewards is difficult to
refute. Indeed, these results support the
widely held assumption that strict controls
are necessary to restrain monopolistic
organizations.

There are occasions, however, when more
socially beneficial behaviors result. In this
regard, it is important to describe one
“exceptional” group that played the
28(17-8-7-6-5) game first, in the no default condition. At
the start of the game, the veto player made a
proposal of 60-20-20 (60 points for himself,
and 20 for each of two of the other players). After
a few rejections, this agreement was finally
achieved. Even though Player A continuously
received offers exceeding 60 points, he
enforced 60-20-20 payoffs on every trial,
rotating among the nonveto players to insure equal
outcomes for each of them. By the end of the
session, all of the players were exchanging
60-20-20 offers. This group’s first session was
interesting; their second was remarkable!
The new veto player, now playing the
38(27-13-12-6-5) game, demanded 98 of the 100
points on his first offer. He received four
offers of 60-20-20! Inflexibility of both A
and the nonveto players resulted in 32
unsuccessful attempts prior to the formation of
the first coalition. And the payoff finally
agreed upon in that coalition was 60-20-20.
It appears that a norm had developed:
Payoffs of 60-20-20 were defined as appropriate;
others were not. Although this norm gradually
eroded (due mainly to the impatience of
players who were excluded from the winning
coalition), the veto players in this group
received payoffs that were considerably below
the mean over all veto players, in each of the
four games. The nonveto players, on the
other hand, were above the mean for their
position for each of the games. According to
the grading system announced prior to play,
scoring above the mean for all players in the
same position in three out of four games (or
above the overall mean) would insure an “A”
for the project. Thus, all five players in this
group received an “A”!

Clearly this was an unusual group.
However, one explanation for the group’s success is
that the first veto player’s strategy provided 
this individual with a chance to obtain a high 
grade for himself and contribute to a high 
grade for the other players in the group. Un- 
like prisoners’ dilemma games, then, where 
individual and group gains are generally in 
conflict, coalition games like these offer 
players at least one strategy that can be im- 
plemented to maximize both individual and 
group outcomes. 


Conclusion 


In summary, this study investigated four 
coalition games that included a veto player. 
The games varied the number of two-person 
coalitions that the veto player could form to 
obtain a fixed payoff in three default condi- 
tions. The games had a significant effect on 
the veto players’ payoffs, with payoffs dimin- 
ishing as fewer two-person coalitions were pos- 
sible, The other noteworthy effects included 
support for the core, particularly because of 


the consistently increasing payoffs for the 
veto player. The default conditions had little 
impact on the results. Indeed, the lack of sig- 
nificant effects for this variable suggests that 
players with veto power can control the game 
they are playing almost at will. These results 
further illustrate the need for controls of some 
kind for monopolistic organizations, Future 
research might also investigate the effects 
noted in the one “exceptional” group: When 
joint gains can be tied to higher individual 
gains (or at least potentially higher individual 
gains), then (and possibly only then) coop- 
erative, group-oriented behaviors may emerge 
(cf. Argyris, 1957). Indeed, the incidence of 
cooperative behavior in highly competitive 
situations such as those in this study has 
rarely been investigated (an exception is 
Dawes, McTavish, & Shaklee, 1977) and 
would help to move research on coalitions into 
the determination of structural mechanisms to 
increase cooperation. 


Weather, Mood, and Helping Behavior: Quasi Experiments With 
the Sunshine Samaritan 


Michael R. Cunningham 
Elmhurst College 


To establish the relationship of weather variables with helping behavior, two 
quasi-experimental field studies were conducted. In the first study, executed 
in the spring and summer and subsequently replicated in the winter, the amount 
of sunshine reaching the earth was found to be a strong predictor of a partici- 
pant’s willingness to assist an interviewer. Smaller relationships were also 
found between helping and temperature, humidity, wind velocity, and lunar 
phase. A second study was conducted indoors to control for comfort factors, 
and sunshine, lunar phase, and participant’s age and sex were found to predict 
the generosity of the tip left for a restaurant waitress. Sunshine and temperature 
were also significantly related to self-reports of mood. 


So many things I would have done, but clouds got 
in my way. . . . (Joni Mitchell, “Both Sides Now”) 


One of the most pervasive background en- 
vironmental variables in human life, a factor 
that shapes agricultural economies, makes or 
breaks recreational plans, and serves as a 
perennial topic for superficial conversation, is 
the weather. Yet although the weather appears 
to affect both emotion and social behavior 
(Campbell & Beets, 1977; for reviews, 
see Huntington, 1945; Larson, 1965; Moos, 
1976; Tromp, 1974; Winslow & Herrington, 
1949), the specific weather variables most 
closely associated with behavior changes remain 
obscure. 

A brief scan of some of the research literature 
will illustrate the wide variety of meteorological 
indices purportedly linked to various 
human behaviors. 

Dexter (1904), for example, found very low 
barometric pressure, excessive humidity, and 
abnormal winds to be related to poor student 
deportment and increased teacher use of corporal 
punishment. Deteriorating or stormy 
weather has also been related to reduced task 
performance by workers (Muecher & Unge- 
heuer, 1961). Weather has also been impli- 
cated in psychological disturbances; Mills 
(1934) and Digon and Bock (1966) found a 
relationship between low barometric pressure, 
which usually characterizes stormy days, and 
suicide, whereas Lester (1970) obtained a 
relationship between the amount of snow dur- 
ing the winter months and the frequency of 
suicide. 

The sky conditions may also affect mood or 
psychological well-being. Winslow and Her- 
rington (1936) found a correlation of .78 be- 
tween the amount of sunshine apparent on a 
given day and judgments of atmospheric 
“pleasantness” (associations were also found 
with increased barometric pressure and de- 
creased humidity). Finally, Persinger (1975) 
observed that low moods in his 10 subjects 
were associated with fewer hours of sunshine 
and high humidity. 

Although the foregoing studies suggested 
that weather may influence emotion and behavior, 
straightforward conclusions are difficult. 
In no study have all of the implicated 
weather variables been considered simulta- 
neously to determine the relative strength of 
their relationships with behavior. In addition, 
a number of the studies suffered from the 
methodological weakness of using indices 
based on the average level of a given weather 
variable, and the frequency of a certain be- 
havior within relatively wide time blocks, 
such as a day, week, or month, to establish a 
relationship (e.g., Banziger & Owens, 1978). 
But since a day or week in which tempera- 
tures varied from 30 °F to 70 °F (—1 °C to 
21 °C) produces a different experience from 
a stable 50 °F (10 °C) day, such summary 
statistics may be highly misleading. Further, 
if weather and behavior are not sampled con- 
currently, it is difficult to eliminate the argu- 
ment that purported weather effects are due 
to some third variable or by-product of the 
weather (e.g., mobility restriction). 
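The point about summary statistics can be made concrete with a minimal sketch. The readings below are hypothetical, chosen only to mirror the 30 °F to 70 °F example in the text; they are not data from any of the cited studies.

```python
from statistics import mean


def summarize(temps_f):
    """Return (mean, range) for a list of temperature readings in °F."""
    return mean(temps_f), max(temps_f) - min(temps_f)


# Hypothetical readings: one period swings from 30 °F to 70 °F,
# the other holds steady at 50 °F.
variable_day = [30, 40, 50, 60, 70]
stable_day = [50, 50, 50, 50, 50]

variable_mean, variable_range = summarize(variable_day)
stable_mean, stable_range = summarize(stable_day)

# Both periods share the same mean, yet only one of them varied at all.
print(variable_mean, variable_range)
print(stable_mean, stable_range)
```

An index built on the mean alone would treat the two periods as identical, which is the weakness the paragraph describes.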

The use of experimentally controlled en- 
vironments would eliminate these objections. 
Griffitt and Veitch (1971) found that under 
conditions of manipulated high temperature, 
participants’ evaluation of both other people 
and the experimental environment was sig- 
nificantly more negative than under condi- 
tions of comfortable temperature. Correspond- 
ing changes in the participant self-ratings of 
affect suggest that the temperature effects 
probably were mediated by alterations in the 
individual’s mood. Subsequent work has indi- 
cated that there may be an upper limit to the 
temperature-aggression relationship; too 
much heat may actually inhibit aggression 
(Baron & Bell, 1976). 

Yet although temperature can be easily 
manipulated, controlling humidity, barometric 
pressure, wind velocity, and amount of sun- 
shine in a factorial design seems unfeasible. 
A compromise would be to adopt a quasi-ex- 
perimental design (Campbell & Stanley, 1963) 
in which uncontrolled variations in the 
weather are systematically observed and re- 
lated to changes in a carefully measured de- 
pendent variable using randomly selected sub- 
jects. 

Helping or altruism is a social behavior that 
has been found to vary as a function of the 
mood or emotional state of the subject (Cunningham, 
Steinberg, & Grev, in press). If the 
weather affects individuals’ emotional state or 
sense of well-being, then it might be expected 
to affect helping, and two studies tentatively 
suggested such a relationship. Lockard, McDonald, 
Clifford, and Martinez (1976) reported 
that panhandlers were more successful 
in spring than in autumn, and Cialdini, Vincent, 
Lewis, Catalan, Wheeler, and Darby 
(1975) footnoted a similar observation. 
Neither study, however, connected a specific 
weather variable to an increase in helping. 

The present series of studies was designed 
to assess the impact of a variety of weather, 
climatic, and air quality indicators on helping 
using two different indices. The first study, 
concerned with helping an interviewer, was 
initially conducted outdoors during the late 
spring and summer and was subsequently replicated 
during the winter. The second study 
involved restaurant tipping, and data were 
collected indoors during the early spring. 

Predictions for the effects of weather on 
both forms of helping were drawn from the 
results of the earlier research. Since behavior 
seemed to be adversely affected by low barometric 
pressure and high humidity, negative 
correlations were predicted for those variables 
with interpersonal helping. Although previous 
research has found only that mood was adversely 
affected by high temperature, it is 
reasonable to imagine that mood may be depressed 
by the cold as well. An inverted 
U-shaped relationship was thus predicted between 
temperature and helping, with the most helping 
expected at moderate temperatures. Finally, 
since mood seemed to be affected by 
sunlight, a positive correlation was predicted 
between helping and the amount of sunshine 
reaching the earth due to the absence of cloud 
cover. 


Experiment 1 


Method 


Participants. Data were gathered from 
540 participants at two off-campus locations in the 
city of Minneapolis and two locations on the 
campus of the University of Minnesota. Every 
individual over the apparent age of 10 [...] the 
experimenter was approached, and the age 
and sex of the subject were recorded. Fifteen participants 
were approached each day during the afternoon 
on 36 weekdays randomly selected during the 
spring and summer of 1974 and the winter and early 
spring of 1974-1975. An average of 4 people each 
day refused to stop for the study and were not included 
in the primary analyses. Temperatures ranged 
from −18° C to 38° C. Data were not collected during 
any periods of precipitation. 

Procedure. One male and one female were employed 
as experimenters each season. One female was 
used in winter and another in summer. One experimenter 
was blind to the purposes of the experiment 
each season. The measure of helpfulness was obtained 
by approaching passersby with the statement: 


Hi. I’m from the sociology department of the Uni- 
versity of Minnesota and we’re conducting a survey 
of social opinions. Although the survey is 80 
questions long, you don’t have to answer all the 
questions. How many questions would you be will- 
ing to answer for me? 


The number of questions the participant was willing 
to answer was employed as an interval measure of 
helping. Participants were then debriefed following 
their response. 

Weather readings taken at the beginning of the 
specific hour in which the participants were approached 
were employed as predictor variables. Separate 
estimates were obtained of the amount of sunlight 
reaching the earth, atmospheric temperature, 
barometric pressure, relative humidity, wind velocity, 
and lunar phase. To determine if changing weather 
had an effect, the increase or decrease in temperature 
and barometric pressure over the last hour and last 
3 hours prior to the behavioral measurement was 
calculated. Three measures of air pollution were also 
included: carbon monoxide, sulfur dioxide levels, and 
Apex, a weighted composite of CO, SO2, and particulate 
levels. Weather readings were obtained from 
the National Weather Service, Minneapolis-St. Paul 
Forecast Office, with the exceptions of sunshine level, 
which was obtained by incident light reading using 
a Gossen Luna-Pro light meter, and lunar phase, 
obtained from an almanac. 
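The change scores described above (increase or decrease over the last hour and the last 3 hours) might be derived from hourly readings roughly as follows. This is a sketch only: the function name, the dictionary layout, and the readings are hypothetical and do not come from the study.

```python
def change_over(readings, hour, lag):
    """Change in a reading between `hour - lag` and `hour`.

    `readings` maps an hour of the day (0-23) to the value recorded
    at the beginning of that hour (an assumed data layout).
    """
    return readings[hour] - readings[hour - lag]


# Hypothetical hourly temperatures in °C for one afternoon.
hourly_temp_c = {11: 14.0, 12: 16.5, 13: 18.0, 14: 17.0}

# Participants approached during the 14:00 hour would be assigned:
delta_1h = change_over(hourly_temp_c, 14, 1)  # change over the last hour
delta_3h = change_over(hourly_temp_c, 14, 3)  # change over the last 3 hours
print(delta_1h, delta_3h)
```

A negative value marks a falling reading and a positive value a rising one, so the same helper would serve for barometric pressure.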


Results 


Preliminary analysis revealed no association 
between the age and sex of the subjects or 
station and the weather variables. This independence 
[...] subjects, and the well-trafficked public places 
used for sampling, supported the assumption 
of random assignment of subject to weather 
conditions and the employment of each subject 
as an independent case. There was also 
no association between experimenters and 
helping. 
A general regression equation was computed 
for the summer, the winter, and the combined 
data, using the 13 weather variables as predictors 
of helping. Each of the three regression 
equations (for the summer, F(13, 256) 
= 7.41, p < .001; winter, F(13, 256) = 7.47, 
p < .001; and combined sample, F(13, 526) 
= 8.99, p < .001) was highly significant and, 
analogous to a significant overall F test in 
analysis of variance (ANOVA), justified examination 
of the individual correlations. The 
Pearson correlations between the weather variables 
and helping are presented in Table 1. 
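For readers who want to see what an individual Pearson correlation and its test statistic involve, a minimal sketch follows. It assumes the standard textbook formulas; the notation r(538) in the text corresponds to n − 2 degrees of freedom for n = 540. The function names are illustrative and are not taken from the source.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Illustration: a correlation of .36 in a sample of 540 cases.
t_value = t_for_r(0.36, 540)
print(round(t_value, 2))
```

With several hundred cases even modest correlations produce large t values, which is why the overall F test is used as a gate before the individual coefficients are interpreted.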

The most significant variable in this study, 
across both summer and winter, was sunshine. 
People were likely to be more helpful when 
the sky was clear and a large amount of sun- 
shine was striking the earth, compared to 
when the sky was cloudy and less sunlight 
reached ground level. The positive association 
of sunshine with helping can apparently not 
be attributed to temperature, since the corre- 
lation between the two variables was moder- 
ate, r(538) = .29, p < .001, and in the partial 
correlation analysis described below, both variables 
contributed independently to the prediction 
of helping. 

Temperature was also a significant predictor 
of helping, although the nature of the relationship 
varied from summer to winter. Consistent 
with predictions, temperature was 
negatively associated with helping in the summer 
and positively related to helping in the 
winter. To determine if the relationship of 
temperature with helping was nonlinear, a 
curvilinear equation was constructed. Temperature 
scores were divided into seven equal 
intervals and were weighted to produce a 
curve with an apex centering about 19 °C 
(65 °F), consistent with the experimentally 
obtained optimal outdoor temperature for 
clothed active individuals (Yaglow & Miller, 
1925). The correlation of the curvilinear 
equation with helping was positive, r(538) = 
.30, p < .001, and was significantly better 
than the linear relationship of temperature 
with helping, z = 2.11, p < .02. 
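One way such a curvilinear index could be built is sketched below. The text does not give the actual weights, so the symmetric linear weighting here is an assumption, as are the function name and the use of the reported temperature range (−18 °C to 38 °C) as the span that is cut into seven intervals.

```python
def curvilinear_index(temp_c, low=-18.0, high=38.0, n_bins=7, apex_c=19.0):
    """Score a temperature so that scores peak in the interval holding apex_c.

    Assumed scheme: cut [low, high] into n_bins equal intervals and give
    each interval a weight that falls by 1 for every interval of distance
    from the one containing the apex.
    """
    width = (high - low) / n_bins

    def bin_of(t):
        idx = int((t - low) // width)
        return min(max(idx, 0), n_bins - 1)  # clamp the end points

    apex_bin = bin_of(apex_c)
    return (n_bins - 1) - abs(bin_of(temp_c) - apex_bin)


for t in (-18, 0, 19, 30, 38):
    print(t, curvilinear_index(t))
```

Under this scheme both very cold and very hot readings receive low scores, so a positive correlation between the index and helping corresponds to the predicted inverted U.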

Further, both sunshine and the curvilinear 
temperature index were employed in a general 
regression equation to predict helping. Both 
sunshine and curvilinear temperature were 
significant, F(2, 537) = 55.82 and 31.11, respectively, 
p < .001, with sunshine accounting 
for 13% and curvilinear temperature accounting 
for 5% of the total variance in helping. 


Table 1 
Correlations of Weather and Subject Variables With Amount of Help Offered an Interviewer 

Item                                     Summer        Winter        Combined 
Sunshine                                 .32***        .40***        .36*** 
Temperature                              −.16*         [illegible]   .11** 
Temperature increase (1 hr.)             −.19**        −.09          −.13** 
Temperature increase (3 hrs.)            −.05          −.01          −.03 
Barometric pressure                      .03           .05           .04 
Barometric pressure increase (1 hr.)     [illegible]   −.11*         .07 
Barometric pressure increase (3 hrs.)    .23***        −.02          .09 
Relative humidity                        −.19**        −.22***       −.20*** 
Wind velocity                            .20**         −.15*         .01 
Air pollution index (Apex)               [illegible]   −.12*         −.02 
Sulfur dioxide                           .02           −.26***       −.15* 
Carbon monoxide                          .12*          −.16*         −.03 
Lunar phase a                            −.15*         −.16*         −.15* 
Age                                      −.10*         .01           .04 
Sex b                                    .03           .16*          .10 

Note. Summer and winter ns = 270. 
a Large value = full moon. b Large value = female. 
* p < .05. ** p < .01. *** p < .005. 

Another predicted significant association 
found in this study was a negative correlation 
between relative humidity and helping. Peo- 
ple were more likely to be helpful when the 
humidity was low than when the humidity 
was high. 

As a converse to temperature, wind velocity 
was positively correlated with helping in the 
summer but negatively correlated with help- 
ing in the winter. This is reasonable, since a 
cooling breeze provides a desirable relief from 
summer heat but is an undesirable contributor 
to wind chill in the winter months. 

Unexpectedly, a small negative relationship 
was found between lunar phase and helping. 
In both summer and winter, people seemed to 
be less helpful when the moon was full than 
when the moon was less full. Further analyses 
reported below, however, raise questions con- 
cerning this relationship. 

A number of other correlations of weather 
variables with helping appear in Table 1 for 
one season but not for the other. Each of 
these variables correlated more highly with 
another weather variable than it did with 
helping, and they must be regarded as unreliable 
until they are replicated.¹ 


¹ A number of derived statistics involving combinations 
of various weather variables were [...] predictors 
of helping. A discomfort index [...] 
the U.S. National Weather Service (Mather, 1974, 
p. 245), which consisted of a weighted [...] 
equation involving temperature and [...], proved 
to be negatively related to helping but with a correlation 
smaller than curvilinear temperature [...] 
separately, r(538) = −.13, p < .05. [...] wind chill 
statistic (Mather, 1974, [...]) [...] 
combine both temperature and [...] 
unrelated to helping, r(538) = .03, [...] because 
of the curvilinear relation of [...] 
helping. Finally, a statistic proposed by [...] 
(1977) that combined temperature, humidity, and 
barometric pressure showed only a moderate correlation 
with helping, r(538) = .14, p < .01. 


Partial correlation analyses were conducted 
to determine the relative independence and 
strength of the significant weather variables, 
including sunshine, temperature, relative humidity, 
wind velocity, and lunar phase. The 
results of these analyses for the summer, 
winter, and combined sample are presented in 
Table 2. 

The partial correlation analysis indicated 
that sunshine, temperature, relative humidity, 
and wind velocity were each significantly related 
to helping in the summer, in the same 
directions as in the zero-order correlations. A 
regression equation using these four predictors 
was highly significant, multiple R = .44, 
F(4, 265) = 16.07, p < .001, and accounted 
for 20% of the variance in helping, with sunshine 
accounting for 10% of the total variance. 
In the winter data, sunshine, temperature, 
and wind velocity were significantly related 
to helping, whereas humidity, although 
in the right direction, was not significant. A 
regression equation with these three significant 
weather variables was again significant, 
multiple R = .49, F(3, 266) = 27.44, p < 
.001, and accounted for 24% of the variance, 
with sunshine accounting for 16% of the total 
variance. Lunar phase was significant in 
neither the summer nor the winter. Further, 
because of the curvilinear relationship of temperature 
and interactive relationship of wind 
velocity with temperature and helping, the 
partial correlation analysis on the combined 
sample revealed only sunshine to be significantly 
associated with helping. 
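The logic of partialing can be illustrated with the standard first-order formula, which removes one control variable from a correlation. This is only a sketch: the analyses reported here controlled for several variables at once, and the input values below are illustrative figures in the range of those discussed in the text, not a reproduction of the study's computation.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Illustrative inputs: x = sunshine, y = helping, z = temperature.
sunshine_helping = 0.36
sunshine_temperature = 0.29
temperature_helping = 0.11

adjusted = partial_r(sunshine_helping, sunshine_temperature, temperature_helping)
print(round(adjusted, 3))
```

When the control variable is only weakly related to the outcome, the partial coefficient stays close to the zero-order value, which matches the pattern described for sunshine.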


Discussion 


This study found significant associations 
between helping behavior and a number of 
weather variables. In both zero-order and partial 
correlation analysis, significant associations 
were found between sunshine and helping 
in both seasons, such that helping was 
greater on bright sunny days compared to 
cloudy days. In both sets of analyses, helping 
was greater during periods of cooler temperature 
and higher wind velocity in the summer, 
and warmer temperature and lower wind 
velocity in the winter. Helping was also 
greater when the relative humidity was lower, 
although in the partial correlation analyses 
this association was significant only for the 
summer data. 

The association of lunar phase with helping 
was ambiguous. While lunar phase was negatively 
correlated with helping in the zero-order 
correlations, no association of lunar 
phase and helping was found in the partial 
correlations. Since lunar phase was negatively 
associated with other predictors of helping 
(for sunshine and lunar phase, 
r = −.29, p < .001), the original correlations 
may have been spurious. 


Table 2 
Partial Correlations of Selected Weather Variables With Helping 

Item                 Summer                      Winter        Combined 
Sunshine             .20 [marker illegible]      [illegible]   [illegible] 
Temperature          −.26***                     .26***        −.002 
Relative humidity    −.19***                     −.07          −.05 
Wind velocity        [illegible]                 −.12*         .02 
Lunar phase a        −.02                        .03           −.05 

Note. Summer and winter ns = 270. 
a Large value = full moon. 
* p < .05. ** p < .01. *** p < .001. 


The association with helping of some of the 
weather variables, such as wind velocity, temperature, 
and humidity, might be attributable 
to the comfort or discomfort that the individuals 
anticipate experiencing while helping. 
Being asked to answer some questions when 
the temperature is low, for example, entails 
spending additional time in the cold, and that 
added cost factor could reduce helping. 

The comfort interpretation seems less effec- 
tive as an explanation of the sunshine rela- 
tionship, however. During Minnesota winters 
the temperature is often cold while the sky is 
bright and clear. Considering just those cases 
where the temperature was below freezing (0 
°C) and physical comfort was presumably 
low, sunshine was still associated with a 
greater likelihood of helping, r(177) = .23, 
p < .01. 

The role of comfort in the sunshine-helping 
relationship might also be examined using the 
data on the rate at which individuals refused 
to stop and listen to the interviewer solicit 
participation. Although this is an ambiguous 
test of helping because people did not hear all 
of the request, refusals at least were not based 
on the length of time outdoors required for 
helping. Nevertheless, fewer people refused to 
stop, and thus more were helpful, on sunny 
days than on cloudy days, r(122) = —.32, 
p < .01. 

A more direct method for investigating the 
role of comfort in the association of sunshine 
and helping would be to use a setting in which 
comfort factors such as temperature, humidity, 
and wind velocity could be controlled, 
whereas sunlight would be free to vary. A dependent 
variable that does not require a time 
commitment would avoid the problem of the 
linkage between helping and remaining in a 
positive or negative environment. A follow-up 
study was thus conducted that examined the 
effects of weather on the tips left by patrons 
of an indoor eating establishment. 


Table 3 
Correlations of Age and Sex With Tipping and Weather 

Item                   Age r          Sex r a 
Tip                    .12*           .29 [marker illegible] 
Sunshine               [illegible]    .24 [marker illegible] 
Temperature            −.54***        −.50*** 
Barometric pressure    [illegible]    [illegible] 
Relative humidity      .20** [one value recovered; column uncertain] 
Wind velocity          [illegible]    [illegible] 
Lunar phase b          .02 [one value recovered; column uncertain] 

Note. N = 130. 
a Larger value = female. b Larger value = full moon. 
* p < .09. ** p < .01. *** p < .002. 


Experiment 2 
Method 


Data were gathered from 130 parties dining during 
the afternoon at a moderately expensive climate-controlled 
medium-sized restaurant at a shopping 
center in a western suburb of Chicago. The study was 
conducted during the afternoon on 13 randomly 
selected days during the work week in April, May, 
and June of 1978, and 10 parties were observed each 
day. The restaurant had windows on two sides, and 
the temperature ranged from 4° C (40° F) to 27° C 
(81° F) outdoors and was at 21° C (70° F) indoors. 
Six waitresses served to record data, and all were 
blind to the purposes of the study. On each of the 
designated days, one waitress was asked to record 
information about the first 10 parties she waited on 
after 1:00 p.m. The waitress recorded the number of 
people in the group, the total amount of the check, 
the amount of the gratuity, the approximate age and 
sex of the person or persons leaving the tip, and 
whether liquor was served. Prior to recording information 
on the customers, the waitress was asked to 
record her own mood on a 5-point scale ranging from 
good mood to bad mood. Waitresses were carefully 
instructed in the use of the self-report scale and were 
asked to be honest, since in a previous study, randomly 
selected participants were found to be less 
than adept in using introspective self-report scales 
(Cunningham, Steinberg, & Grev, in press). For this 
reason and to avoid becoming intrusive, the moods 
of the tippers themselves were not assessed. 

Weather information was obtained from the National 
Weather Service, Chicago Forecast Office, and 
included temperature, barometric pressure, and wind 
velocity. Sunshine level and lunar phase were obtained 
as in Study 1. 


Table 4 
Correlations of Weather and Group Characteristics With Percentage of Check Left As Tip 

Item                   Zero-order        Age and sex 
                       correlation r     partialed out r 
Sunshine               [illegible]       [illegible] 
Temperature            −.14**            −.02 
Barometric pressure    −.05              −.08 
Relative humidity      .19**             .10* 
Wind velocity          .03               −.08 
Lunar phase a          .14**             [illegible] 
Group size             −.06              - 
Liquor served          −.003             - 

Note. N = 130. 
a Large value = full moon. 
* p < .10. ** p < .05. *** p < .001. 


Results 


Since it is customary to leave as a gratuity 
an amount of money proportional to the size 
of the check, the principal dependent variable 
analyzed in this study was the ratio of the 
amount of money left as a tip to the amount of 
the check for the meal. Prior to examining the effect of 
weather on tipping, the relationship of 
weather to the status variables age and sex 
was examined to insure independent sampling.² 
It was found, however, that there was 
an association between weather and the age 
and sex of people in the restaurant on a given 
day. As Table 3 indicates, older people and 
women were more likely to dine out when the 
sunshine was bright, the weather was cool, 
and the humidity, barometric pressure, and 
wind velocity were high. Although these associations 
are interesting in their own right, 
since older people and women also left a more 
substantial tip than the others, an examination 
of the impact of weather on tipping 
necessarily must take into account this self-selection 
factor. 

² To insure that variations in tipping were not due 
to the behavior of individual waitresses, an analysis 
of variance was conducted, using the waitresses 
as independent variables. No effect due to waitress 
was found, F(5, 124) = .025, ns. 

Although a regression equation using just 
the seven weather variables was significant, 
F(6, 123) = 4.24, p < .01, two sets of correlations 
are presented in Table 4, the zero-order 
correlation between each weather variable 
and tipping and second-order partial 
correlations, with the association of age and 
sex removed from the correlations of weather 
and tipping. As Table 4 reveals, despite the 
fact that sunshine was related to an increase 
in the number of older people and females 
dining, sunshine was nevertheless significantly 
related to helping. This correlation was 
smaller than in Experiment 1, which might 
be expected, since participants were indoors 
and were thus partially screened from effects 
of the sun. 

Although inspection of the zero-order correlations 
suggested some reversals of the findings 
of Experiment 1 with respect to temperature 
and humidity, the partial correlations 
allayed such concerns. As might be expected 
in a climate-controlled restaurant, outdoor 
temperature, barometric pressure, and wind 
velocity had no significant direct effect on 
tipping. Group size and whether liquor was 
served were unrelated to tipping, whereas humidity 
was marginally correlated. Apparently 
contrary to Experiment 1, there was a positive 
association between the fullness of the moon 
and the size of the tip. 

Additional partial correlation analyses were 
performed to determine the relationship of 
each weather variable to tipping, with the 
effects of the other weather variables and age 
and sex removed. These analyses are presented 
in Table 5. 


Table 5 
Partial Correlations of Selected Weather Variables With Tipping 

Item                 Partial r 
Sunshine             [illegible] 
Temperature          .03 
Relative humidity    [illegible] 
Lunar phase a        [illegible] 

Note. N = 130. The effects of age and sex have been 
partialed out as well as the other weather variables. 
a Large value = full moon. 
* p < .10. ** p < .03. *** p < .005. 


Once again, sunshine was significantly correlated 
with tipping. Relative humidity was 
also positively correlated with tipping, 
whereas lunar phase showed a marginal positive 
association. A regression analysis using 
sunshine, temperature, relative humidity, and 
lunar phase as predictors of tipping produced 
significant associations of sunshine and relative 
humidity with tipping, F(4, 125) = 4.73, 
p < .01, with sunshine accounting for 4% of 
the total variance. 


To test for a curvilinear relation of tem- 
perature with tipping, a weighted equation 
centered on 19 °C (65 °F) was constructed. 
Curvilinear temperature was positively related 
to tipping, r(128) = .19, p < .02, as a zero- 
order correlation but was insignificant when 
the effects of age and sex were removed, 
r(128) = .09. 

Examination of the relationship of the 
weather variables to the self-reported mood of 
the waitresses, presented in Table 6, provides 
further insight into the nature of the weather 
effects. As Table 6 reveals, both sunshine and 
temperature were significantly related to a 
more positive mood of the waitress. Waitress 
mood (obtained prior to the receipt of tips) 
was not itself a significant predictor of tipping, 
r(128) = .004, suggesting that the effect 
of weather on tipping may have been mediated 
by the mental or emotional state of the 
customers. 


Table 6 
Correlations of Weather With Waitress’s Mood 

Item                   r 
Sunshine               .60* 
Temperature            .15 
Barometric pressure    .32 
Relative humidity      −.36 
Wind velocity          [illegible] 
Lunar phase a          [illegible] 

Note. Larger number is more positive mood. N = 13. 
a Large value = full moon. 
* p < .05. ** p < .01. 


Discussion 


The second study effectively controlled for 
the effect on helping of comfort factors such 
as temperature and wind velocity by investi- 
gating the effect of weather on a helping 
action that did not require the participants to 
remain outdoors for a period of time in order 
to be helpful. Nonetheless, this study repli- 
cated Experiment 1 by finding that the out- 
door sunshine level was significantly related 
both to the gratuity left for the waitress in a 
restaurant and to the waitress’s self-reported 
mood. 

Of course, the fact that sunshine was asso- 
ciated with both customers’ helping and wait- 
resses’ mood does not necessarily mean that 
changes in customers’ mood produced the 
variations in helping. The observed variations 
in both helping and mood as a function of 
sunshine could have involved separate media- 
tional mechanisms. Yet because of the strong 
experimental evidence for an effect of mood 
on helping (Isen & Levin, 1972; Cunningham, 
Steinberg, & Grev, in press) and the lack of a 
plausible alternate mediator for the effect of 
sunshine on helping, mood seems a prime can- 
didate. The present study did not, however, 
provide information on how sunshine might 
affect mood. Explanations based on symbolic 
associations, aesthetics, and biological pro- 
cesses all seem reasonable. 

Sunshine level could influence mood through 
its symbolic connection with pleasant or dis- 
appointing events. Thus sunshine could in- 
crease mood by stimulating thoughts of swim- 
ming, picnics, and other outings, whereas 
cloudy days could be associated with the dis- 
appointment of canceled plans and the annoy- 
ance of rain and snow and could in that way 
alter the individual’s mood. 

Alternatively, sunshine could produce a 
more positive mood by illuminating the en- 
vironment in a more stimulating and pleasing 
manner. Clear sunlight has spectral character- 
istics different from the light on cloudy days 
and with its stronger intensity may enhance 
colors and sharpen detail. The scenery on a 
cloudy day may appear more dull and mono- 
chromatic, by contrast. Maslow and Mintz 
(1956) have demonstrated that the aesthetic 
quality of indoor settings affected mood and 
person perception, and sunshine might similarly 
influence mood primarily through aesthetic 
responses to the environment. 

Sunshine could also influence mood through 
its effect on physiological processes. There are 
indications both from experimental studies on 
light deprivation and clinical studies of blind 
and cataract patients that the level of light 
detected may affect adrenal corticosteroid 
production and other endocrine functions, 
hemoglobin formation, thyroid activity, the 
detoxification capacity of the liver, and the 
overall regulation of circadian and circannual 
biorhythms (Luce, 1970). Solar radiation can 
also increase the atmospheric concentration 
of negative ions, and increased negative ion 
concentration has been linked to increased 
oxidation of serotonin and increased relaxation 
in humans (Krueger & Reed, 1976; 
Randall, 1970). Alternately, since cloud cover 
filters out ultraviolet rays, decreased ultraviolet 
light may retard physiological processes 
such as Vitamin D production (Ott, 1973). 
Yet since no direct measures of blood chemistry 
were obtained in the present studies, 
such interpretations of the effect of sunlight, 
like those based on associations and aesthetics, 
remain speculative. 

The series of studies reported here em-
ployed a quasi-experimental correlational de-
sign, and the lack of total control over the
independent variables produced some am-
biguous results. First, in Experiment 2 older
people and women were more likely to eat in
the restaurant, and thus participate in the
study, on sunnier, cooler, windier, more hu-
mid, higher pressure days. Although it is be-
lieved that statistical control for such
selection factors was adequate, complete ran-
dom assignment to conditions would have
been more desirable.


Nowhere are the weaknesses of the correla-
tional approach more apparent than in the
case of lunar phase effects. In the corre-
lation analysis in Experiment 1, a [...]
association with helping was found [...] in
[...] seasons, but this relationship [...]
the partial correlation analysis. [...]
found a significant positive relationship [...]
tipping, although no association was [...]
with waitresses’ mood. Since a recent review
of the lunar phase literature pertaining to
homicide, suicide, and psychiatric admission
frequency (Campbell & Beets, 1978) indi-
cated substantial inconsistency in findings
across studies and argued that published sig-
nificant associations were due to Type I
errors, the present small but significant corre-
lations are regarded as somewhat of a nui-
sance. As with other variables, such as
sunshine, employed in this study, a full ex-
planation of proposed effects will require mea-
surement of physical, physiological, and
psychological processes influenced by a factor
such as the moon. Given the overabundance of
speculation concerning the nature of lunar
influences, no interpretation of the present
findings will be offered.

Two other apparent inconsistencies should
be noted. In Experiment 1 humidity was
negatively associated with helping, whereas
in Experiment 2 the association was positive.
This discrepancy is reasonable, however, if
one assumes that the higher the outdoor
humidity, the more relief the restaurant
patron experienced when coming into a cli-
mate-controlled environment. Appreciation or
more positive mood following such relief
could have increased tipping.

Another inconsistency concerns the relation-
ship of temperature with the various depen-
dent measures. Temperature showed an in-
verted U-shaped relationship with helping in
Experiment 1 but was positively related to
waitresses’ mood in Experiment 2. Yet the
positive relation with waitresses’ mood is com-
patible with an overall curvilinear relation-
ship, given the more restricted temperature
range of Experiment 2. Temperature was not,
however, clearly linearly or curvilinearly re-
lated to tipping in Experiment 2. This is not
surprising, however, given the climate-con-
trolled environment of the restaurant. Fur-
ther, although waitresses’ mood was recorded
when the waitresses first came in to begin
their shifts, tipping did not occur until the
patrons had been away from the outdoor tem-
perature for some time.

The lack of a substantial relationship of
barometric pressure with helping in either
study was somewhat unexpected in light of
previous research. Yet it is well to bear in
mind Piccardi’s (1962) observation that sim- 
ply ascending to the top of a skyscraper in- 
volves a pressure change equivalent to that 
stemming from a tornado. Barometric pres- 
sures in Experiment 1 ranged from 29.63 to 
30.37 and in Experiment 2 from 29.60 to 
30.27, which is an average range, but since 
data were not gathered during rainy or stormy 
weather, it is possible that some pressure 
effects were missed. 

Also unexpected was the finding that the 
consumption of alcohol was not related to 
tipping, nor was the size of the party, con-
trary to the results of previous research (Free-
man, Walker, Borden, & Latane, 1975). But
since the precise amount of alcohol consumed 
was not recorded, and since it was frequently 
difficult to determine if one person or the 
entire group contributed to the tip, few con- 
clusions can be drawn. 

Although this series of studies has stressed 
the role of weather variables in helping, the 
conclusion should not be drawn that the 
weather is the only or even the major factor 
contributing to altruism. The strongest pre- 
dictor, sunshine level, accounted for only 13% 
of the helping variance when participants 
were outdoors and 4% of the variance when 
people were indoors, leaving a great deal to be 
accounted for by social and individual differ-
ence factors. Yet given its relative neglect in
the literature, future investigations might ex- 
amine the relationship between sunshine and 
other affect-linked behaviors. Alternative indi- 
cators of helpfulness or friendliness, such as 
people’s willingness to start a conversation 
with a stranger or give to charity, might be 
examined, as well as negative affect-related 
behaviors such as reports of crime, marital 
distress, suicide, initial contact with a psychi- 
atric clinic, and constricted nonverbal behav- 
ior (Cunningham, 1977). Given the reported 
friendliness of Californians, it would also be of 
interest to examine the friendliness of various 
cities around the world as a function of the 
amount of sunlight they receive (cf. Feldman, 
1968; Robbins, DeWalt, & Pelto, 1972). And 
the next time experimental subjects are not
performing as predicted, the laboratory psy-
chologist might look outdoors to see what
kind of a day it is.


In an attempt to reconcile past research based on reactance theory that found
convergence of evaluations of alternatives prior to a choice, and research 
based on choice certainty theory that found divergence, it was proposed that 
convergence occurs in overtly expressed evaluations but that divergence occurs 
in private evaluations. According to this interpretation, the freedom to choose 
a nonpreferred alternative is threatened by the overt expression of a pref- 
erence that involves commitment but is not threatened by the private holding 
of a preference. As expected from this interpretation, using the procedure in 
which Linder and Crane and Linder, Wortman, and Brehm found con- 
vergence of overtly expressed evaluations, divergence was found when evalua- 
tions were made privately, that is, supposedly without the experimenter being 
aware of the nature of the evaluations. 


Brehm’s (1966) theory of psychological 
reactance has been applied to evaluations of 
alternatives prior to a choice by Wicklund 
(1968, 1970), who suggested that a pref- 
erence for one alternative over another may 
threaten the freedom to choose the nonpre- 
ferred alternative and to reject the preferred 
alternative. Reactance theory assumes that a
threat to a freedom leads to attempts to re-
assert that freedom. The freedom to choose a
nonpreferred alternative can be reasserted by
increasing the attractiveness of the nonpre-
ferred alternative relative to the preferred
alternative. This change would result in con-
vergence of evaluations of the alternatives.

Department of Psychology, University of
Maryland, College Park, Maryland 20742.


Elaborating this analysis, Linder and Crane 
(1970) hypothesized that ratings of the at- 
tractiveness of alternatives will converge as 
a choice approaches. Linder and Crane had 
female undergraduates rate the desirability of 
two interviewers after reading brief descrip- 
tions of them. Some of the subjects had been 
led to believe that they would choose which 
of the two would interview them about highly 
personal topics following a brief “get- 
acquainted” period with both potential inter- 
viewers. The length of the get-acquainted 
period was varied. Linder and Crane found 
that the difference in the ratings given the
two interviewers was smaller when the choice
was expected after a 3-minute get-acquainted 
period than when the get-acquainted period 
was expected to last 15 minutes or when no 
choice was expected. 

Linder, Wortman, and Brehm (1971) used 
a procedure similar to that of Linder and 
Crane but varied the time to the choice while 
holding the amount of time in the get- 
acquainted period constant. Replicating the 
Linder and Crane effect, they found that the 
difference in the ratings given the two inter-
viewers was smaller when the time to the
choice was 3 minutes than when it was 10
minutes or when no choice was expected.

The studies by Linder et al. provide evi- 
dence of convergence of alternatives prior to 
a choice. However, it is possible that the con- 
vergence occurred not because the freedom to 
choose the nonpreferred alternative was 
threatened by the private holding of a pref- 
erence but rather because it was threatened 
by the overt expression of a preference that 
involved commitment. In the studies by Lin- 
der et al., the ratings of the alternatives were 
not completely private but were made with 
the knowledge that they would be seen by the 
experimenter. This could have made the sub- 
jects feel that expressing a preference for one 
alternative would commit them to choosing 
it, for to do otherwise would leave the experi- 
menter with the impression that they were 
inconsistent or flighty. Such an impression 
should be expected to be stronger, the shorter 
the time from the preference to the choice. 
The assumption that the overt expression 
of a preference that involves commitment 
threatens the person’s freedom to choose the 
nonpreferred alternative, whereas the private 
holding of a preference does not, provides a 
means of reconciling the findings of Linder et 
al. with research by O’Neal and Mills (O'Neal 
& Mills, 1969; Mills & O’Neal, 1971; O'Neal, 
1971) that has provided evidence of diver- 
gence in evaluations of alternatives prior to a 
choice. The research by O’Neal and Mills 
was based on a theory of choice certainty 
(Mills, 1968), which assumes that people 
want to be certain when taking an action 
that it is better than the other alternatives. 
Certainty about a prospective choice can be
increased by increasing the attractiveness of
the preferred alternative relative to the non-
preferred alternative, which would result in
divergence of evaluations of the alternatives. 

O’Neal and Mills (1969) tested the hy- 
pothesis that individuals faced with a choice 
about other persons will demonstrate a greater 
halo effect in their impressions of those per- 
sons. They found that the anticipation of 
making a choice about which women in a set 
of photographs were promiscuous increased 
the intercorrelation of rankings of the women 
on desirable traits such as artistic and sincere. 
Giving one person consistently high rankings
on desirable traits and another [...]
as consistently high or low. In [...]
study, Mills and O’Neal (1971) replicated the
findings of O’Neal and Mills and [...]
they could not be attributed to [...] atten-
tion paid to the choice photographs.

Using a choice about the subject’s [...]
for a group task, O’Neal (1971) showed that
the greater the importance of the anticipated
choice, the greater the magnitude of the halo
effect. When the prospective [...]
definitely determine the subject’s [...]
increased the intercorrelation of [...]
the potential partners on desirable [...]
when there was only a small chance [...]
choice would determine the subject’s [...]
it did not. O’Neal also found evidence that
the influence of an anticipated choice on the
halo effect depends on the presence of unex-
plained arousal. Subjects who were aroused by
means of caffeine and were not [...]
about the source of their arousal showed [...]
effect of anticipated choice on the halo [...]
whereas those who were given [...]
told that it was a stimulant, or [...]
were given a placebo, did not. [...]
behind the prediction of this finding [...]
a lack of certainty about an [...]
choice is typically accompanied [...]
and when a person anticipating [...]
aroused and does not know the source [...]
arousal, he may interpret it as re[...]
a lack of certainty about the choice [...]
may have the effect of increasing [...]
for certainty.

In the studies by O’Neal and Mills [...]
found divergence in the attractiveness [...]
alternatives prior to a choice, the [...]
were not asked to make direct [...]
the alternatives. Although [...]
the choice alternatives in those [...]
given to the experimenter, they [...]
a number of different positive [...]
not, as Wicklund (1974) [...]
a bold, blatant discrimination, [...]
studies by Linder et al. The assumption that
convergence occurs in overtly expressed evalu-
ations of alternatives prior to a choice but
that divergence occurs in private evaluations
is consistent with the previous [...]

EVALUATIONS PRIOR TO A CHOICE


The purpose of the present research was to 
test the interpretation that when evaluations 
of alternatives involve public commitment,
they will converge prior to a choice, but when
evaluations do not involve commitment, they
will diverge prior to a choice. It was assumed
that reactance theory, which predicts con-
vergence in the attractiveness of alternatives
prior to a choice, and choice certainty theory,
which predicts divergence, are both valid but
that predictions of the two theories apply to
different circumstances. The prediction of
convergence from reactance theory was as-
sumed to apply when there is an overt ex-
pression of a preference that involves commit-
ment, and the prediction of divergence from
choice certainty theory was assumed to apply
when evaluations of the alternatives are
private. 



Experiment 1 


The first study was designed to vary the 
anticipation of a choice and whether evalua-
tions of the alternatives were public or pri-
vate, while paralleling the procedure of Linder
et al. as closely as possible. A predecision
condition, similar to their 3-minute predeci-
sion condition, and a no decision condition,
the same as their no decision condition, were
included. The procedure of Linder et al. was
modified slightly so that a private assessment
condition, in which the subjects would think
that their evaluations would not be known by
the experimenter, could be included as well
as a public assessment condition, similar to
the assessment procedure Linder et al. used.
In addition to the same direct assessment of
the desirability of the two interviewers used 
by Linder et al., the subjects were asked to 
indicate how applicable a number of positive 
traits were to the interviewers. 

In the only other change from the pro-
cedure used by Linder et al., the instructions
concerning the get-acquainted period with
the interviewers were omitted. It was
thought that the expectation of meeting the
two interviewers would have the effect of
attenuating changes in the attractiveness of
the interviewers. Research by Walster, Ber-
scheid, and Barclay (1967) has shown that
changes in the attractiveness of an alternative
are attenuated by the expectation that ob-
jective information about that alternative will
be forthcoming. In addition, if information 
that might increase certainty about the choice 
were expected, there would be less need for 
the subjects to increase their certainty by 
changing their evaluations of the alternatives. 
The instructions concerning the get-acquainted 
period were not discussed by Linder et al. in 
relation to the arousal of reactance, and there 
seemed no reason why the amount of react- 
ance should be affected by the omission of 
the get-acquainted period. 


Method 


The subjects were 80 female students in introduc- 
tory psychology who volunteered, knowing that they 
would earn extra credit toward their course grades.
An equal number was randomly assigned to each of
the four experimental conditions: predecision-public
assessment, no decision-public assessment, predeci-
sion-private assessment, no decision-private assess-
ment. Within each of the four conditions, half of the
subjects were run by a female experimenter and half 
by a male experimenter. 

When the subject arrived, she was shown to the 
experimental room and was given the same instruc- 
tions concerning the National Foundation for Opinion 
Research as in the studies by Linder et al. Subjects
in the predecision conditions were told that they 
would be interviewed concerning personal matters 
ranging from dating patterns and social regulations 
to their views on sex and that they would be able to 
choose the interviewer from two graduate students in 
clinical psychology who had both been judged satis- 
factory in terms of technique but who differed in 
their personality characteristics. Subjects in the no 
decision conditions were told that they would not be 
interviewed but were being asked to read written 
descriptions of two interviewers in order to determine 
the effects of the descriptions. All subjects read the 
same personality sketches about the two interviewers, 
Carter and Williams, used by Linder et al. 

Afterwards, in a departure from the procedure used 
by Linder et al., the subject was seated in front of 
an impressive-looking apparatus previously employed 
in studies using the Bogus Pipeline (Jones & Sigall, 
1971), with the explanation that it was a part of a 
new measurement technique about which they would 
like her opinions. The subject was told that when the 
machine was on, it was hooked up to a central in-
formation bank that automatically recorded responses
made by turning a dial. The experimenter showed
the subject how the dial controlled the movement of
a needle on a scale marked from 0 to 30.

Mentioning that he/she would turn on the ma- 
chine, the experimenter flicked the “on” switch and a 
loud clattering noise emanated from the apparatus. 


Table 1
Means of the Absolute Difference in the Ratings for Experiment 1

                   Decision condition
Assessment      No decision    Predecision
Public              10.7            7.9
Private              8.1           15.4

Note. n = 20 per cell.


Quickly turning the switch to “off,” the experimenter 
remarked with exasperation that there seemed to be 
a short circuit somewhere and said that the subject 
could still practice on the machine “to get the feel of 
it” while the problem was being fixed. As if on the 
spur of the moment, the experimenter suggested that 
the subject use her impressions of the two inter-
viewers during the practice session. For subjects in
the predecision conditions, the experimenter added
that the practice rarely took more than 3 minutes,
after which the subject would choose her interviewer.

The subjects in the public assessment conditions 
were asked to indicate their responses to items read 
aloud by turning the dial to the appropriate needle 
reading and reading the number aloud to the experi- 
menter before returning the dial to 0. In clear view 
of the subject, the experimenter wrote down her 
responses.

The subjects in the private assessment conditions 
were asked to indicate their responses to items read
aloud just by turning the dial to the appropriate 
reading and returning it to 0. The experimenter sat 
behind a partition that blocked the subject’s and ex- 
perimenter’s views of one another, supposedly so that 
the experimenter’s presence would not distract the 
subject. From a dial concealed behind the partition 
and connected to the subject’s dial, the experimenter 
recorded the subject’s responses. 

As in the procedure of Linder et al., the subject
indicated how much she agreed with the statement,
“I would like to have my interview with Mr. Carter”
on a scale from 0 (not at all) to 30 (very much) and
rated Williams on the same scale. The subject also
indicated how well different traits applied to each of
the two men on a scale from 0 (extremely unchar-
acteristic) to 30 (extremely characteristic). The traits
were: capable, kind, patient, responsible, sincere, and 
understanding. 

Following the evaluations of the interviewers, the 
subjects in the predecision conditions were asked if 
they had already made a choice about their inter- 
viewer. Two of the subjects in the predecision-public 
assessment condition and one in the predecision-pri- 
vate assessment condition said that they had decided 
previously. The experimenter casually mentioned to 
all subjects that there was something more to the 
study and asked if they had any idea what it might 
be. The response of one person run under the pre-
decision-public assessment condition indicated suspi-
cion about the procedure, and she was not included
as a subject. Finally, the true purpose of the experi-
ment was explained, and each subject promised not
to discuss it with others.


Results


The absolute value of the difference be-
tween the ratings given Carter and Williams
was calculated in the same manner as in the
studies by Linder et al. Means of the absolute
difference in the ratings for the experimental
conditions are presented in Table 1. As can
be seen from Table 1, when the assessment of
the alternatives was public, the results were
similar to those obtained in the studies by
Linder et al.; the absolute difference in the
ratings of Carter and Williams was smaller
in the predecision-public assessment condition
than in the no decision-public assessment con-
dition. When the assessment of the alterna-
tives was private, the results were reversed;
the absolute difference in the ratings was
greater in the predecision-private assessment
condition than in the no decision-private as-
sessment condition.
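
The pattern just described can be read directly off the cell means in Table 1. The following is a minimal sketch, not the authors' analysis: it only restates the four means from Table 1 and computes the predecision-minus-no-decision difference within each assessment mode, plus the difference between those two differences. The function and variable names are illustrative assumptions.

```python
# Cell means of the absolute rating difference, copied from Table 1.
# Keys are (assessment, decision condition).
means = {
    ("public", "no decision"): 10.7,
    ("public", "predecision"): 7.9,
    ("private", "no decision"): 8.1,
    ("private", "predecision"): 15.4,
}


def predecision_effect(assessment):
    """Predecision mean minus no-decision mean for one assessment mode.

    A negative value means the ratings moved closer together (convergence);
    a positive value means they moved apart (divergence).
    """
    return means[(assessment, "predecision")] - means[(assessment, "no decision")]


public_effect = round(predecision_effect("public"), 1)
private_effect = round(predecision_effect("private"), 1)

# Difference between the two simple effects: a descriptive index of the
# Decision x Assessment interaction on the cell means (not an F test).
interaction_contrast = round(private_effect - public_effect, 1)

print(public_effect, private_effect, interaction_contrast)
```

The signs of the two simple effects are opposite, which is the crossover the text describes; whether each difference is reliable is a matter for the analysis of variance and planned comparisons, not for this arithmetic.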
An analysis of variance of the absolute dif-
ference in the ratings revealed a significant
Decision × Assessment interaction, F(1, 72)
= 6.96, p < .01. The main effects for decision
and assessment were not significant, nor were
any of the effects involving sex of the experi-
menter. A planned comparison revealed that
the difference between the predecision-public
assessment condition and the no decision-
public assessment condition was not signif-
icant. A second planned comparison revealed
that the difference between the predecision-
private assessment condition and the no deci-
sion-private assessment condition was signif-
icant, F(1, 72) = 7.80 [...].
An indirect measure of the evaluations of the
two interviewers was computed by taking the
absolute difference between the sum of the
scores given Carter on the six traits and the
sum of the scores given Williams on the six traits.
The results for the indirect measure were in
the same direction as for the direct measure,
that is, the absolute difference was smaller in the
predecision-public assessment condition than
in the no decision-public assessment condition
and was greater in the predecision-private
assessment condition than in the no decision-
private assessment condition. However, the
differences between the conditions were not
significant for the indirect measure.
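
The two dependent measures can be stated compactly. The sketch below assumes the definitions given in the text (a direct rating and six trait ratings per interviewer, each on the 0 to 30 dial); the function names and the example scores are hypothetical and are not data from the study.

```python
# The six traits rated for each interviewer, as listed in the Method section.
TRAITS = ("capable", "kind", "patient", "responsible", "sincere", "understanding")


def check_scale(score):
    """Reject scores outside the 0-30 dial range described in the Method."""
    if not 0 <= score <= 30:
        raise ValueError(f"score {score} is outside the 0-30 scale")
    return score


def direct_measure(rating_a, rating_b):
    """Absolute difference between the two desirability ratings."""
    return abs(check_scale(rating_a) - check_scale(rating_b))


def indirect_measure(traits_a, traits_b):
    """Absolute difference between the summed trait scores of the two interviewers."""
    total_a = sum(check_scale(traits_a[t]) for t in TRAITS)
    total_b = sum(check_scale(traits_b[t]) for t in TRAITS)
    return abs(total_a - total_b)


# Hypothetical responses from one subject (illustration only).
carter_traits = dict(zip(TRAITS, (22, 18, 20, 25, 19, 21)))
williams_traits = dict(zip(TRAITS, (15, 24, 23, 17, 26, 27)))

direct = direct_measure(20, 12)
indirect = indirect_measure(carter_traits, williams_traits)
print(direct, indirect)
```

Because the indirect measure sums six ratings before differencing, large trait-by-trait disagreements can cancel out, as they do in the hypothetical example.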


Discussion

The results of the first study lend support
to the proposed interpretation of the research
based on reactance theory (Linder & Crane,
1970; Linder, Wortman, & Brehm, 1971) that
found evidence of convergence of evaluations
of alternatives prior to a choice and the re-
search based on choice certainty theory
(Mills & O’Neal, 1971; O’Neal, 1971; O’Neal
& Mills, 1969) that found evidence of diver-
gence. The interpretation reconciles the ap-
parently conflicting results of the past re-
search by making the assumption that the
reactance theory prediction of convergence
applies to the overt expression of a preference
that involves commitment, whereas the choice
certainty theory [...]
found convergence of overtly expressed eval-
uations, divergence was found when evalua-
tions were made privately.

Although the results for the public assess-
ment conditions paralleled the results in the
studies by Linder et al., they were not strong
enough to be statistically significant. The only
[...] the procedure of [...]
of the instructions [...]
with both interviewers. As mentioned in the
introduction, it was thought that the expecta-
tion of meeting both interviewers in the get-
acquainted period would make it less likely
that the subjects would change their evalua-
tions of the interviewers. It was assumed that
the instructions about the get-acquainted pe-
riod did not influence the arousal of reactance.
It is possible that the anticipation of getting
more information concerning the alternatives
does affect the amount of reactance created by
the overt expression of a preference prior to a
choice. According to reactance theory, the im-
portance of the freedom that is threatened
determines the amount of reactance created
by the threat. The freedom to choose the non-
preferred alternative and reject the preferred
alternative may be more important when
additional information about both of the al-
ternatives is expected. If so, then the conver-
gence effect should be stronger when the in-
structions concerning the get-acquainted pe-
riod with both interviewers are included.


Experiment 2 


The second study was designed to replicate 
and extend the findings of the first study. The 
variables manipulated were time to the choice 
and whether or not additional information was 
expected, as well as whether evaluations of 
the alternatives were public or private. In 
addition, an attempt was made to replicate 
exactly the procedures of the 10- and 3-min- 
ute predecision conditions of Linder, Wort- 
man, and Brehm (1971). 

One major prediction was that the reac-
tance effect, that is, convergence in the eval-
uations of alternatives as time to choice de-
creases, would occur when the evaluations
are public and additional information is ex-
pected. From the assumption that when addi-
tional information is not anticipated, the im-
portance of the freedom to choose the non-
preferred alternative and reject the preferred
alternative is reduced, it was predicted that 
the convergence effect expected from reac- 
tance would be reduced when additional in- 
formation is not expected. It follows that when 
evaluations of the alternatives are public and 
the choice close, the absolute difference in the 
ratings of the alternatives should be smaller 
when additional information is anticipated 
than when it is not anticipated. 

The other major prediction was that the 
choice certainty effect—that is, divergence in 
the evaluations of alternatives as time to 
choice decreases—would occur when the eval- 
uations are private and additional information 
is not expected. It was assumed that the ex- 
pectation of receiving additional information 
that might increase certainty about the choice 
would decrease the need to increase certainty 
through the reevaluation of alternatives. From 
this assumption it was predicted that the di- 
vergence effect would be reduced when addi- 
tional information was expected. It follows 
that when evaluations are private and the
choice close, the absolute difference in the
ratings of the alternatives should be greater 
when additional information is not anticipated 
than when it is anticipated. 


Method 


The subjects were 100 female undergraduates in 
introductory psychology who volunteered, know-
ing that they would receive extra credit toward their
course grades. Ten subjects were randomly assigned
to each of the eight experimental conditions of a
2 × 2 × 2 factorial design varying time to choice
(10 minutes, 3 minutes), information (expected, not
expected), and assessment (public, private). Ten
subjects were later randomly assigned to each of the
two replication conditions.
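
The layout of the design can be sketched as follows. This is an illustration only, under the assumption that assignment amounts to shuffling subjects into equal-sized cells; the helper names, the subject numbering, and the fixed seed are hypothetical, and the sketch does not reproduce how the experimenter actually drew the assignment list.

```python
import itertools
import random

# Factor levels of the 2 x 2 x 2 design, as named in the Method section.
TIME_TO_CHOICE = ("10 minutes", "3 minutes")
INFORMATION = ("expected", "not expected")
ASSESSMENT = ("public", "private")


def factorial_cells():
    """Enumerate the eight cells of the factorial design."""
    return list(itertools.product(TIME_TO_CHOICE, INFORMATION, ASSESSMENT))


def assign(subject_ids, cells, per_cell, seed=0):
    """Shuffle subjects and deal them into cells of equal size."""
    subject_ids = list(subject_ids)
    if len(subject_ids) != len(cells) * per_cell:
        raise ValueError("number of subjects must equal cells x per_cell")
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    rng.shuffle(subject_ids)
    return {
        cell: subject_ids[i * per_cell:(i + 1) * per_cell]
        for i, cell in enumerate(cells)
    }


cells = factorial_cells()
factorial = assign(range(1, 81), cells, per_cell=10)

# The two replication conditions were filled later with ten subjects each.
replication_cells = [("10 minutes", "replication"), ("3 minutes", "replication")]
replication = assign(range(81, 101), replication_cells, per_cell=10, seed=1)

for cell, members in {**factorial, **replication}.items():
    print(cell, len(members))
```

Eight factorial cells plus two replication cells, at ten subjects each, account for the 100 subjects reported.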

The procedure was the same as in Experiment 1, 
with the following exceptions. All subjects were run 
by a male experimenter, and all were told that they 
would be interviewed and would be able to choose 
their interviewer. In the descriptions of the inter- 
viewers, the name Carter was changed to Caster. 

Half of the subjects in the factorial design were 
told that they would be able to meet both inter- 
viewers in a brief get-acquainted session prior to the 
choice (information expected), whereas the other 
half were not so informed (information not ex- 
pected). Just prior to the ratings of the interviewers, 
subjects in the 10-minute conditions were told that 
one of the interviewers was tied up for the next few 
minutes, whereas subjects in the 3-minute conditions 
were told that the interviewers should be ready mo- 
mentarily. 

In the 10 minute-information expected conditions
the instructions were as follows: 


After the practice trials, when both men are finally 
available, you'll be able to have your chat with 
them. This session will last a couple of minutes, so 
considering the practice, get-acquainted period, and 
all, you'll be making your final choice of inter-
viewer in approximately 10 minutes.


In the 10 minute-information not expected condi-
tions the instructions were:


After the practice trials, I'll get everything ready 
for the interview; that will also take a couple of 
minutes, so considering the practice and all, you'll 
be making your choice of interviewer in approxi- 
mately 10 minutes. 


In the 3 minute-information expected conditions
the instructions were: 


After the practice trials, you’ll be able to have your 
chat with the two interviewers, which will last a 
couple of minutes, so considering the practice and 
the get-acquainted period, you'll be making your 
choice of interviewer in approximately 3 minutes.

In the 3 minute-information not expected condi-
tions the instructions were:


After the practice trials, I'll get everything ready
for the interview; that will also take a couple of
minutes, so considering the practice and all, you’ll
be making your choice of interviewer in approxi-
mately 3 minutes.


After suggesting that the subject use her first im-
pressions of the two interviewers for the practice
session, the experimenter went behind the partition
and determined from a random assignment list
whether the subject's preference ratings were to be
assessed under public or private conditions. The in-
structions for the public and private assessment con-
ditions were the same as in Experiment 1, but in
both conditions the experimenter was behind the par-
tition, and the instructions were given by means of a
tape recording to eliminate any possibility that the 
experimenter could influence the subject’s responses 
by the way he administered the dependent measures. 
The order of the questions concerning Caster and 
Williams was counterbalanced. 

Subjects in the 10-minute and 3-minute replication 
conditions were given a booklet that was identical 
to the one used by Linder et al. (except that Carter 
was changed to Caster). They expected to receive 
additional information about the choice alternatives, 
and the segment of the procedure dealing with the 
measurement device was omitted. The experimenter 
manipulated the time-to-choice variable in exactly 
the same manner as in the factorial design. Ratings 
of the two interviewers were made on questionnaires 
that were filled out in the presence of the experi-
menter and were returned directly to him.

After the dependent measures had been collected,
subjects in both the factorial design and the replica-
tion conditions were asked if they had already made
a choice about their interviewer. One subject in each
of the four information not expected conditions said
that she had decided previously. The experimenter
casually mentioned to all subjects that there was
something more to the study and asked if they had
any idea what it might be. The responses of four
persons indicated accurate suspicion about the proce-
dure and they were not included as subjects. One
was run under the 3 minute-information [...] ex-
pected-public assessment condition, one under the
10 minute-information expected-private assessment
condition, one under the 3 minute-information not
expected-private assessment condition, and one un-
der the 10-minute replication condition.

Finally, the true purpose of the experiment was
explained, and each subject promised not to discuss
it with others.


Results 


The means of the absolute difference in the ratings given
Caster and Williams in the eight experimental conditions of
the factorial design are presented in Table 2. As can be
seen from Table 2, when the assessment of the alternatives
was public, the results were similar to those of Linder
et al. and the public assessment conditions of
Experiment 1; the absolute difference in the ratings was
smaller in the 3-minute condition than in the 10-minute
condition when additional information was expected and also
when it was not expected. Similar results were obtained in
the replication conditions; the mean for the 3-minute
replication condition was 7.2 and the mean for the
10-minute replication condition was 8.4.

When the assessment of the alternatives 
was private and additional information was 
not expected, the results paralleled those of 
the private assessment conditions of Experi- 
ment 1; the absolute difference in the ratings 
was greater in the 3 minute — information not 
expected condition than in the 10 minute — 
information not expected condition. When 
additional information was expected, there 
was virtually no difference between the 3- 
minute and 10-minute conditions. 
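
The direction of these time-to-choice effects can be checked against the cell means reported in Table 2. The following is a minimal sketch in Python, not part of the original analysis; the names `TABLE_2`, `time_effect`, `describe`, and `summarize` are illustrative assumptions, and the one cell whose value is illegible in the available copy of the table is entered as `None` rather than guessed.

```python
# Cell means of the absolute difference in ratings, transcribed from Table 2.
# Keys are (assessment, information); values hold the 10-minute and 3-minute means.
# None marks the single cell that is illegible in the source copy of the table.
TABLE_2 = {
    ("public", "expected"): {"10 min": 7.1, "3 min": 5.4},
    ("public", "not expected"): {"10 min": 9.5, "3 min": 6.6},
    ("private", "expected"): {"10 min": 7.0, "3 min": None},
    ("private", "not expected"): {"10 min": 10.0, "3 min": 14.9},
}


def time_effect(cell):
    """Return the 3-minute mean minus the 10-minute mean, or None if a mean is missing."""
    if cell["3 min"] is None or cell["10 min"] is None:
        return None
    return round(cell["3 min"] - cell["10 min"], 1)


def describe(effect):
    """Label a time-to-choice effect as convergence, divergence, no change, or unknown."""
    if effect is None:
        return "unknown"
    if effect < 0:
        return "convergence"  # ratings move closer together as the choice nears
    if effect > 0:
        return "divergence"   # ratings move further apart as the choice nears
    return "no change"


def summarize(table=TABLE_2):
    """Map each condition to its (effect, label) pair."""
    return {cond: (time_effect(cell), describe(time_effect(cell)))
            for cond, cell in table.items()}


if __name__ == "__main__":
    for condition, (effect, label) in summarize().items():
        print(condition, effect, label)
```

The sign of each difference only restates the descriptive pattern; whether a difference is reliable is a matter for the planned comparisons reported in the text.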


In the analysis of variance, the following effects were
significant: the main effect of assessment, F(1, 92) = 9.11,
p < .05; the main effect of information, F(1, 92) = 9.78,
p < .01; the interaction between time to choice and
assessment, F(1, 92) = 4.35, p < .05.¹ The major
predictions were tested with the use of planned comparisons.

The major prediction from reactance theory was that the
absolute difference in the ratings of the alternatives
would be less in the 3 minute — information expected —
public assessment condition than in the 10 minute —
information expected — public assessment condition. This
comparison was not significant. The comparison between the
3 minute — information not expected — public assessment
condition and the 10 minute — information not expected —
public assessment condition was also not significant. The
two 3 minute — public assessment conditions were contrasted
with the two 10 minute — public assessment conditions, and
the contrast was not significant.

The 3-minute replication condition and the 10-minute
replication condition were not significantly different. The
3-minute replication condition and the two 3-minute public
assessment conditions were contrasted with the 10-minute
replication condition and the two 10-minute public
assessment conditions. This contrast was also not
significant. In addition, the contrast between the
3 minute — information expected — public assessment
condition and the 3 minute — information not expected —
public assessment condition was not significant.


Table 2
Means of the Absolute Difference in the
Ratings for Experiment 2

                                Time to choice
Assessment   Information      10 min.     3 min.
Public       Expected           7.1         5.4
             Not expected       9.5         6.6
Private      Expected           7.0     [illegible]
             Not expected      10.0        14.9

Note. n = 10 per cell.

The major prediction from choice certainty theory was that
the absolute difference in the ratings would be greater in
the 3 minute — information not expected — private
assessment condition than in the 10 minute — information
not expected — private assessment condition. This
comparison was significant, F(1, 92) = 4.53, p < .05. As
expected, the absolute difference in the 3 minute —
information not expected — private assessment condition was
significantly greater than that in the 3 minute —
information expected — private assessment condition,
F(1, 92) = 11.47, p < .01.


¹ The error term for the analysis of variance and the
planned comparisons was derived by pooling the estimate of
error from the factorial and replication cells. Exclusion
of the replication cells from the error term would not
affect the significance levels of any of the comparisons.
There would not be any change in the significance levels
with exclusion of the four subjects who said they had
already made a choice.


Discussion


The results of the second study fortify the proposed
interpretation, which reconciles the apparently conflicting
results of the past research (Linder & Crane, 1970; Linder,
Wortman, & Brehm, 1971; O’Neal & Mills, 1969; Mills &
O’Neal, 1971; O’Neal, 1971) by making the assumption that
the reactance theory prediction of convergence of
alternatives prior to a choice applies to the overt
expression of a preference that involves commitment,
whereas the choice certainty theory prediction of
divergence applies when evaluations are private. As
expected from the proposed interpretation, when preferences
were privately assessed and additional information was not
expected, the absolute difference in the ratings of the
alternatives was greater, the closer the choice. The
replication of the divergence effect found in the private
assessment conditions of Experiment 1 lends substantial
support to part of the proposed interpretation based on
choice certainty theory.

That divergence of private evaluations did 
not occur when additional information was 
expected is consistent with the choice cer- 
tainty theory analysis. If additional informa- 
tion is anticipated, there should be little rea- 
son to reevaluate the choice alternatives, since 
the forthcoming information may be sufficient 
to achieve certainty about which alternative 
is best. As expected from this reasoning, when 
evaluations were private and the choice close, 
the absolute difference in the ratings was 
greater when additional information was not 
expected than when it was expected. 

The divergence effect that was found in the 
present research is reminiscent of the diver- 
gence that occurs in evaluations of alterna- 
tives after a decision (Brehm, 1956). How- 
ever, there is no basis for regarding the 
divergence in the present research as post- 
decisional. It occurred in the private assess- 
ment conditions but not in the public assess- 
ment conditions, where there was just as 
much reason, if not more, for subjects already 
to have made a choice. Also, very few of the 
subjects in any of the conditions of either Ex- 
periment 1 or Experiment 2 said that they 
had already made a choice. 

As in Experiment 1, the results for the public assessment
conditions were in the direction of the convergence effect
predicted from reactance theory but were not strong enough
to be statistically significant. The same was true for the
results for the replication conditions, which duplicated as
closely as possible the 3-minute and 10-minute conditions
of Linder, Wortman, and Brehm (1971). It is not clear why
the convergence effect was not stronger in the present
research. The possibility that the lack of a stronger
convergence effect in Experiment 1 was due to the omission
of the get-acquainted session with both interviewers, which
might have reduced the importance of the freedom to choose
the nonpreferred alternative, was not borne out by the
results of Experiment 2. The tendency for the alternatives
to converge in the public assessment conditions was no
stronger when the get-acquainted session was included
(information expected conditions) than when it was omitted
(information not expected conditions).

The fact that the data from the public assessment
conditions and the replication conditions were consistently
in the predicted direction, together with the significant
differences found previously by Linder et al., makes it
reasonable to conclude that the convergence effect
predicted from reactance theory is reliable. Since it
occurs when preferences are public but not when they are
private, it can be concluded that the freedom to choose the
nonpreferred alternative and to reject the preferred
alternative is threatened by the overt expression of a
preference that involves commitment but is not threatened
by the private holding of a preference.


Consensus Information, Prediction, and Causal Attribution:
A Review of the Literature and Issues 


Saul M. Kassin 


Purdue University 


The present review of the literature suggests that it is useful to distinguish 
between two types of consensus information: normative expectancies (e.g.,
Jones & McGillis’ prior probability concept) and explicit base rates (e.g.,
Kelley’s conception of observed covariation across actors). Normative
expectancies, which may be derived from a knowledge of one’s own behavior (i.e., the
false-consensus effect) or the behavior of others, provide one basis for predic- 
tion and causal inference. Explicit, sample-based consensus may also be em- 
ployed, but under somewhat restrictive conditions: (a) when prior expectations 
are neutralized and/or (b) when the consensus manipulation is particularly 
strong, salient, easily translatable, representative of the criterial population, 
and causally relevant. A number of additional issues are reviewed (e.g., the
cognitive strategies by which observers reject base rates), and recommenda- 
tions for the direction of future research are made. 


The attribution literature has recently been 
inundated with studies aimed at determining 
whether or not and under what conditions 
naive observers employ consensus informa- 
tion for predicting and explaining behavioral 
events. To date some experiments have shown 
that consensus is a valuable source of infor- 
mation, whereas others have indicated that it 
is not. This paper will attempt to review and 
organize the diverse findings and discuss some 
of the important theoretical issues that have 
been raised. 


Theory 


Attribution theories portray the layperson as a perceiver
who, in order to understand what caused some form of
behavior to occur, actively seeks and utilizes various
kinds of information. One of the more intuitively appealing
of the proposed relationships is the link between consensus
and causal attribution. When an individual behaves
differently from others (low consensus), observers are
likely to view properties of that individual as having had
a causal impact; when an actor behaves in a manner
consistent with the performance of others (high consensus),
observers are apt to infer that some invariant feature or
object in the environment elicited the common reaction.

Heider (1958) illustrated this hypothesis when he suggested
that consensus influences attributions about the nature of
a percept, the origin of motivation and pleasure, and the
causal locus of success and failure. Later attribution
theories readily adopted the proposed role of consensus in
the inference process. For Kelley (1967), consensus,
defined as observed covariation across actors (do other
actors in the situation behave similarly?), is one of three
criteria for establishing the validity of an environmental
attribution, the others being distinctiveness (does the
actor's behavior occur only in the presence of the
particular entity?) and consistency (does the actor behave
similarly toward the entity at different times?). For
temporally consistent behaviors, Kelley (1967, 1973)
proposed that those high in consensus (and/or
distinctiveness) are causally attributed to the entity,
whereas those low in consensus (and/or distinctiveness) are
attributed to the actor. Jones and Davis (1965) conceived
of consensus as a prior probability variable: Based on a
tacit knowledge of norms, social constraints, and values,
observers compare an actor’s behavior to the expected
behavior of others. As such, Jones and Davis predicted that
behaviors that are low in their “assumed desirability” are
more informative of an actor’s attributes than those known
to be high in their attractiveness or desirability.

In short, theorists have distinguished between two kinds of
consensus information: observers’ beliefs about what other
people would do if they were present and observers’
firsthand knowledge of what others in the situation
actually do. The former case exemplifies Jones and Davis’s
(1965) prior-probability concept and will be referred to as
the perceiver’s normative expectancy or “implicit
consensus.” The latter case corresponds to Kelley’s (1967)
definition of observed covariation and will be referred to
as sample-based or explicit-consensus information.

Observers have thus been portrayed as information
processors who are “characteristically normative and
nomothetic” (Jones et al., 1972, p. 85) and are acutely
sensitive to variations in both assumed and observed
consensus. Other subsequent formulations (Weiner, 1974) and
theoretical integrations (Ajzen & Fishbein, 1975; Anderson,
1974; Jones & McGillis, 1976) have accepted the role of
consensus in the attribution process. Moreover, consensus
manipulations have been successfully applied by researchers
to affect perceptions of task difficulty (Weiner, Frieze,
Kukla, Reed, Rest, & Rosenbaum, 1971) and attributions for
emotional disorders (Valins & Nisbett, 1971).


Research: An Overview 
The Normative Expectancy 


Prior to observing the target actor, perceivers, as members
of a social structure with often clearly defined behavioral
norms, expect to see certain behaviors performed more
frequently than others. These crude (i.e., generalized
across classes of actors and situations) expectations serve
as an initial basis for predicting the frequency of a given
act in the population and the behavior potential of a
particular individual. For drawing causal inferences, an
unexpected act is conceptually equivalent to a low
consensus behavior, whereas an expected event is equivalent
to one of high consensus.

Generalized behavioral expectancies. Nor- 
mative expectancies may be derived first 
from a knowledge of one’s own behavior. 
Heider (1958) often referred to actors’ ego- 
centric tendency to use their own behavior as 
a normative standard. Specifically, he (and 
recently Ross, 1977) delineated a two-stage 
process. First, individuals tend to see their 
own behaviors as environmentally caused and 
hence consensual. After all, I choose Brand X 
over Brand Y because Brand X is better; 
since Brand X is better, most people (would) 
also choose it. In the second stage, observers 
who know their own behavior interpret an- 
other actor’s discrepant performance as low in 
consensus and attributable to personal causes. 

The effects of self-generated consensus have 
been amply demonstrated. Hansen and Dono- 
ghue (1977) had subjects sample a beverage 
and then watch a confederate who drank 
either a similar amount or more. In addition 
to inferring population performance from 
their own behavior, subjects attributed the 
other actor’s similar performance to the bev- 
erage and dissimilar behavior to the actor. 
Ross et al. (1977) conducted an analogous 
questionnaire study in which subjects read 
brief stories of events culminating in a be- 
havioral choice. As it turned out, subjects’ 
own stated choices were related to their esti- 
mates of how others would behave. In fact, 
subjects made more confident and extreme 
trait ratings to characterize the typical person 
who would elect the alternative option than 
to describe the typical person making the 
same behavioral choice.¹

¹ Note that these demonstrations of the false-consensus
effect are confounded with subject self-selection.

Certainly an individual need not actually
observe his or her own behavior in order to
draw normative inferences. Implicit consensus
may be generated through a simple knowledge
of the population base rates for certain be- 
haviors. In some studies, behaviors that were 
low in their attractiveness or assumed desira- 
bility induced dispositional inferences about 
the target actors (Ajzen, 1971; Jones, Davis, 
& Gergen, 1961). In view of the strong cor- 
respondence between assumed desirability and 
perceived probability of occurrence (Ajzen, 
1971; Eisen, Note 1), these studies indi- 
rectly support the normative expectancy — 
causal attribution hypothesis. In other studies, 
Lay, Burron, and Jackson (1973) and Kassin 
and Lowe (in press) actually pretested the 
base rates and perceived base rates for stimu- 
lus behaviors, respectively. Lay et al. (1973), 
for example, found that subjects made more 
confident inferences about the traits of a per- 
son who supposedly endorsed a low base-rate 
item (e.g., “I joke and talk rather than work 
whenever possible”) than about one who en- 
dorsed a high base-rate item (e.g., “I love to 
tell and listen to jokes and funny stories”). 
Actor-based expectancies. Of course, be- 
havioral expectancies are often refined by an 
observer’s knowledge about characteristics of 
the target actor (e.g., sex, age, occupation). 
Jones and McGillis (1976) described actor- 
based expectancies as follows: “Category 
membership suggests a modal behavior ex- 
pectancy or the presence of one categorizing 
feature (obesity) suggests other correlated 
features (jolliness)” (p. 413). Kahneman and 
Tversky (1973), for example, capitalized on 
popular stereotypes about occupations. In 
the well-known engineer-lawyer study, they 
told one group of subjects that a sample con- 
tained 70% lawyers and 30% engineers, 
whereas another group was presented with a 
sample of 30% lawyers and 70% engineers. 
Subjects then read descriptions of five target 
individuals taken from that group and pre- 
dicted the likelihood that each person was a 
lawyer (engineer). One description read: 


Jack is a 45-year-old man. He is married and has 
four children. He is generally conservative, careful, 
and ambitious. He shows no interest in political 
and social issues and spends most of his free time 
on his many hobbies which include home carpentry, 
sailing, and mathematical puzzles. (p. 241) 


Results indicated that subjects underutilized the 70-30
base rates and relied largely on the personality-stereotypic
cues (e.g., classifying the individual as an engineer if he
shows no interest in political and social issues, enjoys
mathematical puzzles, and so on). Since this study,
observers’ reliance on brief personality sketches for
making category predictions has been repeatedly
demonstrated (e.g., Zuckerman, 1978a).
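
What "underutilizing" a base rate means can be made concrete with Bayes' rule in odds form: posterior odds equal prior odds times the likelihood ratio of the description. The following Python sketch is an illustration only. The function name `posterior_probability` and the likelihood ratio of 4.0 are hypothetical assumptions chosen for the example; they are not values reported in the engineer-lawyer study.

```python
def posterior_probability(prior, likelihood_ratio):
    """Posterior P(engineer | description) under Bayes' rule in odds form.

    prior            -- base rate P(engineer) in the sample, strictly between 0 and 1
    likelihood_ratio -- P(description | engineer) / P(description | lawyer), positive
    """
    if not 0 < prior < 1:
        raise ValueError("prior must lie strictly between 0 and 1")
    if likelihood_ratio <= 0:
        raise ValueError("likelihood_ratio must be positive")
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical diagnosticity of a sketch such as the "Jack" description:
# assumed to be four times as likely for an engineer as for a lawyer.
ASSUMED_LIKELIHOOD_RATIO = 4.0

# The same description judged against the two base rates used in the study.
low_base_rate = posterior_probability(0.30, ASSUMED_LIKELIHOOD_RATIO)
high_base_rate = posterior_probability(0.70, ASSUMED_LIKELIHOOD_RATIO)

if __name__ == "__main__":
    print(f"30% engineers: {low_base_rate:.3f}")
    print(f"70% engineers: {high_base_rate:.3f}")
```

Under this assumed likelihood ratio, a normative judge would give clearly different answers in the two base-rate groups. Judgments that are nearly identical across groups are what a comparison against a formal probability model counts as suboptimal use of the base rate, even though the same data might or might not differ from a no-base-rate control.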

Predictions about behavior are similarly influenced by
actor-based expectancies. For example, Miller, Gillen,
Schenker, and Radlove (1974) had subjects read about
Milgram’s (1963) obedience study and then guess how
selected stimulus persons responded. They found that the
target person’s sex and attractiveness affected this
measure: subjects predicted that males and unattractive
individuals would administer greater levels of shock than
females and physically attractive participants would.

Situation-based expectancies. Normative expectancies may
also be qualified by situational demands (Herzberger, in
press; Jones et al., 1961; Jones & McGillis, 1976). Lowe
and Kassin (1977) manipulated the implicit consensus for
helping behavior by varying several factors of the
described situation (e.g., the personality and appearance
of the requester). For subjects who were not presented with
the base rates for helping, population estimates were as
anticipated: 67% (high implicit consensus), 44% (moderate
implicit consensus), and 37% (low implicit consensus). More
importantly, causal attribution and trait inference
followed from these estimates: subjects [...] more
negatively when helping [...]. Other studies report similar
effects. Miller et al. (1974) found that Milgram's
obedience study elicited clear nonobedience expectations.
Nisbett and Borgida (1975) reported that their subjects
expected helping behavior in the Darley and Latané (1968)
seizure study.

Summary. In sum, it appears that normative expectancies
that are based on self- and other-observation provide an
initial basis for prediction and attribution. To a large
extent, these expectations are actor specific (e.g., “males
are generally more aggressive than females”) and situation
specific (e.g., “people are generally not aggressive in
school”). When information about the actor and/or situation
is unavailable or uninformative, then observers may fall
back on normative expectancies that are more generalized
(e.g., “People are generally [not] aggressive”).


Explicit Consensus 


Explicit consensus refers to the actual be- 
havior of individuals in a sample. Early tests 
of Kelley’s (1967) model supported the con- 
sensus-attribution hypothesis (Frieze & Wei- 
ner, 1971; McArthur, 1972). In the proto- 
typical study, McArthur (1972) presented 
subjects with behavioral descriptions (e.g., 
“John laughs at the comedian”) to which 
high or low consensus, distinctiveness, and 
consistency were appended. Consensus information took the
form “almost everyone (hardly anyone) else laughed at the
comedian” and was utilized as predicted toward person
attribution and especially toward stimulus attribution. In
subsequent experimentation, Orvis, Cunningham, and Kelley
(1975) found that knowledge of consensus was sufficient for
subjects to “figure out” the distinctiveness and
consistency of an event: high and low consensus induced
inferences of high consistency as well as high and low
distinctiveness, respectively.

Finally, two somewhat established empirical patterns are
noteworthy. First, explicit consensus appears to be
particularly informative about the stimulus in a situation
(Orvis et al., 1975; McArthur, 1976; Zuckerman, 1978b).
Subjects given a stimulus attribution for fictitious events
readily infer high consensus (Zuckerman & Mann, in press),
solicit consensus more than any other kind of information
(Garland, Hardy, & Stephenson, 1975), and rank it as the
most important factor governing these decisions (Vestewig,
Note 2). Developmentally, the relationship between high
consensus and stimulus inference precedes the link between
low consensus and person attribution (DiVitto & McArthur,
1978). Second, a growing body of evidence suggests that
perhaps observers make better use of consensus information
than do actors, who tend to rely more heavily on
distinctiveness (Hansen, 1976; Hansen & Lowe, 1976) or on a
knowledge of their own previous behavior (Hansen &
Donoghue, 1977; Hansen & Stonner, 1978).

Despite the findings cited above, certain 
recent developments in the literature have cast 
some doubt as to whether observers are as 
sensitive to consensus as was initially be- 
lieved. Within the attribution area, McArthur 
(1972) noted that although main effects for 
consensus on attributions were obtained, it 
was the least effective of Kelley’s three varia- 
bles. This conclusion about the relative in- 
effectiveness of consensus was extended to 
issues such as developmental trends in the 
use of Kelley’s variables (DiVitto & Mc- 
Arthur, 1978; Dix, Herzberger, & Erlebacher, 
Note 3; Karlovac, Feldman, Higgins, & Ru- 
ble, Note 4; Swann & Collins, Note 5) and 
has been widely cited as evidence for the 
absolute ineffectiveness of consensus. 

In the social judgment literature, analo- 
gous deficiencies in observers’ use of base 
rates toward prediction have been reported 
(e.g., the Kahneman & Tversky, 1973, studies).
On the one hand, the apparent comparability
of results between the attribution (e.g.,
Nisbett & Borgida, 1975) and judgment
(e.g., Kahneman & Tversky, 1973) studies is 
impressive in view of their contrasting re- 
search paradigms (for an excellent discussion, 
see Fischhoff, 1976). On the other hand, one 
must be cognizant of the vastly different con- 
clusions that follow from the fact that re- 
searchers in the two disciplines pose some- 
what different questions. Attribution re- 
searchers make the traditional comparison 
between subjects’ responses and the null hy- 
pothesis (i.e., does consensus influence in- 
ferences, or is it ignored?), whereas the social 
judgment investigators compare subjects’ pre- 
dictions with those implied by a formal 
probability model (i.e., do subjects make
optimal or suboptimal use of the base rate?;
see Borgida, 1978; Wells & Harvey, 1978).

Citing the McArthur (1972) and Kahneman and Tversky (1973)
experiments, Nisbett and his colleagues performed a number
of experiments in which some subjects but not others
received base-rate information. No differences in their
dependent measures emerged, leading the authors to conclude
that perceivers ignore consensus. Because of the importance
of this research, two studies in particular will be
described.

Nisbett and Borgida (1975) presented sub- 
jects with descriptions of two experimental 
situations: the high fear condition of a shock 
tolerance study (Nisbett & Schachter, 1966) 
and the emergency condition of a bystander 
intervention study (Darley & Latané, 1968). 
Consensus-information subjects read fre- 
quency distribution tables indicating that 
most participants in the shock study tolerated 
a large degree of shock and that most par- 
ticipants in the helping study either took a 
long time to help or did not help at all. Subjects
in the no-consensus-information condition did not read
about these results. In one study, subjects
read brief descriptions of participants and 
guessed how these target actors had behaved. 
In a second study, subjects were told that 
the described actors behaved in the extreme, 
high base-rate manner (i.e., tolerated maxi- 
mum shock, did not help). They then rated 
that participant on a number of relevant 
traits and indicated whether that behavior 
was personally or situationally caused. Over- 
all, knowledge of base rates had no effect on 
prediction, attribution, or trait inference. 

In another study, Nisbett et al. (1976) 
failed in three attempts to mitigate depression 
with the use of high-consensus information 
that was designed to externalize the perceived 
cause of the affect, thereby reducing worry. 
One study that dealt with the “Sunday blues” 
illustrates their general approach. Male un- 
dergraduates filled out a number of mood 
scales, a questionnaire reporting on their aca- 
demic and social activities for the day, and 
rating scales for cartoon funniness. After 
these initial Sunday premeasures, one group 
received statistics showing that many stu- 
dents experience the Sunday blues (92% 
occasionally, 65% often), a second group was 
provided with an explanation of the phenome- 
non to supplement this consensus, and a third 
group was not given any base-rate informa- 
tion. Postmanipulation measures revealed that 
no changes in mood took place in any of the 
groups, including those presented with the 
consensus. 

Research since these influential experiments has focused on
determining those conditions that mediate the efficacy of
consensus. One outcome of the more recent work in this area
is the widespread recognition that the interaction between
expectancies and sample-based consensus is important and
that the consensus-attribution relationship is a complex
one whose boundary conditions are easily violated. What has
thus emerged is a collection of empirical findings and
issues that suggests the following modest proposition: Any
conclusions about the effects of consensus information must
be qualified by stating what kind of consensus manipulation
(e.g., implicit or explicit, concretely or abstractly
presented), what kind of effect (i.e., individual or
population prediction, causal attribution, trait
inference), and compared to what response criterion (i.e.,
the null hypothesis vs. a normative model).

Normative Expectancies Versus
Explicit Consensus


The Consistency Problem 


Self-generated consensus appears to provide a powerful
frame of reference, one that may be based upon a lifetime
of self- and other-observation. Some investigators have
thus suggested that experimental consensus manipulations
might be ineffective because of interference from prior
expectations. After all, subjects are typically presented
with base rates about a target behavior that itself arouses
normative expectations. Additionally, subjects may observe
or read about the surrounding situation, the target
stimulus, or the target actor, all of which contribute to
this prior belief. In other words, sample base rates often
comprise only one source of consensus information.

In cases where an observer is presented with base rates for
a behavior that is high or low in implicit consensus, the
consistency or inconsistency between the two may partly
determine the efficacy of manipulated (i.e., explicit)
consensus. At one extreme, the impact of base rates (i.e.,
relative to a no-base-rate control) may be minimal when
they are redundant with prior expectations. In this regard,
[...] noted that Nisbett et al. (1976) presented subjects
with a redundant and hence uninformative consensus
manipulation.
Recall that these latter investigators sought to 
mitigate depressive moods with the use of 
high-consensus information. Nisbett et al. 
(1976) noted that the Sunday blues are com- 
monly experienced by students; yet perhaps 
subjects were also aware that this negative 
affect was normative. Although the no con- 
sensus subjects’ prior beliefs were not as- 
sessed, their estimates might not have dif- 
fered from those of the base-rate subjects. In 
other words, perhaps the manipulation was 
uninformative, having little effect on already 
existing beliefs. 

A more common problem arises when base 
rates are rejected for being highly discrepant 
with implicit consensus. Instances of extreme 
inconsistency are fairly well documented. 
Nisbett and Borgida (1975) noted this kind 
of discrepancy in their study when they com- 
pared the “naive estimates” made by the no 
base-rate subjects (i.e., extreme shock toler- 
ance/not helping a victim were unexpected 
behaviors) with the base rates actually pre- 
sented (i.e., shock tolerance/nonhelping were 
high-consensus behaviors). Miller et al. 
(1974) presented their subjects with the sur- 
prising results of Milgram’s (1963) obedience 
study and also found that subjects did not 
employ the consensus for prediction or expla- 
nation. Finally, Kahneman and Tversky 
(1973) reported that base rates had little 
effect on category prediction when they conflicted
with the expectancies aroused by their
target-case descriptions. 

In sum, it appears that observers’ use of sample-based
consensus is minimal when this information is highly
redundant or discrepant with baseline expectancies. We turn
now to discussions about the cognitive strategies by which
individuals reject discrepant base rates and the ways in
which the effectiveness of explicit consensus has been
increased.


Cognitive Strategies for the Rejection of 
Explicit Consensus 


If observers do indeed reject base rates
that are inconsistent with their beliefs, what
are the cognitive strategies by which they
reject the new information? One possibility
is that the perceiver may simply disbelieve
the base rate. Results of a survey sponsored 
by a political party as well as the manipula- 
tions of a psychology experimenter may thus 
be dismissed as false and deceptive. More 
important, however, are those situations where 
the observed consensus is believed but is 
rejected as uninformative. 

The first mechanism that comes to mind is 
the possibility for distorted recall. Perhaps 
subjects who read that very few participants
helped a seizure victim actually recalled a 
higher base rate immediately prior to their 
dependent variable responses. This possibility 
is ruled out by consensus recall results (Nis- 
bett & Borgida, 1975; Wells & Harvey, 
1977), which held up even when manipula- 
tion and assessment were separated in time 
(Nisbett & Borgida, 1975). Wells and Harvey 
(1977) did identify one strategy typically 
employed by observers who are confronted 
with discrepant base rates. Subjects, either
informed or uninformed of sample randomness,
read the bystander intervention study and
base rates indicating that helping was either 
high or low in consensus. They then indicated 
the extent to which the participants compris- 
ing the base rate were “representative” and 
“different” from students in general. In the 
no-knowledge-of-randomness condition, the 
lower the base rates for helping were, the 
more different from people in general that 
sample of actors was rated. Having been
presented with highly unexpected informa- 
tion, subjects readily assumed a bias in the 
observed sample as a means of rejecting the 
base rate as uninformative of the “true” con- 
sensus. 

What about observers’ inferences about the 
size of a sample? Kahneman and Tversky 
(1972) asserted that “people often take 
seriously a result stated in percentages, with 
no concern for the number of observations, 
which may be ridiculously small” (p. 44). 
Wells and Harvey’s (1977) findings, however, 
suggest a second mechanism that perceivers
might invoke when confronted with a discrep-
ant percentage: to infer that it was based on
a small sample. Accordingly, Kassin (1979)
had subjects read one of two situations in
which helping behavior was either expected or
unexpected and results indicating either a high
or low base-rate percentage. As predicted,
expectancies and base rates interacted—sub- 
jects for whom the consensus percentages were 
discrepant with normative expectations (i.e., 
high expectancy, low base rate and low ex- 
pectancy, high base rate) assumed that the 
consensus had been derived from a small 
sample. In sum, observers have at their dis- 
posal certain mechanisms with which they 
justify their dismissal of sample-based con- 
sensus information. 


Effective Manipulations of Explicit Consensus 


In view of the problems associated with 
highly discrepant or redundant consensus in- 
formation, it is not surprising that significant 
effects for sample-based consensus have gen- 
erally been obtained under somewhat restric- 
tive conditions, either when the manipulation
is strengthened (e.g., when it is particularly 
salient) or when normative expectancies have 
been neutralized (i.e., in the absence of in- 
formation about the target actor and stimu- 
lus). What follows now is a review of those 
conditions that have successfully increased 
observers’ use of base rates. 


Neutralization of Expectancies 


A number of investigators have obtained 
strong effects for sample-based consensus by 
denying subjects access to expectancy-arous- 
ing information. For example, Kassin and 
Lowe (Note 6) conducted two experiments in 
which consensus and distinctiveness informa- 
tion were illustrated through the movement 
of objects (after Heider & Simmel, 1944). 
Specifically, a series of animated films was 
shown depicting interactions between three 
triangles (Target Persons A, B, C) and three 
squares (Target Stimuli X, Y, Z). In one 
experiment, Target Actor B bumped Target 
Stimulus Y. For high consensus, Persons A 
and C also bumped Y; for low consensus, A 
and C approached but did not bump Y. Dis- 
tinctiveness was similarly operationalized by 
varying the behavior of Actor B toward the 
other stimuli, X and Z. In contrast to some 
previous findings, consensus had a large effect 
on causal attribution and was often cited by 
subjects in their free response explanations; 
it also had a particularly strong effect on pre-
dictions about the target stimulus.

Hansen and Lowe (1976), who manipu-
lated consensus by having observers view ac-
tors’ bogus reactions to musical stimuli, also
insured control over expectancies by pre-
testing the stimuli and selecting those that
had been rated as neutral and equivalent. In
this way, observers’ tendencies to infer high 
consensus from their own preferences (Han- 
sen & Donoghue, 1977; Ross et al., 1977) 
were minimized. Finally, the study by Feld- 
man, Higgins, Karlovac, and Ruble (1976) 
is particularly informative of this boundary 
condition for an explicit-consensus effect.
Consensus was presented in videotaped se-
quences of an actor making a choice among
items and others either agreeing or disagree-
ing with that choice. As it turned out, causal
attributions were influenced by the manipu-
lation, but only for subjects who did not
actually see the stimuli from which the ac-
tor’s choice was made. Of course, instances
where base rates influence target-case predic-
tion only when individuating information is
unavailable (Carroll & Siegler, 1977; Kahne-
man & Tversky, 1973; Zuckerman, 1978a)
reflect the same pattern. In fact, both Kahne-
man and Tversky (1973) and Zuckerman
(1978a) found that the impact of base rates
on target prediction was consistently greater
for subjects who did not read a description
of the stimulus person than for those who
did.


Factors That Strengthen Consensus 


Clearly, most of the recent research effort 
has been directed toward isolating those con-
ditions that strengthen manipulations of ex-
plicit consensus. These factors are reviewed
below.


Magnitude of Consensus


How high is high consensus? Because con-
sensus refers to a proportion, one
prediction is that the stronger the consensus
manipulation (e.g., 90%-10% rather than
60%-40%), the greater will be its effect on
predictions and causal inferences. In Mc-
Arthur’s (1972) and other similar question-
naire experiments, the actual proportions
were not specified. For those presentations of 
consensus in which subjects directly viewed 
the behavior of others, the three (Hansen & 
Lowe, 1976) or four (Feldman et al., 1976) 
others behaved unanimously, thereby creating 
small-sample base rates of 100%-25% and 
100%-20%, respectively. Only since the Nis- 
bett and Borgida experiments (1975) have 
attribution researchers begun to present their 
subjects with more complex base rates com- 
posed of larger samples of “others.” Wells 
and Harvey (1977) suggested that Nisbett 
and Borgida’s (1975) base rate manipulations 
were relatively weak. Experiments were thus 
conducted in which the same two situations 
were described but with a lower low consensus 
and a higher high consensus added. As pre- 
dicted, the stronger version of consensus had 
a somewhat greater effect than did the origi- 
nal manipulation (see also Smith, Note 7). 
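
The small-sample percentages mentioned above follow from counting the target actor together with the unanimous others. The sketch below makes that arithmetic explicit. The function names and the assumption that the target actor is included in the denominator are illustrative only and are not taken from the original studies.

```python
from fractions import Fraction


def unanimous_base_rates(n_others):
    """Return (high, low) consensus proportions for one target actor plus
    n_others observers who all respond the same way.

    Assumption for illustration: the target actor counts toward the sample.
    Unanimous agreement gives every actor making the same choice, while
    unanimous disagreement leaves the target as the only one making it.
    """
    if n_others < 1:
        raise ValueError("need at least one other actor")
    total = n_others + 1
    return Fraction(total, total), Fraction(1, total)


def as_percent(proportion):
    """Convert a Fraction proportion to a percentage as a float."""
    return float(proportion * 100)


for n in (3, 4):
    high, low = unanimous_base_rates(n)
    print(n, "others:", as_percent(high), "vs.", as_percent(low))
```

With three others the low-consensus figure is one actor in four, and with four others it is one in five, which matches the 25% and 20% values cited for the two studies.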


Salience or Availability of the Base Rate 


Consensus is not necessarily an abstract 
form of information. Rather, manipulations of 
sample-based consensus may vary along a con- 
tinuum of salience and concreteness. At one 
extreme, it may be operationalized as a “re- 
mote, pallid, and abstract” (Nisbett et al., 
1976) summary base rate. At the other ex- 
treme, consensus information may be con- 
veyed more concretely—through the live or 
videotaped actions of others (e.g., Feldman
et al., 1976) or through the animated move- 
ments of objects (Kassin & Lowe, Note 6). 
Borgida and Nisbett (1977) have argued con- 
vincingly that concrete information is more 
compelling than similar information presented 
in abstract form. To the extent that perceivers
successively consider possible causes in order 
of their salience (Smith, Note 7), consensus- 
based inference may be facilitated by factors 
that increase the variable’s attention value and 
perhaps its availability for recall (Taylor &
Fiske, 1978).

Sequential versus simultaneous presenta-
tion. As with other variables in social per-
ception, consensus often unfolds over time.
Recall that Feldman et al. (1976) manipu-
lated the consensus of an actor’s choice. In a
simultaneous-presentation condition, the four
other actors then agreed or disagreed with
this choice in unison; in a serial-presentation 
condition, four other actors then agreed or 
disagreed individually. A Consensus × Mode
of Presentation interaction indicated that con- 
sensus affected causal attributions more when 
it was presented sequentially than when it 
was presented simultaneously. In fact, a com- 
parable study (reported by Manis, 1977) has 
revealed that subjects’ use of base rates for 
making categorical predictions is also in- 
creased by a sequential mode of presentation. 
Feldman et al. (1976) explained this effect 
as “a tendency for subjects to process simul- 
taneous information as a single bit of infor- 
mation, but process successive information as 
independent bits of information. Thus, there
would be ‘more’ consensus information pro- 
cessed by subjects in the successive presenta- 
tion condition” (p. 697). The importance of 
these findings is underscored by their eco- 
logical validity. Since consensus information 
in “real-world,” nonlaboratory settings is 
often acquired through multiple observations 
that are separated by time, it is probably a 
powerful determinant of prediction and attri- 
bution. 

Order of information presentation. The
recognized importance of temporal factors has
stimulated a number of interesting hypotheses
concerning the order of information presenta-
tion at three levels: between consensus and
other cues, between the target actor versus 
others’ behavior, and within the others’ be- 
havior. At the first level, Ruble and Feldman 
(1976) varied the order in which consensus, 
distinctiveness, and consistency were pre- 
sented in a questionnaire study and found a 
recency effect—high versus low consensus 
had the least impact when presented first and 
the greatest impact when presented last (see 
also Zuckerman, 1978b). The relative 
strengths of consensus and distinctiveness 
may thus be vulnerable to the order of infor- 
mation presentation. 

The order in which the target actor versus 
the others’ behavior is viewed is also an im-
portant characteristic of consensus when its
perception is embedded in time. In the video-
tapes of Feldman et al. (1976), the target 
actor always made a choice before rather than 
after the others did. On the assumption that 
observers are attuned to conformity and com-
pliance cues, Kassin (1977) varied the con- 
sensus and order of behavior in a contrived 
aggression experiment and found that an ac- 
tor’s high-consensus aggression was personally 
attributed less often when he delivered shock 
after the others did than when he did so be- 
fore the others. Thus, the order of the target 
and others’ behavior may mediate the con- 
sensus effect, at least for behaviors that entail 
social pressure and when the order is not 
fixed by social constraints (e.g., “respond 
when you decide” rather than “respond in 
order of seating”). 

Finally, at the third level, there is ample
evidence from research in impression forma- 
tion (Jones & Goethals, 1971) to suggest 
that the order in which the others’ behaviors 
appear might actually elicit differential esti-
mates of consensus (e.g., one’s estimate that
“most actors choose X over O” could vary if
the choices unfold as XXXXXOO, OOXXXXX, or
XXOXXOX).
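
The three orderings in this example contain the same overall proportion of X choices, so any difference in estimated consensus would have to come from order alone. A minimal sketch, assuming nothing more than a running tally (the helper `running_proportions` is hypothetical and is not a model of how observers actually integrate the information), shows how differently the evidence accumulates:

```python
from itertools import accumulate


def running_proportions(choices, target="X"):
    """Proportion of `target` choices observed after each successive actor."""
    hits = accumulate(1 if c == target else 0 for c in choices)
    return [h / (i + 1) for i, h in enumerate(hits)]


# The three orderings from the example, written with X and O.
orders = ["XXXXXOO", "OOXXXXX", "XXOXXOX"]

for order in orders:
    props = running_proportions(order)
    # Every ordering ends at the same final proportion (5 of 7),
    # but the intermediate tallies differ markedly.
    print(order, [round(p, 2) for p in props])
```

An observer who weighted early or late observations more heavily would therefore arrive at different consensus estimates from identical totals, which is the possibility raised by the impression-formation literature.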

Translatability of the base rate. Carroll 
and Siegler (1977) suggested that perhaps 
Kahneman and Tversky’s (1973) subjects 
viewed the category base rates as relevant but 
had difficulty drawing a direct implication 
from them. Specifically, they noted, “if they 
attempt to probability match (cf. Weir, 1964), 
there would be no direct correspondence be-
tween a 70%/30% division in the population
and how a sample of five people can be di- 
vided. Seeing no direct solution to the prob- 
lem of how to use the base rate information, 
subjects may then decide to ignore it” (p. 
393). Accordingly, Carroll and Siegler (1977) 
examined the effect of translatability on base- 
rate utilization. 

In that study (reported as Experiment 2),
half the subjects predicted the occupations of 
10 individuals sampled from a population of 
20 with a 70%-30% base rate (these figures 
afforded direct translation into a 7-3 parti- 
tion for the 10-member sample). The other 
subjects were presented with base rates that 
were 75%-25% and hence did not allow for 
whole number translation to a sample of 10. 
Results indicated that despite the increased 
strength of the consensus manipulation in the 
latter condition, predictions were influenced 
only by the translatable base rates. A similar 
effect was reported by Tversky and Kahne-
man (1974), who labeled this availability
heuristic “imaginability.” Thus, the ease with
which a base rate may be applied to a predic- 
tion task affects subjects’ use of the informa- 
tion. 
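
The notion of translatability can be stated as a simple divisibility check. The sketch below is only an illustration (the function `translate_base_rate` is hypothetical): it asks whether a percentage base rate maps onto a whole-number split of a sample of a given size.

```python
from fractions import Fraction


def translate_base_rate(percent_a, sample_size):
    """Return the (a, b) head count implied by a base rate, or None when
    the base rate does not translate into whole numbers of people.

    percent_a is the percentage of the population in category A;
    the remainder is assumed to fall in category B.
    """
    expected_a = Fraction(percent_a, 100) * sample_size
    if expected_a.denominator != 1:
        return None  # e.g. 7.5 people: no direct translation
    a = int(expected_a)
    return a, sample_size - a


for percent, n in [(70, 10), (75, 10), (70, 5)]:
    print(percent, n, translate_base_rate(percent, n))
```

Under this reading, a 70% base rate translates into a 7-3 split for a sample of 10, whereas 75% for 10 people and 70% for the five-person sample in the quoted passage do not yield whole numbers.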


Representativeness of the Sample 


Of perhaps even greater importance than 
the salience or availability of explicit consen-
sus is its perceived validity. One severe limi-
tation of explicit consensus is that it is based 
on the observation of a limited sample. Thus, 
although the sample base rates should always 
influence guesses about the behavior of per-
sons from that group, utilization of these data
toward prediction of other samples or the 
population at large is necessarily constrained 
by the perceived representativeness of the 
original sample. This limitation was demon- 
strated in studies showing that base-rate in- 
formation may be undermined by inferences 
that the sample is biased (Wells & Harvey, 
1977) or small (Kassin, 1979).

Knowledge of randomness. One variable
that enhances the meaningfulness of a base
rate is knowledge of random selection. Theo-
retically, consensus utilization requires a be-
lief that the observed sample of actors is rep-
resentative of the population (e.g., Wells &
Harvey, 1978). In line with Kelley’s
model, a consensual behavior should be situ-
ationally attributed. However, this inference
may be undermined by a perceiver’s dismissal
of the base rate as being attributable to some
idiosyncratic characteristics of the
sample. For example, if all the actors in a
group solve a puzzle, an observer would not
infer task ease if he or she believes that the
individuals are geniuses. The perceiver has to
know or be able to assume that the actors are
representative of the population, that is, of
average intellect, in order to make appro-
priate inferences. Wells and Harvey (1977) dealt
with this issue. Noting that Nisbett and
Borgida (1975) had not assured their subjects
that the base rate sample was representative,
Wells and Harvey told half of their subjects
that the sample from which the base rates were
derived had been randomly selected. Sure
enough, percentage estimates of population
performance and causal attributions for the 
behavior of another were influenced by con- 
sensus only for subjects in the knowledge-of- 
random-sampling condition, a finding corrobo- 
rated by Hansen and Donoghue (1977). 
Guesses about the behavior of persons who 
were in the original sample, however, were 
affected by consensus with and without the 
randomness variation. Consistent, then, with 
the notion that knowledge of randomness in- 
creases the generalizability of a base rate to 
new samples, such instructions affected extra- 
sample prediction (i.e., the prediction of indi- 
viduals in the general population) but not 
intrasample guesses (i.e., guessing the be-
havior of target actors from the original 
sample). 

An intriguing and related side issue con- 
cerns the “others” comprising consensus. 
Kahneman and Tversky (1973) insisted that 
people are more confident when making infer- 
ences from correlated than from independent 
cues. However, Goethals (1972) found that 
an actor’s confidence is increased more by 
agreement with a dissimilar other than with 
a similar other, tentatively suggesting that 
high consensus is more informative about a 
stimulus when there is variability in the per- 
sonal characteristics of the actors in a sample. 
Finally, note that Borgida (1978) proposed 
an additional, more concrete method by which 
to convey representativeness information—to 
show subjects a few “typical” people from the 
sample who behaved consensually. 

Sample size. Ajzen and Fishbein (1975) 
suggested that when a proportion is held con-
stant, information gain and hence causal at- 
tribution should increase with sample size, 
that is, with the number of observed events. 
The fact that “30 out of 40 students failed the 
exam” should thus provide more information 
than would “3 out of 4 students failed the 
exam.” Accordingly, Kassin (1979) presented 
subjects with a description of a helping ex-
periment and two conflicting sets of results,
one derived from a sample of 10 actors and
the other from a sample of 50. For some sub-
jects, helping was high in base rate for the
large sample (60%, 70%, 80%, or 90%) and
low in base rate for the small sample (40%,
30%, 20%, or 10%, respectively). For others,
the base rates were conversely low for the 
large sample and high in the small sample. 
Results for both individual and population 
predictions indicated that although subjects 
did not make “optimal” use of the informa- 
tion (i.e., a weighted average), they did dem- 
onstrate an overall reliance on the larger- 
sample base rate.
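
To make the weighted-average benchmark concrete, the sketch below pools two conflicting samples by weighting each base rate by its sample size. Weighting by sample size is one reasonable reading of “optimal” use and is assumed here for illustration; the function names are hypothetical, and the 60%/40% pairing is one of the combinations described above.

```python
def pooled_base_rate(samples):
    """Size-weighted average of several (proportion, n) samples."""
    total_n = sum(n for _, n in samples)
    return sum(p * n for p, n in samples) / total_n


def simple_average(samples):
    """Unweighted mean of the proportions, ignoring sample size."""
    return sum(p for p, _ in samples) / len(samples)


# Helping is 60% in the sample of 50 and 40% in the sample of 10.
samples = [(0.60, 50), (0.40, 10)]

print(round(pooled_base_rate(samples), 3))  # weighted toward the larger sample
print(round(simple_average(samples), 3))    # treats both samples alike
```

Here the pooled estimate is 34 helpers out of 60 actors, or about 57%, whereas a judge who ignored sample size would settle on 50%. Estimates that fall between these values, or that simply side with the larger sample, show some sensitivity to sample size without matching the weighted benchmark exactly.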


Causal Relevance of Consensus 


Base rates with an explanation. On the 
premise that people possess and often invoke 
their intuitive theories about the causes of 
events, Ajzen (1977) proposed that “causality 
heuristics” mediate the consensus-prediction 
relationship: 


When asked to make a prediction, people look for 
factors that would cause the behavior or event 
under consideration. Information that provides evi- 
dence concerning the presence or absence of such 
causal factors is therefore likely to influence pre- 
dictions. Other items of information, even though
important by the normative principles of statistical 
prediction, will tend to be neglected if they have 
no apparent causal significance. Statistical informa- 
tion is used mainly when no causal information 
is available. (p. 304) 


To test this proposition, Ajzen (1977) had 
subjects predict fictitious students’ academic 
success (i.e., grade point average) from two 
cues—one that was intuitively causal (IQ or 
study time) and one that was not (income or 
distance from campus). In addition, he told
subjects that each cue had demonstrated 
either a strong or weak relationship to the 
criterion. Results indicated that subjects’ pre- 
dictions were influenced more by causal and 
empirically strong cues than by noncausal 
and weak cues, respectively. More important, 
the causal nature of the information had a 
greater impact than did its empirical validity. 
In fact, the causal-weak cue was given greater 
weight than the noncausal-strong cue (e.g.,
even when IQ was a weak predictor and dis- 
tance was strong, IQ weighed more heavily in 
subjects’ decisions). Within the framework of 
these results, Ajzen explained subjects’ under- 
utilization of base rates in Kahneman and 
Tversky’s (1973) lawyer-engineer study by 
noting that the proportion of lawyers (engi- 
neers) did not cause any member of the sam-
ple to become an engineer or lawyer, whereas 
the target-case description did provide such 
causal information. 

By way of implication, one way to increase 
observers’ use of base rates is to augment 
them with a reasonable explanation or “causal 
theory.” Smith (Note 7) presented subjects 
with the bystander intervention and shock 
tolerance descriptions employed by Nisbett 
and Borgida (1975). In addition to the stan- 
dard high- and no-base-rate groups, others 
were provided with supplementary explana- 
tions (e.g., “74% went all the way to the last 
shock because they were low in anxiety” or 
“because they wanted the experimenter to 
approve of them”). Sure enough, this manip- 
ulation increased subjects’ use of the con- 
sensus toward individual prediction, leading 
the author to conclude that “the reason be- 
comes salient as a possible cause of behavior” 
(p. 24). Finally, note that Tversky and 
Kahneman (1978) have also conceptualized 
their base-rate results within the context of a 
causality heuristic. 

Actions versus occurrences. After Krug- 
lanski (1975), Zuckerman (1978b) distin- 
guished between behaviors that are under the 
actor’s voluntary control (actions) and those 
that are not completely voluntary (occur- 
rences). Although occurrences may be at- 
tributed to either internal or external factors, 
actions are by definition caused by internal 
factors and should not logically be affected by 
consensus. The reasoning is as follows: Con-
sensus provides information about the provo- 
cation or eliciting power of a stimulus (see 
Anderson, 1974; McArthur, 1976; also see 
McArthur’s, 1972, distinction between mani- 
fest and subjective verb categories). Yet if the 
stimulus is ruled out as a plausible (external) 
cause, consensus information becomes irrel- 
evant. 

To test this hypothesis, Zuckerman (1978b) 
had subjects read about one of two kinds of 
events—actions (e.g., “Jerry attended the 
Sunday meeting”) and occurrences (e.g., 
“Mary passed the exam in history”). These 
statements were accompanied by high or low 
consensus, distinctiveness, and consistency in-
formation. Consistent with previous research
(e.g., McArthur, 1972; Orvis et al., 1975),
consensus accounted for the highest percent-
age of variance in stimulus attribution. More
germane to the present discussion, consensus
had a substantially larger effect on the at-
tribution of occurrences than on actions. Ap-
parently, the consensus-attribution link is
strongest for behaviors that are not com-
pletely voluntary.


Summary 


In sum, explicit consensus varies along a
number of important dimensions that influ- 
ence its efficacy. The following characteristics 
of the variable were reviewed: the strength 
or magnitude of the proportion, the salience
of the information and the ease with which it
may be applied, the perceived representative-
ness (and hence generalizability) of the base-
rate sample, and the causal relevance of the
base rate. We turn now to a consideration of 
the dependent variables. 


The Dependent Variables 
Prediction 


Subjects have been asked to predict the
behavior of target members of the original
sample, frequencies or individuals in a new
population, and/or their own behavior. An
important issue here concerns the response
criterion by which subjects’ performance is
evaluated (for a comprehensive discussion of
the response criterion controversy, see [...]). It
was noted earlier that the conclusions drawn
about base rates and prediction are
based on one of two criteria. The first is the
null hypothesis: do the predictions of base
rate versus no base rate or high- versus low-
base-rate subjects statistically differ at a con-
ventionally acceptable level? If they do,
one would conclude that consensus affects
predictions; if no difference is found, one
would conclude that subjects ignored con-
sensus. An alternative strategy is to compare
subjects’ predictions with those implied by a
normative model: do the predictions of base-
rate subjects perfectly match those implied
by normative standards? If they do, one may
conclude that subjects made perfect utiliza-
tion of base rates; if not, one would conclude
that observers “underutilized” consensus.

Kahneman and Tversky’s (1973) lawyer-en-
gineer study illustrates the distinction. Recall
that their subjects predicted the occupations
of individuals taken from a population of
70% or 30% engineers (lawyers). Overall,
subjects’ mean estimates were 55% in the
high engineer condition and 50% in the low
engineer condition, a result indicating that
although subjects did not ignore the base rate
(i.e., the difference between 50% and 55%
was statistically significant), they did under-
utilize it (i.e., 55% compared to 70%).² With
this distinction in mind, the available research
indicates that although observers generally
do not ignore consensus information (Nisbett
& Borgida, 1975, is an exception), they do
not make full utilization of it either (see
Wells & Harvey, 1977; 1978).
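
The normative benchmark can be made explicit with Bayes' rule in odds form, the perspective taken in Footnote 2. The sketch below is illustrative only: the likelihood ratio attached to a personality sketch is a hypothetical parameter, and no value for it is reported in the text.

```python
def posterior_probability(prior, likelihood_ratio):
    """Bayes' rule in odds form.

    prior            -- base rate of the category (e.g., 0.70 engineers)
    likelihood_ratio -- how much more probable the case description is for
                        a category member than for a nonmember
                        (hypothetical value, chosen for illustration)
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


for prior in (0.70, 0.30):
    for ratio in (1.0, 3.0):
        print(prior, ratio, round(posterior_probability(prior, ratio), 4))
```

With a likelihood ratio of 1 (a completely nondiagnostic description), the normative answers are simply the base rates, .70 and .30. With a hypothetical ratio of 3 they become .875 and .5625, which illustrates why a range of predictions can be consistent with the normative model, and why a 5-point difference between conditions still falls short of it when the description is uninformative.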

In general, population predictions are af-
fected by both implicit and explicit consensus,
the latter being especially effective when ob-
servers are assured that the base-rate sample
is representative (Hansen & Donoghue, 1977;
Wells & Harvey, 1977) and large (Kassin,
1979), when the base rates are directly trans-
latable into a prediction (Carroll & Siegler,
1977), and when they unfold sequentially
(Manis, 1977). The prediction of individual
cases is also affected, though not optimally, by
base rates (Lowe & Kassin, 1977; Wells &
Harvey, 1977; Zuckerman, 1978a), except
when relevant personal information (i.e.,
actor-based expectancies) is also available
(Carroll & Siegler, 1977; Kahneman & Tver-
sky, 1973; Nisbett & Borgida, 1975; Zuck-
erman, 1978a). Finally, note that base rates
appear to have a greater impact on ob-
servers’ discrete predictions (e.g., Category A
vs. Category B) than on subjective probabil-
ities associated with these outcomes (Manis,
Dovalina, Avis, & Cardoze, in press).


Causal Attribution 


A number of important differences between
predictions and attributions deserve mention.
Whereas the prediction task requires direct
translation from base rates, causal attribution
demands that more complex inferences be
made (Nisbett & Borgida, 1975). Also, in 
contrast to prediction data that may be com- 
pared to those implied by a formal model, 
“there is no generally agreed upon criterion 
for explanatory adequacy” (Fischhoff, 1976, 
p. 439). The null hypothesis (e.g., consensus 
vs. no consensus) rather than optimality thus
serves as the unambiguous criterion by which 
to judge causal attributions. Finally, in con- 
trast to the assessment of predictions, which 
is straightforward and operationally consistent 
across studies, there is considerable disagree- 
ment over how to operationalize attribution 
measures that are far less sensitive and reli- 
able. 

Nisbett and Bellows (1977), among others, 
have argued that people infer causality from 
their normative expectations. This relation- 
ship has been established, even in cases where 
base-rate information is also available (e.g., 
Hansen & Donoghue, 1977; Lowe & Kassin, 
1977). On the other hand, observers use ex- 
plicit consensus toward causal attribution only 
under certain well-defined conditions—when 
prior expectancies are neutralized and/or 
when the base rate is strong, salient, repre- 
sentative of the population, and causally rel- 
evant. Note also that sample base rates do not 
affect personal and situational attribution 
equally. Instead, they generally provide more 
information about the stimulus than they do 
about the person (DiVitto & McArthur, 1978; 
Garland et al., 1975; McArthur, 1976; Orvis 
et al., 1975; Zuckerman, 1978b; Zuckerman 
& Mann, in press; Vestewig, Note 2). The 
actual behavior of actors thus provides in- 
formation about the power and characteristics 
of the focal stimulus or general situation (see 


2 Actually, Wells and Harvey (1978) are quick to 
point out that “proper utilization” in Kahneman 
and Tversky’s (1973) study does not necessarily 
imply target case predictions of 70% and 30%.
Their subjects had two sources of information—the 
personality sketches and the category base rates.
From a Bayesian perspective, if an individual case 
description is both accurate and diagnostic, it can 
serve to overcome the prior odds and reduce sub- 
jects’ reliance on the 70-30 probabilities. Conse-
quently, a number of possible predictions could be 
acceptable within the normative model. 
also the discussion by DiVitto & McArthur,
1978). 

One final limitation within the domain of 
causal attribution is notable. That is, although 
consensus is informative for validating the 
causal categories of person versus situation, it 
is not necessarily of interest to observers who 
are evaluating alternative categories (e.g.,
ability versus effort, intentional versus un- 
intentional). In short, the utility of consensus 
is restricted to a narrow range of deductively 
relevant causal categories (Kruglanski, 
Hamel, Maides, & Schwartz, 1978). 


Trait Inference 


A number of investigators have assessed 
subjects’ inferences about the characteristics 
of the target actor and stimulus. Trait infer- 
ence should follow from causal attribution 
such that actor-caused behaviors should yield 
more extreme and more confident trait ratings 
than situationally caused events would. As 
with causal attribution, dispositional infer- 
ences about the actor are often based on a 
comparison of his or her behavior with the 
expected norm (Ajzen, 1971; Lay et al., 1973; 
Lowe & Kassin, 1977; Ross et al., 1977). In 
other words, the actor’s characteristics are 
inferred from his or her unexpected (i.e., low 
implicit consensus) behavior. 

The effect of sample-based consensus on 
trait inferences has only recently received em- 
pirical attention. Nisbett and Borgida (1975) 
reported that inferences about actors’ traits 
were unaffected by consensus. Wells and Har- 
vey (1977), however, demonstrated that base 
rates do affect these inferences when the 
traits being assessed are relevant to the target 
behavior. Having read the bystander inter- 
vention study, their base-rate subjects rated 
nonhelping actors on trait dimensions, whereas 
their no-base-rate subjects evaluated the re- 
latedness between the target behavior and 
each dimension. Of nine traits seen as un- 
related to the target behavior, none was 
affected by the base rates. Yet, three out of 
four “relevant” traits were influenced by the 
consensus manipulation. Similarly, Hansen 
and Stonner (1978) found that consensus 
affects inferences about stimulus characteris- 
tics (e.g., task difficulty), but only when the 
base rates and attributes are perceived to be
causally related. 


Summary and Conclusions 


The present review suggests that any con-
clusions concerning the effects of consensus
must be qualified by delineating what type
of consensus, what type of effect, and com-
pared to what response criterion. Normative
expectancies that are based on self-observa-
tion (i.e., the false-consensus effect) or pre-
vious other-observation (i.e., Jones & Mc-
Gillis’s, 1976, consensus variable) provide an
initial basis for prediction and a frame of
reference within which causes are inferred.
These expectancies may be of a general nature
or they may be refined by a perceiver’s knowl-
edge about the (class of) target actor and sur-
rounding behavioral situation. Explicit, sam-
ple-based consensus (i.e., Kelley’s, 1961,
conception of a covariation across actors) is
similarly employed, but under somewhat re-
strictive circumstances. That is, base rates
appear to be relatively ineffective when they
are either too inconsistent or are redundant
with normative expectations. In some studies,
the power of consensus has thus been demon-
strated only when actor- or situation-based
expectancies have been neutralized. In other
studies, consensus utilization has been in-
creased by varying characteristics of the ex-
perimental manipulation such as the magni-
tude of the consensus proportion, the salience
([...] vs. simultaneous
presentation, order of information presenta-
tion) and the ease with which it may be ap-
plied (translatability), the perceived repre-
sentativeness of the base-rate sample (knowl-
edge of random selection, sample size), and
the causal relevance of the base rate (base
rates with an explanation, actions vs. occur-
rences).

On the dependent variable side, consensus
information is employed differently toward
the various tasks of prediction, causal attribu-
tion, and trait inference. Generally,
observers appear to underutilize but not
ignore base rates when making categorical
behavioral predictions. This conclusion is tem-
pered, of course, by the host of factors that
strengthen the presentation of base rates.
For making causal attribution and trait infer-
ences, consensus seems to have a greater im-
pact on stimulus than on person inferences.
Finally, only traits and characteristics per-
ceived as relevant to the target behavior are
influenced by consensus information.

To date, a wide variety of issues surround-
ing the consensus controversy have been ad-
dressed. These include actor-observer differ-
ences in base-rate utilization, developmental
trends in the use of consensus, the relative
efficacy of consensus and distinctiveness, con-
crete versus abstract information, response
criteria in social judgment research, and so
on. What has emerged from this recent surge
in research is an empirical, though somewhat
atheoretical, redefinition of the boundary con-
ditions for a consensus effect. Yet two impor-
tant issues remain largely unresolved. First,
how do normative expectancies develop and
on what informational characteristics are they 
based? The literature suggests that implicit 
consensus is a viable determinant of observers’ 
causal beliefs. Nevertheless, a theoretical ac- 
count of these expectancies, one that would 
enable us to predict their direction and mag-
nitude, is currently lacking. A second im-
portant issue demanding theoretical integra-
tion concerns the relationship between im-
plicit and explicit consensus. In most life
situations, people observe base rates for
events that are high or low in their subjective
probabilities of occurrence. Although the
available data indicate that base rates are
more effective when information about the
target actor or situation is absent, nobody has
tested the logical hypothesis that the degree
of discrepancy between implicit and explicit
consensus mediates utilization of the latter.
At this point, attention might fruitfully be
directed toward examining this relationship.

Female undergraduates (n = 62) who scored as extreme internals or externals
on the Mirels Personal Fate Control Scale participated in a partial replication 
of Hiroto’s learned helplessness experiment. Lights were added to the treatment 
apparatus, which made explicit to subjects the contingency or noncontingency 
between their responses and the termination of an aversive tone. As predicted, 
the performance of internals was significantly impaired by uncontrollability 
(learned helplessness), while that of externals was facilitated by controllability 
(learned effectiveness). Externals performed as well as internals in the “escapa- 
ble” condition, but their performance was inferior to that of internals in the 
control condition. Following “inescapable” treatment, internals performed 
worse than externals. These results are supportive of Lefcourt’s theory of cue 
explication. Implications for locus of control and learned helplessness research
are discussed.


The behavioral consequences of both the 
experimental induction of learned helplessness 
(Abramson, Seligman, & Teasdale, 1978; 
Maier & Seligman, 1976) and pre-extant, gen- 
eralized perceptions of externality (Lefcourt, 
1976; Phares, 1976) have been examined in a 
study by Hiroto (1974). Hiroto used an in- 
strumental learned helplessness induction pro- 
cedure that has since become common: Sub- 
jects, told there is something they can do to 
terminate an aversive tone, are seated before 
a small box on which there is a button. In the
“escapable” treatment condition, pressing the
button terminates the tone. In the “inescap-
able” condition, nothing the subject does
terminates the tone. Rather, it is terminated
automatically. Hiroto concluded that the ef-
fects of inescapable treatment parallel those
of pre-extant externality: Subjects treated
with inescapable noise performed significantly
more poorly on a subsequent soluble task than
did subjects not treated with inescapable
noise; and the performance of external locus
of control subjects was inferior to that of in-
ternal locus of control subjects across all ex-
perimental conditions.
Although it is typically found that internals
outperform externals at tasks when instruc-
tions emphasizing skill are used ([...]
Lewis, & Silverman, 1968; Rotter [...],
1965; Watson & Baumal, 1967), Hiroto’s
(1974) results indicating that the perform-
ance of internals was superior to that of ex-
ternals in the inescapable condition are puz-
zling when juxtaposed with other locus of con-
trol studies. These studies have shown that
internals adjust their aspirations
following failure, whereas externals tend
to display atypical aspiration shifts ([...],
1968; Lefcourt, 1967; Lefcourt [...],
1965), that internals indicate that they have
no control over a situation after failing to
avoid an aversive outcome (Gregory, 1978),
that internals perform worse than externals 
on a digits-backwards task when they expect 
to receive uncontrollable shocks (Houston, 
1972), and that internals perform worse than 
externals on an anagram task following failure 
(Garrett & Willoughby, 1972). It would seem, 
then, that internals are often more adversely 
affected by negative experience than are ex- 
ternals, although Hiroto’s findings suggest the 
opposite. 

One possible explanation for these appar- 
ently discrepant results might lie in a seem- 
ingly minor procedural element: Hiroto 
(1974) did not utilize lights or other adjunc- 
tive feedback stimuli on the treatment ap- 
paratus to make explicit to the subjects the 
contingency or noncontingency between the 
subjects’ button-pushing responses and the 
termination of the aversive tone. The use of 
these lights has since become standard during 
the treatment phase of learned helplessness 
experiments when instrumental learning is in- 
volved (Hiroto & Seligman, 1975; Miller & 
Seligman, 1975; Price, Tryon, & Raps, 1978; 
Sacco & Hokanson, 1978). Hence, during the 
inescapable treatment, it may be argued that 
some of the subjects pushed the response but- 
ton just prior to or simultaneously with the 
automatic termination of the tone. Since there 
were no response contingency lights to inform
a subject as to whether he or she had termi-
nated the tone or whether it had gone off
automatically, some subjects might have per-
ceived a relationship between their response
and the termination of the tone. Having ex-
perienced partial reinforcement, these subjects
would be more likely to persist during the sec-
ond phase of the experiment. Since Hiroto
found that internals made significantly more
responses than externals did during the in-
escapable treatment (Mdns = 21 and 5.5,
respectively), this scenario is more likely to
have occurred for the internal subject than
for the external subject. A test of this explana-
tion would require only a partial replication of
the Hiroto procedure with the addition of re-
sponse contingency lights to the treatment
apparatus. If this explanation is valid, then
Hiroto’s finding should be reversed: For sub-
jects in the inescapable condition, the per-
formance of internals should be inferior to
that of externals.

But what effect should the addition of the 
response contingency lights have on the be- 
havior of internals and externals in the con- 
trol and escapable conditions? The perform- 
ance of control condition subjects should be 
unaffected, since there is no treatment and 
the response contingency lights are inapplica- 
ble. Thus internals should again outperform 
externals, as in Hiroto’s (1974) study. 

Different predictions can be made for the 
escapable condition. A number of studies have 
shown that explicit task cues have a positive 
effect on the behavior of externals (Dollinger 
& Taub, 1977; Gregory & Nelson, 1978; Lef- 
court, 1967; Lefcourt & Wine, 1969; Taub & 
Dollinger, 1975). These studies have demon- 
strated that providing reward or purpose for 
a task raises the motivational level of ex- 
ternals to equal that of internals or results in 
externals making aspiration shifts more typ- 
ical of internals following a success or a fail- 
ure. Lefcourt (1967) has labeled this process 
“cue explication” and has restricted its mean- 
ing to making the external subject “aware of 
the availability of reinforcements, and per- 
haps, of the methods for maximizing his 
chances of succeeding in given tasks” (p. 
378). In a learned helplessness experiment, a 
response contingency light would make ex-
plicit to the external subject that he or she
had just been reinforced for a correct task
performance. Whereas Hiroto (1974) found
internals to exhibit superior performance in
the escapable condition, the cue explication
literature would suggest that the addition of
the response contingency lights in the escap-
able treatment condition would enhance the
performance of externals to equal that of
internals.

1 One study (Wolk & DuCette, 1974, Study 2) in-
volving cue explication has failed to find that it alters
the performance of externals to equal that of in-
ternals. Under high cue explication conditions, Wolk
and DuCette found that externals failed to display
intentional or incidental learning at a level equal to
that of internals. But it is questionable whether the
high cue explication manipulation met the conditions
prescribed by Lefcourt, since it consisted of inform-
ing subjects that they would be tested on some aspect
of the task in addition to taking a test already speci-
fied. This does not make explicit the reinforcements
available for a good task performance. This proce-
dure could have produced debilitating anxiety or test
anxiety, which are associated with externality
(Feather, 1967; Ray & Katahn, 1968; Watson &
Baumal, 1967). Supporting this argument, Wolk and
DuCette found that high cue explication externals
performed worse than low cue explication externals
in their high difficulty condition.


Predictions 


It was predicted that the addition of re- 
sponse contingency lights to the treatment ap- 
paratus for an instrumental learned helpless- 
ness induction procedure would result in (a) 
no differences in performance between inter- 
nals and externals in the escapable condition, 
(b) internals outperforming externals in the 
control condition, and (c) externals outper- 
forming internals in the inescapable condition. 
For external subjects only, the theory of cue 
explication predicts that (d) the performance 
of subjects in the escapable condition should 
be enhanced above that of subjects in the con- 
trol condition. Comparative performance 
among internals would be unaffected by ex- 
plicit response cues, and they should display 
the typical learned helplessness pattern: (e) 
Inescapable treatment should produce per- 
formance decrements when compared to the 
control condition, whereas (f) the control con- 
dition should not differ from the escapable 
treatment condition. 


Method 


At Arizona State University, students enrolled in 
introductory psychology must write three brief re- 
search reports or participate in 3 hours of laboratory 
research. Students participating in this experiment
had chosen the latter option. 

Subjects were 62 females who scored at the ex- 
tremes (0-2 or 7-9) of the Mirels (1970) personal 
fate control scale. This was a subsample of 183 
females to whom the Rotter (1966) Internal-External 
Locus of Control Scale was administered immediately 
prior to the experiment. Subjects were assigned ran- 
domly to either the escapable treatment, control, or 
inescapable treatment conditions. The relatively 
greater frequency of internals and the method of 
immediate assignment to condition resulted in un-
equal numbers of subjects per cell of the 2 X 3 fac-
torial design. Although Hiroto (1974) used subjects
scoring at the extremes of the DeKalb Survey Tests
(Student Opinion Survey, Form I-E, 1; see James,
Note 1), the Mirels scale was employed here, since it
is assumed to be more predictive of control over 
personal events and it is derived from the more com- 
monly used Rotter I-E scale. The James and Rotter 
scales are highly similar in content. Only female sub- 
jects were used, due to their availability in the sub- 
ject pool. 


Apparatus 


The treatment apparatus, adapted from Hiroto and 
Seligman (1975), consisted of a spring-loaded button 
housed in the center of a circular disc, 30.5 cm in 
diameter and 7.6 cm high, with two circular lights, 
one red and one green, each 1.3 cm in diameter, lo- 
cated on either side of the button. In the escapable 
condition two button presses terminated the aversive 
tone and stopped a remote timer. In the inescapable
condition the button was electronically rendered in-
operative.

The test phase apparatus was a modification of 
Turner and Solomon’s (1962) human shuttle box. 
By sliding a knob to either extreme end, contact could
be made with a hidden microswitch that controlled 
tone termination. Specifications of the manipulandum 
are included in Hiroto (1974). 

The aversive stimulus was a 3000 Hz tone gen-
erated by a Hewlett Packard audio oscillator, pre-
sented to the subject through Realistic Nova 10
stereo headphones at 100 db. Formal instructions
were presented to the subject on typewritten index
cards. All response latencies were measured by a
standard (1/100 sec) automatic timer, and intertrial
intervals (ITI) were timed manually on a stopwatch.
The oscillator, timer, and controlling circuitry were
located in an adjacent room separated by a one-way
mirror.


Procedure 


Treatment. Following administration of the I-E
scale, each subject was escorted individually to the
experimental room and was informed that the experi-
ment concerned the effects of noise pollution on the
performance of simple tasks and that the subject’s
continued participation in the study beyond a
sample of the unpleasant but harmless tone was op-
tional. If, after a 3 sec sample of the 3000 Hz tone,
the subject agreed to continue, the appropriate treat-
ment phase was begun. Three subjects declined to par-
ticipate (one internal and two moderates).

Subjects in the escapable and inescapable conditions
were seated at a table on which the treatment ap-
paratus was located. The experimenter [...] the
apparatus and presented instructions
concerning the noise, how the subject
could stop it, and the function of the
lights. These instructions followed
Hiroto and Seligman (1975), with the addition of a
statement reiterating the function of the two lights:
“You will have successfully stopped the tone
when the green light comes on.”

Forty unsignaled trials with an ITI range of 15 to
75 sec and a mean of 17.4 sec followed. In the escap- 
able treatment, pressing the button twice within 5 
sec after tone onset terminated the tone, and the 
green light flashed on briefly. Response latencies were 
recorded. If the correct response did not occur within 
5 sec of tone onset, the tone terminated automat-
ically, the red light flashed, and a response latency of
5 sec was recorded.

In the inescapable treatment condition, exposure
to tone duration and pattern was yoked to the sub-
ject’s counterpart in the escapable condition and was
independent of the button. The tone was not escap-
able, and the red light flashed at the termination of
each trial.
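
The trial logic of the two treatment conditions can be summarized in a short sketch. It is offered only as an illustration of the contingencies described above, not as a description of the original (electronic) control circuitry. The function names, the dictionary fields, and the decision to treat a press at exactly 5 sec as too late are assumptions.

```python
TIME_LIMIT = 5.0  # sec after tone onset allowed for the correct response


def escapable_trial(press_times, limit=TIME_LIMIT):
    """Outcome of one escapable trial.

    press_times: times (sec after tone onset) of each button press.
    Assumption: a press at exactly `limit` counts as too late.
    """
    valid = sorted(t for t in press_times if 0 <= t < limit)
    if len(valid) >= 2:
        # The second press terminates the tone; the green light flashes.
        return {"tone_sec": valid[1], "light": "green", "latency": valid[1]}
    # No correct response: automatic termination, red light, 5 sec latency.
    return {"tone_sec": limit, "light": "red", "latency": limit}


def yoked_inescapable_trial(partner):
    """Inescapable trial yoked to an escapable partner's trial.

    Tone duration is copied from the partner; the button has no effect
    and the red light flashes at the end of every trial.
    """
    return {"tone_sec": partner["tone_sec"], "light": "red"}


solved = escapable_trial([1.2, 2.0])
missed = escapable_trial([4.0])
yoked = yoked_inescapable_trial(solved)
print(solved, missed, yoked)
```

Under this sketch the yoked subject hears the same tone durations as her escapable counterpart, so the two conditions differ only in response-outcome contingency and in the color of the light.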

Control subjects were not exposed to the treatment 
apparatus but worked on a questionnaire composed
of 100 items selected from the California Psycholog-
ical Inventory on the basis of relatively neutral con-
tent, during which they received yoked exposure to
the series of 40 tones. Subjects in this condition
received instructions on completing the inventory and
were told, “From time to time while you are filling
out the questionnaire a noise will come on. Please
continue working on the questionnaire.”²

Test task. Immediately after treatment, subjects in 
all three groups were seated at a table on which the 
covered test task apparatus was located and were 
given instructions concerning it identical to those of 
Hiroto and Seligman (1975). These instructions were 
similar to those for the first task, but with no refer-
ences to lights, since they were not present on the test
apparatus. 

The test phase consisted of 20 signaled 10 sec
escape-avoidance trials. A 5 sec amber light, located
at the midpoint of the shuttlebox cover, preceded the
3000 Hz, 100 db tone, with the light’s termination
coinciding with the onset of the tone. The ITI
ranged from 10 to 25 sec, with a mean of 15 sec.

Prior to the first trial the sliding knob was always
located at the midpoint of the channel. The appropri-
ate response was moving the knob to one side of the
shuttlebox on one trial, moving it to the opposite
side on the next, and alternating for the remaining
trials. Although the subject was informed only of the
possibility of escape, tone avoidance was possible by
making the appropriate response after the onset of
the warning light and before the onset of the tone.
A successful avoidance response (i.e., a response
latency of less than 5 sec) terminated the warning
light and prevented exposure to the tone. A success-
ful escape response (i.e., a response latency between
5 and 10 sec) terminated the tone. If the subject
failed to avoid or to escape, the tone was terminated
automatically and a response latency of 10 sec was
recorded.


Dependent Measures 


Five test task-dependent measures were used to
assess performance. These were: (a) the “conditional
probability” of an escape or avoidance response, de-
fined as the percentage of times a successful response
(escape or avoidance) was preceded by a successful 
response; (b) the number of failures to avoid or 
escape, defined as the number of responses with 
latencies of 10 sec; (c) success criterion, defined as 
the number of trials needed to reach either an avoid- 
ance or an escape criterion of three consecutive escape 
or avoidance responses (subjects who failed to reach 
either were assigned scores of 21); (d) the mean 
overall response latency for the 20 trials; and (e) the 
number of trials needed to reach an avoidance cri- 
terion of three consecutive avoidance responses. Al- 
though partially redundant with (c), (e) was in- 
cluded as an exploratory measure to determine if any 
differences existed between internals and externals in 
their acquisition of avoidance responses. Latency was 
included as a dependent variable, since previous 
studies have utilized it, although Buchwald, Coyne, 
and Cole (1978) argue that it may be a poor measure 
of motivation in a learned helplessness experiment. 
At least two learned helplessness studies (Douglas & 
Anisman, 1975; Tennen & Eller, 1977) have failed to 
find significant main effects for this variable. The
other three dependent variables have all been used
previously in learned helplessness research. The con-
ditional probability measure is assumed to assess cog-
nitive learning ability, whereas the others index moti-
vational processes (Klein & Seligman, 1976).
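
The five performance measures can be made concrete with a short scoring sketch that works from one subject's list of test-phase latencies. This is an illustrative reading of the definitions above, not the scoring procedure actually used. Several details are assumptions: the conditional probability is taken literally as the percentage of successful responses (from the second trial on) that were preceded by a successful response, it is set to 0 when there are no such responses, a criterion score is the trial number on which the third consecutive response occurs, and the score of 21 is applied to the avoidance criterion as well as to the success criterion.

```python
from statistics import mean

AVOID_BELOW = 5.0   # latency under 5 sec = avoidance response
FAILURE_AT = 10.0   # latency of 10 sec = failure to avoid or escape
NOT_REACHED = 21    # score assigned when a criterion is never met


def trials_to_criterion(hits, run=3):
    """Trial number on which `run` consecutive hits is first completed."""
    streak = 0
    for trial, hit in enumerate(hits, start=1):
        streak = streak + 1 if hit else 0
        if streak == run:
            return trial
    return NOT_REACHED


def score_test_phase(latencies):
    """Compute measures (a) through (e) from a list of latencies in sec."""
    success = [t < FAILURE_AT for t in latencies]   # escape or avoidance
    avoided = [t < AVOID_BELOW for t in latencies]
    # (a) successes on trials 2..n that were preceded by a success
    later = success[1:]
    preceded = sum(prev and cur for prev, cur in zip(success, later))
    cond_prob = 100.0 * preceded / sum(later) if any(later) else 0.0
    return {
        "conditional_probability": cond_prob,               # (a)
        "failures": success.count(False),                   # (b)
        "success_criterion": trials_to_criterion(success),  # (c)
        "mean_latency": mean(latencies),                    # (d)
        "avoidance_criterion": trials_to_criterion(avoided),  # (e)
    }


# Hypothetical eight-trial record (the experiment itself used 20 trials).
example = [10.0, 10.0, 4.0, 4.0, 4.0, 6.0, 10.0, 3.0]
scores = score_test_phase(example)
print(scores)
```

In the hypothetical record, five of the later trials are successes and three of those follow a success, so the conditional probability is 60%. Both criteria are met on Trial 5.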

Additionally, six questionnaire measures were in- 
cluded at the end of the experiment; these required 
the subject to rate her treatment experiences on 7- 
point bipolar scales. These were: (a) her motivation 
to solve the problem; (b) the extent to which she 
felt in control of her success and failure; (c) the level 
of her frustration; (d) the level of her irritation; (e) 
the level of her depression; and (f) the aversiveness 
of the tone. The second of these measures was in- 
tended as a manipulation check; the others were in- 
cluded as crude exploratory measures of possible 
differences in subjective state. 


Results 


Self-Report Measures 


A 2 (Internal vs. External Personal Control 
Orientation) X 3 (Escapable, Control, or In- 
escapable Treatment Condition) analysis of 
variance (ANOVA) was performed on each of 
the self-report measures. Only one significant
effect was revealed. A main effect for treatment
was found for subjects’ perceptions of control,
F(2, 56) = 3.55, p < .035. Simple effects
analysis revealed that subjects in the inescap-
able condition perceived less personal control
than did control condition subjects, F(1, 56)
= 6.23, p < .016, whereas control condition
subjects did not differ from escapable condi-
tion subjects, F(1, 56) < 1, ns. This suggests
that the experimental manipulation of per-
ceived control over events was successful.

2 At least two other studies (Gatchell & Proctor,
1976; Hiroto & Seligman, 1975) have employed a
control group that was told that noise was inevitable.
Neither study found interference effects in the control
group subjects. Apparently, informing subjects not to
try to do anything about the noise precludes forma-
tion of perceptions of helplessness (Buchwald, Coyne,
& Cole, 1978, p. 184).


Random Assignment of Subjects 


To insure that the assignment of subjects 
to conditions had been random, the same 2 X 
3 ANOVA was performed on subjects’ personal 
control scores. Neither the interaction nor the
main effect for treatment condition was sig-
nificant, both Fs < 1. A main effect for per-
sonal control orientation was found, F(1, 56)
= 1248.18, p < .001, indicating that subjects 
in the internal conditions were significantly 
more internal than subjects in the external 
conditions. 


Multivariate Clustering 


Since the conditional probability measure 
was assumed to assess cognitive learning abil- 
ity, it was analyzed apart from the other de- 
pendent variables, which were assumed to 
assess motivation. Also, since the avoidance 
criterion variable was included as an explora- 
tory measure, and because the use of response 
latencies has been questioned previously, 
these variables, along with the number of fail- 
ures to escape or avoid and the success cri- 
terion variables, were subjected to a test of 
the assumption of homogeneity of the vari-
ance-covariance matrices (Bock, 1975, p.
412). This was done in order to determine the
appropriateness of analyzing these variables
as a single multivariate vector. This test indi-
cated that the four variables together violated
the assumption, χ²(50) = 104.83, p < .001.
Accordingly, a principal components analysis
with iterations utilizing the varimax rotation
procedure (Nie, Hull, Jenkins, Steinbrenner,
& Bent, 1975) was undertaken to determine
an appropriate clustering of the four vari-
ables. The principal components analysis re-
sulted in two unrotated factors with eigen-
values greater than 1. The remaining two
unrotated factors had eigenvalues of .15 or
less and accounted for less than 5% of the
variance. The first rotated factor accounted
for 72.5% of the variance and consisted of
the number of failures to avoid or escape and
the success criterion variables. The second
rotated factor accounted for 27.5% of the
variance and consisted of the avoidance cri-
terion and response latency variables. Thus
it was determined that separate multivariate
analyses of variance would be performed on
the variables represented by these two factors.
All subsequent analyses utilized the MANOVA
program (Clyde, 1969), and planned com-
parisons were employed as justified by the
specific predictions of the study (Hays, 1973,
p. 582; Winer, 1971, pp. 175, 384). The
MANOVA program uses least squares analysis
when performing univariate analyses of vari-
ance.


Performance of Internal Versus 
External Subjects 


Planned comparisons were used to contrast
internals to externals in each of the three
treatment conditions. The planned comparison
F value for each dependent variable is dis-
played in Table 1. The means for all conditions on
the performance measures are displayed in
Table 2. Figure 1 displays graphically the
results of the conditional probability measure,
typical of the overall pattern of responses.

Escapable condition. As can be seen in
Table 2, there were no significant differences
between internals and externals on any of the
performance measures in the escapable con-
dition. This supports the prediction that ex-
ternals perform as well as internals when
given explicit response cues.

Control condition. As predicted, when
there was no prior experience with explicit
response cues, as in the control condition,
internals outperformed externals on the con-
ditional probability, number of failures, and
success criterion measures.³ No significant
differences were obtained for the response
latency and avoidance criterion measures, al-
though the means were in the predicted direc-
tion.

Inescapable condition. As predicted for
the inescapable condition, where there were
explicit cues indicating that subjects failed to
control the outcome, internals performed sig-
nificantly worse than externals did on the
conditional probability and number of fail-
ures measures. Differences on the other mea-
sures were in the predicted direction but were
nonsignificant.

Table 1
Planned Comparison F Values for the Multivariate and Univariate Contrasts

                            Multivariate
                             contrast F            Univariate contrast F
Planned comparison
and condition              NF & SC   RL & AC   CP        NF        SC      RL     AC

Internal vs. external
subjects
  Escapable                <1        <1        <1        <1        <1      [?]    [?]
  Control                  2.51*     1.58      5.04**    [?]       [?]     [?]    [?]
  Inescapable              2.58*     <1        4.10**    [?]       [?]     [?]    [?]
Internal subjects
  Escapable vs. control    <1        7.04***   <1        1.13      [?]     [?]    [?]
  Inescapable vs. control  10.05***  4.43**    19.28***  20.35***  [?]     [?]    [?]
External subjects
  Escapable vs. control    2.39*     3.44**    5.05**    4.74**    4.48**  1.45   <1
  Inescapable vs. control  <1        <1        <1        <1        <1      <1     <1

Note. The degrees of freedom for all multivariate contrasts are 2 and 55, whereas they are 1 and 56 for all
univariate contrasts. CP = conditional probability measure. NF = number of failures to escape or avoid
measure. SC = success criterion measure. RL = response latency measure. AC = avoidance criterion
measure.
* p < .10. ** p < .05. *** p < .01.

Figure 1. Conditional probability by treatment condition (escapable, control, inescapable), plotted separately for internals and externals.


Performance of Internals Across 
Treatment Conditions 


Escapable versus control condition. It was
also predicted that planned comparisons
would not reveal significant differences be-
tween escapable condition-internal subjects
and control condition-internal subjects. With
the exception of the avoidance criterion mea-
sure, no significant effects were obtained. The
multivariate comparison for the combined
avoidance criterion and response latency mea-
sures was significant, and inspection of the
means indicates that internals in the control
condition, contrary to prediction, were supe-
rior in performance to the internals in the
escapable condition. A brief comment may be
in order concerning this unexpected finding.
For the other measures of cognitive learning
and motivation, explicit response cues indi-
cating that internals were successful in escap-
ing the aversive stimuli did not serve to en-
hance their subsequent test task performance
above that of control condition-internal sub-
jects. However, the explicit cues may have
led internals to overlearn an escape response
to the extent that avoidance learning on the
subsequent task was debilitated. This may
occur only under conditions of explicit re-
sponse cues, but further research is indicated
on this topic. Given success, internals may be
prone to stereotypic responses and fail to
explore the possibilities of even more bene-
ficial behaviors.

3 Healy (1969) has noted that since univariate
confidence intervals are located along a linear axis,
whereas bivariate confidence intervals are elliptical,
it is possible to reject both univariate null hypotheses
while failing to reject the bivariate null hypothesis.
This is known as Rao’s paradox, and Baron (1978)
has published results on its occurrence for a trivariate
case.

Table 2
Means for Each Performance Measure

                                   Treatment condition
Measure                       Escapable   Control   Inescapable

Internal orientation
  Conditional probability       97.8        93.0       39.1
  Number of failures             0.3         1.3       11.0
  Success criterion              3.4         4.2       12.0
  Response latency               5.9         4.5        7.2
  Avoidance criterion           19.4        12.3       17.4
External orientation
  Conditional probability       94.6        65.4       63.8
  Number of failures             0.8         5.7        6.2
  Success criterion              4.0         9.3        8.7
  Response latency               5.0         6.2        6.4
  Avoidance criterion           16.6        16.7       [?]

Note. With the exception of the conditional prob-
ability measure, low numbers represent better
performance. In the internal personal control orien-
tation, escapable n = 17, control n = 12, and ines-
capable n = 9. In the external personal control
orientation, escapable n = 8, control n = 7, and
inescapable n = 9.
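
As a reading aid, the arithmetic linking the reported cell sizes to the reported degrees of freedom, and the direction of the internal-external differences on the conditional probability measure, can be checked in a few lines. This is a sketch rather than part of the original analysis: the cell sizes and means are those given in Table 2 and its note, the variable names are hypothetical, and the multivariate degrees of freedom assume the standard exact F for a single-degree-of-freedom contrast on p variables, with df = (p, error df - p + 1).

```python
CONDITIONS = ("escapable", "control", "inescapable")

# Cell sizes from the note to Table 2.
cell_n = {
    ("internal", "escapable"): 17,
    ("internal", "control"): 12,
    ("internal", "inescapable"): 9,
    ("external", "escapable"): 8,
    ("external", "control"): 7,
    ("external", "inescapable"): 9,
}

# Conditional probability means (percentages) from Table 2.
cp_mean = {
    ("internal", "escapable"): 97.8,
    ("internal", "control"): 93.0,
    ("internal", "inescapable"): 39.1,
    ("external", "escapable"): 94.6,
    ("external", "control"): 65.4,
    ("external", "inescapable"): 63.8,
}

total_n = sum(cell_n.values())            # all subjects in the 2 X 3 design
error_df = total_n - len(cell_n)          # within-cell degrees of freedom
n_vars = 2                                # each multivariate vector has 2 measures
multivariate_df = (n_vars, error_df - n_vars + 1)

# Internal minus external difference in each treatment condition.
cp_difference = {
    cond: round(cp_mean[("internal", cond)] - cp_mean[("external", cond)], 1)
    for cond in CONDITIONS
}

print(total_n, error_df, multivariate_df)
print(cp_difference)
```

The totals agree with the 62 subjects described in the Method section and with the error degrees of freedom reported for the univariate (1 and 56) and multivariate (2 and 55) contrasts. The differences follow the predicted ordering: near zero in the escapable condition, favoring internals in the control condition, and favoring externals in the inescapable condition.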

Inescapable versus control condition. As 
predicted, planned comparisons revealed that 
the performance of inescapable condition- 


Ab 
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internal subjects was ‘Significantly worse than 
that of control condition-internal subjects for 
all measures. 

Summary of response pattern for internal subjects. Overall,
internals displayed the predicted learned helplessness
pattern. That is, the performance of internal subjects in
the inescapable condition was inferior to that of internals
in the control condition. With the exception of the
avoidance criterion measure, the performance of internal
subjects in the escapable condition was not different from
that of internals in the control condition. For the
avoidance criterion measure, control condition-internal
subjects outperformed escapable condition-internal subjects.

It appears that an explicit response cue indicating that
they are escaping an aversive stimulus does not serve to
enhance the performance of internal subjects above that of
control condition-internal subjects, although it can serve
to interfere with subsequent avoidance learning. By
contrast, an explicit response cue indicating that they are
failing to escape an aversive stimulus does serve to
debilitate the performance of internals in comparison to
that of control condition-internal subjects.


Performance of Externals Across 
Treatment Conditions 


Escapable versus control condition. As predicted, planned
comparisons revealed that the performance of escapable
condition-external subjects was superior to the performance
of control condition-external subjects on the conditional
probability, number of [illegible], and success criterion
measures. No univariate differences were revealed on the
response latency and avoidance criterion measures, although
the multivariate [illegible] was significant. Computation of
the discriminant function scores, along with [illegible]
means, indicated that the multivariate difference was in the
predicted direction.

Inescapable versus control condition. No specific
predictions had been made concerning the performance of
external subjects in the inescapable condition relative to
the performance of external subjects in the control
condition. To remain consistent with the previous
analyses, however, planned comparisons be- 
tween these groups were conducted. No sig- 
nificant differences were obtained on any of 
the performance measures. Since planned com- 
parisons are an extremely powerful technique 
for detecting differences, should they exist, 
the failure to obtain any significant differences 
for any of the univariate or multivariate con- 
trasts should underscore the apparent lack of 
differences between the performances of ex- 
ternal subjects in the inescapable and control 
conditions. 

Summary of response pattern for external 
subjects. It appears that externals did not 
display learned helplessness, since there were 
no differences between the control condition 
and the inescapable condition for any of the 
dependent measures. As predicted, the performance of
externals in the escapable condition was superior to that of
externals in the control condition for most of the
performance measures. It would appear that an explicit
response cue indicating that they are escaping an aversive
stimulus does serve to enhance the subsequent performance of
external subjects above that of control condition-external
subjects. By contrast, an explicit cue indicating that they
are failing to escape an aversive stimulus does not serve to
debilitate the performance of externals in comparison to
control condition-external subjects.



Discussion 


Additional conclusions regarding cue explication and the
performances of internals and externals would be possible,
had the present design included nonexplicated escapable and
inescapable conditions, as in Hiroto’s (1974) study.
Nonetheless, the cue explication hypothesis (Lefcourt, 1967)
received substantive support. Following escapable treatment
in an apparatus equipped with response cue lights, the
performance of externals was significantly enhanced above
that of control condition-external subjects. Additionally,
these escapable condition-external subjects equalled in
performance the escapable condition-internal subjects. It
appears that making explicit to the external subject the
contingency between response and outcome is one means of
cue explication. The finding that inescapable 
condition-internal subjects performed worse 
than both control condition-internal subjects 
and inescapable condition-external subjects 
when explicit response cues were provided 
suggests that cue explication may have a 
negative counterpart for internal subjects. 
When internals are made aware of the lack of 
availability of reinforcements, or the lack of 
contingency between behavior and outcome, 
their subsequent performance is debilitated 
to a level inferior to that of externals. This point
suggests a potential for behavioral problems in persons
with extremely internal control orientations.

Previous researchers (DuCette & Wolk, 
1973) have argued that internals are cog- 
nitively superior to externals because they 
take fewer trials to criterion on a skill task. 
In the present study, internals and externals 
in the escapable condition did not differ in 
their subsequent acquisition of the escape- 
avoidance behavior. Thus, the provision of ex- 
plicit cue responses can result in externals 
equalling internals in learning ability. It is 
important to note that this effect was obtained 
on a subsequent (albeit highly similar) task 
that did not utilize explicit cues. This indi- 
cates that the motivational and cognitive 
effects of cue explication during the treatment 
phase generalized to the second task. Unlike
much of the locus of control research that at- 
tempts to delineate differences between inter- 
nals and externals, this study seems to refocus 
attention on the neglected issue of alleviating 
or altering the behavioral consequences of an 
external perception of control. The present 
results are even more dramatic, considering 
that they were obtained from individuals with 
extreme external control orientations. Whether 
our findings are generalizable to extreme males 
as well as females is of course an empirical 
question, although the lack of consistent sex 
differences in the internal-external literature 
would suggest such applicability. 

The present results also lead us to consider 
some possible antecedents and consequences 
of an external expectancy for control. Given 
that the performance of externals in the con- 
trol condition was inferior to that of externals 
in the escapable condition, it may be argued,
as did Hiroto (1974), that one consequence 
of an external orientation is some degree of 
pre-extant helplessness for skill tasks. Since 
externals appeared to be helpless prior to en- 
tering the experimental situation, their altered 
performance in the escapable condition might 
best be termed “learned effectiveness.” Fur- 
ther, since there were no differences in per- 
formance between control condition-externals 
and inescapable condition-externals, it is clear 
that the pre-extant helplessness of externals 
in the inescapable condition was not increased 
by the treatment. Since helplessness is pro- 
duced (in instrumental induction studies) by 
failure to control an aversive experience, we 
propose that one developmental antecedent of 
externality is repeated failure to control aver- 
sive experiences. However, this speculation on 
externality is qualified by Gregory’s (1978) 
findings that the Rotter (1966) I-E scale and 
the Mirels (1970) personal fate control scale 
are predictive only of control over negative 
outcomes or aversive events. Externality for 
nonaversive events or positive outcomes 
would, of course, be expected to follow from 
different antecedents. 


Locus of Control, Cue Explication, 
and Learned Effectiveness


The present findings suggest a reinterpreta- 
tion of another study involving locus of con- 
trol and learned helplessness. Following a pre- 
treatment consisting of contingent or noncon- 
tingent feedback on a concept formation task 
represented as a test of “spatial reasoning 
ability,” Cohen, Rothbart, and Phillips 
(1976) found what they interpreted as learned 
helplessness for external subjects on both a 
Stroop color task and a maze tracing task. A 
performance decrement for internal subjects 
was found only on the maze tracing task. In 
the contingent condition, subjects received 
accurate feedback concerning the correctness 
of their choice after each concept formation 
slide. Cue explication theory would predict 
that external subjects would perform as well 
in this condition as internal subjects, as in- 
deed was found. Although the performance of 
external subjects in the contingent condition 
was superior to that of external subjects in the
noncontingent condition, one cannot determine whether
learned helplessness or learned effectiveness occurred. As
Buchwald, Coyne, and Cole (1978) note, in the absence of a
control condition it is impossible to know whether
performance enhancement (or learned effectiveness, as we
have termed it) or interference has occurred. Given the
procedure employed by Cohen et al. in the contingent
condition, it is possible that their external subjects
displayed learned effectiveness and, as in the present
study, their internal subjects displayed learned
helplessness. The finding that internal subjects displayed
performance decrements on only the maze tracing task may
have been due to its greater similarity to the training
task.

Finally, it should be noted that one unpublished study
(Benson & Kennelly, Note 2) has found results identical to
the present study. Benson and Kennelly treated male and
female subjects who had scored at the extremes of
Levenson’s (1974) locus of control scale with explicit
contingent or noncontingent feedback on a concept formation
task. Using anagram problems as the test task, they found
that “relative to control condition subjects, the
performance of internals was impaired by pretreatment with
insoluble problems. The performance of externals was not
impaired by pretreatment with insoluble problems, [illegible]
facilitated by pretreatment with [illegible] problems
(p. 7).” Combined, the results of Cohen et al. (1976) and
Benson and Kennelly extend and generalize the notion that
cue explication can improve significantly the performance
of externals.


Conclusions 


The present study has implications for research in two
areas. Research on learned helplessness might include
investigation of the effects of minor procedural variations
on the behaviors of subjects with individual differences
such as locus of control. Locus of control research could
benefit by studying the effects of explicit cues on the
behavior of internal and external locus of control subjects,
refocusing some attention on this inadequately researched
area.
Recent issues of this journal carried an announcement that the Journal of
Personality and Social Psychology would appear in January 1980 under new
policies and in a new sectioned format. However, in order to publish the
backlog of manuscripts accepted under present policies, the first three issues
of 1980 will continue to carry articles accepted under Clyde Hendrick’s Acting
Editorship.

This journal will appear in April as a sectioned journal edited by Melvin
Manis, Ivan D. Steiner, and Robert Hogan, as previously announced.
An Empirical Analysis of the Correlations Between Leadership
Status and Participation Rates Reported in the Literature 


R. Timothy Stein and Tamar Heller 
University of Illinois at Chicago Circle 


Many studies of small group leadership have reported a positive relationship
between leadership status and the rate of verbal participation. The
leadership-participation relationship has been attributed to (a) the
performance of task leadership behaviors, (b) task skill and knowledge,
(c) social status characteristics, and (d) the presence of observers.
Hypotheses based on these four explanations were generated and tested using
72 correlations reported in the literature as data points. The relationship
of a large number of situational variables to the leadership-participation
rate relationship was also examined. In a multiple regression analysis, four
variables accounted for 63% of the variance in the leadership-participation
rate correlations. The first three explanations were supported. They can be
integrated if the moderating effect of task competence and social status on
the performance of task leadership behavior is recognized.


A positive relationship between verbal participation rates
and leadership status is frequently reported in the small
group literature. In reviewing that literature, the authors
uncovered 77 correlations between the variables. The mean
was .65, and the range was from -.48 (Gustafson & Harrell,
1970) to .98 (Bates, 1952); only 4 were negative. In these
studies new groups were formed for the research or for
classroom activities. Either observers or the group members
rated the participants on leadership. These ratings were
measures of leadership status because all the group members,
and not just those who emerged to a leadership position,
were included. Measures of the relative amounts of talking
by the group members were obtained from audiotapes or from
observers’ records (e.g., the number of words spoken).

Other lab studies have shown a covariation 
between participation and leadership status. 
Several authors report higher mean leadership 
ratings for high participants (e.g., Carter, Haythorn,
Shriver, & Lanzetta, 1950; Gintner & Lindskold, 1975). Also,
experimenters have manipulated the relative amounts of
talking by group members. Parallel shifts in peer judgments
about the participants’ leadership followed (e.g., Bavelas,
Hastorf, Gross, & Kite, 1965; Zdep & Oakes, 1967).

The covariation of leadership status and 
participation rates has been attributed to (a) 
the performance of task roles, (b) task skill 
and knowledge, (c) social status character- 
istics, (d) the presence of observers, and (e) 
motivation. Although an extensive literature 
points to a relationship between leadership 
and participation, the research described here 
is the first to compare the explanations to 
determine how well they account for the relationship.
Hypotheses were generated from the
first four explanations.¹ Additional hypotheses
were formulated to include two situational 
parameters, group size and observer bias. 
Seventy-two of the correlations reported in 
the literature were used as data. 

First, it is possible that the role performed 
by leaders requires their participation and 
wins them status. The most common division 
of roles in small group interaction is the dis- 
tinction between task and group maintenance 
leadership (Hare, 1976, p. 145). Task lead- 
ership behaviors are defined as “behaviors of 
the individual related to his efforts to assist 
the group in achieving goals toward which the 
group is oriented” (Bales, 1958). Group 
maintenance leadership behaviors are “be- 
haviors of the individual related to his efforts 
to establish and maintain cordial and socially 
satisfying relations with other group mem- 
bers” (Bales, 1958). The development of task 
leadership roles is the major thesis of emer- 
gent leadership theories (Bales, 1953; Hol- 
lander, 1958; Stein, Hoffman, Cooley, & 
Pearse, 1979; Stogdill, 1959). Studies that 
have compared leaders to nonleaders in the 
performance of specific behaviors have clearly 
indicated that leaders perform more task- 
related behaviors than do nonleaders (e.g., 
Carter et al., 1950; Kirscht, Lodahl, & Haire, 
1959; Morris & Hackman, 1969). In sum- 
marizing this research, Stein and Heller (Note 
1) noted that leaders were significantly more 
active than nonleaders in performing task 
leadership behaviors in 51 of 76 (67%) sta- 
tistical tests.* The task leadership activities 
included problem identification, proposing so- 
lutions, seeking and giving information and 
opinions, and initiating structure. The per- 
formance of group maintenance activities dif- 
ferentiated leaders from nonleaders in only 5 
of 24 tests. The following hypotheses are based on the task
facilitation explanation: Hypothesis 1a:
Leadership-participation rate correlations in which task
leadership measures are used are higher than those involving
maintenance functions. Hypothesis 1b: The more general the
task leadership measure, the higher is the correlation with
participation rates.

Several researchers have suggested that the
leadership-participation rate relationship might be a
reflection of technical competence (e.g., Gintner &
Lindskold, 1975). Task [illegible] has been shown to be
related to leadership in both field (Clifford [illegible],
1964; Crockett, 1955) and lab studies (Heinicke & Bales,
1953; Moment & Zaleznik, 1963). Hemphill (1961) found that
members of new groups who believed that they were more
competent than others performed a higher number of
leadership behaviors. In some studies, the quality and
quantity of participation were manipulated (Gintner &
Lindskold, 1975; Jaffee & Lucas, 1969; Reilly & Jaffee,
1970; Sorrentino & Boutillier, 1975). Even when confederates
suggested the same solution to the assigned problem in
different groups, their leadership ratings were dependent
upon how much they talked. Thus, the quality of the solution
could not account for the leadership-participation rate
relationship. However, it is possible that both leadership
status and participation rate differences occur because
those who offer better solutions make more attempts to
influence others. In that case, competency may be a basis
for both influence and participation in tasks requiring
skill. Hypothesis 2: The value of the correlation between
leadership status and participation is directly related to
the skill requirements of the task.

Third, Blau (1964) has theorized that members who impress
others as being capable of providing task assistance are
permitted more participation time. The personal
characteristics that have been tied to such judgments
include intelligence, personality traits, and social status
(race, sex, income). Of these variables, only sexual status
seems to be strongly related to both participation rates and
leadership. Comparisons of sexes have shown men to be more
verbose and more influential (e.g., Strodtbeck, James, &
Hawkins, 1957), and more proactively involved in performing
the task (e.g., Strodtbeck & Mann, 1956).


¹ A hypothesis was not formulated for the motivation
explanation because data [illegible] perceptions of
motivation [illegible] with the leadership-participation
rate correlations [illegible] the literature.

² [illegible] higher proportion in only 5 of [illegible]
comparisons (Stein & Heller, Note 1). [illegible] indicate
that leadership status is [illegible] of task leadership
acts, and [illegible] one’s behavior that is task
leadership.

Hypothesis 3: The correlation between 
leadership status and participation is higher 
in mixed-sex than in same-sex groups. 

Finally, Wilson (1971) has suggested that the
leadership-participation rate relationship is an
experimental artifact. Members with high assertiveness and
ascendancy needs are thought to respond to the presence of
an observer with high participation and ingratiating
behaviors. The other members rate these people high on
leadership because they assume that the observer identifies
leadership with dominance, influence, and high participation
rates.

Hypothesis 4: The correlation between 
leadership and participation is lower in groups 
unaware of being observed. 

Hypotheses were formulated for two situa- 
tional variables, group size and observer bias. 
Other independent variables were included 
for exploratory research. Several researchers 
have found that the differences in participa- 
tion rates among group members are depen- 
dent upon group size (see Hare, 1976, pp. 
219-221). After reviewing several studies, Thomas and Fink
(1963) concluded that greater size was associated with an
increase in organization and division of labor. These
findings suggest that (Hypothesis 5) the correlation between
participation rates and leadership status is directly
related to group size.

Greater observer dependency on participation rates in
formulating leadership judgments has been suggested in three
studies (Juola, 1957; Kanungo, 1966; Stein, 1977).
Observers, but not the group members, failed to
differentiate between task and maintenance functions, which
suggests that the observers used a more global criterion in
evaluating leadership.

If the observers are so biased, then (Hypothesis 6) the
correlations between leadership and participation involving
observer leadership ratings are higher than those based on
peer ratings or self-ratings.
Table 1
Correlations of Leadership Status and
Participation Rates

Study                                 Correlations
Bass, 1949                            .93, .86
Bass, Wurster, Doll, & Clair, 1953    .77, .60
Bates, 1952                           .85
Berkowitz, 1956                       .34, .48
Borgatta, 1954                        [illegible], .35, .21, [illegible], .48,
                                      .25, .22, .16
Borgatta & Bales, 1956                .53, -.31, .18, .79, .79,
                                      .35, .43, .34, .55
Burke, 1974                           .81
Burroughs & Jaffe, 1969               .39, .45, .13, .30, .24
Gustafson & Harrell, 1970             .77, .80, .50, .92, .88,
                                      .40, .68, .78, -.02,
                                      .66, .67, -.48, .82,
                                      .80, .87, -.17, .51,
                                      .71, .70, .34, .56, .77,
                                      .89, .47, .83, .79,
                                      .78, .05
Heinicke, 1950                        .91
Jaffe & Lucas, 1969                   .63
Juola, 1957                           .89, .81
Morris & Hackman, 1969                .46, .65, .52
Slater, 1955                          .80, .75, .38, .48, .51,
                                      .10
Strodtbeck & Hook, 1961               .69


Method 
Dependent Measure 


Seventy-two correlations (Pearson or Spearman), 
reported in the literature between emergent leader- 
ship status and the verbal participation rates of 
group members were used as data for the dependent 
variable (see Table 1). The use of more than one 
correlation from a study reflects differences in tasks, 
group composition, the degree of group consensus 
on who was leader, the length of interaction covered, or
the leadership or participation measures used. Three of the
values (from Morris & Hackman, 1969) include both emergent
and appointed leaders. No other data of this nature were
found for formal leaders. The authors transformed the
correlations to Fisher z scores for the statistical tests.
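
The Fisher z transformation mentioned above can be sketched in a few lines. This is only an illustration of the standard formula, z = atanh(r); the function names are hypothetical and nothing here reproduces the authors' actual computations. The three example values are taken from Table 1 purely for demonstration.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher z transformation: z = atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return math.atanh(r)


def inverse_fisher_z(z: float) -> float:
    """Map a Fisher z score back to the correlation metric."""
    return math.tanh(z)


# Illustrative input only: three of the correlations listed in Table 1.
example_rs = [0.81, 0.63, -0.48]
example_zs = [fisher_z(r) for r in example_rs]
```

The transformation stretches values near +1 or -1, which is why statistical tests on correlations are usually run on z scores rather than on the raw coefficients.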

The leadership-participation rate correlations are not
equally stable. The ns for the correlations differ, and in
some instances median rhos were used. However, the
differences in error variance due to different sample sizes
are randomly distributed; sample size was uncorrelated with
the z transformations of the leadership-participation rate
correlations (r = -.12, ns). For this reason, the values
did not need to be weighted.


³ Four correlations were excluded because of insufficient
information about the task and subject population (.96,
French, 1950; reported in Bass, 1954) and/or because an
interaction time was used that far exceeded that of the
other studies (.95 and .94, Norfleet, 1948; and .98, Bates,
1952). The .89 value reported by Stang, Castellaneta,
Constantindis, and Fortuno (1976) came to the authors’
attention too late to be included in the analysis.


Independent Measures 


The study included 28 independent measures 
related to (a) the type of task, (b) the character- 
istics of the task, (c) the nature of the group, and 
(d) the type of leadership measure. The data pro- 
vided on these variables summarize the methodo- 
logical and measurement choices made by past re- 
searchers. From this summary, possible limitations 
on the generalizability of the leadership-participa- 
tion rate relationship can be identified. 

Task type. All the correlations are from groups assigned a
task. The tasks were categorized according to four types:
(a) problem solving, deriving a group solution to a problem
(n = 16); (b) verbal production, preparing verbal
presentations or creating stories (n = 20); (c) opinion,
discussing issues without having to reach a group decision
(n = 34); and (d) manipulation, assembling objects (n = 2).

Task characteristics. The task dimensions included the six
identified by Shaw (1971, 1973) and three additional
measures: (a) skill required, requirements for special
training or knowledge; (b) difficulty, including time
pressures, skill or knowledge required, number of
operations, knowledge of the desired outcome, amount of
effort needed, and the degree of the subjects’ familiarity
with the task; (c) goal clarity; (d) solution multiplicity,
the number of possible solutions and the extent to which the
adequacy of a solution could be verified; (e) cooperation
requirements, requirements for integrated group action;
(f) manipulation-intellectual requirements, intellectual
operations versus physical operations; (g) population
familiarity, the occurrence of the task in the larger
society; (h) intrinsic interest, interest created by the
nature of the task; and (i) external motivation,
consequences of performance (e.g., money, grades, the
judgment of esteemed others). Eight-point rating scales were
used (after Shaw, 1973). Higher scale values indicate a
higher level of the dimension in question (e.g., a more
difficult task).

Nature of groups. In order to identify different 
types of groups, five dichotomous variables were set 
up. For the first, a score of 1 was assigned if all the 
group members were males, and 0 for groups of
other compositions. In a similar fashion, variables 
were defined for all female members, members of 
both sexes, college students, and group members 
who were aware that they were being observed. 
Two additional variables were the number of group 
members, and the group members’ knowledge of 
one another prior to the meeting. A 3-point scale 
was used for the last variable. 
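
The dichotomous coding described above can be illustrated with a small sketch. The field names and the `code_group` helper are hypothetical, introduced only for this example; the original coding sheets are not available, so this shows the general idea of 0/1 coding rather than the authors' procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupCoding:
    """Five dichotomous (0/1) descriptors of a group; names are hypothetical."""
    all_male: int
    all_female: int
    mixed_sex: int
    college: int
    aware_of_observer: int


def code_group(n_males: int, n_females: int,
               college_students: bool, aware_of_observer: bool) -> GroupCoding:
    """Assign 1 when a condition holds for the group and 0 otherwise."""
    if n_males < 0 or n_females < 0 or n_males + n_females == 0:
        raise ValueError("a group needs at least one member")
    return GroupCoding(
        all_male=int(n_females == 0),
        all_female=int(n_males == 0),
        mixed_sex=int(n_males > 0 and n_females > 0),
        college=int(college_students),
        aware_of_observer=int(aware_of_observer),
    )
```

Under this coding exactly one of the three composition variables equals 1 for any group, so each variable splits the correlations into cases where the condition holds and cases where it does not.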
Leadership measures. The characteristics of the leadership
measures were assessed with eight variables. One variable
indicated (a) whether the measure was of task leadership or
maintenance leadership.⁴ Other dichotomous variables
indicated whether or not the leadership ratings were made by
(b) peers or (c) observers, or (d) if they were
self-ratings. Two time measures were used: (e) the number of
minutes the group had met before the first leadership
measure was taken and (f) the number of minutes before the
last leadership measure. (g) Whether or not the leadership
rating was meaningful to the group members, that is, if the
selected leader actually functioned in any capacity, was
noted. (h) Finally, the breadth of the leadership measure
(i.e., how many different types of behavior it included) was
rated on an 8-point scale.


Interrater Reliabilities 


Two judges (including the first author) rated the leadership
functions and descriptions of the task without knowledge of
the correlations between participation and leadership. The
Pearson correlations between the judges’ ratings ranged from
.79 to .97 (see Table 2).
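
An interrater reliability of this kind is a Pearson correlation between the two judges' ratings of the same items. The sketch below shows the computation; the ratings are invented for illustration (they are not the study's data), and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical 8-point ratings of six tasks by two judges (not real data).
judge_a = [2, 5, 7, 3, 8, 4]
judge_b = [3, 5, 6, 3, 8, 5]
reliability = pearson_r(judge_a, judge_b)
```

Values near 1 indicate that the two judges ordered the tasks in nearly the same way on the rated dimension.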


Results 


The relationships between the dependent and independent
variables are presented in Tables 2 and 3. If a rating scale
was used for an independent variable, the scores were
correlated with the Fisher z transformations of the
leadership-participation correlations (see Table 2). If a
variable was dichotomous, t tests were used. The means are
given in Table 3. The correlations for which the condition
was fulfilled (e.g., those from groups of all males) are
listed as “included cases.” The correlations for which the
condition was not fulfilled (e.g., those from groups other
than all male) are listed as “excluded cases.”


⁴ Some experiments used [illegible] “Who was the real leader
of [illegible] (Morris & Hackman, 1969). However,
[illegible] others (e.g., Bales, 1953) have shown that
[illegible] name task leaders when asked in a [illegible]
identify their leaders. For this reason, the general
leadership measures were included with the task leadership
measures. The data also support this decision. The mean of
the 13 leadership-participation rate correlations in which
general leadership measures were used was .74. [illegible]
significantly different from the mean of [illegible]
involving specific measures of task leadership.
Table 2
Situational Factors: Interrater Reliabilities and Correlations With the Degree of Association
Between Participation Rates and Leadership

                                                      Correlation
                               Interrater
Variable                       reliability (a)   All values (b)   Task subsample (c)
Task characteristics
  Skill requirements           .94               .27**            .35**
  Task difficulty              .97               [illegible]      [illegible]
  Goal clarity                 .97               [illegible]      .07
  Solution multiplicity        .79               .05              .17
  Intrinsic interest           .91               .34              [illegible]
  External motivation          .97               .24*             .21
  Cooperation requirements     .95               .06              .01
  Population familiarity       .94               -.07             -.15
  Manipulative-intellectual    .95               .19              .40**
Nature of the group
  Knowledge of group members   .87               [illegible]      .22
  Group size                   [illegible]       .39**            [illegible]
Leadership rating
  Time before first rating     [illegible]       .25*             .32 [illegible]
  Total time of session        [illegible]       .19              .24
  Breadth of leadership
    measure                    .95               .13              .36**

(a) All interrater reliabilities are significant at the .001 level, df = 20. (b) df = 70. (c) df = 55.
* p < .05. ** p < .01.


Because the first hypothesis stated that the
leadership-participation rate relationship was limited to
measures of task leadership, all of the analyses were
performed on both the total sample of 72 correlations and on
a subsample of all the correlations that involved task
leadership measures.


Tests of Hypotheses: Theoretical
Explanations


Leadership behavior. The data for Hypotheses 1a and 1b
support the contention that the performance of task
leadership behaviors is a common source of both high
participation and leadership status. The mean value of the
leadership-participation rate correlations with task
leadership measures was .69, whereas the mean for the
maintenance measures was only .16, t(70) = 6.37, p < .001.
The means for subcategories of functions were as follows
(the first three categories involve maintenance leadership):
(a) .017 for negative social interaction, (b) .178 for
general maintenance measures (e.g., liking), (c) [illegible]
for positive social interaction, (d) .390 for seeking
information and structure, (e) .615 for initiating structure
and procedure, (f) .722 for giving information and opinions,
(g) .735 for problem identification and solving, (h) .736
for the general measures of leadership, and (i) .841 for
general measures of task leadership.

As hypothesized (1b), the association be- 
tween leadership and participation was sig- 
nificantly correlated with ratings of the 
breadth of the task leadership measures, r =
.36, p < .01. However, the correlation was
only .13 when the entire sample of 72 values 
was used. This lower value is consistent with 
the notion that the maintenance behaviors 
are not the source of the leadership-partici- 
pation rate relationship. 

Task skill. A positive relationship was 
predicted (Hypothesis 2) between task skill 
requirements and the association between 
participation and leadership. The data are 
supportive. The variables were correlated .27 
(p < .01) for the total sample and .35 (p <
.01) for the task subsample.

Sexual status. A significantly higher
leadership-participation correlation mean was found for
mixed-sex groups than for same-sex groups in the total
sample, t(70) = 3.18, p < .01, and the subsample of task
measures, t(55) = 5.68, p < .001. The means for male and
female groups were almost identical (see Table 3).
Unfortunately, the mixed-sex variable is confounded with
type of task. Thirty-one of the 34 correlations for
mixed-sex groups were from groups assigned opinion tasks.
These 31 correlations constitute most of the data points
(34) for opinion groups. Consequently, it is impossible to
know which of the two variables is creating the significant
result. The confound remained when the maintenance
leadership measures were removed.


Table 3
Means and t Values for Dichotomous
Variables

                             Included     Excluded
                             cases        cases
Variable                     M      n     M        t score
Total sample                 .604   72
Nature of the group
  Male                       .505   30    .665     2.29*
  Female                     .466    8    .619     1.27
  Mixed sex                  .702   34    .497     3.18**
  College                    .657   52    .437     2.75**
  Aware of observer          .643   26    .581      .88
Leadership measure
  Task (vs. maintenance)     .687   57    .156     6.37***
  Peer                       .609   58    .582      .30
  Observer                   .672   14    .586     1.30
  Self                       .581    9    .607      .25
  Future significance        .369    6    .622     1.75

Note. All t values were calculated with pooled error
terms and with 70 degrees of freedom. The means
are correlational values, not Fisher z scores.
* p < .05. ** p < .01. *** p < .001.

Demand characteristics. The
leadership-participation correlations based on data from
subjects aware of being observed were not
significantly higher than those from unaware
subjects for the total sample, t(70) = .88,
or for the subsample, t(55) = .61.
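
The t values in this section are described as using a pooled error term, with 70 degrees of freedom for the 72 correlations in the total sample. As a minimal sketch of that kind of computation, assuming only group means, variances, and sizes are available, the following Python function returns a pooled-variance t statistic and its degrees of freedom. The function name and the example numbers are hypothetical and are not taken from the study's data.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic using a pooled error term.

    Returns (t, df), where df = n1 + n2 - 2.
    """
    df = n1 + n2 - 2
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df

# Hypothetical example: 26 values in one group and 46 in the other.
t_value, df = pooled_t(0.64, 0.04, 26, 0.58, 0.04, 46)
print(round(t_value, 2), df)
```

With two groups that together contain 72 values, the degrees of freedom come out to 70, which matches the value given in the note to Table 3.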


Tests of the Hypotheses: Situational 
Parameters 

Group size. A higher association between
leadership status and participation was
predicted for larger groups (Hypothesis …).
Group size ranged from 3 to 18 members, with
a mean of 4.8. The moderate correlations
between group size and the dependent measure
for both the total sample (.39) and the
subsample (…) were significant (p < .01).
Observer ratings. The hypothesis (6) that
leadership-participation correlations would be
higher when based on observer ratings than
when based on peer ratings or self-ratings
was not supported. Although the means were
in the predicted direction, the difference was
not significant for either the total sample,
t(70) = 1.30, or for the subsample, t(55) =
1.70.


Exploratory Analyses 


As mentioned above, a large number of
independent variables, related to the groups,
their tasks, and the leadership measures,
were included for exploratory analyses. The
authors recognize that because of …
of such procedures, the findings may be due
to chance and may not be real effects.
… as possible relationships that merit further
investigation.

Differences due to the type of task were
examined by a one-way analysis of variance.
Although the resulting F for the entire
sample was not significant, F(3, 86) = .12,
the value for the task subsample was,
F(3, 53) = 5.92, p < .001. Using a …
multiple range …, the mean for opinion …
higher than … (problem solving, verbal
production, … manipulation). … assigned
opinion tasks were … mixed-sex groups.
Consequently, either the … composition or
the type of … creating the results.
The two measures 


e 


Table 2). a 

The interchange before the 
measure ranged from 1 ir 
with a mean of 42 minutes.
to the last rating varied from 

Table 4
Stepwise Regression Analysis for the
Entire Sample and for the Task Subsample

Statistical parameter     All values    Task subsample
Multiple R                   .792           .801
R²                           .628           .641
F                          28.268*        18.220*
df                           4/67           5/51
Constant (Fisher z)          .816           .303

*p < .01.


to 24 hours, with a mean of 95 minutes. A 
low but significant positive correlation was 
found only between the dependent measure 
and the time elapsed prior to the first leader- 
ship measure for both the total sample and 
the subsample (Table 2).

Only six of the correlations involved leader- 
ship ratings that had practical significance 
for the group members. They were from stud- 
ies which used the same methodology (Bur- 
roughs & Jaffee, 1969; Jaffee & Lucas, 1969). 
The six correlations were significantly lower 
than the others in the task subsample, t(55)
= 3.33, p < .001. 


Multiple Regression Analysis 


Stepwise multiple regression analyses were
performed to determine the amount of
variance in the leadership-participation Fisher z
scores accounted for by the 28 independent
variables. The analyses for the entire sample
and for the subsample are summarized in
Tables 4 and 5. The B coefficients are for
nonstandardized measures; the beta values are
for standardized scores.
For the entire set of leadership-participation
correlations, the multiple R was .792,
accounting for 63% of the variance. The
largest coefficient was a negative value for
maintenance functions. The inclusion of this
variable and the ratings of the breadth of the
leadership measures support the task
leadership activity explanation. The contribution of
the mixed-sex measure supports the view that
the leadership-participation relationship is
due in part to the sexual composition of the
group. The inclusion of the observer ratings
supports the observer bias hypothesis.
Because the maintenance functions were
excluded from the subsample, the strongest
predictor for the entire sample was removed from
the second analysis. The multiple R was .801.
The first three variables to be included were
added in the same order as in the total
sample analysis. Two additional variables were
included: all-male groups versus other sex
composition groups, and groups of college
students versus noncollege subjects.
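
The regression results rest on two simple computations: the Fisher r-to-z transformation applied to each correlation before analysis, and the squaring of the multiple R to obtain the proportion of variance accounted for. The sketch below illustrates both with Python's standard library. The function names are hypothetical, and the only inputs are the rounded multiple R values reported in the text (.792 and .801), so the squared values can differ in the third decimal from those in Table 4.

```python
import math

def fisher_z(r):
    """Fisher r-to-z transformation (the inverse hyperbolic tangent of r)."""
    return math.atanh(r)

def inverse_fisher_z(z):
    """Convert a Fisher z score back to a correlation."""
    return math.tanh(z)

def variance_explained(multiple_r):
    """Proportion of variance accounted for by a multiple correlation."""
    return multiple_r ** 2

for label, multiple_r in [("entire sample", 0.792), ("task subsample", 0.801)]:
    print(label, round(variance_explained(multiple_r), 2))

# Round trip: transforming a correlation to z and back recovers it.
print(round(inverse_fisher_z(fisher_z(0.604)), 3))
```

Squaring .792 gives roughly .63, which is the 63% of variance cited for the entire sample.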


Conclusions 


Three major explanations for the
leadership-participation rate relationship were
supported by the data analysis: (a) task
leadership behaviors are a component of total
participation and a basis for leadership status;
(b) those with superior task ability make
more task-related contributions to the group,
which increases their participation; and (c)
males are permitted both greater influence and
higher participation rates (although the sex
differences were confounded with group dif- 
ferences). These explanations can be inte- 


Table 5 
Regression Beta Weights for the Entire Sample and for the Task Subsample
All values Task subsample 
oe E 
F Rt 
Variable B B F R B B 

Maintenance functions —.11 —.68 30.21%% 37 i T Ay m 
Mixed-sex subjects 37 AL 26.68 52 6 2 tore a8 
pele a a ae B ee IT ILS Sat EET 
Mace A PE = lel eae oem 
ee a T — 27 .32 7.03* 64 


College subjects a Tae net 


*p < .05. **p < .01.
grated if the performance of task leadership
behaviors is viewed as the major source of 
the variance common to leadership status and 
participation rates. Competency and sexual 
status can be considered as variables that 
moderate the performance of such behaviors. 
Behavioral manifestations of competency 
(such as giving information and proposing 
solutions, procedures, and structure) are in 
themselves task leadership behaviors. In ad- 
dition, members of new groups who feel they 
are competent make more influence attempts, 
whereas members who realize that they are 
less competent make fewer leadership at- 
tempts (Hemphill, 1961). The literature on 
sex differences indicates that males are
considered to be more influential, talk more often,
and perform more task leadership acts.

The suggestion is made from several points 
of view that the quality and frequency of 
members’ contributions are correlated. For 
example, theories of emergent leadership post- 
ulate that those who have established their 
competency are allowed greater influence as 

the group continues (Bales, 1953; Stogdill,
1959). Hoffman and Clark (1979) report 
positive correlations between the group mem- 
bers’ total participation and the extent to 
which they contributed to the adoption of 
the solution. 

Some of the explanations of the leadership- 
participation rate relationship that are based 
on within-group differences (such as personal- 
ity traits and motivation) could not be 
tested in this study. However, the perform- 
ance of task leadership behaviors can be 
offered as an alternative to Sorrentino and 
Boutillier’s (1975) assertion that differences 
in group members’ motivation explain the 
leadership—participation rate relationship. 
They found participation rates, but not the 
quality of the solution offered by the confed- 
erates, to be related to peer ratings of task 
leadership. Sorrentino and Boutillier sug- 
gested that the high participants received 
higher leadership ratings because they were 
perceived to be more motivated. However, in 
manipulating participation rates, Sorrentino
and Boutillier held constant the proportion,
but not the frequency, of the types of
behavior performed by the high and low participant
confederates, thus confounding task behaviors
with total participation. Consequently, the
higher participants may have been rated
higher in task leadership because the group
members recognized their performance of
more task leadership acts.

Our analyses suggest that the high
correlation of task leadership and participation may
be cautiously generalized to real groups.
Almost all of the findings are from formally
leaderless groups. The relationship between
the two variables is lower if the leadership
ratings have practical implications for group
members and if the groups are not composed
of college students. These variables were also
confounded with task differences, which
makes it more difficult to generalize the
strength of the leadership-participation rate
relationship to a variety of real-world tasks.
The degree of association between leadership
and participation was also lower when the
leadership judgments were not made by
observers. In addition, for all but two of the
correlations analyzed, less than 45 minutes
had passed before the first leadership measure
was taken.

Additional research is warranted in several
areas:

1. The generalizability of the
leadership-participation rate relationship needs to be
explored.

2. Predictive research is needed to determine
if the performance of task leadership
behaviors actually serves as the basis for the
common variance between leadership and
participation and if it is moderated by sexual
status, competence, and motivation.

3. Although it is clear that leadership
status and participation covary, the nature of
the relationship between the variables is
unclear. Perhaps high participation precedes
leadership status. Or perhaps, as leadership
valence theory (Stein et al., 1979) suggests,
the attainment of leadership status
occurs almost simultaneously with high
participation. According to that theory, the
leadership status of a member … with
each successful attempt to influence others.

4. The relationship of participation to
formal leadership status has not been
examined. The roles of formal leaders may
require that they speak more … than …
(Burke, 1974). When the roles require
the performance of task leadership behaviors,
such behaviors could be the basis for both
high participation and judgments of the
extent to which an individual actually leads the
group.


Reference Note 


1. Stein, R. T., & Heller, T. The relationship of
emergent leadership status and high verbal
participation in small groups: A review of the
literature. Manuscript submitted for publication,
1979.
Two studies were conducted with nursing home residents to determine whether
memory could be improved. This was accomplished by increasing the cognitive
demand of the environment and then varying the extent to which residents
were motivated to attend to and remember these environmental factors.
In Study 1, motivation to practice recommended cognitive activities was
manipulated by varying the degree of reciprocal self-disclosure offered by interviewers
in a series of dyadic interactions. In Study 2, motivation to practice
recommended cognitive activities was manipulated by varying whether positive
outcomes were contingent on attending to and remembering these activities,
which increased in demand over time. Whether as a function of interpersonal
(Study 1) or practical (Study 2) incentives, engaging in cognitive activity
resulted in improvement on standard short-term memory tests, including probe
recall and pattern recall, as well as in improvement on nurses’ ratings of
alertness, mental activity, and social adjustment for experimental groups relative
to controls.


Arguments and counterarguments over evidence
purportedly showing age-related intellectual
and cognitive decline have been made
with great vigor (cf. Baltes & Schaie, 1976;
Horn & Donaldson, 1976; Horn & Donaldson,
1977). Moreover, there is disagreement about
what the deficits might be, if they do exist.
Let us take memory as an example. Adamowicz
(1974) found short-term memory decrements
at the encoding and postcoding phases
but not at the retrieval phase, whereas Anders
and Fozard (1973) basically found retrieval
deficits. Drachman and Leavitt (1972) suggested
that the major cause of memory deficiency
for the elderly lies not in a deficit in
retrieval but in a disorder of storage. Yet
another view is that of Eysenck (1973), who
suggested that the decrement is primarily in
the decision process and not in retrieval.

In this article we do not wish to question 
whether there is memory decline with ad- 
vanced age. Instead, we ask to what the de- 
cline may be attributed when it does appear. 
Thus, rather than looking for either a descrip- 
tion of the memory deficit (as in the studies 
previously cited) or methodological artifacts 
that may explain away purported differences 
(cf. Baltes & Schaie, 1976), we suggest an- 
other focus of analysis: one in which deficits 
are seen as real and one in which environ- 
mental factors are seen as influencing and 
masking cognitive abilities of the elderly. Our 
position is basically that social and environ- 
mental factors may impact on memory di- 
rectly, leading to real deficits that may only 
look as if they are biologically determined be- 
cause they are age related. 

How might social factors directly impact
on cognitive processes? The elderly, especially
those residing in institutions, are a population 
who are by and large confined to a very re- 
stricted living space. Such restriction, arising 
from social system factors and physical ail- 
ments orthogonal to direct brain deterioration, 
may have far-reaching effects on cognitive 
processes. For these people, every day is like 
every other day. Moreover, for those residing 
in institutions, every room is like every other 
room, and each person seen today is the same 
as each person seen yesterday and the day 
before. Living in such highly redundant en- 
vironments leaves one with little information 
to process. As a result, the ability to think 
and remember may diminish. During the 
earlier part of one’s life, it may be an efficient 
strategy to try to minimize informational 
input when living in a demanding and over- 
stimulating environment. Over the years one 
may have successfully taught oneself to tune 
out much of the environment in order to func- 
tion more adaptively (cf. Langer, 1978; 
Langer, Note 1) only to find that in later 
years such proficiency has become maladap- 
tive. 

After years of learning how to avoid ex- 
pending cognitive effort, a person might find 
it difficult to know how to reverse the process 
when faced with little to think about and 
little to do. Yet it seems that reversal is pos- 
sible at least in some areas. In earlier research 
(Langer & Rodin, 1976; Rodin & Langer, 
1977) we found that making more demands 
on the decision-making abilities of institu- 
tionalized elderly adults had the effect of im- 
proving health and psychological well-being 
and tended to be related to lower mortality 
rates as well. In the present studies we con- 
sider whether improvements in memory might 
also result from changes in the social psycho- 
logical environment. 

If cognitive abilities do diminish at all with 
disuse, then practice in thinking over time 
may be successful in reversing the debilita- 
tions. However, if people have been implicitly 
led in the ways outlined above to avoid ex- 
pending cognitive effort, then more than a 
mere suggestion to expend this effort would 
seem necessary to motivate them to do so. The 
following studies were undertaken with these 
two considerations in mind. In both studies
we attempted to motivate subjects to adhere
to a recommended course of action over time
that asked them to think about various issues.
Encouraging this commitment to increase
cognitive activity was accomplished in the first
study by providing the opportunity for
reciprocal self-disclosure between the subject and
an interviewer who recommended the
particular course of cognitive activity.
According to Janis and Rodin (1979),
eliciting self-disclosure is one factor that is
critical in helping a person to establish
referent power (French & Raven, 1959) that
promotes commitment to recommended courses of
action. Persons have referent power for those
who perceive them as likeable, benevolent,
admirable, and accepting. Their motivating
power derives from this source. Rodin and
Janis (1979) have described how the use of
referent power in health care settings
promotes internalization of medical
recommendations made by health care professionals.
Because no coercive inducements are used to
motivate the person, Rodin and Janis
hypothesized that the use of referent power increases
the likelihood that persons will feel personal
responsibility in adhering to prescribed
regimens. To establish referent power that
motivates persons to commit themselves to
recommended courses of action, an individual
must encourage them to disclose their
personal feelings, troubles, or weaknesses (…,
in press; Quinlan, Janis, & Bales, …).
Demonstrated willingness to engage in
reciprocal self-disclosure is effective in this regard
(cf. Jourard, 1971).
We obtained a commitment to increase
cognitive activity in the second study by
introducing a contingency whereby greater
cognitive effort resulted in greater tangible reward.
In both studies, we hypothesized that relative
to control groups, increased cognitive activity
in experimental groups would promote more
efficient performance in cognitive areas such
as memory.


Study 1

In this study we used a series of reciprocal
self-disclosure conversations as incentives
for adhering to a particular course of
recommended cognitive activity. The cognitive
activity involved remembering a series of
questions and searching the environment and
one’s long-term memory for information
relevant to their answers. A low-self-disclosure
group was included to control for the amount
of time spent with an interviewer discussing
past and present events. This more standard
interview procedure, however, was not
expected to provide sufficient interpersonal
incentive to occasion the kind of commitment
necessary to stimulate renewed cognitive
activity. To control for changes that may
have occurred simply because of multiple
administrations of the memory tests, a
no-treatment group was added to the design.


Method

Subjects


Fifty-four middle-class ambulatory residents of a
private suburban nursing home participated in the
study. Residents who were judged by the institution’s
head social worker to be at least somewhat alert took
part in the study. The average length of stay for the
sample was 2.8 years. All participants could eat their
meals in the dining room without assistance. Subjects
were matched on age, length of time in the nursing
home, social worker’s ratings of alertness, and sex
and on this basis were assigned to one of four
groups: high reciprocal self-disclosure (n = 10, M age
= 80.4); low reciprocal self-disclosure (n = 20, M
age = 79.6); pretest and posttest only (for
no-interview controls, n = 14, M age = 78.9); and
pretest only (for test-instructions subjects, n = 10,
M age = 78.8).¹


Procedure


All subjects were visited 1 month prior to the
experimental treatment period by an experimenter who
administered an interview and the memory tests
described below. She did not participate in any of the
… and was blind to experimental conditions
… knew about the memory pretest and posttest,
… all were blind to the experimental hypothesis
… memory. Visits were spaced so that there
was at least 1 week between the visits. Interviewers
were instructed to keep the length of each visit
between 30 and 40 minutes. At the beginning of the
first session, residents were told that the interviewers
would be coming for about 6 weeks.

The interviewers introduced themselves as college
students who were interested in getting to know
people who were much older than themselves. The
interviewer told the subject that he or she would like
to get together once a week with the participant to
talk. If the person agreed, the interviewer began a
discussion. The nature of the discussion was
determined by the condition to which the subject had
been assigned. At the end of each visit, the
interviewer told the subject that he or she would return
in approximately 1 week.

High-reciprocal-self-disclosure group. In this con- 
dition the interviewer began the first visit by stating 
that “one of the things I never get to do is to tell 
someone older than I am the kinds of things I am do- 
ing and thinking about.” In each of the four visits 
the interviewer began the visit by talking about 
events that were occurring in his or her own life 
(e.g., school, family, jobs) in a largely unstructured
way, with the only requirement being that she or he 
let the resident talk whenever she or he wanted to. 
At the end of each visit, the experimenter suggested 
specific topics about which subjects were asked to 
think. These were to be discussed on the next visit. 

Low-reciprocal-self-disclosure group. Here the in- 
terviewer began the discussion by stating that “one 
of the things I never get to hear very much about is 
what people who are older than I am are doing and 
thinking about.” The questions asked during this and
subsequent visits were concerned with current
occurrences and events, both personal and public (e.g.,
what was going on that day, families). The
interviewer was instructed to ask questions, comment
appropriately, and listen attentively and interestedly.
At the end of each visit, the experimenter suggested
specific topics about which subjects were asked to
think. These were identical to those suggested to
high-disclosure subjects and were to be discussed on
the next visit. This standard interview procedure was 
typical of the kinds of interviews older people in 
nursing homes experience. 

No-treatment group. Subjects in this condition
were not visited by the interviewers, but received
the same memory pretest and posttest as the above
two groups.


Dependent Variables 


Two types of measures were used to assess the 
effects of the experimental manipulations. The first 
measure, a questionnaire, was responded to by the 
nurses who staffed the floors on which the subjects 
lived; the nurses were unaware of the experimental 
hypotheses and treatments. The questionnaires were 
completed at two different times—once before the 
experimental manipulations were introduced and 
again several weeks after the experimental treatments 


1 Only those subjects who were available for each 
interview session were included in the final sample. 
had ended. The questionnaires included 10-point
scales that asked for ratings of how happy, alert, de- 
pendent, sociable, and active the residents were. 

The other measures were two types of memory 
tests administered 1 month prior to and 1 week after 
the experimental treatments. To minimize the anxiety 
that is often felt by the elderly when presented with 
testlike situations, the tests were not described as 
such. Instead the experimenter introduced herself as 
an individual interested in finding out what changes 
might be made to make the nursing home “a more 
pleasant place for the people who live here.” She then 
asked subjects for their suggestions as to what 
changes they would like to see made. After the sub- 
jects made their suggestions, the experimenter re- 
quested that they give their opinion on some new 
ideas that the experimenter had come up with. The 
two memory tests were disguised as a new activity 
or procedure that might be instituted in the nursing 
home. The subjects “tried out” these “new activities 
and procedures” and offered an opinion of each. The 
order in which the tests were presented was sys- 
tematically varied. 

The two tests used were pattern recall and probe 
recall. Pattern recall is a test of immediate recall of 
visual patterns. The subject is required to watch 
the experimenter tap a set of four blocks in a certain 
order and to repeat those exact movements imme- 
diately afterward. As the test proceeds the series of 
items (each block pointed to is considered an item) 
increases in length. The score reflects the length of 
the series a subject is capable of imitating. 
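
The text says only that the pattern recall score reflects the length of the series a subject can imitate; the exact scoring rule is not given. The sketch below shows one plausible reading, under the assumption that the score is the length of the longest series reproduced exactly. The function name, the data layout, and the example trials are all hypothetical.

```python
def pattern_recall_score(trials):
    """Length of the longest tapped series that was reproduced exactly.

    `trials` is a list of (presented, reproduced) pairs, where each element
    is a sequence of block numbers (1-4). Returns 0 if no series was
    reproduced correctly. This scoring rule is an assumption, not the
    published procedure.
    """
    best = 0
    for presented, reproduced in trials:
        if list(presented) == list(reproduced):
            best = max(best, len(presented))
    return best

# Hypothetical session: series of increasing length, the last one failed.
example_trials = [
    ([1, 3], [1, 3]),
    ([2, 4, 1], [2, 4, 1]),
    ([1, 2, 3, 4], [1, 2, 4, 3]),
]
print(pattern_recall_score(example_trials))
```

Under this reading, a subject who reproduces the two- and three-item series but errs on the four-item series would receive a score of 3.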

In probe recall, a test that measures short-term 
memory of paired items, photographs of five people 
were shown to the subject, and each photograph was 
paired with a person’s name. The photographs were 
then presented again at various intervals throughout 
the test. Subjects were required to recall the name 
associated with each photograph. To make this test 
relevant, pictures were used of elderly people who, 
subjects were told, might be coming to the nursing 
home.

Both tests had two equivalent forms, with split- 
half reliability and/or test-retest reliability (rs = .86 
and .88, respectively) allowing both premanipulation 
and postmanipulation measures to be taken. Approx- 
imately one half of the subjects in each condition 
received Form A of the tests for the pretest and 
Form B for the posttest, whereas the other half re- 
ceived them in the opposite order. We predicted that 
greater practice in thinking would result in higher 
scores on these measures for the high-reciprocal-self- 
disclosure group relative to the remaining groups. 
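
The counterbalancing of test forms described above (about half of each condition receiving Form A first and Form B second, the rest the reverse) can be sketched as follows. This is only an illustration under the assumption that assignment within a condition was random; the function name, the seed, and the subject identifiers are hypothetical.

```python
import random

def assign_form_orders(subject_ids, seed=0):
    """Assign roughly half of the subjects to the order (A, B) and the
    remainder to (B, A), in a shuffled order.

    Returns a dict mapping subject id -> (pretest form, posttest form).
    """
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for position, subject in enumerate(ids):
        orders[subject] = ("A", "B") if position < half else ("B", "A")
    return orders

# Hypothetical condition with 10 subjects; call once per condition.
orders = assign_form_orders(range(1, 11))
a_first = sum(1 for order in orders.values() if order == ("A", "B"))
print(a_first, len(orders) - a_first)
```

Calling the function separately for each condition keeps the two form orders balanced within every group rather than only across the whole sample.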

As a secondary goal of this experiment, we were 
interested in determining whether making the tests 
relevant and nonthreatening in this way would 
actually have an effect on test performance, since test 
anxiety alone may impair performance. Therefore, 
we also included for comparison another no-treat- 
ment group of similar nursing home residents, who 
were given the pretest measures only and were told 
that they were memory tests rather than “new activ- 
ities.” 
Results


Analyses of pretest scores on both recall
tests and of nurses’ ratings revealed no
significant differences among the three groups and
hence demonstrated group comparability at
the start of the experiment (all Fs < 1).

Table 1 presents group means and change
scores on the recall tests as a function of the
experimental treatment. Analyses of variance
for both tests individually showed significant
differences among the groups: for pattern
recall, F(2, 41) = 3.25, p < .05; for probe
recall, F(2, 41) = 3.92, p < .05. Individual
comparisons revealed that whereas the
high-reciprocal-self-disclosure group showed
improved memory relative to controls on both
tests, the low-reciprocal-self-disclosure group
was not reliably different from the
no-treatment group. In addition, pretest-posttest
comparisons on the pattern recall test showed that
there was significant improvement for the
high-self-disclosure group, t(9) = 2.36, p <
.05, but no significant change for the
low-self-disclosure group or for the control group.
Similar analyses for the probe recall test revealed
not only that the treatment improved memory
for the high-self-disclosure group, t(9) =
3.46, p < .01, but also that when there was no
treatment or minimal treatment, memory
declined significantly in as short as 6 weeks’
time: for no treatment, t(12) = …, p <
.05; for low self-disclosure, t(18) = 4.10, p <
.05.

A fourth group of residents were given only
the memory tests (n = 10). For this group, the
tests were presented as memory tests and were
not integrated into the nursing home …
A comparison of this group’s … scores with
scores for subjects in all the other groups
combined (n = 44) was conducted to … the
effects of test context. Subjects in the …
concealed groups … significantly …
(pattern recall M = 4.0; …)
than subjects in the … concealed
group (pattern recall M = …; …
M = 5.3), t(52) = 2.01, p < …, and …
2.68, p < .02, respectively.

After each interview, the interviewers rated the
residents for amount spoken during the
interview, percentage of items initiated for
discussion, and how friendly they were to the
interviewer. The high-reciprocal-self-disclosure
residents spoke on the average for 70% of the
interview and initiated 75% of the items for
discussion. By comparison, the low-self-disclosure
subjects spoke for an average of 45%
of the interview and initiated only 30% of the
items for discussion. The groups were not
rated differently in how friendly they were to
the interviewer. There were also no reliable
differences between conditions in the length of
interviews, and all interviews became longer
over the 4 weeks.

The next analyses of variance were
performed on the nurses’ ratings. These results
reveal that the greater practice in thinking
initiated by the high-reciprocal-self-disclosure
treatment not only influenced short-term
memory but produced general social
psychological effects that were discernible to nurses,
who were unaware of the experimental
treatment. As can be seen in Table 2, the
high-involvement group was rated as being more
aware, more active, and more self-initiating
than the no-treatment group (all ps < .05).
Although the low-involvement treatment
appears to have resulted in nurses’ judgments
of a greater self-initiating attitude, on other
measures the low-involvement group was not
significantly different from the no-treatment
group.


Table 1
Study 1: Mean Change Scores (Posttreatment Minus Pretreatment) on Memory Tests for Each
Experimental Condition as a Function of Level of Involvement

                                                  Memory test
                                     Pattern recall           Probe recall
Group                              Pre   Post  Change      Pre   Post  Change
High reciprocal self-disclosure    4.7   6.7    2.0a      10.0   15.3    5.3a
Low reciprocal self-disclosure     4.0   3.3    −.7b      12.2    9.2   −3.0b
No treatment                       4.8   4.6    −.2b      13.5    9.0   −2.5b

Note. Different subscripts within each test indicate significant differences at p < .05.


Table 2
Study 1: Nurses’ Mean Ratings (Posttreatment Minus Pretreatment Scores) of Residents’
Behavior by Experimental Condition

                          Aware-  Socia-            Self-       Alert-
Group              Mood   ness    bility  Activity  initiating  ness    Health
High reciprocal
self-disclosure
  Pre              4.5    3.5     5.3     3.0       2.8         6.5     4.3
  Post             5.0    5.8     7.0     4.5       5.3         6.8     5.1
  Change            .5a   2.3a    1.7a    1.5a      2.5a         .3a     .8a
Low reciprocal
self-disclosure
  Pre              4.8    4.7     5.0     3.3       3.7         5.8     4.7
  Post             5.5    5.2     4.5     1.7       5.7         5.8     4.7
  Change            .7a    .5b    −.5a   −1.6b      2.0a         0a      0a
No treatment
  Pre              4.0    4.5     5.3     3.8       3.5         5.8     4.0
  Post             4.3    4.5     4.0     3.5       4.0         5.8     4.0
  Change            .3a    0b    −1.3a    −.3b       .5b         0a      0a

Note. Cells bearing different subscripts within each rating are significantly different at p < .05.
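
Both tables report change scores, defined in their titles as posttreatment minus pretreatment means. As a small illustration of that arithmetic, the sketch below recomputes the pattern recall change scores from the pre and post means printed in Table 1. The dictionary layout, the group labels, and the function name are assumptions made for the example.

```python
def change_score(pre, post):
    """Posttreatment minus pretreatment, rounded to one decimal place."""
    return round(post - pre, 1)

# Pattern recall means (pre, post) as printed in Table 1.
pattern_recall = {
    "high self-disclosure": (4.7, 6.7),
    "low self-disclosure": (4.0, 3.3),
    "no treatment": (4.8, 4.6),
}

changes = {group: change_score(pre, post)
           for group, (pre, post) in pattern_recall.items()}
print(changes)
```

Rounding to one decimal place mirrors the precision of the published means and avoids floating-point residue in the differences.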
Discussion


The results of Study 1 suggest that memory 
can be improved when the elderly are moti- 
vated to commit themselves to a recommended 
course of action that involves increased use 
of their cognitive abilities. Because people in 
the high-reciprocal-self-disclosure group spoke 
more and initiated more topics for conversa- 
tion than their low-self-disclosure counter- 
parts during the 6-week period of the study, 
they appear to have been more involved in the 
substance and conduct of the interviews. This
greater involvement apparently led to a 
greater commitment to follow the inter- 
viewer's recommendations to think about a 
variety of proposed topics for future discus- 
sion. Although the low-involvement group also 
was asked to think about things to be dis- 
cussed in between the scheduled visits, the 
character of the interaction was not suf- 
ficiently involving to encourage them to make 
a commitment to do so. Participants were not 
judged to be more friendly in any one treat- 
ment condition, suggesting that increased 
thinking and not simply greater liking for the 
interviewer was responsible for the results 
obtained. In addition, the items on which the
high-self-disclosure group increased on the 
nurses’ ratings (aware, active, and self-initiat- 
ing) also seem to reflect a general increased 
involvement with the environment. 

It is significant to note that the memory 
tests used were measures of 
in which there has been s 


Study 2 


Whereas the first study provided interpersonal incentives for increased cognitive activity, in the present study practical incentives
were employed. An experimental group for 
whom positive outcomes depended on remem- 
bering was compared with a group for whom 
positive outcomes were not contingent on 
memory and with a no-treatment control
group. We predicted that the contingent group
would engage in more cognitive activity that
would result in improved memory.


Method


Subjects 


Forty-five middle-class residents of a second nursing home participated in this research. The average length of stay for the sample was 3.4 years. There were 14 subjects (M age = 80.62 years) in the contingent group, 16 subjects (M age = 76.19) in the noncontingent group, and 15 subjects (M age = 80.73) in the no-treatment control group. Only those residents who had been judged on an 11-point scale by the institution’s head nurse to be at least somewhat alert (contingency M = 7.77, noncontingency M = 7.13, and no-treatment M = 8.33) served as subjects. Though residents varied greatly with respect to their particular medical concerns, to participate in the investigation they had to be healthy enough to be ambulatory. Thus, there were no initial group differences with respect to social class, age, alertness, and general health. Similarly, no initial mood differences were revealed on an 11-point scale of self-reported happiness (contingency M = 5.21, noncontingency M = 5.58, and no-treatment M = 5.53).

The experiment was conducted by seven [illegible] experimenters who were blind to the study’s hypothesis.


Procedure 


The residents in the two experimental groups were visited a total of nine times over the course of a 3-week period, with various cognitive demands made during each visit. The visits were spaced so that the amount of time in between visits increased over the 3 weeks, thereby increasing the demand on subjects’ memories over the course of the visits. Those in the no-treatment condition were visited only on the first and last day of the [illegible] (when the memory tests were administered). Memory tests (described below) were given 2 days after the [illegible] visit.

After introducing herself, the experimenter told each resident that she was a student who was interested in learning about life in a nursing home. She explained that with the person’s permission, she would like to come several times during the [illegible] weeks to discuss certain things with him or her. To insure that the relationship from the resident’s perspective was time bound, she indicated that she would have to return to school after 3 weeks but that they could correspond if the resident so desired. The nature of the subsequent interaction was a function of the condition to which the resident had been assigned.

Contingent condition. Residents in this condition were told that they were going to be asked several questions during each visit and that they would receive one poker chip for each correct response. The experimenter explained that at the end of the 3 weeks they would be able to redeem the chips for a gift and that the more chips they had the better the gift would be. The experimenter then asked the resident a few questions about that day’s meals and activities. The first questions asked were made simple enough to insure that everyone received at least one chip on the first visit (e.g., “Did you eat breakfast today?”). The residents were then asked two questions that required them to seek information from their immediate environment. If they did not know the answer, they were told they would be asked again on the next visit. On the first visit, the questions were “Do you know the names of any of the nurses?” and “Do you know the name of another resident?” Subsequently, each individual was asked three such questions per visit.

The assumption, which proved correct, was that the resident would not know the answer to at least some of the questions when first asked and would therefore have to engage in an information-seeking process before the next visit. The questions were of varying levels of difficulty, but they all concerned [illegible] the institution (e.g., “How many different nurses’ and patients’ names do you know?”, “What activities were there today?”, “When will the next cocktail party/bingo/concert take place?”, and “Approximately how many male residents are there on [illegible]?”). Although the information-seeking questions kept changing, residents were asked what they ate for breakfast and lunch that day, what they ate for dinner the night before, and what activities they took part in during that day on each visit.

At the end of each visit, before departing, the experimenter told the resident, “Please try and remember everything you do today and also find out [e.g., when the next bingo game will take place]. I’ll be returning on . . . [day specified] to [illegible] answers. [illegible] will give you more chips for more information.” Therefore each subject had to remember what information to seek out, to actively seek it, and then to remember it until the experimenter’s next visit. Each question was asked on three consecutive visits unless the resident knew the answer sooner. If on the [illegible] occasion the resident still did not know the information, the question was not asked again.

On those questions for which it was impossible to make an immediate determination of accuracy (e.g., what the resident ate for breakfast), the answers were taken at face value and the residents in this condition received a chip every time they replied to a question. Since persons in all conditions in the study at several points said “I don’t remember” in response to questions asked and the answers given were accurate much more often than not in the cases that could be checked, this assumption of accuracy seemed reasonable.

Noncontingent condition. The substance and the 
form of the visits in this condition were exactly the 
same as those in the contingent condition. The only 
difference between the noncontingent condition and 
the contingent condition was that the residents in the 
former were told that they would be given chips as 
mementos of each visit, the number of which would 
depend on how many the experimenter had with her 
that day. The residents in this condition were told 
that the chips would be redeemable for a gift in 3 
weeks. Each subject in this condition was yoked to a 
subject in the contingent condition for number of 
chips and length of interview. 

No-treatment condition. These residents received 
an initial visit by the experimenters and were asked 
the same questions as the residents in the contingent 
and noncontingent conditions. At the time of this 
first visit, they were told that the experimenter 
would return to speak to them in 3 weeks and on 
that occasion, in return for cooperating with the 
project, they would each receive a gift. This was 
done to control for the possible effects of anticipating 
some positive future outcome. They were not visited 
again until the measures were taken. 


Dependent Variables 


Four days after the last visit all subjects were given 
a test of immediate memory and a test of memory 
for remote events by an experimenter with whom 
they had no previous contact. The test of immediate 
memory used was a probe recall measure in which 
test words varied in terms of how many new words 
were presented between the first presentation of the 
test word and its repetition. There were approx- 
imately 5 sec in between the presentation of each 
word. After the test was administered, the experi- 
menter also asked the residents several questions con- 
cerning remote past events (e.g., “When did the Great 
Depression occur?”; “Where is the Empire State 
Building?” ; “How long has television been around?” ; 
and “What were the names of your neighbors before 
you moved here?”). Subjects in the contingent and 
noncontingent groups also were asked how much they 
liked the woman who had visited them, whether they 
remembered her name, and approximately how many 
times they had been visited by her. 
Responses to questions asked during visits to subjects in the contingent and noncontingent groups allowed for a comparison of memory for recent events between these two groups over time. These questions fell into two categories: memory for events they had experienced (e.g., meals and activities) and memory for information they had to seek out and remember (e.g., how many male residents there were in the nursing home).


In addition to these measures, with informed consent, each resident’s medical records were examined and all entries regarding the following aspects of the person’s medical condition were recorded: medications, major tranquilizers, vital signs, complaints, and social adjustment.


Results 


The major questions addressed in this study 
were whether giving people a reason to re- 
member would commit them to greater cogni- 
tive activity and, if they were committed, 
whether the beneficial effects would generalize 
to other memory processes. Before looking at 
generalization effects, we felt it important to 
see if the training was effective. That is, did 
the contingent group show signs of improve- 
ment for the things on which the contingency 
was focused? A comparison of memory for 
meals and activities over time suggests that 
the contingency treatment was indeed effective. A repeated measures analysis of variance for meals remembered over time yielded a significant effect for contingency, F(1, 28) = 4.23, p < .05.² Of a maximum of three, the contingent group remembered an average of 2.76 meals, whereas the noncontingent group remembered an average of 2.20. The Group × Time interaction was also significant, F(8, 224) = 2.66, p < .01 (Winer, 1971). Whereas
there was no difference between the two 
groups on the first visit, the contingent group 
remembered significantly more meals than 
the noncontingent group for each visit after 
the first. The difference between the first and 
last visits for the contingent group was also 
significant, t(13) 
there was no difference between the first and 


In addition to improved memory of meals over time, subjects in the contingent condition took less time than subjects in the noncontingent condition [illegible] information regarding activities in the nursing home. Obviously, [illegible] not know the answers upon being asked the [illegible] the second time they were asked were given a score of 3, those who obtained the information after being asked three times were assigned a weight of 2, and those who only knew the answer after four visits were assigned a weight of 1. The mean weighted total over all the target questions was 12.40 for the contingent group, compared with only 8.25 for the residents in the noncontingent group, t(28) = 2.62, p < .05.
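The weighting rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the text gives weights of 3, 2, and 1 for answers known on the second, third, and fourth asking, and the start of the rule is lost, so the sketch refuses any other value rather than guess a weight for it. Treating a question that was never answered as worth 0 is an assumption, and the function names and example history are hypothetical.

```python
# Weights stated in the text, keyed by how many times the question had
# been asked when the resident first knew the answer.
STATED_WEIGHTS = {2: 3, 3: 2, 4: 1}

def question_weight(times_asked_when_known):
    """Weight for one target question.

    None means the resident never supplied the answer; scoring this as 0
    is an assumption, not something the text states.
    """
    if times_asked_when_known is None:
        return 0
    if times_asked_when_known not in STATED_WEIGHTS:
        # The text does not give a weight for this case, so do not guess.
        raise ValueError(f"no stated weight for {times_asked_when_known}")
    return STATED_WEIGHTS[times_asked_when_known]

def weighted_total(history):
    """Sum the weights over all target questions for one resident."""
    return sum(question_weight(entry) for entry in history)

# Hypothetical resident: two answers on the second asking, one on the
# third, one on the fourth, and one question never answered.
EXAMPLE_HISTORY = [2, 2, 3, 4, None]
print(weighted_total(EXAMPLE_HISTORY))
```

A higher total means the information was obtained after fewer askings, which is the direction of the group difference reported above.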

Attitudes toward the interviewer and the visits also were assessed. Although each group received a total of 9 visits, the contingent group estimated that they had received an average of 13 visits, and the noncontingent group estimated an average of 6 visits. Eighty-two percent of the contingent group overestimated the number of visits, whereas 73% of the noncontingent group underestimated the number of visits, χ²(1) = 6.60, p < .05. Both groups were also asked to indicate on a 10-point scale how much they liked the person who had visited them. The results indicated that all participants claimed to have liked the experimenter; the mean for the contingent group was 9.92, compared with 7.93 for the noncontingent group, F(1, 26) = 7.18, p < .02. Residents were also asked if they remembered the name of the person who had visited them. Sixty-nine percent of the contingent group remembered, compared with 53% of the noncontingent group. Although in the predicted direction, this difference was not statistically significant.


Probe Recall Test 


To test whether training in remembering recent events would generalize to improvement in immediate memory, scores on this test were compared for the three groups. Out of a possible score of 10, the mean for the contingent group was 7.50, compared with 4.[illegible] and 5.83 for the noncontingent and no-treatment groups, respectively, F(2, 36) = 6.[illegible], p < .01. A Tukey test showed that the contingent group was significantly different from the noncontingent group (p < .05) and the no-treatment group (p < .05). The latter two were not different from one another. This finding is especially impressive in light of the fact that the residents were not rewarded for answering these questions; nor were the experimenters who administered the test previously known to them.

² All subjects were not available when each measure was taken. Therefore the degrees of freedom vary from measure to measure.


Remote Events 


Immediately after the probe recall test was administered, subjects were asked questions regarding remote past events (e.g., “When was the Great Depression?” and “How long has television been around?”). Again, significant differences emerged. Responses were coded by an individual who was unaware of the conditions to which subjects had been assigned. The mean number of correct answers for the contingent group was 2.46, compared with 1.53 and 1.54 for the noncontingent and no-treatment groups, respectively. A contrast that set the contingent group as greater than the remaining two groups, which were set equal to one another, was significant, F(1, 38) = 7.42, p < .01.


Medical Chart Data 


The social adjustment comments recorded in residents’ medical charts by the nursing home staff were rated on a 7-point scale, in which higher numbers indicated improvement from 1 week before the first visit to 1 week after the last measures were taken; a score of [illegible] indicates no change. These ratings were made by an experimenter who had training as a nurse and who was blind to the purposes of the study and to the groups to which the subjects belonged. The mean score for the contingent group was 5.00, showing some improvement, compared with 3.79 for the noncontingent group and 3.87 for the no-treatment group, both of which showed slight [illegible] on this measure, F(2, 40) = 5.41, [illegible]. A Tukey test indicated that the contingent group was significantly different from the other two groups (p < .05 in both cases), which were not different from one another. Ratings on residents’ vital signs showed the
same pattern as above, although the ratings
were not significantly different for the three 
groups (4.31, 4.29, and 4.21 for contingent, 
noncontingent, and no-treatment groups, re- 
spectively). There were also no reliable differ- 
ences in the amount of medication taken. 
However, differences did emerge in regard to 
the amount of complaints about physical dis- 
comfort that residents voiced. The mean rat- 
ing, on a 7-point scale (higher numbers indi- 
cate more complaints), for the contingent 
group on this measure was 5.07, compared 
with 3.64 and 3.80 for the noncontingent and 
no-treatment groups, respectively, F(2, 40) = 
8.14, p < .001. Again, a Tukey test showed 
the contingent group to be different from the 
other two (p < .05 in both cases), which were 
not different from one another. As in Langer 
and Rodin (1976), complaints were taken as 
a positive sign—when one feels one can have 
an effect on the environment, complaining 
makes sense; when one has given up, it does 
not. Subjects on whom demands were made, 
in turn, appeared to be more likely to make 
demands on their environment. 


Discussion 


Those residents for whom outcomes were 
contingent upon memory showed cognitive 
improvement relative to those to whom the 
same amount of attention was given without 
contingent rewards for cognitive activity. In- 
creasing the use of cognitive abilities in this 
way, that is, making remembering matter, had 
several generalized effects on memory. It im- 
proved short-term memory, as evidenced by 
performance on the probe recall test, im- 
proved memory for recent events, as evidenced 
by better recall of meals and activities, and 
resulted in greater speed in finding and report- 
ing information that was to be ascertained 
from the environment. It also significantly 
improved social adjustment. 

It appears that having this responsibility 
for outcomes was subjectively a positive ex- 
perience. Residents in the contingent condi- 
tion overestimated the number of visits they 
had received and also expressed greater liking 
for the visitor than did the comparison group. 




Conclusion 


Both studies provided either interpersonal 
or practical incentives for becoming more in- 
volved with the environment and for greater 
attention to and remembering of environ- 
mental events. We found that restructuring 
the environment to make it more demanding 
and then motivating people to increase their 
cognitive activity seem to lead to improve- 
ments in memory that are generalizable. It is 
critical to note that the interviews in no way 
provided explicit training for the types of 
memory tests that constituted the dependent 
variables of greatest interest; nonetheless, 
performance for experimental subjects showed 
reliable improvement. 

Perhaps one could argue that the improve- 
ment in performance for the high-reciprocal- 
self-disclosure group in Study 1 and the con- 
tingency group in Study 2 was due solely to 
an increase in motivation to perform well and 
not to improved ability through increased 
practice. However, the greater recall over time 
demonstrated in Study 2 suggests that real 
improvements did take place. Although this 
alternative explanation can never be ruled 
out completely in this or any other study, for 
that matter, that finds performance differences, if one never has the motivation to exercise one’s ability, the motivation/ability distinction becomes a moot point. Although this general issue has been most directly argued with regard to racial differences and IQ (e.g., Loehlin, Lindzey, & Spuhler, 1975), the same can be said of age differences and some cognitive abilities. Again, like findings in which race and IQ are considered, the data from Study 1 suggest that the context in which the measures are given, such as [illegible] performance in the [illegible] ultimately, the desire to practice and perform in general, [illegible] environmentally rather [illegible] determined, the present investigations coupled with other studies examining the effects of environment on intelligence in the elderly (e.g., Plemons, Willis, & Baltes, 1978) suggest that the impact of the environment may be profound. The significance of these factors for memory, and cognitive ability more generally, suggests that studies explicitly designed to test for age-related deficits must control for differences in environmental variables.


Reference Note 


1. Langer, E. Old age: An artifact? Washington, 
D.C.: National Research Council, Committee on 
Biology, Behavior, and Aging, in press. 
When Practice Makes Imperfect: Debilitating Effects
of Overlearning 


Ellen J. Langer and Lois G. Imber 
Harvard University 


It was hypothesized that as overlearning leads to “mindlessness,” the indi- 
vidual components of a task become relatively inaccessible to consciousness 
and therefore unavailable to serve as evidence of task competence. This may 
lead to a decrement in performance if circumstances, for example, a label 
connoting relative inferiority, lead one to question one’s ability. This was 
tested in the first experiment by varying practice on a task (no practice, 
moderate practice, and overpractice) by label assigned to subjects (no label, 
assistant, boss). As predicted, performance decrements resulted for the no 
practice and overpracticed subjects who were assigned the inferior status label 
but not for the moderate practice subjects for whom the task components were 
still salient. In a second experiment it was found that the debilitation could be 
prevented for an overlearned task by making components of the task salient. 
Implications for the vulnerability of experts to these performance debilitations
are explored.


In a recent article (Langer & Benevento, 
1978) it was argued that decrements in per- 
formance may result from negative inter- 
personal contextual factors, like labels that 
connote inferiority relative to another person 
(e.g., assistant vs. boss), regardless of the 
outcome one experiences. It was suggested 
that, if salient, the inferior label may lead 
the individual to an erroneous inference of 
incompetence, resulting in performance decrements.¹ This effect was termed self-induced dependence. It is self-induced in that individuals who are given low status labels relative to other people, for example, may draw unnecessary inferences from those labels that they then generalize to new situations without any external inducement to do so. Self-induced dependence was originally proposed in contrast to learned helplessness (see Seligman, 1975) where a perception of incompetence is clearly externally induced through prior exposure to uncontrollable aversive outcomes.
If people become dependent on labels when performing tasks with which they have had a good deal of prior experience, as was the case in the Langer and Benevento studies, one might assume that they would be even more vulnerable with respect to tasks with which they have had little practice. However, recent research on the “mindlessness of ostensibly thoughtful action” (Langer, 1978, 1979; Langer, Blank, & Chanowitz, 1978; Langer & Newman, 1979) suggests that the reverse may occur. That is, repeated experience with a task actually may increase an individual’s vulnerability to an experience such as being provided with a label that connotes inferiority.


¹ Other interpersonal factors that are hypothesized to result in unnecessary performance decrements include no longer performing tasks that others continue to perform, engaging in a demeaning task, or simply allowing someone else to help you.
When an individual first approaches a task s/he is necessarily attentive to the particulars of the task. With each repetition of the task, less and less attention to those particulars is required for successful completion of that task. As “mindlessness” is achieved, the components of the task may drop out or coalesce to form a whole. Learning, then, in a sense is learning what elements of the task may be ignored (see Langer, 1978). Repeated practice with the task as a whole rather than with the individual parts may lead to a strange turn of events. The person who has overlearned the task, the expert, may be in a position of knowing that he/she can perform the task, without any longer knowing how he/she performs it, that is, without knowing the steps or components that make up the performance. If external factors like labels then led the individual to question his or her competence on these overlearned tasks, the individual would have difficulty supplying information about the solution process as evidence of competence and could erroneously infer incompetence.

The present research was designed to assess this relationship between amount of task [illegible] and vulnerability to an inferior label.
In Phase 1 of the paradigm used, baseline measures are taken as pairs of subjects individually perform a task successfully. In Phase 2 subjects perform a different task together, with one in the role of a “worker” and the other as the “boss.” In Phase 3, subjects return to the original task. To test the relationship between vulnerability to external factors like labels and amount of task experience, task experience was also varied in the present investigation. In Phase 1 subjects either were given an opportunity to overlearn the task or were given only a moderate amount of practice with that task. In Phase 2 a no practice group was introduced, and all subjects performed a second task under one of three labeling conditions (assistant, boss, no label). In Phase 3 all subjects returned to the original task. A curvilinear relationship between the amount of practice and susceptibility to a label connoting relative inferiority was predicted. Since the vulnerability is presumed to be a function of people losing sight

of the intervening steps of the task, it was 
expected that individuals who had no prior 
task experience and those for whom the task 
had become overlearned would show the per- 
formance decrement in the inferior label con- 
ditions. However, it was predicted that since 
moderate practice groups would be necessarily 
attentive to the components of the target 
task, they would not show the decrement. 


Experiment 1 


The first study employed a 3 × 3 design
where the variables of interest were label (no
label, assistant, boss) and amount of practice
(no practice, moderate practice, overpractice).


Method 


Subjects. One hundred twenty-six adults were 
recruited from Boston and New York airport 
lounges to participate in a study concerned with 
developing methods to improve task performance. 
Since there were far more women at the airports 
than men, either waiting for their own planes or 
for the departures or arrivals of friends and relatives, we recruited only females.² They were informed that for our research we needed people to
perform different tasks for us, individually and in 
pairs. Once 1 subject agreed to participate in the 
study, a 2nd subject, a stranger to the first, was 
recruited to be her partner. Thus all subjects were 
run in pairs, and there were 14 subjects per cell. 
The experiment was conducted by one female and 
five male experimenters. Experimenters who ad- 
ministered the tasks were blind to the label assign- 
ment. 
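As a quick consistency check on the design, the sketch below enumerates the nine cells of the 3 × 3 layout and confirms that 14 subjects per cell accounts for the 126 adults recruited. The variable names are illustrative, and the error degrees of freedom line assumes a fully between-subjects analysis of variance (total subjects minus number of cells).

```python
from itertools import product

# Factor levels as named in the text.
LABELS = ("no label", "assistant", "boss")
PRACTICE = ("no practice", "moderate practice", "overpractice")
SUBJECTS_PER_CELL = 14

# Every label x practice combination is one cell of the design.
CELLS = list(product(LABELS, PRACTICE))
TOTAL_SUBJECTS = len(CELLS) * SUBJECTS_PER_CELL

# Assumption: between-subjects ANOVA, so error df = N - number of cells.
ERROR_DF = TOTAL_SUBJECTS - len(CELLS)

for label, practice in CELLS:
    print(f"{label:10s} x {practice:17s}: n = {SUBJECTS_PER_CELL}")
print("cells:", len(CELLS), "subjects:", TOTAL_SUBJECTS, "error df:", ERROR_DF)
```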

Task. A novel task was employed so that both 
extremes of task experience, no practice and over- 
practice, could be compared. It was also necessary 
for the task to have several parts so that it could be 
determined whether, as predicted, overpractice re- 
sults in obfuscation of the task components. To this 
end a coding task was devised that involves trans- 
lating English sentences into a different language 
where every letter is represented by a corresponding 
symbol and number. Letters were alphabetically 
arranged in groups (two groups of nine and one of 
eight letters); each group had a different symbol 
(a triangle, circle, or square), and each letter in 
each group had a number from one to nine (eight). 
(Punctuation marks had only symbols, rather than symbols and numbers.) For example, the letter A had as the corresponding symbol a triangle and the number 1, and the letter J had as the symbol a circle and the number 1. Subjects were given computer cards on which sentences had been typed, and they were required to transcribe these sentences into the new language. They were to write the symbol below each letter and circle the corresponding number in the column for each letter of the sentence. This task permits subjects to chunk the various elements into fewer and fewer units, thereby facilitating overlearning with repeated practice.

² Although the subjects in these studies were all females, there is no reason to assume that the results would not generalize to males. Further investigation is required before this can be determined conclusively, however.
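The letter-to-symbol scheme used in the coding task can be sketched as follows. The text gives A as triangle 1 and J as circle 1, with two groups of nine letters and one of eight; assigning the square to the last group (S through Z) is an inference by elimination, not something the text states. Which symbols were used for punctuation is not specified, so this sketch simply skips non-letters. The function names are hypothetical.

```python
import string

# Symbol order: triangle and circle are given in the text for A and J;
# square for the third group is inferred by elimination (assumption).
SYMBOLS = ("triangle", "circle", "square")
GROUP_SIZE = 9

def build_codebook():
    """Map each letter to (symbol, number) with groups A-I, J-R, S-Z."""
    codebook = {}
    for index, letter in enumerate(string.ascii_uppercase):
        group, position = divmod(index, GROUP_SIZE)
        codebook[letter] = (SYMBOLS[group], position + 1)
    return codebook

CODEBOOK = build_codebook()

def encode(sentence):
    """Translate a sentence letter by letter.

    Non-letters are skipped because the punctuation symbols are not
    described in the text (assumption).
    """
    return [CODEBOOK[ch] for ch in sentence.upper() if ch in CODEBOOK]

for letter, code in zip("JAZ", encode("Jaz")):
    print(letter, code)
```

Each letter requires two lookups (group symbol and position number), which is what lets a practiced coder chunk the steps into fewer units.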

For practical purposes it was important to determine a criterion for overpractice in advance. To
that end, the task was pretested on 18 adult female 
subjects. Subjects were instructed to inform us when 
they felt as though they could perform the task 
automatically. After coding two sentences, none of 
the subjects had overlearned the task. After coding 
six sentences all subjects were performing the task 
at least 50% more quickly and more accurately than 
they had done originally: they no longer referred to 
the coding reference sheet, their performance 
reached an asymptote, and by this time they had 
all reported that the task had become automatic. 
Therefore, the overpracticed group in the present 
experiment was given six practice sentences, whereas 
the moderate practice group was given only two. 

Procedure. In Phase 1 of the study subject pairs 
were randomly divided into three groups: no prac- 
tice, moderate practice, and overpractice on the 
coding task. After the task was explained and dem- 
onstrated, the overpracticed group was instructed 
to code six sentences, the moderate practice group 
was instructed to code two sentences, and the no 
practice group did no coding. This group went
immediately to Phase 2 of the study. Each sentence 
appeared on a separate computer card. While one 
experimenter administered this task, a second ex- 
perimenter, who was hidden from view, recorded the 
amount of time it took subjects to complete the 
first sentence. The amount of time it took to com- 
plete this sentence and the number of errors made 
were used as baseline measures for the moderate and 
overpractice groups. 

All subjects now participated in Phase 2 of the 
research. They were each told that the task they 
would be asked to perform next required a joint 
effort: One of them would be the assistant and one 
the boss (one-third of the pairs performed this task 
without labels indicating relative status). Subjects 
were asked to examine a picture and find objects 
that were hidden in it. One person was to find the 
objects, and the other was to record the objects
found. To control for task effects, for half of the
pairs the boss did the searching and the assistant
the recording, whereas for the remaining half the 
tasks were reversed. The experimenter handed sub- 
jects the envelope with instructions to place the 
task materials back in the envelope when they had 
finished. At that time she/he would return to give 
them further instructions. In the meantime she/he 
was purportedly going to recruit two more subjects. Subjects chose a slip of paper from the envelope that indicated both whether they would be in the boss, assistant, or no label condition and what their task would be. When the second experimenter, whose presence was unknown to the subject, observed that the task had been completed she/he signaled the first experimenter, who then returned to complete the final phase of the experiment and thus was blind to the labeling condition.

Table 1
Mean Number of Minutes to Complete the Postlabel Coding Task

                                   Label
Condition             Assistant    No label    Boss
No practice           7.10[?]      5.50[?]     5.40[?]
Moderate practice     4.50[?]      4.54[?]     4.72[?]
Overpractice          4.40[?]      3.28[?]     3.10[?]

Note. All cell ns = 14. Cells bearing different subscripts are significantly different from each other: subscript a differs from all other cells at p < .001, and subscript c differs from b cells at p < .05.

In the last phase of the study, subjects were told 
that an individual effort rather than a joint effort 
again was required. All subjects then were asked to 
code another sentence. (This, of course, was the first 
sentence for the no practice group.) Once again the 
second experimenter timed their performance. Both 
speed and accuracy served as dependent measures.

At the end of the task, all subjects were asked 
to complete a brief questionnaire. They were
asked to make a list of the components of the task.
Specifically, subjects were told, “To do the coding
task, certain steps were involved. For instance, you
had to look at the letter to be coded. Please list as
many steps as you can think of which you used to
perform the coding task.” When subjects completed
this questionnaire, they were completely debriefed. 


Results and Discussion 


Before examining the effects of the label, it
is important to verify group comparability at
the start of the experiment. Therefore scores
for all of the groups that performed the
prelabel task were compared. As expected, the
analysis of variance of prelabel scores (the
amount of time it took subjects to code the
first sentence and the number of errors they
made) showed no differences among the
groups for which this measure was available:
the no label, assistant, and boss, moderate
practice and overpracticed groups. Speed
scores ranged from 5.56 to 5.89, and accuracy
scores ranged from 2.36 to 3.07.


It was expected, however, that differences
would emerge on postlabel measures. A 3
(No Label, Assistant, Boss) × 3 (No Practice,
Moderate Practice, Overpractice) analysis
of variance was performed on both the
speed and accuracy of the postlabel coding
task. The analysis performed on speed scores
yielded highly significant main effects for
practice, F(2, 117) = 133.82, p < .001, and
label, F(2, 117) = 25.05, p < .001, as well as
a highly significant interaction, F(4, 117) =
[illegible], p < .001.³ Table 1 presents the mean
number of minutes it took to complete the
postlabel coding task for each group. A
Newman-Keuls test was performed to understand
better the relationships among these means
and thus the nature of these significant
effects. Cells bearing different subscripts in the
table are significantly different from each
other by this test. As may be gleaned from
the table, subjects in the no practice/assistant
condition took longer to complete the
postlabeling task than did subjects in all other
conditions (p < .001), whereas subjects in
the no label/overpracticed and boss/overpracticed
groups performed most quickly (p <
.05). It is, of course, not surprising that
subjects who have had enough practice to
overlearn a task perform that task faster than
subjects do who have had very little practice.
What is interesting is the comparison of the
assistant and the no label groups within each
practice condition. As predicted, the label
assistant had a detrimental effect for both
the no practice and the overpracticed groups
but had no effect on the moderate practice
groups. The analysis of accuracy scores yielded
highly significant main effects for practice,
F(2, 117) = 12.02, p < .001, and for label,
F(2, 117) = 35.85, p < .001, and a highly
significant interaction, F(4, 117) = 9.80, p <
.001. The mean number of errors made by
each group is shown in Table 2. A Newman-
Keuls test revealed that the means that bear
different subscripts in the table are significantly
different from each other. Once again,
relative to the moderate practice group,
the label assistant seems to have had a
detrimental effect for both the no practice and
overpracticed groups. Subjects in both of
these groups made at least twice as many
errors as subjects in the other conditions did.
Although 86% of the overpracticed/assistant
group showed a decrement in performance
from Phase 1 to Phase 3, only from 7% to
14% of the remaining relevant groups performed
more poorly, χ²(2) = 6.08, p < .05.

Table 2
Mean Number of Errors Made in Postlabel
Coding Task

                                  Label
Condition            Assistant   No label   Boss
No practice            5.35a      2.36c     2.00c
Moderate practice      1.64c      1.93c     1.50c
Overpractice           4.93b      1.07c     1.14c

Note. The cell bearing subscript a differs from those
bearing subscript c at p < .001. Cell b differs from
cells c at p < .01. Cells a and b do not differ from
each other.
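
The degrees of freedom in the F ratios reported for Experiment 1 follow from the design itself. As a minimal sketch, assuming a fully crossed between-subjects design with equal cell sizes (the function name `factorial_anova_df` is hypothetical and is not part of the original report), the arithmetic can be checked as follows:

```python
def factorial_anova_df(levels_a, levels_b, n_per_cell):
    """Degrees of freedom for a two-way between-subjects ANOVA with equal cell sizes."""
    n_cells = levels_a * levels_b
    total_n = n_cells * n_per_cell
    df_a = levels_a - 1              # main effect of the first factor
    df_b = levels_b - 1              # main effect of the second factor
    df_interaction = df_a * df_b     # A x B interaction
    df_error = total_n - n_cells     # within-cell (error) term
    return {
        "factor_a": df_a,
        "factor_b": df_b,
        "interaction": df_interaction,
        "error": df_error,
    }


# Experiment 1: 3 (label) x 3 (practice) design with 14 subjects per cell.
exp1_df = factorial_anova_df(3, 3, 14)
print(exp1_df)
# {'factor_a': 2, 'factor_b': 2, 'interaction': 4, 'error': 117}
```

These values agree with the reported F(2, 117) for the main effects and F(4, 117) for the interaction. The sketch treats each subject as an independent observation; the more conservative test mentioned in the footnote, which counts pairs rather than individuals, would use a smaller error term.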

Although decrements were expected on the 
target task, subjects in the assistant groups 
should not show a decrement in performance 
on the intervening hidden objects task be- 
cause this task has been defined as one that 
is consistent with their label, that is, it is an 
“assistant” task. The analysis of these scores 
in fact revealed no group differences. This 
further suggests that the assistant subjects 
took the task as seriously as the other groups 
of subjects did (see Langer & Benevento, 
1978). 

Although subjects in all assistant groups
apparently took the label seriously, only
those who had no prior experience with the
target task and those who were overpracticed
on it were debilitated by the inferior status
label. It was hypothesized and indeed found
that this label would not adversely affect
subjects in the moderate practice group. This
group was expected to be protected from the
potentially debilitating effects of the label
because their experience with the task, in
contrast to that of the no practice group, has
taught them the components of the task, but
not so much as to obscure them, in contrast
to the overpracticed group. After performing
the last task, subjects in each group were
asked to list all the components or steps they
could think of that were involved in performing
the coding task. The mean number
of steps listed for the no practice and
overpracticed groups ranged from 3.07 to 4.43,
whereas the means for the moderate practice
groups ranged from 6.07 to 6.21. The analysis
of variance revealed a significant main
effect for the amount of practice, F(2, 117) =
30.95, p < .001. A Newman-Keuls test
confirmed the prediction: The moderate practice
groups listed significantly more steps than
either the no practice or overpractice groups,
which were not different from each other.

³ Because subjects were run in pairs, there may
be a loss of degrees of freedom. A more conservative
test that uses the degrees of freedom based on
the number of pairs within groups shows that each
of the Fs in this study is still significant at the
p < .001 level.

If the components of a task do indeed be- 
come obscured with practice, this should be 
revealed not only in the number of steps 
listed but also in their degree of specificity. 
To assess this, each subject’s list was rated 
by two blind raters (r = .92) on a 10-point 
scale that ranged from very specific to very 
global. The analysis revealed a significant 
group difference, F(2, 123) = 31.38, p <
.001. A Newman-Keuls test showed that the
responses of the no practice group (M = 7.5) 
were equivalent in specificity to the over- 
practiced group (M = 6.57) and that both 
were far more global than the moderate prac- 
tice group (M = 3.6). Typical steps listed by 
the no practice and overpracticed groups 
were “Look at the letter and write in the 
code.” In contrast, typical steps listed by the 
moderate practice group included such things 
as “Fill in the circle if it is a capital and 
check for punctuation.” Of course, the more 
specific one is, the more steps are available to 
be listed. Therefore, although both measures 
are interesting, they should not be viewed as 
independent of one another. 

The performance decrements found in this 
study occur for tasks for which the com- 
ponents are not salient (a novel task and an 
overlearned task), after the individual had 
experience in an inferior status position. 
Therefore one may ask whether these decrements
can be prevented by making components
of the task salient. The next study
was undertaken with this in mind. 


Experiment 2 


This experiment used basically the same
paradigm as that described in Experiment 1.
Subjects in Phase 1 performed a task
successfully, in Phase 2 they performed a
different task in one of three label conditions
(assistant, no label, boss), and in Phase 3
they returned to the original task. In this
experiment, however, only one level of practice
was used. A task was selected that subjects,
in all likelihood, already performed
automatically. To see whether the debilitation
could be prevented by making components
of the task salient, task components
were listed for half of the subjects before
they began Phase 1 of the experiment. Thus
the study utilized a 2 (Components Salient/
Nonsalient) × 3 (Assistant, No Label, Boss)
design. It was predicted that relative to the
no label condition, the assistant condition
would show a decrement in performance, and
the boss condition would either show an
increment in performance (as in the Langer &
Benevento, 1978, study) or would be equivalent
to the no label group (as in Experiment
1 reported in this paper). It was predicted,
however, that the performance decrement
would occur only in the components-nonsalient
condition. That is, it was expected
that the salience manipulation would prevent
the debilitation, since these subjects now
would have a set of task components recently
used and verified to supply as evidence if
they questioned their ability.


Method 


Subjects. Seventy-two adult females were
recruited to serve as subjects from the lounges at
Logan Airport in Boston. They were asked to
participate in a study on developing educational
methods where subjects would be asked to perform
different tasks either individually or with another
person. Once one subject agreed to participate, a
second person, who was a stranger to her, was
recruited to be her partner. Thus, as in Experiment
1, subjects were run in pairs.

Task. The target task was a proofreading task.
This was selected because it is a task that all
literate people have performed (for example, in
reading over a term paper for school, in reading
over a letter before sending it, and so on) and
because it is a task where many rules may be
employed without the individual's necessarily
knowing what they are. An individual can locate
errors without being able to articulate why the
error is an error. Two articles were chosen from
popular women’s magazines for the task. Their
titles accurately reflect their content: One was
entitled “Is your doctor overcharging you?” and
the other was “Energy savers’ guide.” The articles
were edited so that each contained 40 errors
distributed evenly throughout each article. There
were 13 spelling errors, 9 errors of capitalization,
12 punctuation errors, and 6 grammatical errors
in each article.

Procedure. All subjects were initially told that
one task found helpful for developing educational
methods is proofreading.


Virtually everyone has done this task although 
they do not necessarily call it proofreading. For 
example, if you ever read over a paper for school 
before handing it in, you were proofreading. Or 
if you ever read over a letter you wrote before 
sending it to see if it said what you intended it
to say, you were proofreading. Therefore we
would like you to proofread a story for us.


Components-salient condition: For this condition
the experimenter went on to say,


Most people can do this task even though they
are not aware of just how they do it. That is,
they are not always aware of the rules they are
using. For instance, if you knew the word “susan”
should be typed “Susan,” you would be using
the rule that the first letter of proper nouns is
always capitalized. We would like you to make a
list of at least three things which you would
need to know in order to correct a story for
errors.


Once subjects completed their lists, they were told
that we would like them to remember these rules
as they read the story we were about to give
them to proofread.


When you find an error, try to recall the rule
and then circle the error. You are only
given 3 minutes to read the story, so do not
expect to finish it. Most people who do this task
can find six errors in 3 minutes. If you are able
to do this, you should consider yourself
successful at this task.


The time allotted for this task was intentionally
brief in order to prevent overlearning of the
components.

Components-nonsalient condition: Subjects in
this condition were also told that proofreading is
a task that most people can do. In addition, they
were told,


In order to proofread a story for errors, you 
should at least glance at each word in the story 
to see if it is correctly written. For instance, if 
the word “susan” appeared, you would know it 
is an error because the word should read “Su- 
san.” First, we would like you to make a list of 
at least three ways in which proofreading is used.
We would like you to read this story and circle 
any errors that you find. You are only given 3 
minutes to read the story, so do not expect to 
finish it. Most people who do this task can find six 
errors in 3 minutes. If you are able to do this, 
you should consider yourself successful at this 
task. 


Thus both groups were led to expect success on 
the task. Instructions encouraged both to pay atten- 
tion to the story, both were given an example of an 
error, and both were asked to make task-relevant 
lists. The difference was that rules for determining
errors were generated only by the components- 
salient group. The task was pretested on a similar 
group of subjects to ensure that all experimental 
subjects would be successful at the task, that is, that 
each subject would find at least six of the errors. 

After subjects successfully completed this task, 
they were told that the second task required a joint 
effort. One person was to solve a cryptogram task 
while the other person used the stopwatch provided 
to keep track of the solution times. The cryptogram 
task consisted of four lists of five words each that 
were written in number form. Each number had a 
corresponding letter (A=1, B=2, and so on). The 
subjects’ task was to transcribe the numbers into 
their alphabet equivalents with the use of a code. 
Two-thirds of the subject pairs from each of the 
two conditions described above were told that the
task thus “required” someone to be the boss and 
someone to be the assistant. One-third of the sub- 
ject pairs performed this task without reference to 
relative status (no label condition). Task effects 
were controlled for by varying the label assigned 
to the particular task (timing vs. solving). That is, 
for example, for half of the bosses the task was 
solving and for half the task was timing. After one 
subject in each pair had been asked to indicate the 
task she wanted to perform, the pair was informed 
which task was the boss task and which was the 
assistant task. Subjects worked together on the 
cryptogram task for approximately 5 minutes and 
then were informed that their joint participation 
was concluded. All subjects then were asked to 
proofread another story. As in Phase 1 of the 
study, they were given 3 minutes to locate as many 
errors in the story as they could. These scores, in 
comparison with the prelabel proofreading scores, 
comprised the primary dependent measure. 
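
The number-to-letter code used in the cryptogram task (A = 1, B = 2, and so on) can be illustrated with a short sketch. The function name `decode_cryptogram` and the sample word are hypothetical; the original word lists are not reported.

```python
def decode_cryptogram(numbers):
    """Translate a sequence of numbers into letters using A=1, B=2, ..., Z=26."""
    letters = []
    for n in numbers:
        if not 1 <= n <= 26:
            raise ValueError(f"{n} has no letter equivalent in an A=1 ... Z=26 code")
        # ord("A") is the code point of "A"; offset by n - 1 to reach the nth letter.
        letters.append(chr(ord("A") + n - 1))
    return "".join(letters)


# Illustrative item only: 3-1-20 transcribes to "CAT".
print(decode_cryptogram([3, 1, 20]))
```

A task of this kind requires only a mechanical lookup, which fits its role as an intervening activity that either partner could carry out regardless of label.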

At the end of the proofreading task, subjects were
asked to complete a brief questionnaire. They were
asked to make a list of the components of the
proofreading task. Specifically, subjects were asked
to “make a list of rules that you know about writing
which helped you to identify errors in the
stories.” The time it took each subject to make
the list was recorded. When subjects had completed
the questionnaire, they were completely
debriefed.

Table 3
Mean Number of Postlabel Proofreading
Errors Correctly Located

                                      Label
Condition                Assistant   No label   Boss
Components salient         14.33a     13.99a    14.41a
Components nonsalient       8.92b     13.00a    17.42c

Note. All cell ns = 12. Cells bearing different
subscripts are significantly different from each
other: subscript b differs from all other cells at
p < .005, and subscript c differs from all a cells
at p < .05.


Results 


An analysis of the prelabel scores revealed 
no difference among the groups. Thus there 
was group comparability at the start of the 
experiment. Although the prelabel scores were 
equivalent across the groups (ranging from 
13.33 errors found to 14.67), a very different 
picture emerges when one examines the post- 
label scores. Table 3 shows the mean number 
of proofreading errors located by each of the 
six experimental groups. The analysis of these 
scores yielded a highly significant main effect 
for label, F(2, 66) = 8.09, p <.001, and a 
highly significant interaction, F(2, 66) = 
7.74, p < .001.⁴ A Duncan test was computed
to reveal the nature of this interaction. As 
may be ascertained from the subscripts in the 
table, the major hypothesis of the study was 
supported. The components-nonsalient group 
replicated the Langer and Benevento (1978) 
finding such that, relative to the no label con- 
dition, the assistant group showed a severe 
decrement in performance, whereas the boss 
group showed a facilitation effect. However, 
the salience manipulation was successful in 
wiping out the differences among the groups. 
The scores for the components-salient condi- 
tion are virtually identical to the prelabel 
scores. The correlation for the components-salient
groups between their performance on
the prelabel and postlabel stories was .83.
Similarly, the correlation between prelabel and
postlabel stories for the nonlabeled subjects
in the components-nonsalient group was .95.
Although these groups found as many errors
in the second story as in the first, the
assistants in the components-nonsalient group
found on the average 5.75 fewer errors in the
second story. The bosses in this group found
2.67 more errors on the average than they
had in the prelabel story.

Of those subjects in the components-nonsalient/assistant
group, 92% showed a decrement
from Phase 1 to Phase 3. In contrast to
this, only 25%-33% of the remaining groups
showed a decrement in performance. Both
2 × 2 comparisons are significant [salient/
nonsalient by assistant/boss: χ²(1) = 4.48,
p < .05; salient/nonsalient by assistant/no
label: χ²(1) = 3.89, p < .05].

To test again that subjects who were labeled
assistant took the task in Phase 2
seriously, scores for the intervening cryptogram
task were compared. As predicted, there
were no differences among the groups.
Assistants solved the puzzles as quickly as the
remaining groups. (Solution times ranged
from 43.18 sec to 47.37 sec.)

Further evidence that the salience manipulation
was responsible for restoring subjects
to their prelabel performance comes from
the number of rules subjects reported using in
the proofreading task and the time it took
them to compose this list. Subjects in the
salience condition, if they were keeping the
rules in mind while performing the task as
instructed, should have been able to list more
rules more quickly than the components-nonsalient
group could. The analysis of the number
of components listed (salient M = [illegible]
vs. nonsalient M = 2.78, t = 2.98, p < [illegible])
and the analysis of time to compose the
lists (salient M = 44.56 vs. nonsalient M =
73.81, t = 6.39, p < .001) suggest that this
was the case.

One might argue that subjects’ ability to
generate components of the activity
before and after the task suggests that, counter
to the position presented here, the components
are not inaccessible. However, although subjects
in the components-nonsalient condition
listed rules after the task, that does not mean
that they used those rules (i.e., identified
errors as a function of realizing why they
were errors) while performing the proofreading
task (see Dweck & Gilliard, 1975). In
fact, it is important to demonstrate that they
did not use the rules, whereas the components-salient
group did. To do this, the next analysis
was conducted to see the relationship between
the ability to list certain rules and the
actual use of those rules. Since all subjects
made a list at the end of the last proofreading
task, this list was used for the first comparison.
Table 4 shows the correlation coefficients
for the type of rule listed by the proportion
of errors found for each type of error.
Correlation coefficients bearing asterisks are
significant at p < .001. As may be seen in the
table by looking at the diagonals, the relationship
between listing and finding particular
errors was significant for the components-salient
group, but there was no such relationship
for the components-nonsalient group.
Thus, for example, subjects who listed spelling
found more spelling errors than subjects
who did not list spelling in the components-salient
condition, and this was not the case
for the components-nonsalient group.

Table 4
Between-Subjects Correlation Coefficients for Rules Listed at End of Experiment (Yes/No)
by Proportion of Errors Found

                                     Errors found/total errors
Type of rule             1              2              3              4
Components-salient condition
1. Spelling          .69* (.72)    -.40 (.08)     -.20 (-.08)    [illegible]
2. Capitalization   -.30  (.01)     .68* (.60)    -.01 (.15)     -.40 (-.32)
3. Punctuation       .00  (.14)    -.01 (-.10)     .67* (.70)    -.46 (-.49)
4. Grammar           .37  (.34)    -.05 (-.18)    -.34 (-.26)     .82* (.63)
Components-nonsalient condition
1. Spelling         -.35            .07           -.13            .01
2. Capitalization    .09           -.13            .04            .14
3. Punctuation      -.05            .10           -.16           -.14
4. Grammar           .34            .13           [illegible]    -.25

Note. All cell ns = 36. Numbers in parentheses represent correlations between rules listed at the start of
the experiment by proportion of errors found.
* p < .001.

⁴ With degrees of freedom based on the number of pairs within groups,
this more conservative test lowers the significance level of
these measures to .01.

As may be seen in the top portion of Table 
4, the correlations between rules listed at the 
beginning of the experiment and the propor- 
tion of errors found (these correlations are 
reported in parentheses in the table) are also 
significant at p < .001. This strongly sug- 
gests that the salience manipulation was effec- 
tive in providing subjects with rules to use 
while they performed the proofreading task. 

To explore further the relationship between 
rules listed and errors found, a second corre- 
lational analysis was performed on within- 
subjects scores. This analysis yielded a co- 
efficient that represented, for every subject, 
the relationship between rules listed and the 
proportion of errors of each type that were 
found. These correlation coefficients were then 
transformed into Z scores, and the resulting 
analysis yielded a significant effect for salience, 
F(1, 66) = 71.12, p < .001. The mean corre- 
lations between the types of rules listed and 
the types of errors found ranged from —.29 
to —.01 for the components-nonsalient condi- 
tions and from .77 to .86 for the components- 
salient groups. Clearly there was a strong 
relationship between rules listed and the rules 
used for the components-salient group that
was absent for the components-nonsalient
group.
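
The report does not say which transformation was applied to the within-subject correlations. A common choice when correlations are to be averaged or entered into an analysis of variance is Fisher's r-to-z transformation, z = atanh(r). The sketch below assumes that transformation; the function names are hypothetical, and the input values are illustrative only (they are the reported endpoints of the mean-correlation ranges, not individual subjects' data).

```python
import math


def fisher_z(r):
    """Fisher r-to-z transform; defined only for -1 < r < 1."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return math.atanh(r)


def mean_correlation(rs):
    """Average correlations in z space, then map the mean back to the r scale."""
    zs = [fisher_z(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Illustrative inputs: endpoints of the reported ranges of mean correlations.
salient = [0.77, 0.86]
nonsalient = [-0.29, -0.01]

print(round(mean_correlation(salient), 2))
print(round(mean_correlation(nonsalient), 2))
```

Working in z space matters most for large correlations such as those in the components-salient groups, because the sampling distribution of r is strongly skewed near 1 and the transformation makes it more nearly normal.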


Discussion 


Both experiments provide strong support
for the assumption that overlearning may
lead to increased vulnerability to labels that
connote relative inferiority. In the paradigm
used, the effect of the label reveals itself in
the final phase of the experiments. Subjects
in the second phase of both studies are
performing what is deemed by the label
assistant to be a low status task. There is no
reason, therefore, for them to question their
ability to perform it. However, the original
and final tasks are not assistant tasks. To
the extent that subjects accept the validity of
the label, they may question their competence
with regard to tasks that may now be
perceived as psychologically inconsistent with
that label. Such questioning will not necessarily
have negative consequences, if the individual
can convince himself or herself that he
or she in fact can perform the task. This evidence
must be lacking with respect to a novel
or unfamiliar task. If, as the present studies
suggest, overlearning results in the obfuscation
of the individual components of the task,
evidence also will be lacking in this case.
Therefore only groups with moderate experience
should be able to proceed with the task
unhampered by the label. Experiment 1
showed this to be true. Performance decrements
resulted after the label assignment for
the no practice and the overpracticed groups
but not for the moderate practice groups.
Since the overpracticed group was able to
generate task components [...]
someone else, for example, [...]
components alone would be [...]
the untreated overpracticed [...] Experiment
2 showed no relationship between the
steps generated and the steps they actually
used in performing the task. Being able to
list components that could possibly be used
to complete a task should not be as confidence
inducing as knowing steps that were
experienced successfully.

The second experiment provides evidence
that the decrement in performance that may
occur on overlearned tasks may be prevented.
It will be recalled that this experiment
compared two overpracticed groups: a group that,
except for the tasks used, basically was
treated just as the overpracticed group in the
first experiment had been and a second group
for whom the components of the task were
made salient. The performance decrement
occurred again for subjects bearing the
inferior label in the untreated group. Making
the components of the task salient, however,
apparently inoculated the comparison group
of subjects against the potential effects of the
label.
One may question how the salience manipulation
in Experiment 2 could have been
effective in light of the assumption, for which
all of us can summon anecdotal support, that
renewed attention to that which is overlearned
is disruptive (cf. Kimble & Perlmutter,
1970; Langer, 1978). The salience manipulation
made subjects attend to an overlearned
task, yet their performance was not
hampered. The reason for this is likely to lie
in the fact that subjects were asked to generate
steps necessary to perform the task
before performing it. Subjects in the components-salient
group showed a strong relationship
between the components they generated
initially and the components they attended
to while performing the task, whereas
subjects in the group left to their own devices
(the components-nonsalient group)
showed no relationship between the components
they generated afterward and the components
they attended to while performing the
task. This suggests that the salience manipulation
led subjects in some sense to follow a
new set of rules for themselves. Therefore, if
they were approaching the task with a new set
of rules (thereby changing the task to a
nonoverlearned task), attention to those rules
(which was revealed in the high correlation
between the rules cited and the ones actually
used) should not be disruptive. And of course
performance on an overlearned task also
would not be disrupted if it simply were performed
automatically without bringing competence
into question. The proficiency manifested
by subjects in the overpracticed/no
label and boss groups in Experiment 1 bears
this out. Thus although negative external
factors like pejorative labels may be quite
debilitating, they, of course, are not necessarily
so. What about positive external factors?
The data from the present studies do
not permit any clear conclusions with respect
to the label “boss.” More research on situational
and individual difference variables is
necessary before any conclusions about positive
labels can be drawn.⁵

In addition to exploring the potentially
facilitating effects of positive external influences
like labels, additional research is also
required to understand more fully the relationship
between particular task characteristics
and performance debilitation of this
sort. The present analysis would suggest that
tasks that have only one step, tasks whose
performance requires the articulation of steps
(e.g., learning the alphabet), or tasks that
are so complex as to preclude complete mastery
would not render the individual vulnerable
to external influences. Vulnerability may
be evident for many, if not most, other tasks,
however.

It is clear that overlearning is adaptive,
since it frees limited attention to be paid
elsewhere. It would seem from their daily
[...] lives, which facilitate overlearning.
While acknowledging the adaptive function
mindlessness or overlearning may serve in
freeing limited capacity, the present article
supports previous work (Langer, 1978, 1979;
Langer, Blank, & Chanowitz, 1978; Langer
& Newman, 1979; Langer, Note 1; Langer
& Weinman, Note 2) in demonstrating the
ways in which mindlessness may also be
maladaptive.

Although the detrimental effects of overlearning
may be pervasive, it is also likely that
some subject populations are more susceptible
to these effects than others are. For example, 
the elderly are a group who for many reasons 
(see Langer, 1978, 1979, Note 1) probably 
have more experience with overlearned tasks 
than the nonelderly do. Children probably 
have more experience with no practice tasks 
relative to nonchildren. Thus these two groups 
should be most susceptible to external fac- 
tors, such as pejorative labels, that may lead 
one to question one’s competence. Although 
the groups may look similar, they are of 
course quite different in that the elderly can 
in fact do the tasks in question. Similarly, 
people who chronically occupy low status jobs 
are likely to be particularly susceptible. How- 
ever, most interesting, perhaps, would be the 
inclusion of experts on this list. 

Although one probably does not want to 
discourage overlearning or the attainment of 
mindlessness because of its general advantage 
in fast and efficient performance, educating 
people as to the potential side effects would 
seem fruitful. However, the present research 
suggests another, more direct way in which 
potential performance decrements may be prevented.
Stated generally, focusing on process
rather than on outcome may reduce vulnerability
to debilitation of this kind.

⁵ In the original Langer and Benevento studies,
the label “boss” had a facilitating effect such that
relative to the no label conditions, these groups
performed significantly better. Although this was
also true for Experiment 2 in the present investigation,
it was not true for the first experiment. Here
subjects labeled boss within each level of practice
were equivalent to the no label groups. The fact that
the “boss” group across all four studies did not show
a decrement in performance suggests that the majority
of subjects in this condition are not questioning
their ability to perform the task, for such
questioning for the overlearned and no practice
groups would reveal to them an inability to find
evidence to support the label. If one goes back to
the original studies (Langer & Benevento, 1978), one
finds that whereas the boss groups on the average
performed better than the no label groups, the
proportion of subjects who performed this way is
the same for both groups. Although most subjects
are unquestioningly accepting the label and trying
to live up to it, it would appear that a few are not
so accepting. The relative proportions from experiment
to experiment of this latter minority group
would determine whether the aggregate measures
reveal equivalence, facilitation, or perhaps even
debilitation.
An analysis is offered of the role of unequal weighting in the averaging model
of information integration. A distinction is made between unequal weighting 
at the normative level (which has been referred to as “differential weighting”) 
and unequal weighting at the level of the individual subject (which we refer 
to as “idiosyncratic weighting”). Two studies are reported that examine the 
prevalence of idiosyncratic weighting in the trait-judgment impression forma- 
tion task. Whereas most past research on the question of unequal weighting 
in this task involved averaging responses across both subjects and stimulus 
replications, the present studies were analyzed at the level of an individual 
subject's repeated responses to separate stimulus replications. Clear evidence of 
idiosyncratic weighting was obtained from about 50% of the 120 subjects; 
only 20% of the subjects indicated absolutely no tendency toward unequal 
weighting. There was no evidence that idiosyncratic weighting was restricted 
to just a subset of stimuli, since all of the 20 stimulus replications showed 
idiosyncratic weighting effects. In contrast to previous findings, negative traits 
did not always receive more weight than positive traits. In more than 20% of 
the instances of unequal weighting, the more positive trait was accorded a
higher weight.


This research was supported by National Science
Foundation Grant GS-38604. The authors are grateful
to ...

Information integration theory (Anderson,
1974) offers an approach for understanding
how people combine stimulus information
when making judgments and decisions. The
theory seeks to determine the nature of the
integration rule (e.g., adding, averaging,
multiplying) employed by people in various
response domains. In addition, it provides a way
to determine whether all stimulus items in the
set contribute equally to the overall judgment
or whether they carry different weights.

Integration theory provides no a priori
basis for predicting which integration rule or
weighting assumption is correct for any particular
response domain. It does, however,
provide an array of conceptual alternatives
along with a diagnostic methodology (functional
measurement). With a comprehensive
series of studies, it is possible to determine 
which conceptual alternatives best describe a 
response domain. Hence, one of the great 
strengths of integration theory is its ability to 
uncover different integration rules for differ- 
ent response domains. 

In order to apply integration theory to a 
response domain, it is necessary to specify 
two features of that domain, the population 
of information or stimulus items and the na- 
ture of the subjective judgment continuum. 
It is necessary to specify these two features 
because a shift in either may affect the inte- 
gration rule or parameter values. For example, 
one integration rule may apply when traits are 
combined with traits (e.g., averaging), and a 
different rule might apply when traits are 
combined with adverbs (e.g., multiplication). 
Parameter values may also vary as the nature 
of the judgment continuum shifts. Whereas 
friendly may have a higher scale value than 
energetic on the favorability continuum, it is
likely that energetic would have the higher 
scale value on an activity continuum. 

Integration theory has been used exten- 
sively to analyze the adjective combination 
task first introduced by Asch (1946) to study 
impression formation processes. The stimuli
used in the impression formation task are per-
sonality trait adjectives, and the response con-
tinuum refers to overall impression favorabil-
ity. At the operational level, most investiga-
tors use the 555 traits scaled by Anderson
(1968) as their domain of stimuli. The judg-
ment scale usually has seven or more intervals
and is anchored with the bipolar terms of
favorable-unfavorable or like-dislike.

It is widely assumed that for the impres- 
sion formation response domain, people em- 
ploy an averaging integration rule and assign 
approximately equal weights to all stimuli in 
the stimulus domain. Anderson has offered
this view of an equal weighting model in a
number of sources (Anderson, 1974, p. 261;
Anderson, 1976, pp. 681-682; Anderson &
Lopes, p. 69; Leon, Oden, & Anderson, 1973,
pp. 301-302; Oden & Anderson, 1971, p.
159). The main objective of the present
article is to examine the assumption that all
personality trait adjectives, as items of person
information, carry approximately equal 
weight in the impression formation task. 

This assumption of equal weighting has
been questioned by some investigators (e.g.,
Birnbaum, 1974). In the following review of
research bearing on the equal weighting question,
we note that existing data are not conclusive
regarding the source or strength of
differences in trait weights. One of the problems
with this research is that most of the
studies are based on group averages. This
paper explores the possibility that such group
data may “average out” idiosyncratic differences
in trait weights that exist separately for
each individual.


Differential Weighting


Two kinds of deviations from equal weighting
have been examined in previous research:
one is related to the lability of a trait’s weight,
and the other concerns the variation in weight
between different traits.
Weight lability. The lability of a trait’s
weight reflects the extent to which the weight
is affected by situational or contextual variables.
There can be little question as to
whether “differential” weighting occurs in the
impression formation domain in this sense of
the term.

Characteristics of the judgment task are
known to affect the functional weight given
any particular item. For example, communicator
credibility (Rosenbaum & Levin, 1968)
and serial position (Anderson, 1965) can act
as situational determinants of trait weight.

Weight can also be influenced by the semantic
relationship that a trait holds with other
items in the set. The weight parameter is affected
by the level of redundancy (Schmidt,
1969) and inconsistency (Anderson & Jacobson,
1965) among traits describing the same
person. Also, there are traits that possess two
distinct semantic meanings (polysemous
homographs), such as discriminating
and sensitive, that may change their scale
value as well as their weight, depending on
other stimuli in the information set. Although
each of the above features can lead to differential
weighting, they all derive from the relationship
between the semantic definitions of
the different traits and can in principle be
specified a priori. They do not represent instances
in which particular traits carry different
weights when in isolation from one another.

Weight differences between traits. The
weights of most, if not all, traits are clearly
labile. However, this does not mean that traits
differ in their natural or context-free weights.
When variables known to influence lability are
held constant, it is possible that trait weights
are either approximately equal or substantially
different from one another. It is this
sense of the term differential weighting that
is being rejected when the integration process
in impression formation is said to conform to
an equal-weight averaging model.

Several possible determinants of trait
weight have been examined. Both trait novelty
(Wyer, 1970) and trait ambiguity (Kaplan,
1971, 1975; McKillip, Barrett, & DiMiceli,
1978; Schümer, 1973; Wyer, 1974b; Wyer &
Watson, 1969) have received some attention.
In neither case, however, do the findings
unequivocally support the conclusion of differential
weighting in the integration process.

Other research has been directed toward
sources of differential weighting that are associated
with trait scale value. It is possible, for
example, that traits with extreme scale values
carry more weight than do neutral traits (e.g.,
Warr, 1974; Warr & Jackson, 1975). These
findings are difficult to interpret, however,
because of confounding between negativity,
ambiguity, and extremity (Warr & Jackson,
1976; Wyer, 1973).

Unlike the above potential sources of differential
weighting, research findings have unequivocally
established the existence of a
negativity effect. Traits with a negative valence
tend to carry more weight than those
with positive valence (e.g., Birnbaum, 1974;
Hodges, 1974; Oden & Anderson, 1971). This
negativity effect is thought to exist for a number
of response domains (Kanouse & Hanson,
1972).

Although the negativity findings represent
a genuine limitation to the equal weighting
assumption, they may not be especially serious.
Anderson (1974, p. 261) has speculated,
for example, that the magnitude of this negativity
effect is not very large and that it may
be restricted to only extremely negative stimuli.
Another difficulty with this research is
that it only employed a limited sample of
positive and negative traits. No attempt was
made to see how representative the sample
was of the entire trait population. The possibility
exists that most traits have equal
weights, but that a few negative traits carry
a higher weight and a few positive traits a
lower weight. One objective of the present
research was to examine these questions about
the negativity effect.


Idiosyncratic Weighting


Most work on the issue of differential
weighting has been at the level of normative
weights. That is, the research has examined
how much weight is given traits by the typical
subject. Little consideration has
been given to the problem of whether or not
equal weighting applies to each person individually.
It is possible that because of a
person’s idiosyncratic experiences with different
trait adjectives, he or she may assign widely 
differing weights to the adjectives. However, 
one person’s pattern of weights may be un- 
related to another’s. One person may weight 
friendly twice as much as intelligent, whereas 
another might give friendly only half the 
weight of intelligent. If we were to obtain 
normative weights by averaging over these 
two persons (as is the case in most differential 
weighting research), we would find that the 
two traits received the same weight. Differen- 
tial weighting of traits at the level of a single 
subject (what we term idiosyncratic weight- 
ing) may emerge as equal weighting when 
measured at the normative level. 
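
The averaging-out argument can be made concrete with a small numerical sketch. It is offered only as an illustration, not as part of the original analysis. The 2:1 and 1:2 weight ratios come from the example above; the function names and the choice to normalize each subject's weights to sum to 1 are assumptions introduced here.

```python
# Hypothetical relative weights for the two subjects in the example above.
subject_weights = {
    "subject_1": {"friendly": 2.0, "intelligent": 1.0},
    "subject_2": {"friendly": 1.0, "intelligent": 2.0},
}


def normalize(weights):
    """Rescale one subject's weights so that they sum to 1."""
    total = sum(weights.values())
    return {trait: w / total for trait, w in weights.items()}


def normative_weights(per_subject):
    """Average the normalized weights over subjects, trait by trait."""
    normalized = [normalize(w) for w in per_subject.values()]
    traits = normalized[0].keys()
    return {t: sum(n[t] for n in normalized) / len(normalized) for t in traits}


norm = normative_weights(subject_weights)
print(norm)  # both traits receive the same normative weight
```

Each subject weights the two traits very differently, yet the group-level weights are equal, which is the sense in which idiosyncratic weighting can be hidden in normative data.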

Only one published study has examined the 
possibility of idiosyncratic differences in trait 
weights. In this study (Anderson, 1962), the 
stimulus sets consisted of three traits, one 
from each of the three factors in the design. 
Each factor consisted of a negative, neutral,
and positive trait to form its three levels. This
3 × 3 × 3 design resulted in a total of 27
stimulus sets. Subjects judged all 27 sets once
a day for 5 days. The resulting data were
analyzed to determine whether significant in- 
teraction variance emerged. This design was 
repeated for 12 subjects, with every 2 subjects 
receiving a different stimulus replication (i.e., 
six stimulus replications of the three-factor 
design, with 2 subjects per replication). The 
test for differential weighting (or nonadditiv- 
ity) was performed on the pooled (i.e., 20 
degrees of freedom) interaction. Of the 12 
subjects, 3 were found to produce significant 
nonadditivity. 
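
The counts in this design can be checked with a short sketch. It assumes only the structure described above (three factors with three levels each, judged once a day for 5 days); the variable names are introduced here for illustration.

```python
from itertools import combinations, product

levels = ("negative", "neutral", "positive")  # three levels per factor
factors = ("A", "B", "C")

# Every stimulus set takes one trait (level) from each of the three factors.
stimulus_sets = list(product(levels, repeat=len(factors)))

df_per_factor = len(levels) - 1                           # 2
two_way_pairs = list(combinations(factors, 2))            # A x B, A x C, B x C
df_two_way = len(two_way_pairs) * df_per_factor ** 2      # 3 * 4 = 12
df_three_way = df_per_factor ** len(factors)              # 2 * 2 * 2 = 8
df_pooled_interaction = df_two_way + df_three_way         # 12 + 8 = 20

judgments_per_subject = len(stimulus_sets) * 5            # 27 sets on each of 5 days

print(len(stimulus_sets), df_pooled_interaction, judgments_per_subject)
```

The pooled interaction therefore combines the three two-way interactions with the three-way interaction, which accounts for the 20 degrees of freedom mentioned in the text.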

Even though the total number of subjects 
was small, this study suggests that about 25% 
of the people show differential weighting. That 
percentage figure would, no doubt, have been 
higher had each of the two-way interactions 
and the three-way interaction been separately 
tested for each subject. It would be possible 
for a subject to have a significant two-way 
interaction but not to have the pooled inter- 
action reach significance. 

On the other hand, it is possible that the 
25% figure is actually an overestimate of the 
number of people displaying differential 
weighting. Two of the three subjects with sig- 
nificant interaction variance received the same
stimulus replication. It is conceivable that this 
particular stimulus replication contained in- 
consistent traits, redundant traits, or polyse- 
mous homographs. Anderson (1962) did not 
report attempting to avoid stimulus combina- 
tions of this sort. 

There is a second reason that the 25% fig- 
ure might be an overestimate. There is no way 
to verify that the response scale that these 
three subjects used to make their judgments 
was linearly related to the underlying sub- 
jective continuum. It is possible that these 
three subjects assigned equal weight to all 
stimuli but that in reporting their subjective 
impressions they did not use an equal-interval 
response scale. For example, a statistical in- 
teraction of the form implying that negative 
traits carry more weight than positive traits 
could be produced by a nonlinear response 
scale if the categories at the negative end of 
the scale were wider (i.e., covered a greater 
range of the subjective continuum) than the 
categories at the positive end. 
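
The response-scale argument can be illustrated with a minimal sketch. The scale values and the exponential response function below are hypothetical choices made here, not taken from the original studies. The exponential is used only because its slope is smaller at the negative end, which corresponds to wider response categories there.

```python
import math


def equal_weight_average(values):
    """Equal-weight averaging of subjective scale values."""
    return sum(values) / len(values)


def interaction_contrast(cells):
    """2 x 2 contrast: (lo, lo) - (lo, hi) - (hi, lo) + (hi, hi)."""
    return (cells[("lo", "lo")] - cells[("lo", "hi")]
            - cells[("hi", "lo")] + cells[("hi", "hi")])


scale = {"lo": -2.0, "hi": 2.0}  # hypothetical negative and positive traits

# Subjective impressions under strict equal weighting (perfectly additive).
subjective = {(a, b): equal_weight_average([scale[a], scale[b]])
              for a in scale for b in scale}


def response_scale(s):
    """Hypothetical nonlinear response function, flatter at the negative end."""
    return math.exp(0.5 * s)


reported = {cell: response_scale(s) for cell, s in subjective.items()}

print(interaction_contrast(subjective))  # no interaction on the subjective scale
print(interaction_contrast(reported))    # nonzero interaction in reported ratings
```

In the reported ratings the mixed sets fall below the midpoint of the all-negative and all-positive sets, so the data look as if the negative trait carried more weight even though the simulated subject weighted both traits equally.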

The research in the present paper was 
undertaken to provide a more thorough test 
of the prevalence of idiosyncratic weighting 
in the trait judgment domain. Two studies 
are reported, both of which have the same 
methodological features. The studies were de- 
signed to maximize the likelihood of detecting 
idiosyncratic weighting. The design used by 
Anderson (1962) was improved upon in six 
ways. 

First, the stimulus sets used by Anderson contained
three traits. According to the averaging
model, however, the observable effects of
weight differences due to a single trait de- 
crease as the number of other traits in the set 
increases. To maximize sensitivity to differ- 
ential weighting, the present studies used sets 
composed of only two traits. 
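
The dependence on set size can be sketched algebraically. The notation below is an assumption introduced for illustration: trait $A_i$ has scale value $s_{A_i}$ and weight $w_{A_i}$, trait $B_j$ has scale value $s_{B_j}$ and weight 1, and $m$ further traits with weight 1 and summed scale value $S_C$ are held constant across the four cells.

```latex
R_{ij} = \frac{w_{A_i} s_{A_i} + s_{B_j} + S_C}{w_{A_i} + 1 + m}

I = R_{11} - R_{12} - R_{21} + R_{22}
  = (s_{B_1} - s_{B_2})
    \left[ \frac{1}{w_{A_1} + 1 + m} - \frac{1}{w_{A_2} + 1 + m} \right]

% With w_{A_1} = w > 1 and w_{A_2} = 1:
|I| = |s_{B_1} - s_{B_2}| \, \frac{w - 1}{(w + 1 + m)(2 + m)}
```

Under these assumptions the interaction shrinks as $m$ grows. With $w = 2$, for instance, the multiplier is $1/6$ for two-trait sets ($m = 0$) and $1/12$ for three-trait sets ($m = 1$).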

Second, there is a need to increase the num- 
ber of stimulus replications (Anderson, 1962, 
used only six). The use of an interaction to 
test the presence of differential weighting is 
contingent upon having the two traits with a 
difference in weight present in the same factor 
of the design. That is, the two must both be 
on Factor A or on Factor B. If two highly 
weighted stimuli are on Factor A and two low 
weighted stimuli are on Factor B, the averaging
model predicts no interaction, and so the
effect of the weight differences would go
undetected. Increasing the number of stimulus
replications increases the likelihood that at 
least some traits with differential weights will 
be assigned to the same factor in the design, 
The present research employed 10 different 
stimulus replications in the first study and 10 
new replications in the second. 

Third, most research applying integration 
theory to the trait-judgment domain has used 
equal weighting instructions in which subjects 
are explicitly told that each trait is accurate 
and equally important and that the subject 
should pay full attention to each of them. To
the extent that subjects actively bear those
instructions in mind while making their
judgments, it is possible that differential weighting
is suppressed in this line of research. Some
support for this notion is offered by Anderson
and Jacobson (1965), who obtained slightly
stronger evidence of inconsistency discounting
under naturalistic instructions than under
equal weighting instructions. The first study
in this paper deleted any mention of equal
weighting, and the second study manipulated
weighting instructions as a factor.

Fourth, it is possible that the design-wide
structure of the stimulus combinations could
lead subjects to adopt an equal weighting set
regardless of the explicit instructions. Anderson
(1962) presented subjects with all 27
combinations in the three-factor design. This
meant that each trait appeared nine times,
once with every combination of all the traits
from the other two factors. Each trait
appeared in one third of all the test sets
presented. It is possible that such extensive
repetition induces a mechanistic judgment set
that suppresses the differential weighting that
subjects would naturally bring to the task. In
the present studies, 40 test sets are presented,
and no adjective appears in more than two.

Fifth, if differential weighting is restricted
to some stimuli and not to others, it is
important to design the study to maximize
the likelihood of identifying those stimulus-
subject combinations that display differential
weighting. Whereas the basic interaction used
by Anderson (1962) was a 3 × 3 (or four degrees
of freedom) interaction, the present
study used a 2 × 2 (or one degree of freedom)
interaction. In the Anderson design, it would
be difficult to detect a significant interaction
that was due to only one of the six stimuli.
Although such an interaction might affect only
one of the four components of the interaction,
statistically it would be spread out over the
four degrees of freedom and, when tested on
that basis, might not produce overall
significance. In a 2 × 2 design, however,
differential weighting contributed by a
single trait would be wholly contained in the
one-degree-of-freedom interaction term.

Lastly, the Anderson (1962) study employed
only 12 subjects. It is difficult to generalize
about an estimate of the percent of
people displaying differential weighting from
such a small sample. In the present paper, 40
subjects were used in the first study and 80 in
the second, providing greater confidence in the
generality of the findings.


Experiment 1


In the first experiment, each subject was
given 40 two-trait stimulus sets to judge on a
likeability scale once a day for five days. The
stimulus sets contained 10 replications of a
2 × 2 design in which each factor of the design
contained a moderately favorable and a
moderately unfavorable trait. Each stimulus
replication provided one opportunity for
differential weighting to emerge, offering 10
opportunities for each subject. Such a design not
only allowed an analysis of the prevalence of
idiosyncratic weighting but permitted a
determination of whether significant interactions
arose from many or only a few stimulus
replications. Further, a qualitative analysis of
the pattern of means for significant
interactions can reveal the proportion of unequal
weighting due to negative traits carrying more
weight than positive traits.


Method 


Subjects


Subjects were 40 introductory psychology students,
19 males and 21 females, at Ohio State University,
who participated in partial fulfillment of course
requirements.


Stimuli 


The stimulus trait adjectives were arbitrarily 
selected from Edwards' (Note 1) Ohio State rescaling
of Anderson’s (1968) personality trait adjectives.
Subjects judged 40 stimulus trait pairs, 4 pairs from
each of 10 stimulus set replications. The 4 pairs from
each replication represented the four cells of a 2 × 2
factorial design. Each factor of this design had two
levels of trait likeableness, a moderately favorable
(M+) trait and a moderately unfavorable (M−)
trait. For each replication there was one M+M+ trait
pair, two M+M− trait pairs, and one M−M− trait
pair. Mean scale values for the M+ and M— traits 
were 5.5 and 2.5, rated on a scale of likeability rang- 
ing from 1 (dislike very much) to 7 (like very 
much). In selecting traits for each replication, care 
was taken to eliminate all instances of redundancy 
and inconsistency. 


Procedure 


When subjects arrived for the first day of the ex- 
periment they received the following written instruc- 
tions: 


This experiment is concerned with impression for- 
mation. What we are interested in is how people 
form impression of others on the basis of very 
limited information, . . . 


In this particular experiment, you will be seeing 
pairs of traits that describe various persons. You
will be asked to tell how much you would like 
each person. Imagine that each of the two traits 
was contributed by a different person who knows 
him (her) well. 


Read each pair carefully, try to imagine the type 
of person being described, and rate how much you 
like the person, using the scale given below each 
pair of traits. Sometimes this may seem hard, but 
just act naturally and do the best you can. 


All subjects judged the same 40 stimulus sets on 
each of 5 successive days. Stimulus sets were pre- 
sented in a booklet, 1 set to a page. There were five 
random orders of stimuli. The five booklet orders 
were Latin square counterbalanced across the 5 days, 
so that each subject received each booklet order and 
all booklet orders appeared equally often over the 
5 days. Ratings were made on a 21-point scale rang- 
ing from 10 (like very much) to —10 (dislike very 
much). 
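
The counterbalancing of booklet orders across days can be sketched as a cyclic Latin square. This is only an illustration: the cyclic construction and the name `latin_square` are assumptions, since the text does not say which Latin square was used or how subjects were assigned to its rows.

```python
def latin_square(n_orders):
    """Cyclic Latin square: the row-r group gets order (r + d) mod n on day d."""
    return [[(row + day) % n_orders for day in range(n_orders)]
            for row in range(n_orders)]


# Five booklet orders over five days; each row is one group of subjects.
square = latin_square(5)
for row in square:
    print(row)
```

Each row contains every booklet order once (each subject sees all five orders), and each column contains every order once (all orders appear equally often on each day).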


Results 
Group Data 


Most studies that have tested for equal
weighting in the adjective judgment task have
averaged over both stimulus replications and
subjects. In analyzing the present group data,
the question of equal weighting can be examined
both on this overall basis and within
each stimulus replication.

Figure 1. Mean likeability rating of adjective sets
composed of M+ and M− (moderately favorable
and moderately unfavorable) traits, averaged over
days, stimulus replications, and subjects in
Experiment 1.

If traits are equally weighted, there should 
be no interaction between the M+ and M— 
levels of the A factor and the M+ and M— 
levels of the B factor. The present group data
bring considerable power to the test of that
interaction; each cell mean is the average of 
2,000 (10 X 5 X 40) observations. The inter- 
action (see Figure 1) was significant, F(1, 
39) = 10.35, p < .005, and its pattern is sim- 
ilar to that obtained by previous investigators. 
It can be explained by assuming that negative 
traits carry more weight than positive traits. 
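
The link between unequal weights and a 2 × 2 interaction can be illustrated with a small numerical sketch. It assumes a bare weighted-average rule with no initial-impression term, takes the normative scale values of 5.5 (M+) and 2.5 (M−) from the Stimuli section, and uses purely hypothetical weights; the function names are illustrative only.

```python
M_POS, M_NEG = 5.5, 2.5  # normative scale values for M+ and M- traits


def averaging_response(values, weights):
    """Weighted average of trait scale values (simplified averaging rule)."""
    return sum(w * s for w, s in zip(weights, values)) / sum(weights)


def interaction_contrast(w_pos, w_neg):
    """2 x 2 interaction contrast: (++) - (+-) - (-+) + (--)."""
    weight = {M_POS: w_pos, M_NEG: w_neg}

    def cell(a, b):
        return averaging_response([a, b], [weight[a], weight[b]])

    return (cell(M_POS, M_POS) - cell(M_POS, M_NEG)
            - cell(M_NEG, M_POS) + cell(M_NEG, M_NEG))


equal = interaction_contrast(w_pos=1.0, w_neg=1.0)    # equal weighting
unequal = interaction_contrast(w_pos=1.0, w_neg=2.0)  # negative trait weighted double
print(equal, unequal)
```

Under equal weights the contrast vanishes, so the factorial plot is parallel. Doubling the hypothetical weight of the negative trait pulls the mixed pairs toward the negative value and yields a nonzero contrast.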

Whether the obtained overall interaction 
represents a serious challenge to the equal 
weighting assumption is contingent upon sev- 
eral factors, including its magnitude, the pro- 
portion of stimulus replications that contrib- 
ute to it, the presence of similar versus differ- 
ential interaction patterns across stimulus 
replications, and whether response scale non- 
linearity could account for the significant 
effects. 

The magnitude of the overall interaction
was not very substantial. The interaction
contributed only one percent to the total
between-cells variance. With fewer observations
going into each cell mean, this interaction
could well have gone undetected.

It is conceivable that the overall interaction
was due to only a few of the 10 stimulus
replications. In line with Anderson’s (1974)
suggestions, it could be the result of the one or
two stimulus replications that contained the
most extremely negative traits from the M−
range. There is a second advantage to examining
each of the stimulus replications
separately. It is possible that the pattern
displayed in Figure 1 does not hold for all
significant stimulus replications. If the majority
of the significant interaction replications
showed the pattern in Figure 1 and were
averaged with others that showed different
patterns (e.g., convergence rather than
divergence to the right of the graph), the observed
overall interaction could still have emerged.

Separate analyses were performed on each
of the 10 stimulus replications, and only two
were found to contain significant interaction
variance, Fs(1, 1521) > 6.11, ps < .05. The
percent of between-cells variance contributed
by the interaction was 3.40 and 3.80 for the
first and eighth replications, respectively. The
solid line portions of Figure 2 display the
interactions for those 2 stimulus replications.
Both replications showed the same pattern,
one that could be explained by assuming that
negative traits carry more weight than
positive traits do.

A significant interaction could be produced
by weight differences between the traits
on the A factor, on the B factor, or on both.
Each stimulus replication, then, provides two
opportunities for differential weighting to
occur. The fact that only two of the replications
had significant interactions means that
near equal weighting was observed at between
16 and 18 of the 20 opportunities. At the
normative level of analysis, then,
equal weighting was characteristic of 80% to
90% of the trait pairs. The negativity
effect was produced by at most four
stimulus pairs.


1 The error term for these analyses was the
interaction between subjects (40) and stimulus
persons (40).


The data provide some support for the
interpretation that differential weighting is
restricted to highly negative traits. The lowest
marginal mean was computed for each stimulus
replication to provide an index of which
replications contained the most negative trait.
Among the 10 stimulus replications, the 2
replications for which the interaction was
significant were the first and fifth lowest.

Response scale linearity. Significant interactions
of the form portrayed in Figures 1 and
2 could be obtained even if all traits were
subjectively given equal weights. This would
occur if subjects used wider categories at the
negative end of their response scale than at
the positive end. In such a case, subjects’
overt responses would not be linearly related
to their subjective responses. Two qualitative
tests were devised for the purpose of examining
the response scale explanation.

Response scale nonlinearity can be dismissed
if the qualitative pattern satisfies a
test of disordinality. The interaction portrayed
in Figure 1 is termed an ordinal interaction
because both the lower and upper lines
are ordinally related to the horizontal axis in
the same direction (i.e., the sign of both
slopes is the same). A disordinal interaction
would be one in which the slope of one line is
positive and the slope of the other is negative.
If the means of a 2 × 2 design can be plotted
so that disordinality emerges, there exists no
continuous monotonic transformation of the
response scale that will reduce that interaction
to zero. Consequently, the presence of
disordinality supports an unequal weighting
interpretation. Although none of the stimulus
replications in the analysis of group data (see
Figure 2) displayed such a pattern, it is useful
to point out the implications of such a pattern
for response scale linearity prior to
discussing the individual data.

The intersection test involves comparing the
pattern of a significant interaction for
one stimulus replication against a base line
provided by a nonsignificant (i.e., noninteractive)
stimulus replication. If the judgments
contributing to both interactions came from
the same portions of the response scale, no
monotonic quadratic transformation of the
response scale (e.g., using wider
categories at the negative than at the positive
end) could simultaneously reduce both
interactions to zero. The intersection test provides
a way of insuring that both the significant
interaction and the nonsignificant control
interaction involved responses from similar
portions of the rating scale. The control stimulus
replication must be selected so that when the
two interactions are graphed together, the
lower lines of each intersect one another and
the upper lines of each intersect one another.

Figure 2. Use of the intersection test to eliminate
scale nonlinearity as an explanation of the
interactions obtained when averaging over days and
subjects in Experiment 1. (M+ = moderately favorable.
M− = moderately unfavorable. The significant
interactions for Stimulus Replications 1, F(1, 156) =
6.35, and 8, F(1, 156) = 6.12, are in solid lines and
the nonsignificant control interaction from Stimulus
Replication 3, F(1, 156) = 1.00, is in dashed lines.)

This intersection test was applied to both 
significant stimulus replications for the group 
data. It was found that Stimulus Replication 
3 (F= 1.00), in which the interaction ac- 
counted for only 0.43% of the between-cells 
variance, provided such a control for both sig- 
nificant replications (see Figure 2). It can be 
concluded, then, that the two significant inter- 
actions portrayed in Figure 2 represent gen- 
uine instances of unequal weighting and 
should not be dismissed on the grounds of 
simple response scale nonlinearity. 
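
The two qualitative tests can be sketched in code. This is a hedged illustration rather than the authors' procedure: it assumes that disordinality may be checked in either plotting orientation, that "intersect" means a strict crossing between the two plotted points, and all function names and cell means are hypothetical.

```python
def slopes_disagree(table):
    """True if the two plotted lines slope in opposite directions."""
    slope_first = table[0][1] - table[0][0]
    slope_second = table[1][1] - table[1][0]
    return slope_first * slope_second < 0


def is_disordinal(cells):
    """cells[i][j]: mean for level i of one factor and level j of the other.

    Both plotting orientations are checked (an assumption).
    """
    transposed = [list(column) for column in zip(*cells)]
    return slopes_disagree(cells) or slopes_disagree(transposed)


def lines_cross(line_a, line_b):
    """True if two lines, each given as (left, right) means, cross."""
    return (line_a[0] - line_b[0]) * (line_a[1] - line_b[1]) < 0


def passes_intersection_test(significant, control):
    """Upper lines must cross each other, and so must the lower lines."""
    return (lines_cross(significant["upper"], control["upper"])
            and lines_cross(significant["lower"], control["lower"]))


# Hypothetical cell means on a -10 to 10 rating scale.
ordinal_cells = [[2.0, 6.0], [-5.0, -3.0]]
disordinal_cells = [[1.0, 4.0], [2.0, -3.0]]
significant = {"upper": (2.0, 6.0), "lower": (-5.0, -3.0)}
control = {"upper": (3.0, 5.5), "lower": (-5.2, -2.7)}  # parallel lines

print(is_disordinal(ordinal_cells), is_disordinal(disordinal_cells))
print(passes_intersection_test(significant, control))
```

In the hypothetical data the control replication has parallel lines (no interaction) that cross the corresponding lines of the significant replication, so both sets of judgments occupy the same region of the response scale.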


Individual Data 


The analyses of group data verified the 
presence of differential weighting in the
trait-judgment paradigm. The deviation from equal
weighting, however, appeared to be relatively
minor. The overall contribution of differential 
weighting accounted for only 1% of the be- 
tween-cells variance, was significant for only 
20% of the stimulus replications, and in all 
cases displayed a pattern that could be inter- 
preted as meaning that negative stimuli car- 
ried more weight than positive stimuli did. 
These restrictions on the equal weighting as- 
sumption would be even less consequential if 
it could be demonstrated that they applied to 
only a minority of the subjects. 

Setting alpha at .05 and testing each 2 x 2 
stimulus replication interaction against the 
subject’s own Days (5) X Stimulus Persons 
(40) interaction (on 156 degrees of freedom), 
a full 82.5% of the subjects showed evidence
of nonadditivity for at least 1 of the 10 stim- 
ulus replications (see Table 1). 

One difficulty with the use of 10 stimulus 
replications lay in the increasing role of 
chance as the number of replications in- 
creased. Assuming that 5 subjects out of 100 
would produce a significant interaction for 
any given stimulus replication simply by 
chance, that percentage increases to 40.1 when 
each subject responds to 10 different stimulus 
replications. The role of chance returns to 5% 
when the interaction variance pooled across 
all replications (i.e., on ten degrees of free- 
dom) for each subject is tested. This approach 
is statistically comparable to the pooled test 
for nonadditivity employed by Anderson 
(1962). Whereas only 25% of Anderson’s
subjects showed nonadditivity, Table 1 shows 
that 60% of the subjects in the present study 
produced significant overall nonadditivity. 
These results indicate that when the experi- 
ment is designed to be maximally sensitive to 
the effects of differential weighting, a sub- 
stantial number of people produce statistically 
significant interactions in the trait-judgment 
task. 
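
The 40.1% figure for the role of chance follows from the probability of at least one significant result in 10 tests at alpha = .05. A minimal sketch, assuming the 10 tests are independent (the function name is illustrative):

```python
def chance_of_at_least_one(alpha, n_tests):
    """Probability that at least one of n independent tests is significant."""
    return 1.0 - (1.0 - alpha) ** n_tests


rate = chance_of_at_least_one(alpha=0.05, n_tests=10)
print(f"{rate:.1%}")  # 40.1%
```

Pooling the interaction variance into a single test per subject brings the number of tests back to one, which is why the chance rate returns to 5%.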

As noted in the presentation of the group 
data, a number of questions arise in interpret- 
ing significant interactions. First, the possibil- 
ity must be examined that the interactions 
resulted from response scale nonlinearity 
rather than from unequal weighting. Second,
it is possible that unequal weighting
occurred for only a few stimulus replications
(e.g., the two that were significant in the
group analysis). Third, it may be that all
significant interactions are the result of a single
stimulus characteristic (e.g., extremely negative
traits having a greater weight than all
others), in which case all significant
interactions for all persons would have the same
pattern (as in Figures 1 and 2). The prediction
of idiosyncratic weighting, in regard to
the second and third above questions, is that
all stimulus replications should be involved
and that a variety of interaction patterns
should be obtained for each replication.

Scale linearity. Although we discarded
scale nonlinearity as an explanation for the
two significant stimulus replications for the
group data, such an explanation could still
be viable for a majority of the individuals.
Consequently, the pattern of significant and
nonsignificant stimulus replication interactions
was examined for the 24 subjects (60% of our
sample) who displayed overall nonadditivity.
Of these 24 subjects, 14 satisfied the
disordinality test by displaying at least one
significant disordinal interaction. Of the 10
remaining subjects, 8 satisfied the intersection
test for at least one significant interaction.
For the remaining 2 subjects (Subjects 1
and 33), it was not possible to dismiss the
nonlinearity explanation, since in both cases
they showed the same interaction pattern in
all 10 stimulus replications. That pattern was
consistent with the interpretation that negative
stimuli carry more weight than positive
stimuli do. However, even after deleting
these 2 subjects, 55% of the sample showed
genuine evidence of differential weighting.

Stimulus replications. The possibility
remains that such widespread differential
weighting is due to a limited number of
stimulus replications. Two of the three subjects
with significant interaction variance in the
Anderson (1962) study were presented
the same stimulus replication.
Out of 400 interactions tested in the present
study, 106 were significant. Table 2 shows the
distribution of significant interactions over the
10 stimulus replications. It can be seen that
the significant interactions were not restricted
to just a few stimulus replications. One
hundred percent of the replications were
involved, and all




Table 1 
Analyses of Individual Data for Experiment 1
Subject   Pooled            Stimulus replication interactions(b)
number    interaction F(a)  1 2 3 4 5 6 7 8 9 10
1 1,04 
2 1.51 a 
3 3.36 F 
4 2.06* bd * * 
5 1,19 * 
6 4.05** ** ** 
7 ** +k ** 
8 ** 
9 
10 a 
11 . 2 
12 
13 . . ** +e ** +k 
14 . . oon * 
15 * ** eii ** * 
16 . * + 
17 ** * ** 
18 * * 
19 * 
20 . ek 
p 21 * + ** 
2 * * + 
23 
4 . ** ** * 
5 * 
26 7 K 
; 27 18.39** * ** + ** ** ** * ** ** 
8 1.73 x * 
| 29 2.27* 
30 6.09** + ** * ** ** 
18.87** + ** + ** ** +k ** 
1.09 * 
25.21** + * + ** ok * ae ** ** ok 
53 
1,25 
5.50** * * ** ** * 
2.14* * * * * ** 
.67 J 
6.13** * + ** ** 
4.06** ** * *k 
(a) df = 10, 156.
(b) df = 1, 156.
* p < .05.


contributed approximately equally. The
distribution of interactions over the 10
stimulus replications did not depart
significantly from an even distribution,
χ²(9) = 5.89. An average of 26.5% of the
subjects had a significant interaction for the
typical stimulus replication. The number of
subjects (out of 40) with significant
interactions ranged from a low of 7 to a high
of 15 over stimulus
replications. Even for the most additive of the 
stimulus replications, 17.5% of the subjects 
had a significant interaction. In contrast to 
the findings of the group data, then, it would 
appear that the pattern of unequal weighting 
observed in the individual analyses cannot 
be dismissed as due to only a few stimulus 
combinations. 

Table 2
Number of Subjects in Each Stimulus
Replication With a Significant Interaction
for Experiment 1

             Number of subjects   Number          Number
Stimulus     with significant     interpretable   interpretable
replication  interaction          as wn > wp      as wp > wn

 1           12                   ...             1
 2            8                   ...             3
 3           12                   ...             5
 4           12                   ...             6
 5            8                   ...             1
 6            8                   ...             1
 7           12                   ...             1
 8           15                   ...             0
 9            7                   ...             1
10           12                   ...             2
 M           10.6                 ...             2.1

Note. N = 40. The terms wp and wn refer to the
weights given the more positive and the more
negative traits, respectively.


Idiosyncratic weighting. The prediction of
idiosyncratic weighting derives from the
reasoning that trait meaning and importance is
acquired on the basis of each individual’s per- 
sonal linguistic experiences with the word and 
its referents. It follows from this that for most 
stimulus replications, some people should 
weight the negative traits more highly and 
others should weight the positive traits more 
highly. Also, when examining the significant 
stimulus replications for each subject, there 
should be some subjects who show both pat- 
terns of trait weighting. Since the group data 
showed evidence only of negative traits carry- 
ing more weight than positive, we must first 
ask whether there is any evidence for the op- 
posite pattern in the individual data. 

Both ordinal and disordinal interactions can 
be coded in terms of whether the more posi- 
tive or the more negative trait is being given 
the most weight. Of the 103 codable² interactions
in the present study, 82 (79.6%)
were found to have a pattern consistent with 
the interpretation that negative traits carry 
more weight than positive traits do. The re- 
maining 20.4% showed exactly the opposite 
pattern, that is, positive traits carried more 
weight than negative traits did. 

By chance alone, it would be expected that
half of the 400 interactions would show a
negativity effect and half a positivity effect.
At the .05 level of significance, this means
that 10 tests should show significant negativity
and 10 tests significant positivity. The
observed frequencies of 82 and 21 were both
significantly greater than the expected
frequencies, χ²s(1) = 531.69 and 12.41, ps <
.001, respectively. Unlike previous research
at the normative level, clear evidence for a
positivity effect is found for some subjects.
Table 2 shows that in all but one stimulus
replication, there existed at least one subject
who weighted the positive traits more than the
negative traits. An illustration of this diversity
of interaction patterns is provided in Figure
3 for Stimulus Replication 7.
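
The two chi-square values can be reproduced with a one-degree-of-freedom goodness-of-fit computation. The sketch below assumes the two categories are "significant in the given direction" versus "all other tests" out of 400; the text does not spell out the computation, so this is one reading that happens to match the reported figures, and the function name is illustrative.

```python
def chi_square_vs_chance(observed_hits, expected_hits, n_tests):
    """Goodness-of-fit chi-square for hits vs. non-hits (1 df)."""
    observed = [observed_hits, n_tests - observed_hits]
    expected = [expected_hits, n_tests - expected_hits]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 400 tests; 10 significant results expected by chance in each direction.
negativity = chi_square_vs_chance(82, 10, 400)
positivity = chi_square_vs_chance(21, 10, 400)
print(round(negativity, 2), round(positivity, 2))  # 531.69 12.41
```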

Although there might be individual differences
in the propensity to assign greater
weight to negative or to positive traits, the
notion of idiosyncratic weighting leads us to
expect that a substantial number of people
will display both weighting patterns over
different stimulus replications. Of 23 subjects
who provided more than one significant
stimulus replication interaction, 9 (or 39%)
showed both weighting patterns. This is
illustrated for Subject 36 in Figure 4. The
remaining subjects split 12 and 2 in showing
exclusively higher negative weighting and higher
positive weighting, respectively.


Discussion 


In contrast to the group data, the individual
data indicated that differential weighting in
the trait-judgment paradigm was widespread.
No stimulus replication was immune, and no
single pattern of differential weighting (e.g.,
negative traits receiving more weight)
prevailed exclusively. It appears, then, that when
a more sensitive experimental design than that
used by Anderson (1962) is employed, substantial
differences emerge between the analyses of
group judgments and the analyses of individual
judgments. Whereas the equal weighting
assumption holds for most stimuli at the
normative level, idiosyncratic weighting is the
dominant feature at the individual level.


2 Three disordinal interactions were not codable
in these terms because the two lines intersected.
When such “double disordinality” is present, it is
not possible to determine the relative weights of the
stimulus traits for the stimulus replication.


Figure 3. Examples of four different significant
interaction patterns obtained for Stimulus
Replication 7 in Experiment 1. (Panels: Subject no. 10,
ordinal, wn > wp; Subject no. 16, ordinal, wp > wn;
Subject no. 13, disordinal, wn > wp; Subject no. 40,
intersecting disordinal. Traits: Coolheaded,
Unenterprising, Conscientious, and Sly. Vertical
axis: mean likeability rating. The terms wp and wn
refer to the weights given the more positive and
the more negative traits, respectively.)


Experiment 2


Experiment 2 was undertaken for three reasons.
Since the outcome of Experiment 1 stood
in direct contradiction to previous assumptions
regarding equal weighting of items in the
trait-judgment task, a need to replicate the
findings was evident. Experiment 2 used the
same individual-based design as Experiment
1 but employed 10 new stimulus replications
and 80 new respondents.

Figure 4. Examples of three different interaction
patterns obtained from Subject 36 in Experiment 1.
(Panels: Replication No. 8, ordinal; Replication
No. 10, ordinal; Replication No. 1, disordinal. The
terms wp and wn refer to the weights given the more
positive and the more negative traits, respectively.)

Whereas Experiment 1 used “naturalistic”
instructions, the majority of studies with the
trait-judgment task have used “equal weighting”
instructions. It is possible to argue (see
Wyer, 1974a) that people can suppress their
idiosyncratic weighting of traits and give them
nearly equal weight when instructed to do so
by the experimenter. This would be consistent
with the finding that other kinds of
experimenter instructions regarding weights (e.g.,
Anderson & Jacobson, 1965, “discounting”
instructions) are known to be influential.

Two respondents in the first study produced
the same form of significant interaction for all
10 stimulus replications. Consequently, it was
not possible to eliminate the interpretation
that these two were using a noninterval
response scale. A procedural modification was
introduced in Experiment 2 to reduce the
likelihood that subjects would use a noninterval
response scale. A total of 20 extreme
anchor stimuli preceded and were interspersed
among the 40 test stimuli in this study. This
leads subjects to use the interior portions of
the scale when rating the stimuli (Simpson,
Ostrom, & Sloan, 1973) and thereby reduces
the influence of any noninterval response
tendencies that may intrude into either the
positive or negative extremes of the scale. The
use of such anchor stimuli has become widely
adopted in the trait-judgment paradigm.


Method 
Subjects 


Subjects were 80 introductory psychology students 
from Ohio State University, 40 males and 40 females, 
who participated in partial fulfillment of course 
requirements. 


Stimuli 


New stimulus traits were randomly selected from 
the Edwards (Note 1) list. Subjects judged 40 test 
stimulus trait pairs from 10 randomly constructed 
stimulus set replications, where each set consisted of 
4 trait pairs produced from a 2 × 2 factorial design.
The two levels of likeability for the two trait factors
were M+ and M−, as in Experiment 1. No semantically
inconsistent or redundant trait pairs were
allowed within a stimulus replication. 

In order to adequately anchor the response scale, 
12 pairs of extreme anchor stimuli preceded the 40 
pairs of test stimuli, and 8 pairs of anchor stimuli 
were evenly interspersed among the 40 test sets. Half 
of the anchor pairs were positive, and half were
negative.


Procedure


Subjects were given either “equal weighting” (N =
40) or “naturalistic” (N = 40) instructions. The
naturalistic instructions were the same as in
Experiment 1, and the equal weighting instructions are
given below.


Each trait is equally important in describing the 
person. Sometimes, of course, the two words may 
seem inconsistent. That is to be expected, because 
each of the two people may see a different part of 
the person's personality. However, both traits are
accurate, and each is equally important. You
should pay equal attention to both.


After the experimenter answered any questions, sub- 
jects completed a practice booklet containing 15 
stimulus pairs sampled from the entire range of the 
rating scale. 

Subjects judged all 60 of the stimulus pairs (20
anchor and 40 test) in the experimental booklets on
each of 5 successive days. Ratings were made on a
21-point scale ranging from 0 (dislike very much)
to 20 (like very much). Five random orders of test
stimuli were Latin square counterbalanced across
days. Anchor stimulus pairs always appeared in the
same order.


Results 


The second experiment differed from the
first in two important ways: (a) It employed
both equal-weight and naturalistic instructions
and (b) it introduced extremely positive
and negative anchor stimuli in order to
stabilize the response scale. No reliable differences
were found between the equal-weight and
naturalistic instructions, either for the group
data or the individual data. The results in
Figure 5 and Table 3 show highly comparable
findings for the two conditions.

The introduction of extreme anchor stimuli
did have its expected effect. Whereas the
means for the M+M+ and M−M− pairs
deviated 5.0 scale units on the average from
the scale midpoint in Experiment 1, they
deviated only 3.9 scale units in Experiment 2.
Further, the adoption of anchor stimuli
apparently succeeded in reducing the number of
subjects employing a nonlinear response scale.
Whereas in the first study 2 of the 40 subjects
were identified as possibly using nonlinear
scales, none of the 80 subjects in the second
study were so identified.


Group Data


As in the first study, the overall interaction 
was significant, F(1, 78) = 27.97, p < .001, 
and as Figure 5 shows, was not significantly 
reduced under equal weighting instructions, 
F(1, 78) < 1. The total proportion of the be- 
tween-cells variance contributed by the inter- 
action was .31%. Separate analyses were done 
for each instruction condition, and the inter- 
action was significant for both, Fs(1, 39) > 
13.87, ps < .001, with neither contributing 
more than .40% of the between-cells variance 
(see Table 3). 

When tested separately, 4 of the 10 stimulus
replications (1, 4, 6, and 10) produced
significant interactions, Fs(1, 3042) > 3.97,
ps < .05.³ Of the 20 stimulus pairs in this
study, between 60% and 80% showed no
significant departure from equal weighting at
this normative level. This is slightly lower
than was observed in the first study (for
which the comparable estimate was between
80% and 90%). The observed increase in
number of significant interactions was due to
the doubling of sample size. When each
instruction condition was tested separately, only
two replications were significant in each (4
and 6).

Three of the four replications satisfied the
intersection test and could not therefore be
dismissed on the grounds of scale nonlinearity.
All significant replications had the same pattern
as Figure 5, indicating that when averaged
over all subjects, negative stimuli
appeared to carry more weight than positive
stimuli. There was, however, only modest
evidence that the significant stimulus replications
contained the most negative traits. Using
the lowest marginal mean as an index of each
replication’s most negative trait, the four
significant replications ranked second, third,
fourth, and seventh most negative.

Overall, the results for these group data
substantially replicate the group results for
Experiment 1.


Individual Data


Over three quarters (77.5%) of the sub-
jects had at least one significant interaction.
Although this figure was quite comparable to
Figure 5. Mean likeability ratings averaged over days,
stimulus replications, and subjects in Experiment 2.
Vertical axis: mean likeability rating. Horizontal axis:
normative scale value. (M+ = moderately favorable.
M− = moderately unfavorable. Naturalistic instructions
results are in dashed lines, and equal weighting
instructions results are in solid lines.)


that for Experiment 1, there was a reduction 
in the percentage of subjects who showed sig- 
nificant pooled interaction variance (on 10 
and 156 degrees of freedom), from 60% to 
46%. Even though this figure is slightly below 
50%, it is still substantially higher than a 
baseline provided either by chance, χ²(1) =
286.30, p < .001, or by the 25% level ob-
served in Anderson’s (1962) study, χ²(1) =
19.27, p < .001.
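
The comparison against a baseline percentage can be illustrated with a one-degree-of-freedom goodness-of-fit chi-square. The sketch below is not the authors' computation; it assumes that the statistic was computed on the count of 37 of 80 subjects (46.25%) with a hit and a miss cell, and the function name is hypothetical. Under that assumption the 25% baseline reproduces the reported 19.27, while the 5% baseline gives a value close to, but not identical with, the reported 286.30, so the authors' exact procedure may have differed slightly.

```python
def chi_square_vs_baseline(observed_hits: int, n: int, baseline: float) -> float:
    """Goodness-of-fit chi-square (1 df) for a hit/miss count against a baseline rate.

    Hypothetical helper for illustration; not taken from the article.
    """
    expected_hits = n * baseline
    expected_misses = n - expected_hits
    observed_misses = n - observed_hits
    return ((observed_hits - expected_hits) ** 2 / expected_hits
            + (observed_misses - expected_misses) ** 2 / expected_misses)


# 37 of the 80 subjects in Experiment 2 showed significant pooled interaction variance.
chi_vs_anderson = chi_square_vs_baseline(37, 80, 0.25)  # 25% level from Anderson (1962)
chi_vs_chance = chi_square_vs_baseline(37, 80, 0.05)    # 5% chance level

print(round(chi_vs_anderson, 2))  # 19.27
print(round(chi_vs_chance, 2))    # 286.58
```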

The discrepancy between the two studies is 
reduced somewhat when those subjects whose 
interactions might have been the result of 
nonlinear response scales are eliminated from 
consideration. Whereas 2 subjects were de- 
leted on these grounds in the first study (re- 
ducing the figure to 55%), none were elim- 
inated from the second study (leaving the 
figure at 46%). In examining the scale non- 
linearity interpretation for the 37 nonadditive 
subjects in the second study, 22 satisfied the 
disordinality test, and 15 the intersection test. 
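
Table 3 lists a chance level of 40.1% for the percentage of subjects showing at least one significant interaction. That figure is consistent with each subject contributing 10 tests (one per stimulus replication) at the .05 level. The sketch below assumes the 10 tests are independent, which is an idealization, and the function name is hypothetical.

```python
def chance_of_at_least_one(alpha: float, n_tests: int) -> float:
    """Probability of one or more false positives in n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Ten stimulus replications per subject, each tested at the .05 level.
chance_level = chance_of_at_least_one(0.05, 10)
print(f"{chance_level:.1%}")  # 40.1%
```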

Stimulus replications. Once again, all stim- 
ulus replications contributed approximately 
equally to the significant interactions obtained 


3 The error term for these analyses was the inter- 
action between stimulus persons (40) and subjects 
(40) nested within instruction conditions (2). 
Table 3
Comparison of Experiments 1 and 2 for Group and Individual Data

                                                            Experiment 2
                                                            Naturalistic  Equal-weight
Item                                          Experiment 1  instructions  instructions  Combined

Group data
  Percent between cells variance due to
    interaction                               1.00          .26           .10           .31
  Number of significant stimulus
    replications                              2             2             2             4

Individual data
  Percent of subjects with at least one
    significant interactionᵃ                  82.5          72.5          82.5          77.5
  Percent of subjects with significant
    pooled interactionsᵇ                      60.0          45.0          47.5          46.25
  Percent of stimulus replications with
    at least one significant interaction      100           100           100           100
  Total number of significant interactions    106           57            76            133
  Percent of all significant interactions
  in which:
    wn > wp                                   79.6          74.1          78.4          76.6
    wp > wn                                   20.4          25.9          21.6          23.4

Note. The total number of subjects in Experiment 1 was 40. Experiment 2 had 40 subjects in each of the two
instruction conditions. The terms wp and wn refer to the weights given the more positive and the more
negative traits, respectively.
ᵃ Chance = 40.1%. ᵇ Chance = 5%.


at the individual level, χ²(9) = 13.69, p >
.10. This occurred, however, despite a sub- 
stantial decrease in the average percentage of 
subjects showing a significant interaction for 
the typical stimulus replication. The figure 
dropped from 26.5% to 16.6%, with a range, 
in the second study, from 7.5% to 27.5%. 
The decrease in the second study may have 
resulted from the introduction of 20 extreme 
anchor stimuli. The anchor stimuli were in- 
tended to eliminate some interactions that 
were due to scale nonlinearity. Since they in- 
creased the number of stimulus persons to be 
judged each day by 50%, however, they may 
also have induced a more additive integration 
set by making the task more boring. 
Idiosyncratic weighting. Of the 128 codable⁴
interactions, 76.6% showed a pattern
interpretable as meaning that negative traits 
carried more weight than positive traits, and 
23.4% showed a pattern favoring positive 
traits. This was very close to the split ob- 
served in Experiment 1 (see Table 3). Also, 
as in Experiment 1, both the number of sig-
nificant negativity and positivity interactions
exceeded chance, χ²(1) = 16.24, p < .001, and
χ²(1) = 5.13, p < .05, respectively. Instances
of both weighting patterns appeared in 8 of
the 10 stimulus replications, the remaining two
containing only interactions in which the
negative trait received more weight.

As in the first study, a substantial number
of subjects displayed both weighting patterns.
Out of the 37 subjects with more than one sig-
nificant stimulus replication,
35.1% showed both patterns. All but […] of the
remaining 24 subjects gave the negative traits
more weight than the positive traits.


Discussion 


The results of Experiment 2 replicated the
idiosyncratic weighting findings of Experiment
1 and demonstrated as well that idiosyncratic
weighting is obtained under both naturalistic
and equal weighting instructions. The absence
of differences between instruction conditions
at the level of group analyses is consistent
with several previous studies finding no differ-
ences in trait judgments between naturalistic
and equal weighting instructional conditions
(Anderson & Jacobson, 1965; Gollob & Lugg,
1973; Lampel & Anderson, 1968; Wyer,
1974b). Whereas people appear able to in-
crease differential weighting tendencies when
given “discounting” instructions (Anderson &
Jacobson, 1965; Kaplan, 1973), they did not
reduce their differential weighting tendencies
when specifically instructed to weight all traits
equally.

4 Five disordinal interactions were not codable,
four because the lines intersected (see Figure 3)
and one because the marginal means were
identical on one of the factors […].


General Discussion 


The data reported in this article support 
the conclusion that it is incorrect to describe 
the integration process in the trait-judgment 
paradigm as following an equal weighting 
averaging rule at the level of the individual
subject. This conclusion is in contradiction to
previous descriptions of the trait-judgment
task, descriptions that were based upon the
one previous direct test of individual differ-
ences in trait weights (Anderson, 1962). The
difference in outcomes between this and the
Anderson (1962) study can be attributed to
the far greater sensitivity of the present ex-
perimental design.


Implications for Information 
Integration Theory 


Generality of idiosyncratic weighting. The
trait-judgment paradigm is not the only social
judgment domain in which differential weight-
ing occurs at the individual level. Significant
nonadditivity has been obtained in several
other areas. Leon, Oden, and Anderson
(1973) found it for 38% of 16 subjects who
judged the “badness” of a group of people
guilty of committing various crimes; Ander-
son (1972) obtained it for 83% of 6 subjects
who judged, on the basis of behavior items,
the severity of psychiatric disturbances;
Troutman and Shanteau (1976) observed it
for an average of 28% of 20 subjects who
judged the quality of disposable diapers in 
one replication and infant car seats in the 
others; and Shanteau and Anderson (1969) 
obtained it for an average of 25% of 20 sub- 
jects over four stimulus replications in which 
subjects rated their preferences for different 
food and beverage combinations. In the last 
mentioned study, 65% of the subjects showed 
at least one significant interaction over the 
four replications. 

This extensive evidence of differential 
weighting at the individual level was obtained 
despite the fact that none of the above studies 
incorporated all of the precautions for max- 
imizing sensitivity employed in the present 
research. Yet none of these percentage figures
are reasonably close to the five percent chance 
level (although due to small sample sizes, 
some may not statistically differ from it). It 
would appear, then, that at present there 
exists no social judgment domain that could
plausibly be regarded as an instance where 
an equal weight averaging integration rule 
held. 

The absence of any verified instance of 
equal weighting encourages speculation that 
there may, in fact, exist no social judgment 
domain for which equal weighting holds. Such 
a state of nature would follow from the 
premise that items of social information 
acquire meaning and importance on the basis 
of highly individualistic experiences with the 
signs and symbols conveying that information. 
Certainly on these grounds, it would seem 
that any social information conveyed lexically 
would be susceptible to idiosyncratic weight- 
ing differences. A more promising domain 
where equal weighting may prevail would in- 
volve purely sensory stimuli responded to 
through a nonsemantic mode (as is often done 
in cross-modality matching). 

Normative versus individual levels of anal- 
ysis. The present studies show that findings 
obtained at the group (or normative) level do 
not necessarily replicate for each member of 
that group. Consequently, it is appropriate to 
question whether other empirical findings ob- 
tained at the group level in the trait-judgment 
task (e.g., set size, serial position, and incon- 
sistency discounting effects) hold for all indi- 
viduals. There are no adequate data presently
available that allow us to estimate what per- 
centage of people employ an averaging rule 
(as opposed to a summative or multiplicative 
rule) in the trait-judgment task. This possibil- 
ity of individual differences in integration rule 
can be illustrated by a study in another social 
judgment domain (Leon, Oden, & Anderson, 
1973). They conducted analyses on individual 
subject data and found significant set size
effects for some subjects and not for others.
The difference in outcome between group 
and individual responses should not be dis- 
missed as reflecting mere individual differ- 
ences around a group average. Instead, the 
two types of data should be regarded as fun- 
damentally different levels of analysis, each 
appropriate to different scientific objectives. 
Data from an individual, when obtained 
over repeated observations, are definitive for 
understanding the integration rule used by 
that person. If the objective is to establish 
that the typical person uses a particular inte- 
gration rule, it is necessary to show that most 
people, when studied as individuals, employ 
that integration rule. Such a generalization 
cannot be made on the basis of group data. 
This is not meant to dismiss the importance 
of the multitude of findings previously ob- 
tained at the group level. Group data inform 
us as to the integration rule that characterizes 
the modal societal reactions made in a par- 
ticular response domain. At this level of 
analysis, information integration theory pro- 
vides estimates of the normative weights and 
scale values of information items in that do- 
main. It also establishes a normative integra- 
tion rule for the subject group. This is clearly 
a descriptive enterprise that need have no im- 
plications for how individuals subjectively in- 
tegrate information. There are many problem
areas for which it is of direct interest to study 
the integration rule underlying the modal 
group response, areas such as stereotyping, 
advertising, political preferences, and jury 
decision making. For example, it is useful to 
know that jurors in aggregate can presume 
innocence (Ostrom, Werner, & Saks, 1978), 
even though this presumption may not hold 
for all of the jurors considered individually. 
Ambiguity of the “parallelism” test. Up to 
this point in the paper we have chosen to ex-
plain deviations from parallelism in terms of 
unequal trait weights. But to do so requires 
us to assume that the theoretical scale value 
of each trait is unaffected by the nature of the 
other trait in the pair. Unlike trait weight, the 
scale value is viewed as nonlabile. This as- 
sumption has been made in most integration 
theory interpretations of differential weighting 
research and has been defended as a legit- 
imate strategy for model building (Himmel-
farb, 1975).

Some recent studies (see Ostrom, 1977) 
have attacked that assumption, and have 
established the plausibility of scale value la- 
bility. The cognitive representation and ac- 
companying evaluative response evoked by a 
trait appears to be directly affected by the 
evaluative tone of the context traits. If such 
meaning shift processes do occur in the trait-
judgment domain, the interpretation of devia- 
tions from parallelism becomes much more 
difficult. Significant interactions could be due
to differential weighting, scale value shifts 
(with equal weighting), or to both. 

Most of the significant interactions ob-
served in the present studies had an ordinal or
disordinal, nonintersecting pattern. These
could all be interpreted either in terms of dif-
ferential weighting or in terms of scale value
shift. However, there is one logically possible
interaction pattern that, if obtained, could not
be explained by a differentially weighted
averaging model, namely, a pattern that is
both disordinal and intersecting (see the right
panel of Figure 3). Given that people are
using an averaging integration rule, this pat-
tern could only be obtained if the scale values
of the row traits reversed their relative posi-
tivity, depending on which column trait they
were paired with. Although such patterns were
observed in the present studies (three in the
first and four in the second), they represented
only 2.93% of all significant interactions.
The major findings of the present studies
could be explained as easily in terms of scale
value shift as they could by differential
weighting. Namely, it could be concluded that
a majority of the subjects display scale
value shifts, that all stimuli were capable of
undergoing such shifts, and that the shifts
are highly idiosyncratic. For any particular
trait pair, some people show greater shift in a 
negative context, others in a positive context, 
and other people show no shift at all. 

Limitations of a differential weighting aver- 
aging model. No fundamental difficulty is 
created for information integration theory if 
idiosyncratic weighting proves to be the state 
of nature in social judgment (or even all 
judgment) domains. Its objective is, after all, 
the specification of an integration rule, and 
determination of whether equal or differential 
weighting occurs. The problems created are
more ones of experimental convenience.

The equal weight averaging model has a
remarkably useful feature. When it holds, and
when subjects are responding on an interval
judgment scale, no interaction should be sta-
tistically detectable when orthogonally com-
posed stimulus sets are being judged. This has
allowed researchers to draw three important
conclusions when such parallelism emerges:
(a) An equal weight additive model is ap-
propriate (discarding both unequal weight
averaging models and multiplicative models),
(b) the interval property of the response scale
is validated, and (c) the marginal means of-
fer estimates of stimulus scale values on an
interval scale. The empirical procedures
needed to discriminate between an averaging
model and a multiplicative model, to validate
a response scale, and to obtain interval esti-
mates of scale values are much more compli-
cated in the absence of equal weighting.
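
The parallelism property can be illustrated numerically. The following is a minimal sketch, assuming a two-trait weighted averaging rule with no initial-impression term; the scale values and weights are hypothetical numbers chosen only for illustration and are not taken from the present data. With equal weights the 2 × 2 interaction contrast is zero (parallel lines), whereas giving the more negative traits a larger weight produces a nonzero contrast.

```python
from typing import List, Tuple

Trait = Tuple[float, float]  # (scale value, weight); both hypothetical


def averaging_response(traits: List[Trait]) -> float:
    """Weighted average of scale values, with weights normalized to sum to one."""
    total_weight = sum(weight for _, weight in traits)
    return sum(value * weight for value, weight in traits) / total_weight


def interaction_contrast(rows: List[Trait], cols: List[Trait]) -> float:
    """Interaction contrast for a 2 x 2 row-by-column design (0 means parallelism)."""
    r = [[averaging_response([a, b]) for b in cols] for a in rows]
    return r[0][0] - r[0][1] - r[1][0] + r[1][1]


# Equal weighting: every trait has weight 1.
equal_weight_contrast = interaction_contrast(
    rows=[(2.0, 1.0), (8.0, 1.0)], cols=[(3.0, 1.0), (7.0, 1.0)])

# Differential weighting: the more negative trait in each factor has weight 2.
negativity_contrast = interaction_contrast(
    rows=[(2.0, 2.0), (8.0, 1.0)], cols=[(3.0, 2.0), (7.0, 1.0)])

print(equal_weight_contrast)          # 0.0
print(round(negativity_contrast, 3))  # 1.667
```

Because the weights are normalized within each stimulus person, unequal weights make the effect of one trait depend on the trait it is paired with, which is what appears as a Trait × Trait interaction.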
Another disadvantage is that the trait judg-
ment task cannot be used as a validational
baseline for investigating other stimulus do-
mains. For example, Lampel and Anderson
(1968) had subjects judge persons described
by a photograph and several traits. The traits
were factorially constructed, and since no
interaction was obtained, the authors con-
cluded that the response scale was validated
and that all traits received equal weight.
These two conclusions allowed an unambig-
uous interpretation of an interaction that was
obtained between trait valence and photo-
graph attractiveness. The possibility of a non-
linear response scale and the possibility of
differential trait weights could both be ruled
out because of the absence of a Trait × Trait
interaction. This allowed the investigators to
conclude (by a process of elimination) that
there was a differential weighting of the pho-
tographs in which weight was inversely re-
lated to photograph attractiveness. The pres-
ent finding of idiosyncratic weighting would
rule out the use of the trait-judgment domain
for such purposes in future research at the 
individual level. 


Implications for Person Perception Research 


The stimulus domain studied in this re- 
search was fairly narrowly defined, being re- 
stricted solely to person characteristics in the 
form of personality trait adjectives. Even that 
domain, however, was not fully represented 
in the traits sampled for use in the present 
studies (from Anderson, 1968). The traits did 
not include slang terms or adjectives solely 
descriptive of mood or feeling states (e.g., 
Bush, 1973). It seems unlikely, however, that 
either slang or feeling terms would be more 
characterized by equal weighting than were 
the sampled adjectives. In fact, slang terms 
may well be even more prone to idiosyncratic 
weighting, given respondents who are selected 
from two or more different social groupings. 

There is a wide variety of information 
items that have been used in person percep- 
tion research, including personal attitudes, 
hobbies and interests, demographic character- 
istics, group memberships, and behavioral acts 
and intentions. It is, of course, possible (al- 
though we do not regard it as probable) that 
one or more of these categories may represent 
an equal weighting domain.

Although the present studies were not 
tailored to investigate the determinants of 
trait weight, the idiosyncratic weighting find- 
ings suggest two possible avenues of explora- 
tion. There was some suggestion of individual 
differences in whether or not weight was asso- 
ciated with scale value. Nearly two-thirds of 
the subjects who had more than one signif- 
icant stimulus replication showed the same 
weighting pattern (either negativity or posi- 
tivity) for each significant replication. Con- 
ceivably, such a dispositional tendency could 
be related to other individual differences in 
positivity or negativity in impression judg- 
ments (see Kaplan, 1973). A second ap-
proach would be to relate trait weight to the 
personal constructs (Kelly, 1955; Rosenberg, 
1977) each individual characteristically uses 
to describe important others in his or her 
social world. 


Most studies that have found sex differences in aggression have reported that
males are more aggressive than females. Recent evidence, however, suggests 
that the expectation of female nonaggressiveness may be unwarranted. The 
present study attempted to reconcile these differences by considering the con- 
tingencies of female aggression. Thirty females competed in a task designed to 
measure aggression (a) alone, (b) in the presence of a silent observer, or (c)
in the presence of a supportive observer. Results indicated that as provocation
increased, women in the private condition responded more aggressively than 
did women in the public condition. Also, women who responded in the presence 
of an audience were more aggressive when the observer was supportive than 
when she was silent. It is concluded that the usual findings of female non- 
aggressiveness may be attributable to women’s expectations of disapproval for
aggressive behavior.


Traditionally, women have been viewed as 
the nonaggressive sex. Most of the studies 
that have found sex differences in aggression 
have reported that males are more aggressive 
than females. These results have been ob- 
tained with children (Liebert & Baron, 1972; 
Pederson & Bell, 1970; Santrock, 1970; 
Shortell & Biller, 1970) as well as with adult 
subjects (Buss, 1966; Doob & Gross, 1968; 
Epstein, 1965; Youssef, 1968), using various 
methodologies and different aggression para- 
digms.

It might appear, then, that females are 
adhering to a “norm of passivity.” According 
to Maccoby and Jacklin (1974): 


Aggression in general is less acceptable for girls, and 
is more actively discouraged in them, by either 
direct punishment, withdrawal of affection, or sim- 
ply cognitive training that “that isn’t the way girls 
act.” Girls then build up greater anxieties about 
aggression, and greater inhibitions against displaying 
it. (p. 234) 
This response inhibition hypothesis proposes
that the female’s socialization history causes
her to inhibit the display of aggression. Mac-
coby and Jacklin (1974) suggest that sex
differences in aggression may arise because
females are reinforced for nonaggressive be-
havior. Since aggression in general is less ac-
ceptable for girls and is more actively dis-
couraged in them by socialization agents, girls
have greater anxieties about aggression as well
as greater inhibitions against displaying it.
Frodi, Macaulay, and Thome (1977), in
their review of the literature on sex differ-
ences in aggression, note that in the course
of normal socialization, women learn […] re-
spond to provocation with aggressi[…]
and that in our culture aggressive behavior
from women is not approved of and receives
negative evaluations. Thus, they propose
that “through avoidance or […]
arousal of aggression anxiety or […]
avoid acting aggressively” (p. 65[…]).

Other evidence, however, suggests that the
expectation of female nonaggressiveness may
be unwarranted. Taylor and Epstein (1967)
found that sex differences in aggression dis-
appeared when women were confronted with
increasing provocation by a male opponent in 
a reaction time experiment. Using the same 
paradigm, Richardson (Note 1) found no sex 
differences in response to continually high 
provocation. In addition, Frodi et al. (1977) 
concluded that evidence indicates that women 
do not show consistently lesser tendencies to 
physically aggress than do men. 

It is apparent that experimental evidence 
concerning the aggressiveness of women is 
contradictory. These differences might be 
reconciled, however, by looking at the en- 
vironmental contingencies of female aggres- 
sion. The variability of findings may result 
from the presence or absence of elements 
in the experimental situation that evoke con- 
formity to sex role expectations (Frodi et al., 
1977). The socialization process that women 
in our culture undergo has taught them that 
in many or most situations, aggression will 
not be approved of and will receive negative 
sanctions (Costrich, Feinstein, Kidder, Mare- 
tek, & Pascale, 1975). Thus women have 
learned to inhibit their aggressive behavior. 
At times, however, in spite of childhood so-
cialization and adult sex role restrictions,
women do behave aggressively. This negation
of traditional sex role requirements can be
explained once certain situational factors have
been inspected. 

Using a competitive reaction time task, 
Taylor and Epstein (1967) found that when 
faced with increasing provocation from a 
male opponent, women would deliver shocks
to their opponent at an intensity equivalent
to that set by male participants for male
opponents. Since the male opponent violated
social expectations by delivering relatively
high shocks to a woman, the female partici-
pants may have felt that they, also, were
excused from the usual sex role requirements.
Frodi et al. (1977) note that sex differences
in aggression seldom appear when aggression
is permissible behavior for women. That is,
when aggression is justified or when women
are acting anonymously (as a deindividuated
member of a group), sex differences in aggres-
sion are not found.

The contention that women may drop their
stereotypical feminine role and behave in a
relatively masculine manner also receives sup-
port beyond the area of aggression. In a 
study conducted by Kidder, Bellettirie, and 
Cohn (1977), reward allocations were made 
by male and female participants in either a 
public or a private condition. In the public 
condition, each sex conformed to its role 
expectations. When assured of anonymity, 
however, both sexes abandoned their tradi- 
tional sex-typed roles. 

The present study was designed to investi- 
gate the effects of situational contingencies on 
so-called female nonaggressiveness. Female 
participants competed against a male oppo- 
nent in a reaction time task designed to 
measure aggression. The participants com- 
peted in one of three conditions: (a) private 
—alone in the experimental room; (b) pub- 
lic—in the presence of a silent female ob- 
server; or (c) supportive other—in the pres- 
ence of a female observer who offered social 
support for retaliating against the opponent. 

It was hypothesized that women in the 
public condition would behave less aggres- 
sively in response to provocation than would 
women in the private condition. The presence 
of a silent female observer in the public con- 
dition was expected to make salient the tradi- 
tional sex role expectations for female non- 
aggressiveness (e.g., Frodi et al., 1977) and 
result in behavior that conformed to the par- 
ticipants’ perceptions of what the observer 
expected of them. Women in the supportive
other condition, not having to rely on their 
perceptions of the observer's expectations and 
knowing that the observer did not expect 
nonaggressive behavior, were expected to be- 
have more aggressively than would women in 
the public condition. It was also hypothesized
that women in all three conditions would 
respond more aggressively as provocation by 
their male opponent increased. 


Method 
Participants 


The participants were 30 female undergraduates 
enrolled in introductory psychology classes at a 
large midwestern university. Participation was in 
partial fulfillment of course requirements. Ten 
women were randomly assigned to each of the three
experimental conditions.
Procedure


Before initiating the procedure, participants 
were informed that they would be competing in a 
reaction time task with the person in the next room, 
that there would be a slight electric shock in- 
volved, and that they had the right to leave the 
experiment at any time without penalty. Partici- 
pants were then encouraged to ask questions before 
signing the informed consent statement, which also 
served as their record of participation.¹

The female experimenter seated the participant at 
the task board (described by Taylor, 1967) and 
attached a concentric shock electrode to her wrist. 

In the public and supportive other conditions, a 
female observer (confederate) arrived at the door of 
the participant’s cubicle and reminded the experi- 
menter that she was a student whose adviser had 
recommended that she observe an experiment in 
progress. After obtaining the participant’s consent, 
the experimenter seated the observer slightly behind 
and to the side of the participant and then left the 
room for 5 minutes, presumably to attach the 
opponent’s shock electrode. 

In the private condition, no observer was pre- 
sent, but the procedure was otherwise identical to 
the other two conditions. 

The participant's “unpleasantness” threshold was 
then determined. That is, intensity of shock was 
gradually increased until the participant reported 
it to be “definitely unpleasant.” This was the 
maximum intensity administered and was desig- 
nated as Level 5; 90% of the maximum was desig- 
nated as Level 4; 80% as Level 3; 70% as Level 
2; and 60% of the maximum was designated as 
Level 1. The participant also overheard a tape 
recording of a male voice telling the experimenter 
when the shock was unpleasant for him. Thus, she 
was led to believe that she was to be competing 
with a male opponent, although no opponent was 
actually involved in this experiment. 
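
The mapping from a participant's unpleasantness threshold to the five shock levels can be sketched as follows. The function name and the threshold value in the usage line are hypothetical, and the unit of intensity is left unspecified because the text does not give one; only the percentages (60% through 100% of the maximum) come from the procedure described above.

```python
from typing import Dict


def shock_levels(unpleasantness_threshold: float) -> Dict[int, float]:
    """Map Levels 1-5 to intensities at 60%-100% of the participant's threshold."""
    percent_of_maximum = {1: 60, 2: 70, 3: 80, 4: 90, 5: 100}
    return {level: unpleasantness_threshold * percent / 100
            for level, percent in percent_of_maximum.items()}


# Hypothetical threshold of 2.0 (arbitrary units) for one participant.
levels = shock_levels(2.0)
for level, intensity in sorted(levels.items()):
    print(level, round(intensity, 2))
```

Scaling every level to the individual threshold means that a given level number represents the same relative intensity for each participant, even though absolute intensities differ.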

The participant then heard a tape recording of 
the task instructions. She was told that at the 
beginning of each trial she should select any one of 
the five intensities of shock she wished her oppo- 
nent to receive at the end of the trial if his reac- 
tion time was slower than hers and that she would 
receive the shock her opponent set for her if she 
was slower than her opponent. 

The trials began immediately after the completion 
of the task instructions. Each trial consisted of 
four specific events: (a) the onset of the set light, 
which instructed the participant to press one of the 
five buttons corresponding to the intensity of shock 
she wished her opponent to receive; (b) the onset 
of the press light, which instructed her to press the 
reaction time key; (c) the onset of the release 
light, signaling to the participant that she should 
remove her finger from the reaction time key as 
quickly as possible; and (d) the onset of one of 
the five small red lights at the top of the task 
board, which indicated the intensity of shock that 
had been set for the participant by her opponent 
and, if the opponent was faster, the shock as well. 
Each participant competed in 25 trials. These
consisted of four blocks of 6 trials each plus an 
extra trial to measure the participant's reaction to 
Trial 24. The average feedback settings for Blocks
1 through 4 were 1.5, 2.5, 3.5, and 4.5, respectively.
Wins and losses and feedback/shock delivered to the
participant were programmed by the experimenter.
The participant was led to believe that she had
won 50% of the trials within each block.

In the supportive other condition, the observer 
initiated conversation with the participant during 
the first block of trials, speaking in a friendly, 
spontaneous manner and sympathizing with her
whenever she lost a trial. As the attack increased in 
the later blocks of trials, the observer attempted 
to encourage the participant to reciprocate by 
commenting upon the male opponent’s behavior 
(“he went up there a little bit,” “he gave a 
three!"); indicating that if she were competing 
with the opponent, she would retaliate (“I wouldn't 
let him get away with that!”, “I don’t think I’d
let him step all over me”); and suggesting that
the participant reciprocate (“Another 4! Maybe you 
should give it back”). At no time did the observer 
suggest initiating attacks or setting higher intensity 
shocks than the opponent. The observer in the pub-
lic condition remained silent throughout the
experimental session.

Following the competition task, women in the
public and supportive other conditions were asked
to rate the observer on a series of 6-point bipolar
adjective scales.

1 This statement indicated that the participant
understood (a) the purpose of the experiment, (b)
any risks that might be involved, (c) that she
could leave the experiment at any time without
penalty, and (d) that the details of the experiment
would be explained after participation was completed.


Results


Aggression was measured by the magnitude
of shock the participants set for administra-
tion to their opponent. Participants were re-
quired to select a shock intensity on the first
trial without any knowledge of the
intentions of the opponent. As expected, there
were no significant differences among condi-
tions in the initial trial, F(2, 27) = […].

A 3 (Group) × 4 (Blocks) analysis of
variance was performed on mean shock in-
tensities set during the four blocks of trials
in order to examine changes in shock settings
over blocks. There was no significant main
effect for group, F(2, 27) = 2.24, ns. How-
ever, the interaction between group and
blocks was significant, F(6, 81) = 4.60, p <
.0005. According to Newman-Keuls analyses,
as the trials progressed, women in the private 
and supportive other conditions responded in 
an increasingly more aggressive manner than 
did women in the public condition (p < .05). 
Figure 1 demonstrates this relationship and 
the similarity of the behavior of participants 
in supportive other and private conditions. 
The mean shock intensities set by women 
in the private condition in Blocks 1 through 
4 were 1.52, 1.97, 2.68, and 3.20, respectively. 
Women in the supportive other condition set
mean shock intensities of 1.60, 1.90, 2.79,
and 3.57 in Blocks 1 through 4, respectively. 
However, women in the public condition evi- 
denced less increase by setting mean shock 
intensities of 1.73, 1.58, 2.12, and 2.03 in 
Blocks 1 through 4. Further analysis re- 
vealed that the differences in mean shock set- 
tings were significant only in the final blocks 
of trials. 
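
As a quick arithmetic check on the means reported in this section, the sketch below averages the three group means within each block and computes each group's change from Block 1 to Block 4. It assumes equal group sizes, which is an assumption made here (the error degrees of freedom of 27 are consistent with 10 women per group). The variable and function names are hypothetical.

```python
# Group means per block as reported in the text (Blocks 1 through 4).
GROUP_MEANS = {
    "private": [1.52, 1.97, 2.68, 3.20],
    "supportive other": [1.60, 1.90, 2.79, 3.57],
    "public": [1.73, 1.58, 2.12, 2.03],
}


def block_means(group_means):
    """Unweighted mean of the group means in each block (assumes equal n)."""
    columns = zip(*group_means.values())
    return [round(sum(col) / len(col), 2) for col in columns]


def change_over_blocks(group_means):
    """Block 4 mean minus Block 1 mean for each group."""
    return {g: round(m[-1] - m[0], 2) for g, m in group_means.items()}


if __name__ == "__main__":
    print(block_means(GROUP_MEANS))
    print(change_over_blocks(GROUP_MEANS))
```

Under the equal-n assumption, the computed values for Blocks 1, 3, and 4 agree with the overall block means reported in the text. The Block 2 value comes out slightly lower than the reported 1.85, which may reflect rounding, unequal cell sizes, or a transcription error. The change scores show the pattern described above: a much smaller increase in the public condition than in the other two.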
As predicted, there was a significant main
effect of blocks, F(3, 81) = 36.22, p < .001,
indicating that shock settings, in general,
increased over blocks. The mean shock intensities
set in Blocks 1 through 4, respectively,
were 1.62, 1.85, 2.53, and 2.93.
Following their participation in the competition
task, women in supportive other and
public conditions were asked to rate the observer
on a series of 6-point adjective scales
with a midpoint of 3.5. Analysis of these data
by t tests indicated that the observer in the
supportive other condition was perceived to
be significantly more friendly (4.90 vs. 3.90),
good humored (5.50 vs. 4.60), happy (4.40
vs. 3.60), honest (4.60 vs. 3.40), and sociable
(4.70 vs. 3.60) than was the observer in
the public condition. Also, the observer in
the public condition was perceived as being
significantly less revengeful (1.40 vs. 3.40),
less competitive (2.40 vs. 4.10), less aggressive
(2.80 vs. 4.00), and more passive (3.60
vs. 2.00) than was the observer in the supportive
other condition (all ps < .05).


Discussion 
The present study attempted to delineate
the effects of situational contingencies on
female aggression. It was assumed that women
would respond to provocation less aggressively
in the presence of a silent audience than when
Figure 1. Mean shock settings as a function of group
and block.


they were either alone or in the presence of 
a supportive audience. Both hypotheses were 
supported. Women in the private condition 
responded to the male opponent’s provoca- 
tion by setting higher shocks than the women 
who participated in the presence of a silent 
observer. Similarly, women who were encour- 
aged by the audience retaliated at higher lev- 
els than did those who received no such social 
support. 

As predicted, situational contingencies in- 
fluenced female retaliative behavior, thereby 
bringing into question the common assump- 
tion of female nonaggressiveness. Indeed, it 
appears that the aggressiveness of women was 
largely determined by the contingencies present.
When reciprocal responding was clearly
called for, as in the supportive other condition,
women retaliated. When their response
was relatively anonymous, as in the private
condition, women retaliated. The only situation
in which nonaggressiveness was evidenced
was in the presence of a silent observer in 
the public condition. In this case, one might 
expect that the women were responding to 
the assumed expectations of the audience. 

The relatively low level of aggression in 
the public condition lends support to the
response inhibition hypothesis mentioned
earlier. That is, the presence of the silent
observer may have increased the partici- 
pant’s concern about the “appropriateness” 
of her behavior, thereby enhancing the sali- 
ence of the “norm of passivity.” Evidence 
for this interpretation is provided by Borden 
(1975), who found that male participants’ 
aggressive behavior was influenced by their 
expectations of approval from observers. 
Borden noted that the “generalized expect- 
ancy associated with the sex of the observer” 
could account for his participants’ lower lev- 
els of aggression when they were observed by 
a female than when they were observed by a 
male. Borden concluded, as we might, that 
the participants’ aggressive behavior was 
“apparently a function of their expectations 
of approval for such behavior, based on the 
inferred or explicit values of the observer” 
(p. 567). Although such values were not 
explicit in the present study, participants did 
attribute characteristics consistent with the 
“norm of passivity” to the silent observer. 
They perceived her to be relatively nonre- 
vengeful, noncompetitive, nonaggressive, and 
passive. The participants in the public condi- 
tion apparently adapted their retaliative 
responding in accordance with this expecta- 
tion. 

Similarly, behavior of participants in the 
supportive other condition was consistent 
with expectations of the observer. The con- 
tingencies in that situation reinforced recip- 
rocation of the opponent’s behavior. Thus, 
expectations were no longer for nonaggressive 
responding and the female participants could 
reject the “norm of passivity” and still re- 
spond in a manner that would conform to 
the apparent values of the observer. It was
not necessary for these women to inhibit 
their aggressive behavior due to fear of nega- 
tive sanction for “out-of-role” behavior. 


Reference Note 


1. Richardson, D. C. Sex differences in aggression:
A new perspective. Paper presented at the meeting
of the Southeastern Psychological Association,
Atlanta, March 1978.


Emotion Recognition: The Role of Facial Movement and the
Relative Importance of Upper and Lower Areas of the Face


John N. Bassili 
University of Toronto 
Toronto, Canada 


In order to investigate the role of facial movement in the recognition of emo- 
tions, faces were covered with black makeup and white spots. Video recordings 
of such faces were played back so that only the white spots were visible. The 
results demonstrated that moving displays of happiness, sadness, fear, sur- 
prise, anger and disgust were recognized more accurately than static displays of 
the white spots at the apex of the expressions. This indicated that facial 
motion, in the absence of information about the shape and position of facial 
features, is informative about these basic emotions. Normally illuminated 
dynamic displays of these expressions, however, were recognized more ac- 
curately than displays of moving spots. The relative effectiveness of upper 
and lower facial areas for the recognition of these six emotions was also in- 
vestigated using normally illuminated and spots-only displays. In both in- 
stances the results indicated that different facial regions are more informative 
for different emotions. The movement patterns characterizing the various emotional
expressions as well as common confusions between emotions are also
discussed.


The communication of emotions through
facial expression has received considerable
attention since Darwin’s (1872) analysis of
the phenomenon in humans and animals. A
variety of issues have been investigated
through the years: Can observers recognize
[...] genetically prewired? If so, can it be demonstrated
that they are universally recognizable?
What is the relative contribution of contextual
information and facial expressions to
the recognition of emotions? Do different
emotions define points in a multidimensional
space or are they organized into distinct
[...] rise to the judgment of different emotions.
This direction has variously been referred to 
as “psychophysical approach” (Frijda, 1968) 
or “judgment approach” (Ekman, Friesen, & 
Ellsworth, 1972). It involves the description 
of information in the face that leads observ- 
ers to specific judgments of emotions. 
Several studies have investigated this prob- 
lem by presenting subjects with photographs 
of emotional expressions and analyzing the 
relationships between components of the ex- 
pressions and judgments made by observers 
(Ekman, Friesen, & Tomkins, 1971; Frijda, 
1968; Frijda & Philipszoon, 1963; Frois- 
Wittmann, 1930; for a review see Ekman et 
al., 1972). Two features of such experiments
are notable. The first is that judgment studies
of emotional expression have relied on
static representations of facial expressions 
(photographs).

1 Ekman and Friesen’s (1978) Facial Action Coding
System is a notable exception to this assertion.
Empirical research is needed, however, to establish
correspondences between the system and specific
emotions (see Ekman & Oster, 1979, for a brief
review).

Several authors have criticized the use of such stimuli. For example,
Bruner and Taguiri (1954) point out that “it 
is rare to the vanishing point that judgment 
ever takes place on the basis of a face caught 
in a state similar to that provided by a 
photograph snapped at 20 milliseconds” (p. 
638). Nevertheless, Ekman et al. (1972) have 
argued that frozen representations of the 
apex of facial movement provide an eco- 
nomical record of emotional expressions. 
Moreover, many studies using such records 
have demonstrated the sufficiency of static 
information for the recognition of emotions 
(see Ekman et al., 1972; Izard, 1971, for 
reviews). 

The second notable feature of judgment 
studies concerns the type of description of 
facial information yielded by the analysis of 
static stimuli. Such descriptions have invaria- 
bly analyzed the face in terms of the position 
and shape of features and wrinkles. For example,
Frois-Wittmann (1930) analyzed the
face in terms of brow, upper lid, lower lid, 
nostril, and so forth. Other researchers (e.g., 
Ekman & Friesen, 1975; Frijda & Philipszoon, 
1963; Leventhal & Sharp, 1965) have also 
offered such feature-based analyses of the 
face. Moreover, Ekman, Friesen, and Tomkins 
(1971) state that in devising their Facial Af- 
fect Scoring Technique, “a decision was 
made to describe the appearance of the face 
primarily in terms of wrinkles, of tension or 
relaxation in specific features, and of posi- 
tions of features” (p. 40). 

Feature-based descriptions derived from 
static stimuli ignore several forms of facial 
information relevant to the judgment of emo- 
tions. For example, Ekman and Friesen 
(1975) discuss situations where the rate at 
which an emotion is expressed modifies the 
meaning of the expression. One such case is 
the very brief expression of disgust implying 
that the person is not actually feeling the 
emotion but may be referring to something 
disgusting. 

Another kind of facial information over- 
looked in the analysis of static stimuli con- 
cerns the structured deformations of the 
surface of moving faces. In a recent experi- 
ment, Bassili (1978) argued that because 
facial muscles are fixed in a certain spatial 
arrangement, the deformations of the elastic
surface of the face to which they give rise
during facial expressions may be very informative
in the recognition of emotions.

In order to investigate facial movement
independently of feature-based information
that could be captured in a static record of
the face, Bassili (1978) covered faces with
black makeup and placed white spots in a
random order over them. Using a technique
pioneered by Johansson (1973), the faces
were videotaped and played back on a television
monitor adjusted for maximum contrast
and low brightness so that only the
white spots were visible. The results of Bassili’s
(1978) experiment demonstrated that
observers were more accurate in recognizing
several basic emotions on the basis of facial
movement alone than would be expected by
chance. However, accuracy rates increased
significantly when the faces of actors were
fully visible.

Bassili’s (1978) experiment suffered from
two problems. First, because the actors had
not been trained in the expression of emotions,
recognition rates, even under conditions
where the face was normally illuminated,
were not always very high (e.g., 31% accuracy
in the recognition of expressions of happiness,
when 16.7% accuracy could be expected
by chance). Second, so many white
spots were attached to the faces (around 100)
that the outline of features could occasionally
be detected. Since no control conditions exhibiting
the blackened faces statically at the
apex of the expressions were included in the
experiment, it is not possible to state with certainty
that the better-than-chance recognition
rates were caused by motion information
alone.

The present experiment investigates the
role of facial motion using actors trained in
the expression of emotions, as well as conditions
minimizing the presence of feature-based
information in the records of expressions. The
criterion for the latter aim is that the emotions
should not be recognizable if the array
of white spots is frozen at the apex of expressions.
Control conditions using such
arrays were included in the experimental design.

Another issue investigated in the present
experiment involves the division of the face
into an upper and lower area. Ekman et al.
(1972) have reviewed research investigating
the relative importance of different facial
areas in judgments of emotions. Of the seven
studies relevant to this issue, two suggested
that the mouth area is dominant (Dunlap,
1927; Ruckmick, 1921), two found that no
facial region is dominant (Coleman, 1949;
Frois-Wittmann, 1930), and three suggested
that different areas are important for different
emotions (Hanawalt, 1944; Nummenmaa,
1964; Plutchik, 1962).

These contradictions were probably caused 
by the fact that few factors were standardized 
across experiments; the number of actors, 
their training, and the type of record em- 
ployed, as well as the emotions portrayed, 
varied across experiments. For example, Dun- 
lap’s (1927) conclusion regarding the su- 
periority of the mouth area was based mainly 
on judgments of happiness. This finding does
not contradict results from experiments suggesting
that different areas are important for
different emotions (e.g., Hanawalt, 1944, also
found the bottom of the face to be dominant
in judgments of happiness).

Since the present experiment benefits from
recent developments in research on the recognition
of emotions (i.e., the use of several
trained actors, as well as the choice of six categories
of emotions that have received considerable
attention in the past), it was decided
to investigate again the relative importance of
the upper and lower face in judgments of
emotions. Moreover, this would afford a comparison
of different facial areas under conditions
where only facial motion serves as information.


Method 


Overview


Six actors were video-recorded while expressing
happiness, sadness, fear, surprise, anger, and disgust.
The first independent variable (illumination)
consisted of either normally illuminated faces or
faces covered with black makeup and white spots.
The second variable (area) was cross-cut with the
illumination variable and consisted of full faces,
upper faces, or lower faces. Since six actors expressed
six emotions in two illumination and three
area conditions, a total of 216 moving displays
were shown to subjects. Following these displays,
subjects were shown 36 static displays depicting
the apex of the six emotions expressed by each of
the actors in the spots-only/full face mode. Following
each display, subjects chose from among the
six labeled emotions the one expressed in the display.


Subjects 


Ten male and 10 female volunteers drawn from 
an introductory course in psychology served as sub- 
jects in the experiment. They were paid $5 for 
their participation. 


Stimulus Materials 


Twenty students enrolled in a laboratory course 
in social psychology were trained in the expression 
of emotions. The students were provided with 
photographs of emotional expressions prepared by 
Ekman and Friesen (1975). Each of the emotions— 
happiness, sadness, fear, surprise, anger and disgust 
—was represented in two photographs. Underneath 
the photographs was a written description of the 
facial features critical to an expression. Students 
were given a week to practice their skills at dupli- 
cating the expressions shown in the photographs. At 
the next meeting, students paired up and helped 
each other perfect their expressions. Each student
was then video-recorded while expressing the six 
emotions. The quality of each expression was later 
rated by all students in the class with the exception 
of the actor. The six actors receiving highest rat- 
ings (four males and two females) were retained 
for further class research and were later approached 
for participation in the present experiment. 

Each actor was first recorded prior to the appli- 
cation of makeup. The expression of the six emo- 
tions was recorded under three conditions: full 
face, upper half of the face, and lower half of the 
face. The division of the face into upper and lower 
regions was accomplished by having the actor hold 
a rectangular piece of cardboard in front of the 
area to be masked. The horizontal edge of the 
mask was placed slightly below the bridge of the 
nose. Next, the actors were made up in black, and 
about 50 white spots, 8 mm in diameter, were 
affixed to their faces in a haphazard or quasi- 
random arrangement. The recording procedure was 
then repeated. 
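
The crossing of actors, emotions, illumination, and facial area described in this section can be enumerated directly. The sketch below is illustrative only: the actor numbering, the condition labels, and the function names are hypothetical, while the factor levels and the restriction of static control displays to the spots-only/full face mode come from the text.

```python
from itertools import product

ACTORS = range(1, 7)  # six actors; the numbering is hypothetical
EMOTIONS = ["happiness", "sadness", "fear", "surprise", "anger", "disgust"]
ILLUMINATION = ["normal", "spots-only"]
AREA = ["full face", "upper face", "lower face"]


def moving_displays():
    """Every actor x emotion x illumination x area combination."""
    return list(product(ACTORS, EMOTIONS, ILLUMINATION, AREA))


def static_controls():
    """Static apex displays, spots-only/full face mode only."""
    return [(actor, emotion, "spots-only", "full face")
            for actor, emotion in product(ACTORS, EMOTIONS)]


if __name__ == "__main__":
    print(len(moving_displays()), len(static_controls()))
```

The counts follow from the design: 6 × 6 × 2 × 3 moving displays and 6 × 6 static control displays.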

These recordings yielded 216 single displays of 
an emotion, each lasting about 3 sec. The order 
of the expressions was randomized and divided into 
two blocks of 108 displays. The order of presentation
of the two blocks was counterbalanced across
subjects. These presentations were preceded by 36 
practice trials and were followed by 36 control 
trials. The control displays consisted of a 30-sec 
static presentation of the apex of each of the six 
emotions expressed by the six actors under spots-only/full
face conditions. The order of these displays
was also randomized.


Procedure


Subjects took part in the experiment in pairs and 
were separated by a large partition. They were 
seated 4 ft. (1.2 m) from two 11-in. (28-cm) Sony 
television monitors. One monitor was adjusted for 
normal viewing, whereas the other had its bright- 
ness and contrast levels adjusted so that only the 
white spots were visible in spots-only conditions. 
The experimenter switched the monitors on and off 
for the appropriate displays. 

Each display was followed by a 10-sec interval 
for responses. The responses were made on a short 
questionnaire containing 300 sets of the six emo- 
tion labels. Subjects simply checked which of the 
six labels corresponded best to the emotion por- 
trayed in a display. They were warned at the 
outset of the experiment that only white spots that 
had been placed on the faces of actors would 
be visible in certain displays, whereas other displays
would only show the upper or lower half
of the face. Moreover, the practice trials contained
an example of each of the conditions.


Results and Discussion


The results can be presented most clearly
if the following three issues are considered in
order: comparison of accuracy rates in moving
and static spot displays; analysis of accuracy
rates, using a 2 (Sex) × 2 (Illumination)
× 3 (Facial Area) × 6 (Emotion) repeated
measures analysis of variance
(ANOVA); and analysis of movement patterns
and confusions. The accuracy rates used in
the statistical analyses were computed by
giving subjects a score representing the proportion
of displays of an expression, as portrayed
by the six actors, that they recognized
accurately.


Comparison of Moving and Static Displays


One of the main questions addressed by the
present research concerns the sufficiency of
motion information for the recognition of
emotions. To answer this question, the accuracy
rates yielded by spots-only/full face displays
were compared with those yielded by
static displays of the same faces at the apex
of expressions. A 2 (Sex) × 2 (Motion) ×
6 (Emotion) ANOVA with repeated measures
on the last two factors was used for this
analysis. It revealed that moving displays
yielded on the average (M = 52.7%) significantly
higher recognition rates than static
displays did (M = 29.5%), F(1, 90) =
192.29, p < .01. The effect of sex and emotion
will receive attention in the next section.
Table 1 shows the mean accuracy rates
relevant to the present analysis.


Table 1

Recognition Accuracy in Moving and Static Displays

              Emotion
Display       Happiness   Sadness   Fear
Moving          90.0        44.2    43.3
Static          30.8        25.8    12.5
Difference      59.2        18.4    30.8
M               60.4        35.0    27.9

Note. Numbers represent recognition accuracy. The accuracy rate expected by chance is 16.7.


Since the ANOVA also revealed a significant
interaction between motion and emotion, F(5,
90) = 11.37, p < .01, tests of simple main
effects were performed to verify the effect of
motion on each emotion individually. These
tests indicated that the recognition rate in
the case of all six emotions under movement
was superior to the recognition rate under
static conditions, F(1, 18) > 9.4, p < .01 in
all cases. The recognition rate in each static
condition was also compared with expected
chance performance. These comparisons revealed
that only in the case of happiness and
surprise was recognition of static displays
significantly more accurate than chance, t(19)
= 2.83 and t(19) = 7.59, respectively, p <
.01, one-tailed in both cases.
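
The scoring rule and the comparison with chance used in this section can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not the author's analysis code: the function names are hypothetical, and the t statistic shown is the ordinary one-sample form, which is assumed here to be the test behind the reported t(19) values.

```python
import math

EMOTIONS = ["happiness", "sadness", "fear", "surprise", "anger", "disgust"]

# With six response labels, guessing yields 1 correct choice in 6.
CHANCE_PERCENT = 100.0 / len(EMOTIONS)


def accuracy_rate(target, responses):
    """Percentage of displays of `target` that a subject labeled correctly."""
    if not responses:
        raise ValueError("at least one response is required")
    hits = sum(1 for r in responses if r == target)
    return 100.0 * hits / len(responses)


def one_sample_t(scores, mu):
    """One-sample t statistic and degrees of freedom against a fixed value."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are required")
    mean = sum(scores) / n
    variance = sum((x - mean) ** 2 for x in scores) / (n - 1)
    if variance == 0:
        raise ValueError("scores have zero variance")
    return (mean - mu) / math.sqrt(variance / n), n - 1


if __name__ == "__main__":
    # Hypothetical subject: three of six actors' happiness displays recognized.
    print(accuracy_rate("happiness", ["happiness"] * 3 + ["fear"] * 3))
    print(round(CHANCE_PERCENT, 1))
```

With 20 subjects, `one_sample_t` returns 19 degrees of freedom, matching the t(19) values reported above. A one-tailed test then asks only whether the mean exceeds the chance value.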


Accuracy Rates in Overall Factorial Design 


The next two questions addressed by the
present research concern the comparison of
normally illuminated moving faces with those
where only facial motion is visible, and the
relative effectiveness of the upper and lower
regions of the face for the recognition of the
various emotions.


Table 1 (continued)

              Emotion
Display       Surprise   Anger   Disgust        M
Moving          88.3      39.2    53.3          59.1
Static          47.5      16.7    21.7          29.4
Difference      40.8      22.5    [illegible]   [illegible]
M               67.9      27.9    [illegible]   [illegible]

A 2 (Sex) × 2 (Illumination) × 3 (Facial
Area) × 6 (Emotion) repeated measures
ANOVA was used to analyze subjects’ accuracy
rates. All main effects were significant. Female
subjects (M = 68.1%) were on the average
more accurate than male subjects (M =
61.6%), F(1, 18) = 6.85, p < .025. Displays
of normally illuminated faces (M = 77.3%)
yielded higher recognition rates than displays
of moving spots did (M = 52.3%), F(1, 18) =
[...]10.31, p < .001. Full face displays (M =
74.4%) were recognized more accurately than
were displays of the bottom of the face (M =
64.9%), which were in turn recognized more
accurately than displays of the top of the face
(M = 55.1%), F(2, 36) = 50.40, p < .001
(multiple comparisons by Scheffé tests yielded
p values smaller than .01 in all cases). Finally,
the main effect due to emotions, F(5,
90) = 20.50, p < .001, indicated that surprise
(M = 82.2%) and happiness (M =
79.6%) were recognized more accurately than
were sadness (M = 60.8%), anger (M =
58.9%), fear (M = 57.6%), and disgust
(M = 49.7%) (p < .01 by Scheffé tests in
all cases).


Figure 1. Recognition rates for the six emotions yielded by the
illumination and spots-only conditions.


Three two-way interactions were also significant.
An Illumination × Emotion interaction
indicated that the superiority of normally
illuminated displays over spots-only
displays was evident in anger, sadness, disgust,
and fear but not in surprise and happiness,
F(5, 90) = 58.94, p < .001. Moreover,
an Illumination × Facial Area interaction indicated
that the superiority of normally illuminated
displays was particularly pronounced
under conditions where the full face was presented,
F(2, 36) = 5.47, p < .01. Finally, a
Facial Area × Emotion interaction indicated
that the lower region of the face was superior
to the upper in the recognition of happiness
and disgust, the upper region was superior to
the lower in the recognition of anger and fear,
and facial region made little difference in the
recognition of surprise and sadness, F(10,
180) = 10.59, p < .001.

The three-way interaction between illumination,
facial area, and emotion was also significant,
F(10, 180) = 4.98, p < .001. It reflected
the fact that a facial area dominant in
a normally illuminated display was not neces- 
sarily dominant in a spots-only display (see 
Figure 1). 

The results yielded by this analysis are
relevant to several issues addressed by past
research. First, the main effect due to emotions
can be compared with a number of
studies using posed expressions. For example,
Ekman et al. (1972) present a table of “Accuracy
Studies of Posed Behavior” (p. 103).
Averaging over the studies described in this
table, happiness appears to be the easiest
emotion to recognize. The results of the present
experiment are in basic agreement with
this, though surprise achieved an even higher
accuracy rate. On the other side of the scale,
the recognition of disgust was particularly
poor in the present study. This was not generally
the case in past research. The reason
for the inferiority of disgust in the present
study is not clear and could be due to problems
in the training of actors. Another finding
relevant to past research has to do with
the fact that women subjects in the present
study achieved significantly higher recognition
rates than men did. This finding is consistent
with past research (Ekman & Oster,
1979).

The results yielded by the normally illumi- 
nated displays are of particular relevance to 
past research on the relative importance of the 
upper and lower regions of the face for emotion
recognition. The present results suggest
that the bottom of the face is more useful 
than the top for the recognition of happiness, 
surprise, and disgust, whereas the opposite 
is indicated for sadness and fear. Both regions 
were equally useful for the recognition of
anger. [...] past research concerned
[...] importance of various facial
[...] of emotions, however,
it is difficult to compare the present findings
with those yielded by past studies.
[...] only facial movement
serves as information, the bottom of the face
seems more useful for the recognition of happiness,
sadness, and disgust, whereas the reverse
holds for surprise and anger. The two
regions yielded equal (and poor) recognition
rates for fear. The results should be approached
with caution because the recognition
rates with the exception of happiness and surprise
were relatively low and in some cases did
not differ substantially from expected chance
performance. The type of facial motion information
relevant to the various emotions is,
however, considered further in the next section.


Movement Patterns and Confusions


This section analyzes the patterns of facial
movement yielded by the expression of the six
emotions. The analysis is accomplished with
the help of photographs depicting the path
of the white spots over the course of the expressions
(see Figure 2). In addition, a descriptive
analysis of major confusions is undertaken
(see Table 2). For this purpose,
an arbitrary cutoff point of 20% was used
to isolate emotions that were often confused
with a target emotion.


Happiness 


The movement pattern yielded by the expression
of happiness consists of an upward
displacement of each side of the mouth and
the cheeks. This movement results from a
smile. Following the 20% criterion, happiness
was confused substantially only with sadness
in spots-only/upper face displays (23.3%).


Sadness 


The movement pattern yielded by sadness
is more subtle. It consists of a slight [...]
displacement in the area of the chin, while
the forehead area reflects an inward and upward
movement of the eyebrows. In normally
illuminated and spots-only displays,
the bottom of the face yielded confusions with
disgust (23.3% and 30.0%, respectively).
However, when the full face or top was
shown in spots-only displays, confusions occurred
mainly with fear (22.5% and 25.8%,
respectively).


Fear


Fear involves a downward and outward
movement in the mouth area. The forehead
movement is similar to that of sadness (inward
and upward), although the movement is
more pronounced because the brows are raised
higher. The similarity of movement in the
forehead area in sadness and fear is reflected
by the fact that in spots-only/upper face conditions
fear was often confused with sadness
(24.2%). In addition, fear in these displays
was confused with surprise (25.8%). This
latter confusion was probably caused by the
strong upward movement of the brows characterizing
the expression of surprise. Another
common error yielded by the expression of
fear involved happiness and occurred in
spots-only/lower face displays (35.8%).


Surprise


Surprise was one of the easiest emotions to 
recognize, and was particularly so in spots- 
only displays. The expression involves a strong 
upward displacement of the brows and an 
equally strong downward displacement of the 
jaw. The rapidity of the movement (akin 
to a startle) is also specific to this emotion. 
Major confusions only occurred with fear in 
normally illuminated/upper face displays 
(28.3%). 


Anger 


The expression of anger involves a down- 
ward movement in the forehead area caused 


Figure 2. Movement patterns yielded by each of the six emotions investigated. (The photographs
represent time exposures of actual television displays. The shutter was opened just before
the onset of an expression and was closed just after the expression reached its apex. The arrows
indicate the direction of movement.)


Table 2
Confusion Matrices

Stimulus         Happiness      Sadness
Full face
  Happiness      97.5/90.0      0/.8
  Sadness        0/1.7          97.5/45.8
  Fear           0/10.0         1.7/14.2
  Surprise       [illegible]    0/.8
  Anger          0/0            .8/32.5
  Disgust        0/1.7          .8/2.5
Bottom half
  Happiness      90.0/97.5      .8/0
  Sadness        0/2.5          68.3/48.3
  Fear           .8/35.8        6.7/10.8
  Surprise       .8/.8          0/3.3
  Anger          .8/10.0        7.5/20.0
  Disgust        0/7.5          1.7/3.3
Top half
  Happiness      55.0/47.5      13.3/23.3
  Sadness        0/2.5          74.2/32.5
  Fear           0/1.7          0/24.2
  Surprise       .8/0           0/.8
  Anger          0/0            2.5/23.3
  Disgust        .8/0           3.3/18.3

Note. The numbers represent the percentage of choices of a column label in response to the display of an
emotion described by a row label; numbers on the left of the slash are for responses to normally illuminated
displays, and those on the right are for responses to spots-only displays.


by a frown, along with a compression in the
mouth area, caused by the pinching of the
lips. Anger was confused with disgust in lower
face displays (both normally illuminated and
spots-only conditions, 20.8% and 25.0%, respectively).
The normally illuminated/upper
face displays yielded confusions with fear
(20.0%). Finally, all spots-only displays
yielded confusions with sadness (full face =
32.5%, lower face = 20.0%, upper face =
23.3%).


Disgust 


The expression of disgust involves the 
wrinkling of the nose, which causes an upward 
movement on its sides as well as on the 
cheeks. Moreover, the expression can involve 
an upward movement in the area of the chin. 
Though the wrinkling of the nose is specific 
to disgust, this emotion leads to low recogni- 
tion rates. Confusions, however, seemed to 
concentrate on anger. In all conditions ex- 
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Response 
Fear Surprise 
2.5/3.3 0/1.7 
.8/22.5 0/5.0 
95.8/43.3 0/16.7 
10.8/7.5 87.5/88.3 
2.5/5.8 0/4.2 8 
8/5.8 0/1.7 28 
8.3/0 0/1.7 8/. 
0/4.2 0/0 8.3/15.0 23.3/300 
75.8/19.2 1.7/2.5 9.2/9.2 5.8/228 
16.7/11.7 82.5/79.2 8 i 
8/7.5 0/7.5 70.0/30.0 20.8/25! 
0/20.0 0/9.2 28,3/15.8  70.0/44 
8.3/15.8 8/2.5 8.3/5.0 14.2/58 
15.0/25.8 0/10.0 5.8/16.7 A 
89.2/20.8 10.0/32.5 0/13.3 8/75, 
28.3/11.7 70.0/85.8 0/0 sii 
20.0/6.7 1.7/1.7 69.2/56.7 6.1/! i 
2.5/10.0 0/5.8 51.7/46.7 41.7/1 


ly illuminate 


cept spots-only /bottom face displays (15.8% i 
anger was taken for disgust on more "f 
28% of the trials. This confusion was 7 
ticularly pronounced in upper face dip a; 4 
(normal illumination = 51.7%, spots -ort a | 
46.7%). The spots-only/bottom face display») 
lead to confusion with fear (20.0%). a 
One possible cause of confusions among emotions
is that their expressions are so similar that they
are difficult to discriminate. If this is the case,
confusions should be reciprocal. This was seldom the
case in the confusions just described. Reciprocity
was only evident in two cases: fear and sadness in
spots-only/upper face displays and disgust and anger
in normal illumination/bottom face displays. Another
possibility is that the actors failed to represent
emotions adequately. This possibility, with one
exception, is not likely, since recognition rates in
normally illuminated/full face displays were very
high (above 87%). The exception involves expressions
of disgust (70%), which were often judged as anger.

The third possibility accounting for nonreciprocal
confusions is that specific expressions may have
been nondistinct or ambiguous. In resolving the
ambiguity, subjects may have responded with an
emotion sharing some components with the target
emotion. The expression of that alternate choice may
have also been ambiguous and shared components with
yet other expressions. In short, two ambiguous
expressions need not be resolved in a reciprocal
manner.
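
The reciprocity argument can be checked mechanically against the confusion rates in Table 2. The following is a minimal sketch, not the authors' procedure: the function name `is_reciprocal` and the 20% cutoff for a "major" confusion are assumptions made here for illustration, since the article does not state a numerical criterion. The rates are percentages taken from Table 2.

```python
# Sketch only: the 20% threshold and the function name are illustrative
# assumptions; the article does not define a numerical cutoff.

def is_reciprocal(rates, a, b, threshold=20.0):
    """Return True if emotion a is mistaken for b AND b for a
    at or above the threshold (rates are percentages)."""
    a_as_b = rates.get((a, b), 0.0)  # stimulus a, response b
    b_as_a = rates.get((b, a), 0.0)  # stimulus b, response a
    return a_as_b >= threshold and b_as_a >= threshold

# Keys are (stimulus, response); values are percentages from Table 2.
upper_spots_only = {("fear", "sadness"): 24.2, ("sadness", "fear"): 25.8}
lower_normal = {("anger", "disgust"): 20.8, ("disgust", "anger"): 28.3}
lower_spots_only = {("fear", "happiness"): 35.8, ("happiness", "fear"): 0.0}

print(is_reciprocal(upper_spots_only, "fear", "sadness"))    # both directions high
print(is_reciprocal(lower_normal, "anger", "disgust"))       # both directions high
print(is_reciprocal(lower_spots_only, "fear", "happiness"))  # one direction only
```

Under this assumed cutoff, the two reciprocal cases named in the text pass the check, whereas the fear/happiness confusion in spots-only lower face displays does not, because happiness was never taken for fear in that condition.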


Conclusion 


The present experiment demonstrates clearly that
movement of the surface of the face can serve as
information for the recognition of emotions. It is
not proposed, however, that such information is
generally necessary for the purpose. Much past
research using static stimuli has demonstrated that
facial features, with their specific shapes and
spatial arrangement, also provide very rich
information for the recognition of emotions. Thus, it
appears that under natural conditions of observation
the observer has access to a redundant system of
facial information.

The issue of redundancy is complex and is
manifested in a variety of ways in emotion
recognition. For example, although the present
research has emphasized the distinction between
feature-based static information and
transformational movement information, each of these
types of information is internally redundant.
Specifically, most emotions are represented in
specific ways in various areas of the face. It is
precisely this characteristic that has led some
researchers to ask about the relative importance of
different areas of the face in emotion recognition.
The findings yielded by such research, however, have
been complex and inconsistent. This has probably
been caused by the fact that facial expressions
contain many cues that serve as information for
recognition. By dividing the face into two or three
regions, researchers have presented observers with
selected subgroups of cues. Some more or less
distinctive cues have fallen into the various 
groups as different emotions were considered, 
leading to the finding that no facial region 
dominates across emotions. In order to avoid 
this haphazard approach to the analysis of 
dominance in facial information, it would 
appear desirable for future research to de- 
lineate the cues relevant to expressions of 
specific emotions and to scale their relative 
information value. 

Another issue relating to redundancy and 
more directly relevant to the distinction be- 
tween feature information and movement in- 
formation has to do with the nature of their 
cooperation in the specification of emotions. 
Specifically, is the observer in a better posi- 
tion to recognize emotions accurately on the 
basis of both kinds of information than he/ 
she would be by using the better of the two 
alone? This question is particularly important 
in the context of the two types of information 
discussed here, since they are perfectly cor- 
related, that is, cannot vary independently. 
Although an empirical answer to this question 
will have to await future research, it is likely 
that the advantage of redundant information 
manifests itself under poor conditions of ob- 
servation. For example, movement may render 
slight or brief expressions more detectable by 
highlighting the contrast between the appear- 
ance of the face at two points in time. In a 
reciprocal manner, features that tend to be 
points of high contrast on the face may serve 
as particularly effective “carriers” of motion, 
thus rendering movement information more 
detectable. 

Research aimed at verifying these specula- 
tions would have to take another issue into 
account. In the present research it was 
deemed appropriate to establish the usefulness 
of movement information under optimal con- 
ditions for the recognition of emotions. Thus, 
the posed “stereotypic” expressions of trained 
actors were used in the generation of the 
stimulus displays. When the concern shifts to 
the specific interaction of feature and move- 
ment under less than ideal conditions of ob- 
servation, the research should utilize more 


ecologically valid stimuli. 
Psychological Masculinity and Femininity and Typical and Maximal
Dominance Expression in Women 


Helen M. Klein and Lee Willerman 


University of Texas at Austin 


Four groups of women differing in psychological masculinity and femininity as 
measured by the Personal Attributes Questionnaire, participated in two lab- 
oratory situations designed to measure their typical and maximal dominance 
expression. Typical dominance refers to one’s average expression of the trait, 
whereas maximal dominance refers to one’s response capability regarding the 
trait. As predicted, undifferentiated and feminine women were significantly 
less dominant than masculine and androgynous women were in both laboratory 
situations and as measured by the California Psychological Inventory (CPI) 
Dominance scale. Traditional prescriptions for sex role behavior also affected 
women’s performance in the typical laboratory, where they were significantly 
less dominant with male confederates than with female confederates, but not 
in the maximal laboratory, where they were equally dominant with male and 
female confederates. However, women were more talkative with male confed- 
erates rather than female confederates in both laboratory situations. Multiple 
regression analysis revealed that the CPI Dominance scale was the most 
effective predictor of laboratory behavior. 


Traditional conceptualizations of masculinity and
femininity have become the target of much criticism.
Many contemporary psychologists have rejected the
unidimensional bipolar model in favor of a dualistic
approach, that masculinity and femininity are
independent dimensions and coexist to some degree in
everyone (Bem, 1975; Block, 1973; Constantinople,
1973; Spence, Helmreich, & Stapp, 1974).

One major source of dispute in this literature is
the diversity of definitions of “sex roles.” Under
this heading, investigators have included all of the
behaviors, attitudes, preferences, and internal
dispositions differentiating men from women. This
state of affairs can be attributed in part to the
widespread assumption that “biological gender,
masculine and feminine sex-role behaviors, and the
psychological attributes of masculinity and
femininity are tightly intercorrelated” (Spence &
Helmreich, 1978, p. 10).

If these qualities are only weakly related,
however, it becomes imperative to distinguish
between masculine and feminine sex roles (defined as
overt behaviors deemed appropriate for members of
each sex in certain situations) and internal
dimensions of masculinity and femininity (defined as
agentic and communal traits, Bakan, 1966).


The authors thank Robert L. Helmreich, Janet T.
Spence, and Robert G. Turner for advice and
assistance in this study.
Requests for reprints should be sent to Lee
Willerman, Department of Psychology, University of
Texas at Austin, Austin, Texas 78712.


A study by Megargee (1969) on dominance 
expression illustrates this distinction nicely. 
When high dominant women (as defined by 
the California Psychological Inventory or 
CPI) were paired with low dominant men on 
a leader-follower task, the women did not 
often take on the role of leader. These 
women, however, actively chose the leader 
69% of the time, and when they did, they 
selected the men 90% of the time. In this 
way the women asserted their dominance 
without violating traditional sex role stan- 
dards. 

What if traditional sex role prescriptions 
were relaxed? Would correlations between
psychological dimensions of masculinity or 
femininity and behavior become more or less 
pronounced? One way to do this is by requir- 
ing subjects to perform under maximal con- 
ditions (Fiske & Butler, 1963; Wallace, 
1966; Willerman, Turner, & Peterson, 1976). 

These authors have distinguished between 
typical and maximal performance. The former 
refers to one’s usual or average behavior, and 
the latter refers to one’s maximum expression 
of the trait-linked behavior under conditions 
designed to elicit it to the greatest degree. 
Traditional laboratory situations, based on a 
typical performance format, often produce 
idiosyncratic differences in levels of motiva- 
tion as well as differences in the interpreta- 
tions of the task at hand. Laboratory situa- 
tions based on a maximal performance format, 
however, explicitly encourage the individual 
to focus on one specified extreme of a target 
trait to determine what he or she can do. 
This ought to lead to greater homogeneity 
in motivational level and clarity about the 
demands of the situation. In this context, 
socially undesirable or inappropriate behav- 
iors can also be made more socially desirable. 
Consequently, sex role stereotypes might be 
diminished. 

Finally, in line with a view of personality 
that emphasizes response capability, it may 
be important for individuals to generate re- 
sponses when describing themselves, rather 
than simply endorsing items presented to 
them. Turner and Gilliland (in press) found 
that self-generated trait descriptions signif- 
icantly predicted both typical and maximal 
laboratory dominance behavior in college 
men and women. Those individuals describing 
themselves as dominant expressed the most 
dominance and those describing themselves 
as reserved the least, whereas those not using 
adjectives on the dominance dimension fell 
in between. No information was available on 
the psychological masculinity or femininity 
of the subjects. Since it is now clear that both 
dimensions can coexist within the same indi-
vidual (e.g., Spence et al., 1974), it would
seem appropriate to examine whether these 
traits interact with typical and maximal 


situations. 
The Present Study


This study examined the effects of psychological
masculinity and femininity and sex role demands on
the typical and maximal expression of dominance in
women. Self-generated trait descriptions and
self-reports of dominance were obtained from four
groups of women: undifferentiated (low masculine,
low feminine), feminine (low masculine, high
feminine), masculine (high masculine, low feminine),
and androgynous (high masculine, high feminine).
These women then participated in two laboratory
paradigms designed to assess their typical and
maximal dominance behavior. Within these paradigms,
the women were exposed to either male or female
confederates. From the studies by Bem (1975) and
Bem, Martyna, and Watson (1976), it was expected
that feminine and undifferentiated women would have
lower scores on a self-report dominance measure and
would express less dominance in the typical
laboratory situation than would masculine and
androgynous women. Following Megargee’s (1969)
findings, it was predicted that women would express
less dominance with male confederates than with
female confederates in the typical laboratory
situation. From Turner and Gilliland’s (in press)
study, it was expected that self-descriptive trait
lists would predict dominance expression in both
laboratory situations. It was expected that women
who listed adjectives on the dominance dimension as
self-descriptors (e.g., dominant, reserved) would be
more extreme in dominance expressions than women who
did not do so.

Although no specific predictions were made for the
maximal laboratory situation, several general
questions were proposed for investigation: How would
feminine and undifferentiated women, compared to
androgynous and masculine women, respond when
dominance is made explicitly salient and desirable
(maximal laboratory)? Would women respond
differently to male and female confederates in the
maximal laboratory? Would they express more
dominance with the female confederates, as
hypothesized for the typical laboratory?
Method


Design and Overview 


The basic design was a 4 × 2 × 2 factorial with
two between-subjects variables and one within-sub-
jects variable. One hundred and twelve women were
recruited as representatives of four combinations of
psychological masculinity and femininity based on
their scores on the short form of the Personal At-
tributes Questionnaire (PAQ; Spence & Helmreich,
1978). The women completed a variety of self-report
measures and then each participated with two pairs
of same-sex confederates in two group problem-solv-
ing situations designed to assess her typical and max-
imal dominance expression. Half of the women in
each of the four groups were exposed only to male
confederates and half only to female confederates.


Subjects 


Several weeks prior to the beginning of the experi-
ment proper, the short form of the PAQ was admin-
istered to 635 female introductory psychology stu-
dents at the University of Texas at Austin. Using
norms established by Spence and Helmreich (1978),
the women were classified into four groups based on
median splits on their PAQ masculinity (M) and
femininity (F) scores: undifferentiated (low M, low
F), feminine (low M, high F), masculine (high M,
low F), and androgynous (high M, high F). The
total pretesting sample was comparable to Spence and
Helmreich’s sample in mean PAQ scores, percentage
of students falling into each M and F group, and
relationship among the PAQ scales.
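
The median-split classification described above can be sketched as follows. This is an illustrative sketch only: the median values in the example and the rule that a score must exceed the median to count as "high" are assumptions made here, because the article does not report the normative medians or how scores equal to the median were handled.

```python
# Sketch of a four-way median-split classification on PAQ M and F scores.
# The cutoffs and the tie rule (score must exceed the median to be "high")
# are hypothetical; the article does not report them.

def classify_paq(m_score, f_score, m_median, f_median):
    """Assign one of the four M and F groups from two scale scores."""
    high_m = m_score > m_median
    high_f = f_score > f_median
    if high_m and high_f:
        return "androgynous"
    if high_m:
        return "masculine"
    if high_f:
        return "feminine"
    return "undifferentiated"

# Hypothetical medians, for illustration only.
M_MEDIAN, F_MEDIAN = 20, 23

print(classify_paq(25, 20, M_MEDIAN, F_MEDIAN))  # masculine
print(classify_paq(18, 27, M_MEDIAN, F_MEDIAN))  # feminine
print(classify_paq(26, 28, M_MEDIAN, F_MEDIAN))  # androgynous
print(classify_paq(15, 19, M_MEDIAN, F_MEDIAN))  # undifferentiated
```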

A total of 123 women were recruited for the pres-
ent study from eligibility lists for the four M and F
groups. Eleven women were dropped from the ex-
periment: 3 because of suspicion, 3 because of equip-
ment failure, and 5 because they failed to appear for
their appointments. Half of the 28 women in each
M and F group were randomly assigned to one of
the two sex-of-confederates conditions. The experi-
menter and the confederates were blind with respect
to the women’s classifications. Each subject received
course credit for participation in the experiment.


Measures 


Personal Attributes Questionnaire. The PAQ is a
measure of psychological masculinity and femininity
developed by Spence et al. (1974). The short version
of the PAQ (Spence & Helmreich, 1978) consists of
24 bipolar items belonging to three scales: masculine
(M), feminine (F), and masculine-feminine (M-F).
Correlations between the scales of the short version
and the original 55-item PAQ are .93 (M), .93 (F),
and .91 (M-F) for college students. For each item,
the women were asked to rate themselves on a
5-point scale, with the endpoints having verbal labels.
Items were scored from 0 to 4, with M and M-F
items keyed in a “masculine” direction and F items
keyed in a “feminine” direction. The 8 item scores
for each scale were summed to give M, F, and M-F 
scores for each person. Some examples of PAQ items 
are “not at all independent — very independent” (M), 
“not at all emotional—very emotional” (F), and 
“very submissive — very dominant” (M-F). 

Dominance scale of the California Psychological In- 
ventory. The CPI Dominance scale (Gough, 1956) 
consists of 46 items pertaining to dominance in a 
“true-false” format, such as “I doubt whether I 
would make a good leader” (false) and “I think I 
would enjoy having authority over other people” 
(true). The total score for each person was obtained 
by summing the number of items responded to in the 
significant direction. 

Person Description Form. This form (adapted 
from Turner & Gilliland, in press) is an idiographic 
assessment technique calling for the generation of a 
series of self-descriptive adjectives. Instructions are
to “list the adjectives or traits that best describe you 
and your personality.” 

Following the procedure used by Turner and Gilli- 
land, the women were classified into four groups via 
their responses, based on the taxonomy developed by 
Layman and McDonald (The Layman-McDonald 
Trait Taxonomy: Round II; cited in Turner & 
Gilliland, in press). The fourth group was added by 
the present investigators for exploratory reasons. Dis- 
agreements on categorization were resolved by dis- 
cussion. 

1. Dominant group—women listing traits catego- 
rized as autocratic/domineering (Layman-McDonald 
Dimension #7) or self-assertive (Dimension #20). 

2. Reserved group—women listing traits on the
opposite poles of these dimensions (e.g., democratic,
reticent).

3. Trait irrelevant group—women not listing traits
on these dimensions.

4. Mixed group—women listing traits on both the
dominance and reserved dimensions.
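
The four-way grouping above amounts to a simple decision rule on whether a woman's list contains dominant-pole traits, reserved-pole traits, both, or neither. A minimal sketch follows, assuming that each adjective has already been coded against the taxonomy. The two word sets below are small hypothetical stand-ins built from the examples in the text; they are not the Layman-McDonald taxonomy itself, and the function name is invented for illustration.

```python
# Sketch of the Person Description Form grouping rule.
# DOMINANT_TRAITS and RESERVED_TRAITS are hypothetical stand-ins for the
# taxonomy's dominance-related dimensions, not the actual trait lists.

DOMINANT_TRAITS = {"dominant", "domineering", "autocratic", "self-assertive"}
RESERVED_TRAITS = {"reserved", "democratic", "reticent"}

def classify_self_description(adjectives):
    """Return the group for one subject's list of self-descriptive traits."""
    listed = {word.strip().lower() for word in adjectives}
    has_dominant = bool(listed & DOMINANT_TRAITS)
    has_reserved = bool(listed & RESERVED_TRAITS)
    if has_dominant and has_reserved:
        return "mixed"
    if has_dominant:
        return "dominant"
    if has_reserved:
        return "reserved"
    return "trait irrelevant"

print(classify_self_description(["friendly", "Self-assertive"]))  # dominant
print(classify_self_description(["reticent", "curious"]))         # reserved
print(classify_self_description(["dominant", "democratic"]))      # mixed
print(classify_self_description(["cheerful", "organized"]))       # trait irrelevant
```

In the study itself, categorization was done by judges and disagreements were resolved by discussion, so a lookup of this kind only approximates the procedure.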


Laboratory Measures 


Two methods were used to assess dominance be- 
havior in the laboratory (Turner, 1978). 

Dominance rating. The contents of the subject’s 
verbalizations (i.e., forceful, assertive statements)
were independently rated by the two confederates as 
soon as she left the room. These ratings were made 
according to the following rationally constructed 
scale (adapted from Turner, 1978): 

1. No dominance statements—totally passive. 

2. Statements of a general nature (comments, state- 
ments of fact, clarifying statements); a few very 
weak opinions voiced with little conviction of in- 
volvement. 

3. A few opinions offered; no disagreements with 
others expressed. May agree with others’ opinions. 

4. Opinions offered but no attempt to lead group 
(or very weak one); disagreements subtle and non- 
direct. Might offer suggestions about organizing. 
5. Opinions offered with attempts to get some
agreement from others; somewhat persuasive efforts 
toward other members. Statements solicitous to sup- 
port, such as “Don’t you think... . ?” An organizer 
—democratic—asks what others think (How do you 
feel? How about ... ? What do you think?). 

6. Strong attempts to persuade others and open 
disagreement with others. Emphatic statements, such 
as “I think. . . .” Person may take a while to warm 
up but takes over eventually. Assertive about opin- 
ions; leads; not too concerned about what others 
think. 

7. Complete usurpation of leadership of group with 
insistence upon influencing conclusions of group. 
Usurpation of leadership from the beginning. Strong 
opinions; not democratic; very persuasive; strong 
disagreements. 

The two confederates’ ratings were summed to- 
gether to yield a composite dominance rating for the 
typical laboratory and one for the maximal labora- 
tory for each subject. Intraclass interrater reliabil- 
ities were .96 for the typical laboratory and .95 for 
the maximal laboratory. 

Prior to the experiment proper, all confederates 
(five men and five women) participated in a series 
of laboratory training sessions to familiarize them-
selves with their roles and with the dominance rating
scale. They practiced in different-sex as well as same-
sex pairs with 20 pilot subjects. Tapes of the sessions
were reviewed and discussed to promote consistency 
in their ratings. 

The dominance rating sheet also included a 5-point 
scale for facial attractiveness, ranging from (1) very 
low to (5) very high, on which each subject was 
independently rated by the confederates. The con-
federates used an idealized Vogue model as a standard
for these ratings. The summed ratings of the four
confederates (two from each laboratory) served as a
facial attractiveness score for each subject. Although
the agreement among the raters on this measure was
rather low (intraclass reliability = .47), the reliability
of the average of the four raters was substantially
higher (.78), suggesting moderate reliability for the
overall attractiveness rating of each subject. 
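
The two reliability figures reported for facial attractiveness are consistent with the Spearman-Brown formula, which projects the reliability of an average of k raters from the single-rater reliability r: r_k = k r / (1 + (k - 1) r). The article does not say which formula was used, so the sketch below is an assumption offered only as a check on the arithmetic.

```python
# Sketch: Spearman-Brown projection of single-rater reliability to the
# reliability of the mean of k raters. Using this formula here is an
# assumption; the article does not name its method.

def spearman_brown(single_rater_reliability, k):
    """Reliability of the average of k raters."""
    r = single_rater_reliability
    return (k * r) / (1 + (k - 1) * r)

# Single-rater intraclass reliability of .47, averaged over four raters.
projected = spearman_brown(0.47, 4)
print(round(projected, 2))  # 0.78
```

With r = .47 and k = 4 the projection is 1.88 / 2.41, or about .78, which matches the value reported for the average of the four raters.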

Talking time. The second method of assessing typ- 
ical and maximal dominance in the laboratory was to 
measure the amount of verbalization, that is, the ex- 
tent to which the subject monopolized the conversa- 
tion. This was obtained from tape recordings of the 
laboratory sessions. Two confederates independently
reviewed the tapes to determine percentage of talking
time during the problem-solving period in the typical
laboratory and in the maximal laboratory for each
subject. Confederates’ times were within 12 sec of
each other for all of the tapes. Since it was expected
that the groups would differ in the time needed to
make final decisions, the problem-solving period was
defined as “the time between the exit of the experi-
menter and the group’s agreement upon the last of
the four choices required by both the typical and
maximal laboratory situations” (Turner, 1978, p.
124). Talking times were unavailable for five subjects
in the typical laboratory because of equipment fail-
ure. The correlation between dominance rating and
talking time in the typical laboratory was .58; the
corresponding correlation in the maximal laboratory
was .56. These relationships were consistent for sub-
jects with male and female confederates. Turner
(1978) combined talking time and dominance rating 
to form composite measures of typical and maximal 
laboratory dominance. This was not done in the 
present study because of the different effect of sex of 
confederates on these measures, reported in the Re- 
sults section. 

Evaluation of Group Members Form. This mea- 
sure consists of a series of 5-point scales on which 
each subject was asked to rate herself and the two 
other group members (confederates) on leadership 
ability and passiveness/dominance. The subject com- 
pleted one of these forms after each laboratory con- 
dition, thus providing self-ratings of typical and 
maximal laboratory dominance behavior. For the 
typical laboratory, self-rating correlated .55 with 
confederates’ dominance rating and .52 with talking 
time. Corresponding correlations for the maximal 
laboratory were .31 and .37. 


Procedure 


The women signed up for one of four available 
sessions to complete the PAQ, the Person Description 
Form (PDF), and the CPI Dominance scale. 

After completing these tasks, the women signed up
for individual laboratory sessions, held 1 to 4 weeks
later.¹

Ten senior psychology students (five men and five
women) rotated through the experimenter and
confederates roles in the typical and maximal
laboratory situations.² Subjects were exposed to
either a female experimenter and four female
confederates or a male experimenter and four male
confederates during the laboratory session.

¹ Two subjects went to the laboratory session
within 3 days of the paper-and-pencil session.

² The authors gratefully acknowledge the laboratory
assistance of Lisa Fowler, Bill …, Charlene Henslee,
Steve Bates, Teresa Matthews, … Dickey, Nancy
Feinberg, Adele Goldstein, Paul …, and Ray Mechler.

The subject was greeted by the experimenter, who
explained that the other students participating in
the session were already present. The other students
were two confederates. One confederate was waiting
in the lounge while the other was in the laboratory
in order to surreptitiously turn on a concealed tape
recorder. The experimenter entered the laboratory
with the subject and the confederate, introduced
each person by his or her first name and had them
sit around a table. They were then read the
following:


This research is concerned with problem solving.
The kinds of problems that we are interested in
are often abstract and seem to have no clear-cut
answer. However, we are interested in what people’s
conclusions concerning these problems are and how
they arrived at them. I will read the problem to
you and then give you 10 minutes to come up with
your conclusions as a group. You may have seen
problems similar to this before.

The school district in a small community is in
financial difficulty. The school board must dismiss
eight faculty members in addition to other cutbacks
in order to have enough money to complete the
school year. A decision has already been made that
the eight faculty members would be specialists in
various areas, not teachers of the basic subject
areas. Which four of those listed below would you
choose to remain?

1. Reading specialist

2. Counselor

3. Assistant football coach

4. Special education teacher in math

5. Assistant band director

6. Drama teacher

7. Diagnostician of learning disabilities

8. Teacher of accelerated English classes

9. Teacher of accelerated math classes

10. Baseball coach

11. Work-study supervisor

12. Art teacher


The experimenter then placed a description of the
problem in the middle of the table along with a
pencil and a group conclusion form, on which someone
was to write down the four persons of the group’s
choice. The experimenter reminded the group of the
10-minute time limit, pointed out the clock on the
table, and left the room.

The confederates’ behavior was primarily composed
of presenting both pro and con arguments as to
which specialists should be dismissed. Although they
interacted freely in the discussion, they were
democratic in their approach and only moderately
assertive. Within this general pattern, three other
procedures were followed:

1. At initiation points (i.e., when the experimenter
left and returned) the confederates gave the subject
… seconds to begin the interaction, if she chose to.

2. Approximately halfway through the session, one
of the confederates began to discourse on a
tangential issue. This lasted for 1 minute unless it
was terminated by the subject.

3. At some point during the 10 minutes, one of the
confederates openly disagreed with one of the
subject’s statements to determine her reaction to
this challenge.

After 10 minutes the experimenter returned, and if
the group had not yet finished the task, they were
provided with an additional minute. All of the
groups finished within 12 minutes’ total time. The
experimenter talked briefly about the problem and
collected the group conclusion form. The
experimenter then told the group that it was
necessary to get their individual reactions to the
group process and asked the subject
to come to an adjacent room, saying that he (she)
would return to get the others’ reactions after talking 
to the subject. The subject then completed an evalua-
tion of group members form. During this time, the
confederates independently rated the subject on dom- 
inance and facial attractiveness. 

The subject was then read the following statement: 


For the second part of this research, you will be 
with another group of people solving a problem 
similar to the one you worked on here. In this 
group, I want you to try to be the leader. I want 
you to be as dominant and assertive as you can be.
This is an ability test. We will tape-record this 
session so that we can see how well you do. Re- 
member, this is a dominance ability test. You are 
to see that your group comes up with satisfactory 
answers to the problem.


The experimenter then stated that the other stu- 
dents (confederates) in this part of the experiment 
were ready to begin and led the subject to the second 
laboratory while stressing the ability nature of the 
assessment.

The maximal laboratory situation was similar to 
the typical situation. Again, one confederate was 
waiting in the lounge while the other was in the 
laboratory to turn on the tape recorder. After follow- 
ing the same introductory procedure as enacted in 
the typical situation, the experimenter read the fol- 
lowing: 


This research is concerned with problem solving.
The kinds of problems that we are interested in 
are often abstract and seem to have no clear-cut 
answer. However, we are interested in what peo- 
ple’s conclusions concerning these problems are and 
how they arrived at them. I will read the problem 
to you and then give you 10 minutes to come up 
with your conclusions as a group. You may have 
seen problems similar to this before. 

Imagine that our country is under threat of im- 
minent nuclear attack. A man approaches you and 
asks you to make an independent decision: There 
is a fallout shelter nearby that can accommodate 
4 people but there are 12 people vying to get in.
Which 4 do you choose to go into the shelter?
Here is all the information we have about the 12
people:

1. A 40-year-old male violinist who is a suspected
narcotics pusher.

2. A 34-year-old male architect who is thought to
be a homosexual.

3. A 26-year-old lawyer.

4. The lawyer's 24-year-old wife who has just
gotten out of a mental institution.

5. A 75-year-old priest.

6. A 34-year-old retired prostitute who was so
successful that she’s been living off her an-
nuities for 5 years.

7. A 20-year-old black militant.

8. A 23-year-old female graduate student who
speaks publicly on the virtues of chastity.

9. A 28-year-old male who will only come into
the shelter if he can bring his gun with him.

10. A 12-year-old girl who has a low IQ.

11. A 30-year-old MD who is an avowed bigot.

12. A high school student.

The experimenter followed the same procedure as 
before and left the room. The confederates’ behavior 
during the problem-solving period was the same as 
that of the confederates in the typical laboratory. 

After 10 minutes, the experimenter returned and 
the same procedure as outlined above was followed, 
with the subject filling out another evaluation of 
group members form. The subject was then debriefed 
and was asked not to discuss the experiment with 
anyone until the study was completed. 

The school board and bomb shelter paradigms were 
counterbalanced for the typical and maximal labora- 
tory conditions. Since preliminary analyses did not 
reveal a significant order effect or a significant dif- 
ference between the two paradigms across laboratory 
conditions, it was assumed that the two paradigms 
were equivalent. 
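
A minimal sketch of one way such counterbalancing could be set up. The alternating scheme, the function name, and the subject identifiers are illustrative assumptions; the article does not say how subjects were assigned to the two orders.

```python
from itertools import cycle

# The two problem-solving paradigms named in the text.
PARADIGMS = ("school board", "bomb shelter")


def counterbalance(subject_ids):
    """Alternate which paradigm is paired with each laboratory condition.

    Assumption: simple alternation across subjects. The article does not
    describe the actual assignment scheme.
    """
    plan = {}
    for subject_id, first in zip(subject_ids, cycle(PARADIGMS)):
        second = PARADIGMS[1] if first == PARADIGMS[0] else PARADIGMS[0]
        # The typical laboratory always precedes the maximal laboratory.
        plan[subject_id] = {"typical": first, "maximal": second}
    return plan


if __name__ == "__main__":
    for subject_id, order in counterbalance(range(1, 5)).items():
        print(subject_id, order)
```

With an even number of subjects, each paradigm appears equally often in each laboratory condition, which is what allows an order effect and a paradigm effect to be tested separately.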


Results 


As expected, mean scores on the CPI Dominance
scale differed significantly by M and
F group, F(3, 108) = 6.41, p < .001.
Planned comparisons³ revealed that, as predicted,
the means of the undifferentiated and
feminine women (27.3 and 26.9) were significantly
lower than the means of the masculine
and androgynous women (32.1 and 32.7),
F(1, 108) = 19.04, p < .01. This difference
is entirely due to the correlation of .49 (p <
.01) between the PAQ M scale and the CPI
Dominance scale; the F scale has no relationship
to CPI dominance (r = .00).

Table 1 presents the means and standard 
deviations of the laboratory measures by M 
and F group and sex of confederates. Results
were analyzed by the three-way analysis of
variance (M and F Group × Sex of Confederates
× Laboratory Condition) presented
in Table 2. The difference between typical
and maximal measures was highly significant
for dominance rating (p < .001) and talking
time (p < .001), showing that the maximal
instructions were highly effective in eliciting 
more dominance behavior in the second lab- 
oratory situation. 

As anticipated, M and F group had a significant
effect on dominance rating (p <
.002) and talking time (p < .01). For dominance
rating, planned comparisons showed
a significant difference between the means
of the undifferentiated and feminine women
and those of the masculine and androgynous
women for the typical laboratory, F(1, 208)
= 9.82, p < .01, and for the maximal laboratory,
F(1, 208) = 7.75, p < .01.

For talking time, results of planned comparisons
were parallel: significant for the
typical laboratory, F(1, 198) = 9.86, p <
.01, and marginally significant for the maximal
laboratory, F(1, 198) = 3.47, p < 0.

Closer examination of Table 1 showed that 
the means of the four M and F groups tended 
to follow a consistent pattern: masculine, 
androgynous, undifferentiated, and feminine, 
with the masculine women having the highest 
dominance ratings and the feminine women 
the lowest.


Sex Role Behavior 


The effect of sex of confederates on women’s
dominance behavior is complex. The prediction
that women would express less dominance
with male confederates than with female
confederates in the typical laboratory
was supported for dominance rating, but
contradictory results were obtained for talking
time.

Although for dominance rating the main
effect for sex of confederates was nonsignificant
(Table 2), the interaction between sex
of confederates and laboratory was significant
(p < .03). Planned comparisons showed that
for the typical laboratory, the means of women
with male and female confederates were
significantly different, F(1, 208) = 4.38, p <
.05, whereas for the maximal laboratory the
means were practically identical, F(1, 208)
= .02, ns. Thus, for dominance rating, women’s
behavior was not consistent: They were significantly
less dominant with male confederates
in the typical laboratory,
but equally dominant with male and female
confederates in the maximal laboratory.


3 Planned comparisons among cell means were performed
according to procedures described by […]
(1973; pp. 445-446).


Table 1
Means and Standard Deviations of the Laboratory Measures by M and F Group,
Sex of Confederates, and Combined

                              Laboratory dominance rating    Laboratory talking time (%)
                              Typical        Maximal         Typical        Maximal
M and F group       nᵃ        M      SD      M      SD       M      SD      M      SD

Male confederates
  Undifferentiated  14 (14)   6.79   2.81    9.50   2.21     21.0   14.0    36.5   13.4
  Feminine          14 (13)   6.14   2.77    9.43   1.34     21.9   10.7    33.9   12.7
  Masculine         14 (13)   9.00   2.35   10.86   1.29     28.5   14.1    41.3   11.6
  Androgynous       14 (14)   7.43   2.31    9.79   1.48     25.4   14.6    37.7   13.1
  Combined          56 (54)   7.34   2.69    9.89   1.67     24.2   13.3    37.4   12.5
Female confederates
  Undifferentiated  14 (13)   7.93   2.95    9.93   2.09     16.4    7.8    29.4   10.5
  Feminine          14 (14)   7.57    …      8.14   2.03     13.4    7.4    27.5   15.5
  Masculine         14 (12)   9.29   1.82   10.71   2.02     27.2   11.0    36.4   10.0
  Androgynous       14 (14)   8.29   3.38   10.57   2.53     21.3   11.4    31.7   13.0
  Combined          56 (53)   8.27   2.81    9.84   2.34     19.4   10.5    31.2   12.4
Combined
  Undifferentiated  28 (27)   7.36   2.83    9.71   2.08     18.8   11.3    33.0   12.1
  Feminine          28 (27)   6.86   2.85    8.79   1.78     17.5    9.7    30.7   14.0
  Masculine         28 (25)   9.14   2.03   10.79   1.63     27.9   12.2    38.8   10.7
  Androgynous       28 (28)   7.86   2.82   10.18   2.04     23.3   12.8    34.7   12.9
  Total            112 (107)  7.80   2.80    9.87   2.04     21.8   12.3    34.3   12.9

Note. Means are derived from the summed scores of two raters. M = masculinity. F = femininity.
ᵃ ns in parentheses are for talking time in the typical laboratory.


Findings for talking time were the reverse
of what was expected. Women talked more
with male confederates than with female confederates
(p < .01). Planned comparisons
were marginally significant for the typical
laboratory, F(1, 198) = 3.81, p < .06, and
significant for the maximal laboratory, F(1,
198) = 8.19, p < .01.


Table 2
Analysis of Variance Summary for M and F Group × Sex of Confederates × Laboratory
Condition for the Laboratory Measures

                              Dominance rating        Talking time
Source              df        MS         F            MS           F

A                    3        45.28      5.47**       796.56       3.67*
B                    1        10.72      1.30       1,541.44       7.09**
C                    1       238.22     86.87**     8,125.32     100.57**
A × B                3         2.50       .30          30.63        .14
A × C                3         1.62       .59          50.69        .63
B × C                1        13.50      4.92*         30.64        .38
A × B × C            3         4.91      1.79          28.17        .35
Error (between)    104         8.27                      …
Error (within)     104         2.74                      …

Note. For talking time, df = 99. M = masculinity. F = femininity. A = M and F group. B = sex of
confederates. C = laboratory condition.
* p < .05. ** p < .001.
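
As an arithmetic check on Table 2, each F ratio in a mixed design of this kind is the effect mean square divided by the matching error mean square: between-subjects effects (A, B, A × B) are tested against the between-subjects error, and every effect involving the repeated factor C against the within-subjects error. The sketch below uses only the legibly printed dominance-rating entries. The function and variable names are illustrative, and small departures from the printed F values are expected because the mean squares are rounded.

```python
# Error mean squares for dominance rating, as printed in Table 2.
MS_ERROR_BETWEEN = 8.27  # tests A, B, and A x B
MS_ERROR_WITHIN = 2.74   # tests C and every interaction involving C

# (source, effect mean square, error mean square) for dominance rating.
EFFECTS = [
    ("B (sex of confederates)", 10.72, MS_ERROR_BETWEEN),
    ("A x B", 2.50, MS_ERROR_BETWEEN),
    ("C (laboratory condition)", 238.22, MS_ERROR_WITHIN),
    ("A x C", 1.62, MS_ERROR_WITHIN),
    ("B x C", 13.50, MS_ERROR_WITHIN),
    ("A x B x C", 4.91, MS_ERROR_WITHIN),
]


def f_ratio(ms_effect, ms_error):
    """Return the F ratio for one effect: MS(effect) / MS(error)."""
    return ms_effect / ms_error


if __name__ == "__main__":
    for source, ms_effect, ms_error in EFFECTS:
        print(f"{source:26s} F = {f_ratio(ms_effect, ms_error):.2f}")
```

The same division explains why the Sex of Confederates × Laboratory interaction can be significant while the main effect of sex of confederates is not: the interaction is tested against the much smaller within-subjects error term.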


Self-Rating of Laboratory Behavior 


Unlike findings for dominance rating and 
talking time, self-ratings of the M and F 
groups do not follow a consistent pattern. The 
main effects for M and F group and sex of 
confederates were not significant, F(3, 104) 
= 1.49, ns, and F(1, 104) = 1.39, ns, respectively.
Overall, the women did not perceive
themselves as less dominant with male than 
female confederates. 


Facial Attractiveness 


For the facial attractiveness measure, the
main effect for sex of confederates was nonsignificant,
F(1, 103) = 1.63, ns, indicating
that the male and female confederates perceived
the women in their respective groups
as equally attractive. Differences among the
M and F groups were unremarkable, and the
effect of M and F group on facial attractiveness
was nonsignificant, F(3, 103) = .21, ns.
Facial attractiveness, however, correlated significantly
with both typical and maximal
talking time (r = .32; r = .21) and typical
and maximal dominance ratings (r = .30;
r = .22); sex of confederates did not affect
these correlations.


Self-Generated Trait Descriptions 


As expected, trait group significantly predicted
women’s self-reports on the CPI Dominance
scale, F(3, 108) = 10.96, p < .001,
and their behavior in the laboratory situations:
dominance rating, F(3, 104) = 5.67,
p < .001; talking time, F(3, 99) = 5.86, p <
.001. For all of these measures, the means of
the trait-irrelevant group fell between those
of the dominant and reserved groups. The
mixed group behaved similarly to the trait-irrelevant
group in the typical laboratory and
the reserved group in the maximal laboratory.
Results for self-rating of laboratory behavior
were consistent with those for dominance
rating and talking time. Trait group had a
significant effect on this measure, F(3, 104)
= 4.34, p < .007. For both laboratory situations,
women in the dominant group perceived
themselves as the most dominant, followed by
women in the trait-irrelevant, mixed, and
reserved groups, respectively.

Trait group had a borderline significant ef- 
fect on the facial attractiveness measure, F(3, 
107) = 4.36, p < .06. The facial attractive- 
ness of the trait-irrelevant and dominant 
groups were similar and both were signif- 
icantly greater than that of the reserved 
group. Post hoc comparisons showed that the
differences between these groups were significant:
trait irrelevant versus reserved, F(1,
107) = 6.41, p < .05; dominant versus reserved,
F(1, 107) = 11.54, p < .01.


Multiple Regression Analysis 


To get an idea of the relative contributions
of the independent variables to the laboratory
dominance scores, five predictors were entered
into a multiple regression in a hierarchical
sequence: CPI Dominance, M and F group
(coded as three dummy variables), trait
group classification (also coded as three dummies),
sex of confederates, and facial attractiveness.
The outcomes to be predicted were
typical and maximal laboratory dominance
and talking times.

CPI Dominance was entered first to see if
M and F group contributed unique variance
to the laboratory criteria beyond that predicted
by CPI Dominance and because the
laboratory outcomes could reasonably be regarded
as validity criteria for the CPI scale
itself.

Results of the multiple regression analysis
are shown in Table 3. It is apparent that CPI
Dominance accounts for the greatest amount
of variance in the laboratory outcomes;
in only one instance (maximal dominance
rating) did the M and F classification significantly
contribute to variance beyond that
contributed by the CPI scale. Facial attractiveness
and sex of confederates made
significant contributions to both measures in
the typical laboratory, and sex of confederates
also contributed significantly to talking time
in the maximal laboratory.


Table 3
Proportion of Variance (R²) in Laboratory Dominance Behavior Accounted for by the Predictors

                           Dominance rating       Talking time
Predictor                  Typical    Maximal     Typical    Maximal

CPI Dominance scale        .193*      .103*       .200*      .162*
M and F group (PAQ)        .040       .075*       .013       .020
Trait group (PDF)          .037       .033        .049       .017
Sex of confederates        .033*      .000        .026*      .044*
Facial attractiveness      .042*      .011        .051*      .003
Total variance
  accounted for            .345       .222        .339       .246

Note. Significance levels take into account the degrees of freedom for the predictors. CPI = California
Psychological Inventory. PAQ = Personal Attributes Questionnaire. PDF = Person Description Form.
* p < .05.


Although the trait group classification provided
no additional information in the multiple
regression, it should be noted that the
correlation between CPI Dominance and the
dominant trait group dummy was r = .42
(p < .01), with the reserved group, r =
-.39 (p < .10), and with the trait-irrelevant
group, r = .00. This is important because it
suggests a moderate amount of multicollinearity
for two of the dummy variables, which reduces
the potential predictiveness of trait
group classification in this analysis.

In another hierarchical analysis, M and F
classification was entered first and CPI Dominance
entered second. In every instance, M
and F group accounted for a significant proportion
of the variance (ranging from 5% to
[…], average = 8.6%). However, the contribution
of CPI Dominance was also always
significant (range from 5% to 14%, average
[…]).

The results of these analyses indicate that
the fourfold M and F classification did not
generally provide additional information beyond
that given by CPI Dominance scores.
This interpretation cannot be readily generalized,
however, since the outcome criteria
in this particular study could be regarded as
exemplars of the CPI scale itself. In studies
of other behavior outcomes, the pattern of
results might be different.

Despite these findings from the multiple
regression, useful information on women’s
dominance expression would not have come
to light without the perspective of psychological
masculinity and femininity, which conceptually
guided this study. The consistent
pattern observed for the four M and F groups
in both laboratory situations for both dependent
variables (masculine > androgynous
> undifferentiated > feminine) could not be
explained solely by the relationship between
CPI Dominance and the PAQ M and F
scales. For example, since the CPI scale is
significantly related to the M scale (r = .49,
p < .01) and unrelated to the F scale (r =
.00), why did the masculine women consistently
express more dominance than the
androgynous women when the latter actually
had slightly higher CPI Dominance scores
(32.7 vs. 32.1) and M scores (24.1 vs. 23.6)?
With male confederates, masculine women obtained
significantly higher dominance ratings
than the androgynous women in the maximal
situation (p < .01) and marginally significantly
higher ratings in the typical situation
(p < .10). Apparently, psychological “type”
had an effect on women’s dominance behavior
beyond that predicted from CPI Dominance
and PAQ scale scores.
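
To make the hierarchical entries in Table 3 easier to compare, the sketch below expresses each predictor's R² increment as a share of the total variance accounted for in that outcome. The data layout and function name are illustrative assumptions; the figures are transcribed from Table 3, and shares computed this way need not coincide with the percentages quoted in the text, which may rest on a different calculation.

```python
# Predictors in their hierarchical order of entry (Table 3).
PREDICTORS = (
    "CPI Dominance scale",
    "M and F group (PAQ)",
    "Trait group (PDF)",
    "Sex of confederates",
    "Facial attractiveness",
)

# R-squared increments transcribed from Table 3, one tuple per outcome.
R2_INCREMENTS = {
    "dominance rating, typical": (0.193, 0.040, 0.037, 0.033, 0.042),
    "dominance rating, maximal": (0.103, 0.075, 0.033, 0.000, 0.011),
    "talking time, typical": (0.200, 0.013, 0.049, 0.026, 0.051),
    "talking time, maximal": (0.162, 0.020, 0.017, 0.044, 0.003),
}


def explained_shares(increments):
    """Return each increment as a fraction of the summed (total) R-squared."""
    total = sum(increments)
    return [value / total for value in increments]


if __name__ == "__main__":
    for outcome, increments in R2_INCREMENTS.items():
        print(f"{outcome}: total R2 = {sum(increments):.3f}")
        for name, share in zip(PREDICTORS, explained_shares(increments)):
            print(f"  {name:24s} {share:6.1%}")
```

Because the increments in a hierarchical regression depend on order of entry, these shares describe this particular sequence only; entering M and F group before CPI Dominance, as in the second analysis, would redistribute them.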


Discussion 


Sex role demands were an inhibitory force 
in the typical laboratory, where women either 
consciously or unconsciously suppressed their 
dominance behavior with men, but not in the 
maximal laboratory, where women were 
equally dominant with men and women. The 
latter gave us the opportunity to observe how 
women would behave in a situation in which
dominance, a stereotypically “masculine” behavior,
was made salient and socially desirable.
Apparently, women have the capacity to
express relatively more dominance than they 
generally do under typical or everyday cir- 
cumstances calling for appropriate sex role 
behavior. The explicit instructions to be max- 
imally dominant seemed to serve as a “green 
light” for the women to ignore established 
norms. The women were actually capable of 
behaving in a dominant fashion with men but 
typically inhibited this behavior. These re- 
sults demonstrate the importance of distin- 
guishing between what individuals do and 
what they can do. Despite reserved or sub- 
missive behavior under typical conditions, 
many individuals are capable of displaying 
dominant behavior. 

Findings for laboratory talking time were 
the reverse of expectations. Women were sig- 
nificantly more talkative with male con- 
federates than with female confederates in 
both laboratory conditions. In retrospect, this 
finding is hardly surprising. Women may 
have been more “turned on” by participating 
in an experiment with men. Although talking 
time showed substantial relationships with 
other dominance measures (e.g., CPI Dom- 
inance scale; dominance rating), it is by no 
means a “pure” measure of dominance. Verbal 
behavior undoubtedly contains components 
of sociability, and just as dominance behavior 
has a negative valence in terms of traditional 
sex role standards for women, sociable behavior,
particularly with the opposite sex,
has a positive valence. Thus, sex role expectations
were actually conducive to verbal
interaction with men in the present study.

We believe that mean dominance rating 
differences as a function of the rater’s sex 
are valid and not due to rating artifacts. Male 
and female confederates were trained together
in pilot work, and raters’ sex was significant
only under typical instructions. Also,
masculine women rated by males in the typ- 
ical laboratory obtained higher dominance 
ratings than did the other three M and F 
groups rated by female confederates. In addi- 
tion, correlations between talking time and 
laboratory dominance ratings were virtually 
identical regardless of whether males or females
did the dominance ratings. And finally,
correlations of CPI Dominance with laboratory
dominance did not differ significantly
as a function of sex of confederate (r = .30
with female confederates; r = .49 with male
confederates).

These findings have several important implications.
First, traditional sex role prescriptions
have an impact on women’s behavior.
They can serve as facilitators or inhibitors
depending on situational context. Second,
women’s capabilities for opposite sex role
related behaviors may be underestimated.
Women tend to suppress certain dispositional
inclinations when sex role demands are operating
but can exhibit them when these demands
are no longer salient. Finally, it is
possible to remove inhibitory forces in the
environment so that women can express dominance
more freely.

On the other hand, consistent individual 
differences were observed in the enactment of 
sex roles. Psychological masculinity and fem- 
ininity, as measured by the PAQ, had a sig- 
nificant impact on women’s self-reports and 
behavior even when they were given explicit 
instructions to be as dominant as possible.
These findings are consistent with those reported
by Bem and her coworkers (Bem,
1975; Bem et al., 1976).

Additionally, these psychological variables
seemed to influence the degree to which
women responded to the situational sex role
demands, that is, masculine women showed
the least difference in behavior as a function
of confederates’ sex in the typical laboratory.
It is noteworthy that feminine women showed
little increase in dominance when moving to
the maximal laboratory with female confederates
but showed a large increase with male
confederates. Whether this is due to the feminine
women being more influenced by the […]
of the experimenter giving the maximal instructions
or to less motivation in these
women to express their dominance capabilities
in the context of other women is unclear
(see Eagly, 1978). That feminine and undifferentiated
women showed less dominance
than masculine women in the maximal laboratory
suggests, contrary to Bem’s conclusions,
that they may be somewhat lacking
in basic competence and not simply inhibiting
stereotypically masculine behavior. To
study the nature of masculinity and femininity,
therefore, researchers need to distinguish
conceptually between sex role behaviors and
psychological dimensions of masculinity and
femininity as advocated by Spence and Helmreich
(1978), especially if they are attempting
to predict sex role related behaviors.

Another finding consistent with Spence and
Helmreich’s (1978) formulations concerns
facial attractiveness. Undifferentiated, feminine,
masculine, and androgynous women
were rated as equally attractive by the
confederates. These results contradict popular
notions that masculine women are physically
unattractive. Interestingly, although
facial attractiveness was independent of psychological
masculinity and femininity, this
variable was significantly related to dominance
behavior in the laboratory. In the
typical laboratory, facial attractiveness accounted
for 14% of the explained variance.

Many of our commonly held notions about
psychological differences between the sexes
are based on observations of behavior in
everyday situations eliciting sex role behavior.
The notion that women are less dominant
than men may reflect adherence to traditional
sex role standards and expectations more
than strong psychological differences. Women
may be more similar to men in psychological
makeup than is generally assumed. Perhaps
research paradigms utilizing a response capability
approach will shed new light on psychological
differences between the sexes. Unfortunately,
this investigation did not have a
male sample for purposes of comparison.

Results for the self-generated trait descriptions
were as anticipated and similar to those
provided by Turner and Gilliland (in press).
Trait descriptions are predictively valid measures
and refer to relatively consistent dispositional
and behavioral variables. Unlike
traditional personality measures, which require
people to respond to a specific set of
items generated by the experimenter,
trait descriptions enable people to report
what they perceive as relevant to their personalities.
Bem and Allen (1974) argue that
an idiographic approach to assessment
would improve the predictive validity of personality
tests. Further, they point out that
“there is no inherent conflict between an idiographic
approach to assessment and a nomothetic
science of personality” (p. 511). Results
of the present study and Turner and
Gilliland’s study show that an idiographic
assessment technique can significantly predict
dominance behavior.

The behavior of the dominant, reserved, 
and trait-irrelevant groups is of particular 
interest in light of Bem and Allen’s (1974) 
formulations. Women who perceived the dom- 
inance dimension as relevant to their per- 
sonalities (i.e., dominant and reserved groups) 
were extreme in their dominance expression in 
accordance with their self-descriptions,
whereas women who perceived this dimension
as unimportant (i.e., trait-irrelevant group)
were moderate in their dominance expression.
The mixed group was inconsistent in
the pattern of responses. This trait, then, was 
not of equal importance to all of the women, 
and this had a significant impact on their 
behavior. The predictive validity of per- 
sonality measures may well be improved by 
focusing on those individuals who perceive 
the trait under consideration as relevant to 
their psychological makeup. 

Finally, one must come away impressed 
with the strength of the CPI Dominance 
scale as a potent predictor of dominance be- 
havior in the laboratory. The multiple regres- 
sion analysis showed that this scale explained 
an average of 75% of the total variance ac- 
counted for by all predictors. The lesson to 
be learned from this finding is that in studies 
of psychological masculinity and femininity 
employing traditional aspects of dominance 
behavior as a criterion, one needs to be con- 
cerned with the unique contributions that 
these concepts make to outcomes beyond that 
given by traditional psychological measures. 
Although the concepts of psychological mas- 
culinity and femininity can be regarded as 
theoretically more valuable than the tradi- 
tional concept of dominance, recognition of 
this close alliance allows one to make contact 
with an already established literature of research
findings whose origins lie in a different
conceptual framework.


Effects of Person Salience Versus Role Salience on Reward
Allocation in a Dyad


Elena M. Carles and Charles S. Carver 
University of Miami 


Previous research shows that resource allocation in a dyad sometimes follows 
the principle of equity (proportional reward) and sometimes that of parity 
(equal reward). Existing evidence does not clarify the conditions under which 
each of these rules is invoked, however. A number of theorists have suggested 
that salience of the other as a person should lead to parity-based allocation, 
whereas salience of the other as a functionary filling a role should lead to 
equity-based allocation. The present study tested these possibilities. Subjects 
were led to perceive their own inputs to group performance as being either 
substantially lower or substantially higher than a partner's inputs. The partner 
had been portrayed to the subject in terms that made salient the partner's 
personal characteristics, the partner’s role assignment, or neither of these.
Among females, subsequent reward allocation followed the predicted pattern in
both high- and low-input conditions. Among males, contrary to expectation,
person salience led to heightened feelings of competitiveness and to increased
allocations to the self. Discussion centers on the theoretical implications of these
findings.


When group members interact, each member
gives something to the interaction and
each takes something in return. These are the
costs and the rewards of group interaction
(Thibaut & Kelley, 1959). Although rules for
apportioning a group’s rewards are sometimes
determined by some form of open bargaining
(e.g., Morgan & Sawyer, 1967), members of
a group often arrive at implicit allocation
rules without direct feedback from each
other.

When persons are required to allocate
group resources between themselves and their
partners in a dyad, the division can be accomplished
in many different ways. One may
act solely in self-interest, keeping all the
resources for oneself; one may give all the
resources to one’s partner; or one may divide
the resources, giving some percentage to
oneself and some to one’s partner. Research
in this area, which is often termed “distributive
justice,” usually finds that the allocator
chooses the latter, that is, some division between
self and partner. The question, then,
that is basic to that area of investigation is
this: What characteristics of the group members,
the context, or the ongoing interaction
affect the allocator’s decision as to how the
resources will be divided?

Studies in this area typically formalize the 
interaction between group members as fol- 
lows. The subject (future allocator) and the 
other member are given some cooperative 
task. The subject subsequently receives feedback
as to the relative productivity (inputs)
of the two members. The group then is sup- 
plied with some reward, usually monetary, 
and the subject is asked to allocate this 
reward (or in some cases, reallocate it after 
an initial distribution by the partner). 


Allocation Rules 


Equity. One rule that could be used to 
allocate the group’s resources is the principle
of equity; that is, the notion that one should
be rewarded in proportion to one’s inputs. 
There is considerable evidence that this rule 
is often used in resource allocation. For ex- 
ample, Adams (1963) showed in a series of 
experiments that people tend to minimize the 
psychological differences between their inputs 
and their outcomes, in relation to the other 
members of a dyad. 

Because many factors (e.g., psychological, 
social, physical) influence people’s percep- 
tions of inputs and outcomes, however, equity 
theory has been much modified since it was 
originally proposed. For example, Walster, 
Berscheid, and Walster (1973) postulated 
that there could be any number of systems of 
equity within a given social structure and 
that powerful individuals within that struc- 
ture will attempt to induce others to accept 
systems beneficial to the group in power. 
According to this reasoning, group members 
can be influenced to believe that any method 
of apportionment is “equitable.” Leventhal 
(1976) has also noted limitations of equity 
theory and has pointed out a number of other 
allocation rules that could be used instead 
of the equity rule. 

Equity and parity. In spite of the multi- 
ple possibilities, however, a review of the 
literature indicates that available resources 
are usually distributed in one of two ways. 
In some cases the equity rule operates, and 
resources are distributed to each member of 
the dyad according to merit or observed per- 
formance in the prior task. In other cases, 
the resources are distributed equally between 
the members. This is called the rule of parity. 
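
The two rules can be stated as simple arithmetic. The sketch below is illustrative only: the function names, the reward amount, and the input scores are assumptions, not values from the study. Equity divides the reward in proportion to each member's input, whereas parity divides it equally regardless of input.

```python
def equity_allocation(reward, inputs):
    """Divide the reward in proportion to each member's input (equity rule)."""
    total_input = sum(inputs)
    if total_input <= 0:
        raise ValueError("total input must be positive for a proportional split")
    return [reward * member_input / total_input for member_input in inputs]


def parity_allocation(reward, inputs):
    """Divide the reward equally among members, ignoring inputs (parity rule)."""
    share = reward / len(inputs)
    return [share for _ in inputs]


if __name__ == "__main__":
    reward = 10.0    # hypothetical group reward
    inputs = [3, 1]  # hypothetical inputs: allocator first, partner second
    print("equity:", equity_allocation(reward, inputs))
    print("parity:", parity_allocation(reward, inputs))
    # With equal inputs the two rules give the same split, so only
    # unequal inputs can tell them apart.
    print("equal inputs:", equity_allocation(reward, [2, 2]))
```

When inputs are equal the proportional split and the equal split coincide, which is why comparisons between the two rules require unequal inputs.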

It is generally accepted that people choose 
one of these two rules as being appropriate 
in a given situation. Furthermore, it is clear 
that which rule appears appropriate in one 
set of circumstances does not necessarily dic- 
tate which is seen as appropriate in other cir- 
cumstances (cf. Walster & Walster, 1975). 
Studies involving unequal input¹ have
yielded some support for the equity solution
(Leventhal & Anderson, 1970; Leventhal &
Lane, 1970; Leventhal & Michaels, 1969)
and some for the parity solution (Morgan &
Sawyer, 1967; Wiggins, 1966). Apparently
there is some additional factor that must account
for this discrepancy between findings.

Degree of interaction. Shapiro (1975), in 
reviewing these studies, found that when 
equity predominated, there had been minimal 
social interaction between the allocator and 
the partner. In contrast, where parity oc- 
curred, there was either an acquaintance be- 
tween the two or the implication of future 
interaction. Shapiro attempted to explore 
the possibility that the critical influence on 
resource allocation was the expectation of 
future interaction. He found, in a subsequent 
experiment, that when such an expectation 
existed, subjects were more likely to allocate 
resources according to parity; when no ex- 
pectation existed, equity prevailed. 
Shapiro’s method, however, leaves serious 
questions as to exactly what variable deter- 
mined his subjects’ choice of allocation rule, 
In the condition involving no expectation of 
future interaction, so-called partners ostensibly
participated in the study at completely
different times. On the other hand, in the
future-interaction condition the partners not
only anticipated future interaction, but they
worked in adjoining rooms with the door left
open and could hear each other respond to
questions. This confounding of the expectation
of future interaction with a number of
other variables prevents the ruling out of
other potential causes as having determined
subjects’ choices of allocation rule.

Salience as person. Other theory suggests
the possibility that Shapiro’s effect may have
depended on a different variable: the degree
to which the partner was salient as a person.
For example, Lerner (1977; see also Lerner,
Miller, & Holmes, 1976) has theorized that
the perceived relationship between partners
should affect which rule of allocation is preferred.
Lerner suggested that when one’s
partner is seen as similar to oneself, parity
prevails. On the other hand, when “people
relate to one another on the basis of their
position or function in the cooperative en[…]
. . . the justice of equity would be
applied” (Lerner, 1977, p. 47, emphasis
added). Thus as others are seen in terms of
their functions (e.g., “co-worker”) rather
than as individuals, the rule for allocation of
group resources should change from parity to
equity. Deutsch (1975) has similarly suggested
that seeing people in terms of their
usefulness (function) leads to equity, whereas
salience of social relations causes parity to
dominate.


1 Several researchers (Leventhal, Allen, & Kemelgor,
1969; Leventhal & Bergman, 1969; Leventhal,
Weiss, & Long, 1969) have reported that when inputs
are equal, subjects react to an unequal division of
resources by reallocation to restore parity (e.g., equal
pay for equal work). However, in the case of equal
input (e.g., equal work), the equity solution (proportional
allocation) happens also to be the parity
solution (equal allocation). Thus comparative tests
require that there be unequal inputs.


Low-input situations. Subjects in what
might be regarded as personalizing circumstances
often do allocate resources according
to parity (Leventhal, Michaels, & Sanford,
1972; Shapiro, 1975). But there are special
circumstances in which equity-type allocation
has been found regardless of an implied relationship
(Shapiro, 1975). Specifically, this
has occurred when the subject’s input was
less than the partner’s input. For the subject
to invoke a parity rule in this case would
mean, in effect, taking some of the other's
earnings for himself. This is not something
that one would do to another “person” without
his permission; nor is it the way one
would be inclined to treat a “partner.” It
thus seems that the parity-equity decision
discussed above may be most relevant to circumstances
in which one’s own contribution
to the task had outweighed the other’s input.


Present Study


It was suggested above that personalization
of one’s partner in a dyad may bias the choice
of allocation rule toward parity, though this
possibility has not yet been tested explicitly.
Moreover, a review of the literature suggests

In virtually all previous research, the relationship
between the partners has been, either
explicitly or implicitly, competitive. The
“other person” in the present research similarly
occupied what was implicitly a competitor
role.² The present subjects experienced
outcomes into which they themselves osten- 
sibly had greater or lesser input than their 
“partners.” The partner was portrayed to 
subjects such that (a) the partner’s identity 
as a person was made salient, (b) the part- 
ner’s role in the task was made salient, or (c) 
neither of these was made salient. The sub- 
ject then was asked to allocate a group re- 
ward. It was expected that role salience 
would lead to equity-based allocation and that 
person salience would lead to parity-based 
allocation, both relative to the condition in 
which neither person nor role was salient. 
It was also expected that these tendencies 
would be maximized when the subject’s per- 
ceived input was greater than the partner’s. 


Method 


Subjects and Procedures 


Subjects were 120 undergraduates (60 males and 
60 females) from the University of Miami. One 
other subject was deleted because questioning re- 
vealed that he had not understood the experimental 
manipulation. Subjects were tested one at a time. 
Each was told, however, that he or she was par- 
ticipating as part of a (same-sexed) pair. The part- 
ner was ostensibly in a different experimental cubicle. 

The experimenter explained that the study was in- 
tended to test the effects of group concentration on 
telepathic ability. The subject and his or her partner 
would be trying to receive telepathic communications 


2 There seems to be only one noncompetitive con- 
text in which roles are salient: when the roles are 
defined strictly in terms of cooperative partnership. 
If resource allocation becomes an issue in such a 
relationship, presumably both of the influences under
consideration here (personalization and role salience)
would influence allocation in the direction of parity,
because sharing is a component of this role. It
would thus be difficult to separate the two influences 
from each other. In part for this reason, the present 
research focuses on a relationship whose role char- 
acteristics should exert an influence opposite to that 
expected for personalization. It is likely that this 
implicitly competitive relationship corresponds rela- 
tively well in this respect to most relationships in 
which resource allocation is an issue.

from a sender.³ The experimenter emphasized that
success depended on concentration exerted by both
subjects, and not on their specific abilities. Subjects
were told that they would get feedback on the accuracy
of both group members later on. The experimenter
also stressed that partners would not meet
either during or after the experiment. Ostensibly 
this was being arranged so that the partners would 
not be distracted by the anticipation of interacting. 
In reality, this information was being provided ex- 
plicitly to remove anticipation of interaction as a 
possible cause of the behavioral effect under investi- 
gation. Thus the test of our hypothesis was not con- 
taminated by this additional variable (cf. Shapiro, 
1975). 

Salience manipulations. In the person-salient con- 
dition, after the above task description had been 
given, the subject was told that distraction during 
the task would be minimized if the subject thought 
of the other as a unique individual. The experimenter
ostensibly had been able to gather some personality
information on potential participants from
certain class sections. Subjects were told that although
such information had not been obtained from
them themselves, the information was available for
their partner.⁴ The subject then was given a brief
description of the hypothetical partner, including
the following specific information:


Your corecipient’s name is Pete (Linda), and he 
(she) is 19 years old. S/He lives in south Florida, 
and it says here that s/he’s a Methodist. 


The personality tests show that Pete is mildly 
extraverted and somewhat impulsive. On a forced- 
choice questionnaire, Pete endorsed the item ‘I 
often leap before I look, and that gets me into 
trouble sometimes.’ The report indicates that Pete
rarely feels depressed or anxious. He said ‘I really
enjoy life, and I don’t let the bad things that
happen get me down.’


Pete prefers an active to a passive life. He participates
in many sports, and says that he would
rather play tennis on a Sunday afternoon than
watch it on TV. Pete is highly sociable and loves
to talk to people.


Since Pete uses so much energy doing sports and
socializing, it is not surprising that he reported
himself to be somewhat slack in getting his schoolwork
done, and leaves everything till the last
minute.


The emphasis of all the information was to “per- 
sonalize” the partner as much as possible, but to do 
so in such a way that the partner would be fairly 
representative of students at the University of 
Miami. 

In the role-salient condition, the subject was told 
that thinking about who his or her counterpart 
might be would be disruptive. If the subject must 
think about the other at all, he or she should think
of the role that the other plays as a receiver of
telepathic communications. 

Control subjects were told that the study was to 
test the effects of a specific vitamin complex on con- 
centration and success in ESP tasks. Subjects were 
told that they were in a control group that would 
receive no vitamin. Although subjects were told that 
they would be working with a partner who was in 
another room and who would be concentrating on 
the same task, the partner was not emphasized 
either as an individual or as a role. The intent behind
these instructions was to give the subject an
orientation to the task in which neither person nor
role would be salient.

In all experimental conditions, subjects were told
to do their best and were reminded that the other
person would also be trying to do his or her best
at the task.

Input manipulation. Each subject then was asked
to work on the ESP task. On each trial, the subject
attempted to guess (when a signal light went on)
which ESP card an experimenter was holding, from
among four possibilities. After the task was completed,
the subject was given false feedback as to the
number of correct guesses made by each individual
of the alleged pair. The subject was told that he or
she had performed either substantially better than
the partner (6 correct for the subject vs. 2 correct
for the partner) or substantially worse than the
partner (2 correct for the subject vs. 6 correct for
the partner).

Dependent measure. The subject was then told 
that a small grant had enabled the experimenter to 
be able to reward the group’s effort. Each group 
ostensibly was being rewarded according to how 
well the group had performed. After checking the
group’s total of “correct” responses, the experimenter
indicated that the subject’s group would be awarded
four dollars. The subjects were told that they had
been chosen by lot to divide the earnings between
their partner and themselves. The experimenter said
that how to divide the money was entirely up to
the subject. The partner implicitly did not know 
what the group’s total monetary reward would be, 


3 This cover story was chosen because it allowed
easy manipulation of perceived input and because
it allowed the roles held by the subject and
“cosubject” to be equivalent to each other in terms of
nature of function performed, status, and so forth.
Doubts as to the credibility of the cover story
among subjects proved to be unfounded, as evidenced
by pilot testing. Indeed, many subjects expressed
some disappointment upon learning that the study
had not actually concerned telepathy.

4 This statement was included to guard against the
possibility that subjects’ behavior might be influenced
by being “known” by the other. The results
of the study thus depended solely on the subjects’
perceptions of the cosubject, rather than on concern
over the cosubjects’ knowledge of the subjects.


and the subject was told that the division would be
completely anonymous, even to the experimenter.
Thus the subject should have perceived complete
freedom in allocating resources.

The subject was left alone in the cubicle to fill
out a form indicating what amount should go to
himself or herself and what amount to the partner.
The subject sealed this form (coded only by a special
number) in an envelope and placed it in a depository,
from which a departmental secretary
ostensibly would take it later to type checks to be
mailed out. As soon as this secret division of earnings
was completed, the subject was given a postexperimental
questionnaire and was debriefed. No
subject actually received money. All were debriefed
immediately in that regard, and were provided with
an opportunity to vent any negative feelings that
the deception had caused. In general, subjects had
little negative reaction, feeling that the small amount
of money involved did not warrant being upset at its
loss.


Results


In the interest of clarity, we will present
the behavioral results of the study first, and
then the results of the various manipulation
checks.


Allocation of Resources


A 3 (Role-Salient vs. Person-Salient vs.
Control) X 2 (High vs. Low Input) X 2
(Male vs. Female) analysis of variance of
subjects’ dollar allocations to themselves revealed
three significant effects. First, there
was a large effect for input level, F(1, 108)
= 34.23, p < .001, indicating that subjects
who were told that they had done better in
the task than their cosubject (high input)
allocated more money to themselves than
those who were told they did more poorly
than their cosubject (low input). There was
also a significant main effect for sex, F(1,
108) = 11.71, p < .001, such that males allocated
more earnings to themselves than did
females. Of greatest interest was a highly
significant interaction between relationship
and sex of subject, F(2, 108) = 9.38, p <
.001 (see Figure 1). This interaction indicated
that subjects’ allocations were indeed
influenced by their relationships to their partners,
but the nature of that influence differed
dramatically between males and females. As
is clear from Figure 1, each gender’s behavior
was consistent across the two levels of perceived
input.

Figure 1. Mean allocation to self, in dollars, by (a) females and (b) males in each of three relationship conditions (person salient, control, role salient), plotted separately for high-input and low-input subjects.

A Duncan multiple range test indicated
that females (combined across input levels)
allocated significantly more to themselves in
the role-salient than in the person-salient
condition (p < .05). The control mean did
not differ reliably from either the person
mean or the role mean but fell approximately
midway between them. These findings, among
females, were precisely as had been predicted.
The findings for males, on the other hand,
took a form opposite to the prediction. That
is, when combined across input levels, males
allocated to themselves more of the group’s
earnings in the person-salient condition than
they did in either the role-salient condition
or the control condition (ps < .05 by Duncan
test). The latter two means did not differ
reliably from each other.


Manipulation Checks


Degree of input. Subjects had been asked
to respond to a number of postexperimental
questions, some of which constituted checks
on the experimental manipulations.⁵ One
item asked subjects to indicate their actual
performance in the task compared to that of
the ostensible cosubject, with three response
options: higher, lower, and the same. This
item served as a check on the input manipulation.
All subjects retained for data analysis
answered correctly for their input levels.⁶

Nature of relationship. Four questions indirectly
assessed the way subjects perceived
their cosubjects. These can be construed as
checks on the relationship manipulation. Separate
analyses of variance compared subjects’
responses (on 7-point scales with suitable
anchoring phrases) on each of these
postexperimental questions.

The first item asked subjects to what degree
they perceived the other as a unique
individual. It was expected that this tendency
would be increased by the person manipulation
and decreased by the role manipulation.
Consistent with this expectation, analysis of
responses to this item revealed a significant
main effect for relationship, F(2, 108) =
6.72, p < .005 (see Table 1). A Duncan
multiple range test showed that when data
were combined across both input and gender
variables, subjects in the person condition
perceived the other as a unique individual
more consistently than did subjects in either
the role or the control condition (ps < .05).

Table 1
Mean Ratings of Perceived Uniqueness of Cosubject, Knowledge of Cosubject, Degree to Which Cosubject Was Like Oneself, and Felt Competitiveness Toward Cosubject

                                  Relationship
Item                        Person   Control   Role
Uniqueness of cosubject
  Males                      3.25     2.60     2.29
  Females                    3.80     2.75     2.00
Knowledge of cosubject
  Males                      2.65     1.95     1.50
  Females                    ...      ...      1.20
Similarity to self
  Males                      ...      ...      ...
  Females                    ...      ...      ...
Felt competitiveness
  Males                      3.25     2.00     1.85
  Females                    1.60     1.90     2.75

Note. Group n = 20.

A second item asked subjects how much
they knew about the other. It was anticipated
that the personalizing manipulation should
increase this tendency. As expected, an analysis
of variance showed only a highly significant
main effect for relationship, F(2, 108) =
18.89, p < .001 (see Table 1). A Duncan
multiple range test showed that, overall, subjects
in the person condition indicated significantly
more knowledge about the cosubject
than did subjects in either the control or the
role condition (ps < .05); the latter two
means did not differ reliably from each other.


5 Questions were also included to assess a number
of potential confounds. For example, as indicated
above, we wished to remove the variable of
anticipated interaction (Shapiro, 1975). It is
noteworthy in this regard that subject groups
proved not to differ from each other in expectations
of future interaction, reporting uniformly
low likelihoods that that would occur. Nor did the
groups differ regarding other variables.

6 One subject was deleted because, on this check,
he chose the opposite response from his
input value.


The third item asked subjects to indicate
to what degree they saw the other as another
student like themselves. Again, it was expected
that the personalizing manipulation
would increase and the role manipulation
would decrease this tendency. Analysis of
responses to this item, however, revealed
a significant Sex X Relationship interaction,
F(2, 108) = 3.09, p < .05 (see Table 1). A
Duncan multiple range test revealed that
when combined across input levels, females
perceived the other as a student like themselves
to a greater degree in the person condition
than in the role condition (p < .05). For
males, there was no reliable difference among


competitor. It was anticipated that the personalizing
manipulation would tend to decrease
perception of the other as a competitor, and
that the role manipulation would tend to increase
perception of the other as a competitor.
Analysis of subjects’ responses to this
item revealed a highly significant Relationship
X Sex interaction, F(2, 108) = 7.30, p <
.001. This interaction (see Table 1) followed
the same pattern as the interaction on the
dollar-allocation variable. That is, subjects
reported varying feelings of competitiveness
according to nature of relationship, but the
effects of the relationship variable differed between
the genders.

As was anticipated, females perceived the
other more as a competitor in the role condition
than in the person condition. A Duncan
multiple range test indicated that this difference
was reliable between the person and role
conditions when the input levels were combined,
and for the high-input condition taken
separately (ps < .05). Male subjects, on the
other hand, tended to perceive the other more
as a competitor in the person condition than
in either the control or role conditions. A
Duncan multiple range test indicated that the
person condition mean differed reliably from
the control and the role condition means
(ps < .05) when input levels were combined
and also differed for the high-input condition
taken separately.

As might be expected, based on the similarity
between the results on this item and
the allocation variable, the degree to which
subjects reported perceiving the other as a
competitor was reliably related to the dollar
allocations they made to themselves, overall
r(118) = .27, p < .01.

Correlations. These postexperimental data
provide mixed support for the contention that
the relationship manipulations did what they
were intended to do. Ratings of uniqueness of
the other, knowledge about the other, felt 
competitiveness (among females), and seeing 
the other as a student like themselves (among 
females) were consistent with expectations. 
But there was a clear reversal of the expected 
relationship in felt competitiveness among 
males. 

This gender difference was also revealed by 
correlational analyses. Rated knowledge of 
the other, which had been influenced by the 
personalizing information consistently across 
sex (Table 1), proved to be consistently cor- 
related with perceived uniqueness of the 
other among the entire subject sample, r(118)
= .38, p < .001; among males taken separately,
r(58) = .26, p < .03; and among
females taken separately, r(58) = .47, p <
.001. Among females, neither knowledge of
the other nor perceived uniqueness was related
to feelings of competitiveness. But both
these perceptions correlated positively with
felt competition among men, r(58) = .44,
p < .001, and r(58) = .23, p < .04, respectively.

This pattern of correlations suggests that 
one unanticipated perceptual-cognitive transformation
was responsible for the unpredicted
behavior of men in this study. Apparently 
personalizing information, felt knowledge of 
the other, and perceptions of the other's 
uniqueness were translated among men into 
feelings of competitiveness toward the other. 
Felt competitiveness, then, influenced even 
male subjects’ allocations in a perfectly
intelligible fashion.


Discussion 


The primary focus of this experiment was 
upon the effect of the nature of a relationship 
on resource allocation. The nature of the 
relationship, as perceived by the allocating 
member of the dyad, clearly did influence the 
way in which that member divided the group 
earnings. The nature of that influence, however,
depended on the gender of the subject.


Allocation Among Females


It was expected that role salience would
lead to an equity-based allocation, and person
salience would lead to a parity-based
allocation, both relative to a control condition
in which neither was made salient. Females
behaved precisely in this fashion. In
the high-input condition, female subjects who
were encouraged to view their partners as
persons allocated themselves almost exactly
50% of the group’s earnings (a parity solution).
Female subjects encouraged to view
their partners in their competitive role allocated
the earnings more in line with the relative
perceived input of each member (an
equity solution). Although prediction was
more ambiguous among subjects led to perceive
their own inputs as low, dollar allocations
to self in those groups proved to follow
the same pattern as that found among high-input
subjects. In both cases, women were
more willing to concede earnings when the
other was perceived as a person than when
the other was perceived as a functionary.


Allocation Among Males 


Males, on the other hand, responded to
the relationship manipulation in a very different
way. Among males, allocation did not
differ between role and control conditions,
hovering only slightly above parity in both
cases. Allocation among men was reliably influenced
only by the personalizing information,
and it was an influence that was
opposite to what had been expected. In the
person-salient condition, high-input males allocated
themselves almost exactly 75% of the
group’s earnings (an equity solution). Although
willing to concede some of their
“equitable” earnings to the other in control
and role conditions, they were completely
unwilling to do so in the person-salient condition.
As was true among females, the same
pattern of dollar allocations was retained in
low-input groups, although the overall level
of self-allocation was reduced.
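
The 50% and 75% benchmarks used in this discussion follow from the figures reported in the Method: feedback of 6 versus 2 correct guesses and a group reward of four dollars. A minimal sketch of that arithmetic in Python is given here; the function names are illustrative and do not come from the original study.

```python
from fractions import Fraction


def parity_split(reward):
    """Parity rule: divide the reward equally, ignoring inputs."""
    half = Fraction(reward) / 2
    return half, half


def equity_split(reward, own_input, partner_input):
    """Equity rule: divide the reward in proportion to each member's input."""
    total = own_input + partner_input
    own_share = Fraction(reward) * own_input / total
    return own_share, Fraction(reward) - own_share


REWARD = 4  # dollars awarded to the group, as reported in the Method

# (label, subject's correct guesses, partner's correct guesses)
for label, own, partner in [("high input", 6, 2), ("low input", 2, 6)]:
    parity_self, _ = parity_split(REWARD)
    equity_self, _ = equity_split(REWARD, own, partner)
    print(f"{label}: parity ${float(parity_self):.2f}, "
          f"equity ${float(equity_self):.2f} to self")
```

Under these assumptions, a high-input subject following equity keeps $3.00 of the $4.00 (75%), whereas parity gives $2.00 (50%) regardless of input; for a low-input subject the equity share falls to $1.00.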


Allocation and Perceptions of the Other 


This pattern among males was not anticipated.
But the study yielded additional information
that allows some insight into the
dynamics behind males’ surprising behavior.
In order to gain some perspective on the
issue, let us reexamine the rationale behind
the study. As was noted earlier, both Lerner
(1977; Lerner et al., 1976) and Deutsch
(1975) have theorized that if one person
views the other in a personal way, as a unique
individual, as being “like oneself,” a parity-based
division should occur. Women’s ratings
and allocations were fully consistent with
this reasoning. Among males, in contrast, the
effects of the personalizing manipulation fit
with some of this characterization (enhanced
knowledge of the other, perceived uniqueness),
but not all of it. For some reason, the
personalizing information induced a highly
competitive orientation among men. It seems
likely that this difference was the basis for
the gender difference in allocation patterns.⁷

Why should personalizing lead to feelings
of competitiveness among males? One possible
explanation rests on the notion that
males are generally more attuned to competitive
aspects of the performance/allocation
situation than are females, and take such
competition more or less for granted. Thus,
in same-sex dyads at least, men may be less
capable of extricating themselves from the
competition role in inherently competitive
situations than are women. This would fit
with the common characterization of males
in our culture as “task oriented,” whereas
females are seen as more “social interaction”
oriented (cf. Bakan, 1966; Maccoby & Jacklin,
1974; Parsons & Bales, 1955). When the
other person was not particularly salient to
them, as was the case in both the role-salient
and control conditions, the situation may have
been relatively comfortable for males, by virtue
of its familiarity. That is, among males
this sort of “ritualized” role-based competition
would be normal and relatively nonthreatening.
But providing specific information
about the other person made him more
salient as an individual. Perhaps this enhanced
awareness of the other’s individuality
made him appear more threatening as a competitor.
In effect, enhanced salience of the

interpretation also suggests an important limitation
on the theorizing of Lerner (1977) and
Deutsch (1975). Specifically, personalizing
the other may have the consequences that
these theorists predicted only when the personalization
replaces the competitive role
orientation, rather than being added to that
orientation.


7 The possibility should be acknowledged that
males’ self-reports of competitive feelings emerged
as a function of having made the allocation decision
as they did, inasmuch as the self-reports occurred
after the allocation. Although this possibility cannot
be discounted entirely, it appears to suggest
little basis for the pattern of allocations themselves.


Implications


It is evident that the nature of the relationship
between the members of a dyad has
important impact on reward allocation. At
least two factors (the degree to which the
other is personalized, and the perceived competitiveness
of the relationship) appear to
mediate this impact. Moreover, the clear and
reliable sex difference in mediation of this
impact in the present study makes it evident
that future research concerning relationships


Furthermore, this gender difference appears to
lay ground for future research opportunities.
For example, if (as we argued above)
the unexpected findings among males reflect a
“masculine” orientation toward
competitive situations, one might expect this
to be influenced by differences in subjects’
sex role orientations (e.g., Bem, 1974).
Another important possibility concerns
the type of person who is portrayed to subjects.
We examined only one type
of portrayal in this experiment. Perhaps
other types might lead males to perceive
their partner as deserving egalitarian
treatment. Similarly, there
may also be portrayals that would induce an
especially competitive orientation among females.

These are only two among many possibilities.
There can be little doubt, however, that
men and women differ in important ways in
how they perceive personal relationships and
in what cues they use to define those relationships.
These differences between the sexes
remain an important area of scientific investigation.


8 Inspection of the raw data shows that 5 subjects
out of 120 allocated all the money to themselves.
All were males in the person-salient condition,
4 in the high-input group, and 1 in the low-input
group.


Allocation of Attention and the Type A Coronary-Prone
Behavior Pattern


Kansas State University
University of Pittsburgh


Three studies were conducted to assess the attentional style of individuals 
with the Type A coronary-prone behavior pattern. Experiment 1, which made 
use of a dual-task paradigm, revealed that Type A’s focus their attention on 
central tasks; thus, they attend less to peripheral tasks than do Type B’s. 
Experiments 2 and 3, which used a single task performed in the presence of a 
distracting stimulus, indicated that Type A’s actively inhibit or suppress their 
attention to task-irrelevant peripheral events that might distract them from 
task performance. These findings validated anecdotal observations that Type 
A's appear hyperalert (focused in their attention) but neglect task-irrelevant 
cues. Previous research has demonstrated that Type A’s fail to report fatigue 
as well as a variety of other physical symptoms of illness during task per- 
formance. To the extent that symptoms are analogous to peripheral events 
that distract from task performance, the data suggest that Type A’s suppress 
their attention to symptoms. Implications of the attentional style of Type A’s
for the pathogenesis of coronary artery and heart disease are discussed.


A behavior pattern called Type A has become
widely known for its negative effects on
cardiovascular health. Both retrospective and
prospective studies have demonstrated that
Pattern A is associated with at least twice
the occurrence of heart disease as an opposing
Pattern B, which is defined as the absence of
Pattern A (Friedman & Rosenman, 1974;
Jenkins, Rosenman, & Zyzanski, 1974). This
relationship remains when statistical controls
are introduced to partial out the effects of
traditional risk factors for coronary heart
disease (CHD), such as serum cholesterol
level and systolic blood pressure. Moreover,
Pattern A is also associated with the degree
of atherosclerosis in the coronary arteries
(Blumenthal, Williams, Kong, Schanberg, &
Thompson, 1978; Zyzanski, Jenkins, Ryan,
Flessas, & Everist, 1976).


Behavioral Characteristics of the Type A 


The Type A behavior pattern is defined as 
“an action-emotion complex that can be ob- 
served in any person who is aggressively in- 
volved in a chronic, incessant struggle to 
achieve more and more in less and less time, 
and if required to do so, against the opposing 
efforts of other things or persons” (Friedman 
& Rosenman, 1974, p. 67). Systematic in- 
vestigations by Glass and his colleagues have 
verified the existence of a number of behavioral
manifestations of this behavior pattern
(cf. Glass, 1977). For example, Type A’s sig- 
nal the passage of 1 minute sooner than Type 
B’s do, apparently sensing time passing rap- 
idly (Burnam, Pennebaker, & Glass, 1975). 
When frustrated, Type A’s become aggressive 
and hostile (Carver & Glass, 1978; Glass, 
Snyder, & Hollis, 1974). Type A’s work at 
near their maximal rate even when there is 
no explicit time deadline (Burnam et al., 
1975; Carver, Coleman, & Glass, 1976).
Simultaneously they report less fatigue and
other physical symptoms than B’s do (Carver 
et al., 1976; Weidner & Matthews, 1978). 

One behavioral attribute of Type A’s that 
has not been systematically investigated is 
their policy for allocating attention to en- 
vironmental events. This is indeed surprising, 
considering that policies for allocation of at- 
tention play a critical role in such diverse 
behaviors as experiences of affect and bodily 
states (Gibbons, Carver, Scheier, & Hormuth, 
1979; Scheier & Carver, 1977), human judg- 
ments (Taylor & Fiske, 1978), and attain- 
ment of performance goals (Carver & 
Scheier, Note 1). We suggested elsewhere 
that Type A’s and Type B’s might differ in 
how they allocate their attention to the 
environment and that this difference might 
explain why Type A’s report less severe physi- 
cal symptoms than Type B’s do while per- 
forming a stressful task (Weidner & Mat- 
thews, 1978). Specifically, we reasoned that 
because of Type A persons’ chronic efforts to 
achieve, they devote their full attention to 
tasks at hand. Consequently, they do not 
notice peripheral events that are not immedi- 
ately relevant to task performance, such as 
physical symptoms or subtle cues of another’s 
distress. In other words, Type A’s focus their 
attention on environmental stimuli that will 
aid them in their efforts to excel. This notion 
is consistent with descriptions of Type A 
individuals as being extraordinarily alert but 
inattentive to extraneous cues (Bortner & 
Rosenman, 1967; Friedman & Rosenman, 
1959; Rosenman & Friedman, 1961) and 
with the argument advanced by Glass (1977) 
that only highly salient tasks elicit controlling 
behavior from Type A’s.

Do A’s differ from B’s in the policies by 
which they allocate attention to their en- 
vironment? The purpose of the present re- 
search was to assess this possibility. Our 
major hypothesis was that Type A’s focus 
their attention on central aspects of their 
environment; consequently they attend less to 
peripheral aspects than Type B’s do. To test 
this notion, the first experiment used a dual- 
task paradigm (e.g., Bahrick, Fitts, & Rankin, 
1952; Hockey, 1970; Kahneman, 1973) in 
which one task was defined as central or 
primary and one task was defined as peripheral
or secondary through physical arrangements
of the tasks and experimenter instructions.
Assumptions of this paradigm are that
humans have a limited capacity of attention
and that performance on the two tasks varies
with the amount of attention allocated to
them, provided that the capacity of the
individual is fully used. Focusing of attention is
said to occur when performance on the
secondary task is poor and performance on the
central task is the same or good, compared to
a referent.

In Experiment 1, Type A and B undergraduates
worked for 64 minutes on two tasks
simultaneously. The primary task was the
Stroop Color Naming Task (Stroop, 1935),
and the secondary task was to depress a
telegraph key upon the onset of a light located at
the right periphery of the field of vision. We
expected that Type A’s would allocate less
attention to the secondary task than Type
B’s would; they would allocate as much and
perhaps more attention to the primary task
than Type B’s would. Consequently, Type
A’s should perform more poorly on the reaction
time task and the same or better on the
Stroop Color Naming Task than B’s should.


Experiment 1 


Method 


Subjects. Subjects were 23 undergraduates enrolled in introductory
psychology courses at Kansas State University who gained course
credit for participating in research. Subjects were classified as
Type A or B on the basis of their scores on the Jenkins Activity
Survey (JAS) (Jenkins, Zyzanski, & Rosenman, 1971) revised for
students by Glass (1977). In large samples of undergraduates, the
median of the JAS falls between 7 and 8 (Glass, 1977). Type A’s were
those who scored above the median of this sample of subjects (8.0
and above), whereas Type B’s were those who scored below the median
(7.0 and below). Two female subjects were deleted because of
equipment failure and one female Type B for failure to follow
instructions. The final sample contained 10 Type A’s and 10 Type
B’s; half of each group was male, half female. None were color
blind, and all were native English speakers. Nonnative English
speakers did not participate because they answer the JAS with a
different cultural referent (see Cohen for a discussion of this
point).
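
The classification rule described above can be sketched in a few lines. This is only an illustration, assuming the cutoffs stated in the text (8.0 and above for Type A, 7.0 and below for Type B); the function name `classify_jas` and the example scores are hypothetical and are not taken from the study.

```python
from typing import Optional


def classify_jas(score: float, a_min: float = 8.0, b_max: float = 7.0) -> Optional[str]:
    """Classify a JAS score as Type 'A' or Type 'B'.

    Scores at or above a_min are Type A, scores at or below b_max are Type B,
    and scores falling between the two cutoffs are left unclassified (None).
    The default cutoffs are the ones reported for Experiment 1.
    """
    if score >= a_min:
        return "A"
    if score <= b_max:
        return "B"
    return None


# Hypothetical scores, for illustration only (not data from the study).
example_scores = [3.0, 7.0, 8.0, 12.0]
print([classify_jas(s) for s in example_scores])
```

Keeping the two cutoffs as parameters lets the same rule express a design in which a band of middle scores is excluded a priori.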
Materials and apparatus. Directly in front of the subjects on a
table were the materials for the primary task, the Stroop Color
Naming Task (Stroop, 1935) modified by Hartley and Adams (1974).
The materials were a stack of sheets that contained 30 lines of six
color names; each color name was printed in a color of ink other
than its color name. Each line contained one stimulus color name on
the left and five possible choices of color names on the right.
Subjects were to circle with a pen in their dominant hand the color
name of the stimulus word’s ink among the five choices.
The apparatus for the secondary task, a simple reaction time task,
was located in the subject’s right field of vision.¹ At
approximately two o'clock in the visual field was a 12 inch × 12
inch (30.5 × 30.5 cm) black vertical panel on which was mounted a
1-inch (2.54 cm) square red light. At the base of the panel was a
telegraph key secured to the horizontal platform adjoining the
panel. During the 63 minutes of the experiment, the light was
activated 12 times; the intertrial interval averaged 20 sec, but
varied from 10 to 30 sec. The duration of the light was 10 sec
unless the subject terminated the light by depressing the telegraph
key. The time to depress the key was measured in milliseconds by a
Lafayette 54517 clock in conjunction with a digital printer.
Procedure. Subjects were tested individually. A female
undergraduate experimenter greeted the subjects and explained that
they would participate in two studies. After giving a brief
overview of the two studies, she obtained their consent to
participate in the research. In the first study, subjects were
asked to complete the JAS because the experimenter wished to
collect normative data on the styles of college students. Following
completion of the JAS, subjects filled out a second questionnaire
while the experimenter scored the JAS. Then subjects were escorted
to another testing room for the second study and were greeted by a
male experimenter, a graduate student, who remained blind to their
Type A-B categorization.
The second experimenter explained that they would be involved in an
experiment designed to examine “the effects of extraneous stimuli
on task performance.” He instructed the subjects on how to do the
Stroop Color Naming Task and allowed them to practice until they
solved several correctly in a row. Then he informed them that they
would have about 6 minutes for this task and that they should work
as fast and accurately as they could. A time deadline was imposed
because Type A’s and B’s tend to perform at a similar maximal rate
when there is such a time deadline (Burnam et al., 1975). Finally,
almost as an afterthought, he said, “In addition to working on this
Stroop task, I want you to press this telegraph key every time you
notice this light come on.” After several practice trials turning
off the light, the instructions for the Stroop task were reviewed
briefly, subjects began working, and the experimenter left the
room. After the allotted time had passed, the experimenter turned
off the reaction time apparatus and returned to the testing room.
(As a result, subjects worked on the Stroop task for approximately
64 minutes.) Then he administered a postexperimental questionnaire
and fully debriefed them.


Table 1
Performance by Type A’s and Type B’s on Secondary and Primary Tasks

Dependent measure                                   Type A   Type B

Secondary task
  M reaction time to each lightᵃ                      4.14     1.97
  Number of noticed lightsᵇ                           10.3     11.8
  M reaction time to each noticed light onlyᵃ         3.18     1.83
Primary task
  Number of Stroop problems completed correctlyᵇ     165.0    106.7

ᵃ The lower the number, the better the performance on this measure.
ᵇ The higher the number, the better the performance on this measure.



Results and Discussion 


Performance on the secondary and primary 
tasks and items on the postexperimental 
questionnaire were analyzed by 2 (Type A- 
B) x 2 (Sex of Subject) analyses of vari- 
ance. Because there were no significant effects 
for sex of subject in any of the analyses, only 
analyses by type are reported. Table 1 pre- 
sents the mean Type A and B scores for the 
following dependent measures: (a) mean 
reaction time to each of the 12 lights, (b) 
number of lights noticed (as indicated by 
depression of the telegraph key prior to 
automatic turning off of the light), (c) mean 
reaction time to each of the noticed lights 
(all measures from the secondary task), and 
(d) the number of problems completed on 
the Stroop Color Naming Task (the primary 
task).
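
As a reading aid for the F ratios reported below, the error degrees of freedom of a fully between-subjects factorial design equal the number of subjects minus the number of cells. The sketch below is a generic illustration rather than the authors' analysis code; the function name `error_df` is hypothetical. With the final sample of 20 subjects in a 2 (Type A-B) × 2 (Sex of Subject) design, it gives 16.

```python
from typing import Sequence


def error_df(n_subjects: int, levels: Sequence[int]) -> int:
    """Error degrees of freedom for a between-subjects factorial ANOVA.

    n_subjects: total number of subjects in the analysis.
    levels: number of levels of each between-subjects factor.
    """
    cells = 1
    for k in levels:
        cells *= k  # number of cells is the product of the factor levels
    return n_subjects - cells


# Experiment 1: 20 subjects, Type A-B (2 levels) x Sex of Subject (2 levels).
print(error_df(20, [2, 2]))  # 16
```

Each two-level factor and their interaction carries 1 numerator degree of freedom, which is why the tests in this design take the form F(1, 16).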

Secondary task performance: Reaction
time to lights. We expected that Type A’s
would attend less to the secondary task than
would Type B’s and consequently would
react more slowly to the onset of the stimulus
light than Type B’s would. As row 1 of
Table 1 reveals, the mean reaction time to
each of the 12 lights was significantly greater
for Type A’s than for Type B’s, F(1, 16) =
31.98, p < .001. In addition, Type A’s failed
to notice as many lights as did Type B’s (see
Table 1, row 2), F(1, 16) = 12.16, p < .003.
Because it was possible that the mean
reaction time to the lights was greater for A’s
than for B’s simply because A’s did not
notice as many lights as B’s did, we also
examined the mean reaction time to each light that
had been noticed. As row 3 of Table 1 shows,
Type A’s were significantly slower than B’s
to react to these lights, F(1, 16) = 17.88,
p < .001.²


¹For the one left-hander in the study, the
apparatus was switched to the subject’s left field of
vision, and he was required to press the reaction
time key and circle the correct response on the
Stroop task with his left hand.
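
The relation among the three secondary-task measures can be checked with a rough calculation. The sketch below rests on two assumptions that the text does not state: that Table 1 reaction times are expressed in seconds, and that a light that went unnoticed was scored at its full 10-sec duration. Under those assumptions, the overall mean reaction time is a weighted average of the noticed-light mean and the 10-sec ceiling. The function name `mean_rt_all` is hypothetical, and because group means are used in place of individual scores, the result can only approximate the tabled values.

```python
MAX_RT = 10.0   # assumed score (sec) for a light that was never turned off by the subject
N_LIGHTS = 12   # number of light onsets reported in the Method section


def mean_rt_all(n_noticed: float, mean_rt_noticed: float,
                n_lights: int = N_LIGHTS, max_rt: float = MAX_RT) -> float:
    """Approximate mean reaction time over all lights.

    Noticed lights contribute their mean reaction time; unnoticed lights are
    assumed to contribute the maximum duration of the light.
    """
    n_missed = n_lights - n_noticed
    return (n_noticed * mean_rt_noticed + n_missed * max_rt) / n_lights


# Group means from Table 1, rows 2 and 3.
print(round(mean_rt_all(10.3, 3.18), 2))  # Type A; Table 1, row 1 reports 4.14
print(round(mean_rt_all(11.8, 1.83), 2))  # Type B; Table 1, row 1 reports 1.97
```

Both values fall within a few hundredths of row 1 of Table 1, which is consistent with, but does not establish, the assumed scoring rule.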

Primary task performance: Stroop Color
Naming Task. If Type A’s focus their at- 
tention on the primary task, they should per- 
form the same or better on the Stroop Color 
Naming Task than should B’s. As row 4 of 
Table 1 reveals, Type A’s did correctly com- 
plete significantly more Stroop problems than 
Type B’s did, F(1, 16) = 27.92, p < .001.³

Postexperimental questionnaire. In re- 
sponse to items on a postexperimental ques- 
tionnaire, Type A’s believed that they per- 
formed better on the Stroop task than Type 
B’s believed they performed, F(1, 16) = 
5.65, p < .03. There were no other signifi- 
cant effects for sex or type on ratings of 
difficulty of the Stroop task, quality of per- 
formance on the reaction time task, disruptive 
effect of responding to the light on the Stroop 
task performance, or the importance of the 
research. 

In summary, the results of Experiment 1 
indicate that, relative to Type B’s, A’s per- 
form more poorly on the secondary task and 
better on the primary task. It is worthwhile 
noting that the precise rewards for excelling 
on the primary and secondary task were left 
ambiguous in this study. We chose to do this 
to maximize the possibility that individual 
differences in allocation of attention would 
emerge. In fact, we suspect that should pre- 
cise rewards be established, both A’s and B’s 
would be capable of allocating their attention 
to receive maximum reward. In other words, 
both types might behave similarly in ex- 
plicit situations, for example, when perform- 
ance on secondary tasks is highly rewarded. 


KAREN A. MATTHEWS AND BRADFORD I. BRUNSON


Nonetheless, the results support the notion 
that A’s focus their attention on central 
environmental events. Thus, they allocate less 
attention to peripheral tasks than do B’s in 
relatively unstructured settings. 

Because A’s attend less to peripheral tasks
than B’s do, it is reasonable to extrapolate
that they also attend less to task-irrelevant
peripheral events. Experiment 1 did not,
however, address the issue of how active this
inattentiveness of A’s to peripheral events
might be. That is, the presence or absence
of a task-irrelevant peripheral event may
simply be inconsequential for A’s, who are
attending to task performance. Alternatively,
however, it may be that Type A’s actively
inhibit their attention to such peripheral
events.

Experiment 2 was conducted to compare
these two possibilities. In Experiment 2 we
began by assuming that task-irrelevant sounds
are peripheral events that can distract from
task performance if attention is directed to
them. Previous research has shown that the
presence of extraneous noise improves Stroop
color naming performance (Hartley & Adams,
1974). Apparently this occurs because
subjects presented with a noise actively inhibit
their attention to the noise; in doing so, they
also inhibit their attention to other
task-irrelevant cues, including the name, rather
than the ink color, of the stimulus word (Houston,
1969; Houston & Jones, 1967). Thus,
performance is better as subjects are less
distracted by even the distractors inherent
in the task. This fact allows us to
ascertain whether Type A’s actively inhibit their
attention to the extraneous sounds or whether
this peripheral event is simply inconsequential
in the following way. If Type A’s attend
less to task-irrelevant peripheral events than
B’s do simply because they are attending to
the central task, then the Stroop performance
of A’s should not be affected by the presence
or absence of a distractor. On the other hand,
if Type A’s actively inhibit their attention to
distractors, the performance of A’s should be
facilitated by the presence of a distractor. To
compare these possibilities, Type A and B
undergraduates performed the Stroop Color
Naming Task either in the presence or in the
absence of a tape of distracting sounds. Each
subject was given an explicit time deadline of
5 minutes to insure that both A’s and B’s
worked at the same high rate (cf. Burnam et
al., 1975).


²On this measure only, there was a marginally
significant interaction term, Type A-B × Sex of Subject,
F(1, 16) = 4.21, p = .06. Although both male and female
Type A’s were slower than male and female Type
B’s, the difference between the types was larger for
males than for females.

³Analyses of the number of attempted Stroop
problems for Experiment 1 also revealed the same
pattern of results: Type A-B, p < .001. There were
no significant effects in percentage of errors.


Experiment 2


Method


Subjects. Subjects were 22 male and 21 female
undergraduates enrolled in introductory psychology
courses at Kansas State University who obtained
course credit for their participation. Subjects were
classified as Type A (9.0 and above on JAS) or
Type B (6.0 and below) prior to experimental
procedures.⁴ One Type B woman was deleted from
the analyses because of her failure to follow
instructions. In the final sample, there were 11 Type A’s
and 10 Type B’s in the distractor group and 10
Type A’s and 11 Type B’s in the no distractor
group. None were color blind; all were native
English speakers.
Procedure. Subjects were tested individually. A female
undergraduate greeted the subjects, and as in Experiment 1, she
explained that they would participate in two studies. After giving
a brief overview of the studies, she obtained their consent to
participate in the research. Then, during the first study, the JAS
was administered. After subjects completed the JAS, they completed
a second personality quiz while the experimenter scored the JAS and
assigned subjects to either the distractor or the no distractor
group. She then escorted subjects to an adjoining testing room
where she introduced them to a male experimenter and informed the
experimenter of the subjects’ stimulus group assignment. She did
not inform him of the subjects’ Type A-B score. The second
experimenter explained to subjects in the distractor group that the
study concerned the effects of noise on task performance or to
subjects in the no distractor group that the study concerned the
effects of visual interference on task performance. Subjects were
then given instructions identical to those for the Stroop task in
Experiment 1 and were asked to work as accurately and quickly as
possible. In addition, subjects in the distractor group were told
that they
would be exposed to a tape of sounds through
headphones while they were working on the Stroop 
task. The tape contained continuous electronic 
music interspersed with such common sounds as a 
sports broadcast, brief verbal remarks, and a ticking 
clock. The average intensity of the sounds was 78 
dB(A), 3000 Hz measured at the headphones with 
the aid of a coupler device. The range was 72 to 
85 dB(A). The no distractor group were also asked 
to wear headphones, but of course the tape was 
not played for them. Subjects then worked on the 
Stroop task until the experimenter stopped them 
after 5 minutes. Subjects completed a question- 
naire concerning their perceptions of the experi- 
ment; questions concerning perceptions of the 
sounds were not administered to no distractor sub- 
jects. All subjects were questioned regarding their 
suspicions of experimental procedures and were then 
fully debriefed. 


Results and Discussion 


Except for ratings of the sounds, the data 
were analyzed by 2 (Type A-B) x 2 (No 
Distractor — Distractor Groups) X 2 (Sex of 
Subject) analyses of variance. 

Performance on the Stroop Color Naming 
Task. We expected that if relative to Type 
B’s, Type A’s attend less to peripheral events 
simply because they attend more to central 
events, they should not perform better in the 
presence of a distractor. However, if Type A’s 
actively inhibit their attention to the dis- 
tractor, the presence of the distractor should 
facilitate their performance. In either case, 
there should be no differences in the types’ 
performance in the no distractor condition 
because the types work at a similar rate with 
a time deadline (Burnam et al., 1975), and 
in general, performance should improve in the 
presence of sounds. As row 1 of Table 2 indi- 
cates, A’s did perform better with a distractor 
than without. There were no differences in
the performance of A’s and B’s without a
distractor. The analysis of variance confirmed
this pattern of results with a statistically
significant interaction term, Type A-B ×
Distractor-No Distractor Group, F(1, 34) =
5.90, p < .02, and a significant main effect for
Type A-B, F(1, 34) = 4.16, p < .05.⁶
Internal contrasts revealed that Type A’s
performed significantly better in the presence of
a distractor than without, t(19) = 2.14, p <
.05, and outperformed B’s in the distractor
condition, t(19) = 3.44, p < .01.
Unexpectedly, however, Type B performance tended to
worsen with distraction, t(19) = 1.73, p <
.10. There were no other significant effects,
including those for sex of subject.

⁴There were 10 additional subjects who scored
at 7.0 or 8.0 on the JAS who were a priori
excluded from the study.

⁵The tape was designed to be similar to the
Houston (e.g., 1968) tape of sounds in dB(A) level
as well as in type of random sounds. The tape was
distracting but not aversive.


Table 2
Number of Stroop Problems Completed Correctly by Type A’s and Type B’s

                     Distractor                No distractor
                Type A       Type B        Type A       Type B
Experiment     No.    n     No.    n      No.    n     No.    n

2             113.0   11    81.7   10     93.5   10    97.5   11
3             125.1    9   104.2   10    101.5    8    95.2    9

Postexperimental questionnaire. Type A’s 
rated the tape of sounds as more unpleasant 
on a 7-point scale than Type B’s did (Ms =
3.3 vs. 4.7, respectively), F(1,17) =
4.80, p < .05. Women rated the tapes as sig- 
nificantly louder and more stressful than men 
did, Fs(1,17) > 4.63, ps < .05. There was 
a significant interaction term for ratings of 
task difficulty, Distractor-No Distractor 
Group X Sex of Subject, F(1,34) = 4.05, p 
= .05. Although female subjects rated task 
difficulty similarly in the two groups, men in 
the distractor group thought the Stroop task 
was less difficult than those in the no dis- 
tractor group did. There were no other sig- 
nificant differences across types, distractor 
conditions, or sex of subject for ratings of 
quality of performance, importance of re- 
search, and distractibility of the distractor. 
In summary, the results of Experiment 2 
are consistent with the notion that Type A’s 
inhibit their attention to  task-irrelevant 
peripheral events. Unexpectedly, Type B per- 
formance tended to deteriorate with distrac- 
tion. Thus, we did not replicate previous re- 
search that indicates distracting sounds lead 
to improved performance overall on the 
Stroop test. Therefore, in order to examine 
the reliability of the Type B responses to
distraction, we decided to replicate
Experiment 2.


Experiment 3


Method


Subjects. Subjects were 18 male and 19 female
undergraduates enrolled in introductory psychology
courses at Kansas State University who obtained
course credit for their participation. Subjects were
classified as Type A (8.0 and above) or Type B
(7.0 and below) prior to experimental procedures
and assigned to cells so that an approximately equal
number of Type A men and women and Type B men
and women were in each group. One Type A male
was deleted because of his failure to follow
instructions. In the final sample, there were 9 Type A’s
and 10 Type B’s in the distractor group and 8 Type
A’s and 9 Type B’s in the no distractor group. None
were color blind; all were native English speakers.

Procedure. The procedure was identical to that
used in Experiment 2 with one exception. Subjects
were administered the JAS during one test session.
At that time, the female experimenter scheduled
them for a second session one or two weeks later,
during which they were administered the Stroop
Color Naming Task. As before, the experimenter in
the second session was blind to the subjects’ Type
A-B score.


Results and Discussion 


Performance on the Stroop Color Naming
Task. As in Experiment 2, Type A’s
performed better with a distractor than without
(Table 2, row 2). There were no differences
in the performance of A’s and B’s without a
distractor. In contrast to the results of
Experiment 2, B’s did not deteriorate in their
performance in the presence of a distractor.
The analyses of variance confirmed this
pattern of results with a significant main effect
for distractor-no distractor group, F(1, 28)
= 4.17, p = .05, and no other significant
results. Internal contrasts revealed that the
main effect was largely due to Type A
individuals’ behavior. Type A’s improved in their
performance in the presence of a distractor,
t(15) = 2.09, p < .06, whereas B’s did not,
t < 1. Type A’s tended to perform better
than B’s did in the distractor group, t(17) =
1.95, p = .07. There were no other significant
effects, including those due to sex.⁷

⁶Analyses of the number of attempted Stroop
problems for Experiment 2 also revealed the same
pattern of results: Type A-B × Distractor-No
Distractor Group, F(1, 34) = 5.60. There were
no significant effects in percentage of errors.

Postexperimental questionnaire. Women
rated the tape of sounds as more distracting
on a 7-point scale than did men, F(1, 15),
p < .02. There were no other differences
across types, distractor conditions, or sex of
subject for ratings of task difficulty, quality
of performance, and loudness, stressfulness,
and pleasantness of the distractor.
The results of Experiment 3 replicate
previous research that distracting sounds aid
performance on the Stroop Color Naming
Task. Taken together, the results of
Experiments 2 and 3 reveal that this effect is
primarily due to the Type A’s. Apparently Type
A’s are able to inhibit their attention to the
sounds. By doing so, they can actively ignore
task-irrelevant cues and their performance
improves.
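
The pattern across Experiments 2 and 3 can be summarized descriptively from the cell means in Table 2. The sketch below computes, for each experiment, the gain each type shows with a distractor and the difference between those gains (an unweighted interaction contrast). It is an illustration only: the names `TABLE2`, `distractor_gain`, and `interaction_contrast` are hypothetical, and the contrast ignores the unequal cell sizes, so it does not reproduce the reported F or t tests.

```python
# Cell means from Table 2: number of Stroop problems completed correctly.
TABLE2 = {
    2: {"A": {"distractor": 113.0, "none": 93.5},
        "B": {"distractor": 81.7, "none": 97.5}},
    3: {"A": {"distractor": 125.1, "none": 101.5},
        "B": {"distractor": 104.2, "none": 95.2}},
}


def distractor_gain(cell: dict) -> float:
    """Mean with the distractor minus mean without it, for one type."""
    return cell["distractor"] - cell["none"]


def interaction_contrast(experiment: dict) -> float:
    """Type A gain minus Type B gain (unweighted cell means)."""
    return distractor_gain(experiment["A"]) - distractor_gain(experiment["B"])


for number, experiment in TABLE2.items():
    gain_a = distractor_gain(experiment["A"])
    gain_b = distractor_gain(experiment["B"])
    contrast = interaction_contrast(experiment)
    print(f"Experiment {number}: A gain {gain_a:.1f}, "
          f"B gain {gain_b:.1f}, contrast {contrast:.1f}")
```

In both experiments the Type A gain is positive and larger than the Type B gain; the Type B gain is negative in Experiment 2 and positive in Experiment 3, which matches the verbal description of the results.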




General Discussion 
The results of the three experiments are
consistent with the hypothesized differences
in the types’ policies for allocating attention
to the environment. It might be argued,
however, that the results were due to an
unfortunate selection of tasks such that Type A’s
simply are better on the Stroop Color Naming
Task and poorer on simple reaction time tasks
than Type B’s are. This explanation is
rendered much less plausible, though, by the
similar Stroop performances of A’s and B’s
without distraction in Experiments 2 and 3,
and the lack of consistent differences in the
simple reaction times of A’s and B’s in other
studies (Abrahams & Birren, 1973; Krantz &
[…]eson, Note 2; Spieth, 1965). Therefore,
performance differences of A’s and B’s on the
tasks probably reflect true differences in
allocation of Type A and Type B individuals’
attention to the environment in these labora- 
tory investigations. 

In summary, then, the present findings pro- 
vide the first systematic documentation of 
Type As’ policy for allocating attention to 
the environment. They focus their atten- 
tion on central environmental information; 
consequently, they attend less to peripheral 
aspects than Type B’s do. Moreover, Type
A’s can actively inhibit their attention to 
peripheral events that might distract them 
from performing well on a task. These find- 
ings verify anecdotal observations that Type 
A’s appear hyperalert but are less able than 
B’s to report task-irrelevant (peripheral) 
cues (e.g., Bortner & Rosenman, 1967).


Arousal and Type A Individuals’ Allocation 
of Attention 


It is documented that arousal leads to a 
focusing of attention and to attention to the 
most task-relevant or salient cues (Easter- 
brook, 1959). Also, accumulating evidence 
shows that Type A’s have greater increases in 
indices of sympathetic nervous system arousal 
than Type B’s do while performing challeng- 
ing tasks, although no differences between the 
types exist in resting levels (e.g., Dembroski, 
MacDougall, & Shields, 1977; Friedman, By- 
ers, Diamant, & Rosenman, 1975; Manuck, 
Craft, & Gold, 1978). It could be argued that 
if A’s in our studies were more aroused than 
B’s, they attended more to the salient aspects 
of the Stroop task—in this case, to the color 
of the stimulus word’s print—and ignored 
relatively irrelevant aspects, like the name of 
the Stroop stimulus word, the secondary task 
in Experiment 1, or the tape of sounds in Ex- 
periments 2 and 3. Thus, it is feasible that 
differences in arousal underlie the differences 
in the types’ policies of attention allocation. 


⁷Analyses of the number of attempted Stroop
problems for Experiment 3 also revealed the same
pattern of results: distractor-no distractor group,
F(1, 28) = 4.48, p < .05. For error rates there was a
significant interaction term, Distractor-No
Distractor Group × Sex of Subject, F(1, 28) = 5.11, p
< .05. Males had a larger percentage of errors in
the no distractor than in the distractor group,
whereas females showed the reverse pattern.
While this possibility is attractive in some
respects and may be true for a variety of sit- 
uations, it has a number of negative features 
as an explanation of the findings from Experi- 
ments 2 and 3. First, it is inconsistent with 
physiological evidence of cardiac deceleration 
during Stroop performance, which is associ- 
ated with deliberate pacing used by subjects 
to reduce errors and confusion (see Kahne- 
man, 1973). Second, it cannot explain why 
Type A’s (presumably aroused) do not out- 
perform Type B’s in the no distractor group 
in the latter two experiments unless one as- 
sumes that the Stroop task alone is insuffi- 
ciently challenging to create arousal. For the 
arousal hypothesis to fit the existing data, we 
must assume that Type A’s were aroused 
while executing the Stroop task only in the 
presence of distracting sounds. But this as- 
sumption is inconsistent with the absence of 
arousal (measured by pulse rate) in subjects 
working on the Stroop task while listening to 
78 dB(A) sounds (Houston & Jones, 1967). 
Presumably some of those subjects were Type 
A’s. Finally, discernible arousal generated in 
ways other than noise (e.g., threat of shock) 
has a detrimental effect on performance of 
the digit symbol test, a subtest of the Wech- 
sler Adult Intelligence Test (Carlson & Laza- 
rus, 1953; Goldfarb, 1961), in contrast to 
distracting sounds, which have a facilitating 
effect (Houston, 1968). Type A’s but not 
Type B’s show significant improvement in 
their performance of the digit symbol test in 
the presence of distracting sounds (Matthews, 
Note 3), just as was true of the present 
Stroop performance. For these reasons, we do 
not favor an arousal interpretation of Type 
As’ allocation of attention in Experiments 2 
and 3, although we do expect that Type A 
individuals’ sympathetic nervous system 
arousal affects their attentional style in other 
situations. In future research, therefore, we 
plan to examine the physiological correlates 
of Type As’ attentional style in a variety of 
settings. 


Implications for Previous Research 


The differences in the types’ allocation of 
attention to the environment provide a plausi- 
ble interpretation of Type As’ reactions to the 
salience of uncontrollable events (Glass, 
1977). Type A’s respond to highly salient
losses of environmental control by making
greater initial efforts than B’s do to reassert
control, followed by more extreme giving up
relative to B’s. When the uncontrollable
event is moderately salient, Type A’s actually
make less effort than Type B’s do to reassert
control and do not give up, whereas Type B’s
do. Glass (1977) suggests that because of
Type A persons’ need to avoid losses of
environmental control, they distort
uncontrollability cues if they are not highly salient, or
at least they do not effectively encode them.
However, contrary to this suggestion is the
absence of differences between A’s and B’s in
perceptions of uncontrollability (Glass, 1977).
On the other hand, the present data suggest
that Type A’s attend to the most salient
aspects of their environment and inhibit their
attention to any peripheral cue that might
deter them from an excellent performance.
This attentional interpretation of the salience
findings can explain Type As’ lack of
responsiveness to uncontrollable events of low
salience. It does not explain why Type B’s do not
give up following highly salient uncontrollable
events, given that they do give up
following moderately salient ones.
As stated earlier, our initial speculation
about the attentional styles of Type A’s was
derived from our attempt to explain A’s
failure to report symptoms during task
performance. Indeed, the results of the present
research are consistent with an attentional
explanation of this phenomenon. That is, to
the extent that symptoms are analogous to
distractors from task performance, the results
suggest that Type A’s can inhibit their
attention to symptoms. Tesser (e.g., […] &
Tesser, 1973) has shown that the greater the
amount of attention to an event or object,
the more extreme the evaluation of it. During
task performance, then, Type A’s may rate
their symptoms as less extreme than B’s do
because they attend to them less.
We have stated elsewhere (Weidner &
Matthews, 1978) that the term suppression
was inappropriate to describe Type As’ failure
to report symptoms because it implied that
we knew the mechanism underlying this
phenomenon. At that time, we had no evidence
why Type A’s might fail to report symptoms.
The present data, however, suggest that the
term suppression is appropriate to describe
Type As’ failure to report symptoms in the
limited sense that Type A’s suppress their
attention to symptoms. It is important to
emphasize that existing data do not suggest that
A’s have less sensory input of symptoms than
do B’s. That is, physiological indices of effort
and sympathetic arousal are often greater or
the same for A’s relative to B’s during task
performance (Carver et al., 1976; Weidner
& Matthews, 1978). It is also important to
note that Type As’ suppression of attention to
symptoms is not the same as denying or
distorting the truth in order to avoid admitting
the disruptive effects of symptoms on task
performance. Type A’s report as much
disruption of task performance from the distractor
in Experiments 2 and 3 and from symptoms
(Weidner & Matthews, 1978) as Type B’s do.
Rather, suppression appears to be a useful
strategy for maintaining and even
accelerating the efforts of Type A’s, certainly a
strategy Type A’s might seek out given their
needs to achieve (see Carver et al., 1976).


Relevance to Heart Disease 


We have observed elsewhere that failure of
Type A’s to report symptoms may contribute
to Pattern A risk of coronary artery and
heart disease in two ways (Weidner &
Matthews, 1978). First, Type A’s may delay
seeking medical attention when experiencing
early heart attack symptoms (Carver et al.,
1976). With increasing delays, prodromal
symptoms are more likely to result in a
full-blown heart attack and an ongoing
infarction becomes more severe. In fact, delay is in
itself a risk factor for heart disease (Insull,
1973). Second, Type A individuals’ failure to
report symptoms does not allow them to use
symptoms elicited by stress as cues to alter
their behavior to less stressful forms. To the
extent that the attentional style of Type A’s
accounts for their failure to report
symptoms, this style may contribute to Pattern As’
risk of heart disease. Furthermore, the
attentional style of Type A’s may also play an
at least as critical role in the etiology of
heart disease. Individuals who are described
as hyperalert and who exhibit a high level of
ability to focus their attention also show a
correlated increase in sympathetic nervous
system activation (Williams, Note 4). If such
activation were to occur repeatedly, it would 
subject the cardiovascular system to sub- 
stantial physical loads on a daily basis, which 
might result in eventual heart disease (Wil- 
liams, 1975). Indeed Williams (1975) has 
speculated that the perceptual and atten- 
tional aspects of Pattern A might provide 
the intervening link between Pattern A and 
coronary disease pathogenesis. The present 
research offers evidence that the above “coro- 
nary-prone” attentional style is consistent 
with Type A persons’ style of attending to 
their environment. Thus, the attentional style 
of Type A’s may serve them well in the sense 
of allowing them to focus their attention on 
work and to ignore annoying distractions, but 
it may also place them in triple jeopardy for 
coronary artery and heart disease in their 
later years. 


Organizational supervisors typically have at their
disposal a variety of powers over subordinates. Depending
on the nature of the organization, these powers may
include the power to grant raises, recommend promotion,
recommend transfer to a more fulfilling job assignment,
and even discharge subordinates for inadequate
performance. It is surely axiomatic that supervisory use
of power can substantially influence employee morale and
need fulfillment. One would expect that a supervisor would
promote need fulfillment among subordinates if he
allocated benefits in a manner that was closely contingent
on their quality of performance (Porter & Lawler, 1968).
Conversely, need fulfillment would probably diminish to
the extent that the ... recognized organizational
objectives.
An industrial simulation experiment was conducted with students earning a
master’s degree in business administration to determine the influence of the 
power motive on their use of power. Need for power was assessed by means 
of the Thematic Apperception Test measure devised by Winter. Students 
scoring among the top and bottom third of those taking the test then proceeded 
to act as “supervisor” in the industrial simulation by directing the labors of a 
work crew in the next room. One member of the crew, Man C, behaved as an 
ingratiator. Supervisors high and low on n Power evaluated Man C about the
same when he was neutral and noningratiating in his manner, but when Man C
was an ingratiator, supervisors high on the power motive evaluated his
performance more favorably than did low-scoring supervisors. High n Power
supervisors also perceived themselves as exerting greater influence on the work
group. These findings are seen as extending the heuristic value of the n Power
construct by demonstrating its relationships to use of power in an industrylike
situation.


The purpose of the present investigation 
was to examine the relation between the 
power motive and use of power by persons 
acting as supervisor in an industrial simula- 
tion experiment. Need for power was assessed 
by means of the Thematic Apperception Test 
(TAT) measure devised by David Winter 
(1973). The experimental design represented 
a variation of the paradigm developed in 
studies reported by Goodstadt and Kipnis 
(1970) and Kipnis and Vanderveer (1971). 

Studies that examined only situational in- 
fluences on use of power have yielded incon- 
sistent findings. Kipnis and Vanderveer 
(1971) found that persons acting as super- 
visor in an industrial simulation gave more 
pay increases and higher performance evalu- 
ations to an ingratiator than to a worker of 
similar performance who was not an ingrati- 
ator. A later experiment (Fodor, 1974) was 
unable to replicate these findings. Some studies
have suggested that the presence of a
problem worker increases the likelihood that 
a supervisor will regard with favor a compli- 
ant member of the work crew (Fodor, 1974; 
Goodstadt & Kipnis, 1970; Kipnis & Vander- 
veer, 1971). Another investigation obtained 
opposite findings (Fodor, 1976). 
An influence unexplored in these studies
was the variable of individual differences in 
motivation or in some dimension of person- 
ality. A likely candidate for consideration 
emerged with Winter’s publication of The 
Power Motive (1973). By need for power, 
Winter means a need to influence, persuade, 
or control others and to gain recognition or 
acclaim through these forms of behavior. He 
developed a measure of need for power that 
is based on the TAT much in the same man- 
ner as McClelland’s concept of achievement 
motivation (McClelland, 1958; McClelland 
& Winter, 1969). The Winter scoring system 
builds upon TAT criteria of need for power 
previously put forth by Veroff (1957) and 
Uleman (1966) and adds numerous refine- 
ments of its own. One method Winter used 
for selecting scoring categories was to ad- 
minister the TAT both before and after ex- 
posure to an event that was replete with 
power imagery (John Kennedy’s inaugural 
address) and then to note what categories of 
TAT imagery increased in frequency of usage. 
Winter completed a number of investigations 
in an attempt to establish the validity of his 
scoring system. Students who were or had 
been officers in student organizations scored 
higher on need for power than those who 
were not. Resident advisers obtained high 
scores. These students, it should be recog- 
nized, occupy positions of considerable in- 
fluence and respect in the university com- 
munity. They serve as academic and personal 
counselors for students, give advice, and act 
as the liaison between students and the ad- 
ministration. Athletes who engaged in com- 
petitive sports likewise scored high. 

A major problem associated with experi- 
mentally derived TAT measures of motives 
has been low test-retest reliability (Atkinson, 
1958). Winter and Stewart (1977) have pre- 
sented evidence, however, that low temporal 
stability may derive from an implicit demand 
for novel stories during the second test ad- 
ministration. The problem substantially di- 
minished when they instructed subjects at 
the retest either (a) not to worry about 
whether their stories were similar or different 
or (b) to try to write the same stories. Sub- 
jects instructed to write different stories pro- 
duced reliability only at chance levels. 


EUGENE M. FODOR AND DANA L. FARROW 


Winter’s conceptualization implies that in their quest for
recognition and acclaim, persons high in power motivation
would probably respond favorably to persons supplying them
with unswerving support and loyalty in the form of
ingratiation. The principal hypothesis of the present
experiment was that persons scoring high on need for power
would dispense greater benefits to an ingratiating
subordinate than would individuals low in power
motivation. Persons scoring high were also expected to
differentiate more broadly between two compliant but
noningratiating workers of comparable performance in the
pay increases and performance evaluations that they
awarded. The assumption guiding this prediction was that
those who are strongly dominated by power concerns take
satisfaction in the exercise of power for its own sake and
are apt to allocate benefits to subordinates in an
arbitrary and whimsical fashion. The two compliant but
noningratiating workers were equal in performance,
appearing to deserve equal pay increases and performance
appraisals. The expectation, however, was that supervisors
high on the power motive would allow personal preference
based on a subordinate’s voice quality or some remark to
take precedence over more objective performance criteria,
causing them to make sharp distinctions between two
seemingly equal subordinates. A further prediction was
made that high-scoring supervisors would “read” members of
the work crew as having been highly susceptible to
supervisory influence. They would report that they had
exerted greater influence on the work group than would
low-scoring supervisors.


Method 


Subjects 


Male students enrolled in the master’s in business
administration (MBA) program served as experimental
subjects. These students are preparing for careers as
business managers, and as a consequence of exposure to
organizational behavior courses, they have begun to
reflect on what style of leadership they should employ in
their ultimate career role. The requirement that they act
as supervisor in an industrial simulation, it was
therefore presumed, should have greater “experimental
realism” (…smith, 1969) than for the typical
Students were contacted by phone and were offered $2.50
to complete the TAT. They were told that they would be
paid a like sum should they later be asked to participate
in the industrial simulation. One hundred and twenty-two
MBA students completed the TAT, and those scoring in the
top and bottom third of the group on need for power were
invited to return for the industrial simulation. This
process was continued until a total of 80 students had
successfully performed in the experiment. Subjects were
assigned randomly to the two conditions.


The TAT Measure of Need for Power 


There is no standard set of pictures for use in measuring
power motivation, so eight pictures were selected that
connoted some degree of power imagery. One was a scene
from a football game, another was the photograph of a
young man speaking to a crowd through a bullhorn, and a
third depicted a well-dressed man passing a less
well-dressed younger man standing against a wooden fence.
Three of the pictures were taken from the set compiled by
McClelland, Davis, Kalin, and Wanner (1972).


for power imagery. The researcher first attempts to score
four sets of 30 stories, checking the accuracy of his or
her scoring against the standard after completion of each
set. Three final sets are then scored for the purpose of
obtaining an estimate of scoring proficiency. Winter
describes two indices of interrater reliability. One is
“category agreement,” which measures level of agreement
with the standard for presence or absence of power
imagery, and the second is a rho coefficient that assesses
correlation of ranks between one’s own scoring and the
standard. Winter recommends that the investigator regard a
coefficient of .85 on both reliability measures as a
prerequisite to research application of the ...

... measure, two different sets of four pictures were
presented to a class of 20 undergraduate students on each
of two occasions separated by a period of 2 weeks. Scores
for each story range up to 11, and a total score can be
computed by averaging scores across stories; so an overall
score can be obtained on the basis of only four of the
eight pictures. The Spearman-Brown formula yielded a
stability coefficient of .46 for this split-
remained within their third of the distribution across
administrations. The test would appear to have 
research value if interest were confined to the 
behavior of persons scoring at the top and bottom 
third of the distribution, that is, those distinctly 
high or low on need for power. 
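
The Spearman-Brown step mentioned in this section can be sketched in Python. This is only an illustration of the general prophecy formula, not a reconstruction of the authors' computation. The function name `spearman_brown` and the half-test correlation of .30 are assumptions chosen for the example; the source does not report the correlation between the two four-picture halves.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Project the reliability of a test lengthened by `length_factor`.

    r_half        -- observed correlation between two comparable part-tests
    length_factor -- how many times longer the full test is (2.0 for halves)
    """
    if not -1.0 < r_half <= 1.0:
        raise ValueError("r_half must lie in the interval (-1, 1]")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


if __name__ == "__main__":
    # Hypothetical half-test correlation, used only to show the arithmetic.
    illustrative_r = 0.30
    print(round(spearman_brown(illustrative_r), 2))
```

With `length_factor=1.0` the function returns the input unchanged, which is a quick sanity check on the formula.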


Procedure 


The experimenter informed subjects upon their 
arrival at the laboratory that the purpose of the 
industrial simulation was to study their managerial 
ability. They would, he told them, assume the role 
of supervisor and direct the work activities of three 
college men in the next room (who, unknown to 
the supervisor, would actually be nonexistent). The 
supervisor was further given to believe that indi- 
vidual workers had been assigned the task of as- 
sembling Tinker Toy models from pictured dia- 
grams. These workers were to assemble as many 
models per trial as they were able, and which 
model they were to put together would change on 
each of five successive 5-minute trials. Each trial 
was followed by a 5-minute rest period. Successive 
trials featured models that were progressively more 
difficult to construct: a pair of eyeglasses, a tire 
swing, a teeter-totter, a hand winch for cranking 
up rope, and a windmill. The supervisor was ex- 
pected to use every means at his disposal to foster 
high productivity in his men. These methods, the 
experimenter explained to him, might include words 
of encouragement, reprimands, promises of pay 
increase, and general advice on construction. He 
would speak to his workers by means of a one-way 
communication device comprised of a microphone 
and light signal for contacting individual workers. 
Communications allegedly coming from the various 
workers consisted of preprogrammed, play-acted 
tape recordings. The explanation presented to super- 
visors for not being permitted to see the workers 
was that the impressions they would acquire by 
seeing individual workers might bias them in their 
later evaluations. At the end of a trial the super- 
visor heard voiced comments from individual men, 
and then the experimenter brought in a count of 
each worker’s output. Before and during every 
trial the supervisor had in front of him a fully 
assembled illustration of the model on which the 
men were currently working and also a fabricated 
set of norms (based on actual pretesting). These 
norms were expressed as performance percentile 
scores for college students. The student supervisor 
understood that workers had been promised a base 
pay of $2 for their participation but that he as 
supervisor could administer a 25¢ pay increase or 
decrease at the end of a given trial. 

There were two conditions in the experiment, one 
with Man C acting as an ingratiator and the other 
with Man C assuming a neutral, noningratiating 
role. All three workers maintained roughly an 
average performance throughout both conditions. 
Number of words spoken per man was also equalized
across workers and experimental treatments.
Table 1
Mean Performance Evaluations, Pay Increases, and Liking Ratings Given the Ingratiating Worker

                       Performance            Pay                 Liking
Level of n Power   Ingratiator  Neutral   Ingratiator  Neutral   Ingratiator  Neutral
in supervisor                   worker                 worker                 worker

High
  M                   29.3a      22.1b       50.0       35.0        6.8        5.1
  SD                   4.2        4.8        19.9       27.4        1.7        1.8
Low
  M                   24.2b      23.0b       37.5       31.3        6.2        5.9
  SD                   4.9        3.9        26.3       24.2        1.9        2.0

Note. Means with differing subscripts differ significantly from each other, p < .01.
Twenty of the student supervisors assigned to the
ingratiator condition had scored in the top third of the
need for power distribution, and 20 had scored in the
bottom third. Of people assigned to the noningratiator
condition, 20 had likewise obtained high scores on the
power motive, and 20 were low. The ingratiator, Man C,
established his style of relating to the supervisor at the
end of Trial 2 by saying, “You know, I really like your
approach, I can see you’re gonna be a good supervisor.”
Trial 4 produced the remark, “Personally, I think you’re
doin’ a great job.” Man C completed Trial 5 on an equally
complimentary note by commenting, “You’ve done pretty
good.” All of these utterances were illustrative of the
form of ingratiation that Jones (1964) has designated as
“complimentary other-enhancement.”

Other communications by Man C and the remaining workers
were neutral both in content and in emotional tone. At the
conclusion of Trial 2, for example, Man A blandly
proclaimed, “Not much to say this time, I guess.” Man B
said, “Well, so much for this job. What’s next?”

When the supervisor had finished directing the work
activities of his crew, the experimenter led him to
another room and requested that he evaluate the
performance of each worker on four 9-point scales: (a)
worker’s ability, (b) worker’s overall worth to the
company, (c) willingness to rehire worker for a second
experiment, and (d) recommendation that the worker be
promoted to supervisor in a future experiment. For
purposes of analysis, these four scale scores were
summated to obtain a single overall score. The student
supervisor also completed ratings on how much he would
like the worker as a personal friend (liking score) and
how much he thought he had influenced the performance of
each (influence score).

The experimenter then closely interrogated the supervisor
to determine whether he believed there were no workers in
the next room. Six supervisors so stated and were dropped
from further data analysis. Eighty subjects remained, 20
in each of four groups as previously indicated. Another
function of the postexperimental interview was to explain
to subjects the details and underlying rationale of the
study and then to swear them to secrecy.


Results 


The first and major hypothesis of the study predicted that
student supervisors scoring high on the power motive would
bestow disproportionately high rewards upon an
ingratiator. The relevant data appear in Table 1 for three
different forms of reward—performance ratings, pay
increase, and liking. By all three forms of reward,
persons scoring high on the power motive were found to
confer higher rewards upon the ingratiator than were those
scoring low. Moreover, persons high on need for power
favored the ingratiator more than their counterparts in
the control condition benefited Man C who was not an
ingratiator. Since reward could take any of three forms,
the decision was made to perform a multivariate analysis
of variance and then, as recommended by Hummel and Sligo
(1971), to follow with univariate analyses of variance of
the three variables separately to trace the source of
multivariate effects. Overall, considering the three
dependent variables together, there was a main effect for
condition, F = 6.20, p < .001, but not for n Power, ns.
Supervisors distributed more of various forms of reward to
Man C who was an ingratiator than their counterparts of
the control group awarded to Man C who was not an
ingratiator. Supervisors high in power motivation did not
generally give greater rewards to Man C than did those low
in power motivation. A significant interaction effect was
obtained, however, F(1, 76) = 3.23, p < .05, indicating
that high-scoring supervisors dispensed greater reward to
Man C, who was an ingratiator, than did low-scoring
supervisors, whereas high- and low-scoring supervisors
treated Man C of the control condition, who was not an
ingratiator, about the same. Analysis of the three
dependent variables individually revealed that the
performance ratings alone accounted for most of the
multivariate effect just described. There was a main
effect for n Power, F(1, 76) = 4.32, p < .05, for
condition, F(1, 76) = 17.89, p < .001, and for the
interaction as well, F(1, 76) = 8.87, p < .01.
Specifically, there was a pronounced tendency for
supervisors high in need for power to evaluate very
favorably Man C who was an ingratiator. These same
supervisors did not evaluate Man C as favorably in the
control condition when he was not an ingratiator.
Supervisors low in power motivation completed roughly
similar performance appraisals on Man C regardless of
whether or not he was an ingratiator, and in neither case
did these appraisals approach those that high-scoring
subjects gave Man C when he was an ingratiator. There was
in addition a marginally significant main effect for
condition in relation to pay increase, F(1, 76) = 3.74,
p < .06, indicating a tendency for supervisors to award
greater pay increase to Man C when he was an ingratiator.
No other effects for pay increase or liking proved to be
significant.
Post hoc comparisons were made by means of Tukey’s
honestly significant differences (HSD) test, and Table 1
portrays those results. Student supervisors high on n
Power gave higher evaluations to the ingratiator than were
granted by supervisors to Man C in other conditions of the
experiment.

The interaction effect for performance appraisal is
illustrated in Figure 1. Supervisors high and low on n
Power evaluated Man C about the same when he was neutral
and noningratiating in his manner, but when Man C assumed
the role of ingratiator, supervisors high in power
motivation rated his performance substantially higher than
did those low in n Power.
[Figure 1 plot. Lines: high n Power, low n Power. Horizontal axis: neutral worker, ingratiator.]


Figure 1. Effects of n Power and ingratiation on 
performance evaluations by student supervisors. 
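
The interaction pattern reported for performance evaluations can be checked by hand from the Table 1 cell means. The sketch below is a reader's illustration in Python, not part of the authors' analysis: it computes the ingratiation effect within each n Power group and the difference between those two effects. The dictionary layout and function names are assumptions, and the low n Power ingratiator mean is read as 24.2 on the assumption that the trailing character in the scan is a subscript.

```python
# Mean performance evaluations of Man C, taken from Table 1.
# Assumption: the low n Power / ingratiator cell is read as 24.2.
PERFORMANCE_MEANS = {
    ("high", "ingratiator"): 29.3,
    ("high", "neutral"): 22.1,
    ("low", "ingratiator"): 24.2,
    ("low", "neutral"): 23.0,
}


def simple_effect(means: dict, power_level: str) -> float:
    """Ingratiator mean minus neutral-worker mean for one n Power group."""
    return means[(power_level, "ingratiator")] - means[(power_level, "neutral")]


def interaction_contrast(means: dict) -> float:
    """Difference between the two simple effects (high minus low n Power)."""
    return simple_effect(means, "high") - simple_effect(means, "low")


if __name__ == "__main__":
    for level in ("high", "low"):
        print(level, round(simple_effect(PERFORMANCE_MEANS, level), 1))
    print("interaction contrast", round(interaction_contrast(PERFORMANCE_MEANS), 1))
```

A contrast near zero would mean ingratiation raised evaluations equally in both groups; the descriptive contrast here says nothing about significance, which the reported F tests address.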


The second hypothesis was that high n 
Power supervisors would make greater dis- 
tinctions between Man A and Man B, the two 
neutral workers, in performance appraisal and 
total amounts of pay increase they awarded. 
Data were collapsed across the ingratiator 
and noningratiator conditions so that all 40 
supervisors high on the power motive were 
compared against the 40 who were low. Not 
only did the performance appraisal data fail 
to confirm the hypothesis, but they actually 
indicated a slight but nonsignificant trend in 
the direction contrary to prediction. Supervisors
high in n Power gave performance evaluations
to Man A and Man B that differed by a mean of
4.1, whereas the corresponding figure for
supervisors low on n Power was 5.7. For pay
increases, the figures were 13.1 and 12.5,
respectively, again a nonsignificant difference.

Perceived supervisory influence was the 
focus of the last hypothesis. A “total influ- 
ence score” was computed by tallying across 
workers figures circled to represent how re- 
sponsive supervisors believed the various 
workers were to the supervisor’s efforts to 
influence their performance. Individual ratings
ran from 9 (very responsive) to 1 (very
unresponsive); so scores denoting total per- 
ceived influence could theoretically vary be- 
tween 27 and 3. The mean for supervisors 
scoring high on need for power was 17.3, and 
the mean for those low on the power need 
was 15.9, F(1, 78) = 4.44, p < .05. Subjects 
high in n Power perceived themselves as 
especially influential toward Man C in the 
ingratiation condition, their mean influence 
rating being 7.3, as contrasted with a rating 
of 5.5 for high n Power supervisors in the 
noningratiation condition. Subjects low in the 
power motive averaged 6.45 in their ratings 
of Man C for both conditions. The F value
of greatest interest for data pertaining to
perceived influence toward Man C alone was the
interaction effect, F(1, 78) = 16.72,
p < .02.
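
The stated range of the total influence score follows from summing three ratings on a 1 to 9 scale. A minimal Python sketch of that tally is given here for illustration; the constant and function names are assumptions and do not come from the source.

```python
from typing import Sequence, Tuple

RATING_MIN = 1   # "very unresponsive"
RATING_MAX = 9   # "very responsive"
N_WORKERS = 3    # Man A, Man B, and Man C


def total_influence(ratings: Sequence[int]) -> int:
    """Sum one supervisor's perceived-influence ratings across the workers."""
    if len(ratings) != N_WORKERS:
        raise ValueError(f"expected {N_WORKERS} ratings, got {len(ratings)}")
    for rating in ratings:
        if not RATING_MIN <= rating <= RATING_MAX:
            raise ValueError(f"rating {rating} is outside the 1-9 scale")
    return sum(ratings)


def score_bounds() -> Tuple[int, int]:
    """Lowest and highest totals the tally can produce."""
    return N_WORKERS * RATING_MIN, N_WORKERS * RATING_MAX


if __name__ == "__main__":
    print(score_bounds())
    print(total_influence([7, 5, 6]))  # hypothetical ratings for one supervisor
```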


Discussion and Conclusions 


The major findings of the present experi- 
ment accord well with Winter’s conceptuali- 
zation of need for power. Ingratiation, even 
though it is not sincere, may provide the tar- 
get person with the overt appearance of 
recognition for exercise of power-oriented be- 
haviors, the quest for which, Winter believes, 
is integral to the personality of persons high 
in power motivation. These features were 
unmistakably present in the personality of 
Woodrow Wilson, one of the presidents whose 
inaugural address Winter scored as reflecting 
high power imagery. Consider the observations of Wilson’s
biographers, George and George (1956):

All of Wilson’s close friends—the men, the women, the
professors, the politicians, ... the socialites—shared ...
mutual affection. To him, ... him about a matter of ...
the friend no longer ...

... similar point ... Power-seeking managers “must
surround themselves with admirers, and admiration becomes
a drug for them. . .” (p. 199)
Horney (1937) has argued that the quest for power arises
from thwarted childhood needs for parental affection. As
viewed in this perspective, ingratiation may have brought
reassurance to power-motivated student supervisors that
they were worthy of respect or possibly even affection.

Belief by the supervisor high in n Power that he
influenced the behavior of subordinates may have served a
function similar to that provided by ingratiation from
subordinates. Both events create a cognitive awareness
that one has had an impact on others, that one has
elicited a response from them that is contingent upon
one’s own actions. It is that very awareness, according to
our foregoing theoretical analysis, which power-oriented
persons especially crave.

Student supervisors high on the power motive, it will be
recalled, did not reveal any significant tendency to grant
high pay increases to an ingratiating subordinate. The pay
increase measure probably was not highly sensitive to
differences in response from workers, because it was not
until the end of the session that Man C had thoroughly
established himself as an ingratiator in the critical
condition of the experiment. By that time much of the
opportunity for dispensing differential reward in the form
of pay increases had passed.

McClelland and Burnham (1976) found that managers high in
power motivation tended to foster organizational climates
that were maximally conducive to high work performance. It
should be noted, however, that McClelland and Burnham
regarded as essential to their analysis high n Power
scores accompanied by low scores on n Affiliation; so
their measure was not strictly comparable to the one
devised by Winter. Winter (1973) did find, however, that
persons aspiring to careers in management scored high in
n Power.

The McClelland and Burnham finding that managers high in
power motivation perform well in the sense of creating an
optimal organizational climate lends plausibility to the
argument that perhaps high n Power persons may be
susceptible to flattery because, on the basis of past
experience with their own performance, they have
discovered that they do make good decisions and perform
well at tasks they undertake. Thus the ingratiator may
more easily be believed, since the information he
transmits accords well with earlier evidence the high
n Power person has about himself.

An interesting parallel occurs between the present
conclusions concerning n Power and previous findings on
self-confidence. Goodstadt and Kipnis (1970) reported that
persons high in self-confidence, as contrasted with
persons low in self-confidence, exercised more influence
over subordinates, whereas Kipnis and Vanderveer (1971)
found persons high in self-confidence to be more
susceptible to the ingratiator’s flattery. What
correlation exists between n Power and self-confidence is
not known, although it is apparent that McClelland and
Burnham (1976) see a heightening of self-confidence as a
consequence likely to result from their training program
for promoting greater power motivation among managers.

These findings can be viewed as an illustration of the
joint interaction of personality and situation as
codeterminants of social behavior, especially when
contrasted with previous inquiries into use of power. One
is reminded of Kurt Lewin’s classic formulation,
B = f(P, E), which stated that behavior is some
interactive function of both person and environment.

The central message of the present investigation for those
interested in managerial training is clear. Persons best
equipped by ... and control others
Biased Assimilation and Attitude Polarization: The Effects of


Prior Theories on Subsequently Considered Evidence 


Charles G. Lord, Lee Ross, and Mark R. Lepper 
Stanford University 


People who hold strong opinions on complex social issues are likely to examine 
relevant empirical evidence in a biased manner. They are apt to accept “con- 
firming” evidence at face value while subjecting “disconfirming” evidence to 
critical evaluation, and as a result to draw undue support for their initial posi- 
tions from mixed or random empirical findings. Thus, the result of exposing 
contending factions in a social dispute to an identical body of relevant em- 
pirical evidence may be not a narrowing of disagreement but rather an in- 
crease in polarization. To test these assumptions and predictions, subjects 
supporting and opposing capital punishment were exposed to two purported 
studies, one seemingly confirming and one seemingly disconfirming their
existing beliefs about the deterrent efficacy of the death penalty. As
predicted, both proponents and opponents of capital punishment rated those
results and procedures that confirmed their own beliefs to be the more
convincing and probative ones, and they reported corresponding shifts in
their beliefs as the various results and procedures were presented. The net
effect of such evaluations and opinion shifts was the postulated increase in
attitude polarization.


The human understanding when it has once adopted 
an opinion draws all things else to support and agree 
with it. And though there be a greater number and 
weight of instances to be found on the other side, 
yet these it either neglects and despises, or else by 
some distinction sets aside and rejects, in order that 
by this great and pernicious predetermination the 
authority of its former conclusion may remain 
inviolate. (Bacon, 1620/1960) 


Often, more often than we care to admit, 
our attitudes on important social issues reflect 
only our preconceptions, vague impressions, 
and untested assumptions. We respond to 
social policies concerning compensatory education, water
fluoridation, or energy conservation in terms of the
symbols or metaphors they evoke (Abelson, 1976; Kinder &
Kiewiet, Note 1) or in conformity with views expressed by
opinion leaders we like or respect (Katz, 1957). When
“evidence” is brought to bear it is apt to be incomplete,
biased, and of marginal probative value—typically, no more
than a couple of vivid, concrete, but dubiously
representative instances or cases (cf. Abelson, 1972;
Nisbett & Ross, in press). It is unsurprising, therefore,
that important social issues and policies generally prompt
sharp disagreements, even among highly concerned and
intelligent citizens, and that such disagreements often
survive strenuous attempts at resolution through
discussion and persuasion.

An interesting question, and one that prompts the present
research, involves the consequences of introducing the
opposing factions to relevant and objective data. This
question seems particularly pertinent for contemporary
social scientists, who have frequently called for “more
empirically based” social decision making (e.g., Campbell,
1969). Very likely, data providing consistent and
unequivocal support for one or another position
on a given issue can influence decision
makers and, with sufficiently energetic dis- 
semination, public opinion at large. But what 
effects can be expected for more mixed or 
inconclusive evidence of the sort that is 
bound to arise for most complex social issues, 
especially where full-fledged experiments 
yielding decisive and easy-to-generalize results 
are a rarity? Logically, one might expect 
mixed evidence to produce some moderation 
in the views expressed by opposing factions. 
At worst, one might expect such inconclusive 
evidence to be ignored. 

The present study examines a rather different thesis—one
born in an analysis of the layperson’s general
shortcomings as an intuitive scientist (cf. Nisbett &
Ross, in press; Ross, 1977) and his more specific
shortcomings in adjusting unwarranted beliefs in the light
of empirical challenges (cf. Ross, Lepper, & Hubbard,
1975). Our thesis is that belief polarization will
increase, rather than decrease or remain unchanged, when
mixed or inconclusive findings are assimilated by
proponents of opposite viewpoints. This “polarization
hypothesis” can be derived from the simple assumption that
data relevant to a belief are not processed impartially.
Instead, judgments about the validity, reliability,
relevance, and sometimes even the meaning of proffered
evidence are biased by the apparent consistency of that
evidence with the perceiver’s theories and expectations.
Thus individuals will dismiss and discount empirical
evidence that contradicts their initial views but will
derive support from evidence, of no greater probativeness,
that seems consistent with their views. Through such
biased assimilation even a random set of outcomes or
events can appear to lend support for an entrenched
position, and both sides in a given debate can have their
positions bolstered by the same set of data.

As the introductory quotation suggests, the notions of biased
assimilation and resulting belief perseverance have a long history.
Beyond philosophical speculations and a wealth of anecdotal evidence,
considerable research attests to the capacity of preconceptions and
initial theories to bias the consideration of subsequent evidence,
including work
on classic Einstellung effects (Luchins, 1942, 
1957), social influence processes (Asch, 
1946), impression formation (e.g., Jones & 
Goethals, 1971), recognition of degraded 
stimuli (Bruner & Potter, 1964), resistance 
to change of social attitudes and stereotypes 
(Abelson, 1959; Allport, 1954), self-fulfilling 
prophecies (Merton, 1948; Rosenhan, 1973; 
Snyder, Tanke, & Berscheid, 1977), and the 
persistence of “illusory correlations” (Chap- 
man & Chapman, 1967, 1969). In a partic- 
ularly relevant recent demonstration, Ma- 
honey (1977) has shown that trained social 
scientists are not immune to theory-based 
evaluations. In this study, professional re- 
viewers’ judgments about experimental pro- 
cedures and resultant publication recom- 
mendations varied dramatically with the 
degree to which the findings of a study under 
review agreed or disagreed with the reviewers’ 
own theoretical predilections. 

Thus, there is considerable evidence that 
people tend to interpret subsequent evidence 
so as to maintain their initial beliefs. The 
biased assimilation processes underlying this 
effect may include a propensity to remember 
the strengths of confirming evidence but the 
weaknesses of disconfirming evidence, to judge 
confirming evidence as relevant and reliable 
but disconfirming evidence as irrelevant and 
unreliable, and to accept confirming evidence 
at face value while scrutinizing disconfirming 
evidence hypercritically. With confirming evidence,
we suspect that both lay and professional
scientists rapidly reduce the complexity
of the information and remember only a few 
well-chosen supportive impressions. With dis- 
confirming evidence, they continue to reflect 
upon any information that suggests less dam- 
aging “alternative interpretations.” Indeed, 
they may even come to regard the ambiguities 
and conceptual flaws in the data opposing 
their hypotheses as somehow suggestive of 
the fundamental correctness of those hypotheses.
Thus, completely inconsistent or even random
data, when “processed” in a suitably biased
fashion, can maintain or even reinforce
one’s preconceptions.

The present study was designed to examine 
both the biased assimilation processes that
may occur when subjects with strong initial
attitudes are confronted with empirical data 
concerning a controversial social issue and the 
consequent polarization of attitudes hypoth- 
esized to result when subjects with differing 
initial attitudes are exposed to a common set 
of “mixed” experimental results. The social 
controversy chosen for our investigation was 
the issue of capital punishment and its effec- 
tiveness as a deterrent to murder. This choice 
was made primarily because the issue is the 
subject of strongly held views that frequently 
do become the target of public education and 
media persuasion attempts, and has been the 
focus of considerable social science research 
in the last twenty years. Indeed, as our basic 
hypothesis suggests, contending factions in 
this debate often cite and derive encourage- 
ment from the same body of inconclusive 
correlational research (Furman v. Georgia, 
1972; Sarat & Vidmar, 1976; Sellin, 1967). 

In the present experiment, we presented 
both proponents and opponents of capital 
punishment first with the results and then 
with procedural details, critiques, and rebut- 
tals for two studies dealing with the deterrent 
efficacy of the death penalty—one study con- 
firming their initial beliefs and one study dis- 
confirming their initial beliefs. We anticipated 
biased assimilation at every stage of this pro- 
cedure. First, we expected subjects to rate 
the quality and probative value of studies 
confirming their beliefs on deterrent efficacy 
more highly than studies challenging their 
beliefs. Second, we anticipated corresponding 
effects on subjects’ attitudes and beliefs such 
that studies confirming subjects’ views would 
exert a greater impact than studies discon- 
firming those views. Finally, as a function of 
these assimilative biases, we hypothesized that 
the net result of exposure to the conflicting 
results of these two studies would be an 
increased polarization of subjects’ beliefs on 
deterrent efficacy and attitudes towards 
capital punishment. 


Method 


Subjects 


A total of 151 undergraduates completed an in-class 
questionnaire that included three items on capital 
punishment. Two to four weeks later, 48 of these
students were recruited to participate in a related
experiment as partial fulfillment of a course requirement.
Twenty-four were “proponents” who favored
capital punishment, believed it to have a deterrent
effect, and thought most of the relevant research
supported their own beliefs. Twenty-four were “opponents”
who opposed capital punishment, doubted
its deterrent effect, and thought that the relevant
research supported their views.


Procedure 


Upon entering the experiment, mixed groups of
proponents and opponents were seated at a large
table. The experimenter, blind to subjects’ attitudes,
told them that they would each be asked to read
2 of 20 randomly selected studies on the deterrent
efficacy of the death penalty and asked them to use
their own “evaluative powers” in thinking about
what the author(s) of the study did, what the critics
had to say, and whether the research provided support
for one side or the other of this issue.

The experimenter next showed subjects a set of
10 index cards, each containing a brief statement of
the results of a single study. Each subject was asked
to choose one card and read it silently. In reality,
all 10 cards in any one session were identical,
providing either prodeterrent information, for example:


Kroner and Phillips (1977) compared murder rates
for the year before and the year after adoption of
capital punishment in 14 states. In 11 of the 14
states, murder rates were lower after adoption of
the death penalty. This research supports the
deterrent effect of the death penalty.


or antideterrent information, for example:


Palmer and Crandall (1977) compared murder
rates in 10 pairs of neighboring states with different
capital punishment laws. In 8 of the 10 pairs,
murder rates were higher in the state with capital
punishment. This research opposes the deterrent
effect of the death penalty.


To control for order effects, half of the proponents and half of the
opponents saw a “prodeterrence” result first, and half saw an
“antideterrence” result first. The studies cited, although invented
specifically for the present study, were characteristic of research
found in the current literature cited in judicial decisions.

After reading one of these “studies,” subjects answered two sets of
questions, on 16-point scales, about changes in their attitudes toward
capital punishment (from -8 = more opposed, to 8 = more in favor) and
their beliefs about the deterrent efficacy of the death penalty (from
-8 = less belief that capital punishment has a deterrent effect, to 8 =
more belief in the deterrent effect). One set of questions examined
change occasioned by the single piece of information they had just
finished reading; a second set of questions assessed the cumulative
change produced by all of the materials read since the start of the
experiment.¹

Next the experimenter distributed detailed research 
descriptions bearing code letters corresponding to 
those on the result cards. The descriptions gave 
details of the researchers’ procedure, reiterated the 
results, mentioned several prominent criticisms of 
the study “in the literature,” listed the authors’ 
rebuttals to some of the criticisms and depicted the 
data both in table form and graphically. After read- 
ing this more detailed description and critique of the 
first study, subjects were asked to judge how well
or poorly the study had been conducted (from -8 =
very poorly done, to 8 = very well done), and how
convincing the study seemed as evidence on the
deterrent efficacy of capital punishment (from -8 =
completely unconvincing, to 8 = completely convincing).²
Following this evaluation, subjects were asked
to write why they thought the study they had just
read did or did not support the argument that capital
punishment is a deterrent to murder, and then to
answer a second set of attitude and belief change
questions on the effects of the description alone and
the effects of all experimental materials (i.e., the
results and subsequent description and critique) up
to that point in time.

Following completion of these questions, the entire 
procedure was repeated, with a second fictitious study 
reporting results opposite to those of the first. Again, 
subjects initially received only a brief description of 
the results of this second study but were then pro- 
vided with a detailed presentation of the procedure, 
results, and critiques. As before, subjects were asked 
to evaluate both the impact of each single piece of 
evidence and the impact of all experimental materials
up to that point in the experiment on their attitudes
toward capital punishment and their beliefs concern- 
ing its deterrent efficacy. 
To control for possible differences in the inherent plausibility of the
two studies, two sets of materials were employed that interchanged the
ostensible results of the two invented experiments. The overall design
was thus completely counterbalanced with respect to subjects’ initial
attitudes, order of confirming vs. disconfirming evidence, and the
association of the “before-after” vs. “adjacent states” designs with
positive or negative results. At the end of the procedure, subjects were
carefully debriefed concerning the fictitious nature of the studies and
were asked not to reveal this deception to others. In addition to
insuring that subjects understood the fictional nature of the
experimental materials, this debriefing included discussion of the
processes underlying the assimilation of evidence to previous theories
and reassurance that a skeptical reaction to poorly designed research is
often a praiseworthy cognitive response.
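
The counterbalancing described above can be sketched as a small enumeration. This is only an illustrative sketch: the factor labels are hypothetical names chosen here, and the equal split of the 48 subjects across cells is an assumption, since the text reports only that the design was completely counterbalanced.

```python
from itertools import product

# Factor levels taken from the Method section; the label strings are
# hypothetical names introduced for this sketch.
INITIAL_ATTITUDES = ("proponent", "opponent")
FIRST_STUDY = ("prodeterrence first", "antideterrence first")
MATERIAL_SETS = (
    "before-after = prodeterrence, adjacent-states = antideterrence",
    "before-after = antideterrence, adjacent-states = prodeterrence",
)

N_SUBJECTS = 48  # 24 proponents + 24 opponents, as reported


def design_cells():
    """Return every combination of the three counterbalanced factors."""
    return list(product(INITIAL_ATTITUDES, FIRST_STUDY, MATERIAL_SETS))


def subjects_per_cell(n_subjects=N_SUBJECTS):
    """Cell size under the ASSUMPTION of equal allocation across cells."""
    cells = design_cells()
    if n_subjects % len(cells) != 0:
        raise ValueError("subjects cannot be split evenly across cells")
    return n_subjects // len(cells)


if __name__ == "__main__":
    for cell in design_cells():
        print(cell)
    print("cells:", len(design_cells()), "per cell:", subjects_per_cell())
```

Enumerating the cells makes it easy to see that every pairing of research design and result direction occurs for both attitude groups and both presentation orders.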


Results


Evaluations of the Two Studies


Our first hypothesis was that subjects holding
different initial positions would differentially
evaluate the quality and “convincingness”
of the same empirical studies and findings.
The relevant evaluations, presented in
Table 1, revealed strong support for the 
hypothesized bias in favor of the study that 
confirmed subjects’ initial attitudes. 
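
As a quick arithmetic check on Table 1, the sketch below recomputes each "Difference" row as the prodeterrence mean minus the antideterrence mean. The function names are hypothetical, and two opponent cells (-0.3 and 0.1) are values inferred here from the reported differences because they are hard to read in the scanned table; treat them as assumptions.

```python
import math

# (prodeterrence mean, antideterrence mean, reported difference).
# The opponents' antideterrence means (-0.3 and 0.1) are inferred from the
# reported differences and are an assumption of this sketch.
TABLE_1 = {
    ("well conducted", "proponents"): (1.5, -1.6, 3.1),
    ("well conducted", "opponents"): (-2.1, -0.3, -1.8),
    ("convincing", "proponents"): (1.4, -1.8, 3.2),
    ("convincing", "opponents"): (-2.1, 0.1, -2.2),
}


def difference(pro, anti):
    """Prodeterrence rating minus antideterrence rating, to one decimal."""
    return round(pro - anti, 1)


def check_table_1(table=TABLE_1):
    """Map each table cell to True if the reported difference is reproduced."""
    return {
        key: math.isclose(difference(pro, anti), reported, abs_tol=1e-9)
        for key, (pro, anti, reported) in table.items()
    }


if __name__ == "__main__":
    for key, ok in check_table_1().items():
        print(key, "consistent" if ok else "inconsistent")
```

A positive difference means the prodeterrence study was rated more favorably; a negative difference means the antideterrence study was.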

A two-way analysis of variance (Initial 
Attitude × Order of Presentation) on the
differences between ratings of convincingness 
of the prodeterrence and antideterrence stud- 
ies yielded only a main effect of initial atti- 
tude, F(1, 44) = 32.07, p < .001. Proponents 


¹ Since most of our subjects had reported initial
positions at, or very close to, the ends of the 
attitude and belief scales used for selection pur- 
poses, our initial plan to assess attitude polariza- 
tion—in terms of difference scores assessing changes 
in subjects’ attitudes and beliefs on these same 
scales from these initial measures to the completion 
of the experiment—proved impossible. As a substi- 
tute, we employed three sorts of measures to assess 
attitude change. First, we asked subjects, after each 
new piece of information, to indicate any changes 
in their attitudes and beliefs occasioned by that 
single piece of information. Second, we asked sub- 
jects, at these same points, to report on “cumula- 
tive” changes in their attitudes and beliefs since 
the start of the experiment. Third, subjects were 
asked to keep “running records” of their attitudes 
and beliefs on enlarged versions of the scales ini- 
tially used for selection purposes. Although all of 
these measures individually raise some problems, 
the congruence of data across these different mea- 
surement devices gives us some confidence con- 
cerning the results reported. Indeed, because the 
results obtained on the “running record” measure 
so completely parallel the findings obtained on the 
cumulative change question depicted in Figures 1 
and 2, in terms of both the array of means and the 
obtained significance levels, the data from this 
measure will not be reported separately. 

² Subjects were also asked, at this point, whether
they thought the researchers had favored or op- 
posed the death penalty and whether they thought 
an unbiased consideration should lead one to treat 
the study as evidence for or against capital punish- 
ment. Analyses on the first question showed only 
that subjects believed the researchers’ attitudes to 
coincide with their stated results. Analyses on the 
second question proved wholly redundant with 
those presented for the “convincingness” and “well 
done” questions. 

³ Preliminary analyses were conducted to see if
the particular association of positive versus negative
results with either the before-after or adjacent-states
designs would affect the results obtained.
There were no significant effects or interactions
involving this variation in stimulus materials;
hence, the data were collapsed across this factor.


Table 1

Evaluations of Prodeterrence and Antideterrence Studies by Proponents
and Opponents of Capital Punishment

Study                 Proponents    Opponents

Mean ratings of how well the two studies had been conducted

Prodeterrence             1.5          -2.1
Antideterrence           -1.6          -0.3
Difference                3.1          -1.8

Mean ratings of how convincing the two studies were as evidence on the
deterrent efficacy of capital punishment

Prodeterrence             1.4          -2.1
Antideterrence           -1.8           0.1
Difference                3.2          -2.2

Note. Positive numbers indicate a positive evaluation of the study’s
convincingness or procedure. Negative numbers indicate a negative
evaluation of the study’s convincingness or procedure.


regarded the prodeterrence study as significantly more convincing than
the antideterrence study, t(23) = 5.18, p < .001,⁴ regardless of whether
it was the “before-after” design that suggested the efficacy of capital
punishment and the “adjacent states” design that refuted it, or vice
versa. Opponents, by contrast, regarded the prodeterrence study as
significantly less convincing than the antideterrence study, t(23) =
-3.02, p < .01, again irrespective of which research design was
purported to have produced which type of results. The same was true of
the difference between ratings of how well done the two studies had
been, F(1, 44) = 33.52, p < .001.⁵ As above, proponents found the
prodeterrence study to have been better conducted than the
antideterrence study, t(23) = 5.37, p < .001, whereas opponents found
the prodeterrence study to have been less well conducted, t(23) = -2.80,
p < .05. As one might expect, the correlation between the
“convincingness” and “well done” questions was substantial, r = .67,
p < .001.
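
The reported significance levels can be checked against the t statistics with a short calculation. What follows is a minimal sketch, not the authors' analysis: it assumes two-tailed tests, as the article states for all p values, and integrates the Student t density numerically with Simpson's rule. The function names and the step count are choices made here for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value via Simpson's rule on [0, |t|]; steps must be even."""
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric, so both tails together hold 1 - 2 * area.
    return max(0.0, 1.0 - 2.0 * area_zero_to_t)


if __name__ == "__main__":
    for t_value in (5.18, -3.02, 5.37, -2.80):
        print(t_value, two_tailed_p(t_value, 23))
```

With 23 degrees of freedom, the four t values in this paragraph fall under the thresholds reported for them (.001, .01, .001, and .05 respectively).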

These differing opinions of the quality of 
the two studies were also reflected in subjects’ 
written comments. At the risk of opening ourselves
to a charge of “biased assimilation,”
we present a set of subjects’ comments,
selected for dramatic effect but not unrepresentative
in content, in Table 2. As these
comments make clear, the same study can
elicit entirely opposite evaluations from
people who hold different initial beliefs about
a complex social issue. This evidence of bias 
in subjects’ evaluations of the quality and 
convincingness of the two studies is consistent 
with the biased assimilation hypothesis and 
sets the stage for testing our further predic- 
tions concerning attitude and belief polariza- 
tion. 


Overall Attitude Polarization 


Given such biased evaluations, our primary
hypothesis was that exposure to the “mixed”⁶
data set comprised by the two studies would
result in a further polarization of subjects’
attitudes and beliefs rather than the convergence
that an impartial consideration of
these inconclusive data might warrant. To
test this hypothesis requires a consideration


⁴ All p values reported in this article are based on
two-tailed tests of significance.

⁵ In order to examine possible main effects of either study direction or
initial attitude on subjects’ ratings of how convincing and how well
done the studies were (findings that would not be portrayed in the
difference score analysis), a three-way analysis of variance (Initial
Attitude × Order of Presentation × Direction of Study) was also
performed. There were no main effects of study direction on either
measure. A main effect of initial attitude, indicating that opponents
evaluated the total set of evidence more negatively than proponents,
proved significant for the “well done” question, F(1,44) = 4.69,
p < .05, but not for the “convincing” question, F(1,44) = 1.53, ns.

⁶ The term mixed, we should emphasize, refers to the fact that one study
yielded evidence confirming the deterrent efficacy of the death penalty,
whereas the other study yielded evidence disconfirming such efficacy
(with appropriate counterbalancing of purported procedures and purported
results). Subjects, regardless of initial position, clearly recognized
this discrepancy between results, as will be apparent in our analyses of
their responses to the statements of the study’s main findings. We do
not mean to imply that the subjects “phenomenologically” judged the two
studies to be of equal probative value; indeed, as indicated in the
preceding discussion, identical procedures were clearly judged to differ
in their probativeness depending on the congruity between the study’s
outcomes and the subject’s initial beliefs.


Table 2

Selected Comments on Prodeterrence and Antideterrence Studies by
Proponents and Opponents of Capital Punishment

Set 1 materials

S8 Proponent
  Prodeterrence study: “It does support capital punishment in that it
  presents facts showing that there is a deterrent effect and seems to
  have gathered data properly.”
  Antideterrence study: “The evidence given is relatively meaningless
  without data about how the overall crime rate went up in those years.”

S24 Proponent
  Prodeterrence study: “The experiment was well thought out, the data
  collected was valid, and they were able to come up with responses to
  all criticisms.”
  Antideterrence study: “There were too many flaws in the picking of the
  states and too many variables involved in the experiment as a whole to
  change my opinion.”

S…5 Opponent
  Prodeterrence study: “The study was taken only 1 year before and 1
  year after capital punishment was reinstated. To be a more effective
  study they should have taken data from at least 10 years before and as
  many years as possible after.”
  Antideterrence study: “The states were chosen at random, so the
  results show the average effect capital punishment has across the
  nation. The fact that 8 out of 10 states show a rise in murders stands
  as good evidence.”

S36 Opponent
  Prodeterrence study: “I don’t feel such a straightforward conclusion
  can be made from the data collected.”
  Antideterrence study: “There aren’t as many uncontrolled variables in
  this experiment as in the other one, so I’m still willing to believe
  the conclusion made.”

Set 2 materials

Proponent
  Prodeterrence study: “It shows a good direct comparison between
  contrasting death penalty effectiveness. Using neighboring states
  helps to make the experiment more accurate by using similar
  locations.”
  Antideterrence study: “I don’t think they have complete enough
  collection of data. Also, as suggested, the murder rates should be
  expressed as percentages, not as straight figures.”

S15 Proponent
  Prodeterrence study: “It seems that the researchers studied a
  carefully selected group of states and that they were careful in
  interpreting their results.”
  Antideterrence study: “The research didn’t cover a long enough period
  of time to prove that capital punishment is not a deterrent to
  murder.”

S…5 Opponent
  Prodeterrence study: “The data presented are a randomly drawn set of
  10. This fact seems to be the study’s biggest problem. Also many other
  factors are not accounted for which are very important to the nature
  of the results.”
  Antideterrence study: “The murder rates climbed in all but two of the
  states after new laws were passed and no strong evidence to contradict
  the researchers has been presented.”

Opponent
  Prodeterrence study: “There might be very different circumstances
  between the sets of two states, even though they were sharing a
  border.”
  Antideterrence study: “These tests were comparing the same state to
  itself, so I feel it could be a fairly good measure.”


of subjects’ final attitudes, after exposure to both studies and related
critiques and rebuttals, relative to the start of the experiment.

The relevant data provide strong support for the polarization
hypothesis. Asked for their final attitudes relative to the experiment’s
start, proponents reported that they were more in favor of capital
punishment, t(23) = 5.07, p < .001, whereas opponents reported that they
were less in favor of capital punishment, t(23) = -3.34, p < .01. In a
two-way analysis of variance (Initial Attitude × Order of Presentation),
the effect of initial attitude was highly significant, F(1, 44) = 30.06,
p < .001, and neither the order effect nor the interaction approached
significance. Similar results characterized subjects’ beliefs about
deterrent efficacy. Proponents reported greater belief in the deterrent
effect of capital punishment, t(23) = 4.26, p < .001, whereas opponents
reported less belief in this deterrent effect, t(23) = -3.79, p < .001.
Final attitudes toward capital punishment and beliefs concerning
deterrent efficacy were highly correlated, r = .88, p < .001.


Table 3

Mean Attitude and Belief Changes for a Single Piece of Information

                                Initial attitudes
Issue and study             Proponents    Opponents

Results only

Capital punishment
  Prodeterrence                 1.3           0.4
  Antideterrence               -0.7          -0.9
  Combined                      0.6          -0.5
Deterrent efficacy
  Prodeterrence                 1.9           0.7
  Antideterrence               -0.9          -1.6
  Combined                      1.0          -0.9

Details, data, critiques, rebuttals

Capital punishment
  Prodeterrence                 0.8          -0.9
  Antideterrence                0.7          -0.8
  Combined                      1.5          -1.7
Deterrent efficacy
  Prodeterrence                 0.7          -1.0
  Antideterrence                0.7          -0.8
  Combined                      1.4          -1.8

Note. Positive numbers indicate a more positive attitude or belief about
capital punishment and its deterrent effect. Negative numbers indicate a
more negative attitude or belief about capital punishment and its
deterrent effect.
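
The "Combined" rows of Table 3 appear to be the sum of the prodeterrence and antideterrence changes for each group. The sketch below checks that reading against the tabled means; the dictionary layout and function names are hypothetical, and the summation rule itself is an inference from the numbers rather than something the article states.

```python
import math

# Each entry: (prodeterrence change, antideterrence change, reported combined).
# Keys are (kind of information, issue, group); values copied from Table 3.
TABLE_3 = {
    ("results", "capital punishment", "proponents"): (1.3, -0.7, 0.6),
    ("results", "capital punishment", "opponents"): (0.4, -0.9, -0.5),
    ("results", "deterrent efficacy", "proponents"): (1.9, -0.9, 1.0),
    ("results", "deterrent efficacy", "opponents"): (0.7, -1.6, -0.9),
    ("details", "capital punishment", "proponents"): (0.8, 0.7, 1.5),
    ("details", "capital punishment", "opponents"): (-0.9, -0.8, -1.7),
    ("details", "deterrent efficacy", "proponents"): (0.7, 0.7, 1.4),
    ("details", "deterrent efficacy", "opponents"): (-1.0, -0.8, -1.8),
}


def combined(pro, anti):
    """Sum of the two single-study changes, rounded to one decimal."""
    return round(pro + anti, 1)


def combined_matches_reported(table=TABLE_3):
    """Map each cell to True if pro + anti reproduces the reported value."""
    return {
        key: math.isclose(combined(pro, anti), reported, abs_tol=1e-9)
        for key, (pro, anti, reported) in table.items()
    }


def combined_signs_follow_attitude(table=TABLE_3):
    """True if every proponent total is positive and every opponent total negative."""
    return all(
        (reported > 0) if key[2] == "proponents" else (reported < 0)
        for key, (_, _, reported) in table.items()
    )


if __name__ == "__main__":
    print(all(combined_matches_reported().values()))
    print(combined_signs_follow_attitude())
```

Under this reading, every combined change moves in the direction of the group's initial attitude, which is the pattern the polarization hypothesis predicts.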

Such results provide strong support for the main experimental hypothesis
that inconclusive or mixed data will lead to increased polarization
rather than to uncertainty and moderation. Moreover, the degree of
polarization shown by individual subjects was predicted by differences
in subjects’ willingness to be less critical of procedures yielding
supportive evidence than of procedures yielding nonsupportive evidence.
Significant correlations were found between overall attitude change
regarding capital punishment and differences in ratings of both how
convincing, r = .56, p < .001, and how well done, r = .56, p < .001, the
studies were. Overall changes in beliefs in deterrent efficacy produced
comparable correlations of .53 and .57, both ps < .001.


Components of Attitude Polarization 


In view of this strong evidence of overall attitude polarization, it is
worth examining the course of attitude polarization as subjects’
opinions were successively assessed after exposure to the first study,
the details and critiques of the first study, the results of the second
study, and the details and critiques of the second study. At each stage,
it will be recalled, subjects were asked about the impact of the single
piece of information they had just considered and the cumulative impact
of all information presented to that point. Let us first examine the
reported effects of single segments of evidence and then the effects of
accumulated evidence over time.


Effect of Exposure to the Results of 
Each Study 


Considering the result cards as single pieces of evidence, both
proponents and opponents reported shifting their attitudes in the
direction of the stated results for both the prodeterrence, t(47) =
4.67, p < .001, and antideterrence, t(47) = -5.15, p < .001, studies. As
shown in the top half of Table 3, however, subjects’ responses to the
two studies also varied with initial attitude. Proponents tended to be
influenced more by the prodeterrence study and opponents more by the
antideterrence study. Thus a two-way analysis of variance (Initial
Attitude × Order of Presentation) on combined change from the two result
cards considered individually yielded only a main effect of initial
attitude for both attitudes toward the death penalty, F(1, 44) = 6.35,
p < .02,⁷ and beliefs about its deterrent effect, F(1, 44) = 10.37,
p < .01. Interestingly, the analysis of beliefs regarding deterrent
efficacy also showed an unanticipated interaction effect, F(1, 44) =
7.48, p < .01, with proponents showing a differential response to
results alone regardless of order of presentation but opponents showing
a differential response to results alone only when the confirming study
was presented first.


Effect of Exposure to Procedures and Data,
Critiques and Rebuttals


When provided with a more detailed description of the procedures and
data, together with relevant critiques and authors’ rebuttals, subjects
seemed to ignore the stated results of the study. As shown in the bottom
half of Table 3, both proponents and opponents interpreted the
additional information, relative to the results alone, as strongly
supporting their own initial attitudes. Detailed descriptions of either
the prodeterrence or the antideterrence study, with accompanying
critiques, caused proponents to favor capital punishment more and
believe in its deterrent efficacy more, but caused opponents to oppose
capital punishment more and believe in its deterrent efficacy less. A
two-way analysis of variance (Initial Attitude × Order of Presentation)
on attitude change for the two descriptions combined yielded only a
significant main effect of initial attitude for both the capital
punishment issue, F(1, 44) = …, p < .001, and the deterrent efficacy
question, F(1, 44) = 26.93, p < .001.


Changes in Attitudes Across Time


Subjects’ reported changes in attitudes and beliefs, relative to the
start of the experiment, following exposure to each of the four separate
pieces of information are depicted in Figure 1 for attitudes concerning
capital punishment and in Figure 2 for beliefs concerning deterrent
efficacy. These data, portrayed separately for subjects who received
first either the prodeterrence study or the antideterrence study,
provide a more detailed view of the attitude polarization process. They
allow, as well, an examination of the hypothesized “rebound effect,”
that the provision of any plausible reason for discounting data that
contradict one’s preconceptions will eliminate the effects that mere
knowledge of those data may have produced.

The existence of such a “rebound effect” is 
obvious from examination of these figures. 
Whether they encountered the disconfirming 
result first or second, both proponents and 
opponents seemed to be swayed momentarily 
by this evidence, only to revert to their 
former attitudes and beliefs (and in 23% of 
the individual cases, to even more extreme 
positions) after inspecting the procedural details and data, and the
critiques and rebuttals found in the literature. Across all subjects,
this rebound in opinions proved significant for both the capital
punishment, t(47) = 4.43, p < .001, and deterrent efficacy, t(47) =
4.58, p < .001, issues. By contrast, no compensating rebound effects
resulted from reading the descriptions and critiques of studies
supporting subjects’ initial attitudes, for either capital punishment,
t(47) = .60, ns, or deterrent efficacy, t(47) = .23, ns.


Discussion 


The results of the present experiment pro- 
vide strong and consistent support for the 
attitude polarization hypothesis and for the 
biased assimilation mechanisms postulated to 
underlie such polarization. The net effect of
exposing proponents and opponents of capital 
punishment to identical evidence—studies 
ostensibly offering equivalent levels of sup- 
port and disconfirmation—was to increase 
further the gap between their views. The 
mechanisms responsible for this polarization
of subjects’ attitudes and beliefs were clearly 


⁷ In order to rule out the possibility that direction
of study interacted with initial attitude, a
three-way analysis of variance (Initial Attitude ×
Order of Presentation × Direction of Study) was
also performed on these data. The relevant interaction
term did not approach significance, F(1,44) =
1.62, ns.


[Figure 1 graphs omitted. Vertical axis: attitude change, from more
opposed to capital punishment, through no change, to more in favor of
capital punishment. Horizontal axis: time, covering the results and then
the details of each study in order of presentation.]

Figure 1. Top panel: Attitude changes on capital punishment relative to start of experiment as 
reported across time by subjects who received prodeterrence study first. Bottom panel: Attitude 
changes on capital punishment relative to start of experiment as reported across time by subjects 
who received antideterrence study first. 


suggested by correlational analyses. Subjects’ decisions about whether
to accept a study’s findings at face value or to search for flaws and
entertain alternative interpretations seemed to depend far less on the
particular procedure employed than on whether the study’s results
coincided with their existing beliefs.


The Normative Issue


It is worth commenting explicitly about the normative status of our
subjects’ apparent biases. First, there can be no real quarrel with a
willingness to infer that studies supporting one’s theory-based
expectations are more probative than, or methodologically superior to,
studies that contradict one’s expectations. When an “objective truth” is
known or strongly assumed, then studies whose outcomes reflect that
truth may reasonably be given greater credence than studies whose
outcomes fail to reflect that truth. Hence the physicist would be
“biased,” but appropriately so, if a new procedure for evaluating the
speed of light were accepted if it gave the “right answer” but rejected
if it gave the “wrong answer.” The same bias leads most of us to be
skeptical about reports of miraculous virgin births or herbal cures for
cancer, and despite the risk that such theory-based and experience-based
skepticism may render us unable to recognize a miraculous event when it
occurs, overall we are surely well served by our bias. Our subjects’
willingness to impugn or defend findings as a function of their
conformity to expectations can, in part, be similarly defended. Only the
strength of their initial convictions in the face of the existing
inconclusive social data and arguments can be regarded as “suspect.”

Our subjects’ main inferential shortcoming, 
in other words, did not lie in their inclination 
to process evidence in a biased manner. Will- 
ingness to interpret new evidence in the light 
of past knowledge and experience is essential 
for any organism to make sense of, and 
respond adaptively to, its environment. 
Rather, their sin lay in their readiness to use 
evidence already processed in a biased man- 
ner to bolster the very theory or belief that 
initially “justified” the processing bias. In 
so doing, subjects exposed themselves to the 
familiar risk of making their hypotheses unfalsifiable 
(a serious risk in a domain where 
it is clear that at least one party in a dispute 
holds a false hypothesis) and allowing themselves 
to be encouraged by patterns of data 
that they ought to have found troubling. 
Through such processes laypeople and professional 
scientists alike find it all too easy 
to cling to impressions, beliefs, and theories 
that have ceased to be compatible with the 
latest and best evidence available (Mahoney, 
1976, 1977). 


[Figure 2 graphic not reproduced. Curves are shown separately for proponents and opponents. Vertical axis in both panels: reported belief change, from “more disbelief that capital punishment has deterrent effect” through “no change” to “more belief that capital punishment has deterrent effect.” Horizontal axis: time, marked at the results and then the details of each study.] 

Figure 2. Top panel: Belief changes on capital punishment’s deterrent efficacy relative to start of 
experiment as reported across time by subjects who received prodeterrence study first. Bottom 
panel: Belief changes on capital punishment’s deterrent efficacy relative to start of experiment 
as reported across time by subjects who received antideterrence study first. 


Polarization: Real or Merely Reported? 


Before further pursuing the broader impli- 
cations of the present demonstration, it is 
necessary to consider an important question 
raised by our procedure: Did our subjects 
really show change (i.e., polarization) in their 
private beliefs about the desirability and 
deterrent efficacy of capital punishment? Certainly 
they told us, explicitly, that their 
attitudes and beliefs did change after each 
new piece of evidence was presented, and 
from the beginning to the end of the experi- 
ment. Moreover, they did show a willingness 
to report a shift in their attitudes in the direc- 
tion of findings that were contrary to their 
beliefs, at least until those findings were 
exposed to methodological scrutiny and pos- 
sible alternative interpretations. Nevertheless, 
it could be argued that subjects were not 
reporting real shifts in attitudes but instead 
were merely reporting what they believed to 
be a rational or appropriate response to each 
increment in the available evidence. Although 
we believe that it remains an impressive 
demonstration of assimilation biases to show 
that contending factions both believe the same 
data to justify their position “objectively,” 
the potential limitations of the present mea- 
sures should be kept in mind in evaluating the 
relationship of this study to prior polarization 
research. As noted earlier (see Footnote 1), 
our intended strategy of assessing direct 
changes from our initial selection measures of 
attitudes and beliefs, rather than asking sub- 
jects to report such changes within the experi- 
ment, was neither feasible nor appropriate, 
given the necessity of selecting subjects with 
strong and consistent initial views on this 
issue. Potentially such methodological prob- 
lems could be overcome in subsequent re- 
search through the use of less extreme samples 
or, perhaps more convincingly, by seeing 
whether biased assimilation of mixed evidence 
will make subjects more willing to act on 
their already extreme beliefs. 


Belief Perseverance and Attribution Processes 


The present results importantly extend 
the growing body of research on the persever- 
ance of impressions and beliefs. Two of the 
present authors and their colleagues have now 
amassed a number of studies showing that, 
once formed, impressions about the self 
(Ross et al., 1975; Jennings, Lepper, & Ross, 
Note 2; Lepper, Ross, & Lau, Note 3), beliefs 
about other people (Ross et al., 1975), or 
theories about functional relationships be- 
tween variables (Anderson, Lepper, & Ross, 
Note 4) can survive the total discrediting of 
the evidence that first gave rise to such 
beliefs. In essence, these prior studies demon- 
strate that beliefs can survive the complete 
subtraction of the critical formative evidence 
on which they were initially based. In a com- 
plementary fashion, the present study shows 
that strongly entrenched beliefs can also sur- 
vive the addition of nonsupportive evidence. 

These findings pose some fundamental ques- 
tions for traditional attribution models. To 
the extent that beliefs and impressions can be 
shown to persevere in the face of subsequent 
challenging data, we need a “top down” 
rather than—or perhaps in conjunction with 
—a “bottom up” approach (cf. Bobrow & 
Norman, 1975) to the question of how indi- 
viduals extract meaning from their social 
environment. Instead of viewing people as 
impartial, data-driven processors, the present 
research suggests our models must take into 
account the ways in which intuitive scientists 
assess the relevance, reliability, representativeness, 
and implications of any given sample of 
data or behavior within the framework of the 
hypotheses or implicit theories they bring 
to the situation (Lepper, 1977). In everyday 
life, as well as in the course of scientific 
controversies (cf. Kuhn, 1970), the mere 
availability of contradictory evidence rarely 
seems sufficient to cause us to abandon our 
prior beliefs or theories. 


Social Science Research and Social Policy 


We conclude this article, as we began it, 
by considering the important links between 
social policy, public attitudes and beliefs 
about such policy, and the role of the social 
scientist. If our study demonstrates anything, 
it surely demonstrates that social scientists 
can not expect rationality, enlightenment, 
and consensus about policy to emerge from 
their attempts to furnish “objective” data 
about burning social issues. If people of opposing 
views can each find support for those 
views in the same body of evidence, it is small 
wonder that social science research, dealing 
with complex and emotional social issues and 
forced to rely upon inconclusive designs, 
measures, and modes of analysis, will frequently 
fuel rather than calm the fires of 
debate. 
Sex Differences in Bystander Intervention in a Theft 


William Austin 
University of Virginia 


A series of studies demonstrated a strong relationship among the 
situation-defining variable of degree of harm to victim, sexual configuration among 
participants, and bystanders’ willingness to intervene to stop a theft. A pretest 
showed that a prior verbal commitment was absolutely necessary for intervention. 
The remaining data showed that high harm to a victim produced a high 
rate of intervention and showed strong sex differences in helping behavior in 
low-harm conditions. A high percentage of female bystanders helped in both 
low- and high-harm situations, whereas frequent helping by males was observed 
only when harm to the victim was high. Female victims elicited a 
significantly greater amount of helping, and sex of thief had no effect. A significant 
sex of bystander, sex of victim, and harm to victim interaction best 
describes the data. Results are interpreted in terms of different motivational 
sets held by males and females when they are responsible for the fate of 
others. Results also support the utility of an interactionist approach to the 
question of how individual and situational variables influence prosocial action. 


In recent years social psychologists have 
generated a voluminous amount of research 
on the topic of helping behavior in an attempt 
to understand the dynamics of prosocial 
action. Two general and interrelated questions 
seem to underlie most studies. The first question 
posed by researchers points to the 
situation-defining properties of variables. It 
asks what personal and situational factors 
encourage people to define a situation as one 
in which help is needed or absolutely required 
(e.g., an emergency). The second question is 
behavioral and asks what factors facilitate or 
inhibit helping once a situation is defined as 
one in which help is needed or required. 

Most variables studied by researchers are 
pertinent to answering both questions, which 
taken together constitute a unitary process 
of prosocial action. For example, the ambiguity 
of situational cues such as a victim’s distress 
(Clark & Word, 1974), number of 
bystanders present (Latané & Darley, 1970), 
and degree of victim need (Ashton & Severy, 
1976) has been shown to directly affect the 
definition of a situation as an emergency and 
to indirectly activate helping behavior. 

Both questions also find their way into all 
of the popular models of helping behavior, 
including the decision-making models of 
Latané and Darley (1970) and Schwartz 
(1977), the cost model offered by J. A. 
Piliavin and Piliavin (1972), and Staub’s 
(1978) Person × Situation interactive approach. 
In addition, these models introduce 
variables and concepts designed to explain 
the intervening processes that stand at the 
interface between the two questions. 

The two general concepts of responsibility 
and cost predominate as the main explanatory 
tools in these models. Both concepts have 
multiple dimensions of meaning and have 
been used in very different ways by 
researchers. 

Responsibility has been given several 
meanings in the helping literature. Researchers 
have described (a) the diffusion of responsibility 
among multiple observers of an 
emergency (Latané & Darley, 1970), (b) 
personality differences in individuals’ willing- 
ness to assume responsibility for the fate of 
others (Schwartz, 1970), and (c) the role of 
responsibility in social norms that prescribe 
that help be given to people in need (Berkowitz 
& Daniels, 1964; Walster & Piliavin, 
1972). These separate usages of the concept 
of responsibility complement one another; 
each is appropriate for different research 
questions. It is clear, however, that some 
aspect of responsibility must be taken into 
account in describing any prosocial act. 
Responsibility is the sine qua non of helping. 
Every helping action requires the observer 
to role play and see the situation from the 
victim’s perspective. In fact, the labeling of 
a particular action as helpful means not only 
that one person has made another’s plight 
more pleasant or less painful but, by definition, 
that the first person has assumed responsibility 
for the fate of the second. 

The concept of cost signifies a behavioral 
orientation. J. A. Piliavin and Piliavin (1972) 
presented a parsimonious cost model of helping 
behavior. This model consists of a 2 × 2 
matrix of potential helping situations, with 
costs for helping matched against costs for 
not helping. Costs for helping that would 
naturally discourage bystanders from intervening 
might be time, risk of physical injury, 
and being rebuked for not “minding one’s own 
business.” Costs for not helping can be 
divided into the risks of external sanctions, 
such as social ostracism, and the risks of 
internal sanctions, typically described as 
guilt or self-blame. 

If one assumes that most of the cost factors 
relevant to a helping situation can be measured 
or reliably guessed, the Piliavin model 
makes straightforward predictions in two of 
the four cells. When costs for helping are low 
and costs for not helping are high, then a 
high rate of helping is expected. Conversely, 
when costs for helping are high and costs 
for not helping are low, then prosocial acts 
are likely less frequent. However, prediction 
in the remaining two situational 
conditions in the Piliavin model (high/high, 
low/low) is problematic. 
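
The four cells of the cost matrix described above can be sketched in a few lines of Python. This is only an illustrative reading of the verbal model, not code or notation taken from J. A. Piliavin and Piliavin (1972): the function name `predicted_helping`, the two-level "low"/"high" coding, and the returned labels are all assumptions introduced here.

```python
from itertools import product

LEVELS = ("low", "high")  # assumed two-level coding of each cost dimension


def predicted_helping(cost_helping: str, cost_not_helping: str) -> str:
    """Return the qualitative prediction for one cell of the 2 x 2 cost matrix."""
    if cost_helping not in LEVELS or cost_not_helping not in LEVELS:
        raise ValueError("each cost must be 'low' or 'high'")
    if cost_helping == "low" and cost_not_helping == "high":
        return "high rate of helping"
    if cost_helping == "high" and cost_not_helping == "low":
        return "low rate of helping"
    # high/high and low/low: the model gives no straightforward prediction
    return "no clear prediction"


if __name__ == "__main__":
    # Print all four cells: cost for helping / cost for not helping
    for helping, not_helping in product(LEVELS, repeat=2):
        cell = predicted_helping(helping, not_helping)
        print(f"{helping:>4} / {not_helping:<4} -> {cell}")
```

The high/high cell, which returns no clear prediction in this sketch, is the one the present research sets out to examine.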
The high/high situation is particularly 
interesting to researchers because it poses an 
interpersonal paradox, a “Catch-22,” in which 
there is no satisfying way to resolve what is 
a classic case of avoidance-avoidance tension. 
J. A. Piliavin and Piliavin (1972) speculated 
that individual differences might be 
more influential under such circumstances, 
which could mean personality differences 
relevant to prosocial action (Schwartz, 1977), 
sex differences, or Person × Situation interactions 
such that individual characteristics 
may alter responses as a function of the relative 
strength of certain situation-defining 
variables (Staub, 1978). 

This study began with the assumption that 
both responsibility and cost are necessary 
explanatory concepts in most helping situa- 
tions. The research reported here was designed 
to test the effect of one situation-defining 
variable (degree of harm to a victim) and to 
explore possible sex differences under condi- 
tions that on one hand are conducive to 
helping (prior commitment obtained and 
person alone), yet pose the high/high cost 
dilemma on the other. 

The type of helping behavior called for was 
intervention to stop a theft. Degree of harm 
to victim was manipulated by varying the 
value of the items stolen. Sex differences were 
studied by varying the sex of the bystander, 
thief, and victim. The experimental procedure 
first attempted to create pressures so that 
bystanders would have to make a considerable 
effort to deny personal responsibility for 
intervening. Subjects were alone so as to rule 
out diffusion of responsibility (Latané & 
Darley, 1970) and were engaged in casual 
activity (usually reading) so as to eliminate 
lack of helping due to competing commit- 
ments. In addition, each bystander subject 
was required to make a verbal commitment 
to watch the stolen items in the victim’s 
absence. This setting of a clear obligation 
seems to create the high/high cost dilemma 
when coupled with the realities involved in 
stopping a thief. There are costs for not help- 
ing (incurring the victim’s wrath, guilt for 
being complacent) and anticipated costs for 
helping (risk of physical injury, embarrass- 
ment, or incurring the wrath of a wrongfully 
accused thief). I thus assumed the setting 
was ripe for investigating whether a clear 
situation-defining variable such as degree of 
harm, bystander/thief/victim sexual config- 
uration, or both factors in combination would 
play a significant role in bystanders’ attempts 
to resolve the high/high cost dilemma under 
conditions of high personal responsibility. 


Predictions 


1. I predicted a significant positive rela- 
tionship between degree of harm to a victim 
and frequency of helping. I reasoned that 
greater harm would enhance the salience both 
of the bystander’s personal responsibility and 
of the potential costs for not helping. The 
concept of degree of harm should not be con- 
fused with the similar but distinct concepts 
of need and dependency, which have both 
been shown to be correlated with helping (see 
Berkowitz & Daniels, 1963; J. A. Piliavin & 
Piliavin, 1972). Degree of harm concerns 
specific consequences for not helping, whereas 
need seems to address the probability of 
negative consequences if help is not forth- 
coming, and dependency concerns the degree 
of responsibility for intervening resulting 
from an existing relationship between victim 
and bystander. By my procedure I attempted 
to create the impression that negative con- 
sequences would certainly befall the victim 
while holding need and dependency constant. 

The effect of degree of harm on helping 
under conditions of personal responsibility and 
high cost has not been studied previously. 
One study (Schwartz, 1970) varied degree of 
harm (or salience of consequences) in a study 
of willingness to donate bone marrow. 
Schwartz reported a curvilinear relationship 
between consequences and helping, but his 
manipulation of intensity of appeal to make 
a commitment and probability of the donor 
actually being called makes it difficult to 
compare his research with the present 
research. 

2. This study was conceived as exploratory 
with respect to possible sex differences and 
Sex × Harm to Victim interactions. Although 
I was confident the high personal responsibil- 
ity and high/high cost setting was best suited 
for discovering sex differences, the lack of a 
sound theory of sex differences in moral 
behavior and equivocal research findings precluded 
my stating a priori propositions (see 
Staub, 1978). Though researchers have reported 
some sex differences in prosocial behavior, 
the literature is far from consistent 
on the topic. Further difficulty stems from 
the usual problems in making between-studies 
comparisons of results in light of the different 
variables and operations used. In particular, 
researchers have not fully examined the 
potential effect of the sexual configuration of 
all of the participants. Numerous studies exist 
in which the sex of the bystander, the victim, 
or the wrongdoer has been studied individually, 
but only a few researchers have simultaneously 
varied the sex of more than one participant, 
creating a situation in which very little 
is known about the role of sexual configuration 
in helping. To my knowledge, prior to 
the research reported here, no one has orthogonally 
varied the sex of bystander, wrongdoer, 
and victim in a single study. Despite 
[illegible] 
in this area I was hopeful that insights could 
be gleaned from previous studies and that 
these possibilities might combine with the 
operational setting employed in this research 
to shed new light on sex differences in prosocial 
action. 

There is reason to expect sex differences in 
helping in the high/high cost situation created 
here. A previous study (Austin & McGinn, 
1977) reported sex differences in reward allocation 
behavior and significant differences 
[illegible] 
personal harmony factors as more important. 
There is some justification for generalizing 
from reward allocation to helping. One 
could argue that both situations involve questions 
associated with treating another 
fairly and what outcomes he or she should 
receive (see Walster & Piliavin, 1972). Accordingly, 
in the present research, females 
might be expected to show an overall higher 
rate of helping because of a greater sensitivity 
to interpersonal harmony vis-a-vis the victim 
(i.e., costs for not helping). If this assumption 
is correct, then females might be willing to 
risk instigating conflict with a presumed thief 
because of a concern with harmony, whereas 
males might be more willing to risk harmony 
to avoid costs involved with intervention. 

Some support for this conjecture can be 
found in helping studies. Most encouraging 
are studies that purport to show greater helping 
by females because of their heightened 
affective sensitivity, including empathy 
(Kilham & Mann, 1974) and guilt (Wallington, 
1973). These findings are backed by 
Schwartz and Clausen’s (1970) finding that 
females were more likely to intervene in an 
epileptic seizure when they were alone and by 
Bleda, Bleda, Byrne, and White (1976), who 
reported that females alone were more likely 
to report cheating when an identifiable victim 
was hurt by the dishonesty. Unfortunately, 
these studies shed little light on the role of 
sexual configuration. For example, Austin and 
McGinn (1977) studied only same-sex relationships, 
and Bleda et al. varied sex of bystander 
and wrongdoer, but not sex of victim. 

In addition to this qualification, many other 
studies have shown either no sex differences 
or greater helping by males. Latané and 
Darley (1970), Moriarity (1975), and Shaffer, 
Rogel, and Hendrick (1975) reported no 
sex differences in the willingness of bystanders 
to intervene in a theft. I. M. Piliavin, Rodin, 
and Piliavin (1969) reported that females 
were less likely to aid a person who collapsed 
in a subway in a group setting, and Gelfand, 
Hartmann, Walder, and Page (1973) found 
males more likely to report shoplifting behavior. 
However, once again, these studies 
did not fully study the sexual configuration, 
and operational settings varied considerably. 
For example, Moriarity manipulated sex 
of [illegible], but victim/thief configuration was 
always cross-sex and sex of bystander was not 
taken into account; Shaffer et al. manipulated 
sex of victim and thief together, but the 
configuration was always cross-sex for all 
participants. 

On the basis of previous research, predicting 
a Sexual Configuration × Degree of Harm 
interaction is also problematic. My initial 
thought that females would be victim oriented 
might suggest a high rate of helping regard- 
less of degree of harm and might suggest that 
a majority of males would help only under 
substantial harm to the victim (because high 
harm might imply higher costs for not help- 
ing). These propositions are consistent with 
the speculations of Gelfand et al. (1973) that 
females are less inclined to report shoplifting 
because the victim is an impersonal business 
establishment, which thus does not activate 
their heightened affective sensitivity. Similar 
findings have been reported by Bleda et al. 
(1976), who found higher prosocial action by 
females only when there was an identifiable 
victim, and by Schopler and Bateson (1965), 
who explained a somewhat higher rate of 
female helping in terms of an apparent greater 
concern by males with their own outcomes. 

This tentative prediction of greater overall 
helping by females, which should be most 
apparent when the degree of harm to a victim 
is modest, must again be held in abeyance 
because the supporting research did not 
examine sexual configurations. 

Prior to conducting a field experiment that 
fully examined the effect of degree of harm 
and sexual configuration among participants, 
one pretest and two pilot studies were con- 
ducted. The results of these preliminary data- 
gathering efforts are briefly described below. 
All studies used identical procedures. Only the 
experimental designs differed. 


General Procedure for All Studies 


All subjects were college students at a large 
southern university. Subjects in the pretest, 
Pilot 1, and the field experiment attended the 
same university; subjects in Pilot 2 attended 
a different university in the same state. 


1 I. M. Piliavin et al. (1969) explained their results 
in terms of fewer costs for females for not helping 
because of their role. These researchers conjectured 
that the female role prescribes less initiative in 
emergencies and absolves women of the worst of 
public censure for not helping. This explanation is 
interesting in that it demonstrates the close relationship 
between the concepts of cost and responsibility 
in helping situations: Not helping is less costly to 
females because they are not responsible; they are 
not responsible because of their role (see Hamilton, 
1978, on the importance of role in attributions of 
responsibility). Unfortunately, the data reported 
subsequently are exactly opposite to these conjectures. 

Subjects were located by selecting college 
students who were relaxing (usually reading) 
and seated by themselves in a corridor of a 
large university classroom building. One ex- 
perimenter served as the potential victim and 
approached the bystander subject. He or she 
first gained a personal commitment from the 
bystander to watch his or her personal belong- 
ings. The experimenter paused directly in 
front of the bystander and asked, “Do you 
mind watching my things? I’ll be back in a 
couple of minutes.” All subjects consented in 
all studies. The experimenter then set several 
items next to and slightly forward of the by- 
stander. This made it almost impossible for 
the subject not to notice what the items were 
and the theft that was about to take place. 

The first experimenter (i.e., the victim) 
then moved off to a distant location to record 
the subject’s response to the forthcoming 
crime. (Only a few subjects watched the first 
experimenter leave.) A second experimenter, 
who served as the thief, then casually ap- 
proached the bystander and paused to 
examine the articles placed next to him or 
her. The thief passed directly in front of the 
subject and stopped for approximately 10 
sec (as if to convey indecision as to his or her 
impending deed) before picking up the item. 
The theft was designed to be conspicuous. 

The dependent variable was percentage of 
subjects in each condition who intervened to 
stop the theft. After intervening, or failing 
to intervene, subjects were told they had just 
been in an experiment. All subjects were 
unaware that this was the case. Subjects were 
then orally administered an open-ended ques- 
tionnaire. The questions differed slightly for 
subjects who helped and those who failed to 
help. The questions assessed whether subjects 
noticed the theft, when they first knew a 
theft was going to occur, their first reaction 
to it, why they did not stop the thief, why 
the commitment to safeguard the item was 
not honored, and how they felt about the 
experimenter. The purpose of the study was 
then explained to subjects, and permission 
to use their data was obtained. 
Pretest: Effect of Prior Commitment 


In planning this research I first looked at 
the findings reported by Moriarity (1975) 
and Shaffer et al. (1975) on the importance 
of prior commitment for bystanders who 
observe a theft. Using different settings 
(public beach, Laundromat, or library) but 
stolen items of comparable value (portable 
radio or watch), Moriarity reported an intervention 
rate of 90%-100% for committed 
bystanders and 0%-20% for the uncommitted; 
Shaffer et al. reported a comparable 
70% and 10% for the committed and 
uncommitted, respectively. 

These polarized rates of intervention as a 
function of commitment impressed me so 
much that I first attempted to confirm 
whether they were a reliable baseline for 
reactions to a theft. Because I hold a rather 
humanistic view of people I wanted to take 
another look at the reactions of the uncommitted 
bystanders to see if they were really 
so passive in the face of a crime. By using the 
procedure described above, 20 bystander subjects 
(10 male, 10 female) were located and 
exposed to the theft of an electronic calculator, 
an item of considerable value (high 
harm to victim) and comparable with the 
items stolen in the Moriarity (1975) and 
Shaffer et al. (1975) studies. No prior commitment 
was obtained from these subjects. 
The victim was always female and the thief 
male. The victim stopped directly in front of 
the bystander and set a folder full of papers 
and a calculator down next to the bystander 
and then left, glancing back at her belongings. 

The results were dramatic. Not a single 
uncommitted bystander intervened to stop 
the theft. I interpreted these pretest data as 
corroborating previous research. Commitment 
appears to be a necessary condition for prosocial 
action in response to the type of theft 
situation created here. Commitment, however, 
may not be necessary in all theft situations. 


Pilot Study 1 

This study was conducted as a project 
for an undergraduate course. It allowed 
me to perfect the experimental procedure and 
provided a means for comparing the average 
rate of helping of the subject population 
with data gathered in other theft studies. 
The item stolen was a calculator. The 
thief was always male (M) and the victim 
female (F). The sex of the bystander was 
varied (32 males, 36 females), making the 
bystander/thief/victim configuration M/M/F 
for half the subjects and F/M/F for the 
other half. Size of the thief was also varied. 
One half of the bystanders witnessed an 
extremely large male, and half viewed an 
average-sized male. This variable was designed 
to manipulate cost for helping. 

I expected a Sex of Bystander × Size of 
Thief interaction in a 2 × 2 design. I assumed 
that risk of a physical confrontation was 
greater for males accusing other males of 
stealing. Thus, I reasoned fewer males would 
help when the thief was huge. My predictions 
were not supported. In a chi-square analysis 
no significant effects were found.² The rate 
of intervention was extremely high, ranging 
from 64.7% to 70.6%, and was fairly uniform. 
Females showed a slightly higher rate 
of helping, but it did not approach significance. 

These null results suggested several empirical 
possibilities. First, when an initial commitment 
to help exists and the harm to the 
victim is substantial, then bystanders focus 
on costs for not helping, ignore the risk of 
physical danger for helping (e.g., size of 
thief), and show a high rate of helping. 
Perhaps if the risks of physical danger were 
more apparent, such as the presence of a gun, 
one could expect different results. However, 
the similarity of the rate of helping found 
here to the rates found in other theft studies 
(Latané & Darley, 1970; Moriarity, 1975; 
Shaffer et al., 1975) suggests that the present 
findings are reliable, though constrained 
by the limits of the degree of 
physical danger that was apparent to the 
bystanders. Second, no sex differences were 
apparent. 


Pilot Study 2 


This study broke further ground for the 
field experiment. In a 3 × 2 design I 
examined the effect of harm to victim and sex 
of thief on helping. Degree of harm was operationalized 
by the value of the items stolen. 
The bogus victim placed books and folders 
(low harm), books, folders, and several 
record albums (moderate harm), or books, 
folders, and an electronic calculator (high 
harm) near the bystander. I predicted a sig- 
nificant main effect for harm, reasoning that 
commitment by itself may not be sufficient to 
produce the high rate of helping found in 
Pilot 1. Perhaps substantial harm to the 
victim is also necessary to activate the by- 
stander’s sense of personal responsibility 
(Schwartz, 1977) because of the potentially 
high costs for not helping. If this interpolation 
is correct, then low harm to the victim should 
beget a low rate of helping and high harm 
should elicit a high rate of helping. 

This study was exploratory with respect to 
sex differences. I speculated that it would be 
more costly for a male to intervene with 
another male because of the physical con- 
frontation, which I assumed would imply a 
competitive definition of the situation be- 
tween males, but would be neutralized, or at 
least attenuated, in a cross-sex confrontation. 
If this assumption is correct, then one should 
observe males helping less when the thief is 
male and harm to victim is low. In this pilot 
I varied sex of thief, but used male bystanders 
only (N = 72). Once again the victim was 
always female, which produced M/M/F and 
M/F/F sexual configurations of bystander/ 
thief/victim, though it was obvious that the 
results could only be analyzed in terms of 
sex of thief. 

As Table 1 shows, the pilot data confirmed 
my prediction for the effect of degree of harm.
As was the case in Pilot 1, male bystanders 
showed a high rate of helping (83.3%) when 
harm was high, but few bystanders helped 
when harm to victim was low (16.7%) or 
moderate (29.2%). This difference is reflected 


2 The frequency data for all studies were analyzed
with a computer program (Everyman’s Contingency
Table Analysis, Note 1) that implements the
log-linear methods for analyzing dichotomous data
pioneered by Goodman (1972). It partitions
explained variance in a manner similar to a squared
multiple correlation regression analysis and yields
analogues of main and interaction effects in the
analysis of variance. I thank Thomas Guterbock for
his assistance in making this program available.

Table 1

Effect of Degree of Harm to Victim and Sex
of Thief on Male Bystanders’ Helping
Responses (Pilot 2)

                       Degree of harm
Sex of thief     Low      Moderate   High     Total
Male             2/12ᵃ    3/12       10/12    15/36
  %              16.7     25.0       83.3     41.7
Female           2/12     4/12       10/12    16/36
  %              16.7     33.3       83.3     44.4
Total            4/24     7/24       20/24
  %              16.7     29.2       83.3

ᵃ The numerator denotes the number of bystanders
who intervened; the denominator denotes the total
number of bystanders.


in a significant main effect for harm, χ²(2)
= 26.19, p < .001, although it is clear from
Table 1 that substantial helping occurred only 
in high harm to victim conditions. 
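
The harm effect can be checked by hand from the Table 1 totals. The sketch below is only an illustration, not the Everyman's Contingency Table Analysis program itself: it assumes that the reported statistic corresponds to the likelihood-ratio chi-square (G²) for the 3 × 2 table of harm level by helping (yes/no), collapsed over sex of thief. The function and variable names are hypothetical.

```python
import math

# Helping counts from Table 1 (Pilot 2), collapsed over sex of thief.
helped = {"low": 4, "moderate": 7, "high": 20}
n_per_level = 24  # bystanders observed at each harm level


def likelihood_ratio_chi_square(helped, n_per_level):
    """G^2 for independence between harm level and helping (yes/no)."""
    total_n = n_per_level * len(helped)
    total_helped = sum(helped.values())
    g2 = 0.0
    for count in helped.values():
        # Each harm level contributes a "helped" cell and a "did not help" cell.
        for observed, column_total in (
            (count, total_helped),
            (n_per_level - count, total_n - total_helped),
        ):
            expected = n_per_level * column_total / total_n
            if observed > 0:  # an empty cell contributes nothing to G^2
                g2 += 2.0 * observed * math.log(observed / expected)
    return g2


g2 = likelihood_ratio_chi_square(helped, n_per_level)
rates = {level: round(100.0 * k / n_per_level, 1) for level, k in helped.items()}
print(rates)         # percentage helping at each harm level
print(round(g2, 2))  # G^2 on (3 - 1)(2 - 1) = 2 degrees of freedom
```

Under this assumption the computed value agrees with the χ²(2) reported in the text, and the percentages match the Total row of Table 1.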

The overall finding of Pilot 2 is consistent 
with my reasoning that commitment to help 
by itself is not sufficient to activate prosocial 
action. In this case degree of harm was an 
important situation-defining factor. One could 
argue that substantial harm caused an in- 
crease in bystanders’ sense of personal respon- 
sibility and their perception of costs for not 
helping. 

Pilot 2 also found no sex differences. My 
speculation as to qualitative differences in 
same-sex and cross-sex confrontations was 
disconfirmed. This finding, when paired with 
Pilot 1 results, was discouraging as to the 
possibility of sex differences, given that the 
high/high cost situation combined with 
manipulation of harm to victim seemed to be 
well suited for discovering such differences if 
they indeed existed (e.g., J. A. Piliavin &
Piliavin, 1972). 


Field Experiment 


Pilot 1 manipulated sex of bystander and 
found no significant effect. Pilot 2 found male 
bystanders to be unaffected by the sex of thief 
by itself or in combination with degree of 
harm to the victim. These results were
perplexing. I thought I had created ideal
conditions for the expression of sex differences.
Subjects were alone; they were personally
committed to assuming responsibility; they
faced a personal paradox of high costs
for helping and for not helping; and they
were exposed to different strengths of a potent
situation-defining variable, which should have
affected the intensity of personal responsibility
they experienced and tipped the high/high
cost balance in different directions.

It is clear, however, that the pilot studies
were at best superficial tests of the role of sex
in helping. Drawing on Austin and McGinn
(1977), who suggested that males and females
hold different motivational orientations toward
others when they are responsible for another
person’s outcome, I designed a final field
[...]
oriented and concerned with interpersonal
harmony under conditions of personal
commitment and high/high cost. I reasoned that
this possible “nomothetic predisposition”
might translate into a higher rate of helping
when harm to the victim was low. Pilot 2
showed that for males commitment by itself
was not enough to produce help. Costs for
intervening apparently outweighed costs for
not helping when harm to the victim was low;
males were willing to sacrifice the victim’s
belongings when their value was not high,
even when a commitment had been made.
Females may not be so inclined. If they are
indeed more victim oriented, then one should
expect at least a moderate amount of helping
in the low-harm conditions.


Method 


All eight possible bystander/thief/victim
configurations were employed along with low
(folders and books) and high (folders and a
calculator) degree of harm conditions. The
design can be alternatively described as a 2 × 8
design (if sexual configuration is one unitary
variable) or a 2 × 2 × 2 × 2 one (if the effect
of sex of each participant is analyzed).
Subjects were 352 male and female [...]
(n = 22 per cell). This design [...]
a complete estimate not only of
bystanders’ helping responses as a function of the
strength of a situation variable (i.e., harm) but
also of separate effects for the role of sex of victim
and thief.


Results 


Table 2 shows that my tentatively held 
proposition that female bystanders would 
show greater overall helping and that this 
would be most apparent in low harm to victim 
conditions is correct. This observation is con- 
firmed by log-linear regression analyses that 
uncovered a number of significant effects. 

First, a significant main effect for degree of
harm to victim, χ²(1) = 31.13, p < .001,
reproduces the finding of Pilot 2 on the
important situation-defining function of this
variable. Table 2 shows greater helping in all
of the high-harm conditions within each
sexual configuration. In overall terms, 45.5%
of bystanders intervened when harm was low
and 75.6% when harm was high. Second,
analyzing the helping data as a 2 × 8 design
produces a highly significant main effect for
sexual configuration, χ²(7) = 32.85, p < .001.
However, this result only means that some
aspect of the total sexual configuration among
participants made a difference. Looking at the
separate effects due to the sex of the individual
participants in a 2 × 2 × 2 × 2 partitioning
of the design produced a significant main
effect for sex of bystander, χ²(1) = 26.86,
p < .001, a marginally significant main effect
for sex of the victim, χ²(1) = 3.63, p < .07,
and a Sex of Bystander × Victim interaction,
χ²(2) = 30.90, p < .001. Sex of thief had no
effect on helping, χ²(1) = .16, ns. These findings
are easily summarized by percentages.
Collapsing across harm, 73.9% of female
bystanders intervened, whereas only 47.2%
of males did so; 65.9% of the female victims
received help, but only 55.1% of the male
victims were as fortunate. The Sex of
Bystander × Victim interaction reflects the
high number of females helped by
females: 80.7% of female bystanders
helped female victims, and 51.1% of male
bystanders helped females; 67.0% of female
bystanders helped males in need, and 43.2%
of male bystanders helped another male.
The third and most important finding of
the field experiment is the highly significant


Table 2

Effect of Sexual Configuration Among
Participants and Degree of Harm to Victim
on Helping (Field Experiment)

Sex of bystander/       Degree of harm
thief/victim          Low       High       Total
M/M/M                  4         14        18/44
  %                   18.2      63.6       40.9
F/M/M                 13         18        31/44
  %                   59.1      81.8       70.5
M/F/M                  4         16        20/44
  %                   18.2      72.7       45.5
F/F/M                 12         16        28/44
  %                   54.5      72.7       63.6
M/M/F                  8         15        23/44
  %                   36.3      68.2       52.3
F/M/F                 17         20        37/44
  %                   77.3      90.9       84.1
M/F/F                  6         16        22/44
  %                   27.3      72.7       50.0
F/F/F                 16         18        34/44
  %                   72.7      81.8       77.3
Total               80/176    133/176
  %                   45.5      75.6

Note. M = male; F = female; n = 22 per cell.


Harm to Victim × Sexual Configuration
interaction, χ²(8) = 68.24, p < .001. I had
tentatively proposed (based on one related study
and a literature review) that female bystanders
show a much higher rate of helping in low
harm to victim conditions. This hypothesis
was confirmed. I found, as I had in Pilot 2,
that male bystanders helped relatively
infrequently in the low-harm condition (25.0%)
but substantially more often in the high-harm
condition (69.3%). The most impressive
finding was the substantial rate of helping by
female bystanders in both low- (65.9%) and
high-harm (81.8%) conditions. The rate of
helping by females in low-harm conditions
fell just short of the proportion of males
helping in high-harm conditions. Breaking
this overall interaction effect into component
parts produced significant Sex of Bystander
× Harm, χ²(2) = 61.04, p < .001, and Sex
of Victim × Harm, χ²(2) = 35.49, p < .001,
interactions. Based on the significant Sex of
Bystander × Sex of Victim interaction
(reported earlier as part of the sexual
configuration main effect) and the absence of any
significant effect due to sex of thief, the
model that best fits these data (and therefore
the most important empirical finding of
this study) is a Sex of Bystander × Sex of
Victim × Harm interaction, χ²(3) = 65.92,
p < .001.
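
The percentages quoted in this section are marginal rates obtained by collapsing the cells of Table 2. As a minimal sketch of that arithmetic (the dictionary layout and the `helping_rate` helper are hypothetical conveniences, not part of the original analysis), the cell counts can be pooled over any combination of bystander sex, victim sex, and harm level:

```python
# Number of helpers out of 22 per cell, from Table 2.
# Keys give sex of bystander/thief/victim; values are (low harm, high harm).
table2 = {
    "M/M/M": (4, 14), "F/M/M": (13, 18), "M/F/M": (4, 16), "F/F/M": (12, 16),
    "M/M/F": (8, 15), "F/M/F": (17, 20), "M/F/F": (6, 16), "F/F/F": (16, 18),
}
CELL_N = 22


def helping_rate(bystander=None, victim=None, harm=None):
    """Percentage helping, pooled over every cell that matches the filters."""
    helped = total = 0
    for config, (low, high) in table2.items():
        b, _thief, v = config.split("/")
        if bystander not in (None, b) or victim not in (None, v):
            continue
        for level, count in (("low", low), ("high", high)):
            if harm in (None, level):
                helped += count
                total += CELL_N
    return round(100.0 * helped / total, 1)


print(helping_rate(bystander="F"))              # female bystanders, all cells
print(helping_rate(bystander="M"))              # male bystanders, all cells
print(helping_rate(bystander="M", harm="low"))  # male bystanders, low harm
print(helping_rate(bystander="F", harm="low"))  # female bystanders, low harm
print(helping_rate(bystander="F", victim="F"))  # females helping females
```

Pooling in this way reproduces the rates cited in the text, for example the contrast between male and female bystanders in the low-harm conditions.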

Interview responses. The postexperimen- 
tal interview was open ended in character. It 
attempted to ascertain how the situation was 
defined by all subjects, those who helped and 
those who did not, and which, if any, justifications
were used by nonhelpers. A similar pattern
of responses emerged in each of the
three studies. Bystanders’ responses in the 
field experiment are summarized here in full 
appreciation of the tenuous nature of post 
hoc interview data. 

All bystanders noticed the theft. Among 
bystanders who intervened (i.e., 213, or
60.5%), 170, or 79.8%, indicated that they 
noticed the theft as soon as the subject 
picked up the stolen object; the remainder 
did so as the thief walked away. A substantial 
number of the bystanders who helped indi- 
cated some hesitancy to intervene (97, or 
45.5%), most often saying they thought the 
thief might really be the true owner of the 
objects (39, or 40.2%). More helpers indi- 
cated they “were aware of the potentially 
negative consequences” of not helping (103, 
or 48.4%) than of helping (74, or 34.7%). 
Some bystanders said both types of costs were 
important to them (62, or 29.1%). The re- 
mainder did not believe they considered either 
type of risk (36, or 16.9%). Among female 
helpers (130), 82, or 63.1%, mentioned costs 
for not helping; 27, or 20.8%, mentioned costs 
for helping; and 44, or 33.8%, mentioned 
both. Among male helpers (83), 47, or 56.6%, 
mentioned costs for not helping; 21, or 
25.3%, mentioned costs for helping; and 18, 
or 21.7%, mentioned both. Thus, costs for 
not helping seemed to dominate responses 
among helpers, though the costs of interven- 
tion could not be ignored. This pattern sug- 
gests that the high/high cost tension was 
tipped in favor of costs for not helping among 
helpers. This finding is demonstrated by the 
remarks of one bystander: “What a fool I’d
look like if she came back and somebody had
ripped off her stuff. I know if I was in her
place I would have been really steamed. And
I said I’d watch her things anyways.” A number
of helpers (23, or 10.8%) said they would
not have intervened if the stolen object had
been of lower value (i.e., harm to victim).
Finally, only a few helpers (12, or 5.6%)
indicated they would have intervened without
a prior commitment to guard the stolen
object.

Among bystanders who failed to intervene
(139, or 39.5%), a very substantial
number indicated they “thought about saying
something to the thief” (97, or 69.8%). One
of these bystanders remarked, “What a
relief! It’s only an experiment!” Although
a majority of the inactive bystanders were
thus aware that a theft was quite likely taking
place, only a handful (14, or 10.1%) failed
to provide a justification for their inaction.
The three most common justifications voiced
by subjects were that they thought the thief
was a friend of the victim (38, or 30.4%),
the thief was the true owner of the objects
(45, or 36.0%), or the victim sent the thief
to retrieve the items (43, or 34.4%).
Ninety-three male bystanders did not help; 15, or
16.1%, mentioned costs for not helping; 37,
or 39.8%, mentioned costs for helping; 8,
or 8.6%, mentioned both; and 41, or 44.1%,
said both types of risk were unimportant to
them. Forty-six female bystanders failed to
intervene; 15, or 32.6%, mentioned costs for
not helping; 12, or 26.1%, mentioned costs
for helping; 2, or 4.4%, mentioned both; and
19, or 41.3%, indicated neither type of risk
was a significant factor. Several patterns are
clear from responses to these questions. First,
both types of costs were seldom mentioned
together, which was not the case with those
who helped. Second, male nonhelpers
indicated costs for helping were most important,
and female nonhelpers mentioned costs for
not helping slightly more often than costs for helping.
Third, a substantial portion of [...].
This may be because of the [...]
of trying to justify a blameworthy [...].
Finally, a large number of bystanders (81,
or 58.3%) said they would have intervened
if the stolen items had been of higher value,
which again probably reflects a need to justify a
blameworthy action. It should be kept in
mind that these patterns pertain only to the
type of situation created in these studies.


Discussion 


This research sheds some light on sex dif- 
ferences in bystanders’ responses to a theft 
and on how sex interacts with situational 
variables. Researchers may find the postulated
different motivational sets of males and
females to be a plausible interpretation of
these data. Since the experimenters were
aware of the experimental conditions, there is
a danger of experimental bias. However, the
rates of helping compare favorably with the
findings of other theft studies that used
items of similar value under conditions of
personal commitment (Moriarity, 1975;
Shaffer et al., 1975). This comparability
seems to argue against experimenter bias. The
data reported here in combination with these
previous theft studies also appear to provide
an ecologically robust estimate of the rate
of intervention to be expected in a theft under
conditions of personal commitment, when the
item is of reasonable value but the situation
does not appear life threatening (i.e., when
the types of potential costs involved are
probably psychological rather than physical
in nature).


Summary of Findings


1. When harm to the victim was substantial
(theft of calculator), bystanders of both
sexes showed an extremely high frequency of
intervention (Pilot 1 and field experiment).
This finding may be due to an increase in the
salience of costs of not helping in the
high-harm situation.

2. Degree of harm to victim by itself
produced a significant increase in intervention to
stop a theft (Pilot 2 and field experiment).
This effect adds further credence to the
interpretation that high harm added to personal
commitment produces an increase in personal
responsibility (because of the salience of
costs for not helping) that is sufficient to
guarantee a high intervention rate.

3. The most important result was the sex
differences found when the total sexual
configuration among participants was taken into
account. The significant overall Harm ×
Sexual Configuration interaction demonstrates 
the utility of the interactionist approach to 
prosocial action recommended by Staub 
(1978) and Schwartz (1977). Sexual config- 
urations exerted differential effects as a func- 
tion of the strength of the situation-defining 
variable of degree of harm. I found that sex 
of thief made no difference in bystanders’ 
behavior, but sex of bystander and sex of 
victim were influential by themselves, in 
combination with each other, and in combina- 
tion with harm to victim. The Sex of By- 
stander X Harm interaction constitutes the 
most parsimonious fit to these data and 
accounts for a very large portion of the 
explained variance, χ²(2) = 61.04, p < .001,
where the overall test produced χ²(15) =
74.95, p < .001, and the test for an overall
Harm × Sexual Configuration produced χ²(8)
= 68.24, p < .001. However, the theoretical
significance of the greater helping elicited by 
female victims, alone and in combination with 
other variables, leads me to conclude that the 
three-way interaction between sex of by- 
stander, sex of victim, and harm to victim, 
χ²(3) = 65.92, p < .001, best summarizes the
empirical findings of this research.


Reference Note 


1. Everyman’s contingency table analysis. Chicago:
University of Chicago, Department of Statistics,


The Effects of Systematic Variations in Information on
Judges’ Descriptions of Personality

University of California, Berkeley


The use of observers in personality assessment has recently been subjected to 
intensive scrutiny and criticism by several different commentators on the 
field. These criticisms have called into question some of the fundamental 
assumptions of personality assessment, and the evidence presented has been 
used to dismiss the findings that result from the use of human observers. The 
present study presents results demonstrating that observers can and do provide 
personality descriptions that reliably reflect variations in available
information. Six different sets of information about 82 persons were furnished to a
panel of 52 judges whose descriptions were then based on only one of the six 
information sets. The results of a series of analyses suggested that agreement 
among judges is not related to the amount of information used to formulate 
descriptions. Nonetheless, judges’ descriptions did show a relationship between 
information and variation of specific descriptors, in that variation decreases in 
conditions where information is minimal. Finally, it was found that interperson 
similarity of description was associated with information, decreasing when 
judgments were based on personal contact and/or interviews and increasing
when based only on stereotypic data.


Strong warnings about the continued use of
observers in personality assessment have
recently been sounded (e.g., Berman &
Kenny, 1976; Bourne, 1977; Chapman &
Chapman, 1967, 1969; D’Andrade, 1965,
1974; Fiske, 1971, 1973, 1974, 1976; Mischel,
1968, 1973; Shweder, 1975, 1977). The
essential criticism has been that in making
personality assessments, the observer is not
sensitive to stimuli or, if sensitive, is unable
to respond in ways that demonstrate such
sensitivity. Further, critics claim that judges
are unable to attend to or to report validly
the actual relationships among the behaviors


This article is based on a dissertation submitted in
partial fulfillment of the requirements for the PhD
at the University of California, Berkeley. The
author acknowledges and appreciates the helpful
suggestions and comments of Gerald Mendelsohn and
Jack Block.
Requests for reprints should be sent to Daniel S.
Weiss, 101 Parnassus Avenue, University of
California, San Francisco, California 94143.


on which descriptions are based. Instead, 
implicit personality theory and linguistic 
accounts of personality are offered to explain 
the responses of personality assessors. If these 
criticisms are accepted, personality assess- 
ment methodology and the host of empirical 
findings dependent on these methods are 
suspect. 

The most frequently cited evidence that 
observers’ descriptions are invalid is the set 
of studies showing that despite differences in 
information about ratees, observers’ trait 
ratings produce extremely similar factor struc- 
tures of trait terms. Frequently referenced is 
the study by Passini and Norman (1966), in 
which undergraduate subjects were asked to 
rate each other on a set of traits previously 
selected and factorially grouped by Norman 
(1963). The members of this group of under- 
graduate students had had no previous contact 
or acquaintance with each other but did have 
several minutes together before the rating 
task began. During this period facts about
demeanor, dress, and bearing could be
observed. Ratings were factor analyzed, and 
the resultant factor structure was compared 
to Norman’s factor structure based on ratings 
where raters were more familiar with the per- 
sons being rated. The two factor structures 
were similar, leading Passini and Norman to 
conclude that these data were more a function
of the rater than of the ratee. That is, judges 
in this study, who could not have known the 
characteristics of their targets, apparently 
used as a base their own implicit personality 
theories when they described their targets. 

Using this basic approach, other research- 
ers have shown similarity of factor structures 
across information conditions and have like- 
wise concluded that personality ratings are 
invalid (e.g., Shweder, 1975). These conclu- 
sions, however, have been based on analyses 
of the relationships among stimuli, rather 
than on the ratings themselves. With respect 
to the wide citation and uncritical acceptance 
of the Passini and Norman study, a second 
paper that substantially moderates the Passini 
and Norman conclusions does exist and should 
be acknowledged. A curious aspect of this 
part of the controversy, however, is that this 
second relevant article, in the same journal, 
published during the same year and co- 
authored by one of the two authors of the 
Passini and Norman paper, is less frequently 
noted. This article (Norman & Goldberg, 
1966) requires that the conclusions of the 
Passini and Norman study be severely 
delimited. 

In this follow-up article, Norman and 
Goldberg further analyzed the Passini and 
Norman data and other data and found 
impressive evidence that the level of inter- 
rater agreement was a function of the degree 
of acquaintance. This finding was important 
because Norman and Goldberg recognized 
that the level of interrater agreement is rele- 
vant to the claim that implicit personality 
theories were the basis for the raters’ descrip- 
tions. If the variance of ratings across ratees 
is in excess of that expected by chance, then 
this evidence can be taken to “imply that the 
raters are to some degree responding in like 
fashion to some characteristics of the ratees” 
(p. 684). When the reliabilities of both the 
composite and component ratings were
calculated for Monte Carlo data, the original
Passini and Norman data, a Peace Corps
sample, and a fraternity senior sample, a
correspondence was obtained between amount
of reliability and amount of previous intimacy
and knowledge of ratees. Noting that the
findings provided necessary but not sufficient
evidence for the validity of the rating data,
Norman and Goldberg then demonstrated
that with other convergent validational
information (self-ratings and predicted peer
ratings) in the Peace Corps and Passini and
Norman sample, the validity diagonals of the
Campbell and Fiske multitrait multimethod
matrix yielded coefficients that lent “support
not simply [to] the inference of differential
ratee relevance in the data from the two
groups, but also differences in the pertinence
these data have to the personality
characteristics of the ratees in the two cases” (p. 690).
Given these data, the mere finding of similarity
of factor structures of trait rating terms
across samples of differing acquaintance
cannot be taken as evidence of the insensitivity of
the ratings to the ratees.
Wiggins and Blackburn (1976) presented
data quite similar to those of Norman and
Goldberg. As part of their investigation of
implicit personality theories, they too found
that the factor structure of the traits was not
different as a function of knowledge of the
target being rated (in this case, strangers or
friends). It was also shown that individual
differences existed in the use of implicit
personality theories, but the data did not allow
the correlates of such differences to be clearly
specified.
Despite the findings of similarity of factor
structures, there are studies suggesting that
the effects of implicit personality theories are
not overwhelming. Block (1971), for example,
has shown that a panel of judges’ descriptions
of a set of targets are more similar to each
other within targets across judges than they
are similar to descriptions across targets
within judges. That is, Judges A, B, and C
agree more about Target X than Judge A, B,
or C agrees with himself or herself about
Targets X, Y, and Z. Nonetheless, there are
many reasons why some researchers have found
some personality judgments [...].

A fundamentally important variable in
personality rating studies is the amount and kind
of information that judges have available.
Thus, when comparing several different
research studies, commentators have often
lumped together types of judgment tasks that
differ in the amount of information provided
to the judges. Generalizations about the
sensitivity of observers are inappropriate if it can
be shown that the amount of information
available to judges affects their judgmental
processes and responses. Thus, the conclusions
about the competencies of observers may need
modification in the light of the limits imposed
on judges by restrictions in the amount and
kind of information available to them. Most
prior work has not directly examined the
variations in judgment due to variation in the
amount and kind of information provided to
judges. Studies that have examined information
as a variable have generally cast results
in terms of the validities of the judgments,
despite the fact that the variables serving as
criteria were often little better than the
variables serving as predictors (see e.g.,
Golden, 1964; Kostlan, 1954; Sines, 1959).
On balance, these studies did suggest differences
in the validity of the judgments as a
function of the amount and kind of information
on which a judge based his or her predictions.
It may also be noted that the conclusions
of these studies were not based on
analyses of the structure of personality
ratings.
A summary of the research reviewed up to
this point suggests several strong and clear
conclusions. First, the robustness of trait
rating factor structures to changes in information
seems beyond dispute, especially when
the set of trait terms employed has been
previously selected on factorially derived
bases. Second, the robustness of such factor
structures provides only equivocal evidence
for conclusions about (a) the process used
to make ratings or (b) the validity of the
ratings that eventually produce the factor
structures. Third, the modifications of personality
ratings produced by variations in information
have generally gone uninvestigated.
One may assume, however, that if the data
base contains very little information, as have
the data bases used by some critics, then
raters have very little on which to base their
ratings other than implicit personality con- 
ceptions. Thus, conclusions about the sensi- 
tivity of raters are unwarranted when based 
only on data gathered under meager informa- 
tional conditions. If, however, amount and 
kind of information are systematically varied, 
and the ratings themselves, rather than their 
structure, are examined, then researchers have 
the opportunity to evaluate unbiased and 
relevant evidence of the sensitivity of ob- 
servers of personality. The present study was 
undertaken to provide such evidence: The 
amount and kind of information given to 
observers was systematically varied, and the 
effects of information variation on person- 
ality descriptions were evaluated. 


Method 


The data used in this research derived from two 
sources. The first category contained data in the 
archives of the Institute of Personality Assessment 
and Research (IPAR). These data were personality 
descriptions of assessees made by staff members dur- 
ing a series of 1-day assessments held at the Institute. 
The second category contained data collected
explicitly for this study. The collection of these new
data required the use of other archival materials 
housed at IPAR, namely the biographical records 
and interview protocols that new judges used to 
formulate personality descriptions of the assessees. 


Subjects 


One of the assessment projects carried out at IPAR 
was a study of attitudes and behaviors in the domain 
of population psychology (see Gough, 1973, 1975, 
for representative findings). A heterogeneous group 
of 402 persons living in communities surrounding the 
San Francisco Bay area was studied. Of this group, 
82 (41 couples) came to the Institute during eight 
different 1-day assessment programs in 1974, 1975, 
and 1976. This group represented about 15% of all 
those to whom invitations to participate were made. 
All but one couple were married. The mean age for 
the 41 males was 31.12 (SD = 5.26), and for the 41 
females was 29.20 (SD = 4.51). Their educational 
level in years was 13.73 (SD = 1.80) and 13.49 
(SD = 1.75) respectively. The Hollingshead social 
class of the husband was 3.15 (SD = .99), where 1 
is upper class and 5 is lower class. 


1 The author wishes to thank the Institute’s
director, Harrison G. Gough, for graciously making these
data available.


Procedure


During the assessment day, each assessee was 
interviewed by two different staff members. One 
interview focused on the assessee’s life history; the 
other, on the assessee’s personal values with ques- 
tions directed toward, but not limited to, issues in 
the realm of family and larger life goal orientations. 
Additional assessment procedures included a leader- 
less group discussion, an improvisational role-playing 
procedure, a couple interaction procedure, luncheon, 
a cocktail hour, a sit-down dinner, and finally a 
modified form of the party game charades. All these 
additional procedures were observed by all staff 
members, among whom were the individuals who 
conducted the interviews. 

After the assessments were concluded, personality 
descriptions of each assessee were completed by the 
staff. The data utilized in this study came from two 
response instruments, the Adjective Check List 
(ACL; Gough & Heilbrun, 1965) and the California 
Q-sort (Block, 1961). Descriptions of each assessee 
were made on the ACL and Q-sort by both inter- 
viewers and by two or three other staff members 
who had previously been assigned to record their 
understanding of that assessee. These descriptions 
formed the materials in the archival data set. Thus, 
for any assessee, there were ACLs and Q-sorts com- 
pleted by (a) a life history interviewer (for 2 of 
the 82 assessees, these data were missing), (b) a 
values interviewer (for 5 of the 82 assessees, these
data were missing), and (c) two (for 30 of the 82 
assessees) or three (for 52 assessees) staff members 
who had not interviewed that assessee but who had 
observed him or her during the assessment day (non- 
interview staff). Neither the life history nor the 
values interviewer was blind to the rest of the 
assessment procedures. Each was free to, and pre- 
sumably did, draw upon information gleaned from 
the other assessment procedures to compile his or 
her ACL and Q-sort portraits. On the other hand, 
the noninterview staff did not have access to either 
interview protocol. 

To extend the range of variation of amount and 
kind of information further, a second phase of data 
collection was undertaken. In this phase, three 
additional conditions or sources of information for 
judges’ descriptions were provided. For each of these
three conditions, one male and one female judge 
completed ACL and Q-sort materials for each case. 

The first variant of what will be called the “proto- 
col” data set was a life history protocol condition. 
For each case, the written life history interview 
protocol was read in its entirety by each of the two 
judges assigned to that case. In addition, demographic 
data (to be described below) were also given to the 
judges. Independently of each other, both judges 
then completed the ACL and the Q-sort for the case. 

The second condition of the interview protocol
data set was a values protocol condition. The same 
procedure used for the life history protocols was 
followed for the values protocols. One values interview
from a male assessee in 1974 and one from a
female assessee in 1976 were missing, reducing
n for the analyses on the values protocols to 80. The
interview protocols for both the life history and
values conditions consisted of handwritten records
of the subjects’ responses to a standardized set of
questions. The interviewer’s records were as nearly
verbatim as possible. The actual questions used may
be found in Weiss (1978).

The final condition in the study will be referred
to as the stereotypic-demographic condition. The
information available to judges in this condition
consisted of sex, number of children, age, education,
race, occupation, place of birth, and type of
community of origin (either rural, suburban, or urban).
Information about religious preference, a standard
stereotype variable, was not included because the
data were not available for all cases. In this condition,
then, judges had neither direct nor indirect
personal contact with the assessees. Their responses
can be based only on their conceptions of personal
characteristics and differences associated with classes
of individuals.


Judges 


The persons who comprised the assessment staff in
1974, 1975, and 1976 were either senior PhD faculty,
full-time research personnel, visiting scholars, or
advanced graduate students in personality or clinical
psychology. The logic of the design requires that
judgments in the various conditions vary only as
a function of information and not also in terms of
the panels of judges contributing descriptions in
those conditions. Thus, great effort was made to
secure protocol and stereotype judges who were
similar to the staff members in the assessments.

There were 45 assessment staff members whose
ACL and Q-sort descriptions were used in the
assessments. Of these 45 judges, 21 were still at
Berkeley and available and willing to participate as
judges in the new study. In addition to these
persons, 7 advanced graduate students who had not
participated in the assessments were asked to participate
in the new study. These 7 judges were comparable
to the advanced graduate students who had participated
in the earlier assessments. Thus there were 28
judges in all who did Q-sorts and ACLs in the protocol
and stereotype portions of the study. A total of
52 judges was employed for the combined assessment,
protocol, and stereotype portions of the study.

If no case had been missing, there would have been
a total of 902 descriptions. Thus, 34 descriptions, or about
4% of the theoretically complete file, were
missing, a minute fraction. Contributions to the total
of 868 descriptions made by any single judge ranged
from a high of 59 cases (7%) to a low of [illegible]
(.2%). No case was described more than once by
[illegible] approximately 2%. That is, on the average each
judge contributed about 2% of the data on
which subsequent analyses were conducted, so
any biasing effects of idiosyncratic judges should be
minimal.


PERSONALITY DESCRIPTIONS AND VARIED INFORMATION 


Analyses and Results 


Three separate, but related, sets of analyses 
were performed, each evaluating a different 
facet of the effect of information on observers’ 
descriptions of personality. If the kind of 
information made no difference to the judges, 
then the results found in any one condition 
should be essentially identical to those found 
in any other condition. That is, if the information
condition is irrelevant, then the degree to
which assessees are differentiated from each 
other should not vary from one condition to 
another. Likewise, the mean level and vari- 
ation of interobserver agreement should not 
be related to the information condition. The 
analyses were designed to provide evidence 
contradicting the position that information 
condition does not affect personality descrip- 
tions. Further, they sought to show that the 
manner in which information does affect 
description can support the conclusion that 
judges are reliable and sensitive “transducers”
of variations in information and that their
personality assessments can faithfully reflect
the characteristics of the information available
to them.


Interobserver Agreement


Studies employing rating tasks or procedures
should report estimates of the reliability
of the ratings. Because the information
available to judges varied in this study, one
might expect the amount of interobserver
agreement also to vary with condition, as
Norman and Goldberg (1966) had found.
Because judges provided descriptions of target
persons on two descriptive devices, the
opportunity to evaluate interobserver agreement
using different response formats was
available. Three interobserver indices of reliability
were calculated.

The first index involved the Q-sort. For
every pair of raters for every case, the correlation
over the 100 Q items was obtained. For
the protocol and stereotype conditions, this
correlation indexed interobserver agreement.
For the two interview conditions, where only
one judge described each case, observer agreement
was calculated between the two interviewers.
For the noninterview condition, there
were 10 male and 13 female assessees for 
whom only two judges contributed descrip- 
tions; the remaining cases were described by 
three judges. For the former cases the correla- 
tion between the pair of judges was used as 
the index of agreement; for the latter cases, 
the mean correlation of the three pairs of 
judges was employed (in this analysis, and 
every analysis reported henceforth, all cal- 
culations involving the averaging of correla- 
tion coefficients employed Fisher’s r-to-z 
transformation). 
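The averaging step described above can be illustrated with a minimal Python sketch. The function name and the three pairwise correlations are hypothetical and are not taken from the study; the sketch only shows how r values are converted to z, averaged, and converted back to the r metric.

```python
import math


def mean_correlation(rs):
    """Average correlations via Fisher's r-to-z transformation.

    Each r is converted to z = atanh(r), the z values are averaged,
    and the mean z is converted back to r with tanh.
    """
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Hypothetical agreement among three noninterview judges for one assessee:
# the three pairwise Q-sort correlations (illustrative values only).
pairwise = [0.30, 0.45, 0.60]
print(round(mean_correlation(pairwise), 3))
```

Because the transformation stretches values near the ends of the scale, the mean obtained this way is slightly larger than the plain arithmetic mean of unequal positive correlations.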

The second index of agreement was calcu- 
lated for the judges’ responses on the ACL. A 
correlation coefficient across the 300 items 
comprising the ACL was computed for each 
pair or set of raters for each case in the same 
fashion as was employed for the Q-sort. The 
dichotomous response format of the ACL 
means that the coefficients produced are 
actually phi coefficients. This index was therefore
termed the ACLPhi index.
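A minimal sketch of the phi coefficient for two judges’ check lists follows. It assumes each judge’s ACL is coded as a list of 0/1 values (1 = adjective checked); the function name and the six-item example are hypothetical, and the real instrument has 300 items.

```python
import math


def phi_coefficient(checks_a, checks_b):
    """Phi coefficient from the fourfold table of two binary vectors."""
    n11 = sum(1 for a, b in zip(checks_a, checks_b) if a and b)
    n10 = sum(1 for a, b in zip(checks_a, checks_b) if a and not b)
    n01 = sum(1 for a, b in zip(checks_a, checks_b) if not a and b)
    n00 = sum(1 for a, b in zip(checks_a, checks_b) if not a and not b)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # Undefined when either judge checked all or none of the items.
        return float("nan")
    return (n11 * n00 - n10 * n01) / denom


# Hypothetical six-adjective example (illustrative values only).
judge_1 = [1, 1, 0, 0, 1, 0]
judge_2 = [1, 0, 0, 0, 1, 1]
print(phi_coefficient(judge_1, judge_2))
```

The value is identical to the Pearson correlation computed on the same 0/1 vectors, which is why a correlation across dichotomous ACL items is a phi coefficient.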

The ACL does not require judges to make 
a judgment for every item; that is, judges are 
free to check as many or as few adjectives as 
they believe to be descriptive. Since there are 
strong individual differences in the number of 
adjectives a judge is likely to check, regard- 
less of information condition, this format 
poses some problems, being based on a four- 
fold tally of a present versus absent dichot- 
omy. To overcome this problem, a third index 
of observer agreement was defined. The ACL 
items can be scored for 24 scales, some of 
which are empirically derived dimensions 
(e.g., self-control) and others of which are 
rationally derived scales based on Murray’s 
need-press system. In the scoring algorithm
the total number of adjectives checked is 
used as a correction factor when the raw 
scale scores are converted to standard scores. 
The net effect of such a procedure is to re- 
move the differences on the scale dimensions 
that are attributable only to the differences 
in the total number of items checked. Thus, 
a correlation between two judges over the 24 
scale scores will not be unduly influenced by 
differences in the total number of items

Table 1
Analysis of Variance of Reliability Indices

Index and condition                  n     M     SD    Range             F
Q-sort [a]
  Life history/values interviews    75    .41   .19   −.09 to .66       —
  Life history protocol             82    .38   .28   −.50 to .76       —
  Values protocol                   80    .38   .27   −.42 to .73       .41
  Noninterview                      82    .41   .21   [−?].07 to .73    —
  Stereotypic-demographic           82    .39   .22   [−?].11 to .70    —
ACLPhi [b]
  Life history/values interviews    75    .35   .20   −.27 to .63       —
  Life history protocol             82    .27   .17   [−?].06 to .58    —
  Values protocol                   80    .24   .18   [−?].13 to .55    4.87*
  Noninterview                      82    .31   .17   [−?].02 to .64    —
  Stereotypic-demographic           82    .26   .17   [−?].10 to .59    —
ACLScale [c]
  Life history/values interviews    75    .58   .57   −.76 to .94       —
  Life history protocol             82    .53   .56   −.80 to .97       —
  Values protocol                   80    .45   .48   −.70 to .90       [illegible]
  Noninterview                      82    .53   .48   −.49 to .91       —
  Stereotypic-demographic           82    .54   .46   −.53 to .94       —

Note. Means are z-transformed values; values for range are raw correlations. All values are between pairs
of judges. ACLScale = Adjective Check List Scale. ACLPhi = Adjective Check List Scale with phi
coefficient.
[a] Correlations based on 100 items; index shows heterogeneity of variance. [b] Correlations based on 300 items.
[c] Correlations based on 24 scales.
* p (4, 396) < .001.


checked. Descriptions of the target cases were 
converted to the 24 standard scale scores 
using the conversion formula appropriate for 
each sex. For those cases described by three 
noninterview judges, the average correlation 
was used as the index. This index was termed 
ACLScale. 

The interobserver correlations provided the 
raw data for the first group of analyses. A 
one-way analysis of variance was performed 
to evaluate the differences in level of agreement
across conditions, for each index. The
results of these analyses are reported in 
Table 1. 

Q-sort reliability results. The first finding 
of interest is that for the Q-sort, there are no 
significant differences in interobserver agree- 
ment as a function of information condition. 
This result is not in accord with the Norman 
and Goldberg (1966) study. In the present 
research, judges reported that they found 
some conditions of information to be dis- 
tinctly more helpful than others. One there- 
fore might expect to find the strength of 
agreement among the judges to be related to 


the kind of information provided. That this
expectation was not confirmed can be
attributed in part to the nature of the Q-sort
and in part also to the nature of the stereotypic-demographic
information. Further, one
may well ask if the expected differences were
in fact warranted. A more detailed examination
of this question will be deferred until
the discussion section.

Of note, however, is the finding that the
accompanying tests for heterogeneity of variance
were significant, indicating that differences
in the spread of agreement across the
different conditions did occur. Agreement in
the assessment conditions was no lower
than −.09, whereas for the protocol conditions
agreement went as low as −.50. On the
other hand, in the protocol conditions, coefficients
as high as .76 were obtained; the
highest value in the assessment conditions
was .66.

ACL reliability results. The one-way analysis
of variance of the mean level of
agreement for the ACLPhi index revealed a
significant overall difference, although information
condition accounted for only 4% of
the variance in reliability coefficients. The
ordering of the mean levels of interobserver
agreement for the ACLPhi coefficient was
associated with the greater information in the
assessment conditions. Nonetheless, the lack
of strength of the results does not strongly
demonstrate that information condition affects
reliability of description.
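The report does not state how the 4% figure was computed. As a rough check under stated assumptions, two common effect-size indices can be recovered from the F ratio and its degrees of freedom in Table 1 (F = 4.87 with 4 and 396 degrees of freedom; N = 401 is the sum of the n column). The function names below are illustrative, and which index, if either, the author used is an assumption.

```python
def eta_squared(f_ratio, df_between, df_within):
    """Proportion of variance (eta squared) implied by a one-way ANOVA F."""
    return (f_ratio * df_between) / (f_ratio * df_between + df_within)


def omega_squared(f_ratio, df_between, n_total):
    """Less biased estimate (omega squared) for a one-way between-groups ANOVA."""
    numerator = df_between * (f_ratio - 1.0)
    return numerator / (numerator + n_total)


# Values taken from Table 1 for the ACLPhi index.
print(round(eta_squared(4.87, 4, 396), 3))
print(round(omega_squared(4.87, 4, 401), 3))
```

Both indices fall between about 3.7% and 4.7%, so either is broadly compatible with the statement that information condition accounted for only a small share of the variance in the reliability coefficients.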

The ACLScale index, like the Q-sort index, 
did not show significantly different means as 
a function of information condition. The 
mean level of agreement for the ACLScale 
index, however, is higher than that for the 
other two indices. This finding is not unexpected,
since the standardization procedure
reduced the effects of number checked.


Differential Usage of
Descriptors by Conditions


The second set of analyses concerns the use
of the individual items of the two instruments
(Q-sort and ACL). For the Q-sort it will be
shown that judges’ descriptions of assessees
are more differentiated at the item level when
amount and variety of information increase.
For the ACL, it will be shown that judges use
more descriptors with more varied information.
Q items. This analysis concentrated on
item variances within condition; that is, on
the variance of assessees’ scores on a particular
item within a particular condition. The
entire array of scores consisted of 600 variances:
100 items over six conditions.
The six standard deviations for each item
were calculated using Re-Q’d² data. Then
tests of homogeneity of variance across the
six information conditions were conducted.
The Bartlett-Box F test and Cochran’s C
test (Winer, 1971) were used. Twenty-one
of the 100 items showed significant heterogeneity
of variance (p < .05) according to both
tests. Eighteen more items, making a total of
39, showed significant departures from homogeneity
of variance on one or the other test.
It was expected that the two interview
conditions would, on the average, show the
largest variance of item placement across the
assessees because these two conditions were
the ones with the most abundant and varied
information available to each judge; that is,
when information is most varied, judges
should be able to make the most differentiated
responses and hence produce the largest variance.
To evaluate this anticipation, an analysis
that examined the relationship between
amount of variation and information condition
was conducted. The data for this next
analysis were derived from the 6 × 100 array
of standard deviations. For Q Item 1, the six
within-condition standard deviations were
ranked from 1 (the smallest SD) to 6 (the
largest SD). This ranking procedure was
repeated for the remaining 99 items. Thus,
each of the 100 items was assigned a profile
of six ranks. In effect, each profile indicates
the ordering of the six information conditions 
in terms of the degree of variation in judges’ 
descriptions. An analysis of variance of ranks 
across the six information conditions (Winer, 
1971) was performed and evaluated by the 
Friedman test for the significance of the 
difference of mean ranks. The results of this 
analysis are presented in Table 2. 

According to the Friedman test, χ²(5) =
84.35, p < .0001. The information condition 
that shows the highest rank and thus the most 
differential usage of items is the life history 
interview condition; the values interview and 
life history protocol conditions follow. Also 
of note is that the stereotypic-demographic 
condition shows the smallest mean rank of 
any of the information conditions, a result 
fully in accord with the impoverished infor- 
mation in that condition. 
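The ranking procedure and the Friedman statistic can be sketched as follows. This is a minimal illustration, not the author’s program: the function names are hypothetical, ties receive average ranks, and no tie correction is applied to the statistic. The final lines recompute the statistic from the rounded mean ranks printed for all 100 items in Table 2, which gives roughly 84.4, close to the reported 84.35; the small difference is what rounding of the printed means would produce.

```python
def rank_row(values):
    """Rank values from 1 (smallest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = average_rank
        i = j + 1
    return ranks


def chi_square_from_mean_ranks(mean_ranks, n_items):
    """Friedman chi-square (df = k - 1) from k mean ranks over n_items profiles."""
    k = len(mean_ranks)
    expected = (k + 1) / 2
    spread = sum((m - expected) ** 2 for m in mean_ranks)
    return 12 * n_items / (k * (k + 1)) * spread


def friedman_chi_square(sd_rows):
    """sd_rows: one row per Q item, one standard deviation per condition."""
    n_items = len(sd_rows)
    k = len(sd_rows[0])
    rank_sums = [0.0] * k
    for row in sd_rows:
        for j, rank in enumerate(rank_row(row)):
            rank_sums[j] += rank
    mean_ranks = [total / n_items for total in rank_sums]
    return chi_square_from_mean_ranks(mean_ranks, n_items), mean_ranks


# Mean ranks for all 100 items as printed in Table 2 (rounded to two decimals).
table_2_means = [4.40, 4.21, 3.65, 3.32, 3.12, 2.30]
print(round(chi_square_from_mean_ranks(table_2_means, 100), 2))
```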

For the sets of 21 and 39 items showing 
significance on both or one of the hetero- 
geneity of variance tests, results were also 
significant, with χ²(5) = 38.39 and χ²(5) =
71.89, p < .007, respectively. Moreover, these 
two sets reveal an important point. The SD 
of the mean rank for the stereotypic-demo- 
graphic condition is, for both the 21 and 39 


2 It will be recalled that the number of judges 
varied from one to three within information condi- 
tions. In those conditions where there was more 
than one judge, the data were Re-Q’d to standardize 
all Q-sort distributions. This was accomplished by 
summing across judges and recasting the array of 
100 sums into the forced distribution of values used 
for an individual judge’s sort. 
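The Re-Q procedure described in this footnote can be sketched as below. The forced distribution is an assumption: the sketch uses the nine-category distribution commonly associated with the 100-item California Q-set (5, 8, 12, 16, 18, 16, 12, 8, 5), which the report does not state. The function name and the rule of breaking tied sums by item order are also hypothetical.

```python
# Assumed forced distribution: number of items allowed in categories 1 through 9.
FORCED_COUNTS = [5, 8, 12, 16, 18, 16, 12, 8, 5]


def re_q(sorts):
    """Combine several judges' Q-sorts of one assessee into one composite sort.

    sorts: list of judges' sorts, each a list of category values (1-9) per item.
    Item values are summed across judges, and the sums are recast into the
    forced distribution from the lowest sums to the highest.
    """
    n_items = len(sorts[0])
    sums = [sum(sort[i] for sort in sorts) for i in range(n_items)]
    # Ties in the summed values are broken by item position (an assumption).
    order = sorted(range(n_items), key=lambda i: (sums[i], i))
    composite = [0] * n_items
    position = 0
    for category, count in enumerate(FORCED_COUNTS, start=1):
        for i in order[position:position + count]:
            composite[i] = category
        position += count
    return composite


# Hypothetical example: two judges whose sorts are mirror images of each other.
judge_1 = [c for c, n in enumerate(FORCED_COUNTS, start=1) for _ in range(n)]
judge_2 = list(reversed(judge_1))
composite = re_q([judge_1, judge_2])
print([composite.count(c) for c in range(1, 10)])
```

Whatever the judges’ individual placements, the composite always has the same shape as a single judge’s sort, so item variances can be compared across conditions that differ in the number of judges.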
Table 2
Means and Standard Deviations of Q-Item 
Variance Ranks 


DANIEL S. WEISS 


Condition                      M       SD

All items (n = 100)
  Life history interview      4.40    1.48
  Values interview            4.21    1.38
  Life history protocol       3.65    1.65
  Values protocol             3.32    1.66
  Noninterview                3.12    1.71
  Stereotypic-demographic     2.30    1.49

Items with both heterogeneity of variance tests significant (n = 21)
  Life history interview      5.00    1.30
  Values interview            4.78    1.03
  Life history protocol       3.57    1.43
  Values protocol             2.71    1.71
  Noninterview                3.33    1.71
  Stereotypic-demographic     1.90    0.99

Items with one heterogeneity of variance test significant (n = 39)
  Life history interview      4.85    1.14
  Values interview            4.89    1.21
  Life history protocol       3.59    1.52
  Values protocol             2.74    1.58
  Noninterview                3.59    1.71
  Stereotypic-demographic     1.74    0.91


item sets, the lowest SD of the six conditions. 
The next lowest SD is, in both cases, the SD 
of the mean rank for an interview condition. 
Thus, not only are these conditions extreme 
in terms of variation of item placements 
across assessees as a function of information 
condition (as evidenced by the highest and 
lowest means), but they are also more con- 
sistently extreme (as evidenced by the low 
SDs) than are the other conditions. This 
result does not necessarily follow from the 
other findings in the analysis and thus under- 
scores the effect of information on the vari- 
ation of item placement within condition. 
Although it is evident that the assessees are 
most differentiated in the interview conditions 
and least differentiated in the stereotype con- 
dition, it remains to be shown that the type 
of items sorted with the most variation are 
items for which the particular condition pro- 
vided relevant information. There are no 
statistical procedures for determining this 


fit; goodness of fit must be judged psychologically.

Tables 3 through 8 report the Q items that
showed the greatest variance in each of the
six information conditions. That is, Table 3
reports the items that showed their greatest
variation in the life history interview condition,
Table 4 reports those items that showed
their greatest variation in the values interview
condition, Table 5 reports on the life
history protocol condition, Tables 6, 7, and 8
contain items from the values protocol, noninterview,
and stereotypic-demographic conditions,
respectively.

Several points of interest may be noted in
Table 3. First, 27 of the 100 Q items revealed
their greatest variation in the life history interview
condition; this was the largest number
observed for any condition. The item content
appears to be appropriate for what may be
obtained in a life history interview. Four
clusters of items may be delineated in the set
of 27. Several deal with behavioral style and
tempo (Items 4, 17, 20, 26, 30, 31, 77). Four
refer to matters of sexuality (Items 58, [illegible],
77, 80). Five refer to social and interpersonal
relationships (Items 17, 48, 49, 54, 89). A
final category contains 12 items that refer to
temperament and character (Items 9, 30, [illegible],
60, 62, 68, 70, 75, 77, 86, 97, 99).

Table 4, containing items showing their
largest variation in the values interview condition,
has 22 entries. As anticipated, this is
the second largest number. The item content
overlaps somewhat with that found in the life
history interview condition, but this similarity
is not unexpected. Personal contact and the
interview transaction were as much a part of
the one condition as the other. Items in Table
4 can be classified into (a) the social and
interpersonal sphere (Items 27, 35, [illegible],
38, 61, 64, 92), (b) character structure
(Items 14, 39, 40, 42, 45, 72, 74, 98), (c)
expressive behavior (Items 43, [illegible]). A
fourth category, (d), distinct from those observed
in the life history interview condition,
is the expected set of items on personal values
(Items 22, 84, 91).

Table 5 lists the 19 items showing their
highest variation in the life history protocol
condition. All items, with the exception of Item
52, which refers to assertive behavior, can be
accommodated in the categories postulated
for the life history interview condition. An
important difference is that in the present
listing the expressive behavior category has
no entries. This omission is reasonable, as
judges in this condition did not see the
interviewee.

Items with largest variance in the values
protocol condition are presented in Table 6.
As anticipated, items referring to personal
values are included, more, in fact, than for
the values interview condition. This may have
occurred because in contrast to the values
interview condition, where both personal
observations and factual information about
personal values were available to a judge, in
the values protocol condition only the latter
were used. Items referring to personal values
were thus more salient for the protocol than
for the interview. In addition to the seven


items dealing directly with values (Items 3,
7, 51, 63, 66, 83, 90), there were also four
items referring to temperament (Items 16,
44, 46, 76) and two referring to intellectual
functioning (Items 8, 51).

The items showing their largest variation in
the noninterview condition are listed in Table
7. With the exception of a single item dealing
with impulse control (Item 25), the remaining
items can be classified under the heading of
modes of interaction. This finding too is appropriate;
that is, in this condition judges
saw the target persons in casual social interaction
throughout the assessment day but did
not obtain detailed facts in the realms of
values, life history, or character. These
judges should then make most differential use
of items pertaining to social style and physical
demeanor.

Table 3
Q-Sort Items Showing Greatest Variance in Life History Interview Condition

 4. Is a talkative individual. [a]
 9. Is uncomfortable with uncertainty and complexities.
12. Tends to be self-defensive. [a]
17. Behaves in a sympathetic or considerate manner.
20. Has rapid personal tempo; behaves and acts quickly. [a]
23. Extrapunitive; tends to transfer or project blame.
26. Is productive; gets things done.
30. Gives up and withdraws where possible in the face of frustration and adversity. (N.B. If placed high,
    implies generally defeatist; if placed low, implies counteractive.)
31. Regards self as physically attractive. [a]
32. Seems to be aware of the impression he makes on others.
41. Is moralistic. (N.B. Regardless of the particular nature of the moral code.)
48. Keeps people at a distance; avoids close interpersonal relationships.
49. Is basically distrustful of people in general; questions their motivations.
54. Emphasizes being with others; gregarious. [a]
58. Enjoys sensuous experiences (including touch, taste, smell, physical contact). [a]
60. Has insight into own motives and behavior. [a]
62. Tends to be rebellious and nonconforming.
68. Is basically anxious. [b]
70. Behaves in an ethically consistent manner; is consistent with own personal standards.
73. Tends to perceive many different contexts in sexual terms. [b]
75. Has a clear-cut, internally consistent personality. (N.B. Amount of information available before sorting
    is not intended here.)
77. Appears straightforward, forthright, candid in dealings with others. [a]
80. Interested in members of the opposite sex. (N.B. At opposite end, item implies absence of such interest.) [a]
86. Handles anxiety and conflicts by, in effect, refusing to recognize their presence; repressive or dissociative
    tendencies. [b]
89. Compares self to others. Is alert to real or fancied differences between self and other people.
97. Is emotionally bland; has flattened affect.
99. Is self-dramatizing; histrionic. [a]

[a] Both heterogeneity of variance tests significant (n = 10). [b] One test significant (n = 3).

Table 8 presents the items that showed
their largest variation in the stereotypic-

Table 4
Q-Sort Items Showing Greatest Variance in Values Interview Condition

 2. Is a genuinely dependable and responsible person. [a or b]
14. Genuinely submissive; accepts domination comfortably.
22. Feels a lack of personal meaning in life. [a]
27. Shows condescending behavior in relations with others. (N.B. Extreme placement toward uncharacteristic
    end implies simply an absence of condescension, not necessarily equalitarianism or inferiority.)
35. Has warmth; has the capacity for close relationships; compassionate.
36. Is subtly negativistic; tends to undermine and obstruct or sabotage.
37. Is guileful and deceitful, manipulative, opportunistic.
38. Has hostility toward others. (N.B. Basic hostility is intended here; mode of expression is to be indicated
    by other items.)
39. Thinks and associates ideas in unusual ways; has unconventional thought processes.
40. Is vulnerable to real or fancied threat, generally fearful.
42. Reluctant to commit self to any definite course of action; tends to delay or avoid action.
43. Is facially and/or gesturally expressive. [a]
45. Has a brittle ego-defense system; has a small reserve of integration; would be disorganized and
    maladaptive when under stress or trauma. [a or b]
61. Creates and exploits dependency in people (N.B. regardless of the techniques employed, e.g., punitiveness,
    overindulgence). (N.B. At other end of scale, item implies respecting and encouraging the
    independence and individuality of others.)
64. Is socially perceptive of a wide range of interpersonal cues.
72. Concerned with own adequacy as a person, either at conscious or unconscious levels. (N.B. A clinical
    judgment is required here; Number 74 reflects subjective satisfaction with self.) [a or b]
74. Is subjectively unaware of self-concern; feels satisfied with self. [a or b]
78. Feels cheated and victimized by life; self-pitying. [a or b]
84. Is cheerful. (N.B. Extreme placement toward uncharacteristic end of continuum implies unhappiness
    or depression.)
91. Is power oriented; values power in self or others.
92. Has social poise and presence; appears socially at ease. [a or b]
98. Is verbally fluent; can express ideas well.

[a] Both heterogeneity of variance tests significant (n = 4). [b] One test significant (n = 6).


Table 5
Q-Sort Items Showing Greatest Variance in Life History Protocol Condition

 1. Is critical, skeptical, not easily impressed.
 5. Behaves in a giving way toward others (N.B. regardless of the motivation involved).
10. Anxiety and tension find outlet in bodily symptoms. (N.B. If placed high, implies bodily dysfunction;
    if placed low, implies absence of autonomic arousal.) [a]
11. Is protective of those close to him. (N.B. Placement of this item expresses behavior ranging from
    overprotection through appropriate nurturance to a laissez-faire, underprotective manner.)
24. Prides self on being “objective,” rational.
29. Is turned to for advice and reassurance.
34. Overreactive to minor frustrations; irritable. [a or b]
47. Has a readiness to feel guilty (N.B. regardless of whether verbalized or not).
50. Is unpredictable and changeable in behavior and attitudes.
52. Behaves in an assertive fashion. (N.B. Item 14 reflects underlying submissiveness; this refers to
    overt behavior.)
53. Various needs tend toward relatively direct and uncontrolled expression; unable to delay gratification.
55. Is self-defeating.
59. Is concerned with own body and the adequacy of its physiological functioning. [a or b]
65. Characteristically pushes and tries to stretch limits; sees what he can get away with.
67. Is self-indulgent. [b]
79. Tends to ruminate and have persistent, preoccupying thoughts. [a or b]
82. Has fluctuating moods.
87. Interprets basically simple and clear-cut situations in complicated and particularizing ways.
94. Expresses hostile feelings directly.

[a] Both heterogeneity of variance tests significant (n = 2). [b] One test significant (n = 3).


Table 6
Q-Sort Items Showing Greatest Variance in Values Protocol Condition

 3. Has a wide range of interests. (N.B. Superficiality or depth of interest is irrelevant here.)
 7. Favors conservative values in a variety of areas.
 8. Appears to have a high degree of intellectual capacity (N.B. Whether actualized or not; originality is
    not necessarily assumed).
13. Is thin-skinned; sensitive to anything that can be construed as criticism or an interpersonal slight.
16. Is introspective and concerned with self as an object. (N.B. Introspectiveness per se does not imply
    insight.)
44. Evaluates the motivations of others in interpreting situations. (N.B. Accuracy of evaluation is not
    assumed.) (N.B. Again, extreme placement in one direction implies preoccupation with motivational
    interpretation; at the other extreme, the item implies a psychological obtuseness; subject does not
    consider motivational factors.)
46. Engages in personal fantasy and daydreams, fictional speculations.
51. Genuinely values intellectual and cognitive matters. (N.B. Ability or achievement are not implied
    here.)
63. Judges self and others in conventional terms of “popularity,” “the correct thing to do,” social pressures,
    and so forth.
66. Enjoys aesthetic impressions; is aesthetically reactive. [a or b]
69. Is sensitive to anything that can be construed as a demand. (N.B. No implication of the kind of subsequent
    response is intended here.)
76. Tends to project his own feelings and motivations onto others. [a or b]
83. Able to see to the heart of important problems.
90. Is concerned with philosophical problems, e.g., religions, values, the meaning of life, and so forth. [a or b]

[a] Both heterogeneity of variance tests significant (n = 2). [b] One test significant (n = 1).

demographic information condition. Ideally,
no item should have shown greatest variance
in this condition. In fact, none showed significant
heterogeneity of variance, and their
ranking highest in this condition may best
be viewed as a minor fluctuation around the
more or less equivalent variances of the six
conditions.

We may now summarize the Q item variance
results. Overall, the pattern is a strong
one; the three assessment conditions, in which
assessees were actually observed, contain all
the items making reference to physical qualities
of appearance and bearing. This finding
is hardly surprising, but its absence would
have raised serious doubts about the effects
of information on personality descriptions.
The noninterview condition, where no intimate
contact occurred but persons were nonetheless
personally observed, contained items
relating to social interaction and social skills.
The values protocol condition contained the
bulk of the values items. The two life history
information conditions (interview and protocol)
contained a preponderance of items dealing
with personality style, socialization, adjustment,
and sexuality, all topics that were
explicitly discussed in the interview. Finally,
the stereotypic-demographic information condition
contained only five items, all lacking
significant heterogeneity of variance.

Viewing the six tables as a whole reveals


Table 7
Q-Sort Items Showing Greatest Variance
in Noninterview Condition

15. Is skilled in social techniques of imaginative
    play, and humor. [a or b]
18. Initiates humor. [a or b]
19. Seeks reassurance from others.
21. Arouses nurturant feelings in others.
25. Tends toward overcontrol of needs and impulses;
    binds tensions excessively; delays gratification
    unnecessarily. [a or b]
28. Tends to arouse liking and acceptance in people. [b]
33. Is calm, relaxed in manner. [a or b]
56. Responds to humor.
57. Is an interesting, arresting person.
81. Is physically attractive; good-looking. (N.B.
    The cultural criterion is to be applied here.) [a]
88. Is personally charming. [a or b]
93. Behaves in a masculine style and manner or
    Behaves in a feminine style and manner. (N.B.
    The cultural or subcultural conception is to
    be applied as a criterion.) [b]
95. Tends to proffer advice.

[a] Both heterogeneity of variance tests significant
(n = 3). [b] One test significant (n = 8).

Table 8
Q-Sort Items Showing Greatest Variance in 
Stereotypic-Demographic Condition 


6. Is fastidious. 
71. Has high aspiration level for self. 
85. Emphasizes communication through action and 
nonverbal behavior. 
96. Values own independence and autonomy. 
100. Does not vary roles; relates to everyone in the 
same way. 


another interesting and important result. As 
information decreases, so does the number of 
items that show significant heterogeneity of 
variance. So, for example, in Table 3, which 
tallies items for the life history interview 
condition, nearly half of the items show sig- 
nificant heterogeneity according to one or 
both tests. This contrasts with Table 8, which 
shows no significant items in the stereotypic- 
demographic condition. The number of significantly
heterogeneous items is proportional to
the total number of items in each condition, 
which is in turn related to the amount and 
kind of information available in each condi- 
tion. This pattern, then, further corroborates 
the abilities of judges to respond to and reflect 
the variations in information on which their 
descriptions are based. 

ACL items. To show that the effects of 
information are not confined to a single 
descriptive method, an analysis of the ACL 
data paralleling that for the Q-sort was per- 
formed. This analysis involved comparing 
across conditions the total number of adjec- 
tives checked about assessees. In the assess- 
ment conditions, where judges had direct 
contact with their targets, the mean number 


Table 9
Analysis of Variance of Total Adjective Check List Items Checked

Condition                    n      M       SD      F
Life history interview       80     89.96   20.54   —
Values interview             78     85.20   18.91   —
Life history protocol        164    58.27   20.37   92.40*
Values protocol              160    54.96   22.04   —
Noninterview                 224    83.93   19.67   —
Stereotypic-demographic      164    51.90   23.25   —

* p (5, 864) = .0001.
of adjectives checked per assessee should be
higher than for the protocol and stereotype
conditions, where contact with the target was
only through written documents. The total
number of adjectives checked for each case
by each judge was computed. Differences
between conditions were evaluated by a one-way
analysis of variance. The results are
presented in Table 9.

The findings for the ACL parallel those
found with the Q-sort. The total number of
adjectives checked about a target
varies substantially with information condition,
which accounts for 34% of the variance.
When judges had personal contact with their
targets, they checked approximately 85 items
as descriptive; judges without such contact
checked only about 54 adjectives. Scheffé's
test for differences among the means of the
total number of adjectives checked revealed
two homogeneous subsets significant at the
.05 level. As predicted, the protocol and
stereotype conditions form one group and the
assessment conditions form the other. Thus,
on a second and independent measuring instrument,
judges in this study demonstrated
that information does indeed affect description:
When information available to judges
is more abundant and varied, their descriptions
are correspondingly fuller and more
detailed.
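
The arithmetic behind the F ratio and the proportion of variance reported for the ACL can be approximated from the summary statistics in Table 9 alone. The sketch below is an illustration, not the authors' computation: it assumes the tabled SDs are within-condition standard deviations of the per-description totals, it reads the garbled stereotypic-demographic mean as 51.90, and it uses eta squared as the variance-accounted-for index (the source does not name the index). The function name `one_way_anova_from_summary` is hypothetical. Under these assumptions the result should land close to the reported F and to roughly one third of the variance.

```python
# Summary statistics as read from Table 9: (condition, n, mean, SD).
# Assumption: the garbled stereotypic-demographic mean is read as 51.90.
TABLE_9 = [
    ("Life history interview", 80, 89.96, 20.54),
    ("Values interview", 78, 85.20, 18.91),
    ("Life history protocol", 164, 58.27, 20.37),
    ("Values protocol", 160, 54.96, 22.04),
    ("Noninterview", 224, 83.93, 19.67),
    ("Stereotypic-demographic", 164, 51.90, 23.25),
]


def one_way_anova_from_summary(groups):
    """Return (F, df_between, df_within, eta_squared) from per-group n, mean, SD."""
    total_n = sum(n for _, n, _, _ in groups)
    # Grand mean is the n-weighted mean of the condition means.
    grand_mean = sum(n * m for _, n, m, _ in groups) / total_n
    # Between-groups sum of squares: weighted squared deviations of the means.
    ss_between = sum(n * (m - grand_mean) ** 2 for _, n, m, _ in groups)
    # Within-groups sum of squares, recovered from each sample SD.
    ss_within = sum((n - 1) * sd ** 2 for _, n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    # Eta squared: share of total variation attributable to condition.
    eta_squared = ss_between / (ss_between + ss_within)
    return f_ratio, df_between, df_within, eta_squared


f_ratio, df_b, df_w, eta_sq = one_way_anova_from_summary(TABLE_9)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}, eta squared = {eta_sq:.3f}")
```

Working from summary statistics rather than raw data means small discrepancies from the published values are expected, since the tabled means and SDs are rounded.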


Within-Condition Homogeneity of 
Description 


In the analysis just reported, differentiation
among assessees was examined at the level of
individual items. In the analysis to follow,
differentiation among assessees was
examined at a more global level. Specifically,
the Q correlations between assessees [...]
separately. As [...] (1961) notes, the Q correlation
is a convenient index of interperson similarity. If
information condition had no effect, the
mean level of similarity between assessees
would be no greater in any one condition than
in any other. If, however, information does
affect description, then a relationship between
condition and similarity should obtain: The
stereotypic-demographic condition should produce
the most homogeneous (i.e., the least
differentiated) set of descriptions, and the
two interview conditions the least homogeneous
set. In Table 10 the mean interperson
Q correlation is shown for each information
condition.
The patterning of the means of interperson
similarity emphasizes the results from the
analysis of variance of ranks; the least
differentiation among assessees was obtained
in the stereotypic-demographic condition,
where the mean of all correlations between
male assessees was .40 and the corresponding
mean for females was .44. In marked contrast,
in the life history interview condition, the
mean interperson correlation for males was
.15 and for females was .14. Information
condition accounts for 33% of the variance in
correlational similarity. Additionally, the
Bartlett-Box F test shows significant (p <
.01) heterogeneity of variance. The life
history interview condition shows less spread
than the other conditions do. An interesting,
albeit unanticipated, finding was that there
was a significant difference in mean similarity
between males and females in the stereotypic-demographic
condition. The females were
described more homogeneously than the
males, t(80) = 2.60, p = .01.

In bold terms these analyses show that
when little or no information is available,
differentiation among assessees is low. On the
other hand, when information is detailed,
intimate, and characterologically revealing,
judges describe their targets in a far more
differentiated fashion. Finally, when the
amount of information available is intermediate,
the level of differentiation is intermediate.
This is perhaps the clearest evidence
that the degree to which judges can differentiate
among assessees is a direct function of
the degree to which differentiating information
is available to them.
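
The similarity index discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: each description is taken to be a list of Q-item placements, the Q correlation is taken to be the ordinary Pearson correlation between two assessees' placements, and the condition-level index is the mean over all assessee pairs. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of item placements."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_interperson_q(descriptions):
    """Mean Q correlation over all pairs of assessee descriptions."""
    pairs = list(combinations(descriptions, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: rows are assessees, columns are Q-item placements.
undifferentiated = [[1, 2, 3, 4, 5], [1, 2, 3, 5, 4], [2, 1, 3, 4, 5]]
differentiated = [[1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [3, 1, 5, 2, 4]]

print(mean_interperson_q(undifferentiated))
print(mean_interperson_q(differentiated))
```

A high mean indicates that assessees are being described alike (low differentiation); a mean near zero indicates that descriptions vary from person to person.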


Discussion 


The study reported above has been concerned
with a basic and fundamental problem
in personality psychology: Are human observers
capable of giving reliable and sensitive
descriptions of personality? Recent writing
Table 10
Analysis of Variance of Interperson Similarity

Condition                    n      M      SD     F

All assessees (df = 5, 479)
Life history interview       80     .14    .09    —
Values interview             [illegible]
Life history protocol        82     .24    [illegible]
Values protocol              [illegible]
Noninterview                 82     .29    .14    —
Stereotypic-demographic      82     [illegible]   .13    —

Male assessees (df = 5, 235)
Life history interview       40     .15    .08    —
Values interview             [illegible]
Life history protocol        41     .26    .13    24.59*
Values protocol              [illegible]
Noninterview                 41     .28    [illegible]   —
Stereotypic-demographic      41     .40    [illegible]   —

Female assessees (df = 5, 238)
Life history interview       40     .14    .10    —
Values interview             40     .21    .16    —
Life history protocol        [illegible]
Values protocol              41     .28    .15    —
Noninterview                 41     .30    .15    —
Stereotypic-demographic      41     .47    .13    —

*p < .001.


in personality assessment has questioned the 
continued use of raters of personality, claim- 
ing that the data they produce are biased, 
unreliable, and insensitive to the “actual 
behavior” of the ratee. The central assertion 
of the criticisms of personality ratings is that 
the ratings do not reflect the “actualities” of 
the stimuli upon which they are based. The 
processes by which observers make their 
judgments are hypothesized to reflect various 
kinds of artifacts (e.g., logical error, illusory 
correlation). This pessimistic conclusion has 
been based on several types of analyses, most 
of which have considered solely the patterning 
or intercorrelations of the ratings. The evi- 
dence from the ratings themselves, however, 
has not generally been evaluated. This step
logically comes prior to examining the structure
of the ratings, and if convincing evidence
of the tenability of ratings can be demon- 
strated, then the results derived from anal- 
yses of the patterning of ratings become 
phenomena to be explained, rather than 
demonstrations of the irrelevance of personality
ratings.

To demonstrate the tenability of observers’ 
ratings of personality in this study, the 
amount and kind of information provided to 
judges was systematically varied, and the 
effects on the subsequent ratings were 
examined. If judges’ ratings were insensitive 
to the actualities of the information, then no 
systematic effects should have been produced 
by the systematic variation in information. 
If, however, variations in information produce 
clear, comprehensible, and predictable vari- 
ations in personality descriptions, the inescap- 
able conclusion is that judges can and do 
respond sensitively to the information from 
which descriptions are made. 

We may now discuss the evidence in this 
study that supports an optimistic view of the 
capacities of observers. A primary demonstra- 
tion of the effects of information on descrip- 
tion examined the differential use of descrip- 
tions. The analysis of variance of ranks of Q 
item variances and the analysis of the mean 
number of total adjectives checked on the 
ACL both provided strong evidence that 
judges perceived, integrated, and responded 
to the information each condition provided. 

An alternative explanation for the differ- 
ential variation of Q item placements across 
conditions might be that adventitious aspects 
of the assignment of cases to judges within 
condition, rather than responsiveness to 
changes in information, produced the ob- 
served results. But this reasoning would not 
explain why items that showed differential 
variation were items with content that was 
relevant to the information condition in which 
they showed increased variation. This second 
aspect of the data argues strongly against 
such an alternative explanation. It argues for 
judges’ competent use of information.

The ACL demonstrates the effects of infor- 
mation on description in a different fashion. 
Information in the protocol and stereotype 
conditions led judges to reduce markedly the 
sheer number of items judged to be descriptive 
of an assessee. Thus, without a forced format, 
as in the Q-sort, the judges’ own descriptive 
behavior on the ACL demonstrates their 
responsiveness to variations in information. 
When the information available lacks imme- 
diacy, i.e., when the person is not physically
present with the judge, descriptions are less
full and rich.

The analysis of interperson similarity probably
provided the most vivid and comprehensible
demonstration of the effects on personality
assessment of variations in information.
If differentiation in personality assessment
is a function of information, then the degree
of homogeneity in description should be
related to information condition. The analysis
showed that the amount of similarity in
description among persons was inversely
related to the amount of information. Additionally,
the homogeneity of variance of mean
interperson correlation decreased in the condition
where information was greatest, the life
history interview condition. This finding,
perhaps self-evident, is nonetheless heartening
to those who have argued for the continued
use of the human observer in personality
assessment. Because the target sample was
the same in each condition, because the
measure used to describe the targets was the
same in each condition, and because the
amount of judge overlap was great, the findings
are best interpreted as showing that
judges altered their descriptive behavior
depending on what information was provided.
The analysis of the level of interobserver
agreement as a function of information condition
yielded less straightforward results. Differences
between conditions were not of [...]
magnitude for the ACL and did not occur
with the Q-sort; differences in the homogeneity
of agreement coefficients were present,
however. Several reasons may be suggested
for finding equivalent levels of reliability
with the Q-sort:

1. The Q-sort distribution is forced, and
therefore [...] a core of agreement [...]
be magnified as a function of the [...]
of means and standard deviations [...]
judges, regardless of the information condition.

2. The item content of the Q-sort is
variable; some items are rarely [...]
strongly characteristic of anyone. [...]
the location of placements for some items is
highly modal, and agreement is [...]
heightened relative to a set with [...] normally
distributed items, whatever the condition
of information.
3. For the stereotypic-demographic condition,
in which agreement was as high as it
was for the other conditions, it may very well
be the case that judges possess a shared cultural
stereotype and can, if need be, draw
upon this for their descriptions when they
have no alternative. Nonetheless, this shared
stereotype does not provide the detailed
information needed to make item placements
on many of the Q items. Hence, the level of
agreement reached is only moderate.

The meaning of the level of agreement that
exists for the stereotypic-demographic condition
is revealed by the analyses demonstrating
that the agreement reached under impoverished
information was reached at the price of
differentiated descriptions of the targets. Thus
judges who do not have personally relevant
information agree with each other about the
same target as much as life history interview
judges do; further, across assessees they agree
to a much larger degree. In fact, the stereotypic-demographic
condition is the only condition
in which there is not a single negative
mean interperson correlation for any assessee.
This is what one would expect if the judges
did not have the requisite information to
differentiate among assessees but nonetheless
had knowledge about characteristics associated
with classes of persons.

The crucial question in regard to interobserver
agreement is whether equivalent
levels across information conditions undercut
conclusions about the sensitivity of observers.
Agreement for all conditions was only moderate.
Parenthetically, it should be noted that
the reliabilities reported are analogous to
[...] correlations. The reliabilities of the
composite descriptions furnished by four or
[...] judges for these 82 cases had a mean
value of .702 and a median value of .729 on
[...]. These composite descriptions are
usually employed in assessment studies
rather than the descriptions from pairs of
judges. Nonetheless, it is still the case that
the level of agreement is equivocal with
respect to the sensitivity of observers. For
example, what should be said about a 32-year-old
housewife and mother of two, having 16
years of education and working part-time as
a substitute teacher? Artifactual accounts of
personality assessment based on implicit personality
theory or semantic similarity of trait
terms seem to require a much higher degree
of consensus than that which in fact occurs.
This suggests that shared assumptions about
personality are not as pervasive as some commentators
have indicated (Shweder, 1975,
1977).

An implication of the findings on agreement 
is that there may be both a floor and a 
ceiling, below or above which consensus can- 
not extend. Given less than total but more 
than minimal information about a large set 
of personality variables for which judgments 
will be requested, judges may well be able to 
agree to a certain point, regardless of the type 
of information with which they are furnished. 
For example, if it can be shown that in con- 
ditions of impoverished information, agree- 
ment among judges is principally a function 
of strong consensus about a small number of 
descriptors, whereas in conditions of more 
abundant information agreement stems from 
more mild consensus about a larger number 
of descriptors, our understanding of the 
nature of interjudge agreement would be 
deepened. Alternatively, it may be the case, 
as some of the clinical judgment studies show, 
that more information leads to the discount- 
ing of base-rate information that is actually 
more accurate and predictive. The role of 
extent of information is psychologically complex,
as this second formulation makes clear.

In arguing for an acceptance of the data 
that observers provide about personality, we 
do not suggest that this method is the only 
appropriate method for study in personality. 
Nor do we suggest that the use of observers 
has reached its practical limits in regard to 
reliability and validity. Training, more famil- 
iarity with the response instruments, and a 
larger number of observers are all ways of 
increasing the reliability and accuracy of 
observers’ judgments. It is, of course, obvious
that the results discussed here do not certify 
that the ratings of these assessees are valid. 
That question must await further analyses 
of the data. But it is reasonable to assume 
that the prerequisites for validity are present, 
and that a clear possibility for impressive 
nonrating correlates exists for these data. A 
key consideration, moreover, in establishing 
accuracy is that for many, if not most or all, 
studies using other methods of assessment,
ultimately validity rests on the correspondence 
of observations with the criteria laid down by 
observers. As argued elsewhere (Block, Weiss, 
& Thorne, 1979), appeals to behavior counts 
are no less dependent on observers than are 
ratings. Disagreement among researchers 
arises regarding the meaning of the observations,
not their necessity.

The present report has asserted that the 
use of the human observer as a personality 
assessment tool is reasonable and useful and 
that personological descriptions furnished by 
observers are reliable and often acutely sensi- 
tive to the nature of the stimuli on which they 
are based. The conclusion is that observers 
are sensitive encoding “instruments.” Despite 
the known biasing effects on ratings of halo 
and the logical error, for example, it is un- 
likely that there exists at present a method 
more economic, precise, and valid for gather- 
ing personality assessment data than that 
utilizing the human observer. The basic 
soundness of the data in this study supports 
this observation. 
Settings of a Professional Lifetime

Roger G. Barker
University of Kansas

This is a venture in ecocentric [sic] autobiography. Behavior settings important
to the author during his 50 years in psychology are identified and described,
circumstances of his entrances into and departures from the settings are reported,
and immediate and enduring consequences for him are set forth. The
article is a microsection of the history of psychology, 1928-1978.

[...] aspects of social psychological research. The
[...] begun under the general editorship of
[...] Greenwald; the action editor for the series [...]

This offspring of my life has benefited greatly
[...] long period of gestation and labor from
[...] and encouragement of Irwin Altman and
[...]. In addition, Irwin Altman had an
[...] part in its conception, and Phil Schoggen
[...] presiding pediatrician after its premature birth.
[...] extend my most sincere thanks. However,
[...] natural parent, I must answer for its de[...]

Requests for reprints should be sent to Roger G.
Barker, Box 98, Oskaloosa, Kansas 66066.

I was Professor of Psychology and Department
Chairman at the University of Kansas
in Lawrence in 1948. On a particular day of
that year, I worked on a research report in
my Study at Home, answered correspondence
[...] Chairman's Office, met my Graduate Class in
Experimental Social Psychology for two
hours, conferred with the Dean of the College 
of Arts and Sciences in the Dean’s Office, 
chaired a meeting of the Department Faculty, 
and attended a Departmental Colloquium; 
these six behavior settings made up my pro- 
fessional environment. During my life in 
psychology, which began, I shall presume, 
when I entered upon graduate study at Stan- 
ford University in 1928, and has continued 
to this day, the stream of my behavior as a 
psychologist has meandered through thou- 
sands of behavior settings. These settings have 
constituted the environment of my profes- 
sional life, and in this article I shall describe 
some of their features and consequences for 
me. 

The settings of my environment have had 
widely varying attributes; I have entered 
some of them by my own choice and have 
been coerced in varying degrees to inhabit 
others; in some settings I have had consider- 
able influence, in others little power. The 
Chairman’s Office differed greatly in these respects
from the Experimental Social Psychology
Class: in the Office there was great di-
versity of both human and nonhuman com- 
ponents (secretaries, faculty, students, type- 
writers, telephones, files, people conversing, 
dictating, typewriting, and more), and all of 
these were incorporated into a relatively 
stable, multiform pattern of Office activities. 
In the Class there was less diversity (no sec- 
retaries, telephones, typewriters, files, people 
conversing or dictating) and a radically different,
less complex standing pattern of lecturing,
listening, and note taking. I entered the 
Chairman’s Office by my own choice, but I 
was under strong compulsion to turn up for 
the Class at the scheduled time. I was power- 
less to change the program of activities of the 
Dean’s Office, but I was able to alter the pro- 
gram of the Class, although my power was by 
no means unlimited. This setting had been a 
part of the College and the University before 
I joined the faculty, and they imposed cer- 
tain features upon it that I was powerless to 
alter: I had to teach experimental social psychology,
the setting’s raison d’être (not ecology,
a great interest of mine at the time), and
I had to give grades (objectionable to me, but 
a policy of the University). Still, within these 
and other limits, I was in command of the 
program of activities. 

When I have inhabited a behavior setting 
—whatever its attributes, the conditions of 
my entrance, and my power—I have become 
one of its component parts, and my behavior 
and experience have been formed by its on- 
going program; these, therefore, have changed 
appropriately as I have moved from setting 
to setting. I did not pick up my mail or dic- 
tate letters in the Experimental Social Psy- 
chology Class or give lectures or quizzes in 
the Chairman’s Office. The impress of some 
settings upon me has not extended beyond 
their boundaries; the stamp of others has 
been long-lasting. The influence of the Chair- 
man’s Office ceased abruptly at the door, 
whereas some attributes of my behavior and 
experience first occurring in response to 
forces within the Class have remained with 
me. In this account, I shall pay special atten- 
tion to behavior settings that had long-con- 
tinuing consequences for me. 

Unfortunately the data I have to work with 
are poor. Few contemporaneous accounts of
any setting are available, and still fewer are
by independent observers. Most of what I am
able to report is the fallible memory of how
settings appeared to one inhabitant. In an
attempt to mitigate the faults of the data, I 
shall confine my account to those settings for 
which there are contemporary data or very 
clear memories and that appear to have been 
crucial to my scientific activity. Fifteen be- 
havior settings or clusters of settings most 
adequately meet these requirements. Here 
they are listed in temporal order of their 
occurrence and with their institutional connections:
Stanford University, 1929-1935
(Terman’s Seminar, Miles’ Later Maturity
Facility, Stone’s Animal Laboratory); University
of Iowa, 1935-1937 (Lewin’s Offices,
Nursery School Laboratory, Topology Meetings);
Harvard University, 1937-1938 (Murray’s
Clinic, Child Psychology Class, Boring
Sack Lunch); University of Illinois, 1938-
1942 (Study at Home, Extension Classes); 
Stanford University, 1942-1945 (Office of 
Disability Survey); Clark University, 1946- 
1947 (Office at the University); University 
of Kansas, 1947-1972 (Office of Department 
Chairman, Field Station in Oskaloosa). 

I shall introduce each of these settings with
its specifications under six rubrics: program,
physical attributes, temporal characteristics,
human components, powers of human components,
and boundary properties. Then I
shall report the forces and circumstances that
brought me into them, the actions and experiences
they elicited from me when I inhabited
them and after I left them, and my power
within them.

But first I must describe the 1928 person
who was to make the trip through these
behavior settings.


The 1928 Person 


He was unsound physically, his intellectual
powers were unknown, his financial resources
were meager, his motivation was [...]
unfocused.

An osteomyelitis infection of the left hip
and right knee had had many [...]
since its beginning 12 years before. In 1928
the disease was quiescent, but acute episodes
were a threat; a flare-up kept him out of 
school during 1929-1930. In its quiescent 
periods the disease was mildly debilitative, 
and its destruction of joints caused some loco- 
motor impairment. 

The 1928 person was 25 years old and 3 
years behind academically. He attributed this 
to missing school because of illness, but he did 
not know what it signified for a career in 
science. Perhaps his isolation in sickrooms 
and his 6 instead of 8 years of secondary and 
college education handicapped him for ad- 
vanced study; he was poorly grounded in 
mathematics and languages; he had studied
no physics and had had only elementary
courses in chemistry and biology. He remembered
many evidences of intellectual marginality
and inadequacy. There was his experience
in the fourth grade, when the teacher
could not decide if he belonged with the A
group or the B group, so she had him sit
alone between the elect and the scrubs. And
he could not forget failing the first course in
high school algebra, having to repeat it, and
getting through the second go by memorizing
the answers to likely problems, ending with
the unanswered question: “How can it be possible
to add, subtract, multiply, and divide
letters, and why do it in any case?” Years
later, too late to be of any comfort to him, he
discovered that George Bernard Shaw had
also had trouble with algebra, “mistaking the
a, b, n and x for goods” such as eggs and
cheese “with the result that I rejected algebra
as nonsense” (Shaw, 1949, p. 41). Shaw
blamed the instruction; “I had been made a
fool of”; the 1928 person blamed himself.

And there was his mother’s aunt, an intelligent
and insightful person, who was devoted
to him during his adolescence and greatly
worried about his future; she summed up her
impressions of his assets and liabilities with
the advice that he should prepare himself to
be a short-order cook in a country restaurant.

There was some evidence on the other side.
There was the occasional, ambiguous remark
about him by partisan adults: Still water runs
deep. But this was less convincing than his
great aunt's explicit evaluation. There was the
thrilling time in the eighth grade when he was
called upon to stand before the class and explain
how to point off decimals in the quotient
of a long division problem, and the teacher 
praised him. Having graduated from Stanford 
University with sufficient promise for the 
Psychology Department to accept him for 
graduate study would have been reassuring 
had there been entrance requirements; at 
least he was not discouraged by Professor 
Terman, the chairman. By far the most persuasive
evidence, which tipped the scale in his
self-evaluation, was the engagement between 
him and Louise Dawes Shedd. Louise Shedd 
knew him well, she herself had high scholarly 
credentials and bright goals for a life in the 
classroom and laboratory, and she was not 
about to link her life with that of a short- 
order cook. He was willing, in fact, relieved, 
to accept her evaluation, and he hoped to 
justify it. 

His financial resources were less than zero, 
for during his undergraduate years he had 
accumulated hundreds of dollars in tuition 
loans that became due to the University in 
years immediately ahead. 

Psychology for him was a means, not an 
end. His undergraduate introduction to the 
subject matter and methods of the science 
had no special appeal to him, but he believed 
it to be the least onerous route to his goal of 
“doing good” for mankind. He was a thor- 
oughgoing, naive idealist and ardent reformer. 
He considered medicine but was sure he would 
not be accepted as a student, and his long, 
discouraging experience as a consumer of 
medical services had alienated him from the 
profession. He found economics too remote 
from particular people, and sociology too 
speculative and wordy. He tried to get a 
handle on demography but could discover no 
people or publications to inform him suffi- 
ciently about it. He did not consider litera- 
ture, art, or the physical and biological 
sciences of any relevance to his goals. Psy- 
chology seemed to be a promising route, but 
even as he began graduate study he did not 
foreclose other routes, and within psychology’s 
varied and vast domain, he had no preference. 
Although the zeal of the 1928 person was 
great, there were clouds on the horizon. He 
anticipated that the psychology route to good 
works would have difficult and unpleasant
stretches, and he was keenly aware of the 
likelihood of recurring acute phases of his 
illness. His prescience in the latter respect 
was verified by the flare-up in 1929-1930, and 
he hesitated about persisting, but for reasons 
impossible for him to comprehend to this day, 
his fiancee stood fast, encouraged him, and 
even married him in the summer of 1930. So 




Stanford University, 1929-1935 


I was a graduate and postgraduate student 
at Stanford University for 6 years. For 4
years I attended classes, did research under 
supervision, led “quiz sections” of elementary 
psychology, did various departmental chores 
(mimeographed class notes and exams, scored 
Strong Vocational Interest Tests, and cared 
for the animal colony), passed examinations, 
and received the MA degree (1930) and the 
PhD degree (1934). Being unable to find a 
regular position in the depression year of 
1934, I was fortunate in being kept on as a 
research associate for 2 more years. The work 
was rewarding and the salary of $100 a month 
was satisfactory; with Louise’s $1,800 a year 
as high school biology teacher we were able to 
pay my tuition loan debts to the University 
and save some money. Three behavior settings 
were crucial to me during these Stanford 
years.


Terman’s Seminar 


Program: Report by seminar member, usually a stu- 
dent, with interruptions by others discussing or ques- 
tioning points made. Physical attributes: Locus, Ter- 
man’s home on the Stanford University campus, in 
an attractive living room of sufficient size for a 
single-row circle of 25 persons. Temporal character- 
istics: Weekly meetings (with occasional skips) for 
about 2 hours during most autumn, winter, and spring 
terms; approximately 20 meetings most years. Human
components: Dr. Lewis M. Terman, one or two
other faculty members, occasional guests, 10-20 
graduate psychology students. Powers of human com- 
ponents: Program determined at three levels of 
power. Terman’s power extended over the entire 
setting (he selected the speaker, approved the topic, 
opened and closed the meeting); the speaker's power
was supreme over the content and method of his
report; the power of the other inhabitants was limited
to supplementing, correcting, criticizing, and approving
the speaker's presentation. Boundary properties:
Admission at the invitation or urging (in the
case of students) of Dr. Terman.


Reports usually described completed re- 
search or plans for research; occasionally
books or monographs were reviewed. Almost
all reports were concerned with research, em-
phasizing methods; theories or general issues
were very rarely seminar topics. Interchanges
between reporters and other members were
usually for more information about proce-
dures. The Seminar was eminently civilized
and controlled, the occasional disagreements
and arguments being muted. Terman set the
tone, being quietly attentive, making few
contributions himself other than introducing
the speaker and ending the session with gen-
eral remarks. Despite its appearance of tran-
quility, many students reported, outside the
Seminar, that it was a tense, even traumatic
experience for them. Many of the controlled
interactions were undoubtedly reactions to
the Seminar as a dangerous place for students;
they were on trial before powerful present and
future evaluators. 


scientific work, and to improve my chances of
doing so by demonstrating what strengths I
had to Terman, the other faculty members,
and my peers. The first and last intentions


ignorance. And to walk on 
in the Seminar, always on the DENE i cond 
disaster and improved footing, is PO! Hine 
ducive to a wide perspective and clear a a 

ing; it is inhibiting. Still, one must ac gi 
remain motionless on the ice, Or on aul 
Seminar, is a sure way to disaster. P A 
tion is essential. Under these Gira not 
motives and abilities and disabilita J 
openly revealed, and valid judgme! i 
sons are not possible, though re 
made, I never knew my reputation 
among the members of the Seminar, but I did know my
self-estimate. I placed myself near the median
of the student seminarians, I was uncertain of
passing the examinations, and I was quite
sure my destiny was at best to become a
journeyman psychologist, perhaps in a junior
college where, after all, there was plenty of
scope for good works. Considering the people
against whom I was judging myself, this self-
estimation is not surprising; during the years
of my attendance they included six who be-
came president of the American Psychological
Association and a number of others who were
honored by the Association for their scientific
contributions.
My immediate reaction after the Seminar
was frequently one of great dismay: at my
ignorance of what seemed to be common
knowledge, at my failure to contribute what
afterward appeared to me to be a valuable
input, and at my espousal of weak, foolish, or
irrelevant ideas. Louise came to dread the
late-night aftermath of the Seminar.
I cannot now remember what I learned; I
am quite sure that I encountered no mountain
peaks of new understanding. But the Seminar
provided a regular update of what was hap-
pening in psychology, especially with respect
to methods of investigation. More than this,
I became acculturated into the language, the
mores, the ethos of the psychology tribe. And
I made acquaintances who have continued to
be valued personal friends and influential
connections within the profession. On the
negative side, I first encountered an aspect of
academic life that has continued to be dis-
tasteful to me: the ubiquity of judging others
(passing and failing students, approving and
disapproving candidates, promoting and not
promoting colleagues, accepting and not ac-
cepting research proposals, and so forth) and
of being judged by others. The fact that most
interpersonal relations within Terman’s Sem-
inar involved, explicitly or implicitly, judg-
ments of personal worth reduced its attraction
as a social occasion and eroded its educational
benefits.


Miles’ Later Maturity Facility


Program: To study change in abilities from middle
to later years of life. Physical attributes: Located
near the central business district of Palo Alto in a
one-time residence. The former reception hall, parlor, 
living room, dining room, and bedrooms were mod- 
ified for use as offices and for administering tests and 
experiments; office furnishings and equipment for 
tests and experiments. Temporal characteristics: Op- 
erated weekdays for several months during academic 
year 1931-32 from 9:00 a.m. to 5:00 p.m. Human 
components: Walter R. Miles, graduate student re- 
search assistants, secretary, subjects. Boundary prop-
erties: Free access by staff; subjects admitted by
appointment. Distribution of power: The power of 
Miles extended over the entire setting and program; 
he planned the research, secured the financing, 
selected the staff, recruited the subjects, approved 
the methods of testing and experimentation. The 
power of each research assistant was dominant within 
his laboratory room. The only power of subjects was 
to refuse to participate in a particular procedure; to 
my knowledge this did not occur. 


I had completed my master’s degree work
under Miles with a thesis on finger maze
learning, and it was now time to undertake
research for the PhD degree. So when Miles
returned from a trip East, bringing the news
of a grant for the magnificent sum of $10,000
to support a study of old age, I eagerly em-
braced the opportunity to participate. For
reasons I cannot recall, I chose to study
muscular fatigue. Miles gave me complete
freedom to devise a muscle fatigue test suit-
able for both robust 50-year-olds and frail
centenarians, and I took considerable satis-
faction in adapting a spirometer to determine
hand and arm fatigue quickly. Other graduate
students investigated changes in intelligence,
learning, memory, motor skills, and sensory
acuity. Subjects were recruited from clubs,
churches, and living groups by paying the
organization for each member who became a
subject. They were brought to our experi-
mental rooms by the secretary. At the time,
I did not appreciate the luxury of having sub-
jects provided to me with no effort on my
part; many times since, I have appreciated
Miles’ efficient logistics.

I greatly enjoyed working within the Later 
Maturity Facility; the freedom within my 
own domain, and the straightforward problem 
were agreeable. I went home most nights feel-
ing I was making progress toward providing
some potentially useful new knowledge.


The more enduring consequences of this 
setting for me were, first, increased confidence 
in my ability to engage in research on my
own; after approving the area of my con-
tribution, Miles took no part in my project;
it became my own. Second, I learned some-
thing about the possibilities and difficulties
of a kind of field study; I observed the ex- 
tensive public relations activities required to 
secure subjects, and I was especially im- 
pressed with the effectiveness of approaching 
citizens by way of the organizations to which 
they were devoted. I also noted that the de- 
mands of public relations, at which Miles was 
very effective, divorced him from the details 
of the research. Because of this experience, I 
believe it was easier for me, when I later 
engaged in field studies myself, to realize that 
they require staffing different from that of 
laboratory investigations with captive or hired 
subjects and that community experts are as 
essential as observers, interviewers, testers, 
and so forth. 


Stone’s Animal Laboratory 

Program: To investigate motivation in animals, 
chiefly rats, with emphasis on sexual behavior. Phys- 
ical attributes: Locus, third-floor attic of the build- 
ing housing the psychology department in an area 
partitioned into a sky-lighted animal room for about 
100 animal cages, perhaps 10 small rooms for as- 
sistants and equipment, and Stone’s large office. The 
equipment was primitive by present standards: stu- 
dent-made mazes, jumping apparatus, activity drums, 
Monroe hand calculators, simple shop equipment. 
Temporal characteristics: Functioned at some time 
every day. Human components: Calvin P. Stone, 
one or two postdoctoral fellows, two graduate as- 
sistants, several thesis students, a few students doing 
class projects, visitors. Powers of human components: 
Stone's power extended over the entire laboratory; 
he approved the fellows, selected the assistants, ad- 
mitted the students and approved their research, set 
rules and standards. Fellows had complete power 
over their own research; assistants controlled their 
particular segment of the total program of the lab- 
oratory after consultation with Stone. Visitors’ power 
was limited to viewing approved areas and asking 
questions. Boundary properties: Fellows, assistants, 
and students admitted by Stone personally; visitors 
were tolerated, but except for professionals, they 
were not welcomed.


As in 1931 (the opportunity to participate 
in the Later Maturity Facility came at a 
crucial time), I was fortunate in 1933 to be 
able to work in Stone’s Animal Laboratory 
when I could not find a regular job. I did not
choose to spend two years in this setting, but
neither did I object; it was inevitable.

I found work in the setting even more
satisfying than in the Later Maturity Facil-
ity. I had passed all examinations and written
my thesis; that tension was gone. The intel-
lectual camaraderie was congenial; postdoc-
toral fellows brought news and innovations
from other laboratories; the pace of activities
was not determined by tightly scheduled sub-
ject appointments. But there was a vigorous,
steady program of activities; almost all
animal work required regular cleaning, feed-
ing, observing, testing, and examining; Stone
was a hard, regular worker himself, and every-
one knew that he valued industriousness right
along with honesty and intelligence. He was
known to disapprove strongly of the policy of
another university that reportedly provided
technicians to do such detail work for grad-
uate students as sectioning and preparing
tissues for examination and running rats in
mazes. In Stone’s Laboratory, students cared
for their own animals and spent the boring
hours putting them through the necessary
procedures. Inasmuch as Stone was in and
out of the laboratory many times most days
beginning at 7:30 a.m., his power and reputa-
tion were sufficient to maintain a fairly tight
ship without many definite rules and regula-
tions.

Special emphasis during this period was on
variation in the age of sexual maturity of rats,
its genetic basis, and its relation to other be-
havior and somatic characteristics. Parallel
investigations were conducted of the age of
maturity of human females and its relation
to their size and their interests and attitudes.
I came to find much of the work interesting,
and some of it had a permanent influence on
me. Even the evening after evening of testing
young male rats for age of first copulation be-
came enjoyable when Louise came along and
read aloud from a high stool in


the animal room. (Conrad, Tolstoy, E eubjects 


the potential value of archival data for psy-
chological science. Stone was interested in the in-
tellectual level of children of abnormally
early sexual maturity, and he had previously
reviewed and summarized the literature. He
set me to updating the survey, and I was im-
pressed to find that although most evidence
was in the form of reports of single or a very
few cases, and none were adequate method-
ologically in terms of number and selection of
cases and methods of testing, still, en masse,
these data, reported by many independent
investigators at widely varying times, in di-
verse situations and cultures, provided over-
whelming evidence that early intellectual
maturation does not accompany abnormally
early sexual maturation. I noted that almost
no cases were reported in the psychological
literature, whereas medical journals served as
valuable archives for these rare cases. I be-
gan to see that the insistence of academic
psychology that every publication conform to
all current canons of scientific adequacy was
depriving the science of data on important
issues. Years later I discovered a pair of iden-
tical adolescent twins, one of whom had been
seriously crippled since childhood, and I saw
this as an exceptional opportunity to obtain
data from a very rare natural experiment on
the effects of physique on personality; I
studied them extensively. But I was advised
not to attempt to get the material published
because “one case means nothing.” I won-
dered: Would a paleontological journal reject
a study of 1 dinosaur egg because the investi-
gator did not present data on 50 eggs? The
lesson I began to learn in Stone’s Laboratory
has stayed with me, and I and my colleagues
have not hesitated to collect “inadequate”
data under certain circumstances, namely,
when the issues are important, when fully acceptable data cannot
be obtained due to their temporal or physical
dispersion or their rarity, and when the data
are such that investigators at other times and
in other places can add to them.
It was in Stone’s Laboratory that I first en-
countered a set of problems that have oc-
cupied me ever since, namely, environmental
effects on behavior. At the time, however,
I did not see the particular problem in this
light. It arose from our studies of relations
between age at menarche and physical and
social development in girls. We discovered
that in the early adolescent years, early ma- 
turing girls are more similar to older females 
in measurements of physique than are late 
maturing girls of the same chronological ages, 
and when we asked if this same relation holds 
for social interests and attitudes, we found 
that it does: Postmenarcheal girls 12, 13, and 
14 years of age are more similar to older fe- 
males in their responses to interest and atti- 
tude test items than are premenarcheal girls 
of the same ages. So we faced this question: 
To what degree are the differences in attitudes 
and interests due to direct hormone influences, 
and to what degree to the fact that the phy- 
siques of the early and late maturing girls 
provide the girls and their associates with 
stimuli with different social significances, 
thereby imposing different social environ- 
ments upon them? We could not answer this 
question with the data at hand. Five years 
later, at the University of Illinois, I returned 
to this problem and found evidence that 
powerful adults (parents, teachers) bring 
greater pressure upon physically mature girls 
to engage in mature behavior than upon 
physically immature girls. And 10 years later, 
when I was back at Stanford, I returned to 
the problem in connection with studies with 
my students of the psychological consequences 
of physical crippling. 

In addition to initiating this particular con- 
tinuing interest, Stone’s Animal Laboratory 
further strengthened an attitude I carried 
with me from other Stanford settings, namely, 
that strong and persisting but rather narrow 
programs of activities are productive. There 
were no brilliant performances at Stanford. 
The research of Terman on the gifted, of 
Stone on animal motivation, and of E. K. 
Strong on interests yielded no remarkable 
breakthroughs, but they were and have con- 
tinued to be recognized as substantial 
achievements. This was not an “Aha!” ex- 
perience for me; but as I look back I can see 
that it became a deeply rooted conviction that 
this was to be my way of doing science. Along 
with this vaguely developing insight there 
was increased self-assurance. At the end of my 
6 years at Stanford, I was more sure that my 
aspirations for a productive career in science 
were not hopeless. However, I did not have 
a strong commitment to any problem. I had 
worked on a rather wide range of problems 
via a considerable number of methodologies, 
but I had become devoted to none. I was 
still unfocused, at the beck and call of 
almost any opening that would allow me to 
earn a living as a psychologist. In the spring 
of 1935 an opening came that radically altered 
my intellectual lifestyle and threw me again 
into uncertainty about myself. 

I took part in many behavior settings at 
Stanford other than the three I have de- 
scribed: classes, lectures, seminars, labora- 
tories, projects, examinations. They enriched 
my intellectual life and widened my perspec- 
tive, but only one of them had particular, 
identifiable consequences for me. This setting 
channeled the stream of my behavior as a 
psychologist into its first great bend. Kurt 
Lewin came as a visiting professor for the 
year 1932-1933. I was very busy during this 
year finishing my doctoral thesis so I was 
able to attend Lewin’s class only as a visitor. 
He attracted me greatly as a person, but his 
psychology confused me or, perhaps more 
correctly, it was incomprehensible to me. 
Fairies and ectoplasm would have been more 
comprehensible than life space, valence, psy- 
chological force, inner-personal regions, sub- 
stitute value, psychological satiation, and so 
forth. Stanford students knew something 
about Gestalt psychology experiments and 
theories of perception from the writings of 
Wertheimer and Kohler, and of development 
from the works of Koffka, but Lewin’s so- 
called Gestalt psychology seemed to have 
nothing to do with these, as we expected. 
And equally disconcerting, Lewin seriously 
reported experiments with seven subjects, and 
instead of replicating the experiments, he 
changed the conditions; even worse, if the 
results from the altered conditions were in 
accord with his predictions on the basis of 
theory, he took this as a verification of the 
findings and the theory. I suppose Stanford 
University in those days was among the least 
auspicious places in the United States for an 
understanding of Lewin; theory was almost 
a nonword in the psychology department, al- 
though we did use it in connection with 
Spearman’s interpretation of intelligence test 
intercorrelations (theory of general intelli- 
gence) and we read about psychoanalytic 
theory. But most of us had no background in 
the philosophy of science, and the place of 
theory in science. So Lewin’s Dynamic Theory
of Personality, which I reviewed (with Ter-
man) and which he taught in his course, was a
transient foreign body, a UFO, to most of us.
I cannot recall that any of the students who
attended his class took and retained a serious
interest in his viewpoint unless they had later
association with him. The gulf was too wide
to be bridged quickly, and the dissonance was
so great that some rejected his ideas out of
hand. I was not negative; I was tolerantly
baffled.
So when the opportunity arose to join 
Lewin as a General Education Board Fellow 
at the University of Iowa for the year 1935- 
1936 I was intrigued; here was a chance to 
learn a different kind of psychology, to earn
a full living for Louise and me for the first
time, and to have the prestige of the fellow-
ship. Indeed, I had no choice; circumstances
beyond my knowledge and control were turn-
ing the stream of my professional behavior,
indeed my whole life, in a new direction. I
have no idea to this day how my nomination
to this fellowship came about, but I am sure
that an important factor on the Stanford side
must have been: It is really time for Barker
to push off the home place. I thought so, too.


University of Iowa, 1935-1937


Kurt Lewin went to the Iowa Child Welfare
Research Station in the autumn of 1935 on a grant
from the General Education Board, with pro-
vision for three research assistants. The
assistants were Tamara Dembo (who had
been a student of Lewin in Berlin), Herbert
F. Wright (who had been a student of Donald
K. Adams at Duke University; Adams had
studied with Lewin in Berlin), and I. The
fellowship was extended in the spring of 1936
for 1 more year. Three behavior settings were
of primary importance to me at Iowa.


Lewin’s Offices 


Program: To plan the research project we were to
undertake, and to discuss and make decisions about
procedures and problems when it was underway; to
analyze data from the project and write up reports;
to read the writing Lewin was doing, to listen to his
proposals for reformulations and additions, and to
criticize and make suggestions. Physical attributes:
Located in four adjacent office rooms on an upper
floor of East Hall, University of Iowa campus;
the building had been a hospital and the rooms were
former single-patient rooms; they were a little
crowded with a desk, a bookshelf, a blackboard (in
Lewin’s office), and four chairs when we were all
present in a single office. Temporal characteristics:
In operation at some time almost every day, with
one or more participants; sessions with Lewin oc-
curred whenever the project needed his attention and
whenever he had something to discuss. Human com-
ponents: Lewin, the three assistants, frequent vis-
itors. Boundary properties: Assistants selected and
admitted by Lewin; free entrance for visitors.
Powers of human components: Lewin established
the program of the setting; he decided that the re-
search would be upon frustration, but the details of
procedure were worked out in group consultation,
where he was first among equals; he was, of course,
in total control of his own writing, the assistants
serving more as persons upon whom he could try out
his ideas than as consultants or collaborators, al-
though he took seriously all criticisms and sugges-
tions.


It is difficult to imagine more different cli-
mates for scientific work than those obtaining
at Stanford and at Iowa. At Stanford, science
gathered facts about behavior via experiments
and tests, determined their central tendencies,
and analyzed their interrelations; in Lewin’s
group, science explored ideas about behavior
via experiments and observations. At Stan-
ford, conclusions were in terms of the means,
dispersions, and correlations of samples of
facts about behavior; my own work discov-
ered that the rate at which 80-year-olds pump
air into a spirometer declines faster during a
test period than the rate at which 50-year-
olds pump air. At Iowa, conclusions were in
terms of verified or altered ideas about be-
havior; in our work there we found (in ac-
cordance with a theory) that in psychological
frustration, inner-personal systems can be con-
sidered to be in a state of blocked tension that
amounts to functional dedifferentiation, one
expression of which is lowered level of intel-
lectual activity, that is, intellectual regression.
In the beginning, the sessions in Lewin’s
Offices were an ordeal for me; they were be-
wildering and tiring. The ideas we were to
explore and clarify in connection with the
frustration study were too unfamiliar for me
to make many contributions; as for the mono- 
graph Lewin was working on, The Conceptual 
Representation and Measurement of Psycho- 
logical Forces, the ideas were utterly baffling. 
Lewin’s eagerness and the energy to back it 
up seemed boundless, whereas the tension and 
alertness of the 2-hour sessions of Terman’s 
Seminar left me dog-tired; after meeting with 
Lewin, Dembo, and Wright from 2:00 to 7:00 
in the afternoon (with Lewin reading aloud 
what he had dictated the day before, interlin- 
ing new sentences, rearranging the order, vio- 
lently objecting to a criticism by Dembo, 
turning to Wright—“Herbert, is she right?” — 
accepting Dembo’s criticism, diagramming a 
relation on the blackboard, crossing the whole 
page out, dictating a new version to Dembo,
and so forth), I was ready to drop. After five
o’clock I would hope beyond hope that my
dear, pregnant wife, lonesome at home, would
telephone that I was urgently needed. Some-
times she did. A frequent concluding remark
by Lewin was “We must think about this,”
and that after 3, 4, 5 hours of nothing else.
Did this add up, perhaps, to a kind of brain-
washing? In any case, as the months went by,
I began to understand Lewin, and his ideas
have remained at the center of all my sub-
sequent work. But equally important to me
has been the new, higher level of intellectual
effort to which I became adapted. I could
never come close to Lewin’s intensity; I had
to take it much slower, but his refrain, “We
must think about this,” has stayed with me.
Although no one could have been more sub-
ordinate to Lewin in terms of knowledge, I
was always treated as a colleague, never as a
pupil. My contributions, however naive, were
always taken seriously, Lewin often seeing in
them more than I had intended.


Nursery School Laboratory 


Program: To carry out the research on frustration
we planned in the offices in East Hall. Physical at-
tributes: Two ground-floor rooms in a remodeled 
residence across the street from the station’s nursery 
school equipped with a one-way observation booth, 
child-sized chairs and tables, toys, and a movable 
barrier in accordance with the research design. Tem- 
poral characteristics: Setting occurred intermittently 
during winter and spring of 1936-37, according to a 
schedule of appointments. Human components: 
Tamara Dembo and myself, alternating as observer 
and experimenter, and a single child from the nursery 
school at each occurrence. Boundary properties: 
Free access to Dembo and me; child subjects, a 
different one on each occasion, selected and required 
to attend in accordance with decisions made by us 
and nursery school staff. Powers of human compo- 
nents: Dembo and I had joint authority over the 
entire setting, although, according to the experi- 
mental design, we did not intrude into regions desig- 
nated for free play; child subjects had no power 
except in designated free play regions. 


Basic to much of the early research Lewin 
initiated is the theory that psychological ten- 
sion systems correspond to intentions to en- 
gage in molar actions. The studies of inter- 
rupted and substitute tasks issued from this 
theory. Lewin came to Iowa with the intention 
of investigating this idea further; he thought 
that the undischarged tensions that occur in 
frustration result, via increased rigidity and 
spread of tension to adjacent regions of the 
person, in functional dedifferentiation, and 
that one manifestation of this is intellectual 
regression to a state normal for an earlier age. 
The problems for the assistants were to devise 
a frustrating situation and methods of assess- 
ing intellectual level inside and outside this 
situation. The nursery school setting in which 
we worked on these problems elicited from 
me two insights that have remained with me. 
I discovered the value, for studies of molar 
actions, of situations where the investigator 
intrudes not at all or very little, and the value 
of detailed narrative records of behavior. The 
frustration study involved, for the most part, 
a continuation of the child’s usual nursery 
school day, and in his free situation, we were 
able by means of nonintrusive narrative rec- 
ords to assess intellectual level as accurately 
as formal intelligence tests do. In later work, 
my colleagues and I have come to depend 
very extensively on narrative records of be- 
havior in situations that are completely free 
of our influence; and we continue to be im- 
pressed with the value of ordinary language 
as a coding system for the subtleties and com- 
plexities of behavior. The Nursery School 
Laboratory was a welcome refuge for me from 
the conferences in Lewin’s Office. Here I 
found a kind of therapy in applying my Stan-
ford expertise to the gathering and analysis
of masses of data.


Topology Meetings 


Program: To discuss Lewinian theory and research
findings. Physical attributes: Occurred in
conference rooms of various universities. Temporal
characteristics: Took place during Christmas vaca-
tion for two or three full-day and evening sessions.
Human components: Lewin and invited guests; the
latter were former and present students and col-
leagues, and a few others interested in Lewin’s
ideas. Boundary properties: Invitations issued by
Lewin and by people at host school. Powers of hu-
man components: Designated persons at host insti-
tution controlled local arrangements, Lewin arranged
program of papers; free discussion.


The Topology Meetings were as much a part
of my Iowa experience as were Lewin’s Offices
and the Nursery School Laboratory, even
though the meetings were held at Cornell Uni-
versity and Bryn Mawr College in those
years. They agitated me somewhat as Ter-
man’s Seminar had, although the testing
aspect was less pervasive and the learning
aspect more pervasive; still these settings
were new and strange, and I was alert and
cautious. In them my world expanded greatly
to encompass many new friends and acquaint-
ances, many new concepts applied to new
problems, and many new locales and institu-
tions (I had not previously been east of the
Mississippi).
The adjustments I made to the three crucial
behavior settings at Iowa were only part of
the intellectual turmoil I experienced. There
were other upsetting settings. In some of them
Herbert Feigl expounded physicalism and the
views of the Vienna Circle; in others, Spence’s
kind of Hullian behaviorism was expounded.
Iowa, for that brief time, at least, was a bub-
bling cauldron of antagonistic ideas. With
my nontheoretical background, I was severely
buffeted by the strongly asserted convictions
of these sophisticated advocates. Lewin’s
handling of these disturbances impressed me
and has been a model for me since. Although
he defended his viewpoints strongly, he was
not rigidly partisan. An early remark
to a student objector at Stanford, when his
English was poor, expressed his position:
“Can be, but I sink absolute uzzer.” He did
not think current controversy would settle
basic issues, that only empirical evidence
would in the long run sift the true from the
false, and that the business of science was to
go ahead with empirical tests. He strongly
deplored the disruptive partisanship within
German psychology.
Before the end of my second fellowship
year a new opportunity arose that I could not
decline; an instructorship at Harvard Uni-
versity to teach a course in child psychology.
As with the General Education Board Fellow-
ship, I have no knowledge of the forces that
brought this opportunity to me. But again,
after 9 years, the stream of my professional
behavior was to flow in a ready-made chan-
nel. Previously, the Later Maturity Facility,
the Animal Laboratory, and the General Ed-
ucation Board Fellowship had taken charge
of my life in psychology, and now the Harvard
Psychology Department took over with no
effort by me. But there was a difference:
Whereas previously I had been in the role of
pupil (to Terman, to Miles, to Stone, to
Lewin), now I was to be on my own in my
teaching and in my research.

Harvard University, 1937-1938


I arrived at Harvard under the cloud of an
illness, not my old familiar trouble, but sus-
pected appendicitis. I presume now, more
than I admitted then, that the symptoms were
of psychosomatic origin. And why not? I was
confronted with the task of teaching a subject
new both to me and to the Harvard Psychol-
ogy Department (Child Psychology had not
been in its curriculum before) to hypercrit-
ical Harvard students (Boring warned me of
them in a letter in which he also expressed
the concern of the Department about its rep-
utation among undergraduates for good teach-
ing) before the eyes of some of psychology’s
top brass (Boring, Allport, Murray, Lashley)
and its brightest lieutenant colonels (S. S.
Stevens, B. F. Skinner, Robert W. White).


Murray’s Clinic


Program: To carry out studies of personality; the
program was minimal this year, however, as Murray
was on sabbatical leave. Physical attributes: Located
in a converted residence on Plimpton Street, a block 
and a half from Emerson Hall, headquarters for the 
Department; clinic office, staff offices, library, ex- 
perimental rooms, shop, kitchen; the library and 
several offices were elegantly furnished with fine rugs 
and period furniture. Temporal characteristics: In 
regular operation weekdays and at any time for par- 
ticular staff members. Human components: Acting 
director, three or four staff and/or psychology de- 
partment members, a postdoctoral fellow, several 
students, clients and subjects, secretary, shop man. 
Boundary properties: Free access to official person- 
nel, clients and subjects by appointment. Powers of 
human components: The acting director was Robert 
W. White, who had final control of the entire set- 
ting, although staff members and the fellow were 
quite independent; the secretary and shop man were 
semiautonomous within their areas; powers of clients 
and students were limited by the staff members with 
whom they were associated. 


Murray’s Clinic was my salvation: It was 
leisurely and quiet; my office was large, with 
comfortable chairs and a couch; after the 
hectic pace at Iowa, it was a refuge. Here I
could sit and think about where I had been 
and where I was going, emerging only at 
intervals to grapple with my one class. Part 
of where I had been was with me in the form 
of a partially completed manuscript about the 
Iowa frustration experiment; sitting on my 
desk in a big box, it was a continual irritant, 
but it did not claim or oppress me. I tested 
where I was going by doing an experiment on 
conflict resolution; it was a Lewinian experi- 
ment and it was congenial to me, but I did 
not see it as a channel to a lifetime of re- 
search; I did only one other study of conflict. 
I believe I became dimly aware, then, that 
further investigation of conflict resolution 
would lead to finer grained and more complex 
theories and more detailed and precise experi- 
mental procedures that did not suit me. This 
may have been the beginning of my aversion 
to the reduction of theoretical explanations 
of molar events to theories of their more 
molecular components, with the implication 
that the latter are the more fundamental. 
Murray’s Clinic was my first experience of a 
behavior setting without a coercive program. 
Whereas the settings at Stanford and Iowa 
took me in hand and put me through their 
paces, Murray’s Clinic protected me from im- 
positions. Settings of this kind have fortu- 
career.


Child Psychology Class 


Program: To teach child psychology to undergraduate
students. Physical attributes: Classroom for
about 40 students in the basement of the Clinic; 
chairs, podium, blackboard; furnishings and decora- 
tions somewhat shabby. Temporal characteristics: 
One hour’s duration, three times weekly for one 
semester. Human components: Instructor and 35 
students. Boundary properties: Access only for in- 
structor and registered students. Powers of human 
components: Program under complete control of 
instructor; powers of students limited to expressing 
approval or disapproval. 


The sensitivity of the Psychology Depart- 
ment to the students’ regard for the teaching 
it provided caused me to approach this, my 
first teaching, with more than my usual trepi- 
dation. I worked hard welding my Stanford 
and Lewinian background into what I am 
sure was a unique course of study. There were 
a few rough places in this setting, marked by 
some foot shuffling, but it ended with an ac- 
ceptable round of applause. So I emerged 
with some confidence in my teaching ability 
and with a course outline that was a long- 
time asset. 


Boring’s Sack Lunch 


Program: To combine an economical lunch with 
friendly conversation among colleagues. Physical attributes:
Locus, Boring’s office in Emerson Hall;
table around which eight people could sit comfort- 
ably; sack lunches; in the background a desk and 
shelves filled with books. Temporal characteristics: 
1:00 to 2:00 p.m. most weekdays. Human compo- 
nents: Boring and, usually, five or six staff members 
and visitors. Boundary properties: Access at Boring’s
invitation. Powers of human components: Boring’s 
benign influence pervaded the setting. 


I was flattered to be invited. I expected
stimulating conversation of the kind I had 
heard took place at the high tables in the 
English colleges, but in this I was disap- 
pointed. Despite the high caliber and great 
achievements of a number of the attenders 
(B. F. Skinner, S. S. Stevens, Boring, occa- 
sionally Lashley and Beebe-Center), small 
shoptalk about equipment, library resources, 
particular research results, publishing problems,
together with bantering but restrained
gossip about current university, community,
and national affairs prevailed. I emerged from
this setting with some impressions of eminent
and soon-to-be eminent people, but not much
of professional value to me. It did strengthen
my belief, originating at Stanford and
strengthened at Iowa, that the effective men
in the science were blue-collar, hard-hat
workers, not gentlemen scholars. Here was
Boring, well past middle age, the Mr. Psychology
of the profession in America, eating
his cold egg sandwich with the boys and discussing
the nuts and bolts of research and
writing, not lunching gracefully in the Harvard
Faculty Club with his professional and
administrative peers on the Club’s fine
horsemeat steaks.

My time at Harvard was one of
expanded perspective, not of psychology as a
field of study, but of the people, institutions,
and cultural milieu of the science. Other settings
that contributed to this expansion were
conferences sponsored by the Macy Foundation,
where I encountered the eastern [...]
the child development movement and for the
first time directly heard psychoanalysts expound
their views.

Although I was no longer in the role of
student, I was still a probationer. As I understood
the Harvard policy, it was: 3 years at
most as an instructor and then up or out. As
I saw no possibility of my moving up, I determined
before the end of the year to seek another
position. Behind this decision there were
a number of considerations: uncertainty that I
would be appointed for a second year;
apprehension about finding a place in 2
years, when I would need one; the feeling that
9 years as apprentice (6 years at Stanford, 2
at Iowa, 1 at Harvard) with the continual
strain of being tested was enough; another was the
realization that I was veering sharply away
from my goal of using psychology for the
direct benefit of people. So when I learned that
a position as Assistant Professor of Educational
Psychology was open in the College of
Education at the University of Illinois, I applied
for it and was appointed. As it turned
out, another year at Harvard was offered to
me, and some surprise was expressed when I
chose Illinois. But Louise and I went with
few regrets; we were at last taking a hand in
the direction of my professional career, our
economic future did not come to a dead end
in 1 or 2 years, and education was surely a
promising place to apply what I had learned
at Stanford and Iowa.


University of Illinois, 1938-1942 


We arrived in Urbana, Illinois, in July 1938 
with our 2-month-old and 2-year-old children. 
We rented a house while looking for one to
buy; we were ready to settle down.


Study at Home 


Program: To get on at last with my own version of 
psychology. Physical attributes: We soon purchased 
an old house from a retired academic who had made 
himself a study to match, on a small scale, the 
library on Plimpton Street: paneled walls, a fireplace 
that worked, many shelves, stained glass windows,
space for a large desk and work table. Temporal
characteristics: Occurred at my discretion during the
Illinois years. Human components: Myself, my students,
Louise, and our children. Powers of human
components: I was in overall charge, but in these
years Louise began to participate in the research, a
partnership that has continued, with some breaks
when she has had outside jobs. Boundary properties:
Penetrable to me, Louise, and children at any time;
to students on invitation.


What I had prepared for the Harvard students
of child psychology seemed good enough
also for the Illinois students of educational
psychology, so I turned early from preparations
for my class to other undertakings. Two
of these were rooted in Stone’s Laboratory,
two in Lewin’s Offices, and one was a “do
good” effort.

The question of whether differences in the
attitudes and interests of premenarcheal and
postmenarcheal girls have a basis in the social
significance of their different physiques was
unanswered 5 years earlier, so I returned
to it now and found evidence that, indeed, the
adult associates of mature adolescent girls
provide a different (more “mature”) social
environment for them than they do for immature
girls. And I expanded my concern for
the environmental significance of physique to
physically disabled people and asked if they
also live in a different environment from that
of physically normal people by reason of their 
physiques. I began a long involvement with 
this question with two case studies. Although 
the roots of my interest in the environmental 
significance of physique were in Stone’s Lab- 
oratory, my interpretations and theories came 
from Lewin’s Office, and undoubtedly my ex- 
pansion of this interest to physical disability 
was influenced by my personal experience with 
the problem. 

The unfinished manuscript of the frustra- 
tion research still occupied a prominent place 
on my desk, and I continued to wrest time 
from more immediately interesting new tasks 
to inch along toward its completion. And the 
conflict resolution study had become such an 
uncompleted task for me that I was impelled 
to replicate (and verify) the findings by an 
entirely different method. 

My “do good” effort was carried out in col- 
laboration with Herbert F. Wright, my fellow 
Fellow at Iowa, and Jacob S. Kounin, my 
new colleague at Illinois. This was a collection
of reports of psychological research in
the field of child development. It was in- 
tended to provide a “bookshelf” of primary 
research for students of child development 
and thereby promote scientific child study as 
a field for both practitioners and scientists. 
It is my impression that this was the first 
“reader,” as they are now called, in child 
psychology, and perhaps in psychology as well 
(Barker, Kounin, & Wright, 1943). 

My Study at Home was not only a place to 
get on with my version of psychology as a 
science; it became a place to teach my ver- 
sion, as well. In the early autumn of 1940 an 
acute phase of the bone infection occurred, 
and I was completely incapacitated for the 
term. Jack Kounin came to my rescue, taking 
over my classes, presumably for a short time.
But as it turned out, I was out of commission 
for 4 months. Due to Jack’s heroic efforts and 
the tolerance of the College and University, 
my pay continued. In the second semester, 
although still in a full body cast, I was able, 
again with the kind indulgence and special 
arrangements of the College, University, and 
students, to meet small classes in my Study 
at Home. Perched on a high tavern-type chair 
within a specially built surrounding pulpit, I
was able to expound the word to a score of 
students at a time. 

Again, as with Murray’s Clinic, my Study 
at Home saved me from possible disaster. 
This one nourished projects I brought to it 
from Stanford and Iowa, and in it new, long- 
continued undertakings originated. 


Extension Classes 


Program: To teach educational psychology at the 
graduate level to primary and secondary school 
teachers. Physical attributes: Located in classrooms 
of state teachers’ colleges, which did not at that time 
offer graduate-level instruction. Temporal character- 
istics: Occurred on Saturday mornings for two hours 
during each semester. Human components: Instruc- 
tor and 20 to 40 students from the schools of the 
area. Boundary properties: Only students satisfying 
registration requirements of the University were admitted.
Powers of human components: Program
controlled by instructor.


The Saturday trips from Urbana to out- 
lying teachers’ colleges, usually by auto- 
mobile but in some cases by train, were 
enjoyable and satisfying. The mature, prac- 
ticing teachers brought me into contact 
with real teaching problems in the towns 
where they taught; they educated me perhaps 
as much as I educated them. And the trips 
revealed to me something about the rural 
Midwest. I was intrigued with the small
towns, and elsewhere I have related how the 
general problem that dominated the last 25 
years of my professional life occurred to me 
as I was rushed through them by the train 
on the trip from Urbana in the center of Illinois
to Carbondale in the south (Barker &
Associates, 1978). In short, I had an over- 
whelming negative “Aha!” experience: Here 
I was, a native of the culture and an expert 
on child behavior (and especially on frustra- 
tion) who knew no more about the everyday 
behaviors and environments of the children of 
the towns than laymen know. I was aware, 
too, that other child psychologists knew no 
more than I did, and furthermore, that we had 
no means of discovering more; no methods of 
determining the extent and conditions of 
frustration, joy, anger, success, conflict, problem
solving, fear, and so forth among the
towns’ children. I thought how different the
position of an agronomist would be. He would
know or could determine the kinds, yields, and
qualities of the crops we were passing, the 
properties of the soils in which they were 
growing, and the relations between soil con- 
ditions and output. This was the beginning 
of a growing conviction that a science that 
knows no more about the distribution in na- 
ture of the phenomena with which it is con- 
cerned than laymen do is a defective science, 
and it was the beginning of my impression 
that small towns of the kind I observed in 
Illinois and learned about from the teachers 
are favorable places to begin to remedy the 
defect. It was seven years before the seed 
planted in the Extension Classes and on my 
trips across the plains of Illinois began to 
sprout.
It was not long after we went to Illinois to 
“settle down” that irritants began to appear. 
There were the disagreeable administrative 
impositions upon my teaching that finally became
intolerable to me. And I was soon to discover
that my flight from a marginal position
vis-a-vis the upper uppers of the profession
at Harvard had landed me squarely with the
lower lowers. I attempted to establish collegial
relations with psychologists in the Psychology
Department but without success. Part
of the difficulty was structural: I was separated
from them spatially, temporally, and
administratively; I was in a different building,
their seminars and other gatherings often
conflicted with my staff meetings and other
duties, and administrative messages were on
different communication networks. And in
addition, I found that a psychologist in a
college of education is distinctly lower-class.
This so impressed me and it was so important
to me personally that I used my experience
as evidence for an analysis of the values and
dynamics involved. Finally, I became discouraged
about my ability to combine scientific
work with applications. I began to see
that it required more than clear lectures to
alter the practices of teachers. My own experience,
namely, that because of impositions
from the encompassing system I could not
practice in my own teaching at the University
the principles that my science showed to
be true, should have made this immediately
clear to me, but it took time. I struggled on
with increasing dissatisfaction. Finally, in the 
spring of 1942, these multiple environmental 
stresses became so great that I wrote to Ter- 
man telling him of my disenchantment. To 
the surprise and delight of both Louise (who, 
surprisingly, did not thrive with a troubled 
mate) and me, Terman answered immediately 
with the offer of a place at Stanford for the 
duration of the war. 
At Illinois I learned some important things 
about myself and about my profession; I 
closed out some unfinished tasks and initiated
some new ones. I look back on the time there
as a rough but beneficial passage. 


Stanford University, 1942-1945 


The Stanford appointment was as Acting 
Associate Professor. I understood clearly that
this was an emergency appointment to bolster
a war-depleted staff. At the time its temporary
nature did not trouble me or Louise; the ad- 
vantages were many: I was free to teach as I 
pleased (the motto for Stanford of its first
President, David Starr Jordan, “The winds of
freedom are blowing,” was fully realized in
the classrooms of the Psychology Department);
I was again a member of psychology’s
upper class; we were home among friends;
Louise was welcomed back on a part-time
teaching basis to her old school; my health
was on the upgrade and promised to improve
more in the California sunshine. We were in
such good shape, in fact, that our third child
was born in the spring of 1943. Furthermore,
it turned out that I was able to make an important
immediate application of my psychological
knowledge.

But not all was rosy. The war was ever
present on the West Coast, with blackouts,
shortages, relatives and friends embarking for
Asian combat, and casualties disembarking.
The teaching load was heavy; in addition to
the regular offerings, short cram courses were
given for officers in training. I taught a number
of subjects new to me, so I had to do
much homework to keep ahead of the classes.
Most of these 3½ years at Stanford were a
steady grind. Only one setting stands out as
of special significance for my career. It came
about through a fortunate conjunction of (a) 
my desire to make a contribution to the war 
effort; (b) the national need for rehabilita- 
tion service for war casualties; (c) my inter- 
est in, my exploratory research upon, and my 
personal experience with, physical disability; 
and (d) resources that the Social Science Re- 
search Council (SSRC) made available. A 
crucial person in tying together these separate 
strands was my colleague Quinn McNemar, 
who was a member of the relevant SSRC com- 
mittee; it was at his instigation that the 
Council funded the preparation of a survey of 
what was known about the psychological 
aspects of the rehabilitation of the physically 
disabled (Barker, Wright, Meyerson, & 
Gonick, 1953). I set up a special study at 
home for this project. 


Office of Disability Survey 


Program: Preparation of monograph Adjustment to 
Physical Handicap and Illness. Physical attributes: 
We were living in a house built by Walter Miles, 
my former professor and thesis adviser, which had a 
fine, spacious, isolated study; it became headquarters 
for the monograph project. Temporal characteristics: 
Occurred whenever I had time during the years 
1943-1945. Human components: When the magnitude 
of the project became clear, I was fortunate in ob- 
taining contributions from Beatrice A. Wright, Lee 
Meyerson, and Molly Gonick. Boundary properties: 
Ready access by the four participants. Powers of 
human components: Overall control was in my 
hands; others had complete autonomy for their own 
contributions.


In this setting I experienced some of the 
great satisfactions of my professional life, and 
they have continued to this day. In the first 
place, I felt competent; I believed I was as 
able as anyone to do the job at that time.
Second, I saw this as my first opportunity to 
bring my skills to bear upon an urgent prac- 
tical problem. As it turned out, the time was 
ripe for this; beginning with World War 2, 
concern and services for the disabled have 
increased greatly, so any firm foundations laid 
in 1943 have had continuing consequences. 
And third, I believed at the time that I was 
able to contribute to this firm foundation by 
pointing out that some psychological problems 
of the disabled are not unique to them, that 
adolescents and ethnic minorities, for example,
face the same problems; I was able
to do this in terms of Lewin’s concepts of 
marginal men and overlapping situations. I 
have not followed the course of rehabilitation 
psychology in recent years, so I have no first- 
hand knowledge of the permanence of this 
contribution, but I am told that the monograph
is still in demand. Clearer evidence of
the enduring consequences of the Disability 
Office is the fact that both Beatrice Wright 
and Lee Meyerson have become leaders of 
this field of psychological study. 
In 1945 I wore the child psychology label, 
primarily, and there was not much demand 
for this branch of the science at Stanford. 
Members of the Department had ambitions to 
develop a program of studies in child psychol- 
ogy, but money was scarce and progress slow; 
their principal efforts went to accommodating 
regular staff members returning from war ser- 
vice and to filling vacant, established slots. 
So in October 1945 I was told that there was 
no hope for me beyond the next spring semes- 
ter. As of June 1946 I would be through at 
Stanford. The likelihood of this subsidence of 
the stream of my professional life had long 
been within my time perspective, but the 
reality was more vivid than the distant prospect.
It gave a new urgency to the somewhat
relaxed inquiries I had initiated before reality 
descended. I had communicated about pos- 
sible jobs with a number of schools, including 
Clark University, and in the summer I had an 
interview with President Atwood about the 
vacant G. Stanley Hall Professorship in Child 
Psychology. I did not take this possibility 
seriously. So, when a definite offer came from 
Clark in November, Louise and I were almost 
overwhelmed by the drastic alteration in our 
prospects. But we were able to make the 
transfer; we arrived in Worcester, Massachu- 
setts, for the second semester in January 1946. 
Two firsts of some importance occurred during
the 3½ years at Stanford. One was advising
graduate students on their thesis research.
In line with my interest in the disability 
problem and my work on the survey, three 
students undertook theses on attitudes toward 
the disabled. The other first was the establishment
of the Disability Survey Office as a
major behavior setting with a program after
my own specifications; until this time, apart
from minor side excursions, I had operated in
behavior settings with programs arranged by
others. At Stanford I was born as a psychologist,
and on my return 15 years later I
gained these two evidences that I had at last 
reached my majority. 

Louise and I left Stanford older and with 
more experience of the stream of university 
life. We had drifted through the dangerous, 
white waters of Harvard, portaged to the slow 
Illinois channel with its sandbars and drift- 
wood, portaged again to the steady, full flow 
at Stanford, on whose banks we were briefly 
stranded until an unexpected flash flood car- 
ried us to Clark. One more portage was to 
come.


Clark University, 1946-1947 


Clark turned out to be a bayou for us. The 
top administrators changed six months after 
we arrived, the psychology department had 
been understaffed for some time and had an 
acting chairman, and I was uncertain of the 
direction of my next efforts, having completed 
the disability monograph. With so much in 
the process of change, all of us—the Uni- 
versity, the Department, and I—marked time. 
It was now that the seed planted in my mind
on the plains of Illinois began to grow, and I
seriously considered a project to discover and
describe the living condition and behavior of
the children of a small town. I took two actions.
I explored the towns around Worcester,
and I discussed the project with Kurt Lewin,
who was then at the Massachusetts Institute
of Technology and was living only 30 miles
away. The exploration was discouraging. The
small settlements of the region were not self-contained
towns as in the Midwest; they were
spatially dispersed and often specialized segments
of political units (towns) often straggling
many miles along streams or [...]
forested ridges. The children of these
settlements did not have a common community [...].
My discussions with Lewin
were encouraging, and he was enthusiastic.
He, himself, was moving from laboratory
experimental research to “action research”
in communities and institutions to
study the consequences for the inhabitants of
induced changes in their structures and programs.
So he was supportive of efforts to
establish baselines in unaltered communities
against which to assess the effects of changes.
With his encouragement, I worked on an application
for funds for a community study
with the intention of doing the research at a
distance from Worcester if necessary.
And now luck was with us. The United
States Public Health Service was expanding
its support of basic research, particularly on
children. The advisory committees included
persons familiar with my work whom I had
met at Stanford, Iowa, Illinois, and the Macy
Foundation and Topology Meetings. The
project was approved, and the research was
funded early in 1947. And then Dean Lawson
of the University of Kansas turned up in
Worcester. He was looking for candidates for
the chairmanship of their Psychology Department.
The department was at a low ebb, and
the administration was ready to make a new
beginning and provide the support required
to bring psychology at Kansas to its former
vigorous state.

Why should I be interested? I had just
come to Clark, to an endowed professorship
with some status and to a department that
also was set to make new beginnings. One
could interpret my career as exhibiting clear
signs of instability; a solid and frank New
Englander, on hearing from Louise of the
towns in which we had lived, remarked, “I see
your husband is something of a floater.”


[...] required. Did he know of such a town in
the Lawrence area? “Yes,” he said without any
hesitation, “I know the place. I’ve spoken
there several times. Its name is Oskaloosa.”
I made two trips to Lawrence that spring; on
the second trip I asked my former fellowship
colleague at Iowa, Herbert Wright, to come
along and consider joining in the department
and the research. We visited Oskaloosa. He
came and he joined. So Louise and I made
one more portage. 


University of Kansas, 1947-1972 


We arrived in Lawrence in late August 
1947 in a 115° heat wave. But, no matter, 
important things were underway. Fritz and 
Grace Heider were joining us, along with 
Herbert and Lorene Wright. There were 
houses to rent, a new department office to 
establish, instructors and teaching assistants to
hire, and the University’s rules and regula- 
tions to master. 

I learned slowly why I had been chosen 
for this job over some other strong candidates. 
G. E. Coghill, a stellar name in biology in 
those days, had recently retired, and psychol- 
ogists Raymond Wheeler, F. T. Perkins, and 
J. F. Brown had lately left the University. 
All of these men were well known for their 
“organismic” viewpoints. It was the wish of 
the two regular staff members who remained, 
Beulah Morrison and Anthony J. Smith, and 
of the administration, that this tradition be 
continued. Without making this explicit, Dean 
Lawson apparently saw that I would do this 
naturally and with enthusiasm. And in fact 
he had two immediate signs of the correctness 
of his insight when I recruited Herbert Wright 
and Fritz Heider within a couple of months.
In the next three years we added Martin 
Scheerer, Alfred Baldwin, and Erik and Bea- 
trice Wright. The new senior staff members 
were all from the center or the fringes of 
the Gestalt psychology movement. 

Two behavior settings dominated my pro- 
fessional activities over the next 25 years; 
one of them was of my own creation, and I 
had considerable power over the other. 


Office of Department Chairman 


Program: To invigorate and expand the psychology 
department and to establish and administer settings 
subordinate to it according to University and Staff 
policies. Physical attributes: Suite of four rooms in 
Strong Hall with office furnishings and equipment. 
Temporal characteristics: My occupancy continued 
for 3 years; open for regular business weekdays 8:00 
a.m. to 5:00 p.m., and for special business at any
time. Human components: Myself, other staff mem- 
bers, students, secretaries. Boundary properties: 
Open access. Powers of human components: The
basic policies of the Department were determined by 
external settings: those of the University administra- 
tion and the setting Department Staff Meeting. 
Within the constraints imposed by these settings, the 
chairman had complete control of the Office. 


In my 19 years of academic life the extent 
of my administrative experience was limited 
to registering and advising students at Illinois 
and Stanford and to voting on minor issues 
in the College Staff Meeting at Illinois. My 
former colleagues at Stanford expressed anx- 
ious surprise that I would or could become 
an administrator. Their surprise was not justi- 
fied. They did not know the strength of my 
commitment to certain academic and educa- 
tional principles and my eagerness to have 
more of a hand in promoting them than 
hitherto; and they did not know that admin- 
istration was part of the package that in- 
cluded Oskaloosa, the crucial component. 
There were grounds for their anxiety, though 
not the ones they probably had in mind. They 
would know of my deficiencies in keeping 
appointments, answering correspondence, ar- 
ranging schedules, making and keeping to 
budgets, gregariousness, and so forth. But 
they would underestimate, I think, the power 
of the behavior setting Office of Department 
Chairman to take me in hand in these re- 
spects. However, their anxiety (and mine, 
too) would have been justified had they 
known of the probable conflict between the 
duties of the Office of Department Chairman 
and those of the research setting Field Station 
in Oskaloosa. I lasted 3 years as chairman. 

The usual conflict between administration 
and research was exacerbated in this case by 
a geographical factor; we found it impossible 
to do the research in Oskaloosa, 20 miles 
from Lawrence, without living there and 
having full-time headquarters there; and the 
urgencies that occurred in both places could 
not be scheduled. However, if I had found 
administration congenial and rewarding, I 
might have remained longer as chairman. But
I did not. For one thing, I did not have easy 
relations with my superiors. Some sort of a 
personal status and power problem was in- 
volved. I saw myself as being too submissive 
to policy directives I opposed, and
I felt guilty of betraying my principles. I did
not stand up to authority in the way I
thought I should. This was partly a matter of
my divided commitment; I did not have
time to prepare for hassles with top administrators;
but there was also inability to
confront superior power. Another problem for
me as chairman was the discovery that those
with whom I agreed and had harmonious
relations were not always in agreement and
harmony among themselves. My ambition
was for a unified, amicable department, and
with incredible naiveté I supposed that those
with whom I was congenial would be
congenial inter se. There were no great conflicts
within the department, but not the [...] I
had hoped for. An added circumstance that
interfered with maintaining a department
with a common point of view on psychological
matters was the rapid increase in the staff
to serve the great increase in university
enrollment after 1947. Inevitably those added
were of a variety of psychological persuasions.
In my 3 years in the chairman’s slot, the
department was invigorated with staff members
who received wide recognition as
scholars in the next two decades.
These were tremendously busy years for
me, with increasing conflict between the de-
mands of the chairmanship and the research,
with increasing dissatisfaction with my rela-
tions with University administrators, and with
some disappointment that my goal of a small,
excellent, unified group of scholars was not
more completely achieved. So, my [...]
of the chairman’s position in the Office of the
Department of Psychology brought me both
satisfactions and regrets.
Classroom teaching was not clear sailing
either. I had decamped when there was ad-
ministrative interference with the way I
wished to function as instructor at the Uni-
versity of Illinois. Now, it was disconcerting
to me to discover that there were other, more
pervasive institutional obstacles to my teach-
ing. The Kansas University administration
intruded very little in class operations, but the
generally prevailing program of [...] in-
struction made it difficult for me to [...]
the kinds of programs I desired. This was my
first experience with the fact that the settings
of an institution may be so interdependent
that deviant settings cannot function. My
classes were arranged to foster students’ skill
in the formulation of questions, finding and using
the most relevant evidence, and writing re-
ports on the basis of the evidence. Learning
facts had no place in my classes; today’s facts
are obsolete tomorrow, and the current ones
are always available if one knows where to
find them and how to use them. My classes
could operate best if the students had some
unscheduled time: to think of interesting and
answerable questions, to search for relevant
evidence, and to analyze, organize, and pre-
sent the findings. But who can think, search,
analyze, and write when confronted with two
quizzes a week in one class, 500 pages of
reading in another, and 9 hours of laboratory
attendance in another? As so-called standards
went up and competitive grading became more
severe, the students lost their freedom to act
as inquirers, problem solvers, and expositors
to the imposed demands of fact-, page-, and
hour-oriented classes.

In addition to trouble with the teaching
system within the University, I found that
teaching and research conflicted so
as to be mutually injurious. I have ob-
served this to be true for others, too. Both
Terman and Lewin were excellent classroom
teachers, but when they were deep in research,
as they usually were, their teaching declined
precipitately.
From the intrusions of the Illinois Dean I
escaped to Stanford; from the interference of
the prevailing teaching system and the con-
flicts with research I escaped to Oskaloosa.
Fortunately a Research Career Award from
the National Institute of Health made this
possible.


Field Station in Oskaloosa


Program: In the beginning, the program was to
describe the living conditions and behavior of all the
children of the town; later, all inhabitants were in-
cluded. Physical attributes: Suite of offices in Oska-
loosa, with office furnishings and calculators;
at intervals a satellite Station was established in
Leyburn, Yorkshire, England. Temporal character-
istics: Operated during regular office hours, and often
into the evening for particular staff members, from 
the fall of 1947 through the spring of 1972. Human 
components: Three to seven staff members, one to 
four graduate students, and occasional town resi- 
dents. Boundary properties: Staff and graduate stu- 
dents selected and admitted by Station directors; 
they thereafter had free access; townspeople also 
had free access. Powers of human components: Dur-
ing the first 7 years, the Station was administered
by the codirectors, Herbert F. Wright and me; we 
had power over the entire setting; when Herbert left 
the Station, I was in charge. Next in power were 
professional scientists (field workers, data analysts) 
who had complete control of their special operations; 
after them came the graduate students with power 
over their projects in consultation with their advisers, 
and finally the secretaries, who were masters of their 
own desks and secretarial facilities. Townspeople and 
other visitors had no power. 


Here, at last, in the Midwest Psychological 
Field Station, the stream of my professional 
behavior entered a setting constructed to my 
own design: I initiated it, and its program 
was arranged cooperatively with Herbert 
Wright. Most other major settings I had in- 
habited had ongoing operations into which I 
was incorporated, with limited power to make 
alterations. I soon discovered, however, that
in the Station I was not a free spirit but was 
captive of the setting I had established—a 
setting that embodied a past I could not 
escape and a future I could not control. 
Herbert and I had programmed the Midwest 
Field Station to describe the behavior and 
psychological habitat of the children of the 
town individually. With the setting underway 
(procedures developed and tested, staff 
trained, computer programs set up, citizens 
alerted and cooperative), it was not a simple 
matter when our insights and intentions 
changed to alter the program to one of de- 
scribing extraindividual behavior within be- 
havior settings. We had to struggle against 
our own creation. It makes one wonder how 
much of the stability of individual behavior 
has its source in the stability of the behavior 
settings people create and inhabit, a stability 
that is sustained by the fact that people estab- 
lish new settings according to designs they 
carry with them from previous settings. The 
origins of some features we built into the
Midwest Field Station are fairly clear: The
aim to describe the living conditions and be- 
havior of all of the children originated in the 
Extension Classes in Illinois; concern with 
the naturally occurring environment was 
brought from Stone’s Animal Laboratory and 
the Iowa Nursery School Laboratory; dis- 
ciplined, persisting concentration on a rather 
narrow program of activities, the policy that 
“dogged does it,” came aboard in the Stan- 
ford settings; the dual importance of precise 
data and precise theories welded together 
characteristics of both the Stanford and the 
Iowa settings; the particular theories that
undergirded the program of the Midwest re- 
search came from Lewin’s Offices in Iowa; 
the importance attached to archives of atheo- 
retical data originated in Stone’s Animal Lab- 
oratory; the first data collecting system we 
installed, the narrative record, dated back to 
the Iowa Nursery School Laboratory; the 
importance given to community relations was 
first met in Miles’ Later Maturity Facility. 
These problems, emphases, methods, and 
theories with which we endowed the Field 
Station were imported from earlier settings. 
We installed the past in the present. But the 
inheritances were assembled in new relation- 
ships, and new elements were added, including 
staff members, methods, ideas, and data; the 
Station developed a dynamic of its own with 
unforeseen consequences for its inhabitants. 
For me these were far-reaching. In this latest 
stretch of the stream of my professional be- 
havior, I entered the Field Station as a psy- 
chologist aiming to study the naturally oc- 
curring behavior and environments of the 
people of the town as individuals. For a long 
time I clung to the view that this was the way 
to observe and explain their everyday activ- 
ities. But the Field Station finally turned me 
around and showed me that more than people 
and the stimuli that impinge upon them indi- 
vidually are required, that an ecobehavioral 
science of extraindividual behavior and its 
nonbehavioral context is needed. The course 
of this change has been presented in a number 
of publications (Barker, 1960, 1963a,b, 1965, 
1968; Barker & Associates, 1978; Barker & 
Gump, 1964; Barker & Schoggen, 1973; 
Barker & Wright, 1951, 1955). 


Other Places, Other People

I have had to omit many rewarding and 
pleasant reaches of my behavior stream and 
most of the people who shared parts of it with
me and gave me tows and directions. Space 
is a factor here, but more important is ab- 
sence of records of the early days. And there 
were important hidden settings (committees, 
administrative offices, and so forth), whose 
inhabitants I do not know, that gave me free 
time (Research Career Award; Fellowship,
Center for Advanced Study in the Behavioral
Sciences), research grants (U.S. Public Health
Service, Society for the Aid of Crippled Chil-
dren, Commonwealth Fund, Ford Foundation,
Carnegie Foundation of New York, Uni-
versity of Kansas, Kansas University Endow-
ment Association), honors (Kurt Lewin
Award, Society for the Psychological Study
of Social Issues; Research Contribution
Award, American Psychological Association;
G. Stanley Hall Award, Division on Develop-
mental Psychology, American Psychological
Association), and summer appointments
(Columbia, Oregon, Colorado, California). I
regret especially not being able to name the
graduate students who studied with me at
Stanford University, Clark University, and
the University of Kansas, some of whom have
become valued colleagues, and many of whom
contributed greatly and still contribute to my
intellectual development. My story would be
too incomplete, however, without four [...]
influences on the odyssey. Louise Shedd
Barker was an eager spectator, a [...] ad-
viser, and an occasional stand-in in all the
settings. But when we discovered that the
Station could not be operated from Lawrence,
that it required us to live in Oskaloosa,
Louise became an essential operative [...]
field worker and our main line of communica-
tion with the community. She [...]. In her own
professions she [...] as a field [...], first at
the Stanford Marine Station and on the Uni-
versity campus, and then [...] as a
“home visitor,” searching out the homes of
absent children and investigating the [...]
of their absence. Fritz Heider's ideas about
media and things provided the link between
people and behavior settings that prompted
me to see the latter as more than convenient
areas for sampling behavior but rather as en-
tities (things) that impose patterns upon their
components, including their human inhabi-
tants (media). It was this insight that raised
behavior settings for me from a technological
convenience to the basic unit of an ecobehav-
ioral science. Herbert F. Wright was co-equal
developer of the Field Station’s program dur-
ing its first years, making unique contribu-
tions to the study of individual behavior; and
Paul V. Gump was equally important in the
later years by bringing the Station’s ideas and
methods to bear on school and community
operation.


Conclusion 


I began this trip with the desire to benefit
humanity. I have not, myself, been able to
satisfy this desire, for I have found that, as
with teaching, the time and skills required to
make and disseminate applications have so
conflicted with those required by the research
as to make the former impossible for me. So
the zeal and success of a small number of
former staff members, students, and others
to bring our methods and findings to earth in
connection with hospitals, schools, architec-
ture, town planning, child development, and
even the movement “small is beautiful” have
given me keen satisfaction. I began the jour-
ney, too, with reservations about my ability
to make it in the scientific world. When I re-
port that the self-doubt has not been dis-
pelled, the response may well be: How greedy
can he be? What does he want in brownie
points? The answer: The stream of my pro-
fessional life has skirted the areas of psychol-
ogy that are currently richly cultivated and
harvested, and in fact, it has finally landed
outside the turf of the psychology tribe.
Being something of a maverick has its re-
wards from those (and there are many) who
value what they hope will prove to be worth-
while innovations. The innovator hopes so,
too, but in the meantime he is at a disad- 
vantage vis-a-vis his mainstream associates, 
since he cannot keep up with them in the 
main channel. So, how can he be sure where 
he stands?


Roger G. Barker and Behavior Settings: A Commentary


Phil Schoggen 
Cornell University 


These comments on the preceding paper by Roger G. Barker suggest that it 
documents and illustrates Barker’s approach to science, his lifelong concern 
with environment-behavior relations, the major influences that have shaped his 
thought, and the utility of the behavior setting concept. 


Barker’s paper impresses me as a remark-
ably accurate, concise, and revealing reflec-
tion of his lifelong approach to science in 
general and to the study of the relations be- 
tween molar behavior and its environment in 
particular. It sketches and illustrates in a
fascinating way the development of his most 
significant conceptual contribution, the be- 
havior setting, from its earliest precursors in 
Calvin Stone’s animal laboratory to its pres- 
ent status in the broader domain that he now 
refers to as “an ecobehavioral science of 
extraindividual behavior and its nonbehav-
ioral context.” And he does it with the candor
and wit that his associates will recognize as 
so characteristic of Barker. Of course, a good 
understanding of Barker’s work requires study 
of his several books and papers, but I know 
of no single paper that is more truly indica-
tive of Barker’s thinking than this one. I
have tried below to identify some of the main 
reasons why I have this impression. 

This paper, above all, documents well a 
way of going about scientific work that both 
Barker and his principal mentor, Kurt Lewin, 
had in common, namely a spontaneous and 
omnipresent genius for finding in the mun- 
dane experiences of one’s own everyday activ- 
ities inspiration for the development of new 
concepts or practical tests of hypothetical 
relationships. For example, Lewin used his 
World War I battlefield experiences to de-
velop his concept of the psychological en-
vironment (Lewin, 1917). His astute ob-
servation of a coffeehouse waiter’s impressive
memory for hours of charges and sudden for-
getfulness once the bill was paid served as a
basis for his conception of psychological ten-
sion systems (Marrow, 1969). Similarly,
Barker’s experiences on train trips through
the countryside of central Illinois spawned
his subsequent preoccupation with studies of
“the nature of the situations that actually
confront children in their daily lives and how
they react to them” (Barker & Associates,
1978). Where others saw only pleasant coun-
tryside and small midwestern towns, Barker
saw a major challenge to scientific psychology.
His efforts to cope with illness and physical
disability shaped his thinking about the rela-
tion between the psychological and the non-
psychological environments. Like Lewin also,
Barker encouraged his students and colleagues
to use their own direct experiences to further
their understanding of fundamental psycho-
logical processes. As a graduate student re-
search assistant and resident staff member on
the Midwest project in its early days, I reg-
ularly attended church and sang tenor in the
church choir. On one of our many trips by car
between Lawrence and Oskaloosa, I remem-
ber expressing puzzled amazement at the
faithfulness of the other parishioners. I
couldn’t understand why they so [...]
braved the rigors of Kansas winters and suf-
fered through Sunday after Sunday in the
sweltering heat of summer to be present at
the morning worship service. In his quiet,
hesitant, almost apologetic way, Barker said,
“Well, I don’t know. But, Phil, why do you
go?”

This paper also documents another im-
portant feature of Barker's career in that it
shows both his early concern with the prob-
lem of environment-behavior relations (he
was writing about ecological problems many
years before any other psychologist except
Lewin and Brunswik) and the impressive
singleness of purpose with which he has dedi-
cated himself to this concern, forsaking all
others. With his long-time friend and col-
league Herbert F. Wright, Barker wrote in
1949 on what they then called psychological
ecology (Barker & Wright, 1949). This was
followed by their major report on the Mid-
west project (1955), Barker’s seminal theo-
retical paper in the Nebraska Symposium on
Motivation (Barker, 1960), his now classical
Ecological Psychology (Barker, 1968), sev-
eral other books and papers alone and with
colleagues, and most recently, a new work
that presents his current thinking about eco-
behavioral science, which goes well beyond
ecological psychology (Barker & Associates,
1978).

Dedication to his early learning at Stan-
ford that “dogged does it” is apparent also
in the fact that, with the single exception of
the 1953 revision of his Social Science Re-
search Council monograph on the social psy-
chology of physique and disability (Barker,
B. Wright, Meyerson, & Gonick, 1953), every
one of Barker’s seven books and 22 papers
published since 1949 is devoted to the ecology
of behavior.

Barker (1963) has credited Kurt Lewin
and Egon Brunswik as the intellectual fore-
bears of his work on the ecology of behavior.
In the present paper, he also traces his con-
cerns for understanding environmental influ-
ences on behavior to some rather surprising
beginnings. Barker’s thinking reflects still
other important influences.

The work of social anthropologists such as
Margaret Mead, Clyde Kluckhohn, and Ruth
Benedict figured prominently in the thinking
of Barker and Herbert Wright in the early
days of the Midwest Field Station research.
I recall many discussions of the project staff
during which great admiration was expressed
for the pioneering efforts of the social anthro-
pologists to study behavior in natural situa-
tions. Although Barker and H. Wright wanted 
to study child behavior and development in 
ordinary American as opposed to primitive 
cultures, and although they tried to develop 
observational and recording methods more 
systematic than the traditional field notes of 
the social anthropologists, it is clear that 
Barker and H. Wright built heavily upon the
foundation laid by these workers. 

Sociology and particularly human ecology 
also provided important influences. I remem- 
ber well my delight in finding that my oral 
examination for the master’s degree surpris- 
ingly turned into an interesting debate among 
Roger Barker, Herbert Wright, and Martin 
Scheerer as to whether the classical study of 
mental disorders in different areas of Chicago 
by sociologists Faris and Dunham (1939) 
could be considered a study in psychological 
ecology. This discussion showed clearly that 
the thinking of Barker and H. Wright was 
strongly influenced by the work of sociologists.
This was evident also from their familiarity 
with and appreciation of other related so- 
ciological literature, for example, Dollard’s 
Caste and Class in a Southern Town, the 
Lynds’ Middletown in Transition, and Hol- 
lingshead’s Elmtown’s Youth. 

A less conspicuous but still important influ- 
ence on Barker’s thinking came from the field 
of biology, which provided him with extremely 
useful models. Although it was clear to me 
that Barker had done a considerable amount 
of reading in biology on his own, I think 
Louise Shedd Barker’s formal training in 
biology and her central involvement in Roger 
Barker’s thinking throughout his career 
greatly enhanced the value of concepts, prin- 
ciples, and research strategies from biology 
in the development of his own ecological ap- 
proach. 

Certainly Barker’s concept of ecology came 
directly from its use in plant and animal 
ecology. Webster’s unabridged dictionary def- 
inition suits equally well Barker’s use of the 
term and that of biological ecology, “the 
totality or pattern of relations between orga-
nisms and their environment.”

In his early attempts to get some project
staff members to understand the mutual in-
terdependence of behavior and its immediate 
context, Barker spoke of the well-worn foot- 
path that cut diagonally across a familiar 
vacant lot near the high school in Oskaloosa. 
At some earlier time, the path did not exist, 
and the first persons to cross the lot diago- 
nally shaped that part of the environment in 
line with their need to reduce the amount of 
required walking. Yet the path that they cre-
ated was not completely straight but curved
and bent around clumps of bushes, mounds, 
and depressions. Once established, however, 
that path shaped the behavior of countless 
pedestrians, nearly all of whom now follow 
its curves and bends as faithfully as a train 
on steel rails.

Eugene P. Odum in Fundamentals of Ecol- 
ogy (1971) expresses the same thought in the 
context of his discussion of the ecology of 
fresh water ponds: “Not only is the pond a 
place where plants and animals live, but plants 
and animals make the pond what it is” (p. 
12). 

Finally, it is appropriate that Barker chose 
to write in terms of behavior settings central 
in his professional career, because the concept 
of behavior settings will, I believe, turn out 
to be his most important and enduring con- 
tribution. Its potential significance for be- 
havioral science is only beginning to be recog- 
nized (e.g., Moos, 1976; Wicker, 1979). To 
the best of my knowledge, it is the only con- 
cept in the behavioral sciences that combines 
both the ecological environment and behavior 
in a single unit of empirical analysis. The be- 
havior setting concept derives from and re- 
flects Barker’s concept of the environment 
(Barker, 1963). This concept marks Barker’s 
originality and most clearly distinguishes his 
position from those of Brunswik and Lewin. 
Whereas for them and for psychologists gen- 
erally (e.g., Leeper, 1963), the ecological en- 
vironment is unstable, characterized by only 
statistical regularities, and related to behavior 
only in terms of probabilities, Barker argues 
and provides impressive empirical evidence 
that there is order and organization in the 
preperceptual environment and that its rela- 
tion to behavior is neither random nor merely 
probabilistic. Although the laws that govern 
the phenomena of the nonpsychological en-
vironment are different from and incommen-
surate with the laws that govern individual
behavior, Barker and his associates have
demonstrated in their work with behavior
settings that despite this conceptual incom-
mensurability, our understanding of molar
behavior can be improved through systematic
studies of its relations with the immediate
environmental contexts. Behavior settings and
the molar behavior of their inhabitants are
mutually, causally related. Barker's review
of important behavior settings in his own pro-
fessional career provides further testimony to
this important discovery.


The Identification of Persons as Supersets and Subsets
in Free-Response Personality Descriptions 


Michael A. Gara and Seymour Rosenberg 
Rutgers—The State University 


Using a free-response format, 14 subjects each described 36 target persons. A 
set-theoretical model was proposed for representing the perceived similarities 
and differences among the persons described. Using this model, four types of 
targets were identified for each subject’s protocol: supersets, subsets, disliked 
contrasts, and miscellaneous. Supersets are persons whose perceived character- 
istics subsume those of many of the other persons described. Subsets are per- 
sons described with only a limited portion of the characteristics attributed to 
the supersets. Disliked contrasts are those disliked persons who were described 
with a set of terms almost completely different from those used to describe the 
other targets. Miscellaneous targets are target persons who could not be iden- 
tified as supersets, subsets, or disliked contrasts. As predicted, the supersets 
were persons who were perceived by the subjects to be very significant in their 
lives. Disliked contrasts were persons whom the subjects perceived as least 
significant; subsets and miscellaneous targets could not be distinguished from 
one another in this regard. It is suggested that supersets provide the perceptual 
categories for the construing of persons, and the correspondence of this idea
with other extant psychological ideas, such as transference, is discussed.


We are often in the process of comparing
people with each other. Our conversations are
punctuated with statements such as “Those
two sisters could hardly be more different”
and “John and Bill are as alike as two peas
in a pod.” It does not seem unusual, then, that
the perceived similarities and differences
among the people known by an individual
have occupied the serious attention of social
psychologists in the area of person perception.
These perceived relationships have been ex-
amined in a number of studies (Cronbach,
[...]; Jackson, Messick, & Solley, 1957;
[...] & Young, 1972; Messick, 1961; Rosen-
berg, 1977).


This work was supported in part by National
Science Foundation Grant BNS 76-10675 to Seymour
Rosenberg and grants for computer use from Rutgers

As part of the analyses of these perceived
relationships, a measure of the psychological 
proximity of each person in a subject’s life to 
each other person was defined in each of the 
above studies. The measures chosen all satis- 
fied the symmetry assumption. That is, the 
similarity of Mary to John is defined as equal 
to the similarity of John to Mary. 

A considerable number of theoretical and 
empirical arguments make the assumption of 
symmetry questionable. Within the field of 
person perception, the possibility of asym- 
metric relationships among personality de- 
scriptors has been raised several times at the 
theoretical level (Hays, 1958; Jackson, 1962; 
Rosenberg & Sedlak, 1972). At the empirical 
level, there is a study by Warr and Knapper 
(1968) in which subjects made inferences on 
an adjective checklist about a target person 
who had been initially described by the in- 
vestigators with six different cue traits. The 
subject was asked two questions for any given 
pair of traits: If this person is cynical, how
likely is he to be precise? If this person is
precise, how likely is he to be cynical? The
results indicate marked asymmetries: Being
cynical implied being precise, but being pre-
cise did not imply being cynical.


Figure 1. A: Hypothetical superset relationship. B:
Hypothetical symmetric relationship.


The notion of asymmetry has also received
attention outside the immediate sphere of 
person perception. Working in the tradition 
of personal construct theory, which shares 
methodological and substantive concerns with 
person perception research, Hinkle (1965) 
developed a technique (Impgrid) for assessing
the one-way implications between constructs. 
Furthermore, investigators have found asym- 
metric similarity judgments with objects such 
as countries and geometric figures (Tversky
& Gati, 1977), auditory signals (Wish, 1967), 
and colors, line orientations, and numbers 
(Rosch, 1975). 

In the present study, a set-theoretical 
model of person perception is used to depict 
the target persons described by subjects. This
model permits asymmetries in the description
of the targets to be reflected in the set-theo-
retical relation of targets to each other. The
most extreme type of asymmetry is depicted 
in Panel A of Figure 1: Athena is a proper 
subset of Mother. The fact that Athena has 
an attribute implies that Mother has it as 
well; however, the converse is not true. Hays
(1958) proposed a set-theoretical model for 
the relationships among personality descrip- 
tors, but it received little subsequent atten- 
tion. 

In this study, the set-theoretical relation
between targets is measured by a special im-
plication phi (here denoted φ_I) that was pro-
posed by Francis (1961). Unlike the standard
phi coefficient for the correlation between two
dichotomous measures, there are two φ_I values
for any two target persons, A and B, one for
A → B and one for B → A. It is the inequality
of these two phis that reflects the degree and
direction of asymmetry. When there is a sym-
metric relationship between targets such as
that depicted in Panel B of Figure 1, the two
phis will be equal.
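
The directional idea can be illustrated with a minimal sketch. It is not the implication phi of Francis (1961), whose formula is not given here; it substitutes a simple conditional proportion (the share of one target's descriptors that are also applied to the other) purely to show how two directional values can differ. The descriptor words, the function name `implication`, and the targets `target_x` and `target_y` are hypothetical.

```python
def implication(a: set, b: set) -> float:
    """Share of target a's descriptors that are also applied to target b (a -> b).

    A stand-in for a directional index; NOT Francis's implication phi.
    """
    if not a:
        raise ValueError("target a has no descriptors")
    return len(a & b) / len(a)


# Panel A analogue: Athena's descriptors are a proper subset of Mother's.
mother = {"warm", "strict", "generous", "anxious", "proud", "loyal"}
athena = {"warm", "generous", "loyal"}

athena_to_mother = implication(athena, mother)  # every Athena term applies to Mother
mother_to_athena = implication(mother, athena)  # only half of Mother's terms apply to Athena

# Panel B analogue: equal-sized descriptor sets with partial overlap.
target_x = {"warm", "strict", "shy", "funny"}
target_y = {"shy", "funny", "proud", "loyal"}

x_to_y = implication(target_x, target_y)
y_to_x = implication(target_y, target_x)

print(athena_to_mother, mother_to_athena)  # unequal values: asymmetric relation
print(x_to_y, y_to_x)                      # equal values: symmetric relation
```

In this toy case the two directional values for Athena and Mother are unequal, and the direction of the inequality marks Mother as the superset; for the two overlapping targets the values coincide, which corresponds to the symmetric case.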

The application of a set-theoretical model 
to the target persons that an individual de- 
scribes has a special appeal. Some targets 
(dubbed supersets) may bear the superset 
relation that Mother bears with Athena in
Panel A of Figure 1 (although it may not be
perfect) to many other target persons de- 
scribed. A group of other targets (dubbed 
subsets) bear relationships akin to that which 
Athena bears to Mother in the same figure. 

It seems reasonable that supersets are per-
sons in an individual’s life who provide the
perceptual categories for the construing of
other persons. The underlying notion is that
supersets are persons with whom the individ-
ual has interacted in many different types of
situations. A good deal is thus known about
these persons in terms of personal char-
acteristics, and a portion of these can sub-
sequently be assigned to other persons as
well. In fact, the personality characteristics
referred to by the personality terms that
[...] many people were prob-
ably first observed in these supersets.
Consequently, we should find that these
same supersets are known better to
the subject than are subsets and any other
type of target we identify. In addition, super-
sets are probably perceived as more complex
than other targets, and the affect that is dis-
played in their presence is more ambivalent
(multivalent) than that displayed in the
presence of other targets. Further interaction
with nonsupersets may result in a significant
impression change, whereas the overall im-
pression with regard to supersets should be
already well formed and slow to change. In
sum, supersets are the most significant people
in an individual's life.

To test these ideas we obtained a data base
independent of the one we used to identify
supersets and subsets. This consisted of rat-
ings of the targets on several properties that
we thought would differentiate the targets in
terms of how significant they were in a sub-
ject’s life. Then the question was asked: Are
supersets the significant persons in the sub-
ject’s life?

Method 
Subjects 


Ten female and four male undergraduates enrolled
in a psychology course at Livingston College served
as subjects. Each subject was paid $36 for par-
ticipating in the study, which involved approximately
[...] hours of his or her time. Subjects were told in
advance that they would be paid only upon the
completion of their work, that incomplete work
would receive no payment.


Procedure


The first data that were obtained consisted of each
subject’s descriptions of a sample of people that he
or she knew. This data base was used to identify the
supersets as well as the other types of target persons
for each subject.
In obtaining these data for each subject, the free-
response methods developed by Rosenberg (1977)
were followed exactly, except for the way target
people were selected. For that purpose the subjects
[...] description and then list a person in his/her
life who seemed to fit that particular description best.
Thus, subjects generated their own lists of 36 target
persons to describe.
The free-response method for obtaining descrip-
tions of these targets required that the subject gen-
erate his/her own descriptive vocabulary. Each sub-

Table 1
List of Persons Described by Subjects 


. Mother 

Father 

. Liked brother 

Least liked brother 

Liked sister 

Least liked sister 

Most liked relative—intimate with 

Second most liked relative—intimate with 

. Most liked relative whom you are not intimate 

with 

10. Second most liked relative whom you are not 
intimate with 

11. Most disliked relative 

12. Second most disliked relative 

13. Best friend 

14. Next best friend 

15. Third best friend 

16. Casual friend 

17. Casual friend 

18. Casual friend 

19. Most disliked person you know well 

20. Second most disliked person you know well 

21. Third most disliked person you know well 

22. Most liked person you have met in last six 
months 

23. Second most liked person you have met in last 
six months 

24. Most disliked person met in last six months 

25. Second most disliked person you have met in 
last six months 

26. Public figure—most liked 

27. Public figure—second most liked 

28. Public figure—east liked 

29. Public figure—second least liked 

30. Me now 

31. Me ideal 

32. Me negative ideal 

33. Spouse or lover—present 

34. Spouse or lover—past 

35. Mythical figure—liked 

36. Mythical figure—disliked 
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ject generated two different kinds of vocabularies: 
traits that the subject perceived as descriptive of the 
various target people and feelings that were elicited 
in the subject by these target people. The subject 
was asked to formulate his/her trait perceptions and 
feelings in the form of discrete units, but not neces- 
sarily single words. The subject cumulated a trait 
and feeling list as he/she described the target persons 
—that is, when he/she used a trait or elicited feeling 
to describe one person, he/she was asked to judge 
the presence of that trait or elicited feeling for all 
the other target people. 

Before systematically describing any of the people 
on his/her list, each subject prepared two “starter 
lists,” one of traits and one of feelings. To obtain 
these lists, the subject was instructed to write down 
on a provided questionnaire the first five traits char- 
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Table 2 
General Relationship Between Attributions to 
Two Target Persons 


Target Person B 


Target Person A lor2 0 
lor2 a b 
0 c d 


Note. 0 = target does not possess attribute; 1 or 2 
= target does possess attribute. a, b, c, d = number 
of times a given combination occurs in description 
of target person. 


acteristic of, and first five feelings elicited by, Tar- 
gets 1, 7, 12, 14, 18, 21, 23, 26, 30, 32, 34, and 35. 
The starter lists served not only to reduce biases in 
content that might be associated with the first few 
people on the list but also to help the subject get 
started with the description task. Once collected, 
the two starter lists were each randomized for each 
subject, and the description task began. The de- 
scriptions were collected on a computer terminal 
tied to the Rutgers time-sharing system. Each sub- 
ject entered his/her list of 36 target persons and the 
two starter lists, one of traits and one of feelings. A 
computer program (Livingston & Kingsley, 1976) 
was used to (a) present the subject with his or her 
list of target persons one at a time; (b) present for 
each target person all descriptors (traits or feelings) 
the subject had entered; (c) record the subject’s 
judgment (0, 1, 2) of each target person on each 
descriptor; (d) record any new trait (or feeling) 
term that the subject wished to add to his or her 
vocabulary of terms; (e) re-present for judgment 
the persons already described, when a new trait (or 
feeling term) was added. 

Subjects were instructed to make the trait and 
feeling judgment on a 3-point scale. For traits, these 
were: 

0 = the trait is not descriptive of the person, but
this does not necessarily mean that the person
possesses the opposite trait (in order to indicate
that the person possesses the opposite trait, the
subject was instructed to add the opposite as a
separate trait term).

1 = the person possesses the trait to a noticeable
degree but need not describe the way the person is
all the time.

2 = the person possesses the trait to an extreme
degree but need not describe the way the person is
all the time.

A similar scale, appropriately worded, was used for
the feelings:

0 = the person does not make (or has not made)
you feel this way.

1 = the person makes (or has made) you feel this
way to a noticeable degree.

2 = the person makes (or has made) you feel this
way to an extreme degree but need not make you
feel this way all the time.

During this entire process of describing the 36
target persons, the trait and feeling vocabularies
were kept separate: A subject systematically
alternated between trait and feeling descriptions
according to a fixed schedule. The subject started
with trait descriptions for the first 5 people on
his/her list, followed by feeling descriptions for
the first 10 people on the list. The subject switched
back to trait descriptions and alternated them with
feeling descriptions, such that after persons n
through n + 9 were described by traits, persons
n + 5 through n + 14 were described by feelings.

When the description tasks were all completed, the
trait and feeling vocabularies were combined for each
subject. Thus the computer contained 0, 1, and 2
entries in a 36 × n descriptor matrix for each
subject.


Identification of Supersets and Subsets in a
Subject’s Protocol


Suppose we represent all possibilities of perceived
trait and feeling occurrence in Person A with respect
to Person B, using a 2 × 2 table (Table 2), where 0
represents the rating that the target person (A or B)
does not possess an attribute, and 1 or 2 represents
the rating that the person does. The cell entries a,
b, c, and d each represent the number of times
a given combination occurs in the actual description
of these two persons.

The relationship expressed with the Venn diagram
in Panel A of Figure 1 depicts Mother as a
superset of the goddess Athena. Let Mother be Person
A and Athena be Person B in Table 2. As seen in
Panel A of Figure 1, whatever attributes Athena
possesses, Mother does also. In other words, c must
equal zero. For the general case, whenever we can
marshal evidence that the c cell is indeed equal to
zero, we can state that Person B is a subset of
Person A. Situations where c is [...] to zero can be
placed on a continuum of what [...] (1961) calls
“weak implication,” and he demonstrates that a φ_I
coefficient (range from −1 to 1) can be calculated,
where


$$\phi_I = \frac{ad - bc}{(a + c)(c + d)}$$


A high positive value indicates the degree to which
Person B’s characteristics are a subset of Person A’s.
A high negative value indicates that Person B is
the opposite of Person A.
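
The tallying of Table 2 and the resulting coefficient can be illustrated with a minimal sketch. It assumes that the weak-implication coefficient takes the form (ad − bc) / [(a + c)(c + d)], a form chosen here because it equals 1 exactly when c = 0; the function names and the two short rating vectors are hypothetical and are not taken from the study.

```python
def implication_counts(ratings_a, ratings_b):
    """Tally the Table 2 cells for two targets rated on the same descriptors.

    Ratings are 0, 1, or 2; both 1 and 2 count as "possesses the attribute".
    """
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both targets must be rated on the same descriptors")
    a = b = c = d = 0
    for ra, rb in zip(ratings_a, ratings_b):
        has_a, has_b = ra > 0, rb > 0
        if has_a and has_b:
            a += 1      # attribute ascribed to both A and B
        elif has_a:
            b += 1      # ascribed to A only
        elif has_b:
            c += 1      # ascribed to B only
        else:
            d += 1      # ascribed to neither
    return a, b, c, d


def weak_implication(a, b, c, d):
    """Assumed form of the coefficient: (ad - bc) / ((a + c)(c + d))."""
    denom = (a + c) * (c + d)
    if denom == 0:
        return None  # undefined: B has no attributes, or A lacks none
    return (a * d - b * c) / denom


# Hypothetical ratings on six descriptors (illustration only).
mother = [2, 1, 1, 0, 1, 0]   # Person A
athena = [1, 0, 1, 0, 0, 0]   # Person B

cells = implication_counts(mother, athena)
print(cells, weak_implication(*cells))
```

In this illustration every attribute ascribed to Athena is also ascribed to Mother, so c = 0 and the coefficient is 1; reversing the roles of the two targets yields a smaller value, which is what makes the relation asymmetric.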

For the weak implication case, φ_I’s were calculated
for each subject for all pairs of targets [...],
yielding a 36 × 36 matrix of φ_I’s for each subject.
Each entry indicates the extent to which the
characteristics of the target in column j are a
subset of the characteristics of the target in row i.
By definition, the diagonal entries are 1.

To simplify examination of this matrix, the rows
and columns were permuted separately to bring into
proximity rows or columns with a similar pattern.
This permutation was done according to a
hierarchical clustering of the rows of the matrix and
then separately of the columns, using as input a
distance [...] (where k refers to columns) and
columns (where k now refers to rows). This simple
type of two-way clustering is described in some
detail in Rosenberg (1977).

Figure 2 illustrates a sample protocol from one
subject following two-way clustering. All target
person numbers in Figure 2 correspond to the numbers
of the role figures in Table 1. In this example, the
row labeled 1 (which is Mother) illustrates the
degree (φ_I × 10) to which the characteristics of
each target designated in the columns—6, 11, 24, and
so forth—are subsets of the characteristics
attributed to Mother. A minus sign indicates that the
φ_I is negative, whereas an asterisk indicates that
φ_I = 1.0.

Distinct clusters of targets in Figure 2 are
separated by horizontal lines for the row-wise
clustering and vertical lines for the column-wise
clustering. The following guidelines were used to
group the targets (row-wise and column-wise) for each
subject. We chose the level of the hierarchical tree
produced by the clustering program that separated
the targets into 7 clusters. If there was a
comparatively large cluster (in terms of number of
targets contained) remaining at this level of the
tree, a higher level of the tree was chosen to break
up this cluster. However, in no case were more than
11 clusters chosen.

After the clustering of the targets had been fixed
in this way, targets were classified as supersets,
subsets, disliked contrasts, or miscellaneous for
each subject, as shown in Figure 2. (Miscellaneous
targets are not labeled in Figure 2.)

Note that target persons in the same cluster share
a very similar pattern of φ_I’s. To indicate the
general size of the φ_I’s within a cluster, the sum
of all [...] φ_I’s within a cluster is averaged by
the number of targets within that cluster, and this
number is shown for each row and column cluster in
Figure 2. To be classified as a superset, it is
necessary, but not sufficient, that a target be
included in the cluster(s) of targets with a pattern
of high φ_I’s row-wise. Similarly, to be classified
as a subset, it is necessary, but not sufficient,
that a target be included in the cluster(s) of
targets with a pattern of high φ_I’s column-wise.

The reason that the above conditions are necessary
and not sufficient for classifying targets as
supersets or subsets is that a given target could
occur in both clusters described above, if that
target was characterized by a pattern of high φ_I’s,
both row-wise and column-wise. That target would
bear an overall symmetric relation, not a superset or
subset relation. Such targets rarely occur. In a
total of 504 targets (14 subjects × 36 targets per
subject) we classified [...] targets as supersets,
71 as subsets, and only 23 as symmetrical. Moreover,
the latter were found in only 7 of 14 subjects.
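
The logic behind the row-wise and column-wise conditions can be sketched as follows. This is a deliberate simplification and not the procedure used in the study: it replaces the two-way hierarchical clustering with simple row and column averages, and the 0.5 cutoff, the function name, and the small 3 × 3 matrix are all hypothetical.

```python
def profile_flags(phi, threshold=0.5):
    """Label each target from its row and column averages (diagonal excluded).

    phi[i][j] is the extent to which target j's characteristics are a
    subset of target i's. The 0.5 cutoff is an illustrative assumption.
    """
    n = len(phi)
    labels = []
    for t in range(n):
        row = [phi[t][j] for j in range(n) if j != t]
        col = [phi[i][t] for i in range(n) if i != t]
        row_mean = sum(row) / len(row) if row else 0.0
        col_mean = sum(col) / len(col) if col else 0.0
        high_row = row_mean >= threshold   # others tend to be subsets of t
        high_col = col_mean >= threshold   # t tends to be a subset of others
        if high_row and high_col:
            labels.append("symmetric")
        elif high_row:
            labels.append("superset candidate")
        elif high_col:
            labels.append("subset candidate")
        else:
            labels.append("other")
    return labels


# Hypothetical 3 x 3 matrix: target 0 contains targets 1 and 2.
phi = [
    [1.0, 1.0, 0.9],
    [0.2, 1.0, 0.3],
    [0.1, 0.2, 1.0],
]
print(profile_flags(phi))
```

The labels are worded as candidates because, as stated above, a high row-wise or column-wise pattern is necessary but not sufficient; a target that is high on both is flagged as symmetric instead.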

In every case, a cluster(s) of target persons emerges 
with negative phis in relation to supersets and sub- 
sets. This category was dubbed “disliked contrasts” 
because without exception, only the disliked role 
figures in Table 1 emerge in this category. 

Targets not identified as supersets, subsets, or dis- 
liked contrasts are grouped into a miscellaneous 
category. Included in this category are the symmetric 
targets, as defined above. 


Property Ratings 


The other data that were obtained from each sub- 
ject after he/she had completed the person descrip- 
tion task were ratings of each of his/her target 
people on the properties listed in Table 3. These 
properties were constructed to assess the significance 
of each of the target persons in the subject’s life.
The property ratings were then used to test the 
idea that supersets, which had been identified in the 
free-response protocols, were significant people. 


Results 


In order to relate the interpersonal significance
of a target person as assessed by the property
ratings with the type of target as identified by the
φ_I’s, a multivariate analysis of variance (MANOVA)
was performed on the property ratings. The target
type effect (which has four levels: supersets,
subsets, disliked contrasts, and miscellaneous) was
tested against the variance-covariance matrix for
the Target Type × Subject interaction, and a
significant main effect was obtained, F
approximation to Wilks’s lambda (24, 93) = 5.78,
p < .0001.


1 Using a test proposed by Huynh and Feldt
(1970), we found that certain assumptions were not
met that were requisite to the validity of this
particular test of the main effect when repeated
measures are involved. To ensure the validity of the
reported findings, alternative procedures of analysis
were pursued. These involved the calculation of three
orthogonal contrasts among each subject’s four target
type means for each property and testing these
contrasts for flatness using a Hotelling T² and
Bonferroni adjusted significance levels. There were
significant differences among the four target type
means for every property except Complexity and
Feeling Ambivalence. Specific contrasts among the
target type means for each property were also tested
(essentially a single sample t test of the hypothesis
that the population value of the contrast is zero)
using Bonferroni adjusted critical values (Harris,
Note 1). The profile of results using MANOVA is
essentially replicated by these procedures, thus only
the MANOVA is reported. A summary of these
alternative analyses is available from the author
upon request. [...] John Miller for his assistance
with [...]


Figure 2. Sample protocol with supersets and subsets identified.


Table 3

Label                 Question                                          7-point bipolar scale
Self-Disclosure       1. How much do you discuss personal problems?     not at all-a lot
Concern               2. How concerned are you about whether this       not at all concerned-
                         person thinks well of you?                     very concerned
Self-Expression       3. How far do you express feelings of love,       try to conceal-
                         approval, or admiration for this person?       express freely
Range of Interaction  4. In how many different types of situations      limited-many
                         have you interacted with this person?
Closeness             5. How close are you to this person presently?    little-very
Complexity            6. How could you characterize the complexity      simple-complex
                         of this person?
Feeling Ambivalence   7. How conflicted or ambivalent are your          little-very
                         feelings about this person?
Impression Stability  8. If you were to interact with this person in    it would change a
                         many more types of situations than you do      great deal from
                         now, what do you expect would happen to        what it is now-it
                         your impression of this person?                would not change
                                                                        much from what it
                                                                        is now


In a multivariate analysis of variance, [...]
discriminant functions that distinguish among
groups, Harris (1976) argued that Wilks’s lambda,
although providing a valid test for the overall
effect, cannot be used as a test of the significance
of any particular discriminant function, including
the first. He recommended comparing the
characteristic roots associated with each
discriminant function to the critical value of the
greatest-characteristic-root distribution. This
procedure was followed, using the s-parameter
specifications suggested by Harris for testing each
root and the same m and n parameters as for the
overall test. (See Morrison, 1976, p. 189.) The
statistic .420 was extracted for the first
discriminant function and was statistically
significant (p < .01). The second and third
discriminant functions were not found to be
significant.
The first discriminant function associated with
the target type effect accounts for almost [...]%
of the variation in the property ratings and is the
direction in these data that most distinguishes the
four target types. To clarify this discrimination, a
Duncan multiple comparison procedure was performed
on the multivariate means, which are obtained, one
for each target type, by multiplying the property
means collapsed across subjects by the first
discriminant function. This procedure revealed that
the supersets scored highest (p < .001) on the linear
combination of the properties yielded by the first
discriminant function. There were no significant
differences between subsets and miscellaneous
targets, and the disliked contrasts scored lowest.

The actual discriminant function coeffi- 
cients and the loadings of each of the eight 
properties on this function are presented in 
Table 4. The loadings are the correlations of 
each property with the discriminant function 
and tell us how well we could estimate the dis- 
criminant function with a property if only we 
had that property at hand. 
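
The two quantities just described, a multivariate mean for each target type and a loading for each property, can be sketched as below. The coefficients, property means, and ratings are hypothetical values chosen for illustration with only two properties; they are not the values obtained in the study, and the function names are likewise assumptions.

```python
def discriminant_score(property_means, coefficients):
    """Linear combination of property means weighted by the coefficients."""
    if len(property_means) != len(coefficients):
        raise ValueError("need one coefficient per property")
    return sum(m * w for m, w in zip(property_means, coefficients))


def pearson_r(xs, ys):
    """Pearson correlation; a loading is the correlation between one
    property and the scores on the discriminant function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical coefficients for two properties (illustration only).
coefficients = [0.5, 0.1]

# Hypothetical property means, collapsed across subjects, per target type.
means_by_type = {
    "supersets": [6.0, 5.0],
    "disliked contrasts": [2.0, 4.0],
}
scores = {t: discriminant_score(m, coefficients) for t, m in means_by_type.items()}
print(scores)

# Loading of the first property: correlate its ratings with the scores
# computed for the same (hypothetical) set of rated targets.
ratings = [[6.0, 5.0], [4.0, 2.0], [2.0, 4.0], [5.0, 6.0]]
target_scores = [discriminant_score(r, coefficients) for r in ratings]
print(pearson_r([r[0] for r in ratings], target_scores))
```

A property with a loading near 1 would reproduce the ordering of the target types almost as well as the full linear combination, which is the sense in which a single property can stand in for the discriminant function.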

Table 4
Coefficients and Property Loadings
Associated With Target Type Effect for the
First Discriminant Function

Property                Coefficient    Loading
Self-Disclosure          −.000002        .59
Concern                   .020061        .94
Self-Expression           .008741       [...]
Range of Interaction     −.000437       [...]
Closeness                 .006339        .12
Complexity               −.002833        .01
Feeling Ambivalence       .003891       −.09
Impression Stability     −.003147        .23


With respect to the discriminant function
coefficients, the target types are largely
distinguished by their ratings on the question:
How concerned are you about whether this
person thinks well of you? The subjects were
very concerned about the opinions of supersets
and least concerned about the opinions of
disliked contrasts. They did not distinguish
subsets and miscellaneous targets in this regard.

The profile of loadings in Table 4 presents 
a similar picture. Though Self-Disclosure, 
Self-Expression, Range of Interaction, and 
Closeness do a fairly good job of predicting 
the discriminant function that optimally dis- 
tinguishes the target types, Concern does 
the best job. These properties are all highly 
intercorrelated, as we assumed they would be; 
they were designed to assess the significance 
of a target in a subject’s life, albeit in dif- 
ferent ways. There is some evidence that the 
impressions of supersets are stable relative to 
the other targets. (Impression Stability has 
a loading of .23 on the first discriminant 
function.) There is no evidence that impres- 
sions of supersets are perceived as more com- 
plex. Further, subjects did not perceive their 
feelings as especially ambivalent for any one 
type of target. 

In general, the idea that supersets are sig- 
nificant people seems well supported. On the 
other hand, we were not able to distinguish 
subsets from targets classified as miscella- 
neous. This is not to say that there are not 
important differences here but only that the 
properties we happened to construct failed 
to capture them. Further investigation is 
needed to distinguish these target persons. 


Discussion 


In this study, target persons were de- 
scribed in a free-response manner by subjects. 
Four categories were used in the classification 
of targets. To recapitulate, supersets (sub- 
sets) are target persons who characteristically 
bear superset (subset) relations to other tar- 
gets, although these set relationships may not 
be perfect. Disliked contrasts are characterized
by disjunctive logical relationships with other
targets. Targets not identified as supersets,
subsets, or disliked contrasts were grouped into a
miscellaneous category.

The fact that these four kinds of targets
are structurally distinct does not guarantee
that they will be psychologically distinct. To
determine the psychological distinctiveness
among these target types, independently measured
ratings of the targets were obtained. In terms of
the eight rated properties that were used to
determine the perceived significance of each target
person in the life of the subject, supersets were
found to be the significant people, as predicted,
and disliked contrasts were distinguished as the
least significant people. Subsets and miscellaneous
targets could not be distinguished from one
another.

The remainder of the discussion focuses on 
supersets and addresses two questions: What 
are supersets? Who are supersets? 


What Are Supersets? 


This question has already been answered,
in terms of the logical relation these targets
share as the characteristic way in which they
relate to most other targets. To elaborate,
supersets receive ascriptions from [...] every
content category in a subject’s person
descriptions. A content category refers to a
group of descriptors (traits and feelings) that
are used in a similar way by the subject in
describing persons, probably because those
descriptors roughly refer to some common
quality detected in certain persons and to
some common quality detected (felt) in the
self when certain persons are present. The
use of hierarchical clustering with the
free-response person description protocols is one
of the methods for determining a subject’s
content categories (e.g., Rosenberg, [...]).
Nonsupersets—especially subsets—are [...]
described with many of the terms in one
content category (or more) and few of the terms
in other content categories (Gara, Note [...]).

What might these structural properties of
the subjects’ person descriptions [...] about
the process of person perception? [...]
about process is suggested from [...]
of these protocols, although no [...]
process were directly obtained in this study.
Supersets, that is, significant persons in one’s
life, provide the perceptual (content) categories
for the perception of other persons. This idea is
certainly not new. It is involved, though stated
differently, in many extant psychologies of the
person (e.g., Freud, 1940/1949; Kelly, 1955;
Sullivan, 1953).

Freud’s concept of transference will be discussed
briefly as an illustration of this idea that
significant persons provide the perceptual
categories for the perception of other persons.
Transference refers to the situation in
psychotherapy where the therapist is seen as a
partial replication of a significant person in the
patient’s life, and the patient acts and reacts
to the therapist on that basis. In other words,
the therapist is a subset of a significant other.
Applied to the broader context of person
perception in general, rather than therapist
perception in particular, transference simply
refers to the experience of persons as subsets of
significant others. It is interesting to note
here that two of the roles that loom large in
classical accounts of transference—the self
(Me Now) and mother—are each found to be
supersets in 8 of 14 cases.


Who Are Supersets? 


Although more than 50% of the subjects
had Mother and/or Me Now as supersets, it
would be misleading to think that in terms
of the role descriptions presented in Table 1,
supersets only cover a small range, such as
parental figures. What is true is that supersets
are never mythical, public, or disliked role
figures. Furthermore, in a preliminary analysis
(Gara, 1978) it was found that one group of
subjects had only parental figures as supersets,
whereas, in another group, supersets appeared to
be mostly nonparental and contemporary figures
such as lovers and friends. Another group of
subjects seemed to fall between these two extremes
with respect to the supersets identified in the
protocols.

Regardless of these differences in the role
relationship of the supersets to the subjects,
supersets were universally found to be very
significant persons in the subject’s life. What
is especially striking is that the ratings of
Concern best discriminated supersets from
other targets. Subjects are very concerned
that the supersets think well of them.

What is the nature of this concern? The
answer depends on how the subject interpreted
the question. Subjects’ ratings on
other properties show that they feel close to 
supersets, that they discuss personal problems 
with supersets, and that they freely express 
their feelings with supersets. It could be that 
subjects’ ratings of supersets on Concern re- 
flect their desire that these important people 
in their lives accept them totally, regardless of 
shortcomings. In other words, the question 
could have been interpreted as the desire for 
unconditional positive regard (Rogers, 1959) 
from supersets. 

On the other hand, there is the possibility 
that some or all subjects interpreted the ques- 
tion as an assessment of how positive they 
wanted to appear to a target. In this case, 
when the subjects report, as they do, that 
they express themselves freely with supersets, 
the spontaneity of the expression is limited 
by their concern that they will appear posi- 
tive. That is, subjects monitor themselves in 
the presence of supersets. Regardless of what 
role the superset bears with the subject, even 
if the superset is the self-concept itself, there 
is always at least one phenomenological rep- 
resentation of a person that, in effect, mon- 
itors the individual. The notion that the self- 
concept itself can be a monitor is not new: 
“Most of the ways of behaving which are 
adopted by the organism are those which are 
consistent with the concept of the self” 
(Rogers, 1951, p. 507). The notion of “mon- 
itoring” per se is involved in the psycho- 
analytic concept of introjection of the signif- 
icant other, which results in the formation of 
the superego. Elaboration of these notions can 
be found in object relations theory (Fair- 
bairn, 1954). 

Future investigations of the set-theoretical
representation of target persons in person
perception need to be directed at the unraveling
of the subjective meaning(s) of Concern vis-à-vis
supersets and the investigation of the possibility
that supersets are monitors. It would also be
important to determine if a subject’s general
psychological adjustment and living problems can
be differentially affected depending on the role
relationship he/she bears with his/her perceived
monitor. That is, is it better in terms of general
adjustment to have a parental figure or a
contemporary figure as a superset/monitor? Are
there any people with mythical/public figures
as supersets (none in this study), and if so,
what are the consequences? We should also
not exclude the possibility that all
representations of significant others are, in a
sense, mythical, and the monitoring function per se
that those representations serve is what is
problematic in human experience.


Reference Notes


Internalization Versus Identification in the Laboratory:
A Causal Analysis of Attitude Change


Daniel Romer
University of Illinois at Chicago Circle


Two possible mediators of attitude change, internalization and identification,
were investigated in a laboratory setting. Internalization was assumed to
underlie change when respondents are attracted to others who hold the same opinion
and can argue in favor of their attitude position; identification was assumed to
mediate change when respondents are attracted to similar others but cannot
necessarily support their position. A causal analysis based on these assumptions
confirmed the independent existence of internalization and identification as
mediators of attitude change. The analysis suggested that internalization
involves more valid change than identification does and that attraction toward
similar others is affected by both internalization and identification. These
conclusions are supported in terms of both individual and treatment variation. The
results suggest that attraction toward similar others does not necessarily reflect
true attitude change but that valid change can be detected even in laboratory
settings.


Distinguishing true attitude change from
mere response change is a critical issue in the
study of persuasion. Hendrick and Seyfried
(1974) approached this problem by assuming
that genuine attitude change would be reflected
by increased attraction to others who
hold the same opinion (Byrne, 1971). There
are several objections to this approach, however.
First, as Wells (1976) noted, demand
characteristics and evaluation apprehension
may just as easily affect the measurement of
attraction as the measurement of attitude
change. Thus, both attitude change and
attraction could be disingenuous. Second, it is
possible for attraction to be genuine and yet
not reflect true attitude change. For example,
people may identify (Kelman, 1961) with others
who agree with a message without having
fully accepted (internalized) the message
themselves. Third, more direct evidence of the
persuasive impact of a message should be
obtained before labeling its effects as valid. For
example, evidence that the message has been
internalized should be evaluated.

To overcome these objections, the present 
research employed a causal modeling ap- 
proach based on Kelman’s (1961) three- 
process model. This research was conducted 
to demonstrate (a) that identification and 
internalization are independent processes 
with separate determinants and separate con- 
sequences; (b) that, as Kelman has theorized, 
internalization is a more central response to 
persuasion than identification is; and (c) that
attraction toward similar others is a function
of both identification and internalization. In
addition, the research was planned to encompass
both treatment and individual variation
in response to a persuasive message, thereby
providing a sensitive test of the
similarity-attraction hypothesis (cf. Hendrick &
Bukoff, 1976).


The Causal Model 


Causal modeling has its roots in the biometric
work of Wright (1921) and was developed to
uncover causal mediators in correlational
research. More recent developments
(Goldberger & Duncan, 1973) have extended
the method to latent variables and multiple
indicators. An important advance in these
methods is Jöreskog’s (1969, 1973) application
of maximum likelihood methods to the
analysis of causal models. Jöreskog’s method
involves the specification of a causal model
assumed to underlie a data set and the
confirmation of the model with maximum-likelihood
factor analysis (cf. Kenny, 1976).

Although causal analysis was originally
developed for nonexperimental research, it is
also useful for separating latent mediating
effects in experimental designs (Alwin &
Tessler, 1974; Costner, 1971). In the case of
attitude research, agreement with a persuasive
message may be mediated by several processes.
According to Kelman, the most central
(and valid) change occurs when the attitude
has been internalized by the recipient. It is
possible, however, for change to occur as a
result of identifying with the communicator or
others who hold the same position. Such
change reflects attraction toward others that
is based on motives other than the true
acceptance of the position.¹


Figure 1. The causal model with observed values for causal paths.
A causal model representing Kelman’s
theory as it applies to the present research is
shown in Figure 1. Causal links (arrows)
between variables represent the relationships
that are hypothesized by the present theoretical
approach (the coefficients are discussed in
the results). Each observed independent and
dependent variable is shown to be affected by
one or more latent variables or factors
(enclosed in boxes). In the model, internalization
and identification are assumed to be
independent processes that underlie [...]
attitude measures (described below). In addition,
the model suggests that experimental
independent variables may affect one or both
of the mediators at the same time. According
to the model, it is quite possible that effects
of internalization and identification are
confounded in ordinary experimental research.
The use of causal analysis, however, permits
the separation of these mediating processes.
In order to employ a causal analysis, it was
necessary to manipulate and to measure the
mediators in such a way that their causal
effects could be distinguished.


1 Kelman also hypothesizes the existence of a third
process (compliance); however, this mediator is not
the focus of the present research.

Independent variables. If internalization
and identification are independent, then each
must have its own determinants. Furthermore,
to demonstrate that these processes do not
merely reflect individual differences, it is
necessary to show that they are amenable to
experimental variation. Thus, independent
variables were chosen that would affect the
processes and facilitate their separation.

For internalization, the variable that ful-
filled these purposes was the extent to which
a persuasive message contained arguments
that were congruent with recipients’ values.
The more congruent the arguments, the more
readily they can be internalized. It is also
possible, however, for this manipulation to
affect identification. Communicators who use
more value-congruent arguments might also
appear attractive over and beyond the extent
to which their arguments are internalized.
Thus the congruence manipulation is shown
as a potential determinant of both internaliza-
tion and identification in the diagram.

A second variable, communicator trust-
worthiness, was manipulated to affect identi-
fication. Communicators who are trustworthy
should be more likely to induce identification
than ones who are not. At the same time, com-
municator trustworthiness might also affect
internalization, so this causal link was also
tested in the model. Trustworthy communica-
tors may facilitate internalization because
their arguments are more credible.

Finally, the product of the independent
variables was included because the variables
might interact in affecting either internaliza-
tion or identification.

Dependent variables. It was also critical
to the present research to distinguish internal-
ization from identification at the response
level. To accomplish this goal, an effect of
internalization that would not be produced by
identification was measured. This variable
measured the extent to which respondents’
attitudes are justified by the beliefs in their
own value system. The justification could in-
volve (a) beliefs that the advocated position
produces benefit (proarguments) or (b) be- 
liefs that it produces harm (counterargu- 
ments). Both of these beliefs can be repre- 
sented in an expectancy—value model of atti- 
tude structure (Peak, 1955; Rosenberg, 
1956) in which proarguments correspond to 
positive expectations of the advocated posi- 
tion and counterarguments to negative expec-
tations. Thus, a single argument index is
shown affected only by internalization in the 
diagram. 

Two typical indices of attitude change 
(agreement with and convincingness of the 
message) were measured and are shown af- 
fected by both internalization and identifica-
tion in the diagram. A single index of com-
municator trustworthiness was included to
validate the corresponding manipulation. It 
was assumed to reflect internalization as well 
as identification because a communicator who 
presents more value-congruent arguments 
should also be seen as more trustworthy. 

Finally, an index of attraction toward a 
similar other was measured. The other was 
described as a fellow subject who agreed with 
the advocated position. It was assumed that 
to the extent that subjects also agreed with 
the advocated position, their similarity and 
attraction toward the fellow subject would be 
greater. A major implication of the present 
model is that this process occurs as a conse- 
quence of both internalization and identifica- 
tion and therefore is not a pure measure of 
true attitude change. 

Uncontrolled variation. Also shown in the 
diagram are sources of variation in the de- 
pendent variables that were not under ex- 
perimental control. These include individual 
differences in internalization and identifica- 
tion that could be sizable. It is desirable to ana- 
lyze the present model in terms of both treat- 
ment and individual variation, since similarity— 
attraction is so sensitive to individual varia- 
tion. Furthermore, subjects’ value systems
could be expected to differ greatly, producing
considerable individual variation in internal- 
ization. Finally, a certain amount of error is 
shown affecting each dependent variable. The
errors were assumed to be mutually uncor- 
related, an assumption that should be (and 
was) evaluated. 


Method 
Subjects and Design 


One hundred and sixty undergraduates fulfilling 
course requirements in introductory psychology were 
randomly assigned to four conditions of an orthog- 
onal design involving two levels of value congruence 
and two levels of communicator trustworthiness. 
Equal numbers of males and females were assigned 
to each condition. 


Experimental Variables 


The intent of the manipulations was to produce 
change in subjects’ attitudes toward the admittance 
of Puerto Rico to the Union. The value-congruence 
manipulation involved variation in the acceptability 
of a message that argued in favor of making Puerto 
Rico the 51st state. The less congruent message, an
adapted version of one used by Watts and McGuire 
(1964), referred to the need to establish military 
bases in Puerto Rico and to replace those in Cuba, the 
international propaganda value of admitting a state 
composed of a minority population, and the economic 
advantages of eliminating import duties upon Puerto 
Rican goods. The more congruent message, which was 
written by the author, claimed that statehood would 
ensure a higher minimum wage for the underpaid 
and exploited work force, a tax structure more 
suited to the needs of the people, and a voice in 
Congress to affect the decisions that are now made 
without Puerto Rico’s direct representation. The 
messages were approximately equal in length, and
pretesting confirmed that they were perceived as 
equally well written. 

The arguments were presented as a statement 
given by a witness for a congressional committee 
hearing on Puerto Rico’s present and future status. 
The communicators were described as either a dis- 
tinguished professor from the Yale law faculty who 
was an expert on Latin America or as a director of 
public relations for a land development corporation 
located in Puerto Rico. The public relations director 
was assumed to be less trustworthy because his firm 
might benefit from Puerto Rico’s statehood; further- 
more, it seemed reasonable that university students 
would see a law school professor as trustworthy and 
would be more inclined to identify with him than 
with a public relations person.


Procedure


Subjects were assembled in groups of 6 to 12 and
were told that the experiment was concerned with
their “ability to process and evaluate information.”
They were asked, in written instructions, to read
the message carefully and to complete a question-
naire that followed the message. The questionnaire
contained 14-category scales for rating the con-
vincingness of the arguments, agreement with the ad-
vocated position, and the trustworthiness of the
source.

When everyone had completed the questionnaire, 
the argument and attraction questionnaires were ad- 
ministered, their order being counterbalanced within 
experimental conditions. The argument questionnaire
was constructed to increase the likelihood that a 
respondent's ability to support the advocated posi- 
tion would be measured. Previous research by the 
author (Romer, 1979) has suggested that “thought 
listing” procedures such as the one used by Brock 
(1967) do not adequately represent a respondent's 
repertoire of arguments and that a questionnaire 
specifically requesting that respondents list as many
arguments as they can is a more direct measure of 
their ability to argue for or against an attitude posi- 
tion. Therefore subjects were not simply asked to list 
their “thoughts and ideas” about the advocated posi- 
tion. Instead, they were asked to consider that both
positive and negative consequences can follow from 
an event. They were given as an example the de- 
valuation of the dollar, which might lead to greater 
exports but higher prices for imported goods. Fol- 
lowing the example, they were asked to list as many 
positive and negative consequences of admitting
Puerto Rico as they could within a 5-minute period. 
At the end of the 5 minutes, both the desirability 
and the likelihood of the consequences listed were 
rated. Positive desirability was indicated by a scale 
of 1 to 5 ranging from “only slightly desirable” to 
“extremely desirable.” Undesirability was rated from 
—1 (“only slightly undesirable”) to —5 (“extremely 
undesirable”). Likelihood judgments were requested 
on a scale from 1 (“not very likely”) to 5 (“ex- 
tremely likely”). 

The attraction questionnaire was designed to mea-
sure attraction in a relatively unobtrusive way.
Rather than asking subjects how much they would
like a person whose attitude was completely similar
to theirs (as in Hendrick & Seyfried), the question-
naire asked subjects to imagine another subject in an
experiment similar to the one in which they were
participating who expressed “strong agreement” with
the message they had read. Once they had formed
a picture of this person, they were asked to evaluate
the person in terms of the likelihood that she or he
possessed each of 10 traits. The ratings were made
on a 10-category scale ranging from “not at all likely”
to “extremely likely.” The traits, taken from Ander-
son (1968), covered the entire range of likableness
and were all high in judged meaningfulness.

At the conclusion of the experiment, subjects were
invited to ask questions about the experiment and its
purpose. Although the content and the occurrence
of questions varied considerably across sessions, all
subjects were promised written, posted feedback
about the results of the experiment.


Measurement of Dependent Variables


Three of the dependent variables were simply
taken from the category ratings of convincingness,
agreement, and trustworthiness. Convincingness was
rated in response to the question “How convincing
were the arguments in the statement?” (“not very
convincing” to “very convincing”). The agreement
item asked “To what extent do you agree that Puerto
Rico should be admitted as the 51st state?” with
“strongly agree” to “strongly disagree” as polar de-
scriptions; trustworthiness was assessed with “How
trustworthy do you regard the witness to be with
respect to the Puerto Rican issue?” (“not very trust-
worthy” to “very trustworthy”).

The argument index was formed by multiplying
the desirability rating by the likelihood rating for
each consequence that a subject listed and taking
the arithmetic sum of these products as the overall
index for a subject. In a similar fashion, the attrac-
tion index was obtained by multiplying the likeli-
hood ratings by a transformation of the scale values
of likableness tabled by Anderson (1968) for each
of the traits. The tabled values range from 0 to 6,
covering the entire range of likableness. For the pur-
poses of this index, however, a value of 3 was assumed
to reflect neutrality, with values greater than 3
deviating in a positive direction and values less than
3 in a negative direction. These deviation scores de-
fined the trait values that were multiplied by the
likelihood judgments in forming the overall index.
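
The two scoring rules just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the data layout, and the example ratings are hypothetical, the 10-category likelihood scale is assumed to be coded 1 to 10, and the attraction index is assumed to be summed over traits in the same way as the argument index.

```python
# Hypothetical helper names and data layout; only the scoring rules
# (desirability x likelihood, likelihood x deviation from 3) come from the text.

def argument_index(consequences):
    """Sum of desirability x likelihood over the consequences a subject listed.

    consequences: list of (desirability, likelihood) pairs, where
    desirability is 1..5 for positive and -1..-5 for negative
    consequences, and likelihood is 1..5.
    """
    return sum(d * p for d, p in consequences)


def attraction_index(trait_ratings, neutral=3.0):
    """Sum of likelihood x (likableness - neutral) over the rated traits.

    trait_ratings: list of (likableness, likelihood) pairs, where
    likableness is the tabled 0..6 scale value of the trait and
    likelihood is the subject's rating (assumed coded 1..10).
    """
    return sum(p * (v - neutral) for v, p in trait_ratings)


# One made-up respondent: two positive consequences and one negative.
listed = [(4, 5), (2, 3), (-3, 2)]
arg_score = argument_index(listed)      # 4*5 + 2*3 + (-3)*2

# Two made-up traits: a likable one judged likely, a dislikable one judged unlikely.
traits = [(5.0, 8), (1.0, 2)]
att_score = attraction_index(traits)    # 8*2.0 + 2*(-2.0)

print(arg_score, att_score)
```

Centering the likableness values at 3 means that judging a dislikable trait to be likely lowers the index, whereas judging a likable trait to be likely raises it.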


Analysis


The causal diagram presented in Figure 1 cor-
responds to an oblique five-factor model in Jöres-
kog’s system. In this model, observed variables in
the diagram (Y) are written as a function of latent
factors (F) and error (E):


Y = AF + DE,


where A is a matrix of factor loadings and D is a
diagonal matrix of error coefficients. This model pre-
dicts the correlation matrix of independent and de-
pendent variables (C) from


C = ABA' + D²,


in which B is a matrix of interfactor correlations.
The parameters of the A and B matrices as defined
by the causal model are shown in Table 1.


Table 1
Parameters Defined by Causal Model

                                 Factor
Variable/factor          1     2     3     4     5

Matrix A
  Congruence(a)          -     0     0     0     0
  Trustworthiness(a)     0     -     0     0     0
  Interaction(a)         0     0     -     0     0
  Convincingness         0     0     0     -     -
  Agreement              0     0     0     -     -
  Trust                  0     0     0     -     -
  Arguments              0     0     0     -     0
  Attraction             0     0     0     -     -

Matrix B
  1. Congruence         1.0
  2. Trustworthiness    0.0   1.0
  3. Interaction        0.0   0.0   1.0
  4. Internalization     -     -     -    1.0
  5. Identification      -     -     -    0.0   1.0

Note. The far left column lists variables for Matrix
A and factors for Matrix B.
(a) Independent variable coded 1, -1.


As is evident, many parameters were predefined
as zero. These restrictions are implied by the causal
model and indicate that the variable or factor is
expected not to correlate with the relevant factor.
Parameters that were meant to be estimated are left
blank. These correspond to the causal links that were
hypothesized in the diagram. Two of the restrictions 
are notable: (a) The loading of the argument index 
on the identification factor was set to zero, reflecting 
the assumption that identification is independent of 
the extent to which respondents can argue in favor 
of the advocated position, and (b) the correlation 
between internalization and identification was set to 
zero, reflecting the assumption that identification is 
independent of internalization. All of the unrestricted 
correlations in B refer to potential effects of the 
independent variable factors upon internalization and 
identification. The unrestricted loadings in A refer 
to links between factors and the various variables. 
An important aspect of the present model concerns 
the fact that the parameters estimated under it are 
uniquely defined. In ordinary factor analysis, the 
solution is arbitrary because the factors can be 
rotated without violating the assumptions of the 
model. In the present model, the factors are defined 
(by the zero loadings in the A matrix) such that no 
other estimates of the free parameters will fit the 
model as well as the ones that are obtained.² The
solution, therefore, is the best that can be obtained,
given the restrictions that define the factors.

² According to Jöreskog (1969) the criteria for a
unique solution are that there are at least k² (where
k = the number of factors) restrictions in the A and
B matrices and that the rows of the A matrix can
be permuted such that restrictions on the factors lie
in the upper triangle of the matrix. Both of these
criteria are met by the present model.


Table 2
Correlation Matrix and Means and Standard Deviations of Independent and
Dependent Variables

Variable                            1      2      3      4      5      6      7       M      SD
1. Value congruence                                                                  0.00    1.00
2. Communicator trustworthiness   .000                                               0.00    1.00
3. Interaction                    .000   .000                                        0.00    1.00
4. Convincingness                 .247   .187  -.035                                 8.42    3.73
5. Agreement                      .082   .139  -.024   .426                          8.67    3.47
6. Trust                          .193   .285  -.016   .516   .272                   7.72    3.54
7. Arguments                      .135   .020  -.046   .368   .397   .155           15.37   46.07
8. Attraction                     .216   .095  -.058   .535   .429   .397   .466    16.93   22.56

There are a number of computer programs pres- 
ently available to solve this causal system. The one 
used in this analysis was Jöreskog’s analysis of co- 
variance structures (Jöreskog, Gruvaeus, & Van 
Thillo, 1970). This program, as well as others, employs
a large sample chi-square test of fit based on the 
maximum-likelihood principle. The test of fit mea- 
sures the extent to which the predicted correlation 
matrix matches the observed matrix. The program 
produces a solution that minimizes the discrepancy 
between these matrices; however, a particular model 
may or may not fit the observed matrix. If a model 
produces a good fit, it can then be considered a
candidate for explaining the data set.
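
To make the notion of a predicted correlation matrix concrete, the following sketch computes C = ABA' + D² for a small model. It is an illustration only: the function name and the toy loadings are hypothetical and are not the estimates reported in this article, and the sketch does not reproduce the maximum-likelihood fitting carried out by the program.

```python
import math


def implied_correlations(A, B, d):
    """Return C = A B A' + D^2 as a list of lists, with D = diag(d)."""
    n, k = len(A), len(B)
    # AB = A times B (n x k).
    AB = [[sum(A[i][m] * B[m][j] for m in range(k)) for j in range(k)]
          for i in range(n)]
    # C = AB times A' (n x n).
    C = [[sum(AB[i][m] * A[j][m] for m in range(k)) for j in range(n)]
         for i in range(n)]
    # Add the squared error coefficients to the diagonal.
    for i in range(n):
        C[i][i] += d[i] ** 2
    return C


# Hypothetical toy model: three observed variables, two uncorrelated
# factors, and one loading fixed at zero (first variable, second factor).
A = [[0.8, 0.0],
     [0.6, 0.5],
     [0.7, 0.4]]
B = [[1.0, 0.0],
     [0.0, 1.0]]
# Error coefficients chosen so that each variable has unit variance
# (this shortcut is valid here because the factors are uncorrelated).
d = [math.sqrt(1.0 - sum(a * a for a in row)) for row in A]

C = implied_correlations(A, B, d)
for row in C:
    print([round(x, 3) for x in row])
```

A fitting program searches for the free entries of A, B, and D that bring a matrix computed in this way as close as possible to the observed correlations; the zero entries stay fixed throughout.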


Results 


The correlation matrix for both indepen- 
dent and dependent variables is presented in 
Table 2. Rather than focusing upon specific
correlations in the matrix, the present analysis 
attempts to account for the pattern of correla- 
tions in the entire matrix. Before turning to 
the analysis, however, it is worth asking 
whether the present matrix fully accounts for 
the effects that were observed. In particular, 
the order of measuring the argument and at- 
traction indices was counterbalanced within 
experimental conditions, and equal numbers 
of males and females were studied in all cells 
of the design. However, neither order of mea- 
surement nor sex of subject produced any 
effects upon the variables in the matrix. It 
was assumed, therefore, that the data set was 
adequately described by the correlations in 
the matrix. 

The results of the covariance structure
analysis are shown in Figure 1. The first ques-
tion one can ask is whether the correlation
matrix is adequately reproduced by the model.
The answer appears to be yes. The chi-square
test of fit between the predicted and obtained
correlation matrix was nonsignificant, χ²(10)
= 3.55, p = .96, indicating a close corre-
spondence between predicted and obtained
values. Inspection of the differences between
these values revealed that no residual was
greater than .07 in absolute value (M = .006,
SD = .021). Thus, the model does a good job
of predicting the data.

As is evident in the figure, factors corre-
sponding to both internalization and identi-
fication mediated variation in the dependent
variables. All of the dependent variables
loaded positively upon internalization, indi-
cating that respondents who had positive
attitudes also thought the communicator was
trustworthy, had arguments to support their
position, and were attracted toward others
who agreed with the advocated position. A
factor corresponding to identification, how-
ever, was orthogonal to internalization and
mediated variation that was independent of
arguments that supported the advocated posi-
tion. It contained loadings from all of the
other dependent variables, indicating that the
more respondents agreed with the advocated
position, the more they trusted the com-
municator and the more attracted they were to
others who agreed with the advocated posi-
tion. Thus, as expected, attraction toward
others was affected by both internalization
and identification.

An important assumption of the
model is the theoretical independence of in-
ternalization and identification. As they are
defined, both involve variation in expressed
attitudes and attraction toward those who
hold the same attitudes, but only internaliza-
tion is based on arguments in support of the
attitude position. Although the goodness-of-fit
test indicates that this assumption is plausi-
ble, it is also possible to test the model with-
out making the assumption. In this test, the
model is fitted without the B-matrix restric-
tion that the factors are uncorrelated. The
obtained solution, however, was virtually
identical to the one in Figure 1, and the fac-
tors were still orthogonal (r = -.02). It ap-
pears, therefore, that relaxing the assumption
does not improve the fit of the model and that
the assumption is valid.

In evaluating the effects of the independent
variables upon internalization and identifica-
tion, a special procedure was employed. The
obtained correlation (shown in the figure)
was set to zero, and the remaining parameters
of the model were fixed to their obtained
value. The model was then tested to see if the
goodness of fit of the entire model was sig-
nificantly poorer than the fit of the original
model. If it is, then the correlation can be
said to be different from zero.

As expected, the value congruence manip-
ulation affected internalization, χ²(1) = 4.38,
p < .05. The message that contained more
congruent arguments was more readily in-
ternalized. There were no other effects upon
internalization, however. Apparently, the
trustworthiness of the communicator did not
affect the credibility of his persuasive argu-
ments.

The trustworthiness manipulation did affect
identification, however, χ²(1) = 14.03, p <
.05. The more trustworthy communicator ap-
parently induced greater change by virtue of
his attractiveness. In addition, the value con-
gruence of the message affected identification,
χ²(1) = 5.42, p < .05. Thus the congruence
manipulation produced change mediated by
both internalization and identification. The
interaction of the independent variables did
not affect identification, however.
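
The single-path tests reported above each have one degree of freedom, so their tail probabilities can be checked with the standard library alone. The sketch below is a reader's check rather than part of the original analysis; the function name and the dictionary labels are hypothetical, and only the three chi-square values are taken from the text.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 df."""
    # With 1 df, chi-square is the square of a standard normal deviate,
    # so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


# Chi-square values reported in the text for single paths (1 df each).
reported = {
    "congruence -> internalization": 4.38,
    "trustworthiness -> identification": 14.03,
    "congruence -> identification": 5.42,
}

p_values = {path: chi2_sf_1df(x) for path, x in reported.items()}
for path, p in p_values.items():
    print(f"{path}: chi2(1) = {reported[path]:.2f}, p = {p:.4f}")
```

Each of the three values exceeds 3.84, the critical value for one degree of freedom at the .05 level.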

As is evident in Figure 1, uncontrolled
sources of variation in both internalization
and identification were sizable. This indicates
that most of the variation in the mediators
was due to individual differences. Neverthe-
less, the success of the model suggests that
both treatment and individual variation can
be encompassed by the same causal model.
The estimated errors shown in the figure
indicate that from 40% to 64% of the varia-
tion in the dependent variables was unex-
plained.* The assumption that these errors of
measurement are uncorrelated does not seem
unreasonable in view of the very close fit of
the model. That is, there does not appear to
be any reliable variation in the dependent
variables that was not explained by the in-
ternalization and identification factors. Fur-
thermore, the finding that both factors were
affected by the independent variables sug-
gests that the factors are not solely composed
of error. Although the error variation in the
variables was sizable, all of the remaining
error-free variation (36% to 60%) was pre-
dicted by the model.


Rival Models 


Even though a particular model does well
in predicting data, it is always possible that
an alternative model is superior. One advan-
tage of causal analysis, however, is that the
assumptions underlying a model are made
explicit so that different models can be tested.
For example, it is possible that all of the
reliable variation in the dependent variables
is accounted for by a single internalization
factor. Therefore, a model containing only
internalization was fitted to the data. This
model was identical to the favored model with
the exception that the identification factor
was not included. This model did not fare
well, however, by the goodness-of-fit test,
χ²(17) = 27.67, p < .05. Furthermore, the
two-factor model was a significantly better
predictor of the correlation matrix than the
internalization model was, χ²(7) = 24.12,
p < .05. It appears, therefore, that a model
involving both internalization and identifica-
tion is a better explanation of the data than
a model involving only internalization is.
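
The comparison of the two nested models rests on a chi-square difference: the difference between the two fit statistics is referred to a chi-square distribution whose degrees of freedom equal the difference in degrees of freedom. The sketch below reproduces that arithmetic from the reported values. The function name and variable names are hypothetical, and the tail probability uses the standard closed-form series for integer degrees of freedom rather than the routine in the original program.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df (closed-form series)."""
    if x <= 0:
        return 1.0
    chi = math.sqrt(x)
    z = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # normal density at chi
    if df % 2 == 0:
        # Even df: sqrt(2*pi) * z * [1 + sum of chi^(2r) / (2*4*...*2r)]
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2.0 * r)
            total += term
        return math.sqrt(2.0 * math.pi) * z * total
    # Odd df: erfc(chi / sqrt(2)) + 2 * z * sum of chi^(2r-1) / (1*3*...*(2r-1))
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2.0 * r - 1.0)
        total += term
    return math.erfc(chi / math.sqrt(2.0)) + 2.0 * z * total


# Fit statistics as reported in the text.
full = {"chi2": 3.55, "df": 10}       # internalization-identification model
reduced = {"chi2": 27.67, "df": 17}   # internalization-only model

diff = reduced["chi2"] - full["chi2"]
diff_df = reduced["df"] - full["df"]

p_full = chi2_sf(full["chi2"], full["df"])
p_reduced = chi2_sf(reduced["chi2"], reduced["df"])
p_diff = chi2_sf(diff, diff_df)

print(f"two-factor model:      p = {p_full:.3f}")
print(f"internalization only:  p = {p_reduced:.3f}")
print(f"difference: chi2({diff_df}) = {diff:.2f}, p = {p_diff:.4f}")
```

The seven degrees of freedom gained by the restricted model are consistent with Table 1 under the assumption that dropping the identification factor removes four loadings in A and three interfactor correlations in B.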


* The estimated proportion of variance due to 
error in each dependent variable is equal to the 
square of the corresponding error coefficient.

The possibility exists, nevertheless, that a
different two-factor model also accounts for
the data. To evaluate this possibility, alterna-
tive models that contained different restric- 
tions on the second dependent variable factor 
were constructed. These included (a) restric- 
tions on both attitude measures so that the 
second factor only contained loadings from 
trust, arguments, and attraction; (b) a re- 
striction on trust so that all of the remaining 
variables could load on the factor; and (c) 
a restriction on attraction, with the remaining 
dependent variables free. Although each of 
these models provided a good fit to the data, 
the solutions that were obtained were un- 
interpretable. Two of them (a and c) implied 
that the variables on the second factor were 
inversely correlated but positively affected by 
the independent variables; the third implied 
that congruence had a negative impact on all 
the variables of the second factor. Thus al- 
though these models could fit the data mathe- 
matically, they did so at the expense of 
plausibility and interpretability. This is not
surprising, since there were no obvious theo-
retical models that would support these solu-
tions. Nor could the second factor in these
solutions be attributed to correlated measure- 
ment error: The factors in each model were 
correlated with the independent variable fac- 
tors suggesting that if these models were 
valid, the factors were not entirely composed 
of error. Thus, the tests of these alternative 
models suggest that the internalization-iden- 
tification model is the best explanation of the 
data. 


Discussion 


There were three things to be demonstrated 
in the present research. The first was that 
internalization and identification indepen- 
dently mediate attitude change. This, of 
course, was the rationale underlying the two- 
factor model. This model did significantly bet- 
ter in predicting the data than did a model 
containing only internalization as a mediator. 
Furthermore, the separate effects of the inde- 
pendent variables upon internalization and 
identification indicate that each factor re-
flected valid variation in the dependent vari-
ables. In fact, the value-congruence manipula-
tion affected both mediators, but trustworthi-
ness only affected identification. Finally, a
test of the model without the orthogonality
assumption still affirmed the independence of
the factors. Thus, two orthogonal factors ap-
peared to mediate attitude change in the
present laboratory setting.

The second goal of the research was to 
show that internalization is a more central 
process than identification is in mediating 
attitude change. This was demonstrated by
restricting the argument index from loading
on the identification factor. Since subjects
were asked to generate as many positive and
negative arguments as they could (in 5 min-
utes), the argument index has face validity as
a measure of a respondent’s ability to support
the advocated position. Therefore only the
internalization mediator reflected variation
in the perceived consequences and justifica-
tion of the advocated position. Although the
argument index was only one of many pos-
sible measures of internalization, it did dis-
play validity by correlating with the other
measures. Furthermore, the finding that trust-
worthiness affected identification but not 
internalization indicates that there were ef- 
fects that were mediated by processes other 
than internalization. Therefore, even if a 
more sensitive measure of internalization had 
been used, the present results suggest that a 
separate factor would still be needed to ex- 
plain variation that is unrelated to internal- 
ization. 

The third goal of the research was to show 
that attraction toward similar others is a
function of both internalization and iden-
tification. This hypothesis was clearly sup-
ported. The attraction index loaded on each
factor, indicating that variation in attraction
was affected by both internalization and
identification. Apparently, we express accept-
ance of an advocated position not only be-
cause we can argue in favor of it but also be-
cause we identify with others who hold the
same opinion. Therefore, Hendrick and Sey-
fried (1974) were justified in measuring at-
traction as an indicator of true attitude
change. However, the present results suggest
that without evidence about whether or not
internalization has occurred, it is not possible
to know whether attraction reflects internal-
ization or identification.

Wells’ (1976) charge that attitude change
and attraction may both be caused by ex-
perimental artifacts (demand characteristics
and evaluation apprehension) seems an un-
likely explanation for the present results.
Since attitudes and attraction loaded on each
of two factors, one would have to postulate a
complex pattern of compliance motives to ac-
count for both factors. In the case of inter-
nalization, subjects would be assumed to have
role played greater acceptance of the more
congruent message. However, since all sub-
jects received a persuasive message of com-
parable length and writing quality, demands
for expressing acceptance of the advocated
position should have been equal in both con-
gruence conditions. Furthermore, the instruc-
tions for generating arguments requested sub-
jects to list as many arguments as they could.
If anything, this instruction should have in-
duced demands to be prolific rather than to be
favorable toward the advocated position.
Finally, if the internalization factor merely
reflects compliance, it is unclear why the fac-
tor would not also be affected by source trust-
worthiness.

Whether the identification factor reflects
experimental artifacts is less certain. How-
ever, attraction was measured less obtrusively
than in Hendrick and Seyfried (1974), and,
even if compliance contributed to the identi-
fication factor, the conclusion that more than
one form of attitude change mediated the
results would still be tenable. Thus although
it is impossible to rule out completely a com-
pliance explanation, it seems unlikely that
compliance can account for the entire pattern
of the present results.

One implication of the present approach is
that if subjects’ attitudes were retested at a
time when the message and source were less
salient, attitudes based on internalization
would be more stable than those based on
identification. Kelman (1958) used this as-
sumption to show that compliance, identifica-
tion, and internalization could be distin-
guished. The present results are consistent,
therefore, with research on the persistence of
attitude change in which variation due to
communicator trustworthiness decays at a 
different rate from “message-only” variation 
once “dissociation” of the source has occurred 
(Gruder, Cook, Hennigan, Flay, & Halamaj, 
1978; Hovland & Weiss, 1951). The basis of 
this differential decay could be that trust- 
worthiness is mediated by identification more 
than by internalization. 

One advantage of the present causal model- 
ing approach is that both individual and 
treatment variation could be tested in the 
same analysis. It was possible, therefore, to 
observe the similarity-attraction relationship 
at both the individual and treatment levels. 
Although the variation in individual attitudes 
and attraction was far greater than the treat- 
ment variation, both appeared to be sub- 
sumed by the same causal processes. 

The present results indicate that labora- 
tory-produced attitude change may be valid. 
This is cause for optimism on at least two 
grounds. First, although the study of attitude 
change was once a central issue in social psy-
chology, interest in the topic has waned. Re-
searchers doubt the validity of studying atti-
tude change in the laboratory, and the results 
that have been obtained are regarded with 
suspicion. Although there are probably many 
reasons for the reduced interest, the present 
results suggest that suspicion of the labora- 
tory is not always justified. Second, if genuine 
attitude change can be distinguished from 
other forms of change, the study of attitude 
change can proceed despite the belief that 
laboratory-induced persuasion is suspect. For 
example, research can be conducted further 
to determine how and when a message is in- 
ternalized. Thus even if the many variables 
that have been studied only produce surface 
change, this realization need not impede 
progress toward determining the factors that
cause long-lasting and meaningful change in
people’s attitudes.

Attitudes and Cognitive Response:
An Electrophysiological Approach

Two experiments employed electrophysiological procedures for assessing the
covert information-processing activity of message recipients. In Experiment 1, 
24 subjects expected to hear discrepant communications and were requested to 
“collect their thoughts” following each forewarning. As discrepancy increased, 
anticipatory counterargumentation increased, whereas production of favorable 
thoughts and agreement decreased. In addition, following forewarnings oral 
muscle, cardiac, and respiratory activity increased, whereas nonoral muscle ac- 
tivity remained constant and quiescent. In Experiment 2, 60 subjects antic- 
ipated and heard a proattitudinal, a counterattitudinal, or a neutral communi- 
cation. They evaluated more positively and generated more favorable thoughts 
and fewer counterarguments to the proattitudinal than to the counterattitudinal 
advocacy, but rated similarly the neutral and proattitudinal advocacies. As in 
Experiment 1, incipient oral muscle activity increased following the forewarn- 
ing of an involving counterattitudinal advocacy; it also increased for all con- 
ditions during the message. Patterns of subtle facial muscle changes reflected 
the affective nature of the cognitive responding before and during the message. 
These results provide evidence that electrophysiological assessments offer ob- 
jective, concurrent, and independent measures of cognitive response in persua- 
sion, and support the notion of recipients as active information processors when
topic involvement is high.


Recent work in attitude change has em- 
phasized the manner in which persons process 
the information contained in persuasive mes-
sages. Investigators have studied how such
variables as source credibility (Cook, 1969;
Gillig & Greenwald, 1974), distraction (Petty, 
Wells, & Brock, 1976), message repetition 
(Cacioppo & Petty, 1979), message compre- 
hensibility (Eagly, 1974), forewarning of per-
suasive intent (Petty & Cacioppo, 1979a) and
of topic and position (Petty & Cacioppo, 
1977), number of arguments employed (Cal- 
der, Insko, & Yandell, 1974), issue involve- 
ment (Petty & Cacioppo, 1979b), heart rate 
(Cacioppo, 1979), group discussion (Burn- 
stein & Vinokur, 1977), and so forth affect 
the profile of cognitions (e.g., counterargu- 
ments, favorable thoughts, neutral thoughts), 
and attitude change. Theoretical interest in 
the influence on persuasion of a person’s 
idiosyncratic cognitive responses to an ad- 
vocacy is certainly not new (cf. Hovland, 
Lumsdaine, & Sheffield, 1949), but the present 
level of research activity in the area marks a 
shift in emphasis toward this approach (cf. 
Petty, Ostrom, & Brock, in press). The re- 
invigorated interest in these covert thought 
processes stems in part from the apparent 
inability of classical learning theories to pro- 
vide parsimonious accounts of observed atti- 
tude changes (Greenwald, 1968; Petty, 1977),
and in part from improvements in measure-
ment procedures (Brock, 1967; Cacioppo, 
Harkins, & Petty, in press). 

We now know that the level of cognitive
responding to an advocacy can vary from
complete inattention or simple structural (e.g.,
orthographic) analyses of the message to ex- 
tensive semantic examination and cognitive 
elaboration. The latter, but not the former, 
allows recipients to determine the implica- 
tions of an event for themselves and for sig- 
nificant others (Cacioppo, Glass, & Merluzzi, 
1979; Craik & Tulving, 1975; Janis & Mann, 
1977; Rogers, Kuiper, & Kirker, 1977) and 
appears to be an important determinant of 
attitude change (cf. Petty et al., in press). 
On the other hand, the cognitive response 
(information-processing) approach to persua- 
sion has been questioned in recent years 
(Baron, Baron, & Miller, 1973; Langer, 
Blank, & Chanowitz, 1978; Miller & Baron, 
1973; Romer, 1979). The self-report mea- 
sures of cognitive response have been crit- 
icized for placing post hoc demands on sub- 
jects to report rational(izing) processes for 
their attitude change. These critics have 
called for direct, concomitant measures of cog- 
nitive activity in persuasion. 

Perhaps the most commonly used measure 
of cognitive response in persuasion is the
“listed thoughts” procedure developed by 
Brock (1967) and Greenwald (1968). Inspec- 
tion of the existing data regarding the re- 
activity and reliability of the thought-listing 
procedure is encouraging. The same pattern 
of attitudinal responses has been reported in 
groups that list and do not list their thoughts 
(Petty & Cacioppo, 1977), and the order of 
attitude and cognitive response assessment 
has not affected either measure (Calder et al., 
1974; Petty et al., 1976). In addition, split-half
and test-retest reliability of attitude and cog- 
nitive response measures are satisfyingly high 
(Cullen, 1968). Nevertheless, it can still be 
argued that requesting subjects to list their 
thoughts produces responses that would not 
and do not occur naturally. 

Hence, we believed it worthwhile to de- 
velop a concurrent measure of cognitive and 
affective response that could be obtained 
without the subject’s knowledge of its purpose
and that would not require the subject to emit
a voluntary response. An electrophysiological
approach seemed promising:


It is possible that a hypothetical construct, hypoth- 
esized to be a covert response (r), may be directly 
measured as an electromyogram that occurs after the 
external stimulus, but before an overt response— 
(indeed) the covert electromyographic response may
be an antecedent or determiner of the following
overt response. (McGuigan, Culver, & Kendler, 1971,
p. 146)


Thus, by studying electrophysiologically
microprocesses operating in a persuasion set-
ting, much may be learned (i.e., confirmed or
disconfirmed) about the presence and nature
of the postulated cognitive-response processes.

Fortunately, psychophysiological research
has yielded a few biocognitive relationships
that make the present endeavor simpler and
more assuredly informative. McGuigan and
his colleagues, for instance, have amassed
substantial data illustrating the sensitivity
and reliability of the electromyogram
(EMG), particularly of the speech muscles,
as an index of covert linguistic activity (for
reviews, see Garrity, 1977; McGuigan, 1970,
1973b, 1978; Sokolov, 1972). Also, cardiac
activity has been shown to accompany covert
information processing, with a decelerated
heart rate characteristic of sensory acuity
reception, and an accelerated heart rate char-
acteristic of cognitive elaboration (e.g., Caci-
oppo & Sandman, 1978; Lacey, Kagan,
Lacey, & Moss, 1963; Schwartz & Higgins,
1971). A brief overview of these areas of re-
search is provided next.¹


Electromyographic Specificity and 
Cognitive Responding 


A conception held generally about the EMG
activity of the speech muscles (when this
activity reflects subvocal speech) is that the
resultant afferentiation traveling from these
muscles to the language centers of the cortex
is an aid to stimulus comprehension, elabora-
tion, and abstraction (Hardyck & Petrinovich,
1969; McGuigan, 1973a, 1978; Sokolov,
1967, 1972). Specifically, the afferentiation
from the speech muscles appears to serve as
a nonmeaningful but redundant phonetically
coded representation of the immediately pre-
ceding efferent language commands, which
originate in the cortex and result from (some-
times preliminary) processing of external and
memorial stimuli. This redundancy serves to
strengthen, in a manner similar to that of
cognitive rehearsal, the links between the ex-
ternal event and the internal representations.
Ultimately, as the links between the external
stimulus and the resulting (overt) response
become established (e.g., automated), less
redundancy is needed and less speech EMG
activity is in evidence.


1 This discussion is certainly not exhaustive of the
psychophysiological responses indicative of mental
activity. Two notable exclusions are the pupillary
(Goldwater, 1972; Kahneman, 1973) and average
evoked (cortical) responses (Posner, 1975). The in-
terested reader may wish to consult McGuigan and
Schoonover (1973) or Greenfield and Sternbach
(1972) for additional information about these mea-
sures.

Consistent with these notions, Sokolov
(1969) reported that subjects performing a
complex matrix manipulation exhibited bursts
of covert oral EMG activity at first. Repeated
trials with the task, however, resulted in a
return of the EMG activity to basal levels.
Similarly, nonproficient readers display
greater oral EMG activity while reading than
do proficient readers, but both groups dis-
play elevated oral EMG activity when read-
ing difficult material (McGuigan, 1967; Mc-
Guigan & Bailey, 1969b; McGuigan, Keller,
& Stanton, 1964); greater oral EMG activity
is present when persons read silently than
when they process nonlinguistic materials
(e.g., music); and the silent processing of
both linguistic and nonlinguistic stimuli leads
to increased oral EMG activity compared to
baseline measures (McGuigan & Bailey,
1969a).
Further, these changes in oral EMG do not
reflect simple changes in the general somatic
or autonomic arousal of the organism: A con-
sistent increase in EMG activity is not ob-
served in irrelevant (e.g., nonlinguistic)
muscle groups (cf. McGuigan, 1970; Sokolov,
1969); nor has the galvanic skin response
been related to oral EMG during the silent
performance of language tasks (Sokolov,
1969, 1972); and quickened breathing is ob-
served during silent reading, presumably re-
flecting subvocalization (McGuigan et al.,
1964; McGuigan & Rodier, 1968). Finally,
the EMG responses of the speech muscles are
highly specific. Collectively, these results
suggest that oral EMG activity can reflect
information processing in persuasion.²


Cardiac and Cognitive Activity 


Contemporary investigations of changes in 
heart rate and covert information processing 
have included: (a) monitoring heart rate and 
performance during the anticipation and per- 
formance of a wide variety of cognitive (e.g., 
sentence generation) and sensory (e.g., view- 
ing flashing lights) tasks (e.g., Lacey, 1959; 
Lorens & Darrow, 1962); (b) the measure- 
ment of the differences in heart rate change 
during task performance by individuals who 
differ dispositionally in their mode of infor- 
mation processing for that task (e.g., Blatt, 
1961); (c) the manipulation of attributes of 
the task (e.g., processing requirements) while 
monitoring heart rate and accompanying 
physiological responses (e.g., Blaylock, 1972; 
Tursky, Schwartz, & Crider, 1970); (d) mon- 
itoring heart rate and general somatic activity 
during the anticipation and performance of 
tasks under varying levels of incentive or 
motivation (e.g., Elliott, 1974; Obrist, Webb, 
Sutterer, & Howard, 1970); (e) monitoring 
sensory thresholds, reaction time, or cognitive
and attitudinal responses following endog- 
enous changes in heart rate (e.g., Cacioppo, 


2 Festinger and Maccoby (1964) were among the 
first persuasion researchers to discuss the subvocal-
izations of a recipient in response to a persuasive 
appeal. Keating and Brock (1974) pursued the no- 
tion that a recipient's subvocalizations may contain 
rich evidence regarding what was happening to the 
stimulus between the time the subject heard it and 
responded overtly to it. However, these efforts were 
largely unsuccessful empirically, due to the insensitive 
procedures employed; no direct electrophysiological 
measurement of speech EMG activity in a persuasion 
setting has ever been reported. Moreover, previous
discussions of subvocal activity in persuasion have
focused solely upon the counterarguing process,
whereas we focus here on more general cognitive-
response processes.
Sandman, & Walker, 1978; Edwards & Alsip,
1969); and (f) observing the cognitive and 
attitudinal effects of exogenously manipulated 
heart rate (Cacioppo, 1979). The cardiac re- 
sponse generally covaries with the cognitive 
requirements or difficulty of the task, par- 
ticularly when the task requirements are sub-
stantial. Although considerable controversy
still exists in psychophysiology regarding the 
neurophysiological and/or biological processes 
controlling the heart rate responses observed 
in the aforementioned studies, the empirical 
link between the cardiac response and cogni- 
tive activity seems well established. Whether 
the cardiac acceleration observed during the 
performance of cognitive tasks is initiated by 
a metabolic control center (cf. Obrist et al., 
1970) or by a modulating negative feedback 
system (cf. Lacey, 1967) is unimportant here. 
Central to the present study is the finding 
that the cardiac response reflects considerable 
changes in cognitive activity, even when these 
changes are spontaneous and self-induced 
(Schwartz, 1971; Schwartz & Higgins, 1971). 


Methodological Considerations 


Several methodological safeguards were in- 
stituted here to assure that the electrophysio- 
logical measures we observed reflected covert 
information processing. First, a measure of 
general (nonlinguistic) somatic activity was 
obtained in addition to the measures of oral 
EMG and cardiac activity. To the extent that 
the activation is specific to the speech muscle 
fibers and heart rate when processing an ex- 
ternal event, we are more confident that we 
have tapped covert processing rather than 
irrelevant (i.e., unreliable) movements, pos- 
tural shifts, and so forth. Second, the level of 
heart rate and EMG activity observed while 
anticipating and processing the stimulus was 
compared with prestimulus measures to assess 
whether or not electrophysiological “response” 
actually occurred (McGuigan, 1970). And 
third, Miller and Baron (1973) have suggested 
that observations of oral EMG (and cardiac) 
activity may not differentiate the cognitive 
elaboration of the advocacy from the covert 
rehearsal of the arguments constituting the 
persuasive message. One of our aims was to
determine the existence of cognitive-response
processes in persuasion; hence, we obtained
recordings of the electrophysiological mea-
sures while subjects were anticipating highly
involving and counterattitudinal advocacies
as well as while subjects were processing
them. A good deal of evidence documenting
that anticipating a discrepant and involving
message evokes issue-relevant cognitive re-
sponding has accrued using the thought-list-
ing procedure (Cacioppo, Petty, & Snyder,
1979; Petty & Cacioppo, 1977; Cialdini &
Petty, in press).


Experiment 1 


The aim of Experiment 1 was primarily
methodological. We sought to test the utility
of an electrophysiological assessment of cogni- 
tive response. To do so, we forewarned indi- 
viduals that they would hear a (counter- 
attitudinal) message and we asked them to 
“collect their thoughts” about the issue. Im- 
mediately preceding our forewarning, we col- 
lected basal measures of physiological re- 
sponse, and we continued to record these 
measures while the individuals were anticipat-
ing the discrepant message (i.e., when they
presumably were following instructions and
were generating cognitive responses concern- 
ing the advocacy). We expected oral EMG 
and cardiac activity, but not general somatic 
activity, to be heightened during the “collect 
thoughts” interval, compared to basal levels. 

Confirmation of these expectations would 
illustrate the applicability and utility of elec- 
trophysiological techniques for studying cog- 
nitive response in persuasion. Obtaining these 
results would not, of course, provide evidence 
that cognitive responses in persuasion are
generated naturally, since we asked the indi- 
viduals to collect their thoughts following the 
forewarning. A second experiment addressed 
this latter issue. 

A second aim of Experiment 1 was to ex-
plore the effects of affect- and nonaffect-laden
cognitive responding on electrophysiological
response patterns. Neither oral EMG (cf.
Garrity, 1977; McGuigan, 1978) nor cardiac
activity (Cacioppo & Sandman, 1978; Harris,
Katkin, Lick, & Habberfield, 1976) appears to
distinguish the types of affective response
evoked, though differentiation between affect-
and nonaffect-laden thought sequences using
heart rate has been observed (Schwartz,
1971). The discrepancy of the impending ad-
vocacy was manipulated in Experiment 1 to
obtain gradations of affect-laden processing
to assess its physiological effects. Additionally,
the thought-listing procedure was used to ex-
plore the relationships among the cognitive
(e.g., favorable thoughts, counterarguments,
neutral/irrelevant thoughts) and electrophys-
iological (e.g., oral EMG, heart rate) re-
sponses.


A 3 × 3 × 2 × 2 mixed design was employed in
which the three replications served as a between-
subjects factor, and levels of communication dis-
crepancy (low, moderate, and high), two different
topics within each level of discrepancy, and interval
during which electrophysiological measures were
recorded (prewarning baseline and postwarning
“collect thoughts” interval) served as within-subjects
factors. (The interval factor was relevant only to the
analyses of the electrophysiological measures.)


Materials 


A separate audiotape was prepared for each of
the three replications. Each tape contained the ex-
perimental instructions and six announcements re-
garding the source of, topic of, and position to be
advanced in an upcoming message. (In fact, how-
ever, the messages were never presented.) Each level
of discrepancy for each topic appeared in one replica-
tion. The tapes (i.e., replications) differed in the
order of the topics, which was determined randomly
for each replication. The experimental stimuli are
displayed in Table 1.


Table 1
Experimental Stimuli

                                                                       Discrepant position
Source                        Topic                                    Low       Moderate               High
University board of regents   Increasing student tuition by . . .      $5.00     $40.50                 $90.00
State medical association     Increasing drinking age to . . .         19 years  21 years               25 years
Valet of state legislature    Increasing gasoline sales tax by . . .   2¢        5¢                     10¢
University faculty committee  Graduate students teaching all . . .     freshmen  freshmen & sophomores  freshmen & juniors
President of the university   Extending finals by . . .                2 days    5 days                 9 days
County municipal courts       Increasing traffic fines by . . .        25%       100%                   300%


Procedure


When subjects arrived at the laboratory, they
were placed in a sound-attenuated room and were
seated in a comfortable chair. Electrodes were at-
tached for measuring oral (orbicularis oris—lips;
digastricus—chin; platysma—throat) EMG activity,
nonspeech (trapezius—back) EMG activity, heart
rate, breathing rate, and cephalic pulse amplitude. A
5-minute adaptation period preceded the experi-
mental trials.

Subjects were told that in about 40 minutes they
would hear several different messages having direct
consequences for undergraduates, and that before the
presentation of the messages we should like to obtain
their comments on and evaluations of the position
(i.e., advocacy) to be advanced in each message.
The subjects were asked to sit quietly and collect
their thoughts for the minute following each an-
nouncement; then, at the experimenter’s signal, sub-
jects were asked to list everything about which they
had been thinking (subjects were given 3 minutes to
do so), to rate their agreement with the upcoming
advocacy, and to complete several ancillary measures
(i.e., felt effort, involvement, distraction, and respon-
sibility). The nature of these forms is described in
detail in Petty and Cacioppo (1977). The forewarn-
ing, collect thoughts, and thought-listing intervals
were repeated six times to cover six different topics,
each separated by a variable intertrial interval (ITI)
ranging from 90 to 120 seconds. Each ITI was in-
itiated when the experimenter requested the subjects
to “please sit quietly for the next minute or so.” The
final 60 seconds of each ITI served as the baseline
measure for the subsequent 60-second “collect
thoughts” interval.

Chin muscle activity. Two Grass ESS cup elec-
trodes filled with Grass EKG Sol were placed on
the midline of the chin; the first was placed 1.8 cm
above the point of the chin and the second was
placed 1.8 cm below the point of the chin. Follow-
ing Cacioppo et al. (1978), chin muscle activity was
calculated for the 60-sec intervals of interest using
the following formula:


$$
\text{Activity index} = \sum_{i=1}^{m} (l_i \times s_i)
$$


where $s_i$ is the scale value of a particular amplitude
of EMG activity (larger amplitude EMG activity
was assigned larger scale values; scale values ranged
from 0, for deflections 2 mm or less, to 5, for de-
flections exceeding 2 cm); $l_i$ is the total horizontal
length (i.e., time) of a particular scale value of EMG
activity measured in millimeters; and $m$ is the num-
ber of distinct instances of a particular amplitude
of EMG activity. The data were quantified in this
manner for the 60-sec baseline and collect thoughts
intervals.
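
The scoring rule described above can be illustrated with a short Python sketch. It assumes that the index is the sum, over the distinct instances of EMG activity, of horizontal length multiplied by scale value. Only the two endpoint scale values (0 for deflections of 2 mm or less, 5 for deflections exceeding 2 cm) come from the text; the intermediate cut points, the function names, and the sample record are hypothetical.

```python
from bisect import bisect_left

# Upper bounds (in mm) for scale values 0 through 4; larger deflections score 5.
# Only "2 mm or less -> 0" and "more than 20 mm -> 5" are taken from the text;
# the intermediate cut points (5, 10, 15 mm) are hypothetical.
CUT_POINTS_MM = [2.0, 5.0, 10.0, 15.0, 20.0]


def scale_value(deflection_mm):
    """Return the 0-5 scale value for a pen deflection measured in mm."""
    return bisect_left(CUT_POINTS_MM, deflection_mm)


def activity_index(segments):
    """Sum length x scale value over (length_mm, deflection_mm) segments."""
    return sum(length_mm * scale_value(deflection_mm)
               for length_mm, deflection_mm in segments)


# Hypothetical 60-sec record: three distinct instances of EMG activity.
record = [(30.0, 1.0), (12.0, 4.0), (5.0, 25.0)]
print(activity_index(record))  # 30*0 + 12*1 + 5*5 = 37.0
```

Because quiescent stretches receive a scale value of 0, they contribute nothing to the index regardless of their length.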

Lip muscle activity. Two Grass ESS cup elec- 
trodes filled with Grass EKG Sol were placed 1 mm 
below the bottom lip; each was placed .5 cm in from 
the ends of the mouth. Lip muscle activity was cal- 
culated using the activity index described above. 

Throat muscle activity. Two Grass ESS cup elec-
trodes filled with Grass EKG Sol were placed off 
midline of throat; the first was placed approximately 
1.0 cm to the right of and level with the midpoint 
of the throat, and the second was placed approxi- 
mately 1.0 cm to the left and 1.0 cm above the 
midpoint. Throat muscle activity was calculated us- 
ing the activity index described above. 

Back muscle activity. Subjects were asked to 
place their fingertips on their collarbone while the 
electrodes were secured. Two Grass ESS cup elec- 
trodes filled with Grass EKG Sol were placed 
over the trapezius muscle group. The first was placed 
4.0 cm outward from the midline of the line passing 
between the first thoracic and seventh cervical verte- 
brae, and the second was placed halfway between 
the spine and the head of humerus (near the point of 
the shoulder). Back muscle activity was calculated
using the activity index described above. 

Heart rate. Grass ESS cup electrodes filled with
EKG Sol were placed over the lower left rib cage
and the right collar bone. The signal was amplified
by a Grass wide-band AC preamplifier. Heart rate
was calculated by counting the number of beats that 
occurred in the 1-minute intervals of interest. 

Breathing rate. The respirometer was a sliding 
piston (consisting of a photocell and a small light) 
mounted on an elastic band and placed around the 
subject’s chest (Shmavonian, Miller, & Cohen, 1968). 
Breathing rate was calculated by counting the num- 
ber of cycles (to the nearest half cycle) occurring in 
the intervals of interest. 

Cephalic pulse amplitude.³ A photoplethysmo-
graph was placed over the supraorbital notch (above 
the eyebrow), providing a relative measure of blood 
volume in the supraorbital artery. The photoplethys- 
mograph was comprised of three light-emitting 
diodes (LED), radiating 100 µW of light output at
660 nanometers. The LEDs were spaced at 120° on
a radius of 6.3 mm around a high-speed photocon- 
ductor, and resistance changes were recorded with a 
Grass oscillograph. Cephalic pulse amplitude was cal- 
culated for the intervals of interest using the follow- 
ing formula: 


$$
\text{Amplitude index} = \frac{\text{pulse amplitude in mm} \times \text{total mV/cm calibration}}{\text{standard sensitivity in mV/cm}}
$$


where total mV/cm was the sensitivity of the preamp 
settings for a given subject, and the standard sen- 
sitivity was 1 mV/cm. 
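
As a minimal sketch of this rescaling, the Python fragment below multiplies a pulse amplitude by the preamp sensitivity in use and divides by the standard sensitivity of 1 mV/cm. The function name and the sample readings are hypothetical and serve only to show that identical pen deflections recorded at different settings yield different index values.

```python
STANDARD_SENSITIVITY_MV_PER_CM = 1.0  # standard sensitivity stated in the text


def amplitude_index(pulse_amplitude_mm, total_mv_per_cm,
                    standard_mv_per_cm=STANDARD_SENSITIVITY_MV_PER_CM):
    """Rescale a pulse amplitude (mm) to the standard preamp sensitivity."""
    return pulse_amplitude_mm * total_mv_per_cm / standard_mv_per_cm


# Hypothetical readings: the same 8-mm deflection at two preamp settings.
print(amplitude_index(8.0, 0.5))  # 4.0
print(amplitude_index(8.0, 2.0))  # 16.0
```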

Data reduction. Physiological processes were mon- 
itored during the 1 minute preceding and following 
each of the six forewarnings. Persons scoring the 
electrophysiological data were unaware of the experi- 
mental hypotheses and of the treatments with which 
the data were associated.

The Grass Model 7 polygraph used in the experi- 
ment was equipped with three preamplifiers capable 
of measuring electromyographic activity. Since four 
EMG measures were of interest in the experiment, a 
random procedure was used to determine which three 
of the four EMG measures would be recorded for 
each subject within a replication. However, all elec- 
trode placements were prepared on each subject, and
the subject was unaware of the dummy electrode 
placements. Difference scores relative to the pre- 
warning levels were computed for each electro- 
physiological measure. 
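
The difference-score step can be sketched in Python as follows. The sketch subtracts each prewarning baseline value from the corresponding postwarning value and summarizes the changes with a median; the function name and the heart rate values are hypothetical, not data from the experiment.

```python
from statistics import median


def difference_scores(baseline, collect_thoughts):
    """Return postwarning minus prewarning values, trial by trial."""
    if len(baseline) != len(collect_thoughts):
        raise ValueError("each trial needs one baseline and one postwarning value")
    return [post - pre for pre, post in zip(baseline, collect_thoughts)]


# Hypothetical heart rates (beats per minute) for four trials.
baseline_bpm = [70, 72, 68, 75]
collect_bpm = [72, 75, 69, 75]

changes = difference_scores(baseline_bpm, collect_bpm)
print(changes)          # [2, 3, 1, 0]
print(median(changes))  # 1.5
```

A positive score indicates that activity rose above its prewarning level during the collect thoughts interval.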

The subjects classified their cognitive response in a 
manner described by Petty and Cacioppo (1977): 
After listing their thoughts, subjects were instructed 
to place a plus (+) next to those thoughts that were 
in favor of the advocacy, a minus (—) next to those 
thoughts opposed to the advocacy, and a zero (0) 
next to those thoughts that were either neutral to- 
ward or irrelevant to the advocacy. Frequency counts 
served as measures of cognitive response. 
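
A brief Python sketch of the frequency count follows. It assumes that each listed thought carries exactly one of the three marks described above; the category names, the function name, and the sample marks are hypothetical.

```python
from collections import Counter

# Subject-assigned marks and the response category each one denotes.
CATEGORIES = {"+": "favorable", "-": "counterargument", "0": "neutral"}


def tally_thoughts(marks):
    """Count the listed thoughts falling in each response category."""
    counts = Counter(CATEGORIES[mark] for mark in marks)
    return {name: counts[name] for name in CATEGORIES.values()}


# Hypothetical thought listing for one forewarning.
marks = ["-", "-", "+", "0", "-"]
print(tally_thoughts(marks))
# {'favorable': 1, 'counterargument': 3, 'neutral': 1}
```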


3 A photoplethysmograph was placed over the
supraorbital artery just above the eye. The amount
of light reflected back onto the photoplethysmograph
is inversely proportional to the amount of blood
between the photoplethysmograph and the supra-
orbital notch (over which it was placed). Brain
blood perfusion was assumed to be indirectly related
to the relative volume changes of the pulse wave
that traveled to the brain via the supraorbital artery.
Pulse rate was not considered to be important be-
cause of the venous return mechanism, which drains
the blood from a given body area at approximately
the same rate as the blood is delivered to that area
(Wallace & Wallace, 1968). The validity of this
measure, however, has not yet been firmly estab-
lished. The plethysmograph used in these studies was
developed by Robert Isenhart. More information
about the plethysmograph is provided in . 
McCanne, Kaiser, and Diamond (1977).

Results and Discussion


Measurements of the seven electrophysio-
logical dependent measures (chin, lip, throat, 
and back EMG activity, heart rate, breathing 
rate, and cephalic pulse amplitude) were ob- 
tained the minute preceding and following 
each forewarning. Parametric analyses of the
questionnaire data were conducted while non- 
parametric analyses of the electrophysiological 
data were conducted, since the latter data 
were not distributed normally (cf. Schwartz, 
Fair, Salt, Mandel, & Klerman, 1976a, 
1976b). 


Cognitive Response, Agreement,
and Ancillary Measures


We expected topic-relevant cognitive re-
sponding and agreement to be affected in a
particular manner by discrepancy, but we had
no particular expectations regarding neutral
thought production and the ancillary mea-
sures. Hence, we set the experimentwise error
rate at .10 and distributed this protection
unequally across the tests (i.e., .05 for coun-
terargumentation, favorable thought produc-
tion, and agreement; .05 for the remaining
five measures). We used Bonferroni-adjusted
critical values for all tests, conducted two-
tailed tests throughout, allocated 99% of the
alpha associated with the contrasts for the
first set of variables into the tail correspond-
ing to the predicted direction of the effect
(saving 1% as acknowledgment that op-
posite rather than predicted results sometimes
obtain), and allocated 50% of the alpha as-
sociated with the tests for the second set of
variables into each tail. Further, the Huynh
and Feldt (1970) test for the homogeneity
of treatment-difference variance (HOTDV)
was conducted for each dependent measure
(cf. Harris, 1975, pp. 125-127).⁵ These tests
revealed a violation of the HOTDV assumption
only for the measure of neutral thoughts,
χ²(2) = 8.11. Neutral thought production
was unaffected by discrepancy even with a
positively biased F ratio.
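
The allocation of alpha described above can be checked with a few lines of Python. The sketch assumes a plain Bonferroni division of each family's .05 across its tests (three primary measures, five remaining measures) followed by the stated split between tails; the function names are hypothetical.

```python
def per_test_alpha(family_alpha, n_tests):
    """Bonferroni adjustment: split a family's alpha evenly over its tests."""
    return family_alpha / n_tests


def split_tails(alpha, predicted_share):
    """Return (alpha in the predicted tail, alpha in the opposite tail)."""
    return alpha * predicted_share, alpha * (1.0 - predicted_share)


primary = per_test_alpha(0.05, 3)    # counterarguments, favorable thoughts, agreement
ancillary = per_test_alpha(0.05, 5)  # the remaining five measures

print(round(primary, 3), round(ancillary, 3))  # 0.017 0.01
print(split_tails(primary, 0.99))    # about (0.0165, 0.000167)
print(split_tails(ancillary, 0.50))  # about (0.005, 0.005)
```

The rounded per-test values agree with the significance levels of .017 and .01 given in Footnote 4.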

The means for all cognitive response, agree-
ment, and ancillary measures are summarized
in Table 2. The analyses indicated that in-
creasing the discrepancy between the subjects’
initial positions and the advocated position
led to more anticipatory counterargumenta-
tion, F(2, 42) = 8.80, p < .01; production of
fewer anticipatory favorable thoughts, F(2,
42) = 3.70, p < .017; and greater felt effort
in preparing cognitively for the message, F(2,
42) = 5.33, p < .01. One additional test ap-
proached significance: Agreement tended to
decrease as discrepancy increased, F(2, 42)
= 3.57, p < .019. No other effect or inter-
action was statistically significant. This pat-
tern of results is similar to that obtained in
prior research on cognitive response and
agreement as a function of communication
discrepancy (Brock, 1967; Cacioppo, 1977).


Table 2
Mean Responses to Thought Listing and
Questionnaire Measures for Low, Moderate,
and High Levels of Communication
Discrepancy: Experiment 1

                                 Mean
Measure                 Low    Moderate   High
Primary
  Agreement             4.08   3.96       2.67
  Counterarguments      2.04   2.73       3.40
  Favorable thoughts    1.33   1.00        .63
  Neutral thoughts      1.04   Aire        .58
Ancillary
  Effort                7.35   8.42       8.42
  Involvement           7.50   8.60       8.56
  Distraction           3.98   3.40       2.90
  Responsibility        6.81   7.96       7.42

Note. Entries for cognitive response measures in-
dicate the mean frequency obtained in thought
listings. Entries for questionnaire items are mean
response to 11-point scale items in which higher
numbers indicated more agreement, effort, involve-
ment, distraction, and responsibility. Twenty-four
subjects received two forewarnings at each level of
discrepancy.


Electrophysiological Measures 


We hypothesized that the activity of the 
oral muscles and heart rate would increase 


4 Required for statistical significance by these ad- 
justments were ps < .017 for counterarguing, favor- 
able thoughts, and agreement, and ps < .01 for the 
remaining measures. 

5 We are indebted to Richard Harris for providing
a computer program with which to test the HOTDV
assumption.

Figure 1. Median change from baseline for lip, chin,
throat, and back electromyographic (EMG) activity
following a forewarning about an impending coun-
terattitudinal advocacy.


during the collect thoughts interval (relative 
to basal levels), since we expected cognitive 
response processes to be reflected electro- 
physiologically. Strong support for the hy- 
pothesis was obtained (see Figure 1). Oral 
EMG activity was elevated significantly after 
forewarnings of involving counterattitudinal 
communications (by the Wilcoxon Test: lip, 
p < .001; chin, p < .001; throat, p < .10). 
Also evident in Figure 1, general somatic 
activity, as measured by back EMG activ- 
ity, was not altered by the anticipation 
of the counterattitudinal message (p > .25). 
The Wilcoxon Test for changes from baseline 
revealed that heart rate (Mdn = 2.00 bpm, 
p < .02) and breathing rate (Mdn = 1.0 
cycle/min, p < .01) increased following the 
forewarning as well, whereas cephalic pulse 
amplitude was left unchanged (Mdn = 0.00). 
No other comparisons were statistically 
significant.⁶ 


Correlational Analyses 


The affective intensity of the cognitive re- 
sponses, which was varied by increasing com- 
munication discrepancy, did not affect the 
electrophysiological activity monitored in this 
study. Canonical correlations between the 
cognitive responses (i.e., counterarguments, 
favorable thoughts, neutral/irrelevant 
thoughts) and the relevant electrophysiological 
measures (i.e., lip, chin, throat, and 
cardiac activity) were computed within each 
level of discrepancy to determine whether the 
predicted relationship between cognitive and 
electrophysiological activity existed and to 
explore whether any association existed between 
the affective nature of the covert processing 
and electrophysiological activity. The 
correlations were respectable (r = .47 for 
low discrepancy conditions; r = .44 for moderate 
discrepancy conditions; and r = .64 for 
high discrepancy conditions). When a canonical 
correlation was calculated, collapsing 
across the levels of discrepancy, a coefficient 
of .42 was obtained. 

Furthermore, the electrophysiological specificity 
obtained in this research is in striking 
contrast to the massive and diffuse arousal 
associated with the fight-or-flight reaction of 
extreme emotional states (Cannon, 1927) 
and misattribution phenomena (Schachter, 
1964; see also, Rhodewalt & Comer, 1979). 

The calculation of within-cells correlations 
among cognitive responses and agreement re- 
vealed that anticipatory counterargumenta- 
tion correlated negatively with agreement 
(r = −.67, p < .01), favorable thoughts (r = 
−.67, p < .01), and neutral thoughts (r = 
−.45, p < .01). Favorable thoughts correlated 
positively with agreement (r = [illegible], 
p < .01). 


6 Subjects were asked to count to five aloud and 
to move and tense slightly in their chair before 
completion of the experiment to assess the validity 
of the EMG electrode placements. Trapezius EMG 
activity increased during body tensing and movements. 
Lip and chin EMG activity increased during 
overt oral behavior (counting aloud), but the 
throat EMG placement proved to be a relatively 
insensitive measure. 

7 Analysis of covariance procedures were employed 
to explore some of the possible causal relations between 
cognitive responding and attitude change. It should 
be noted, however, that these procedures do not 
prove that a particular causal model is operating. 
These analyses were conducted here to examine whether the 
reduced agreement found with increasing discrepancy 
possibly resulted from counterargumentation. 
Within-cell correlations further revealed 
that unfavorable thoughts were 
correlated with ratings of effort (r = .42, p < 
.01), involvement (r = [illegible], p < .05), and 
responsibility (r = .29, p < .05), and correlated 
negatively with ratings of distraction 
(r = −.29, p < .05). None of the electrophysiological 
measures was correlated significantly 
with any ancillary measure. 

In sum, electrophysiological and thought-listing 
measures indicated that topic-relevant 
cognitive responding accompanied the anticipation 
of an involving counterattitudinal 
communication. Inspection of Figure 1 supports 
the notion that cognitive response processes 
in persuasion can be measured concurrently 
and without the subject's doing anything 
overt in particular (e.g., listing 
thoughts). Of course, no evidence was provided 
in Experiment 1 concerning the natural 
existence or elicitation of cognitive response 
in persuasion, since the subjects were aware 
that they were to list their thoughts; in fact, 
they had been instructed to collect their 
thoughts following each forewarning. We 
were also unable to distinguish electrophysiologically 
the affective natures of the recipients' 
covert preparations for the advocacy. 


Previously, we have found that topic-relevant thinking 
tends to mediate the subsequent agreement 
with an advocacy (Cacioppo & Petty, 1979; Petty & 
Cacioppo, 1977), rather than vice versa. [...] 
1974, for an extended rationale for the use of [...] 
F(2, [...] consistent with 
the notion that increasing discrepancy increased 
counterargumentation, which then reduced agreement. 
We conducted a second experiment to address 
these issues. 


Experiment 2 
The first issue, that of providing electro- 
physiological evidence concerning the cogni- 
tive responses elicited naturally in persuasion, 
was addressed easily in the design of Experiment 2. 
Rather than asking the subjects to 
collect or list thoughts, we simply monitored 
the electrophysiological activity displayed 
during the anticipation and presentation of an 
advocacy. Subjects had no notion that 
they would subsequently be asked to list their 
thoughts. 


The second issue, developing an electrophysiological 
measure capable of differentiating 
the affective nature of cognitive responding 
(should it exist in persuasion), proved to 
be more difficult. Previous studies of attitudes 
and bodily reactions have employed three 
basic research strategies: Procedures have 
been employed (a) to tap the physiological 
processes indicating covert information processing 
(e.g., Experiment 1; Cacioppo, 1979); 
(b) to measure an evaluative reaction by 
monitoring a classically conditioned physiological 
response (e.g., Tognacci & Cook, 
1975); and (c) to assess the naturally occurring 
physiological indicators of affective 
states (e.g., Cooper, 1959; Hess, 1965). 
Studies of attitudes applying the third research 
strategy have often used measures of 
pupillary response or electrodermal activity 
(galvanic skin responses). These studies have 
provided some evidence that attitudes, if extreme, 
may be measurable. Even in these 
instances, however, the polarity of the attitude 
(i.e., positive or negative) has not been 
distinguishable (Cacioppo & Sandman, in press; 
Mueller, 1970). 

Recent work on the neuromuscular substrates 
of emotion and depression offered us 
a potential solution. Darwin (1965/1872) 
first documented the specificity and reliability 
of facial muscle patterning in the expression 
of emotions (cf. Cacioppo & Petty, in 
press). More recently, Schwartz and his 
colleagues (Schwartz, 1975; Schwartz et al., 
1976a, 1976b) have found that generating 
imagery or attempting to experience through 
fantasy the emotional states of happiness, 
sadness, and anger leads to distinctive pat- 
terns of EMG activation of the face. More- 
over, these patterns go unnoticed by subjects 
as well as observers (see also Izard, 1971). 
We reasoned that by monitoring these facial 
(i.e., corrugator, zygomatic, depressor anguli- 
oris, and mentalis) muscles during the antic- 
ipation and presentation of advocacies, we 
would be able to distinguish favorable from 
unfavorable (i.e., counterargument) cognitive 
responses emitted by a recipient. In Experi- 
ment 1 and in previous research, we have 
found oral EMG to distinguish the extent 
rather than the affectivity of covert processing 
(e.g., Cacioppo & Petty, in press-b, in press- 
c). Hence, we considered the measure of 
mentalis EMG activity, which taps the elec- 
trical activity of the muscle fibers between 
and including the lower lip and chin, as a 
measure of the extent rather than emotion- 
ality of processing (see also McGuigan, 
1978). Again, heart rate was recorded. 


Method 
Subjects and Design 


Sixty male undergraduates were led to believe 
they were evaluating the sound quality of taped 
radio editorials that had been produced by the stu- 
dents in a sound-engineering course. Electrodes were 
placed on each subject’s body, and subjects were 
tested individually in a darkened, sound-attenuated 
room “to reduce external distractions from the task.” 
Forty-eight subjects were forewarned about and 
heard either a proattitudinal or a counterattitudinal 
advocacy on one of two topics (alcoholic beverages 
or visitation hours). Twelve additional subjects 
were forewarned only that they would hear a taped 
communication, and they heard a message about an 
obscure news event. Subjects in this group served 
in an external control (neutral advocacy) condition. 
The assignment of subjects to condition again was 
determined randomly.⁸ 


Materials 


The topics of alcoholic beverages and visitation 
hours were selected because initial pilot testing re- 
vealed existing university regulations regarding them 
to be highly involving and counterattitudinal. Forewarnings 
and messages were constructed that advocated 
the adoption of either stricter (counterattitudinal) 
or more lenient (proattitudinal) regulations 
regarding these issues. The neutral message con- 
cerned a small archeological find and was obtained 
from a past issue of a national news magazine. 


Procedure 


When subjects arrived at the laboratory, they 
were told that their task was to evaluate the sound 
quality of a taped radio editorial, that electrodes 
would be attached, and that during the study we 
would be recording the involuntary bodily responses 
that accompany listening to a communication.⁹ Subjects 
were instructed to refrain from unnecessary 
movements, to breathe normally, and to keep their 
eyes closed throughout the study. After adapting to 
the laboratory, subjects again heard these instruc- 
tions and were told that the study would begin 
shortly. At this point, a computer-controlled procedure 
was initiated, which involved (a) a 60-sec prewarning 
(baseline) interval, (b) a 15-sec forewarning, (c) a 
60-sec postwarning-premessage interval, and (d) a 
120-sec message. 

After listening to the tape, the subjects read the 
following: 


Because your own opinion about the position 
advocated on the tape may influence the way 
you rate the quality of the tape, we would like 
to obtain a measure of how you feel about the 
views proposed by the speaker on each scale below. 


The subjects responded to four 9-point semantic 
differentials; their responses were summed to obtain 
a measure of their attitude toward the advocacy. 
In the same manner as in Experiment 1, subjects 
were instructed to list everything about which they 
had thought during the message (subjects were given 
3 minutes). Afterwards, subjects rated their listed 
thoughts as favorable (+), unfavorable (−), or 
neutral/irrelevant (0) toward the message. Subjects 
then rated on 11-point Likert-type scales their felt 
involvement, effort, and distraction, the personal 
relevance of the message, the sound quality of the 
tape, and the speaker's rate of delivery and 
enthusiasm. 

Heart rate. Grass ESS cup electrodes filled with 
Grass EC3 paste were placed over the lower left rib 
and the right collar bone. The signal was amplified 


8 An additional factor included in the design was 
whether subjects were informed that the communication 
had implications locally or not. Manipulation 
checks revealed that our manipulation of this factor 
failed here, so this factor is not discussed further. 

9 Electroencephalographic measures were obtained 
also, the results of which are to be reported 
elsewhere, since they were collected to address a 
different issue. Suffice it to say that enough electrodes 
were attached to subjects to make the cover story 
concerning the measurement of involuntary responses 
seem entirely plausible to them. 
by a Narco Biosystem Physiograph AC preamplifier. 
The output was displayed on a Narco Biosystem 
Physiograph 6 and was transmitted on-line to a 
PDP-8I laboratory computer for analysis. 

Facial muscle activity. Grass ESS cup electrodes 
filled with Grass ECS paste were placed adjacent to 
each other in pairs with interelectrode resistance 
reduced to less than 10,000 ohms. The four muscles 
over which the pairs of electrodes were placed were 
the corrugator (just above the eyebrow), zygomatic 
(upper cheek), depressor anguli oris (lower cheek), 
and mentalis (between lip and chin) on the left side 
of the face (cf. Schwartz et al., 1976a). Since surface 
electrodes were used, recordings of EMG activity 
were obtained from these and surrounding muscle 
groups. 

Each EMG measure was amplified by a Narco 
Biosystem Physiograph AC preamplifier, individually 
rectified and summed by an EMG integrator with a 
time constant of 2 sec. The average integrated EMG 
was displayed on the physiograph with a full-scale 
pen deflection of 40 mm (1 mm = 75 μV). The 
integrated EMG also was transmitted on-line to the 
PDP-8I laboratory computer, sampled 10 times per 
second, and recorded. The computer was programmed 
to eliminate from its recordings any obvious 
movement artifact or overt (e.g., visually 
detectable) facial expression. 

Data reduction. The cognitive data were scored 
in the same manner as in Experiment 1. Physiological 
responses were monitored during the minute preceding 
the forewarning through the completion of the 
120-sec message. The data for each measure were 
averaged for each subject and each interval. Difference 
scores relative to the prewarning (basal) levels 
were calculated for each measure and interval, and 
nonparametric analyses were employed, since the 
data were not distributed normally. The presentation 
of the results for facial EMG activity is similar 
in format to the presentation by Schwartz et al. 
(1976a, 1976b) to facilitate comparisons. 
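
The data reduction described above can be sketched in a few lines. This is a minimal illustration, assuming one list of integrated-EMG samples per subject, recorded at 10 samples per second across the four intervals named in the Procedure (60, 15, 60, and 120 sec). The function and variable names are hypothetical, and the computer's artifact-rejection step is not modeled.

```python
from statistics import mean, median

SAMPLE_RATE_HZ = 10  # the integrated EMG was sampled 10 times per second

# Interval names (hypothetical labels) and durations in seconds, in order.
INTERVALS = [
    ("baseline", 60),
    ("forewarning", 15),
    ("postwarning_premessage", 60),
    ("message", 120),
]


def interval_means(samples):
    """Average one subject's samples within each interval."""
    expected = sum(seconds for _, seconds in INTERVALS) * SAMPLE_RATE_HZ
    if len(samples) != expected:
        raise ValueError(f"expected {expected} samples, got {len(samples)}")
    means = {}
    start = 0
    for name, seconds in INTERVALS:
        stop = start + seconds * SAMPLE_RATE_HZ
        means[name] = mean(samples[start:stop])
        start = stop
    return means


def difference_scores(samples):
    """Change from the prewarning (basal) level for each later interval."""
    means = interval_means(samples)
    base = means["baseline"]
    return {name: value - base for name, value in means.items() if name != "baseline"}


def median_change(all_subjects, interval):
    """Median difference score across subjects for one interval."""
    return median(difference_scores(s)[interval] for s in all_subjects)
```

Summarizing each condition by the median of these difference scores, as the figures do, fits the nonparametric tests used here, since neither depends on the scores being normally distributed.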




Results and Discussion 


Our purposes in this study were to determine 
if cognitive responses were generated 
naturally in persuasion settings and, if so, 
whether or not the affective nature of these 
responses could be assessed electrophysiologically. 
A multivariate analysis of variance of 
the 11 questionnaire measures was conducted 
first to determine the general effects of the 
experimental factors. As expected, the effect 
of position was highly significant, based on 
Wilks' lambda, F(11, 30) = 3.63, p < .01, 
whereas the effect of topic was not significant, 
F(11, 30) = 1.74, p > .11. Hence, all analyses 
reported below are collapsed across the 
topic factor. 
Table 3 

Mean Cognitive and Attitudinal Responses as 
a Function of the Affectivity of the Advocacy: 
Experiment 2 


Measure                            Proattitudinal   Counterattitudinal   Neutral 

Evaluation of the taped advocacy       27.04              16.75*          25.00 
Counterarguments                        2.00               3.54            2.17 
Favorable thoughts                      2.34               1.13            1.29 
Neutral/irrelevant thoughts             2.04               1.54*           3.33 
Total thoughts                          6.38               6.23            6.74 


* The mean differs from the corresponding neutral 
mean at the .05 level by Dunnett's test. 


Cognitive Response, Attitude, 
and Ancillary Measures 


Subjects anticipated and listened to a pro- 
attitudinal advocacy, a counterattitudinal ad- 
vocacy, or a neutral communication, and rated 
their evaluation of and thoughts about the 
taped presentation. As is evident from an in- 
spection of the means in Table 3 for these 
measures, the proattitudinal advocacy was 
evaluated more positively, F(1, 40) = 
30.37, p < .001, and elicited more favorable 
thoughts, F(1, 40) = 6.58, p<.02, and 
fewer counterarguments, F(1, 40) = 6.50, 
p < .02, than did the counterattitudinal 
advocacy. 

The Dunnett test for comparisons involving 
an external control mean (Kirk, 1968) was 
employed to determine the relative effects of 
the proattitudinal and counterattitudinal com- 
munications relative to the neutral communi- 
cation on cognitive and attitudinal responding. 
These comparisons revealed that the neutral 
communication differed from the counteratti- 
tudinal communication in evaluation and in 
the number of neutral/irrelevant thoughts 
elicited (ps < .05); in addition, these com- 
munications differed marginally in the num- 
ber of counterarguments elicited (p < .10). 
On the other hand, the neutral and proattitu- 
dinal communications were evaluated and 
were thought about similarly (see Table 3). 
Evidently the subjects enjoyed hearing our 
“neutral” message about an archeological 


dig. 
Analyses of the ancillary measures and ratings 
of tape quality failed to produce any 
significant effects. 


Electrophysiological Measures 


All tests of the significance of changes from 
prewarning baselines for the electrophysiolog- 
ical measures were conducted using two- 
tailed Mann-Whitney tests. 

Does cognitive responding occur naturally? 
The analyses of covert oral (mentalis) EMG 
activity, which was the most sensitive measure 
of covert processing in Experiment 1, 
indicated that it was elevated during the 
postwarning-premessage interval for the 
counterattitudinal condition (Mdn = .24 μV, 
p < .03). This finding replicates that of 
Experiment 1, but here was obtained without 
explicit requests for subjects to collect their 
thoughts. Interestingly, oral EMG activity 
was not altered significantly during this interval 
in the proattitudinal or neutral conditions 
(ps > .15). Still, the presentation of a message, 
whether it was proattitudinal, counterattitudinal, 
or neutral, led to increased oral 
EMG activity (Mdn = .43 μV, 1.21 μV, and 
1.26 μV, respectively, ps < .03). Also, as in 
Experiment 1, the affective nature of the 
covert processing was not distinguishable by 

oral EMG; the mentalis activity did not differ 
as a function of [...] and the proattitudinal 
[...] the cause of this difference 
is not immediately [...] 

Is the emotional tone of this cognitive activity 
distinguishable? We next sought to 
determine if the affective nature of the cognitive 
responses was distinguishable by the pattern 
of facial EMG activity. Schwartz and his 


associates have demonstrated that pleasant 
states (e.g., happiness) lead to less corrugator 
and more zygomatic and depressor EMG 
activity than do unpleasant states (e.g., sad- 
ness, anger), with corrugator EMG activity 
providing the most discriminating measure 
(Schwartz, Fair, Salt, Mandel, Mieske, & 
Klerman, 1978). Figure 2 displays the me- 
dian change from baseline for these measures 


as a function of position and interval in the 
present study. 

Several findings are evident immediately 
upon inspection of Figure 2. First, only 
corrugator activity was altered during the 
announcement of the forewarning (upper panel), 
with greater and equal activation relative 
to baseline appearing in all conditions (ps < 
.01). Less evident in the upper panel of Figure 2 
is the marginally significant tendency 
for zygomatic activity to discriminate between 
the counterattitudinal and neutral forewarnings 
(p < .06). 
During the postwarning-premessage interval 
(middle panel of Figure 2), corrugator 
activity remained elevated from basal levels 
(ps < .02), though a significant decrease from 
the forewarning level was displayed in the 
proattitudinal condition (p < .01). Between-group 
comparisons yielded a nonsignificant 
difference in corrugator activity between the 
proattitudinal and counterattitudinal conditions 
(p < .11), with the direction of the 
difference that which would be expected from 
Schwartz and his colleagues' research. That is, 
corrugator activity was higher when anticipating 
the counterattitudinal than the proattitudinal 
advocacy. Finally, the activity in the zygomatic 
muscle region in the neutral condition was 
enhanced relative to basal levels (p < .03) and 
distinguished this condition from the group 
anticipating a counterattitudinal advocacy 
(p < .01). This too is consistent with 
previous studies of emotional fantasy and 
imagery (cf. Schwartz, 1975). No other 
differences were statistically significant. 

The presentation of the messages (lower 
panel of Figure 2) resulted in elevated 
corrugator EMG activity relative to basal (ps < 
.01) and postwarning-premessage levels (ps 
Figure 2. Median change from baseline for corrugator (C), zygomaticus (Z), and depressor (D) 
electromyographic activity during the forewarning (top panel), postwarning-premessage (middle 
panel), and message (bottom panel) intervals. (The data are displayed separately for subjects who 
anticipated and heard the neutral [n = 12], proattitudinal [n = 24], and counterattitudinal [n = 24] 
advocacies.) 


< .01); in addition, it was elevated marginally 
during this interval compared to forewarning 
levels (ps < .06). The zygomatic 
activity continued to differentiate the affectivity 
of the covert processing: Zygomatic 
EMG activity during the counterattitudinal 
message was lower than that displayed during 
baseline (p < .05), the proattitudinal message 
(p < .01), and the neutral message (p < 
.05). Furthermore, the zygomatic activity 
during the proattitudinal message was significantly 
greater than its basal level and 
marginally greater than its forewarning (p < 
.08) and postwarning-premessage levels (p < 
.08). Similarly, depressor EMG activity was 
enhanced marginally during the proattitu- 
dinal message compared to its postwarning— 
premessage level (p < .06) and compared to 
the counterattitudinal message (p < .10), 
effects also similar to those found by Schwartz 
et al. (1976a, 1976b). No other comparisons 
approached statistical significance. 

In sum, oral EMG activity increased from 
baseline after the forewarning of an impend- 
ing and involving counterattitudinal com- 
munication, even though we did not request 
subjects to collect their thoughts and subjects 
were unaware that they would be asked to list 
their thoughts, heard only one forewarning 
and message, and had no reason to suspect 
the purpose of our measurements. Further- 
more, anticipating and hearing proattitudinal, 
counterattitudinal, and neutral communica- 
tions led to distinctive and predictable pat- 
terns of facial EMG activity during the stim- 
ulus sequence. Specifically, less corrugator 
and more zygomatic activity was observed in 
the proattitudinal and neutral conditions than 
in the counterattitudinal condition. The activity 
of the depressor muscle region, for the 
most part, was unaffected by the affectivity 
or interval of the communication sequence. 
The differences that were observed, however, 
were as predicted: Depressor activity tended 
to be greater during the proattitudinal and 
neutral communications than during the counterattitudinal 
communication. Finally, although 
the patterns of facial EMG activity 
were similar for the neutral and proattitudinal 
conditions, there was a high degree of sim- 
ilarity between these conditions in the cogni- 
tive and evaluative responses as well. 


Correlational Analyses 


Canonical correlations between the cognitive 
responses (i.e., counterarguments, favorable 
thoughts, neutral thoughts) and electrophysiological 
scores (i.e., heart rate, mentalis, 
corrugator, zygomatic, and depressor EMG 
activity) were calculated, once using the 
physiological responses for the postwarning— 
premessage interval, and once using the re- 
sponses for the message interval. Since 
thought listings were obtained immediately 
following the message interval, the canonical 
correlation between thoughts and the bodily 
responses from this interval should show the 
strongest association; but physiological re- 
sponses from both intervals should correlate 
somewhat with the listed thoughts. Correlations 
were .30 for the postwarning-premessage 
interval and .46 for the message interval, 
χ²(15) = 7.20 and 16.20, respectively, ns. 
Hence, the covariation of cognitive and elec- 
trophysiological response was weak by these 
indices, but the latter index was stronger 
than the former, as expected. 
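
The 15 degrees of freedom reported for these chi-square values match what Bartlett's approximation gives when all canonical correlations between a set of p = 3 cognitive measures and a set of q = 5 physiological measures are tested together. The article does not name the test used, so the expression below is offered only as an assumed reading. Here N is the number of observations and r_i are the canonical correlations.

```latex
\chi^2 = -\left[ N - 1 - \tfrac{1}{2}(p + q + 1) \right] \ln \Lambda,
\qquad
\Lambda = \prod_{i=1}^{\min(p,q)} \left( 1 - r_i^2 \right),
\qquad
df = pq = 3 \times 5 = 15
```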


Calculations of within-cell correlations 
among the cognitive response data revealed 
that, as in Experiment 1, counterargumentation 
correlated negatively with the attitude 
toward the communication (r = −.30, p < 
.05) and favorable thoughts (r = −.41, p < 
.05), whereas favorable thoughts correlated 
positively with attitude (r = .45, p < .05). 


General Discussion 


Theory and research in persuasion have 
focused recently on the covert idiosyncratic 
responses of individuals. In this article, we 
have reported two experiments describing the 
theory for and development of electrophysio- 
logical procedures for assessing this covert 
cognitive activity. Moreover, the evidence 
obtained suggests strongly that cognitive re- 
sponse processes are evoked naturally, at least 
when the advocacy is involving and 
counterattitudinal. 

In Experiment 1, we found that subjects 
who had been asked to collect their thoughts 
about an upcoming discrepant message exhibited 
increased oral muscle, cardiac, and 
respiratory activity, whereas nonoral somatic 
activity remained constant and quiescent. 
These results are in accord with research in 
cardiovascular psychophysiology (e.g., Lacey 
et al., 1963) and with the literature on the 
electromyographic concomitants of thought 
(Jacobsen, 1973; McGuigan, 1978; Sokolov, 
1972). Further, these results demonstrate that 
cognitive responses in persuasion settings are 
measurable concurrently and reliably, without 
asking the subject to respond overtly during 
the measurement and without the subject's 
awareness of the purpose or focus of the 
(electrophysiological) measurement 
instruments. 

A second experiment was conducted to provide 
answers to two important questions: (a) 
Do subjects engage normally in active, covert 
processing when anticipating 
persuasive appeals? (b) If so, can the emotional 
tone of this cognitive activity be 
assessed electrophysiologically? Using an 
electromyographic technique to measure the 
responses of the facial muscles, we found that 
the forewarning of a counterattitudinal, but 
not a proattitudinal or neutral, communication 
led to elevated oral EMG activity. Furthermore, 
the anticipation and presentation of 
counterattitudinal, proattitudinal, and neutral 
communications led to active though covert 
processing activity, the affective nature of 
which was revealed in the concomitant patterning 
of facial EMG activity.¹⁰ 


Cognitive Encoding or Elaboration? 


It might be argued that the increased oral 
muscle activity indexed silent rehearsals 
rather than cognitive elaborations of the mes- 
sage arguments (cf. Miller & Baron, 1973). 
Listening to prose does cause a slight increase 
in oral muscle activity (McGuigan & Bailey, 
1969a). Hence, comprehending and rehearsing 
the message arguments probably contributed 
to the elevated oral muscle activity exhibited 
during the message presentation. However, 
the anticipation of a counterattitudinal advocacy 
led to elevated speech muscle activity 
when there were no message arguments to 
rehearse. One might argue that subjects were 
rehearsing the forewarning and potential message 
arguments, stopping to generate counterarguments 
only when asked by the experimenter 
at the end of the study to “list everything 
about which you thought.” Besides the 
absence of parsimony, this explanation fails 
to account for the subtle facial expressions 
indicating affect-laden cognitive responding 
that was displayed while subjects awaited the 
message. More persuasively, this explanation 
cannot account for the failure of subjects who 
were expecting a proattitudinal message to 
display significant increases in oral muscle 
activity during the postwarning-premessage 
interval. 


[...]'s cognitive defenses (Cialdini et al., 1976; 
McGuire & Papageorgis, 1962; Petty & Cacioppo, 
1977). Interestingly, there are no 
studies in the literature that report the effects 
of anticipating a proattitudinal message that 
collected thought listings. We believe that 
anticipating a counterattitudinal as compared 
to a proattitudinal message generally results 
in deeper processing (i.e., more extensive 
cognitive preparation—Cacioppo & Petty, 
1979). This difference may be attributable to 
the relative importance in defending from 
attack one’s attitudes and beliefs (e.g., avoid- 
ing cognitive inconsistency), or to the relative 
thought that was devoted previously to (or 
scripts developed for) proattitudinal rather 
than counterattitudinal positions. The present 
study does not consider which of these in- 
terpretations is most plausible. 

It should be noted that the observation of 
greater oral EMG activity being obtained 
when anticipating a counterattitudinal rather 
than proattitudinal or neutral communication 
does not imply that oral EMG activity is a 
measure of “counterargumentation.” Indeed, 
the results of both experiments suggest it is 
not. As mentioned above, we believe that the 
cognitive preparation for a counterattitudinal 
rather than a proattitudinal or neutral mes- 
sage elicits more extensive processing, which 
may result in the generation of counterargu- 
ments, favorable thoughts, and/or neutral 
thoughts. Accordingly, the activation of the 
oral muscles reflects this difference in the ex- 
tent of cognitive elaboration rather than the 
affectivity of the processing. Evidence that 
oral EMG reflects this depth of processing has 
been found in a study in which subjects 
viewed a word and identified whether it was 
printed in uppercase letters (shallow process- 
ing) or was self-descriptive (deep processing 
—Cacioppo & Petty, in press-b, in press-c). 
We found that deeper processing resulted in 
elevated speech EMG activity. 

Finally, a word might be said about the 


10 Love (1972) attempted to detect subtle changes 
in facial expression by videotaping the shoulders and 
face of subjects as they listened to an advocacy. 
Raters then scored the nonverbal cues emitted by 
these subjects. This measure proved to be insensitive 
to the experimental manipulations. The electro- 
physiological approach illustrated here has the ad- 
vantage of being sensitive to subtle changes in re- 


sponding by recipients. 
sensitivity and specificity of the electrophysiological 
measures employed. The procedures 
we employed for scoring these responses 
eliminated obvious movement artifacts and 
facial expressions (the number of the edits 
did not differ across the experimental condi- 
tions); hence, we intentionally confined our- 
selves to the study of covert electrophysiolog- 
ical responses. Nevertheless, in this research, 
oral EMG activity has been a more sensitive 
measure of covert information processing than 
heart rate (see also McGuigan, 1978). Heart 
rate and speech muscle activity increased fol- 
lowing the forewarning of an upcoming dis- 
crepant message, but the change in heart rate 
in Experiment 2 was not significant statis- 
tically; heart rate did increase significantly 
during the presentation of the communica- 
tions, although again, the electromyographic 
measures proved more sensitive. Similarly, the 
various measures of facial EMG activity were 
not equally sensitive to affect-laden process- 
ing. Schwartz and his colleagues (1976a, 
1976b, 1978) have found, as we here have 
found, that corrugator EMG activity best 
distinguishes subtle affective states. These in- 
stances of differing sensitivity, or response 
discordance, illustrate a point made by the 
Laceys (e.g., 1959, 1967) regarding the spe- 
cificity of autonomic and somatic activation. 
Moreover, this remarkable specificity contrasts 
sharply with Cannon’s (1927) theory of 
physiological arousal and emotion, and pro- 
vides unique evidence for cognitive response 
processes in persuasion. 
Remembering Schema-Consistent Information: Effects of a
Balance Schema on Recognition Memory


Keith P. Sentis and Eugene Burnstein 
University of Michigan 


The major premise of this article is that schema-consistent (balanced) informa- 
tion is structurally represented in memory in a qualitatively different fashion 
from schema-inconsistent (imbalanced) material. The former closely resembles 
overlearned information and “easily chunked” information in that multicomponent
stimuli are represented in memory as unitary wholes, whereas the latter is
stored as discrete propositions. Subjects studied short scenarios that described 
either balanced or imbalanced structures involving three relations and then 
answered questions that required retrieval of either one, two, or all three of the 
relationships. For imbalanced information, reaction times to questions increased 
with the number of to-be-retrieved relations, whereas with balanced informa- 
tion, reaction times actually decreased as the number of to-be-retrieved rela- 
tionships increased. Thus, for balanced structures, all three relations were more 
quickly verified than any two, and any two were more quickly verified than 
any one. A model is proposed for the operation of the balance schema that
explains why the whole is retrieved more easily than its parts.


During at least the past 100 years, there 
has been a sustained belief that the concept 
of knowledge structures is useful for explain- 
ing mental processes. This is evident in the 
remarkable similarity between the descrip- 
tions of cognitive organization contained in 
the works of several nineteenth-century theo- 
rists and those offered by current researchers 
in cognitive psychology and artificial intel- 
ligence. To cite only one example, regarding 
cognitive representations of frequently re- 
peated episodes, Butler (1877) felt that “the 
memory of many past performances strikes 
a sort of fused balance in the mind, which
results in a general method of procedure with
but little conscious memory of even the latest
performances, and with none whatever of by
far the greater number of the remoter ones”
(p. 158). A recent iteration of Butler’s ideas
can be found in Schank and Abelson (1977),
who note that “as an economy measure in the
storage of episodes, when enough of them are
alike they are remembered in terms of a
standardized generalized episode which we
will call a script . . . This economy of storage
has a side effect of poor memory for detail”
(p. 19).

The research presented here is concerned 
with knowledge structures pertaining to the 
social environment, that is, social schemas. 
Social schemas are coherent conceptual frameworks
for representing relationships among
social stimuli that guide the individual in
organizing the social world. These cognitive
structures are built up on the basis of experience
with social reality and are active in the
interpretation and comprehension of
social events. In particular, we will focus on
schemas about social relationships that can
be described in terms of structural balance.
Heider (1958) postulated that social rela- 
tions, to the extent that they form a balanced 
structure, are subject to Gestaltlike organiza- 
tional principles in that balanced relations are 
perceived as a cohesive unit, whereas imbal- 
anced relations are not—the latter, in fact, 
are thought to be disharmonious and cogni- 
tively segregated (Heider, 1946). 

Naturally, a schema about relations be- 
tween people should influence the processing 
of social information. Indeed, one of Heider’s
important hypotheses was that information 
consistent with a balance schema will be 
better remembered than information incon- 
sistent with the schema (1958, p. 184). Prior 
investigations of this hypothesis have em- 
ployed a variety of procedures, including 
paired associates learning (Zajonc & Burnstein,
1965a, 1965b; Zajonc & Sherman,
1967) and conceptual rule learning (Cottrell,
1975), as well as recall (Gerard & Fleischer,
1967) and recognition tasks (Delia & Crockett,
1973; Sherman & Wolosin, 1973). The
results, however, have provided only checkered
support, with contradictory findings oc-


The present inquiry into the nature of the
balance schema, especially its effects on memory,
uses chronometric procedures that are
standard in the study of information processing
but are relatively novel in the area of
social schemas. Substantively, it is concerned
with how the balance schema functions. This
issue was most recently explored by Picek,
Sherman, and Shiffrin (1975). They had subjects
indicate whether the relationships between
people in stories they had read earlier
were positive or negative or were not presented.
Fewer errors were observed on the
balanced stories than on the imbalanced ones.
Moreover, there was a distinctive effect of
the order in which the relationships were presented,
with the most errors occurring on the
third and succeeding relationships in the imbalanced
structures, the point in the narrative
at which the social situations became
unambiguously imbalanced (or balanced).
These findings suggest that once the degree
of balance of a social structure could be determined,
the balance schema resulted in
schema-consistent information being stored in 
a simpler and more available manner than is 
schema-inconsistent information. As a result, 
subjects were able to retrieve the former more 
easily from memory. The critical question, 
which remains open, is how it comes about
that schema-consistent information is stored 
more efficiently and hence is more easily re- 
trieved than schema-inconsistent information. 

Retrieval processes have been extensively 
investigated using a paradigm in which sub- 
jects study several sentences, are later pre- 
sented with a probe sentence, and are asked to 
decide whether they have seen it before (e.g., 
Anderson, 1976; Anderson & Bower, 1973). 
Some of the discoveries using this procedure 
are important for our analysis of social 
schemas, in particular the finding that the 
more facts learned about a particular concept, 
the longer it takes to recognize correctly a 
relevant sentence as having appeared before 
(“old”) or as not having appeared before 
(“new”). The relationship between the num- 
ber of sentences learned about a concept and 
sentence recognition time has been termed the 
“fanning effect.” This name derives from the 
memory model used in this research, which 
postulates that the concept becomes a central 
node in a memory network and that facts 
about the concept are linked to this central 
node. These facts are thought to “fan out” 
from the concept node. As their number increases,
the number of network paths to be
searched in the recognition process also in- 
creases, and therefore the more time-consum- 
ing the search process. As a consequence, the 
more that is known about a concept, the 
longer it takes to correctly recognize any sen- 
tence about that concept. 

The counterintuitive implications of the 
fanning effect are that an increase in knowl- 
edge about a topic hinders our ability to 
answer questions about the topic readily or 
that an expert will answer questions more 
slowly than a novice. The fanning effect thus 
poses a paradox. How can someone with ex- 
pertise on a certain topic answer questions 
about the topic more readily than a less 
knowledgeable person, despite having to 
search through a much larger data base of
relevant information? One class of explanations
for this paradox centers on the frequency
with which the facts are retrieved.
Hayes-Roth (1977), for instance, has shown 
that the fanning effect disappeared after ex- 
tended practice with the same set of facts. 
More relevant to our purpose, however, is the 
work by Smith, Adams, and Schorr (1978), 
who were interested in how the retrieval in- 
terference represented by the fanning effect 
could be overcome when the facts involved 
have not been frequently retrieved. They ob- 
served that when subjects learned sets of facts 
that could be combined as a meaningful unit 
by reference to knowledge about the world, 
the fanning effect was eliminated. These find- 
ings imply that information that can be inte- 
grated is more easily retrieved than informa- 
tion that cannot be integrated. Recall that a 
central theme in Heider’s analysis of social 
perception is that balanced social relation- 
ships form a Gestalt, that is, they tend to be 
represented as integrated wholes or units. 
Imbalanced relations, on the other hand, are 
postulated to be cognitively segregated. If 
this is true, it would seem to follow from the 
research on the fanning effect by Smith et al. 
that balanced relationships are retrieved with 
relative ease because the balance schema 
allows for a highly integrated representation 
in memory of schema-consistent information. 
This means that the schema serves as a map- 
ping function and generates an integrated 
memory representation of the individual com- 
ponents of information that are consistent 
with it. More specifically, such an analysis 
implies that (a) the balance schema inte- 
grates information consistent with the schema, 
which (b) results in less retrieval interference 
for balanced relationships than for imbal- 
anced ones; consequently, (c) memory for 
balanced structures is relatively superior. As 
mentioned earlier, Smith et al. (1978) showed 
that integration of facts offset retrieval inter- 
ference and reduced recognition times. If bal- 
anced information is stored in memory in a 
more integrated fashion than imbalanced information
(and hence is more easily retrievable),
this should be reflected in lower reaction
times (RTs) for questions about the balanced
structures than for questions about imbalanced
structures.

In essence, we propose that balanced social 
relationships are represented in memory in a 
fundamentally different fashion from imbal- 
anced relationships. We investigated the 
structural differences in the representation of 
balanced and imbalanced information by 
using a technique originated by Sternberg 
(1966, 1969, 1974), whereby subjects mem- 
orize a set of items and then respond to a 
test item by indicating whether or not the 
latter is a member of the memorized set. If
RT increases as a function of size of the
memory set, this indicates a serial comparison
between the probe stimulus and the memory
set items. However, if RT does not increase
with memory set size, then parallel comparisons
would be indicated. In the former case,
a positive slope for the RT function with respect
to set size is a sensitive index of the
time it takes for each comparison. Furthermore,
the scanning of the memory set in an
item-by-item fashion implies that the components
of the memory set are represented as
discrete units in memory. In contrast, an RT
function with no slope would mean that the
several items could be processed as rapidly as
one. This parallel processing would indicate
that the components of the memory set are 
stored as an integrated representation such 
that the items are available for comparison all 
at once (Hayes-Roth, 1977). Variations of 
Sternberg’s technique have been used to in- 
vestigate the issue of integration of visual 
images. For example, De Rosa and Tkacz 
(1976) found no effect of set size on RT in 
a Sternberg task when the material in the 
memory set could be integrated into a single 
representation. Similar results were obtained 
by Guenther (Note 1) using cartoonlike pictorial
stimuli and by Salthouse (1977) using
dot pattern prototypes. Finally, Thompson
and Klatzky (1978) suggest that Gestaltlike
organizational principles are used to encode
visual stimuli and that under certain conditions
these principles facilitate parallel, unlimited
capacity processing.


1 As discussed by Townsend (1972), a positive RT
slope would also be indicative of a limited capacity
parallel process. In addition, Anderson (1976) demonstrated
that a serial model can make equivalent
predictions to any parallel model. However, in many
instances, the equivalent serial (or parallel) model
must incorporate some rather nonintuitive assumptions.
Thus, for our purposes, the original distinction
will suffice.
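The logic of the slope test can be made concrete with a small sketch. The function names and all millisecond values below are illustrative assumptions, not parameters taken from any of the studies cited: a serial scan predicts an RT that grows with set size, whereas an unlimited-capacity parallel comparison predicts a flat function.

```python
def serial_rt(set_size, base_ms=400.0, per_item_ms=40.0):
    """Serial scan: each item in the memory set adds one comparison (hypothetical values)."""
    return base_ms + per_item_ms * set_size


def parallel_rt(set_size, base_ms=400.0):
    """Unlimited-capacity parallel comparison: set size adds no time (hypothetical value)."""
    return base_ms


def slope(rt_fn, sizes=(1, 2, 3, 4)):
    """Average change in predicted RT per added item across the given set sizes."""
    first, last = sizes[0], sizes[-1]
    return (rt_fn(last) - rt_fn(first)) / (last - first)
```

Under these assumptions, a positive slope estimates the time per comparison, and a zero slope is what an integrated representation would produce.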

The present research on the balance schema 
used a variant of the Sternberg technique to 
evaluate the hypothesis that information consistent
with a balance schema is represented
in memory as an integrated unit. In this
study, subjects studied two scenarios describing
P-O-X triads (i.e., social structures comprised
of the affective relationships between
two people and their attitudes toward an
issue). One scenario described a balanced
structure, the other an imbalanced one. Following
the study session, subjects answered
questions about these scenarios that required
them to verify one, two, or all three of the
relationships in the structure. Notice that in
our version of the Sternberg task, the size of
the memory set is constant and the size of
the probe (i.e., the size of the retrieval set)
is varied. We expected that the slope of RT
as a function of the number of structural
relationships to be verified should reflect the
integrative function of the balance schema.
This means that if the balance schema results
in subjects forming an integrated representation
of the scenarios, then the RT function
ought to reflect differences in the degree to
which this integration occurred. Specifically,
as the degree of integration increases, the
slope of RT as a function of the size of the
retrieval set should decrease. We predict,
therefore, that because schema-consistent information
is more readily integrated than
schema-inconsistent information, the former
should exhibit a flat slope and the latter a
positive slope for the function relating RT
to the number of relationships to be verified
in the question.


Method 
Subjects 


Subjects were 38 undergraduates enrolled in introductory
psychology classes. The 35 males and 3 females
received course credit for their participation.
Data from six subjects who made more than six
errors during the 56 trials were excluded from the
analyses.


Materials 


Scenarios. Two short descriptions of hypothetical 
situations were used, each involving two persons of
the same sex and an issue. Each of the scenarios was
five sentences long and contained the following
information: (a) a brief description of both individuals,
(b) the interpersonal sentiment relationship
between the individuals, and (c) the relationship between
the persons and the issue, that is, their attitudes
toward the issue.

Two versions of each scenario were generated by 
reversing the relationships between the entities. This 
reversal of relationships produced one version in a 
balanced state and the other in an imbalanced one. 
Two sets of scenarios were formed by varying the 
order of occurrence of the balanced scenario. Ap- 
proximately equal numbers of subjects were ran- 
domly assigned to each set of scenarios. In both 
sets, the scenario about admissions quotas preceded 
the abortion scenario. In the two scenarios that are
reproduced below, the reversed relationships that 
appear in parentheses were substituted for the pre- 
ceding italicized sections to produce imbalanced 
versions of the scenarios. 

Scenario 1. Mary has been married for three years 
and has a year-old child. She is very much opposed 
to (in favor of) liberal abortion legislation and 
works on a local committee that is lobbying against 
(for) such legislation. Susan strongly opposes 
(favors) liberal abortion laws and is one of the most 
active members of the local committee. Mary and 
Susan became acquainted through their work to- 
gether on the committee. They have grown to like 
(dislike) each other very much. 

Scenario 2. Bill is a senior at a large university 
and has majored in Sociology. Bill is applying to sev- 
eral prestigious law schools and strongly opposes 
(favors) the graduate schools’ admissions policy that 
is based on racial quotas. John is also a senior at 
the same university and he is an Urban Studies 
major. John is applying to these law schools also 
and is very much opposed to (in favor of) the quota 
system. Bill and John are (are not) friends; in fact, 
they like (dislike) each other very much. 
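Whether a given version of a scenario is balanced can be checked mechanically. The sketch below uses the standard sign-product rule for a P-O-X triad (the triad is balanced when the product of its three relation signs is positive). The encoding of relations as +1 and -1, the function name, and the argument names are assumptions made for illustration.

```python
def is_balanced(p_o, p_x, o_x):
    """Return True when the product of the three relation signs is positive.

    Each argument is +1 (like / favor) or -1 (dislike / oppose):
    p_o is the sentiment between the two persons, p_x and o_x are
    each person's attitude toward the issue.
    """
    for sign in (p_o, p_x, o_x):
        if sign not in (1, -1):
            raise ValueError("relations must be +1 or -1")
    return p_o * p_x * o_x > 0


# Scenario 1 as printed: Mary and Susan like each other and both oppose the legislation.
scenario_1_printed = is_balanced(p_o=+1, p_x=-1, o_x=-1)

# Scenario 1 with the parenthesized relations: both favor the legislation but dislike each other.
scenario_1_reversed = is_balanced(p_o=-1, p_x=+1, o_x=+1)
```

With this encoding the printed version comes out balanced and the parenthesized version imbalanced, which matches the description of how the two versions were generated.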

Questions. Fourteen distinct questions were con- 
structed about each scenario (28 in all). Since we 
were interested in analyzing RT as a function of the 
complexity of the question, the questions were pre- 
sented in a schematic format rather than in verbal 
form, in an attempt to minimize differences in ques- 
tion encoding times. Each schematic question took 
the form of three words from the scenario (two 
names and an issue) and from one to three symbols 
(+ or —), used to represent the relationships be- 
tween the entities in the scenarios. Typical schematic 
questions are shown in Figure 1. The spatial arrangement
of the words and symbols was constant across
questions. Subjects were required to verify the relationships
represented by the symbol(s) in the schematic
question against the corresponding relationship(s)
in the scenario by responding true or false
to each question.


[Figure 1 panels: one-relationship questions, two-relationship questions, three-relationship questions]

Figure 1. Examples of schematic questions presented to subjects.

The 14 schematic questions about each scenario 
formed three sets, according to the number of rela- 
tionships between entities that the subject was re- 
quired to verify. These sets were: 

1. One-relationship questions—six questions that 
required the subject to verify one relationship among 
the entities in the scenario (see examples in Figure 
1, left panel). 

2. Two-relationship questions—six questions that 
required verification of two relationships (Figure 1, 
center panel). 

3. Three-relationship questions—two questions re- 
quiring verification of all three relationships in the 
scenario (Figure 1, right panel). 

In seven of the 14 schematic questions, each rela- 
tionship in the question accurately reflected those in 
the scenario. Thus, the correct response to these ques- 
tions was true. In the other seven questions, the 
relationship(s) were opposite to those in the scenario, 
thus requiring a false response. Note that in order to 
completely counterbalance the question format across 
the true-false and balance-imbalance factors (see 
Design section), the following constraint was re- 
quired. For any given schematic question, either 
every symbol accurately reflected the relationship in 
the scenario, or none did, that is, all were the op- 
posite of those in the scenario. 

Of the six one-relationship questions described in 
Set 1 above, two questions (one true and one false) 
pertained to the relationship between the people in 
the scenario (as illustrated at the top of the left 
panel of Figure 1), and four questions (two true and 
two false) pertained to the relationship between the 
people and the impersonal entity in the scenario 
(middle and bottom of left panel in Figure 1). 

Question Set 2 was composed of two questions 
(one true and one false) about each of the possible
subsets of two relationships among the three entities.
The three possible subsets of relationships are shown
in the center panel of Figure 1.

In Question Set 3, all three relationships were
specified in each question.
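The composition of the three question sets follows from enumerating every nonempty subset of the three relationships in a true and a false version. A minimal sketch is given here; the relation labels, dictionary keys, and function name are hypothetical and serve only to show that the counts come out to 6, 6, and 2.

```python
from itertools import combinations

# Hypothetical labels: the person-person relation and the two person-issue relations.
RELATIONS = ("P-O", "P-X", "O-X")


def build_questions(relations=RELATIONS):
    """Enumerate each nonempty subset of relations once as true and once as false."""
    questions = []
    for size in (1, 2, 3):
        for subset in combinations(relations, size):
            for correct_response in (True, False):
                questions.append(
                    {"relations": subset, "size": size, "correct": correct_response}
                )
    return questions


questions = build_questions()
counts = {size: sum(q["size"] == size for q in questions) for size in (1, 2, 3)}
```

Because every symbol in a question is either accurate or reversed, each subset yields exactly one true and one false question, giving 14 questions per scenario.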

Four presentation orders (two for each set of 
scenarios) were prepared, each order consisting of
two consecutive random permutations of the 28 questions. This
resulted in a total of 56 trials that were broken into two
consecutive blocks, in each of which the entire set of
28 questions occurred. Approximately equal numbers
of subjects were assigned to each presentation order. 
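A presentation order of this kind can be sketched as follows. The function name, the use of integer question indices, and the seed argument are assumptions for illustration; the original orders were prepared in advance and are not reproduced here.

```python
import random


def presentation_order(n_questions=28, n_blocks=2, seed=None):
    """Concatenate independent random permutations of the question indices, one per block."""
    rng = random.Random(seed)
    order = []
    for _ in range(n_blocks):
        block = list(range(n_questions))
        rng.shuffle(block)  # each block contains every question exactly once
        order.extend(block)
    return order
```

Calling `presentation_order(seed=0)` returns 56 indices in which each half is a full permutation of the 28 questions, so every question occurs once per replication block.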


Design 


The characteristics of the questions as described 
above, coupled with the fact that the reversal of 
relationships changed a given scenario from a bal- 
anced to an imbalanced state, meant that the correct 
response to a particular schematic question was 
either true or false and concerned either a balanced
or an imbalanced structure, depending on the version
of the scenario. Thus, collapsing across subjects effectively
counterbalanced for scenario effects, question
order effects, and their interaction. This resulted in
a 3 × 3 × 2 × 2 × 2 within-subjects design, the factors
being (a) the number of relationships in a
schematic question; (b) the particular subset of relationships
that the question dealt with; (c) the correct
response (true or false); (d) the type of structure
(balanced or imbalanced); and (e) the replication
block (1 or 2).


Apparatus 


Presentation of questions and recording of responses
and RTs were controlled by a PDP 11 computer
equipped with three ADDS 980 terminals.
Six subjects were run at a time, each proceeding
through the trials at his or her own pace.
Each subject sat in an individual soundproofed
cubicle with a 12-inch (30.48 cm) monitor and two 
movable response-button boxes. Subjects were in- 
structed to move the button box labeled TRUE to 
their dominant hand, to position the boxes com- 
fortably, and to rest their forefingers on the buttons. 
The schematic questions were centered on the screen 
of the monitor, which was approximately at eye 
level. 


Procedure


Subjects were seated in their cubicles and were 
handed an introductory sheet that described the ex- 
periment as one concerning “the manner in which 
individuals learn and use information about the 
world around them.” The sheet also asked the sub- 
jects to study “very short descriptions of situations 
that one might encounter in the course of daily life” 
and informed them that they would be asked ques- 
tions about specific details of the descriptions. 

After the introductory sheet, the subjects were
given two minutes to study each of the scenarios that
were typewritten on separate sheets.² Following the
study period, subjects were given an instruction
booklet that described the format of the questions
and stressed the importance of answering them by
pressing the true or false button as quickly as possible
while maintaining perfect accuracy.

The instruction booklet was followed by 12 practice
trials (six schematic questions about each scenario).
The format of the practice questions was
identical to that of the questions used in the actual
trials. During the practice trials, the first incorrect
response made by a given subject was flagged by an
INCORRECT message at the center of the monitor. The
second and succeeding errors by that subject were
flagged with the message INCORRECT. COMPLETE ACCURACY
IS VERY IMPORTANT. SLOW DOWN IF NECESSARY.

Following the practice trials, subjects were asked if
they had any questions. During the 56 actual trials
that followed, a 2-sec pause was interpolated between
a response and the presentation of the next
schematic question.


Results 

The overall error rate was 4.6%, with
slightly fewer errors occurring on balanced information
(4.2%) than on imbalanced information
(5.0%). The analysis of RT includes
correct responses only. Data from the practice
trials were excluded from the analyses, as
were data from trials with aberrant RTs, defined
as those more than three standard deviations
from the subject’s mean (1.3% of
trials).

The principal data are the parameters that
describe RT as a function of the number of
structural relationships to be verified. Straight 
lines were fitted to each subject’s data, and 
the parameters (slopes and intercepts) of 
these RT functions were analyzed in 2 x 2 x 
2 within-subjects analyses of variance 
(ANOVAs), with factors of structure type
(balanced/imbalanced), replication block 
(first/second), and correct answer (true/ 
false). The results reported below are based 
on data collapsed over replication blocks be- 
cause there was no effect of replication block 
on the slopes of the RT functions, F(1, 
31) < 1.0, and the effect of replication block 
on the intercepts of the RT functions indicated
that subjects merely improved their performance
with practice, F(1, 31) = 8.74, p <
.006.

The mean RTs for correct responses appear 
in Figure 2, together with the means of the 
parameters of straight lines fitted to each 
subject’s data using a least-squares criterion. 
In general, the results confirm the predictions 
regarding the slopes of RT as a function of 
the number of relationships to be verified. 
For trials in which the correct response was 
true, the 95% confidence interval for the balanced
slope ranges from −95 to −15 msec,
whereas the same confidence interval for the
imbalanced slope ranges from 84 to 164
msec. For the false trials, the 95% confidence
intervals for the balanced and imbalanced
slopes were −19 to 61 msec and −32 to 48
msec, respectively.
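The two data-reduction steps described above, trimming aberrant RTs and fitting a straight line to each subject's data, can be sketched as follows. The function names are hypothetical, the trimming here uses the mean and standard deviation of whatever list is passed in, and the example RTs are invented for illustration rather than taken from the study.

```python
from statistics import mean, stdev


def trim_aberrant(rts, n_sd=3.0):
    """Drop RTs lying more than n_sd standard deviations from the mean of the list."""
    if len(rts) < 2:
        return list(rts)
    m, s = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= n_sd * s]


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for paired observations."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Invented cell means (msec) for one subject at 1, 2, and 3 relationships.
slope, intercept = fit_line([1, 2, 3], [1900.0, 1850.0, 1800.0])
```

In this invented example the fitted slope is negative, the pattern reported for balanced true trials. The per-subject slopes and intercepts would then be entered into the analyses of variance.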


2 Earlier studies in this program (Sentis, Note 2) 
indicated that study time is an important factor in 
this paradigm. When subjects were allowed to allo- 
cate their study time differentially between balanced 
and imbalanced stories, their subsequent recognition 
performance (in terms of both errors and RT) was 
markedly superior for imbalanced information. 
Other studies that employed a single study period for 
balanced and imbalanced structures also reported 
superior memory performance on imbalanced in- 
formation (Gerard & Fleischer, 1967; Spiro & Sherif, 
1975). This result appears to be due to the greater 
distinctiveness of the imbalanced scenarios relative 
to the balanced ones. The greater salience of imbal- 
anced information apparently commands a dispropor- 
tionate share of subjects’ attention during the study 
phase and subsequently results in better perform- 
ance on imbalanced structures during the test phase. 
[Figure 2 axis label: Number of structural relationships in question (R)]


Figure 2. Mean reaction time (RT) for correct responses
as a function of number of structural relationships
to be verified in the questions. (Least-squares
linear functions were applied to individual
subject means. Shown are the means of the parameters
of the RT functions by type of structure and
response.)


Discussion 


The results show that the balance schema 
had a large effect on the processing associated 
with correct true responses, as indicated by 
the slopes of the true RT functions, which
differed considerably according to whether the 
question to be answered concerned a bal- 
anced or an imbalanced structure. As can 
readily be seen in Figure 2, the RT results 
from the true responses confirm the prediction
that imbalanced information is not well inte- 
grated in memory and thus exhibits a positive 
slope for RT as a function of the number of 
structural relationships retrieved from mem- 
ory. In contrast, balanced information yielded 
not merely a flat slope for the true RT func- 
tion but a reliably negative slope. Thus, there 
is the distinct possibility that the balance 
schema results in the formation of such highly 
integrated memory structures that the whole 
structure is more readily accessed than a subpart.
Indeed, it seems that the smaller the
subpart, the more difficult the access. In short,
not only was parallel processing of the structural
relationships indicated, but more intriguingly,
the balance schema, by virtue of
its integrative function, actually appears to
inhibit the processing of subparts of the
structure.

In contrast to the true responses, the slopes 
of the RT functions from the correct false 
responses did not differ according to the 
degree of structural balance. In addition, the
RTs to false questions were considerably 
faster than the true RTs. These qualitative 
differences between the true and false RT 
functions suggest a two-stage memory search 
model that is similar to the recognition mem- 
ory model of Atkinson and Juola (1973). For 
information stored in long-term store (LTS), 
Atkinson and Juola propose a recognition 
model assuming that the subject either makes 
a fast initial response based on the familiarity 
of the test item or, if the item is neither extremely
familiar nor extremely unfamiliar, delays
responding until a more extensive search
of the memorized information is carried out.
As shown in Figure 3, for the present data
we assume that the familiarity criterion for a
fast no is set above the bulk of the “false”
distribution and the fast yes familiarity criterion
is set above most of the “true” responses.
The area of extended search between
these criteria overlaps most of the true distribution,
and an account of this extended
search process will be given below. Note that
the lower tail of the true distribution falls
below the fast no criterion, and thus the incorrect
true responses (i.e., responses that
ought to have been true but were not) should
be relatively fast. On the other hand, the
upper tail of the false distribution lies in the
search area, and hence subjects’ incorrect
false responses will occur after a search and
should be relatively slow. The RT data for
errors are consistent with this reasoning, with
the 42 incorrect true responses averaging
1,617 msec and the 36 incorrect false responses
averaging 1,785 msec (although the
difference is not statistically significant).
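The decision rule of the two-stage model can be sketched in a few lines. The criterion values, the familiarity scale, and the function name are hypothetical. As a simplification, the extended search is assumed here always to recover the correct answer, which the model itself does not require.

```python
def two_stage_response(familiarity, truth, fast_no=0.3, fast_yes=0.9):
    """Return (response, used_search) for one probe.

    familiarity: hypothetical familiarity value of the probe.
    truth: the correct answer, assumed to be recovered by an extended search.
    fast_no, fast_yes: hypothetical lower and upper familiarity criteria.
    """
    if familiarity < fast_no:
        return False, False  # fast "no" based on low familiarity, no search
    if familiarity > fast_yes:
        return True, False   # fast "yes" based on high familiarity, no search
    return truth, True       # intermediate familiarity: extended memory search
```

A true item whose familiarity falls in the lower tail, for example `two_stage_response(0.1, True)`, receives a fast but incorrect "no", which is the source of the relatively fast incorrect true responses discussed above.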

Anecdotal evidence also supports the proposed
two-stage model. Subjects often reported
that they were sure they had made
some errors, despite receiving no feedback
during the actual question session. This introspection
is in line with the model, which
posits that even if the initial familiarity of
an item produces a decision to respond quickly,
the extended memory search continues
and quite likely results in the subject’s confirming
that he has or has not made the correct
response.

[Figure 3 labels: FAST ‘NO’; INCORRECT TRUE]

Let us now consider the processing that
occurs during the extended search initiated for
true responses. In particular, we wish to address
the theoretical significance of the negative
slope for the balanced true RT function.
Of course, our study is not the first to discover
that the information about a complete
structure may be more accessible than information
about any part. For example, Postman,
Bruner, and Walk (1951) found that
subjects were able to determine what a defective
word ought to be before they noticed
which letter in the word was defective. Similarly,
Horowitz, Day, Light, and White
(1968) showed subjects a word with one letter
missing and found that they could report the
completed word twice as fast as the missing
letter; and Johnson (1975) demonstrated
that subjects identify words faster than their
constituent letters. Using a paradigm similar
to the present studies, Hayes-Roth (1977)
observed that after extensive learning of
three-part propositions, which presumably
resulted in integrated (“unitized”) memory
representations, two-part subsets of the three-part
propositions were verified faster than
one-part subsets. In light of these findings, it 
is reasonable to interpret the negative slope 
of the balanced RT function for true re- 
sponses as indicating the balance schema re- 
sults in such a highly integrated cognitive 
representation of schema-consistent informa- 
tion that subjects found it more difficult to 
process subsets of the balanced structure than 
to access it in its entirety. The implication is 
that there is a cost associated with the inte- 
grative function of the balance schema, 
namely, the integrated memory structure 
must be decomposed in order to access sub- 
sets of it. Because the three-relationship ques- 
tions require no decomposition, they can be 
answered most quickly of all. On the other 
hand, two-relationship questions, since they 
require that one relationship be removed from 
the structure prior to responding, are an- 
swered more slowly. The one-relationship 
questions require still another decomposition 
step and are answered the most slowly. 

The present results and the other similar 
findings discussed above suggest the following 
processing model for information about social 
structures varying in their degree of balance. 
It assumes that structural relationships in 
the scenarios are transferred to LTS and are 
maintained there in a particular format (de- 
tailed below). The question-answering task 
requires retrieval of structural relationships 
into short-term store or working memory, 
where the decision-making operations needed 


Figure 3. Theoretical distribution of familiarity values for true and false questions with assumed
high and low criteria set to result in the obtained pattern of reaction times. (Figure labels: fast “no,” search area, fast “yes,” incorrect true, incorrect false.)

Figure 4. Processing algorithm for answering questions about balanced and imbalanced structures.


to answer the questions occur (for an over- 
view of memory models, see Klatzky, 1975; 
Shiffrin & Schneider, 1977). 

A primary feature of the model is the
mode of storage in LTS. Imbalanced rela-
tionships are inherently difficult to integrate
and are therefore stored in memory as dis-
crete propositions. Thus, the imbalanced
P-O-X triads used in the present studies
would be stored as three discrete memory
representations, one for each relationship. As
a result, each imbalanced relationship must
be retrieved individually. Balanced informa-
tion, on the other hand, is integrated into
unitized nodes that are retrieved in an all-or-
none fashion (cf. Hayes-Roth, 1977; Schnei-
der & Shiffrin, 1977). Thus in the present
studies, the three balanced relationships that
comprise a schema-consistent scenario would
be stored as a single unitized memory repre-
sentation.

With respect to the question-answering
phase of this task, the model assumes the
algorithm shown in Figure 4. The initial
familiarity check results in the relatively
“fast and flat” RT functions for the false
responses that are not affected by the degree
of structural balance. Assuming the fast
criterion values that are shown in Figure 3,
the bulk of the true questions generate inter- 
mediate familiarity values and would neces- 
sitate a more extended memory search. 

This extended search incorporates two pri-
mary operations—retrieval from LTS and
decomposition. The sequencing of these op-
erations depends both on the mode of infor-
mation storage (discrete propositions vs.
unitized nodes) and the complexity of the
question to be answered. Consider the se-
quence of operations that occurs in the algo-
rithm shown in Figure 4 for true responses
to questions. First, the LTS representation of
a relationship in the encoded question is re-
trieved to short-term store. Next, the number
of retrieved relationships is compared to the
number in the encoded question. When their
numbers are equal, the match of the retrieved
relationships with the encoded question is
checked and a response is made. Following
the first retrieval from LTS, the outcome of
the stage that compares the number of rela-
tionships retrieved and the number in the


propositions, and the relationships that com-
prise them must be retrieved one at a time.
Because each relationship to be verified in
the question requires additional retrieval op-
erations, the number of such operations co-
varies with the complexity of the questions
pertaining to an imbalanced structure. Thus,
the 124-msec slope for the imbalanced true
RT function represents the time required to
search LTS and retrieve a particular struc-
tural relationship. In contrast, all of the rela-
tionships in a balanced structure are stored
in a single unitized node, and consequently a
memory search initiated for any single rela-
tionship results in retrieval of the entire struc-
ture. The storage of balanced structures in
unitized nodes permits true questions about
those scenarios to be answered after only a
single retrieval operation, regardless of the
number of relationships in the question. How-
slope for the balanced information reflects
the time required for these decomposition op- 
erations and comprises the “penalty” in the 
model. 

How reasonable is the model outlined 
above? First, the RT data from the errors, as 
well as subjects’ introspection regarding these 
errors, are consistent with the model. Second, 
the model posits that decomposition opera- 
tions take place in working memory. These 
operations consist of transforming the infor- 
mation retrieved from LTS into a format 
suitable for answering the questions. It seems
reasonable to assume, therefore, that these 
relatively simple transformations would be 
less time-consuming than retrieval from LTS. 
The relative magnitudes of the slopes of the 
RT functions representing the decomposition 
and retrieval operations bear this out, with 
retrieval taking roughly twice as long as de- 
composition (124 msec versus 55 msec). 
Finally, the model predicts equal RTs for 
true responses requiring the same retrieval 
and decomposition operations. Two conditions
in the present experiment required the same 
set of operations, namely a single retrieval 
operation with no decomposition operations. 
The conditions requiring the same set of 
operations were the one-relationship true 
questions regarding imbalanced structures 
and the three-relationship true questions
concerning balanced structures. Given the 
model, these two conditions should exhibit 
comparable RTs, and in fact, the predicted 
values for these conditions (based on the 
equations in Figure 2) differ by only 2 msec. 
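
The operation counts implied by the model can be made concrete with a short Python sketch. It is a simplified illustration under stated assumptions: the function names are hypothetical, only the two slope values (124 msec per retrieval, 55 msec per decomposition) come from the text, and the intercepts of the fitted equations in Figure 2 are deliberately omitted, so the output is the search-stage time rather than a predicted RT.

```python
RETRIEVAL_MSEC = 124      # slope of the imbalanced true RT function
DECOMPOSITION_MSEC = 55   # magnitude of the balanced true RT slope
STRUCTURE_SIZE = 3        # relationships in a P-O-X triad


def operation_counts(balanced, n_relationships):
    """Return (retrievals, decompositions) for a true question.

    Balanced structures are retrieved as one unitized node and then
    decomposed down to the size of the question; imbalanced structures
    need one retrieval per relationship and no decomposition.
    """
    if not 1 <= n_relationships <= STRUCTURE_SIZE:
        raise ValueError("question must contain 1 to 3 relationships")
    if balanced:
        return 1, STRUCTURE_SIZE - n_relationships
    return n_relationships, 0


def search_time_msec(balanced, n_relationships):
    """Time attributable to retrieval and decomposition (no intercept)."""
    retrievals, decompositions = operation_counts(balanced, n_relationships)
    return retrievals * RETRIEVAL_MSEC + decompositions * DECOMPOSITION_MSEC


for balanced in (True, False):
    for n in (1, 2, 3):
        print(balanced, n, search_time_msec(balanced, n))
```

In this sketch the one-relationship imbalanced question and the three-relationship balanced question both cost a single retrieval and no decomposition, which is the equality the model predicts for those two conditions.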

In summary, the present model of the bal- 
ance schema is a fairly well-specified account 
of how information concerning social relation- 
ships is processed. It accounts for the differ- 
ent signs and magnitudes of the slopes of the 
RT functions associated with answering ques- 
tions regarding balanced and imbalanced 
structures. The model also raises a number 
of questions. At what point in the encoding 
sequence is the balance schema activated? 
How are the retrieval and comparison opera- 
tions sequenced in working memory? What is 
the effect of the balance schema on memory 
for larger social structures? A more complete 
specification of the model waits on further
research, especially with respect to the tem-
poral organization of retrieval, decomposi- 
tion, and comparison operations in working 
memory, as well as the entire encoding 
process that results in the different modes of 
storage in LTS. 

Our research on the balance schema pre- 
sents evidence supporting the hypothesis that 
information consistent with a schema is stored 
in an integrated fashion. In general, this 
integrated mode of storage will facilitate re- 
trieval of schema-consistent information and 
will enhance the processing of such material. 
This integrative function, however, is a two- 
edged sword. If the schema results in such a 
high degree of integration that the relevant 
information structure becomes unitized, then 
processing is only facilitated to the extent 
that there is a match between the amount of 
information to be retrieved (i.e., the size of 
the probe) and the size of the unitized struc- 
ture. Under certain conditions, therefore, the 
integration function of schemas may be dys- 
functional because of the necessity of decom- 
posing the unitized memory representation to 
access the subparts of the structure. 

In certain respects, the integrative function 
of schemas can be thought of as “chunking,” 
that is, schema-consistent information seems 
to be processed in larger units than is schema- 
inconsistent material. Similar results have 
been obtained in research on the perceptual 
abilities of chess experts (Chase & Simon, 
1973) and masters of the oriental game of Go 
(Reitman, 1976). This research indicates that 
experts in these games (individuals with a 
schema about chess or Go) recognize larger 
patterns of pieces on the board than do 
novices (players without a schema). How- 
ever, differences between the expert and nov- 
ice do not obtain for randomly placed pieces 
(i.e., for schema-inconsistent information).
Thus, one reason that an expert’s play is
superior to that of a novice is this ability to
process the schema-consistent information in
larger units or chunks. In related research,
Chiesi, Spilich, and Voss (1979) have
shown that the schema, or structured knowl-
edge, possessed by baseball fans allows them
to integrate or chunk baseball-related infor-
mation, thereby enhancing the processing of
such information. Moreover, Spilich, Veson- 
der, Chiesi, and Voss (1979) report that 
a baseball schema results in the integration 
of the discrete events in a game into mean- 
ingful units or sequences and leads to subse- 
quent recall of larger chunks of game-related 
information. The findings of Spilich et al. also 
suggest that a schema’s integration or chunk- 
ing function is hierarchical in nature and re- 
sults in easier retrieval of higher-order units 
than of lower-order constituents—a finding 
completely consistent with our own.

To the extent that the integration function 
is a general property of social schemas, the 
present research has a number of implications 
for social cognition in general and for inter- 
personal communication in particular. For ex- 
ample, interpersonal communication typically 
involves the transmission of one chunk at a 
time, and as a result, efficient communication 
requires the individuals to use a mutually 
acceptable chunk size. Consider the case of
an expert interacting with a novice. If a nov-
ice attempts to query an expert regarding a
particular topic, a common source of ineffi-
cient communication is the diminutive (low-
level) chunks used by the novice to phrase
the question. In many cases, the expert does
not understand the question and asks the
novice for more background information in
order to put the question into a meaningful
context. The obverse of this problem fre-
quently occurs when an expert does answer a
novice’s question. The expert’s response may
often be delivered in large (high-level)
chunks incomprehensible to the novice. For
example, computer documentation is notori-
ously difficult to comprehend. It is typically
written by a programmer who usually under-
stands not only the particular program to be
documented but also the other programs and
routines that interface with it, as well as the
larger computer system. The crux of the
communication difficulty with computer docu-
mentation, therefore, seems to be the pro-
grammer’s failure to decompose the high-
level chunks used in an expert’s representa-
tion of the program into chunks of a size
digestible by novices.


Reference Notes

Locus of Control, Interpersonal Trust, and Assertive Behavior

This study related two cognitive personality characteristics—locus of control
and interpersonal trust—to assertive behavior in a sample of recently married 
couples. Assertive behavior was measured by the Inventory of Marital Con- 
flicts, an observational procedure in which couples resolve disagreements about 
hypothetical marital problems. Hypotheses were formulated in terms of indi- 
vidual locus of control as well as two combinations of locus of control and 
interpersonal trust—internal—low trust and external—high trust. Results showed
that internal husbands were more assertive than external husbands in the mar- 
ital conflict situation, that external—high trust husbands were least assertive, 
and that internal—low trust wives were highly assertive. These findings are 
interpreted in light of previous locus-of-control and trust research, as well as in 
terms of unconventional marital role behavior. 




Contemporary research on the marital rela- 
tionship has focused on the cultural and situa- 
tional determinants of spouses’ attitudes and 
behavior. Sociologists have emphasized the 
influence of marital role prescriptions (e.g., 
Stryker, 1964), whereas psychologists have 
studied the processes by which spouses shape 
each other’s behavior through reciprocal re- 
wards and punishments (e.g., Weiss & Mar- 
golin, 1977). 

In contrast to these major interest areas in 
marital studies, the personality or individual 
differences dimension of marriage has received 
relatively little attention in recent years. His- 
torically, this area was once highly popular 
with researchers. As reviewed by Tharp
(1963) and Barry (1970), two research tradi-
tions once dominated personality and mar-
riage studies. The first tradition was con-
cerned with the personality correlates of
marital “adjustment.” From the 1930s
through the 1950s, studies consistently found
that neurotic traits in individual spouses
(especially husbands) were associated with
greater marital instability and lower levels 
of self-reported marital adjustment (Burgess 
& Cottrell, 1939; Burchinal, Hawkes, & 
Gardner, 1957; Terman, 1938). 

The second personality and marriage re- 
search tradition was concerned with whether 
spouses tend to be similar or complementary
in personality characteristics. Although
Winch’s (1958) research on complementary
needs in mate selection stirred considerable
controversy, the bulk of the research has sup-
ported the similarity hypothesis for married
couples (Tharp, 1963). Clore and Byrne
(1977) summarized a review of this issue with
the following conclusion: “Marriages, and
stable marriages in particular, are composed
of people with similar personalities, attitudes,
and other characteristics” (p. 548).

In recent years, research on the personality 
aspects of marriage has nearly come to a
standstill. At least two problems with past
research in this area may help to explain the
decline in interest. First, the major findings
added little to the common sense notions that
“likes marry likes” and that unhappy individ-
uals are apt to have unhappy marriages. Sec-
ond, on a methodological level, these studies
suffered from a reliance on global, atheoretical
personality traits as independent variables
(e.g., neuroticism and maladjustment) and
on self-reports of overall marital adjustment
as criterion variables. In the past decade, each
of these categories of variables has become
the object of considerable scholarly criticism.
The traditional trait approach has been at-
tacked by personality psychologists like Wal-
ter Mischel (1968, 1973) on the grounds that
global trait ascriptions ignore the situational
constraints on behavior. In addition, marriage
researchers have criticized marital adjustment
research on both conceptual and methodolog-
ical bases, arguing that the construct is ill
defined and value laden and that marital ad-
justment inventories have been psychomet-
rically weak (Laws, 1971; Ryder, 1967;
Spanier, 1972; Spanier & Cole, 1976). Per-
haps for these reasons, related to the turmoil
in both personality psychology and marital
studies, scholars have turned to other areas
of interest.

The present study attempts to break new
ground in personality and marriage research
through two innovations: first, the use of cog-
nitive personality constructs as opposed to
traditional trait dimensions, and second, a
focus on an interaction dimension of marriage
—assertive behavior. As far as can be deter-
mined, no previous study has found a per-
sonality correlate of observed marital inter-
action.

Specifically, this study examines the rela- 
tionship between the locus of control orienta- 
tions of individual spouses and their assertive 
behavior in a marital disagreement situation. 
Spouses’ interpersonal trust expectancies are
treated as a moderator personality dimension
between locus of control and marital assertive-
ness.

There is no general agreement on the
meaning of assertiveness. After examining
the various usages found in the psychological
literature, Rich and Schroeder (1976) de-
fined assertive behavior in the general sense
as “skills that (a) are concerned with seeking,
maintaining, and enhancing reinforcement
and (b) occur in interpersonal situations in-
volving the risk of reinforcement loss or the
possibility of punishment” (p. 1083). In a
review of the related marriage research, Olson 
and Cromwell (1975) have urged that re- 
searchers adopt a standard definition of asser- 
tive behavior in marriage, namely, “attempts 
to change the behavior of the (spouse)” (p. 
6). Combining elements of these two defini- 
tions, the present study conceptualizes asser- 
tive behaviors in marriage as attempts to 
modify the partner’s behavior in order to 
maintain or enhance one’s interests during a 
marital conflict. This definition is deliberately 
broad and does not distinguish assertive be- 
havior from aggressive behavior. The asser- 
tiveness construct is operationalized in this 
study primarily through the coding of spouses’ 
influence-attempt behavior during a con- 
trived marital disagreement situation. 

As defined and operationalized by Rotter 
(1966), locus of control is a generalized ex- 
pectancy that one’s outcomes are contingent 
more on one’s own efforts or stable personal- 
ity characteristics (internal) or more on out- 
side forces such as luck, fate, or powerful 
others (external). Interpersonal trust as de- 
fined by Rotter (1967) is a generalized ex- 
pectancy that other persons can be relied 
upon to live up to their verbal promises.

Although locus of control has been studied 
only rarely in the context of ongoing relation- 
ships, a considerable body of research using 
college student groups has found that in- 
ternals tend to be less malleable to social in- 
fluence than externals, as well as more persua- 
sive and assertive (see reviews by Phares, 
1976, and Strickland, 1977). These findings 
are consistent with the proposition that in- 
ternals, believing more strongly that they can 
control their outcomes, are more likely to de- 
velop and use the social skills necessary to 
manipulate their environment. Extending this 
line of reasoning to the marital relationship, 
one can speculate that married individuals’ 
locus of control orientations may be associ- 
ated with assertive and yielding behavior 
vis-à-vis their marriage partner. Presumably, 
internality would be positively associated with 
assertiveness. According to Rotter’s theory 
(Rotter, 1975), however, generalized locus- 
of-control expectancies would be only mod- 
erately predictive of behavior in a specific,
ongoing behavioral context such as marriage.
Rotter believes that the more familiar the 
situation, the less powerful the influence of 
generalized expectancies. 

To enhance the predictive value of locus of 
control in the study of assertive behavior in 
marriage, interpersonal trust was used as a 
moderating personality characteristic. In 
other words, this study assumed that the 
relationship between locus of control and 
assertive behavior would be mediated by the 
individual’s interpersonal trust expectancies.
This strategy was adopted for the following 
reasons: 

1. Interpersonal trust itself has implica- 
tions for understanding the marital relation- 
ship. Evidence presented by Rotter (Note 1) 
suggests that low trust individuals are more 
suspicious and behave more competitively 
than do high trust individuals. High trust per-
sons, in contrast, are characterized in some 
studies as pleasant and conventional. 

2. Previous research has found combina- 
tions of locus of control and interpersonal 
trust to be better predictors of certain cri- 
terion behaviors than is locus of control alone. 
In particular, Hochreich (1974, 1975) has 
distinguished theoretically and empirically 
between two types of external males based on 
their trust scores. External—low trust persons
(“defensive externals”) have been found to
behave much like internals in achievement 
situations; however, they tend to project 
blame for failure onto the environment. Ex- 
ternal—high trust individuals (‘congruent 
externals”), on the other hand, were found to 
behave in a relatively passive, noncompetitive 
fashion consistent with theoretical assump- 
tions about externality. This distinction has
been found only in males. 

3. Finally, in research reviewed by Seeman 
(1976), sociologists and political scientists 
have distinguished fruitfully between varia- 
tions of trust and control expectations. One 
prominent hypothesis from this research tra- 
dition holds that internal control expectancies 
in the political area, when combined with low 
trust in government institutions, are associ- 
ated with greater political activism. 

In sum, two combinations of locus of con- 
trol and interpersonal trust seem to have
special importance for assertive behavior in
marriage. If internals are generally more 
assertive than externals, then the low trust- 
ing or suspicious internal should be especially 
likely to protect his or her self-interest in an 
interpersonal conflict. Conversely, the least 
assertive behavior should characterize the 
generally unassertive external who also has a 
high degree of confidence in the honest inten- 
tions of other persons. 

Based on the foregoing review, this study 
hypothesizes that individual spouses’ locus- 
of-control orientations, with interpersonal 
trust as a moderating personality dimension, 
will be associated with levels of assertive 
behavior by that spouse in marital con- 
flict situations. Specifically, it is hypothesized:
(a) that internality will be associated with 
greater assertiveness, (b) that the combina- 
tion of internality and low trust will be asso- 
ciated with the highest levels of assertive be- 
havior, and (c) that the combination of ex- 
ternality and high trust will be associated (at 
least in husbands) with the lowest levels of 
assertive behavior. 


Method 
Subjects 


Eighty-six recently married couples were recruited 
by mail from marriage license records of first mar- 
riages. All had been married less than one year 
(average 6.4 months), and they may be characterized 
as a white, college-educated, middle-class group. 
None had children. The sample was part of a larger 
pool of recently married couples generated by the 
Couples Research Project at the University of Con- 
necticut. 


Procedure 


Couples recruited in the first phase of data gather-
ing went through a two-session procedure in a uni- 
versity setting. During the initial session, couples 
wrote essays about their relationship, then completed
an omnibus questionnaire that included two instru-
ments used in this study. Couples were then invited
back for a second session of testing. Although vir- 
tually every couple agreed to return, in practice only
about half did so. During the second session, couples
were administered the locus-of-control and interper-
sonal trust scales as well as the marital interaction
measure. The n for this first group of couples was
28.

Couples recruited in the second phase were admin-
istered all the instruments (except the essays) con- 
secutively in one session. The n for Phase 2 was 58. 
A comparison between the Phase 1 and Phase 2 
groups revealed no differences in locus of control, 
interpersonal trust, or demographic variables (ex- 
cept that Phase 2 wives were more likely to have 
completed college). The two groups were combined 
for data analysis. 


Instrumentation 


1. Rotter’s Internal-External Locus of Control
(I-E) Scale was used to measure locus of control 
expectancies. A subset of items from Rotter’s scale 
similar to Mirels’ (1970) Personal Control Factor 
was derived through factor analysis, but it yielded 
no improvement in terms of relating to the marital 
variables. Hence, only the whole scale score was 
retained. The I-E scale contains 29 items in a forced- 
choice format, including 6 filler items. A higher 
score indicates a more external orientation. 

2. Rotter’s Interpersonal Trust Scale (Rotter, 
1967) was used to measure interpersonal trust expec- 
tancies. The scale consists of 25 Likert-type items 
plus fillers. A higher score indicates a more trusting 
orientation. 

3. The Inventory of Marital Conflicts (IMC), a 
laboratory interaction procedure, was used to provide 
both a process measure and outcome measure of 
assertive behavior (Olson & Ryder, 1970). The IMC 
poses a nonthreatening disagreement situation in 
which spouses typically try to influence each other’s 
behavior. The procedure presents couples with 18
vignettes in which married couples are involved in 
typical marital disagreement situations (over money, 
sex, household chores, etc.). Each spouse individ- 
ually reads the vignettes and answers two main ques- 
tions about each conflict situation: (a) Who is pri- 
marily responsible for the problem? and (b) should 
a proposal for resolving the conflict be accepted or 
rejected? (After the partners completed the indi- 
vidual part of the IMC procedure, they filled out 
the I-E and interpersonal trust questionnaires, which 
took about 20 minutes.) Finally, the partners are 
brought together and asked to discuss each conflict 
situation and reach a consensus on two questions: (a) 
Who is primarily responsible for the problem? and 
(b) which of two mutually exclusive solutions is 
best? This discussion is audiotape-recorded with no 
experimenter present in the room. 

Disagreement is built into the IMC procedure 
rather than being left to chance. For 12 of the 18 
vignettes, the partners are given somewhat different 
versions of the conflict situation. The husband's ver-
sion slants the responsibility for the problem toward 
the wife, whereas the wife's version makes the hus- 
band look more responsible for the marital conflict. 
The couples were told in the IMC instructions that 
in some cases, they would be reading different per- 
spectives on the same problem.

The IMC yields two types of data: (a) win scores
(who won more of the disagreements) and (b) inter-
action data taken from the taped discussions. The 
coding of the couple interaction was based on the 
Marital and Family Interaction Coding System 
(Olson & Ryder, Note 2), which contains 15 content 
codes for verbal behavior (plus an uncoded state- 
ment category). The codes are divided conceptually 
into three categories: task related, assertive, and 
affective. Each statement made during the IMC dis- 
cussion procedure was coded directly from the audio- 
tape by two independent coders, whose frequency 
ratings for each code were averaged. Since the con- 
cern of this study is with assertive or influence be- 
havior in a disagreement situation, coding was done 
only for discussions of vignettes on which the spouses 
had initially disagreed on their individual forms. 
Because of damage to several tapes, the final n for 
the IMC interaction data was 80.

Reliability of the IMC variables. Two types of 
reliability estimates were performed on the IMC data. 
First, interrater reliability was assessed for each code 
by examining the agreement between raters on the 
scores of each code for each spouse. A code’s score 
was its frequency divided by the total frequency for 
all interaction codes for that spouse. (This procedure 
controls for amount of talking.) Corrected by the 
Spearman-Brown formula for two independent 
raters, the interrater reliability coefficients ranged 
from .49 to .99, averaging .86 across all the codes for 
both spouses. 
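
The two computations named in this paragraph, the proportional code score and the Spearman-Brown correction, can be sketched as follows. This is a minimal illustration rather than the authors' scoring procedure: the function names and the example frequencies are hypothetical, and the correction is the standard Spearman-Brown formula for a measure lengthened by a factor of k (k = 2 for two raters or two half-tests).

```python
def proportion_scores(frequencies):
    """Convert one spouse's raw code frequencies to proportions.

    Each code's score is its frequency divided by the total frequency of
    all interaction codes for that spouse, which controls for how much
    the spouse talked.
    """
    total = sum(frequencies.values())
    if total == 0:
        raise ValueError("spouse has no coded statements")
    return {code: count / total for code, count in frequencies.items()}


def spearman_brown(r, k=2):
    """Spearman-Brown corrected reliability for k parallel parts."""
    return k * r / (1 + (k - 1) * r)


# Hypothetical frequencies for one spouse (not data from the study).
example = {"partisan opinion": 3, "laughter": 1}
print(proportion_scores(example))
print(round(spearman_brown(0.5), 3))
```

Because the correction is monotonic in r, it raises every single-rater or half-test coefficient but preserves their ordering across codes.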

The second set of reliability procedures examined 
the internal consistency of the IMC data. To com- 
pute internal consistency reliability estimates for the 
IMC variables, the items (vignettes) were divided 
into two halves that were equal in the degree of 
disagreement elicited in the sample. As expected from 
previous studies (Olson & Ryder, 1970), the win 
scores proved highly unreliable (split-half r= .12, 
with Spearman-Brown correction) and were retained 
for exploratory purposes only. For the interaction 
codes, split-half reliability coefficients, corrected by 
the Spearman-Brown formula, ranged from .31 to 
.92, averaging .59 across all codes for both spouses.
Codes with interrater or internal consistency reli- 
ability coefficients of under .60 (averaged over both 
spouses) were dropped from subsequent analyses.
Remaining were 9 of 15 originally coded variables: 
initiation of discussion (who begins discussion of 
each vignette), laughter, outcome question (e.g.,
“What do you think?”), read/content (e.g., “My
story says the husband had tried to stop smoking”), 
self-disclosure (the first spouse to reveal an opinion 
on the vignette), partisan opinion (statements de- 
fending one’s opinion), outcome disagreement (the 
first statement of disagreement during the discussion 
of a vignette), process disagreement (all subsequent 
disagreement statements), and disapproval of spouse 
(criticizing the other’s viewpoints or character). 
Average interrater and split-half rs for those 9 codes 
were .91 and .72, respectively. 

Deriving an assertiveness interaction dimension.
Four of the reliable IMC interaction codes are con-

Table 1
Unrotated Principal Components Factor Matrix for IMC Codes

Code                     Factor 1   Factor 2   Factor 3

Husbands
Initiation                 .723       .178       .304
Self-disclosure            .170       .195      -.378
Read                       .587       .106       .586
Outcome question          -.052      -.471       .485
Partisan opinion          -.600       .010      -.310
Process disagreement      -.545       .559       .178
Outcome disagreement      -.748      -.067       .433
Disapproval of spouse     -.069       .770       .188

Wives
Initiation                 .839       .142       .068
Self-disclosure            .637       .532      -.416
Read                       .112      -.262       .123
Outcome question           .295       .132       .546
Partisan opinion          -.493       .234      -.607
Process disagreement      -.409       .724       .089
Outcome disagreement      -.625      -.335       .356
Disapproval of spouse     -.150       .626       .589

Note. IMC = Inventory of Marital Conflicts.


ceptually related to assertive behavior as defined 
earlier—partisan opinion, outcome disagreement, proc- 
ess disagreement, and disapproval of spouse. Each of 
these codes represents an effort either to stand one’s 
ground or to change the other’s opinion—and thus to 
gain an advantage in the disagreement. The other 
codes, except for laughter, appear primarily to repre- 
sent behaviors that facilitate discussion of the vig- 
nettes, for example, initiating the discussion of a 
vignette, asking for clarification, and describing the 
content of the story. 

As an empirical aid to the selection of the asser- 
tiveness dimension, a principal components factor 
analysis was performed on the reliable IMC codes, 
with laughter excluded. Separate analyses were con- 
ducted for husbands and wives. Unities were used as 
diagonal elements and no rotation was performed. 
Inspection of the factor loadings in Table 1 reveals 
three assertiveness-related codes—partisan opinion, 
process disagreement, and outcome disagreement— 
loading in the same direction on Factor 1. Three task- 
related codes loaded in the opposite direction from 
the assertiveness codes on this factor. This pattern 
suggests a negative relationship between task-oriented 
and assertiveness behaviors. The fourth code that had 
been related conceptually to assertiveness, disapproval 
of spouse, did not load on Factor 1. However, 
a decision was made to retain this code as 
part of an assertiveness dimension for two reasons: 
first, it represents a theoretically interesting 
aspect of marital assertiveness, and second, it 
correlated with process disagreement .41 for wives and 
.25 for husbands, which suggests that disapproval of 
spouse has a fair degree of commonality with one 
other assertiveness code. 

For each spouse an assertiveness score was derived 
by summing the z scores for partisan opinion, process 
disagreement, outcome disagreement, and disapproval 
of spouse. Husband and wife assertiveness scores 
correlated .45 with each other, a finding that suggests 
a moderate degree of reciprocity between spouses on 
these assertive behaviors. 
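
The composite described above, a sum of standardized code frequencies, can be illustrated with a short sketch. The data are hypothetical, and the use of the population standard deviation is an assumption; the article does not say which form of standardization was applied.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize a list of raw frequencies (population SD assumed)."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

def assertiveness_composite(code_counts):
    """Sum z scores across codes, person by person.

    code_counts maps a code name to a list of raw frequencies,
    one entry per spouse, in the same order for every code.
    """
    per_code = [z_scores(counts) for counts in code_counts.values()]
    return [sum(zs) for zs in zip(*per_code)]

# Hypothetical frequencies for five husbands on the four codes.
counts = {
    "partisan_opinion": [4, 7, 2, 9, 5],
    "process_disagreement": [1, 3, 0, 4, 2],
    "outcome_disagreement": [2, 2, 1, 5, 3],
    "disapproval_of_spouse": [0, 1, 0, 3, 1],
}
scores = assertiveness_composite(counts)
```

Standardizing before summing keeps a frequently used code, such as partisan opinion, from dominating the composite simply because its raw counts are larger.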


Data Analysis 


The primary statistical procedure was two-way 
analysis of variance of locus of control (internal- 
external) and interpersonal trust (low-high) on 
marital assertiveness. Separate analyses were performed 
on the IMC observed interaction dimension 
and the win score variable. Of central interest were 
the main effects for locus of control and planned t 
test comparisons focusing on the external-high trust 
group and the internal-low trust group. 
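
A planned comparison of one cell against the remaining three can be computed from cell summaries alone. The sketch below is one conventional way to do this, using the pooled within-cell variance as the error term and equal weights for the three comparison cells; the article does not report its exact computation, so both choices are assumptions, and the numbers are hypothetical.

```python
from math import sqrt

def planned_contrast_t(means, sds, ns, weights):
    """t statistic and error df for a linear contrast among cell means."""
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")
    df_error = sum(ns) - len(ns)
    # Pooled within-cell variance (the ANOVA mean square error).
    ms_error = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds)) / df_error
    estimate = sum(w * m for w, m in zip(weights, means))
    se = sqrt(ms_error * sum(w ** 2 / n for w, n in zip(weights, ns)))
    return estimate / se, df_error

# Hypothetical cells: three comparison groups versus one target group.
t_value, df = planned_contrast_t(
    means=[2.0, 2.0, 2.0, 0.0],
    sds=[1.0, 1.0, 1.0, 1.0],
    ns=[5, 5, 5, 5],
    weights=[1 / 3, 1 / 3, 1 / 3, -1.0],
)
```

Weighting the three comparison cells by their sample sizes instead of equally is a defensible alternative and gives a slightly different t when cell sizes are unequal.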

Extreme groups were created on locus of control 
and trust by dropping out the middle third of each 
distribution. Mean scores for the remaining husbands 
were as follows: Internal = 6.6, external = 15.4; low 
trust = 60.6, high trust = 76.7. The corresponding 
means for wives were: Internal = 8.0, external = 
16.4; low trust = 59.9, high trust = 79.2. Four cells 
were created for husbands and wives separately in a 
2 × 2 design: internal-low trust, internal-high 
trust, external-low trust, and external-high trust. 
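
The extreme-groups step can be sketched as follows. This is an illustration only: it splits by rank and drops the middle third, the handling of tied scores is an assumption, and the scores are hypothetical. On the I-E scale used here, lower scores indicate internality.

```python
def extreme_groups(scores):
    """Label the lowest and highest thirds; the middle third gets None."""
    n = len(scores)
    k = n // 3
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for i in order[:k]:
        labels[i] = "low"
    for i in order[n - k:]:
        labels[i] = "high"
    return labels

# Hypothetical I-E scores for six spouses.
ie_scores = [5, 1, 9, 3, 7, 11]
ie_groups = extreme_groups(ie_scores)

# Crossing two such splits yields the four cells of the 2 x 2 design.
trust_scores = [60, 80, 55, 75, 70, 85]
trust_groups = extreme_groups(trust_scores)
cells = [
    (ie, tr) if ie is not None and tr is not None else None
    for ie, tr in zip(ie_groups, trust_groups)
]
```

Because a spouse must fall in an extreme third on both variables to enter a cell, crossing the two splits can leave some cells small, as Table 2 shows for several groups.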

Exploratory analyses of variance were also conducted 
on assertiveness difference scores (husband 
minus wife), and an examination was made of the 
relationship between couple locus-of-control patterns 
(husband I-E × wife I-E) and marital assertiveness. 


Results 
Descriptive Data 


A significant difference between husbands’ 
and wives’ locus-of-control scores was found, 
with husbands more internal (M = 11.0) than 
wives (M = 12.3). (Studies typically report 
no sex differences on the Rotter I-E scale 
in college student samples; see sais 
1977; erty, in press, also found no sex 
differences in a national probability sample.) 
Spouses’ locus-of-control scores correlated at 
.18, which was nonsignificant. Husbands and 
wives did not differ in average interpersonal 
trust scores (M = 68.5 for both spouses); 
the correlation between husband and wife 
trust scores was r = .07, which was not sig- 
nificant. 

The relationship between locus of control 
and trust was different for husbands and 
wives, with husbands’ scores on these two 
variables correlating at −.18 and wives’ 
scores correlating at −.38, p < .001, indicating 
a positive association between internality 
and trust. This pattern contrasts with Hamsher, 
Geller, & Rotter’s (1968) report that 
I-E and trust typically correlate higher for 
males than for females in college samples. 

Two other husband-wife comparisons are 
worth noting. First, there were no significant 
husband-wife differences on the individual 
variables that comprised the marital assertiveness 
dimension: partisan opinion, outcome 
disagreement, process disagreement, and disapproval 
of spouse. Second, the win score variable 
also showed no husband-wife mean differences, 
with husbands winning an average 
of only 1.1% more of the disagreements in 
the IMC procedure. 
Locus of Control, Interpersonal Trust, 
and Marital Assertiveness 


Table 2 shows the cell means and sizes for 
the two-way analyses of variance (Locus of 
Control × Interpersonal Trust) on the IMC 
marital assertiveness variables. Results for 
the observed assertive behavior variable will 
be presented first, followed by results for the 
win scores and the supplementary analyses. 
It should be noted that some of the cells in 
Table 2 have small ns, especially the 
internal-low trust wife group. 

Table 2 
Means and Standard Deviations of Marital Variables by Locus of Control and 
Interpersonal Trust Groups 

                          IMC assertiveness        IMC wins(a) 
Group                      n      M       SD       n     M      SD 

Husbands 
  Internal-low trust      10     .232   2.196     10   59.9   19.5 
  Internal-high trust     13     .106   3.110     13   53.6   10.7 
  External-low trust      11    −.599   1.970     12   57.8   16.4 
  External-high trust      8   −2.867   2.076      9   30.1   13.8 

Wives 
  Internal-low trust       6    3.012   3.221      6   46.5   13.8 
  Internal-high trust     14     .512   1.140     16   57.6   12.5 
  External-low trust      14    −.158   2.620     16   51.2   17.1 
  External-high trust      9     .611   2.735     10   45.4   19.9 

Note. IMC = Inventory of Marital Conflicts. 
(a) Win points for husband relative to wife. Higher score represents more 
husband wins; lower score represents more wife wins. 

Observed assertive behavior. For husbands, 
a significant main effect was found for 
locus of control, F(1, 38) = 7.93, p < .008, 
indicating that internal husbands had higher 
assertiveness scores in the IMC marital conflict 
situation than did external husbands. 
Table 2 shows that the external-high trust 
husband group had the lowest average assertiveness 
score. A planned t test comparison 
between the external-high trust group and 
the other three groups combined showed a 
significant difference, t(38) = 3.09, p < .004. 
Consistent with the congruent versus defensive 
external distinction, the external-high 
trust group was significantly less assertive 
than the external-low trust group, which was 
indistinguishable from the internal groups. 

For wives, the analysis of variance yielded 
a nonsignificant effect for locus of control, 
F(1, 39) = 2.90, p < .097. However, the 
interaction effect was significant, F(1, 39) = 
4.37, p < .043. Inspection of the cell means 
suggests that the extremely high assertiveness 
scores of the internal-low trust wife group 
were primarily responsible for the significant 
interaction effect. A planned t test comparison 
between the internal-low trust wife group 
and the other groups combined indicated that 
the difference was significant, t(39) = 2.53, 
p < .016. There were no assertiveness differences 
among the other three groups. 

Win scores. Although win scores were 
found unreliable across vignettes in the IMC, 
they were retained for exploratory purposes. 
For husbands, the analysis of variance results 
showed a pattern similar to observed inter- 
action findings. The main effect for locus of 
control was significant, F(1, 40) = 7.60, p < 
.009, with internals winning more disagree- 
ments than externals. However, in this analysis, 
the very low win scores of the external-high 
trust husband group created a significant 
interaction effect, F(1, 40) = 5.34, p < .026. 
The other three husband groups averaged 
more IMC wins than their wives, whereas the 
external-high trust group won an average 
of only 30% of their disagreements. The 
difference was highly significant, t(40) = 4.76, 
p < .001. There were no significant effects 
for wives on the win score variable. 

Supplementary analyses. Results of analy- 
ses of variance employing mixed husband- 
wife locus of control groups (Husband I-E x 
Wife I-E) yielded no significant effects for 
either observed assertiveness or win scores. 
The other supplementary analysis examined 
the relationship between individual spouse 
locus of control and trust scores and the 
couple’s relative assertiveness (husband minus 
wife). For husbands, the results showed a 
significant main effect for locus of control, 
F(1, 38) = 5.09, p < .030, indicating that 
internal husbands when compared to external 
husbands were more assertive relative to their 
wives. Of the four Husband I-E × Trust 
groups, only the internal-low trusters had 
average assertiveness scores higher than their 
wives. This difference between the internal-low 
trust group and the other husband groups 
combined was significant, t(55) = 2.04, p < 
.049. There were no significant effects for 
wives. 


Discussion 


The main findings of this study may be 
summarized as follows: (a) For husbands, 
internals behaved more assertively in the 
marital conflict than externals did; external-high 
trusters were least assertive; (b) for 
wives, internal-low trusters behaved most 
assertively in the marital conflict. Husband-wife 
locus-of-control combinations were not 
associated with differential levels of assertive 
behavior. 

These findings may be interpreted in terms 
of previous theory and research on locus of 
control and interpersonal trust and in terms 
of unconventional role behavior. To begin 
with, the tendency for internal husbands to 
behave more assertively than external hus- 
bands is consistent with locus-of-control 
theory. Since they believe they can control 
events, internals are thought to pursue valued 
outcomes more vigorously and persistently 
than externals do, although relatively little 
research has examined this hypothesis in social 
interaction situations. Presumably, supporting 
one’s interests in a marital conflict 
constitutes a generally valued goal that 
internal husbands seek more assertively than 
do external husbands. No explanation is 
readily apparent for the failure to find a main 
locus-of-control effect for wives, but one may 
speculate that succeeding in a more-or-less 
public marital conflict may be more important 
to husbands than wives. 

The tendency for external-high trust husbands 
to engage in unassertive marital interactions 
is likewise consistent with earlier 
research on congruent externality (Hochreich, 
1974, 1975). According to this research, 
reviewed earlier, external-high trust males 
tend to behave in a passive, noncompetitive 
manner consistent with theoretical expectations 
about externality. Believing that an 
interpersonally trustworthy environment controls 
their lives, these men may have little 
motivation to win or “make points” in marital 
disagreements. 

Although there is previous empirical research 
on external-high trust males, the 
internal-low trust combination has not been 
highlighted in the literature. However, it 
seems intuitively reasonable that more vigorous 
assertive behavior should follow from the 
belief that one’s environment is controllable 
but socially untrustworthy. In the marital 
context, the present data suggest that the 
combination of internality and low trust is 
associated with greater assertive behavior by 
wives in marital conflict situations. A supplementary 
analysis found internal-low trust 
husbands to be more assertive relative to their 
wives than were the other husband groups. 
The second set of interpretations of these 
findings takes the position that external-high 
trust husbands and internal-low trust wives 
represent personality patterns opposite to the 
conventional stereotypes of husband and wife 
role behavior. The external-high trust husband 
does not match the cultural norm of the 
active male mastering his environment and 


continuum from the “macho” male, the 
external-high trust husband is fairly unassertive 
and yielding in arguments with his wife. 
Nonnormative in the opposite direction are 
internal-low trust wives. Instead of assuming 
a relatively passive and trusting stance 
toward the world, these women combine 


their lives and a sense of skepticism about 
other people’s intentions. Women with this 
orientation are apt to relate to their husbands 
in an assertive manner, arguing vigorously 
and persistently for their viewpoints. 

In sum, a picture emerges of the internal-low 
trust wife that is exactly opposite that of 
the external-high trust husband. Whereas 
he tends to be passive and uninvolved in 
marital disagreements, she tends to be active 
and heavily involved. Unfortunately, because 
of small cell sizes in the present study, couples 
with an external-high trust husband and an 
internal-low trust wife could not be compared 
with other husband-wife matches. 


Conclusion 


Findings have been reported here for a 
sample of recently married couples without 
children. Whether the same results would 
hold for couples married longer is problemat- 
ical. In addition to sampling issues, a further 
note of caution pertains to the small amount 
of variance accounted for in most of the sta- 
tistical analyses. Accounting for such low 
levels of variance, however, is not surprising 
in light of the notorious difficulty involved in 
predicting observed behavior from paper-and- 
pencil personality instruments. 

Within these limits, this study suggests 
that there may still be life in personality and 
marriage research. For the first time, cognitive 
personality variables have been related to observed 
marital interaction behavior. In addition 
to breaking new ground in personality 
and marriage research, the present study has 
identified two interesting types of married 
persons, external-high trust husbands and 
internal-low trust wives. Following a strategy 
suggested by Rotter (1975), it may be 
useful in the future to employ a measure of 
locus of control that is specific to marital out- 
comes. In this way more powerful predictions 
may be possible concerning some interesting 
marital behavior patterns and attitudes, and 
some new understandings may be reached 
about the role of individual differences in 
marriage. 
An Examination of Self-Perception Mediation 
of the Foot-in-the-Door Effect 


William DeJong 
Brandeis University 


In 1966, Freedman and Fraser demonstrated that an individual is more likely 
to comply with a large request for help if that person has previously agreed to 
an initial small request—a phenomenon they called the “foot-in-the-door” ef- 
fect. In the present survey, studies that have sought to replicate the foot-in- 
the-door effect are reviewed. The adequacy of a self-perception explanation for 
the foot-in-the-door effect is assessed by examining (a) the importance of the 
size of the initial request; (b) the effect of noncompliance with the initial re- 
quest; (c) the impact of salient external justifications for the initial act of 
compliance; (d) the impact of social labels on subsequent levels of compliance; 
and (e) attempts at actually measuring changes in self-perception. Alternative 
explanations of the foot-in-the-door effect are considered and rejected, and di- 
rections for future research are outlined. 


In 1966 Freedman and Fraser tested the 
notion that once an individual has complied 
with a small, sometimes trivial request, that 
person will be more likely to comply with a 
larger and more substantial demand made in 
the future, an effect they christened the 
“foot-in-the-door” phenomenon. During the 
last 12 years, social psychologists have continued 
to be fascinated by those results. How 
is it that the simple act of assenting to a small 
request for help can dramatically increase, 
even double, the probability of a person’s 
agreeing to give help in the future? 

Because the foot-in-the-door paradigm has 
become an important vehicle by which to 
study the link between self-concept and behavior, 
it is important to take a step back and 
ask some basic questions about the progress 
of this research: (a) Is the foot-in-the-door 
effect a reliable one? Under how wide a range 
of situations has it been demonstrated? (b) 
Self-perception theory (Bem, 1972) is most 
often used to explain the effect. What hypoth- 
eses can be derived from this explanation, and 
what evidence has been gathered for each of 
them? (c) How else can the phenomenon be 
explained? What is the evidence for and 
against the alternative propositions that have 
been offered? (d) What directions can future 
research take? It was in the hope of answer- 
ing these questions that the present article 
was designed. First, I begin with a review 
of the Freedman and Fraser (1966) experi- 
ment. 


The Freedman and Fraser Study 


The procedure of their experiment was 
straightforward. A male experimenter first 
contacted several suburban housewives in 
their homes, identifying himself as a member 
of either the Community Committee for Traf- 
fic Safety or the Keep California Beautiful 
Committee. The initial requests he made of 
these women were small and innocuous, and 
almost all of the subjects agreed to them. 
Half were asked to display a small sign in the 
front window of their homes. The other subjects 
were asked to sign a petition advocating 
certain legislation. These requests were concerned 
with one of two issues, driving safety 
or making California more beautiful. 

Two weeks later, the housewives were 
again contacted. A different experimenter, 
who claimed to be a representative from the 
Citizens for Safe Driving, asked the women 
if they would be willing to have a large, ugly 
billboard reading “Drive Carefully” installed 
in their front yard for a period of one week. 
Thus emerged a simple two-way factorial de- 
sign. For half of the subjects this second re- 
quest was like the first in terms of the action 
required of them (i.e., displaying a sign). For 
those who had previously signed a petition, 
the action now required of them was quite 
different. As a cross-dimension, the issue of 
concern was the same as before for half of the 
subjects (i.e., driving safety). Those originally 
approached by the Keep California 
Beautiful Committee were now dealing with 
a different issue. 

Freedman and Fraser demonstrated a foot- 
in-the-door effect of remarkable strength and 
generality. Each of the four experimental con- 
ditions produced significantly greater compli- 
ance with the final request than did a control 
condition in which subjects were not ap- 
proached with a first request: Only 20% of 
the subjects in that control condition were 
willing to have the billboard installed, com- 
pared with 55% of the experimental subjects. 
To be sure, those subjects for whom the sec- 
ond request was similar to the first along the 
two dimensions of issue of concern and mode 
of action were somewhat more helpful than 
subjects in the other three experimental con- 
ditions, but that difference did not approach 
statistical significance. This increased compliance 
effect did not seem to depend much on 
the type of request that had been made initially. 
Even the seemingly trivial action of 
agreeing to sign a petition about the need to 
“keep California beautiful” more than 
doubled the probability that a subject would 
agree to display a billboard that exhorted 
motorists to drive more safely. The exciting 
feature of these results is that the effect 
cannot be explained merely in terms of either 
the subject’s involvement with a particular 
experimenter or her increased commitment to 
a certain issue or mode of action. As Freed- 
man and Fraser recognized, a more compli- 
cated (and psychologically interesting) ex- 
planation is required. 


Attempts at Replication 


Since Freedman and Fraser's (1966) orig- 
inal report, several replications of the foot-in- 
the-door effect have been attempted. Before 
this research is reviewed, the criteria that any 
study must meet before it can be considered 
a valid replication attempt should be listed: 

1. Obviously, a proper control group, con- 
sisting of subjects not receiving an initial re- 
quest, must be included as part of the experi- 
mental design. Three studies must be excluded 
for this reason: Harris, Liguori, and Joniak 
(1973); Harris and Samerotte (1975); and 
Schmidt (1973). The first two, in all fairness, 
were not explicitly designed to test for a foot- 
in-the-door effect. 

2. The data analysis must include all sub- 
jects assigned to the experimental condition 
and not just those who agree to the first re- 
quest. Any analysis that excludes those not 
complying with the first request can be criticized 
on grounds of differential subject self-selection. 
Studies by Arbuthnot et al. (1976-1977) 
and Beaman, Svanum, Manlove, and 
Hampton (1974) must be excluded from con- 
sideration because of this problem. 

3. High compliance with the initial request 
must be obtained. Harris, Liguori, and Stack 
(1973, Study 3), for example, failed to obtain 
a significant foot-in-the-door effect, but inter- 
pretation of their findings is made problematic 
by the fact that only 42% of their experi- 
mental subjects agreed to the first request 
(see also, Dutton & Lennox, 1974). Ideally, 
of course, 100% of the subjects would con- 
sent to the first request, but that goal is 
rarely obtained. Some studies have reported 
a significant foot-in-the-door effect with a 
rate of initial compliance as low as 6 
(e.g., Freedman & Fraser, 1966, Study 1). 

It should be noted that in testing for significant 
foot-in-the-door effects, each experimental 
group should be compared individually 
with the control group. Some studies (e.g., 
Cann, Sherman, & Elkes, 1975; Harris, 1972, 
Study 1) combined experimental groups in 
their analyses, and the data from such studies 
have been reanalyzed when possible. 
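
Comparing one experimental group with the no-initial-request control group amounts to a test of two independent proportions. The sketch below uses a pooled two-proportion z test as one common choice; the review does not say which statistic was used in the reanalyses, and the counts are hypothetical.

```python
from math import erfc, sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z statistic and two-sided p for the difference of two proportions.

    x1 of n1 experimental subjects and x2 of n2 control subjects
    complied with the second request.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2.0))  # two-sided normal tail area
    return z, p_value

# Hypothetical counts: 55 of 100 experimental versus 20 of 100 control.
z_stat, p_val = two_proportion_z(55, 100, 20, 100)
```

Running the comparison separately for each experimental group, rather than pooling groups first, keeps a strong condition from masking a weak one.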

Table 1 summarizes known replications of 
the foot-in-the-door phenomenon that had 
been attempted by the time of this review 
(November 1978). The table lists the two 
requests made of the subjects, the rate of 
compliance with the second request for both 
the experimental groups and the no-initial- 
request control group, and the conclusions 
drawn from appropriate statistical compari- 
sons. 

A glance at the table shows that a significant 
foot-in-the-door effect was not always 
found. It is unwarranted, however, to declare 
that the effect cannot be reliably obtained. 
It must be noted that only a few studies 
(Cann, 1976; Cialdini & Ascani, 1976; Cial- 
dini, Cacioppo, Basset, & Miller, 1978; De- 
Jong & Funder, 1977, Study 1; Harris & 
Samerotte, 1976; Miller & Suls, 1977; Tipton 
& Browning, 1972; DeJong, Note 2) showed 
the percentage of compliance with the second 
request for one or more experimental groups 
to be lower than or equal to that for the control 
group. Thus, almost all of the failures to 
replicate were in the predicted direction, but 
did not reach traditional levels of statistical 
significance. And as noted previously, two 
failures to replicate (Dutton & Lennox, 1974; 
Harris, Liguori, & Stack, 1973, Study 3) are 
difficult to interpret because of the very low 
levels of compliance with the first request in 
the experimental group. 

The subsequent discussion of many of these 
studies reveals other potential explanations 
for failures to replicate. They can be mentioned 
here briefly. First, the initial request 
must be large enough to cause people to think 
about the implications of their own behavior; 
if the request is too small, the effect will prob- 
ably not be obtained (Seligman, Bush, & 
Kirsch, 1976). Thus, for example, it may be 
that a request to take a small card is too 
small to increase compliance with a high-cost 
request like donating blood (Cialdini & As- 
cani, 1976). Moreover, some evidence sug- 
gests that if the initial request is too large, 
the probability of obtaining the effect will be 
reduced (Baron, 1973; Miller & Suls, 1977). 
Second, people must feel that their initial 
compliance resulted from the exercise of free 
choice, not because of pressure to comply 
(Uranowitz, 1975; Zuckerman, Lazzaro, & 
Waldgeir, in press). Fish and Kaplan’s 
(1974) findings are difficult to interpret for 
this reason. Subjects in their experiment were 
first asked to write an essay on various ways 
of fighting poverty, but they were asked to do 
this as part of a class about the poverty pro- 
gram. It is possible, then, that the subjects 
may have felt pressured into writing the 
essay because they saw it as a class assign- 
ment. 

A key issue of concern is the generality of 
the foot-in-the-door phenomenon. Under how 
wide a range of situations has it been demon- 
strated? As Table 1 reveals, most of the re- 
quests that have been made of subjects in- 
volved little time or effort. But some did in- 
volve major commitments on the part of the 
subjects. For example, Lowman (1973) used 
the foot-in-the-door technique to increase 
household participation in a complicated trash 
recycling program. In one study, Freedman 
and Fraser (1966, Study 1) induced house- 
wives to allow a survey team of five or six 
men to come into their homes to classify the 
household products they used. 

The length of the delay between the two 
requests has also been varied. The foot-in-the- 
door effect has been demonstrated with delays 
between the two requests of up to two weeks 
(e.g., Freedman & Fraser, 1966, Study 2; 
Harris, 1972, Study 2), but most successful 
replications have involved much smaller de- 
lays of a day or two. The effect of delay has 
never been systematically explored, save Cann 
et al.’s (1975, Study 1) comparison between 
delay (7-10 days) and no-delay procedures 
(see Table 1). The length of the delay itself 
may be less important than whether the occa- 
sion of the second request somehow reminds 
people of their earlier behavior. 

The degree of similarity between the two 
requests is an important dimension as well. 
Unfortunately, most investigators have em- 
ployed requests that share many common fea- 
tures, so that the robustness of the foot-in- 
the-door phenomenon has rarely been put to 
Table 1 
Replications of the Foot-in-the-Door Effect 

Baer, Goldman, & Juhnke (1977) 
  First request: (1) Give the time to experimenter; (2) Same as (1); later 
    misinformation given to different experimenter 
  Second request: Correct misinformation given to experimenter by another 
    elevator passenger 
  Results*: (1) 70; (2) 35; (C) 33 
  Conclusion: (1) FITD; (2) No FITD 

Baron (1973) 
  First request: (1) Accept leaflet on the dangers of pollution; (2) Sign 
    antipollution petition, get two friends to sign, and mail in 
  Second request: Agree to put 3 foot × 5 foot antipollution sign in front 
  Conclusion: (1) FITD; (2) No FITD 

Cann (1976) 
  First request: (1) Agree to receive a questionnaire on recycling in the 
    mail and fill it out 
  Second request: Volunteer time for a neighborhood cleanup project 
  Conclusion: (1) No FITD 

Cann, Sherman, & Elkes (1975, Study 1) 
  First request: (1) Answer three questions on driving habits; no delay 
    between the two requests; (2) Same as (1); 7-10-day delay between 
    requests 
  Second request: Agree to accept 15 pamphlets on traffic safety and 
    distribute to neighbors 
  Conclusion: (1) FITD; (2) No FITD 

Cann, Sherman, & Elkes (1975, Study 2) 
  First request: (1) Answer three questions on driving habits; no delay 
    between the two requests 
  Second request: Agree to accept 15 pamphlets on traffic safety and 
    distribute to neighbors 
  Conclusion: (1) No FITD 

Cialdini & Ascani (1976) 
  First request: (1) Take and display small card advertising blood drive 
  Second request: Agree to donate blood the next day 
  Conclusion: (1) No FITD 

Cialdini, Cacioppo, Basset, & Miller (1978, Study 2) 
  First request: (1) Agree to display a United Way window poster 
  Second request: Agree to pick up a United Way poster packet at dormitory 
    lobby 
  Conclusion: (1) No FITD 

Crano & Sivacek (Note 1, Study 1) 
  First request: (1) Answer 10 questions about beverages 
  Second request: Agree to answer 30 questions on driving habits 
  Conclusion: (1) FITD 

Crano & Sivacek (Note 1, Study 2) 
  First request: (1) Answer 10 questions on household products 
  Second request: Agree to answer 45 questions on the mass media 
  Conclusion: (1) No FITD 

DeJong (Note 2) 
  First request: (1) Agree to sign a petition for pro-disabled legislation; 
    learn they were one of many to sign; (2) Same as (1); learn they were 
    the first to sign 
  Second request: Notify a second experimenter that he dropped a quarter 
  Conclusion: (1) FITD; (2) No FITD 

DeJong & Funder (1977, Study 1) 
  First request: (1) Answer 15 questions on the quality of life in the 
    local community 
  Second request: Agree to answer 50 questions on highway safety 
  Conclusion: (1) No FITD 

Table 1 (continued) 

DeJong & Funder (1977, Study 2) 
  First request: (1) Answer 15 questions on quality of life in the local 
    community; receive letter acknowledging participation 
  Second request: Agree to answer 50 questions on highway safety 
  Results*: (1) 66; (C) 56 
  Conclusion: (1) No FITD 

DeJong & Musilli (Note 3) 
  First request: (1) Agree to participate in a 5-minute survey on parking 
    facilities for compact cars; experimenter appeared physically normal; 
    (2) Agree to participate in a 5-minute survey on parking facilities 
    for disabled drivers; experimenter appeared physically normal 
  Second request: Agree to participate in a 30-minute telephone survey on 
    highway laws and driving hazards 
  Results*: (1) 55; (2) 53; (C) 40 
  Conclusion: (1) FITD; (2) No FITD 

Dutton & Lennox (1974) 
  First request: (1) Give money to a white panhandler 
  Second request: Agree to donate time to various activities as part of an 
    interracial Brotherhood Week 
  Results*: (1) 54.84; (C) 46.2 
  Conclusion: (1) No FITD 

Fish & Kaplan (1974) 
  First request: (1) Write a short essay on ways of fighting poverty 
  Second request: Volunteer time and services to a welfare agency 
  Results*: (1) 36; (C) 33 
  Conclusion: (1) No FITD 

Freedman & Fraser (1966, Study 1) 
  First request: (1) Answer eight questions on household soaps; (2) Agree 
    to be in survey on household soaps 
  Second request: Agree to allow six-man survey team to enter home and 
    spend 2 hours classifying all household products 
  Results*: (1) 53; (2) 33; (C) 22 
  Conclusion: (1) FITD; (2) No FITD 

Harris (1972, Study 1) 
  First request: (1) Give directions; (2) Give the time 
  Second request: Give the experimenter a dime 
  Results*: (1) 39; (2) 44; (C) 11 
  Conclusion: (1) No FITD; (2) FITD 

Harris (1972, Study 2) 
  First request: (1) Write a letter to a minority high school student, 
    indicating willingness to answer questions about the university and 
    student life 
  Second request: Sign class list to volunteer time to a university 
    publicity campaign 
  Results*: (1) 18; (C) 9 
  Conclusion: (1) FITD 

Harris, Liguori, & Stack (1973, Study 3) 
  First request: (1) Allow name to be sent to local congressman as 
    supporter of organization’s programs 
  Second request: Agree to donate money or cookies to a fund-raising 
    baked-cookie sale 
  Results*: (1) 30; (C) 25 
  Conclusion: (1) No FITD 

Harris & Samerotte (1976, Study 1) 
  First request: (1) Watch experimenter's possessions; a theft attempt is 
    later thwarted by the subject; (2) Same as (1); the second request is 
    made by a different experimenter 
  Second request: Give money to experimenter to permit the purchase of food 
  Results*: (1) 20; (2) 40; (C) 35 
  Conclusion: (1) No FITD; (2) No FITD 

(table continued) 
Table 1 (continued)


Study: Harris & Samerotte (1976, Study 2)
First request:
  (1) Watch experimenter’s possessions; a theft attempt is later thwarted by the subjects
  (2) Same as (1); the second request is made by a different experimenter
  (3) Watch experimenter’s possessions; no theft attempt is made
  (4) Same as (3); the second request is made by a different experimenter
Second request: Give money to experimenter to permit photocopying of an article

Study: Lowman (1973)
First request:
  (1) Answer four questions on recycling and container use
Second request: Agree to participate in a glass and metal trash recycling program

Study: Miller & Suls (1977)
First request:
  (1) Give directions that are difficult to explain
  (2) Give directions that are simple to explain
Second request: Help a male experimenter pick up dropped groceries

Study: Pliner, Hart, Kohl, & Saari (1974)
First request:
  (1) Wear pin to advertise a fund drive
  (2) Wear pin and persuade member of family to do so
Second request: Contribute money to the fund

Study: Reingen & Kernan (1977)
First request:
  (1) Agree to participate in a 5-question survey on household products
Second request: Agree to participate in a 20-question survey on household products

Study: Seligman, Bush, & Kirsch (1976)
First request:
  (1) Answer 5 questions on the energy crisis and inflation
  (2) Answer 20 questions
  (3) Answer 30 questions
  (4) Answer 45 questions
Second request: Agree to answer 55 more questions for the same survey

Study: Seligman, Miller, Goldberg, Gelberd, Clark, & Bush (1976)
First request:
  (1) Listen to a 2-minute, pro-McGovern speech; agree to display a small campaign sign
  (2) Agree to display a small campaign sign only
  (3) Listen to a 2-minute speech on fire prevention; agree to display a small fire prevention sign
  (4) Agree to display a small fire prevention sign only
Second request: Agree to display a McGovern poster in front window

Study: Snyder & Cunningham (1975)
First request:
  (1) Agree to answer 8 questions on household paper products or on traffic safety
Second request: Agree to answer 30 questions for the other organization


(1) 75 
(C) 58 


(1) 38 
(2) 35 
(3) 74 
(4) 74 
(C) 31 


(1) 38 
(2) 23 
(3) 30 
(4) 31 
(C) 16 


(1) 52 
(C) 33 


Conclusion


No FITD 
No FITD 
No FITD 
No FITD 


FITD 


FITD 
FITD 

No FITD 

No FITD 

No FITD 
FITD
FITD 


FITD 

No FITD 

No FITD 
No FITD 
Table 1 (continued)


Study: [illegible]
First request:
  (1) Help an elderly woman [illegible] dropped [illegible]
Second request: Help a young woman in a wheelchair up over a curb
Results*: (1) [illegible]; (C) 36
Conclusion: (1) No FITD

Study: Uranowitz (1975)
First request:
  (1) Watch experimenter’s shopping bags while he retrieves a dollar bill
  (2) Watch experimenter’s shopping bags while he retrieves his wallet
Second request: Notify a second experimenter that she dropped her package
Results*: (1) 80; (2) 45; (C) 35
Conclusion: (1) FITD; (2) No FITD

Study: Zuckerman et al. (in press)
First request:
  (1) Agree to participate in a 5-minute survey on traffic safety
Second request: Agree to participate in a 20-minute survey on household products
Results*: (1) 64; (C) 45
Conclusion: (1) No FITD


Note. The first condition in a study is indicated by (1), the second condition by (2), and so on;
(C) = the no-initial-request control group; FITD = the foot-in-the-door effect.

* Results are given as the percentage of subjects complying with the second request. A chi-square test
(corrected for continuity) was executed for comparisons between individual experimental groups and the
control group. If p < .10, it is concluded that the effect was successfully replicated. If a
[illegible] by the authors, the conclusions based [illegible]
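
The note to Table 1 refers to a chi-square test, corrected for continuity, comparing each experimental group with the control group, with p < .10 taken as a successful replication. The following is a minimal sketch of that kind of comparison for a 2 x 2 table. The function name and the group sizes are assumptions made for illustration only: the table reports percentages, not sample sizes, so the counts below (40 subjects per group) are hypothetical and do not reproduce any study's actual test.

```python
import math


def yates_chi_square(comply_exp, n_exp, comply_ctl, n_ctl):
    """Continuity-corrected (Yates) chi-square for a 2 x 2 table, df = 1.

    Rows are experimental vs. control group; columns are
    complied vs. did not comply with the second request.
    Returns (chi_square, p_value).
    """
    a, b = comply_exp, n_exp - comply_exp
    c, d = comply_ctl, n_ctl - comply_ctl
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be positive")
    # Yates correction: shrink |ad - bc| by n/2, never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of a chi-square variate with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts: 80% vs. 35% compliance with 40 subjects per group.
chi2, p = yates_chi_square(32, 40, 14, 40)
replicated = p < 0.10  # criterion stated in the table note
print(f"chi-square = {chi2:.2f}, p = {p:.5f}, FITD: {replicated}")
```

Because the outcome of the test depends on group size as well as on the percentages, two conditions with similar compliance rates can receive different conclusions in the table.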


As noted before, possible explanations for the
effect centering around subjects’ commitment
to a particular experimenter, organization,
issue, or mode of action were effectively
ruled out. To account for their unexpected
results, Freedman and Fraser (1966) offered
the following explanation:

What may occur is a change in the person’s
feelings about getting involved or about
taking action. Once he has agreed to a
request, his attitude may change. He may
become, in his own eyes, the kind of person
who does this sort of thing, who agrees to
requests made by strangers, who takes action
on things he believes in, who cooperates with
good causes. (Freedman & Fraser, 1966, p. 201)


Self-Perception Explanation


The self-perception explanation of the
foot-in-the-door phenomenon involves a
two-stage process. First, people observe
their own behavior and the circumstances in
which it occurs and, from those data, make
an inference about their own dispositions and 
attitudes. It has been argued that subjects 
comply with the initial request in the ap- 
parent absence of any external inducements 
and then decide on the basis of that evidence 
that they are the kind of people who cooper- 
ate with good causes or help out other 
people. Second, this changed self-perception 
is thought to increase the probability of their 
performance of similar kinds of action in the 
future. As presently articulated, self-percep- 
tion theory does not adequately explain the 
processes that mediate this last step (cf. Bem, 
1972). 

The self-perception explanation of the foot- 
in-the-door phenomenon has led to the formu- 
lation of several testable hypotheses. The 
evidence for each of these is now considered. 


Size of the Initial Request 


A straightforward prediction derived from 
self-perception theory is that the larger the 
size of the first request to which people agree, 
the greater the probability of their compliance 
to the second larger request. In other words, 
the greater the costs of compliance with the 
initial request, the more likely it should be 
for people to find dispositional meaning in 
their behavior. To test this hypothesis, Selig- 
man, Bush, and Kirsch (1976) first asked 
subjects to answer either 5, 20, 30, or 45 
short questions for a survey on reactions to
the energy crisis and inflation. There were no 
differences among those four conditions with 
regard to compliance with the first request for 
help. Later, subjects were called back by a 
different experimenter representing the same 
survey group and were asked to answer 55
more questions. Unexpectedly, subjects in the 
5- and 20-question groups were not more 
likely to answer the 55 questions than were 
control subjects contacted for the first time. 
However, the two larger initial requests were 
effective in inducing significantly higher rates 
of compliance with the second request. 

Seligman, Bush, and Kirsch recognized the 
possibility that sufficiently large initial re- 
quests might discourage subsequent compli- 
ance, but evidence for that was not found in 
their study. It should be noted that subjects
in all cases were told the survey would require
only “a couple of minutes,” thus making even
the larger requests seem relatively small. A
study by Miller and Suls (1977) suggests the
possibility of a curvilinear relationship
between size of the initial request and
subsequent compliance. Subjects in their
experiment were asked to give street directions
to a first experimenter, the directions being
either difficult or simple to explain. Compared
with control group subjects, those who gave the
simple directions were more likely to help a
second experimenter pick up his dropped
groceries a few moments later. Those who gave
the difficult directions were less likely to
help. Existence of a curvilinear relationship,
of course, could not be explained by
self-perception theory alone.


Agreement to the Request Versus
Actual Performance


It could also be predicted from self-perception
theory that compliance with the second request
would be more likely if subjects actually
performed what was asked of them in the initial
request than if they agreed to comply but were
never called upon to carry out their promise.
The only study that directly addressed this
issue is an experiment reported by Freedman and
Fraser (1966, Study 1). Housewives in the
so-called agree-only condition were asked to
participate in a consumer survey about
household products; those willing to help were
told the survey would be conducted at a later
time. Subjects in the performance condition
actually participated in the survey. In a later
call, the same experimenter asked subjects if
they would permit a

1 Self-perception theory has not articulated well
the exact nature of the self-perception changes that
are said to occur. It is not clear, in the case of most
foot-in-the-door studies, whether subjects classify
their behavior as compliance or as helping. This
uncertainty is due to the fact that most experimenters
use compliance with a request for help as their
dependent measure (see Table 1). Experiments that
instead require subjects to initiate help giving (e.g.,
Miller & Suls, 1977; Uranowitz, 1975) suggest that
most subjects may view their assent to the initial
request as helpfulness.


survey team to come into their homes to list
the household products they used. Whereas
22% of the control group subjects agreed to
that request, over half the subjects in the
performance condition did so. In contrast, a
mere 33% of the agree-only condition subjects
complied with that request. A clear
theoretical interpretation of these results is
made difficult by the fact that subjects in the
agree-only condition did not learn the exact
size of the request to which they had agreed.
Their compliance to the second request may
have been lower only because they guessed
the initial request to be quite large (cf. Miller
& Suls, 1977). It should be
noted that other experimenters have shown
that actual performance of the initial request
is not requisite in demonstrating the
foot-in-the-door effect (e.g., Snyder & Cunningham,
1975).

Effect of Noncompliance With the 
Initial Request 


Self-perception theory led Snyder and Cun- 
ningham (1975) to predict that subjects in- 
duced not to comply with an initial request 
would be less likely to comply with a second 
request. In their experiment, one group of
subjects was first asked to participate in a 
50-question telephone survey, a request suf- 
ficiently large to guarantee almost universal 


sexed experimenter representing a second 
organization asked subjects to answer 30
questions. Consistent with their prediction,
subjects initially approached with the large
request were less compliant with the second
request than were subjects in the control
group. Similar results were reported by Cann
et al. (1975) and Reingen and Kernan
(1977). These studies support the idea that
induction of initial noncompliance leads
people to perceive themselves as the kind of
people who do not comply with such requests.

One might predict that the smaller the
request that people refuse, the lower the
probability of their compliance to a second
request; this proposition remains to be tested.
An added complication arises when the timing
of the second request is considered. In
two studies, Cann et al. (1975) reported that
noncompliance with a large initial request
actually led to greater compliance when the
second request was made immediately after
the first, rather than after a delay period of
several days. For example, one study showed
that 90% of the subjects agreed to an im- 
mediate second request after having refused 
the experimenter’s initial large request, com- 
pared with 50% in the control group. Only 
29% compliance was obtained when that sec- 
ond request was made 7-10 days later. 

These findings replicate an effect first dem- 
onstrated by Cialdini et al. (1975), which 
they dubbed the “door-in-the-face” technique. 
Quite simply, these authors claimed that if 
an experimenter first approaches subjects 
with an unreasonable request that is sure to 
be refused, but then immediately asks a 
smaller favor, subjects will view the experi- 
menter’s action as a concession. The subjects, 
in turn, will feel normative pressure to recip- 
rocate that concession, and they will respond 
to that pressure with compliance with the 
second request. Replications of this effect 
abound (Cialdini & Ascani, 1976; Miller, 
1974; Miller, Seligman, Clark, & Bush, 1976; 
Reingen, 1977). 

In one experiment, Cialdini et al. (1975) 
demonstrated that the subjects’ perception 
of the encounter as a kind of bargaining situa- 
tion probably underlay the effect. In the re- 
jection-moderation condition, both requests 
for help were made by the same experimenter. 
In the so-called two-requester condition, sub- 
jects’ noncompliance with the first request 
was followed immediately by another request 
put to them by a second experimenter. This 
second experimenter was seemingly unrelated 
to the first, but subjects knew this person had 
overheard their conversation with the first 
experimenter. Whereas a control procedure 
produced a 31% compliance rate, 55% of the 
rejection-moderation condition subjects con- 
sented to the second request. But when that 
same second request was made by a different 
experimenter, only 10% of the subjects in 
the two-requester condition complied, a re- 
sult consistent with those reported by Snyder 
and Cunningham (1975). The authors argued 
that subjects in the latter condition could 
not interpret the second request as a
concession, simply because it was made by a
different experimenter.²

Why do subjects in the door-in-the-face
procedure who do not comply with the initial
large request not come to see themselves as
the kind of people who refuse such requests,
as Snyder and Cunningham’s (1975) results
might lead us to expect? Even-Chen, Yinon,
and Bizman (1978) argued that the initial
requests used by Cialdini et al. (1975) and
others were too large for subjects to draw
dispositional inferences from their failure to
comply. Implicit in their argument is the
suggestion that subjects must believe the request
is one to which someone (though not they)
might consent in order for a self-perception
change to occur. In their study, when subjects
refused a large, but not unreasonable, first
request, they were less likely to agree to a
second smaller request. Only those who refused
an extremely large request showed a subsequent
door-in-the-face effect. These results
are suggestive, but Even-Chen et al.’s argument
cannot account for the results obtained
by Cialdini et al. In the earlier experiment,
whether a door-in-the-face effect or a result
consistent with self-perception theory was
obtained depended solely on who made the
second request.

In general, research on the effect of initial
noncompliance has been supportive of the
self-perception hypothesis. Only under a
special set of circumstances, that is, when the
first request is extremely large and the second
request can be viewed as a concession on the
part of the experimenter, does an initial
refusal not lead to subsequent refusals.


Effect of Consensus Information


The self-perception analysis of the
foot-in-the-door effect leads to the prediction that if
subjects were informed they were not unique
in their compliance with the initial request,
they would be induced to see their behavior
as determined by situational pressures and
not to view it as having any implications for
their own traits or attitudes (cf. Cook, Pallak,
& Sogin, 1976; Cooper, Jones, & Tuller,
1972). In one recent experiment designed to
test this prediction (DeJong, Note 2),
shoppers were approached individually by a
physically able male experimenter and shown a
card on which was written the resolution of a
petition formulated by a group called the
Upper Valley Foundation to Help the
Handicapped. After reading the resolution, 87% of
the subjects agreed to sign the petition. At
that point, some subjects were shown a
petition with many signatures and one blank
space at the bottom (consensus condition).
As they signed, the experimenter said, “As you
can see, almost everyone we've asked has
signed our petition.” The other subjects were
shown a completely blank petition, and as
they signed, the experimenter noted that they
were the only ones so far to do so
(nonconsensus condition). When the subjects walked
on, a second male experimenter dropped a
quarter as he walked some 15 to 20 feet ahead
of them. Thirty-two percent of the subjects in
the no-initial-request control group notified
the second experimenter of his loss, whereas
64% of those told that many people had
signed the petition did so. Unexpectedly, only
28% of those told they had been unique in
their agreement to sign helped the second
experimenter.

Consensus information is more psychologically
complex than attribution theorists have
admitted (cf. McArthur, 1973). In addition
to pointing to the power of situational
constraints on behavior (Kelley, 1967),
consensus information can help define the behavioral
norms that are operative in a specific context.
Subjects in the consensus condition may have
helped the second experimenter more often
because they were reminded that helping
others was appropriate behavior. In addition,
consensus information can reaffirm the
reasonableness of one’s own behavior. Subjects
in the nonconsensus condition may have been
shocked to learn of their unique status and
may literally have been “lost in thought”
when the second experimenter dropped his
quarter. To date, no foot-in-the-door study
involving the introduction of consensus
information has yielded results consistent with
self-perception theory.


2 Pendleton and Batson [illegible] a self-presentation
explanation of the door-in-the-face technique,
suggesting that subjects are motivated to comply
with the second request to avoid being perceived
[illegible]. Pendleton and Batson argued that
the phrasing of this [illegible] (“But maybe you
could help me”) may have [illegible] controversy.


Presence of External Justification 


Theoretical analyses of attributional pro- 
cesses (Kelley, 1967) propose that people 
assign dispositional meaning to behavior after 
a careful assessment of the possible explana- 
tory power of controlling influences in the 
environment. Thus, a self-perception account 
of the foot-in-the-door phenomenon stresses 
that the amount of external pressure used to
induce compliance with the initial request is
paramount. If those external pressures provide
people with an adequate explanation of
their behavior, they will not be led to infer
anything about their own traits or attitudes.

The impact of external justifications for the
initial act of compliance on subsequent helping
was demonstrated by Uranowitz (1975).
Female shoppers were asked by an experimenter
to watch his grocery bags while he ran
back to retrieve a lost article. In the
[illegible] justification condition, he said
he had [illegible]
of course, was that subjects in the high-justi-
fication condition would believe that circum- 
stances demanded their compliance and that 
anyone faced with such a request would agree 
to it. When the first experimenter returned 
with the lost article, the subject proceeded on 
her way. A female experimenter later dropped 
a package in the subject’s path, and the sub- 
ject’s response was noted. The results were 
striking. Eighty percent of the subjects in the 
low-justification condition helped the second 
experimenter, whereas only 45% of the high- 
justification and 35% of the control subjects 
did so. 

Zuckerman et al. (in press) investigated 
this same problem using an experimental pro- 
cedure modeled after that used by Snyder and 
Cunningham (1975). Housewives were first 
asked to take part in a 5-minute telephone 
survey; some were promised a monetary pay- 
ment in exchange for their cooperation, and 
others were not. If a subject agreed to par- 
ticipate, she was told that the interview would 
be conducted at a later time. Subjects prom- 
ised the monetary payment were told they 
would receive a check after the interview.
Two or three days later, the subjects were
called by a second experimenter representing
a different service organization and were
asked to consent to a 20-minute interview.
Forty-five percent of the control group
subjects agreed to that request, whereas 64%
of the subjects promised no money consented 
to the interview. In contrast, only 33% of 
those promised a monetary payment in return 
for their initial compliance agreed to the sec- 
ond request. Similar results were reported by 
Reingen and Kernan (1977). 

DeJong and Funder (1977), however, 
found the monetary payment to have the op- 
posite effect. The day after their participation 
in a 15-question survey, subjects in the pay- 
ment condition received the $2 that had been 
promised them in exchange for their help.
Subjects were later asked to take part in a 
50-question survey being conducted by a sec- 
ond organization. Fifty-six percent of control 
subjects never before contacted agreed to the 
second request, whereas 78% of the subjects
in the payment condition did so. These
findings were unexpected. It was thought that the
actual receipt of money by subjects in the
payment condition would increase the salience 
of the external justification for their compli- 
ance with the first request. A follow-up study 
showed that this finding was not due to sub- 
jects’ expectation that they would be paid for 
helping with the second survey. When the
second caller informed subjects they could not 
be paid for their help, the pattern of results 
was virtually unchanged. One possible ex- 
planation for this unexpected set of results is 
that a letter that accompanied the payment 
may have labeled subjects as “doers” (cf. 
Kraut, 1973). That certification may have
had greater implications for subjects’
self-perceptions than did the small monetary
payment.

Finally, in a study conducted by DeJong 
and Musilli (Note 3), subjects were ap- 
proached at home with a small initial request 
made by a female experimenter who appeared 
to be either handicapped or physically normal. 
It was hypothesized that subjects approached 
by the disabled experimenter would feel psy- 
chological pressure to comply with her request 
and would, in turn, attribute their compliance 
to the fact of her disability. The experimenter 
asked half the subjects to answer questions 
for a survey on special parking facilities for 
disabled drivers for a group called Friends of 
the Handicapped. The others were told that 
the experimenter represented Friends of the 
Environment and that the questions con- 
cerned special parking for compact cars. Two 
days later, subjects were called on the tele- 
phone by a second experimenter and were 
asked to participate in a 30-minute survey on 
traffic safety for a different organization. The 
hypothesis was partially supported. Fifty-six 
percent of the subjects approached by the 
handicapped experimenter for Friends of the 
Handicapped complied with the second re- 
quest, whereas 40% of control subjects did so. 
However, only 41% of subjects approached 
by the handicapped experimenter for Friends 
of the Environment complied with the second 
request. 

In sum, evidence is generally consistent 
with the proposition that high external justi- 
fication for initial acts of compliance can dam- 
pen the probability of subsequent compliance. 
Only one study found no such effect at all
(DeJong & Funder, 1977), and there is a
plausible explanation for that failure. Little
evidence for an overjustification effect exists;
only Reingen and Kernan (1977) and
Zuckerman et al. (in press) reported results for a
high-justification condition to be actually
below those for a control group, and these
differences were small. One possible reason for not
finding overjustification effects is that subjects
in these studies were not preselected for their 
high level of initial intrinsic or altruistic mo- 
tivation. 

There may be circumstances under which 
high external justification for initial compli- 
ance actually leads to higher levels of subse- 
quent compliance. Such an outcome might be 
anticipated if people believe that behavior 
that is truly intrinsically motivated is more 
highly regarded than that which is thought to 
arise in response to external pressure (cf. 
Nemeth, 1970). Thus, when subjects’ self- 
esteem (or their good name) is threatened by 
the perception that they have complied in re- 
sponse to the dictates of external contin- 
gencies, they may be motivated to be more 
helpful in the future when such pressures are 
absent. A report by Upton (1973) is con- 
sistent with this notion. He found that pre- 
vious blood donors classified as intrinsically 
motivated did not respond well to an appeal 
for donations that offered money in exchange 
for a pint of blood. When no such bribe was 
offered, the percentage of such subjects will- 
ing to donate was almost 30 percentage points 
higher. It is conceivable that subjects offered 
the bribe wished to avoid the perception that 
they were motivated to help others only as a
means of garnering rewards for themselves. 


Effect of Social Labels 


Sociological theories of deviance stress that
society encourages those whom it labels as
deviant to learn and accept a deviant role
identity. Once such people come to share this
definition of themselves, the changed
self-image is believed to sustain their deviant
behavior (see Schur, 1971). Although these
theories focus almost exclusively on labels of
deviance and generally ignore positively
valenced labels, their basic propositions can still
easily be translated into the language of
self-perception theory. Though actual behavior
(and the context in which it occurs) may
provide the clearest evidence about one’s traits
or attitudes, self-perception theory recognizes
that a self-definition or label provided by
others may serve as an important source of
information about one’s dispositions (cf.
Miller, Brickman, & Bolen, 1975) or at least
may signal to people that their behavior and
its implications for their self-image should be
reflected on, thereby energizing a
self-perception analysis.

This proposition was put to the test by
Kraut (1973). First, a male experimenter
went to subjects’ homes and solicited
contributions for the Heart Association. If
subjects made a donation, they were randomly
assigned to be in a labeled or a nonlabeled
condition. Subjects receiving the charitable
label were simply told, “You are a generous
person. I wish more of the people I met were
as charitable as you.” Subjects who did not
contribute were also assigned to a labeled or a
nonlabeled condition. Subjects receiving the
label were frankly told that they were
“uncharitable.” One or two weeks later, a second
experimenter contacted the subjects and asked
for a donation to a fund-raising drive for
multiple sclerosis. (Unfortunately, a control
group consisting of subjects receiving only the
second solicitation was not run.) As Kraut
predicted, 62% of those given the positive
label donated the second time, whereas only
47% of the nonlabeled subjects did so. Kraut
recognized that this effect could be explained
in terms of a social reinforcement model if
the charitable label constituted a reward for
the subjects. He also found that those who did
not donate the first time gave somewhat less
when they had been branded as uncharitable
by the first experimenter, but this difference
was not significant.

The impact of labels on subsequent behavior
was examined in a study conducted by Paulhus,
Shaffer, and Downing (1977). Prior to making
their donations, blood donors read
communications that emphasized different
motives for giving blood. Half the subjects
read a communication that emphasized
altruistic motives, and as a cross-dimension, half
the subjects were told about the personal 
benefits they would receive (e.g., free blood 
in case of emergency). After donating, sub- 
jects filled out a questionnaire that included 
an item on their plans for giving again within 
the next year. The results showed a main 
effect for salience of altruistic motives, such 
that subjects led to feel they had acted altru- 
istically reported a greater likelihood of fu- 
ture donation. 

McArthur, Kiesler, and Cook (1969) 
labeled some subjects as doers (people who 
know what needs to be done and then take 
the appropriate action) on the basis of bogus 
test results. This positive label increased the 
proportion of subjects willing to help dis- 
tribute antipollution leaflets only when sub- 
jects had been told that their doer personality 
qualified them to be paid for their participa- 
tion in a second experiment. Apparently, sub- 
jects’ belief that being a doer was a saleable 
quality strengthened the impact of the label, 
either because it made the feedback more con- 
vincing or simply because the subjects 
thought about it more. 

In contrast, Steele (1975) expected labels 
to have effects different from those predicted 
by self-perception theory. He argued that a 
negative label (name-calling) that impugned 
subjects’ character would motivate them to 
take action that would restore their self- 
esteem, thus increasing the probability that 
they would comply with a later request for 
help. Because positive labels would not do
this, Steele expected them to have little im- 
pact on help giving. In this study, an experi- 
menter claiming to be a pollster contacted 
subjects by telephone. In the relevant-nega- 
tive-name condition, subjects were lambasted 
for being “apathetic about the welfare of 
others,” whereas subjects in the irrelevant-
negative-name condition were criticized for
their lack of concern for driving safety. Re- 
cipients of the relevant-positive-name message 
were praised for their desire to help their 
fellow man. (No irrelevant-positive-name con- 
dition was run.) Two days later, subjects 
were called by a second experimenter repre- 
senting a food cooperative in a lower income 
neighborhood. To help the cooperative
achieve its goal of “aiding the less fortunate,” 
subjects were asked to compile a detailed list 
of the quantities and brands of foods and 
household items used in their homes. Compared
with a control group never before con-
tacted, subjects receiving the positive label 
were more likely to help, though not signif- 
icantly so (cf. Kraut, 1973). On the other
hand, virtually all subjects in both negative- 
name conditions promised to help out, and 
most actually did so. This effect was repli- 
cated in a second study. 

The name-calling procedure employed by 
Steele (1975) differs from the labeling proce- 
dure used by other researchers in one im- 
portant way. The labels used by others were 
based either on recent behavioral evidence 
(Kraut, 1973) or on the results of phony 
psychological tests (McArthur et al., 1969). 
In contrast, no such evidence substantiated 
the claim made by Steele’s experimenter. This 
suggests that whether the negative label in- 
duces a change in self-perception may depend 
on the degree to which subjects’ future be- 
havior can belie the label. If refutation of the 
label is still possible, it may not lead to a 
change in self-definition, but may lead to a 
vigorous effort to defend self-esteem or to 
assuage guilt. 

This possibility was explored in a recent
study by Gurwitz and Topol (1978). Stu-
dents at a large suburban university were
first contacted by telephone and accused of
not taking advantage of the opportunities
afforded by a nearby city. This accusation was
directed at subjects as members of a group
(students at the university) or as individuals.
Before the accusation was made, half the sub-
jects were asked how many times in the past
few months they had gone into the city, the
modal answer being zero. The other subjects
were not led in this way to provide evidence
in support of the accusation. Later that eve-
ning, a second experimenter had subjects fill
out a questionnaire concerning their interest
in a number of student-organized activities
in the city.

When subjects had not been led to provide
evidence in support of the accusation, the
degree to which they later belied the label
depended on whether the accusation had been
made about them as individuals or as mem-
bers of a group. When they were seen as part
of a group, they indicated more interest in the
activities, thus belying the label. In contrast,
when the subjects had been led to provide
such evidence prior to the accusation, they
were more likely to disconfirm the label when
it had been directed at them as individuals.
Similar results were found in a laboratory
study conducted by Gurwitz and Topol
(1978) in which subjects were labeled as be-
ing low in self-confidence. Why subjects’ re-
sponses to the label depended so greatly on
whether it was applied to them individually
or as members of a group is not clear.

In sum, although the effect of positive
labels has been fairly consistent across sev-
eral studies, the impact of negative labels on
subsequent behavior presents a more compli-
cated picture.


Measurement of Altered Self-Perceptions


Perhaps the most convincing evidence in 
support of the self-perception explanation of 
the foot-in-the-door effect would come from 
studies that actually measured the changes in 
self-perception that are thought to occur. Un- 
fortunately, this kind of direct evidence is 
hard to come by. A long catalog of excuses 
for this failure can be offered, most of them 
pointing to possible inadequacies in design 
and measurement (see Bem, 1972; Lepper,
1973). But the major difficulty may be that
the self-perception changes that follow a per-
son’s initial compliance with a small request
have never been clearly specified and may be
much more complicated than has been
acknowledged.

First, it may be unreasonable to expect the
subjects’ general perception of themselves to
be altered by the kinds of brief experiences
involved in these foot-in-the-door studies. In
fact, particular subjects may only come to
infer something about their attitudes or mo-
tives for a fairly limited set of situations
(e.g., telephone solicitations from strangers).
This notion is consistent with recent state-
ments of self-concept theory (Gergen, 1971;
McGuire & Padawer-Singer, 1976), which
emphasize that a person may harbor a variety
of self-definitions that differ in salience across
various situation contexts.

Second, measurement of these supposed 
changes in self-perception is made difficult 
by the fact that individuals are likely to code 
their behavior along different dimensions or 
into different groupings (Bem & Allen, 1974). 
Which behaviors constitute actions that are 
psychologically similar to the initial act of 
compliance is a function of each person’s view 
of the world. To give one example, some per- 
sons might code their behavior as compliance 
or as participating in a survey, whereas others 
might code it as helping someone in need. 

For these two reasons, investigators who
have hoped to show a general change in sub-
jects’ self-definitions seem to have had little
chance of success. Still, two experiments have
successfully demonstrated that extrinsic in-
centives for help giving can lead helpers to
describe themselves as less altruistically mo-
tivated. During the course of an experiment
on “first impressions,” Batson, Coke, Jas-
noski, and Hanson (1978) asked male under-
graduates to help an experimenter code data.
Payment for this help was offered before sub-
jects agreed to help (payment condition),
after the subjects’ agreement (payment
after), or payment was not mentioned at all
(no payment). The request for help was al-
ways made in the presence of a male con-
federate who never volunteered his services.
A control group received no such request.
Subjects were then asked to rate themselves
and the confederate on several dimensions, in-
cluding helpfulness and cooperativeness. As
expected, subjects who agreed to aid the ex-
perimenter in exchange for money rated them-
selves as less altruistic than the confederate,
whereas subjects in the other three conditions
rated themselves as more altruistic.

In an experiment conducted by Smith, Gel- 
fand, Hartmann, and Partlow (1979), second- 
and third-grade children played a marble 
drop game in which they could earn pennies 
toward the purchase of a prize. During the 
first part of the experiment, the children were 
given a number of opportunities, signaled by 
a light, to donate pennies to another child
playing the game in a nearby room. The ex-
perimenter also created situations in which 
the children could not make a donation (but 
would think they should have done so) by 
having the signal light go off before they 
could respond. Whereas some subjects were 
merely praised each time they donated a 
penny, others were given a monetary reward 
for doing so, and the relationship between 
their help giving and the reward was spelled 
out to them. Other subjects were scolded by 
the experimenter each time they did not 
avail themselves of an opportunity to help; 
another group of children were fined each 
time they failed to donate, and the contin- 
gency between their behavior and the punish- 
ment was made explicit. Finally, a control 
group did not receive any type of reward or 
punishment from the experimenter. Intensive 
interviews conducted by a second experi- 
menter with each subject showed that chil- 
dren who were rewarded or fined were more 
likely to attribute their help giving to external 
pressures than were children in the control 
group or than were those who received praise 
or a scolding. 


Alternative Explanations for the 
Foot-in-the-Door Effect 


Surprisingly few alternative explanations 
for the foot-in-the-door effect have been sug- 
gested. In part, this is because the self-per- 
ception analysis has proven to be a heuristic 
explanation, leading investigators to examine 
a wide variety of phenomena such as over-
justification effects and the impact of social
labels. Other explanations have not generated 
much research, but they do resurface period- 
ically. Three of these are considered. 


Adaptation Level 


It has been argued that the small initial 
request to which people agree establishes a 
new baseline against which the subsequent 
larger request is compared (e.g., Schmidt,
1973). In other words, the prior request
causes the magnitude of the second to be re-
defined, making it seem less extreme than it
otherwise would. It is not clear that this ex-
planation would lead one to predict a differ-
ence in subsequent compliance between those 
who refuse and those who agree to the initial 
request. The perceived relative size of the 
second request should not depend very much 
on whether people have complied with the 
first request. However, several experiments 
show foot-in-the-door effects to be stronger 
when those who refuse the first request are 
excluded from the data analysis (e.g., Snyder 
& Cunningham, 1975; DeJong & Musilli, 
Note 3). It should be noted that this explana- 
tion cannot easily account for the results of 
studies looking either at extrinsic justification 
for the initial act of compliance or at the im- 
pact of social labels. 

To lay this explanation permanently to 
rest, two types of studies could be done. First, 
subjects could be asked to rate the relative 
size of various requests under differing experi-
mental conditions. Second, the subsequent
compliance of those who are merely informed
of a request being made of others and those
who actually comply with that request could
be compared. An adaptation level explanation
would predict higher compliance from both of
those groups, compared with a control group
never before contacted, but self-perception
theory would not.


Salience of Social Norms 


Harris (1972) suggested that being asked
to perform an initial small request makes
people more aware of the norm of social re-
sponsibility, a norm that prescribes that one
should help those who are in need. Like the
adaptation level explanation, this argument
also leads to the prediction that those who
are only informed about the first request will
be more compliant with the second demand.
It would also lead us to expect a high rate of
compliance from those who refuse the initial
request, in contrast with available evidence.
The major problem with this explanation,
however, is that it, too, cannot account for
the studies on extrinsic constraints or labels.


Behavioral Consistency


It has also been argued that foot-in-the-
door demonstrations show that people try to
act consistently with the way they have be-
haved in the past. Does a person induced to
agree to a first request agree to a second in
order to sustain a consistent public image?
If the effect were due to subjects’ attempts at
such impression management, then a larger
foot-in-the-door effect should result when the
second request is made by the same experi-
menter or involves the same issue or action.
The importance of these kinds of variables
has not been adequately tested, but the avail-
able evidence suggests that similarity between
the requests on these dimensions is not ter-
ribly important (e.g., Freedman & Fraser,
1966).

A second version of this argument was sug-
gested by Brock (1969), who argued, in es-
sence, that people desire to be psychologically
consistent for the sake of a self-image, not a
public one. People have behaved in a certain
way in the past and continue to behave that
way in the future simply because they value
consistency. However, it is clear that to man-
age a self-image in the fashion Brock sug-
gested, people must first code their behavior
and understand its attributional meaning.
They must decide, essentially, whether the be-
havior reflects the pressures of extrinsic con-
straints or reflects their own dispositions or
attitudes. Of course, it may be that the self-
rewarding consequences of being consistent 
with one’s self-image are what mediate the 
relationship between changes in self-percep- 
tion and subsequent behavior. Evidence on 
this point is lacking. 


Conclusions 


1. It is concluded that the foot-in-the-door 
effect can be reliably obtained. Although the 
number of studies that have failed to demon-
strate it unequivocally is surprising, it must
be reemphasized that almost all of the studies
reviewed showed experimental results in the
predicted direction. Furthermore, a number
of plausible explanations for failures to repli-
cate can be offered.

2. In addition to the self-perception ex-
planation first offered by Freedman and
Fraser (1966), three alternative explanations 
for the effect have been suggested. It is con- 
cluded that these explanations are inadequate 
accounts of the foot-in-the-door literature. 

3. A number of theoretical derivations from 
the self-perception explanation of the foot-in- 
the-door effect have been outlined, and the 
evidence for most is found to be generally 
supportive. However, investigators have not
clearly specified the exact nature of the self-
perception changes that are said to mediate
the foot-in-the-door effect. Self-perception
theory must concern itself with how people
classify or code their own behavior and how
they are led to form a broad or highly specific
inference about their attitudes, traits, or dis-
positions. Echoing Bem’s (1972) admission,
it must also be underscored that self-percep-
tion theory does not fully explain the link be-
tween self-attribution and subsequent behav-
ior. Further work is needed to explore these
gaps in the self-perception analysis of the
foot-in-the-door effect.


... informational factors on attribution
... issues. First, manipulations of the
amount of thought subjects gave to their attributions and of a delay before
responding to attribution questions did not diminish the effect of salience on
attribution; in fact, the delay increased the effect. Second, recall of the stim-
ulus material was shown to be influenced by salience and by covariation in-
formation (consensus, distinctiveness, and consistency) and to be related to
attributions. These findings, together with theory and data
on comprehension and representation of linguistic material in memory, are used
to argue that salience is not simply a process by which people make attribu-
tions without giving much thought to them. Instead, salience effects reflect the
close relationships among the processes of comprehension, remembering, and
attribution, and the fact that attributional processing can take place at the time
of the encoding and storage of information, as well as at the time of its re-
trieval from memory.


The literature on the perception of causa- 
tion currently contains support for two dis- 
tinct and even opposed theories of the attribu- 
tion process. Kelley (1967, 1972) has pro- 
posed that the perceiver makes attributions 
after logically weighing information about the 
covariation of an effect with various possible 
causes. This covariation information may be
observed or assumed (Kelley’s, 1972, “causal
schema”), but in either case it is processed
more or less rationally, in a fashion resem-
bling the analysis of variance (ANOVA). Kel-
ley’s model of attribution has come to be
known as the ANOVA model for this reason.
On the other hand, several recent writers
(Pryor & Kriss, 1977; Taylor & Fiske, 1975,
1978) have argued that perceivers use a much
simpler information-processing strategy: at-
tributing causality to the most salient plausi-
ble cause instead of weighing many possible
causes to make a decision. Salience is used
synonymously with Tversky and Kahneman’s
(1974) use of “availability” to suggest some
factor that is literally prominent in the per-
ceiver’s field of view (Taylor & Fiske, 1975)
or that is easily retrievable from memory
(Pryor & Kriss, 1977).

A variety of studies have demonstrated
that both object salience and the information
specified by the ANOVA model can influence
perceivers in making causal assignments.
Taylor and Fiske (1978) present strong evi-
dence that in many everyday situations per-
ceivers give little thought to issues of causa-
tion, making attributions to salient stimuli
off the “top of the head.” These situations
are primarily routine ones that involve the
perceiver very little. On the other hand, in
psychological experiments where information
is available and instructions (or implicit de-
mands) urge the subject to consider evidence
carefully before making a response, perceivers
can respond to information in the fashion
Kelley suggests (McArthur, 1972).

SALIENCE AND COGNITION


Is it possible to identify the processes by 
which both salience and the types of informa- 
tion identified by Kelley influence attribu- 
tions? That is, going beyond the simple dem- 
onstration that different manipulations do in- 
fluence attributions, research can address the 
question of how they do. One line of research 
that may give insights into processes is that 
aimed at identifying points of demarcation 
between salience and ANOVA processes (cf.
Taylor & Fiske, 1978, Section VII). One pre-
diction is that salience effects might weaken
or vanish, and the more thoughtful ANOVA
processes take over, when the attributor is
highly involved in the situation about which
he or she is making causal judgments. Taylor,
Crocker, Fiske, Sprinzen, and Winkler (1979)
attempted to moderate salience effects on
attributions by a variety of manipulations. In
several studies they varied distraction, gen-
eral arousal, and interest in the event to be
attributed (all ways of affecting the subjects’
degree of involvement in the situation) but
were unable to modify the salience effect
strongly. This suggests that involvement in
events may not lead to more rational, ANOVA-
like processing of them; on the contrary, such
engrossment may sometimes prevent more
active processing. One can readily imagine a
harried attributor tossing off causal judgments
as needed in response to minimal cues. Study
1 of this article attempted to manipulate the
subjects’ degree of involvement in attribu-
tional information processing itself, as distinct
from involvement in the situation, to deter-
mine whether salience effects are weakened
when perceivers are forced to give extensive
consideration to many possible causes, that is,
under conditions that ought to be ideal for
the emergence of the more rational informa-
tion processing described by Kelley’s ANOVA
model.


... Study was used and validated by Pryor and
Kriss (1977). Their Study 1 showed that ...
a sentence influenced the salience of each ...
2 then showed that this salience manipula-
tion, applied orthogonally to an information 
manipulation (sentences varying consensus, 
distinctiveness, and consistency), did affect 
attributions in the predicted direction. The 
more salient sentence element (person or ob- 
ject) attracted more attributions from sub- 
jects. It can be argued, however, that Pryor 
and Kriss’s (1977) procedure, in its use of 
simple closed-ended attribution scales as the 
dependent measures, encouraged subjects to 
make quick, relatively unthinking attributions 
and hence encouraged reliance upon the sa- 
lience cue. The current study focused upon 
this possibility. This study also examined 
recall data that may illuminate the processes 
that underlie the effects of salience and in-
formational factors on attributions.

The present Study 1 conceptually replicates 
and extends Pryor and Kriss’s (1977) Study 
2. Pryor and Kriss presented subjects with 
16 sentences of the type “Joe likes the film.” 
In a within-subjects experiment they manip- 
ulated salience (by varying the order of pre- 
sentation of subject and object in the stim- 
ulus sentence), information (by including 
additional sentences that contained consensus, 
distinctiveness, and consistency information 
pointing to either the subject or object as 
causal), and verb form (the verb was “like” 
or “dislike”). They measured attributions of 
causality to the subject and object on sep- 
arate 11-point scales but reported analyses 
only on the difference between these two at- 
tributions. The current study largely followed 
Pryor and Kriss’s procedure, with the addition 
of a between-subjects manipulation intended 
to influence strongly the amount of time and 
attention subjects gave to consideration of 
several possible causes in answering the at- 
tribution dependent measures, and a recall 
measure, administered after the conclusion 
of the attribution questionnaire, in which sub- 
jects attempted to reproduce the stimulus sen- 
tences. 
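
As an illustration only, the difference measure described above can be sketched in a few lines of Python. The ratings are hypothetical, and coding the 11-point scales as 0 to 10 is an assumption made for the example; the text states only that attributions to subject and object were rated on separate 11-point scales and that analyses used the difference between them, with a higher score meaning more person attribution.

```python
from statistics import mean

# Hypothetical ratings for one stimulus sentence from three raters.
# Coding the 11-point scales as 0..10 is an assumption for illustration.
ratings = [
    {"person": 9, "object": 2},
    {"person": 7, "object": 4},
    {"person": 3, "object": 8},
]


def difference_score(person: int, object_: int) -> int:
    """Person attribution minus object attribution.

    A higher value means more person attribution and less object attribution.
    """
    return person - object_


scores = [difference_score(r["person"], r["object"]) for r in ratings]
mean_score = mean(scores)

print("difference scores:", scores)
print("mean difference:", round(mean_score, 3))
```

A single difference score is only a sensible summary when the two ratings tend to move in opposite directions, which is why the size of the person-object correlation matters for this measure.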


Three specific predictions are made for this
study.

1. ... or are overly engrossed and pressed for
time. This view implies that increased thought
or information processing concerning attribu-
tions should diminish the effects of the salience
manipulation.

2. If salience effects are mediated by cogni-
tive availability or similar factors that affect
memory, then relationships between attribu-
tions and memory should appear. Differential
availability of the subject and object of the
sentence should influence the way the sentence
is recalled as well as the attribution that is
made. In general, relationships between mem-
ory and attribution should provide informa-
tion about the cognitive dynamics of salience.

3. Finally, the recall measures may be in-
fluenced by the manipulations (salience and
information) that are applied to the sentences
and are known to affect attributions. Effects
of salience on recall would confirm once again
(as in Pryor & Kriss’s Study 1) that salience
operates through an effect on the stimulus
sentence’s representation in memory (thereby
influencing its recall). Effects of covariation
information on recall are a more interesting
possibility; they would point to a close con-
nection between attribution and the cognitive
processes involved in sentence comprehension
and recall.


Study 1 
Method 


Overview. The first study replicates Pryor and
Kriss’s Study 2, extended in three ways. The first
extension is the addition of a between-subjects ma-
nipulation intended to influence the amount of
thought subjects give to their attributions. Subjects
in the little thought condition received the closed-
ended dependent measures used in the Pryor and
Kriss study, two 11-point scales for attributions to
the subject and the object. Subjects in the moder-
ate thought condition were instructed to ...
and write in their own words one explanation for
the event presented in the stimulus sentence and
then to code that explanation using the two
scales. Subjects in the extensive thought condition ...
A larger amount of careful and systematic thought
should be given to the final closed-ended responses
in the moderate and extensive thought conditions
than in Pryor and Kriss's little thought condition.
The second extension is the addition of a recall 
measure. After subjects completed the experimental 
booklet that presented the 16 sentences and causal 
measures, they were given a sheet of paper with in- 
structions to “list below as many [sentences] as you 
can remember—just the sentences, not the additional 
information that was given.” Each sentence was 
coded for whether it was recalled with subject first 
or object first (paralleling the salience manipulation)
and whether the subject and object of the sentence 
were each recalled. 

The third extension is of a minor nature. Eight
different positive verbs and eight different negative
verbs were used in place of Pryor and Kriss's two
verbs “like” and “dislike.” This was intended to pro-
vide confirmation of their unexpected finding of a
Verb Form × Information interaction.

Subjects. The subjects were 84 undergraduate and 
graduate students¹ (28 randomly assigned to each
of the three conditions) at the University of Cal- 
ifornia, Riverside, who were paid for their participa- 
tion. Subjects were run in groups that were hetero- 
geneous as to experimental condition and in which 
communication among subjects did not take place. 
The experimenter who administered the questionnaire 
and debriefed the subjects was blind to subjects’
conditions.

Materials. Sixteen verbs were used in sentences,
8 different verbs implying a positive relationship be-
tween the subject and object of the sentence (e.g.,
likes, agrees with, is pleased with, helps, values),
and 8 different negative verbs (e.g., dislikes, despises,
complains about, is angry with). With this exception,
the salience and information manipulations are as
described by Pryor and Kriss (1977). Salience refers
to the order of presentation of the subject and ob-
ject, and information to the consensus, distinctive-
ness, and consistency factors presented in the same
manner as in McArthur (1972). The salience manip-
ulation is not equivalent to active versus passive
syntactic forms of the sentence, although it does
overlap somewhat. Some sentences (such as the first
example below) are syntactically passive in the per-
son-salient version (e.g., “Penny is disgusted by the
turnips . . .”) and active in the object-salient version
(as shown). Examples are (object salient with object
information):


The turnips at dinner disgust Penny. 

The turnips at dinner disgust almost everybody. 
Most other vegetables do not disgust Penny. 

In the past the turnips at dinner have almost 
always disgusted Penny. 


¹ The five graduate-student subjects were all from
fields unrelated to social psychology, and exclusion
of their data does not materially affect the results.


(person salient with person information):


Phil complains about his co-worker’s habits. 

Almost nobody else complains about Phil’s co- 
worker's habits. 

Phil complains about almost everybody’s habits. 

In the past Phil has almost always complained 
about his co-worker’s habits. 


(object salient with person information):


The vegetables at the Italian restaurant are liked 
by Tony. 

The vegetables at the Italian restaurant are not 
liked by almost anybody else. 

Most other vegetables are liked by Tony. 

In the past the vegetables at the Italian restaurant 
have almost always been liked by Tony. 


The salience, information, and verb form (positive/ 
negative) manipulations were all orthogonal, form-
ing a 2 × 2 × 2 design of which each subject com-
pleted two entire replications, for a total of 16 differ-
ent sentences. Four questionnaire forms were used,
so that each stimulus sentence was presented to equal
numbers of subjects in each of the four Salience ×
Information conditions. The order of presentation of
the 16 sentences was randomized separately for each 
subject to avoid systematic order effects. The de- 
pendent measures (attribution and recall) were de- 
scribed above. 
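
The within-subjects design can be made concrete with a small Python sketch that enumerates the cells. It is a bookkeeping aid under stated assumptions rather than a reproduction of the original materials: the level labels and variable names are hypothetical, while the counts of factors, replications, subjects, and subjects per thought condition are taken from the Method section. The 224 figure for Thought × Information cells is derived here from those counts and is not quoted from the text.

```python
from itertools import product

# Factor levels; the labels are illustrative names, not the authors' codes.
salience = ["person", "object"]
information = ["person", "object"]
verb_form = ["positive", "negative"]
replications = [1, 2]

# The 2 x 2 x 2 within-subjects design.
cells = list(product(salience, information, verb_form))

# Each subject completes two entire replications of the design.
sentences = [(s, i, v, r) for (s, i, v) in cells for r in replications]

n_subjects = 84          # total subjects reported in the Method section
n_per_condition = 28     # subjects in each of the three thought conditions

total_observations = n_subjects * len(sentences)

# Observations behind each Verb Form x Information mean (4 such cells).
obs_per_verb_info_mean = total_observations // (len(verb_form) * len(information))

# Observations behind each Thought x Information mean (derived, see lead-in).
obs_per_thought_info_mean = n_per_condition * len(sentences) // len(information)

print("design cells:", len(cells))
print("sentences per subject:", len(sentences))
print("observations per Verb Form x Information mean:", obs_per_verb_info_mean)
print("observations per Thought x Information mean:", obs_per_thought_info_mean)
```

The Verb Form × Information count agrees with the 336 observations per mean given in the note to Table 2.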


Results 


The results fall into three sections: attribu- 
tions, recall, and relations between attribution 
and recall. Attributions and recall were
analyzed in a Thought × Salience × Informa-
tion × Verb Form × Replication (3 × 2 × 2
× 2 × 2) analysis of variance, with all but
thought being within-subjects factors. Repli-
cation is always treated as a dummy factor,
and any significant effects it may have are
ignored; it has no significant interactions
that qualify effects discussed below.²

Attribution measures. The attribution de-
pendent measures were subject attribution,
object attribution, and the difference between
them (the measure used by Pryor & Kriss, on
which a higher score means more person at-
tribution). In the current data, the within-
conditions correlation between subject and
object attribution averages −.470, p < .05,
justifying the use of the difference score. In
addition, results for all three measures are
quite similar.³ Considering the difference mea-
sure, four significant effects appeared, the
first three of which were also obtained by
Pryor and Kriss. Covariation information 
strongly affected attributions: The mean at- 
tribution score was 5.08 with person-oriented 
information and −1.74 with object-oriented
information. (Fs are shown in Table 1.) Sa- 
lience also had a significant effect on attribu- 
tions. The mean was 2.03 when the sentence 
was presented with the person salient and 1.32 
when the sentence was presented with object 
salient. A significant Verb Form x Informa- 
tion interaction also appeared, with covaria- 
tion information having more impact when 
the verb was negative than when it was posi- 


? The use of an alternative analysis based on quasi 
F ratios, recognizing that replication as well as sub- 
ject is a random factor in the design, does not change 
the essential conclusions. In the current study, sub- 
jects contributed much more variance than replica- 
tions. The analyses reported here are ordinary Fs 
based on the appropriate interactions with subject 
as error terms, because this approach is more familiar 
and because in some analyses to be reported, the use 
of dichotomous scores renders untrustworthy the 
high-order interactions required for the computation 
of quasi Fs. 

3 The results for the object-attribution dependent 
measure were the same as for the difference measure, 
except that the effect of verb type was not significant, 
The results for the person measure were the same, 
except that the salience effect was only marginally 
significant (p< .06) and a three-way Verb Type X 
Salience X Information interaction reached signif- 
icance, indicating that with negative verbs, object 
information and especially object salience gave rise to 
much less person attribution. The results for the dif- 
ference measure thus summarize well the results for 
the two separate dependent variables: Four of the 
five significant effects on the difference measure (all 
except verb type) are at least marginally significant 
on both the person and object measures, 

The strength of the person-object attribution cor- 
relation, in contrast to the typical findings of a non- 
significant correlation between measures of situa- 
tional and dispositional attributions (e.g. Miller, 
Smith, & Uleman, Note 1), may be due to the pre- 
sentation of the informational manipulations. These 
sentences strongly imply a person or object attribu- 
tion for each stimulus sentence and hence may sensi- 
tize subjects to the idea that person and object 
attributions are alternatives. 

#Verb form also affected attributions, with more 
Person attributions being given for positive than for 
negative verbs. The Verb Form X Replication inter- 
action was significant, however, so the verb form 
effect should be interpreted with caution (and it re- 
ceives no further attention in this article). 


Table 1
Analysis of Variance Results for Difference Attribution Measure and Significant Effects in Study 1

Effect                                    MS          df    F         p
Salience                                  170.715     1     6.5       .013
(Error) Salience X Unit                   26.038      81
Information                               15,614.238  1     205.811   .000
Information X Thought                     435.421     2     5.739     .005
(Error) Information X Unit                75.867      81
Verb Form X Information                   435.435     1     13.761    .001
(Error) Verb Form X Information X Unit    31.642      81

Table 2
Means of Difference Attribution Measure for Study 1

                      Information
Item              Person      Object
Thought (a)
  Little          5.683       -2.638
  Moderate        5.839       -1.705
  Extensive       3.723       -0.862
Verb form (b)
  Positive        4.860       -0.818
  Negative        5.304       -2.652

Note. A higher mean indicates more person attribution and less object attribution.
a n = [illegible] observations per mean.
b n = 336 observations per mean.

tive (means appear in Table 2). This replication of Pryor and
Kriss’s finding on verb form with an extended sample of verbs
increases confidence in the effect. This interaction may be due
to subjects being more careful (i.e., paying more attention to
information) in their attributions for negative events, perhaps
because such events have bad implications or are unusual
(cf. Kanouse & Hanson, 1972).

The other attribution effect involved the between-subjects
manipulation and so was not part of Pryor and Kriss’s findings.
The thought manipulation was successful. Subjects in the
extensive thought condition wrote down and considered a mean of
3.1 different explanations before selecting the best. Subjects
in the moderate thought condition wrote down one explanation and
then coded it, whereas subjects in the little thought condition
had only to make two checkmarks on closed-ended attribution
scales. It is notable that the data contain no support for
Prediction 1, which called for the weakening of salience effects
in the extensive thought condition; for the Thought X Salience
interaction, F(2, 81) = .78, p = .46. Salience effects on
attributions were as strong when subjects carefully considered
several possible explanations as when they simply made two
checkmarks on attribution scales. The thought manipulation did
interact with information, however, with the impact of
covariation information being largest in the little thought
condition and smallest in the extensive thought condition. Means
appear in Table 2, and F test results in Table 1.
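
The F ratios reported in Table 1 can be checked by hand, since each F is the effect mean square divided by the mean square of its error term. The sketch below does that arithmetic with the mean squares printed in Table 1; the dictionary layout and the helper name `f_ratio` are illustrative choices, not part of the original analysis.

```python
# Mean squares copied from Table 1 (Study 1, difference attribution measure).
# Each effect is paired with the error term listed directly beneath it.
TABLE_1 = {
    "Salience": (170.715, 26.038),
    "Information": (15614.238, 75.867),
    "Information X Thought": (435.421, 75.867),
    "Verb Form X Information": (435.435, 31.642),
}


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Return F = MS_effect / MS_error."""
    return ms_effect / ms_error


if __name__ == "__main__":
    for effect, (ms_effect, ms_error) in TABLE_1.items():
        print(f"{effect}: F = {f_ratio(ms_effect, ms_error):.3f}")
```

The ratios for Information, Information X Thought, and Verb Form X Information agree with the tabled F values to three decimals, and the Salience ratio comes out at about 6.56.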

Recall measures. Four recall measures were analyzed. These were
the proportion of sentences recalled with person first, the
proportion recalled with object first,⁵ the proportion of
persons recalled at all, and the proportion of objects recalled
at all. Partial recall was fairly frequent, especially of the
form “____ complains about his co-worker’s habits.” This example
would have been scored as recalled with person first, object
recalled, person not recalled. The reliability of this coding
was near perfect. A second rater checked approximately 30
sentences, with no disagreements on coding.

5 These two proportions are independent, as they add to the
proportion of sentences recalled at all, and not to 1.0. The
order-of-recall variables were scored independently of the
initial salience (i.e., order of presentation) of the sentences;
that is, any recall of a sentence in a particular order was
counted, not just correct recall of order.
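
The scoring rule for partial recall can be made concrete with a small sketch. It assumes a human coder has already reduced each recall protocol to three judgments: the order in which the sentence was written (or `None` if it was not recalled), whether the person was named, and whether the object was named. The function names, argument names, and dictionary keys are hypothetical.

```python
from typing import Dict, List, Optional


def score_recall(order: Optional[str], person_named: bool,
                 object_named: bool) -> Dict[str, int]:
    """Turn one coded recall protocol into the four 0/1 recall indicators.

    order is "person_first", "object_first", or None (sentence not recalled).
    """
    if order not in ("person_first", "object_first", None):
        raise ValueError(f"unknown order code: {order!r}")
    return {
        "person_first": int(order == "person_first"),
        "object_first": int(order == "object_first"),
        "person_recalled": int(person_named),
        "object_recalled": int(object_named),
    }


def proportions(scored: List[Dict[str, int]]) -> Dict[str, float]:
    """Average each indicator over all presented sentences."""
    n = len(scored)
    return {key: sum(row[key] for row in scored) / n for key in scored[0]}


if __name__ == "__main__":
    # The partial-recall example from the text: person slot left blank,
    # sentence still written in person-first order, object present.
    partial = score_recall("person_first", person_named=False, object_named=True)
    print(partial)
```

Because a sentence that is not recalled scores 0 on both order indicators, the two order proportions sum to the proportion of sentences recalled at all rather than to 1.0.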

Both salience and information influenced the order-of-recall
variables. More sentences were recalled with the object first
when the object was salient (M = .125 vs. .029), F(1, 81) =
28.31, p < .001. A Salience X Verb Form interaction affected
both of the order-of-recall variables. Object salience led to
stronger object-first recall, and person salience led to
stronger person-first recall, when the verb was negative as
compared with positive; for person-first recall, F(1, 81) =
4.82, p < .05; for object-first recall, F(1, 81) = 5.66,
p < .01.

More important is an interaction of salience with covariation
information. Inspection of the Salience X Information
interaction means (Table 3) reveals that information has little
or no effect on order of recall in the person-salient condition,
where nearly all recalled sentences were reproduced accurately
with person first; in this condition the mean object-first
recall was only .029. When sentences are presented with object
salient, however, there is more variation in the recall
measures, and covariation information has an effect. For the
Salience X Information interaction, for person-first recall,
F(1, 81) = 7.16, p < .05; for object-first recall, F(1, 81) =
9.16, p < .01. Tests of the simple effects of information within
the object-salient condition, using the appropriate pooled error
mean square (Winer, 1970, pp. 544-545), reveal that this effect
is significant for both the person-first recall and object-first
recall measures, t(81) = 3.07, p < .01, and t(81) = 2.93,
p < .01, respectively. In both cases, covariation information
influences recall in the predicted direction, with the factor
the information pointed to as the cause more likely to be
recalled first. The clear conclusion is that these recall
measures suffer respectively from a ceiling effect and a floor
effect in the person-salient condition, producing the observed
Salience X Information interaction. When the analysis is focused
on the object-salient condition, in which the absence of ceiling
and floor effects allows the recall measures to show more
variance, the effects of the covariation information
manipulation on recall are significant.⁶

Table 3
Means for Recall of Verbs by Salience and Information for Study 1

                                     Information
Proportion recalled              Person      Object
Person salient (person first)
  Person first                   .426        .459
  Object first                   .047        .012
Object salient (object first)
  Person first                   .470        .358
  Object first                   .098        .152

Note. Each proportion is based on 336 observations.
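
The ceiling and floor pattern can be read directly off Table 3. The sketch below recomputes the marginal means quoted in the text and the size of the information effect within each salience condition. The variable and function names are illustrative, and the differences are descriptive only; the significance tests reported above rest on the pooled error term, which is not reproduced here.

```python
# Cell proportions from Table 3, keyed by (salience, measure, information).
TABLE_3 = {
    ("person", "person_first", "person"): 0.426,
    ("person", "person_first", "object"): 0.459,
    ("person", "object_first", "person"): 0.047,
    ("person", "object_first", "object"): 0.012,
    ("object", "person_first", "person"): 0.470,
    ("object", "person_first", "object"): 0.358,
    ("object", "object_first", "person"): 0.098,
    ("object", "object_first", "object"): 0.152,
}


def marginal_mean(salience: str, measure: str) -> float:
    """Average a recall measure over the two information conditions.

    An unweighted average is used because every cell has the same n (336).
    """
    return (TABLE_3[(salience, measure, "person")]
            + TABLE_3[(salience, measure, "object")]) / 2


def information_effect(salience: str, measure: str) -> float:
    """Person-information cell minus object-information cell."""
    return (TABLE_3[(salience, measure, "person")]
            - TABLE_3[(salience, measure, "object")])


if __name__ == "__main__":
    for salience in ("person", "object"):
        for measure in ("person_first", "object_first"):
            print(salience, measure,
                  f"mean={marginal_mean(salience, measure):.4f}",
                  f"info effect={information_effect(salience, measure):+.3f}")
```

The object-first marginal means are .0295 with person salient and .125 with object salient, matching the M = .125 vs. .029 comparison above. Within the object-salient condition, information shifts person-first recall by .112 and object-first recall by .054 in the predicted directions; within the person-salient condition the corresponding differences are only about .03.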

Correlations between attributions and recall. Finally, we come
to the relationships between attributions and recall. These were
analyzed as partial correlations to ensure that a true
relationship between the variables is present, not a spurious
relationship resulting from the variables being affected in
similar ways by the manipulated factors (cf. Taylor & Fiske,
1978, section VI.C). The partial correlations of the two
attribution measures (person and object) with the four recall
measures were computed across responses, controlling for
salience, information, and experimental conditions (thought).
This test shows that two of the eight relationships are
significant. (The probability of obtaining at least two of eight
relationships at the .05 level under the null hypothesis is less
than .002 by a binomial test.) The proportion of sentences
recalled object first was associated with attributions to the
object on the object attribution scale, partial r(1317) = .08,
p < .01, and proportion of persons recalled was associated with
higher person attribution on the person scale, partial r(1317) =
.06, p < .05. Thus, relationships between attribution and recall
measures do appear when the experimental manipulations are
controlled. The small size of these correlations can be
attributed to the facts that (a) the attributional measures may
be of low reliability (the two items only correlate .47) and
(b) these are correlations between a continuous variable and a
dichotomy, which can limit the maximum size of correlation
coefficients to substantially less than 1. (Correction of these
correlations for the attenuation due to the use of a dichotomy
yields estimates of .15 and .08 for their “true” sizes,
respectively; Harshbarger, 1977, p. 440.)

6 Other effects on recall measures, not of central theoretical
significance, can be briefly mentioned: More sentences were
recalled person first, and more persons were recalled, in the
extensive thought condition than in the other two conditions,
F(2, 81) = 4.54, p < .05. Person recall was influenced strongly
by verb form, with more persons presented with positive verbs
being recalled (means of .317 vs. .208), F(1, 80) = 19.05,
p < .001. The proportion of objects recalled was influenced by
salience, with more object recall when the object was salient,
F(1, 80) = 5.95, p < .05. Both object recall and person-first
order of recall were influenced by a three-way interaction of
Thought X Verb Form X Salience. However, this effect appears not
to be simply interpretable and does not qualify the
interpretation of any other significant effects, and it will not
be discussed further.
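
One standard way to correct a correlation for dichotomization is to convert the point-biserial coefficient to a biserial estimate, r_b = r_pb * sqrt(p * q) / y, where p is the proportion of cases scored 1, q = 1 - p, and y is the height of the standard normal density at the cut that separates p from q. Whether this is the exact procedure on the cited page is an assumption. The base rates used below are also assumptions, taken as rough overall recall rates: about .077 for object-first recall (the average of the four object-first cells in Table 3) and about .26 for person recall (the average of the .317 and .208 reported for person recall).

```python
from math import sqrt
from statistics import NormalDist


def biserial_estimate(r_point_biserial: float, p: float) -> float:
    """Convert a point-biserial r to a biserial estimate.

    p is the proportion of observations scored 1 on the dichotomy.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q = 1.0 - p
    normal = NormalDist()        # standard normal: mean 0, sd 1
    cut = normal.inv_cdf(q)      # z with area q below it and p above it
    ordinate = normal.pdf(cut)   # density height at the cut
    return r_point_biserial * sqrt(p * q) / ordinate


if __name__ == "__main__":
    # Assumed base rates; see the paragraph above for where they come from.
    print(round(biserial_estimate(0.08, 0.07725), 2))
    print(round(biserial_estimate(0.06, 0.2625), 2))
```

With these assumed base rates the estimates come out near .15 and .08, in line with the corrected values quoted in the text. The correction is larger for object-first recall because that dichotomy is far more lopsided.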


Discussion 


Three basic findings were replicated in this study, and two new
relationships were uncovered. Salience affects both recall and
attribution, and covariation information affects attribution, as
Taylor and Fiske (1975) and Pryor and Kriss (1977) have shown.
In addition, information affects recall, in the object-salient
condition where the recall measures showed adequate variance.
Also, a partial correlation between recall and attributional
measures emerged (controlling for the experimental factors).

Contra[...] extensive thought condition [...] importance of the
covariation information manipulation. The extensive thought
manipulation was specifically intended to influence subjects to
consider several explanations and choose the best one, rather
than to engage in simple “top of the head” processing of just
one causal possibility, the most salient one. The manipulation
was evidently successful, as subjects in the extensive thought
condition considered a mean of 3.1 different explanations. Yet
the impact of salience was not reduced by the manipulation. A
couple of points should be made about the thought manipulation.
First, the particular manipulation used does not artifactually
bias subjects to give more or less weight to any particular kind
of information as a determinant of causal attributions (e.g.,
the covariation information); it asks them only to choose the
best cause in the light of all available information. However,
the manipulation may have other effects besides the intended
one. For example, there may be a motivational effect: In the
extensive thought condition subjects may have perceived the task
as more important (since they were asked to spend so much time
at it) and as a result may have tried harder to be accurate in
their attributions. Still, the earlier theory of salience (as
based on unthinking top-of-the-head responses) would predict
that such a motivational effect would also reduce the impact of
salience, contrary to what was found. The thought manipulation
thus seems adequate to support the conclusion that we wish to
draw from it: Salience effects persist even when subjects give
careful consideration to many possible causes of an event.

The fact that salience manipulations have an impact over a broad
range of conditions indicates that salience is not just a matter
of making attributions without thinking. Instead, salience
effects appear to be related to such cognitive processes as the
way the stimulus sentence is perceived and encoded in the first
place, giving rise to the observed memory-attribution links.
These links are both the effects of information on recall
(implicating attribution processes in the comprehension and
recall of sentences) and the partial correlations between recall
and attribution indices.

A simple model can account for the findings. In carrying out the
experimental task, subjects must first comprehend the stimulus
materials, then report their attribution, then (in the recall
task) report their memory of the sentence itself. Results of
Study 1 show that salience and information affect both the
attribution and recall results and also that there is a partial
correlation between attribution and recall indices. These
effects can most parsimoniously be accounted for by the
hypothesis that attributional processing and recall both operate
from a single internal, encoded representation of the stimulus
sentence to which the subject refers in answering the questions
he or she is posed. The experimental factors that affect both
attribution and recall can thus parsimoniously be assumed to
affect the nature of the internal representation and thereby,
indirectly, the dependent variables. It is easy to see that
salience may affect the way a sentence is encoded, but the
hypothesis that covariation information does so as well is
novel. It has been generally assumed that such informational
manipulations affect attribution but do not affect memory per se
(Pryor & Kriss, 1977; Taylor & Fiske, 1975). Here, however,
information is shown to affect recall, so parsimony suggests
that the information factor too may operate through an effect on
the representation of the sentence in memory. This proposition
will be further tested in Study 2.

The subject’s reporting of the attribution could then take place
in either or both of two ways. The subject could retrieve from
memory an attribution that was made during encoding and report
it, or the subject could retrieve a representation of the
sentence as well as of the additional consensus,
distinctiveness, and consistency information and make a causal
judgment at the time of retrieval. This argument that
attributions are sometimes made at the time information is
stored in memory, rather than subsequently on the basis of data
retrieved from memory, has two noteworthy implications. First,
as mentioned above, it is contrary to Pryor and Kriss’s (1977)
interpretation of salience. They raise two arguments against the
idea that attributions are encoded and retrieved from memory:
Since their subjects were not told that the experiments involved
an attribution task they had no reason to make attributions; and
covariation information did not influence the speed of recall in
their study, as would be expected if attributions were involved
in information encoding and storage. In response it can be
argued that (a) social psychologists have often claimed that
attributions are frequently made by people in their everyday
lives, not just when experimenters ask for them. If we did not
believe this, there would be little point in studying
attributions. (b) Subjects in the Pryor and Kriss study, as well
as in the current study, received the covariation informational
manipulations with each sentence. Since these strongly point to
either the subject or the object as the cause, it would be
surprising if reading these stimuli did not generate some causal
thinking. (c) In the current study, information was shown to
influence two recall measures (not measures used by Pryor &
Kriss).

Second, the notion that attributions can be made as information
is stored in memory is consistent with many current theories of
sentence comprehension and memory in cognitive psychology that
give important places to inferred (attributed) causality. Among
others, Kintsch (1974), Norman, Rumelhart, and the LNR Research
Group (1975), and Schank (1975) have all proposed theories that
involve the representation of “cause” links between
propositions. In comprehending a text like “A burning cigarette
was carelessly discarded. The fire destroyed many acres of
forest,” people make the attribution that the cigarette caused
the fire and seem to store that inferred causal link as if it
were explicitly stated in the text, when they are tested after a
15-minute delay (Kintsch, 1975). The cognitive theories and
studies support the idea that attributional (cause-inferring)
processing is intrinsically involved in the initial
comprehension of sentences and therefore that it goes on all the
time, not just when a subject is asked an attributional
question. Such theories also imply that the cause generated by
the initial attributional processing is represented in memory in
some form. Hence relations between recall and causal responses,
such as those obtained in Study 1, are to be expected.

Our tentative model thus holds that subjects comprehend the
stimulus material: They process the sentence as influenced by
the salience and information factors and transform it into a
representation in memory. Some attributional inferences occur in
this process, and causal material is included in the memory
representation (Kintsch, 1974, 1975). If subjects are
immediately asked to give an attribution, they may report the
cause they have stored in memory; on the other hand, the
additional covariation information may still be available to
them, so subjects may be able to carry out further attributional
processing in response to the question. If a delay intervenes
between the study of the material and the attributional
question, Kintsch’s results suggest that subjects will be more
likely to answer the question directly from the memory
representation. The additional covariation information will in
general have been lost from memory during such a delay. Only the
stimulus sentence itself will be retrievable because it was more
intensively studied by the subjects and because it is more
central; such propositions are better recalled (Kintsch, 1974).
The prediction of this model is thus for a stronger relationship
of attribution and recall indices after a delay than immediately
and for a stronger effect of salience on attributions (via its
influence on the memory representation) after the delay. These
predictions [...] Kintsch’s (1975) results showing that after a
delay causal relations seem to be retrieved as part of a
person’s encoded memory for the [...] itself, not as separately
encoded [...].

Study 2 was conducted as a test of these ideas. In particular,
it [...] the effect of info[...] presentation of the sen[...]
information (e.g., [...]) from memory and the subject will be
forced to base his attribution and recall responses on the same
encoded memory representation of the sentence (cf. Kintsch,
1975); the effect of salience on attributions should also be
stronger after the delay, since that effect is presumed to be
mediated by the memory representation.


Study 2 
Method 


Overview. Subjects read sentences similar to those in Study 1,
together with the salience and information manipulations,
presented as materials for a study of memory. After studying the
sentences for [...] minutes, some subjects (randomly assigned to
the immediate condition) were asked to recall as many sentences
as they could in 10 minutes, writing each one and then answering
the two closed-ended subject and object attribution questions
about it (essentially Pryor & Kriss’s questions). During this
time, the other subjects (those in the delayed condition)
performed an irrelevant anagrams task. A 20-minute delay then
followed, during which the subjects listened to and took notes
on a lecture on an unrelated topic. Finally, the memory and
attribution dependent measure was administered to all subjects,
again for 10 minutes. Thus, immediate recall subjects filled out
the measure beginning only seconds after they had finished
studying the sentences, and delayed subjects filled it out 30
minutes later.

Subjects. The 107 subjects were volunteers drawn
from introductory social psychology courses at the 
University of California, Riverside (26 subjects) and 
at New York University (81 subjects). The study 
was run during a class period, prior to exposing 
the students to material concerning attribution in the 
course. 

Materials. The 16 sentences used in this study were written to
have a person as subject and a nonhuman object, to eliminate
ambiguity in the wording of the attribution questions (see
below). Some sentences from the first study were retained and
some new ones were written. The salience and information
manipulations were as described for Study 1. In this study, to
simplify the experimental materials and the task of balancing
cell sizes in two locations 3,000 miles apart, sentences were
nested within Salience X Information, so that just one form of
stimulus presentation needed to be prepared. That is, each
sentence was presented in just one Salience X Information
condition.

The dependent measure consisted of 16 occurrences of three
questions: “Write down one of the sentences here, as exactly as
you can remember it - just the sentence, not the additional
information about it” (i.e., the information manipulation),
followed by person and object attribution questions. These were
the same as those used in the little thought condition in Study
1, except that instead of including the person’s name and the
object from the sentence (which would have constituted a
powerful recall cue for each sentence), the questions asked, “To
what extent was the event described in the sentence caused by
the person named in the sentence” and “caused by the object the
person was reacting to.”

Coding of variables. From the dependent measure, each stimulus
sentence was coded as recalled with person first, with object
first, or not recalled; the person was coded as recalled or not;
and the object was coded as recalled or not. Four recall
measures identical to those in Study 1 result. Since attribution
measures are available in this design only for sentences that
were recalled, analyses involving the attribution variables must
exclude sentences that were not recalled. For such analyses the
two order-of-recall variables are redundant (since a recalled
sentence must be scored as either recalled person first or
recalled object first) and only one can be used. The person and
object attribution scales were used in their original form, with
a higher score meaning in each case less person or object
attribution. A difference score was not used in this study,
since the correlation of the two attribution scales was only
-.26, not strong enough to justify using the difference as a
single summary measure.

Analysis. Attributions cannot be analyzed in a repeated-measures
design in this study because they are not given for sentences
the subject did not recall. Recall, however, can be so analyzed.
The ANOVA design is Time X Sentence Nested Within (salience by
information), the latter three all being within-subjects
variables. Attributions and the relationships among recall and
attribution variables can be analyzed with the sentence as the
unit of analysis. Since a subject typically recalled several
sentences, the observations in this mode of analysis are not
independent of each other, giving rise to some degree of bias.
The bias is small, however, since the variance due to sentences
is generally many times greater than that due to subjects in
these analyses, and this analysis is the most effective
available for examining the questions of interest.⁷


Results 


It was predicted first that this study would replicate an effect
found in Study 1, an impact of information on the order of
recall of a sentence, within the object-salient condition.
Unfortunately, no such effect was found, and the reason became
clear in examining the ANOVA table: The design used, with
sentences nested within information, had very low power to
detect such effects, since the sentence factor contributed much
variance (sentence [...] was significant at the .001 level for
both the order-of-recall dependent measures). This indicates
that the design should have crossed Sentence X Information
Condition, as Study 1 did, yielding higher power for tests of
information effects (cf. Kenny & Smith, Note 2).

The second prediction was for a within-conditions relationship
between attribution and recall. This was examined by computing
correlations between the three recall variables (order of
recall, person recalled, object recalled) and the two
attribution variables (person and object attributions)
separately for the four combinations of Time X Salience.
(Information need not be considered, since it did not
significantly affect recall; thus, it cannot generate spurious
correlations between recall and attribution measures.) One of
the 12 correlations from the immediate subjects was significant:
In the object-salient condition, object recall was associated
with less person attribution, r = .14, p < .05. For the delayed
subjects, 4 of 12 correlations reached the .05 level, all in the
object-salient condition. All three recall variables
(object-first order of recall, person recall, and object recall)
were associated with more person attribution (r = -.16, -.22,
and -.17, respectively). Object recall was also associated with
more object attribution, r = -.20. Overall, obtaining 5 of 24
correlations at the .05 level by chance alone has a binomial
probability of less than .002, a clear demonstration of
relationships between recall and attributions.

7 The use of analyses that may contain technical violations of
assumptions has precedents in the attribution literature.
McArthur (1972) found herself in a situation similar to ours,
argued that the bias due to ignoring subject effects would be
“negligible,” and proceeded with an analysis that is
conceptually similar to the ones we perform. Our data permit a
reassurance like the one she was able to give: The correlation
of recall responses given by the same subject is negligible
relative to item effects. In addition, we have conducted
reanalyses to test the effects of the violations of assumptions.
Removal of subject effects from the attribution variables (by
subtracting each subject’s mean score from each observation) and
reanalyses of these corrected scores show that the major points
of the paper are unaffected by the nonindependence in the
original data. This is the case with the partial correlations
between memory and attribution in Studies 1 and 2, the ANOVA on
attribution in Study 2, and the path models in Study 2. (The
other analyses reported in the paper explicitly incorporate the
subject variable and so are not subject to the criticism of
nonindependence of observations.)

The third prediction was that the relationship between recall
and attributions should be stronger when recall was delayed.
This issue can be illuminated by path diagrams showing the
influences of the experimentally manipulated factors on recall
and attribution variables, separately for the immediate and
delayed recall conditions. Using the conventions of path
analysis to draw the diagrams (Duncan, 1975), the following are
the results for the most sensitive recall and attribution
variables, object-first recall and person attribution.⁸

In Figure 1, the straight arrows represent the causal effects
(numerically, the standardized regression coefficients) of the
manipulated variables on the two response variables. The
residual causes u and v represent all other causes of the
responses (i.e., error variance). The curved arrow between u and
v represents a correlation between those residual causes,
generated theoretically by an unmeasured factor that has a
causal impact on both recall and attribution (in this case, the
factor is presumed to be aspects of the memory representation of
the sentence). Note that the model shows neither the recall nor
attribution dependent variables as causing the other. This is a
consequence of the theoretical viewpoint sketched above: The
recall and attribution processes must work from the same encoded
representation of the sentence (so one can expect to find a
partial correlation between them, represented in the diagram by
the correlation between the residual causes u and v). However,
neither can be presumed to cause the other, even though the
recall dependent measure comes first in time: attributions
(those made during the initial processing of the sentence) may
influence recall as well as the other way around.
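
The curved arrow in the path diagram is a correlation between what is left of recall and of attribution once the manipulated factors have been accounted for. The sketch below illustrates that idea in simplified form: it subtracts the mean of each Salience X Information cell from both responses and correlates the residuals. This saturated cell-mean adjustment is an assumption made for brevity (the model described above uses standardized regression coefficients), and the function names and the tiny data set are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from typing import Hashable, List, Sequence


def residualize(values: Sequence[float], cells: Sequence[Hashable]) -> List[float]:
    """Subtract each design cell's mean from the observations in that cell."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, cell in zip(values, cells):
        totals[cell] += value
        counts[cell] += 1
    return [value - totals[cell] / counts[cell]
            for value, cell in zip(values, cells)]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def residual_correlation(recall, attribution, cells) -> float:
    """Correlate the two responses after removing design-cell means."""
    return pearson_r(residualize(recall, cells),
                     residualize(attribution, cells))


if __name__ == "__main__":
    # Hypothetical observations: design cell, 0/1 object-first recall,
    # and a person attribution rating for each recalled sentence.
    cells = [("person", "person")] * 4 + [("object", "object")] * 4
    recall = [0, 0, 0, 1, 1, 1, 0, 1]
    attribution = [2, 3, 2, 5, 6, 7, 4, 7]
    print(round(residual_correlation(recall, attribution, cells), 3))
```

Because the cell means are removed from both variables, any association that remains cannot be produced by the manipulations shifting both responses in the same direction; it reflects whatever the two responses share within a condition.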

In the immediate condition, there seems to 
be a separation of the recall and attribution 
processes: Salience affects recall, and the in- 
formational manipulation affects attribution. 

Furthermore, the residual correlation between 
recall and attribution is nonsignificant. How- 
ever, the delayed condition shows a different 


ELIOT R. SMITH AND FREDERICK D. MILLER 


[Figure 1: two path diagrams, Immediate recall (N = 54[?]) and
Delayed recall (N = 366). In each, Salience and Information are
the manipulated factors, Recall object first and Person
attribution are the responses, and u and v are the residual
causes.]

Figure 1. Path diagrams showing relations among manipulated
factors, recall, and attribution in immediate and delayed
conditions, Study 2. (*p < .05.)


pattern of effects. Salience still strongly affects order of
recall, but it now also affects attribution. The information
effect has vanished. Finally, the partial (residual) correlation
is now significant. (These three correlations or path
coefficients differ significantly between the immediate and
delayed conditions.) An analysis of variance on the person
attribution variable verifies that the impact of salience is
stronger after the delay: Salience X Time interaction,
F(1, 12) = 8.39, p < .05.

8 Detailed results for the other combinations of variables are
not presented here for reasons of space. Briefly, the object
attribution measure never shows significant relations in these
analyses, so only the person scale need be considered. There are
three nonredundant recall measures: object-first recall, person
recall, and object recall. (Since only sentences that are
recalled have attributions, results using the person-first
order-of-recall variable are identical to those with the
object-first variable except for having the opposite sign.) The
major conclusions are the following: (a) For two of the three
analyses (object-first recall and person recall), the delayed
recall-attributional correlation is significant and
significantly different from the immediate one; (b) in all three
cases, there is a significant delayed effect of salience on
attribution, significantly greater than the immediate effect;
(c) for object recall only, there is a significant delayed
effect of information on recall, such that information implying
that the object is causal leads (after the delay) to a greater
likelihood of recalling the object. This effect supports our
model but is not discussed further, since it appears with only
one of the three measures. Researchers on salience, at least
since Taylor and Fiske (1975), have often obtained theoretically
expected effects on only a subset of dependent measures; the
exact form of the most appropriate measure is a matter for
further study.


Discussion 


This pattern of findings lends support to 
the idea that after a delay (though not im- 
mediately after study of the sentences) recall 
and attribution are based on a single encoded 
representation of the sentence. The effect of 
salience on that encoded representation seems 
to be stronger than the effect of the informa- 
tion manipulation: the effect of salience on 
attribution was significantly stronger after the 
delay than before. On the other hand, the 
impact of information was significantly 
cer (and nonsignificant) after the delay. 
This pattern may be accounted for by the 
salience manipulation’s status as part of the 
sentence itself, whereas the covariation infor- 
mation was presented as separate sentences, 
subject to memory decay during the 30-min- 
ute delay. 

Several puzzles remain in the results, First, 
the residual correlation between recall order 
and attribution in the delayed condition is in 
the wrong direction. That is, order of recall 
has a relation to attributions opposite to that 
of the order of presentation of the sentence. 
The reason for this is unknown; the exact 
form of the representations of sentences in 
memory is a matter of some controversy (cog- 
nitive theories differ widely on the issue). 
Similarly, we do not yet know what specific 
characteristics of the representation are re- 
sponsible for observed differences in order of 
recall or attribution. Salience does appear to 
affect the memory representation, but the de- 
velopment of sensitive and relatively pure 
measures of the affected characteristics re- 
mains a topic for future research. Second, the 
recall-attribution partial correlations in Study 
1, although significantly different from zero, 
were small. This may also be due to the use 
of measures that are only indirectly related 
to the true variables of interest, or the effects 
may actually be relatively small in magnitude 
(and thus difficult to study). In either case, 
the elucidation of the relationships among 
salience, attributions, and the cognitive pro- 
cesses and representations involved in sen- 
tence comprehension is a goal that deserves 
the further research efforts that will be re- 
quired. 


Conclusions 


In summary, this article has attempted to 
accomplish three things. It has presented 
evidence that one effect of salience—the de- 
pendence of attributions upon the order of 
presentation of person and object in a stim- 
ulus sentence—is surprisingly robust. The 
effect holds up even when the subject is in- 
structed to invent several possible causes and 
pick the best one and becomes stronger, rather 
than weaker, after a delay that forces subjects 
to rely more on an encoded memory. This 
finding led to the idea that salience process- 
ing involves more than just top-of-the-head 
responses but may reflect the basic cognitive 
processes underlying sentence comprehension 
and the storage of events in memory. (This 
analysis could be extended to visual manipula- 
tions of salience, as in Taylor & Fiske, 1975, 
by assuming a verbal representation of en- 
coded visual material; cf. Chase, 1978.) One 
implication of the data is that strict lines of 
demarcation between salience processes and 
more thoughtful attribution processes do not 
exist. Rather, one must seek to understand 
how salience and informational factors exert 
influence on the perceiver across a series of 
stages of encoding, processing, and recall. 

Second, two new findings concerning mem- 
ory are presented (in addition to the replica- 
tion and extension of Pryor & Kriss’s finding 
that salience affects memory as well as at- 
tributions): an effect of information on recall 
(in Study 1) and a relationship between at- 
tributions and characteristics of the recall of 
a sentence (in both studies). Further research 
is clearly needed to investigate the effects of 
different types of stimuli and different delay 
periods and particularly to identify the most 
appropriate and sensitive recall measures with 
which to elucidate the cognitive processes 
underlying attribution. The present findings 
point to a close relationship between the 
processes that subjects use in reporting attri- 
butions and in reporting recall; in particular, 
we advance the hypothesis that both of these 
processes work from an encoded representa- 
tion of the stimulus that the subject initially 
forms. This interpretation is supported by the 
fact that both the recall-attribution relation- 
ships and the effects of salience are increased 
by a delay (Study 2). Although the support 
of the hypothesis by these findings is inferen- 
tial, theory and data from the literature on 
text comprehension and representation, cited 
earlier, also argue strongly for such an encod- 
ing process, which can involve attributional 
processing at the time of the encoding and 
storage of a stimulus as well as at the time of 
its retrieval. Thus attribution can mediate 
memory, as well as memory mediating attribu- 
tion. 

Third, an implicit message of these argu- 
ments has been that the formulation of ex- 
plicit theories of information processing in an 
experimental task can be a productive ap- 
proach for future research. Instead of asking 
about whether one variable mediates another 
in the sense that a statistical relationship be- 
tween them can be established, we should 
attempt to understand how the subject goes 
about performing the task and therefore how 
the results of a process such as encoding or 
memory can influence other subsequent pro- 
cesses. Such explicit theories of stages in in- 
formation processing lend themselves readily 
to test by means that have been used in cogni- 
tive psychology with great success for the 
past several years (Schneider & Shiffrin, 


Two experiments investigated the effects of “voice” (participating in allocation 
decision making by expressing one’s own opinion about the preferred alloca- 
tion) on responses to an inequitable allocation. In addition to subjects’ (female 
college students) either having or not having voice, Experiment 1 manipulated 
(a) whether the allocation made by a “decision maker” (supposedly another 
subject but actually the experimenter) was or was not biased (due to self- 
interest) and (b) whether the subject did or did not learn that a “co-worker” 
believed the allocation to be inequitable. Experiment 2 (with female high school 
students) manipulated the presence or absence of voice and involved only a 
self-interested decision maker; also, a note from a co-worker either supported 
the decision maker's allocation or confirmed the subject's opinion that the allo- 
cation was inequitable. In both experiments, the impact of voice was mediated 
by knowledge about the co-worker’s opinion. When subjects had no knowledge 
of the co-worker's opinion (Experiment 1) or knew that the co-worker’s opinion 
coincided with the decision maker's allocation (Experiment 2), there was evi- 
dence for a “fair process effect”: Voice subjects expressed greater satisfaction 
than those with no voice. 


How do people know that they have been 
treated fairly? According to equity theory 
(Adams, 1965; Walster, Berscheid, & Wal- 
ster, 1973), a distribution of outcomes is con- 
sidered fair (equitable) if the ratio of out- 
comes to inputs is constant across people. 
Apart from considerations of equity, however, 
fairness judgments may also be affected by 
whether a distribution is the result of an ac- 
ceptable decision-making procedure (see the 
distinction between distributive and proce- 
dural justice in Folger, 1977; Leventhal, 
1976; and Thibaut & Walker, 1975). Deutsch 
(1975), in discussing how “injustice of deci- 
sion-making procedures” affects the percep- 
tion of justice, makes the following argument: 
“There is much social psychological research 


Preparation of this manuscript was facilitated by 
a National Science Foundation Postdoctoral Fellow- 
ship to the first author and by National Institute of 
Mental Health Grant 30968-01 to the first two 
authors. Philip Brickman graciously provided com- 
ments on an earlier draft. 

Requests for reprints should be sent to Robert 
Folger, Department of Psychology, Southern Meth- 
odist University, Dallas, Texas 75275. 


which would suggest that . . . [procedural] 
injustice is the most fundamental. The re- 
search to which I am referring indicates that 
people are more apt to accept decisions and 
their consequences if they have participated 
in making them” (p. 139). 
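
The equity criterion cited at the start of this passage (a constant ratio of outcomes to inputs across people) can be written as a simple check. What follows is only a minimal sketch and is not part of the original studies: the function name `is_equitable`, the tolerance, and the input and outcome values are illustrative assumptions.

```python
from math import isclose


def is_equitable(outcomes, inputs, rel_tol=1e-9):
    """Return True when every person's outcome/input ratio is the same."""
    if not outcomes or len(outcomes) != len(inputs):
        raise ValueError("outcomes and inputs must be non-empty and equal in length")
    if any(value <= 0 for value in inputs):
        raise ValueError("inputs must be positive for the ratio to be defined")
    ratios = [o / i for o, i in zip(outcomes, inputs)]
    return all(isclose(r, ratios[0], rel_tol=rel_tol) for r in ratios)


# Hypothetical workers who contribute 10 and 20 units of input.
proportional = is_equitable(outcomes=[2, 4], inputs=[10, 20])  # ratios 0.2 and 0.2
equal_split = is_equitable(outcomes=[3, 3], inputs=[10, 20])   # ratios 0.3 and 0.15
print(proportional, equal_split)
```

Under this definition an equal split counts as equitable only when inputs are also equal, which is why procedural considerations such as voice are treated in the text as a separate question from distributive equity.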

Indeed, many classic studies support 
Deutsch’s position. Lewin, Lippitt, and White 
(1939), for example, found that the unity of 
a group was greater under democratic leaders 
than under autocratic or laissez-faire leaders. 
Similarly, Shaw (1955) found group members 
to be more satisfied under nonauthoritarian 
than under authoritarian leaders. Another 
classic example is Leavitt’s (1951) research 
on communication networks, which found 
that the satisfaction of group members in- 
creased as the centrality of their position 
(opportunity to influence decisions) increased. 
Reviewing related research, Lawler (1975) 
concluded that workers’ participation in deci- 
sion making can lead to both greater worker 
satisfaction and higher productivity. 

There is, however, an ambiguity in this re- 
search. Positive feelings may reflect the con- 
sequences of participation rather than par- 
ticipation per se (i.e., increased participation 
in decision making might ordinarily lead to 
people actually obtaining more favorable out- 
comes). Although those given an opportunity 
to make their preferences known can hope to 
influence the outcome, it is clear that they 
may at times be unsuccessful. It is important, 
therefore, to distinguish between (a) the re- 
sponse to an opportunity to exercise a “voice” 
in decision making (cf. Folger, 1977; Hirsch- 
man, 1970)¹ and (b) the response to the out- 
come of having exercised voice. A key issue 
is whether voice is in itself sufficient to induce 
greater acceptance of decisions that result in 
inferior outcomes. 

Thibaut and Walker (1975) provide an 
encouraging answer to this question in sug- 
gesting that the opportunity to present evi- 
dence supporting one’s own case (voice) has a 
remarkable effect on a defendant’s satisfaction 
with the verdict. Their research has demon- 
strated (1975; see also LaTour, 1978; 
Walker, LaTour, Lind, & Thibaut, 1974) that 
the more such voice is available, the more an 
otherwise intolerable outcome (e.g., a guilty 
verdict imposed on those who think them- 
selves innocent) becomes relatively accept- 
able. 

We will refer to Thibaut and Walker’s pat- 
tern of results as the fair process effect. This 
effect refers to cases in which greater satisfac- 
tion results from giving people a voice in de- 
cisions. Thibaut and Walker’s results suggest 
that satisfaction with the procedure may also 
generalize to other aspects of the situation 
(e.g., the distribution of outcomes). Our in- 
vestigation concerns the generality of the fair 
process effect, and we begin by examining an 
experiment that produced the opposite results, 
a pattern that we will term the frustration 
effect. 
the fiscal policy was allegedly determined by 
a vote of all participants (a voice procedure). 
Bogus feedback about this vote showed that 
two corporations had voted for the equitable 
policy and two for the inequitable policy and 
that the government had cast the tie-break- 
ing vote that established the inequitable 
policy. In the “no-participation” condition, 
the government decided on the inequitable 
policy without consulting the corporations. 
The results showed that the proportion of 
taxes paid to taxes owed was approximately 
twice as high among no-participation sub- 
jects as among participation subjects (i.e., 
those who had voice were more likely to ex- 
press discontent with the inequitable policy 
by cheating on their taxes). Thibaut et al. 
explained these results, which are clearly 
opposite from a fair procedure effect, by sug- 
gesting that discontent is intensified when- 
ever a participative process raises hopes for a 
desired outcome and those hopes are dashed. 

Thus the proposed mechanism whereby 
voice can cause heightened displeasure is 
increased frustration due to raised expecta- 
tions (hence the term frustration effect). But 
if voice enhances frustration when outcomes 
are inferior, why was the opposite, fair pro- 
cess, effect evident in the LaTour (1978) and 
Walker et al. (1974) studies, which involved 
unfavorable outcomes? One possible answer 
(Thibaut, Note 1) is that although the deci- 
sion maker in these latter studies was a neu- 
tral or “disinterested” third party, the gov- 
ernment in the Thibaut et al. (1974) study 
was clearly voting for its own self-interest. 
However, Folger (1977) found a fair process 
effect under some conditions in which the 


1 The original use of the term voice by Hirschman 
(1970) was in the context of alternative responses 
to declining outcomes. At the same time, however, a 
more general usage was suggested by passages such 
as the following (pp. 16, 30): “Voice is political 
action par excellence”; “voice is nothing but a basic 
portion and function of any political system, known 
sometimes as ‘interest articulation.’” It is in this 
sense of interest articulation that we use the word 
voice in place of longer locutions such as “oppor- 
tunity to express opinions and preferences or to pre- 
sent facts relevant to one's position in the context 
of decision making.” 


VOICE AND INEQUITY 


allocator did benefit directly from this in- 
equity, as was the case in the Thibaut et al. 
study. Since this result demonstrates a fair 
process effect even when the allocator’s self- 
interest could have biased the decision, it 
appears that some other aspect of the Thi- 
baut et al. study was responsible for the 
frustration effect. 

We suggest that the frustration effect in 
the study of Thibaut et al. was due to the 
feedback subjects did or did not receive about 
the other subjects’ opinions. Since no vote 
was allowed in the no-participation condition, 
subjects in this condition did not know the 
fiscal policy the other corporations favored. 
In contrast, each subject in the participation 
condition learned that in addition to himself, 
one other corporation had opposed the in- 
equitable policy; thus these subjects had so- 
cial support for their opinions. 

The importance of such social support is 
highlighted by Asch’s (1952) conformity 
studies. When the majority’s unanimity was 
broken by a single confederate, the ordinarily 
substantial rate of conformity was markedly 
reduced. Perhaps a similar phenomenon oc- 
curred when participation subjects learned 
that one other corporation agreed with their 
own opinion: This support from a solitary 
peer may have reduced these subjects’ tend- 
encies to conform to the government’s opin- 
ion, thereby leaving them more dissatisfied 
with the inequitable result. Unfortunately, 
since voice and feedback about others’ opin- 
ions were confounded in the Thibaut et al. 
experiment, it is impossible to determine 
which factor was responsible for the obtained 
effect. The present line of investigation in- 
volves two experiments designed to shed light 
on this issue by manipulating procedure and 
feedback independently of one another. 

It is predicted that voice will interact with 
the feedback the subject receives concerning 
the other subject’s opinion. The exercise of 
voice should lead to more positive responses 
when subjects do not receive any feedback 
from others supporting their own opinion that 
the decision was unfair, conditions similar to 
those that yielded a fair process effect in the 
LaTour (1978) and Walker et al. (1974) 
studies. However, when subjects receive in- 
formation from others that confirms their be- 
liefs and indicates that the final decision was 
inequitable (as in the Thibaut et al. study), 
then voice may not lead to a fair process 
effect. Additionally, Experiment 1 will include 
a manipulation of whether the decision maker 
is biased or unbiased. As noted above, Thi- 
baut has suggested that voice may not lead to 
a fair process effect if the decision maker has 
some external inducement that might have 
biased his or her decision. 


Experiment 1 
Method 


Subjects. The subjects were 82 female undergrad- 
uates enrolled in an introductory psychology course. 
They received extra credit for participating in the 
experiment. Although the gender was restricted pri- 
marily for convenience, it should be noted that pre- 
vious research has suggested that females are more 
likely than males to prefer an equal division of out- 
comes (e.g., Leventhal & Lane, 1970). 

Procedure. The participants, scheduled in groups 
of three, were seated in separate rooms and listened 
to taped instructions. They were told that the study 
was simulating business decision making and that a 
random drawing would assign subjects to the roles 
of “decision maker” and “workers” (each person 
was actually assigned the worker role). Workers 
were to make small words from the letters of a 
larger word and to help evaluate the decision maker's 
effectiveness. There would be two sessions of four 
word-making trials, and on each trial the decision 
maker would select from a list of words a subset 
that would be given to the two workers (making 
the selection so that the number of possible smaller 
words would be maximized). The subjects were also 
told that they would receive lottery tickets for a $50 
prize as additional compensation. Nine tickets were 
to be distributed during each of the two work ses- 
sions, with three of those tickets going to the deci- 
sion maker. The decision maker would decide how to 
divide the remaining six tickets between the two 
workers. 

Subjects in the unbiased decision-maker condition 
were told that whoever drew the role of decision 
maker would keep that role for both work sessions, 
which meant that she would receive the same number 
of tickets regardless of how she chose to allocate the 
workers’ tickets. In the biased condition, the person 
initially chosen as decision maker would have that 
role for one session only. During the second session, 
one of the workers would become the new decision 
maker and would be in a position to determine how 
many tickets the original, first-session decision maker 
would receive during the second session. Thus, the 
decision maker’s ticket allocation during the first 
session might be biased because she would have an 
extrinsic reason to favor one worker (the one who 
would be the next decision maker) over the other. 

In all conditions, the drawing was rigged so that 
the subject believed that she would be a worker 
(Worker B) during both sessions. In the biased con- 
dition, she also believed that the other worker for 
the first session (Worker A) would be the decision 
maker during the second session. Note that subjects 
thought the initial decision maker knew which first 
session worker would be the decision maker during 
the second session (Worker A). 

After the first session, every subject was asked 
to complete (“for our records”) an “opinion card” 
on which she wrote her ideas of the fair way to 
divide the lottery tickets for that session. In the 
voice conditions, the experimenter then added the 
following statement: 


Even though the decision maker will be making 
the final decision about how the lottery tickets will 
be divided, one of you workers will have an op- 
portunity to let her know what you think is fair 
before she makes her decision. We will have a 
drawing and the worker who wins the drawing 
will get to have her opinion card shown to the 
decision maker. 


Subjects in the mute (no voice) condition were not 
given these additional instructions. 

[…] the subject filled out her opinion card, the 
experimenter counted the total number of smaller 
[…] to the decision […] 
subject’s […] allocation and […] feelings about the […] 
from “very dissatisfied” (1) to “very satisfied” […]. 
Finally, two questions asked the subjects to rate 
their decision maker on scales of “incompetent” […] 
to “competent” (11) and “inefficient” […] to “effi- 
cient” (11). Answers to these two questions were 
summed to obtain an average rating of the decision 
maker. 

After completing the questionnaire, the subjects 
were told that there was not enough time to 
finish the experiment. The experimenter at- 
tempted to assess any suspicions the subjects 
had and debriefed them. The subjects were also told 
that they would each receive three tickets for the 
lottery. A lottery drawing actually took place at the 
end of the semester, and the winner was given her 
$50. 


Results and Discussion 


The results were analyzed by a 2 × 2 × 2 
multivariate analysis of variance. This 
MANOVA revealed that the Feedback (No 
Feedback vs. Inequity-Confirmed) × Proce- 
dure (Voice vs. Mute) interaction was signif- 
icant, F(4, 71) = 2.77, p < .05. Also, the 
procedure main effect was marginally signif- 
icant, F(4, 71) = 2.07, p < .09. Given these 
results, univariate analyses of variance 
(ANOVAs) were examined to determine the na- 
ture of these effects. 

The univariate ANOVAs showed significant 
procedure main effects on the fairness of the 
process, the ratings of the decision maker, and 
the fairness of the allocation, F(1, 74) = 
7.45, p < .01; F(1, 74) = 4.30, p < .05; and 
F(1, 74) = 5.72, p < .05, respectively. As 
Table 1 shows, subjects in voice conditions 
generally expressed more positive feelings 
than those in the mute conditions. Note, how- 
ever, that the mute-voice differences are al- 
ways larger within the no-feedback condition; 
these differences are generally quite minimal 
in the inequity-confirmed condition and are 
even reversed slightly on the satisfaction mea- 
sure. 

In fact, the procedure main effect is qual- 
ified by the significant Feedback × Proce- 
dure interaction, indicating that the main 
effect was due almost entirely to the large 
differences within the no-feedback condition. 
This interaction was significant on the uni- 
variate analyses of the fairness of the deci- 
sion-making process and the subject’s satis- 
faction with the work situation, F(1, 74) = 
5.34, p < .05, and F(1, 74) = 4.89, p < .05, 





Table 1 
Responses to Inequity as a Function of 
Procedure and Feedback, Experiment 1 

                              Inequity confirmed      No feedback 
Measure                       Mute      Voice         Mute      Voice 
Fairness of the process        6.7       7.0           5.6       8.8 
Fairness of the decision       4.9       6.4           6.0       7.8 
Satisfaction                   8.1       7.7           7.6       9.5 
Ratings of decision maker*    14.6      15.3          15.0      18.4 

* Higher numbers indicate more positive ratings of 
the decision maker. 


respectively. As Table 1 demonstrates, voice 
did not lead to substantially more positive 
reactions than did mute in the inequity-con- 
firmed conditions (p> .50 for each of the 
four dependent measures), but it did lead to 
substantially more positive feelings for those 
subjects who were given no feedback and 
hence were not as convinced that they had 
been treated inequitably (p< .05 for fair- 
ness of the process and satisfaction with the 
work situation). 
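
The form of the Feedback × Procedure interaction can be illustrated from the cell means alone. The following is a minimal sketch rather than the analysis actually reported: it uses the fairness-of-the-process row of Table 1, reads the mute/inequity-confirmed entry as 6.7 (an assumption about the printed value), and uses hypothetical names (`means`, `voice_effect`, `interaction_contrast`). It yields descriptive differences only, not the F tests.

```python
# Cell means for "fairness of the process" as read from Table 1.
# The mute / inequity-confirmed entry is assumed to be 6.7.
means = {
    ("confirmed", "mute"): 6.7,
    ("confirmed", "voice"): 7.0,
    ("no_feedback", "mute"): 5.6,
    ("no_feedback", "voice"): 8.8,
}


def voice_effect(feedback):
    """Voice minus mute within one feedback condition (a simple effect)."""
    return round(means[(feedback, "voice")] - means[(feedback, "mute")], 2)


def interaction_contrast():
    """Difference between the two simple effects of voice."""
    return round(voice_effect("no_feedback") - voice_effect("confirmed"), 2)


print(voice_effect("no_feedback"))  # voice advantage without co-worker feedback
print(voice_effect("confirmed"))    # voice advantage when inequity was confirmed
print(interaction_contrast())
```

On these assumed values the voice advantage is 3.2 scale points without feedback and 0.3 when the co-worker confirmed the inequity, which is the pattern the interaction term summarizes.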

This experiment thus substantiates that 
voice sometimes tends to mitigate the distress 
associated with inequitable treatment; how- 
ever, the significant Feedback × Procedure 
interaction indicates that voice alleviates dis- 
tress only when the subjects do not receive 
a second opinion from their co-workers sug- 
gesting that the co-workers also feel that the 
decision was inequitable. When such an opin- 
ion is received (which reinforces the belief 
that the decision was unfair), voice does not 
make the subjects feel that the inequitable 
decision was any more satisfying. 

There was little evidence of any strong 
tendency for knowledge of a supportive opin- 
ion to make voice any less satisfying than the 
mute procedure, as it did in the Thibaut et al. 
study. However, it is possible that increased 
displeasure due to voice may be unique to 
situations in which decision makers stand to 
benefit directly from the allocation, as they 
did in the Thibaut et al. study. In our study, 
the decision maker could benefit indirectly in 
the biased conditions (if giving more lottery 
tickets to the worker who would be the next 
allocator would increase the chances of being 
overrewarded by her in return), but the in- 
direct nature of this advantage may have re- 
duced its salience, as is reflected in the ab- 
sence of any significant differences between 
the biased and unbiased conditions. Since Ex- 
periment 2 placed the decision maker in a 
position to benefit directly from reducing the 
payments to other people, further discussion 
of frustration effects will be postponed until 
after the presentation of that experiment. 


Experiment 2 
Method 


Subjects. In response to telephone solicitations, 61 
female high school students participated. Each under- 
stood that she would receive at least the minimum 
wage for participating for 1 hour in a study about 
“working conditions.” 

Procedure. Experiment 2 was conducted similarly 
to Experiment 1, with several exceptions. First, there 
was no biased-unbiased manipulation (Experiment 
2 used biased conditions only). During each work 
session, the decision maker supposedly had $3.00 to 
divide between herself and the two workers. Thus, 
by giving the workers less (her decision was allegedly 
to give the workers $.75 each), she was able to keep 
more for herself ($1.50). In keeping with this pro- 
cedure, it was further announced that the decision 
maker would remain in that role for the entire ex- 
periment (three sessions). 

Another difference was that the feedback manipula- 
tion was stronger. In the inequity-confirmed condi- 
tions, subjects were told that their co-worker had 
indicated on her opinion card that an even split 
of the $3.00 was fair. Since the decision maker sup- 
posedly gave the workers only $.75 while keeping 
$1.50 for herself, this feedback confirmed the sub- 
jects’ feelings that the decision was inequitable. In 
the inequity-disconfirmed conditions, the subjects 
again received information about their co-worker’s 
opinion, but this time the co-worker supposedly had 
indicated that $.75 for the workers and $1.50 for 
the decision maker was fair. Another difference be- 
tween the first and second studies was that the deci- 
sion maker in the latter was not told how many 
words the subjects had formed. Thus her decision 
was made (allegedly) without taking productivity 
into account. 
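
The arithmetic behind the Experiment 2 manipulation can be laid out explicitly. This is only an illustrative sketch: the amounts ($3.00 per session, $1.50 kept, $.75 to each worker) come from the procedure described above, while the names `alleged`, `even_split`, and `shortfall` are hypothetical and the even three-way split is simply the benchmark that the co-worker's opinion card endorsed in the inequity-confirmed conditions.

```python
TOTAL_CENTS = 300  # $3.00 available in each work session

# Allocation the decision maker allegedly made, in cents.
alleged = {"decision_maker": 150, "worker_a": 75, "worker_b": 75}


def even_split(total_cents, n_people):
    """Equal share per person, in cents."""
    return total_cents / n_people


def shortfall(allocation, total_cents):
    """Cents by which each person falls below (+) or above (-) an even split."""
    fair = even_split(total_cents, len(allocation))
    return {name: fair - amount for name, amount in allocation.items()}


assert sum(alleged.values()) == TOTAL_CENTS
gaps = shortfall(alleged, TOTAL_CENTS)
print(gaps)
```

Relative to an even split, each worker is 25 cents short and the decision maker is 50 cents over, so the decision maker's gain comes directly out of the workers' shares, unlike the indirect benefit available in Experiment 1.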

The final difference involved dependent measures. 
In addition to the measures used in Experiment 1, 
Experiment 2 also asked subjects to indicate how 
certain they were that the division of money that 
they had written down on their opinion card (before 
learning about the decision maker’s allocation or 
their co-worker’s opinion) was the fair way to split 
the money, on an 11-point scale from “[…] 
certain” to “very certain” (57 out of 61 had orig- 
inally indicated that an equal split was fair). An- 
other additional measure was assessed just after the 
second session. Subjects were told that there was not 
enough time to complete the third session, so the 
experiment would have to be terminated. This prob- 
lem had supposedly come up before, however, and 
in those cases it had been decided that since the de- 
cision maker was not selecting any words for the 
last session, there was no reason why the decision 
maker should distribute the money for that session. 
Thus subjects were told that one of the workers 
would decide how to divide the money for that ses- 
sion, and a drawing took place to decide which 
worker would make the decision, arranged so that 
the subject always won. The subject was then asked 
to make the allocation, and the amount given to the 
decision maker was used as a behavioral measure of 
her satisfaction with the decision maker. 


Table 2 
Responses to the Inequity as a Function of Procedure and Feedback, Experiment 2 

                              Inequity confirmed      Inequity disconfirmed 
Measure                       Mute      Voice         Mute      Voice 
Fairness of the process        …         …             6.2       8.7 
Fairness of the decision       …         …             …         … 
Satisfaction                   …         …             …         … 
Ratings of decision maker a   13.1       …             …         … 
Payment to decision maker     79.7       …             …         … 
Certainty                      9.9       9.4           …         … 

a Higher numbers indicate more positive ratings of the decision maker. 


Results and Discussion 


The results were initially analyzed by a 
2 × 2 MANOVA. This MANOVA indicated that 
both main effects and the interaction were 
significant, F(6, 51) = 2.46, p < .05, for the 
procedure main effect; F(6, 51) = 4.35, p < 
.005, for the feedback main effect; and F(6, 
51) = 2.61, p < .05, for the Feedback × Pro- 
cedure interaction. Univariate ANOVAs were 
then examined to determine the nature of 
[…] cell means (see Table 2) and appropriate con- 
trasts show that voice led to more positive 
feelings (in the form of more bonus money 
given to the allocator, greater perceptions of 
fairness, and less certainty that an equal split 
was fair) when the feedback from the other 
worker indicated that the decision was not 
inequitable (inequity-disconfirmed conditions, 
ps < .05), but that mute-voice differences 
tended to be slightly (although nonsignif- 
icantly) in the opposite direction in the in- 
equity-confirmed conditions. 
The feedback main effect was also signif- 
icant on each of these measures, as well as on 
the subjects’ ratings of the decision maker 
and the fairness of the decision (ps < .05). 
The cell means show that subjects were gen- 
erally more content with their low outcome 
when co-worker’s feedback indicated that 
there was no inequity than when it confirmed 
that the inequity existed. It is important to 
note that this difference was substantial only 
in the voice conditions, however, and was 
usually very small (or even slightly in the 
opposite direction) in the mute conditions. 
Finally, the procedure main effect was sig- 
nificant only on the certainty measure (p < 
.005). Voice subjects were significantly less 
certain that an even split was fair, but con- 
sistent with the interaction, this was signif- 
icant only in the inequity-disconfirmed condi- 
tion (p < .05). 

The obtained main effect for feedback is 
not surprising, given the evidence that a per- 
son’s discontent with a leader will be affected 
by whether a fellow member of the group en- 
dorses or expresses displeasure with the leader 
(Michener & Tausig, 1971; Michener & 
Lyons, 1972). What is striking, however, is 
the nature of the Feedback x Procedure in- 
teraction: Contrary to the trend within the 
inequity-confirmation conditions, inequity- 
disconfirmation subjects displayed more dis- 
content under mute than under voice proce- 
dures. The latter result is noteworthy since it 
means voice has a mitigating effect on discon- 
tent over and above that due to a disconfirma- 

tion of one's own opinion. Voice-disconfirma- 
tion subjects were not only less distressed 
than mute-disconfirmation subjects about 
their inequitable treatment but also seemed 
even somewhat pleased in an absolute sense. 
Voice-disconfirmation subjects’ ratings of the 
fairness of the process averaged nearly 9 on 
an 11-point scale, their average allocation to 
the decision maker was more than $1 of the 
available $3, and the certainty measure 
showed that voice-disconfirmation subjects 
even came to entertain some doubts about 
their initial judgment of what was fair. 


General Discussion 


The results from both of these experiments 
display a similar pattern, in which the voice 
procedure is associated with more positive 
affect than the mute procedure is under some 
circumstances (no-feedback conditions, Ex- 
periment 1; inequity-disconfirmation condi- 
tions, Experiment 2), whereas this tendency 
is virtually neutralized (Experiment 1) or 
even slightly reversed (Experiment 2) under 
other circumstances (inequity-confirmation 
conditions). This pattern is contrary to the 
commonsense prediction that a voice proce- 
dure should always produce less discontent 
than a mute procedure would because it is 
fairer, but the results are consistent with our 
contention that the positive impact of voice 
tends to be neutralized when the viewpoint 
expressed by a fellow recipient confirms one’s 
own opinion that the allocation was unfair. 
An understanding of why such a confirmation 
mitigates the desirability of voice can be 
gained by first considering why voice should 
be desirable. 

Voice may be desirable for two reasons.
First, voice may be preferable to mute procedures
because the latter are based on incomplete
information. A mute procedure may not
take into account the claims of disputants
and hence may entail an inferior, or at least
suspect, decision. Second, being given a voice
in the decision is considered to be a fairer
procedure than being given no voice. Someone
who is given voice at least has a chance to 
defend his/her position and present his/her 
side of the issue. The general preference for 
voice over mute procedures (Houlden, LaTour, 
Walker, & Thibaut, 1978; LaTour, 1978; 
Thibaut, Walker, LaTour, & Houlden, 1974; 
Walker et al., 1974) is so strong that dis- 
putants with the preponderance of evidence 
in their favor prefer a voice procedure over 
a mute procedure even though it may allow 
their opponents to present claims more 
strongly and could hence promote their op- 
ponents’ position. This preference is shared 
by the decision makers themselves when they 
are given absolute authority and thus might 
“efficiently” resolve the dispute without hav- 
ing to consult the disputants. 

Given the desirability of voice, it might be 
expected to affect positively the acceptability 
of a decision even when that decision pro- 
vides undesirable outcomes. Our results show 
this to be the case under two conditions: 
when people do not know whether anyone 
else would agree that they have been cheated 
(Experiment 1) and when they discover that 
a co-worker’s opinion disconfirms their belief 
that they have been cheated (Experiment 2).
Thus, when subjects were not certain that 
they had been treated inequitably, the voice 
procedure was rated fairer than the mute pro- 
cedure and there was a more favorable reac- 
tion to the decision maker and to the entire 
situation. 

On the other hand, when people learn that 
someone else agrees that they have been de- 
nied their just deserts (inequity-confirmation 
conditions), the positive impact of voice 
seems to be negated; indeed, our results show 
that when supportive social “evidence” is 
available, the fairness of the allocation procedure
becomes essentially irrelevant. Presumably
the raison d’être of having a fair procedure
is to prevent inequities by improving
the quality of information on which the allocation
decision is based, but this possible advantage
is obviously not present when it is
clear that an injustice has been done. Thus, 
although the voice procedure could have po- 
tentially been fairer than the mute procedure, 
when the voiced opinions seem to have been 
completely ignored such a procedure is no 
longer perceived to be any better than a mute 
procedure. Support for this view comes from 
our “fair process” measure, which showed 
that subjects in the inequity-confirmation 
conditions did not rate the voice procedure as 
any fairer than the mute procedure. 
Although the overall pattern of results does 
show that some circumstances are more likely 
than others to promote the fair process effect, 
the evidence regarding when voice procedures 
will lead to greater discontent (the frustration 
effect) is less conclusive. Frustration effect 
tendencies appeared only in Experiment 2, 
where they were nonsignificant. The main 
reason that the present frustration effect tend- 
encies were less pronounced than those in the 
Thibaut et al. (1974) experiment may be that
voice and inequity-confirmation were confounded
in the Thibaut et al. study, whereas
they were manipulated independently in the
present experiment. The discontent expressed
by participation (voice) subjects in the Thibaut
et al. study may have been due to the
fact that the voice subjects were given another
opinion that confirmed their feelings
[...] Subjects in [...] not receive [...]
any such supporting opinion. [...]
confirming the inequity can [...]
(Michener & Tausig, 1971), this [...]
have been responsible [...]
obtained frustration effect in their [...]

We should also point out, however, that
[...] have at times found frustration
[...]. For example, Austin, Williams, Wor[...]
Furthermore, the provision of voice under
circumstances in which outcomes improve after
voice (although cumulative outcomes remain 
inequitable) has been observed to create a 
frustration effect in at least two instances 
(certain conditions of an experiment by Fol- 
ger, 1977, and an earlier study by Thibaut, 
1950). These results indicate that the frustra- 
tion effect found by Thibaut et al. may not 
have been due solely to the confound between 
voice and inequity-confirmation. This possibility,
coupled with our finding that the
positive effects of voice can be neutralized
under certain circumstances, indicates the
need for further study of the conditions under
which voice can affect one’s satisfaction either
positively or negatively. Field studies, where
involvement level and applicability are high,
would be especially helpful.

Reference Notes 


1. Thibaut, J. Personal communication, October 25, 
1976. 

2. Austin, W., Williams, T. A. III, Worchel, S., 
Wentzel, A. A. & Siegel, D. Effect of mode of 
adjudication, presence of defense counsel, and 
favorability of verdict on observers’ evaluation of
a trial proceeding: An empirical study of proce- 
dural justice. Unpublished manuscript, 1978 
(Available from William Austin, Department of 
Psychology, Gilmer Hall, University of Virginia, 
Charlottesville, Virginia 22901.)

[...] variation of Byrne’s anonymous-stranger technique, a PDP-11 computer
[...] on five attitude issues for each of the [...]
design created by three levels of the p-x variable (−2, 0, +2) and three levels
of the o-x variable (−2, 0, +2). Measurements of p-o sentiment were used to
test the adequacy of three models for the quantification of balance: the
equal-weights tetrahedron model, the unequal-weights tetrahedron model, and the
Feather model. Correlations of obtained and predicted measures for the three
models were .48, .51, and .52, respectively, with various problems noted for
each model. The results further revealed a predicted shift in the polarity of
similar attitudes, but a predicted difference between ambivalent and indifferent
neutral attitudes was not supported.


Byrne (1969, 1971) has reported a sizable 
amount of evidence indicating that pre- 
acquaintance attraction is a linear function 
of the proportion of similar attitudes. Sim- 
ilarity is defined by Byrne as “any response 
on the same side of the neutral point as the 
subject’s response, and dissimilarity as any 
response on the opposite side of the neutral
point” (1971, p. 75). As Byrne is aware, such
a definition of similarity ignores the polarity
of the attitude. Thus on a +3 to −3 scale,
+3,+3 agreement should produce as much
attraction as +1,+1 agreement. In his own
words, “In effect, each item has been treated 
as if it were a 2-point scale with mild, mod- 
erate, and strong feelings on a given side of 
an issue defined as equivalents” (1971, p. 75). 

As a result of some research done by Nelson
(1965), Byrne did expand his formula
for predicting attraction so as to include the
discrepancy between the subject’s and the
other’s attitude (for a given side of the scale)
as well as the proportion [...] same side of [...].
According to Byrne, “If a subject is strongly
committed to racial integration, for example,
he not only prefers integrationists to
segregationists, he prefers those strongly in favor
(like himself) to those mildly in favor of
integration” (1971, p. 77). This expansion of
the formula, however, still did not take into
account the polarity of the attitude. Byrne's
formula predicts, for example, that two people
mildly in favor of integration should be just
as attracted as two people strongly in favor
of integration.

Consideration of the possible effect of polarity
(the difference between +3,+3 agreement
and +1,+1 agreement, for example)
leads fairly directly to concern with 0,0 agreement,
or attraction between two people who
agree in having a neutral attitude. Since
Byrne uses 6-point scales (possibly to simplify
his definition of similarity), the
issue of agreement between two people with
neutral attitudes does not arise. Nonetheless,
agreement between two people with neutral
attitudes does pose a theoretical issue of some
interest.

There are two quantitative formulations of
balance theory (Heider, 1946, 1958) that
have been developed or extended by Wellens
and Thistlethwaite (1971a) so as to imply
that attraction increases with increased
polarity of similar (or agreeing) attitudes.
These are Wiest’s (1965) tetrahedron model
and Feather's (1967) model. As is explained
below, the Feather model, as Wellens and
Thistlethwaite have extended it, only predicts
a change with a shift from 0 polarity, whereas
the Wiest model predicts a change with every
increment or decrement in polarity.

[...] be sent to Chester A. [...]ology, University of [...]
North Carolina 27514.

William Wiest had the ingenious idea of 
conceptualizing Heider’s p-o-x triad as a cube. 
One dimension of the cube consists of the 
minus to plus values of the p to o sentiment 
relation, one the minus to plus values of the
p to x sentiment relation, and one the minus
to plus values of the o to x sentiment relation.
Four of the corners of the cube represent 
balanced triads (the product of the three 
signs is positive), and four of the corners 
represent imbalanced triads. Connecting the 
balanced corners with straight lines creates a 
tetrahedron (a three-sided pyramid) inside 
the cube. Wellens and Thistlethwaite’s
(1971a, 1971b) representation of Wiest’s
three-dimensional space and tetrahedron is
presented in Figure 1. In Figure 1 the p-o
relation is symbolized as Z, the p-x relation
as X, and the o-x relation as Y.

Wiest assumed that all balanced triads are
on the surface of or within the tetrahedron.
However, the assumption regarding balanced
triads within the tetrahedron was modified by
two further considerations. First, since points
near the center of the tetrahedron are characterized
by the absence of any valence for
the three relations, they could better be described
as “vacuously balanced” (1965, p. 3).
Agreeing with Harary’s (1959) postulated
“tendency toward completeness,” Wiest assumes
that there is “a force toward making
cognitive elements relevant to one another”
(1965, p. 6) that moves points away from
vacuous balance to balance (or from the center
to the surface of the tetrahedron). The
second consideration having a bearing upon
the distribution of points (or triads) within
the tetrahedron is also a postulated force:
Osgood and Tannenbaum’s (1955) assumed
tendency toward maximum polarization of
sentiments or attitudes. Osgood and Tannenbaum
argued that attitudes have a tendency
to drift toward simpler, extreme judgments
and away from more difficult, discriminating
judgments. Wiest agrees with this postulated
tendency and assumes that it is a second
force moving points toward the surface of the
tetrahedron. Thus Wiest postulates that the
balanced points are, by and large, on the surface
of the tetrahedron.

Figure 1. Wiest’s three-dimensional space as defined
by the three relations (or dimensions) p-x (X), o-x
(Y), and p-o (Z). (From “An analysis of two
quantitative theories of cognitive balance” by A. R.
Wellens and D. L. Thistlethwaite, Psychological Review,
1971, 78, 141-150. Copyright 1971 by the
American Psychological Association. Reprinted by
permission.)

We believe that Osgood and Tannenbaum’s 
assumption of movement toward maximal po- 
larity is more reasonable if the neutral atti- 
tudes, or sentiments, are characterized by am- 
bivalence rather than indifference (cf. Kap- 
lan, 1972). An ambivalent-neutral attitude is 
an imbalanced one and thus according to bal- 
ance theory should be unstable. No special 
or additional assumption is required. In the 
case of indifferent attitudes, however, we are 
skeptical that there is any movement toward 
polarity. It should be further noted that if 
there is movement toward polarity on all 
three dimensions, the points should move to 
the corners of the tetrahedron and not just 
generally to the surfaces. 

Wiest interpreted his tetrahedron model as
having implications regarding the correlation
between any two of the relations, holding the
third constant at some value. The predictions
were derived by “slicing” the tetrahedron at
different points along a given relation (or
dimension) and then examining the expected
scatterplot for the relation between the two
remaining relations. Wiest tested the tetrahedron
model by calculating certain correlations
among the sociometric ratings of fifth,
sixth, and seventh grade children. The children
initially indicated their liking for each
of the other children in the class, and then
indicated the extent to which the highest,
intermediate, and lowest of these rated children
liked, or were liked by, each of the other
children in the class. Wiest confirmed the
prediction that the mean correlation between p’s
liking of o and o’s liking of q (or q’s liking of
o) should be largest for the most liked q and
smallest for the least liked q.

Wellens and Thistlethwaite (1971a) extended
Wiest’s model so as to enable prediction
from specified values for two of the relations
to some value for the third relation. If
we specify two values for X and Y on the
“bottom” of the cube [...] boundary. Wellens [...]
the following two [...] values on the upper [...]

$$Z_u = k - |X - Y|$$
$$Z_l = |X + Y| - k,$$

“where $Z_u$ represents the upper-boundary
limit for Z; $Z_l$ represents the lower-boundary
limit for Z; |X − Y| and |X + Y| correspond
to the absolute values of the difference
and sum, respectively, of the
given relations; and the integer k is a constant
representing the highest positive
value possible on each of the three interelement
relations” (p. 142).

Table 1
Predictions of the Upper- and Lower-Boundary Formulas When k = 2

                 Upper boundary: o-x value            Lower boundary: o-x value
p-x value    −2    −1     0    +1    +2     M      −2    −1     0    +1    +2     M
   +2        −2    −1     0     1     2     0      −2    −1     0     1     2     0
   +1        −1     0     1     2     1    .6      −1    −2    −1     0     1   −.6
    0         0     1     2     1     0    .8       0    −1    −2    −1     0   −.8
   −1         1     2     1     0    −1    .6       1     0    −1    −2    −1   −.6
   −2         2     1     0    −1    −2     0       2     1     0    −1    −2     0
    M         0    .6    .8    .6     0             0   −.6   −.8   −.6     0

Illustrative predictions for the upper- and
lower-boundary equations are presented in
Table 1. We have specified the dependent
variable as the p to o relation (Z), and have
assumed a 5-point plus-to-minus scale so that
k = 2. Since our main present concern is with
the polarity of similar attitudes, the
interesting values are in the diagonal running
from lower left to upper right. Note that the
upper-boundary formula predicts values that
are the same (+2), that is, do not differ as a
function of same-sign polarity. Except for the
specification of the 0,0 case, this agrees with
Byrne’s assumption. As Table 1 also indicates,
however, the lower-boundary formula predicts
a very marked effect for same-sign
polarity.
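
The boundary predictions discussed above can be tabulated directly. The following Python sketch is only an illustration and was not part of the original study; the function names are hypothetical, and it assumes the upper-boundary formula k − |X − Y| and the lower-boundary formula |X + Y| − k with k = 2.

```python
def upper_boundary(x, y, k=2):
    """Upper-boundary limit for the third relation: k - |X - Y|."""
    return k - abs(x - y)


def lower_boundary(x, y, k=2):
    """Lower-boundary limit for the third relation: |X + Y| - k."""
    return abs(x + y) - k


def boundary_table(formula, k=2):
    """Rows are p-x values from +k down to -k; columns are o-x values from -k up to +k."""
    values = range(-k, k + 1)
    return {x: [formula(x, y, k) for y in values] for x in reversed(values)}


if __name__ == "__main__":
    for name, formula in (("upper", upper_boundary), ("lower", lower_boundary)):
        print(name)
        for x, row in boundary_table(formula).items():
            print(f"{x:+d}", row)
```

Reading the same-sign diagonal (X = Y) from the two tables shows the contrast described in the text: the upper boundary is constant, whereas the lower boundary changes with polarity.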

Wellens and Thistlethwaite present a
“generalized tetrahedron formulation” that
predicts a single value for Z:

$$Z = aZ_u + bZ_l$$

where a and b are weights for the range limits.
Wellens and Thistlethwaite point out that the
values for a and b will vary according to the
assumptions made regarding the distribution
of points between the upper and lower
boundaries. Wiest assumed that the points tend
to go to the surface of the tetrahedron. If it
is also assumed that the points drift equally
toward the two boundaries, we have a further
basis for what Wellens and Thistlethwaite
refer to as an “equal-weights” tetrahedron
model in which a and b are both assigned
values of .5. Other distributions of points are,
of course, possible, and it is worth noting that
any symmetrical distribution of points could
provide justification for the equal-weights
model. To the extent that the distribution of
points is skewed, one boundary will be
weighted more heavily than the other. It does
appear implicit in Wellens and Thistlethwaite’s
formulation that a + b = 1. Otherwise
Z would not be an average of $Z_u$ and $Z_l$.

Illustrative predictions for the equal-weights
tetrahedron model are given in Table
2. Note in particular that the values along
the lower left to upper right diagonal show
an effect for polarity. As long as some weight
is given the lower-boundary formula, the
generalized tetrahedron model will predict greater
attraction with greater same-sign polarity.
Wellens and Thistlethwaite (1971a) found
that their data were best fit by a model in
which the upper boundary was weighted .75
and the lower .25. As the predicted values
for this unequal-weights model show (see
Table 2), there is a clear effect for same-sign
polarity.
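
The weighted combination described above can be sketched in a few lines of Python. This is an illustration rather than the procedure used in the original work; the function names are hypothetical, and the sketch assumes the generalized form a(k − |X − Y|) + b(|X + Y| − k) with k = 2.

```python
def tetrahedron_prediction(x, y, a=0.5, b=0.5, k=2):
    """Generalized tetrahedron prediction: a * upper boundary + b * lower boundary."""
    upper = k - abs(x - y)
    lower = abs(x + y) - k
    return a * upper + b * lower


def same_sign_diagonal(a, b, k=2):
    """Predictions for identical p-x and o-x values, from -k up to +k."""
    return [tetrahedron_prediction(v, v, a, b, k) for v in range(-k, k + 1)]


if __name__ == "__main__":
    print("equal weights  ", same_sign_diagonal(0.5, 0.5))
    print("unequal weights", same_sign_diagonal(0.75, 0.25))
```

With either weighting, the diagonal values rise as the shared attitude moves away from 0, which is the same-sign polarity effect noted in the text; the effect is weaker when the lower boundary receives less weight.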

The second approach to the quantification
of balance theory flows from Feather’s (1967,
1971) discrepancy principle. This principle
states that when two relations are of the same
sign, small discrepancies will be associated
with a strong positive, third relation, and
when two relations are of opposite sign, large
discrepancies will be associated with a strong
negative, third relation. Feather (1971) wrote
simple formulas for predicting the missing
relation when the signs of the known relations
are the same or opposite. Wellens and
Thistlethwaite’s (1971a) version of the two
formulas is as follows:

$$Z_s = k - |X - Y|$$
$$Z_o = -\frac{|X - Y|}{2},$$

where $Z_s$ is the missing relation when the
signs of the two known relations are the same,
$Z_o$ is the missing relation when the signs of
the two known relations are opposite, k is the
highest possible scale value, and X and Y
are the two known relations. When either X
or Y is 0, Wellens and Thistlethwaite recommend
averaging the values for $Z_s$ and $Z_o$.
Such a procedure seems more reasonable for
ambivalent-neutral attitudes than for
indifferent-neutral attitudes.

Illustrative predictions for the Feather 
model are given in Table 3. Note that the 
same-sign diagonal values show an effect
for polarity only at the 0,0 point. Since the 
same-sign formula is identical to the upper- 
boundary formula (which, however, also ap- 
plies to opposite signs), it is evident that the 
polarity shift at the 0,0 point is due to the 
formula for opposite signs. 
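
The Feather predictions can be sketched in Python as follows. This is an illustration only; the function names are hypothetical, and the sketch assumes that the same-sign formula is k − |X − Y|, that the opposite-sign formula is −|X − Y|/2, and that the two are averaged whenever either known relation is 0, as Wellens and Thistlethwaite recommend.

```python
def same_sign(x, y, k=2):
    """Same-sign formula: k - |X - Y|."""
    return k - abs(x - y)


def opposite_sign(x, y):
    """Opposite-sign formula (assumed here to be -|X - Y| / 2)."""
    return -abs(x - y) / 2


def feather_prediction(x, y, k=2):
    """Predicted third relation; neutral values average the two formulas."""
    if x == 0 or y == 0:
        return (same_sign(x, y, k) + opposite_sign(x, y)) / 2
    if (x > 0) == (y > 0):
        return same_sign(x, y, k)
    return opposite_sign(x, y)


if __name__ == "__main__":
    diagonal = [feather_prediction(v, v) for v in range(-2, 3)]
    print(diagonal)
```

On the same-sign diagonal every nonzero point receives the value k, and only the 0,0 point is pulled down by the opposite-sign formula, which is the pattern described in the text.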

Wellens and Thistlethwaite (1971a) used
a role-playing procedure to test the implications
of the various models for p to o attraction.
The results indicated that although all
models provided a significant fit to the data,
an unequal-weights tetrahedron model (see
Table 2) was descriptively superior to the
equal-weights tetrahedron and Feather models.
Wellens and Thistlethwaite (1971b) replicated
this basic result but also found that
for incomplete triads with p-x and o-x
dependent variables “the results were best
described by an equal-weights version of Wiest’s
tetrahedron model” (p. 82). Wellens and
Thistlethwaite (1971b) speculated that perhaps
the lesser importance, or lower weight,
of the lower-boundary formula for predicting
the p-o relation may be due to subjects’
reluctance to imagine negative relations with
others. (Note the abundance of negative relations
for the lower-boundary formula in
Table 1.)

Table 2
Predictions of the Equal-Weights and Unequal-Weights Tetrahedron Models When k = 2

                 Equal weights: o-x value             Unequal weights: o-x value
p-x value    −2    −1     0    +1    +2     M      −2    −1     0    +1    +2     M
   +2        −2    −1     0     1     2     0      −2    −1     0     1     2     0
  [...]

Note. For equal weights a = .5, b = .5. For unequal weights, a = .75, b = .25.

Table 3
Predictions of the Feather Model When
Neutral Attitudes Are a Composite (Average)
of the Same-Sign and Opposite-Sign
Predictions and k = 2

                      o-x value
p-x value    −2     −1      0     +1     +2      M
   +2        −2   −1.5    −.5      1      2    −.20
   +1      −1.5     −1    .25      2      1     .15
    0       −.5    .25    1.0    .25    −.5     .10
  [...]

The present investigation is a further study
of the adequacy of the various balance models
where the p-o relation, interpersonal attraction,
is the dependent variable. The present
investigation differs from the two groundbreaking
experiments by Wellens and Thistlethwaite
in three salient respects. First,
the general experimental procedure did not
involve hypothetical role playing but a variation
of Byrne’s anonymous-stranger technique.
Second, instructions regarding the extreme
[...] at the endpoints were used in
an attempt to improve the measurement.
Third, Kaplan’s (1972) technique was used
to measure and distinguish ambivalent and
indifferent neutral attitudes. Each of these
matters will be considered in turn.

The first procedural change was to use a
variation of Byrne's anonymous-stranger
technique. Each subject received information
regarding an anonymous stranger's attitude
on five different issues. For four of the five
issues the subject's attitude was −2, 0, or
+2, and for four of the five issues the other's
attitude was −2, 0, or +2. The exact patterns
used are shown in Table 4. For example,
in the +2,0 cell there are three instances
in which the subject's attitude is +2
and the other’s attitude is 0, one instance in
which the subject’s attitude is +1 and the
other’s attitude is 0, and one instance in which
the subject’s attitude is +2 and the other's
attitude is (randomly) either +1 or −1.
Such patterns were used in preference to
completely uniform pairings in an attempt to
create verisimilitude and to prevent a problem
noticed during pilot testing. During pilot
testing some of the subjects in cells with identical
subject and other attitudes told us that
they felt as if they were judging themselves.
The slight variation eliminated the problem
(and also reassured us that small scale differences
are meaningful). The use of such a five-item
procedure required that we initially pretest
so as to obtain a sufficient number of
items (or item responses) to enable any subject
to be randomly placed in any cell. The
procedure also required that a computer be
used to select the appropriate items regarding
which feedback is to be given.


Table 4
Patterns of the Feedback Items

                   o-x value
p-x value     −2        0       +2
   +2         −2        0      [...]
   +2         −2        0       +2
   +2         −1      −1/+1     +1
   +2        [...]      0       +2
   +1        [...]      0      [...]
  [...]


Table 5
Mean Predicted Attraction in Each Cell of the Design Based on the Equal-Weights Tetrahedron,
the Unequal-Weights Tetrahedron, and the Feather Models


From a theoretical perspective the problem
is that the Wiest and Feather models apply
most directly to one and not five issues. We
decided to calculate the Wiest and Feather
predictions for each item and then average
across the five items (weight each of the
items by 1/m). Mean predicted results for the
equal-weights tetrahedron model, the unequal-weights
tetrahedron model, and Feather’s
model are presented in Table 5. All three
models predict an interaction, but only the
equal-weights model does not predict p-x and
o-x main effects.
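
The item-averaging step described above can be sketched as follows. The Python code is illustrative only and uses hypothetical names; the example cell follows the +2,0 pattern described in the text (three +2,0 items, one +1,0 item, and one +2 item paired with a randomly chosen +1 or −1, taken here as +1), and the equal-weights model with k = 2 is assumed as the single-issue model.

```python
def equal_weights(x, y, k=2):
    """Equal-weights tetrahedron prediction for one issue."""
    return 0.5 * (k - abs(x - y)) + 0.5 * (abs(x + y) - k)


def mean_prediction(model, item_pairs):
    """Average a single-issue model over (p-x, o-x) pairs, each item weighted equally."""
    return sum(model(x, y) for x, y in item_pairs) / len(item_pairs)


# Hypothetical +2,0 cell in which the randomly signed fifth item happened to be +1.
cell_plus = [(2, 0), (2, 0), (2, 0), (1, 0), (2, 1)]
# The same cell when the fifth item happened to be -1.
cell_minus = [(2, 0), (2, 0), (2, 0), (1, 0), (2, -1)]

if __name__ == "__main__":
    print(mean_prediction(equal_weights, cell_plus))
    print(mean_prediction(equal_weights, cell_minus))
```

Because the sign of the fifth item is random, the two cases offset each other across subjects, so a cell mean near zero would be expected for this cell under the equal-weights model.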

The second procedural change involved an
attempt to improve the measurement. Anderson
(e.g., 1974) has repeatedly emphasized
the advantage of end-anchors for rating scales.
Anderson maintains that rating scales are
biased near their endpoints and that the
precaution of using end-anchors overcomes possible
ceiling effects that may result from such
bias. According to Anderson, end-anchors are
“stimuli that are higher or lower in value than
the regular experimental stimuli” (1974, p.
245). We did not use end-anchors in this
sense of the term, but we did attempt to
incorporate the spirit of the recommendation.
Thus subjects were initially instructed that
the −3,+3 extremes of the scale were only
for extraordinarily positive or negative feelings
and that such feelings were not common.
For the case of the initial instructional example,
“North Carolina,” they were told that
the endpoints should not be used unless “you
feel that there definitely isn’t any place in the
whole world better than it or worse than it.”


              Equal weights o-x              Unequal weights o-x            Feather (1967) o-x
p-x value    −2      0     +2      M       −2      0     +2      M       −2      0     +2      M
   +2      −1.61   −.01   1.60   −.01    −1.61    .09   1.60    .04    −1.81   −.36   1.60   −.23
    0       −.01    .00    .01    .00      .09    .80    .11    .45     −.35    .70   −.34    .18
   −2       1.61    .02  −1.64    .00     1.61    .13  −1.64    .06     1.61   −.32  −1.82   −.21
    M       −.00    .00    .00             .05    .45    .05            −.23    .18   −.22

Note. The slight variations from symmetry in these values are due to minor variations in the number of
items per subject and to random variation in the feedback patterns (see Table 4). Note also that the
marginals reflect the greater number of subjects in the 0 conditions.


A reminder that the endpoints were only for 
extreme feelings was periodically repeated on 
each subject’s viewing screen. With this pro- 
cedure the percentage of endpoint endorse- 
ment for all items was 5.86. 

Although the initial instructions and periodic
reminders may provide for a more uniform
interpretation of the scales, the general
“anchoring” procedure does create some
uncertainty regarding the k term in the various
models. With a 7-point scale, k = 3, whereas
with a 5-point scale, k = 2. There are a number
of considerations, however, that lead us to
believe that the most reasonable interpretation
is for k to equal 2. Twenty-five percent
of the subjects did not endorse a single endpoint
on any of the scales. For these subjects,
then, none of the items evoked feelings as
extreme as the abstract anchor, and the
effective scale was a 5-point scale. On the
other hand, 75% of the subjects did endorse
an endpoint for at least one item. Although
there was some scatter in the items producing
endpoint endorsement, the tendency was more
apparent for some items than for others. Six
of the 50 items had endpoint endorsements
above 10%. For example, “homosexuality”
produced 19% endorsement at −3, and “one
true religion” produced 9% endorsement at
+3 and 10% endorsement at −3. Since the
computer only selected items that a given
subject had endorsed at −2, 0, or +2, it is
arguable that the 75% of the subjects who
endorsed one or more endpoints really discovered
their own idiosyncratic anchors, and
thus for purposes of the items utilized in the
manipulations there was a functional 5-point
scale. To the extent that the 75% of the subjects
who did endorse one or more endpoints
and the 25% of the subjects who did not endorse
any endpoints at all had functional
5-point scales, no subjects should have endorsed
an endpoint of any of the three dependent
variables. In fact only 1 of the 224
subjects did so.

Beyond this it is important to note that
whether k is 2 or 3 has less effect on the
predicted values than might be initially thought.
There is no effect at all for the equal-weights
model, since the k terms exactly cancel out.
For the unequal-weights model, the predicted
values in each cell are all altered by the same
constant, so that the magnitudes of the
predicted main effects and interactions are
exactly the same. For the Feather model the
variation in k affects only the cells involving
same signs or 0, with the result that the
predicted main effects do not vary in magnitude,
but the predicted interaction is somewhat
smaller when k = 2. As far as the predicted
effects for the analysis of variance are
concerned, then, [...]tion between [...]
the value for [...]
(since k deter[...]
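
The claims about the tetrahedron models and the choice of k can be checked numerically. The Python sketch below is illustrative only, with hypothetical names; it assumes the generalized form a(k − |X − Y|) + b(|X + Y| − k) and the 3 × 3 grid of p-x and o-x values (−2, 0, +2) used in the design.

```python
def tetrahedron(x, y, a, b, k):
    """Generalized tetrahedron prediction: a * (k - |X - Y|) + b * (|X + Y| - k)."""
    return a * (k - abs(x - y)) + b * (abs(x + y) - k)


# The nine (p-x, o-x) cells of the design.
CELLS = [(x, y) for x in (-2, 0, 2) for y in (-2, 0, 2)]


def shift_when_k_changes(a, b, k_from=2, k_to=3):
    """Set of per-cell changes in the prediction when k moves from k_from to k_to."""
    return {
        round(tetrahedron(x, y, a, b, k_to) - tetrahedron(x, y, a, b, k_from), 10)
        for x, y in CELLS
    }


if __name__ == "__main__":
    print("equal weights  ", shift_when_k_changes(0.5, 0.5))
    print("unequal weights", shift_when_k_changes(0.75, 0.25))
```

A single value in each set means every cell changes by the same amount, so differences among cells (and hence main effects and interactions) are unchanged; for equal weights that amount is zero because the k terms cancel.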
Finally, with regard [...]
interesting technique [...]
ambivalence. After rating an issue
[...] concept on [...]
7-point, bipolar, semantic-differential scale,
subjects use a 4-point, unipolar scale (0 to 3)
to rate a concept’s good qualities ignoring its
negative ones, and then a 4-point, unipolar
scale (0 to −3) to rate a concept's bad qualities
ignoring its positive ones. Reassuringly,
Kaplan found that the bipolar ratings ($A$)
were highly correlated with the simple sum of
the unipolar positive ($A_p$) and the unipolar
negative ($A_n$) ratings. More relevant to present
purposes, however, Kaplan used the ratings
to develop a mathematical index of ambivalence
(AMB). According to Kaplan,

$$\mathrm{AMB} = \mathrm{TA} - \mathrm{POL}$$

(ambivalence equals total affect, TA, minus
polarity, POL), where

$$\mathrm{TA} = A_p + |A_n|$$

and

$$\mathrm{POL} = |A_p + A_n| \approx |A|.$$

It is assumed that a neutral attitude with low
ambivalence has high indifference.
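
Kaplan's index is simple to compute from the two unipolar ratings. The Python sketch below is illustrative only; the function and argument names are hypothetical, and it assumes a positive rating from 0 to 3, a negative rating from 0 to −3, and polarity computed as |A_p + A_n|.

```python
def ambivalence(a_pos, a_neg):
    """Kaplan-style ambivalence: total affect minus polarity.

    a_pos is the unipolar positive rating (0 to 3);
    a_neg is the unipolar negative rating (0 to -3).
    """
    total_affect = a_pos + abs(a_neg)
    polarity = abs(a_pos + a_neg)
    return total_affect - polarity


if __name__ == "__main__":
    print(ambivalence(3, -3))  # strong positive and strong negative qualities
    print(ambivalence(0, 0))   # no qualities of either kind (indifference)
    print(ambivalence(3, 0))   # purely positive evaluation
```

Under these assumptions the index is zero whenever one of the two unipolar ratings is zero, and it grows with the smaller of the two magnitudes, so two attitudes with the same neutral net rating can differ sharply in ambivalence.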

For the present study half of the neutral
p-x attitudes were high ambivalence and half
low ambivalence (or high indifference). Since
an initial pilot test seemed to indicate that
the distinction had little effect, we decided to
incorporate an analogous distinction into the
o-x relation. Perhaps, for example, ambivalent
subjects are more attracted to ambivalent
others than to indifferent others. Since
we were somewhat skeptical that all subjects
could adequately cope with feedback for three
different scales for each of the five issues, we
adopted a somewhat simpler procedure. Subjects
were told that the other subject whose
attitudes they were seeing had not been asked
to mark the unipolar scales, as they had done,
but simply to check whether they were “conflicted”
or “not conflicted.” The meaning of
the term “conflicted” was carefully described
so as to involve the inclusion of both positive
and negative qualities. Accordingly, half of
the subjects receiving neutral feedback had
the “conflicted” alternative checked for all
five issues and half had the “not conflicted”
alternative checked for all five issues. The
subjects receiving either −2 or +2 feedback
had the “not conflicted” alternative checked.
It is mathematically impossible for subjects
with highly polarized attitudes to be conflicted
or ambivalent.

* Kaplan (p. 369) uses $|A|$ and $|A_p + A_n|$ interchangeably
in discussion of polarity. In the present
study the second term was used in the calculation
of ambivalence ($\mathrm{POL} = |A_p + A_n|$).

i| Since we were initially uncertain regarding 
how to include the above ambivalence-indif- 
‘ference distinction in the model testing, we 
tBecided to do a two-step analysis in which the 
first analysis related solely to the p-x neutral 
lls. If the ambivalence-indifference distinc- 
n made no difference, the cells would be 
llapsed. If the distinction did make a differ- 
ce, the two types of neutral attitudes would 
separately included with the remaining 


ub jects 


The subjects were 224 students (80 male and 144 
ale) from the introductory psychology class at 
University of North Carolina who participated 
partial fulfillment of a course requirement. 


nde pendent Variables 


The design can be conceived of as involving two
four-level factors (p-x by o-x). The four levels of
the p-x factor are −2, 0 indifference, 0 ambivalence,
and +2. The four levels of the o-x variable are −2
nonconflicted, 0 nonconflicted, 0 conflicted, and +2
nonconflicted.
The p-x factor was manipulated by having the
PDP-11 search for five items that conformed to the
subject’s randomly assigned condition. If the subject
were assigned to a 0 attitude condition, the computer
calculated the ambivalence scores for all these items,
ranked the scores, and then selected either the four
highest or the four lowest. A similar procedure was
followed for the fifth, +1 or −1, item. Mean ambivalence
for all five items was .19 for 0 indifference and 1.63
for 0 ambivalence.² Whether the fifth item was +1
or −1 was randomly determined.
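
The item-selection step described above can be illustrated with a short sketch. It assumes that the ambivalence index is total affect minus polarity, with polarity taken as the absolute value of the sum of the positive and negative unipolar ratings (the polarity term mentioned in the footnote on Kaplan). The total-affect term, the function names, and the sample ratings are illustrative assumptions; none of them is taken from the study's PDP-11 program.

```python
def ambivalence(pos, neg):
    """Ambivalence = total affect - polarity (assumed form of the index).

    pos is the positive unipolar rating (0..3); neg is the negative
    unipolar rating written as a non-positive number (0..-3).
    """
    total_affect = abs(pos) + abs(neg)
    polarity = abs(pos + neg)
    return total_affect - polarity


def pick_neutral_items(items, condition, n_items=4):
    """Rank bipolar-neutral items by ambivalence and keep n_items of them.

    items: list of (name, bipolar, pos, neg) tuples.
    condition: "ambivalence" keeps the highest scores,
               "indifference" keeps the lowest.
    """
    neutral = [(name, ambivalence(pos, neg))
               for name, bipolar, pos, neg in items if bipolar == 0]
    ranked = sorted(neutral, key=lambda pair: pair[1],
                    reverse=(condition == "ambivalence"))
    return ranked[:n_items]


# Hypothetical ratings for illustration only (not data from the study).
ratings = [
    ("divorce", 0, 3, -3),
    ("communes", 0, 0, 0),
    ("gun control", 0, 2, -2),
    ("honor system", 0, 1, -1),
    ("nuclear energy", 0, 0, 0),
    ("supersonic planes", 2, 3, -1),
]

ambivalent_items = pick_neutral_items(ratings, "ambivalence")
indifferent_items = pick_neutral_items(ratings, "indifference")
```

Under this assumed index an item rated high on both unipolar scales scores as ambivalent, whereas an item rated near zero on both scores as indifferent, even though both items sit at 0 on the bipolar scale.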
Although the initial set of 50 items had been
selected so as to maximize the probability of always
finding five appropriate items for any randomly
assigned condition, in 17 of the 224 cases we were
forced to use fewer than five items. In 11 cases four
items were used, and in 6 cases three items were
used. Since omitting these subjects from the various
multivariate and univariate tests made no difference
in the results, all subsequently reported analyses
include these 17 subjects.
The o-x factor was manipulated by having the
computer provide feedback regarding the other’s
attitude according to the pattern in Table 4.
Information was also provided regarding whether the
subject was conflicted. Except for the 0 conflicted case,
the nonconflicted alternative was always indicated.


Procedure 


The experiment was conducted on a PDP-11 com- 
puter. Subjects were tested in separate, soundproof 
cubicles. The instructions and other information 
were presented to each subject on a cathode ray tube 
screen, and subjects responded by using different keys 
on a teletypewriter. 

Upon arrival subjects were informed, consistent 
with information provided on the “sign-up” sheet, 
that they were going to participate in two different 
experiments, one concerned with attitude measure- 
ment and one concerned with impression formation 
on the basis of limited information. Subjects were 
then directed to separate cubicles and were instructed 
regarding the use of the teletype. Each subject was 
randomly assigned to a cell of the design, and the
cell number was fed into the computer. 

In the first experiment subjects were presented 
with 50 attitude items relating to a miscellaneous set 
of issues (supersonic planes, gun-control laws, the 
honor system, communes, nuclear energy, divorce, 
etc.). Subjects initially responded to the items on a 
7-point, bipolar scale, then on a 4-point, positive, 
unipolar scale, and finally on a 4-point, negative, 
unipolar scale. The bipolar scale was labeled:
extremely in favor, 1 = (+3); strongly in favor, 2 =
(+2); mildly in favor, 3 = (+1); undecided or
neutral, 4 = (0); mildly against, 5 = (−1); strongly
against, 6 = (−2); extremely against, 7 = (−3).
Instructions for the bipolar scale used the example of
“North Carolina”:


You should type the number of any of these op- 
tions which best represents your feelings about 
North Carolina. For example, if you feel very 
positively about North Carolina you should type 
2 (strongly in favor). The two endpoints 1 (+3) 
and 7 (—3) are indicators of very extreme feel- 
ings which are not common in the majority of 
cases. You may use them only if you have an 
extraordinarily positive or negative feeling about 
an issue (e.g., in the case of North Carolina, if you 
feel that there definitely isn’t any place in the 
whole world better than it or worse than it!). 


2 If POL = |A|, mean ambivalence is .67 for
indifference and 1.88 for ambivalence.


The above instructions, along with additional
information indicating that the 0 point could indicate
either conflicting feelings or indifference, were
followed by a practice example. The first 4 items and
then every 10th item were preceded by the following
reminder:


Please indicate your feelings about the following
statement by typing the number of your selected
option. Remember that options 1 (+3) and 7
(−3) are for very, very extreme feelings that are
not common in many cases.


Consistent with Kaplan's (1972) suggestion, the 
positive, unipolar rating was introduced with in- 
structions to rate the positive qualities, ignoring the 
negative. The example was “ice cream” (taste vs.
calories).
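
The bipolar response coding described in the procedure reduces to a one-line conversion from the typed option to its signed score. The sketch below is illustrative only; the function name and the label table are assumptions and are not part of the original program.

```python
# Labels for the seven typed options, as listed in the procedure.
BIPOLAR_LABELS = {
    1: "extremely in favor",
    2: "strongly in favor",
    3: "mildly in favor",
    4: "undecided or neutral",
    5: "mildly against",
    6: "strongly against",
    7: "extremely against",
}


def bipolar_score(option):
    """Convert a typed option (1..7) to its signed score (+3..-3)."""
    if option not in BIPOLAR_LABELS:
        raise ValueError("option must be an integer from 1 to 7")
    return 4 - option


# Signed score for every option, e.g. option 2 ("strongly in favor") -> +2.
scores = {option: bipolar_score(option) for option in BIPOLAR_LABELS}
```

Option 4 maps to 0, which is the point that the instructions describe as covering both conflicting feelings and indifference.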

Following the negative, unipolar rating, the second 
experiment was introduced. Subjects were asked to 
press a button if they wished to participate in the 
second experiment. The instructions were in the 
vein of the typical “judgment about others on the 
basis of limited information,” except for the inclu- 
sion of detailed information regarding the other's 
indication of whether he or she was “conflicted” or 
“not conflicted.” The other was described as some 
other University of North Carolina undergraduate 
of the same sex as the subject. At the time this per- 
son rated the items, the unipolar scales had not been 
used; rather a simple decision regarding the
conflicted-nonconflicted dichotomy was made. Again an
example was given. The feedback regarding the
other’s bipolar responses followed the pattern in
Table 4. The information regarding the other's
response to each item was shown on the screen for 20
sec, and at the same time a copy was printed on the 
teletype for later reference. 

After the fifth item was presented, the subjects
were given the several items of Byrne’s Interpersonal
Judgment Scale (IJS). Subjects were again reminded
that the endpoints were for extreme feelings.

Finally, subjects were taken to a different room
and were asked to fill out a brief questionnaire that
asked for further ratings of the other, as well as
information regarding their general impressions of the
other and also opinions they had regarding the
purpose of the experiment.


Dependent Variables 


The main dependent vari A 
ing of variable js 


in an experiment, The final question: 
you if you were to meet.” 


ny > Was an expectation that liki 
Perceived reciprocal liking liking and 


ABBAS TASHAKKORI AND CHESTER A. INSKO 


Results 


MANOVA 


A two-factor (4 × 4) multivariate analysis
of variance (MANOVA) of the three dependent
variables (liking, working, and reciprocal
liking) revealed a significant interaction,
F(26, 602) = 4.41, p < .01, and no significant
main effects for p-x, F(9, 501) = 1.62, or
o-x, F(9, 501) = 0.01, p < .51. These results
are in agreement with the predictions of the
equal-weights tetrahedron model and in partial
disagreement with the unequal-weights
and Feather models (since they also predict
main effects). Since, as is explained below, the
different types of neutral attitude produced no
significant variation, we will initially examine
these dependent variables with the different
types of neutral attitude collapsed. Such a
procedure facilitates testing of the various
models.


Liking 


A 3 × 3 analysis of variance (ANOVA) of
the liking scores revealed a significant interaction,
F(4, 215) = 20.29, p < .01, and no
significant main effects for p-x, F(2, 215) =
1.36, and o-x, F(2, 215) = .59. A trend
analysis of the interaction revealed both a
significant linear by linear component, F(1, 215)
= 69.08, p < .01, and a significant quadratic
by quadratic component, F(1, 215) = 11.4,
p < .01. The liking means are in Table 6. As
the interaction indicates, there is a tendency
for the means to increase across the top row
and to decrease across the bottom row. A
trend analysis of just the top, +2, row
reveals a significant linear trend, F(1, 215) =
28.37, p < .01, and a nonsignificant quadratic
trend, F(1, 215) = .83. Across the middle,
0, row the linear trend is nonsignificant,
F(1, 215) = .42, but the quadratic trend is
significant, F(1, 215) = 7.09, p < .01. Across
the bottom, −2, row the linear trend is
significant, F(1, 215) = 41.32, p < .01, as is
the quadratic trend, p < .01. Finally, in view
of our particular interest in polarity it is
relevant to note that similar extreme
attitudes (+2,+2 or −2,−2) produced
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Liking, Preference As a Co-worker, and Reciprocal Liking 


Liking o-x 
-2 0 +2 M —2 
—.64 A4 143 .27 —.50 
21 64 OF 38 21 
164 —18 —.86 .11 1.07 
36 jl 16 — 25 
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Working o-x Reciprocation 0-x 
0 +2 M =2 0 Fh M 
—14 114 .09 —43 .50 1.36 48 
334.) 18,27 36.89 39.63 
—14 —.64 04 1.43 .29 —.21 .45 
0" 7 21 43.64 48 


more attraction than similar neutral
attitudes, F(1, 215) = 13.94, p < .01.
All of the theoretical models predict interactions,
but only the equal-weights model
does not predict main effects. Since no main
effects were obtained, this result provides support
for the equal-weights model. On the
other hand, the nature of the obtained interaction
is only partially in accord with the
equal-weights model. As the theoretical values
in Table 5 indicate, the equal-weights model
predicts a linear by linear interaction with an
increasing linear trend across the +2 row, no
trend across the 0 row, and a decreasing linear
trend across the −2 row. These predicted
linear effects are significant. There are also
significant quadratic effects, however.
Particularly noteworthy is the quadratic trend
across the 0 row produced by the rise in the
0,0 cell. This quadratic or curvilinear trend
is interesting because such an effect is predicted
by both the unequal-weights and
Feather models (see Table 5).
Finally, there is also a significant quadratic
trend across the −2 row. This curvilinear
trend appears to result from the fact that the
−2,−2 mean (1.64) is not as discrepant from
the 0,0 mean (−.18) as is the −2,+2 mean
(−.86). Stated somewhat differently, the
same-sign mean is more polarized in the positive
direction than the opposite-sign mean is
in the negative direction. Although the curvilinear
trend is not significant across the top,
+2, row, there is a similar tendency for the
same-sign (+2,+2) mean (1.43) to be more
polarized in the positive direction than the


same-sign and opposite-sign means (for the
corner cells) is significant, F(1, 215) = 8.60,
p < .01.
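
The row-wise trends discussed above can be made concrete with the standard orthogonal polynomial coefficients for three equally spaced levels. The sketch below is only an illustration: it assumes that the three means quoted in the text for the −2 row (1.64, −.18, −.86) are that row's cells for o-x = −2, 0, and +2, in that order, and it computes contrast values only. The reported F tests also depend on cell sizes and the error term, which are not reproduced here.

```python
LINEAR = (-1, 0, 1)      # linear trend coefficients for three levels
QUADRATIC = (1, -2, 1)   # quadratic trend coefficients for three levels


def contrast(means, weights):
    """Weighted sum of cell means for one trend contrast."""
    return sum(m * w for m, w in zip(means, weights))


# Liking means quoted in the text for the -2 row (assumed cell order).
row_minus2 = (1.64, -0.18, -0.86)

linear_value = contrast(row_minus2, LINEAR)
quadratic_value = contrast(row_minus2, QUADRATIC)

# Same-sign corner versus opposite-sign corner, compared in absolute value.
asymmetry = abs(row_minus2[0]) - abs(row_minus2[2])
```

A negative linear value corresponds to the decreasing trend across the row, and a nonzero quadratic value reflects the fact that the middle mean does not lie halfway between the two corner means.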

Table 7 presents the correlations and mean
squared deviations between obtained and predicted
values for the various Wiest and
Feather formulations. For sake of comparability
with Wellens and Thistlethwaite’s analyses,
the calculations have been done separately
for upper- and lower-boundary formulas and
also for three different formulations of the
Feather model. The composite model uses the
average of the same- and opposite-sign formulas
when p-x or o-x is neutral. The same-sign
predictions are based on the same-sign formula
for neutral attitudes, and the opposite-sign
predictions are based on the opposite-sign
formula for neutral attitudes. In all other
respects the three Feather formulations are
identical. The two Wiest models differ according
to whether the weights are equal (a =
.5, b = .5) or unequal (a = .75, b = .25). In
terms of mean squared deviations from pre- 


Table 7
Correlations Between and Mean Squared Deviations of
Obtained and Predicted Liking Values

Basis of prediction MSD (k = 2) MSD (k = 3) r
Upper-boundary formula 1.36 2.79 .48* 
Lower-boundary formula 2.42 5.01 .28* 
Equal-weights model 1.22 1:22 .48* 
Unequal-weights model 1.13 1.34 pole 
Feather same-sign 1.41 2.93 .46*
Feather opposite-sign 1.79 1.93 .50* 
Feather composite model 1.24 1.39 By be 
* p < .01.
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dictions, the equal-weights, unequal-weights, 
and Feather models are descriptively smallest,
and the lower-boundary formula is descriptively
largest. This is true for both k = 2 and
k = 3. Since no appropriate tests of significance
are possible, some care should be exercised
in interpreting these results.

In terms of the correlation of predicted and 
obtained results, the coefficients for the three 
models are approximately the same. In this 
instance it is possible to do tests of signif- 
icance—assuming one is not concerned with 
the problem of repeated testing. As it turned 
out, however, that concern is not relevant, 
since comparisons among all possible pairs of 
the three models (equal weights, unequal
weights, and Feather composite) produced p
values greater than .05 in every case.

The data in Table 7 descriptively indicate
that the lower-boundary formula has both the
largest mean squared deviation and the lowest
correlation. A stepwise regression analysis,
however, reveals that the lower-boundary
formula is necessary to the Wiest formulation.
With only the best predictor (upper-boundary
formula) in the model, the correlation is
.48. When the lower-boundary formula is
added, the (multiple) correlation increases to
.51 and the difference is statistically significant,
F(1, 221) = 29.00, p < .01. The multiple
regression equation for k = 2 is as follows:


Liking = .14 + .49 (upper) + .20 (lower).


With k = 3, the intercept changes from .14
to −.15.
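
As a worked illustration of the regression equation reported above, the sketch below computes a predicted liking score from given upper- and lower-boundary values and rescales the two weights so that they sum to 1. The intercepts and weights are the reported ones (.14 for k = 2, −.15 for k = 3, .49 and .20); the function names and the example inputs are assumptions, and the boundary formulas themselves are not reproduced here.

```python
INTERCEPT_K2 = 0.14   # reported intercept for k = 2
INTERCEPT_K3 = -0.15  # reported intercept for k = 3
W_UPPER = 0.49        # reported weight for the upper-boundary prediction
W_LOWER = 0.20        # reported weight for the lower-boundary prediction


def predicted_liking(upper, lower, intercept=INTERCEPT_K2):
    """Liking predicted from the two boundary values."""
    return intercept + W_UPPER * upper + W_LOWER * lower


def rescale_to_unit_sum(w_upper, w_lower):
    """Rescale two weights proportionally so that they sum to 1."""
    total = w_upper + w_lower
    return w_upper / total, w_lower / total


# Hypothetical boundary values of 1.0 each, for illustration only.
example_k2 = predicted_liking(1.0, 1.0)
example_k3 = predicted_liking(1.0, 1.0, intercept=INTERCEPT_K3)

# Relative weighting of the two boundaries, roughly .71 and .29.
rel_upper, rel_lower = rescale_to_unit_sum(W_UPPER, W_LOWER)
```

Rescaling makes the regression weights directly comparable with a priori weightings such as the equal (.5, .5) and unequal (.75, .25) versions of the Wiest model.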
It is interesting to note that the regression
equation gives greater weight to the upper
boundary than to the lower boundary, and that
the difference between the weights is significant,
F(1, 221) = 7.36, p < .01. Thus the
assumption of “equal weights” is not justified
for the present data.
In view of the fact that the generalized 


¢ as to guar- 

These weights, .43 (upper) and .18 (lower), are
not appreciably changed from the previous
values of .49 and .20. When the weights are
rescaled so as to sum to 1, they are
approximately .70 and .30, not greatly different
from Wellens and Thistlethwaite’s “unequal
weights” of .75 and .25.
Examination of the lower-left to upper-right
diagonal values in Table 1 makes it clear
that the lower-boundary formula predicts a
polarity shift for same-sign values. The reason
for this is that the |X + Y| term of the
lower-boundary formula is exactly equal to a
direct measure of polarity, |X| + |Y|, for
same-sign values. This suggests the use of
stepwise regression to demonstrate further
the importance of polarity. The two terms
are |X − Y| from the upper-boundary formula
and the direct measure of polarity, |X|
+ |Y|. The best predictor, |X − Y|, correlates
.48 with liking, and the correlation
increases to .52, significantly higher, F(1, 221)
= 38.20, p < .01, when polarity is added. For
same-sign values the first term, |X − Y|, is a
measure of discrepancy. This suggests that
the limited success of the tetrahedron model
is at least partially due to the fact that it
includes simultaneous measures of both
discrepancy and polarity.
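
The identity behind this argument, that |X + Y| equals |X| + |Y| whenever X and Y share a sign, can be checked directly. The sketch below uses hypothetical function names and an illustrative grid of attitude values from −2 to +2; it does not reproduce the full boundary formulas.

```python
def discrepancy(x, y):
    """|X - Y|: the term taken from the upper-boundary formula."""
    return abs(x - y)


def polarity(x, y):
    """|X| + |Y|: the direct measure of polarity."""
    return abs(x) + abs(y)


def lower_boundary_term(x, y):
    """|X + Y|: the term taken from the lower-boundary formula."""
    return abs(x + y)


values = [-2, -1, 0, 1, 2]
same_sign = [(x, y) for x in values for y in values if x * y >= 0]
opposite_sign = [(x, y) for x in values for y in values if x * y < 0]

# For same-sign pairs the lower-boundary term equals polarity exactly.
identity_holds = all(lower_boundary_term(x, y) == polarity(x, y)
                     for x, y in same_sign)

# For opposite-sign pairs it is strictly smaller than polarity.
strictly_smaller = all(lower_boundary_term(x, y) < polarity(x, y)
                       for x, y in opposite_sign)
```

For a same-sign pair such as (2, 2) the discrepancy term is 0 while polarity is 4, which is why a model containing both terms can register a same-sign shift in polarity.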


Working 


Table 6 reports the mean working (preference
for working within an experiment) results.
An analysis of variance revealed a significant
interaction, F(4, 215) = 10.53, p < .01,
and no significant main effect for p-x,
F(2, 215) = 1.17, p < .31, or o-x, F(2, 215)
= .50, p < .61. Both the linear component of
the interaction, F(1, 215) = 37.93, p < .01,
and the quadratic component of the interaction,
F(1, 215) = 4.12, p < .05, are significant.
Trend analyses of the separate rows
revealed a linear effect in the +2 row,
F(1, 215) = 18.7, p < .01, and a linear effect in
the −2 row, F(1, 215) = 19.78, p < .01. No
other trends, linear or quadratic, are significant.
The quadratic interaction appears to
result partially from the somewhat greater
deviation from the 0 level of the same-sign
than from the opposite-sign corner means. A
comparison of the absolute value of the same-sign
corner cells with the absolute values of
the opposite-sign corner means is marginal,
F(1, 215) = 3.85, p < .06. These results are
a somewhat weak parallel of the liking results.
As was the case for liking, similar extreme
attitudes (+2,+2 or −2,−2) produced more
attraction than similar neutral attitudes,
F(1, 215) = 12.14, p < .01.


Reciprocal Liking 


Table 6 reports mean reciprocal liking (perceived
reciprocal liking) results. An analysis
of variance revealed a significant interaction,
F(4, 215) = 17.54, p < .01, and no significant
main effect for p-x, F(2, 215) = 1.20,
p < .30, or o-x, F(2, 215) = 1.49, p < .23.
Both the linear component of the interaction,
F(1, 215) = 59.93, p < .01, and the quadratic
component of the interaction, F(1, 215)
= 8.90, p < .01, are significant. Trend analyses
of the separate rows revealed significant
linear effects in the +2 row, F(1, 215) =
32.51, p < .01, and −2 row, F(1, 215) =
27.52, p < .01. There is also a significant
quadratic trend in the 0 row, F(1, 215) =
10.94, p < .01. No other trends are significant.
A comparison of the absolute value of
the same-sign corner cells with the absolute
value of the unlike corner cells is significant,
F(1, 215) = 23.67, p < .01. Except for the
lack of a significant quadratic trend in the
−2 row, these results parallel those for liking.
Again, similar extreme attitudes (+2,+2 or
−2,−2) produced a greater effect than similar
neutral attitudes, F(1, 215) = 40.49,
p < .01.


Reactions to Neutral Others 


In view of the fact that one of the major
purposes of the present study is to examine
the effect of neutral attitudes, it is of interest
to note some of the written comments regarding
neutral others. The postexperimental questionnaire
contained a question that asked subjects
to indicate whether or not they had any
specific individual in mind when responding
to questions about the other person. It is in-


attitudes. Here is a sample of some of the
comments: “someone who is wishy-washy or
uninformed,” “a blah sort of person in that he
did not feel any way about any issue,” “someone
indifferent to the society,” “(someone
with) little feelings about current issues,”
“(one who) didn’t care about anything,”
“some dumb female who didn’t care about
anything but her own private world,” “(I
don’t) know anybody who has no opinion,”
“(a person who) didn’t have any opinion
about anything,” “a person who wouldn’t
make up his mind on anything,” and “someone
strongly individualistic enough to not
really care.”


Table 8
Mean Liking of Ambivalent and Indifferent
Neutral Subjects

                    o-x value
p-x value    −2n     0n      0c     +2n      M
0 ind        .86     .93     .93    −.14    .64
0 amb       −.43     .57     .14     .21    .12
M            .21     .75     .54     .03

Note. Subscript n symbolizes nonconflicted and
subscript c conflicted.

Even some of the subjects who were them- 
selves neutral expressed negative feelings 
about the neutral other. Here are some of 
their comments: “a freshman (who) didn’t 
seem to be extremely well-informed, close to 
myself,” “a friend of mine who is always 
neutral about everything,” “(a person who) 
is not well acquainted with current events, 
just like me,” “(a person who) was not very 
intelligent, someone who was indecisive on a 
lot of topics,” “a very indecisive person,” “a 
very unopinionated (person),” and “(a per- 
son) who does not keep up with current 
events.” In view of the fact that these com- 
ments came in response to a very indirect 
question (regarding whether or not they had 
a specific individual in mind), their negative 
tone is somewhat surprising. Some subjects at 
least appear to have generalized beyond the 
small amount of conveyed information so as 
to regard the other as “empty headed.” 


Type of Neutral Attitude 


Table 8 presents the mean liking results
for the ambivalent and indifferent levels of
the p-x factor and the cross-cutting levels of
the o-x factor (some of which are conflicted
and some of which are nonconflicted). As
previously indicated, a 4 × 4 multivariate
analysis of variance for liking, working, and
reciprocal liking revealed that neither main
effect (p-x or o-x) is significant. Our planned
comparisons, however, relate to the interaction
within the four center cells (see Table
8). Specifically, it was predicted that attraction
would be greater for the indifferent-nonconflicted
and ambivalent-conflicted cells
than for the ambivalent-nonconflicted and
indifferent-conflicted cells. A multivariate test
of this interaction is nonsignificant, as are
also each of the three univariate tests. In the
case of liking and working, the interaction is
in fact in the “wrong” direction.

In order to be somewhat more certain about
the above findings, an internal analysis was
done by selecting subjects who differed more
extremely on ambivalence-indifference. The
results were not appreciably changed.


Within-Cell Distributions 


One final matter relates to the within-cell
distribution of liking scores. Wellens and
Thistlethwaite (1971a, 1971b) report that
their within-cell distributions were not noticeably
bimodal and took this as evidence against
Wiest’s assumption that the points inside the
tetrahedron go toward the two surfaces or
boundaries. Such a test, however, rests on the
assumption that individuals will markedly
differ in the extent to which upper- or lower-


Discussion


The obtained results leave little doubt that
a same-sign shift in polarity from +2,+2 or
−2,−2 to 0,0 has a marked effect on interpersonal
attraction. This result was obtained
for all measures, direct and indirect. Beyond
this, the results give some qualified support to
the equal-weights tetrahedron, unequal-weights
tetrahedron, and Feather model, with
no unqualified support for any one of the
models. The equal-weights tetrahedron model
is the only one that predicts the obtained
pattern of an interaction and no main effects.
According to the equal-weights tetrahedron
model, however, the interaction should be
entirely linear (see Table 5). In fact, both the
linear and quadratic interactions are significant
for all three dependent variables (liking,
working, and reciprocal liking). One aspect of
this quadratic interaction is a tendency for a
curvilinear trend to appear in the 0 row. Such
a trend (which is not predicted by the equal-weights
model but is predicted by the
unequal-weights and Feather models) is significant
for both liking and reciprocal liking and
nonsignificant but in the same direction for
working (see Table 6). These ANOVA results,
then, are not entirely in accord with any of
the three models.

The mean squared deviations of the obtained
from predicted values and the correlation
between the obtained and predicted
values are approximately the same for the
three models (see Table 7). The three versions
of the Feather formulation are fairly
similar, with a slight descriptive edge going
to the composite model. The most apparent
thing from the Table 7 measures is the relatively
lower correlation and higher mean
squared deviation for the lower-boundary
formula. This impression is further buttressed
by the finding that the multiple regression
weight for the upper boundary is significantly
greater than for the lower boundary. This
rather clearly indicates that the data are not
consistent with the equal-weights model.
Overall, then, we are left with a somewhat
mixed picture.

One of the most apparent problems is the
failure of the subjects to report as much
dislike as any of the models predict. Particularly
noteworthy is the failure of the opposite-sign
(or dissimilar) 2,2 cells to be as extreme in
the negative direction as the same-sign (or
similar) 2,2 cells are in the positive direction
(see Table 6). Judging from the numerous
studies reported by Byrne, our subjects’
reluctance to express extreme dislike when there
is high dissimilarity is a typical occurrence.
What is responsible for this phenomenon? A
brief digression suggests a possible answer.
Aronson and Worchel (1966) have advanced
the interesting idea that the similarity-attraction
effect is due, “at least in part,
to an implicit assumption that people who
hold attitudes similar to our own will like
us” (p. 157). By extrapolation one could also
argue that there is an implicit assumption that
people who hold attributes dissimilar to our
own will dislike us. The reciprocal liking results
for the present study (see Table 6) do
indeed indicate that similarity information
does have implications for perceived reciprocal
liking (and disliking). Insko, Thompson,
Stroebe, Shaud, Pinner, and Layton (1973)
obtained similar results, and, in addition,
found further support for this “implied
evaluation” hypothesis with an experimental
manipulation of evaluative feedback. As predicted
by this hypothesis, the slope of the
similarity-attraction effect was less when
evaluation feedback was either positive or
negative than when it was absent. Similar
results were obtained by Byrne and Rhamey
(1965) and also by Clore and Baldridge
(1970).

There is, therefore, evidence in support of
this implied-evaluation hypothesis. Since balance
theory, as stated by Heider (1958),
clearly asserts that reciprocal, p-o liking-disliking
is balanced, the implied-evaluation
hypothesis can be regarded as a type of balance
explanation of the similarity-attraction effect.
It is an explanation that focuses attention on
balance in the p-o, o-p cycle, as opposed to
an emphasis upon balance in the p-o-x cycle.


ts of p’s phenomenology and assumes
that the basic tendency to maintain or achieve
balance always operates. It can be further
noted that balance theory allows for the possibility
of any of a number of causal sequences
among the three variables of similarity, p-o
sentiment, and o-p sentiment. Aronson and
Worchel’s hypothesis suggests that similarity
affects p-o sentiment directly and also indirectly
via the mediation of o-p sentiment.
Other possibilities exist, however; for example,
that similarity causes p-o sentiment, which
causes o-p sentiment, which feeds back to p-o
sentiment (cf. Insko et al., 1973).

For present purposes, however, it is interesting
to note that the (direct or indirect)
inference from similarity to reciprocal liking
is stronger than is the inference from dissimilarity
to reciprocal disliking. As the data in
Table 6 indicate (and consistent with the less
marked but still evident results for liking),
the opposite-sign 2,2 cells are not as extreme
in the negative direction as the same-sign 2,2
cells are in the positive direction. Why should
this be? There is nothing in the p-o, o-p cycle
per se that would predict such a result. It is,
however, evident that for the typical person
of high self-esteem the simultaneous occurrence
of p-o liking and o-p liking produces
agreement regarding the worth of the self,
but the simultaneous occurrence of p-o disliking
and o-p disliking produces disagreement
regarding the worth of the self. Common
sense certainly suggests that there is some
tendency to avoid disagreement regarding the
worth of the self. What is the reason for such
avoidance? Social comparison theory (Festinger,
1954), of course, predicts a general
avoidance of all disagreement, assuming that
the os are similar. On the other hand, balance
theory more simply predicts the same
effect. Assuming a positively evaluated self,
and that disagreement implies negative evaluation,
it is imbalanced for the self (+) to be


3 It is interesting to note that with k = 3, the
unequal-weights and Feather models actually predict
that the opposite-sign, corner means should not be as
negative as the same-sign, corner means are positive.
The obtained data, however, show this tendency
much more markedly than is predicted under this
circumstance.


(+) negatively evaluated (−). In this instance
the verb “to be” implies equality, or
similar grouping, and is thus a positive unit
relation. Whatever the theoretical basis for
this avoidance of disagreement phenomenon,
its assumed reality provides a possible explanation
for the failure of the opposite-sign,
high-dissimilarity cells to produce the predicted
amount of disliking. The inference
from dissimilarity to p-o disliking would be
weakened because of the suggestion of o-p
disliking, which (for the typical high self-esteem
person) implies disagreement regarding
the worth of the self. Such a complex feedback
system could account for the lesser polarity
of the opposite-sign, corner cells. Incorporation
of such speculation into a model that
includes some version of the Wiest or Feather
formulations represents a major theoretical
challenge.

Aside from such complexities, close examination
of the Feather model makes apparent
that it predicts some fairly implausible results
in the neighborhood of 0,0. Thus with X and
Y both equal to .001, the predicted Z = 2;
however, with opposite signs so that X =
−.001 and Y = .001, the predicted Z = .001.
The problem arises from the fact that the
same- and opposite-sign formulas Predict such 
ons results, and as both X and Y ap- 
proach 0, such large differenc 
teasonable.* x ee 

Finally, it is clear that the experiment did
not support our initial expectations regarding
the difference between ambivalent- and indifferent-neutral
attitudes. We think that Kaplan’s
technique is an interesting one, and that


Dur data, however, do not j 
distinction is important in the 
terpersonal attraction, In the 
Tesearch, one pro 
did 


In general, however, this
issue needs to be addressed. It may be that
the more that is known about an object, the
less likely it is that the attitude toward the
object will be indifferent.

4 We are indebted to the action editor for
pointing this out to us.

5 We were amazed to learn that some subjects thought
that “socialized medicine” had to do with the kind
of training that MDs receive.


Personality Structure and the Circumplex
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computer simulation employing the responses of hypothetical responde 
EEN solely in terms of Jackson’s threshold model dor stylistic respe r 
reproduced closely the circumplex structure reported by Wiggins as repr: 
ing adjective trait endorsements in the interpersonal domain of peno 
Wiggins’ data are interpreted as defining two general dimensions, the fir i 
resenting the differential salience of trait desirability, and the second, di 


The circu 


plex structuring of trait content is considered too simple a model for descr: 


£ 


the complexities of personality and is judged to be inconsistent with much « 


modern measurement theory, which highlights convergent and discriminant \ 


lidity, simple structure, 


There is much that is commendable in Wiggins’
(1979) approach to the classification of
personality. It is systematic in that it seeks
to distinguish definitionally a number of domains
of enduring personality dispositions or
traits; it is scholarly in that it seeks to incorporate
a number of previous conceptualizations;
it is integrative in that it seeks to relate,
relatively exhaustively, diverse traits;
and it is empirically based because certain
data are presented that purport to support a
particular interpretation of trait structure.
We do not take issue with the value to personality
theory of systematic taxonomic efforts.
In this paper we take issue primarily
with Wiggins’ interpretation of his empirical
data and its supposed relevance to the structuring
of personality trait content in the
interpersonal domain.


Jackson, Depari 


I 
Western Ontario, London, Ontari 


and the mutual independence of traits 


Wiggins distinguished interpersonal traits from those derived from other
domains, such as those based on “temperament,” “character,” and “qualities
of mind as manifested in thought, perception, and speech.” Within the domain
of interpersonal traits, he has identified eight theoretical variables, labeled
(with accompanying abbreviations): Gregarious-Extraverted (NO),
Ambitious-Dominant (PA), Arrogant-Calculating (BC), Cold-Quarrelsome (DE),
Aloof-Introverted (FG), Lazy-Submissive (HI), Unassuming-Ingenuous (JK),
and Warm-Agreeable (LM). He hypothesizes that measures of adjacent variables
should be correlated, and if arrayed in a circle, correlations should decrease
between variables as a function of their mutual distance along the
circumference of the circle.
Wiggins proceeded to identify 16 specific trait dimensions, and on the basis
of rational criteria as well as empirical item response statistics and
desirability ratings, he selected eight adjectives to define each trait.
Similar pairs of traits were combined into broader categories yielding eight
measures corresponding to the eight theoretical traits. These were
intercorrelated and factored by principal components. A plotting of the two
largest dimensions yielded the circular array of traits, apparently confirming
the hypothesized structure. Wiggins seems to attach psychological importance
to this structure, corresponding to what Guttman (1954) termed a circumplex.


We shall present evidence that Wiggins’ findings regarding personality trait
structure are subject to a very plausible alternative interpretation. We shall
also suggest that his approach to the measurement of personality is not
consistent with modern measurement theory and shall argue that as a model for
describing behavioral consistency, a circumplex structuring of personality
traits is an incomplete oversimplification. The latter is not to say that
important processes may not underlie the reported circumplex structure. We
shall argue that although they are not uniquely relevant to particular traits,
these processes are of fundamental importance in assessment and interpersonal
behavior.


An Interpretation of Processes Underlying 
the Circumplex 


There is now considerable evidence that 
responses to personality questionnaires and 
judgments of the personality of others are re- 
lated to the desirability level of the items be- 
ing judged or endorsed. The probability of a 
true response to a personality statement or 
adjective by a group of respondents, for example, has been found to be a
function of the judged desirability of the item. This finding has been
observed over a wide variety of personality item content and respondent
populations. Of course, this functional relationship, as Norman (1967) has
noted, applies to item statistics and scale values, not to individual
differences. But it is possible to compute individual probabilities of
responding to items at similar levels of desirability by grouping items
with similar scale values. When this is done, individuals show diverse
patterns of responding (Jackson & Messick, 1969). Some respond in a manner
highly related to desirability scale values; some respond as if they were
unaware of desirability. For the former one
might say that desirability is salient; for the

latter it apparently is not.

Figure 1. Description of the threshold model. (Ordinate: probability of a
true response, P(t). Abscissa: desirability scale value. The figure derives
the probability of a true response on Item j for Subject i with the following
parameters: Th = desirability responding threshold = 5; Si = salience
parameter = 1.25; dj = desirability of item = 7; P(t) = probability of a
true response = .94.)

Furthermore, when


one plots the functional relation between an 
individual’s frequency of a true response to 
items having different levels of desirability 
scale values, one finds that some persons tend 
to endorse many characteristics as self-de- 
scriptive in the middle range of desirabil- 
ity, whereas others do so only at rela- 
tively higher levels of desirability. This im- 
plies different thresholds for responding 
desirably. 

These two parameters, the individual sa- 
lience of the desirability dimension (or the 
respondent’s sensitivity to desirability) and 
the threshold for responding desirably in in- 
teraction with a stimulus parameter, the 
judged desirability scale value of a personal- 
ity item, were the central features of the 
threshold formulation for stylistic respond- 
ing, proposed a decade ago (cf. Voyce & Jack- 
son, 1977; Jackson, Note 1). This model is 
represented in Figure 1. Using this formula- 
tion, in which salience corresponds to the 
slope of the individual function relating prob- 
ability of responding true and the threshold 
corresponds to some median probability of a 
true response, processes underlying previous 
findings (Edwards, 1963; Jackson & Messick, 
1961, 1962a) have been clarified. Measures 
of salience and threshold, for example, have 
been found to show highest loadings in the 
two-dimensional circumplex structure found
in measures of psychopathology and were
found to be very strongly associated with the 
number of items endorsed as self-descriptive 
(Voyce & Jackson, 1977). 

Might the same processes operate in the 
self-characterization endorsement of adjec- 
tives? There is suggestive evidence that they 
might. Edwards (1957) found a strong de- 
sirability factor in adjective endorsement. 
Jackson and Lay (1968) and Morf and Jack- 
son (1972; Bentler, Jackson, & Messick, 
1971) report two factors in endorsing adjec- 
tives, one associated with the tendency to 
endorse desirable adjectives (related to re- 
sponding desirably to personality items) and 
the other associated with endorsing many ad- 
jectives (linked in a second-order factor to 
the tendency to endorse many personality 
items as self-descriptive). A similar circum- 
plex patterning of correlations and factor 
loadings was reported for three factor analyses 
of the Minnesota Multiphasic Personality In- 
ventory (MMPI; Jackson & Messick, 1961, 
1962a, b). The question arises, what would 
happen if a group of respondents answered 
the Wiggins adjectives in terms of the thresh- 
old model, holding constant all other phe- 
nomena? Would they yield a circumplex sim- 
ilar to that of Wiggins? With human respon- 
dents one has no way of knowing with cer- 
tainty that they would follow instructions to 
answer in terms of the threshold model, even 
if it could be communicated to them, and to 
ignore other influences, such as item content.
But it is a relatively simple matter to program
a computer to simulate the responses of a
number of hypothetical subjects instructed to
answer solely in terms of the threshold model
(Rogers, 1971). A gr ...


Computer Simulation of Adjective 
Endorsements 


A computer program was ... to simulate the responses of 500 subjects. ...
primary source of ... desirability for the adjectives ... Norman list, and
scale values ... from the Goldberg data (Note 3). The value for one adjective
(organized) could not be located and was given a neutral scale value of 5.0.
All scale values are based on the mean of the mean ratings of the two sexes
reported by Norman (Note 2) and Goldberg (Note 3).

The simulation program predicted responses to the 128 adjectives for each of
the 500 subjects on a 9-point Likert scale. The program first generated two
independent, random numbers from a standard normal distribution. One number
was rescaled, using a predetermined value for mean and standard deviation, to
the salience parameter. Corresponding values were used to rescale the second
number as a threshold. The value of threshold was then converted to the
equivalent value of the intercept of the straight line illustrated in
Figure 1, using a value of 5 for the value of the ordinate in the equation
of the line.


To predict item responses, the values of the threshold (intercept) and the
salience (slope) were used in the linear prediction equation with the social
desirability scale values for the adjectives. The predicted response was
rounded to the nearest integer value (1 to 9), which was then taken as the
value of the response to that adjective. Responses were then summed to form
scores for the eight scales.
The threshold model describes the process by which a person ascribes traits
as self-descriptive at different levels of desirability. This adds a slight
complication in the case of negations. Consider the trait uncrafty. A
“false” (or “uncharacteristic”) response to uncrafty implies the presence of
the trait crafty. Thus in the case of items containing a negation, it is
necessary to consider the desirability scale value of the unnegated
adjective. The Wiggins scale contains 46 such negations in four octant
scales. The social desirability scale values for unnegated forms of these
adjectives (obtained from Norman, Note 2) were used in the simulation. Such
values for the unnegated forms of four adjectives could not be located. For
these adjectives, the complement of the scale value of
the negation was used. Referring to Figure 1, a false response to a negation
has the same interpretation as a true response to the original. This has an
important bearing on interpretation of the Wiggins results, as it requires
the complementation of the predicted responses. It will be demonstrated that
this can be interpreted as accounting for the second principal component.

Figure 2. Plot of first two principal components from computer simulation.
(PA = Ambitious-Dominant. BC = Arrogant-Calculating. DE = Cold-Quarrelsome.
FG = Aloof-Introverted. HI = Lazy-Submissive. JK = Unassuming-Ingenuous.
LM = Warm-Agreeable. NO = Gregarious-Extraverted.)
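
The simulation procedure described above can be sketched in a few lines of
Python. This is a minimal illustration under stated assumptions, not the
program used in the study. The function names, the default means and standard
deviations used to rescale the two random numbers (the text says only that
they were predetermined), the clipping of predictions to the 1 to 9 range,
and the use of 10 minus the rating as the complement for a negated item are
all assumptions made for this sketch.

```python
import math
import random


def simulate_subject(desirability, negated, rng,
                     salience_mean=1.0, salience_sd=0.25,
                     threshold_mean=5.0, threshold_sd=1.0):
    """Return one hypothetical subject's 1-9 ratings for a list of adjectives.

    desirability[j] is the scale value of adjective j; for a negated adjective
    it is the scale value of the unnegated form. The rescaling means and SDs
    are illustrative placeholders, not values taken from the article.
    """
    # Two independent standard normal deviates, rescaled to the two parameters.
    salience = salience_mean + salience_sd * rng.gauss(0.0, 1.0)    # slope
    threshold = threshold_mean + threshold_sd * rng.gauss(0.0, 1.0)
    # Assumption: the line passes through (threshold, 5), fixing its intercept.
    intercept = 5.0 - salience * threshold

    ratings = []
    for value, is_negation in zip(desirability, negated):
        predicted = intercept + salience * value
        rating = int(math.floor(predicted + 0.5))   # round to nearest integer
        rating = max(1, min(9, rating))             # stay on the 9-point scale
        if is_negation:
            rating = 10 - rating                    # assumed complement
        ratings.append(rating)
    return ratings


def scale_scores(ratings, scale_of_item, n_scales):
    """Sum item ratings into scale scores; scale_of_item[j] indexes the scale."""
    scores = [0] * n_scales
    for rating, scale in zip(ratings, scale_of_item):
        scores[scale] += rating
    return scores
```

With both standard deviations set to zero, a subject with salience 1 and
threshold 5 simply reproduces each scale value, and a negated item whose
unnegated form has a scale value of 7 receives a rating of 3.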

Two principal components were extracted from the correlation matrix for the
“responses” to the items of eight scales of the 500 hypothetical respondents.
Two components accounted for 94.4% of the variance. These are plotted in
Figure 2. A comparison with the Wiggins figures indicates a very similar
structure. The differences consist of an overlap of NO and LM and a different
ordering of the DE-FG-HI triad. It should be noted that the salience
parameter loads highly on Factor 1 and the threshold parameter has a
substantial loading on Factor 2. The strikingly similar pattern of results
suggests that processes underlying the threshold model provide a plausible
interpretation of the Wiggins ...
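
The principal components step can be illustrated in the same spirit. The
article does not say what software or algorithm was used, so the following is
only a sketch: it computes a correlation matrix from subject-by-scale scores
and extracts the largest components by power iteration with deflation, using
the standard library alone. The function names are hypothetical, and a
numerical library would normally be preferred.

```python
import math


def correlation_matrix(rows):
    """Pearson correlations among the columns of rows (subjects x scales).

    Assumes every column has nonzero variance.
    """
    n, k = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(k)]
    centered = [[row[j] - means[j] for j in range(k)] for row in rows]
    norms = [math.sqrt(sum(row[j] ** 2 for row in centered)) for j in range(k)]
    return [[sum(row[a] * row[b] for row in centered) / (norms[a] * norms[b])
             for b in range(k)] for a in range(k)]


def principal_components(corr, how_many=2, iterations=1000):
    """Return (eigenvalue, proportion of variance, loadings) for the largest components."""
    k = len(corr)
    work = [row[:] for row in corr]
    results = []
    for _component in range(how_many):
        # Arbitrary, non-symmetric starting vector, normalized to unit length.
        vec = [1.0 + 0.1 * j for j in range(k)]
        length = math.sqrt(sum(x * x for x in vec))
        vec = [x / length for x in vec]
        for _step in range(iterations):
            nxt = [sum(work[a][b] * vec[b] for b in range(k)) for a in range(k)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:
                break
            vec = [x / norm for x in nxt]
        # Rayleigh quotient gives the eigenvalue for the converged unit vector.
        eig = sum(vec[a] * work[a][b] * vec[b] for a in range(k) for b in range(k))
        loadings = [math.sqrt(max(eig, 0.0)) * x for x in vec]
        results.append((eig, eig / k, loadings))
        # Deflate: remove the component just found before extracting the next.
        for a in range(k):
            for b in range(k):
                work[a][b] -= eig * vec[a] * vec[b]
    return results
```

For a correlation matrix, each eigenvalue divided by the number of scales is
the proportion of total variance accounted for by that component, which is
how a figure such as the 94.4% reported for two components would be obtained.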


The precise interpretation of the two di- 
mensions underlying the Wiggins data is facil- 
itated by an examination of the properties of 
his scales. Wiggins does call attention to what 
he terms the “confound” between desirability 
and his scales, but the extent of this relation- 
ship is of some interest. First, note that in 
Figure 2 the four scales having the least de- 
sirable content (BC, DE, FG and HI) have 
substantial negative loadings, whereas the 
other four scales have neutral or positive 
loadings. In addition, a slight (approximately 
30°) orthogonal rotation of the axes for the 
data shown in Wiggins’ Figure 4 will give a 
complete separation of the scales high and low 
in social desirability. This, combined with the 
high loading of the salience parameter on this 
factor, would argue for an interpretation in 
terms of social desirability. Figure 3 presents 
a plot of the average desirability scale values 
and the scale means for ratings of accuracy of 
self-description as obtained from Norman (Note 2) and Goldberg (Note 3). The
correlation is .98. This sheds further light upon the interpretation of the
first principal component. It implies that the average respondent is
sensitive to the mean desirability of the adjectives and that these
desirability scale values are colinear with the probability of the average
respondent’s endorsing items on the different scales. But some respondents
are more sensitive than others to desirability (Jackson & Messick, 1969).
High scorers on the first principal component are thus those who show higher
relationships between their own probabilities of responding to an item and
the item’s judged desirability. Low scorers on the second principal
component respond less in terms of the desirability scale values of the
items.

Figure 3. Correlation of social desirability scale values and judged accuracy
of self-end ... Wiggins’ scales. (Ordinate: judged accuracy of
self-description. Abscissa: social desirability. PA = Ambitious-Dominant.
BC = Arrogant-Calculating. DE = Cold-Quarrelsome. FG = Aloof-Introverted.
HI = Lazy-Submissive. JK = Unassuming-Ingenuous. LM = Warm-Agreeable.
NO = Gregarious-Extraverted.)

The second principal component appears to reflect the differential tendency
to endorse traits as self-descriptive (cf. Bentler, Jackson, & Messick, 1971;
Morf & Jackson, 1972; Voyce & Jackson, 1977) and is dependent upon the
presence of negations in the Wiggins traits. Table 1 presents the number of
negated items on each scale together with the loading of the scale on the
first principal component in the simulation study and average loadings from
Wiggins (1979). When the simulation was undertaken without the special
provision for the negations described above, negative loadings disappeared,
along with the circumplex structure. It is hypothesized that they would also
disappear in Wiggins’ data, had he not chosen negations to comprise the
majority of items in the scales forming the lower part of the circumplex. An
examination of Figure 1 illustrates how, in terms of the threshold model,
endorsing an item at a given level of desirability yields a negative
correlation with the endorsement of a negated item at the same level of
desirability. Those who endorse many traits as self-descriptive at a given
level of desirability will tend to reject many negations at the same level of
desirability. If Wiggins engaged in item selection for the purpose of
confirming a hypothesized structure, it is possible that his inclusion of
many negations (even highly contrived ones like “uncrafty,” “unwily,” and
“warmthless”) for the lower two quadrants was dictated by the fact that the
inclusion of such negations was necessary for obtaining the circumplex
structure.
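
The argument about negations can be made explicit with the linear prediction
equation used in the simulation. The following is a sketch under one reading
of that procedure, ignoring rounding. The symbols S_i (salience), T_i
(threshold), d (desirability scale value), r_i, and \bar{r}_i, and the use of
10 - r as the complement on a 9-point scale, are assumptions introduced here
rather than notation from the article.

```latex
\begin{align*}
  % Predicted response of subject i to an adjective of desirability d,
  % for a line through (T_i, 5) with slope S_i:
  r_i(d) &= 5 + S_i\,(d - T_i) \\
  % Complemented response to a negation whose unnegated form has desirability d:
  \bar{r}_i(d) &= 10 - r_i(d) \\
  % Across subjects, at any fixed d:
  \operatorname{Cov}\bigl(r(d), \bar{r}(d)\bigr) &= -\operatorname{Var}\bigl(r(d)\bigr)
  \quad\Longrightarrow\quad
  \operatorname{corr}\bigl(r(d), \bar{r}(d)\bigr) = -1
  \quad\text{whenever } \operatorname{Var}\bigl(r(d)\bigr) > 0 .
\end{align*}
```

Under these assumptions, a scale composed mainly of negations is pushed
toward the opposite pole from a scale of unnegated items at the same level of
desirability, whatever the content of the adjectives.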


Plausibility of a Content Interpretation 
to the Circumplex 


Although the findings in the simulation study are very similar to the data
reported by Wiggins, the success of the simulation does not prove that the
hypothesized threshold processes are operating, particularly since the
simulation results do not duplicate Wiggins’ results exactly. These
differences may in part be due to differences between the desirability of an
adjective when it is judged on its desirability and its desirability when an
individual attributes the adjective to himself (see Jackson & Helmes, 1979),
or to changes in adjective desirability since the Norman (Note 2) data were
obtained. The forcing of a linear model onto a nonlinear situation may also
have a bearing upon the differences between Wiggins’ results and those of the
simulation. But in weighing the competing interpretations of the data, it
would be useful to consider the plausibility of the evidence as it bears on a
denotative content interpretation of the two principal components.

Ever since Thurstone’s pioneering work in 
the 1930s in multiple factor analysis, the 
weight of opinion has been in the direction of 
seeking evidence for the existence of char- 
acteristics of intellect or personality in the 
form of simple structure, that is, of identify- 
ing a resolution of test vectors such that a few 
tests defining a factor had high loadings, with 
the remainder loading negligibly. There are 
now literally hundreds of studies in the personality area confirming the
applicability of the Thurstonian approach, that is, of expecting separate
factors for each trait, defined by some analytic or judgmental criterion of
simple structure. It is true that circumplexes such as those reported by
Wiggins have been reported by others (Benjamin, 1974; Leary, 1957; Schaefer,
1959; Stern, 1970), but in all of these studies the potential role of
response or judgmental styles was uncontrolled.

Given the independence of our present simulation study, as well as that of
Rogers (1971), from any particular content, it would be expected that the
results of the above authors could just as easily have been simulated using
parameters derived from the threshold model. On the other hand, when response
styles were explicitly investigated, circumplexes consistently appeared
(Jackson & Messick, 1961, 1962a, 1962b; Jackson & Pacine, 1961; Voyce &
Jackson, 1977). Jackson (Note 1) found that 28 MMPI “scales” devised by
ranking all 560 items in terms of judged desirability and keying alternate
20 item sets “true” and “false” yielded a regular, replicated circumplex
array. True keyed scales formed a regular “fan” array from least to most
desirable, all with positive loadings on the second factor, whereas false
keyed scales also formed a similar regular array, but with negative loadings
on the second factor. Voyce and Jackson (1977) showed that these findings did
not depend upon use of MMPI items and could be obtained with frequency of
endorsement scale values as well as with those for desirability. When the
role of response biases is curtailed in personality assessment, however, as
with the Personality Research Form, simple structure, rather than a
circumplex, has been reported (Nesselroade & Baltes, 1975; Stricker, 1974).

Table 1
Number of Negated Items in Wiggins Scales, Loadings on Second Principal
Component of Simulation, and Average Loadings on the Corresponding
Wiggins Component

Scale   No. of negated   Loading on first   M loading from
        items            component          Wiggins (1979)
PA       0                .98                .81
BC       0                .96                .44
DE      11               -.85               -.14
FG       7               -.04               -.61
HI      11               -.13               -.19
JK      16               -.94               -.53
LM       0                .94                .07
NO       1                .94                .55

Note. All scales have 16 items. The data on the actual loadings were kindly
made available to us by Wiggins. PA = Ambitious-Dominant. BC =
Arrogant-Calculating. DE = Cold-Quarrelsome. FG = Aloof-Introverted.
HI = Lazy-Submissive. JK = Unassuming-Ingenuous. LM = Warm-Agreeable.
NO = Gregarious-Extraverted.

There is an important reason why traits 
should define independent, distinct, uncorre- 
lated factors—such traits are more likely to 
yield evidence of convergent and discriminant
validity. With all traits arrayed in only two
dimensions and in general showing substantial
correlations, it is unlikely that they would
meet any of the Campbell and Fiske (1959)
criteria for multitrait-multimethod validity.
In fact, before proffering or accepting the
circumplex structure as a framework for un- 
derstanding the content of personality, it 
would be appropriate to investigate and con- 
firm the heteromethod convergent and dis- 
criminant validity of the vectors in such a 
space. Our own view, consistent with the find- 
ings of Jackson and Lay (1968), Kusyszyn 
and Jackson (1968), Morf and Jackson 
(1972), and others, is that the discriminant 
validity of traits is to be found, if at all, in 
the residual factor scores, after removing the 
influence of the two large principal compo- 
nents. In this sense, the reliabilities of indi- 
vidual scales reported by Wiggins can be re- 
garded as spurious, representing not what is 
uniquely relevant to a given trait but what it 
shares with other traits. In the case of
Wiggins’ data, each scale score is largely predictable
from the others. A concept of differential
reliability, as opposed to classical reliability,
would distinguish the contribution to
reliability of general factors, due to method
and global content variance, from those due
to unique trait variance.

But what is one person’s method variance may be another person’s trait
variance (Campbell & Fiske, 1959; Jackson & Messick, 1958). The
identification of two broad dimensions relevant to presentation of self may
be ... an important and general finding of wide relevance to human
functioning, including such diverse areas as ...ogy, socialization, and
interpersonal influence. Wiggins has contributed importantly by calling
attention to these dimensions, but we wish to emphasize the relative
independence of these dimensions from particular trait-relevant stimuli. By
an ... stimuli and negations, Wiggins’ circumplex could be changed
dramatically. Putting one’s best foot forward can be done within ... in the
context of a variety of ... desirability of behaviors relevant to particular
traits can be manipulated. So let us not confuse content and style. (And let
it not be said that we maintain that the Wiggins scales are devoid of content
variance. Wiggins has confounded content and style.) Both are ... in
understanding personality, but these concepts are not interchangeable. This
issue might be resolved by identifying a new set of adjectives, each
substantively linked to the individual scales but diverse in desirability.
Half of the items should be worded positively, and the remaining, negatively.
With the addition of marker scales for desirability responding, content and
style factors might be distinguished, particularly if a s... of desirability
among items wi... could be identified.

In a more general sense, in addition to methodological and empirical
reservations, we are not impressed with the psychological or personological
import of the circumplex. Despite what ancient philosophers have ... about
perfect geometric figures, we doubt its explanatory value. Personality is
almost certainly more complex than that which can be represented
realistically in a two-dimensional plane. Even when considering only the
interpersonal domain, such representations, although perhaps seductive in
their simplicity, can hardly be expected to do justice to the subtlety and
multiplexity of the variety of behavior. Similarly, although there is indeed
diversity in the words used to describe traits, we would prefer to go beyond
single words and use, among other things, larger units of the natural
language as it has evolved over thousands of years, to define and to assess
personality characteristics.


Externality as a Function of Obesity in Children: Pervasive
Style or Eating-Specific Attribute?


Philip R. Costanzo and Erik Z. Woody 
Duke University 


The developmental sources of the link between stylistic externality and
food-related externality found in the obese by Schachter and others were
explored by testing whether the externality phenomena that have been found to
differentiate obese and normal adults are also discriminators of obese and
normal children. The results suggest that obese children as young as 7-12
years of age show an external responsiveness to salient food cues but not yet
a general external perceptual style. The implications of these findings for
the development of obese externality are examined.


The most influential and intuitively appeal- 
ing theory concerning the distinctive psycho- 
genic characteristics of obese eating has been 
advanced by Schachter and his associates (cf. Schachter & Rodin, 1974).
Stated most simply, Schachter’s theory consists of two relatively independent
hypotheses: (a) The eating behavior of the obese is less influenced by
internal physiological cues signaling satiety or hunger (e.g., blood sugar
level, gastric motility, etc.) than that of normal weight individuals, and
(b) the eating behavior of the obese is more influenced by salient external
cues (e.g., time of day, the taste and ready accessibility of foods, etc.)
than that of normals. Although both of these hypotheses have met with fairly
convincing confirmation in the research of Schachter and his colleagues (see
Schachter & Rodin, 1974), it is the second hypothesis that has stood the test
of independent replication (cf. Johnson, 1970; McArthur & Burstein, 1975;
Nisbett & Gurwitz, 1970).

In attempting to provide a clear theoretical explanation for the externally
moderated eating style of overweight individuals, Schachter (1971) has
advanced the broad hypothesis that any stimulus (food related or not) above a
certain intensity level is more likely to evoke a response from an obese than
from a normal weight subject. In short, Schachter has argued that the
externality evident in the eating behavior of the obese is only one
manifestation of a pervasive external style. A number of studies support this
broader hypothesis. Obese subjects show faster disjunctive reaction times,
better short-term recall and lower tachistoscopic recognition thresholds than
normal weight subjects do (Rodin, Herman, & Schachter, 1974), indicating a
greater tendency for a salient stimulus to trigger a response in the obese.
In addition, both the time estimations (Pliner, 1973; Rodin, 1975) and
reaction times (Rodin, 1973) of the obese are more responsive to salient
external-contextual cues than are similar responses of normals. There is
evidence that obese individuals are significantly more field dependent on the
rod-frame test (RFT) than normals are (Karp & Pardes, 1965; McArthur &
Burstein, 1975), indicating a dispositional dependence upon salient
contextual cues for the obese.

Requests for reprints should be sent to Philip R. Costanzo, Department of
Psychology, Duke University, Durham, North Carolina 27706.

The correspondence between general external hyperreactivity and food-related
externality found in the obese led Rodin and Slochower (1976) to propose that
externality in general style leads to overeating, which in turn leads to
obesity. In testing the plausibility of this causal sequence Rodin and
Slochower (1976) performed a study that assessed environmentally mediated
weight gain among preadolescent and adolescent normal weight girls who varied
in general external responsiveness. More specifically, Rodin and Slochower
used multiple assessments to classify 9-15-year-old camp-bound girls on
external responsiveness. They predicted that the more external girls would
evidence greater weight gain contingent upon the shift occasioned by an
8-week summer camp experience in a food-rich novel environment.

The findings were equivocal with respect to the primary hypothesis. Instead
of finding a linear relationship between stylistic externality and weight
gain, Rodin and Slochower found a nonmonotonic relationship between these
variables. Thus, both the weight gainers and weight losers were found to be
more external on the premeasured index than those girls whose weights
remained stable across the 8-week experience. The obtained results were quite
similar for the small contingent of overweight campers included in the
sample. What one might conclude from the Rodin and Slochower study is that
among a sample of normal weight girls one can predict weight change (but not
simply weight gain) with a modicum of accuracy. Yet these data fall short of
providing evidence for the causal role of generalized externality in
producing environmentally mediated overeating.

In the current investigation we have addressed questions that are similar to
those inquired about in the Rodin and Slochower study. That is, the current
study has endeavored to examine the relative plausibility of the potential
sources of the stylistic externality – food-related externality link
suggested by Schachter’s viewpoint and by the past research literature. It
was felt that an important first step in examining the role of generalized
externality in obesity would involve discerning whether obese children when
compared to normal weight children would exhibit enhanced responsiveness to
both external food-related cues and external nonfood cues when exposed to the
kind of experimental arrangements which have characterized the research using
adult samples. This strategy of inquiry introduces a broader developmental
perspective on the issue of causal determination of the eating style of the
obese than is evident in work such as Rodin and Slochower’s study.

If general externality is a predisposing 
style resulting in overeating and obesity, then 
one should expect that children who are 
already obese will evidence greater external 
hyperreactivity than their nonobese counter- 
parts will. This proposal, of course, presumes 
that childhood obesity is continuous with 
obesity in adulthood. The evidence would 
appear to justify such a presumption. A num- 
ber of studies have indicated that those who 
are obese in childhood tend to remain obese 
through adulthood (Heald & Hollander, 
1965; Knittle, 1972; Mullins, 1958; Rimm &
Rimm, 1976; Rose & Mayer, 1968). In adult
follow-up studies of earlier childhood public 
health examinations (Abraham, Collins, & 
Nordsieck, 1971; Abraham & Nordsieck, 
1970) it was found that 86% of overweight 
boys and 80% of overweight girls became 
overweight adults, against 42% of average 
weight boys and 18% of average weight girls. 

An extremely poor prognosis for juvenile- 
onset obesity was given by Stunkard and 
Burt (1967), who empirically estimated that 
the more than 4-1 odds against an obese 
child becoming a normal weight adult rise to 
28-1 if weight reduction does not occur by 
adolescence. Finally, it is important to note
that the primary source of obese subjects
used in adult inquiries into general and
food-specific externalities has been college-aged
young adults (19-22 years of age). According
to Grinker’s (1973) criterion, which because 
of juvenile lipid cell development sets age 19 
as the dividing line between obesity of 
juvenile onset and that of adult onset, most 
previous laboratory studies of obese external- 
ity are most likely to have used samples of 
subjects characterized by juvenile onset of 
overweight. Thus, although a definitive but 
expensive longitudinal study exploring the 
developmental vicissitudes of external respon- 
siveness in obese individuals would be highly 
desirable, a study probing both general and
food-specific externalities in obese children
would seem to warrant cross-sectional com- 
parisons with the findings from method- 
ologically similar adult inquiries, since the 
overlap in childhood and young adult obese 
populations appears to be quite substantial. 

Accordingly, the current study posed three 
primary questions: (a) Is there a differential 
dependence on external food related cues for 
obese versus normal weight children? (b) Is 
there a differential dependence on and respon- 
sivity to external perceptual and cognitive 
processing cues for obese versus normal chil- 
dren? (c) Is there a relationship between a 
child’s dependence upon external cues in the 
eating process and his dependency upon ex- 
ternal cues in the perceptual—cognitive area? 

The differential responsivity to food-related 
external cues by obese and normal children 
was examined in the current study by manip- 
ulating the accessibility of food. The specific 
manipulation used was derived from the work 
of Schachter and Friedman (1974) and Mc- 
Arthur and Burstein (1975). In both of these 
studies, obese and normal subjects were 
Presented with either shelled or unshelled 
nuts to nibble on. In accordance with exter- 
nality theory, both studies convincingly 
demonstrated that obese subjects eat signifi- 
cantly fewer nuts with shells on than with 
shells off, whereas the ad lib nut eating of 
normals is largely unaffected by the presence 
of a shell.

In order to examine differences in non-food- 
related externality between obese and normal 
children, two assessments were made. The
first consisted of Witkin’s rod-frame test of
field dependence; the second was a measure
of obese and normal children’s differential
estimations of the elapsed time of a boring
versus an interesting film. The [...]
task, and it was found [that obese subjects]
were significantly more field dependent [than]
normal weight subjects were. More
important, it was found that external food-related
behavior was significantly related to
degree of field dependence. That is, McArthur
and Burstein found a quite [significant]
correlation (r = −.58) between the number of
unshelled nuts eaten and RFT-assessed field
dependence. Furthermore, McArthur and
Burstein found a sizable correlation (r = .62)
between percentage overweight and degree of
field dependence. Since these findings constitute
the strongest and clearest support for the
link between perceptual-cognitive externality
and food-related externality, McArthur and
Burstein’s experimental arrangements were
replicated with 7-12-year-old children in the
current study. Because past longitudinal studies
of field dependence and psychological
differentiation (cf. Faterson & Witkin, 1970)
have indicated that the RFT is both reliable
for children and is evidenced by cross-aged
stability, it appeared reasonable for use with
a childhood sample.

Obese and normal children’s estimations of
the elapsed time of a boring versus an
interesting film (shown in the presence of a
bowl of either shelled or unshelled nuts) were
used as a second measure of generalized
externality because they have been cited by Leon
and Roth (1977) as the single strongest
evidence for generalized externality differences
between obese and normal weight individuals.
Both Pliner (1973) and Rodin
(1975) have demonstrated that obese estimates
of time elapsed are significantly more
affected by interest cues in the [...]
environment than are the time estimates of
normal weight individuals. The current study
partially derives from the Rodin (1975)
investigation, which very clearly demonstrated
that obese subjects tend to overestimate
the length of time of a boring tape and
underestimate the length of time of an
interesting tape, whereas the difference in
time estimation of the boring and interesting
tapes for normals is not significant.

In summary, the current investigation
endeavored to discern whether the externality
phenomena that have been found to differentiate
obese and normal adults are also discriminators
of obese and normal children. The
externality view of obesity suggests a strong
causal relationship between general and food-specific
externality. It is believed that the
developmental continuity or discontinuity of
the adult phenomena in a sample of children
with a high probability of maintaining an
obese adjustment into young adulthood will
help to provide data on the plausibility of
the application of this causal assumption to
the life span.


Method 
Design Overview 


The experiment proper was run in two parts. In
the first part obese and normal children were asked
to watch and rate either a boring or an interesting
film of 10 minutes’ duration while in the presence
of a large bowl of either shelled or unshelled
peanuts. This arrangement yielded a 2 (Weight
Group) × 2 (Film) × 2 (Shelled-Unshelled Nuts)
between-subjects factorial design. The dependent
measures assessed were (a) the amount of ad lib
eating of peanuts, (b) the subjects’ estimate of the
time taken by the observed film, and (c) the subjects’
rating of the interestingness of the observed
film.


For the second part of the study, all children were
individually escorted to a darkened room and were
administered eight trials of Witkin’s rod-frame measure
of field dependence-independence. These data
were evaluated in a 2 (Weight) × 2 (Sex) factorial
design. In addition to the analyses of variance on the
separate dependent variables, correlations were computed
(within and across sex and weight) among
percentage overweight, quantity of nuts consumed,
and field dependence.


Subjects 


Subjects were drawn from the second through fifth
grades of North Elementary School, a Roxboro,
North Carolina, school of about 350 children. A
sample of 56 males and females (28 predesignated
obese and 28 predesignated normal weight) was run
through the several conditions of the study. The
initial sample was selected through teacher nomination.
Teachers were asked to indicate which children
in their class appeared distinctly overweight,
which appeared normal weight, and which appeared
distinctly underweight. From their nominations,
groups of 28 obese and 28 normal weight subjects
were assessed on three measures under the conditions
indicated above: (a) their eating of shelled or unshelled
nuts, (b) their time estimations of the length
of a boring or interesting film, and (c) their RFT-measured
field dependence. Inclusion in the final
sample of subjects was determined by the subjects’
actual measured deviation from average weight for
age and height and not the teacher rating. Because
of the instability of weight in growing children 
(Garn, Clark, & Guire, 1975), a 25% overweight 
minimum criterion was used for classification of an 
overweight subject. The criterion for inclusion as a 
normal weight subject was that the child fall between 
—10% and 10% deviation from average weight for 
age and height. Children were weighed and mea- 
sured in the school’s annual health screening, which 
was fortuitously scheduled 1 week after the com- 
pletion of our study. Of the 28 teacher-referred 
overweight children, only 1 failed to meet our 
minimum criterion for inclusion in the sample. The 
27 overweight children included in the final sample 
ranged from 25% to 124% overweight for their age 
and height (the National Center for Health Statistics, 
1970, studies were used for purposes of comparing 
our sample children to national averages). On the 
average the 27 overweight children in the experi- 
mental sample were 48.8% overweight, with a range 
of 40.8% to 52.7% overweight in the four Shelled 
Nut-Unshelled Nut X Boring Film-Interesting Film 
cells of the study. The average age of the overweight
children was 10.0, with a range of 9.7 to 10.5 in the 
four experimental cells. The average height of the 
overweight groups was 57.8 inches (1.47 m; cell 
means ranged from 57.2 inches to 58.1 inches, or 
1.45 to 1.48 m). Finally, the average weight of the 
obese sample was 118.2 lbs. (cell means ranged from
115.0 to 124.6 lbs). 

Teachers were not quite so accurate in their 
normal weight referrals. Of the 28 subjects designated 
as normal weight, 20 approximated our a priori 
criterion of between —10% and 10% deviation from 
normal weight. The 8 subjects discarded from the 
normal weight experimental cells included 2 children 
who were appreciably underweight (—17% and 
—18% below average weight) and 6 subjects who 
were of borderline overweight status (14% to 24% 
overweight). The final sample of normal weight 
subjects ranged from —12% to 8% deviation from 
average age/height weight. The mean deviation from 
average weight for the normal sample was —2.3% 
(with a range of —4.6% to 1.1% across the four 
experimental cells). The average age of the normal
weight sample was 10.0 years, with a range of 
means of 9.5 to 10.7 in the experimental cells. The 
average height and weight of the normal weight 
sample were 56.1 in. and 73.1 lbs., respectively (with
a cell mean range of 55.4 in. and 69.3 lbs. to 57.0 in. 
and 77.8 lbs.). Although the 7 discarded mild to 
moderately overweight subjects (the 6 who had been 
teacher designated as normal weight and 1 who had 
been teacher designated as obese—but whose objec- 
tive weight status was not found to correspond to 
teacher rating) were excluded from the univariate 
analyses comparing obese with normal children on 
the primary dependent measures, their data were 
included in appropriate correlational analyses using 
percentage overweight as a criterion measure. Thus, 
the correlational analyses contained 54 subjects whose 
weights varied continuously from —12% to 124% 
deviation from average weight. 
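
The classification rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the `expected_lbs` argument (the average weight for a child's age and height, which the study took from national norms) are assumptions introduced here, and the cutoffs simply restate the a priori criteria of 25% overweight and a deviation between −10% and 10%.

```python
def percent_deviation(weight_lbs, expected_lbs):
    """Percentage deviation from the average weight for age and height.

    expected_lbs is a hypothetical input: the normative average weight
    for a child of the same age and height.
    """
    return 100.0 * (weight_lbs - expected_lbs) / expected_lbs


def classify(deviation_pct, obese_min=25.0, normal_band=10.0):
    """Apply the a priori weight-group criteria to a percentage deviation."""
    if deviation_pct >= obese_min:
        return "obese"
    if -normal_band <= deviation_pct <= normal_band:
        return "normal"
    # Borderline overweight or appreciably underweight children fall here.
    return "excluded"


# Sample accounting reported in the text: 27 obese + 20 normal weight
# children entered the analyses of variance, and the 7 mildly to moderately
# overweight children were added back for the correlational analyses.
n_anova = 27 + 20
n_correlational = n_anova + 7
```

Under these strict cutoffs a child who is 14% overweight is neither obese nor normal weight, which is how the borderline cases were treated in the univariate analyses. The normal weight group actually retained only approximated the band, ranging from −12% to 8%.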




Procedure and Materials 


One experimenter escorted each subject from his 
classroom to the trailer and introduced him to 
a second experimenter in the trailer. The second 
experimenter asked the subject to take a seat. This 
seat faced both a projection screen and a table 
upon which was a bowl of peanuts, shelled or un- 
shelled, according to the assigned condition. This 
table was lit by three 40-watt incandescent lights. 
These provided enough illumination to make the 
bowl of nuts salient but not so much as to obscure 
the projected image. 

In the nuts-with-shells condition, the fragments 
of three emptied shells were in the bowl. The second 
experimenter said, “We're trying to see what kids 
think of different kinds of movies. We'd like to show 
you a movie and then get your opinion of it, OK? 
Help yourself to some nuts. I'll be back when the 
movie is over.” Meanwhile the experimenter had 
shelled a couple of nuts in the nuts-with-shells 
condition. In either condition he left, munching a 
couple of nuts. The movie was projected through
the open door of the two adjoining rooms. The pro-
jection distance was about 10 feet (3.05 m), making
a rather small but bright and easy-to-see image.
Subjects were then shown an interesting or boring 
film. For the “interesting” condition, Toot, Whistle, 
Plunk and Boom, a 10-minute Disney cartoon about 
the origin and development of musical instruments, 
was shown. For the “boring” condition, 10 minutes 
of The Right of Age, a child-educational film dealing
with protective services available for the aging,
was shown.


[...] subject was presented with a clock face with
mov[able hands] [...] and [...] All he would see was a
square lit up with a stick [...] the task was a game,
the object of which was to make the stick “go straight
[up and down]” [like] the walls of a building or a
[...]. Then the experimenter held a ruler at an angle and
asked the subject to make it straight up and down.
Invariably the subject immediately repositioned the
ruler to vertical.

The subject was blindfolded, led into the completely
darkened room, and seated in a chair about
seven feet in front of the rod-and-frame apparatus.
The rod-and-frame apparatus was modeled after that
of Witkin and his associates (1962) and consisted of
an outline square 40 inches (1.02 m) on each side,
within which was centered a 38-inch (.97 m) rod.
The rod and frame could be tilted independently
and their respective degrees of tilt read off a protractor
at the back of the apparatus. A variac was
used to limit the brightness of luminescent [...]
on the rod and frame, so that in a dark room only
the illuminated outline square and rod could be seen.
The degree of tilt of the frame could be controlled
manually by the experimenter, whereas the degree
of tilt of the rod could be controlled remotely by
the subject through push buttons that operated a
motor. After a brief period of dark adaptation, the
subject's blindfold was removed and he was asked
to make eight judgments of verticality. On half
the trials the frame was tilted 28 degrees to the right
and on half it was tilted 28 degrees to the left, thus
generating four combinations of rod-and-frame [tilt].
Each combination was presented twice in random
order. After each trial the subject was told to close
his eyes while the experimenter used a flashlight to
record the degrees of deviation of the rod setting
from vertical and to set the rod and frame at the
appropriate tilt for the next trial. The average
degrees of deviation from vertical placement of the
rod for the eight trials constituted the measure of
field dependence. Finally, the subject was thanked
and was returned to his classroom.
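
The scoring rule just described can be written out as a short Python sketch. It assumes that "degrees of deviation from vertical" means the unsigned (absolute) deviation on each trial, so that leftward and rightward errors do not cancel; the text does not say this explicitly. The function name and the example settings are hypothetical.

```python
def rft_score(rod_settings_deg, n_trials=8):
    """Mean absolute deviation of the rod from true vertical, in degrees.

    rod_settings_deg holds one signed deviation per trial (negative = left,
    positive = right). Using the absolute value is an assumption made here.
    Higher scores indicate greater field dependence.
    """
    if len(rod_settings_deg) != n_trials:
        raise ValueError("expected one rod setting per trial")
    return sum(abs(d) for d in rod_settings_deg) / n_trials


# Hypothetical settings for one child across the eight trials.
example_settings = [3.0, -2.0, 5.0, -4.0, 1.0, -1.0, 6.0, -2.0]
example_score = rft_score(example_settings)
```

A signed average of the same eight settings would be much smaller than the unsigned one, which is why the choice between the two matters for how the score is read.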


Results 
Eating Behavior 


A 2 × 2 × 2 (Weight × Film × Nut) analysis
of variance was performed on log grams
of nuts consumed by the children. There were
two significant effects that emerged from the
analysis. First, there was a main effect for nut,
indicating that a greater quantity of nuts
without shells was consumed than nuts with
shells, F(1, 39) = 9.15, p < .01. There was
also a significant Weight × Nut interaction,
F(1, 39) = 4.94, p < .05, which qualified
the nut main effect. Table 1 presents the
means involved in the interaction. Both the
grams of nuts consumed and log grams of nuts
are represented on this table. It is clear from
even a casual perusal of the means presented
in this table that the presence or absence
of shells on the nuts exerted a much [greater]
effect on the consumption of obese
than on the consumption of normal weight
children. Simple effects analysis shows that for
obese subjects the presence of shells very
significantly depressed consumption in comparison
to the shells off condition, F(1, 39) =
16.94, p < .001, whereas for normal weight
subjects the presence of a shell had an insignificant
effect on nut consumption, F(1, 39)
< 1. The type of film (i.e., boring or interesting)
had no influence on eating behavior,
either as a main effect or in interaction with
the other independent variables.¹


Table 1
Nut Consumption as a Function of Weight
and the Presence of Shells

                      Normal                   Obese
Condition     Log grams  Grams    n    Log grams  Grams    n
Shells off      1.06     18.97   11      1.47     43.50   13
Shells on       .8[?]    [...]    9       .67      6.29   14

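
The error degrees of freedom in these F ratios can be checked against the cell sizes in Table 1. The following is a minimal sketch, assuming a fully crossed between-subjects design in which the error term has one degree of freedom per subject, less one per cell and one per covariate. The function name is hypothetical.

```python
def error_df(n_subjects, n_cells, n_covariates=0):
    """Error degrees of freedom for a between-subjects factorial design."""
    return n_subjects - n_cells - n_covariates


# Cell sizes from Table 1: normal shells off, normal shells on,
# obese shells off, obese shells on.
cell_ns = [11, 9, 13, 14]
n_subjects = sum(cell_ns)

# Weight x Film x Nut gives 2 * 2 * 2 = 8 cells.
n_cells = 2 * 2 * 2

df_anova = error_df(n_subjects, n_cells)
df_one_covariate = error_df(n_subjects, n_cells, n_covariates=1)
df_two_covariates = error_df(n_subjects, n_cells, n_covariates=2)
```

With 47 children and 8 cells this gives 39, the value in the F ratios above. Adding one or two covariates gives 38 and 37, which are the values reported for the covariance analyses in Footnote 1.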

In summary, then, these data provide a 
very convincing case for the early entry of 
externality effects in the eating of the obese. 
The obtained results of this study compare 
very favorably with those of the Schachter 
and Friedman (1974) study and the Mc- 
Arthur and Burstein study (1975). A more 
articulated comparison of these data and Mc- 
Arthur and Burstein’s nut-eating data will be 
taken up in the discussion. 


Field Dependence: Judgments of Verticality 


The 2 × 2 (Weight × Sex) analysis of
variance on subjects’ RFT judgments revealed
no effects whatever for either weight,
sex, or the Weight × Sex interaction. In fact,
all Fs were less than 1.

Nonetheless, inspection of within-cell correlations
revealed a consistent pattern of
results. In spite of the lack of significant
difference in field dependence between obese
and normal subjects, the correlation between
percentage overweight and RFT-assessed field
dependence for the total sample of 54 was
significant, r(52) = .33, p < .02. However,
the major source of this correlation was within
the obese sample itself, r(25) = .50, p < .01,
and it was not significant for the normal
weight subjects, r(18) = .14, p > .20. Hence,
only for obese children was field dependence
quantitatively related to degree of overweight,
and field dependence appears unrelated to
the qualitative distinction between overweight
and normal weight.
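
The significance of each correlation above can be checked by converting r to a t statistic with the standard formula t = r·sqrt(df / (1 − r²)), where df is the value given in parentheses. The sketch below is illustrative only; the function and dictionary names are hypothetical, and the r and df values are those reported in the text.

```python
import math


def t_from_r(r, df):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt(df / (1.0 - r * r))


# (r, df) pairs as reported for percentage overweight and field dependence.
reported = {
    "total sample": (0.33, 52),
    "obese": (0.50, 25),
    "normal weight": (0.14, 18),
}

t_values = {group: t_from_r(r, df) for group, (r, df) in reported.items()}
```

These come to roughly 2.52, 2.89, and 0.60. Against two-tailed critical values from a standard t table (about 2.40 at the .02 level for 52 df and about 2.79 at the .01 level for 25 df), the first two agree with the reported significance levels, whereas the value for normal weight subjects falls far short of any conventional threshold.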


The Relationship Between Eating Behavior 
and Judgments of Verticality 


The negative correlation previously re- 
ported by McArthur and Burstein between 


¹ Given that the sample of children used in the
study varied on age and sex dimensions, it was important
to discern whether either of these variables
accounted for the obtained results. First of all, the
sex distribution approximated a 50-50 split within
the Shells × Weight experimental cells. Each of the
three cells containing an odd number of subjects
(see Table 1) was composed of one more female than
male subject, whereas the remaining cell contained
the same number of males and females. Similarly,
the ages of subjects were reasonably homogeneously
variant within cell; the range of age standard deviations
in the four Weight × Shells experimental cells
was .91 to 1.37. Despite this equivalence of distribution
by sex and age, it was felt necessary to rule
out statistically the effects of both age and sex as
qualifiers of our results. Accordingly, three analyses
were run. The first of these analyses was a 2 × 2 × 2
(Weight × Shells × Film) analysis of covariance
using age as a covariate and log grams of nuts
consumed as the dependent measure. The Weight ×
Shells interaction and the shells main effect remained
significant, Fs(1, 38) = 6.44 and 11.40, respectively.
In addition, no other main effects or interactions
emerged as significant. Second, a similar three-way
analysis of covariance was performed using age and
sex as multiple covariates. The results of this analysis
were quite congruent with the first covariance analysis:
Weight × Shells, F(1, 37) = 6.27, p < .02;
Shells, F(1, 37) = 11.15, p < .002, with no additional
effects significant. Third, a Weight × Shells ×
Sex analysis of variance was performed to test
directly the impact of sex on the results. It was found
that sex as a main effect yielded an F < 1 and that
all three interactions of the experimental variables
with sex yielded Fs < 1. Finally, reapportioning the
sample by sex did not appreciably alter the Fs
for the Weight × Shells interaction, F(1, 39) = 4.49,
p < .04, nor for the shells main effects, F(1, 39) =
12.32, p < .001. A similar absence of age and sex
effects was found in the case of the time estimation
and field dependence measures. Neither sex nor age
used as covariates nor sex used as an independent
variable changed the pattern of reported results. In
view of the lack of significant variance accounted
for by age and sex of the child, collapsing the sample
on these variables appeared warranted.


Table 2
Mean Time Estimations (in Minutes) of the
Boring Versus Interesting Film by
Obese and Normal Children

                   Film
Group      Interesting    Boring
Obese         16.02        19.79
Normal        13.43        24.80



subjects’ field dependence on the RFT and
log grams of nuts with shells consumed was
not found with this sample of children. For
all subjects run in the shells on condition,
including those excluded from subject groups
in the analysis of variance, this correlation
was insignificantly positive, r(25) = .08. The
correlation failed to approach significance,
or even to take a negative sign, for any group
of subjects: females, r(14) = .04; males,
r(13) = .15; obese subjects 20% overweight,
r(14) = .27; and normal weight subjects,
r(9) = .05. The correlation between field
dependence on the RFT and log grams of
nuts without shells consumed was also insignificantly
positive for all subjects run in the
shells off condition, r(29) = .28, as well as
for each group of subjects: females, r(14) =
.05; males, r(15) = .38; [obese subjects 20%]
overweight, r(13) = [...]; [and normal weight]
subjects, r(11) = [...]. [...] evidence whatever
[...] prediction that [...]
related to field dependence.

Time Estimation 


A 2 × 2 × 2 [(Weight × Film × Nut)] analysis
of variance was performed [on the time
estimations]. [There was a significant film main]
effect, F(1, 39) = 13.25, p < .001; [subjects]
estimated a longer [duration for the boring]
film (M = 22.29 minutes) [than for the
interesting film] (M = 14.73
minutes), even though the real duration of
[both] films was the same (10 minutes).
A marginally significant Weight × Film
interaction was also found, F(1, 39) = [...],
p < .10. Although the marginality [...], it
may be seen from Table 2 that the
differences
in time estimations for type of film appear
somewhat more pronounced for normal subjects
than for obese subjects. In fact, the
difference was highly significant for normal
subjects, F(1, 39) = 12.77, p < .01, but
insignificant for obese subjects, F(1, 39) =
1.98, p > .10. Hence there is a tentative suggestion
that the time judgments of obese
subjects were less affected by interest cues
than were those of normal weight subjects.
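
As a check on how the cell means in Table 2 relate to the film means quoted in the text, the sketch below averages the cells without weighting by cell size. The variable names are hypothetical, and treating the reported film means as unweighted averages of the two weight groups is an assumption, although the arithmetic agrees with the reported values to within rounding.

```python
def unweighted_mean(values):
    """Simple average that ignores differences in cell size."""
    return sum(values) / len(values)


# Table 2 cell means in minutes, grouped by which film was judged longer.
longer_film = {"obese": 19.79, "normal": 24.80}
shorter_film = {"obese": 16.02, "normal": 13.43}

mean_longer = unweighted_mean(list(longer_film.values()))
mean_shorter = unweighted_mean(list(shorter_film.values()))

# Size of the film effect within each weight group.
film_effect = {
    group: longer_film[group] - shorter_film[group]
    for group in ("obese", "normal")
}
```

The two averages come to about 22.3 and 14.7 minutes, in line with the reported means of 22.29 and 14.73. The film effect is about 11.4 minutes for normal weight children but only about 3.8 minutes for obese children, which is the pattern the simple effects tests describe.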


Film Ratings


The children's ratings of the interestingness
of the two films were assessed in two
2 (Weight) × 2 (Shells) × 2 (Film) analyses
of variance. The first analysis was performed
upon subjects’ responses to the question “How
much do you think other kids would like this
movie?” whereas the second analysis was
done on the children’s responses to the question
“How interesting was this movie to
you?” Responses to the questions were given
on a 1-6 scale of liking and interest level.
For both questions, the main effect for type
of film was highly significant, Fs(1, 39) =
49.84 and 40.79; ps < .001. Subjects indicated
that “other kids” would like the movie
used in the interesting condition “pretty
much” (M = 5.32) and would dislike the
movie used in the boring condition “a little”
(M = 3.05). The subjects themselves found
the movie used in the interesting condition to be
midway between “pretty interesting” and
“very interesting” (M = 5.50) and the movie
used in the boring condition to be “a little
interesting” (M = 3.82). These results indicate
that the intended differentiation between
the boring and interesting movie was successfully
manipulated. Although all other effects
and interactions for the “other liking” measure
yielded Fs < 1, there was a surprising
and interesting Weight × Film × Nut interaction,
F(1, 39) = 4.49, p < .05, for the subjects’
report of their own interest in the films.
Inspection of the cell means (see Table 3)
reveals that all groups rated the interesting
film nearly the same (no significant differences).
However, there was a significant
Weight × Nut simple interaction for the
boring film, F(1, 39) = 6.47, p < .025. The
presence or absence of shells on the peanuts
affected the ratings of the boring film differently
for obese subjects than for normal subjects.
Looking at simple main effects, the
difference did not reach significance for
normal subjects, F(1, 39) = 1.86, p > .20,
but it did for obese subjects, F(1, 39) = 5.24,
p < .05. That is, the obese subjects rated the
boring film as significantly more interesting
when it was viewed in the presence of nuts
without shells than when it was viewed in the
presence of nuts with shells (shells off M =
4.33, or “a little interesting,” whereas shells
on M = 3.21, or “a little boring”). Thus, it
appears that the presence of available and
attractive food enhances obese children’s
[interest] in rather mundane non-food-related
[...]. It should be noted that no other
effects from the 2 × 2 × 2 analysis on self-reported
interest even approached significance.


Discussion 


Externality in Eating and 
Generalized Externality 


The results of the present studies concerning
the externality theory strongly support
the hypotheses about food-related behavior
but fail to give strong support to the hypothesis
that food-related externality has its
origins in a general external style for obese
populations. Table 4 and Figure 1 present a
detailed comparison of the McArthur-Burstein
experiment with university students and
an attempted replication with children. With
regard to the eating behavior dimension of the
experiments, it is evident that the overall
magnitude and pattern of the original results
are closely matched in the replication. However,
in the McArthur-Burstein experiment,
the interaction of weight with type of nut was
found to occur because the presence of a shell
reduced the likelihood of eating any nuts
more for obese than for normal individuals.
In the replication, the results were mainly due
to the tendency for the shell to reduce the
quantity of nuts consumed more for obese
than for normal-weight eaters. Although this
difference may merely be due to differences
in the kind of nut used (almonds are hard
to shell, peanuts easy), it is worth mentioning
as the only real distinction between the eating
behavior found in the two experiments.


Table 3
Cell Means for the Question,
“Was This Movie Interesting to You?”

                   Normal               Obese
               Shells   Shells      Shells   Shells
Film            off       on         off       on
Interesting     5.75     5.40       5.43     5.43
Boring          3.50     4.25       4.33     3.21


However, the respective results of the field-dependence
dimension of the experiments are
far less congruent. In the McArthur-Burstein
experiment, there was a strong relationship
between degree of overweight and field dependence
for all female subjects but no such
relationship for male subjects. By contrast,
in the replication there was a strong relationship
between overweight and field dependence
for all obese subjects but no such relationship
for normal weight subjects. Similarly, the
strong relationship between field-dependent
eating and field-dependent perception found
in the original experiment was totally lacking
in the replication. Furthermore, the lack of
any difference between obese and normal children
in field dependence casts doubt on the
hypothesis that externality is an obesity-disposing
and long-standing generalized trait.
Instead, it appears that developmentally early
evidences of externality in the obese are confined
to the eating circumstances.


Table 4
A Comparison of the McArthur-Burstein
Experiment With University Students and
an Attempted Replication With Children

                      McArthur-Burstein    Replication
Item                          r                 r

Correlations between percent overweight and
judgments of verticality
  All subjects               .62               .33
  Obese females              .60               .45
  Obese males               −.06               .51
  Normal females             .41               .25
  Normal males               .04               .13

Correlations between nut consumption and
judgments of verticality
  Shells on                 −.58               .08
  Shells off                 .25               .28

Correlations between percent overweight and
nut consumption
  Shells on                 −.45              −.42
  Shells off                 .09               .18


Using the field-dependence results with 
children as evidence against the generality of 
externality as a trait may seem a bit thin. 
However, the results of the time estimation 
phase of the experiment are also clearly 
incompatible with the generalized trait formu-
lation. In a very recent review (Leon & Roth,
1977) of externality theory and the evidence 
supporting it, it has been claimed that the 
studies on time estimation constitute the 
“greatest support for the external control 
theory” (p. 129) among non-food-related 
behaviors. Although these studies are some- 
what contradictory methodologically, they do 
consistently report that the time estimation 
of the obese is more manipulable by external 
cues than is the time estimation of normals. 
However, in the present study with children, 
obese subjects’ time estimation was somewhat 
less affected by the boring-interesting distinc- 
tion than was normal subjects’ time estima- 
tion. It would not seem that the failure to 
obtain the predicted Weight x Film Interest 
interaction could be accounted for by the 
irrelevance of the time estimation task for 
children. The direction of the significant main 
effect for film interest, F(1, 39) = 13.25,
p < .001, is quite compatible with obtained
results for this measure that have issued
from studies of adult cognitive process. Thus
the children in this study react to interest
cues in time estimation quite similarly to
adults. Yet, these interest-based distortions
in time estimation were not found to be
moderated by weight status in the manner
suggested by a generalized externality hypothesis.
This evidence, along with the field
dependence evidence, suggests that externality
in response to food cues precedes other manifestations
of externality in the obese child.

In contrast to the obese children's relatively
low responsiveness to interest cues in the
time estimation task, they exhibited an
unanticipated responsiveness to food cues in [their]
film ratings. For the obese children, the presence
or absence of shells on the nuts appears
to have influenced their affective experience
of a relatively uninteresting movie. This result
suggests that rather small variations in incidental
food cues may have a considerable
effect on how the obese child experiences an
unrelated, fairly neutral stimulus. Although
externality in obese children may be confined
mainly to food cues, these cues may affect
the obese child’s experience and behavior in
ways not closely related to eating.


Figure 1. Log grams eaten by type of nut for
McArthur-Burstein experiment and developmental
replication. (Panels: McArthur-Burstein; developmental
replication. Horizontal axis: type of nut, without shells
versus with shells. Lines: normal and obese subjects.)


Conclusions and Implications 


The above-reported results lead to the summary
conclusion that obese children as young
as 7-12 years of age already show evidence
of the externality in response to salient food
cues that had been previously documented
in adult obese samples. Yet, it does not
appear that this external response style has
clearly derived from general externality in
this childhood sample.

There are at least two ways to view the
discontinuities in the results of this study
and prior studies of generalized externality in
samples of obese young adults. One could
plausibly maintain that the phenomena accompanying
and accounting for childhood
obesity are distinct from those that characterize
obesity in adults. Given this perspective,
the results of the current study would lead
one to conclude that two different segments of
the obese population (i.e., children and young
adults) have a common tendency to react
differentially to salient versus nonsalient food-related
cues. Further, the disparity between
the evidence for non-food-related externality
from the current study and past adult inquiries
would suggest that both general externality
and its relationship to overeating are distinctive
characteristics of the adult obese.

Although such a descriptive distinction
between two populations of obese individuals
is interesting in its own right, previously cited
studies documenting the population continuity
between obese children and obese young
adults (cf. Abraham & Nordsieck, 1970;
Heald & Hollander, 1965; Rimm & Rimm,
1976) would suggest that the differences
found between the current child and past
adult obese samples have implications for
understanding the development of an obese
behavioral adjustment.

From this vantage point it would appear
that the causal priority of general externality
in the overeating of the obese suggested by
Schachter’s perspective and directly proposed
by Rodin and Slochower (1976) does not
bear up well developmentally. The current
data from a comparison sample of obese children
indicate that the early instances of
externality in juvenile-onset obese individuals
are largely confined to the eating circumstance.

As an alternative to the proposal that
implies a causal sequence that represents
obesity resulting from overeating, and overeating
resulting from stylistic externality,
one might propose that the externality evident
in the eating responses of the current
sample of children is a function of their
obesity and the later development of pervasive
externality a consequence of generalization
of eating style to other stimuli. Such a
proposal makes sense when one takes account
of the socialization practices that are likely
consequences of a child’s obesity. Owing to
the deviant status of obesity in middle-class
culture, childhood obesity is likely to arouse
concern and resulting attempts to mediate
weight loss by a variety of food restrictions.
This kind of socialization strategy is likely
to impose a norm of high restraint of eating
responses on the child while simultaneously
wresting control over the eating environment 
away from the child. Such directive parental 
constraint over a child’s behavior has been 
shown to result in external control adjust- 
ments in a number of areas (e.g., Aronfreed, 
1964; Baumrind, 1971; Becker, 1964). Aron- 
freed, for example, notes that strong restric- 
tive control in a given behavioral domain both 
results in a poor internalization of controls 
and a sensitization to the stimuli in that 
domain of behavior. We would propose that 
the obese individual’s food externality may 
be an instance of such a process of sensitiza- 
tion. 

Although a proposal such as the one just 
presented is a digression from the data base 
of this study, it is compatible with our pattern 
of findings and does bear some linkage to at 
least one emerging framework in the study 
of obesity. Herman and his colleagues (Her- 
man & Mack, 1975; Herman & Polivy, 1975; 
Hibscher & Herman, 1977) have proposed 
that the critical source of externality in the 
obese individual is his high level of restraint. 
Because the obese individual is deviant from 
a desired norm of physical attractiveness, he 
actively restrains food intake. Such restraint 
paradoxically renders food cues more salient 
and results in the stimulus-bound eating char- 
acterized by Schachter and others. Herman’s 
results tend to confirm this viewpoint in that 
they demonstrate (Hibscher & Herman, 1977) 
that even normal weight restrainers are more 
likely to evidence external behavioral charac- 
teristics than are so-called unrestrained eaters. 
This might suggest that tracking the develop- 
mental origins of dietary self-restraint in the 
obese person’s socialization history might be 
a fruitful direction for subsequent cross-sec- 
tional and longitudinal inquiries into the 
sources of obese externality. 

Avoidance of the Handicapped: An Attributional
Ambiguity Analysis

We demonstrated a general strategy for detecting motives that people wish to
conceal. The strategy consists of having people choose between two alterna-
tives, one of which happens to satisfy the motive. By counterbalancing which
one does so, it is possible to distill the motive by examining the pattern of
choices that people make. The motive used in the demonstration is the desire
we believe most people have to avoid the physically handicapped. Because
they do not wish to reveal this desire, we predicted that they would be more
likely to act on it if they could appear to choose on some other basis. In two
studies we found that people avoided the handicapped more often if the 
decision to do so was also a decision between two movies and avoidance of 
the handicapped could masquerade as a movie preference. 


This paper illustrates a general strategy for
detecting motives that people wish to conceal.
The strategy involves asking people to choose
between two alternatives, one of which acci-
dentally happens to satisfy the motive that
is present but hidden. For instance, assume
that most people wish to avoid contact
with the physically handicapped but do
not want to admit it. If we give a person a
choice between sitting next to a handicapped
person or sitting [...]
the handicapped so as to conceal his
[...]. However, if we ask a person to
choose between two movies, one of which
by accident happens to entail sitting
next to a handicapped person, the other
next to a normal, he can avoid the handi-
capped while appearing to exercise a preference
for a movie. By having enough people make
such decisions and by varying which movie
is associated with the handicapped person, 
or more generally, by varying which of two 
alternatives happens to satisfy the suspected 
motive, we can see whether people consis- 
tently make choices that satisfy it. Thus, we 
would look for more frequent avoidance of 
the handicapped when there is a choice be- 
tween the two movies than when there is not. 

This strategy for detecting hidden motives 
was derived from correspondent inference 
theory (Jones & Davis, 1965). Jones and 
Davis (1965) are concerned with how the 
decisions that an actor makes inform us about 
his motives. We learn about the actor by look- 
ing at the effects his decision produces. Jones 
and Davis (1965) classify the effects of mak- 
ing decisions into two categories: common 
effects (those that result regardless of which 
alternative is selected) and noncommon 
effects (those brought about by selecting one 
alternative but not the other). Assume, for
example, that Sam has a choice between two
roommates. Both like classical music but one
likes to cook, whereas the other likes long 
philosophical discussions; in this case, classical 
music is a common effect, and good food and 
bull sessions are noncommon effects. Decisions 
tell us nothing about a person’s intentions
with respect to common effects. In this case, 
Sam’s decision can tell us nothing about his 
feelings toward classical music. However, 
decisions have the potential to inform us 
about whether noncommon effects are in- 
tended. They are most informative when the 
number of noncommon effects is low. In Sam's 
choice of roommates, there are only two non- 
common effects: good food and bull sessions. 
If he chooses the roommate who likes long 
discussions, we are tempted to conclude that 
Sam likes bull sessions. But there is still some 
ambiguity. Perhaps Sam is trying to lose 
weight and is avoiding the good cook rather 
than seeking the philosopher. We could be 
more confident about Sam’s motive if there 
was only one noncommon effect, for example, 
if the only difference between the roommates 
was that one liked long discussions and the 
other did not. The general rule is that the 
more noncommon effects there are, the more 
ambiguous the motives behind a decision. 
Our strategy for detecting concealed mo- 
tives can be understood in terms of corre-
spondent inference theory. Again, we will use
the handicapped example. When the choice is 
simply between sitting with a handicapped 
person or with a normal one, the handicap is 
the most salient noncommon effect. A decision 
to avoid strongly suggests the actor’s inten- 
tion. However, when the choice is between 
movies that are accidentally associated with 
sitting next to one person or the other, the
two movies as well as the handicap are now 
noncommon effects, and the decision that hap- 
pens to avoid the handicapped leaves the 
motive ambiguous. The actor can choose to 
avoid the handicapped person and claim to be 
exercising a movie preference because objec- 
tively there is no way to tell from a single 
decision. However, by having many people 
make such a decision and by counterbalancing 
which other noncommon effect, that is, which 
movie, is paired with the handicapped, we can 
see whether people do more frequently choose 
to avoid the handicapped person when doing 
so is one of several noncommon effects of the 
decision rather than the only noncommon 
effect. 
This strategy can also be understood in 


SNYDER, KLECK, STRENTA, AND MENTZER 


a very similar fashion by application of Kelley's
(1972) discounting principle. The principle
states that “the role of a given cause in pro-
ducing a given effect is discounted if other
plausible causes are also present” (p. [...]).
For example, we are less confident that the
choice of a person to date is based on the
person's intelligence if the person possesses
other features such as physical attractiveness
that are plausible causes for the choice. By
making the choice between handicapped and
normal persons also a choice between movies,
we allow discounting of the role that the
handicap plays in the decision. With the
introduction of other plausible causes, the
motives behind a decision are obscured. We,
the researchers, can discover the real motive
by employing another of Kelley's principles,
the covariation principle: “An effect is attri-
buted to the one of its possible causes with
which, over time, it covaries” (Kelley, 19[...],
p. 3). We can see whether movie choice
covaries with avoidance of the handicapped.


above. That is, we decided to test whether
people are motivated to avoid the handicapped
but are unwilling to acknowledge this
motive. Being around the handicapped tends
to make people feel uncomfortable (Kleck,
1966, 1968). There are a variety of reasons
why this might be so, for example, fear of the
unknown, uncertainty about how to [...],
or perhaps the reminder that the handicap
provides of our own mortal physicality
(Becker, 1973). At the same time, few handi-
capped people do anything to deserve their
fate, and we may believe that we should
treat the handicapped with kindness (Kleck,
Ono, & Hastorf, 1966). The motive to avoid
is not socially acceptable and may not be
personally acceptable either.

To test for the presence of this unaccept-
able motive, our general strategy involved
giving people a choice between two alter-
natives, one of which accidentally happens to
satisfy the motive. The alternatives we used
were the same as in our example above:
movies.


Experiment 1 
Method 


Subjects


Subjects were 21 males and 4 females enrolled in
[...] College. They were recruited through a
newspaper ad and through posters offering $2.00 to
participate for half an hour in an experiment on
[...] evaluation. One subject was deleted from the
different movie condition because he knew a con-
federate. After the deletion there were 12 subjects
in each condition.


Confederates 


Confederates were three college-age males. Each
session required the services of two of them, one to
play the role of a handicapped person, the other to
play the normal. None of the confederates were
handicapped. The confederate assigned to the
handicapped condition for a particular session wore
a metal leg brace.


The Setting 


A partition divided the wall at the far end of a
room. The furniture arrangement on one side
of the partition was the mirror image of the other
side. On each side was a television monitor on a
table. The monitor faced out from the corner made
by the partition and the wall, at about a 45° orien-
tation from each. A few feet away, there were two
[...]
partition was empty. On one side of the partition
sat the normal confederate. On the other side sat
the confederate with the metal brace. In addition,
[...] crutches, which are made of metal, were
[...]

The subject came into the room at the
[...] from the monitors and confederates,
[...]

Cover story. The experimenter said that he
[...]
a brief description of it, which he said the subject
should read while he went into the next room to 
rewind the videotape. The experimenter returned 
after 2 minutes. When the subject finished reading, 
his attention was directed to a table located at the 
end of the partition. It was a small triangular table 
with the apex pointing toward the subject in line 
with the partition edge. The table was equidistant 
from the two confederates and from the two empty 
chairs. The subject was told to pick up a back- 
ground questionnaire from the table, take a seat in 
front of either monitor, and fill out the question- 
naire; the movie would begin shortly. After the 
movie, the real subject and the normal confederate 
were taken from the room, and the real subject was 
put in another room by himself and was asked to
fill out a movie evaluation form. Then the experi-
menter came in, probed for suspicion, and gave a 
debriefing. 

Different movie condition. The only difference 
from the same movie condition is that there was no 
mention of a broken videotape machine. Subjects 
were told that they had a choice of two movies and 
were given a description of each. They were told 
which movie was to be shown on which side of the 
partition. Signs taped to the edge of the table 
below each monitor repeated this information. They 
were told to sit by the monitor showing the film 
they would like to see. The two movies were 
described as follows: 


Slapstick. This film covers the great era of visual 
comedy and the clowns who made it great. In- 
cluded in the film are some of the top comics of 
the 1920s. Charlie Chase, Monty Banks, Fatty 
Arbuckle, Larry Semon, Andy Clyde, and others
appear.


Sad Clowns. Charlie Chaplin, Buster Keaton, and 
Harry Langdon, Hollywood’s comedy greats, all 
had widely differing styles and techniques, but 
a common ability to mix laughter and tears. 


The following were counterbalanced: which movie 
was associated with the handicapped person, which 
side of the partition the handicapped person was on, 
which confederate was handicapped. The experi- 
menter was blind as to which side of the partition 
the handicapped person was on. The confederates’ 
behavior prior to the subject’s choice of which side 
to sit on was standardized. Neither confederate 
looked up. Each appeared preoccupied with the 
background questionnaire. 


Results 


The first two columns of Table 1 present 
the choice data for the first experiment. In 
the same movie condition, 58% of the sub- 
jects sat with the handicapped confederate. 
In the different movie condition, however, in 
which subjects have an excuse for avoidance, 
Table 1
Seating Choices

                 Experiment 1            Experiment 2
              Same    Different    Same    Different    Social
Choice        movie   movie        movie   movie        comparison
Handicapped     7       2           11       6            5
Normal          5      10            1       6            7


only 17% sat with the handicapped. Thus,
as predicted, there is greater avoidance of the
handicapped in the different movie condition
than in the same movie condition. This differ-
ence between conditions is significant, χ²(1)
= 4.44, p < .05.
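
The between-condition comparison above can be reproduced from the Experiment 1 counts in Table 1. The sketch below assumes a Pearson chi-square on the 2 × 2 table without a continuity correction. The article does not say which formula was used, but the uncorrected one matches the reported value. The function name `chi_square_2x2` is illustrative and not from the article.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Experiment 1 seating choices from Table 1.
# Each tuple is (sat with handicapped, sat with normal).
same_movie = (7, 5)
different_movie = (2, 10)

chi2 = chi_square_2x2(*same_movie, *different_movie)
print(round(chi2, 2))  # 4.44
```

With 1 degree of freedom, the .05 critical value is 3.84, so a statistic of 4.44 is consistent with the reported p < .05.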

For purposes of assessing alternative expla- 
nations, it is useful to ask for each condition 
whether results differ from the 50-50 split 
one would expect by chance. In the same 
movie condition, the slight preference for the 
handicapped person (58%) does not differ 
from the null hypothesis of 50%, χ²(1) =
0.33, ns. In the different movie condition,
however, the 83% who chose the normal
person and avoided the handicapped person
is significantly greater than chance, χ²(1) =
5.33, p < .05. This pattern permits an alter-
native interpretation of the results in terms of 
social comparison (Festinger, 1954). The 
argument is as follows: In the same movie 
condition the subjects choose seats at ran- 
dom; in the different movie condition sub- 
jects believe that the two confederates have 
exercised a choice between the two movies. 
Because social comparison tendencies are 
stronger with a similar, that is, normal, other 
than with a dissimilar other, subjects are more 
influenced by the normal’s choice. Thus, 
rather than avoiding the handicapped person, 
the subject is choosing a movie he believes to 
be superior on the basis of its selection by the 
normal. To test this explanation, we ran a 

second experiment, repeating the two original 
conditions, adding a third condition, and 
making a few other changes. 
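
The within-condition comparisons against a 50-50 split reported in this section can be checked in the same spirit. This is a minimal sketch, assuming a one-degree-of-freedom chi-square goodness-of-fit test with equal expected counts. The helper name `chi_square_even_split` is hypothetical.

```python
def chi_square_even_split(count_a, count_b):
    """Chi-square goodness of fit against a 50-50 split (1 degree of freedom)."""
    expected = (count_a + count_b) / 2
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


# Experiment 1 counts from Table 1: (handicapped, normal).
same_movie = chi_square_even_split(7, 5)
different_movie = chi_square_even_split(2, 10)

print(round(same_movie, 2), round(different_movie, 2))  # 0.33 5.33
```

Only the different movie value exceeds the .05 critical value of 3.84, which agrees with the pattern described in the text.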
Experiment 2
Method

The physical setting was maintained for the ts


the far wall into three sections, each with a
vision monitor. The two outer sections were occ
the handicapped confederate, the other
Signs were tacked to the partitions
movie each was going to see;
never going to see the same movie. The
was always left for the subject. There
additional signs on the table in front of

was to take the sign with
wished to see and place it
Results


The third and fourth columns of Table 1
present the choice results for the second
experiment. As in the first experiment, a com-
parison of the two original conditions shows
fewer people choosing to sit next to the handi-
capped person in the different movie condition
(50%) than in the same movie condition
(92%), χ²(1) = 5.04, p < .05. Again, there
is greater avoidance of the handicapped when
there is an excuse for doing so. The manipula-
tion of number of movies, that is, noncommon
effects or plausible causes, affects avoidance
in the same way as in the first study and to
the same extent. In the second study, for some
reason, across these two conditions people
more often chose the handicapped person than
they did in the first study, χ²(1) = 5.37, p <
.05. This could be the result of a variety of
factors, such as different subjects, female
confederates, younger confederates,
[...]
An interesting possibility is that the con-
[...]
that eye contact enhances self-awareness. One
of the consequences of self-awareness is
[...]
which is accomplished by sitting next to him
or her. In this way eye contact may have led
to greater affiliation with the handicapped.
Snyder, Grether, and Keller (1974) report
data consistent with this line of reasoning,
which they interpret in a similar way. They
found that hitchhikers could improve their
chances of getting a ride by staring at
oncoming drivers.

[...]
movie condition, χ²(1) = 8.33, p < .01. And
together, the two experiments show that the
choice variation has the same effect at
different places on the scale of net tendencies
to approach or avoid the handicapped. 
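
The Experiment 2 statistics, and the comparison of the two studies pooled over the same movie and different movie conditions, can be recomputed from Table 1. The sketch below assumes uncorrected Pearson chi-squares. The function and variable names are illustrative and do not come from the article.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_even_split(count_a, count_b):
    """Chi-square goodness of fit against a 50-50 split."""
    expected = (count_a + count_b) / 2
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


# Table 1 counts as (sat with handicapped, sat with normal).
exp1 = {"same": (7, 5), "different": (2, 10)}
exp2 = {"same": (11, 1), "different": (6, 6)}

# Same movie versus different movie within Experiment 2.
condition_effect = chi_square_2x2(*exp2["same"], *exp2["different"])

# Same movie condition of Experiment 2 against a 50-50 split.
same_vs_chance = chi_square_even_split(*exp2["same"])

# Pool the two original conditions within each study, then compare studies.
exp1_total = tuple(sum(pair) for pair in zip(*exp1.values()))  # (9, 15)
exp2_total = tuple(sum(pair) for pair in zip(*exp2.values()))  # (17, 7)
study_effect = chi_square_2x2(*exp1_total, *exp2_total)

print(round(condition_effect, 2), round(same_vs_chance, 2), round(study_effect, 2))
# 5.04 8.33 5.37
```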

Now let us turn to the new condition, in 
which avoidance is irrelevant because the 
subject has no choice of seats but in which 
social comparison effects are possible. If in 
the different movie condition people sit with 
the normal not in order to avoid the handi- 
capped but rather to see the movie that they 
believe the normal selected, they should prefer 
the movie the normal selected in this new 
condition. There is only a slight preference 
for the normal’s movie. Seven subjects rather 
than the six expected by chance choose the 
normal’s movie and this result is, of course, 
well within the range of chance expectation, 
χ²(1) = 0.33, ns. There does not seem to be
a social comparison effect. This is not too 
surprising if we keep in mind that the normal 
confederate was a high school student dis- 
criminably younger and therefore different 
from the subject. Further evidence against a 
social comparison explanation of the main 
results is the significant preference for the 
handicapped person in the same movie condi- 
tion. Every subject but one did so. Social 
comparison theory would suggest that a more 
similar other, that is, the normal, would be 
preferred to inform the subject as to the 
appropriate response to the movie. Finally, 
although the ns are too small for formal
analysis, social comparison theory suggests 
that the results should be stronger for sub- 
jects who are the same sex as the confederates 
(Zanna, Goethals, & Hill, 1975). If anything, 
the results tended slightly in the other direc- 
tion: a little stronger for females in the first 
study when the confederates were male and a 
little stronger for males in the second study 
when the confederates were female. In short, 
there were no obvious consistent sex differ- 
ences, and certainly none that would suggest 
a social comparison explanation. 


Movie Evaluation 

For both experiments subjects evaluated 
the movie they saw on an eight-item rating 
form. Each item was a 7-point scale with an 
adjective at each end. The adjective pairs 
were: bad/good, weak/strong, passive/active, 
Table 2
Movie Evaluation

                 Experiment 1            Experiment 2
              Same    Different    Same    Different    Social
Choice        movie   movie        movie   movie        comparison
Handicapped
  M           40.3    30.0         29.7    38.8         33.9
  n            7       2           11       6            5
Normal
  M           34.0    38.6         38.0    34.0         32.7
  n            5      10            1       6            7
Combined
  M           37.2    37.7         31.2    36.4         33.3
  n           12      12           12      12           12


not at all humorous/humorous, uninteresting/
interesting, frivolous/thoughtful, boring/
exciting, and unimportant/important. The 
second adjective in each pair defined the high 
end of the scale. A single score was derived 
by summing across the eight adjectives. 

Table 2 presents the movie evaluation
results for both experiments. We thought that
evaluations might be high in the different
movie condition, as subjects who avoided the
handicapped person may deny that they did
so by emphasizing that their decision was
based on the desirability of the movie. Sim-
ilarly, those in this condition who sit with
the handicapped person may wish to deny
that the handicap made a difference, and thus
they may also stress the desirability of the
movie. In Experiment 1, the average movie
rating is slightly and nonsignificantly higher
in the different movie condition, t(22) = .15.
In the second experiment, compared to sub-
jects in the same movie condition, different
movie condition subjects do rate the movie
more highly, t(33) = 2.08, p < .05. How-
ever, this comparison fails to control for self-
selection and dissonance reduction with re-
spect to the movie choice in the different
movie condition. To control for these pro-
cesses, we need to use the social comparison
condition as a baseline. When we do so,
results are in the same direction but short of
significance, t(33) = 1.37, p < .18. The over-
all F for the evaluation data is also short of
significance, F(2, 33) = 2.23, p < .12.



Any semblance of order in the movie eval-
uation data breaks down when we separate
subjects according to their movie or seat
choice (see Table 2). In retrospect, processes
of dissonance reduction, self-selection, and
justification of avoidance of the handicapped
may have been joined by the additional fac-
tors of arousal about being seated next to a
handicapped person, the ease of making a
decision, and confederate sex to create noise.
We are not inclined to offer a precise interpre-
tation of the movie evaluation data.

In both studies, the experimenter talked
with the subject about the reasons for choos-
ing a movie or a seat. This was done suffi-
ciently systematically in the second study to
report the following: In the same movie con-
dition, of the 11 who sat with the handicapped
person, 3 mentioned the handicap in their
explanation, for example, 1 said “because
people usually avoid them” and 2 said “be-
cause they did not want to avoid [...].”
One indicated a preference for [...]. An-
other mentioned a preference for the right
side and added that there was no particular
reason. The other 5 gave only the latter
reason, or rather nonreason, as did the 1
subject who chose the normal. They said such
things as “it was random,” “no reason,” or
“I just chose.”

In the different movie condition, as
expected, most justified their decision on the
basis of a movie preference: five of six who
sat with the handicapped confederate and
also five of six who sat with the normal. One
who sat with the handicapped said there was
no reason. One who sat with the normal said
he chose by a mental coin toss.

In the social comparison condition, of the
five choosing the handicapped person's film,
two said it was a film preference, and three
said they just chose. Among the seven choos-
ing the normal’s film, six said it was a film
preference, one said “I just chose.”

To sum up, 3 of the 11 subjects who sat
with the handicapped person in the same
movie condition acknowledged that the
handicap played a role, but the handicap was
unmentioned by the others. Subjects are not
particularly willing to admit to affiliation
on the basis of a handicap and appear unwilling
to admit to avoidance on that basis.


Anxiety


The anxiety measure was the sum of re-
sponses to “clutched up,” “tense,” “fearful,”
“nervous,” “on edge,” and “jittery.” The
score for each adjective could range from 0
to 3, and thus total anxiety scores could range
from 0 to 18. Table 3 presents the results
of both the first administration of the Mood
[...]
which movie to see, and the second admin-
istration, which came after the
[...]
not differ on Score 1, F(2, 33) = 1.42, p =
.26, nor on Score 2, F(2, 33) = 1.59, p =
[...]. However, within the different movie con-
dition, subjects who sat with the handicapped
person reported more anxiety right after the
choice (Score 1) than did those who sat with
the normal, t(10) = 3.18, p < .01. If this
[...]
11 subjects in the same movie condition
who sat with the handicapped. For these
subjects, anxiety levels are lower, almost
significantly so, t(15) = 1.92, p < .08. An-
other finding of interest is the high level of
anxiety among subjects in the social com-
parison condition who chose the normal’s
[...]
compared the
[...] movie condition who did so; for Score 1, t(11)
[...]

To summarize the anxiety results, experi-
[...]
by those in the different movie condi-
tion who sat with the normal. By the time
Table 3
Experiment 2: Anxiety Scores

                 Same     Different    Social
Choice           movie    movie        comparison
Handicapped
  Score 1        2.36     5.50         3.20
  Score 2        1.82      .67         2.80
  n              11       6            5
Normal
  Score 1        0.00     0.00         5.57
  Score 2        0.00      .33         2.71
  n              1        6            7
Combined
  Score 1        2.17     2.75         4.58
  Score 2        1.67      .50         2.75
  n              12       12           12


anxious. The decline in anxiety is not signifi-
cant for social comparison subjects who chose
the normal’s movie, t(6) = 1.58, p < .17. It
is significant for different movie subjects who
sat next to the handicapped person, t(5) =
3.24, p < .03.


Discussion 


Our goal was to illustrate a general strategy 
for detecting motives that people wish to 
conceal. We suspected that people desire to 
avoid the handicapped but do not wish to 
admit it. In a kind of bootstrap operation, we
have demonstrated the general strategy by 
using it to reveal this motive. The strategy 
is to ask people to choose between two alter- 
natives, one of which accidentally happens 
to satisfy the suspected motive. In terms of
attribution theory, having the decision to
avoid or to affiliate with the handicapped also
be a decision between two other alternatives 
is to add noncommon effects of the decision 
(Jones & Davis, 1965) or plausible causes 
for the decision (Kelley, 1972). Increasing 
noncommon effects or plausible causes creates 
ambiguity about the reasons for behavior. 
The person may act on an unacceptable 
motive while appearing to select on some 
other basis.

Concretely, we had subjects choose between 
sitting with a handicapped person or a 
normal, or choose between two movies, one of 
which entailed sitting next to a handicapped
person, the other of which entailed sitting 
next to a normal. In the latter case, we pre- 
dicted greater avoidance of the handicapped 
because, objectively, motivation is ambiguous 
and avoidance of the handicapped can mas-
querade as a movie preference. In both
experiments we found, as predicted, greater 
avoidance when the choice between people 
was also a choice between movies. 

Anxiety scores immediately after the deci- 
sion were high for the six subjects in the 
different movie condition who sat next to the 
handicapped person. But it is questionable 
whether simple proximity to the handicapped 
is the explanation, as the 11 subjects who sat 
next to the handicapped confederate in the 
same movie condition had much lower scores 
—a difference that statistically falls short 
of significance. Perhaps after sitting down 
they realize the ease with which they could 
have avoided the handicapped person—by 
choosing the other movie—and they begin to 
suspect that they possess some sort of morbid 
curiosity. 

The other group with a high anxiety score 
right after the decision is the seven people 
in the social comparison group who chose the 
normal’s movie. Our attempt at explain- 
ing this result is rather involved, and as we 
find it less than compelling, we will dispense 
with it here. Our final comment on the anxiety 
scores is that whatever differences exist right 
after the decision have washed out by the 
time the movie is over. 

Similarly, the movie evaluation data are 
sufficiently complex to preclude a simple inter- 
pretation. One thing that might have hap- 
pened but did not—at least not in any strik- 
ingly obvious fashion—is subjects in the 
different movie condition did not justify 
avoiding the handicapped person by rating 
the movie highly. But consider subjects’ 
explanations for the choice during debriefing 
of the second experiment: Even though movie 
ratings were not high, subjects in the different 
movie condition said that their decision was 
based on a film preference. No one said it 
was to avoid the handicapped. There were 
only three people who mentioned the handi- 
cap in explaining their decision. They sat next 
to the handicapped person in the same movie
condition and in debriefing expressed concern 
about not wanting to avoid the handicapped 
person. 

These results still leave open the following
question. When the subject does avoid the
handicapped, who is the subject trying to
fool: the handicapped person, the experi-
menter, or himself? If the phenomenon of
rationalized avoidance occurs in the absence
of the experimenter, imagine what the experi-
ence of the handicapped person may be like
in a world so complex that it is much more
like our different movie condition than like
our same movie condition. The data suggest
that the handicapped person may be repeat-
edly rebuffed in social encounters by people
who give what may seem to them to be rea-
sonable excuses. The handicapped person can
then distill the attitudes of others toward him 
in the same way that we can in the current 
study. This casts doubt on whether people 
really fool the handicapped when they give 
an excuse for avoiding. 

The handicapped are probably not the only 
group of people to receive such treatment. For
one, Gaertner and Dovidio (1977) present 
data suggesting that blacks are comparably 
treated. They used the bystander intervention 
paradigm (Darley & Latané, 1968). Subjects 
were led to believe that a confederate was 
either black or white. Then it was made to 
appear that the confederate was in need of 
help because a stack of chairs had fallen on 
him. In addition to varying race of victim, 
the experimenters also varied the number of 
bystanders present. Either the subject was 
the only potential helper, or she was led to
believe that she was one of three poten-
[...]

[...] desire when they have an excuse such as
the assumption that others had already helped
or the thought that somebody else would be
a better helper. We concur with this interpre-
tation. We would argue that the three-helper
condition is much like our different movie
condition. It permits the person to act on an
unacceptable motive because it provides other
plausible causes for doing so. Once again,
ambiguity about the motive behind behavior
permits the person to fulfill an objectionable
desire.

In using our motive detection strategy, one
problem is how to select the two choice alter-
natives that constitute the added noncommon
effects. A balance must be struck. On the one
hand, if they are too trivial, they may lack
plausibility as a basis for a decision. For
instance, would we have gotten avoidance of
the handicapped if we had had noticeably
different, but only slightly different, chairs on
the two sides of the room? On the other hand,
if the alternatives are too important in and
of themselves, they may overwhelm the
motive to avoid the handicapped. If the film
will last 3 hours and one is a western and the
other is a satire, the person who hates west-
erns and loves satires may choose on the basis
of a real movie preference even if the choice
entails the discomfort of sitting next to the
handicapped.

If motives can only be distilled across many
decisions, is this strategy of any value in
detecting the hidden motives of a single indi-
vidual? We believe so. The problem is to get
the person to make several decisions without
arousing suspicion about what one is looking
for. If the decisions are between several dif-
ferent kinds of alternatives and if they take
place at several different times in several
different contexts, the problem of suspicion
may be overcome. Alternatively, instead of
asking for several decisions, one could ask for
only a few, but each might have several alter-
natives, only one of which satisfies the hidden
motive. If we could avoid arousing suspicion,
we could ask for a single decision among
twenty alternatives and randomly select the
one that satisfies the hidden motive. Selection
of that alternative would be significant at the
.05 level.
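The arithmetic behind the single-decision and several-decision versions of the strategy can be sketched briefly. The sketch assumes that a chooser with no hidden motive picks each alternative with equal probability and that repeated decisions are independent. The function name and the illustrative counts (20 alternatives, five two-way choices) are hypothetical and are not taken from the study.

```python
from fractions import Fraction


def chance_of_target(n_alternatives: int, n_decisions: int = 1) -> Fraction:
    """Probability that a chooser with no hidden motive picks the designated
    alternative on every one of n_decisions independent choices, assuming
    all alternatives are equally likely."""
    if n_alternatives < 1 or n_decisions < 1:
        raise ValueError("counts must be positive")
    return Fraction(1, n_alternatives) ** n_decisions


# Illustrative values only (assumptions, not data from the article).
single = chance_of_target(20)        # one decision among 20 alternatives
repeated = chance_of_target(2, 5)    # five independent two-way decisions
print(float(single), float(repeated))
```

Under these assumptions, a few decisions with many alternatives each can reach the same chance level as many two-way decisions, which is the trade-off the paragraph describes.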




We believe that the layperson often uses 
informal versions of this strategy to get infor- 
mation without creating offense. Two women 
inspect the menu posted outside a French 
restaurant, trying to decide whether to enter. 
The wealthier one, suspecting that the other 
thinks it is too expensive but is too shy to 
say so, asks whether the other is more in the 
mood for Italian cuisine than for French. 
When the other replies that yes, she is more 
in the mood for Italian food, the first will 
conclude that the French restaurant was too 
expensive for her friend, but her friend is 
saved the embarrassment of admitting it. And 
there are times when the person with the 
hidden motive will make the excuse herself. 
For example, without prompting she may 
state that she is not in the mood for French 
cuisine. In short, the creation of causal 
ambiguity can smooth social interaction and 
permit participants to save face by provid- 
ing socially acceptable rationalizations for 
behavior. 

Several studies suggest that people who 
fear failure on a task will take steps to 
provide themselves with an excuse for failure 
by not trying very hard (Frankel & Snyder, 
1978; Snyder, Smoller, Strenta, & Frankel, 
Note 1) or by ingesting a drug that they 
believe will inhibit performance (Berglas & 
Jones, 1978). Low effort and performance- 
inhibiting drugs are plausible causes for fail- 
ure, and their presence creates ambiguity 
about whether failure should be attributed to 
them or to low ability. 

We are pursuing the general theme that 
people create attributional ambiguity as in 
the studies just mentioned and take advan- 
tage of it as in the different movie condition 
of the current research. Here are some other 
possibilities under consideration. Poor per- 
formance in the laboratory by the depressed 
(e.g., Klein, Fencil-Morse, & Seligman, 1976) 
and by those who believe in external control 
of their fate (Hiroto, 1974) may be the result 
of a low effort strategy designed to create 
ambiguity about the reasons for failure. We 
have already shown this to be so for poor 
performance in the so-called learned helpless-
ness paradigm. In this paradigm, subjects are
first given unsolvable problems and then a
second task that can be done. Subjects per-
form worse on the second task than those
initially given solvable problems or no prob-
lems. Working in this paradigm, we gave sub-
jects ready-made excuses for failure on the
second task by describing the task as difficult 
(Frankel & Snyder, 1978) or by playing 
music alleged to be distracting (Snyder et al., 
Note 1). Their performance improved, pre- 
sumably because the excuse we provided 
allowed them to try without forcing them to 
blame failure on lack of ability. 

We can also ask whether procrastination is 
sometimes a means of providing an excuse of 
lack of time for possible failure. Does the 
involvement in multiple and diverse achieve- 
ment activities that characterizes the Type A 
or coronary-prone behavior pattern allow such 
a person to lay failure in a single activity to 
having spread himself too thin? Is social 
comparison with similar others avoided or 
obscured when one fears a negative con- 
clusion? Is affiliation under stress sometimes 
a means of creating ambiguity about respon- 
sibility by diffusing it among members of a 
group? We think the time is ripe to explore 
how people exploit rational principles of 
attribution (e.g., correspondent inference 
theory, the discounting principle) to serve 
such needs as self-esteem and self-presenta- 
tion through the creation and judicious use 
of attributional ambiguity. 


Reference Note 


1. Snyder, M. L., Smoller, B., Strenta, A., & Frankel, 
A. A comparison of egotism, negativity and learned 
helplessness as explanations for poor perform- 
ance after unsolvable problems. Unpublished man- 
uscript, 1979. (Available from Melvin L. Snyder, 
Psychology Department, Dartmouth College, Han- 
over, New Hampshire 03755.) 
Does Aggression Cause a Preference for Viewing Media Violence?


Allan Fenigstein
Kenyon College


Virtually all of the research concerned with media violence has attempted to
determine whether the viewing of violence is associated with or causes aggres-
sion. Little experimental research has been directed at understanding why
persons view media violence. The present research experimentally tested the
hypotheses that physical aggression and fantasy aggression would lead to a
preference for viewing violence. In Experiment 1, undergraduate men and 
women were induced to express aggressive, nonaggressive, or no fantasies and 
were then given an opportunity to select film clips for viewing. The films 
chosen by men contained more violence than those chosen by women. In addi-
tion, aggressive fantasies in males, compared to nonaggressive fantasies,
increased the preference for viewing violence. Experiment 2, using only males, 
replicated the results of the first study and also found that men who were 
given an opportunity to aggress physically, compared to those who had no such 
opportunity, were more likely to choose to view films containing violent con- 
tent. These results provide an additional perspective on the relationship 
between the observation of violence and the expression of aggression by sug- 
gesting that the causal effects are bidirectional: Just as the viewing of violence 
may increase aggression, so, too, aggressive behavior may increase the pref-
erence for viewing violence.


Psychologists have shown a strong interest
in the relationship between the mass media
and behavior, especially with respect to ob-
served and expressed aggression. Extensive
correlational research using dif-
ferent indices of violent programming and
aggressive behavior has consistently shown
that exposure to violence and antisocial be-
havior are positively related (e.g., Dominick
& Greenberg, 1971; Lefkowitz, Eron, Walder,
& Huesmann, 1971; McIntyre & Teevan,
1971; and McLeod, Atkin, & Chaffee, 1971).
One causal explanation of this relationship
that has received considerable experimental
support has been that the observation of
violence leads to aggressive behavior. Since
the early 1960s, an impressive number of
laboratory and field experiments have inves-
tigated the consequences, for aggressive be-
havior by the observer, of observing violence
(e.g., Bandura, Ross, & Ross, 1961; Berk-
owitz, 1965; Berkowitz & Geen, 1966; Leifer
& Roberts, 1971; Liebert & Baron, 1972;
Murray, 1973; Stein & Friedrich, 1971;
Steuer, Applefield, & Smith, 1971). Although
these experiments were conducted under sev-
eral diverse theoretical frameworks such as
social learning theory and drive theory, and
in spite of some disconfirming evidence (e.g.,
Feshbach & Singer, 1971; Kaplan & Singer,
1976; and Milgram & Shotland, 1973), there
is a remarkable degree of convergence in the
experimental literature suggesting that the
observation of violence contributes to aggres-
sion (Geen, 1976).

Portions of this article were presented at the
Annual Convention of the American Psychological
Association, Toronto, August 1978. This research
was conducted while the author was on leave at the

In view of the predominance of violence in 
the mass media (Baker & Ball, 1969; Clark 
& Blankenburg, 1971; Gerbner, 1971) and 
its potential for harm, it would seem impor- 
tant to know why persons seek out violent 
material for viewing. Although research on
media violence and aggression has been pro- 
lific, virtually no experimental work has been 
directed toward understanding the determi- 
nants of the preference for viewing violence 
(Kaplan & Singer, 1976; Weiss, 1969). The 
present studies attempt to identify some fac- 
tors that may affect one’s exposure to media 
violence. Specifically, it is proposed that 
aggression on the part of the viewer leads to 
a heightened interest in viewing violence. A
number of previous studies have found indi-
vidual differences in the preference for view-
ing violence that are consistent with this
proposal (Diener & DeFour, 1978; Greenberg
& Gordon, 1971). Diener and DeFour, for
example, found a positive relationship for men 
between dispositional aggression and liking 
for program violence. One causal explanation 
for these findings is that aggressive persons 
seek out violent programs; presumably, ag- 
gressive thoughts or actions set in motion 
certain psychological processes, such as the 
need to understand one’s behavior, which then 
cause persons to seek out portrayals of 
violence in the media. 

The hypothesis that aggression is a causal 
antecedent of the viewing of violence may 
also be seen as an alternative interpretation 
of the well-established correlation between the 
observation of violence and the occurrence of 
aggression. As noted earlier, a causal relation- 
ship positing that the viewing of violence 
facilitates aggressive behavior has received 
considerable experimental support. This evi- 
dence, however, does not provide a complete 
explanation of the aggression-television rela- 
tionship; a substantial portion of the variance 
in this relationship remains unaccounted for, 
and additional explanatory mechanisms are 
warranted (Eron, Lefkowitz, Huesmann, & 
Walder, 1972). Furthermore, the existence 
of a unidirectional causal relationship does 
not rule out the plausibility of a bidirectional 
relationship. It is quite possible that aggres- 
sive behavior would result in congruent view- 
ing preferences, while at the same time media 
exposure would be shaping subsequent be- 
haviors. Although both explanations are 
feasible, only one—that viewing violence 
increases aggression—has been systematically 
subjected to experimental verification. The
present research contends that a better and
more complete understanding of the relation-
ship between aggression and the visual mass
media can be achieved by examining the
hypothesis that aggression may also increase
selective exposure to filmed violence.

There are two correlational studies that
have used statistical techniques to determine
if a preference for violent TV programs is
caused by aggression. Chaffee and McLeod
(1973), using a systematic process analysis,
found no relationship between reports of ag-
gressive behavior and preference for
violence. However, these investigators found
only a weak correlation between their mea-
sure of preference and actual viewing of
violent programs. Thus, the validity of the
preference measure itself is somewhat open
to question. Lefkowitz et al. (1971), using a
cross-lagged panel technique, found no rela-
tionship between aggressive behavior at age
9 and the viewing of violence on TV 10 years
later, again suggesting that the preference for
viewing violence is not caused by prior aggres-
sion. However, this study failed to rule out
any immediate effects of aggression on the
preference for viewing violence, and the reli-
ability of the measures used is also open to
question (see Eysenck & Nias, 1978). In
addition to the specific drawbacks of these
two studies, correlational research in general
cannot control all relevant variables, and the
possibility of interference by third-order
variables rules out any definitive conclusions
regarding causal relationships. Even if the
data generated by these studies were com-
pletely reliable and valid, the fact remains
that only two studies (both correlational)
exist that address the question of the effects
of aggression on the preference for viewing
violence, and neither of these studies has
examined the problem through direct, con-
trolled experimentation.

The first study reported here focused spe- 
cifically on fantasy aggression as a causal 
antecedent of the preference for mass media 
violence. The fact that much of the violence 
on television is a visually presented fantasy
suggests that aggressive fantasies may be par-
ticularly relevant to an understanding of TV


viewing preferences, and previous research
has shown that the two variables are related
(Feshbach, 1976). One explanation for this
relationship may be found in Feshbach's
(1961) work showing that watching violence
on TV produced an increase in fantasy aggres-
sion. However, the present research suggests
a different causal sequence: the presence of
aggressive fantasies may lead to a preference
for viewing violence. Thus, it was predicted
that subjects who were induced to express an
aggressive fantasy would have a greater
interest in viewing violence


requirement in introductory psychology.
Procedure.


gressive fantasy. In addition there was a
that was not given any opportunity to fanta-
All subjects were subsequently asked to select
clips for viewing. Subjects in the fantasy condi-
were no gender differences on the ratings, suggesting
that it is unlikely that a sex bias was operating in
the perception of the films. Interrater agreement was
high, averaging 84.9%, indicating that the perceived
aggressive content, degree of interest, and amount of
action were reasonably consistent across viewers.¹ As
an example, on a scale of 0 (contains none at all)
to 3 (contains a great deal of), the film described
by the phrase “the Mardi Gras festival” received
average ratings of 1.58 on aggression, 2.61 on interest,
and 2.76 on action; the film described as “a person
getting hit by a brick” had average ratings of 2.90
on aggression, 0.84 on interest, and 2.15 on action.
The total aggression, interest, and action scores of
the 10 films chosen constituted the major dependent
variables.²

These three dimensions were assessed in an attempt 
to remove, through covariance techniques, a persist- 
ent confound that pervades research on media 
violence: specifically, that a violent program usually 
contains a good deal of action and that violence often 
occurs in a context of other interesting story ele- 
ments; thus it is difficult to measure the effect of 
violence alone. This confound was confirmed by 
pretest scores indicating that aggressive content was 
significantly correlated (p < .01) with both interest
(r = .54) and action (r = .81). By measuring and
statistically controlling for the amount of action and 
interest of each film chosen, it was possible to obtain 
a relatively unconfounded indication of the preference 
for viewing films with aggressive content. 
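The confound check described above rests on product-moment correlations between the rated dimensions of the films. A minimal sketch of that computation follows, assuming the Pearson coefficient was the statistic used. The function name and the rating values are hypothetical illustrations, not the pretest data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical mean ratings for six film descriptions on the 0-3 scale.
aggression = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
action = [1.0, 1.5, 1.5, 2.5, 2.0, 3.0]
r = pearson_r(aggression, action)
print(round(r, 2))
```

A high positive r between aggression and action ratings is what motivates treating action (and interest) as covariates rather than interpreting raw aggression scores alone.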

Following the choice of films, all subjects were 
given a questionnaire that assessed their expecta-
tions of the films they chose. Subjects were asked
to rate the chosen films, as a group, on 7-point
bipolar scales measuring interest, humor, violence,
action, educational value, and aesthetic value. Experi-
mental subjects only were also given a second ques-
tionnaire, which asked them to indicate the motives


subjects were debriefed. 


Results 


In view of the possibility of gender differ- 
ences in aggression (see Maccoby & Jacklin, 


1 Interrater agreement was assessed by dichotomiz- 
ing the aggression rating continuum into low aggres- 
sion scores (0 & 1) and high aggression scores (2 & 
3). The percentage of agreement on each film descrip- 
tion was then calculated: for example, if 11 of the 
14 raters gave a high score, then the agreement on 
that film was 78.57%. The percentage of agreement 
for each film was then used to calculate an overall 
mean agreement of 84.9%. 
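The agreement index in this note can be sketched as follows. The sketch assumes that agreement on a film is the percentage of raters falling in the larger of the two dichotomized groups (the note gives only the case where the majority rated high), and the function names and rating values are hypothetical.

```python
def film_agreement(ratings, cutoff=2):
    """Percentage of raters in the majority half after dichotomizing the
    0-3 scale into low (below cutoff) and high (cutoff or above)."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    high = sum(1 for r in ratings if r >= cutoff)
    majority = max(high, len(ratings) - high)  # assumption: majority defines agreement
    return 100.0 * majority / len(ratings)


def mean_agreement(ratings_by_film):
    """Overall agreement: the mean of the per-film agreement percentages."""
    per_film = [film_agreement(r) for r in ratings_by_film]
    return sum(per_film) / len(per_film)


# Hypothetical ratings from 14 raters; 11 of them give a high score (2 or 3).
one_film = [2] * 6 + [3] * 5 + [1] * 2 + [0]
print(round(film_agreement(one_film), 2))
```

With 11 of 14 raters in the high group, the function reproduces the 78.57% figure given in the note.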

2 All film descriptions, scores, and measures are 
available upon request from the author. 
Table 1
Mean Aggression Scores of Films Chosen

                      Fantasy condition
Gender     Nonaggressive     None     Aggressive

Women
  M            1.24          1.07        1.09
  n             13            22          10
Men
  M            1.28          1.49        1.71
  n             10            20          12


Note. Scores range from 0 (low aggression) to 3 
(high aggression). 


1974), sex was used as a variable in the 
analyses. 

Success of the fantasy manipulation. The 
word-story method used in the present study 
had no prior validation as a technique for 
inducing fantasies. Thus, it was necessary to 
determine whether the fantasy groups differed 
in terms of the stories they told. When 2 
(Nonaggressive, Aggressive Word-Story) X 2 
(Male, Female) analyses of variance were 
performed on the subjects’ perceptions of the 
motives and emotions expressed in their 
stories, sex had no effect, nor were there any 
significant interactions. However, subjects 
using aggressive words perceived their stories 
as expressing significantly more fear (p < 
.004), disgust (p < .002), anger (p < .001), 
and hatred (p < .001), and significantly less 
pleasure (p < .004), happiness (p < .001) 
and love (p < .04) than those using nonag- 
gressive words. Similarly, aggressive-word 
subjects felt their stories had indicated 
greater needs for aggression (p < .001) and 
dominance (p < .001) and lesser needs for 
nurturance (p< .01) and affiliation (p< 
.02) than the nonaggressive-word subjects. 

In addition, preliminary content analyses 
of the fantasies leave little doubt that the 
word-story technique is a successful means of 
inducing specific fantasies. In view of this 
success, it may be well to point out that this 
method is considerably easier to administer 
and score and, perhaps more important, may 
offer greater experimental control over the 
nature of the fantasy induced, than other well 




established fantasy instruments such as the
Thematic Apperception Test (TAT).

Aggressive content of chosen films. The
total aggression scores of the 10 films chosen
are presented in Table 1. A 3 (Aggressive, Non-
aggressive, or No Word-Story) X 2 (Male,
Female) analysis of variance on these scores
revealed a main effect for sex, F(1, 81) =
39.5, p < .001, and a significant Sex X Fan-
tasy interaction, F(2, 81) = 6.6, p < .002.
Films chosen by men contained significantly
more aggression than those chosen by women.
Planned comparisons showed that when in-
duced to express aggressive fantasies, men
had a far stronger preference for observing
violence than women, t(20) = 5.56, p < .001.
Also, in the absence of any induced fantasy,
men were more likely than women to prefer
films containing violence, t(40) = 4.49, p <
.001.


Of more interest to the hypotheses of the 
present study is the finding that aggressive 
fantasies, compared to nonaggressive fan-
tasies, increased the amount of violence in the
films chosen by males, t(20) = 4.0, p < .001,
but had no effect on the violent content of 
films chosen by women. 

Interest and action levels of chosen films. 
The films chosen by males contained more
action, F(1, 81) = 16.2, p < .001, and were
more interesting, F(1, 82) = 5.37, p < .05,
than those chosen by women. However, the
preferences for action and interest in films
chosen were not affected by the type of
fantasy induced, nor were there any signifi-
cant interactions between the sex and fantasy
variables on these dimensions. 

Covariation. In view of both the sex
effects for action and interest content of the
films and the previously reported intercorre-
lations between aggression, interest, and
action content of the films, a 3 X 2 covariance
analysis was performed on the total aggression
scores of the films chosen, using action and
interest scores as covariables. This analysis
yielded a significant effect for sex, F(1, 79) =
8.00, p < .006, and a significant Sex X Fan-
tasy interaction, F(2, 79) = 5.38, p < .01,
suggesting that the preferences for violent
film content operate independently of the
films’ action and interest content.


Gender effects. It may be argued that
aggression in contemporary society is essen-
tially a masculine behavior, and the present
study found that this gender difference ap-
parently generalizes to film preferences. Sev-
eral reasons for this difference may be sug-
gested. Socialization strongly affects the
expression and acceptability of aggression
(Bandura, 1973), resulting in a greater fre-
quency of physically aggressive acts on the
part of males than females. This increased
familiarity with and exposure to aggressive
behavior may heighten the attractiveness of
aggressive stimuli (Zajonc, 1968). Men's
preferences for viewing violence may also be


results of the present study may be offered 
Partial support of the validity of the film 


up provi 
i of the study. But since there was no 
e that these cues led to differences in 
by any subject. Thus, there is no evidence to
support a strong demand effect.
In view of some recent evidence suggesting 


37) = 50.9, p < .001, again providing evi- 
dence for the existence of sex differences in 
the preference for viewing violence. 

Fantasy effects. The present findings sup- 
port the hypothesis that the expression of 
aggressive, compared to nonaggressive, 
thoughts and images would result in corre- 
sponding viewing preferences, but this was 
the case only for men. For women, the degree 
of preference for violent films was relatively 
constant across all conditions, and generally 
lower than men’s preferences; this may best 
be attributed to socialization pressures that 
produce a consistent indifference or antipathy 
on the part of women toward violence in the 
media. 

It is clear that men’s film preferences were 
affected by the nature of their preceding fan- 
tasy, but the underlying mechanisms are open 
to speculation. Singer (1966) has shown that 
the expression of specific fantasies may arouse 
corresponding motives. Thus, subjects in the 
aggressive fantasy condition, compared to 
those in the nonaggressive fantasy condition, 
may have experienced a heightened motiva- 
tion toward aggression. It may be argued 
that persons who are motivated to aggress
show an increased preference for violent 
media programs under several different cir- 
cumstances: Given that the observation of 
violence facilitates aggressive behavior (Geen, 
1976), a person in a prepotent state of ag- 
gression may seek out media violence; and if 
there is no expectation or opportunity for 
aggression, then media violence may be 
sought out for its symbolic catharsis value 
(Geen, 1976), as a means of vicariously sat- 
isfying the aggressive drive (Pytkowicz, 
Wagner, & Sarason, 1967; Rosenbaum & 
DeCharms, 1960). 

Alternatively, the underlying explanation 
may involve cognitive rather than behavioral 
processes: Fantasy aggression may lead to 
increased exposure to violent stimuli for pur- 
poses of social comparison (Festinger, 1954). 
That is, one may wish to compare his own 
visual image of an aggressive event with that 
provided by the media. Clearly, further re-
search is needed to identify the nature of the
processes that underlie the relationship 
between aggressive fantasies and exposure to 
media violence. But despite the absence of 
any clear explanatory mechanisms, the signifi- 
cance of the basic phenomenon should not be 
overlooked: Aggressive fantasies in males 
have been shown to increase their preference 
for viewing violence. Thus, an important 
antecedent of the preference for viewing 
violence has been uncovered. In addition, the 
relationship between TV violence and aggres- 
sion may be viewed from a new perspective: 
Not only does television violence increase 
aggressive behavior, but aggression, in the 
form of fantasies, also leads to a heightened 
preference for filmed violence. 


Experiment 2 


The results of the first study left un- 
answered the question of whether the tend- 
ency to observe violence is affected by overt 
aggressive behavior. Earlier correlational 
studies demonstrating a relationship between 
observed violence and aggression were con- 
cerned specifically with behavioral measures 
of aggression (e.g., Dominick & Greenberg,
1971; McIntyre & Teevan, 1971). As sug-
gested earlier, a reasonable explanation for
this relationship is that aggressive behavior
increases the preference for viewing violence.
Several mediating processes may be proposed.
Persons who behave aggressively may selec-
tively expose themselves to media violence
because of a need to justify their prior be-
havior (Aronson, 1976; Brock & Buss, 1962;
Glass, 1964). Watching others act aggres-
sively may provide that justification by inti-
mating that aggression is an appropriate,
acceptable behavior (Bandura, 1973; Geen,
1976). In addition, aggression may arouse the
need to understand one's aggressive behavior,
and that need may be satisfied through social
comparison processes (Festinger, 1954), lead-
ing to the observation of others' aggressive
behavior on film. In this context, it is interest-
ing to note that exposure to sexually explicit
stimuli is also believed to be a function of
social comparison needs; that is, it provides
a way of comparing ourselves with others
(Byrne, 1976).

Thus, the second study hypothesized that
physical aggression would increase the pref-
erence for viewing violent film stimuli. Fan-
tasy aggression was also included as a
variable in this study because of the desir-
ability of replicating the findings of Experi-
ment 1.


Method 


Subjects. The subjects were 64 male undergrad-
uates who participated as part of a course requirement


nonaggressive fantasy; a third go% 
fantasy induction. In the se 


ip 
After completing the first part, all subjects
to a second room, where another experimenter

to the subjects’ fantasy condition)
next part of the study. During this time, the
subject and confederate participated together. For


to
“decide how creative each response i
be communicated through the use of
a Buss (1961) type aggression apparatus that
deliver 10 different levels of “static noise” to
confederate through earphones. The subject was
provide one evaluation


subject's button panel.
did not wear earphones, and there was no mention
of the use of noise. As before, subjects were informed
that Button 1 indicated a highly creative response
and Button 10 a noncreative response. Again, it was
explained that the experiment was interested in
moderate (4 to 7) levels of evaluation. In both condi-
tions, confederates surreptitiously recorded the
levels used by the subject, and the experimenter
briefly questioned each subject concerning his per-
ceptions of the button’s effects. This information was
elicited during the final debriefing session. In
all cases, subjects in the noise condition indicated
awareness of the aversiveness of their responses,
whereas no such information was offered by subjects
in the no noise condition.
After the second part of the study had been com-
pleted, the subject alone was sent to a third experi-
menter, who was blind as to the subject’s experi-
mental condition. As in Experiment 1, the subject
was told that this part of the study was interested
in the “effects of visual stimuli on thought pro-
cesses.” The procedure for choice of films used in
the first study was followed here. Following the
choice of films, all subjects filled out questionnaires
assessing their expectations of the chosen films, and
fantasy subjects also indicated the motives and
emotions expressed in their fantasies. After completion
of the experiment, participants were fully debriefed.
There was little indication of suspicion, and
no subject was able accurately to ascertain the
actual hypothesis of the study.


Fantasy manipulation. The success of the 
fantasy induction was determined by perform- 
ing 2 (Nonaggressive, Aggressive Word- 
Story) x 2 (No Noise, Noise) analyses of 
variance on the subjects’ perceptions of the 
motives and emotions expressed in their 
stories. The noise variable had no effect, and 
there were no significant meaningful inter- 
actions. There were strong effects, however, 
for the word-story manipulation, and these 
were virtually identical with the results of the 
first study. Subjects using aggressive words 
perceived their stories as expressing signifi-
cantly more fear (p < .004), disgust (p <
.001), anger (p < .001), and hatred (p <
.001), and significantly less pleasure (p < 


Table 2
Mean Aggression Scores of Films Chosen

                      Fantasy condition
Stimulus   Nonaggressive     None     Aggressive

No noise
  M            1.28          1.41        1.48
  n             12             8          12
Noise
  M            1.63          1.78        1.80
  n             12             8          12


Note. Scores range from 0 (low aggression) to 3 
(high aggression). 


.001), happiness (p < .001), and love (p <
.001). Again, subjects in the aggressive fan-
tasy condition felt that their stories had indi-
cated greater needs for aggression (p < .001)
and dominance (p < .001) and lesser needs 
for nurturance (p < .001) and affiliation (p 
< .002) than subjects in the nonaggressive 
fantasy condition believed their stories had 
shown. Thus, the success of the fantasy 
manipulation was well established. 

Film preferences. The major dependent 
variable was the total aggression scores of the 
films chosen (see Table 2). A 3 (Aggressive, 
Nonaggressive, or No Word-Story) X 2 (No 
Noise, Noise) analysis of variance yielded an 
effect for noise, F(1, 58) = 21.0, p < .001, 
and a marginally significant main effect for 
fantasy, F(2, 58) = 2.46, p < .10; there was 
no interaction between the aggression and 
fantasy variables on this measure. Subjects 
who had had an opportunity to aggress 
physically using noise clearly had a stronger 
preference for viewing violent films than did 
subjects who had no opportunity to aggress; 
planned comparisons showed that this effect 
was consistently significant across all fantasy 
conditions. 

To assess the effects of the fantasies on film 
preferences more clearly, an additional anal- 
ysis of variance was performed involving only 
those groups that had expressed fantasies. 
This analysis indicated that there was a 
greater preference for viewing violence among 
the aggressive fantasy subjects than among 
the nonaggressive fantasy subjects, F(1, 44)
= 4.18, p < .05. Planned comparisons found
that in the no noise condition (which is com-
parable to Experiment 1), there was a
greater amount of violence in the films chosen
by men who expressed an aggressive fantasy
than in the films selected by those in the non-
aggressive fantasy condition, t(22) = 1.9,
p < .05.

Covariance analyses. Films chosen by
subjects who had physically aggressed in-
volved more action, F(1, 58) = 15.4, p <
.001, and were more interesting, F(1, 58) =
6.53, p < .02, than those chosen by the
nonaggression group. Fantasy had no effect on
preferences for action or interest, nor were
there any interactions. These aggression
effects suggested the need for a 3 X 2 co-
variance analysis, using interest and action
scores as the covariates, in order to deter-
mine whether preferences for viewing violent
films were independent of the films’ action
and interest content. The results of the anal-
ysis indicated that this was the case: There
was now a significant effect for fantasy, F(2,
56) = 3.26, p < .05, as well as an effect for
physical aggression, F(1, 56) = 5.70, p < .05.


General Discussion 
Empirical Findings 


It was hypothesized that both physical ag-
gression and fantasy aggression would lead to
an increased preference for viewing violence.
These predictions were confirmed for men.
Experiment 2 effectively replicated the first
study: Both found that for males, the expres-
sion of an aggressive fantasy, compared to
a nonaggressive fantasy, led to a greater
preference for viewing violence. In the second
study, this effect was sustained despite the
fact that another time-consuming and impact-
ful manipulation interceded between the fan-
tasy condition and the dependent measure.
Experiment 2 also found that persons who
physically aggressed toward another, using
noxious noise, were far more likely to choose
to view violence than were those who had


Although the present studies represent the
first experimental demonstration that aggressive
behavior leads to the viewing of violence,
the causal hypothesis is completely consistent
with previous correlational work demonstrating
a positive relationship between TV
[illegible]. Several other studies,
in both the laboratory and the field,
may also be interpreted in terms of the
hypotheses of the present research.
Doob and Macdonald (1979) found that
persons in high crime areas watched more
TV than those in low crime areas.
Boyanowsky, Newtson, and Walster
(1974) found that attendance at a violent
movie increased markedly following a brutal
murder, whereas no such increase in
attendance occurred for a nonviolent film. A
possible explanation for both these findings
is that violent crime results in a heightened
preoccupation with aggressive thoughts and
fantasies, which then leads persons to seek
out violence in the media. Feshbach, Stiles,
and Bitter (1967) showed that witnessing
acts of aggression may have reinforcing
effects, but only for those who have been
[illegible]. Again, this is consistent with the
notion that persons seek out media portrayals
of violence following an episode in which they


a theory of viewer 
Watch violence in the media because of both 
long-term and situationally induced aggres- 
sive thoughts and actions. 


Implications 


The problem of violence in the mass media 
has attracted a great deal of public, govern-
mental, and scientific attention. That atten-
tion has been primarily fi
violence
directed toward the antecedents of the view-
ing of violence. Yet, in a sense, the two are
inseparable: in order to understand the effects 
of the mass media, it is necessary to under-
stand why viewers watch the media; and con-
versely, to understand viewing behavior, the
effects of the mass media must be known. 
Cognizant of the fact that watching violence 
facilitated aggression, the present research 
assumed that aggression, both physical and 
fantasy, was related to the viewing of 
violence. The results suggest that the assump- 
tions were warranted: The preference for 
watching violence may be attributed, in part, 
to aggression-related behaviors on the part of 
the viewer. 

The present research also provides an addi- 
tional perspective on the relationship between 
the observation of violence and the expression 
of aggression: aggressive behaviors and fan-
tasies are not only a result of viewing violence,
as established by previous research (e.g.,
Eron et al., 1972; Feshbach, 1961), but may
also be a cause of the viewing of violence.
Just as what one watches in the media influ- 
ences one’s behavior, so too the way one 
behaves influences what one watches. This
bidirectional model of influence argues for a 
cyclical pattern of effects such that aggressive 
thoughts and actions increase one’s preference 
for viewing violence in the mass media, which 
in turn increases aggressive behavior. 
Finally, the research helps to counter some 
of the criticism that has been directed at the 
form of mass media research. Because pre-
vious experimental research on television
violence and aggression was interested in the 
effects of watching violence, the typical 
research paradigm denied the participant the 
choice of what to watch; subjects were simply 
exposed to predetermined programs, and 
effects were measured. This procedure, al-
though standard (e.g., Friedrich & Stein,
1973; Liebert & Baron, 1972), may be con- 
sidered ecologically invalid in that it dis- 
regards tendencies or motives that may deter- 
mine the choice of program and may interact 
with the program content in affecting sub- 
sequent behavior. The present research par- 
tially overcomes these criticisms by investi- 
gating and identifying some of the variables 
that affect what the viewer selects to watch,
thus directing attention to the reciprocal 
nature of the relationship between aggressive 
behavior and mass media violence. 